
Some Ghidra tools for ROM exploration

gm_stack

Member

I've posted some scripts to load into Ghidra to play with ROM images.

There's full info in the README.md in the Git repository, but in short:

AnnotateRomTables.py:
Initially based on rb6502's unirom, this one creates a set of structs that let you browse the Universal machine support tables in the ROM. When disassembling the code, it also lets the decompiler view show actual struct names and decode bitfields for some of this data (more to come).

So far it has only been tested on the Quadra 800 ROM (F1ACDA13), but it might be made to work on others.

FixupBSR6.py:
A lot of the subroutines in early boot use a calling convention where the subroutine is JMP'd to, with the return address in A6. Ghidra does not understand this, and in particular does not understand that the JMP (A6) at the end of the subroutine is not a jump table the size of the entire address space. This script finds JMP instructions that follow a PC-relative LEA into A6 and applies a flow override to tag them as a CALL. JMP (A6) instructions are then tagged as RETURN, and the decompiler is a lot happier.
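To make the pattern concrete, here is a minimal, Ghidra-free sketch of the matching logic. It is not the actual FixupBSR6.py: the `(mnemonic, operands)` tuples, the `classify_bsr6` name, and the assumption that the LEA sits immediately before the JMP are all illustrative.

```python
# Simplified model of the A6 calling convention described above.
# Each instruction is a hypothetical (mnemonic, operands) tuple; a real
# script would walk Ghidra's listing instead.

def classify_bsr6(instructions):
    """Return {index: "CALL" or "RETURN"} for JMPs matching the convention."""
    overrides = {}
    for i, (mnemonic, operands) in enumerate(instructions):
        if mnemonic.lower() != "jmp":
            continue
        target = operands.replace(" ", "").lower()
        if target == "(a6)":
            # JMP (A6) at the end of a subroutine: treat as a return.
            overrides[i] = "RETURN"
            continue
        if i == 0:
            continue
        prev_mnemonic, prev_operands = instructions[i - 1]
        prev_ops = prev_operands.replace(" ", "").lower()
        # Assumption: a PC-relative LEA into A6 directly precedes the JMP.
        if (prev_mnemonic.lower() == "lea"
                and "pc" in prev_ops
                and prev_ops.endswith(",a6")):
            overrides[i] = "CALL"
    return overrides


# Made-up listing fragment for illustration only.
listing = [
    ("lea", "(0x6,PC),A6"),
    ("jmp", "(0x1234,PC)"),
    ("moveq", "#0x0,D0"),
    ("jmp", "(A6)"),
]
print(classify_bsr6(listing))
```

In a real Ghidra script, the equivalent of recording "CALL" or "RETURN" would presumably be calling `setFlowOverride` on the instruction with the matching `FlowOverride` value.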

FindRomWrites.py:
I've got a copy of the ROM mapped as an "overlay" in Ghidra's address space at 0x00000000, as that's where it sits at very early boot before the overlay is switched off. This script reassociates all cross-references for memory reads/writes in this area with the bank of RAM defined there, though some of them are genuine ROM accesses and will need to be fixed up.

Once the boot code jumps to the "main" copy of the ROM at base address 0x40800000, all the ROM accesses are up there, so no problems.

ImportLomemGlobals.py:
Imports a (modified) copy of the Low Memory Globals list from the Mac Almanac, so that all those memory accesses to short immediate addresses have the correct names.

ImportSymbolsShifted.py:
A modified copy of the one that comes with Ghidra that allows specifying a base address and memory segment. Used with https://github.com/cy384/68k-mac-rom-maps to import the symbols offset to the base address of the ROM.
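As a rough illustration of the "shifted" import idea, the sketch below parses symbol lines and adds a base address to each offset. The `name hex_offset` line format, the `parse_shifted_symbols` name, and the sample entries are assumptions made for the example; the real map files and script may differ.

```python
def parse_shifted_symbols(lines, base_address):
    """Parse 'name hex_offset' lines into (name, absolute_address) pairs."""
    symbols = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and '#' comments (assumed conventions).
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) < 2:
            continue
        name, offset = parts[0], int(parts[1], 16)
        symbols.append((name, base_address + offset))
    return symbols


ROM_BASE = 0x40800000  # base address of the "main" ROM copy mentioned above

# Hypothetical entries, not taken from any real map file.
sample = ["# hypothetical entries", "EXAMPLE_ENTRY 0000002A", "", "OTHER_ENTRY 8c"]
for name, address in parse_shifted_symbols(sample, ROM_BASE):
    print("%s 0x%08X" % (name, address))
```

Inside Ghidra, each resulting pair could then be handed to something like `createLabel(toAddr(address), name, True)`.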

RemoveUndefinedTypes.py:
Removes any data blocks with undefined1 / undefined2 / undefined4 type but no actual reference to anywhere else, as these stop decompilation when hit.

... and the Ghidra project?
It's a pretty big mess at the moment, but when I have a cleaner version I'll post it. A few screenshots are attached showing the decompiler almost giving a decent result (though to be fair, GetHardwareInfo is a massive mess that jumps all around the ROM with pretty crazy program flow).

(one final note re the decompiler - no, there's no chance you'll ever be able to actually compile that C code and get a working ROM. It's basically C syntax pseudocode for what the assembly is doing - much easier to read!)
 

Attachments

  • Screenshot 2023-12-03 at 12.00.24 am.png (897 KB)
  • Screenshot 2023-12-03 at 12.01.15 am.png (1.3 MB)
  • Screenshot 2023-12-03 at 12.03.09 am.png (1.4 MB)

gm_stack

Member
I've pushed a few small updates:

FixupBSR6.py now handles BSR5 and BigBSR5/6.

AnnotateRomTables.py has had a major overhaul, and I've now implemented wrapper classes that let my code read values back through Ghidra's parsing of the struct!

Python:
# Create the MachineInfo table at the right address
m = structWithRelativePointers(info_ptr, machineInfo, machineInfoRelativePointers)

# and now in our code we can read back the values from the struct!
labelName = cleanup_identifier(LABEL_PREFIX + "Machine_%s_%s" % (m.productKindBoxInfo, m.decoderKind))
createLabel(info_ptr, labelName, True)

print(m.addrDecoderInfo.private.defaultBases.IWM_SWIMExists)
print(m.nuBusInfo.SlotA)
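For readers wondering how attribute-style access like `m.nuBusInfo.SlotA` can be built, here is a minimal sketch of one way to do it. It is not the wrapper from AnnotateRomTables.py: `StructView` is a hypothetical name, and a nested dict with made-up values stands in for the data Ghidra would have parsed.

```python
class StructView(object):
    """Expose nested field values as attributes, e.g. view.nuBusInfo.SlotA."""

    def __init__(self, fields):
        self._fields = fields

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so _fields itself
        # is found the usual way once __init__ has run.
        fields = self.__dict__.get("_fields", {})
        if name not in fields:
            raise AttributeError(name)
        value = fields[name]
        # Nested structs are wrapped again so access can be chained.
        return StructView(value) if isinstance(value, dict) else value


# Made-up values; a real script would read these from the parsed struct.
parsed = {"decoderKind": 7, "nuBusInfo": {"SlotA": 1}}
m = StructView(parsed)
print(m.decoderKind)
print(m.nuBusInfo.SlotA)
```

The appeal of this design is that the calling code reads like the struct definition, and unknown field names fail loudly with an `AttributeError` instead of silently returning nothing.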

It's all getting pretty big though, I think I need to create a library for common functions...

(and work has started on AnnotateRomResources.py - thanks to @eharmon for giving me a reference implementation to test against :) )
 

dougg3

Well-known member
Really nice work! I'm looking forward to trying this out. I've been using IDA for years (yes, I paid for it a while ago), and I'm still way more comfortable in IDA than Ghidra, but tools like this are really going to change my mind. The BSR6 things were always a big pain in the butt to deal with. Thank you so much for creating and sharing these!
 

gm_stack

Member
I'm just starting to dabble with Ghidra and I am wondering if there is any automated way to import the comments, labels, symbols, and notes from these files:


Doing it manually would take ages! Maybe there is some value in a tool that imports FDisasm files, as that tool apparently has some knowledge of Mac idiosyncrasies?
Yep, you certainly can do that.

You'll need to write a script to parse those files, and determine the address and the comment. Then you'd simply use a function to set the comment at that address and/or create a function at that address. Most of what you need is in the FlatProgramAPI.
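A minimal sketch of that approach is below. The tab-separated `HEXADDR<TAB>comment` format is purely an assumption for illustration (the real FDisasm files would need their own parser), and the function names are hypothetical. The Ghidra call is passed in as a callback so the parsing can be tried outside Ghidra.

```python
def parse_annotations(lines):
    """Yield (address, text) from 'HEXADDR<TAB>text' lines (assumed format)."""
    for line in lines:
        line = line.rstrip("\r\n")
        if "\t" not in line:
            continue
        addr_text, comment = line.split("\t", 1)
        try:
            address = int(addr_text, 16)
        except ValueError:
            continue  # skip headers or malformed rows
        yield address, comment.strip()


def apply_annotations(lines, set_comment):
    """Call set_comment(address, text) for each parsed line; return the count."""
    count = 0
    for address, text in parse_annotations(lines):
        set_comment(address, text)
        count += 1
    return count


# Stand-in for Ghidra: record what would have been set. Sample rows are made up.
recorded = {}
sample = ["address\tcomment", "40800000\tROM header", "4080002A\tentry point"]
applied = apply_annotations(sample, lambda a, t: recorded.__setitem__(a, t))
print("%d comments applied" % applied)
```

Inside a Ghidra script, the callback could instead be something like `lambda a, t: setEOLComment(toAddr(a), t)`, with `createFunction` used in the same way for lines that mark routine entry points.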


These are the two of mine that you'd probably be best placed to edit to do that - you can create Ghidra scripts in Python or Java. Be warned that the Python is really Jython - so it's Python 2, not Python 3.



In regards to this project, I've somehow ended up starting a few more projects, some of which are not even on a computer, so I haven't done much more, but I'm sure I'll get back to it at some point...
 