Here's my understanding, which is by no means complete or authoritative:
The ProductInfo tables allocate address space for 64MB/bank with 2 banks per SIMM slot.
When determining the size of memory available, the ROM starts at address 0 of each bank and probes to determine the amount of memory in that bank, assuming each bank is contiguous. However, the djMEMC's per-bank DRAM configuration register requires a different value for >32MB SIMMs, and the ROM doesn't run the probe twice, once with each of the two configuration values. It just iterates once, detecting <= 32MB in each bank. So when we plug in a 128MB SIMM, it finds 32MB in the first bank, skips the 2nd 32MB, finds the 3rd 32MB in the second bank, and skips the 4th 32MB.
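As a rough illustration of that arithmetic, here is a toy Python model, not the ROM code. The names `probe_bank` and `detect_simm` are made up for this sketch, and it assumes a SIMM splits its capacity evenly across its two banks, which matches the 128MB case described above.

```python
MB = 1024 * 1024

# Most a bank can expose when probed with the <= 32MB config value.
SMALL_CONFIG_LIMIT = 32 * MB


def probe_bank(installed, limit):
    """Return how many bytes of one bank a single probe pass can see."""
    return min(installed, limit)


def detect_simm(simm_size, limit=SMALL_CONFIG_LIMIT):
    """Return the size found in each of a SIMM's two banks."""
    # Assumption: the SIMM's capacity is split evenly over two banks.
    per_bank = simm_size // 2
    return [probe_bank(per_bank, limit) for _ in range(2)]


found = detect_simm(128 * MB)
print([f // MB for f in found], "total:", sum(found) // MB, "MB")
```

In this model a 128MB SIMM comes out as 32MB per bank, so only half of it is counted, while a 32MB SIMM (16MB per bank) is seen in full.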
My hack was to just change the value written to every bank's DRAM config register to the one for 64MB banks. So, it really only detects 64MB or 128MB SIMMs. A more complete solution might be to iterate once looking for <=32MB in each bank, then iterate again looking for 64MB in each bank, making sure to set each bank's configuration register appropriately for what was found. But that'd require restructuring the logic of the code (and moving it to make more room), rather than just changing an existing value, and I'm not really sure that's worth the effort.
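The two-pass idea could look something like the sketch below. This is only a simulation in Python under stated assumptions: `CONFIG_SMALL`, `CONFIG_LARGE`, `visible`, and `two_pass_probe` are hypothetical names, the real register encodings are not given here, and how the hardware behaves when the large setting is applied to a smaller bank is a guess.

```python
MB = 1024 * 1024

# Hypothetical labels for the two per-bank config register values.
CONFIG_SMALL = "small"  # bank holds <= 32MB
CONFIG_LARGE = "large"  # bank holds 64MB


def visible(installed, config):
    """Toy hardware model: bytes of a bank that respond under a config."""
    if config == CONFIG_LARGE:
        # Assumption: the large setting only behaves for a full 64MB bank.
        return 64 * MB if installed >= 64 * MB else 0
    return min(installed, 32 * MB)


def two_pass_probe(banks):
    """Probe every bank with the small config, then again with the large one."""
    # Pass 1: look for <= 32MB in each bank.
    results = [(visible(b, CONFIG_SMALL), CONFIG_SMALL) for b in banks]
    # Pass 2: look for 64MB and keep it where it beats pass 1.
    for i, b in enumerate(banks):
        large = visible(b, CONFIG_LARGE)
        if large > results[i][0]:
            results[i] = (large, CONFIG_LARGE)
    return results


# One 128MB SIMM (two 64MB banks), one 16MB bank, one empty bank.
banks = [64 * MB, 64 * MB, 16 * MB, 0]
for size, config in two_pass_probe(banks):
    print(size // MB, config)
```

Each bank ends up with both a detected size and the config value that found it, which is the part the single-value hack gives up.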
This code is the same for all of the djMEMC-based machines, so as long as the ROM SIMM works in the 650 and 800 machines, it seems pretty easy to get 520MB working in them.
Disabling the RAM test was very helpful. I didn't disable it initially and thought my change hadn't worked, since the machine sat there without video for so long at power on.