
Wombat (650, 800) board overclocking limitations

lobust

Well-known member
2. My Daystar 128K Quad Cache (aka FastCache Quadra) does not work in the overclocked configuration. The Daystar QuadControl v2.3 init hangs at startup. Unlike the IIci cache card, the Quad Cache card is not enabled until the software kicks in.
2. Sounds about right. I don't know where it is, but there should be an 040 cache thread with info, or Bolle might've chimed in earlier in the thread, but I believe the long and short is that the wait state config on the cache cards is definitely made to work with stock Quadras (not sure about the 840av, since it's a completely different beast), so 33 and below is likely the target.

Apologies for going a bit off-topic, but you two seem to know a bit about the FastCache Quadra and I haven't found much solid information about it before now.

I have one installed in my Quadra 800 (a modified 650 board; my original 800 board has dead SCSI right now, so I can't test it in that one) running 8.1, and I can't tell if it's working or not. It gets warm, and the control panel says it's enabled, but TattleTech doesn't report anything in the PDS slot, nor does it report any additional L2 cache, and benchmark scores seem unaffected.

The only hint I have seen that it might be doing something is that it won't enable (greyed out) in the control panel when I have my Silicon Express II card installed as well.

Do you guys know what's up with that? Is there a definitive method to tell if one of these cards is actually enabled or not?
 

jessenator

Well-known member
Apologies for going a bit off-topic, but you two seem to know a bit about the FastCache Quadra and I haven't found much solid information about it before now.
Well, you're giving me far too much credit ;)

I'm sort of along for the ride at this point, really. I made a few "discoveries" here and there, but it was more to fuel my assertion that with a few tweaks and good/late-stage 040 improvement, that overclocks were possible and practical today.

Some conversation was started here:

It might've gotten lost in the crash, but I thought there was a thread solely dedicated to the FastCache Quadra. I don't think all of the PALs got decoded, since one was damaged. I wonder (hopefully not!) if there's a fault in one of the same PALs on your cache card.

Apples and oranges here, but there's documented evidence on the Micron Xceed card of regular failure on specific RAM ICs. This is complete supposition, but it might be a similar thing where the chip fails in the same way on the same device. Again, hopefully not with yours!

I think it's the perfect idea for a new thread, though! Might bring out the real experts that way :)

Is there a definitive method to tell if one of these cards is actually enabled or not?

Wild guess here, but I think either MacBench 4 or TechTool Pro will give an accurate-ish readout. It's far more accurate than Apple System Profiler. MB4 will actually detect and accurately tell me what my PPC's L2 cache is, while ASP just says "256k" regardless…
 

trag

Well-known member
Cacheometer (cachometer?) from NewerTech will usually report accurately if a cache is present and working. Sometimes easier to find in the "gauges" collection.
 

David Cook

Well-known member
Cacheometer (cachometer?) from NewerTech will usually report accurately if a cache is present and working. Sometimes easier to find in the "gauges" collection.

I found Newer Technology Cache-22 v1.2.1, v1.41, and v1.5.1 on the Macintosh Repository. All indicate they need a PowerPC processor to run. I'd be interested if anyone has a different experience or version that runs on the 680x0 series.
 

trag

Well-known member
I may have been mistaken in suggesting them. Memories are old and vague. There probably wasn't any reason for NewerTechnology's utilities to support anything before the PPC machines.

Sorry.
 

David Cook

Well-known member
Here's some fun for your Saturday night. I wrote a quick little program that allocates a memory block and reads it repeatedly for 1 second. It then repeats that for various memory block sizes. Theoretically, it should show hints at the sizes of various caches.

There are four tests plotted (two runs each of no cache and 128KB cache), but only two are obvious because the test results show strong repeatability. To simplify, the red line is 'no cache' and the orange line is '128K cache'.

1657422380233.png

Here you see that performance starts to drop off around 4 KB. That's the internal L1 68040 cache. There is the start of a drop between 2 KB and 4 KB, but that's because the CPU is using some of the cache for other purposes and can't fit my entire 4 KB memory chunk.

The orange line shows a moderate drop starting at 4 KB, but that's because the 128 KB cache is at level 2 (L2) so is not quite as fast as the internal CPU cache. After 128KB we see a significant drop.

Surprise: The 128 KB cache causes further memory accesses to be SLOWER than without the cache. This is the cost of a cache miss (or memory controller timing change?). Bummer!

The difference between the slope of the CPU cache miss (curved) and the 128 KB cache miss (hard drop) suggests Motorola had a better caching algorithm.
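For anyone who wants to try the same kind of measurement, here is a minimal sketch of the approach described above: allocate a block, read it repeatedly for a fixed time, and repeat for several block sizes. It is not the original test program (that isn't posted in this thread). The names `read_throughput` and `sweep` are made up for illustration, and Python's interpreter overhead will swamp any real cache effect, so treat it as a picture of the loop structure rather than a usable cache probe.

```python
import time


def read_throughput(block_size, duration=1.0):
    """Read a block of block_size bytes over and over for `duration` seconds.

    Returns the number of bytes read per second.
    """
    block = bytearray(block_size)
    total = 0
    start = time.perf_counter()
    deadline = start + duration
    while time.perf_counter() < deadline:
        sum(block)            # touches every byte of the block once
        total += block_size
    elapsed = time.perf_counter() - start
    return total / elapsed


def sweep(sizes, duration=1.0):
    """Run the read test once per block size; returns {size: bytes per second}."""
    return {size: read_throughput(size, duration) for size in sizes}
```

Calling `sweep([2 ** n for n in range(8, 21)])` would cover 256 bytes up to 1 MB at one second per size. The per-pass clock check is fixed overhead, so small blocks will look slower than they really are.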
 

David Cook

Well-known member
Now let's look at the Macintosh IIci.

1657423015686.png

The peak at 256 bytes is because the 68030 has a 256 byte L1 data cache!

The drift upward of the red/blue and green/light blue lines is an artifact of the code. Basically, the overhead of the loop code itself slows down the total number of bytes it can read per second for smaller block sizes.

The yellow/grey line shows a drop after 32KB, owing to the 32 KB L2 cache card.

The green/light blue lines drop after 128KB, showing that the 128 KB L2 cache card is working.

Surprise: The Micron 128KB cache card is slightly more effective at reads outside the cache range than the Apple 32KB cache card. Cache algorithm? Memory controller timing?
 

eharmon

Well-known member
Necromancing this thread again: did we ever settle on proper overclock expectations? Here are some rough things I've seen from above and elsewhere:
  1. 33-42MHz: Easy, any Q650/Q800 board ID can do it.
  2. 42MHz-44MHz: Limit of serial ports. Possible limit of onboard Ethernet...but that seems to vary? Perhaps by driver?
  3. 44MHz+ needs more wait states (Speedbump config) for reliable operation:
    1. 44-46MHz: Works at times, but unstable. Needs more wait states to run rock solid.
    2. 46-48MHz: Progressively more unreliable. Needs more wait states for usability.
    3. 48-50MHz: Not even gonna boot consistently without Speedbump, but still exceeds limit of VRAM and is unstable, might work with video card?
My own Q650 (normal wait states) experience: with a video card, ethernet card, and active cooling, it runs at nearly room temperature and, as near as I can tell, rock stable at 42MHz. 43-44 seem mostly reliable when using an ethernet card. Pushing 45-46 it gets "quirky", with probable memory corruption. Though I had it running fine at 50MHz for a few days somehow (no idea how that worked...yes, Clockometer confirmed), it holds pretty steady in this state at the moment.

Separately, did we work out if Board ID 51 just flips the waits, or does it change anything else? That seems like it would fit with @David Cook 's 1% slowdown clock-for-clock when switching to 51.
 

eharmon

Well-known member
Separately, did we work out if Board ID 51 just flips the waits, or does it change anything else? That seems like it would fit with @David Cook 's 1% slowdown clock-for-clock when switching to 51.
One day I'll learn to re-skim the thread one more time before replying...even if it's 9 pages!

Yes, it does change a few values, per post 34: https://68kmla.org/bb/index.php?thr...rd-overclocking-limitations.38538/post-417538

And if I read the hax tool output correctly, @cy384 forced exactly the same bits in the ROM.

Anyway, adding to the data points about memory speeds, I've accumulated 3 Wombat boards over the years (all 8MB/ethernet), and, respectively, they have:

C650: RAM: 70ns, VRAM: 80ns
Q650: <need to open up the case and check>
Q800: RAM: 60ns, VRAM: 80ns

That follows the trend others have seen.

I don't think it was explored earlier in the thread, but from post 34 it does seem like the Q800 runs different stock timings from the Q650, perhaps explaining why it needs 60ns memory. Returning to the decoded data:
Code:
@DJ_ORIG,%00010010    ; 33MHz Frigidaire package (Quadra 800)
Code:
@DJ_BUMP,%01010010    ; 33MHz Lego package (Quadra 650)
The Q800 uses the ORIG values, and the Q650 uses the BUMP, which we see here:
Code:
@dj33Config    dc.w    %0000000010100011    ; mhz33=1, cyc23ta=1, ROMspeed=3
Code:
@bump33Config    dc.w    %0000000011111011    ;          mhz33=1, drcpw=1, cyc2ta=1, drpchg=1  drpw=1, ROMspeed=4

So if I'm understanding correctly, the Q800 is faster stock, thanks to fewer wait states? That would restore credence to the claim it's a "bit faster".
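To see exactly which bits separate the two 33MHz config words, a quick sketch like the one below can XOR them. It only reports bit positions (0 = least significant). Which named field (drcpw, drpchg, drpw and so on) sits at which position isn't spelled out here, so the sketch deliberately doesn't guess. The helper names `parse_word` and `differing_bits` are made up for illustration.

```python
def parse_word(text):
    """Parse an assembler-style binary literal such as '%0000000010100011'."""
    return int(text.lstrip("%"), 2)


def differing_bits(a, b):
    """Return the bit positions (0 = least significant) where a and b differ."""
    diff = a ^ b
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]


dj33 = parse_word("%0000000010100011")    # Quadra 800 (ORIG) word from above
bump33 = parse_word("%0000000011111011")  # Quadra 650 (BUMP) word from above

print(hex(dj33), hex(bump33), differing_bits(dj33, bump33))
# prints: 0xa3 0xfb [3, 4, 6]
```

So the BUMP word has three more bits set than the ORIG word, which is at least consistent with the Q650 running extra waits.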
 

volvo242gt

Well-known member
^A slight oddity about the Q800 board. If power LED is added, it identifies itself as a Q650 board. If the power LED is not present, then it identifies itself as a Q800 board. Noticed that about a Centris 650 that I upgraded with an 800 board in 2014. Before the LED was added, it said it was a Quadra 800. After the LED was installed, it identified itself as a Quadra 650.
 

eharmon

Well-known member
^A slight oddity about the Q800 board. If power LED is added, it identifies itself as a Q650 board. If the power LED is not present, then it identifies itself as a Q800 board. Noticed that about a Centris 650 that I upgraded with an 800 board in 2014. Before the LED was added, it said it was a Quadra 800. After the LED was installed, it identified itself as a Quadra 650.
Yeah it’s a clever trick to save money on another resistor. You can see the whole matrix in post 3: https://68kmla.org/bb/index.php?thr...rd-overclocking-limitations.38538/post-416575
 

eharmon

Well-known member
Alright, for Science™ I finally went ahead and swapped my Q800 board into my 650. I can take some screenshots tomorrow, but default Q800 timings definitely run faster than Q650 timings:

Running both boards at ~42MHz, the Q800 scores 10% higher in Norton SysInfo than the Q650 for memory-based tests. All other tests are basically equal, as you'd expect.

Despite having 60ns RAM onboard and in the slots, it somewhat unsurprisingly seems to have a lower overclocking limit -- I had the Q650 running at 45MHz and the Q800 won't get past the boot splash at that speed. I would say that's clearly the timings at play.

So that implies that the Q650 might be king of the overclock when you can't loosen timings -- despite 70ns onboard memory, the ROM's default timings give you a little more bus speed overhead to work with. So it might be faster overall when you push the machine to its limits since it's gentler on the memory.

But with a custom ROM the Q800 should be king of the Wombats. What gets interesting is this starts to look more like Intel overclocking: there might be a sweet spot with a slower bus/CPU but tighter timings that make the system overall a bit faster. And that’s likely a bit different for every system!
 

Phipli

Well-known member
But with a custom ROM the Q800 should be king of the Wombats.
They have identical ROMs though? You just need to pick the different IDs using resistors to get the different timings?

Any difference in overclockability will be just down to variations in components between the boards.

Once you're using the 40MHz timings, the bottleneck perhaps will be the MC88920 (what speed rating are yours?).
 

Jockelill

Well-known member
Can’t chip in on the overclocking (yet), but I can confirm that the ROM from a 475 works in a Q650. It will then identify itself as a Q800. Would be interesting to see the timings from that ROM! My recent experiments with ROM hacking (see also @dougg3 thread) have proved this. You need a special SIMM, the current ones will not work, but it’s close to ready to be available. Of course you also need to solder in a ROM socket, but that I assume was clear already 😂

Here are two pictures with my own Q650 with custom ROM, 296MB RAM and loaded ROM disk.

635D20FE-4CA0-49F2-B658-A59DE2805BF2.jpeg

4CEFE7C6-4A2C-463C-A1BC-A6CD9FB522B5.jpeg
 

eharmon

Well-known member
They have identical ROMs though? You just need to pick the different IDs using resistors to get the different timings?

Any difference in overclockability will be just down to variations in components between the boards.

Once you're using the 40MHz timings, the bottleneck perhaps will be the MC88920 (what speed rating are yours?).
With a custom ROM you could adjust the timings or wait states, as in post 48: https://68kmla.org/bb/index.php?thr...rd-overclocking-limitations.38538/post-417626

So theoretically since the boards already seem to be able to handle 40+ at wait states designed for 33 (but probably not within the tolerances the original designers were comfortable with), it's possible you can run a Q800 board with 60ns memory at tighter timings at say, 48MHz, than you could run a Q650 board with 70ns memory. But still looser timings than the stock ROM to improve stability.
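As a rough back-of-envelope for why the RAM rating matters more as the clock goes up, the sketch below counts how many whole clock cycles it takes to cover a 60ns or 70ns access at a few clock speeds. This is a deliberately crude model: real DRAM timing also involves precharge and RAS/CAS pulse widths, and the memory controller's actual state machine is not modelled. The function name `cycles_needed` is made up.

```python
def cycles_needed(access_ns, clock_mhz):
    """Whole clock cycles needed to cover access_ns at clock_mhz, rounded up.

    One cycle lasts 1000 / clock_mhz nanoseconds, so this computes
    ceil(access_ns * clock_mhz / 1000) using integer arithmetic only.
    """
    return (access_ns * clock_mhz + 999) // 1000


for mhz in (33, 40, 48):
    print(mhz, "MHz:",
          "60ns ->", cycles_needed(60, mhz), "cycles,",
          "70ns ->", cycles_needed(70, mhz), "cycles")
# prints:
# 33 MHz: 60ns -> 2 cycles, 70ns -> 3 cycles
# 40 MHz: 60ns -> 3 cycles, 70ns -> 3 cycles
# 48 MHz: 60ns -> 3 cycles, 70ns -> 4 cycles
```

In this simplified picture, 60ns parts still fit in three cycles at 48MHz while 70ns parts spill into a fourth, which is the kind of gap a custom timing set could exploit.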
Can’t chip in on the overclocking (yet), but I can confirm that the ROM from a 475 works in a Q650. It will then identify itself as a Q800. Would be interesting to see the timings from that ROM! My recent experiments with ROM hacking (see also @dougg3 thread) have proved this. You need a special SIMM, the current ones will not work, but it’s close to ready to be available. Of course you also need to solder in a ROM socket, but that I assume was clear already 😂

Here are two pictures with my own Q650 with custom ROM, 296MB RAM and loaded ROM disk.
It identifies as a Q800? That's pretty weird! It would be interesting to run the ultra hax tool and see what configuration is coming up.

Also might be interesting to try a Q630 or Performa 580 ROM. I think those were the last desktop boards from that ROM branch, working backwards from dates, versions, and sizes.
 

Phipli

Well-known member
So theoretically since the boards already seem to be able to handle 40+ at wait states designed for 33 (but probably not within the tolerances the original designers were comfortable with), it's possible you can run a Q800 board with 60ns memory at tighter timings at say, 48MHz, than you could run a Q650 board with 70ns memory. But still looser timings than the stock ROM to improve stability.
The ROM already contains timings for 40MHz, although they don't seem to be as well refined; some of my RAM doesn't work so well with the 40MHz timings.

Tweaking them to be better would be good, but it doesn't sound like you've tried them? Perhaps your machine would overclock further with the 40MHz timings?

One thing that can help is to just desolder the onboard RAM so you can just use fast SIMMs. That's what I did with my Q610. I did briefly have it running at 50MHz with 33MHz timings (the ROM (same one as the 650/800) doesn't recognise a 610 with 40MHz timings sadly), but that is my main interest wrt the custom ROMs: making my 610 use 40MHz timings.
 

eharmon

Well-known member
The ROM already contains timings for 40MHz, although they don't seem to be as well refined; some of my RAM doesn't work so well with the 40MHz timings.

Tweaking them to be better would be good, but it doesn't sound like you've tried them? Perhaps your machine would overclock further with the 40MHz timings?

One thing that can help is to just desolder the onboard RAM so you can just use fast SIMMs. That's what I did with my Q610. I did briefly have it running at 50MHz with 33MHz timings (the ROM (same one as the 650/800) doesn't recognise a 610 with 40MHz timings sadly), but that is my main interest wrt the custom ROMs: making my 610 use 40MHz timings.
Right, it would definitely run more reliably at high clocks with 40MHz timings/waits. But, for instance, there are actually two sets of 40MHz waits:

Code:
@dj40Config    dc.w    %0000001011110100    ; dwcpw=1, mhz33=1, drcpw=1, cyc2ta=1, drpchg=1, ROMspeed=4
@bump40Config    dc.w    %0000001011111100    ; dwcpw=1, mhz33=1, drcpw=1, cyc2ta=1, drpchg=1, drpw=1, ROMspeed=5

So even there, there are options. I suspect the intention is the Q650+Speedbump would use bump40, and the Q800+Speedbump would use dj40, since that matches with the 33MHz configs. But who knows.

At any rate, afaik those settings all slow memory access. It's possible that, making up an example, a 45MHz machine with half those waits removed would actually run faster overall than a 46MHz machine with the bump40 config. And still run stable, with 60ns memory, where at 46MHz it wouldn't. Basically there's a sweet spot somewhere, and being able to tweak all the waits opens up options to find that sweet spot per board.
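To put made-up numbers on that sweet spot idea: if a memory access costs a fixed number of clock cycles, the access rate is just clock divided by cycles per access. The cycle counts below (4 and 5) are pure assumptions for illustration, not values decoded from the ROM, and the function name `accesses_per_second` is made up as well.

```python
def accesses_per_second(clock_mhz, cycles_per_access):
    """Memory accesses per second if each access costs cycles_per_access clocks."""
    return clock_mhz * 1_000_000 / cycles_per_access


# Hypothetical: 45 MHz with some waits removed vs 46 MHz with the full set.
tight = accesses_per_second(45, 4)
loose = accesses_per_second(46, 5)

print(tight, loose, tight > loose)
# prints: 11250000.0 9200000.0 True
```

This only covers accesses that actually go out to RAM. Anything served from the 040's internal caches scales with raw clock, so the real-world gap would be smaller and depends on the workload.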

So I'm suggesting it might be interesting to create a new dj40Config that removes some waits to maximize overall performance, at the cost of raw clock speed, and see what happens.

But yeah, desoldering the RAM and running the fastest possible memory in the slots would probably help it run tighter timings too. I think FPM went down to 50ns?
 