Finally found a 1MB L2 cache for my 6100

killvore

6502
I wasn't sure they were even around, but last week I found one on eBay and jumped on it! I've run some tests this evening, and I've made arrangements to have it reverse engineered. If everything goes to plan, there should be some open-sourced schematics which should allow anyone who wants one to build their own :D It will probably still cost a bit since the SRAM is quite expensive just on its own, but at least it should be available - and maybe someone can find some clever improvements once it's documented!

Anyway, some numbers. The test machine(s) are:
- a PM 6100/60 and an overclocked PM 6100/80
- 40MB RAM
- 640 x 480 video, 8 bit color
- System 7.6.1 with Speed Doubler
- HDI-45 internal video (with an extra HPV test)

The tests are:
- Quake 1 FPS using software render at 320x200 running "timedemo demo2" right after launch.
- Marathon 2 FPS (shift+?) fullscreened at 640 x 480 (no HUD). Spawn in - location is random, so strafe to the left corner and backpedal until you stop, wait until framerate stabilizes.
- Bryce 2 render. Time to render a test scene with some geometry and a bunch of different material types
- Finder scroll. Time to scroll from top to bottom of a massive folder, sorted by Name. This is me attempting to quantify the subjective experience of a snappier Finder.

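MHz | L2 Cache | Quake fps | Marathon 2 fps | Bryce 2 render (sec) | Finder scroll (sec)
60  | 0        | 7.6       | 8.0            | 646                  | 44.8
60  | 256kB    | 8.4       | 8.4            | 485                  | 39.4
60  | 1MB      | 9.2       | 9.4            | 460                  | 34.2
80  | 0        | 9.3       | 9.16           | 569                  | 43.3
80  | 256kB    | 10.5      | 9.92           | 394                  | 34.9
80  | 1MB      | 11.9      | 11.43          | 360                  | 28.8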

And finally I ran a test with the HPV card - yes, I cracked 12 fps in Quake on a Family Pizza without a G3!
MHz | L2 Cache | Quake fps | Marathon 2 fps | Bryce 2 render (sec) | Finder scroll (sec)
80  | 1MB      | 12.1      | 12.0           | 366                  | 29.1

At some point I might get around to making a stripped back Mac OS 8.6 since I've had slightly better performance there but I just hate the long bootup time sooo much 😅

For these very limited tests the bigger cache seems to provide twice the speed boost of the regular 256kB cache - nice! Bryce renders also get a speed increase from the bigger cache but it's dwarfed by the Sick Gainz from having any cache at all.

Screenshot 2026-03-11 at 22.44.00.png
 
Thank you! And no, I don't think it's made by Apple; I'm not sure they made them in that size. Here's the front and back - the chip with the missing markings is apparently a switch type 74*257 mux, according to the person who is reverse engineering it:
IMG_5720.jpeg


IMG_5719.jpeg
 
Amazing! Thanks for doing what you can to get your lucky find reverse-engineered. I hope it will be easy to dump the GALs!
My 6100 has no cache so I'll be keeping an eye out for a clone of this thing.
 
the chip with the missing markings is apparently a switch type 74*257 mux, according to the person who is reverse engineering it
Wow, why was that such a big secret that someone saw fit to hit it with sandpaper? Good luck with the reverse engineering effort, I hope the GALs are unlocked!
 
For these very limited tests the bigger cache seems to provide twice the speed boost of the regular 256kB cache
Not quite; you have an interesting set of benchmarks here - they all react differently to the combinations of parameters! Good choices :-)
If we compute the evolution from one line to the next for the 60 and 80 MHz, and then add the performance gain from 60=>80 MHz for the no-cache case (don't want to clutter the table with too much data), we have:
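MHz | Change              | Quake (FPS) | Marathon (FPS) | Bryce (s)  | Finder (s)
60  | (no cache baseline) |             |                |            |
60  | 0 => 256kB          | 10.53 %     | 5.00 %         | -24.92 %   | -12.05 %
60  | 256kB => 1MB        | 9.52 %      | 11.90 %        | -5.15 %    | -13.20 %
80  | 60 => 80, no cache  | (22.37 %)   | (14.50 %)      | (-11.92 %) | (-3.35 %)
80  | 0 => 256kB          | 12.90 %     | 8.30 %         | -30.76 %   | -19.40 %
80  | 256kB => 1MB        | 13.33 %     | 15.22 %        | -8.63 %    | -17.48 %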

Quake and the Finder have similar benefits from the 0=>256 and 256=>1024 change. However, the jump in frequency is nearly useless without cache for the Finder, while it's very meaningful for Quake (the upper bound on the gain going from 60 to 80 MHz is +33% for FPS and -25% for time). Marathon gains a bit from the small L2 and significantly more from the larger one, and reacts positively but not very well to frequency. Bryce wants 256 KiB of L2 to do anything useful.
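For anyone who wants to re-derive the percentages, here is a minimal Python sketch. It only re-does the arithmetic on the numbers from the first post; the names (`RESULTS`, `pct_change`, `row_delta`) are made up for illustration, and the "ideal" figures assume a benchmark that scales perfectly with the clock.

```python
# Raw results from the first post, keyed by (MHz, L2 cache).
# Values: (Quake fps, Marathon 2 fps, Bryce 2 render s, Finder scroll s)
RESULTS = {
    (60, "0"):     (7.6, 8.0, 646, 44.8),
    (60, "256kB"): (8.4, 8.4, 485, 39.4),
    (60, "1MB"):   (9.2, 9.4, 460, 34.2),
    (80, "0"):     (9.3, 9.16, 569, 43.3),
    (80, "256kB"): (10.5, 9.92, 394, 34.9),
    (80, "1MB"):   (11.9, 11.43, 360, 28.8),
}


def pct_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


def row_delta(before, after):
    """Per-benchmark percent change between two configurations."""
    return [pct_change(o, n) for o, n in zip(RESULTS[before], RESULTS[after])]


# Upper bounds for a purely clock-bound benchmark (an idealization):
# fps scales with the clock, elapsed time scales with 1/clock.
IDEAL_FPS_CHANGE = pct_change(60, 80)
IDEAL_TIME_CHANGE = pct_change(1 / 60, 1 / 80)

STEPS = [
    ((60, "0"), (60, "256kB")),
    ((60, "256kB"), (60, "1MB")),
    ((60, "0"), (80, "0")),
    ((80, "0"), (80, "256kB")),
    ((80, "256kB"), (80, "1MB")),
]

for before, after in STEPS:
    deltas = ", ".join(f"{d:+.2f} %" for d in row_delta(before, after))
    print(f"{before} -> {after}: {deltas}")
```

Positive numbers are better for the two FPS columns, and negative numbers are better for the two timing columns.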
My (educated) guesses:
* Quake is likely compute-intensive, with a working set whose reuse grows sublinearly - no surprise here, Quake is well optimized.
* The Finder is more unexpected, as scrolling should be basically bandwidth-bound - additional cache shouldn't really help here (I expect no temporal locality and near 100% spatial locality in each read cache line, everywhere but the horizontal boundaries of the scrolled area). However, it depends on what "scrolling" means here - if it's a one-shot movement, then it is unexpected. If it is repeated movement from continuous scrolling, then we get temporal locality and the benefits from caches can become dependent on the amount of data scrolled.
* Marathon's gain from 256=>1024 is roughly double its gain from 0=>256 - unlike Quake, it has a working set that probably has more long-distance reuse. Maybe too much, and that could be optimized a bit to benefit more from L1 and a smaller L2.
* Bryce is compute-intensive in a working set that is basically size(L1) << working set <= 256 KiB. It cannot live in L1, so the frequency gain is unrealized (limited by memory), but once you give it 256 KiB it's happy, realizes most of the frequency gain and doesn't really need the extra L2 cache.
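One way to put a number on "realizes the frequency gain" is to compare the measured 60=>80 MHz speedup in each cache configuration against the ideal 80/60. The sketch below is only one possible metric, not the one used in the table above: it converts the timing benchmarks to a speedup as old time / new time, and `realized_fraction` and `BENCHMARKS` are hypothetical names.

```python
IDEAL_SPEEDUP = 80 / 60  # perfect scaling with the clock (an idealization)

# name: (higher_is_better, {L2 cache: (value at 60 MHz, value at 80 MHz)})
# Values are copied from the results table in the first post.
BENCHMARKS = {
    "Quake fps": (True, {"0": (7.6, 9.3), "256kB": (8.4, 10.5), "1MB": (9.2, 11.9)}),
    "Marathon 2 fps": (True, {"0": (8.0, 9.16), "256kB": (8.4, 9.92), "1MB": (9.4, 11.43)}),
    "Bryce 2 render s": (False, {"0": (646, 569), "256kB": (485, 394), "1MB": (460, 360)}),
    "Finder scroll s": (False, {"0": (44.8, 43.3), "256kB": (39.4, 34.9), "1MB": (34.2, 28.8)}),
}


def realized_fraction(at_60, at_80, higher_is_better):
    """Share of the ideal clock speedup that actually shows up (1.0 = perfect)."""
    if higher_is_better:
        speedup = at_80 / at_60   # fps: higher is faster
    else:
        speedup = at_60 / at_80   # seconds: lower is faster
    return (speedup - 1.0) / (IDEAL_SPEEDUP - 1.0)


for name, (higher_is_better, by_cache) in BENCHMARKS.items():
    parts = [
        f"{cache}: {realized_fraction(a, b, higher_is_better):.0%}"
        for cache, (a, b) in by_cache.items()
    ]
    print(f"{name:18s} " + "  ".join(parts))
```

A fraction that climbs as the cache grows would be consistent with a memory-limited benchmark, though with single runs per configuration the differences should be read loosely.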

I'd say you want the overclock first for Quake, the (small) L2 first for Bryce, check for potential optimizations in the Marathon code before changing the hardware, and an accelerated video card (hardware blitting) for the Finder :-)
 
Nice find, thanks for posting your results!

I wonder if it could go to 2MB or more? The tag chips have extra pads and it looks like there are options for the config resistors. Good luck with the clone project.
 
Thank you for the great analysis!
1. The Finder scrolling is just me holding the mouse button down on the "scroll down" button in the Finder, so maybe that is sent as repeated actions? I don't know how it is implemented! But I must say, subjectively at least, the Finder felt a lot faster with this 1MB cache.
2. I would like to do more tests with Marathon 2 - with different resolutions, for example, since this was pushing a lot more pixels than Quake was (running at 640x480 8bit instead of 320x200 8bit). Maybe the bigger jump Marathon 2 gets from the 1MB is the thing kan.org mentioned that the screen buffer now fits in the cache rather than using DRAM?
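As a quick sanity check on the screen buffer idea, the raw size of an 8-bit frame at each test resolution can be compared against the two cache sizes. This is only a size comparison under stated assumptions: it treats 256kB and 1MB as 256 and 1024 KiB, and it says nothing about what else (textures, code, game state) competes for the cache. `framebuffer_bytes` is a made-up helper name.

```python
def framebuffer_bytes(width, height, bits_per_pixel):
    """Raw size of one full frame at the given resolution and depth."""
    return width * height * bits_per_pixel // 8


# Assumed cache capacities in bytes (binary kilobytes/megabytes).
CACHES = {"256kB": 256 * 1024, "1MB": 1024 * 1024}

# Resolutions and depth used in the tests above.
MODES = {
    "Quake 320x200, 8 bit": (320, 200, 8),
    "Marathon 2 640x480, 8 bit": (640, 480, 8),
}

for mode, (w, h, bpp) in MODES.items():
    size = framebuffer_bytes(w, h, bpp)
    fits = ", ".join(
        f"{name}: {'fits' if size <= cap else 'too big'}"
        for name, cap in CACHES.items()
    )
    print(f"{mode}: {size} bytes ({size / 1024:.1f} KiB) -> {fits}")
```

By raw size alone, a 640x480 8-bit frame is larger than 256 KiB but well under 1 MiB, while a 320x200 frame is smaller than either cache.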

I forgot to mention that I ran the 80MHz 1MB cache test in Quake with doubled pixels and render every other line, and the performance hit was only 3% - maybe even more evidence that it is strongly CPU bound?

There is lots more that I would like to do for testing, but this was a fun start and I wanted to ship it off asap to get the reverse engineering started 😆

Tangentially related, I'm working on a 3D bracket which mounts 2x60mm fans to blast cool air into the "Hot Sandwich" if you have a PDS card installed - the 80MHz stays below 50 degrees even with this first iteration!
IMG_5755.jpeg
 
Those fans probably help with the cover off, but once sealed you don't have any cool air coming in or hot air going out unless you cut holes somewhere.
 
Nono I run it with the lid on - I took it off to show the setup! The air comes in from the vent at the bottom under the hard drive bay (there's no plastic mount blocking it there), passes through the hot sandwich and gets sucked out through the PSU which also has a vent (I think I reversed the fan there, can't remember). I want to see if I can move the fan mount closer to the PDS/CPU zone 😆
 
Nice fit there, but might help to put some tape over the case holes in front (in terms of air flow direction) of the fans so that the air doesn't leave there and just recirculate around the fans. Not saying it doesn't work, but worth testing to see if it works better.

Air comes up before the fans, goes through, over the tape, down the tunnel made by the card chassis and cardboard, and then through the rest of the case.

Only thing is, are the PSU fans inlets? I feel like I've got this backwards before, but if they are, you might consider reversing your fans to work with them instead of against them, because if you have fans fighting you might get odd airflows, or even an area of stationary air somewhere important.

1000034475.jpg
 