
Accelerator Benchmark collection

zigzagjoe

Well-known member
So, I find the various accelerator designs for compact Macs absolutely fascinating - but there's very little info out there on how the different designs stack up against each other. I figured I'd start a thread to have a place to gather some of that info.

Of course, I'm curious about how the assorted SE/30 designs stack up, since we've got a healthy variety between the vintage Tokamac 040s, Bolle's Carreras, and the 68030-based cards like the PowerCache, Diimo, and the simple clock-doubler type (forgot the name). The 68030/68020 accelerators for the SE, Plus, and Classics are terribly interesting too; some of them strap most of a motherboard onto the original!

I'll kick the thread off with an SE/30 running an experimental Diimo clone @ 50MHz, cache enabled. (No idea if the performance is representative.)

Software used - MacBench 3.0, Speedometer 4.01, and Norton System Info 3.2.1. It might be useful to do Speedometer 3.x too... thoughts?
1686193123569.png
1686193102710.png
1686193085612.png
 

Phipli

Well-known member
Interesting. If you're talking about CPU upgrades, is it worth just showing the CPU score in Norton? Otherwise the disk heavily influences the result.

We need to make sure results are searchable, so people should clearly describe their setups with keywords in the post text.

Can't help thinking tabulated results would be best - perhaps a shared Google spreadsheet as well as photos here?
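
To make the table idea concrete, here is a minimal sketch of what one row could look like and how it could be dumped to CSV for a shared spreadsheet. The field names and the `search` helper are only assumptions for illustration, not an agreed format for this thread.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class BenchmarkResult:
    # Hypothetical columns; adjust to whatever the thread settles on.
    machine: str        # e.g. "SE/30"
    accelerator: str    # e.g. "Diimo clone"
    cpu: str            # e.g. "68030"
    clock_mhz: float
    cache_kb: int
    benchmark: str      # e.g. "Speedometer 4.01"
    test: str           # e.g. "CPU"
    score: float


def to_csv(results):
    """Serialize results to CSV text with a header row."""
    buf = io.StringIO()
    names = [f.name for f in fields(BenchmarkResult)]
    writer = csv.DictWriter(buf, fieldnames=names)
    writer.writeheader()
    for r in results:
        writer.writerow(asdict(r))
    return buf.getvalue()


def search(results, keyword):
    """Case-insensitive keyword match against every field of each row."""
    kw = keyword.lower()
    return [r for r in results
            if any(kw in str(v).lower() for v in asdict(r).values())]
```

Keeping machine, accelerator, and CPU as separate text columns is what makes keyword searches like `search(rows, "diimo")` work without anyone having to agree on exact post wording.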
 

Phipli

Well-known member
My Macintosh SE with a 50MHz Total Systems Gemini Ultra. It has onboard RAM, so it is taking advantage of "fast" 32-bit RAM.

Only Norton System Info results on hand, sorry.

Scores for a 16MHz Total Systems Mercury card (16MHz 030 and 20MHz 68882, system RAM only, "Upgraded Macintosh SE") and a 25MHz Mobius card (25MHz 030EC and 20MHz 68882, 32-bit RAM on card, "Mobius 25EC+20") are also visible in the photos.

I believe that Total Systems also sold a clone of this Mobius card as the Gemini, and Applied Engineering did too. The cards often reference "Quesse" on the silkscreen.

20230324_163057.jpg
20230324_163309.jpg
20230324_161354.jpg
20230608_114410.jpg
20230608_120403.jpg
20230608_120422.jpg

Excuse the lens distortion.
 

zigzagjoe

Well-known member
That Gemini is fantastic! Thanks for the pics, I love seeing these cards.

Good point about excluding Disk results - with SCSI2SD and so on, disk results have kind of ceased to be useful info. I like the detailed results out of Speedometer for that reason, except that it weights Whetstones and Dhrystones very heavily in Math results.

I like your idea of a table. I threw together a google survey: https://docs.google.com/forms/d/e/1FAIpQLSdjrQLGwtRj5Jp6iF89aylEYFrcuqAKAtn7g5AmBrtXHAc8MQ/viewform

Results: https://docs.google.com/spreadsheets/d/1bdO3rF0DlVGDzlcmVvv-0KkzPsc94GhDklZwZFaOUPY/edit?usp=sharing
 

JC8080

Well-known member
Here is my SE/30 with both 40MHz and 50MHz DayStar PowerCache accelerators. I believe this is Speedometer 3. I'm not sure how useful this is, since it's comparing the accelerators to each other and not to stock. I can compare to stock if anyone is interested.

Below I also included my SE with a Radius 25MHz 68020 accelerator w/ FPU, compared to the stock SE. Based on the results I believe this was Speedometer 2.

PXL_20230311_200821599.jpg

SE Radius vs SE.jpg
 

zigzagjoe

Well-known member
I think that first one is Speedometer 4.0. It looks like you have SANE patches enabled, as that results in a big jump in the KWhet test and nothing else. If we put that aside (and the disproportionate effect it has on the Math score), the rest of the scores line up closely with my Diimo clone.

This leads to a conclusion on cache sizes: any cache is head and shoulders better than no cache, and 32KB vs 64KB of cache doesn't make a notable difference for Speedometer.

Curiously, your video score is slightly better than mine. I noted an improvement in video performance with cache on, my guess being that it's due to less contention on the 16MHz system bus. I wonder if the PowerCache does a better job than the Diimo at handling system bus accesses?
 

Phipli

Well-known member
Any chance you could add an OS version column? Some accelerators won't work in newer OSes, so it is interesting to know, plus it causes differences sometimes (I think there were some SANE improvements in 7.0 that were later dropped? Something like that).
 

zigzagjoe

Well-known member
Good point, added. That raises another question - do most folks use the SANE patches/traps/"INSANE" mode made available by accelerators?
 

Phipli

Well-known member
I tend to use card defaults, so it varies by brand. I'm rarely doing scientific computation and would turn off the approximations if I was...

DayStar cards always seem to patch maths routines.
 

zigzagjoe

Well-known member
What started as some debugging information has morphed into a pretty decent collection of Speedometer 4.02 scores, with the assistance of Paulie, demik, and joshc from IRC. Speedometer was selected because it tests quickly and the individual tests can be characterized decently (at least on 030s), i.e. Dhrystones directly reflect memory subsystem performance while Sieve is almost pure CPU. To nobody's surprise, cache is king. I am quite curious how @Phipli's crazy SE accelerator would match up with its fast RAM, though... :)

There are some interesting trends evident, and I'm now rather curious about the performance of the assorted SE/30 video options, as it seems there's more variation than I expected for "dumb" framebuffers. I don't have a great way to ingest data for this (it's all manual), but if you have Speedometer results for 68k systems I'll happily take them.
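
For anyone unfamiliar with what a sieve-style test does, here is a minimal sketch of one. This is not Speedometer's actual code, and the `limit` and `passes` defaults are arbitrary assumptions. The point is only that the whole working set is one small flag array hammered in a tight loop, which is the kind of workload that tends to track CPU and cache speed rather than main memory speed.

```python
import time


def sieve_count(limit):
    """Count the primes <= limit using the Sieve of Eratosthenes."""
    if limit < 2:
        return 0
    is_prime = bytearray([1]) * (limit + 1)   # one flag byte per number
    is_prime[0] = is_prime[1] = 0
    p = 2
    while p * p <= limit:
        if is_prime[p]:
            # Strike out every multiple of p, starting at p squared.
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = 0
        p += 1
    return sum(is_prime)


def time_sieve(limit=8190, passes=10):
    """Run the sieve several times; return (prime count, elapsed seconds)."""
    count = 0
    start = time.perf_counter()
    for _ in range(passes):
        count = sieve_count(limit)
    return count, time.perf_counter() - start
```

Calling `time_sieve()` on two machines and comparing the elapsed times gives a crude CPU-bound comparison, nothing more.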

https://docs.google.com/spreadsheets/d/1JR5cCSfitfoo5JuMKVo3bK2dx5Zx7z-spgEBZkbpEeI/edit#gid=0
 

zigzagjoe

Well-known member
Here are some more benchmarks of interest. Background: I've got a couple of unique socketed accelerators I've created and I've been exploring performance with them.

1698432891565.png
This is a clone of the DiimoCache 030 I made. As always, massive thanks to Bolle for his reverse engineering effort.
The CPU runs at 58MHz (rather than 50), the FPU also at 58MHz (compared to 25MHz), and it has 128KB of cache (64KB stock).

1698432716521.jpeg
This is a clone of the Carrera 040. The CPU is running at 45MHz rather than 40, with the onboard 128KB cache working.

Doom timedemo: apparently very limited by bus access.
1698434074259.png

All Benchmarks.JPG

Note: "carrera 45" has external cache disabled, and internal in write-through mode. "45 wb" has internal 68040 cache in write-back mode, "wb+cache" has both caches. "slow" has a non-optimal clock phase change that slows memory access.
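
As a rough illustration of the difference between the two internal cache modes named in the note above, here is a toy model that counts how many write transactions reach the bus for a stream of stores. It assumes an infinitely large cache with 16-byte lines that is flushed once at the end, and it ignores reads, evictions, and timing entirely. The function name and the model are mine, not anything taken from the cards.

```python
def bus_writes(addresses, mode, line_size=16):
    """Count bus write transactions for a sequence of store addresses.

    Toy model only: write-through sends every store to the bus, while
    write-back only writes each dirty cache line once (at a final flush).
    """
    if mode == "write-through":
        return len(addresses)
    if mode == "write-back":
        dirty_lines = {addr // line_size for addr in addresses}
        return len(dirty_lines)
    raise ValueError("mode must be 'write-through' or 'write-back'")
```

In this model, 100 stores to the same address cost 100 bus writes in write-through mode and a single line write in write-back mode, which is why write-back is normally expected to be the faster setting for writes. It says nothing about why a real card might behave differently.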

I am trying to get some real-world benchmarks together (compiling code, compression, Doom, etc.) to assuage my curiosity. So far System Info's detailed CPU results seem to give the most specific insight into CPU performance. Some interesting tidbits have come out of the mix...
  1. Additional cache on the Diimo 030 helped Doom quite a bit, more than just the raw clock speed increase explains.
  2. 68040 accelerators are very slow at accessing the main system bus, where 68030 accelerators are much faster.
    1. Bolle was able to confirm this behavior on both his Carrera and Daystar 040, as well as a Powercache 030 to compare to.
    2. PowerCache 030 and Diimo have similar bus access characteristics - almost 2x as fast as an 040. The 040 is essentially no faster than the original 16MHz 68030 at accessing the bus.
    3. This results in slow video performance since the 040 can't update video memory quickly. Drawing primitives (lines, etc) is faster since that's limited by the CPU, but block copies (copying a picture from main memory, for example) are slow. Basically, not any faster than the original 030.
    4. So this answers why 030 accelerators with cache feel more responsive than 040s: Faster UI.
  3. Interesting regression: 68040 accelerator w/ both caches (write-back cache and external cache) is slower at memory access than a 68040 with no cache/internal cache in write-through mode
    1. Caches otherwise massively increase performance (as expected).
    2. However, when accessing (presumably writing) to main memory, the caches slow things down.
    3. This is counter to expectations: if anything, one would expect writes to be faster with the write back cache enabled.
  4. Speedometer is not a good overall test of a system. MacBench and System Info are both better.
  5. Diimo uses around 5 watts additional (idle), Carrera uses around 8 watts (idle).
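
To put a number on "more than just the raw clock speed increase explains", the comparison can be sketched as below. The clock speeds come from the posts above (58MHz vs 50MHz for the Diimo clone, 45MHz vs 40MHz for the Carrera clone); the function names are made up for illustration, and the benchmark scores are left as inputs because they have to be read off the screenshots.

```python
def clock_ratio(new_mhz, stock_mhz):
    """Speedup you would expect if performance scaled purely with clock."""
    return new_mhz / stock_mhz


def beyond_clock(score_new, score_stock, new_mhz, stock_mhz):
    """Measured speedup divided by the clock ratio.

    Assumes higher scores are better. A result above 1.0 means something
    other than clock speed (extra cache, for example) is contributing.
    """
    measured = score_new / score_stock
    return measured / clock_ratio(new_mhz, stock_mhz)
```

For the Diimo clone the clock alone gives `clock_ratio(58, 50)`, about 1.16x, and for the Carrera clone `clock_ratio(45, 40)` is 1.125x, so any measured gain noticeably above those figures points at the cache or memory path.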
 

JC8080

Well-known member
Neat cards, thanks for adding to the collection.

Are board files for those available for DIY projects?
 

zigzagjoe

Well-known member

I'm keeping the board files to myself for now. However, I can confirm the schematic images and GAL/PROM dumps Bolle posted in the Diimo and Carrera threads respectively are accurate for anyone else that wants to give these a tinker.
 

JC8080

Well-known member
Here are some MacBench 2.0 benchmarks of my Color Classic with a Sonnet Presto LC 040 accelerator. This is not the Presto Plus, so there is no RAM on the card. This causes a significant bottleneck, since the '040 is accessing the RAM through the 16-bit data bus. I compared the accelerated CC against the same machine without the accelerator (8MB RAM for both tests). The machine is running at the stock resolution; it has not been modded for 640x480. I also added an LC 575 from the included results, since this should accurately represent a Mystic mod. And a IIfx for fun. Not surprisingly, the Presto accelerator is much faster than the stock CC, and considerably slower than the LC 575.

Edit: this is the version of the Presto with a full 68040, not the sans-FPU 68LC040. This difference is apparent in the floating point tests vs. the LC 575.

Click thumbnails for larger images.

Benchmark 1.jpg
Benchmark 2.jpg
 

herd

Well-known member
I always liked this CPU benchmark:


That's nbench, which is derived from BYTE Magazine's BYTEmark benchmark program. It has a long history with results from a wide variety of machines over the years. That implementation compiles on 'nux, but I'm sure some of the C experts around here could build it for native Macintosh. From the posted results, it looks like the 68k CPUs are roughly equivalent to 486 or SPARC chips from around the same time.
 