Maximum SiliconExpress IV Throughput

eharmon

Continuing my SiliconExpress experimentation, I've been doing some benchmarks on my Quadra 650 board with ZuluSCSI.

Read
Bus/Card                             Device              Speed
Native SCSI                          ZuluSCSI (RP2040)   4,700KB/s
SiliconExpress IV (8-bit) - 1.6.5    ZuluSCSI (RP2040)   8,123KB/s
SiliconExpress IV (16-bit) - 1.6.5   ZuluSCSI Wide       8,959KB/s
Write results are generally ~30% slower.

I need to try the SCSI 4.3 firmware to see if it's any different.

Wombat boards have a NuBus implementation that leaves a bit to be desired (no double data rate transfers @ 20MHz except between cards). TIL 9305 implies that should give you 8-10MB/s to the logic board and a theoretical 20MB/s out of the logic board (if the destination device could accept block transfers at zero wait). So that's in the ballpark of what we're getting.
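
For a rough sanity check of those figures, here's a quick back-of-envelope sketch. The clocks-per-word numbers are assumptions I picked to land in the band TIL 9305 quotes, not measurements:

```python
# Back-of-envelope NuBus throughput ceilings. The clocks-per-word values are
# assumptions for illustration only; the 8-10MB/s and 20MB/s figures above
# come from TIL 9305, not from this math.

def bus_throughput_mb_s(clock_mhz, bytes_per_word, clocks_per_word):
    """Sustained rate if moving each word costs `clocks_per_word` bus clocks."""
    return (clock_mhz * 1e6 / clocks_per_word) * bytes_per_word / 1e6

# Assumption: a plain single-word NuBus transaction averages ~4 clocks
# (address cycle, data cycle, acknowledge/turnaround) at 10MHz, 32 bits wide.
print(bus_throughput_mb_s(10, 4, 4))  # 10.0 -> the 8-10MB/s band to the logic board

# Assumption: block transfers amortize the address cycle, averaging ~2 clocks
# per word, which is roughly the 20MB/s ceiling out of the logic board.
print(bus_throughput_mb_s(10, 4, 2))  # 20.0
```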

Still, surprisingly rough! So it seems on the earlier Quadras there's only a small boost from moving to 16-bit (~10%).

The official documentation always seemed ambiguous to me about whether later machines really supported 20MHz transfers to the logic board. Has anyone benchmarked a Quadra AV or 6100/7100/8100 with a SiliconExpress?
 
I benched spinning disks ages ago on Jackhammers and the SEIV (I have SEIIs and other NuBus SCSI cards as well). There should be results from back then on the forum if you search.
 
An interesting data point - unfortunately I didn't document much of it - I did exactly what you've done here a couple of years ago with a SCSI2SD v6. What stands out to me is that I got the same results as you through the SEIV, but my native reads were significantly slower than yours at ~2,800KB/s.

I do have an 840av now and still have the SEIV, but I have little spare time for this kind of thing right now...

Here are my old threads on the topic; someone else posted some Q950 and PM8100 benchmarks in the SEIV one:


 
Interesting, so from those benchmarks, you can't really break the 10MB/s barrier on an 8100 either. I also noticed the same dip with a PowerPC card in the Quadra.

The RP2040 Zulu is quite a bit faster than the SCSI2SD v6. I wonder if the card's read-ahead cache makes up for lower transaction performance, which would explain why you saw a bigger jump on the SCSI2SD.

I want to run three more benchmarks:
  • Run the Q650 @ 40MHz. It won't speed up NuBus, but it will tighten memory performance, which might squeeze out a little more speed. I doubt it makes much of a difference.
  • Switch to the SCSI 4.3 firmware. Theoretically this improves burst performance as we're less bottlenecked on the CPU. Again, I doubt it makes much of a difference, but maybe it brings back PPC perf.
  • Assign the card to a Radius Rocket, which should allow direct NuBus '90 transfers. The chip on the SE IV is definitely capable of 20MB/s, so if anything can do it, that should.
 
The earlier Rockets don't support 20MHz transfers; the extra pins to support 2x transfers aren't wired. Rockets all support block transfers and can perform them between cards (even if the host doesn't support block transfers), but it's at the standard 10MHz rate. Maybe the later Stage II Rockets support 2x. You might check the SCSI card too - if those pins aren't wired, then no 2x transfers.
 
Not in front of me right now, but I was able to get ~18MB/sec on a Quadra 840AV with a 10kRPM 16-bit drive attached to an SEIV. I think the 8100/100 and 8100/110 will perform within ballpark. The 6100, 7100, and 8100/80 have an older version of the NuBus controller IIRC which may inhibit performance.
 
The earlier Rockets don't support 20MHz transfers; the extra pins to support 2x transfers aren't wired. Rockets all support block transfers and can perform them between cards (even if the host doesn't support block transfers), but it's at the standard 10MHz rate. Maybe the later Stage II Rockets support 2x. You might check the SCSI card too - if those pins aren't wired, then no 2x transfers.
Yeah, I'm gonna try a Stage II. The docs claim they're wired for 20MHz operation, but I'll take a look at the board!
Not in front of me right now, but I was able to get ~18MB/sec on a Quadra 840AV with a 10kRPM 16-bit drive attached to an SEIV. I think the 8100/100 and 8100/110 will perform within ballpark. The 6100, 7100, and 8100/80 have an older version of the NuBus controller IIRC which may inhibit performance.
Interesting. So maybe they really are faster, or maybe there's an interaction between the SE IV and a ZuluSCSI Blaster. There are a number of improvements in later machines that could explain a performance gain, but it's rather ambiguous:
  • The Quadra AVs have a MUNI controller and the x100s a BART, which should be an updated version, implying the x100s should be more capable, but...you never know.
  • The Quadra AV Developer Note notes "faster data transfer rates to and from the CPU bus" and "NuBus '90 transfers between cards at a clock rate of 20MHz". The x100 Developer Note notes that BART supports "transferring one-cycle or four-cycle transactions".
  • I haven't found a copy of the 8100/110's Developer Note, but the 8100's spec sheet claims "Three internal NuBus expansion slots; the 8100/110 also supports higher-performance burst mode between NuBus cards". Calling it out implies something new, but it still ambiguously claims the performance boost is only "between cards".
  • For all these newer machines, there are quite a few DMA improvements as well, which could reduce overhead.
It's pretty bizarre how poorly documented this is. Even with PCI on the horizon, you'd think faster NuBus would have been worth marketing! I'm inclined to believe the AVs and x100s are faster (and maybe the 8100/110 even more so).
 
Very interesting! How much does the maximum throughput affect user perception? That is, how much does it shave off of boot time or launching an application?

I don't know enough about SCSI, but can you make a call asynchronously to one drive, and then to another drive, and when the results are ready from either drive it will ask for attention? Or is the bus reserved until the first drive returns its result?
 
I haven't timed it (someone else might have stats), but it's definitely noticeable. Not as much as the low latency from solid state, but it helps large programs load.

SCSI is pretty complex. There are interactions between async bus communication, async drivers, and bus disconnect. I'll go first so someone can correct me 😃:
  • Async bus is slower overall but frees the processor. This isn't really valuable when you have a SCSI ASIC as it's already offloaded the processing.
  • However, you really want an async driver, since the OS will wait for data to return regardless of bus communication on a sync driver. I believe this was introduced in SCSI Manager 4.3. Even on a sync bus, the ASIC handles the interaction and DMAs the data back, so you get the sync bus's performance without the drawback.
  • This gets a little funky with the SE IV, as the older (non-4.3) firmware directly drives the card over NuBus and can handle this itself (it's not a SCSI Manager driver at all). I'm not sure how it behaves currently.
  • Finally, devices normally hold the bus while they're processing transactions (as they remain selected). This means you'll get slower speeds if you have a number of devices busy on the same bus, as they conflict - for instance, if you're emulating a few drives at once. Disconnect support means a drive can detach from the bus while it's operating, allowing a transaction to be sent to another drive (there's a toy model of this below). I believe this is a prerequisite for async operation on 4.3.
  • 4.3 is only in ROM on Quadra AV and newer (or PPC upgrades). However, it's in the OS on 7.5+ and supports older devices, and was provided as an extension for developers which you can copy back to 7.1 (or maybe even 7.0). Devices need to be formatted with both a "regular" and 4.3 driver, and it'll swap over while it boots.
  • Since 4.3 added other features (like 16-bit SCSI IDs), older machines can't boot from a device at an ID above 7, though I believe the other drives will mount up once the OS starts.
I think I went on a tangent there...but it's useful stuff to know for boot perf!
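
To make the disconnect point above concrete, here's a toy timing model (not real SCSI Manager code, and the millisecond figures are made up) of a few busy emulated drives sharing one bus:

```python
# Toy model of bus disconnect; the service times are invented for illustration.

bus_transfer = 2.0   # ms actually moving data on the bus per request (assumed)
device_think = 8.0   # ms the device spends preparing data, bus otherwise idle (assumed)

def no_disconnect(n_requests):
    # Each device stays selected for its whole request, so everything serializes.
    return n_requests * (device_think + bus_transfer)

def with_disconnect(n_requests):
    # Devices release the bus while "thinking", so think time overlaps across
    # drives and only the bus transfers serialize (ignoring arbitration cost).
    return device_think + n_requests * bus_transfer

print(no_disconnect(3))    # 30.0 ms when each request holds the bus
print(with_disconnect(3))  # 14.0 ms when think time overlaps
```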
 
Throughput doesn't help much for user perception. You can compare it to SSDs today: faster random read access with flash storage is far and away the most important thing for user experience, so even eMMC is perceived to be much faster than an HDD. In a vintage context, throughput was really only necessary if you needed to ingest (or output) a lot of data quickly for realtime video/audio production or needed increased storage capacity.

SCSI Manager 4.3 supports asynchronous access and AFAIK adds command queuing too, given drive support. So, to your question: possibly, if everything was fully implemented on both the application and driver side. There's a dev note on SCSI Manager 4.3 that you might find interesting. In the real world, you'd want to avoid contention by keeping your boot/application disk on the system bus and putting the data drives only on the fast bus.

While I was working on NuCF, I had a test case with slow timings that brought maximum sequential throughput down to about the same as the internal SCSI. Even with that in place it was still perceptually much faster, due to the improved random access performance of the CF and bypassing the slow SCSI stack. Most user code is going to be requesting random small bits of data rather than the big chunks where sequential performance would really pay off / be noticeable.
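
To put rough numbers on the random-vs-sequential point (all of these are assumptions, not benchmarks - just illustrating the shape of it):

```python
# Illustrative launch-time model: n_reads scattered reads of read_kb each.
# latency_ms is per-request seek/overhead; throughput_kb_s is sequential speed.
# Every number below is assumed for illustration, not measured.

def load_time_s(n_reads, read_kb, latency_ms, throughput_kb_s):
    seeks = n_reads * latency_ms / 1000.0
    streaming = n_reads * read_kb / throughput_kb_s
    return seeks + streaming

# Assume an application launch touches 400 scattered 8KB chunks.
print(load_time_s(400, 8, 15.0, 4700))  # ~6.7s: spinning disk, mostly seek time
print(load_time_s(400, 8, 0.5, 4700))   # ~0.9s: solid state on the same slow bus
print(load_time_s(400, 8, 15.0, 9000))  # ~6.4s: nearly double the bus speed, same seeks
```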
 
Interesting. So maybe they really are faster, or maybe there's an interaction between the SE IV and a ZuluSCSI Blaster.
I think it's owed to an architectural improvement... the same drive/SEIV on a Q950 reached only about 8MB/s (closer to 9MB/s when overclocked to 40-45MHz or running a 100MHz DayStar 601).
 
Devices need to be formatted with both a "regular" and 4.3 driver, and it'll swap over while it boots.

Wow. I assumed there would be some flags or a driver call that would ask "Hey, can you run in 4.3 mode".

since the OS will wait for data to return regardless of bus communication on a sync driver.

Ok. That's the heart of my line of questioning. If the OS is really waiting, then a faster bus will make more of a difference to the user's perception, as the bus response time is a bottleneck. Whereas if the OS is able to do other things in parallel, then a faster bus will only matter somewhat, as theoretically the CPU is busy doing other stuff.

Imagine a file being unstuffed. If the decompression algorithm can happen in parallel with the SCSI writes, then faster SCSI writes may not greatly improve the overall time. However, if the OS effectively stalls the CPU until SCSI responds, then faster SCSI would make a big difference.
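
To put toy numbers on that (both figures invented):

```python
# Invented figures for the unstuffing example.
decompress_s = 12.0  # CPU time spent decompressing (assumed)
write_s = 9.0        # time spent in SCSI writes (assumed)

print(decompress_s + write_s)      # 21.0s if the OS stalls during each write
print(max(decompress_s, write_s))  # 12.0s if writes can overlap the decompression
```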

Thank you for the detailed response.
 
Wow. I assumed there would be some flags or a driver call that would ask "Hey, can you run in 4.3 mode".
FWIW, a driver can support both in a single binary, but it's basically two drivers in one: one old API, one new API. IIRC 4.3 drivers are always required to be backwards compatible (or at least, the documentation instructs you to make them so).

If you have an Apple_Driver43 partition, a 4.3-supporting driver is installed.
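
If you want to check an image for that, here's a quick sketch that walks the Apple Partition Map of a raw 512-byte-block disk image and prints the partition types. The offsets follow the standard partition map layout as I remember it, so double-check against Inside Macintosh before trusting it:

```python
# Sketch: list partition types in a raw Apple Partition Map disk image and
# report whether an Apple_Driver43 partition is present. Assumes 512-byte
# blocks and a plain raw image; offsets are from the standard APM layout.
import struct
import sys

BLOCK = 512

def partition_types(path):
    with open(path, "rb") as f:
        f.seek(BLOCK)                # first partition map entry lives in block 1
        entry = f.read(BLOCK)
        sig, _pad, map_entries = struct.unpack(">2sHI", entry[:8])
        if sig != b"PM":
            raise ValueError("no Apple Partition Map signature found")
        types = []
        for i in range(map_entries):
            f.seek(BLOCK * (1 + i))
            entry = f.read(BLOCK)
            ptype = entry[48:80].split(b"\0", 1)[0].decode("ascii", "replace")
            types.append(ptype)
        return types

types = partition_types(sys.argv[1])
print(types)
print("4.3 driver installed:", "Apple_Driver43" in types)
```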
 