SCSI Disk perf testing

bbraun

Well-known member
The subject of SCSI-IDE-CF and other alternatives to aging SCSI HDDs gets brought up regularly, and inevitably performance is discussed. Tests are performed, posted, and conclusions are drawn, perhaps without an understanding of what exactly is happening between the benchmark app and the hardware. I'd like to share my understanding of what's going on, in an effort to provide a framework to help when drawing conclusions from posted performance test results. Each benchmark app is different, and each test within a benchmark app can interact with the system differently.

I am by no means an expert on the subject. This reflects my current understanding which is (hopefully) always evolving.

When an application accesses a disk in the 68k MacOS world, the read/write request from the application can be an arbitrary size. This request goes through the File Manager which handles accesses to specific Files on various types of File Systems, on Volumes, which reside on Drives. Each of these capitalized words is a structure in memory within the File Manager. Each Drive has a Driver associated with it, which is what handles the communication with the underlying storage medium (SCSI disk for our purposes).
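As a concrete illustration, an application-level read is just a call like the sketch below (names from the classic Files.h interfaces; the refNum is assumed to come from an earlier FSOpen). The application can ask for any byte count it likes; it's the File Manager's job to turn that into block-aligned driver requests.

#include <Files.h>

static OSErr ReadArbitrarySize(short fileRefNum)
{
    char buffer[1000];
    long count = sizeof(buffer);     /* 1000 bytes: not a multiple of 512 */

    /* File Manager resolves this through the File/Volume/Drive structures
     * and issues block-aligned requests to the drive's driver. count comes
     * back as the number of bytes actually read. */
    return FSRead(fileRefNum, &count, buffer);
}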

In the case of SCSI disks, when the system boots, each SCSI ID on each SCSI bus is probed for presence and the first sector of the device is read (SCSIRead sector 0), looking for a valid Driver Descriptor Record. This is done using the SCSI Manager to talk directly to each SCSI device. The DDR contains information about the device, and where/how to find the device driver used to access the device. There can be multiple device drivers (for example, SCSIManager and SCSIManager 4.3 device drivers), and the DDR describes where to find them all and which is preferred.
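For reference, the DDR in block 0 looks roughly like this. This is my abbreviated rendering of the Block0 structure from Inside Macintosh; I've collapsed the repeated driver entries and trailing padding into comments:

struct DriverDescriptorRecord {
    unsigned short sbSig;       /* always 0x4552 ('ER') for a valid DDR */
    unsigned short sbBlkSize;   /* device block size, typically 512 */
    unsigned long  sbBlkCount;  /* number of blocks on the device */
    unsigned short sbDevType;   /* reserved */
    unsigned short sbDevId;     /* reserved */
    unsigned long  sbData;      /* reserved */
    unsigned short sbDrvrCount; /* number of driver entries that follow */
    unsigned long  ddBlock;     /* starting block of the first driver */
    unsigned short ddSize;      /* driver size, in 512-byte blocks */
    unsigned short ddType;      /* driver type (which OS/Manager it's for) */
    /* ...additional (ddBlock, ddSize, ddType) entries for the other
     * drivers, then padding out to the full block... */
};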

Once the device driver is loaded, the partition table is read, and Drive and Volume structures are created for the File Manager associated with the loaded device driver, and the Volumes get mounted.

The File Manager manages a block cache for devices to avoid going out to disk if you're reading the same (set of) block(s) repeatedly. This helps with performance, but can interfere with performance testing. AFAIK, most performance tests disable or bypass this cache, but being aware of it is crucial.
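If you want your test to sidestep the cache explicitly, the low-level read call takes a hint for that. A minimal sketch, assuming the classic Files.h interfaces; whether the hint is honored can vary by system version, so verify against your target System:

#include <Files.h>
#include <string.h>

static OSErr ReadBypassingCache(short refNum, void *buf, long len, long offset)
{
    ParamBlockRec pb;

    memset(&pb, 0, sizeof(pb));
    pb.ioParam.ioRefNum    = refNum;
    pb.ioParam.ioBuffer    = (Ptr)buf;
    pb.ioParam.ioReqCount  = len;
    pb.ioParam.ioPosMode   = fsFromStart | noCacheMask; /* ask File Mgr not to cache these blocks */
    pb.ioParam.ioPosOffset = offset;
    return PBReadSync(&pb);
}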

Additionally, the File Manager is accessing the disk through the filesystem. So, any overhead involved with filesystem access will be included, such as dealing with fragmented filesystems, unbalanced B-trees in HFS, etc. The state of the filesystem can potentially introduce anomalies in performance testing.

The device driver is a critical performance piece, since it can vary from device to device, and it controls how the SCSI device is accessed. In MacOS, block devices (such as disks) are supposed to be accessed in multiples of the block size (512 bytes). So a request of 8KB is fine, but a request for 5 bytes is not; it must be padded out to the block size.
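In other words, a driver-level byte count has to be rounded up, something like:

#define kBlockSize 512L

/* Round a byte count up to the next multiple of the 512-byte block size:
 * PadToBlockSize(5) == 512, PadToBlockSize(8192) == 8192. */
static long PadToBlockSize(long requestBytes)
{
    return ((requestBytes + kBlockSize - 1) / kBlockSize) * kBlockSize;
}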

A SCSI HDD driver will then use the SCSIManager to send read and write requests through the SCSI controller, out over the bus, and to the device. This means the device driver controls how a driver-level request gets translated into SCSI commands. Things such as which of the several different SCSI Read commands is used, how many blocks to request in a single transaction, etc. are all up to the device driver and can significantly affect performance.
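For the old SCSI Manager, the driver's side of a read looks roughly like the sketch below, using the classic SCSI.h calls (SCSIGet/SCSISelect/SCSICmd/SCSIRead/SCSIComplete). Error handling is trimmed, and the blocks-per-command choice is exactly one of those knobs the driver gets to pick:

#include <SCSI.h>

static OSErr RawRead6(short targetID, unsigned long lba,
                      unsigned char numBlocks, Ptr buffer)
{
    unsigned char cdb[6];
    SCSIInstr     tib[2];
    short         stat, message;
    OSErr         err;

    cdb[0] = 0x08;                      /* READ(6) */
    cdb[1] = (lba >> 16) & 0x1F;        /* LBA high bits (LUN 0) */
    cdb[2] = (lba >> 8) & 0xFF;
    cdb[3] = lba & 0xFF;
    cdb[4] = numBlocks;                 /* blocks per transaction: driver's choice */
    cdb[5] = 0;

    /* Transfer Instruction Block: move numBlocks * 512 bytes, then stop. */
    tib[0].scOpcode = scInc;
    tib[0].scParam1 = (long)buffer;
    tib[0].scParam2 = (long)numBlocks * 512;
    tib[1].scOpcode = scStop;
    tib[1].scParam1 = 0;
    tib[1].scParam2 = 0;

    err = SCSIGet();                    /* arbitrate for the bus */
    if (err == noErr) err = SCSISelect(targetID);
    if (err == noErr) err = SCSICmd((Ptr)cdb, sizeof(cdb));
    if (err == noErr) err = SCSIRead((Ptr)tib);

    /* The bus is tied up from SCSIGet() until SCSIComplete() returns:
     * this is the synchronous behavior discussed below. */
    SCSIComplete(&stat, &message, 300); /* wait up to 300 ticks (5 s) */
    return err;
}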

For pre-4.3 SCSIManager, requests to a device were synchronous. This means that when a write operation was issued, no other operations to the device could be issued while waiting for the device to return the status of that write operation. In SCSIManager 4.3, asynchronous requests were supported, allowing multiple requests to be in flight on the bus at once. For example, if a write request came in, followed by a read request, under pre-4.3 SM the read request would be queued up by the device driver until the write request returned. Under SM 4.3, the write request could be issued, followed immediately by the read request, and the read could potentially complete before the write, yielding much greater throughput.

When I say synchronous here, I'm referring to commands to a device. SCSIManager offers asynchronous APIs, but that does not mean the request is issued asynchronously or concurrently on the SCSI bus. That just means the caller of the API doesn't block until the command returns, which is important since you still want your app processing UI events while talking to the disk.
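Under SCSI Manager 4.3, the same read becomes a parameter block handed to SCSIAction(), with a completion routine instead of blocking. The outline below is from my reading of the 4.3 SCSI.h; I haven't double-checked every field and constant name (scsiCompletion's exact type in particular), so treat it as a sketch of the shape of the API rather than working driver code:

#include <SCSI.h>
#include <string.h>

static SCSIExecIOPB gPB;   /* must stay allocated until completion fires */

/* Completion routine: SCSI Manager 4.3 calls this when the I/O finishes. */
static void MyCompletion(void *pb)
{
    /* check ((SCSIExecIOPB *)pb)->scsiResult, kick off the next request... */
}

static OSErr StartAsyncRead6(DeviceIdent device, Ptr buffer, long length)
{
    memset(&gPB, 0, sizeof(gPB));
    gPB.scsiPBLength     = sizeof(gPB);
    gPB.scsiFunctionCode = SCSIExecIO;      /* execute a SCSI I/O request */
    gPB.scsiDevice       = device;          /* bus/target/LUN triple */
    gPB.scsiCompletion   = MyCompletion;    /* don't block; call me when done */
    gPB.scsiFlags        = scsiDirectionIn; /* data-in transfer */
    gPB.scsiDataPtr      = (UInt8 *)buffer;
    gPB.scsiDataLength   = length;
    gPB.scsiCDBLength    = 6;
    gPB.scsiCDB.cdbBytes[0] = 0x08;         /* READ(6); LBA/length filled as before */
    gPB.scsiTimeout      = 5000;            /* milliseconds */

    /* Returns as soon as the request is queued; more requests can be
     * issued immediately, which is where the throughput win comes from. */
    return SCSIAction((SCSI_PB *)&gPB);
}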

Additionally, this means you could bypass the device driver and use SCSIManager yourself to directly access the SCSI device. This would bypass the File Manager and device driver, eliminating variables in performance testing. The problem here is that your test would either have to implement the filesystem support the File Manager provides in order to perform the test in a non-destructive fashion, or you'd just have to assume the disk is blank and you can directly read & write anywhere without regard to existing data (like file systems).

This type of testing wouldn't really provide any real-world insight into how the device would perform with HFS for instance, but it would help eliminate the variables of File Manager and the device driver for the purposes of testing bus and device throughput and more importantly, latencies.
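For example, a crude raw-bus throughput test (with the destructive-access caveats above) could just wrap the RawRead6() sketch from earlier in a timed loop. TickCount() only resolves to 1/60 of a second, so it needs enough transfers to produce a measurable interval:

#include <Events.h>

static void TimeRawReads(short targetID, Ptr buffer)
{
    const long          kTransfers   = 512;
    const unsigned char kBlocksPerCmd = 16;  /* 8 KB per command */
    unsigned long       startTicks, elapsedTicks;
    long                i;

    startTicks = TickCount();
    for (i = 0; i < kTransfers; i++)
        RawRead6(targetID, (unsigned long)i * kBlocksPerCmd,
                 kBlocksPerCmd, buffer);
    elapsedTicks = TickCount() - startTicks; /* 60 ticks per second */

    /* bytes/sec = kTransfers * kBlocksPerCmd * 512 * 60 / elapsedTicks */
}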

So, that brings us from the Application down to the SCSI bus.

The SCSI-IDE adapter will advertise a supported set of SCSI commands that the host (and device driver) may or may not utilize, and in the case of pre-4.3 SM, they probably don't. The SCSI-IDE adapter then translates the SCSI commands to ATA commands to be issued over a PATA (IDE) bus. This translation takes a non-zero amount of time, contributing to latency in requests. Latency is crucially important with the pre-4.3 SCSIManager since requests are synchronous: no other SCSI requests can be processed until the previous request completes, so the longer a request takes, the lower the throughput. For SCSIManager 4.3, request latency's effect on throughput is less of an issue, since multiple requests can be in flight simultaneously, providing much better throughput on higher-latency buses. I have not measured how much latency is added by the SCSI command translation on various devices, how it compares to the latency of SCSI HDD controller processors, or how much this matters in practice. From my higher-level testing, it seems to make an extremely small difference vs. 10k and 15k RPM SCSI drives. Device drivers and the IDE device selection make a far more significant difference, but it is worth noting.
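To make the synchronous case concrete: with one command on the bus at a time, throughput = bytes_per_command / (latency + transfer_time), so per-command latency eats directly into throughput. The numbers below are made up purely for illustration:

#include <stdio.h>

int main(void)
{
    double bytesPerCmd  = 8192.0;                /* 16 blocks x 512 bytes */
    double busRate      = 5.0e6;                 /* 5 MB/s bus, illustrative */
    double transferSecs = bytesPerCmd / busRate; /* ~1.6 ms on the wire */
    double latencyMs;

    /* Synchronous model: nothing else moves until this command completes,
     * so added latency divides straight into the achievable rate. */
    for (latencyMs = 0.1; latencyMs <= 3.2; latencyMs *= 2.0)
        printf("latency %.1f ms -> %.2f MB/s\n", latencyMs,
               bytesPerCmd / (latencyMs / 1000.0 + transferSecs) / 1.0e6);
    return 0;
}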

If you're using a SCSI-SATA or SATA-IDE adapter, the same awareness of latencies in command/signalling translation, and of how that latency affects your (SCSIManager's) usage, is just as applicable.

I'm going to leave the CompactFlash part of the description for another time...

Supported transfer modes, flash controllers, wear leveling, etc. are a huge discussion in their own right.


johnklos

Well-known member
I'd suggest that latencies are not an issue. Considering that the slowest of the Acard adapters can do ultra speeds (20 MB/sec) and I've seen 7726Q devices do in excess of 100 MB/sec, the total latency is probably unmeasurable on m68k hardware. Any measurable latency would make it very difficult to attain those speeds.

Seeing how two CF cards can run at completely different speeds across five different environments bears out that much of it may be down to what modes the cards, and the devices connecting the cards, can attain. I've tested SSD-SCSI, CF-SATA-SCSI, CF-SCSI direct (in a nice SCSI card reader: http://a4000t.com/store/index.php?main_page=product_info&cPath=65_79&products_id=195), CF-Addonics-IDE-SCSI, and CF-Addonics-IDE. A fast card goes as fast as the computer's SCSI bus, and a slower card is noticeably slower even on a modest machine.


bbraun

Well-known member
Agreed. In general, when comparing throughput on pre-4.3 SCSI Manager Macs against the fastest SCSI disks available (which have their own latencies), my testing would need a much larger set of samples to tell if the differences are statistically significant; it's that close. However, when trying to account for bus cycles, understanding the behavior of SCSIManager and latency is important.

And also 100% agreed on CF cards making a huge difference, the single largest difference in the whole setup IMO.
