For kicks, here's some code to copy memory from one location to another:
Code:
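BigLoop:
move.w count, d2 ; inner loop counter (dbra runs count+1 passes)
move.l src, a0 ; source pointer
move.l dst, a1 ; destination pointer
SmallLoop:
move.l (a0)+, (a1)+ ; copy one longword, post-incrementing both pointers
move.l (a0)+, (a1)+
move.l (a0)+, (a1)+
move.l (a0)+, (a1)+
move.l (a0)+, (a1)+
move.l (a0)+, (a1)+
move.l (a0)+, (a1)+
move.l (a0)+, (a1)+
dbra d2, SmallLoop ; 8 longwords = 32 bytes per pass
dbra d1, BigLoop ; outer counter; d1 is assumed to be set by the wrapper code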
Each of the memory copy moves is 8 cycles (each (aX)+ access is 4 cycles), so 8 moves = 64 cycles. The dbra is 10 cycles, which makes 74 cycles for each run through the inner loop. I run through that 8192 times for 256KB, which should be 606208 cycles for the inner loop. Add 10 cycles for the outer dbra and 16 for the initializations, run through the outer loop 1000 times for 250MB, and it should come out to 606234000 cycles. The SE/30 has a 15.6672MHz clock, so the 250MB memory copy should take a theoretical 38.69447 seconds, running with the data cache disabled and all instructions running out of the instruction cache. Note that about 5 seconds of the almost 39s of theoretical running time is spent just in the dbra looping construct.
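If you want to check the arithmetic, here's a quick Python sketch of the same cycle budget. The cycle counts and clock rate are just the figures quoted above, not anything measured, and the names (copy_cycle_budget and the constants) are made up for this example.

```python
# Cycle budget for the copy loop, using the figures quoted in the post.
# All names here are hypothetical; nothing is measured on real hardware.
CLOCK_HZ = 15_667_200      # SE/30 clock, 15.6672MHz
MOVE_CYCLES = 8            # move.l (a0)+,(a1)+ as quoted
DBRA_CYCLES = 10           # dbra as quoted
INIT_CYCLES = 16           # the three initializing moves, as quoted
MOVES_PER_PASS = 8         # eight longword moves per inner pass
INNER_PASSES = 8192        # 8192 * 32 bytes = 256KB
OUTER_PASSES = 1000        # 1000 * 256KB = 250MB


def copy_cycle_budget():
    """Return the theoretical cycle counts and run times as a dict."""
    inner_pass = MOVES_PER_PASS * MOVE_CYCLES + DBRA_CYCLES
    inner_total = inner_pass * INNER_PASSES
    outer_pass = inner_total + DBRA_CYCLES + INIT_CYCLES
    total = outer_pass * OUTER_PASSES
    # Cycles spent only in the inner-loop dbra across the whole run.
    dbra_only = INNER_PASSES * DBRA_CYCLES * OUTER_PASSES
    return {
        "inner_pass": inner_pass,
        "inner_total": inner_total,
        "total": total,
        "seconds": total / CLOCK_HZ,
        "dbra_seconds": dbra_only / CLOCK_HZ,
    }


if __name__ == "__main__":
    for name, value in copy_cycle_budget().items():
        print(name, value)
```

Changing MOVES_PER_PASS (and INNER_PASSES to match) shows how much of the dbra overhead further unrolling would buy back, at least on paper.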
I tried running with interrupts disabled and reading the RTC before & after to measure time, but I couldn't get that working. So I'm running with interrupts enabled, which includes the 60Hz timer for the Vertical Blanking tasks: updating the global Time variable, updating the mouse cursor location global variables (and, if the mouse moves, updating the displayed cursor), etc.
The code above, with wrapper timing code, ran in approximately 42s for me, which would be almost 6MB/s RAM to RAM copy, or a total memory bandwidth utilization of about 12MB/s (every byte is read once and written once). That doesn't seem too bad considering setup overhead, instruction fetch, looping overhead, and interrupt handling. The theoretical figure for the code being run is about 1MB/s more total utilization.
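The throughput numbers work out like this. It's a rough Python sketch that assumes the 250MB total, the 42s measured run, and the 38.69447s theoretical run from above; the function name throughput_mb_per_s is made up for this example.

```python
# Copy rate and total bus utilization for a 250MB RAM-to-RAM copy.
# Names are hypothetical; the inputs are the figures quoted in the post.
MB_COPIED = 250
MEASURED_SECONDS = 42.0
THEORETICAL_SECONDS = 38.69447


def throughput_mb_per_s(seconds, mb_copied=MB_COPIED):
    """Return (copy rate, total utilization) in MB/s.

    Total utilization counts every byte twice: once read, once written.
    """
    copy_rate = mb_copied / seconds
    return copy_rate, 2 * copy_rate


if __name__ == "__main__":
    measured = throughput_mb_per_s(MEASURED_SECONDS)
    theoretical = throughput_mb_per_s(THEORETICAL_SECONDS)
    print("measured    copy %.2f MB/s, total %.2f MB/s" % measured)
    print("theoretical copy %.2f MB/s, total %.2f MB/s" % theoretical)
    print("gap in total utilization: %.2f MB/s" % (theoretical[1] - measured[1]))
```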
Anyway, that's just for kicks. SCSI's performance is pretty well capped due to SCSIMgr if nothing else. Then there's the plethora of drivers depending on how you formatted the drive, and aside from some cursory benchmarking, I don't think anyone has disassembled the various drivers to see what they're actually doing and what the various performance limitations are.
For performance-related projects going forward, avoiding SCSI seems to be the way to go, if for no other reason than to get away from SCSIMgr and the driver problems. Unfortunately, all of the logic for partitions, which are required to make larger media useful given the filesystem limitations of the earlier machines, is wrapped up with SCSIMgr. It would be possible to implement a partitioning scheme in whatever driver ends up being written to handle newer interfaces; it's just more to do.
But that's just performance, which is nothing more than an interesting intellectual exercise, given that the technology found in dumpsters these days is faster than anything you'll get out of an SE/30.