
SCSI2SD Project - anyone interested ?

I did a quick speed test with a friend of mine with the STM32F4 with SD card connected using hardware 4-bit mode. We seemed to get 10 megabytes per second, DMA mode.

You can get these with ridiculous amounts of GPIO in reasonably easy-to-handle LQFP packages.

 
That is wicked fast. With the code being in C, isn't it possible to port/compile using a GCC toolchain for another CPU? Well, there might be obvious timing issues with bit-banging the SCSI.

 
Depends on how much it relies on the specific features of the PSoC. If it's just bit-banging the GPIO, it should be doable, yeah. The STM32F4 core runs at 168MHz and has 80MHz GPIO...

The Mac SCSI controller would likely be significantly slower than this, though.

 
From my research, a non-SCSI Manager 4.3 compliant interface on a Mac maxes out at ~500 kB/s, and that is only under perfectly optimal conditions. So, anything that puts out ~0.5 MB/s will fully saturate the older Mac SCSI interface. I can't speak to anything that is SCSI Manager 4.3 compliant or newer.

 
Agreed. Mmcmaster, I hope you haven't given up on this project due to that bottleneck! Even if it's not super speedy, I think there's still a ton of value in an SD-based SCSI drive. It would make sharing files between multiple old Macs, or between old Macs and new computers, so much easier. Way better than a CompactFlash solution, or any kind of IDE adapter to a real hard drive, in my opinion.

 
I'm all in on the project too! I've ordered a batch of PCBs, and hope to assemble my own. I'm also on the reserved list for an assembled version, so I can have a known good one to compare against. :) I'm under the impression I should be able to use my Segger jlink with the proper cabling to program this, and I have the 20pin to 9 (well 10) pin adapter. Hopefully I can program it with what I've got, if not I'll grab one of the Cypress programmers.

Just having a programmable device that speaks SCSI gives me all kinds of ideas.

From my research, a non-SCSI Manager 4.3 compliant interface on a Mac maxes out at ~500 kB/s, and that is only under perfectly optimal conditions.
Not entirely accurate. SCSI Manager before 4.3 did not provide for asynchronous SCSI operation. Meaning, when you issued a read from one device, SCSI Manager was blocked until that returned. You couldn't have multiple operations in flight and couldn't queue multiple SCSI operations. Additionally, on the early machines like the Plus, there was no hardware acknowledgement, so when that synchronous request was outstanding, the CPU sat polling the SCSI controller to see if it was done. Later machines (including the SE/30, I believe) had hardware acknowledgement that allowed the CPU to continue doing whatever it needed to do, and would be interrupted by the SCSI controller when the request completed. Since requests are synchronous, the throughput varies significantly based on transfer size. Since the block device driver interface on Mac OS operated on 512-byte blocks, a lot of the transfers ended up being 512 bytes at a time, which would be about the worst case scenario, and it is unlikely you could get much above 500KB/s. However, if you bump the request size up to about 256KB, I have gotten 1.5MB/s on an SE/30. There were a couple of file copy speedup system extensions that would bump up the Finder's file copy buffer so it could issue larger read/write requests, and speed things up a bit.
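
The way request size drives synchronous throughput can be sketched with a simple fixed-overhead model. The per-request overhead and raw transfer rate below are assumptions, picked only so that the model roughly reproduces the two figures quoted above (about 500KB/s at 512 bytes, about 1.5MB/s at 256KB). They are not measurements from any Mac.

```python
def sync_throughput(request_bytes, overhead_s, raw_bytes_per_s):
    """Effective bytes/second when every request blocks until it completes.

    Each request pays a fixed setup cost (overhead_s) plus the time needed
    to move its payload at the raw transfer rate.
    """
    time_per_request = overhead_s + request_bytes / raw_bytes_per_s
    return request_bytes / time_per_request


# Assumed values, not measurements: chosen to roughly match the post.
OVERHEAD_S = 0.00068    # hypothetical per-request overhead (~0.68 ms)
RAW_RATE = 1_500_000    # hypothetical raw transfer rate in bytes/second

if __name__ == "__main__":
    for size in (512, 4096, 32 * 1024, 256 * 1024):
        rate = sync_throughput(size, OVERHEAD_S, RAW_RATE)
        print(f"{size:>7} bytes/request -> {rate / 1000:8.1f} kB/s")
```

With these assumed numbers the fixed overhead dominates at 512 bytes and becomes negligible at 256KB, which is the shape of the behaviour described above.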

This is a bit of a tangent, but another tidbit on synchronous SCSI is: Finder and other programs knew SCSI (and even floppy disk) operations were synchronous and broke their own rules about IO. This is why Cloud 7, my HTTP block driver, has problems on earlier System 7 machines. Some background: Mac OS drivers can be called either synchronously or asynchronously. You're supposed to always implement your code as asynchronous, and then the synchronous call would just wait for the asynchronous operation to complete. The reason for this is you can detect how you were called, but you have no idea if someone up the call stack was called in a different fashion. If you're being invoked from an interrupt handler, for instance, issuing a synchronous operation would generally be bad, since the entire system's event loop will hang until your synchronous operation completes. Of course Apple got away with cheating here on things like LocalTalk, since they would hack around a bit to make sure the cursor still moved, serial data from the other serial port didn't get dropped while interrupts were disabled, etc. For example, doing a synchronous file read or write operation from an interrupt handler could be particularly bad since the caller will issue the request to the device driver, then block in the interrupt handler waiting for the request to complete. If the device driver then needed to do an asynchronous operation, that operation would never complete because the system's event loop isn't being serviced while you're blocked in an interrupt handler, and you'd get a system lockup. However, if the device driver did not have any other dependencies, and could synchronously process the request, you could get away with it.

And this is exactly what early System 7 Finders did. As part of the disk mounting process, they would issue a synchronous file operation from a previous operation's completion routine, which happened to be running in an interrupt handler, and it would Just Work, because the SCSI Manager (and floppy) calls were synchronous. And it mostly worked too, since floppies and SCSI devices were the only disks available.
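
The lockup described above can be sketched as a toy simulation. Everything here is hypothetical (the `EventLoop` and `Driver` classes, the `in_interrupt` flag, the `max_spins` limit) and does not model the real Mac OS Device Manager. It only shows why a synchronous call layered on an asynchronous one hangs when nothing is servicing completions. A real system would spin forever; the sketch gives up after `max_spins` so that it terminates.

```python
from collections import deque


class EventLoop:
    """Toy stand-in for whatever normally completes asynchronous I/O."""

    def __init__(self):
        self.pending = deque()
        self.in_interrupt = False

    def post(self, completion):
        self.pending.append(completion)

    def service(self):
        """Run one queued completion; nothing runs while in an interrupt."""
        if self.in_interrupt or not self.pending:
            return False
        self.pending.popleft()()
        return True


class Driver:
    """Hypothetical block driver whose reads depend on the event loop."""

    def __init__(self, loop):
        self.loop = loop

    def read_async(self, on_done):
        # The data only "arrives" once the event loop services the request.
        self.loop.post(lambda: on_done(b"block"))

    def read_sync(self, max_spins=1000):
        # Synchronous call built on the asynchronous one: start the
        # request, then wait for its completion routine to fire.
        result = []
        self.read_async(result.append)
        for _ in range(max_spins):
            if result:
                return result[0]
            self.loop.service()
        raise RuntimeError("lockup: completion routine never ran")
```

Called normally, `read_sync()` returns the data. With `in_interrupt` set, the completion never runs and the call fails, which mirrors the hang a driver with asynchronous dependencies would hit.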

 
The first 10 PCBs are on their way from smart-prototyping.com, all parts ordered from DigiKey, and I already have 20 microcontrollers direct from Cypress. I should have the first Rev 3.0 board built early next week for testing.

Did I miss somewhere what one of these beta units might cost? ;)
Check the very first post :)

I've ordered a batch of PCBs, and hope to assemble my own. ... I'm under the impression I should be able to use my Segger jlink with the proper cabling to program this, and I have the 20pin to 9 (well 10) pin adapter.
When did you order the PCBs? I made a small change to the layout to add 1 more trace to enable JTAG on the 10-pin header as well as SWD programming. See commit at http://www.codesrc.com/gitweb/index.cgi?p=SCSI2SD.git;a=commit;h=aa5e83b9db6a9b4775ead1ffb03675d8e8da7d3b

Depends on how much it relies on the specific features of the PSoC. If it's just bit-banging the GPIO, it should be doable, yeah. The STM32F4 core runs at 168MHz and has 80MHz GPIO...
Porting the current software-based solution to any other 32-bit microcontroller should be quite trivial, but I expect all future updates will make porting harder. I'm about to commit an update that changes from bit-banging SCSI in software to using the PSoC UDBs to implement parity generation and the SCSI data tx/rx sections. I'm starting to think it is technically possible to implement the proprietary 4-bit SD interface using the PSoC UDBs, but patents/licensing restrictions will stand in the way.
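
For readers wondering what "parity generation" computes, here is a minimal software sketch. It is not the UDB implementation mentioned above, and the function names are made up for illustration. It relies only on the fact that SCSI uses odd parity across the 8 data lines plus the parity line.

```python
def scsi_parity_bit(data_byte):
    """Return the odd-parity bit for one 8-bit SCSI data byte.

    The parity line is driven so that the nine lines together
    (8 data + 1 parity) carry an odd number of ones.
    """
    if not 0 <= data_byte <= 0xFF:
        raise ValueError("expected a value in 0..255")
    ones = bin(data_byte).count("1")
    return 0 if ones % 2 else 1


def parity_ok(data_byte, parity_bit):
    """Check a received byte against the state of its parity line."""
    return scsi_parity_bit(data_byte) == parity_bit
```

Doing this per byte in software costs CPU time on every transfer, which is presumably part of the appeal of moving it into hardware blocks.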

 
Alrighty, I am down for a board set, too. PM sent, mmcmaster.

So many questions! And a suggestion-type thought or two as well.

mmcmaster:

• Did you use a commercial hardware development board for the 5LP, or did you go straight to prototyping? If so, which one?

• Would you consider shrinking a future rev of this board down to fit into / connect to a Powerbook 2.5" SCSI bay? It should be possible, I think, to have mounting holes for both 2.5" SCSI or 3.5"+power on the one layout. If not, a short adapter or ribbon cable to the larger sockets could make up the difference. It's easier to adapt a smaller device to go into a larger bay than the reverse :p

Consider, if you will, that the only existing replacement for these drives sells for >$100, without media.

(If I recall correctly - 63.5mm max width, 50 pins @ 0.1" DIL, including power)

• How many pins, if any, are unused on the PSoC? It's a 72 pin module IIRC. Would you consider bringing any unused pins out to headers for future uses? And/or routing the SD connections to them as well, for people who may not want to use an SD slot. This would turn your board into *both* a replacement flash drive *and* a general purpose SCSI development system.

• Disk cache - is there enough RAM on the PSoC to implement cache? If not, could it be added as off-chip/on-board RAM via DMA?

(I'll note here that the PSoC is able to address 4GB ...)

Just having a programmable device that speaks SCSI gives me all kinds of ideas.
Does it ever }:) It's a damn shame that Cypress haven't included USB Host on the PSoC series, but nevertheless... ideas ideas ideas...

Re CF/IDE vs SD - just off the top of my head, it seems like getting faster read/writes might be a reason to go to CF. Is there an existing UDB for IDE/ATA (or PCMCIA, which is adaptable to CF)?

 
In terms of speed - as far as I remember, the earliest Macs are theoretically capable of ~1.5MB/s, later 68ks and PPCs of 5MB/s, with a few exceptions like the dual-channel 9500(?) having one 10MB/s channel, and Nubus SCSI controllers topping out at 20MB/s.

The latter involves moving to Fast Wide (16 bit) SCSI though, which would eat further into the pin budget on those devices.

So my next question is - how many PSoC pins are tied up just in SCSI handling at the moment?

 
• Did you use a commercial hardware development board for the 5LP, or did you go straight to prototyping? If so, which one?
I went straight to prototyping. When I started, the freesoc project didn't exist. If I were starting again, I would probably try to use that board first.
Would you consider shrinking a future rev of this board down to fit into / connect to a Powerbook 2.5" SCSI bay?
Yes, I would consider that. The width of the 50 pin SCSI header and molex power adaptor is driving the width of the current design. A smaller SCSI connector would enable a much thinner design.

Currently the width is not enough to suit the underside 3.5" mounting holes. I decided it's cheaper to print a bracket to suit a drive bay than it would be to make a PCB slightly wider than 10cm. The PCB prototype houses I use charge a premium for an "up to 15cm" board vs an "up to 10cm" board. I still haven't fully tested the bracket - still waiting on an M3 tap to create the screw threads.

• How many pins, if any, are unused on the PSoC? It's a 72 pin module IIRC. Would you consider bringing any unused pins out to headers for future uses? And/or routing the SD connections to them as well, for people who may not want to use an SD slot. This would turn your board into *both* a replacement flash drive *and* a general purpose SCSI development system.
That is a great idea. I'll work on making that happen. There are 12 spare pins.

• Disk cache - is there enough RAM on the PSoC to implement cache? If not, could it be added as off-chip/on-board RAM via DMA?
I could create a small 8 KB cache, possibly 16 KB, but I'm not convinced that the extra software complexity is worth the marginal performance gains. There are insufficient external pins to use a fast off-chip SRAM of any meaningful size. If there were, there is an external memory interface accessible via DMA: http://www.cypress.com/?rID=56752

 
and Nubus SCSI controllers topping out at 20MB/s.
The latter involves moving to Fast Wide (16 bit) SCSI though, which would eat further into the pin budget on those devices.

So my next question is - how many PSoC pins are tied up just in SCSI handling at the moment?
36 pins: 9 data bits (including parity) and 9 control pins. The 18 signal lines are connected directly. Another 18 pins are used for output via a 7406 hex inverter.

It is possible to add a few more glue ICs and halve the number of pins being used on the microcontroller, which could enable a 16bit transfer. But the current bottleneck is reading/writing from the SD card anyway. I've timed the SCSI interface at 1.5MB/s in my latest commit, and I suspect that the PCI SCSI controller is artificially limiting it to that rate.
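
The pin budget above can be checked with a little arithmetic. This is only a sketch under stated assumptions: that a 16-bit bus adds 8 data lines plus 1 parity line, that "halving" means one microcontroller pin per signal, and that all 12 spare pins mentioned earlier in the thread could be given to SCSI. None of that is confirmed by the board's author.

```python
NARROW_SIGNALS = 9 + 9             # 9 data bits (incl. parity) + 9 control, per the post
WIDE_SIGNALS = NARROW_SIGNALS + 9  # assumption: 8 more data lines + 1 more parity line
GATES_PER_7406 = 6                 # "hex" inverter: six gates per package
PINS_AVAILABLE = 36 + 12           # assumption: current SCSI pins plus the 12 spare pins


def pins_needed(signals, pins_per_signal):
    """Microcontroller pins used when each signal takes pins_per_signal pins."""
    return signals * pins_per_signal


def packages_needed(outputs, gates_per_package=GATES_PER_7406):
    """Inverter packages required to drive the given number of outputs."""
    return -(-outputs // gates_per_package)  # ceiling division


if __name__ == "__main__":
    print("8-bit, 2 pins/signal: ", pins_needed(NARROW_SIGNALS, 2))
    print("16-bit, 2 pins/signal:", pins_needed(WIDE_SIGNALS, 2))
    print("16-bit, 1 pin/signal: ", pins_needed(WIDE_SIGNALS, 1))
    print("7406 packages for 18 outputs:", packages_needed(NARROW_SIGNALS))
```

Under these assumptions a 16-bit bus would not fit at two pins per signal but would fit at one, which is consistent with the remark that extra glue ICs could make a 16-bit transfer possible.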

 
So I guess you're out of pins to even look at a CF/IDE interface at this point? (unless you can economise on the SCSI end as you suggest.)

 
the freesoc project
Now at http://moeller.io/ , for anyone curious.

The Moeller pages offer a much clearer explanation for newbs (like me) of what the PSoC system is all about than Cypress' own pages.

There's also an even cheaper kit direct from Cypress now, for the PSoC 4 - which notably takes Arduino shields and Digilent PMODs.

http://www.cypress.com/?rID=77780

a future rev of this board down to / 2.5" SCSI
Yes, I would consider that.
That is excellent. There is definitely pent-up demand for a reasonably-priced 2.5" replacement.

I decided it's cheaper to print a bracket to suit a drive bay
If the new board matches a 2.5" drive, there are of course existing 3.5" mounting brackets, making that no longer your problem.

bringing any unused pins out to headers for future uses?
That is a great idea. I'll work on making that happen. There are 12 spare pins.
Splendid.

 
I've been doing some more SD card performance benchmarking recently, and I've found that "class 10" or whatever doesn't really tell the whole story, at least not for small transfers using the SPI interface. On my class 10 PNY card, I can get about 266 kBytes/sec when writing 12KByte blocks, but on an older SD card that predates speed ratings (a SanDisk Ultra II), I can get about 520 kBytes/sec. I suspect it would be even faster with a larger block than 12K. And for single-sector (512 byte) random writes, the difference is even larger: about 9ms per sector on the class 10 PNY card, vs about 2ms per sector on the SanDisk Ultra II.
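
To put the per-sector timings on the same scale as the block-write figures, the latencies can be converted to throughput. This is a small sketch that assumes 1 kByte = 1000 bytes and that sectors are written strictly one after another with no overlap.

```python
SECTOR_BYTES = 512


def kbytes_per_second(bytes_written, seconds):
    """Throughput in kBytes/sec, taking 1 kByte as 1000 bytes."""
    return bytes_written / seconds / 1000


if __name__ == "__main__":
    # Per-sector latencies are the approximate figures from the post.
    for card, latency_s in (("class 10 PNY", 0.009), ("SanDisk Ultra II", 0.002)):
        rate = kbytes_per_second(SECTOR_BYTES, latency_s)
        print(f"{card}: about {rate:.0f} kBytes/sec for single-sector writes")
```

That works out to roughly 57 kBytes/sec against 256 kBytes/sec, so both cards are well below their 12KByte block-write rates when writing one sector at a time.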

I think the difference is mostly due to the card's erase page size, which isn't something that's generally advertised, but it can be queried from the SD card. Generally speaking, a lower capacity card should have a smaller minimum erase size, and so should offer better performance for small writes like we'll typically need for retro Mac projects. I've ordered a couple of old SD cards with capacities below 1GB, to see how they stack up in terms of performance.

By querying the cards' status registers, I was able to discover that the class 10 PNY card has an erase size of 64 kByte, while the SanDisk Ultra II has an erase size of 32 kByte.
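
One rough way to see why erase size could matter for small writes is to count how many 512-byte sectors share a single erase unit. The sketch below assumes a worst case where a single-sector write forces the card to rewrite a whole erase unit. Real card controllers are not documented in this thread and may behave quite differently.

```python
def sectors_per_erase_unit(erase_bytes, sector_bytes=512):
    """Number of sectors that share one erase unit."""
    if erase_bytes % sector_bytes:
        raise ValueError("erase size should be a multiple of the sector size")
    return erase_bytes // sector_bytes


if __name__ == "__main__":
    # Erase sizes are the figures reported in the post.
    for card, erase_kb in (("class 10 PNY", 64), ("SanDisk Ultra II", 32)):
        count = sectors_per_erase_unit(erase_kb * 1024)
        print(f"{card}: {count} sectors per {erase_kb} kByte erase unit")
```

Under that assumption the PNY card would touch twice as much flash per single-sector write. That 2:1 ratio is smaller than the reported 9ms vs 2ms gap, so erase size may not be the only factor.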

Long story short: I'd suggest you try experimenting with a few different sizes and brands of cards, especially some smaller capacity cards, and see if you can get higher write speeds with your hardware.

 
Got the project to compile on my computer. Looks like several of the header and source files are referenced to an absolute path on your system, so I modified the .cyprj file to have relative paths for them. If you'd like a patch I can send one to you however is convenient...otherwise, it's simple enough to fix in a text editor with find/replace :-)

 
I've been doing some more SD card performance benchmarking recently, and I've found that "class 10" or whatever doesn't really tell the whole story
I've tested four cards now, and my 8GB class 10 "high speed" card was actually the worst performer for small sized random writes. It looks like smaller capacity cards really are faster, at least for these purposes.

 
Got the project to compile on my computer. Looks like several of the header and source files are referenced to an absolute path on your system, so I modified the .cyprj file to have relative paths for them. If you'd like a patch I can send one to you however is convenient...otherwise, it's simple enough to fix in a text editor with find/replace :-)
Thanks dougg3. It was easier to just do the find/replace myself in this instance. The updated project file has been committed.

 