ZaneKaminski

ROMBUS - 64 MB flash interface for Mac Plus


Hi 68kMLA,

 

Maybe some people here remember my ARM-based "Maccelerator" proposal from a few years ago. I have cost-reduced the BOM for that project significantly and plan to get the hardware released soon, but that's not for today. Since the Maccelerator proposal, I have been working with a friend of mine, Garrett Fellers, and we have been selling an Apple IIGS memory expansion under the Garrett's Workshop brand. People seem really pleased with that card, so we are trying to get some more of my designs out there. I have a ton of vintage product designs in the backlog which I really wanna release. My post today is about one of them, which we call ROMBUS (or GW4101). It's a 64 MB flash disk interface for Macintosh Plus, replacing its two socketed DIP ROM chips:

[Image: ROMBUS-B.thumb.png]

 

The original inspiration for the project was Big Mess O' Wires' Mac ROMinator. When I heard it was being discontinued, I tried to come up with a worthy replacement. ROMBUS implements a 16-bit-wide interface to four SPI flash memory chips. The idea is to get the flash chips into quad read/write mode, in which four bits can be read or written at once from each of four flash memories, thus making a 16-bit interface. ROMBUS interfaces with Mac Plus via the Mac's two ROM sockets. To store a patched toolbox ROM with the flash driver, there are two sockets on ROMBUS which can each accommodate a 512 kilobyte flash ROM, making a total of 1 MB of parallel flash ROM. The Mac Plus has 128 kB of ROM, but can address 256 kB through the sockets, so bank-switching is required to access the total 1 MB capacity. Because the R/W signal is not sent to the ROM socket, writing to the 64 MB serial flash and 1 MB parallel flash is accomplished by bank selection.
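As a rough sketch of the arithmetic in the paragraph above (not the actual CPLD logic): four flash chips each contribute one 4-bit nibble to a 16-bit word, and 1 MB of parallel flash behind a 256 kB window gives four banks. The function names and the choice of which chip drives which nibble are assumptions made only for illustration.

```python
# Hypothetical model: four quad-SPI flash chips, each supplying one 4-bit
# nibble per clock, combined into one 16-bit word on the data bus.
# The nibble ordering (chip 0 = most significant) is an assumption.

def combine_nibbles(nibbles):
    """Pack four 4-bit values (chip 0 first) into one 16-bit word."""
    if len(nibbles) != 4:
        raise ValueError("expected one nibble from each of four chips")
    word = 0
    for nib in nibbles:
        if not 0 <= nib <= 0xF:
            raise ValueError("each chip contributes exactly 4 bits")
        word = (word << 4) | nib
    return word


def split_word(word):
    """Inverse of combine_nibbles: the nibble each chip would receive."""
    return [(word >> shift) & 0xF for shift in (12, 8, 4, 0)]


# Bank arithmetic from the post: two 512 kB parallel flash ROMs behind the
# 256 kB window that the Plus can address through its ROM sockets.
PARALLEL_FLASH_BYTES = 2 * 512 * 1024
ROM_WINDOW_BYTES = 256 * 1024
NUM_BANKS = PARALLEL_FLASH_BYTES // ROM_WINDOW_BYTES
```

With these numbers `NUM_BANKS` comes out to 4, which is why bank selection is needed to reach the whole 1 MB.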

 

I have had boards for this project in hand for 6 months or so, and the CPLD programming is done too, save for any tweaks or bug fixes:

[Image: IMG_0848.thumb.JPG]

Actually, this is the second revision (GW4101B). The first revision (GW4101A) is basically electrically identical, so the same driver and CPLD programming will work on both, but the flash memory land patterns are wrong, so the GW4101A can only accommodate 16 MB of serial flash despite the board advertising itself as a "32 MB Disk."

 

I will be releasing the design files for this product soon, under some kind of open-source license, probably GPL, so anyone can make (or sell) their own, or make improvements. We (at Garrett's Workshop) have just finished our SMD assembly line, and we will be selling the card for $40 USD, shipped to the US. Not sure if many will purchase it, since BMOW says his ROMinator was not too popular, but the price strikes me as fair, and I am pleased to have eliminated the need to jumper the R/W signal and the couple of address signals like you have to in order to install the ROMinator. We have 20 of the 64 MB boards, but I want to gauge the interest to see if I oughta order more in preparation for a product launch. 

 

Now, what I need help with is the driver. I have looked at the drivers for the ROMinator and BBraun's previous work, but the way to access the flash memory is different, plus there oughta be a wear-leveling scheme to minimize erase time overhead and maintain the endurance of the flash. (I have a fair solution for the wear-leveling but it takes 256 kB of memory for a 64 MB disk.) My hope is that some other skilled members of the community can assist with the driver development, and then the whole thing can be put onto GitHub for others to build themselves, improve, learn from, etc. And of course we will sell the boards for $40 each, which we think is fair.
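To put the 256 kB figure in perspective, here is a small back-of-the-envelope calculation. It assumes the wear-leveling table is organized per 512-byte sector, which the post does not actually say, so treat the per-sector numbers as a guess at the layout rather than a description of the scheme. The 4 MB RAM size is the typical Plus configuration mentioned later in the thread.

```python
# Back-of-the-envelope numbers for the wear-leveling table.
# ASSUMPTION: the table has one entry per 512-byte sector; the post only
# states the total size (256 kB for a 64 MB disk).

DISK_BYTES = 64 * 1024 * 1024
SECTOR_BYTES = 512
TABLE_BYTES = 256 * 1024
PLUS_RAM_BYTES = 4 * 1024 * 1024  # typical fully expanded Plus

NUM_SECTORS = DISK_BYTES // SECTOR_BYTES

# Table budget per sector, if the table really is per-sector.
BYTES_PER_SECTOR_ENTRY = TABLE_BYTES // NUM_SECTORS

# Bits needed to give every sector a unique number.
BITS_TO_NUMBER_ALL_SECTORS = (NUM_SECTORS - 1).bit_length()

# Share of a 4 MB machine's RAM taken by the table.
RAM_FRACTION = TABLE_BYTES / PLUS_RAM_BYTES

print(NUM_SECTORS, BYTES_PER_SECTOR_ENTRY, BITS_TO_NUMBER_ALL_SECTORS, RAM_FRACTION)
```

Under that assumption the budget is 2 bytes per sector, while numbering all 131072 sectors takes 17 bits, so a per-sector map would need some extra trick (or a coarser granularity) to fit.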

 

So, does anyone want this? Is it fair to ask for assistance on this when we are trying to make some money on sales of the product? And if so, who can assist a little with the driver development? Just pointing me in the right direction in terms of what development tools to use, environment setup, etc. would be really appreciated. Of course, free hardware will be provided to the top contributors. Looking forward to hearing what everyone thinks.

 

Edited by ZaneKaminski


@ZaneKaminski This looks interesting; I’d buy one or two once you're at that point.  Love the idea of this.  Also, I contacted Garrett asking if he would be interested in helping with our LC to SE/30 PDS adapter project. Once that is complete we can integrate WiFi.  I’m happy to offer payment. If you could chat with him about that, I’d appreciate it.


I'm interested - you'd significantly broaden your market if it worked in a 128K, 512K ... or SE?


I will proceed more quickly then since there is interest!

 

Hopefully someone with some experience in driver development for the classic Mac OS can come along. I've read Inside Macintosh, but, unless I missed something, there isn't much about how to install a new driver in the system, though there is a lot about what kind of calls can be made to a disk driver. Also, what kind of development environment is best? I am not sure if I should be trying to develop the driver in Mini vMac or if I should go the Unix route and build it in Linux/OS X with a cross-compiler. Does anyone have any experience with this?

 

 

@LaPorta, it's sort of like a ROM disk but it's also writable, so that makes it an SSD. Sorry if my first post was a little unclear or technobabbley, sometimes I get a little too deep into this stuff. Since the Mac Plus doesn't have the capability for an internal hard disk, I wanted to make a fast internal disk that would allow a Plus to boot up an install of System 6 with lots of apps and games and stuff. A ROM disk is okay but it would be even better to be able to change the disk from the Finder in the normal way.

 

@Byrd I eyeballed it and thought it would work in a Macintosh SE, but unfortunately the SE has its ROM sockets slightly further apart than the Plus, something like 0.025" further from each other, so a new board would be required for it to fit properly in an SE. But it's of course doable. The 128k and 512k have a different ROM spacing too, but that's not the biggest problem. The difficult thing is that their RAM is so limited. The driver I had planned needs to store a total of 256 kB of data. Most Pluses have 4 MB of RAM so that's not too bad in exchange for such a fast boot disk, but obviously the earlier Macs can't do it.

 

@maceffects Yes, you and I talked as well. I was going to do a quick board for you to do the requisite address mapping to access the LC PDS card properly in an SE/30, but it sort of slipped my mind as other things came up. You guys should skip right to making a board. PCBs are cheap these days, especially if visual quality isn't critical, as is the case with a prototype. Many PCB vendors will sell you 10 boards of the size you need for $10, then there's shipping which might even be $10-15 more, but hey, if your design works, you have a working, reproducible prototype really easily. I never spend much time in the prototyping stage. Instead I read the manuals and study the existing products that are similar to my design to verify that I'm on the right track. And if your design is wrong, just cut and jumper to make it work, then change that in the PCB and spend another $20 to get the board remade. It's so much easier than wire-wrapping or whatever. If you design a PDS adapter board, I'll check it over and include it in my next board fabrication order. My plate is really full so I can't really take it on. I just announced five new Apple II cards on AppleFritter (not much traction over there though), so I've gotta work on finishing those.

Edited by ZaneKaminski


@ZaneKaminski Thanks for the information.  We are actually stuck on the video issue in slot $E.  Making a simple adapter is not possible; we must overcome that issue first.  I know other cards did overcome the issue, but we haven't figured it out.  Maybe @Bolle and @Trash80toHP_Mini can chime in on the specifics holding the project back.  Should you have any time to assist with this project, I'm happy to offer a considerable amount for your time, you'd be surprised :cool:

 

AppleFritter is great but has a small audience.  Have you tried the forum at http://vintage-computer.com/? It's a good place, as is the Apple II Enthusiasts Facebook page.  I know there are others, but I am just learning about the Apple II community from my Apple II case project.


So, to be clear I don't intend this as a criticism, but I am curious: if the storage on the 64MB flash you're adding is accessed like a "disk", what is the advantage of your design compared to hanging, say, an interface to an SD card controller off the ROM sockets? From an API standpoint SD cards present a "SCSI-like" storage interface and you get whatever rudimentary wear-leveling the manufacturer sticks on the card for free.

 

(Flip side is it would probably be painfully slow if you had the Mac running the card via SPI itself through minimal glue, although, honestly, if it at least presented itself as a buffered 8-bit port and didn't require the CPU to bit-bang everything it may well be as fast as the Plus' built-in SCSI controller.)


@Gorgonops Yes, there is a good reason to use SPI flash instead of an SD card. The goal was to make the fastest disk possible. If I want to move 16 bits off of an SD card, I need to have a clock going faster than the CPU transfer rate to serialize the data, plus more macrocells and routing resources in the CPLD to buffer the data. And since no clock signal is sent to the ROM socket, I’d have to have an even faster clock on the card and then synchronize the inputs coming from the Mac. So the SPI flash is easier since you can reasonably put four on the board and work them all in parallel to make a 16-bit port, at the expense of a smaller capacity and the need to wear-level.

 

So the aim is to bit-bang but to be able to read data at maximum speed once the proper bits have been twiddled to submit the command and address. This struck me as the easiest way to do that.

Edited by ZaneKaminski


@LaPorta Thank you, but there’s no need! I’m pretty sure this hardware is gonna work, and I’ve been sitting on boards and parts for a while. I’ve gotta make the driver, verify the sort of low-level hardware programming in the big chip, and then make the boards. Little additional expense is involved.

 

@Gorgonops One more thing on the subject of clock signals and timing: there is a somewhat unusual element (some would even say it’s bad practice) to this board in how it generates the clock signal to be sent to the SPI flashes. For read operations, data must be clocked out of the flash sort of in the middle of the read cycle. What I mean is that the 68000 asserts /AS and /LDS/UDS and then that in turn causes the ROMs to be selected. Somehow, as a consequence of the single falling edge of the ROM’s select signal, a clock pulse with a width of 10ns or so must be sent to the flash. This is generated by a little RC network in conjunction with the CPLD. The RC network is carefully chosen to meet the minimum pulse width spec of the flash as well as the maximum rise time spec of the CPLD. The need for this self-timed circuit is another consequence of not having the clock signal at the ROM socket; otherwise I’d have to have a faster oscillator always running on the board, and synchronize the inputs. It’s not clear that such a design would fit in the CPLD. 128 macrocells sounds like a lot, but there are fan-in limitations, so in my experience a bus-oriented design rarely achieves full macrocell utilization compared to some random logic-type functions.

1 hour ago, ZaneKaminski said:

@Gorgonops One more thing on the subject of clock signals and timing, there is a somewhat unusual element (some would even say it’s bad practice) to this board insofar as how it generates the clock signal to be sent to the SPI flashes...

Okay. Well, I mean if it looks like it's going to work I'm in no position to criticize. I do have to admit something does make me just a *little* leery, though; I googled up an application note for these quad-SPI flash chips, and this is what I got:

 

https://www.st.com/content/ccc/resource/technical/document/application_note/group0/b0/7e/46/a8/5e/c1/48/01/DM00227538/files/DM00227538.pdf/jcr:content/translations/en.DM00227538.pdf

 

The application note talks about running these chips in parallel; it doesn't have an example of four in parallel, but it does have one for two. This diagram has me scratching my head:

 

[Image: diagram from the application note]

Maybe I'm missing something, but the implication I get from this is that the way the memory cycle works with these things is that when you "go wide" with multiple packages it still expects each chip to send and receive a full "byte", i.e., the data cycle is going to send two sets of four bits across the I/O lines as a unit. I don't find anywhere in this manual where it talks about the chip acting like it's only composed of four-bit "nibbles" which are individually accessible. So if you have four of them in parallel (i.e., "16 bits" worth of quad-SPI lines) won't you actually be pushing 32 bits per data cycle? Not that it would be something you couldn't handle, since you're planning on driving this with software on the 68000 instead of depending on this to just "transparently" look like ROM or a disk.

Anyway, I think mostly what I had in mind when I suggested the SD thing, given the aforementioned limits on the signals you have access to on the socket, was implementing the port in the form of a "mailbox" buffer between the Mac and a self-clocked MCU that actually handled the communication with the SD card. (Or, actually, in this scenario an MCU that can act as a USB host might be better? Something like a PIC 24FJ64GB00x?) That'd let it run asynchronously from whatever the Mac is doing; all you'd need is a register to communicate a "busy/ready" flag. Ultimately I think something like that would be stuck with performance somewhere in the same ballpark as the Mac Plus' built-in SCSI controller. (Which also communicates entirely by polling in that machine.) So maybe there's no point. (Might be a fun upgrade for a 512k?)

 

I'm not sure off the top of my head what would really benefit from "lightning fast" bulk storage throughput in a Plus but it's certainly an unfilled niche. So by all means go for it.


@Gorgonops Yes, although each device reads out 4 bits at once, you have to specify a byte address to each flash, and then there are four in parallel, so yeah, an address as sent to the SPI flash chips refers to a 32-bit data word. But like you said, it doesn't matter, since we can just write the driver to handle the addressing correctly, plus I believe the API we need to implement concerns itself with 512-byte sectors anyway. My aim with this project is to have a really quick boot time on a Plus, even faster than from floppy or a SCSI disk, and insignificantly slower than the ROMinator-type ROM disks.
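The addressing described above can be sketched as follows. This is only an illustration of the arithmetic, assuming the 64 MB is split evenly across the four chips and that sector data is striped so that one flash byte address covers four bytes of disk data; the function and constant names are made up for the example and do not come from any real driver.

```python
# Hypothetical sector-to-flash-address arithmetic.
# ASSUMPTIONS: four identical chips sharing one address, data striped so
# each flash byte address corresponds to 4 bytes (32 bits) of disk data.

SECTOR_BYTES = 512
NUM_CHIPS = 4
TOTAL_BYTES = 64 * 1024 * 1024
CHIP_BYTES = TOTAL_BYTES // NUM_CHIPS          # 16 MB per chip
BYTES_PER_FLASH_ADDRESS = NUM_CHIPS            # one byte from each chip
NUM_SECTORS = TOTAL_BYTES // SECTOR_BYTES


def sector_to_flash_address(sector):
    """Byte address sent to every chip for the start of a disk sector."""
    if not 0 <= sector < NUM_SECTORS:
        raise ValueError("sector out of range")
    return sector * SECTOR_BYTES // BYTES_PER_FLASH_ADDRESS


def word_reads_per_sector():
    """Number of 16-bit reads the 68000 makes to fetch one sector."""
    return SECTOR_BYTES // 2


def flash_addresses_per_sector():
    """How far the per-chip address advances across one sector."""
    return SECTOR_BYTES // BYTES_PER_FLASH_ADDRESS
```

Since 512 is a multiple of 4, every sector starts on a 32-bit boundary, so sector-sized transfers never run into the alignment question.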

 

The MCU approach is very workable, but I wanted to minimize the number of software pieces for this project. More software always means more bugs, whereas it's easier to iron things out with hardware in my opinion. I also wanted to eliminate the need for the Mac to busywait or poll a register in the course of a read operation. (For write operations, the driver must poll the SPI flash to see when the operation is complete, but I believe the time required for that is less than the seek time of a typical SCSI drive, so it's not too bad.)

 

If you're interested, I did use the MCU approach recently on two Apple II cards for which I have just finished the hardware, "Mouserial" and "Library Card."

 

Mouserial implements an Apple II mouse card with a PS/2 mouse interface. There is an AVR microcontroller on the card in conjunction with a CPLD. The CPLD's main function, other than some decoding and generation of select signals, is to implement a few dual-port registers. The AVR runs at 7 MHz, synchronous with the rest of the Apple II, and frequently polls the CPLD's dual-port registers to check if any new commands have been deposited in the command register by the 6502. The AVR interleaves polling the dual-port registers with bit-banging PS/2 such that the registers are polled whenever the AVR must wait in the context of bit-banging PS/2. If there is a new command, the AVR services it and updates the result and status registers in the CPLD. Meanwhile, the 6502 in the Apple II polls the status register to wait for completion of the command. This way is good for low-bandwidth applications, and the implementation of the dual-port registers in the CPLD was made easier by the fact that I could run everything from the same clock. The CPLD always latches write data into the dual-port registers at the end of PHI0, whereas the AVR has an extra wait state and the CPLD makes sure not to write the AVR's data at the same time that the 6502 is reading or writing. (That's sort of a simplification but it provides the gist of it)
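For readers unfamiliar with the mailbox pattern, here is a minimal single-threaded model of the command/result/status handshake described above. It is a simulation for illustration only: the class and method names are hypothetical, and none of the clocking or bus-arbitration details of the real card are modeled.

```python
# Toy model of a command/result/status mailbox between a host CPU and an
# MCU. All names are hypothetical; timing and arbitration are not modeled.

class MailboxModel:
    def __init__(self):
        self.command = None   # written by the host, read by the MCU
        self.result = None    # written by the MCU, read by the host
        self.busy = False     # status flag the host polls

    def host_post(self, command):
        """Host deposits a command and marks the mailbox busy."""
        self.command = command
        self.busy = True

    def mcu_poll(self, handler):
        """MCU checks for a pending command; services it if present."""
        if self.busy and self.command is not None:
            self.result = handler(self.command)
            self.command = None
            self.busy = False
            return True
        return False

    def host_wait(self, mcu_step, max_polls=1000):
        """Host polls the status flag, letting the MCU run between polls."""
        for _ in range(max_polls):
            if not self.busy:
                return self.result
            mcu_step()
        raise TimeoutError("MCU never completed the command")
```

In use, the host calls `host_post`, then `host_wait`, while the MCU side calls `mcu_poll` whenever it has idle time, which mirrors the interleaving with PS/2 bit-banging described in the post.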

 

Library Card interfaces the Apple II to an ESP32-based WiFi/Bluetooth module. The ESP32 has dual cores and can run at 240 MHz, so I did the bus interfacing differently on this one. In the CPLD, I buffered the data bus and part of the address bus to the ESP32, as well as sent it a select signal. Since the ESP32 is fast and has dual cores, it is just barely possible to poll a select signal in software and input/output data just fast enough to satisfy the 6502's timing. The polling loop has to be run all by itself on the second core, with no scheduler or OS getting in the way. That allows any arbitrary interface to be implemented. Honestly, I don't really know what I should be doing with this card in terms of drivers and software, so I thought that this type of busy-waiting implementation would be the most flexible, at the expense of using a whole core of the ESP32.

 

So the MCU approach isn't foreign to me. I just figured the dumber way is best. Also, a lot of people had this criticism of my ARM-based "Maccelerator" proposal, that it was too new, what with the ARM and the emulation and all, not to mention complicated. I wanted to keep this thing as simple as possible.

Edited by ZaneKaminski


@maceffects Oops, somehow I missed your comment. I don't wanna be paid anything upfront... I don't like doing business that way. It would be much better to collaborate on a gizmo and then we can split the profits. Does the PDS card you are trying to run do DMA? If not, you can unidirectionally invert a single address bit so as to change the slot address using a 74HCT04 (or AHCT04, avoid 74HC04, 74AHC04, 74AC04, 74ACT04, 74LVC04).

If the card does DMA, that is to say that it drives an address onto the bus, thus commanding an access to memory in the Mac, you must send the address back to the Mac unchanged. You can do this and the inversion very easily with a GAL, I think. Well, the so-called GAL has been discontinued by Lattice, but you can get another version of the same thing, ATF16V8, from most IC suppliers and this will solve your problem if you can manage to write some logic equations to describe the functionality I just described. Use PALASM in DOSBox to do this. I'll make the board, assemble it, and program the GAL for you if you do the design work.
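The behavior the GAL would need can be modeled in a few lines. This is just a truth-function sketch of the idea above, not PALASM equations; which address bit to invert depends on the slot mapping, so `invert_bit` is a parameter and the value used in the example is arbitrary.

```python
# Model of the address translation between the Mac and the PDS card.
# ASSUMPTION: exactly one address bit is inverted toward the card; the
# bit number is a placeholder, not taken from any real slot mapping.

def address_seen_by_card(mac_address, invert_bit):
    """Mac is bus master: invert one address bit on the way to the card."""
    return mac_address ^ (1 << invert_bit)


def address_seen_by_mac(card_address, invert_bit):
    """Card is bus master (DMA): the address goes back to the Mac unchanged."""
    return card_address


def translate(address, invert_bit, card_is_master):
    """Single entry point selecting the direction of the transfer."""
    if card_is_master:
        return address_seen_by_mac(address, invert_bit)
    return address_seen_by_card(address, invert_bit)
```

For a card that never does DMA, only the first function matters, which is why a plain inverter is enough in that case.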

 

@Gorgonops I read your comment again and maybe now I see what you mean about the 32-bits per cycle. One access to the flash register location generates a single clock pulse (low-high-low), so the only limitation is that you can only begin reading on 32-bit-aligned boundaries. There is no PLL or anything like that, so you can send the clock pulses as irregularly or far away as you want, as long as you keep to the 5ns or so minimum clock high time. So if you wanna load an odd 16-bit word from the flash, you twiddle the bits in software to send the command and address the same way as for reading the preceding even 16-bit word. The difference is that you must perform one additional read and discard the result, thus ignoring the even word. Then the 68000 can read data out of the same register location at 16 bits per one read access to the ROM, same as in the 16-bit-even/32-bit-aligned case. It's all bit-banging, but insofar as the command and address are short compared to a 512-byte sector, it's as fast as ROM.
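The odd-word case above can be sketched with a small software model. The `FlashPortModel` class stands in for the flash register location and is purely hypothetical; the real driver would bit-bang the command and address and then read a fixed location, none of which is modeled here.

```python
# Toy model of reading 16-bit words from the flash port, where a read
# sequence can only begin on a 32-bit (two-word) boundary.
# FlashPortModel is a hypothetical stand-in for the hardware register.

class FlashPortModel:
    def __init__(self, words):
        self.words = list(words)  # flash contents as 16-bit words
        self.pos = 0

    def start_read(self, word_index):
        """Stands in for sending the read command and address."""
        if word_index % 2:
            raise ValueError("reads must start on a 32-bit boundary")
        self.pos = word_index

    def read_word(self):
        """One access to the register: one clock pulse, one 16-bit word."""
        word = self.words[self.pos]
        self.pos += 1
        return word


def read_words(port, first_word, count):
    """Read `count` words starting at any word index, even or odd."""
    aligned = first_word & ~1
    port.start_read(aligned)
    if first_word != aligned:
        port.read_word()  # dummy read: discard the preceding even word
    return [port.read_word() for _ in range(count)]
```

The only cost of an odd start is the one discarded read, which is negligible next to the command and address setup.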

Edited by ZaneKaminski


@ZaneKaminski Thanks for your reply.  I'd also be interested in a collaboration with sharing of the sales of the part.  The reason I suggested a direct payment was that, in addition to wanting to develop such a card, I have 1,500 (you read that right) brand new LC Ethernet cards.  But I'm sure some arrangement could be made to make it great for everyone.  With SE/30 Ethernet cards alone bringing $150, it would be a really good place to start.  Unfortunately, I don't have much electrical engineering knowledge; I'm more on the business and hobbyist side of things.  @Bolle created an SE/30 > Cache/Pass-through card already, so we just need to figure out how to make it work with LC PDS Pass-through.  @Trash80toHP_Mini has been doing much of the technical work including mapping.  @Trash80toHP_Mini what do you think of the method he mentioned regarding our issue?  I do have an electrical engineer who can actually do the design work once we have on paper what we need.  Unfortunately, I'm stuck at this point.

 

I read and replied on AppleFritter: the WiFi & Bluetooth card would be amazing, and I'd buy some of those.  The SE ROM replacement sounded interesting as well, especially if it can be programmed.  If that is possible, I'd like to buy a whole pile of those!  You guys have some really great designs!  Places like Reactive Micro can also be distributors for the Apple II cards if you reach a deal with them.  That could allow more sales for your cards if that's what you're interested in.  Anyway, keep up the great work.

2 hours ago, maceffects said:

@Bolle created an SE/30 > Cache/Pass-through card already, so we just need to figure out how to make it work with LC PDS Pass-through.  @Trash80toHP_Mini has been doing much of the technical work including mapping.  @Trash80toHP_Mini what do you think of the method he mentioned regarding our issue?  I do have an electrical engineer who can actually do the design work once we have on paper what we need.  Unfortunately, I'm stuck at this point.

PCB sounds good, but I only need to wrap a little over 60 connections. There's a technically oriented thread about the project that I haven't located. If the boffins figure out which lines go where to the inverter, I'm good to go. Currently I'm looking at integrating a second jumper block to let me feed the addressing from either direction to the inverter socket.

 

@ZaneKaminski Have boards large enough for a 120 pin connector come down to the pricing levels you're quoting? 10cmx10cm is cheap, but you need a bare minimum of 12cm width for the 030 PDS connector. I threw together a wire wrap version because I have everything on hand from a previous project. Once I have the directional spec for the address signals I can complete my prototype in a couple of hours, three with testing.

 

Curious about ROMBUS, could you do it on a ROM SIMM for the SE/30? If we're doing an LC PDS NIC to 030 PDS adapter, you could easily jumper R/W and any number of signals from LC NIC adapter to ROMBUS SIMM? With an oscillator a/o GAL on board the NIC adapter you phase in whatever clock you'd like for the internal workings of a ROMBUS SIMM. Overcoming lame disk I/O over SCSI is the last frontier for SE/30 exploration.

Edited by Trash80toHP_Mini

