
Serious proposal: accelerator and peripheral expansion system



I have reworked my design. The major changes are:

- Processor board (Snapdragon 410E) sourced externally, not designed by me
- FPGA has been eliminated
- No USB 3.0
- Has microSD slot on accelerator board
- Small CPLD implements interrupt priority control
 
Here are the block diagram pictures for SE and SE/30. There is a lot more detail about the bus signals now... I've figured that aspect out pretty fully.
post-6543-0-60357600-1478630194_thumb.png
post-6543-0-30092300-1478630211_thumb.png
 
The benefits are lower cost, faster processing, and less special software that must be written.
 
I decided to go with the Snapdragon 410E, but it was impossible for me to get on a board at a reasonable price. Luckily, Variscite sells the Dart SD410 module, which has the 410E, the associated power management IC, 1 GByte of LPDDR3, 8 GBytes of eMMC flash, and WiFi, Bluetooth, GPS, and FM radio capability (external antenna needed).
 
The SD410 module will be mounted on the board with two connectors that are about $1.50 each. It’s also half the size of a DDR2 SODIMM.
 
I’ve eliminated the FPGA and now am using the SD410’s GPIO and some latches and level-shifters to implement the bus interface.
 
Since the FPGA was supposed to be a QFP-type part, not BGA, this and the smaller processor module will allow us to greatly reduce the size of the main board.
 
I think that the Snapdragon can run a custom-rolled version of Linux. The details of the Snapdragon 410E are so complicated, secret, and proprietary that if we ran our code on the bare metal or rolled our own RTOS, we would miss out on many capabilities and on the benefits of fully developed driver software.
 
The WiFi and Bluetooth functionality on the SD410 module are two obvious things we’d be missing without using Linux. Beyond that, the use of Linux will allow individuals who are more experienced in Linux system administration, rather than system design, programming, EE, etc., to contribute and make the Maccelerator better.
 
Now, in order to run Linux, the accelerator process which actually executes the code must run as some kind of driver so that it can directly manipulate the GPIO pin control I/O registers without needing to perform a context switch. Either that, or the emulator can run in userspace and just the bus stuff can run as a driver. The latter is the correct way to do it, but I’m trying to achieve the best performance possible with the amount of time and money I can devote to the project. We will see.
 
Since we’re running Linux, there is less of a need for the I/O board, especially if we can get the SD410’s WiFi and Bluetooth working. The I/O board also is a bit mechanically difficult, in terms of the amount of room available for it. Rather than the I/O board, a much cheaper option would be to just run USB to the back of the Mac. Nonetheless, I/O boards will still be supported.
 
Anyway, here’s a rough bill of materials (for SE/30):
Variscite Dart SD410 (Processor Board)     1 x $57   = $57
Hirose DF40C-90DS-0.4V (SD410 conn.)       2 x $1.50 = $3
Euro-DIN 120 (PDS Connector)               1 x $7.50 = $7.50
Atmel SAMD10D14 (System Controller)        1 x $2    = $2
Lattice LC4032ZE (IRQ Controller)          1 x $1    = $1
microSD slot                               est.      = $2.50
16-bit latch                               4 x $0.50 = $2
16-bit level shift                         6 x $1.50 = $9
1-bit level shift                          3 x $0.50 = $1.50
32.768 kHz crystal oscillator              1 x $0.50 = $0.50
Power stuff (L, bypass C, V. regs.)        est.      = $5
PCB                                        est.      = $15
 
This stuff totals $106. This is without a doubt an underestimate of the final cost (assembly, shipping, packaging, etc. not included), but we should be able to get it out the door for under $150, as long as 15 or so people want one. I will work on making it even cheaper.
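For anyone who wants to check the arithmetic, here is a quick sketch that sums the line items above. The part names and prices are copied from the list; the dictionary layout and variable names are only for illustration, and the "est." items are entered as a quantity of 1.

```python
# Rough SE/30 bill of materials: name -> (quantity, unit price in USD).
# Items listed only as estimates are entered as quantity 1.
bom = {
    "Variscite Dart SD410 (processor board)": (1, 57.00),
    "Hirose DF40C-90DS-0.4V (SD410 conn.)": (2, 1.50),
    "Euro-DIN 120 (PDS connector)": (1, 7.50),
    "Atmel SAMD10D14 (system controller)": (1, 2.00),
    "Lattice LC4032ZE (IRQ controller)": (1, 1.00),
    "microSD slot (est.)": (1, 2.50),
    "16-bit latch": (4, 0.50),
    "16-bit level shift": (6, 1.50),
    "1-bit level shift": (3, 0.50),
    "32.768 kHz crystal oscillator": (1, 0.50),
    "Power stuff (est.)": (1, 5.00),
    "PCB (est.)": (1, 15.00),
}

# Multiply each quantity by its unit price and add everything up.
total = sum(qty * price for qty, price in bom.values())
print(f"BOM total: ${total:.2f}")
```

This lands on the same $106 figure, before assembly, shipping, and packaging.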
Edited by ZaneKaminski

I've been looking into how to convert the Snapdragon's MIPI-DSI display interface into something supported by modern monitors.

 

Unfortunately, there does not seem to be an easy solution. Few ICs exist which convert DSI to anything useful, and the ones that do are usually very small BGA-type parts, being targeted at smartphone and other similar applications.

 

I will break out the MIPI-DSI interface and maybe someone else will be able to solve the problem. It would be cool to see a Micron Xceed-type grayscale mod running from the DSI interface!

 

So for now we will still have to run video over the USB 2.0 link to an I/O board. That presents some bandwidth constraints but it should work okay.

 

The Snapdragon 410 also has dual MIPI-CSI camera interfaces. I don't think I'm gonna expose connectors on the accelerator card for those. No need.

Edited by ZaneKaminski

I'm astonished.  And very interested.

 

Just out of curiosity, what led you to choose the Snapdragon over this one you mentioned earlier?

 

TI AM437x / $20 for a 1 GHz one with a Cortex-A9 and four PRU cores.

 

I'm also sending you a PM now.  Check your MLA mailbox at the top of the page ;)


And while I realise it's a somewhat less grunty processor than the one you're looking at, I'll just drop in the BeagleBone Black here as a suggestion.  It's a ready-built board for US$55 in small quantities, with a 1GHz AM335x A8, two 200MHz PRUs (92 pins on DIL headers), 512MB of DDR3, SD and eMMC, USB client and host, Ethernet, and HDMI out.

It's also an open-hardware design which could be forked if necessary, say for a faster ARM.


So for now we will still have to run video over the USB 2.0 link to an I/O board. That presents some bandwidth constraints but it should work okay.

 

USB video out converters (VGA, HDMI, DVI) are available retail.  If you're running a *nix on your board, those would be an option.


Well, the Snapdragon 410 is quite fast compared to basically anything else at anywhere close to its price point. I was stupid to try and get it on a board myself... the tooling costs alone would be $1200+, and then as much as $100 for each PCB. So the Variscite SD410 module seemed to be the solution. Variscite's website says it "starts at $57." I don't know if that means you have to buy 1000 or something to get it for $57, or if they'll sell you 1 or 10 or something at that unit price. I tried to get a quote but they haven't responded. I'll try and call them later today.

 

As for the TI chips, the PRU system I looked at on the AM335x and AM437x didn't seem to have enough I/O pins per PRU core to accommodate operation with the PDS bus. I think on the AM437x, they only had 20 pins per core, and there are already 30 or so control signals (for the level shifter control and for the M68k bus) that have to be manipulated. I wasn't sure if I could parallelize the operation across multiple PRU cores, which would give 20 extra pins per core. The Snapdragon 410, which has as many as 122 I/O pins, was a better choice in my mind.

 

The disadvantage of the Snapdragon is that we have to run Linux or Windows CE or something like that; otherwise, with little documentation from Qualcomm, we'll miss out on any cool features of the hardware.

 

The main issue with running Linux is ensuring that the bus driver can directly manipulate the GPIO pin registers in memory, and can do so without being preempted, for as long as a microsecond or so (~length of bus cycle). With a single-core processor, I would say this approach sucks, but with four cores, hogging one to operate the bus sounds fine. Once I make more progress on the schematic (which is coming along), I will purchase the DragonBoard 410c evaluation kit and learn more about building a custom Linux distro for the Maccelerator.

 

So yeah, the BeagleBone could work, but as long as the SD410 module is cheap enough, I think it's a better choice.

 

The only functionality the BeagleBone offers that would be hard to get with the SD410 module is HDMI. All of the chips to convert DSI to HDMI or anything else useful are just too small to implement cheaply in the design. The USB video converters must have a driver that compresses the video, then the adapter must decode it before sending it to the screen. Hopefully we can support that without too much effort. I'm gonna put a single USB Type-A port on the accelerator card, and then either a generic hub, video adapter, whatever, can be plugged into it, or a custom module designed to fit the SE and SE/30.

 

By the way, here are my sketches of USB hub and hub+video cards for the SE and SE/30. I dunno if these would be in too much demand, given that they aren't really supposed to be used independently of the accelerator.

 

post-6543-0-40778000-1479146051_thumb.png

post-6543-0-59285200-1479146060_thumb.png

 

The idea of how to structure the software for four cores is as follows:

Put the bus operation and the emulation function in the same thread, running like a driver, as part of the kernel. That way, context switches will be avoided when performing a bus access. The other three cores will run the OS normally.

The emulator should be able to either run a cached translation of some M68k code, or, if no translation is available, queue that block of code for translation and then interpret it. In performing interpretation as well as translation, we get the great performance of translation, but the option to interpret ensures that there will never be a long delay in results that would cause, for example, a floppy operation to go awry. The translation can be performed on a different core than the emulator and bus stuff. It would just have to share some memory with the driver process. Dunno how to do that, but I'm sure it's all possible.
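The run-cached-or-interpret idea can be sketched roughly like this. Everything here is a hypothetical stand-in: `interpret_block` and `translate_block` are placeholders for a real M68k interpreter and translator, and a plain dictionary and queue stand in for whatever shared memory the driver and the translator core would actually use.

```python
from collections import deque

translation_cache = {}       # block address -> translated callable (hypothetical)
translation_queue = deque()  # block addresses waiting to be translated


def interpret_block(addr):
    # Placeholder for a real M68k interpreter: slow, but always available.
    return f"interpreted block at {addr:#x}"


def translate_block(addr):
    # Placeholder for a real translator: returns something fast to call later.
    return lambda: f"ran translated block at {addr:#x}"


def execute_block(addr):
    """Emulator core: use the cached translation if present, else interpret."""
    translated = translation_cache.get(addr)
    if translated is not None:
        return translated()
    # No translation yet: queue the block once, then interpret right away
    # so the result is never delayed by the translator.
    if addr not in translation_queue:
        translation_queue.append(addr)
    return interpret_block(addr)


def translator_step():
    """Work that would run on a different core: translate one queued block."""
    if translation_queue:
        addr = translation_queue.popleft()
        translation_cache[addr] = translate_block(addr)


first = execute_block(0x400000)   # nothing cached yet, so this is interpreted
translator_step()                 # translator catches up in the background
second = execute_block(0x400000)  # now served from the translation cache
print(first)
print(second)
```

In a real implementation the cache and queue would need to be safe to share between cores; this sketch runs both sides in one thread just to show the control flow.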


I've made a lot of progress on the accelerator card schematic for the Mac SE. I'm attaching what I have so far as a PDF. I'll release the KiCAD .sch files when I feel it's finished. There are some problems, oversights, areas of sloppiness, etc. in the current schematic.

 

Maccelerator-SE.pdf

 

In particular, I need to add some more power filtering and bypass stuff, switch to a system controller with more I/O, make sure I have series protection resistors in the right areas for the address-data and IRQ bidirectional buses, add pull-up resistors and filtering to the SD410's reset pins for good measure, and add a way for the system controller to reset the SD410 system-on-a-module without powering it down and back up. Uhh, that's all I can think of right now.

 

Once I'm done with the SE, I'll port what I have for the SE to the SE/30, and then I'll begin the board designs.

Edited by ZaneKaminski

In addition to the Variscite Dart SD410 module:

http://www.variscite.com/products/system-on-module-som/cortex-a53-krait/dart-sd410-qualcomm-snapdragon-410

 

There are also these options:

http://www.inforcecomputing.com/products/system-on-modules-som/qualcomm-snapdragon-410-inforce-6301-micro-som

http://shop.intrinsyc.com/products/open-q-410-system-on-module

https://eragon.einfochips.com/products/system-on-modules/eic-q410-200.html

 

It turns out that Variscite does not sell single units of their SD410 module. Either we have to buy a lot, find a distributor that sells them, or change the specific system-on-a-module used. The other three manufacturers of similar Snapdragon 410 modules I linked do sell singles, but I'm not sure that the other three modules expose the right amount of I/O for our purposes. I will investigate further.


I finished fixing most of the problems in the schematic I mentioned yesterday, but now a new issue has come to my attention.

 

Something like 46 pins are required to operate the 68000 bus and 53 are required for 68030. Plus there are 6 (4 and 2) signals for two UARTs. I am gonna more heavily multiplex some of the functions. Many of the available Snapdragon 410 SoMs just don't break out enough GPIO pins.

 

I will redesign the bus interface to be more heavily multiplexed. The current design has a 16-bit bus shared for address and data. I think I'm gonna change it to an 8-bit bus multiplexed between address, data, IPL, and FC.

 

That will require another CPLD or two to implement. They're only a buck or so.

 

Edit: I've figured something out which should only take 29 pins in the case of 68030, and 24 for 68000. The disadvantage of this approach is that it uses an 8-bit bus multiplexed 15 times over... So some microseconds will be wasted as all of the 15 latches are loaded with data.

 

Not sure if anyone will be able to make sense of it, but here is my sketch of what signals are required for this scheme:

 

(8) B[07:00] multiplexed over the following functions:

  • Aout[07:00] output to A latch
  • Aout[15:08] output to A latch
  • Aout[23:16] output to A latch
  • Aout[31:24] output to A latch
  • Dout[07:00] output to D latch
  • Dout[15:08] output to D latch
  • Dout[23:16] output to D latch
  • Dout[31:24] output to D latch
  • Din[07:00] input from D latch
  • Din[15:08] input from D latch
  • Din[23:16] input from D latch
  • Din[31:24] input from D latch
  • BusCtrlOut[7:0]
  • { FC[2:0], IPLset[2:0], 0b00 } output to FC latch, output to IPL priority
  • IPLin[2:0] current IPL input
 
(1) BOLE bus output/latch enable
(3) Bsel[3:0] chooses which signals to output on bus
 
(1) ALSOE enables output of A[31:0] during read and write cycles
(1) DLSOE enables output of D[31:0] during write cycle
(1) DinLE latches input D[31:0] during read cycle
(1) CLSOE enables output of other bus control signals
 
(1) BG (in) 
(3) HALT, BERR, RESET
(1) REQ_RESET
 
BusCtrlOut[7:0] for 68000
  • LDS (out)
  • UDS (out)
  • VMA (out)
  • RW (out)
  • AS (out)
  • BR (out)
  • BGACK (out)
 
BusCtrlOut[7:0] for 68030
  • SIZ[1:0] (out)
  • CIOUT (out)
  • CBREQ (out)
  • RW (out)
  • AS (out)
  • BR (out)
  • BGACK (out)
 
68000 only: (3)
  • (1) DTACK (in)
  • (1) VPA (in)
  • (1) PMCYC (in)
 
68030 only: (8)
(1) PWROFF (in)
(2) DSACK[1:0] (in)
(2) STERM (in), CBACK (in)
(3) DOrder[2:0] (out) chooses what data order to choose (for 68030 dynamic bus sizing)
  • 000: normal
  • 001: 2nd 8-bit word
  • 010: 3rd 8-bit word
  • 011: 4th 8-bit word
  • 100: 2nd 16-bit transfer
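As a cross-check of the 24-pin and 29-pin figures, the parenthesized counts in the sketch above can be tallied like this. The counts are taken as written in the list; the grouping into dictionaries is only for illustration.

```python
# Pin counts as given in parentheses in the signal sketch above.
shared_pins = {
    "B[07:00]": 8,
    "BOLE": 1,
    "Bsel": 3,   # counted as (3) in the list, although written as Bsel[3:0]
    "ALSOE": 1,
    "DLSOE": 1,
    "DinLE": 1,
    "CLSOE": 1,
    "BG": 1,
    "HALT, BERR, RESET": 3,
    "REQ_RESET": 1,
}
only_68000 = {"DTACK": 1, "VPA": 1, "PMCYC": 1}
only_68030 = {"PWROFF": 1, "DSACK[1:0]": 2, "STERM, CBACK": 2, "DOrder[2:0]": 3}

shared_total = sum(shared_pins.values())
total_68000 = shared_total + sum(only_68000.values())
total_68030 = shared_total + sum(only_68030.values())

# Number of select lines needed to pick one of the 15 multiplexed functions.
functions_on_b_bus = 15
select_lines_needed = (functions_on_b_bus - 1).bit_length()

print(f"68000: {total_68000} pins, 68030: {total_68030} pins")
print(f"select lines needed for {functions_on_b_bus} functions: {select_lines_needed}")
```

The tally reproduces 24 and 29 when Bsel is counted as 3 pins. Choosing among 15 functions needs 4 select lines, though, so if Bsel really is 4 bits wide as the [3:0] range suggests, each total would presumably go up by one.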
Edited by ZaneKaminski

Thank you for the encouraging words, sstaylor.

 

I've abandoned the idea of using the processor itself to do the timing-intensive work for interfacing with the bus. Instead, I've implemented a design where 3 very cheap FPGAs implement the bus logic. The FPGAs are from Lattice's MachXO series and are about $2.50 each. This approach is much less costly than the original design with the Altera Cyclone IV FPGA implementing the bus interface, but still costs more than the schematic I posted last week. These cheap little MachXO chips have many cost and board size advantages over a more complex unit like the Cyclone IV.

 

The interface between the FPGA and Snapdragon is also going to be fully asynchronous now, meaning that the Snapdragon does not need to implement any precise timing to talk to the bus FPGAs.
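To illustrate what "fully asynchronous" can look like, here is a toy model of a four-phase request/acknowledge handshake. This is an assumption for illustration only, not the actual protocol in the schematic: the class, the function, the signal names REQ and ACK, and the fake memory are all hypothetical. The point is that each side reacts only to signal levels, so the host can be arbitrarily slow or get preempted without breaking a transfer.

```python
class BusFpga:
    """Toy stand-in for the FPGA side of a four-phase handshake (assumed protocol)."""

    def __init__(self):
        self.ack = False
        self.data_out = 0
        self.memory = {}  # pretend Mac address space, for the demo only

    def sample(self, req, write, addr, data=0):
        """React to the current signal levels; no timing assumptions are made."""
        if req and not self.ack:
            # New request: the real 68k bus cycle would be run here.
            if write:
                self.memory[addr] = data
            else:
                self.data_out = self.memory.get(addr, 0)
            self.ack = True
        elif not req and self.ack:
            # Host released REQ, so release ACK and return to idle.
            self.ack = False


def host_access(fpga, write, addr, data=0):
    """Host side: raise REQ, wait for ACK, take data, drop REQ, wait for ACK low."""
    while True:
        fpga.sample(True, write, addr, data)
        if fpga.ack:
            break
    result = fpga.data_out
    while True:
        fpga.sample(False, write, addr, data)
        if not fpga.ack:
            break
    return result


fpga = BusFpga()
host_access(fpga, write=True, addr=0x1000, data=0xAB)
value = host_access(fpga, write=False, addr=0x1000)
print(hex(value))
```

In hardware the two sides would run independently and the host would poll a GPIO pin for ACK; here the host calls `sample` itself just so the example runs in one thread.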

 

I've also cleaned up the schematic in the ways I've wanted, for example adding more electrostatic discharge protection, upgrading the system controller, etc. I will post another schematic for Mac SE later today, and then I'll port it to SE/30.

 

The coolest feature I've added in this version is certainly the display sub-board connector. I've added a low-profile, shielded, 30-pin board-to-board connector that breaks out the MIPI-DSI display interface. Looks like this:

post-6543-0-93911400-1479832475.jpg

 

Additionally, I've found that an FPGA from the Lattice MachXO3 series is probably the cheapest way to convert the high-speed DSI to a more workable digital or analog signal, for example to implement VGA or a Micron Xceed-style grayscale solution for the SE/30. The path I'm seeing to achieving grayscale is to clone the Xceed yoke board (hopefully they won't mind lol), figure out what inputs it accepts, and then design a display sub-board to generate that from the DSI interface.

 

Okay, now the bad news is that the Variscite Dart SD410 module can't be purchased for $57 as advertised, and so I have switched to the Intrinsyc Open-Q 410 module (https://www.intrinsyc.com/computing-platforms/410-som/), which can be purchased in single quantities for $79. The switch to the MachXO FPGAs was a consequence of this change, since the Intrinsyc module has fewer I/O pins than the Variscite one. Should work better this way, anyway, though the FPGAs have increased the cost by another few dollars.

Edited by ZaneKaminski

Here's another nagging detail about the grayscale output on SE/30.

 

Note the bandwidth requirement for 8-bit grayscale on a compact Mac.

512 x 342 x 60.15fps x 8bit = 84 Mbit/sec
 
84 Mbit/sec is very manageable over USB, so maybe the grayscale output should be a feature of the I/O board for the SE and SE/30. That makes sense because those USB hub I/O boards I posted the block diagram for a week ago will only fit in compact Macs. That would be a cool feature that would make the I/O board more enticing.
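The bandwidth figure can be reproduced with a few lines. The resolution, refresh rate, and bit depth come from the calculation above; the 480 Mbit/sec value is the nominal USB 2.0 high-speed signaling rate, used here only as a rough upper bound since real-world throughput is lower.

```python
# Uncompressed 8-bit grayscale video for a compact Mac screen.
width, height = 512, 342
refresh_hz = 60.15
bits_per_pixel = 8

bits_per_second = width * height * refresh_hz * bits_per_pixel
mbit_per_second = bits_per_second / 1e6

# Nominal USB 2.0 high-speed signaling rate; usable throughput is lower.
usb2_nominal_mbit = 480
fraction_of_usb2 = mbit_per_second / usb2_nominal_mbit

print(f"{mbit_per_second:.1f} Mbit/sec")
print(f"{fraction_of_usb2:.0%} of the nominal USB 2.0 rate")
```

So the uncompressed stream takes up well under a quarter of the nominal link rate, which leaves some headroom for protocol overhead and other traffic on the same hub.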
 
But then the I/O board can't have VGA, since another microcontroller with a display controller would be required.
 
So users who want more video output would have to use a USB video adapter thingy. I will try to support these if there are Linux drivers available (easier than designing a display sub-board). Either that or someone can design a VGA output display sub-board to go on the main accelerator card.
Edited by ZaneKaminski

And another thought... the display sub-board connector is like $1.50 and will probably require the board to be bigger (maybe would cost another buck or so). Routing the DSI interface on the board is also a bit difficult (time-consuming and error-prone might be a better description) because it's fast and requires careful routing and impedance matching.

 

So if it ends up adding 2-3 bucks to the price of the accelerator (which is quickly approaching $150) and more of my time that could be used making the emulator software faster or something, is that worth it? Realistic price for a VGA card that fits in the slot is $25-40, I think.


Well my opinion is the more options the better. It would be a dream to have a super Mac se/300, and if the cost is a couple hundred dollars and some change, so be it. The current offering is pretty much the cost of a TwinSpark, PowerCache, and micron XCEED which unless you are very lucky is unobtanium, well into a couple thousand dollars and nets you 50MHz.

 

A project like yours only comes around so often, and probably never one as ambitious as the goals you have set. I say let the feature creep creep on!

Edited by joethezombie

Tbh I think that if you're using a classic mac you're not really looking for vga video. I mean the display is there. It'd make a difference if it was the only video out but ehhh (bit banging vga isn't that hard though :p )

I'd focus on getting the entire thing working as an accelerator then worry about video. :) Others are free to chime in of course.


I'm not sure how popular this opinion is. But personally, I don't see the attraction of adding external displays to a compact Mac. At that point, one might as well just use an LC or Quadra.

 

Maybe replacing the internal CRT with a color LCD. Maaaybe. Probably not. At least not until all the flyback transformers are burnt out. But I doubt there are panels available with the appropriate specs anyway.


Yeah, the ideas for features are out of control at the moment. I don't plan to implement all of the features I've been planning for. I just think it would be a shame to sell this $150 thing and for it to be impossible to upgrade it to add some feature like video output. So I'm trying to strike a balance where I have a minimum set of features that I will implement while reserving hardware interfaces for others to be creative with and add something.

 

Here are the pieces I plan to make:

  • Schematics for SE and SE/30 (almost done)
  • Board design for SE and SE/30 (will begin soon)
  • FPGA programming for 68000 and 68030 buses
  • MC68000 emulation engine for ARMv8-A
  • GUI front-end for emulator (for use on desktop ARMv8-A PCs, e.g. Raspberry Pi 3. This will help with testing independently from the Maccelerator hardware.)
  • At least a little Mac Plus/SE peripheral emulation when running the emulator on a PC (again, makes the emulator easy to test without the Maccelerator hardware. Maybe this portion can come from Mini vMac)
  • Integration of Linux system for the Maccelerator

 

I do agree, though, that VGA output is not the most important thing since the Mac already has a screen. Hmm. I think this image of the Radius Full-Page Display was what got me onto the idea of an external display:

post-6543-0-72577800-1479848254_thumb.jpeg

 

Maybe I'll place a footprint and route the traces for the display sub-board but not actually solder on the connector. Adventurous owners can reflow the connector on with hot air.

 

Before I finish the schematic and begin the board layout, however, I need to talk to Intrinsyc and buy their Open-Q 410 module. They won't give me any technical information (even the pinout... they will answer my questions about capabilities but not give me the actual info) unless I own the thing already.

Edited by ZaneKaminski

Feature creep often results in vaporware. I suggest reining in the features.

 

Seconding this right here.  At what point is the line drawn between "accelerator" and "essentially a new computer shoehorned into a classic Mac"?  Right now the only thing this doesn't have (potentially) that a new computer would is a sound card.  Don't get me wrong, I like the idea of an affordable accelerator for a system where current options are near unobtanium (either by scarcity, cost, or some combination of the two) but I think the feature list should be pared down somewhat.  An upgraded processor, RAM, and upgrading the internal video from black and white to grayscale is a good stopping point IMHO.  Replacement storage and the addition of external video, WiFi, USB seem a bit excessive, unless they are left as potential future features to be implemented using an expansion card.


I wanna stress that I don't plan to implement grayscale video or video output myself, or any of the peripheral boards I've given rough designs for. (I may become frustrated and design the USB/JTAG header board while debugging the Maccelerator though. It's pretty easy.) I really am just trying to do the things in the list I posted above, but I'm trying to make sure that with my hardware, there exist plausible implementation strategies for certain desirable features.

 

In my mind, video is one of the features that should be possible and about which I should have a clear implementation strategy; I just won't actually implement it. Someone else can do it if they want, and hopefully the plan (which I have/am detailing now) will be helpful or serve as a starting point.

 

Since I now plan to run Linux on the Snapdragon 410, features relating to storage, internet connectivity, and USB come free. (Well, no, you've gotta pay $79 to get the Snapdragon 410 module lol, but at least Linux is free in some sense.) Let me explain.

 

In my mind, the emulator software has to be prototyped on a Raspberry Pi 3, DragonBoard 410c, something like that. That's the easiest way to get it running quickly and get feedback when debugging. Obviously then I'll have to develop a desktop GUI front-end for the emulator. So it will be natural for the emulator to use USB keyboards, mice, get files and disk images from the filesystem, etc. So in that sense, those features are free because they come naturally as a consequence of a seemingly unrelated detail of the implementation. Indeed, I don't want to debug the emulator software while connected to the Macintosh, so this is all the easiest way.

 

All of the Snapdragon 410 SoMs I've seen also have WiFi, Bluetooth, and GPS built in. You just need to connect an antenna. Now, I can't sell this thing with antennas, since it would have to be certified and I don't know how to achieve compliance with that kind of thing, but I'm sure individuals can hook up their own (well, subject to FCC regulations). If a Raspberry Pi can make a PPP or SLIP connection with a Mac and get it online, I'm sure we can do the same in software. So that's not that hard.
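To give a feel for how little is involved on the SLIP side, here is a minimal sketch of SLIP framing as described in RFC 1055. The function names are made up for illustration, and this only covers framing a single packet, not the serial link or the IP stack around it.

```python
# Special byte values defined by RFC 1055.
END, ESC, ESC_END, ESC_ESC = 0xC0, 0xDB, 0xDC, 0xDD


def slip_encode(packet: bytes) -> bytes:
    """Wrap one packet in a SLIP frame, escaping END and ESC bytes."""
    out = bytearray([END])  # leading END flushes any line noise (optional)
    for b in packet:
        if b == END:
            out += bytes([ESC, ESC_END])
        elif b == ESC:
            out += bytes([ESC, ESC_ESC])
        else:
            out.append(b)
    out.append(END)
    return bytes(out)


def slip_decode(frame: bytes) -> bytes:
    """Recover the packet from a single SLIP frame."""
    out = bytearray()
    escaped = False
    for b in frame:
        if escaped:
            # Byte after ESC: map the escape codes back to the original bytes.
            out.append(END if b == ESC_END else ESC if b == ESC_ESC else b)
            escaped = False
        elif b == ESC:
            escaped = True
        elif b == END:
            continue  # frame delimiter, not payload
        else:
            out.append(b)
    return bytes(out)


payload = bytes([0x01, 0xC0, 0xDB, 0x02])
frame = slip_encode(payload)
print(frame.hex())
print(slip_decode(frame) == payload)
```

PPP is more involved (negotiation, checksums), so SLIP is probably the simpler of the two to start with.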

Edited by ZaneKaminski

I wanna stress that I don't plan to implement grayscale video or video output myself, or any of the peripheral boards I've given rough designs for. (I may become frustrated and design the USB/JTAG header board while debugging the Maccelerator though. It's pretty easy.) I really am just trying to do the things in the list I posted above, but I'm trying to make sure that with my hardware, there exist plausible implementation strategies for certain desirable features.

 

In my mind, video is one of the features that should be possible and about which I should have a clear implementation strategy; I just won't actually implement it. Someone else can do it if they want, and hopefully the plan (which I have/am detailing now) will be helpful or serve as a starting point.

 

That's what I had gathered when reading the thread.  The question seemed more like "should I put a $1 connector on the board or not?"  Well, my opinion is, for $1 why not?  Then someone like a future more experienced me could interface with that and maybe get something working.   Also, as for the usefulness of an external display with a compact Mac, it is just awesome having a portrait display next to my SE/30!


Hmm but maybe some of the peripheral stuff isn't as free as I may have thought. For example, emulating the VIA and other members of the chipset is different from adding more I/O devices to a system which already has a physically working chipset.

 

However, this work shouldn't be too terrible. The chipset emulation, used when the Maccelerator software is running on a standalone ARMv8-A chip, can be taken from Mini vMac. (Hopefully their license permits it.) Some drivers for the Mac will need to be written, but they can be just brain-dead simple. I mean, a virtual disk driver would be so easy. Just moving memory around, no actual control of the drive or anything like what is done with the Sony IWM driver.
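Here is a rough sketch of what "just moving memory around" could look like on the emulator side. The class and method names are hypothetical, and the 512-byte block size and 1600-block image (the size of an 800K floppy) are just assumptions for the demo; a real driver would also need the Mac-side glue.

```python
SECTOR_SIZE = 512  # bytes per block, assumed for this sketch


class VirtualDisk:
    """Hypothetical disk backed by a plain in-memory image."""

    def __init__(self, num_sectors):
        self.image = bytearray(num_sectors * SECTOR_SIZE)

    def read(self, sector, count=1):
        """Return `count` sectors starting at `sector` as bytes."""
        start = sector * SECTOR_SIZE
        end = start + count * SECTOR_SIZE
        if sector < 0 or count < 1 or end > len(self.image):
            raise ValueError("read outside of disk image")
        return bytes(self.image[start:end])

    def write(self, sector, data):
        """Copy whole sectors of `data` into the image starting at `sector`."""
        start = sector * SECTOR_SIZE
        end = start + len(data)
        if len(data) == 0 or len(data) % SECTOR_SIZE != 0:
            raise ValueError("data must be a whole number of sectors")
        if sector < 0 or end > len(self.image):
            raise ValueError("write outside of disk image")
        self.image[start:end] = data


disk = VirtualDisk(num_sectors=1600)    # 1600 x 512 bytes = 800K
disk.write(2, b"\xA5" * SECTOR_SIZE)    # fill block 2 with a test pattern
block = disk.read(2)
print(len(block), block[:4].hex())
```

There is no seeking, motor control, or encoding to worry about; a read or write is just a bounds check and a copy.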
