68060 accelerator cards for Mac: Would you be willing?

I dunno about IDE, as there are a few existing solutions that are fine:


1) SCSI-IDE converter

2) Ultra Wide drive with a UW nubus card

3) Ultra Wide drive with a 68-50 pin converter


and if you wanted CF you could stick a CF-IDE converter on the SCSI-IDE converter.

Link to post
Share on other sites
Is there anything that can be learned about 68000/68030 implementation on an FPGA/PIC from projects like MAME?
Aren't PICs something like small 8 bit processors?


You're right. There are 8 and 16 bit PICs. They are fast enough to emulate a slow 32 bit device (e.g. a USB 1.1 client) but not a *fast* multifunction computer.


MAME is an arcade game emulator which runs on many platforms. Install MAME; find the ROM image for the arcade game; play. The emulator includes libraries or modules that translate processor instructions from the arcade game to the host machine. If you can find a way to use the libraries, you don't need to reinvent. Enthusiasts have done that already to emulate computers (see MESS).

Link to post
Share on other sites
Coldfire on PPC machines
Why? When it would be even harder / or perhaps impossible.
Why? So you can run your 68k apps on a machine with a faster bus, more memory, bigger hard drive, better video, etc.


Every single other thing on the machine, including the ROM, is expecting a PPC, not a 68k or a Coldfire. Won't work, IMHO.


I could maybe see it working as a PCI card with a set of INITs that patch the ROM to trap all 68k code and redirect it to the Coldfire with some re-interpretation along the way for incompatible opcodes. Maybe.


But then one really does have to wonder why you'd bother when it would be a whoooole load easier and cheaper to stick a 1 GHz G4 in there and run 68k apps on it.

Edited by Guest
Link to post
Share on other sites
68K instructions which are not supported on the Coldfire, should generate an exception, which can then be handled by executing the proper code on the Coldfire to emulate the excepted 68K instruction. Very neat. It's the same mechanism which the Mac Toolbox uses to execute its routines.
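To make the trap-and-emulate idea concrete, here is a toy sketch in Python. It is only an illustration of the mechanism, not how CF68KLib is actually written: the opcode names (including `LEGACY_XCHG`), the register dictionary and the handler table are all made up for the example.

```python
# Toy model of trap-and-emulate. Every opcode name here is hypothetical;
# this is not the real ColdFire or 68K instruction encoding.

class IllegalInstruction(Exception):
    """Raised by the toy core when it meets an opcode it does not implement."""

    def __init__(self, opcode, operands):
        super().__init__(opcode)
        self.opcode = opcode
        self.operands = operands


def op_add(regs, dst, src):
    regs[dst] += regs[src]


def op_move(regs, dst, src):
    regs[dst] = regs[src]


def emulate_legacy_xchg(regs, a, b):
    # Software replacement for an instruction the toy core lacks.
    regs[a], regs[b] = regs[b], regs[a]


# Opcodes the toy core executes directly.
NATIVE = {"ADD": op_add, "MOVE": op_move}

# Opcodes that only exist on the older CPU, handled in software after a trap.
EMULATED = {"LEGACY_XCHG": emulate_legacy_xchg}


def execute_native(regs, opcode, operands):
    handler = NATIVE.get(opcode)
    if handler is None:
        # The real chip would take an exception here.
        raise IllegalInstruction(opcode, operands)
    handler(regs, *operands)


def run(program, regs):
    """Run a list of (opcode, operands) pairs and return the number of traps."""
    traps = 0
    for opcode, operands in program:
        try:
            execute_native(regs, opcode, operands)
        except IllegalInstruction as trap:
            traps += 1
            EMULATED[trap.opcode](regs, *trap.operands)
    return traps
```

Every trap costs far more than a native instruction, which is why the number returned by `run` matters so much for performance.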


I wonder ... if this is a PDS card, could we redirect the excepted instructions back over to the original 680x0 for execution? Or build the card with a 68k co-processor onboard?


What is the performance like?


There is quite a significant overhead incurred when the ColdFire processor encounters an unimplemented instruction which has to be handled by CF68KLib. As a result, the effective performance which you will see will depend very much on whether your application triggers lots of exceptions within performance-critical loops. For best performance of production code, we recommend either recompiling for ColdFire (if the code was written in C or another high-level language), or translating using PortAsm/68K for ColdFire (if the code was written in assembly language).
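A rough way to put numbers on that overhead is a weighted average of native and trapped instructions. The sketch below is a back-of-envelope model only; the cycle counts you feed it are assumptions, since the FAQ quoted above gives no figures.

```python
def effective_cycles_per_instruction(native_cpi, trap_fraction, trap_cost):
    """Average cycles per instruction when some instructions trap.

    native_cpi    -- cycles for an instruction the chip runs directly
    trap_fraction -- share of executed instructions that trap (0.0 to 1.0)
    trap_cost     -- total cycles to take the exception and emulate one
                     instruction (an assumed figure, not a measured one)
    """
    if not 0.0 <= trap_fraction <= 1.0:
        raise ValueError("trap_fraction must be between 0 and 1")
    return (1.0 - trap_fraction) * native_cpi + trap_fraction * trap_cost


def slowdown(native_cpi, trap_fraction, trap_cost):
    """How many times slower the code runs compared with no traps at all."""
    effective = effective_cycles_per_instruction(
        native_cpi, trap_fraction, trap_cost
    )
    return effective / native_cpi
```

For instance, if you assume a trap costs 200 cycles against 1 cycle native, then trapping only 1% of instructions already makes the code roughly three times slower. That matches the FAQ's point that what counts is whether the traps land inside performance-critical loops.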


In short, one wonders if a fast 68k copro would help ...


Does CF68KLib implement floating-point instructions?




... and an FPU.


Another approach would be to design a 68030 chip into an FPGA. ...

Beyond properly emulating each instruction, one must make sure that all the control registers, interrupt functionality and supervisory systems operate as expected. That's a lot of stuff.


As well, if we're going to use this an accelerator, it has to be able to interface to the Mac at its original CPU bus speed, and cache instructions and data between accesses.



You might also be interested in this PPCMLA discussion
and the follow-on, more technical discussion at Applefritter.
Link to post
Share on other sites
8.1 runs well enough on a 040 for me. That said I *own* an Amiga 060 machine / so I can get an 060 fix anytime I need one.


Have you ever run a Mac emu on the 060 Amiga and compared? Just curious.


Excuse me guys: I'm just catching up with this thread.

Link to post
Share on other sites
Freescale / produce a chip called Coldfire
one could always read the thread before posting


A small number of apps don't run well on PPC but they are a really small number
Is there any 680x0 code which cannot be run under CF68KLib?


/ a few programming tricks which are legal but deprecated on the 68000 family will prevent successful operation under the library. /

So they are probably the same apps which would break under Coldfire too :-/

Link to post
Share on other sites

I didn't notice Coldfire had already been mentioned - I guess I'll just blame bad eyesight or the fact it was getting late or something lame. I think more coffee is in order...!


There are code differences between 68000, 68020/030 and 68040 strains inside the 680x0 family that stop code written for one running on the others, but it's not caused any issues in Mac OS because the OS was programmed or amended to run on the later 68k chips.


I see no reason why the same would not apply to CF68k chips, given the correct intermediary, possibly in the form of an FPGA, between the computer and the CF68k chip. It'd not be ideal, but the commands that are mentioned in the FAQ are well documented, so they could be trapped and redirected or re-coded to legal routines if you had an experienced programmer on the case. Either that or you need a pre-boot patch ROM that loads the necessary patches to 'patch' the Mac OS ROM. Daystar's Turbo cards do this, so it's possible.


IMHO that would be required to use either a 68060 or a Coldfire as both vary slightly from earlier models and there's no way to patch the OS at a low enough level as to sort it out before code execution takes place, short of loading it from ROM at power-on otherwise.


Of course this is all pretty moot as it's never going to happen, but it makes for interesting speculation :)

Link to post
Share on other sites
I don't understand the obsession people have with the coldfire.. it's just incompatible enough to screw everything up and offers little to no benefits as nobody will develop for it.


The nice thing about Coldfire is that they are the only chips around which are (almost) compatible with the 68K instruction set and are available for $30 and less in speeds an order of magnitude faster than are in the original machines. I don't think any other chips come that close to providing an affordable speedup for 68K based systems.


As I mentioned above, the problem with 68060 upgrades is that the 68060 chip costs about $300 each from Freescale. You could build an accelerator for them, but at those prices three people will be able to afford them.


Heck, I had an idea about building a copy of the PowerCache030 for SE/30s until I priced 68030 chips. The original 68030 chip is priced around $100 each depending on the speed. Building new accelerators around those chips isn't practical either, at those prices. That's not even counting the FPU chip to go with it.


You're correct. It is incompatible or, at least, not fully compatible. But, if one wrapped the hardware and software around it to make it compatible it doesn't matter if anyone develops for it. It would look like a fast 68K to the host Macintosh. Old and new 68K software would run on it. That's the dream. More about dreaming below.


The big attraction of the Coldfire is that presumably, most of the code would not be emulated. That depends on the actual instructions used in 68K programs, but it would be nice to have an accelerator which is mostly running object code natively without emulation.


trag - I know you've done a few nice projects, but an accelerator is a big job.


Yeah, my previous projects are not in the same ballpark, neighborhood, nor municipality as an accelerator. But I did mention in my post (see parenthetical comment) that it was more of a fantasy and that for there to be any realistic chance of realizing such a dream I'd need a bunch of software guys to help.


On the other hand, in my experience, the trick to doing large complex projects is to rob them of their complexity by breaking them into smaller doable chunks. That's part of what I was trying to do in my previous post. We don't know how to make an accelerator, but we could compare the list of signals in two datasheets (68030 and target Coldfire). We can also compare instruction sets from two different datasheets. Those things are simple, though the latter would be tedious.
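The instruction set comparison is the kind of chore a few lines of script can take over once the two lists have been typed in from the datasheets. A minimal sketch, assuming each instruction is recorded as a (mnemonic, addressing mode) pair; the function name and the placeholder entries in the usage note are invented for illustration and are not taken from either manual.

```python
def compare_instruction_sets(source_isa, target_isa):
    """Split the source CPU's instructions by how the target can handle them.

    Both arguments are iterables of (mnemonic, addressing_mode) tuples
    transcribed by hand from the two datasheets.
    """
    source, target = set(source_isa), set(target_isa)
    return {
        "native": sorted(source & target),           # runs as-is
        "needs_emulation": sorted(source - target),  # must trap or be patched
        "target_only": sorted(target - source),      # new on the target chip
    }
```

Usage would look like `compare_instruction_sets([("OP_A", "reg"), ("OP_B", "mem")], [("OP_B", "mem")])`, which puts `("OP_A", "reg")` under `needs_emulation`. Matching on addressing mode as well as mnemonic matters because a chip can keep an instruction but drop some of its addressing modes.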


Once we had documented the differences, I'm pretty sure we could check the documentation of that 68K emulation library to see if it makes up the difference.


I know how (in theory) to map out the functions of the GAL chips on an old Daystar accelerator card. Translating it to functions it's performing on an active IIci or SE/30 bus would be more daunting, but mainly because one would need a strong understanding of how the 68030 runs its bus and communicates.


Once we knew all those things, we would (roughly) know how to connect the host Macintosh bus to GLUE logic (probably in an FPGA) which would perform the functions that the old GALs did, plus any new Coldfire specific logic, and connect that to the pins of the Coldfire chip. I've done FPGA programming so I'm pretty sure I could program a chip to perform as the GLUE.


I certainly know or can find from documentation how to connect a boot ROM (programmable Flash) to a Coldfire chip. I'm not as certain that all the unsupported 68K instructions can be made to generate exceptions. But that will be apparent from the chip's documentation and the comparison of the instruction sets.


Hooking up the USB and 10/100 ports is trivially easy. Connecting the DDR memory is more difficult because all the traces need to be the same length and mostly insulated from noise, which probably means on interior layers with power or ground layers between them and the outside world. This may create a need for a six layer board, blech. Prototypes with 4 layer boards are cheap (<$200 per three). Prototypes with 6 or more layers are much more expensive.


So, from my hardware point of view, I think it is doable. It would take a lot of time.


It is also possible that the initial documentation studies would cause one to conclude that it isn't practical. Perhaps the Coldfire I/O busses are just too different from the 68K. I'm assuming at this point, that they're fairly similar with 32 address and 32 data lines, plus similar or identical bus arbitration signals.


Maybe start with something simpler.. like a nubus USB board perhaps?


That is mainly a software problem.


There are USB chipsets which interface directly with a CPU bus rather than to PCI. So it would be fairly trivial to get one of those chipsets onto a NuBus card. It would need an FPGA (or at least a CPLD) to provide the GLUE between the NuBus and the USB chipset. In many ways it'd be easier to just build a PDS to USB board. There's a lot less translation from bus to USB chip that way but some kind of hardware interface between the USB chipset and the PDS slot would probably still be needed.


How do you write a USB driver after that though? Ideally, one would write it as a SCSI SIM so that the USB bus would be bootable. Know any 68K Mac developers who are bored and need a large software project? Writing USB drivers would probably be pretty large, I think. But if you can come up with some serious 68K/USB programmers, I'm willing to team up and support the hardware side. They're also going to need to write a SCSI Manager 4.3 type XPT so that the host machine can handle having more than one SCSI bus.


Discussion of XPT and SIM can be found fairly early in Chapter 4 of "Inside Macintosh, Devices", which is downloadable from Apple as several PDFs (one per chapter; chapter 4 is titled "SCSI Manager 4.3").


What I'd really like to do is build an IDE board for 68K Macs. That's been kicking around in my head too. It'd be cool to have an interface for laptop IDE drives in the old 68K models. I just haven't made the time for it and now I've started a new job and have even less time.


I've been telling myself that when I finish assembling the last few IIfx SIMMs I have lying around here I'll start on the next new project, but the blank PCBs are still sitting on my bench. Sigh.


A flash SIMM board that could hold all 3 IIfx/IIsi/SE/30 ROMs would be a welcome development too.


I already have a (non-writable) board design drawn for the ROM SIMM. I've never had it fabricated because it doesn't appear to make economic sense. I also think that I may need to revise it to put two chips on each side, instead of four chips on the same side. If one is hand soldering the chips onto the SIMM, one needs space between the chips for the soldering pencil.


It would cost about $600 just to have 200 SIMM circuit boards made. That's not counting the cost of chips. After that the SIMMs can be built one at a time as needed at a marginal cost of about $4 for the chips. It seems unlikely to me that there are enough folks wanting ROM SIMMs to make up the $600 plus $4 per SIMM even at $20 or $30 per SIMM, assuming they would pay that much. And that's not even considering the time involved. While fewer than 200 boards could be made, the total cost doesn't actually drop much or at all, the unit price just goes up.
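The break-even arithmetic behind that, using only the figures above ($600 for the board run, about $4 of chips per SIMM) and ignoring assembly time entirely, can be sketched like this. The function name is just for illustration.

```python
import math


def break_even_units(fixed_cost, unit_cost, price):
    """Smallest number of SIMMs that must sell to recover the board run.

    Returns None when the price does not even cover the per-unit chips.
    """
    margin = price - unit_cost
    if margin <= 0:
        return None
    return math.ceil(fixed_cost / margin)


at_20_dollars = break_even_units(600, 4, 20)  # 600 / 16 = 37.5, so 38 SIMMs
at_30_dollars = break_even_units(600, 4, 30)  # 600 / 26 is about 23.1, so 24
```

So somewhere between two and three dozen buyers are needed before the boards alone are paid for, with nothing yet counted for labor.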


And there's the issue that such a board would violate Apple's copyrights, though why any sane person would care at this point... Still, makes it kind of hard to advertise and sell.


If you want the board flashable as well, that would require hooking up the Write_Enable signal (assuming it's even present in the ROM socket) and then coding up a software routine which can massage the Flash chips into writing their contents. That's software again, at which I am marginally competent, and more importantly, just not that interested. I can connect the wires. Writing the routine which will properly massage the Flash chips into being written does not interest me.

Link to post
Share on other sites
It occurs to me that my idea of using 68k copros is just making everything more than twice as complicated.


More importantly, it makes it too expensive.


The very first thing we need to look at in any concept is the cost per unit, I think. 68K CPUs cost about $100 each depending on the model. If you can live without the MMU and at low speeds, they get cheaper.


68060 and 68040 are also too expensive.


Coldfire chips are cheap enough (~$30) but I see your post about them lacking an FPU. So, we either live without an FPU in accelerated systems, or....


If we're going to try to implement an FPU in an FPGA, then I'd say we're better off just going all the way back to emulating the 68030 in an FPGA. Besides, the CF doesn't have a built-in interface to an FPU the way the 68030 does, does it? That would make providing an external FPU much more daunting.


FPGAs with about twice as much logic as found in an actual 68030 cost under $20 and run at 200 MHz.


The 68030 contains about 273,000 transistors. The 68040 contains just under 1,200,000 transistors.


If we assume that there are about 4 transistors per gate (this is very conservative from the point of view of this estimate) then the 68030 contains about 70,000 gates and the 68040 contains about 300,000 gates.


The Xilinx XC3S500E contains about 500,000 gates and costs $20.75 when purchased individually from Digi-Key. I'm not sure how FPGA gates translate to CPU transistors or even CPU gates though.


There are at least 4 transistors in each real world gate, except inverters which only have two. If we assume an average of four (which should be low) we get the numbers above for total gates in the target CPUs. I would assume that one needs many more gates in an FPGA to get the same functionality found in a CPU, because we're not custom designing the logic. But how much more?


The XC3S500E has 7 times as many gates as a 68030. Is that enough? Is it too much?
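For anyone who wants to replay the estimate, here it is as a few lines of Python. It uses only the figures quoted above (transistor counts, 4 transistors per gate, 500,000 FPGA gates) and inherits the same caveat: a vendor's FPGA gate count is not directly comparable with gates in a custom CPU, so treat the ratios as an upper bound on headroom, not a fit guarantee.

```python
TRANSISTORS_PER_GATE = 4       # assumed average from the estimate above

M68030_TRANSISTORS = 273_000
M68040_TRANSISTORS = 1_200_000
XC3S500E_GATES = 500_000       # vendor gate figure quoted above


def gates_from_transistors(transistors, per_gate=TRANSISTORS_PER_GATE):
    """Crude gate estimate: total transistors divided by transistors per gate."""
    return transistors / per_gate


gates_030 = gates_from_transistors(M68030_TRANSISTORS)  # 68,250 gates
gates_040 = gates_from_transistors(M68040_TRANSISTORS)  # 300,000 gates

headroom_030 = XC3S500E_GATES / gates_030  # a little over 7x
headroom_040 = XC3S500E_GATES / gates_040  # under 2x
```

The 68030 comes out with a bit more than seven times headroom, while the 68040 has less than twice, so if an FPGA core costs several times the gates of the custom design, only the '030 looks comfortable on this part.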


Does a 200 MHz FPGA run non-optimized logic faster than a 40 MHz 68030 runs its custom logic?

Link to post
Share on other sites
I dunno about IDE, as there are a few existing solutions that are fine:


1) SCSI-IDE converter

2) Ultra Wide drive with a UW nubus card

3) Ultra Wide drive with a 68-50 pin converter


and if you wanted CF you could stick a CF-IDE converter on the SCSI-IDE converter.


I'm not sure if I'm interpreting your comment correctly, but there is no NuBus UW SCSI card. The fastest NuBus SCSI card made (or shipped anyway) was Fast & Wide, with a theoretical maximum transfer rate of 20 MB/s. UW has a theoretical maximum of 40 MB/s; it was the next step up and the last one before LVD hit the scene.

Link to post
Share on other sites

Most non-OEM'd UW drives work fine on a Fast Wide card. I know - I've done it many times :)


I'm only throwing ideas out of my head here, but would it make any more sense to build a faster 68030/68882 combo in FPGA? It's something that's been mooted in the Amiga community to create a system-on-chip Amiga 68k machine. You'd need a decent number of gates on the chip to do it, but from what I've read there are plenty of FPGAs from companies like Xilinx that will do it easily; you just need a good FPGA programmer and plenty of time to work out how.


I have no idea what production costs would be like on something like that, but if the FPGAs are cheap then all you need is a socket or PDS interface for it, and a pre-load system using something like an SD card to load the code to the chips. If you improve the code at any stage you can dump a new file to the SD card and that's your CPU improved. It's a novel idea. It'd also allow you to incorporate patches for the Mac OS ROM if you were really cunning - BSD/Linux/A/UX boot loader in ROM anyone? ;)

Edited by Guest
Link to post
Share on other sites

One final comment on all the updated interface card ideas...


The problem with NuBus cards is that the maximum theoretical performance of a NuBus card is 10 MHz x 4 bytes = 40 MB/s. And the reality is quite a bit slower with bus overhead. For example, on the NuBus, the data and address pins are multiplexed, which means that addresses and data are transmitted on the same wires, sequentially. A PDS bus has separate address and data busses as well as running faster than 10 MHz.
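To show how quickly multiplexing eats into that 40 MB/s, here is a simplified model in Python. It assumes every transaction spends some number of clock cycles on non-data work (at least one to send the address) before the data cycles; the exact cycle counts on a real NuBus are not taken from the spec, so the numbers are illustrative only.

```python
def peak_bandwidth_mb_s(clock_mhz, bus_width_bytes):
    """Theoretical ceiling: one full-width transfer on every clock."""
    return clock_mhz * bus_width_bytes


def multiplexed_bandwidth_mb_s(clock_mhz, bus_width_bytes,
                               data_cycles, overhead_cycles):
    """Throughput when each transaction also spends cycles on address/ack.

    data_cycles     -- clocks per transaction that carry data
    overhead_cycles -- clocks per transaction that carry no data
                       (assumed values; check the bus spec for real ones)
    """
    peak = peak_bandwidth_mb_s(clock_mhz, bus_width_bytes)
    return peak * data_cycles / (data_cycles + overhead_cycles)


nubus_peak = peak_bandwidth_mb_s(10, 4)                      # 40 MB/s
single_word = multiplexed_bandwidth_mb_s(10, 4, 1, 1)        # 20 MB/s
four_word_burst = multiplexed_bandwidth_mb_s(10, 4, 4, 1)    # 32 MB/s
```

Even with just one address cycle per word the ceiling halves, and longer block transfers only partly win it back.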


Apple's guidelines for cards for PDS slots are very similar to NuBus cards, so there's actually not that much difference in developing one or the other.


With FPGAs it's not that difficult to build a card which will work regardless of which PDS slot it's plugged into. Although making it autodetect that might be a bit of a trick. It'd be nicer if the user set a jumper.


Of course, there's only one PDS slot in each machine. So a multi-function card such as IDE/USB mentioned above becomes attractive.


At this point one starts thinking about all the things that would benefit from faster performance and is quickly contemplating a card with IDE, USB, video, and 10/100 ethernet all rolled into one. Then, if one assumes that a machine with such a card installed would also be one with a fast CPU upgrade, one starts thinking, "hey, the upgrade is running at 40 or 50 MHz, why am I limiting my peripheral upgrades with a 16 MHz bus speed? If I put them all on the same card, they could all run at 40 or 50 MHz bus speeds."


And then one starts contemplating a total 68K computer upgrade that just uses the host motherboard as an I/O interface....

Link to post
Share on other sites

Interesting thoughts, trag. Here's a few of mine.


/ I know how (in theory) to map out the functions of the GAL chips on an old Daystar accelerator card. / Once we knew all those things, we would (roughly) know how to connect the host Macintosh bus to GLUE logic (probably in an FPGA) / I've done FPGA programming /


So, from my hardware point of view, I think it is doable. It would take a lot of time.


/ when I finish assembling the last few IIfx SIMMs

Ah, so you're that guy. Eeeexcellent.


Coldfire chips / lacking an FPU. So, we either live without an FPU / or....


If we're going to try to implement an FPU in an FPGA, then I'd say we're better off just going all the way back to emulating the 68030 in an FPGA.


I've added a post to the Applefritter thread, with these thoughts:


For reference,

here's a block diagram of the 68040:


At the very minimum / "Bus Controller" / duplicate that in the FPGA. / also provide some hardware to streamline the instruction fetch and write-back stages / The *really* hard part of this is going to be emulating the instruction and data cache units. / If you end up running your cache in software that's going to seriously impact the cycles left over for executing code.


So would it be possible to combine the "streamline" fetch/write hardware you mention and the pseudo-cache? I mean what you're talking about is essentially a real cache before the pseudo cache if I'm reading this right.


So emu the cache stages on the FPGA, or use physical RAM (DDR perhaps) with a DMA chip as cache, and emulate only the ALU and FPU in software? [ / or in the FPGA]


There seems to be a lot of documentation available for the 68030; I haven't looked for much on the '040 yet. As the cult machines people want to overdrive (eg SE/30) are often '030 based, maybe this is a better target than the '040.


Does the above make some kind of sense? If I'm thinking right, it means breaking up the task further, to the chip subsystem level. Emulate the data and instruction caches in hardware, and only build the ALU and FPU in the FPGA.


Other thoughts:


Use some of the DDR RAM as VRAM and add a VGA controller chip.

Edited by Guest
Link to post
Share on other sites
/ At this point one / is quickly contemplating a card with IDE, USB, video, and 10/100 ethernet all rolled into one. / And then one starts contemplating a total 68K computer upgrade that just uses the host motherboard as an I/O interface....

And then one contemplates a Coldfire SBC that just thinks it's a Mac... which saves you all the problems of interfacing to the '030 PDS etc



BTW Eudimorphodon at the Applefritter thread suggests another possible plan of attack:

/ would be to use one of the FPGA/CPU core hybrid chips. (Something like a Xilinx Virtex-II Pro, which has an embedded PowerPC core.)
... to recreate a PDS upgrade like the Apple 601 cards, but packing a G3/G4 or two, and running the Apple 68k emulator in PPC code.


Virtex-4 Multi-Platform FPGA


Up to two 450MHz embedded IBM PowerPC 405



Ultimate Parallel Connectivity

* 1+ Gbps differential I/O

* 600 Mbps single-ended I/O

* ChipSync™ source-synchronous interface

* 16 I/O banks



* connect or bridge "anything to anything" with up to 24 serial transceivers, 622 Mbps to 6.5 Gbps, full duplex


Integrated 10/100/1000 Ethernet

That's much more what I was thinking / replicating the non-CPU parts of Apple's 601 upgrade cards in the FPGA, hacked to support the dual 405 CPUs.
You'd be better off using a stand-alone FPGA and a G3/G4 CPU. / The 405 core lacks an FPU, has a somewhat different MMU / and has some instruction/hardware extensions that could all pose difficulties / Duplicating the Apple PPC upgrade card is probably very doable, but figure spending several thousands of dollars / and a lot of reverse engineering /


So... 68k PDS to ZIF socket, anyone?

Link to post
Share on other sites
  • 4 months later...

I do note that while playing with the sources of Basilisk II, there are ROM patches to make the Mac ROM work on a 68060 CPU. Why? Basilisk II has a native execution mode where it passes the 68k instructions through to the host 68k CPU when it's running on a 68k Unix/Linux platform. Said host can be a 68060, so it patches the ROM to make things behave.


d) 68060 systems are a small problem because there is no Mac that ever used the 68060 and MacOS doesn't know about this processor. Basilisk II reports the 68060 as being a 68040 to the MacOS and patches some places where MacOS makes use of certain 68040-specific features such as the FPU state frame layout or the PTEST instruction. Also, Basilisk II requires that all of the Motorola support software for the 68060 to emulate missing FPU and integer instructions and addressing modes is provided by the host operating system (this also applies to the 68040).


So it's not impossible in software. However, in hardware, good luck :)

Link to post
Share on other sites
Or find an old PC, install Windows 95, change the boot screen to look like the MacOS splash screen, set Basilisk II to run on startup, and put it all inside a IIfx case! :p Would probably be faster than a real 68k, depending on what sort of PC you could find for the project...


Can go one better - FreeDOS running with FUSION from http://www.emulators.com... wicked fast 68k. When I tried it, Apple System Profiler was reporting speeds of 300 MHz '040s. It also has very little overhead due to DOS being so lean.

Link to post
Share on other sites
