Seeking Dove Marathon 030 Racer with cache for Plus/SE Accelerator Info (JED files, pics, or existing work!)

Builder68

Well-known member
Hey everyone,

I'm excited to share a little project I've been working on, and then hopefully tap into the collective wisdom of this awesome community! I'm developing an accelerator board for the Macintosh 512Ke that ingeniously combines a SCSI interface and ROM-INATOR ROMs.

I've named it "Performer Fusion," and I'm thrilled to say the Gerber files have already been sent to JLCPCB, so the first PCB run is underway!

Based on the prototype I've been probing these past several weeks, I'm optimistic it'll work right out of the gate. Of course, it will be completely open source! I'll be opening a dedicated thread for "Performer Fusion" very soon.

But this thread is for taking things one step further!

From what I've learned, the Micromac Performer (which Bolle cloned) appears to be a rebranded version of the simpler Dove 030 Racer. This leads me to believe that adding cache to my "Performer Fusion" accelerator would be entirely feasible if I could get more information about the Dove model that did include a cache. (I'm having trouble finding its exact name, but I've attached a picture for reference; perhaps it's "Dove Marathon Racer 030," and the stripped-down version is simply "Racer"?)

Attachment: Dove_Marathon_Racer_030-scaled.jpg

So, I'm on the hunt for specific information regarding this Dove accelerator board with cache for the 512K Macs. My goal is to incorporate that cache functionality into Bolle's Micromac Performer design, on which my "Performer Fusion" is based.

Specifically, I'm wondering if anyone out there has one of these accelerators and would be willing to help with a recovery effort by doing either of the following:

  • Recovering the JED files from its PLDs. This would be incredibly helpful for a reverse-engineering project.
  • Providing high-quality pictures of the board. Clear, detailed images would greatly assist in understanding its layout and components.

Alternatively, if someone has already gone down this road and done some of the work (recovering JED files, mapping the board, etc.), I'd be absolutely thrilled if you'd be willing to share your progress!

Any help, insights, or leads would be hugely appreciated!

Thanks in advance,
 

zigzagjoe

Well-known member
Not trying to be discouraging, but you wouldn't want to try to clone that for anything you actually intend to make more than one of and/or encourage others to make.

Those PLDs (22CV10P) appear to be a vendor-specific extended version of the usual GAL22CV10 chips... That'll make them problematic to dump. I would expect them to have a distinct programming algorithm requiring specialized programmers, and they're probably not vulnerable to GAL glitching or the PALCE backdoor. Even if dumps are procured, you'll have a problem trying to source those chips as they're obsolete. There are similar problems with the GAL26CV10, but I think Max managed those with unsecured chips. You'll have sourcing issues with the 71B74 TAG SRAM also.

Realistically, I think you'd be better off trying to develop something new with onboard DRAM or SRAM (for ease of development) rather than a caching approach. That'd get you a 32-bit data path and greater performance overall while minimizing or eliminating dependencies on obsolete chips.
 

Builder68

Well-known member
Not trying to be discouraging, but you wouldn't want to try to clone that for anything you actually intend to make more than one of and/or encourage others to make.
Thanks for the heads-up and for looking out! I appreciate you sharing that perspective; it's definitely something to consider if I ever try to scale it up.

Those PLDs (22CV10P) appear to be a vendor-specific extended version of the usual GAL22CV10 chips... That'll make them problematic to dump. I would expect them to have a distinct programming algorithm requiring specialized programmers, and they're probably not vulnerable to GAL glitching or the PALCE backdoor. Even if dumps are procured, you'll have a problem trying to source those chips as they're obsolete. There are similar problems with the GAL26CV10, but I think Max managed those with unsecured chips. You'll have sourcing issues with the 71B74 TAG SRAM also.
Ugh, that's what I was afraid of. That definitely throws a wrench in any hopes of dumping them easily.

Suppose I do manage to adapt the cache logic from another accelerator that has already been reverse engineered, the Micromac Thunder Cache made for the LC. (I'm still learning how to get rid of 68020 signals like SIZ0/SIZ1, convert DSACK1 into DTACK, convert DS into UDS/LDS, and adjust the address decoding regions, among other things.) Even then, the sourcing issues for the 71B74 TAG SRAM are definitely a huge roadblock too, although drop-in replacements (also obsolete) are still available and fairly cheap. Thanks for laying it all out; it gives me a much clearer, albeit unfortunate, picture of what I'm up against.
Realistically, I think you'd be better off trying to develop something new with onboard DRAM or SRAM (for ease of development) rather than a caching approach. That'd get you a 32-bit data path and greater performance overall while minimizing or eliminating dependencies on obsolete chips.
I appreciate you bringing that up, and you're right, developing something new with onboard DRAM or SRAM for a 32-bit data path seems like the most logical route to gain some performance, avoiding the caching approach.

Here's the thing: I've also been circling back to the idea of gaining performance by bringing faster RAM to the accelerator board, but it introduces a major roadblock for me. I'm not a software engineer. I'm worried it might mean heavily patching the ROM, and frankly, I'm not entirely sure what that would entail or how extensive those modifications would need to be. It feels like a big leap into the unknown right now.

But hey, I threw this thread out there hoping someone has already asked themselves the same questions I have regarding why there is no caching logic reversed from any accelerator made for the Plus or 512K. Most of the answers, I think, you have already mentioned.
 

zigzagjoe

Well-known member
My thought with the onboard SRAM is this:

As long as the machine does not use the 68000 for anything beyond E clock generation (i.e., the 68030 is always active), you could put 4MB of SRAM in the appropriate address space but punch holes in your address map at the locations of the video and sound buffers. This way, when the ROM accesses those locations, your logic treats the holes as external accesses and sends them to the logic board RAM, and you avoid the need to modify the ROM at all. This is predicated on there being a path to upgrade the logic board to 4MB with whatever arrangement of memory is required for that and which the ROM already supports.

This also means all access to main memory can happen at 0 wait states for both reads and writes, instead of only on some reads for a cache. Still very much not a trivial bit of work, but doable, I think. As you said once you go down the rabbit hole of requiring modified ROMs, it's easy to get into a death spiral.
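
A minimal sketch of the hole-punch decode described above, written in Python only to make the routing rule concrete. The function name `route` and the hole addresses are assumptions for illustration: the ranges are what I believe the main screen and sound buffers occupy on a 4MB Plus, and they should be checked against the actual machine. The alternate buffers are left out, and a real PLD would most likely decode on coarser boundaries.

```python
# Hypothetical address router for the hole-punch idea (not taken from any real board).
SRAM_BASE = 0x000000
SRAM_SIZE = 0x400000  # 4 MB of card SRAM covering the main RAM space

# Assumed hole locations for a 4 MB Mac Plus; verify before relying on them.
HOLES = [
    (0x3FA700, 0x3FFC7F),  # main screen buffer (assumed range)
    (0x3FFD00, 0x3FFFE3),  # main sound buffer (assumed range)
]


def route(addr: int) -> str:
    """Return which memory should answer a 68030 access at addr."""
    addr &= 0xFFFFFF  # only 24 address bits reach the logic board
    if not (SRAM_BASE <= addr < SRAM_BASE + SRAM_SIZE):
        return "logic_board"  # ROM, I/O and anything else outside RAM space
    for lo, hi in HOLES:
        if lo <= addr <= hi:
            return "logic_board"  # hole: pass the cycle through to logic-board RAM
    return "sram"  # everything else is private, fast card memory


print(route(0x001000), route(0x3FA700), route(0x400000))
```

Only accesses that fall in a hole pay the slow logic-board timing; every other RAM access can be acknowledged by the card itself.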
 

Builder68

Well-known member
My thought with the onboard SRAM is this:

As long as the machine does not use the 68000 for anything beyond E clock generation (i.e., the 68030 is always active), you could put 4MB of SRAM in the appropriate address space but punch holes in your address map at the locations of the video and sound buffers. This way, when the ROM accesses those locations, your logic treats the holes as external accesses and sends them to the logic board RAM, and you avoid the need to modify the ROM at all. This is predicated on there being a path to upgrade the logic board to 4MB with whatever arrangement of memory is required for that and which the ROM already supports.
Yes. Looking at the PLD equations, it seems the 68000 is not used for primary processing once the 68030 accelerator takes over; it only acts as a clock source, and possibly handles some I/O arbitration and initialization.
This also means all access to main memory can happen at 0 wait states for both reads and writes, instead of only on some reads for a cache. Still very much not a trivial bit of work, but doable, I think. As you said once you go down the rabbit hole of requiring modified ROMs, it's easy to get into a death spiral.
Also yes, here is where things get uglier! If I am going to make the effort to add SRAM as primary RAM, it would mean marginal or no improvement at all over the external DRAM expansion, unless those wait states are eliminated from the ROM (like Apple did with the Macintosh Classic ROM) and possibly from some of the intricate logic of the accelerator board. Well, the truth is I'm not cut out for that task!

But guess what,

As I was trying to understand the equations for the PLDs on the Micromac Performer, I became really curious about those unused output pins on PLD U5 (specifically pin 14 and pin 15, which has 'RAMCARD' written next to it). Taking a closer look, I realized this board seems to have signals intended for some kind of RAM expansion – perhaps a RAM disk using SRAM!

However, it appears this wasn't designed to be used with a MacPlus, at least not directly.

Pin 14 (I named it /SRAM_GATE) looks like an arbitration and timing signal, used to ensure that the 'RAMCARD' signal (better called '/SRAM_CE') is only active for $400000-$7FFFFF (A23=0 is implied for RAM, enforced by CPU decoding).

By modifying the equation for /SRAM_GATE, I think I managed to get /SRAM_CE asserted for the $600000–$7FFFFF address space:

!SRAM_CE = !N$7 & !BR_00 & !SRAM_GATE

Original equation:

SRAM_GATE.D = !N$7 & !BR_00 & !X1 & RESET_00
# !AS_00 & !N$7 & !BR_00 & A22 & RESET_00 // $400000-$7FFFFF; A22 = 1
# N$7 & !IPL2 & !X1 & !RESET_00
# N$7 & !BR_00 & !IPL2 & !X1
# N$7 & !BR_00 & !IPL2 & !RESET_00

Modified equation (A21 added to the second term):

SRAM_GATE.D = !N$7 & !BR_00 & !X1 & RESET_00
# !AS_00 & !N$7 & !BR_00 & A22 & RESET_00 & A21 // $600000-$7FFFFF; A22 = 1 & A21 = 1
# N$7 & !IPL2 & !X1 & !RESET_00
# N$7 & !BR_00 & !IPL2 & !X1
# N$7 & !BR_00 & !IPL2 & !RESET_00
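
As a quick sanity check on the address windows quoted in the equation comments, here is a small Python sketch that models only the address bits of the decode term (A23 = 0 implied, A22 = 1, optionally A21 = 1). The helper names `selected_by` and `window` are made up for this illustration; the other signals in the equations (N$7, BR_00, AS_00 and so on) are not modelled.

```python
def selected_by(addr: int, require_a21: bool) -> bool:
    """Address part of the decode term: A23 = 0, A22 = 1, optionally A21 = 1."""
    a23 = (addr >> 23) & 1
    a22 = (addr >> 22) & 1
    a21 = (addr >> 21) & 1
    hit = a23 == 0 and a22 == 1
    if require_a21:
        hit = hit and a21 == 1
    return hit


def window(require_a21: bool):
    """Lowest and highest 24-bit address selected, scanning in 64 KB steps."""
    hits = [a for a in range(0, 1 << 24, 0x10000) if selected_by(a, require_a21)]
    return hits[0], hits[-1] + 0xFFFF


print([hex(x) for x in window(False)])  # address window without A21
print([hex(x) for x in window(True)])   # address window with A21 added
```

Without A21 the term covers $400000-$7FFFFF; adding A21 narrows it to the upper half, $600000-$7FFFFF, which agrees with the comments in the equations.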

The address space $400000–$7FFFFF might ring a bell for some—perhaps it matches another Macintosh model that can use this accelerator board.

Since my main goal is to deck out my 'Performer Fusion' with as many upgrades as possible, I think adding a 2MB SRAM is a much more achievable short-term plan than my original idea of adding a cache for extra speed.

With a little optimism, I’m hoping the 'Virtual' driver (that third-party gem) will detect this RAM—place it at $600000–$7FFFFF, which should fit into the Mac Plus's unused address space—and let OS 7.1 use it as virtual memory. Does that sound right to you?
 

cheesestraws

Well-known member
I don't know the details, but I can certainly agree that trying to get Virtual to do the hard work for you is exactly what I'd be investigating first in your position.
 

zigzagjoe

Well-known member
Well, the trick with the SRAM would be to privately decode that on your card and handle the acknowledges yourself. It would be private memory for the 030. So the 68030 only ever accesses the SRAM except for those holepunches, and you control the timing of the acknowledges (well, you'd need to do that anyways). The key part is that the 68000 never needs to access that SRAM - if you can't guarantee that, then you have a bit of a problem.

You will also need to implement overlay functionality to inhibit the SRAM and instead access the ROM at $0 temporarily. This can be done by looking for the key VIA access and capturing that bit, and replicating that functionality in your logic.
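
A rough Python model of the overlay capture described above, just to show the behaviour the logic has to replicate. The register address and bit position are assumptions (I believe the Plus overlay flag is bit 4 of VIA data register A at $EFFFFE, but check this against the hardware documentation), and the class name `OverlayLatch` is invented for the sketch. A real implementation would also have to cope with the VIA's address mirroring and with which data-bus byte lane it sits on.

```python
VIA_BUF_A = 0xEFFFFE  # assumed address of VIA data register A on the Plus
OVERLAY_BIT = 4       # assumed bit position of the overlay flag


class OverlayLatch:
    """Shadow copy of the overlay bit, captured by snooping VIA writes."""

    def __init__(self):
        self.overlay = True  # after reset the ROM must appear at $0

    def reset(self):
        self.overlay = True

    def snoop_write(self, addr: int, data_byte: int):
        """Watch every bus write; copy the overlay bit when the VIA register is hit."""
        if (addr & 0xFFFFFF) == VIA_BUF_A:
            self.overlay = bool((data_byte >> OVERLAY_BIT) & 1)

    def low_memory_target(self) -> str:
        """What should answer an access near $0 right now."""
        return "rom" if self.overlay else "sram"


latch = OverlayLatch()
print(latch.low_memory_target())    # ROM is visible at $0 after reset
latch.snoop_write(VIA_BUF_A, 0x00)  # the ROM clears the overlay bit early in boot
print(latch.low_memory_target())    # SRAM now answers at $0
```

The point is that the card never has to read the VIA back: it keeps its own copy of the bit and uses it to inhibit the SRAM while the overlay is active.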

Therefore the logic board RAM would not matter except for the video/sound buffers. A clarification: as far as I know, the ROM won't be programming wait states on something this old; that would all be done in hardware on the 68000's asynchronous bus.

Regardless of how you do it, adding SRAM is going to need a bit more than just an enable/decode - you'll need to generate acknowledges, byte selects, write/read strobes, etc. I would encourage you to perhaps consider porting the existing logic into a single ATF1502 CPLD (also uses CUPL) as that will give you a *lot* of flexibility in both understanding and changing how things work.

I would encourage you to avoid Virtual, though. It's a bit of a mystery black box and is also likely to involve compromises in performance.

You might consider joining the #68kmla irc channel sometime! Good to chat about these things.
 

Builder68

Well-known member
Your comment about Virtual spooked me. I'll investigate what performance penalties might come with it.

Thanks for the detailed response. You're right - I was too focused on address decoding and didn't fully consider the DTACK/DSACK requirements.

Making the SRAM private to the 68030 does seem like the cleanest approach. My programmer supports the ATF1502, so I'll follow your advice. I guess it's time to invest in a decent but affordable logic analyzer though.

And yes, I'll finally join #68kmla - I've been lurking long enough.
 

zigzagjoe

Well-known member
The nice thing about the ATF15xx is that you can program them in-system over JTAG. Unfortunately, you also have to use Atmel's wretched fitter software...

I would assume Virtual is doing something reasonably intelligent, but if it's (ab)using virtual memory as the mechanism, you'd want to be extremely sure that it's not /actually/ paging. Either way it's old, undocumented software, so I feel like there are more pitfalls that you're likely to run into. Also, there's limited performance gain unless your 32-bit-wide RAM is at (virtual) address $0.

I've been using the DSLogic logic analyzers. They've been working well for me; I upgraded to the 32-channel one last December as I needed the additional speed.
 

Builder68

Well-known member
I made some progress making a deep dive into the implemented logic on the 030 accelerator, toward the goal of adding SRAM as primary RAM.

It turns out (not surprisingly) the logic already implements sophisticated memory management almost tailored to do this. Putting it another way, the logic already makes the RAM private for the 68030, with punch holes in the memory map for I/O HW and ROM shadowing.

So, here are a few short answers to some of the big questions about what the accelerator logic already does:

  • It isolates the 68000, which only contributes as an E clock generator.
  • It handles OVERLAY states and transitions. (VIA registers are cleverly captured; how all this is done is quite intricate, by the way.)
  • It disables caching for I/O accesses and ROM shadowing, syncs the E clock, and introduces the necessary wait states (16) that allow access to Mac Plus devices, i.e., sound, video, SCSI, and ROM (this is the key!).
  • Address decoding and strobe control for everything are in place or easy to implement using some unused outputs.

One fortunate consequence of reusing as much of the already-programmed logic as possible is that all the arbitration is proven to work, and the hole punches are already properly decoded and monitored.

In case anyone is asking themselves what performance gains will be seen with this upgrade, well, they could be as much as the performance gained by installing the accelerator in the first place! For memory-intensive tasks, around 300%. Way more than just adding any amount of cache.

Thanks again to Zigzagjoe for pointing me in the right direction!
 

zigzagjoe

Well-known member
Nice! That is a windfall for sure. For the internal caches, I assume the logic holds CIIN low from reset until a magic poke, which would occur after the MMU setup is done and the appropriate regions are marked cache-inhibited. If done this way, I think the 030 could boot without ROM patches....?

I've been flirting with sticking 16 MB of SRAM on one of my boosters to see how that'd work with an 030 at 47 MHz, but haven't yet gotten around to it. I think you might even be understating the benefit for memory-intensive apps: if the logic in place is just giving you an OE for the local RAM, then you can synchronously terminate any memory cycle for 2-cycle (at the 030's speed) 32-bit accesses. Just a little faster than 16-bit, 3+ cycles at 8-odd MHz :)
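
To put rough numbers on that comparison, here is a back-of-the-envelope Python sketch. The inputs are assumptions taken loosely from the figures in this post (47 MHz and 2 clocks for local 32-bit SRAM; a rounded 8 MHz and 3 clocks for the 16-bit logic-board path), and the function name `peak_mb_per_s` is made up. It gives peak transfer rates only, ignoring refresh, video contention and everything else the CPU does.

```python
def peak_mb_per_s(bus_bytes: int, clock_hz: float, clocks_per_access: int) -> float:
    """Peak transfer rate in MB/s for back-to-back accesses."""
    return bus_bytes * clock_hz / clocks_per_access / 1e6


# Assumed figures, loosely taken from the post above.
local_sram = peak_mb_per_s(bus_bytes=4, clock_hz=47e6, clocks_per_access=2)
logic_board = peak_mb_per_s(bus_bytes=2, clock_hz=8e6, clocks_per_access=3)

print(f"local SRAM : {local_sram:.1f} MB/s")
print(f"logic board: {logic_board:.1f} MB/s")
print(f"ratio      : {local_sram / logic_board:.1f}x")
```

With these assumed inputs the gap in peak rates is more than an order of magnitude; a slower 030 clock scales the first figure down proportionally.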
 

Builder68

Well-known member
Here is a simplified state diagram showing more or less how the original logic seems to use the CIIN flag in normal operation mode (OVERLAY=0). PLD U4 intermediates all I/O accesses between the LB and the 030, while PLD U5 manages the strobes for it.

Attachment: stable_state_overlay0.png
 

Builder68

Well-known member
DTACK_SYS+EX is not used, but is almost what is needed for the SRAM.

Kudos to the Dove engineers; I believe they were the ones who originally designed this little devil.
 

Bolle

Well-known member
I don’t think the design originated at Dove. It must have come from whoever was the designing entity behind Total Systems/Extreme Systems/Quesse/Engineering Solutions/Novy.

The GAL code on the Total Systems Mercury is basically identical to what ended up being the Micromac Performer. That also explains why the decoding logic for 4MB of accelerator RAM is there: an optional DRAM board would plug into the Mercury, allowing up to 4MB of RAM. I’ve also got the GAL dumps for that one as well as (unconfirmed) schematics in case you’re interested.


Edit: as a sidenote, the PLDs on all my Dove Racer accelerators like the one pictured above are locked and the usual methods to get around the lock are not working on the uncommon types they used. So far I didn’t see the need to go the extra mile and recreate them by hand.
 

zigzagjoe

Well-known member
Early Initialization (OVERLAY=1)

View attachment 87855

You're making me feel properly guilty for not properly graphing how my projects work and instead just thinking about it real hard and hoping it matches reality 😂 or tinkering until it does.

I assume the ROM copy is done by the extension - doing it in hardware would be mad (and annoying)?

Has anyone thrown the PLD against a DuPAL?

It wouldn't do much good..... DuPAL and the like are useful for understanding pure combinatorial logic (A+B = C), but for stateful logic they are of much less use. Stateful logic is done with either registers (clocked flip-flops, A+B+CK = C) or feedback used to similar effect (A+B+C = C). Sometimes both appear in one equation....

Since there's state inside the PLD, poking bits at the inputs and looking at the outputs, as fuzzers do, will produce inconsistent results (so you know, at least, that there's a register), but recreating it requires a lot of work by hand.
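
A toy Python illustration of why input/output probing breaks down on stateful logic. Both "devices" here are invented for the example and are not taken from any real PLD: one is a pure AND gate, the other a registered output with feedback (C.D = (A & B) # C), so once set it stays set.

```python
def combinatorial(a: int, b: int) -> int:
    """Pure logic: the output depends only on the present inputs."""
    return a & b


class Registered:
    """Registered output with feedback: C.D = (A & B) # C."""

    def __init__(self):
        self.c = 0

    def clock(self, a: int, b: int) -> int:
        self.c = (a & b) | self.c  # the previous output feeds back into the next state
        return self.c


def probe(device, patterns):
    """Record every output seen for each input pattern, as a naive fuzzer would."""
    seen = {}
    for a, b in patterns:
        seen.setdefault((a, b), set()).add(device(a, b))
    return seen


patterns = [(0, 0), (1, 1), (0, 0)]
print(probe(combinatorial, patterns))       # one output per input pattern
print(probe(Registered().clock, patterns))  # input (0, 0) yields both 0 and 1
```

Seeing two different outputs for the same input tells the prober that hidden state exists, but not what the state equations are; that part still has to be reconstructed by hand.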
 

Builder68

Well-known member
I don’t think the design originated at Dove. It must have come from whoever was the designing entity behind Total Systems/Extreme Systems/Quesse/Engineering Solutions/Novy
Oh! I didn't know that! Well, kudos to Total Systems/Extreme Systems/Quesse/Engineering Solutions/Novy, then.

The GAL code on the Total Systems Mercury is basically identical to what ended up being the Micromac Performer. That also explains the decoding logic for 4MB of accelerator RAM being there because there was an optional DRAM board that would plug into the Mercury allowing up to 4MB of RAM.

Yes, that explains all this unused logic on the board. And thanks to it, SRAM (32-bit path) is quite possible.



I’ve also got the GAL dumps for that one as well as (unconfirmed) schematics in case you’re interested.
Oh, that would be wonderful and so generous! Yes, please! That way, a "DRAM Expansion Included" board version of the "Performer Fusion" for the 512Ke may be possible, and there will be no need to install a separate RAM board.

Hear me out: I think you should be the first one to implement SRAM on the Micromac Performer. After all, it's your baby, no? (You cloned it).

So if you like, I will soon send you all the work I have done on the modified logic for implementing the SRAM. It surely won't be 100% correct, but it may lay the groundwork so you're not starting from zero. I'm 100% sure you will finish this project 5 or 6 times sooner than I would. 😉

Edit: as a sidenote, the PLDs on all my Dove Racer accelerators like the one pictured above are locked and the usual methods to get around the lock are not working on the uncommon types they used. So far I didn’t see the need to go the extra mile and recreate them by hand.
Well, it probably won't be needed now that we know how feasible it is to use SRAM exclusively for the 030. Performance gains outweigh the effort to clone the external caching functionality, besides having to scavenge for obsolete components like 8-bit TAG RAM ICs.
 