Well, the nextcomputers.org forum doesn't seem to be accepting registrations, so I figure I'll post this here instead.
I recently got a NeXTstation Color Turbo thanks to a kind forum member. As my first hardware project for it, I wanted to accelerate it, both because it really could use the help and because nobody that I can see has tackled this in modern times. There are a few unobtainium vintage designs, of course, and one fellow who came up with a pretty ugly hack. Neither is a good option. That hack had some nonsensical stuff in there that isn't required, but even without it you're replacing the PLL and removing some resistors to make it work... no thanks, I don't want to modify my logic board. Folks had already tried the QuadDoublers that work in some Amigas and found they didn't work in the NeXT, and that seems to have been the end of any development attempts.
Putting a little theory out there first... generally, with CPU accelerators you need to present strobes, read/write data, etc. as if you were a native CPU operating on the bus. What that means is that if you just feed a 2x clock into the CPU (simplifying a bit here), the suddenly faster CPU is going to freak out because signals from the logic board will (badly) violate the timings it expects, and bad data gets read or written.
There are two ways to address this. The QuadDoublers supposedly do some janky tricks to track whether the 040 is in the middle of a bus cycle: if it isn't, they feed the CPU a doubled clock, but when a bus cycle starts the clock drops back to standard. This way the bus access timings look like a 25 MHz CPU (because at that moment it is a 25 MHz CPU). Yuck, though evidently this approach wasn't unheard of, as the IIfx does something similar according to @Bolle.
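To make the clock-switching idea concrete, here's a toy sketch in Python. It is not the actual QuadDoubler logic (which I haven't seen); the function name and the trace are made up for illustration, and real hardware would also have to switch clocks without glitches, which this ignores.

```python
BUS_MHZ = 25             # logic board clock, from the post
FAST_MHZ = 2 * BUS_MHZ   # doubled clock fed to the CPU while the bus is idle

def select_cpu_clock(bus_cycle_active: bool) -> int:
    """Return the clock (in MHz) the CPU is fed under the clock-switching scheme."""
    return BUS_MHZ if bus_cycle_active else FAST_MHZ

# Hypothetical trace: True means the 040 is in the middle of a bus cycle.
trace = [False, False, True, True, False]
clocks = [select_cpu_clock(active) for active in trace]
print(clocks)  # [50, 50, 25, 25, 50]
```

The CPU only runs fast while it is working out of its caches; every external access happens at the stock clock, so the logic board never sees anything unusual.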
The second and more common method is to accept the signals from the fast CPU and massage them in logic until they look like the appropriate 25 MHz signals, or close enough that they aren't massively violating timings. This is the more compatible approach - if you have logic that is up to the task. It's the approach taken on my Booster accelerators, and an example of what that looks like can be seen below.

(/TS = transfer start, indicating the beginning of a transfer; /TA = transfer acknowledge, indicating either the end of the transfer or that data is ready during a multi-word transfer)
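As a rough illustration of what "massaging" means for these two strobes, here's a toy timing model in Python. It assumes a phase-locked 2x CPU clock, counts time in fast-clock ticks, and assumes the glue logic holds /TS until the next bus clock edge and shows the CPU /TA on only the last fast tick of the bus cycle. Those alignment choices and both function names are my assumptions for the sketch, not the real accelerator logic.

```python
RATIO = 2  # CPU clock / bus clock, assumed phase-locked at exactly 2x

def ts_to_bus_cycle(fast_tick: int, ratio: int = RATIO) -> int:
    """Bus cycle in which the logic board sees /TS for a CPU /TS at fast_tick."""
    # Ceiling division: hold the request until the next bus clock edge.
    return -(-fast_tick // ratio)

def ta_to_fast_tick(bus_cycle: int, ratio: int = RATIO) -> int:
    """The single fast tick on which the CPU is shown /TA for a board /TA in bus_cycle."""
    # The board holds /TA for a whole bus cycle (two fast ticks); passing it
    # straight through would look like two acknowledges to the fast CPU.
    return bus_cycle * ratio + (ratio - 1)

cpu_ts_tick = 3                             # CPU starts a transfer on fast tick 3
bus_ts_cycle = ts_to_bus_cycle(cpu_ts_tick)
board_ta_cycle = bus_ts_cycle + 1           # hypothetical: board acknowledges one bus cycle later
cpu_ta_tick = ta_to_fast_tick(board_ta_cycle)
print(bus_ts_cycle, board_ta_cycle, cpu_ta_tick)  # 2 3 7
```

The point is only that each strobe has to be stretched going out and shortened coming back so that both sides see a pulse that is one of their own clock periods wide.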
Well, some time ago @Bolle had reverse engineered the Formac PL150 accelerator, originally designed for the LC475. Credit to Bolle for the image. It ran the CPU at 45 MHz while the logic board remained at 25 MHz. Bolle had found you could put a typical Motorola PLL on the board and run it at 50 MHz, too. He was so kind as to share the logic and schematic with me, and I used this as a starting point to see what is needed. Since a PLL clock is phase-locked to the input clock (literally in the name, after all), I rewrote all of the logic to take advantage of that, which increased performance since we can make certain assumptions about how signals line up. Also, there were some really screwy choices made by Formac to bypass bugs they introduced or couldn't see how to fix (?).


This was all well and good, and the revised logic was solid enough that it could boot an Amiga 4000. At least, as far as it could with a bad OS - the unit was only briefly on loan to me and I have no Amiga background, so I failed there. Still, this is a good omen because that machine is entirely different from the Quadras I'd tested with.
Next bit of difficulty: the Turbo Color schematic was found some time ago (attached), and it makes clear at least part of the problem with using other accelerators: NeXT uses the 68040's multiplexed bus mode (I didn't even realize that was a thing!). Briefly - for a single rising edge, to be exact - the address is presented on the combined address/data bus, and then the bus switches to carrying data. Unless you specifically built your accelerator with that in mind, yeah, that wouldn't work.
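Here's a toy model in Python of why the address has to be grabbed at exactly the right moment on a multiplexed bus. The class name and the bus values are invented, and using /TS as the capture qualifier is my assumption for the sketch rather than something read off the schematic.

```python
class AddressLatch:
    """Captures the address from a shared address/data bus and holds it."""

    def __init__(self):
        self.held_address = None

    def clock_edge(self, ts_asserted: bool, bus_value: int):
        """Call once per rising clock edge; returns the currently held address."""
        if ts_asserted:
            # The address is only on the shared bus for this one edge.
            self.held_address = bus_value
        return self.held_address

latch = AddressLatch()
# Hypothetical transfer: address on the first edge, then data on the same wires.
edges = [(True, 0x04000000), (False, 0xDEADBEEF), (False, 0xCAFEF00D)]
held = [latch.clock_edge(ts, value) for ts, value in edges]
print([hex(a) for a in held])  # ['0x4000000', '0x4000000', '0x4000000']
```

Miss that first edge and the address is gone, since the same wires are carrying data from then on.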
Pretty clearly the unobtainable vintage accelerator (seen below) had a set of two 16-bit transceivers and buffers, though it's not quite clear to me how they'd use these to hold the address as required for multiplexed operation. Because of this I would need a new board design that both ties the buses together and has the logic required to snipe that address and hold it long enough to make the system happy.


So a new board was designed and sent off to JLC. It was the last JLC order to make it over the line in time. The NeXTstation needs a jumper populated to reduce the bus clock to 25 MHz; at 33 MHz the poor 68040 would be trying to run at 66 MHz, and that's just not going to work.


With a bit of tweaking... it lives! Interestingly, I found NeXT apparently tweaks AVEC on the fly. It's nominally pulled high, but it's also wired to one of the core chipset chips. I'd strapped it high, not noticing that. Oops. Nothing a bodged pin won't sort out.
Preliminary results are looking good! Dhrystone is a best-case scenario that doesn't stress memory, so I will have to see what is out there for a more rounded benchmark. Video results were essentially at parity with the original Turbo@33 - the reduced bus bandwidth was presumably offset by increased performance on the algorithmic side. I haven't been able to do much more testing yet, but it already seems noticeably more responsive.

As far as I am aware, this is the first modern accelerator for the NeXT ecosystem. It remains to be seen whether it can work in the Cubes, though (I need to find out if they use a multiplexed bus). Despite this not being strictly Mac-related, I figured it was worth posting due to the overlap, and it does have a little Mac heritage.



