
NuBusFPGA: HDMI on NuBus Macs

Phipli

Well-known member
De nada! Just thought the general info would be useful, some other stuff on that particular card:

Apple TIL: Macintosh Display Cards 824 & 824 GC: Rev A/B Differences

Acceleration on a card? re: missing daughtercard for LaPorta's SuperMac ColorCard/24:

View attachment 46313

View attachment 46314


Has anyone got that daughtercard? Hi-res pics would be greatly appreciated. I'm hoping it will yield some clues about the general mechanics of an early implementation of QuickDraw acceleration.

Might dissecting GAL formulas be helpful? If not, sorry about the tangent, M.
The Spectrum/24 III equivalent looks like this. They removed the chip markings.

IMG_20220913_195231.jpg
 

Trash80toHP_Mini

NIGHT STALKER
Oh yeaaaah! The ole' sand off the top like there's no such thing as a chip tester/identification unit available ploy! :rolleyes:

This one came up in the SuperMac History thread:
Best part:
Some engineers were insulted that we were going to slow down their board and sales was convinced that within days of the board hitting the street we would have a black market in chips to speed up the $699 boards and turn them into $3,999 ones.
Anybody wanna make a bet on where the wait states were inserted to slow down the lower tier boards? Maybe some very late to market, black market chips could be in the offing? :)

@Melkhior Seems to be heading a bit far afield at this point. If info isn't helpful here, we may have tangential development spinoff brewing?
 

Melkhior

Well-known member
Doing a lower resolution framebuffer on a higher resolution display (i.e. a 1152x870 image letter/pillar/windowboxed on a 1280x1024 display) should not be very hard, as you just need a second set of signals from the timing hardware to tell the video hardware to output black (or whatever background color), that's basically a larger-scale version of what the hardware cursor already does.
No new signals needed to be propagated; it turns out that adding begin/end registers to change the size of the visible frame (and recomputing the DMA length to match) is enough. That enables support for any resolution whose pixel count is a multiple of 128 (technically, a byte count that is a multiple of 16; B&W, with only 1 bit per pixel, is the limiting factor). I've updated the SW to enable lower, windowboxed resolutions in the middle of the screen (the position is SW-defined), and it works just fine.
And for reasons yet unknown, lowering the resolution _does_ speed up Speedometer 4.02 QuickDraw tests at 8/16 bits (1/2/4 might have alignment issues), and not insignificantly:
Speedo640x480.jpg
The average is dragged down by the low-depth numbers, but at 640x480/8-bits it's one of the fastest 68k video devices ever according to Speedometer 4.02 :)
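To make the windowboxing arithmetic above a bit more concrete, here is a rough sketch (not the actual NuBusFPGA gateware or driver code). The field names `h_begin`, `h_end`, `v_begin`, `v_end` and `dma_bytes` are made up for illustration, the centering is just one possible SW-defined position, and the 1280x1024 native mode is only the example used earlier in the thread.

```python
from dataclasses import dataclass

ALIGN_BYTES = 16  # frame size must be a multiple of 16 bytes (128 pixels at 1 bpp)


@dataclass
class Window:
    # Hypothetical begin/end register values for the visible frame.
    h_begin: int    # first visible column
    h_end: int      # one past the last visible column
    v_begin: int    # first visible line
    v_end: int      # one past the last visible line
    dma_bytes: int  # bytes the video DMA fetches per frame


def windowbox(native_w, native_h, width, height, bpp):
    """Center a width x height framebuffer inside a native_w x native_h mode."""
    if width > native_w or height > native_h:
        raise ValueError("window is larger than the native display mode")
    bits = width * height * bpp
    if bits % (ALIGN_BYTES * 8):
        raise ValueError("frame size is not a multiple of 16 bytes")
    h0 = (native_w - width) // 2
    v0 = (native_h - height) // 2
    return Window(h0, h0 + width, v0, v0 + height, bits // 8)


if __name__ == "__main__":
    print(windowbox(1280, 1024, 640, 480, 8))
```

Checking the constraint in bits rather than pixels means the same test covers every depth: at 1 bpp it reduces to "pixel count is a multiple of 128", at 8 bpp to "pixel count is a multiple of 16".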
 

Trash80toHP_Mini

NIGHT STALKER
Amazing progress! 🍔*****

Does that 16-bit benchmark of 1.3 x Q605 indicate a NuBus-limited VidCard outperforming onboard video at long last?

***** It's National Cheeseburger Day over here. So more appropriate than toasting with champagne.;)
 

Melkhior

Well-known member
Does that 16-bit benchmark of 1.3 x Q605 indicate a NuBus-limited VidCard outperforming onboard video at long last?
Yes; but that's mostly due to the bitblit being really fast - and it shows during the final bottom-to-top scrolling in Speedometer. The data don't have to go over the NuBus, everything stays on the board, so the NuBus bottleneck is bypassed. In my current design, there's a 100 MHz/128-bit bus between the memory controller and the small L1D cache of the Vex core, and for the perfectly aligned case there's a custom 64-bit Load/Store to do the copy.
It's more comfortable if you do things that use the bitblit, such as scrolling large windows in CodeWarrior, but it doesn't help other use cases...
Speedometer also does some weird things such as oval drawings, and I'm not sure how relevant that is to real use cases (even though it could theoretically be accelerated). Benchmarks only measure themselves, and Speedometer is no exception.
In other words - unless you do a lot of (standard QuickDraw) scrolling, internal video on Quadras will probably still "feel" faster, I would say.
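For a sense of scale, the 100 MHz/128-bit figure quoted above can be turned into a peak number with simple arithmetic. This is only a back-of-the-envelope sketch: it gives a theoretical upper bound that ignores DDR3 latency, cache behaviour and the core's own overhead, and the function names are made up for illustration.

```python
BUS_HZ = 100_000_000  # 100 MHz internal bus (from the post)
BUS_BITS = 128        # 128-bit wide (from the post)


def peak_bytes_per_second(hz=BUS_HZ, bits=BUS_BITS):
    """Theoretical peak of the internal bus: one full-width beat per clock."""
    return hz * bits // 8


def min_full_copy_seconds(width, height, bpp):
    """Lower bound on copying a whole frame once: every byte is read and written."""
    frame_bytes = width * height * bpp // 8
    return 2 * frame_bytes / peak_bytes_per_second()


if __name__ == "__main__":
    print(peak_bytes_per_second())               # 1600000000 bytes/s
    print(min_full_copy_seconds(640, 480, 8))    # a bit under 0.4 ms
```

Real scrolling will be slower than this bound, but it illustrates why keeping the copy on the board matters more than the speed of the host bus.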
 

Melkhior

Well-known member
New update:
(a) The RAM disk driver now has the option to use DMA & interrupt-based completion, showing it's possible to share the interrupt between two devices (the FB needs it for VBL). Annoyingly, on my Q650, that's not faster than plain old synchronous memory-mapped access (but would enable larger devices, as memory-mapping is limited to the 16+256 MiB of slot+superslot space). But it opens the door to 'proper' support of e.g. micro-sd or Ethernet.
(b) I've managed to prototype the hdl-util/hdmi integration as an alternative to Litex' HDMI stuff. No real advantage yet (and I haven't ported the hardware cursor for SBusFPGA), but it opens the door to adding a sound device going through HDMI on the Mac. Still need to figure out how to create said sound device & write the appropriate sound component...
 

vz50938

New member
Hi,
This is really advanced stuff! :)

So you've implemented a kind of HW interrupt controller that aggregates the frame-buffer interrupt and the DMA completion interrupt, where the main interrupt task serves them accordingly?

About driving HDMI without an external PHY: what I/O bank voltage are you using? On the QMTECH Kintex board the default I/O voltage reference for all usable I/O is 3.3 V, with an option to reference 4 banks from an external 1.8 V or 2.5 V source.
Since LVDS is not supported under 2.5 V it might be an issue, because it will add another voltage level to translate to/from 5 V, as the 4 banks include many other needed I/Os.

BTW, QMTECH also has a daughter card for the Kintex board with HDMI driven from an ADV7513 device (which is almost impossible to source these days).

Regards,
N.Blum
 

Melkhior

Well-known member
So you've implemented a kind of HW interrupt controller that aggregates the frame-buffer interrupt and the DMA completion interrupt, where the main interrupt task serves them accordingly?
Sort of, as the HW is basically "nubus_irq = irq_fb || irq_dsk" :)
In addition, both IRQs have their own enable/clear registers, and there are two different interrupt handlers - MacOS makes that comparatively easy: you can register more than one handler and it will go through all of them in order of priority until one says "I handled it".
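The sharing scheme described above can be modelled in a few lines. This is a toy sketch, not the real gateware or the MacOS slot interrupt API: the class and function names are invented, the assumption that the enable bit gates each source before the OR is mine, and so is the choice of "higher number means higher priority".

```python
class IrqSource:
    """One device interrupt with its own enable and clear controls."""

    def __init__(self, name):
        self.name = name
        self.enabled = False
        self.pending = False

    def raise_irq(self):
        self.pending = True

    def clear(self):
        self.pending = False

    @property
    def asserted(self):
        # Assumption: a disabled source never drives the shared line.
        return self.enabled and self.pending


def nubus_irq(irq_fb, irq_dsk):
    """The shared line is simply the OR of both sources."""
    return irq_fb.asserted or irq_dsk.asserted


def make_handler(source, log):
    """Build a handler that claims the interrupt only if its device raised it."""
    def handler():
        if not source.asserted:
            return False          # "not mine", let the next handler try
        source.clear()
        log.append(source.name)
        return True               # "I handled it"
    return handler


def dispatch(handlers):
    """Call (priority, handler) pairs, highest priority first, until one claims it."""
    for _, handler in sorted(handlers, key=lambda pair: -pair[0]):
        if handler():
            return True
    return False
```

With both sources enabled and only the disk one pending, `dispatch` first asks the framebuffer handler (which declines), then the disk handler, which clears its own bit and drops the shared line.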

About driving HDMI without an external PHY: what I/O bank voltage are you using?
Everything is set to 3V3, including the HDMI. The design for external HDMI was working for me on the QMTech Wukong board, so I just replicated it.
HDMI TMDS signalling is 3V3 (it's not the same as LVDS apparently, but don't ask me to explain the difference...), so using a lower voltage requires external shifting/resignaling, making chips like the TDP0604 or SN65DP159 necessary (... from what I understand).
 

vz50938

New member
Got it :)

They are using TMDS_33 I/O (which is a differential, current based (CML) 3.3v I/O driver) for the high speed serial data. The TPD12S016 is ESD-diode protection for the serial I/F and level shifter/buffer for the I2C port.

The Kintex device should support this as well.
 

demik

Well-known member
New update:
(a) The RAM disk driver now has the option to use DMA & interrupt-based completion, showing it's possible to share the interrupt between two devices (the FB needs it for VBL). Annoyingly, on my Q650, that's not faster than plain old synchronous memory-mapped access (but would enable larger devices, as memory-mapping is limited to the 16+256 MiB of slot+superslot space). But it opens the door to 'proper' support of e.g. micro-sd or Ethernet.
Very interesting. Small question here: what is the interrupt, or even DMA, used for on a simple memory-mapped RAM disk with MacOS?

All the implementations I know (on 512k/Plus type hardware) were just memory mapped
 

Melkhior

Well-known member
Very interesting. Small question here: what is the interrupt, or even DMA, used for on a simple memory-mapped RAM disk with MacOS?
All the implementations I know (on 512k/Plus type hardware) were just memory mapped
Generally speaking, an interrupt/DMA-based system enables overlap between the CPU and the I/O device, and is thus more efficient (and may in some OSes enable other features, e.g. reordering of operations via a request queue). In MacOS, a disk driver can acknowledge receiving the request immediately upon sending it to the device, and return control to the OS. When the device finishes the request, it is marked as finished by the interrupt handler. I have no idea whether that offers any practical benefit in MacOS 8 vs. synchronous completion (basically, waiting for the I/O to complete before returning control to the OS).
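The difference between the two completion styles can be shown with a small model. This is a generic sketch, not MacOS driver code: `Request`, `AsyncDisk` and `device_finishes_one` are invented names, and the "device" is just a queue that is drained by hand to stand in for the hardware raising its completion interrupt.

```python
from collections import deque


class Request:
    def __init__(self, sector, count):
        self.sector = sector
        self.count = count
        self.done = False


class AsyncDisk:
    """Toy disk whose requests complete later, via an interrupt handler."""

    def __init__(self):
        self.in_flight = deque()

    def submit(self, req):
        # Asynchronous style: queue the request and return immediately.
        self.in_flight.append(req)
        return req

    def device_finishes_one(self):
        # Stand-in for the hardware finishing a transfer and raising its IRQ.
        if self.in_flight:
            self.on_interrupt(self.in_flight.popleft())

    def on_interrupt(self, req):
        # The interrupt handler is what marks the request as finished.
        req.done = True


def submit_sync(disk, req):
    """Synchronous style: do not return until the I/O has completed."""
    disk.submit(req)
    while not req.done:
        disk.device_finishes_one()
    return req
```

In the asynchronous case the caller gets control back while `req.done` is still `False` and can do other work; in the synchronous case it only returns once the request is done.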

In MacOS, another benefit is that the physical memory space is somewhat limited, at 4 GiB. A NuBus device can only map 16 MiB (slot) + 256 MiB (superslot) in 32-bit mode, so a memory-mapped RAM disk is effectively limited to around 256 MiB. DMA has no such limit, so it could expose a much larger device - Artix-7 boards exist with 512 MiB or 1 GiB, but it's possible to go even higher.

For NuBus specifically, the CPU only initiates long-word (32-bit) transactions to the memory-mapped area, each independent from the others; so a full NuBus cycle is needed for every 32-bit long word, each seeing the full latency of accessing the DDR3 SDRAM. The DMA engine initiates 16-byte (resp. 32-byte) pipelined transfers to the memory, and uses 16-byte (resp. 32-byte) synchronous NuBus block transfers to/from the memory controller. The data are always transferred in multiples of the sector size (512 bytes), so that's always possible. Overall it's a much more efficient use of the NuBus bandwidth and hides the memory latency better. At least theoretically, because my current implementation isn't faster on the Q650 than trivial memory-mapping using BlockMove :-( (at 16 bytes it's a bit slower, at 32 bytes a bit faster).
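The numbers behind this are easy to work out. The sketch below only counts bus transactions per sector and the size of the mappable window; it says nothing about how long each transaction takes (which is where the measured results above come from), and the helper name is made up.

```python
SECTOR_BYTES = 512
MIB = 1024 * 1024


def transactions_per_sector(bytes_per_transaction, sector_bytes=SECTOR_BYTES):
    """Number of NuBus transactions needed to move one sector."""
    if sector_bytes % bytes_per_transaction:
        raise ValueError("sector is not a whole number of transactions")
    return sector_bytes // bytes_per_transaction


# Memory-mapped access: one independent 32-bit (4-byte) transaction per long word.
print(transactions_per_sector(4))    # 128
# DMA with 16-byte and 32-byte block transfers.
print(transactions_per_sector(16))   # 32
print(transactions_per_sector(32))   # 16

# Address space a NuBus card can map in 32-bit mode: slot + superslot.
mappable = 16 * MIB + 256 * MIB
print(mappable // MIB)               # 272
```

Since 512 is a multiple of both 16 and 32, a sector always splits into whole block transfers, which is the "always possible" point made above.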

Ultimately my primary goal was to debug the DMA/NuBus bus-master bit of the hardware, and test interrupt sharing, rather than improve the RAM disk. Adding a micro-sd card would make a revised NuBusFPGA more useful and would need both features.
 

olePigeon

Well-known member
This is really, really, really cool. I know absolutely nothing about hardware development, so this is like magic to me.

I have an E-Machines video card with an adapter board that adds ethernet. Would pictures of that help any for your project?
 

Melkhior

Well-known member
This is really, really, really cool. I know absolutely nothing about hardware development, so this is like magic to me.
I have an E-Machines video card with an adapter board that adds ethernet. Would pictures of that help any for your project?
I don't remember/have never seen any video card with Ethernet, so pictures would be nice to satisfy my curiosity, thanks! (Is it a Futura II SX?)
Most, if not all, NuBus cards that I remember were single-function (unless the DSP stuff on video cards for Photoshop is counted as a second function).
Back in the day they probably did it by having both functions hang off the NuBus at different addresses, so probably quite a bit of discrete logic and PALs for decoding addresses and 'routing' the data. My design is a bit different from what was done back then: the NuBus interface is connected mostly to a Wishbone bus internal to the FPGA. The memory ranges for the particular NuBus slot the card is in are mapped to part of the Wishbone (with the rest of the memory map used for stuff not visible on the Mac host, such as the video accelerator micro-code). The devices are connected to the Wishbone with addresses in that particular part, so they are visible to the host. So to add Ethernet, I would just need to add e.g. a LiteEth core on the Wishbone in that part, connect an appropriate external Ethernet PHY to some pins of the FPGA (but pins are in limited supply...), and connect an appropriate RJ45 with 'magnetics' to the PHY.
The difficult part is by now more on the software side. One would need to write a driver specific to LiteEth or whatever other Ethernet controller is put into the FPGA. Someone has done it before, but I'm not sure how; the documentation doesn't seem as comprehensive for Ethernet drivers as for framebuffers, SCSI devices or sound devices.
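As an illustration of the slot-to-Wishbone mapping idea, here is a small address-decoding sketch. The NuBus side uses the standard 32-bit layout (16 MiB of slot space at $Fs000000 and 256 MiB of superslot space at $s0000000 for slot $s); the Wishbone base addresses are purely hypothetical placeholders and are not taken from the actual NuBusFPGA memory map.

```python
# Hypothetical internal Wishbone bases, chosen only for this example.
WB_SLOT_BASE = 0xF000_0000       # where the 16 MiB slot window would land
WB_SUPERSLOT_BASE = 0x8000_0000  # where the 256 MiB superslot window would land


def decode(addr, slot):
    """Return (region, offset) if a 32-bit NuBus address targets `slot`, else None."""
    if not 0x9 <= slot <= 0xE:
        raise ValueError("Macintosh NuBus slot IDs are $9 to $E")
    if (addr >> 24) == (0xF0 | slot):
        return ("slot", addr & 0x00FF_FFFF)       # 16 MiB standard slot space
    if (addr >> 28) == slot:
        return ("superslot", addr & 0x0FFF_FFFF)  # 256 MiB superslot space
    return None


def to_wishbone(addr, slot):
    """Translate a NuBus address to a (hypothetical) Wishbone address, or None."""
    hit = decode(addr, slot)
    if hit is None:
        return None
    region, offset = hit
    base = WB_SLOT_BASE if region == "slot" else WB_SUPERSLOT_BASE
    return base + offset
```

A device added on the Wishbone, such as an Ethernet core, would then just need registers at some offset inside one of these two windows to become visible to the host.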
 

robin-fo

Well-known member
I don't remember/have never seen any video card with Ethernet, so pictures would be nice to satisfy my curiosity, thanks! (Is it a Futura II SX?)
There is the E-Machines ColorLink card:
The apparently exact same specimen is now in my possession and probably defective since 1994… (I wonder if someone once plugged an AUI transceiver into the display connector 😅)
 

Melkhior

Well-known member
Here’s a picture of a Futura II LX/DSP, but I don’t have either of the expansion modules for it.
Thanks. I would guess the slot at the opposite end of the backplate gives access to the on-board video memory for the DSP daughterboard or similar, though it's not very close to the memory chip. And the connector just above the NuBus connector is likely to give access to NuBus signals to daughterboards, such as Ethernet. When you have control of the mapping of all devices on the board, that's probably all that's needed. The only question is interrupt sharing, but that's not very difficult to plan for with at most two devices.

Also I wonder how successful they were historically - and what was the intended market for those dual-function boards. II, IIx and IIfx had plenty of slots and could have separate Ethernet and Video. From the IIci onward, Macintoshes had on-board video so NuBus video was nowhere near as important. Late Quadras and all PowerMacintoshes had on-board Ethernet as well. It seems to me that those were targeted either at the IIcx in the early days, or the Q700 later on. The IIsi could also use one of those, but its intended market segment probably didn't want to spend big bucks on a combo expansion device at the time - and a PDS version would be faster anyway. Nice solution in search of a problem? And yet they have just become the NuBus device I most want :)
 

jeremywork

Well-known member
(from auction photos) I think this one is the Futura II SX/DSP; mostly the same but with less VRAM (L for large displays; S for small). There is also a set without headers for the DSP, which have a blue rotary switch installed for setting resolution instead of the software mechanism on these (hence the '/DSP' moniker). http://archive.retro.co.za/mirrors/68000/www.vintagemacworld.com/radius/emfutdsp.html has a bit of info.
i-img584x306-1611376410ira7jb98766.jpg
i-img611x346-1611376415lpe5vd98260.jpg

If I follow your post you may have the headers switched. The one directly above the Nubus connector would connect to the DSP module, and the one vertically oriented would connect to a horizontal ethernet interface which would span all the way to the cutout occupied by the black plastic blank. Only one of the two can be physically installed at once, and the suitable DSP card also fits the SuperMac Spectrum Power 1152, fwiw.

supermac_spectrumpower1152.jpg
 

Melkhior

Well-known member
If I follow your post you may have the headers switched. The one directly above the Nubus connector would connect to the DSP module, and the one vertically oriented would connect to a horizontal ethernet interface
Interesting. There are clearly some signals routed directly from the NuBus to the DSP connector. I wonder what is routed to the secondary header - some long-running NuBus traces feel unlikely, so perhaps some pre-decoded stuff from that E-Machines ASIC? Some nice engineering either way.
 

Powerbase

Well-known member
While adding Ethernet sounds neat, it's starting to sound like feature creep to me. The fastest NuBus video card with digital output would be a good goal, I think, though.
 

Trash80toHP_Mini

NIGHT STALKER
Out of town ATM, but I've got at least a couple of the version with the cutouts on PCB and backplane bracket/black cover plate.

I've got one each 10bT and ThinNet daughter card NICs to match. IIRC, there's a thread on bbraun's site where he explained to me how the VidCard/NIC function was decoded from a single NuBus address. When I get back in town I can dig 'em out to post pics of the two NICs if you're curious, M? Glad to see you're back and at it again! :)

ISTR the problem with the Futura add-on NICs being that they're limited to some level of networking or other by the OS.
 