
Making a VNC server for A/UX

cheesestraws

Well-known member
Since logistics currently requires software projects, I've been having a go at building a VNC server for A/UX 3, both inspired by and based on @marciot's MiniVNC. Although this started off as a straight fork, most of it has in fact now been rewritten because of the very different problems that VNC on A/UX faces (and because I'm better at C than C++, but ssssh). It isn't "finished" yet, but I thought people might be interested in how far it's come. This has been a journey of swearing and reverse-engineering and has been rather enjoyable so far.

Here's a video to hook you in. This is running on real hardware, an overclocked Q650. You can see that even with its current rather basic incremental update support, it's already pretty responsive, and the computer is totally usable over it (well, totally usable except for only half the keyboard working so far, etc, etc). Note also that the VNC session is maintained across user sessions, and it manages the change in colour configuration without going funny.


View attachment vnc-rec.mov

The code as it stands is at https://github.com/cheesestraws/aux-minivnc. Part of it is a kernel module to provide access to the framebuffer and input devices from outside the Mac environment. This is probably the first non-trivial A/UX kernel module written in some considerable time (though SolraBizna beat me to writing the first one at all). There's then a userspace daemon which talks the VNC protocol, manages the remote framebuffer, and wrangles input.

Why the big differences between this and MiniVNC for the Mac?
  • Easier: We don't have to deal with MacTCP, we can just use standard UNIX sockets. It's much faster, and this means we need to worry less about how much we're sending and how we're scheduling that sending. So, at the moment, we are using raw encoding rather than TRLE and it's already responsive. This would preclude running over MacIP, but A/UX doesn't support MacIP anyway, so this isn't a problem.
  • Easier: we've just got more CPU and memory to play with. We're not trying to deal with 68000s, we're not dealing with Mac Pluses.
  • Harder: we need to deal with more complicated situations where display parameters change under us, for example between sessions. Some of those sessions aren't even Mac-like: the console emulator session type, which just gives a big full-screen terminal, is only pretending to be Mac-like; nothing of the Mac OS is actually running.
  • Harder: there's no central place to get the information we need. Under Mac OS, we can just ask QuickDraw for all the details about screens and so forth. On A/UX, we can't. QuickDraw lives inside the Mac world and if we want to survive across multiple sessions, including terminal sessions, we can't use it. So we need to ferret out information from corners of the kernel, talk to the kernel's cradle around the slot manager and video drivers, and so forth.
  • Harder: we can't talk to the hardware directly. Getting access to things like framebuffers can only be done via custom code in the kernel (I had hoped to use the user interface driver for this, but we can't). So we live in a split world where there is a kernel module and a userspace component (sketched just after this list), and at the moment we have to copy the framebuffer out of kernel space rather than consulting it directly. This adds up to 90ms of latency to updating the framebuffer. There is a way around this, but it requires me to go and do some more disassembly of the kernel...
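To make that split a bit more concrete, here's roughly what the userspace half looks like in miniature: open a character device, then ask the kernel module to copy the framebuffer into a buffer we own. The device path, ioctl number and geometry below are placeholders I've made up for the example, not the module's real interface.

C:
// Rough sketch only: FB_DEV, FB_COPY and FB_BYTES are invented for
// illustration; the real names live in the kernel module's header.
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <unistd.h>

#define FB_DEV   "/dev/vncfb"    /* placeholder device node           */
#define FB_COPY  1               /* placeholder ioctl number          */
#define FB_BYTES (640L * 480L)   /* placeholder: 640x480 at 8bpp      */

int main(void)
{
    unsigned char *shadow;
    int fd;

    fd = open(FB_DEV, O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    /* the daemon keeps its own copy of the screen in user space... */
    shadow = malloc(FB_BYTES);
    if (shadow == NULL) { close(fd); return 1; }

    /* ...and the kernel module fills it on request; this copy out of
       kernel space is where the extra latency mentioned above comes from */
    if (ioctl(fd, FB_COPY, shadow) < 0) { perror("FB_COPY"); return 1; }

    /* ...then diff against the previous copy, encode, send to the viewer */

    free(shadow);
    close(fd);
    return 0;
}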
So, are people interested in more details about the development of this / how I got this far / what the current problems I'm facing are?
 

MrFahrenheit

Well-known member
This is a really cool project, and I thank you for doing it.

How much of the host machine does this process use while running?
 

cheesestraws

Well-known member
How much of the host machine does this process use while running?

I haven't done any serious performance measurements yet; it's too early days for that. But while I'm just mucking around it doesn't seem to be loading the system too heavily: the load average remains the same when noodling in the Mac environment with it or without it. Realistically, it's spending most of its life doing I/O, waiting for network traffic to send and be received, so I don't think it should be too impactful.

Also, bear in mind that because this is UNIX you can always run the server at a lower priority, which will give more time to other things.

It's probably best to think about this as like doing moderately busy file sharing.

I'd love to read more about this!

Thanks! I realise this is a bit niche :D. OK, let's talk about the current problem I'm having.

At the moment, the VNC server only works in 256 colour modes. It would be nice if it did better, especially since things like the console emulator run in lower colour depth modes. The first part of this is already in place: the virtual framebuffer that the VNC server exposes is always 256 colours so that the VNC session can remain in a consistent colour depth as the computer changes the "real" colour depth.

However, the rub is: how do we find out what the current colour depth is? On the Mac we'd just ask QuickDraw. We can't do that, we haven't got QuickDraw. Because we're in the kernel, we have access to the A/UX video information that the video driver itself uses, which looks good at first glance (in sys/video.h):

C:
struct video {
    char    *video_addr;        /* address of screen bitmap */
    long     video_mem_x;        /* memory x width in pixels */
    long     video_mem_y;        /* memory y height in pixels */
    long     video_scr_x;        /* screen x width in pixels */
    long     video_scr_y;        /* screen y height in pixels */
    ...
    VPBlock vpBlock;        /* video parameters for def mode */
    ...
};

VPBlock is a QuickDraw video mode block. Hooray! But what I didn't notice was those three little letters 'def' in the comment. That's the default screen mode, not the current screen mode. So we need to go and find the real mode ourselves. Bummer.

A/UX aims for decent compatibility with Macintosh display cards. The way it does this is that instead of trying to do everything itself, it creates little cradles around the ROM Device Manager and the ROM Slot Manager so that it can use their routines to talk to the hardware. The entry point to the video driver is called kallDriver. The kernel code itself here rapidly descends into black magic—unsurprisingly—but calling the video driver itself is moderately easy. I'm already doing it to get the colour palette, here. And according to Cards and Drivers, there's a driver call that video cards need to implement to return the current mode number:

Screenshot 2022-12-05 at 11.07.53.png

But this only gives us the mode number. To turn this into anything useful, we need the details on the mode. I was hoping the kernel kept all this around in memory, but it seems not to. But what we can do is do the same thing that the kernel does to get the information on the default mode, and just keep rather more of it. This is in a function called getVPBlock, a fragment of which is, in Ghidra's rather bizarre idioms:

Screenshot 2022-12-05 at 11.14.51.png

From the Mac point of view, this is delightfully straightforward: we're literally just pulling the mode lists out of the active sResources of the video card. And what's more, we have a slotmanager() call that we can use to call the Slot Manager from inside the kernel without the whole world falling down. Hooray!

So, putting these pieces together, I'm planning to:
  • When the kernel module is opened, go and get the video modes for the card you first thought of from the slot manager. Cache them in RAM so we only have to do this once: the list of valid video modes shouldn't change during run time.
  • When the kernel module gets a FB_MODE ioctl, which is what the VNC server uses to ask it for details on the current mode, use kallDriver(... 2 ...) to request the current mode number, then look it up in the table we constructed above and copy the VPBlock out (there's a rough sketch of this lookup just after this list).
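To make the second step a bit more concrete, here's a K&R-flavoured sketch of the lookup half. FB_MODE and the overall plan are as above; the cache layout, the MAXMODES bound and the function name are just illustrative, and the actual kallDriver call that produces the mode number is elided rather than guessed at.

C:
/* Sketch only: cache layout and names are illustrative.  The kallDriver
 * status call (the "2" above) that produces 'mode' is not shown here. */

#include <sys/video.h>          /* for VPBlock, as above */

#define MAXMODES 8              /* assumed upper bound on modes per card */

struct modecache {
    int      nmodes;
    int      ids[MAXMODES];     /* sRsrc mode IDs (0x80 and up)       */
    VPBlock  vpbs[MAXMODES];    /* video parameters for each mode     */
};

static struct modecache mc;     /* filled once, at open time, from the
                                 * Slot Manager via slotmanager()      */

/* The lookup half of FB_MODE: given the mode number the video driver
 * reported, copy the matching cached VPBlock into *vpb.
 * Returns 0 on success, -1 if the mode isn't in the cache. */
int
fb_lookup_mode(mode, vpb)
    int mode;
    VPBlock *vpb;
{
    int i;

    for (i = 0; i < mc.nmodes; i++) {
        if (mc.ids[i] == mode) {
            *vpb = mc.vpbs[i];
            return 0;
        }
    }
    return -1;
}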
 

Phipli

Well-known member
Does VNC cope with colour palette changes? Like when you open an image with a custom colour table in Photoshop?
 

cheesestraws

Well-known member
Does VNC cope with colour palette changes? Like when you open an image with a custom colour table in Photoshop?

Yes, it's designed to deal with indexed colour modes with a mutable palette. You can send a palette update on every update, if you want to. The first draft of this did that. But that's a lot of unnecessary data.

This again is a bit complicated by the fact that a full 256 colour palette is 2kbytes, but ioctls can only return 128 bytes. So to fetch the full palette out of kernel space using ioctls requires multiple calls. Which is slow and rather wasteful.

To get around this, what I do now is provide an ioctl in the kernel module that computes the crc32 hash of the current palette and returns that. The VNC server can then look at that and see whether it's changed or not. If it has, it reloads the palette using multiple ioctls and pushes it out to the viewer. Otherwise, it lets it be.

In theory collisions are possible here, but I've no idea what two palettes that collide would look like: the gulf between the set of theoretically possible palettes (where collisions exist) and the set of palettes anyone is actually likely to use suggests this is pretty unlikely.
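In sketch form, the userspace side of that looks something like the below. The ioctl names, numbers and chunking convention are placeholders for illustration; only the idea (hash in the kernel, refetch the palette only when the hash changes) is the real one.

C:
// Illustrative only: FB_PALCRC / FB_PALCHUNK and the chunk-index-in-byte-0
// convention are made up for this sketch.
#include <string.h>
#include <sys/ioctl.h>

#define PAL_BYTES   2048          /* 256 entries * 8 bytes              */
#define CHUNK       128           /* ioctl return-size limit per call   */
#define FB_PALCRC   1             /* placeholder ioctl numbers          */
#define FB_PALCHUNK 2

static unsigned long last_crc;
static unsigned char palette[PAL_BYTES];

/* returns 1 if the palette changed and was refetched, 0 otherwise */
int refresh_palette(int fd)
{
    unsigned long crc;
    int off;

    if (ioctl(fd, FB_PALCRC, &crc) < 0)
        return 0;
    if (crc == last_crc)
        return 0;                 /* unchanged: nothing to push out     */

    /* changed: pull it out of kernel space 128 bytes at a time */
    for (off = 0; off < PAL_BYTES; off += CHUNK) {
        unsigned char chunk[CHUNK];
        chunk[0] = off / CHUNK;   /* tell the module which chunk we want */
        if (ioctl(fd, FB_PALCHUNK, chunk) < 0)
            return 0;
        memcpy(palette + off, chunk, CHUNK);
    }

    last_crc = crc;
    return 1;                     /* caller pushes the new CLUT to viewers */
}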
 

mdeverhart

Well-known member
a fragment of which is, in Ghidra's rather bizarre idioms
Out of curiosity (and off topic), how are you handling A-Traps in Ghidra? Last time I looked it didn’t have a graceful way of handling them; they were flagged as unknown instructions and the following disassembly was rather muddled. (And of course, it would be really nice to translate them to proper function calls with parameters in the reconstructed C, but now I’m being greedy…)
 

cheesestraws

Well-known member
Out of curiosity (and off topic), how are you handling A-Traps in Ghidra?

I'm afraid to say I'm not: fortunately in the A/UX kernel, any A-traps used are wrapped up enough that I don't need to poke the bit that actually invokes the trap to understand what's going on. At the moment the only thing I know of that disassembles A-traps properly is IDA. I seem to remember finding a half-finished Ghidra enhancement to do this but I can't find it now.
 

cy384

Well-known member
Out of curiosity (and off topic), how are you handling A-Traps in Ghidra?

somewhat off topic, but I have an easy little hack for very basic A trap support for ghidra, I can make a thread here later if there's interest

nice work with this vnc server!
 

cheesestraws

Well-known member
The Slot Manager, even in THINK Reference, doesn't seem to be terribly well documented, so here's a function that returns the details of a graphics mode from a graphics card: slot is the slot to interrogate; mode is the mode number (0x80 or above); and vpb is a pointer to a VPBlock, as defined in video.h, which will receive the video mode information.

This is Metrowerks' dialect of C. I now need to translate this to K&R and see how badly it crashes when I introduce it to the kernel...

C:
// Headers are the ones I'd expect under the universal interfaces;
// adjust for your interface version / the A/UX equivalents as needed.
#include <Slots.h>
#include <Video.h>
#include <Memory.h>
#include <stdio.h>
#include <string.h>

OSErr getVideoMode(int slot, int mode, VPBlock* vpb) {
    SpBlock sb;
    OSErr err;
    VPBlock* vp;
    
    err = noErr;
    memset(&sb, 0, sizeof(sb));
    
    sb.spSlot = slot;
    sb.spID = 0;
    sb.spCategory = 3;
    sb.spCType = 1;
    sb.spDrvrSW = 1;
    sb.spTBMask = 1; // ignore spDrvrHW
    err = SNextTypeSRsrc(&sb);
    if (err != 0) {
        return err;
    }
    
    // did SNextTypeSRsrc land in the slot we asked about?
    if (sb.spSlot != slot) {
        return -1;
    }
    
    printf("got the bugger\n");
    
    sb.spID = mode;
    err = SFindStruct(&sb);
    if (err != noErr) {
        return err;
    }
    
    sb.spID = 1;
    err = SGetBlock(&sb);
    if (err != noErr) {
        return err;
    }
    
    vp = (VPBlock*)sb.spResult;
    memcpy(vpb, vp, sizeof(VPBlock));
    printf("%d (%d, %d)\n", mode, vp->vpPixelType, vp->vpPixelSize);
    
    DisposePtr((char*)sb.spResult);
    
    return noErr;
}
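For the curious, the translation is mostly mechanical: old-style parameter lists, /* */ comments only (the // comments above won't fly with the system cc), and all declarations at the top of the block. Roughly this shape, with the body carried over from above:

C:
/* K&R-style shape of the same function for the system cc; types come
 * from the same headers, and the body is essentially the one above with
 * the // comments rewritten in this style. */
OSErr
getVideoMode(slot, mode, vpb)
    int slot;
    int mode;
    VPBlock *vpb;
{
    SpBlock sb;
    OSErr err;
    VPBlock *vp;

    /* ... body as in the ANSI version above ... */
    return noErr;
}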
 

cheesestraws

Well-known member
After a certain amount of mucking about I can now get 1-bit video modes working, which means I can see the console emulator!

Keyboard doesn't work in this mode yet, though, which was expected.

Screenshot 2022-12-06 at 20.23.16.png
 

cheesestraws

Well-known member
Does your server also work if there is no monitor attached?

I don't actually know how A/UX reacts if it doesn't see a monitor attached at all.

My plan is to leave just a 15-to-VGA monitor adapter plugged into the machine so it thinks there's a monitor :)
 

tecneeq

Well-known member
  1. How fast would this be on a 68030?
  2. Have you replaced compute-intensive stuff with assembler, or is this not how it works and the C libs are optimized as well as it gets?
  3. Do you use the systems c libs or something from GNU? What compiler did you use?
  4. Why is there no cat food with mouse flavour?
  5. Will the server survive changes in screen size?
  6. How did you add your Q650 to the network, wired using AAUI?
No need to repeat what has been said, but: it's awesome.
 

tecneeq

Well-known member
BTW, are kernel modules loaded dynamically or do you have to reconfigure and reboot all the time?:unsure:
 

cheesestraws

Well-known member
How fast would this be on a 68030?

No idea, not tried it. I'd expect noticeably slower. I don't have any A/UX compatible 030s except an SE/30 which is unbearable under A/UX 3 anyway, so...

Have you replaced compute-intensive stuff with assembler, or is this not how it works and the C libs are optimized as well as it gets?

There isn't much compute-intensive stuff here; it's mostly just moving memory from A to B. At the moment it's all C, because it's at the point where the optimisations I can make are algorithmic optimisations rather than micro-optimisations. There is perhaps some mileage in rewriting the framebuffer copy-and-compare routines in asm, since they're quite naïve at the moment, and perhaps in adding something like the hashing from MiniVNC for slower machines. But I don't have an A/UX machine slow enough to really test whether they make a difference in interactive latency.
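For the curious, "naïve" here means something like the following: a row-by-row compare against a shadow copy of the last frame we sent, marking rows dirty as we go. This is just an illustration of the idea, not the project's actual routine.

C:
// Illustration only: a naive row-by-row copy-and-compare.  'shadow' holds
// the last frame we sent; 'fb' is a fresh copy of the framebuffer.
#include <string.h>

/* Marks dirty[y] = 1 for every row that changed, refreshes the shadow,
 * and returns the number of dirty rows. */
int compare_rows(unsigned char *fb, unsigned char *shadow,
                 int rowbytes, int height, char *dirty)
{
    int y, ndirty = 0;

    for (y = 0; y < height; y++) {
        unsigned char *a = fb + (long)y * rowbytes;
        unsigned char *b = shadow + (long)y * rowbytes;

        if (memcmp(a, b, rowbytes) != 0) {
            memcpy(b, a, rowbytes);   /* keep the shadow up to date */
            dirty[y] = 1;
            ndirty++;
        } else {
            dirty[y] = 0;
        }
    }
    return ndirty;
}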

Do you use the systems c libs or something from GNU? What compiler did you use?

System libc. The application is compiled with gcc; the kernel module with system cc. The kernel module probably could have been built for gcc, but I wanted to eliminate variables and I knew the system cc worked. And I don't really mind writing K&R in small quantities.

Why is there no cat food with mouse flavour?

Because one of its legs is both the same.

Will the server survive changes in screen size?

Can't change screen resolution without a reboot, fortunately, so it doesn't have to.

How did you add your Q650 to the network, wired using AAUI?

Yup. Nothing exotic.

BTW, are kernel modules loaded dynamically or do you have to reconfigure and reboot all the time?:unsure:

Reconfigure and reboot! We're in old-school territory. I must have built more A/UX kernels in the last three weeks than have been built on earth in some considerable time.
 

cheesestraws

Well-known member
By interrogating the slot manager and the video driver, we can now do all the indexed colour modes from 1-bit up to 8-bit. 8-bit colour is fastest because, as I said, the native remote framebuffer is 8-bit.

View attachment Screen Recording 2022-12-07 at 20.57.18.mov

A couple of interesting wrinkles on the way here: it's not mentioned in Cards & Drivers, but if you request the CLUT from a card and ask for more colours than the current mode has, it'll tell you to bog off, in the politest possible terms. This confused me for a while; once I'd figured it out, I changed the kernel driver to ask only for the number of colours the mode can actually provide.

Another interesting problem was how to do the {1, 2, 4}-bit to 8-bit mapping fast enough to keep the thing responsive. I flirted briefly with this rather fun technique, which did work faster than the naïve implementation, but ended up using fixed lookup tables because they were faster still and, on an A/UX box, we can afford to spend 2kbytes more RAM.
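For illustration, the 1-bit case looks something like this: one 256-entry table whose rows are the 8-byte expansions of each possible input byte, which is where the extra 2kbytes goes. Table layout and names here are mine, not necessarily what's in the repo.

C:
// Illustration of the fixed-lookup-table expansion from 1-bit pixels to
// the 8-bit virtual framebuffer: 256 input bytes x 8 output bytes = 2KB.
#include <string.h>

static unsigned char expand1[256][8];   /* one row per possible input byte */

void init_expand1(void)
{
    int byte, bit;

    for (byte = 0; byte < 256; byte++)
        for (bit = 0; bit < 8; bit++)
            /* MSB is the leftmost pixel; the pixel value is used directly
               as the palette index in this sketch */
            expand1[byte][bit] = (unsigned char)((byte >> (7 - bit)) & 1);
}

/* Expand npix 1-bit pixels (packed 8 per byte in 'src') into 'dst'. */
void expand_1bit_row(const unsigned char *src, unsigned char *dst, int npix)
{
    int i;

    for (i = 0; i < npix / 8; i++)
        /* one table lookup turns a whole input byte into 8 output bytes */
        memcpy(dst + i * 8, expand1[src[i]], 8);
}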

As a bonus, video in X now works, too, although input doesn't.

Screenshot 2022-12-07 at 21.13.47.png
 