
Porting LinuxDoom to System 7 (or trying to) - progress and questions

esselfortium

Well-known member
It's been stated plenty of times on here that the official Mac port of Doom was less than great, and it's also well-known to perform terribly on 68k. It's also closed-source (unless you want to pay some eBayer $1000 to get your hands on a CD).

The official Doom source release in 1997 was for the Linux port, and has since been backported to countless other platforms, but never to 68k System 7. So, I thought I'd see if I could get anything compiling in CodeWarrior 6.0 (IDE 4.1).

I've made a lot of progress getting through the source, porting over the most crucial functionality and (for now) disabling things like sound for the sake of simplicity. The official Mac Toolbox guides and contemporary Mac C programming books have been hugely educational and made it possible to get this far, but I've run into some issues that I'm not sure how to solve.

(For further context: I am not an experienced C programmer, but I have taken courses on C and have many years of experience with other languages. This project is me jumping into the deep end of the pool and smashing my head on a rock repeatedly.)

The first big one: alloca. I found a single reference to alloca in an Apple dev guide from the 90s, but nothing more. CodeWarrior doesn't seem to recognize it. I'm unsure if I just need to #include something different, or if the affected code needs to be rewritten to use something completely different (in which case I am going to have bigger questions).

Here's an example of alloca in use in the Linuxdoom source, while compositing textures from one or multiple "patch" graphics:
Code:
//
// R_GenerateLookup
//
void R_GenerateLookup (int texnum)
{
    texture_t*      texture;
    byte*       patchcount; // patchcount[texture->width]
    texpatch_t*     patch;  
    patch_t*        realpatch;
    int         x;
    int         x1;
    int         x2;
    int         i;
    short*      collump;
    unsigned short* colofs;
    
    texture = textures[texnum];

    // Composited texture not created yet.
    texturecomposite[texnum] = 0;
    
    texturecompositesize[texnum] = 0;
    collump = texturecolumnlump[texnum];
    colofs = texturecolumnofs[texnum];
    
    // Now count the number of columns
    //  that are covered by more than one patch.
    // Fill in the lump / offset, so columns
    //  with only a single patch are all done.
    patchcount = (byte *)alloca (texture->width);
    memset (patchcount, 0, texture->width);
    patch = texture->patches;
        
    for (i=0 , patch = texture->patches;
     i<texture->patchcount;
     i++, patch++)
    {
    realpatch = W_CacheLumpNum (patch->patch, PU_CACHE);
    x1 = patch->originx;
    x2 = x1 + SHORT(realpatch->width);
    
    if (x1 < 0)
        x = 0;
    else
        x = x1;

    if (x2 > texture->width)
        x2 = texture->width;
    for ( ; x<x2 ; x++)
    {
        patchcount[x]++;
        collump[x] = patch->patch;
        colofs[x] = LONG(realpatch->columnofs[x-x1])+3;
    }
    }
    
    for (x=0 ; x<texture->width ; x++)
    {
    if (!patchcount[x])
    {
        printf ("R_GenerateLookup: column without a patch (%s)\n",
            texture->name);
        return;
    }
    // I_Error ("R_GenerateLookup: column without a patch");
    
    if (patchcount[x] > 1)
    {
        // Use the cached block.
        collump[x] = -1;    
        colofs[x] = texturecompositesize[texnum];
        
        if (texturecompositesize[texnum] > 0x10000-texture->height)
        {
        I_Error ("R_GenerateLookup: texture %i is >64k",
             texnum);
        }
        
        texturecompositesize[texnum] += texture->height;
    }
    }   
}

What's the right way to handle this for System 7?
 

ymk

Well-known member
alloca() allocates space on the stack. I found no reference to it in THINK C 6 either.

You may have to write this yourself with a bit of assembly:

1. Store the current stack pointer (A7) in the return register (D0).
2. Decrement the stack pointer by the argument to alloca(), with some safety checks.
3. Put an error code in D0 if those checks fail.
4. Return.
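If hand-writing the assembly for those steps is unappealing, one portable stand-in with the same last-in, first-out lifetime is a fixed scratch buffer with mark and release calls. To be clear, this is a different technique from the stack-pointer approach above, not an implementation of it, and `Scratch_Mark`, `Scratch_Alloc`, `Scratch_Release`, and `SCRATCH_SIZE` are all hypothetical names chosen for this sketch:

```c
#include <stddef.h>

#define SCRATCH_SIZE 4096   /* hypothetical size; pick the largest request you expect */

static unsigned char scratch[SCRATCH_SIZE];
static size_t scratch_top = 0;          /* offset of the first free byte */

/* Remember the current top so it can be restored later. */
size_t Scratch_Mark (void)
{
    return scratch_top;
}

/* Bump-allocate size bytes, or return NULL if the buffer is exhausted. */
void *Scratch_Alloc (size_t size)
{
    void *p;

    if (size > SCRATCH_SIZE)
        return NULL;
    size = (size + 3) & ~(size_t)3;     /* round up so offsets stay multiples of 4 */
    if (size > SCRATCH_SIZE - scratch_top)
        return NULL;

    p = scratch + scratch_top;
    scratch_top += size;
    return p;
}

/* Discard everything allocated since the matching Scratch_Mark. */
void Scratch_Release (size_t mark)
{
    scratch_top = mark;
}
```

A function would call `Scratch_Mark` on entry, `Scratch_Alloc` wherever it used alloca, and `Scratch_Release` on every exit path. Unlike real alloca, forgetting the release leaks scratch space until some outer caller releases its own mark.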
 

zigzagjoe

Well-known member
The original port isn't /bad/ from a quality perspective - they actually did quite well in adapting it to Mac in terms of making it pleasant to use. Unfortunately, they spent no time on optimizing performance it seems, so that's the real problem... I suspect some well placed assembly functions would have helped a lot.

I'd also been thinking about doing this, after I spent a bit of time hacking a sort of timedemo functionality into the original port. I was looking at using doomgeneric as a starting point - https://github.com/ozkl/doomgeneric

I do wonder what performance increases might be realized simply by using a newer compiler, but I'm not sure how easy it'd be to convince llvm or gcc to compile some code into something you could jigger into a classic mac os app.
 

esselfortium

Well-known member
https://github.com/autc04/Retro68 might be worth a try for compiling with, though now that I look at it in more detail I see it's missing support for post-system-7 features, so maybe not...

DoomGeneric looks interesting, thanks for pointing me to that!

Functionality-wise, the most "ideal" option for porting would probably be Chocolate Doom, since it's as vanilla-like as possible while supporting a variety of modern platforms and including built-in support for tools like dehacked and deusf that are frequently used alongside custom wads to modify the original exe. But, as with most modern-day Doom ports, it would mean having to work around SDL.

I'm sure you're right that some assembly could speed things up. There are also some ways of optimizing the renderer logic itself that have been found over the years. I believe a bunch of those methods have been implemented into a DOS port called FastDoom, which is targeted at very-low-end PCs.
 

ymk

Well-known member
The Mac Doom II developers did well with what they had (though I'm not sure how similar the engine is to Doom I).

Mac ports are handicapped by having to paint a 640x480 screen with a 320x200 game.
The high detail option rendered at 4x the detail of the PC version and is an unfair comparison.
The low detail option just scaled up 320x200, which takes around a quarter of CPU time by itself.

The scaling routines appear to be all hand-rolled assembly, so they did take the time to optimize at least that part.
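To give a feel for the work involved, here is a plain C sketch of 2x pixel doubling for an 8-bit indexed frame. It is not Mac Doom's actual routine (which, as noted, appears to be hand-rolled assembly); `ScaleFrame2x` and `dst_pitch` are hypothetical names, and the 640x400 output is assumed to sit inside a 640x480 screen. Every one of the 64,000 source pixels turns into four destination writes:

```c
#include <stddef.h>

#define SRC_W 320
#define SRC_H 200

/* Doubles an 8-bit indexed SRC_W x SRC_H frame to (2*SRC_W) x (2*SRC_H).
   dst_pitch is the number of bytes per destination row and must be
   at least 2*SRC_W. */
void ScaleFrame2x (const unsigned char *src, unsigned char *dst, size_t dst_pitch)
{
    int x, y;

    for (y = 0; y < SRC_H; y++)
    {
        const unsigned char *in   = src + (size_t)y * SRC_W;
        unsigned char       *out0 = dst + (size_t)(2 * y) * dst_pitch;
        unsigned char       *out1 = out0 + dst_pitch;

        for (x = 0; x < SRC_W; x++)
        {
            unsigned char p = in[x];

            /* Each source pixel becomes a 2x2 block. */
            out0[2 * x]     = p;
            out0[2 * x + 1] = p;
            out1[2 * x]     = p;
            out1[2 * x + 1] = p;
        }
    }
}
```

An optimized version would presumably write several pixels per store rather than one byte at a time, which is the kind of thing hand-written assembly is good for.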

If Doom is poorly optimized, what's an example of a well-optimized and comparable 68K FPS game?
 

esselfortium

Well-known member
Doom and Doom II run on the same engine, Doom II just raised some limits and added some new behaviors. As long as it's a new enough version of the executable, you can feed Doom.exe or Doom2.exe either game's WAD file and they'll handle it without issues.

That's a good point about the scaling, though the Mac Doom port does offer a non-scaled-up small mode accessible from the menubar, which in my experience hasn't seemed to make a particularly noticeable difference to performance, surprisingly.

I'm not sure about other 68k FPS games offhand, but with a new 68k Doom port (or source access to the original port) it could be possible to implement some of the optimizations that have been found in the years since the game's release.
 

ymk

Well-known member
Rather than compiling the whole game from scratch, you can replace individual Mac Doom functions with ones compiled into another application.

Then change the jump addresses with MacsBug.

The Macsbug names were left in the code, so you can search for the functions by name.

I used this method to play with the scaling code and get FPS measurements.

You can also skip parts of the rendering process, which gives you an idea of how much time the game spends at each step.
 

zigzagjoe

Well-known member
Marathon would be the easiest example of a 68K FPS that runs on most things.

I relied a lot on the function names to hack the engine back into singletick mode. Buildtick seems to have been mostly rolled into netupdate, so I ended up hacking on that instead. I lost interest in really finishing the job though.
 

uyjulian

Well-known member
You can use Universal Interfaces with Retro68 to get support for those post-System 7 features.
 

joevt

Well-known member
I think CodeWarrior 6 should have alloca for 68K. I grepped for \b(__)?alloca\b and found these occurrences, which include the alloca.h header file. I don't see a library, so I think it could be part of the compiler (notice the /Compilers/ in the search results).

Code:
/Volumes/Devs/MetrowerksStuff/CodeWarriorPro6/CodeWarrior Help/err/ccpp/ccpp idx
/Volumes/Devs/MetrowerksStuff/CodeWarriorPro6/CodeWarrior Help/err/ccpp/ccpp_gi.htm
/Volumes/Devs/MetrowerksStuff/CodeWarriorPro6/CodeWarrior Help/err/ccpp/errref_610242.htm
/Volumes/Devs/MetrowerksStuff/CodeWarriorPro6/CodeWarrior Help/err/err idx
/Volumes/Devs/MetrowerksStuff/CodeWarriorPro6/CodeWarrior Manuals/CodeWarrior Manuals idx
/Volumes/Devs/MetrowerksStuff/CodeWarriorPro6/CodeWarrior Manuals/HTML/MSL_C_Reference/_MSL_C_RefIX.fm3.html
/Volumes/Devs/MetrowerksStuff/CodeWarriorPro6/CodeWarrior Manuals/HTML/MSL_C_Reference/_MSL_C_RefTOC.fm.html
/Volumes/Devs/MetrowerksStuff/CodeWarriorPro6/CodeWarrior Manuals/HTML/MSL_C_Reference/MCR010_Intro.fm1.html
/Volumes/Devs/MetrowerksStuff/CodeWarriorPro6/CodeWarrior Manuals/HTML/MSL_C_Reference/MCR020_alloca.fm.html
/Volumes/Devs/MetrowerksStuff/CodeWarriorPro6/CodeWarrior Manuals/HTML/MSL_C_Reference/MCR020_alloca.fm1.html
/Volumes/Devs/MetrowerksStuff/CodeWarriorPro6/CodeWarrior Manuals/HTML/MSL_C_Reference/MCR020_alloca.fm2.html
/Volumes/Devs/MetrowerksStuff/CodeWarriorPro6/CodeWarrior Manuals/HTML/MSL_C_Reference/MCR120_io.fm1.html
/Volumes/Devs/MetrowerksStuff/CodeWarriorPro6/CodeWarrior Manuals/HTML/MSL_C_Reference/MCR150_malloc.fm.html
/Volumes/Devs/MetrowerksStuff/CodeWarriorPro6/CodeWarrior Manuals/HTML/MSL_C_Reference/MCR150_malloc.fm1.html
/Volumes/Devs/MetrowerksStuff/CodeWarriorPro6/CodeWarrior Manuals/HTML/MSL_C_Reference/MCR150_malloc.fm2.html
/Volumes/Devs/MetrowerksStuff/CodeWarriorPro6/CodeWarrior Manuals/HTML/MSL_C_Reference/MSL_C_Reference idx
/Volumes/Devs/MetrowerksStuff/CodeWarriorPro6/CodeWarrior Manuals/HTML/Porting_Reference/Porting_Reference idx
/Volumes/Devs/MetrowerksStuff/CodeWarriorPro6/CodeWarrior Manuals/HTML/Targeting_MacOS/MAC120_PPCAsm.fm.html
/Volumes/Devs/MetrowerksStuff/CodeWarriorPro6/CodeWarrior Manuals/HTML/Targeting_MacOS/Targeting_MacOS idx
/Volumes/Devs/MetrowerksStuff/CodeWarriorPro6/CodeWarrior Manuals/HTML/Targeting_Windows/Targeting_Windows idx
/Volumes/Devs/MetrowerksStuff/CodeWarriorPro6/CodeWarrior Manuals/HTML/Targeting_Windows/WIN110_Asmblr.fm.html
/Volumes/Devs/MetrowerksStuff/CodeWarriorPro6/CodeWarrior Manuals/PDF/index/assists/00000005.wld
/Volumes/Devs/MetrowerksStuff/CodeWarriorPro6/CodeWarrior Manuals/PDF/index/assists/00000006.wld
/Volumes/Devs/MetrowerksStuff/CodeWarriorPro6/CodeWarrior Manuals/PDF/index/parts/00000005.did
/Volumes/Devs/MetrowerksStuff/CodeWarriorPro6/CodeWarrior Manuals/PDF/index/parts/00000007.did
/Volumes/Devs/MetrowerksStuff/CodeWarriorPro6/CodeWarrior Manuals/PDF/MSL_C_Reference.pdf
/Volumes/Devs/MetrowerksStuff/CodeWarriorPro6/Metrowerks CodeWarrior/CodeWarrior Plugins/Compilers/MW C:C++ 68K
/Volumes/Devs/MetrowerksStuff/CodeWarriorPro6/Metrowerks CodeWarrior/CodeWarrior Plugins/Compilers/MW C:C++ PPC
/Volumes/Devs/MetrowerksStuff/CodeWarriorPro6/Metrowerks CodeWarrior/CodeWarrior Plugins/Compilers/MW C:C++ x86
/Volumes/Devs/MetrowerksStuff/CodeWarriorPro6/Metrowerks CodeWarrior/CodeWarrior Plugins/Compilers/MW Pascal PPC
/Volumes/Devs/MetrowerksStuff/CodeWarriorPro6/Metrowerks CodeWarrior/MacOS Support/Headers/Alloca/alloca.h
/Volumes/Devs/MetrowerksStuff/CodeWarriorPro6/Release Notes/Compiler Notes/CW Mac 68K Notes 2.4.1.txt
/Volumes/Devs/MetrowerksStuff/CodeWarriorPro6/Release Notes/Compiler Notes/CW Mac 68K Notes 2.4.txt
/Volumes/Devs/MetrowerksStuff/CodeWarriorPro6/Release Notes/Compiler Notes/CW Win x86 Notes 2.4.txt
 

uyjulian

Well-known member
alloca is usually implemented as a compiler builtin. It allocates a buffer on the stack, and there is usually no library associated with it.

In a lot of cases you can replace alloca calls with malloc / free, and you may need to if you are low on stack space.
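A minimal sketch of that swap, assuming a simplified stand-in for the patchcount logic in R_GenerateLookup rather than the real function. `span_t` and `CountOverlappedColumns` are hypothetical names made up for this example:

```c
#include <stdlib.h>
#include <string.h>

typedef unsigned char byte;

/* Hypothetical: one patch's horizontal extent, covering columns [x1, x2). */
typedef struct
{
    int x1;
    int x2;
} span_t;

/* Returns how many columns are covered by more than one span,
   or -1 if the temporary buffer could not be allocated. */
int CountOverlappedColumns (const span_t *spans, int numspans, int width)
{
    byte *patchcount;
    int   i;
    int   x;
    int   overlapped = 0;

    if (width <= 0)
        return 0;

    /* was: patchcount = (byte *)alloca (width); */
    patchcount = (byte *)malloc ((size_t)width);
    if (!patchcount)
        return -1;
    memset (patchcount, 0, (size_t)width);

    for (i = 0; i < numspans; i++)
    {
        int x1 = spans[i].x1 < 0 ? 0 : spans[i].x1;
        int x2 = spans[i].x2 > width ? width : spans[i].x2;

        for (x = x1; x < x2; x++)
            patchcount[x]++;
    }

    for (x = 0; x < width; x++)
    {
        if (patchcount[x] > 1)
            overlapped++;
    }

    /* alloca released this automatically on return; malloc does not. */
    free (patchcount);
    return overlapped;
}
```

The thing to watch when applying this to R_GenerateLookup itself is the early `return` in the "column without a patch" branch: it needs its own `free (patchcount)`, otherwise that path leaks the buffer.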
 

ArbysTPossum

Active member
Extremely interested in where this goes. Other FPS games on Mac that run better would be Duke Nukem 3D and Marathon. Granted, with Duke Nukem 3D, by "run better" I mean "runs great for what it's doing, which is more than Doom". I know Marathon isn't as complex as Doom, but the midway point between Duke3D and Marathon should not run as poorly as Doom does.

There have also been DOS ports of Doom that do not render floor or ceiling textures for additional speed. With Mac Doom II, there's an option to render every other line on screen. I do not know the technical details, which are fascinating to read about even if I only have a loose understanding, but there are many paths to improvement. Again, very interesting.
 

treellama

Well-known member
I’m talking about render complexity in the original levels. Marathon has to do more work, so properly optimized it will slot in your list a bit slower than Doom should.
 

esselfortium

Well-known member
It was sold on eBay for an unbelievable $5006.66.

The disc image also contains the source to the DOS version of Doom, which has never been released to the public before now.
 

Phipli

Well-known member
Very nice of the winner to share it after paying that much.

Unless a slightly mean seller released it after selling it :s
 