Possible to get the saved instruction address at interrupt time, in a Time Manager task?

bribri · Apr 15, 2025

I just noticed this bit of text in Retro68's README about ConvertObj:

Reads a MPW 68K Object file (*.o) and converts it to input for the GNU assembler (powerpc-apple-macos-as). Well, as long as the .o file does not use global variables or non-local function calls.

That strongly suggests that trying to get Retro68 linking PerformLib.o is a dead end. :-(

SuperSVGA · Apr 15, 2025

I'm not familiar with how C or Pascal might be doing it, but my guess would be it uses LINK to save the stack pointer to an address register. If you can get that stack pointer you should have a predictable location for the exception stack frame. It will be a bit further back however, as the interrupt handler saves D0-D3/A0-A3 to the stack, and the location may differ between older vs newer ROMs.
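To make the "a bit further back" arithmetic concrete, here is a minimal Python sketch. It assumes the handler pushes exactly eight longword registers (D0-D3/A0-A3) and relies on the fact that a 68k exception frame starts with the 16-bit status register followed by the 32-bit PC. The function name and the `extra_return_addresses` parameter are hypothetical; how many return addresses actually sit in between depends on the ROM.

```python
# Hypothetical sketch: byte offset from the current SP to the saved PC.
REG_SIZE = 4   # each saved register is a longword
SR_SIZE = 2    # the exception frame begins with the 16-bit status register


def saved_pc_offset(saved_regs=8, extra_return_addresses=0):
    """Offset from SP to the PC stored in the exception frame.

    saved_regs: registers pushed by the interrupt handler (D0-D3/A0-A3 = 8).
    extra_return_addresses: JSR return addresses pushed after the register
    save (an assumption; the real count is ROM-dependent).
    """
    return saved_regs * REG_SIZE + extra_return_addresses * 4 + SR_SIZE


print(saved_pc_offset())      # 34
print(saved_pc_offset(8, 1))  # 38
```

Inside a task there is at least the task's own return address on the stack as well, which is one reason a fixed offset cannot simply be assumed across ROM versions.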

bribri · Apr 15, 2025

Okay... progress! It occurred to me I might be able to avoid the linking issue if I could use MPW to create a library that already has all of the symbols linked. And so I did... combining together PerformLib and Interface. I also had to resolve symbols to memcpy and memchr, but once I did that... it links! And if I call InitPerf, it crashes!

I suppose that's progress though.

SuperSVGA said:
I'm not familiar with how C or Pascal might be doing it, but my guess would be it uses LINK to save the stack pointer to an address register. If you can get that stack pointer you should have a predictable location for the exception stack frame. It will be a bit further back however, as the interrupt handler saves D0-D3/A0-A3 to the stack, and the location may differ between older vs newer ROMs.

Someone pointed out to me the source code for Time Manager. I might still give that another try, since it seems like the more I bash my head against the wall trying to get PerformLib to work, the more bashing I have to do.

cheesestraws · Apr 15, 2025

Sorry for disappearing on this thread, life got busy. I think you've got more useful input from the others upthread than you would have got from me anyway. I will sound one note of warning, though:

bribri said:
Someone pointed out to me the source code for Time Manager.

Don't forget that the Time Manager source code that is out there is from one specific version of the OS at one specific time, and that Apple were totally at liberty to change the internals at any point. I don't know whether they did change or not, and you might be able to get away with it, but if you want this to work on multiple OSes and you take this path, you'll need to do reasonably extensive testing.

bribri · Apr 15, 2025

For now I'll be happy if it can work on just one system. But it would be nice to have a general solution, which is why I was hoping I could get PerformLib to compile.

David Cook · Apr 15, 2025

Using CodeWarrior 11, I can use either of the PerformLib.o libraries I attached earlier in this thread. It compiles, links, and runs just fine.

Using the Basilisk II emulator, the report is produced, but apparently without capturing any samples. Either the emulator is too fast when sampling at 4 ms (the default test code) or the library is not compatible with the emulator. I have not pursued this yet.

Instead, I copied the sample application to a IIci and this is the output.

Performance Parameters
======================
 
Bytes per bucket, Code and ROM: 8
Bytes per bucket, RAM: 4
Sampling Interval: 4 ms
 
Performance Summary
===================
 
Each * in the Bar Graph = 1 hits
Total hits outside of the sampled segments: 0
Maximum hits in one bucket: 38
Total hits in all buckets: 211
 
Performance Data
================
 
Offset Hits | Segment 127  size  40000
====================================================================
 99B0    38 |**************************************
 99B8    18 |******************
 99C0     6 |******
 99C8     2 |**
 99D0    22 |**********************
 A2A8     2 |**
 A2D0     1 |*
2C620     4 |****
2C6B0     2 |**
2C6B8     6 |******
2C6C0     3 |***
2C6C8     3 |***
2C6D0     5 |*****
2C6D8    19 |*******************
2C6E0     3 |***
2C6E8     6 |******
2C6F0     1 |*
2C6F8     1 |*
2C700     2 |**
2C708     3 |***
2C710     2 |**
2C718     1 |*
2C720     1 |*
2C728     2 |**
2C730     4 |****
2C738     3 |***
2C740     5 |*****

Offset Hits | Segment 1  size   8E32  name Sources
====================================================================
  7AC     1 |*
  7F4     2 |**
  7FC     4 |****
  854     3 |***
  894     2 |**
  8A4     1 |*
  8AC     1 |*
  8B4     1 |*
  904     1 |*
  94C     3 |***
  95C     4 |****
  964     6 |******
  96C     2 |**
  974     5 |*****
  A0C     3 |***
  A14     2 |**
  A1C     1 |*
  A24     2 |**
  A2C     1 |*
  A34     1 |*

The segment listed as Segment 127 is likely the IIci ROM. Notice the size is 40000 in hex. I expected 80000.
The segment listed as Segment 1 is definitely the application code segment. The id and size are correct. (I reused a large test console application that I use for anything I'm playing with. A dedicated TestPerf application would be smaller.)
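As a sanity check on the report, the per-bucket hit counts can be summed and compared with the summary lines. This is a small Python sketch; the lists are transcribed from the output above and the function name is made up for illustration.

```python
# Hit counts transcribed from the report above, in listing order.
SEG127_HITS = [38, 18, 6, 2, 22, 2, 1, 4, 2, 6, 3, 3, 5, 19,
               3, 6, 1, 1, 2, 3, 2, 1, 1, 2, 4, 3, 5]
SEG1_HITS = [1, 2, 4, 3, 2, 1, 1, 1, 1, 3, 4, 6, 2, 5, 3, 2, 1, 2, 1, 1]


def summarize(*segments):
    """Return (total hits, largest single bucket) across all segments."""
    all_hits = [h for seg in segments for h in seg]
    return sum(all_hits), max(all_hits)


total, biggest = summarize(SEG127_HITS, SEG1_HITS)
print(total)    # 211
print(biggest)  # 38
print(f"{sum(SEG127_HITS) / total:.0%} of samples landed in Segment 127")
```

The totals agree with "Total hits in all buckets: 211" and "Maximum hits in one bucket: 38", so the report is internally consistent.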

The takeaway is that it works. However, it might be best to reverse engineer the library to extract the critical code. I've attached a disassembly from within CodeWarrior in case that helps.

- David

bribri · Apr 15, 2025

I've been poring over a disassembly of it using Ghidra, trying to work out how its interrupt function determines where the Program Counter was at interrupt time. So far I haven't gotten anything! Though I do believe I figured out where the interrupt function is. It doesn't look like CodeWarrior recognized that block as code. Its instructions start with:

Code:

4e 56 00 00     link.w     A6,0x0
2c 5f           movea.l    (SP)+,A6
10 3a 00 78     move.b     (0x78,PC),D0b
67 40           beq.b
30 3a 00 68     move.w     (0x68,PC),D0w
20 37 00 00     move.l     (0x0,SP,D0w*0x1),D0

I'm still fairly new to assembly, so it's hard for me to work out everything it's doing. But I don't see it doing anything that might enable it to get at the stored Program Counter. The only suspicious thing I've noticed is it setting A5 to something interesting:

Code:

        00011350 2a 78 09 04     movea.l    (DAT_00000904).w,A5

Basically setting A5 to the contents of memory at address 0x0904 (edit: not to 0x0904 itself), a particular spot in low memory. But then it doesn't actually do anything with A5, and restores it later. So part of me is wondering if that's boilerplate left in by the function's original author. I'm pretty sure it's not getting the stored PC from the stack. But it's not getting anything from any address registers either! It's really quite a mystery.

bribri · Apr 15, 2025

Here's the entire interrupt function, as disassembled by Ghidra:

Code:

                             TimerInterruptFunc                              XREF[3]:     L318.__z37___z37_PERFORM:0001124
                                                                                          L318.__z37___z37_PERFORM:0001125
                                                                                          L318.__z37___z37_PERFORM:0001126
        0001132e 4e 56 00 00     link.w     A6,0x0
        00011332 2c 5f           movea.l    (SP)+,A6
        00011334 10 3a 00 78     move.b     (0x78,PC)=>DAT_000113ae,D0b
        00011338 67 40           beq.b      LAB_0001137a
        0001133a 30 3a 00 68     move.w     (0x68,PC)=>DAT_000113a4,D0w
        0001133e 20 37 00 00     move.l     (0x0,SP,D0w*0x1),D0
        00011342 59 4f           subq.w     #0x4,SP
        00011344 2f 00           move.l     D0,-(SP)
                             LAB_00011346+2                                  XREF[0,1]:   00011346(c)  
        00011346 4e ba 00 00     jsr        LAB_00011346+2
        0001134a 20 1f           move.l     (SP)+,D0
        0001134c 48 e7 40 04     movem.l    {  A5 D1},-(SP)
        00011350 2a 78 09 04     movea.l    (DAT_00000904).w,A5
        00011354 42 a7           clr.l      -(SP)
        00011356 2f 3a 00 52     move.l     (0x52,PC)=>DAT_000113aa,-(SP)
        0001135a 2f 00           move.l     D0,-(SP)
                             LAB_0001135c+2                                  XREF[0,1]:   0001135c(c)  
        0001135c 4e ba 00 00     jsr        LAB_0001135c+2
        00011360 20 1f           move.l     (SP)+,D0
        00011362 6d 12           blt.b      LAB_00011376
        00011364 20 7a 00 3a     movea.l    (0x3a,PC)=>DAT_000113a0,A0
        00011368 d1 c0           adda.l     D0,A0
        0001136a 30 10           move.w     (A0),D0w
        0001136c 52 40           addq.w     #0x1,D0w
        0001136e 66 04           bne.b      LAB_00011374
        00011370 30 3c ff ff     move.w     #-0x1,D0w
                             LAB_00011374                                    XREF[1]:     0001136e(j)  
        00011374 30 80           move.w     D0w,(A0)
                             LAB_00011376                                    XREF[1]:     00011362(j)  
        00011376 4c df 20 02     movem.l    (SP)+,{  D1 A5}
                             LAB_0001137a                                    XREF[1]:     00011338(j)  
        0001137a 41 fa 00 34     lea        (0x34,PC)=>TMTask,A0
        0001137e 4a 78 02 8e     tst.w      (DAT_0000028e).w
        00011382 6c 08           bge.b      LAB_0001138c
        00011384 70 01           moveq      #0x1,D0
        00011386 31 40 00 0a     move.w     D0w,(0xa,A0)=>DAT_000113ba
        0001138a 60 06           bra.b      LAB_00011392
                             LAB_0001138c                                    XREF[1]:     00011382(j)  
        0001138c 20 3a 00 18     move.l     (0x18,PC)=>DAT_000113a6,D0
        00011390 a0 5a           PrimeTime
                             LAB_00011392                                    XREF[1]:     0001138a(j)  
        00011392 4e 75           rts

SuperSVGA · Apr 15, 2025

bribri said:
I'm still fairly new to assembly, so it's hard for me to work out everything it's doing. But I don't see it doing anything that might enable it to get at the stored Program Counter. The only suspicious thing I've noticed is it setting A5 to something interesting:

Code:

00011350 2a 78 09 04 movea.l (DAT_00000904).w,A5

Basically setting A5 to 0x0904, a particular spot in low memory. But then it doesn't actually do anything with A5, and restores it later. So part of me is wondering if that's boilerplate left in by the function's original author. I'm pretty sure it's not getting the stored PC from the stack. But it's not getting anything from any address registers either! It's really quite a mystery.

That code just sets A5 to the A5 World location for the current application. Typically it would be used for the application's variables, probably just something the compiler leaves in for the possibility that it will be used.

bribri · Apr 15, 2025

Doing more analysis... I think the key is in these lines:

Code:

0001133a 30 3a 00 68     move.w     (0x68,PC)=>DAT_000113a4,D0w
0001133e 20 37 00 00     move.l     (0x0,SP,D0w*0x1),D0

Somehow DAT_000113a4 is assigned to the right offset value for the stack to get at the stored PC at interrupt time. Looking at the rest of the code, it's using that value as though that were it.
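What those two instructions do can be modelled in a few lines of Python: load a 16-bit offset, sign-extend it (as 68k indexed addressing does for a `.w` index), and read a big-endian longword from SP plus that offset. This is only a sketch; the memory contents, the SP value and the offset 34 are made-up test data, not values taken from PerformLib.

```python
import struct


def sign_extend_16(value):
    """Sign-extend a 16-bit index the way (d8,An,Dn.w) addressing does."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def saved_pc(memory, sp, magic_offset):
    """Model of: move.l (0,SP,D0.w),D0 with D0.w = magic_offset."""
    address = sp + sign_extend_16(magic_offset)
    return struct.unpack_from(">I", memory, address)[0]  # 68k is big-endian


# Made-up example: pretend the interrupted PC sits 34 bytes above SP.
memory = bytearray(64)
sp = 16
struct.pack_into(">I", memory, sp + 34, 0x0001133A)
print(hex(saved_pc(memory, sp, 34)))  # 0x1133a
```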

According to Ghidra, the only other place DAT_000113a4 gets set is in the function PERFSETUP. But confusingly, Ghidra doesn't think it gets called from anywhere else in the whole library!

But then... in the CodeWarrior disassembly, at offset 6E6 in INITPERF, we see this:

Code:

000006E6: 4EAD 0000          JSR       PERFSETUP

How on earth is it figuring that jumps to PERFSETUP? It's JSR 0x0, i.e. jump to the next instruction, right? Is there some nuance of 68k assembly I'm missing here?

Edit: Is it jumping to the value in an address register?

David Cook · Apr 15, 2025

Rough draft analysis.

link      a6,#$0000
movea.l   (sp)+,a6
move.b    <Anon_05>+$0E,d0      // Move some global variable into D0. Maybe whether performance checking is currently enabled?
beq.s     PERFINTE+$4C          // If it is zero, goto exit stuff
move.w    <Anon_05>+$04,d0      // Move some global variable into D0. Looks like the magic offset of the saved PC on the stack.
move.l    $00(sp,d0.w),d0       // Using that as an offset against the stack pointer, get some value (PC?) from the stack and put it into D0
subq.w    #$4,sp                // Make some room on the stack (for a return value?)
move.l    d0,-(sp)              // Push D0 onto the stack
jsr       <Anon_04>             // Call Anon_04, which I think just strips addresses to account for 24/32 bit mode
move.l    (sp)+,d0              // Get the cleaned D0 back
movem.l   d1/a5,-(sp)           // Save some registers
movea.l   CurrentA5,a5          // Get the current application's A5 globals (this might not be the correct way of doing it)
clr.l     -(sp)                 // Put 0 onto the stack
move.l    <Anon_05>+$0A,-(sp)   // Put some global onto the stack
move.l    d0,-(sp)              // Put the cleaned D0 (interrupted program counter?) onto the stack
jsr       CVTPC                 // Call a routine that probably points to the array index if D0 is something we care about
move.l    (sp)+,d0              // Get the result of that call. Probably the array index of addresses we are tracking.
blt.s     PERFINTE+$48          // Less than zero? We don't care about D0. Goto restore the registers and exit
movea.l   <Anon_05>,a0          // Point to a global variable address. Probably our array of addresses we are tracking?
adda.l    d0,a0                 // Add the result of the previous call. Probably the index into the array.
move.w    (a0),d0               // Get the value of what is there
addq.w    #$1,d0                // Increment it
bne.s     PERFINTE+$46          // Did it roll over to zero? Nope.
move.w    #-$0001,d0            // Crap. It rolled over. Set it to maximum.
move.w    d0,(a0)               // Save the incremented count.
movem.l   (sp)+,d1/a5           // Restore the registers
lea       <Anon_05>+$10,a0      // Point to some global structure (seems like this could move down after the bge)
tst.w     ROM85                 // Are we on an early Mac? (128K, 512K)
bge.s     PERFINTE+$5E          // Nope. Goto renew our Time Manager alarm
moveq     #$01,d0               // Yes, early Mac.
move.w    d0,$000A(a0)          // Set some variable in our global structure
bra.s     PERFINTE+$64          // Exit
move.l    <Anon_05>+$06,d0      // Get our sample time from our global variable
_PrimeTime           ; OS trap  // Renew Time Manager alarm
rts
unlk      a6
rts
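The counter update in the middle of that listing (`addq.w #1` / `bne` / `move.w #-1`) is a saturating 16-bit increment: if the count wraps to zero it is pinned at $FFFF instead. A minimal Python model, with a made-up function name:

```python
def bump_bucket(count):
    """Saturating 16-bit increment, mirroring addq.w #1 / bne / move.w #-1."""
    count = (count + 1) & 0xFFFF   # addq.w #1,d0 wraps at 16 bits
    if count == 0:                 # bne not taken: the counter rolled over
        count = 0xFFFF             # move.w #-1,d0 pins it at the maximum
    return count


print(bump_bucket(0))        # 1
print(bump_bucket(0xFFFE))   # 65535
print(bump_bucket(0xFFFF))   # 65535
```

So a very hot bucket stops counting at 65535 hits rather than wrapping back to a misleadingly small number.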

David Cook · Apr 15, 2025

bribri said:
Somehow DAT_000113a4 is assigned to the right offset value for the stack to get at the stored PC at interrupt time

I agree!

bribri said:
How on earth is it figuring that jumps to PERFSETUP? It's JSR 0x0, i.e. jump to the next instruction, right? Is there some nuance of 68k assembly I'm missing here?

I just dumped the RAW .o from within CodeWarrior. To save memory and disk space, the linker only needs to take parts of the library. After collecting all the parts from the entire program, the linker fills in the final addresses everywhere. So, it is 0000 at this point in the process.
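A rough Python illustration of what "the linker fills in the final addresses" means for those four bytes: the opcode word stays the same and only the 16-bit placeholder after it is overwritten. The displacement 0x0072 below is invented for the example; the real value depends on where the linker places PERFSETUP.

```python
import struct


def patch_word(code, offset, value):
    """Return a copy of code with the big-endian 16-bit word at offset replaced."""
    patched = bytearray(code)
    struct.pack_into(">H", patched, offset, value & 0xFFFF)
    return bytes(patched)


unlinked = bytes.fromhex("4EAD0000")        # JSR with a zero placeholder
linked = patch_word(unlinked, 2, 0x0072)    # 0x0072 is a made-up displacement
print(linked.hex())  # 4ead0072
```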

David Cook · Apr 15, 2025

bribri said:
PERFSETUP

Is called right at the end of INITPERF if it is successful in setting up everything else before that. It looks like PERFSETUP is using a parameter sent in for the magic offset.

Here is where the magic offset is being saved in the global variable. It looks like PERFSETUP doesn't perform the calculation, but only sets up the structures based on whether this is an early ROM or not.

bribri · Apr 16, 2025

I'm getting a little further with this. I'm looking at the routine (revealingly) named CalcPCOffset. It looks like what it does is set up a Time Manager task to fire, then loops endlessly with a 60 FE BRA.S $0000 statement. Because there's now only one instruction execution could be at during interrupt time, the interrupt function actually scans through the stack looking for that instruction's memory address.
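The scanning idea can be sketched in Python like this. It is a simplified model, not PerformLib's actual routine: the stack is a plain byte string, the search steps by words (68k stack contents are word-aligned), and the layout in the demo (nine zero longwords, a status register word, then the PC) is invented for the example.

```python
import struct


def find_long(stack, target, step=2):
    """Return the first even offset whose big-endian longword equals target."""
    for offset in range(0, len(stack) - 3, step):
        if struct.unpack_from(">I", stack, offset)[0] == target:
            return offset
    return None


# Invented layout: 36 bytes of saved state, a 16-bit SR, then the saved PC.
SPIN_LOOP_ADDRESS = 0x00012340   # hypothetical address of the BRA.S
stack = bytes(36) + struct.pack(">H", 0x2000) \
    + struct.pack(">I", SPIN_LOOP_ADDRESS) + bytes(8)
print(find_long(stack, SPIN_LOOP_ADDRESS))  # 38
```

Because the only instruction running is the known `BRA.S`, the offset where its address turns up is the offset of the saved PC for that ROM's interrupt path.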

However, there's some things the interrupt function does that I don't understand, and I'm now realizing that Ghidra chokes on these instructions and disassembles them as the wrong thing (it thinks they are JSR statements that don't go anywhere), so I've been missing it this whole time. Those are:

Code:

4ead 0000      JSR      +0(A5)

What happens if you jump to whatever's in the A5 register? I thought that was a pointer to data, not code, and about the current app's globals, and at interrupt time the value of it isn't even guaranteed to be anything predictable.
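For reference, the opcode word itself says which addressing mode a JSR uses: bits 5-3 are the mode and bits 2-0 the register. A small Python decoder (a sketch covering only JSR, with a made-up function name) shows that `4EAD` is `JSR d16(A5)`, a call through an A5-relative offset rather than a jump to A5 itself, while the `4EBA` that Ghidra showed is `JSR d16(PC)`. In classic 68k Mac applications, calls through positive offsets from A5 go to the jump table; in an unlinked .o the offset is still a zero placeholder.

```python
def decode_jsr(opcode):
    """Decode the effective-address field of a 68k JSR opcode word."""
    if opcode & 0xFFC0 != 0x4E80:      # JSR is 0100 1110 10 mmm rrr
        return None
    mode = (opcode >> 3) & 7
    reg = opcode & 7
    if mode == 2:
        return f"JSR (A{reg})"
    if mode == 5:
        return f"JSR d16(A{reg})"
    if mode == 6:
        return f"JSR d8(A{reg},Xn)"
    if mode == 7:
        return {0: "JSR (xxx).W", 1: "JSR (xxx).L",
                2: "JSR d16(PC)", 3: "JSR d8(PC,Xn)"}.get(reg)
    return None                        # other modes are not valid for JSR


print(decode_jsr(0x4EAD))  # JSR d16(A5)
print(decode_jsr(0x4EBA))  # JSR d16(PC)
```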

cheesestraws · Apr 16, 2025

bribri said:
Because there's now only one instruction execution could be at during interrupt time, the interrupt function actually scans through the stack looking for that instruction's memory address.

Oh bloody hell. That's... kind of awful.

bribri · Apr 16, 2025

Well, if it's good enough for Apple....

But there's still more to the story. If I just scan through what's on the stack naively, I never find what I'm looking for. Their code however is doing something with the A5 register. Here's what I'm seeing, using a different disassembler that handles opcode 4ead correctly:

Code:

0000001c : 41fa ffa0      LEA -96(PC),A0  <-- address of endless loop instruction
00000020 : 2408           MOVE.L   A0,D2
00000022 : 594f           SUBQ.W   #4,SP
00000024 : 2f02           MOVE.L   D2,-(SP)
00000026 : 4ead 0000      JSR +0(A5)
0000002a : 241f           MOVE.L   (SP)+,D2
0000002c : 204f           MOVEA.L  SP,A0

It's calling some function that's supposedly on the A5 register at interrupt time and -- I think -- passing it the address of the instruction we're trying to find in the stack. It then copies what I believe to be the result into register D2, which is then later used when scanning the stack for what we're looking for. And it copies the stack pointer into A0, and then begins its search. There's another point later, right before comparing a point in memory in the stack with register D2, where it does this same thing, though I'm still not sure why.
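One possible reason a naive scan fails, offered as an assumption rather than something confirmed from the disassembly: if the helper being called is an address-stripping routine like the one guessed at in the earlier analysis, then the value being searched for has had its high byte masked off. In 24-bit mode only the low 24 bits of an address are significant, so two values that look different can name the same location. A Python sketch with hypothetical names:

```python
ADDR_MASK_24 = 0x00FFFFFF   # low 24 bits are the address in 24-bit mode


def strip_address(addr, mask=ADDR_MASK_24):
    """Keep only the significant address bits (assumes 24-bit mode)."""
    return addr & mask


def same_location(a, b):
    """Compare two addresses after stripping both."""
    return strip_address(a) == strip_address(b)


print(same_location(0x80012340, 0x00012340))  # True
print(same_location(0x00012340, 0x00012342))  # False
```

If this is what is going on, a scan would need to strip each stack value before comparing it to the stripped target.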

This is really quite mysterious, but it's clearly part of the secret sauce that makes this whole dastardly thing work. I'm kind of flummoxed, though. I tried imitating it in my own code and predictably got a system error.

David Cook · Apr 16, 2025

bribri said:
I'm getting a little further with this. I'm looking at the routine (revealingly) named CalcPCOffset. It looks like what it does is set up a Time Manager task to fire, then loops endlessly with a 60 FE BRA.S $0000 statement. Because there's now only one instruction execution could be at during interrupt time, the interrupt function actually scans through the stack looking for that instruction's memory address.

Wow. Not crazy if it works, I guess.

David Cook · Apr 16, 2025

bribri said:
Their code however is doing something with the A5 register.

Are you disassembling the raw library or after it has been compiled into a program? I think you are looking at the raw library, which is before all the correct addresses have been inserted by the linker. Here is what the compiled library looks like at that location.

My JSR has been linked to Anon28, right below CVTPC.

By the way, I assume the calculation interrupt routine must then change (increment by four bytes?) the found PC address on the stack, such that it can escape the endless loop after returning from the interrupt.

bribri · Apr 16, 2025

David Cook said:
Are you disassembling the raw library or after it has been compiled into a program? I think you are looking at the raw library, which is before all the correct addresses have been inserted by the linker. Here is what the compiled library looks like at that location.

Yes, I have been using the raw library, and I didn't realize some of those statements would be changed after linking. Thanks for pointing that out! I would've ended up chasing my tail for quite a while otherwise. I built the example app using MPW, so I'll try disassembling that.

David Cook said:
By the way, I assume the calculation interrupt routine must then change (increment by four bytes?) the found PC address on the stack, such that it can escape the endless loop after returning from the interrupt.

I believe you're correct, though it increments by two bytes. Here's what it does:

Code:

        000125c2 54 40           addq.w     #0x2,D0w
        000125c4 22 30 00 00     move.l     (0x0,A0,D0w*0x1),D1
        000125c8 54 81           addq.l     #0x2,D1
        000125ca 21 81 00 00     move.l     D1,(0x0,A0,D0w*0x1)

At that point A0 contains the address of the stack, and D0 the offset to the return address.
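The read, add, write-back sequence can be modelled in Python as below. `BRA.S` to itself is a two-byte instruction (60 FE), so adding 2 to the saved return address makes execution resume just past the loop. The stack contents and the address 0x00012340 are invented for the example.

```python
import struct


def skip_spin_loop(stack, offset, instruction_size=2):
    """Advance the saved PC at stack[offset] past the 2-byte BRA.S."""
    pc = struct.unpack_from(">I", stack, offset)[0]      # move.l (0,A0,D0.w),D1
    pc = (pc + instruction_size) & 0xFFFFFFFF            # addq.l #2,D1
    struct.pack_into(">I", stack, offset, pc)            # move.l D1,(0,A0,D0.w)
    return pc


stack = bytearray(8)
struct.pack_into(">I", stack, 2, 0x00012340)   # hypothetical saved PC
print(hex(skip_spin_loop(stack, 2)))  # 0x12342
```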

bribri · Apr 16, 2025

@david_cook What disassembler are you using? I'm starting to suspect that Ghidra is unreliable, in that it's not getting some of the jump instructions correctly, even when I used the linked library.
