Possible to get the saved instruction address at interrupt time, in a Time Manager task?

bribri · Apr 16, 2025

@David Cook ~~Sorry I'm not sure if this is pinging you correctly!~~ Okay I got it. Apologies to the user whose username is simply "David" for pinging you accidentally.

What disassembler are you using? I'm starting to suspect that Ghidra is unreliable, in that it's not decoding some of the jump instructions correctly, even when I used the linked library.

David Cook · Apr 17, 2025

Attached is ResEdit with the code disassembly extension in the ResEdit Preferences file (put in the Preferences folder), and Resorcerer. I switch between them because sometimes ResEdit doesn't read the debugging labels or non-'CODE' resources correctly.

bribri · Apr 17, 2025

Ah yes, I forgot about that ResEdit extension! I actually already have it installed on one of my systems, but I forget to use it.

In any case, I think I've got the search for the stack offset working, using Apple's rather strange method. There very well may be a better way of doing it buried in the assembly of PerformLib, probably somewhere in InitPerf, but that function is kind of a monster and without some kind of good decompilation I don't think I'd want to take a stab at figuring out everything it does.

I ended up writing an interrupt routine purely in assembly so that I could be in complete control of the stack, since otherwise it's difficult to tell the compiler not to push things onto it. That's probably why my efforts at writing a C function to search the stack didn't work out so well. When the search is written in assembly it does seem to find a consistent offset to where the stored PC value is.

The strange thing, though, is that Apple's routine has one additional check in it: it goes through the stack word by word looking for one whose low byte is 0x1F. Then, upon finding it, it checks the next longword to see whether it matches the address of the instruction doing the endless loop. If I put that check for 0x1F in there, it never finds the right offset, because that arrangement of bytes -- 0x1F followed by a longword of the stored PC value -- never occurs.

But that check works when linking to PerformLib. I wonder why that is? Could it have something to do with the fact that I'm not compiling with Apple's compiler? Or maybe (perhaps more likely) the "endless loop" trick is not a code path PerformLib always takes, and it does something more sensible to figure out the offset.
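For readers following along, the search described above can be sketched in portable C. This is only an illustration of the idea as described in this post, not Apple's actual routine: the function name `find_saved_pc_offset` and the use of a byte array as a stand-in for the stack are assumptions, and the real search runs at interrupt time in assembly.

```c
#include <stddef.h>
#include <stdint.h>

/* Read big-endian values, which is how the 68k stores them in memory. */
static uint16_t read_be16(const uint8_t *p) {
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t read_be32(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Hypothetical helper: scan a copy of the stack word by word for a word
 * whose low byte is 0x1F, immediately followed by a longword equal to
 * loop_pc (the address of the endless-loop instruction).
 * Returns the byte offset of that longword, or -1 if it is not found.
 */
long find_saved_pc_offset(const uint8_t *stack, size_t len, uint32_t loop_pc) {
    size_t off;
    for (off = 0; off + 6 <= len; off += 2) {
        if ((read_be16(stack + off) & 0xFF) == 0x1F &&
            read_be32(stack + off + 2) == loop_pc) {
            return (long)(off + 2);
        }
    }
    return -1;
}
```

A word ending in 0x1F that is not followed by the expected address is skipped, which is what makes the extra check useful as a filter against false matches.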

David Cook · Apr 17, 2025

bribri said:
If I put that check for 0x1F in there, it never finds the right offset, because that arrangement of bytes -- 0x1F followed by a longword of the stored PC value -- never occurs.

I believe this is the answer. The program puts 1F in the CCR register (immediately before the endless loop branch), which must get pushed onto the stack in that arrangement when an interrupt occurs. So, just an extra way of making sure this is the actual return address.

Seems like a good idea. Maybe you should add that to your code?

joevt · Apr 17, 2025

bribri said:
Yes, I have been using the raw library, and I didn't realize some of those statements would be changed after linking.

I remember A5 points to globals. But it also points to a jump table for cross segment subroutine calls.
https://dev.os9.ca/techpubs/mac/runtimehtml/RTArch-116.html
https://dev.os9.ca/techpubs/mac/runtimehtml/RTArch-117.html
https://dev.os9.ca/techpubs/mac/runtimehtml/RTArch-118.html

I guess if a JSR is cross-segment then the instruction will remain 0x4EAD (A5-relative), but if it's not cross-segment then it gets changed to 0x4EBA (PC-relative). The offset 0x0000 needs to get changed in either case.
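To make the two encodings concrete, here is a small sketch in C that tells them apart and computes the target of the PC-relative form. The names `classify_jsr` and `jsr_pc_target` are made up for illustration; the opcode values are the ones quoted above, and the target calculation relies on the 68k rule that a PC-relative displacement is taken from the address of the extension word (the opcode address plus 2).

```c
#include <stdint.h>

enum jsr_kind { JSR_OTHER, JSR_A5_RELATIVE, JSR_PC_RELATIVE };

/* Classify a 68k opcode word using the two encodings mentioned above. */
enum jsr_kind classify_jsr(uint16_t opcode) {
    switch (opcode) {
    case 0x4EAD: return JSR_A5_RELATIVE;  /* JSR d16(A5) */
    case 0x4EBA: return JSR_PC_RELATIVE;  /* JSR d16(PC) */
    default:     return JSR_OTHER;
    }
}

/*
 * Target of a PC-relative JSR located at opcode_addr with 16-bit
 * displacement disp. The displacement is relative to the extension
 * word, which sits 2 bytes after the opcode word.
 */
uint32_t jsr_pc_target(uint32_t opcode_addr, int16_t disp) {
    return opcode_addr + 2 + (uint32_t)(int32_t)disp;
}
```

A disassembler that assumes the unlinked 0x0000 displacement is final would compute the wrong target for both forms, which may explain some odd-looking jumps in a raw library.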

bribri · Apr 17, 2025

David Cook said:
I believe this is the answer. The program puts 1F in the CCR register (immediately before the endless loop branch), which must get pushed onto the stack in that arrangement when an interrupt occurs. So, just an extra way of making sure this is the actual return address.

Seems like a good idea. Maybe you should add that to your code?

I missed that! But yes, that explains it, and surely will make this more reliable. Thanks for pointing that out!

bribri · Apr 18, 2025

I think I'm getting close to finishing my Retro68-compatible sampling profiler. I have another question I'm hoping someone here knows the answer to.

When I have an address from the Program Counter, how do I figure out what the relative address is to the segment of code it's executing in? Basically, I need a way of taking the global memory address from the PC and converting it into an offset that will line up with the compiler's debug data.

Is something like this technically a correct solution:

Code:

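    Handle h = GetResource('CODE', 1);                   // handle to the 'CODE' 1 segment
    codeStart = (UInt32)(*h);                            // dereference to get the segment's address
    codeStart = (UInt32)StripAddress((void *)codeStart); // strip non-address bits (matters in 24-bit mode)
    codeEnd = codeStart + GetResourceSizeOnDisk(h);      // end of the segment
    codeStart += 4;                                      // skip the first 4 bytes (segment header)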

David Cook · Apr 18, 2025

bribri said:
When I have an address from the Program Counter, how do I figure out what the relative address is to the segment of code it's executing in? Basically, I need a way of taking the global memory address from the PC and converting it into an offset that will line up with the compiler's debug data.

I guess it depends on whether you are trying to make a global profiler or only for the application that includes it as a library.

If it is only for the current application, then you can call CountResources and GetIndResource to make sure all 'CODE' resources are loaded (hmm... maybe you need to use the Segment Manager?). Lock each one as you go. This will give you a handle on which you can call GetHandleSize. That should give you the address range for each.
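Once the address range of each segment is known, mapping a sampled PC to a segment-relative offset is a simple range lookup. The sketch below is plain C with no Toolbox calls; `struct seg_range` and `locate_pc` are hypothetical names, and filling the table (for example from each locked handle's address and GetHandleSize) is assumed to happen elsewhere, outside interrupt time.

```c
#include <stddef.h>
#include <stdint.h>

/* One loaded, locked code segment (hypothetical bookkeeping struct). */
struct seg_range {
    int16_t  res_id;  /* 'CODE' resource ID */
    uint32_t start;   /* address of the first byte of code */
    uint32_t end;     /* one past the last byte */
};

/*
 * Find the segment that contains pc. On success, store the offset of
 * pc from the segment start in *offset and return the segment.
 * Returns NULL when pc falls outside every known segment.
 */
const struct seg_range *locate_pc(const struct seg_range *segs, size_t count,
                                  uint32_t pc, uint32_t *offset) {
    size_t i;
    for (i = 0; i < count; i++) {
        if (pc >= segs[i].start && pc < segs[i].end) {
            *offset = pc - segs[i].start;
            return &segs[i];
        }
    }
    return NULL;
}
```

A NULL result is a useful signal in its own right: the sample landed outside the application's code (in a system call, for instance) and can be counted separately or dropped.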

For a global profiler, you will also need to track the current application (A5). Maybe RecoverHandle and GetResInfo would be useful in that case when you hit an address that isn't in range and you want to know what resource it comes from. I don't know if RecoverHandle requires a Ptr to the first byte of the resource or whether it can handle an offset into the Ptr. Anyway, if it works, you would be able to profile time spent in System calls as well.

David Cook · Apr 18, 2025

David Cook said:
For a global profiler, you will also need to track the current application (A5). Maybe RecoverHandle and GetResInfo would be useful in that case when you hit an address that isn't in range and you want to know what resource it comes from. I don't know if RecoverHandle requires a Ptr to the first byte of the resource or whether it can handle an offset into the Ptr. Anyway, if it works, you would be able to profile time spent in System calls as well.

It occurs to me that some of this global stuff may not work during interrupt time. And you might need to be in the correct heap to call GetResInfo. Best to start with an application-only profiler and then work up to some crazy stuff.

bribri · Apr 19, 2025

Yeah, I have no ambition to make a global profiler. Just one for the current application, included as a library similar to PerformLib.

I might be able to sidestep the issue of different code segments, because when compiling with Retro68, it does everything as one big segment, unless you specifically tell it not to. But it would be nice if this worked in a more general case. I'm not sure how it should handle different code segments being loaded dynamically. I suppose it's not unreasonable to say that if you're using it, you have to have all of your code segments loaded and locked.

bribri · Apr 21, 2025

Okay, I did it! I've written a sampling profiler, and I'm actually getting usable data from it. Now I can see exactly what lines of code in my game are sucking up all the time. This'll make it a ton easier to optimize.

It took a bit to get it right, too. My samples include not just the stored Program Counter address at interrupt time but a stack trace as well, so I can calculate exclusive and inclusive times. But neither the stored Program Counter nor A6 can be relied on to contain anything reasonable, so I had to check both of them to make sure the sample I'm taking is relevant to the current app and won't crash things when I start crawling the stack.
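As an illustration of the kind of validation described above, here is a rough sketch of a guarded stack crawl in portable C. It is not the code from the profiler: it assumes LINK A6-style frames (the caller's A6 stored at [A6] and the return address at [A6+4]), simulates the stack with a byte array, and uses made-up names (`struct mem_view`, `crawl_stack`). Whether a given compiler actually keeps A6 as a frame pointer depends on its settings.

```c
#include <stddef.h>
#include <stdint.h>

/* Simulated memory: mem holds size bytes starting at address base. */
struct mem_view {
    const uint8_t *mem;
    uint32_t base;
    uint32_t size;
};

/* A longword read is allowed only at an even address inside the view. */
static int readable32(const struct mem_view *m, uint32_t addr) {
    return (addr & 1) == 0 && m->size >= 4 &&
           addr >= m->base && addr - m->base <= m->size - 4;
}

/* Big-endian longword read; the caller must check readable32 first. */
static uint32_t read32(const struct mem_view *m, uint32_t addr) {
    const uint8_t *p = m->mem + (addr - m->base);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Walk frames starting at a6, collecting return addresses into out.
 * The walk stops at the first frame that looks wrong: an unreadable
 * frame pointer, a return address outside [code_start, code_end), or
 * a saved A6 that does not move toward higher addresses.
 * Returns the number of return addresses stored.
 */
size_t crawl_stack(const struct mem_view *stack, uint32_t a6,
                   uint32_t code_start, uint32_t code_end,
                   uint32_t *out, size_t max_out) {
    size_t n = 0;
    while (n < max_out && readable32(stack, a6) && readable32(stack, a6 + 4)) {
        uint32_t next = read32(stack, a6);
        uint32_t ret = read32(stack, a6 + 4);
        if (ret < code_start || ret >= code_end)
            break;
        out[n++] = ret;
        if (next <= a6)
            break;
        a6 = next;
    }
    return n;
}
```

Requiring each saved A6 to be strictly greater than the current one also guarantees the loop terminates, even if the stack contains garbage that happens to look like a frame.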

As soon as I've got it cleaned up, I'll throw the source code up into a GitHub repo. Currently it's only usable when compiling with Retro68, but it could be adapted for other compilers. (Though if you're using a classic compiler, you can just use PerformLib!)

(It just occurred to me... if I had made this software back around 1990, I could've sold it for a pretty penny!)

David Cook · Apr 21, 2025

Congratulations! That was a lot of hard work. Good to hear you were successful.

cheesestraws · Apr 21, 2025

Well done! That sounds like a very nice tool indeed.

bribri · Apr 21, 2025

Okay, here it is:

GitHub - briankendall/Profiler68: A sampling profiler for Retro68 projects

I'd love to hear if anyone other than me ever makes use of it.

David Cook · Apr 22, 2025

I just looked at your code. You are a good programmer.

bribri · Apr 23, 2025

David Cook said:
I just looked at your code. You are a good programmer.

Thanks!

I've come a long way since Unicycle.
