Duplicating a file with a resource fork won’t produce an exact copy

David Cook

Well-known member
For testing and cataloging purposes, I wrote a routine that compares two files to determine any differences. Everything works well, except when comparing the resource forks. It turns out, in classic Mac OS, a portion of the resource fork header differs when a file is copied.

Steps to Reproduce

1. Create a test file with a resource fork. For example: Open SimpleText (not TeachText), choose New, type some stuff, choose Save, name it “untitled”. This creates a really boring text file that importantly includes a resource fork.

2. In the Finder, duplicate your “untitled” file however you wish. For example, choose Duplicate from the File menu, or option drag “untitled” to another folder, or even Stuff and Unstuff it to another copy. At this point, you should have two copies of the “untitled” file, perhaps with the second one named “untitled copy” or “untitled (2)”.

3. Using HexEdit (or any other editor that lets you view the raw contents of the resource fork), open the raw resource fork of the original “untitled” file and the copy you made.

HexEdit Resource.jpg

Observe that a chunk of the file differs in these "identical" files. (Aside: there is a lot of unrelated garbage in the first 256 bytes because whatever OS routine created this did not clear the memory buffer first. Boo! Boo!)

untitled differences.jpg

In the example above, notice that the filename and file types unexpectedly appear within the resource fork and differ between the two “duplicate” files. The second copy was created by Unstuffing and has a clue in the type/creator of “Part” and “SIT!”. It appears the information about the file itself is captured as the resource fork is being established (i.e. “Part”) – before StuffIt gives the file the final type/creator of “TEXT” “ttxt”.

The leaked Macintosh source code confirms that a “directory copy” is placed in the resource header.

ResourceManager.jpg

Technical Note 74 claims the directory information is to “aid in scavenging”. And, in fact, Apple acknowledges in Technical Note 74 that duplicated files won’t match: “The duplicate may not be exactly like the original”.

Technical Note 74.jpg

Who Cares?

Let’s say you have a massive collection of backed-up disks and you want to create a listing of unique files. Or maybe you want to recognize newer/patched software when the developer didn’t actually increment the version number. Or, you have written a transfer program that verifies that the transferred file matches the source file. Well, you can’t just checksum the file and use that for comparison. You need to exclude some of the bytes in the resource header during the calculation.
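To make the "exclude some of the bytes" idea concrete, here is a minimal sketch of a comparison hash computed on a modern machine from the raw resource fork bytes. The function name, the use of SHA-256, and the default range ($30 to $7D, the offsets reported in this thread) are assumptions for illustration, not something the OS or any existing tool defines.

```python
import hashlib

# Assumption: the inclusive byte range that differs between copies,
# taken from the offsets reported in this thread ($30 to $7D).
SKIP_START = 0x30
SKIP_END = 0x7D


def stable_fork_digest(fork, skip_start=SKIP_START, skip_end=SKIP_END):
    """Return a SHA-256 hex digest of a raw resource fork, ignoring one byte range."""
    masked = bytearray(fork)
    # Zero the excluded range (clamped to the fork length) so it cannot affect the hash.
    for i in range(skip_start, min(skip_end + 1, len(masked))):
        masked[i] = 0
    return hashlib.sha256(masked).hexdigest()
```

Two forks that differ only inside the skipped range produce the same digest. Any difference outside it still changes the result.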

Can You Fix a Duplicate?

I wrote code to intentionally target the resource header bytes of an existing file. It appears bytes from offset $30 to $7D are being protected, perhaps by the File Manager. The write call returns noErr but doesn’t do anything to those bytes. It does successfully write before and after the protected range.

Blocked-from-writing.jpg

According to Inside Macintosh Vol I, the first 16 bytes are for the resource manager, the next 112 are for the system, and the last 128 are for the application. (Technical Note #62 takes that back – the application should not use those bytes in the header. But, I was able to write to that space.)

Inside Mac.jpg
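The 16 / 112 / 128 split described above can be sketched as a small parser over the first 256 bytes of a raw resource fork. The four big-endian longs at the start are the standard header fields (data offset, map offset, data length, map length). The function and dictionary key names are made up for this example.

```python
import struct

HEADER_SIZE = 256  # 16 (Resource Manager) + 112 (system) + 128 (application)


def parse_resource_header(fork):
    """Split the first 256 bytes of a raw resource fork into its three regions."""
    if len(fork) < HEADER_SIZE:
        raise ValueError("resource fork is shorter than the 256-byte header")
    # The first 16 bytes are four big-endian 32-bit values.
    data_offset, map_offset, data_length, map_length = struct.unpack_from(">IIII", fork, 0)
    return {
        "data_offset": data_offset,
        "map_offset": map_offset,
        "data_length": data_length,
        "map_length": map_length,
        "system_reserved": bytes(fork[16:128]),        # 112 bytes
        "application_reserved": bytes(fork[128:256]),  # 128 bytes
    }
```

The range that changes on copy ($30 to $7D) falls entirely inside the "system_reserved" slice.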

I have not yet discovered how a well-behaving application (including the Finder) can perfectly duplicate a file.

Next Steps

I have searched through the ROM source without any luck finding the operating system code that blocks writes to the header. I guess I will need to step through using Macsbug. I suspect there is a flag that indicates when the system is making a write call, to allow it to modify the header. Or, maybe a range of bytes is being set to protected on the OpenRF call?

I have not tried using PBHCopyFile. However, that is a newer call not available on many OS versions, and is primarily aimed at optimizing remote (AppleShare) server copying. Such a call would not help with stuffing/unstuffing or modem transfers.
 

Juror22

Well-known member
Your work might also provide an insight into how some programs seem to 'know' that the original file has been copied. Did you attempt the copy between volumes? I assume the result would be the same, but it would eliminate the change of name that accompanied the examples you listed.
 

David Cook

Well-known member
Your work might also provide an insight into how some programs seem to 'know' that the original file has been copied. Did you attempt the copy between volumes? I assume the result would be the same, but it would eliminate the change of name that accompanied the examples you listed.

You can rename the file or drag to another folder (not option drag) and the resource header will be unchanged. This is because only the directory entry is being modified -- not the file.

As for your question...

1747353501776.png

I dragged the original "untitled" file to a different volume (named "System 7.1 2021"). What happens is the garbage in memory after the real filename of "untitled" is copied into the resource header of the new "untitled". For example, you can see on the right side that the filename is still [08] "untitled", but then leftover garbage from the volume name (".1 2021") and a bunch of other junk is now in the resource header. TLDR: The file changes!

So, if you wanted to be sneaky for copy protection, I guess you could read the garbage in your resource fork header and use that to encrypt/decrypt the license key that the user has to enter the first time your application is run. In the future, if the license number doesn't decrypt correctly, it means your application was copied.

Previously, the Finder would perform a sector-by-sector copy of a floppy disk. In that case, the copying would not affect the contents of the resource header. However, Apple changed the code to force a file-by-file copy of a floppy disk, so that bad sectors in the destination would not prevent use of the disk.
 

David Cook

Well-known member
What happens if you take a file with no resource fork and call CreateResFile on it?

I took a BinHex file (no resource fork) named "0x90 test.Hqx" and called HCreateResFile (but didn't add any resources). As you can see, the header and the start of the resource blocks are properly created. But, there is still a lot of garbage following the filename. It is hard to tell from one sample whether the remainder of the header is cleanish purely by luck.

1747355020862.png
 

nathall

Well-known member
Previously, the Finder would perform a sector-by-sector copy of a floppy disk. In that case, the copying would not affect the contents of the resource header. However, Apple changed the code to force a file-by-file copy of a floppy disk, so that bad sectors in the destination would not prevent use of the disk.

You said “previously…..”

Any idea when this change was made? What System version?
 

NJRoadfan

Well-known member
AFP has a function called FPCopyFile that can duplicate a file 1:1... well, at least it should. It's dependent on the server's file copy routines, which at least with Netatalk DOES produce an identical file. What I see here is a Finder bug..... and there are MANY. Not surprised there are bits of uninitialized memory cropping up in there.

A lot of those Finder bugs can be exposed by having files with no filetype/creator (just null bytes), something that Netatalk is really good at presenting to a client by default!

Regarding disk copies, generally the system will do a faster block copy if the source and destination are the same number of blocks.
 

cheesestraws

Well-known member
I tried to read the Resource Manager source code as in SuperMario last night and I have some ... observations (I will not say conclusions, they're not concrete enough) on this front.
  1. Good grief I'm glad I'm not working on this codebase for money or anything that matters. Bring on Copland, that will sort this all out, right?
  2. I can't see anything in CreateResFile or OpenResFile that deliberately puts stuff here (though it may of course be in the File Manager, which I didn't have the mental fortitude to attempt). I can, however, see lots of opportunities for random uninitialised RAM to turn up.
  3. The mention of 'scavenging' and the information that has ended up here (if it isn't here by accident) makes me wonder if this is some kind of weird slowly bit-rotting hangover from MFS.
 

David Cook

Well-known member
You said “previously…..”

Any idea when this change was made? What System version?
if the source and destination are the same number of blocks.

<2> 7/3/90 PK For 800K floppies only, reduce allocation blocks from 1594 to
1593, so System 6 Finders don't do physical disk copies.

/* Special case for 800K GCR disk: decrease the number of allocation blocks by one.
This is so 6.x Finders won't do disk-to-disk copies physically, which is an
optimization triggered only if both disks have exactly 1594 (0x63A) allocation blocks.
We don't want them to try to do a physical copy as they'd see the bad blocks. */


Looks like they made this change around the time of System 7. I have not investigated if they do this only if the disk has a bad sector, or if all 800K disks formatted in System 7 have slightly smaller capacity in exchange for System 6 Finder copies working more often.
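Going only by the source comment quoted above, the System 6 Finder's decision can be sketched like this. The function name is hypothetical, and the real Finder may check more than the block counts. This is just the rule as the comment states it.

```python
# From the quoted comment: the physical-copy optimization is triggered only
# if both disks have exactly 1594 (0x63A) allocation blocks.
PHYSICAL_COPY_BLOCKS = 0x63A


def system6_would_copy_physically(src_blocks, dst_blocks):
    """Hypothetical model of the 6.x Finder check described in the comment."""
    return src_blocks == PHYSICAL_COPY_BLOCKS and dst_blocks == PHYSICAL_COPY_BLOCKS
```

An 800K disk formatted with 1593 allocation blocks never satisfies the check, so a 6.x Finder falls back to a file-by-file copy.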
 

David Cook

Well-known member
Good grief I'm glad I'm not working on this codebase for money or anything that matters.

Ha! That's exactly the reaction I had. The resource manager code has so many patches for fonts, compression, printers, and ROM resources. It is a mess.

I can't see anything in CreateResFile or OpenResFile that deliberately puts stuff here (though it may of course be in the File Manager, which I didn't have the mental fortitude to attempt).

I agree. Since the File Manager is actually doing the heavy lifting of creating a resource fork, I imagine the code is in there. I stepped through with Macsbug last night but was too tired to spot anything.
 

David Cook

Well-known member
Oh! It's even weirder (or less weird?) than I originally thought.

Every time the resource file is modified, the 'directory' information in the header is updated. So, not just when the file is copied, but anytime the resource fork is changed.

1. Rename or move. No change.
2. OpenResFile/CloseResFile. No change.
3. AddResource. Directory information is updated.
4. ChangedResource (but keep the same size). Directory information is updated.

My theory is, the file bytes are not being protected, but instead, whenever the first chunk of a resource fork is written the File Manager overwrites with the most recent 'directory' information just before writing the chunk.

As long as your application does not modify its own resource fork, you could use this trick for both copy protection and virus protection. On first launch, copy the first 512 bytes of your resource fork to your preferences file (or anywhere else, I guess). Then compare that information on subsequent launches.
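A rough sketch of that snapshot-and-compare idea, written as host-side Python rather than classic Mac code. The function names are invented, and the 512-byte size is just the figure suggested above. Where the snapshot gets stored is left out.

```python
SNAPSHOT_SIZE = 512  # first 512 bytes of the resource fork, as suggested above


def take_snapshot(fork):
    """Return the leading bytes of a raw resource fork to save on first launch."""
    return bytes(fork[:SNAPSHOT_SIZE])


def differing_ranges(saved, current):
    """List inclusive (start, end) offset ranges where two snapshots differ."""
    ranges = []
    start = None
    n = min(len(saved), len(current))
    for i in range(n):
        if saved[i] != current[i]:
            if start is None:
                start = i  # a new run of differing bytes begins here
        elif start is not None:
            ranges.append((start, i - 1))
            start = None
    if start is not None:
        ranges.append((start, n - 1))
    return ranges


def looks_copied(saved, fork):
    """True if the fork's current header no longer matches the saved snapshot."""
    return take_snapshot(fork) != saved
```

On later launches, `looks_copied` gives the yes/no answer, and `differing_ranges` shows which offsets changed, which is handy for checking that only the directory copy moved.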

I wonder if Norton Utilities or other file recovery software uses this to detect deleted files?
 

adespoton

Well-known member
Looks like they made this change around the time of System 7. I have not investigated if they do this only if the disk has a bad sector, or if all 800K disks formatted in System 7 have slightly smaller capacity in exchange for System 6 Finder copies working more often.
Well THAT brought back a memory.

When System 7 came out, I discovered that disks formatted in System 7 had a slightly smaller capacity. At the time, I was taking "damaged" floppies from computer labs, cleaning them up, and re-formatting them, testing their reliability, and putting the reliable ones back into circulation.

At first I thought that the capacity issue had to do with bad sectors, but eventually I tested on some new disks, and found that formatting under System 6.0.4 instead of System 7 provided slightly more capacity. So I went back and re-formatted all the disks in that collection for use as "scratch disks" in the lab for when someone didn't have a floppy handy.

So yeah; if my memories are still reliable, this affected ALL 800K disks formatted in System 7.
 

David Cook

Well-known member
: when you call AddResource, that changes the in-memory directory info?

Actually, I am only checking on the disk after the resource file closes. I'm not sure if the resource header is fully loaded into memory and accessible.

I modify the resource, call ChangedResource, and then close the file.
 

David Cook

Well-known member
Found the code!

So there you have it. The File Manager is intentionally reading and overwriting portions of the first block of the resource fork when it is closed.

TFSRFN2.a
FClose()
; 16-May-85 PWD Added options to FlushCache call, added code to write a copy
;               of the file's catalog entry into the resource fork.

    BTST    #FCBFlgRBit,FCBMdRByt(A1,D1)    ; Resource fork?
    BEQ     FlFilBuf                        ; Br if not <21Oct85> <SM4> CSS
    CMP.L   #128,FCBEOF(A1,D1)              ; Room for a catalog entry copy?
    BCS     FlFilBuf                        ; If not, forget it <21Oct85> <SM4> CSS
    MOVEM.L A1/D1,-(A6)                     ; Save FCB pointer across call
    MOVE.L  A0,A3                           ; Save Key pointer
    MOVEA.L VCBBufAdr(A2),A1                ; Point to volume cache
    MOVE.W  D1,D0                           ; Set up file refNum
    MOVEQ   #0,D1                           ; No options on GetBlock
    MOVEQ   #0,D2                           ; First block of resource fork
    JSR     GetBlock                        ; Read it into the cache
    BNE.S   @2                              ; Br on errors <21Oct85>
    MOVE.L  A0,-(SP)                        ; Save block pointer
    LEA     48(A0),A1                       ; Point into the block to write catalog entry
                                            ; [48 is the first overwritten location I saw in the file]
    LEA     ckrCName(A3),A0                 ; Point to CNode name there
    MOVEQ   #(lenCKR-4-2),D0                ; Length of key - dirID - length byte+filler
    _BlockMove                              ;
    MOVEQ   #(lenCKR-4-2),D0                ; Same length again
    ADDA.L  D0,A1                           ; Update destination pointer
    LEA     filFlags(A5),A0                 ; Source: catalog entry
    MOVEQ   #(filUsrWds+16-filFlags),D0     ; length to copy
    _BlockMove                              ;
    MOVEQ   #(filUsrWds+16-filFlags),D0     ; length to advance pointer by
    ADDA.L  D0,A1                           ; Update destination pointer
    LEA     filFndrInfo(A5),A0              ; Point to additional finder info
    MOVEQ   #16,D0                          ; Length of finder info
    _BlockMove                              ;
    MOVEQ   #16,D0                          ; Distance to advance pointer
    ADDA.L  D0,A1                           ; Advance the pointer
    MOVE.L  filCrDat(A5),(A1)+              ; Copy creation date
    MOVE.L  filLgLen(A5),(A1)+              ; Copy data fork's logical length
    MOVE.L  filRLgLen(A5),(A1)+             ; Copy resource fork's logical length
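As a cross-check of the listing, here is a small Python sketch that adds up the lengths it copies and pulls the fields back out of a raw first block. The equate values (lenCKR = 38, filFlags = 2, filUsrWds = 4) are my assumptions about the HFS catalog layout and are not shown in the excerpt. With those values the copy spans exactly $30 to $7D, which matches the range that resisted writes earlier in the thread. The function names are invented for the example.

```python
import struct

CATALOG_COPY_OFFSET = 48   # from "LEA 48(A0),A1" in the listing

# Assumed equate values (not shown in the excerpt above).
LEN_CKR = 38               # catalog key record length
FIL_FLAGS = 2              # offset of filFlags in a catalog file record
FIL_USR_WDS = 4            # offset of filUsrWds (16 bytes of Finder info)

NAME_LEN = LEN_CKR - 4 - 2                # 32: length byte + up to 31 name bytes
FLAGS_LEN = FIL_USR_WDS + 16 - FIL_FLAGS  # 18: filFlags, filTyp, filUsrWds
FINDER_LEN = 16                           # filFndrInfo
TAIL_LEN = 12                             # filCrDat, filLgLen, filRLgLen


def catalog_copy_range():
    """Inclusive byte range of the first block that the listing overwrites."""
    total = NAME_LEN + FLAGS_LEN + FINDER_LEN + TAIL_LEN
    return CATALOG_COPY_OFFSET, CATALOG_COPY_OFFSET + total - 1


def parse_catalog_copy(block):
    """Extract the catalog copy fields from the first block of a resource fork."""
    pos = CATALOG_COPY_OFFSET
    name_field = block[pos:pos + NAME_LEN]
    name_len = min(name_field[0], NAME_LEN - 1)  # Pascal string; the rest may be garbage
    name = bytes(name_field[1:1 + name_len]).decode("mac_roman")
    pos += NAME_LEN
    file_type = bytes(block[pos + 2:pos + 6])    # start of filUsrWds: type
    creator = bytes(block[pos + 6:pos + 10])     # then creator
    pos += FLAGS_LEN + FINDER_LEN
    created, data_len, rsrc_len = struct.unpack_from(">III", block, pos)
    return {"name": name, "type": file_type, "creator": creator,
            "created": created, "data_length": data_len, "resource_length": rsrc_len}
```

Only the bytes up to the name's length byte are meaningful in the 32-byte name field, which fits the leftover garbage seen after "untitled" in the hex dumps.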
 

cheesestraws

Well-known member
Oh, well found.

The File Manager is intentionally reading and overwriting portions of the first block of the resource fork when it is closed.

I enjoy that the resource manager all the way through says that this is 'by convention' and the file manager just enforces it en masse. I wonder if I'm reading too much into this to feel a little annoyance in that 'by convention' phrasing...

Do we understand why this feature is implemented?

The technote linked earlier and the comments in the only place in the Resource Manager where this is mentioned (in the private equates) suggest that this was used, or intended to be used, when the master directory block is damaged, to have a backup copy of the directory entry for each file. Whether it's actually used or not, I don't know (and don't currently have the energy to try to find out).

Are there any downsides to the file being changed?

The main one, I think, is that you can't blindly checksum or hash the resource fork of a file and get a consistent result. However, given the way the resource manager is documented, I don't think one should be surprised about this; it's stated repeatedly that you can't just treat the resource fork as a byte stream and expect not to have a bad day, because if a resource fork is present then the system expects to be able to use it, and the only guarantees you get are those that the resource manager itself provides.

So the inability to make a consistent hash of the resource fork just by treating it as a byte stream is annoying, but not at all outside the model that the OS has committed to.
 

David Cook

Well-known member
or intended to be used when the master directory block is damaged,

When you search for the partial word "scaveng" in the SuperMario source, there are at least 30 references. None of them seem to use the resource fork info. But, maybe Disk First Aid or Norton use it? Maybe it helped them in testing/debugging?

Are there any downsides to the file being changed?

Besides the inconsistent hash, I'm bothered by:
1. The uncleared garbage. I don't think it is going to leak any secrets -- but who knows. I should write some code that farms it. : )
2. The fact that the information is not kept up to date. You can rename the file, move it to another directory, and change the finder info (?) without the resource 'directory' information being updated.

Aside: this FClose source also updates the file modified date. When a brave soul decides to try to fix the 2040 date issue, they can start pulling at the thread there.
 