The ultimate reference is the ProTracker 68000 source, because there's a fair amount of corner-case behavior that certain MODs rely on. But that's only correct for newer MODs that were made on ProTracker rather than SoundTracker.
There's a famous (and very good) MOD called "Klisje Paa Klisje" that sets a tempo of 0x20 at one point. In SoundTracker, that just meant "update every 32 timer ticks", but in ProTracker that means "set timer to 32 BPM" which is quite different.
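To make the difference concrete, here is a minimal C sketch of the two readings of that tempo value. The function and type names are made up for illustration. It assumes a 50Hz (PAL) tick for the SoundTracker reading and, for the ProTracker reading, the usual convention that parameters of 0x20 and above set the BPM (tick period = 2.5/BPM seconds) while lower values set the ticks-per-row "speed".

```c
/* Illustrative only: names and the 50Hz/speed-6 assumptions are not from the
   original players. */
typedef struct {
    unsigned speed; /* ticks per pattern row */
    unsigned bpm;   /* ProTracker tempo; 125 gives the classic 50Hz tick */
} PtTiming;

/* SoundTracker reading: the parameter is always "ticks per row" at 50Hz. */
double st_row_seconds(unsigned param)
{
    return param / 50.0;
}

/* ProTracker reading: below 0x20 sets the speed, 0x20 and above sets BPM. */
PtTiming pt_apply_tempo(PtTiming t, unsigned param)
{
    if (param < 0x20)
        t.speed = param;
    else
        t.bpm = param;
    return t;
}

/* One tick lasts 2.5/BPM seconds, so a row lasts speed * 2.5 / BPM. */
double pt_row_seconds(PtTiming t)
{
    return t.speed * 2.5 / t.bpm;
}
```

With a starting state of speed 6 and 125 BPM, a parameter of 0x20 gives 0.64s per row under the first reading and about 0.47s under the second, so the song plays at a noticeably different pace.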
[ and @MIST ]. I've been looking at the assembler code a bit more, to turn this MOD player into a proper Mac app. This means I'll end up using things like NewPtr to allocate sample data and MOD data, and being able to read in a MOD file properly.
I tend to find there's ambiguity with the term "sample", because it can mean an entire sampled waveform, or an individual sample value within that waveform. So, I'll use the term "waveform" to mean a full sampled waveform and "sample" to mean an individual sample value within it. I'll use "wave-cycle" to indicate a set of samples within a waveform that constitutes a complete cycle at the current pitch. So, for example, if a waveform doesn't have a constant pitch envelope, then wave-cycles vary throughout the waveform.
Without trying to repeat the spec in the earlier link too much, a ProTracker MOD music file has 4 major sections:
- A header containing information about the song and the instrument definitions, which consist of pointers to waveforms, their lengths (up to 64K 16-bit words), some fine-tuning (-8/8 to +7/8 of a semitone), loop markers and master volume levels.
- Song data, which consists of a fixed-length 128-byte array, each byte being the index of one of 64 patterns.
- Patterns, each of which seems to be a fixed-length, 64-step 2D table [Step][Voice]. Each entry is 4 bytes that specify the instrument to be used, a 12-bit pitch value (C-1 = 856) and a 12-bit effect command.
- Waveforms, which consist of signed 8-bit samples, up to 128kB each (64K words).
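As a rough illustration of the pattern layout, the sketch below unpacks one 4-byte entry in C. The names Cell, decode_cell and cell_at are hypothetical. It assumes the usual ProTracker packing, where the instrument number is split across the top nibbles of bytes 0 and 2, and 4 voices per step.

```c
#include <stdint.h>

/* One decoded pattern entry (hypothetical type, for illustration only). */
typedef struct {
    uint8_t  instrument; /* 0 means "no new instrument" */
    uint16_t period;     /* 12-bit pitch value, e.g. 856 for C-1 */
    uint16_t effect;     /* 12-bit effect: command nibble + 8-bit parameter */
} Cell;

/* Unpack the 4 raw bytes of a pattern entry. */
Cell decode_cell(const uint8_t b[4])
{
    Cell c;
    c.instrument = (uint8_t)((b[0] & 0xF0) | (b[2] >> 4));
    c.period     = (uint16_t)(((b[0] & 0x0F) << 8) | b[1]);
    c.effect     = (uint16_t)(((b[2] & 0x0F) << 8) | b[3]);
    return c;
}

/* Address of entry [step][voice] in a 64-step, 4-voice pattern. */
const uint8_t *cell_at(const uint8_t *pattern, unsigned step, unsigned voice)
{
    return pattern + (step * 4u + voice) * 4u;
}
```

For example, the bytes 0x13 0x58 0x2F 0x06 decode to instrument 0x12, period 0x358 (856, i.e. C-1) and effect 0xF06.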
The MOD tracker code mimics this structure, so there's code to handle waveform generation, which has to be mixed down to a 370-sample buffer per 60.15Hz Mac frame (the routine at nomus), and the higher-level routine, called music, which sets up the waveform generation for each frame. So let's look at the core routine first, which isn't labelled, but is wrapped with a .rept LEN. I think @MIST will have changed a couple of things in the algorithm now, but it won't change the fundamental code.
I find the gas convention of preceding a register name with '%' somewhat annoying since registers are used most of the time (rather than labels) so everything gets peppered with multiple '%'s.
Anyway, on entry:
- D0.w contains the current position in the waveform for voice 0. This could be an error, because a waveform can be up to 128kB.
- D1.w contains the current phase position for voice 0, i.e. the fraction of a position.
- A0.l contains the pointer to the beginning of the current waveform for voice 0. It doesn't seem like the pointer is updated.
- A2.l contains the pointer to the volume adjustment table for the current waveform for voice 0, whose entries must be in the range -64..63 (so that two voices can be added together in a byte).
- D4.w, D5.w, A1.l, A3.l correspond to D0.w, D1.w, A0.l, A2.l for voice 1.
- D3.b is a temp, containing the raw sample value read from a waveform at the current position.
- D7.b is a temp, containing the volume-adjusted sample for voice 0, subsequently mixed with the volume-adjusted sample for voice 1.
- A6.L points to the 370b audio buffer (which then gets copied into the actual Mac's hardware audio buffer at the beginning of the VBL flyback, but not shown here).
Code:
add.w %a4,%d1 /* Current phase V0 (V2, 2nd pass) */
addx.w %d2,%d0 /* Current position V0 (V2, 2nd pass) */
add.w %a5,%d5 /* Current phase V1 (V3, 2nd pass) */
addx.w %d6,%d4 /* Current position V1 (V3, 2nd pass) */
move.b 0(%a0,%d0.l),%d3 /* Sample[Pos[Voice0]] at full vol. */
move.b 0(%a2,%d3.w),%d7 /* Attenuate volume */
move.b 0(%a1,%d4.l),%d3 /* For Voice 1 */
add.b 0(%a3,%d3.w),%d7 /* Attenuate volume for Voice 1 & add. */
/* Sample audio is represented as a signed byte, -128 to 127.
To convert to unsigned we need to add 128=> 0 to 255, which is
equivalent to eor.b #0x80.
*/
eor.b #0x80,%d7 /* convert to unsigned */
lsr.b #1,%d7 /* divide by two to allow adding of second channel */
move.b %d7,(%a6)+ /* Finally, write the sample to the buffer. */
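For anyone who finds C easier to follow, here is a rough model of one pass of that loop. It is only a sketch: the Voice struct, build_vol_table and mix_pair are hypothetical names, the position and phase are merged into a single 16.16 fixed-point value, and the way the volume table is filled (sample * volume / 128, with volume 0..64) is my assumption, chosen so that entries stay within -64..63. Looping and end-of-waveform handling are left out.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const int8_t *wave;      /* start of the waveform (A0/A1 in the listing) */
    uint32_t pos;            /* 16.16 fixed point: position (D0) and phase (D1) */
    uint32_t step;           /* 16.16 increment per output sample (D2 and A4) */
    int8_t vol_table[256];   /* volume-adjusted value per raw sample (A2/A3) */
} Voice;

/* Assumed table layout: indexed by the raw byte, entries in -64..63. */
void build_vol_table(Voice *v, int volume /* 0..64 */)
{
    for (int i = 0; i < 256; i++) {
        int s = (i < 128) ? i : i - 256;   /* raw byte as a signed sample */
        v->vol_table[i] = (int8_t)(s * volume / 128);
    }
}

/* Mix two voices into n unsigned output bytes, as one pass of the loop does. */
void mix_pair(Voice *a, Voice *b, uint8_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        a->pos += a->step;                 /* add.w / addx.w for voice 0 */
        b->pos += b->step;                 /* add.w / addx.w for voice 1 */
        uint8_t ra = (uint8_t)a->wave[a->pos >> 16];
        uint8_t rb = (uint8_t)b->wave[b->pos >> 16];
        int sum = a->vol_table[ra] + b->vol_table[rb];     /* -128..126 */
        out[i] = (uint8_t)(((uint8_t)sum ^ 0x80) >> 1);    /* eor.b, lsr.b */
    }
}
```

Calling mix_pair with n = 370 corresponds to one full pass over the frame buffer. Note that the model advances the position before reading, as the listing does.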
We can see that the routine handles 2 voices (channels) at the same time. Each repeat is 32 bytes long, so the whole unrolled routine is about 11.5kB. It takes 16c+56c+22c = 94 cycles per repeat, or 34,780 cycles per frame for 2 voices. This basic loop is then repeated for the next two voices, making 69,560 in total per frame. In addition, the data transfer from the mixed 370-byte buffer to the hardware buffer takes 102 cycles * 45 + 28 cycles = 4,618, giving 74,178 cycles per frame or about 4,461,807 cycles per second: roughly 69% of CPU, if an original Mac can manage 6.5M cycles/s on average.
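The budget above is easy to re-run with different numbers, so here it is as a small C sketch. The constant and function names are made up for illustration. The figures are the ones quoted in this post, not independently measured.

```c
/* Figures quoted in the post; names are illustrative. */
enum {
    FRAME_SAMPLES    = 370, /* output samples per Mac frame */
    CYCLES_PER_STEP  = 94,  /* one repeat of the 2-voice mixing code */
    PASSES           = 2,   /* voices 0+1, then voices 2+3 */
    COPY_LOOP_CYCLES = 102, /* one iteration of the copy to the hardware buffer */
    COPY_LOOPS       = 45,
    COPY_OVERHEAD    = 28
};

/* Total cycles spent on mixing and copying in one frame. */
long cycles_per_frame(void)
{
    long mixing = (long)CYCLES_PER_STEP * FRAME_SAMPLES * PASSES;
    long copy   = (long)COPY_LOOP_CYCLES * COPY_LOOPS + COPY_OVERHEAD;
    return mixing + copy;
}

/* Fraction of the CPU used, given the frame rate and usable cycles/s. */
double cpu_fraction(double frame_hz, double cpu_hz)
{
    return cycles_per_frame() * frame_hz / cpu_hz;
}
```

cycles_per_frame() comes to 74,178, and cpu_fraction(60.15, 6.5e6) lands just under 0.69, matching the estimate above.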
One obvious improvement would be to only unroll the loop once and repeat the code for voices 2 and 3 inside it, instead of having a second whole repeat. This would cost very little in terms of setting up the pointers. However, the move.b %d7,(%a6)+ at the end would become add.b in both cases, requiring the buffer to be cleared at the beginning of each frame (e.g. when copying to the hardware buffer), which would cost another move.l dn,(an) * 45 = 540c or 0.5% of CPU.
In the next post, I think I'll discuss the song and pattern playing code: what tables are defined and how they're used.