
Outputting samples to the Mac SE sound chip without using Sound Manager

Crutch

Well-known member
Right, it exists only for backward compatibility, because a lot of old Mac code just uses the VBL as a way to do something periodically every 1/60 second. The bulk of that old code didn’t actually care whether the VBL was happening at that moment or not (the obvious exception being code to blit bits for flicker-free animation), so when the Slot Manager was introduced, Apple just kept the _VInstall rate at 60+Hz so all that code wouldn’t break.
 

Snial

Well-known member
I'm already giving up on 68000 support. Mixing four voices at 22254.54Hz with my asm code probably takes up more than 80% of the CPU time on a 68000 8MHz, I don't really have room for another stage (buffer copying) now that I think about it.

So I'm going to target 68020+ and use one Sound Manager voice, and I'll use a vblank interrupt/callback for player timing, I think. However, like I asked earlier, how do I detect the vblank rate? Some Macs have 66-67Hz video modes (12" Apple RGB monitor etc.), and that would mess with my timings if I expect a nominal 60.15Hz
It's still intriguing though to see how well it could have been done on the SE. My idea would be to create a fake double-buffer. If it's possible to get the VIA to generate an interrupt half way down a scan, when half the sound buffer has been played, then refill the first half of the buffer, then on the VBL fill the second half.

Assuming a0^ the source sound buffer and a1^ the hardware sound buffer:

[move.b (a0)+,(a1)
addq.w #2,a1 ] x 184
move.b (a0),(a1)

Would take just under 0.5ms.
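
A rough check of that figure, written as a small C model rather than more assembly. The function names are made up for illustration, and the cycle counts (12 for move.b (a0)+,(a1), 8 for addq.w #2,a1) and the 7.8336MHz clock are assumptions taken from the usual 68000 timing tables, not measurements on an SE.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy n samples into every other byte of the hardware buffer,
   mirroring the move.b (a0)+,(a1) / addq.w #2,a1 pair above. */
static void copy_to_sound_buffer(const uint8_t *src, volatile uint8_t *hw, size_t n)
{
    for (size_t i = 0; i < n; i++)
        hw[2 * i] = src[i];
}

/* Assumed 68000 timings: move.b (a0)+,(a1) = 12 cycles,
   addq.w #2,a1 = 8 cycles. The last sample needs no addq. */
static unsigned copy_cycles(unsigned n)
{
    return n == 0 ? 0 : (n - 1) * (12 + 8) + 12;
}

/* Time for the copy in milliseconds at the given CPU clock. */
static double copy_ms(unsigned n, double cpu_hz)
{
    return 1000.0 * copy_cycles(n) / cpu_hz;
}
```

For 185 samples (184 unrolled pairs plus the final move) this gives 3692 cycles, or about 0.47ms at 7.8336MHz, which agrees with the estimate.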

What are the objectives for your 4 channel audio? Is it just samples? Is it 4 channel music? Do you need to calculate pitches?
 

Crutch

Well-known member
If you just want to copy from a buffer in memory to the sound buffer periodically to produce continuous sound, why not just use the sound driver and the freeform synthesizer? That’s exactly what it does.
 

8bitbubsy

Well-known member
[...]
What are the objectives for your 4 channel audio? Is it just samples? Is it 4 channel music? Do you need to calculate pitches?
It's four 8-bit PCM voices with free pitch, volume between 0..64, and loop and 'no loop' mode. Also, whenever a sample has reached its end (either 'end of sample' or loopEnd point), it will update the new sampleData and sampleEnd registers. This is needed for maximum compatibility.
I basically wrote a Paula (the Amiga sound chip) emulator in 68000 assembly, and then I took the Amiga ProTracker (.MOD music format) player and replaced the Paula register writes with emu calls.

It's a very accurate player, and I thought it would be cool to bring it over to 68000 Macs. I know we have The Sound-Trecker, but it requires 68020+, and it's not as accurate.

Anyway, I already failed, so 68020+ it is...
 

Crutch

Well-known member
Oh, cool. This is also basically what Studio Session does, but indeed it takes the whole CPU.
 

Snial

Well-known member
It's four 8-bit PCM voices with free pitch, volume between 0..64, and loop and 'no loop' mode. Also, whenever a sample has reached its end (either 'end of sample' or loopEnd point), it will update the new sampleData and sampleEnd registers. This is needed for maximum compatibility.
I basically wrote a Paula (the Amiga sound chip) emulator in 68000 assembly, and then I took the Amiga ProTracker (.MOD music format) player and replaced the Paula register writes with emu calls.

It's a very accurate player, and I thought it would be cool to bring it over to 68000 Macs. I know we have The Sound-Trecker, but it requires 68020+, and it's not as accurate.

Anyway, I already failed, so 68020+ it is...
OK, yeah. I agree. I did a quick bit of coding for it and estimated it'd take about 74 cycles/sample per voice, or 6.5 Mcycles for four voices at 22010 samples/s = 87% of the CPU. And I don't have a proper 0..64 volume (I guess I could use a table instead of a shift).

; Emulating 4 voices, free pitch, volume 0..64 (as a shift), loop/no loop.
; a0^ = sample (byte), a1^ = dest.
; d0 = pitch
; d1 = pos
; d2 = vol as a shift
; d3 = len
; 15ms worth of data = about 330 samples.
    moveq  #0,d4        ; clear d4
    move.b (a0),d4      ; basic sample
    lsl.w  d2,d4        ; 0..6 so 4 to 4+24 = 28
    move.w d4,(a1)+     ; 10?
    add.l  d0,d1        ; only upper word matters
    swap   d1           ; lower word = pos now
    add.w  d1,a0
    clr.w  d1
    swap   d1
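
For anyone who wants to check the stepping logic off the Mac, here is a minimal C model of that fragment. It assumes d0 is a 16.16 fixed-point step and that d1 keeps only the fraction between samples; the struct and function names are hypothetical. Like the fragment, it ignores len and looping and just writes one voice.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uint8_t *src;  /* a0: current sample byte */
    uint32_t frac;       /* d1: fractional position, kept in the low word */
    uint32_t step;       /* d0: pitch as a 16.16 step (assumed format) */
    unsigned shift;      /* d2: volume as a shift count, 0..6 */
} ShiftVoice;

static void render_shift_voice(ShiftVoice *v, uint16_t *dst, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* moveq / move.b / lsl.w / move.w: zero-extend, scale, store */
        dst[i] = (uint16_t)((unsigned)*v->src << v->shift);
        /* add.l d0,d1: the integer part of the step lands in the upper word */
        uint32_t acc = v->frac + v->step;
        /* swap + add.w d1,a0: advance the pointer (the add sign-extends the word) */
        v->src += (int16_t)(acc >> 16);
        /* clr.w + swap: keep only the fraction for the next sample */
        v->frac = acc & 0xFFFFu;
    }
}
```

With a step of 1.5 ($18000) the pointer advances by 1, 2, 1, 2, ... bytes, so the source is read at offsets 0, 1, 3, 4 and so on.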
 

8bitbubsy

Well-known member
Here's my mixer. Sample end has to be tested for every output sample (for end/data swapping).

a0 = Points to signed 16-bit mix buffer
a1 = Points to current sample data
a2 = Points to volume LUT, pre-aligned for current voice volume
a3 = Sample end point for next cycle (not an address)
a4 = Points to sample data for next cycle
d0 = <scratch>
d1.w = Integer sample position (upper word is cleared)
d2.w = Fractional sample position
d3.w = Fractional sample delta (pitch)
d4.w = Integer sample delta (pitch)
d5 = <scratch>
d6.w = Current sample end point

Inner mixer loop macro for first voice:
Code:
    moveq  #0,d0
    move.b (a1,d1.l),d0
    add.w  d0,d0
    move.w (a2,d0.w),(a0)+
    add.w  d3,d2
    addx.w d4,d1
    cmp.w  d6,d1
    bhs.w  \1

Inner mixer loop macro for other voices:
Code:
    moveq  #0,d0
    move.b (a1,d1.l),d0
    add.w  d0,d0
    move.w (a2,d0.w),d5
    add.w  d5,(a0)+
    add.w  d3,d2
    addx.w d4,d1
    cmp.w  d6,d1
    bhs.w  \1

Then whenever that end-of-sample branch happens, it does this, then jumps back to the inner mixing loop:
Code:
    sub.w  d6,d1 ; subtract end point from sample position (keeps overflow samples)
    move.w a3,d6 ; set new sample end point
    move.l a4,a1 ; set new sample data pointer
    bra.w  \1

This is unrolled 16 times for less branching overhead.
I don't think it gets much faster than this, keeping in mind that I have to test the sample end point every sample. Also, I'm mixing at 22254.54Hz.
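
As a cross-check of the logic, the macros above can be modelled in C roughly like this. It is only a sketch: the struct, the function names and the LUT scaling (signed sample times volume) are assumptions for illustration, not taken from the actual player.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uint8_t *data;       /* a1: current sample data */
    const uint8_t *next_data;  /* a4: sample data for next cycle */
    const int16_t *vol_lut;    /* a2: 256-entry LUT for the current volume */
    uint16_t pos;              /* d1.w: integer sample position */
    uint16_t frac;             /* d2.w: fractional sample position */
    uint16_t frac_delta;       /* d3.w */
    uint16_t int_delta;        /* d4.w */
    uint16_t end;              /* d6.w: current sample end point */
    uint16_t next_end;         /* a3: end point for next cycle */
} Voice;

/* Assumed LUT layout: entry b holds (signed sample b) * volume. */
static void build_volume_lut(int16_t lut[256], int volume)
{
    for (int b = 0; b < 256; b++) {
        int s = b < 128 ? b : b - 256;   /* reinterpret the byte as signed */
        lut[b] = (int16_t)(s * volume);
    }
}

/* first != 0 overwrites the mix buffer (first voice), else it adds. */
static void mix_voice(Voice *v, int16_t *mix, size_t n, int first)
{
    for (size_t i = 0; i < n; i++) {
        int16_t s = v->vol_lut[v->data[v->pos]];
        mix[i] = first ? s : (int16_t)(mix[i] + s);
        uint32_t f = (uint32_t)v->frac + v->frac_delta;            /* add.w d3,d2 */
        v->frac = (uint16_t)f;
        v->pos = (uint16_t)(v->pos + v->int_delta + (f >> 16));    /* addx.w d4,d1 */
        if (v->pos >= v->end) {                                    /* cmp.w / bhs */
            v->pos = (uint16_t)(v->pos - v->end);   /* keep overflow samples */
            v->end = v->next_end;
            v->data = v->next_data;
        }
    }
}
```

Calling mix_voice once with first set and then three more times without it reproduces the "first voice" and "other voices" split; the 16x unrolling is purely a speed detail and is left out here.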

EDIT: Since sample lengths are limited to 65534 ($FFFE) bytes in ProTracker modules, you simply can't overflow the word index register with the highest delta (pitch) possible (1.41 at 22254.54Hz). So the code is safe. Some nasty modules from other trackers extended this limit to 128kB, but I force it to 64kB in this case. I know I can do 32-bit logic and support 128kB, but it's slower on the 68000 ALU.
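
The overflow argument can be spelled out with a little arithmetic. This sketch assumes the 1.41 figure comes from the PAL Paula clock (3546895Hz) divided by the lowest ProTracker period (113); both constants and the helper names are assumptions added here for illustration.

```c
#include <stdint.h>

/* Assumed constants: PAL Paula clock and the lowest ProTracker period. */
#define PAULA_PAL_HZ 3546895.0
#define MIN_PERIOD   113.0

/* Highest possible sample step per output sample at the given mix rate. */
static double max_delta(double mix_hz)
{
    return PAULA_PAL_HZ / MIN_PERIOD / mix_hz;
}

/* Largest value the 16-bit position can reach just before the end test:
   last valid index + integer step + carry from the fraction. */
static uint32_t worst_case_pos(uint32_t sample_len, uint32_t int_delta)
{
    return (sample_len - 1) + int_delta + 1;
}
```

With a 65534-byte sample and an integer step of 1, the worst case is 65533 + 1 + 1 = 65535, which still fits in a word, so the cmp.w against the end point always sees a valid value.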
 

Crutch

Well-known member
That is pretty nifty. The standard of the 8MHz Mac era would have been 11 kHz mixing (Studio Session), 22 kHz not becoming too common until probably the Mac II era.
 

8bitbubsy

Well-known member
I've been optimizing my mixer a little bit. Instead of checking for the sampling end points in the inner mixing loop, I run a calculation loop that determines how many samples I can mix before I reach the end of the sample (or sample loop), so that the inner mixing loop can be branchless (gets rid of CMP + Bcc instruction per output sample).
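
One possible way to compute that count, sketched in C with the position and delta treated as 16.16 values. The function name is hypothetical and this is not necessarily how the actual calculation loop works; on a 68000 the division would probably be replaced by something cheaper.

```c
#include <stdint.h>

/* Number of output samples that can be mixed before the integer position
   reaches `end`. Returns 0 if the position is already at or past the end,
   or if the delta is zero (the end would never be reached). */
static uint32_t samples_until_end(uint16_t pos, uint16_t frac,
                                  uint16_t int_delta, uint16_t frac_delta,
                                  uint16_t end)
{
    uint32_t p = ((uint32_t)pos << 16) | frac;              /* position, 16.16 */
    uint32_t d = ((uint32_t)int_delta << 16) | frac_delta;  /* step, 16.16 */
    uint32_t e = (uint32_t)end << 16;                       /* end, 16.16 */
    if (d == 0 || p >= e)
        return 0;
    /* Samples are read at p, p+d, p+2d, ...; count those below e. */
    return (uint32_t)(((uint64_t)(e - p) + d - 1) / d);
}
```

The inner loop can then run min(count, samples left in the output buffer) iterations with no compare or branch, and the end/data swap is handled once between runs.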

This creates a new problem: there will be a ton of calculation overhead on shorter sample loops, like those of a song using short chiptune-style waveforms. I think I can make it fast enough for a 7.8MHz classic Mac by unrolling sample loops on load to be at least 1024 samples long, but this breaks the "live sample swap" quirk/functionality of ProTracker. Maybe still worth it to get a 22kHz .MOD player for these old 68000 Macs. Most .MODs (especially non-chiptune ones) do not use the sample swap technique, so it would still be an 'okay' player. A 68020+ version wouldn't need this speed limitation, and would stay perfectly accurate.
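
The loop unrolling on load could look something like the following C sketch. The names and the caller-supplied output buffer are assumptions for illustration; the 1024 minimum is passed in as min_len.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Smallest whole multiple of loop_len that is at least min_len. */
static size_t unrolled_length(size_t loop_len, size_t min_len)
{
    if (loop_len == 0)
        return 0;
    size_t reps = (min_len + loop_len - 1) / loop_len;
    return reps * loop_len;
}

/* Repeat the loop into `out` until it is at least min_len samples long.
   Returns the new loop length, or 0 if there is no loop or `out` is too small. */
static size_t unroll_loop(const int8_t *loop, size_t loop_len,
                          int8_t *out, size_t out_cap, size_t min_len)
{
    size_t total = unrolled_length(loop_len, min_len);
    if (total == 0 || total > out_cap)
        return 0;
    for (size_t off = 0; off < total; off += loop_len)
        memcpy(out + off, loop, loop_len);
    return total;
}
```

A 32-byte chip waveform becomes a 1024-byte loop, while anything already 1024 samples or longer is left at its original length. Using whole repetitions keeps the loop seamless.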

Will have to find out how to get a stable and fast buffering scheme using Sound Driver + a vblank interrupt. Sounds difficult to me since the buffer is so short.
 