Version 2.1.0 of the TashTalk firmware is out! You can get it from GitHub here:
github.com/lampmerchant/tashtalk (An interface for Apple's LocalTalk networking protocol.)
Nobody using TashTalk right now should feel any pressure to upgrade, as there are no bugfixes. (Knock on wood, there are no known bugs to fix...)
What this version adds is the ability to calculate and check the CRCs (a.k.a. Frame Check Sequences) of outbound and inbound frames. This may make little difference for larger embedded systems like the Raspberry Pi, which can easily handle the extra load, but for smaller-scale embedded systems, the ability to outsource this function can make a significant difference in performance. The features are switchable, meaning that this version of the firmware is completely backward-compatible.
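For anyone who wants to see what is being outsourced, here is a minimal sketch of a frame check sequence calculation in Python. It assumes LocalTalk's FCS is the standard 16-bit CRC-CCITT used by SDLC/HDLC (also catalogued as CRC-16/X-25), sent low byte first. The function names and the sample frame bytes are made up for illustration, and this is not a translation of the firmware's PIC assembly.

```python
def crc16_x25(data: bytes) -> int:
    """Bitwise CRC-16/X-25: reflected polynomial 0x8408, init 0xFFFF, final complement."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right one bit; fold in the polynomial if a one fell off the end.
            crc = (crc >> 1) ^ 0x8408 if crc & 1 else crc >> 1
    return crc ^ 0xFFFF


def append_fcs(frame: bytes) -> bytes:
    """Outbound side (assumed layout): append the FCS, low byte first."""
    fcs = crc16_x25(frame)
    return frame + bytes([fcs & 0xFF, fcs >> 8])


def fcs_ok(frame_with_fcs: bytes) -> bool:
    """Inbound side: recompute the FCS over the payload and compare with the trailer."""
    if len(frame_with_fcs) < 2:
        return False
    return append_fcs(frame_with_fcs[:-2]) == frame_with_fcs


# Hypothetical three-byte frame, just to exercise the functions.
sample = append_fcs(bytes([0xFF, 0x01, 0x84]))
print(fcs_ok(sample))             # True
print(fcs_ok(b"\x00" + sample))   # False: extra leading byte breaks the check
```

On a small host, this per-bit loop (or the lookup table that usually replaces it) is exactly the work that the new firmware features can take over.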
For anyone interested in what goes on under the hood, adding CRC calculation in the transmitter was relatively simple - there was enough unused space in the transmitter loop to fit the necessary code. The only trick was figuring out how to inject the calculated CRC into the data coming from the UART, but even this was accomplished without much trouble.
Fitting the CRC check into the receiver was a much bigger challenge. One of the consequences of using a UART as TashTalk's interface to its host (rather than I²C or SPI, for example) is that the firmware must be ready to receive data from the host at any time, even if it's in the middle of bitbanging LocalTalk. The PIC's UART has a FIFO but it's only two levels deep, which doesn't leave much wiggle room.
To understand what I did in the end, it is necessary to understand how TashTalk's receiver works. After receiving a clock edge, which happens every 230,400th of a second, the firmware has a limited number of cycles to get some work done before it has to start anticipating the next clock edge. It first jumps into a state machine whose function depends on whether a flag byte has been received, whether any data has been received, and whether the bit just received is a one or a zero - this does things like storing the bit into the appropriate position in the buffer and keeping track of consecutive ones so it knows when to anticipate and ignore a stuffed zero bit.
Next, it jumps into a secondary function which depends on whether a byte or a flag has been completed. If a byte or a flag has been completed, the remaining time before the next clock edge is spent dealing with the consequences of that - sending the byte to the host over the UART, for example. If a byte or a flag has not been completed, the UART receiver is serviced - completed bytes are taken off the FIFO and queued for processing once the receive is finished, and the CTS signal is deasserted if the queue is more than half full, to signal to the host that it should hold off sending any more data. Thus, during normal data reception, the sequence of secondary events goes "service receiver, service receiver, service receiver, service receiver, service receiver, service receiver, service receiver, transmit byte, repeat".
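The "keeping track of consecutive ones" part of that state machine can be sketched in a few lines of Python. This assumes the usual SDLC rule that the transmitter inserts a zero after any five consecutive ones in the data, so that six ones in a row only ever show up in a flag or an abort. The function names are made up, and the real firmware does this one bit at a time in PIC assembly between clock edges rather than over a list.

```python
def stuff(bits):
    """Transmit side: insert a zero after every run of five consecutive ones."""
    out, ones = [], 0
    for bit in bits:
        out.append(bit)
        ones = ones + 1 if bit else 0
        if ones == 5:
            out.append(0)  # the stuffed zero
            ones = 0
    return out


def unstuff(bits):
    """Receive side: drop the zero that follows five consecutive ones."""
    out, ones = [], 0
    for bit in bits:
        if ones == 5:
            if bit:
                # A sixth one can only be part of a flag or an abort, not data.
                raise ValueError("six consecutive ones: not frame data")
            ones = 0  # stuffed zero: ignore it and restart the count
            continue
        out.append(bit)
        ones = ones + 1 if bit else 0
    return out


data = [1, 1, 1, 1, 1, 1, 0, 1]
print(stuff(data))                   # [1, 1, 1, 1, 1, 0, 1, 0, 1]
print(unstuff(stuff(data)) == data)  # True
```

This is also why a LocalTalk byte is only nominally eight bits long on the wire: any stuffed zeros add extra bit times.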
This is where the opportunity to insert the CRC processing came up. The UART runs at 1 Mbit/s, so if the host is pounding it with bytes at full speed, a byte arrives every 100,000th of a second (one start bit, eight data bits, one stop bit) and the UART receiver needs to be serviced 100,000 times per second. Servicing it seven times per nominally-8-bit LocalTalk byte means we're servicing it 201,600 times per second (7/8ths of 230,400), roughly twice as often as we need to. What if we set things up so that we only service the receiver every other bit, and use the freed-up time to check the CRC? Then we'd be servicing the receiver 115,200 times per second, which is still more often than it needs. Breaking the CRC logic into three independent pieces wasn't too tricky, so this is exactly what I did. The sequence of secondary events is now "service receiver, CRC part 1, service receiver, CRC part 2, service receiver, CRC part 3, service receiver, transmit byte, repeat". A given byte is factored into the CRC while the next byte is being received, with the final byte being factored in while the trailing flag is being received.
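The arithmetic behind those service rates can be checked with a short Python sketch. The two schedules are written out as lists of slot names (the names are labels invented here, not identifiers from the firmware), and stuffed bits are ignored, so each LocalTalk byte is treated as exactly eight bit times.

```python
LOCALTALK_BIT_RATE = 230_400   # clock edges (bit times) per second
UART_BIT_RATE = 1_000_000      # UART speed in bits per second
UART_BITS_PER_BYTE = 10        # one start bit, eight data bits, one stop bit

# One secondary event per LocalTalk bit time, eight per nominal byte.
OLD_SCHEDULE = ["service"] * 7 + ["transmit"]
NEW_SCHEDULE = ["service", "crc1", "service", "crc2",
                "service", "crc3", "service", "transmit"]


def services_per_second(schedule):
    """How often the UART receiver gets serviced under a given slot schedule."""
    return LOCALTALK_BIT_RATE * schedule.count("service") // len(schedule)


# Worst case: the host sends bytes back to back at full UART speed.
required = UART_BIT_RATE // UART_BITS_PER_BYTE

print(required)                           # 100000
print(services_per_second(OLD_SCHEDULE))  # 201600
print(services_per_second(NEW_SCHEDULE))  # 115200
```

Both schedules clear the 100,000-per-second requirement, and the new one does so while freeing three slots per byte for the CRC.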
And thus was born a new feature!