Truthfully, at the coarsest level there isn't much understanding required. The fundamental equations from eqn2jed are already normalized in a way that agrees with CUPL on order of operations, etc. If you want to stay sane, rename the variables in the eqn2jed output to match net names gleaned from what they connect to (e.g. it connects to the CPU AS pin? Then it's AS.LOCAL).
The logic is based on the Diimo GALs, adapted to fit on a single PLD, right? So I take it you had to understand enough of it to be able to combine it, but other parts of the code you haven’t needed to analyse?
And so in adapting it to a single logic device new issues have arisen, perhaps because it was running marginally to begin with?
Next, the registered and OE logic needs to be distributed to each equation in a GAL that uses it, any inverted signals (usually just /OE if used as an input) need to be made consistent, the syntax needs to be translated, and various optimization steps, like eliminating feedback declarations, need to be applied. These are largely mechanical steps and don't require any particular understanding of the equations being worked on, as it's "just" translation and clean-up.
By the end of it you've got a pile of equations in a single file and you need to either assign pins or let the fitter do it for you; hypothetically it can optimize placements. Not sure of the truth of that, though, and I find manual placement according to whatever makes routing easier is usually fine. Conceptually at this point you can make a board and it might just work. I just recently did this with the Formac design Bolle uses for his Quadra 700 clock doublers, and after correcting a single typo in an equation the resulting board worked perfectly. I even snuck in a version that swapped unidirectional latches for bidirectional registered transceivers.
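The renaming step mentioned earlier is easy to get wrong with a plain search-and-replace, since a short placeholder can be a prefix of a longer one. A minimal sketch of one way to script it, assuming the equations are plain text with CUPL-style `.d`/`.ck` extensions; the function name and the replacement names (`ARB_STATE2`, `CPU_AS`) are made up for illustration:

```python
import re

def rename_signals(equations: str, names: dict[str, str]) -> str:
    """Rename whole identifiers only, leaving extensions like .d/.ck alone."""
    if not names:
        return equations
    # Longest first, so the intent is obvious even without the lookarounds.
    alternatives = "|".join(
        re.escape(n) for n in sorted(names, key=len, reverse=True)
    )
    # (?<![\w.]) : not preceded by a word char or a dot (".d" is never touched)
    # (?!\w)     : not followed by a word char ("nc1" never matches in "nc14")
    pattern = re.compile(r"(?<![\w.])(" + alternatives + r")(?!\w)")
    # Single pass, so a freshly inserted name is never renamed a second time.
    return pattern.sub(lambda m: names[m.group(1)], equations)

# Hypothetical mapping: these net names are invented for the example.
NAMES = {"gal3_nc14": "ARB_STATE2", "gal3_nc1": "CPU_AS"}
SRC = "gal3_nc14.ck = BCLK_LB;\n!gal3_nc14.d = !gal3_nc1 & gal3_nc14;"
print(rename_signals(SRC, NAMES))
```

Doing all renames in one pass also means a mapping can swap two names without one rename stepping on the other.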

So you could stop here, but that's no fun, and potentially risky if you don't test every single possible scenario. So let's continue.
Alright, you've labeled the outputs and inputs according to what they connect to - but what is being done in the middle, in these mystery equations? You've still got a lot of intermediate nodes left whose purpose isn't clear, and this means potential edge cases. For example:
Code:
gal3_nc14.ck = BCLK_LB;
!gal3_nc14.d = !RSTI
# !gal3_nc16 & !gal3_nc15 & !gal3_nc14
# !BB_LB & gal3_nc16 & gal3_nc15 & !gal3_nc14 & !gal3_nc13
# gal3_nc16 & gal3_nc15 & !gal3_nc14 & !gal3_nc13
# gal3_nc16 & gal3_nc15 & !gal3_nc14 & BG_LB_REG
# BG_LB & gal3_nc16 & gal3_nc15 & !gal3_nc14
# gal3_nc16 & !gal3_nc15 & !gal3_nc14
# gal3_nc19 & !gal3_nc16 & !gal3_nc15 & gal3_nc14;
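One mechanical check that needs no understanding of the logic is looking for product terms that are already covered by a shorter term (absorption: `A & B # A` is just `A`). A rough sketch, assuming plain `&`/`#`/`!` syntax with no parentheses; the helper names are my own. Run on the equation as transcribed above, it flags the `!BB_LB` term as covered by the term right after it:

```python
def parse_sum_of_products(rhs: str) -> list[frozenset[str]]:
    """Split 'a & !b # c;' into product terms, each a set of literals."""
    terms = []
    for product in rhs.strip().rstrip(";").split("#"):
        literals = frozenset(l.strip() for l in product.split("&") if l.strip())
        if literals:
            terms.append(literals)
    return terms

def absorbed_terms(terms: list[frozenset[str]]) -> list[tuple[frozenset[str], frozenset[str]]]:
    """Return (redundant, covering) pairs where covering is a strict subset."""
    pairs = []
    for i, big in enumerate(terms):
        for j, small in enumerate(terms):
            if i != j and small < big:  # every literal of small is in big
                pairs.append((big, small))
                break
    return pairs

# Right-hand side of the !gal3_nc14.d equation, copied as written above.
RHS = """
!RSTI
# !gal3_nc16 & !gal3_nc15 & !gal3_nc14
# !BB_LB & gal3_nc16 & gal3_nc15 & !gal3_nc14 & !gal3_nc13
# gal3_nc16 & gal3_nc15 & !gal3_nc14 & !gal3_nc13
# gal3_nc16 & gal3_nc15 & !gal3_nc14 & BG_LB_REG
# BG_LB & gal3_nc16 & gal3_nc15 & !gal3_nc14
# gal3_nc16 & !gal3_nc15 & !gal3_nc14
# gal3_nc19 & !gal3_nc16 & !gal3_nc15 & gal3_nc14;
"""

for redundant, covering in absorbed_terms(parse_sum_of_products(RHS)):
    print("redundant:", " & ".join(sorted(redundant)))
    print("  covered by:", " & ".join(sorted(covering)))
```

This only catches strict-subset absorption; it won't merge terms that differ in a single literal, and a flagged term may be worth a second look at the transcription before deleting it.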
Working from the edges in, you can start to assign probable functions based on a best guess of the referenced variables. For example, it's looking for SIZ0 & SIZ1? OK, it cares about line-size accesses, and this equation identifies that state; therefore we have a new name for this variable instead of the useless temporary name it had before. From there it's possible to slowly gain further insight, as now we know anything referencing it cares about line accesses... and so it goes. Sometimes probing with a logic analyzer (LA) can also help.
If you're lucky, eventually you'll meet in the middle and have a "full" understanding, and perhaps even be able to optimize. I was able to significantly redesign the Formac, increasing CPU performance by 10%, memory bandwidth by 20%, and video performance by over 30%! This was a mixture of resolving some bugs/special cases in the original code, and since I know I've got a fixed PLL-locked clock phase relationship I could optimize around that. I even managed to spot a minor tracing error in the schematic Bolle had been so kind as to send me, as the equation didn't make any sense as written and wired. Eventually four GAL16V8s turned into a single ATF1502 CPLD that is only about half full.
The thing is, the Formac (and the Booster before it) have relatively simplistic logic governed almost entirely by the CPU's timing requirements. So it's pretty clear what your intermediate states need to be, and from that you can figure out what the equations must be doing. Plus, there isn't any state carried over between bus cycles at all.
The Diimo however, well, that's another beast entirely; it originally had 12 GAL16V8s that were heavily utilized. That's up to 96 macrocells; those were replaced by an ATF1508 with 128. From a "subsystem" perspective, it has both internal and external bus strobes, data muxing/register control, cache control (and a cache snoop case for bus masters!), and internal bus cycle management. Around 36 of the macrocells in use were internal feedback only or inter-PLD feedback! All of these work together to make the thing work.
Have a gander at that visualization from earlier... look for the big light-green AS node, click on it, then check the nodes it depends on... look how deep that rabbit hole goes. Eeeek!
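The same "what does this node depend on" question can be asked of the equation text directly. A small sketch, assuming simple `target = expression;` statements; the equations in it are invented for the example and are not the real Diimo logic:

```python
import re

# An identifier that is not an extension (i.e. not right after a dot).
IDENT = re.compile(r"(?<![\w.])[A-Za-z_]\w*")

def direct_deps(equations: str) -> dict[str, set[str]]:
    """Map each assigned signal to the signals named on its right-hand sides."""
    deps: dict[str, set[str]] = {}
    for stmt in equations.split(";"):
        if "=" not in stmt:
            continue
        lhs, rhs = stmt.split("=", 1)
        target = IDENT.search(lhs)  # skips '!' and ignores '.d' / '.ck'
        if target:
            deps.setdefault(target.group(0), set()).update(IDENT.findall(rhs))
    return deps

def all_deps(node: str, deps: dict[str, set[str]]) -> set[str]:
    """Everything reachable from node; feedback loops are visited only once."""
    seen: set[str] = set()
    stack = [node]
    while stack:
        for dep in deps.get(stack.pop(), ()):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen

# Hypothetical equations, made up purely to exercise the functions.
EQUATIONS = """
AS_LOCAL = !CYCLE_ACTIVE & CPU_AS;
CYCLE_ACTIVE.d = START # CYCLE_ACTIVE & !DONE;
CYCLE_ACTIVE.ck = BCLK;
DONE = STERM & CYCLE_ACTIVE;
"""

print(sorted(all_deps("AS_LOCAL", direct_deps(EQUATIONS))))
```

Clock and OE equations are counted as dependencies here too, which seems reasonable when the question is "what could possibly affect this output".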
You're also right that there's room for timing issues. At a high level, the ATF150x CPLD has a similar propagation delay to a GAL. 10ns on both is the guaranteed combinatorial (non-clocked) input to output time. In my experience the CPLDs are very stable and will hit their timings without much variance. GALs have a little more wiggle room. For example, a global clocked output on one of these CPLDs will have an output in 5ns max (and that's basically a constant), and as long as you meet 3ns of setup time you're good. However, a 10ns GAL is rated anywhere from 2-7ns for that same clocked output, and it wants a 10ns setup time before the clock!
In general the ATF15xx CPLD of the same speed grade as the GAL should be the same or faster than the GAL, but under the hood the timing models are very different, and the more intermediate nodes there are between input and output, the more those timing differences can add up. The architecture of the CPLD is much more flexible, but with that comes much more intricate timing. Things can be tight enough that even the act of observing an intermediate (by either tying it to an output pin, or using the feedback to an output pin) can cause behavior to change, because the internal layout of the CPLD shifts. Since I don't have a way to simulate the thing as a whole and analyze timings, you can see how it turns into a real mess!
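To put rough numbers on that, here is a back-of-the-envelope sketch using only the figures quoted above, assuming the 2-7ns figure is the GAL's clock-to-output range. It models a pin-to-pin path (out of one registered output, optionally through combinatorial logic, into another register) and deliberately ignores internal CPLD feedback paths and board delays; the class and field names are my own:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Timing:
    t_co_max_ns: int  # worst-case clock-to-output
    t_su_ns: int      # setup time before the clock edge
    t_pd_ns: int      # combinatorial input-to-output delay

    def min_period_ns(self, comb_hops: int = 0) -> int:
        """Shortest clock period for register -> comb_hops stages -> register."""
        return self.t_co_max_ns + comb_hops * self.t_pd_ns + self.t_su_ns

# Figures as quoted in this post, not taken from a datasheet.
CPLD = Timing(t_co_max_ns=5, t_su_ns=3, t_pd_ns=10)
GAL = Timing(t_co_max_ns=7, t_su_ns=10, t_pd_ns=10)

for name, part in (("CPLD", CPLD), ("GAL", GAL)):
    print(name, part.min_period_ns(), part.min_period_ns(comb_hops=1))
# CPLD 8 18
# GAL 17 27
```

The point is only that the margins differ a lot between the two parts even at the same nominal speed grade, so a path that was fine (or marginal) across GALs can behave differently once it is folded into one device.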
Currently, for example, I've got a case where the local bus acknowledge code will deadlock under certain circumstances, and I only have 4 test points with which to examine internal state. Tweaking the clock used by one of the intermediates seemed to do the trick with the first issue ... but it broke synchronous termination when the cache is off. And the data buffer clock on reads doesn't make sense at all; it doesn't seem to be clocking (latching) data when an 030 would normally do so at all. I have no idea how it's getting valid data!
And this is hypothetically one of the easier cached accelerators; I've already switched GALs all around on a finished board and only a couple of them are really sensitive to propagation delay. On cards like the GAL version of the Turbo 040, which is known to be fickle at the best of times and needs a specific mix of GAL speed grades for certain speeds - godspeed.
Hope this sheds a little light



