Just something I noticed, not sure if it was intentional or not.
In the PLP test, when PHP is used to store the P register, the V flag is already set. This is a side effect of the BIT $4210 instruction in WaitNMI(): BIT loads V from bit 6 of the byte it reads, and bit 6 in $4210 is undefined (open bus).
Not really a big deal once you know, but if the idea of these tests is to test CPU instructions, it might be neater to first put the flags into a state defined only by CPU behavior.
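
Something along these lines is what I have in mind. It's only a rough sketch in generic 65816 syntax, not taken from the test source: the placement right before the PHP and the choice of mask are my assumptions, and the comment syntax may need adapting to the assembler in use.

```
    ; Assumed placement: after WaitNMI() has run and before the test's PHP.
    ; At this point N, V and Z still reflect the BIT $4210 read.
    rep #$CB        ; clear N, V, D, Z and C; M, X and I are left untouched
    php             ; the pushed P no longer depends on open-bus bit 6
```

If only V matters for this test, a single `clv` in the same spot would do the same job.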