CPU bug?

slingshot
Posts: 40
Joined: Mon Mar 22, 2021 12:21 pm

Re: CPU bug?

Postby slingshot » Wed Mar 31, 2021 8:43 am

Without the CPU fix: about 20% failed isr, the ula isr isn't counting, and the time stays at 00:00.00.
With the CPU fix: 2% failed isr, the ula isr is counting, and the time is counting too.
Now I wonder how I could catch that remaining 2%. Triggering on INT looks good: the correct vector (usually E8) is picked up.

Upd.: it's 0.5% failed isr over the long term (17366 failed / 3253731 ctc0 count after 6 minutes)
Upd2.: rebasing to master halved the failed isr count. Maybe the CTC is still not perfect? While the CTC1 interrupt happens regularly (EA as interrupt vector), the CTC1 count rarely increases. It's only 129 after 26 minutes. CTC2 is just 2, CTC3 is 0.

slingshot
Posts: 40
Joined: Mon Mar 22, 2021 12:21 pm

Re: CPU bug?

Postby slingshot » Wed Mar 31, 2021 1:00 pm

Found the bug:
ctctest.jpg

Alcoholics Anonymous
Posts: 781
Joined: Mon May 29, 2017 7:00 pm

Re: CPU bug?

Postby Alcoholics Anonymous » Wed Mar 31, 2021 2:03 pm

Well I think you fixed it :)

As long as the ula interrupt is not toggled off, the times should match the actual time accumulated by the ctc interrupts.

The 128K frame is 70908 T but there is a known issue where the 128K is run at 3.5 MHz instead of 3.54 MHz (the PLL has to be updated which I haven't gotten to yet). This makes the ula frame rate 49.3597 Hz. The time passed as measured by ula interrupts is 8:33.96 which is bang on.
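For anyone who wants to check the frame-rate figure, here is a minimal sketch of the arithmetic. It only uses the two numbers quoted above (70908 T per frame and a 3.5 MHz clock); the function name is made up for illustration.

```python
# T-states per 128K frame, as quoted in the post above.
T_STATES_PER_FRAME_128K = 70908


def ula_frame_rate_hz(cpu_clock_hz, t_states_per_frame=T_STATES_PER_FRAME_128K):
    """Frame rate in Hz: one ula interrupt fires every `t_states_per_frame` clocks."""
    return cpu_clock_hz / t_states_per_frame


# Clock the 128K mode currently runs at, according to the post.
rate = ula_frame_rate_hz(3_500_000)
print(f"{rate:.4f} Hz")  # prints "49.3597 Hz"
```

Passing a different clock value shows how the frame rate would move once the PLL is updated.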

The ctc0 counter counts 0.0001 sec per tick. The interrupt is not enabled on ctc0, so the channel just sets a bit in the status register. The program reads the set bit as the signal to update its counter and then resets the bit in the status register. As long as the screen is not being updated, the program sits in a tight loop counting these set bits. It may still miss a few counts since the period is short, and it does not count while a key is pressed and the screen is being updated. The time measured by this polling method and the ctc0 status bit is 8:11.26, which I would say is correct given the margins (and maybe you updated the screen a few times while running?). Maybe running at 28 MHz will get an even closer result.
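As a rough check of that figure, the sketch below converts a ctc0 tick count into a time string. The 0.0001 sec tick and the 4912665 count come from this thread; the function name and the choice to truncate to hundredths of a second are assumptions, not taken from the test program.

```python
# One ctc0 tick is 0.0001 s, so there are 10000 ticks per second
# and 100 ticks per hundredth of a second.
TICKS_PER_CENTISECOND = 100


def format_elapsed(ticks):
    """Format a ctc0 tick count as m:ss.cc (truncating, which is an assumption)."""
    centis_total = ticks // TICKS_PER_CENTISECOND
    minutes, rest = divmod(centis_total, 60 * 100)
    seconds, centis = divmod(rest, 100)
    return f"{minutes}:{seconds:02d}.{centis:02d}"


print(format_elapsed(4912665))  # prints "8:11.26"
```

Integer arithmetic is used on purpose so the result does not depend on floating-point rounding.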

The ctc1-ctc3 counts do the same thing by polling the status register. However, I realize now that it will be relatively rare for the program to see a set bit. The reason is that the ctc channels set their bit but this also triggers an interrupt, and when the end of the isr is reached, reti executes and clears the status bit. That means that for the program to see a set bit, it must be the instruction that reads the status register that gets interrupted, which is going to be very uncommon. The ctc0 count is 4912665, which means the ctc1 interrupt should have run 4912665 / 100 = 49126 times, and out of those only 49 were seen by the polling. ctc2 and ctc3 run even more rarely (491 times and 8 times respectively). From the ctc1 count, the odds of catching a set bit seem to be about 1/1000, so you would expect ctc2 and ctc3 to read 0 at the point the program was stopped.
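The estimate above can be sketched in a few lines. The counts are the ones quoted in this post; the function names are hypothetical, and the sketch assumes every channel has the same per-interrupt chance of being caught by the polling loop.

```python
# Counts quoted in the post above.
CTC1_ISR_RUNS = 49126  # ctc0 count 4912665 / 100
CTC1_SEEN = 49         # set bits the polling loop actually observed on ctc1
CTC2_ISR_RUNS = 491
CTC3_ISR_RUNS = 8


def catch_odds(seen, isr_runs):
    """Fraction of interrupts whose status bit the polling loop managed to see."""
    return seen / isr_runs


def expected_seen(isr_runs, odds):
    """Expected sightings, assuming the same odds apply to every interrupt."""
    return isr_runs * odds


odds = catch_odds(CTC1_SEEN, CTC1_ISR_RUNS)
print(f"odds per interrupt: about 1 in {round(1 / odds)}")
print(f"expected ctc2 sightings: {expected_seen(CTC2_ISR_RUNS, odds):.2f}")
print(f"expected ctc3 sightings: {expected_seen(CTC3_ISR_RUNS, odds):.3f}")
```

Both expected values come out well below one, which is why a reading of 0 for ctc2 and ctc3 is unsurprising.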

So I think everything is running properly, that's great news and another bug in the t80 fixed!

slingshot
Posts: 40
Joined: Mon Mar 22, 2021 12:21 pm

Re: CPU bug?

Postby slingshot » Wed Mar 31, 2021 2:40 pm

After 40 minutes, there's not a single CTC3 count, so yes, it's not caught.
But supporting both 48K and 128K correctly will require a reconfigurable PLL, right?

Another bug? Every bug :) Well, that might not be true, but all the fixes are in the merge request.

Alcoholics Anonymous
Posts: 781
Joined: Mon May 29, 2017 7:00 pm

Re: CPU bug?

Postby Alcoholics Anonymous » Wed Mar 31, 2021 2:54 pm

slingshot wrote:
Wed Mar 31, 2021 2:40 pm
After 40 minutes, there's not a single CTC3 count, so yes, it's not caught.
Yeah you would have to run for at least 1000 minutes before you might expect to see it count 1.
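A rough sketch of where a figure of that size comes from, using only numbers already posted in this thread (8 ctc3 interrupts and 49 of 49126 ctc1 bits caught during a run of 4912665 ctc0 ticks at 0.0001 sec each). The function name is hypothetical and the result is an order-of-magnitude estimate, not a prediction.

```python
# Figures from the earlier post in this thread.
ELAPSED_SECONDS = 4912665 * 0.0001  # length of the measured run
CTC3_ISR_RUNS = 8                   # ctc3 interrupts during that run
CATCH_ODDS = 49 / 49126             # chance the polling loop sees one set bit


def minutes_per_expected_catch(isr_runs, elapsed_seconds, odds):
    """Average run time in minutes for one expected sighting of a set bit."""
    isr_per_minute = isr_runs / (elapsed_seconds / 60)
    return 1 / (isr_per_minute * odds)


estimate = minutes_per_expected_catch(CTC3_ISR_RUNS, ELAPSED_SECONDS, CATCH_ODDS)
print(f"about {estimate:.0f} minutes per expected ctc3 count")
```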
But supporting both 48K and 128K correctly will require a reconfigurable PLL, right?
Yes we are already doing that to have the VGA 0-6 settings that can modify the system clock dynamically to try to increase frame rates toward 60 Hz for vga displays that can't sync to 50 Hz. VGA 0 is the preferred setting of course.

But I may not do it for this iteration of the system since the hdmi fix we have planned changes how things work completely.
Another bug? Every bug :) Well, that might not be true, but all the fixes are in the merge request.
Well let's hope it's all of them :) Cheers for your work here, hopefully I won't be stuck in testing for too much longer.

Alcoholics Anonymous
Posts: 781
Joined: Mon May 29, 2017 7:00 pm

Re: CPU bug?

Postby Alcoholics Anonymous » Thu Apr 01, 2021 6:03 pm

slingshot, I have one other z80 related item in my to-do; I don't know if it interests you?

The nmos z80 has a bug in how the "LD A,R" and "LD A,I" instructions copy the interrupt enable bit (IFF1 I think; the Zilog documentation describes it as IFF2, which normally mirrors IFF1) to the P/V flag: if an interrupt occurs during one of these instructions, the interrupt clears the flag before the instruction copies it to P/V. That means there is a small chance these instructions can't properly read the current interrupt enable state.

Zilog has an official workaround which is similar to this: https://github.com/z88dk/z88dk/blob/mas ... di.asm#L39

The cmos z80 corrected this bug presumably by copying to the P/V flag before the interrupt clears IFF1. I looked into it a long while ago and I think the nmos bug is present in the T80 and we've decided we don't want that and prefer the cmos z80 fix.
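To make the difference concrete, here is a toy model of the two orderings described above. It is not T80 code and not cycle accurate; the function name and parameters are invented for illustration, and it only captures "clear first, then copy" versus "copy first, then clear".

```python
def ld_a_i_pv(interrupts_enabled, interrupt_accepted, nmos):
    """Return the P/V flag that LD A,I (or LD A,R) reports in this toy model.

    interrupts_enabled: state of the interrupt enable flip-flop before the instruction.
    interrupt_accepted: True if an interrupt is accepted during the instruction.
    nmos: True models the nmos ordering, False models the cmos fix.
    """
    if nmos and interrupt_accepted:
        # nmos ordering: the interrupt clears the flip-flop before it is copied,
        # so P/V reports "disabled" even though interrupts were enabled.
        return False
    # cmos ordering (or no interrupt): P/V reflects the state before the interrupt.
    return interrupts_enabled


# Interrupts enabled and an interrupt lands during the instruction:
print(ld_a_i_pv(True, True, nmos=True))   # prints "False" (the misread)
print(ld_a_i_pv(True, True, nmos=False))  # prints "True"
```

The misread only happens in the one case where interrupts were enabled and an interrupt is accepted during the instruction, which is why it shows up so rarely in practice.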

Up to you if you are looking for even more things to do :)

slingshot
Posts: 40
Joined: Mon Mar 22, 2021 12:21 pm

Re: CPU bug?

Postby slingshot » Fri Apr 02, 2021 12:07 pm

Alcoholics Anonymous wrote:
Thu Apr 01, 2021 6:03 pm
slingshot, I have one other z80 related item in my to-do; I don't know if it interests you?
Of course I can look at it, but are there any test programs I can use to see whether it actually works? I'm not really good at writing tests :)
The problem is strange, since interrupts are handled at the end of an instruction cycle, so the LD should have finished its job already.

Alcoholics Anonymous
Posts: 781
Joined: Mon May 29, 2017 7:00 pm

Re: CPU bug?

Postby Alcoholics Anonymous » Fri Apr 02, 2021 1:56 pm

slingshot wrote:
Fri Apr 02, 2021 12:07 pm
Of course I can look at it, but are there any test programs I can use to see whether it actually works? I'm not really good at writing tests :)
The problem is strange, since interrupts are handled at the end of an instruction cycle, so the LD should have finished its job already.
Yeah the interrupt pin should be sampled on the rising edge of the last clock cycle so that means if there is a problem, IFF1 or IFF2 must be copied to the P/V flag in the very last cycle on either the following falling edge or the rising edge of the next cycle. Or maybe the flag is copied continuously on an edge throughout the last few cycles. "LD A,I" and "LD A,R" have two machine cycles, both opcode fetches, but the last one is an unusual 5T instead of 4T so it seems it is possible an extra T executes following the refresh to do the flag copy.

I think I can come up with a simple test to see if the bug is present.

slingshot
Posts: 40
Joined: Mon Mar 22, 2021 12:21 pm

Re: CPU bug?

Postby slingshot » Fri Apr 02, 2021 2:09 pm

Alcoholics Anonymous wrote:
Fri Apr 02, 2021 1:56 pm
slingshot wrote:
Fri Apr 02, 2021 12:07 pm
Of course I can look at it, but are there any test programs I can use to see whether it actually works? I'm not really good at writing tests :)
The problem is strange, since interrupts are handled at the end of an instruction cycle, so the LD should have finished its job already.
Yeah the interrupt pin should be sampled on the rising edge of the last clock cycle so that means if there is a problem, IFF1 or IFF2 must be copied to the P/V flag in the very last cycle on either the following falling edge or the rising edge of the next cycle. Or maybe the flag is copied continuously on an edge throughout the last few cycles. "LD A,I" and "LD A,R" have two machine cycles, both opcode fetches, but the last one is an unusual 5T instead of 4T so it seems it is possible an extra T executes following the refresh to do the flag copy.

I think I can come up with a simple test to see if the bug is present.
According to the code, IntE_FF2 is zeroed after the last T_State/MCycle (which is 5/1 for these instructions, excluding the prefix fetch), and the P flag is loaded at MCycle=1, TState=3, so I don't see how it can fail. Yeah, a test program would definitely help to spot whether there's some unexpected side effect somewhere. The T80n core itself works only on the rising edge of the clock; only the async top level acts on both edges (which is a good design IMHO).

azesmbog
Posts: 16
Joined: Mon May 29, 2017 9:12 pm

Re: CPU bug?

Postby azesmbog » Fri Apr 02, 2021 3:05 pm

slingshot wrote:
Fri Apr 02, 2021 12:07 pm
Of course I can look at it, but are there any test programs I can use to see whether it actually works? I'm not really good at writing tests :)
Everything has long been invented and written before us.
20 years ago, the famous programmer Ivan Roshchin investigated this very problem.
The site is no longer up, but the web archive still remembers some of it :)

https://web.archive.org/web/20180829085 ... /index.htm

As far as I remember, all the test cases used to work for me, with a green border at the top and a yellow one at the bottom.
But now for some reason only the first example works on the emulator.
On the Next the green border does not appear, so has the error been fixed?

Maybe this will help you figure it out? :-))
Alcoholics Anonymous wrote:
Yeah you would have to run for at least 1000 minutes before you might expect to see it count 1.
For me, after six hours (360 min) of running this test, the "count 3" variable has counted up to three.
Is this normal?
One more thing: the timer runs a little fast, approximately 62/60 sec.

