I'm currently looking into how to use the HW sprites efficiently from SW, i.e. from assembler.
I ran into a few problems and would like to hear your ideas or solutions.
Simply put, the problem is how to deal with a sprite's position efficiently.
When working with sprites, the usual loop is:
1. Move sprites
2. Draw sprites
3. Check for collisions
4. Remove sprites (e.g. because of collision)
5. Create new sprites (e.g. explosion or a shot has been created or what else)
6. Goto 1
For most of these operations the position of the sprite is required.
On the ZX Next the X and Y positions are 9 bits wide each.
That is more than 1 byte, so 16-bit (register pair) arithmetic is required.
Furthermore, the 9th bit (bit 8) shares a byte with other bits, which makes it awkward to read and write.
On top of that, the data has to be kept in a kind of shadow memory, as the ZX Next sprite HW registers are not readable.
All in all, this leads to an algorithm like the following:
1. Move sprites in normal RAM (shadow memory)
2. Draw sprites: Copy this shadow memory to ZX Next HW
3. Go through all x/y positions in the shadow memory to look for collisions.
4. Remove sprites: Disable them in the shadow memory (on the next copy to the ZX Next HW the HW sprites are disabled)
5. Create new sprites: Enable them in the shadow memory and set their position (on the next copy to the ZX Next HW the HW sprites are enabled)
6. Goto 1
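To make step 2 concrete, here is a minimal sketch of the copy as I picture it. It rests on assumptions: that port $303B selects the sprite slot and port $57 takes the attribute bytes (my reading of the Next port map), that every shadow entry is 5 bytes long (which, as far as I understand, needs the extended-attribute flag set in each entry), and that SPRITE_COUNT is just a made-up number for illustration.

```asm
SPRITE_SLOT_PORT    equ $303B   ; sprite slot select (assumed from the Next port map)
SPRITE_ATTR_PORT    equ $57     ; sprite attribute upload (assumed from the Next port map)
SPRITE_COUNT        equ 16      ; hypothetical number of sprites in use
SPRITE_SIZE         equ 5       ; bytes per shadow entry, same layout as the HW

copy_sprites:
    ld bc,SPRITE_SLOT_PORT
    xor a
    out (c),a                   ; start the upload at sprite 0
    ld hl,shadow_memory         ; source: the shadow copy in normal RAM
    ld bc,SPRITE_COUNT*SPRITE_SIZE*256 + SPRITE_ATTR_PORT ; b = byte count, c = port
    otir                        ; write b bytes from (hl) to port c
    ret
```

Since otir counts in b, one pass moves at most 256 bytes, so more than 51 five-byte entries would need a second otir (or DMA).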
The shadow memory should use the same bit packing as the ZX Next HW, so that the complete block can be moved from memory to the ZX Next HW all at once. I think this can also be done with DMA, but for DMA it is even more important that the structures in memory are exactly the same as in HW.
That means I still need to implement an efficient way to retrieve the X/Y position.
Let's take the X-position as an example:
The lower 8 bits are at offset 0. Bit 8 is bit 0 (the least significant bit) of the byte at offset 2.
The other bits of that byte are not 0; they contain other information that must not be changed and therefore has to be masked.
In assembler this results in something like this:
    ld hl,shadow_memory + sprite_index*5 ; Load the address of some sprite

    ; Load X
    ld e,(hl)       ; T7. Load low byte of X
    inc hl          ; T6.
    inc hl          ; T6.
    ld d,(hl)       ; T7. Load the byte with the 9th bit. Bit 0 of d is bit 8 of X, the upper bits of d are other flags
    ; T=26

    ; Add some delta
    ex de,hl        ; T4. hl contains X
    ld bc,x_add     ; T10. The value to add to X
    add hl,bc       ; T11. x += x_add
    ; T=25

    ; Store new value
    ex de,hl        ; T4. de = new X, hl = pointer to 9th bit
    ld a,(hl)       ; T7. Load the current flags
    rra             ; T4. Discard 9th bit
    rr d            ; T8. Move new 9th bit to Carry
    rla             ; T4. Set 9th bit from Carry
    ld (hl),a       ; T7. Store 9th bit without changing the other flags
    ; Store low byte
    dec hl          ; T6
    dec hl          ; T6
    ld (hl),e       ; T7
    ; T=53

    ; Total T=104
Compared with an ideal case, where I just take a 2-byte value, add another 2-byte value to it and store the result, this is a lot. Here is the ideal case:
    ld hl,(x)       ; T16. The X-position
    ld de,x_add     ; T10. The value to add/subtract to X-position
    add hl,de       ; T11. x += x_add
    ld (x),hl       ; T16. Store new value
    ; In total T=53
The same has to be done for the Y value as well.
That covers the moving; then there is the collision detection. Here we need to get the X and Y values for comparison.
And here we additionally need to mask out the other bits of the high byte when getting the values.
So the additional code required would again be maybe 40 to 50% more.
In collision detection it can also happen that the same sprite is compared several times against other sprites. Meaning: the getter algorithm has to be run many times.
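To show what I mean by the getter, here is a minimal sketch for X. It assumes the layout described above (low byte at offset 0, bit 8 in bit 0 of the byte at offset 2); get_x is just a name I made up for illustration.

```asm
; In:  hl = address of the sprite's shadow entry (X low byte at offset 0)
; Out: de = X position (0..511), hl = address of the byte at offset 2
get_x:
    ld e,(hl)       ; T7. Low 8 bits of X
    inc hl          ; T6.
    inc hl          ; T6.
    ld a,(hl)       ; T7. Byte holding bit 8 of X plus the other flags
    and 1           ; T7. Mask out everything except bit 8 of X
    ld d,a          ; T4. de = X
    ; T=37 (without the ret)
    ret
```

That is 37 T-states per coordinate read before any comparison has happened, and the Y value needs the same treatment.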
So all in all this is a huge area for optimization.
I would be happy to get any ideas for a better algorithm. Maybe there is also some HW support or some Z80N instruction that could help and that I have overlooked.
Any help is appreciated.