Re: coding the Z80 and 6510
                                       
   From: "Joe Mackay" <regular.viewer@bunty.hoven>
   Reply to: "Joe Mackay"
   Date: 03 Nov 98 23:32:49 +0000
   Organization: [posted via] UK Online Ltd
   References:
          <16476-agent.imc@comlab.ox.ac.uk> 
          <MPG.10a526788e4c1899989738@news.uni-x.net> 
          <71fo4o$oe9@cpca3.uea.ac.uk> 
          <Pine.LNX.3.95.981031221758.409A-100000@menaxus.demon.co.uk> 
          <MPG.10a976a8fa4f0e0398975c@news.uni-x.net>
On 03-Nov-98 20:13:55, Brix did excrete via fingers:
>> This is indeed typical Z80 code when you've got all the time in the world,
>> but where speed is important, usually the stack is used, like this:
>>
>>      ld sp,$1100     (10 cycles)
>>      ld hl,$4444     (10 cycles)
>>      ld b,$20        (7 cycles)
>> .loop        push hl         (11 cycles)
>>      push hl         (11 cycles)
>>      push hl         (11 cycles)
>>      push hl         (11 cycles)
>>      djnz loop       (13 cycles if b>1, 8 if b=1)
>>
>> This takes advantage of the fact that every PUSH fills two bytes, so less
>> time is spent on the rather hideously slow Z80 branching instructions.
>> In this example we've got 27 cycles setup, plus 32 x 57 cycles (-5 for
>> last iteration).
>>
>> This is just 1846 cycles - less than the 6510 code!
>> On the Speccy @ 3.54MHz, this would take under 522us.

>Your syntax is pretty strange to me, but I'll see if I can understand what
>this program  does.. since you have 16 bit registers only, you fill hl with
>$4444..

The Z80 doesn't "have 16 bit registers only". The six 8-bit registers B, C,
D, E, H and L can be paired up as the 16-bitters BC, DE and HL. The A
(accumulator) register cannot, but it is paired with the F (flag) register,
as AF, for the PUSH/POP/EX instructions.

(NB. EX AF,AF' swaps AF with its alternate, AF'. EXX does the same for BC,
DE and HL and their alternates B', C' etc)

>(means you could pattern-fill an area with the bytes $44 and $22 by using ld
>hl,$4422 ??)

Yep.

>Does it mean every PUSH decreases the b register by 1 and
>automatically increases HL by
>2 ??

No. The B register here is used by the DJNZ instruction (decrement B, branch
if not 0). PUSH does not change HL: it decreases the SP (stack pointer) by 2
and stores HL at the new address, so the fill runs downwards from $10FF to
$1000.
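
For anyone who'd rather see it run than take my word for it, here is a rough
model of what the PUSH loop does to memory. It's Python rather than Z80 so it
can be tried anywhere; the function name push_fill is made up for this post,
and it only models PUSH HL, nothing else about the chip. I've used the $4422
pattern from the question above so the byte order is visible.

```python
# Minimal model of the Z80 "PUSH HL" instruction (illustrative, not an emulator).
def push_fill(sp, hl, pushes, memory):
    """Apply `pushes` PUSH HL operations to `memory`; return the final SP."""
    for _ in range(pushes):
        sp = (sp - 1) & 0xFFFF
        memory[sp] = hl >> 8       # high byte (H) lands at the higher address
        sp = (sp - 1) & 0xFFFF
        memory[sp] = hl & 0xFF     # low byte (L) lands at the lower address
    return sp                      # HL itself is never modified

memory = bytearray(0x10000)        # 64K address space, all zeroes
sp = push_fill(0x1100, 0x4422, 128, memory)   # 32 passes x 4 PUSHes

print(hex(sp))                         # 0x1000
print(memory[0x1000:0x1004].hex())     # 22442244
```

Note that the low byte comes first in memory, and that nothing outside
$1000-$10FF is touched.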

>When I'm not mistaken the program:
>ld sp,$1100
>ld hl,$4444
>ld b,80
>.loop push hl
>djnz loop

>would also work the same way?

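Yes (with B loaded with $80, i.e. 128, so that it still covers 256 bytes),
but it would be slower: 128 DJNZs instead of 32 comes to 3094 cycles against
1846. Unrolling the PUSHes means less slow branching.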
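
The sums, for anyone who wants to check them. This is just the arithmetic
from this thread written out in Python, using the cycle counts quoted above;
the function names are mine, and I'm assuming one PUSH per pass needs 128
passes to cover the same 256 bytes.

```python
# Z80 cycle counts as quoted earlier in this thread.
LD_SP, LD_HL, LD_B = 10, 10, 7
PUSH = 11
DJNZ_TAKEN, DJNZ_FALLTHROUGH = 13, 8

def loop_cycles(pushes_per_pass, passes):
    """Cycles for a DJNZ loop with `pushes_per_pass` PUSHes in its body."""
    setup = LD_SP + LD_HL + LD_B
    body = pushes_per_pass * PUSH + DJNZ_TAKEN
    # The final DJNZ falls through, which is 5 cycles cheaper than a taken one.
    return setup + passes * body - (DJNZ_TAKEN - DJNZ_FALLTHROUGH)

def unrolled_cycles(pushes):
    """Cycles with no loop at all: no LD B and no DJNZ."""
    return LD_SP + LD_HL + pushes * PUSH

CLOCK_MHZ = 3.54  # Speccy clock speed as quoted above

variants = [
    ("1 PUSH per pass", loop_cycles(1, 128)),
    ("4 PUSHes per pass", loop_cycles(4, 32)),
    ("fully unrolled", unrolled_cycles(128)),
]
for name, cycles in variants:
    print(f"{name}: {cycles} cycles, about {cycles / CLOCK_MHZ:.0f} us")
```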

>> Of course, the more PUSHes you use the faster this becomes, and where
>> every last microsecond is vital, Speccy programmers would unroll the loop
>> completely. In this case, you'd get down to 1428 cycles (404us) for the
>> full 128 PUSHes...

>Unrolling the code completely (called "Speedcode" in our world) we get this:

>lda #$44  (2 cycles)
>sta $1000 (4 cycles each)
>sta $1001
>sta $1002
>sta $1003
>...
>sta $10ff

>Would make a total of 256x4 cycles + 2 cycles = 1026 cycles.

This would use nearly six times as much memory though: a PUSH takes one byte
and fills two, while I assume sta $nnnn takes three bytes and fills only one.
That's about 770 bytes of code against 134.
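
A quick byte count for the two fully unrolled versions, sketched in Python.
The instruction sizes are the standard encodings as far as I know (sta with
an absolute address is three bytes, PUSH HL is one), and both fragments are
assumed to fill the same 256 bytes. Each sta is three times the size of a
PUSH, but you also need twice as many of them.

```python
# Instruction sizes in bytes.
LDA_IMM, STA_ABS = 2, 3                  # 6510: lda #$44 / sta $nnnn
LD_SP_NN, LD_HL_NN, PUSH_HL = 3, 3, 1    # Z80: ld sp,nn / ld hl,nn / push hl

BYTES_TO_FILL = 256

# 6510: one sta per byte filled.
size_6510 = LDA_IMM + BYTES_TO_FILL * STA_ABS
# Z80: one push per two bytes filled.
size_z80 = LD_SP_NN + LD_HL_NN + (BYTES_TO_FILL // 2) * PUSH_HL

print(size_6510, size_z80, round(size_6510 / size_z80, 1))   # 770 134 5.7
```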

>> This is fun! Any more example code fragments we can compare?
>
>Yes, this is fun. We both learn something and it is not just "hey, you're
>shit and we're  not..".

I think we're getting somewhere :-)

>But please explain your codes as good as you can, because for me (and the
>6510 syntax is  damned easy) your syntax is very strange.

And vice versa (6510 looks downright weird to me).