Re: coding the Z80 and 6510

From: w_gayk@NOSPAMbielefeld.netsurf.de (Brix)
Reply to: Brix
Date: Tue, 3 Nov 1998 21:13:55 +0100
Organization: [Customers of] IS Internet Services GmbH & Co, Hamburg
References: <70r5ec$8jp$17@newsreader1.core.theplanet.net> <3635b416.21822437@news.telepac.pt> <16476-agent.imc@comlab.ox.ac.uk> <71fo4o$oe9@cpca3.uea.ac.uk>

> This is indeed typical Z80 code when you've got all the time in the world,
> but where speed is important, usually the stack is used, like this:
>
>         ld sp,$1100     (10 cycles)
>         ld hl,$4444     (10 cycles)
>         ld b,$20        (7 cycles)
> .loop   push hl         (11 cycles)
>         push hl         (11 cycles)
>         push hl         (11 cycles)
>         push hl         (11 cycles)
>         djnz loop       (13 cycles if b>1, 8 if b=1)
>
> This takes advantage of the fact that every PUSH fills two bytes, so less
> time is spent on the rather hideously slow Z80 branching instructions.
> In this example we've got 27 cycles setup, plus 32 x 57 cycles (-5 for
> last iteration).
>
> This is just 1846 cycles - less than the 6510 code!
> On the Speccy @ 3.54MHz, this would take under 522us.

Your syntax is pretty strange to me, but I'll see if I can understand
what this program does. Since you have 16-bit registers only, you fill
hl with $4444 (which means you could pattern-fill an area with the bytes
$44 and $22 by using ld hl,$4422?). Does it mean every PUSH decreases
the b register by 1 and automatically increases HL by 2?

If I'm not mistaken, the program:

        ld sp,$1100
        ld hl,$4444
        ld b,$80
.loop   push hl
        djnz loop

would also work the same way? If so, then you just cheated a bit,
because I could also do:

        ldx #$00        (2 cycles)
        lda #$44        (2 cycles)
.loop   sta $1000,x     (5 cycles)
        sta $1001,x     (5 cycles)
        sta $1002,x     (5 cycles)
        sta $1003,x     (5 cycles)
        dex             (2 cycles)
        dex             (2 cycles)
        dex             (2 cycles)
        dex             (2 cycles)
        bne loop        (2 cycles)

That would be 4 cycles setup + 64 x 30 cycles = 1924 cycles. OK, this is
still more than the Z80. You win on this point.
But note that this indexed addressing is very flexible and does not need
to be a linear operation.

> Of course, the more PUSHes you use the faster this becomes, and where
> every last microsecond is vital, Speccy programmers would unroll the loop
> completely. In this case, you'd get down to 1428 cycles (404us) for the
> full 128 PUSHes...

Unrolling the code completely (called "Speedcode" in our world) we get
this:

        lda #$44        (2 cycles)
        sta $1000       (4 cycles each)
        sta $1001
        sta $1002
        sta $1003
        ...
        sta $10ff

That makes a total of 256 x 4 cycles + 2 cycles = 1026 cycles.

> This is fun! Any more example code fragments we can compare?

Yes, this is fun. We both learn something and it is not just "hey,
you're shit and we're not..". But please explain your code as well as
you can, because for me (and the 6510 syntax is damned easy) your syntax
is very strange.

-Brix/Plush-
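
The cycle totals traded in this thread can be re-checked with a few lines
of Python. This is only a rough sketch: the function names are made up
for illustration, the per-instruction timings are simply the ones quoted
in the posts rather than measured values, and the 6510 loop is assumed
to run 64 passes as stated in the post. No 6510 clock speed is given in
the thread, so only the Z80 figures are converted to microseconds.

```python
# Cycle-count arithmetic for the fill routines discussed in this thread.
# All per-instruction timings are the ones quoted in the posts.

Z80_CLOCK_HZ = 3_540_000  # "Speccy @ 3.54MHz", as quoted in the thread


def z80_push_loop(pushes_per_pass=4, passes=32):
    """Cycles for the PUSH loop: setup, then `passes` runs of the loop body."""
    setup = 10 + 10 + 7                # ld sp / ld hl / ld b
    body = pushes_per_pass * 11 + 13   # the PUSHes plus a taken DJNZ
    return setup + passes * body - 5   # the final DJNZ falls through: 8, not 13


def z80_unrolled(pushes=128):
    """Cycles for the fully unrolled version: ld sp, ld hl, then only PUSHes."""
    return 10 + 10 + pushes * 11


def mos6510_loop(passes=64):
    """Cycles for the indexed STA loop, using the per-pass figure from the post."""
    setup = 2 + 2                      # ldx / lda
    body = 4 * 5 + 4 * 2 + 2           # four STA abs,X, four DEX, BNE as quoted
    return setup + passes * body


def mos6510_unrolled(stores=256):
    """Cycles for the speedcode: one LDA, then one STA absolute per byte."""
    return 2 + stores * 4


def microseconds(cycles, clock_hz=Z80_CLOCK_HZ):
    """Convert a cycle count to microseconds at the given clock rate."""
    return cycles * 1_000_000 / clock_hz


if __name__ == "__main__":
    print("Z80 PUSH loop:  ", z80_push_loop(), "cycles")
    print("Z80 unrolled:   ", z80_unrolled(), "cycles")
    print("6510 STA loop:  ", mos6510_loop(), "cycles")
    print("6510 speedcode: ", mos6510_unrolled(), "cycles")
    print("Z80 PUSH loop at 3.54MHz: %.1f us" % microseconds(z80_push_loop()))
```

Changing `pushes_per_pass` and `passes` together (keeping their product
at 128) shows how the DJNZ overhead shrinks as more PUSHes are placed in
each pass.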