Ian Jacobs

6502 Programming

For my portability and optimization class, I am looking at 6502 assembly code in preparation for working with modern assembly code. I am using a 6502 emulator located here that includes a bitmap display, a text output, and a memory monitor, so we can see the output of the assembly code visually and inspect the memory it changes.

In this blog I will run some 6502 assembly code, calculate how long it takes to run, and perform some extra experiments.

The instruction set for this CPU is very small. The manual used for this lab and in the course, which lists all instructions for the 6502 processor, can be found here.

Calculating Performance

The following code fills the bitmap with a solid colour:

    lda #$00    ; set a pointer in memory location $40 to point to $0200
    sta $40     ; ... low byte ($00) goes in address $40
    lda #$02    
    sta $41     ; ... high byte ($02) goes into address $41

    lda #$07    ; colour number

    ldy #$00    ; set index to 0

loop:   sta ($40),y ; set pixel colour at the address (pointer)+Y

    iny     ; increment index
    bne loop    ; continue until done the page (256 pixels)

    inc $41     ; increment the page
    ldx $41     ; get the current page number
    cpx #$06    ; compare with 6
    bne loop    ; continue until done all pages

Timing Results

Below are the timing results from each instruction used above including how many cycles it would take to complete which is then transformed into time.

Timing result of the above code

A bit of clarification: the bitmap display is 32x32 pixels, and each pixel takes one byte of the 6502's memory (addresses $0200 to $05FF). Because we loop over and set every pixel, the instructions in the inner loop run 32x32 = 1024 times, split into 4 pages of 256 pixels.

Overall, this means that the program requires 10326 cycles to complete, which, assuming a 1 MHz CPU speed, would take 0.010326 seconds.

Memory

During the execution of the code, memory is used for each instruction used and for variables. The total memory used for this program was the following:

Memory used

Just to note: the answer of 8KB is not the peak amount in use at any one time; it is the total number of bytes used over the entire program. The byte values for the instructions were found in the manual here.

Optimized code

To optimize this code for better time, I moved the loading of the X register to the beginning of the program instead of doing it inside the page loop. I then compare X to the pointer's current high byte (which is incremented after each page), which tells me whether the pointer is still within the bitmap's pages.

    lda #$00    ; set a pointer in memory location $40 to point to $0200
    sta $40     ; low byte ($00) goes in address $40
    lda #$02
    sta $41     ; high byte ($02) goes into address $41

    LDX #$06    ; end page: the pointer's high byte once all 4 pages are done

    LDA #$06    ; colour number ($06 is blue; the original used $07, yellow)
    LDY #$00    ; set Y register index to 0

loop:
    STA ($40),y ; set pixel colour at the address (pointer)+Y
    INY         ; increment Y register
    BNE loop    ; continue until done the page (256 pixels)

    INC $41     ; increment the page
    CPX $41     ; compare the end page in X with the pointer's high byte
    BNE loop    ; continue until done all pages

Here are the new timings below

Timing results for above, optimized version

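As can be seen, my new "optimized" implementation doesn't shave off much time: only 0.000006 seconds, or 6 cycles at 1 MHz. Replacing the per-page LDX $41 and CPX #$06 with a single CPX $41 saves 2 cycles on each of the 4 pages, and the one-time LDX #$06 at the start costs 2 cycles back.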

I am not sure exactly how to cut more time from this program. I suspect that it has something to do with how the loops constantly repeat the same instructions, and that it could be optimized there somehow.
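One direction that might cut more time is sketched below. It is an assumption on my part rather than something measured in this lab: instead of going through the zero-page pointer, it writes to all four bitmap pages in a single loop using absolute,X addressing, so the page loop and the pointer updates disappear. The page addresses $0200 to $0500 are hard-coded, which is the trade-off.

```asm
; Sketch only: fill all four bitmap pages ($0200-$05FF) in one loop
; using absolute,X addressing instead of a zero-page pointer.
    lda #$07        ; colour number, as in the original program
    ldx #$00        ; X indexes the 256 bytes within each page

loop:
    sta $0200,x     ; pixel X of the first page
    sta $0300,x     ; pixel X of the second page
    sta $0400,x     ; pixel X of the third page
    sta $0500,x     ; pixel X of the fourth page
    inx             ; next offset; wraps from $FF back to $00
    bne loop        ; repeat until X wraps, i.e. 256 iterations
```

Going by the standard 6502 cycle tables (5 cycles for each STA absolute,X, 2 for INX, 3 for a taken BNE), each pass costs about 25 cycles for 4 pixels, so the whole fill should come to roughly 6,400 cycles. I have not timed this in the emulator, so treat the figure as an estimate.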

Modifying the Code

Colouring Each Page

Here is code that adds 1 to the accumulator, which holds the colour code for each pixel, after every page, so that the colour changes from one page to the next.

    lda #$00    ; set a pointer in memory location $40 to point to $0200
    sta $40     ; low byte ($00) goes in address $40
    lda #$02
    sta $41     ; high byte ($02) goes into address $41

    LDX #$06    ; end page: the pointer's high byte once all 4 pages are done

    LDA #$06    ; starting colour code ($06, blue)
    LDY #$00    ; set Y register index to 0

loop:
    STA ($40),y ; set pixel colour at the address (pointer)+Y
    INY         ; increment Y register
    BNE loop    ; continue until done the page (256 pixels)

    CLC         ; clear carry (CPX below sets it) so ADC adds exactly 1
    ADC #$01    ; add 1 to the accumulator to update the colour
                ; for each page
    INC $41     ; increment the page
    CPX $41     ; compare the end page in X with the pointer's high byte
    BNE loop    ; continue until done all pages

Here is the result:

Multi color page output in stacks

Experiments

TYA

Adding the TYA instruction at the beginning of the loop:

Results of using TYA in bitmap display; vertical lines of different colour

After adding TYA, the bitmap displays vertical lines of different colours. TYA copies Y (the pixel's index within the page) into the accumulator, so each pixel's colour follows its position: every pixel along a row gets the next colour in sequence. Since there are 16 different colours and each row is 32 pixels wide, each row shows the full set of colours twice.

LSR

Including LSR after and with the TYA instruction:

Results of TYA and LSR. Thicker lines

When LSR is added after TYA, each coloured vertical line becomes twice as thick, meaning that only one set of the 16 different colours fits across the display.

Results of Using 2 LSR statements. wider lines plus overlapping

Adding more LSR instructions keeps making the lines thicker. With 2 LSR instructions there are 8 bands per row, but since there are 16 colours available, each row shows only half of them, and the two halves alternate from one row to the next.

5 lsr result

Eventually, once 5 LSR instructions are used, each band is a full row wide, so the display shows horizontal lines one pixel tall. Additionally, it only repeats the first 8 colours instead of all 16, because Y shifted right 5 times only ranges from 0 to 7.
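For reference, here is a minimal sketch of the kind of loop used in these experiments. It is reconstructed from the original fill program rather than copied from my lab file, and it assumes the emulator picks the colour from the low 4 bits of the stored byte, which matches the 16 repeating colours seen above. It is written with a plain `lsr`; some assemblers expect `lsr a` instead.

```asm
    lda #$00        ; set a pointer in $40/$41 to point to $0200
    sta $40
    lda #$02
    sta $41
    ldy #$00        ; set index to 0

loop:
    tya             ; copy the index Y into the accumulator
    lsr             ; shift right once: A = Y / 2, rounded down
    sta ($40),y     ; store A as the pixel at (pointer)+Y
    iny             ; increment index
    bne loop        ; continue until done the page (256 pixels)

    inc $41         ; increment the page
    ldx $41         ; get the current page number
    cpx #$06        ; compare with 6
    bne loop        ; continue until done all pages
```

Each extra `lsr` placed after the first halves the value again, so the bands double in width every time one is added.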

ASL

Restarting from the version with only TYA, I added ASL instead of LSR, which caused the following.

ASL Result

The display once again showed vertical lines of different colours. However, compared to using LSR, some of the colours were skipped altogether. Referring to the 6502 emulator colour code chart on this page, it seems that the ASL operation skips every other colour, meaning that only 8 out of the 16 colours are repeated across the 32 pixels of each row.

When adding more ASL operations like before, the program skips more and more colours: each ASL added effectively halves the number of colours shown (starting from 16), as seen below.

2 ASL

This culminates at 4 ASL instructions, where only the first colour, black, is shown. Four shifts multiply the value by 2^4 = 16, which leaves 16/16 = 1 colour, so the first colour is repeated for every pixel.

According to the manual located here, ASL is an "Arithmetic shift left", which means that it shifts the bits of the byte one place to the left. This ends up multiplying the byte by 2, which explains what is happening here with each additional instruction.

INY

Another experiment is to add more INY (increment Y register) instructions to each loop. I would expect this to advance the index in steps of 5, meaning that it skips pixels when colouring.

However, what happens is that the screen still fills with colour. When running at the lowest speed, you can see the pixels being filled in 5 spaces apart, repeating 256 times per page. When the index passes 255 it overflows and wraps around (for example 255 + 5 = 260, which wraps to 4), so later passes land on the pixels that were skipped earlier. Because 5 and 256 share no common factor, every pixel in the page is reached before Y returns to 0, so each page fills up as usual and then the entire display.
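A minimal sketch of this experiment is below. It assumes five INY instructions in total per loop (to match the steps of 5 described above) and otherwise reuses the original fill program.

```asm
    lda #$00        ; set a pointer in $40/$41 to point to $0200
    sta $40
    lda #$02
    sta $41
    lda #$07        ; colour number
    ldy #$00        ; set index to 0

loop:
    sta ($40),y     ; set pixel colour at the address (pointer)+Y
    iny
    iny
    iny
    iny
    iny             ; five increments: Y = (Y + 5) mod 256
    bne loop        ; Y only returns to 0 after 256 stores

    inc $41         ; increment the page
    ldx $41         ; get the current page number
    cpx #$06        ; compare with 6
    bne loop        ; continue until done all pages
```

The index visits 0, 5, 10, ..., 255, then wraps to 4, 9, 14, and so on, which is why the gaps get filled in on later passes.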

Random Colours

    lda #$00    ; set a pointer in memory location $40 to point to $0200
    sta $40     ; ... low byte ($00) goes in address $40
    lda #$02
    sta $41     ; ... high byte ($02) goes into address $41

    ldx #$06    ; end page: the pointer's high byte once all pages are done
    ldy #$00    ; set index to 0

loop:   lda $FE     ; read a random byte from $FE as the next pixel's colour
    sta ($40),y ; set pixel colour at the address (pointer)+Y
    iny         ; increment index
    bne loop    ; continue until done the page (256 pixels)

    inc $41     ; increment the page
    cpx $41     ; compare the end page in X with the pointer's high byte
    bne loop    ; continue until done all pages

This code reads memory address $FE (254), which is defined as a pseudo-random number generator in the peripherals of the emulator. Each read loads a new random number into the accumulator before the pixel is stored, allowing each pixel to be a random colour. Here are two examples with the above code on the 6502 Emulator:

Image description

Challenges

The code below sets the bitmap display to a single colour except for the middle 4 pixels.

    lda #$00    ; set a pointer in memory location $40 to point to $0200
    sta $40     ; ... low byte ($00) goes in address $40
    lda #$02
    sta $41     ; ... high byte ($02) goes into address $41

    lda $FE     ; read a random colour (number) into the accumulator
    ldx #$06    ; end page: the pointer's high byte once all pages are done
    ldy #$00    ; set index to 0

loop:
    sta ($40),y ; set pixel colour at the address (pointer)+Y
    iny         ; increment index
    bne loop    ; continue until done the page (256 pixels)

    inc $41     ; increment the page
    cpx $41     ; compare the end page in X with the pointer's high byte
    bne loop    ; continue until done all pages

    clc         ; clear carry (left set by CPX) so ADC adds exactly 1
    adc #$01    ; add one to the colour code

    sta $03EF   ; row 15, columns 15 and 16
    sta $03F0
    sta $040F   ; row 16, columns 15 and 16
    sta $0410


My solution to this challenge was to manually store to each of the middle pixel locations and give them the chosen colour + 1 (I kept the random colour choice in this code, so the fill colour is still random).

Here is the output:

Challenge output: full screen except for middle four pixels

Final Thoughts

After completing this lab and doing the experiments and challenges, I believe that I have learned the basic instructions and addressing modes for this assembly language, including the little-endian byte order it uses when storing addresses.

When first viewing the assembler demos and lab, I was sort of intimidated by how random the code's behaviour seemed, but throughout this lab I experimented and got a good beginner's understanding of how the 6502 processor operates and how assembly operates in general.
