Saturday, February 5, 2022

Commodore 64 Emulator

Terminal UI

Some years ago, out of pure nostalgia and 2 hours of free time a day during the train commute to and from the office, I decided to write a Commodore 64 Emulator. I knew good emulators were available, and it was not my goal to compete with them, but I wanted to get into the details of doing such a task myself.

So I ended up with an emulator that runs the original Kernal and Basic; the UI works in ASCII mode in a Linux terminal. This allows one to key in and execute Basic code. Please note that many things are not done yet, and some may never be. I consider it a fun hack; I enjoyed writing it, and I still enjoy popping up a CBM64 screen in my terminal from time to time.

The basics of writing an emulator are not that hard, though it is a fair bit of work. Since it could be helpful to others, a few components are briefly outlined in this document.


Core Loop

At the core of the C64, and computers in general, there is a CPU. Simplified, this "Central Processing Unit" does the following steps in a loop.
  • Read an instruction from a memory location
    • the location is given by a variable stored in a two byte register, called the Program Counter (PC)
  • increment the PC
  • Execute that instruction (in an emulator, an instruction is basically a unique value that maps to a function); often this involves the following steps
    • read data (arguments) from memory or from the registers
    • do a computation on that data
    • write the result to registers or memory
  • start all over again
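
The loop above can be sketched in a few lines of C++. This is a minimal illustration, not the code from MOS6510.cpp: the ToyCpu struct is hypothetical, it knows only two real 6502 opcodes (LDA immediate, 0xA9, and STA absolute, 0x8D), it ignores the status flags, and it treats every other opcode as "halt", which a real CPU does not do.

```cpp
#include <array>
#include <cstdint>

// Hypothetical toy CPU, for illustration only; not the emulator's CMOS6510 class.
struct ToyCpu {
    std::array<uint8_t, 65536> mem{};  // flat 64 KiB address space
    uint16_t pc = 0;                   // Program Counter
    uint8_t a = 0;                     // accumulator register
    bool halted = false;

    // One pass through the loop: fetch, increment the PC, execute.
    void Step() {
        const uint8_t opcode = mem[pc++];
        switch (opcode) {
            case 0xA9: {  // LDA #imm: load the next byte into A
                a = mem[pc++];
                break;
            }
            case 0x8D: {  // STA abs: store A at a 16-bit little-endian address
                const uint8_t lo = mem[pc++];
                const uint8_t hi = mem[pc++];
                mem[static_cast<uint16_t>(lo | (hi << 8))] = a;
                break;
            }
            default:      // assumption: any other opcode stops this toy CPU
                halted = true;
                break;
        }
    }

    // Start all over again until the toy CPU halts.
    void Run() {
        while (!halted) Step();
    }
};
```

Loading the bytes A9 31 8D 00 04 00 somewhere in memory, pointing pc at them and calling Run() stores the value 0x31 at address 0x0400. A switch is used here for brevity; a table of function pointers indexed by opcode is a common alternative.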

The full list of instructions, called Opcodes, can be found in the file MOS6510.cpp.


Below is an overview of the CBM64 memory map.

The memory range is from 0x0000 to 0xFFFF; not surprisingly, an address fits into 2 bytes.
Note that this is not all just RAM. For example, the Kernal ROM is (by default) mapped at 0xE000 - 0xFFFF and the Basic ROM at 0xA000 - 0xBFFF. The screen, which is pretty useful for communicating with the computer's user, is mapped at 0x0400 - 0x07E7 (40 x 25 characters). That means that if we instruct the CPU to write the output of an instruction to 0x0400, something will likely be shown on screen (for example, try POKE 1024, 49, which should put a "1" in the top-left corner). The screen output is handled by a chip called the VIC, and my implementation covers just the basics: ASCII mode.
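
To make the POKE example concrete, here is a small sketch of how a screen position maps to an address, and how a screen code could be turned into an ASCII character for a terminal. The function names and constants are hypothetical and not taken from the emulator's VIC code. The conversion only covers the default uppercase character set; graphics characters are replaced by a '.' placeholder, which is an assumption made for this example.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint16_t kScreenBase = 0x0400;  // 1024 decimal, start of screen memory
constexpr int kColumns = 40;
constexpr int kRows = 25;

// Address of the character cell at (row, column), both counted from 0.
uint16_t ScreenAddress(int row, int column) {
    assert(row >= 0 && row < kRows && column >= 0 && column < kColumns);
    return static_cast<uint16_t>(kScreenBase + row * kColumns + column);
}

// Translate a C64 screen code to a printable ASCII character.
char ScreenCodeToAscii(uint8_t code) {
    code &= 0x7F;  // bit 7 selects reverse video; a plain terminal ignores it here
    if (code == 0) return '@';
    if (code <= 26) return static_cast<char>('A' + code - 1);      // 1..26 are A..Z
    if (code >= 32 && code <= 63) return static_cast<char>(code);  // same as ASCII
    return '.';  // assumption: placeholder for everything else
}
```

With these helpers, ScreenAddress(0, 0) is 0x0400 (the target of POKE 1024, 49), the last cell ScreenAddress(24, 39) is 0x07E7, and ScreenCodeToAscii(49) gives '1'.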


These are two nice schemas of the Commodore 64 Internals:

You will notice the following components

  • 6510 CPU
  • 6567 VIC (NTSC Video Interface Chip; can be 6569 for PAL) 
  • 6581 SID (Sound Interface Device)
  • 8 x 4164-2 RAM (each 65,536 x 1 bit, so 8 make up the 65,536 bytes of RAM)
  • Color RAM
  • Kernal ROM
  • Basic ROM
  • Character ROM
  • Bus
  • 6526 CIAs for I/O

To get a CBM 64 running, you need to implement the 6510 CPU, at least the text mode of the VIC, RAM, and some I/O for keyboard input and basic IRQ handling. Once that is done, you should only need to load the ROMs, such as the Kernal (yup, it was a Kernal at that time, the 'a' is not a typo), at the right place in memory, set the program counter to a start location, and start the CPU.

If all goes right, the emulator should start the Kernal and the Basic, and the well-known "38911 Basic Bytes Free" should appear.

Implementation overview

Memory - Bus

Memory reads and writes are done over a BUS.

Implementing Memory could be as easy as defining an array of bytes. However, because devices and ROM are mapped into the address space, it is useful to make a class that allows registering memory mapped devices and an interface to read and write memory.

I implemented this in Bus.cpp. All memory reads and writes go through the Bus, which knows about all devices that are mapped into the address space. It makes sure that reads and writes go to and from whatever is mapped at a given address.
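
The idea of registering memory-mapped devices could look roughly like the sketch below. All class and method names here (Device, SketchBus, SketchRam, ConstantDevice) are hypothetical and the real Bus.cpp may be organized quite differently. The rule that the most recently registered device wins when ranges overlap is an assumption made for this example.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical interface that every memory-mapped device implements.
class Device {
public:
    virtual ~Device() = default;
    virtual uint8_t Peek(uint16_t address) = 0;
    virtual void Poke(uint16_t address, uint8_t value) = 0;
};

// Plain RAM: a 64 KiB buffer of bytes.
class SketchRam : public Device {
public:
    uint8_t Peek(uint16_t address) override { return bytes_[address]; }
    void Poke(uint16_t address, uint8_t value) override { bytes_[address] = value; }
private:
    std::array<uint8_t, 65536> bytes_{};
};

// Stand-in for a ROM or chip: always reads the same value, ignores writes.
class ConstantDevice : public Device {
public:
    explicit ConstantDevice(uint8_t value) : value_(value) {}
    uint8_t Peek(uint16_t) override { return value_; }
    void Poke(uint16_t, uint8_t) override {}
private:
    uint8_t value_;
};

class SketchBus {
public:
    // Map a device at [first, last]; the bus does not take ownership.
    void Register(Device* device, uint16_t first, uint16_t last) {
        assert(device != nullptr && first <= last);
        mappings_.push_back({device, first, last});
    }
    uint8_t Peek(uint16_t address) {
        Device* device = Find(address);
        return device ? device->Peek(address) : 0xFF;  // 0xFF if nothing is mapped
    }
    void Poke(uint16_t address, uint8_t value) {
        if (Device* device = Find(address)) device->Poke(address, value);
    }
private:
    struct Mapping { Device* device; uint16_t first; uint16_t last; };
    // Search newest-first so later registrations overlay earlier ones.
    Device* Find(uint16_t address) {
        for (auto it = mappings_.rbegin(); it != mappings_.rend(); ++it)
            if (address >= it->first && address <= it->last) return it->device;
        return nullptr;
    }
    std::vector<Mapping> mappings_;
};
```

Registering a SketchRam over 0x0000 - 0xFFFF first and a ConstantDevice over 0xE000 - 0xFFFF afterwards gives RAM everywhere except in the top 8 KiB, where reads come from the second device. A linear search is fine for a handful of devices; a 256-entry page table would be a faster alternative.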

The following devices are registered in the Bus. Note that I only mapped one CIA, where there are actually two; one was enough to get the ASCII mode running.
  mBus = CBus::GetInstance();
  mVic = new CMOS6569();
  mRam = new CRam();
  mBasicRom = new CBasicRom();
  mKernalRom = new CKernalRom();
  mProcessor = new CMOS6510(mMutex);
  mCia1 = new CMOS6526A(mMutex);
  mCharRom = new CCharRom();


Implemented in: Ram.cpp
RAM is just a buffer of bytes, registered to the Bus from 0x0000 - 0xFFFF. If nothing else is mapped at a given address, reads and writes (peek and poke) happen on this buffer.


There are 3 ROMs in the implementation: Basic, Kernal, and Character ROM.
The idea here is that, since a ROM is just data mapped as memory, it only needs to provide reads. In CBM64 terms: Peek. And that is what the implementations do.
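
A read-only device of this kind can be very small. The sketch below is an assumption about the general shape, not the author's CBasicRom or CKernalRom code; the class name SketchRom and its methods are made up for illustration. It holds a ROM image together with the address it is mapped at, and translates a bus address into an offset inside the image.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical read-only device: an image plus the address it is mapped at.
class SketchRom {
public:
    SketchRom(uint16_t base, std::vector<uint8_t> image)
        : base_(base), image_(std::move(image)) {
        // The image must fit between its base address and the end of memory.
        assert(!image_.empty() && base_ + image_.size() <= 0x10000);
    }

    // True if the address falls inside the mapped image.
    bool Contains(uint16_t address) const {
        return address >= base_ &&
               static_cast<std::size_t>(address - base_) < image_.size();
    }

    // Read a byte; the bus address is translated to an offset in the image.
    uint8_t Peek(uint16_t address) const {
        assert(Contains(address));
        return image_[static_cast<std::size_t>(address - base_)];
    }

    // Deliberately no Poke(): this sketch offers reads only.

private:
    uint16_t base_;
    std::vector<uint8_t> image_;
};
```

An 8 KiB image constructed with base 0xE000 then answers reads for 0xE000 - 0xFFFF. On real hardware a write to a ROM address ends up in the RAM underneath; a bus implementation can forward such writes to RAM if that behaviour is needed.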

CPU, the 6510

Here most of the magic happens: this is the code that reads instructions (opcodes) and executes the necessary steps for each instruction.

There are two matrices in this class: one with all opcodes organized per address mode, and another one that holds the CPU cycles spent by the 6510 to execute each instruction. Some instructions take extra cycles in certain situations; these are counted in the instruction's implementation. For example, the branch instructions take an extra cycle if the branch is taken, and another extra cycle if the address to branch to crosses a page boundary (this is partially done in this implementation).
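
The branch timing rule can be written down as a small function. This is a sketch under the usual 6502 rules (2 base cycles, +1 if taken, +1 more if the target lies in a different 256-byte page than the instruction following the branch); the names BranchResult and Branch are hypothetical and do not come from MOS6510.cpp.

```cpp
#include <cstdint>

// Result of evaluating a relative branch: where the PC ends up and what it cost.
struct BranchResult {
    uint16_t pc;
    int cycles;
};

// nextPc is the address of the instruction after the 2-byte branch,
// offset is the signed operand byte, taken says whether the condition held.
BranchResult Branch(uint16_t nextPc, int8_t offset, bool taken) {
    BranchResult result{nextPc, 2};  // base cost: 2 cycles, PC just moves on
    if (!taken) return result;

    const uint16_t target = static_cast<uint16_t>(nextPc + offset);
    result.cycles += 1;              // branch taken
    if ((target & 0xFF00) != (nextPc & 0xFF00)) {
        result.cycles += 1;          // target is in another page
    }
    result.pc = target;
    return result;
}
```

For example, a taken branch with nextPc 0x10F0 and offset +0x20 lands at 0x1110 and costs 4 cycles, while the same branch with offset +0x05 stays in page 0x10 and costs 3.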

The main function in this class is 'Cycle()'. This reads and executes the next instruction and runs the IRQ if needed.


At start, the PC (Program Counter) is set to the address stored in the Kernal ROM at 0xFFFC (the reset vector).

r_pc  = mMemory->Peek16(0xFFFC); 

The CPU starts reading and executing instructions from that address.
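
Reading a 16-bit value such as this start address means combining two bytes. The 6510 stores the low byte first (little-endian), so a helper along the following lines would do it. This is a standalone sketch over a flat array, with a hypothetical Memory alias; the emulator's own Peek16 goes through its memory class and may differ in detail.

```cpp
#include <array>
#include <cstdint>

using Memory = std::array<uint8_t, 65536>;  // hypothetical flat memory for the sketch

// Read a little-endian 16-bit value: low byte at address, high byte at address + 1.
uint16_t Peek16(const Memory& mem, uint16_t address) {
    const uint8_t lo = mem[address];
    const uint8_t hi = mem[static_cast<uint16_t>(address + 1)];  // wraps at 0xFFFF
    return static_cast<uint16_t>(lo | (hi << 8));
}
```

With 0xE2 stored at 0xFFFC and 0xFC at 0xFFFD, Peek16(mem, 0xFFFC) yields 0xFCE2.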

CPU Cycles

The PAL version of the CBM64 CPU runs at 985 kHz and the NTSC version at 1023 kHz.

The MOS6510.cpp implementation has a Cycle() method that fetches and executes one instruction (it also runs the IRQs if needed). Just running Cycle() in an endless loop would run the processor, but because modern CPUs are fast, even taking into account the overhead of the not-so-optimal implementation of the emulated 6510 instructions, it would largely overshoot the intended 1023 kHz.

To solve this, the Cycle() method returns the number of 6510 CPU cycles used by the instruction it executed. A way to implement a somewhat accurate CPU cycle rate is to count the spent CPU cycles up to 1,023,000 (or more), check the actual time spent, sleep for the remainder of the second, and then carry over the cycles beyond 1,023,000 to the next round. This is implemented in main.cpp; it is actually a bit more granular and checks intervals of 100 ms instead of 1 second. One can make it run at smaller intervals as needed.
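
A throttling loop of this kind might look as follows. This is a sketch of the technique, not the code in main.cpp: the function RunThrottled and its parameters are hypothetical, the emulated CPU is passed in as a callable that behaves like Cycle(), and instead of measuring the elapsed time and sleeping the remainder it sleeps until an absolute deadline, which amounts to the same thing.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <functional>
#include <thread>

// Runs the emulated CPU for a number of time slices at roughly clockHz.
// cycle() executes one instruction and returns the CPU cycles it used.
// Returns the total number of emulated cycles.
uint64_t RunThrottled(const std::function<int()>& cycle,
                      int slices,
                      int64_t clockHz = 1023000,
                      std::chrono::milliseconds slice = std::chrono::milliseconds(100)) {
    const int64_t cyclesPerSlice = clockHz * slice.count() / 1000;
    int64_t budget = 0;  // cycles still allowed in the current slice
    uint64_t total = 0;
    auto deadline = std::chrono::steady_clock::now();

    for (int i = 0; i < slices; ++i) {
        budget += cyclesPerSlice;
        while (budget > 0) {
            const int used = cycle();
            assert(used > 0);  // otherwise the loop would never finish
            budget -= used;
            total += static_cast<uint64_t>(used);
        }
        // budget is now zero or negative; an overshoot is carried into the
        // next slice, which then gets correspondingly fewer cycles.
        deadline += slice;
        std::this_thread::sleep_until(deadline);  // wait out the rest of the slice
    }
    return total;
}
```

A call such as RunThrottled([&] { return cpu.Cycle(); }, 10) would then emulate about one second of NTSC CPU time in ten 100 ms slices, where cpu stands for whatever object provides Cycle(). Using a steady clock and absolute deadlines keeps small sleep inaccuracies from accumulating.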