Saturday 15 August 2015

Introducing Emu!

Background

Once in a while, I end up talking to young coders who are either about to graduate from a programming (or games programming) course, or who have recently graduated. One thing that worries me a bit is that they don't seem to have the depth of knowledge about how computers actually work that I'd expect.

It's a long time since I graduated, but I think my understanding of things at that point looked something like this:


Whereas your average graduate seems to have a picture more like this:

Now, obviously I'm exaggerating a bit, but the point is, most graduates don't have much of an understanding of what happens between them pressing F5 (or, God forbid, Unity's Play button), and stuff appearing on the screen.

Universities just aren't teaching this stuff. They don't talk about the CPU and the memory bus, caches and registers, interrupts and IO. At the risk of going into Grumpy Old Man mode, one of my first courses at university involved writing assembler for a computer that we had to make ourselves out of a breadboard, a Z80 CPU, some RAM, an EEPROM and a load of wires and other bits, to make it write text to an oscilloscope. With that experience under our belts, understanding higher level concepts was more instinctive, since the mystery had been taken out of it. Also, it was incredibly rewarding, and a lot of fun.

I'd really like to find a way to give new grads a similar experience.

The Plan

Unfortunately, it's not really practical to expect grads to build their own Z80 based computer. It's expensive, time consuming and insanely error prone. 

Writing some Z80 assembler on an emulator is a little more realistic. It's a much simpler bit of hardware than a modern CPU, but lots of the same concepts still apply. However, it's got some very rough edges that aren't really that instructive: It's mostly 8-bit except the bits that are kind of 16-bit if you squint a little. It's got a really small register set. It's sort of accumulator based, but sort of not.

It's also lacking the ability for us to explain more modern concepts. Caches, for example, are a really important concept that aren't really represented.

So, how about we invent a new machine and write an emulator for it. It can have an instruction set like the Z80, but with some of the rough edges smoothed off. It can be extensible and configurable in whatever way we want to make it a useful teaching aid.

That's what I'm working on, and I'd really like your feedback.

The Machine

Emu (if anyone can think of a better name, I'd very much like to hear it) is a program that takes some assembly code in a text file and executes it on a virtual machine. It's run from the command line, outputs to the console, and opens up a display window if one is specified in the config.

The machine is currently configurable in the following ways:
  • Number of registers
  • Amount of RAM
  • Stack size (can be 0) @cravo pointed out this is unnecessary. I'll change it so the SP isn't initialised, and if you want a stack you set it yourself.
  • Screen size (and whether there's a screen at all)
  • Whether there's a keyboard attached
  • Cycle count for a memory read/write
In the future, I'll extend this with lots of other options: data cache size and cost, more IO devices (mouse, files), multiple CPU support, etc, etc.

The instruction set is inspired by the Z80, but with some simplifications:
  • Registers are all 32 bit
  • Memory is word-addressed.
  • Instructions are fixed size
  • The same instructions can operate on registers and literals (i.e. "set r0 10" and "set r0 r1" are both valid, rather than having separate set immediate instruction).
  • Errors are reported in a friendly way, rather than the machine just crashing.
  • There are special debugging functions for outputting to the console
  • Interrupts jump to a named label (e.g. "vsync"), rather than a specific address
A simple program might look like this:

# Starting from address 100, keep adding numbers until 0 is found
# $alias is used to give registers or literals friendly names
# Registers are named r0, r1, r2...

# Start by aliasing some registers we're going to use
$alias START 100
$alias INDEX r1
$alias ACCUM r2
$alias TEMP r3

set ACCUM 0           #Set accum to 0
set INDEX 0           #Set index to 0

loop:
ldo TEMP START INDEX  #Load with offset into r3 from start + index
jeq end TEMP 0        #Jump to end if TEMP is 0
add ACCUM ACCUM TEMP  #Add temp to accum and store in accum
add INDEX INDEX 1     #Increment index
jmp loop              #Loop

end:
print Sum is ACCUM    #print is a bit special...
quit                  #Not strictly necessary

The configuration of the machine is done via a Lua script (inspired by TIS-100). As well as the configuration, you can also specify Lua functions to be run before and after execution, which have the ability to read and write to memory and registers. This allows challenges to be set up, for example a Sort challenge, where the first 100 words of memory are loaded with random numbers before execution, and the correct order is checked after execution.

Lesson Plan

All of the above can be broken down into bite-sized tutorials that explain the concepts. At the moment, I'm thinking it'd look something like this:

Lesson 1: The Basics of the CPU

At this point the machine has no RAM, no screen, no input. It's just a bunch of registers with some basic instructions. We'd talk about the basics of assembly, registers and literals, and the program counter (probably avoiding talk of aliases for now).

We'd start with simple linear programs, e.g.
  • Register r0 and r1 are preloaded with values. Add them together into r3 and print them
Then we could get into flow control and comparisons:
  • r0 and r1 are preloaded with values. Set the higher in r2 and the lower in r3

Lesson 2: RAM

A small amount of RAM is added to the machine. We talk about addressing memory in order to read and write. The examples and challenges are similar to the code to add a sequence of numbers from earlier.

Lesson 3: The Stack, Functions and Aliases

We add a stack to the machine, and explain pushing and popping and the stack pointer. We'd start by using the stack to read and write data (maybe reversing a sequence of numbers), before going onto how we can use the stack to back up the PC and how that allows us define functions.

We'll probably introduce aliases at this point, just to keep things manageable.

Lesson 4: Memory Mapped Devices

At this point we probably want to do something a little more visual. We'll add a screen to the device, with a memory-mapped screen buffer. The example is probably something like writing a function that draws a box.

Lesson 5: Interrupts

Interrupts are illustrated using the "vsync" which is implemented by the screen. We'll explain how these work (a jump to a label which could happen at any time, with the previous PC being pushed to the stack), and explain about disabling them while they're handled. 

At this point, we can do some moving images on screen. Maybe just scrolling a pixel, or something simple like that.

Lesson 6: IO

IO is handled with "in" and "out" instructions, and we'll explain the difference between these and the "load" and "store" instructions that we've already used. The example input device at this stage is the keyboard, so we can upgrade our previous scrolling pixel example to be a pixel that you can move around with the cursor keys.

For output, maybe we have a "Beep" device, which beeps when you hit the edge of the screen.

Future Lessons

All of the above already works in the current version of the emulator, but there's lots more I'd like to do. Some examples:
  • Serial IO (loading files, etc.)
  • Memory caching (explain how expensive memory access is, and how caching alleviates that)
  • Multiple processors (test-and-set operations for synchronisation)
  • Self-Modifying code (this isn't implemented right now)
  • Compilers (what some simple C would compile down to) (There could also be a separate course on writing a compiler)
  • Operating Systems (Can we make a trivial OS?)
  • Advanced algorithms (implement Bubble-sort vs. Quicksort, etc)
  • Branch prediction

Feedback!

That's about all I've got in my head right now, and there's lots of things that I'm not sure about:
  • Is the problem even actually a problem, or is this all irrelevant
  • Is the machine too fictionalised to be useful?
  • Is the order and content of the lessons about right?
  • Is there stuff missing?
If you've got thoughts on any of that, please let me know either by email or in the comments (if I can figure out how to turn them on...)

Also, if you'd like to have a play around with the emulator, give me a shout! It's still a little flaky, but there's a lot you can do with it (I've got loads of examples or drawing stuff, sorting challenges, with a quicksort implementation example), and there's enough there that you could write something like Snake or Pong without too much trouble.

Thanks for reading!

Acknowledgements

Massive thanks to Chris Cummings for all of his input.
Thanks to Zachtronics, creators of TIS-100, which got me thinking about all this.

No comments:

Post a Comment