The Chifir Virtual Machine
A simple project to change the mood while I was in the middle of grinding webdev stuff.
2022.7.4:
Somehow I forgot to link the repository here. Here you go: https://git.sr.ht/~bctnry/chifir
2024.7.22:
New repository: https://codeberg.org/bctnry/chifir
Chifir is a virtual machine described in the paper "The Cuneiform Tablets of 2015" by Long Tien Nguyen and Alan Kay at Viewpoints Research Institute. In brief, Chifir is a part of the "Cuneiform" system, which is designed to preserve programs for decades to come.
The virtual machine itself is extremely simple:
- Each word is 32 bits.
- A memory M of 2097152 words. (2097152 * 4 = 8388608 bytes = 8 MBytes)
- A program counter PC with a length of 32 bits.
- A memory-mapped 512 * 684 black & white bitmap display:
- Each pixel is represented as a single 32-bit word.
- The display memory starts at address 1048576
- Each instruction is made up of 4 words: one operator and three operands, named A, B and C respectively.
- The instruction set is defined as follows:
1. PC ← M[A]
2. If M[B] = 0, then PC ← M[A]
3. M[A] ← PC
4. M[A] ← M[B]
5. M[A] ← M[M[B]]
6. M[M[B]] ← M[A]
7. M[A] ← M[B] + M[C]
8. M[A] ← M[B] - M[C]
9. M[A] ← M[B] × M[C]
10. M[A] ← M[B] ÷ M[C]
11. M[A] ← M[B] modulo M[C]
12. If M[B] < M[C], then M[A] ← 1, else M[A] ← 0
13. M[A] ← NOT(M[B] AND M[C])
14. Refresh the screen
15. Get one character from the keyboard and store it in M[A]
All operands are treated as unsigned 32-bit integers. When a result exceeds the 32-bit maximum, the higher bits are ignored (i.e. the result is taken modulo 2^32).
PC is increased by 4, except for instructions 1 and 2, where PC is assigned directly.
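To make the description above concrete, here is a minimal sketch of a single-step interpreter in Python. It is not the code from the repository. The class name `Chifir`, the `refresh` and `getchar` callbacks, the "unknown opcode halts" behaviour and the "instruction 2 falls through to PC + 4 when M[B] is not 0" behaviour are all assumptions made for this example, and there is no bounds checking or division-by-zero handling.

```python
MEM_WORDS = 2097152   # size of M in 32-bit words
MASK = 0xFFFFFFFF     # results are kept modulo 2^32


class Chifir:
    """Sketch of the VM core; display and keyboard are stubbed out as callbacks."""

    def __init__(self, program=(), refresh=None, getchar=None):
        self.m = [0] * MEM_WORDS
        self.m[:len(program)] = list(program)         # load program at address 0
        self.pc = 0
        self.refresh = refresh or (lambda vm: None)   # hypothetical display hook
        self.getchar = getchar or (lambda: 0)         # hypothetical keyboard hook

    def step(self):
        """Execute one instruction; return False on an unknown opcode (assumed halt)."""
        m = self.m
        op, a, b, c = m[self.pc:self.pc + 4]
        next_pc = (self.pc + 4) & MASK
        if op == 1:
            next_pc = m[a]
        elif op == 2:
            if m[b] == 0:
                next_pc = m[a]
        elif op == 3:
            m[a] = self.pc
        elif op == 4:
            m[a] = m[b]
        elif op == 5:
            m[a] = m[m[b]]
        elif op == 6:
            m[m[b]] = m[a]
        elif op == 7:
            m[a] = (m[b] + m[c]) & MASK
        elif op == 8:
            m[a] = (m[b] - m[c]) & MASK
        elif op == 9:
            m[a] = (m[b] * m[c]) & MASK
        elif op == 10:
            m[a] = m[b] // m[c]
        elif op == 11:
            m[a] = m[b] % m[c]
        elif op == 12:
            m[a] = 1 if m[b] < m[c] else 0
        elif op == 13:
            m[a] = ~(m[b] & m[c]) & MASK
        elif op == 14:
            self.refresh(self)
        elif op == 15:
            m[a] = self.getchar() & MASK
        else:
            return False
        self.pc = next_pc
        return True


# One ADD instruction at address 0, followed by four zero words (assumed halt),
# followed by the data words at addresses 8, 9 and 10.
vm = Chifir([7, 8, 9, 10,  0, 0, 0, 0,  0, 0xFFFFFFFF, 2])
vm.step()
print(vm.m[8], vm.pc)   # the sum wraps around modulo 2^32
```

The `& MASK` after addition, subtraction, multiplication and NOT is what gives the "higher part is ignored" behaviour, since Python integers do not overflow on their own.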
The input is the most confusing bit. Because it was originally intended to preserve Smalltalk-72, the ASCII set is different:
- \n is "Up" and \r is "Enter". I am not well-versed in the original Smalltalk-72, so "Up" is mapped to the up arrow key and "Enter" to the enter key. This can easily be changed to "Enter key = \n", "Enter key = \r\n" or whatever you like.
- 33 is the exclamation mark ! but in Chifir it's the "Return" symbol (a big upward arrow with no fill). It's the good old return construct that we all know. But in later Smalltalk ^ is used as the return symbol, and ^ is mapped to another symbol in Chifir, so I'm not sure which to use.
- 34 is the double quote " but in Chifir it's the "Hand", which is like a quote in LISP, used to signify that "this is a symbol".
- 37 is the percent sign % but in Chifir it's the "Eye", which in Smalltalk-72 means "see", as in "do this when you see this message".
- 38 is the ampersand sign & but in Chifir it's the ○ symbol (a white box symbol in the original Smalltalk-72 manual). It's used as the "bitwise logical operation prefix" symbol, e.g. + is the normal addition and &+ is the bitwise OR, * is the normal multiplication and &* is the bitwise AND.
- 63 is the question mark ? but in Chifir it's the "Right" symbol. It's the conditional statement (i.e. the if-then-else construct). The conditional statement in Smalltalk-72 has the form condition [Right] (then-clause) else-clause; it's probably a direct influence from McCarthy60. It's also used in method definitions. For example:
  to box var | x y size tilt
  (
    [Eye]draw [Right] ([Smile] place x y turn tilt. square size.)
    [Eye]undraw [Right] ([Smile] white. SELF draw. [Smile] black)
    [Eye]turn [Right] (SELF undraw. [Hand]tilt ← tilt + :. SELF draw.)
    [Eye]grow [Right] (SELF undraw. [Hand]size ← size + :. SELF draw.)
    isnew [Right] ([Hand]x ← [Hand]y ← 256. [Hand]size ← 50. [Hand]tilt ← 0. SELF draw)
  )
  or, if you want a much easier time to read:
  to box var | x y size tilt
  (
    %draw ? (@ place x y turn tilt. square size.)
    %undraw ? (@ white. SELF draw. @ black)
    %turn ? (SELF undraw. "tilt _ tilt + :. SELF draw.)
    %grow ? (SELF undraw. "size _ size + :. SELF draw.)
    isnew ? ("x _ "y _ 256. "size _ 50. "tilt _ 0. SELF draw)
  )
The %draw, %undraw etc. are actually conditions; isnew is the condition "whether a new object is being created", so the then-clause of this condition is the constructor.
The "?" does look like asking about the name of the message (e.g. %draw? = "is the message draw?"). I have no idea if this is a coincidence or not.
- 64 is the at sign @ but in Chifir it's the "Smile". The "Smile" represents the turtle object. Smalltalk-72 is very different from Smalltalk-80; it's more like "LOGO but OOP".
- 94 is the circumflex sign ^ but in Chifir it's the upward arrow ↑. Now, the upward arrow does not exist in the Smalltalk-72 manual (only the "thicc" upward arrow, which is the return construct), and it probably does not exist in Smalltalk-76 either. I thought it was the super construct (as in inheritance) but that does not exist in Smalltalk-72.
- 95 is the underscore sign _ but in Chifir it's the left arrow ←. It's mainly used to assign stuff (combined with "Hand"), so in Smalltalk-72 it would be something like [Hand] d ← 3 (or "d _ 3 if we directly translate the left arrow as _).
- 96 is the backtick sign ` but in Chifir it's the unary minus, the same construct as the APL high minus symbol ¯.
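The remapping in the list above can be written as a small lookup table. This is only an illustrative sketch: the names `CHIFIR_GLYPHS` and `to_glyphs` are made up for this example, the bracketed name `[Return]` is a stand-in for the hollow upward arrow, and `¯` is used for the unary minus purely for display.

```python
# ASCII code -> what Chifir shows for it, following the list above.
CHIFIR_GLYPHS = {
    33: "[Return]",  # '!'  hollow upward arrow; bracket name is a stand-in
    34: "[Hand]",    # '"'
    37: "[Eye]",     # '%'
    38: "○",         # '&'
    63: "[Right]",   # '?'
    64: "[Smile]",   # '@'
    94: "↑",         # '^'
    95: "←",         # '_'
    96: "¯",         # '`'  unary minus, displayed here as the APL high minus
}


def to_glyphs(text: str) -> str:
    """Rewrite plain-ASCII Smalltalk-72 source using the Chifir glyph names."""
    # str.translate accepts a dict mapping code points to replacement strings.
    return text.translate(CHIFIR_GLYPHS)


print(to_glyphs('%undraw ? (@ white. SELF draw. @ black)'))
print(to_glyphs('"d _ 3'))
```

Going the other way (glyph names back to ASCII) would need a small tokenizer, because the bracketed names are longer than one character.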
Some constructs in the original Smalltalk-72 are lacking here, e.g. the thicc colon (the normal colon gets the next value in the message evaluated; the thicc colon gets the next literal token), the keyhole (I haven't read enough of the manual to know what it's for, but it's probably some kind of inspection utility), and the 's symbol (the subscript construct, the . in A.B; but that can be unified with methods, e.g. à la Io, so there should be no problem).
Yeah, if you want to preserve programs you can definitely do a lot better than this...
(BTW I haven't heard about Project Oberon being treated as a permacomputing-related project & with Project Oberon you got a whole computing stack as well.)
About the Python version
- You need to install PySDL2.
- Took me about 3 hours? I lost count. Half of the 3 hours was spent on testing things out. The "afternoon hack" part is definitely true.
- The HOTAREA part is there because I don't want to refresh the whole screen every time instruction 14 is executed.
- The file reading part is not tested.
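One common way to avoid redrawing the whole screen is to track a "dirty rectangle": remember the bounding box of every display word written since the last refresh and redraw only that. The sketch below illustrates that general idea and is not the actual HOTAREA code from the repository. The class name `DirtyRect` is hypothetical, and the row-major layout with 512 words per row is an assumption about how the display memory is arranged.

```python
DISPLAY_BASE = 1048576        # first word of display memory
WIDTH, HEIGHT = 512, 684      # assumed: 512 pixels per row, 684 rows, row-major


class DirtyRect:
    """Bounding box of all display words written since the last clear()."""

    def __init__(self):
        self.clear()

    def clear(self):
        self.box = None       # (x_min, y_min, x_max, y_max) or None if clean

    def mark(self, address):
        """Record a write to M[address]; writes outside the display are ignored."""
        offset = address - DISPLAY_BASE
        if not 0 <= offset < WIDTH * HEIGHT:
            return
        y, x = divmod(offset, WIDTH)
        if self.box is None:
            self.box = (x, y, x, y)
        else:
            x0, y0, x1, y1 = self.box
            self.box = (min(x0, x), min(y0, y), max(x1, x), max(y1, y))

    def rect(self):
        """Return (x, y, width, height) of the area to redraw, or None."""
        if self.box is None:
            return None
        x0, y0, x1, y1 = self.box
        return (x0, y0, x1 - x0 + 1, y1 - y0 + 1)


dirty = DirtyRect()
dirty.mark(DISPLAY_BASE + 3)                # pixel (3, 0)
dirty.mark(DISPLAY_BASE + 2 * WIDTH + 10)   # pixel (10, 2)
print(dirty.rect())
```

In an interpreter loop, `mark` would be called from every instruction that writes to memory, and instruction 14 would redraw `rect()` and then call `clear()`.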
About programming in this VM
- JMP feels kinda weird, because to do JMP [some-address] the simplest way is to do M[PC] = 1; M[PC+1] = a+2; M[PC+2] = [some-address] (where a is the address of the instruction, i.e. the current PC), because M[PC+1] is the A part, so instead of PC ← M[A] it's actually PC ← M[M[PC+1]], while with other common machines you don't have this M[a+1]=a+2 indirect stuff.
- Conditional branching feels even weirder, because you have to use instruction 12 and instruction 2 together, so if you want to do IF x < y THEN GOTO 1000, it would need to be like this:
  X: x
  Y: y
  T: [ignored]
  PC+0: IF M[M[PC+2]] < M[M[PC+3]] THEN M[M[PC+1]] ← 1 ELSE M[M[PC+1]] ← 0
  PC+1: T
  PC+2: X
  PC+3: Y
  PC+4: IF M[M[PC+6]] == 0 THEN PC = M[M[PC+5]]
  PC+5: PC+7
  PC+6: T
  PC+7: 1000
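where X, Y and T are addresses different from PC+0~7. Note that instruction 2 jumps when M[B] is 0, so as written this sequence jumps to 1000 when x < y is false; to jump when x < y holds, the zero test would have to skip over an unconditional jump instead (or T would have to be inverted first). To be honest, this kind of indirect maneuvering is kinda killing me.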
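The two encodings above can be checked with a few lines of Python. This sketch spells out the memory words for the JMP and for the compare-then-branch listing, and runs them on a tiny stepper that only knows opcodes 1, 2 and 12. The helper names (`run`, `jmp`, `compare_and_branch`) and the concrete addresses are made up for this example, and treating any other opcode as "stop" is an assumption.

```python
def run(mem, pc, max_steps=10):
    """Run opcodes 1, 2 and 12 from pc; stop on any other opcode and return PC."""
    for _ in range(max_steps):
        op, a, b, c = (mem.get(pc + i, 0) for i in range(4))
        if op == 1:                       # PC <- M[A]
            pc = mem[a]
        elif op == 2:                     # if M[B] = 0 then PC <- M[A]
            pc = mem[a] if mem[b] == 0 else pc + 4
        elif op == 12:                    # M[A] <- 1 if M[B] < M[C] else 0
            mem[a] = int(mem[b] < mem[c])
            pc += 4
        else:
            break
    return pc


def jmp(a, target):
    """JMP target placed at address a: the A operand points at the B slot."""
    return {a: 1, a + 1: a + 2, a + 2: target}


def compare_and_branch(a, X, Y, T, target):
    """The eight words PC+0..PC+7 of the listing above, with PC+0 = a."""
    return {a: 12, a + 1: T, a + 2: X, a + 3: Y,
            a + 4: 2, a + 5: a + 7, a + 6: T, a + 7: target}


print(run(jmp(100, 1000), 100))

X, Y, T = 10, 11, 12                      # arbitrary data addresses
for x, y in [(1, 2), (5, 2)]:
    mem = compare_and_branch(100, X, Y, T, 1000)
    mem.update({X: x, Y: y})
    print(x, y, run(mem, 100))
```

Because instruction 2 branches on zero, the run with x = 1, y = 2 falls through to address 108, while the run with x = 5, y = 2 ends up at 1000.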