Return Styles: Pseud0ch, Terminal, Valhalla, NES, Geocities, Blue Moon. Entire thread

VM instruction implementation without huge switch()?

Name: Anonymous 2014-06-09 23:26

How would one go about making a virtual machine without using a huge and clumsy switch( opcode ) statement?

The architecture I'm implementing (DCPU-16) has about 35-40 different instructions.

Name: Anonymous 2014-06-09 23:41

Make a data structure with an opcode's code and a function pointer to the opcode's operation.
Sort an array of those structures by opcode.
Every time you get an opcode at runtime, do a binary search on the array and, if the opcode is found, call the function pointer with the opcode's arguments (if any). If the opcode is not found, you know what to do.

Name: Anonymous 2014-06-09 23:59

switch runs in constant time, >>2-san's binary search runs in logarithmic time.

Don't listen to >>2-san and go for a switch.

Name: Anonymous 2014-06-10 0:13

Store an array for function pointers indexed by the opcode which calls them.
#define EAT 0x00
#define FUCK 0x01
#define DIE 0x02
#define INSTRUCTIONS 0x03
void eat(){}
void fuck(){}
void die(){}
int main(int a, char ** b){
void (*i[3])();
i[EAT]=eat; i[FUCK]=fuck; i[DIE]=die;
while(peakNextInstruction() < INSTRUCTIONS){
i[popNextInstruction()]();
}
return -1;
}

Name: Anonymous 2014-06-10 0:35

>>1

Yes. You can translate opcode into x86 opcode, write it into array and execute directly. Be careful with self modifying code.

Name: Anonymous 2014-06-10 1:48

>>4
eat fuck die
And having the mind of a 13 year old is also a sign of apping ability.

Name: Anonymous 2014-06-10 1:50

>>3
switch is not constant. Its upper bound is N. If the opcode being executed is the last opcode in your switch, you need to make N, or in this case 40, comparisons, whereas lg(40) = 5.

A switch would be better if you know that certain opcodes will be executed from frequently than others, in which case you can order your switch by the frequency of the opcode.

Name: Anonymous 2014-06-10 1:53

>>7
http://en.wikipedia.org/wiki/Branch_table
switch is O(1) if your compiler isn't retarded.

Name: Anonymous 2014-06-10 1:56

>>8

You should really inspect resulting assembly, before making any dicision.

Name: Anonymous 2014-06-10 1:59

>>9
You should inspect my anus before making any incision.

Name: Anonymous 2014-06-10 3:38

>>8-10
it's funny because while switch can be very easily implemented O(1) or very close to it (ex O(log n)) MS Visual C++tm Enterprise edition uses O(n)

>>1
How would one go about making a virtual machine without using a huge and clumsy switch( opcode ) statement?
you can use a array with function pointers but I remember some other ways too
another way is to have a huge and clumsy switch but let the preprocessor handle it (just like I did)
I had made a script in bash and with the help of the preprocessor I had made it so it will map every opcode in a switch statement to a function or some lines of code while I do not give it a list of the functions (the were in another file and the list was generated by the script in the stile X(functionName, something, somethingElse) and was used by many parts of my program
sorry if I am not understandable, I am sleepy

Name: Anonymous 2014-06-10 4:55

Can you guys explain how switch can be implemented in O(1)? Not only I stand in disbelief of this fact, I also don't understand the wikipedia page.

Name: Anonymous 2014-06-10 5:09

>>11
Sure, you save a lot of space on transient switches that way.

>>12
The input to the switch effectively becomes an index into a lookup table of branch targets. Array lookup is O(1), as is the branch.

Name: Anonymous 2014-06-10 6:46

>>9
gcc -O2 with sequential case values and more than 3 cases will generate a jump table. If you care so very much you could also use an explicitly computed goto (assuming your compiler supports it).

Using a call table (i.e., with function pointers) is not a good idea for simple opcodes since most compilers are not smart enough to inline any of the called functions once their address is taken (even if scope of the functions and the resulting address is non-global).

tl;dr get off your high horse and use a switch/case. If it's good enough for bpf-jit it's good enough for you.

Name: L. A. Calculus !jYCj6s4P.g 2014-06-10 7:17

WHO TAUGHT U RETOIDS HOW 2 DESIGN PROGRAMS?

#include <stdio.h>
#include <stdlib.h>

enum {
NOPERATIONS = 40
};

enum {
SUCCESS, CONTINUE, ARITY, INVALID_OPCODE
};

struct cpu {
unsigned char *mem;
int ip;
int size;
/* ... */
};

int nop(struct cpu *cp, const unsigned char *p, int n)
{
return 1;
}

int set(struct cpu *cp, const unsigned char *p, int n)
{
unsigned index;

if (n < 5)
return -ARITY;
index = p[1] << 8 | p[2];
cp->mem[index] = p[3];
cp->mem[index + 1] = p[4];
return 5;
}

int (*operations[NOPERATIONS]) (struct cpu *, const unsigned char *, int) = {
nop, set /* , ... */
};

int cpu_tick(struct cpu *cp)
{
int n, op;

if (cp->ip == cp->size)
return SUCCESS;

op = cp->mem[cp->ip];
if (op >= NOPERATIONS || operations[op] == NULL)
return INVALID_OPCODE;

if ((n = operations[op](cp, &cp->mem[cp->ip], cp->size - cp->ip)) < 0)
return -n;

cp->ip += n;
return CONTINUE;
}

int main(void)
{
unsigned char b[] = { 0, 0, 1, 0, 0, 69, 96, 0 };
struct cpu cpu;
int e;

cpu.mem = b;
cpu.ip = 0;
cpu.size = (int) (sizeof b / sizeof b[0]);

while ((e = cpu_tick(&cpu)) == CONTINUE)
;

switch (e) {
case INVALID_OPCODE:
fprintf(stderr, "invalid opcode: %d\n", cpu.mem[cpu.ip]);
return EXIT_FAILURE;
case ARITY:
fprintf(stderr, "invalid arity to op %d\n", cpu.mem[cpu.ip]);
return EXIT_FAILURE;
case SUCCESS:
printf("%d %d\n", b[0], b[1]);
return 0;
default:
fprintf(stderr, "TELL DA RETOID WHO WROTE DIS EMULATOR 2 REED"
" DA FUKIN STANDARD.\n");
return EXIT_FAILURE;
}
}

Name: L. A. Calculus !jYCj6s4P.g 2014-06-10 7:27

O BTW, I WUD HAV COMMENTED IT, BUT I WAS AFRAID DAT IF ID HAVE DONE DAT, U RETOIDS WUD START BITCHING ABOUT HOW COMMENTS DELAY DA COMPILATION TIME BY A FEW FUKIN NANOSECONDS.

ROFL

Name: Anonymous 2014-06-10 9:18

>>15
struct cpu {
[…]
int ip;
That was NSA QUALITY!

Name: Anonymous 2014-06-10 9:35

>>6
Kill yourself. I'm a much better programmer than you, and you know it.

Name: Anonymous 2014-06-10 9:41

>>15
int size;
I love size_t more than you stackboi

Name: Anonymous 2014-06-10 9:44

seize my anus

Name: L. A. Calculus !jYCj6s4P.g 2014-06-10 9:46

>>17
DATZ MY INSTRUCTION POINTER

Name: Anonymous 2014-06-10 10:31

enum {
NOPERATIONS = 40
};

What the fuck is the point of that enum?

Name: Anonymous 2014-06-10 11:22

>>22
It's a Jewish thing. You're not meant to understand it.

Name: Anonymous 2014-06-10 14:23

// Only the chosen are expected to understand this.

Name: Anonymous 2014-06-10 16:46

>>22
Number of operators I think

Name: L. A. Calculus !jYCj6s4P.g 2014-06-10 16:59

>>22
DER ISNT ONE. HOW DO U LIKE DAT?

Name: Anonymous 2014-06-10 19:38

You're writing in assembly, you don't have to be limited to the constraints of C.

http://www.complang.tuwien.ac.at/forth/threaded-code.html

Name: Anonymous 2014-06-10 19:57

>>27
Oops, I misread the OP. Thought they were writing a virtual machine in DCPU-16 assembly, but they were actually talking about DCPU-16 emulator for PCs.

Yeah, a switch statement is the best you can do in portable C. You can get something close to token-threaded code if you use the gcc labels-as-values extension, but then it won't be portable to all C compilers, and the code generated by even the best C compiler is going to be shit compared to a naive assembly implementation.

Name: Anonymous 2014-06-11 4:15

>>27-28
Your link is cool so I forgive you

Name: Anonymous 2014-06-11 4:26

>>19
In the beginning there were only int and char,

>>25
I find that sort of usage annoying since it will trigger "enum constant not covered" warnings in switch statements. One of many annoyances with enums in C... (I'll confess I (ab)use them frequently, solely because it makes it easy to traverse definitions with ctags.)

I prefer this ENTERPRISE QUALITY ENUM HANDING:

#include <errno.h>

typedef enum { /* yes typedef, I hate typing "enum" */
FOO,
BAR,
BAZ,
QUX,
} meta meta_t; /* and the POSIX committee can bite me */

static void do_foo();
static void do_bar();
static void do_baz();
static void do_qux();

/* ... */

int dispatch_meta(meta_t val)
{
switch (val) {
case FOO:
return do_foo();
case BAR:
return do_bar();
case BAZ:
return do_baz();
case QUX:
return do_qux();
}

return -EINVAL;
}

Name: L. A. Calculus !jYCj6s4P.g 2014-06-11 5:19

>>30
YAINT RED DA FUKIN STANDARD.

return do_foo();

do_foo DOESNT RETURN ANYTHING. I DNUNO WAT PLANET UR FROM BUT U KANT CONVERT NOTHING TO A FUKIN int!

typedef enum { ... } meta meta_t;

EEEEH EEEEEEEH EEEEEEEEH EEEEEEEEEH

U MEAN typedef enum { ... } meta_t;

I DUNNO WHO TAUGHT U HOW 2 PROGRAM BUT NEXT TIME U SHUD AT LEAST RUN UR PROGRAM THROUGH DA FUKIN COMPILER

N DATZ NOT ME TALKIN DATZ BARNEY FUKIN STROOTSTROOP! (DA DANISH ONE, JUNIOR)

Name: Anonymous 2014-06-11 5:32

>>30
In the beginning there were only int and char,
epic xD

return -EINVAL
epic xD

Name: Anonymous 2014-06-11 5:34

>>31
So goes freehanding before coffee. Please accept this patch:

--- a/foo.c 2014-06-11 00:28:16.679040075 -0700
+++ b/foo.c 2014-06-11 00:28:38.001086477 -0700
@@ -1,16 +1,16 @@
#include <errno.h>

-typedef enum { /* yes typedef, I hate typing enum */
+typedef enum meta { /* yes typedef, I hate typing enum */
FOO,
BAR,
BAZ,
QUX,
-} meta meta_t; /* and the POSIX commitee can bite me */
+} meta_t; /* and the POSIX commitee can bite me */

-static void do_foo();
-static void do_bar();
-static void do_baz();
-static void do_qux();
+static int do_foo();
+static int do_bar();
+static int do_baz();
+static int do_qux();

int dispatch_meta(meta_t val)
{

Name: Anonymous 2014-06-11 5:35

>>31
good post. keep it up, nigger.

Name: Anonymous 2014-06-11 5:39

>>1
You can do function pointers or threaded code

Threaded code basically jumps to the code that implements the next instruction directly without a dispatch.

prog = { op_add, op_add, op_sub, ... };
op_add:
reg_a = reg_b + reg_a;
goto *ins_stream++;

op_sub:
reg_a = reg_a - reg_b;
goto *ins_stream++;

etc...

requires gcc's labels as values extensions to implement in C. You can use function pointers for the same idea, but it might be slower.

Name: Anonymous 2014-06-11 5:43

>>35
indirectly*

Name: Anonymous 2014-06-11 5:56

>>6
Your just mad because youll never get 2 fuk a 2hu; btw wich 2hu wud u fuk?

Name: Anonymous 2014-06-11 7:55

>>29
One of the articles linked at the end is the one that led me to FORTH SATORI. Make sure to check it out too:
http://www.bradrodriguez.com/papers/moving1.htm

Name: Anonymous 2014-06-11 12:26

>>37

epic meme, /g/ro, shalom.

Name: Anonymous 2016-10-24 15:24

>>1
You can replace the switch block with if...else if...else if...else, but that's not really a solution. Another option is to implement the instructions as functions and use an array of function pointers (i.e. opcode 5 is the function stored in instructions[5]).

Essentially though, the construct you want is the old ON OPCODE GOTO 100, 200, 300, 400, 500, and the switch statement is the closest thing C has to that.

Newer Posts
Don't change these.
Name: Email:
Entire Thread Thread List