Return Styles: Pseud0ch, Terminal, Valhalla, NES, Geocities, Blue Moon.

Pages: 1-4041-

How to compile arithmetic expressions to x86?

Name: Anonymous 2015-07-10 21:16

hi prog, say I want to take as given some variables on the stack x,y,z,w and compile expressions (x+y)*(z+w) to x86 asm.

first off the instructions take registers or pointers and well x86 is weird and has instructions like this:

add x, y ;x OR y must be a register, result in x
mul x ;multiplies eax with x, result is put into eax


so this would work:

mov eax, stack0
add eax, stack1
mov ebx, stack2
add ebx, stack3
mul ebx


but how would you make an algorithm to do this in general?

Name: Anonymous 2015-07-10 23:57

This is not the only way, but this is the traditional way. Introduce temporaries to generate code like the following.

t1 = x + y
t2 = z + w
t3 = t1*t2


Each node of the expression tree becomes a variable. You can also locate common subexpressions if you wish.

Then assign the variables to registers and stack positions and generate assembly like you wrote.

Name: Cudder !cXCudderUE 2015-07-11 1:23

some variables on the stack
The question you should be asking yourself is why aren't they in the registers already?

If you don't need x again (or one of the other ones, similarly), this uses one less register:
mov eax, [ebp+_y]
add [ebp+_x], eax
mov eax, [ebp+_z]
add eax, [ebp+_w]
mul [ebp+_x]


The general algorithm depends on how many of those values will need to be used after that single expression. It produces a dataflow graph and then matches the available instructions against it.

Name: Anonymous 2015-07-11 1:47

Please marry me, Cudder-sama ;_;

Name: Anonymous 2015-07-11 1:48

>>3
>>4
fuck off and kill yourself

Name: Anonymous 2015-07-11 1:55

>>3
That doesn't help you if you need to pass more values than there are registers, which is stupidly common on your beloved x86 because it only has a handful. It is also not fun at all to debug code for architectures with limited registers because an optimal register scheduling scheme will destroy even very recent and useful intermediate values with great eagerness.

Sure, you can always dial back the optimization level in your code generator or use tracing features, but on non prematurely optimized ISAs you aren't forced into doing this immediately for the most trivial cases. With x86, you have to reach for a bazooka to swat a fly.

Name: Anonymous 2015-07-11 4:42

>>4
fuck off cockring-nigger.

Name: Anonymous 2015-07-11 13:28

>>5
fuck off and optimize your quotes

Name: Cudder !cXCudderUE 2015-07-11 13:41

>>6
If you don't need to use memory then don't. Start using memory only when it doesn't all fit into registers - and arrange things suitably so that the values that are used first/most are the ones that get chosen for registers first. That's all there is to it.

It is also not fun at all to debug code for architectures with limited registers because an optimal register scheduling scheme will destroy even very recent and useful intermediate values with great eagerness.
I have no idea WTF you're complaining about. "It's too hard to debug"? It should be. This is not BASIC. If you're working at Asm level and worrying about debugging then YOU ARE DOING IT WRONG.

Name: Anonymous 2015-07-11 14:15

How to compile arithmetic expressions to x86?
You don't. First you wait for garbage collector to run and then, if you're lucky, the interpreter parses the expression and evaluates to some result.

Name: Anonymous 2015-07-11 14:58

>>9
kill yourself namefag, no one wants you here

>>10
lmao nice prog memes I can tell youre a true old/prog/

Name: Anonymous 2015-07-11 20:12

https://en.wikipedia.org/wiki/Three-address_code

I can easily compile to 3 address code, it's going from that to x86 that is hard

Name: Anonymous 2015-07-12 1:23

>>12
Then compile to two address code.

Name: Anonymous 2015-07-12 21:40

>>9
If you're working at Asm level and worrying about debugging then YOU ARE DOING IT WRONG.

This problem doesn't just apply to people who are writing assembly by hand. Any language implementation that compiles to native code and implements symbolic debugging by mapping the generated code back to high level constructs will have the same problem. It is usually hopeless to try and debug optimized x86 code generated by a compiler symbolically because the compiler will almost never generate code that maps back to the source in a straightforward fashion. Lack of registers is a major cause for this.

Name: Cudder !cXCudderUE 2015-07-13 6:38

>>14
"I'm too stupid to read Asm."

Go back to mental masturbation with your ultra-high-level languages, quiche-eater.

Name: Anonymous 2015-07-13 10:40

>>15
Who are you quoting.

Name: Anonymous 2015-07-13 11:44

>>16
Who are you quoting.
Whom are you quoting?

Name: Anonymous 2015-07-13 11:59

>>15 kill
>>16 your
>>17 self

Name: Anonymous 2015-07-13 14:36

Name: Anonymous 2015-07-14 18:29

>>15
You're a stupid codemonkey. You're an autist who would rather build spaceships out of legos than think about real spaceships because you can put yours together piece by piece. But in the end, all you'll have is a toy, while actual humans will have a ship that, while sometimes explosive, actually exist and can do everything your Assembly trinkets can't.

Name: Anonymous 2015-07-14 19:43

Who else loves quiche?

Name: Anonymous 2015-07-14 19:55

No real man, that's who

Name: Anonymous 2015-07-14 21:10

>>20
>>21
>>22
you retards can shut up until i get helped

Name: Anonymous 2015-07-14 21:18

>>23
You want some quiche too?

Name: Anonymous 2015-07-14 21:26

>>15
I wouldn't call C an ultra high level language. Not when targeting x86, anyway. I'm only familar with a handful of the most common optimizations, but the x86 output of most C compilers is rarely surprising to me. The main thing that makes things hard to follow is the incessant bouncing of values to the stack frame and back, which is a direct consequence of the shortage of registers.

Name: Anonymous 2015-07-14 23:52

>>24
Will it optimize his quotes?

Name: Anonymous 2015-07-15 1:15

OPTIMIZE THE QUICHE

Name: Anonymous 2015-07-15 4:08

>>25
I am Haskal programmer.

Name: expert !Up3JQchEFY 2015-07-15 5:59

>>12
It is my opinion that Three Address Code would be drastically overcomplicated in this situation.
This is because simple arithmetic expressions require no concept of looping or jumping, which would serve to only complicate this endeavor.
Surely a better approach would be utilizing Dijkstra's Shunting Yard algorithm1 to translate a given infix expressoan to a Postfix2 expression.
The advantages of this are obvious, of course, in that you could allocate registers3 and generate x86 Assembly4 efficiently in one pass over the postfix expression.

As an example, let us consider the infix expression, "(3 * 4) + 2".
The tokenization of this expression is displayed in figure A.
The translation of this expression to a linear stream of assembly instructions is not immediately clear in infix form, but if we are to translate it to a postfix expression, as in figure B, the (slightly naive) translation is simple and straightforward, as displayed in figure C.

I believe it is now adaquately obvious that translation to postfix is highly preferable to three address code. There is one alternative worth mentioning, however, which is building a tree5 from the infix expression, and generating assembly by recursively iterating through it. This is approximately equivalent in difficulty, but can make traversal and register allocation slightly more complex.

(
3
*
4
)
+
2

Figure A

3
4
*
2
+

Figure B

mov $3, %eax
mov $4, %ebx
mul %ebx
mov $2, %ebx
add %ebx, %eax
; The end result of the expression is now stored in %eax.

Figure C

References:
1: https://en.wikipedia.org/wiki/Shunting_yard_algorithm
2: https://en.wikipedia.org/wiki/Reverse_Polish_notation
3: https://en.wikipedia.org/wiki/Register_allocation
4: https://en.wikipedia.org/wiki/X86_instruction_listings
5: https://en.wikipedia.org/wiki/Parse_tree

Name: Anonymous 2015-07-15 7:48

>>29
You are actually posting something that makes me think! It's been so long since this has happened!

In that example the size of the stack never exceeded two, so you could do the whole thing directly in registers by just assigning registers to stack positions. If the stack grows larger it's a little harder, but I'm sure there are many ways of adapting it. The stack would be the usual stack and the registers would act like a sliding circular buffer cache for the top elements of the stack.

Name: Anonymous 2015-07-15 17:13

>>29
Welcome to the 1970s, here's your peephole optimizer and an absence of common subexpression elimination.

Name: Anonymous 2015-07-15 17:56

>>31
You can do subexpression elimination by duping a value at a specified depth in the stack.

Name: Anonymous 2015-07-15 19:25

>>32
Yes, that's the easy part, i.e. what happens once the values have been identified as duplicates.

Name: Anonymous 2015-07-15 20:45

>>33
And identifying duplicates isn't too bad either. Just search for maximal identical subtrees of the expression tree.

Name: Anonymous 2015-07-15 20:49

the problem is to produce asm, not search for common subexpressions

Name: Anonymous 2015-07-15 22:14

>>31
If it was the 1970s, he would have to give that compiler away for free if he wanted anyone to use it.

Name: Anonymous 2015-07-15 23:48

>>35
Search for my anus.

>>36
Use my anus.

Name: Anonymous 2015-07-16 3:00

>>36
UNIX QUALITY!

Name: Cudder !cXCudderUE 2015-07-18 15:41

>>29
RPN just gets you the order of operations. That's trivial stuff. Mapping values to registers and selecting instructions is the interesting part. Relevant links:

https://en.wikipedia.org/wiki/Strahler_number
https://en.wikipedia.org/wiki/Sethi%E2%80%93Ullman_algorithm

And use the standard syntax, not that backwards GNU shit.

>>30
Once you need more storage than available registers then a dataflow-directed method should be far better than "stack emulation", although testing this will be left as an exercise for the reader.

Name: Anonymous 2015-07-18 15:54

Ullman claims in his personal page at Stanford to be against the Iranian government,[5] but it's also alleged that he has demonstrated anti-Iranian sentiments. In one case, he responded to an email from an Iranian student who had inquired about admission at Stanford with an off-topic political rant and went on to say that he would not help Iranian students even if he could:

And even if I were in a position to help, I will not help Iranian students until Iran recognizes and respects Israel as the land of the Jewish people. I know that you may not hold the same insane position as the mullahs that run your country, but it is a matter of principle. If Iranians want the benefits of Stanford and other institutions in the US, they have to respect the values we hold in the US, including freedom of religion and respect for human rights.

Following that the National Iranian American Council issued a formal complaint to Stanford University,[6] to which Stanford spokesperson, Lisa Lapin responded that Ullman was expressing his own personal views and not the views of the University and that "he has no involvement in admission, and Stanford doesn’t discriminate in their admission process"[7][8][9][10]

Name: Anonymous 2015-07-19 17:36

>>40
Ullman FUCK ! What a Piece of shit !!!

Name: Anonymous 2015-07-19 18:24

I want to learn Krav Maga to beat shit out of goyim.

Name: Anonymous 2015-07-19 18:26

>>39
stop posting or stop using a name you fucking retard

Name: Anonymous 2015-07-19 21:18

>>43
implying a retard can understand he behaves like a retard

Name: Anonymous 2015-07-19 21:26

>>44
Who are you quoting?

Name: Anonymous 2015-07-19 21:36

>>44
Using retarded /g/ memes is worse than using a name, you know. Also, Cudder is way older than your failure of a teenager mom.

Name: Anonymous 2015-07-19 21:57

>>46
Hello, Cudder!

Name: Anonymous 2015-07-19 22:27

>>46
Old Cudder-fag had a clam, HE-HI-HE-HI-HO
And on his clam he had some dicks, HE-HI-HE-HI-HO

With a dick, dick here, and a dick, dick there,
Here a dick, there a dick, everywhere a dick, dick,

Name: Anonymous 2015-07-19 22:39

>>43-48
NOT VIP QUALITY

Name: You sire are and idiot 2015-07-19 23:06

>>49
including 43 in that
no, fuck you

Name: Anonymous 2015-07-19 23:15

>>50
Excuse me, dear Sir, would you be so kind to tell us what is the origin of your quotation?

Name: Anonymous 2015-07-20 0:56

>>50
NOT VIP QUALITY

Name: Anonymous 2015-07-20 2:57

das rite!

Name: Anonymous 2015-07-20 5:47

>>50
What do you have against Cudder?

Name: Anonymous 2015-07-21 1:18

You guys havent helpde me much but I figured out a bit more myself:

SOME IDEA ABOUT A NEW IL:

dest <- dest + src
x y + z

edx:eax <- eax * r/m32

data AS
= Add Var Var Var -- 2 or 3 -- must be 3!
| Mul Var Var Var Var -- 4

NEED 3 because its not SSA otherwise!

(insert a mov before or after ?)

y ) ( x
--------- Add x y z ---------
precolor graph with: x = y (this probably breaks the coloring algorithms though..)

z ) ( x y
------- Mul x y z w ---------
precolor graph with: x red, y blue

Lets work an example by hand:

(a * b) + (c * d)

Set u2 a
Set u3 b
Set v2 c
Set v3 d
Mul u0:u1 u2 u3
Mul v0:v1 v2 v3
Add r u1 v1

lifetimes:

set set mul mul add end
. . . . . . . |
u0 --------------..
u1 ---------
u2---------
u3 ---------------------..
v0 -----------..
v1 ------
v2 ---------
v3 -----------------..

now coloring that should give a good allocation

Name: Cudder !cXCudderUE 2015-07-21 11:05

Forget graph colouring, it's stupid academic circlejerk bullshite. Work backwards from dataflow and use something closer to bin-packing if you want to do better than existing compilers and be closer to LLVM.

Name: Anonymous 2015-07-21 14:56

Cudder is all talk and no action.

Name: Anonymous 2015-07-21 15:16

>>56
fuck off, dont post with a name in my thread

>>57
fuck off

Name: Anonymous 2015-07-21 16:24

>>58
you first

Name: Anonymous 2015-07-21 19:48

>>56
Why would I want to approximate LLVM? It already exists, right?

Name: Anonymous 2015-07-21 20:00

color my anus

Name: Cudder !cXCudderUE 2015-07-22 5:11

>>60
LLVM doesn't use graph colouring and is better because of it.

http://blog.llvm.org/2011/09/greedy-register-allocation-in-llvm-30.html

Name: Anonymous 2016-01-01 16:49

C is turdware.

Don't change these.
Name: Email:
Entire Thread Thread List