

the death of optimizing compilers

Name: Anonymous 2015-04-18 0:15

Name: Anonymous 2015-04-18 1:26

Nothing new here (except for a terrible presentation style), keep moving please!

Name: the death of progrider 2015-04-18 1:40

always disappointed when i come here

Name: Anonymous 2015-04-18 1:49

>>2,3
He's advocating writing more assembly for inner loops, I figured you bitfuckers would be all over that shit.

Name: Anonymous 2015-04-18 2:35

>>4
You could try taking off those abstractions and Javascript apps you rammed up your ass.

Name: Anonymous 2015-04-18 3:05

>>5
I'm actually more of a C/assembly/FORTHfag.

Name: Anonymous 2015-04-18 3:20

>>6
Tell me about how you use forth please, I'm finding it hard to get into

Name: Anonymous 2015-04-18 6:18

>>7
I don't work in FORTH, though I'd like to. I wrote a shitty implementation in Racket a few years ago, wrote half of a shitty x86 assembly implementation somewhat more recently, and that's about it. Just like Lisp, however, it's always in the back of my mind, waiting for the right project to present itself.

I'm working on some shitty indie games right now, writing everything in C, and as I approach the stage where my engines become more and more "incomplete, bug ridden etc. implementations of half of common lisp/smalltalk/whatever" (gotta data drive everything, make as much code reloadable as possible, implement late binding/rebindability where it's sane, hey wouldn't it be nice if I could let users script this at runtime, etc), it feels like it'd almost be less work to just write my own language.

With modern memory hierarchies and ridiculously OoO CPUs, getting the data layout and access patterns right is orders of magnitude more important than the actual instruction stream for all but the tightest of loops. Javashit and friends do the exact opposite of the "right thing" by wasting all of their effort on insanely complicated JITs, micro-optimizing the least important part of the problem, while all of the real performance problems come from the GC overhead, bloated pointer-chasing dynamically-typed data structures, and heap allocations endemic to these languages by design. In contrast, most of the "speed" of C and C++ relative to other languages comes from the fact that you can lay out simple contiguous data structures, place them in linear arrays, and write your own memory allocators, ensuring predictable and efficient cache access patterns. Sure, the machine code that C and C++ compilers generate is better than average, since hundreds of man-years of work have been poured into them, but a good assembly programmer can always run circles around them anyway, especially if SIMD can be used to solve the problem.

So my suspicion for a while has been that a language that straddled the lines between the two worlds would be more useful to me. Straightforward, dumb, unoptimized, but easy to debug machine code output. Contiguous data structures and custom allocators, with a preference for arrays and "arenas" (allocate everything for a task with reckless abandon, then free it all at once; memory is cheap, bandwidth isn't). Hot-patchable functions allowing for interactive development. Introspection to ease inspection of data structures, creation of data serialization formats, etc. Make it easy to drop down into assembly for important inner loops. Metaprogramming support so that you can create DSLs to lessen some of the burden of writing assembly. An implementation that's actually possible to understand in its entirety in a short period of time. And so on.

I usually imagined a Lisp-like language when thinking about this, but the thing is, most of a typical Lisp's "simplicity" comes from natural consequences of various design decisions, not the parentheses and prefix notation. If you've decided to have garbage collection, a parse tree generation step instead of generating code as you parse, and full language access at compile time, then Lisp macros kinda "fall out" of the design. But if you try to approach things from the opposite perspective and say "I want a language with simple syntax and metaprogramming support like Lisp, but with radically simpler machine-level semantics," it becomes a lot more muddy. Certainly the functional programming koolaid-drinking community won't have any useful direction to provide. They'll just lecture you over continuations and closures and CPS and so on, mocking you for not "getting" Lisp. You almost have to design separate languages for "compile time" and "run time", which was the exact opposite of what we wanted. It quickly becomes obvious that you'd spend much more time building the perfect "machine Lisp" than most people with real projects to work on could justify. But damn would it be nice if it existed.

Then I discovered FORTH. I remember that night, devouring article after article on FORTH implementation and philosophy. To cut a long story short, the beauty of FORTH is that by punting the concerns of context management (via the stack) and error checking (due to the lack of an enforced type system) to the user, an extremely simple (both to the programmer, and the machine) implementation of a surprisingly capable language just "falls out of" the design. A field in a data structure, for instance, becomes add eax, OFFSET_OF_FIELD (assuming the top of stack is in the eax register, and contains at this moment the address of a data structure). Tail call optimization is just patching a call subroutine into a jump subroutine when the word ending a word definition is reached. The difference between the "interpreter" and "compiler" is whether it executes machine code or copies it (or a call to it) into the instruction stream. A "macro" is just a word that is guaranteed to be executed even in compile mode. A "metalanguage" is just another dictionary of word definitions for the "compiler" to search when in the correct mode. And so on. It's beautifully simple, and fits the middle ground language I imagined earlier quite well.

But then, I try to imagine an x86 assembler written in FORTH, and I come crashing back to earth. As I try to wrestle the two incompatible languages into something practical, I remember the other lesson from my assembly days, that the language doesn't fucking matter, that I should stop wasting my time on overengineered unwork bullshite and just make my fucking game already in whatever stupid turing-complete language is most accessible to my platform. But every now and then, I read something like the djb slides I made this thread around, and it kinda reassures me that I'm not crazy and that somebody could earn a modest underground neckbeard following pursuing a language design like this. "Hey suckless weenies, my programming language implementation is billions of times smaller than yours, provides more useful features, and results in smaller, faster programs when its methodology is followed. Suck on that."

Name: Anonymous 2015-04-18 6:18

>>8
I don't remember clicking the sage button.

Name: Anonymous 2015-04-18 9:19

>You almost have to design separate languages for "compile time" and "run time", which was the exact opposite of what we wanted.

Yes, that's what you have to do. And there isn't anything wrong with that. Common Lisp makes a good macro language. Just let macros run in Common Lisp, generating code in machine lisp. If you want to reuse your machine lisp code, build an interface from Common Lisp to machine lisp.

Name: Anonymous 2015-04-18 12:16

check em

Name: Anonymous 2015-04-18 12:23

Name: Anonymous 2015-04-18 16:56

>>8
I read all of your post, very interesting.
>writing everything in C
Why not C++? Suddenly you are the most powerful wizard.

Name: Anonymous 2015-04-18 17:10

>>13
Whom are you quoting?

Name: Anonymous 2015-04-18 17:14

Name: Anonymous 2015-04-18 17:44

>>8
>implementation of a surprisingly capable language
It can't even define a real function, as in, with formal parameters. I wouldn't call that "capable".

Name: Anonymous 2015-04-18 18:16

>>8
>the language doesn't fucking matter

The language does fucking matter if you're a perfectionist.

Name: Anonymous 2015-04-18 18:51

And __YOUR_LANGUAGE__ can't push a variable number of elements onto the call stack and later consume all of them in a loop. I wouldn't call that "capable".

Name: Anonymous 2015-04-18 19:12

>>18
>call stack
What's that and why should I care? I'm a programmer, not a lowly hardware nigger toilet scrubber.

Name: Anonymous 2015-04-18 19:19

Name: Anonymous 2015-04-18 19:21

http://blog.eatonphil.com/2015/04/06/introduction-to-stack-based-languages-with-and-elementary-debugging-in-forth/

>Forth is a stack-based language. There is one major data structure and it is, you guessed it, a stack. The entirety of using a stack-based language revolves around pushing data onto the stack, popping data off of the stack, and operating on the stack.

What kind of idiot would call a language with one data structure "capable"? Even LITHP has more.

Name: Anonymous 2015-04-18 19:31

>>21
I haven't used forth, but in other stack-based things, the stack is used for memory management and you can push whatever data structures you want onto it.

Name: Anonymous 2015-04-18 19:39

>>2
yeah I mean we have been doing this incrementally for ages, poorly of course

I didn't quite figure out whether this was a statement of intent or just calling attention to the idea. either way, this is a blog post, not a talk. he was probably paid an embarrassing sum of money to give that talk.

Name: Anonymous 2015-04-18 19:40

>>22
Like that 2 GB array, right? And the stack won't overflow?

Name: Anonymous 2015-04-18 19:51

>>24
If you are getting at laziness, stack based languages have thunks. You can store code in arrays, and execute them. From that it's easy to get laziness.

Name: Anonymous 2015-04-18 20:00

>>25
No, I'm getting at stack overflows. You know, the reason decent languages have heaps and not just stacks.

Name: Anonymous 2015-04-18 20:17

>>18
And just what language can't you do that in?
#include <stdio.h>
#include <stdlib.h>

struct Node {
    void* data;
    int size;
    struct Node* next;
};

struct Node* callstack;

void push(void* data, int size) {
    struct Node* n = (struct Node*)malloc(sizeof(struct Node));
    n->data = data; n->size = size; n->next = callstack;
    callstack = n;
}

struct Node* pop() {
    struct Node* n = callstack;
    if (n == NULL) return n;
    callstack = callstack->next;
    return n;
}

void initstack() {
    callstack = (struct Node*)malloc(sizeof(struct Node));
    callstack->data = callstack->next = NULL; callstack->size = 0;
}

void println() {
    struct Node* n = pop();
    int len = (int)n->data;
    for (int i = len; i > 0; i--) {
        n = pop();
        if (n == NULL) break;
        if (n->size == sizeof(int))
            printf("%d", (int)n->data);
        else if (n->size == sizeof(char))
            printf("%s", (char*)n->data);
        else continue;
        free(n);
    }
    printf("\n");
}

int main(int argc, char** argv) {
    initstack();

    push((void*)".", sizeof(char));
    push((void*)"\b\b}", sizeof(char));
    for (int i = 0; i < 50; i++) {
        push((void*)", ", sizeof(char));
        push((void*)i, sizeof(int));
    }
    push((void*)"This is a set: {", sizeof(char));
    push((void*)(2 + 50 * 2 + 1), sizeof(int));
    println();

    return 0;
}

Name: Anonymous 2015-04-18 20:21

>>27
Are you a retard? Why would you cast to void * (or to anything) in that case? Why would you use int for size?

Name: Anonymous 2015-04-18 20:34

>>28
>Why would you cast to void * (or to anything) in that case?
To make the warnings go away, obviously. And type safety, of course.

>Why would you use int for size?
Fuck you and your size_t bullshit.

Name: Anonymous 2015-04-18 20:42

>>29
What warnings? What safety?
Are you sure that you are using a C compiler?
Are you sure that you know basic C?

Name: Anonymous 2015-04-19 1:45

>>26
And what keeps your heap from overflowing? The stack is just an abstraction.

Name: Anonymous 2015-04-19 4:12

>>26
In some stack based languages, the values on the stack are references to garbage collected objects on the heap.

Name: Alexander Dubček 2015-04-19 4:44

Optimize my doubles.

Name: Anonymous 2015-04-19 4:48

>>15 Sepples - Not Even O(1).

Name: Anonymous 2015-04-19 5:11

>>26
Forth implementations generally use multiple stacks (at least two), one data stack (where all your arithmetic and basic stack ops go) and a return/loop stack for control flow.
These obviously need not share any memory pages, make the "data stack" the size of your conventional C heap and you'll only get a stack overflow the same time your malloc() would return NULL.

Name: Anonymous 2015-04-19 6:00

>>15
I read both parts of the article, interesting with good points. It does not apply for >>8, because "run indefinitely without UB =/= indie game engines". In fact, C++ is made for indie game engines! That's all C++ is for. Indie fucking games.

Name: Anonymous 2015-04-19 6:13

>>36 And browsers, video editors, simulations, renderers,..or any program where performance is critical.

Name: Anonymous 2015-04-19 6:17

>>36
C++ is also popular in the Big Science community, unfortunately.

You'd think they'd just stick with Fortran, Matlab/Octave, and Lisp, but no. All of the shit that they run on the big super-computers or to run all of their expensive particle collision equipment is written in C++. It's just tens of millions of LoC of cobbled-together shit using C++ badly (overuse of templates, inheritance, etc.).

I mean--fuck--with all of the C++ crap they have to support, they ended up writing CMake, which itself is written in C++.

Name: Anonymous 2015-04-19 6:21

>>30
I'm not going to measure dicks with you. Kill yourself. When Lambda comes and insults me, maybe I'll care. Until then, I'll do my programming with as much copy-paste as possible.

Name: Anonymous 2015-04-20 8:16

My big fat FORTH stack towers over your puny C heap.

Name: Anonymous 2015-04-20 16:02

>>40
yep you win, you have the largest goiter
