
the death of optimizing compilers

Name: Anonymous 2015-04-18 0:15

Name: Anonymous 2015-04-18 6:18

>>7
I don't work in FORTH, though I'd like to. I wrote a shitty implementation in Racket a few years ago, wrote half of a shitty x86 assembly implementation somewhat more recently, and that's about it. Just like Lisp, however, it's always in the back of my mind, waiting for the right project to present itself.

I'm working on some shitty indie games right now, writing everything in C, and as I approach the stage where my engines become more and more "incomplete, bug-ridden implementations of half of Common Lisp/Smalltalk/whatever" (gotta data-drive everything, make as much code reloadable as possible, implement late binding/rebindability where it's sane, hey wouldn't it be nice if I could let users script this at runtime, etc.), it feels like it'd almost be less work to just write my own language.

With modern memory hierarchies and ridiculously out-of-order CPUs, getting the data layout and access patterns right is orders of magnitude more important than the actual instruction stream for all but the tightest of loops. Javashit and friends do the exact opposite of the "right thing" by wasting all of their effort on insanely complicated JITs, micro-optimizing the least important part of the problem, while all of the real performance problems come from the GC overhead, bloated pointer-chasing dynamically-typed data structures, and heap allocations endemic to these languages by design. In contrast, most of the "speed" of C and C++ relative to other languages comes from the fact that you can lay out simple contiguous data structures, place them in linear arrays, and write your own memory allocators, ensuring predictable and efficient cache access patterns. Sure, the machine code that C and C++ compilers generate is better than average, since hundreds of man-years of work have been poured into them, but a good assembly programmer can always run circles around them anyway, especially if SIMD can be used to solve the problem.
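A minimal C sketch of the contrast being described (the names are illustrative, not from any real engine): both functions sum the same values, but the list version pays a dependent load and a likely cache miss per element, while the array version streams linearly through one allocation.

```c
#include <stddef.h>

/* Pointer-chasing layout: every node is its own heap allocation, so
 * traversal hops unpredictably through memory, one dependent load at
 * a time — the access pattern typical of GC'd, boxed-object languages. */
struct node { float x; struct node *next; };

float sum_list(const struct node *n) {
    float s = 0.0f;
    for (; n; n = n->next)
        s += n->x;                  /* likely cache miss per node */
    return s;
}

/* Contiguous layout: one allocation, linear strides, so the hardware
 * prefetcher can stay ahead of the loop. */
float sum_array(const float *xs, size_t n) {
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        s += xs[i];
    return s;
}
```

Same instruction count to a first approximation; the difference is entirely in where the data lives.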

So my suspicion for a while has been that a language that straddled the lines between the two worlds would be more useful to me. Straightforward, dumb, unoptimized, but easy to debug machine code output. Contiguous data structures and custom allocators, with a preference for arrays and "arenas" (allocate everything for a task with reckless abandon, then free it all at once; memory is cheap, bandwidth isn't). Hot-patchable functions allowing for interactive development. Introspection to ease inspection of data structures, creation of data serialization formats, etc. Make it easy to drop down into assembly for important inner loops. Metaprogramming support so that you can create DSLs to lessen some of the burden of writing assembly. An implementation that's actually possible to understand in its entirety in a short period of time. And so on.
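The "arena" idea above can be sketched in a few lines of C — a bump-pointer allocator where individual frees don't exist and the whole region is released in one shot. All names here are illustrative:

```c
#include <stddef.h>
#include <stdlib.h>

/* A minimal bump-pointer arena: allocate with reckless abandon during
 * a task, then free it all at once with a reset. */
typedef struct {
    unsigned char *base;
    size_t cap, used;
} Arena;

Arena arena_create(size_t cap) {
    Arena a = { malloc(cap), cap, 0 };
    return a;
}

void *arena_alloc(Arena *a, size_t size) {
    size_t aligned = (a->used + 15) & ~(size_t)15;  /* 16-byte align */
    if (!a->base || aligned + size > a->cap)
        return NULL;                                 /* out of space */
    a->used = aligned + size;
    return a->base + aligned;
}

void arena_reset(Arena *a)   { a->used = 0; }        /* "free" everything */
void arena_destroy(Arena *a) { free(a->base); a->base = NULL; }
```

In a game this maps naturally onto a per-frame or per-level arena: everything allocated during a frame dies together at `arena_reset`, and there's no per-object bookkeeping to get wrong.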

I usually imagined a Lisp-like language when thinking about this, but the thing is, most of a typical Lisp's "simplicity" comes from natural consequences of various design decisions, not the parentheses and prefix notation. If you've decided to have garbage collection, a parse tree generation step instead of generating code as you parse, and full language access at compile time, then Lisp macros kinda "fall out" of the design. But if you try to approach things from the opposite perspective and say "I want a language with simple syntax and metaprogramming support like Lisp, but with radically simpler machine-level semantics," it becomes a lot more muddy. Certainly the functional programming koolaid-drinking community won't have any useful direction to provide. They'll just lecture you about continuations and closures and CPS and so on, mocking you for not "getting" Lisp. You almost have to design separate languages for "compile time" and "run time", which is the exact opposite of what we wanted. It quickly becomes obvious that you'd spend much more time building the perfect "machine Lisp" than most people with real projects to work on could justify. But damn would it be nice if it existed.

Then I discovered FORTH. I remember that night, devouring article after article on FORTH implementation and philosophy. To cut a long story short, the beauty of FORTH is that by punting the concerns of context management (via the stack) and error checking (due to the lack of an enforced type system) to the user, an extremely simple (both to the programmer and to the machine) implementation of a surprisingly capable language just "falls out of" the design. A field access in a data structure, for instance, becomes `add eax, OFFSET_OF_FIELD` (assuming the top of stack is in the eax register, and contains at this moment the address of a data structure). Tail call optimization is just patching the final `call` into a `jmp` when the word that ends a word definition is reached. The difference between the "interpreter" and "compiler" is whether it executes machine code or copies it (or a call to it) into the instruction stream. A "macro" is just a word that is guaranteed to be executed even in compile mode. A "metalanguage" is just another dictionary of word definitions for the "compiler" to search when in the correct mode. And so on. It's beautifully simple, and fits the middle ground language I imagined earlier quite well.
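The interpreter/compiler distinction and the "immediate word" trick can be modeled in a toy C sketch. This is purely illustrative: a real FORTH compiles machine code or threaded code, while here "compiling" just appends a function pointer to a definition, and numbers are handled naively (pushed even in compile mode, which a real FORTH would not do).

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define STACK_MAX 64
#define PROG_MAX  64

typedef void (*word_fn)(void);

static int stack[STACK_MAX], sp;           /* the data stack */
static void push(int v) { stack[sp++] = v; }
static int  pop(void)   { return stack[--sp]; }

static void w_add(void) { int b = pop(); push(pop() + b); }
static void w_dup(void) { int t = pop(); push(t); push(t); }

/* The dictionary: name, behavior, and an "immediate" flag. */
typedef struct { const char *name; word_fn fn; int immediate; } Word;
static Word dict[] = { { "+", w_add, 0 }, { "dup", w_dup, 0 } };

static word_fn prog[PROG_MAX];             /* current "compiled" definition */
static int proglen, compiling;

/* The whole interpreter/compiler split in one branch: execute the word
 * now, or append it to the definition being compiled. Immediate words
 * ("macros") execute even in compile mode. */
static void handle(const char *tok) {
    for (size_t i = 0; i < sizeof dict / sizeof *dict; i++) {
        if (strcmp(dict[i].name, tok) == 0) {
            if (!compiling || dict[i].immediate)
                dict[i].fn();
            else
                prog[proglen++] = dict[i].fn;
            return;
        }
    }
    push(atoi(tok));                       /* unknown token: try as number */
}
```

The point of the sketch is how little machinery there is: one stack, one dictionary search, one mode flag, and the rest of the language "falls out".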

But then, I try to imagine an x86 assembler written in FORTH, and I come crashing back to earth. As I try to wrestle the two incompatible languages into something practical, I remember the other lesson from my assembly days, that the language doesn't fucking matter, that I should stop wasting my time on overengineered non-work bullshit and just make my fucking game already in whatever stupid Turing-complete language is most accessible to my platform. But every now and then, I read something like the djb slides I made this thread around, and it kinda reassures me that I'm not crazy and that somebody could earn a modest underground neckbeard following pursuing a language design like this. "Hey suckless weenies, my programming language implementation is billions of times smaller than yours, provides more useful features, and results in smaller, faster programs when its methodology is followed. Suck on that."
