Let's design a suckless bignum library. (I'm not part of suckless though, just curious about replacing GMP).
I researched a bit into algorithms and the rundown is this: * long multiplication: O(n^2) * Karatsuba: O(n^1.585) * Toom-Cook and Fourier-transform-based methods: even faster, but only worth it for numbers 10k+ digits long. Much more complex.
So we should probably use Karatsuba for all multiplications. Squaring can be done a bit faster than multiplying two different numbers, since the cross products are symmetric and only need to be computed once.
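To make the Karatsuba idea concrete, here is the identity applied once, on a single machine word: split two 32-bit operands into 16-bit halves and form the full product with three half-width multiplications instead of four. A real bignum library would apply the same splitting recursively to arrays of limbs; the function name and the fixed word size here are just for illustration.

```c
#include <stdint.h>

/* Karatsuba on one word: x = x1*2^16 + x0, y = y1*2^16 + y0.
 * Product = hi*2^32 + mid*2^16 + lo, using only three multiplies:
 *   hi  = x1*y1
 *   lo  = x0*y0
 *   mid = (x1+x0)*(y1+y0) - hi - lo  ( == x1*y0 + x0*y1 ) */
static uint64_t kara_mul32(uint32_t x, uint32_t y)
{
    uint32_t x1 = x >> 16, x0 = x & 0xFFFF;
    uint32_t y1 = y >> 16, y0 = y & 0xFFFF;
    uint64_t hi  = (uint64_t)x1 * y1;
    uint64_t lo  = (uint64_t)x0 * y0;
    uint64_t mid = (uint64_t)(x1 + x0) * (y1 + y0) - hi - lo;
    return (hi << 32) + (mid << 16) + lo;
}
```

Note that the half-sums x1+x0 and y1+y0 can each exceed 16 bits, which is exactly the carry-handling headache the multi-limb version has to deal with.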
Now I suggest programming it in assembly, which gives you access to the carry flag (C doesn't). Of course we would use libc and the normal C calling convention so that it's a regular C library.
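For comparison, here is what the carry problem looks like in portable C: a limb-wise add has to recover the carry by comparing against the operands, whereas in x86 assembly the whole loop body collapses to a single adc. The function name and limb layout (least significant limb first) are assumptions for this sketch.

```c
#include <stdint.h>
#include <stddef.h>

/* Add two n-limb numbers; returns the carry out of the top limb.
 * carry = (sum < addend) is the standard portable idiom for
 * detecting wraparound of an unsigned add. */
static uint64_t limb_add(uint64_t *r, const uint64_t *a,
                         const uint64_t *b, size_t n)
{
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t s = a[i] + carry;
        carry = s < carry;       /* carry out of a[i] + carry */
        s += b[i];
        carry += s < b[i];       /* carry out of the second add */
        r[i] = s;
    }
    return carry;
}
```

At most one of the two partial carries can be set per iteration, so `carry` stays 0 or 1, but the compiler generally cannot prove that and emits noticeably worse code than a hand-written adc loop.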
What to do about memory management? e.g. if you want to add two numbers do we need to allocate a new 'number' as long as the largest to write the result into or do it destructively "x <- x + y"? Maybe the library should support both - then a calculator program would figure out the best primitives to use for a given computation.
It might be nice to also support things like (big modulus) modular arithmetic and polynomials. stuff like exponentiation and modular inverses have interesting algorithms.
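Exponentiation is one of those interesting algorithms: square-and-multiply needs only O(log exp) multiplications. A word-sized sketch (assuming mod > 1; the unsigned __int128 type is a GCC/Clang extension standing in for the bignum layer's own widening multiply):

```c
#include <stdint.h>

/* Modular exponentiation by repeated squaring: walk the bits of
 * the exponent, squaring each step and multiplying in the base
 * whenever the current bit is set. */
static uint64_t modexp(uint64_t base, uint64_t exp, uint64_t mod)
{
    unsigned __int128 result = 1, b = base % mod;
    while (exp) {
        if (exp & 1)
            result = (result * b) % mod;
        b = (b * b) % mod;
        exp >>= 1;
    }
    return (uint64_t)result;
}
```

The bignum version is structurally identical, just with limb-array multiplication and reduction in place of the word operations.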
What other integer operations would we want? I don't really want to do anything with arb. prec. real numbers - arithmetic with rationals could be done though.
This sounds like a really interesting project! I'm interested in what kind of scope you have in mind. What kinds of operations should be supported? Aside from integers and modular arithmetic, do you also want to have real numbers, vectors, matrices, rationals, etc? What systems do you want to support (x86, MIPS, etc)? What do you think would be a nice goal for a prototype/proof of concept?
I have some ideas for what a very simple prototype could include: * a representation for arbitrary length integers * addition/subtraction/multiplication on these integers * multiplication implemented with karatsuba * C calling convention wrapper
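One possible shape for the first bullet, purely as a strawman (sign-magnitude, 32-bit limbs, least significant limb first; all names are illustrative, not a settled design):

```c
#include <stdint.h>
#include <stddef.h>

struct bn {
    int sign;        /* -1, 0 or +1; 0 only for the value zero */
    size_t len;      /* limbs in use; 0 means the number is zero */
    uint32_t *limb;  /* limb[0] is the least significant */
};

/* Load a machine word into the representation, using a
 * caller-provided two-limb buffer (no allocation). */
static void bn_from_u64(struct bn *x, uint32_t buf[2], uint64_t v)
{
    buf[0] = (uint32_t)v;
    buf[1] = (uint32_t)(v >> 32);
    x->limb = buf;
    x->len = buf[1] ? 2 : (buf[0] ? 1 : 0);
    x->sign = x->len ? 1 : 0;
}
```

Keeping `len` normalized (no high zero limbs) makes comparisons and the size bounds for results straightforward.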
Name:
Anonymous 2015-11-17 2:16
Make everything fixed point and represent everything in BCD. END OF DISCUSSION
>>5 The 6502 variant in the NES had the BCD support removed, dumbass.
Name:
suigin 2015-11-17 5:21
>>1 It would be interesting to know whether these new intrinsics generate (near-)optimal code when used to implement bignums; if they do, it would save you from having to get your hands dirty with lots of assembly:
>>15 I don't know of an existing C compiler that will generate optimal assembly for bignums without using intrinsics. Implementing semantic rules for that would be extremely complex, and may cause performance problems for non-bignum code.
You're free to submit patches to GCC or Clang/LLVM. But from a practical standpoint, one must work with existing compiler technology. And so you either hand roll the assembly yourself or use intrinsics.
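As an example of working with existing compiler technology rather than against it: GCC and Clang expose a 128-bit integer type, and the code generator lowers the construct below to the CPU's widening multiply (a single mulq on x86-64), which is the core primitive of schoolbook bignum multiplication. A sketch, assuming a GCC/Clang-style compiler:

```c
#include <stdint.h>

/* Full 64x64 -> 128 bit product, split into high and low words.
 * unsigned __int128 is a GCC/Clang extension; MSVC would need the
 * _umul128 intrinsic instead. */
static void limb_mul(uint64_t a, uint64_t b, uint64_t *hi, uint64_t *lo)
{
    unsigned __int128 p = (unsigned __int128)a * b;
    *lo = (uint64_t)p;
    *hi = (uint64_t)(p >> 64);
}
```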
What to do about memory management? e.g. if you want to add two numbers do we need to allocate a new 'number' as long as the largest to write the result into or do it destructively "x <- x + y"? Maybe the library should support both - then a calculator program would figure out the best primitives to use for a given computation.
Been thinking about this. I think the simplest thing would be to consider memory management as outside the scope of the library. The set of functions implemented by the library would be low-level and would require that the buffers passed in be suitably large; otherwise overflow and truncation occur.
Memory management would be left up to the programmer using the library. This would be sufficient for implementing bc(1) or dc(1) for sbase, which use fixed-size buffers and a precision that can be adjusted by the user.
If people want a set of bignum functions that manages memory, this could be done by layering a higher-level library on top of the lower-level one.
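A sketch of what this no-allocation style could look like, with all names and the limb width being assumptions: the caller provides every buffer, the result buffer must hold n limbs, and the carry out is returned so the caller decides how to handle overflow.

```c
#include <stdint.h>
#include <stddef.h>

/* Add two n-limb numbers into a caller-provided buffer rp[0..n-1].
 * A 64-bit accumulator over 32-bit limbs makes the carry handling
 * trivial.  Returns the carry out of the top limb (0 or 1). */
static uint32_t bn_add_n(uint32_t *rp, const uint32_t *ap,
                         const uint32_t *bp, size_t n)
{
    uint64_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += (uint64_t)ap[i] + bp[i];
        rp[i] = (uint32_t)acc;
        acc >>= 32;
    }
    return (uint32_t)acc;
}
```

Aliasing rp with ap or bp is harmless here, so the same function covers both the fresh-result and the destructive "x <- x + y" styles from >>1.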
Name:
Anonymous 2015-11-19 4:29
Karatsuba requires temporary storage for the sums.
That would make for a very unfriendly API if there is no library-side memory management.
Name:
Anonymous 2015-11-19 5:01
>>19 That sorta requires some way to retrieve the required size of a calculation and its result in order to allocate the right amount of memory, though.
Name:
suigin 2015-11-19 5:31
>>21 There's usually an upper-bound for the size of the result that can be determined ahead of time based on the size of the inputs.
If, on the other hand, you try to implement it so that you always allocate only as much memory as you need, your bignum operations become far more complex: they need to detect that they have run out of space, allocate more memory, then restart where they left off. Done that way, it is also harder on the CPU cache.
From both a complexity and performance standpoint, for most operations, I think it would be better if you allocate everything up front, and then do the operation in one shot. Memory capacity is cheap.
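The upper bounds in question are simple enough to state as code (names are illustrative): a sum of an n-limb and an m-limb number needs at most max(n, m) + 1 limbs, and a product needs at most n + m limbs. This is exactly what lets a caller allocate everything up front.

```c
#include <stddef.h>

/* Worst-case result sizes in limbs, known before doing any work. */
static size_t bn_add_bound(size_t n, size_t m)
{
    return (n > m ? n : m) + 1;   /* one extra limb for the carry */
}

static size_t bn_mul_bound(size_t n, size_t m)
{
    return n + m;                 /* product of n and m limbs */
}
```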
>>20 There are also upper-bounds for the amount of temporary storage you need as well, and the library can be designed so that you have to pass in pointers to temporary buffers. A higher-level library with an easier-to-use API can still be layered on top, which handles all of the memory management.
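One way to expose Karatsuba's temporary storage without library-side allocation is a helper that bounds the scratch the whole recursion needs, so the caller can hand in a single buffer. The recurrence below (roughly 2*(n/2+1) limbs per level, summed down the recursion) is a conservative sketch, not a tight bound, and the cutoff below which long multiplication takes over is an assumed tuning parameter.

```c
#include <stddef.h>

#define KARA_CUTOFF 16  /* below this, use long multiplication */

/* Upper bound on scratch limbs for Karatsuba on n-limb inputs:
 * each level needs room for the product of the two half-sums
 * (each up to n/2 + 1 limbs), and only one branch of the
 * recursion is live at a time. */
static size_t kara_scratch(size_t n)
{
    size_t total = 0;
    while (n > KARA_CUTOFF) {
        size_t half = n / 2 + 1;
        total += 2 * half;
        n = half;
    }
    return total;
}
```

The geometric series sums to about 2n limbs, so even "memory capacity is cheap" barely comes into it.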
Name:
Anonymous 2015-11-19 5:50
>>18 That's why I said C is horrible shit for this. Get out of that fucking abomination of a language.
Its notions of expressions and evaluation are so far up its own ass it's impossible to fundamentally change anything in the language. It's fast for one class of activity, but is shit slow if you try to break out of it.
That's why I said C is horrible shit for this. Get out of that fucking abomination of a language.
So we should implement a ``suckless'' bignum library in LISP? But anon, LISP already has bignums. What's the point?
Or you meant Rust or Go? Well, both Rust and Go compilers have the same problem as C in this regard.
Besides, C compiler intrinsics are no different than LISP forms that are builtin to the LISP compiler or interpreter, instead of implemented with macros on top of more elementary forms. It's the same idea. In order to get an optimal mapping from a form to machine instructions, the LISP implementer often has to do the job himself, or use some form of brute-force search or genetic algorithm combined with runtime profiling to arrive at the optimal solution.
So we should implement a ``suckless'' bignum library in LISP? But anon, LISP already has bignums. What's the point?
If you feel like creating a better bignum implementation, what better place to play with it than a language which supports bignums? Why dick around with calling library functions instead of directly using immediate built-in syntax?
Besides, C compiler intrinsics are no different than LISP forms that are builtin to the LISP compiler or interpreter, instead of implemented with macros on top of more elementary forms.
Lol, do you even Lisp, bro?
You don't do this with regular macros, you do it with compiler macros and defining new transforms in the intermediate compiler definitions. You can add CPU-specific AVX instructions or whatever as a general toolkit for the compiler to use without having to code entire routines in asm. Lisp lets you hone Lisp to your problem, instead of fighting against the C compiler's fixed assumptions.
In order to get an optimal mapping from a form to machine instructions, the LISP implementer often has to do the job himself, or use some form of brute-force search or genetic algorithm combined with runtime profiling to arrive at the optimal solution.
Bullshit. Type inference carries a long way, and the language passes a higher level of abstraction to the compiler for optimization than C, with a billion times less undefined behavior keeping it cleaner. And if you're talking about high-level human optimization of algorithms and style, that's the same in every single language.
Name:
Anonymous 2015-11-19 8:36
I remember reading about a language that let you specify optimizations. You could just specify a rule add(negate(X), Y) -> sub(Y, X) and the optimizer would do that transformation at the start of compilation.
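A toy version of that rule, written in C since that is what the thread is about: expression trees plus the single transformation add(negate(X), Y) -> sub(Y, X). An optimizer built this way would walk the tree applying every registered rule until a fixed point; everything here is illustrative.

```c
#include <stddef.h>

enum op { NUM, NEG, ADD, SUB };
struct expr { enum op op; int val; struct expr *l, *r; };

/* Apply the rule add(neg(X), Y) -> sub(Y, X) at the root. */
static struct expr *rewrite(struct expr *e)
{
    if (e->op == ADD && e->l->op == NEG) {
        struct expr *x = e->l->l, *y = e->r;
        e->op = SUB;
        e->l = y;
        e->r = x;
    }
    return e;
}

/* Reference interpreter, used to check the rule preserves meaning. */
static int eval(const struct expr *e)
{
    switch (e->op) {
    case NUM: return e->val;
    case NEG: return -eval(e->l);
    case ADD: return eval(e->l) + eval(e->r);
    default:  return eval(e->l) - eval(e->r);   /* SUB */
    }
}
```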
Name:
Anonymous 2015-11-19 9:01
>>26 You're describing pretty much every mature symbolic language.
You can add CPU-specific AVX instructions or whatever as a general toolkit for the compiler to use without having to code entire routines in asm. Lisp lets you hone Lisp to your problem, instead of fighting against the C compiler's fixed assumptions.
C has this too. It's called ``inline assembly''. Your inability to see the parallels between these C constructs and LISP makes me want to shake my head. Sure, LISP perhaps does it better, but since ``suckless'' more or less sticks to POSIX standards and the UNIX way, the project, by definition, is best implemented in C.
Bullshit. Type inference carries a long way, and the language passes a higher level of abstraction to the compiler for optimization than C, with a billion times less undefined behavior keeping it cleaner. And if you're talking about high-level human optimization of algorithms and style, that's the same in every single language.
There isn't always a one to one mapping from a form to a sequence of instructions, especially on modern RISC and CISC CPUs with vector instruction sets, out-of-order execution, multi-level instruction and data memory caches, etc.
Finding the optimal sequence of instructions is NP-hard in general, and it becomes very expensive as the size of a sequence grows. Compilers take a lot of short-cuts in their code generators.
You need some reading comprehension. Fuck inline assembly, that's what I said you can escape by leaving C. Writing assembly is WRONG. Telling the compiler how to use new intrinsics to make your existing code use CPU specifics is right. You do NOT end up including blocks of asm code, but write your inner algorithms in plain Lisp.
There isn't always a one to one mapping from a form to a sequence of instructions blah blah blah
None of that is language specific.
However, re ``Compilers take a lot of short-cuts in their code generators'': the C compiler is very constrained because of all the undefined behavior in the spec that prevents very obvious optimizations. A better-specified language can take advantage of much more.