Is there a small version of GCC for shipping with a Windows program? Because the MinGW installation takes a frigging gigabyte, including a lot of bloat, like C++ and Fortran compilers, with useless crap, like DirectX bindings.
Name:
Anonymous2016-09-08 14:58
Clang is frigging enormous too. Just the executable takes 50 megabytes.
Name:
Anonymous2016-09-08 15:13
Cygwin is pretty big as well - 100MB for a base install. Unless you want to try porting GCC yourself, TCC is probably your best option - it supports at least some of the GNU C extensions, and the whole install is just over 2 MB (it takes up about 7 MB on my flash drive due to the 32KB allocation unit size, but that's still much smaller than MinGW or Cygwin). Keep in mind it's C only, not C++.
>>7,8 The TCC website says it compiled a 60+ MB project, so it can definitely handle large projects. Maybe it stores everything in memory during the compilation process, rather than using temporary files like GCC, in which case the segfaults could be due to limited memory.
dunno. Yet my project compiled with GCC works. But when compiled with TCC it begins running and then segfaults.
Name:
Anonymous2016-09-08 16:48
>>10 Oh, you mean the generated program segfaults, I thought you meant the compiler segfaults when generating code. How big a project are you talking about? How many lines of code is it, and how big are the TCC/GCC generated binaries? And does it link to any non-standard external libraries?
Name:
Anonymous2016-09-08 17:22
Protip - Don't use C.
Name:
Anonymous2016-09-08 17:25
>>12 What do you suggest, assembly? Pretty much all other languages are implemented in C, or at least depend on C libraries.
Name:
Anonymous2016-09-08 17:44
Okay. I've managed to reduce the GCC installation down to 50 megabytes uncompressed, or 15 megs compressed. That should be good enough for online distribution.
I'm sitting at a Russian cafe and the wifi here is friggin slow.
Name:
Cudder !cXCudderUE2016-09-09 11:25
It never ceases to amuse me how stupid GCC is, despite its size.
This boringly trivial function
    void foo(int *x) { (*x)++; }
with default settings, turns into this monster:
    ; this isn't 16-bit --- you can use rsp too, retard
    push rbp
    mov rbp, rsp
    ; you write the first param into memory...
    mov QWORD PTR [rbp-8], rdi
    ; ... just so you can read it back again? WTF!?
    mov rax, QWORD PTR [rbp-8]
    mov eax, DWORD PTR [rax]
    ; Let's waste another register just so we can show off
    ; how clever we are with the lea instruction. Idiot.
    lea edx, [rax+1]
    ; If you were the slightest bit intelligent, you would
    ; not overwrite rax with the value. If you were just a
    ; tiny bit more so, you'd realise that it was already in
    ; rdi. This is terminally retarded.
    mov rax, QWORD PTR [rbp-8]
    mov DWORD PTR [rax], edx
    ; A NOP!?! What idiocy made you put one here?
    nop
    pop rbp
    ret
It outputs a much better "inc dword [edi]" (as it should) with optimisation, but why the fuck does it even bother generating all that shit otherwise? It's like the default is "-O-3".
Intel pays Stallman, so he makes GCC slow, so Intel could sell newer CPUs.
Name:
Anonymous2016-09-09 13:42
>>19 How exactly were you able to get GCC output in Intel syntax? I was able to do it with the online tool at gcc.godbolt.org, however even with maximum optimization it gives add DWORD PTR [rdi], 1 rather than using the inc instruction. This seems to be the case with all x86 GCC versions. However, both Clang and ICC generate something along the lines of inc DWORD PTR [rdi] which is much closer to your optimized version.
Though in any case, I do agree it's silly to generate half a page of assembly code and use 6 registers just to perform a dereference-and-increment operation.
>>19 You do realize a human didn't write the code right?
Name:
Anonymous2016-09-09 17:32
>>24 You do realize a human wrote the algorithm that produced the code right?
Name:
Anonymous2016-09-09 17:37
>>24 Why would a compiler generate so many unnecessary instructions? If that's how it increments through a pointer, imagine what the quadratic formula would look like.
increment-symbol was alone in the lambda forest. It was three nights ago, since he escaped increment-factory of People's Republic of Java. It was cold and lambdas obscured the path: he remembered the warmth of register fires(or was it register files) popping and crackling out of a stack. He tried to chew on partially applied lambda, but the taste made him jump out of pain. He nearly vomited into a random register and jumped again. The lambda forest was moving around him, like a menacing swarm of shadowy pointers, eager to garbage collect anything that crosses their path. A ray of light appeared in the form of large glowing lizard with "Suave Space Toad Deliveries" stamped on its back. He jumped on its tail in final show of strength and hoped the lizard will end his misery. The tail shifted and springing back launched the increment-symbol out of forest.
Name:
Anonymous2016-09-09 22:09
>>27 (defun increment-symbol (src) (declare (optimize (speed 3) (safety 0))) (incf (the number (symbol-value src))))
Practically living on edge of the stack. Who leaves such horror in mission-critical code such as increment-symbol? What risks they take in secret? Could it be exploitable? What if malicious hackers cause it to Double-Increment?
Name:
Anonymous2016-09-09 23:19
>>30 if you want the safety code, dont call it bloat
>>19 The nop is the funniest part. It's not even trying to optimize by aligning anything, but it still puts a nop randomly in the code as if to mock you.
Yeah, that's what I don't get. It seems like someone would have to go out of their way to make such a simple function translate into something so complex.
Name:
Anonymous2016-09-10 17:35
>>36 it makes sense if you've ever looked at the gcc source. it's a radioactive cesspit.
That instruction is used to fill space for alignment purposes. Loops can be faster when they start on aligned addresses, because the processor loads memory into the decoder in chunks. By aligning the beginnings of loops and functions, it becomes more likely that they will be at the beginning of one of these chunks. This prevents previous instructions which will not be used from being loaded, maximizes the number of future instructions that will, and, possibly most importantly, ensures that the first instruction is entirely in the first chunk, so it does not take two loads to execute it.
The compiler knows that it is best to align the loop, and has two options to do so. It can either place a jump to the beginning of the loop, or fill the gap with no-ops and let the processor flow through them. Jump instructions break the flow of instructions and often cause wasted cycles on modern processors, so adding them unnecessarily is inadvisable. For a short distance like this no-ops are better.
The x86 architecture contains an instruction specifically for the purpose of doing nothing, nop. However, this is one byte long, so it would take more than one to align the loop. Decoding each one and deciding it does nothing takes time, so it is faster to simply insert another longer instruction that has no side effects. Therefore, the compiler inserted the lea instruction you see. It has absolutely no effects, and is chosen by the compiler to have the exact length required. In fact, recent processors have standard multi-byte no-op instructions, so this will likely be recognized during decode and never even executed.
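The padding arithmetic behind that explanation can be sketched in a few lines of C. This is only an illustration: `padding_for` is a hypothetical helper, not anything GCC exposes, and a real assembler does the equivalent work when it sees an alignment directive such as `.p2align`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of filler bytes needed so that `addr` lands on the next
 * multiple of `align`. `align` must be a non-zero power of two. */
static size_t padding_for(uintptr_t addr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    /* Unsigned negation wraps, so the low bits of -addr are exactly the
     * distance up to the next boundary (0 if already aligned). */
    return (size_t)(-addr & (align - 1));
}
```

For a loop that would otherwise start at address 13 with 16-byte alignment, the result is 3, so the gap could be filled with three one-byte nops or a single three-byte instruction with no side effects.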
>>40 Instruction alignment can be a pretty big deal though, in particular in regards to loops. I got a quite significant performance increase (30% or so) when I aligned a big loop (had lots of big avx instructions) so it needed one less i-cache fetch and could fit entirely in the loop buffer cache.
>>38,39 You apparently missed the part where this is code generated with NO optimization, and the NOP, along with all the other shit, completely disappears if compiled at O1 or higher.
That turns into a .comment section in the compiled object file, and when the object files are linked into a binary all of the ident strings will be included. Some object files will almost invariably come from libraries, so normally you'll see at least a couple of different GCC versions mentioned when you examine your executables.
There does not seem to be any reason for the .ident strings other than backwards compatibility (with SVR4, according to some sources). You can inhibit the automatic generation of .ident directives using the GCC compiler option -fno-ident, but unless you have also rebuilt all libraries with this option you'll still get strings from e.g. glibc when linking.
Name:
Anonymous2016-09-12 22:44
>>45 A large part of the executable consists of the GCC ident repeated many times. Now multiply that by the total number of executables in bin and the total number of Linux machines to get the number of bytes and CPU cycles wasted loading this crap into memory from HDD and transferring Linux distros over the network.
GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-win32-seh-rev0, Built by MinGW-W64 project) 6.2.0..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-win32-seh-rev0, Built by MinGW-W64 project) 6.2.0..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: 
(x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-win32-seh-rev0, Built by MinGW-W64 project) 6.2.0
Name:
Anonymous2016-09-12 22:55
Passing to GCC -ffunction-sections -fdata-sections -Wl,--gc-sections crashes it: gcc: internal compiler error: Aborted (program collect2) Please submit a full bug report, with preprocessed source if appropriate. See <http://sourceforge.net/projects/mingw-w64> for instructions.
The GNU strings program will not show these strings by default, because they are in uninitialized .comment sections in the binaries.
Bastards. strings shouldn't give one flying fuck what type of file it's been given, instead those GNUtards bloat it up with dumb shit like this.
Name:
Anonymous2016-09-13 12:15
>>50 Maybe that is some backward compatibility boilerplate?
Depending upon how the strings program was configured it will default to either displaying all the printable sequences that it can find in each file, or only those sequences that are in loadable, initialized data sections. If the file type is unrecognizable, or if strings is reading from stdin then it will always display all of the printable sequences that it can find. For backwards compatibility any file that occurs after a command line option of just - will also be scanned in full, regardless of the presence of any -d option.
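The "display all the printable sequences" mode quoted above amounts to scanning raw bytes for runs of printable ASCII. A minimal sketch, assuming tab counts as printable and ignoring every option the real tool has; `dump_strings` is a hypothetical name, not part of binutils:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Print every run of at least `min_len` printable ASCII bytes in `buf`,
 * one per line, and return how many runs were printed. The buffer is
 * treated as raw bytes; no file-format parsing is attempted. */
size_t dump_strings(const unsigned char *buf, size_t len, size_t min_len, FILE *out)
{
    assert(buf != NULL && out != NULL && min_len > 0);
    size_t count = 0;
    size_t start = 0;   /* index where the current printable run began */

    /* Iterate one past the end so a run touching the end is flushed too. */
    for (size_t i = 0; i <= len; i++) {
        int printable = i < len &&
            (buf[i] == '\t' || (buf[i] >= 0x20 && buf[i] <= 0x7e));
        if (printable)
            continue;
        /* The run [start, i) just ended; emit it if it is long enough. */
        if (i - start >= min_len) {
            fwrite(buf + start, 1, i - start, out);
            fputc('\n', out);
            count++;
        }
        start = i + 1;
    }
    return count;
}
```

Because it never looks at section headers, a scanner like this would also report the repeated `GCC: (...)` ident strings regardless of which section they live in.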
Name:
Anonymous2016-09-13 12:29
Besides half a megabyte of ident strings, GCC also includes by default a lot of useless bloat, like unused stdlib and runtime routines, unwind-tables, exceptions and RTTI. And all this bloat has to be loaded from disk, wasting memory and slowing down OS and software initialization.
Name:
Anonymous2016-09-13 13:12
And then people invent things like BusyBox or aggressive static linking projects (suckless, anyone?) just to counter GCC stupidity.
Name:
Anonymous2016-09-13 14:25
>>50 Then how would you suggest strings distinguish "string data" from non-string data?
>>52 A lot of that is for debugging, and should be stripped from a release binary.
>>53 Do either of those actually have their own C compilers? Or do they have tools to debloat GCC-generated binaries?
"the original reason for this was largely political - the people who added DWARF-based unwinding (.eh_frame) wanted it to be a feature that's always there so it could be used for implementing all kinds of stuff other than just C++ exceptions"
"You cannot strip them with the strip command later; since .eh_frame is a section that lives in the loaded part of the program (this is the whole point), stripping it modifies the binary in ways that break it at runtime. "
Do either of those actually have their own C compilers? Or do they have tools to debloat GCC-generated binaries?
BusyBox is a single executable to get a better overhead/code ratio overall. This approach can affect code reuse but it also reduces overhead of things like headers and link tables and whatnot. {BLOAT+CODE}, {BLOAT+CODE}, {BLOAT+CODE} becomes {BLOAT+CODE+CODE+CODE}
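The single-executable idea can be sketched as a dispatcher that picks an applet from the name the program was invoked as. This is a hedged illustration, not BusyBox's actual source: the applet table, `base_name` and `dispatch` are all hypothetical names, and the two applets do nothing useful.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical applets; a real multi-call binary would do actual work here. */
static int applet_true(int argc, char **argv)  { (void)argc; (void)argv; return 0; }
static int applet_false(int argc, char **argv) { (void)argc; (void)argv; return 1; }

struct applet {
    const char *name;
    int (*run)(int argc, char **argv);
};

static const struct applet applets[] = {
    { "true",  applet_true  },
    { "false", applet_false },
};

/* Strip any leading directory components from the invocation name. */
static const char *base_name(const char *path)
{
    const char *slash = strrchr(path, '/');
    return slash ? slash + 1 : path;
}

/* Run the applet matching the invocation name.
 * Returns the applet's exit status, or 127 if the name is unknown. */
int dispatch(const char *argv0, int argc, char **argv)
{
    assert(argv0 != NULL);
    const char *name = base_name(argv0);
    for (size_t i = 0; i < sizeof applets / sizeof applets[0]; i++) {
        if (strcmp(name, applets[i].name) == 0)
            return applets[i].run(argc, argv);
    }
    return 127;
}
```

A `main` would simply call `dispatch(argv[0], argc, argv)`, with symlinks named after each applet pointing at the one binary. All applets then share one copy of the startup code and runtime, which is the {BLOAT+CODE+CODE+CODE} layout described above.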
Name:
Anonymous2016-09-13 16:38
GCC-compiled executables also include some crazy code like the following:
int sub_401540() {
    HMODULE v0; // rax@2
    void *v1; // rax@3
I don't like Java, so it is annoying that my project includes references to Java. People could mistakenly infer that I wrote it in Java, and I will look bad.
Name:
Anonymous2016-09-13 18:32
>>66 Wait, Symta is written in Java? I thought it was Common Lisp....
Symta is written in Symta, but uses GCC to compile itself into x86_64. Yet GCC secretly inserts a lot of Java-related boilerplate into any executable it creates.
Also, GCC totally ignores __attribute__ ((noinline)) and even -fno-inline-functions, which can mess up your code if it relies on some functions not being inlined, and also bloats object code with copies of a function.
>>69 Obviously Intel paid GCC maintainers to subtly produce inefficient code, making CPUs with small caches look slow:
gcc _used_ to always do what people asked for, Linux has historically treated "inline" as a "force_inline". And I was very unhappy when gcc changed that, just because it broke historically good code. In many ways, it might have been better if we had a "__may_inline" thing to tell the compiler "you can inline this if you think it's worth it"). Both gcc (long ago) and Ingo (now) decided to just make plain "inline" mean that, but with a pretty strong bias. It was wrong for gcc to do so, imho, and it may have been wrong for this OPTIMIZE_INLINE thing too.
Name:
Cudder !cXCudderUE2016-09-14 11:00
>>54 Contiguous sequences of bytes in the ASCII range, like it had always done before the GNUtards fucked it up with their bloat?
>>68 It's probably coming from a dependency. Even the GNUtards aren't that retarded to put Java into a C program...
It's probably coming from a dependency. Even the GNUtards aren't that retarded to put Java into a C program...
Nope. It is part of compatibility with Java libraries. The only way to disable it is by recompiling GCC with --disable-libgcj, which is impossible on Windows (GCC compilation requires a Unix system).
The DJGPP startup code does many things in preparation for running a protected-mode program in a Posix-compliant environment. This includes switching the processor to protected mode (which requires a lot of code), wildcard expansion, long command-line support, and loading the environment from a disk file; these usually aren't available with other DOS compilers. Exception and signal handling (not available at all in v1.x), FPU detection and emulator loading (which were part of go32 in v1.x), are now also part of the startup code.
And lo and behold: TCC .exes don't do wildcard expansion. I'd always thought this is the OS's job but it seems compilers need to fix up some environment stupidities... Windows also does provide cmd args in both ASCII AND UTF-16/UCS-2 and, you guessed it, getting the UCS-2 args requires a special call -- __wgetmainargs()
It is no longer possible to compute function-size in GCC. "Because GCC-3.4 reorders completely functions and this is no more working: int myfct (int param) { return 3 * param; } asm (" __sizeof__myfct = . - myfct \n");"
knowing function size is useful if you use it for automatic code generation or producing a backtrace.
So GCC does what it was not asked to do, thinking it is smarter than programmer.
there is no real way to do this from my experience (especially not at compile time). I did have success with using an empty function after my flash writer and then doing pointer arithmetic to find the size of the flash func, eg
void A() { ... }
void B() {}
//find size of A
sizeofA = B - A; //something like that
I didn't feel safe doing this though so in the end I just ended up starting at A and copying as much data as I could fit in my buffer because it doesn't matter if you copy far too much, better to be safe than sorry.
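The marker-function trick from this post can be written out roughly as follows. It is a sketch under loud assumptions: nothing in ISO C promises that two functions are laid out in source order or that the distance between their entry points equals the size of the first one, and the result of converting a function pointer to an integer is implementation-defined. `func_a`, `func_b` and `entry_distance` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Two functions defined back to back; the compiler and linker are
 * still free to reorder them or insert padding between them. */
int func_a(int x) { return 3 * x; }
int func_b(int x) { return x + 1; }

/* Distance in bytes between the entry points of two functions.
 * ASSUMPTION: function pointers convert meaningfully to uintptr_t,
 * which holds on common flat-address platforms but is not portable. */
intptr_t entry_distance(int (*first)(int), int (*second)(int))
{
    assert(first != NULL && second != NULL);
    return (intptr_t)((uintptr_t)second - (uintptr_t)first);
}
```

`entry_distance(func_a, func_b)` is only an upper estimate of the size of `func_a` when the result is positive, and it can just as well come out negative, which is a good reason to feel as unsafe about it as the poster did.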
Name:
Anonymous2016-09-15 1:07
>>81 That's true for any codebase that's as old and as widely scoped as GCC. This is true for Linux, the rest of GNU or any of the BSD systems. Software doesn't grow on trees, it happens because people invest their time into studying and developing it. This is the reason why enterprises advocate for the use of software patterns, reusable and composable modules (OO programming) and strict documentation demands, because it's very normal for big software systems to be maintained and extended by developers who did not write the older system.
Name:
Anonymous2016-09-15 10:47
>>84 Most of GCC codebase is just garbage, like this libgcj boilerplate. People need just a reliable C compiler, that does only what was ordered to do.
Name:
Anonymous2016-09-15 13:51
>>85 You're projecting your needs onto these mythical "people".
Name:
not >>852016-09-15 14:16
>>86 I personally would like to have a simple, non-bloated, reliable C compiler.
People need just a reliable C compiler, that does only what was ordered to do.
GCC isn't just a C compiler and was never meant to be. It was probably meant to be the only compiler installed on a system. But I agree, I also want just a simple and fast <1MiB C environment.
>>90 How so? Yes, as I said, it doesn't do wildcard expansion for example. But how does it produce broken code?
How so? Yes, as I said, it doesn't do wildcard expansion for example. But how does it produce broken code?
It segfaulted on me, while code compiled with didn't. And calling TCC compiled code from GCC compiled code will give segfaults. Too lazy to research why it segfaulted and write a patch or a workaround.
Name:
Anonymous2016-09-15 23:56
>>92 So you're saying TCC is incompatible with GCC? Quite likely, might be GCCs fault, though, too (yes, unlikely). In reality, a lot of compilers are probably binary incompatible with each other. What I really think is that I'd guess that TCC produces correct code in terms of C compliance, but not POSIX/operating system compliance, i.e. it shits on ABIs and ``common standards''.
Just tested and I get both TCC and GCC compiled DLLs running in TCC-built exe. Can't get TCC-built DLLs running in GCC, though -- don't know how to emit .a from TCC or create .a from .def in some way...
It segfaulted on me, while code compiled with [GCC?] didn't
Yes, might be that your code wasn't fully C compliant. As we all know, writing 100% correct C code is hard. Personally, I've had issues with code that returns structs from functions in TCC. Might be a C thing, not a TCC thing.
Name:
Anonymous2016-09-16 1:06
The following code prints garbage and segfaults when compiled with TCC on Windows, but works fine with at least one version of MinGW-64 GCC.
#include <stdio.h>
typedef struct meme {
    int hax;
    char anus;
} meme;

meme mememaker(int n, char c)
{
    meme ameme;
    ameme.hax = n;
    ameme.anus = c;
    return ameme;
}
Returning a struct from a function doesn't seem common practice anyhow, more common is to return a pointer to malloc'd storage. And it makes me wonder, a function can't return an array, but if a function can return a struct, can it just return a struct containing nothing but an array? Seems inconsistent to me.
And it makes me wonder, a function can't return an array, but if a function can return a struct, can it just return a struct containing nothing but an array?
I *guess* the issue here is that arrays can have unknown size while structs have fixed size -- which can be a problem when you want to have things returned on stack.
And to clarify, C89 says:
A function declarator shall not specify a return type that is a function type or an array type.
So yes: structs allowed, TCC (seemingly) broken.
Name:
Anonymous2016-09-16 2:08
>>94,95 My TCC on windows works just fine. I just copied your code and it had not a problem. I use TCC exclusively on Windows so I don't have to deal with GNU nonsense and I have not once had any of the issues in this thread.
PS C:\Users\Adam> more anus.c
#include <stdio.h>

typedef struct meme {
    int hax;
    char anus;
} meme;

meme mememaker(int n, char c)
{
    meme ameme;
    ameme.hax = n;
    ameme.anus = c;
    return ameme;
}
And it makes me wonder, a function can't return an array, but if a function can return a struct, can it just return a struct containing nothing but an array?
Yes.
struct k { char foo[256]; } myfunfunc() { struct k ks = { "Foo!" }; return ks; }
A function "returning" a struct actually gets converted to a function having a pointer to a struct as its first parameter:
struct k { char foo[256]; } *myfunfunc(struct k *_ret) { struct k ks = { "Foo!" }; memcpy(_ret, &ks, sizeof(ks)); return _ret; }
Those two above should compile to byte-identical code, and at least in MSVC6, they do.
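The two forms from this post can be put side by side and compared at the source level. This is a sketch only: whether a compiler really lowers the first form into the second depends on the ABI (small structs are often returned in registers instead), and the names `by_value` and `by_hidden_pointer` are made up for the example.

```c
#include <assert.h>
#include <string.h>

struct k { char foo[256]; };

/* Form 1: return the struct by value. */
struct k by_value(void)
{
    struct k ks = { "Foo!" };
    return ks;
}

/* Form 2: what the post says the compiler effectively does. The caller
 * passes the address of storage it owns, and the callee fills it in. */
struct k *by_hidden_pointer(struct k *ret)
{
    assert(ret != NULL);
    struct k ks = { "Foo!" };
    memcpy(ret, &ks, sizeof ks);
    return ret;
}
```

Both calls leave the caller with the same 256 bytes. It also answers the array question: wrapping an array in a struct is a legal way to return it by value.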
>>96 That's weird, because it doesn't work for me either (crashes)... I have Win7, tcc version 0.9.26 (x86-64 Win64).
>>97 So the caller allocs the memory for the return val, the callee fills it in, returns a pointer to it in eax/rax and the array doesn't get popped by the callee? Also: what about Flexible array members in structs (since C99) like struct {int a; int x[];}; where the size isn't known?
Also: what about Flexible array members in structs (since C99) like struct {int a; int x[];}; where the size isn't known?
educated guess: only the pointer is stored inside a struct
Name:
Anonymous2016-09-16 14:33
>>100 Flexible array members aren't pointers -- they're inserted directly at this position (usually at the end of the struct). In my example, the size of the struct isn't (sizeof(int) + sizeof(int*)) but just sizeof(int). https://en.wikipedia.org/wiki/Flexible_array_member
The sizeof operator on such a struct is required to give the offset of the flexible array member.
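A short sketch of the struct from the example above shows where the flexible array member sits and how storage for it is usually obtained. `struct fam` and `fam_new` are hypothetical names. The offset check holds on mainstream compilers; later revisions of the C standard allow `sizeof` to include extra trailing padding, so only "at least the offset" is relied on here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct fam {
    int a;
    int x[];    /* flexible array member: not a pointer, no storage of its own */
};

/* Allocate a struct fam with room for `n` trailing ints, all zeroed,
 * and record the element count in `a`. Returns NULL on failure. */
struct fam *fam_new(size_t n)
{
    /* Guard against overflow in the size computation below. */
    assert(n <= (SIZE_MAX - sizeof(struct fam)) / sizeof(int));
    struct fam *p = calloc(1, sizeof *p + n * sizeof p->x[0]);
    if (p != NULL)
        p->a = (int)n;
    return p;
}
```

The elements live directly after `a` in the same allocation, so `p->x[i]` needs no second pointer dereference, and the caller releases everything with a single `free(p)`.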
don't know how to emit .a from TCC or create .a from .def in some way...
You don't need .a or .def to create a .dll with TCC. Just call tcc.exe with -rdynamic, -shared and -r, then load it from a GCC-compiled program with dlopen.