
Embeddable GCC

Name: Anonymous 2016-09-08 14:56

Is there a small version of GCC for shipping with a Windows program? A MinGW installation takes a frigging gigabyte, including a lot of bloat like the C++ and Fortran compilers, plus useless crap like DirectX bindings.

Name: Anonymous 2016-09-10 20:34

>>40
Instruction alignment can be a pretty big deal though, in particular in regards to loops.
I got a quite significant performance increase (30% or so) when I aligned a big loop (had lots of big avx instructions) so it needed one less i-cache fetch and could fit entirely in the loop buffer cache.

Name: Anonymous 2016-09-10 22:50

>>38,39
You apparently missed the part where this is code generated with NO optimization, and the NOP, along with all the other shit, completely disappears if compiled at O1 or higher.

Name: Anonymous 2016-09-12 18:42

Reduced GCC size even further, but now it has no standard library (which was huge, over 7 megabytes).
https://github.com/saniv/symta-releases

Name: Anonymous 2016-09-12 19:50

Fucking dubs. Check 'em!

Name: Anonymous 2016-09-12 22:21

Wow! GCC bloats executables on purpose!
http://www.trilithium.com/johan/2004/12/gcc-ident-strings/
GCC behaviour to automatically generate an .ident directive into the assembly output:

.ident "GCC: (GNU) 4.0.0 20041214 (experimental)"

That turns into a .comment section in the compiled object file, and when the object files are linked into a binary all of the ident strings will be included. Some object files will almost invariably come from libraries, so normally you'll see at least a couple of different GCC versions mentioned when you examine your executables.

There does not seem to be any reason for the .ident strings other than backwards compatibility (with SVR4, according to some sources). You can inhibit the automatic generation of .ident directives using the GCC compiler option -fno-ident, but unless you have also rebuilt all libraries with this option you'll still get strings from e.g. glibc when linking.

Name: Anonymous 2016-09-12 22:44

>>45
A large part of the executable consists of the GCC ident repeated many times. Now multiply it by the total number of executables in bin and the total number of Linux machines to get the number of bytes and CPU cycles wasted loading this crap into memory from HDD and transferring Linux distros over the network.

GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-win32-seh-rev0, Built by MinGW-W64 project) 6.2.0..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-win32-seh-rev0, Built by MinGW-W64 project) 6.2.0..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: 
(x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 4.9.3..GCC: (x86_64-win32-seh-rev0, Built by MinGW-W64 project) 6.2.0
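
To put a number on a dump like the one above, something along these lines could count the ident strings in a binary that has been read into memory. This is only a sketch: the function name count_gcc_idents and the choice of "GCC: (" as the marker are my own assumptions, based on what the strings above look like.

```c
#include <stddef.h>
#include <string.h>

/* Count how many times the marker "GCC: (" occurs in a raw byte buffer.
   The buffer may contain NUL bytes, so memcmp is used instead of strstr.
   The name and the marker are illustrative assumptions, not part of GCC. */
size_t count_gcc_idents(const unsigned char *buf, size_t len)
{
    static const char marker[] = "GCC: (";
    const size_t mlen = sizeof marker - 1;   /* length without the NUL */
    size_t count = 0;

    for (size_t i = 0; i + mlen <= len; i++) {
        if (memcmp(buf + i, marker, mlen) == 0) {
            count++;
            i += mlen - 1;                   /* continue after this match */
        }
    }
    return count;
}
```

Feed it the contents of an .exe (fread into a buffer) and multiply the result by the typical string length to estimate how much of the file is ident text.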

Name: Anonymous 2016-09-12 22:55

Passing to GCC -ffunction-sections -fdata-sections -Wl,--gc-sections crashes it:
gcc: internal compiler error: Aborted (program collect2)
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://sourceforge.net/projects/mingw-w64> for instructions.

Name: Anonymous 2016-09-13 10:02

>>47
Works on my machine. GCC 5.1.0

Name: Anonymous 2016-09-13 10:05

int __attribute__((used)) this_function_will_not_be_gced(void) { int GCC = 100; return GCC; }
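
A slightly fuller sketch of the same idea, for anyone who wants to try the flags from >>47. The file layout, function names and build line are made up for illustration. As far as I know, `used` is only a promise from the compiler that code gets emitted; whether the linker's --gc-sections then keeps that section is a separate question (newer GCC releases document a separate `retain` attribute for that on ELF targets), so check the result with objdump or nm rather than trusting the name.

```c
/* Hypothetical demo file. A build line along these lines is assumed:
     gcc -Os -ffunction-sections -fdata-sections -Wl,--gc-sections demo.c
   All names below are invented for the example. */

/* 'used' tells the compiler to emit this function even though nothing
   in this translation unit references it. */
static int __attribute__((used)) kept_by_compiler(void)
{
    return 100;
}

/* Never referenced and not marked: with -ffunction-sections it lands in
   its own section, which --gc-sections is then free to discard. */
int candidate_for_gc(void)
{
    return 1;
}
```

Comparing the symbol table with and without -Wl,--gc-sections shows which of the two actually survives on a given toolchain.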

Name: Cudder !cXCudderUE 2016-09-13 11:14

>>45
The GNU strings program will not show these strings by default, because they are in uninitialized .comment sections in the binaries.
Bastards. strings shouldn't give one flying fuck what type of file it's been given, instead those GNUtards bloat it up with dumb shit like this.

Name: Anonymous 2016-09-13 12:15

>>50
Maybe that is some backward compatibility boilerplate?
Depending upon how the strings program was configured it will default to either displaying all the printable sequences that it can find in each file, or only those sequences that are in loadable, initialized data sections. If the file type is unrecognizable, or if strings is reading from stdin then it will always display all of the printable sequences that it can find. For backwards compatibility any file that occurs after a command line option of just - will also be scanned in full, regardless of the presence of any -d option.
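
For reference, the "display all the printable sequences" mode described above boils down to roughly the following. This is a minimal sketch, not the binutils implementation; the function name and the exact definition of "printable" (0x20..0x7E plus tab) are my assumptions.

```c
#include <stddef.h>

/* Count runs of at least min_len consecutive printable ASCII bytes.
   "Printable" here means 0x20..0x7E plus tab, which is an assumption of
   this sketch. min_len is expected to be at least 1. */
size_t count_printable_runs(const unsigned char *buf, size_t len, size_t min_len)
{
    size_t runs = 0;
    size_t cur = 0;                      /* length of the current run */

    for (size_t i = 0; i < len; i++) {
        unsigned char c = buf[i];
        if ((c >= 0x20 && c <= 0x7E) || c == '\t') {
            cur++;
        } else {
            if (cur >= min_len)
                runs++;
            cur = 0;
        }
    }
    if (cur >= min_len)                  /* run that ends at end of buffer */
        runs++;
    return runs;
}
```

A scan like this never looks at the file format at all, which is why it would also pick up text from .comment sections.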

Name: Anonymous 2016-09-13 12:29

Besides half a megabyte of ident strings, GCC also includes by default a lot of useless bloat, like unused stdlib and runtime routines, unwind-tables, exceptions and RTTI. And all this bloat has to be loaded from disk, wasting memory and slowing down OS and software initialization.

Name: Anonymous 2016-09-13 13:12

And then people invent things like BusyBox or aggressive static linking projects (suckless, anyone?) just to counter GCC stupidity.

Name: Anonymous 2016-09-13 14:25

>>50
Then how would you suggest strings distinguish "string data" from non-string data?

>>52
A lot of that is for debugging, and should be stripped from a release binary.

>>53
Do either of those actually have their own C compilers? Or do they have tools to debloat GCC-generated binaries?

Name: Anonymous 2016-09-13 14:33

>>53

"the original reason for this was largely political - the people who added DWARF-based unwinding (.eh_frame) wanted it to be a feature that's always there so it could be used for implementing all kinds of stuff other than just C++ exceptions"

Name: Anonymous 2016-09-13 14:34

>>54
A lot of that is for debugging, and should be stripped from a release binary.
see >>55

it is done on purpose to bloat the resulting executable.

Name: Anonymous 2016-09-13 14:35

>>54

"You cannot strip them with the strip command later; since .eh_frame is a section that lives in the loaded part of the program (this is the whole point), stripping it modifies the binary in ways that break it at runtime. "

Name: Anonymous 2016-09-13 14:47

>>54
Do either of those actually have their own C compilers? Or do they have tools to debloat GCC-generated binaries?
BusyBox is a single executable to get a better overhead/code ratio overall. This approach can affect code reuse but it also reduces overhead of things like headers and link tables and whatnot.
{BLOAT+CODE}, {BLOAT+CODE}, {BLOAT+CODE}
becomes
{BLOAT+CODE+CODE+CODE}
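
The usual way a multi-call binary like that picks its behaviour is by looking at the name it was invoked under. A rough sketch of the dispatch, with an invented applet table and invented function names (this is not BusyBox source):

```c
#include <stddef.h>
#include <string.h>

typedef int (*applet_fn)(void);

/* Two toy applets; real ones would take argc/argv. */
static int applet_true(void)  { return 0; }
static int applet_false(void) { return 1; }

static const struct {
    const char *name;
    applet_fn   run;
} applets[] = {
    { "true",  applet_true  },
    { "false", applet_false },
};

/* Strip any leading directory part from the invocation name. */
static const char *base_name(const char *path)
{
    const char *slash = strrchr(path, '/');
    return slash ? slash + 1 : path;
}

/* Run the applet matching argv0; returns its status, or -1 if unknown. */
int run_applet(const char *argv0)
{
    const char *name = base_name(argv0);

    for (size_t i = 0; i < sizeof applets / sizeof applets[0]; i++) {
        if (strcmp(applets[i].name, name) == 0)
            return applets[i].run();
    }
    return -1;
}
```

main() would then just be return run_applet(argv[0]); with each tool name symlinked to the one executable, so startup code, headers and tables are paid for once.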

Name: Anonymous 2016-09-13 16:38

A GCC-compiled executable also includes some crazy code like the following:
int sub_401540()
{
  HMODULE v0; // rax@2
  void *v1; // rax@3

  if ( qword_4030A0 )
  {
    v0 = GetModuleHandleA("libgcj-16.dll");
    if ( !v0 )
    {
      v1 = sub_401530;
      goto LABEL_4;
    }
    v1 = GetProcAddress(v0, "_Jv_RegisterClasses");
    if ( v1 )
    {
LABEL_4:
      ((void (__fastcall *)(__int64 *))v1)(&qword_4030A0);
      return sub_4017D0(sub_4015A0);
    }
  }
  return sub_4017D0(sub_4015A0);
}


Why? Why does my printf("Hello, World!\n") need all this?

Name: Anonymous 2016-09-13 16:43

>>4
using Windows and complaining about non-free software
ISHYGDDT

Name: Anonymous 2016-09-13 17:00

>>60

VS is okay as a tool, but you can't embed it into your own software without permission from Microsoft. And even if you get permission, VS is huge.

Name: Anonymous 2016-09-13 17:31

>>61
VS is an IDE, not a compiler.

Name: Anonymous 2016-09-13 17:32

>>59
Liar.

Name: Anonymous 2016-09-13 17:33

>>62
IDE includes compiler.

Name: Anonymous 2016-09-13 17:34

>>64
Yes, and you happen to need only the compiler which is small.

Name: Anonymous 2016-09-13 17:36

>>63
Download it and disassemble for yourself (symta.exe): https://github.com/saniv/symta-releases/blob/master/symta-0.0.3-3-w64.zip

I don't like Java, so it is annoying that my project includes references to Java. People could mistakenly infer that I wrote it in Java, and I will look bad.

Name: Anonymous 2016-09-13 18:32

>>66
Wait, Symta is written in Java? I thought it was Common Lisp....

Name: Anonymous 2016-09-13 18:46

>>67

Symta is written in Symta, but uses GCC to compile itself into x86_64. Yet GCC secretly inserts a lot of Java-related boilerplate into any executable it creates.

Also, GCC totally ignores __attribute__ ((noinline)) and even -fno-inline-functions, which can mess up your code if it relies on some functions not being inlined, and it also bloats object code with copies of a function.
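
For comparison, here is what the GCC manual suggests for a function that really must stay out of line. The function names are invented for the example. The manual notes that noinline alone does not stop other optimizations from removing calls to a side-effect-free function and recommends an empty asm as a dummy side effect; noclone is the documented attribute against specialized copies. Whether this cures the behaviour described above on a particular GCC version is something to verify in the disassembly.

```c
/* Hypothetical helper that should exist exactly once and always be
   reached through a real call instruction. */
static int __attribute__((noinline, noclone)) scale_by_three(int param)
{
    __asm__ volatile ("");   /* empty asm: acts as a side effect so calls
                                are not optimized away as pure arithmetic */
    return 3 * param;
}

/* Two call sites; with the attributes above the intent is that both
   remain calls to the single copy of scale_by_three. */
int call_twice(int x)
{
    return scale_by_three(x) + scale_by_three(x + 1);
}
```

Building with -O2 and looking at objdump -d is the quickest way to see whether call_twice still contains two calls.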

Name: Anonymous 2016-09-13 21:17

>>68
Linus Torvalds was already saying that GCC is crap back in 2000
http://yarchive.net/comp/linux/gcc_inline.html

Name: Anonymous 2016-09-13 21:23

>>69
Obviously Intel paid GCC maintainers to subtly produce inefficient code, making CPUs with small caches look slow:
gcc _used_ to always do what people asked for, Linux has historically treated "inline" as a "force_inline". And I was very unhappy when gcc changed that, just because it broke historically good code. In many ways, it might have been better if we had a "__may_inline" thing to tell the compiler "you can inline this if you think it's worth it"). Both gcc (long ago) and Ingo (now) decided to just make plain "inline" mean that, but with a pretty strong bias. It was wrong for gcc to do so, imho, and it may have been wrong for this OPTIMIZE_INLINE thing too.

Name: Cudder !cXCudderUE 2016-09-14 11:00

>>54
Contiguous sequences of bytes in the ASCII range, like it had always done before the GNUtards fucked it up with their bloat?

>>68
It's probably coming from a dependency. Even the GNUtards aren't that retarded to put Java into a C program...

>>69
I prefer this one:
https://lkml.org/lkml/2014/7/24/584

Name: Anonymous 2016-09-14 11:09

C-dder is all talk and no action.

Name: Anonymous 2016-09-14 11:09

>>71
It's probably coming from a dependency. Even the GNUtards aren't that retarded to put Java into a C program...
Nope. It is part of compatibility with Java libraries. The only way to disable it is by recompiling GCC with --disable-libgcj, which is impossible on Windows (GCC compilation requires a Unix system).

Name: Anonymous 2016-09-14 14:11

Name: le segfault face !yKlKAT7mo. 2016-09-14 18:53

When will GCC support C++17?

Name: Anonymous 2016-09-14 19:05

>>74

That is Cygwin (Linux emulator), not Windows.

Name: Anonymous 2016-09-14 20:09

>>76
Cygwin is not an emulator, it's a POSIX->Win32 translation DLL along with a Windows port of GCC and other GNU software.

Name: Anonymous 2016-09-14 22:58

>>77
Cygwin is not an emulator
Then it would be called CINE, by analogy to WINE.

Name: Anonymous 2016-09-14 23:17

So I just did a small comparison of GCC vs TCC (on Win 7) and it seems some of the bloat in GCC makes sense, see: http://www.delorie.com/djgpp/v2faq/faq8_14.html
The DJGPP startup code does many things in preparation for running a protected-mode program in a Posix-compliant environment. This includes switching the processor to protected mode (which requires a lot of code), wildcard expansion, long command-line support, and loading the environment from a disk file; these usually aren't available with other DOS compilers. Exception and signal handling (not available at all in v1.x), FPU detection and emulator loading (which were part of go32 in v1.x), are now also part of the startup code.
And lo and behold: TCC .exes don't do wildcard expansion. I'd always thought this was the OS's job, but it seems compilers need to fix up some environment stupidities...
Windows also does provide cmd args in both ASCII AND UTF-16/UCS-2 and, you guessed it, getting the UCS-2 args requires a special Win32 call -- __wgetmainargs()

>>75
Probably not before 2017...?

Name: Anonymous 2016-09-15 0:33

It is no longer possible to compute function-size in GCC.
"Because GCC-3.4 reorders completely functions and this is no more working:
int myfct (int param) {
    return 3 * param;
}
asm (" __sizeof__myfct = . - myfct \n");
"

Knowing the function size is useful if you use it for automatic code generation or for producing a backtrace.

So GCC does what it was not asked to do, thinking it is smarter than the programmer.
