C troglodytes absolutely annihilated by a toy language

Name: Anonymous 2020-02-04 13:24

Name: Anonymous 2020-02-04 14:04

> smash a C program that was looked at by thousands of eyes
*2 eyes: http://git.savannah.gnu.org/cgit/coreutils.git/log/src/wc.c

also lmao
> implying unix hackers know what they're doing
win32 gang represent

Name: Anonymous 2020-02-04 14:16

it's not a real wc replacement, as it is not able to read files or handle any of the command-line switches. also, I guess if GNU wc is optimized for something, it's optimized for disk I/O

Name: Anonymous 2020-02-04 14:23

>>3
Disk I/O has not been a valid metric since the advent of SSD storage. Admit your role.

Name: Anonymous 2020-02-04 14:26

>>4
SSD I/O is still a thing, plus most people and most servers don't have an SSD-only setup

Name: Anonymous 2020-02-04 16:06

Haskell is unpredictable. C is predictable.

Name: Anonymous 2020-02-04 16:48

Is wc reinvented in Haskell every other year or what? I swear I've already read this article at least twice before.

Name: Anonymous 2020-02-04 19:44

>>7
Me too. Must be some Mandela effect thing, or just Haskell fans doing their usual thing.

Name: Anonymous 2020-02-04 23:10

>>7
It is supposedly an improvement based on https://chrispenner.ca/posts/wc

Name: Anonymous 2020-02-05 7:58

My wordcount is 36x faster than wc on large files (to benchmark, use some Linux .iso), but of course it doesn't handle Unicode (I don't support Unicode on principle. Use <wchar.h> and replace isspace with iswspace from <wctype.h> if you need Unicode support).
http://void.wikidot.com/code:wordcount-c
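
For anyone wondering what the isspace to iswspace swap would roughly look like, here is a minimal sketch. count_words_wide is a made-up name and is not taken from wordcount.c; it assumes the text has already been decoded into wchar_t (for example with fgetwc after a setlocale call), and which characters iswspace accepts depends on the active locale.

```c
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper, not taken from wordcount.c: counts words in a
   NUL-terminated wide string, using iswspace() as the separator test. */
size_t count_words_wide(const wchar_t *s) {
    size_t words = 0;
    int last_space = 1; /* treat start of input as "after whitespace" */
    for (; *s != L'\0'; s++) {
        int cur_space = iswspace((wint_t)*s) != 0;
        /* a word starts on every whitespace -> non-whitespace transition */
        words += last_space && !cur_space;
        last_space = cur_space;
    }
    return words;
}
```

Calling count_words_wide(L"one two\tthree\n") should count three words. Non-ASCII separators are only recognized if the locale's character classes say so.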

Name: Anonymous 2020-02-05 8:14

>>10
Note that UTF-8 Unicode files (as long as they're pure UTF-8) CAN be counted by wordcount.c. Any wide characters or mixed Unicode formats will be miscounted, as will some local multi-byte encodings.
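
Why this probably holds: every byte of a multi-byte UTF-8 sequence is 0x80 or above, so it can never collide with an ASCII whitespace byte. A minimal sketch of a byte-wise counter under that assumption follows; count_words_bytes is a hypothetical name, not the code from wordcount.c, and it deliberately ignores bytes >= 0x80 so a Latin-1 style locale can't treat a continuation byte like 0xA0 as a space.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not the code from wordcount.c: counts words in a
   byte buffer, treating only ASCII whitespace as a separator. */
size_t count_words_bytes(const char *buf, size_t len) {
    size_t words = 0;
    int last_space = 1; /* treat start of input as "after whitespace" */
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)buf[i];
        /* bytes >= 0x80 belong to multi-byte sequences: never a separator */
        int cur_space = c < 0x80 && isspace(c) != 0;
        words += last_space && !cur_space;
        last_space = cur_space;
    }
    return words;
}
```

Feeding it the 13 UTF-8 bytes of "naïve café\n" should count two words. Words separated only by non-ASCII spaces (U+00A0, U+3000 and so on) get merged, which is the price of staying byte-wise.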

Name: Anonymous 2020-02-05 8:17

> With some quite minor tweaks
Stopped reading. Java implementations already do better with tweaks.
>>10
> doesn't handle Unicode
Then it doesn't work. Word counting is simpler if you only look for 0x20, while forgetting all about 0x0B & 0x09
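
To make the 0x20-only complaint concrete, here is a rough sketch comparing a counter that splits only on 0x20 with one that uses isspace, which in the "C" locale also accepts 0x09, 0x0A, 0x0B, 0x0C and 0x0D. All three function names are invented for illustration and don't come from any of the programs linked in this thread.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical separator tests, for illustration only. */
int is_plain_space(int c) { return c == 0x20; }
int is_any_space(int c) { return isspace(c) != 0; }

/* Counts words in a NUL-terminated string using the given separator test. */
size_t count_words_with(const char *s, int (*is_sep)(int)) {
    size_t words = 0;
    int last_sep = 1; /* treat start of input as "after a separator" */
    for (; *s != '\0'; s++) {
        int cur_sep = is_sep((unsigned char)*s);
        /* a word starts on every separator -> non-separator transition */
        words += last_sep && !cur_sep;
        last_sep = cur_sep;
    }
    return words;
}
```

On "a\tb\vc d" the 0x20-only test sees two words, while the isspace test sees four.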

Name: Anonymous 2020-02-05 8:19

>>12
It appears to work with pure UTF-8, which is what most Unicode documents on the web use.

Name: Anonymous 2020-02-05 8:56

>>10
if you really want to be fast on large files, use mmap. also. freeze my anus

Name: Anonymous 2020-02-05 9:32

>>14
Not portable, and only 1.001x-1.008x faster (file caching already mmaps everything).
But if you need that extra performance, here it is.
//uwordcount.c utility by FrozenVoid
//UnixWordCount: requires mmap support, Unix system assumed.
//Counts the number of lines/words/bytes in a file (text only).

#include <ctype.h>
#include <stdint.h>
#include <inttypes.h>
#include <stdio.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <sys/mman.h>

int main(int argc,char**argv){
uint64_t words=0,lines=0,bytes=0;
int lastspace=1,curspace=0;

if(argc>2){
fprintf(stderr,"Only one optional argument expected: wordcount [Filename]\n");
return 1;}

if(argc==2){
//file argument: map the whole file and scan it in place
int in=open(argv[1],O_RDONLY);
if(in==-1){perror(argv[1]);return 1;}
struct stat filestat;
if(fstat(in,&filestat)==-1){perror("fstat");return 1;}
size_t size=(size_t)filestat.st_size;
if(size>0){//mmap fails on zero-length files, so skip it for them
unsigned char* buffer=mmap(NULL,size,PROT_READ,MAP_PRIVATE,in,0);
if(buffer==MAP_FAILED){perror("mmap");return 1;}
bytes=size;
for(size_t i=0;i<size;i++){
unsigned char c=buffer[i];//unsigned: isspace on a negative char is undefined
curspace=isspace(c)!=0;
lines+=c=='\n';
words+=lastspace&!curspace;//a word starts at each space->non-space transition
lastspace=curspace;
}
munmap(buffer,size);}
close(in);
}else{//handle pipes/stdin
int c;//int, so EOF can be told apart from a real byte
while((c=fgetc(stdin))!=EOF){bytes++;
curspace=isspace(c)!=0;
lines+=c=='\n';
words+=lastspace&!curspace;
lastspace=curspace;
}}
printf("%"PRIu64" %"PRIu64" %"PRIu64"\n",lines,words,bytes);return 0;}

Name: Anonymous 2020-02-05 10:20

The coreutils wc is slow because it handles wide chars and multiple encodings, with full compatibility with everything that can run under Linux. Look at the multibyte handling sections.
>>2 *10 people
https://github.com/coreutils/coreutils/blob/master/src/wc.c

Name: Anonymous 2020-02-05 10:50

>>16
changing the copyright year is programming
