The HTML5 entities list and code, after cleaning out the dupes and other crap, still turn into >20k of binary (i.e. around 2/3 as big as everything else so far, which includes an HTML tokeniser/parser, DOM viewer, UI, and crude renderer)... the HTML4 ones are a slightly better ~4k.
Since the full list isn't needed to pass Acid2 and would otherwise eat >30% of my 64k budget, I'll stay with the 253 HTML4 entities. It'll be easy to add the rest of them later anyway.
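For a rough feel of the gap between the two entity sets, here is a minimal Python sketch. It uses the tables shipped in Python's standard `html.entities` module as a stand-in (an assumption: the tables in the post are hand-built, and the binary sizes quoted above come from that build, not from this). `decode_html4_entity` and `raw_table_bytes` are hypothetical helper names, and the byte count is only a crude lower bound.

```python
from html.entities import name2codepoint, html5

def decode_html4_entity(name):
    """Look up an HTML4 entity name (without '&' and ';').

    Returns the character, or None if the name is not in the HTML4 table.
    """
    codepoint = name2codepoint.get(name)
    return chr(codepoint) if codepoint is not None else None

def raw_table_bytes(names_to_values):
    """Crude lower bound on table size: UTF-8 bytes of every name plus value.

    Ignores index and pointer overhead, so a real table will be larger.
    """
    total = 0
    for name, value in names_to_values.items():
        total += len(name.encode("utf-8")) + len(value.encode("utf-8"))
    return total

# Stand-in HTML4 table in the same name -> string shape as the html5 dict.
html4_table = {name: chr(cp) for name, cp in name2codepoint.items()}

if __name__ == "__main__":
    print("HTML4 entries:", len(html4_table), "raw bytes:", raw_table_bytes(html4_table))
    print("HTML5 entries:", len(html5), "raw bytes:", raw_table_bytes(html5))
```

Falling back to the smaller table just means unknown names come back as `None` and can be left in the text undecoded.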
>>245 I recall that LAC appeared on the /prog/rider IRC once to say that he was done playing his character, and while that might have been an impostor, he hasn't posted since. I really miss his rants about stakboi retoids.
>>249 /prog/rider is dead, so no. There was progrider@conference.jabber.ccc.de during the shutdown, but it pretty much became the goatfinger chatroom after a while. No idea if it's even alive anymore.
They've also introduced a bunch of completely unnecessary states and a "temporary buffer" just for character references. What a load of bullshit. The spec was already disgustingly verbose, and they made it even worse.
Name: Anonymous 2017-05-01 10:55
>>254 Well, Opera has an email and IRC client, yet neither of them is HTML.
Name: Anonymous 2017-05-01 11:11
>>254 I never took a look at web specs until now, but what the fuck is this? This looks like spec-by-implementation, except that the implementation isn't in a programming language.
>>254,256 Why don't you email the committee with all the mistakes and foolishness that you have found? I am actually interested to see how they are going to defend this cancerous radioactive mess.
>>257,258 Many years ago when this first started I tried to ask them to replace a quadratic-time algorithm in the spec with a linear-time one that produces identical results, but they didn't care. You are welcome to try, however.
It's mostly an advert for MyHTML, which turns out to be faster than the others, but I haven't been able to benchmark it against mine since his benchmark code is not usable on Windows... and I don't have the same hardware so the results there aren't comparable.
Meanwhile, I've improved mine so it parses the original HTML5 spec page I was using for testing in 46ms. It started at ~170ms, then went down to 70ms, 60ms, and now 46ms. Amusingly enough, assembling and linking the compiler's output myself, instead of letting the compiler do it, makes it ~3ms faster (43ms). The original progrider page took 8ms; it's down to 1.2ms.
Maybe it's good enough now, and I should move on to CSS parsing...
Name: Anonymous 2017-05-15 14:05
Will anyone design a new car from scratch, just because others are expensive and bloated? That seems like a waste of effort.
Name: Anonymous 2017-05-15 14:48
>>274 Is it still a waste if you design it to be easily produced, say with a 3D printer?
Name: Cudder !cXCudderUE 2017-05-16 2:28
I think I have a competitor, who coincidentally also seems to be Russian:
The ~1MB HTML5 spec:

Parser        Mem     Time
------------------------------
MyHTML        11.3MB  27.7ms
parseh(mine)  3.72MB  43.6ms
Mine looks significantly slower, but MyHTML is reading the whole file into memory and processing it in one go whereas I'm doing it in 4KB blocks (much like a real browser would, for incremental rendering). I'm also using 1/3 of the memory, and there is some GUI stuff too --- the crude DOM viewer and renderer are part of this, whereas MyHTML is only the parser with the bare minimum CLI needed to make it parse.
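To illustrate the two feeding strategies (whole file in one go vs. 4KB blocks), here is a minimal Python sketch. It uses the standard library's `html.parser.HTMLParser`, which accepts input incrementally through `feed()`. This is only a stand-in: it is neither parseh nor MyHTML, and `TagCounter`, `parse_whole` and `parse_in_blocks` are hypothetical names made up for the example.

```python
import io
from html.parser import HTMLParser

BLOCK_SIZE = 4096  # block size from the post; StringIO counts characters, not bytes

class TagCounter(HTMLParser):
    """Stand-in for a real parser: it only counts start tags."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        self.count += 1

def parse_whole(text):
    """Feed the entire document to the parser in a single call."""
    parser = TagCounter()
    parser.feed(text)
    parser.close()
    return parser.count

def parse_in_blocks(stream, block_size=BLOCK_SIZE):
    """Feed the document block by block, as data would arrive from disk or network."""
    parser = TagCounter()
    while True:
        block = stream.read(block_size)
        if not block:
            break
        # A tag split across two blocks is buffered until the rest arrives.
        parser.feed(block)
    parser.close()
    return parser.count

if __name__ == "__main__":
    doc = "<html><body>" + "<p class='x'>hello &amp; bye</p>" * 1000 + "</body></html>"
    print(parse_whole(doc), parse_in_blocks(io.StringIO(doc)))
```

Both paths should see the same tags. The block-wise one never needs the whole input in memory at once, which is what makes incremental rendering possible.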
How about something bigger... much bigger?
100MB of HTML:

Parser        Mem     Time
------------------------------
MyHTML        1850MB  32957ms
parseh(mine)  540MB   8506ms
This eliminates any startup overhead and shows that even when it's reading 4KB at a time, mine is almost 4x faster and uses 3/10ths of the memory. Cache effects are important here.
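The ratios can be checked directly from the figures quoted in the two tables; a small Python sketch, with `compare` as a hypothetical helper name:

```python
# (memory in MB, time in ms), copied from the two benchmark tables in the post
small = {"MyHTML": (11.3, 27.7), "parseh": (3.72, 43.6)}
large = {"MyHTML": (1850, 32957), "parseh": (540, 8506)}

def compare(results):
    """Return (memory fraction, speedup) of parseh relative to MyHTML.

    memory fraction < 1 means parseh uses less memory;
    speedup > 1 means parseh is faster, < 1 means it is slower.
    """
    my_mem, my_time = results["MyHTML"]
    ph_mem, ph_time = results["parseh"]
    return ph_mem / my_mem, my_time / ph_time

if __name__ == "__main__":
    for label, results in (("~1MB", small), ("100MB", large)):
        mem_fraction, speedup = compare(results)
        print(f"{label}: memory x{mem_fraction:.2f}, speed x{speedup:.2f}")
```

On the 100MB input this gives roughly 0.29 of the memory and a 3.87x speedup, matching "3/10ths" and "almost 4x"; on the ~1MB input it gives about 0.33 of the memory at about 0.64x the speed.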
Name: Anonymous 2017-05-16 21:48
>>278 The main difference is that MyHTML didn't take 4 years and is cross-platform.