Upon closer examination, there's even more stupidity in the entities list --- some are defined more than once! What sort of fucked-up design-by-committee led to this idiocy?
We start with some "only slightly retarded" duplication...
ast; U+0002A
midast; U+0002A
lbrack; U+0005B
lsqb; U+0005B
...move on to WTF-inducing "you're an idiot if you think this is even the slightest bit useful"...
lowbar; U+0005F
UnderBar; U+0005F
grave; U+00060
DiacriticalGrave; U+00060
nbsp; U+000A0
NonBreakingSpace; U+000A0
...and finish with "ARE YOU FUCKING INSANE!?!?"
die; U+000A8
Dot; U+000A8
DoubleDot; U+000A8
uml; U+000A8
ap; U+02248
approx; U+02248
asymp; U+02248
thickapprox; U+02248
thkap; U+02248
TildeTilde; U+02248
Bonus level:
NegativeMediumSpace; U+0200B
NegativeThickSpace; U+0200B
NegativeThinSpace; U+0200B
NegativeVeryThinSpace; U+0200B
ZeroWidthSpace; U+0200B
Completely different names, yet the exact same codepoint. :quintuple-facepalm:
See for yourself at
https://www.w3.org/TR/html5/syntax.html (scroll to the bottom, extract the table, and sort by codepoint.)
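If you'd rather not scrape the table by hand, here is a rough Python sketch of the same "sort by codepoint" exercise. It assumes the `html5` dict in Python's standard `html.entities` module mirrors the spec's list (as far as I know it does); the function names `names_by_codepoint` and `duplicates` are my own, made up for this example. Names with and without the trailing ';' are folded together so only genuinely different names show up.

```python
from collections import defaultdict
from html.entities import html5  # stdlib table: entity name -> replacement text


def names_by_codepoint(table=html5):
    """Group entity names by the single code point they expand to."""
    groups = defaultdict(set)
    for name, text in table.items():
        if len(text) != 1:
            continue  # skip entities that expand to more than one code point
        # Fold "amp" and "amp;" into one name so they don't count twice.
        groups[ord(text)].add(name.rstrip(";") + ";")
    return groups


def duplicates(groups):
    """Keep only code points that have more than one distinct name."""
    return {cp: sorted(names) for cp, names in groups.items() if len(names) > 1}


if __name__ == "__main__":
    # One line per duplicated code point, in the same format as the lists above.
    for cp, names in sorted(duplicates(names_by_codepoint()).items()):
        print(f"U+{cp:05X}  {' '.join(names)}")
```

Run it as a script and it should print one line per code point that has multiple names, which makes it easy to check the lists above against whatever table your Python version ships.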
Now I know why there are 2K+ entities. Around half of them are duplicates with an extra ';' at the end (easily handled by the parsing code, but the brainless turds that wrote the spec did not even think...), another quarter are useless duplicates, and what's left is possibly, maybe sometimes, actually useful. But supposedly, to be "HTML5 compliant" you would need to parse them all, regardless of whether anyone will actually use them except in demo pages and the like (probably not). Fuck that bullshit.
"Why browsers are bloated".