Hue Shift

Name: Anonymous 2019-07-19 9:30

Ok. The formula for converting an RGB triplet into HSV form is widely published. But I've failed to find any hints as to how one arrives at it, so that I could try to optimize it for my game's blitter routine. Therefore I tried to infer the h,s,v formula myself, using basic vector math.

v = (r+g+b)/3 //value
hs = v-r,v-g,v-b //hue*saturation vector
s = hs.length //saturation
h = angle(hs/s) //hue is the angle of the vector


Guess when one needs to just change the saturation, there is no reason to compute the angle or even the vector length in full, but the common requirement to change the hue requires nasty computation. Can anyone hint at any efficient way of shifting hue?
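To illustrate the saturation-only case, here is a minimal C sketch. It assumes the plain mean v = (r+g+b)/3 from the post above and uses r-v rather than v-r for the vector, which only flips the sign. The names clamp_u8 and scale_saturation_mean and the 8.8 fixed-point factor are made up for this example, not taken from any posted code.

```c
#include <stdint.h>

/* Clamp an int to the 0..255 byte range. */
static uint8_t clamp_u8(int x) {
    return (uint8_t)(x < 0 ? 0 : (x > 255 ? 255 : x));
}

/* Scale saturation around the channel mean v = (r+g+b)/3.
 * k256 is the factor in 8.8 fixed point (an assumption of this sketch):
 * 256 keeps the color, 0 gives gray, 512 doubles the distance from gray. */
static void scale_saturation_mean(uint8_t *r, uint8_t *g, uint8_t *b, int k256) {
    int v = (*r + *g + *b) / 3;
    /* hs = (r-v, g-v, b-v): only its length is scaled, its direction
     * (the hue) is untouched, so no angle or sqrt is needed. */
    *r = clamp_u8(v + ((*r - v) * k256) / 256);
    *g = clamp_u8(v + ((*g - v) * k256) / 256);
    *b = clamp_u8(v + ((*b - v) * k256) / 256);
}
```

Clamping can still distort strongly boosted colors, and the plain mean is not a perceptual brightness, so treat this as a starting point only.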

Name: Anonymous 2019-07-19 10:00

I just took a really hue shift

Name: Anonymous 2019-07-19 10:46

>>2
hot

Name: Anonymous 2019-07-19 15:55

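Note that in the hs/s vector the three elements always sum to zero, so the sum of any two elements equals the negated third element, and the vector has unit length (each element stays below 1 in magnitude), so it has a single degree of freedom and can be safely reduced to a single angle value.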

Name: Anonymous 2019-07-19 18:31

Ok. Tried to complete the implementation of that h = angle(hs/s)

And found that I can't really represent hs/s as an angle, but I can still represent it as a single value by solving a quadratic equation. I also had to solve the issue with v, doing hs=(v-r,v-g,v-b)/v instead, to normalize the shit. Now it is better than HSV, because it handles saturation properly. But the problem is: it is not HSV or HSL, because it still works inside the RGB cube.

Now the colormap looks like a strange irregular nonsense: http://lj.rossia.org/users/sadkov/457191.html

Still needs further research on how its hue part works, because I noticed that quadratic equation solely by accident.
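One possible reading of where that quadratic comes from, as a sketch rather than the author's own derivation: assume H = (A, B, C) is the normalized vector hs/s, so it has unit length, and its components sum to zero because v is the mean of r, g, b. Then B follows from A up to a choice of sign:

```latex
\begin{aligned}
A + B + C &= 0 \;\Rightarrow\; C = -(A + B) \\
A^2 + B^2 + C^2 &= 1 \;\Rightarrow\; A^2 + B^2 + (A + B)^2 = 1 \\
B^2 + AB + \left(A^2 - \tfrac{1}{2}\right) &= 0 \\
B &= \frac{-A \pm \sqrt{2 - 3A^2}}{2}
\end{aligned}
```

Under these assumptions, real roots exist only for |A| <= sqrt(2/3). The two roots are the two points on the circle that share the same A, so one extra flag is needed to say which one was meant.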

Name: Anonymous 2019-07-19 18:40

Here is the full code, in case anyone can help with it.
qsolve B C =
| D = B*B - 4.0*C
| when D < 0.0: leave 0
| D = D.sqrt
| S0 = (D - B)/2.0
| S1 = (-D - B)/2.0
| S0,S1

dergb RGB =
| R = 0
| G = 0
| B = 0
| unrgb RGB R G B
| R = R.float/255.0
| G = G.float/255.0
| B = B.float/255.0
| V = (R+G+B)/3.0
| HS = [R-V G-V B-V]/V
| S = HS.abs
| H = HS/S
| Hue = H.0
| D = 0
| A = H.0
| Bs = qsolve A (A*A - 0.5)
| when Bs:
| B0,B1 = Bs
| B = H.1
| D <= if (B-B0).abs < (B-B1).abs then 1 else 2
| Hue,S,V,D

enrgb HSV =
| Hue,S,V,D = HSV
| A = Hue
| B = 0
| if D then
| Bs = qsolve A (A*A - 0.5)
| less Bs: leave rgb{ 0 0}
| B <= if D><1 then Bs.0 else Bs.1
else
| B <= -(A*0.5)
| C = -(A+B)
| H = [A B C]
| HS = H*S
| R,G,B = (HS*V + [V V V])*255.0
| rgb R.int G.int B.int

Name: Anonymous 2019-07-19 20:26

Warning: you're posting in a Nikita thread.

Name: Anonymous 2019-07-20 7:38

Ok. Dropped this shit. It solved the brightness-saturation clash problem, but I found that at higher brightness values it gets non-uniform. Beyond my current grasp in math to analyze further:
https://www.youtube.com/watch?v=qMgVFyn7M14

Found another format - Lab,
https://github.com/gka/chroma.js/blob/master/src/io/lab/lab2rgb.js

Name: Anonymous 2019-07-20 7:42

>>8
In addition it required at least 16 bits to hold the hue. But this Lab also requires 16 bits, so I can't use a 32-bit array to hold the values, or I have to sacrifice the alpha channel :(

Name: Anonymous 2019-07-20 7:51

>>7
Sage this post as untrue

Name: Anonymous 2019-07-20 12:10

>>10
Get mad, hamsterfucker.

Name: Anonymous 2019-07-20 15:33

>>7
Saging as fake news

Name: Anonymous 2019-07-20 16:19

>>12
Log off, GRU boy.

Name: Anonymous 2019-07-22 8:01

Ok. I designed a custom color space: https://www.youtube.com/watch?v=9JjjQEu9XHY

Name: Anonymous 2019-07-23 16:24

I found a nice video:

https://www.youtube.com/watch?v=82ItpxqPP4I

basically they call that crap the XYZ plane, due to the property of X+Y+Z = 1.0. I botched the calculation, because I have no experience with linear algebra and the plane equation. But well, I've admitted that my math skills are near zero and I prefer doing hacks to solving the shit analytically.

Name: Anonymous 2019-07-24 10:40

r is 0, g is 120, b is 240 with the usual units
0, 80, 160 would fit in a byte

you should be able to change both saturation and value without changing hue
it might do some funny things like 0.0 sat, 1.0 val is #ffffff

Name: Anonymous 2019-07-24 11:34

I think val might be max(r,g,b)/255, sat = max(r,g,b) - min(r,g,b),
and then hue ~= (mid-min) / (max-min)

Name: Anonymous 2019-07-24 11:44

~hue = 0 is a solid r/g/b, or 0 degrees, ~hue = 1 is yellow/cyan, or +/- 45's

Name: Anonymous 2019-07-24 12:57

>>16
That is the problem with the usual HSV - its saturation interferes with its value. Therefore HSV is considered a bad choice even for modern video game graphics.

Name: Anonymous 2019-07-24 17:01

( r + g + b ) / 3

Name: Anonymous 2019-07-24 21:54

(r * 0.299f + g * 0.587f + b * 0.114f)

Name: Anonymous 2019-07-24 21:57

make blacker nigger

Name: Anonymous 2019-07-25 4:11

>>21
you could use this to keep its greyscale value constant while changing hues
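A rough C sketch of that idea, under stated assumptions: it uses the weighted luma from the post above, treats (b - y, r - y) as a 2D chroma vector, rotates it, and then rebuilds g so that the luma stays put. The names luma601 and hue_rotate are invented here, and the caller passes in the cosine and sine of the angle. This is one way to do it, not an established hue-shift formula.

```c
/* Weighted luma, same coefficients as quoted in the thread. */
static float luma601(float r, float g, float b) {
    return r * 0.299f + g * 0.587f + b * 0.114f;
}

/* Rotate the chroma of (r, g, b) by an angle given as its cosine c and
 * sine s, keeping luma601() of the result equal to that of the input.
 * Channels are floats in 0..255 and are NOT clamped here. */
static void hue_rotate(float *r, float *g, float *b, float c, float s) {
    float y = luma601(*r, *g, *b);
    float u = *b - y;              /* blue-difference chroma */
    float v = *r - y;              /* red-difference chroma  */
    float u2 = u * c - v * s;      /* plain 2D rotation      */
    float v2 = u * s + v * c;
    *r = y + v2;
    *b = y + u2;
    /* Solve y = 0.299*r + 0.587*g + 0.114*b for g. */
    *g = (y - 0.299f * *r - 0.114f * *b) / 0.587f;
}
```

The result can leave the 0..255 range (rotating a saturated red this way pushes blue negative), and clamping afterwards changes the luma again, so a real blitter would need to reduce saturation first or accept the error. Because u and v are unscaled, equal angle steps are also not equal perceptual hue steps.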

Name: Anonymous 2019-07-25 4:38

https://en.wikipedia.org/wiki/HSL_and_HSV#/media/File:HSV-RGB-comparison.svg
shows the linearity, it's not doing vector rotation

Hue is more of a mock-angle, the rgb vector length will vary as the hue is changed

Name: Anonymous 2019-07-25 7:42

>>24
Well, I actually used cosine to interpolate, because I disliked these pronounced magenta, cyan and yellow lines. And used proper weighted gamma (instead of (r + g + b)/3) to avoid botching brightness.

Name: Anonymous 2019-07-25 9:11

>>22
Tsk.

Name: Anonymous 2019-07-25 16:17

Here is the final encoder routine, converting the usual RGB into my version of HSL. Had to use int64_t, because of overflow in RGB_METRIC:
#define RGB_METRIC(dist, item, rr, gg, bb) \
  do { \
    int64_t x = (int64_t)(item)->r - rr; \
    int64_t y = (int64_t)(item)->g - gg; \
    int64_t z = (int64_t)(item)->b - bb; \
    dist = x*x + y*y + z*z; \
  } while (0)

uint32_t rgb2hsl(uint32_t rgb) {
  uint32_t lm;
  uint32_t hs;
  uint32_t r = rgb>>16; //no div by 255, cuz luma() gives us L*255
  uint32_t g = (rgb>>8)&0xff;
  uint32_t b = rgb&0xff;
  uint32_t rgb16 = R5G6B5(r,g,b);
  uint32_t l = LUMA8(r,g,b);
  lut_item_t *item = rgb_to_hs_lut + rgb16*3;
  lut_item_t *best_item = item;
  uint64_t dist,best_dist;
  item += 1;
  if (!item->hs) return (l<<16) | best_item->hs;
  lm = inv8[l];
  r = (r*lm)<<6;
  g = (g*lm)<<6;
  b = (b*lm)<<6;
  RGB_METRIC(best_dist, item, r, g, b);
  item += 1;
  RGB_METRIC(dist, item, r, g, b);
  if (dist < best_dist) {
    best_dist = dist;
    best_item = item;
  }
  item += 1;
  if (!item->hs) return (l<<16) | best_item->hs;
  RGB_METRIC(dist, item, r, g, b);
  if (dist < best_dist) {
    best_dist = dist;
    best_item = item;
  }
  return (l<<16) | best_item->hs;
}

Name: Anonymous 2019-07-25 16:51

Why do you need to implement your own hueshift op? There are multiple good options available.

Name: Anonymous 2019-07-25 17:29

>>28
All options are some variation of usual HSV, or this academic Lab, which gives you useless a and b params.

Name: Anonymous 2019-07-25 17:58

Name: Anonymous 2019-07-25 20:04

>>30
Russian algorithm, no thank you!

Name: Anonymous 2019-07-26 1:42

>>25
I guess the trouble with using hue for gradients is you always get the colour cycling /rainbow type effect, as it separates out the rgb channel fades

It's probably good for storing a colour palette, beats trying to cycle through rgb values in code, and it'll even do a slight compression for groups of S/V/L

Name: Anonymous 2019-07-26 6:52

shift my dubs

Name: Anonymous 2019-07-26 7:25

Name: Anonymous 2019-07-26 7:29

>>32
You can still mix colors in HSV:
http://lj.rossia.org/users/sadkov/462113.html

although not the same as in rgb, because (Color1.h+Color2.h)/2 doesn't work with the spectral wheel. I.e. using this format makes mixing colors more expensive.

Name: Anonymous 2019-07-31 10:22

TLDR: I came up with a different color space, and now need a fast integer sqrt for gamma crunching.

Name: Anonymous 2019-07-31 10:23

>>36
TLDR: I came

Name: Anonymous 2019-07-31 10:32

>>36
everything depends on how much accuracy you want to sacrifice for speed, how fast you really need to be and do you have other requirements like memory use. iterative methods based on Newton's formula are a classic, and you can set max number of iterations to control the tradeoff. if this isn't enough, pre-computed lookup tables can help. here's a solution using those made by some stackoverflowgrammer: https://stackoverflow.com/a/1100591

Name: Anonymous 2019-07-31 12:07

>>38
I would have gone with just blending raw RGB, ignoring its gamma-encoded nature, but people say that is incorrect:
If it helps, you can think of sRGB as being an opaque compression format. You wouldn’t try to add two ZIP files together, and you wouldn’t try to multiply a CRC32 result by 2 and expect to get something useful, so don’t do it with sRGB! The fact that you can get something kinda reasonable out is a red herring, and will lead you down the path of pain and deep deep bugs. Before doing any maths, you have to “decompress” from sRGB to linear, do the maths, and then “recompress” back.
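To make the "decompress, do the maths, recompress" step concrete, here is a small C sketch. It does not implement the real sRGB curve: it assumes a plain gamma of 2.0 (square to decode, square root to encode), which is the cheap approximation used elsewhere in this thread. The names isqrt32 and blend_gamma2 are invented for the example.

```c
#include <stdint.h>

/* Floor integer square root, classic bit-by-bit method. */
static uint32_t isqrt32(uint32_t n) {
    uint32_t res = 0;
    uint32_t bit = 1u << 30;       /* highest power of 4 in 32 bits */
    while (bit > n) bit >>= 2;
    while (bit != 0) {
        if (n >= res + bit) {
            n -= res + bit;
            res = (res >> 1) + bit;
        } else {
            res >>= 1;
        }
        bit >>= 2;
    }
    return res;
}

/* 50/50 blend of two gamma-encoded channel values, done in
 * (approximately) linear light: decode by squaring, average,
 * encode with a square root. Gamma 2.0 is an assumption here. */
static uint8_t blend_gamma2(uint8_t a, uint8_t b) {
    uint32_t la = (uint32_t)a * a;
    uint32_t lb = (uint32_t)b * b;
    return (uint8_t)isqrt32((la + lb) / 2);
}
```

Blending black and white this way gives 180 rather than the 127 of a naive (a+b)/2, which is the brightness difference the quoted warning is about.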

Name: Anonymous 2019-07-31 16:18

Update to: http://lj.rossia.org/users/sadkov/467130.html

Checked my sqrt against the log2 based sqrt, using clang's __builtin_clz (which should expand to single assembly opcode), and the library's sqrtf, called using (int)sqrtf((float)i):
#define CLZ(x) __builtin_clz(x)
uint32_t clz_sqrt(uint32_t value) {
  if (!value) return 0;
  uint32_t xn = 1 << ((32 - CLZ(value))/2);
  xn = (xn + value/xn)/2;
  xn = (xn + value/xn)/2;
  xn = (xn + value/xn)/2;
  return xn;
}



got rather strange results:
$ gcc -O3 test.c -o test && ./test
isqrt16: 6.498955
sqrtf: 6.981861
log2_sqrt: 61.755873



Clang emitted the CPU's sqrtss for sqrtf, which is nearly as fast as my routine. Lesson learned: on x86 the compiler can provide a fast enough sqrt, which is less than 10% slower than what you can come up with yourself after wasting a lot of time, and about 10 times faster than the ugly bitwise log2 hack. Still, sqrtss is a bit slower than the custom function, so if you really need those last few percent, you can get them. Yet ARM, for example, has no sqrtss, so log2_sqrt shouldn't lag that badly there.

Name: Anonymous 2019-07-31 18:45

>>40
Posted it as an answer to https://stackoverflow.com/questions/1100090/looking-for-an-efficient-integer-square-root-algorithm-for-arm-thumb2/57293481

and got downvoted despite it being faster than anything but native sqrtss on x86.

Name: Anonymous 2019-08-01 3:59

This is not needed, just slows down the code.
Who would compute sqrt of 0?
if (!value) return 0;

Name: Anonymous 2019-08-01 7:45

>>42
if r, g or b is zero, then the sqrt is zero.

Name: Anonymous 2019-08-01 9:07

>>42
Then get rid of the early return.
uint32_t clz_sqrt(uint32_t value) {
  // treat 0 as 1 inside: __builtin_clz(0) is undefined and xn would hit 0
  uint32_t v = value | (value == 0);
  uint32_t xn = 1 << ((32 - CLZ(v))/2);
  xn = (xn + v/xn)/2;
  xn = (xn + v/xn)/2;
  xn = (xn + v/xn)/2;
  return xn*(value!=0);
}

Name: Anonymous 2019-08-01 9:16

>>45
If the range is 0-255, then a lookup table will be much faster. A 256-byte table (the sqrt is < 16) will fit in the cache.
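A minimal C sketch of such a table, assuming inputs really are limited to 0-255 and that a floor square root is what is wanted. The names sqrt_lut, init_sqrt_lut and sqrt8 are made up for this example.

```c
#include <stdint.h>

static uint8_t sqrt_lut[256];      /* sqrt_lut[i] = floor(sqrt(i)) */

/* Fill the table once at startup, without calling any sqrt. */
static void init_sqrt_lut(void) {
    unsigned root = 0;
    for (unsigned i = 0; i < 256; i++) {
        if ((root + 1) * (root + 1) <= i) root++;   /* next square reached */
        sqrt_lut[i] = (uint8_t)root;
    }
}

/* One memory load per lookup, no divisions and no branches. */
static inline uint8_t sqrt8(uint8_t x) {
    return sqrt_lut[x];
}
```

Whether this beats a hardware sqrt or the Newton version depends on the target and on cache behavior, so it is worth benchmarking on the actual device.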

Name: Anonymous 2019-08-01 9:49

>>41
and got downvoted despite it being faster than anything but native sqrtss on x86.
question is about arm and you reply with x86 benchmarks. what did you expect, bydlita? make your're are game

Name: Anonymous 2019-08-01 10:28

>>46
can you prove it is slow on ARM?

Name: Anonymous 2019-08-01 10:31

>>44
That branch doesn't affect anything, because of x86 branch prediction. So eliminating it solves nothing.

Name: Anonymous 2019-08-01 11:01

>>48
Not with recent hardware fixes for Branch prediction exploits and compiler trampolines.

Name: Anonymous 2019-08-01 11:04

>>49
Just turn these fixes off. Security is overrated, when you're using smartphone. Edited on 01/08/2019 11:05.

Name: Anonymous 2019-08-01 11:04

Persistent threat without a possibility of mitigation in software

In February 2019, it was reported that there are variants of Spectre threat that cannot be effectively mitigated in software at all.[98][99]

Name: Anonymous 2019-08-01 11:11

>>50
Oh... I forgot. You can't turn them off. Because Microsoft, Apple and Linux Foundation know better.

Name: Anonymous 2019-08-01 11:15

Benchmark it with current Spectre patches.
Branch prediction is getting riskier and riskier.
https://en.wikipedia.org/wiki/Category:Speculative_execution_security_vulnerabilities Edited on 01/08/2019 11:20.

Name: Anonymous 2019-08-01 11:18

>>47
OP asked a question about ARM. you post answer about x86. it is you who needs to prove its relevance to the question you hamster-killing psychopathic bydlo

Name: Anonymous 2019-08-01 11:19

sqrt my dubs

Name: Anonymous 2019-08-01 11:26

>>54
I posted code that works fast on any CPU.

Name: Anonymous 2019-08-01 11:27

>>53
I've disabled OS auto-update to avoid all that crud. Moreover, auto-update easily eats several gigabytes of my precious SSD space.

Name: Anonymous 2019-08-01 12:30

>>56
It's several orders of magnitude slower than a lookup table.

Name: Anonymous 2019-08-01 12:43

>>58
proof?

Name: Anonymous 2019-08-01 12:49

>>56
how do you know this? your're are poast mentions it being fast on x86, on which you benchmarked it. this does not always map 1:1 to speed on ARM

Name: Anonymous 2019-08-01 12:57

>>60
Test it yourself.

Name: Anonymous 2019-08-01 13:23

>>61
StackOverflower: how much is 2+2?
Bydlita: 3+3 is 6
SO: but I want to know how much is 2+2
B: it's 6
SO: I don't think your're are right
B: prove it!

Name: Anonymous 2019-08-01 13:48

Hue shit

Name: Anonymous 2019-08-01 13:53

>>59
Lookup table: one(likely cached) memory load.
sqrt: 3 divs with consequent dependence, 1 early branch.

Name: Anonymous 2019-08-01 14:06

Lookup tables win EVEN if they don't fit in the cache. IIRC most chess programs have precomputed "bitboard" tables, often several megabytes of different piece tables, to quickly solve intersection/bijection tests for attacks. Only the first accesses to such a table are penalized; then the L2/L3 cache begins to kick in and no algorithm can compete.

Name: Anonymous 2019-08-01 14:15

look up the repeating digits in my poast number

Name: Anonymous 2019-08-01 16:02

>>64
ARMs are not that cache dependent and the LUT is just 1000 bytes - enough to fit in a cache.

Name: Anonymous 2019-08-01 16:04

>>65
In most cases lookup tables are accessed locally. I.e. if you're processing an RGB color photo, then the RGB values will vary smoothly across the image.

Name: Anonymous 2019-08-01 16:06

>>64
Also, that early branch dependency isn't my code, but a copy from the upvoted answer on Stack Overflow.

Name: Anonymous 2019-08-01 16:52

>>67
256 bytes, if the sqrt is in range 0-15 (inputs 0-255).
128 bytes, with more complex addressing (store 2 sqrts in one byte; 4 bits fit 0-15 exactly).
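The 128-byte variant could look roughly like this in C. The layout is an assumption of this sketch: even inputs go in the low nibble, odd inputs in the high nibble. The names sqrt_nibbles, init_sqrt_nibbles and sqrt8_packed are hypothetical.

```c
#include <stdint.h>

static uint8_t sqrt_nibbles[128];  /* two 4-bit square roots per byte */

/* Pack floor(sqrt(i)) for i = 0..255, two results per byte. */
static void init_sqrt_nibbles(void) {
    unsigned root = 0;
    for (unsigned i = 0; i < 256; i++) {
        if ((root + 1) * (root + 1) <= i) root++;
        if (i & 1)
            sqrt_nibbles[i >> 1] |= (uint8_t)(root << 4);  /* odd: high nibble */
        else
            sqrt_nibbles[i >> 1] = (uint8_t)root;          /* even: low nibble */
    }
}

/* One load plus a shift and a mask. */
static inline uint8_t sqrt8_packed(uint8_t x) {
    return (uint8_t)((sqrt_nibbles[x >> 1] >> ((x & 1) * 4)) & 0x0F);
}
```

The extra shift and mask cost a couple of cycles per lookup, so halving the table only pays off if the 256-byte version actually causes cache pressure.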

Name: Anonymous 2019-08-01 17:54

Lookups = non constant time operation.

Name: Anonymous 2019-08-01 19:10

>>71
Actually memory/cache latency is fairly constant.

Name: Anonymous 2019-08-02 5:28

>>36
Do you still have the presquared values?

function sqraprx(a, b) // ~= sqrt a^2 + b^2

c = max(a,b) + 0.41* min(a,b) / max(a,b)

return c;

Name: Anonymous 2019-08-02 5:33

Missed a multiply i think
c = max(a,b) + (0.41* min(a,b) / max(a,b)) * max(a,b)

Name: Anonymous 2019-08-02 5:35

lol, simplified
c = max(a,b) + 0.41* min(a,b)

Name: Anonymous 2019-08-02 5:40

sqraprx(3,4) = 5.2

Name: Anonymous 2019-08-02 5:57

\(c = max(a,b) + (0.41* min(a,b) / max(a,b)) * max(a,b)\)

Name: Anonymous 2019-08-02 7:13

>>77
It's using a lazy estimate/precalc of sqrt(2) in 1 + 1 * 0.41
error value of 20 on ~sqrt(300^2 + 400^2) doesn't seem too bad, +4% error
6% for (100, 400), and similar for (200,400)

Calculation should just about be competitive with the sum of square precalculation
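For reference, the max + 0.41*min estimate from the posts above can be written in integer C like this. The function name hypot_approx and the choice of 105/256 as a stand-in for 0.41 are assumptions made for this sketch.

```c
#include <stdint.h>

/* Approximate sqrt(a*a + b*b) as max(a,b) + 0.41*min(a,b).
 * 105/256 ~= 0.410 replaces the float constant; the multiply
 * overflows once the smaller input exceeds roughly 40 million. */
static uint32_t hypot_approx(uint32_t a, uint32_t b) {
    uint32_t hi = a > b ? a : b;
    uint32_t lo = a > b ? b : a;
    return hi + ((lo * 105u) >> 8);
}
```

For (300, 400) this returns 523 against an exact 500. With the 0.41 coefficient the estimate never undershoots by much and overshoots most (about 8%) when the smaller input is around 0.41 of the larger one, so it suits distance comparisons better than anything where precision matters.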

Name: Anonymous 2019-08-02 13:30

>>73

I'm using it for gamma packing, not distance. And precision is somewhat important. I considered using 9bit floating point numbers, but they mapped badly to gamma rgbs, producing more loss of precision.

I'm doing it all in software, so I can't really afford true 16-bit floats, like GPUs do.

Name: Anonymous 2019-08-03 1:53

I was just introduced to YCoCg-R, I'm in love.
—FLIF user

Name: Anonymous 2019-08-03 8:26

>>80

r/4 + g/2 + b/4 is not a proper luma function. Proper luma is 0.299*r + 0.587*g + 0.114*b.

You can't even do a proper sprite recolor inside YCoCg. I.e. if you have a colorable font and draw it in blue, then in YCoCg your blue would be too dark and unreadable.

Name: Anonymous 2019-08-03 15:38

>>72
Not fully consistent however. Every single AES implementation that uses sboxes that I know of has been broken. Would you like some cia nigger to be able to see what is on your screen based on a timing attack?

Name: Anonymous 2019-08-04 0:15

>>82
*Using sbox values specially chosen by the people trying to break in

Name: Anonymous 2019-08-05 15:35

>>82
Not with real-time kernel patches; with fine-grained multithreading it's impossible once you run a background task.

Name: Anonymous 2019-08-05 22:44

Instead of introducing the previously devised custom color space, I want to see how fast I can do hue-saturation change in plain RGB. It seems not exactly fast. But would it be fast enough for my game? Here is the saturation multiplier function.
void saturate(int *sr, int *sg, int *sb, int f) {
  int r, g, b, l;
  r = *sr;
  g = *sg;
  b = *sb;
  r = r*r; g = g*g; b = b*b;
  l = LUMA8(r,g,b);
  l = l*(256-f);
  r = (r*f + l)>>8;
  g = (g*f + l)>>8;
  b = (b*f + l)>>8;
  if (r < 0) r = 0;
  if (g < 0) g = 0;
  if (b < 0) b = 0;
  r = isqrt16(r); g = isqrt16(g); b = isqrt16(b);
  *sr = clamp_byte255(r);
  *sg = clamp_byte255(g);
  *sb = clamp_byte255(b);
}



Yes, you see it right. A mere saturation boost/reduction requires 3 square roots and a lot of other operations, and that is for each pixel. Ideally the gamma exponent should be 2.2, but that would be even more expensive; square roots map better to a lookup table, and there is that old Quake hack you can use to compute them lightning fast. In addition, gamma=2.2 would require doing r=pow22lut[r] instead of the less expensive r*r, but a 256-entry LUT isn't that expensive. Disabling gamma correction leads to heavy artifacts, like a de-saturated sprite being too dark.

Still in my format changing saturation would be as simple as moving the U,V coords towards the whitepoint:
NV = (V-WV)*Saturation + WV
NU = (U-WU)*Saturation + WU

I.e. far simpler code. So if one does a full-scene saturation change, then RGB is not an option. Hue shifting can be solved in part by recoloring, but even recoloring is expensive in the general case, because it requires computing a 256-byte LUT for every shade of the recolored color.
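The two-line formula above translates to C more or less directly. The struct chroma_t and the function scale_saturation_uv are hypothetical names, and float U,V coordinates are assumed purely for readability; the actual format in this thread is integer-packed and not fully specified.

```c
/* Hypothetical chroma pair in some U,V plane. */
typedef struct { float u, v; } chroma_t;

/* Move a chroma point toward (saturation < 1) or away from
 * (saturation > 1) the whitepoint. Luma is not touched at all. */
static chroma_t scale_saturation_uv(chroma_t c, chroma_t white, float saturation) {
    chroma_t out;
    out.u = (c.u - white.u) * saturation + white.u;
    out.v = (c.v - white.v) * saturation + white.v;
    return out;
}
```

That is two multiply-adds per pixel with no gamma round trip, which is the contrast with the RGB version being drawn here.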

Generally one reduces saturation to make special effects more eye popping. I.e. if on a bomb explosion you reduce surroundings saturation, that explosion would look more heavy.

TLDR: gamedev isn't easy.

Name: Anonymous 2019-08-06 3:26

>>85
Use the sqrt lookup tables instead of Isqrt
if (r < 0) r = 0; replaced by r*=(r>0) or something similar; which doesn't require a branch.
*sr = clamp_byte255(r); use ternary *sr= (r>255?255:r)

Name: Anonymous 2019-08-06 7:22

>>86
cmov is faster than imul. Although I should probably use uint32_t instead of int. then just one r>255 check would suffice.

Name: Anonymous 2019-08-06 8:28

also, 32-bit ints are not enough for gamma unpacked r,g,b, so one has to use floats or 64-bit ints.

Name: Anonymous 2019-08-06 9:39

Ok. For now I will use the following code:
static INLINE void saturate(int *sr, int *sg, int *sb, int f) {
  int r = unglut[*sr];
  int g = unglut[*sg];
  int b = unglut[*sb];
  int l = LUMA8(r,g,b)*(256-f);
  *sr = glut[clamp(0,MAX_GAMMA,(r*f + l)>>8)];
  *sg = glut[clamp(0,MAX_GAMMA,(g*f + l)>>8)];
  *sb = glut[clamp(0,MAX_GAMMA,(b*f + l)>>8)];
}


Before transitioning to proper color space.
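Since unglut, glut, LUMA8, clamp and MAX_GAMMA are not shown in the thread, here is a self-contained C sketch of how the pieces might fit together. Everything in it is an assumption: gamma 2.0 tables (square and floor square root), MAX_GAMMA = 255*255, and an integer luma with weights 77/150/29 out of 256. The _sketch suffixes mark names that are not from the original code.

```c
#include <stdint.h>

#define MAX_GAMMA_SKETCH 65025     /* 255*255, assumed top of the unpacked range */

static uint16_t unglut_sketch[256];                 /* byte -> x*x        */
static uint8_t  glut_sketch[MAX_GAMMA_SKETCH + 1];  /* x -> floor sqrt(x) */

static void init_gamma_luts(void) {
    uint32_t root = 0;
    for (uint32_t i = 0; i < 256; i++)
        unglut_sketch[i] = (uint16_t)(i * i);
    for (uint32_t i = 0; i <= MAX_GAMMA_SKETCH; i++) {
        if ((root + 1) * (root + 1) <= i) root++;
        glut_sketch[i] = (uint8_t)root;
    }
}

/* Assumed integer luma: 77 + 150 + 29 = 256 approximates 0.299/0.587/0.114. */
static int luma_sketch(int r, int g, int b) {
    return (r * 77 + g * 150 + b * 29) >> 8;
}

static int clamp_sketch(int lo, int hi, int x) {
    return x < lo ? lo : (x > hi ? hi : x);
}

/* f is saturation in 8.8 fixed point: 256 = unchanged, 0 = grayscale.
 * Inputs must be in 0..255. Negative intermediates rely on the usual
 * arithmetic right shift before being clamped to 0. */
static void saturate_sketch(int *sr, int *sg, int *sb, int f) {
    int r = unglut_sketch[*sr];
    int g = unglut_sketch[*sg];
    int b = unglut_sketch[*sb];
    int l = luma_sketch(r, g, b) * (256 - f);
    *sr = glut_sketch[clamp_sketch(0, MAX_GAMMA_SKETCH, (r * f + l) >> 8)];
    *sg = glut_sketch[clamp_sketch(0, MAX_GAMMA_SKETCH, (g * f + l) >> 8)];
    *sb = glut_sketch[clamp_sketch(0, MAX_GAMMA_SKETCH, (b * f + l) >> 8)];
}
```

The encode table costs about 64 KB, so whether it is cheaper than an integer sqrt per channel is exactly the cache question argued about earlier in the thread.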

The problem is that I still have to support RGB color space for stuff like sprite sheet packing, because such transition from RGB to another color space is not one-to-one, and therefore lossy.

Name: Anonymous 2019-08-06 12:29

https://blog.johnnovak.net/2016/09/21/what-every-coder-should-know-about-gamma/

Nice article about gamma. Unfortunately I stumbled upon it only after learning the hard way that sRGB is non-linear. Still, that guy explains a few quirks, like how to notice an incorrect render and why font rasterizers use an unusual 1.42 gamma.
