>>14-15 you're rambling, FV. like, I usually have no problem understanding your posts, but I can't follow the moon logic here. the pigeonhole principle is a mathematical fact; math has no concept of 'files' or 'usefulness'. if you have two bits of data, one bit can't represent all four possible states, so no mapping down to one bit can be reversed for every input.
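here's a quick python sketch to make the counting argument concrete (my own toy example, nothing from your post): map every 2-bit string to a 1-bit 'code' and you're forced to have collisions, so no decompressor can get all the originals back.

# toy counting argument: try to "compress" every 2-bit string down to 1 bit.
# any such mapping must send at least two inputs to the same output (pigeonhole),
# so no decompressor can recover all of them.
from itertools import product

inputs = [''.join(bits) for bits in product('01', repeat=2)]   # 4 possible inputs
outputs = ['0', '1']                                           # only 2 possible codes

# pick any mapping you like; here, just keep the first bit
mapping = {s: s[0] for s in inputs}

collisions = {}
for s, code in mapping.items():
    collisions.setdefault(code, []).append(s)

for code, originals in collisions.items():
    if len(originals) > 1:
        print(f"code {code!r} is shared by {originals} -> can't decompress losslessly")

swap in any other mapping you want; with 4 inputs and 2 codes, at least one code always ends up shared.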
if I understood correctly, you're proposing a mixture of efficient algorithms tailored to specific types of data (good idea, but it won't break the pigeonhole principle) with some weird scheme of lossy compression plus an exhaustive search to reconstruct whatever was thrown away. now that's a fucking horrible idea. why? even disregarding the time and memory requirements, it's basically impossible to tell the difference between 'useless' and 'useful' data algorithmically. you could assume that high-entropy data is useless, but that's not always the case. sure, it might be garbage, but maybe I've compressed a folder which, among other things, contains encryption keys, already-compressed files and, worst of all, perl code. how could your algorithm tell the difference between actual noise and things that merely look like noise but will break if you decide to replace them with something more structured?
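to make the 'looks like noise' point concrete, here's a rough sketch (my own filenames and numbers, obviously, not your algorithm): measure shannon entropy per byte of a random key, a zlib-compressed blob and some plain text. the first two both sit near 8 bits/byte, so an entropy test alone can't separate 'throwaway garbage' from 'data you absolutely must not touch'.

# rough sketch: shannon entropy per byte can't distinguish an encryption key
# from other high-entropy data -- both come out near 8 bits/byte.
import math, os, zlib
from collections import Counter

def entropy_per_byte(data: bytes) -> float:
    counts = Counter(data)
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

key = os.urandom(4096)                               # stand-in for an encryption key
text = b"just some ordinary ascii text, repeated a lot. " * 100
compressed = zlib.compress(text, 9)                  # already-compressed data

for name, blob in [("random key", key), ("plain text", text), ("zlib output", compressed)]:
    print(f"{name:12s} {entropy_per_byte(blob):.2f} bits/byte")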
basically, as clever as your algorithm might be, it won't break the theoretical limits set by the pigeonhole principle while still being lossless.