
Karl Fogel at
Well... it's a lot of special cases. Right now it's unreleasable because it infringes copyrights. For example, there's a conditional in the code that says "if (text == 'full text of this particular NYTimes editorial') { return 'this particular compressed string'; } else if (text == 'full text of this other editorial') { return 'this other compressed string'; } ..." etc, etc.
But I just realized that I could replace all those fulltext comparisons with hash comparisons! Then there would be no need to have the fulltexts in the code, and I could release the program. Thanks for helping me think that through.
The other problem is that it's a *lot* of special cases. I mean, this is a general theoretical property of a certain kind of compression program. You can achieve spectacular compression ratios if you're willing to pay for them in the source code size of the compression program.
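To make the hash idea concrete, here is a minimal sketch of roughly what that could look like. It is not the actual program: the names `build_table` and `compress`, the choice of SHA-256, the one-byte markers, and the `zlib` fallback are all assumptions made for illustration. Only the compression side is sketched.

```python
import hashlib
import zlib


def digest(text: str) -> str:
    """SHA-256 hex digest of the text (hash choice is an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_table(texts: list[str]) -> dict[str, bytes]:
    """Hypothetical offline step: map each known text's digest to a short code.

    Only the resulting digests would ship with the program, not the texts.
    This toy version supports at most 256 entries (one-byte codes).
    """
    return {digest(t): bytes([i]) for i, t in enumerate(texts)}


def compress(text: str, table: dict[str, bytes]) -> bytes:
    """Return a tiny code for a known text, else fall back to zlib."""
    code = table.get(digest(text))
    if code is not None:
        return b"\x01" + code  # marker byte + special-case code
    return b"\x00" + zlib.compress(text.encode("utf-8"))  # general case


if __name__ == "__main__":
    editorial = "placeholder editorial text " * 200  # stand-in, not a real article
    table = build_table([editorial])
    print(len(editorial), "->", len(compress(editorial, table)))
```

Every special case adds one more entry to the table, so the ratio on known inputs is paid for in the size of the program itself.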
So I guess I'll release it when GitHub installs a few more yottabytes of storage?
Let me know.