
Compression Algorithms

Discussion in 'Nerd Out Zone' started by Potatocat, Apr 12, 2016.

  1. Potatocat

    Potatocat Back Into Space

    • Member
    Today's compression algorithms, both lossy and lossless, <opin.> could be better. </opin.>

    How a traditional compression algorithm works (roughly): find patterns of data that repeat multiple times, count the number of times each pattern is repeated (maybe even count when only a portion of the pattern is repeated!!!), add the pattern to a table inside the output file's header, then write a reference to the pattern wherever it is needed.
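
    A toy sketch of that header-table scheme in Python (everything here is illustrative: it only handles one fixed pattern length, uses 0xFF as an escape byte, and a real codec would also need to escape literal 0xFF bytes and ship a matching decompressor):

    from collections import Counter

    def toy_compress(data: bytes, pattern_len: int = 4, min_repeats: int = 3) -> bytes:
        # Count every fixed-length pattern in the input.
        counts = Counter(data[i:i + pattern_len] for i in range(len(data) - pattern_len + 1))
        # Keep the patterns that repeat often enough to earn a table entry
        # (at most 255 so an index fits in one byte).
        table = [p for p, n in counts.most_common(255) if n >= min_repeats]
        body = bytearray()
        i = 0
        while i < len(data):
            chunk = data[i:i + pattern_len]
            if chunk in table:
                body += bytes([0xFF, table.index(chunk)])  # escape byte + table index
                i += pattern_len
            else:
                body.append(data[i])
                i += 1
        # "Table inside the output file's header": entry count, the entries, then the body.
        header = bytes([len(table)]) + b"".join(table)
        return header + bytes(body)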

    This works pretty well, but... IT COULD BE BETTER.

    The way I suggest it could be better: rather than making a new table in the header of each file being worked on, simply have a table made in advance that is stored inside the program that compresses the file, and place references to the patterns we already have stored into the output file.

    How this would or could work (a rough code sketch follows this list):
    -Read thousands of sample files, finding common patterns of data
    -Take the most common patterns and put them in a reference table keyed by HASH values
    -Update the table regularly via the internet, but keep records of each version
    -Keep the table at a quickly indexable size (1 GB?)
    -Compress a file by matching parts of the file to a pattern using a pattern-matching algorithm
    -Can even compress FURTHER by finding patterns of patterns!!! (HAVE ANOTHER TABLE OF COMMON HASH VALUE COMBOS??!?!??)
    -Do not include the table in the header, only the type of file
    -Decompress by using the same version of the table, with the same HASH values
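
    For what it's worth, DEFLATE already has a small-scale version of the "table made in advance" idea: zlib lets the compressor and decompressor agree on a preset dictionary that is never written into the output, only assumed on both ends. A minimal sketch of that (the dictionary contents below are made-up sample text standing in for patterns mined from thousands of files):

    import zlib

    # Stand-in for the pre-built shared pattern table; in the proposal this would
    # be mined from thousands of sample files and shipped with the program.
    SHARED_DICT = b"the quick brown fox jumps over the lazy dog " * 8

    def compress_shared(data: bytes) -> bytes:
        c = zlib.compressobj(level=9, zdict=SHARED_DICT)
        return c.compress(data) + c.flush()

    def decompress_shared(blob: bytes) -> bytes:
        d = zlib.decompressobj(zdict=SHARED_DICT)
        return d.decompress(blob) + d.flush()

    sample = b"the quick brown fox jumps over the lazy dog, says the lazy dog"
    with_dict = compress_shared(sample)
    without_dict = zlib.compress(sample, 9)
    print(len(sample), len(without_dict), len(with_dict))  # the preset dictionary usually wins on short inputs
    assert decompress_shared(with_dict) == sample

    The catch is the one the list already flags: both ends have to use the exact same dictionary version, which is why keeping records of each version matters.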

    Considerable Idea? Y/N?
  2. Potatocat

    Potatocat Back Into Space

    • Member
    7 views, doth not one have a thing to say?
  3. PsychoticLeprechaun

    PsychoticLeprechaun Designer & Web Developer

    • Dev Member
    I suspect having one huge table would slow decompression times considerably. I'm also not sure about the inner workings of compression algorithms in detail, but I would be interested to see how documents with different kinds of content would compare in how much of their tables they have in common.
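
    One rough way to eyeball that, as a sketch (assuming "compare" just means the overlap between the repeated fixed-length substrings each document would put in its table):

    from collections import Counter

    def repeated_patterns(data: bytes, n: int = 8) -> set:
        # Substrings of length n that occur more than once, i.e. candidates for a table entry.
        counts = Counter(data[i:i + n] for i in range(len(data) - n + 1))
        return {p for p, c in counts.items() if c > 1}

    def table_overlap(doc_a: bytes, doc_b: bytes) -> float:
        a, b = repeated_patterns(doc_a), repeated_patterns(doc_b)
        return len(a & b) / max(1, len(a | b))  # 0.0 = nothing shared, 1.0 = identical tables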
    Potatocat likes this.
  4. Potatocat

    Potatocat Back Into Space

    • Member
    Once again humanity must choose between speed and reliability, I suppose. But if quantum computing gets integrated into normal everyday computers, then this would be the ideal compression method.

    And I'm not sure how well docs would compare, but my guess is that the output will be smaller than the input, unless the lookup table is very large.
  5. balmerhevi

    balmerhevi Guest

    LZMA2 is faster when using 4 or more cores, and it gives better compression.

    More about: LZMA2 in 7-Zip
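
    A quick way to sanity-check the ratio claim from Python (sample.txt is just a placeholder for whatever file you want to test with; note that the standard-library lzma module uses the LZMA2 filter inside .xz containers but compresses on a single thread, unlike 7-Zip's multithreaded LZMA2):

    import lzma
    import zlib

    with open("sample.txt", "rb") as f:  # placeholder path, substitute any file
        data = f.read()

    deflated = zlib.compress(data, 9)   # DEFLATE, as used by zip/gzip
    xz = lzma.compress(data, preset=9)  # LZMA2 inside an .xz container
    print(f"original {len(data)}, zlib {len(deflated)}, lzma {len(xz)}")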

    Balmer
    Potatocat likes this.
  6. Potatocat

    Potatocat Back Into Space

    • Member
    That's what I've heard.
