Lossless encoding, or lossless compression, refers to the process of encoding data more efficiently so that it occupies fewer bits or bytes, but in such a way that the original data can be reconstructed, bit for bit, when the data is decompressed. The advantage of lossless encoding techniques is that they produce an exact duplicate of the original data, but they also have some disadvantages compared with lossy encoding techniques.
Lossless encoding techniques cannot achieve high levels of compression. Few lossless encoding techniques can achieve a compression ratio higher than 8:1, which compares unfavorably with so-called lossy encoding techniques. Lossy encoding techniques -- which achieve compression by discarding some of the original data -- can achieve compression ratios of 10:1 for audio and 300:1 for video with little or no perceptible loss of quality. According to the New Biggin Photography Group, a 1,943 by 1,702 pixel 24-bit RGB color image with an original size of 9.9 megabytes can only be reduced to 6.5 megabytes using the lossless PNG format, but can be reduced to just 1 megabyte using the lossy JPEG format.
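The limit on lossless compression is easy to see in practice: how far data can be shrunk without loss depends entirely on how much redundancy it contains. The following sketch uses Python's standard zlib module (the same DEFLATE compression used by PNG) to compare a highly repetitive input with high-entropy random bytes, which stand in for data with little redundancy; the sizes and inputs are illustrative, not taken from the figures above.

```python
import os
import zlib

# Highly repetitive data contains lots of redundancy and compresses well.
repetitive = b"abcd" * 25_000        # 100,000 bytes
# Random bytes have almost no redundancy; lossless compression gains nothing.
noisy = os.urandom(100_000)          # 100,000 bytes

for label, data in [("repetitive", repetitive), ("random", noisy)]:
    compressed = zlib.compress(data, level=9)
    ratio = len(data) / len(compressed)
    print(f"{label}: {len(data):,} -> {len(compressed):,} bytes "
          f"(ratio {ratio:.1f}:1)")
```

The repetitive input shrinks dramatically, while the random input barely changes size (it may even grow slightly). Photographic images sit between these extremes, which is why lossless formats such as PNG rarely exceed modest ratios on them.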
Any application that involves storing digital images, distributing them, or both, presupposes that these operations can be completed in a reasonable length of time. The time needed to transfer a digital image depends on the size of the compressed image, and because the compression ratios achievable with lossless encoding techniques are far lower than those of lossy encoding techniques, lossless encoding techniques are poorly suited to these applications.
Many lossless encoding techniques, including PNG, use a form of coding known as Huffman coding. In Huffman coding, the more often a symbol occurs in the original data, the shorter the binary string used to represent it in the compressed data. However, Huffman coding requires two passes -- one to build a statistical model of the data and a second to encode it -- so it is a relatively slow process. This in turn means that lossless encoding techniques that use Huffman coding are notably slower than other techniques when reading or writing files.
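The two-pass structure described above can be sketched as follows. This is a minimal illustrative implementation, not the encoder PNG actually uses: the first pass counts symbol frequencies and builds the Huffman tree, and the second pass walks the data again to emit the variable-length codes.

```python
import heapq
from collections import Counter

def huffman_codes(data: bytes) -> dict:
    """Pass 1: build a statistical model (symbol frequencies), then the
    Huffman tree. More frequent symbols receive shorter bit strings."""
    freq = Counter(data)
    # Heap items are (frequency, tie-breaker, tree); a tree is either a
    # symbol or a (left, right) pair of subtrees.
    heap = [(n, i, sym) for i, (sym, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        n1, _, t1 = heapq.heappop(heap)
        n2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (n1 + n2, counter, (t1, t2)))
        counter += 1
    codes = {}
    def walk(tree, prefix=""):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix or "0"   # edge case: one distinct symbol
    walk(heap[0][2])
    return codes

def huffman_encode(data: bytes) -> str:
    """Pass 2: re-read the data, replacing each symbol with its code.
    Needing both passes is what makes Huffman-based encoding slow."""
    codes = huffman_codes(data)
    return "".join(codes[b] for b in data)

bits = huffman_encode(b"abracadabra")
print(len(bits), "bits, versus", 11 * 8, "bits uncompressed")
```

In "abracadabra" the symbol "a" occurs five times and so gets the shortest code, while the rare "c" and "d" get the longest, which is exactly the trade that makes the encoded output smaller than the fixed 8-bits-per-symbol original.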
Another disadvantage of Huffman coding is that the binary strings, or codes, in the encoded data are all different lengths. This makes it difficult for the decoding software to determine when it has reached the last bit of data, and if the encoded data is corrupted -- in other words, it contains spurious bits or has bits missing -- it will be decoded incorrectly and the output will be nonsense.
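A short sketch makes the fragility concrete. Using a small hand-built prefix code (hypothetical, chosen only for illustration), the decoder consumes bits until the accumulated string matches a code; because the codes have no fixed length, a single flipped bit shifts every subsequent symbol boundary and the rest of the stream is misread.

```python
# A tiny hypothetical prefix code: no code is a prefix of another.
CODES = {"a": "0", "b": "10", "c": "110", "d": "111"}
DECODE = {bits: sym for sym, bits in CODES.items()}

def decode(bitstring: str) -> str:
    """Accumulate bits until they match a code, then emit that symbol.
    Variable-length codes offer no fixed boundary to resynchronise on."""
    out, buf = [], ""
    for bit in bitstring:
        buf += bit
        if buf in DECODE:
            out.append(DECODE[buf])
            buf = ""
    return "".join(out)

encoded = "".join(CODES[s] for s in "abcd")   # "010110111"
print(decode(encoded))        # intact stream decodes to "abcd"
corrupted = "1" + encoded[1:] # flip just the first bit
print(decode(corrupted))      # every later symbol is now misread
```

Fixed-length encodings do not share this failure mode: a flipped bit corrupts one symbol but the decoder realigns at the next symbol boundary, whereas here the damage propagates through the rest of the data.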