
> Constrained formats, and formats where the set of 'essential information' can be canonicalized into a particular representation should be the norm, rather than the exotic exception, especially in situations where minute inessential differences can be cascaded to drastically alter the result.

That might be very challenging in practice, because a more expressive language directly allows a more compressed/efficient encoding of the same information, but at the cost of making a canonical representation more difficult (or impossible) to define.
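To make the canonicalization point concrete, here is a minimal sketch using JSON, which already leaves key order and whitespace up to the writer. The `canonicalize` helper and its rule (sorted keys, no optional whitespace) are my own ad hoc assumption for illustration, not a standard scheme; it ignores harder cases such as number formatting and Unicode normalization.

```python
import json

# Two serializations of the same logical data: key order and whitespace differ.
a = '{"name": "x", "size": 3}'
b = '{ "size":3,\n  "name":"x" }'


def canonicalize(text: str) -> str:
    """Hypothetical canonical form: sorted keys, no optional whitespace."""
    return json.dumps(json.loads(text), sort_keys=True, separators=(",", ":"))


# The raw texts differ byte for byte...
assert a != b
# ...but they collapse to one representation once canonicalized.
assert canonicalize(a) == canonicalize(b)
```

Even for a format this small the canonical form has to be chosen and agreed on, and the more freedom the format gives the writer, the more such rules are needed.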

Also, data formats that are purposefully redundant for error tolerance basically all require readers to tolerate non-canonical forms. If we want to represent some bytes redundantly in case of data loss, then there must be multiple representations of those bytes that the reader accepts for this to work.
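A toy illustration of the redundancy argument, assuming the simplest possible scheme: store three copies and take a per-byte majority vote. The `encode` and `decode` names are hypothetical, and real formats use far more efficient error-correcting codes, but the property is the same: a damaged encoding and a clean one must both be accepted and decode to the same bytes.

```python
from collections import Counter


def encode(data: bytes, copies: int = 3) -> list[bytes]:
    """Repetition code: store the same payload several times."""
    return [data] * copies


def decode(copies: list[bytes]) -> bytes:
    """Majority vote per byte position; assumes all copies have equal length."""
    return bytes(Counter(column).most_common(1)[0][0] for column in zip(*copies))


clean = encode(b"hello")
damaged = [b"hello", b"hxllo", b"hello"]  # one copy has a corrupted byte

# Two different stored forms, one decoded result.
assert clean != damaged
assert decode(clean) == decode(damaged) == b"hello"
```

A reader that rejected everything except the canonical `clean` form would throw away exactly the tolerance the redundancy was added for.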

Video and image formats use multiple encodings to give encoders the room to make time-space trade-offs.
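The same trade-off shows up in lossless compression. As a small sketch, Python's standard `zlib` module (DEFLATE, the compression used inside PNG) produces different byte streams for the same input depending on the effort level, and every one of them is valid for the decoder. The payload here is an arbitrary example I made up.

```python
import zlib

payload = b"abcabcabc" * 100  # arbitrary, highly repetitive example data

fast = zlib.compress(payload, 0)   # level 0: stored without compression
small = zlib.compress(payload, 9)  # level 9: most effort, smallest output

# Different encodings of the same data, with different sizes...
assert fast != small
assert len(fast) > len(small)
# ...that decode to identical bytes.
assert zlib.decompress(fast) == zlib.decompress(small) == payload
```

The encoder picks where to sit between speed and size, which only works because the format does not insist on a single canonical byte stream.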



I agree. For anything that a human is supposed to see with their eyes, there are always different representations that look the "same" enough to be indistinguishable.

People should be aware of this, rather than believing in a nonexistent world where it isn't so.



