Data Conversion - Information Basics

Before any data conversion is carried out, the user or application programmer should keep a few basics of computing and information theory in mind. These include:

  • Information can easily be discarded by the computer, but adding information takes effort.
  • The computer can add information only in a rule-based fashion.
  • Upsampling the data or converting to a more feature-rich format does not add information; it merely makes room for that addition, which usually has to be done by a human.
  • Storing data electronically greatly lowers the chance of data loss and makes data mining and other conversions easier.
  • Data stored in an electronic format can be quickly modified and analyzed.
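The first three points can be illustrated with a minimal sketch. The function names `upsample` and `downsample` and the sample values are hypothetical, chosen only for illustration; the sketch assumes the simplest possible scheme, in which upsampling repeats each sample and downsampling keeps every n-th one.

```python
def upsample(samples, factor):
    """Repeat each sample `factor` times (nearest-neighbour upsampling)."""
    return [s for s in samples for _ in range(factor)]


def downsample(samples, factor):
    """Keep every `factor`-th sample and discard the rest."""
    return samples[::factor]


original = [3, 7, 7, 1, 9, 4]

# Upsampling is rule-based: the original is exactly recoverable, so the
# longer sequence carries nothing the shorter one lacked.
bigger = upsample(original, 2)
assert downsample(bigger, 2) == original

# Downsampling discards information: two different inputs collapse to the
# same output, so the step cannot be undone from the output alone.
a = [3, 8, 7, 2]
b = [3, 0, 7, 5]
assert downsample(a, 2) == downsample(b, 2)
```

The upsampled list has room for new in-between values, but nothing in the rule supplies them; that is the sense in which the conversion only makes room.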

For example, a true color image can easily be converted to grayscale, while the opposite conversion is a painstaking process. Converting a Unix text file to a Microsoft (DOS/Windows) text file involves adding characters, but this does not increase the entropy, since the addition is rule-based. Adding color information to a grayscale image, by contrast, cannot be done programmatically: only a human knows which colors are needed for each section of the picture, and there are no rules that can be used to automate that process.

Converting a 24-bit PNG to a 48-bit one does not add information to it; it only pads the existing RGB pixel values with zeroes, so that a pixel with a value of FF C3 56, for example, becomes FF00 C300 5600. The conversion makes it possible to change a pixel to a value of, for instance, FF80 C340 56A0, but the conversion itself does not do that; only further manipulation of the image can.

Converting an image or audio file in a lossy format (like JPEG or Vorbis) to a lossless format (like PNG or FLAC) or an uncompressed format (like BMP or WAV) only wastes space, since the target holds the same image or sound, artifacts of lossy compression included, and none of the original information that was lost. A JPEG image can never be restored to the quality of the original lossless image from which it was made, no matter how often the user applies the "JPEG Artifact Removal" feature of an image manipulation program.
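The three conversions described above can be sketched in a few lines. This is an illustration under stated assumptions, not a description of any particular tool: all function names are hypothetical, the text input is assumed to be pure Unix text with no carriage returns, the gray value uses the common Rec. 601 luma weights (one choice among several), and the bit-depth step follows the zero-padding described in the text.

```python
def unix_to_dos(text: bytes) -> bytes:
    """Rule-based addition: put a CR before every LF (input assumed CR-free)."""
    return text.replace(b"\n", b"\r\n")


def dos_to_unix(text: bytes) -> bytes:
    """Inverse rule: drop the CR of every CR LF pair."""
    return text.replace(b"\r\n", b"\n")


def to_gray(rgb):
    """Collapse an (r, g, b) pixel to one gray level (Rec. 601 weights assumed)."""
    r, g, b = rgb
    return (299 * r + 587 * g + 114 * b) // 1000


def widen(rgb8):
    """Pad each 8-bit channel to 16 bits with a zero low byte."""
    return tuple(v << 8 for v in rgb8)


def narrow(rgb16):
    """Drop the low byte of each 16-bit channel."""
    return tuple(v >> 8 for v in rgb16)


# Characters are added, yet the original is exactly recoverable.
unix_text = b"first line\nsecond line\n"
assert dos_to_unix(unix_to_dos(unix_text)) == unix_text

# Pure red and a dark gray map to the same gray level, so the gray value
# alone cannot say which color the pixel originally had.
assert to_gray((255, 0, 0)) == to_gray((76, 76, 76))

# FF C3 56 becomes FF00 C300 5600; narrowing returns the same pixel.
wide = widen((0xFF, 0xC3, 0x56))
assert wide == (0xFF00, 0xC300, 0x5600)
assert narrow(wide) == (0xFF, 0xC3, 0x56)
```

The line-ending and bit-depth conversions are reversible because a fixed rule produced every added byte. The grayscale conversion is many-to-one, which is why no rule can invert it.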

Automatic restoration of information that was lost through a lossy compression process would probably require important advances in artificial intelligence.

Because of these realities of computing and information theory, data conversion is more often than not a complex and error-prone process that requires the help of experts.
