Data conversion is the conversion of computer data from one format to another. Throughout a computer environment, data is encoded in a variety of ways. Computer hardware, for instance, is built on certain standards, which may require that data contain parity bits for error checking. Similarly, the operating system is predicated on certain standards for data and file handling. Furthermore, each computer program handles data in a different manner. Whenever any one of these variables is changed, data must be converted in some way before it can be used by a different computer, operating system or program. Even different versions of these elements usually involve different data structures. Changing bits from one format to another, usually for the sake of application interoperability or to make new features available, is itself a data conversion. Data conversions may be as simple as the conversion of a text file from one character encoding system to another, or more complex, such as the conversion of office file formats or of image and audio file formats.
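The simplest case mentioned above, converting text from one character encoding to another, can be sketched in a few lines of Python. This is only an illustration: the function name `convert_encoding` and the sample string are assumptions made for the example, not part of any particular conversion tool.

```python
def convert_encoding(data: bytes, source: str, target: str) -> bytes:
    """Decode bytes using the source encoding, then re-encode in the target."""
    text = data.decode(source)   # bytes -> abstract Unicode text
    return text.encode(target)   # Unicode text -> bytes in the new encoding


# Hypothetical sample data: the word "café" stored as Latin-1 (4 bytes).
latin1_bytes = "café".encode("latin-1")
utf8_bytes = convert_encoding(latin1_bytes, "latin-1", "utf-8")
```

The characters are unchanged, but the byte sequences differ: the "é" is a single byte in Latin-1 and two bytes in UTF-8. Converting to an encoding that cannot represent a character (ASCII, for instance) would fail or force that character to be dropped or substituted.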
There are many ways in which data is converted within the computer environment. The conversion may be seamless, as in the case of upgrading to a newer version of a computer program. Alternatively, it may require a special conversion program, or it may involve a complex process of going through intermediary stages or through “exporting” and “importing” procedures, which may include converting to and from a tab-delimited or comma-separated text file. In some cases, a program can recognize several data file formats at the input stage and can also store its output in a number of different formats. Such a program may be used to convert a file format. If the source format or target format is not recognized, a third program may sometimes be available that converts to an intermediate format, which can then be reformatted using the first program.
Before any data conversion is carried out, the user or application programmer should keep a few basics of computing and information theory in mind. These include:
Information can easily be discarded by the computer, but adding information takes effort.
The computer can add information only in a rule-based fashion.
Upsampling the data or converting to a more feature-rich format does not add information; it merely makes room for that addition, which usually a human must do.
Data stored in an electronic format can be quickly modified and analyzed.
For example, a true color image can easily be converted to grayscale, while the opposite conversion is a painstaking process. Converting a Unix text file to a Microsoft (DOS/Windows) text file involves adding characters, but this does not increase the entropy, since the addition is rule-based. The addition of color information to a grayscale image, by contrast, cannot be done programmatically, since only a human knows which colors are needed for each section of the picture; there are no rules that can be used to automate that process. Converting a 24-bit PNG to a 48-bit one does not add information to it; it only pads the existing RGB pixel values with zeroes, so that a pixel with a value of FF C3 56, for example, becomes FF00 C300 5600. The conversion makes it possible to change a pixel to a value of, for instance, FF80 C340 56A0, but the conversion itself does not do that; only further manipulation of the image can. Converting an image or audio file in a lossy format (like JPEG or Vorbis) to a lossless (like PNG or FLAC) or uncompressed (like BMP or WAV) format only wastes space, since the target merely reproduces the same image or sound, including the artifacts of lossy compression, and none of the lost original information. A JPEG image can never be restored to the quality of the original image from which it was made, no matter how much the user tries the “JPEG Artifact Removal” feature of his or her image manipulation program.
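The three conversions described above can be sketched in Python. The function names are hypothetical, and the grayscale weights (the common Rec. 601 luma coefficients, in integer form) are an assumption, since the text does not specify a formula.

```python
def unix_to_dos(data: bytes) -> bytes:
    """Rule-based addition: every LF line ending becomes CR LF."""
    # Normalize first so existing CR LF pairs are not doubled.
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


def dos_to_unix(data: bytes) -> bytes:
    """The inverse rule: every CR LF becomes LF."""
    return data.replace(b"\r\n", b"\n")


def pad_to_16_bit(pixel: tuple) -> tuple:
    """Widen each 8-bit channel to 16 bits by appending a zero byte."""
    return tuple(channel << 8 for channel in pixel)


def to_gray(pixel: tuple) -> int:
    """Collapse an RGB pixel to one gray level (assumed Rec. 601 weights)."""
    r, g, b = pixel
    return (299 * r + 587 * g + 114 * b) // 1000


unix_text = b"first line\nsecond line\n"
dos_text = unix_to_dos(unix_text)          # longer, but fully reversible
padded = pad_to_16_bit((0xFF, 0xC3, 0x56)) # same values, wider container
gray_from_red = to_gray((255, 0, 0))
gray_from_gray = to_gray((76, 76, 76))     # a different color, same result
```

The line-ending conversion adds bytes yet can be undone exactly, and the padded pixel holds the same three values in a wider container. The grayscale conversion is different in kind: pure red and a dark gray both collapse to the same gray level, so nothing in the output indicates which color to restore.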
Automatic restoration of information that was lost through a lossy compression process would probably require important advances in artificial intelligence.
Because of these realities of computing and information theory, data conversion is often a complex and error-prone process that requires the help of experts.