Each line in a file represents a row of data, but that data is usually divided into fields, or columns. Two standard methods delineate fields. One is to have fixed-length fields, and therefore fixed-length records; the other is to have variable-length fields, and therefore variable-length records. Variable-length fields must be separated by a symbol, called the delimiter. A database of variable-length records produces a delimited file; a database of fixed-length records produces a fixed-width file.
The most common form of delimited file uses the comma as a field separator. These files are called comma-separated values (CSV) files. The comma suits numeric data, but can cause problems with text, where commas occur naturally. Other delimiters include the space (" "), the bar ("|") and the caret, or hat sign ("^"). The file designer or programmer has to find a character that is rarely used in the data. Sometimes it may be necessary to use a combination of characters.
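The problem with commas in text can be seen in a short Python sketch. The record contents and variable names here are invented for illustration, and the splitting is deliberately naive: it assumes the writer does nothing to protect delimiters inside a field.

```python
# Hypothetical record: an ID, a name that contains a comma, and a city.
record = ["1001", "Smith, John", "Boston"]

# Joining with a comma hides the real field boundary inside the name.
comma_line = ",".join(record)
comma_fields = comma_line.split(",")
print(len(comma_fields))  # one more piece than the record has fields

# The bar does not occur in this data, so the fields survive a round trip.
pipe_line = "|".join(record)
pipe_fields = pipe_line.split("|")
print(pipe_fields == record)
```

The bar works here only because it never appears in the sample data, which is the condition the file designer has to check for a real data set.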
It's not always possible to guarantee that a given character will never appear in the data to be stored, so the difficulty of finding a suitable delimiter can make fixed-length fields preferable. However, the fixed-length format carries overheads in both storage and processing, so delimited files are more common. A fixed-length field has to be padded out to its full width. The most common forms of padding are left padding with zeros for numeric data, and right padding with spaces for text.
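The two padding conventions can be sketched in Python as follows. The function names and the field widths (6 and 10) are assumptions made for this example, and the numeric helper assumes non-negative whole numbers.

```python
def pad_numeric(value: int, width: int) -> str:
    """Left-pad a non-negative integer with zeros to exactly `width` characters."""
    text = str(value)
    if len(text) > width:
        raise ValueError(f"{text!r} does not fit in {width} characters")
    return text.zfill(width)


def pad_text(value: str, width: int) -> str:
    """Right-pad text with spaces to exactly `width` characters."""
    if len(value) > width:
        raise ValueError(f"{value!r} does not fit in {width} characters")
    return value.ljust(width)


# Hypothetical layout: a 6-character account number, then a 10-character city.
fixed_record = pad_numeric(42, 6) + pad_text("Boston", 10)
print(repr(fixed_record))
```

Every record built this way has the same length, 16 characters in this layout, whatever the values are. That is where the storage overhead comes from.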
Whether a file is of fixed width or contains delimited fields, the writing and reading programs have to follow the same conventions. A program receiving a fixed-width file first has to know the length and the data type of each field. A program receiving a delimited file has to know which delimiter to look for.
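As a rough illustration of what each kind of reading program needs to know, the Python sketch below parses one record of each type. The layout table, field names and sample lines are hypothetical; a real layout would come from whoever designed the file.

```python
# Hypothetical fixed-width layout: (field name, length, data type).
LAYOUT = [("account", 6, int), ("city", 10, str)]


def parse_fixed(line, layout=LAYOUT):
    """Slice a fixed-width line using the agreed lengths and data types."""
    fields = {}
    position = 0
    for name, length, convert in layout:
        raw = line[position:position + length]
        fields[name] = convert(raw.strip())  # strip() removes space padding
        position += length
    return fields


def parse_delimited(line, delimiter="|"):
    """Split a delimited line; only the delimiter has to be agreed in advance."""
    return line.split(delimiter)


fixed_result = parse_fixed("000042Boston    ")
delimited_result = parse_delimited("42|Boston")
print(fixed_result)
print(delimited_result)
```

Here the fixed-width reader recovers the number 42 because the layout says the first field is numeric, while the delimited reader returns plain strings and leaves any type conversion to a later step.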
In each case, importing programs should have exception-reporting procedures that write out rejected records to a separate file. The most common reason that a delimited record gets rejected is that the delimiter appears in the data, creating extra columns. Fixed-width records usually get rejected for being too long. Short records usually do not cause errors; the final fields are simply left unpopulated. If those final fields are mandatory, however, short records will be rejected.
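A minimal Python sketch of such an import routine for delimited records follows. The function name, its parameters and the sample data are all assumptions for illustration, and in-memory text streams stand in for the input file and the separate reject file.

```python
import io


def import_delimited(source, rejects, expected_fields,
                     delimiter="|", final_fields_mandatory=False):
    """Return accepted rows; write rejected lines to the `rejects` stream."""
    accepted = []
    for line in source:
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        fields = line.split(delimiter)
        if len(fields) > expected_fields:
            # Extra columns: the delimiter probably appears inside the data.
            rejects.write(line + "\n")
            continue
        if len(fields) < expected_fields:
            if final_fields_mandatory:
                rejects.write(line + "\n")
                continue
            # Short record: leave the final fields unpopulated.
            fields += [""] * (expected_fields - len(fields))
        accepted.append(fields)
    return accepted


# Hypothetical input: one good record, one with a stray bar, one short record.
source = io.StringIO("1|Ann|Boston\n2|Bob|New|York\n3|Cy\n")
rejects = io.StringIO()
rows = import_delimited(source, rejects, expected_fields=3)
print(rows)
print(rejects.getvalue())
```

With real files, the two streams would be replaced by file objects opened for reading and writing. Passing final_fields_mandatory=True sends the short record to the reject stream as well.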