Data and file formats
Learn how GIS software stores mapping data.
There’s a huge range of file types used in GIS, and each software product stores information in a different way. Here are the main approaches:
- single file for each layer – in some systems all information for a given layer is stored in a single file (for example, .dxf files in AutoCAD®);
- multiple files for each layer – some systems use a series of files for each layer (for example, MapInfo® has a .tab file for each table of information, but this is just a pointer to a set of files containing the geometry, attributes, identifiers and indexes separately); and
- a folder of files for each layer – more complex systems can use a more complicated hierarchy of files held in a specific folder to store the information (for example, ArcInfo® coverages).
In these last two examples, you should only ever interact with the data through the GIS software interface. Editing individual files outside the GIS risks corrupting the data.
GIS software can read image data from standard graphics file formats but often needs an additional file to put each square of map in the right place. Examples include MapInfo .tab and ESRI® .tfw. These are simple text files. If you have any examples on your own computer, try viewing the contents in a text editor to understand how they work.
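To illustrate how such a file puts an image in the right place, here is a minimal Python sketch that reads a .tfw-style world file and converts a pixel position into map coordinates. It assumes the common six-line world file layout (x pixel size, two rotation terms, negative y pixel size, then the x and y of the centre of the top-left pixel). The sample values and the names WorldFile and parse_world_file are made up for illustration and do not come from any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorldFile:
    # Fields are declared in the order the six lines appear in the file.
    a: float  # pixel width in map units
    d: float  # rotation term (usually 0)
    b: float  # rotation term (usually 0)
    e: float  # pixel height in map units (normally negative)
    c: float  # x of the centre of the top-left pixel
    f: float  # y of the centre of the top-left pixel

    def pixel_to_map(self, col: float, row: float) -> tuple[float, float]:
        """Apply the affine transform to a (column, row) pixel position."""
        x = self.a * col + self.b * row + self.c
        y = self.d * col + self.e * row + self.f
        return x, y


def parse_world_file(text: str) -> WorldFile:
    """Parse the six numeric lines of a world file, ignoring blank lines."""
    values = [float(line) for line in text.splitlines() if line.strip()]
    if len(values) != 6:
        raise ValueError(f"expected 6 values, found {len(values)}")
    return WorldFile(*values)


# Hypothetical sample: 2.5 map units per pixel, no rotation.
SAMPLE_TFW = """2.5
0.0
0.0
-2.5
400001.25
199998.75
"""

world = parse_world_file(SAMPLE_TFW)
top_left = world.pixel_to_map(0, 0)
further_in = world.pixel_to_map(100, 200)
```

The negative fourth value reflects that image rows count downwards while map northings count upwards, which is why moving down the image reduces the y coordinate.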
Translators and transfer formats
GIS software is designed to work with data stored in specific proprietary binary data formats. The binary code used to store the data is critical to the performance of the software and GIS vendors jealously guard their binary formats as part of their intellectual property.
In the early days of GIS, if you used data in one package you could not open it in software from a different vendor. More recently it has become standard for GIS software to have import facilities that can open files from a diverse range of formats and store them in the preferred local format.
Products have been developed to convert geographical data between a whole array of formats. It is now possible to convert in either direction between practically any of the data formats in use, of which there are over a hundred.
The previous pages in this section have discussed why so many different file types are used in today's GIS applications. Traditionally, data providers have supplied data in open ASCII formats, and systems have loaded the data and translated it into their proprietary binary format in a single step. Data can be exchanged between systems where an import option exists for the particular formats. There are also bespoke translation tools available to cover every possible option.
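As a rough illustration of what a translator does, the Python sketch below reads point data from an open ASCII format (CSV text) and rewrites it in another structure. GeoJSON is used here only as a stand-in target, which is an assumption for the example; a real GIS import would write its own native format. The column names lon and lat, the sample rows and the function name are hypothetical.

```python
import csv
import io
import json


def csv_points_to_geojson(csv_text: str, x_field: str = "lon",
                          y_field: str = "lat") -> dict:
    """Translate CSV rows of points into a GeoJSON-style FeatureCollection."""
    features = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Pull the coordinate columns out; everything left is an attribute.
        x = float(row.pop(x_field))
        y = float(row.pop(y_field))
        features.append({
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [x, y]},
            "properties": dict(row),
        })
    return {"type": "FeatureCollection", "features": features}


# Hypothetical sample data in an open ASCII format.
SAMPLE_CSV = """name,lon,lat
Depot,-1.4,50.9
Office,-0.1,51.5
"""

collection = csv_points_to_geojson(SAMPLE_CSV)
as_text = json.dumps(collection, indent=2)
```

The same pattern (read each record, separate geometry from attributes, write both in the target structure) applies whatever the source and target formats are, although binary formats need a library that understands their layout.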
Exchange between formats has advanced in recent years, and interoperability is important because several different software products might be in use in a single organisation. These products should enable information sharing between systems.
With continual improvements in the processing power of computer hardware, storing data in the optimum format for the GIS software becomes less critical to system performance. The recent trend towards non-specific data formats means that software reads data from, and writes it to, a range of native formats.
As a data provider, we make careful decisions about the formats we use to supply data. Data users want a choice of formats to avoid the need for translation. Although it’s difficult to provide every possible format, excluding just one would be unfair to that software vendor. Open data exchange formats create a level playing field for the producers of software and translators.
The increasing significance of databases and the Internet is also playing a big role in the advance of interoperability. Increasingly, systems are being built around the use of databases to hold the information in each GIS layer, replacing the use of flat files. Proprietary data formats are therefore becoming less important. Similarly, systems can now read data in real time from central locations on a network, rather than reading from locally-stored files.
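The idea of holding a GIS layer in a database rather than in flat files can be sketched with Python's built-in sqlite3 module. This is a deliberately simplified assumption: it stores only point features in an ordinary table and filters them with a bounding box, whereas real spatial databases provide dedicated geometry types and spatial indexes. The table, column and function names are invented for the example.

```python
import sqlite3


def create_layer_db() -> sqlite3.Connection:
    """Create an in-memory database with one table shared by all layers."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE features (
               id    INTEGER PRIMARY KEY,
               layer TEXT NOT NULL,
               name  TEXT,
               x     REAL NOT NULL,
               y     REAL NOT NULL
           )"""
    )
    conn.execute("CREATE INDEX idx_features_layer ON features (layer)")
    return conn


def add_point(conn, layer, name, x, y):
    """Insert one point feature into the named layer."""
    conn.execute(
        "INSERT INTO features (layer, name, x, y) VALUES (?, ?, ?, ?)",
        (layer, name, x, y),
    )


def points_in_box(conn, layer, min_x, min_y, max_x, max_y):
    """Return (name, x, y) for points in a layer inside a bounding box."""
    cursor = conn.execute(
        """SELECT name, x, y FROM features
           WHERE layer = ? AND x BETWEEN ? AND ? AND y BETWEEN ? AND ?
           ORDER BY id""",
        (layer, min_x, max_x, min_y, max_y),
    )
    return cursor.fetchall()


# Hypothetical sample data: two layers held in the same database.
conn = create_layer_db()
add_point(conn, "buildings", "Town Hall", 100.0, 200.0)
add_point(conn, "buildings", "Library", 150.0, 250.0)
add_point(conn, "buildings", "Station", 900.0, 900.0)
add_point(conn, "roads", "Junction 1", 120.0, 220.0)

nearby = points_in_box(conn, "buildings", 0, 0, 500, 500)
```

Because any client that can query the database can read the layer, the on-disk file layout matters much less than it does with flat files.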