
Ordnance Survey – Great Britain's national mapping agency

GIS Files 6: Expert GIS concepts

6.3: Spatial databases (5)

Advanced database technology

Databases are now used in sophisticated ways that go beyond simply storing information and retrieving it with structured queries. Two techniques often encountered in the study of GIS are data warehousing and data mining. A database is usually designed as part of a particular system and stores only the information needed to make that system work. Organisations therefore end up with many databases serving different functions, and the data they hold often has value beyond the purpose of the individual systems for which the databases were designed. The centralised gathering together of the diverse sets of information stored within an organisation is known as data warehousing.

Techniques have emerged that automatically scan the information held in a data warehouse to identify possible relationships between data items. This is known as data mining, and it can reveal phenomena that might otherwise remain undetected. Terms you will often hear associated with data mining are regression, classification and clustering. For the most part, data mining is a statistical analysis of the contents of the data warehouse to identify commonalities and patterns. For example, regression is the mathematical analysis of numerical data to identify a formula that best fits the trends in the data. If a good fit is found, the formula can be used to predict future results.
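
As a rough illustration of regression, the sketch below fits a straight line to a handful of numbers using ordinary least squares and then uses the fitted formula to predict the next value. The function names and the figures are invented for this example; real data mining tools work on far larger data sets and offer many other models.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x about its mean, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Apply the fitted formula to a new x value."""
    return slope * x + intercept


# Made-up figures: a quantity recorded over five successive periods.
periods = [1, 2, 3, 4, 5]
values = [12.0, 14.0, 16.0, 18.0, 20.0]

slope, intercept = fit_line(periods, values)
forecast = predict(slope, intercept, 6)  # estimate for the sixth period
```

Because the invented figures lie exactly on a straight line, the fit here is perfect; with real data the line is only the best available approximation, and a prediction is only as good as that fit.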

Another feature of databases that is very important to their application in GIS is indexing. The way in which indexes speed up the response to queries has already been described. Indexing becomes especially important for geographical search queries. It is possible to generate spatial indexes that break down the space occupied by the features in a table and sort them into a hierarchy, similar to the alphabetical sorting of text values. In response to a request to find all objects that intersect a polygon, it can be quicker to use the index to find a subset of objects near that polygon first, and then analyse each object in this subset more precisely to find those that actually intersect.
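
The two-stage search described above can be sketched as follows. This is a simplified illustration, not how any particular spatial database works: it uses a flat grid of cells rather than a hierarchical index, it represents every feature by its bounding box (min_x, min_y, max_x, max_y), and the class and function names are invented for the example. In a real system the second stage would test the full geometry rather than a box.

```python
from collections import defaultdict


class GridIndex:
    """A uniform-grid spatial index over axis-aligned bounding boxes."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(set)  # (cx, cy) -> ids of features touching the cell
        self.boxes = {}                # feature id -> bounding box

    def _cells_for(self, box):
        """Yield every grid cell that the box overlaps."""
        min_x, min_y, max_x, max_y = box
        c = self.cell_size
        for cx in range(int(min_x // c), int(max_x // c) + 1):
            for cy in range(int(min_y // c), int(max_y // c) + 1):
                yield cx, cy

    def insert(self, feature_id, box):
        self.boxes[feature_id] = box
        for cell in self._cells_for(box):
            self.cells[cell].add(feature_id)

    def candidates(self, query_box):
        """Stage 1: cheap lookup of features sharing a cell with the query."""
        found = set()
        for cell in self._cells_for(query_box):
            found |= self.cells.get(cell, set())
        return found


def boxes_intersect(a, b):
    """Exact test for two axis-aligned boxes."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def query(index, query_box):
    """Stage 2: run the exact test only on the candidates."""
    return sorted(
        fid for fid in index.candidates(query_box)
        if boxes_intersect(index.boxes[fid], query_box)
    )


index = GridIndex(cell_size=10)
index.insert("a", (1, 1, 4, 4))
index.insert("b", (12, 12, 18, 18))
index.insert("c", (55, 55, 60, 60))
index.insert("d", (8, 8, 9, 9))

nearby = index.candidates((0, 0, 5, 5))  # "a" and "d" share the query's cell
hits = query(index, (0, 0, 5, 5))        # only "a" actually intersects
```

Features "b" and "c" are never examined at all, which is where the time is saved; "d" is near enough to be a candidate but is rejected by the exact test.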

Indexing is important because spatial queries can be very complicated and time consuming. If you are trying to select all features lying within a county boundary, you could be checking many thousands of records against a shape with thousands of vertices, which requires a lengthy geometric calculation. Spatial databases that contain features in three dimensions are starting to be developed – for example, to store a building as a 3-D volumetric object, not just a planar polygon shape. The generation of spatial indexes for three-dimensional space presents interesting challenges!
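
To give a feel for why an unindexed "within a boundary" query is expensive, here is a minimal sketch of one common exact test, the ray-casting point-in-polygon check. It assumes the features are simple points and the boundary is a single ring of (x, y) vertices; the function names and coordinates are invented for the example, and points lying exactly on the boundary are not treated specially.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test. polygon is a list of (x, y) vertices; the last
    vertex is joined back to the first automatically."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                # One more crossing to the right of the point
                inside = not inside
    return inside


def select_within(features, boundary):
    """Check every feature against every edge of the boundary."""
    return [fid for fid, pt in features.items() if point_in_polygon(pt, boundary)]


# A square standing in for a county boundary, and three point features.
boundary = [(0, 0), (10, 0), (10, 10), (0, 10)]
features = {"pub": (5, 5), "farm": (15, 5), "mill": (2, 8)}

selected = select_within(features, boundary)
```

Each test visits every edge of the boundary, so the work grows with the number of records multiplied by the number of vertices. With thousands of each, that is millions of edge checks, which is why narrowing down the candidates with an index first matters so much.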

Finally in this section, a mention of another key challenge – the storage of data in the fourth dimension, time. GIS databases designed to store information about real-world objects and how they change over time are called spatio-temporal. To truly reflect real-world changes in data form, a GIS needs to maintain historical records. The simplest way of achieving this is to keep copies of the data at intervals to create a series of time slices. A better approach, though harder to achieve, is to archive each feature every time a change is made to it; this means you can answer a query for any moment in time.
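
The per-feature archiving approach can be sketched as below. This is one possible design under stated assumptions, not a description of any particular product: each version of a feature carries a period of validity, times are plain integers standing in for dates, changes are assumed to arrive in chronological order, and all names and coordinates are invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FeatureVersion:
    feature_id: str
    geometry: tuple
    valid_from: int                  # when this version came into effect
    valid_to: Optional[int] = None   # None means it is the current version


class FeatureHistory:
    """Keeps every version of every feature instead of overwriting."""

    def __init__(self):
        self.versions = []

    def update(self, feature_id, geometry, when):
        # Close off the current version, if any, then archive the new one.
        for v in self.versions:
            if v.feature_id == feature_id and v.valid_to is None:
                v.valid_to = when
        self.versions.append(FeatureVersion(feature_id, geometry, when))

    def as_of(self, when):
        """Return the geometry of each feature as it stood at the given time."""
        return {
            v.feature_id: v.geometry
            for v in self.versions
            if v.valid_from <= when and (v.valid_to is None or when < v.valid_to)
        }


original = ((0, 0), (10, 0), (10, 10), (0, 10))
extended = ((0, 0), (15, 0), (15, 10), (0, 10))

history = FeatureHistory()
history.update("building-1", original, 1990)        # building surveyed
history.update("road-7", ((0, 12), (30, 12)), 2000)  # road added
history.update("building-1", extended, 2005)        # building extended

in_1995 = history.as_of(1995)  # only the original building exists
in_2010 = history.as_of(2010)  # the extended building and the road
```

A time-slice approach could only answer for the dates on which copies happened to be taken, whereas here any moment can be queried, at the cost of storing and searching every past version.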

The next section of Chapter 6 – Expert GIS concepts – looks at the different ways in which map information can be portrayed from a digital source.
