Any information that relates to a place on the ground can be loaded into a GIS and analysed. When you consider the amount of detail contained in GIS data, this can often raise the question – Is Big Brother watching you?
Well, yes actually, he probably is. We are now well and truly living in an information age and there is no escaping the fact that some of this information is about ourselves. The difference with GIS is that if Big Brother is watching us, he is less likely to be looking at us in the wrong place.
This section looks at the different ways in which the information attached to GIS objects can be interrogated and exploited.
What are attributes?
The maps shown in GIS are intelligent – the features know their own identity.
The term attribute describes any piece of information about an object that can be stored in addition to its geographic properties. For instance, a road may have a number, a name, a maximum width, a speed limit and so on. GIS can work with this descriptive attribute information to provide far more insight than can be achieved by placing text on a paper map.
With GIS you are no longer restricted by how many text descriptions can be fitted into the available space to convey information about the objects in an area. Tabular information can be stored about each of the objects just as in a database, so allowing an almost infinite array of attributes to be recorded. All GIS have simple tools that allow the interrogation of the features. Hence, by using your mouse to click on an object, a full set of attributes can be displayed without that information having to be on screen all the time. The object of interest can be identified by the visual map graphic and then that object can tell you its own attributes. Which is very clever stuff...
Most GIS enable the user to view the data in tabular form without necessarily using map graphics at all. This is equivalent to using typical office spreadsheet software. Often you may know the name of an object but not necessarily where it is – hence you can use the table to find the object and then switch to the map to see where it is.
The GIS forms a constant link between the attributes and the geographical properties of each of the features: you can get either one of these if you know something about the other. This is the basis of location-finding mapping services on the Internet: you can generate a map for any location because there is a data layer with a link between the postcode attribute and the geographical coordinates. You can see how this works on Ordnance Survey Get-a-map™.
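The two-way link between attributes and position can be sketched in a few lines of Python. This is only an illustration: the feature records, postcodes and coordinates below are invented, and the two helper functions are hypothetical names rather than part of any GIS product.

```python
# Hypothetical features: each record ties attributes to a position.
features = [
    {"id": 1, "name": "Town Hall", "postcode": "ZZ1 1AA",
     "easting": 430000, "northing": 115000},
    {"id": 2, "name": "Library", "postcode": "ZZ1 2BB",
     "easting": 430250, "northing": 115400},
]


def find_by_attribute(features, key, value):
    """Attribute -> location: return every feature whose `key` equals `value`."""
    return [f for f in features if f.get(key) == value]


def find_nearest(features, easting, northing):
    """Location -> attributes: return the feature closest to a clicked point."""
    return min(
        features,
        key=lambda f: (f["easting"] - easting) ** 2
        + (f["northing"] - northing) ** 2,
    )


# Know the postcode, want the position.
library = find_by_attribute(features, "postcode", "ZZ1 2BB")[0]
print(library["easting"], library["northing"])  # 430250 115400

# Know the position (a mouse click), want the attributes.
clicked = find_nearest(features, 430010, 115020)
print(clicked["name"])  # Town Hall
```

A real GIS would use a spatial index rather than scanning every feature, but the principle is the same: one record holds both kinds of information.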
GIS can tell you everything worth knowing about anything
Once a feature is loaded into a GIS, any piece of information about that object can be linked to it. How does this work?
When you start with geospatial data you often only have attributes that could be determined from the original source material. Information gleaned from the original map might, for example, show a line feature as an A road, numbered A11. However, the GIS can be used to link to any piece of information that may exist about the object from other systems. This can often lead to very powerful applications of GIS.
Any organisation which holds information about geographical objects can load that information into a GIS as long as they have some map data containing the relevant objects. Therefore it is not just the attributes that come with the geographical data that can be interrogated but any other item of information known about the object.
For this to work it is necessary to have some kind of common referencing system so that the correct record in the geospatial data can be matched with the corresponding record in the non-geospatial data.
The image on the right shows Ordnance Survey data about a river and a map showing its location. It then shows the Ordnance Survey data and the environmental data. You will see different information stored in each table and a common reference and all of the information joined together and shown on a map.
For example, Ordnance Survey holds a table of spatial information about rivers (name, location, length) and an environmental group holds a table of environmental information about rivers (name, source type, nitrate pollution, flow rates). Both tables contain the name of the river, so they can be joined using the name attribute as the common reference. Joining the tables gives the environmental information a spatial reference (location), so the environmental information can now be viewed on a GIS display.
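The join described above might look something like this in Python. The river names, values and the join_on function are all made up for the sake of the example; they are not real Ordnance Survey or environmental data.

```python
# Hypothetical spatial table (the kind of thing a mapping agency holds).
spatial = [
    {"name": "River Alpha", "length_km": 42.0, "location": (451000, 210000)},
    {"name": "River Beta", "length_km": 17.5, "location": (389000, 305000)},
]

# Hypothetical environmental table (no coordinates at all).
environmental = [
    {"name": "River Alpha", "source_type": "spring", "nitrate_mg_per_l": 12.4},
    {"name": "River Gamma", "source_type": "lake", "nitrate_mg_per_l": 3.1},
]


def join_on(left, right, key):
    """Combine rows from both tables wherever the common reference matches."""
    index = {row[key]: row for row in right}
    joined = []
    for row in left:
        match = index.get(row[key])
        if match is not None:
            joined.append({**row, **match})
    return joined


rivers = join_on(spatial, environmental, "name")
for river in rivers:
    print(river["name"], river["location"], river["nitrate_mg_per_l"])
# River Alpha (451000, 210000) 12.4
```

Only River Alpha appears in both tables, so it is the only record that gains a location and can be drawn on the map.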
This kind of application is very much dependent on the ability to establish links between the entities in the two sets of information. Often it is better to use a numerical referencing system understood by all users of a particular type of information, so that the specific features can be identified unambiguously. If you just use text names this can fall down if one set of information has a misspelt name or if there are duplicate entries. There are, for example, many stretches of river in Britain with the name attribute River Avon.
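To see why names make a risky common reference, here is a small Python sketch that checks a table for values that occur more than once. The feature_id numbers are invented placeholders, not real identifiers.

```python
from collections import Counter


def ambiguous_keys(rows, key):
    """Return the values of `key` that appear in more than one row."""
    counts = Counter(row[key] for row in rows)
    return sorted(value for value, n in counts.items() if n > 1)


# Hypothetical river stretches: two share a name, none share an ID.
stretches = [
    {"feature_id": 1001, "name": "River Avon"},
    {"feature_id": 1002, "name": "River Avon"},
    {"feature_id": 1003, "name": "River Exe"},
]

print(ambiguous_keys(stretches, "name"))        # ['River Avon']
print(ambiguous_keys(stretches, "feature_id"))  # []
```

A join on name could attach data to the wrong River Avon; a join on a unique numerical identifier cannot.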
Ordnance Survey has developed its own common reference system using millions of Topographic Identifiers (TOID®s). These are unique 16-digit numbers applying to every feature in its large-scale database. TOID will make it a lot easier for users to link, combine or transfer information quickly and efficiently. This system is part of a massive project known as the Digital National Framework™ (DNF™).
Using GIS? Be selective
You can query the features in GIS map layers by selecting and viewing just those that satisfy particular criteria. How useful is that?
Performing selections on information held in spreadsheets and databases is the classical way in which computer users make sense of large volumes of data and provide answers to specific problems. Within GIS it is possible to do just the same, only with the added advantage that the results of those queries are displayed in a geographical context. So, not only can you identify which records satisfy a particular set of criteria, you can also see where they are in relation to each other.
Data stored in tables is usually very difficult to digest all at once. It is necessary to filter out relevant sets of information corresponding to a particular group of conditions. For example, imagine you are visiting a city with which you are unfamiliar. You might have a map showing the city centre and the location of all the restaurants. With GIS a range of information about those restaurants could be stored – what type of food, average cost per head, eat in or takeaway.
The GIS can help you decide which restaurant you want to visit. You are in the mood for something a bit spicy, so you decide to select Mexican restaurants. The records from the table will be identified and any Mexican restaurants in the city will be highlighted on the map screen. You can refine your search further. You are on a budget, so you do not want the bill to be too pricey: you can therefore add an extra filter to the query by requesting a list of all Mexican restaurants with a medium cost. From this selection you may also want to make sure they do takeaway. The display now shows just those records that fit your present needs and you can maybe choose which of the selections you want based on where they are in the city.
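The restaurant search can be written as a chain of simple filters. The Python below is a sketch only: the restaurants and the select function are invented, and a real GIS would highlight the results on the map rather than print them.

```python
# Hypothetical restaurant attribute table.
restaurants = [
    {"name": "Casa Uno", "cuisine": "Mexican", "cost": "medium", "takeaway": True},
    {"name": "Casa Dos", "cuisine": "Mexican", "cost": "high", "takeaway": True},
    {"name": "Casa Tres", "cuisine": "Mexican", "cost": "medium", "takeaway": False},
    {"name": "Pasta Place", "cuisine": "Italian", "cost": "medium", "takeaway": True},
]


def select(rows, **criteria):
    """Keep only the rows whose attributes match every criterion."""
    return [r for r in rows if all(r.get(k) == v for k, v in criteria.items())]


mexican = select(restaurants, cuisine="Mexican")   # first filter
affordable = select(mexican, cost="medium")        # refine by cost
final = select(affordable, takeaway=True)          # refine again

print([r["name"] for r in final])  # ['Casa Uno']
```

Each step narrows the previous selection, which is exactly how the query was refined in the text.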
You can make this type of query as simple or as complicated as you like, as long as the data fields are there to interrogate. This ability is not unique to GIS software; many different types of information system will allow you to perform selections. However, only GIS can provide a visual representation of the location of the query results. Furthermore, GIS can apply geographical criteria to the selection filter such that objects are selected based on where they are.
Geocoding is one of the key functions of GIS, but what does the term actually mean?
In a way, the concept of geocoding is very similar to the idea of linking to external datasets. Geocoding describes another way of importing non-map data into the GIS such that its geographic properties can be identified and the records positioned in space. However, unlike the linking method in which the additional attribute table remains external to the mapped layer, when a table is geocoded it becomes a new map layer in its own right. Coordinate points are assigned to the geocoded table so that it can be used on its own to display the locations of the objects concerned.
OK, we may have lost you, so… Geocoding is easier to explain with a worked example:
It usually takes place with a list of locations with known addresses. Imagine you have a simple table of British football clubs containing the name of the club and the postcode. To geocode this list you have to process each record against the postcode data in the GIS. There are already several GIS data products that store the National Grid coordinates for every postcode in the country. The National Grid coordinates from the postcode product get copied across to join the football club list. This creates a new football club layer that can be added to the GIS.
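In code, the football club example comes down to a lookup. The Python sketch below assumes a postcode product that can be read as a simple dictionary; the clubs, postcodes and coordinates are all invented, and geocode and normalise are hypothetical helper names.

```python
# Hypothetical extract from a postcode product: postcode -> (easting, northing).
postcode_coords = {
    "ZZ1 1AA": (430000, 115000),
    "ZZ9 9ZZ": (512000, 180500),
}

# Hypothetical club list with no coordinates.
clubs = [
    {"club": "Example United", "postcode": "ZZ1 1AA"},
    {"club": "Sample Rovers", "postcode": "zz9  9zz"},
    {"club": "Nowhere Town", "postcode": "YY0 0YY"},
]


def normalise(postcode):
    """Upper-case the postcode and collapse any runs of spaces."""
    return " ".join(postcode.upper().split())


def geocode(rows, lookup):
    """Copy coordinates onto each row; return (located, unmatched)."""
    located, unmatched = [], []
    for row in rows:
        coords = lookup.get(normalise(row["postcode"]))
        if coords is None:
            unmatched.append(row)
        else:
            located.append({**row, "easting": coords[0], "northing": coords[1]})
    return located, unmatched


club_layer, not_found = geocode(clubs, postcode_coords)
print(len(club_layer), "clubs located;", len(not_found), "unmatched")
# 2 clubs located; 1 unmatched
```

The located rows now carry their own coordinates, so they can stand alone as a new point layer. Keeping the unmatched rows separate is a design choice: a mistyped postcode should be reported, not silently dropped.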
Geocoding is often applied to address lists. Not many people know the National Grid coordinates of where they live so any list of people's addresses needs to be geocoded to load it into a GIS. Any company – a bank for example – which holds an address list of customers can geocode this information and instantly analyse their geographic distribution. This may reveal trends that the bank would otherwise be unaware of – areas of particularly high or low customer density – which may suggest reasons for the successful recruitment of customers. We will look at how organisations are using GIS to improve their decision making in greater detail in the next chapter.
The significance of structure
For geospatial data to be really useful to GIS applications it must be structured to model the real world. What is the significance of structured data?
We know that to link environmental information about rivers we needed a river map layer possessing names as attributes; and to geocode a set of addresses we needed a geographic layer with the coordinates of postcode locations together with the postcode text itself.
In these examples the GIS must hold location and attribute data that corresponds to the physical objects which people want to analyse. This means that the way physical objects are structured in the GIS data must suit the types of application that the system is meant for.
In the rivers example this is achieved because the data to be analysed relates to simple large objects (whole rivers) for which there are many sources of suitable small-scale GIS data. However, when attempting to look at information about much smaller objects, like single addresses or buildings, the geographical data must be much more detailed and structured so that the appropriate information for actual objects is present. Because most large-scale data available for use in GIS has come from digitised cartographic sources, the data structure is not necessarily optimised for GIS analysis.
The idea is that data has to be structured in a certain way to carry out certain types of spatial analysis.
The thematic map
By bringing together data from a wide range of different sources, you can visualise trends in the data by creating thematic maps. How does this work?
Unlike a paper map, GIS does not require every piece of information to be visible at the same time. It can also change the depiction of a particular object depending on the value of one of its attributes. This function is known as thematic mapping. You will already be familiar with thematic maps from atlases and geography textbooks. For example, a map of parliamentary constituencies shaded in different colours can show the number of seats held by different political parties. GIS can build this kind of map automatically from the data values (number of seats), and in many different ways.
Thematic maps come in all shapes and sizes: for example, a display of a road network with different colours to show average traffic speeds; a map of farmers' fields showing a different colour for each type of crop grown; different-sized point symbols to show the relative population of towns; and a density map to show average numbers of badgers across different counties.
These examples are typical of the types of thematic mapping for which GIS is used. The really dynamic thing about these maps is that they can automatically change their appearance as the values in the data tables change with time.
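As a rough sketch of how a thematic map is driven by data, the Python below assigns a colour to each road according to its average speed. The class breaks, colours and road names are assumptions chosen for illustration.

```python
import bisect

# Assumed class breaks (mph): below 20, 20 to below 40, 40 and above.
breaks = [20, 40]
colours = ["red", "orange", "green"]


def classify(value, breaks, colours):
    """Return the colour of the class that `value` falls into."""
    return colours[bisect.bisect_right(breaks, value)]


def style(layer):
    """Build the name -> colour mapping for the whole layer."""
    return {name: classify(speed, breaks, colours) for name, speed in layer.items()}


# Hypothetical average traffic speeds.
speeds = {"A-road link": 55, "High Street": 18, "Ring road": 40}
print(style(speeds))
# {'A-road link': 'green', 'High Street': 'red', 'Ring road': 'green'}

# When the data changes, the map changes with it.
speeds["High Street"] = 25
print(style(speeds)["High Street"])  # orange
```

Nothing about the colours is stored with the road itself; they are recalculated from the attribute each time, which is why the map can update automatically.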
Basic spatial querying
GIS can not only tell you what information exists about particular features in the map data but it can also analyse where things are in relation to each other. What can this tell us?
GIS can go beyond visual analysis of thematic mapping as described in the previous section. The software can identify trends across a given area as well as performing specific queries. Such queries can select attribute data depending on its geographical location and then interrogate the attributes by performing calculations and statistical analysis. Selecting data based on the geometry of objects is known as performing a spatial query.
The simplest spatial query can be performed on screen using the selection tools that are provided with the GIS software. For instance, you can draw a circle on screen and select all objects falling inside it. This example shows addresses that have been selected because they fall within the circle.
This technique could be used by the emergency services to quickly identify all houses within 500 m of a spillage of a dangerous chemical.
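A circle selection of this kind reduces to a distance test. Here is a minimal Python sketch, assuming coordinates in metres and using invented addresses; within_circle is a hypothetical name.

```python
import math


def within_circle(points, centre, radius):
    """Select the points that lie inside or on a circle."""
    cx, cy = centre
    return [
        p for p in points
        if math.hypot(p["x"] - cx, p["y"] - cy) <= radius
    ]


# Hypothetical addresses, coordinates in metres relative to the spill.
addresses = [
    {"address": "1 Example Road", "x": 100, "y": 100},
    {"address": "2 Example Road", "x": 300, "y": 300},
    {"address": "3 Example Road", "x": 900, "y": 900},
]

spill = (0, 0)
affected = within_circle(addresses, spill, 500)
print([a["address"] for a in affected])
# ['1 Example Road', '2 Example Road']
```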
It is possible to analyse how close objects are to one another using a buffer. A buffer is a shape based on any other existing object (point, line or area) that can be generated by the GIS. The buffer object represents the total area within a certain distance of a given feature.
You can use the GIS to generate buffer zones and then identify all features that lie within a particular distance. For instance, you can select all addresses within a 500-m buffer of a busy road and compare these with data about the incidence of asthma. By comparing both sets of data you can work out if there are statistically more asthma sufferers living in the buffer region than in the general population. This allows you to analyse whether proximity to a busy road is likely to be a factor in the cause of asthma.
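One way to picture a buffer around a line is to measure how far each point is from the nearest part of that line. The Python sketch below does this for a road made of straight segments; the road, the homes and the function names are all invented, and the coordinates are assumed to be in metres. A real GIS would normally build the buffer polygon itself, so treat this as an equivalent test rather than the usual method.

```python
import math


def distance_to_segment(p, a, b):
    """Shortest distance from point p to the straight segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:  # segment is a single point
        return math.hypot(px - ax, py - ay)
    # Position of the closest point along the segment, clamped to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def within_buffer(points, line, distance):
    """Select the points lying within `distance` of any segment of `line`."""
    return [
        p for p in points
        if any(distance_to_segment(p, a, b) <= distance
               for a, b in zip(line, line[1:]))
    ]


# Hypothetical road with one bend, and four homes.
road = [(0, 0), (1000, 0), (1000, 1000)]
homes = [(500, 200), (450, 800), (1300, 500), (1600, 500)]

print(within_buffer(homes, road, 500))  # [(500, 200), (1300, 500)]
```

The selected homes could then be compared with health records, as in the asthma example.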
A key advantage of being able to layer data in a GIS is the ability to carry out overlay operations. These can be quite complex, but simply mean combining layers of data to create one new layer (similar to the way a mathematical sum or calculation creates a new value or answer).
In the example on the left a farmer needs a certain level of rainfall and a type of soil to successfully grow a crop. By combining the rainfall map and the soil type map it is much easier to find the best location. In this example the GIS assigns a numeric value to each soil type and to the amount of rainfall. This makes it easier, on the resulting map, to see where the optimum growing area is located.
The point is that by combining layers of information, the farmer creates a new map that is much more useful.
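With raster layers, an overlay like the farmer's can be as simple as adding two grids cell by cell. In this Python sketch the scores are invented, and adding them together is just one assumed way of combining the layers.

```python
# Hypothetical 3 x 3 grids: a higher score means better for the crop.
rainfall_score = [
    [1, 2, 3],
    [1, 3, 3],
    [0, 2, 2],
]
soil_score = [
    [2, 2, 1],
    [0, 3, 2],
    [1, 1, 3],
]


def overlay(a, b, combine):
    """Create a new grid by combining two grids cell by cell."""
    return [
        [combine(x, y) for x, y in zip(row_a, row_b)]
        for row_a, row_b in zip(a, b)
    ]


suitability = overlay(rainfall_score, soil_score, lambda r, s: r + s)
for row in suitability:
    print(row)
# [3, 4, 4]
# [1, 6, 5]
# [1, 3, 5]

# Find the (row, column) of the best cell.
best_cell = max(
    ((r, c) for r in range(3) for c in range(3)),
    key=lambda rc: suitability[rc[0]][rc[1]],
)
print(best_cell)  # (1, 1)
```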
Overlay operations are particularly effective when using raster data. Raster data is good for representing the continuous varied surface of the earth, whereas vector data assumes that every area has a sharply defined boundary. An example of a vector overlay can be seen in the graphic on the left. The soil type areas are clearly defined, whereas in reality we know that where two soil types meet they often gradually blend into each other. Another good example is using aerial photography to analyse vegetation. It would be very difficult to draw onto a photograph where one area of vegetation stopped and another began: the vegetated areas would be blurred into each other and probably produce a speckled effect in the photograph.
GIS users must never forget that the result that a GIS provides is only as accurate as the data that was used for the query. Don't forget that rubbish in equals rubbish out.
Spatial querying is used in many different ways to help understand the world around us. Often the answer to a particular problem can only be unravelled by comparing two layers of information in a way that would be almost impossible to achieve without the GIS software.
One of the most far-reaching applications of GIS is in network analysis. Network analysis is the mathematical processing of the geometry of a link/node layer, enabling the identification of all possible routes around that network, along with the distances and times involved. Put simply, this means that, using an accurate road data layer, the computer can identify possible routes between two locations and calculate the shortest.
In link-node topology it is important to have the data structured correctly (that is, with links joining at nodes and no gaps). In order to carry out network analysis you need a link-node data layer of line features representing a real-world network (for example, a road network); only then is it possible to model movement around that network.
The simplest example of network analysis is to choose two points on the network and ask the GIS to calculate the shortest path between them.
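A common way to find the shortest path on a link-node network is Dijkstra's algorithm. The text does not say which method any particular GIS uses, so the Python below is simply one well-known approach, run on a tiny invented network with distances in kilometres.

```python
import heapq


def shortest_path(graph, start, goal):
    """Dijkstra's algorithm: return (total distance, list of nodes) or None."""
    queue = [(0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, weight in graph.get(node, {}).items():
            if neighbour not in visited:
                heapq.heappush(queue, (cost + weight, neighbour, path + [neighbour]))
    return None  # the goal cannot be reached


# Hypothetical network: nodes are junctions, values are link lengths in km.
roads = {
    "A": {"B": 4, "C": 2},
    "B": {"A": 4, "C": 1, "D": 5},
    "C": {"A": 2, "B": 1, "D": 8},
    "D": {"B": 5, "C": 8},
}

print(shortest_path(roads, "A", "D"))  # (8, ['A', 'C', 'B', 'D'])
```

Note that the shortest route is not the one with the fewest links: going A, C, B, D (8 km) beats A, B, D (9 km).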
This basic concept can be used to help build navigation systems and to plan distribution services.
By applying the principles of network analysis to accurate road data, you can build systems for motorists to navigate.
In-car navigation requires up-to-date map data and extra information to make the data model behave like the real world: you need to know which roads are one-way streets or where there are no-right-turn signs. By using unique identifiers in the road data so that each link in the network can be pinpointed, additional information can be built into the system. Furthermore, it is possible to receive real-time information about traffic conditions as you drive, so that you get advance warning if there are hold-ups due to road works or an accident.
Another benefit of network analysis is the ability to calculate drive times, which identify how far you can travel in a certain amount of time.
The typical drive-time map – for example, for pizza delivery – would show a central point surrounded by a series of circles estimating how long it takes to get to places within that radius. This method assumes an as-the-crow-flies route to each location.
A GIS can be much more accurate – it can use network analysis to generate isochrones (lines that join up points of equal travel time) that take into account the true road network and give a proper measure of how far you can get over a set time. This can even take into account the average speed on each road, so that the area appears stretched along faster roads.
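The idea behind an isochrone can be sketched by working out which junctions are reachable within a time budget. The Python below uses the same Dijkstra-style search as a shortest-path query, but weights each link by travel time; the network, lengths and speeds are invented, and drawing the actual isochrone line around the reachable junctions is left out.

```python
import heapq


def reachable_within(graph, start, budget_minutes):
    """Return {node: travel time in minutes} for nodes reachable in the budget."""
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        minutes, node = heapq.heappop(queue)
        if minutes > best.get(node, float("inf")):
            continue  # a quicker way to this node was already found
        for neighbour, (length_km, speed_kmh) in graph.get(node, {}).items():
            total = minutes + 60 * length_km / speed_kmh
            if total <= budget_minutes and total < best.get(neighbour, float("inf")):
                best[neighbour] = total
                heapq.heappush(queue, (total, neighbour))
    return best


# Hypothetical network: each link is (length in km, average speed in km/h).
network = {
    "Depot": {"Fast1": (10, 100), "Slow1": (10, 30)},
    "Fast1": {"Fast2": (10, 100)},
    "Slow1": {},
    "Fast2": {},
}

times = reachable_within(network, "Depot", 15)
print(sorted(times))  # ['Depot', 'Fast1', 'Fast2']
```

Slow1 is only 10 km away but takes 20 minutes, so it falls outside the 15-minute area, while Fast2 is 20 km away along the fast road and falls inside. A simple circle would get both of these wrong.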
Many different organisations use this kind of drive-time analysis to plan their operations, from the siting of new stores to the planning of distribution networks.
Network analysis doesn't have to be carried out on vector link and node data. Raster data can be effective in describing continuous varied surfaces. This quality of raster data is useful for identifying the optimum path (path of least resistance or shortest path) through a continuous surface.
For example, a company needs to erect electricity pylons from A to B. They need to make sure that they disrupt the forest areas as little as possible. Each cell in the raster is given a value representing the cost of crossing it, with forest cells given a high value. The GIS calculates the path of least resistance by finding the path whose cell values add up to the lowest total. A vector line can then be added to show the location of the proposed route.
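The pylon example can be sketched as a least-cost search over a grid. In the Python below, forest cells are given an assumed cost of 9 and open land a cost of 1; the grid, the costs and the rule that the route may only step up, down, left or right are all illustrative choices, not a description of how any particular GIS does it.

```python
import heapq


def least_cost_path(cost, start, goal):
    """Return (total cost, list of cells) for the cheapest route on a grid."""
    rows, cols = len(cost), len(cost[0])
    best = {start: cost[start[0]][start[1]]}
    previous = {}
    queue = [(best[start], start)]
    while queue:
        total, cell = heapq.heappop(queue)
        if cell == goal:
            break
        if total > best[cell]:
            continue  # a cheaper way to this cell was already found
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                new_total = total + cost[nr][nc]
                if new_total < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_total
                    previous[(nr, nc)] = cell
                    heapq.heappush(queue, (new_total, (nr, nc)))
    # Walk back from the goal to rebuild the route.
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    path.reverse()
    return best[goal], path


# Hypothetical cost surface: 9 = forest, 1 = open land.
surface = [
    [1, 9, 1, 1],
    [1, 9, 1, 9],
    [1, 1, 1, 9],
]

total, route = least_cost_path(surface, (0, 0), (0, 3))
print(total)  # 8
print(route)
# [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 3)]
```

The route goes the long way round the forest: eight cells of open land (total 8) cost less than the direct line through a forest cell (total 12). The list of cells is what would be turned into the vector line for the proposed route.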