When I wrote about ‘Why you need an address master data management strategy’, I highlighted how organisations could gain better control of the address data they hold. There are many reasons why you would want to do this: better addresses can streamline operations, reduce errors and waste, and even lead to new business opportunities.
For many organisations, their main activity is reliant on location, which is often expressed as an address. A delivery company needs to find an address to deliver goods; a mortgage lender wants to value the property found there; a utility company needs to deliver underground services there; an insurer wants to know the risks surrounding it. An address by itself cannot tell you any of this, but it can be used to unlock other location data.
Organisations that want to gain an advantage by using location data should take a structured approach and start by creating a location data strategy. In this two-part article, I’ll explain the steps you need to take to get started.
Express your business challenge as a location data challenge
Chances are, your business activities have a physical footprint in the real world. Start by documenting these activities, their challenges and any improvement targets. Then try expressing how location data could help solve them. Take the delivery company example. The challenge might be:
“To increase on-time deliveries, I need to understand how long it takes a van and driver to find and deliver to each address on a route.”
Expressed as a location challenge, it might look like this:
|Challenge|Location data needed|
|---|---|
|Identify the exact location of each address|Accurate geocoded addresses|
|Find the most efficient route between each address|Routable road and path network|
|Find the best place to park the van|On-road parking restrictions|
|Calculate the time to drop off the parcel|Distance from roadside to property entrance|
Sometimes it seems harder to relate the business activity to location. For instance, an online fashion retailer looking to grow sales might not think that location matters if it sells via the internet. However, its online customers live somewhere in the real world, and their location may well influence their behaviour and spending patterns. If the online retailer has competitors with a bricks-and-mortar presence, this may reduce its customers’ spending in the area around a competitor’s store. Analysing where its customers live compared to where its competitors are based may provide insights that help the online retailer to devise targeted sales campaigns.
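As a rough illustration of this kind of analysis, the sketch below measures how far each customer lives from the nearest competitor store, using the haversine (great-circle) formula. The customer and store names and coordinates are made up for the example, and the 5 km radius is an arbitrary assumption; a real analysis would start from geocoded customer addresses and would probably use travel time rather than straight-line distance.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# Hypothetical data: (latitude, longitude) pairs invented for illustration.
customers = {"cust_a": (51.5074, -0.1278), "cust_b": (53.4808, -2.2426)}
competitor_stores = {"store_1": (51.5155, -0.1410), "store_2": (52.4862, -1.8904)}


def nearest_store(customer_location, stores):
    """Return (store_id, distance_km) for the store closest to the customer."""
    lat, lon = customer_location
    distances = (
        (store_id, haversine_km(lat, lon, s_lat, s_lon))
        for store_id, (s_lat, s_lon) in stores.items()
    )
    return min(distances, key=lambda pair: pair[1])


def customers_within(customers, stores, radius_km):
    """List the customers who live within radius_km of any competitor store."""
    return [
        cust_id
        for cust_id, location in customers.items()
        if nearest_store(location, stores)[1] <= radius_km
    ]


# Assumed radius of 5 km; pick a value that suits your own market.
print(customers_within(customers, competitor_stores, radius_km=5.0))
```

Customers who fall inside the radius could then be compared with those outside it, for example to see whether their spending differs.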
Once you have an idea of the data you need, you can start to source it. But there is a lot of location data out there. How do you start? Which types of data do you need? Where do you source it from? And how do you compare and evaluate different location datasets?
Know the different types of location data
When I joined OS, I was a newbie to the geospatial industry. I quickly found that I needed to learn a whole new language and set of concepts. On my first day, a colleague said to me, “Just remember: it’s all about pixels and polygons.” Frankly, I was a little baffled by this, but it stuck in my mind and eventually I saw the wisdom in his words. To get started, you need to know about the different types of location data.
- Raster data: this is a ‘dumb’ picture made up of lots of pixels, much like a digital photo. Raster maps are great for looking at and absorbing a lot of contextual information about an area. In systems, they are often used as a backdrop map on which to overlay other information. Classic examples of raster maps are the OS 1:25,000 and 1:50,000 map series. Raster maps cannot be queried, but you can view a raster map on a screen without any specialist software or expertise.
- Vector data: this is a series of points, lines and polygons (shapes) which collectively make up features, such as those on a map or a flood-risk zone. Each feature can be assigned attributes, such as its function. Vector data can be analysed and queried to output answers. You will need a geographic information system (GIS) and a certain amount of expertise to use vector data. OS MasterMap is a good example of a vector mapping product. Another example is Listed Building data from Historic England, which uses a point to identify listed features and attributes each point with a description and a Listing Grade.
- Textual data: sometimes location data is stored as text, with an implicit location embedded in the text. Addresses or postcodes are great examples. Land Registry Price Paid Data typifies textual data. You do not need any specialist software or expertise to look at this data, but you may need a GIS for spatial analysis.
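To make ‘vector data can be queried’ a little more concrete, here is a minimal sketch in plain Python. It represents one vector feature as a polygon with attributes and asks which properties (points) fall inside it, using the standard ray-casting test. The flood-zone shape, its attributes and the property coordinates are all invented for the example, and in practice a GIS or a geospatial library would do this work for you.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    polygon is a list of (x, y) vertices in order; the last vertex is
    joined back to the first automatically.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate at which this edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical vector feature: a geometry plus attributes describing it.
flood_zone = {
    "geometry": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "attributes": {"function": "flood-risk zone", "risk": "high"},
}

# Hypothetical properties, each located by a single point.
properties = {"property_1": (5, 5), "property_2": (15, 5)}

# The query: which properties sit inside the flood-risk zone?
at_risk = [
    name
    for name, (x, y) in properties.items()
    if point_in_polygon(x, y, flood_zone["geometry"])
]
print(at_risk, flood_zone["attributes"]["risk"])
```

A raster image of the same area could show the zone to a human reader, but it could not answer this question; that is the practical difference between the two types.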
Determine your constraints
Think about and find answers to these questions. The answers will help you later when you evaluate different datasets:
- What is the geographic scope of my business challenge? Your organisation could be global, national, or it could be limited to a single region in England. A lot of location data is not consistently available for all geographies. This means it is not always possible to create solutions that work consistently across geographic boundaries. Does this matter? If you have a wide geographic scope, consider whether you could segment your solution into regions. If you can, this provides the opportunity to use best-of-breed data from each region, but it also adds complexity to your data management and solution build.
- What level of granularity is needed?
  - Property-level data is unique to the exact property you are interested in. Typically, this means the property has been individually inspected and its unique characteristics captured. Data with this type of location precision is less common and is generally more expensive, but allows for very precise decision making. If you are valuing a property for lending, you may decide you need this level of precision.
  - Area-level data is the same for a whole area. It involves clustering properties within an invisible boundary (e.g. Postcodes, Wards, Census Output Areas, etc.). Good examples of area-level data are socio-demographic or crime statistics published at a postcode level. Data with this type of location precision is more common and much of it is open data. But it may be too imprecise to support decision making. For example, a postcode-level flood model will rate the whole postcode as having the same risk, when in fact the risk will vary from property to property.
- How authoritative does it need to be? Not all location data is created equally. Its trustworthiness can vary. This is often driven by its method of collection/creation:
  - Surveyed data – the ‘gold’ standard. It involves manual inspection by a trained expert. Usually highly accurate and reliable, but can be limited in scope and geographic coverage.
  - Crowdsourced data – this may be very accurate and up to date, but equally it can contain gaps and omissions that are difficult to predict. Timeliness of updates and quality control can be other issues.
  - Modelled or derived data – this is data that is created by an algorithmic model. An example might be a flood model or a model predicting the number of storeys in a building or the age of construction. This data is predictive and will not be accurate all the time. But if this is accepted, it can be hugely beneficial as it may yield data that is not otherwise available.
- What expertise is in the organisation? As noted above, some location data will require specific software and skills to use. Are these already in the organisation? If not, is it prepared to invest in acquiring them? Can the work be outsourced to a specialist?
- Is open data OK? There are many open datasets, particularly from UK Government, with a location element. These are fantastic resources and often contain data not found elsewhere. But think carefully before building solutions around open data. As well as looking at the legal T&Cs, check the update frequency, completeness and quality.
- With all datasets, consider what would happen to your solution if the data were withdrawn.
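The granularity question above can be sketched in a few lines of Python. The example assumes a flood-risk lookup that prefers property-level data where it exists and falls back to area-level (postcode) data where it does not. The postcode, property identifiers and risk ratings are all made up, and the fallback rule is just one possible design.

```python
# All identifiers and values below are invented for illustration.
AREA_LEVEL_RISK = {"AB1 2CD": "medium"}  # one rating for the whole postcode
PROPERTY_LEVEL_RISK = {"prop-001": "low", "prop-002": "high"}  # inspected properties only

PROPERTIES = [
    {"id": "prop-001", "postcode": "AB1 2CD"},
    {"id": "prop-002", "postcode": "AB1 2CD"},
    {"id": "prop-003", "postcode": "AB1 2CD"},  # no property-level record
]


def area_only_risk(prop):
    """Area-level lookup: every property in a postcode gets the same rating."""
    return AREA_LEVEL_RISK.get(prop["postcode"], "unknown")


def flood_risk(prop):
    """Return (risk, granularity), preferring property-level data when available."""
    if prop["id"] in PROPERTY_LEVEL_RISK:
        return PROPERTY_LEVEL_RISK[prop["id"]], "property"
    return area_only_risk(prop), "area"


for prop in PROPERTIES:
    print(prop["id"], area_only_risk(prop), flood_risk(prop))
```

Recording the granularity alongside each answer lets whoever uses the result see how precise the underlying data was for that particular property.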
In part two, I’ll start to look at the practical steps you can take to bring your location strategy to life.