Why you need an Address Master Data Management strategy

A short while ago, my wife received two seemingly identical catalogues in the post from a well-known online fashion retailer. Both were addressed to our home, but the catalogues differed in two respects. Firstly, one was labelled using her maiden surname, whilst the other used her married surname. The second difference was more interesting. The first catalogue promised a 30% reduction on all merchandise – as a loyal customer reward. The second catalogue promised a 20% reduction. This told me two important things: (1) my wife clearly spends more with this retailer since we married; (2) the retailer in question has no Master Data Management (MDM) strategy for addresses.


MDM refers to everything an organisation does to manage the critical data of the organisation, the goal being to provide a single version of the truth. In this article, I’ll explore why MDM is necessary for addresses and how it can be implemented.

The addressing mess

We all use addresses to refer to places; typically these are properties that people live in or work at. Ask a colleague or friend for their address and you’ll probably be given a house number, road name and perhaps a postcode to help you navigate. This structure has its roots in the postal system and delivering mail. In this regard, addresses are extremely effective. They are easy for humans to communicate and remember. But crucially, addresses are a very poor way of storing and managing locations in a machine environment. Consider the following ways in which an address can be written:

  • Flat 1, 4 Acacia Avenue, Southbourne, Bournemouth, BH1 1ZZ
  • Flat 1, 4 Acacia Avenue, Bournemouth, Dorset, BH1 1ZZ
  • Ground Floor Flat, 4 Acacia Avenue, Bournemouth, BH1 1ZZ
  • The Lodge, 4 Acacia Avenue, Bournemouth, BH1 1ZZ

All of these variations refer to a single (fictional) property. It is not uncommon to see addresses written in many different ways according to the taste of the occupier (as an interesting aside, if this address were in Bristol, it would most likely be written “Garden Flat”, as is the custom with Bristolians). All of the above examples are legitimate in the sense that they will all facilitate the delivery of mail or services to the property.

However, business systems can have great difficulty in recognising that these different addresses refer to the same property. Addresses are strings of text, so there is also the possibility of spelling and typing errors being introduced.
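To make the difficulty concrete, here is a minimal Python sketch using the four fictional variants listed earlier. The `normalise` helper is my own illustrative assumption (upper-casing, stripping punctuation, collapsing spaces) and not part of any product. It shows that even after this basic tidy-up, a system comparing text still sees four different addresses.

```python
# The four ways of writing the same (fictional) property.
variants = [
    "Flat 1, 4 Acacia Avenue, Southbourne, Bournemouth, BH1 1ZZ",
    "Flat 1, 4 Acacia Avenue, Bournemouth, Dorset, BH1 1ZZ",
    "Ground Floor Flat, 4 Acacia Avenue, Bournemouth, BH1 1ZZ",
    "The Lodge, 4 Acacia Avenue, Bournemouth, BH1 1ZZ",
]


def normalise(address: str) -> str:
    """Illustrative clean-up: upper-case, drop punctuation, collapse spaces."""
    kept = "".join(ch for ch in address.upper() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


# A system that compares text counts distinct strings, not distinct properties.
distinct_raw = set(variants)
distinct_normalised = {normalise(v) for v in variants}

print(len(distinct_raw), "distinct raw strings")
print(len(distinct_normalised), "distinct strings after basic normalisation")
```

Basic normalisation removes differences in case and punctuation, but it cannot know that "Ground Floor Flat" and "Flat 1" are the same dwelling. That knowledge has to come from a reference dataset.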

Factor in also that many organisations will have multiple contact points with customers and so gather the same address multiple times. They may obtain addresses from other sources (such as third-party suppliers or through mergers and acquisitions). And don’t forget that addresses are constantly changing as properties are sub-divided, demolished, replaced or built afresh. In the August 2015 update to AddressBase Plus, the address product that OS produces in partnership with GeoPlace and the Local Government Association, over 67,000 addresses were added and more than 23,000 were deleted over a six-week period.

Addresses in AddressBase Plus, shown over OS MasterMap Topography Layer


Is it any wonder therefore that many organisations can experience address meltdown? This can reveal itself in a number of ways, but commonly seen examples include:

  • Address records stored in multiple databases – each database holding its own copy of addresses
  • Duplicate records held for the same property – it is not unheard of for some large organisations to hold two or three times the total number of addresses in GB!
  • Invalid or dummy addresses – entered by customers or employees to bypass business processes
  • Continuous address matching – regular and never-ending effort spent by the organisation to clean up and match addresses

As well as being inefficient, this also affects the service level the organisation gives to its customers. Returning to my retail catalogue example, does that retailer really want me to know they offer varying levels of discount depending upon how much they value me as a customer? I now want to know if someone else is getting a 40% reduction. And what about the wastage in printing and posting two catalogues to exactly the same address?

A five-point plan to fix

There is a solution at hand if you can start to treat addresses as master data and implement appropriate controls.

  1. Select a reference address dataset

You will need a source dataset of valid addresses covering the whole country. There are two main GB addressing datasets: the AddressBase product family and the Royal Mail Postcode Address File (commonly referred to as PAF). In assessing the differences between the two, consider your likely applications now and in the future. You may start with cleaning up your addresses today, but what can better addressing do for your business tomorrow? For example, can you gain an advantage by knowing earlier than your competitors when new properties are being built?

Despatch controller and AddressBase Plus


  2. Stop the problem getting worse – enforce address validation at all capture and maintenance trigger points

Your next focus should be on stopping the current problem getting any worse. Addresses enter your organisation at defined points, perhaps when a customer completes a web form or calls a contact centre. They also change at defined points, such as when a customer moves house. All of these trigger points risk introducing more bad addresses. Bad or invalid addresses arise for many reasons, some benign such as typing errors, some malign such as fraud attempts. So it is essential that only clean and valid addresses are put into databases. This is done by enforcing address validation.

To do this you need two things: the reference address dataset and a means of matching the customer-input address to the dataset. The good news is that there are many solution providers offering software and API services that combine reference address datasets with address matching algorithms.
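As a rough illustration of validation at the point of capture, the Python sketch below checks an input address against a tiny, made-up reference extract before it is allowed into a database. The reference keys, the `normalise` helper and the `validate_at_capture` function are all hypothetical, and the exact-match lookup stands in for the far more capable matching that commercial software and API services provide.

```python
from typing import Optional

# Hypothetical reference extract: made-up numeric keys, fictional addresses.
REFERENCE = {
    100000000001: "Flat 1, 4 Acacia Avenue, Bournemouth, BH1 1ZZ",
    100000000002: "Flat 2, 4 Acacia Avenue, Bournemouth, BH1 1ZZ",
}


def normalise(address: str) -> str:
    """Illustrative clean-up: upper-case, drop punctuation, collapse spaces."""
    kept = "".join(ch for ch in address.upper() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


# Index the reference data once so each capture is a single lookup.
INDEX = {normalise(text): key for key, text in REFERENCE.items()}


def validate_at_capture(raw_input: str) -> Optional[int]:
    """Return the reference key for a valid address, or None to reject it."""
    return INDEX.get(normalise(raw_input))


print(validate_at_capture("flat 1,  4 acacia avenue, bournemouth, bh1 1zz"))
print(validate_at_capture("1 Test Street, Nowhere"))
```

The first call returns a reference key, so the record can be stored. The second returns `None`, so the web form or contact-centre screen would ask for the address again instead of saving a dummy entry.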

  3. Fix the mess you are in – cleanse, match and de-duplicate

Now you have stemmed the tide, you need to fix the back-book. All of the existing addresses in your databases should be put through a process of cleansing and matching. The goal is to make a high confidence match to the reference address dataset.

This starts with using address matching software or API services to make an initial match. The level of matching achieved depends on the quality of the source addresses and the intelligence of the algorithms within the software or API. As a rule of thumb, expect up to a 90% match rate for residential addresses and 80% for commercial addresses.

The key question is what to do with the addresses that don’t match. The answer is dependent upon how much value is placed on having a correct address. If you are cleaning up a database for direct marketing, it’s probably OK to discard the non-matches. If you deliver critical services to each address, you need to investigate the non-matches further using a combination of manual desktop analysis and, if necessary, field inspection. There are organisations that can manage this entire process for you.
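The cleanse, match and de-duplicate steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the reference keys and customer records are invented, the 0.9 threshold is arbitrary, and the similarity score from the standard library's `difflib.SequenceMatcher` is a simple stand-in for the matching algorithms in real address software.

```python
from collections import defaultdict
from difflib import SequenceMatcher

# Hypothetical reference extract: made-up keys, fictional addresses.
REFERENCE = {
    100000000001: "Flat 1, 4 Acacia Avenue, Bournemouth, BH1 1ZZ",
    100000000002: "Flat 2, 4 Acacia Avenue, Bournemouth, BH1 1ZZ",
}

# Hypothetical back-book of existing customer records.
BACK_BOOK = [
    ("cust-A", "Flat 1, 4 Acacia Avenue, Bournmouth, BH1 1ZZ"),  # typing error
    ("cust-B", "flat 1 4 acacia avenue bournemouth bh1 1zz"),
    ("cust-C", "1 Test Street, Nowhere"),  # dummy address
]

MATCH_THRESHOLD = 0.9  # arbitrary cut-off for a "high confidence" match


def normalise(address: str) -> str:
    """Illustrative clean-up: upper-case, drop punctuation, collapse spaces."""
    kept = "".join(ch for ch in address.upper() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def best_match(raw: str) -> tuple:
    """Return (reference key, similarity score) for the closest reference address."""
    cleaned = normalise(raw)
    scored = [
        (SequenceMatcher(None, cleaned, normalise(text)).ratio(), key)
        for key, text in REFERENCE.items()
    ]
    score, key = max(scored)
    return key, score


matched = defaultdict(list)  # reference key -> customer records at that address
review_queue = []            # non-matches needing desktop analysis or inspection

for record_id, raw in BACK_BOOK:
    key, score = best_match(raw)
    if score >= MATCH_THRESHOLD:
        matched[key].append(record_id)
    else:
        review_queue.append(record_id)

# Any key with more than one record is a candidate duplicate.
duplicates = {key: ids for key, ids in matched.items() if len(ids) > 1}
print(duplicates)
print(review_queue)
```

Records that share a reference key are candidates for merging, which is how the retailer in my catalogue example could have spotted two customer records at one address. Anything in the review queue is then discarded or investigated, depending on how much a correct address is worth to you.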

  4. Maintain one master record of the address

Your goal should be to maintain each address in only one place – your master record. Once you have a valid and clean address in your master database, you will need an efficient way of sharing that across your systems. As mentioned earlier, using the address text is not ideal, especially as it can change. Instead you should use a primary key reference for the address.

The good news here is that these already exist in the two main GB addressing datasets. The AddressBase product family has a Unique Property Reference Number (UPRN) for every address in its file. Royal Mail PAF has a Unique Delivery Point Reference Number (UDPRN) for every address in its file. Both are similar in that they reference a single address with a single numeric value.
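A minimal Python sketch of the idea follows. The system names, customer identifiers and UPRN value are all invented for illustration. The point is only that downstream systems hold the key, and the address text lives in one place.

```python
from dataclasses import dataclass


@dataclass
class MasterAddress:
    uprn: int          # made-up value, for illustration only
    address_text: str


# The single master record of each address, keyed by UPRN.
master = {
    100000000001: MasterAddress(
        100000000001, "Flat 1, 4 Acacia Avenue, Bournemouth, BH1 1ZZ"
    ),
}

# Hypothetical downstream systems store the key, never the address text.
crm_customers = {"C-001": 100000000001}
billing_accounts = {"B-917": 100000000001}


def address_for(uprn: int) -> str:
    """Resolve a key to the current address text held in the master record."""
    return master[uprn].address_text


# The occupier renames the flat: one update in the master record...
master[100000000001].address_text = (
    "Garden Flat, 4 Acacia Avenue, Bournemouth, BH1 1ZZ"
)

# ...and every system sees the same, current address.
print(address_for(crm_customers["C-001"]))
print(address_for(billing_accounts["B-917"]))
```

Because neither system keeps its own copy of the text, there is nothing to drift out of step and nothing to re-match later.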

The integrity of these unique keys is critical in an MDM scenario. Check if the data provider has robust processes to ensure that keys are always unique, will persist with a property throughout its total lifespan (from planning to demolition) and are not re-used when properties are demolished. Addresses are also hierarchical – a flat is a unit within a building – so ensure the primary key structure maintains these relationships, especially if your organisation delivers services to multiple units or the parent building itself.
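The parent and child relationship can be pictured with a small Python sketch. The field name `parent_key`, the key values and the helper functions are my own assumptions for illustration. Check your chosen data provider's specification for how it actually represents the hierarchy.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AddressRecord:
    key: int                          # made-up unique key
    description: str
    parent_key: Optional[int] = None  # None for a top-level building


RECORDS = [
    AddressRecord(200000000000, "4 Acacia Avenue (parent building)"),
    AddressRecord(200000000001, "Flat 1, 4 Acacia Avenue", parent_key=200000000000),
    AddressRecord(200000000002, "Flat 2, 4 Acacia Avenue", parent_key=200000000000),
]


def keys_are_unique(records: List[AddressRecord]) -> bool:
    """Basic integrity check: no key may appear twice."""
    keys = [record.key for record in records]
    return len(keys) == len(set(keys))


def units_within(building_key: int) -> List[AddressRecord]:
    """All units whose parent is the given building."""
    return [record for record in RECORDS if record.parent_key == building_key]


print(keys_are_unique(RECORDS))
print([unit.description for unit in units_within(200000000000)])
```

With the relationship held against the key, a service delivered to the building as a whole can be linked to every flat inside it without any comparison of address text.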

  5. Know what to do when an address changes

Your customer addresses are going to change. Properties are demolished and replaced, postcodes are changed and buildings are given new names. Organisations may get to know about this when their customers inform them. But what if they don’t? This is where address change-intelligence comes in. Knowing the addresses that have changed between each release is a key requirement of a reference address dataset. If you are able to pre-define actions against the type of change, it means you can automatically trigger the right response as early as possible.
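Pre-defining actions against change types might look something like the Python sketch below. The change-type labels, action names and UPRN values are all hypothetical. A real reference dataset will have its own way of flagging inserts, updates and deletions between releases.

```python
from typing import List, Tuple

# Hypothetical mapping from type of change to a pre-defined business action.
CHANGE_ACTIONS = {
    "insert": "offer_service_to_new_property",
    "update": "refresh_address_text_in_master",
    "delete": "suspend_mailings_and_review_account",
}

FALLBACK_ACTION = "send_to_manual_review"


def plan_actions(changes: List[Tuple[int, str]]) -> List[Tuple[int, str]]:
    """Pair each changed UPRN with the action defined for its change type."""
    return [
        (uprn, CHANGE_ACTIONS.get(change_type, FALLBACK_ACTION))
        for uprn, change_type in changes
    ]


# Made-up change records taken from comparing two releases.
latest_changes = [
    (100000000003, "insert"),
    (100000000001, "update"),
    (100000000002, "delete"),
    (100000000004, "reclassified"),  # a type with no pre-defined action
]

for uprn, action in plan_actions(latest_changes):
    print(uprn, action)
```

Anything without a pre-defined action falls through to manual review, so an unexpected type of change is never silently ignored.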


Organisations implementing an MDM strategy can expect to see efficiencies and service improvements. Businesses can learn a lot from government, where for many years the UPRN has been used widely as a means of referencing and sharing addresses. It underpins a range of joined-up local and central government services including “Tell Us Once”, a way to report a death to most central and local government bodies in one go.

Organisations can also start to link other data to the address record to unlock new potential. For instance, addresses can be given an accurate geocode to locate them precisely on the ground, their usage can be classified as residential or commercial, and single or multiple occupancy properties can be highlighted. Organisations are already using this approach to better understand risks such as flooding, optimise delivery logistics through pinpoint-accurate routing and deliver more personalised customer service.

If you are interested in learning more about the address products, services and consulting available from Ordnance Survey or our partners, get in touch.


2 Responses

  1. Ralph Pawne

    Great post. I’ve actually come across DataMatch by Data Ladder (www.dataladder.com), which is an excellent fuzzy matching and address standardization/address parsing tool used across business and would work really well for this situation. They offer a complimentary trial for new users.

    In fact, an independent verified evaluation was done of the software comparing it to major software tools by IBM and SAS. There was a study done at Curtin University Centre for Data Linkage in Australia that simulated the matching of 4.4 Million records. It identified what providers had in terms of accuracy
    (Number of matches found vs available. Number of false matches)
    1. DataMatch Enterprise, Highest Accuracy (>95%), Very Fast, Low Cost
    2. IBM Quality Stage , high accuracy (>90%), Very Fast, High Cost (>$100K)
    3. SAS Data Flux, Medium Accuracy (>85%), Fast, High Cost (>100K)
