Challenge
Patient health records held by GPs contain important information about a population's health for health providers, business intelligence, and research. Using de-identified patient addresses from these records in health research can yield ground-breaking insights into the wider determinants of health, by linking people to places.
However, addresses are entered into GP records as free text, so the same address can be written in many different ways. Harnessing de-identified patient addresses would allow health records to be analysed at household level. At present, 'place' can only be analysed by postcode or small area, both of which cover multiple households that may not share the same characteristics.
Solution
A team of researchers, led by the Clinical Effectiveness Group at Queen Mary University of London, and Endeavour Health Charitable Trust, has developed an algorithm that uses Unique Property Reference Numbers (UPRNs) to analyse health data at household level.
UPRNs are numeric identifiers that are allocated to every property and managed in an Ordnance Survey database. The algorithm, known as ASSIGN (AddreSS MatchInG to Unique Property Reference Numbers), compares addresses in patient health records with Ordnance Survey's UPRN database, one element at a time, and decides whether there is a match. It mirrors human pattern recognition to allow for character swaps, spelling mistakes, and abbreviations.
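To make the idea of element-by-element comparison concrete, here is a minimal sketch in Python. It is not the ASSIGN implementation: the abbreviation table, the one-edit tolerance, and all function names are assumptions chosen for illustration. It splits each address into elements, expands a few abbreviations, and accepts word elements that differ by a single edit, where an adjacent character swap counts as one edit. Elements containing digits (house numbers, postcode parts) must match exactly.

```python
import re

# Hypothetical abbreviation table for illustration; not taken from ASSIGN.
ABBREVIATIONS = {"rd": "road", "st": "street", "ave": "avenue", "flt": "flat"}


def tokens(address: str) -> list[str]:
    """Lower-case, drop punctuation, and expand known abbreviations."""
    words = re.findall(r"[a-z0-9]+", address.lower())
    return [ABBREVIATIONS.get(w, w) for w in words]


def osa_distance(a: str, b: str) -> int:
    """Edit distance in which an adjacent character swap counts as one edit."""
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
            # Adjacent transposition, e.g. "hihg" vs "high".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[len(a)][len(b)]


def elements_match(x: str, y: str) -> bool:
    """Elements with digits must be identical; words may differ by one edit."""
    if any(c.isdigit() for c in x + y):
        return x == y
    return osa_distance(x, y) <= 1  # assumed tolerance


def addresses_match(candidate: str, reference: str) -> bool:
    """Compare two addresses one element at a time."""
    a, b = tokens(candidate), tokens(reference)
    return len(a) == len(b) and all(elements_match(x, y) for x, y in zip(a, b))
```

With these assumptions, `addresses_match("12 Hihg St, E1 4NS", "12 High Street E1 4NS")` accepts the pair despite the swapped letters and the abbreviation, while changing the house number to 14 rejects it. A production matcher would need far richer rules than this sketch.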
The algorithm is open source and has been shown to be highly accurate, correctly matching 98.6% of patient addresses at a rate of 38,000 records per minute. Importantly, the patient records and the UPRNs are de-identified, which keeps addresses and patient identities hidden from researchers.
Result
ASSIGN has unlocked the potential of UPRNs for place-based health analysis and research. It is an open-source, quality-assured, and transparent algorithm available for use under a Creative Commons licence.
Assigning UPRNs to the addresses in health records enables two key things: linking people who share a household at a point in time to understand variations in household health, and linking to other data sources, such as property information and local authority records, to study other wider determinants of health. The algorithm makes bulk address-matching with UPRNs scalable and fast, using a rigorously tested and standardised method.
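As a rough illustration of the first point, once each record carries a de-identified UPRN, grouping people into households is a simple key-based grouping. The sketch below uses made-up pseudonymised identifiers and assumes all rows refer to a single snapshot date. The field layout and function name are hypothetical and are not drawn from ASSIGN or any NHS dataset.

```python
from collections import defaultdict

# Made-up pseudonymised rows for one snapshot date: (patient_id, uprn_id).
rows = [("p1", "u-01"), ("p2", "u-01"), ("p3", "u-02"), ("p4", "u-01")]


def group_households(rows):
    """Group patient pseudonyms by the UPRN pseudonym they share."""
    groups = defaultdict(list)
    for patient_id, uprn_id in rows:
        groups[uprn_id].append(patient_id)
    return dict(groups)


households = group_households(rows)
# Household size per UPRN, e.g. as a starting point for overcrowding analysis.
household_sizes = {uprn: len(members) for uprn, members in households.items()}
```

The same UPRN key could then be used to join to other UPRN-indexed datasets, such as property information.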
The Clinical Effectiveness Group has worked with the NHS in northeast London to assign UPRNs in real time to every patient address in GP health records. They are using the de-identified data, sometimes linked with other datasets, to investigate the health impacts of household overcrowding and household clustering of people affected by multiple long-term conditions.
The team is also working with the NHS in Wales and Scotland, and with local authorities in London, to use ASSIGN to improve population health and inform policy.