Continuing our series introducing the individuals within OS and the range of roles we have, meet Charis Doidge, a Senior Data Engineer. Here she gives us an insight into the innovative work she is involved in…
How long have you worked for OS?
After completing an undergraduate degree in History, I started at OS in January 2014. Although I studied history, OS has very supportive training programmes and I was soon proficient in all things “Geography”.
I began my career creating our digital maps as a remote sensing surveyor, and soon learned how to create our super detailed orthorectified imagery using various software suites. I gathered experience using geospatial tools, software and methodologies, and in doing so demonstrated my problem-solving ability.
With the support of my manager I threw myself into courses in data science, statistical mathematics, and machine learning. OS supported my career formally by sending me on a Fundamentals of Data Science course, which soon got me to grips with Python, HTML, SQL and data science principles.
In 2016, I moved to the Business Change & Innovation team where I started to put my knowledge to good use. I started to leverage the imagery data I had been creating only a year earlier, using machine learning to add detail and bespoke data to our maps.
Can you describe your working day?
As I have only recently become a Senior Data Engineer, I don’t quite know my average day yet. In my previous role as a Research Scientist, a typical day for me was developing our methods and staying on top of new trends and discoveries. The ability to take a new idea, create a proof of concept, and run with it is a key skill here. I would then begin work on the data, which always ended up being more difficult than hoped: I would wrangle it through several epochs of processing using Python, Arc or FME until it was in a fit state. Data preparation can consume you!
Some days I trained shallow or deep algorithms on our imagery data to produce new labels for our maps, including roof attributes such as roof type, roof material, or the presence of solar panels. We would make small adjustments to the model in order to increase the accuracy of the results. Our work is now part of the Sensed Data project and may be used to support new PSGA requirements in the future (learn more about this here).
What are you working on right now?
My team is a hive of activity. We have experts in a wide range of fields, including 3D modellers, UAV pilots, computer vision engineers, deep learning data scientists and satellite experts. One of my recent projects was to update TopoNet, our bespoke machine learning model, which is trained on imagery and topography. Our hypothesis is that if TopoNet performs well on its own dataset, it will become a good extractor for new features in our landscape. By features we mean new attributes on a map, which can lead to better products and services for our customers, so I kept a focus on quality. I spent weeks tinkering with the data, ensuring that the labels were of good quality, that they formed a balanced set, and that the trained models were performing well. I managed to improve the accuracy of TopoNet from 86% to 96%. I then wrote a report on the newest iteration of our training data, to publish our findings across the business.
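For readers curious what checking that labels form a balanced set and measuring accuracy can look like in practice, here is a minimal Python sketch. It is an illustration only, not OS's TopoNet pipeline: the function names, class names and sample data are all hypothetical.

```python
from collections import Counter


def class_balance(labels):
    """Return each class's share of the label set (shares sum to 1)."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    total = len(labels)
    return {cls: n / total for cls, n in counts.items()}


def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be the same length")
    if not y_true:
        raise ValueError("y_true must not be empty")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical labels and model predictions, for illustration only.
labels = ["building", "road", "water", "building", "road", "water"]
predictions = ["building", "road", "water", "building", "water", "water"]

balance = class_balance(labels)        # each class holds one third of the set
score = accuracy(labels, predictions)  # five of the six predictions match
```

A heavily skewed `balance` would suggest collecting or resampling labels before training, since a model can score well on accuracy simply by favouring the most common class.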
What is your favourite part of your job?
My team are a wonderful bunch who are all enthusiastic about their profession. In every team meeting, I learn something new about a research strand different from my own. This in turn feeds into my own ideas and allows my imagination to run wild!
Due to the mixed nature of skills and experience in the team, there is always someone to help you learn a new skill or implement some code. As a team we collaborate across the business, and recently I have been given training in using Databricks to implement our work at national scales. Not only is this a cool project, but the knowledge throughout the business means you are supported in your development.
OS is truly a great place to work. We’re a pioneering bunch.
What are you excited to work on (or continue working on) in the future?
I am looking forward to seeing how our deep learning algorithms in Databricks pipelines can be leveraged to create new automatic attributes for our maps. In the past year we researched several attributes that we could enhance our maps with, and now we are in a place where we can start to iterate and deploy them. There are customer demands that we know we can solve using this method of capture, and I am very much looking forward to seeing how they progress.
I am excited about having recently become a Senior Data Engineer. Building upon my data science and data wrangling knowledge, I am now tasked with creating innovative tools and environments for our data. I am designing and building the infrastructure for data ingestion, processing, and storage for Systems of Reference, and creating modules for leveraging our data using analytics, data science and machine learning. The Data Office will be showcasing our work in the coming months, showing the business how our new data architecture could operate at lightning speed, with data available, processed, and accessible for users at the touch of a button.
What is your OS highlight?
My highlight was the first time I tested our newest implementation of our deep learning network, TopoNet. I had spent weeks iterating on the data, munging datasets together and ensuring inputs were of an appropriate quality. The team had rallied around the model, and within the cloud we had iterated to find the best parameters. It finished training, and we crossed our fingers. On the first validation test I ran, we found it had increased our initial validation scores to over 96%. Elation was how I felt in that moment – we were really on to something! Since then we have proved how this new model could be used to identify zebra crossings, roof materials, solar panels, and roof types automatically. And that’s only the beginning…