How Your Google Search Data Is Used To Track Coronavirus

Analysis of online searches produced a 20-day warning of cases going up or down during the early days of the pandemic.

When the coronavirus pandemic began, governments around the world scrambled – with varying senses of urgency – to pull together the tools and resources needed to fight it.

While much attention has rightly been paid to vaccines, lockdowns and the merits of face masks, one powerful weapon hasn’t got much credit – your keyboard.

That’s right, Public Health England (PHE) uses Google search data – particularly searches about Covid-19 symptoms – to “understand and monitor the pandemic in England”.

And the data, which contributes to the government’s weekly surveillance reports, was able to accurately give a 20-day warning about whether cases were going up or down during the height of the pandemic.

The assumption is pretty straightforward – people with symptoms of an illness will Google what is wrong with them before they go to or even book an appointment with a health professional.

Google also has the added bonus of being relatively anonymous so people tend to be far more honest than they are when talking to other people about illnesses (and practically everything else), a phenomenon documented in the book Everybody Lies by Seth Stephens-Davidowitz.

Researchers used a survey from March that examined the “first few hundred” cases to create a list of 19 symptoms associated with confirmed Covid-19 cases, along with each symptom’s probability of occurrence.

A separate and more general search category was created for Covid-19-related keywords, for example “Covid-19” itself and of course “coronavirus”.

Researchers then used Google Health Trends, a specialist version of the publicly available Google Trends dashboard that focuses on health data.
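
As a rough illustration of how a symptom list and its probabilities might be turned into a single search score, the sketch below weights each symptom's search frequency by how likely that symptom is to occur. The symptom names, probabilities and frequencies are invented placeholders, not figures from the survey or from Google, and the weighting scheme is an assumption rather than the researchers' published method.

```python
# Hypothetical illustration: every number below is an invented placeholder.
def weighted_search_score(frequencies, probabilities):
    """Combine per-symptom search frequencies into one daily score.

    Each symptom's frequency is weighted by its probability of occurrence,
    and the weights are normalised so they sum to 1.
    """
    total = sum(probabilities[s] for s in frequencies)
    if total == 0:
        raise ValueError("probabilities must not all be zero")
    return sum(frequencies[s] * probabilities[s] for s in frequencies) / total


# Assumed probability that each symptom appears in a confirmed case.
symptom_probability = {"cough": 0.5, "fever": 0.3, "loss of smell": 0.2}
# Assumed share of all searches on one day that mention each symptom.
daily_frequency = {"cough": 0.010, "fever": 0.020, "loss of smell": 0.005}

score = weighted_search_score(daily_frequency, symptom_probability)
```

A score like this, calculated day by day, would rise when searches for the most typical symptoms rise and be less affected by searches for rarer ones.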

And the results were startling.

“When we looked at this for a number of countries, around eight, we saw on average there was a 20-day early warning that Covid cases are going to go up or down,” one of the architects of the method, data scientist Bill Lampos of University College London, told HuffPost UK.

“We expected that at the time because health systems were very slow to test, so people were tested mainly when they presented themselves in a hospital with more severe symptoms, and it takes some time to get to that stage.

“People will search for their symptoms and seek help later on. And if your symptoms were not severe, you didn’t get a test so you were not a case in the system but the search did not miss that out.”
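
One simple way to picture an “early warning” of this kind is to slide the search series forward in time and see which shift lines up best with confirmed cases. The sketch below does that with a plain correlation on synthetic data in which cases are built to trail searches by exactly 20 days. The function names and the data are hypothetical, and this is not necessarily how the researchers measured the lead time.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def best_lead(searches, cases, max_lag):
    """Return the lag in days at which searches best match later cases."""
    def score(lag):
        # Compare searches on day t with cases on day t + lag.
        return pearson(searches[: len(searches) - lag], cases[lag:])
    return max(range(max_lag + 1), key=score)


def bump(day):
    """Synthetic single wave peaking on day 40 (placeholder shape)."""
    return math.exp(-(((day - 40) / 10) ** 2))


# Synthetic data: cases are the search curve delayed by 20 days.
searches = [bump(t) for t in range(120)]
cases = [bump(t - 20) for t in range(120)]

lead_days = best_lead(searches, cases, max_lag=40)
```

On real data the match would be far noisier, which is one reason a search-based signal is usually read alongside other indicators rather than on its own.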

A chart showing the increase in searches for coronavirus symptoms around the time lockdown was imposed in England.
PHE

The data also offers some interesting historical insights. For instance, a report on the methodology published in July highlights how effective non-lockdown measures were in the early days of the outbreaks.

It reads: “We also note that for Australia and the UK, search scores were already in decline after the application of physical distancing measures but before lockdowns.”

The report adds: “Our work provides evidence that online search data can be used to develop complementary public health surveillance methods to help inform the Covid-19 response in conjunction with more established approaches.”

Testing is now more widespread in the UK – albeit not as widespread as it should be at this stage in the pandemic – but the data Lampos and his team provide is still useful and contributes to PHE’s weekly infection survey report.

“The model followed the death rate very closely after the big peak in April and since then it’s followed the expected rate,” said Lampos.

Six months into the pandemic the UK combines a number of indicators into its weekly assessment including swabbing, NHS Test and Trace and hospital admissions.

But the search data method still retains one major advantage over all of these – people will generally search symptoms before they come into contact with any official testing or tracing service.

The latest search data Lampos submitted to PHE has reflected the recent upward trend in cases and has been confirmed by data from patient swabbing – but it’s still uncharted territory.

“Now there are second waves, we will have to do a second assessment of the models to see what we are doing correctly and what we’re getting wrong,” says Lampos.

“It’s a work in progress really. It’s an indicator and it has to be combined with other indicators.”