Google Flu Trends, which has successfully tracked seasonal flu outbreaks by monitoring the words people enter into the search engine, appears to have gotten it wrong this season.
Knowing that there has been a surge in the number of people in a particular geographical area who are searching for information about flu symptoms or treatments, or complaining about flu-like symptoms, can help to spot epidemics before they fully develop.
The same could be said for observing patterns on social media platforms like Twitter and Facebook. If a topic like ‘flu’ or ‘fever’ is being widely discussed, this can be observed in real time. In theory, this presents an opportunity to step up public health interventions – such as immunisation awareness campaigns – before an outbreak has peaked.
The major benefit of incorporating information from search engine or social media activity is speed.
Traditional disease surveillance relies on networks of hospitals and family doctors reporting to a central system that uses the available data to report on flu outbreaks. The trouble with this method is that it can be slow, taking two weeks to collect and digest the data and to respond to trends.
So, monitoring online behaviour can quickly offer considerable detail about the locations – and even the age groups – worst affected by epidemics like the flu. The question for search engines and social media tools is whether they can claim to be as accurate as the established way of collecting data.
The signal and the noise
From the beginning, there has been plenty of scepticism about tapping into online trends and using this data as a flu-tracking tool.
In previous years, Google Flu Trends had more or less matched the official US data on influenza patterns – the only difference being that Google’s data was available before the traditional, methodical reports from health authorities.
However, critics have said that Google Flu Trends appears to be least reliable when activity is highest. The problem, they say, is that even small outbreaks can lead to panic-fuelled increases in online flu-related activity.
This year’s flu season in the US has reportedly been more severe than usual, prompting a spike in media coverage and – perhaps – a disproportionate rise in the number of people searching online for influenza-related information.
For example, a genuine outbreak of a severe strain of flu in New York prompted officials to declare a ‘public health emergency’. The epidemic was real. But the publicity led to a huge rise in Google searches for information about influenza, and that rise was not matched in the figures produced by health authorities.
Twitter, too, has come in for criticism. While it may be possible to pick up local outbreaks of disease – or even to pinpoint rising vaccine hesitancy in certain communities – results based on Twitter or Facebook activity are naturally skewed towards younger and better-off segments of the population.
This poses particular challenges when it comes to measuring influenza outbreaks, given that one of the key risk groups for severe complications of flu is older people – a group that is generally less well represented on Twitter than people in their 20s or 30s.
Still, while this may be a setback for the Google Flu Trends project, it must be noted that online ‘listening’ tools are becoming increasingly sophisticated. Ten years ago, Twitter and Facebook did not exist and few could have foreseen any role for search engines in disease surveillance. But the power of these tools is causing health authorities to sit up and take notice.
The challenge now is one of noise reduction: how can public panic be accounted for when trying to measure disease outbreaks?
Updating the disease surveillance system to incorporate social media listening tools and crowd-sourced disease reporting – where individuals voluntarily report their own illness – still looks almost inevitable.
As more people contribute to sites such as HealthMap, Sickweather, Crowdbreaks and Flu Near You, the data becomes ever more accurate. Health authorities are taking this seriously while trying to figure out how much weight to give the data.
In a world where real-time communication produces masses of freely available data, waiting two weeks to reflect on disease outbreaks seems rather difficult to accept given the scope for early intervention.