For the past seven years, Google Flu Trends has successfully tracked seasonal flu epidemics by monitoring the words people enter into the search engine.
It has worked pretty well, providing information about the location and severity of flu outbreaks around two weeks earlier than official data.
Traditional disease surveillance relies on information from doctors, hospitals and laboratories to follow flu outbreaks.
However, a report in Nature suggests Google may have gotten it wrong this season. It appears that while Google was pretty reliable when it came to pinpointing where flu was spreading, it overestimated the severity of the outbreak.
Part of the problem is that when flu is circulating, we all overreact a little and begin to suspect that every cough and sneeze is caused by an influenza virus. If we search for online information on this topic, Google Flu Trends will notch that up as another suspected flu case.
Read: Is Google the future of flu-tracking?
While Google has been working to fine-tune its algorithm to take account of this, researchers at the Boston Children’s Hospital have proposed another source of online data collection: Wikipedia.
An article in PLOS Computational Biology suggests that monitoring traffic to 35 flu-related Wikipedia pages can give a snapshot of flu activity which is 17% more accurate than Google Flu Trends and two weeks faster than the official figures from the US Centers for Disease Control and Prevention (CDC).
If you look at the graphs you can see that the shape of the Wikipedia graph maps neatly over the CDC graph. And it’s clear that Google Flu Trends (GFT) had a good record until this season when it overshot.
But this comes with caveats. The new study looked at Wikipedia visits and compared them with data from previous flu seasons, but did not attempt to use Wikipedia to predict flu trends in real time.
And Google has much more data on its users, including location, which makes it more useful for public health intervention: if you know where flu is spreading, it helps you to decide where to focus disease control efforts.
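For readers who want to see what "more accurate" can mean in practice, here is a minimal sketch of one way to score two online estimates against official figures. The article does not say how accuracy was measured, so the use of mean absolute error is an assumption, and every number and name below is made up for illustration rather than taken from the study, Google or the CDC.

```python
from statistics import mean


def mean_absolute_error(estimates, official):
    """Average absolute gap between an estimate series and the official series."""
    if len(estimates) != len(official):
        raise ValueError("both series must cover the same weeks")
    return mean(abs(e - o) for e, o in zip(estimates, official))


def relative_improvement(error_a, error_b):
    """Fraction by which error_a is smaller than error_b (0.17 means 17% lower)."""
    return (error_b - error_a) / error_b


# Illustrative weekly values only (NOT real data): share of doctor visits
# for flu-like illness, in percent, over five consecutive weeks.
official = [1.0, 2.0, 4.0, 3.0, 2.0]
page_view_model = [1.5, 2.5, 3.5, 3.0, 2.5]  # hypothetical estimate from page views
search_model = [1.0, 3.0, 6.0, 4.0, 2.0]     # hypothetical estimate that overshoots the peak

page_view_error = mean_absolute_error(page_view_model, official)
search_error = mean_absolute_error(search_model, official)
improvement = relative_improvement(page_view_error, search_error)

print(f"page-view error: {page_view_error:.2f}")
print(f"search error:    {search_error:.2f}")
print(f"page-view error is {improvement:.0%} lower")
```

A score like this only describes how well an estimate matched past seasons. It says nothing about how the same method would perform when predicting a season that is still unfolding, which is the caveat raised above.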
All of this is part of a much broader effort to use real-time information from online sources such as search engines, social media and self-reporting by members of the public. Fifteen years ago, none of this would have been possible. Five years ago, critics saw social media as a high-tech toy rather than a public health tool.
Sites like Health Map, Sick Weather, Crowd Breaks and Flu Near You collect large volumes of data and, as growing numbers of people contribute their information, these services will become more accurate.
Yes, each has its own imperfections, and more work will be required before they are incorporated into established disease surveillance systems, but a decade from now they could well be the norm.