Twitter might be something more than a way to keep the world posted on your every thought and action. It just might help health officials track disease, especially the flu.

Aron Culotta, assistant professor of computer science at Southeastern Louisiana University, has found that monitoring a social network such as Twitter has the potential to track influenza outbreaks far more quickly and cheaply than traditional methods of disease surveillance.

A process called syndromic surveillance uses collected health-related data to alert health officials to the probability of an outbreak of disease, typically influenza or other contagious diseases. The technique involves collecting data from hospitals, clinics and other sources, a labor-intensive and time-consuming approach.

By monitoring a social network such as Twitter, researchers can capture comments from people with the flu who are sending out status messages.

"A micro-blogging service such as Twitter is a promising new data source for Internet-based surveillance because of the volume of messages, their frequency and public availability," Culotta said. This approach is much cheaper and faster than having thousands of hospitals and health care providers fill out forms each week.

"The Centers for Disease Control produces weekly estimates," he notes, "but those reports typically lag a week or two behind."

"This approach produces estimates daily," Culotta said.

500 million Tweets

Culotta and two student assistants analyzed more than 500 million Twitter messages over the eight-month period of August 2009 to May 2010, collected using Twitter's application programming interface (API). By using a small number of keywords to track rates of influenza-related messages on Twitter, the team was able to forecast future influenza rates.
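The keyword-tracking step described above can be illustrated with a minimal Python sketch. It is not the team's software: the keyword list, the `daily_flu_rates` function and the sample messages are all hypothetical, since the article does not say which keywords were used. The sketch simply computes, for each day, the share of messages that mention at least one flu-related keyword.

```python
import re
from collections import defaultdict

# Hypothetical keyword list; the article does not name the keywords used.
FLU_KEYWORDS = ("flu", "influenza", "fever", "cough")

# Match whole words only, so that "fluent" does not count as "flu".
_PATTERN = re.compile(r"\b(" + "|".join(FLU_KEYWORDS) + r")\b", re.IGNORECASE)


def daily_flu_rates(messages):
    """Return {day: fraction of that day's messages mentioning a keyword}.

    `messages` is an iterable of (day, text) pairs.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for day, text in messages:
        totals[day] += 1
        if _PATTERN.search(text):
            hits[day] += 1
    return {day: hits[day] / totals[day] for day in totals}


# Invented sample messages, for illustration only.
sample = [
    ("2009-10-01", "Home sick with the flu today"),
    ("2009-10-01", "Great game last night"),
    ("2009-10-02", "Fever and cough, staying in bed"),
    ("2009-10-02", "Coffee time"),
    ("2009-10-02", "Traffic is terrible"),
    ("2009-10-02", "Lunch break"),
]
rates = daily_flu_rates(sample)
```

A plain keyword match also counts messages such as "got my flu shot," so a real system would likely need to filter out mentions that do not indicate illness.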

"Once the program is running, it's actually neither time-consuming nor expensive," he said. "It's entirely automated because we're running software that samples each day's messages, analyzes them and produces an estimate of the current proportion of people with the flu."

Southeastern's group obtained a 95 percent correlation with the national health statistics collected by the CDC. In addition, the results were comparable to figures collected by Google with its Flu Trends service, which tracks influenza rates by analyzing trends in query terms.
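For readers unfamiliar with the statistic, a "95 percent correlation" presumably refers to a correlation coefficient of about 0.95, where 1.0 means two series rise and fall in perfect step. The article does not say which measure was used; the Python sketch below assumes the common Pearson coefficient, and the numbers in it are invented for illustration, not the study's data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length (at least 2 points)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (spread_x * spread_y)


# Invented weekly figures, for illustration only (not the study's data).
twitter_rates = [0.010, 0.014, 0.021, 0.018, 0.012]  # share of flu-related messages
cdc_rates = [1.9, 2.6, 3.8, 3.5, 2.2]                # hypothetical official estimates

r = pearson(twitter_rates, cdc_rates)
```

With these made-up series, `r` comes out close to 1 because the two lists rise and fall together. A high correlation shows that the Twitter signal tracks the official figures, not that the two measure the same quantity.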

Culotta said the statistics he collected were for the whole country. His future work will look at extracting information from messages that include more location-specific data. This would allow him to more easily segment reporting information by region. He is also planning a Web site, being developed in collaboration with graduate student Matthew Gill and computer science senior Ross Murray, that will display his results in real time.

The Twitter advantage

Culotta said using Twitter has an advantage over Google because the high posting frequency of Twitter enables up-to-the-minute analysis of an outbreak. Twitter, he said, reports having more than 105 million users posting nearly 65 million messages a day. Approximately 300,000 new users are added daily.

"Despite the fact that Twitter appears targeted to a young demographic, it does in fact have quite a diverse set of users," he said. The majority of Twitter's nearly 10 million unique visitors in February 2009 were over 35 years old, and a nearly equal percentage of users are between the ages of 55 and 64 as between 18 and 24.