Monitoring of Complex Information Infrastructure by Mining External Signals (2009-2010)

From October 2009, Professor Ben O’Loughlin, in collaboration with Linguamatics Ltd, conducted a 12-month pilot investigation into the use of blogs and Twitter to monitor information infrastructures for early warnings of problems. The project was funded by the Technology Strategy Board. It analysed public use of social media around the Haiti earthquake, the swine flu vaccine controversy, and the Sony PlayStation hack, as well as in everyday transport routines. The research was the latest in a series of NPCU grant-funded projects developing methodologies to explore online behaviour and its consequences for politics and society.

Swine flu / influenza. How does virality work?

The goal of the project was to show that analysis of external chatter can provide early warning, in near real time, of economic or security problems. Large organisations have successfully used automatic analysis of formal channels (e.g. customer surveys and user feedback forms) with Natural Language Processing (NLP) to identify issues reported with products and services. Informal online sources of information, such as blogs and Twitter, offer the potential for greater coverage of issues in near real time. We took NLP technology already proven in life science research and applied it to blogs and Twitter for monitoring digital services. Weak signals gathered from large numbers of users can suggest problems that do not show up as single-point failures. We also analysed whether it is possible to catch cases where a rumour of a problem may exacerbate or even cause the problem itself.

Relatively unsophisticated word-counting techniques have proved successful in categorising user comments. In this project we combined these techniques with deeper language processing, as used successfully in text mining of academic journals for drug discovery, to give early warning of potential infrastructure problems. We looked at the role of rumours, both in exacerbating issues and in suggesting potential information leakage.
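
As an illustration of the word-counting approach only, the following is a minimal Python sketch that assigns a comment to the category whose keywords it mentions most often. The category names, the keyword lists and the function name categorise are hypothetical and chosen for the example; they are not the project's actual vocabularies or software.

```python
import re
from collections import Counter

# Hypothetical keyword lists for illustration; the project's real
# vocabularies are not described in the text.
CATEGORY_KEYWORDS = {
    "outage": {"down", "offline", "outage", "unavailable"},
    "delay": {"delayed", "late", "slow", "cancelled"},
}


def categorise(comment):
    """Return the category whose keywords occur most often in the comment.

    Returns None when no keyword from any category appears.
    """
    # Lower-case the text and count each word.
    words = Counter(re.findall(r"[a-z']+", comment.lower()))
    # Score each category by summing the counts of its keywords.
    scores = {
        category: sum(words[w] for w in keywords)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(categorise("The site is down again, totally offline"))
print(categorise("My train was delayed and then cancelled"))
```

A counter of this kind knows nothing about negation or context ("the site is not down" would still score as an outage), which is the sort of limitation that deeper language processing is meant to address.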

The project team also used these techniques to analyse public use of Twitter during the 2010 UK General Election leaders' debates. This extended research by O'Loughlin and Nick Anstead on semantic polling and led to numerous publications and media reports.

Linguamatics are a text-mining company based in Cambridge, UK. Lawrence Ampofo, a PhD student in the NPCU, was Research Assistant on the project.