Funded by the Netherlands Organization for Scientific Research, the TNT project aims to facilitate tracking news events and determining their impact on the general public, both by professionals (media analysts, news watchers, scientists) and by the general public itself. We want to achieve this by developing algorithms that allow us to track news and measure impact largely automatically. In the scenario that we envisage, news data will be obtained from the internet, and the impact of events will be determined by analyzing the comments left behind by readers.
To address our overall task, three preparatory activities are needed: (1) recognizing people, products, organizations, locations and temporal expressions, both in edited content and in user-generated content, which often consists of quickly written, unedited comments left by readers of news messages; (2) abstracting from news stories to news incidents, clustering messages about the same incident, and summarizing the material; and (3) determining the opinions of readers on the basis of their comments.
These three subtasks have been addressed extensively in the literature, with well-understood solutions. The innovative and scientifically challenging aspects of this proposal are to (1) apply them to the Dutch language; (2) apply them to noisy texts; and (3) integrate them and use them to provide insights into the daily flow of news facts and the comments that they generate.
Using the solutions to the three subtasks as building blocks, we develop and test methods that will facilitate media analysis of large quantities of data. The algorithms that we aim to develop will generate well-organized, interpretable data in which the main trends and links will become visible.
In sum, the project is directly aimed at finding ways to cope with the data explosion created by news facts and the public’s responses to them. The project’s results will lead to a renewed digital experience of online news.
- Manos Tsagkias
- Maarten de Rijke