Project Description
Topic detection with large and noisy data collections such as social media must address both scalability
and accuracy challenges. KeyGraph is an efficient method that improves on current solutions by considering
keyword cooccurrence. We show that KeyGraph has similar accuracy when compared to state-of-the-art
approaches on small, well-annotated collections, and it can successfully filter irrelevant documents and identify
events in large and noisy social media collections. An extensive evaluation using Amazon’s Mechanical
Turk demonstrated the increased accuracy and high precision of KeyGraph, as well as superior runtime
performance compared to other solutions.
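
As a rough illustration of what "considering keyword cooccurrence" can mean in practice, the sketch below counts how often pairs of keywords appear in the same document and keeps the pairs seen often enough as graph edges. It is a minimal sketch under stated assumptions, not the KeyGraph algorithm from the papers listed below: the function name `cooccurrence_graph`, the `min_count` threshold, the whitespace tokenization, and the toy documents are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_graph(docs, min_count=2):
    """Return {(kw_a, kw_b): count} for keyword pairs that appear
    together in at least min_count documents.

    Assumption: keywords are simply the lowercase whitespace-separated
    tokens of each document; a real system would filter and weight them.
    """
    pair_counts = Counter()
    for doc in docs:
        # De-duplicate and sort so each pair is counted once per document
        # and always appears in the same (alphabetical) order.
        keywords = sorted(set(doc.lower().split()))
        pair_counts.update(combinations(keywords, 2))
    # Keep only edges supported by enough documents.
    return {pair: n for pair, n in pair_counts.items() if n >= min_count}


# Toy documents, invented for illustration only.
docs = [
    "obama biden running mate",
    "obama biden convention",
    "snack food election",
]
graph = cooccurrence_graph(docs)
```

With these toy documents, only the pair ("biden", "obama") co-occurs in two documents, so it is the only edge kept. Groups of densely connected keywords in such a graph could then be treated as candidate topics.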


  • H. Sayyadi, L. Raschid. "A Graph Analytical Approach for Topic Detection", ACM Transactions on Internet Technology (TOIT), 2013. (DOI=10.1145/2542214.2542215)
  • H. Sayyadi, M. Hurst, and A. Maykov. "Event Detection and Story Tracking in Social Streams", to appear in Proceedings of the 3rd Int'l AAAI Conference on Weblogs and Social Media (ICWSM09), May 17-20, 2009, San Jose, California.

An example of KeyGraph and extracted topics/events:


This figure shows the number of documents per day for the topic US Presidential Election, as determined by KeyGraph (upper curve; each color shows a subevent), versus Google Trends for the query "2008 Presidential Election" (two lower curves). Google Trends provides the search volume index (middle curve) and the news reference volume or count (lower curve). We can clearly observe that the peaks in the KeyGraph output are closely aligned with peaks in Google Trends. For example, the peaks on August 12 correspond to the event when LesserEvil declared itself to be the official snack food of the 2008 presidential election. On August 23, Obama announced Biden as his running mate; this is detected by KeyGraph. Finally, both KeyGraph and Google Trends show a big peak on August 28; this corresponds to the 2008 Democratic National Convention, held from August 25 to 28.
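
The peak alignment described above is read off the figure by eye. One possible way to check it programmatically is sketched below: find local maxima in each daily series and pair up peaks that fall within a day of each other. This is only an illustrative sketch, not the evaluation used in the papers; the function names `find_peaks` and `aligned_peaks`, the `tolerance` parameter, and the two short series are hypothetical, and the numbers are invented rather than taken from KeyGraph or Google Trends.

```python
def find_peaks(counts, min_height=0):
    """Return indices of local maxima in a list of daily counts.

    A day is a peak if it is strictly higher than the previous day,
    at least as high as the next day, and at least min_height.
    """
    peaks = []
    for i in range(1, len(counts) - 1):
        if (counts[i] >= min_height
                and counts[i] > counts[i - 1]
                and counts[i] >= counts[i + 1]):
            peaks.append(i)
    return peaks


def aligned_peaks(peaks_a, peaks_b, tolerance=1):
    """Pair peaks from two series that are at most `tolerance` days apart."""
    return [(a, b) for a in peaks_a for b in peaks_b if abs(a - b) <= tolerance]


# Invented daily counts (index = day offset), for illustration only.
series_a = [1, 5, 2, 2, 8, 3, 1]
series_b = [0, 2, 6, 1, 9, 2, 0]

peaks_a = find_peaks(series_a)
peaks_b = find_peaks(series_b)
matches = aligned_peaks(peaks_a, peaks_b)
```

Here `series_a` peaks on days 1 and 4 and `series_b` on days 2 and 4, so with a one-day tolerance both peaks of each series find a partner in the other.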


