Abstract
Detecting traffic events and their locations is important for effective transportation management and better urban policy making. Traffic events include accidents, congestion, and parking issues, to name a few. Currently, traffic events are detected through static sensors such as CCTV cameras and loop detectors. However, these sensors have limited spatial coverage and high maintenance costs, especially in developing regions. On the other hand, with Web 2.0 and ubiquitous mobile platforms, people can act as social sensors, sharing traffic events along with their locations. We investigated whether Twitter, a social media platform, can be used to understand urban traffic events from tweets in India. However, such tweets are informal and noisy and contain vernacular geographical information, which makes location retrieval challenging. So far, most authors have relied on geotagged tweets to identify traffic events, yet geotagged tweets account for only 0.1%-3% of tweets, and sometimes less. Twitter has also recently removed precise geotagging, further decreasing the utility of such approaches. To address these issues, this research explored how ungeotagged tweets could be used to understand traffic events in India. We developed a novel framework that not only categorizes traffic-related tweets but also extracts the locations of traffic events from tweet content in Greater Mumbai. The results show that an SVM-based model performs best at detecting traffic-related tweets. For location extraction, a hybrid georeferencing model that combines a supervised learning algorithm with a number of spatial rules outperforms the other models. The results suggest that people in India, especially in Greater Mumbai, often share traffic information along with location mentions, which can complement existing physical transport infrastructure as a cost-effective way to manage transport services in the urban environment.