Parakweet Labs - Info Session
Speaker: Kiam Choo
Title: Applications of Entity Detection on Twitter Data: the Parakweet Experience
At Parakweet Labs, we have built entity detection on Twitter’s streaming data to detect books and movies mentioned by Twitter users. Books and movies are detected from a large proprietary database of more than 1 million entities. Tweets are challenging to work with because they are short and colloquial. Unlike related work that uses Wikipedia entries as entities, our proprietary database of entities does not automatically come with a popularity measure that helps with disambiguation. Instead, using Bayesian ideas, we have created a technique that pulls in broader statistical estimates, which help detect entity mentions in many situations where short, ambiguous tweets would normally prevent detection. We achieve a precision of 95% and a recall of 60%. Building on our entity detector, we have also built behaviour detection (such as whether someone wants to watch a movie, recommends a movie, etc.) by pattern matching on extracted relations that are simpler than semantic role labels and better suited to our purposes.
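As a rough illustration of the Bayesian idea in the abstract, the sketch below combines a prior probability that a title string refers to an entity with evidence from context words, using Bayes' rule in log-odds form. It is not Parakweet's actual method: the function names, the prior, and the likelihood ratios are all hypothetical, and the context cues are assumed to be independent.

```python
from math import exp, log


def posterior_is_entity(prior, likelihood_ratios):
    """Update a prior with independent cues via Bayes' rule in log-odds form.

    Each likelihood ratio is P(cue | mention is the entity) divided by
    P(cue | mention is not the entity). All values here are illustrative.
    """
    log_odds = log(prior / (1.0 - prior))
    for ratio in likelihood_ratios:
        log_odds += log(ratio)
    return 1.0 / (1.0 + exp(-log_odds))


def score_mention(tweet, title, prior, cue_ratios):
    """Return the posterior that `title` in `tweet` refers to the entity.

    Returns 0.0 when the title does not occur as a whole-word sequence.
    """
    tokens = tweet.lower().split()
    padded_tweet = " " + " ".join(tokens) + " "
    if " " + title.lower() + " " not in padded_tweet:
        return 0.0
    ratios = [ratio for word, ratio in cue_ratios.items() if word in tokens]
    return posterior_is_entity(prior, ratios)


# Hypothetical numbers: a short, ambiguous title gets a low prior,
# and nearby words such as "watch" raise the posterior.
ASSUMED_PRIOR = 0.05
ASSUMED_CUE_RATIOS = {"watch": 8.0, "tonight": 2.0}

if __name__ == "__main__":
    print(score_mention("want to watch Up tonight", "Up",
                        ASSUMED_PRIOR, ASSUMED_CUE_RATIOS))
```

With no matching cue words the score stays at the prior, so an ambiguous title on its own is not reported as a mention; each supporting cue multiplies the odds by its ratio.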
Open to undergraduate and graduate students.
See Parakweet Labs job posting.
Graduating students can forward their resumes to: