Table 1 Text Features used for Tweet Classification

From: Leveraging graph topology and semantic context for pharmacovigilance through twitter-streams

  Tweet Text Feature
1 Number of hashtags
2 Number of words indicating negation
3 Number of URLs
4 Number of pronouns
5 Number of drug entities
6 Number of effect entities
7 Bag of words text representation
Table 1. The textual features extracted from Twitter text for classification. Many spam and irrelevant tweets are designed to redirect a user to a desired website, so the presence of URLs in the tweet text is a strong indicator of spam. Spam tweets also often contain many hashtags in an attempt to exploit trending topics, which makes the number of hashtags a very informative feature as well.
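
The seven features in the table could be computed along the following lines. This is a minimal sketch, not the authors' implementation: the function name `extract_features`, the regular expressions, the `NEGATIONS` and `PRONOUNS` word lists, and the drug and effect lexicons passed in as sets are all assumptions made for illustration, since the table does not say how entities or negation words were identified.

```python
import re
from collections import Counter

# Illustrative word lists (assumptions); the source does not specify its lexicons.
NEGATIONS = {"no", "not", "never", "none", "cannot", "without"}
PRONOUNS = {"i", "me", "my", "we", "us", "you", "he", "she", "it", "they", "them"}

URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#\w+")
TOKEN_RE = re.compile(r"[a-z']+")


def extract_features(tweet, drug_lexicon, effect_lexicon):
    """Return the seven text features of Table 1 for a single tweet.

    drug_lexicon and effect_lexicon are hypothetical sets of lowercase
    single-word terms standing in for a real entity recognizer.
    """
    text = tweet.lower()
    # Strip URLs first so URL fragments are not counted as hashtags or words.
    stripped = URL_RE.sub(" ", text)
    tokens = TOKEN_RE.findall(stripped)
    return {
        "n_hashtags": len(HASHTAG_RE.findall(stripped)),
        # Count listed negation words plus contractions such as "can't".
        "n_negations": sum(t in NEGATIONS or t.endswith("n't") for t in tokens),
        "n_urls": len(URL_RE.findall(text)),
        "n_pronouns": sum(t in PRONOUNS for t in tokens),
        "n_drugs": sum(t in drug_lexicon for t in tokens),
        "n_effects": sum(t in effect_lexicon for t in tokens),
        "bag_of_words": Counter(tokens),
    }


example = extract_features(
    "I can't sleep since taking aspirin, no headache though #meds #health http://example.com/x",
    drug_lexicon={"aspirin"},
    effect_lexicon={"headache"},
)
```

The six count features can be appended to the bag-of-words vector to form a single input for a classifier. Matching single tokens against a lexicon is the simplest possible stand-in for entity recognition and would miss multi-word drug or effect names.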