
Table 1 Text Features used for Tweet Classification

From: Leveraging graph topology and semantic context for pharmacovigilance through twitter-streams

 

No.  Tweet Text Feature
1    Number of hashtags
2    Number of words indicating negation
3    Number of URLs
4    Number of pronouns
5    Number of drug entities
6    Number of effect entities
7    Bag of words text representation

Table 1. The textual features extracted from tweet text for classification. Many spam and irrelevant tweets are designed to redirect a user to a particular website, so the presence of URLs in the tweet text is a strong indicator of spam. Spam tweets also often contain many hashtags in an attempt to exploit trending topics, which makes the number of hashtags a very informative feature as well.
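
As a rough illustration of how the seven features in Table 1 could be computed from a single tweet, the following Python sketch uses only the standard library. It is not the authors' implementation: the function name `extract_features`, the regular expressions, and the word lists `NEGATIONS`, `PRONOUNS`, `DRUGS`, and `EFFECTS` are all assumptions made for this example, and a real system would presumably use proper drug and effect lexicons or an entity recognizer.

```python
import re
from collections import Counter

# Illustrative word lists only; the lexicons actually used are not given here.
NEGATIONS = {"no", "not", "never", "none", "cannot", "without"}
PRONOUNS = {"i", "me", "my", "you", "he", "she", "it", "we", "they", "them"}
DRUGS = {"ibuprofen", "prozac"}      # hypothetical drug lexicon
EFFECTS = {"headache", "nausea"}     # hypothetical effect lexicon

URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#\w+")
TOKEN_RE = re.compile(r"[a-z']+")


def extract_features(tweet):
    """Return the seven Table 1 features for one tweet as a dict."""
    text = tweet.lower()
    urls = URL_RE.findall(text)
    # Remove URLs first so URL fragments are not counted as hashtags or words.
    no_urls = URL_RE.sub(" ", text)
    hashtags = HASHTAG_RE.findall(no_urls)
    tokens = TOKEN_RE.findall(no_urls)
    return {
        "n_hashtags": len(hashtags),
        # Count listed negation words plus contractions such as "can't".
        "n_negations": sum(t in NEGATIONS or t.endswith("n't") for t in tokens),
        "n_urls": len(urls),
        "n_pronouns": sum(t in PRONOUNS for t in tokens),
        "n_drugs": sum(t in DRUGS for t in tokens),
        "n_effects": sum(t in EFFECTS for t in tokens),
        # Bag of words: token -> occurrence count.
        "bag_of_words": Counter(tokens),
    }


example = "I can't sleep, Prozac gave me a headache #sideeffects #meds http://t.co/abc"
features = extract_features(example)
```

The six count features can be concatenated with a vectorized form of the bag of words to give one feature vector per tweet. Simple set lookups only match single-word entities, so multi-word drug or effect names would need a different matching step.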