From: Disorder recognition in clinical texts using multi-label structured SVM
Feature | Description |
---|---|
Bag of words | The words in a 5-word window. |
Part of speech | Part-of-speech tags in a 7-word window. |
Capitalization | All alphabetic characters of each word are converted to uppercase [31]. The window size is 5. |
Case pattern | Similar to [32], each uppercase alphabetic character is replaced by “A”, each lowercase one by “a”, and each numeric character by “0”. The window size is 3. |
Word representation | We use word2vec to obtain 700 clusters from the unlabeled clinical narratives and assign each cluster a unique serial number. The serial number of the cluster a word belongs to is then used as a feature. The window size is 3. |
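
The surface features in the table can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that an n-word window is centred on the current token, that positions outside the sentence are filled with a hypothetical `<PAD>` symbol, and that other characters such as punctuation are left unchanged in the case pattern. The names `case_pattern`, `windowed` and `token_features` are illustrative. The part-of-speech and word2vec cluster features are omitted because they need an external tagger and trained embeddings.

```python
PAD = "<PAD>"  # hypothetical filler for positions outside the sentence


def case_pattern(token):
    """Replace uppercase letters with 'A', lowercase with 'a', digits with '0'."""
    out = []
    for ch in token:
        if ch.isdigit():
            out.append("0")
        elif ch.isupper():
            out.append("A")
        elif ch.islower():
            out.append("a")
        else:
            out.append(ch)  # assumption: other characters are kept as-is
    return "".join(out)


def windowed(tokens, i, size, name, fn):
    """Apply fn to every token in a window of `size` words centred on index i."""
    half = size // 2
    feats = {}
    for offset in range(-half, half + 1):
        j = i + offset
        inside = 0 <= j < len(tokens)
        feats[f"{name}[{offset}]"] = fn(tokens[j]) if inside else PAD
    return feats


def token_features(tokens, i):
    """Collect the bag-of-words, capitalization and case-pattern features."""
    feats = {}
    feats.update(windowed(tokens, i, 5, "bow", lambda w: w))
    feats.update(windowed(tokens, i, 5, "cap", str.upper))
    feats.update(windowed(tokens, i, 3, "case", case_pattern))
    return feats
```

For the sentence `["Patient", "has", "Type2", "diabetes", "."]` and the token at index 2, the sketch yields one feature per window position, for example a case pattern for `Type2` itself and uppercase forms of the two words on each side.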