
Table 1 Description of general feature types with examples

From: Machine learning with naturally labeled data for identifying abbreviation definitions


A character of a short form (SF) matches the first character of a word in a long form (LF)


Cerebrospinal fluid (CSF)


A character of a SF matches the first character of a stop-word in a LF


people with AIDS coalition (PWA)


A character of a SF matches the character following a non-alphanumeric non-space character in a LF


GH-releasing peptide (GHRP)


A character of a SF matches a character within a token in a LF such that the token splits at that character into two substrings, one or both of which are defined words


Cerebrospinal fluid (CSF)


The last character of a SF is ‘s’ and the last token in a LF ends in ‘s’ or ‘i’


plasma concentrations (PCs)


A letter of a SF matches a capital letter in a LF that is not a first character


gamma-vinyl GABA (GVG)


All characters of a SF appear anywhere in a single token in a LF in the correct order


bromodeoxyuridine (BrdU)


Look-up table match between a character in a SF and a token in a LF


Current (I)


A substring of a SF matches two or more consecutive characters of a token in a LF


methyl-beta-cyclodextrin (MbetaCD)
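
Several of the feature types above can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the implementation used in the article: the function names, the stop-word list, whitespace tokenization, case-insensitive comparison, and the choice to exclude stop-words from the first feature are all hypothetical choices made here. The features that depend on a dictionary of defined words or a look-up table are omitted because the table does not specify those resources.

```python
import re

# Hypothetical stop-word list; the table does not say which list is used.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}


def tokens(lf):
    """Split a long form (LF) on whitespace (an assumed tokenization)."""
    return lf.split()


def matches_word_initial(ch, lf):
    """SF character equals the first character of a non-stop-word token.

    Excluding stop-words here is an assumption that keeps this feature
    distinct from matches_stop_word_initial.
    """
    return any(t[0].lower() == ch.lower()
               for t in tokens(lf) if t.lower() not in STOP_WORDS)


def matches_stop_word_initial(ch, lf):
    """SF character equals the first character of a stop-word token."""
    return any(t[0].lower() == ch.lower()
               for t in tokens(lf) if t.lower() in STOP_WORDS)


def matches_after_punctuation(ch, lf):
    """SF character follows a non-alphanumeric, non-space character."""
    followers = re.findall(r"[^A-Za-z0-9\s]([A-Za-z0-9])", lf)
    return any(f.lower() == ch.lower() for f in followers)


def plural_match(sf, lf):
    """SF ends in 's' and the last LF token ends in 's' or 'i'."""
    toks = tokens(lf)
    return bool(toks) and sf.endswith("s") and toks[-1][-1] in ("s", "i")


def is_subsequence_of_token(sf, lf):
    """All SF characters appear, in order, inside a single LF token."""
    sf = sf.lower()
    for t in tokens(lf):
        remaining = iter(t.lower())
        # 'c in remaining' consumes the iterator, which enforces the order.
        if all(c in remaining for c in sf):
            return True
    return False
```

With these definitions, calls such as matches_after_punctuation("R", "GH-releasing peptide") and is_subsequence_of_token("BrdU", "bromodeoxyuridine") reproduce the corresponding rows of the table. In a classifier, each function would presumably supply one binary feature per SF/LF candidate pair.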