Table 1 Description of general feature types with examples

From: Machine learning with naturally labeled data for identifying abbreviation definitions

| Feature | Description | Example |
| --- | --- | --- |
| FC | A character of a short form (SF) matches the first character of a word in a long form (LF) | **C**erebrospinal fluid (**C**SF) |
| FC-ST | A character of a SF matches the first character of a stop-word in a LF | people **w**ith AIDS coalition (P**W**A) |
| FCG | A character of a SF matches the character following a non-alphanumeric, non-space character in a LF | GH-**r**eleasing peptide (GH**R**P) |
| SBW | A character of a SF matches a character within a token in a LF such that the token splits at that character into two substrings, one or both of which are defined words | Cerebro**s**pinal fluid (C**S**F) |
| LS | The last character of a SF is ‘s’ and the last token in a LF ends in ‘s’ or ‘i’ | plasma concentration**s** (PC**s**) |
| ALC | A letter of a SF matches a capital letter in a LF that is not the first character of its token | gamma-vinyl **G**ABA (GV**G**) |
| ALS | All characters of a SF appear, in the correct order, anywhere within a single token in a LF | **br**omo**d**eoxy**u**ridine (**BrdU**) |
| LT | Look-up table match between a character in a SF and a token in a LF | Current (**I**) |
| CL | A substring of a SF matches two or more consecutive characters of a token in a LF | methyl-**beta**-cyclodextrin (M**beta**CD) |
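
As a rough illustration of how a few of the feature types in Table 1 might be tested on a short form (SF) and long form (LF) pair, the following Python sketch checks FC, FCG, LS and ALS. The function names, the regular-expression tokenization and the case-insensitive comparison are all assumptions made for this example; they are not taken from the article and should not be read as the authors' implementation.

```python
import re


def tokens(long_form):
    """Split a long form into alphanumeric tokens (an assumed tokenization)."""
    return re.findall(r"[A-Za-z0-9]+", long_form)


def fc_match(sf_char, long_form):
    """FC: sf_char equals the first character of some word in the long form."""
    return any(tok[0].lower() == sf_char.lower() for tok in tokens(long_form))


def fcg_match(sf_char, long_form):
    """FCG: sf_char directly follows a non-alphanumeric, non-space character."""
    pattern = r"[^A-Za-z0-9\s]" + re.escape(sf_char)
    return re.search(pattern, long_form, flags=re.IGNORECASE) is not None


def ls_match(short_form, long_form):
    """LS: the SF ends in 's' and the last LF token ends in 's' or 'i'."""
    toks = tokens(long_form)
    return (
        bool(toks)
        and short_form.endswith("s")
        and toks[-1][-1].lower() in ("s", "i")
    )


def als_match(short_form, long_form):
    """ALS: all SF characters occur, in order, within a single LF token."""

    def is_subsequence(needle, haystack):
        # Consuming one shared iterator enforces left-to-right order.
        it = iter(haystack)
        return all(ch in it for ch in needle)

    sf = short_form.lower()
    return any(is_subsequence(sf, tok.lower()) for tok in tokens(long_form))


if __name__ == "__main__":
    print(fc_match("C", "Cerebrospinal fluid"))
    print(fcg_match("R", "GH-releasing peptide"))
    print(ls_match("PCs", "plasma concentrations"))
    print(als_match("BrdU", "bromodeoxyuridine"))
```

Each function returns a boolean for a single feature type. In a learning setting, such indicators would typically be computed per SF character or per SF-LF pair and combined into a feature vector; how the article does this is not shown in the table, so that step is left out here.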