BMC Bioinformatics

Table 5 Orthographic Features.

From: Extraction of semantic biomedical relations from text using conditional random fields

Orthographic Feature	Regular Expression
Init Caps	[A-Z].*
Init Caps Alpha	[A-Z][a-z]*
All Caps	[A-Z]+
Caps Mix	[A-Za-z]+
Has Digit	.*[0-9].*
Single Digit	[0-9]
Double Digit	[0-9][0-9]
Natural Number	[0-9]+
Real Number	[-\+][[0-9]+[\.,]+[0-9].,]+
Alpha-Numeric	[A-Za-z0-9]+
Roman	[ivxdlcm]+|[IVXDLCM]+
Has Dash	.*-.*
Init Dash	-.*
End Dash	.*-
Punctuation	[,\.;:\?!-\+"]
Greek	(alpha|beta|...|omega)
Has Greek	.*\b(alpha|beta|...|omega)\b.*
Mutation Pattern	\w\d+-\D+

Orthographic features and their corresponding regular expressions used in the experiments.
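As a rough illustration of how such patterns can be turned into token-level features, the sketch below tests a token against a subset of the table and returns the names of the features that fire. Several things here are assumptions rather than details given in the table: Python's `re` module stands in for whatever regex engine was used in the experiments, each pattern is assumed to match the whole token (hence `fullmatch`), the "Has ..." patterns are written with `.*` on both sides, and the names `ORTHOGRAPHIC_PATTERNS` and `orthographic_features` are hypothetical. The Real Number, Punctuation, Greek and Has Greek rows are left out because their exact form cannot be read off the table with confidence (the Greek alternation is elided).

```python
import re

# Subset of the table's patterns. Assumption: Python `re` syntax is close
# enough to the original engine for these simple expressions.
ORTHOGRAPHIC_PATTERNS = {
    "Init Caps": r"[A-Z].*",
    "Init Caps Alpha": r"[A-Z][a-z]*",
    "All Caps": r"[A-Z]+",
    "Caps Mix": r"[A-Za-z]+",
    "Has Digit": r".*[0-9].*",
    "Single Digit": r"[0-9]",
    "Double Digit": r"[0-9][0-9]",
    "Natural Number": r"[0-9]+",
    "Alpha-Numeric": r"[A-Za-z0-9]+",
    "Roman": r"[ivxdlcm]+|[IVXDLCM]+",
    "Has Dash": r".*-.*",
    "Init Dash": r"-.*",
    "End Dash": r".*-",
    "Mutation Pattern": r"\w\d+-\D+",
}

# Compile once so repeated calls over a corpus stay cheap.
COMPILED = {name: re.compile(pat) for name, pat in ORTHOGRAPHIC_PATTERNS.items()}


def orthographic_features(token):
    """Return the names of all patterns that match the entire token.

    Assumption: whole-token matching. The table does not state whether
    the expressions were anchored.
    """
    return [name for name, rx in COMPILED.items() if rx.fullmatch(token)]


if __name__ == "__main__":
    for tok in ["BRCA1", "IV", "A123-T", "7"]:
        print(tok, orthographic_features(tok))
```

Several features can fire for the same token, since the patterns overlap (every All Caps token is also Init Caps, for instance). In a conditional random field each returned name would typically become one binary feature for that token position.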

ISSN: 1471-2105
