Figure 1 | BMC Bioinformatics

Figure 1

From: A scalable machine-learning approach to recognize chemical names within large text databases

Example of the principle by which a first-order Markov model (MM) works, using the words "ethanol" and "booze". State transition frequencies are calculated for each letter in a word (including the spaces on both sides of the word) and compared with the models the MM has been trained on, in this case one for chemicals and one for words. The probability of observing a sequence of letters under each model is calculated as the product of the probabilities of each state (character) transition. To reflect a statistical distance between the two models, the log10 ratio of the two probabilities is taken.
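
The scoring described in the caption can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names (train, log10_prob, log10_ratio), the tiny training lists, the add-one smoothing, and the 27-symbol alphabet (26 letters plus space) are all assumptions introduced here so that unseen transitions do not produce a zero probability.

```python
import math
from collections import defaultdict


def train(words):
    """Count first-order character transitions, padding each word with spaces."""
    counts = defaultdict(lambda: defaultdict(int))
    for word in words:
        padded = f" {word} "
        for prev, nxt in zip(padded, padded[1:]):
            counts[prev][nxt] += 1
    return counts


def log10_prob(word, counts, alphabet_size=27, alpha=1.0):
    """Sum of log10 transition probabilities (the log of their product).

    alpha is an assumed add-one smoothing constant, not taken from the source.
    """
    padded = f" {word} "
    total = 0.0
    for prev, nxt in zip(padded, padded[1:]):
        row = counts.get(prev, {})
        numerator = row.get(nxt, 0) + alpha
        denominator = sum(row.values()) + alpha * alphabet_size
        total += math.log10(numerator / denominator)
    return total


def log10_ratio(word, chem_counts, word_counts):
    """Positive values favour the chemical model, negative the word model."""
    return log10_prob(word, chem_counts) - log10_prob(word, word_counts)


# Hypothetical toy training data, far smaller than any real corpus.
chem_model = train(["ethanol", "methanol", "propanol", "butanol"])
word_model = train(["booze", "book", "snooze", "boot"])

for candidate in ("ethanol", "booze"):
    print(candidate, round(log10_ratio(candidate, chem_model, word_model), 3))
```

With these toy lists, "ethanol" gets a positive log10 ratio and "booze" a negative one. Working in log space turns the product of transition probabilities into a sum, which avoids numerical underflow for long names.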
