Figure 2 | BMC Bioinformatics

Figure 2

From: Redundancy in electronic health record corpora: analysis, impact on text mining performance and mitigation strategies

Concept-distribution. Distribution of UMLS concept occurrences in corpora with different levels of redundancy. The All Notes (a) and All Informative Notes (b) corpora have inherent redundancy, while the Last Informative Note (c) corpus does not. The shapes of the distributions of concepts differ depending on the presence of redundancy in the corpus.
