Figure 2
From: Redundancy in electronic health record corpora: analysis, impact on text mining performance and mitigation strategies

Concept-distribution. Distribution of UMLS concept occurrences in corpora with different levels of redundancy. The All Notes (a) and All Informative Notes (b) corpora have inherent redundancy, while the Last Informative Note (c) corpus does not. The shapes of the distributions of concepts differ depending on the presence of redundancy in the corpus.