Figure 1
From: Redundancy in electronic health record corpora: analysis, impact on text mining performance and mitigation strategies

Distribution of similarity levels across pairs of same-patient informative notes in the corpus.