Fig. 5 | BMC Bioinformatics

Fig. 5

From: Clustering based approach for population level identification of condition-associated T-cell receptor β-chain CDR3 sequences

Characteristics of the differentially abundant CDR3β sequences in CD PBMC and CD Gut. The differentially enriched CDR3β sequences showed biased usage of TRBV genes reported in previous studies to be over-represented among gluten-reactive CDR3β sequences, such as TRBV07-02 and TRBV09-01 in CD PBMC (a) and TRBV06-01 in CD Gut (b). Observed frequencies are shown in red; mean frequencies from randomly generated sets of CDR3s are shown in blue. Significantly over-used amino acids at each position are shown for the enriched CDR3β sequences that use the TRBV genes detected as over-used in CD PBMC (c) and CD Gut (d); amino acids are colored according to their properties. The information content of the significantly over-used amino acids at each position is shown in bits on the y-axis. TRBV and per-position amino acid over-usage was assessed by comparing the observed frequencies in the set of differentially enriched CDR3s with those obtained by chance in 100 randomly sampled sets of CDR3s of the same size, TRBV gene and CDR3 length, with p < 0.05 considered significant. Gene names indicate TRBV gene::CDR3 length::number of CDR3s in the enriched list with that V gene and CDR3 length. The results obtained using nt 4-mer feature vectors are shown.
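The resampling comparison described in the caption can be illustrated with a minimal sketch. It is not the authors' implementation. The function name `empirical_overuse_pvalue`, the `background_genes` pool, and the add-one p-value correction are all assumptions made for illustration. The sketch covers TRBV gene usage only and draws plain random sets of the same size from a background pool; it does not match on CDR3 length as the per-position amino acid test would.

```python
import random


def empirical_overuse_pvalue(observed_genes, background_genes, gene,
                             n_sets=100, seed=0):
    """Estimate how often random sets use `gene` at least as often as observed.

    observed_genes:   TRBV gene label of each enriched CDR3 (hypothetical input)
    background_genes: TRBV gene label of each CDR3 in the sampling pool
    Returns (observed frequency, mean random frequency, empirical p-value).
    """
    rng = random.Random(seed)
    set_size = len(observed_genes)
    observed_freq = observed_genes.count(gene) / set_size

    random_freqs = []
    at_least_as_extreme = 0
    for _ in range(n_sets):
        # Draw a random set of the same size, without replacement.
        sample = rng.sample(background_genes, set_size)
        freq = sample.count(gene) / set_size
        random_freqs.append(freq)
        if freq >= observed_freq:
            at_least_as_extreme += 1

    mean_random_freq = sum(random_freqs) / n_sets
    # Add-one correction (an assumption) keeps the estimate above zero.
    p_value = (at_least_as_extreme + 1) / (n_sets + 1)
    return observed_freq, mean_random_freq, p_value


if __name__ == "__main__":
    # Toy data: 5% of the background uses TRBV07-02, 50% of the enriched set does.
    background = ["TRBV07-02"] * 50 + ["TRBV20-01"] * 950
    enriched = ["TRBV07-02"] * 10 + ["TRBV20-01"] * 10
    obs, mean_rand, p = empirical_overuse_pvalue(enriched, background, "TRBV07-02")
    print(f"observed={obs:.2f} mean_random={mean_rand:.2f} p={p:.3f}")
```

In this sketch the observed frequency corresponds to the red values and the mean random frequency to the blue values in panels a and b. With 100 random sets, the smallest p-value this estimator can report is 1/101.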
