Inferring chromosome radial organization from Hi-C data

Background: The nonrandom radial organization of eukaryotic chromosome territories (CTs) inside the nucleus plays an important role in nuclear functional compartmentalization. Increasingly, chromosome conformation capture (Hi-C) based approaches are being used to characterize the genome structure of many cell types and conditions. Computational methods to extract 3D arrangements of CTs from this type of pairwise contact data will thus increase our ability to analyze CT organization in a wider variety of biological situations.

Results: A number of full-scale polymer models have successfully reconstructed the 3D structure of chromosome territories from Hi-C. To supplement such methods, we explore alternative, direct, and less computationally intensive approaches to capture radial CT organization from Hi-C data. We show that we can infer relative chromosome ordering using PCA on a thresholded inter-chromosomal contact matrix. We simulate an ensemble of possible CT arrangements using a force-directed network layout algorithm and propose an approach to integrate additional chromosome properties into our predictions. Our CT radial organization predictions correlate highly with microscopy imaging data for various cell nucleus geometries (lymphoblastoid, skin fibroblast, and breast epithelial cells), and we can capture previously documented changes in senescent and progeria cells.

Conclusions: Our analysis provides rapid and modular approaches to screen for alterations in CT organization across widely available Hi-C data. We demonstrate which stages of the approach can extract meaningful information, and also describe the limitations of using pairwise contacts alone to predict absolute 3D positions.
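As a rough illustration of the "PCA on a thresholded inter-chromosomal contact matrix" step, the sketch below binarizes a toy chromosome-by-chromosome matrix and orders chromosomes by their score on the first principal component. It is a minimal sketch under stated assumptions, not the procedure used in this work: the function names (binarize, pc1_scores), the fixed cutoff, and the toy data are all hypothetical, and the first component is found by power iteration only to keep the example free of external libraries.

```python
import math
import random


def binarize(contacts, cutoff):
    """Mark off-diagonal chromosome pairs whose contact value reaches the cutoff.

    The cutoff is a hypothetical stand-in for whatever thresholding rule is used.
    """
    n = len(contacts)
    return [[1.0 if i != j and contacts[i][j] >= cutoff else 0.0
             for j in range(n)] for i in range(n)]


def pc1_scores(matrix, iterations=200, seed=0):
    """Project each row onto the first principal component (power iteration)."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    # Center every column so the covariance matrix is well defined.
    means = [sum(row[j] for row in matrix) / n_rows for j in range(n_cols)]
    centered = [[row[j] - means[j] for j in range(n_cols)] for row in matrix]
    cov = [[sum(r[a] * r[b] for r in centered) / (n_rows - 1)
            for b in range(n_cols)] for a in range(n_cols)]
    # Power iteration converges to the leading eigenvector of the covariance.
    rng = random.Random(seed)
    vec = [rng.gauss(0.0, 1.0) for _ in range(n_cols)]
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * vec[b] for b in range(n_cols))
               for a in range(n_cols)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0.0:
            break
        vec = [v / norm for v in nxt]
    return [sum(r[j] * vec[j] for j in range(n_cols)) for r in centered]


# Toy data (made up): chromosomes 0-2 contact each other strongly, as do 3-5.
toy = [
    [0, 10, 10, 1, 1, 1],
    [10, 0, 10, 1, 1, 1],
    [10, 10, 0, 1, 1, 1],
    [1, 1, 1, 0, 10, 10],
    [1, 1, 1, 10, 0, 10],
    [1, 1, 1, 10, 10, 0],
]
scores = pc1_scores(binarize(toy, cutoff=5))
# Relative ordering of chromosomes along PC1; the sign of PC1 is arbitrary.
ordering = sorted(range(len(scores)), key=scores.__getitem__)
```

Because the sign of a principal component is arbitrary, the ordering only fixes relative position along the axis. Deciding which end corresponds to the nuclear center requires outside information, such as the additional chromosome properties mentioned above.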


This file includes:
Supplementary Figure 1 Radial CT arrangement of GM12878 (FitHiC). a) Pairwise inter-chromosomal significant interaction pattern matrix derived from FitHiC for GM12878 Hi-C data. b) 2D PCA projection of the FitHiC pairwise inter-chromosomal significant interaction pattern matrix obtained from GM12878 Hi-C data. c) Network modeling generated model cluster for FitHiC GM12878 contacts, selected based on inferred CT distribution type. d) Correlation between predicted tuned CT distance from GM12878 (FitHiC) and lymphoblastoid microscopy imaging data. f) Correlation between predicted tuned CT distances obtained from GM12878 (FitHiC) and GM12878 (strong interactions).
Supplementary Figure 2 Radial CT arrangements inferred from GM12878 and BJ1-hTERT simulated random ligation data. a) Network modeling cluster for GM12878 simulated random ligation, selected based on inferred CT distribution type. b) Correlation between predicted tuned CT distance from GM12878 simulated random ligation Hi-C data and lymphoblastoid microscopy imaging data. c) Pairwise inter-chromosomal strong interaction pattern matrix for BJ1-hTERT simulated random ligation Hi-C data. d) 2D PCA projection of the pairwise inter-chromosomal strong interaction pattern matrix obtained from BJ1-hTERT simulated random ligation Hi-C data. e) Network modeling cluster for BJ1-hTERT simulated random ligation, selected based on inferred CT distribution type. f) Correlation between predicted tuned CT distance from BJ1-hTERT simulated random ligation Hi-C data and fibroblast microscopy imaging data.
Supplementary Figure 3 Network modeling generated clusters for GM12878. For each cluster, the Pearson correlations of mean radial CT distances with gene density (GD) and chromosome length (LN) are shown at the top of the cluster. The number of models in each cluster is indicated in parentheses above each cluster.
Supplementary Figure 4 The radial distance distributions of 23 CTs for both GM12878 and BJ1-hTERT obtained from the respective selected model clusters based on inferred CT distribution types (# of models in the selected clusters: GM12878: 109, and BJ1-hTERT: 122). All distributions are ordered left to right from center to periphery.
Supplementary Figure 5 Comparison of network modeling generated radial CT distance profiles with experimentally obtained distributions. a-b) The radial distance profiles of CT18 and CT19 (a top) and short and long chromosomes (b top) for GM12878 and BJ1-hTERT obtained from the network model clusters in the current work. These distributions are similar to the radial arrangement previously measured by microscopy for CT18 and CT19 (a bottom) and short and long chromosomes (b bottom) in lymphocyte (2D arrangement) and fibroblast (3D arrangement) nuclei. The bottom parts of panels (a-b) are adapted from Cremer et al. (2001). Copyright Springer Nature. Used with permission. c) Top: The radial distance profiles of CT10 and CTX for BJ1-hTERT obtained from our network model cluster. Bottom: The distance distribution of CT10 and CTX from the nuclear periphery in human dermal fibroblasts obtained using 3D FISH. The bottom part of panel (c) is created with data obtained from Mehta et al. (2010). d) Top: The radial distance profiles of CT4 for GM12878 obtained from our network modeling cluster. Bottom: Radial distance profiles of the AF4 gene, which is located on chr4, from the nucleus center in two lymphoblastic cell lines, NALM-6 (black) and IL-9 (grey). The bottom part of panel (d) is adapted from Gué et al. (2005), copyright John Wiley and Sons, used with permission.
Supplementary Figure 7 Radial arrangements of BJ1-hTERT and BJ-5ta follow similar CT distribution patterns. a) Pairwise inter-chromosomal strong interaction pattern matrix for BJ-5ta Hi-C data. b) 2D PCA projection of the pairwise inter-chromosomal strong interaction pattern matrix obtained from BJ-5ta Hi-C data. c) Network modeling generated model cluster for BJ-5ta, selected based on respective inferred CT distribution type (# of models in the selected cluster: 115). d) Correlation of BJ1-hTERT PC1 values with BJ-5ta PC1 values obtained from the PCA transformation of the respective pairwise inter-chromosomal strong interaction pattern matrices. e) Correlation between predicted tuned CT distance from BJ-5ta Hi-C data and fibroblast microscopy imaging data. f) Correlation between predicted tuned CT distances obtained from BJ1-hTERT and BJ-5ta.
Supplementary Figure 8 Radial arrangements of GM12878 replicates R1 and R2 show similar CT distribution patterns. a) Pairwise inter-chromosomal strong interaction pattern matrix for GM12878 R1 Hi-C data (R2 used in main figures). b) 2D PCA projection of the pairwise inter-chromosomal strong interaction pattern matrix obtained from GM12878 R1 Hi-C data. c) Network modeling generated model cluster for GM12878 R1, selected based on respective inferred CT distribution type (# of models in the selected cluster: 142). d) Correlation of GM12878 R2 PC1 values with GM12878 R1 PC1 values obtained from the PCA transformation of the respective pairwise inter-chromosomal strong interaction pattern matrices. e) Correlation between predicted tuned CT distance from GM12878 R1 Hi-C data and lymphoblastoid microscopy imaging data. f) Correlation between predicted tuned CT distances obtained from GM12878 R2 and GM12878 R1.
Supplementary Figure 9 The radial distance profiles of 23 CTs for GM12878 R1, GM12878 R2, BJ1-hTERT and BJ-5ta obtained from the respective selected model clusters based on inferred CT distribution types (# of models in the selected clusters: GM12878 R2: 109, GM12878 R1: 142, BJ1-hTERT: 122, and BJ-5ta: 115).
Supplementary Figure 10 The radial distance profiles of 23 CTs for GM12878 standard, paternal, and maternal copies obtained from the respective selected model clusters based on inferred CT distribution types (# of models in the selected clusters: GM12878 standard: 109, paternal copy: 194, and maternal copy: 148).
Supplementary Figure 11 Statistical comparisons of the CT distance profiles obtained from network modeling for GM12878 and BJ1-hTERT for different sample sizes n_s (number of network structures generated from Hi-C data) using the two-sided Mann-Whitney U test.
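For readers unfamiliar with the test named in Supplementary Figure 11, the following is a minimal sketch of a two-sided Mann-Whitney U comparison between two sets of radial distances. It rests on several assumptions: the function name mann_whitney_u and the sample values are hypothetical, and the p-value uses the normal approximation with a tie correction and no continuity correction, which may differ from the exact settings used for the figure. In practice a library routine such as scipy.stats.mannwhitneyu with alternative="two-sided" would normally be used instead.

```python
import math


def mann_whitney_u(x, y):
    """Return (U statistic for x, two-sided p-value) via the normal approximation."""
    n1, n2 = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        # Find the run of tied values and give each the average rank.
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2.0
        for k in range(i, j):
            ranks[k] = average_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    n = n1 + n2
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0.0:
        return u_x, 1.0  # all values identical: no evidence of a difference
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return u_x, p_two_sided


# Made-up radial distances of one CT in two sets of network models.
models_a = [0.21, 0.25, 0.30, 0.33, 0.35]
models_b = [0.52, 0.55, 0.61, 0.64, 0.70]
u_stat, p_value = mann_whitney_u(models_a, models_b)
```

Repeating such a comparison for increasing numbers of generated structures n_s is one way to check whether the conclusion drawn from the distance profiles is stable with respect to sample size.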