Table 1. F1 scores from 10-times-repeated 5-fold cross-validation in five 1000 Genomes Project superpopulations using SVM, PCA or GTM

From: Probabilistic ancestry maps: a method to assess and visualize population substructures in genetics

| Ancestry          | 1000G code | PCA 8-NN  | SVM 10 PCs | GTM 3 PCs | GTM 10 PCs |
|-------------------|------------|-----------|------------|-----------|------------|
| Africans          | AFR        | 1.00±0.00 | 1.00±0.00  | 1.00±0.00 | 1.00±0.00  |
| Admixed Americans | AMR        | 0.93±0.00 | 1.00±0.00  | 1.00±0.00 | 1.00±0.00  |
| East Asians       | EAS        | 1.00±0.00 | 1.00±0.00  | 1.00±0.00 | 1.00±0.00  |
| Europeans         | EUR        | 0.99±0.00 | 1.00±0.00  | 1.00±0.00 | 1.00±0.00  |
| South Asians      | SAS        | 0.93±0.01 | 1.00±0.00  | 1.00±0.00 | 1.00±0.00  |
| Overall F1 score  |            | 0.98±0.00 | 1.00±0.00  | 1.00±0.00 | 1.00±0.00  |
  1. SVM10 = support vector machine classification model using 10 principal components; PCA = k-nearest neighbours model based on the 2D PCA map (k = 7); GTM{3,10,100} = Bayesian classification model based on generative topographic mapping using 3, 10 or 100 principal components. Each value is a mean with its 95% confidence interval.
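
As an illustration of how an entry such as 0.93±0.01 could be obtained, the sketch below computes a per-class F1 score and then summarises repeated cross-validation scores as a mean with a 95% confidence interval. The source does not state how the interval was computed, so the normal approximation (z = 1.96) is an assumption, and the function names `f1_per_class` and `mean_ci95`, the toy labels and the toy scores are all hypothetical.

```python
import math
import statistics


def f1_per_class(y_true, y_pred, label):
    """F1 score for one class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: F1 is taken as 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def mean_ci95(scores):
    """Mean and half-width of a 95% CI.

    Assumption: normal approximation (z = 1.96); the source does not
    say which interval estimator was used.
    """
    mean = statistics.mean(scores)
    if len(scores) < 2:
        return mean, 0.0
    sem = statistics.stdev(scores) / math.sqrt(len(scores))
    return mean, 1.96 * sem


# Toy labels (hypothetical), using the superpopulation codes from the table.
y_true = ["AFR", "AMR", "AMR", "EUR", "SAS", "SAS"]
y_pred = ["AFR", "AMR", "EUR", "EUR", "SAS", "AMR"]
amr_f1 = f1_per_class(y_true, y_pred, "AMR")

# Toy per-repeat F1 scores (hypothetical), summarised as mean ± half-width.
mean, half_width = mean_ci95([0.92, 0.94, 0.93, 0.93])
print(f"AMR F1 on toy labels: {amr_f1:.2f}")
print(f"Summary: {mean:.2f}±{half_width:.2f}")
```

In a real analysis, the list passed to `mean_ci95` would hold one score per fold or per repeat of the cross-validation. With few repeats, a Student t interval would be wider than this normal approximation.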