
Table 1 F1 scores from 10 times repeated 5-fold cross-validation in five 1000 Genomes Project superpopulations using SVM, PCA or GTM

From: Probabilistic ancestry maps: a method to assess and visualize population substructures in genetics

| Ancestry | 1000G code | PCA 8-NN | SVM 10 PCs | GTM 3 PCs | GTM 10 PCs |
| --- | --- | --- | --- | --- | --- |
| Africans | AFR | 1.00±0.00 | 1.00±0.00 | 1.00±0.00 | 1.00±0.00 |
| Admixed Americans | AMR | 0.93±0.00 | 1.00±0.00 | 1.00±0.00 | 1.00±0.00 |
| East Asians | EAS | 1.00±0.00 | 1.00±0.00 | 1.00±0.00 | 1.00±0.00 |
| Europeans | EUR | 0.99±0.00 | 1.00±0.00 | 1.00±0.00 | 1.00±0.00 |
| South Asians | SAS | 0.93±0.01 | 1.00±0.00 | 1.00±0.00 | 1.00±0.00 |
| Overall F1 score | | 0.98±0.00 | 1.00±0.00 | 1.00±0.00 | 1.00±0.00 |

  1. SVM10 = support vector machine classification model using 10 principal components; PCA = k-nearest neighbours model based on a 2D PCA map (k = 7); GTM{3,10,100} = Bayesian classification model based on generative topographic mapping using 3, 10 or 100 principal components. Each value is the mean ± 95% confidence interval.
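
The evaluation protocol named in the caption (10 repeats of 5-fold cross-validation, F1 score, mean with 95% confidence interval) can be sketched as follows. This is a minimal illustration and not the authors' code: the data are synthetic 2D points standing in for a PCA map, the labels "A", "B" and "C" are hypothetical, the overall score is assumed to be a macro average over classes, and the confidence interval is assumed to be a normal approximation over the 50 fold scores (which treats the folds as independent). The function names are invented for this example.

```python
import math
import random
import statistics
from collections import Counter, defaultdict


def stratified_folds(labels, n_folds, rng):
    """Split sample indices into n_folds folds, keeping class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % n_folds].append(idx)
    return folds


def knn_predict(train_x, train_y, point, k):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], point))
    return Counter(train_y[i] for i in order[:k]).most_common(1)[0][0]


def f1_per_class(y_true, y_pred, label):
    """F1 = 2TP / (2TP + FP + FN) for one class; 0.0 if the class never occurs."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def repeated_cv_f1(x, y, k=7, n_folds=5, n_repeats=10, seed=0):
    """Return (mean, 95% CI half-width) of macro F1 over all repeats and folds."""
    rng = random.Random(seed)
    classes = sorted(set(y))
    scores = []
    for _ in range(n_repeats):
        for test_idx in stratified_folds(y, n_folds, rng):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(y)) if i not in held_out]
            train_x = [x[i] for i in train_idx]
            train_y = [y[i] for i in train_idx]
            y_true = [y[i] for i in test_idx]
            y_pred = [knn_predict(train_x, train_y, x[i], k) for i in test_idx]
            scores.append(statistics.mean(f1_per_class(y_true, y_pred, c) for c in classes))
    mean = statistics.mean(scores)
    # Assumption: normal-approximation interval over the fold scores.
    half_width = 1.96 * statistics.stdev(scores) / math.sqrt(len(scores))
    return mean, half_width


# Synthetic stand-in for a 2D PCA map: three Gaussian clusters (hypothetical data).
data_rng = random.Random(42)
centres = {"A": (0.0, 0.0), "B": (10.0, 0.0), "C": (0.0, 10.0)}
points, labels = [], []
for name, (cx, cy) in centres.items():
    for _ in range(30):
        points.append((data_rng.gauss(cx, 1.0), data_rng.gauss(cy, 1.0)))
        labels.append(name)

mean_f1, ci = repeated_cv_f1(points, labels)
print(f"macro F1 = {mean_f1:.2f}±{ci:.2f}")
```

The default `k=7` follows the footnote; the column header reads 8-NN, so the value actually used should be checked against the paper before reusing this sketch.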