
Table 3. Training data: summary of the labels extracted using the heuristic (regular-expression-based) approach, which were supplied to the machine learning algorithm as training data

From: ALE: automated label extraction from GEO metadata

 

| | Gender | Age | Tissue |
|---|---|---|---|
| Samples Annotated | 441,311 | 299,878 | 861,703 |
| % Samples Containing Label | 51.2% | 34.8% | 100.0% |
| Most frequent label (mean for age) | Female (50.3%) | 51.0 years | Blood (12.5%) |
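
The percentage row can be cross-checked against the sample counts with a short calculation. The sketch below assumes the denominator is the tissue count (861,703), inferred only from the fact that tissue coverage is reported as 100.0%; the table itself does not state the total. The names `annotated`, `total_samples`, and `coverage_percent` are illustrative and do not come from the paper.

```python
# Counts copied from Table 3.
annotated = {"Gender": 441_311, "Age": 299_878, "Tissue": 861_703}

# Assumption: the total number of samples equals the tissue count,
# because the table reports 100.0% coverage for tissue.
total_samples = annotated["Tissue"]


def coverage_percent(count, total):
    """Return the share of samples carrying a label, as a percentage rounded to one decimal."""
    return round(100.0 * count / total, 1)


coverage = {label: coverage_percent(n, total_samples) for label, n in annotated.items()}

for label, pct in coverage.items():
    print(f"{label}: {pct:.1f}%")
```

Under that assumption, the computed values agree with the percentages reported in the table for all three label types.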