
Table 2 Select-Train Function

From: A specialized learner for inferring structured cis-regulatory modules

SELECT-TRAIN(trainset, tuneset, aspects, phases, metric, K)

1 CRM ← TRAIN(trainset, aspects, phases, metric, K)

2 repeat

3     unjustified_aspects ← { }

4     for aspect ∈ aspects

5        alt_CRM ← TRAIN(trainset, aspects − {aspect}, phases, metric, K)

6        if there is not a sufficiently low χ² test probability that the tuneset predictions of CRM, alt_CRM

7           are from the same distribution or CRM scores better on tuneset than alt_CRM

8           then unjustified_aspects ← unjustified_aspects ∪ {aspect}

9     aspects ← highest scoring set resulting from removing one of unjustified_aspects based on tuneset

10     CRM ← alt_CRM associated with these aspects

11 until unjustified_aspects is empty

12 final_CRM ← TRAIN(trainset + tuneset, aspects, phases, metric, K)

13 return final_CRM

  1. The Select-Train algorithm takes trainset, a set of labeled DNA sequences; tuneset, held-aside evaluation data; aspects, a list of CRM aspects to consider; and phases, metric, and K, which are arguments to the Train algorithm. It removes from the original list those aspects that are statistically shown (using the tuning set) not to contribute. Finally, it returns a CRM trained on all the data (training and tuning sets combined), using the CRM aspects chosen.
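
The backward-elimination loop described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `train`, `score`, and `is_unjustified` are hypothetical callables supplied by the caller, with `phases`, `metric`, and `K` assumed to be already bound inside `train`, and the χ² comparison of steps 6 and 7 assumed to be wrapped inside `is_unjustified`. The toy model and scoring function at the end are invented purely to exercise the loop.

```python
from typing import Any, Callable, Dict, FrozenSet, Iterable, Sequence


def select_train(
    trainset: Sequence[Any],
    tuneset: Sequence[Any],
    aspects: Iterable[str],
    train: Callable[[Sequence[Any], FrozenSet[str]], Any],
    score: Callable[[Any, Sequence[Any]], float],
    is_unjustified: Callable[[Any, Any, Sequence[Any]], bool],
) -> Any:
    """Drop aspects one at a time while the tuning set fails to justify them."""
    current = frozenset(aspects)
    crm = train(trainset, current)
    while True:
        # Retrain once per aspect with that aspect left out (steps 4-5),
        # keeping the reduced models whose missing aspect was unjustified.
        candidates: Dict[FrozenSet[str], Any] = {}
        for aspect in current:
            reduced = current - {aspect}
            alt_crm = train(trainset, reduced)
            if is_unjustified(crm, alt_crm, tuneset):
                candidates[reduced] = alt_crm
        if not candidates:
            break  # every remaining aspect is justified (step 11)
        # Keep the best-scoring reduced aspect set and its model (steps 9-10).
        current, crm = max(candidates.items(), key=lambda kv: score(kv[1], tuneset))
    # Retrain on all labeled data with the surviving aspects (step 12).
    return train(list(trainset) + list(tuneset), current)


# Toy stand-ins (hypothetical): a "model" is just (aspect set, number of examples).
USEFUL = {"order"}


def toy_train(data, aspect_set):
    return (frozenset(aspect_set), len(data))


def toy_score(model, data):
    kept = model[0]
    # Reward useful aspects, lightly penalize useless ones.
    return len(kept & USEFUL) - 0.1 * len(kept - USEFUL)


def toy_is_unjustified(crm, alt_crm, data):
    # Stand-in for the statistical test: dropping the aspect did not hurt.
    return toy_score(alt_crm, data) >= toy_score(crm, data)


final_crm = select_train(
    ["s1", "s2", "s3", "s4"],
    ["t1", "t2"],
    ["order", "distance", "strand"],
    toy_train,
    toy_score,
    toy_is_unjustified,
)
print(final_crm)
```

Passing the test as a callable keeps the sketch neutral about exactly how the χ² probability and the tuning-set score are combined, since the pseudocode leaves some room for interpretation on that point.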