Table 1 Description of training and test datasets

From: A question-entailment approach to question answering

| Datasets | Type/Domain | # pairs | Positive Examples (Entailment/Similarity) |
| --- | --- | --- | --- |
| SNLI (2015) | Inference pairs of open-domain sentences. | 550,152 (train) | PS: A child in a light and dark green ensemble sits in a chair in front of a typewriter looking off-camera. HS: A child sitting in front of a desk. |
| MultiNLI (2017) | Inference pairs of open-domain sentences. | 392,702 (train) | PS: On the island of the Giudecca, you’ll find another of the great Palladio-designed churches (one of two in Venice), the Redentore. HS: There are two church in Venice that were designed by Palladio. |
| SemEval-cQA (2016) | Similar questions from the Qatar Living forum. | 3,169 (train) | PQ: Books. Where can i donate books? HQ: english books. Where to buy english books? Is there a public library in doha? thanks |
| Clinical-QE (2016) | Entailment pairs of questions asked by doctors. | 8,588 | PQ: Patient is reluctant to take medications so I have been treating with smaller doses than I would with some other patients. How do I control her hypertension and still get her cooperation? HQ: Patient reluctant to take medication. How to control hypertension and still get her cooperation? |
| Quora (2017) | Open-domain question similarity pairs. | 404,279 | PQ: I’ve been working out in the gym for the last three months but I’m not successful in gaining weight. Should I go for a mass gainer? Is it safe? HQ: I have been working out from few months but I am unable to gain mass/weight.Which mass gainer should I take? |
| New Test Data (CHQs) | Entailment pairs of consumer health questions. | 850 | PQ: IHSS heart condition and WPW heart condition. Is there any way you could send me information on both these heart conditions? My son has to get tested for them eventually and I would just like information to understand the conditions of both of them more. HQ: What is Wolff-Parkinson-White syndrome ? |