************************************************************************

   DATASETS ASSOCIATED WITH COLING-2014 PAPER

   "Empirical Analysis of Aggregation Methods for Collective Annotation"

   by Ciyang Qing, Ulle Endriss, Raquel Fernandez and Justin Kruger

************************************************************************

This directory contains the datasets on linguistic judgments collected
via Amazon's Mechanical Turk (AMT) described and analysed in the paper:

(1) Recognising Textual Entailment (RTE)
(2) Preposition Sense Disambiguation (PSD)
(3) Question Dialogue Acts (QDA)

The first annotation dataset (RTE) was collected by Snow et al. (2008);
its gold standard annotation was created by Dagan et al. (2006). The
other two annotation datasets are new. The PSD gold standard was
created by Litkowski and Hargraves (2007) and the QDA gold standard by
Jurafsky et al. (1997).

************************************************************************
SUMMARY OF BASIC PARAMETERS
************************************************************************

The following table provides an overview of some of the basic
parameters of the three datasets:

                          RTE     PSD     QDA

  #categories               2       3       4
  #items                  800     150     300
  #annotators             164      45      63
  #annotators/item         10      10      10
  #items/HIT               20      15      10

************************************************************************
FILES ASSOCIATED WITH DATASETS
************************************************************************

For each of the three datasets (RTE, PSD, QDA) we provide two files:

(1) A CSV file with the annotations collected via AMT. Each row in the
    file corresponds to one individual annotation. There are four
    columns:

    - the AMT Worker ID (Annotator)
    - the ID of the data example (Item)
    - the worker label (Category)
    - the gold standard label (Gold)

    (A minimal Python sketch for loading these CSV files is given at
    the end of this README.)

(2) An additional file with the items themselves. In the case of RTE,
    this is an XML file distributed via the PASCAL RTE Challenge
    website (http://pascallin.ecs.soton.ac.uk/Challenges/RTE/). In the
    case of PSD and QDA, these are HTML files we have generated from
    the gold standard annotations mentioned above.

************************************************************************
REFERENCES
************************************************************************

Ido Dagan, Oren Glickman, and Bernardo Magnini. 2006. The PASCAL
recognising textual entailment challenge. In Machine Learning
Challenges, volume 3944 of LNCS, pages 177-190. Springer-Verlag.

Dan Jurafsky, Elizabeth Shriberg, and Debra Biasca. 1997. Switchboard
SWBD-DAMSL shallow-discourse-function-annotation coder's manual.
Technical Report TR 97-02, Institute for Cognitive Science, University
of Colorado at Boulder.

Kenneth C. Litkowski and Orin Hargraves. 2007. SemEval-2007 Task 06:
Word-Sense Disambiguation of Prepositions. In Proc. 4th International
Workshop on Semantic Evaluations (SemEval-2007).

Rion Snow, Brendan O'Connor, Daniel Jurafsky, and Andrew Y. Ng. 2008.
Cheap and fast---but is it good? Evaluating non-expert annotations for
natural language tasks. In Proc. Conference on Empirical Methods in
Natural Language Processing (EMNLP-2008), pages 254-263.

************************************************************************
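EXAMPLE: LOADING THE ANNOTATION FILES
************************************************************************

The sketch below shows one possible way to load an annotation CSV file
and compute simple plurality (majority) labels per item, as a starting
point for experimenting with aggregation methods. It is only a sketch:
the file name "rte.csv" and the assumption that the CSV files carry a
header row with the column names Annotator, Item, Category and Gold
are illustrative and may need to be adjusted to the actual files in
this directory.

  # Minimal loading sketch (Python). Assumes a header row with the
  # columns Annotator, Item, Category, Gold; the file name "rte.csv"
  # is only an example.
  import csv
  from collections import Counter, defaultdict

  votes = defaultdict(Counter)   # item ID -> Counter of worker labels
  gold = {}                      # item ID -> gold standard label

  with open("rte.csv", newline="") as f:
      for row in csv.DictReader(f):
          votes[row["Item"]][row["Category"]] += 1
          gold[row["Item"]] = row["Gold"]

  # Plurality aggregation: pick the most frequent worker label per item
  # (ties broken arbitrarily by Counter.most_common).
  plurality = {item: counts.most_common(1)[0][0]
               for item, counts in votes.items()}

  # Compare the aggregated labels against the gold standard.
  accuracy = sum(plurality[i] == gold[i] for i in gold) / len(gold)
  print("Plurality accuracy against gold: %.3f" % accuracy)

************************************************************************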