Demo of the Aggregators

The file aggregators.R includes implementations of the aggregators proposed in the paper, together with some variations.

The file evaluation.R contains functions that are used to evaluate the results against the gold standard.

source("./aggregators.R")
source("./evaluation.R")

(Please make sure the auxiliary file aux.R is in the same directory.)

Input

Each aggregator takes as input a data frame which should have 3 columns named Annotator, Item, and Category. If evaluation is needed (or the oracle aggregator ORA is used), then the data frame should have an additional column named Gold.
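Before calling an aggregator, it can help to confirm that the data frame has the expected columns. The sketch below is only an illustration: the helper check_annotation_columns and the toy data are hypothetical and are not part of aggregators.R or evaluation.R.

```r
# Hypothetical helper (not part of aggregators.R): verify the column names
# described above before handing a data frame to an aggregator.
check_annotation_columns <- function(df, need.gold = FALSE) {
  required <- c("Annotator", "Item", "Category")
  if (need.gold) required <- c(required, "Gold")
  missing.cols <- setdiff(required, names(df))
  if (length(missing.cols) > 0) {
    stop("Missing column(s): ", paste(missing.cols, collapse = ", "))
  }
  invisible(TRUE)
}

# Toy data, made up for illustration only.
toy <- data.frame(
  Annotator = c("a1", "a2", "a3", "a1", "a2", "a3"),
  Item      = c("i1", "i1", "i1", "i2", "i2", "i2"),
  Category  = c("1", "1", "2", "3", "3", "3"),
  Gold      = c("1", "1", "1", "3", "3", "3"),
  stringsAsFactors = FALSE
)

# Gold is required for evaluation or for the oracle aggregator.
check_annotation_columns(toy, need.gold = TRUE)
```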

For example, below is the Question Dialogue Acts (QDA) dataset used in the paper.

qda <- read.csv("./QDA-annotations.csv", colClasses = "character")
head(qda, n = 11)

##         Annotator                   Item Category Gold
## 1   AQR70X1KIOP6B sw_0165_4079_A_59_utt2        2    2
## 2   AOOJY0XKNYJYZ sw_0165_4079_A_59_utt2        2    2
## 3  A3A4P6BWFE7HFN sw_0165_4079_A_59_utt2        2    2
## 4   AQ36OBO2GRA00 sw_0165_4079_A_59_utt2        2    2
## 5  A220XWTPR73XXE sw_0165_4079_A_59_utt2        2    2
## 6   AMSWIR4J9XEIJ sw_0165_4079_A_59_utt2        2    2
## 7    AAJZN4PRAHGG sw_0165_4079_A_59_utt2        2    2
## 8  A3NXXKUNDI4NJ0 sw_0165_4079_A_59_utt2        2    2
## 9  A2YFPO0N4GIS25 sw_0165_4079_A_59_utt2        2    2
## 10 A2WNW8A4MOR7T7 sw_0165_4079_A_59_utt2        2    2
## 11  AQR70X1KIOP6B  sw_0011_4358_A_1_utt2        2    2

Use of Aggregators

There are 6 aggregators: SPR, COM, INV, DIFF, RAT, AGR (with 2 additional variations AGR.PRIOR and AGR.ITER).

The result of applying an aggregator is a named vector showing the collective annotation for each item. For instance, below are some results for the Simple Plurality Rule (SPR).

qda.spr <- SPR(qda)
head(qda.spr)

## sw_0001_4325_A_25_utt1 sw_0004_4327_B_32_utt2 sw_0008_4321_B_12_utt4 
##                    "1"                    "1"                    "3" 
##  sw_0011_4358_A_1_utt2  sw_0029_4152_A_1_utt4 sw_0030_4166_A_68_utt1 
##                    "2"                    "2"                    "2"

As a sanity check, we know from above that all annotators chose category 2 for item sw_0165_4079_A_59_utt2.

qda.spr["sw_0165_4079_A_59_utt2"]

## sw_0165_4079_A_59_utt2 
##                    "2"

Sure enough, SPR outputs category 2 for that item.
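To make the idea of a plurality vote concrete, here is a minimal base-R sketch of what such a rule might compute. It is not the code in aggregators.R: the function name plurality_sketch, the toy data, and the tie-breaking rule (the category that sorts first wins) are all assumptions made for illustration.

```r
# Sketch of a plurality vote; NOT the SPR implementation in aggregators.R.
plurality_sketch <- function(df) {
  # Group the chosen categories by item.
  votes <- split(df$Category, df$Item)
  # For each item, pick the most frequent category. which.max() returns the
  # first maximum, so ties go to the category that sorts first (an assumption
  # of this sketch).
  sapply(votes, function(v) names(which.max(table(v))))
}

# Toy data, made up for illustration only.
toy <- data.frame(
  Annotator = c("a1", "a2", "a3", "a1", "a2", "a3"),
  Item      = c("i1", "i1", "i1", "i2", "i2", "i2"),
  Category  = c("1", "1", "2", "3", "3", "3"),
  stringsAsFactors = FALSE
)

toy.plurality <- plurality_sketch(toy)
toy.plurality
```

As with the aggregators, the result is a named vector with one collective annotation per item.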

Evaluation

We can use the functions in evaluation.R to extract the gold standard and calculate the observed agreement between an aggregator and the gold standard:

qda.gold <- ReadGold(qda)
head(qda.gold)

## sw_0001_4325_A_25_utt1 sw_0004_4327_B_32_utt2 sw_0008_4321_B_12_utt4 
##                    "1"                    "1"                    "3" 
##  sw_0011_4358_A_1_utt2  sw_0029_4152_A_1_utt4 sw_0030_4166_A_68_utt1 
##                    "2"                    "2"                    "2"

ObservedAgreement(qda.spr, qda.gold)

## [1] 0.8567
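Observed agreement is usually understood as the proportion of items on which two annotations coincide. Assuming that reading, the computation can be sketched as follows. The function agreement_sketch and the toy vectors are hypothetical, and the actual ObservedAgreement in evaluation.R may differ in detail.

```r
# Sketch of observed agreement as the share of matching items;
# NOT the ObservedAgreement function from evaluation.R.
agreement_sketch <- function(pred, gold) {
  # Compare only items present in both named vectors.
  items <- intersect(names(pred), names(gold))
  mean(pred[items] == gold[items])
}

# Toy named vectors, made up for illustration only.
toy.pred <- c(i1 = "1", i2 = "3", i3 = "2")
toy.gold <- c(i1 = "1", i2 = "3", i3 = "1")

# Two of the three items match.
toy.agreement <- agreement_sketch(toy.pred, toy.gold)
toy.agreement
```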

(If the oracle aggregator ORA is used, please make sure evaluation.R has been sourced, since ORA needs the ReadGold function to access the gold standard.)

Optional Arguments

The other aggregators are called in much the same way, but some of them accept optional arguments.

For COM and the AGR family (including ORA), the number of categories K defaults to the number of distinct categories that annotators assigned across all items. This undercounts the annotation scheme if some category was never used by any annotator, although in that case excluding the unused category is sometimes reasonable. In any case, K can be specified explicitly, e.g.,

qda.com <- COM(qda, K = 4)
head(qda.com)

## sw_0001_4325_A_25_utt1 sw_0004_4327_B_32_utt2 sw_0008_4321_B_12_utt4 
##                    "1"                    "1"                    "3" 
##  sw_0011_4358_A_1_utt2  sw_0029_4152_A_1_utt4 sw_0030_4166_A_68_utt1 
##                    "2"                    "2"                    "2"

ObservedAgreement(qda.com, qda.gold)

## [1] 0.87
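To see when the default K and the size of the annotation scheme can diverge, one can count the categories that actually occur in the data. The snippet below is a sketch on made-up data: it assumes a four-category scheme in which category 3 was never chosen, and whether aggregators.R derives its default in exactly this way is an assumption.

```r
# Toy data, made up for illustration only: the scheme is assumed to have
# 4 categories, but no annotator ever chose category "3".
toy <- data.frame(
  Annotator = c("a1", "a2", "a1", "a2", "a1", "a2"),
  Item      = c("i1", "i1", "i2", "i2", "i3", "i3"),
  Category  = c("1", "1", "2", "4", "4", "4"),
  stringsAsFactors = FALSE
)

# Number of distinct categories observed in the annotations.
k.observed <- length(unique(toy$Category))

# Size of the annotation scheme (an assumption of this example).
k.scheme <- 4

c(observed = k.observed, scheme = k.scheme)
```

When the two numbers differ, passing K explicitly makes the intended choice visible in the call.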

In addition, for AGR.ITER, the maximum number of iterations defaults to 50, but it can be changed via iter.max, e.g.,

qda.agrIter <- AGR.ITER(qda, K = 4, iter.max = 10)

## [1] "Converged after 1 iterations"

head(qda.agrIter)

## sw_0001_4325_A_25_utt1 sw_0004_4327_B_32_utt2 sw_0008_4321_B_12_utt4 
##                    "1"                    "1"                    "3" 
##  sw_0011_4358_A_1_utt2  sw_0029_4152_A_1_utt4 sw_0030_4166_A_68_utt1 
##                    "2"                    "2"                    "2"

ObservedAgreement(qda.agrIter, qda.gold)

## [1] 0.8667