1st Shared Task on the Analysis of Narrative Levels Through Annotation (SANTA)
We would like to invite you to participate in the 1st Shared Task on Narrative Level Annotation. It adapts the shared-task format established in Computational Linguistics and Informatics to the field of Literary Studies. The shared task has two phases. The goal of the first phase is the collaborative creation of annotation guidelines, which in turn will serve as the basis for the second phase, an automation-oriented shared task. The intended audience for the first phase is researchers interested in the manual analysis of narrative, who thereby gain influence over the target concept that computer scientists will later aim to detect automatically.
During the first few months, the participants develop annotation guidelines on their own. We provide a development corpus (see below) that can be used to test guidelines internally. The guidelines are to be submitted to us by June 15. Immediately thereafter, a test corpus is released. The participants then have until June 25 to annotate the test corpus using their own guidelines. After submitting these annotations (in an online annotation tool that we provide), the participants are asked to annotate the very same test corpus according to the guidelines of one other participant, until July 6. At the same time, the organizers will collect annotations made by students for every participant's guidelines.
Finally, all participants meet for a workshop (in August/September) to discuss their own and the others’ annotation guidelines. During the workshop, the guidelines will be evaluated both qualitatively and quantitatively (by inter-annotator agreement). Based on the outcome of the workshop, we will define the guidelines for the second phase of the shared task, i.e., the guidelines on which the automation of narrative level annotation will be based. Participants of the first phase can also participate in the second, but don’t have to. Mixed teams consisting of humanists/narratologists and information/computer scientists are welcome.
The corpus has been compiled to cover as many relevant phenomena as possible. It is heterogeneous with respect to genre, publication date and text length. Still, representativeness (whatever that means for literature) was not a guiding principle. All texts are available in English and German. Some texts are translations from a third language.
The maximum length of the texts in this corpus is 2000 words. Since this limitation entails a bias with respect to the use of narrative levels, we have also included a long text, which we make available in a shortened version. For this shortened version, we removed passages that do not substantially affect the overall narrative level structure.
The corpus is freely available on GitHub.
June 15: Submission of annotation guidelines
June 25: Submission of annotations on test corpus
July 6: Submission of annotations on test corpus, using another participant's annotation guidelines
August/September: Workshop (will likely take place in Germany and take multiple days; we are working on funding)
If you have questions or comments, please do not hesitate to contact us.
Evelyn Gius, Nils Reiter, Jannik Strötgen and Marcus Willand