Notes: Discipline-independent argumentative zoning


Teufel, S., Siddharthan, A., & Batchelor, C. (2009, August). Towards discipline-independent argumentative zoning: evidence from chemistry and computational linguistics. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 3 (pp. 1493-1502). Association for Computational Linguistics.


  • Argumentative Zoning (AZ) classifies each sentence of a paper into one of the categories below, based on the knowledge claims (KC) made by the authors:
    • Aim, Background, Basis, Contrast, Other, Own and Textual.

[For the original AZ scheme, see Teufel, S., Carletta, J., & Moens, M. (1999, June). An annotation scheme for discourse-level argumentation in research articles. In Proceedings of the ninth conference on European chapter of the Association for Computational Linguistics (pp. 110-117). Association for Computational Linguistics.]


  • Establishing a modified AZ scheme, AZ-II, with finer-grained categories (15 instead of 7) to recognize structure and relational categories.
  • Experimenting with annotation using the AZ-II scheme in two distinct domains: Chemistry and Computational Linguistics (CL).
  • Testing an annotation procedure that systematically excludes the annotators' prior domain knowledge.



  • Categories are domain-independent, so annotation can rely on general, rhetorical and linguistic knowledge; no scientific domain knowledge is necessary.
  • Annotators are treated as semi-informed experts and follow the rules below, so that their existing domain knowledge interferes minimally with annotation:
    • Every annotation must be justified with text-based evidence, such as cue phrases and other linguistic principles.
    • Discipline-specific generics, based on high-level domain knowledge, are provided so that annotators can judge the validity of knowledge claims made in the domain (e.g. a “Chemistry primer” with high-level information on common scientific terms to help a non-expert).
    • Guidelines describe how to annotate each category; some category pairs may require domain knowledge to tell apart (e.g. authors mentioning the failure of previous methods: OWN_FAIL vs ANTISUPP; reasoning required to reach conclusions from results: OWN_RES vs OWN_CONC).
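The rule that every annotation needs text-based justification can be sketched as a small record type. This is only an illustration, not the paper's annotation tool: the class name `Annotation`, its fields, the example cue phrase, and the restricted category set (only the labels named in the rule above) are all assumptions.

```python
from dataclasses import dataclass

# Assumption: only the labels named in the guideline example above are listed;
# the full AZ-II category set is larger.
KNOWN_CATEGORIES = {"OWN_FAIL", "ANTISUPP", "OWN_RES", "OWN_CONC"}


@dataclass(frozen=True)
class Annotation:
    """One annotated sentence (hypothetical structure)."""
    sentence_id: int
    category: str
    justification: str  # textual evidence, e.g. a cue phrase found in the sentence

    def __post_init__(self):
        # Reject labels outside the (illustrative) category set.
        if self.category not in KNOWN_CATEGORIES:
            raise ValueError(f"unknown category: {self.category!r}")
        # Enforce the rule that a label without textual evidence is not allowed.
        if not self.justification.strip():
            raise ValueError("an annotation needs a text-based justification")


# Made-up example: the cue phrase is invented for illustration.
example = Annotation(sentence_id=12, category="OWN_FAIL",
                     justification="cue phrase: 'did not yield'")
print(example)
```

Making the justification a required, validated field means a label cannot be recorded on the strength of the annotator's domain knowledge alone.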


  • Data:
    • Chemistry – 30 journal articles, 3745 sentences
    • CL – 9 conference articles, 1629 sentences
  • Independent annotations using a web-based tool. See the example annotations in the appendix of the paper.


  • Inter-annotator agreement: Fleiss' kappa coefficient, κ = 0.71 for Chemistry and κ = 0.65 for CL.
  • Wide variation in the frequency of categories –> few examples of rare categories for supervised learning (see ‘Figure 3: Frequency of AZ-II Categories’ in the paper for the frequency differences between the two domains).
  • Pairwise agreement calculated to see the impact of domain knowledge between annotators: κAC = 0.66, κBC = 0.73 and κAB = 0.73 –> Largest disagreement between expert (A) and non-expert (C).
  • Inter-annotator agreement to see the distinction between categories: κbinary = 0.78 for Chemistry and κbinary = 0.65 for CL –> Easier distinction of categories in Chemistry than in CL.
  • Krippendorff’s category distinctions to see how well one category can be told apart from all other categories collapsed together: κ=0.71 for Chemistry, κ=0.65 for CL
    • Well distinguished: OWN_MTHD, OWN_RES and FUT
    • Less distinguished: ANTISUPP, OWN_FAIL and PREV_OWN –> troubleshooting required for guidelines
  • Comparison of AZ-II with the original AZ annotation scheme by collapsing AZ-II into the 6-category AZ annotation: κ=0.75 –> annotation of high consistency.
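The agreement figures above rest on Fleiss' kappa, computed over all categories, over annotator pairs, and over one category against all others collapsed. A minimal sketch of these three computations follows. The function names, the "REST" label and the toy data are illustrative assumptions, not the paper's code or data, and reusing the same statistic for annotator pairs is an assumption about how the pairwise values were obtained.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of items, each a list with one label per annotator."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number of ratings")
    counts = [Counter(item) for item in ratings]
    # Observed agreement: share of agreeing annotator pairs, averaged over items.
    p_obs = sum(
        (sum(c * c for c in cnt.values()) - n_raters) / (n_raters * (n_raters - 1))
        for cnt in counts
    ) / n_items
    # Chance agreement from the overall label proportions.
    totals = Counter()
    for cnt in counts:
        totals.update(cnt)
    p_exp = sum((t / (n_items * n_raters)) ** 2 for t in totals.values())
    if p_exp == 1.0:
        raise ValueError("kappa is undefined when only one label is ever used")
    return (p_obs - p_exp) / (1 - p_exp)


def collapse_to_binary(ratings, category):
    """Keep one category and merge all others into 'REST' (category distinction)."""
    return [[lab if lab == category else "REST" for lab in item] for item in ratings]


def annotator_pair(ratings, i, j):
    """Restrict the ratings to two annotators for pairwise agreement."""
    return [[item[i], item[j]] for item in ratings]


# Toy data (invented): 5 sentences, 3 annotators.
toy = [
    ["OWN_MTHD", "OWN_MTHD", "OWN_MTHD"],
    ["OWN_MTHD", "OWN_MTHD", "OWN_RES"],
    ["OWN_RES", "OWN_RES", "OWN_RES"],
    ["OWN_MTHD", "OWN_RES", "OWN_RES"],
    ["FUT", "FUT", "FUT"],
]

print(round(fleiss_kappa(toy), 3))                                  # 0.583
print(round(fleiss_kappa(collapse_to_binary(toy, "OWN_MTHD")), 3))  # 0.444
print(round(fleiss_kappa(annotator_pair(toy, 0, 1)), 3))            # 0.677
```

In the toy data FUT is never confused with another label, so collapsing everything else into "REST" gives κ = 1 for FUT. OWN_MTHD is sometimes confused with OWN_RES and scores lower, which is the kind of per-category diagnosis the category-distinction analysis provides.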


  • Positive result for domain-independent application of the AZ scheme and for training non-experts as annotators.
  • Annotating a more established discipline like Chemistry was easier than annotating CL.

Future Work:

  • Automation of AZ annotation
  • Expanding annotation guidelines to other disciplines and longer journal papers.