Notes: Automatic recognition of conceptualization zones in scientific articles

Liakata, M., Saha, S., Dobnik, S., Batchelor, C., & Rebholz-Schuhmann, D. (2012). Automatic recognition of conceptualization zones in scientific articles and two life science applications. Bioinformatics, 28(7), 991-1000.


  • Scientific discourse analysis helps in distinguishing the nature of knowledge in research articles (facts, hypothesis, existing and new work).
  • Annotation schemes vary across disciplines in scope and granularity.


  • To build a finer grained annotation scheme to capture the structure of scientific articles (CoreSC scheme).
  • To automate the annotation of full articles at sentence level with CoreSC scheme using machine learning classifiers (SAPIENT “Semantic Annotation of Papers: Interface & ENrichment Tool” available for download here).



  • 265 articles from biochemistry and chemistry, containing 39915 sentences (>1 million words) annotated in three phrases by multiple experts.
  • XML aware sentence splitter SSSplit used for splitting sentences.


  • First layer of the CoreSC scheme with 11 categories for annotation:
    • Background (BAC), Hypothesis (HYP), Motivation (MOT), Goal (GOA), Object (OBJ), Method (MET), Model (MOD), Experiment (EXP), Observation (OBS), Result (RES) and Conclusion (CON).


  1. Text classification:
    • Sentences classified independent of each other.
    • Uses Support Vector Machine (SVM).
    • Features extracted based on different aspects of a sentence: location within the paper, document structure (global features) to local features. For the complete list of features used, refer the paper.
  2. Sequence labelling:
    • Labels assigned to satisfy dependencies among sentences.
    • Uses Conditional Random Fields (CRF).

Results and discussion:

  • F-score: Ranges from 76% for EXP (Experiment) to 18% for the low frequency category MOT(Motivation) [Refer complete results from runs configured with different settings and features in Table 2 of the paper].
  • Most important features: n-grams (primarily bigrams), Grammatical triples (GRs), verbs, global features such as history (sequence of labels) and section headings (Detailed explanation for the features
  • Classifiers: LibS has the highest accuracy at 51.6%, CRF at 50.4% and LibL at 47.7%.

Application/Future Work:

  • Can be applied to create executive summaries of full papers (based on the entire content and not just abstracts) to identify key information in a paper.
  • CoreSC annotated biology papers to be used for guiding information extraction and retrieval.
  • Generalization to new domains in progress.

Notes: Discipline-independent argumentative zoning


Teufel, S., Siddharthan, A., & Batchelor, C. (2009, August). Towards discipline-independent argumentative zoning: evidence from chemistry and computational linguistics. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 3-Volume 3 (pp. 1493-1502). Association for Computational Linguistics.


  • Argumentative Zoning (AZ) classifies each sentence into one of the categories below (inspired by knowledge claim KC) of authors :
    • Aim, Background, Basis, Contrast, Other and Textual.
[Refer AZ scheme – Teufel, S., Carletta, J., & Moens, M. (1999, June). An annotation scheme for discourse-level argumentation in research articles. In Proceedings of the ninth conference on European chapter of the Association for Computational Linguistics (pp. 110-117). Association for Computational Linguistics.]


  • Establishing a modified AZ scheme AZ-II with fine grained categories (11 instead of 7)  to recognize structure and relational categories.
  • Experimenting annotation using AZ scheme in two distinct domains: Chemistry and Computational Linguistics (CL).
  • Testing an annotation scheme to systematically exclude prior domain knowledge of annotators.



  • Domain independent categories so that the annotations can be done based on general, rhetorical and linguistic knowledge and no scientific domain knowledge is necessary.
  • Annotators are semi-informed experts following the rules below so that the existing domain knowledge has minimalist interference with annotations:
    • Justification is required for all annotations based on text based evidence such as cues, and other linguistic principles.
    • Discipline specific generics are provided based on high level domain knowledge so that the annotators can identify the validity of knowledge claims made in the domain (E.g. a “Chemistry primer” with high level information regarding common scientific terms to help a non-expert).
    • Guidelines are given with descriptions for annotating the categories; some categories might require domain knowledge for distinguishing them (e.g. Authors mentioning about the failure of previous methods: OWN_FAIL vs ANTISUPP, Reasoning required to come to conclusions from results: OWN_RES vs OWN_CONC).


  • Data:
    • Chemistry – 30 journal articles, 3745 sentences
    • CL – 9 conference articles, 1629 sentences
  • Independent annotations using web based tool. Refer example annotations in appendix of the paper.


  • Inter-annotator agreement: Fleiss Kappa coefficient, κ = 0.71 for Chemistry and κ = 0.65 for CL.
  • Wide variation in the frequency of categories –> fewer examples for supervised learning for rare categories (Refer ‘Figure 3: Frequency of AZ-II Categories’ in the paper to see the frequency distinctions between the two domains).
  • Pairwise agreement calculated to see the impact of domain knowledge between annotators: κAB  = 0.66, κBC  = 0.73 and κAB  = 0.73 –> Largest disagreement between expert (A) and non-expert (C).
  • Inter-annotator agreement to see the distinction between categories: κbinary = 0.78 for chemistry and κbinary = 0.65 for CL –> Easier distinction of categories in Chemistry than CL.
  • Krippendorff’s category distinctions to see how a category falls apart from the other collapsed categories: κ=0.71 for chemistry, κ=0.65 for CL
    • Well distinguised: OWN MTHD, OWN RES and FUT
    • Less distinguised: ANTISUPP, OWN FAIL and PREV OWN –> troubleshooting required for guidelines
  • Comparison of AZ-II to original AZ annotation scheme by collapsing into 6-category AZ annotation: κ=0.75 –> annotation of high consistency.


  • Positive result for domain independent application of AZ scheme and training non experts as annotators.
  • Annotating more established discipline like Chemistry was easier than CL.

Future Work:

  • Automation of AZ annotation
  • Expanding annotation guidelines to other disciplines and longer journal papers.