Creating reports in R #Code

I’ve recently been consolidating a lot of R code from different parts of my analysis into one file. I wanted to add good documentation, along with explanations of results and interpretations, next to my code so that I can make sense of it later. I came across the option of creating dynamic reports that combine code, custom text and R output into a single output document using the knitr package in R. I find it good practice to create such reports for any analysis (I wish I had followed this earlier), so here’s a post on how to create them. They are very useful for the following reasons:

  • They are a great way to generate PDF, HTML and Word reports that combine text explanations, code, R output and graphics in one go. This saves the hassle of saving and copying text, code, output and figures separately into a report.
  • We can easily share the file, with its output and explanations, with someone else.
  • It is much easier to generate a new report when the input file changes, since the same code runs again and produces a new output report based on the new file in one go.

How to create dynamic reports in R?

The first step is to create an R Markdown file (with the extension .Rmd). If you’re using RStudio, you can go to File -> New File -> R Markdown to create an Rmd file. You should specify whether you want the output in HTML, PDF or Word. It generated the following skeleton code for me, as I specified a PDF output file:

[Image: Rmd parts]

Alternatively, you can write the sections by hand and save the file with the .Rmd extension. The header section begins and ends with three dashes (---). It contains the title, date and author attributes and specifies the type of output document, e.g. html_document for an HTML web page, pdf_document for a PDF file, word_document for a Microsoft Word .docx file, etc. The header can include other options as needed: “runtime: shiny” if it should be run as an interactive Shiny app, “css: styles.css” to change the stylesheet when working with HTML, “toc: true” to include a table of contents, etc.

The following code contains instructions along with R code to create a simple html document. It should be pretty self-explanatory to follow instructions and edit the code as needed for your own use:

The html file created by the above code can be accessed here to view how the corresponding output is generated:

Sample Markdown file in R






Adding CKEditor to webpages in PHP #Code

What is CKEditor

CKEditor is an open source, customizable web text editor that can be integrated into our webpages. It can be used in three different modes (Article editor, Document editor and Inline editor) for content creation. I was looking for a web editor like Google Docs with which I could collect text data from students (without requiring them to log in with Gmail), and I found that CKEditor does exactly what I wanted. I’m using it here as a web document editor.

In this blog post, I’m combining the few steps I took to integrate CKEditor into my webpage. This is the code I wrote after a few rounds of trial and error and many rounds of looking things up in the CKEditor documentation and on StackOverflow. I wish I had found a blog like this when I was trying to implement it 😉

Setting up CKEditor

CKEditor is available for download here. I used the Standard package of the current stable version (Version 4.6.2 • 12 Jan 2017). All you have to do is copy the folder ‘ckeditor’ from the downloaded zip file to your program files folder and you’re ready to go. The main steps are below:

Include ckeditor.js in the head section of your code:

Create a text area for the editor in the body section followed by your CKEditor instance:

This simple code renders a CKEditor with the default configuration options (default toolbar, width, height etc.; all of these can be customized).

Writing and publishing journal articles

Last week I attended a talk at UTS by Professor Witold Pedrycz on the essentials of effective publishing and how to disseminate research results. He is a well-known professor in the field of Computational Intelligence with great credentials (editor-in-chief of very high impact journals, 40,000+ citations etc.). In the early stages of a PhD and research, we tend to make pretty basic mistakes that can lead to rejections and dejection. These are my notes from his talk, where he explained the key components expected of a well-written paper and how to avoid common mistakes. It was quite useful to hear about the do’s and don’ts of publishing from an experienced academic who rejects almost 2000 articles every year for his own journal 😉

Why, how, when to publish?

Why: People might have different personal motives for publishing (expanding CV, meeting KPIs, new year resolutions… :p ), but the key reason why research should be published is to share important findings with the research community.

How: The most popular way to disseminate results is still through journal articles. Publications in journals are considered secure and more established, thanks to the detailed peer review process involved. Most points in this post are made in the context of journal articles in particular, although some may also apply to conference articles and other publications.

When: There is no hard deadline; but the general rule is to publish when we have results to share, and not too late.

Choosing the right journal:

  • Read articles in the journal and research the style of the journal before submission.
  • Check Journal Citation Reports (Thomson Reuters) to confirm the claimed impact factor.
  • Consult Beall’s List of potential, possible, or probable predatory scholarly open-access publishers, which contains details of blacklisted publishers and journals. A short peer review process and a sudden request for fees are signs of a predatory journal.
  • Not publishing in a good journal could hurt your chances of building a good CV later.

Checking criteria:

To make articles publishable, these are the three key points to keep in mind:

  1. Originality/ innovation – Novelty in the area of research identifying differences from what was already done by others
  2. Relevance/ Motivation – Clear objective of research on why it is done
  3. Presentation/exposure – Understandable writing

All three criteria are equally important, and we will have to consider revising the paper even if it fails on just one of them.

Preparing to write a quality manuscript:

Follow the standard article structure:

  • Title:
    • Use the fewest possible words to adequately describe the contents of the paper
    • Should convey the findings and be specific, concise and complete, so that it attracts readers
    • Don’t use jargon, abbreviations, ambiguous terms, unnecessary detail
  • Authors and affiliations
  • Abstract:
    • Strongly impacts editor’s decision
    • Should be precise and honest, stand alone, use no technical jargon, be brief and specific, and cite no references
  • Keywords:
    • Important for indexing, so that the article can be found and cited
    • Check the guide
    • Be specific (e.g. name the specific algorithm rather than ‘neural network’, which would bring up millions of hits); avoid uncommon abbreviations and general terms
  • Introduction:
    • Why the current work was performed (Aims, significance), what has been done before (Literature review of prior work), what was done in the current research (brief), what was achieved (brief).
    • Consult the guide for the word limit, set the scene, outline the problem and hypothesis, give a balanced lit review (if included here), define non-standard abbreviations and jargon, get to the point and keep it simple.
    • Lit review – well focused and linked to the paper.
    • Don’t write an extensive review, cite excessively, or overuse terms like “novel”.
    • Mathematics: formula in papers – explain symbols, use standard notations.
    • I would also like to highlight Swales’ Creating a Research Space (CARS) model that provides a useful guide for writing introductions and other sections.
  • Flow of presentation:
    • Top-down approach: main idea→ fundamentals → algorithms → experiments → conclusions.
    • Avoid mixing different levels of abstraction (Explain concept, numeric values in introduction and not straight away in the experiment section, Explain what tool is used in the experiment section and not in the introduction section).
    • Brief, illustrative examples to motivate.
  • Results:
    • Use tables, figures to summarize, show results of statistical analysis, compare like with like (E.g. A simple, but commonly made mistake: “The results from this study are higher than the other study”: Doesn’t compare ‘results’ to ‘results’, but compares ‘results’ to another ‘study’).
    • Don’t duplicate data among tables, figures and text, use graphics for summarization of text (avoid large tables with many numbers).
    • Graphics: stand alone captions, easy to interpret, don’t overuse colors in charts (alternatives: diff types of lines), only essential information.
    • Clear legend, better organized data, present trend lines, don’t leave areas underutilized.
  • Discussion:
    • Study’s aim and hypothesis
    • Relating to other research
    • Avoid grand unsupported statements (E.g. novel organization method has enormously reduced the learning time), introducing new terms
  • Conclusion:
    • Put your study in context
    • How it represents advance in the field
    • Suggest future experiments
    • Avoid: repeating the same sentence across the abstract, intro, discussion and conclusion; being overly speculative; overemphasizing the impact of the study.
  • Acknowledgement:
    • Contributions to paper: supplied materials or software, helped with writing or English, technical help.
  • References:
    • Include recent references
    • Check guide for correct format
    • Avoid: citing yourself or the journal excessively; citing bad sources (sources that are not available, Wikipedia since it is volatile, local language sources)
    • Review paper requires experienced writing skills, survey paper has to digest and synthesize available research.
  • Supplementary material

Language essentials for a quality manuscript:

Ensure your manuscript has the three C’s below:

  1. Clarity
  2. Conciseness
  3. Correctness

Common traps: repetition, redundancy, ambiguity, exaggeration

You can make use of language editing services to polish the manuscript if required. Free tools are available online for checking surface level errors like grammar and spelling.

Ethical issues:

  • Multiple submissions, redundant publications, plagiarism, data fabrication and falsification, improper use of subjects, improper author contribution.
  • Plagiarism: Check the IEEE FAQ for details on plagiarism. Unacceptable paraphrasing, even with a citation, could be plagiarism.

Cover letter, Revisions and Responses to reviewers:

  • Write a brief cover letter to the editor to convey particular importance of your manuscript to the journal. Suggest potential reviewers (if required).
  • Indicate if the submitted paper is an extended version of a conference paper to avoid conflict of interest.
  • Review process: Draft a detailed letter of response to reviewers: respond to all points (accept with changes made or reject with polite reasoning), provide page and line numbers to refer to revisions, additional calculations if required to make the paper stronger.
    • E.g. “Thank you for the comment. However, we feel that the assumption in our model is supported by recent work by…” rather than “the reviewer is clearly ignorant of the work of…”
  • Rejection: Not to be taken personally; try to understand why, and don’t resubmit to another journal without significant revisions.
  • Journals allow a paper to be distributed as an open access resource for an additional fee, to reach a wider audience (if required).



Tools for automated rhetorical analysis of academic writing

Alert – Long post!

In this post, I’m presenting a summary of my review on tools for automatically analyzing rhetorical structures from academic writing.

The tools considered are designed to cater to different users and purposes. AWA and RWT aim to provide feedback for improving students’ academic writing. Mover and SAPIENTA, on the other hand, are meant to help researchers identify the structure of research articles. ‘Mover’ even allows users to give a second opinion on the classification of moves and add new training data (this can lead to a less accurate model if students with less expertise add potentially wrong training data). However, these tools have a common thread and fulfill the following criteria:

  • They look at scientific text – Full research articles, abstracts or introductions. Tools to automate argumentative zoning of other open text (Example) are not considered.
  • They automate the identification of rhetorical structures (zones, moves) in research articles (RA) with sentence being the unit of analysis.
  • They are broadly based on the Argumentative Zoning scheme by Simone Teufel or the CARS model by John Swales (Either the original schema or modified version of it).

Tools (in alphabetical order):

  1. Academic Writing Analytics (AWA) – Summary notes here

AWA also has a reflective parser to give feedback on students’ reflective writing, but the focus of this post is on the analytical parser. AWA demo, video courtesy of Dr. Simon Knight:

  2. Mover – Summary notes here

Available for download as a standalone application. Sample screenshot below:


  3. Research Writing Tutor (RWT) – Summary notes here

RWT demo, video courtesy of Dr. Elena Cotos:

  4. SAPIENTA – Summary notes here.

Available for download as a standalone Java application or can be accessed as a web service. Sample screenshot of tagged output from SAPIENTA web service below:

[Image: sapienta-output]

Annotation Scheme:

The general aim of the schemes used is to be applicable to all academic writing, and this has been successfully tested across data from different disciplines. A comparison of the schemes used by the tools is shown in the table below:

Tool: AWA
Source & Description: AWA Analytical scheme (modified from AZ for sentence level parsing)
Annotation Scheme:
  • Summarizing
  • Background knowledge
  • Contrasting ideas
  • Open question

Tool: Mover
Source & Description: Modified CARS model – three main moves and further steps
Annotation Scheme:
  1. Establish a territory
    • Claim centrality
    • Generalize topics
    • Review previous research
  2. Establish a niche
    • Counter claim
    • Indicate a gap
    • Raise questions
    • Continue a tradition
  3. Occupy the niche
    • Outline purpose
    • Announce research
    • Announce findings
    • Evaluate research
    • Indicate RA structure

Tool: RWT
Source & Description: Modified CARS model – 3 moves, 17 steps
Annotation Scheme:
  • Move 1. Establishing a territory
    • 1. Claiming centrality
    • 2. Making topic generalizations
    • 3. Reviewing previous research
  • Move 2. Identifying a niche
    • 4. Indicating a gap
    • 5. Highlighting a problem
    • 6. Raising general questions
    • 7. Proposing general hypotheses
    • 8. Presenting a justification
  • Move 3. Addressing the niche
    • 9. Introducing present research descriptively
    • 10. Introducing present research purposefully
    • 11. Presenting research questions
    • 12. Presenting research hypotheses
    • 13. Clarifying definitions
    • 14. Summarizing methods
    • 15. Announcing principal outcomes
    • 16. Stating the value of the present research
    • 17. Outlining the structure of the paper

Tool: SAPIENTA
Source & Description: Finer grained AZ scheme – CoreSC scheme with 11 categories in the first layer
Annotation Scheme:
  • Background (BAC)
  • Hypothesis (HYP)
  • Motivation (MOT)
  • Goal (GOA)
  • Object (OBJ)
  • Method (MET)
  • Model (MOD)
  • Experiment (EXP)
  • Observation (OBS)
  • Result (RES)
  • Conclusion (CON)


The tools are built on different data sets and methods for automating the analysis. Most of them use manually annotated data as a standard for training the model to automatically classify the categories. Details below:

Tool | Data type | Automation method
AWA | Any research writing | NLP rule based - Xerox Incremental Parser (XIP) to annotate rhetorical functions in discourse.
Mover | Abstracts | Supervised learning - Naïve Bayes classifier with data represented as bag of clusters with location information.
RWT | Introductions | Supervised learning using Support Vector Machine (SVM) with n-dimensional vector representation and n-gram features.
SAPIENTA | Full article | Supervised learning using SVM with sentence aspect features and Sequence Labelling using Conditional Random Fields (CRF) for sentence dependencies.


  • SciPo tool helps students write summaries and introductions for scientific texts in Portuguese.
  • Another tool, CARE, is a word concordancer used to search for words and moves in research abstracts – Summary notes here.
  • An ML approach considering three different schemes for annotating scientific abstracts (no tool).

If you think I’ve missed a tool which does similar automated tagging in research articles, do let me know so I can include it in my list 🙂

Notes: Discourse classification into rhetorical functions

Reference: Cotos, E., & Pendar, N. (2016). Discourse classification into rhetorical functions for AWE feedback. CALICO Journal, 33(1), 92.


  • Computational techniques can be exploited to provide individualized feedback to learners on writing.
  • Genre analysis on writing to identify moves (communicative goal) and steps (rhetorical functions to help achieve the goal) [Swales, 1990].
  • Natural language processing (NLP) and machine learning categorization approach are widely used to automatically identify discourse structures (E.g. Mover, prior work on IADE).


  • To develop an automated analysis system, ‘Research Writing Tutor’ (RWT), that identifies rhetorical structures (moves and steps) in research writing and provides feedback to students.


  • Sentence level analysis – Each sentence is classified into a move and a step within that move.
  • Data: Introduction section from 1020 articles – 51 disciplines, each discipline containing 20 articles, total of 1,322,089 words.
  • Annotation Scheme:
    • 3 moves, 17 steps – Refer to Table 1 of the original paper for the detailed annotation scheme (based on the CARS model).
    • Manual annotation using XML based markup by the Callisto Workbench.
  • Supervised learning approach steps:
    1. Feature selection:
      • Important features – unigrams, trigrams
      • n-gram feature set contained 5,825 unigrams and 11,630 trigrams for moves, and 27,689 unigrams and 27,160 trigrams for steps.
    2. Sentence representation:
      • Each sentence is represented as an n-dimensional vector in the R^n Euclidean space.
      • Boolean representation to indicate presence or absence of feature in sentence.
    3. Training classifier:
      • SVM model for classification.
      • 10-fold cross validation.
      • Precision is higher than recall (70.3% versus 61.2% for the move classifier and 68.6% versus 55% for the step classifier); the objective is to maximize accuracy.
      • RWT analyzer has two cascaded SVM – move classifier followed by step classifier.
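
To make the feature selection and sentence representation steps a bit more concrete, here is a minimal Python sketch of turning sentences into Boolean unigram/trigram vectors. It is my own illustration and not code from RWT: the function names, the whitespace tokenizer and the two toy training sentences are all assumptions, and the SVM training step is left out.

```python
from itertools import chain


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(sentence):
    """Unigrams and trigrams of a lower-cased, whitespace-tokenized sentence."""
    tokens = sentence.lower().split()
    return set(ngrams(tokens, 1)) | set(ngrams(tokens, 3))


def build_vocabulary(sentences):
    """Map every n-gram seen in the training sentences to a vector index."""
    features = sorted(set(chain.from_iterable(extract_features(s) for s in sentences)))
    return {feature: index for index, feature in enumerate(features)}


def to_boolean_vector(sentence, vocabulary):
    """n-dimensional 0/1 vector: 1 if the n-gram occurs in the sentence, else 0."""
    vector = [0] * len(vocabulary)
    for feature in extract_features(sentence):
        if feature in vocabulary:
            vector[vocabulary[feature]] = 1
    return vector


# Toy training sentences (hypothetical, not from the RWT corpus).
training = [
    "little is known about this problem",
    "this paper presents a new method",
]
vocabulary = build_vocabulary(training)
vector = to_boolean_vector("little is known about the method", vocabulary)
```

Each position of `vector` corresponds to one n-gram of the vocabulary, so vectors like this could be handed to any SVM implementation for the move classifier and then the step classifier.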


  • Move and step classifiers predict some elements better than others (refer to the paper for detailed results):
    • Move 2 most difficult to identify (sparse training data).
    • Move 1 gained the best recall (less ambiguous cues).
    • 10 out of 17 steps were predicted well.
    • Overall move accuracy of 72.6% and step accuracy of 72.9%.

Future Work:

  • Moving beyond sentence level to incorporate context information and sequence of moves/steps.
  • Knowledge-based approach for hard to identify steps – hand written rules and patterns.
  • Voting algorithm using independent analyzers.

Notes: XIP – Automated rhetorical parsing of scientific metadiscourse

Reference: Simsek, D., Buckingham Shum, S., Sandor, A., De Liddo, A., & Ferguson, R. (2013). XIP Dashboard: visual analytics from automated rhetorical parsing of scientific metadiscourse. In: 1st International Workshop on Discourse-Centric Learning Analytics, 8 Apr 2013, Leuven, Belgium.


Learners should have the ability to critically evaluate research articles and be able to identify the claims and ideas in scientific literature.


  • Automating analysis of research articles to identify evolution of ideas and findings.
  • Describing the Xerox Incremental Parser (XIP) which identifies rhetorically significant structures from research text.
  • Designing a visual analytics dashboard to provide overviews of the student corpus.


  • Argumentative Zoning (AZ) to annotate moves in research articles by Simone Teufel.
  • Sample discourse moves:
    • Summarizing: “The purpose of this article….”
    • Contrasting ideas: “With an absence of detailed work…”
      • Sub-classes: novelty, surprise, importance, emerging issue, open question
  • XIP outputs a raw output file containing semantic tags and concepts extracted from text.
  • Data: Papers from LAK & EDM conferences and journal – 66 LAK and 239 EDM papers extracting 7847 sentences and 40163 concepts.
  • Dashboard design – Refer to the original paper to see the process involved in prototyping the visualizations.


  • XIP is now embedded in the Academic Writing Analytics (AWA) tool by UTS. AWA provides analytical and reflective reports on students’ writing.

Notes: Automatic recognition of conceptualization zones in scientific articles

Liakata, M., Saha, S., Dobnik, S., Batchelor, C., & Rebholz-Schuhmann, D. (2012). Automatic recognition of conceptualization zones in scientific articles and two life science applications. Bioinformatics, 28(7), 991-1000.


  • Scientific discourse analysis helps in distinguishing the nature of knowledge in research articles (facts, hypothesis, existing and new work).
  • Annotation schemes vary across disciplines in scope and granularity.


  • To build a finer grained annotation scheme to capture the structure of scientific articles (CoreSC scheme).
  • To automate the annotation of full articles at sentence level with CoreSC scheme using machine learning classifiers (SAPIENT “Semantic Annotation of Papers: Interface & ENrichment Tool” available for download here).



  • 265 articles from biochemistry and chemistry, containing 39915 sentences (>1 million words) annotated in three phases by multiple experts.
  • XML aware sentence splitter SSSplit used for splitting sentences.


  • First layer of the CoreSC scheme with 11 categories for annotation:
    • Background (BAC), Hypothesis (HYP), Motivation (MOT), Goal (GOA), Object (OBJ), Method (MET), Model (MOD), Experiment (EXP), Observation (OBS), Result (RES) and Conclusion (CON).


  1. Text classification:
    • Sentences classified independent of each other.
    • Uses Support Vector Machine (SVM).
    • Features extracted based on different aspects of a sentence: location within the paper, document structure (global features) to local features. For the complete list of features used, refer to the paper.
  2. Sequence labelling:
    • Labels assigned to satisfy dependencies among sentences.
    • Uses Conditional Random Fields (CRF).

Results and discussion:

  • F-score: Ranges from 76% for EXP (Experiment) to 18% for the low frequency category MOT (Motivation) [Refer to the complete results from runs configured with different settings and features in Table 2 of the paper].
  • Most important features: n-grams (primarily bigrams), Grammatical triples (GRs), verbs, global features such as history (sequence of labels) and section headings (the paper explains the features in detail).
  • Classifiers: LibS has the highest accuracy at 51.6%, CRF at 50.4% and LibL at 47.7%.
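
As a reminder of what the F-score in these results measures, here is a small Python helper that computes it as the weighted harmonic mean of precision and recall. This is the standard definition and my own sketch; the numbers in the usage line are hypothetical and not taken from the paper.

```python
def f_score(precision, recall, beta=1.0):
    """F-beta score; beta=1 gives the usual F1 (harmonic mean of P and R)."""
    if precision == 0 and recall == 0:
        return 0.0
    beta_squared = beta * beta
    return (1 + beta_squared) * precision * recall / (beta_squared * precision + recall)


# Hypothetical precision and recall for one category.
example_f1 = f_score(0.8, 0.6)
```

Because it is a harmonic mean, the score stays low whenever either precision or recall is low, which is one reason a sparse category can end up with a very low F-score.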

Application/Future Work:

  • Can be applied to create executive summaries of full papers (based on the entire content and not just abstracts) to identify key information in a paper.
  • CoreSC annotated biology papers to be used for guiding information extraction and retrieval.
  • Generalization to new domains in progress.

Notes: Computational analysis of move structures in academic abstracts


Wu, J. C., Chang, Y. C., Liou, H. C., & Chang, J. S. (2006, July). Computational analysis of move structures in academic abstracts. In Proceedings of the COLING/ACL on Interactive presentation sessions (pp. 41-44). Association for Computational Linguistics.


  • Swales pattern for research articles: Introduction, Methods, Results, Discussion (IMRD) and Creating a Research Space (CARS) model.
  • Studying the rhetorical structure of texts is found to be useful to aid reading and writing (Mover tool notes here).


  • To automatically analyze move structures (Background, Purpose, Method, Result, and Conclusion) from research article abstracts.
  • To develop an online learning system CARE (Concordancer for Academic wRiting in English) using move structures to help novice writers.


  • Processes involved:


  • TANGO Concordancer used for extracting collocations with chunking and clause information – Sample Verb-Noun collocation structures in corpus: VP+NP, VP+PP+NP, and VP+NP+PP (Ref: Jian, J. Y., Chang, Y. C., & Chang, J. S. (2004, July). TANGO: Bilingual collocational concordancer. In Proceedings of the ACL 2004 on Interactive poster and demonstration sessions (p. 19). Association for Computational Linguistics.)
    • TANGO Tool accessible here.
  • Data: Corpus of 20,306 abstracts (95,960 sentences) from Citeseer. Manual tagging of moves in 106 abstracts containing 709 sentences. 72,708 collocation types were extracted, and 317 collocations were manually tagged with moves.
  • Hidden Markov Model (HMM) trained using 115 abstracts containing 684 sentences.
  • Different parameters evaluated for the HMM model: “the frequency of collocation types, the number of sentences with collocation in each abstract, move sequence score and collocation score”
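
To give a feel for how an HMM can tag the moves of an abstract, here is a minimal Viterbi decoding sketch in Python. It is only an illustration under my own assumptions: each sentence is reduced to a single made-up cue phrase, and the start, transition and emission probabilities are toy values rather than anything estimated in the paper. The paper's actual scoring (collocation score, move sequence score and their weighting) is not reproduced here.

```python
import math

MOVES = ["Background", "Purpose", "Method", "Result", "Conclusion"]

# Hypothetical cue phrase that signals each move (made up for this sketch).
CUES = {
    "Background": "play role",
    "Purpose": "propose method",
    "Method": "use corpus",
    "Result": "show improvement",
    "Conclusion": "conclude that",
}

# Toy parameters, not estimated from any corpus.
START_P = {"Background": 0.6, "Purpose": 0.4}
TRANS_P = {
    "Background": {"Background": 0.3, "Purpose": 0.7},
    "Purpose": {"Purpose": 0.2, "Method": 0.8},
    "Method": {"Method": 0.4, "Result": 0.6},
    "Result": {"Result": 0.4, "Conclusion": 0.6},
    "Conclusion": {"Conclusion": 1.0},
}
# Each move emits its own cue with probability 0.8 and any other cue with 0.05.
EMIT_P = {
    move: {cue: (0.8 if owner == move else 0.05) for owner, cue in CUES.items()}
    for move in MOVES
}


def safe_log(probability, floor=1e-6):
    """Log of a probability, floored so that unseen events stay possible."""
    return math.log(max(probability, floor))


def viterbi(cues):
    """Return the most probable move sequence for a non-empty list of cues."""
    # scores[move] = log-probability of the best path ending in this move.
    scores = {
        m: safe_log(START_P.get(m, 0.0)) + safe_log(EMIT_P[m].get(cues[0], 0.0))
        for m in MOVES
    }
    backpointers = []
    for cue in cues[1:]:
        new_scores, pointers = {}, {}
        for move in MOVES:
            best_prev = max(
                MOVES, key=lambda p: scores[p] + safe_log(TRANS_P[p].get(move, 0.0))
            )
            pointers[move] = best_prev
            new_scores[move] = (
                scores[best_prev]
                + safe_log(TRANS_P[best_prev].get(move, 0.0))
                + safe_log(EMIT_P[move].get(cue, 0.0))
            )
        scores = new_scores
        backpointers.append(pointers)
    # Backtrack from the best final move to recover the whole path.
    move = max(MOVES, key=lambda m: scores[m])
    path = [move]
    for pointers in reversed(backpointers):
        move = pointers[move]
        path.append(move)
    return path[::-1]


abstract_cues = ["play role", "propose method", "use corpus",
                 "show improvement", "conclude that"]
predicted_moves = viterbi(abstract_cues)
```

The transition table is what lets the model prefer a plausible ordering of moves over labelling each sentence in isolation.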


  • Precision of 80.54% was achieved when 627 sentences were qualified with the following parameters: weight of the transitional probability function 0.7, frequency threshold for a collocation to be applicable 18 (crucial to exclude unreliable collocations).


  • CARE system interface created for querying and looking up sentences for a specific move.
  • System is expected to help non native speakers write abstracts for research articles.

Notes: Visualizing sequential patterns for text mining


Wong, P. C., Cowley, W., Foote, H., Jurrus, E., & Thomas, J. (2000). Visualizing sequential patterns for text mining. In Information Visualization, 2000. InfoVis 2000. IEEE Symposium on (pp. 105-111). IEEE.


  • Mining Sequential patterns aims to identify recurring patterns from data over a period of time.
  • A pattern is a finite series of elements from the same domain, e.g. A -> B -> C -> D.
  • Each pattern has a minimum ‘support’ value which indicates the percentage of pattern occurrence. (E.g. 90% of people who did this process, did the second process, followed by the third process)
  • Sequential pattern vs association rule:
    • Sequential pattern – studies ordering/arrangement of elements E.g. A -> B -> C -> D
    • Association rule – studies togetherness E.g. A+B+C -> D
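
The difference between the two can be sketched in a few lines of Python. This is my own toy illustration, not the paper's algorithm: I am assuming that a sequential pattern may have gaps between its elements, and the four example sequences are made up.

```python
def contains_in_order(sequence, pattern):
    """True if the pattern's elements occur in the sequence in order (gaps allowed)."""
    remaining = iter(sequence)
    # `element in remaining` consumes the iterator up to the first match,
    # so each pattern element must be found after the previous one.
    return all(element in remaining for element in pattern)


def sequential_support(sequences, pattern):
    """Fraction of sequences that contain the pattern in the given order."""
    if not sequences:
        return 0.0
    return sum(contains_in_order(s, pattern) for s in sequences) / len(sequences)


def cooccurrence_support(sequences, items):
    """Fraction of sequences that contain all items, regardless of order."""
    if not sequences:
        return 0.0
    return sum(set(items) <= set(s) for s in sequences) / len(sequences)


# Hypothetical event sequences.
sequences = [
    ["A", "B", "C", "D"],
    ["A", "C", "B", "D"],
    ["B", "A", "C", "D"],
    ["A", "B", "D"],
]
ordered = sequential_support(sequences, ["A", "B", "D"])      # 0.75
together = cooccurrence_support(sequences, ["A", "B", "D"])   # 1.0
```

All four sequences contain A, B and D together, but only three of them contain them in the order A -> B -> D, which is exactly the distinction between togetherness and ordering.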


  • Presenting a visual data mining system that combines pattern discovery and visualizations.



Open source corpus containing 1170 news articles from 1991 to 1997 and harvested news of 1990 from TREC5 distribution.


  1. Topic Extraction: Identifies the topic in documents based on the co-occurrence of words. Words separated by white space evaluated – stemming done, prepositions, pronouns, adjectives, and gerunds ignored.
  2. Multiresolution binning: Bins articles with the same timestamp (E.g. Binning by day, week, month, year)
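
Multiresolution binning is easy to picture with a small Python sketch. This is my own illustration, assuming each article is a (date, topics) pair and that weeks follow the ISO calendar; the three articles and their topics are made up.

```python
from collections import defaultdict
from datetime import date


def bin_key(day, resolution):
    """Label of the bin that a date falls into at the given resolution."""
    if resolution == "day":
        return day.isoformat()
    if resolution == "week":
        iso_year, iso_week, _ = day.isocalendar()
        return f"{iso_year}-W{iso_week:02d}"
    if resolution == "month":
        return f"{day.year}-{day.month:02d}"
    if resolution == "year":
        return str(day.year)
    raise ValueError(f"unknown resolution: {resolution}")


def bin_articles(articles, resolution):
    """Group the topics of all articles that share a bin at this resolution."""
    bins = defaultdict(list)
    for day, topics in articles:
        bins[bin_key(day, resolution)].extend(topics)
    return dict(bins)


# Hypothetical articles: (publication date, extracted topics).
articles = [
    (date(1991, 1, 14), ["election"]),
    (date(1991, 1, 16), ["election", "economy"]),
    (date(1991, 2, 3), ["economy"]),
]
by_week = bin_articles(articles, "week")
by_month = bin_articles(articles, "month")
```

Switching the `resolution` argument regroups the same articles, so patterns that are invisible at the day level can show up at the week or month level.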

Discovery of sequential patterns by Visualization:

  • Plotting topics/ topic combinations over time.
  • Strength: Can quickly view overall patterns and individual occurrence of events.
  • Weakness: No knowledge on exact connections that make up the pattern and statistical support on the individual patterns.

Discovery of sequential patterns by Data mining:

  • Building patterns on n-ary tree with elements as nodes.
  • Patterns are valid if the support value is greater than threshold.
  • A sample pattern mining from given input data is given in Figure 2 of the paper.
  • Strength: Provides accurate statistical (support) values for all weak and strong patterns.
  • Weakness: Loses temporal and locality information, large number of patterns produced in text format making human interpretation harder.

Visual Data Mining system:


  • Combining visualization and data mining to compensate for each other’s weaknesses (refer to Figures 4 & 5 in the paper to see the pattern visualizations).
  • Binning resolution can be changed to see different patterns based on day, week, month, year etc.
  • Patterns associated to a particular topic can be picked.


  • Strength of pattern is not easily identifiable from the visualization without statistical measures. Pattern mining gets enhanced by graphical encoding with spatial and temporal information.
  • Knowledge discovery by humans is aided by combining statistical data mining and visualization.

Future Work:

  • Handling larger data sets using secondary memory support and improving the display.
  • Integrating more techniques like association rules into visual data mining environment.

Notes: Discipline-independent argumentative zoning


Teufel, S., Siddharthan, A., & Batchelor, C. (2009, August). Towards discipline-independent argumentative zoning: evidence from chemistry and computational linguistics. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 3 (pp. 1493-1502). Association for Computational Linguistics.


  • Argumentative Zoning (AZ) classifies each sentence into one of the categories below (inspired by the knowledge claims (KC) of authors):
    • Aim, Background, Basis, Contrast, Other, Own and Textual.

[Refer to the AZ scheme – Teufel, S., Carletta, J., & Moens, M. (1999, June). An annotation scheme for discourse-level argumentation in research articles. In Proceedings of the ninth conference on European chapter of the Association for Computational Linguistics (pp. 110-117). Association for Computational Linguistics.]


  • Establishing a modified AZ scheme, AZ-II, with fine grained categories (11 instead of 7) to recognize structure and relational categories.
  • Experimenting with annotation using the AZ scheme in two distinct domains: Chemistry and Computational Linguistics (CL).
  • Testing an annotation scheme to systematically exclude prior domain knowledge of annotators.



  • Domain independent categories so that the annotations can be done based on general, rhetorical and linguistic knowledge and no scientific domain knowledge is necessary.
  • Annotators are semi-informed experts following the rules below, so that existing domain knowledge interferes minimally with the annotations:
    • Justification is required for all annotations, based on textual evidence such as cues and other linguistic principles.
    • Discipline specific generics are provided based on high level domain knowledge so that the annotators can identify the validity of knowledge claims made in the domain (E.g. a “Chemistry primer” with high level information regarding common scientific terms to help a non-expert).
    • Guidelines are given with descriptions for annotating the categories; some categories might require domain knowledge to distinguish them (e.g. authors mentioning the failure of previous methods: OWN_FAIL vs ANTISUPP; reasoning required to come to conclusions from results: OWN_RES vs OWN_CONC).


  • Data:
    • Chemistry – 30 journal articles, 3745 sentences
    • CL – 9 conference articles, 1629 sentences
  • Independent annotations using a web based tool. Refer to the example annotations in the appendix of the paper.


  • Inter-annotator agreement: Fleiss Kappa coefficient, κ = 0.71 for Chemistry and κ = 0.65 for CL.
  • Wide variation in the frequency of categories –> fewer examples for supervised learning for rare categories (Refer ‘Figure 3: Frequency of AZ-II Categories’ in the paper to see the frequency distinctions between the two domains).
  • Pairwise agreement calculated to see the impact of domain knowledge between annotators: κAC = 0.66, κBC = 0.73 and κAB = 0.73 –> Largest disagreement between expert (A) and non-expert (C).
  • Inter-annotator agreement to see the distinction between categories: κbinary = 0.78 for chemistry and κbinary = 0.65 for CL –> Easier distinction of categories in Chemistry than CL.
  • Krippendorff’s category distinctions to see how a category falls apart from the other collapsed categories: κ=0.71 for chemistry, κ=0.65 for CL
    • Well distinguished: OWN_MTHD, OWN_RES and FUT
    • Less distinguished: ANTISUPP, OWN_FAIL and PREV_OWN –> troubleshooting required for guidelines
  • Comparison of AZ-II to original AZ annotation scheme by collapsing into 6-category AZ annotation: κ=0.75 –> annotation of high consistency.
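
For readers who have not met Fleiss' Kappa before, here is a short Python sketch of the standard formula, which compares the observed agreement among annotators with the agreement expected by chance. It is my own illustration: the function name, the input layout (one row per sentence, one count per category) and the toy ratings are assumptions and are not data from the paper.

```python
def fleiss_kappa(ratings):
    """Fleiss' Kappa for a table where ratings[i][j] is the number of
    annotators who assigned item i to category j. Every item must be
    rated by the same number of annotators (at least two)."""
    n_items = len(ratings)
    n_raters = sum(ratings[0])
    if any(sum(row) != n_raters for row in ratings):
        raise ValueError("every item needs the same number of ratings")
    n_categories = len(ratings[0])

    # Proportion of all assignments that went to each category.
    category_share = [
        sum(row[j] for row in ratings) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Agreement among annotators on each single item.
    item_agreement = [
        (sum(count * count for count in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in ratings
    ]
    observed = sum(item_agreement) / n_items
    expected = sum(share * share for share in category_share)
    if expected == 1.0:
        # Degenerate case: only one category was ever used, so Kappa is
        # undefined; returning 1.0 here is a convention chosen for this sketch.
        return 1.0
    return (observed - expected) / (1 - expected)


# Toy example: 4 sentences, 3 annotators, 3 hypothetical categories.
ratings = [
    [3, 0, 0],
    [0, 3, 0],
    [2, 1, 0],
    [0, 1, 2],
]
kappa = fleiss_kappa(ratings)
```

A value of 1 means perfect agreement and 0 means agreement no better than chance, which puts the values reported above in context.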


  • Positive result for domain independent application of AZ scheme and training non experts as annotators.
  • Annotating a more established discipline like Chemistry was easier than CL.

Future Work:

  • Automation of AZ annotation
  • Expanding annotation guidelines to other disciplines and longer journal papers.