London Festival of Learning 2018

I attended the London Festival of Learning this year from June 22nd-30th, which brought together three conferences: the 13th International Conference of the Learning Sciences (ICLS), the Fifth Annual ACM Conference on Learning at Scale (L@S) and the 19th International Conference on Artificial Intelligence in Education (AIED).  It was great to see the convergence of ideas and academics from these three fields that generally work towards enhancing educational practices with technology. I could see overlaps and similarities in the topics of research being studied by these communities, but I also noticed they were divergent in terms of the main foci of their research. The festival was huge with over a 1000 attendees, and also involved edtech companies that wanted to develop evidence-informed products.

Throughout the conferences, I found an emphasis and move towards making more use of human ability and intelligence to augment what artificial intelligence can do for education in many keynotes and talks. This included concepts like giving importance to our internally persuasive voice and the power of negotiation in addition to “datafied” learning, and embracing imperfections from machines by adding in human context. A critical stance on what Artificial Intelligence can and cannot do was seen, with more conversations happening around the ethical use of learner’s data.

(Excuse me for the blurry pictures, I was not in a good spot to take pictures)

In the sessions, I could see a lot of research on developing intelligent tutoring systems, agents, intervention designs and adaptive learning systems for teaching specific skills, and advances made in their techniques. The majority of data comes from online settings i.e, students’ trace data from their usage with such systems. Recently, multi-modal data is getting more attention where sensors and wearables collect data from learner’s physical spaces as well. One best paper award winning work on Teacher-AI hybrid systems showcased the power of mixed-reality systems for real-time classroom orchestration. The cross-over session and the ALLIANCE best paper session showcased interesting research cutting across the three communities; it’s a shame we couldn’t attend both sessions since they ran in parallel.

Simon Knight presented our work on Augmenting Formative Writing Assessment with Learning Analytics: A Design Abstraction Approach at the cross-over session where he explained how we can augment existing good practices with learning analytics, and use design representations for standardizing these learning designs. I presented our poster on studying the revision process in writing in AIED, where I used snapshots of students’ writing data to study their drafting process at certain time intervals. I also participated in the collaborative writing workshop earlier in ICLS where many interesting tools to support writing were discussed. I shared about AcaWriter – a writing analytics tool providing automated feedback on rhetorical moves, developed by the Connected Intelligence Centre, UTS  which is now released open-source.

Overall, it was a great place to learn, network and follow work from related disciplines (with some catching up to do on the presented work, coz we can only be at one place at one time during the parallel sessions). I did feel a bit exhausted a the end of it (maybe I’m better off attending one conference at a time 🙂 ), but I guess that’s natural, and you can’t complain when your brain gets so much to learn in a week!

Preparing for a doctoral consortium

There are many opportunities for doctoral students to participate in a doctoral consortium in the ed-tech research community, amongst others. A Doctoral consortium is usually organized by conferences where graduate students come together to present their work to experts in the field and peers, and get feedback from them. The expert panel might also offer advice on career and other skills. Some conferences also offer Young Researcher’s workshops/ Early Career Workshops which are useful for graduating students and young researchers in the field.

Having attended two doctoral consortium in different conferences, I would recommend PhD students to do it at some point of time.  I found it useful for a number of reasons, so in this post I’m going to list why I think so and how to prepare for a Doctoral consortium – some tips on making the best use of it.

Why participate?

  • Enhancing research skills: It’s a wonderful opportunity to put your thoughts together and think about the big picture of your research. It helps you identify the core ideas of your research and present them succinctly in a limited time. Explaining a potential 60,000 word thesis of your PhD in less than 30 minutes is a great skill to acquire. In some conferences, you might be asked to present a poster explaining your research as well. Also, it is a place where you can actually discuss more about your methodology and design, and not just the results.
  • Expert feedback: It is a great place to get some early feedback (and criticism) on your PhD work and thesis statement. It’s nice to have some extra eyes other than your phd supervisors. You become clear on what your claims can be and what your limitations are. You will be prepared to answer any question and know what to expect as possible questions next time when you present your work to different audiences. Even if you don’t get great advice at all times, you will most likely walk away with a better understanding of what you want to do. And if there’s a certain problem you’re grappling with in your research, you can ask for specific advice.
  • Networking: You meet other PhD Students from closely related fields. Not always do we get a chance to meet students from other universities around the world and know about their research. They are also sailing on the same boat, so it is always good to connect with your peers to get some support, and their feedback on your work. It is also a good opportunity to network with experts in the field and introduce your name in the research community. Who knows, the academic expert you impressed might be the person who gives you a job when you graduate 🙂
  • Financial Support: Most conferences provide some level of financial support for grad students who get accepted to the doctoral consortium. This is especially useful for self-financing students, as it covers registration fees or travel depending on the conference.

Based on my experience and the advice I’ve heard, here are some tips to make the best use of your time at the Doctoral Consortium:

  • Pick the right time to go – Best to go when you have conceptualized your research and done some work, so that you don’t go as an empty slate. The experts want to see what you have thought through so they can give you advice. Also don’t go too late (for example when you are almost submitting your thesis) by which time you can’t make any more changes to your research and thesis.
  • Make a proper submission – Most doctoral consortium require students to make formal submissions which include a short paper describing the research, supporting documents like a letter of support from the supervisor, and sometimes your own statement and CV. They usually look for sharp minds who can benefit from the discussion and contribute to the research community, so make sure you follow the mentioned format while submitting your application with well-written documents.
  • Practise and be ready to explain your research – You are usually provided a limited time to present (15-20 mins), and given that you are attempting to present your whole thesis in this time slot, practise well in advance to highlight the key aspects. Even better if you can present to your local peers and get their advice earlier. Sometimes, we tend to run through some ideas quickly without noticing that they need more emphasis or highlight less important aspects more, which your peers can notice for you.
  • Go prepared with your questions & answers: It is always nice to be prepared with questions to ask advice from experts. If there’s a particular problem you’re grappling with in your research, make sure you point that out and ask for suggestions. This helps you get focused attention on that problem rather than spend a lot of time on other minor things you are  not very interested in. If you want feedback from a specific expert, you can try mentioning that too. Be prepared to face tough questions and criticism on your research work (a good rehearsal before your phd defence). Also, if your peer’s work is previously made available, take some time to read about their research so you can contribute to the discussion and add value with your feedback.

LAK 2018 in Sydney

This post is on the exciting week of the Learning Analytics and Knowledge Conference LAK 2018, held in Sydney. LAK is a prestigious conference dedicated for sharing work in Learning Analytics across the globe. LAK coming down under was something we were looking forward to for quite some time. LAK is in fact the very first international conference I’ve ever attended (back in 2015), so it is always extra special 🙂

I started off with a Writing Analytics workshop, which we organized in Day 1 of LAK. We used a Jupyter notebook which runs Python code to demonstrate the application of text analysis for writing feedback and the pedagogic constructs behind designing such applications for learning analytics. Our aim was to bridge the gap between pedagogic contexts and the technical infrastructure (analytics) by crafting meaningful feedback for students on their writing, and to do so by developing writing analytics literacy. The participants were quite engaged in this hands on approach and we had good discussion on the implications of such Writing Analytics techniques.

The next day, I participated in the Doctoral Consortium, which is a whole day workshop where doctoral students present their work, discuss and receive feedback on their work from experts and other students. To know more about a Doctoral Consortium, read this. My doctoral consortium paper published in the companion proceedings is available here:

The new workshop for school practitioners was of interest to many educators working in K-12 learning analytics applications, and the Hackathon continues to be of wide interest. After the pre-conference events, the main conference officially started with the first keynote by Prof. David Williamson Shaffer on ‘The Importance of Meaning: Going Beyond Mixed Methods to Turn Big Data into Real Understanding’. David talked about how data is not scarce anymore, and to analyze such a sheer volume of data for learning, how we have to go beyond traditional quantitative and qualitative approaches. He gave examples of logical fallacies where statistics is likely to be misused while interpreting the concepts in learning, and introduced the notion of quantitative ethnography which can close the interpretive gap between the model and the data.

If you want to hear the full talk, all the keynotes are available along with the slides here: 

In general, there was great interest in the development of theories around designing dashboards, discussing how to and how not to develop dashboards for students.

Aligning learning analytics with learning design was increasingly emphasized. The demo paper which I presented that day exemplifying this in a Writing Analytics context is here (bonus pic with the supervisors):

The second day of the main conference (aptly on International women’s day) started with Prof. Christina Conati’s keynote on user adaptive visualizations, where she talked about adaptive interactions.

She showed how visualizations can be personalized for users by building user models based on eye tracking features.

Visualization in general was another key topic which gathered growing interest in the LAK community, along with other topics like Discourse analysis and Writing Analytics, many of them moving towards more near real-time applications.

I attended the SOLAR executive meeting for the first time to see what’s happening around SOLAR. It felt great to be part of a very welcoming community of researchers and practitioners. That’s where they announced this:

We also celebrated Women’s day:

It was quite an eventful day ending with the conference banquet in a Sydney harbour cruise.

The final keynote on the last day touched upon a number of criticisms around learning analytics and how we can progress the field further taking into account the key aims of learning analytics.

Multi modal learning analytics, MOOCS, Ethics and Policies, Theories, Self-regulated learning and Co-designing with stakeholders are other areas which continued to be discussed throughout the conference.

And then to wrap it up, happy hour!

To read all the interesting papers from LAK, follow this link.

For more tweets from the awesome LAK community, check #LAK18, #LAK2018, @lak2018syd


The changing face of learning and how to adapt to it

This post is based on my notes from Prof. Roger Säljö‘s talk at a Sydney Ideas event, hosted by the University of Sydney. I was undecided at first about attending this talk since it was held on a Valentine’s day evening, but I’m really glad I did 🙂 In his intriguing talk, Prof. Roger shared how the nature of knowledge and learning have changed in our current digital societies compared to previous traditional forms, and how educators should respond to it.

We’ve almost always been finding ways to preserve and communicate knowledge, from scripts and stone age symbols to modern digital libraries. That’s how we learn, grow and improve the society we live in. The Game of Thrones quote about its library town ‘Citadel’ is something I could immediately relate to:


While all societies need to reproduce knowledge for the next generations (like how its been done for ages), the conditions for reproducing the cultural memory is quite different in modern, digital societies. The size and complexity of such knowledge have grown tremendously in modern societies due to technology, which is why it is important to prioritize the skills and knowledge for learning. What is of value for students to learn in the new digital world and what skills they need should be considered. There could be two strategies for thinking about this:

  1. We can preserve what has been done (back-to-basics movement in education)
  2. Or we can think about what might be productive for the future

I’m more inclined towards working for a productive future, considering the new changes (by preserving the traditional elements that are essential, of course).


The changing face:

So what has actually changed over the years? Why should we think through these for education NOW? Education has been evolving all the time: from scribal schools over 5000 years ago meant for systematic training of the human mind, to the still relevant act of ‘studying’ which was once a social revolution.  Symbolic technologies are well developed to share a common understanding with all people of the world. In particular, Writing is a literacy that’s probably not dying anytime soon, although its forms may have changed. Text is still the main source of knowledge and is used everyday in many forms including emails, messages, and social media posts. The concepts of schooling have remained stable, although its focus on reproduction (not creativity) and individual as a source of knowledge are changing in recent times.

However, the biggest changes to our society are brought by technology, which has digitized the world. In addition to the growing amount of knowledge in the form of digital data, the conditions of learning have also changed tremendously. A lot of cognitive functions have been externalized and cognitive habits transformed. For instance, we make use of computer software to perform spelling/grammar checks in our everyday writing and even for simple arithmetic calculations (we should probably try to do mental calculations once in a while so that we don’t always need a calculator for 451 * 23). We are dependent on apps for cognitive tasks like remembering and problem solving. Children are starting to learn writing by typing using keyboards, and are moving from passive media consumption to active forms of interaction. We are able to master complex tasks, without understanding the basic steps involved.  There are statistical packages for use today which can help us come up with solutions for highly complex tasks with few lines of code without understanding the sequential steps it involves. Advanced technologies act as a black-box, which cannot be unpacked for education in the classroom: one example from my research context is a machine learning scoring algorithm that doesn’t disclose the features used for calculating these scores of students in a writing task.

Technological changes have made minds hybrid with thinking detours and collaboration with artifacts, which no longer nurture a concentrated mind. The way we look for information has also changed completely with search engines. Google has become our go-to place to seek any information we want, and is available for anyone. There is increased internet use in young children, even on their own. This places huge emphasis on coming up with strategies like restrictions and parental guidance for responsible internet usage by children, and opens a whole new dimension of security. We cannot control the learning trajectory of children from 2 years to 11 years as before, since we don’t know what they learn externally out of class (indirect curriculum). Schools can have no control on external tools and knowledge as it is hard to restrict access to computers at home. One can just hope that such external knowledge children gain is for the good, and guide them to distinguish it from other non desirable content on the internet.

Because the future is digital and there’s no coming back, our duty is to adapt to it the best we can. For this, Prof. Roger emphasizes that the metaphors of learning should shift to respond to the changing environment. Learning should be more performative (rather than reproductive) and focus on learning as design. Learners should be encouraged to participate in and contribute to communities and collective practices, and no longer consider knowledge as an individual asset. With the human mind, interactions with symbolic technologies and communications with people should be relational. Technologies and artificial intelligence should be used with care in education, keeping in mind that “Education is not production, it is not a smoothly running machine”. For young teachers to cope with the advances, they have to learn how to marry the resources to the ambitions of the school, while understanding that technology changes the nature of education, but does not solve the problem. The education system will also have to change assessments to assess the skills that matter the most in the future. While the advances have a role to play in improving learning (E.g. virtual environments where students can experience near reality complex environments), they should also have co-ordination with the teacher to get user perspectives. And for people to accept it more broadly, there should be steps taken to ensure digital literacy. Further, the knowledge, value and skills of an individual should be connected to what technology has to offer. Such design of transparent technology to respond to the natural repertoire of uses will be more relevant for education in the future.

To learn more about Prof. Roger’s work, visit:






ICCE 2017 in New Zealand

Last month I attended the 25th International Conference on Computers in Education ICCE 2017 at Christchurch, New Zealand, organised by the Asia-Pacific Society for Computers in Education (APSCE). It was the first time I attended this conference, although I have heard of it previously when I was working in NIE, Singapore. Overall, it was a great experience, and I could see different sub-fields under ‘Computers in Education’ coming together. Being a slightly more extensive field than learning analytics, it helped widen my knowledge beyond my current expertise.

I found the keynote speeches and talks very exciting, and I was tweeting some of my key take-home messages with   tag. Personalized and adaptive learning, learner models and how we can empower learners with technology were some key topics discussed in the keynotes and invited talks:

Emerging technical solutions and capabilities shared in the paper and poster sessions, especially on Virtual Reality, Augmented Reality, Mobile and Sensor technologies were widening the horizon of technologies used in the field of education. New applications of gaming technology used for teaching in many levels of education were quite interesting. Combining multiple forms and modes of data (multimodal data) was another emerging topic in collaborative learning, personalized learning and language education.

The overarching theme of pedagogy and learning were emphasized and questioned along the way, when some talks were focussed more on technology than its appropriate usage in educational settings. I believe that this topic is widely discussed these days in many areas where technology is used for education: an emphasis to go back to the basic aim of improving education, working alongside teachers, with technology as only a helping factor.

In particular, we had fruitful discussions within the Learning Analytics (LA) community on testing the effectiveness of LA applications, providing actionable insights for learners and teachers, creating standards for LA and data ethics issues.


I presented a full paper on the “Design and implementation of a pedagogic intervention using Writing Analytics”, where I shared work done with our colleagues at UTS Connected Intelligence Centre (UTS CIC) on exemplifying authentic classroom integration of learning analytics applications. It was well-received and provoked discussion on supporting students in their pedagogic contexts with the right kinds of feedback using analytics.

I also presented a doctoral consortium paper on “Combining automated and peer feedback for effective learning design in Writing practices” based on my main doctoral research idea, where we had discussions on how an embedded human component can add to automated analytic capabilities. I received the APSCE Merit Scholarship of USD500 to help me attend the conference, which is quite special as it is my first external scholarship/award during my PhD😊 I’m also thankful for the VC’s conference fund from UTS and the constant support from my lab UTS CIC at all levels (mentorship and financial support to attend conferences – I attended ALASI at Brisbane just the week before attending this one).

APSCE Merit scholarship_Shibani


In general, I could see a good mix of senior and young researchers from the Asia-Pacific region sharing their work enthusiastically and networking with peers from different communities of the broader educational research field. I caught up with some old friends and met some new interesting people too 😊 The hosts of the conference were amazing and everything was well-organized. We were given an introduction to the local culture with a lot of tidbits and entertainment along the way. I noticed a lot of photos being taken both by official photographers as well as the delegates to capture special moments (Is it just me who observed this? I’m super happy anyway to see those pics). The conference banquet dinner and the celebrations for the 25th anniversary of the conference need a special mention, as the past APSCE presidents were paid tribute. Also, watching the traditional Haka being performed during the banquet was a whole new experience. It was definitely a very well-organized conference, with every detail thought of and paid attention to; credits to the local organizing committee. Plus, New Zealand was so beautiful and I got to see some lovely places like these after the conference:

Lake Tekapo
Lake Tekapo
Mount Cook
Mount Cook, New Zealand

Creating reports in R #Code

I’ve recently been consolidating a lot of R code from different parts of my analysis into one file. I wanted to add good documentation and explanation of results and interpretations along with my code to make sense of it later. I came across this option of creating dynamic reports that can combine our code, custom text and R output to an output document using the knitR package in R. I find it a good practice to create such reports for any analysis (wish I followed this earlier), so here’s a post on how to create them. They are very useful coz of the following reasons:

  • It is a great option to generate PDF, HTML and word reports by combining our text explanations, code, R output and graphics at one go. It saves the hassle of saving and copying text, code, output and figures separately into a report.
  • We can easily share the file with someone else with the output and explanations.
  • It is much easier to generate a new report dynamically when the input file changes, as it runs the same code and generates new output report based on the new file at one go.

How to create dynamic reports in R?

The first step is to create a R Markdown file (with the extension .Rmd). If you’re using RStudio, you can go to File -> New File -> R Markdown to create a Rmd file.  You should specify whether you want an output in html, pdf or word. It generated the following skeleton code for me as I specified an output pdf file:

Rmd parts
Rmd parts

Alternatively, you can write the sections in R and save it as .Rmd file. The Header section begins and ends with three dashes (—). It contains title, date and author attributes and specifies the type of output document: E.g. html_document for html web page, pdf_document for a pdf file, word_document for Microsoft Word .docx etc. The header can include other options as needed: “runtime: shiny” if it should be run as an interactive shiny app, “css: styles.css” to change the stylesheet when working with html, “toc: true” to include a table of contents etc.

The following code contains instructions along with R code to create a simple html document. It should be pretty self-explanatory to follow instructions and edit the code as needed for your own use:

The html file created by the above code can be accessed here to view how the corresponding output is generated:

Sample Markdown file in R


Useful resources:




Adding CKEditor to webpages in PHP #Code

What is CKEditor

CKEditor is an open source, customizable web text editor that can be integrated to our webpages. It can be used in three different modes (Article editor, Document editor and Inline editor) for content creation. I was looking for a web editor like Google doc using which I can collect text data from students (but not requiring login with gmail), and I found CKEditor doing exactly what I wanted to do. I’m using it here as a web document editor.

In this blog post, I’m combining a few steps I did to integrate CKEditor to my webpage. This is the code I wrote after a few rounds of trial and error and many rounds of looking up on the CKEditor documentation and StackOverflow. Wish I found a blog like this when I was trying to implement this 😉

Setting up CKEditor

CKEditor is available for download here. I used the Standard package of the current stable version  (Version 4.6.2 • 12 Jan 2017). All you have to do is to copy the folder ‘ckeditor’ from the downloaded zip file to your program files folder and you’re ready to go. The main steps are below:

Include ckeditor.js in the head section of your code:

Create a text area for the editor in the body section followed by your CKEditor instance:

This simple full code renders you a CKEditor with the default configuration options (Default toolbar, width, height etc. – All of these can be customized). Continue reading “Adding CKEditor to webpages in PHP #Code”

Writing and publishing journal articles

Last week I attended a talk in UTS by Professor Witold Pedrycz on the essentials of effective publishing and how to disseminate research results. He is a well known Professor in the field of Computational Intelligence with  great credentials (Editor-in-chief of very high impact journals, 40,000+ citations etc.). In early stages of PhD and research, we tend to make pretty basic mistakes that could lead to rejections and dejection. These are my notes from his talk where he explained the key components expected from a well-written paper and how to avoid common mistakes. It was quite useful to hear about the do’s and don’ts of publishing from an experienced academic who rejects almost 2000 articles every year for his own journal 😉

Why, how, when to publish?

Why: People might have different personal motives for publishing (expanding CV, meeting KPIs, new year resolutions… :p ), but the key reason why a research should be published is to share important research findings to the research community.

How: The most popular way to disseminate results is still using journal articles. Publication in journals are considered secure and more established, thanks to the detailed peer review process involved. Most points of this post are mentioned in the context of journal articles in particular, although some may also apply to conference articles and other publications. 

When: There is no hard deadline; but the general rule is to publish when we have results to share, and not too late.

Choosing the right journal:

  • Read articles in the journal and research the style of the journal before submission.
  • Check journal citation reports for confirming the claimed impact factorThomson Reuters
  • Be cautious of Beall’s List: Potential, possible, or probable predatory scholarly open-access publishers ( containing details of blacklisted publishers and journals. Short peer review process and sudden request for fees are signs of predatory journals.
  • Not publishing in a good journal could be a bad hit to building a good CV later.

Checking criteria:

To make articles publishable, these are the three key points to keep in mind:

  1. Originality/ innovation – Novelty in the area of research identifying differences from what was already done by others
  2. Relevance/ Motivation – Clear objective of research on why it is done
  3. Presentation/exposure – Understandable writing

All the three criteria are equally important, and we will have to consider revising the paper even if it fails to achieve one of the above.

Preparing to write a quality manuscript:

Follow the standard article structure:

  • Title:
    • Use the fewest possible words to adequately describe the contents of the paper
    • Should contain findings, specific, concise, complete, attract readers
    • Don’t use jargon, abbreviations, ambiguous terms, unnecessary detail
  • Authors and affiliations
  • Abstract:
    • Strongly impacts editor’s decision
    • Should be precise and honest, stand alone entity, uses no tech jargon, brief and specific, cites no references
  • Keywords:
    • Important for indexing to make the article identified and cited
    • Check the guide
    • Specific (E.g. Specific algorithm rather than ‘neural network’ since it will bring millions of hits), avoid uncommon abbreviations and general terms
  • Introduction:
    • Why the current work was performed (Aims, significance), what has been done before (Literature review of prior work), what was done in the current research (brief), what was achieved (brief).
    • Consult the guide for word limit, set the scene, outline problem and hypothesis, balanced lit review (if included here), define non standard abbreviations and jargons, get to the point and keep it simple.
    • Lit review – well focused and linked to the paper.
    • Don’t write extensive review, cite, overuse terms like “novel” etc.
    • Mathematics: formula in papers – explain symbols, use standard notations.
    • I would also like to highlight Swales’ Creating a Research Space (CARS) model that provides a useful guide for writing introductions and other sections.
  • Flow of presentation:
    • Top-down approach: main idea→ fundamentals → algorithms → experiments → conclusions.
    • Avoid mixing levels of abstraction (explain the concept in the introduction and leave numeric values to the experiment section; say what tool was used in the experiment section, not in the introduction).
    • Brief, illustrative examples to motivate.
  • Results:
    • Use tables and figures to summarize, show the results of statistical analysis, and compare like with like (a simple but common mistake: “The results from this study are higher than the other study” compares ‘results’ to a ‘study’, not ‘results’ to ‘results’).
    • Don’t duplicate data among tables, figures and text, use graphics for summarization of text (avoid large tables with many numbers).
    • Graphics: stand-alone captions, easy to interpret, only essential information; don’t overuse colors in charts (alternative: different line types).
    • Clear legend, better organized data, present trend lines, don’t leave areas underutilized.
  • Discussion:
    • Study’s aim and hypothesis
    • Relating to other research
    • Avoid grand unsupported statements (e.g. “the novel organization method has enormously reduced the learning time”) and avoid introducing new terms
  • Conclusion:
    • Put your study in context
    • How it represents advance in the field
    • Suggest future experiments
    • Avoid repetition across sections (the same sentence in the abstract, introduction, discussion, and conclusion), being overly speculative, and overemphasizing the impact of the study.
  • Acknowledgement:
    • Contributions to paper: supplied materials or software, helped with writing or English, technical help.
  • References:
    • Include recent references
    • Check guide for correct format
    • Avoid citing yourself or the journal excessively, and avoid bad sources: references that are not available, Wikipedia (volatile), and local-language sources
    • A review paper requires experienced writing skills; a survey paper has to digest and synthesize the available research.
  • Supplementary material

Language essentials for a quality manuscript:

Ensure your manuscript has the three C’s below:

  1. Clarity
  2. Conciseness
  3. Correctness

Common traps: repetition, redundancy, ambiguity, exaggeration

You can make use of language editing services to polish the manuscript if required. Free tools are available online for checking surface level errors like grammar and spelling.

Ethical issues:

  • Multiple submissions, redundant publications, plagiarism, data fabrication and falsification, improper use of subjects, improper author contribution.
  • Plagiarism: check the IEEE FAQ for details. Unacceptable paraphrasing, even with citation, can still be plagiarism.

Cover letter, Revisions and Responses to reviewers:

  • Write a brief cover letter to the editor conveying why your manuscript matters to that particular journal. Suggest potential reviewers (if required).
  • Indicate if the submitted paper is an extended version of a conference paper, to avoid concerns of redundant publication.
  • Review process: draft a detailed letter of response to the reviewers. Respond to every point (accept with changes made, or reject with polite reasoning), give page and line numbers to point to revisions, and add further calculations if they make the paper stronger.
    • E.g. “Thank you for the comment. However, we feel that the assumption in our model is supported by recent work by …” rather than “the reviewer is clearly ignorant of the work of …”
  • Rejection: don’t take it personally; try to understand why, and don’t resubmit to another journal without significant revisions.
  • Journals allow a paper to be made open access for an additional fee, to reach a wider audience (if required).



Tools for automated rhetorical analysis of academic writing

Alert – Long post!

In this post, I’m presenting a summary of my review on tools for automatically analyzing rhetorical structures from academic writing.

The tools considered are designed to cater to different users and purposes. AWA and RWT aim to provide feedback for improving students’ academic writing. Mover and SAPIENTA, on the other hand, help researchers identify the structure of research articles. ‘Mover’ even allows users to give a second opinion on the classification of moves and add new training data (which can degrade the model if less expert students add incorrect examples). Nevertheless, these tools share a common thread and fulfill the following criteria:

  • They look at scientific text – Full research articles, abstracts or introductions. Tools to automate argumentative zoning of other open text (Example) are not considered.
  • They automate the identification of rhetorical structures (zones, moves) in research articles (RA), with the sentence as the unit of analysis.
  • They are broadly based on the Argumentative Zoning (AZ) scheme by Simone Teufel or the CARS model by John Swales (either the original schema or a modified version of it).

Tools (in alphabetical order):

  1. Academic Writing Analytics (AWA) – Summary notes here

AWA also has a reflective parser to give feedback on students’ reflective writing, but the focus of this post is on the analytical parser. AWA demo, video courtesy of Dr. Simon Knight:

  2. Mover – Summary notes here

Available for download as a stand-alone application. Sample screenshot below:


  3. Research Writing Tutor (RWT) – Summary notes here

RWT demo, video courtesy of Dr. Elena Cotos:

  4. SAPIENTA – Summary notes here.

Available for download as a stand-alone Java application, or accessible as a web service. Sample screenshot of tagged output from the SAPIENTA web service below:

Annotation Scheme:

The schemes aim to be applicable to all academic writing, and this has been successfully tested across data from different disciplines. A comparison of the schemes used by the tools is shown in the table below:

  • AWA – AWA Analytical scheme (modified from AZ for sentence-level parsing):
    • Summarizing
    • Background knowledge
    • Contrasting ideas
    • Open question
  • Mover – Modified CARS model (three main moves with further steps):
    1. Establish a territory
      • Claim centrality
      • Generalize topics
      • Review previous research
    2. Establish a niche
      • Counter-claim
      • Indicate a gap
      • Raise questions
      • Continue a tradition
    3. Occupy the niche
      • Outline purpose
      • Announce research
      • Announce findings
      • Evaluate research
      • Indicate RA structure
  • RWT – Modified CARS model (3 moves, 17 steps):
    Move 1. Establishing a territory
      1. Claiming centrality
      2. Making topic generalizations
      3. Reviewing previous research
    Move 2. Identifying a niche
      4. Indicating a gap
      5. Highlighting a problem
      6. Raising general questions
      7. Proposing general hypotheses
      8. Presenting a justification
    Move 3. Addressing the niche
      9. Introducing present research descriptively
      10. Introducing present research purposefully
      11. Presenting research questions
      12. Presenting research hypotheses
      13. Clarifying definitions
      14. Summarizing methods
      15. Announcing principal outcomes
      16. Stating the value of the present research
      17. Outlining the structure of the paper
  • SAPIENTA – Finer-grained AZ scheme (CoreSC scheme, 11 categories in the first layer):
    • Background (BAC)
    • Hypothesis (HYP)
    • Motivation (MOT)
    • Goal (GOA)
    • Object (OBJ)
    • Method (MET)
    • Model (MOD)
    • Experiment (EXP)
    • Observation (OBS)
    • Result (RES)
    • Conclusion (CON)
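For anyone who wants to work with these schemes programmatically, the comparison above translates naturally into a small data structure. The sketch below encodes RWT’s modified CARS scheme (3 moves, 17 steps) as a Python mapping; the labels are taken from the table, but the variable name and layout are my own:

```python
# Illustrative encoding of the modified CARS scheme used by RWT
# (3 moves, 17 steps), e.g. as a label set for a sentence classifier.
CARS_SCHEME = {
    "Move 1: Establishing a territory": [
        "Claiming centrality",
        "Making topic generalizations",
        "Reviewing previous research",
    ],
    "Move 2: Identifying a niche": [
        "Indicating a gap",
        "Highlighting a problem",
        "Raising general questions",
        "Proposing general hypotheses",
        "Presenting a justification",
    ],
    "Move 3: Addressing the niche": [
        "Introducing present research descriptively",
        "Introducing present research purposefully",
        "Presenting research questions",
        "Presenting research hypotheses",
        "Clarifying definitions",
        "Summarizing methods",
        "Announcing principal outcomes",
        "Stating the value of the present research",
        "Outlining the structure of the paper",
    ],
}

# Sanity check: 3 moves, 17 steps in total.
print(len(CARS_SCHEME), sum(len(steps) for steps in CARS_SCHEME.values()))
```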


The tools are built on different data sets and methods for automating the analysis. Most of them use manually annotated data as a standard for training the model to automatically classify the categories. Details below:

  • AWA – any research writing; NLP rule-based: Xerox Incremental Parser (XIP) to annotate rhetorical functions in discourse.
  • Mover – abstracts; supervised learning: Naïve Bayes classifier with data represented as a bag of clusters with location information.
  • RWT – introductions; supervised learning: Support Vector Machine (SVM) with an n-dimensional vector representation and n-gram features.
  • SAPIENTA – full articles; supervised learning: SVM with sentence aspect features, plus sequence labelling with Conditional Random Fields (CRF) for sentence dependencies.
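To make the supervised approach concrete, here is a minimal sketch of a Mover-style move classifier: sentences represented as bags of words, fed to a Naïve Bayes model. The sentences and labels are invented toy examples, not the tools’ actual training data, and Mover’s real representation uses word clusters with location information rather than raw words:

```python
# Sketch of a Mover-style move classifier: bag-of-words features and a
# Naive Bayes model (toy sentences and labels, invented for illustration).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hand-made examples loosely following the three CARS moves.
sentences = [
    "Writing quality has become a central concern in higher education.",
    "Academic writing skills are increasingly important for researchers.",
    "However, little is known about automated feedback on rhetorical moves.",
    "Few studies have examined move classification at the sentence level.",
    "In this paper we present a tool for analysing rhetorical structure.",
    "The purpose of this study is to classify sentences into moves.",
]
labels = ["move1", "move1", "move2", "move2", "move3", "move3"]

# Vectorize each sentence into word counts, then fit the classifier.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(sentences, labels)

print(model.predict(["We present a system that classifies sentences into moves."]))
```

In practice the training corpora behind these tools are large manually annotated datasets, and the real systems add richer features on top of this basic pipeline.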


  • SciPo tool helps students write summaries and introductions for scientific texts in Portuguese.
  • Another tool CARE is a word concordancer used to search for words and moves from research abstracts- Summary notes here.
  • An ML approach considering three different schemes for annotating scientific abstracts (no tool).

If you think I’ve missed a tool which does similar automated tagging in research articles, do let me know so I can include it in my list 🙂

Notes: Discourse classification into rhetorical functions

Reference: Cotos, E., & Pendar, N. (2016). Discourse classification into rhetorical functions for AWE feedback. CALICO Journal, 33(1), 92.


Background:

  • Computational techniques can be exploited to provide individualized feedback to learners on their writing.
  • Genre analysis of writing identifies moves (communicative goals) and steps (rhetorical functions that help achieve the goal) [Swales, 1990].
  • Natural language processing (NLP) and machine learning categorization approaches are widely used to automatically identify discourse structures (e.g. Mover, prior work on IADE).


Aim:

  • To develop an automated analysis system, ‘Research Writing Tutor‘ (RWT), that identifies rhetorical structures (moves and steps) in research writing and provides feedback to students.


Method:

  • Sentence-level analysis – each sentence is classified into a move and a step within that move.
  • Data: Introduction section from 1020 articles – 51 disciplines, each discipline containing 20 articles, total of 1,322,089 words.
  • Annotation Scheme:
    • 3 moves, 17 steps – Refer Table 1 from the original paper for detailed annotation scheme (Based on the CARS model).
    • Manual annotation using XML based markup by the Callisto Workbench.
  • Supervised learning approach steps:
    1. Feature selection:
      • Important features – unigrams, trigrams
      • n-gram feature set contained 5,825 unigrams and 11,630 trigrams for moves, and 27,689 unigrams and 27,160 trigrams for steps.
    2. Sentence representation:
      • Each sentence is represented as an n-dimensional vector in Euclidean space.
      • Boolean representation to indicate presence or absence of feature in sentence.
    3. Training classifier:
      • SVM model for classification.
      • 10-fold cross validation.
      • Precision was higher than recall – 70.3% versus 61.2% for the move classifier and 68.6% versus 55% for the step classifier – since the objective was to maximize accuracy.
      • RWT analyzer has two cascaded SVM – move classifier followed by step classifier.
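The feature-selection, representation, and training steps above can be sketched end to end. The toy example below mirrors the described setup (boolean presence/absence vectors over unigram and trigram features, fed to a linear SVM), but the sentences and labels are invented, and with six examples there is obviously no room for the paper’s 10-fold cross-validation or its second cascaded step classifier:

```python
# Sketch of the first stage of an RWT-style cascade: a move classifier
# trained on boolean unigram + trigram features (toy data, invented labels).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import FeatureUnion, make_pipeline
from sklearn.svm import LinearSVC

# Unigrams and trigrams only (no bigrams), presence/absence rather than
# counts, mirroring the feature sets described in the paper.
features = FeatureUnion([
    ("unigrams", CountVectorizer(binary=True, ngram_range=(1, 1))),
    ("trigrams", CountVectorizer(binary=True, ngram_range=(3, 3))),
])

move_classifier = make_pipeline(features, LinearSVC())

# Hypothetical labelled sentences standing in for the annotated corpus.
examples = [
    "Writing is a key skill for graduate students.",
    "Academic writing is central to research training.",
    "However, few tools give feedback on rhetorical moves.",
    "Little attention has been paid to move-level feedback.",
    "This paper introduces an automated move classifier.",
    "We present a system that classifies sentences into moves.",
]
moves = ["move1", "move1", "move2", "move2", "move3", "move3"]

move_classifier.fit(examples, moves)
print(move_classifier.predict(["This paper introduces an automated move classifier."]))
```

In the paper’s cascaded design, a second SVM trained on step labels would then run on each sentence, conditioned on the predicted move.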


Results:

  • The move and step classifiers predict some elements better than others (refer to the paper for detailed results):
    • Move 2 was the most difficult to identify (sparse training data).
    • Move 1 had the best recall (less ambiguous cues).
    • 10 of the 17 steps were predicted well.
    • Overall move accuracy of 72.6% and step accuracy of 72.9%.

Future Work:

  • Moving beyond sentence level to incorporate context information and sequence of moves/steps.
  • Knowledge-based approach for hard-to-identify steps – hand-written rules and patterns.
  • Voting algorithm using independent analyzers.
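The voting idea in the last bullet can be illustrated in a few lines: given the labels predicted by several independent analyzers for a sentence, output the majority label. The analyzer outputs here are hypothetical:

```python
# Minimal sketch of combining independent analyzers by majority vote
# (the three analyzer predictions below are made up for illustration).
from collections import Counter

def majority_vote(predictions):
    """Return the most common label among the analyzers' predictions."""
    return Counter(predictions).most_common(1)[0][0]

# Three hypothetical analyzers disagree on a sentence's move:
print(majority_vote(["move2", "move2", "move3"]))  # → move2
```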