In this post, I present a summary of my review of tools that automatically analyze rhetorical structures in academic writing.
The tools considered cater to different users and purposes. AWA and RWT aim to provide feedback to improve students’ academic writing. Mover and SAPIENTA, on the other hand, help researchers identify the structure of research articles. Mover even allows users to give a second opinion on the classification of moves and to add new training data (this can make the model less accurate if students with less expertise add incorrect training data). Despite these differences, the tools share a common thread and meet the following criteria:
They look at scientific text: full research articles, abstracts or introductions. Tools that automate argumentative zoning of other open text (Example) are not considered.
They automate the identification of rhetorical structures (zones, moves) in research articles (RA), with the sentence as the unit of analysis.
They are broadly based on the Argumentative Zoning (AZ) scheme by Simone Teufel or the CARS model by John Swales (either the original scheme or a modified version of it).
They are available for download as a standalone Java application or can be accessed as a web service. Sample screenshot of tagged output from the SAPIENTA web service below:
The schemes aim to be applicable to all academic writing, and this has been tested successfully on data from different disciplines. The table below compares the schemes used by the tools:
Source & Description
AWA: AWA Analytical scheme (modified from AZ for sentence-level parsing)
Mover: Modified CARS model
-Three main moves and further steps
1. Establish a territory
-Review previous research
2. Establish a niche
-Indicate a gap
-Continue a tradition
3. Occupy the niche
-Indicate RA structure
RWT: Modified CARS model
-3 moves, 17 steps
Move 1. Establishing a territory
-1. Claiming centrality
-2. Making topic generalizations
-3. Reviewing previous research
Move 2. Identifying a niche
-4. Indicating a gap
-5. Highlighting a problem
-6. Raising general questions
-7. Proposing general hypotheses
-8. Presenting a justification
Move 3. Addressing the niche
-9. Introducing present research descriptively
-10. Introducing present research purposefully
-11. Presenting research questions
-12. Presenting research hypotheses
-13. Clarifying definitions
-14. Summarizing methods
-15. Announcing principal outcomes
-16. Stating the value of the present research
-17. Outlining the structure of the paper
SAPIENTA: Finer-grained AZ scheme
-CoreSC scheme with 11 categories in the first layer
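To make sentence-level move tagging a little more concrete, here is a minimal sketch of a cue-phrase tagger. It is only an illustration: the cue phrases, the `CUE_RULES` table and the function names are assumptions made for this example, and none of the tools above is claimed to work this way.

```python
import re

# Hypothetical cue phrases for three CARS-style moves (illustration only).
# Rules are tried in order and the first match wins.
CUE_RULES = [
    ("Move 1: Establishing a territory",
     r"\b(previous|prior) (research|studies|work)\b|\bhas been widely studied\b"),
    ("Move 2: Identifying a niche",
     r"\bhowever\b|\blittle is known\b|\bremains unclear\b"),
    ("Move 3: Addressing the niche",
     r"\bthis (paper|study|article)\b|\bwe (propose|present)\b"),
]


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tag_moves(text):
    """Return a list of (move label, sentence) pairs, one per sentence."""
    tagged = []
    for sentence in split_sentences(text):
        label = "Unclassified"
        for move, pattern in CUE_RULES:
            if re.search(pattern, sentence, flags=re.IGNORECASE):
                label = move
                break
        tagged.append((label, sentence))
    return tagged


if __name__ == "__main__":
    sample = (
        "Automated feedback has been widely studied. "
        "However, little is known about its use in thesis writing. "
        "In this paper, we present a review of four tools."
    )
    for label, sentence in tag_moves(sample):
        print(f"[{label}] {sentence}")
```

The sentence is the unit of analysis here, just as in the tools reviewed: each sentence gets exactly one label, and anything without a known cue stays unclassified.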
The tools are built on different data sets and methods for automating the analysis. Most of them use manually annotated data as a standard for training the model to automatically classify the categories. Details below:
AWA: Any research writing. Rule-based NLP, using the Xerox Incremental Parser (XIP) to annotate rhetorical functions in discourse.
Mover: Supervised learning with a Naïve Bayes classifier, with data represented as a bag of clusters with location information.
RWT: Supervised learning with a Support Vector Machine (SVM), using an n-dimensional vector representation and n-gram features.
SAPIENTA: Supervised learning using an SVM with sentence aspect features, plus sequence labelling using Conditional Random Fields (CRF) to capture sentence dependencies.
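As a rough illustration of the supervised approach, here is a minimal sketch of a Naïve Bayes sentence classifier with add-one smoothing. It is not the implementation of any of the tools above: plain word unigrams and bigrams stand in for Mover's clusters, the location feature is a coarse early/late flag, and the training sentences, labels and class name are made up for the example.

```python
import math
from collections import Counter, defaultdict


def features(sentence, position):
    """Unigrams and bigrams plus a coarse location feature.

    `position` is the sentence's relative position in the text, in [0, 1].
    """
    tokens = sentence.lower().replace(",", " ").replace(".", " ").split()
    grams = tokens + [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    grams.append("LOC=early" if position < 0.5 else "LOC=late")
    return grams


class NaiveBayesMoveClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, examples):
        # examples: list of (sentence, relative position, move label)
        self.class_counts = Counter()
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()
        for sentence, position, label in examples:
            self.class_counts[label] += 1
            for f in features(sentence, position):
                self.feature_counts[label][f] += 1
                self.vocab.add(f)
        return self

    def predict(self, sentence, position):
        total = sum(self.class_counts.values())
        best_label, best_score = None, float("-inf")
        for label, count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(count / total)
            denom = sum(self.feature_counts[label].values()) + len(self.vocab)
            for f in features(sentence, position):
                score += math.log((self.feature_counts[label][f] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, hand-written training data (hypothetical, not from any real corpus).
TRAINING = [
    ("Previous research has examined automated writing feedback.", 0.1, "territory"),
    ("Many studies have reviewed feedback tools.", 0.2, "territory"),
    ("However, little is known about thesis writing.", 0.5, "niche"),
    ("However, few studies address this gap.", 0.6, "niche"),
    ("In this paper we present a new tool.", 0.8, "occupy"),
    ("This paper reports our evaluation results.", 0.9, "occupy"),
]

if __name__ == "__main__":
    model = NaiveBayesMoveClassifier().fit(TRAINING)
    print(model.predict("However, this gap remains unexplored.", 0.5))
```

The real tools train on manually annotated corpora that are far larger than this, and the SVM and CRF approaches differ mainly in the classifier and in how they model dependencies between neighbouring sentences.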
The SciPo tool helps students write summaries and introductions for scientific texts in Portuguese.
Another tool, CARE, is a word concordancer used to search for words and moves in research abstracts (summary notes here).
Reading research articles can be a daunting task for new students. Even after reading many articles over the last few years, I still need time to read, understand and critically evaluate research articles (theoretical ones take me twice as long, since I come from a technical background). I’m no expert on this topic (or any topic for that matter :p), but I thought this post could be useful for fellow students who toil with research papers just like me. The post combines a few tips from my own experience with material from a useful course I attended at UTS by Dr. Terry Royce (Reading & writing for your Literature Review: Getting started and what to look for).
The first thing to keep in mind while reading articles is that it is a time-consuming process. So do not get dejected if it takes longer than the time you allotted. Not everyone reads at the same pace. Give yourself more time, especially if you’re reading about a new topic. Your reading skills will definitely improve with experience.
Concentration is key, so take a break and refresh your mind if you’re stuck with an article for very long. How I wish it was as easy as reading a fiction novel for hours with absolute concentration… Sigh!!
Read articles in whichever form is comfortable for you; soft or hard copy is your choice. I recently moved from hard copy to soft copy, since soft copies are easily portable and make it more convenient to look up my notes anytime. I still print important articles and make them ugly with highlights though 🙂 Managing and organising all the articles you’ve read or are going to read is another arduous task, which you should probably plan for early!
Now for the ‘real strategies’ for reading:
Read widely and extensively. When, after extensive reading, you get a fuzzy sense of the boundary (new articles no longer add anything new), that’s when you stop. PhD students might want to read over 300 articles before writing their thesis 😮
Learn ‘purposeful focused reading’ – you don’t necessarily have to read a whole book if you only need a chapter of it. Similarly, you can read only what you need in an article.
Employ these reading strategies:
Reviewing (looking at title, keywords and flipping through)