CHI’24 research publications

I attended the prestigious Human-Computer Interaction (HCI) conference CHI’24 in Honolulu, Hawaii in May 2024. While I’m quite familiar with the field of HCI, it was my first time attending the conference because it is much broader than my main areas of research (Learning Analytics, AI in education, and Writing Analytics). The sheer scale of the conference (~2K to 3K attendees) and the broad range of topics it covers (check out the full program) are almost impossible to fully grasp!

TLDR; Go to the end for the list of papers from CHI’24.

My personal highlight was the Intelligent Writing Assistants Workshop, which was running for the third time at CHI, organized by a bunch of fun people who are all super keen on researching the use of AI to assist writing. Picture from our workshop below (thanks, Thiemo, for the LinkedIn post).

Pictured: Participants of the Intelligent Writing Assistants CHI’24 workshop at the end of the session

The workshop had many mini presentations on the overall theme of Dark Sides: Envisioning, Understanding, and Preventing Harmful Effects of Writing Assistants. I presented my work with Prof. Simon Buckingham Shum on AI-Assisted Writing in Education: Ecosystem Risks and Mitigations, where we examined key factors in the broader socio-technical ecosystem (often hidden) that need to be considered when implementing AI writing assistants at scale in educational contexts.


This was a deep dive into the Ecosystem aspect of a larger piece of work we presented at CHI on A Design Space for Intelligent and Interactive Writing Assistants. The full paper mapped the design space of intelligent writing assistants by reviewing 115 papers from HCI and NLP, with a team of 36 authors led by Mina Lee.

Figure: Design space for intelligent and interactive writing assistants consisting of five key aspects—task, user, technology, interaction, and ecosystem from our full paper.

An interactive tool is also available to explore the literature in detail.


I also had a late-breaking work poster presentation on Critical Interaction with AI on Written Assessment (I have a separate post about it!) where we explored how students engaged with generative AI tools like ChatGPT for their writing tasks, and whether they were able to navigate this interaction critically.

A cherished memory to hold on to was also the time I spent with my friend Vanessa, who is currently a Research Fellow at Monash University, during this trip to Hawaii. Vanessa and I started our PhDs together at the Connected Intelligence Centre at UTS ~8 years ago, and it was really nice to catch up after a long time (along with a few others). I had also just visited Monash University’s CoLAM a week before for a talk and to meet fellow Learning Analytics researchers, hosted by her and Roberto. The group does interesting work in Learning Analytics that is worth checking out.

6 years apart… On the left: Vanessa and I in 2018 while attending AIED/ICLS 2018 in London; on the right: us while attending CHI 2024 in Hawaii.


TLDR -> Research publications:

Here are all the papers from the work we presented at CHI’24:

Mina Lee, Katy Ilonka Gero, John Joon Young Chung, Simon Buckingham Shum, Vipul Raheja, Hua Shen, Subhashini Venugopalan, Thiemo Wambsganss, David Zhou, Emad A. Alghamdi, Tal August, Avinash Bhat, Madiha Zahrah Choksi, Senjuti Dutta, Jin L.C. Guo, Md Naimul Hoque, Yewon Kim, Simon Knight, Seyed Parsa Neshaei, Agnia Sergeyuk, Antonette Shibani, Disha Shrivastava, Lila Shroff, Jessi Stark, Sarah Sterman, Sitong Wang, Antoine Bosselut, Daniel Buschek, Joseph Chee Chang, Sherol Chen, Max Kreminski, Joonsuk Park, Roy Pea, Eugenia H. Rho, Shannon Zejiang Shen, and Pao Siangliulue. 2024. A Design Space for Intelligent and Interactive Writing Assistants. In Proceedings of the CHI Conference on Human Factors in Computing Systems (CHI ’24), May 11–16, 2024, Honolulu, HI, USA. ACM, New York, NY, USA, 33 pages. https://doi.org/10.1145/3613904.3642697

Antonette Shibani, Simon Knight, Kirsty Kitto, Ajanie Karunanayake, Simon Buckingham Shum (2024). Untangling Critical Interaction with AI in Students’ Written Assessment. Extended Abstracts of the CHI Conference on Human Factors in Computing Systems (CHI ’24), May 11–16, 2024, Honolulu, HI, USA. https://doi.org/10.1145/3613905.3651083

Antonette Shibani & Simon Buckingham Shum (2024). AI-Assisted Writing in Education: Ecosystem Risks and Mitigations. In The Third Workshop on Intelligent and Interactive Writing Assistants @ CHI ’24, Honolulu, HI, USA. https://arxiv.org/abs/2404.10281

Tamil Co-Writer: Inclusive AI for writing support

Next week, I’m presenting my work at the First Workshop on Generative AI for Learning Analytics (GenAI-LA) at the 14th International Conference on Learning Analytics and Knowledge (LAK 2024):

Antonette Shibani, Faerie Mattins, Srivarshan Selvaraj, Ratnavel Rajalakshmi & Gnana Bharathy (2024) Tamil Co-Writer: Towards inclusive use of generative AI for writing support. In Joint Proceedings of LAK 2024 Workshops, co-located with 14th International Conference on Learning Analytics and Knowledge (LAK 2024), Kyoto, Japan, March 18-22, 2024.

With colleagues in India, we developed Tamil Co-Writer, a GenAI-supported writing tool that offers AI suggestions for writing in Tamil, a regional Indian language (and my first language). The majority of AI-based writing assistants are created for English-language users and do not address the needs of linguistically diverse groups of learners. Catering to languages typically under-represented in NLP is important in the generative AI era for the inclusive use of AI in learner support. Combined with analytics on AI usage, the tool can offer writers improved productivity and a chance to reflect on their optimal/sub-optimal collaborations with AI.

The tool combined the following elements:

  1. An interactive AI writing environment that offers several input modes to write in Tamil
  2. Analytics of the writer’s AI interactions in the session for reflection (see post on CoAuthorViz for details, and related paper here)

A short video summarising the key insights from the paper is below:

Understanding human-AI collaboration in writing (CoAuthorViz)

Generative AI (GenAI) has captured global attention since ChatGPT was publicly released in November 2022. The remarkable capabilities of AI have sparked a myriad of discussions around its vast potential, ethical considerations, and transformative impact across diverse sectors, including education. In particular, how humans can learn to work with AI to augment their intelligence rather than undermine it greatly interests many communities.

My own interest in writing research led me to explore human-AI partnerships for writing. We are not far from generative AI becoming part of everyday writing, with co-pilots the norm rather than the exception. It is possible that a ubiquitous tool like Microsoft Word, which many use as their preferred platform for digital writing, will come with AI support as an essential feature (and early research shows how people are imagining these) for improved productivity. But at what cost?

In our recent full paper, we explored an analytic approach to study writers’ support-seeking behaviour and dependence on AI in a co-writing environment:

Antonette Shibani, Ratnavel Rajalakshmi, Srivarshan Selvaraj, Faerie Mattins, Simon Knight (2023). Visual representation of co-authorship with GPT-3: Studying human-machine interaction for effective writing. In M. Feng, T. Käser, and P. Talukdar, editors, Proceedings of the 16th International Conference on Educational Data Mining, pages 183–193, Bengaluru, India, July 2023. International Educational Data Mining Society [PDF].

Using keystroke data from the interactive writing environment CoAuthor powered by GPT-3, we developed CoAuthorViz (see example figure below) to characterize writer interaction with AI feedback. CoAuthorViz captured key constructs such as the writer incorporating a GPT-3 suggested text as is (GPT-3 suggestion selection), the writer not incorporating a GPT-3 suggestion (empty GPT-3 call), the writer modifying the suggested text (GPT-3 suggestion modification), and the writer’s own writing (user text addition). We demonstrated how such visualizations (and associated metrics) help characterise varied levels of AI interaction in writing, from low to high dependency on AI.
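To make the constructs concrete, here is a minimal Python sketch of how events from a CoAuthor-style log could be bucketed into these four categories and summarised into a simple AI-dependency metric. The event format and field names (`type`, `accepted`, `edited_after`) are simplified assumptions for illustration, not the actual CoAuthor schema.

```python
from collections import Counter

# Simplified event records; real CoAuthor logs are richer than this.
# Assumed fields: "type" in {"suggestion_open", "suggestion_select", "text_insert"},
# plus flags for whether a suggestion was accepted and later edited.

def classify_events(events):
    """Bucket writing events into the four CoAuthorViz constructs."""
    counts = Counter()
    for ev in events:
        if ev["type"] == "suggestion_open" and not ev.get("accepted"):
            counts["empty_gpt3_call"] += 1                    # suggestion requested but not used
        elif ev["type"] == "suggestion_select":
            if ev.get("edited_after", False):
                counts["gpt3_suggestion_modification"] += 1   # accepted then modified
            else:
                counts["gpt3_suggestion_selection"] += 1      # accepted as is
        elif ev["type"] == "text_insert":
            counts["user_text_addition"] += 1                 # writer's own text
    return counts

def ai_dependency(counts):
    """Share of writing actions that came from AI suggestions (0 = none, 1 = all)."""
    ai = counts["gpt3_suggestion_selection"] + counts["gpt3_suggestion_modification"]
    total = ai + counts["user_text_addition"]
    return ai / total if total else 0.0

# Example usage with a toy event stream
events = [
    {"type": "text_insert", "text": "Climate change is"},
    {"type": "suggestion_open", "accepted": False},
    {"type": "suggestion_select", "edited_after": False},
    {"type": "suggestion_select", "edited_after": True},
]
counts = classify_events(events)
print(counts, round(ai_dependency(counts), 2))
```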

Figure: CoAuthorViz legend and three samples of AI-assisted writing (squares denote writer written text, and triangles denote AI suggested text)

Full details of the work can be found in the resources below:

Several complex questions are yet to be answered:

  • Is autonomy (self-writing, without AI support) preferable to better quality writing (with AI support)?
  • As AI becomes embedded into our everyday writing, do we lose our own writing skills? And if so, is that of concern, or will writing become one of those outdated skills in the future that AI can do much better than humans?
  • Do we lose our ‘uniquely human’ attributes if we continue to write with AI?
  • What is an acceptable use of AI in writing that still lets you think? (We know by writing we think more clearly; would an AI tool providing the first draft restrict our thinking?)
  • What knowledge and skills do writers need to use AI tools appropriately?

Edit: If you want to delve into the topic further, here’s an intriguing article that imagines how writing might look in the future: https://simon.buckinghamshum.net/2023/03/the-writing-synth-hypothesis/

Questioning Learning Analytics – Cultivating critical engagement (LAK’22)

Gist of LAK 22 paper

Our full research paper has been nominated for Best Paper at the prestigious Learning Analytics and Knowledge (LAK) Conference:

Antonette Shibani, Simon Knight and Simon Buckingham Shum (2022, Forthcoming). Questioning learning analytics? Cultivating critical engagement as student automated feedback literacy. [BEST RESEARCH PAPER NOMINEE] The 12th International Learning Analytics & Knowledge Conference (LAK ’22).

Here’s the gist of what the paper talks about:

  • Learning Analytics (LA) still requires substantive evidence of impact on educational practice. A human-centered approach can bring about better uptake of LA.
  • We need critical engagement and interaction with LA to help tackle issues ranging from black-boxing, imperfect analytics, and the lack of explainability of algorithms and artificial intelligence systems, to the required relevant skills and capabilities of LA users when dealing with such advanced technologies.
  • Students must be able to, and should be encouraged to, question analytics in student-facing LA systems, as critical engagement is a metacognitive capacity that both demonstrates and builds student understanding.
  • This puts power back in the hands of users and empowers them with agency when using LA.
  • Critical engagement with LA should be facilitated with careful design for learning; we provide an example case with automated writing feedback – see the paper for details on what the design involved.
  • We show empirical data and findings from student annotations of automated feedback from AcaWriter, aiming to develop students’ automated feedback literacy.

The full paper is available for download at this link: [Author accepted manuscript pdf].

This paper was the hardest for me to write personally since I was running on 2-3 hours of sleep right after joining work part-time following my maternity leave. Super stoked to hear about the best paper nomination, as my work as a new mum paid off. Good to be back at work while also taking care of the little bubba 🙂 Thanks to my co-authors for accommodating my writing request really close to the deadline!

Also, workshops coming up in LAK22:

  • Antonette Shibani, Andrew Gibson, Simon Knight, Philip H Winne, Diane Litman (2022, Forthcoming). Writing Analytics for higher-order thinking skills. Accepted workshop at The 12th International Learning Analytics & Knowledge Conference (LAK ’22).
  • Yi-Shan Tsai, Melanie Peffer, Antonette Shibani, Isabel Hilliger, Bodong Chen, Yizhou Fan, Rogers Kaliisa, Nia Dowell and Simon Knight (2022, Forthcoming). Writing for Publication: Engaging Your Audience. Accepted workshop at The 12th International Learning Analytics & Knowledge Conference (LAK ’22).

Automated Writing Feedback in AcaWriter

You might be familiar with my research in the field of Writing Analytics, particularly Automated Writing Feedback, during my PhD and beyond. The work is based on an automated feedback tool called AcaWriter (previously called Automated Writing Analytics/AWA), which we developed at the Connected Intelligence Centre, University of Technology Sydney.

Recently, we have put together resources to spread the word and introduce the tool to anyone who wants to learn more. The first is an introductory blog post I wrote for the Society for Learning Analytics Research (SoLAR) Nexus publication. You can access the full blog post here: https://www.solaresearch.org/2020/11/acawriter-designing-automated-feedback-on-writing-that-teachers-and-students-trust/

We also ran a two-hour online workshop as part of a LALN event to add more detail and resources for others to participate. Details are here: http://wa.utscic.edu.au/events/laln-2020-workshop/

Video recording from the event is available for replay:

Learn more: https://cic.uts.edu.au/tools/awa/

Automated Revision Graphs – AIED 2020

I’ve recently had my writing analytics work published at the 21st International Conference on Artificial Intelligence in Education (AIED 2020), where the theme was “Augmented Intelligence to Empower Education”. It is a short paper describing a text analysis and visualisation method to study revisions. It introduced ‘Automated Revision Graphs’ to study revisions in short texts at a sentence level by visualising text as a graph, with open-source code.

Shibani A. (2020) Constructing Automated Revision Graphs: A Novel Visualization Technique to Study Student Writing. In: Bittencourt I., Cukurova M., Muldner K., Luckin R., Millán E. (eds) Artificial Intelligence in Education. AIED 2020. Lecture Notes in Computer Science, vol 12164. Springer, Cham. [pdf] https://doi.org/10.1007/978-3-030-52240-7_52
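To give a rough sense of the idea (a minimal sketch only, not the paper’s actual open-source implementation), the Python below builds a small sentence-level revision graph: each sentence in each draft becomes a node, and edges link a sentence to its kept or revised counterpart in the next draft based on a simple similarity ratio. The node/edge representation and the similarity threshold are my own simplifications.

```python
import difflib

def revision_graph(drafts, threshold=0.6):
    """Build a sentence-level revision graph across successive drafts.

    drafts: list of drafts, each a list of sentences (strings).
    Returns nodes as (draft_index, sentence_index) and edges labelled
    'kept' (identical) or 'revised' (similar above the threshold).
    """
    nodes = [(d, s) for d, draft in enumerate(drafts) for s in range(len(draft))]
    edges = []
    for d in range(len(drafts) - 1):
        for i, sent in enumerate(drafts[d]):
            for j, nxt in enumerate(drafts[d + 1]):
                sim = difflib.SequenceMatcher(None, sent, nxt).ratio()
                if sim == 1.0:
                    edges.append(((d, i), (d + 1, j), "kept"))
                elif sim >= threshold:
                    edges.append(((d, i), (d + 1, j), "revised"))
    return nodes, edges

# Toy example: two drafts, one sentence revised and one kept
drafts = [
    ["AI tools support writing.", "They also raise concerns."],
    ["AI tools can support student writing.", "They also raise concerns."],
]
nodes, edges = revision_graph(drafts)
print(edges)
```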

I did a short introductory video for the conference, which can be viewed below:

I also had another paper I co-authored on multimodal learning analytics, led by Roberto Martinez, which received the best paper award at the conference. The main contribution of the paper is a set of conceptual mappings from x-y positional data (captured from sensors) to meaningful, measurable constructs in physical classroom movements, grounded in the theory of Spatial Pedagogy. Great effort by the team!
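To give a flavour of that kind of mapping (a hedged sketch only, not the Moodoo metrics themselves), the Python below turns a teacher’s x-y positional trace into time spent in classroom zones such as the lectern or student desks. The zone boundaries, layout, and sampling interval are invented for illustration.

```python
# Minimal sketch: mapping x-y positional samples to time spent in classroom zones.
# Zone rectangles and the 1-second sampling interval are illustrative assumptions.

ZONES = {
    "lectern":       (0.0, 0.0, 2.0, 2.0),   # (x_min, y_min, x_max, y_max) in metres
    "student_desks": (2.0, 0.0, 8.0, 6.0),
}

def zone_of(x, y):
    """Return the name of the first zone containing the point, else 'other'."""
    for name, (x0, y0, x1, y1) in ZONES.items():
        if x0 <= x <= x1 and y0 <= y <= y1:
            return name
    return "other"

def time_per_zone(trace, sample_seconds=1.0):
    """trace: list of (x, y) samples captured at a fixed rate."""
    totals = {}
    for x, y in trace:
        z = zone_of(x, y)
        totals[z] = totals.get(z, 0.0) + sample_seconds
    return totals

# Example usage with a toy positional trace
trace = [(0.5, 1.0), (0.8, 1.2), (3.5, 2.0), (4.0, 2.5), (4.1, 2.4)]
print(time_per_zone(trace))   # e.g. {'lectern': 2.0, 'student_desks': 3.0}
```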

Details of the second paper can be found here:

Martinez-Maldonado R., Echeverria V., Schulte J., Shibani A., Mangaroska K., Buckingham Shum S. (2020) Moodoo: Indoor Positioning Analytics for Characterising Classroom Teaching. In: Bittencourt I., Cukurova M., Muldner K., Luckin R., Millán E. (eds) Artificial Intelligence in Education. AIED 2020. Lecture Notes in Computer Science, vol 12163. Springer, Cham. [pdf] https://doi.org/10.1007/978-3-030-52237-7_29

Notes: ‘Digital support for academic writing: A review of technologies and pedagogies’

I came across this review article on writing tools published in 2019 and wanted to make some quick notes to come back to in this post. I’m following the usual format I use for article notes, which summarizes the gist of a paper with short descriptions under respective headers. I also had a few thoughts on what the paper missed, which I describe in this post.

Reference:

Carola Strobl, Emilie Ailhaud, Kalliopi Benetos, Ann Devitt, Otto Kruse, Antje Proske, Christian Rapp (2019). Digital support for academic writing: A review of technologies and pedagogies. Computers & Education, 131, 33–48.

Aim:

  • To present a review of the technologies designed to support writing instruction in secondary and higher education.

Method:

Data collection:

  • Writing tools collected from two sources: 1) Systematic search in literature databases and search engines, 2) Responses from the online survey sent to research communities on writing instruction.
  • 44 tools selected for fine-grained analysis.

Tools selected:

Academic Vocabulary
Article Writing Tool
AWSuM
C-SAW (Computer-Supported Argumentative Writing)
Calliope
Carnegie Mellon prose style tool
CohVis
Corpuscript
Correct English (Vantage Learning)
Criterion
De-Jargonizer
Deutsch-uni online
DicSci (Dictionary of Verbs in Science)
Editor (Serenity Software)
escribo
Essay Jack
Essay Map
Gingko
Grammark
Klinkende Taal
Lärka
Marking Mate (standard version)
My Access!
Open Essayist
Paper rater
PEG Writing
Rationale
RedacText
Research Writing Tutor
Right Writer
SWAN (Scientific Writing Assistant)
Scribo – Research Question and Literature Search Tool
StyleWriter
Thesis Writer
Turnitin (Revision Assistant)
White Smoke
Write&Improve
WriteCheck
Writefull

Inclusion and exclusion criteria:

  • Tools intended solely for primary and secondary education were excluded, since the main focus of the paper was on higher education.
  • Tools with a sole focus on features like grammar, spelling, style, or plagiarism detection were excluded.
  • Technologies without an instructional focus, like pure online text editors and tools, platforms, or content management systems, were excluded.

I have concerns about the way tools were included in this analysis, particularly because some key tools like AWA/AcaWriter, Writing Mentor, Essay Critic, and Grammarly were not considered. This is one of the main limitations I found in the study. It is not clear how the tools were selected in the systematic search, as there is no information about the databases and keywords used. How tools focusing on higher education were picked is also not explained.


Tools for automated rhetorical analysis of academic writing

Alert – Long post!

In this post, I’m presenting a summary of my review on tools for automatically analyzing rhetorical structures from academic writing.

The tools considered are designed to cater to different users and purposes. AWA and RWT aim to provide feedback for improving students’ academic writing. Mover and SAPIENTA, on the other hand, are designed to help researchers identify the structure of research articles. Mover even allows users to give a second opinion on the classification of moves and add new training data (this can lead to a less accurate model if less experienced students add potentially wrong training data). However, these tools have a common thread and fulfill the following criteria:

  • They look at scientific text – full research articles, abstracts, or introductions. Tools that automate argumentative zoning of other open text (Example) are not considered.
  • They automate the identification of rhetorical structures (zones, moves) in research articles (RA), with the sentence as the unit of analysis.
  • They are broadly based on the Argumentative Zoning (AZ) scheme by Simone Teufel or the CARS model by John Swales (either the original schema or a modified version of it).

Tools (in alphabetical order):

  1. Academic Writing Analytics (AWA) – Summary notes here

AWA also has a reflective parser to give feedback on students’ reflective writing, but the focus of this post is on the analytical parser. AWA demo, video courtesy of Dr. Simon Knight:

  2. Mover – Summary notes here

Available for download as a standalone application. Sample screenshot below:

[Screenshot: AntMover]

  3. Research Writing Tutor (RWT) – Summary notes here

RWT demo, video courtesy of Dr. Elena Cotos:

  4. SAPIENTA – Summary notes here.

Available for download as a standalone Java application, or can be accessed as a web service. Sample screenshot of tagged output from the SAPIENTA web service below:

[Screenshot: SAPIENTA web service tagged output]

Annotation Scheme:

The general aim of the schemes used is to be applicable to all academic writing, and this has been successfully tested across data from different disciplines. A comparison of the schemes used by the tools is shown below:

Tool: AWA
Annotation scheme: AWA analytical scheme (modified from AZ for sentence-level parsing)
Categories: Summarizing; Background knowledge; Contrasting ideas; Novelty; Significance; Surprise; Open question; Generalizing

Tool: Mover
Annotation scheme: Modified CARS model (three main moves and further steps)
1. Establish a territory: Claim centrality; Generalize topics; Review previous research
2. Establish a niche: Counter claim; Indicate a gap; Raise questions; Continue a tradition
3. Occupy the niche: Outline purpose; Announce research; Announce findings; Evaluate research; Indicate RA structure

Tool: RWT
Annotation scheme: Modified CARS model (3 moves, 17 steps)
Move 1. Establishing a territory: 1. Claiming centrality; 2. Making topic generalizations; 3. Reviewing previous research
Move 2. Identifying a niche: 4. Indicating a gap; 5. Highlighting a problem; 6. Raising general questions; 7. Proposing general hypotheses; 8. Presenting a justification
Move 3. Addressing the niche: 9. Introducing present research descriptively; 10. Introducing present research purposefully; 11. Presenting research questions; 12. Presenting research hypotheses; 13. Clarifying definitions; 14. Summarizing methods; 15. Announcing principal outcomes; 16. Stating the value of the present research; 17. Outlining the structure of the paper

Tool: SAPIENTA
Annotation scheme: Finer-grained AZ scheme (CoreSC scheme with 11 categories in the first layer)
Categories: Background (BAC); Hypothesis (HYP); Motivation (MOT); Goal (GOA); Object (OBJ); Method (MET); Model (MOD); Experiment (EXP); Observation (OBS); Result (RES); Conclusion (CON)

Method:

The tools are built on different data sets and methods for automating the analysis. Most of them use manually annotated data as a standard for training the model to automatically classify the categories. Details below:

  • AWA – Data type: any research writing. Automation method: rule-based NLP, using the Xerox Incremental Parser (XIP) to annotate rhetorical functions in discourse.
  • Mover – Data type: abstracts. Automation method: supervised learning, using a Naïve Bayes classifier with data represented as a bag of clusters with location information.
  • RWT – Data type: introductions. Automation method: supervised learning using Support Vector Machines (SVM) with an n-dimensional vector representation and n-gram features.
  • SAPIENTA – Data type: full articles. Automation method: supervised learning using SVM with sentence aspect features, and sequence labelling using Conditional Random Fields (CRF) for sentence dependencies.
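As a rough illustration of the supervised approaches above (not any specific tool’s pipeline), here is a minimal scikit-learn sketch that trains a linear SVM over word n-gram features to label sentences with rhetorical moves. The toy training sentences and labels are invented for illustration; real systems train on large manually annotated corpora.

```python
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

# Toy labelled sentences (invented); the sentence is the unit of analysis.
sentences = [
    "Little attention has been paid to feedback quality.",        # gap
    "Recent studies have examined automated writing support.",    # background
    "In this paper we propose a new parser for revisions.",       # purpose
    "However, existing tools do not address regional languages.", # gap
    "Prior work has applied argumentative zoning to abstracts.",  # background
    "We aim to evaluate the tool in a classroom study.",          # purpose
]
labels = ["gap", "background", "purpose", "gap", "background", "purpose"]

# Linear SVM over unigram and bigram tf-idf features
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LinearSVC(),
)
model.fit(sentences, labels)

print(model.predict(["Previous research has explored peer feedback systems."]))
```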

Others:

  • The SciPo tool helps students write summaries and introductions for scientific texts in Portuguese.
  • Another tool, CARE, is a word concordancer used to search for words and moves in research abstracts – Summary notes here.
  • An ML approach considering three different schemes for annotating scientific abstracts (no tool).

If you think I’ve missed a tool which does similar automated tagging in research articles, do let me know so I can include it in my list 🙂

Notes: XIP – Automated rhetorical parsing of scientific metadiscourse

Reference: Simsek, D., Buckingham Shum, S., Sandor, A., De Liddo, A., & Ferguson, R. (2013). XIP Dashboard: visual analytics from automated rhetorical parsing of scientific metadiscourse. In: 1st International Workshop on Discourse-Centric Learning Analytics, 8 Apr 2013, Leuven, Belgium.

Background:

Learners should have the ability to critically evaluate research articles and be able to identify the claims and ideas in scientific literature.

Purpose:

  • Automating analysis of research articles to identify evolution of ideas and findings.
  • Describing the Xerox Incremental Parser (XIP) which identifies rhetorically significant structures from research text.
  • Designing a visual analytics dashboard to provide overviews of the student corpus.

Method:

  • Argumentative Zoning (AZ) to annotate moves in research articles by Simone Teufel.
  • Rhetorical moves tagged by XIP – partly overlap and partly different from AZ scheme: SUMMARIZING, BACKGROUND KNOWLEDGE, CONTRASTING IDEAS, NOVELTY, SIGNIFICANCE, SURPRISE, OPEN QUESTION, GENERALIZING
  • Sample discourse moves:
    • Summarizing: “The purpose of this article….”
    • Contrasting ideas: “With an absence of detailed work…”
      • Sub-classes: novelty, surprise, importance, emerging issue, open question
  • XIP outputs a raw output file containing semantic tags and concepts extracted from text.
  • Data: Papers from the LAK and EDM conferences and journal – 66 LAK and 239 EDM papers, from which 7,847 sentences and 40,163 concepts were extracted.
  • Dashboard design – refer to the original paper for the process involved in prototyping the visualizations (a toy aggregation example follows below).
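As a simple illustration of the dashboard idea, the sketch below aggregates tagged sentences into per-paper counts of rhetorical moves, the kind of corpus overview a visual analytics dashboard could chart. The record format here is a placeholder assumption, not the actual XIP raw output.

```python
from collections import defaultdict, Counter

# Assumed simplified records: (paper_id, move_tag) per tagged sentence.
tagged_sentences = [
    ("lak_01", "SUMMARIZING"),
    ("lak_01", "NOVELTY"),
    ("lak_01", "BACKGROUND_KNOWLEDGE"),
    ("edm_07", "CONTRASTING_IDEAS"),
    ("edm_07", "NOVELTY"),
]

def move_counts_per_paper(records):
    """Count rhetorical move tags per paper for a corpus overview."""
    counts = defaultdict(Counter)
    for paper_id, tag in records:
        counts[paper_id][tag] += 1
    return counts

for paper, moves in move_counts_per_paper(tagged_sentences).items():
    print(paper, dict(moves))
```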

Tool:

  • XIP is now embedded in the Academic Writing Analytics (AWA) tool by UTS. AWA provides analytical and reflective reports on students’ writing.

Notes: NLP Techniques for peer feedback

Reference: Xiong, W., Litman, D. J., & Schunn, C. D. (2012). Natural language processing techniques for researching and improving peer feedback. Journal of Writing Research, 4(2), 155-176.

Background:

  • Feedback on writing is seen to improve students’ writing, but the process is resource intensive.
  • Possible options to reduce the workload in giving feedback:
    • Direct feedback using technology-assisted approaches (from grammar checks to complex computational linguistics).
    • Peer Review [Considered in this paper].
  • Peer review:
    • Good feedback from a group of peers is found to be as useful as the instructor’s feedback and even weaker writers are seen to provide useful feedback to stronger writers (See references in original paper).
    • When providing feedback on other students’ work, students become mindful of the mistakes and improve their own writing.
    • Some web-based peer review systems: PeerMark in turnitin.com, SWoRD (used in this study) and Calibrated Peer Review.

Problem:

  • The challenge lies in the form of feedback provided by peers – peer feedback might not be in a form useful for making revisions. Key features identified to aid revisions:
    1. Localized information (providing exact location details like paragraph, page numbers, or quotations).
    2. Concrete solution (suggesting a possible solution rather than just pointing out the problem).
  • Research problem: Studying peer review is hard with a large amount of feedback data.
  • Practical problem: Identifying useful feedback for students and possible interventions to help them provide good feedback.

Purpose:

  • To automatically process peer feedback and identify the presence or absence of the two key features (providing feedback on feedback for students, and automatically coding feedback for researchers).
  • Refer to the prototype shown in Figure 1 of the original paper, which prompts students to provide localized comments and explicit solutions.

Technical – How? (Details explained in study 1 and study 2)

  1. Building a domain lexicon from common unigrams and bigrams in student papers (a minimal sketch follows after this list).
  2. Counting basic features like domain words, modals, negations, overlap between the comment and the paper, etc. for each feedback comment.
  3. Creating a classification model to identify the type of feedback (contains localization information or not / contains an explicit solution or not) – a classification task in machine learning.
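Here is a minimal sketch of step 1, assuming plain-text student papers as input; the frequency cutoff is an arbitrary illustrative choice. It collects frequent unigrams and bigrams as a domain lexicon that later feature counting can draw on.

```python
import re
from collections import Counter

def build_domain_lexicon(papers, min_count=5):
    """Collect frequent unigrams and bigrams from student papers as a domain lexicon."""
    counts = Counter()
    for text in papers:
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(tokens)                                              # unigrams
        counts.update(" ".join(pair) for pair in zip(tokens, tokens[1:]))  # bigrams
    return {term for term, c in counts.items() if c >= min_count}

# Example usage with a toy corpus (repeated so terms pass the cutoff)
papers = ["The intake valve controls air flow into the engine."] * 6
lexicon = build_domain_lexicon(papers, min_count=5)
print(sorted(lexicon)[:10])
```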

Method and Results:

Study 1 – Localization Detection:

  • Each feedback comment represented as a vector of the four attributes below:
    1. regularExpressionTag: Regular expressions to match phrases that use location in a comment (E.g. “on page 5”).
    2. #domainWord: Counting the number of domain-related words in a comment (based on the domain lexicon gathered from frequent terms in student papers).
    3. sub-domain-obj, deDeterminer: Extracting syntactic attributes (sub-domain-obj) and count of words like “this, that, these, those” which are demonstrative determiners.
    4. windowSize, #overlaps: Extracting the length of matching words from the document to identify quotes (windowSize) and words overlapped.
  • Weka models were used to automatically code localization information. The decision tree model had better accuracy (77%; recall 82%, precision 73%) in predicting whether a comment was localized or not. For the rules that made up the decision tree, take a look at Figure 2 of the original paper. A rough sketch of this kind of feature vector and classifier follows below.
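The sketch below shows how such a feature vector could be assembled and fed to a decision tree, in the spirit of Study 1. The regular expression, domain lexicon, and toy training data are simplified assumptions rather than the paper’s actual implementation (which used Weka).

```python
import re
from sklearn.tree import DecisionTreeClassifier

DOMAIN_LEXICON = {"valve", "intake", "engine", "pressure"}   # illustrative lexicon
LOCATION_RE = re.compile(r"\b(on page \d+|paragraph \d+)\b", re.I)
DETERMINERS = {"this", "that", "these", "those"}

def localization_features(comment, paper_text):
    """Four simple attributes in the spirit of the localization model."""
    tokens = comment.lower().split()
    paper_tokens = set(paper_text.lower().split())
    return [
        1 if LOCATION_RE.search(comment) else 0,       # regularExpressionTag
        sum(t in DOMAIN_LEXICON for t in tokens),      # #domainWord
        sum(t in DETERMINERS for t in tokens),         # demonstrative determiners
        sum(t in paper_tokens for t in tokens),        # rough word overlap with the paper
    ]

# Toy labelled comments: 1 = localized, 0 = not localized
paper = "The intake valve controls air flow into the engine."
comments = [
    ("On page 2 the intake valve explanation is unclear.", 1),
    ("Nice work overall, I enjoyed reading it.", 0),
    ("This paragraph about the engine needs more detail.", 1),
    ("Good structure.", 0),
]
X = [localization_features(c, paper) for c, _ in comments]
y = [label for _, label in comments]

clf = DecisionTreeClassifier().fit(X, y)
print(clf.predict([localization_features("On page 3 the pressure section is vague.", paper)]))
```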

Study 2 – Solution Detection:

  • Feedback comments were represented as vectors using the three types of attributes below (refer to Table 2 in the original paper for details).
    • Simple features like word count and the order of comment in overall feedback.
    • Essay attributes to capture the relationship between the comment and the essay and domain topics.
    • Keyword attributes semi-automatically learned based on semantic and syntactic functions.
  • A Logistic Regression model was used to detect the presence/absence of explicit solutions (accuracy 83%, recall 91%, precision 83%). Domain-topic words followed by suggestions were highly associated with the prediction. Detailed coefficients of the attributes predicting the presence of a solution can be found in Table 3 of the original paper.

Study 3: Can Research Rely on Automatic Coding?

  • Comparing automatically coded data to hand coded data to see if the accuracy is sufficiently high for practical implementation.
  • Helpfulness ratings by peers and 2 experts (content, writing experts) on peer comments at a review level.
  • To model the expert helpfulness ratings:
    • Regression analysis using feedback-type proportions (praise-only comments, summary-only comments, problem/solution-containing comments), the proportion of localized critical comments, and the proportion of solution-providing comments as predictors.
    • 10-fold cross-validation – SVM was the best fit.
    • To check whether the same models are built using machine-coded and hand-coded data – 10 stepwise regressions. Refer to Table 4 in the original paper to see the feedback features commonly included in the model by the different raters – different features were helpful for different raters.
    • The overall regression model was similar to that from hand-coded localization data (most of the positivity, solution, and localization features were similar between hand coding and automatic coding).

Discussion:

  • Predictive models for detecting localization and solution information are statistical tools and do not provide deep content insights.
  • To be integrated into SWoRD to provide real time feedback on comments.
  • Technical note: Comments were already pre-processed – segmented into idea units by hand; data split by hand into comment type (summary, praise, criticism).
  • Future work:
    • Examine impact of feedback on feedback comments
    • Obtaining generalization across courses
    • Improving accuracy of prediction