Notes: Computational analysis of move structures in academic abstracts

Reference:

Wu, J. C., Chang, Y. C., Liou, H. C., & Chang, J. S. (2006, July). Computational analysis of move structures in academic abstracts. In Proceedings of the COLING/ACL on Interactive presentation sessions (pp. 41-44). Association for Computational Linguistics.

Background:

  • Swales' patterns for research articles: Introduction, Methods, Results, Discussion (IMRD), and the Creating a Research Space (CARS) model.
  • Studying the rhetorical structure of texts has been found useful as an aid to reading and writing (Mover tool notes here).

Purpose:

  • To automatically analyze move structures (Background, Purpose, Method, Result, and Conclusion) from research article abstracts.
  • To develop an online learning system CARE (Concordancer for Academic wRiting in English) using move structures to help novice writers.

Method:

  • Processes involved:

[Figure: CARE system]

  • TANGO Concordancer used for extracting collocations with chunking and clause information. Sample verb-noun collocation structures in the corpus: VP+NP, VP+PP+NP, and VP+NP+PP (Ref: Jian, J. Y., Chang, Y. C., & Chang, J. S. (2004, July). TANGO: Bilingual collocational concordancer. In Proceedings of the ACL 2004 on Interactive poster and demonstration sessions (p. 19). Association for Computational Linguistics.)
    • TANGO Tool accessible here.
  • Data: Corpus of 20,306 abstracts (95,960 sentences) from Citeseer. Moves were manually tagged in 106 abstracts containing 709 sentences. 72,708 collocation types were extracted, and 317 collocations were manually tagged with moves.
  • Hidden Markov Model (HMM) trained using 115 abstracts containing 684 sentences.
  • Different parameters evaluated for the HMM: “the frequency of collocation types, the number of sentences with collocation in each abstract, move sequence score and collocation score”.
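
To make the HMM step above concrete, here is a minimal sketch of Viterbi decoding over the five moves. It is not the authors' implementation: the log-linear mix of transition and emission scores controlled by `lam`, the helper `toy_model`, and every probability value are assumptions made for illustration. In the paper's setting the emission scores would come from move-tagged collocations found in each sentence.

```python
import math

MOVES = ["B", "P", "M", "R", "C"]  # Background, Purpose, Method, Result, Conclusion


def _log(p):
    """Log with a floor so zero probabilities do not raise."""
    return math.log(max(p, 1e-12))


def toy_model():
    """Hypothetical start/transition probabilities favouring the B-P-M-R-C order."""
    start = {"B": 0.6, "P": 0.3, "M": 0.05, "R": 0.03, "C": 0.02}
    trans = {}
    for i, prev in enumerate(MOVES):
        # Unnormalised preferences: move forward > stay > anything else.
        weights = [0.6 if j == i + 1 else 0.3 if j == i else 0.1 / 3
                   for j in range(len(MOVES))]
        total = sum(weights)
        for cur, w in zip(MOVES, weights):
            trans[(prev, cur)] = w / total
    return start, trans


def viterbi(emissions, start, trans, lam=0.7):
    """Return the highest-scoring move sequence for one abstract.

    emissions: one dict per sentence, mapping move -> P(sentence | move)
    lam: weight on the transition term (assumed log-linear combination)
    """
    best = [{m: lam * _log(start[m]) + (1 - lam) * _log(emissions[0][m])
             for m in MOVES}]
    back = []
    for em in emissions[1:]:
        scores, ptrs = {}, {}
        for cur in MOVES:
            prev = max(MOVES,
                       key=lambda p: best[-1][p] + lam * _log(trans[(p, cur)]))
            scores[cur] = (best[-1][prev]
                           + lam * _log(trans[(prev, cur)])
                           + (1 - lam) * _log(em[cur]))
            ptrs[cur] = prev
        best.append(scores)
        back.append(ptrs)
    # Trace back from the best final move.
    path = [max(MOVES, key=lambda m: best[-1][m])]
    for ptrs in reversed(back):
        path.append(ptrs[path[-1]])
    return path[::-1]


def peaked(move, high=0.8):
    """Toy emission distribution that strongly favours one move."""
    low = (1 - high) / (len(MOVES) - 1)
    return {m: (high if m == move else low) for m in MOVES}
```

For example, `viterbi([peaked("B"), peaked("P"), peaked("M"), peaked("R")], *toy_model())` decodes a four-sentence abstract. When a sentence carries no informative collocation (a uniform emission), the transition term alone decides its move.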

Results:

  • Precision of 80.54% was achieved, with 627 sentences qualifying, under the following parameters: a weight of 0.7 for the transitional probability function, and a frequency threshold of 18 for a collocation to be applicable (crucial for excluding unreliable collocations).
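
As a rough illustration of the two ideas in this result, the sketch below filters collocations by a frequency threshold and computes precision over tagged sentences. The function names, the toy counts, and the use of `>=` for the threshold are assumptions; the notes give only the threshold value (18) and the final precision.

```python
FREQ_THRESHOLD = 18  # threshold value reported in the notes


def reliable_collocations(counts, threshold=FREQ_THRESHOLD):
    """Keep collocations seen at least `threshold` times (>= is an assumption)."""
    return {colloc for colloc, n in counts.items() if n >= threshold}


def precision(predicted, gold):
    """Fraction of predicted move tags that match the gold tags."""
    if not predicted or len(predicted) != len(gold):
        raise ValueError("predicted and gold must be non-empty and equal length")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(predicted)


# Hypothetical collocation counts, for illustration only.
toy_counts = {"propose method": 40, "present approach": 18, "show result": 5}
kept = reliable_collocations(toy_counts)
```

Here `kept` retains only the two collocations at or above the threshold; rarer ones are dropped as unreliable evidence for a move.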

Conclusion:

  • CARE system interface created for querying and looking up sentences for a specific move.
  • The system is expected to help non-native speakers write abstracts for research articles.