Writing and publishing journal articles

Last week I attended a talk at UTS by Professor Witold Pedrycz on the essentials of effective publishing and how to disseminate research results. He is a well-known professor in the field of Computational Intelligence with great credentials (Editor-in-chief of very high impact journals, 40,000+ citations etc.). In the early stages of a PhD and research, we tend to make pretty basic mistakes that could lead to rejections and dejection. These are my notes from his talk, where he explained the key components expected from a well-written paper and how to avoid common mistakes. It was quite useful to hear about the do’s and don’ts of publishing from an experienced academic who rejects almost 2000 articles every year for his own journal 😉

Why, how, when to publish?

Why: People might have different personal motives for publishing (expanding CV, meeting KPIs, new year resolutions… :p ), but the key reason why research should be published is to share important findings with the research community.

How: The most popular way to disseminate results is still through journal articles. Publications in journals are considered secure and more established, thanks to the detailed peer review process involved. Most points in this post are made in the context of journal articles in particular, although some may also apply to conference articles and other publications.

When: There is no hard deadline; but the general rule is to publish when we have results to share, and not too late.

Choosing the right journal:

  • Read articles in the journal and research the style of the journal before submission.
  • Check Journal Citation Reports (Thomson Reuters) to confirm the claimed impact factor.
  • Be cautious of Beall’s List: Potential, possible, or probable predatory scholarly open-access publishers (scholarlyoa.com) containing details of blacklisted publishers and journals. Short peer review process and sudden request for fees are signs of predatory journals.
  • Not publishing in a good journal could be a bad hit to building a good CV later.

Checking criteria:

To make articles publishable, these are the three key points to keep in mind:

  1. Originality/ innovation – Novelty in the area of research identifying differences from what was already done by others
  2. Relevance/ Motivation – Clear objective of research on why it is done
  3. Presentation/exposure – Understandable writing

All three criteria are equally important, and we will have to consider revising the paper if it fails to achieve even one of them.

Preparing to write a quality manuscript:

Follow the standard article structure:

  • Title:
    • Use the fewest possible words to adequately describe the contents of the paper
    • Should convey the findings and be specific, concise, complete and attractive to readers
    • Don’t use jargon, abbreviations, ambiguous terms, unnecessary detail
  • Authors and affiliations
  • Abstract:
    • Strongly impacts editor’s decision
    • Should be precise and honest, stand alone entity, uses no tech jargon, brief and specific, cites no references
  • Keywords:
    • Important for indexing to make the article identified and cited
    • Check the guide
    • Specific (E.g. Specific algorithm rather than ‘neural network’ since it will bring millions of hits), avoid uncommon abbreviations and general terms
  • Introduction:
    • Why the current work was performed (Aims, significance), what has been done before (Literature review of prior work), what was done in the current research (brief), what was achieved (brief).
    • Consult the guide for the word limit, set the scene, outline the problem and hypothesis, give a balanced lit review (if included here), define non-standard abbreviations and jargon, get to the point and keep it simple.
    • Lit review – well focused and linked to the paper.
    • Don’t write an extensive review, cite excessively, or overuse terms like “novel” etc.
    • Mathematics: formula in papers – explain symbols, use standard notations.
    • I would also like to highlight Swales’ Creating a Research Space (CARS) model that provides a useful guide for writing introductions and other sections.
  • Flow of presentation:
    • Top-down approach: main idea→ fundamentals → algorithms → experiments → conclusions.
    • Avoid mixing different levels of abstraction (Explain concept, numeric values in introduction and not straight away in the experiment section, Explain what tool is used in the experiment section and not in the introduction section).
    • Brief, illustrative examples to motivate.
  • Results:
    • Use tables, figures to summarize, show results of statistical analysis, compare like with like (E.g. A simple, but commonly made mistake: “The results from this study are higher than the other study”: Doesn’t compare ‘results’ to ‘results’, but compares ‘results’ to another ‘study’).
    • Don’t duplicate data among tables, figures and text, use graphics for summarization of text (avoid large tables with many numbers).
    • Graphics: stand alone captions, easy to interpret, don’t overuse colors in charts (alternatives: diff types of lines), only essential information.
    • Clear legend, better organized data, present trend lines, don’t leave areas underutilized.
  • Discussion:
    • Study’s aim and hypothesis
    • Relating to other research
    • Avoid grand unsupported statements (E.g. novel organization method has enormously reduced the learning time) and introducing new terms
  • Conclusion:
    • Put your study in context
    • How it represents advance in the field
    • Suggest future experiments
    • Avoid: repetition with other sections – same sentence in abstract, intro, discussion, conclusion, overly speculative, overemphasize the impact of the study.
  • Acknowledgement:
    • Contributions to paper: supplied materials or software, helped with writing or English, technical help.
  • References:
    • Include recent references
    • Check guide for correct format
    • Avoid: citing yourself/journal excessively, citing bad sources – which are not available, wikipedia – volatile, local language
    • Review paper requires experienced writing skills, survey paper has to digest and synthesize available research.
  • Supplementary material

Language essentials for a quality manuscript:

Ensure your manuscript has the three C’s below:

  1. Clarity
  2. Conciseness
  3. Correctness

Common traps: repetition, redundancy, ambiguity, exaggeration

You can make use of language editing services to polish the manuscript if required. Free tools are available online for checking surface level errors like grammar and spelling.

Ethical issues:

  • Multiple submissions, redundant publications, plagiarism, data fabrication and falsification, improper use of subjects, improper author contribution.
  • Plagiarism: Check the IEEE FAQ for details on plagiarism. Unacceptable paraphrasing, even with citation could be plagiarism.

Cover letter, Revisions and Responses to reviewers:

  • Write a brief cover letter to the editor to convey particular importance of your manuscript to the journal. Suggest potential reviewers (if required).
  • Indicate if the submitted paper is an extended version of a conference paper to avoid conflict of interest.
  • Review process: Draft a detailed letter of response to reviewers: respond to all points (accept with changes made or reject with polite reasoning), provide page and line numbers to refer to revisions, additional calculations if required to make the paper stronger.
    • E.g. “Thank you for the comment. However, we feel that the assumption in our model is supported by recent work by”.…. Rather than “the reviewer is clearly ignorant of the work of…”
  • Rejection: Not to be taken personally, try to understand why; don’t resubmit to another journal without significant revisions.
  • Journals allow a paper to be distributed as an open access resource for an additional fee to reach a wider audience (if required).

 

 

Tools for automated rhetorical analysis of academic writing

Alert – Long post!

In this post, I’m presenting a summary of my review on tools for automatically analyzing rhetorical structures from academic writing.

The tools considered are designed to cater to different users and purposes. AWA and RWT aim to provide feedback for improving students’ academic writing. Mover and SAPIENTA, on the other hand, help researchers identify the structure of research articles. ‘Mover’ even allows users to give a second opinion on the classification of moves and add new training data (this can lead to a less accurate model if students with less expertise add potentially wrong training data). However, these tools have a common thread and fulfill the following criteria:

  • They look at scientific text – Full research articles, abstracts or introductions. Tools to automate argumentative zoning of other open text (Example) are not considered.
  • They automate the identification of rhetorical structures (zones, moves) in research articles (RA) with sentence being the unit of analysis.
  • They are broadly based on the Argumentative Zoning scheme by Simone Teufel or the CARS model by John Swales (Either the original schema or modified version of it).
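
To make "sentence as the unit of analysis" concrete, here is a minimal sketch of sentence-level tagging. It is not how any of the tools below actually work: the cue phrases, the move names and the functions `split_sentences`, `tag_sentence` and `tag_text` are all hypothetical, loosely inspired by the three CARS moves.

```python
import re

# Hypothetical cue phrases per move (loosely following the three CARS moves).
# Real tools rely on far richer rules or on trained models.
CUES = {
    "establish_territory": ["has been widely studied", "recent years", "previous research"],
    "establish_niche": ["however", "little is known", "remains unclear"],
    "occupy_niche": ["in this paper", "we propose", "this study"],
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tag_sentence(sentence):
    """Return the first move whose cue phrase occurs in the sentence."""
    lowered = sentence.lower()
    for move, phrases in CUES.items():
        if any(phrase in lowered for phrase in phrases):
            return move
    return "unlabelled"


def tag_text(text):
    """Tag every sentence of a text; returns a list of (sentence, move) pairs."""
    return [(s, tag_sentence(s)) for s in split_sentences(text)]


if __name__ == "__main__":
    sample = (
        "Topic modelling has been widely studied. "
        "However, little is known about short texts. "
        "In this paper, we propose a new method."
    )
    for sentence, move in tag_text(sample):
        print(move, "|", sentence)
```

Each sentence gets exactly one label, which is the same output shape the tools produce, even though their underlying methods are much more sophisticated.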

Tools (in alphabetical order):

  1. Academic Writing Analytics (AWA) – Summary notes here

AWA also has a reflective parser to give feedback on students’ reflective writing, but the focus of this post is on the analytical parser. AWA demo, video courtesy of Dr. Simon Knight:

  2. Mover – Summary notes here

Available for download as a stand alone application. Sample screenshot below:

[Image: antmover]

  3. Research Writing Tutor (RWT) – Summary notes here

RWT demo, video courtesy of Dr. Elena Cotos:

  4. SAPIENTA – Summary notes here.

Available for download as a stand alone java application or can be accessed as a web service. Sample screenshot of tagged output from SAPIENTA web service below:

[Image: sapienta-output]

Annotation Scheme:

The general aim of the schemes used is to be applicable to all academic writing, and this has been successfully tested across data from different disciplines. A comparison of the schemes used by the tools is shown in the table below:

| Tool | Source & Description | Annotation Scheme |
| --- | --- | --- |
| AWA | AWA Analytical scheme (modified from AZ for sentence-level parsing) | Summarizing; Background knowledge; Contrasting ideas; Novelty; Significance; Surprise; Open question; Generalizing |
| Mover | Modified CARS model: three main moves and further steps | 1. Establish a territory: Claim centrality; Generalize topics; Review previous research. 2. Establish a niche: Counter claim; Indicate a gap; Raise questions; Continue a tradition. 3. Occupy the niche: Outline purpose; Announce research; Announce findings; Evaluate research; Indicate RA structure |
| RWT | Modified CARS model: 3 moves, 17 steps | Move 1. Establishing a territory: (1) Claiming centrality; (2) Making topic generalizations; (3) Reviewing previous research. Move 2. Identifying a niche: (4) Indicating a gap; (5) Highlighting a problem; (6) Raising general questions; (7) Proposing general hypotheses; (8) Presenting a justification. Move 3. Addressing the niche: (9) Introducing present research descriptively; (10) Introducing present research purposefully; (11) Presenting research questions; (12) Presenting research hypotheses; (13) Clarifying definitions; (14) Summarizing methods; (15) Announcing principal outcomes; (16) Stating the value of the present research; (17) Outlining the structure of the paper |
| SAPIENTA | Finer-grained AZ scheme: CoreSC scheme with 11 categories in the first layer | Background (BAC); Hypothesis (HYP); Motivation (MOT); Goal (GOA); Object (OBJ); Method (MET); Model (MOD); Experiment (EXP); Observation (OBS); Result (RES); Conclusion (CON) |
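
A scheme like RWT's can also be written down as a small lookup structure, which is handy if you want to roll step-level tags up to move-level tags. This is only a sketch for illustration: the move and step names come from the comparison above, but the variable and function names (`RWT_MOVES`, `RWT_STEPS`, `move_for_step`, `steps_in_move`) are my own and not part of RWT.

```python
# RWT's modified CARS scheme: 3 moves and 17 steps.
# Names of the variables and helpers are hypothetical, chosen for illustration.
RWT_MOVES = {
    1: "Establishing a territory",
    2: "Identifying a niche",
    3: "Addressing the niche",
}

# step number -> (move number, step name)
RWT_STEPS = {
    1: (1, "Claiming centrality"),
    2: (1, "Making topic generalizations"),
    3: (1, "Reviewing previous research"),
    4: (2, "Indicating a gap"),
    5: (2, "Highlighting a problem"),
    6: (2, "Raising general questions"),
    7: (2, "Proposing general hypotheses"),
    8: (2, "Presenting a justification"),
    9: (3, "Introducing present research descriptively"),
    10: (3, "Introducing present research purposefully"),
    11: (3, "Presenting research questions"),
    12: (3, "Presenting research hypotheses"),
    13: (3, "Clarifying definitions"),
    14: (3, "Summarizing methods"),
    15: (3, "Announcing principal outcomes"),
    16: (3, "Stating the value of the present research"),
    17: (3, "Outlining the structure of the paper"),
}


def move_for_step(step):
    """Return the name of the move that a given step number belongs to."""
    move, _name = RWT_STEPS[step]
    return RWT_MOVES[move]


def steps_in_move(move):
    """Return the step names that belong to a given move number, in order."""
    return [name for m, name in RWT_STEPS.values() if m == move]


if __name__ == "__main__":
    print(move_for_step(4))
    print(steps_in_move(1))
```

For example, `move_for_step(4)` maps "Indicating a gap" back to the move "Identifying a niche".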

Method:

The tools are built on different data sets and methods for automating the analysis. Most of them use manually annotated data as a standard for training the model to automatically classify the categories. Details below:

| Tool | Data type | Automation method |
| --- | --- | --- |
| AWA | Any research writing | NLP rule based: Xerox Incremental Parser (XIP) to annotate rhetorical functions in discourse. |
| Mover | Abstracts | Supervised learning: Naïve Bayes classifier with data represented as bag of clusters with location information. |
| RWT | Introductions | Supervised learning using Support Vector Machine (SVM) with n-dimensional vector representation and n-gram features. |
| SAPIENTA | Full article | Supervised learning using SVM with sentence aspect features and Sequence Labelling using Conditional Random Fields (CRF) for sentence dependencies. |
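
To give a feel for the supervised approach, here is a minimal sketch of a Naïve Bayes sentence classifier that combines word features with the sentence's location in the text. It is not Mover's implementation: I use a plain bag of words instead of a bag of clusters, the position feature is a crude stand-in for "location information", and the training sentences and labels are invented toy data.

```python
import math
from collections import Counter, defaultdict


def features(sentence, index, total):
    """Bag of lowercase words plus a coarse position token (start/middle/end)."""
    words = sentence.lower().split()
    third = min(2, index * 3 // total)
    return words + ["__pos_%d" % third]


class NaiveBayes:
    def __init__(self):
        self.label_counts = Counter()
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()

    def train(self, examples):
        """examples: iterable of (feature_list, label) pairs."""
        for feats, label in examples:
            self.label_counts[label] += 1
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)

    def predict(self, feats):
        """Return the label with the highest log-probability."""
        total = sum(self.label_counts.values())
        best_label, best_score = None, float("-inf")
        for label, count in self.label_counts.items():
            score = math.log(count / total)
            denom = sum(self.feature_counts[label].values()) + len(self.vocab)
            for f in feats:
                if f not in self.vocab:
                    continue  # ignore features never seen in training
                # Laplace (add-one) smoothing
                score += math.log((self.feature_counts[label][f] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy abstracts: each is a list of (sentence, move label) pairs.
TOY_ABSTRACTS = [
    [("Sentiment analysis has been widely studied in recent years", "territory"),
     ("However little is known about sarcasm in short texts", "niche"),
     ("In this paper we propose a classifier for sarcasm detection", "occupy")],
    [("Machine translation is an important research area", "territory"),
     ("However existing systems ignore document context", "niche"),
     ("We propose a context aware translation model", "occupy")],
]


def train_toy_model():
    model = NaiveBayes()
    for abstract in TOY_ABSTRACTS:
        n = len(abstract)
        model.train((features(s, i, n), label) for i, (s, label) in enumerate(abstract))
    return model


if __name__ == "__main__":
    model = train_toy_model()
    print(model.predict(features("However few studies address noisy data", 1, 3)))
```

With so little data the model mostly picks up on a handful of cue words and the position token, but the overall shape (annotated sentences in, a per-sentence label out) matches the supervised tools in the table.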

Others:

  • SciPo tool helps students write summaries and introductions for scientific texts in Portuguese.
  • Another tool, CARE, is a word concordancer used to search for words and moves in research abstracts – Summary notes here.
  • An ML approach considering three different schemes for annotating scientific abstracts (No tool).

If you think I’ve missed a tool which does similar automated tagging in research articles, do let me know so I can include it in my list 🙂