Understanding human-AI collaboration in writing (CoAuthorViz)

Generative AI (GenAI) has captured global attention since ChatGPT was publicly released in November 2022. The remarkable capabilities of AI have sparked a myriad of discussions around its vast potential, ethical considerations, and transformative impact across diverse sectors, including education. In particular, how humans can learn to work with AI to augment their intelligence rather than undermine it greatly interests many communities.

My own interest in writing research led me to explore human-AI partnerships for writing. We are not very far from using generative AI technologies in everyday writing when co-pilots become the norm rather than an exception. It is possible that a ubiquitous tool like Microsoft Word that many use as their preferred platform for digital writing comes with AI support as an essential feature (and early research shows how people are imagining these) for improved productivity. But at what cost?

In our recent full paper, we explored an analytic approach to study writers’ support seeking behaviour and dependence on AI in a co-writing environment:

Antonette Shibani, Ratnavel Rajalakshmi, Srivarshan Selvaraj, Faerie Mattins, Simon Knight (2023). Visual representation of co-authorship with GPT-3: Studying human-machine interaction for effective writing. In M. Feng, T. K¨aser, and P. Talukdar, editors, Proceedings of the 16th International Conference on Educational Data Mining, pages 183–193, Bengaluru, India, July 2023. International Educational Data Mining Society [PDF].

Using keystroke data from the interactive writing environment CoAuthor powered by GPT-3, we developed CoAuthorViz (See example figure below) to characterize writer interaction with AI feedback. ‘CoAuthorViz’ captured key constructs such as the writer incorporating a GPT-3 suggested text as is (GPT-3 suggestion selection), the writer not incorporating a GPT-3 suggestion
(Empty GPT-3 call), the writer modifying the suggested text (GPT-3 suggestion modification), and the writer’s own writing (user text addition). We demonstrated how such visualizations (and associated metrics) help characterise varied levels of AI interaction in writing from low to high dependency on AI.

Figure: CoAuthorViz legend and three samples of AI-assisted writing (squares denote writer written text, and triangles denote AI suggested text)

Full details of the work can be found in the resources below:

Several complex questions are yet to be answered:

  • Is autonomy (self-writing, without AI support) preferable to better quality writing (with AI support)?
  • As AI becomes embedded into our everyday writing, do we lose our own writing skills? And if so, is that of concern, or will writing become one of those outdated skills in the future that AI can do much better than humans?
  • Do we lose our ‘uniquely human’ attributes if we continue to write with AI?
  • What is an acceptable use of AI in writing that still lets you think? (We know by writing we think more clearly; would an AI tool providing the first draft restrict our thinking?)
  • What knowledge and skills do writers need to use AI tools appropriately?

Edit: If you want to delve into the topic further, here’s an intriguing article that imagines how writing might look in the future: https://simon.buckinghamshum.net/2023/03/the-writing-synth-hypothesis/

Working with Jupyter notebooks #code

Jupyter is an open source program that helps you share and run code in many different programming languages. Jupyter notebooks are great to quickly prototype different versions of code, as they are easy to edit and try different outputs. The format of a Jupyter notebook is similar to reports in the form of Markdowns that are usually used in R. It can contain blocks of text, code, equations and results (including visualizations) all in one page. We’ve used Jupyter notebooks to run text analysis workshops in conferences, and the feedback was pretty good.

The Writing Analytics workshop is starting at #LAK18. Jupyter notebooks are being used. #great pic.twitter.com/56Zd66ku9L

I find that Jupyter notebooks are great for sharing code and results across different people, and if you’re hosting it, it saves a lot of trouble in organising a workshop where you want participants to install software. It works well for non-technical audience too, since they can choose to ignore what’s inside the code block by simply running it and focus on the results block. They are quite popular now for data science experiments, so this post will be a good place to start to know and use them. You can use an already available notebook (if you’ve downloaded one from Github) and play with it, or create your own Jupyter notebook from scratch. This post will guide you to create your own notebook from scratch demonstrating some basic text analysis in Python.

Installing Jupyter

If you want to try a Jupyter notebook first without installing anything, you can do so in this notebook hosted in the official Jupyter site. If you want to install your own copy of Jupyter running in your machine to develop code, then use one of the two options below:

  • If you are new to Python programming, and don’t have python installed in your machine, the easiest way to install Jupyter is by downloading the Anaconda distribution. This comes with in-built Python (you can choose either 2.7 or 3.6 version of Python when you download the distribution – the code I’m writing in this post is in 2.7).
  • If you already have Python working in your machine (as I did), the easiest way is to install Jupyter using the pip command as you do for any Python package. Note that if pip and python are already setup in your system path, you can simply use $ pip install jupyter from the command prompt.

Now that Jupyter is installed, type the command below in your anaconda prompt/command prompt to start a Jupyter notebook:

$ jupyter notebook

The Jupyter homepage opens in your default browser at http://localhost:8888, displaying the files present in the current folder like below. You can now create a new Python jupyter notebook by clicking on New -> Python2 (or Python 3 if you have Python version 3). You can move between folders or create a new folder for your Python notebooks. To change the default opening directory, you should first move to the required path using cd in the command prompt, and then type$ jupyter notebookOpen the created notebook, which would look like this:

This cell is a code block by default, which can be changed to a markdown text block from the drop-down list (check the figure above) to add narrative text accompanying the Python code. Now name your notebook, and try adding both a code block, and markdown block with different levels of text following the sample here:

To execute the blocks, click on the Run button (Alternatively, use Ctrl+Enter in Windows – Keyboard shortcuts can be found in Help -> Keyboard shortcuts). This renders the output of your code and your markdown text like this:

That’s it. You have a simple Jupyter notebook running on your machine. Now to try a bit more, here’s the sample code you can download and run to do some basic text analysis. I’ve defined three steps in this code: Importing required packages, defining input text, and analysis. Before importing the packages/ libraries you need in step 1 however, they should be first installed in your machine. This can be done using the Pip command in the command prompt/anaconda prompt like this:  $ pip install wordcloud (If you run into problems with that, the other option is to download an appropriate version of the package’s wheel from here and install it using $pip install C:/some-dir/some-file.whl).

Python code for the three steps is below:

#Step 1 - Importing libraries
 
from wordcloud import WordCloud, STOPWORDS  #For word cloud generation
import matplotlib.pyplot as plt             #For displaying figures
import re                          #Regular expresions for string operations


#Step 2 - Defining input text

inputtext = "A cockatoo is a parrot that is any of the 21 species belonging to the bird family Cacatuidae, the only family in the superfamily Cacatuoidea. Along with the Psittacoidea (true parrots) and the Strigopoidea (large New Zealand parrots), they make up the order Psittaciformes (parrots). The family has a mainly Australasian distribution, ranging from the Philippines and the eastern Indonesian islands of Wallacea to New Guinea, the Solomon Islands and Australia. Cockatoos are recognisable by the showy crests and curved bills. Their plumage is generally less colourful than that of other parrots, being mainly white, grey or black and often with coloured features in the crest, cheeks or tail. On average they are larger than other parrots; however, the cockatiel, the smallest cockatoo species, is a small bird. The phylogenetic position of the cockatiel remains unresolved, other than that it is one of the earliest offshoots of the cockatoo lineage. The remaining species are in two main clades. The five large black coloured cockatoos of the genus Calyptorhynchus form one branch. The second and larger branch is formed by the genus Cacatua, comprising 11 species of white-plumaged cockatoos and four monotypic genera that branched off earlier; namely the pink and white Major Mitchell's cockatoo, the pink and grey galah, the mainly grey gang-gang cockatoo and the large black-plumaged palm cockatoo. Cockatoos prefer to eat seeds, tubers, corms, fruit, flowers and insects. They often feed in large flocks, particularly when ground-feeding. Cockatoos are monogamous and nest in tree hollows. Some cockatoo species have been adversely affected by habitat loss, particularly from a shortage of suitable nesting hollows after large mature trees are cleared; conversely, some species have adapted well to human changes and are considered agricultural pests. Cockatoos are popular birds in aviculture, but their needs are difficult to meet. The cockatiel is the easiest cockatoo species to maintain and is by far the most frequently kept in captivity. White cockatoos are more commonly found in captivity than black cockatoos. Illegal trade in wild-caught birds contributes to the decline of some cockatoo species in the wild. Source: https://en.wikipedia.org/wiki/Cockatoo"


print("\nInput text for analysis:\n ")
print(inputtext)


#Step 3 - Analysis

print "Summary statistics of input text:"

wordcount = len(re.findall(r'\w+', inputtext))
print "Wordcount: ", wordcount

charcount = len(inputtext) #including spaces
print "Number of characters: ", charcount

#More options for wordclouds here: https://github.com/amueller/word_cloud
wordcloud = WordCloud(    stopwords=STOPWORDS,
                          background_color='black',
                         ).generate(inputtext)

plt.imshow(wordcloud, interpolation="bilinear")
plt.axis('off')
plt.show()

The downloadable ipynb file is available on Github.

Other notes:

  • This post is intended for anyone who wants to start working with Jupyter notebooks, and assumes prior understanding of programming in Python. The Jupyter notebook is another environment to easily work with code, but the coding process is still very traditional. If you’re new to Python programming, this website is a good place to start.
  • You can use multiple versions of Python to run Jupyter notebooks by changing its Kernel (the computational engine which executes the code). I have both Python 2 & Python 3 installed, and I switch between them for different programs as needed.
  • While Jupyter notebooks are mainly used to run Python code, they can also be used to run R programs, which requires R kernel to be installed. The blog post below is a useful guide to do that: https://www.datacamp.com/community/blog/jupyter-notebook-r

Creating reports in R #Code

I’ve recently been consolidating a lot of R code from different parts of my analysis into one file. I wanted to add good documentation and explanation of results and interpretations along with my code to make sense of it later. I came across this option of creating dynamic reports that can combine our code, custom text and R output to an output document using the knitR package in R. I find it a good practice to create such reports for any analysis (wish I followed this earlier), so here’s a post on how to create them. They are very useful coz of the following reasons:

  • It is a great option to generate PDF, HTML and word reports by combining our text explanations, code, R output and graphics at one go. It saves the hassle of saving and copying text, code, output and figures separately into a report.
  • We can easily share the file with someone else with the output and explanations.
  • It is much easier to generate a new report dynamically when the input file changes, as it runs the same code and generates new output report based on the new file at one go.

How to create dynamic reports in R?

The first step is to create a R Markdown file (with the extension .Rmd). If you’re using RStudio, you can go to File -> New File -> R Markdown to create a Rmd file.  You should specify whether you want an output in html, pdf or word. It generated the following skeleton code for me as I specified an output pdf file:

Rmd parts
Rmd parts

Alternatively, you can write the sections in R and save it as .Rmd file. The Header section begins and ends with three dashes (—). It contains title, date and author attributes and specifies the type of output document: E.g. html_document for html web page, pdf_document for a pdf file, word_document for Microsoft Word .docx etc. The header can include other options as needed: “runtime: shiny” if it should be run as an interactive shiny app, “css: styles.css” to change the stylesheet when working with html, “toc: true” to include a table of contents etc.

The following code contains instructions along with R code to create a simple html document. It should be pretty self-explanatory to follow instructions and edit the code as needed for your own use:

---
title: "Sample Markdown file in R"
author: "Shibani"
output:
  pdf_document: default
  html_document: null
  toc: yes
---

# Main Heading of the report 
Plain text can be included as part of the report. This can be used to give introduction and explanations to your R code and results. Level 1 heading is added using '#'.

Blank line generates a new paragraph

## Including Text and Formatting 
Level 2 heading added using '##', deeper heading using '###' and so on.  

### Bulleted list using '*'

* item 1
* item 2
* item 3

### Numbered list using '1.'
1. item 1
2. item 2
3. item 3

### Formatting parts of text by enclosing within the symbols as below: 
_italic_, *italic*, __bold__, **bold**, `monospace`


### Horizontal line using '- - -'

---

## Including Code 

Code chunks start with ```{r name, option=value} and end with ```. Instead of typing them manually, they can be created in RStudio using Ctrl+Alt+i command (command+option+i in mac).  Optional parameters can be specified to customize the display of code chunks in the report.Some commonly used options below:

* include=TRUE (default) displays all code and results in the report, and include=FALSE runs the code but doesnot display it in the report
* echo=TRUE displays the code in the report, but not the results, and echo=FALSE hides the code in the report
* warning=TRUE/ FALSE to display warnings generated
* cache=TRUE to use knitr for caching to improve performance for time-consuming computations

Inline code can be added by enclosing in single quotes E.g. "Two plus two equals `r 2 + 2`" displays "Two plus two equals 4"

knitr::opts_chunk$set(options) sets options globally to all code chunks unless overwritten by local options

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, error = TRUE)
library(knitr)
library(ggplot2)
```

### Code chunk calculating and displaying results

Using a sample data set to demonstrate code usage:
```{r calculate_chunk, echo=T}

data("cars")
head(cars)

cars$time <- round(cars$dist/cars$speed,2)
cars$time

```

## Including Figures

Any figure produced by the code will be automatically included in the report when the code is run. You can hide accompanying code using echo=FALSE in the code chunk. Figure display options can be set in the code chunk as well as follows:

```{r barplot, fig.width=5, fig.height=5, fig.cap= "Barplot of Distance vs Speed", echo=F}

ggplot(data=cars,aes(x=dist,y=speed))+geom_bar(stat = "identity")

```

## Including Tables

`kable()` function in the `knitr` package can be used to produce a table in the report with a matrix or data  frame as input. To further customize the table with advanced fomatting options, use pander package.

```{r, echo=F}
kable(head(cars), digits = 2)
```

----

The html file created by the above code can be accessed here to view how the corresponding output is generated:

Sample Markdown file in R

 

Useful resources:

https://yihui.name/knitr/

https://www.rstudio.com/wp-content/uploads/2015/02/rmarkdown-cheatsheet.pdf

 

 

 

Adding CKEditor to webpages in PHP #Code

What is CKEditor

CKEditor is an open source, customizable web text editor that can be integrated to our webpages. It can be used in three different modes (Article editor, Document editor and Inline editor) for content creation. I was looking for a web editor like Google doc using which I can collect text data from students (but not requiring login with gmail), and I found CKEditor doing exactly what I wanted to do. I’m using it here as a web document editor.

In this blog post, I’m combining a few steps I did to integrate CKEditor to my webpage. This is the code I wrote after a few rounds of trial and error and many rounds of looking up on the CKEditor documentation and StackOverflow. Wish I found a blog like this when I was trying to implement this 😉

Setting up CKEditor

CKEditor is available for download here. I used the Standard package of the current stable version  (Version 4.6.2 • 12 Jan 2017). All you have to do is to copy the folder ‘ckeditor’ from the downloaded zip file to your program files folder and you’re ready to go. The main steps are below:

Include ckeditor.js in the head section of your code:

<!-- make sure the src path points to your copied ckeditor folder -->
<script src="ckeditor/ckeditor.js"></script>

Create a text area for the editor in the body section followed by your CKEditor instance:

<!-- creating a text area for my editor in the form -->
<textarea id="myeditor" name="myeditor" id="myeditor"></textarea>

<!-- creating a CKEditor instance called myeditor -->
<script type="text/javascript">
	CKEDITOR.replace('myeditor');
</script>

This simple full code renders you a CKEditor with the default configuration options (Default toolbar, width, height etc. – All of these can be customized). Continue reading “Adding CKEditor to webpages in PHP #Code”