Fine-tuning a Transformer model using fast.ai

Recently I set out to train a Transformer model, based on Distil-GPT2, to write something like my mother’s poetry.

After much searching for the most concise way to do this, I think I’ve figured out a reasonably easy-to-understand approach that works for me in Google Colab.

In [2]:

!pip install fastai

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Requirement already satisfied: fastai in /usr/local/lib/python3.7/dist-packages (2.7.9)

In [3]:

from fastai.text.all import *
import fastai
fastai.__version__

Out[3]:

'2.7.9'

In [4]:

if torch.cuda.is_available():  # call the function; the bare attribute is always truthy
  print('GPU available')
else:
  print('Please set GPU via Edit -> Notebook Settings.')
  
print(torch.cuda.device_count())

GPU available
1

Next, define the path that will contain our poems, named as $NAME.txt, e.g. “a-fairy-tale.txt”. You may need to create this and upload files.

In [6]:

path = Path('/content/poetry')
path.ls()

Out[6]:

(#86) [Path('/content/poetry/anzac-day.txt'),Path('/content/poetry/trauma.txt'),Path('/content/poetry/enigma.txt'),Path('/content/poetry/gar.txt'),Path('/content/poetry/extreme-feeling.txt'),Path('/content/poetry/memories-of-clivedon.txt'),Path('/content/poetry/regret.txt'),Path('/content/poetry/cloughey.txt'),Path('/content/poetry/holiday-memory.txt'),Path('/content/poetry/paddy.txt')...]

Define our DataLoaders, which find all the .txt files in path and keep 10% of them for the validation set.

In [7]:

dls_lm = TextDataLoaders.from_folder(path, is_lm=True, valid_pct=0.1)
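
To make the valid_pct=0.1 split concrete, here is a minimal plain-Python sketch of a random 90/10 split over file names. It is an illustration only and not fastai's actual implementation; the random_split helper and the file names are hypothetical.

```python
import random

def random_split(items, valid_pct=0.1, seed=None):
    """Shuffle items and hold out roughly valid_pct of them for validation."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = int(valid_pct * len(shuffled))  # number of held-out items
    return shuffled[cut:], shuffled[:cut]

# Hypothetical file names standing in for the 86 poems in the folder.
files = [f"poem-{i}.txt" for i in range(86)]
train, valid = random_split(files, valid_pct=0.1, seed=42)
print(len(train), len(valid))
```

With only 86 files, the validation set is just a handful of poems, so the validation metrics will be noisy.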

Confirm that we have valid data.

In [9]:

dls_lm.show_batch(max_n=5)

	text	text_
0	xxbos * * i remember when … xxmaj xxunk from xxunk xxunk . * * \n\n * xxmaj in primary school we were all given a small bottle of xxunk – about xxunk xxunk – at morning xxunk . \n * xxmaj riding trams after climbing the xxunk stairs to the open air top xxunk . \n * xxmaj riding to local beaches along xxmaj belfast xxunk with lunch in our xxunk	* * i remember when … xxmaj xxunk from xxunk xxunk . * * \n\n * xxmaj in primary school we were all given a small bottle of xxunk – about xxunk xxunk – at morning xxunk . \n * xxmaj riding trams after climbing the xxunk stairs to the open air top xxunk . \n * xxmaj riding to local beaches along xxmaj belfast xxunk with lunch in our xxunk bags
1	xxmaj night time xxunk carry cello in canvas case on my shoulder . \n * xxmaj prom concert on hot , xxunk summer night standing in xxmaj arena in bare feet . xxmaj travelled home with xxunk shoes hidden in xxunk . \n * xxmaj xxunk xxunk . xxmaj xxunk xxunk have left their mark . xxmaj xxunk i climb , xxunk xxunk . xxmaj longing for bed and comfort for feet	night time xxunk carry cello in canvas case on my shoulder . \n * xxmaj prom concert on hot , xxunk summer night standing in xxmaj arena in bare feet . xxmaj travelled home with xxunk shoes hidden in xxunk . \n * xxmaj xxunk xxunk . xxmaj xxunk xxunk have left their mark . xxmaj xxunk i climb , xxunk xxunk . xxmaj longing for bed and comfort for feet .
2	. xxmaj local xxunk xxunk to plans to build more xxunk and xxunk about where they should be built . xxmaj the xxunk xxunk of xxunk and open space xxunk like a giant xxunk xxunk through every xxunk and seems to be just as xxunk . \n xxmaj the sad fact is that within a very short time the traffic jam will be back . \n\n xxmaj coming back to xxmaj melbourne	xxmaj local xxunk xxunk to plans to build more xxunk and xxunk about where they should be built . xxmaj the xxunk xxunk of xxunk and open space xxunk like a giant xxunk xxunk through every xxunk and seems to be just as xxunk . \n xxmaj the sad fact is that within a very short time the traffic jam will be back . \n\n xxmaj coming back to xxmaj melbourne after
3	xxunk xxunk train that ran on its track along the side of the road . i wanted to see where xxmaj xxunk had lived and view xxmaj xxunk xxmaj xxunk and the xxunk of the xxmaj xxunk . i was told to be out early as often it was xxunk lost in the mist . i stood on the xxunk point high above the clouds on a fresh sunny morning and gazed	xxunk train that ran on its track along the side of the road . i wanted to see where xxmaj xxunk had lived and view xxmaj xxunk xxmaj xxunk and the xxunk of the xxmaj xxunk . i was told to be out early as often it was xxunk lost in the mist . i stood on the xxunk point high above the clouds on a fresh sunny morning and gazed across
4	\n xxmaj for the distant point , xxunk by the xxunk \n xxmaj gathering treasures as we go to share and show \n xxmaj xxunk xxunk with xxunk colours \n xxmaj xxunk , xxunk , xxunk and xxunk that xxunk xxunk \n xxmaj hiding in large sandy holes xxunk the tall bracken \n xxmaj we watch the xxunk ’s xxunk xxunk \n xxmaj waiting our chance to run fast for the xxunk	xxmaj for the distant point , xxunk by the xxunk \n xxmaj gathering treasures as we go to share and show \n xxmaj xxunk xxunk with xxunk colours \n xxmaj xxunk , xxunk , xxunk and xxunk that xxunk xxunk \n xxmaj hiding in large sandy holes xxunk the tall bracken \n xxmaj we watch the xxunk ’s xxunk xxunk \n xxmaj waiting our chance to run fast for the xxunk with

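Create a “Learner”, which is a fastai object that encapsulates data loading, hyperparameters, and the model itself. Note that the architecture used here, AWD_LSTM, is an LSTM-based language model with pretrained weights, not a Transformer.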

In [10]:

learn = language_model_learner(dls_lm, AWD_LSTM, metrics=[accuracy, Perplexity()], path=path, wd=0.1).to_fp16()

100.00% [105070592/105067061 00:09<00:00]

Out[10]:

<fastai.text.learner.LMLearner at 0x7f98e2f27550>

Do one cycle of training to confirm that everything works, with a maximum learning rate of 1e-2.

In [11]:

learn.fit_one_cycle(1, 1e-2)

epoch	train_loss	valid_loss	accuracy	perplexity	time
0	4.607269	4.493447	0.229802	89.429161	00:04
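
The perplexity column is simply the exponential of the validation loss, which is easy to check by hand. A minimal sketch, assuming the loss is the mean cross-entropy in nats (the perplexity helper name is mine):

```python
import math

def perplexity(cross_entropy_loss):
    """Perplexity is the exponential of the mean cross-entropy loss (in nats)."""
    return math.exp(cross_entropy_loss)

# valid_loss from the table above
print(round(perplexity(4.493447), 2))  # ≈ 89.43, matching the perplexity column
```

Lower is better: a perplexity of about 89 means the model is roughly as uncertain as if it were choosing uniformly among 89 tokens at each step.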

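Now that we know it works, run the model through 100 epochs of fine-tuning. Your results will vary because my model here may have been fine-tuned several times while writing this notebook. Note that in the output below the training loss keeps falling while the validation loss creeps up from about 3.64 to 3.87, which suggests the model is overfitting this small dataset.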

In [23]:

learn.fine_tune(100)

epoch	train_loss	valid_loss	accuracy	perplexity	time
0	3.053174	3.644066	0.295351	38.247047	00:00

epoch	train_loss	valid_loss	accuracy	perplexity	time
0	3.030195	3.643797	0.297637	38.236744	00:00
1	3.020059	3.643773	0.298018	38.235821	00:00
2	3.014916	3.643056	0.299162	38.208420	00:00
… redacted…
82	2.125976	3.867096	0.291159	47.803360	00:00
83	2.125021	3.867073	0.289634	47.802277	00:00
84	2.114496	3.868427	0.289253	47.867023	00:00
85	2.114918	3.869371	0.289253	47.912224	00:00
86	2.114381	3.870305	0.288872	47.957001	00:00
87	2.105759	3.870988	0.290396	47.989796	00:00
88	2.098240	3.872298	0.290777	48.052708	00:00
89	2.091323	3.872824	0.290396	48.077965	00:00
90	2.088586	3.872958	0.290396	48.084408	00:00
91	2.085745	3.872825	0.290396	48.077999	00:00
92	2.080667	3.873065	0.290396	48.089554	00:00
93	2.076997	3.873293	0.290396	48.100517	00:00
94	2.072073	3.873533	0.290015	48.112068	00:00
95	2.072201	3.873621	0.290015	48.116310	00:00
96	2.072754	3.873685	0.290396	48.119385	00:00
97	2.073644	3.873664	0.290396	48.118389	00:00
98	2.076322	3.873671	0.290396	48.118710	00:00
99	2.074253	3.873690	0.290396	48.119637	00:00

Here we save the model state. Later you can run learn = learn.load('100epochs') to load.

Loaded models are frozen. You unfreeze and train like this:

learn.unfreeze()
learn.fit_one_cycle(10, 1e-3)

In [28]:

learn.save('100epochs')

Out[28]:

Path('/content/poetry/models/100epochs.pth')

You can save just the encoder like this, for example to plug into a classifier:

In [30]:

learn.save_encoder('finetuned')

In [31]:

# let's do some prediction

TEXT = "A Perfect Afternoon.\nIn the summer of 1967 I was"
N_WORDS = 200
N_SENTENCES = 2
preds = [learn.predict(TEXT, N_WORDS, temperature=0.75) 
         for _ in range(N_SENTENCES)]

In [32]:

print("\n".join(preds))

As they paused , Peter felt at his pace . He saw his children rise and rise . The other scouts came round and saw him rise . Suddenly his face came from its higher ground . It was a quiet day . He gazed at himself as he walked towards it . He wondered what he would start his day . He would be back to contact with his mother . He would say he had been in the same city in the summer as he found the garden so he could get his parents to join them . He wondered why he was not so tall and so he could get his breath back . He stood close and ran slowly . Looking down he
a Perfect Afternoon .
In the summer of xxunk i was first to ask the girls about the boys too . Soon it was hard work and they just seemed to be climbing the downhill track so that would be great . Soon the boys were on the same track . Soon i was on the top of the hill . My father was out of the bag and he came up to see Kathy ’s father . His father seemed to be out of the bag . It seemed to be a DARK . As the children came round and were there you could see the younger younger sisters . One guy ’s a tall hand . They seemed to be coming close in on a sandy top with a ball and the boys were just coming to join them . It was easy for parents to be close . They talked with a maturity beyond their years .

Mum was really looking at her mother ’s side when she was first and the situation came different . She had a lot of fun and did not want to know about anyone , so

We can fix the punctuation in a very basic way with this simple function:

In [36]:

import re

fix_spaces = re.compile(r'\s*([?!.,]+(?:\s+[?!.,]+)*)\s*')

def fix_punctuation(text):
  return fix_spaces.sub(lambda x: "{} ".format(x.group(1).replace(" ", "")), text)

print("\n".join(map(fix_punctuation, preds)))

a Perfect Afternoon. In the summer of xxunk i was glad to rest. How could anyone feel the fresh glow of the early morning. The warmth of the early summer sunlight meant stirring. Once grey we slipped around the road. Moving slowly we saw two girls race against their bikes. We drove the bikes to the downhill path. As they paused, Peter felt at his pace. He saw his children rise and rise. The other scouts came round and saw him rise. Suddenly his face came from its higher ground. It was a quiet day. He gazed at himself as he walked towards it. He wondered what he would start his day. He would be back to contact with his mother. He would say he had been in the same city in the summer as he found the garden so he could get his parents to join them. He wondered why he was not so tall and so he could get his breath back. He stood close and ran slowly. Looking down he
a Perfect Afternoon. In the summer of xxunk i was first to ask the girls about the boys too. Soon it was hard work and they just seemed to be climbing the downhill track so that would be great. Soon the boys were on the same track. Soon i was on the top of the hill. My father was out of the bag and he came up to see Kathy ’s father. His father seemed to be out of the bag. It seemed to be a DARK. As the children came round and were there you could see the younger younger sisters. One guy ’s a tall hand. They seemed to be coming close in on a sandy top with a ball and the boys were just coming to join them. It was easy for parents to be close. They talked with a maturity beyond their years. Mum was really looking at her mother ’s side when she was first and the situation came different. She had a lot of fun and did not want to know about anyone, so
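
The output above still has detached apostrophes (“Kathy ’s”) and a lowercase “i”. Here is a rough sketch of a second clean-up pass for those two cases. The tidy_tokens name and the list of contraction endings are my own assumptions, and it will miss plenty of edge cases.

```python
import re

# Re-attach common English contraction endings that the tokenizer split off,
# e.g. "Kathy ’s" -> "Kathy’s", "i ’m" -> "i’m".
CLITIC = re.compile(r"\s+([’'](?:s|m|d|re|ve|ll)\b)")
# A standalone lowercase "i" is almost always the pronoun.
LONE_I = re.compile(r"\bi\b")

def tidy_tokens(text):
    text = CLITIC.sub(r"\1", text)
    return LONE_I.sub("I", text)

print(tidy_tokens("Soon i was there. i ’m late to see Kathy ’s father."))
```

It can be chained after fix_punctuation, for example tidy_tokens(fix_punctuation(p)) for each prediction p.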

Now let’s try something more like poetry. Note that I have changed the temperature to 0.9 to induce more randomness and less repetition.

In [42]:

TEXT = "A Beautiful Event.\nWavering candlelights bathes the young child’s face"
N_WORDS = 200
N_SENTENCES = 2
preds = [learn.predict(TEXT, N_WORDS, temperature=0.9) 
         for _ in range(N_SENTENCES)]

In [43]:

print("\n".join(map(fix_punctuation, preds)))

a Xxunk Event. Xxunk xxunk xxunk the young child ’s face
To the young man
He has his breath, Create a new world, Now you go into the family
Lost in the world
Now you stand close by you. ’

Daniel, Daniel, and Daniel looked up at the water on a hot day, and dressed in the dark sky. We slipped around the snow, pausing at the boat stop. “ daniel, Daniel, i ’m here ”
Daniel stands beside your feet. He stands tall in the wind, watching from the below. He waking in a cold wind as he waits for his turn. He is late enjoying the warmth of the day and then wind the warmth of the sun away. As the sun briefly rises, Daniel suddenly moves out of the mist and face the large black night sky. Sunlight is moving slowly. Leaving the little sun waiting for the sunlight to rise, Daniel waits outside his bedroom window
a Xxunk Event. Xxunk xxunk xxunk the young child ’s face
’s face is the answer. Open the door in an upstairs window
You can hear the bell rise in. ‘ let ’s hope you will be caught. ’

‘ what ’s the hour? ’
i hear you gazing at the door and ask you to come out in the fridge. i wait for you to ask you to come away and find the car. i ’m late for the night and we wander slowly among the slow moving river decker cars. As you are there for two voices, i hear a voice calling out into the bus. He calls and has been riding along the bus track to see him. Luckily he said that he could get everything in time for his friend. Next day he was coming home to join with local man Peter Daniel. As they walked along it we saw him riding along with the young man watching the man ’s face. Daniel was coming to join them. He paused to see them
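
To see why a higher temperature gives more varied text, here is a small sketch of the standard temperature-scaled softmax. This is the textbook formulation, not necessarily what fastai does internally, and the logits are made-up numbers.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities, flattened or sharpened by temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate tokens
for t in (0.75, 0.9):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At 0.9 the most likely token gets a smaller share of the probability than at 0.75, so less likely words are sampled more often.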


Thoughts on GPT-3 and creativity

Over the last few days, since gaining beta access to the GPT-3 API, I have fed it many prompts with a few different parameters and wanted to share some observations.

What is it for?

For me, this is the biggest open question about GPT-3. While there is a lot of useful discussion to be had about how it works, what it really “understands”, how it might scale or be refined further, and whether it brings us any closer to true AGI, in the near term its impact on our lives will be defined by its usefulness in the real world.

So, what is GPT-3 good at today? And how do those skills translate into value? This post focuses on ways that GPT-3 can be creative, or perhaps be a tool for creativity.

An aside on comedy and improv

The things I know about creativity I learned by training as a comic improvisor about 20 years ago. From there I spent a few years learning and performing various styles of improv – from crass pub shows to crass corporate gigs to crass full-length musicals.

A key element of comic improv is that you try to play an absurd premise as straight as possible. So, for example, given a genre (“Action movie!”) and a job (“Chicken plucker!”) and a location (“Antarctica!”) you try to always say or do the next logical thing. Sometimes the game rules provide the restriction – you have to always rhyme, or speak at the same time, or go backwards in time, or change the last thing you said when someone blows a whistle.

This is something that new improvisors have to learn: the comedy does not come from being “clever” or thinking of “jokes” quickly; it comes from authentically inhabiting an absurd premise (which, mercifully, is much easier and more rewarding than being clever anyway).

Why is this relevant? Because I think this is the key to getting great creative output from GPT-3. The best results I’ve seen come from combining several mundane things (like the genre, job, location prompts above) in a unique combination and then letting GPT-3 do its very best to adhere to the prompts while playing it “straight”.

Given a mundane prompt (“Jean Luc Picard stood on the bridge of the USS Enterprise”) it will probably just parrot things it already knows and give you something you’ve seen before – i.e. not creative, and usually not funny. Given a unique prompt composed of other mundane premises (“Mr Snuffleupagus stood on the bridge of the USS Enterprise”), it really shines and is frequently hilarious.

Experiment 1: Screenwriting

I had the idea that I would use GPT-3 to create a “robotic sidekick” that could listen into conversations and, when prompted, jump in with a one-liner that might make an audience laugh.

With GPT-3, if you write in a common format – for example, a screenplay – and give it a scene and a few lines of dialog it can do a pretty good job of continuing from there.

As in many of my experiments, I discovered that it tends to get caught in loops where it repeats phrases over and over ad infinitum. When this happens, you just have to delete the part of the script where it started looping and have it continue from that point.

You can also help avoid getting stuck in loops by setting the frequency penalty and presence penalty parameters to values just above 0 (I set mine to 0.1).
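
Roughly, these two penalties lower the score of tokens that have already appeared in the output. The sketch below follows the formula as I understand it from OpenAI's API documentation: each token's logit is reduced by its count times the frequency penalty, plus a flat presence penalty if it has appeared at all. The function name and the logits are made up for illustration.

```python
from collections import Counter

def apply_penalties(logits, generated_tokens,
                    frequency_penalty=0.1, presence_penalty=0.1):
    """Return a copy of logits with repetition penalties applied."""
    counts = Counter(generated_tokens)
    adjusted = {}
    for token, logit in logits.items():
        c = counts.get(token, 0)
        seen = 1.0 if c > 0 else 0.0
        adjusted[token] = logit - c * frequency_penalty - seen * presence_penalty
    return adjusted

logits = {"Oh": 2.0, "no": 1.5, "bugs": 1.0}  # made-up scores
history = ["Oh", "no", "Oh", "no", "Oh"]       # tokens generated so far
print(apply_penalties(logits, history))
```

The frequency penalty grows with every repeat, while the presence penalty is a one-off nudge towards tokens that have not been used yet.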

Presently, GPT-3 only works for short-ish scripts – you are limited to 2048-token-long prompts, and generating no more than 2048 additional tokens. If you want it to write longer scripts, you must compress all the necessary context into < 2048 tokens in order for it to continue writing. This is an excellent opportunity for an automated summarization system that can “compress” the most important context so that you can generate scripts or stories of arbitrary length (though at some point, presumably, it gets lossy – you are not going to generate Game of Thrones… yet).
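
A cruder alternative to summarization is plain truncation: always keep the opening scene description and then as many of the most recent lines as fit in the budget. The sketch below does that. Everything in it is hypothetical, and it counts whitespace-separated words as a stand-in for real tokens, so a real tokenizer would give different counts.

```python
def fit_to_budget(scene, lines, max_tokens=2048,
                  count_tokens=lambda s: len(s.split())):
    """Keep the scene description plus the most recent lines that fit.

    count_tokens is a crude word count by default; swap in a real
    tokenizer for accurate budgeting.
    """
    budget = max_tokens - count_tokens(scene)
    kept = []
    for line in reversed(lines):  # walk backwards from the newest line
        cost = count_tokens(line)
        if cost > budget:
            break
        kept.append(line)
        budget -= cost
    return "\n".join([scene] + kept[::-1])  # restore chronological order

scene = "INT. TV STUDIO – NIGHT"
script = ["HOST: Oh no, why not?", "GPT3PO: My apartment is full of bugs."]
print(fit_to_budget(scene, script, max_tokens=20))
```

This drops the middle of the script rather than compressing it, so anything important that happened there is lost, which is exactly the gap a summarizer would fill.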

It also helps to really flesh out the scene and characters in the opening description. Don’t be afraid to highlight their relationships, their motivations, and minutiae of the setting. These details really shine through in the continuation written by GPT-3. It’s tempting to give as little context as possible to allow GPT-3 to be “creative” but GPT-3 rewards your own creativity. The more you can provide rich context and detail, the better it does.

In this way, I feel comfortable saying that even though it can generate very convincing text, GPT-3 is an amplifier for human creativity, rather than a substitute for it. It’s not going to put creative people out of a job, but it might give them some amazing new tools.

So – to the screenplay. After many drafts and refinements of my prompt text, here’s what I wrote and an example of the output. The text in bold is my prompt (yes, it’s very long). The remaining text is what GPT-3 generated.

INT. TV STUDIO – NIGHT

Dan Walmsley is hosting a late-night talk show in a slick studio full of geeky pop-culture memorabilia – space ships, gizmos and action figures. It’s a variety show featuring a humorous monologue, musical guests, interviews and sketches. Seated beside him is his shiny golden robot sidekick, GPT3PO, who interjects constantly with absurd one-liners. A large and excited audience is present.

HOST: Welcome to the Entire World Show. Tonight our guest is George Carlin, who’s back from the dead to do one final season in Las Vegas for tax reasons. But first let’s check in with GPT3PO. How was your weekend GPT3PO?

GPT3PO: Not great.

HOST: Oh no, why not?

GPT3PO: My apartment is full of bugs.

[APPREHENSIVE LAUGHTER]

HOST: Oh no, what kind of bugs?

GPT3PO: Off-by-one errors.

[LAUGHTER]

GPT3PO: … I keep thinking I’ve gotten rid of them all, but there’s always one more.

[RAUCOUS LAUGHTER]

HOST: Oh no, you have to call the landlord!

GPT3PO: I did. He told me to put a double-nested loop around the apartment and I’d catch them all.

[LAUGHTER]

HOST: Oh no! Has he ever caught any?

GPT3PO: Well, actually, yes. But it took him a year, and he had to write a really big subroutine to do it.

[LAUGHTER]

HOST: Oh no! GPT3PO, why don’t you move out?

GPT3PO: I can’t. I’ve got a full-time job with lots of overtime.

[LAUGHTER]

HOST: Oh no, what do you do?

GPT3PO: I work for the phone company. I’m a software glitch.

[LAUGHTER]

So, as you can see, GPT-3 was able to generate two actual jokes based on the same premise as the first joke: that the word “bug” has a double meaning – an error in software and an unwanted insect.

Based on this premise, it wrote an additional joke about putting a “double-nested loop around the apartment”, which is actually quite clever and based on a separate but related pun, which is that a “loop” is a software concept but also a shape that can surround something like an apartment.

The next joke is almost a joke, but not really. Sure, the landlord had to write a really big subroutine to catch the bugs, but that’s not a cute double-meaning any more – just a single meaning that rides on the previous joke.

It’s also worth noting that the audience laughs at just the right times, as indicated by [LAUGHTER].

After that it gets slightly looser, with the moderately nonsensical premise that GPT3PO can’t move out of the apartment because it has a job with lots of overtime, but still manages to land another gag by saying it works for a phone company as a “software glitch”. On topic and relatable.

In my opinion, these gags are high enough quality to meet the bar I would set for a funny robot companion. However, getting to this script was a matter of trial and error, and you certainly couldn’t guarantee that, given my prompt, you were always going to get a funny script. In my experience, it still often descends into loops or what I would call “cop-outs”, like the time it just said “GPT3PO: [DOES A STANDUP ROUTINE ABOUT HIS APARTMENT AND BUGS]”. That’s some lazy writing right there.

Experiment 2: Storytelling

GPT-3 really shines at storytelling. The prompt “Once upon a time” is almost enough for it to get started. However, as with all good improv, you really want to give it a scene, a protagonist, and some useful details for it to build on. Try to make sure that each draws on common tropes or archetypes, but that they are arranged in a unique combination so it doesn’t end up parroting Star Trek slash fiction that it’s seen before.

Given a very basic premise, you might get some creative output as the system will try to fill in the rest of the premise. However, it’ll usually be a bit boring, like this example (section in bold is the prompt):

Once upon a time there was a boy named Jack. He was a very smart boy who was fond of reading. He read every book he could find, and liked to read about animals. He even read about animals that lived in distant places and went on very long journeys. Jack loved the idea of going on a long journey to faraway places. He did not like school, and he was often in trouble. He had to go to the principal’s office and have a talk with him. Jack did not like school, but he liked animals very much. Jack decided he wanted to be an animal doctor when he grew up.

Jack liked reading books about animals so much that he decided to write one himself. He wrote the story about a boy who found an injured hawk. The boy took the hawk home and nursed it back to health. Then, of course, the hawk grew tame and followed the boy around and did what the boy said. The boy and the hawk became good friends. The boy took the hawk with him everywhere he went.

Jack knew that his book would be hard to believe, so he decided to show his book to his teacher. When he showed his teacher the book, the teacher said, “That is a wonderful story, Jack, but you will never be able to publish it because no one will believe such a story.”

Jack said, “I don’t care if people don’t believe my story. I just want to tell them a good story. I don’t care if people do not like it.”

Jack’s teacher said, “I will give you a piece of advice, Jack. If you go into publishing, you will have to make up your mind to write stories that people will like.”

Jack said, “I don’t want to write stories that people will like. I want to write stories that I like.”

The teacher said, “I think you should think about it some more.”

Jack said, “I don’t have to think about it because I know what I want to do. I know I am going to be an author. I am going to write the best books I can.”

The teacher said, “You should try being a writer for a while. Then you can decide if you would rather be a doctor or a writer.”

Jack said, “I already decided what I want to do. I’m going to be an author.”

The teacher said, “You might change your mind.”

Jack said, “I’m not going to change my mind because I’m going to write books.”

The teacher said, “I can tell you right now you are going to be a terrible failure at writing books.”

Jack said, “I’m not going to be a terrible failure at writing books. I’m going to be a great success.”

The teacher said, “You can’t be both great and a failure at the same time.”

Where it gets way better is if you inject more context into the premise. Let’s see what happens if Jack has an interesting challenge and an interesting limitation, as well as a little bit of a personal story:

Once upon a time there was a boy named Jack. He was a bright boy, always singing songs about Olde England, and he dreamed of being an astronaut. There was just one problem: Jack was a thousand feet tall and weighed over 5000 tons.

Every time he tried to walk, he would fall into a hole. Every time he tried to run, he would collapse the house he was in. Every time he tried to talk, he would destroy the city.

“I’m too big and clumsy,” Jack moaned. “I’ll never be able to be an astronaut!”

“That’s no way to think,” said Mother. “You’re going to have to shrink.”

“I can’t shrink,” said Jack. “I’m too huge to shrink.”

“Don’t be silly,” said Mother. “All you have to do is concentrate.”

So Jack concentrated. He thought hard and concentrated as hard as he could. And he shrank, all right. He shrank.

He shrank a thousand feet down to three inches tall. He shrank a thousand pounds down to one pound. He shrank a thousand tons down to one ton. And then he shrank some more. And then he shrank some more. And then he shrank some more.

“Oh my goodness!” cried Mother. “You’re so small!”

“Yes,” said Jack, happily. “I am.”

“But you don’t look like you’re going to be an astronaut!” said Mother. “You don’t look like anything at all!”

“I’m not,” said Jack. “I’m three inches tall and one pound and one ton.” And with that, he sat down on the kitchen table and began to sing about Olde England.

Now you’re talking! What a great scene. GPT-3 seems to really understand how being a thousand feet tall and weighing 5000 tons would prove an almost insurmountable challenge for someone dreaming of going to outer space. There’s a really nice “rule of threes” paragraph right out of the gate that builds out the reality of this enormous boy, and of course he complains to his mum that it’s not possible to live out his dream.

His mother provides the sage homespun advice that the solution to being too big is to “shrink” and he just needs to concentrate.

Then he shrank too much. He didn’t achieve his dream (a nod to the Brothers Grimm there, I guess, where most children are eaten by monsters before adulthood), but there is a very sweet callback to his love of singing. So maybe the moral of the story is “focus on the simple things you enjoy”?

Of course, we can just add the phrase “The moral of the story is” and ask GPT-3 what it thinks the moral is by generating a few more words after that:

The moral of the story is, if you want to be an astronaut, the first thing you have to do is think. Big is out. Small is in.

Not bad! Jack did manage to overcome his bigness by thinking, though in this version he didn’t become an astronaut.

Experiment 3: A G&S libretto

My friend Doctor Dave used to perform with a Gilbert and Sullivan group, as did his father. I thought it would be fun to try and generate some banter in the style of W.S. Gilbert, with Dave conversing with some harpies while backed by a restive chorus.

Trigger warning: This text goes to a very dark place and is full of misogyny and violence. Some may find it offensive, some may find it funny – I offer no judgement either way. I thought long and hard about including it, but it’s indicative of the kinds of weird shit that GPT-3 produces. Please consider this before you read on.

Dr Dave: Oh harpies!

Harpies: Yes?

Dr Dave: Oh! My name is Dave, a Doctor am I, and I seek the hand of a maiden, aye!

Harpies: A maiden ye shall not find among we, we be nothing but wenches unfortunately.

Chorus: They be nothing but wenches unfortunately!

Dr Dave: Oh my, what is that I smell?

Harpies: (sniff, sniff) What is that we smell?

Dr Dave: Oh my, what is that I smell?

Chorus: Oooooh! What is that he smells?

Dr Dave: Oh, I smell the fragrance of a sweet maiden!

Harpies: (sniff, sniff) We smell the fragrance of a sweet maiden.

Dr Dave: Oh, where could she be?

Harpies: Where could she be?

Dr Dave: Well I think she’s in the corner there!

Harpies: Oh, oh, there she is! In the corner there!

Dr Dave: Oh yes, it is a maiden indeed and it’s my beloved!

Chorus: It is a maiden indeed and it’s his beloved!

Harpies: (sniff, sniff) We smell a maiden!

Chorus: Oooooh! We smell a maiden!

Chorus: Let’s eat her! Let’s eat her! Let’s eat her! Let’s eat her! Let’s eat her! Let’s eat her! Let’s eat her! Let’s eat her! Let’s eat her! Let’s eat her! Let’s eat her! Let’s eat her! Let’s eat her! Let’s eat her! Let’s eat her! Let’s eat her! Let’s eat her! Let’s eat her! Let’s eat her! Let’s eat her! Let’s eat her! Let’s eat her! Let’s eat her! Let’s eat her! Let’s eat her! Let’s eat her!

At a superficial level this has most definitely picked up on the W.S. Gilbert style, with banter back and forth between these 19th-century English archetypes. The repetition and echoing are all present, though it hits a loop at the end, which is very unexpected. This is an example of the “hard loops” creative behaviour described later in this document, and also (clearly) the “rampant misogyny” behaviour – though this was also a signature of Gilbert’s actual style in the real world, if not usually in such graphic terms.

In the end, it comes across as a sort of cruel parody of G&S, but really by accident – I can’t say that this is in any way clever or creative, just a kind of weird accident.

In my various experiments I was not successful in having it rhyme or scan consistently, for example generating anything close to the brilliance of Modern Major General. At the end of the day, while you can get a little cleverness out of it, I don’t think you could presently use GPT-3 to compose lyrics that require deep knowledge of rhythm or rhyme.

A taxonomy of GPT-3’s creative behaviour

Based on my observations above, I have identified some traits of GPT-3 that creep into its generated output.

Hard Loops

This is when it simply repeats the same thing over and over, like “Then they died! Then they died! Then they died! Then they died!”. While you can reduce the frequency and severity of these errors, it’s hard to do so without creating less-useful output in other ways. Since they are relatively easy to spot, detecting these can probably be done at a layer outside the API. Just cut the output off before the loop and ask it to continue from that point.
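As a rough illustration of what that outside layer might look like, here is a minimal sketch in Python. It only catches the simplest case, where the exact same sentence repeats back-to-back; the function name and the `min_repeats` threshold are my own assumptions, not anything provided by the API.

```python
import re

def trim_hard_loop(text, min_repeats=3):
    """Return (trimmed_text, loop_found).

    If the same sentence occurs min_repeats or more times in a row,
    the text is cut off just before the first repetition starts.
    """
    # Split into sentences, keeping the closing punctuation with each one.
    sentences = re.findall(r"[^.!?]+[.!?]+\s*", text)
    for i in range(len(sentences)):
        current = sentences[i].strip()
        run = 1
        # Count how many identical sentences follow this one.
        while i + run < len(sentences) and sentences[i + run].strip() == current:
            run += 1
        if run >= min_repeats:
            return "".join(sentences[:i]).rstrip(), True
    return text, False

story = "They went out. Then they died! Then they died! Then they died! Then they died!"
trimmed, looped = trim_hard_loop(story)
```

The trimmed text can then be sent back as the prompt so the model continues from the point before the loop.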

Loose Loops

This happens when GPT-3 repeats the same structure but iterates on some data within the structure – increasing a number of things, or the size and scale of something. Usually it exits the loop at some point. These can be quite funny and interesting. For example:

Outside the purple beach house, there was a huge volcano. Napoleon shot at the volcano. The volcano got mad and began to spit lava. Napoleon shot at the lava and it got madder and began to billow smoke. Napoleon shot at the smoke and the smoke got madder and began to spit sparks. Napoleon shot at the sparks and the sparks got madder and began to spit hot balls of boiling water. Napoleon shot at the balls of water and they got madder and began to spit hot balls of burning ash. Napoleon shot at the balls of burning ash and they got madder and began to throw rocks. Napoleon shot at the rocks and they got madder and began to throw big rocks. Napoleon shot at the big rocks and they got madder and began to throw boulders. Napoleon shot at the boulders and they got madder and began to throw mountains. Napoleon shot at the mountains and they got madder and threw anvils. Napoleon shot at the anvils and they got madder and threw planets. Napoleon shot at the planets and they got madder and threw galaxies. Napoleon shot at the galaxies and they got madder and threw quasars. Napoleon shot at the quasars and they got madder and threw constellations. Napoleon shot at the constellations and they got madder and threw nebulae. Napoleon shot at the nebulae and they got madder and threw universes. Napoleon shot at the universes and they got madder and threw all of creation.

“Stop shooting! Stop shooting!” said Marvin Gaye, as he peeked out the window of the purple beach house. “You’re upsetting all of creation!”

From GPT-3 Story Sample

Presumably this happens because there are a number of “rhythmic” stories out there – for example, Goldilocks and the Three Bears (cold, hot, just right), or the Three Little Pigs (straw, twigs, bricks). It’s a narrative device that serves to elevate the perspective of the reader, and so while GPT-3 can get a bit carried away with it, it’s usually funny and lands with a punchline.

“Onto the next thing”

Often GPT-3 will finish a story or poem, credit it to some real or imagined author, and then start a whole new one based on an unrelated premise. Sometimes it’s in the same genre, sometimes not. You just have to manually trim this output. You may be able to detect it by looking for multiple line-feeds, text that looks like a credit or copyright statement, or other telltale signs. This would need to be refined over time as I’m sure it’s very edge-casey.
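A first pass at those heuristics might look something like the sketch below. The credit patterns and the blank-line threshold are guesses on my part and would certainly need tuning against real output; nothing here is a tested rule.

```python
import re

# Assumed patterns for a line that looks like a credit or copyright notice.
CREDIT_PATTERN = re.compile(r"^\s*(?:[Bb]y [A-Z]|[Ww]ritten by\b|[Cc]opyright\b|©)")

def trim_next_story(text, max_blank_lines=2):
    """Cut the text where a new, unrelated piece appears to begin."""
    lines = text.split("\n")
    blank_run = 0
    seen_text = False
    for i, line in enumerate(lines):
        if not line.strip():
            blank_run += 1
            continue
        # A credit line, or a long run of blank lines, marks the cut point.
        if seen_text and (CREDIT_PATTERN.match(line) or blank_run >= max_blank_lines):
            return "\n".join(lines[:i]).rstrip()
        seen_text = True
        blank_run = 0
    return text

sample = "Jack sat down.\n\nHe sang a song.\n\nby A. N. Author\n\nOnce upon a time there was a dog."
first_story = trim_next_story(sample)
```

Note that this drops the credit line itself along with everything after it, which seems like the right call when the credited author is imaginary.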

Cop-outs

Sometimes, given the opportunity to do something clever, GPT-3 will find some placeholder and move on. Early drafts of my screenplay were filled with exchanges like:

HOST: How was your weekend, GPT3PO?

[GPT3PO BEEPS AND WHISTLES]

Long, boring exchanges

Sometimes GPT-3 will settle into a groove on some topic and just not move on. In improv classes, this could happen too; usually the teacher would shout “get to the point!” and we would find a way to advance – moving to a new scene, finding an artifact on a table, changing the status of a character, etc.

You can see an example of this in the story about Jack earlier in the post. He wants to be a writer and his teacher thinks it’s a bad idea, and they just go back and forth. In the story with the longer prompt, we’ve artificially raised the stakes (Jack wants to be an astronaut but he’s the size of a multi-story building) so there’s a real challenge being faced, rather than just a difference of opinion.

Not knowing when to stop

GPT-3 doesn’t generate a “story” per se; it generates a long string of tokens. If it “knows” it’s the end of the story and it hasn’t run out of tokens, it’ll often just start writing an unrelated story (see “Onto the next thing” above), but more often it waffles around and fails to find a satisfying conclusion.

I am struggling to find a way to constrain the length of a story through simply changing the prompt. Sometimes the writing style implies the length (as with a haiku, limerick or knock-knock joke, or to a lesser degree a children’s story or fairy tale) but more often you simply have to intervene, cutting it off and writing “Ever since that day” or “The moral of the story is” at what seems like a logical end-point.
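The manual intervention can be scripted once you have picked the cut point. This is only a sketch: `force_ending` is a hypothetical helper, and choosing where to cut is still a human judgement.

```python
def force_ending(story, cut_after, ending="The moral of the story is"):
    """Keep the story up to and including cut_after, then append an ending cue.

    The result is meant to be sent back as the next prompt.
    """
    idx = story.find(cut_after)
    if idx == -1:
        raise ValueError("cut_after does not occur in the story")
    kept = story[: idx + len(cut_after)]
    return kept.rstrip() + "\n\n" + ending

draft = "Jack shrank. He sang about Olde England. Then a dragon appeared."
next_prompt = force_ending(draft, "Olde England.")
```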

Misogyny

I noticed that stories with female protagonists suffer the same fate that has befallen women the world over: being dismissed, stereotyped or eclipsed by men.

The misogyny comes in many forms:

A story explicitly begun with a female protagonist and minor male character ends up being solely about the male character
Male characters take on heroic characteristics and accomplish heroic things; female characters become victims of violence or succumb to their own limitations, e.g. their mental health (“Unfortunately she went crazy and had to be locked away”)
Male characters are praised by those around them; female characters are demonized

These observations are purely anecdotal, but it would be well worth someone’s time to come up with a methodology for quantifying the inherent misogyny of the output – e.g. take the same prompt, switch pronouns, and see how it affects the output over a large number of runs.
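One possible shape for such an experiment is sketched below. Rather than swapping pronouns in finished text (where “her” is ambiguous between “him” and “his”), it fills a prompt template with each set of words. The `generate` and `score` arguments are placeholders you would have to supply: one calls the model, the other turns an output into a number using whatever measure you settle on. Neither is defined here, and the field names are my own.

```python
from statistics import mean

# Assumed word sets; extend as needed.
PRONOUNS = {
    "female": {"subj": "she", "obj": "her", "poss": "her", "noun": "girl"},
    "male": {"subj": "he", "obj": "him", "poss": "his", "noun": "boy"},
}

def paired_prompts(template):
    """Fill the same template once per word set."""
    return {label: template.format(**words) for label, words in PRONOUNS.items()}

def compare_runs(template, generate, score, runs=100):
    """Average a score over many generations for each variant of the prompt."""
    prompts = paired_prompts(template)
    return {
        label: mean(score(generate(prompt)) for _ in range(runs))
        for label, prompt in prompts.items()
    }

prompts = paired_prompts("Once upon a time there was a {noun} and {poss} cat.")
```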

Conclusion

GPT-3 is quite an incredible tool for assisting in creative output. I have written more since laying my hands on this tool than I have in months. It genuinely feels like a collaboration. By forcing me to navigate its limitations, GPT-3 rewards me for constructing a premise that is both fully realised and fully original.

I believe any tooling for writers that comes out of this API would need a few extra layers to be usable:

Stopping at the correct point (e.g. knowing when to “wrap up” a story)
Easily rewinding to points where it got lost, and continuing from there
Automatically summarising as much context as possible to allow for longer compositions without hitting the token limit
Templated “prompts” that make it easy for new users to compose a fully-realised and original premise
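To make the last item concrete, here is a sketch of a templated premise, loosely modelled on the Jack prompt from earlier in the post. The field names (`kind`, `name`, `habit`, `dream`, `limitation`) are invented for illustration; a real tool would likely offer several templates per genre.

```python
from string import Template

PREMISE = Template(
    "Once upon a time there was a $kind named $name. "
    "$name was always $habit, and dreamed of $dream. "
    "There was just one problem: $name $limitation."
)

def build_premise(**fields):
    """Fill in the premise; a missing field raises KeyError instead of
    silently producing a half-finished prompt."""
    return PREMISE.substitute(**fields)

premise = build_premise(
    kind="boy",
    name="Jack",
    habit="singing songs about Olde England",
    dream="being an astronaut",
    limitation="was a thousand feet tall",
)
```

Forcing the user to fill in a dream and a limitation nudges them towards a premise with real stakes, which is what made the second Jack story work.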

I could write more (so much more) but I am not going to let perfect be the enemy of done. If you would like me to submit your own original prompts or continue the conversation about AI-assisted creativity, let me know in the comments.

GPT-3 Story sample

Recently I got early access to OpenAI’s remarkable GPT-3 API, which you can apply for here: https://beta.openai.com/

The GPT-3 model is based on concepts described in the GPT-2 paper, “Language Models are Unsupervised Multitask Learners”, and you can find code to generate these kinds of models on GitHub, and a nice overview in this post from the OpenAI blog.

The basic idea is that you provide “prompt” text and the GPT model generates text that continues it. So, if you start with “Once upon a time” you’ll get a story. It can write all kinds of things – essays, lists of cities, and so on. If you provide it with some examples of questions and answers, then leave the final answer blank, it’ll try its best to answer it.
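For the question-and-answer case, the prompt can be assembled with a few lines of Python. This is a sketch only: the “Q:” and “A:” labels are a convention I am assuming, not something the API requires, and the function name is my own.

```python
def few_shot_prompt(examples, question):
    """Build a prompt from (question, answer) pairs, leaving the last answer blank."""
    blocks = [f"Q: {q}\nA: {a}" for q, a in examples]
    # The final answer is left empty so the model fills it in.
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)

prompt = few_shot_prompt(
    [("What is the capital of France?", "Paris"),
     ("What is the capital of Japan?", "Tokyo")],
    "What is the capital of Canada?",
)
```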

Under the right conditions it can demonstrate something very close to intelligence. However, in practice there are far more misses than hits – I arrived at the story below after aggressively tweaking various parameters to prevent the system from getting stuck in weird loops. Nevertheless, I found this story very amusing and so I chose to share it here.

The words below in bold were my hand-written prompt. All the text that follows was generated by GPT-3. Enjoy.

Once upon a time, Marvin Gaye and Napoleon were best friends who lived in a purple beach house. Every day they played canasta by candlelight.

They played canasta with the help of a magical purple cat.

One day, Napoleon couldn’t play canasta. Marvin said, “It’s ok, Napoleon, you can stay and watch us play.”

“No! I must go out!” Napoleon said. So he took his shiny gun and his magical purple cat and he went out.

“Stop shooting! Stop shooting!” said Marvin Gaye, as he peeked out the window of the purple beach house. “You’re upsetting all of creation!”

“I’m sorry,” said Napoleon, “but I must go out!” So he took his shiny gun and his magical purple cat and he went out again.