GPT3 Linguistics 101, a multi-part series. This is Part 1, On Structure.
Overview
OpenAI’s GPT3, offered through their API, is a wonderfully expressive technology. The API is text in/text out: it converts the input text into the output text by way of an extremely large “transformer network” language model. The API is so expressive that it is easy to get lost in conversation/interaction with it and miss forensic details about its behavior. The more sophisticated your desired use, the more important it becomes to spot and understand the nuances of GPT3.
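To make “text in/text out” concrete, here is a minimal sketch of a single API call, assuming the GPT3-era OpenAI Python client (openai<1.0) and an API key set in the environment; the prompt, engine name, and sampling parameters are illustrative choices, not anything prescribed in this essay.

```python
# Minimal sketch of the text in/text out loop, assuming the
# GPT3-era OpenAI Python client (pip install "openai<1.0").
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]  # assumed to be set

prompt = "Once upon a time, in a land of language models,"  # illustrative input text

response = openai.Completion.create(
    engine="davinci",   # the base GPT3 engine
    prompt=prompt,      # text in
    max_tokens=32,      # cap on how much text comes back
    temperature=0.7,    # sampling randomness
)

print(response.choices[0].text)  # text out
```

Everything interesting in this series happens between those two lines of text: the prompt you send and the completion you get back.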
Note about the intent of this essay series
This post is not an architecture essay, nor a deep technical one; it is about understanding and using the behavior of the API and GPT3. (If you want a quick developer tutorial, Twilio has a good one.)
Interested readers should read up on transformers and the GPT2 and GPT3 architectures on their own. Transformer networks are sufficiently complex that knowing their architecture in detail doesn’t offer much guidance on the emergent behavior of the network. This is a common experience in physics, chemistry, and biology: different levels of a system have their own unique properties, and the relationships between levels are often correlative but not fully deterministic. Thus the experiments and exercises here explore the linguistic, algebraic, and statistical behavior of GPT3 and the API.
Additionally, we will cover what GPT3 is made of: human language as put onto the web AND the encoding algebra (the…