Locating the Dinosaur in the Language Model

What does it mean to understand language?

How does self-reflection manifest?

These questions formed the running thread of the first Dining with Dinosaurs with Professor Tarassenko. ‘Everything you wanted to know about ChatGPT but were afraid to ask’ turned into an excellent foundation on the inner workings of ChatGPT, one that also reflected on insights from earlier seminars on Turing Tests.

In this piece, allow for the fact that this geographer will not come close to the elegant, mathematically grounded explanations rendered in the presentation. ChatGPT and other language models have been trained on vast libraries of textual content from the internet. Each word is converted into a token, and each token is then analysed for the words that appear near it. The ‘nearness’ of these words determines the values assigned to the tokens, so the ‘meaning space’ around each token comes to include words that tend to occur near each other in text. Words like fish and sea are far more likely to occupy the same general meaning space than, say, Optimus Prime and fish. Even (maybe) accounting for the rare Tumblr post referencing Optimus Prime’s liking of seafood.
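The idea of a meaning space can be sketched in a few lines of code. This is a toy illustration only: the three-dimensional vectors below are invented for the example (real models learn embeddings with hundreds or thousands of dimensions from data), but the comparison itself, cosine similarity, is how nearness in such a space is commonly measured.

```python
import math

# Toy 3-dimensional "meaning space" vectors, invented for illustration.
# Real embeddings are learned from text and have far more dimensions.
embeddings = {
    "fish":    [0.9, 0.8, 0.1],
    "sea":     [0.8, 0.9, 0.2],
    "optimus": [0.1, 0.0, 0.9],
}

def cosine_similarity(a, b):
    """Measure how closely two vectors point in the same direction (1 = identical)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# "fish" and "sea" sit close together in this space...
print(cosine_similarity(embeddings["fish"], embeddings["sea"]))
# ...while "fish" and "optimus" are far apart.
print(cosine_similarity(embeddings["fish"], embeddings["optimus"]))
```

Words that co-occur in text end up with similar vectors, so fish and sea score high while fish and Optimus Prime score low.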

On top of the core concepts of tokenization, embedding, and meaning space, language models use self-attention to understand the relationships between words. This allows the model to assign weights to different words in order of relevance and to predict which word should come next in a sentence. This ability improves as the text is scaled up – the more text the model has access to, the better its predictions become. Examples of what this looks like came from BabyGPT, which learned over 30,000 iterations to write complete sentences in the Harry Potter universe by predicting the next letter. It gets better and better and better, but might not make the most sense (to us).
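The next-letter game BabyGPT plays can be imitated, very crudely, with simple letter-pair counts. This sketch is not self-attention and is nothing like the real BabyGPT (which is a neural network trained on Harry Potter text); the tiny corpus below is invented. But it shows the core loop of character-level prediction: learn which letter tends to follow which, then generate text one letter at a time.

```python
from collections import Counter, defaultdict

# A tiny invented stand-in corpus (the talk's BabyGPT trained on Harry Potter text).
corpus = "the wizard waved the wand and the owl flew over the tower"

# Count, for each character, which characters tend to follow it.
follows = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    follows[current][nxt] += 1

def next_letter(ch):
    """Predict the most frequently observed next character after ch."""
    return follows[ch].most_common(1)[0][0]

# Generate text one character at a time, always taking the likeliest next letter.
text = "t"
for _ in range(20):
    text += next_letter(text[-1])

print(text)  # the the the the the t
```

With so little data the model collapses into repeating its most common word, which is exactly the early-iteration behaviour BabyGPT shows before scale makes its output fluent.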

What becomes apparent is ChatGPT’s impressive (and often faintly insidious) ability to figure out what the user is asking, gather the necessary and relevant information, and present a fairly natural, coherent response. A useful thing for many language and information processing tasks!

But does ChatGPT understand language in the same way as us? Definitely not yet.

Do we need it to?

A question for another day for the Professor.


A recording of Lionel Tarassenko's talk can be viewed on Reuben's YouTube channel.