Simply explained: how does GPT work?

sizeoftheuniverse@programming.dev · 2 years ago

qwertyasdef@programming.dev · 2 years ago

“Ask it a question about basketball. It looks through all documents it can find about basketball…”

I get that this is a simplified explanation, but this part can be misleading. The model doesn’t contain the original documents and, by default, has no internet access to look them up. (Lookup can be added as an extra feature, but even then it’s used more as a source to show humans than as something for the model to learn from on the fly.) The word associations are all learned during training; during inference the model just uses the stored weights. One implication is that the model doesn’t know about anything that happened after its training data was collected.
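
To make the training/inference split concrete, here is a toy sketch. It is not how GPT is actually built (GPT is a neural network, and this is just a word-pair counter), and the names `train`, `predict_next` and the example sentences are made up for illustration. The point it shows: the documents are only read during training, and afterwards the model answers from the stored numbers alone.

```python
from collections import Counter, defaultdict


def train(documents):
    """Count which word follows which, then turn the counts into probabilities.

    The returned dict plays the role of the 'stored weights' in this toy.
    """
    counts = defaultdict(Counter)
    for doc in documents:
        words = doc.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1

    weights = {}
    for prev, followers in counts.items():
        total = sum(followers.values())
        weights[prev] = {w: c / total for w, c in followers.items()}
    return weights


def predict_next(weights, word):
    """Inference: look only at the stored weights, never at the documents."""
    followers = weights.get(word.lower())
    if not followers:
        return None  # the word never appeared in training, so nothing is known
    return max(followers, key=followers.get)


# Hypothetical training data, invented for this example.
training_documents = [
    "the player shoots the ball",
    "the player passes the ball",
    "the ball goes in the hoop",
]

weights = train(training_documents)
del training_documents  # the documents are gone; only the weights remain

print(predict_next(weights, "the"))    # most likely word after "the"
print(predict_next(weights, "zebra"))  # unseen in training, so None
```

Anything that was not in the training data simply has no entry in the weights, which is the toy version of the training cutoff mentioned above.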

W^Unt!2@waveform.social · 2 years ago

I wonder what an ELI5 version of ‘stored weights’ would be in this context.

Lmaydev@programming.dev · edit-2 2 years ago

Numbers that capture how closely related words and their attributes are to other words.
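
A rough sketch of that idea, assuming each word is stored as a short list of numbers. The vectors and their three "attributes" below are hand-picked and hypothetical; in a real model the numbers are learned during training, there are vastly more of them, and they don't come with readable labels.

```python
import math

# Hypothetical, hand-picked vectors. Axes: (sport-ness, roundness, animal-ness).
vectors = {
    "basketball": [0.9, 0.8, 0.0],
    "hoop": [0.8, 0.6, 0.0],
    "cat": [0.0, 0.1, 0.9],
}


def cosine_similarity(a, b):
    """Return a score near 1 when two vectors point the same way, near 0 when unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


related = cosine_similarity(vectors["basketball"], vectors["hoop"])
unrelated = cosine_similarity(vectors["basketball"], vectors["cat"])
print(f"basketball vs hoop: {related:.2f}")
print(f"basketball vs cat:  {unrelated:.2f}")
```

With these made-up numbers, "basketball" scores much closer to "hoop" than to "cat", which is the kind of relatedness the stored weights encode.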