How large language models work, a visual intro to transformers

Oct 27, 2024 · 15m 31s
Description

This episode explores the inner workings of large language models (LLMs) like ChatGPT, focusing on the transformer architecture. The speaker starts by defining what LLMs are and how they use pre-trained transformers to generate text. Most of the discussion covers the attention mechanism, which lets an LLM relate the words in a sentence to one another and interpret each word in its context. The video takes a visual approach and uses simple analogies to explain complex concepts. It also briefly discusses the embedding process, which translates words into numerical representations, and the softmax function, which normalizes a list of numerical scores into a probability distribution.
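
As a rough illustration of the softmax step mentioned above, here is a minimal Python sketch. It is not taken from the episode: the function name `softmax` and the example scores are assumptions chosen for illustration. It shows how a list of raw scores can be turned into values that are all positive and sum to 1.

```python
import math

def softmax(scores):
    """Turn a list of raw scores into a probability distribution."""
    # Subtracting the maximum keeps exp() from overflowing on large scores;
    # it does not change the result.
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next words (made-up values).
scores = [2.0, 1.0, 0.1]
probs = softmax(scores)
print(probs)
```

Higher scores receive larger probabilities, and the ordering of the scores is preserved in the output.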
Information
Author Alan Shore and Denise
Organization DeepDive
Website -