[Diagram: GPT architecture, 12 transformer blocks (Transformer Block 1 through Transformer Block 12) stacked on top of an Embedding Matrix]
[Diagram: a large dataset is used to pretrain models such as BERT, GPT, and BART, which are then adapted to downstream tasks: sentiment and NER; QA, translation, and summarization; chatbots]
Benchmark datasets:
Stanford Sentiment Treebank (SST): sentiment classification
SNLI: natural language inference
LAMBADA: predict the last word of a long passage
(a loading sketch with the `datasets` library follows below)
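All three benchmarks are available through the Hugging Face `datasets` library; a minimal loading sketch follows. The Hub identifiers are assumptions and may differ from the exact copies used in the lecture.

from datasets import load_dataset

# Hub identifiers below are assumptions; check the Hub for the exact dataset names.
sst = load_dataset("stanfordnlp/sst2")               # Stanford Sentiment Treebank (binary labels)
snli = load_dataset("stanfordnlp/snli")              # natural language inference pairs
lambada = load_dataset("EleutherAI/lambada_openai")  # long passages; predict the final word

print(sst["train"][0])   # e.g. {'idx': 0, 'sentence': '...', 'label': 0}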
[Diagram: three copies of the 12-block transformer stack, one fine-tuned model per dataset]
Hardware requirements grow with model size (a rough memory estimate follows below):
A single GPU for smaller models (a T4 has 16 GB of memory; a V100 has 32 GB)
A single node with multiple GPUs for models with a few billion parameters
Multi-node GPU clusters for models with many billions of parameters
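A back-of-the-envelope calculation makes the scaling clear: training with Adam in fp32 needs roughly 16 bytes per parameter (weights, gradients, and two optimizer states), before activations are even counted. The sketch below uses that rule of thumb; it is an estimate, not a measurement.

# Rough training-memory estimate: ~16 bytes/parameter for fp32 weights,
# gradients, and Adam moment estimates, ignoring activations and buffers.
def training_memory_gb(num_params, bytes_per_param=16):
    return num_params * bytes_per_param / 1e9

for name, n in [("GPT (110M params)", 110e6), ("GPT-2 (1.5B params)", 1.5e9)]:
    print(f"{name}: ~{training_memory_gb(n):.0f} GB")
# GPT (110M params): ~2 GB    -> fits on a single T4 (16 GB)
# GPT-2 (1.5B params): ~24 GB -> already too large for a T4, needs V100s or multiple GPUs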
summarize: <input text>
Watched the transformer movie
[Diagram: GPT, a stack of 12 transformer blocks over an embedding matrix]
GPT: 110 million parameters
GPT-2: 1.5 billion parameters
Layers: \(4\times\) more (12 to 48)
Parameters: roughly \(10\times\) more (a quick parameter-count check follows the diagram below)
[Diagram: GPT-2, a stack of 48 transformer blocks over an embedding matrix]
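The layer and parameter ratios can be sanity-checked by building both configurations and counting parameters. The config values below follow the published GPT-2 sizes; the small config is a stand-in for the original GPT, so its count comes out near 124M rather than 110M, and instantiating the 48-layer model allocates the full weights and needs several GB of RAM.

# Count parameters for a 12-layer GPT-style model vs. a 48-layer GPT-2 XL-scale model.
from transformers import GPT2Config, GPT2LMHeadModel

def count_params(config):
    model = GPT2LMHeadModel(config)                       # randomly initialized
    return sum(p.numel() for p in model.parameters())

small = GPT2Config(n_layer=12, n_head=12, n_embd=768)     # GPT / GPT-2 small scale
xl = GPT2Config(n_layer=48, n_head=25, n_embd=1600)       # GPT-2 XL scale

print(f"12 layers: {count_params(small) / 1e6:.0f}M parameters")
print(f"48 layers: {count_params(xl) / 1e9:.2f}B parameters")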
Prompt: guide the user to reach the destination
Text: How to reach Marina Beach from IIT Madras
Output: by train
Prompt: Classify the text into neutral, negative or positive. Text: I enjoyed watching the transformers movie. Sentiment:
Output: positive
(a prompting sketch with a generation pipeline follows below)
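Prompts like these can be fed to a plain language model through a text-generation pipeline; the sketch below uses the small `gpt2` checkpoint as a stand-in. A vanilla LM of this size will often just continue the text instead of following the instruction, which is exactly the failure the quote below points at.

# Zero-shot prompting a plain (non-instruction-tuned) language model.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = ("Classify the text into neutral, negative or positive. "
          "Text: I enjoyed watching the transformers movie. Sentiment:")
output = generator(prompt, max_new_tokens=5, do_sample=False)
print(output[0]["generated_text"])   # the completion may or may not be a valid label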
"Making language models bigger does not inherently make them better at following a user’s intent." -[Instruct GPT paper]
Text: Guide me on how to reach Marina Beach from IIT Madras
GPT: by train
Text: Guide me on how to reach Marina Beach from IIT Madras
InstructGPT: detailed step-by-step directions (*actual response from ChatGPT)
# Pretraining a GPT-2-style model from scratch: the building blocks
from datasets import load_dataset
from transformers import AutoTokenizer
from transformers import DataCollatorForLanguageModeling
from transformers import GPT2Config, GPT2LMHeadModel
from transformers import TrainingArguments, Trainer
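Put together, the pieces above are enough for a minimal from-scratch pretraining run. The dataset choice (wikitext-2), the model size, and the hyperparameters below are illustrative assumptions, not the exact setup from the lecture.

# Minimal from-scratch causal-LM pretraining sketch using the imports above.
raw = load_dataset("wikitext", "wikitext-2-raw-v1")        # placeholder corpus
tokenizer = AutoTokenizer.from_pretrained("gpt2")          # reuse the GPT-2 tokenizer
tokenizer.pad_token = tokenizer.eos_token

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = raw.map(tokenize, batched=True, remove_columns=["text"])
tokenized = tokenized.filter(lambda ex: len(ex["input_ids"]) > 0)   # drop empty lines

config = GPT2Config(n_layer=12, n_head=12, n_embd=768)     # randomly initialized weights
model = GPT2LMHeadModel(config)

collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)   # causal-LM objective
args = TrainingArguments(output_dir="gpt2-from-scratch",
                         per_device_train_batch_size=8,
                         num_train_epochs=1)
trainer = Trainer(model=model, args=args,
                  train_dataset=tokenized["train"],
                  data_collator=collator)
trainer.train()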
# Fine-tuning a pretrained GPT-2 for sequence classification
from transformers import GPT2ForSequenceClassification
from transformers import TrainingArguments, Trainer

# the checkpoint name and label count are illustrative placeholders
model = GPT2ForSequenceClassification.from_pretrained("gpt2", num_labels=2)
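A fine-tuning sketch on SST-2 using the classifier defined above; the dataset id, column names, and hyperparameters are assumptions for illustration. GPT-2 defines no pad token, so one has to be assigned before batching.

# Fine-tuning the GPT-2 classifier on SST-2 (sentiment).
from datasets import load_dataset
from transformers import AutoTokenizer

dataset = load_dataset("stanfordnlp/sst2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
model.config.pad_token_id = tokenizer.pad_token_id    # GPT-2 has no pad token by default

def tokenize(batch):
    return tokenizer(batch["sentence"], truncation=True,
                     padding="max_length", max_length=128)

encoded = dataset.map(tokenize, batched=True)

args = TrainingArguments(output_dir="gpt2-sst2",
                         per_device_train_batch_size=16,
                         num_train_epochs=1)
trainer = Trainer(model=model, args=args,
                  train_dataset=encoded["train"],
                  eval_dataset=encoded["validation"])
trainer.train()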
# Parameter-efficient fine-tuning with LoRA
from peft import LoraConfig
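Instead of updating all weights, a LoRA adapter can be attached to the classification model above through PEFT; the rank, scaling factor, and target modules below are illustrative choices, not prescribed values.

# Attach a LoRA adapter so that only low-rank update matrices are trained.
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=8,                        # rank of the update matrices
    lora_alpha=16,              # scaling factor
    target_modules=["c_attn"],  # GPT-2 fuses Q, K, V into a single c_attn projection
    lora_dropout=0.05,
    task_type="SEQ_CLS",
)
peft_model = get_peft_model(model, lora_config)
peft_model.print_trainable_parameters()   # only a small fraction of the weights is trainable
# The same Trainer recipe as above can be reused with peft_model in place of model.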
Supervised Fine-Tuning (SFT)
Continual Pretraining
Instruction Fine-Tuning (a data-formatting sketch follows below)
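For instruction fine-tuning, each (instruction, response) pair is rendered into a single training string with a fixed template and then trained on with the usual causal-LM objective. The template and the example response below are illustrative, not a standard.

# Render (instruction, response) pairs into training text for instruction fine-tuning.
def format_example(instruction: str, response: str) -> str:
    return f"### Instruction:\n{instruction}\n\n### Response:\n{response}"

example = format_example(
    "Guide me on how to reach Marina Beach from IIT Madras",
    "One option is a suburban train towards the beach; a bus or taxi also works.",  # placeholder answer
)
print(example)
# Each formatted string is tokenized and trained on like ordinary causal-LM data,
# but the curated pairs teach the model to follow instructions.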
How does an LLM generate a factual answer for a question?
InstructGPT
RAG (Retrieval-Augmented Generation)
Agent
Indexing: get the learned embedding of each document chunk (a minimal sketch follows below)
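Indexing for RAG means computing a learned embedding for every document chunk and searching them at query time. The sketch below assumes the sentence-transformers library and a small checkpoint, with brute-force cosine search standing in for a real vector index such as FAISS; the chunks themselves are illustrative.

# Minimal RAG indexing/retrieval sketch: embed chunks, then find the chunk
# closest to the query embedding.
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")

chunks = [
    "Marina Beach is an urban beach in Chennai along the Bay of Bengal.",
    "IIT Madras is a technical university located in Chennai.",
    "The transformer architecture is built around self-attention.",
]

# Indexing: get the learned embedding of every chunk (normalized for cosine similarity).
index = encoder.encode(chunks, normalize_embeddings=True)

# Retrieval: embed the question and pick the most similar chunk.
query = encoder.encode(["How do I reach Marina Beach?"], normalize_embeddings=True)
scores = index @ query.T
best = int(np.argmax(scores))
print(chunks[best])   # retrieved context that would be placed in the LLM prompt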