Generative AI Fundamentals
Understanding Generative AI
What is Generative AI? Generative AI refers to artificial intelligence systems that can create new content rather than just analyzing existing data.
Traditional AI Classifies emails as spam or not spam
Generative AI Can write a completely new email for you.
It analyzes, processes data based on human language and hence they are called as language models. They use similar mechanism as synaptic response of a human brain to do so.
Three pillars that made it possible -
Algorithms Transformer architecture (2017) revolutionized processing of long text passages
Neural Networks
Transformers - Establishes relationships between words of text which is critical for analyzing languages
Data explosion Explosion of digital data (websites, code repositories, and other text) provided raw material for training these systems
Computing power Massive increases in computational power (chips like GPUs) made it possible to train these models on all the data
GPUs, TPUs made the processing fasters.
Clustering different infrastructure together generated immense computational power never possible before.
Scaling laws
As more data and computational power increases researchers found that model performs better, so (data, computation) is directly proportional to model intelligence.
How it works
Pre-training Models analyze billions of text examples, learning to predict what comes next
Fine-tuning Models are refined to follow instructions, be helpful, and avoid harmful content by using re-enforced learning of rewards and penalty.
Deployment Users provide prompts, and the model generates responses based on the prompts and patterns it learned during training
AI Context Window - A practical limit to how much information a LLM can consider at once -> it consists of your prompts, AI responses and any other info shared. Claude currently has 1M context.
Key capabilities
Versatile language skills
General purpose abilities
Learning from example
Connecting to external tools and data
Current limitations
Knowledge cutoff date
Potential inaccuracies ( hallucinations" )
Context window constraint
Challenges with complex reasoning and math