Generative AI Fundamentals

Understanding Generative AI

What is Generative AI? Generative AI refers to artificial intelligence systems that can create new content rather than just analyzing existing data.

Traditional AI - Classifies emails as spam or not spam.
Generative AI - Can write a completely new email for you.
- These systems analyze and generate human language, which is why they are called language models. They are built on artificial neural networks, which are loosely inspired by the way neurons in the human brain pass signals across synapses.

Three pillars that made it possible -

Algorithms - The Transformer architecture (2017) revolutionized the processing of long text passages.
- Neural networks
- Transformers - Establish relationships between the words in a text, which is critical for analyzing language.
Data explosion - The growth of digital data (websites, code repositories, and other text) provided the raw material for training these systems.
Computing power - Massive increases in computational power (chips such as GPUs) made it possible to train these models on all that data.
- GPUs and TPUs made the processing faster.
- Clustering many machines together produced computational power on a scale that was not possible before.
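
To make the Transformers point above more concrete, here is a minimal sketch of scaled dot-product attention, the operation a Transformer uses to relate each word to every other word. It is a simplification under stated assumptions: the 2-dimensional `embeddings` are hypothetical, and the same vectors are used as queries, keys, and values, whereas real models apply learned projections first.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    dim = len(keys[0])
    outputs = []
    for q in queries:
        # Score how strongly this word relates to every word in the text.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # Blend the value vectors according to those weights.
        mixed = [sum(w * v[i] for w, v in zip(weights, values))
                 for i in range(len(values[0]))]
        outputs.append(mixed)
    return outputs

# Hypothetical 2-dimensional embeddings for a three-word text (illustration only).
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
contextual = attention(embeddings, embeddings, embeddings)
```

Each vector in `contextual` is a weighted mix of all three input vectors, so every word's representation now carries information about the words around it.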

Scaling laws

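As training data and computational power increased (along with model size), researchers found that models performed better in a predictable way. Performance improves smoothly as data and computation scale up, roughly following a power law with diminishing returns rather than a strictly proportional relationship.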
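
As a rough illustration of what such a scaling relationship can look like, the sketch below models loss as a power law in training compute. The functional form is a common way to describe scaling behavior, but the function name `predicted_loss` and the constants `a` and `b` are made up for this example and are not fitted to any real model.

```python
def predicted_loss(compute, a=10.0, b=0.05):
    """Illustrative power law: loss falls as compute grows.

    a and b are hypothetical constants chosen only for demonstration.
    """
    return a * compute ** -b

# Each 100x increase in compute lowers the loss by the same ratio,
# so the absolute improvement shrinks as compute grows.
for compute in (1e18, 1e20, 1e22):
    print(f"compute={compute:.0e}  loss={predicted_loss(compute):.3f}")
```

Lower loss stands in for "better model" here; the point is the shape of the curve, not the specific numbers.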

How it works

Pre-training - Models analyze billions of text examples, learning to predict what comes next.
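
The idea of learning to predict what comes next can be sketched with a toy bigram model that simply counts which word follows which. This is a deliberately tiny stand-in, not how large language models are actually trained: the names `train_bigram` and `predict_next` and the one-sentence `corpus` are assumptions made for illustration.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def predict_next(follows, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Hypothetical training text (illustration only).
corpus = "the cat sat on the mat and the cat slept"
model = train_bigram(corpus)
print(predict_next(model, "the"))
```

In this corpus "the" is followed by "cat" twice and "mat" once, so the model predicts "cat". A real model replaces the count table with a neural network and the single sentence with billions of examples.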

Fine-tuning - Models are refined to follow instructions, be helpful, and avoid harmful content, using reinforcement learning that rewards desirable responses and penalizes undesirable ones.
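
The reward-and-penalty idea can be sketched with a toy policy that chooses between two candidate responses. The code below takes gradient steps on expected reward, in the spirit of policy-gradient methods; real fine-tuning pipelines are far more involved. The candidate strings, reward values, learning rate, and the function name `reinforce_step` are all hypothetical.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def reinforce_step(logits, rewards, learning_rate=0.5):
    """One gradient-ascent step on expected reward for a softmax policy."""
    probs = softmax(logits)
    # Average reward under the current policy.
    baseline = sum(p * r for p, r in zip(probs, rewards))
    # The gradient of expected reward w.r.t. logit i is p_i * (r_i - baseline).
    return [l + learning_rate * p * (r - baseline)
            for l, p, r in zip(logits, probs, rewards)]

candidates = ["helpful answer", "harmful answer"]  # hypothetical responses
rewards = [1.0, -1.0]                              # hypothetical reward and penalty
logits = [0.0, 0.0]                                # start with no preference
for _ in range(20):
    logits = reinforce_step(logits, rewards)
probabilities = softmax(logits)
```

After the updates, `probabilities` strongly favors the rewarded response, which is the behavior the fine-tuning stage aims for.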

Deployment - Users provide prompts, and the model generates responses based on those prompts and the patterns it learned during training.

AI Context Window - A practical limit on how much information an LLM can consider at once. It includes your prompts, the AI's responses, and any other information shared in the conversation. Context window sizes vary by model; some Claude models support up to 1M tokens.
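
One practical consequence of the context window is that older parts of a conversation may have to be dropped. The sketch below keeps only the most recent messages that fit a token budget. It rests on loud assumptions: `count_tokens` treats each whitespace-separated word as one token, which real tokenizers do not, and `fit_to_context` and the sample `history` are hypothetical.

```python
def count_tokens(text):
    """Rough stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())

def fit_to_context(messages, max_tokens):
    """Keep the most recent messages whose combined token count fits the budget."""
    kept = []
    used = 0
    for message in reversed(messages):  # walk from newest to oldest
        cost = count_tokens(message)
        if used + cost > max_tokens:
            break  # this message and everything older is dropped
        kept.append(message)
        used += cost
    return list(reversed(kept))  # restore chronological order

# Hypothetical conversation history (illustration only).
history = [
    "user: summarize this long report for me",
    "assistant: here is a short summary",
    "user: now translate it to French",
]
visible = fit_to_context(history, max_tokens=12)
```

With a budget of 12 word-tokens, only the last two messages fit, so the model would no longer see the original request.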

Key capabilities

Versatile language skills
General purpose abilities
Learning from example
Connecting to external tools and data

Current limitations

Knowledge cutoff date
Potential inaccuracies ("hallucinations")
Context window constraint
Challenges with complex reasoning and math

4D Framework

Capabilities and Limitations