
Retrieval-Augmented Generation (RAG) - Pre-Quiz

Time: 15 minutes
Questions: 10
Passing Score: 70%
Purpose: Assess your baseline knowledge before learning about RAG systems


Question 1 (Easy)

What does RAG stand for?

A) Random Access Generation
B) Retrieval-Augmented Generation ✓
C) Recursive Automated Generation
D) Rapid AI Gateway

Explanation

Answer: B) Retrieval-Augmented Generation

RAG is a technique that combines information retrieval with language generation. It retrieves relevant documents/passages and uses them to augment the LLM’s response, grounding answers in factual sources.

Reference: Phase 8 - Introduction to RAG


Question 2 (Easy)

What is the main problem that RAG solves?

A) Slow inference speed
B) LLM hallucinations and outdated knowledge ✓
C) High training costs
D) Model size limitations

Explanation

Answer: B) LLM hallucinations and outdated knowledge

RAG addresses:

  • Hallucinations: LLMs making up facts → RAG grounds responses in retrieved documents
  • Outdated knowledge: LLMs trained on old data → RAG accesses current information
  • Domain-specific needs: LLMs lack specialized knowledge → RAG retrieves from custom knowledge bases

Reference: Phase 8 - Why RAG?


Question 3 (Medium)

What are the three main components of a RAG system?

A) Train, Test, Deploy
B) Indexing, Retrieval, Generation ✓
C) Input, Process, Output
D) Encode, Store, Decode

Explanation

Answer: B) Indexing, Retrieval, Generation

RAG Pipeline:

  1. Indexing (Offline):

    • Split documents into chunks
    • Create embeddings
    • Store in vector database
  2. Retrieval (Query time):

    • Embed user query
    • Find similar chunks
    • Retrieve top-k relevant passages
  3. Generation (Query time):

    • Combine query + retrieved docs
    • Send to LLM
    • Generate grounded response
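
The three stages can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a production design: `embed` is a hypothetical stand-in that counts words instead of calling a real embedding model, the "vector database" is a plain list, and the LLM call is left out, so the sketch stops at building the prompt.

```python
import math
from collections import Counter

def embed(text):
    """Hypothetical stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# 1. Indexing (offline): embed each chunk and store it.
chunks = [
    "RAG retrieves relevant passages before generating an answer",
    "Fine-tuning updates the weights of a model",
    "Vector databases store embeddings for similarity search",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

def retrieve(query, k=2):
    """2. Retrieval (query time): return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

def build_prompt(query, passages):
    """3. Generation (query time): combine query and passages into the LLM prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

query = "How does RAG generate an answer"
top = retrieve(query, k=1)
prompt = build_prompt(query, top)
print(prompt)
```

In a real system the prompt would then be sent to an LLM, and `embed` would be replaced by a learned embedding model so that matches depend on meaning rather than shared words.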

Reference: Phase 8 - RAG Architecture


Question 4 (Medium)

What is a vector embedding?

A) A compressed file format
B) A numerical representation of text in high-dimensional space ✓
C) A type of database index
D) A machine learning model

Explanation

Answer: B) A numerical representation of text in high-dimensional space

Vector Embedding:

  • Converts text → array of numbers (e.g., 384, 768, or 1536 dimensions)
  • Similar meaning → similar vectors (close in vector space)
  • Enables semantic search (beyond keyword matching)

Example:

"dog" → [0.12, -0.34, 0.56, ..., 0.78] # 384 dims
"puppy" → [0.15, -0.31, 0.54, ..., 0.81] # Similar!
"car" → [-0.45, 0.67, -0.23, ..., -0.12] # Different

Reference: Phase 8 - Embeddings & Vector Search


Question 5 (Medium)

What is the purpose of “chunking” documents in RAG?

A) To compress file size
B) To break large documents into smaller, retrievable pieces ✓
C) To encrypt data
D) To format text for display

Explanation

Answer: B) To break large documents into smaller, retrievable pieces

Why chunk?

  • LLMs have token limits (can’t process entire books)
  • Retrieve only relevant sections (not entire documents)
  • Better precision (specific paragraphs vs. whole pages)

Chunking strategies:

  • Fixed size: 500 tokens per chunk
  • Sentence-based: Complete sentences only
  • Semantic: Chunks by topic/section
  • Overlap: Adjacent chunks share 50-100 tokens, so content at a chunk boundary is not cut off from its context
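
A minimal sketch of fixed-size chunking with overlap, assuming the document has already been split into tokens. The function name `chunk_tokens` and its parameters are illustrative, and the placeholder token list stands in for the output of a real tokenizer.

```python
def chunk_tokens(tokens, chunk_size=500, overlap=50):
    """Split a token list into fixed-size chunks whose edges overlap."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this chunk already reaches the end of the document
    return chunks

# Placeholder tokens; a real pipeline would use the model's tokenizer.
tokens = [f"t{i}" for i in range(1200)]
chunks = chunk_tokens(tokens, chunk_size=500, overlap=50)
print([len(c) for c in chunks])
```

With these settings the last 50 tokens of each chunk reappear as the first 50 tokens of the next one.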

Reference: Phase 8 - Document Processing


Question 6 (Hard)

Which similarity metric is commonly used to find relevant chunks?

A) Euclidean distance
B) Manhattan distance
C) Cosine similarity ✓
D) Hamming distance

Explanation

Answer: C) Cosine similarity

Cosine Similarity:

similarity = (A · B) / (||A|| * ||B||)

Range: -1 to +1

  • +1: Identical direction
  • 0: Orthogonal (unrelated)
  • -1: Opposite direction

Why cosine?

  • Measures angle, not magnitude
  • Works well for high-dimensional text embeddings
  • Efficient to compute

Alternatives:

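  • Dot product: Faster, but sensitive to vector magnitude (gives the same ranking as cosine similarity when vectors are normalized to unit length)
  • Euclidean distance: Works too, but is less common for text embeddings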
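
The metrics can be compared directly on small vectors. This is a plain-Python sketch with made-up 3-dimensional vectors; real embeddings have hundreds of dimensions and are usually handled with a numerical library.

```python
import math

def dot(a, b):
    """Dot product: grows with both alignment and vector length."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Dot product divided by both lengths, so only the angle matters."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def euclidean_distance(a, b):
    """Straight-line distance: 0 means identical, larger means further apart."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]     # same direction as a, twice the length
c = [-1.0, -2.0, -3.0]  # opposite direction to a

print(cosine_similarity(a, b))   # close to +1: same direction
print(cosine_similarity(a, c))   # close to -1: opposite direction
print(dot(a, a), dot(a, b))      # dot product changes with magnitude
print(euclidean_distance(a, b))  # non-zero even though the direction is identical
```

Here `a` and `b` point the same way, so cosine similarity treats them as a perfect match, while the dot product and Euclidean distance both react to the difference in length.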

Reference: Phase 8 - Similarity Search


Question 7 (Medium)

What is a vector database?

A) A SQL database with vectors
B) A specialized database optimized for similarity search ✓
C) A NoSQL document store
D) A graph database

Explanation

Answer: B) A specialized database optimized for similarity search

Vector Databases:

  • Store high-dimensional embeddings efficiently
  • Perform fast approximate nearest neighbor (ANN) search
  • Scale to millions/billions of vectors

Popular vector DBs:

  • Pinecone (managed)
  • Weaviate (open-source)
  • Qdrant (open-source)
  • Chroma (lightweight)
  • FAISS (similarity-search library from Facebook AI, not a full database)

Traditional databases are not designed to answer “find the most similar vectors” queries efficiently at scale.
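
To make the idea concrete, here is a hypothetical in-memory store, `TinyVectorStore`, that performs an exact brute-force scan. It is not the API of any product listed above. It compares the query against every stored vector, which is the linear cost that ANN indexes in real vector databases are built to avoid. The sketch assumes non-zero vectors.

```python
import math

class TinyVectorStore:
    """Hypothetical in-memory store that does an exact, brute-force scan."""

    def __init__(self):
        self._ids = []
        self._vectors = []

    def add(self, item_id, vector):
        # Normalize once at insert time so search only needs a dot product.
        norm = math.sqrt(sum(x * x for x in vector))
        self._ids.append(item_id)
        self._vectors.append([x / norm for x in vector])

    def search(self, query, top_k=1):
        norm = math.sqrt(sum(x * x for x in query))
        q = [x / norm for x in query]
        # Every stored vector is compared, so cost grows linearly with store size.
        scored = [
            (sum(a * b for a, b in zip(q, v)), item_id)
            for item_id, v in zip(self._ids, self._vectors)
        ]
        scored.sort(reverse=True)
        return [(item_id, score) for score, item_id in scored[:top_k]]

store = TinyVectorStore()
store.add("doc-a", [1.0, 0.0])
store.add("doc-b", [0.0, 1.0])
store.add("doc-c", [0.7, 0.7])
results = store.search([1.0, 0.1], top_k=2)
print(results)
```

A scan like this is fine for a few thousand vectors. At millions of vectors, approximate indexes trade a small amount of accuracy for much faster lookups.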

Reference: Phase 7 - Vector Databases


Question 8 (Hard)

In a RAG system, what is the “context window”?

A) The time limit for queries
B) The maximum tokens the LLM can process ✓
C) The number of documents indexed
D) The embedding dimension

Explanation

Answer: B) The maximum tokens the LLM can process

Context Window:

  • Maximum number of tokens the LLM can handle in one request
  • Includes: system prompt + retrieved docs + user query + response

Example (GPT-3.5):

  • Context window: 4,096 tokens
  • System prompt: 500 tokens
  • Retrieved docs: 2,000 tokens
  • User query: 100 tokens
  • Available for response: 1,496 tokens

Modern models:

  • GPT-4: 8K-128K tokens
  • Claude 3: 200K tokens
  • Gemini 1.5: 1M tokens

RAG consideration: Must fit retrieved chunks within context window!
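
The budget arithmetic can be sketched as follows. The helper names `response_budget` and `fit_chunks` are illustrative, and the chunk sizes passed to `fit_chunks` are made-up values; the first call reuses the GPT-3.5 numbers from the example.

```python
def response_budget(context_window, system_prompt, retrieved_docs, user_query):
    """Tokens left for the model's response after the prompt is accounted for."""
    return context_window - (system_prompt + retrieved_docs + user_query)

def fit_chunks(chunk_token_counts, budget):
    """Keep chunks in ranked order until the next one would exceed the budget."""
    kept, used = [], 0
    for count in chunk_token_counts:
        if used + count > budget:
            break
        kept.append(count)
        used += count
    return kept

print(response_budget(4096, 500, 2000, 100))          # 4096 - 2600 = 1496
print(fit_chunks([800, 700, 600, 500], budget=2000))  # [800, 700]
```

Stopping at the first chunk that does not fit keeps the highest-ranked chunks together. Skipping it and trying smaller, lower-ranked chunks is another reasonable policy.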

Reference: Phase 8 - LLM Context Limits


Question 9 (Medium)

What does “k” represent in “top-k retrieval”?

A) The number of keywords
B) The number of documents to retrieve ✓
C) The embedding dimension
D) The chunk size

Explanation

Answer: B) The number of documents to retrieve

Top-k Retrieval:

  • Retrieve the k most similar chunks
  • Common values: k = 3, 5, or 10

Example:

# Query: "What is RAG?"
# Vector DB contains 10,000 chunks
# top_k = 5
results = vector_db.search(query_embedding, top_k=5)
# Returns 5 most relevant chunks

Trade-offs:

  • Small k (3): Fast, focused, but might miss context
  • Large k (10): More context, but slower and noisier
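
The trade-off shows up even on a handful of scores. In this sketch the chunk ids and similarity scores are invented for illustration, and `top_k` is a hypothetical helper built on the standard library's `heapq.nlargest`.

```python
import heapq

# Hypothetical similarity scores for six chunks (illustrative values only).
scores = {
    "chunk-1": 0.91,
    "chunk-2": 0.88,
    "chunk-3": 0.52,
    "chunk-4": 0.35,
    "chunk-5": 0.12,
    "chunk-6": 0.05,
}

def top_k(scores, k):
    """Return the k (chunk_id, score) pairs with the highest scores, best first."""
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])

for k in (2, 4):
    picked = top_k(scores, k)
    weakest = picked[-1][1]
    print(f"k={k}: {[cid for cid, _ in picked]} (weakest score {weakest})")
```

Raising k from 2 to 4 pulls in chunks with much lower scores, which is where the extra noise comes from.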

Reference: Phase 8 - Retrieval Strategies


Question 10 (Hard)

What is the main difference between RAG and fine-tuning?

A) RAG is faster to deploy
B) RAG retrieves external knowledge; fine-tuning updates model weights ✓
C) RAG is more accurate
D) Fine-tuning doesn’t require data

Explanation

Answer: B) RAG retrieves external knowledge; fine-tuning updates model weights

| Aspect | RAG | Fine-Tuning |
| --- | --- | --- |
| Knowledge source | External documents (retrieved) | Model parameters (learned) |
| Updates | Add/remove documents anytime | Retrain model (expensive) |
| Use case | Factual Q&A, current data | Domain adaptation, style |
| Cost | Low (no retraining) | High (GPU hours) |
| Transparency | Can cite sources | Black box |

When to use:

  • RAG: Frequently changing data, need citations
  • Fine-tuning: Specialized behavior, consistent style
  • Both: Often combined for best results!

Reference: Phase 8 - RAG vs Fine-Tuning


Self-Check Guide

0-3 correct: RAG is an advanced topic. Make sure you’ve completed:

  • Phase 6 (Neural Networks)
  • Phase 11 (Prompt Engineering & LangChain)

4-5 correct: You have some relevant background. Review vector embeddings and LLM basics before starting.

6-7 correct: Good foundation for learning RAG with focused effort.

8-9 correct: Strong conceptual understanding; this phase should push implementation.

10 correct: Excellent starting point, with room to sharpen production practices and advanced techniques.


Prerequisites Checklist

Before starting Phase 8, ensure you understand:

  • ✅ How LLMs work (Phase 11)
  • ✅ Vector embeddings (Phase 5)
  • ✅ Python and APIs
  • ✅ Basic prompt engineering
  • ✅ JSON and data structures

Next Steps

After this pre-quiz:

  1. If the quiz felt difficult: Review prerequisites first

    • Phase 11: Prompt Engineering
    • Phase 7: Vector Databases intro
    • LLM fundamentals
  2. If you felt partly comfortable: Start Phase 8 but take it slow

    • Revisit embedding concepts
    • Practice with simple examples
    • Ask questions in the community
  3. If most of it felt comfortable: Dive into Phase 8

    • Build a RAG system
    • Complete the assignment
    • Try advanced challenges

Remember: RAG is complex but incredibly powerful. Take your time and build incrementally! 🚀📚
