Generative AI Lab Viva Questions and Answers
1. What is Generative AI?
Answer:
Generative AI is a branch of Artificial Intelligence that can create new
content such as text, images, audio, videos, and code. Unlike traditional AI
systems that only classify or predict, Generative AI learns patterns from large
datasets and generates new outputs similar to the training data.
Examples:
- ChatGPT generates text.
- DALL-E generates images.
- GitHub Copilot generates code.
Applications:
- Chatbots
- Content generation
- Image creation
- Healthcare assistance
- Education
2. What are Word Embeddings?
Answer:
Word embeddings are numerical vector representations of words that capture
semantic meaning and relationships between words.
Instead of representing words as simple IDs, embeddings convert words into
vectors of numbers.
Example:
King → [0.25, 0.67, 0.91, ...]
Queen → [0.28, 0.69, 0.89, ...]
Since "King" and "Queen" have similar meanings, their vectors are close in
vector space.
3. Why are Word Embeddings Important?
Answer:
Word embeddings help machines understand the meaning and context of words.
Benefits:
- Capture semantic relationships
- Reduce dimensionality compared with one-hot encoding
- Improve NLP performance
- Enable similarity search
Applications:
- Chatbots
- Sentiment analysis
- Text summarization
- Machine translation
4. What is Word2Vec?
Answer:
Word2Vec is a neural network-based technique that converts words into dense
vector representations.
It learns word meanings from the context in which words appear.
For example:
doctor → hospital
teacher → school
Word2Vec learns these relationships automatically.
5. What are the architectures of Word2Vec?
Answer:
1. CBOW (Continuous Bag of Words)
Predicts the target word using surrounding words.
Example:
I love _____ learning
Predicts: machine
2. Skip-Gram
Predicts surrounding words using a target word.
Example:
Input: machine
Predicts: learning, artificial, intelligence
Difference:
| CBOW | Skip-Gram |
| Faster to train | Slower, but represents rare words better |
| Better for large datasets | Better for small datasets |
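To make the difference concrete, the sketch below builds the training examples each architecture would see for one sentence. The function names and the window size of 2 are illustrative assumptions; this only prepares the data and does not train a model.

```python
def cbow_examples(tokens, window=2):
    """Return (context_words, target_word) pairs: the context predicts the target."""
    examples = []
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        examples.append((left + right, target))
    return examples


def skipgram_examples(tokens, window=2):
    """Return (target_word, context_word) pairs: the target predicts each neighbour."""
    pairs = []
    for context, target in cbow_examples(tokens, window):
        for word in context:
            pairs.append((target, word))
    return pairs


sentence = "i love machine learning".split()
print(cbow_examples(sentence)[2])       # context and target for "machine"
print(skipgram_examples(sentence)[:2])  # first two (target, context) pairs
```

CBOW gets one example per position, while Skip-Gram gets one example per (target, neighbour) pair, which is one reason Skip-Gram is slower to train.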
6. What is Semantic Similarity?
Answer:
Semantic similarity measures how close two words are in meaning.
Examples:
Doctor ↔ Nurse
King ↔ Queen
Car ↔ Vehicle
Word embeddings capture these similarities mathematically.
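A common way to measure this is cosine similarity. The sketch below uses made-up 3-dimensional vectors purely for illustration; real embeddings are learned from data and have many more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up vectors, for illustration only.
vectors = {
    "doctor": [0.9, 0.8, 0.1],
    "nurse": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

print(cosine_similarity(vectors["doctor"], vectors["nurse"]))  # high
print(cosine_similarity(vectors["doctor"], vectors["car"]))    # much lower
```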
7. Explain Vector Arithmetic in Word Embeddings.
Answer:
Word vectors support mathematical operations.
Famous example:
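King - Man + Woman ≈ Queen
The result of the arithmetic is a new vector, and "Queen" is the word whose
vector lies closest to it. This demonstrates that embeddings capture semantic
relationships.
Another example:
Paris - France + Italy ≈ Rome
This capability is useful in NLP applications such as analogy solving.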
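The analogy can be sketched with hand-made 2-dimensional vectors. The values below are assumptions chosen so that the first number loosely stands for "royalty" and the second for "gender"; a trained model learns such directions on its own.

```python
import math

# Hand-made 2-D vectors (royalty, gender); real embeddings are learned, not designed.
toy = {
    "king": [0.9, 0.8],
    "queen": [0.9, -0.8],
    "man": [0.1, 0.8],
    "woman": [0.1, -0.8],
    "apple": [-0.7, 0.0],
}


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def analogy(a, b, c, vectors):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding a, b and c."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


print(analogy("king", "man", "woman", toy))  # prints "queen"
```

The input words are excluded from the candidates because the computed vector is often still closest to one of them.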
8. What is Dimensionality Reduction?
Answer:
Word embeddings may contain 50, 100, or 300 dimensions.
Humans cannot visualize high-dimensional data.
Dimensionality reduction converts high-dimensional vectors into 2D or 3D while
preserving important information.
Methods:
- PCA
- t-SNE
9. What is PCA?
Answer:
PCA (Principal Component Analysis) is a dimensionality reduction technique.
It finds orthogonal directions, called principal components, along which the
data has maximum variance, and projects the data onto the first few of them.
Advantages:
- Fast
- Easy to interpret
- Useful for visualization
Applications:
- Data compression
- Feature extraction
- Embedding visualization
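The idea can be sketched without any library. The code below approximates the first principal component by power iteration on the covariance matrix; the function names and the sample points are assumptions for illustration, and in practice a library implementation of PCA would normally be used instead.

```python
import math


def first_principal_component(rows, iterations=100):
    """Approximate the direction of maximum variance by power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d  # arbitrary start; assumed not orthogonal to the answer
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v, centered


def project(centered, direction):
    """1-D coordinate of each centered point along the given direction."""
    return [sum(x * y for x, y in zip(row, direction)) for row in centered]


points = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.0]]
pc1, centered = first_principal_component(points)
print(pc1)                     # unit vector pointing roughly along y = 2x
print(project(centered, pc1))  # the same points reduced from 2-D to 1-D
```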
10. What is t-SNE?
Answer:
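t-SNE (t-Distributed Stochastic Neighbor Embedding) is a non-linear
dimensionality reduction algorithm used mainly for visualization.
It preserves local relationships: points that are neighbors in the original
space stay close together in the 2D or 3D plot.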
Advantages:
- Better cluster visualization
- Excellent for word embeddings
Limitation:
- Computationally expensive
11. Why do similar words form clusters?
Answer:
Words appearing in similar contexts obtain similar vector representations.
Example:
Programming Cluster:
- software
- programming
- algorithm
Networking Cluster:
- internet
- network
Data Cluster:
- database
- data
These clusters show semantic relationships.
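The link between shared context and similar vectors can be shown with simple co-occurrence counts. This is a count-based stand-in, not Word2Vec itself, and the three-sentence corpus is an assumption made up for the example.

```python
import math
from collections import Counter, defaultdict

corpus = [
    "the doctor treats patients in the hospital",
    "the nurse treats patients in the hospital",
    "the router sends packets over the network",
]


def context_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm


vecs = context_vectors(corpus)
print(cosine(vecs["doctor"], vecs["nurse"]))   # same neighbours, so close to 1
print(cosine(vecs["doctor"], vecs["router"]))  # only "the" in common, so lower
```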
12. What is a Domain-Specific Corpus?
Answer:
A domain-specific corpus is a collection of text related to a specific field.
Examples:
Medical Corpus
- diabetes
- insulin
- glucose
Legal Corpus
- lawyer
- court
- judge
Financial Corpus
- stock
- investment
- profit
13. Why train a custom Word2Vec model?
Answer:
General models may not understand specialized terminology.
Training on a domain-specific corpus helps learn:
- Technical vocabulary
- Domain relationships
- Industry-specific meanings
Example:
Medical corpus:
diabetes ↔ insulin
glucose ↔ blood sugar
14. What is Prompt Engineering?
Answer:
Prompt engineering is the process of designing effective prompts to obtain
better responses from AI models.
Example:
Simple Prompt:
Explain AI.
Better Prompt:
Explain AI in 5 points with examples suitable for engineering students.
The second prompt specifies the format, the level of detail, and the audience,
so it usually produces a more useful answer.
15. What is Prompt Enrichment?
Answer:
Prompt enrichment means adding related words and context to improve AI
responses.
Example:
Original Prompt:
Tell me about education.
Enriched Prompt:
Tell me about education, learning, teaching, knowledge, academic growth and
skill development.
The enriched prompt tends to produce more detailed output because it names the
aspects the answer should cover.
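A minimal sketch of enrichment follows. The lookup table and the `enrich_prompt` function are hypothetical; in a lab setting the related words could instead come from the nearest neighbours of a trained embedding model.

```python
# Hypothetical hand-written lookup table of related words.
RELATED_WORDS = {
    "education": ["learning", "teaching", "knowledge",
                  "academic growth", "skill development"],
}


def enrich_prompt(prompt, keyword, related=RELATED_WORDS):
    """Append the words related to `keyword` to the prompt, if any are known."""
    extras = related.get(keyword, [])
    if not extras:
        return prompt  # nothing known about this keyword: leave the prompt as is
    body = prompt.rstrip(".")
    return f"{body}, {', '.join(extras)}."


print(enrich_prompt("Tell me about education.", "education"))
```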
16. What is Hugging Face?
Answer:
Hugging Face is a company and open-source platform that hosts pre-trained AI
models and provides libraries such as transformers for using them in NLP tasks.
Tasks include:
- Sentiment analysis
- Translation
- Summarization
- Question answering
- Text generation
17. What is a Pipeline in Hugging Face?
Answer:
A pipeline is a simple API that allows users to perform NLP tasks with very
little code.
Example:
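from transformers import pipeline

# Uses a default sentiment model because no model name is given
classifier = pipeline("sentiment-analysis")
print(classifier("I love this movie."))

The pipeline loads the model and tokenizer automatically.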
18. What is Sentiment Analysis?
Answer:
Sentiment analysis identifies the opinion or emotional tone expressed in text.
Outputs:
- Positive
- Negative
- Neutral
Example:
I love this movie. → Positive
This product is terrible. → Negative
19. What are real-world applications of Sentiment Analysis?
Answer:
- Product reviews
- Customer feedback
- Social media monitoring
- Brand reputation analysis
- Survey analysis
Companies use it to understand customer satisfaction.
20. What is Text Summarization?
Answer:
Text summarization converts long documents into shorter versions while
preserving important information.
Example:
A 5-page article can be summarized into a few paragraphs.
Types:
- Extractive
- Abstractive
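The extractive type can be sketched in a few lines: score each sentence by how frequent its words are in the whole text and keep the best ones. The scoring rule and function name are assumptions for illustration (no stop-word removal, for example). Abstractive summarization needs a generative model and is not shown here.

```python
import re
from collections import Counter


def extractive_summary(text, max_sentences=2):
    """Keep the sentences whose words are most frequent in the whole text."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        words = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[w] for w in words) / max(len(words), 1)

    top = sorted(sentences, key=score, reverse=True)[:max_sentences]
    # Restore the original order so the summary reads naturally.
    return " ".join(s for s in sentences if s in top)


article = ("Embeddings map words to vectors. "
           "Vectors of similar words are close. "
           "I had tea.")
print(extractive_summary(article))
```

Every sentence in the output is copied from the input unchanged, which is what distinguishes extractive from abstractive summarization.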
21. What is LangChain?
Answer:
LangChain is a framework for developing applications using Large Language
Models (LLMs).
Features:
- Prompt templates
- Chains
- Memory
- Document loaders
- Vector databases
22. What is a Prompt Template?
Answer:
A Prompt Template defines a fixed structure for model input.
Example:
template = """
Summarize:
{text}
"""
Benefits:
- Consistency
- Reusability
- Better outputs
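The sketch below fills the template with plain Python string formatting to show the idea. It does not use LangChain's own prompt template classes, and the `build_prompt` helper is a hypothetical name.

```python
template = """
Summarize:
{text}
"""


def build_prompt(text):
    """Fill the {text} placeholder; the surrounding structure never changes."""
    return template.format(text=text).strip()


print(build_prompt("LangChain is a framework for LLM applications."))
```

Because only the placeholder changes, every request sent to the model has the same structure.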
23. What is Pydantic?
Answer:
Pydantic is a Python library used for data validation and structured output
generation.
It helps define schemas as Python classes.
Example:
from pydantic import BaseModel

class Student(BaseModel):
    name: str
    age: int
24. What is FAISS?
Answer:
FAISS (Facebook AI Similarity Search) is a library used for efficient
similarity search among embeddings.
Functions:
- Store vectors
- Retrieve similar vectors
- Fast searching
Used in:
- Chatbots
- Recommendation systems
- Search engines
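What a similarity index does can be sketched with a brute-force search in plain Python. The `FlatIndex` class below is a toy stand-in and not the FAISS API; FAISS performs the same kind of lookup far more efficiently and also offers approximate indexes for very large collections.

```python
import math


class FlatIndex:
    """Toy exact-search index: stores vectors and scans all of them per query."""

    def __init__(self):
        self.vectors = []

    def add(self, vector):
        self.vectors.append(list(vector))

    def search(self, query, k=1):
        """Return the k closest (distance, position) pairs by Euclidean distance."""
        scored = [(math.dist(query, v), i) for i, v in enumerate(self.vectors)]
        return sorted(scored)[:k]


index = FlatIndex()
for vec in ([0.0, 0.0], [1.0, 1.0], [5.0, 5.0]):
    index.add(vec)

# The stored vector at position 1 is nearest to the query, then position 0.
print(index.search([0.9, 1.2], k=2))
```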
25. Explain the Working of the IPC Chatbot.
Answer:
Step 1: Load the IPC PDF document.
Step 2: Split the document into chunks.
Step 3: Generate embeddings using a Hugging Face model.
Step 4: Store the embeddings in FAISS.
Step 5: The user asks a question.
Step 6: The question is converted into an embedding.
Step 7: Similarity search retrieves the relevant IPC sections.
Step 8: The relevant answer is displayed.
This process combines:
- PDF processing
- Embeddings
- FAISS
- Retrieval-based chatbot architecture
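The steps above can be sketched end to end in plain Python. Everything here is a stand-in chosen so the example runs without extra libraries: placeholder text replaces the PDF, word counts replace the Hugging Face embeddings, and a Python list replaces FAISS. The section names and function names are hypothetical.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, size=12):
    """Step 2: split the document into chunks of `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Steps 3 and 6: stand-in embedding (word counts), not a Hugging Face model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def answer(question, store):
    """Steps 5 to 8: embed the question and return the most similar chunk."""
    q = embed(question)
    return max(store, key=lambda item: cosine(q, item[1]))[0]


# Step 1 stand-in: placeholder text instead of a loaded PDF.
document = ("Section A describes the penalty for theft of property. "
            "Section B describes the penalty for forgery of documents.")
chunks = split_into_chunks(document, size=9)
store = [(chunk, embed(chunk)) for chunk in chunks]  # Step 4 stand-in for FAISS

print(answer("What is the penalty for forgery?", store))
```

The chatbot returns the retrieved chunk itself, which matches the retrieval-based design described above; no text is generated.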