
BAI657C Generative AI Viva questions and answers

 

Generative AI Lab Viva Questions and Answers

1. What is Generative AI?

Answer:
Generative AI is a branch of Artificial Intelligence that can create new content such as text, images, audio, videos, and code. Unlike traditional AI systems that only classify or predict, Generative AI learns patterns from large datasets and generates new outputs similar to the training data.

Examples:

  • ChatGPT generates text.
  • DALL-E generates images.
  • GitHub Copilot generates code.

Applications:

  • Chatbots
  • Content generation
  • Image creation
  • Healthcare assistance
  • Education

2. What are Word Embeddings?

Answer:
Word embeddings are numerical vector representations of words that capture semantic meaning and relationships between words.

Instead of representing words as simple IDs, embeddings convert words into vectors of numbers.

Example:

King → [0.25, 0.67, 0.91, ...]

Queen → [0.28, 0.69, 0.89, ...]

Since "King" and "Queen" have similar meanings, their vectors are close in vector space.
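How close two vectors are is commonly measured with cosine similarity. The sketch below uses the first three numbers of the example vectors above, plus a made-up vector for "banana" as an unrelated word; real embeddings have many more dimensions and come from a trained model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional vectors (illustrative values, not from a trained model).
king = [0.25, 0.67, 0.91]
queen = [0.28, 0.69, 0.89]
banana = [0.90, -0.40, 0.05]  # hypothetical unrelated word

sim_king_queen = cosine_similarity(king, queen)
sim_king_banana = cosine_similarity(king, banana)
print(sim_king_queen > sim_king_banana)  # True
```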

3. Why are Word Embeddings Important?

Answer:
Word embeddings help machines understand the meaning and context of words.

Benefits:

  • Capture semantic relationships
  • Reduce dimensionality compared with sparse one-hot encodings
  • Improve NLP performance
  • Enable similarity search

Applications:

  • Chatbots
  • Sentiment analysis
  • Text summarization
  • Machine translation

4. What is Word2Vec?

Answer:
Word2Vec is a neural network-based technique that converts words into dense vector representations.

It learns word meanings from the context in which words appear.

For example:

doctor → hospital

teacher → school

Word2Vec learns these relationships automatically.

5. What are the architectures of Word2Vec?

Answer:

1. CBOW (Continuous Bag of Words)

Predicts the target word using surrounding words.

Example:

I love _____ learning

Predicts:

machine

2. Skip-Gram

Predicts surrounding words using a target word.

Example:

Input:

machine

Predicts:

learning

artificial

intelligence

Difference:

CBOW                      | Skip-Gram
Faster                    | More accurate
Better for large datasets | Better for small datasets
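The two architectures differ mainly in how training examples are built from a sentence. The following sketch only generates those examples and does not train a network; the function names and the window size of 1 are assumptions made for illustration.

```python
def cbow_pairs(tokens, window=1):
    """(context words, target) pairs: CBOW predicts the target from its context."""
    pairs = []
    for i, target in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        pairs.append((context, target))
    return pairs

def skipgram_pairs(tokens, window=1):
    """(target, context word) pairs: Skip-Gram predicts each context word from the target."""
    pairs = []
    for i, target in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs

sentence = "i love machine learning".split()
cbow = cbow_pairs(sentence)
skip = skipgram_pairs(sentence)

print(cbow[2])  # (['love', 'learning'], 'machine')
print(skip)     # one (target, context) pair per neighbouring word
```

Skip-Gram produces one example per neighbouring word, so it yields more training examples from the same text than CBOW does.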

6. What is Semantic Similarity?

Answer:
Semantic similarity measures how close two words are in meaning.

Examples:

Doctor ↔ Nurse

King ↔ Queen

Car ↔ Vehicle

Word embeddings capture these similarities mathematically.

7. Explain Vector Arithmetic in Word Embeddings.

Answer:
Word vectors support mathematical operations.

Famous example:

King - Man + Woman ≈ Queen

The result is approximate: the vector closest to King - Man + Woman is the vector for Queen. This demonstrates that embeddings capture semantic relationships.

Another example:

Paris - France + Italy ≈ Rome

This capability is useful in NLP applications.
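A minimal sketch of the analogy computation, assuming hand-made 2-dimensional vectors in which the first number loosely stands for "royalty" and the second for "gender". These values are invented for illustration; a trained model learns such directions on its own and in many more dimensions.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def analogy(vectors, a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding a, b and c."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

# Hypothetical toy vectors: [royalty, gender]
toy_vectors = {
    "king":  [0.9, 0.8],
    "queen": [0.9, -0.8],
    "man":   [0.1, 0.8],
    "woman": [0.1, -0.8],
    "child": [0.1, 0.0],
}

result = analogy(toy_vectors, "king", "man", "woman")
print(result)  # queen
```

The input words are excluded from the candidates because the nearest vector to the result is otherwise often one of the inputs themselves.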

8. What is Dimensionality Reduction?

Answer:
Word embeddings may contain 50, 100, or 300 dimensions.

Humans cannot visualize high-dimensional data.

Dimensionality reduction converts high-dimensional vectors into 2D or 3D while preserving important information.

Methods:

  • PCA
  • t-SNE

9. What is PCA?

Answer:
PCA (Principal Component Analysis) is a dimensionality reduction technique.

It finds orthogonal directions, called principal components, along which the data has maximum variance, and projects the data onto the first few of them.

Advantages:

  • Fast
  • Easy to interpret
  • Useful for visualization

Applications:

  • Data compression
  • Feature extraction
  • Embedding visualization
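To make the idea of a principal component concrete, the sketch below estimates the first one for a handful of made-up 2-D points and projects the points onto it (2-D to 1-D). It uses power iteration on the covariance matrix as a simple stand-in for a full eigen-decomposition; in practice a library implementation of PCA would normally be used instead.

```python
import math

def center(points):
    """Subtract the mean from every 2-D point."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    return [(x - mean_x, y - mean_y) for x, y in points]

def first_principal_component(points, iterations=100):
    """Estimate the unit direction of maximum variance by power iteration."""
    centered = center(points)
    n = len(centered)
    # Entries of the 2x2 sample covariance matrix.
    cxx = sum(x * x for x, _ in centered) / (n - 1)
    cxy = sum(x * y for x, y in centered) / (n - 1)
    cyy = sum(y * y for _, y in centered) / (n - 1)

    v = (1.0, 0.0)  # arbitrary starting direction
    for _ in range(iterations):
        wx = cxx * v[0] + cxy * v[1]
        wy = cxy * v[0] + cyy * v[1]
        length = math.hypot(wx, wy)
        v = (wx / length, wy / length)
    return v

def project(points, direction):
    """Coordinates of the centered points along the given direction."""
    return [x * direction[0] + y * direction[1] for x, y in center(points)]

# Made-up points lying roughly along the line y = x.
points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.0)]
pc1 = first_principal_component(points)
projected = project(points, pc1)
print(pc1)        # roughly (0.72, 0.70), close to the diagonal direction
print(projected)  # one number per point instead of two
```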

10. What is t-SNE?

Answer:
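t-SNE (t-Distributed Stochastic Neighbor Embedding) is a non-linear dimensionality reduction technique used mainly for visualization.

It preserves local relationships: points that are close in the high-dimensional space stay close in the 2D or 3D plot.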

Advantages:

  • Better cluster visualization
  • Excellent for word embeddings

Limitation:

  • Computationally expensive

11. Why do similar words form clusters?

Answer:
Words appearing in similar contexts obtain similar vector representations.

Example:

Programming Cluster:

  • software
  • programming
  • algorithm

Networking Cluster:

  • internet
  • network

Data Cluster:

  • database
  • data

These clusters show semantic relationships.

12. What is a Domain-Specific Corpus?

Answer:
A collection of text related to a specific field.

Examples:

Medical Corpus

  • diabetes
  • insulin
  • glucose

Legal Corpus

  • lawyer
  • court
  • judge

Financial Corpus

  • stock
  • investment
  • profit

13. Why train a custom Word2Vec model?

Answer:
General models may not understand specialized terminology.

Training on a domain-specific corpus helps learn:

  • Technical vocabulary
  • Domain relationships
  • Industry-specific meanings

Example:

Medical corpus:

diabetes ↔ insulin

glucose ↔ blood sugar

14. What is Prompt Engineering?

Answer:
Prompt engineering is the process of designing effective prompts to obtain better responses from AI models.

Example:

Simple Prompt:

Explain AI.

Better Prompt:

Explain AI in 5 points with examples suitable for engineering students.

The second prompt specifies the length, format, and audience, so the output is more focused.

15. What is Prompt Enrichment?

Answer:
Prompt enrichment means adding related words and context to improve AI responses.

Example:

Original Prompt:

Tell me about education.

Enriched Prompt:

Tell me about education, learning, teaching, knowledge, academic growth and skill development.

The enriched prompt produces more detailed output.
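A minimal sketch of enrichment, assuming the related words are already available in a dictionary. Here they are hard-coded from the example above; they could instead come from the nearest neighbours of the keyword in a word-embedding model. The function name `enrich_prompt` is hypothetical.

```python
def enrich_prompt(prompt, keyword, related_terms, top_n=5):
    """Expand the first occurrence of `keyword` with up to `top_n` related terms."""
    extras = related_terms.get(keyword, [])[:top_n]
    if not extras:
        return prompt  # nothing known about this keyword, leave the prompt as is
    enriched_phrase = ", ".join([keyword] + extras)
    return prompt.replace(keyword, enriched_phrase, 1)

# Hard-coded related words (assumption: normally looked up from embeddings).
related_terms = {
    "education": ["learning", "teaching", "knowledge",
                  "academic growth", "skill development"],
}

original = "Tell me about education."
enriched = enrich_prompt(original, "education", related_terms)
print(enriched)
# Tell me about education, learning, teaching, knowledge, academic growth, skill development.
```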

16. What is Hugging Face?

Answer:
Hugging Face is a platform and open-source community that hosts pre-trained AI models and provides libraries, such as Transformers, for NLP tasks.

Tasks include:

  • Sentiment analysis
  • Translation
  • Summarization
  • Question answering
  • Text generation

17. What is a Pipeline in Hugging Face?

Answer:
A pipeline is a simple API that allows users to perform NLP tasks with very little code.

Example:

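from transformers import pipeline

# Build a sentiment classifier with the default pre-trained model.
classifier = pipeline("sentiment-analysis")
print(classifier("I love this movie."))  # list of dicts with a label and a score

The pipeline downloads and loads a default pre-trained model and its tokenizer automatically.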

18. What is Sentiment Analysis?

Answer:
Sentiment analysis identifies the opinion or emotional tone expressed in text.

Outputs:

  • Positive
  • Negative
  • Neutral

Example:

I love this movie.

→ Positive

This product is terrible.

→ Negative
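The mapping from text to a label can be illustrated with a very small lexicon-based scorer. This is a deliberately simple stand-in, not how a transformer model decides: the word lists are invented for this example, and the approach ignores negation and context.

```python
# Tiny hand-made word lists (assumptions for illustration only).
POSITIVE_WORDS = {"love", "great", "excellent", "good"}
NEGATIVE_WORDS = {"terrible", "bad", "awful", "hate"}

def lexicon_sentiment(text):
    """Label text by counting positive and negative words."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    score = sum(w in POSITIVE_WORDS for w in words) - sum(w in NEGATIVE_WORDS for w in words)
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"

print(lexicon_sentiment("I love this movie."))              # Positive
print(lexicon_sentiment("This product is terrible."))       # Negative
print(lexicon_sentiment("The package arrived on Monday."))  # Neutral
```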

19. What are real-world applications of Sentiment Analysis?

Answer:

  • Product reviews
  • Customer feedback
  • Social media monitoring
  • Brand reputation analysis
  • Survey analysis

Companies use it to understand customer satisfaction.

20. What is Text Summarization?

Answer:
Text summarization converts long documents into shorter versions while preserving important information.

Example:

A 5-page article can be summarized into a few paragraphs.

Types:

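  1. Extractive: selects the most important sentences from the original text.
  2. Abstractive: generates new sentences that paraphrase the original text.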
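The extractive type can be sketched without any model by scoring each sentence by how frequent its words are in the whole text. The scoring rule and the sample text are assumptions chosen for brevity; abstractive summarization needs a generative model and is not shown.

```python
import re
from collections import Counter

def extractive_summary(text, num_sentences=1):
    """Return the sentences whose words are, on average, most frequent in the text."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    word_counts = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        words = re.findall(r"[a-z']+", sentence.lower())
        if not words:
            return 0.0
        return sum(word_counts[w] for w in words) / len(words)

    chosen = sorted(sentences, key=score, reverse=True)[:num_sentences]
    # Keep the chosen sentences in their original order.
    return " ".join(s for s in sentences if s in chosen)

text = ("Embeddings map words to vectors. "
        "Vectors for similar words are close together. "
        "The weather was nice.")
summary = extractive_summary(text)
print(summary)  # Embeddings map words to vectors.
```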

21. What is LangChain?

Answer:
LangChain is a framework for developing applications using Large Language Models (LLMs).

Features:

  • Prompt templates
  • Chains
  • Memory
  • Document loaders
  • Vector databases

22. What is a Prompt Template?

Answer:
A Prompt Template defines a fixed structure for model input.

Example:

template = """
Summarize:
{text}
"""

Benefits:

  • Consistency
  • Reusability
  • Better outputs
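The same idea can be tried with plain Python string formatting, which is enough to show how one template is reused with different inputs. LangChain provides its own prompt template classes for this; the helper `build_prompt` below is a hypothetical name and does not use LangChain.

```python
SUMMARY_TEMPLATE = "Summarize:\n{text}\n"

def build_prompt(template, **values):
    """Fill the named placeholders of a template with the given values."""
    return template.format(**values)

prompt_a = build_prompt(SUMMARY_TEMPLATE, text="LangChain is a framework for LLM applications.")
prompt_b = build_prompt(SUMMARY_TEMPLATE, text="FAISS performs similarity search on embeddings.")

print(prompt_a)
print(prompt_b)
```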

23. What is Pydantic?

Answer:
Pydantic is a Python library that validates data against schemas defined with type hints.

In LLM applications, such a schema is often used to describe and check structured output.

Example:

from pydantic import BaseModel

class Student(BaseModel):
    name: str
    age: int

24. What is FAISS?

Answer:
FAISS (Facebook AI Similarity Search) is a library for efficient similarity search and clustering of dense vectors such as embeddings.

Functions:

  • Store vectors
  • Retrieve similar vectors
  • Fast searching

Used in:

  • Chatbots
  • Recommendation systems
  • Search engines
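The store-and-retrieve functions listed above can be illustrated with a tiny exhaustive index in pure Python. This is a stand-in written for this example, not the FAISS API: the class name `FlatIndex` is hypothetical, and a real library adds optimized and approximate search for large collections.

```python
import math

class FlatIndex:
    """Tiny in-memory vector index: exact search by Euclidean (L2) distance."""

    def __init__(self):
        self.vectors = []

    def add(self, vector):
        """Store a vector; its position in the list acts as its id."""
        self.vectors.append(list(vector))

    def search(self, query, k=1):
        """Return (distance, id) pairs for the k stored vectors nearest to query."""
        scored = [(math.dist(query, v), i) for i, v in enumerate(self.vectors)]
        return sorted(scored)[:k]

index = FlatIndex()
for vec in ([0.0, 0.0], [1.0, 1.0], [5.0, 5.0]):
    index.add(vec)

results = index.search([0.9, 1.2], k=2)
nearest_ids = [i for _, i in results]
print(nearest_ids)  # [1, 0]
```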

25. Explain the Working of the IPC (Indian Penal Code) Chatbot.

Answer:

Step 1: Load IPC PDF document.

Step 2: Split document into chunks.

Step 3: Generate embeddings using a Hugging Face model.

Step 4: Store embeddings in FAISS.

Step 5: User asks a question.

Step 6: Question is converted into embedding.

Step 7: Similarity search retrieves relevant IPC sections.

Step 8: Relevant answer is displayed.

This process combines:

  • PDF processing
  • Embeddings
  • FAISS
  • Retrieval-based chatbot architecture
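The retrieval flow can be sketched end to end in pure Python under several stated assumptions: the PDF loading step is replaced by an inline string, the chunks are invented placeholder sentences rather than actual IPC text, word counts stand in for the Hugging Face embedding model, and a plain list stands in for the FAISS index.

```python
import math
import re
from collections import Counter

def split_into_chunks(text):
    """Step 2: split the document into one chunk per sentence."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def embed(text):
    """Steps 3 and 6: bag-of-words counts, a crude stand-in for a neural embedding."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(question, chunks, index, k=1):
    """Step 7: return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(range(len(chunks)), key=lambda i: cosine(q, index[i]), reverse=True)
    return [chunks[i] for i in ranked[:k]]

# Step 1 is skipped: placeholder text instead of a loaded PDF (not real legal text).
document = ("Placeholder section one covers theft of movable property. "
            "Placeholder section two covers cheating and dishonest inducement. "
            "Placeholder section three covers defamation of a person.")

chunks = split_into_chunks(document)   # Step 2
index = [embed(c) for c in chunks]     # Steps 3 and 4
answer = retrieve("What does the law say about theft?", chunks, index)  # Steps 5 to 7
print(answer[0])                       # Step 8
```

Swapping `embed` for a sentence-embedding model and the list for a vector index would change the quality and speed of retrieval, while the order of the steps stays the same.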

 
