1. What is a Neural Network?
A Neural Network is a type of artificial intelligence model inspired by how the human brain works. Just like your brain has neurons that pass signals to each other, a neural network has artificial neurons arranged in layers.
These neurons learn patterns from data.
Structure of a Neural Network
A basic neural network has three types of layers:
- Input Layer
  - Takes the data (like images, text, numbers).
- Hidden Layers
  - These layers do the actual learning.
  - More hidden layers → deeper network → can learn more complex patterns (but needs more data and training time).
- Output Layer
  - Gives the final result (e.g., “DR present”, “cat”, “spam email”, etc.).
How it learns
It learns by adjusting the weight of each connection, similar to how we strengthen or weaken ideas in our brain.
- Data is given to the network.
- The network predicts something.
- If the prediction is wrong, it adjusts its weights to reduce the error.
- Over time, it gets better and better.
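The learning loop above can be sketched with a single artificial neuron that has one weight. This is only a minimal illustration, not a full network: the toy data (y = 2x), the learning rate, and the function name `train` are assumptions made up for this example.

```python
def train(data, learning_rate=0.05, epochs=200):
    """Fit a single weight w so that w * x approximates y."""
    w = 0.0  # start with no knowledge
    for _ in range(epochs):
        for x, y in data:
            prediction = w * x              # the network predicts something
            error = prediction - y          # how wrong was the prediction?
            w -= learning_rate * error * x  # adjust the weight to reduce the error
    return w

# Toy data (assumed): the hidden pattern is y = 2 * x
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
learned_weight = train(data)
print(learned_weight)  # ends up very close to 2.0
```

A real neural network repeats the same idea for thousands or millions of weights at once.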
Example Uses
- Detecting diabetic retinopathy
- Face recognition
- Language translation
- Self-driving cars
2. What is Reinforcement Learning?
Reinforcement Learning (RL) is a type of machine learning where an agent learns by interacting with an environment and receiving rewards or punishments.
It is like training a pet or learning a game.
Key Idea
The agent tries different actions.
- If the action is good → It gets a reward
- If the action is bad → It gets a penalty
Over time, the agent learns the best actions to maximize reward.
Important Terms
- Agent: The learner (like a robot or software program).
- Environment: Where it acts (a game, a room, a factory).
- Action: What the agent does.
- Reward: Feedback given for actions.
- Policy: The strategy the agent learns to follow.
Example
A robot wants to learn to walk:
- When it moves forward → +10 reward
- When it falls → –5 penalty
- Over time, it learns stable walking.
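The walking example can be sketched as a very small trial-and-error loop. This is a simplified illustration (epsilon-greedy value learning with no states), not a full RL algorithm. The two action names, the 70% fall chance, and the settings `alpha` and `epsilon` are assumptions invented for the sketch; only the +10 and –5 values come from the example above.

```python
import random

REWARD_FORWARD = 10  # from the example: moving forward
PENALTY_FALL = -5    # from the example: falling

def step(action, rng):
    """Hypothetical environment: return the reward for one action."""
    if action == "careful_step":
        return REWARD_FORWARD  # always moves forward
    # "lunge" (hypothetical action): assumed to fall 70% of the time
    return PENALTY_FALL if rng.random() < 0.7 else REWARD_FORWARD

def learn(episodes=500, alpha=0.1, epsilon=0.1, seed=0):
    """Estimate the value of each action by trial and error."""
    rng = random.Random(seed)
    q = {"careful_step": 0.0, "lunge": 0.0}  # estimated value per action
    for _ in range(episodes):
        if rng.random() < epsilon:
            action = rng.choice(list(q))     # explore: try a random action
        else:
            action = max(q, key=q.get)       # exploit: use the best-known action
        reward = step(action, rng)
        q[action] += alpha * (reward - q[action])  # move the estimate toward the reward
    return q

q_values = learn()
best_action = max(q_values, key=q_values.get)
print(q_values, best_action)
```

The learned policy here is simply "pick the action with the highest estimated value".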
Where RL is used
- Robotics (navigation, manipulation)
- Game playing (Chess, Go, Atari, etc.)
- Self-driving cars
- Healthcare treatment recommendations
- Industrial automation
3. Explain Autoencoder in detail
An Autoencoder is a special type of neural network that learns to compress data and then reconstruct it.
Its main goal is to learn the most important patterns in the data.
In simple words:
Autoencoder = Data Compressor + Data Rebuilder
It is an unsupervised learning technique because it does not need labeled data.
Why do we need Autoencoders?
Autoencoders help in:
- Reducing the size of data (compression)
- Removing noise from images
- Finding important features automatically
- Detecting anomalies (fraud, defects, rare events)
- Enhancing medical images
They are widely used in image processing, NLP, and anomaly detection.
Structure of an Autoencoder
An autoencoder has three main parts:
1. Encoder
- Takes the input data (image, text, etc.)
- Compresses it into a smaller-sized representation
- This condensed representation is called the latent vector or bottleneck
Example:
Input image: 1000 pixels
Encoder compresses it to: 50 values
It learns only the essential features.
2. Bottleneck (Latent Space)
This is the compressed knowledge of the data.
- Represents only the important features.
- Removes unnecessary details.
- Acts like the “core meaning” of the data.
This is where the autoencoder learns hidden patterns.
3. Decoder
- Takes the latent vector
- Tries to reconstruct the original input
The goal is to make the reconstructed output as close as possible to the original input.
How Autoencoders Learn
During training, the autoencoder compares:
- Input Image vs. Reconstructed Image
The difference between them is the reconstruction error.
The network adjusts weights to minimize this error.
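To make the reconstruction error concrete, here is a minimal sketch of the encode → decode → compare pipeline. The encoder and decoder below are fixed hand-written functions used as stand-ins (a real autoencoder learns them as neural network layers), and mean squared error is assumed as the error measure. All names and numbers are made up for the illustration.

```python
def encode(values):
    """Stand-in encoder: average each adjacent pair (needs an even length)."""
    return [(values[i] + values[i + 1]) / 2 for i in range(0, len(values), 2)]

def decode(latent):
    """Stand-in decoder: repeat each latent value twice."""
    return [v for v in latent for _ in range(2)]

def reconstruction_error(original, reconstructed):
    """Mean squared error between the input and its reconstruction."""
    squared = [(o - r) ** 2 for o, r in zip(original, reconstructed)]
    return sum(squared) / len(original)

smooth = [1.0, 1.0, 4.0, 4.0, 2.0, 2.0]  # fits the compressed form exactly
bumpy = [1.0, 3.0, 4.0, 4.0, 2.0, 2.0]   # has detail the bottleneck cannot keep

smooth_error = reconstruction_error(smooth, decode(encode(smooth)))
bumpy_error = reconstruction_error(bumpy, decode(encode(bumpy)))
print(smooth_error, bumpy_error)
```

An input that the bottleneck can represent well gets a low error; an unusual input gets a higher error, which is the idea behind using autoencoders for anomaly detection.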
4. Explain Word Embedding in detail
Word Embedding is a technique in Natural Language Processing (NLP) where words are converted into numerical vectors so that a machine can understand and process them.
In simple words:
Word embedding = representing words as numbers in such a way that words with similar meanings have similar vector values.
It helps computers understand meaning, context, and relationships between words.
Why Do We Need Word Embeddings?
Computers understand numbers, not text.
Earlier, NLP used One-Hot Encoding:
- Each word gets a unique vector (0s and 1s)
- Very large and sparse
- No meaning: “king” and “queen” look unrelated
Example:
- KING = [0,0,1,0,0,0,...]
- QUEEN = [0,0,0,1,0,0,...]
No similarity is captured.
- Problem: One-hot encoding does not capture meaning or relationships.
- Word Embeddings solve this by placing similar words close together in vector space.
What Are Word Embeddings?
Word embeddings are dense vectors (e.g., 50, 100, 300 dimensions) that capture:
- Semantics (meaning)
- Context
- Relationships
- Similarity between words
Example of embeddings:
- king → [0.27, 0.68, 0.12, …]
- queen → [0.26, 0.70, 0.10, …]
- man → [0.11, 0.01, 0.19, …]
- woman → [0.10, 0.02, 0.20, …]
Notice: king and queen vectors are similar, as are man and woman.
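"Similar vectors" is usually measured with cosine similarity. The sketch below uses only the first three values shown for each word above (the remaining dimensions are elided in the example, so this is an illustration, not a real embedding lookup). The one-hot vectors are cut to the six positions shown earlier.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# First three components of the example embeddings (rest elided in the text)
king = [0.27, 0.68, 0.12]
queen = [0.26, 0.70, 0.10]
man = [0.11, 0.01, 0.19]

# One-hot vectors, truncated to the six positions shown in the text
king_one_hot = [0, 0, 1, 0, 0, 0]
queen_one_hot = [0, 0, 0, 1, 0, 0]

print(cosine_similarity(king, queen))                  # close to 1: very similar
print(cosine_similarity(king, man))                    # much lower
print(cosine_similarity(king_one_hot, queen_one_hot))  # 0.0: no similarity captured
```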
How Word Embeddings Work
Word embeddings rely on a simple idea:
“Tell me who your neighbors are, and I’ll tell you what you mean.”
Words occurring in similar contexts have similar meanings.
Example:
- “The doctor treated the patient.”
- “The nurse cared for the patient.”
The words doctor and nurse appear near patient, so embeddings place them close together.
5. Explain Types of Word Embeddings
1️⃣ Word2Vec (Most popular)
Created by Google.
Two versions:
- CBOW (Continuous Bag of Words)
  Predicts a word from surrounding context
  Example: ___ ate an apple → predicts “He”
- Skip-Gram
  Predicts context words from a given word
  Example: “apple” → predicts “ate”, “fruit”, “red”
Key feature:
- Learns semantic relationships like:
  👉 king – man + woman ≈ queen
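The analogy above is plain vector arithmetic followed by a nearest-neighbour search. The sketch below uses hand-made 2-dimensional toy vectors (axis 1 = "royalty", axis 2 = "gender"). These numbers are invented for illustration and do not come from a trained Word2Vec model.

```python
import math

# Hand-made toy vectors: [royalty, gender] (assumed values, not learned)
vectors = {
    "king": [1.0, 1.0],
    "queen": [1.0, -1.0],
    "man": [0.0, 1.0],
    "woman": [0.0, -1.0],
    "apple": [-1.0, 0.0],
}

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def analogy(a, b, c):
    """Return the word closest to vector(a) - vector(b) + vector(c)."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    # Exclude the three input words, as is usual for analogy queries
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine_similarity(target, vectors[w]))

result = analogy("king", "man", "woman")
print(result)  # queen
```

With real embeddings the result is only approximately equal to the "queen" vector, which is why the nearest word is looked up.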
2️⃣ GloVe (Global Vectors)
Created by Stanford.
Uses word-word co-occurrence statistics from the entire text.
Strength:
Captures global relationships, not just local context.
3️⃣ FastText (by Facebook)
FastText breaks words into sub-word units (character n-grams).
Example:
- “playing” → “play”, “lay”, “ing”
Strength:
- Handles rare words better
- Works well for languages with rich morphology (Tamil, Kannada, Hindi)
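The sub-word idea can be sketched as a small function that lists the character n-grams of a word. FastText wraps each word in `<` and `>` boundary markers and, by default, uses n-grams of length 3 to 6. The helper name `char_ngrams` is just an illustrative choice, and this sketch only lists the n-grams; it does not train any vectors.

```python
def char_ngrams(word, n_min=3, n_max=6):
    """List character n-grams of a word, with < and > marking its boundaries."""
    marked = f"<{word}>"
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(marked) - n + 1):
            grams.append(marked[i:i + n])
    return grams

trigrams = char_ngrams("playing", 3, 3)
all_grams = char_ngrams("playing")
print(trigrams)
# ['<pl', 'pla', 'lay', 'ayi', 'yin', 'ing', 'ng>']
```

Because a rare or unseen word still shares n-grams such as “play” or “ing” with known words, its vector can be built from those pieces.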
4️⃣ Contextual Embeddings (Modern NLP)
These embeddings depend on the sentence context.
Examples:
- BERT
- GPT
- ELMo
Example:
Word: "bank"
- “He sat near the bank of the river.”
- “He deposited money in the bank.”
Traditional (static) embeddings give the same vector for “bank” in both sentences → WRONG
Contextual embeddings give different vectors for each meaning → CORRECT
6. Datasets Used in the Deep Learning Lab – Detailed Explanation
Your lab manual uses five major datasets across different deep learning tasks:
- Custom Text Corpus → for Word Embeddings
- MNIST Dataset → for Deep Neural Networks & Autoencoders
- CIFAR-10 Dataset → for Pretrained CNN Models
- IMDB Movie Reviews Dataset → for Text Classification
- Synthetic Time Series Dataset → for LSTM Forecasting
Below is a clear explanation of each.
1️⃣ Custom Text Corpus (Used in Word Embedding Lab)
Where Used:
✔ Experiment 1 – Generate Word Embeddings using Word2Vec
What it is:
A text dataset created by the user, often raw text paragraphs like:
- Articles
- Documents
- Stories
- Wikipedia text
- Classroom notes
This text is fed into spaCy for tokenization and Word2Vec for generating embeddings.
Why used:
- To help students understand how Word2Vec learns context.
- Shows how embeddings capture similarity and relations between words.
Data Format:
A plain string or text file, converted into:
- Tokens → words
- Training pairs → (center word, context word)
Example tokens:
["neural", "networks", "are", "powerful", "models", ...]
The dataset size can be small, even 5–10 sentences, because the goal is to learn the concept.
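The (center word, context word) training pairs can be generated with a short loop. This is a minimal sketch using the five example tokens shown above. The function name `skipgram_pairs` and the window size of 1 are assumptions for the illustration; the lab itself uses spaCy and Word2Vec for the real work.

```python
def skipgram_pairs(tokens, window=1):
    """Build (center word, context word) pairs within a fixed window."""
    pairs = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs

tokens = ["neural", "networks", "are", "powerful", "models"]
pairs = skipgram_pairs(tokens, window=1)
for pair in pairs:
    print(pair)
# ('neural', 'networks'), ('networks', 'neural'), ('networks', 'are'), ...
```

A larger window gives each word more context words, at the cost of more training pairs.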
2️⃣ MNIST Dataset (Used in Experiments 2 & 4)
📌 Where Used:
✔ Experiment 2 – Deep Neural Network for Classification
✔ Experiment 4 – Autoencoder for Image Compression
📌 What MNIST Is:
MNIST is a world-famous dataset of handwritten digits (0–9).
It contains grayscale images, each 28×28 pixels.
Dataset size:
- 60,000 training images
- 10,000 testing images
- Total = 70,000 images
Each image:
- Resolution: 28 × 28
- Color: 1 channel (grayscale)
- Pixel range: 0–255, normalized to 0–1
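The normalization step is simply a division by 255. The sketch below uses a made-up 28 × 28 image instead of real MNIST data. The flattening into 784 values is an assumption about how a dense network would take the input, and the function names are illustrative.

```python
SIZE = 28  # MNIST images are 28 x 28 pixels

def normalize(image):
    """Scale pixel values from the 0-255 range into the 0-1 range."""
    return [[pixel / 255.0 for pixel in row] for row in image]

def flatten(image):
    """Turn a 2D image into one long list (28 * 28 = 784 values)."""
    return [pixel for row in image for pixel in row]

# Made-up image whose pixels cycle through 0..255 (not a real digit)
raw_image = [[(r * SIZE + c) % 256 for c in range(SIZE)] for r in range(SIZE)]

flat = flatten(normalize(raw_image))
print(len(flat), min(flat), max(flat))  # 784 0.0 1.0
```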
Why MNIST is used:
- Very clean, standardized dataset
- Perfect for beginners
- Ideal for testing neural networks, autoencoders, and CNN basics
Where it fits in your labs:
- Deep Neural Network learns to classify digits
- Autoencoder learns to compress and reconstruct digit images
3️⃣ CIFAR-10 Dataset (Used in Pretrained Model Lab)
📌 Where Used:
✔ Experiment 7 – Using Pretrained MobileNetV2 for Image Classification
✔ Prediction and Evaluation on New Images
📌 What CIFAR-10 Is:
A popular dataset of color images, used to test computer vision models.
Dataset size:
- 50,000 training images
- 10,000 test images
- Total = 60,000 images
Each image:
- Resolution: 32 × 32 pixels
- Channels: 3 (RGB)
- 10 categories:
| Class | Example Images |
| airplane | ✈️ |
| automobile | 🚗 |
| bird | 🐦 |
| cat | 🐱 |
| deer | 🦌 |
| dog | 🐶 |
| frog | 🐸 |
| horse | 🐴 |
| ship | 🚢 |
| truck | 🚚 |
Why CIFAR-10 is used:
- More complex than MNIST
- Suitable for testing pretrained CNNs
- Helps students understand transfer learning
Where it fits in your labs:
- You resize CIFAR images to 96×96 for MobileNetV2
- Pretrained model extracts features
- Custom classifier predicts the category
4️⃣ IMDB Movie Reviews Dataset (Used in Text Classification Lab)
📌 Where Used:
✔ Experiment 5 – Deep learning network for text classification (sentiment analysis)
📌 What IMDB Dataset Is:
A large text dataset of movie reviews with labeled sentiment.
Each review is either:
- ⭐ Positive
- ⭐ Negative
Dataset size:
- 25,000 training reviews
- 25,000 testing reviews
- Balanced dataset (50% positive, 50% negative)
Data format:
- Reviews are already converted into integers (word indices)
- Example: The movie was great → [34, 12, 5, 98]
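The idea of turning a review into word indices can be sketched as follows. This is an illustration on a tiny made-up corpus, so the indices differ from the example numbers above. The reserved indices (0 for padding, 1 for unknown words), the front padding, and the function names are assumptions chosen for the sketch, not the exact scheme of the IMDB dataset loader.

```python
from collections import Counter

PAD, UNKNOWN = 0, 1  # reserved indices (assumed convention for this sketch)

def build_vocab(texts):
    """Give more frequent words smaller indices, starting at 2."""
    counts = Counter(word for text in texts for word in text.lower().split())
    ordered = sorted(counts, key=lambda w: (-counts[w], w))  # ties: alphabetical
    return {word: i + 2 for i, word in enumerate(ordered)}

def encode(text, vocab):
    """Replace each word by its index (UNKNOWN if not in the vocabulary)."""
    return [vocab.get(word, UNKNOWN) for word in text.lower().split()]

def pad(sequence, length):
    """Keep the last `length` items and pad the front with PAD."""
    sequence = sequence[-length:]
    return [PAD] * (length - len(sequence)) + sequence

reviews = ["the movie was great", "the movie was bad", "great acting"]
vocab = build_vocab(reviews)
encoded = encode("The movie was great", vocab)
print(encoded)          # [4, 3, 5, 2]
print(pad(encoded, 6))  # [0, 0, 4, 3, 5, 2]
```

Padding makes every review the same length so that batches can be fed to the network.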
Why IMDB is used:
- Standard benchmark for sentiment analysis
- Useful for learning embeddings + LSTM/BiLSTM networks
In your lab:
- Word indices → Embedding layer (128 dimensions)
- Bidirectional LSTM extracts meaning
- Output → probability of positive vs negative
5️⃣ Synthetic Time Series Dataset (Used in LSTM Forecasting)
📌 Where Used:
✔ Experiment 6 – Deep Learning Model for Time Series Forecasting
📌 What the Synthetic Dataset Is:
A generated time series made of:
- Sine wave
- Trend
- Noise
Example generated data:
y = sin(0.02x) + random_noise + trend
Dataset size:
- Usually 500 data points, but can be changed.
Why Synthetic?
- Lets students practice forecasting without needing real industrial data
- Easy to visualize
- Shows the role of sequence windows
In the lab:
- Scaled using MinMaxScaler
- Prepared in sliding windows
- LSTM predicts the next value in sequence
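The data preparation steps above (generate, scale, window) can be sketched in plain Python. The noise level, trend slope, and window length of 20 are assumptions made for this sketch, and `min_max_scale` is a simple stand-in for what MinMaxScaler does on one column. The LSTM itself is not shown.

```python
import math
import random

def make_series(n=500, seed=0):
    """y = sin(0.02x) + random_noise + trend (noise and slope are assumed)."""
    rng = random.Random(seed)
    return [math.sin(0.02 * x) + rng.gauss(0, 0.1) + 0.001 * x for x in range(n)]

def min_max_scale(values):
    """Rescale values into the 0-1 range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def sliding_windows(values, window=20):
    """Each input is `window` consecutive values; the target is the next value."""
    inputs, targets = [], []
    for i in range(len(values) - window):
        inputs.append(values[i:i + window])
        targets.append(values[i + window])
    return inputs, targets

series = make_series()
scaled = min_max_scale(series)
X, y = sliding_windows(scaled, window=20)
print(len(series), len(X), len(X[0]), len(y))  # 500 480 20 480
```

Each row of `X` is what the LSTM sees, and the matching entry of `y` is the value it should predict.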
7. What is a pretrained model?
A pretrained model is a deep learning model that has already been trained on a very large dataset and can be reused for your own tasks.
In simple words:
A pretrained model = a model that someone else has already trained for you on a huge dataset, so you don’t have to start from zero.
Why Pretrained Models Are Useful
Training deep models from scratch often requires:
- millions of images or text samples
- huge GPU power
- days or weeks of training
Most students, researchers, and even many companies do not have these resources.
So instead, we reuse models trained by experts on massive datasets like:
- ImageNet (14 million images)
- COCO dataset
- Wikipedia text
- Common Crawl data
These models have already learned generic features, so we only need to fine-tune them on our own data.
Example of Pretrained Models
Pretrained CNNs for Images
- MobileNetV2
- ResNet
- VGG16
- InceptionV3
- EfficientNet
These are trained on ImageNet to recognize 1000 classes.
You can reuse them to classify:
- medical images
- animals
- vehicles
- plants
even if these images are not part of the original dataset.
Pretrained NLP Models
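- BERT
- GPT
- Word2Vec
- FastText
These are trained on very large text corpora, often billions of words.
They already capture, to varying degrees:
- grammar
- sentence structure
- word meaning
- context
(Word2Vec and FastText give one fixed vector per word, so they mainly capture word meaning; BERT and GPT also capture context.)
You only fine-tune them for tasks like:
- sentiment analysis
- summarization
- translation
- chatbots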