Artificial Intelligence vs Machine Learning vs Deep Learning
|            | AI | ML | Deep Learning |
| Originated | 1950s | 1960s | 1970s |
| What       | Simulated intelligence in machines | Machines making decisions without being explicitly programmed | Using neural networks to solve complex problems |
| Objective  | Building machines that can think like humans | Algorithms that can learn from data | Neural networks that identify patterns |
Activation Function
ReLU?
ReLU is an element-wise operation (here applied to a 2D tensor): if a value is less
than 0, replace it with 0; otherwise keep the original value.
# Means relu(x) = max(x, 0)
Pseudocode:
def relu(x):
    for each element in x:
        if element < 0: replace it with 0
        else: keep the element as it is
import numpy as np

def naive_relu(x):  # x is a 2D tensor
    assert len(x.shape) == 2
    x = x.copy()  # copy to avoid changing the input
    for i in range(x.shape[0]):      # x.shape[0] = 2 rows
        for j in range(x.shape[1]):  # x.shape[1] = 3 columns
            x[i, j] = max(x[i, j], 0)
    return x

a = np.array([  # ndim = 2, shape (2, 3)
    [0, 1, -2],
    [4, 5, -6],
])
b = naive_relu(a)
print(b)
# [[0 1 0]
#  [4 5 0]]
Adding an activation function to a layer introduces non-linearity into the
model, allowing it to learn more complex relationships between the input
and output data.
Non-linear activation functions such as ReLU, Sigmoid, and Tanh help the
model fit the training data better and make more accurate predictions on
new data.
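For instance, a minimal sketch of specifying ReLU on Keras Dense layers; the layer sizes and the 20-feature input shape are illustrative assumptions, not taken from these notes:
from keras import layers, models

# Small classifier sketch: ReLU in the hidden layer adds non-linearity.
model = models.Sequential([
    layers.Input(shape=(20,)),              # 20 input features (illustrative)
    layers.Dense(64, activation='relu'),    # non-linear hidden layer
    layers.Dense(10, activation='softmax'), # probabilities over 10 classes
])
model.summary()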
Conda
Miniconda is the recommended approach for installing TensorFlow with GPU support.
It creates a separate environment so that software already installed on your
system is not changed. This is also the easiest way to install the required
software, especially for the GPU setup.
Optimizers
An optimizer is an algorithm used to adjust the parameters of a model in
order to minimize the error or loss function.
| Optimizer | Meaning | Example |
| 1. RMSprop (Root Mean Square Propagation) | Divides the learning rate for each weight by a running average of the magnitudes of its recent gradients | from keras.optimizers import RMSprop; optimizer = RMSprop(learning_rate=0.001, rho=0.9) |
| 2. Stochastic Gradient Descent (SGD) | Adjusts the model parameters using the gradient of the loss computed on small random batches of samples | from keras.optimizers import SGD; optimizer = SGD(learning_rate=0.01, momentum=0.9) |
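A minimal sketch of passing a configured optimizer to compile(); the tiny model and all hyperparameter values are illustrative assumptions:
from keras import layers, models
from keras.optimizers import RMSprop

# The optimizer object (or its string name) is passed to model.compile().
model = models.Sequential([
    layers.Input(shape=(4,)),
    layers.Dense(3, activation='softmax'),
])
model.compile(optimizer=RMSprop(learning_rate=0.001, rho=0.9),
              loss='categorical_crossentropy',
              metrics=['accuracy'])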
Orchestration
Coordinating, monitoring, and handling ML pipelines.
Steps in Orchestration?
1. Workflow Definition: defining the sequence of steps (tasks)
2. Automation: automatically triggering and running the pipeline
3. Resource Management: managing the underlying computational resources (CPUs, GPUs)
4. Monitoring and Logging: tracking the execution status and performance metrics (see the sketch below)
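A minimal, framework-free sketch of these steps; real orchestrators such as Airflow or Kubeflow provide this machinery, and the step functions here are hypothetical placeholders:
import logging

logging.basicConfig(level=logging.INFO)

# Hypothetical steps standing in for real data and training code.
def ingest_data():
    return "raw-data"

def train_model(data):
    return f"model-trained-on-{data}"

def evaluate(model):
    return 0.92  # placeholder metric

def run_pipeline():
    """Workflow definition: an ordered sequence of tasks with logging/monitoring."""
    data = ingest_data()
    logging.info("ingest finished")
    model = train_model(data)
    logging.info("training finished")
    score = evaluate(model)
    logging.info("evaluation score: %s", score)

run_pipeline()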
Bias/Sampling Bias
We should use a training data set that is representative of the cases we
want the model to generalize to.
If the sample is too small, you will have sampling noise (i.e.,
nonrepresentative data as a result of chance), but even very large
samples can be nonrepresentative if the sampling method is flawed. This
is called sampling bias.
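A minimal NumPy sketch of sampling noise; the population distribution and sample sizes are illustrative assumptions:
import numpy as np

# Small random samples give noisy estimates of the population mean (~50).
rng = np.random.default_rng(0)
population = rng.normal(loc=50, scale=10, size=1_000_000)
for n in (10, 100, 100_000):
    sample = rng.choice(population, size=n, replace=False)
    print(n, round(float(sample.mean()), 2))
# Small n drifts further from 50 (sampling noise); a flawed selection method
# would stay off-target even for large n (sampling bias).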
CNTK
The Microsoft Cognitive Toolkit (CNTK) is a deep learning framework that can be plugged into Keras as a backend.
Fitting
Overfitting
Means that the model performs well on the training data but does not
generalize well (i.e., it produces poor results on real-world/unseen data).
This is because the model memorizes the exact relationships in the sample
data, including noise and minor details.
Overfitting example on DecisionTreeRegressor (see the combined sketch after the Underfitting entry below)
Underfitting
The model produces poor results on the training data and poor results on new
data as well, because it is too simple to capture the underlying pattern.
Underfitting Example on DecisionTreeRegressor
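A minimal sketch of both effects with scikit-learn's DecisionTreeRegressor; the synthetic sine data and the max_depth values are illustrative assumptions:
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Synthetic noisy sine data (illustrative only).
rng = np.random.RandomState(0)
X = np.sort(5 * rng.rand(200, 1), axis=0)
y = np.sin(X).ravel() + rng.normal(0, 0.2, X.shape[0])
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for name, depth in [("underfit", 1), ("reasonable", 4), ("overfit", None)]:
    tree = DecisionTreeRegressor(max_depth=depth, random_state=0).fit(X_train, y_train)
    print(name,
          "train MSE:", round(mean_squared_error(y_train, tree.predict(X_train)), 3),
          "test MSE:",  round(mean_squared_error(y_test, tree.predict(X_test)), 3))
# Typically: depth=1 scores poorly on both sets (underfitting);
# unlimited depth is near-perfect on train but worse on test (overfitting).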
Large language model
This is a computational model that can perform natural language processing
tasks such as classification.
LLMs learn language from large amounts of text by self-supervised and
semi-supervised training.
LLMs can be used for text generation, a form of generative AI, by taking
an input text and repeatedly predicting the next token.
Llama (Large Language Model Meta AI)
A family of LLMs released by Meta AI, starting in February 2023.
Llama Models
|                     | Llama 2 (released Jul 2023) | Llama 3 (released Apr 2024) |
| Trained model sizes | 7, 13, and 70 billion parameters | 8B and 70B parameters |
| Improvements        | Fine-tuned for dialogue compared to Llama 1 | A 400B+ parameter model was still being trained at release |
Layer (class or function)
A layer processes input data (a tensor) and produces an output (a tensor) in a
specific format.
A neural network is created by cascading multiple layers.
Types of Layers:
|       | Dense Layer / Fully Connected Layer | Convolutional Layer | Recurrent Layer |
| What  | Each neuron is connected to every neuron in the previous layer | Uses convolution operations to detect local patterns in the input data | Processes sequential data, where the order of the input matters |
| Usage | Image classification, regression, and more | Image classification, object detection, image segmentation, spatial hierarchies | Natural language processing (NLP), time series analysis, and speech recognition |
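A minimal sketch instantiating one Keras layer of each type; all sizes are illustrative assumptions:
from keras import layers

dense = layers.Dense(64, activation='relu')           # fully connected layer
conv  = layers.Conv2D(32, (3, 3), activation='relu')  # detects local 2D patterns (images)
lstm  = layers.LSTM(32)                               # recurrent layer for sequences
print(dense, conv, lstm)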
Generative AI
A class of models that creates content from user input. For example,
generative AI can create unique images, music, etc.
Example:
Text-to-text
Text-to-image
Text-to-video
Text-to-code
Text-to-speech
Image and text-to-image
Loss Function
In Keras, a loss function is used during the training of a neural network.
It measures the difference between the model's predictions and the actual
target values; the goal of training is to minimize this loss.
Conceptually, the loss is a function of the predicted output and the expected
output, e.g. mean squared error averages the squared differences between them.
| Type | What | Example |
| 1. Categorical Crossentropy (categorical_crossentropy) | Used in multi-class classification problems when the target variable is one-hot encoded | model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy']) |
| 2. Binary Crossentropy (binary_crossentropy) | Used for binary classification problems, where the target variable is binary (0 or 1) | model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy']) |
| 3. Mean Squared Error (mean_squared_error or mse) | Measures the average squared difference between the true and predicted values | model.compile(optimizer='adam', loss='mean_squared_error') |
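A minimal sketch of mean squared error computed by hand with NumPy; the values are illustrative:
import numpy as np

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.1, 1.9, 3.5])
mse = np.mean((y_true - y_pred) ** 2)  # average of squared differences
print(mse)  # ~0.09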
to_categorical
Function to convert labels into a one-hot encoded format.
One-hot encoding converts categorical labels into a binary matrix of 1s and 0s.
from keras.utils import to_categorical
train_labels = [0, 1, 2, 0, 1]
train_labels_one_hot = to_categorical(train_labels) # Convert to one-hot encoding
print(train_labels_one_hot)
# [[1. 0. 0.]    <- label 0
#  [0. 1. 0.]    <- label 1
#  [0. 0. 1.]    <- label 2
#  [1. 0. 0.]    <- label 0
#  [0. 1. 0.]]   <- label 1
Matplotlib
A popular plotting library for Python that provides a variety of
high-quality 2D and 3D plots and visualizations.
matplotlib.pyplot is a collection of functions that makes Matplotlib work
like MATLAB, allowing you to create plots and charts.
imshow(digit, cmap=plt.cm.binary) displays the image stored in the digit
array; cmap is the colormap used to map data values to colors in the plot,
and plt.cm.binary maps them to black and white.
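A minimal imshow sketch; the tiny 2x2 array stands in for a real image such as an MNIST digit:
import numpy as np
import matplotlib.pyplot as plt

digit = np.array([[0, 255],
                  [128, 64]])          # stand-in for a real image array
plt.imshow(digit, cmap=plt.cm.binary)  # map values to a black-and-white colormap
plt.show()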
Metrics
Metrics to monitor during training and testing
Tensor = n-D Matrix
A tensor is a matrix (as in maths) generalized to any number of dimensions; in
practice, multi-dimensional NumPy arrays used to store numbers during computation.
Types of Tensors
| Dimension / Rank / Axis / Ndim | Name | Representation | Example data | Shape (number of elements along each axis) | Processed by (Keras) |
| 0 | Scalar | a single number, e.g. 12 | | () | |
| 1 | Vector | [1, 2, 3, 4] | | (4,) -- 4 elements along one axis | |
| 2 | Matrix / 2D tensor | [[1, 2, 3], [4, 5, 6]] | Samples of feature vectors | (2, 3) -- 2 rows, 3 columns | Dense layers |
| 3 | 3D tensor | [[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]] | Timestamped / sequence data | (2, 2, 3) | Recurrent layers (e.g. LSTM) |
| 4 | 4D tensor | 3D tensors packed together | Image data | (samples, height, width, channels) | 2D convolution layers (Conv2D) |
import numpy as np

########## 2-D tensor ##########
b = np.array(
    [
        [0, 1, 2, 3],
        [4, 5, 6, 7],
        [8, 9, 10, 11],
    ]
)
print("Dimension/Ndim:", b.ndim)  # 2 -> 2-D array
print("Shape:", b.shape)          # (3, 4) -> (rows, cols)

########## 3-D tensor: packing 2-D matrices ##########
c = np.array(
    [
        [
            [0, 1, 2],
            [4, 5, 6],
            [8, 9, 10],
        ],
        [
            [10, 11, 12],
            [14, 15, 16],
            [18, 19, 110],
        ],
    ]
)
print("Dimension/Ndim:", c.ndim)  # 3 -> 3-D array
print("Shape:", c.shape)          # (2, 3, 3) -> (matrices, rows, cols)

########## Slicing operations ##########
# [all matrices, rows from index 2 onward, cols from index 2 onward]
d = c[:, 2:, 2:]
print(d)  # [[[ 10]] [[110]]]
Tensor Operations
Adding a vector (ndim = 1) to a matrix (ndim = 2) broadcasts the vector across each row:
[1 2 3] + | 1 2 3 | = | 2 4 6 |
          | 4 5 6 |   | 5 7 9 |
import numpy as np

def naive_add_matrix_and_vector(x, y):
    assert len(x.shape) == 2  # matrix
    assert len(y.shape) == 1  # vector
    assert x.shape[1] == y.shape[0]
    x = x.copy()
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            x[i, j] += y[j]
    return x

x = np.array(
    [
        [1, 2, 3],
        [4, 5, 6],
    ]
)
y = np.array([1, 2, 3])
z = naive_add_matrix_and_vector(x, y)
print(z)
'''
[[2 4 6]
 [5 7 9]]
'''
Dot Product (.)
Vector (1D) . Vector (1D) = scalar:
[1, 2, 3] . [2, 3, 4] = 1*2 + 2*3 + 3*4 = 20.0
import numpy as np

def naive_vector_dot(x, y):
    z = 0.  # float
    for i in range(x.shape[0]):
        z += x[i] * y[i]
    return z

x = np.array([1, 2, 3])
y = np.array([2, 3, 4])
z = naive_vector_dot(x, y)
print(z)  # 20.0

Matrix (2D) . Vector (1D) = vector:
| 1 2 3 | . [1, 2, 3] = [1*1 + 2*2 + 3*3, 4*1 + 5*2 + 6*3] = [14.0, 32.0]
| 4 5 6 |
import numpy as np

def naive_matrix_vector_dot(x, y):
    assert len(x.shape) == 2  # matrix
    assert len(y.shape) == 1  # vector
    assert x.shape[1] == y.shape[0]
    z = np.zeros(x.shape[0])
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            z[i] += x[i, j] * y[j]
    return z

y = np.array([1, 2, 3])  # y.shape = (3,)
x = np.array(            # x.shape = (2, 3)
    [
        [1, 2, 3],
        [4, 5, 6],
    ]
)
print(naive_matrix_vector_dot(x, y))  # [14. 32.]
Tensor Reshaping
Reshaping a tensor means rearranging its rows and columns to match a
target shape.
Example: reshape from shape (3, 2) to (6, 1), then to (2, 3).
>>> x = np.array([[0, 1],
...               [2, 3],
...               [4, 5]])
>>> print(x.shape) #See above how shape(3,2)
(3, 2)
>>> x = x.reshape((6, 1))
>>> x
array([[ 0],
[ 1],
[ 2],
[ 3],
[ 4],
[ 5]])
>>> x = x.reshape((2, 3))
>>> x
array([[ 0, 1, 2],
[ 3, 4, 5]])
Transpose
Transposing exchanges rows and columns: shape (x, y) becomes (y, x).
>>> x = np.zeros((300, 20))
>>> x = np.transpose(x)
>>> print(x.shape)
(20, 300)
Tensor Terms
| Term | Meaning |
| Data type (dtype) | Data type of the values stored in the tensor, e.g. float32, uint8, float64. String tensors don't exist in NumPy (or in most other libraries), because tensors are preallocated contiguous memory segments and strings are variable length. |
| Axis / ndim / Rank / Dimension | Number of axes of the tensor. Examples: ndim 0 -> np.array(12) (a point has no axes); ndim 1 -> np.array([1, 2]); ndim 2 -> np.array([[5, 78, 2, 34, 0], [6, 79, 3, 35, 1], [7, 80, 4, 36, 2]]) (a 3x5 matrix). |
| Shape | How many elements the tensor has along each axis; the matrix example above has shape (3, 5), and the earlier 3-D tensor c has shape (2, 3, 3). |
Variance
Variance is the model's tendency to learn random things (noise) unrelated to the
real signal; high variance typically leads to overfitting.