Artificial Intelligence, Machine Learning, Deep Learning

AI (originated ~1950s): simulated intelligence in machines. Objective: building machines that can think like humans.
ML (originated ~1960s): machines making decisions without being explicitly programmed. Objective: algorithms that can learn from data.
Deep Learning (originated ~1970s): using neural networks to solve complex problems. Objective: neural networks that identify patterns.

Activation Function

Relu?
ReLU is an elementwise operation (shown here on a 2D tensor): if a value is less than 0, take 0; otherwise keep the original value.

# Means max(x, 0), applied elementwise. Pseudocode:
#
# def relu(x):
#     for each element in x:
#         if element < 0: replace it with 0
#         else:           keep the element as it is

import numpy as np

def naive_relu(x):                      # x is a 2D tensor
    assert len(x.shape) == 2

    x = x.copy()                        # copy to avoid changing the input
    for i in range(x.shape[0]):         # x.shape[0] = number of rows (2 below)
        for j in range(x.shape[1]):     # x.shape[1] = number of columns (3 below)
            x[i, j] = max(x[i, j], 0)
    return x

a = np.array([              # ndim=2, shape=(2, 3)
            [0, 1, -2],
            [4, 5, -6],
        ])
b = naive_relu(a)
print(b)
# [[0 1 0]
#  [4 5 0]]
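
In practice, the same elementwise operation is a one-liner with NumPy's vectorized np.maximum:

import numpy as np

a = np.array([[0, 1, -2],
              [4, 5, -6]])
print(np.maximum(a, 0))     # elementwise max with 0 == ReLU
# [[0 1 0]
#  [4 5 0]]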

        
Adding an activation function to a layer introduces non-linearity into the model, allowing it to learn more complex relationships between the input and output data.
Non-linear activation functions such as ReLU, Sigmoid, and Tanh help the model better fit the training data and make more accurate predictions on new data.

Conda

Miniconda is the recommended approach for installing TensorFlow with GPU support.
It creates a separate environment, so nothing already installed on your system is changed. This is also the easiest way to install the required software, especially for the GPU setup.
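
A minimal sketch of that workflow, assuming Miniconda is already installed (the environment name tf_env is just an example):

conda create -n tf_env python=3.9    # create an isolated environment
conda activate tf_env                # switch into it
pip install tensorflow               # installs inside the env only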

Optimizers

An optimizer is an algorithm used to adjust the parameters of a model in order to minimize the error (loss) function.

1. RMSprop (Root Mean Square Propagation): divides the learning rate for a weight by a running average of the magnitudes of recent gradients for that weight.

from keras.optimizers import RMSprop
optimizer = RMSprop(learning_rate=0.001, rho=0.9)

2. Stochastic Gradient Descent (SGD): adjusts the model parameters based on the (average) gradient of the loss.

from keras.optimizers import SGD
optimizer = SGD(learning_rate=0.01, momentum=0.9)
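
The optimizer object is then passed to model.compile(); a minimal sketch with a tiny stand-in model:

from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD

# Minimal model just to show where the optimizer object goes
model = Sequential([Dense(1, input_shape=(3,))])
model.compile(optimizer=SGD(learning_rate=0.01, momentum=0.9), loss='mse')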
                

Overfitting

Means that the model performs well on the training data but does not generalize well (i.e., does not produce good results on real-world/unseen data), because there is too much unnecessary data (noise) in the training data.
Regularization: constraining a model to make it simpler and reduce the risk of overfitting.
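
A minimal sketch of two common Keras regularizers, an L2 weight penalty and Dropout (the layer sizes here are arbitrary):

from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras import regularizers

model = Sequential([
    # L2 penalty constrains the weights; Dropout randomly zeroes activations
    Dense(16, activation='relu', input_shape=(4,),
          kernel_regularizer=regularizers.l2(0.01)),
    Dropout(0.5),
    Dense(1, activation='sigmoid'),
])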

Bias/Sampling Bias

We should use a training data set that is representative of the cases we want the model to predict.
If the sample is too small, you will have sampling noise (i.e., nonrepresentative data as a result of chance), but even very large samples can be nonrepresentative if the sampling method is flawed. This is called sampling bias.

CNTK

This is the Microsoft Cognitive Toolkit (CNTK), a backend that can be plugged into Keras.

Keras

A Python library that provides functions/APIs to build deep-learning models. Different backends can be plugged into Keras:

            Keras
              |
    TensorFlow / Theano / CNTK
        |            |
      CUDA       BLAS, Eigen
        |            |
       GPU          CPU
        

Large language model

This is a computational model that can perform natural language processing tasks such as classification.
LLMs learn language modeling through self-supervised and semi-supervised training.
LLMs can be used for text generation, a form of generative AI: given an input text, they repeatedly predict the next token.

Llama (Large Language Model Meta AI)

A family of LLMs released by Meta AI starting in February 2023.
Llama models:
Llama 2 (released Jul 2023): trained model sizes of 7, 13, and 70 billion parameters; fine-tuned for dialogue compared to Llama 1.
Llama 3 (released Apr 2024): 8B and 70B parameters; a 400B+ parameter model was still in training at release.

Layer / class or Function

A layer processes input data (a tensor) and produces an output (tensor) in a specific format. A neural network is created by cascading multiple layers.
Types of layers (the Keras sketch below shows all three):
1. Dense / fully connected layer: each neuron is connected to every neuron in the previous layer. Used for image classification, regression, and more.
2. Convolutional layer: uses convolution operations to detect local patterns in the input data. Used for image classification, object detection, image segmentation, spatial hierarchies.
3. Recurrent layer: processes sequential data, where the order of the input matters. Used for natural language processing (NLP), time-series analysis, and speech recognition.
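
In Keras these three layer types map directly onto classes; a minimal sketch (the layer sizes are arbitrary):

from keras.layers import Dense, Conv2D, LSTM

dense = Dense(64, activation='relu')           # fully connected, 64 neurons
conv = Conv2D(32, (3, 3), activation='relu')   # 32 filters of size 3x3
recurrent = LSTM(32)                           # recurrent layer with 32 units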

Learning Types

Summary (details in sections A-E below):
1. Supervised learning: the training data fed to the algorithm includes the desired solutions (called labels). Types: classification (yes/no), regression (a numeric value), logistic regression (yes/no with a probability, e.g. a 20% chance of being spam). Algorithms: k-Nearest Neighbors, Linear Regression, Logistic Regression, Support Vector Machines (SVMs), Decision Trees & Random Forests, neural networks.
2. Unsupervised learning: the dataset does not have labels; the model tries to learn without a teacher. Types: clustering (k-Means, Hierarchical Cluster Analysis (HCA), Expectation Maximization); visualization and dimensionality reduction (PCA, Kernel PCA, LLE, t-SNE), which plots unlabeled data on a 2-D or 3-D plane; association rule learning (Apriori, Eclat), which outputs relations between attributes.
3. Semisupervised learning: a lot of unlabeled data and a little labeled data.
4. Reinforcement learning: an agent (AI program) observes the environment, selects and performs actions, and gets rewards in return.

A. Supervised Learning

The training data fed to the algorithm includes the desired solutions (called labels). Supervised learning algorithms:
k-Nearest Neighbors,
Linear Regression,
Logistic Regression,
Support Vector Machines (SVMs),
Decision Trees & Random Forests,
Neural networks

Types of Supervised learning

1. Classification (gives yes/no)
a. Spam filtering: the algorithm is trained with many example emails along with their class (spam or not), and it must learn how to classify new emails. Each email has a label.
Types of classification models:
1. Binary classification:
  The model outputs a value from a class that contains only two values, for example rain or no rain.
2. Multiclass classification:
  Outputs a value from a class that contains more than 2 values, e.g. rain, hail, snow, or sleet.
2. Regression (gives a % or numeric value)
Predictors (predict something):
Predict the weather based on inputs; predict the price of a car given some inputs (mileage, age, brand, etc.).
3. Logistic regression (yes/no with %)
Mix of classification & regression. Example: a 20% chance of being spam. (See the sketch below.)
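
A minimal sketch of the yes/no-with-probability idea, assuming scikit-learn is installed (the toy data is made up):

from sklearn.linear_model import LogisticRegression

# Toy labeled data: two features per sample, label 0 or 1
X = [[0, 0], [1, 1], [2, 2], [3, 3]]
y = [0, 0, 1, 1]

clf = LogisticRegression().fit(X, y)
print(clf.predict([[1.5, 1.5]]))        # hard class label (yes/no)
print(clf.predict_proba([[1.5, 1.5]]))  # class probabilities ("20% spam")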

B. Unsupervised Learning

The dataset does not have labels. The model has to find structure in the data on its own and create its own rules.

Types of unsupervised learning

1. Clustering
The model finds data points that form natural groupings.
Algorithms used: k-Means, Hierarchical Cluster Analysis (HCA), Expectation Maximization (see the k-Means sketch after this list)
2. Visualization and dimensionality reduction
With unlabeled data, these algorithms produce output that can be plotted on a 2-D or 3-D plane.
Algorithms used:
Principal Component Analysis (PCA), Kernel PCA, Locally-Linear Embedding (LLE), t-distributed Stochastic Neighbor Embedding (t-SNE)
3. Association rule learning
Produces output describing relations between attributes.
Algorithms used: Apriori, Eclat
Examples:
1. Supermarket data analysis: suppose you own a supermarket. Sales logs may reveal that people who purchase sauce and potato chips also buy bread, so you may want to place these items close to each other.
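
A minimal k-Means sketch, assuming scikit-learn is installed (the points are made up):

from sklearn.cluster import KMeans

# Four unlabeled 2-D points forming two obvious groups
X = [[1, 1], [1.5, 2], [8, 8], [8.5, 9]]
labels = KMeans(n_clusters=2, n_init=10).fit_predict(X)
print(labels)   # e.g. [0 0 1 1]: cluster assignments found without any labels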

C. Semisupervised Learning

A lot of unlabeled data and a little labeled data.
Algorithms: Deep Belief Networks (DBNs)
Examples: 1. Facebook photos: when we upload photos, we label a few and leave the others; the system then identifies people in the remaining photos.

D. Reinforcement learning

An agent (AI program) can observe the environment, select and perform actions, and get rewards/punishments in return.

E. Generative AI

A class of models that creates content from user input. For example, generative AI can create unique images, music, etc.
Example:
Text-to-text
Text-to-image
Text-to-video
Text-to-code
Text-to-speech
Image and text-to-image

Loss Function

In Keras, a loss function is used during the training of a neural network. It measures the difference between the model's predictions and the actual target values. The goal of training is to minimize this loss.

            Loss = f(predicted output, expected output)
            e.g. MSE = mean((predicted - expected)^2)
Types:
1. Categorical crossentropy (categorical_crossentropy): used in multi-class classification problems when the target variable is one-hot encoded.
   model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
2. Binary crossentropy (binary_crossentropy): used for binary classification problems, where the target variable is binary (0 or 1).
   model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
3. Mean squared error (mean_squared_error or mse): measures the average squared difference between the true and predicted values.
   model.compile(optimizer='adam', loss='mean_squared_error')
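
For intuition, mean squared error can be computed by hand with NumPy:

import numpy as np

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.5, 2.0, 2.0])
mse = np.mean((y_pred - y_true) ** 2)   # average squared difference
print(mse)                              # (0.25 + 0.0 + 1.0) / 3 = 0.4166...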
to_categorical
Function to convert labels into a one-hot encoded format.
One-hot encoding converts categorical labels into a binary matrix (of 1s and 0s).

from keras.utils import to_categorical
train_labels = [0, 1, 2, 0, 1]
train_labels_one_hot = to_categorical(train_labels) # Convert to one-hot encoding
print(train_labels_one_hot)

array([[1., 0., 0.],                # represents 0
       [0., 1., 0.],                # 1
       [0., 0., 1.],                # 2
       [1., 0., 0.],                # 0
       [0., 1., 0.]], dtype=float32)# 1

        

Matplotlib

A popular plotting library for Python that provides a variety of high-quality 2D and 3D plots and visualizations.
matplotlib.pyplot is a collection of functions that make Matplotlib work like MATLAB, allowing you to create plots and charts.
imshow(digit, cmap=plt.cm.binary) displays the image stored in the digit array; cmap is the colormap for mapping the data values to colors in the plot, and plt.cm.binary means black-and-white colors.
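
A minimal sketch; the random array stands in for a real digit image:

import numpy as np
import matplotlib.pyplot as plt

digit = np.random.randint(0, 256, size=(28, 28))   # stand-in for an MNIST digit
plt.imshow(digit, cmap=plt.cm.binary)              # values mapped to b/w colors
plt.show()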

Metrics

Metrics are quantities monitored during training and testing, e.g. accuracy; unlike the loss, they are not used to update the model's weights.

Neuron / Node / Function or class

A neuron is the basic unit within a layer. It takes input, performs a computation, and produces an output.
Each neuron has weights, a bias, and an activation function.
weight
Neurons receive input signals, and each input is associated with a weight. These weights represent the strength of the connection between the input and the neuron.

import random

def relu(x):
    # ReLU activation: max(x, 0)
    return max(x, 0.0)

class Neuron:
    def __init__(self, num_inputs):
        # Simple initialization: small random weights and a zero bias
        self.weights = [random.uniform(-1, 1) for _ in range(num_inputs)]
        self.bias = 0.0
        self.activation_function = relu

    def forward(self, input_data):
        # Compute the weighted sum of inputs
        weighted_sum = sum(weight * input_value for weight, input_value in zip(self.weights, input_data)) + self.bias
        
        # Apply the activation function
        output = self.activation_function(weighted_sum)
        
        return output        
        

Neural Network

It tries to emulate the human brain, combining computer science and statistics to solve common problems in the field of AI.
It contains an input layer, one or more hidden layers, and an output layer.
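
A minimal Keras sketch of that input / hidden / output structure (the sizes are arbitrary):

from keras.models import Sequential
from keras.layers import Dense

model = Sequential([
    Dense(16, activation='relu', input_shape=(4,)),  # hidden layer; input has 4 features
    Dense(8, activation='relu'),                     # second hidden layer
    Dense(1, activation='sigmoid'),                  # output layer
])
model.summary()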

Tensor = n-D Matrix

This is a matrix (as in maths) generalized to n dimensions: multi-dimensional NumPy arrays used to store numbers during computation.

Types of Tensors

Each tensor has a rank (also called dimension, axis count, or ndim), a shape (the number of elements along each axis), a typical meaning, and a Keras layer type that processes it:

Rank 0: Scalar. Example: 0. Shape: ().
Rank 1: Vector. Example: [1, 2, 3, 4]. Shape: (4,), since there are 4 elements in one direction.
Rank 2: Matrix / 2D tensor. Represents samples; processed by Dense layers. Shape: (2, 3), since there are 2 elements in one direction and 3 in the other:

| 1 2 3 |
| 4 5 6 |

Rank 3: 3D tensor. Represents timestamped data; processed by recurrent layers (e.g. an LSTM layer). Shape: (2, 2, 3):

[
    | 1 2 3 |
    | 4 5 6 |,

    | 1 2 3 |
    | 4 5 6 |
]

Rank 4: 4D tensor. 3D tensors packed together (typically image data); processed by 2D convolution layers (Conv2D).

            import numpy as np
            ########## 2-D Tensor ###########
            b = np.array(
                [
                    [0, 1, 2, 3],
                    [4, 5, 6, 7],
                    [8, 9, 10, 11],
                ]
            )
            print("Dimension/Ndim:", b.ndim)        # 2         //2d array
            print("Shape:", b.shape)                # (3, 4)    //(row,col)
            
            ########## 3-D Tensor, Packing 2-D matrices ###########
            c = np.array(
                [
                    [
                        [0, 1, 2],
                        [4, 5, 6],
                        [8, 9, 10],
                    ],
                    [
                        [10, 11, 12],
                        [14, 15, 16],
                        [18, 19, 110],
                    ]
                ]
            )
            print("Dimension/Ndim:", c.ndim)    # 3         //3d array
            print("Shape:", c.shape)            # (2,3,3)   //(2=arrays, 3=row, 3=col)

            ####### Operations ############
                #all,row,col
            d = c[:, 2:, 2:]            # Select all elements 2nd(row), 2nd(col) onwards.
            print(d)                    #   [[[ 10]]  [[110]]]
            

Tensor Operations

Add: vector (ndim=1) + matrix (ndim=2)

[1 2 3] + | 1 2 3 | = | 2 4 6 |
          | 4 5 6 |   | 5 7 9 |

import numpy as np
def naive_add_matrix_and_vector(x, y):
    assert len(x.shape) == 2    #Matrix
    assert len(y.shape) == 1    #vector
    assert x.shape[1] == y.shape[0]

    x = x.copy()
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            x[i, j] += y[j]
    return x

x = np.array(
        [
            [1,2,3],
            [4,5,6]
        ]
)
y = np.array([1,2,3])
z = naive_add_matrix_and_vector(x,y)
print(z)
'''
[[2,4,6]
[5,7,9]]
'''
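
In practice NumPy broadcasts automatically, so the explicit loops are unnecessary; reusing x and y from above:

print(x + y)    # NumPy broadcasts the vector across each row
# [[2 4 6]
#  [5 7 9]]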
            
Dot Product (.)
vector(1D) . vector(1D) = scalar
matrix(2D) . vector(1D) = vector

[1,2,3].[2,3,4] = 1*2+2*3+3*4 = 20.0

import numpy as np
def naive_vector_dot(x, y):
    assert len(x.shape) == 1 and len(y.shape) == 1   # both inputs are vectors
    assert x.shape[0] == y.shape[0]                  # of the same length
    z = 0.                                           # float accumulator
    for i in range(x.shape[0]):
        z += x[i] * y[i]
    return z

x = np.array([1,2,3])
y = np.array([2,3,4])
z = naive_vector_dot(x, y)
print(z)        #20.0    
                        

| 1 2 3 | . [1 2 3] = [1*1 + 2*2 + 3*3, 4*1 + 5*2 + 6*3]
| 4 5 6 |           = [14.0, 32.0]
import numpy as np
def naive_matrix_vector_dot(x, y):
    assert len(x.shape) == 2    #matrix
    assert len(y.shape) == 1    #vector
    assert x.shape[1] == y.shape[0]
    z = np.zeros(x.shape[0])
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            z[i] += x[i, j] * y[j]
    return z

y = np.array([1,2,3])       #y.shape = (3)
x = np.array(               #x.shape = (2,3) 
                [
                    [1,2,3],
                    [4,5,6]
                ]
            )
print(naive_matrix_vector_dot(x,y))
[14. 32.]
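
NumPy's built-in dot product gives the same result without the loops; reusing x and y from above:

print(np.dot(x, y))   # [14 32]
print(x @ y)          # the @ operator is equivalent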
                        
Tensor Reshaping
Reshaping a tensor means rearranging its rows and columns to match a target shape.
Shape (3, 2) to (6, 1) to (2, 3):

>>> x = np.array([[0, 1],
                  [2, 3],
                  [4, 5]])
>>> print(x.shape)      #See above how shape(3,2)
(3, 2)
>>> x = x.reshape((6, 1))
>>> x
array([[ 0],
       [ 1],
       [ 2],
       [ 3],
       [ 4],
       [ 5]])
>>> x = x.reshape((2, 3))
>>> x
array([[ 0, 1, 2],
       [ 3, 4, 5]])
        
Transpose
Changes shape (x, y) to (y, x):

>>> x = np.zeros((300, 20))
>>> x = np.transpose(x)
>>> print(x.shape)
(20, 300)
        

Tensor Terms

Term: Meaning
Data type (dtype): the type of the data stored in the tensor, e.g. float32, uint8, float64.
String tensors don't exist in NumPy (or in most other libraries), because tensors are preallocated contiguous memory segments and strings are variable-length.
Axis/ndim/rank/dimension: the number of axes of the tensor.
Dimension/Axis   Array
    0           np.array(12)                        # a point has no dimensions
    1           np.array([1, 2])                    # shape (2,), 1-dimensional
    2           np.array([[5, 78, 2, 34, 0],        # shape (3, 5), 2-dimensional
                          [6, 79, 3, 35, 1],
                          [7, 80, 4, 36, 2]])

Shape: tells how many elements the tensor has along each axis.
The matrix example above has shape (3, 5); the 3-D tensor example earlier has shape (2, 3, 3).

Tensorflow

An open-source ML library (exposing APIs) for numerical computation and large-scale machine learning; supports CPUs & GPUs.
Python front-end APIs; the backend is written in C++ for high performance.
            //Install conda https://docs.conda.io/projects/miniconda/en/latest/
            C:\Users\amitk\source\repos\Python> mkdir venv_ml1
            C:\Users\amitk\source\repos\Python> cd venv_ml1
            C:\Users\amitk\source\repos\Python\venv_ml1>"c:\Users\amitk\miniconda3\Scripts\activate" venv_ml1
            //Env is created here: C:\Users\amitk\miniconda3\envs
            (venv_ml1) C:\Users\amitk\source\repos\Python\venv_ml1>
            (venv_ml1) C:\Users\amitk\source\repos\Python\venv_ml1>"c:\Users\amitk\miniconda3\condabin\deactivate.bat"  //deactivate
            (venv_ml1) C:\Users\amitk\source\repos\Python\venv_ml1>pip install tensorflow
            Downloading tensorflow-2.6.2-cp36-cp36m-win_amd64.whl (423.3 MB)
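
Once installed, a quick check that TensorFlow imports and sees the hardware:

import tensorflow as tf
print(tf.__version__)                            # e.g. 2.6.2
print(tf.config.list_physical_devices('GPU'))    # empty list means CPU only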
        

Underfitting

The model does not produce good results even on the training data (it is too simple to capture the underlying patterns).

Variance

Variance is the tendency to learn random things unrelated to the real signal.