Day 2: Deep dive into Supervised Learning

After an eventful day one, I found myself tossing and turning all night, my mind buzzing with a ton of information to process. Who needs sleep when there's so much to learn, right?

After getting an overview of machine learning in general, and the three concepts it's divided into (Supervised, Unsupervised and Reinforcement learning), it's time to buckle up! We're diving headfirst into the world of supervised learning, the superhero that will help us conquer image classification!

When do we use Supervised Learning?

Short explanation: Whenever we have "labelled" data, i.e. we already know the answer to every question we push into our algorithm.
Supervised Learning is like a teacher who gives a student a bunch of questions to which the teacher already knows the answers. The teacher expects the student to use their skills and knowledge to answer them. If the student gets one wrong, the teacher corrects them and they try again. Supervised Learning works the same way.
We know the answers to our training questions, and we build an algorithm that learns from them so that, in the future, it can predict the correct answer for questions it has never seen.
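
To make the teacher analogy a bit more tangible for myself, here is a minimal sketch in plain Python. It uses a perceptron, which is just one simple learning rule I picked for illustration, not necessarily the one I'll end up with. The two features (ear pointiness, snout length) and all the numbers are made up; real images would of course need far more than two numbers each.

```python
# Toy labelled data: (features, label). The features are invented numbers
# (ear pointiness, snout length); label 1 = cat, label 0 = dog.
labelled_data = [
    ((0.9, 0.2), 1),
    ((0.8, 0.3), 1),
    ((0.7, 0.1), 1),
    ((0.2, 0.9), 0),
    ((0.3, 0.8), 0),
    ((0.1, 0.7), 0),
]


def predict(weights, bias, features):
    """Return 1 (cat) if the weighted sum is positive, else 0 (dog)."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0


def train(data, epochs=20, learning_rate=0.1):
    """Perceptron rule: the 'teacher' corrects the weights only on wrong answers."""
    weights = [0.0] * len(data[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in data:
            # error is 0 when the answer was right, +1 or -1 when it was wrong
            error = label - predict(weights, bias, features)
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias += learning_rate * error
    return weights, bias


weights, bias = train(labelled_data)
# Ask about an example the algorithm has never seen before
print(predict(weights, bias, (0.85, 0.15)))
```

The important part is the last line: the labels are only needed during training, and afterwards the algorithm has to answer on its own.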

Use case: Image classification

That's the stuff that I want to implement!
In my case, I want to crack the code of image classification. I want to feed my algorithm tons of adorable kitty pictures, the kind that makes millions of people go "aww" on Instagram. Sure, I know they're cats, but I need my algorithm to look at any picture and confidently declare, "Hey, that's a cat!" or dismissively say, "Nah, it's a dog."
To understand the fundamentals of image classification, we need to take a look at the two methods within Supervised Learning: Classification and Regression.

Supervised Learning Method 1: Classification

Classification is the type that is used whenever we want to assign data to specific categories, just like in our example with the cute cats on Instagram.
Whenever the algorithm sees a picture, it should tell us whether it's a cat or not.
This is called binary classification because there are only two categories: "yes" and "no" or, as we nerds prefer to say, "1" and "0".
However, we can have a larger set of categories if, instead of just saying "yes" or "no", we want to determine which category we are dealing with. This is called "Multiclass classification": we have multiple classes/categories such as cat, dog, elephant, mouse etc.
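
Here is how I picture the difference between the two in code. This is only a sketch: the scores are invented, and I'm assuming some model (not shown here) has already produced them. The function names and the 0.5 threshold are my own choices for illustration.

```python
CLASSES = ["cat", "dog", "elephant", "mouse"]


def binary_decision(cat_score, threshold=0.5):
    """Binary classification: one score, two possible answers (1 = cat, 0 = not a cat)."""
    return 1 if cat_score >= threshold else 0


def multiclass_decision(scores):
    """Multiclass classification: one score per class, pick the class with the highest."""
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return CLASSES[best_index]


print(binary_decision(0.8))                        # score above the threshold
print(multiclass_decision([0.1, 0.2, 0.6, 0.1]))   # third class has the top score
```

So binary classification boils down to one yes/no decision, while multiclass classification picks one winner out of several candidates.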

Mathematical foundations

Now, let's talk math. I know, it's not the prettiest sight, but trust me, we'll unravel its beauty together. Classification has its own set of preferred algorithms, and guess what? They're mostly rooted in good old statistics (yeah, we've had our moments). Feast your eyes on these typical contenders for classification:

Naive Bayes
Decision Trees
Support Vector Machines
Random Forest
K-Nearest Neighbours (KNN)

I'm not sure yet which one to use, but from the various papers I have glanced at, K-Nearest Neighbours (KNN) is likely going to be the one.
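
Since KNN is my current favourite, here is a rough sketch of the idea as I understand it so far: to classify a new example, look at the k closest labelled examples and let them vote. The toy features and the choice of k=3 are assumptions for illustration, and I'm using plain Euclidean distance, which may not be what the papers end up recommending for images.

```python
import math
from collections import Counter

# Invented toy data: (ear pointiness, snout length) -> label
training_data = [
    ((0.9, 0.2), "cat"),
    ((0.8, 0.3), "cat"),
    ((0.7, 0.1), "cat"),
    ((0.2, 0.9), "dog"),
    ((0.3, 0.8), "dog"),
    ((0.1, 0.7), "dog"),
]


def knn_predict(data, query, k=3):
    """Return the majority label among the k training points closest to query."""
    # Sort all labelled points by their Euclidean distance to the query
    neighbours = sorted(data, key=lambda item: math.dist(item[0], query))[:k]
    # Count the labels of the k nearest points and take the most common one
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


print(knn_predict(training_data, (0.85, 0.2)))
```

What I like about it: there is no real "training" step, the labelled data itself is the model.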

Supervised Learning Method 2: Regression

I will make the careful assumption that this method is the more "spicy" one, as it's often used where the big money is involved, i.e. in stock market predictions, revenue forecasts etc.
In contrast to the classification algorithms, we don't deal with yes or no, cats, dogs or elephants. No. We deal in real, continuous values: weights, prices, salaries, speeds, whatever.

Like with Classification, Regression has some juicy algorithms for us:

Simple Linear Regression
Support Vector Regression
Decision Trees
Lasso Regression
Neural Network Regression
Ridge Regression
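
To get a feeling for what "predicting a continuous value" means, here is a small sketch of Simple Linear Regression, the first entry in the list above, fitted with ordinary least squares. The apartment sizes and prices are made-up numbers, and fit_line is just a helper name I chose.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, and how much x varies on its own
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented data: apartment size in square metres -> price in thousands
sizes = [50, 60, 80, 100]
prices = [150, 180, 240, 300]

slope, intercept = fit_line(sizes, prices)
# Predict the price for a size that is not in the data
print(slope * 70 + intercept)
```

Unlike the classifiers, the answer here is not a category but a number on a continuous scale.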

Since I'm still doing it "diary" style and not purely educational, I won't explain each and every one specifically but rather try to take you along my thinking process.
At this point, it's time to study the mathematical foundations, i.e. pick one algorithm at a time, learn it, understand it, implement it, fail, try again, and succeed.

Next steps: Find, read and understand the papers

Next stop: Finding that perfect paper! The one that unlocks the secrets of image classification, featuring those irresistibly cute kitties. Fear not, guys, for I have the mighty ChatGPT by my side. Together, we shall scour the vast expanse of knowledge, searching for a selection of suitable papers that cover all the required topics, such as image classification, the mathematical foundations, and metrics for the algorithms.

In my next article, I'm going to discuss my findings, provided I understand any of them, and we'll see if we can figure it out together.
Until then, good night and stay tuned ;)

DDD. Dennis' Dev Diaries