Hey reader, welcome to the š„ free edition š„ of my weekly newsletter. I write about ML concepts, how to build ML products, and how to thrive at work. You can learn more about me here. Feel free to send me your questions and Iām happy to offer my thoughts. Subscribe to this newsletter to receive it in your inbox every week.
Let's say you're building a machine learning tool that can look at a medical image and predict whether the patient has an illness. You have gathered a large number of images for your training dataset. And to gauge the accuracy of this tool, we count the number of times it's wrong on the test dataset.
There are two ways it can be wrong in its predictions:
The patient actually has an illness, but the ML tool predicts that they don't have an illness.
The patient actually doesn't have an illness, but the ML tool predicts that they have an illness.
In regular scenarios, both types of misclassifications are considered equally bad. But is that really true in this case? Clearly the first situation is way more dangerous than the second one. How do we build an ML model that takes it into consideration? This is where cost sensitive learning comes into the picture.
Why do we need it?
We need it in situations where one type of error is more expensive than another type of error. Such problems are called "cost sensitive" because we need to give different weightage to different types of errors when we train the model.
Let's say that we build a model using our regular methods and determine that the overall accuracy of the model is 90%. That means it misclassifies 10% of the time. Now what if out of that 10%, 9.9% is due to the ML tool missing out on the illness when the patient actually has it. The patient might incorrectly think that they're safe, when in fact they're not. This is dangerous.
Cost sensitive learning provides a framework to build a model that account for this.
How does cost sensitive learning address this issue?
Cost sensitive learning addresses this issue by allowing us to assign different weights to different types of misclassifications. It is closely related to imbalanced learning where we have to build classification models using datasets where we have way more samples of one kind vs the others.
For example, let's say that dataset has 9,000 images and 3 classes. In an ideal world, we'll have 3,000 images for each class so that the model can learn well. But what if we have 8,900 images for one class and 50 images each for the other two? This is a skewed dataset. The resulting ML model will think everything in the world belongs to the class with 8,900 images.
How does it work?
In traditional ML models, we aim to minimize the error during the training process. There are a range of functions available that can be used to compute the error on the training dataset. It is frequently referred to as loss function.
In cost sensitive learning, we assign a penalty to an incorrect prediction. It's referred to as the "cost". We assign different costs to different types of errors. In our earlier example, we assign higher cost to errors where the model misses out on an actual illness. So now the goal is to minimize the cost of an ML model on the training dataset.
It's very similar to how we address the problem of class imbalance in skewed datasets. You can see that class imbalance is addressed using cost sensitive learning.
Cost sensitivity can be applied to almost all classification models in ML. Many libraries come equipped with it. You can use the class weight argument in classifier methods to bring cost sensitivity into the mix. You can use this in neural networks as well.
How is it used in the real world?
In addition to medical diagnosis, cost sensitivity appears in various real world use cases.
Let's consider the example of bank loans. When banks issue loans to individuals, they need to assess whether they'll repay the loan. Banks deal with a large number of individuals. And they have to operate at speed if they want to remain competitive.
Manual reviews of millions of prospective customers will take a long time. So they use machine learning to assess the chances of an individual not paying back the loan. From the bank's point of view, it's more important for them to prevent bad loans. They are relatively okay missing out on giving a loan to a good customer. It's not as bad as someone defaulting on their loan. In this situation, their ML model needs to use cost sensitive learning.
Let's consider another example of assessing security threats. Government agencies use surveillance equipment to capture audio, video, and text data to assess security threats on various fronts. There's so much data that checking all of that manually is not possible. So they employ machine learning systems to parse through all that data and detect important threats.
From their point of view, it's more important for them to prevent catastrophic events. They are relatively okay if the system flags something as a red-alert threat and it turns out to be a low-level threat. But they definitely donāt want to miss a single red-alert threat. In this situation, their ML model needs to use cost sensitive learning.
Where to go from here?
In this post, we discussed the concept of cost sensitive learning. We looked at why we need cost sensitive learning methods and how it works in practice. We talked about how it can be infused into various classification algorithms in ML. We discussed real world use cases where we employ cost sensitive learning to achieve our goals. To get a better understanding of this concept, choose a problem where one type of error is more expensive than another type. And build an ML model using a Python library of your choice.
šš„ Two new episodes on the Infinite ML pod
Demetrios Brinkmann and Vishnu Rachakonda on how they built the world's largest online hub for production Machine Learning practitioners.
Duration: 41 mins
š§ Apple Podcasts
š§ Spotify
š§ Google Podcasts
What's new in ML: I talk about predicting battery lifetimes, fighting wildfires, rainfall mapping, world's largest publicly available machine learning hub, plant nutrient detected by AI.
Duration: 16 mins
š§ Apple Podcasts
š§ Spotify
š§ Google Podcasts
š Job Opportunities in AI
Check out this job board for the latest opportunities in AI. It features a list of open roles in Machine Learning, Data Science, Computer Vision, and NLP at startups and big tech.
šš»āāļø šš»āāļø How would you rate this weekās newsletter?
You can rate this newsletter to let me know what you think. Your feedback will help make it better.
Really interesting article. The way you have explained using the examples is really interesting. Looking forward for more such content