Hey reader, welcome to the free edition of my weekly newsletter. I write about ML concepts, how to build ML products, and how to thrive at work. You can learn more about me here. Feel free to send me your questions and I'm happy to offer my thoughts. Subscribe to this newsletter to receive it in your inbox every week.
Ensemble learning is one of the most powerful techniques in machine learning. Models built with ensemble learning are robust and perform well across a wide variety of situations. They're also relatively lightweight compared to large deep learning models, which comes in handy when you build and deploy them. But why do we need ensemble learning in the first place? And what makes it special?
Why do we need ensemble learning
Let's say you want to buy a wireless speaker. You want the sound quality to be great and the price to be reasonable. To do your research, you look online for deals and keep an eye out for discount offers. You ask your friends. And you walk into electronics stores to check out the options yourself. Once you've gathered all the information, you make a decision about which wireless speaker to buy.
Why did you gather information from multiple sources? Why not just go to a single place and buy the first speaker you come across?
Because we want to aggregate the knowledge from different sources and make the best decision. We don't want to be susceptible to the biases and errors of a single source. Ensemble learning does something similar in machine learning. It combines the decisions of multiple models to produce a better result, which improves the reliability and accuracy of the predictions.
What is an ensemble
An ensemble refers to a group of complementary parts that contribute towards a single goal. In machine learning models, we have many sources of error that cause accuracy and reliability to go down. Examples include noise, bias, and variance. Ensemble learning methods aim to minimize these errors.
In ensemble learning, we train multiple machine learning models and combine their outputs. Each of those models is called a base learner. Base learners are typically simple models that can make fast decisions. Each model by itself might not be strong, but the combined model ends up being much stronger. The result is improved stability, higher reliability, and more accurate predictions.
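To see why combining weak learners helps, here is a small back-of-the-envelope sketch. It assumes five base learners that are each correct 70% of the time and that make their mistakes independently of one another, which is a simplification real models rarely satisfy, but it captures the intuition:

```python
# A minimal sketch of why combining weak learners can help, assuming the
# base learners make independent errors (a strong simplification).
from math import comb

n = 5    # number of base learners
p = 0.7  # accuracy of each individual base learner

# Probability that a majority vote (3 or more out of 5) is correct.
ensemble_accuracy = sum(
    comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n // 2 + 1, n + 1)
)
print(f"Single learner: {p:.2f}, majority vote of {n}: {ensemble_accuracy:.2f}")
# Prints roughly 0.84 -- noticeably better than any single learner on its own.
```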
How does ensemble learning combine those base learners
There are multiple techniques for creating an ensemble learning model. The goal is always the same: combine the base learners in a way that minimizes the overall error.
There are 4 main ways to combine these base learners (a short scikit-learn sketch follows the list):
Maximum voting: Let's consider the wireless speaker example. If most of your sources recommend Sonos, then you end up buying the Sonos speaker. This is maximum voting: the ensemble learning model outputs whichever result receives the most votes from the base learners.
Weighted averaging: Some of your sources say Sonos, some say Bose, and some say JBL. Now what do you do? You look at who made those recommendations. If someone with a lot of knowledge in electronics has made a recommendation, you probably want to give it more weight than recommendations from people who don't know much about electronics. Once you tally it up, you can decide what to buy. This is called weighted averaging. The ensemble learning model outputs the result based on the weighted average of the base learners' predictions.
Bagging: This lets each base learner train on a different random subset of the training dataset, in parallel. The goal is to reduce the variance that some machine learning models are sensitive to. Because each learner sees a different slice of the data, their individual errors tend to cancel out when their predictions are combined, which makes the overall outcome more robust. Randomly selecting the training subset for each model also reduces the risk of overfitting. Random Forest is perhaps the most popular algorithm that uses bagging.
Boosting: While bagging trains the models in parallel, boosting trains them sequentially. Each base learner has its own strengths, and the goal of boosting is to combine those strengths through iteration. Let's say we train one base learner on the dataset. We compare its predictions to the ground truth to find out which cases were classified incorrectly. Those cases are given higher weight when we train the next base learner. This allows the mistakes to be corrected by subsequent base learners, resulting in a robust overall model. Gradient Boosting is perhaps the most popular algorithm that uses boosting.
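Here is a minimal sketch of all four strategies using scikit-learn. The dataset is synthetic, and the specific base learners, weights, and hyperparameters are illustrative choices, not recommendations:

```python
# A minimal sketch of the four combination strategies using scikit-learn.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import (
    VotingClassifier,
    RandomForestClassifier,
    GradientBoostingClassifier,
)

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

base_learners = [
    ("lr", LogisticRegression(max_iter=1000)),
    ("knn", KNeighborsClassifier()),
    ("dt", DecisionTreeClassifier()),
]

models = {
    # Maximum voting: each base learner gets one vote, the majority wins.
    "max_voting": VotingClassifier(estimators=base_learners, voting="hard"),
    # Weighted averaging: average predicted probabilities, trusting some learners more.
    "weighted_avg": VotingClassifier(
        estimators=base_learners, voting="soft", weights=[2, 1, 1]
    ),
    # Bagging: many trees trained in parallel on random subsets (Random Forest).
    "bagging": RandomForestClassifier(n_estimators=100, random_state=42),
    # Boosting: trees trained sequentially, each focusing on the previous ones' mistakes.
    "boosting": GradientBoostingClassifier(n_estimators=100, random_state=42),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    print(f"{name}: {model.score(X_test, y_test):.3f}")
```

Hard voting corresponds to maximum voting, soft voting with weights corresponds to weighted averaging, and Random Forest and Gradient Boosting are the standard bagging and boosting implementations in scikit-learn.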
Where to go from here
The use of ensemble learning has grown tremendously in recent years, thanks to the increase in available computing power. It is used for face recognition, cybersecurity, image analysis, financial fraud detection, healthcare, and more. Ensemble methods are available in all the major machine learning libraries. You can start with scikit-learn and get familiar with them by building your own models.
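If you want a concrete starting point, the sketch below compares a single decision tree against a Random Forest on one of scikit-learn's built-in datasets. The dataset and hyperparameters are just placeholders for whatever problem you want to try:

```python
# A small starting point: a single decision tree vs. a Random Forest ensemble,
# compared with 5-fold cross-validation on a built-in dataset.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

tree = DecisionTreeClassifier(random_state=42)
forest = RandomForestClassifier(n_estimators=200, random_state=42)

print(f"Single tree  : {cross_val_score(tree, X, y, cv=5).mean():.3f}")
print(f"Random Forest: {cross_val_score(forest, X, y, cv=5).mean():.3f}")
```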
New podcast episode
Infinite Machine Learning is a podcast that features conversations with amazing leaders in this field and explores how they built their careers. I invited Alexey Grigorev for the debut episode. He is the founder of DataTalks.Club and an author of multiple ML books. You can listen to the episode on Apple Podcasts, Google Podcasts, or Spotify.
Featured job opportunities
Check out Infinite AI Job Board for the latest job opportunities in AI. It features a list of open roles in Machine Learning, Data Science, Computer Vision, and NLP at startups and big tech.
How would you rate this week's newsletter?
You can rate this newsletter to let me know what you think. Your feedback will help make it better.