In this post, we’ll talk about:
Why we need Mixture Models
How Gaussian Mixture Models (GMMs) work
Where GMMs are used in the real world
Let's say you have a large dataset on household grocery consumption across the US. For each household, you have information about income, location, age, and the number of family members. Let's assume that there's a statistical model that governs the underlying distribution, and you want to estimate what that model is. Why? Because you can then use it to predict the grocery consumption of a household that's not in the dataset.
To estimate the model, the simplest thing you can do is compute the mean and variance of this data. If you assume the data follows a single Gaussian, those two numbers fully specify the governing distribution. But what if there are several different groups within this data? A single mean and variance will blur them together. How do we detect the presence of subpopulations within an overall population? This is where Mixture Models come into the picture.
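To see why a single mean and variance can mislead, here is a minimal sketch using only the Python standard library. The two household groups and their spending figures are made up for illustration; they are not from any real dataset.

```python
import random
import statistics

random.seed(0)

# Hypothetical weekly grocery spend (in dollars) for two made-up subpopulations.
small_households = [random.gauss(80, 10) for _ in range(500)]
large_households = [random.gauss(250, 20) for _ in range(500)]
spend = small_households + large_households

# Fitting one Gaussian amounts to estimating a single mean and standard deviation.
mean = statistics.fmean(spend)
std = statistics.pstdev(spend)

# Count how many households actually sit close to the fitted mean.
near_mean = sum(1 for x in spend if abs(x - mean) < 20)

print(f"single-Gaussian fit: mean={mean:.1f}, std={std:.1f}")
print(f"households within $20 of the mean: {near_mean} of {len(spend)}")
```

The fitted mean lands in the gap between the two groups, where almost no household actually is. A single Gaussian would call that region the most likely one.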
What is a Mixture Model?
A Mixture Model is a probabilistic model consisting of a number of component distributions. Each component models one subpopulation, and the components are combined as a weighted sum, with weights that add up to 1, to give the overall model. This allows you to account for subpopulations within the larger population without needing to know in advance which subpopulation each data point belongs to.
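One way to build intuition is to think about how a Mixture Model would generate a data point: first pick a component according to its weight, then draw from that component. The sketch below assumes two hypothetical components (a Gaussian and a uniform distribution) with made-up weights; the helper name `sample_mixture` is purely illustrative.

```python
import random

def sample_mixture(weights, samplers, rng):
    """Draw one value: pick a component by weight, then sample from it."""
    component = rng.choices(range(len(samplers)), weights=weights, k=1)[0]
    return component, samplers[component](rng)

rng = random.Random(42)

# Mixing weights: the assumed share of each subpopulation (must sum to 1).
weights = [0.7, 0.3]

# Each component can be any distribution; these two are illustrative only.
samplers = [
    lambda r: r.gauss(80, 10),      # hypothetical subpopulation A
    lambda r: r.uniform(200, 300),  # hypothetical subpopulation B
]

draws = [sample_mixture(weights, samplers, rng) for _ in range(10_000)]
share_a = sum(1 for component, _ in draws if component == 0) / len(draws)
print(f"share of draws from component A: {share_a:.3f}")
```

In real data you only see the drawn values, not which component produced them. Recovering that hidden assignment is what fitting a Mixture Model is about.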
Let's say that you want to analyze the data from different countries in Africa. It's not sufficient to model it as a single big group. You need to understand what countries are inside Africa, where they are located, what their boundaries are, and what each country consists of. If you want to get a good representative model, you need to account for the multiple modes within this larger dataset.
How are Mixture Models used in practice?
One application of Mixture Models is to model the colors of an object in order to perform tasks such as color-based tracking and segmentation. These tasks can be made more robust by generating a Mixture Model corresponding to background colors in addition to a foreground model. You can then employ a simple Bayesian classifier to perform pixel classification. Mixture Models are useful in coping with slowly-varying lighting conditions as well.
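As a rough illustration of the foreground/background idea, the sketch below classifies a pixel by comparing posteriors under two color models. All model parameters (colors, variances, the foreground prior) are invented for the example, and the Gaussians use diagonal covariances to keep the code short.

```python
import math

def diag_gaussian_logpdf(x, mean, var):
    """Log-density of a Gaussian with diagonal covariance."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_loglik(x, components):
    """Log of the weighted sum of component densities (log-sum-exp for stability)."""
    terms = [math.log(w) + diag_gaussian_logpdf(x, m, v) for w, m, v in components]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

# Hypothetical color models: (weight, mean RGB, per-channel variance).
foreground = [(1.0, (200.0, 40.0, 40.0), (400.0, 400.0, 400.0))]  # a red object
background = [
    (0.6, (60.0, 120.0, 60.0), (900.0, 900.0, 900.0)),    # grass
    (0.4, (130.0, 170.0, 220.0), (900.0, 900.0, 900.0)),  # sky
]
PRIOR_FOREGROUND = 0.2  # assumed fraction of pixels that belong to the object

def classify_pixel(rgb):
    """Pick the class with the larger posterior (the shared evidence term cancels)."""
    fg = math.log(PRIOR_FOREGROUND) + gmm_loglik(rgb, foreground)
    bg = math.log(1.0 - PRIOR_FOREGROUND) + gmm_loglik(rgb, background)
    return "foreground" if fg > bg else "background"

print(classify_pixel((190, 50, 45)))   # a reddish pixel
print(classify_pixel((70, 115, 65)))   # a greenish pixel
```

In a real tracker the two models would be fitted to labeled pixels rather than written by hand, and could be refitted over time to follow gradual lighting changes.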
Mixture Models are semi-parametric models. What does that mean? It means that they have both parametric and nonparametric aspects: each component has a fixed parametric form, but the number of components can be adjusted to fit the data.
Parametric means that the model depends on a fixed set of parameters. It assumes that the data comes from a population that can be modeled by a probability distribution of a known form, such as a Gaussian.
Nonparametric means that the model doesn't have a fixed set of parameters. It doesn't assume a particular form for the distribution, and its complexity can grow with the amount of data. A histogram is a simple example.
Compared to a single parametric distribution, Mixture Models provide greater flexibility and precision in modeling the data. Compared to purely nonparametric methods such as histograms, they can smooth over the gaps resulting from sparse sample data.
What is a Gaussian Mixture Model?
A Gaussian Mixture Model (GMM) is a Mixture Model where the component functions are Gaussians. Didn't see that coming, did you? It's quite a shocker!
Once you fix the number of components, a GMM is a parametric model represented as a weighted sum of Gaussians. Its parameters are the weight, mean, and covariance of each component. GMMs are commonly used to model the distribution of continuous measurements in a system. The parameters are estimated from training data using the iterative Expectation-Maximization (EM) algorithm or Maximum A Posteriori (MAP) estimation from a well-trained prior model.
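To make the EM idea concrete, here is a minimal sketch for one-dimensional data using only the standard library. It is a bare-bones illustration, not a production implementation: the function name `fit_gmm_1d`, the initialization from the data's minimum and maximum, the fixed iteration count, and the synthetic data are all assumptions made for this example.

```python
import math
import random

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def fit_gmm_1d(data, initial_means, n_iter=50, min_var=1e-6):
    """Fit a 1-D GMM with EM, starting from the given initial means."""
    n, k = len(data), len(initial_means)
    means = list(initial_means)
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    variances = [overall_var] * k   # start every component wide
    weights = [1.0 / k] * k         # start with equal weights

    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            joint = [w * gaussian_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint) or 1e-300  # guard against underflow to zero
            resp.append([j / total for j in joint])

        # M-step: re-estimate each component from its weighted points.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj == 0.0:
                continue  # component received no mass; leave it unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, min_var)  # avoid collapsing to zero variance

    return weights, means, variances

# Synthetic data from two made-up subpopulations (60% and 40% of the points).
random.seed(1)
data = [random.gauss(80, 10) for _ in range(600)]
data += [random.gauss(250, 20) for _ in range(400)]

weights, means, variances = fit_gmm_1d(data, [min(data), max(data)])
for w, m, v in zip(weights, means, variances):
    print(f"weight={w:.2f}  mean={m:.1f}  std={math.sqrt(v):.1f}")
```

The E-step asks "given the current model, which component most plausibly produced each point?", and the M-step updates the model as if those soft assignments were correct. EM only finds a local optimum, so results can depend on the starting means.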
In terms of neural networks, GMMs can be viewed as a form of generalized radial basis function network. What's a radial basis function network? It's a neural network that uses radial basis functions as activation functions. In this view, each Gaussian component is a basis function, and the component priors (the mixture weights) act as weights in an output layer.
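The analogy can be sketched in a few lines: a "hidden layer" of Gaussian basis functions followed by a weighted sum. The centers, widths, and output weights below are hypothetical values chosen for illustration, and the function names are not from any particular library.

```python
import math

def gaussian_basis(x, center, width):
    """A normalized Gaussian basis function (a 1-D Gaussian density)."""
    z = (x - center) / width
    return math.exp(-0.5 * z * z) / (width * math.sqrt(2.0 * math.pi))

def rbf_forward(x, centers, widths, output_weights):
    """Hidden layer of Gaussian units, then a linear output layer."""
    hidden = [gaussian_basis(x, c, s) for c, s in zip(centers, widths)]
    return sum(w * h for w, h in zip(output_weights, hidden))

# Hypothetical network: the output weights play the role of component priors.
centers = [80.0, 250.0]
widths = [10.0, 20.0]
output_weights = [0.6, 0.4]

# Because the weights sum to 1 and each basis function is a density,
# the network output is itself a density: it integrates to about 1.
step = 0.1
area = sum(
    rbf_forward(i * step, centers, widths, output_weights) * step
    for i in range(4001)
)
print(f"area under the output: {area:.4f}")
```

The difference from a general radial basis function network is the constraint on the output layer: in a GMM the weights are non-negative and sum to 1, which is what keeps the output a valid probability density.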
Where do we use GMMs?
GMMs are statistically mature methods for data modeling. We can use them for various applications:
Speaker Identification: You can identify the person who is talking without having to depend on the actual words that are being uttered.
Housing market: The prices of different houses vary in different neighborhoods across different cities. But the prices of specific types of houses tend to cluster around each other. GMMs can be used to understand how these clusters are formed.
Image Retrieval: GMMs of texture and color are built to retrieve images quickly from a huge database.
Facial Expression Models: GMMs are used to analyze facial expressions and build corresponding models for them.
Finance: GMMs are used to model market fluctuations and volatility.
Multimodal Biometric Verification: Using different parts of the body such as eyes, fingers, and voice to build a robust biometric verification system.
Where to go from here?
In this post, we discussed the concept of Gaussian Mixture Models. We talked about Mixture Models and how they're constructed. We discussed how GMMs are used in practice. They're a useful tool to have in your arsenal as you analyze data across different scenarios.