What Is Geometric Deep Learning
What it's all about. Why it's important. How it's used in the real world.
Deep Learning has turbocharged the Machine Learning domain. The neural network architecture allows Deep Learning models to solve a variety of real-world problems, including recognizing faces, understanding text, and identifying speech. It works great on such 1D and 2D data. But the real world is 3D. It has graphs and networks. Does Deep Learning work in these situations?
Why do we need Geometric Deep Learning?
Deep Learning algorithms need Euclidean data to work. This includes data in the 1D and 2D domains such as text, images, audio, sensor data, and so on. Here are a couple of examples of Deep Learning models:
Convolutional Neural Networks explicitly account for the 2D nature of image data.
Recurrent Neural Networks explicitly account for the sequential nature of audio/sensor data.
Humans live in the 3D world. The data in this world contains complex items. They need to be represented with more accuracy than 1D or 2D representations. Examples include graphs, networks, molecular structures, trees, manifolds, point clouds, and so on.
Deep Learning assumes that the data can be transformed to a lower dimension while its relationships are preserved. For example, let's say you have a square shape. If you rotate it, you will still preserve the angles and distances within this shape.
The point to note is that this is only true for Euclidean data. Non-Euclidean data doesn't behave this way. The relationships in the data won't be preserved if you do that. For example, let's say you take a cube shape. If you project it onto a 2D plane, the relationships won't be preserved as the cube is moved around.
If you build a model based on this, it won't be accurate. It will be an approximation. Current Deep Learning models can't deal with non-Euclidean data in its native form. That's where Geometric Deep Learning comes in.
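The square and cube examples above can be checked numerically. The following is a minimal sketch rather than anything from the original paper: it rotates a unit square and confirms that all pairwise distances survive, then flattens a unit cube onto the x-y plane and shows that the distances change, and keep changing as the cube is rotated. The helper names (`rotate_2d`, `rotate_about_y`, `project_to_xy`, `same_distances`) and the 30-degree angle are illustrative choices.

```python
import math
from itertools import combinations


def rotate_2d(point, angle):
    """Rotate a 2D point about the origin by `angle` radians."""
    x, y = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y)


def rotate_about_y(point, angle):
    """Rotate a 3D point about the y-axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * z, y, -s * x + c * z)


def project_to_xy(point):
    """Flatten a 3D point onto the x-y plane by dropping z."""
    return point[:2]


def pairwise_distances(points):
    """Distances between every pair of points, in a fixed order."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


def same_distances(a, b, tol=1e-9):
    """True if two point sets have matching pairwise distances."""
    pairs = zip(pairwise_distances(a), pairwise_distances(b))
    return all(abs(x - y) <= tol for x, y in pairs)


angle = math.radians(30)

# A rotated square keeps every pairwise distance.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
rotated_square = [rotate_2d(p, angle) for p in square]
square_preserved = same_distances(square, rotated_square)

# A rotated cube also keeps its distances while it stays in 3D...
cube = [(x, y, z) for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]
rotated_cube = [rotate_about_y(p, angle) for p in cube]
cube_rotation_preserved = same_distances(cube, rotated_cube)

# ...but its 2D projection does not match the cube, and the projection
# itself changes when the cube is rotated.
shadow_before = [project_to_xy(p) for p in cube]
shadow_after = [project_to_xy(p) for p in rotated_cube]
projection_preserved = same_distances(cube, shadow_before)
shadow_stable = same_distances(shadow_before, shadow_after)

print(square_preserved, cube_rotation_preserved)
print(projection_preserved, shadow_stable)
```

The distances are lost at the projection step, not the rotation step: the cube is unchanged in 3D, but a model that only sees the 2D shadow gets a different picture for every pose.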
Where did it all start?
Geometric Deep Learning first came to light in a 2017 paper by Bronstein et al. titled Geometric Deep Learning: Going beyond Euclidean data. In addition, Bronstein, Bruna, Cohen, and Veličković wrote a preview of a book titled Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges. You can get all the information on their website.
How do Deep Learning models learn?
Let's consider the example of recognizing faces. Human faces are 3-dimensional. When we want to build a face recognition system, we take pictures of many faces to form a training dataset. These pictures are 2D image files. What we're doing here is converting data from the non-Euclidean (3D) to Euclidean (2D) form so that the model can learn from it. We're losing valuable information along the way.
Why not just capture it in 3D and directly feed it to a Deep Learning model? Current Deep Learning algorithms are not set up to deal directly with 3D data. Why? Because when we train Deep Learning models, we rely on the idea that these models can learn patterns by changing the structure of the data. The way the structure is changed is relational.
You cannot change the structure of non-Euclidean data in the same way because it won't retain its properties. To return to the earlier example of the square and the cube, the angles and distances won't remain the same.
How do we build models that can directly learn from non-Euclidean data?
Geometric Deep Learning is a field under Deep Learning that aims to build neural networks that can learn from non-Euclidean data.
Where can Geometric Deep Learning be used in the real world?
Here are four real-world examples:
3D modeling for pose detection
A Convolutional Neural Network has to convert 3D data into a 2D image and then learn from it. There's loss of information during this process. The geometric equivalent of this network can learn directly from 3D data. You can accurately estimate human poses from the captured 3D data.
Predicting the spread of viral infection
You can train a network to predict the probability of a viral infection based on previous cases in the database. The data that the doctors and nurses gathered from patients will be in the form of graphs. It can be fed as-is to a neural network and it will learn how to predict the likelihood of the disease spreading to new patients.
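To make "graph data fed as-is" a bit more concrete, here is a minimal sketch of the neighbor-aggregation step that graph-based models are built around. It is not a trained network: the contact graph, the starting risk scores, and the fixed `self_weight` are all made up for illustration, and a real Graph Neural Network would learn its mixing weights from data instead of using a hand-picked constant.

```python
# Hypothetical contact graph: each patient maps to the patients they met.
contacts = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}

# Made-up starting scores: only patient A is a known case.
risk = {"A": 1.0, "B": 0.0, "C": 0.0, "D": 0.0, "E": 0.0}


def propagate(risk, contacts, self_weight=0.5):
    """One round of neighbor averaging over the contact graph.

    Each patient's new score mixes their own score with the mean score
    of their direct contacts. The fixed self_weight stands in for the
    parameters a trained model would learn.
    """
    updated = {}
    for node, neighbors in contacts.items():
        neighbor_mean = sum(risk[n] for n in neighbors) / len(neighbors)
        updated[node] = self_weight * risk[node] + (1 - self_weight) * neighbor_mean
    return updated


after_one = propagate(risk, contacts)
after_two = propagate(after_one, contacts)

for patient in sorted(contacts):
    print(patient, round(after_one[patient], 3), round(after_two[patient], 3))
```

After one round only A's direct contacts (B and C) pick up a score. After two rounds the signal reaches D, while E, three hops away from A, is still at zero. Each extra round lets information travel one more hop along the graph, without ever flattening the graph into a table or an image.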
Visual segmentation
This is very useful for autonomous cars. The autonomous car has to capture real-world data and accurately segment it into its constituent parts. In these situations, pedestrians are usually represented as 2D bounding boxes or as skeletons. With 3D segmentation, the autonomous car's perception would be more accurate, and it could put accurate 3D bounding boxes on everything in the scene.
Molecular modeling
A Recurrent Neural Network has to convert graph data into strings and then learn from it. There's loss of information during this process. The geometric equivalent of this neural network can directly learn from molecular graph data. It's very useful in medicine, chemistry, and pharmaceuticals.
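One way to see what the string conversion costs is that the same molecule can be written as several different strings, depending on which atom you start from, while its graph stays the same. The sketch below is an illustration under simplifying assumptions: hydrogens are omitted, bond types are ignored, and `neighborhood_signature` is a hypothetical helper (similar in spirit to one round of neighbor aggregation), not a function from any chemistry library.

```python
def neighborhood_signature(elements, bonds):
    """Order-independent summary of a molecular graph.

    For every atom, record its element together with the sorted elements
    of its bonded neighbors, then sort the records so the result does not
    depend on how the atoms were numbered.
    """
    neighbors = {i: [] for i in range(len(elements))}
    for a, b in bonds:
        neighbors[a].append(elements[b])
        neighbors[b].append(elements[a])
    return sorted((elements[i], tuple(sorted(neighbors[i]))) for i in neighbors)


# Heavy atoms of ethanol written as a chain C-C-O.
ethanol_elements = ["C", "C", "O"]
ethanol_bonds = [(0, 1), (1, 2)]

# The same molecule with the atoms listed from the other end, O-C-C.
reversed_elements = ["O", "C", "C"]
reversed_bonds = [(0, 1), (1, 2)]

# A different molecule with the same atoms: dimethyl ether, C-O-C.
ether_elements = ["C", "O", "C"]
ether_bonds = [(0, 1), (1, 2)]

as_string = "".join(ethanol_elements)            # "CCO"
as_string_reversed = "".join(reversed_elements)  # "OCC"

sig_ethanol = neighborhood_signature(ethanol_elements, ethanol_bonds)
sig_reversed = neighborhood_signature(reversed_elements, reversed_bonds)
sig_ether = neighborhood_signature(ether_elements, ether_bonds)

print(as_string == as_string_reversed)  # the two strings differ
print(sig_ethanol == sig_reversed)      # the graph summaries match
print(sig_ethanol == sig_ether)         # a different molecule differs
```

A sequence model sees "CCO" and "OCC" as different inputs and has to learn that they describe the same molecule. A model that works on the graph directly gets that equivalence from the representation itself.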
Why is this exciting?
In Deep Learning, each algorithm specializes in a single type of data. For example, Convolutional Neural Networks are used for images, and Recurrent Neural Networks are used for text, audio, and sensor data. In Geometric Deep Learning, we have Graph Neural Networks (GNNs) that are built for graphs and networks. This approach can potentially be generalized to all data as it appears in the real world.
It allows us to take advantage of data with its inherent relationships and connections.
In Geometric Deep Learning, we can analyze data in its native form as opposed to converting it into a lower-dimensional space. We can use non-Euclidean data directly to train models. We avoid the information loss that comes from flattening the rich data we collect. We can build generalizable models that can work with data as it appears in the real world.
Thank you for this detailed explanation
Sounds like “non-Euclidean” is the same thing as “non-spatial data or spatial data in more than 2 dimensions.” Is that accurate?
Also, I’m a little confused by the discussion about reducing the dimensionality of the data. You talk about 3D data being projected to a 2D approximation and how that loses information. But when you talk about 2D data, wouldn’t the correct analogy be projecting it to a 1D approximation?