Stefanie Jegelka, an associate professor in MIT's Department of Electrical Engineering and Computer Science, investigates how to pry open the "black box" of deep learning. The name arose because deep-learning models are so complex that even the scientists who build them cannot fully explain what happens inside.
Researchers do not yet fully understand everything that happens inside a deep-learning model, or how those internals shape what a model learns and how it behaves. Unsatisfied with the "black box" label, Jegelka is eager to keep digging.
"With machine learning, you can achieve a lot, but only if you have the right model and data. So building an understanding that is relevant to practice will help us design better models, and help us understand what is going on inside them so we know when we can deploy a model and when we can't. It is not a black box that you throw at data and it works," says Jegelka, who is also a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL) and the Institute for Data, Systems, and Society (IDSS).
Deep learning models frequently outperform humans on real-world tasks, such as detecting financial fraud in credit card activity or identifying cancer in medical images. But what are these models actually learning?
These powerful machine-learning models are often built on artificial neural networks with millions of nodes that process data to produce predictions. Jegelka delved into deep learning to understand what these models can learn, how they behave, and how to build specific prior knowledge into them.
Building an understanding that is relevant to deep-learning practice, Jegelka argues, will help researchers design better models and comprehend what is happening inside them. The more she studied machine learning, the more she was drawn to the challenge of understanding how models behave and how to steer that behaviour.
Graph models
Jegelka is specifically interested in optimising machine-learning models that take graphs as input. Graph data presents unique challenges because it carries information not only about individual nodes and edges but also about the structure: what is connected to what. Furthermore, graphs have mathematical symmetries that the machine-learning model must honour so that, for example, the same graph always yields the same prediction.
However, incorporating such symmetries into a machine-learning model is typically tricky. Take molecules, for example. Molecules can be represented as graphs, with vertices representing atoms and edges representing the chemical bonds between them. Pharmaceutical companies could therefore use deep learning to rapidly predict the properties of numerous compounds, reducing the number of molecules they must physically test in the lab.
Jegelka investigates approaches for developing mathematical machine-learning models that can effectively take graph data as input and produce something else, in this case a prediction of a molecule's chemical properties. This is especially difficult because the properties of a molecule are influenced not only by the atoms within it but also by the connections between them. Traffic routing, chip design, and recommender systems are other applications of machine learning on graphs.
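A toy sketch of this idea (hypothetical numbers, not any published model): represent a molecule as a graph, let each atom mix its feature with its neighbours' features (the core step of message-passing networks), then pool the result into a molecule-level score.

```python
# Hypothetical example: water (H2O) as a graph, with one round of
# neighbour averaging. Atom features and the 0.5/0.5 mixing weights
# are invented for illustration.

atoms = {0: "O", 1: "H", 2: "H"}   # vertices are atoms
bonds = [(0, 1), (0, 2)]           # edges are chemical bonds

features = {0: 3.4, 1: 2.2, 2: 2.2}  # toy scalar feature per atom

def neighbours(node):
    """All atoms bonded to the given atom."""
    return [b for a, b in bonds if a == node] + [a for a, b in bonds if b == node]

def message_pass(feats):
    """Each atom's new feature mixes its own value with its neighbours' mean."""
    new = {}
    for node, value in feats.items():
        nbrs = neighbours(node)
        new[node] = 0.5 * value + 0.5 * sum(feats[n] for n in nbrs) / len(nbrs)
    return new

updated = message_pass(features)
graph_property = sum(updated.values())  # pooled into a molecule-level score
```

Because the update only looks at who is bonded to whom, the connectivity of the molecule, not just its atom list, shapes the prediction.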
Robustness in deep learning
What motivates Jegelka is her interest in the principles of machine learning, particularly the issue of robustness. Frequently, a model performs well on training data but degrades when deployed on slightly different data.
For instance, the model may have been trained on small molecular graphs or traffic networks, but the graphs it encounters once deployed are much larger or more complex. Building prior knowledge into a model can increase its reliability, but recognising what information the model requires, and how to incorporate it, is more complicated.
She approaches this challenge by fusing her interest in algorithms and discrete mathematics with her enthusiasm for machine learning. Because of hardness results in computer science, she believes no model will be able to learn everything; but how you set up the model determines what it can and cannot learn.