How big data has moved graph theory to new dimensions

Graph theory is not enough.

The mathematical language for talking about connections, which usually depends on networks of vertices (points) and edges (the lines that connect them), has been an invaluable way to model real-world phenomena since at least the 18th century. But a few decades ago, the emergence of giant data sets forced researchers to expand their toolboxes and, at the same time, gave them sprawling sandboxes in which to apply new mathematical ideas. Since then, said Josh Grochow, a computer scientist at the University of Colorado, Boulder, there has been an exciting period of rapid growth as researchers have developed new kinds of network models that can find complex structures and signals in the noise of big data.

Grochow is among a growing number of researchers who point out that when it comes to finding connections in big data, graph theory has its limits. A graph represents every relationship as a dyad, or pairwise interaction. However, many complex systems cannot be represented by binary connections alone. Recent progress in the field shows how to move forward.

Consider trying to build a network model of parenting. Clearly, each parent has a connection to the child, but the parenting relationship is not just the sum of the two links, as graph theory might model it. The same goes for trying to model a phenomenon like peer pressure.

“There are many intuitive models. The effect of peer pressure on social dynamics is captured only if you already have groups in your data,” said Leonie Neuhäuser of RWTH Aachen University in Germany. But binary networks do not capture group influences.

Mathematicians and computer scientists use the term “higher-order interactions” to describe these complex ways in which group dynamics, rather than binary links, can influence individual behavior. These mathematical phenomena appear in everything from interactions in quantum mechanics to the trajectory of a disease spreading through a population. For example, if a pharmacologist wanted to model drug interactions, graph theory could show how two drugs interact with each other, but what about three? Or four?

Although the tools for studying these interactions are not new, only in recent years have high-dimensional data sets become an engine of discovery, giving mathematicians and network theorists new ideas. These efforts have yielded interesting results about the limits of graphs and the possibilities for scaling them up.

“Now we know that the network is just a shadow,” Grochow said. If a data set has a complex underlying structure, then modeling it as a graph may reveal only a limited projection of the whole story.

Emilie Purvine of the Pacific Northwest National Laboratory is excited about the power of tools such as hypergraphs to detect finer connections between data points.

Photo: Andrea Starr / Pacific Northwest National Laboratory

“We realized that the data structures we used to study things, from a mathematical point of view, didn’t quite match what we were seeing in the data,” said mathematician Emilie Purvine of the Pacific Northwest National Laboratory.

This is why mathematicians, computer scientists, and other researchers are increasingly focusing on ways to generalize graph theory, in its many forms, to study higher-order phenomena. The last few years have brought a number of proposed ways to characterize these interactions and to verify them mathematically in high-dimensional data sets.

For Purvine, the mathematical study of higher-order interactions is like mapping new dimensions. “Think of a graph as the foundation on a two-dimensional plot of land,” she said. The three-dimensional buildings that could go on top can vary significantly. “When you are down at ground level, they look the same, but what you build on top is different.”

Enter the Hypergraph

The search for these higher-dimensional structures is where the math becomes particularly murky, and interesting. The higher-order analogue of a graph, for example, is called a hypergraph, and instead of edges it has “hyperedges.” These can connect several nodes at once, and can therefore represent multi-way (or multilinear) relationships. Instead of a line, a hyperedge can be pictured as a surface, like a tarp staked down in three or more places.
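The difference between an edge and a hyperedge is easy to see in code. The following is a minimal sketch, not taken from any of the researchers quoted here: it assumes a hypergraph is stored as a plain list of node sets, and the node labels and helper names (`node_degrees`, `is_ordinary_graph`) are made up for illustration.

```python
from collections import Counter

# An ordinary graph edge joins exactly two nodes.
# A hyperedge is simply a set of two or more nodes.
# The node labels below are hypothetical.
hyperedges = [
    frozenset({"a", "b"}),            # a pairwise link, like a graph edge
    frozenset({"b", "c", "d"}),       # a three-way relationship
    frozenset({"a", "c", "d", "e"}),  # a four-way relationship
]


def node_degrees(edges):
    """Count how many hyperedges each node belongs to."""
    counts = Counter()
    for edge in edges:
        counts.update(edge)
    return dict(counts)


def is_ordinary_graph(edges):
    """A hypergraph is an ordinary graph when every hyperedge has size 2."""
    return all(len(edge) == 2 for edge in edges)


degrees = node_degrees(hyperedges)
sizes = sorted(len(edge) for edge in hyperedges)
print(degrees)
print(sizes, is_ordinary_graph(hyperedges))
```

Storing each hyperedge as a set keeps the group intact: the three-way relationship among b, c and d is one object, not three separate pairs.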

That is all well and good, but there is still a lot we do not know about how these structures relate to their conventional counterparts. Mathematicians are currently working out which rules of graph theory also apply to higher-order interactions, which in turn suggests new areas of study.

To illustrate the kinds of relationships that a hypergraph can tease out of a large data set, and a conventional graph cannot, Purvine points to a simple example close to home: the world of scientific publishing. Imagine two data sets, each containing papers co-authored by up to three mathematicians; for simplicity, call them A, B, and C. One data set contains six papers, two by each of the three different pairs (AB, AC, and BC). The other contains only two papers, each co-authored by all three mathematicians (ABC).
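The two data sets can be compared directly. The sketch below is an illustration of the example rather than Purvine's own analysis: it assumes each paper is recorded as a set of author labels, and the function names (`coauthor_graph`, `coauthor_hypergraph`) are hypothetical. It builds a pairwise co-authorship graph, with each edge weighted by the number of shared papers, and a hypergraph that keeps each paper's full author set.

```python
from itertools import combinations

# Data set 1: six papers, two by each pair of authors.
pairwise_papers = [
    {"A", "B"}, {"A", "B"},
    {"A", "C"}, {"A", "C"},
    {"B", "C"}, {"B", "C"},
]
# Data set 2: two papers, each written by all three authors.
joint_papers = [{"A", "B", "C"}, {"A", "B", "C"}]


def coauthor_graph(papers):
    """Pairwise view: map each author pair to its number of shared papers."""
    weights = {}
    for authors in papers:
        for u, v in combinations(sorted(authors), 2):
            weights[(u, v)] = weights.get((u, v), 0) + 1
    return weights


def coauthor_hypergraph(papers):
    """Higher-order view: map each full author set to its number of papers."""
    hyper = {}
    for authors in papers:
        key = frozenset(authors)
        hyper[key] = hyper.get(key, 0) + 1
    return hyper


graph_1 = coauthor_graph(pairwise_papers)
graph_2 = coauthor_graph(joint_papers)
hyper_1 = coauthor_hypergraph(pairwise_papers)
hyper_2 = coauthor_hypergraph(joint_papers)

print(graph_1 == graph_2)  # the pairwise views coincide
print(hyper_1 == hyper_2)  # the hypergraph views do not
```

Under these assumptions both data sets collapse to the same triangle, with every pair sharing two papers, so the graph cannot tell them apart. The hypergraph can: the first data set has three hyperedges of size two, while the second has a single hyperedge of size three.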
