Map Equation logo
The Map Equation

Hairball graph

A network is not enough

Networks of nodes and links are powerful abstractions of complex systems. But when the networks have thousands of nodes and links, they are themselves too complicated to comprehend, unless we simplify and highlight their organization.

Maps of networks

Geographic maps simplify and highlight streets, neighborhoods, cities, and highways from high-resolution satellite images. We want to create similar maps of networks.

The Earth
Map over Europe
Map over Umeå
Map over Umeå University

Network flows

We are often not interested in the network itself. We want to know how things move — flow — on the network.

  • The flow of ideas in social networks
  • Passenger moving in traffic networks
  • Money in transaction networks

The things moving on a network tend to stay within certain groups of nodes for a relatively long time before exiting. We call these groups modules, and these are what we are interested in.

We can simulate the flows using a random walk on the network. Notice how the random walker will tend to get stuck in modules. To prevent getting too stuck, it teleports with a low probability to a random node.

The duality between compression and finding regularities

Compression algorithms use regularities to compress data. The more regularities they find, the better they can compress. In this image, the top half is easier to compress than the bottom part because of the repeated pattern in the clear blue sky.

5.8 MB → 0.91 MB
top
5.8 MB → 2.8 MB

By searching for optimal compression we will find the regularities that simplifies the data.

Huffman coding

To use the machinery of information theory, we describe the random walker with a binary message. Huffman coding (Like Morse code, more frequently used symbols should be shorter).

0000110001011001100000001110000111000100001101110000110111110010101001001101100000100000111011111010101111110111100001110101111100
Average codelength: 0.000 bits
01000010111011001000100101011011100100100011100100101110110111010111000
Average codelength: 0.000 bits
010111110111001000111110011101100101010010100011100100100001011111011010010110011100001101010100101000
One-level codelength 4.133 bits
Index codelength 0.136 bits
Module codelength 2.661 bits
Codelength 2.796 bits