A network is not enough
Networks of nodes and links are powerful abstractions of complex systems. But when the networks have thousands of nodes and links, they are themselves too complicated to comprehend, unless we simplify and highlight their organization.
Maps of networks
Geographic maps simplify and highlight streets, neighborhoods, cities, and highways from high-resolution satellite images. We want to create similar maps of networks.
We are often not interested in the network itself. We want to know how things move — flow — on the network.
- The flow of ideas in social networks
- Passenger moving in traffic networks
- Money in transaction networks
The things moving on a network tend to stay within certain groups of nodes for a relatively long time before exiting. We call these groups modules, and these are what we are interested in.
We can simulate the flows using a random walk on the network. Notice how the random walker will tend to get stuck in modules. To prevent getting too stuck, it teleports with a low probability to a random node.
The duality between compression and finding regularities
Compression algorithms use regularities to compress data. The more regularities they find, the better they can compress. In this image, the top half is easier to compress than the bottom part because of the repeated pattern in the clear blue sky.
By searching for optimal compression we will find the regularities that simplifies the data.
To use the machinery of information theory, we describe the random walker with a binary message. Huffman coding (Like Morse code, more frequently used symbols should be shorter).
One-level codelength 4.133 bits
Index codelength 0.136 bits
Module codelength 2.661 bits
Codelength 2.796 bits