To view this page ensure that Adobe Flash Player version 11.1.0 or greater is installed.
In the Flash applet built with flare above, we explore the duality between finding community structure in networks and minimizing the description length of a random walker's movements on a network. For a given network partition
M, the map equation specifies the theoretical limit
L(M) of how concisely we can describe the trajectory of this random walk.
The underlying code structure of the map equation is designed such that the description can be compressed if the network has regions in which the random walker tends to stay for a long time. Therefore, with a random walk as a proxy for real flow, minimizing the map equation over all possible network partitions reveals important aspects of network structure with respect to the dynamics on the network.
To take advantage of the regional structure of the network, one index codebook and
m module codebooks, one for each module in the network, are used to describe the random walker's movements. The module codebooks have codewords for nodes within each module (and exit codes to leave the module), which are derived from the node visit/exit frequencies of the random walker. The index codebook has codewords for the modules, which are derived from the module switch rates of the random walker. Therefore, the average length of the code describing a step of the random walker is the average length of codewords from the index codebook and the module codebooks weighted by their rates of use (mouseover the map equation for more information):
M. For module partition
mmodules, the lower bound of the average length of the code describing a step of the random walker.
mmodules. The total height of the blocks under "Index codebook" in the visualization above corresponds to this rate.
mmodule codebooks. This corresponds to the total height of the blocks under "Module codebooks" in the visualization above. For module
i, this is given by the fraction of time the random walker spends in module
i, which is given by the total probability that any node in the module is visited, plus the probability that it exits the module and the exit message is used.
i. The entropy of the relative rates at which the random walker exits module
iand visits each node in module
imeasures the smallest average codeword length that is theoretically possible. The heights of individual blocks under "Module codebooks" in the visualization above correspond to the relative rates and the codeword lengths approximately correspond to the negative logarithm of the rates in base 2.
In the Flash applet, release a random walker and see how, with multiple Huffman codebooks, we can represent its trajectory on the network with a compressed binary message. Change the assignment of nodes into modules by changing their node colors and see how the description length changes. The code structure corresponding to the current partition is shown by the stacked boxes on the right. With multiple module codebooks, each of which re-uses a similar set of codewords, we must specify which module codebook should be used. This is implemented by using the index codebook shown to the left and exit codes in each module (marked by arrows). The coding procedure is as follows. The index codebook specifies a module codebook, and the module codebook specifies a succession of nodes within that module. When the random walker leaves the module, and we need to return to the index codebook, we use the exit code. The codeword lengths in the index codebook are derived from the relative rates at which a random walker enters each module. The codeword lengths for each module codebook are derived from the relative rates at which a random walker visits each node in the module or exits the module. Mouseover objects in the visualization for more information.