The quest for a quantitative characterization of community and modular structure of complex networks produced a variety of methods and algorithms to classify different networks. However, it is not clear if such methods provide consistent, robust, and meaningful results when considering hierarchies as a whole. Part of the problem is the lack of a similarity measure for the comparison of hierarchical community structures. In this work we give a contribution by introducing the hierarchical mutual information, which is a generalization of the traditional mutual information and makes it possible to compare hierarchical partitions and hierarchical community structures. The normalized version of the hierarchical mutual information should behave analogously to the traditional normalized mutual information. Here the correct behavior of the hierarchical mutual information is corroborated on an extensive battery of numerical experiments. The experiments are performed on artificial hierarchies and on the hierarchical community structure of artificial and empirical networks. Furthermore, the experiments illustrate some of the practical applications of the hierarchical mutual information, namely the comparison of different community detection methods and the study of the consistency, robustness, and temporal evolution of the hierarchical modular structure of networks.