Numerous studies have documented the behavioral advantages that professional musicians and children undergoing music training show in processing speech sounds that vary along the spectral and temporal dimensions. These beneficial effects have often been attributed to local functional and structural changes in the auditory cortex (AC). However, this perspective is oversimplified because it does not take into account the intrinsic organization of the human brain, namely neural networks and oscillatory dynamics. We therefore propose a new framework that extends these previous findings to a network perspective by integrating multimodal imaging, electrophysiology, and neural oscillations. In particular, we provide concrete examples of how functional and structural connectivity can be used to model simple neural circuits that exert a modulatory influence on AC activity. In addition, we describe how such a network approach can help to better understand the beneficial effects of music training on more complex speech functions, such as word learning.