This paper proposes a tempo feature extraction method. Tempo information is modeled as a narrow-band, low-pass temporal modulation component, which is decomposed into a modulation spectrum via joint frequency analysis. In implementation, the modulation spectrum is estimated directly from the modified discrete cosine transform (MDCT) coefficients output by a partial MP3 (MPEG-1 Layer 3) decoder. Log-scale modulation frequency coefficients are then extracted from the amplitude of the modulation spectrum. The resulting tempo feature is applied to automatic music emotion classification, and classification accuracy is further improved by several hybrid methods based on posterior fusion. Experimental results confirm the effectiveness of both the proposed tempo feature and the hybrid classification approach.