The Patient Health Questionnaire (PHQ-9) is a well-known self-report measure of nine depression symptoms based on DSM-IV/DSM-5 criteria. As the PHQ-9 has not yet been validated in most of the former Yugoslav countries, the goal of this research was to determine its factor structure and psychometric properties in a large non-clinical sample of BCS (Bosnian/Croatian/Serbian) speakers. A total of 1875 participants (61.5% female), with an average age of 28.26 (SD=8.32) years, completed the PHQ-9 via an anonymous online survey. WLSMV/DWLS-based confirmatory factor analysis (CFA) revealed that a two-factor model with correlated (.83) Cognitive/affective and Somatic factors fit the data well (χ²(26)=287.8, p<.001; CFI=0.972, NNFI/TLI=0.961, RMSEA=0.073, 90% CI [0.066, 0.081]) and better than the single-factor solution (χ²(27)=444.2, p<.001; CFI=0.956, NNFI/TLI=0.941, RMSEA=0.091, 90% CI [0.083, 0.098]). The two-factor model has also fit better in other research on non-clinical samples (e.g. palliative care), whereas unidimensionality has been detected in clinical/psychiatric samples. The two-factor model showed strong gender invariance (i.e. configural invariance + equal loadings + equal thresholds; Δχ²(Δdf)=9.48(14.75), p=.839, ΔCFI<.001). However, after the equal-means constraints were added, the model remained invariant according to the ΔCFI criterion (.008) but became non-invariant according to the Δχ²(Δdf) criterion (31.63(14.75), p=.006). The source of this potential non-invariance was the higher Somatic score for females (Mfemales=3.32, SDfemales=2.22, Mmales=2.89, SDmales=2.23, t(1873)=4.15, p<.001), but this difference was just at the cutoff between a trivial and a small effect size (d=0.197). Finally, the results of the bifactor analysis and the good reliability of the whole scale (ω=.89) suggested that using a single PHQ-9 score is probably advisable for most purposes, but researchers should use a bifactor approach to test hypotheses specific to the Cognitive/affective and Somatic domains.
In conclusion, the PHQ-9 exhibits a well-fitting latent structure, strong gender invariance, and good reliability, suggesting good potential for research use in the BCS language. However, its convergent and discriminant validation and its norming on clinical samples are still pending.
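As a rough consistency check on the reported statistics, the RMSEA point estimates and the effect size for the gender difference in Somatic scores can be approximately recomputed from the summary values given above. The sketch below is illustrative only and is not the authors' analysis code. It assumes the common formula RMSEA = sqrt(max(χ² - df, 0) / (df(N - 1))), which may differ from the scaled variant some software reports under WLSMV estimation, and it assumes group sizes derived from the rounded 61.5% female share. The function names are hypothetical.

```python
import math


def rmsea(chi2: float, df: int, n: int) -> float:
    """RMSEA point estimate from a chi-square statistic (N - 1 convention)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cohens_d(m1, sd1, n1, m2, sd2, n2) -> float:
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled_var)


N = 1875
# Assumption: group sizes are inferred from the rounded 61.5% female share.
n_female = round(0.615 * N)
n_male = N - n_female

# Fit statistics as reported for the two competing CFA models.
rmsea_two_factor = rmsea(287.8, 26, N)
rmsea_one_factor = rmsea(444.2, 27, N)

# Somatic factor scores: females vs. males, as reported.
d_somatic = cohens_d(3.32, 2.22, n_female, 2.89, 2.23, n_male)

print(f"RMSEA, two-factor model: {rmsea_two_factor:.3f}")
print(f"RMSEA, single-factor model: {rmsea_one_factor:.3f}")
print(f"Cohen's d, Somatic (female vs. male): {d_somatic:.3f}")
```

Under these assumptions the recomputed RMSEA values round to the reported 0.073 and 0.091. The recomputed d comes out slightly below the reported 0.197, presumably because the means and standard deviations in the abstract are rounded to two decimals.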