This report describes the methods applied in the CIST system, which participated in ACL MultiLing 2013. Summarization is based on sentence extraction. The hLDA topic model is adopted for multilingual multi-document modeling. Various features are combined to evaluate and extract candidate summary sentences.
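As a rough illustration of combining features to rank and extract candidate sentences, the following is a minimal sketch and not the CIST system's actual method. The three features (term frequency, position, length), their weights, and the function names `score_sentences` and `extract_summary` are all assumptions made for this example; the abstract does not name the features used, and the hLDA component is not modeled here.

```python
from collections import Counter


def score_sentences(sentences, weights=(0.5, 0.3, 0.2)):
    """Score each sentence as a weighted sum of three illustrative features.

    The features and weights are hypothetical; they are not taken from the report.
    """
    w_tf, w_pos, w_len = weights
    tokens = [s.lower().split() for s in sentences]
    counts = Counter(t for sent in tokens for t in sent)
    max_count = max(counts.values()) if counts else 1
    max_len = max((len(t) for t in tokens), default=1) or 1
    n = len(sentences)

    scores = []
    for i, sent in enumerate(tokens):
        # Feature 1: average corpus frequency of the sentence's words, in [0, 1].
        tf = sum(counts[t] for t in sent) / (len(sent) * max_count) if sent else 0.0
        # Feature 2: position, so that earlier sentences score higher.
        pos = 1.0 - i / n
        # Feature 3: length relative to the longest sentence.
        length = len(sent) / max_len
        scores.append(w_tf * tf + w_pos * pos + w_len * length)
    return scores


def extract_summary(sentences, k=2):
    """Return the k highest-scoring sentences in their original order."""
    scores = score_sentences(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Calling `extract_summary(sentences, k)` on a list of sentence strings returns the `k` top-ranked sentences in document order. A weighted sum is only one simple way to combine features; a real system might learn the weights or add topic-model features such as a sentence's position in the hLDA hierarchy.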
Hierarchical Latent Dirichlet Allocation (hLDA) has achieved good results in both supervised and unsupervised multi-document hierarchical topic modeling. However, its results vary: they remain random even when the same parameters are used. This paper therefore proposes automatic evaluation methods for unsupervised multi-document hLDA modeling…