Cross-Lingual Dataless Classification for Many Languages

@inproceedings{Song2016CrossLingualDC,
  title={Cross-Lingual Dataless Classification for Many Languages},
  author={Yangqiu Song and Shyam Upadhyay and Haoruo Peng and Dan Roth},
  booktitle={IJCAI},
  year={2016}
}
Dataless text classification [Chang et al., 2008] is a classification paradigm that maps documents into a given label space without requiring any annotated training data. This paper explores a cross-lingual variant of this paradigm, in which documents in multiple languages are classified into an English label space. We use CLESA (cross-lingual explicit semantic analysis) to embed both foreign-language documents and an English label space into a shared semantic space, and select the best label(s…
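
The idea in the abstract, embedding a foreign-language document and English label names into one shared space and then picking the closest label, can be illustrated with a minimal sketch. It is not the paper's implementation: the tiny `CONCEPT_INDEX`, the concept ids, the word weights, and the function names are all hypothetical stand-ins for concept vectors that an ESA-style method would derive from language-linked Wikipedia articles, and cosine similarity is assumed here as the scoring function.

```python
import math

# Hypothetical toy concept indexes (illustrative assumption, not from the paper).
# Each language maps the SAME concept ids to its own word weights, mimicking
# concepts that are aligned across languages.
CONCEPT_INDEX = {
    "en": {
        "C_SPORT": {"football": 1.0, "sports": 1.0, "match": 0.5},
        "C_POLITICS": {"election": 1.0, "politics": 1.0, "government": 0.8},
    },
    "es": {
        "C_SPORT": {"fútbol": 1.0, "deportes": 1.0, "partido": 0.5},
        "C_POLITICS": {"elecciones": 1.0, "política": 1.0, "gobierno": 0.8},
    },
}
CONCEPTS = ["C_SPORT", "C_POLITICS"]  # fixed order of the shared dimensions


def embed(text, lang):
    """Map text in the given language to a vector over the shared concepts."""
    tokens = text.lower().split()
    index = CONCEPT_INDEX[lang]
    return [sum(index[c].get(t, 0.0) for t in tokens) for c in CONCEPTS]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def classify(doc, doc_lang, labels):
    """Score each English label name against the document; no training data used."""
    doc_vec = embed(doc, doc_lang)
    scores = {label: cosine(doc_vec, embed(label, "en")) for label in labels}
    return max(scores, key=scores.get), scores


best, scores = classify("el partido de fútbol", "es", ["sports", "politics"])
print(best, scores)
```

The only supervision in this sketch is the label names themselves, which is what makes the setup "dataless"; a realistic system would replace the toy index with concept vectors built from a large multilingual corpus.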
