CISA: Chinese Information Structure Analysis for Scientific Writing with Cross-lingual Adversarial Learning

Hen-Hsen Huang and Hsin-Hsi Chen
This work demonstrates a writing assistant system that provides high-level advice for Chinese scientific writing. Cross-lingual approaches are investigated to analyze the information structure of a given Chinese abstract and to retrieve useful knowledge from related work written in both English and Chinese. To the best of our knowledge, this is the first study on Chinese information structure identification. Without the need for labeled Chinese data, our novel model is capable of dealing…

AutoSurvey: Automatic Survey Generation based on a Research Draft

This work presents AutoSurvey, an intelligent system that performs a literature survey and generates a summary specific to a research draft, and that is highly useful for both academic and educational purposes.



Overview of NLP-TEA 2016 Shared Task for Chinese Grammatical Error Diagnosis

This paper presents the NLP-TEA 2016 shared task, which seeks to identify grammatical error types and their range of occurrence within sentences written by learners of Chinese as a foreign language, and describes the task definition, data preparation, performance metrics, and evaluation results.

Cross-Lingual Transfer Learning for POS Tagging without Cross-Lingual Resources

Evaluated on POS datasets from 14 languages in the Universal Dependencies corpus, the proposed transfer learning model is shown to improve POS tagging performance on the target languages without exploiting any linguistic knowledge between the source language and the target language.

Cross-language Learning with Adversarial Neural Networks

This work proposes to use adversarial training of neural networks to learn high-level features that are discriminative for the main learning task, and at the same time are invariant across the input languages.

DISA: A Scientific Writing Advisor with Deep Information Structure Analysis

This work demonstrates DISA, a higher-level writing assistant system that analyzes the information structure of abstracts and retrieves knowledge from the related work according to the research goals, incorporating the latest neural-network technologies.

Detection of Chinese Word Usage Errors for Non-Native Chinese Learners with Bidirectional LSTM

This paper proposes (bidirectional) LSTM sequence labeling models, explores various features to detect word usage errors in Chinese sentences, and achieves an accuracy of 0.5138 and an MRR of 0.6789 on the HSK dataset.

BilBOWA: Fast Bilingual Distributed Representations without Word Alignments

It is shown that bilingual embeddings learned using the proposed BilBOWA model outperform state-of-the-art methods on a cross-lingual document classification task as well as a lexical translation task on WMT11 data.

Monolingual and Cross-Lingual Information Retrieval Models Based on (Bilingual) Word Embeddings

A novel word representation learning model called Bilingual Word Embeddings Skip-Gram (BWESG) is presented; it is the first model able to learn bilingual word embeddings solely on the basis of document-aligned comparable data.

UM-Corpus: A Large English-Chinese Parallel Corpus for Statistical Machine Translation

The acquisition of a large-scale, high-quality English-Chinese parallel corpus for Statistical Machine Translation (SMT), designed to cover eight different domains, is described.

Using Argumentative Zones for Extractive Summarization of Scientific Articles

This work develops a summarization system that uses Argumentative Zone (AZ) categories both as features and in the final sentence selection process, and shows that AZ can support both full-document and customized summarization.

Argumentative Zoning for Improved Citation Indexing

  • Simone Teufel
  • Computer Science
    Computing Attitude and Affect in Text
  • 2006
The problem of automatically classifying academic citations in scientific articles according to author affect is addressed, using machine learning from indicators of affect and presentation of ownership of ideas to improve citation indexing.