Twitter is a social media platform that contains diverse user-generated text. Traditional models are not applicable to tweet data because the text style is not as grammatical as that of newswire. In this paper, we construct word embeddings via canonical correlation analysis (CCA) on a considerable amount of tweet data and show the efficacy of word…
This paper proposes a correction method for urine biomarkers to improve the diagnosis of ovarian cancer. The biomarkers were obtained from urine samples of 163 patients (cancer: 42, benign: 121). Logistic regression on combinations of 2 biomarkers calibrated by creatinine concentration was compared with that on the 3 biomarkers including…
In named entity recognition tasks, especially on massive data such as Twitter, having a large number of high-quality gazetteers can alleviate the problem of training data scarcity. One could collect large gazetteers from knowledge graphs and phrase embeddings to obtain high gazetteer coverage. However, large gazetteers cause a side effect called "feature…
This paper investigates the effect of calibrating urine biomarkers by creatinine. The biomarkers were obtained from urine samples of 256 patients (cancer: 137, benign: 119). Logistic regression on combinations of 2 biomarkers calibrated by creatinine concentration was compared with that on the 3 biomarkers including creatinine. The average AUC…