Twitter is a social media platform that contains diverse user-generated text. Traditional models are not applicable to tweet data because the text style is less grammatical than that of newswire. In this paper, we construct word embeddings via canonical correlation analysis (CCA) on a considerable amount of tweet data and show the efficacy of word…
  • K Wong, E Wu, +5 authors E Yang
  • 2005
In clinical first-pass myocardial perfusion studies, physiological and patient motion is inevitable. Such motion impairs the sensitivity and reliability of assessing myocardial perfusion abnormalities. The current study aims to correct the misregistration of the myocardium during first-pass perfusion imaging by using a normalized mutual information approach.
In named entity recognition, especially on massive data such as Twitter, a large collection of high-quality gazetteers can alleviate the scarcity of training data. One could collect large gazetteers from knowledge graphs and phrase embeddings to obtain high coverage. However, large gazetteers cause a side effect called "feature…
This paper investigates the calibration effect of creatinine on urine biomarkers. The biomarkers were obtained from the urine samples of 256 patients (cancer: 137, benign: 119). The logistic regression of the combinations of 2 biomarkers calibrated by creatinine concentration was compared with that of the 3 biomarkers including creatinine. The average AUC…
This paper proposes a correction method for urine biomarkers for better diagnosis of ovarian cancer. The biomarkers were obtained from the urine samples of 163 patients (cancer: 42, benign: 121). The logistic regression of the combinations of 2 biomarkers calibrated by creatinine concentration was compared with that of the 3 biomarkers including…