Corpus ID: 17649106

Science and Ethnicity: How Ethnicities Shape the Evolution of Computer Science Research Community

@article{Wu2014ScienceAE,
  title={Science and Ethnicity: How Ethnicities Shape the Evolution of Computer Science Research Community},
  author={Zhaohui Wu and Dayu Yuan and Pucktada Treeratpituk and C. Lee Giles},
  journal={ArXiv},
  year={2014},
  volume={abs/1411.1129}
}
Globalization and the World Wide Web have resulted in academia and science becoming an international and multicultural community forged by researchers and scientists of different ethnicities. How ethnicity shapes the evolution of membership, status, and interactions of the scientific community, however, is not well understood, owing to the difficulty of identifying ethnicity at large scale. We use name ethnicity classification as an indicator of ethnicity. Based on automatic name…
Nationality Classification Using Name Embeddings
TLDR
This work designs a fine-grained nationality classifier covering 39 groups representing over 90% of the world population and exploits the phenomenon of homophily in communication patterns to learn name embeddings, a new representation that encodes gender, ethnicity, and nationality and is readily applicable to building classifiers and other systems.
Ethnea -- an instance-based ethnicity classifier based on geo-coded author names in a large-scale bibliographic database
We present a nearest neighbor approach to ethnicity classification. Given an author name, all of its instances (or the most similar ones) in PubMed are identified and coupled with their respective…
Large-Scale Diversity Estimation Through Surname Origin Inference
The study of surnames as both linguistic and geographical markers of the past has proven valuable in several research fields spanning from biology and genetics to demography and social mobility. This…
A profile analysis of the top Brazilian Computer Science graduate programs
TLDR
A detailed analysis of the top Brazilian Computer Science graduate programs shows that the highest ranked programs include more experienced faculty members, who have mentored more Ph.D. students, and that programs target distinct publication venues, with the best ranked ones focusing on higher quality conferences and journals.
Name Nationality Classification with Recurrent Neural Networks
TLDR
Evaluation on Olympic record data shows that the proposed recurrent neural network based model, which predicts the nationality of each name using automatic feature extraction, achieves greater accuracy than previous feature-based approaches in nationality prediction tasks.
A Deep Learning Approach to Predicting Race Using Personal Name and Location (Natural Language Processing)
In this project, I train a recurrent neural network to predict individuals’ race using information contained in their name and location of residence. I introduce a novel data source that contains…

References

SHOWING 1-10 OF 39 REFERENCES
ePluribus: Ethnicity on Social Networks
TLDR
An approach to determine the ethnic breakdown of a population based solely on people's names and data provided by the U.S. Census Bureau is demonstrated to be able to predict the ethnicities of individuals as well as the ethnicity of an entire population better than natural alternatives.
A review of name-based ethnicity classification methods and their potential in population studies
Several approaches have been proposed to classify populations into ethnic groups using people's names, as an alternative to ethnicity self-identification information when this is not available. These…
Ethnic Scientific Communities and International Technology Diffusion
  • W. Kerr
  • Business, Economics
  • The Review of Economics and Statistics
  • 2008
This study explores the role of U.S. ethnic scientific and entrepreneurial communities for international technology transfer to their home countries. U.S. ethnic researchers are quantified through an…
Group formation in large social networks: membership, growth, and evolution
TLDR
It is found that the propensity of individuals to join communities, and of communities to grow rapidly, depends in subtle ways on the underlying network structure, and decision-tree techniques are used to identify the most significant structural determinants of these properties.
Race, Ethnicity, and NIH Research Awards
TLDR
It is found that Asians are 4 percentage points and black or African-American applicants are 13 percentage points less likely to receive NIH investigator-initiated research funding compared with whites, after controlling for the applicant’s educational background, country of origin, training, previous research awards, publication record, and employer characteristics.
National characteristics in international scientific co-authorship relations
  • W. Glänzel
  • Political Science, Computer Science
  • Scientometrics
  • 2004
TLDR
As expected, international co-authorship, on average, results in publications with higher citation rates than purely domestic papers; however, the influence of international collaboration on the national citation impact varies considerably between countries (and, within an individual country, between fields).
Name-ethnicity classification from open sources
TLDR
This paper reports on the development of an ethnicity classifier where all training data is extracted from public, non-confidential (and hence somewhat unreliable) sources, and uses hidden Markov models (HMMs) and decision trees to classify names into 13 cultural/ethnic groups with individual group accuracy comparable to earlier binary classifiers.
A geographical analysis of knowledge production in computer science
TLDR
The patterns of collaboration analyzed in this paper contribute to an overall understanding of Computer Science research in different geographical regions that could not be achieved without the use of complex networks and a large publication database.
Name-Ethnicity Classification and Ethnicity-Sensitive Name Matching
TLDR
A novel alignment-based name matching algorithm, based on the Smith-Waterman algorithm and logistic regression, is proposed, which can effectively identify name-ethnicity from personal names in Wikipedia, which is used to define name-ethnicity to within 85% accuracy.
Understanding Patterns of International Scientific Collaboration
International scientific collaboration has increased both in volume and importance. In this article, the authors study the interpretation of macro-level data on international co-authorship…