Scaling Author Name Disambiguation with CNF Blocking

@article{Kim2017ScalingAN,
  title={Scaling Author Name Disambiguation with CNF Blocking},
  author={Kunho Kim and Athar Sefid and C. Lee Giles},
  journal={CoRR},
  year={2017},
  volume={abs/1709.09657}
}
An author name disambiguation (AND) algorithm identifies a unique author entity record from all similar or same publication records in scholarly or similar databases. Typically, a clustering method is used that requires calculating the similarity between every possible pair of records. However, the total number of pairs grows quadratically with the size of the author database, making such clustering difficult for millions of records. One remedy for this is a blocking function that reduces the number…
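
The pair-count argument in the abstract can be illustrated with a minimal Python sketch. It is not the CNF blocking method of the paper: it assumes a much simpler, hypothetical blocking key (first initial plus last name), and the helper names `block_key` and `candidate_pairs` as well as the sample records are made up for illustration. It only shows how grouping records into blocks shrinks the set of pairs that a clustering step would have to compare.

```python
from collections import defaultdict
from itertools import combinations


def block_key(name: str) -> str:
    """Hypothetical blocking key: first initial plus last name, lowercased."""
    parts = name.lower().replace(".", "").split()
    return f"{parts[0][0]} {parts[-1]}"


def candidate_pairs(names: list[str]) -> list[tuple[int, int]]:
    """Return index pairs of records that fall into the same block."""
    blocks = defaultdict(list)
    for idx, name in enumerate(names):
        blocks[block_key(name)].append(idx)
    pairs = []
    for members in blocks.values():
        # Only records inside one block are compared with each other.
        pairs.extend(combinations(members, 2))
    return pairs


# Illustrative records only (name variants of the paper's authors).
records = ["Kunho Kim", "K. Kim", "Athar Sefid", "A. Sefid",
           "C. Lee Giles", "C. L. Giles"]

# Without blocking, every record is compared with every other: n(n-1)/2 pairs.
all_pairs = len(records) * (len(records) - 1) // 2
blocked = candidate_pairs(records)

print(all_pairs, len(blocked))  # prints "15 3"
```

With six records, exhaustive comparison needs 15 pairs, while this simple key leaves 3. A key this coarse can also separate records that belong together (for example, a misspelled last name), which is the kind of trade-off a blocking function has to balance.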