In this paper, we describe our process of creating a citation graph from a given repository of physics publications in LaTeX format. The task involved a series of information extraction, data cleaning, matching, and ranking steps. We discuss the challenges we faced along the way and the issues involved in resolving them.
