Defect prediction comprises tasks that rely on methods built from software fault data sets, and it requires considerable effort. Although methods exist for classifying these data sets and localising defects, they are not sufficient unless repeated data points are first eliminated. The NASA Metrics Data Program (NASA MDP) and Software Research Laboratory (SOFTLAB) data sets are frequently used in this field. Here, we present a novel method, developed on the NASA MDP and SOFTLAB data sets, that detects repeated data points and analyses low-level metrics. We also present a framework and an algorithm for the proposed method. Statistical methods are used to detect the repeated data points. This work sheds new light on the extent to which repeated data adversely affect defect prediction performance and stresses the importance of using low-level metrics.