Linear Discriminant Analysis (LDA) is a popular tool for multiclass discriminative dimensionality reduction. However, LDA suffers from two major problems: (1) it minimizes the Bayes error only when the classes are unimodal Gaussians with equal covariances (assuming full-rank matrices), and (2) its multiclass extension maximizes the sum of pairwise distances between the classes rather than simultaneously maximizing each pairwise distance. The second problem typically causes serious overlap in the projected space between classes that are close in the input space. To solve these two problems, this paper proposes Pareto Discriminant Analysis (PARDA). First, PARDA explicitly models each class as a multidimensional Gaussian with its own sample covariance. Second, PARDA decomposes the multiclass problem into a set of pairwise objective functions, each representing the distance between one pair of classes. Unlike existing multiclass extensions of Fisher discriminant analysis (FDA), which typically maximize the sum of pairwise distances between classes, PARDA simultaneously maximizes each pairwise distance, thus encouraging all classes to be equidistant from each other in the lower-dimensional space. Solving PARDA is a multiobjective optimization problem, that is, one that simultaneously optimizes several possibly conflicting objective functions, and the resulting solution is Pareto optimal. Experimental results on synthetic data, several image data sets, and data sets from the UCI repository show that PARDA compares favorably with standard and state-of-the-art multiclass extensions of LDA.
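The difference between maximizing the sum of pairwise distances and attending to each pairwise distance can be illustrated with a small sketch. It is not the PARDA algorithm: the three class means are hypothetical, every class is assumed to have identity covariance, the projection is to one dimension, and a max-min criterion found by grid search stands in for the multiobjective solver. The helper names (`pairwise_distances`, `dominates`) are likewise illustrative.

```python
import math
from itertools import combinations

# Hypothetical class means in 2-D: A and B are close, C is far away.
# Identity covariance is assumed for every class, so the separation of two
# classes along a unit direction is the squared gap between projected means.
MEANS = {"A": (0.0, 0.0), "B": (0.0, 1.0), "C": (10.0, 0.0)}


def pairwise_distances(theta):
    """Squared distances between projected class means along angle theta."""
    w = (math.cos(theta), math.sin(theta))
    proj = {k: w[0] * m[0] + w[1] * m[1] for k, m in MEANS.items()}
    return {(a, b): (proj[a] - proj[b]) ** 2
            for a, b in combinations(sorted(MEANS), 2)}


def dominates(u, v):
    """True if objective vector u Pareto-dominates v (maximization)."""
    return (all(x >= y for x, y in zip(u, v))
            and any(x > y for x, y in zip(u, v)))


# Coarse grid over 1-D projection directions in [0, pi).
thetas = [math.pi * k / 1800 for k in range(1800)]

# Criterion 1: maximize the sum of pairwise distances.
best_sum = max(thetas, key=lambda t: sum(pairwise_distances(t).values()))
# Criterion 2 (stand-in for a multiobjective solver): maximize the smallest
# pairwise distance.
best_min = max(thetas, key=lambda t: min(pairwise_distances(t).values()))

d_sum = pairwise_distances(best_sum)
d_min = pairwise_distances(best_min)

for name, d in (("sum criterion", d_sum), ("max-min criterion", d_min)):
    print(name, {pair: round(val, 3) for pair, val in d.items()})
```

In this toy setting the sum criterion is driven by the distant class C and leaves A and B almost on top of each other, while the max-min direction keeps every pair separated. Neither objective vector dominates the other, which is the kind of trade-off a Pareto-optimal solution has to resolve.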