Interprocessor synchronization, while extremely important for ensuring execution correctness, can be very costly in terms of both power and performance overheads. Unfortunately, many parallelizing compilers are very conservative in inserting barrier synchronizations at the end of each and every parallel loop. This can lead to significant power consumption in chip multiprocessor based execution environments. This paper proposes a compiler-directed approach for eliminating such synchronization calls between neighboring parallel loops. It achieves its goal by partitioning loop iterations across processors such that each processor executes iterations from both the loops that access the same set of array elements. We implemented the proposed approach using an experimental compilation framework and made experiments with ten SPEC benchmark codes. Our experiments clearly show that the proposed compiler-directed approach is very effective and reduces energy overheads due to synchronizations by about 75.5%, and this corresponds to around 5.48% saving on average in overall energy consumption.