Purpose
The purpose of this paper is to evaluate how to use the nodes in a cluster efficiently by studying the NAS Parallel Benchmarks (NASPB) on Intel Xeon and AMD Opteron dual CPU Linux clusters.

Design/methodology/approach
The performance results of the NASPB are presented both with one MPI process per node (1 ppn) and with two MPI processes per node (2 ppn). These benchmark results were analyzed by considering cache effects, code scalability, memory bandwidth within nodes, and the impact of MPI and the MPI communication network. Memory bandwidth was benchmarked using MPI versions of the Streams benchmarks. The impact of MPI and the MPI communication network was evaluated by benchmarking the performance of MPI sends and receives, MPI broadcast, and the MPI all‐to‐all routines.

Findings
The performance results from running the NASPB and from the memory bandwidth benchmarks show that better performance can sometimes be achieved using 1 ppn. The results also show that the AMD Opteron/Myrinet cluster achieves significantly better utilization of the second processor than the Intel Xeon/Myrinet cluster.

Practical implications
Most Linux clusters are purchased with two processors per node. One would like to run all applications on such a cluster using 2 ppn instead of 1 ppn in order to utilize the second processor on each node. However, our results show that this is not always the best choice. Users should always assess their program's performance with both 1 ppn and 2 ppn before running production calculations. This issue becomes even more important with the emergence of multi‐core processors.

Originality/value
To the best of the authors' knowledge, this is the only detailed comparison of AMD Opteron and Intel Xeon dual processor node parallel performance on large Myrinet clusters. The paper should be of value to anyone considering running on or purchasing an AMD‐ or Intel‐based Linux cluster.
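
The recommendation to assess a program with both 1 ppn and 2 ppn can be illustrated with a minimal sketch. It assumes that the two wall-clock times were measured on the same set of nodes, so that the 2 ppn run uses twice as many MPI processes. The function name `compare_ppn` and the returned fields are hypothetical and are not taken from the paper; the timings in the usage note are invented for illustration and are not measured results.

```python
def compare_ppn(time_1ppn: float, time_2ppn: float) -> dict:
    """Compare two wall-clock times (seconds) measured on the same nodes.

    Assumption: the 2 ppn run uses both processors of every node, so an
    ideal use of the second processor would halve the run time.
    """
    if time_1ppn <= 0 or time_2ppn <= 0:
        raise ValueError("timings must be positive")

    # Speedup of the 2 ppn run relative to the 1 ppn run (ideal value: 2.0).
    speedup = time_1ppn / time_2ppn

    return {
        "speedup_2ppn": speedup,
        # Gain contributed by the second processor, as a fraction of one
        # processor: 1.0 is ideal, 0.0 means no benefit, negative means
        # that 2 ppn is slower than 1 ppn.
        "second_cpu_gain": speedup - 1.0,
        "preferred": "2 ppn" if speedup > 1.0 else "1 ppn",
    }
```

For example, `compare_ppn(100.0, 80.0)` reports a speedup of 1.25 and prefers 2 ppn, while `compare_ppn(100.0, 120.0)` prefers 1 ppn. A negative `second_cpu_gain` corresponds to the case described above in which 1 ppn gives better performance.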