This paper presents a family of multi-way partitioning algorithms for graphs with weighted nodes and edges, based on a two-stage construct-and-refine mechanism. The resulting partitions can be used to allocate program units to distributed processors so as to minimize completion time, and they also apply to design automation. In the constructive stage, four clustering algorithms build raw partitions. The refinement stage first adjusts the number of clusters to match the number of processors and then iteratively reduces the partitioning cost using a Kernighan-Lin-based heuristic. This approach extends state-of-the-art methods in several ways. A performance comparison of the proposed algorithms, based on experimental results, is given.
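As an informal illustration of the refinement stage described above, the following Python sketch takes a raw clustering, merges clusters until their number equals a target k, and then applies a greedy single-node move pass. It is a sketch under stated assumptions and not the algorithms evaluated in this paper: the cost is assumed to be the total weight of cut edges, the merge rule (join the two most strongly connected clusters) and the load bound `max_load` are hypothetical choices, and the greedy pass is a simplified stand-in for a Kernighan-Lin-based heuristic, which would normally use tentative swaps with node locking. The names `cut_cost`, `merge_to_k`, and `refine` are illustrative only.

```python
from itertools import combinations


def cut_cost(edges, part):
    """Assumed cost: total weight of edges whose endpoints lie in different parts."""
    return sum(w for (u, v), w in edges.items() if part[u] != part[v])


def merge_to_k(edges, part, k):
    """Merge the two most strongly connected clusters until only k remain."""
    part = dict(part)
    while len(set(part.values())) > k:
        labels = sorted(set(part.values()))
        link = {pair: 0 for pair in combinations(labels, 2)}
        for (u, v), w in edges.items():
            a, b = sorted((part[u], part[v]))
            if a != b:
                link[(a, b)] += w
        a, b = max(link, key=link.get)
        # Relabel every node of cluster b as cluster a.
        part = {n: (a if p == b else p) for n, p in part.items()}
    return part


def refine(edges, node_w, part, max_load):
    """Greedy pass: move a node when that lowers the cut and respects max_load."""
    part = dict(part)
    labels = sorted(set(part.values()))
    improved = True
    while improved:
        improved = False
        for n in sorted(part):
            # Never empty a part, so the number of parts stays fixed.
            if sum(1 for m in part if part[m] == part[n]) == 1:
                continue
            best, best_cost = part[n], cut_cost(edges, part)
            for target in labels:
                if target == part[n]:
                    continue
                load = sum(node_w[m] for m in part if part[m] == target)
                if load + node_w[n] > max_load:
                    continue
                trial = dict(part)
                trial[n] = target
                cost = cut_cost(edges, trial)
                if cost < best_cost:
                    best, best_cost = target, cost
            if best != part[n]:
                part[n] = best
                improved = True
    return part


# Hypothetical example: two heavy triangles joined by one light edge.
edges = {("a", "b"): 5, ("a", "c"): 5, ("b", "c"): 5,
         ("d", "e"): 5, ("d", "f"): 5, ("e", "f"): 5,
         ("a", "d"): 1}
node_w = {n: 1 for n in "abcdef"}
raw = {"a": 0, "b": 0, "c": 1, "d": 1, "e": 2, "f": 2}  # three raw clusters

merged = merge_to_k(edges, raw, k=2)
final = refine(edges, node_w, merged, max_load=3)
print(cut_cost(edges, raw), cut_cost(edges, merged), cut_cost(edges, final))
```

Each accepted move strictly lowers the assumed cut cost, so the pass terminates. Recomputing the cost for every trial move is quadratic and is done here only for clarity; an incremental gain update would be the usual choice in practice.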