In this paper we describe paradigms for designing and building parallel computing machines. First, we elaborate on the uniqueness of the MIMD model for executing diverse applications. We then compare general-purpose and special-purpose parallel computer architectures in terms of cost, throughput, and efficiency, and describe how parallel computer architecture exploits parallelism and concurrency through pipelining. Since pipelining improves the performance of a machine by dividing the execution of an instruction into a number of stages, we describe how the performance of a vector processor is enhanced by employing multi-pipelining among its processing elements. We also elaborate on the RISC architecture and on pipelining in RISC machines. After comparing RISC computers with CISC computers, we observe that although the high speed of RISC computers is very desirable, the significance of a computer's speed depends on implementation strategies. CPU clock speed is not the only parameter to consider when moving system software from CISC to RISC computers; instruction size and format, addressing modes, instruction complexity, and the machine cycles required per instruction should also be taken into account. Considering all of these parameters together yields a performance gain. We discuss multiprocessor and data flow machines concisely. We then discuss three SIMD (Single Instruction stream, Multiple Data stream) machines: the DEC/MasPar MP-1, systolic processors, and wavefront array processors. The DEC/MasPar MP-1 is a massively parallel SIMD array processor on which a wide variety of number representations and arithmetic systems for computers can be implemented easily. The principal advantage of using such a 64×64 SIMD array of 4-bit processors to implement a computer arithmetic laboratory is its flexibility.
Comparing systolic processors with wavefront processors, we found that both are fast and both are implemented in VLSI. The major drawback of systolic processors is that, because of propagation delays in the connection buses, inputs may not be available when the clock ticks. Wavefront processors combine the systolic processor architecture with the data flow machine architecture. Because wavefront processors use an asynchronous data flow computing structure, timing in the interconnection buses, at the inputs, and at the outputs is not problematic.
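
To make the clocked data movement of a systolic processor concrete, the following is a minimal sketch of a linear systolic array computing a matrix-vector product, simulated one clock tick at a time. It is an illustration only and is not taken from any machine discussed in this paper: the function name `systolic_matvec` and the skewed input schedule are assumptions chosen for the example. The simulation idealizes the clock, so every value arrives at its processing element exactly on the tick; this is precisely the assumption that propagation delays can violate in real hardware.

```python
def systolic_matvec(a, x):
    """Simulate a linear systolic array computing y = a * x, tick by tick.

    Hypothetical illustration: one processing element (PE) per matrix row.
    Each PE keeps a running sum; the x values march through the PEs,
    moving one PE to the right on every clock tick.
    """
    rows, cols = len(a), len(x)
    if any(len(row) != cols for row in a):
        raise ValueError("every row of a must have len(x) entries")

    acc = [0] * rows       # accumulator held inside each PE
    pipe = [None] * rows   # x value currently latched in each PE

    # The last x value reaches the last PE at tick rows + cols - 2.
    for tick in range(rows + cols - 1):
        # Clock edge: every PE hands its x value to its right neighbour,
        # and PE 0 latches the next x value (or nothing once x is exhausted).
        incoming = x[tick] if tick < cols else None
        pipe = [incoming] + pipe[:-1]

        # Each PE holding a value performs one multiply-accumulate.
        for pe, value in enumerate(pipe):
            if value is not None:
                col = tick - pe  # skewed schedule: PE i sees x[j] at tick i + j
                acc[pe] += a[pe][col] * value

    return acc


if __name__ == "__main__":
    print(systolic_matvec([[1, 2], [3, 4], [5, 6]], [1, 1]))
```

In this sketch correctness depends on the global tick: a PE multiplies whatever is latched when the loop reaches it. A wavefront array would replace the shared tick with a local handshake, so that each PE fires only once its operands have actually arrived.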