Independent component analysis (ICA) is an efficient blind source separation technique, but classic ICA algorithms return the extracted independent components in a random order, so a subsequent identification step is required to find the desired component. This two-stage procedure is inefficient. By exploiting prior information, the ICA with reference (ICA-R) algorithm can extract the desired source signal in one stage, without subsequent processing. Many theoretical extensions of ICA-R exist, but few hardware implementations have been reported. This paper therefore presents an efficient VLSI design of a fast one-stage independent component extraction system based on the ICA-R algorithm. The proposed system consists of a Preprocess module and an Iteration module, both designed to be highly parallel and pipelined in order to accelerate the extraction process. The system is implemented on a Kintex-7 FPGA, and its performance is verified using synthesized signals. Experimental results show that the system extracts the desired components in one stage without subsequent identification, and that its highly parallel circuit structure speeds up component extraction by [Formula: see text] compared with a software implementation.
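To make the one-stage idea concrete, the following is a minimal software sketch, not the paper's hardware design and not the full constrained ICA-R update. It assumes that a preprocessing step centres and whitens the mixtures and that an iteration step refines a single weight vector; this mapping onto the Preprocess and Iteration modules is an assumption, since the abstract does not specify their contents. As a simplification, the reference signal is used only to seed a one-unit fixed-point (FastICA-style) iteration and to fix the output sign, whereas ICA-R proper enforces a closeness constraint to the reference during optimization. All function names, the mixing matrix, and the synthetic signals are hypothetical.

```python
import numpy as np


def whiten(x):
    """Centre the mixtures (rows of x) and decorrelate them to unit variance."""
    x = x - x.mean(axis=1, keepdims=True)
    cov = x @ x.T / x.shape[1]
    eigval, eigvec = np.linalg.eigh(cov)
    # Rows of v are eigenvectors scaled by 1/sqrt(eigenvalue).
    v = (eigvec / np.sqrt(eigval)).T
    return v @ x


def extract_with_reference(x, ref, max_iter=200, tol=1e-8):
    """Extract one component, seeding the weight vector from the reference.

    Simplified stand-in for ICA-R: the reference only initialises w and
    fixes the output sign; no closeness constraint is enforced.
    """
    z = whiten(x)
    ref = (ref - ref.mean()) / ref.std()
    w = z @ ref / z.shape[1]          # correlation of each whitened channel with ref
    w /= np.linalg.norm(w)
    for _ in range(max_iter):
        g = np.tanh(w @ z)
        # One-unit fixed-point update with the tanh nonlinearity.
        w_new = (z * g).mean(axis=1) - (1.0 - g**2).mean() * w
        w_new /= np.linalg.norm(w_new)
        converged = 1.0 - abs(w_new @ w) < tol   # sign flips are allowed
        w = w_new
        if converged:
            break
    y = w @ z
    return -y if y @ ref < 0 else y


def make_demo(n=5000, fs=1000.0, seed=0):
    """Hypothetical synthetic test: three sources, a fixed 3x3 mixture, a coarse reference."""
    rng = np.random.default_rng(seed)
    t = np.arange(n) / fs
    s = np.vstack([
        np.sin(2 * np.pi * 5 * t),        # desired source
        2 * ((13 * t) % 1.0) - 1,         # sawtooth interferer
        rng.laplace(size=n),              # noise-like interferer
    ])
    a = np.array([[1.0, 0.6, 0.3],
                  [0.5, 1.0, 0.4],
                  [0.2, 0.7, 1.0]])
    ref = np.sign(np.sin(2 * np.pi * 5 * t))  # square wave roughly matching the target
    return s, a @ s, ref
```

Calling `extract_with_reference(x, ref)` on the output of `make_demo()` returns a single signal directly, with no separate identification pass over a full set of components. Because the reference only sets the starting point here, a poor reference could let the iteration settle on a different source, which is the case the constrained ICA-R formulation is meant to prevent.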