Abstract

This thesis presents the design, implementation and performance benchmark of custom hardware for computing the Singular Value Decomposition (SVD) of the radio communication channel characteristic matrix. Software Defined Radio (SDR) is a concept in which the radio transceiver is implemented by software programs running on a processor. SVD of the channel characteristic matrix is used in pre-coding, equalization and beamforming for Multiple Input Multiple Output (MIMO) and Orthogonal Frequency Division Multiplexing (OFDM) communication systems (e.g., IEEE 802.11n). Since SVD is computationally intensive, it may require custom hardware to reduce the computing time. The pipeline processor developed in this thesis is suitable for computing the SVD of a sequence of 2×2 matrices. A stream of 2×2 matrices is sent to the custom hardware, which returns the corresponding streams of singular values and unitary matrices. The architecture is based on the two-sided Jacobi method, utilizing Coordinate Rotation Digital Computer (CORDIC) algorithms. A 2×2 SVD prototype was implemented on a Field-Programmable Gate Array (FPGA) for SDR applications. The prototype outputs the singular values and the corresponding unitary matrices in pipeline fashion while operating at a data rate of 324 MHz on a Virtex 6 (xc6vlx240t-lff1156) FPGA. The prototype design consists of fifty-five CORDIC cores, which take 32 percent of the available logic on the FPGA. It achieves the optimal pipeline rate, equal to the maximum hardware clock rate. The depth of the pipeline (latency) is 173 clock cycles for 16-bit data hardware. The proposed architecture provides performance gains over standard software libraries running on standard processors, such as the ZGESVD function of the Linear Algebra PACKage (LAPACK) library, which is based on the Golub-Kahan-Reinsch SVD algorithm.
The ZGESVD function of LAPACK, as implemented in Intel's Math Kernel Library (MKL), is projected to achieve a data rate of 40 MHz on a 2.50 GHz Intel Quad (Q9300) CPU. The bandwidth of the pipeline SVD hardware equals the clock frequency, so the data rate can reach 324 MHz on the ML605 board (Virtex 6 xc6vlx240t). The proposed architecture also has the potential to be easily extended to solve the 4×4 SVD problems used in pre-coding and equalization schemes. The proposed algorithm and design perform better for small matrices, even though the general time complexity is n^2, compared with the n log(n) complexity of the Brent-Luk-Van Loan (BLV) systolic array using non-pipelined 2×2 processors. The performance gain of the proposed design comes at the cost of increased circuit area.

M.S., Computer Engineering – Drexel University, 2010
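As a rough illustration of the two-sided rotation idea behind the architecture, the following Python sketch diagonalizes a single real 2×2 matrix with one left and one right rotation. It is a minimal sketch under stated assumptions, not the thesis design: it handles real rather than complex entries, uses floating point rather than 16-bit fixed point, and stands in `math.atan2`, `math.hypot`, `math.cos` and `math.sin` for the angle and rotation computations that a CORDIC core would perform in hardware. The function names `rot`, `svd2x2_real` and `reconstruct` are hypothetical.

```python
import math


def rot(theta):
    """Plane rotation matrix [[cos, -sin], [sin, cos]]."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def svd2x2_real(a, b, c, d):
    """Return (U, (s1, s2), Vt) with [[a, b], [c, d]] = U * diag(s1, s2) * Vt.

    Assumption: real entries only. The matrix is split into a scaled
    rotation part (e, h) and a scaled reflection part (f, g).
    """
    e, f = (a + d) / 2.0, (a - d) / 2.0
    g, h = (c + b) / 2.0, (c - b) / 2.0

    # Magnitudes and angles of the two parts. In hardware these would be
    # CORDIC vectoring operations; here library calls stand in for them.
    q = math.hypot(e, h)
    r = math.hypot(f, g)
    a1 = math.atan2(h, e)
    a2 = math.atan2(g, f)

    # Left and right rotation angles of the two-sided rotation.
    gamma = (a1 + a2) / 2.0
    beta = (a1 - a2) / 2.0

    s1, s2 = q + r, q - r
    u = rot(gamma)
    vt = rot(beta)

    # Keep both singular values non-negative by moving the sign into Vt.
    if s2 < 0:
        s2 = -s2
        vt[1] = [-vt[1][0], -vt[1][1]]
    return u, (s1, s2), vt


def reconstruct(u, s, vt):
    """Multiply U * diag(s) * Vt back together to check a factorization."""
    us = [[u[i][j] * s[j] for j in range(2)] for i in range(2)]
    return [[sum(us[i][k] * vt[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]
```

Calling `svd2x2_real(3.0, 0.0, 4.0, 5.0)` and passing the result to `reconstruct` should reproduce the input matrix to within floating-point rounding. Each call is independent of the others, which is the property that lets a stream of 2×2 matrices be processed in a pipeline.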
