Low-complexity multiuser vector precoders FPGA implementation

Design and FPGA implementation of low-complexity multiuser vector precoders. M. Barrenechea, M. Mendicute, L. Barbero, J. Thompson. Signal Theory and Communications Area, Mondragon Goi Eskola Politeknikoa, University of Mondragon

Vector precoding In uncoordinated receiver scenarios, precoding at the base station allows the users' information streams to be separated. [Figure: multiuser MIMO downlink channel. At the base station, the precoder maps the data symbols s_1, ..., s_K to the transmitted signals x_1, ..., x_M, which pass through the wireless K x M channel matrix H and are received as y_1, ..., y_K by users 1, ..., K.]

Vector precoding Linear precoding techniques Main linear approaches: Zero-Forcing (ZF), Regularized, and MMSE (Wiener filter, WF).
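A minimal NumPy sketch of the first two approaches, assuming the standard textbook forms F = H^H (H H^H)^-1 for zero-forcing and F = H^H (H H^H + alpha I)^-1 for the regularized inverse. The function names, the K x M shape convention for H, and the regularization factor `alpha` are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def zf_precoder(H):
    """Zero-forcing precoder (assumed textbook form): right pseudo-inverse of H."""
    Hh = H.conj().T
    return Hh @ np.linalg.inv(H @ Hh)

def regularized_precoder(H, alpha):
    """Regularized inverse (assumed textbook form); alpha is a hypothetical tuning factor."""
    Hh = H.conj().T
    K = H.shape[0]
    return Hh @ np.linalg.inv(H @ Hh + alpha * np.eye(K))

# Random K x M channel with unit-variance complex Gaussian entries.
rng = np.random.default_rng(0)
K, M = 4, 4
H = (rng.standard_normal((K, M)) + 1j * rng.standard_normal((K, M))) / np.sqrt(2)

F = zf_precoder(H)
# With zero-forcing, channel times precoder is the identity,
# so each user receives only its own stream (plus noise).
zf_residual = np.max(np.abs(H @ F - np.eye(K)))
```

Setting `alpha` to zero in `regularized_precoder` recovers the zero-forcing solution; a positive `alpha` trades some residual interference for lower transmit power.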

Vector precoding The perturbation vector a is chosen to minimize the unscaled transmitted power. Another approach (WF-VP) chooses it to minimize the mean squared error instead.
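As a rough illustration of this search, the sketch below exhaustively tests a small set of candidate perturbation vectors. It assumes the common formulation in which the unscaled transmitted vector is F (s + tau a), with F a linear precoder, tau a modulo constant, and a a vector of complex integers. The names `vp_exhaustive`, `span`, and the value of `tau` are illustrative assumptions, not taken from the slides.

```python
import itertools
import numpy as np

def vp_exhaustive(F, s, tau, span=1):
    """Try every perturbation vector whose entries lie on a small grid of
    complex integers and keep the one with the lowest unscaled power
    ||F (s + tau * a)||^2. Assumed formulation, not the authors' code."""
    K = len(s)
    ints = range(-span, span + 1)
    points = [complex(re, im) for re in ints for im in ints]
    best_a, best_power = None, np.inf
    for cand in itertools.product(points, repeat=K):
        a = np.array(cand)
        power = np.linalg.norm(F @ (s + tau * a)) ** 2
        if power < best_power:
            best_a, best_power = a, power
    return best_a, best_power

rng = np.random.default_rng(1)
K = 2
H = (rng.standard_normal((K, K)) + 1j * rng.standard_normal((K, K))) / np.sqrt(2)
F = H.conj().T @ np.linalg.inv(H @ H.conj().T)   # zero-forcing precoder
s = np.array([1 + 1j, -1 + 1j]) / np.sqrt(2)     # two QPSK symbols
tau = 4.0                                        # hypothetical modulo constant

base_power = np.linalg.norm(F @ s) ** 2          # power without perturbation
best_a, best_power = vp_exhaustive(F, s, tau)
```

The number of candidates grows exponentially with the number of users (9^K here, 25^K with `span=2`), which is why a tree search is used in practice instead of an exhaustive one.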

Vector precoding Solution: search for the closest point in a lattice. The problem is similar to maximum likelihood (ML) detection in MIMO systems. The main differences are the following:
1. The VP lattice, which is infinite, must be restricted to a finite set to be implemented.
2. The VP search is not affected by noise.
3. Quantization is less critical in VP, since both s and a belong to known sets.
4. A failed search causes bit errors in MIMO detection, whereas in VP it only means a larger unscaled power and a noisier reception, which may degrade the BER slightly.

Fixed Sphere Encoder Sphere encoder (SE)

Fixed Sphere Encoder Sphere encoder search tree Sequential algorithm → suboptimal resource usage. Variable complexity → variable throughput.

Fixed Sphere Encoder [Barbero06] L. Barbero, "Rapid prototyping of a fixed-complexity sphere decoder and its application to iterative decoding of turbo-MIMO systems," PhD dissertation, University of Edinburgh, 2006.

Fixed Sphere Encoder [Figure: labels Real, Imaginary, %]

Channel matrix pre-processing Ordering of the channel matrix

Channel matrix pre-processing Ordering of the channel matrix Averaged values of u_ii at each level for the different orderings. Averaged number of evaluated nodes at each level.

Channel matrix pre-processing Effect of ordering on the number of evaluated nodes: 6x6 System with 16-QAM modulation

Simulation results

Simulation results Number of visited nodes:

Simulation results

FPGA implementation and optimization Implemented VP FSE algorithm
- 6 x 6 system
- 16-QAM modulation
- Tree configuration vector
- 3 pipeline stages
- Restricted group (5x5=25 points) of integers instead of the lattice.
- Channel ordering, which is carried out every transmission block, has not been considered.
- Distance computation: partial (PED) and accumulated (AED) Euclidean distances

FPGA implementation and optimization Algorithm implementation Implemented using Xilinx System Generator for DSP

FPGA implementation and optimization Special features

FPGA implementation and optimization 274 multipliers required → prohibitive for low-cost FPGA implementation. A series of hardware optimizations is proposed to reduce the number of required embedded multipliers.
Optimization 1: Rearrangement of complex multiplications
- Initial system → 4 multipliers and 2 adders
- Alternative complex multiplication → 3 multipliers and 5 adders
- Required number of multipliers after OPT. 1 → 224
Optimization 2: Hard quantization
If the values of u_ij/u_ii are quantized to a very small number of bits and the multiplications required to compute z_i are implemented in programmable logic, the number of multipliers drops to 74, at the cost of a slight increase in the number of required slices. A small degradation is introduced.
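The trade-off behind Optimization 1 can be checked with a short sketch. The slides do not give the exact rearrangement used, so the version below is one well-known form with 3 multiplications and 5 additions, shown only as an assumption; the function names are illustrative.

```python
def cmul_4m(a, b, c, d):
    """(a + jb)(c + jd) the direct way: 4 multiplications, 2 additions."""
    return a * c - b * d, a * d + b * c

def cmul_3m(a, b, c, d):
    """Same product with 3 multiplications and 5 additions
    (one common rearrangement, assumed here for illustration)."""
    k1 = c * (a + b)   # addition 1, multiplication 1
    k2 = a * (d - c)   # addition 2, multiplication 2
    k3 = b * (c + d)   # addition 3, multiplication 3
    return k1 - k3, k1 + k2   # additions 4 and 5

# (3 + 2j)(1 + 4j) = -5 + 14j with either form.
direct = cmul_4m(3, 2, 1, 4)    # (-5, 14)
reduced = cmul_3m(3, 2, 1, 4)   # (-5, 14)
```

Each complex product saves one embedded multiplier at the cost of three extra adders, which are cheap in FPGA fabric.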

FPGA implementation and optimization Optimization 3: Approximated Euclidean distance
Replace the ℓ2-norm calculation performed to obtain the PEDs by one of two simpler metrics, the first of which is the Manhattan distance (ℓ1-norm). Both techniques introduce a small BER performance degradation. However, after the implementation of OPT3 the number of multipliers is reduced to 30.
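A small sketch of why the Manhattan metric saves multipliers. It compares the squared Euclidean distance of a complex value with its Manhattan distance; how exactly the approximation is applied to the PEDs is not given on the slide, so the function names and the comparison are illustrative assumptions.

```python
import math

def sq_euclidean(z):
    """Squared Euclidean distance of a complex value: needs two multiplications."""
    return z.real ** 2 + z.imag ** 2

def manhattan(z):
    """Manhattan (l1) distance: absolute values and one addition, no multiplications."""
    return abs(z.real) + abs(z.imag)

z = 3 + 4j
exact = sq_euclidean(z)     # 25.0
approx = manhattan(z)       # 7.0

# The Manhattan distance always lies between |z| and sqrt(2) * |z|,
# so it overestimates the true magnitude by at most about 41 %.
lower, upper = abs(z), math.sqrt(2) * abs(z)
```

Because the approximation can change the ranking of candidate points that are close to each other, a small BER degradation such as the one reported on the slide is to be expected.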

FPGA implementation and optimization Optimization 4: Simplified 2-point slicer

FPGA implementation and optimization Summary of results The performance loss caused by the optimization strategies is just 0.2 dB at a BER of 10^-4. As for the HW resources, a reduced-complexity implementation has been achieved.

Summary and conclusions

End Thank you for your attention!! You can send any comments/requests/questions to: Dr. Mikel Mendicute [email_address]
