3. Vector precoding In uncoordinated receiver scenarios, the use of precoding techniques at the base station can allow the separation of users' information streams. [Figure: multiuser MIMO downlink channel. The precoder at the base station maps the symbols s_1, ..., s_K to the transmitted signals x_1, ..., x_M, which cross the K x M wireless channel matrix H and are received as y_1, ..., y_K by users 1, ..., K.]
4. Vector precoding Linear precoding techniques Main linear approaches: zero-forcing (ZF), regularized inversion, and MMSE (Wiener filter, WF).
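For reference, the three linear precoders are usually written in the textbook forms below. The notation is an assumption made here, not taken from the slide: H is the K x M channel matrix, s the symbol vector, gamma the unscaled transmit power, alpha a regularization constant, sigma_n^2 the noise variance and E_tr the transmit energy. Power-normalization factors may differ between formulations.

```latex
\begin{align}
\mathbf{x} &= \frac{1}{\sqrt{\gamma}}\,\mathbf{P}\,\mathbf{s},
  \qquad \gamma = \|\mathbf{P}\,\mathbf{s}\|^2 \\
\mathbf{P}_{\mathrm{ZF}}  &= \mathbf{H}^H\left(\mathbf{H}\mathbf{H}^H\right)^{-1} \\
\mathbf{P}_{\mathrm{reg}} &= \mathbf{H}^H\left(\mathbf{H}\mathbf{H}^H + \alpha\,\mathbf{I}_K\right)^{-1} \\
\mathbf{P}_{\mathrm{WF}}  &= \left(\mathbf{H}^H\mathbf{H} + \xi\,\mathbf{I}_M\right)^{-1}\mathbf{H}^H,
  \qquad \xi = \frac{K\sigma_n^2}{E_{\mathrm{tr}}}
\end{align}
```

The regularized and WF forms share the same structure, since (H^H H + xi I)^{-1} H^H = H^H (H H^H + xi I)^{-1}; they differ in how the regularization constant and the scaling are chosen.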
5. Vector precoding Vector precoding A perturbation vector a is added to the data symbols and chosen to minimize the unscaled transmitted power. Another approach is to choose a so as to minimize the mean square error (WF-VP).
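A hedged sketch of the two search problems, in the form commonly found in the vector precoding literature. The symbols tau (modulo constant), xi = K sigma_n^2 / E_tr (regularization term) and the Gaussian-integer search set are assumptions of this sketch rather than notation recovered from the slide.

```latex
\begin{align}
\mathbf{a}_{\mathrm{VP}} &= \arg\min_{\mathbf{a}\,\in\,\mathbb{Z}^K + j\mathbb{Z}^K}
  \left\| \mathbf{H}^H\left(\mathbf{H}\mathbf{H}^H\right)^{-1}
  \left(\mathbf{s} + \tau\mathbf{a}\right) \right\|^2 \\
\mathbf{a}_{\mathrm{WF\text{-}VP}} &= \arg\min_{\mathbf{a}\,\in\,\mathbb{Z}^K + j\mathbb{Z}^K}
  \left(\mathbf{s} + \tau\mathbf{a}\right)^H
  \left(\mathbf{H}\mathbf{H}^H + \xi\,\mathbf{I}_K\right)^{-1}
  \left(\mathbf{s} + \tau\mathbf{a}\right)
\end{align}
```

In both cases the inverted matrix can be factorized as U^H U with U upper triangular (Cholesky), so the cost becomes ||U(s + tau a)||^2 and the search can proceed level by level over the entries u_ij of U.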
6. Vector precoding Solution: search for the closest point in a lattice The problem is similar to maximum likelihood (ML) detection in MIMO systems. The main differences are the following: 1.- The VP lattice, which is infinite, must be truncated to be implementable. 2.- The VP search is not affected by noise. 3.- Quantization is less critical in VP, since both s and a belong to known sets. 4.- A failed search causes bit errors in MIMO detection, whereas in VP it only means a larger unscaled power and a noisier reception, which may degrade the BER slightly.
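The closest-point search over a truncated lattice can be illustrated with a brute-force sketch. This is not the search used in the presentation; it only shows what is being minimized. The function name `vp_brute_force`, the precoding matrix `P`, the modulo constant `tau` and the bound `max_int` are hypothetical names introduced for this example.

```python
import itertools


def vp_brute_force(P, s, tau, max_int=1):
    """Exhaustively find the perturbation a minimizing ||P (s + tau*a)||^2.

    P       : precoding matrix as a list of rows (assumed given, e.g. a ZF precoder)
    s       : list of complex data symbols, one per user
    tau     : modulo constant (assumed known to transmitter and receivers)
    max_int : the infinite lattice is truncated to integers in [-max_int, max_int]
    """
    K = len(s)
    ints = range(-max_int, max_int + 1)
    # Truncated candidate set of Gaussian integers for each entry of a.
    alphabet = [complex(re, im) for re in ints for im in ints]
    best_a, best_power = None, float("inf")
    for a in itertools.product(alphabet, repeat=K):
        d = [s[k] + tau * a[k] for k in range(K)]            # perturbed symbols
        x = [sum(row[k] * d[k] for k in range(K)) for row in P]  # unscaled signal
        power = sum(abs(v) ** 2 for v in x)                   # unscaled power
        if power < best_power:
            best_a, best_power = list(a), power
    return best_a, best_power


# Toy 2-user example with QPSK-like symbols; all values are illustrative.
P = [[1.0, 0.9], [0.9, 1.0]]
s = [1 + 1j, 1 + 1j]
a_opt, gamma = vp_brute_force(P, s, tau=4.0)
```

The cost of this exhaustive search grows exponentially with the number of users, which is why tree-search techniques are used in practice.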
14. Channel matrix pre-processing Ordering of the channel matrix Average values of u_ii at each level, depending on the ordering. Average number of evaluated nodes at each level.
15. Channel matrix pre-processing Effect of ordering on the number of evaluated nodes: 6x6 System with 16-QAM modulation
21. FPGA implementation and optimization Implemented VP FSE algorithm: - 6 x 6 system - 16-QAM modulation - Tree configuration vector - 3 pipeline stages - Restricted group (5x5=25 points) of integers instead of the full lattice - Channel ordering, which is carried out every transmission block, has not been considered - Distance computation: PED, AED
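A fixed-complexity tree search of this kind can be sketched in software as follows. This is a minimal illustration under stated assumptions, not the implemented hardware design: the function names, the upper-triangular matrix `U`, the modulo constant `tau` and the tree configuration `n_cfg` (number of children kept at each level) are hypothetical, and choosing the `n_cfg[i]` candidates nearest to the level center is an assumption about how nodes are selected.

```python
def fse_search(U, s, tau, n_cfg, max_int=2):
    """Fixed-complexity search for a minimizing ||U (s + tau*a)||^2.

    U       : upper-triangular matrix (list of rows) with nonzero diagonal
    n_cfg   : hypothetical tree configuration; n_cfg[i] children kept at level i
    max_int : candidates are integers in [-max_int, max_int] per dimension
              (max_int=2 gives the 5x5 = 25 points mentioned above)
    Returns (a, accumulated distance) of the best complete branch.
    """
    N = len(s)
    ints = range(-max_int, max_int + 1)
    alphabet = [complex(re, im) for re in ints for im in ints]
    branches = [([0j] * N, 0.0)]  # (partial solution, accumulated distance)
    for i in range(N - 1, -1, -1):  # from the last level up to the first
        expanded = []
        for a, aed in branches:
            # Level center: own symbol plus interference of decided levels j > i.
            z = s[i] / tau + sum(
                U[i][j] / U[i][i] * (a[j] + s[j] / tau) for j in range(i + 1, N)
            )
            # Keep only the n_cfg[i] candidates closest to -z.
            nearest = sorted(alphabet, key=lambda c: abs(c + z))[: n_cfg[i]]
            for c in nearest:
                ped = (tau * abs(U[i][i]) * abs(c + z)) ** 2  # partial distance
                child = a.copy()
                child[i] = c
                expanded.append((child, aed + ped))
        branches = expanded
    return min(branches, key=lambda b: b[1])


def vp_cost(U, s, tau, a):
    """Direct evaluation of ||U (s + tau*a)||^2 for checking purposes."""
    N = len(s)
    d = [s[k] + tau * a[k] for k in range(N)]
    return sum(abs(sum(U[i][j] * d[j] for j in range(i, N))) ** 2 for i in range(N))


# Toy 2x2 example; all values are illustrative.
U = [[1.0, 0.3 + 0.2j], [0.0, 0.8]]
s = [1 + 1j, -1 - 1j]
a_fse, aed_fse = fse_search(U, s, tau=4.0, n_cfg=[1, 3])
```

The number of visited branches is the product of the entries of `n_cfg`, so it is fixed in advance regardless of the channel, which is what makes this kind of search attractive for a pipelined hardware implementation.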
22. FPGA implementation and optimization Algorithm implementation Implemented using Xilinx System Generator for DSP
24. FPGA implementation and optimization 274 multipliers required: prohibitive for low-cost FPGA implementation. A series of hardware optimizations has been proposed to reduce the number of required embedded multipliers. Optimization 1: Rearrangement of complex multiplications - Initial system: 4 multipliers and 2 adders - Alternative complex multiplication: 3 multipliers and 5 adders - Required number of multipliers after OPT. 1: 224 Optimization 2: Hard quantization If the values of u_ij/u_ii are quantized to a very small number of bits, and the multiplications required to compute z_i are implemented using programmable logic, the number of multipliers is reduced to 74, although the number of required slices increases slightly. A small degradation is introduced.
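The rearrangement in Optimization 1 corresponds to a well-known identity for complex products. The sketch below shows one common variant with 3 real multiplications and 5 real additions; the function names are hypothetical, and the exact variant used in the implementation is not stated in the slides.

```python
def cmul_4mult(a, b, c, d):
    """(a + jb)(c + jd) the direct way: 4 multiplications, 2 additions."""
    return (a * c - b * d, a * d + b * c)


def cmul_3mult(a, b, c, d):
    """(a + jb)(c + jd) with 3 multiplications and 5 additions."""
    k1 = c * (a + b)   # 1 addition, 1 multiplication
    k2 = a * (d - c)   # 1 addition, 1 multiplication
    k3 = b * (c + d)   # 1 addition, 1 multiplication
    return (k1 - k3, k1 + k2)  # 2 additions


# Both forms give the same (real, imaginary) pair.
direct = cmul_4mult(3, 4, 5, 6)
rearranged = cmul_3mult(3, 4, 5, 6)
```

Each complex product rewritten this way trades one embedded multiplier for three extra adders, which are cheap in FPGA fabric.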
25. FPGA implementation and optimization Optimization 3: Approximated Euclidean distance Replace the ℓ2-norm calculation performed to obtain the PEDs by a simpler method. 1.- The Manhattan distance metric (ℓ1-norm) 2.- The metric Both of these techniques introduce a small BER performance degradation. However, after the implementation of OPT3 the number of multipliers has been reduced to 30.
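The difference between the Euclidean and Manhattan metrics for a single complex value can be sketched as follows. The function names are hypothetical, and how the metric is combined with the u_ii factors in the actual PED computation is not covered here.

```python
import math


def euclidean(z):
    """l2 metric of a complex value: needs two squarings, i.e. multipliers."""
    return math.sqrt(z.real * z.real + z.imag * z.imag)


def manhattan(z):
    """l1 metric of a complex value: only absolute values and one addition."""
    return abs(z.real) + abs(z.imag)


z = 3 - 4j
d2 = euclidean(z)   # 5.0
d1 = manhattan(z)   # 7.0
```

The Manhattan metric never underestimates the Euclidean one and overestimates it by at most a factor of sqrt(2), which is consistent with the approximation changing the ranking of candidates only occasionally.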
28. FPGA implementation and optimization Summary of results The performance loss caused by the optimization strategies is just 0.2 dB at a BER of 10^-4. As for the HW resources, a reduced-complexity implementation has been achieved.
31. End Thank you for your attention!! You can send any comments/requests/questions to: Dr. Mikel Mendicute [email_address]