SlideShare une entreprise Scribd logo
1  sur  6
Télécharger pour lire hors ligne
International Journal of Engineering Science Invention
ISSN (Online): 2319 – 6734, ISSN (Print): 2319 – 6726
www.ijesi.org Volume 3 Issue 4 ǁ April 2014 ǁ PP.15-20
www.ijesi.org 15 | Page
An Efficient Adaptive Fir Filter Based On Distributed Arithmetic
1,
M.Usha , 2,
R.Ramadoss
1,
P.G.Scholar , 2,
Assistant Professor
1,2
,Dept. of Electronics and communication Engineering Sri Muthukumaran Institute Of Technology
Chennai,Tamilnadu,India
ABSTRACT: Adaptive filtering constitutes an important class of DSP algorithms employed in several hand
held mobile devices for applications such as echo cancellation, signal de-noising, and channel equalization.
The throughput of the proposed design is increased by parallel lookup table (LUT) update.The 16:1 multiplexer
is replace by a 8:1 and 2:1 MUX. The conventional adder-based shift accumulation for DA-based inner-product
computation is replaced by conditional signed carry-save accumulation in order to reduce the area complexity;
the power consumption of the proposed design is reduced by using a fast bit clock for all operations. It involves
the same number of multiplexors, smaller LUT, and nearly half the number of adders compared to the existing
DA-based design. The proposed architecture is found to involve significantly 29% less area-13% less power
and throughput compared with the existing DA-based implementations of FIR filter and a increase in operating
frequency of 12MHZ is achieved.
KEYWORDS: Adaptive filter, circuit optimization, distributed arithmetic (DA), least mean square (LMS)
algorithm
I. INTRODUCTION
Adaptive filters are widely used in several digital signal processing (DSP) applications. The tap-delay
line finite impulse response (FIR) filters whose weights are updated by the famous Widrow-Hoff least mean
square (LMS) algorithm [1] is the most popularly used adaptive filter not only due to its simplicity but also due
to its satisfactory convergence performance [2]. The direct form FIR filter configuration for the implementation
of LMS adaptive filter results in either zero or lower adaptation-delay but involves a large critical-path due to an
inner-product computation to obtain the filter output. Therefore, when the input signal has high sample-rate, the
critical-path could exceed the sample period. In such cases, it is necessary to reduce the critical-path by
pipelined implementation. Since the conventional LMS algorithm does not support pipelined implementation
due to its recursive behavior, it is modified to a form called delayed LMS algorithm [3], which allows pipelined
implementation of different sections of the adaptive filter. State-of-the-art hardware implementation of adaptive
filters typically involves DSP microprocessors or custom logic design using one or more hardware multiply-
accumulate (MAC) units. While an implementation employing DSP microprocessors provides easy
programmability, the serial implementation on a single processing unit adversely affects the throughput of these
filters. This is especially true for higher order filters. Custom logic design using one or more hardware MAC
units may be used to parallelize the implementation and thus improve the throughput but at the cost of increased
logic complexity, chip area usage, and power consumption [4].
Various types of DSP operations are employed in practice. Filtering is one of the most widely used
signal processing operations [5]. For FIR filters, output y(n) is a linear convolution of weights wn and inputs.
Distributed arithmetic (DA) is one way to implement convolution without multiplier, where the MAC
operations are replaced by a series of LUT access and summations. Techniques, such as ROM decomposition
[6] and offset-binary coding (OBC) [7] can reduce the LUT size, which would otherwise increase exponentially
with the filter lengthN+1for conventional DA. However, in many applications such as echo cancelation and
system identification, coefficient adaptation is needed. This adaptation makes it challenging to implement DA-
based adaptive filters with low cost due to the necessity of updating LUTs. Several approaches have been
developed for DA-based adaptive filters, i.e., from the point of view of reducing logic complexity [8]. Recently,
a DA-based FIR adaptive filter implementation scheme has been presented in [9], which uses extra “auxiliary”
LUTs to help in the updating; however, memory usage is doubled. The rest of this paper is organized as follows.
Section 2 describes the Review of LMS Adaptive Algorithms. Section 3 introduces the proposed architecture
and Proposed DA based approach for inner product computation, Section 4 describes the simulation results for
proposed system. Conclusions are finally drawn in Section 5.
An Efficient Adaptive Fir Filter Based…
www.ijesi.org 16 | Page
II. EXISTING SYSTEM
A. Review of LMS Adaptive Algorithms
During each cycle, the LMS algorithm computes a filter output and an error value that is equal to the
difference between the current filter output and the desired response. The estimated error is then used to update
the filter weights in every training cycle. The weights of LMS adaptive filter during the nth iteration are updated
according to the following equations:
w(n+1) = w(n) + μ·e(n)·x(n) (1)
where
e(n) = d(n) − y(n) (2)
y(n) = w qT(n)·x(n). (3)
The input vector x(n)and the weight vector w(n)at the nth training iteration are respectively given by
x(n = [x(n),x(n−1),...,x(n−N+1)]T (4)
w(n)= [w0(n),w1(n),...,wN−1(n)]T. (5)
d(n) is the desired response, and y(n) is the filter output of the nth iteration. e(n)denotes the error computed
during the nth iteration, which is used to update the weights, μ is the convergence factor, and N is the filter
length.In the case of pipelined designs, the feedback error e(n) becomes available after certain number of
cycles, called the “adaptation delay.” The pipelined architectures therefore use the delayed error e(n−m) for
updating the current weight instead of the most recent error, where m is the adaptation de-lay. The weight-
update equation of such delayed LMS adaptive filter is given by
w(n+1)= w(n)+μ·e(n−m)·x(n−m). (6)
Fig. 1. Existing System Block Diagram
III. PROPOSED SYSTEM
A. Proposed DA based Adaptive Filter
The proposed structure of DA-based adaptive filter of length N=4 is shown in Fig. 2. It consists of a
four-point inner-product block and a weight-increment block along with additional circuits for the computation
of error value e(n) and control word t for the barrel shifters. The four-point inner-product block Fig. 3 includes
a DA table consisting of an array of 15 registers which stores the partial inner products yl for 0 < l ≤ 15and a 16
: 1 multiplexor (MUX) to select the content of one of those registers. Bit slices of weights A={w3l w2l w1l
w0l} for 0 ≤ l ≤ L−1are fed to the MUX as control in LSB-to-MSB order, and the output of the MUX is fed to
the carry-save accumulator (shown in Fig. 2). After L bit cycles, the carry-save accumulator shift accumulates
all the partial inner products and generates a sum word and a carry word of size (L+2) bit each. The carry and
sum words are shifted added with an input carry “1” to generate filter output which is subsequently subtracted
from the desired output d(n)to obtain the error e(n). The magnitude of the computed error is decoded to generate
the control word t for the barrel shifter. The logic used for the generation of control word t to be used for the
barrel shifter. The convergence factor μ is usually taken to be O(1/N). We have taken μ= 1/N. However, one can
take μ as 2 −i/N, where i is a small integer. The number of shifts t in that case is increased by i, and the input to
the barrel shifters is pre shifted by i locations accordingly to reduce the hardware complexity.
An Efficient Adaptive Fir Filter Based…
www.ijesi.org 17 | Page
Fig. 2. Proposed System Structure
The weight-increment unit consists of four barrel shifters and four adder/subtractor cells. The barrel
shifter shifts the different input values xk for k= 0,1,...,N−1 by appropriate number of locations (deter-mined by
the location of the most significant one in the estimated error). The barrel shifter yields the desired increments
to be added with or subtracted from the current weights. The sign bit of the error is used as the control for
adder/subtractor cells such that, when sign bit is zero or one, the barrel-shifter output is respectively added with
or subtracted from the content of the corresponding current value in the weight register.
B. Proposed DA based approach for inner product computation
The LMS adaptive filter, in each cycle, needs to perform an inner-product computation which
contributes to the most of the critical path. For simplicity of presentation, let the inner product of (3) be given
by
Where Wk and XK for 0≤ k ≤ N−1 form the N-point vectors corresponding the current weights and most recent
N−1 input, respectively. Assuming L to be the bit width of the weight, each component of the weight vector
may be expressed in two’s complement representation
Where Wkl denotes the lth
bit of Wk. Substituting (8), we can write (7) in an expanded form
To convert the sum-of-products form of (7) into a distributed form, the order of summations over the indices k
and l in (9) can be interchanged to have
(10)
and the inner product given by (10) can be computed as
where
An Efficient Adaptive Fir Filter Based…
www.ijesi.org 18 | Page
Since any element of the N-point bit sequence {wkl for 0≤ k≤N−1}can either be zero or one, the partial sum yl
for l=0,1,...,L−1can have possible values. If all the possible values of yl are precomputed and stored in a
LUT, the partial sums yl can be read out from the LUT using the bit sequence{wkl}as address bits for
computing the inner product.The inner product can be calculated in L cycles of shift accumulation, followed by
LUT-read operations corresponding to L number of bit slices{wkl}for 0≤ l ≤ L−1.The bit slices of vector ware
fed one after the next in the least significant bit (LSB) to the most significant bit (MSB) order to the carry-save
accumulator. However, the negative (two’s complement) of the LUT output needs to be accumulated in case of
MSB slices. Therefore, all the bits of LUT output are passed through XOR gates with a sign-control input which
is set to one only when the MSB slice appears complement of the LUT output corresponding to the MSB slice
but do not affect the output for other bit slices. Finally, the sum and carry words obtained after L clock cycles
are required to be added by a final adder (not shown in the figure), and the input carry of the final adder is
required to be set to one to account for the two’s complement operation of the LUT output corresponding to the
MSB slice.
The content of the kth
LUT location can be expressed as
(12)
where kj is the (j+1)th
bit of kth
LUT location can be expressed as N-bit binary representation of integer k for
0≤k≤ −1. Note that ck for 0≤k≤ −1 can be pre computed and stored in RAM-based LUT of words.
Fig. 3. . DA table for generation of possible sums of input samples.
However, instead of storing words in LUT, an example of such a DA table for N=4 is shown in Fig. 4. It
contains only 15 registers to store the as address. The XOR gates thus produce the one’s pre computed sums of
input words. Seven new values of ck are computed by seven adders in parallel.
Fig. 4 Internal blocks of inner product computation block.
An Efficient Adaptive Fir Filter Based…
www.ijesi.org 19 | Page
IV. RESULTS AND DISCUSSION
The proposed System was executed on Windows XP operating system at an operating frequency of
2.0GHz using Xillinx simu lator.
Fig. 5. Proposed method pin diagram without slow clock
Fig. 6. Filter output wave form
Its seen from the synthesis result that the frequency of the proposed system is increased by 12 MHZ.
V. CONCLUSION
We have suggested an efficient pipelined architecture for low-power, high-throughput, and low-area
implementation of DA-based adaptive filter. Throughput rate is significantly enhanced by parallel LUT update
and concurrent processing of filtering operation and weight-update operation. We have also proposed a carry-
save accumulation scheme of signed partial inner products for the computation of filter output. From the
synthesis results, we find that the proposed design consumes 13% less power and 29% less ADP over our
previous DA-based FIR adaptive filter in average for filter lengths N=16 Compared to the best of other existing
designs, our proposed architecture provides 9.5 times less power and 4.6 times less ADP. Offset binary coding
is popularly used to reduce the LUT size to half for area-efficient implementation of DA which can be applied
to our design as well.
An Efficient Adaptive Fir Filter Based…
www.ijesi.org 20 | Page
REFERENCES
[1] B. Widrow and S. D. Stearns, Adaptive signal processing. Prentice Hall, Englewood Cliffs, NJ, 1985.
[2] S. Haykin and B. Widrow, Least-mean-square adaptive filters. Wiley-Interscience, Hoboken, NJ, 2003.
[3] M. Meyer and P. Agrawal, “A modular pipelined implementation of a delayed LMS transversal adaptive filter,” in Proc. IEEE
International Symposium on Circuits and Systems, ISCAS ’09, May 1990, pp. 1943–1946.
[4] D. Allred, V. Krishnan, W. Huang, and D. Anderson, “Implementation of an LMS adaptive filter on an FPGA employing
multiplexed multiplier architecture,” in Proc. Asilomar Conf. Signals Systems Computers, Nov. 2003, pp.918-921.
[5] S. K. Mitra, Digital Signal Processing: A Computer-Based Approach, 2nd
. New York: McGraw-Hill, 2001.
[6] K.K.Parhi, VLSI Digital Signal Processing Systems: Design and Implementation. Hoboken, NJ: Wiley, 1999.
[7] S. A. White, “Applications of distributed arithmetic to digital signal processing: A tutorial review,”IEEE ASSP Mag., vol. 6, no.
3, pp. 4–19,Jul. 1989.
[8] D. J. Allred, H. Yoo, V. Krishnan, W. Huang, and D. V. Anderson, “An FPGA implementation for a high throughput adaptive
filter using distributed arithmetic,” in Proc. 12th Annu. IEEE Symp. Field-Programmable Custom Comput. Mach., 2004, pp.
324–325
[9] D. J. Allred, H. Yoo, V. Krishnan, W. Huang, and D. V. Anderson, “A novel high performance distributed arithmetic adaptive
filter implementation on an FPGA,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., 2004, vol. 5, pp. V-161–V-164

Contenu connexe

Tendances

A Review: Compensation of Mismatches in Time Interleaved Analog to Digital Co...
A Review: Compensation of Mismatches in Time Interleaved Analog to Digital Co...A Review: Compensation of Mismatches in Time Interleaved Analog to Digital Co...
A Review: Compensation of Mismatches in Time Interleaved Analog to Digital Co...
IJERA Editor
 
Flexible dsp accelerator architecture exploiting carry save arithmetic
Flexible dsp accelerator architecture exploiting carry save arithmeticFlexible dsp accelerator architecture exploiting carry save arithmetic
Flexible dsp accelerator architecture exploiting carry save arithmetic
Nexgen Technology
 
Iisrt zzzz shamili ch
Iisrt zzzz shamili chIisrt zzzz shamili ch
Iisrt zzzz shamili ch
IISRT
 
Ultrasound Modular Architecture
Ultrasound Modular ArchitectureUltrasound Modular Architecture
Ultrasound Modular Architecture
Jose Miguel Moreno
 

Tendances (20)

Chap10 slides
Chap10 slidesChap10 slides
Chap10 slides
 
Chap9 slides
Chap9 slidesChap9 slides
Chap9 slides
 
Chap4 slides
Chap4 slidesChap4 slides
Chap4 slides
 
Low power tool paper
Low power tool paperLow power tool paper
Low power tool paper
 
On Channel Estimation of OFDM-BPSK and -QPSK over Nakagami-m Fading Channels
On Channel Estimation of OFDM-BPSK and -QPSK over Nakagami-m Fading ChannelsOn Channel Estimation of OFDM-BPSK and -QPSK over Nakagami-m Fading Channels
On Channel Estimation of OFDM-BPSK and -QPSK over Nakagami-m Fading Channels
 
A Review: Compensation of Mismatches in Time Interleaved Analog to Digital Co...
A Review: Compensation of Mismatches in Time Interleaved Analog to Digital Co...A Review: Compensation of Mismatches in Time Interleaved Analog to Digital Co...
A Review: Compensation of Mismatches in Time Interleaved Analog to Digital Co...
 
Chap8 slides
Chap8 slidesChap8 slides
Chap8 slides
 
Chap12 slides
Chap12 slidesChap12 slides
Chap12 slides
 
A Novel CAZAC Sequence Based Timing Synchronization Scheme for OFDM System
A Novel CAZAC Sequence Based Timing Synchronization Scheme for OFDM SystemA Novel CAZAC Sequence Based Timing Synchronization Scheme for OFDM System
A Novel CAZAC Sequence Based Timing Synchronization Scheme for OFDM System
 
Dg34662666
Dg34662666Dg34662666
Dg34662666
 
parallel
parallelparallel
parallel
 
Review on Doubling the Rate of SEFDM Systems using Hilbert Pairs
Review on Doubling the Rate of SEFDM Systems using Hilbert PairsReview on Doubling the Rate of SEFDM Systems using Hilbert Pairs
Review on Doubling the Rate of SEFDM Systems using Hilbert Pairs
 
A Novel Route Optimized Cluster Based Routing Protocol for Pollution Controll...
A Novel Route Optimized Cluster Based Routing Protocol for Pollution Controll...A Novel Route Optimized Cluster Based Routing Protocol for Pollution Controll...
A Novel Route Optimized Cluster Based Routing Protocol for Pollution Controll...
 
Flexible dsp accelerator architecture exploiting carry save arithmetic
Flexible dsp accelerator architecture exploiting carry save arithmeticFlexible dsp accelerator architecture exploiting carry save arithmetic
Flexible dsp accelerator architecture exploiting carry save arithmetic
 
Discrete wavelet transform-based RI adaptive algorithm for system identification
Discrete wavelet transform-based RI adaptive algorithm for system identificationDiscrete wavelet transform-based RI adaptive algorithm for system identification
Discrete wavelet transform-based RI adaptive algorithm for system identification
 
An Advanced Implementation of a Digital Artificial Reverberator
An Advanced Implementation of a Digital Artificial ReverberatorAn Advanced Implementation of a Digital Artificial Reverberator
An Advanced Implementation of a Digital Artificial Reverberator
 
Iisrt zzzz shamili ch
Iisrt zzzz shamili chIisrt zzzz shamili ch
Iisrt zzzz shamili ch
 
Ultrasound Modular Architecture
Ultrasound Modular ArchitectureUltrasound Modular Architecture
Ultrasound Modular Architecture
 
A comparative study of different multiplier designs
A comparative study of different multiplier designsA comparative study of different multiplier designs
A comparative study of different multiplier designs
 
05725150
0572515005725150
05725150
 

Similaire à D0341015020

Efficient realization-of-an-adfe-with-a-new-adaptive-algorithm
Efficient realization-of-an-adfe-with-a-new-adaptive-algorithmEfficient realization-of-an-adfe-with-a-new-adaptive-algorithm
Efficient realization-of-an-adfe-with-a-new-adaptive-algorithm
Cemal Ardil
 
Design and Implementation of a Programmable Truncated Multiplier
Design and Implementation of a Programmable Truncated MultiplierDesign and Implementation of a Programmable Truncated Multiplier
Design and Implementation of a Programmable Truncated Multiplier
ijsrd.com
 
A Novel Approach of Area-Efficient FIR Filter Design Using Distributed Arithm...
A Novel Approach of Area-Efficient FIR Filter Design Using Distributed Arithm...A Novel Approach of Area-Efficient FIR Filter Design Using Distributed Arithm...
A Novel Approach of Area-Efficient FIR Filter Design Using Distributed Arithm...
IOSR Journals
 

Similaire à D0341015020 (20)

Ijecet 06 10_004
Ijecet 06 10_004Ijecet 06 10_004
Ijecet 06 10_004
 
Multi-carrier Equalization by Restoration of RedundancY (MERRY) for Adaptive ...
Multi-carrier Equalization by Restoration of RedundancY (MERRY) for Adaptive ...Multi-carrier Equalization by Restoration of RedundancY (MERRY) for Adaptive ...
Multi-carrier Equalization by Restoration of RedundancY (MERRY) for Adaptive ...
 
FOLDED ARCHITECTURE FOR NON CANONICAL LEAST MEAN SQUARE ADAPTIVE DIGITAL FILT...
FOLDED ARCHITECTURE FOR NON CANONICAL LEAST MEAN SQUARE ADAPTIVE DIGITAL FILT...FOLDED ARCHITECTURE FOR NON CANONICAL LEAST MEAN SQUARE ADAPTIVE DIGITAL FILT...
FOLDED ARCHITECTURE FOR NON CANONICAL LEAST MEAN SQUARE ADAPTIVE DIGITAL FILT...
 
Low Power Adaptive FIR Filter Based on Distributed Arithmetic
Low Power Adaptive FIR Filter Based on Distributed ArithmeticLow Power Adaptive FIR Filter Based on Distributed Arithmetic
Low Power Adaptive FIR Filter Based on Distributed Arithmetic
 
Design of an Adaptive Hearing Aid Algorithm using Booth-Wallace Tree Multiplier
Design of an Adaptive Hearing Aid Algorithm using Booth-Wallace Tree MultiplierDesign of an Adaptive Hearing Aid Algorithm using Booth-Wallace Tree Multiplier
Design of an Adaptive Hearing Aid Algorithm using Booth-Wallace Tree Multiplier
 
International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)
 
Performance Analysis of OFDM Transceiver with Folded FFT and LMS Filter
Performance Analysis of OFDM Transceiver with Folded FFT and LMS FilterPerformance Analysis of OFDM Transceiver with Folded FFT and LMS Filter
Performance Analysis of OFDM Transceiver with Folded FFT and LMS Filter
 
Efficient realization-of-an-adfe-with-a-new-adaptive-algorithm
Efficient realization-of-an-adfe-with-a-new-adaptive-algorithmEfficient realization-of-an-adfe-with-a-new-adaptive-algorithm
Efficient realization-of-an-adfe-with-a-new-adaptive-algorithm
 
Optimized implementation of an innovative digital audio equalizer
Optimized implementation of an innovative digital audio equalizerOptimized implementation of an innovative digital audio equalizer
Optimized implementation of an innovative digital audio equalizer
 
Time domain analysis and synthesis using Pth norm filter design
Time domain analysis and synthesis using Pth norm filter designTime domain analysis and synthesis using Pth norm filter design
Time domain analysis and synthesis using Pth norm filter design
 
Design of Optimized FIR Filter Using FCSD Representation
Design  of  Optimized  FIR  Filter  Using  FCSD Representation Design  of  Optimized  FIR  Filter  Using  FCSD Representation
Design of Optimized FIR Filter Using FCSD Representation
 
Design & Implementation of LUT Based Multiplier Using APCOMS Technique
Design & Implementation of LUT Based Multiplier Using APCOMS TechniqueDesign & Implementation of LUT Based Multiplier Using APCOMS Technique
Design & Implementation of LUT Based Multiplier Using APCOMS Technique
 
System performance evaluation of fixed and adaptive resource allocation of 3 ...
System performance evaluation of fixed and adaptive resource allocation of 3 ...System performance evaluation of fixed and adaptive resource allocation of 3 ...
System performance evaluation of fixed and adaptive resource allocation of 3 ...
 
Comparative Analysis of Different Wavelet Functions using Modified Adaptive F...
Comparative Analysis of Different Wavelet Functions using Modified Adaptive F...Comparative Analysis of Different Wavelet Functions using Modified Adaptive F...
Comparative Analysis of Different Wavelet Functions using Modified Adaptive F...
 
B43030508
B43030508B43030508
B43030508
 
Modified Adaptive Lifting Structure Of CDF 9/7 Wavelet With Spiht For Lossy I...
Modified Adaptive Lifting Structure Of CDF 9/7 Wavelet With Spiht For Lossy I...Modified Adaptive Lifting Structure Of CDF 9/7 Wavelet With Spiht For Lossy I...
Modified Adaptive Lifting Structure Of CDF 9/7 Wavelet With Spiht For Lossy I...
 
Design and Implementation of a Programmable Truncated Multiplier
Design and Implementation of a Programmable Truncated MultiplierDesign and Implementation of a Programmable Truncated Multiplier
Design and Implementation of a Programmable Truncated Multiplier
 
An fpga implementation of the lms adaptive filter
An fpga implementation of the lms adaptive filter An fpga implementation of the lms adaptive filter
An fpga implementation of the lms adaptive filter
 
An fpga implementation of the lms adaptive filter
An fpga implementation of the lms adaptive filterAn fpga implementation of the lms adaptive filter
An fpga implementation of the lms adaptive filter
 
A Novel Approach of Area-Efficient FIR Filter Design Using Distributed Arithm...
A Novel Approach of Area-Efficient FIR Filter Design Using Distributed Arithm...A Novel Approach of Area-Efficient FIR Filter Design Using Distributed Arithm...
A Novel Approach of Area-Efficient FIR Filter Design Using Distributed Arithm...
 

Dernier

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Dernier (20)

Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 

D0341015020

  • 1. International Journal of Engineering Science Invention ISSN (Online): 2319 – 6734, ISSN (Print): 2319 – 6726 www.ijesi.org Volume 3 Issue 4 ǁ April 2014 ǁ PP.15-20 www.ijesi.org 15 | Page An Efficient Adaptive Fir Filter Based On Distributed Arithmetic 1, M.Usha , 2, R.Ramadoss 1, P.G.Scholar , 2, Assistant Professor 1,2 ,Dept. of Electronics and communication Engineering Sri Muthukumaran Institute Of Technology Chennai,Tamilnadu,India ABSTRACT: Adaptive filtering constitutes an important class of DSP algorithms employed in several hand held mobile devices for applications such as echo cancellation, signal de-noising, and channel equalization. The throughput of the proposed design is increased by parallel lookup table (LUT) update.The 16:1 multiplexer is replace by a 8:1 and 2:1 MUX. The conventional adder-based shift accumulation for DA-based inner-product computation is replaced by conditional signed carry-save accumulation in order to reduce the area complexity; the power consumption of the proposed design is reduced by using a fast bit clock for all operations. It involves the same number of multiplexors, smaller LUT, and nearly half the number of adders compared to the existing DA-based design. The proposed architecture is found to involve significantly 29% less area-13% less power and throughput compared with the existing DA-based implementations of FIR filter and a increase in operating frequency of 12MHZ is achieved. KEYWORDS: Adaptive filter, circuit optimization, distributed arithmetic (DA), least mean square (LMS) algorithm I. INTRODUCTION Adaptive filters are widely used in several digital signal processing (DSP) applications. The tap-delay line finite impulse response (FIR) filters whose weights are updated by the famous Widrow-Hoff least mean square (LMS) algorithm [1] is the most popularly used adaptive filter not only due to its simplicity but also due to its satisfactory convergence performance [2]. The direct form FIR filter configuration for the implementation of LMS adaptive filter results in either zero or lower adaptation-delay but involves a large critical-path due to an inner-product computation to obtain the filter output. Therefore, when the input signal has high sample-rate, the critical-path could exceed the sample period. In such cases, it is necessary to reduce the critical-path by pipelined implementation. Since the conventional LMS algorithm does not support pipelined implementation due to its recursive behavior, it is modified to a form called delayed LMS algorithm [3], which allows pipelined implementation of different sections of the adaptive filter. State-of-the-art hardware implementation of adaptive filters typically involves DSP microprocessors or custom logic design using one or more hardware multiply- accumulate (MAC) units. While an implementation employing DSP microprocessors provides easy programmability, the serial implementation on a single processing unit adversely affects the throughput of these filters. This is especially true for higher order filters. Custom logic design using one or more hardware MAC units may be used to parallelize the implementation and thus improve the throughput but at the cost of increased logic complexity, chip area usage, and power consumption [4]. Various types of DSP operations are employed in practice. Filtering is one of the most widely used signal processing operations [5]. For FIR filters, output y(n) is a linear convolution of weights wn and inputs. Distributed arithmetic (DA) is one way to implement convolution without multiplier, where the MAC operations are replaced by a series of LUT access and summations. Techniques, such as ROM decomposition [6] and offset-binary coding (OBC) [7] can reduce the LUT size, which would otherwise increase exponentially with the filter lengthN+1for conventional DA. However, in many applications such as echo cancelation and system identification, coefficient adaptation is needed. This adaptation makes it challenging to implement DA- based adaptive filters with low cost due to the necessity of updating LUTs. Several approaches have been developed for DA-based adaptive filters, i.e., from the point of view of reducing logic complexity [8]. Recently, a DA-based FIR adaptive filter implementation scheme has been presented in [9], which uses extra “auxiliary” LUTs to help in the updating; however, memory usage is doubled. The rest of this paper is organized as follows. Section 2 describes the Review of LMS Adaptive Algorithms. Section 3 introduces the proposed architecture and Proposed DA based approach for inner product computation, Section 4 describes the simulation results for proposed system. Conclusions are finally drawn in Section 5.
  • 2. An Efficient Adaptive Fir Filter Based… www.ijesi.org 16 | Page II. EXISTING SYSTEM A. Review of LMS Adaptive Algorithms During each cycle, the LMS algorithm computes a filter output and an error value that is equal to the difference between the current filter output and the desired response. The estimated error is then used to update the filter weights in every training cycle. The weights of LMS adaptive filter during the nth iteration are updated according to the following equations: w(n+1) = w(n) + μ·e(n)·x(n) (1) where e(n) = d(n) − y(n) (2) y(n) = w qT(n)·x(n). (3) The input vector x(n)and the weight vector w(n)at the nth training iteration are respectively given by x(n = [x(n),x(n−1),...,x(n−N+1)]T (4) w(n)= [w0(n),w1(n),...,wN−1(n)]T. (5) d(n) is the desired response, and y(n) is the filter output of the nth iteration. e(n)denotes the error computed during the nth iteration, which is used to update the weights, μ is the convergence factor, and N is the filter length.In the case of pipelined designs, the feedback error e(n) becomes available after certain number of cycles, called the “adaptation delay.” The pipelined architectures therefore use the delayed error e(n−m) for updating the current weight instead of the most recent error, where m is the adaptation de-lay. The weight- update equation of such delayed LMS adaptive filter is given by w(n+1)= w(n)+μ·e(n−m)·x(n−m). (6) Fig. 1. Existing System Block Diagram III. PROPOSED SYSTEM A. Proposed DA based Adaptive Filter The proposed structure of DA-based adaptive filter of length N=4 is shown in Fig. 2. It consists of a four-point inner-product block and a weight-increment block along with additional circuits for the computation of error value e(n) and control word t for the barrel shifters. The four-point inner-product block Fig. 3 includes a DA table consisting of an array of 15 registers which stores the partial inner products yl for 0 < l ≤ 15and a 16 : 1 multiplexor (MUX) to select the content of one of those registers. Bit slices of weights A={w3l w2l w1l w0l} for 0 ≤ l ≤ L−1are fed to the MUX as control in LSB-to-MSB order, and the output of the MUX is fed to the carry-save accumulator (shown in Fig. 2). After L bit cycles, the carry-save accumulator shift accumulates all the partial inner products and generates a sum word and a carry word of size (L+2) bit each. The carry and sum words are shifted added with an input carry “1” to generate filter output which is subsequently subtracted from the desired output d(n)to obtain the error e(n). The magnitude of the computed error is decoded to generate the control word t for the barrel shifter. The logic used for the generation of control word t to be used for the barrel shifter. The convergence factor μ is usually taken to be O(1/N). We have taken μ= 1/N. However, one can take μ as 2 −i/N, where i is a small integer. The number of shifts t in that case is increased by i, and the input to the barrel shifters is pre shifted by i locations accordingly to reduce the hardware complexity.
  • 3. An Efficient Adaptive Fir Filter Based… www.ijesi.org 17 | Page Fig. 2. Proposed System Structure The weight-increment unit consists of four barrel shifters and four adder/subtractor cells. The barrel shifter shifts the different input values xk for k= 0,1,...,N−1 by appropriate number of locations (deter-mined by the location of the most significant one in the estimated error). The barrel shifter yields the desired increments to be added with or subtracted from the current weights. The sign bit of the error is used as the control for adder/subtractor cells such that, when sign bit is zero or one, the barrel-shifter output is respectively added with or subtracted from the content of the corresponding current value in the weight register. B. Proposed DA based approach for inner product computation The LMS adaptive filter, in each cycle, needs to perform an inner-product computation which contributes to the most of the critical path. For simplicity of presentation, let the inner product of (3) be given by Where Wk and XK for 0≤ k ≤ N−1 form the N-point vectors corresponding the current weights and most recent N−1 input, respectively. Assuming L to be the bit width of the weight, each component of the weight vector may be expressed in two’s complement representation Where Wkl denotes the lth bit of Wk. Substituting (8), we can write (7) in an expanded form To convert the sum-of-products form of (7) into a distributed form, the order of summations over the indices k and l in (9) can be interchanged to have (10) and the inner product given by (10) can be computed as where
  • 4. An Efficient Adaptive Fir Filter Based… www.ijesi.org 18 | Page Since any element of the N-point bit sequence {wkl for 0≤ k≤N−1}can either be zero or one, the partial sum yl for l=0,1,...,L−1can have possible values. If all the possible values of yl are precomputed and stored in a LUT, the partial sums yl can be read out from the LUT using the bit sequence{wkl}as address bits for computing the inner product.The inner product can be calculated in L cycles of shift accumulation, followed by LUT-read operations corresponding to L number of bit slices{wkl}for 0≤ l ≤ L−1.The bit slices of vector ware fed one after the next in the least significant bit (LSB) to the most significant bit (MSB) order to the carry-save accumulator. However, the negative (two’s complement) of the LUT output needs to be accumulated in case of MSB slices. Therefore, all the bits of LUT output are passed through XOR gates with a sign-control input which is set to one only when the MSB slice appears complement of the LUT output corresponding to the MSB slice but do not affect the output for other bit slices. Finally, the sum and carry words obtained after L clock cycles are required to be added by a final adder (not shown in the figure), and the input carry of the final adder is required to be set to one to account for the two’s complement operation of the LUT output corresponding to the MSB slice. The content of the kth LUT location can be expressed as (12) where kj is the (j+1)th bit of kth LUT location can be expressed as N-bit binary representation of integer k for 0≤k≤ −1. Note that ck for 0≤k≤ −1 can be pre computed and stored in RAM-based LUT of words. Fig. 3. . DA table for generation of possible sums of input samples. However, instead of storing words in LUT, an example of such a DA table for N=4 is shown in Fig. 4. It contains only 15 registers to store the as address. The XOR gates thus produce the one’s pre computed sums of input words. Seven new values of ck are computed by seven adders in parallel. Fig. 4 Internal blocks of inner product computation block.
  • 5. An Efficient Adaptive Fir Filter Based… www.ijesi.org 19 | Page IV. RESULTS AND DISCUSSION The proposed System was executed on Windows XP operating system at an operating frequency of 2.0GHz using Xillinx simu lator. Fig. 5. Proposed method pin diagram without slow clock Fig. 6. Filter output wave form Its seen from the synthesis result that the frequency of the proposed system is increased by 12 MHZ. V. CONCLUSION We have suggested an efficient pipelined architecture for low-power, high-throughput, and low-area implementation of DA-based adaptive filter. Throughput rate is significantly enhanced by parallel LUT update and concurrent processing of filtering operation and weight-update operation. We have also proposed a carry- save accumulation scheme of signed partial inner products for the computation of filter output. From the synthesis results, we find that the proposed design consumes 13% less power and 29% less ADP over our previous DA-based FIR adaptive filter in average for filter lengths N=16 Compared to the best of other existing designs, our proposed architecture provides 9.5 times less power and 4.6 times less ADP. Offset binary coding is popularly used to reduce the LUT size to half for area-efficient implementation of DA which can be applied to our design as well.
  • 6. An Efficient Adaptive Fir Filter Based… www.ijesi.org 20 | Page REFERENCES [1] B. Widrow and S. D. Stearns, Adaptive signal processing. Prentice Hall, Englewood Cliffs, NJ, 1985. [2] S. Haykin and B. Widrow, Least-mean-square adaptive filters. Wiley-Interscience, Hoboken, NJ, 2003. [3] M. Meyer and P. Agrawal, “A modular pipelined implementation of a delayed LMS transversal adaptive filter,” in Proc. IEEE International Symposium on Circuits and Systems, ISCAS ’09, May 1990, pp. 1943–1946. [4] D. Allred, V. Krishnan, W. Huang, and D. Anderson, “Implementation of an LMS adaptive filter on an FPGA employing multiplexed multiplier architecture,” in Proc. Asilomar Conf. Signals Systems Computers, Nov. 2003, pp.918-921. [5] S. K. Mitra, Digital Signal Processing: A Computer-Based Approach, 2nd . New York: McGraw-Hill, 2001. [6] K.K.Parhi, VLSI Digital Signal Processing Systems: Design and Implementation. Hoboken, NJ: Wiley, 1999. [7] S. A. White, “Applications of distributed arithmetic to digital signal processing: A tutorial review,”IEEE ASSP Mag., vol. 6, no. 3, pp. 4–19,Jul. 1989. [8] D. J. Allred, H. Yoo, V. Krishnan, W. Huang, and D. V. Anderson, “An FPGA implementation for a high throughput adaptive filter using distributed arithmetic,” in Proc. 12th Annu. IEEE Symp. Field-Programmable Custom Comput. Mach., 2004, pp. 324–325 [9] D. J. Allred, H. Yoo, V. Krishnan, W. Huang, and D. V. Anderson, “A novel high performance distributed arithmetic adaptive filter implementation on an FPGA,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., 2004, vol. 5, pp. V-161–V-164