A new binary halved clustering method and ert processor for assr system

ONLINE IEEE PROJECTS IeeeXpert.com
BUY THIS PROJECT FOR 2000 RS –IeeeXpert.com
Mobile: +91-9566492473/+91-9042092473 | WhatsApp: 09566492473
Email: contact@ieeexpert.com | Locations: Pondicherry/Chennai
Delivering Projects all over India | 100% Output Satisfaction (or) Get Money back
A New Binary-Halved Clustering Method and ERT
Processor for ASSR System
Abstract:
This paper presents an automatic speech–speaker recognition (ASSR) system implemented in a
chip that includes a built-in extraction, recognition, and training (ERT) core. In VLSI design
(here, the ASSR system), hardware cost and time complexity are always important issues; the
proposed design improves both at two levels: 1) algorithm and 2) architecture. At the algorithm
level, a new binary-halved clustering (BHC) method is proposed to achieve low time complexity
and a low memory requirement. At the architecture level, a new ERT core is proposed and
implemented based on data dependence and a reuse mechanism, which further reduces time and
hardware cost. Finally, the chip implementation is synthesized, placed, and routed using the
TSMC 90-nm technology library. To verify the performance of the proposed BHC method,
a case study is performed based on nine speakers. Moreover, the validation of the ASSR system
is examined in two parts: 1) speech recognition and 2) speaker recognition. The proposed
architecture is analyzed in terms of logic size, area, and power consumption using Xilinx ISE 14.2.
Enhancement of the project:
Existing System:
Bapat et al. developed an ASIC-based speech recognition system based on hidden Markov models
(HMMs). In the literature, HMM-based systems have been applied to large-vocabulary speech
recognition tasks, which use higher-level acoustic models (e.g., sentence or word level). These
higher-level acoustic models are formed by concatenating basic lower-level models, such as
syllables or phonemes (biphones or triphones). In contrast, a system based on dynamic time
warping (DTW) uses the frame, rather than the syllable or phone, as its basic unit. It performs
recognition by matching sequences frame by frame, which reduces the time complexity compared
with an HMM-based system. This advantage makes the DTW method attractive for small and
medium-size vocabularies while maintaining a high recognition rate. Accordingly, Wu and Kuo
proposed an ASIC-based design adopting the traditional DTW-based method, which has better time
efficiency than the HMM-based method. Nonetheless, the traditional DTW-based method faces a
serious problem: speaker variation.
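The frame-by-frame matching described above can be illustrated with a minimal Python sketch of classic DTW. This is the textbook algorithm, not the ASIC design of Wu and Kuo or the method used in this project; the function names (`frame_distance`, `dtw_distance`, `recognize`), the Euclidean frame distance, and the toy one-dimensional feature frames are all assumptions made for illustration.

```python
import math

def frame_distance(a, b):
    """Euclidean distance between two feature frames (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw_distance(test, ref):
    """Accumulated cost of the best time alignment of two frame sequences."""
    n, m = len(test), len(ref)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning test[:i] with ref[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(test[i - 1], ref[j - 1])
            # A frame may repeat in either sequence or both may advance.
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]

def recognize(test, templates):
    """Return the label of the reference template closest to the test input."""
    return min(templates, key=lambda word: dtw_distance(test, templates[word]))

# Hypothetical one-dimensional "features"; real systems use vectors per frame.
templates = {
    "yes": [(0.0,), (1.0,), (2.0,)],
    "no": [(5.0,), (5.0,), (5.0,)],
}
utterance = [(0.0,), (1.0,), (1.0,), (2.0,)]  # "yes" spoken more slowly
print(recognize(utterance, templates))  # prints "yes"
```

The nested loop visits every pair of test and reference frames once, so the work grows with the product of the two sequence lengths. A single stored template per word is also why such a matcher is sensitive to differences between speakers.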
To solve this problem, the enhanced cross-words reference template (ECWRT) method, which was
developed in our previous research, is incorporated in the proposed ASSR system. Speaker
recognition techniques, on the other hand, are fundamentally different from those of speech
recognition. For instance, timing information and ordering are critical to speech recognition,
whereas for speaker recognition the identity features of each speaker's personal acoustic
characteristics matter more than the timing information of a test utterance. Therefore,
clustering the features extracted from a specific speaker's signal during the training phase
(elaborated in the next paragraph) is crucial for voice-print recognition. In addition, the
performance of speaker recognition depends directly on the training phase. From this viewpoint,
Chakrabartty and Cauwenberghs and Peng et al. recently adopted an SVM-based method, including
the sequential minimal optimization (SMO) technique for hyperplane generation, and developed
an ASIC-based speaker recognition system.
Disadvantages:
• High cost
• Low flexibility
Proposed System:
To realize the proposed ASSR system, including BHC, in a chip, a system-level architecture is
shown in Fig. 1. The architecture consists of two major blocks: a processing block and a
control block. The processing block contains five subblocks: an ERT core, three memory blocks
(speech models, speaker models, and feature memory), and a debug unit. The speaker-model and
feature memories are SRAM-based, whereas the speech-model memory is ROM-based. There are two
sets of address buses (suffixed by 1 and 2) and data buses (suffixed by 1 and 2), together with
one 4 × 1 MUX (M1) for data output and one 1:8 de-MUX (M2) for data input. The ERT core
performs the main computational operations, whereas the three memory blocks store the speaker
models, voice features, and speech models, as shown in Fig. 1. The debug unit acts as the
interface between the rest of the blocks and the input/output lines through the control block.
The control block consists of a control unit, a control register, a 2:1 de-MUX, and a
synchronous unit, which includes a combinational block with a delay circuit. The control unit
is connected to M2 through the control register, the interrupt line, and the data fetch line
via M3. M3 selects the chip mode (normal/debug). The control unit handles scheduling and
communication for the six major subblocks of the processing block.
Fig. 1. System level architecture of the proposed ASSR system.
ERT CORE:
The ERT core is the key component of the proposed ASSR system and handles its 10 vital
operations (T1–T10). The functional overview of the ERT core is shown in Fig. 2; it consists of
four blocks: 1) a multifeedback shift register (MFSR) array; 2) two multichannel routers (MCRs);
and 3) a configurable processing element (CPE).

Fig. 2. Block diagram of the proposed ERT core.
Multifeedback Shift Register Array:
The proposed MFSR array, which is designed to reduce global memory accesses, consists of
two MFSRs. The structure of an MFSR is shown in Fig. 3; it includes two eight-word shift
registers (SRs) (SR1 and SR2) and two 3 × 1 MUXs (M1 and M2).
Fig. 3. Block diagram of multifeedback shift register.
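To show how feedback into a shift register can cut down on global memory accesses, here is a small behavioral sketch in Python of one MFSR. Only the two eight-word shift registers and the two 3 × 1 MUXs come from the description above. The three MUX sources chosen here (external input, a register's own output, the other register's output), the class names, and the `clock` interface are assumptions for illustration and may not match the wiring in Fig. 3.

```python
from collections import deque

# Assumed meanings of the three inputs of each 3 x 1 MUX (hypothetical).
EXTERNAL, OWN_FEEDBACK, CROSS_FEEDBACK = 0, 1, 2

class ShiftRegister:
    """Fixed-length word shift register; new words enter on the left."""
    def __init__(self, words=8):
        self.cells = deque([0] * words, maxlen=words)

    def tail(self):
        """Oldest word, i.e. the one that leaves on the next shift."""
        return self.cells[-1]

    def shift(self, value):
        # With maxlen set, appendleft drops the rightmost (oldest) word.
        self.cells.appendleft(value)

class MFSR:
    """Two shift registers, each fed through its own 3-to-1 selector."""
    def __init__(self, words=8):
        self.sr1 = ShiftRegister(words)
        self.sr2 = ShiftRegister(words)

    def clock(self, sel1, sel2, ext1=0, ext2=0):
        """Advance both registers one step; return the words shifted out."""
        out1, out2 = self.sr1.tail(), self.sr2.tail()
        in1 = (ext1, out1, out2)[sel1]  # models MUX M1
        in2 = (ext2, out2, out1)[sel2]  # models MUX M2
        self.sr1.shift(in1)
        self.sr2.shift(in2)
        return out1, out2

mfsr = MFSR()
for word in range(1, 9):              # eight reads from "global memory"
    mfsr.clock(EXTERNAL, EXTERNAL, ext1=word)
# Replay the same eight words without touching memory again.
replayed = [mfsr.clock(OWN_FEEDBACK, EXTERNAL)[0] for _ in range(8)]
print(replayed)  # prints [1, 2, 3, 4, 5, 6, 7, 8]
```

In this model, once a block of words has been loaded, selecting a feedback input lets the same words circulate as often as needed, so repeated passes over the data need no further memory reads.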
Configurable Processing Element:
The CPE consists of three operation units (unit 1, unit 2, and unit 3), which provide the ERT
core with computational resources in certain configurations. These three units can operate
either independently or cooperatively based on the reuse mechanism, which reduces both time
cost and hardware resources.
Advantages:
• Low cost
• High flexibility
Software implementation:
• Modelsim
• Xilinx ISE
