Phonetic search: A powerful new regulatory compliance tool for financial institutions
Financial services companies around the world are
getting into the recording business in a big way. No,
they’re not going into the studio with the latest pop
music sensations. Instead, they’re recording just about
everything being said in and around their institution.
Client trades made over the phone. Voice mails left
for financial advisers. Conference calls. Audio from
videoconferences. Employee calls from mobile
devices, whether the company’s or their own.
A variety of federal, state and international regulations are driving institutions
to blanket their operations with voice recordings. Regulatory requirements
specify how long recordings need to be retained by financial firms and how
quickly records must be delivered to government agencies or other requestors.
Furthermore, compliance can involve granular searches. For example,
complying with e-discovery requests may include isolating a specific
recording of a certain employee or client conversation.
For some time, institutions have stored and analyzed the content of audio
communications using speech-to-text technology that translates audio into a
minable data format. But the recent, rapid growth in regulatory requirements
has thrown the limitations of speech-to-text into sharp relief. For instance,
a bank that previously only had to record 1,000 turrets of financial trader
information could now find itself having to record and retrieve audio from
half-a-million telecommunications end points throughout the enterprise.
Speech-to-text technology simply isn’t up to this kind of challenge, for several
reasons. As a result, institutions are increasingly turning to phonetic search to
find and analyze conversations and content. Phonetic search is a process built
on phonemes — basic language elements that provide the building blocks for
how human speech sounds.
avaya.com | 1
Phonetic search can help financial services companies meet increasing
regulatory requirements by providing the capability to capture calls in real
time regardless of the source. Institutions can then use advanced analytical
tools to mine the phonetic records from those calls to identify specific topics,
specific people and specific calls.
Growing mandates and scrutiny
Two very recent developments highlight the increasing legal and regulatory
imperative that financial institutions be able to record and retrieve specific
calls and other audio exchanges.
In April 2012, the U.S. Commodity Futures Trading Commission (CFTC)
finalized regulations recommended in the Dodd-Frank Wall Street Reform
and Consumer Protection Act regarding reporting, record keeping and
daily trading records for swap dealers and major swap participants.
The regulations state that effective July 3, 2012:
Each swap dealer and major swap participant shall make and keep
pre-execution trade information, including, at a minimum, records of
all oral and written communications provided or received concerning
quotes, solicitations, bids, offers, instructions, trading, and prices,
that lead to the execution of a swap, whether communicated by
telephone, voicemail, facsimile, instant messaging, chat rooms,
electronic mail, mobile device, or other digital or electronic media.1
1
Commodity Futures Trading Commission, 17 CFR Parts 1, 3 and 23 RIN 3038–AC96, § 23.202,
Daily trading records.
In March 2012, the U.S. District Court in New York granted a Federal Trade
Commission (FTC) motion for summary judgment against businessman
Paul Navestad for violating the FTC Act. Navestad was found to have made
material, false and deceptive claims to deceive consumers and to have
violated the Telemarketing Sales Rule (TSR). In addition to calling consumers
on the national Do Not Call Registry, Navestad’s violations included:
• Not providing an opt-out mechanism for consumers not wishing to
receive calls;
• Not providing consumers with the ability to speak to a live operator; and,
• Making false and deceptive statements intending to induce consumers
to pay for services that would allegedly enable them to easily and quickly
receive public or private grants.2
While the TSR does not apply directly to financial institutions, individuals
or companies, it does apply to them indirectly when they contract with an
institution that must comply with the TSR.
However, two other rules do apply directly to financial services companies: the
Telephone Consumer Protection Act (TCPA) and the Gramm-Leach-Bliley Act.
The Federal Communications Commission recently amended its TCPA rules so
that telemarketing done by banks and insurance companies falls under Do Not
Call Registry rules and regulations. The Gramm-Leach-Bliley Act, for its part,
includes provisions that protect personal consumer financial information held
by financial institutions.
Gramm-Leach-Bliley’s privacy requirement has three principal parts: the
Financial Privacy Rule, the Safeguards Rule and pretexting provisions. Civil
penalties are steep: up to $10,000 per violation for officers and directors
found to be personally liable, and up to $100,000 per violation for financial
institutions held liable.
These are just a few of the regulatory requirements that either directly or
indirectly impact U.S. financial institutions. Multinational firms need to add
foreign regulations to their list of concerns, such as those imposed by the
U.K.’s Financial Services Authority (FSA) requiring all participants in the
country’s capital markets to begin recording mobile communications,
including voice, short message service (SMS) and instant messaging (IM),
of all their employees involved in trading by November 2011.3
2
http://scholar.google.com/scholar_case?case=17476208810662567879.
3
Policy Statement 10/17, “Taping of mobile phones,” Financial Services Authority, http://www.fsa.gov.uk/
pubs/policy/ps10_17.pdf.
In response, should financial institutions consider taking the approach of
recording all employee voice communications? Such a decision would have
monumental technical implications, especially if those firms are using
speech-to-text technologies for call retrieval and data-mining purposes.
The limitations of speech-to-text
Speech analytics solutions, in general, convert recorded audio to text and
then perform a text search. This approach has several limitations associated
with efficiency, cost, propensity for errors and lack of flexibility.
First, speech-to-text conversion is inefficient, consuming considerable CPU
and memory resources. The process effectively duplicates the content,
which, once converted, must still be searched in its entirety. This both drains
resources and creates a content management challenge.
Also, the process of converting the spoken word to a text file requires that
a series of dictionaries be loaded into the conversion system. In addition,
the hardware- and software-intensive nature of speech-to-text solutions
makes their widespread deployment, in perhaps hundreds or thousands of
financial institution branches for example, an expensive proposition.
Speech-to-text is also error-prone. The further content is removed from its
original source, the more likely it is that errors have been introduced during
the conversion process.
Finally, speech-to-text offers limited flexibility. Words and phrases to be
searched in the converted text must be predefined in a dictionary of terms
before the text search engine can find them. For ad hoc searches, this can
become unwieldy and a challenge to manage.
The phenomenal power of phoneme analysis
Until fairly recently, the science of phonetics was confined to university
research laboratories. However, the breadth of potential applications in
the commercial world is accelerating its development and use. For any
user wishing to access information from an audio stream in real time, the
phonetic search approach is the only practical option. Because a search
requires little processing power, the approach scales easily to whatever
level is needed to cover an entire organization.
The benefits of this approach are wide-ranging, perhaps the most valuable
being its ability to reduce decision-making latency based on accurate and
up-to-date information. Real-time phonetic search enables insights discovered
in speech to be populated into business intelligence (BI) platforms, allowing
financial institutions to consume aggregated data, measure the scale of a
problem and compare its criticality to other issues — all within a very short
time from occurrence to discovery. This low latency then enables companies
to deploy proactive notification systems to make the technology work in an
observer-less way, saving time and resources.
Another benefit of phoneme-based searches is that they do not require a large
predefined vocabulary, because the set of phonemes in a language is small. For
example, there are 40 phonemes in U.S. English and 44 in U.K. English. The
bottom line is that phoneme-based searching is faster and more efficient than
speech-to-text conversion and search.
The first step in phonetic search is to build a language- and dialect-dependent
index of the audio content represented as phoneme strings (Figure 1). Future
searches then leverage this index to yield hits or results. Words and phrases
a user searches on are converted into phoneme strings, and searches or
matches are then obtained by walking through the index. Each hit enriches
the context for future searches.
Results are presented in such a way that a user can see which portion or
region of the selected audio content contained phrases or utterances
deemed to be similar.
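The index-then-search flow described above can be sketched in a few lines of Python. This is a minimal illustration under loud assumptions, not how any particular product works: the LEXICON table, the function names and the edit-distance scoring are all hypothetical, and text transcripts stand in for audio, whereas a real system would derive phoneme strings from the audio itself with an acoustic model.

```python
from dataclasses import dataclass

# Hypothetical pronunciation lexicon (word -> phonemes). Assumption for
# illustration only; a real system derives phonemes from audio.
LEXICON = {
    "hello": ["h", "e", "l", "@U"],
    "please": ["p", "l", "i:", "z"],
    "sell": ["s", "e", "l"],
    "the": ["D", "@"],
    "swap": ["s", "w", "Q", "p"],
    "now": ["n", "aU"],
}


def to_phonemes(text):
    """Convert text to a flat phoneme list using the toy lexicon."""
    phonemes = []
    for word in text.lower().split():
        phonemes.extend(LEXICON[word])
    return phonemes


def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        cur = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost))
        prev = cur
    return prev[-1]


@dataclass
class Hit:
    call_id: str
    start: int    # phoneme offset where the matching region begins
    score: float  # 1.0 means an exact phoneme match


def build_index(calls):
    """Index step: represent each call as a phoneme string."""
    return {call_id: to_phonemes(text) for call_id, text in calls.items()}


def search(index, query, threshold=0.7):
    """Walk the index with a sliding window and score each region."""
    q = to_phonemes(query)
    hits = []
    for call_id, phonemes in index.items():
        for start in range(len(phonemes) - len(q) + 1):
            window = phonemes[start:start + len(q)]
            score = 1.0 - edit_distance(q, window) / len(q)
            if score >= threshold:
                hits.append(Hit(call_id, start, score))
    return sorted(hits, key=lambda h: h.score, reverse=True)


calls = {
    "call-001": "hello please sell the swap now",
    "call-002": "hello the swap",
}
index = build_index(calls)
results = search(index, "sell the swap")
for hit in results:
    print(hit.call_id, hit.start, round(hit.score, 2))
```

Each hit carries the call and the phoneme offset of the matching region, which is what lets a user jump to the relevant portion of the audio. Note that the second call is returned as an approximate hit even though the exact phrase was never spoken in it, and that overlapping windows around an exact match are also reported; a fuller version would merge those.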
Figure 1. Building a phonetic search
1. Audio data is transformed into phoneme strings (for example, “Hello” becomes h e l @ U . . .).
2. Words and phrases are converted to phoneme strings and searched.
3. Relevant search results are displayed.
Taking phonetic search to the next level
Two recent technology advancements are expanding the capabilities of
phonetic search. One is the development of high-performance desktop
clients for searching and indexing searches in real time. Searches can be
issued through such a client, and, as part of the process, relevance thresholds
can be set so that results scoring below them are ignored. There is no single
correct setting for the relevance threshold. Varying it trades off false
positives against false negatives.
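The trade-off can be made concrete with a small Python sketch. The scores and reviewer judgments below are invented for illustration, and the function name is hypothetical; the point is only that raising the threshold removes false positives at the price of more false negatives.

```python
# Hypothetical search hits: (relevance score, did a reviewer judge it relevant?)
scored_hits = [
    (0.95, True), (0.90, True), (0.82, False), (0.78, True),
    (0.65, False), (0.60, True), (0.40, False), (0.30, False),
]


def error_counts(hits, threshold):
    """Count false positives and false negatives at a given threshold."""
    # False positive: shown to the user (score >= threshold) but not relevant.
    false_pos = sum(1 for score, relevant in hits
                    if score >= threshold and not relevant)
    # False negative: hidden from the user (score < threshold) but relevant.
    false_neg = sum(1 for score, relevant in hits
                    if score < threshold and relevant)
    return false_pos, false_neg


for threshold in (0.5, 0.7, 0.9):
    fp, fn = error_counts(scored_hits, threshold)
    print(f"threshold={threshold:.1f} false_positives={fp} false_negatives={fn}")
```

On this sample data a low threshold surfaces every relevant call along with some noise, while a high threshold returns only clean results but misses some relevant calls. Which end to favor depends on the task: a compliance search that must not miss a recording would typically accept more false positives.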
The second noteworthy development in phonetic search is the emergence
of cloud-based BI solutions. Scalable, secure cloud-based BI platforms can
provide advanced analytics and reporting with low organizational risk, impact
and cost. For example, phone calls can be tagged by criteria, such as the
work shifts during which they occurred or the top reasons clients are calling
the institution.
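As a rough illustration of this kind of tagging, the Python sketch below labels each call with a work shift and then counts calls per shift and the most common call reasons. The shift boundaries, field names and sample records are all assumptions made for the example, not part of any specific BI platform.

```python
from collections import Counter
from datetime import datetime

# Hypothetical shift boundaries: (name, start hour inclusive, end hour exclusive).
SHIFTS = [("morning", 6, 14), ("afternoon", 14, 22)]


def shift_for(timestamp):
    """Return the work shift in which a call started."""
    for name, start, end in SHIFTS:
        if start <= timestamp.hour < end:
            return name
    return "night"  # anything outside the listed shifts


def tag_calls(calls):
    """Return copies of the call records with a 'shift' tag added."""
    return [dict(call, shift=shift_for(call["started"])) for call in calls]


# Invented sample records; 'reason' would come from search results in practice.
calls = [
    {"id": "call-001", "started": datetime(2012, 7, 3, 9, 15),
     "reason": "trade confirmation"},
    {"id": "call-002", "started": datetime(2012, 7, 3, 15, 40),
     "reason": "account query"},
    {"id": "call-003", "started": datetime(2012, 7, 3, 23, 5),
     "reason": "trade confirmation"},
    {"id": "call-004", "started": datetime(2012, 7, 4, 10, 30),
     "reason": "trade confirmation"},
]

tagged = tag_calls(calls)
calls_per_shift = Counter(call["shift"] for call in tagged)
top_reasons = Counter(call["reason"] for call in tagged).most_common(2)

print(dict(calls_per_shift))
print(top_reasons)
```

Once calls carry tags like these, aggregating them for reports is a matter of simple counting, which is the kind of work a BI platform automates at scale.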
Cloud solutions also offer automated upload of search and discovery results
to an analytics and reporting engine. They can also include out-of-the-box
coverage for industry-standard key performance indicators such as first-call-
resolution and average-hold-time analysis.