Automatic Language Identification

bigshum

My presentation about Automatic Language Identification at Barcamp London 9.

1. [Opening slide: scattered text fragments, including "mjav", "meong", "miaou" and "meow"]

2. Automatic Language Identification (LiD)
Alexander Hosford
@awhosford

3. The LiD Problem
• The identification of a given spoken language from a speech signal
• 94% of the global population speaks only 6% of the world's languages

4. Real World Situations
• Automation of LiD is desirable
• It offers many benefits to international service industries
• Examples: hotels, airports, global call centres

5. Language Differences
• Languages contain information that makes one discernible from another:
  – Phonemes
  – Prosody
  – Phonotactics
  – Syntax

6. Human Abilities
• A bias towards the native language arises in infancy
• Prosodic features are some of the first cues to be recognised
• Humans can make a reasonable estimate of the language heard within 3-4 seconds of audition
• Even unfamiliar languages may be plausibly judged

7. Previous Attempts
• Attempts as early as 1974: USAF work, and therefore classified
• Methodological investigations as early as 1977 (House & Neuburg)
• Studies for the most part center on phonotactic constraints and phoneme modeling
• Raw acoustic waveforms have also been explored (Kwasny 1993)

8. A Simpler Approach?
• Phonotactic approaches require expert linguistic knowledge
• Phoneme and phonotactic modeling is time-consuming
• Given the speed of human LiD abilities, discrimination is most likely based on acoustic features

9. System Overview
[Block diagram of the processing chain. Labels: SuperCollider; MFCC Vectors; SuperCollider SystemCmd; Pre-processed Speech; Onset; ARFF File; WEKA Tools; nPVI Generation; Utterance Detection; Pitch Contour]
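The overview appears to use an ARFF file as the hand-off between feature extraction and the WEKA tools. As a rough illustration of that file format only, here is a minimal Python sketch that serialises feature vectors and language labels as ARFF text. The function name `to_arff`, the relation name, the attribute names and the sample values are all hypothetical; the original system produced its file from SuperCollider, and its actual attribute layout is not given in the slides.

```python
from typing import Sequence


def to_arff(relation: str,
            feature_names: Sequence[str],
            rows: Sequence[Sequence[float]],
            labels: Sequence[str]) -> str:
    """Serialise numeric feature vectors plus a nominal class label as ARFF text."""
    if len(rows) != len(labels):
        raise ValueError("each feature row needs exactly one label")
    classes = sorted(set(labels))
    lines = [f"@relation {relation}", ""]
    # One numeric attribute per feature, then the nominal class attribute last.
    lines += [f"@attribute {name} numeric" for name in feature_names]
    lines.append("@attribute language {" + ",".join(classes) + "}")
    lines += ["", "@data"]
    for row, label in zip(rows, labels):
        if len(row) != len(feature_names):
            raise ValueError("row length does not match the number of features")
        lines.append(",".join(f"{value:.6g}" for value in row) + "," + label)
    return "\n".join(lines) + "\n"


# Hypothetical example: two feature vectors with made-up values and labels.
arff_text = to_arff(
    "lid_features",
    ["mfcc1", "mfcc2", "npvi"],
    [[1.5, -0.25, 48.0], [0.75, 0.5, 61.0]],
    ["langA", "langB"],
)
print(arff_text)
```

Placing the class attribute last matches WEKA's default assumption that the final attribute is the class.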

10. Feature Extraction
• Spectral information: MFCC vectors
• Pitch contour information
• Handled by the SCMIR library (Collins, 2010)
• Speech rhythm: Normalised Pairwise Variability Index (nPVI)

Feature Type        Implemented Measure
Spectral Content    MFCC Vector UGen
Pitch Contour       Tartini UGen
Speech Rhythm       nPVI function
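For readers unfamiliar with the rhythm measure, the nPVI is commonly defined as 100 times the mean, over successive pairs of interval durations, of the absolute difference divided by the pair's mean. The sketch below implements that standard definition in Python. It is not the author's SuperCollider function, and the slides do not say exactly which intervals were measured (the system overview suggests speech onsets), so the input durations here are an assumption.

```python
from typing import Sequence


def npvi(durations: Sequence[float]) -> float:
    """Normalised Pairwise Variability Index of successive interval durations."""
    if len(durations) < 2:
        raise ValueError("nPVI needs at least two durations")
    total = 0.0
    for current, following in zip(durations, durations[1:]):
        pair_mean = (current + following) / 2.0
        if pair_mean <= 0:
            raise ValueError("durations must be positive")
        # Difference between neighbours, normalised by their mean duration.
        total += abs(current - following) / pair_mean
    return 100.0 * total / (len(durations) - 1)


# Hypothetical interval durations in seconds.
print(npvi([0.2, 0.2, 0.2]))   # perfectly regular rhythm
print(npvi([1.0, 3.0]))        # strongly alternating durations
```

A perfectly regular sequence scores 0, and greater alternation between long and short intervals pushes the value up.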

11. Classification
• Handled by the WEKA toolkit
• Built-in Multilayer Perceptron
• Called from the command line through a SuperCollider system command
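A command-line call of this kind might look like the one built below. This is a sketch under assumptions: the jar path `weka.jar`, the file name `features.arff` and the choice of 10-fold cross-validation are hypothetical, and the original call was issued from SuperCollider rather than Python. The class name `weka.classifiers.functions.MultilayerPerceptron` and the `-t` (training file) and `-x` (cross-validation folds) options are standard WEKA command-line usage.

```python
import shlex
from typing import List


def weka_mlp_command(weka_jar: str, train_arff: str, folds: int = 10) -> List[str]:
    """Build (but do not run) a WEKA Multilayer Perceptron command line."""
    return [
        "java", "-cp", weka_jar,
        "weka.classifiers.functions.MultilayerPerceptron",
        "-t", train_arff,    # training data in ARFF format
        "-x", str(folds),    # number of cross-validation folds
    ]


# Hypothetical paths; adjust to the local WEKA installation and feature file.
cmd = weka_mlp_command("weka.jar", "features.arff")
print(shlex.join(cmd))
```

The list form can be passed directly to `subprocess.run(cmd)` on a machine with Java and WEKA installed.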

12. Comparisons Made
• 66 language pairs from 12 languages
• A comparison within language families
• A comparison of all 12 languages
• Averaged & segmented data
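The figure of 66 pairs follows from choosing 2 of the 12 languages without regard to order: 12 × 11 / 2 = 66. A short Python check, using placeholder names because the slides do not list the 12 languages:

```python
from itertools import combinations
from math import comb

# Placeholder names; the actual 12 languages are not listed in the slides.
languages = [f"lang{i:02d}" for i in range(1, 13)]

# Every unordered pair of distinct languages.
pairs = list(combinations(languages, 2))
print(len(pairs), comb(12, 2))
```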

13. Results Within Families
[Bar chart. Vertical axis: 20.00 to 100.00. Series: Germanic, Romantic, Slavic, SinoAltaic, Mean. Horizontal-axis labels as extracted: 4 MFCC, 13 MFCC, Tartini, 41 MFCC, Tartini, All Features]
*Data averaged across files

14. Results Within Families
[Bar chart. Vertical axis: 20.00 to 100.00. Series: Germanic, Romantic, Slavic, SinoAltaic, Mean. Horizontal-axis labels as extracted: 4 MFCC, 13 MFCC, Tartini, 41 MFCC, Tartini]
*Data in 5 second segments
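The results contrast features averaged across each file with features taken over 5-second segments. The difference between the two data preparations can be sketched as below. The frame rate, the handling of a trailing partial segment (dropped here) and the function names are assumptions for illustration; the slides do not describe how the original system segmented its data.

```python
from typing import List, Sequence

Frame = Sequence[float]


def mean_vector(frames: Sequence[Frame]) -> List[float]:
    """Average a sequence of per-frame feature vectors into one vector."""
    if not frames:
        raise ValueError("no frames to average")
    return [sum(column) / len(frames) for column in zip(*frames)]


def segment_means(frames: Sequence[Frame],
                  frames_per_second: float,
                  segment_seconds: float = 5.0) -> List[List[float]]:
    """Average features over consecutive fixed-length segments.

    A trailing partial segment is dropped (an assumption of this sketch).
    """
    size = int(round(frames_per_second * segment_seconds))
    if size < 1:
        raise ValueError("segment must contain at least one frame")
    return [
        mean_vector(frames[start:start + size])
        for start in range(0, len(frames) - size + 1, size)
    ]


# Hypothetical data: 20 frames of 2 features at 2 frames per second (10 s).
frames = [[float(i), 2.0 * i] for i in range(20)]
file_level = mean_vector(frames)                          # one vector per file
per_segment = segment_means(frames, frames_per_second=2)  # one vector per 5 s
print(file_level, per_segment)
```

Averaging per file yields one training instance per recording, while segmenting yields several shorter instances from the same recording.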

15. Results From All Languages
[Bar chart. Vertical axis: 10.00 to 100.00. Series: Averaged Mean, 5s Segments Mean. Feature sets: 4 MFCC; 4 MFCC + Tartini; 4 MFCC + Tartini + nPVI; and the same three combinations for 13 MFCC and 41 MFCC]

16. Extensions
• nPVI function to use vowel onsets
• Phonemic segmentation
• A larger dataset
• Better efficiency
• Real-time operation

17. Robustness
• Real-world signals are very different from processed ‘clean’ data
• ‘Ideal’ LiD systems: independent & robust
• A need to analyse only the part of the signal that matters

18. CASA (Computational Auditory Scene Analysis)
• Computational modeling of human ‘Auditory Scene Analysis’ (Bregman 1990)
• Separation of the signal into component parts and reconstitution into meaningful ‘streams’

19. Using Noisy Signals
I’m open to suggestions!

20. alexanderhosford@googlemail.com 07814 692 549 @awhosford
