SlideShare une entreprise Scribd logo
1  sur  14
Plangsarn: Thai
Romanization Library
                       Application

  Srichan Chancheewa, THAMMASAT UNIVERSITYDr.
Choochart Haruechaiyasak, NECTEC, THAILAND
Why do we need this application?
before                   after
Background and Motivation
• Romanization is the representation of a written word or
  spoken speech with the Roman (Latin) script.
• Two main methods of romanization are
   (1) Transliteration: for representing written text,
   (2) Transcription: for representing the spoken word,
       (2.1) phonemic transcription: records the
       phonemes or units of semantic meaning in speech
       (2.2) phonetic transcription: records speech sounds
       with precision.
Background and Motivation
• Phonetic transcription attempts to depict all phones in the
  source language.
• We adopt the International Phonetic Alphabet (IPA).
• IPA is an alphabetic system of phonetic notation based
  primarily on the Latin alphabet.
• IPA was devised by the International Phonetic Association as
  a standardized representation of the sounds of oral
  language.
• IPA symbols are composed of one or more elements of two
  basic types, letters and diacritics.
Background and Motivation
• For library cataloging, there is a guideline for preparing
  romanization:
   –ALA-LC Romanization Tables: Transliteration
    Schemes for Non-Roman Scripts, which is approved
    by the Library of Congress and the American Library
    Association.
   –ALA-LC is a set of standards for romanization, or
    the representation of text in other writing systems
    using the Latin alphabet.
Objective
• Design and implement a system for automatically
  romanizing Thai texts with the focus on bibliographic
  data.
• The system is intended to support librarians who
  perform the cataloguing task.
• The system is called Plangsarn (          ).
   –plang (      ) = transform
   –sarn (     ) = text, message
Plangsarn’s System Architecture




• Lexicon contains Thai word list with romanization.
• G2P (Grapheme-to-Phoneme) converts grapheme unit into phoneme unit.
• Tokenizer performs the word segmentation on Thai written texts.
Plangsarn’s process flow
1. Get Thai text input from the user via browser
   interface.
2. Perform word segmentation on the input text.
3. For each word token, look up in the lexicon for
   existing romanized representation.
4. If the romanized representation already exists, then
   use it. Otherwise send the word to G2P module.
5. Concatenate all romanized representations and
   send the final result to the user via the browser
   interface.
Example of Plangsarn process
Input Thai text:



Tokenized text:


Romanized text:     kāndoēnthāng pai mahāwitthayālai
Plangsarn’s Web interface:
http://164.115.23.167/plangsarn/
User Evaluation
• Plangsarn is under the process of user testing and evaluation.
• Two main reported problems so far are
   – Incorrect word segmented unit.
       •         (Kitchen) => hǭng khrūa (should be hǭngkhrūa)

   – Unable to recognize and capitalize named entities such as the names
     of person, organization, place
      •         (Bangkok) => krung thēp (should be Krungthēp)

• To solve the problem, the system allows authorized users to manually
  include words with its correct romanization into the lexicon.
Plangsarn’s Summarized Features
• Automatic romanization of Thai text using the G2P
  (Grapheme-to-Phoneme) technique
• Use the lexicon-based Thai word segmentation to tokenize
  the text
• Allow authorized users to manually add and edit the list of
  romanized lexicon
• Support both IPA and OCLC Connexion
• Automatic capitalization for named entities (e.g., names of
  person, organization, place)
• Provide the machine translation of Thai to English
PLANGSARN IS NOW AVAILABLE WITHIN
 THAI GOVERNMENT CLOUD SERVICE
 HTTP://164.115.23.167/PLANGSARN/
SPECIAL THANKS TO
                 Dr. Choochart Haruechaiyasak
                       Senior Researcher
        Speech and Audio Technology Laboratory (SPT)
National Electronics and Computer Technology Center (NECTEC)
          112 Phahonyothin Rd., Klong 1, Klong Luang,
                       Pathumthani 12120
                           THAILAND
email: choochart.haruechaiyasak@nectec.or.th
               Tel. (+66) 2 564-6900 Ext. 2240
                     Fax. (+66) 2 564-6873

Contenu connexe

Tendances

Scaling agile Principles and Practices
Scaling agile Principles and PracticesScaling agile Principles and Practices
Scaling agile Principles and PracticesJosef Scherer
 
Introduction to LeSS - Large Scale Scrum
Introduction to LeSS - Large Scale ScrumIntroduction to LeSS - Large Scale Scrum
Introduction to LeSS - Large Scale ScrumSrikanth Ramanujam
 
MAA Best Practices for Oracle Database 19c
MAA Best Practices for Oracle Database 19cMAA Best Practices for Oracle Database 19c
MAA Best Practices for Oracle Database 19cMarkus Michalewicz
 
DevOps : Consulting with Foresight
DevOps : Consulting with ForesightDevOps : Consulting with Foresight
DevOps : Consulting with ForesightInfoSeption
 
Deploying OpenShift Container Platform on AWS by Red Hat
Deploying OpenShift Container Platform on AWS by Red HatDeploying OpenShift Container Platform on AWS by Red Hat
Deploying OpenShift Container Platform on AWS by Red HatAmazon Web Services
 
Microservices Design Patterns | Edureka
Microservices Design Patterns | EdurekaMicroservices Design Patterns | Edureka
Microservices Design Patterns | EdurekaEdureka!
 
DevOps, Common use cases, Architectures, Best Practices
DevOps, Common use cases, Architectures, Best PracticesDevOps, Common use cases, Architectures, Best Practices
DevOps, Common use cases, Architectures, Best PracticesShiva Narayanaswamy
 
Simplify DevOps with Microservices and Mobile Backends.pptx
Simplify DevOps with Microservices and Mobile Backends.pptxSimplify DevOps with Microservices and Mobile Backends.pptx
Simplify DevOps with Microservices and Mobile Backends.pptxssuser5faa791
 
Introduction to scaled agile framework
Introduction to scaled agile frameworkIntroduction to scaled agile framework
Introduction to scaled agile frameworkITEM
 
Evolution of netflix conductor
Evolution of netflix conductorEvolution of netflix conductor
Evolution of netflix conductorvedu12
 
Continues Integration and Continuous Delivery with Azure DevOps - Deploy Anyt...
Continues Integration and Continuous Delivery with Azure DevOps - Deploy Anyt...Continues Integration and Continuous Delivery with Azure DevOps - Deploy Anyt...
Continues Integration and Continuous Delivery with Azure DevOps - Deploy Anyt...Janusz Nowak
 
Kubernetes Architecture | Understanding Kubernetes Components | Kubernetes Tu...
Kubernetes Architecture | Understanding Kubernetes Components | Kubernetes Tu...Kubernetes Architecture | Understanding Kubernetes Components | Kubernetes Tu...
Kubernetes Architecture | Understanding Kubernetes Components | Kubernetes Tu...Edureka!
 
Analysis of Database Issues using AHF and Machine Learning v2 - AOUG2022
Analysis of Database Issues using AHF and Machine Learning v2 -  AOUG2022Analysis of Database Issues using AHF and Machine Learning v2 -  AOUG2022
Analysis of Database Issues using AHF and Machine Learning v2 - AOUG2022Sandesh Rao
 
AnsibleFest 2021 - DevSecOps with Ansible, OpenShift Virtualization, Packer a...
AnsibleFest 2021 - DevSecOps with Ansible, OpenShift Virtualization, Packer a...AnsibleFest 2021 - DevSecOps with Ansible, OpenShift Virtualization, Packer a...
AnsibleFest 2021 - DevSecOps with Ansible, OpenShift Virtualization, Packer a...Mihai Criveti
 
The History of DevOps (and what you need to do about it)
The History of DevOps (and what you need to do about it)The History of DevOps (and what you need to do about it)
The History of DevOps (and what you need to do about it)dev2ops
 

Tendances (20)

Scaling agile Principles and Practices
Scaling agile Principles and PracticesScaling agile Principles and Practices
Scaling agile Principles and Practices
 
Introduction to LeSS - Large Scale Scrum
Introduction to LeSS - Large Scale ScrumIntroduction to LeSS - Large Scale Scrum
Introduction to LeSS - Large Scale Scrum
 
Devops ppt
Devops pptDevops ppt
Devops ppt
 
MAA Best Practices for Oracle Database 19c
MAA Best Practices for Oracle Database 19cMAA Best Practices for Oracle Database 19c
MAA Best Practices for Oracle Database 19c
 
DevOps : Consulting with Foresight
DevOps : Consulting with ForesightDevOps : Consulting with Foresight
DevOps : Consulting with Foresight
 
Deploying OpenShift Container Platform on AWS by Red Hat
Deploying OpenShift Container Platform on AWS by Red HatDeploying OpenShift Container Platform on AWS by Red Hat
Deploying OpenShift Container Platform on AWS by Red Hat
 
Scaling Agile
Scaling Agile Scaling Agile
Scaling Agile
 
Scaling Agile - LeSS Framework
Scaling Agile - LeSS FrameworkScaling Agile - LeSS Framework
Scaling Agile - LeSS Framework
 
Microservices Design Patterns | Edureka
Microservices Design Patterns | EdurekaMicroservices Design Patterns | Edureka
Microservices Design Patterns | Edureka
 
Devops
DevopsDevops
Devops
 
DevOps, Common use cases, Architectures, Best Practices
DevOps, Common use cases, Architectures, Best PracticesDevOps, Common use cases, Architectures, Best Practices
DevOps, Common use cases, Architectures, Best Practices
 
Simplify DevOps with Microservices and Mobile Backends.pptx
Simplify DevOps with Microservices and Mobile Backends.pptxSimplify DevOps with Microservices and Mobile Backends.pptx
Simplify DevOps with Microservices and Mobile Backends.pptx
 
Introduction to scaled agile framework
Introduction to scaled agile frameworkIntroduction to scaled agile framework
Introduction to scaled agile framework
 
Evolution of netflix conductor
Evolution of netflix conductorEvolution of netflix conductor
Evolution of netflix conductor
 
Continues Integration and Continuous Delivery with Azure DevOps - Deploy Anyt...
Continues Integration and Continuous Delivery with Azure DevOps - Deploy Anyt...Continues Integration and Continuous Delivery with Azure DevOps - Deploy Anyt...
Continues Integration and Continuous Delivery with Azure DevOps - Deploy Anyt...
 
Kubernetes Architecture | Understanding Kubernetes Components | Kubernetes Tu...
Kubernetes Architecture | Understanding Kubernetes Components | Kubernetes Tu...Kubernetes Architecture | Understanding Kubernetes Components | Kubernetes Tu...
Kubernetes Architecture | Understanding Kubernetes Components | Kubernetes Tu...
 
Analysis of Database Issues using AHF and Machine Learning v2 - AOUG2022
Analysis of Database Issues using AHF and Machine Learning v2 -  AOUG2022Analysis of Database Issues using AHF and Machine Learning v2 -  AOUG2022
Analysis of Database Issues using AHF and Machine Learning v2 - AOUG2022
 
Continuous testing
Continuous testing Continuous testing
Continuous testing
 
AnsibleFest 2021 - DevSecOps with Ansible, OpenShift Virtualization, Packer a...
AnsibleFest 2021 - DevSecOps with Ansible, OpenShift Virtualization, Packer a...AnsibleFest 2021 - DevSecOps with Ansible, OpenShift Virtualization, Packer a...
AnsibleFest 2021 - DevSecOps with Ansible, OpenShift Virtualization, Packer a...
 
The History of DevOps (and what you need to do about it)
The History of DevOps (and what you need to do about it)The History of DevOps (and what you need to do about it)
The History of DevOps (and what you need to do about it)
 

Similaire à Plangsarn: Thai Romanization Library Application - Srichan Chincheewa

Shallow parser for hindi language with an input from a transliterator
Shallow parser for hindi language with an input from a transliteratorShallow parser for hindi language with an input from a transliterator
Shallow parser for hindi language with an input from a transliteratorShashank Shisodia
 
CUHK intern PPT. Machine Translation Evaluation: Methods and Tools
CUHK intern PPT. Machine Translation Evaluation: Methods and Tools CUHK intern PPT. Machine Translation Evaluation: Methods and Tools
CUHK intern PPT. Machine Translation Evaluation: Methods and Tools Lifeng (Aaron) Han
 
Enriching Transliteration Lexicon Using Automatic Transliteration Extraction
Enriching Transliteration Lexicon Using Automatic Transliteration ExtractionEnriching Transliteration Lexicon Using Automatic Transliteration Extraction
Enriching Transliteration Lexicon Using Automatic Transliteration ExtractionSarvnaz Karimi
 
Natural Language Processing (NLP)
Natural Language Processing (NLP)Natural Language Processing (NLP)
Natural Language Processing (NLP)Abdullah al Mamun
 
Acceptance Testing Of A Spoken Language Translation System
Acceptance Testing Of A Spoken Language Translation SystemAcceptance Testing Of A Spoken Language Translation System
Acceptance Testing Of A Spoken Language Translation SystemMichele Thomas
 
Ch 2 Names scopes and bindings.pptx
Ch 2 Names scopes and bindings.pptxCh 2 Names scopes and bindings.pptx
Ch 2 Names scopes and bindings.pptxRanjanaShevkar
 
Multi lingual corpus for machine aided translation
Multi lingual corpus for machine aided translationMulti lingual corpus for machine aided translation
Multi lingual corpus for machine aided translationAashna Phanda
 
Error Analysis of Rule-based Machine Translation Outputs
Error Analysis of Rule-based Machine Translation OutputsError Analysis of Rule-based Machine Translation Outputs
Error Analysis of Rule-based Machine Translation OutputsParisa Niksefat
 
International journal on natural language computing(ijnlc)
International journal on natural language computing(ijnlc)International journal on natural language computing(ijnlc)
International journal on natural language computing(ijnlc)kevig
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language ProcessingVeenaSKumar2
 
Machine transliteration survey
Machine transliteration surveyMachine transliteration survey
Machine transliteration surveyunyil96
 
An expert system for automatic reading of a text written in standard arabic
An expert system for automatic reading of a text written in standard arabicAn expert system for automatic reading of a text written in standard arabic
An expert system for automatic reading of a text written in standard arabicijnlc
 
Implementation of Marathi Language Speech Databases for Large Dictionary
Implementation of Marathi Language Speech Databases for Large DictionaryImplementation of Marathi Language Speech Databases for Large Dictionary
Implementation of Marathi Language Speech Databases for Large Dictionaryiosrjce
 
AMHARIC TEXT TO SPEECH SYNTHESIS FOR SYSTEM DEVELOPMENT
AMHARIC TEXT TO SPEECH SYNTHESIS FOR SYSTEM DEVELOPMENTAMHARIC TEXT TO SPEECH SYNTHESIS FOR SYSTEM DEVELOPMENT
AMHARIC TEXT TO SPEECH SYNTHESIS FOR SYSTEM DEVELOPMENTNathan Mathis
 

Similaire à Plangsarn: Thai Romanization Library Application - Srichan Chincheewa (20)

Php packages
Php packagesPhp packages
Php packages
 
Shallow parser for hindi language with an input from a transliterator
Shallow parser for hindi language with an input from a transliteratorShallow parser for hindi language with an input from a transliterator
Shallow parser for hindi language with an input from a transliterator
 
CUHK intern PPT. Machine Translation Evaluation: Methods and Tools
CUHK intern PPT. Machine Translation Evaluation: Methods and Tools CUHK intern PPT. Machine Translation Evaluation: Methods and Tools
CUHK intern PPT. Machine Translation Evaluation: Methods and Tools
 
Enriching Transliteration Lexicon Using Automatic Transliteration Extraction
Enriching Transliteration Lexicon Using Automatic Transliteration ExtractionEnriching Transliteration Lexicon Using Automatic Transliteration Extraction
Enriching Transliteration Lexicon Using Automatic Transliteration Extraction
 
Natural Language Processing (NLP)
Natural Language Processing (NLP)Natural Language Processing (NLP)
Natural Language Processing (NLP)
 
Speech-Recognition.pptx
Speech-Recognition.pptxSpeech-Recognition.pptx
Speech-Recognition.pptx
 
ReseachPaper
ReseachPaperReseachPaper
ReseachPaper
 
Acceptance Testing Of A Spoken Language Translation System
Acceptance Testing Of A Spoken Language Translation SystemAcceptance Testing Of A Spoken Language Translation System
Acceptance Testing Of A Spoken Language Translation System
 
Ch 2 Names scopes and bindings.pptx
Ch 2 Names scopes and bindings.pptxCh 2 Names scopes and bindings.pptx
Ch 2 Names scopes and bindings.pptx
 
Multi lingual corpus for machine aided translation
Multi lingual corpus for machine aided translationMulti lingual corpus for machine aided translation
Multi lingual corpus for machine aided translation
 
Error Analysis of Rule-based Machine Translation Outputs
Error Analysis of Rule-based Machine Translation OutputsError Analysis of Rule-based Machine Translation Outputs
Error Analysis of Rule-based Machine Translation Outputs
 
**JUNK** (no subject)
**JUNK** (no subject)**JUNK** (no subject)
**JUNK** (no subject)
 
International journal on natural language computing(ijnlc)
International journal on natural language computing(ijnlc)International journal on natural language computing(ijnlc)
International journal on natural language computing(ijnlc)
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Machine transliteration survey
Machine transliteration surveyMachine transliteration survey
Machine transliteration survey
 
An expert system for automatic reading of a text written in standard arabic
An expert system for automatic reading of a text written in standard arabicAn expert system for automatic reading of a text written in standard arabic
An expert system for automatic reading of a text written in standard arabic
 
Applying Rule-Based Maximum Matching Approach for Verb Phrase Identification ...
Applying Rule-Based Maximum Matching Approach for Verb Phrase Identification ...Applying Rule-Based Maximum Matching Approach for Verb Phrase Identification ...
Applying Rule-Based Maximum Matching Approach for Verb Phrase Identification ...
 
Implementation of Marathi Language Speech Databases for Large Dictionary
Implementation of Marathi Language Speech Databases for Large DictionaryImplementation of Marathi Language Speech Databases for Large Dictionary
Implementation of Marathi Language Speech Databases for Large Dictionary
 
Ijetcas14 444
Ijetcas14 444Ijetcas14 444
Ijetcas14 444
 
AMHARIC TEXT TO SPEECH SYNTHESIS FOR SYSTEM DEVELOPMENT
AMHARIC TEXT TO SPEECH SYNTHESIS FOR SYSTEM DEVELOPMENTAMHARIC TEXT TO SPEECH SYNTHESIS FOR SYSTEM DEVELOPMENT
AMHARIC TEXT TO SPEECH SYNTHESIS FOR SYSTEM DEVELOPMENT
 

Plus de tulipbiru64

Towards a Structured Information Security Awareness Programme
Towards a Structured Information Security Awareness ProgrammeTowards a Structured Information Security Awareness Programme
Towards a Structured Information Security Awareness Programmetulipbiru64
 
Towards The Curated web
Towards The Curated webTowards The Curated web
Towards The Curated webtulipbiru64
 
Kajian Kepuasan Pengguna Terhadap Kualiti Perkhidmatan Ruang Pembelajaran di ...
Kajian Kepuasan Pengguna Terhadap Kualiti Perkhidmatan Ruang Pembelajaran di ...Kajian Kepuasan Pengguna Terhadap Kualiti Perkhidmatan Ruang Pembelajaran di ...
Kajian Kepuasan Pengguna Terhadap Kualiti Perkhidmatan Ruang Pembelajaran di ...tulipbiru64
 
Multi-factor Information Security Risk in Information System
Multi-factor Information Security Risk in Information SystemMulti-factor Information Security Risk in Information System
Multi-factor Information Security Risk in Information Systemtulipbiru64
 
Informative Centers' Intelligent Agent Based Model - a preliminary study
Informative Centers' Intelligent Agent Based Model - a preliminary studyInformative Centers' Intelligent Agent Based Model - a preliminary study
Informative Centers' Intelligent Agent Based Model - a preliminary studytulipbiru64
 
Research Data Management: Our Role
Research Data Management: Our RoleResearch Data Management: Our Role
Research Data Management: Our Roletulipbiru64
 
Transforming The Academic Library Services For Generation Y Using Knowledge M...
Transforming The Academic Library Services For Generation Y Using Knowledge M...Transforming The Academic Library Services For Generation Y Using Knowledge M...
Transforming The Academic Library Services For Generation Y Using Knowledge M...tulipbiru64
 
Repositori Institusi Isu Dan Cabaran: Kajian Kes Perpustakaan Universiti Tekn...
Repositori Institusi Isu Dan Cabaran: Kajian Kes Perpustakaan Universiti Tekn...Repositori Institusi Isu Dan Cabaran: Kajian Kes Perpustakaan Universiti Tekn...
Repositori Institusi Isu Dan Cabaran: Kajian Kes Perpustakaan Universiti Tekn...tulipbiru64
 
Social Tagging/Bookmarking Application: The Usage In Academic Libraries
Social Tagging/Bookmarking Application: The Usage In Academic LibrariesSocial Tagging/Bookmarking Application: The Usage In Academic Libraries
Social Tagging/Bookmarking Application: The Usage In Academic Librariestulipbiru64
 
e-Books: Putting Librarians And Researchers 'In The Know'
e-Books: Putting Librarians And Researchers 'In The Know'e-Books: Putting Librarians And Researchers 'In The Know'
e-Books: Putting Librarians And Researchers 'In The Know'tulipbiru64
 
Measurement Of Values And Performance For The Institutions Of Higher Educatio...
Measurement Of Values And Performance For The Institutions Of Higher Educatio...Measurement Of Values And Performance For The Institutions Of Higher Educatio...
Measurement Of Values And Performance For The Institutions Of Higher Educatio...tulipbiru64
 
Keynote Speech by YBhg. Profesor Datuk Dr. Shahrin Sahib, Vice Chancellor UTeM
Keynote Speech by YBhg. Profesor Datuk Dr. Shahrin Sahib, Vice Chancellor UTeMKeynote Speech by YBhg. Profesor Datuk Dr. Shahrin Sahib, Vice Chancellor UTeM
Keynote Speech by YBhg. Profesor Datuk Dr. Shahrin Sahib, Vice Chancellor UTeMtulipbiru64
 
Repository : A Brief Comparative Study Between The National University Of Mal...
Repository : A Brief Comparative Study Between The National University Of Mal...Repository : A Brief Comparative Study Between The National University Of Mal...
Repository : A Brief Comparative Study Between The National University Of Mal...tulipbiru64
 
Mobile OPAC Prototype Based On Koha Open Source Integrated Library System
Mobile OPAC Prototype Based On Koha Open Source Integrated Library SystemMobile OPAC Prototype Based On Koha Open Source Integrated Library System
Mobile OPAC Prototype Based On Koha Open Source Integrated Library Systemtulipbiru64
 
Kajian Tinjauan Tanggapan Pengguna Terhadap Profesion Pustakawan Dalam Kalang...
Kajian Tinjauan Tanggapan Pengguna Terhadap Profesion Pustakawan Dalam Kalang...Kajian Tinjauan Tanggapan Pengguna Terhadap Profesion Pustakawan Dalam Kalang...
Kajian Tinjauan Tanggapan Pengguna Terhadap Profesion Pustakawan Dalam Kalang...tulipbiru64
 
Corporate Social Responsibility (CSR) And Library Collaborative Partnership
Corporate Social Responsibility (CSR) And Library Collaborative PartnershipCorporate Social Responsibility (CSR) And Library Collaborative Partnership
Corporate Social Responsibility (CSR) And Library Collaborative Partnershiptulipbiru64
 
The Effectiveness Of Searching Arabic Resources Through OPAC : A Case Study I...
The Effectiveness Of Searching Arabic Resources Through OPAC : A Case Study I...The Effectiveness Of Searching Arabic Resources Through OPAC : A Case Study I...
The Effectiveness Of Searching Arabic Resources Through OPAC : A Case Study I...tulipbiru64
 
Library Usage Among Medical Students In The Faculty Of Medicine And Health Sc...
Library Usage Among Medical Students In The Faculty Of Medicine And Health Sc...Library Usage Among Medical Students In The Faculty Of Medicine And Health Sc...
Library Usage Among Medical Students In The Faculty Of Medicine And Health Sc...tulipbiru64
 
Kajian Kepuasan Pelanggan Di Perpustakaan UTHM Dalam Meningkatkan Kualiti Per...
Kajian Kepuasan Pelanggan Di Perpustakaan UTHM Dalam Meningkatkan Kualiti Per...Kajian Kepuasan Pelanggan Di Perpustakaan UTHM Dalam Meningkatkan Kualiti Per...
Kajian Kepuasan Pelanggan Di Perpustakaan UTHM Dalam Meningkatkan Kualiti Per...tulipbiru64
 

Plus de tulipbiru64 (20)

Towards a Structured Information Security Awareness Programme
Towards a Structured Information Security Awareness ProgrammeTowards a Structured Information Security Awareness Programme
Towards a Structured Information Security Awareness Programme
 
Towards The Curated web
Towards The Curated webTowards The Curated web
Towards The Curated web
 
Kajian Kepuasan Pengguna Terhadap Kualiti Perkhidmatan Ruang Pembelajaran di ...
Kajian Kepuasan Pengguna Terhadap Kualiti Perkhidmatan Ruang Pembelajaran di ...Kajian Kepuasan Pengguna Terhadap Kualiti Perkhidmatan Ruang Pembelajaran di ...
Kajian Kepuasan Pengguna Terhadap Kualiti Perkhidmatan Ruang Pembelajaran di ...
 
Multi-factor Information Security Risk in Information System
Multi-factor Information Security Risk in Information SystemMulti-factor Information Security Risk in Information System
Multi-factor Information Security Risk in Information System
 
Informative Centers' Intelligent Agent Based Model - a preliminary study
Informative Centers' Intelligent Agent Based Model - a preliminary studyInformative Centers' Intelligent Agent Based Model - a preliminary study
Informative Centers' Intelligent Agent Based Model - a preliminary study
 
Research Data Management: Our Role
Research Data Management: Our RoleResearch Data Management: Our Role
Research Data Management: Our Role
 
Transforming The Academic Library Services For Generation Y Using Knowledge M...
Transforming The Academic Library Services For Generation Y Using Knowledge M...Transforming The Academic Library Services For Generation Y Using Knowledge M...
Transforming The Academic Library Services For Generation Y Using Knowledge M...
 
Repositori Institusi Isu Dan Cabaran: Kajian Kes Perpustakaan Universiti Tekn...
Repositori Institusi Isu Dan Cabaran: Kajian Kes Perpustakaan Universiti Tekn...Repositori Institusi Isu Dan Cabaran: Kajian Kes Perpustakaan Universiti Tekn...
Repositori Institusi Isu Dan Cabaran: Kajian Kes Perpustakaan Universiti Tekn...
 
Social Tagging/Bookmarking Application: The Usage In Academic Libraries
Social Tagging/Bookmarking Application: The Usage In Academic LibrariesSocial Tagging/Bookmarking Application: The Usage In Academic Libraries
Social Tagging/Bookmarking Application: The Usage In Academic Libraries
 
e-Books: Putting Librarians And Researchers 'In The Know'
e-Books: Putting Librarians And Researchers 'In The Know'e-Books: Putting Librarians And Researchers 'In The Know'
e-Books: Putting Librarians And Researchers 'In The Know'
 
Buku Masa Depan
Buku Masa DepanBuku Masa Depan
Buku Masa Depan
 
Measurement Of Values And Performance For The Institutions Of Higher Educatio...
Measurement Of Values And Performance For The Institutions Of Higher Educatio...Measurement Of Values And Performance For The Institutions Of Higher Educatio...
Measurement Of Values And Performance For The Institutions Of Higher Educatio...
 
Keynote Speech by YBhg. Profesor Datuk Dr. Shahrin Sahib, Vice Chancellor UTeM
Keynote Speech by YBhg. Profesor Datuk Dr. Shahrin Sahib, Vice Chancellor UTeMKeynote Speech by YBhg. Profesor Datuk Dr. Shahrin Sahib, Vice Chancellor UTeM
Keynote Speech by YBhg. Profesor Datuk Dr. Shahrin Sahib, Vice Chancellor UTeM
 
Repository : A Brief Comparative Study Between The National University Of Mal...
Repository : A Brief Comparative Study Between The National University Of Mal...Repository : A Brief Comparative Study Between The National University Of Mal...
Repository : A Brief Comparative Study Between The National University Of Mal...
 
Mobile OPAC Prototype Based On Koha Open Source Integrated Library System
Mobile OPAC Prototype Based On Koha Open Source Integrated Library SystemMobile OPAC Prototype Based On Koha Open Source Integrated Library System
Mobile OPAC Prototype Based On Koha Open Source Integrated Library System
 
Kajian Tinjauan Tanggapan Pengguna Terhadap Profesion Pustakawan Dalam Kalang...
Kajian Tinjauan Tanggapan Pengguna Terhadap Profesion Pustakawan Dalam Kalang...Kajian Tinjauan Tanggapan Pengguna Terhadap Profesion Pustakawan Dalam Kalang...
Kajian Tinjauan Tanggapan Pengguna Terhadap Profesion Pustakawan Dalam Kalang...
 
Corporate Social Responsibility (CSR) And Library Collaborative Partnership
Corporate Social Responsibility (CSR) And Library Collaborative PartnershipCorporate Social Responsibility (CSR) And Library Collaborative Partnership
Corporate Social Responsibility (CSR) And Library Collaborative Partnership
 
The Effectiveness Of Searching Arabic Resources Through OPAC : A Case Study I...
The Effectiveness Of Searching Arabic Resources Through OPAC : A Case Study I...The Effectiveness Of Searching Arabic Resources Through OPAC : A Case Study I...
The Effectiveness Of Searching Arabic Resources Through OPAC : A Case Study I...
 
Library Usage Among Medical Students In The Faculty Of Medicine And Health Sc...
Library Usage Among Medical Students In The Faculty Of Medicine And Health Sc...Library Usage Among Medical Students In The Faculty Of Medicine And Health Sc...
Library Usage Among Medical Students In The Faculty Of Medicine And Health Sc...
 
Kajian Kepuasan Pelanggan Di Perpustakaan UTHM Dalam Meningkatkan Kualiti Per...
Kajian Kepuasan Pelanggan Di Perpustakaan UTHM Dalam Meningkatkan Kualiti Per...Kajian Kepuasan Pelanggan Di Perpustakaan UTHM Dalam Meningkatkan Kualiti Per...
Kajian Kepuasan Pelanggan Di Perpustakaan UTHM Dalam Meningkatkan Kualiti Per...
 

Dernier

Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 

Dernier (20)

Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 

Plangsarn: Thai Romanization Library Application - Srichan Chincheewa

  • 1. Plangsarn: Thai Romanization Library Application Srichan Chancheewa, THAMMASAT UNIVERSITYDr. Choochart Haruechaiyasak, NECTEC, THAILAND
  • 2. Why do we need this application? before after
  • 3. Background and Motivation • Romanization is the representation of a written word or spoken speech with the Roman (Latin) script. • Two main methods of romanization are (1) Transliteration: for representing written text, (2) Transcription: for representing the spoken word, (2.1) phonemic transcription: records the phonemes or units of semantic meaning in speech (2.2) phonetic transcription: records speech sounds with precision.
  • 4. Background and Motivation • Phonetic transcription attempts to depict all phones in the source language. • We adopt the International Phonetic Alphabet (IPA). • IPA is an alphabetic system of phonetic notation based primarily on the Latin alphabet. • IPA was devised by the International Phonetic Association as a standardized representation of the sounds of oral language. • IPA symbols are composed of one or more elements of two basic types, letters and diacritics.
  • 5. Background and Motivation • For library cataloging, there is a guideline for preparing romanization: –ALA-LC Romanization Tables: Transliteration Schemes for Non-Roman Scripts, which is approved by the Library of Congress and the American Library Association. –ALA-LC is a set of standards for romanization, or the representation of text in other writing systems using the Latin alphabet.
  • 6. Objective • Design and implement a system for automatically romanizing Thai texts with the focus on bibliographic data. • The system is intended to support librarians who perform the cataloguing task. • The system is called Plangsarn ( ). –plang ( ) = transform –sarn ( ) = text, message
  • 7. Plangsarn’s System Architecture • Lexicon contains Thai word list with romanization. • G2P (Grapheme-to-Phoneme) converts grapheme unit into phoneme unit. • Tokenizer performs the word segmentation on Thai written texts.
  • 8. Plangsarn’s process flow 1. Get Thai text input from the user via browser interface. 2. Perform word segmentation on the input text. 3. For each word token, look up in the lexicon for existing romanized representation. 4. If the romanized representation already exists, then use it. Otherwise send the word to G2P module. 5. Concatenate all romanized representations and send the final result to the user via the browser interface.
  • 9. Example of Plangsarn process Input Thai text: Tokenized text: Romanized text: kāndoēnthāng pai mahāwitthayālai
  • 11. User Evaluation • Plangsarn is under the process of user testing and evaluation. • Two main reported problems so far are – Incorrect word segmented unit. • (Kitchen) => hǭng khrūa (should be hǭngkhrūa) – Unable to recognize and capitalize named entities such as the names of person, organization, place • (Bangkok) => krung thēp (should be Krungthēp) • To solve the problem, the system allows authorized users to manually include words with its correct romanization into the lexicon.
  • 12. Plangsarn’s Summarized Features • Automatic romanization of Thai text using the G2P (Grapheme-to-Phoneme) technique • Use the lexicon-based Thai word segmentation to tokenize the text • Allow authorized users to manually add and edit the list of romanized lexicon • Support both IPA and OCLC Connexion • Automatic capitalization for named entities (e.g., names of person, organization, place) • Provide the machine translation of Thai to English
  • 13. PLANGSARN IS NOW AVAILABLE WITHIN THAI GOVERNMENT CLOUD SERVICE HTTP://164.115.23.167/PLANGSARN/
  • 14. SPECIAL THANKS TO Dr. Choochart Haruechaiyasak Senior Researcher Speech and Audio Technology Laboratory (SPT) National Electronics and Computer Technology Center (NECTEC) 112 Phahonyothin Rd., Klong 1, Klong Luang, Pathumthani 12120 THAILAND email: choochart.haruechaiyasak@nectec.or.th Tel. (+66) 2 564-6900 Ext. 2240 Fax. (+66) 2 564-6873

Notes de l'éditeur

  1. In this paper, we adopt the approach of phonetic transcription.
  2. Plangsarn is designed based on the phonetic transcription approach by using the IPA and ALA-LC guideline.
  3. การเดินทางไปมหาวิทยาลัย = travelling to the university
  4. การเดินทางไปมหาวิทยาลัย = a trip to a university