SlideShare une entreprise Scribd logo
1  sur  21
Télécharger pour lire hors ligne
Building Capacities in Human Language
  Technology for African Languages

          'Tunde ADEGBOLA
      African Languages Technology Initiative (Alt-i)
                    Ibadan, Nigeria.




Supported by:
                Tiwa Systems Ltd.,
                Bait-al-Hikma,
                Open Society Initiatives for West Africa,
                International Research Centre (IDRC).
Aim of this presentation
    Describe efforts on African language
➢

    technology
         Focus on work at African Language
     ➢

         Technology Initiative(5-years:2003 to 2008)
  State challenges and opportunities for
➢

  African language technology
➢ Present proposal for accelerating the

  development of African language
  technology
State of African Language Technology

  Relatively recent; expanding
➢


➢ Efforts in South Africa


      motivated and guided national policy
    ➢

    ➢ private sector and public organisations

    ➢ semi-government institutions


  Efforts in other parts of Africa are based
➢

  on private initiatives
➢ Encouragning International assistance


      ➢ Mainly from Europe
South African Effort
    Based mainly in 7- universities:
➢


        University of Cape Town
    ➢


        University of Limpopo
    ➢


        University of the North West (Potchefstroom)
    ➢


        University of Pretoria
    ➢


        University of South Africa
    ➢


        University of Stellenbosch
    ➢


        University of the Witwaterstrand (Johannessburg)
    ➢


        Semi-Government institute
    ➢



            Meraka Institute
        ➢


            Human Language Technology Unit (Under
        ➢

            department of Art and Culture)
Other efforts in Africa

     West Africa
 ➢


         Only one private organisation: The African Language
     ➢

         Initiative (Alt-i)
         Individual (O.A. Odejobi)
     ➢



     East Africa
 ➢


         The Djibouti Centre for Speech Research
     ➢


         Technobyte Speech Technologies (Kenya)
     ➢


         Individual(Wanjiku Ag'ang'a, Peter Wagacha)
     ➢
Efforts in other parts of the world
    AflaT
➢


    Outside Echo Project (UK):
➢


        Local language speech technology Initiative
    ➢



    West African Language Documentation
➢

    Project(Germany):
        University of Bielefeld and University of Uyo (Nigeria)
    ➢



    Other small activities:
➢


        E.g. In USA, Yoruba-English machine Translation at St
    ➢

        Mary's College of Maryland
Alt-i
     History
 ➢


          Started in 1975 but became more focused in 1985
      ➢


          By Electrical engineers and physicists
      ➢



     Realises the importance of linguist in 2001 and
 ➢

     incorporate linguistic experts
     Based at Ibadan, Nigeria
 ➢


     Efforts primarily focused on Yoruba
 ➢


     Initial connection with the academia was
 ➢

     hampered by bad economy
          This has improved, but interdisciplinary efforts still low
      ➢
Activities

     Includes research and development in the
 ➢

     following areas:
        Automatic speech recognition
      ➢


      ➢ Text to speech synthesis


      ➢ Machine translation


      ➢ Yoruba spelling checker


      ➢ Automatic diacritic application


      ➢ Localisation of Microsoft Vista and Office


      ➢ Assistance to Universities


      ➢ Education
Automatic Speech Recognition(ASR)

    Started in 2001
➢


    Approaches ASR through the use of tone
➢

    information(similar to talking drum)
    Findings
➢


        Tone-guided search of the recognition space produce
    ➢

        improved accuracy and speed
    Results include:
➢


        A PhD Thesis
    ➢


        Yoruba speech recognition resources
    ➢



    Efforts continuing (funded by OSIWA)
➢
Text-to-speech (TTS) Synthesis

    Started in 2002
➢


    Results
➢


        Our associated (OA Odejobi) researched into
    ➢

        prosody modelling for Yoruba TTS
        Used an innovative modular holistic approach which
    ➢

        integrates: Relational tree and fuzzy logic
        Book on the technique and how it can be extended
    ➢

        for other African languages published (available at
        Amazon)
    Funding yet to be obtained for sustaining this
➢

    work
Machine Translation

    Focus on translation of language spoken in
➢

    Nigeria to English
        Igbo-English
    ➢


        Yoruba-English
    ➢



    Efforts of student volunteers from Department
➢

    of Linguistics and African Languages and Africa
    Regional Centre for Information Science
    Funding yet to be obtained for sustaining this
➢

    effort
Yoruba spelling checker
    Work as part of African Network of Localization
➢



    Developing spelling checker for Open Office
➢



        Based on Hunspell software (Nemeth Laszlo)
    ➢



             Hunspel cannot accommodate all Yoruba morphology rules;
         ➢

             separate codes were developed to handle this.
    Computational study of Yoruba morphology
➢


    Involves staff and Students of Department of Linguistics and
➢

    African Languages at the University of Ibadan
    Results
➢



        ~ 5000 Yoruba root words
    ➢


        100 highly productive affix rule
    ➢


        Working (but limited) spelling checker
    ➢



        Funded by International Research Center, Canada
    ➢
Automatic diacritic application

 Aim is to generate automatic text tone maker for
➢

accurate Yoruba orthography
    By product of Yoruba spelling checker project
➢


    Uses the Bayesian learning approach
➢


    Uses corpus produced in the IDRC
➢


    Funding yet to obtained for this project
➢
Localization of Microsoft

 Microsoft appointed Alt-i as moderator
➢

for localising its Vista and Office Suite
Working on Hausa, Igbo and Yoruba
➢



Project progressing
➢
Assistance to Universities

    Teaching of PG students at University of Ibadan
➢


Supervision of postgraduate projects at African
➢

Regional Centre for Information
 Provide facilities for many PhD and research
➢

students
 Provide facilities and support staff and students
➢

from a number of universities in Western Nigeria
Collaborate with a number of organisations (e.g.
➢

WALS, LAN & YSAN)
Education and outreach

    Seminar
➢


        In 8 Nigerian universities
    ➢



    Workshop and conferences
➢


        For scholars in Linguistics, physics, computer
    ➢

        science, etc.
    Cross-disciplinary studies
➢


        Encourages and support knowledge and skill sharing
    ➢
Observations
    Intellectual resources are available in the universities
➢


 Lack of awareness hampers focussed and organised
➢

effort and hence progress
 Sentimental attachments to departmental traditions
➢

prevent positive engagement
 Importance and role of linguistics in language
➢

technology development not given adequate
recognition
    Inappropriate admission criteria and limited curricular
➢
Recommendations

 Intensive and sustained awareness building
➢

programmes on language technology
 Review of admission criteria and curricular to
➢

encourage and sustain students interest
 Employ modern technique for management of
➢

learning resources
Proposal

    Advocacy
➢


        Identify and develop policy thrust- encourage
    ➢

        development of African language
            Accelerate the development of African language technology
        ➢


            Produce lecturer, researchers and other experts
        ➢


            Raising awareness in secondary and tertiary institution
        ➢



    Service
➢


        Develop man power through graduate training
    ➢


        Support from international scholars will be sought
    ➢


        Develop product that will draw attention to language
    ➢

        technology
Conclusion

 Development of African language technology is
➢

in embryonic state
 Apart from South African Efforts, no coherent
➢

efforts in Africa
 National language policies do not address
➢

language technology appropriately
 Low level of the awareness of the benefits of
➢

language technologies
 Interdisciplinary and multidisciplinary efforts are
➢

required
Thank you



    Suggestions
       and
     Question?

Contenu connexe

Similaire à Building Capacities in Human Language Technology for African Languages

Human Language Technologies for Ethiopian Languages: Challenges and Future Di...
Human Language Technologies for Ethiopian Languages: Challenges and Future Di...Human Language Technologies for Ethiopian Languages: Challenges and Future Di...
Human Language Technologies for Ethiopian Languages: Challenges and Future Di...Guy De Pauw
 
Huy & robert a call to call
Huy & robert a call to callHuy & robert a call to call
Huy & robert a call to callPhung Huy
 
Language Learners And Tech
Language Learners And TechLanguage Learners And Tech
Language Learners And Technenapan
 
Using OER to Address Income Inequality in Poorer South African Schools
Using OER to Address Income Inequality in Poorer South African SchoolsUsing OER to Address Income Inequality in Poorer South African Schools
Using OER to Address Income Inequality in Poorer South African SchoolsGrant Penny
 
WOCMES18 OpenMed
WOCMES18 OpenMedWOCMES18 OpenMed
WOCMES18 OpenMedisidromj
 
Understanding Community Needs: Scalable SMS Processing for UNICEF Nigeria and...
Understanding Community Needs: Scalable SMS Processing for UNICEF Nigeria and...Understanding Community Needs: Scalable SMS Processing for UNICEF Nigeria and...
Understanding Community Needs: Scalable SMS Processing for UNICEF Nigeria and...Idibon1
 
FOSSFA Presentation - Namibtec
FOSSFA Presentation - NamibtecFOSSFA Presentation - Namibtec
FOSSFA Presentation - NamibtecGerard Jensen
 
Policies for OER in regional and minority languages: are regional and minorit...
Policies for OER in regional and minority languages: are regional and minorit...Policies for OER in regional and minority languages: are regional and minorit...
Policies for OER in regional and minority languages: are regional and minorit...LangOER
 
IEEE GHTC 2017: Exploring Mobile Device Literacy in Senegal
IEEE GHTC 2017: Exploring Mobile Device Literacy in SenegalIEEE GHTC 2017: Exploring Mobile Device Literacy in Senegal
IEEE GHTC 2017: Exploring Mobile Device Literacy in SenegalVanessa Rene
 
Exploring device literacy in Senegal
Exploring device literacy in SenegalExploring device literacy in Senegal
Exploring device literacy in SenegalChristelle Scharff
 
English for work and the workplace in sub saharan africa
English for work and the workplace in sub saharan africaEnglish for work and the workplace in sub saharan africa
English for work and the workplace in sub saharan africaPaul Woods
 
How can OER enhance the position of less used languages on a global scale?
How can OER enhance the position of less used languages on a global scale?How can OER enhance the position of less used languages on a global scale?
How can OER enhance the position of less used languages on a global scale?LangOER
 
How can OER enhance the position of less used languages on a global scale?
How can OER enhance the position of less used languages on a global scale?How can OER enhance the position of less used languages on a global scale?
How can OER enhance the position of less used languages on a global scale?The Open Education Consortium
 
Workshop on OER and less used languages
Workshop on OER and less used languagesWorkshop on OER and less used languages
Workshop on OER and less used languagesicdeslides
 
SLA Asia A Success Story of its Virtual Events and the Road Ahead.pdf
SLA Asia A Success Story of its Virtual Events and the Road Ahead.pdfSLA Asia A Success Story of its Virtual Events and the Road Ahead.pdf
SLA Asia A Success Story of its Virtual Events and the Road Ahead.pdfNabi Hasan
 
Adoption of Open Educational Resources in Higher Learning Institutions
Adoption of Open Educational Resources in Higher Learning InstitutionsAdoption of Open Educational Resources in Higher Learning Institutions
Adoption of Open Educational Resources in Higher Learning InstitutionsLilian Juma
 
Presenting the University of Alicante realizations for SWING Project
Presenting the University of Alicante realizations for SWING ProjectPresenting the University of Alicante realizations for SWING Project
Presenting the University of Alicante realizations for SWING ProjectJose Maria Fernandez Gil
 
Koha uptake in Nigeria: Prospects & Challenges
Koha uptake in Nigeria: Prospects & ChallengesKoha uptake in Nigeria: Prospects & Challenges
Koha uptake in Nigeria: Prospects & ChallengesOlugbenga Adara
 

Similaire à Building Capacities in Human Language Technology for African Languages (20)

Human Language Technologies for Ethiopian Languages: Challenges and Future Di...
Human Language Technologies for Ethiopian Languages: Challenges and Future Di...Human Language Technologies for Ethiopian Languages: Challenges and Future Di...
Human Language Technologies for Ethiopian Languages: Challenges and Future Di...
 
Huy & robert a call to call
Huy & robert a call to callHuy & robert a call to call
Huy & robert a call to call
 
Language Learners And Tech
Language Learners And TechLanguage Learners And Tech
Language Learners And Tech
 
Using OER to Address Income Inequality in Poorer South African Schools
Using OER to Address Income Inequality in Poorer South African SchoolsUsing OER to Address Income Inequality in Poorer South African Schools
Using OER to Address Income Inequality in Poorer South African Schools
 
WOCMES18 OpenMed
WOCMES18 OpenMedWOCMES18 OpenMed
WOCMES18 OpenMed
 
Understanding Community Needs: Scalable SMS Processing for UNICEF Nigeria and...
Understanding Community Needs: Scalable SMS Processing for UNICEF Nigeria and...Understanding Community Needs: Scalable SMS Processing for UNICEF Nigeria and...
Understanding Community Needs: Scalable SMS Processing for UNICEF Nigeria and...
 
FOSSFA Presentation - Namibtec
FOSSFA Presentation - NamibtecFOSSFA Presentation - Namibtec
FOSSFA Presentation - Namibtec
 
OER Francophone Africa working group and project
OER Francophone Africa working group and projectOER Francophone Africa working group and project
OER Francophone Africa working group and project
 
Policies for OER in regional and minority languages: are regional and minorit...
Policies for OER in regional and minority languages: are regional and minorit...Policies for OER in regional and minority languages: are regional and minorit...
Policies for OER in regional and minority languages: are regional and minorit...
 
IEEE GHTC 2017: Exploring Mobile Device Literacy in Senegal
IEEE GHTC 2017: Exploring Mobile Device Literacy in SenegalIEEE GHTC 2017: Exploring Mobile Device Literacy in Senegal
IEEE GHTC 2017: Exploring Mobile Device Literacy in Senegal
 
Exploring device literacy in Senegal
Exploring device literacy in SenegalExploring device literacy in Senegal
Exploring device literacy in Senegal
 
English for work and the workplace in sub saharan africa
English for work and the workplace in sub saharan africaEnglish for work and the workplace in sub saharan africa
English for work and the workplace in sub saharan africa
 
The DOAJ Ambassadors Project- Results and Plans for the Future
The DOAJ Ambassadors Project- Results and Plans for the FutureThe DOAJ Ambassadors Project- Results and Plans for the Future
The DOAJ Ambassadors Project- Results and Plans for the Future
 
How can OER enhance the position of less used languages on a global scale?
How can OER enhance the position of less used languages on a global scale?How can OER enhance the position of less used languages on a global scale?
How can OER enhance the position of less used languages on a global scale?
 
How can OER enhance the position of less used languages on a global scale?
How can OER enhance the position of less used languages on a global scale?How can OER enhance the position of less used languages on a global scale?
How can OER enhance the position of less used languages on a global scale?
 
Workshop on OER and less used languages
Workshop on OER and less used languagesWorkshop on OER and less used languages
Workshop on OER and less used languages
 
SLA Asia A Success Story of its Virtual Events and the Road Ahead.pdf
SLA Asia A Success Story of its Virtual Events and the Road Ahead.pdfSLA Asia A Success Story of its Virtual Events and the Road Ahead.pdf
SLA Asia A Success Story of its Virtual Events and the Road Ahead.pdf
 
Adoption of Open Educational Resources in Higher Learning Institutions
Adoption of Open Educational Resources in Higher Learning InstitutionsAdoption of Open Educational Resources in Higher Learning Institutions
Adoption of Open Educational Resources in Higher Learning Institutions
 
Presenting the University of Alicante realizations for SWING Project
Presenting the University of Alicante realizations for SWING ProjectPresenting the University of Alicante realizations for SWING Project
Presenting the University of Alicante realizations for SWING Project
 
Koha uptake in Nigeria: Prospects & Challenges
Koha uptake in Nigeria: Prospects & ChallengesKoha uptake in Nigeria: Prospects & Challenges
Koha uptake in Nigeria: Prospects & Challenges
 

Plus de Guy De Pauw

Natural Language Processing for Amazigh Language
Natural Language Processing for Amazigh LanguageNatural Language Processing for Amazigh Language
Natural Language Processing for Amazigh LanguageGuy De Pauw
 
POS Annotated 50m Corpus of Tajik Language
POS Annotated 50m Corpus of Tajik LanguagePOS Annotated 50m Corpus of Tajik Language
POS Annotated 50m Corpus of Tajik LanguageGuy De Pauw
 
The Tagged Icelandic Corpus (MÍM)
The Tagged Icelandic Corpus (MÍM)The Tagged Icelandic Corpus (MÍM)
The Tagged Icelandic Corpus (MÍM)Guy De Pauw
 
Describing Morphologically Rich Languages Using Metagrammars a Look at Verbs ...
Describing Morphologically Rich Languages Using Metagrammars a Look at Verbs ...Describing Morphologically Rich Languages Using Metagrammars a Look at Verbs ...
Describing Morphologically Rich Languages Using Metagrammars a Look at Verbs ...Guy De Pauw
 
Tagging and Verifying an Amharic News Corpus
Tagging and Verifying an Amharic News CorpusTagging and Verifying an Amharic News Corpus
Tagging and Verifying an Amharic News CorpusGuy De Pauw
 
A Corpus of Santome
A Corpus of SantomeA Corpus of Santome
A Corpus of SantomeGuy De Pauw
 
Automatic Structuring and Correction Suggestion System for Hungarian Clinical...
Automatic Structuring and Correction Suggestion System for Hungarian Clinical...Automatic Structuring and Correction Suggestion System for Hungarian Clinical...
Automatic Structuring and Correction Suggestion System for Hungarian Clinical...Guy De Pauw
 
Compiling Apertium Dictionaries with HFST
Compiling Apertium Dictionaries with HFSTCompiling Apertium Dictionaries with HFST
Compiling Apertium Dictionaries with HFSTGuy De Pauw
 
The Database of Modern Icelandic Inflection
The Database of Modern Icelandic InflectionThe Database of Modern Icelandic Inflection
The Database of Modern Icelandic InflectionGuy De Pauw
 
Learning Morphological Rules for Amharic Verbs Using Inductive Logic Programming
Learning Morphological Rules for Amharic Verbs Using Inductive Logic ProgrammingLearning Morphological Rules for Amharic Verbs Using Inductive Logic Programming
Learning Morphological Rules for Amharic Verbs Using Inductive Logic ProgrammingGuy De Pauw
 
Issues in Designing a Corpus of Spoken Irish
Issues in Designing a Corpus of Spoken IrishIssues in Designing a Corpus of Spoken Irish
Issues in Designing a Corpus of Spoken IrishGuy De Pauw
 
How to build language technology resources for the next 100 years
How to build language technology resources for the next 100 yearsHow to build language technology resources for the next 100 years
How to build language technology resources for the next 100 yearsGuy De Pauw
 
Towards Standardizing Evaluation Test Sets for Compound Analysers
Towards Standardizing Evaluation Test Sets for Compound AnalysersTowards Standardizing Evaluation Test Sets for Compound Analysers
Towards Standardizing Evaluation Test Sets for Compound AnalysersGuy De Pauw
 
The PALDO Concept - New Paradigms for African Language Resource Development
The PALDO Concept - New Paradigms for African Language Resource DevelopmentThe PALDO Concept - New Paradigms for African Language Resource Development
The PALDO Concept - New Paradigms for African Language Resource DevelopmentGuy De Pauw
 
A System for the Recognition of Handwritten Yorùbá Characters
A System for the Recognition of Handwritten Yorùbá CharactersA System for the Recognition of Handwritten Yorùbá Characters
A System for the Recognition of Handwritten Yorùbá CharactersGuy De Pauw
 
IFE-MT: An English-to-Yorùbá Machine Translation System
IFE-MT: An English-to-Yorùbá Machine Translation SystemIFE-MT: An English-to-Yorùbá Machine Translation System
IFE-MT: An English-to-Yorùbá Machine Translation SystemGuy De Pauw
 
A Number to Yorùbá Text Transcription System
A Number to Yorùbá Text Transcription SystemA Number to Yorùbá Text Transcription System
A Number to Yorùbá Text Transcription SystemGuy De Pauw
 
Bilingual Data Mining for the English-Amharic Statistical Machine Translation...
Bilingual Data Mining for the English-Amharic Statistical Machine Translation...Bilingual Data Mining for the English-Amharic Statistical Machine Translation...
Bilingual Data Mining for the English-Amharic Statistical Machine Translation...Guy De Pauw
 
Towards Standardizing Evaluation Test Sets for Compound Analysers
Towards Standardizing Evaluation Test Sets for Compound AnalysersTowards Standardizing Evaluation Test Sets for Compound Analysers
Towards Standardizing Evaluation Test Sets for Compound AnalysersGuy De Pauw
 
Amharic document clustering
Amharic document clusteringAmharic document clustering
Amharic document clusteringGuy De Pauw
 

Plus de Guy De Pauw (20)

Natural Language Processing for Amazigh Language
Natural Language Processing for Amazigh LanguageNatural Language Processing for Amazigh Language
Natural Language Processing for Amazigh Language
 
POS Annotated 50m Corpus of Tajik Language
POS Annotated 50m Corpus of Tajik LanguagePOS Annotated 50m Corpus of Tajik Language
POS Annotated 50m Corpus of Tajik Language
 
The Tagged Icelandic Corpus (MÍM)
The Tagged Icelandic Corpus (MÍM)The Tagged Icelandic Corpus (MÍM)
The Tagged Icelandic Corpus (MÍM)
 
Describing Morphologically Rich Languages Using Metagrammars a Look at Verbs ...
Describing Morphologically Rich Languages Using Metagrammars a Look at Verbs ...Describing Morphologically Rich Languages Using Metagrammars a Look at Verbs ...
Describing Morphologically Rich Languages Using Metagrammars a Look at Verbs ...
 
Tagging and Verifying an Amharic News Corpus
Tagging and Verifying an Amharic News CorpusTagging and Verifying an Amharic News Corpus
Tagging and Verifying an Amharic News Corpus
 
A Corpus of Santome
A Corpus of SantomeA Corpus of Santome
A Corpus of Santome
 
Automatic Structuring and Correction Suggestion System for Hungarian Clinical...
Automatic Structuring and Correction Suggestion System for Hungarian Clinical...Automatic Structuring and Correction Suggestion System for Hungarian Clinical...
Automatic Structuring and Correction Suggestion System for Hungarian Clinical...
 
Compiling Apertium Dictionaries with HFST
Compiling Apertium Dictionaries with HFSTCompiling Apertium Dictionaries with HFST
Compiling Apertium Dictionaries with HFST
 
The Database of Modern Icelandic Inflection
The Database of Modern Icelandic InflectionThe Database of Modern Icelandic Inflection
The Database of Modern Icelandic Inflection
 
Learning Morphological Rules for Amharic Verbs Using Inductive Logic Programming
Learning Morphological Rules for Amharic Verbs Using Inductive Logic ProgrammingLearning Morphological Rules for Amharic Verbs Using Inductive Logic Programming
Learning Morphological Rules for Amharic Verbs Using Inductive Logic Programming
 
Issues in Designing a Corpus of Spoken Irish
Issues in Designing a Corpus of Spoken IrishIssues in Designing a Corpus of Spoken Irish
Issues in Designing a Corpus of Spoken Irish
 
How to build language technology resources for the next 100 years
How to build language technology resources for the next 100 yearsHow to build language technology resources for the next 100 years
How to build language technology resources for the next 100 years
 
Towards Standardizing Evaluation Test Sets for Compound Analysers
Towards Standardizing Evaluation Test Sets for Compound AnalysersTowards Standardizing Evaluation Test Sets for Compound Analysers
Towards Standardizing Evaluation Test Sets for Compound Analysers
 
The PALDO Concept - New Paradigms for African Language Resource Development
The PALDO Concept - New Paradigms for African Language Resource DevelopmentThe PALDO Concept - New Paradigms for African Language Resource Development
The PALDO Concept - New Paradigms for African Language Resource Development
 
A System for the Recognition of Handwritten Yorùbá Characters
A System for the Recognition of Handwritten Yorùbá CharactersA System for the Recognition of Handwritten Yorùbá Characters
A System for the Recognition of Handwritten Yorùbá Characters
 
IFE-MT: An English-to-Yorùbá Machine Translation System
IFE-MT: An English-to-Yorùbá Machine Translation SystemIFE-MT: An English-to-Yorùbá Machine Translation System
IFE-MT: An English-to-Yorùbá Machine Translation System
 
A Number to Yorùbá Text Transcription System
A Number to Yorùbá Text Transcription SystemA Number to Yorùbá Text Transcription System
A Number to Yorùbá Text Transcription System
 
Bilingual Data Mining for the English-Amharic Statistical Machine Translation...
Bilingual Data Mining for the English-Amharic Statistical Machine Translation...Bilingual Data Mining for the English-Amharic Statistical Machine Translation...
Bilingual Data Mining for the English-Amharic Statistical Machine Translation...
 
Towards Standardizing Evaluation Test Sets for Compound Analysers
Towards Standardizing Evaluation Test Sets for Compound AnalysersTowards Standardizing Evaluation Test Sets for Compound Analysers
Towards Standardizing Evaluation Test Sets for Compound Analysers
 
Amharic document clustering
Amharic document clusteringAmharic document clustering
Amharic document clustering
 

Dernier

The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...itnewsafrica
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Nikki Chapple
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesManik S Magar
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 

Dernier (20)

The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 

Building Capacities in Human Language Technology for African Languages

  • 1. Building Capacities in Human Language Technology for African Languages 'Tunde ADEGBOLA African Languages Technology Initiative (Alt-i) Ibadan, Nigeria. Supported by: Tiwa Systems Ltd., Bait-al-Hikma, Open Society Initiatives for West Africa, International Research Centre (IDRC).
  • 2. Aim of this presentation Describe efforts on African language ➢ technology Focus on work at African Language ➢ Technology Initiative(5-years:2003 to 2008) State challenges and opportunities for ➢ African language technology ➢ Present proposal for accelerating the development of African language technology
  • 3. State of African Language Technology Relatively recent; expanding ➢ ➢ Efforts in South Africa motivated and guided national policy ➢ ➢ private sector and public organisations ➢ semi-government institutions Efforts in other parts of Africa are based ➢ on private initiatives ➢ Encouragning International assistance ➢ Mainly from Europe
  • 4. South African Effort Based mainly in 7- universities: ➢ University of Cape Town ➢ University of Limpopo ➢ University of the North West (Potchefstroom) ➢ University of Pretoria ➢ University of South Africa ➢ University of Stellenbosch ➢ University of the Witwaterstrand (Johannessburg) ➢ Semi-Government institute ➢ Meraka Institute ➢ Human Language Technology Unit (Under ➢ department of Art and Culture)
  • 5. Other efforts in Africa West Africa ➢ Only one private organisation: The African Language ➢ Initiative (Alt-i) Individual (O.A. Odejobi) ➢ East Africa ➢ The Djibouti Centre for Speech Research ➢ Technobyte Speech Technologies (Kenya) ➢ Individual(Wanjiku Ag'ang'a, Peter Wagacha) ➢
  • 6. Efforts in other parts of the world AflaT ➢ Outside Echo Project (UK): ➢ Local language speech technology Initiative ➢ West African Language Documentation ➢ Project(Germany): University of Bielefeld and University of Uyo (Nigeria) ➢ Other small activities: ➢ E.g. In USA, Yoruba-English machine Translation at St ➢ Mary's College of Maryland
  • 7. Alt-i History ➢ Started in 1975 but became more focused in 1985 ➢ By Electrical engineers and physicists ➢ Realises the importance of linguist in 2001 and ➢ incorporate linguistic experts Based at Ibadan, Nigeria ➢ Efforts primarily focused on Yoruba ➢ Initial connection with the academia was ➢ hampered by bad economy This has improved, but interdisciplinary efforts still low ➢
  • 8. Activities Includes research and development in the ➢ following areas: Automatic speech recognition ➢ ➢ Text to speech synthesis ➢ Machine translation ➢ Yoruba spelling checker ➢ Automatic diacritic application ➢ Localisation of Microsoft Vista and Office ➢ Assistance to Universities ➢ Education
  • 9. Automatic Speech Recognition(ASR) Started in 2001 ➢ Approaches ASR through the use of tone ➢ information(similar to talking drum) Findings ➢ Tone-guided search of the recognition space produce ➢ improved accuracy and speed Results include: ➢ A PhD Thesis ➢ Yoruba speech recognition resources ➢ Efforts continuing (funded by OSIWA) ➢
  • 10. Text-to-speech (TTS) Synthesis Started in 2002 ➢ Results ➢ Our associated (OA Odejobi) researched into ➢ prosody modelling for Yoruba TTS Used an innovative modular holistic approach which ➢ integrates: Relational tree and fuzzy logic Book on the technique and how it can be extended ➢ for other African languages published (available at Amazon) Funding yet to be obtained for sustaining this ➢ work
  • 11. Machine Translation Focus on translation of language spoken in ➢ Nigeria to English Igbo-English ➢ Yoruba-English ➢ Efforts of student volunteers from Department ➢ of Linguistics and African Languages and Africa Regional Centre for Information Science Funding yet to be obtained for sustaining this ➢ effort
  • 12. Yoruba spelling checker Work as part of African Network of Localization ➢ Developing spelling checker for Open Office ➢ Based on Hunspell software (Nemeth Laszlo) ➢ Hunspel cannot accommodate all Yoruba morphology rules; ➢ separate codes were developed to handle this. Computational study of Yoruba morphology ➢ Involves staff and Students of Department of Linguistics and ➢ African Languages at the University of Ibadan Results ➢ ~ 5000 Yoruba root words ➢ 100 highly productive affix rule ➢ Working (but limited) spelling checker ➢ Funded by International Research Center, Canada ➢
  • 13. Automatic diacritic application Aim is to generate automatic text tone maker for ➢ accurate Yoruba orthography By product of Yoruba spelling checker project ➢ Uses the Bayesian learning approach ➢ Uses corpus produced in the IDRC ➢ Funding yet to obtained for this project ➢
  • 14. Localization of Microsoft Microsoft appointed Alt-i as moderator ➢ for localising its Vista and Office Suite Working on Hausa, Igbo and Yoruba ➢ Project progressing ➢
  • 15. Assistance to Universities Teaching of PG students at University of Ibadan ➢ Supervision of postgraduate projects at African ➢ Regional Centre for Information Provide facilities for many PhD and research ➢ students Provide facilities and support staff and students ➢ from a number of universities in Western Nigeria Collaborate with a number of organisations (e.g. ➢ WALS, LAN & YSAN)
  • 16. Education and outreach Seminar ➢ In 8 Nigerian universities ➢ Workshop and conferences ➢ For scholars in Linguistics, physics, computer ➢ science, etc. Cross-disciplinary studies ➢ Encourages and support knowledge and skill sharing ➢
  • 17. Observations Intellectual resources are available in the universities ➢ Lack of awareness hampers focussed and organised ➢ effort and hence progress Sentimental attachments to departmental traditions ➢ prevent positive engagement Importance and role of linguistics in language ➢ technology development not given adequate recognition Inappropriate admission criteria and limited curricular ➢
  • 18. Recommendations Intensive and sustained awareness building ➢ programmes on language technology Review of admission criteria and curricular to ➢ encourage and sustain students interest Employ modern technique for management of ➢ learning resources
  • 19. Proposal Advocacy ➢ Identify and develop policy thrust- encourage ➢ development of African language Accelerate the development of African language technology ➢ Produce lecturer, researchers and other experts ➢ Raising awareness in secondary and tertiary institution ➢ Service ➢ Develop man power through graduate training ➢ Support from international scholars will be sought ➢ Develop product that will draw attention to language ➢ technology
  • 20. Conclusion Development of African language technology is ➢ in embryonic state Apart from South African Efforts, no coherent ➢ efforts in Africa National language policies do not address ➢ language technology appropriately Low level of the awareness of the benefits of ➢ language technologies Interdisciplinary and multidisciplinary efforts are ➢ required
  • 21. Thank you Suggestions and Question?