WikiBhasha by Dr A Kumaran
1. WikiBhasha: From Digital Inclusion to Digital Democracy… Dr A Kumaran, Microsoft Research, Feb 2011
2. “Egypt will not become liberal democracy overnight; it has a long and difficult journey…Becoming an open and empowered democracy needs a slow process of popular self-education, and it will not come easily or naturally.” Hindustan Times Editorial Feb 14, 2011
3. Agenda: Language Technology Research (15 Min); “Digital Inclusion vs. Digital Democracy” (5 Min); WikiBhasha (5 Min)
26. Ex #1: Which Language? (In general, Language Identification.) Is a document in English or Finnish or Tamil? Feature: “Length of words”. Near-perfect identification!
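The word-length idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two sample sentences, the language codes, and the function names are invented for the example, and a nearest-mean rule on average word length is just one simple way to use the feature, not necessarily the method behind the slide.

```python
# Minimal sketch: language identification from mean word length.
# SAMPLES, the language codes and all function names are illustrative
# assumptions; they do not come from the slides.

SAMPLES = {
    "en": "the president visits the city and opens a new conference on language",
    "fi": "presidentti vierailee kaupungissa ja avaa uuden kielikonferenssin",
}


def mean_word_length(text: str) -> float:
    """Average length (in code points) of whitespace-separated words."""
    words = text.split()
    if not words:
        raise ValueError("text contains no words")
    return sum(len(w) for w in words) / len(words)


def build_profiles(samples: dict) -> dict:
    """One number per language: the mean word length of its sample text."""
    return {lang: mean_word_length(text) for lang, text in samples.items()}


def identify(text: str, profiles: dict) -> str:
    """Return the language whose profile is closest to the text's mean word length."""
    m = mean_word_length(text)
    return min(profiles, key=lambda lang: abs(profiles[lang] - m))


if __name__ == "__main__":
    profiles = build_profiles(SAMPLES)
    print(identify("a small dog runs in the park", profiles))
    print(identify("yliopiston opiskelijat kirjoittavat artikkeleita", profiles))
```

With realistic amounts of text per language, the profiles would be estimated from much larger samples than the two toy sentences used here.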
29. Ex #5: Statistical MT. Statistical Models are built from Parallel corpora. Sentence pair: “President visits Chennai” / “ஜனாதிபதி சென்னை செல்கிறார்”. Word correspondences: President – ஜனாதிபதி, visits – செல்கிறார், Chennai – சென்னை. Second sentence pair: “President inaugurates Tamil Conference” / “ஜனாதிபதி தமிழ் மாநாட்டை துவக்குகிறார்”.
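As a rough illustration of how word correspondences can be estimated from parallel sentences, here is a minimal sketch of IBM Model 1 trained with expectation-maximization. The slide only says "Statistical Models", so the choice of IBM Model 1 is an assumption, as are the function names and the whitespace tokenization; the two sentence pairs are the ones on the slide.

```python
# Minimal sketch of IBM Model 1 (no NULL word), trained with EM.
# The choice of model and all names here are assumptions for illustration.
from collections import defaultdict

PAIRS = [
    ("President visits Chennai", "ஜனாதிபதி சென்னை செல்கிறார்"),
    ("President inaugurates Tamil Conference", "ஜனாதிபதி தமிழ் மாநாட்டை துவக்குகிறார்"),
]


def train_ibm_model1(pairs, iterations=10):
    """Estimate t[(target_word, source_word)] = P(target_word | source_word)."""
    corpus = [(src.split(), tgt.split()) for src, tgt in pairs]
    target_vocab = {w for _, tgt in corpus for w in tgt}
    uniform = 1.0 / len(target_vocab)
    t = defaultdict(lambda: uniform)  # uniform start for every word pair
    for _ in range(iterations):
        count = defaultdict(float)  # expected co-translation counts
        total = defaultdict(float)  # expected counts per source word
        for src, tgt in corpus:
            for f in tgt:
                norm = sum(t[(f, e)] for e in src)
                for e in src:
                    frac = t[(f, e)] / norm
                    count[(f, e)] += frac
                    total[e] += frac
        # M-step: renormalize the expected counts per source word.
        t = defaultdict(float, {(f, e): c / total[e] for (f, e), c in count.items()})
    return t


def best_translation(t, source_word):
    """Most probable target word for a source word under the trained table."""
    candidates = {f: p for (f, e), p in t.items() if e == source_word}
    return max(candidates, key=candidates.get)


if __name__ == "__main__":
    table = train_ibm_model1(PAIRS)
    print("President ->", best_translation(table, "President"))
```

With only these two pairs, the model can pin down President – ஜனாதிபதி, because that is the only word pair shared by both sentences. The words "visits" and "Chennai" stay ambiguous between சென்னை and செல்கிறார், so resolving them needs more parallel data.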
30. For most Language Technologies… Statistical approaches EXIST, and are proven to be very successful! Data are Critical! Theorem: Data drives Research & Technology!
33. WikiBhasha: Research Project on Crowd-sourcing to explore collaborative data creation for Computational Linguistic research (first focus: parallel data)
34. Content Creation by Infusion… Rough content using Machine Translation; appropriate community correction to create value… Diagram labels: Article to Target Wikipedia; WikiBABEL on Wikipedia; Collaborative Translation Cache; Machine Translation System; Linguistic Resources
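The draft-then-correct workflow on this slide can be pictured with a small data model. This is a hypothetical sketch: the class and function names are invented, and the `translate` callable is a stand-in for whatever machine translation system is plugged in. It only shows how community corrections could yield parallel sentence pairs as a by-product of content creation.

```python
# Hypothetical sketch of "content creation by infusion":
# machine-translated drafts, community corrections, parallel data as output.
# None of these names come from WikiBhasha itself.
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Segment:
    source: str                      # sentence from the source article
    draft: str                       # rough machine translation
    corrected: Optional[str] = None  # community correction, if any


@dataclass
class Article:
    segments: List[Segment] = field(default_factory=list)


def draft_article(sentences: List[str], translate: Callable[[str], str]) -> Article:
    """Create a rough target-language draft, one segment per source sentence."""
    return Article([Segment(s, translate(s)) for s in sentences])


def correct(article: Article, index: int, text: str) -> None:
    """Record a contributor's corrected version of one segment."""
    article.segments[index].corrected = text


def parallel_pairs(article: Article) -> List[Tuple[str, str]]:
    """Source/target pairs for every segment a contributor has corrected."""
    return [(s.source, s.corrected) for s in article.segments if s.corrected is not None]


if __name__ == "__main__":
    # str.upper is a placeholder "translator" so the sketch runs on its own.
    article = draft_article(["President visits Chennai"], translate=str.upper)
    correct(article, 0, "ஜனாதிபதி சென்னை செல்கிறார்")
    print(parallel_pairs(article))
```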
38. Little traction with Wikipedians. Published at the WikiSym 2008 Conference; adopted for some products in Microsoft
39. WikiBhasha V2.0: Design Objectives. #1: Focus users on their purpose (say, Wikipedia): Content Creation, and not Translation. #2: In-site Solution: WikiBABEL to stay on Wikipedia for the session; submit any/all contributions. #3: Generic components, but specifically purposed: Vendor Neutrality, Componentized Architecture, …
41. WikiBhasha Beta: WikiBABEL released as WikiBhasha… Content creator, and not Translator; ‘On-Wikipedia’; Open sourced. ‘Bhasha’ in Sanskrit means ‘Language’. 2010 Version; interested in open-sourcing and contributing to Wikipedia
42. WikiBhasha: User View. Designed WikiBhasha as a thin edit layer; stays on Wikipedia; user contribution submitted to Wikipedia. Diagram labels: CTF; Dictionary; Cloud Services; WikiBABEL UX; Wikipedia; User Community; WikiBhasha 2.0 APIs
43. WikiBhasha: Developer View. WikiBhasha designed to be modular & extendible; open-sourced, so the community can contribute/enhance. Layers: MediaWiki Layer; Cloud Services Layer; WikiBhasha UI/UX/Integration Components Layer. Components: WikiBhasha CORE Components; GUI Components (Wikipedia-specific UI and Workflow); User-Interface (Generic UI Components, Scratch Pad, …); User-Experience (Linguistically Aware, Wiki-site Aware Workflow Engine); User Management (Authentication, User Credentials Management, User Preferences/Skills, Contributions Tracking, …); Contextual Help (Domain-specific, Context-specific, User-Contribution Aware Help…); Communication (Message Boards, Email/Alert Mechanisms, Wikis, …); Linguistic Resources (Mono-/Bi-lingual Dictionaries, Thesauri, …); Content Management (Content Discovery, Versions, Tagging, Notification Lists, …); Lang. Technology Components (Machine Translation, Transliteration, Summarization, …); Source/Target Wiki System Interface (Wiki APIs for Content Pull/Push, Content & User Management, …). Other diagram labels: MediaWiki Software; MediaWiki Extensions; WikiBABEL [Edit]; WikiBABEL-CORE; CTF; Wikipedia; 3rd Party Linguistic Services
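One way to picture the modular, vendor-neutral component design is a small interface that any language service can implement, plus a registry that the UI layer calls. This is a hypothetical Python sketch, not WikiBhasha's actual code: `LanguageService`, `GlossaryTranslator` and `ServiceRegistry` are invented names, and the glossary lookup is a toy stand-in for a real machine translation service.

```python
# Hypothetical sketch of pluggable language-technology components.
# All names are illustrative assumptions, not part of WikiBhasha.
from typing import Dict, Protocol


class LanguageService(Protocol):
    """Anything that turns source-language text into target-language text."""

    def process(self, text: str, source_lang: str, target_lang: str) -> str: ...


class GlossaryTranslator:
    """Toy stand-in for a translation service: word-by-word glossary lookup."""

    def __init__(self, glossary: Dict[str, str]) -> None:
        self.glossary = glossary

    def process(self, text: str, source_lang: str, target_lang: str) -> str:
        # Unknown words are passed through unchanged.
        return " ".join(self.glossary.get(w.lower(), w) for w in text.split())


class ServiceRegistry:
    """Maps a service kind (e.g. "translation") to whichever provider is plugged in."""

    def __init__(self) -> None:
        self._services: Dict[str, LanguageService] = {}

    def register(self, kind: str, service: LanguageService) -> None:
        self._services[kind] = service

    def run(self, kind: str, text: str, source_lang: str, target_lang: str) -> str:
        return self._services[kind].process(text, source_lang, target_lang)


if __name__ == "__main__":
    registry = ServiceRegistry()
    registry.register("translation", GlossaryTranslator({"president": "ஜனாதிபதி"}))
    print(registry.run("translation", "President visits", "en", "ta"))
```

Because callers depend only on the interface, a provider can be swapped by registering a different object under the same kind, which is the sense in which such a design is vendor neutral.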
44. WikiBhasha: A Community Project. WikiBhasha is available as a Bookmarklet / Wikipedia user-script. Please contribute to your Wikipedia! WikiBhasha source code is available as a MediaWiki Extension: http://svn.wikimedia.org/viewvc/mediawiki/trunk/extensions/WikiBhasha Please enhance it!
46. WikiBhasha Release: Covered in 20+ languages/countries across the world
47. WikiBhasha Release: ~500K Visits & ~100K Unique Visitors; visits from 50+ countries, primarily from Europe (and Eastern Europe). Many “casual visitors” who may become “contributors”!
48. Community Program: Being conducted in 5 demographics: Allahabad & Banaras, Cairo, Delhi, … Objectives: Interaction with Wikipedians & Language Enthusiasts; to study community adoption, user experience, data creation, and ultimately, technology development…
50. Languages: Communities & Technology. Research requires Data; the Participatory Internet provides the data needed! Digital “haves and have-nots”: for many languages of the world, Digital Inclusion is a necessary first step. Digital Democracy is a process in which the communities may have to take an active part…