Localizing Ruby on Rails in 64% of the World
1. Localizing
Ruby on Rails
Till Vollmer
Managing Director, Codemart GmbH
2. “The Confusion of Tongues”
by Gustave Doré
Locali[s|z]ing Ruby on Rails
Till Vollmer
Managing Director Codemart GmbH
3. Why?
64% of people online
do not speak English!
(680 million people)
Source: http://global-reach.biz
4. Definitions
Internationalisation (I18N)
Enabling an application to handle international text (input, processing
and output)
Localisation (L10N)
The process of adapting an application to fit a
locale
Globalisation (G11N)
Refers to both
7. History of Text
ASCII – American Standard
Code for Information
Interchange (1967)
7 Bit with control chars
developed from
telegraphic codes
8. History of Text
Codepages
Use of the 8th bit to
introduce all sorts of
additional characters
Charsets
ISO-8859-X
Problems: No texts
with mixed
encodings
9. Unicode
Have one (!) system that
can handle all possible
characters (1991)
encodes the underlying
characters — graphemes
— rather than the
variant glyphs
(renderings) for such
characters
http://ian-albert.com/misc/unichart.php
10. Unicode
Codepoints
Notation: U+xxxxxxxx (hex)
Purely abstract numbers, not an encoding!
First 256 code points -> ISO 8859-1
Example: U+0052, U+0075, U+0062, U+0079 ("Ruby")
Don't be misled by the leading 00 (it does not mean 2 bytes)
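To make the difference between a code point and its bytes concrete, here is a minimal Ruby sketch. It assumes a Ruby version with encoding-aware strings (1.9 or later), which is newer than the Ruby these slides describe; the variable names are only for illustration.

```ruby
# Build a string from the code points shown on the slide.
codepoints = [0x0052, 0x0075, 0x0062, 0x0079]
word = codepoints.pack("U*")   # "U*" encodes each integer as UTF-8
puts word                      # Ruby
puts word.bytesize             # 4: one byte per code point, despite the leading 00

# A code point above U+007F needs more than one byte in UTF-8.
e_acute = [0x00E9].pack("U")   # LATIN SMALL LETTER E WITH ACUTE
puts e_acute.bytesize          # 2
puts e_acute.unpack("U*").map { |cp| format("U+%04X", cp) }.join(" ")
```

The code point notation says nothing about storage size; only the chosen encoding does.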
11. Klingon language - Klingonese
Rejected by the Unicode Consortium in 2001,
but lives on in the Private Use Area,
U+F8D0 to U+F8FF
Copyright: Paramount Pictures
12. Unicode encodings
Mapping of code points to bytes
UTF-7: a 7-bit encoding, suited for transmission and storage only; obsolete
UTF-8: an 8-bit, variable-width encoding, compatible with ASCII
UTF-16: a 16-bit, variable-width encoding
UTF-32/UCS4: functionally identical 32-bit fixed-width encodings
UCS2: a 16-bit, fixed-width encoding that only supports the BMP
Most common myth: UTF-8 needs at most 2 bytes!
It can use up to 4 (or even 6 in ISO 10646)
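The size differences between the encodings can be checked directly. The sketch below assumes a modern Ruby with `String#encode` (1.9 or later, `to_h` needs 2.1 or later); `encoded_sizes` is a hypothetical helper, not part of any library.

```ruby
# Number of bytes one string needs in different Unicode encodings.
def encoded_sizes(text)
  %w[UTF-8 UTF-16BE UTF-32BE].map do |name|
    [name, text.encode(name).bytesize]
  end.to_h
end

sizes_ascii = encoded_sizes("Ruby")        # plain ASCII text
sizes_euro  = encoded_sizes("\u20AC")      # EURO SIGN, inside the BMP
sizes_clef  = encoded_sizes("\u{1D11E}")   # MUSICAL SYMBOL G CLEF, outside the BMP

p sizes_ascii   # UTF-8 is the most compact for ASCII
p sizes_euro    # UTF-8 already needs 3 bytes here
p sizes_clef    # every encoding needs 4 bytes outside the BMP
```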
13. UTF-8
Variable length; ASCII fits in unchanged
Up to 4 bytes (or even 6 in ISO 10646)
ISO 10646 range covered     UTF-8 representation
--------------------------  ---------------------------------------------------------
Bits  Hex Min   Hex Max     Byte sequence in binary
7     00000000  0000007F    0vvvvvvv
11    00000080  000007FF    110vvvvv 10vvvvvv
16    00000800  0000FFFF    1110vvvv 10vvvvvv 10vvvvvv
21    00010000  001FFFFF    11110vvv 10vvvvvv 10vvvvvv 10vvvvvv
26    00200000  03FFFFFF    111110vv 10vvvvvv 10vvvvvv 10vvvvvv 10vvvvvv
31    04000000  7FFFFFFF    1111110v 10vvvvvv 10vvvvvv 10vvvvvv 10vvvvvv 10vvvvvv
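The first four rows of the table translate almost mechanically into bit operations. The following is a sketch only: `utf8_bytes` is a hypothetical helper, it deliberately leaves out the 5- and 6-byte forms, and it does no validity checking (for example of surrogates).

```ruby
# Encode one code point into UTF-8 bytes, following the table row by row.
def utf8_bytes(cp)
  if cp <= 0x7F                       # 7 bits:  0vvvvvvv
    [cp]
  elsif cp <= 0x7FF                   # 11 bits: 110vvvvv 10vvvvvv
    [0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)]
  elsif cp <= 0xFFFF                  # 16 bits: 1110vvvv 10vvvvvv 10vvvvvv
    [0xE0 | (cp >> 12), 0x80 | ((cp >> 6) & 0x3F), 0x80 | (cp & 0x3F)]
  elsif cp <= 0x1FFFFF                # 21 bits: 11110vvv plus three 10vvvvvv
    [0xF0 | (cp >> 18), 0x80 | ((cp >> 12) & 0x3F),
     0x80 | ((cp >> 6) & 0x3F), 0x80 | (cp & 0x3F)]
  else
    raise ArgumentError, "5- and 6-byte forms are not covered in this sketch"
  end
end

p utf8_bytes(0x00E9)   # [195, 169], i.e. C3 A9
p utf8_bytes(0x20AC)   # [226, 130, 172], i.e. E2 82 AC
```

The first byte of U+00E9 is 0xC3, which is octal 303.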
14. Unicode and Ruby
The ugly truth is:
Ruby does not support it.
But: only Java and .NET are truly into
Unicode
Ruby supports different encodings for the
source code itself
$KCODE = 'UTF8'
require 'jcode'
15. Bad examples
"Über-Ruby".length # -> 10
"Café".length # -> 5
"Café".reverse # -> faC (preceded by garbage bytes)
"Café"[0..3] # -> Caf\303
"efficient"[0..1] # => "ef"? -> effi
upcase, reverse, downcase, etc. will not
work correctly
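The byte-level behaviour above can be reproduced on a modern Ruby by explicitly treating the string as raw bytes. This sketch assumes Ruby 2.0 or later (for `String#b`), where strings carry an encoding and the character-level results differ from the slide.

```ruby
cafe = "Caf\u00E9"   # "Café", written with an escape so the file encoding does not matter
raw  = cafe.b        # copy of the same bytes, tagged as binary (ASCII-8BIT)

byte_length  = raw.length    # counts bytes, like the Ruby described on the slide
char_length  = cafe.length   # counts characters
byte_slice   = raw[0..3]     # cuts the two-byte "é" in half
byte_reverse = raw.reverse   # reverses bytes, so the result is no longer valid UTF-8
char_reverse = cafe.reverse  # reverses characters

puts byte_length             # 5
puts char_length             # 4
p byte_slice                 # "Caf\xC3"
p byte_reverse.force_encoding("UTF-8").valid_encoding?   # false
```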
16. Implications
Be aware that for Ruby UTF-8 is just a
byte sequence
e.g. String#length: Watch your validators
=> There are tools to handle Unicode
correctly!
Hear more about Unicode and how to
handle Strings properly tomorrow on the
Unicode track from Dominic Mitchell
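As an illustration of the validator problem, a length check can count code points instead of bytes. This is a sketch under assumptions: `within_limit?` is a hypothetical helper, the input is assumed to be valid UTF-8, and code points are only an approximation of what a user perceives as characters.

```ruby
# Length check that counts code points rather than bytes.
def within_limit?(text, max_chars)
  text.unpack("U*").length <= max_chars   # "U*" decodes UTF-8 into code points
end

name = "Caf\u00E9"                 # 4 characters, 5 bytes in UTF-8
puts within_limit?(name, 4)        # true
puts name.bytesize <= 4            # false: a byte-based check would reject the name
```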
17. Localising Rails apps
Views
Models
Date/Time
Currency
Number formatting
Other: e.g. Name display
Built in stuff
=> Depends on Region and language
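Number formatting is a good example of something that depends on the region. The sketch below uses a hypothetical, hand-written separator table and a hypothetical `format_number` helper; a real application would take these settings from locale data or from one of the plugins.

```ruby
# Hypothetical per-locale separators (illustration only).
SEPARATORS = {
  "en-US" => { delimiter: ",", decimal: "." },
  "de-DE" => { delimiter: ".", decimal: "," }
}.freeze

# Formats a non-negative number with two decimals and grouped thousands.
def format_number(value, locale)
  sep = SEPARATORS.fetch(locale)
  whole, fraction = format("%.2f", value).split(".")
  grouped = whole.reverse.scan(/\d{1,3}/).join(sep[:delimiter]).reverse
  "#{grouped}#{sep[:decimal]}#{fraction}"
end

puts format_number(1234567.5, "en-US")   # 1,234,567.50
puts format_number(1234567.5, "de-DE")   # 1.234.567,50
```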
18. Built in stuff
Active Record error messages
Active Record model names
strftime (month names, weekdays)
distance_of_time_in_words
number_to_currency
date_select, datetime_select
Date.*, Time.*
…
=> be prepared
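Month names from strftime are one of these built-in English defaults. A minimal sketch of a locale-aware replacement follows; the tables, the patterns and the `localized_date` helper are assumptions for illustration and are not how Rails or any of the plugins implement it.

```ruby
# Hypothetical month-name and pattern tables for two languages.
MONTH_NAMES = {
  "en" => ["January", "February", "March", "April", "May", "June", "July",
           "August", "September", "October", "November", "December"],
  "de" => ["Januar", "Februar", "M\u00E4rz", "April", "Mai", "Juni", "Juli",
           "August", "September", "Oktober", "November", "Dezember"]
}.freeze

DATE_PATTERNS = {
  "en" => "%{month} %{day}, %{year}",
  "de" => "%{day}. %{month} %{year}"
}.freeze

# Renders a date with the month name and field order of the given language.
def localized_date(time, language)
  month = MONTH_NAMES.fetch(language)[time.month - 1]
  format(DATE_PATTERNS.fetch(language), day: time.day, month: month, year: time.year)
end

date = Time.utc(2007, 3, 5)
puts localized_date(date, "en")   # March 5, 2007
puts localized_date(date, "de")   # 5. März 2007
```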
19. Basic things in Rails
Database set to utf-8
database.yml
encoding: utf8
=> SET NAMES utf8
Set the header in a filter in application.rb:
@response.headers["Content-Type"] = "text/html; charset=utf-8"
META tag
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
Convince your editor to use UTF-8
Radrails -> global setting
Nasty: YAML has some bugs: fixtures ?!
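When the editor, database and HTTP header disagree, the usual symptom is text that is not valid UTF-8. A small check like the following can help track that down. It is a sketch that assumes a Ruby with encoding-aware strings (1.9 or later); `utf8_clean?` is a hypothetical helper.

```ruby
# Returns true if the given bytes form valid UTF-8.
def utf8_clean?(bytes)
  bytes.dup.force_encoding(Encoding::UTF_8).valid_encoding?
end

utf8_text   = "Caf\u00E9"     # "é" stored as the two bytes C3 A9
latin1_text = "Caf\xE9".b     # "é" stored as the single ISO 8859-1 byte E9

puts utf8_clean?(utf8_text)     # true
puts utf8_clean?(latin1_text)   # false: this text came from a non-UTF-8 source
```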
20. Rails Plugins/Gems
Globalize
Localize
Gloc
Localization Simplified
Ruby Gettext
Ri18N
=> All have different features
http://wiki.rubyonrails.com/rails/pages/InternationalizationComparison
21. Globalize
Model Translations
View Translations
Active Record errors
Caching
Currency, Date/Time
Pluralisation
All translations are in database
Locale handling
View files, ActionMailer
Comes with prefilled database for ~240 countries
=> Complex, but very powerful
22. Globalize Examples
Model:
class Model < ActiveRecord::Base
  translates :title, :description, :text
end
Locale.set "en-US"
model.title = "English"
View:
<%= :mytext.t %>
<%= "My text goes here".t %>
<%= _"My text" %> # gettext syntax
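To show the idea behind a `translates` declaration, here is a plain-Ruby sketch with locale-aware accessors. It is not Globalize's implementation: Globalize keeps translations in the database, whereas this sketch keeps them in memory, and `TinyTranslates`, `CurrentLocale` and `Article` are hypothetical names. It assumes Ruby 2.3 or later for `Hash#dig`.

```ruby
# Holds the active locale (stand-in for a real locale handler).
module CurrentLocale
  class << self
    attr_accessor :code
  end
  self.code = "en-US"
end

# Class macro that defines one reader and writer per translated field.
module TinyTranslates
  def translates(*fields)
    fields.each do |field|
      define_method(field) do
        (@translations ||= {}).dig(CurrentLocale.code, field)
      end
      define_method("#{field}=") do |value|
        ((@translations ||= {})[CurrentLocale.code] ||= {})[field] = value
      end
    end
  end
end

class Article
  extend TinyTranslates
  translates :title
end

article = Article.new
article.title = "English"          # stored under en-US
CurrentLocale.code = "de-DE"
article.title = "Deutsch"          # stored under de-DE
CurrentLocale.code = "en-US"
puts article.title                 # English
```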
23. Localize
Only View
Only strings
Simple pluralisation, but Proc support
Translations are in .rb files
=> Simple but efficient if you only need string
translation
Localization.define('de') do |l|
  l.store 'Hello World', 'Hallo Welt'
  l.store '(time)', lambda { |t| t.strftime('%H:%M') }
end
_"Hello World"
_"(time)"
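The combination of plain strings and Procs can be sketched in a few lines of plain Ruby. `TinyLocalization` is a hypothetical class written for this illustration; it is not the plugin's code and only mirrors the `store` call shown above.

```ruby
# Translation table whose entries are either strings or callables.
class TinyLocalization
  def initialize
    @entries = {}
  end

  def store(key, value)
    @entries[key] = value
  end

  # Falls back to the key itself when no translation is stored.
  def translate(key, *args)
    entry = @entries.fetch(key, key)
    entry.respond_to?(:call) ? entry.call(*args) : entry
  end
end

de = TinyLocalization.new
de.store "Hello World", "Hallo Welt"
de.store "(time)", lambda { |t| t.strftime("%H:%M") }

puts de.translate("Hello World")                       # Hallo Welt
puts de.translate("(time)", Time.utc(2007, 9, 17, 14, 5))   # 14:05
puts de.translate("Not translated")                    # Not translated
```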
24. Gloc
Only View
Date/Time
Active Record errors
Pluralisation rules
Translation is stored in .yml file
_gloc_rule_default: ' |n| n==1 ? "_single" : "_plural" '
man_count_plural: There are %d men.
man_count_single: There is 1 man.
lwr(:man_count, 1) # => There is 1 man.
lwr(:man_count, 8) # => There are 8 men.
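The rule above picks a key suffix based on the count. A plain-Ruby sketch of that mechanism follows; `GLOC_STRINGS`, `PLURAL_RULE` and `pluralized` are hypothetical names used for illustration and do not come from the plugin.

```ruby
# Strings as they would appear in the .yml file, keyed with a suffix.
GLOC_STRINGS = {
  "man_count_single" => "There is 1 man.",
  "man_count_plural" => "There are %d men."
}.freeze

# Same logic as the default rule: choose a suffix from the count.
PLURAL_RULE = lambda { |n| n == 1 ? "_single" : "_plural" }

def pluralized(key, count)
  template = GLOC_STRINGS.fetch("#{key}#{PLURAL_RULE.call(count)}")
  template.sub("%d", count.to_s)   # a template without %d is returned unchanged
end

puts pluralized(:man_count, 1)   # There is 1 man.
puts pluralized(:man_count, 8)   # There are 8 men.
```

Languages with more than two plural forms would need a rule that returns more suffixes, which is why the rule is stored per language.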
25. Localization Simplified
Localisation for a site that uses only ONE
language other than English
Date/Time
Active Record Error Messages
Sets connection and HTTP header
View is normal
26. Ruby Gettext
Based on GNU gettext
Translation stored in po/mo files
A lot of tools exist to translate
Rails support (Active Record errors,
ActionMailer)
Pluralisation
_"My Text"
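The sketch below shows the basic gettext idea: the original string is the lookup key, and the translation comes from a catalogue in PO format. It is a simplified illustration, not Ruby Gettext itself. `parse_po` is a hypothetical helper that handles only single-line `msgid`/`msgstr` pairs (no plural forms, escapes or multi-line entries), and the sample catalogue is made up. It assumes Ruby 2.3 or later for the `<<~` heredoc.

```ruby
# Made-up sample catalogue in PO syntax.
PO_SAMPLE = <<~PO
  msgid "My Text"
  msgstr "Mein Text"

  msgid "Hello World"
  msgstr "Hallo Welt"
PO

# Reads single-line msgid/msgstr pairs into a hash.
def parse_po(source)
  catalog = {}
  current_id = nil
  source.each_line do |line|
    case line
    when /\Amsgid "(.*)"\s*\z/
      current_id = $1
    when /\Amsgstr "(.*)"\s*\z/
      catalog[current_id] = $1 unless current_id.nil? || $1.empty?
    end
  end
  catalog
end

CATALOG = parse_po(PO_SAMPLE)

# Returns the translation, or the original text when none exists.
def _(text)
  CATALOG.fetch(text, text)
end

puts _("My Text")        # Mein Text
puts _("Untranslated")   # Untranslated
```

Because untranslated strings fall back to the original text, wrapping strings in `_()` is harmless even before any translation exists.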
27. Ri18N
Again based on gettext, but not so
complex
Po files
NO Rails support
Purely based on Ruby for Ruby
=> all built in things must be taken care of
28. Recommendations
Choose wisely
Patching may occur
If you need model translation, the only
choice is Globalize
Using _() in general is a good idea if you
plan to go global and take over the world
29. Summary
Unicode is the best choice (UTF-8) for
storing texts
Ruby is not good at UTF-8 (but not alone)
Localisation: A lot more than store text
A lot of plugins exist to help