Note: downloading the slides will give you a better experience.
This talk was given at PyConTW 2019.
https://tw.pycon.org/2019/en-us/events/talk/839036452602904785/
The YouTube video will be released later.
6. 🚨 Confusing Terms 🚨
• Str: str() in Python
• Bytes: bytes() in Python
• Text: unicode() in Python 2 or str() in Python 3
• String: Text ∪ Bytes
For this talk, you just need to know that these terms mean different things.
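To make the four terms concrete, here is a small Python 3 sketch. The helper name `is_string` is hypothetical and only mirrors the slide's "String = Text ∪ Bytes" definition. Note that "Str" is the ambiguous one: it coincides with Text on Python 3 and with Bytes on Python 2.

```python
# Python 3: the terms from the slide, shown as concrete types.
text = u"Py Hon"               # Text: str in Python 3 (unicode in Python 2)
data = text.encode("utf-8")    # Bytes: bytes in both versions


def is_string(value):
    """Hypothetical helper: 'String' in this talk means Text or Bytes."""
    return isinstance(value, (str, bytes))


# On Python 3, Str and Text are the same type; Bytes is a separate one.
str_is_text = isinstance(text, str)
str_is_bytes = isinstance(data, str)
```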
30. Python 2 is well-designed for ASCII
BUT…
• Most encodings only support one direction: utf-8, latin-1
>>> 'ℙƴ ℌøἤ'.encode()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 0: ordinal not in range(128)
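Why does calling encode() raise a *decode* error? In Python 2 the literal is already bytes, so encode() first decodes it implicitly with the ascii codec. The sketch below replays that hidden step explicitly in Python 3. It assumes the literal was typed in a UTF-8 terminal, which matches the 0xe2 byte in the traceback.

```python
# Python 3 sketch of the hidden step behind the Python 2 traceback.
# Assumption: the Python 2 session ran in a UTF-8 terminal, so the ''
# literal held the UTF-8 bytes of the text.
raw = u"ℙƴ ℌøἤ".encode("utf-8")

try:
    # Python 2 runs this decode implicitly before it can encode.
    raw.decode("ascii")
    error = None
except UnicodeDecodeError as exc:
    error = exc
```

The first byte of `raw` is 0xe2, the same byte at position 0 that the traceback complains about.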
37. The Fact of String
By Ned Batchelder
• Python 3: string = unicode
  • Encoding needs to be handled manually
  • One-way encode/decode behind str/bytes (Good)
• Python 2: string = auto encoded bytes
  • ASCII is handled perfectly
  • Two-way encode/decode behind str/bytes (broken once Python became widely used across many human languages)
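A minimal Python 3 sketch of what "one-way" means in practice: text can only be encoded, bytes can only be decoded, and the two types refuse to mix. The variable names are illustrative only.

```python
text = u"héllo"
data = text.encode("utf-8")

# One-way: str has no decode(), bytes has no encode() on Python 3.
str_can_decode = hasattr(text, "decode")
bytes_can_encode = hasattr(data, "encode")

try:
    # Python 3 refuses to mix the two types. Python 2 would attempt an
    # implicit ascii conversion when unicode and str are combined.
    mixed = text + data
except TypeError:
    mixed = None

# The explicit round trip is the only supported path.
round_trip = data.decode("utf-8")
```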
39. Consistent IO
• Standard I/O (bytes)
• Python 2: sys.stdin / sys.stdout
• Python 3: sys.stdin.buffer / sys.stdout.buffer
• File IO
import io # consistent API in both versions
io.open('path/to/file', 'wt') # 'wt' for text, 'wb' for bytes
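A short sketch of the same idea, written so that it should run on both Python 2.7 and Python 3. The scratch file path and the `stdout_bytes` helper are hypothetical, added only for illustration.

```python
import io
import os
import sys
import tempfile

# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Text mode: write text and state the encoding explicitly.
with io.open(path, "wt", encoding="utf-8") as f:
    f.write(u"ℙƴ ℌøἤ\n")

# Binary mode: read the raw bytes back, with no decoding involved.
with io.open(path, "rb") as f:
    raw = f.read()

# Text mode again: the same file comes back as text.
with io.open(path, "rt", encoding="utf-8") as f:
    text = f.read()


def stdout_bytes():
    """Hypothetical helper: return a byte-oriented stdout on either version."""
    # Python 3 exposes the byte stream as sys.stdout.buffer;
    # Python 2's sys.stdout already accepts bytes.
    return getattr(sys.stdout, "buffer", sys.stdout)
```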
40. Bytes or Text (with Encoding)
• u'' or b''?
• No more bare (unprefixed) string literals ''
• Which encoding is used for the text?
• No more guessing; always provide the encoding: latin-1, utf-8…
def my_encrypt(text, encoding='utf-8'):
    …
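The same "encoding as an explicit parameter" pattern can be sketched with two small conversion helpers. These are not the talk's my_encrypt, whose body is elided on the slide; `to_bytes` and `to_text` are hypothetical names chosen for illustration.

```python
def to_bytes(value, encoding="utf-8"):
    """Hypothetical helper: return bytes, encoding text with the given codec."""
    if isinstance(value, bytes):
        return value
    return value.encode(encoding)


def to_text(value, encoding="utf-8"):
    """Hypothetical helper: return text, decoding bytes with the given codec."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


word = u"café"
as_utf8 = to_bytes(word)                        # b'caf\xc3\xa9'
as_latin1 = to_bytes(word, encoding="latin-1")  # b'caf\xe9'
```

The two byte strings differ even though the text is the same, which is why the encoding has to travel with the bytes instead of being guessed.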
41.
>>> # -*- coding: utf-8 -*-
sys.getdefaultencoding()
Encoding of your Operating System
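These are separate settings and they need not agree. A small sketch for inspecting two of them at runtime follows; the variable names are illustrative, and the values printed depend on your interpreter and system.

```python
import locale
import sys

# Interpreter default for implicit str/bytes conversions:
# 'utf-8' on Python 3, 'ascii' by default on Python 2.
default_encoding = sys.getdefaultencoding()

# Encoding preferred by the operating system locale. On Python 3 this is
# typically what open() falls back to in text mode when no encoding is given.
locale_encoding = locale.getpreferredencoding(False)

print("default:", default_encoding)
print("locale: ", locale_encoding)
```

The `# -*- coding: utf-8 -*-` line is a third, independent setting: it only tells Python how to decode the source file itself.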
52. Take-Home Messages
Write text/bytes explicitly with typing
Always provide encoding for bytes
Apply Unicode Sandwich if possible
Copy encode/decode from StackOverflow
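The first three messages can be combined in one small sketch of the Unicode Sandwich: bytes at the edges, text in the middle, with type hints stating which is which. The function names `shout` and `handle` are hypothetical, and the code assumes Python 3.

```python
def shout(text: str) -> str:
    """The filling: business logic that only ever sees text."""
    return text.upper()


def handle(raw: bytes, encoding: str = "utf-8") -> bytes:
    """Decode at the input edge, work on text, encode at the output edge."""
    text = raw.decode(encoding)      # bytes -> text, as early as possible
    result = shout(text)             # text only in the middle
    return result.encode(encoding)   # text -> bytes, as late as possible


reply = handle(u"héllo".encode("utf-8"))
```

Because the encoding is a parameter, the same function also works for other codecs, for example `handle(b"caf\xe9", encoding="latin-1")`.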
Python strings are no longer a nightmare! 🎉
(Even if you only write Python 3)
56. Reference: Talks
• Ned Batchelder: Pragmatic Unicode, or, How do I stop the pain?
• Guido van Rossum: BDFL Python 3 retrospective
• Brett Cannon: How to make your code Python 2/3 compatible (PyCon 2015)
• Edward Schofield: Writing Python 2/3 compatible code
• Victor Stinner: Python 3: ten years later (PyCon 2018)