This talk describes the design decisions, implementation steps, and experiences gained while migrating data-oriented financial applications from Python 2.6 to Python 3.5 at the Raiffeisen Rechenzentrum (RRZ).
3. About me
• Software developer
• Business intelligence in finance
• Medical workflow
• E-commerce in retail
• Open source: https://github.com/roskakori
• Master’s degree in information processing science
• Co-organizer PyGRAZ user group: https://pygraz.org
• http://www.roskakori.at, @TAglassinger
4. Agenda
• Python usage in RRZ
• Recommendations
• Impact and effort
• Appendices: software migration, Python 3 migration, references
6. About the RRZ
• IT service provider
• Offers highly available and secure IT infrastructure
• Roots in banking, active since 1975
• Located in Graz-Raaba, Styria, Austria
• https://www.rrz.co.at/
8. Code base to migrate
• About 37K SLOC
• Data driven tests
• About 60% code coverage
[Figure: code base by component: common 5K, tools 9K, test 3K, and 20K shown with app1, app2 and app3]
9. Data driven testing
• Store premade input and expected output in version control repository
• Test code: execute app and compare outputs
• Majority of effort: creating and maintaining test data
[Figure: the version control repository provides premade input and expected output; the app code turns the premade input into actual output; assertEqual() compares actual and expected output on the continuous integration server]
10. Python and external packages
• Python 2.6
• Common packages, mostly to process certain kinds of data: lxml, xlrd, reportlab, …
• Own packages released as open source:
• Cutplace: read and validate CSV, PRN etc.
• Loxun: scalable streamed XML writing
• Public domain codec for EBCDIC cp1141 (vanished from the web)
11. Challenges
• No customer demand → software runs fine with Python 2.6; at best, Python 3 does not change that
• Development of new features has to continue during migration
• Mainframe-centric environment → Python still considered “suspicious”; obscure data formats (e.g. VSAM) and codecs (EBCDIC)
• Managed binaries due to high security demands → developers can’t just install and run new tools (though virtualenv is possible)
• Parallel migration of the core banking system → less COBOL, more PL/1; still Easytrieve and WebFOCUS -_-
13. Visit Python conferences
• Meet competent people
• Share experiences
• Get the current mood
14. Visit Python conferences
• EuroPython 2012:
• “Python 3 doesn’t really work yet”
• EuroPython 2013:
• “People don’t use Python 3 but want to… kinda”
• Coffee break talk with one of the SQLAlchemy guys about his experience with using the same code for Python 2 and 3
• Assessment: many of the packages we need do not work with Python 3 yet
15. Visit Python conferences
• EuroPython 2014:
• Attended talk on “Support Python 2 and Python 3 with the same code” by Stefan Schwarzer: https://www.youtube.com/watch?v=9vNr_ZzZZAk
• Attended sprint with Stefan and migrated own open source package loxun
• Takeaway: single code strategy seems nice
16. Be prepared
• Decide on a cut-over strategy: big bang or soft?
• In practice mostly: 2to3 or same code for Python 2 and 3?
• Cover all areas
• Also consider data, user interface, developer tools, deployment, scheduling
• See appendix “software migration and strategies”
17. Use a soft migration strategy
• Same code for Python 2 and 3
• Start writing code that works with Python 2 and should work in Python 3 (even if you can’t fully test that yet)
• Use __future__, six, python-modernize etc.
• Track Python 3 compatibility of required external packages
• See appendix “Python 3 migration in general”
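As an illustration of code that works with Python 2 and should work in Python 3, here is a minimal sketch that relies only on `__future__` and the standard library. The functions `average()` and `write_report()` and the names `PY2` and `text_type` are made up for this example; six provides similar helpers out of the box.

```python
from __future__ import absolute_import, division, print_function, unicode_literals

import io
import sys

PY2 = sys.version_info[0] == 2
# The text type is called unicode in Python 2 and str in Python 3.
# The name unicode is only evaluated when running under Python 2.
text_type = unicode if PY2 else str  # noqa: F821


def average(values):
    """Uses true division in both versions because of the division import."""
    return sum(values) / len(values)


def write_report(path, lines):
    """Write lines as UTF-8 text; io.open() accepts an encoding in Python 2.6+ and 3."""
    with io.open(path, "w", encoding="utf-8") as report_file:
        for line in lines:
            report_file.write(text_type(line) + "\n")


if __name__ == "__main__":
    # print() is a function in both versions because of the print_function import.
    print("average:", average([1, 2]))
```

Without the `division` import, `average([1, 2])` would yield 1 under Python 2 and 1.5 under Python 3, which is the kind of silent difference a shared code base has to rule out.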
18. Test by comparing outputs
[Figure: the same input is processed with Python 2 and with Python 3; the two outputs are compared (Δ?)]
19. Test by comparing outputs
• Even works during parallel migration of core banking system
• Even works when your code coverage drops to 10%
[Figure: the same input is processed with Python 2 and with Python 3; the two outputs are compared (Δ?)]
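Comparing the outputs of a Python 2 run and a Python 3 run can be automated with a few lines of standard library code. The sketch below assumes that each run wrote its files into its own flat folder; the function name `compare_output_dirs()` is hypothetical.

```python
import filecmp
import os


def compare_output_dirs(python2_dir, python3_dir):
    """Return sorted names of files that differ or exist in only one folder."""
    python2_names = set(os.listdir(python2_dir))
    python3_names = set(os.listdir(python3_dir))
    only_one_side = python2_names ^ python3_names
    common = sorted(python2_names & python3_names)
    # shallow=False compares file contents instead of trusting os.stat() data.
    _, mismatches, errors = filecmp.cmpfiles(
        python2_dir, python3_dir, common, shallow=False
    )
    return sorted(only_one_side | set(mismatches) | set(errors))
```

An empty result means that both interpreters produced byte-identical output for the given input. This check does not depend on unit test coverage, only on having representative input.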
20. Deploy both in production
• Install in different folders (or servers)
• Use separate configuration files pointing to the same input and output
[Figure: the Python 2 and Python 3 installations share one input and one output]
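To make the idea of separate configuration files more concrete, here is a small sketch using `configparser` from the Python 3 standard library. The section name, the keys and all paths are invented for illustration; the actual configuration format used at RRZ is not described in the slides.

```python
import configparser

# Hypothetical configuration of the Python 2 installation.
PYTHON2_CONFIG = """
[paths]
interpreter = /opt/app1-python2/bin/python
input = /data/app1/input
output = /data/app1/output
"""

# Hypothetical configuration of the Python 3 installation:
# a different interpreter, but the same input and output.
PYTHON3_CONFIG = """
[paths]
interpreter = /opt/app1-python3/bin/python
input = /data/app1/input
output = /data/app1/output
"""


def read_paths(config_text):
    """Parse the [paths] section of a configuration text into a dictionary."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    return dict(parser["paths"])
```

Because only the interpreter entry differs in this sketch, switching an application between the two installations would not require touching the data locations.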
21. Deploy both in production
• Can switch to Python 3 at any time (per application)
• Trivial fallback scenario: revert to Python 2
• Risk: migration drags on due to lack of actual need → eventually just do it
[Figure: the Python 2 and Python 3 installations share one input and one output]
22. Improve your infrastructure (conservatively)
• “Quality of life improvements”
• Python + several external packages → Anaconda + few external packages
• easy_install → pip, conda
• Windows scheduled tasks + log monitoring → professional scheduler
• Eclipse + PyDev → PyCharm
23. Refactor code conservatively
• During migration, you revisit your whole code base
• You’ll probably notice things that work but could be done in a better way
• Avoid fighting too many battles at the same time
• Remove obsolete code
• Perform minor cleanup
• No architectural refactoring → create an issue or TODO comment and move on. Example: getopt usage code from the days of Python 2.2
24. Contribute to open source
• Python 2-only EBCDIC codec was not supported any more → released new package: https://pypi.python.org/pypi/ebcdic/
• No middleware for csv module → released new package: https://pypi.python.org/pypi/csv342/
• Tested by the public
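The following sketch shows how EBCDIC data is converted to and from text in Python 3. To stay runnable without extra installation it uses cp500, an EBCDIC codec from the standard library, as a stand-in. The assumption is that a codec provided by the ebcdic package, such as cp1141, is used the same way by passing its name once the package has registered it. The function names are hypothetical.

```python
# Stand-in codec from the standard library; assumption: a codec such as
# cp1141 from the ebcdic package can be passed here in the same way.
EBCDIC_CODEC = "cp500"


def ebcdic_to_text(data, encoding=EBCDIC_CODEC):
    """Decode mainframe bytes to text."""
    return data.decode(encoding)


def text_to_ebcdic(text, encoding=EBCDIC_CODEC):
    """Encode text as bytes for the mainframe."""
    return text.encode(encoding)
```

In Python 3 the distinction between bytes (`data`) and text is explicit, so a missing or unsupported codec shows up immediately as a `LookupError` instead of producing garbled output later.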
25. Utilize interns
• Mundane tasks that are hard to automate:
• optparse → argparse
• urllib → requests
• csv → csv
• Advantages
• Interns do actually meaningful things they can add to their CV
• Permanent developers can focus on customer requirements
• Fun!
• School project with HTL Wiener Neustadt to migrate our existing open source package to read and validate tabular data: https://pypi.python.org/pypi/cutplace/
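As an example of such a mundane task, the sketch below shows what command line parsing might look like after moving from optparse to argparse. The options and the positional argument are invented for illustration. Note that argparse is part of the standard library from Python 2.7 and 3.2 on; for Python 2.6 it has to be installed separately.

```python
import argparse


def parse_arguments(arguments=None):
    """Parse command line arguments; None means use sys.argv[1:]."""
    parser = argparse.ArgumentParser(
        description="Convert a data file (hypothetical example)."
    )
    # With optparse this was parser.add_option("-e", "--encoding", ...).
    parser.add_argument(
        "-e", "--encoding", default="utf-8", help="encoding of the input file"
    )
    parser.add_argument(
        "-v", "--verbose", action="store_true", help="log additional details"
    )
    # optparse returned positional arguments as a plain list that had to be
    # checked manually; argparse declares and validates them.
    parser.add_argument("input", help="path of the file to convert")
    return parser.parse_args(arguments)
```

For example, `parse_arguments(["--encoding", "cp500", "data.csv"])` yields a namespace with `encoding`, `verbose` and `input` attributes. Each script needs a slightly different set of declarations, which is why this kind of change is hard to automate.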
32. Summary
• Migrating to Python 3 went smoothly
• Deliberately long duration but few resources
• Key factors: single code, open source, interns
35. Software migration
• Goals
• Keep existing software “alive” without replacing it with new software
• Ensure low costs for continued maintenance
• Ensure it can be extended in the future with reasonable effort
36. Unified process for software migration
1. Requirements analysis
2. Legacy analysis
3. Target/bridge design
4. Choice of strategy
5. Implementation (transformation)
6. Quality assurance (testing)
7. Cut-over
37. Software migration strategies
• Reimplementation: rewrite from scratch, keep functionality the same
• Wrapping: preserve internal functions, update only the interfaces so functionality is accessible for more modern systems
• Conversion: modify software so it runs on the new system
38. Cut-over strategies
• Big bang: replace old system in one fell swoop
• Soft: gradually replace old system
39. Areas of software migration
• Programs
• Data
• User interface
• Scheduling
40. References
• Ackermann, E., Winter, A. & Gimnich, R. (2005). Ein Referenz-Prozess der Software Migration. Softwaretechnik-Trends 25(4), pp. 20-22.
• Brodie, M. & Stonebraker, M. (1995). Migrating Legacy Systems. San Francisco, California: Morgan Kaufmann.
• Sneed, H., Wolf, E. & Heilmann, H. (2010). Software-Migration in der Praxis: Übertragung alter Softwaresysteme in eine moderne Umgebung. dpunkt Verlag.
42. Basic strategies (1/2)
• Support only Python 3
• Fast, easy, clean; use 2to3
• Limits cut-over strategy to “big bang”
• Version control branches for Python 2 and 3
• Similar to above but both branches can be maintained in parallel → soft cut-over possible
• Both branches have to be maintained → additional costs
43. Basic strategies (2/2)
• Integrate 2to3 in build process
• One code base, soft cut-over
• Automatic conversion is error prone and might require implementing your own hairy transformation rules
• Integrate 3to2 in build process: only in theory
• Single code for both Python 2 and 3
• One code base, soft cut-over
• Requires middleware, sometimes “ugly” code
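The kind of middleware and “ugly” code that a single code base requires might look like the following small compatibility module. It is a sketch using only the standard library; the helper names `to_text()` and `host_of()` are hypothetical, and packages such as six offer ready-made equivalents for many of these cases.

```python
try:  # Python 3
    from urllib.parse import urlparse
except ImportError:  # Python 2
    from urlparse import urlparse


def to_text(value, encoding="utf-8"):
    """Return value as text; bytes (str in Python 2) are decoded first."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def host_of(url):
    """Return the host name of a URL given as either bytes or text."""
    return urlparse(to_text(url)).hostname
```

The try/except import and the explicit bytes check are the price of one code base: application code imports `urlparse` and `to_text` from this module and no longer needs to know which interpreter it runs on.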
44. References
• Regebro, L. (2013). Porting to Python 3: An in-depth guide. CreateSpace Independent Publishing Platform. http://python3porting.com/
• Ronacher, A. (2013). Porting to Python 3 Redux. http://lucumr.pocoo.org/2013/5/21/porting-to-python-3-redux/
• Schofield, E. (2015). Cheat Sheet: Writing Python 2-3 compatible code. Python Charmers Pty Ltd, Australia. http://python-future.org/compatible_idioms.html