Illustrated Code: Building Software in a Literate Way
Andreas Zeller, CISPA Helmholtz Center for Information Security
Notebooks – rich, interactive documents that join together code, documentation, and outputs – are all the rage with data scientists. But can they be used for actual software development? In this talk, I share experiences from authoring two interactive textbooks – fuzzingbook.org and debuggingbook.org – and show how notebooks not only serve for exploring and explaining code and data, but also how they can be used as software modules, integrating self-checking documentation, tests, and tutorials all in one place. The resulting software focuses on the essential, is well-documented, highly maintainable, easily extensible, and has a much longer shelf life than the "duct tape and wire" prototypes frequently found in research and beyond.
Illustrated Code (ASE 2021)
1. Andreas Zeller • 36th IEEE/ACM International Conference on Automated Software Engineering (ASE 2021) • November 17, 2021
Illustrated Code
Building Software in a Literate Way
14. Illustrated Code
Everything in One Place
in documents that are executable •
self-updating • self-checking
One
Document
Rationales
Architecture
Tests
Interface
Q&A
Tutorials
Specification
Implementation
15. A Case Study
Define a function middle(x, y, z) that returns the "middle" of three integers x, y, and z – i.e. the one that is neither the maximum nor the minimum of the three.
One
Document
Architecture
Tests
Interface
Tutorials
Specification
Rationales
Q&A
Implementation
16. Interface
How do I use the code?
• Standard way of documenting things
• No formal spec (what is "the middle" here?); no context; no rationale
• No usage example
• No implementation (yet)
Let us define an interface for middle():
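The code shown on this slide is not part of the transcript. As a minimal sketch, assuming a plain Python signature with type hints and a docstring (both are assumptions), such an interface could look like this:

```python
def middle(x: int, y: int, z: int) -> int:
    """Return the "middle" of the three integers `x`, `y`, and `z`."""
    # Interface only: there is no implementation (yet).
    raise NotImplementedError("middle() is not implemented yet")
```

This is exactly what the bullets above criticize: a signature and one sentence, with no spec, no example, and no rationale.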
17. Specification
What should the code do?
Let us formally specify what middle() should do:
This specification is executable, so we can easily include examples:
Or just write the examples as assertions, so we can use them as tests later:
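The specification and examples on this slide were shown as code that did not survive in the transcript. The following is a sketch only: the name middle_spec() appears later in the talk, but defining it via sorting is an assumption made here for illustration.

```python
def middle_spec(x: int, y: int, z: int) -> int:
    """Executable specification (assumed): the second of the three sorted values."""
    return sorted([x, y, z])[1]


# Examples written as assertions, so they can be reused as tests later.
assert middle_spec(5, 4, 7) == 5
assert middle_spec(1, 2, 3) == 2
assert middle_spec(3, 1, 2) == 2
assert middle_spec(2, 2, 1) == 2  # with ties, the repeated value is the middle
```

Because the assertions run every time the document is executed, a wrong example stops the document instead of silently misleading the reader.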
18. Specification
What should the code do?
Our specification document can contain all sorts of diagrams and more
```mermaid
sequenceDiagram
Client->>+Server: middle(5, 4, 7)
Server-->>-Client: 5
```
(I get this through three Markdown lines)
19. Implementation
How does it work?
Let us now provide an efficient implementation for middle():
Once written, this is executable:
Tests and results become part of the doc!
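The implementation itself is not in the transcript. One plausible sketch, assuming a comparison-based version that avoids sorting (this may differ from the code on the slide):

```python
def middle(x: int, y: int, z: int) -> int:
    """Return the middle of x, y, and z using at most three comparisons."""
    if y < z:
        if x < y:
            return y      # x < y < z
        elif x < z:
            return x      # y <= x < z
        else:
            return z      # y < z <= x
    else:
        if x > y:
            return y      # z <= y < x
        elif x > z:
            return x      # z < x <= y
        else:
            return z      # x <= z <= y


# Executing the code in the document makes the results part of the document.
assert middle(5, 4, 7) == 5
assert middle(3, 1, 2) == 2
```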
20. Rationales
Why is the code as it is?
Why do we implement middle() again, rather than using middle_spec()?
Because middle() is twice as fast.
The document can also discuss and evaluate more alternatives,
reproducing the thoughts and experiments of the original programmer
21. Rationales
Why is the code as it is?
The document can include all these experiments and their results as a rationale:
(experiment details go here)
We can have the document check automatically whether the rationale holds:
This ensures consistency between text and code.
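As an illustration of such an automatic check, here is a sketch that measures both variants with the standard timeit module. The function bodies and the helper name speedup() are assumptions; the measured factor depends on the machine, so no particular value is claimed here.

```python
import timeit


def middle_spec(x, y, z):
    # Assumed specification: sort and pick the second value.
    return sorted([x, y, z])[1]


def middle(x, y, z):
    # Assumed comparison-based implementation.
    if y < z:
        if x < y:
            return y
        elif x < z:
            return x
        else:
            return z
    else:
        if x > y:
            return y
        elif x > z:
            return x
        else:
            return z


def speedup(repetitions: int = 100_000) -> float:
    """Return how many times faster middle() ran than middle_spec()."""
    t_spec = timeit.timeit(lambda: middle_spec(5, 4, 7), number=repetitions)
    t_impl = timeit.timeit(lambda: middle(5, 4, 7), number=repetitions)
    return t_spec / t_impl
```

A document could then contain a line such as `assert speedup() > 1`, so that the rationale "middle() is faster" is re-validated on every run; the exact threshold is a choice for the author.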
22. Tests
What does the code guarantee?
Tests can be written as additional examples on how the code should work:
If a test fails, that's the same as an example failing. (And examples act as tests.)
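A sketch of what such tests could look like, assuming the comparison-based middle() used for illustration here (not necessarily the code from the slide). Each assertion reads as one more usage example:

```python
from itertools import permutations


def middle(x, y, z):
    # Assumed comparison-based implementation.
    if y < z:
        if x < y:
            return y
        elif x < z:
            return x
        else:
            return z
    else:
        if x > y:
            return y
        elif x > z:
            return x
        else:
            return z


# The order of the arguments should not matter.
for x, y, z in permutations((1, 2, 3)):
    assert middle(x, y, z) == 2, (x, y, z)

# Negative numbers and repeated values work, too.
assert middle(-5, 0, 5) == 0
assert middle(4, 4, 9) == 4
```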
23. Tests
What does the code guarantee?
One can analyze (and report) test performance, again in the document:
Assumptions about coverage can be made in the document, too
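How coverage was measured on the slide is not in the transcript. As a rough sketch, line coverage can be collected with the standard sys.settrace hook; the helper covered_lines() and the implementation of middle() are assumptions for illustration.

```python
import sys
from itertools import permutations


def middle(x, y, z):
    if y < z:
        if x < y:
            return y
        elif x < z:
            return x
        else:
            return z
    else:
        if x > y:
            return y
        elif x > z:
            return x
        else:
            return z


def covered_lines(func, inputs):
    """Run func on each input; return executed line offsets within func."""
    code = func.__code__
    lines = set()

    def tracer(frame, event, arg):
        if frame.f_code is code and event == "line":
            lines.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        for args in inputs:
            func(*args)
    finally:
        sys.settrace(previous)  # restore whatever tracer was active before
    return lines


one_test = covered_lines(middle, [(1, 2, 3)])
all_orders = covered_lines(middle, list(permutations((1, 2, 3))))

# Assumption stated in the document: a single test does not reach every line.
assert one_test < all_orders
```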
24. Tests
What does the code guarantee?
One can check against the spec, again in the document:
All these tests can be run (and debugged) right from the document.
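A sketch of such a check, assuming a sorting-based middle_spec() and a comparison-based middle() (both illustrative), compares the two on random inputs. The helper name check_against_spec() and the value range are assumptions.

```python
import random


def middle_spec(x, y, z):
    # Assumed specification: sort and pick the second value.
    return sorted([x, y, z])[1]


def middle(x, y, z):
    # Assumed comparison-based implementation.
    if y < z:
        if x < y:
            return y
        elif x < z:
            return x
        else:
            return z
    else:
        if x > y:
            return y
        elif x > z:
            return x
        else:
            return z


def check_against_spec(trials: int = 10_000, seed: int = 0) -> int:
    """Compare middle() with middle_spec() on random triples; return trials run."""
    rng = random.Random(seed)  # fixed seed keeps the document reproducible
    for _ in range(trials):
        x, y, z = (rng.randint(-100, 100) for _ in range(3))
        assert middle(x, y, z) == middle_spec(x, y, z), (x, y, z)
    return trials
```

If the two ever disagree, the failing assertion reports the offending triple, which can then be debugged right in the document.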
25. Tests
What does the code guarantee?
One can also include static checks or symbolic verification:
← and actually prove correctness
26. Architecture
How is the code organized?
We can extract architecture diagrams from code, keeping them up to date
27. Architecture
How is the code organized?
We can extract dynamic diagrams from executions:
We can even compare these diagrams against the spec and find mismatches!
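How the diagrams were extracted on the slide is not in the transcript. As a small sketch, one actual call can be turned into the Mermaid sequence diagram format used earlier in this deck; the helper sequence_diagram() and the stand-in body of middle() are assumptions.

```python
def middle(x, y, z):
    # Stand-in implementation, for illustration only.
    return sorted([x, y, z])[1]


def sequence_diagram(func, *args, caller="Client", callee="Server") -> str:
    """Execute func(*args) and describe the call as a Mermaid sequence diagram."""
    result = func(*args)
    call = f"{func.__name__}({', '.join(map(repr, args))})"
    return "\n".join([
        "sequenceDiagram",
        f"{caller}->>+{callee}: {call}",
        f"{callee}-->>-{caller}: {result!r}",
    ])


SPEC_DIAGRAM = """sequenceDiagram
Client->>+Server: middle(5, 4, 7)
Server-->>-Client: 5"""

# Compare the diagram obtained from an execution against the one in the spec.
assert sequence_diagram(middle, 5, 4, 7) == SPEC_DIAGRAM
```

A mismatch between the two strings would point to a difference between specified and observed behavior.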
28. Tutorials
How can I use or extend this?
The document can contain instructions on how to run things:
(Of course, these would be executable too, testing the tutorial)
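One way to make tutorial text executable, sketched here with Python's standard doctest module: the tutorial is ordinary prose with `>>>` examples, and the document runs those examples. The tutorial wording, the helper run_tutorial(), and the implementation of middle() are assumptions for illustration.

```python
import doctest


def middle(x, y, z):
    # Assumed comparison-based implementation.
    if y < z:
        if x < y:
            return y
        elif x < z:
            return x
        else:
            return z
    else:
        if x > y:
            return y
        elif x > z:
            return x
        else:
            return z


TUTORIAL = """
To find the middle of three numbers, call middle():

>>> middle(5, 4, 7)
5

The order of the arguments does not matter:

>>> middle(7, 5, 4)
5
"""


def run_tutorial() -> doctest.TestResults:
    """Run every >>> example in TUTORIAL; return (failed, attempted)."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(TUTORIAL, {"middle": middle}, "tutorial", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)
```

If an example in the tutorial no longer matches the code, run_tutorial() reports a failure, so the tutorial cannot silently go stale.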
29. Q&A
Where can I ask a question?
The document can contain sections with questions and answers:
These would be managed by the public, and continuously ensure consistency
(Thanks to Greg Wilson for this idea!)
30. Illustrated Code
• Have all in one document
• Use rich documentation
• Make things computable
• Make things checkable
‣ Fun to read; fun to write
‣ Always up to date
‣ Exploit synergies
Everything in One Place
One
Document
Rationales
Comments
History
Architecture
Documentation
Tests
Test code
Test document
Interface
Reference document
Code
Q&A
Stack Overflow
FAQ
Tutorials
Tutorials
Textbook
Specification
Documentation
Implementation
Code file
33. Literate Programming
Donald J. Knuth, 1984
+ Rich documentation (TeX)
+ Web structure
- No examples
- No output
- No tests
- Not interactive
34. Rich Comments
Python Source Code, ~2010
+ Rich documentation (Markdown)
+ Edit as code
+ Tests
- All text
- Examples written by hand
- Output written by hand
- Not interactive
- No tutorial
35. Nbdev
Library for Python Projects using Jupyter Notebooks, 2020
+ Rich documentation (Markdown + HTML)
+ Edit as notebook
+ Edit as code
+ Integrated tests
- No writing philosophy
- Python-specific
37. Notebooks are not the Answer
• As of now, Notebooks (Jupyter) are still inadequate as an IDE
• Want refactorings, search + replace, code completion, links
• Want the document organized as a tree rather than linearly
• Want local executions that do not affect global state
• Need modularity and reuse across notebooks
• Python is not perfect for programming in the very large
38. Integrated Development Environments are not the Answer
• Today's development environments are all typewriter-centric
• Want rich documentation, interactivity, examples, tutorials…
• Want documents that capture all code aspects consistently, rather than a dozen fragments
39. Illustrated Code
An Agenda
➡ Give IDEs the power of rich documentation
➡ Give Notebooks the powers of IDEs – refactoring, search, etc.
➡ Have Notebooks support C, C++, Java, Rust, Go, Scala…
➡ Make Notebooks first-class citizens (e.g., import/include them)
➡ Make Notebooks collaborative with Q&As (à la Stack Overflow)
➡ Make tutorials, illustrations, examples a part of standard coding
➡ Do more research on what makes notebooks special/successful
40. Illustrated Code
Andreas Zeller
Illustrated Code
Building Software that Lasts
Rich Documents
Code Knowledge
Implementation
Code file
Rationales
Comments
History
Architecture
Documentation
Tests
Test code
Test document
Interface
Reference document
Code
Tutorials
Tutorials
Textbook
Specification
Documentation
How do I keep these consistent?
How do I obtain these in the first place?
Q&A
Stack Overflow
FAQ
Illustrated Code
Everything in One Place
in documents that are executable •
self-updating • self-checking
One
Document
Rationales
Architecture
Tests
Interface
Q&A
Tutorials
Specification
Implementation
@AndreasZeller