PySmbC: Contribute to Python C Extensions in Minutes
1. PySmbC:
C Modules are Easy
EuroPython 2012, 6th July - Firenze
Babel Srl P.zza S. Benedetto da Norcia, 33 0040, Pomezia (RM) – www.babel.it
Roberto Polli - roberto.polli@babel.it
2. What? Who? Why?
A story about how easy it is to contribute to a Python project using
GitHub and the Nose testing framework.
Roberto Polli - Community Manager @ Babel.it. Loves writing in C,
Java and Python. Red Hat Certified Engineer.
Fabio Isgrò – System Engineer @ Babel.it. Linux and Samba expert.
Babel – Proud sponsor of this talk ;) Delivers large mail infrastructure
based on Open Source software for Italian ISP and PA. Contributes
to various FLOSS.
3. Agenda - 1
Contributing to a Python C Extension is easier than you think
+ GitHub offers a fast way to fork open source projects and merge
patches into them
+ Writing and running unit and functional tests is easy with Nose
= You don't have to be a guru to support FLOSS!
4. Agenda - 2
SMB is the protocol used by Windows for sharing folders. Samba is its
FLOSS implementation.
smbclient library supports almost all SMB features. PySmbC wraps
some functions provided by this library.
To support extended permissions (ACL) we wrapped two more
functions: get_xattr and set_xattr.
You don't need to know SMB to extend PySmbC
5. GitHub – social coding
GitHub is a social coding platform.
● Clone my repository and fork the project
  https://github.com/ioggstream/pysmbc
● Patch and commit to your repository
● Push your changes to my repo
● Discuss for approval
● Merge my changes
Did you say...fork?
6. GitHub – social coding
Forking is the nightmare of every
FLOSS maintainer.
Sparse writes, increasing merge
efforts, lost changes.
GitHub, like the I/O Scheduler,
helps to merge writes!
– Tracks forks;
– Pushes changes to master.
7. GitHub – social coding
Patch and commit to your repository
● don't lose your changes;
● track your own history.
Push your changes to my repo
● no more .patch files;
● I get the change history.
Discuss for approval
● the GitHub portal supports code annotations.
Merge my changes
● use git to merge from a remote repository!
8. Enter PySmbC - Extensions
Python wrapper around libsmbclient: run C code from python
– enjoy a stable C library;
– just manage input, output and errors;
– usually faster than a pure python implementation.

rpolli$ ipython
In [1]: import smbc
In [2]: print(smbc.XATTR_ACL)
system.nt_sec_desc.acl
In [3]:

[Diagram: smbc.so wraps libsmbclient.so.0]
9. Example - Wrapping factorial() - 1
The wrapping function my_factorial():
● Parses and validates the input;
● Calls the wrapped function();
● Returns a python object.
A given structure maps python methods to C functions.
Now we can invoke a wrapped function!

wrapperfy.c

    // Python C Extension
    // uses factorial from fact.c
    #include <fact.h>

    // returns a python object!
    PyObject *my_factorial(...) {
        ...
        ret = factorial(n);
        ...
        // convert the C long into a python object
        return PyLong_FromLong(ret);
    }

    // Maps _wrapper.factorial
    // to my_factorial
    PyMethodDef BabelMethods[] = {
        {"factorial", my_factorial, ... },
        {NULL, NULL, 0, NULL} /*Sentinel*/
    };

    # python script
    import _wrapper
    print(_wrapper.factorial(4))
10. Example - Wrapping factorial() - 2
Parsing and validating Input and Output is fundamental. We don't want python
to SEGFAULT!
Create new exceptions in the initialization function.
Throw exceptions in the function:
● setting PyErr;
● returning NULL.

    // returns a python object!
    PyObject *my_factorial(..., *args) {
        // NULL indicates an error
        if (!PyArg_ParseTuple(args, "i", &n))
            return NULL;
        // 21! and above need more than 8 bytes
        if (n > 20) {
            ...
            PyErr_SetString(FactError, "Bad value");
            return NULL;
        }
        ...
        // convert the C long into a python object
        return PyLong_FromLong(ret);
    }

    PyObject *FactError;
    // in extension initialization...
    ...
    init_wrapper(void) {
        ...
        // define an exception
        FactError = PyErr_NewException("_wrapper.error", NULL, NULL);
        ...
    }
11. Example - Wrapping factorial() - 3
C Extension components:
● wrapping functions;
● method/function map;
● exceptions;
● initialization function.
Functions and exceptions should be static.
You have to track memory usage!

    // Python C Extension
    #include <Python.h>

    // exceptions
    PyObject *FactError;
    PyObject *FiboError;

    // functions
    PyObject *my_factorial(...);
    PyObject *my_fibonacci(...);

    // Function/Method Maps
    PyMethodDef BabelMethods[] = {
        {"factorial", my_factorial, ... },
        {"fibonacci", my_fibonacci, ... },
        {NULL, NULL, 0, NULL} /*Sentinel*/
    };

    PyMODINIT_FUNC
    init_wrapper(void)
    {
        PyObject *m;
        m = Py_InitModule("_wrapper", BabelMethods);
        // … Allocate Exceptions
        FactError = PyErr_NewException(...);
        FiboError = PyErr_NewException(...);
    }
12. Enter PySmbC - Modules
Python C extensions may enjoy both C and Python code.
Wrap the C extension in a Python module.
Extend the module with python classes.

    $PYTHONPATH/
        wrapper/
            __init__.py
            _wrapper.so
            helpers.py

    In [1]: import wrapper
    In [2]: assert wrapper.helpers
    In [3]: wrapper.helpers.is_integer(10)
13. Nose – let's contribute - 1
Before adding features to PySmbC we checked the project status
# git clone https://github.com/ioggstream/pysmbc .
# vim tests/settings.py # set samba credentials
# nosetests tests/
Nose - a Python tool whose nosetests script auto-discovers and runs test cases.
Wraps python-unittest.
Add new features only after successful tests. Verify your environment
(e.g. Samba credentials, directory ACLs)
14. Nose – let's contribute - 2
On successful tests, we can start developing
Follow the Git way: create a separate branch. We'll merge it on
success
# git checkout -b ioggstream_setxattr
Write the tests before writing the code. You'll be more focused on
your targets
With nosetest it's simpler than ever!
15. Nose – is like UnitTest
UnitTest

    from unittest import TestCase, main

    class MyTest(TestCase):
        def setUp(self):
            print("setup every")
        def tearDown(self):
            print("teardown every")
        def test_do(self):
            print("do 1")

    if __name__ == "__main__":
        main()

Nose

    import nose

    class MyTest:
        def setup(self):
            print("setup")
        def teardown(self):
            print("teardown")
        def test_do(self):
            print("do 1")

    # nose script will auto-discover
    # this script named test_script.py
16. Nose – is simpler than UnitTest
Nose: simple test

    # don't need to import nose
    # or define a class
    def setup():
        print("setup once for all tests")
    def teardown():
        print("teardown once for all tests")
    def test_do():
        print("do 1")
    def test_fail():
        assert False

Nose: annotations

    from nose import SkipTest, with_setup

    def pre(): print("setup")
    def post(): print("teardown")

    @with_setup(pre, post)
    def test_do():
        print("do 1")

    @SkipTest
    def test_dont():
        print("not done yet")
17. Nose – Invocation
You can run all your tests in a given directory
# nosetests ./path/
Or just one file
# nosetests ./path/test_sample.py
Or even a single test method
# nosetests ./path/test_sample.py:test_do1
Or a test class (suite), optionally setting the working directory
ex1# nosetests ./path/test_class.py:TestOne
ex2# nosetests -w ./path test_class:TestOne
For verbose output, without capturing stdout, just use:
# nosetests -sv [args]
18. PySmbC – add getxattr
Nose ensures that we're not going to break anything.
Start writing tests, not coding functionalities.
You can @SkipTest until new functions are ready.
Play with the wrapped functions.
Start with the simpler one: getxattr()
● embed C constants into python;
● test good values;
● check behavior with bad values.
Code until tests are successful.

    # from test_context.py
    def test_xattr_constants():
        '''reuse variable defined
        in smbclient.h'''
        assert smbc.XATTR_ACL
        assert smbc.XATTR_OWNER
        assert smbc.XATTR_GROUP

    def test_xattr_get():
        '''test xattr with all
        possible values'''
        . . .
        for xa in valid_xatts:
            assert ctx.getxattr(url, xa)

    def test_xattr_get_error():
        '''xattr_get should
        recognize bad values'''
        . . .
        for xa in invalid_xatts:
            try:
                ctx.getxattr(url, xa)
                assert False
            except RuntimeError as e:
                . . . #get errno
                assert errno == EINVAL
19. PySmbC – add setxattr and futures
Helper methods for parsing and creating ACL

    attrs_new = u'REVISION:1'
        + ',OWNER:RPOLLIbabel'
        + ',GROUP:Unix Groupbabel'
        + ',ACL:RPOLLIbabel:0/0/0x001e01ff'
        + ',ACL:Unix Groupbabel:0/0/0x00120089'
        + ',ACL:Unix Groupgames:0/0/0x001e01ff'
        + ',ACL:Everyone:0/0/0x00120089'

Shift from smbc.so to smbc module:
● smbc/_smbc.so
● smbc/__init__.py
● smbc/helper.py

    # from test_context.py
    def test_xattr_set():
        . . .
        ctx.setxattr(url, a_name, attrs_new, REPLACE)
        attrs_1 = ctx.getxattr(url, a_name)
        assert attrs_1 == attrs_new

    def test_xattr_set_error():
        '''setxattr should
        recognize bad values'''
        . . .
        for xa in invalid_xatts:
            try:
                ctx.setxattr(url, a_name, xa, REPLACE)
                assert False
            except RuntimeError as e:
                . . . #get errno
                assert errno == EINVAL
            except TypeError:
                pass
Hi everybody, I'm Roberto Polli from Babel and I'm going to tell you a python story. A story about how easy it is to contribute to a python project even if you're a py-noob like me, if you just use all the tools that the FLOSS world gives you. Before starting I'd like to thank Babel, the proud sponsor of this talk, for letting me play with this project even after I finished the job. Babel delivers large mail infrastructures for ISP, PA and GOV, using and contributing to open source software. This story is just a drop in the sea of Babel's various contributions to FLOSS.
People love FLOSS, mostly because it's free as in beer. Many companies think that contributing is costly and useless, or they simply don't have the knowledge required to mess with the code. But today there are many technologies that should make them think again, because the hardest part is getting started. With a social coding platform like GitHub and a clean testing process, it's easy to write working code and contribute it back. You don't have to be a guru to contribute!
So, what's GitHub? Is there anybody here who doesn't know GitHub? Raise your hand! The standard FLOSS development pattern was based on checking out from the main repository and sending patches to a mailing list. The GitHub one is based on forks. [read the slide] Did you say fork???
Forking is, historically, the nightmare of FLOSS maintainers. It fragments the user base and splits development efforts. The first image shows various forks. Merging patches back is hard, and so is applying fixes from the master branch to the forks. So the children of a fork are sparse writes, merge difficulties and lost changes. GitHub acts like the Linux I/O scheduler, helping us to merge writes! It does so by tracking forks and by letting us review, push and merge changes from forks to master. When I commit a change on my repo, possibly a small one such as a fix, I can push it to the master branch. The maintainer can validate it, possibly adding some more tests, and then apply it. At this point I just have to merge my branch with the master one, which now contains my patch!
Here are some GitHub features: [read slide]
GitHub was essential for our contribution. But first we had to understand how pysmbc works. The Linux smbclient library implements almost all SMB functionality and is continuously maintained by the Samba team, so writing a pure python implementation of the SMB client protocol would have been redundant! Python has a nice feature: C Extensions. They are a way to wrap existing C functions in a python module. The main advantages are listed on the slide, while a drawback is that the resulting module is platform-dependent. ipython is a python console supporting GNU readline and many other nice features. To use the C Extension smbc.so, which wraps libsmbclient.so, just set the PYTHONPATH, run "import smbc" and enjoy all the features of smbc.
Here is a code example of wrapping a simple factorial() C function. The C Extension is a C file named e.g. wrapperfy.c. The wrapper function is my_factorial(): it parses the arguments received from python, calls the wrapped factorial() function, and returns a PyObject, in this case a Python long. To associate the my_factorial() function with a python method we need a map: the PyMethodDef variable contained in wrapperfy.c. Once wrapperfy.c is built into _wrapper.so with gcc, we can play again with ipython: "import _wrapper", then "_wrapper.factorial(4)".
When writing a C Extension we should be careful to avoid memory leaks and segmentation faults. All the memory management is up to you! A good starting point is to validate memory areas and variable contents, adding new exceptions to our module and throwing them when issues could arise. This is done by: properly setting the python error and stack trace variables with the PyErr_ functions, which are similar to the errno C variable; returning NULL from the function. In our example, we created an exception in the initialization function of the extension. Then we raise it using PyErr_SetString() when factorial() receives an invalid parameter.
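Seen from the Python side, the validation steps described above could behave roughly like the sketch below. It is a pure-Python stand-in, not the real C wrapper: FactError mirrors the exception from the slide, while MAX_N and the check for negative numbers are assumptions added for illustration.

```python
import math

# Largest n whose factorial still fits in a signed 64-bit integer.
MAX_N = 20


class FactError(Exception):
    """Stand-in for the exception created with PyErr_NewException()."""


def my_factorial(n):
    # Rough counterpart of PyArg_ParseTuple(args, "i", &n): reject non-integers.
    if not isinstance(n, int):
        raise TypeError("an integer is required")
    # Counterpart of PyErr_SetString(FactError, ...) followed by return NULL.
    # The n < 0 check is an extra assumption, it is not on the slide.
    if n < 0 or n > MAX_N:
        raise FactError("Bad value")
    return math.factorial(n)
```

A caller never sees a crash or a silently overflowed number: my_factorial(4) returns 24, while my_factorial(21) raises FactError.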
The main sections of our wrapper module are: - wrapper functions - exceptions - function/method map - initialization function. All functions should be static, that is, visible only inside the C file that defines them and not usable outside it. And, as I said above, you have to track memory usage!
We can even enjoy python code by wrapping the extension in a module. The convention is to prepend an underscore "_" to the name of the C extension library: wrapper.so becomes _wrapper.so. In this way we can create a module named "wrapper". The slide shows the module directory structure. It's initialized by __init__.py, which includes: all the _wrapper methods; the methods contained in further .py files. This could be the future layout of pysmbc, letting us add further python methods to manage e.g. xattr strings.
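The package layout can be tried out without compiling anything. The sketch below builds a throwaway "wrapper" package in a temporary directory, with a pure-Python _wrapper.py standing in for the compiled _wrapper.so. The file contents are assumptions made for illustration; only the names wrapper, _wrapper, helpers and is_integer come from the slide.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway "wrapper" package in a temporary directory.
root = tempfile.mkdtemp()
pkg = os.path.join(root, "wrapper")
os.mkdir(pkg)

files = {
    # Stand-in for the compiled _wrapper.so (assumption: pure Python here).
    "_wrapper.py": "def factorial(n):\n    return 1 if n < 2 else n * factorial(n - 1)\n",
    # Extra pure-Python helpers living next to the extension.
    "helpers.py": "def is_integer(value):\n    return isinstance(value, int)\n",
    # __init__.py re-exports the extension functions and loads the helpers.
    "__init__.py": "from ._wrapper import factorial\nfrom . import helpers\n",
}
for name, source in files.items():
    with open(os.path.join(pkg, name), "w") as handle:
        handle.write(source)

# Make the temporary directory importable, then import the package.
sys.path.insert(0, root)
importlib.invalidate_caches()
wrapper = importlib.import_module("wrapper")

print(wrapper.factorial(4))            # 24
print(wrapper.helpers.is_integer(10))  # True
```

With a real extension, only _wrapper.py would be replaced by the compiled _wrapper.so; __init__.py and helpers.py would stay plain Python.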
Now that we understand how C extensions and modules work, let's see how to get our work accepted by maintainers. Once you clone (check out) the project, you have to set up your environment, in our case a Samba server, so that all tests are successful. So check the credentials, the workgroup and the whole environment! If you care about tests, the maintainer will care about your work! Running tests on pysmbc is easy. Just run nosetests against the test directory. Nose will browse the directory and discover all your test cases. When all tests are ok, you can start developing!
Before coding, branch your code tree and give the branch a name that describes the target of your changes. I used ioggstream_setxattr, so that the maintainer could easily find out who made the changes and what they were about. Branching lets us preserve the original tree and merge the "master" with the new changes made on the maintainer's repository. Writing tests before code is even simpler with Nose!
Does everybody know something about unit testing? The standard python library for unit testing is python-unittest. It lets you write test classes with setup/teardown methods. Nose works in a similar way. With the nosetests script you can save some bytes and avoid writing main(). And you can even run unittest programs.
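For reference, here is a minimal, self-contained unittest example in the spirit of the slide. The calls list and the run_suite() helper are made up for this sketch, to show the order in which the hooks run without calling main().

```python
import unittest

calls = []  # records the order in which the hooks run


class MyTest(unittest.TestCase):
    def setUp(self):
        # Runs before every test method.
        calls.append("setup")

    def tearDown(self):
        # Runs after every test method.
        calls.append("teardown")

    def test_do(self):
        calls.append("do 1")
        self.assertEqual(1 + 1, 2)


def run_suite():
    """Run MyTest without unittest.main(), which would call sys.exit()."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(MyTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_suite()
```

A test collector such as nosetests should pick up the same class on its own, so run_suite() is only needed when running the file directly.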
Nose simplifies test writing. In many cases you don't even need to import nose. Just write your test functions with names beginning with test_. Moreover, it supports basic helpers like @SkipTest and @with_setup to skip tests and to customize their setup and teardown.
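To make the idea of per-test fixtures and name-based discovery concrete, here is a toy sketch that needs no external package. with_fixture() and discover() are simplified, hypothetical stand-ins written for this example; they are not nose's actual implementation of with_setup or of its test collector.

```python
import functools

log = []


def with_fixture(setup=None, teardown=None):
    """Simplified stand-in for nose's with_setup (not its real code)."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper():
            if setup:
                setup()
            try:
                func()
            finally:
                # Always clean up, even when the test body fails.
                if teardown:
                    teardown()
        return wrapper
    return decorate


def pre():
    log.append("setup")


def post():
    log.append("teardown")


@with_fixture(pre, post)
def test_do():
    log.append("do 1")


def discover(namespace):
    """Collect callables named test_*, as a test collector would."""
    return sorted(name for name, obj in namespace.items()
                  if name.startswith("test_") and callable(obj))


names = discover(globals())
for name in names:
    globals()[name]()
```

The naming convention does all the work: any function whose name starts with test_ is found and run, with its fixture wrapped around it.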
The nosetests script is quite flexible. You can run all tests, or select a single file, class or even a single test. Everything works via the command line and is scriptable. By default nosetests captures whatever tests write to stdout. You can disable that with -s, and enable verbose output with -v.
Nose ensures that we're not going to break anything. The next step is to write some test cases for the new functionality. Writing tests lets us focus on the expected input and output, and requires us to read the documentation of the wrapped library. We started with the simpler one: get_xattr. T1 - check the existence of all the new constants. T2 - test get_xattr with invalid parameters. T3 - test get_xattr with valid parameters.
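The same test plan can be rehearsed without a Samba server. In the sketch below FakeContext is a hypothetical stand-in for the real smbc context, the URL is made up, and the shape of the error (a RuntimeError carrying errno as its first argument) is an assumption based on the tests shown on the slide.

```python
import errno
import unittest

XATTR_ACL = "system.nt_sec_desc.acl"  # value printed on the ipython slide


class FakeContext:
    """Hypothetical stand-in for the smbc context; no server needed."""

    def getxattr(self, url, name):
        if name != XATTR_ACL:
            # Assumption: errors surface as RuntimeError(errno, message).
            raise RuntimeError(errno.EINVAL, "Invalid argument")
        return "REVISION:1"


class XattrTest(unittest.TestCase):
    url = "smb://server/share/file.txt"  # made-up URL

    def setUp(self):
        self.ctx = FakeContext()

    def test_xattr_get(self):
        # Good value: a non-empty attribute string comes back.
        self.assertTrue(self.ctx.getxattr(self.url, XATTR_ACL))

    def test_xattr_get_error(self):
        # Bad value: the call must fail with EINVAL instead of crashing.
        with self.assertRaises(RuntimeError) as caught:
            self.ctx.getxattr(self.url, "not.a.valid.attribute")
        self.assertEqual(caught.exception.args[0], errno.EINVAL)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(XattrTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Once the real getxattr() exists, the fake can be swapped for a real context and the same assertions become the acceptance test for the new function.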
setxattr is not user friendly. You have to create ACL strings by hand. The slide shows a simple ACL with a file accessible from two different groups. Python is really good at handling strings, so I started to work on a python class to manage ACLs. To include it in pysmbc I had to reorganize the pysmbc structure on my branch. The test classes are still the same, living in the same path, so I'm sure I'm not going to break anything. I'm currently discussing the new layout of the module with the maintainer.
That's all. Your questions are welcome! I'll leave just a few links, in case you're interested in contributing to PySmbC or just want to find some useful posts on Linux and coding on the Babel company blog, vaunaspada.
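As an illustration of what such a helper might do, here is a minimal sketch that parses and rebuilds ACL strings shaped like the one on the slide. parse_acl() and format_acl() are hypothetical names, not part of pysmbc, and reading each ACL entry as trustee:type/flags/mask is an assumption.

```python
def parse_acl(text):
    """Split an ACL string into revision, owner, group and entries."""
    parsed = {"acl": []}
    for item in text.split(","):
        key, _, value = item.partition(":")
        if key == "ACL":
            # value is assumed to look like "<trustee>:<type>/<flags>/<mask>"
            trustee, _, rights = value.rpartition(":")
            ace_type, flags, mask = rights.split("/")
            parsed["acl"].append(
                (trustee, int(ace_type), int(flags), int(mask, 16)))
        else:
            parsed[key.lower()] = value
    return parsed


def format_acl(parsed):
    """Inverse of parse_acl: rebuild the comma-separated string."""
    parts = ["REVISION:%s" % parsed["revision"],
             "OWNER:%s" % parsed["owner"],
             "GROUP:%s" % parsed["group"]]
    for trustee, ace_type, flags, mask in parsed["acl"]:
        parts.append("ACL:%s:%d/%d/0x%08x" % (trustee, ace_type, flags, mask))
    return ",".join(parts)


sample = ("REVISION:1,OWNER:RPOLLIbabel,GROUP:Unix Groupbabel"
          ",ACL:Everyone:0/0/0x00120089")
acl = parse_acl(sample)
```

Keeping the parser and the formatter as exact inverses makes them easy to test: format_acl(parse_acl(s)) should give back s for any string in this shape.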