Open Source Software Licence Compliance: Art or science?

  1. Open Source Software Licence Compliance: Art or science? Andrew Katz, CEO, Orcro Limited and Partner, Moorcrofts LLP
  2. The goal for OSS licence compliance… • Tooling which • Knows all the components you are using; • Knows exactly which licence is applicable to each component (including exceptions); • Can automatically generate a complete set of compliance artefacts; • Understands the terms you are trying to out-license on; • Understands the use case (distributable, app store, SaaS, containerized, embedded); • Generates build scripts and installation instructions where necessary; • Suggests remediation for potential problems; • Provides a repository which contains a set of pre-approved components; • Refuses to build, or generates a build error, when a non-compliant component is present.
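The last two goals on this slide (a set of pre-approved components, and a build that fails on non-compliance) can be sketched in a few lines. This is a minimal illustration, not a description of any existing tool: the `APPROVED_LICENCES` policy, the `Component` type and the `gate_build` function are all hypothetical names, and a real tool would also need to handle licence exceptions and the use case.

```python
from dataclasses import dataclass

# Hypothetical policy: licences pre-approved for this product's use case.
APPROVED_LICENCES = {"MIT", "BSD-3-Clause", "Apache-2.0"}


@dataclass(frozen=True)
class Component:
    name: str
    licence: str  # SPDX licence identifier, or "" when the licence is unknown


class ComplianceError(Exception):
    """Raised to stop a build that contains non-approved components."""


def check_components(components):
    """Return (name, licence) for every component that is not pre-approved."""
    return [
        (c.name, c.licence or "UNKNOWN")
        for c in components
        if c.licence not in APPROVED_LICENCES
    ]


def gate_build(components):
    """Fail loudly if any component falls outside the approved set."""
    violations = check_components(components)
    if violations:
        details = ", ".join(f"{name} ({licence})" for name, licence in violations)
        raise ComplianceError(f"non-compliant components: {details}")
```

Treating an unknown licence as a violation is a deliberate choice in this sketch: a component whose licence cannot be identified is handled the same way as one whose licence is not approved.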
  3. The goal for OSS licence compliance… • Component sources which • Correctly identify the applicable licences • Correctly identify relevant dependencies • Correctly identify relevant dependencies for that particular build configuration • Have a reference implementation with correct compliance artefacts • We are confident do not have included code from elsewhere • Contain easily machine readable metadata covering the above
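One widely used form of machine-readable licence metadata is the `SPDX-License-Identifier` tag placed in a comment near the top of a source file. The sketch below shows how little code is needed to read it; the function name `find_spdx_expression` and the 20-line search window are assumptions made for illustration, and the returned expression is not validated against the SPDX licence list.

```python
import re

# Matches the standard tag and captures whatever follows it on the same line.
SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*(.+)")


def find_spdx_expression(source_text, max_lines=20):
    """Return the SPDX licence expression declared near the top of a file.

    Returns None when no tag is found in the first `max_lines` lines.
    """
    for line in source_text.splitlines()[:max_lines]:
        match = SPDX_TAG.search(line)
        if match:
            expression = match.group(1).strip()
            # Drop a trailing comment terminator such as "*/" or "-->".
            for closer in ("*/", "-->"):
                if expression.endswith(closer):
                    expression = expression[: -len(closer)].strip()
            return expression
    return None
```

A file that returns `None` here is exactly the kind of component that needs manual review or retrofitting.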
  4. Challenges • The scale of the problem. An embedded system may contain over 100k components. • Legacy code does not use SPDX/REUSE etc. Much is being retrofitted, but sometimes it’s impossible. • Technology develops faster than compliance. Law develops more slowly than both. • Containerization: where are all the components coming from? • Orchestration (e.g. Kubernetes) compounds this. • What is “distribution”? • If you are not distributing to your client, but Docker is (for example), who must comply? • What about secondary liability? • An outsourced developer will frequently provide the compliance artefacts necessary to distribute the software to you… but insufficient information for you to distribute to your end users.
  5. Real issues: examples from Linux Kernel File: arch/m68k/mac/config.c * Much of this was defined by Alan, based on who knows what docs. File: arch/arm/kernel/sys_arm.c * Copyright (C) People who wrote linux/arch/i386/kernel/sys_i386.c (The file sys_i386.c no longer exists in the source tree.) File: drivers/staging/rtl8192e/rtllib_softmac.c * WPA code stolen from the ipw2200 driver. * Copyright who own it's copyright. File: drivers/staging/rtl8192e/rtllib_softmac_wx.c * Some pieces of code might be stolen from ipw2100 driver * copyright of who own it's copyright ;-)
  6. Real issues: examples from Linux Kernel File: arch/alpha/kernel/smc37c669.c * This software is furnished under a license and may be used and copied... (No licence is specified - hopefully it is compatible with the GPL) Some typically unclear attributions/copyright notices: File: arch/powerpc/platforms/chrp/pegasos_eth.c * And anyone else who helped me on this (Following a set of attributions) File: arch/um/drivers/daemon_kern.c * Copyright (C) 2001 by various other people who didn't put their name here.
  7. Real examples: GPL Overreach SQLMap: GPL, but with an extended definition of “derivative work” which includes any software which * Executes sqlmap and parses the results (as opposed to typical shell or execution-menu apps, which simply display raw sqlmap output and so are not derivative works).
  8. Code interaction – 3 axes of compliance To what extent must GPL code interact with other code to trigger the requirement for the other to be released under GPL? 1. How closely are the components combined? 2. How is the code delivered (distributed) to the user? 3. What sort of interface does the interaction use?
  9. How closely are the components combined? a. running the two components on separate computers b. running the two components on separate virtual machines c. running the two components in separate threads or processes d. running the two components sequentially in the same thread or process (e.g. as a plug-in) e. running the two components dynamically linked f. running the two components statically linked g. combining the two components by inserting code from one into the other (e.g. copy-paste)
  10. How is the code delivered to the user? a. two components delivered separately at separate times, with each download initiated by the end-user b. two components delivered at the same time, with the end-user explicitly accepting download of the copyleft component c. two components delivered simultaneously without the end-user’s explicit involvement, but clearly separable within the downloaded package d. two components delivered simultaneously and pre-linked e. two components merged into one, inseparably.
  11. How do the components communicate? a. two components communicate through command-line interface or some form of pre-existing inter process communication (e.g. SMTP, or pipes) b. two components communicate through an API which is publicly published and which pre-exists the copyleft component c. two components communicate through an API which is private but which can be demonstrated to pre-exist the copyleft component d. two components communicate through an API defined by the copyleft component e. two components are dynamically linked f. two components are statically linked g. two components are combined into a single executable, inseparably
  12. So, do we need to apply GPL or not? 1. Establish how the interaction operates on all 3 axes. 2. On each axis, the higher up the list the interaction sits, the lower the likelihood of a problem. 3. Architect your application with this in mind. 4. This is not only a question of minimizing the possibility that you are infringing GPL (which is something only a judge can decide). It’s also about making it more difficult for a copyright holder to claim that you are.
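The three-axis assessment above can be recorded in a simple data structure, which makes it easier to compare architectural options side by side. The sketch below is an illustration only: the `Interaction` type and its methods are hypothetical, the letters refer to the options on slides 9 to 11, and the resulting numbers are ordinal positions in those lists, not a measure of legal risk.

```python
from dataclasses import dataclass

# Option letters from the three slides, ordered from top (least coupled) down.
COMBINATION = "abcdefg"  # slide 9: how closely are the components combined?
DELIVERY = "abcde"       # slide 10: how is the code delivered to the user?
INTERFACE = "abcdefg"    # slide 11: how do the components communicate?


def _position(option, scale):
    """Map an option letter to 0.0 (top of the list) .. 1.0 (bottom)."""
    return scale.index(option) / (len(scale) - 1)


@dataclass(frozen=True)
class Interaction:
    combination: str
    delivery: str
    interface: str

    def positions(self):
        """Relative position of this interaction on each of the three axes."""
        return (
            _position(self.combination, COMBINATION),
            _position(self.delivery, DELIVERY),
            _position(self.interface, INTERFACE),
        )

    def worst_axis(self):
        """The axis furthest down its list drives the overall concern."""
        return max(self.positions())


# Separate processes, separately downloaded, talking over a pipe.
loosely_coupled = Interaction(combination="c", delivery="a", interface="a")
# Dynamically linked and shipped pre-linked.
tightly_coupled = Interaction(combination="e", delivery="d", interface="e")
assert loosely_coupled.worst_axis() < tightly_coupled.worst_axis()
```

Taking the maximum rather than an average reflects one possible reading of the slides: a single closely coupled axis is enough to warrant a closer look, however loose the other two are.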
  13. General issues here: 1. The scope of secondary copyright liability (if I encourage, provide instructions for, or automate a process which means that I am not distributing the code, but the end-user is, am I exposed to secondary copyright liability?) 2. Is this true, even if what the end-user is doing is perfectly lawful in itself? 3. (Note: there are several, mainly US, cases around Napster, Grokster etc.) 4. Thought experiment: it was revealed some years ago that certain Intel CPUs contain a Management Engine running a variant of Minix. 1. If that software contained infringing components, could you, as the person owning and switching on the computer, be held liable? 2. Liability for copyright infringement is not dependent on knowledge or intent. Knowing about the IME, is that fair? 5. Conclusion: the main area where compliance is not being helped is the law.
  14. Where to go from here? 1. Make it easier to find components with known licences, and better-defined dependencies - SPDX, REUSE, GitHub Licence Chooser 2. Clarify existing licence terms for older components - Please relicense to better known licences (and not to fauxpen licences) - Make the licensing information more consistent (e.g. Maven Central Repository) 3. Provide reference implementations, and example compliance materials - OpenChain, Oniro 4. But what about the law?
  15. Blue Sky? How licensing works (in theory): • Developer A decides what rights they want to grant users, for all possible use cases. • Developer A works with legal advisers to either select a licence or draft a new one (!) • B decides they want to use A’s code, reads the licence (possibly with the help of the legal department). • B uses the code. • A thinks that B is infringing. An argument involving lawyers ensues. • A judge decides who’s right.
  16. Blue Sky? How licensing works (in practice): • Developer A decides to write some software, and wants to open source it. • A (hopefully) picks an existing licence which seems to do what they want, for most use cases. Possibly with the help of the legal dept. • B decides they want to use A’s code, and uses it in a way which is generally understood as acceptable (possibly with the help of the legal department). • B uses the code. • A thinks that B is infringing. An argument involving lawyers ensues. • A judge, who probably doesn’t know about FOSS, decides who’s right.
  17. Ultimately… A developer may have a pretty good idea of what permissions they want attached to their code, and what outcome they want for certain code combinations…. … but that clarity will be destroyed because the message is filtered through lawyers, the uncertain legal system (which differs between jurisdictions) and the ultimate arbiter is a judge, who may well know nothing about FOSS.
  18. Is there a solution? • The law already acknowledges that legal rights and obligations can be created without using natural language (e.g. derivatives). • Is it possible to remove the requirement that natural language is the only medium by which the rights and obligations of FOSS licences can be determined? • Can developers select a set of rights they want to apply to their code, and then an agreed algorithm determines how software interactions and compliance materials must be generated? • Can compliance (or at least a subset of it) become deterministic? • This won’t be perfect, but a developer may well settle for it being right 95% of the time.
  19. Such an algorithm… • Must provide consistent and reproducible results. • Must have a mechanism for being updated as practice evolves. Challenges • How to deal with fair use/fair dealing? • As law becomes code (in the Lessig sense), how do we guard against bad actors manipulating governance or the algorithm? • How do we transition existing code into the algorithmic licence?
  20. Conclusion • Compliance is getting more complex as the sheer number of components increases. • New technologies (containerization etc.) compound this. • Compliance MUST be automated. There is great progress in this area. OpenChain, SPDX, REUSE frameworks, and technology from FOSSology, ScanCode, SW360 and many more, including proprietary vendors, are making huge strides. • However, compliance is still, in part, an art, not a science. • The essence of open source is reducing friction. Can we reduce friction further by looking at the licensing process itself?