“State of the Tooling” in Open Source Automation

OpenChain Program Manager at The Linux Foundation
31 Mar 2023

  1. “State of the Tooling” in Open Source Automation
     OpenChain German work group
     Philippe Ombredanne, AboutCode.org, nexB Inc.
     Copyright © nexB Inc. License: CC-BY-SA-4.0
  2. Philippe Ombredanne
     ► Project lead and maintainer for VulnerableCode, ScanCode and AboutCode
     ► Creator of Package URL, co-founder of SPDX & ClearlyDefined
     ► FOSS veteran, long-time Google Summer of Code mentor
     ► Co-founder and CTO of nexB Inc., makers of DejaCode
     ► Weird facts and claims to fame
       ● Signed off on the largest deletion of lines of code in the Linux kernel (but these were only comments)
       ● Unrepentant code hoarder. Had 60,000+ GitHub forks, now down to only 20K forks
     ► pombredanne@nexb.com irc:pombreda
  3. Why open source compliance tooling?
     ▷ Because open source for open source: This is the way!
       ● Dogfooding
     ▷ Free as in beer and freedom of course
       ● Code of course, but do not forget the data!
     ▷ Key to enable right-sized automation for your open chain
     ▷ Best-in-class tools in several areas
  4. Key trends (1) Time to retool?
     ▷ 3rd wave of compliance tooling creation and adoption underway
       ● 1st wave was commercial
       ● 2nd wave was centered on license compliance and legal
       ● 3rd wave will be centered on developers and appsec
         ■ Eventually balanced and holistic FOSS solutions
     ▷ TODO: Review your existing approach and retool
  5. Key trends (2)
     ▷ Security is top of mind
       ● SBOMs are everywhere, but for what? Few can process them
     ▷ And license compliance is not yet solved
       ● Still a lot of work left for automation
       ● Emerging scripting platforms to capture your pipelines
         ■ Orchestrate many tools
     ▷ Open data and data sharing will happen
       ● Everybody wants it, but also everyone wants to control it
       ● Centralized or decentralized?
  6. Key trends (3)
     ▷ Software health, quality, sustainability are not yet on the radar
     ▷ FOSS GUI/Web apps are still badly missing
     ▷ Slowly the analysis of builds and binaries will displace source-only scans
     ▷ Dependency tracking is not yet solved at scale
  7. Key trends (4) Best tools are FOSS
     ▷ The leading tools are mostly FOSS first
       ● License detection
       ● Container analysis
       ● Package detection
       ● Dependency tracking and resolution
     ▷ But BEWARE
       ● Lots of tools are shallow and look only skin deep
         ■ Barely suitable for serious license or security work
       ● Do your homework and try the tools: they are open after all
  8. Key trends (5) Poor data quality
     ▷ Vulnerability and package databases are the new rush
       ● Open or commercial vulnerability databases with supposedly "premium" content
       ● But BEWARE of the data quality. Size DOES NOT matter.
         ■ Made up packages, made up versions
         ■ Not worth their price: Compare and include open solutions!
     ▷ Every commercial tool now includes license data
       ● License data derived from package manifest is NOT ENOUGH
       ● Built-in policies are impractical: is GPL always bad??
  9. Key trends (6) PURL is the essential glue
     PURL is emerging as the glue to avoid lock-in!
     ● Started with support for package ids in ScanCode and VulnerableCode, now everywhere
       ○ CycloneDX
       ○ SPDX, including the just-released GitHub SPDX SBOM features
       ○ Google OSV
       ○ Sonatype OSSIndex
       ○ New PurlDB, MatchCode
       ○ Most FOSS tools such as ORT, Fosslight, DependencyTrack, Anchore, Tern and most of the open (and proprietary) SCA and Infosec/Appsec tools
     ● Coming to the NVD in version 5.1!!
     ● Key vector for interop: if two tools speak PURL, integration is made easier
     ● Demand its adoption by your vendors and projects
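The interop point on this slide can be illustrated with a small sketch. The parser below is a simplified reading of the `pkg:type/namespace/name@version?qualifiers#subpath` layout, written for illustration only: it skips the per-type normalization rules of the Package URL specification, and the `Purl` and `parse_purl` names are hypothetical. Real pipelines would more likely rely on a maintained library such as `packageurl-python`.

```python
from typing import NamedTuple, Optional
from urllib.parse import unquote


class Purl(NamedTuple):
    """Hypothetical container for the parts of a Package URL."""
    type: str
    namespace: Optional[str]
    name: str
    version: Optional[str]
    qualifiers: dict
    subpath: Optional[str]


def parse_purl(purl: str) -> Purl:
    """Split a PURL string into its parts (simplified, not spec-complete)."""
    scheme, sep, rest = purl.partition(":")
    if not sep or scheme.lower() != "pkg":
        raise ValueError(f"not a PURL: {purl!r}")
    rest = rest.lstrip("/")

    # Optional trailing parts: '#subpath' first, then '?qualifiers'.
    rest, _, subpath = rest.partition("#")
    rest, _, query = rest.partition("?")

    # The version, if any, follows the last '@'.
    if "@" in rest:
        path, _, version = rest.rpartition("@")
    else:
        path, version = rest, ""

    ptype, _, remainder = path.partition("/")
    segments = [unquote(s) for s in remainder.split("/") if s]
    if not ptype or not segments:
        raise ValueError(f"PURL needs a type and a name: {purl!r}")

    qualifiers = {}
    for pair in query.split("&"):
        key, eq, value = pair.partition("=")
        if eq and value:
            qualifiers[key.lower()] = unquote(value)

    return Purl(
        type=ptype.lower(),
        namespace="/".join(segments[:-1]) or None,
        name=segments[-1],
        version=unquote(version) or None,
        qualifiers=qualifiers,
        subpath=unquote(subpath) or None,
    )


if __name__ == "__main__":
    print(parse_purl("pkg:maven/org.apache.commons/commons-lang3@3.12.0?type=jar"))
```

Because the parsed value is a plain tuple, records from two different tools (say, a license scan and a vulnerability feed) can be joined on it directly, which is the kind of integration the slide has in mind.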
  10. Key insights (1): Share the data!
     "I would like to have automation to avoid repeat work when re-running tools"
     "Let's avoid re-running scans, share them and reuse them instead"
     ● Everyone wants to share and reuse data from scans, and origin and license data
       ○ Speed up origin and license review
       ○ Avoid redoing the scans and the same review either inside my org or across orgs
     ● But "It is hard to overcome lawyers’ objections to sharing data such as license conclusions and curations"
     ● And how to trust the scans and curations? And deal with different policies and standards for conclusions and curations? (specifically about licensing)
     ● What is the motivation and ease for public data sharing?
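One way to picture "avoid re-running scans, reuse them instead" is a result store keyed by file content. The sketch below is an assumption-laden illustration, not how any tool named in this deck works: `scan_with_reuse`, the store layout, and the `scanner` callable are all hypothetical, and the scanner is assumed to return JSON-serializable data.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scan_with_reuse(path: Path, scanner: Callable[[Path], dict], store_dir: Path) -> dict:
    """Run `scanner` on `path` unless a result for identical content is stored.

    `scanner` is a hypothetical callable standing in for any scan tool;
    `store_dir` is a hypothetical shared directory of earlier results.
    """
    store_dir.mkdir(parents=True, exist_ok=True)
    cached = store_dir / f"{sha256_of(path)}.json"
    if cached.exists():
        # Identical bytes were scanned before: reuse the stored result.
        return json.loads(cached.read_text(encoding="utf-8"))
    result = scanner(path)
    cached.write_text(json.dumps(result, sort_keys=True), encoding="utf-8")
    return result
```

Keying on a content hash means the stored result stays valid wherever the same bytes appear, inside one organization or across several. It does not address the trust and policy questions raised on the slide, which are about who produced the result and under which review standard.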
  11. Key insights (2): Open the data!
     ● Open data (e.g., as in free and open licensed data on FOSS) are emerging
       ○ The too big to share argument will not hold
     ● Eventually open, community curated FOSS package "knowledge bases" will become the norm and supplant proprietary, closed source alternatives
     ● We should share raw scanners/tools outputs first
     ● We should fix upstream licensing issues, upstream
     ● The centralized approach does not work well
       ○ Too big to share
       ○ Out of date
       ○ Lack of trust in centralized control
  12. Key insights (3) Licensing != Security?
     License and Vulnerability are like oil and vinegar
     ● Even if core process is code origin determination, constituents are not the same (yet)
       ○ License folks care less about Vulnerabilities
       ○ Security folks care less about Licenses
     ● FOSS projects that cater to both should provide differentiated documentation for each audience
     ● Some core tools are the same, but users are different
     ● Expect a convergence of the two aspects in the future
     ● Until then, advice to OSPOs:
       ○ Handle both domains
       ○ But adapt your language to each constituent/persona
  13. Key insights (4) License Compatibility
     Multiple FOSS projects try to solve license compatibility
     ● FLICT, OSADL, Hermine Oniro
     ● Automating license conflicts/compatibility checks is a real problem at scale
     ● Projects may work together and eventually some conventions will emerge
     ● Key domains
       ○ Help legal understand/zoom in on key license concerns
       ○ What is the effect of multiple licenses?
       ○ How to surface license compatibility issues
     ● Effective/resulting license inference and compatibility is a policy issue
       ○ But tooling can automate the grunt work
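The split between policy and grunt work can be sketched as follows: the policy is a table that humans maintain, and the tooling only enumerates license pairs and looks them up. Everything here is hypothetical, including the `find_conflicts` name and the two placeholder entries in `POLICY_CONFLICTS`, which stand in for whatever an organization's legal team decides. This is not the approach of FLICT, OSADL, or Hermine.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, List, Set, Tuple

# HYPOTHETICAL policy table: unordered pairs of SPDX license ids that a
# given organization has decided to flag. Placeholder entries only.
POLICY_CONFLICTS: Set[FrozenSet[str]] = {
    frozenset({"Apache-2.0", "GPL-2.0-only"}),
    frozenset({"CDDL-1.0", "GPL-2.0-only"}),
}


def find_conflicts(
    licenses: Iterable[str],
    conflicts: Set[FrozenSet[str]] = POLICY_CONFLICTS,
) -> List[Tuple[str, str]]:
    """Return every pair of detected licenses that the policy table flags."""
    unique = sorted(set(licenses))
    return [
        (a, b)
        for a, b in combinations(unique, 2)
        if frozenset({a, b}) in conflicts
    ]


if __name__ == "__main__":
    detected = ["MIT", "GPL-2.0-only", "Apache-2.0", "MIT"]
    for pair in find_conflicts(detected):
        print("flag for legal review:", pair)
```

The output is a short list of pairs for a reviewer to zoom in on, which matches the "help legal understand" goal above. Deciding what belongs in the table stays a policy question.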
  14. Key insights (5) Snippets and matching?
     ● Does copying a snippet of code really matter?
       ○ Have you looked at the big rocks first? e.g., whole libraries
       ○ Are you ready to pay the price in time and/or cash?
     Image credits: https://www.integrativenutrition.com/
  15. Key insights (5) Snippets and matching?
     ● Domain has been abandoned by commercial vendors
       ○ Snyk has spun off FOSSID
       ○ Synopsys mostly abandoned Protex
     ● One new entrant with open source code but proprietary data: SCANOSS
     ● Snippets may not matter (too much)
     ● But AI/ML-generated code snippets anyone?
       ○ Will artificial general intelligence (AGI) make snippets both more relevant and useless at the same time, when everyone can generate the same boilerplate derived from everyone's code?
     ● Yet code matching can speed up the analysis when done right (find big rocks first)
       ○ Reuse previous analysis based on matching code: WIP with MatchCode
  16. Key insights (6) SBOM, mehBOM?
     ● SBOMs are everywhere
       ○ GitHub can even create these directly from a repo
       ○ But what about data quality (depth and breadth)?
       ○ But what about using proper machine-readable identifiers (license, PURL)?
     ● Hi-Fi or Lo-Fi SBOMs?
     ● Every tool creates SBOMs but then what?
       ○ 2 out of 50+ folks were effectively consuming SBOMs
     ● Big gaps in tool-to-tool integration
     ● Too much over-engineering, and under-specification
     ● Advice: Ignore the SPDX vs. CycloneDX feud and embrace both, with PURL
       ○ Feel free to ignore SWID
       ○ SBOM is just a reporting format
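The "proper machine-readable identifiers" question can be turned into a quick check when consuming an SBOM. The sketch below assumes a CycloneDX-style JSON document already loaded as a dict, with a top-level `components` array whose entries may carry `purl` and `licenses`. The `sbom_gaps` function and the sample data are hypothetical, and "has a PURL and a license id or expression" is just one possible reading of Hi-Fi.

```python
from typing import Dict, List


def sbom_gaps(sbom: dict) -> Dict[str, List[str]]:
    """Map each component lacking a PURL or a license identifier to its gaps."""
    gaps: Dict[str, List[str]] = {}
    for component in sbom.get("components", []):
        missing = []
        if not component.get("purl"):
            missing.append("purl")
        # Accept either a license id or a license expression; a free-text
        # license name alone is not treated as machine-readable here.
        has_license_id = any(
            entry.get("license", {}).get("id") or entry.get("expression")
            for entry in component.get("licenses", [])
        )
        if not has_license_id:
            missing.append("license")
        if missing:
            label = f'{component.get("name", "?")}@{component.get("version", "?")}'
            gaps[label] = missing
    return gaps


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    sample = {
        "bomFormat": "CycloneDX",
        "components": [
            {
                "name": "django",
                "version": "4.2",
                "purl": "pkg:pypi/django@4.2",
                "licenses": [{"license": {"id": "BSD-3-Clause"}}],
            },
            {"name": "mystery-lib", "version": "0.1"},
        ],
    }
    print(sbom_gaps(sample))
```

A report like this gives a rough first answer to "every tool creates SBOMs but then what?": before integrating two tools, see how many components either of them can identify unambiguously.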
  17. Follow up on collaboration opportunities?
     ● Collaborate: License conflict/compatibility checking FOSS projects on data and standards (FLICT/OSADL/Hermine)
     ● Create: A live inventory of all FOSS tools and their capabilities
     ● Share: Approaches to dependency detection/resolution/processing
     ● Define: Evolve a standard/schema for tool-to-tool technical scan data sharing
     ● DATA: Exchange data!
  18. Credits
     ▷ Presentation template by SlidesCarnival licensed under CC-BY-4.0
     ▷ Photograph by Unsplash licensed under Unsplash License
     ▷ Other content licensed under CC-BY-SA-4.0