Hands-on Intro to SBOM

The concept of a Bill of Materials (BOM) is well established in traditional manufacturing as part of supply chain management. A manufacturer uses a BOM to track the parts it uses to create a product. If defects are later found in a specific part, the BOM makes it easy to locate the affected products. In the software industry this concept is fairly new, and it is used to keep track of all the ingredients of a piece of software.

What is an SBOM?

A software bill of materials (SBOM) is a formal record of the components used to develop software and its software supply chain relationships, according to the National Telecommunications and Information Administration (NTIA). An SBOM covers both open source (OSS) and proprietary software, creating transparency into potential vulnerabilities and elements within the software. SBOMs can be used for vulnerability management and product integrity.

An SBOM is useful both to the builder (manufacturer) and the buyer (customer) of a software product. Builders often leverage available open source and third-party software components to create a product; an SBOM allows the builder to make sure those components are up to date and to respond quickly to new vulnerabilities. Buyers can use an SBOM to perform vulnerability or license analysis, both of which can be used to evaluate risk in a product.
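To make the "locate affected products" idea concrete, here is a minimal Python sketch. The inventory, product names and flagged versions are all made up for illustration; a real lookup would read actual SBOM files and a vulnerability feed.

```python
# Hypothetical inventory: product name -> (component, version) pairs it ships.
# In practice this would be loaded from each product's SBOM.
INVENTORY = {
    "web-shop": [("flask", "2.0.1"), ("openssl", "1.1.1k")],
    "billing-api": [("flask", "2.2.5"), ("requests", "2.31.0")],
    "report-tool": [("openssl", "1.1.1k"), ("jinja2", "3.1.2")],
}


def affected_products(inventory, component, bad_versions):
    """Return the sorted names of products shipping a flagged component version."""
    hits = []
    for product, components in inventory.items():
        if any(name == component and version in bad_versions
               for name, version in components):
            hits.append(product)
    return sorted(hits)


# The flagged version below is an arbitrary example, not a real advisory.
print(affected_products(INVENTORY, "openssl", {"1.1.1k"}))
```

The same question ("which of my products ship component X at version Y?") is what both builders and buyers end up asking of an SBOM.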

Why SBOM?

An SBOM has several uses, such as:

  • Easier End-Of-Life management for dependencies and for the product itself.
  • License obligations and policy compliance.
  • For developers, it can help to unbloat the software by identifying the BOM and cleaning up unused things, and it can be used for quality assurance.
  • Identifying and eliminating vulnerabilities from the early stages (more shift left).

Many artifacts can provide SBOM information, and this information can be correlated and used together to give better security insights. These artifacts could be source code, executables, published software, or, in the DevOps world, containers!

Containers are an easy way to package and deliver software; a container is like an encapsulated artifact. From a container we can get an SBOM covering application dependencies, secrets, OS packages, licenses, file data, configuration files, container metadata, etc. When it comes to security, it's important to know every part of the system. An SBOM gives you a clear list of components, which helps in monitoring every part for vulnerabilities.

Existing SBOM formats

A new SBOM can be created and published in various formats, including HTML, CSV, PDF, Markdown, and plain text. SBOM formats are still in development, and new formats may arise in future that address specific problems in a better way. The formats currently in use are Software Package Data Exchange (SPDX), Software Identification (SWID) Tags, and CycloneDX.

  1. SPDX

Also known as ISO/IEC 5962:2021, SPDX is spearheaded by The Linux Foundation. It is an open standard for describing SBOM information related to provenance, licensing, and security.
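To give a feel for what an SPDX document carries, here is a minimal Python sketch that builds a small SPDX 2.3-style JSON document by hand. The document name, namespace URL, creator string and package list are assumptions for illustration, and the result has not been validated against the official schema; in practice a tool generates this for you.

```python
import json
from datetime import datetime, timezone


def minimal_spdx(doc_name, namespace, packages):
    """Build a small SPDX 2.3-style document from (name, version, license) tuples."""
    return {
        "spdxVersion": "SPDX-2.3",
        "dataLicense": "CC0-1.0",          # SPDX documents themselves are CC0
        "SPDXID": "SPDXRef-DOCUMENT",
        "name": doc_name,
        "documentNamespace": namespace,    # must be unique per document
        "creationInfo": {                  # provenance: who/what made this, and when
            "created": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
            "creators": ["Tool: hand-written-example"],  # hypothetical creator
        },
        "packages": [
            {
                "name": name,
                "SPDXID": f"SPDXRef-Package-{name}",
                "versionInfo": version,
                "downloadLocation": "NOASSERTION",
                "licenseDeclared": license_id,  # licensing information
            }
            for name, version, license_id in packages
        ],
    }


# Hypothetical namespace and an illustrative package entry.
doc = minimal_spdx(
    "hello-flask",
    "https://example.com/spdx/hello-flask-1",
    [("flask", "2.2.5", "BSD-3-Clause")],
)
print(json.dumps(doc, indent=2))
```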

  2. SWID Tags

This format identifies and reports software components under four categories across the development lifecycle:

  • Corpus Tags: Identifies and describes components in a pre-installation stage.
  • Primary Tags: Identifies and describes components in a post-installation stage.
  • Patch Tags: Identifies and describes the patch.
  • Supplemental Tags: Add extra information to a corpus, primary, or patch tag without changing it, so that only the original tag creator ever modifies those tags.
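
A SWID tag is a small XML document, and the tag type is signalled by boolean attributes on the root element (a tag with none of them set is a primary tag). The Python sketch below builds such a tag with the standard library. The product name, tagId, creator and regid are hypothetical, and the output is only a skeleton, not a complete or validated ISO/IEC 19770-2 tag.

```python
import xml.etree.ElementTree as ET

# Namespace used by ISO/IEC 19770-2:2015 SWID tags.
SWID_NS = "http://standards.iso.org/iso/19770/-2/2015/schema.xsd"
# Tag kinds that are flagged with a boolean attribute of the same name.
FLAGGED_KINDS = {"corpus", "patch", "supplemental"}


def swid_tag(name, version, tag_id, creator, regid, kind="primary"):
    """Return a skeleton SWID tag (XML string) of the requested kind."""
    if kind != "primary" and kind not in FLAGGED_KINDS:
        raise ValueError(f"unknown SWID tag kind: {kind}")
    attrs = {"xmlns": SWID_NS, "name": name, "version": version, "tagId": tag_id}
    if kind in FLAGGED_KINDS:
        attrs[kind] = "true"  # e.g. corpus="true" for a pre-installation tag
    root = ET.Element("SoftwareIdentity", attrs)
    # Every tag names the entity that created it.
    ET.SubElement(root, "Entity", {"name": creator, "regid": regid, "role": "tagCreator"})
    return ET.tostring(root, encoding="unicode")


# Hypothetical values for illustration only.
print(swid_tag("hello-flask", "1.0", "example.com-hello-flask-1.0",
               "Example Org", "example.com", kind="corpus"))
```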

  3. CycloneDX

Managed by the CycloneDX Core Working Group, it is designed for application security contexts. CycloneDX is considered a lightweight standard with features of both SPDX and SWID. It includes four data fields:

  • BOM Metadata: Description of the supplier, manufacturer, component, and compilation tools.
  • Components: Complete information about proprietary and open-source components, along with licensing requirements.
  • Services: A list of external APIs that the software may invoke.
  • Dependencies: All forms of relationship within the supply chain.
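
The four fields above map directly onto top-level keys in a CycloneDX JSON document. Here is a minimal Python sketch that assembles one by hand; the application name, component list and service name are assumptions for illustration, and the output is a simplified 1.4-style document that has not been validated against the schema.

```python
import json


def minimal_cyclonedx(app_name, app_version, components, services=()):
    """Build a small CycloneDX-style BOM from (name, version, license) tuples."""
    app_ref = f"{app_name}@{app_version}"
    component_entries = [
        {
            "type": "library",
            "bom-ref": f"pkg:pypi/{name}@{version}",  # id used by "dependencies"
            "name": name,
            "version": version,
            "purl": f"pkg:pypi/{name}@{version}",
            "licenses": [{"license": {"id": license_id}}],
        }
        for name, version, license_id in components
    ]
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.4",
        "version": 1,
        # BOM Metadata: what this BOM describes.
        "metadata": {
            "component": {
                "type": "application",
                "bom-ref": app_ref,
                "name": app_name,
                "version": app_version,
            }
        },
        # Components: the libraries that go into the application.
        "components": component_entries,
        # Services: external APIs the software may call.
        "services": [{"name": service} for service in services],
        # Dependencies: the application depends on every listed component.
        "dependencies": [
            {"ref": app_ref, "dependsOn": [c["bom-ref"] for c in component_entries]}
        ],
    }


# Hypothetical application, component version and service name.
bom = minimal_cyclonedx(
    "hello-flask", "1.0",
    [("flask", "2.2.5", "BSD-3-Clause")],
    services=["example-greeting-api"],
)
print(json.dumps(bom, indent=2))
```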

Don’t talk, show!!

For the demo, I've created a basic Flask application that says hello and containerized it on 3 different base images: ubuntu, alpine and distroless.

We can check the size of the container images using the docker images command.

If you want more detail about the size of each layer, you can use the docker history <image> command. More information about a running container (process) can be obtained using docker inspect <container>.

All these commands are good, but they do not provide any information about the application and its dependencies. Docker recently announced an experimental feature, docker sbom, that generates the SBOM of a container image. Today it does this by scanning the layers of the image using the Syft project, but in future it may read the SBOM from the image itself or from elsewhere.

Let's generate an SBOM for our containers by using the syft project directly.

By default, syft catalogs the squashed image (the filesystem the container actually sees, rather than every intermediate layer) and prints the result as a table on standard output (stdout). This is fine if we only want to read the SBOM ourselves and don't need to share it with other tools or people. To save the output to a file, use the --file option; to pick one of the other formats widely used by the community, use the -o or --output flag. I used a short bash script to create cyclonedx-json, github-json, spdx-json and syft-json SBOMs and store each in its own file.
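
As a rough equivalent of that loop, here is a minimal Python sketch. The image tags and the sbom/ folder layout are assumptions, and it relies on syft accepting several -o FORMAT=PATH pairs in one run. Running the block as-is only prints the commands; generate_sboms is what actually invokes syft.

```python
import subprocess
from pathlib import Path

# Hypothetical image tags for the three variants of the demo app.
IMAGES = ["hello-flask:ubuntu", "hello-flask:alpine", "hello-flask:distroless"]
FORMATS = ["cyclonedx-json", "github-json", "spdx-json", "syft-json"]


def image_dir(image, out_dir):
    """Folder for one image's SBOMs, e.g. sbom/hello-flask_alpine."""
    return Path(out_dir) / image.replace("/", "_").replace(":", "_")


def syft_command(image, out_dir):
    """Build one syft invocation that writes every format to its own file."""
    cmd = ["syft", image]
    for fmt in FORMATS:
        # "spdx-json" -> spdx.json, "cyclonedx-json" -> cyclonedx.json, ...
        target = image_dir(image, out_dir) / f"{fmt.rsplit('-', 1)[0]}.json"
        cmd += ["-o", f"{fmt}={target}"]
    return cmd


def generate_sboms(image, out_dir="sbom"):
    """Create the output folder and run syft (requires syft on PATH)."""
    image_dir(image, out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(syft_command(image, out_dir), check=True)


# Dry run: show the commands without executing them.
for image in IMAGES:
    print(" ".join(syft_command(image, "sbom")))
```

Calling generate_sboms(image) for each entry in IMAGES would produce one folder per image with four JSON files in it.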

The output of that script shows the package count for each image. Clearly ubuntu has the most, as it is a full-fledged distro with a lot of system files, manpages, etc., and the distroless image has the fewest. The idea of distroless is somewhat over-hyped in the container world, and it is often tied to the security idea of a minimal attack surface. Here is a Red Hat article that tries to give a clear understanding of the benefits of distroless containers and the myths around them.
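
The package count can also be read back from the SBOM files themselves. Here is a minimal Python sketch, assuming a syft-json document, which lists its packages under a top-level "artifacts" array; the sample document below is a made-up, heavily trimmed stand-in for a real file.

```python
import json
from collections import Counter
from pathlib import Path


def count_packages(syft_doc):
    """Return (total, Counter by package type) for a parsed syft-json document."""
    artifacts = syft_doc.get("artifacts", [])
    by_type = Counter(artifact.get("type", "unknown") for artifact in artifacts)
    return len(artifacts), by_type


def count_packages_in_file(path):
    """Same as count_packages, but reads the syft-json file at `path`."""
    return count_packages(json.loads(Path(path).read_text()))


# Made-up sample; a real syft-json file has many more fields per artifact.
sample = {
    "artifacts": [
        {"name": "flask", "version": "2.2.5", "type": "python"},
        {"name": "musl", "version": "1.2.3", "type": "apk"},
        {"name": "busybox", "version": "1.35.0", "type": "apk"},
    ]
}
total, by_type = count_packages(sample)
print(total, dict(by_type))
```

Splitting the count by package type makes it easier to see how much of an image is OS packages and how much is the application's own dependencies.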

The script also creates a directory of organised JSON files.

Now we have our SBOM files, and we can share them with the people who need them: our customers, external auditors, the incident response team, etc. We can also feed these files to another tool that checks the images for vulnerabilities. One such tool is grype, a vulnerability scanner for container images and filesystems that works exceptionally well with Syft. A second script generates grype results for all 3 images using their respective spdx.json files.
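
A minimal Python sketch of that step follows. It assumes the SBOMs live at hypothetical paths like sbom/hello-flask_alpine/spdx.json and uses grype's sbom: input scheme with JSON output. Running the block as-is only prints the commands; scan_sbom is what actually calls grype.

```python
import subprocess
from pathlib import Path


def grype_command(sbom_path):
    """Build a grype invocation that scans an SBOM file and emits JSON."""
    return ["grype", f"sbom:{sbom_path}", "-o", "json"]


def scan_sbom(sbom_path, report_path):
    """Run grype on one SBOM and save its JSON report (requires grype on PATH)."""
    result = subprocess.run(
        grype_command(sbom_path), check=True, capture_output=True, text=True
    )
    Path(report_path).write_text(result.stdout)


# Dry run over the three hypothetical SBOM locations.
for variant in ("ubuntu", "alpine", "distroless"):
    sbom = Path("sbom") / f"hello-flask_{variant}" / "spdx.json"
    print(" ".join(grype_command(sbom)))
```

Scanning the SBOM rather than the image means the scan can be repeated later, against a fresher vulnerability database, without pulling the image again.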

Like all static analysers, this tool might generate tons of false positives. Apart from that, grype provides plenty of configuration options that come in handy for automation and several other use cases. Many other commercial and open-source tools are emerging that leverage SBOMs to help solve problems around licensing and policy compliance, security audits, quality assurance, etc.
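
One simple way to cut the noise is to post-process the report yourself. The Python sketch below keeps only findings at or above a chosen severity from a grype JSON report. It assumes the report has a top-level "matches" list whose entries carry "vulnerability" and "artifact" objects; the sample report and its IDs are made up.

```python
# Severity labels ordered from least to most serious.
SEVERITY_ORDER = ["Unknown", "Negligible", "Low", "Medium", "High", "Critical"]


def triage(report, min_severity="High"):
    """Return (id, severity, package, version) for matches at or above min_severity."""
    threshold = SEVERITY_ORDER.index(min_severity)
    findings = []
    for match in report.get("matches", []):
        vuln = match["vulnerability"]
        severity = vuln.get("severity", "Unknown")
        # Treat unrecognised labels like "Unknown" (lowest rank).
        rank = SEVERITY_ORDER.index(severity) if severity in SEVERITY_ORDER else 0
        if rank >= threshold:
            artifact = match["artifact"]
            findings.append((vuln["id"], severity, artifact["name"], artifact["version"]))
    return findings


# Made-up report with placeholder IDs, trimmed to the fields used above.
sample_report = {
    "matches": [
        {"vulnerability": {"id": "CVE-0000-0001", "severity": "Critical"},
         "artifact": {"name": "libexample", "version": "1.0"}},
        {"vulnerability": {"id": "CVE-0000-0002", "severity": "Low"},
         "artifact": {"name": "libother", "version": "2.3"}},
    ]
}
for finding in triage(sample_report):
    print(finding)
```

A severity cutoff does not remove false positives as such; it only narrows what needs to be reviewed by hand first.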

SBOM misconceptions

There are a few misconceptions or myths about SBOMs, for example that an SBOM will:

  1. be a roadmap for the attacker
  2. require source code disclosure
  3. expose my intellectual property, etc.

Here is an NTIA publication that explains some of these myths versus the facts.
