How to create a Software Bill of Materials in Neo4J

Apr 20, 2021

Graph databases are very useful when we want to represent connected data in a visual, easy-to-read way. One of the best use cases I can think of for a graph database is to visualise the libraries used by a project and the vulnerabilities (if any) in those libraries. In short, a software bill of materials.

The code for this project is available in my personal GitHub repo.

Pre-requisites

The database we will use here is Neo4J, so the first thing we need to do is install it and create a database. There are plenty of tutorials on the official site, so if you are not familiar with Neo4J, that is a good starting point.

We also need Dependency-Check installed. This is a tool from OWASP that scans open-source libraries for known vulnerabilities. We will take the report generated by Dependency-Check as input and ingest it into our database, but this solution can be adapted to work with any other tool.

Design

The idea behind this is really simple. We have a JSON report previously generated by Dependency-Check. This report contains all the libraries and vulnerabilities for the scanned project. We parse this report, extract the relevant information and ingest it into our database to visualise it. Let’s go into a bit more depth:

We will have three different sets of data to ingest into the database: projects, dependencies and vulnerabilities. This is the information we will store for each of them:

Project:

project_name: The name of the project we have scanned

Dependency:

project_name: The list of projects where this dependency is included
dependency: The name and version of the dependency
package: The technology used for that dependency (Maven, npm, etc…)
vulnerabilities: The list of CVEs or other identifiers for the vulnerabilities in that dependency

Vulnerability:

vulnerability_name: The identifier for the vulnerability
severity: The CVSS score for this vulnerability
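
To make the model concrete, here is a minimal Python sketch of the dependency and vulnerability records and of how they might be filled from a parsed report. The report keys used here (dependencies, packages, id, vulnerabilities, name, cvssv3, cvssv2) are assumptions about the Dependency-Check JSON layout, and the helper names are hypothetical; the script in the repo may do this differently.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Vulnerability:
    vulnerability_name: str      # e.g. a CVE identifier
    severity: Optional[float]    # CVSS score, None if the report has none


@dataclass
class Dependency:
    project_name: list           # projects that include this dependency
    dependency: str              # name and version
    package: str                 # technology (maven, npm, ...)
    vulnerabilities: list = field(default_factory=list)


def split_package_id(package_id):
    """Split 'pkg:maven/group/artifact@1.0' into ('maven', 'group.artifact@1.0').

    The 'pkg:<type>/<name>' id format is an assumption about the report.
    """
    body = package_id[len("pkg:"):] if package_id.startswith("pkg:") else package_id
    package, _, name = body.partition("/")
    return package, name.replace("/", ".")


def cvss_score(vuln):
    """Return the first CVSS score found (assumed keys), or None."""
    for key, score_key in (("cvssv3", "baseScore"), ("cvssv2", "score")):
        score = vuln.get(key, {}).get(score_key)
        if score is not None:
            return float(score)
    return None


def extract(report, project_name):
    """Turn a parsed report (a dict) into dependency and vulnerability records."""
    dependencies, vulnerabilities = [], {}
    for entry in report.get("dependencies", []):
        packages = entry.get("packages", [])
        if not packages:
            continue  # skip files that were not identified as a package
        package, name = split_package_id(packages[0]["id"])
        vuln_names = []
        for vuln in entry.get("vulnerabilities", []):
            vuln_names.append(vuln["name"])
            # Keyed by name so a vulnerability shared by two libraries appears once.
            vulnerabilities[vuln["name"]] = Vulnerability(vuln["name"], cvss_score(vuln))
        dependencies.append(Dependency([project_name], name, package, vuln_names))
    return dependencies, list(vulnerabilities.values())
```

Calling extract(report, "testjavi") on a loaded report would return the two lists that feed the database; the project itself only needs its name.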

This is just the data to be ingested. In graph databases, on top of the data, we need to create relations:

Project -Uses-> Dependency:

We create this relation when a project name exists in the list of projects for a given dependency.

Dependency -Vulnerable_to-> Vulnerability:

We create this relation when a vulnerability exists in the list of vulnerabilities for a given dependency.
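
As a rough illustration, the nodes and relations could be created with Cypher MERGE statements, built here in Python as (query, parameters) pairs. This is a sketch under assumptions: the lowercase labels match the queries used later in this article, but the relationship types Uses and Vulnerable_to and the function name build_statements are hypothetical, and the real script may create the graph differently.

```python
def build_statements(project_name, dependency, package, vulnerabilities):
    """Return (cypher, parameters) pairs for one dependency of one project.

    `vulnerabilities` is a list of (vulnerability_name, severity) tuples.
    """
    names = [name for name, _ in vulnerabilities]
    statements = [
        # Project node.
        ("MERGE (p:project {project_name: $project_name})",
         {"project_name": project_name}),
        # Dependency node; keep track of every project that includes it.
        ("MERGE (d:dependency {dependency: $dependency}) "
         "ON CREATE SET d.package = $package, d.project_name = [$project_name], "
         "d.vulnerabilities = $vulnerabilities "
         "ON MATCH SET d.project_name = CASE "
         "WHEN $project_name IN d.project_name THEN d.project_name "
         "ELSE d.project_name + [$project_name] END",
         {"dependency": dependency, "package": package,
          "project_name": project_name, "vulnerabilities": names}),
        # Project -Uses-> Dependency.
        ("MATCH (p:project {project_name: $project_name}), "
         "(d:dependency {dependency: $dependency}) "
         "MERGE (p)-[:Uses]->(d)",
         {"project_name": project_name, "dependency": dependency}),
    ]
    for name, severity in vulnerabilities:
        # Vulnerability node plus Dependency -Vulnerable_to-> Vulnerability.
        statements.append(
            ("MATCH (d:dependency {dependency: $dependency}) "
             "MERGE (v:vulnerability {vulnerability_name: $vulnerability_name}) "
             "SET v.severity = $severity "
             "MERGE (d)-[:Vulnerable_to]->(v)",
             {"dependency": dependency, "vulnerability_name": name,
              "severity": severity}))
    return statements
```

Each pair could then be passed to session.run(query, parameters) from the official neo4j Python driver. Using MERGE rather than CREATE means that ingesting the same report twice should not duplicate nodes or relations.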

So, our structure looks like this: Project -Uses-> Dependency -Vulnerable_to-> Vulnerability.

Ingestion

Now that we have explained the model, it is time to run the tool and ingest the data into our database. This step couldn’t be simpler.

You just need to set the database configuration as environment variables:

NEO4J_USER=MYUSER
NEO4J_PWD=MYPASSWORD
NEO4J_DB=bolt://Neo4J_Location:port
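
For reference, this is roughly how a Python script can pick up those three variables and open a connection. It is a sketch, not the code from the repo: load_neo4j_config and connect are hypothetical helper names, while GraphDatabase.driver comes from the official neo4j Python driver.

```python
import os


def load_neo4j_config(env=os.environ):
    """Read the connection settings, failing early if one is missing."""
    required = ("NEO4J_USER", "NEO4J_PWD", "NEO4J_DB")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {"uri": env["NEO4J_DB"], "auth": (env["NEO4J_USER"], env["NEO4J_PWD"])}


def connect(config):
    """Open a driver for the configured database."""
    # Imported lazily so the config helper works without the driver installed.
    from neo4j import GraphDatabase
    return GraphDatabase.driver(config["uri"], auth=config["auth"])
```

Failing early with a clear message is a design choice here: a missing variable is easier to diagnose before the connection attempt than after it.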

Then run the Python script from the git repository, passing the name of the project to be ingested and the path to the JSON report as parameters:

python ingest_data_neo4j.py testjavi myreport.json
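
If you adapt the script to another tool, the command-line handling can stay very small. The following is a hypothetical sketch of reading the same two positional parameters with argparse and loading the report; the argument names project_name and report_path are assumptions, not taken from the repo.

```python
import argparse
import json


def parse_args(argv=None):
    """Parse the two positional parameters: project name and report path."""
    parser = argparse.ArgumentParser(
        description="Ingest a Dependency-Check JSON report into Neo4J")
    parser.add_argument("project_name", help="name of the scanned project")
    parser.add_argument("report_path", help="path to the JSON report")
    return parser.parse_args(argv)


def load_report(path):
    """Load the JSON report from disk into a dict."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```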

And that’s it! Now, we have our data ingested in Neo4J.

Visualisation

Finally, it’s time to visualise the data we have ingested in the Neo4J Browser. We can visualise many different things here, but I will list the queries for the data that I find most useful:

List of dependencies for a given project

MATCH (m:project)-->(a:dependency) WHERE m.project_name = 'testjavi' RETURN a, m

List of projects that use a given dependency

MATCH (m:project)-->(a:dependency) WHERE a.dependency = 'org.springframework.boot.spring-boot@1.3.1.RELEASE' RETURN a, m

List of all the dependencies and vulnerabilities in a project

MATCH (m:project)-->(a:dependency) WHERE m.project_name = 'testjavi' OPTIONAL MATCH (a)-->(v:vulnerability) RETURN a, m, v

And finally, list only the dependencies that contain vulnerabilities, and the projects that use them

MATCH (m:project)-->(a:dependency) WHERE a.vulnerabilities <> [] RETURN a, m
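
The same kind of query can also be run from Python, for example to feed a report or a CI check. This is a minimal sketch: it assumes the Uses relation is stored as a relationship from project to dependency, as described in the Design section, and vulnerable_dependencies is a hypothetical helper. The session argument is expected to behave like a session from the official neo4j Python driver.

```python
DEPENDENCIES_FOR_PROJECT = (
    "MATCH (m:project)-->(a:dependency) "
    "WHERE m.project_name = $project_name "
    "RETURN a.dependency AS dependency, a.vulnerabilities AS vulnerabilities"
)


def vulnerable_dependencies(session, project_name):
    """Return {dependency: [vulnerability ids]} for dependencies with findings."""
    # The project name is passed as a query parameter, not concatenated
    # into the query string.
    result = session.run(DEPENDENCIES_FOR_PROJECT, project_name=project_name)
    return {
        record["dependency"]: list(record["vulnerabilities"])
        for record in result
        if record["vulnerabilities"]  # skip empty or missing lists
    }
```

With a real connection this would be called inside a "with driver.session() as session:" block.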

Summary

What we have seen here is just an example of how to ingest a report from Dependency-Check and visualise that data, but it can be adapted to any Software Composition Analysis tool.

I hope you found this article useful!