Measuring Catalog correctness and completeness

A comprehensive Catalog in Backstage is the ultimate goal for many teams. To achieve it, you need a plan and a way of tracking your progress. This article will not delve into the former, as a suitable plan is something only you and your team can come up with.

Measuring how well your Catalog is doing, however, is something every team needs, and it can help you tune your plan along the way.

What is a correct and complete Catalog?

Backstage is used by organizations large and small, and for different purposes. Thus, there’s no universal definition of “correct” or “complete” for the Catalog. But let me explain what I’m referring to in this context.

By “correct,” I mean the quality of the data in individual entities. It’s not enough that a component shows up in the Catalog. It must have all the information required for its type. For example, if a component of type service shows up in the Catalog but doesn’t have PagerDuty and API Docs annotations, it’s not rich enough to meet my definition of “correct.” Most importantly, all the metadata and annotations in the entity must accurately describe the thing the entity represents for it to be correct.

By “complete,” I am referring to the coverage of software assets surfaced in the Catalog. In the absolute sense, it would mean having every single component, user, and other kind of entity in the organization reflected in Backstage. However, this is rare. More often, teams define “complete” within a scope, such as having an entity tracked for every business-critical service, or covering a subset of teams in the Catalog.
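To make a scoped definition of “complete” measurable, you can compare the list of assets you expect against what is registered. The following is a minimal sketch, not part of Backstage or Tech Insights: the function name `catalogCoverage` and both input lists are hypothetical, and it assumes you can export the names of in-scope services and of registered components from your own tooling.

```typescript
// Hypothetical helper: compares the services you consider in scope
// against the component names currently registered in the Catalog.
function catalogCoverage(
  expected: string[],
  registered: string[],
): { coverage: number; missing: string[] } {
  const registeredSet = new Set(registered);
  // Anything expected but not registered counts against completeness.
  const missing = expected.filter((name) => !registeredSet.has(name));
  // An empty scope is treated as fully covered to avoid dividing by zero.
  const coverage =
    expected.length === 0
      ? 1
      : (expected.length - missing.length) / expected.length;
  return { coverage, missing };
}

// Illustrative data only.
const inScope = ["payments-api", "checkout", "search", "billing"];
const inCatalog = ["payments-api", "search", "internal-tool"];

const report = catalogCoverage(inScope, inCatalog);
console.log(`coverage: ${report.coverage * 100}%`, "missing:", report.missing);
```

Entities that are registered but out of scope, such as `internal-tool` here, do not affect the ratio, which keeps the metric tied to the scope you defined.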

Tracking Catalog correctness

Once you have more than a few entities registered in your Catalog, manually tracking how rich and correct the metadata in each of them is becomes impractical, especially if you’re welcoming dev teams to onboard their own components.

This is where Tech Insights comes in. Tech Insights lets you check data points across your Catalog entries and summarize the findings in Scorecards.

Screenshot: Backstage best practices scorecard

Tech Insights is available as building blocks in the OSS plugin and as a fully-fledged Scorecards solution offered as a paid add-on in Roadie. The OSS version provides you with the fundamentals so you can implement your own Scorecards solution, while Roadie offers a no-code UI with hundreds of pre-built checks. You can apply the ideas in this article using either of them.

What to check for?

With Tech Insights, the possibilities of what you can check for are pretty extensive. Thus, I recommend designing a schema that defines what constitutes best practices in your Catalog.

For example, I expect every component to have, at a minimum, a name, a description, a group owner, and a valid type. To cover all of these criteria, I need to define several checks on the component metadata:

Check that name and description are present: make Tech Insights verify that the metadata fields you’re interested in are present on each entity.
Check that the component is associated with a GitHub repository: linking a component to GitHub unlocks several features in Backstage, so make sure to check that the github.com/project-slug annotation is present. You can also use a regex to check that its value matches the expected format.
Check that the type of entity is valid: the type field in Backstage can be an arbitrary string, which makes it prone to human error. You can check that the types defined in all components belong to the set of strings you expect.
Check that components are labelled correctly: if you need your components to be tagged with their corresponding region or tier, using labels is the simplest way to go. You must therefore assess which components carry the expected labels.
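The four checks above can be sketched as plain predicates over an entity. This is an illustration of the logic only, not the Tech Insights API: the `EntityLike` shape is a trimmed-down stand-in for a Backstage entity, and the allowed types, required labels, and slug pattern are assumptions you would replace with your own conventions.

```typescript
// Minimal slice of a Backstage entity; only the fields these checks read.
interface EntityLike {
  kind: string;
  metadata: {
    name?: string;
    description?: string;
    annotations?: Record<string, string>;
    labels?: Record<string, string>;
  };
  spec?: { type?: string; owner?: string };
}

type Check = (entity: EntityLike) => boolean;

// Assumption: the component types your organization allows.
const ALLOWED_TYPES = new Set(["service", "website", "library"]);
// Assumption: labels every component must carry.
const REQUIRED_LABELS = ["region", "tier"];
// Assumption: a project slug has the form "org/repo".
const SLUG_PATTERN = /^[\w.-]+\/[\w.-]+$/;

const checks: Record<string, Check> = {
  hasNameAndDescription: (e) =>
    Boolean(e.metadata.name?.trim()) && Boolean(e.metadata.description?.trim()),
  hasGithubSlug: (e) =>
    SLUG_PATTERN.test(e.metadata.annotations?.["github.com/project-slug"] ?? ""),
  hasValidType: (e) => ALLOWED_TYPES.has(e.spec?.type ?? ""),
  hasRequiredLabels: (e) =>
    REQUIRED_LABELS.every((label) => Boolean(e.metadata.labels?.[label])),
};

// Runs every check and returns a name -> pass/fail map for one entity.
function runChecks(entity: EntityLike): Record<string, boolean> {
  const results: Record<string, boolean> = {};
  for (const [name, check] of Object.entries(checks)) {
    results[name] = check(entity);
  }
  return results;
}

// Illustrative entity with two deliberate mistakes.
const example: EntityLike = {
  kind: "Component",
  metadata: {
    name: "payments-api",
    description: "Handles card payments",
    annotations: { "github.com/project-slug": "acme/payments-api" },
    labels: { region: "eu-west-1" }, // "tier" label is missing on purpose
  },
  spec: { type: "servce", owner: "group:payments" }, // typo in type on purpose
};

console.log(runChecks(example));
```

In this example the name, description, and GitHub slug checks pass, while the misspelled type and the missing tier label are flagged. These are the kinds of human error the checks are meant to surface.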

You can check any other detail from the entity metadata, and decide whether a check applies to all entities or only to a subset of them.

Screenshot: Backstage component check

Once you have defined all the checks you want to apply, you can aggregate them in a Scorecard that keeps track of which entities comply with all of them.

Screenshot: Scorecards composition
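Conceptually, a Scorecard is a set of checks plus a rule for which entities each check applies to. The sketch below shows that aggregation under stated assumptions: `ScorecardCheck`, `scoreEntity`, and the sample entities are hypothetical names for illustration, and neither the OSS plugin nor Roadie necessarily models Scorecards this way.

```typescript
// Trimmed-down stand-in for a Backstage entity.
interface EntityLike {
  kind: string;
  metadata: {
    name: string;
    description?: string;
    labels?: Record<string, string>;
  };
  spec?: { type?: string };
}

// Hypothetical check shape: a scope filter plus a pass/fail predicate.
interface ScorecardCheck {
  id: string;
  appliesTo: (e: EntityLike) => boolean;
  passes: (e: EntityLike) => boolean;
}

const scorecard: ScorecardCheck[] = [
  {
    id: "has-description",
    appliesTo: () => true, // applies to every entity
    passes: (e) => Boolean(e.metadata.description?.trim()),
  },
  {
    id: "has-tier-label",
    appliesTo: (e) => e.spec?.type === "service", // services only
    passes: (e) => Boolean(e.metadata.labels?.["tier"]),
  },
];

interface EntityScore {
  name: string;
  passed: number;
  applicable: number;
  compliant: boolean;
}

// Scores one entity against only the checks that apply to it.
function scoreEntity(e: EntityLike, checks: ScorecardCheck[]): EntityScore {
  const applicable = checks.filter((c) => c.appliesTo(e));
  const passed = applicable.filter((c) => c.passes(e)).length;
  return {
    name: e.metadata.name,
    passed,
    applicable: applicable.length,
    compliant: passed === applicable.length,
  };
}

// Illustrative data only.
const entities: EntityLike[] = [
  {
    kind: "Component",
    metadata: { name: "payments-api", description: "Card payments", labels: { tier: "1" } },
    spec: { type: "service" },
  },
  {
    kind: "Component",
    metadata: { name: "docs-site", description: "Public docs" },
    spec: { type: "website" },
  },
  { kind: "Component", metadata: { name: "search" }, spec: { type: "service" } },
];

const scores = entities.map((e) => scoreEntity(e, scorecard));
const compliantCount = scores.filter((s) => s.compliant).length;
console.log(scores);
console.log(`${compliantCount}/${entities.length} entities pass every applicable check`);
```

Scoring each entity only against the checks that apply to it means a website is not penalized for lacking a label that is required only for services.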

Go beyond Catalog correctness

Happiness in your Catalog becomes a measurable objective with Tech Insights. But you can check more than the validity of entity metadata. For example, you can:

Get an overview of Docker image migrations
Track on-prem SonarQube metrics
Enforce branch protection

If you’d like to see what Roadie’s Tech Insights can do for your team, feel free to book a demo!
