Roadie’s Blog

How to customize Backstage Kinds and Types without getting in trouble

By Sam Blausten, September 3rd, 2024

Backstage comes with an opinionated set of top level groupings to use in modeling the software and assets in your organization, such as “Component”, “Resource”, “API”, “System” and “Domain”.

These are called “Kinds”, and they are meant to be combined with a “Type” that signifies more detailed subcategories within these generic buckets. For instance, you could have a Component of type library or a Component of type website. Types are completely up to you to define, which can lead to its own set of problems, as we will see.

When someone wants to add something to your catalog, they first need to decide which kind it belongs to, and then which type. Most of the time these questions arise while sitting in front of a blank YAML file in a code editor, without any helpful context to hand. This can lead to several problems, even for those who understand the Backstage YAML approach and schema.

The common problems

Which Type?

How does this decision get made? A conscientious user could first go to the existing catalog and try to find patterns already used for a similar thing, and then decide whether to follow that pattern or not.

However, the user might instead rely on their own common sense to choose the kind and type. This often leads to problems, particularly with types, as you can end up with multiple versions of the same type, e.g. website, Website, site, web-server, webServer, web-app, etc. This can even be intentional: when an existing type doesn’t follow the formatting of other types, a user may decide to introduce a corrected duplicate.

Duplicate types cause problems when you want to search the catalog by type, as you have to select all variants, both in the UI and via the API. Backstage search might not work as expected or might miss results. The cognitive load of a messy catalog also affects the perceived reliability of the data, which could mean users consult it less and no longer see it as a source of truth.
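To make the duplication problem concrete, here is a minimal sketch of how near-duplicate types could be detected in a list of type strings. The function names are hypothetical and the normalization rule (ignore case and separators) is an assumption about what counts as "the same type". It will not catch purely semantic variants such as site versus website.

```typescript
// Collapse case and separator differences so that spellings such as
// "web-server" and "webServer" share one key.
function normalizeType(type: string): string {
  return type.toLowerCase().replace(/[^a-z0-9]/g, '');
}

// Group types by normalized key and return only groups with several spellings.
function findDuplicateTypes(types: string[]): string[][] {
  const groups = new Map<string, Set<string>>();
  for (const type of types) {
    const key = normalizeType(type);
    const group = groups.get(key) ?? new Set<string>();
    group.add(type);
    groups.set(key, group);
  }
  return Array.from(groups.values())
    .filter(group => group.size > 1)
    .map(group => Array.from(group).sort());
}

const duplicateGroups = findDuplicateTypes([
  'website',
  'Website',
  'web-server',
  'webServer',
  'library',
]);
console.log(duplicateGroups);
// → [ [ 'Website', 'website' ], [ 'web-server', 'webServer' ] ]
```

Running something like this against the types already in your catalog gives a quick measure of how much clean-up is needed.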

Which Kind?

Types are not the only potential problem area. Kinds may not map neatly to the top level terminology or groupings used inside your organization. Maybe you use Value Streams, which are groupings of Domains, as an organizational concept and a way of grouping teams. Backstage has no ValueStream kind, only the Domain kind. A Domain is not the same as a Value Stream; in fact, a Value Stream usually comprises multiple Domains. If you want to model Value Streams with the default options, you would have to use a kind of Domain with a qualifying type of value-stream, which might feel unintuitive or even stop users from adding it at all.

The solutions

So how can we address these problems?

Problem 1: Kinds don’t map to top level concepts in your organization causing confusion

Custom Kinds

It is possible to create custom kinds in Backstage that better align to your organization.

Creating a new kind is a relatively simple engineering task: you register a new processor that validates entities of that kind and emits any desired relationship mappings.

import { CatalogProcessor, CatalogProcessorEmit, processingResult } from '@backstage/plugin-catalog-node';
import { ProductEntityV1 } from './ProductKind';
import {
  Entity,
  entityKindSchemaValidator,
  getCompoundEntityRef,
  parseEntityRef,
  RELATION_HAS_PART,
  RELATION_OWNED_BY,
  RELATION_OWNER_OF,
  RELATION_PART_OF,
} from '@backstage/catalog-model';
import { LocationSpec } from '@backstage/plugin-catalog-common';
import productSchema from './Product.roadie.v1.schema.json';

// Creates a validator from the JSON schema imported above (checked with AJV).
// Returns true only when the entity matches the Product kind schema.
const validateProductEntity = (entity: Entity): boolean =>
  entityKindSchemaValidator(productSchema)(entity) === entity;

// Processors will run against every entity in the catalog
export class ProductKindProcessor implements CatalogProcessor {

  getProcessorName(): string {
    return 'ProductKindProcessor';
  }

  postProcessEntity(
    entity: Entity,
    _location: LocationSpec,
    emit: CatalogProcessorEmit,
  ): Promise<Entity> {
    const selfRef = getCompoundEntityRef(entity);

    // Function for triggering relationships to be processed
    function doEmit(
      targets: string | string[] | undefined,
      context: { defaultKind?: string; defaultNamespace: string },
      outgoingRelation: string,
      incomingRelation: string,
    ): void {
      if (!targets) {
        return;
      }
      for (const target of [targets].flat()) {
        const targetRef = parseEntityRef(target, context);
        emit(
          processingResult.relation({
            source: selfRef,
            type: outgoingRelation,
            target: {
              kind: targetRef.kind,
              namespace: targetRef.namespace,
              name: targetRef.name,
            },
          }),
        );
        emit(
          processingResult.relation({
            source: {
              kind: targetRef.kind,
              namespace: targetRef.namespace,
              name: targetRef.name,
            },
            type: incomingRelation,
            target: selfRef,
          }),
        );
      }
    }

    // Emit relations for entities of the custom Product kind
    if (entity.kind === 'Product') {
      const product = entity as ProductEntityV1;
      doEmit(
        product.spec.owner,
        { defaultKind: 'Group', defaultNamespace: selfRef.namespace },
        RELATION_OWNED_BY,
        RELATION_OWNER_OF,
      );
      doEmit(
        product.spec.system,
        { defaultKind: 'System', defaultNamespace: selfRef.namespace },
        RELATION_PART_OF,
        RELATION_HAS_PART,
      );
      // ... emit any further relations here
    }
    return Promise.resolve(entity);
  }

  // This is required so that the catalog can ingest the new Kind
  validateEntityKind(entity: Entity): Promise<boolean> {
    if (entity.kind === 'Product') {
      return Promise.resolve(validateProductEntity(entity));
    }
    return Promise.resolve(false);
  }
}
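The processor above imports ProductEntityV1 from ./ProductKind without showing it. As a rough sketch only, the type could look something like the following. The apiVersion value is a made-up placeholder, the spec fields are inferred from the spec.owner and spec.system lookups in the processor, and EntityLike is a local stand-in for the Entity type from @backstage/catalog-model so that the snippet compiles on its own.

```typescript
// Local stand-in for Entity from @backstage/catalog-model (assumption: only
// the fields used in this sketch are declared).
interface EntityLike {
  apiVersion: string;
  kind: string;
  metadata: { name: string; namespace?: string };
  spec?: Record<string, unknown>;
}

// Hypothetical shape of the custom kind. 'example.com/v1' is a placeholder.
interface ProductEntityV1 extends EntityLike {
  apiVersion: 'example.com/v1';
  kind: 'Product';
  spec: {
    owner: string; // entity ref of the owning group
    system?: string; // optional entity ref of the parent system
  };
}

// Type guard that narrows a generic entity to the Product kind.
function isProductEntity(entity: EntityLike): entity is ProductEntityV1 {
  return (
    entity.apiVersion === 'example.com/v1' &&
    entity.kind === 'Product' &&
    typeof entity.spec?.owner === 'string'
  );
}

const checkout: EntityLike = {
  apiVersion: 'example.com/v1',
  kind: 'Product',
  metadata: { name: 'checkout' },
  spec: { owner: 'team-payments' },
};
console.log(isProductEntity(checkout)); // → true
```

In a real setup the JSON schema remains the source of truth for validation; a type like this only gives the processor code compile-time safety.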

In addition you will need to add the kind to an allow list in your app-config.yaml file in the root of your Backstage repo like so:

catalog:
  rules:
    - allow:
      - Component
      - Product
      # ...

However, you will want to ensure that any new Kinds you introduce are stable enough concepts in your organization that they are unlikely to change. Deprecating a top-level Kind when you have thousands of YAML files to update manually across hundreds of teams can be a time-consuming process.

Documentation

You will also want to document your new kind somewhere internally. This could mean duplicating the Backstage documentation on catalog schemas, so that your organization has a single place to go for reference on all available entity schemas.

Example of Kind documentation in Roadie

Scaffolder

Lastly, you can create a Scaffolder template to help people bootstrap a new catalog-info.yaml file. The template can use a dropdown of available kinds so that it’s easy to know what options are available. An example template for GitHub can be found here.

Example of API populated select in Scaffolder

Problem 2: Types are prone to misuse and duplication

Types are in some ways a harder problem to solve, as the pain point is largely caused by the disconnect between writing YAML manually in a code editor and your Backstage application.

It’s also worth noting that types are restricted to a constrained set of characters and, for instance, cannot contain separate words. If the format is invalid, entities using the type will fail to be ingested into the catalog.

Documentation

You can write documentation on catalog entity schemas to keep track of agreed types, and hope that your contributors will reference it.

Using the CI Pipeline

However, a better approach might be to add a step to your CI that either forces types to match existing types, reports whether a type matches an existing one, or simply lists the existing types.

You can easily establish which types already exist in the catalog using the /api/catalog/entity-facets?facet=spec.type endpoint, or you could expose a dedicated type validator endpoint in your Backstage backend that also tells you whether a type is already in the catalog.
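As a sketch of what such a check might look like, the snippet below compares a proposed type against a facets response. It assumes the response body has the shape { facets: { "spec.type": [{ value, count }] } }; checkType and the TypeCheck result type are hypothetical names, and the HTTP call itself (including any authentication your instance requires) is left out.

```typescript
// Subset of the entity-facets response that this sketch relies on.
interface FacetResponse {
  facets: Record<string, Array<{ value: string; count: number }>>;
}

type TypeCheck =
  | { status: 'existing' }
  | { status: 'near-duplicate'; existing: string }
  | { status: 'new' };

// Compare a proposed spec.type against the types already in the catalog.
function checkType(candidate: string, response: FacetResponse): TypeCheck {
  const known = (response.facets['spec.type'] ?? []).map(facet => facet.value);
  if (known.includes(candidate)) {
    return { status: 'existing' };
  }
  // Same letters and digits, but different case or separators.
  const squash = (s: string) => s.toLowerCase().replace(/[^a-z0-9]/g, '');
  const near = known.find(k => squash(k) === squash(candidate));
  return near ? { status: 'near-duplicate', existing: near } : { status: 'new' };
}

// Example body, as it might be returned for facet=spec.type.
const sampleFacets: FacetResponse = {
  facets: {
    'spec.type': [
      { value: 'service', count: 12 },
      { value: 'website', count: 4 },
    ],
  },
};

console.log(checkType('Website', sampleFacets));
// → { status: 'near-duplicate', existing: 'website' }
```

A strict CI job could fail on anything other than 'existing', while a more lenient one could fail only on 'near-duplicate' and let genuinely new types through.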

Alternatively, you could use a config-based approach or an allowlist of types. With this approach, new types must be added to the validator (and ideally the documentation) via a review process.
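A minimal sketch of the allowlist idea follows. The ALLOWED_TYPES map, its contents and the validateTypes function are all hypothetical; in practice the list would live in a reviewed config file and the entities would be parsed from the catalog-info.yaml files changed in a pull request.

```typescript
// Hypothetical allowlist, kept under version control and changed via review.
const ALLOWED_TYPES: Record<string, readonly string[]> = {
  Component: ['service', 'website', 'library'],
  Resource: ['database', 's3-bucket'],
};

// Only the entity fields this check needs.
interface TypedEntity {
  kind: string;
  metadata: { name: string };
  spec?: { type?: string };
}

// Return one error message per entity whose type is not on the allowlist.
function validateTypes(entities: TypedEntity[]): string[] {
  const errors: string[] = [];
  for (const entity of entities) {
    const allowed = ALLOWED_TYPES[entity.kind];
    const type = entity.spec?.type;
    // Kinds without an allowlist entry, and entities without a type, are skipped.
    if (!allowed || type === undefined) {
      continue;
    }
    if (!allowed.includes(type)) {
      errors.push(
        `${entity.kind} "${entity.metadata.name}": type "${type}" is not allowed ` +
          `(expected one of: ${allowed.join(', ')})`,
      );
    }
  }
  return errors;
}

const typeErrors = validateTypes([
  { kind: 'Component', metadata: { name: 'shop' }, spec: { type: 'web-app' } },
  { kind: 'Component', metadata: { name: 'billing' }, spec: { type: 'service' } },
]);
console.log(typeErrors);
```

Listing the permitted values in the error message tells the contributor what to pick without leaving the CI log.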

The downside of the less permissive CI jobs is that they can add friction. This can lead to entities being labelled with inappropriate types because the contributor doesn’t want to go through the process of requesting a new type, even when the new type would be a valid addition.

Alternatively, a more permissive CI job could just print out the existing types for the kind(s) you are adding via the /api/catalog/entity-facets?facet=spec.type endpoint, and leave it to the contributor to make sure their type aligns with an existing one unless it is genuinely new.
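The permissive variant could be sketched roughly as below. The helper names are hypothetical, and narrowing the facets request to one kind with a filter=kind=... query parameter is an assumption you should verify against your Backstage version. Fetching the URL is again left to whatever HTTP client your CI uses.

```typescript
interface TypeFacet {
  value: string;
  count: number;
}

// Build the facets URL for one kind (assumes the endpoint accepts `filter`).
function facetsUrlForKind(baseUrl: string, kind: string): string {
  const filter = encodeURIComponent(`kind=${kind}`);
  return `${baseUrl}/api/catalog/entity-facets?facet=spec.type&filter=${filter}`;
}

// Format the existing types for a CI log, most used first.
function formatTypeReport(kind: string, facets: TypeFacet[]): string {
  const lines = facets
    .slice()
    .sort((a, b) => b.count - a.count || a.value.localeCompare(b.value))
    .map(facet => `  ${facet.value} (${facet.count})`);
  return [`Existing types for kind ${kind}:`, ...lines].join('\n');
}

console.log(facetsUrlForKind('https://backstage.example.com', 'Component'));
console.log(
  formatTypeReport('Component', [
    { value: 'website', count: 4 },
    { value: 'service', count: 12 },
  ]),
);
```

Showing usage counts next to each type nudges contributors towards the established spelling without blocking their change.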

Using the Scaffolder

You can also create a Scaffolder template to help people bootstrap a new catalog-info.yaml file. The template can use a dropdown of available Types so that it’s easy to know what options are available and avoid duplicating them. Specifically, you can use the SelectFieldFromApi parameter widget in your template and point it to the Catalog API endpoint to get a list of available types.

How Roadie can help

At Roadie, we help you overcome these kinds of challenges in a few ways. We take care of adding new kinds to Backstage for you, along with a wide range of existing or new relationships for those kinds.

CI Validator

Our catalog validator can run on your CI to check that the YAML you are adding or updating is valid (i.e. the type field format will not break the ingestion).

Example of the Validator running as a GitHub Action

Tech Insights Scorecards

Our Tech Insights feature allows you to track the correctness of YAML files being added to your catalog across the organization as well as at Group level, and even fix issues via a link to a pre-filled Scaffolder template.

Scorecard in Roadie's Tech Insights

Scaffolder

Lastly, we make available a series of Scaffolder actions that allow non-technical users to add things to the catalog. API calls provide context by populating a list of existing types, so users won’t create duplicates by mistake.

Scaffolder for creating YAML Files
