Roadie Blog

A more customisable Scaffolder

Thu, 11 Apr 2024 11:00:00 GMT

The latest features and updates from Roadie.

🤖 Custom Scaffolder Actions.

Templates and the Scaffolder get heavily used by our customers to democratise common tasks like adjusting cloud account budgets or making changes to some Terraform repos.

One of the historic limitations of the Roadie Scaffolder has been that customers were unable to create and use their own custom actions. This added friction for more advanced Scaffolder users, who were limited to the Scaffolder Actions that we supported when they came to write a Template.

No more though! Roadie customers can now create their own self-hosted Custom Scaffolder Actions.

That means:

You can now write your own Scaffolder Actions to complete a task or use some custom CLI that Roadie isn’t aware of.
Then register those Actions in Roadie - as many as your heart desires.
Context can then be passed back and forth between Roadie and these Actions within any template you write.

Detailed docs can be found here to get started.
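
As a rough sketch of how a registered custom action might be used from a template: the action id `acme:budget:adjust`, its inputs, and its `summary` output are all hypothetical names chosen for illustration. Only the overall Template shape comes from Backstage.

```yaml
apiVersion: scaffolder.backstage.io/v1beta3
kind: Template
metadata:
  name: adjust-cloud-budget            # hypothetical template name
  title: Adjust cloud account budget
spec:
  owner: group:platform-team           # hypothetical owner
  type: service
  parameters:
    - title: Budget details
      required:
        - accountId
        - monthlyBudget
      properties:
        accountId:
          title: Cloud account ID
          type: string
        monthlyBudget:
          title: New monthly budget (USD)
          type: number
  steps:
    - id: adjust-budget
      name: Adjust budget
      action: acme:budget:adjust       # hypothetical custom action id
      input:
        # Context passed from the template to the custom action
        accountId: ${{ parameters.accountId }}
        monthlyBudget: ${{ parameters.monthlyBudget }}
  output:
    text:
      - title: Result
        # Assumes the custom action emits an output called `summary`
        content: ${{ steps['adjust-budget'].output.summary }}
```

The step input and the `output` section illustrate context moving in both directions: parameters go into the action, and whatever the action outputs can be read back by the rest of the template.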

⽥ Custom Field Extensions

In the same vein, we have also opened up support for Custom Field Extensions for the Scaffolder.

You can write your own React Components and validator functions to handle the use cases not currently covered by the existing Templates. This allows you to customise a lot of the Scaffolder forms to your heart’s content.
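
Once a custom field extension is registered, a template would typically reference it by name through `ui:field`. The sketch below assumes a hypothetical extension called `CostCentrePicker` with a hypothetical `region` option; neither ships with Roadie or Backstage.

```yaml
parameters:
  - title: Choose a cost centre
    required:
      - costCentre
    properties:
      costCentre:
        title: Cost centre
        type: string
        ui:field: CostCentrePicker   # hypothetical custom field extension
        ui:options:
          region: emea               # hypothetical option read by the component
```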

🔌 Plugin roundup

The Dynatrace plugin is now supported on Roadie. You can use it to surface recent problems, error traces, and synthetics results for your services.
The PagerDuty plugin has had a significant upgrade thanks to the fine work Tiago Barbosa is doing as part of PagerDuty taking maintainership of the plugin earlier this year. We’ve adopted the new plugin, so you’ll see a slick new UI and some additional features as they roll out.

Repositories have come to the Catalog

Mon, 11 Mar 2024 11:00:00 GMT

🤖 [Beta] Repositories have come to the Catalog.

We think a lot about catalog completeness at Roadie.

One of the most successful strategies we’ve come across to aid in getting components into the catalog is to surface what is in the catalog against a list of repositories. The gap between the two helps to identify what is not in the catalog. Simple.

To help customers do this as easily as possible we’ve brought Repositories in from the cold and added them as a core part of the Catalog.

That means:

There is a new Repository tab in the Catalog
Repositories are auto-discovered for GitHub users. For other SCMs the Roadie API is available to push your repositories in.
Quick editing of Repositories inside the Catalog table itself.

📝 Set owners and other properties in the Catalog table UI (no yaml required)

We’ve expanded Decorators again to allow editing of Title, Owner, Type and Lifecycle for Catalog entities from the Catalog table itself.

You can now edit the title, owner, type and lifecycle of entities right in the Catalog table UI. All without yaml. Just click the pencil icon in the Actions section of whatever row you’d like to edit.
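
For reference, these are the fields that would otherwise be set by hand in a catalog-info.yaml file. The entity and team names are made up for illustration.

```yaml
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: payments-api              # illustrative entity name
  title: Payments API             # Title
spec:
  type: service                   # Type
  lifecycle: production           # Lifecycle
  owner: group:payments-team      # Owner (illustrative group)
```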

⛏️ [Beta] Auto-discover AWS Resources with our new provider

Our customers often want to represent AWS resources in the Catalog and have been using the Roadie API to do just that. There’s a simpler way though, so we decided to build it. Behold: the Roadie <> AWS Provider.

It can currently be configured to pull in:

Lambda functions
EKS clusters
S3 buckets
DynamoDB tables
EC2 instances
and RDS DBs

Coming soon: AWS Accounts.
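
To give a feel for what an AWS resource looks like in the Catalog, here is a sketch of a Backstage Resource entity for an S3 bucket. This is not the provider’s actual output: the name, the `s3-bucket` type value, and the owner are assumptions for illustration only.

```yaml
apiVersion: backstage.io/v1alpha1
kind: Resource
metadata:
  name: invoices-archive          # illustrative bucket name
  description: S3 bucket holding archived invoices
spec:
  type: s3-bucket                 # assumed type value; the provider may use a different one
  owner: group:payments-team      # illustrative owner
```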

⏫ Upgrade to Backstage 1.23

We’ve upgraded Backstage to 1.23 for all customers. The upgrade includes:

A fix to a vulnerability identified by us as part of our annual third-party penetration test. It related to the Scaffolder and didn’t actually affect us, but we worked on fixing it with the core Spotify team to keep the rest of the community safe. More info here.
A tweak to Nunjucks trimBlocks that allows for greater control over templating in the Scaffolder, worked on by our own Miklos, one of the core Roadie team.

Full release notes for Backstage 1.23.

🔌 Plugin roundup

The new End of Life plugin is now supported on Roadie. We’ve brought in both the frontend and the backend for this plugin, so it is able to read from repository files via the “source-location” annotation. It’s nice. We like it.

The Roadie API is now live and scorecards have come to the Catalog

Mon, 12 Feb 2024 11:00:00 GMT

🤖 The Roadie API is now live for all users on our Growth Plan.

It includes:

The Backstage catalog API, exposed via token authentication for Roadie customers.
A scaffolder API for testing templates in continuous integration (via dry-run), triggering templates, and listing historical scaffolder runs.
A Tech Insights API that allows you to create, read, update and delete scorecards, checks and data sources.

At the moment API token generation is limited to Admins. If you are an Admin and need a token, simply navigate to Administration (bottom of the sidebar) and Account (in the tabs across the top). Give your token a name and click Generate Token.

🎨 More Decorators: you can now set owners and other properties in the Roadie UI (no yaml required)

We’ve expanded Decorators to allow setting many more fields on the Entities in your catalog.

You can now override the owner, lifecycle and tags of Components right in the Roadie UI.
Groups and all other Entity Kinds have also been expanded so that more properties can be set.

All without yaml. Just use the Decorate Entity feature in the top left corner of each Component page. Simple.

💯 Scorecards in the Catalog

How to surface scorecard information to teams is something we think a lot about at Roadie. A few months ago we added Rollups to help. This month we launched Scorecards in the Catalog to increase visibility and discoverability for teams. This is something with a long history (the original discussion in the Backstage community was way back in September 2020) and it’s something we’re hyped about.

🙅‍♂️ Tech Insights gets exclusions (in beta) and a new Facts list

We’re currently beta testing the ability to exclude facts from a Tech Insights scorecard or check. This will give you fine-grained control over the checks and scorecards you create and let you pinpoint specific areas of your catalog.

We’ve also added a Facts list under the Data tab in Tech Insights to improve discovery of what you can and can’t do with Tech Insights data sources.

⏫ Upgrade to Backstage 1.21

We’ve upgraded Backstage to 1.21 for all customers. The biggest change since the last update at the end of 2023 is the new scaffolder UI with horizontal paging.

The upgrade to 1.21 also fixes a small but annoying bug that was pointed out by some customers: the scroll position of TechDocs pages now returns to the top when you navigate between different docs.

Our next version bump will be to 1.23 when that lands (hopefully later this month).

🔌 Plugin roundup

The new PagerDuty plugin is now supported. PagerDuty took over support for the plugin in January, deprecated the old plugin and launched their own version. It’s nice. We like it.
The Pulumi plugin is now supported. With it, you can bring infrastructure data associated with your Pulumi stack into Backstage.
The Cost Insights plugin now has beta support. If you’re interested reach out to one of the Roadie team on Slack, Discord or Teams.

Rollups for Tech Insights

Tue, 14 Nov 2023 00:00:00 GMT

In addition to the recent announcement of Decorators, and a much faster custom plugins pipeline, we’ve also shipped Rollups for Tech Insights users.

Tech Insights

Rollups

Rollups aggregate Scorecard and Check data by team and department, up and down your organisational hierarchy, and let you add scorecard information to teams in the catalog. Learn more below in the Tech Insights section.

Here’s an example for a team called “engineering”.

From this, we can see that the engineering team is doing a great job of using PagerDuty correctly, but could do better at Dependabot configuration. If there are other teams reporting to this one in the org chart then their data will be rolled up into this view also.

Add the ScorecardResultForGroup and ScorecardResultsTableForGroup Cards to Group layouts to see results like this.

You can also see this data presented in report format on a single Scorecard, and dive into the data at different levels in the org.

Bug fixes and improvements

This month has been packed with improvements.

We’ve got a built-in Data Source that scans for errors in CODEOWNERS files.
We’ve got a built-in Data Source that ensures your branch protection is correct.
The built-in Snyk Data Source has been updated to use the github.com/project-slug where possible.
We fixed some rounding errors in our Check calculations.
We fixed a labelling issue where there were two inputs called Type on the New Check form.
We fixed a bug where regex comparison results were exporting incorrectly.
We fixed a bug where selected annotations or labels in the filters of Scorecards couldn’t be deselected.
Improved the performance of Data Sources which iterate over repos containing hundreds of thousands of files.
The “is not blank” operator used to incorrectly ask for a value. Now it doesn’t.
We had mislabeled the “Number” type as “Integer”. This is fixed.
Markdown is now supported in Check descriptions so you can link to supporting documentation.
The Proxy input is now a typeahead so it’s easier to find your favorite proxy.
Scorecard rings now calculate in a more accurate way. A Component used to have to pass all checks on a scorecard to be counted in the ring. Now all checks that are passed will contribute to the score.
We brought consistency and sanity to the positioning of the Re-run, Recalculate and Refresh buttons on Scorecards, Checks and Data Sources.
Fixed some bugs which would prevent scorecards from showing up in the catalog in some cases.
GitHub based Data Sources now filter out archived repos.
Improved a bunch of help text sections on the New Data Source page.

Catalog

Decorators

Decorators allow you to easily add metadata to the stuff you track in your Roadie Backstage catalog. Check out the blog post for full details and to learn how to use them. One simple use case is to use decorators to add a Team Charter and some links to groups in the catalog.

Bug fixes and improvements

API specs are now searchable. Start your endpoint searches with a forward slash.
Renamed “Create…” in the sidebar to Templates. “Create…” was ambiguous.
We removed Tools from the sidebar and moved its pages into Administration.
The card used for displaying Links now hides itself from the interface when there are no links.
Catalog table column visibility is now independently set for each Kind of Entity.
Catalog tables can now display a links column.
Entity Titles are now displayed in the catalog table instead of name when possible.
Admins can now change the sidebar color in the Theme settings.
We improved the Locations Log and renamed it to Administration → Entity Locations. You can also find it in the tabs of the Import page.

Measuring Catalog Correctness and completeness

Mon, 16 Oct 2023 15:00:00 GMT

A comprehensible Catalog in Backstage is the ultimate goal for many teams. To achieve it you need a plan and a way of tracking your progress. This article will not delve into the former, as a suitable plan is something only you and your team can come up with.

But measuring how well your Catalog is doing is something everyone needs and can help you tune your plan along the way.

What is a correct and complete Catalog?

Backstage is used by organizations large and small, and for different purposes. Thus, there’s no universal definition of “correct” or “complete” for the Catalog. But let me explain what I’m referring to in this context.

With “correct,” I mean the data in individual entities. It’s not enough that a component shows up in the Catalog. It must have all the information required for its type. For example, if a component of type service shows up in the Catalog but doesn’t have PagerDuty and API Docs annotations, it’s not rich enough to meet my definition of “correct.” Most importantly, all the metadata and annotations in the entity must correspond to the entity for it to be correct.

With “complete,” I am referring to coverage of software assets surfaced in the Catalog. The absolute meaning would refer to having every single component, user, and other kind of entity in the organization reflected in Backstage. However, this is rare. More often, teams may define “complete” within a scope, such as having an entity tracked for every business-critical service, or covering a subset of teams in the Catalog.

Tracking Catalog correctness

Once you have more than a few entities registered in your Catalog, tracking how rich and correct the metadata in each of them is becomes impossible, especially if you’re welcoming dev teams to onboard their own components.

Here’s when you need Tech Insights. Tech Insights lets you check data points across your Catalog entries and summarize the findings in Scorecards.

Tech Insights is available as building blocks in the OSS plugin and as a fully-fledged Scorecards solution as a paid addon in Roadie. The OSS version will provide you the fundamentals so you can implement your own Scorecards solution, while Roadie offers a no-code UI with hundreds of pre-built checks. You can apply the ideas in this article using either of them.

What to check for?

With Tech Insights, the possibilities of what you can check for are pretty extensive. Thus, I recommend designing a schema that defines what constitutes best practices in your Catalog.

For example, I expect every component to have, at a minimum, a name, a description, a group owner, and a valid type. To cover all of these criteria I need to define several checks against the component metadata:

Check that name and description are present: have Tech Insights check that the metadata fields you’re interested in are set on each entity.
Check that the component is associated with a GitHub repository: linking a component to GitHub unlocks several features in Backstage, so make sure to check that the github.com/project-slug annotation is present. You can also use a regex to check that it matches the expected format.
Check that the type of entity is valid: the type field in Backstage can be an arbitrary string, which makes it prone to human error. You can check that the type defined in each component belongs to a set of strings that you expect.
Check that components are labelled correctly: if you need your components to be tagged with their corresponding region or tier, using labels is the simplest way to go. You should therefore assess which components carry the expected labels.

You can check any other detail from the entity metadata, and decide whether each check applies to all entities or only to a subset of them.
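
To make the checks above concrete, here is a sketch of a component that would pass all four of them. The names, the organisation slug, and the label values are illustrative assumptions; what counts as a valid type or label is whatever your own schema defines.

```yaml
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: checkout-service                                   # name is present
  description: Handles basket checkout and order creation  # description is present
  annotations:
    # Links the component to its GitHub repository (org/repo format)
    github.com/project-slug: example-org/checkout-service
  labels:
    region: eu-west-1             # illustrative label expected by the schema
    tier: tier-1                  # illustrative label expected by the schema
spec:
  type: service                   # must belong to the set of types you allow
  lifecycle: production
  owner: group:checkout-team      # group owner (illustrative)
```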

Once you have all the checks you want to apply defined, you can aggregate them in a Scorecard that keeps track of which entities are complying with all the checks.

Go beyond Catalog correctness

Happiness in your Catalog becomes a measurable objective with Tech Insights. However, with Tech Insights you can check more than the validity of entity metadata. For example, you can

Overview Docker image migrations
Track on-prem SonarQube metrics
Or, enforce branch protection.

If you’d like to see what Roadie’s Tech Insights can do for your team, feel free to book a demo!

Live custom Backstage plugins within seconds

Mon, 02 Oct 2023 23:00:00 GMT

Develop custom plugins with a live preview within Roadie Backstage, and deploy them to production in seconds with the Roadie CLI.

An Internal Developer Portal is only as good as its ability to tackle your teams’ unique challenges. While Roadie comes with dozens of integrations (such as PagerDuty, ArgoCD, and Sentry), your teams most likely rely on custom workflows, private systems, or in-house tools as part of their software development life cycle. Bringing those specific requirements into your Developer Portal can streamline your developer experience significantly.

For example, Lunar Bank built a dead-letter management plugin, while American Airlines centralized their permissions requests through a custom section of their Backstage instance.

Roadie offers tools to simplify the development and deployment of your custom plugins.

Getting Started scaffolder template

Register Roadie’s New Custom Plugin scaffolder template to jump-start your plugin development. The template will ask you a few details about your plugin and then create a new repository with a basic plugin structure and sample code.

Dev previews within your instance

Once you have your custom plugin running on your machine, you can get a live preview right within your Roadie instance. Your instance will automatically be updated with any code changes which will be applied when you refresh the page.

When writing a custom plugin for Roadie, you can use APIs provided as React hooks so you don’t have to deal with async requests or authentication at the plugin level.

You can preview all your plugin’s views within Roadie: pages, widgets, and cards. Furthermore, you can rely on preview entities to help you develop faster.

Deploying custom plugins to Roadie

Using the Roadie CLI, you can build and deploy your Backstage plugins and see them in your instance within a few seconds after you push them upstream. The simplest option for deployment is to let Roadie host your plugin, but you can also deploy your plugin to other services like Netlify or GitHub Pages.

Our First 12 Month SOC2 Type 2 Report

Tue, 26 Sep 2023 23:00:00 GMT

We’re very excited to say that we just gained our first 12 Month SOC2 Type 2 Report! This report is a big deal for us and shows just how dedicated we are to keeping your data secure and private.

The SOC2 Type 2 certification is a third-party audit that assesses an organisation’s compliance over a period of time, to ensure its security, availability and confidentiality controls are operating as they should. Back in July 2022 we achieved our first SOC2 Type 2 report, but it covered only a 3 month assessment period. This time around we were audited over an entire 12 month period.

A SOC2 Type 2 audit covers all the nitty-gritty details, like how we handle data, control access, respond to incidents, and keep a close eye on things. We’re leaving no stone unturned when it comes to security and compliance.

We believe getting a 12 month SOC2 Type 2 certification shows just how committed we are to keeping your sensitive information safe and is a testament to the hard work and dedication of our entire team. We will continue to demonstrate this commitment year on year as we continue to comply with the SOC2 Type 2 standard. We also aim to expand our compliance to additional standards as we grow.

If you are an existing customer and would like to see a copy of our SOC2 Type 2 Report just reach out via any of the usual channels. We would be happy to share it!

Decorators for rich Team pages

Mon, 25 Sep 2023 23:00:00 GMT

Today we released a feature we call Decorators. Decorators allow you to easily add metadata to the stuff you track in your Roadie Backstage catalog. This metadata is stored within Roadie, and not written to YAML.

How to use Decorators

Using Decorators is simple:

Visit an Entity in your Roadie Backstage catalog (e.g. a team, service or system).
Click the three dots in the top right corner and click “Decorate entity”.
Add links or annotations to the Entity and press Save.

Decorators you create are stored inside Roadie and not written back to the YAML file that backs the Entity.

This means that Entity metadata can come from multiple places for the same Entity. One annotation could come from YAML, and another from Decorators.

You can see where a specific piece of metadata comes from by clicking the three dots again and clicking “Inspect entity”. In the case below, the backstage.io/source-location annotation is internal to Backstage and the other items are applied by Roadie Decorators.

Why we’re doing this

The introduction of Decorators is a slight deviation from the Backstage way of doing things, so it’s important that we explain why we’re doing this.

Auto-ingested sources need decoration

Backstage implementations frequently source the hierarchy of Users and Groups from a tool like GitHub Teams, Okta, or a Human Resources application like BambooHR. To support this, Backstage has a number of integrations into these tools.

These integrations will typically stream the hierarchy of Users and Groups into the Backstage catalog.

The problem with automatically ingesting Users and Groups is that Backstage users don’t get a chance to enrich their Group with information like links to Slack channels or a team charter. This leads to dead looking Group pages in Backstage.

Decorators give Backstage users a way to enrich their Group with the information that they want to show-off.

What this is not

We’re not introducing anything brand new in the Backstage ecosystem and we’re not introducing vendor lock-in.

There’s precedent

Backstage itself adds internal annotations to each Entity. These annotations are not written back to the YAML files. They are instead stored inside the Backstage database.

You can see some examples of this internal metadata here:

Roadie is simply piggybacking on this mechanism, to add the ability to store links and annotations.

We’re Backstage API compatible

We’re not introducing any Roadie-specific changes to the Backstage API or the spec for Backstage YAML files. We’re just making it easier to add values to the existing spec.

Entity Decorations are available via the standard Backstage HTTP API that we expose, so you can always write them back to YAML if you wish.

In the future, we will look to support “exporting” Decorators into any YAML file that backs the entity.
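
Because Decorators only fill in fields that already exist in the spec, a decorated Group could be expressed in standard entity YAML along these lines. This is a sketch: the group name, URLs, and link titles are placeholders, and the same data would normally live inside Roadie rather than in a file.

```yaml
apiVersion: backstage.io/v1alpha1
kind: Group
metadata:
  name: engineering                 # illustrative group name
  description: The engineering department
  links:
    - url: https://example.slack.com/archives/C0000000   # placeholder URL
      title: Slack channel
      icon: chat
    - url: https://example.com/engineering-charter       # placeholder URL
      title: Team charter
spec:
  type: team
  children: []
```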

New Catalog UI, certified templates, more tutorials

Mon, 04 Sep 2023 23:00:00 GMT

This month we’re rolling out a huge visual update to the catalog with much more space to get your work done. We’ve also got a bunch of new Tech Insights tutorials to help you improve software across your org.

Catalog

New catalog page preview

You will shortly see a new catalog page roll out on Roadie. This update affects the main software catalog table, and the filters around it.

This new catalog table brings a number of enhancements:

More horizontal space for reading the table. We’ve moved the filters to the top so the table is wider.
Per user configurable columns. You can customize the table to show the info that’s important to you. We’ll soon be persisting column choice in your browser so you can pick up where you left off (this is coming imminently).
Kind specific columns. Groups used to have an owner column and Users were missing Display Name. We’ve tidied these up and introduced more sensible defaults.
New table features. Configurable densities and full screen mode make for slicker presentation. Filter highlights make it easier to find what you’re looking for.
Persisted filter choices. If you mostly work with Templates, we’ll keep you on the template page when you navigate away and back.
Sharable filter choices. Filters will be in the URL so you can share a link to a specific subset of data.

All this is building up to the ability to bring Tech Insights data front and centre in the catalog. We want to show scorecards in a column so you can drive more action around important migrations and software quality issues. More on this in the coming weeks and months.

Fixed: Disappearing Azure repos entities

We spent 3 weeks tracking down and fixing a tricky bug that would cause Entities discovered from Azure Repos to periodically and temporarily disappear from the catalog.

This is demonstrated by the wiggliness of the entities count before the fix, compared to how flat it is after.

It turns out the Azure APIs don’t return consistent results unless a sort is specified on the queries. Here’s the upstream fix we made to Backstage.

This is a really good example of the value Roadie adds. Are bugs like this how you want your Developer Experience team spending their time?

Tech Insights

Group check data by owner

Check results are now grouped by owner as well as by Component. This makes it easier to track down the team who own the most Components which are failing the check.

We’re currently working to expand this to scorecards, and to aggregate the data up and down the hierarchy of teams, so you can view it at any level.

New tutorials

We added 4 Tech Insights tutorials this month. Learn how to…

Bug fixes and improvements

We fixed a bug where some Data Sources would fail with an Out of Memory error.
We now support the YAML content type response when sending HTTP requests in Data Sources.
Data Sources can now send POST requests for GraphQL APIs and other use cases.
We rolled out a new version of our broker to patch a security vulnerability.

Scaffolder

Certified label for scaffolder templates

You can now add the Certified label to scaffolder templates to designate them as Platform Team approved and ready for use.

Just add the certified annotation to make this work.

apiVersion: scaffolder.backstage.io/v1beta3
kind: Template
metadata:
  name: my-template
  annotations:
    roadie.io/certified: "true"

Bug fixes and improvements

The scaffolder now supports a task to open a pull request against Azure repos.
We updated our scaffolder docs page to include more examples and APIs.

Backstage Gets Quality and Compliance Scorecards with Roadie

Mon, 31 Jul 2023 15:00:00 GMT

DUBLIN, July 2023. Roadie, the Backstage-based SaaS Developer Portal, announced the general availability of its first feature on top of Backstage: Tech Insights.

With Roadie Tech Insights, software engineering teams can use Scorecards to monitor their software assets in the Backstage catalog and make sure they meet the quality and compliance standards they have set. Scorecards are based on Data Sources and Checks that the user can customize through a UI.

Roadie Tech Insights comes with more than a hundred different types of checks across dozens of data sources like GitHub, Datadog, and Snyk (plus, you can add custom sources). The feature release was covered by DevOps.com, TFiR, and BENZINGA.

Roadie was founded in 2020 to help organizations boost developer productivity through a Developer Portal. The company raised US$3.7 million in a seed funding round led by Boldstart Ventures and Firstminute Capital.

Back in 2021, Roadie released an open-source Tech Insights plugin with primitive APIs so that any Backstage adopter could build their own Scorecards functionality. A few enterprise Backstage adopters like HP and Lunar Bank have implemented their own solutions based on this version of the plugin. However, the open-source Tech Insights requires each team to design its own UI and implement its Data Sources and Checks, on top of managing integrations, security, and databases.

The new fully-fledged version of Tech Insights—available only to Roadie customers—features more than a hundred facts that you can check against across dozens of data sources like GitHub, Snyk, and PagerDuty. For example, Martin Froehlich, vice president of engineering at SumUp, is using Roadie and Tech Insights to “promote and track adoption of supply chain security and code analysis tools like Dependabot, CodeQL, and others, across all of our production service repositories.”

Roadie Tech Insights helps engineering teams build a culture of quality and accountability. Using Tech Insights, teams are nudged towards improved software quality over time. Antony Rinaldi, head of architecture and application platform at Baillie Gifford, is using Roadie Tech Insights to “promote and track adoption of security tools with our 250 developers.”

He also added that Baillie Gifford has “created automated checks and scorecards that help us understand which teams have adopted the tools letting us understand how successful our rollout initiative is. We’re excited to expand it to more use cases over the coming months.”

Roadie Tech Insights is a natural development to make the most out of a Software Catalog and keep quality under control.

Using Backstage’s Scaffolder to Fill up your Catalog

Mon, 24 Jul 2023 15:00:00 GMT

Backstage is a great framework for building Internal Developer Portals. However, having a successful Dev Portal requires more than simply standing it up. Independently of whether you go with self-hosted or managed Backstage, the task of onboarding entities into your Software Catalog will be primarily in your hands.

Most Dev Portals rely on putting metadata files (YAML, Terraform, etc) into the services so they can be updated by the teams that work on them. The more friction you cut from the process of creating these metadata files, the easier it is to convince people to create them.

Backstage’s Scaffolder can make the software onboarding a one-click experience that gives Developers a chance to try out your Dev Portal and easily add their own services to the Catalog.

In this article, I’ll show you how you can write a software template that prompts the user to tell you about their service and opens a PR on their repository. Once that PR is merged, the Catalog will pick up the service automatically (if you have auto-discovery enabled).

Onboard your service with a few clicks

The experience you’re after will let developers onboard their service by filling in a few inputs. The Scaffolder then takes care of generating a catalog-info.yaml file and opening a PR against the service’s repository.

In the first section, you’ll ask for basic information about the service. In the second one, you’ll prompt the user to locate their repo and associate it with an owner. And finally, you’ll ask for integration details such as ArgoCD’s app name or PagerDuty integration key.

Once you’ve collected all the information, your software template will generate a catalog-info.yaml file and open a PR against the service’s repository.

Writing a scaffolder template

Software templates in Backstage have two parts: parameters and steps. The parameters define the inputs you want from the user. The steps are actions (such as cloning a repo, editing files, creating an AWS secret, or making an HTTP request) that are run one after the other. In this section, you’ll learn how both parts can be implemented for a service onboarding template.

Defining parameters

Let’s take care of the parameters first. Parameters can be organized into sections in the UI. In this case, you want to have three sections: one for general information, one for the repository, and one for additional details. Here’s what the code that describes the form presented in the last section could look like:

parameters:
    - title: What is your service about?
      required:
        - name
      properties:
        name:
          title: Service name
          type: string
          description: Human readable name. We'll generate a dasherized version from it.
        description:
          title: Service description
          type: string
        owner:
          title: Service Owner
          type: string
          description: Owner of the component
          ui:field: OwnerPicker
          ui:options:
            catalogFilter:
              kind: Group
    - title: Where is your codebase?
      required:
        - repoSlug
      properties:
        repoHost:
          type: string
          default: github.com
          ui:widget: hidden
        repoOwner:
          title: Repository owner
          type: string
          default: roadiehq
          enum: ['roadiehq', 'jorgelainfiesta']
        repoSlug:
          title: Repository slug
          type: string
    - title: Integrations (optional)
      properties:
        argoAppName:
          title: Argo CD App Name
          type: string
        pagerdutyKey:
          title: PagerDuty integration key
          type: string

The input that the Scaffolder generates is based on the type of property. In all of the cases in this example, we’re dealing with strings but you can also specify numbers, objects, and arrays. You can also specify a component to be rendered as the input with ui:field. For a comprehensive list of these options check out our Scaffolder documentation.

You’ll want to customize the form according to how you want to register services in your Catalog. For example, I’m hard-coding the repository’s host and providing two owner options, but you may have more than one host option. You could also use a dynamic select box to make it easier to pick a repository.

To help you get the form right, Backstage (and thus, Roadie) comes with a form editor that you can find under /create/edit → Template Editor. It’s quite handy, especially as your form becomes more complex.

Defining steps

Now, let’s review the steps side of the template. You’ll need two steps. First, you’ll fetch an existing YAML file and replace placeholders within it with the user’s values. The resulting file will be available in the Scaffolder workspace’s root with the same file name. From there, it’ll be possible to open a Pull Request against the target repo with the content of the Scaffolder workspace, which will be a catalog-info.yaml file that describes the service.

This code shows what the steps look like:

steps:
      - id: fetch-template
        action: fetch:template
        input:
          url: ./skeleton
          values:
            name: ${{ parameters.name }}
            description: ${{ parameters.description }}
            owner: ${{ parameters.owner }}
            repoOrg: ${{ parameters.repoOwner }}
            repoSlug: ${{ parameters.repoSlug }}
            argoAppName: ${{ parameters.argoAppName }}
            pagerdutyKey: ${{ parameters.pagerdutyKey }}
      - id: create-pull-request
        name: create-pull-request
        action: publish:github:pull-request
        input:
          repoUrl: ${{ parameters.repoHost }}?owner=${{ parameters.repoOwner }}&repo=${{ parameters.repoSlug }}
          branchName: onboard-to-catalog
          title: Onboard service to Catalog
          description: This PR adds a meta data file about this service so that it can be registered in our software catalog.

Let’s unpack what’s going on in each step. In the fetch-template step, I’m loading a relative path that contains a file with placeholders formatted with Nunjucks templating. The file is a catalog-info.yaml that looks like this:

apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: ${{ values.name | replace(" ", "-") | lower}}
  title: ${{ values.name }}
  description: ${{ values.description }}
  annotations:
    github.com/project-slug: ${{ values.repoOrg }}/${{ values.repoSlug }}
    {%if values.argoAppName %}argocd/app-name: ${{values.argoAppName}} {% endif %}
    {%if values.pagerdutyKey %}pagerduty.com/integration-key: ${{values.pagerdutyKey}} {% endif %}
spec:
  type: service
  owner: ${{ values.owner }}

You can manipulate strings using filters, use conditional blocks, and use pretty much any other templating option available in Nunjucks.
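To make the filter chain and the conditional blocks in the skeleton more concrete, here is a minimal Python sketch of what they compute. It is not how Nunjucks works internally, and the function names `dasherize` and `optional_annotations` are hypothetical, chosen only for this illustration.

```python
from typing import Dict


def dasherize(name: str) -> str:
    """Mirror the skeleton's `replace(" ", "-") | lower` filter chain."""
    return name.replace(" ", "-").lower()


def optional_annotations(argo_app_name: str = "", pagerduty_key: str = "") -> Dict[str, str]:
    """Mirror the `{% if %}` blocks: only emit annotations that have a value."""
    annotations: Dict[str, str] = {}
    if argo_app_name:
        annotations["argocd/app-name"] = argo_app_name
    if pagerduty_key:
        annotations["pagerduty.com/integration-key"] = pagerduty_key
    return annotations


print(dasherize("Meta Queries Post Processor"))
print(optional_annotations(argo_app_name="meta-queries"))
```

The point of the sketch is that a user who leaves the optional integration fields empty ends up with a catalog-info.yaml that simply omits those annotations.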

Regarding publish:github:pull-request, the only thing worth mentioning is that repoUrl doesn’t look like a familiar URL. That’s because it’s a standard reference used across the Scaffolder actions rather than an actual repository URL.
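If the shape of repoUrl still feels odd, the following Python sketch shows how a string in the form used by this template (host, then owner and repo as query parameters) can be built and taken apart. The helper names are hypothetical and this is only an illustration of the format, not the Scaffolder's own parsing code.

```python
from typing import Dict
from urllib.parse import parse_qs, urlsplit


def build_repo_url(host: str, owner: str, repo: str) -> str:
    """Assemble the reference the same way the template's repoUrl input does."""
    return f"{host}?owner={owner}&repo={repo}"


def parse_repo_url(repo_url: str) -> Dict[str, str]:
    """Split a `host?owner=...&repo=...` reference into its parts."""
    # Prefixing "//" makes urlsplit treat the leading segment as the host.
    parts = urlsplit("//" + repo_url)
    query = parse_qs(parts.query)
    return {
        "host": parts.netloc,
        "owner": query["owner"][0],
        "repo": query["repo"][0],
    }


reference = build_repo_url("github.com", "roadiehq", "my-service")
print(reference)
print(parse_repo_url(reference))
```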

To help test the steps’ inputs and outputs, you can dry-run your template with /create/edit → “Load Template Directory.”

Conclusion

Minimizing the friction of using your Catalog will improve the adoption rate. Plus, it can help you give developers a clear first touch point to start getting familiar with the Developer Portal that you’re building for them. You can find the complete template in our software templates repository.

Backstage Users Unconference: a wrap-up

Mon, 26 Jun 2023 15:00:00 GMT

Last week we had a blast hosting the 4th Backstage Users Unconference! Over 95 people showed up to share how they’re using Backstage and brainstorm solutions to common problems. The vibe in this edition was more intimate, as we were able to go deeper into the discussions around documentation beyond markdown, contributions to the portal by other teams, and creating infrastructure through the Scaffolder.

The people make the Unconference

We had over 95 attendees from around the globe, with people connecting from cities in the US such as San Diego, Chicago, Omaha, and Orlando. We also had the presence of peeps from Buenos Aires, Madrid, Paris, London, and Cambridge, and jumped to India and Singapore after passing by Kazakhstan and Israel.

Most of the people who attended are at a growing internal adoption stage in their Backstage journey (40%), followed by a large group of teams getting started with their PoCs (34%). The Unconference also got input from people who have achieved weekly active usage of their Portal across the board (6%) and folks who have optimized usage for a subset of their users (7%).

Making Documentation easier

Two out of the eight topics chosen by the community revolved around TechDocs. The first discussion was higher-level and focused on sharing how teams are using Backstage to bring their varied sources of documentation under the same roof. The second topic had a more technical perspective on how to provide tooling to developers writing TechDocs so it’s easier for them to get their docs into Backstage.

Infrastructure and contributions

Using the Scaffolder to spin up infrastructure is a common use case, which got some interesting takes regarding how to manage the templates and the possible race conditions as you manage Terraform files and various PRs. Another topic of discussion was opening up the Portal to receive plugins from other teams and how to define the boundaries for integrations and outline a governance approach.

Recordings

Due to the unstructured nature of the conversations that occur in the Unconference, not all sessions resulted in a recording that would be interesting to re-watch. We’re working to take pieces from at least two of the sessions to make videos that might be helpful to drive the Backstage discussion further. Stay tuned, we’ll be uploading them to our YouTube channel soon!

Roadie is sponsoring "Unpacked": The Ultimate Virtual DevOps Conference

Thu, 01 Jun 2023 15:00:00 GMT

We are thrilled to announce that Roadie is sponsoring the first-ever “Unpacked.” This virtual conference, hosted by Cloudsmith, promises to bring together DevOps professionals and engineering leaders worldwide. Unpacked aims to enlighten attendees on the intricacies of securing and scaling software delivery, discussing relevant topics such as software distribution at scale, securing open source dependencies, and reducing complexity and cost.

Why Attend Unpacked?

Cutting-Edge Insights: Unpacked brings together a prestigious lineup of industry leaders who will delve into the latest trends, strategies, and technologies shaping the world of software delivery.
Software Distribution at Scale: Scaling software delivery is a complex task, and Unpacked aims to demystify this process. The event will explore strategies and best practices for distributing software.
Securing Open Source Dependencies: With open source software playing a crucial role in modern development, ensuring its security is paramount. Unpacked will address this challenge, offering attendees valuable guidance on securing open-source dependencies and mitigating potential vulnerabilities.
Reducing Complexity and Cost: Complexity and cost often go hand in hand with software delivery. Unpacked will provide insights into simplifying processes, reducing unnecessary complexity, and optimizing costs associated with software distribution, enabling organizations to achieve greater efficiency and ROI.

“Unpacked” is set to be a groundbreaking virtual conference that will equip attendees with the knowledge and tools they need to excel in software delivery. With an exceptional lineup of industry leaders and a focus on scaling, security, and reducing complexity, Unpacked is a must-attend event for DevOps professionals and engineering leaders. Roadie is proud to be a sponsor of this transformative event, demonstrating our ongoing dedication to advancing software delivery practices. We look forward to virtually connecting with fellow attendees and exploring the future of software distribution together at “Unpacked”!

Backstage during KubeCon EU ‘23

Mon, 24 Apr 2023 15:00:00 GMT

Backstage was undoubtedly one of the recurring topics of the conference, with at least five talks dedicated to the framework and several others referencing it. It was common to overhear people mentioning “Backstage” as you rushed through the busy venue trying to catch a talk, only to find out the room was full even if you had arrived 10 minutes early.

By far, KubeCon + Cloud Native Conf EU 2023 has been the most impressive since the re-start of the conference series after the pandemic. With 10 thousand people in attendance and fascinating talks, being unable to enter the room, even if you arrived 10 minutes early, became commonplace.

Let’s address the elephant in the room: there was no Backstage booth in the Project Pavilion, as there was in Detroit. A series of miscommunications caused this unfortunate issue, but after talking with other members of the community, including the maintainers, I can assure you: this won’t happen again!

Despite that, I was lucky to have talked with many people in the community, including OSS partners Red Hat and VMware, current Backstage adopters, and people who are just getting started with the framework.

Roadie mentioned in The Pragmatic Engineer

Tue, 21 Mar 2023 15:00:00 GMT

With a readership of 160k+ tech professionals, The Pragmatic Engineer has received a lot of praise from the enterprise software and scale-up industry. For their column on Backstage, the Roadie team had the opportunity to share our experiences with Gergely Orosz. The outcome of Gergely’s research is truly a deep-dive pillar that explains how Backstage came to be, its different features, adoption stories, and many other aspects of the framework.

If you’re a Pragmatic Engineer subscriber, make sure not to miss out on this one. Gergely goes through what a Developer Portal is, how to get started, and even compares closed-source alternatives. Check it out!

Martina Iglesias Fernández, CTO of Roadie, saw the inception of Backstage from within Spotify and shared her story with The Pragmatic Engineer. At the time, she was a lead backend engineer at Spotify, where she saw the pain points that drove the need for an Internal Developer Portal. In the column, she explains in detail how Backstage came from an internal tool called System-Z.

The Roadie team shared other experiences with Gergely, including information about Backstage’s main features and the typical adoption process from MVP to org-wide release.

New in Roadie: Automated language tagging for GitHub entities

Mon, 20 Feb 2023 15:00:00 GMT

Part of understanding your software assets is knowing, at a glance, the languages used in each of them. Now, Roadie can automatically bring this information from GitHub and associate the corresponding languages to your entities through a tag or a label, depending on your preference.

Imagine you’re browsing your software catalog looking for a library that you can use for setting up your Go API routes or to find patterns others have used to set up a test harness for their Java apps. In these cases, it would be helpful to be able to filter entities by language.

Since GitHub already compiles this information for you, you only need to bring it into your Developer Portal. Now, Roadie can do this automatically: we’ll label or tag your entities with their associated language, according to your preference.

If you’re a Roadie customer, you can get your Catalog to start tagging your entities with the corresponding languages by switching the feature on in Administration > Settings > Catalog.

Incident Management in your Backstage Developer Portal

Fri, 20 Jan 2023 15:00:00 GMT

It’s a typical operations day. You’re taking cozy sips of your lapsang tea, and there are only a few meetings on the horizon. Life is good. But, just then, an alarm goes off: the MetaQueriesPostProcessor service is down. It’s a relatively new service, not too critical. Given you’re not entirely familiar with the service, it wouldn’t hurt to get some more context of what’s going on, so you visit your Backstage-based Internal Developer Portal:

From the service’s entity page in Backstage, you can get a glimpse of the recent workflow runs, the service’s dependencies, and other issues like code quality, vulnerabilities, and documentation without jumping around platforms. Along with the ownership information, you’ll be up to speed to address this incident and contact responsible parties more easily. Additionally, other colleagues who work with this entity will get visibility on the ongoing incident and get ready to collaborate to resolve it.

Thankfully, Backstage has a handful of plugins that integrate incident managers into your Internal Developer Portal. Below is a list of them, in alphabetical order:

FireHydrant

With FireHydrant’s Backstage plugin, you can manage your incidents within Backstage. Teams can stay organized and quickly identify information about services like active incidents and healthiness analytics. The plugin includes an entity widget so you can embed FireHydrant’s information on the entity’s overview page.

To install it on a self-hosted Backstage instance, check out our FireHydrant Plugin guide. If you’re a Roadie customer, you don’t need to install it; just set it up in your Admin panel.

OpsGenie

The OpsGenie Backstage plugin offers two options: an entity widget and an additional standalone page. The entity widget shows the alerts for its corresponding entity on the overview page:

Additionally, the OpsGenie plugin comes with a standalone page where you’ll get to see who’s on call and an aggregated list of alerts happening in the entities registered in Backstage.

To install this plugin on a self-hosted Backstage instance, follow our OpsGenie Plugin Guide. If you’re a Roadie customer, check out the no-code guide to OpsGenie.

PagerDuty

With PagerDuty’s Backstage plugin, you can view the ongoing incidents related to an entity, as well as who’s on call. Additionally, you get handy buttons to create an incident right from the entity’s overview page.

To install it on a self-hosted Backstage instance, check out our PagerDuty Plugin Guide. If you’re a Roadie customer, set up PagerDuty from your Admin panel.

Rootly

The Rootly Backstage plugin is the most generous: it provides you with three options for viewing incident information in your Internal Developer Portal. First, there’s the standard entity widget for the overview page that every plugin offers:

You can see the ongoing incidents right on the entity overview page, and it gives you handy links to create an incident or see a more detailed list. Second is a dedicated entity tab to get more details about the ongoing incidents for the associated entity:

Third, Rootly offers you a dedicated page where the incidents from across all entities are aggregated:

To install this plugin in a self-hosted Backstage instance, follow our Rootly Plugin Guide. If you’re a Roadie customer, set up Rootly in your Admin panel.

Splunk On-call

The Splunk On-call (formerly VictorOps) plugin for Backstage provides you with a widget for your entity overview page. The plugin will show a list of ongoing incidents and provide links to open an incident or acknowledge/resolve one right from the entity page. If you want, you can also set the widget to read-only so that no actions can be triggered from Backstage.

To install this plugin in a self-hosted Backstage instance, check out the plugin’s README. If you’re a Roadie customer, support for this plugin is being implemented at this very moment as requested last week by a customer.

And that’s a wrap! Setting up your incident manager with your Backstage Internal Developer Portal can help you manage incidents and prepare your teammates to collaborate when needed. If you know of an incident management plugin that I missed, please let me know through Twitter or LinkedIn! I’ll be pleased to add it to the list.

Roadie customers are not affected by Backstage’s RCE vulnerability

Wed, 23 Nov 2022 06:00:00 GMT

Last week, the Oxeye research team published a report of a vulnerability found in Backstage that could allow a threat actor to execute remote code by exploiting an outdated vm2 third-party library. The Backstage team patched this issue in version 1.5.1 back on August 29th. Roadie customers are unaffected by this vulnerability because their instances are upgraded regularly (currently at v1.8) and due to extra security measures in the Scaffolder implemented in Roadie from the beginning.

The problem

The remote code execution (RCE) vulnerability was possible due to a known issue in the vm2 library used in the Scaffolder, which has been patched since Backstage 1.5.1. By overloading definitions through a software template, the researchers managed to create a function outside the Scaffolder’s sandbox context that allowed them to execute arbitrary code in the instance.

Furthermore, the researchers pointed out that Backstage by default doesn’t provide authentication for backend requests. This allowed unauthenticated actors to access the Scaffolder, and therefore, exploit the vulnerability from outside the Developer Portal.

Roadie customers are not affected

Roadie customers were running on Backstage 1.8 at the time of the vulnerability disclosure and were patched for this vulnerability shortly after Backstage 1.5.1 was released because the team keeps a close eye on CVE notifications.

Furthermore, due to Roadie’s architecture, the risk from this vulnerability was greatly mitigated for Roadie customers. Roadie executes templates on a transient ECS task with access to scoped and temporary credentials required for the execution of the template instead of the default execution strategy.

Also, Roadie provides authenticated access to both frontend and backend requests, which means no unauthenticated actor could have accessed the Scaffolder in the first place.

Upgrade your instance ASAP

If you’re running a self-hosted Backstage instance and still use a pre-1.5.1 version, you’re facing a vulnerability with a 9.8 CVSS score, which is the most severe for exploitability and impact.
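As a rough aid for checking where an instance stands, here is a minimal Python sketch that compares a Backstage release version against 1.5.1, the release this post names as containing the patch. The helper names are hypothetical, and the parser only handles plain numeric versions such as "1.4.0" or "v1.8"; pre-release suffixes are out of scope for this sketch.

```python
from typing import Tuple

# The release that shipped the fix, according to this post.
PATCHED_VERSION: Tuple[int, ...] = (1, 5, 1)


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn "v1.8" or "1.5.1" into a tuple of integers for comparison."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def predates_patch(version: str) -> bool:
    """True when the given release is older than the patched one."""
    return parse_version(version) < PATCHED_VERSION


for candidate in ["1.4.0", "1.5.0", "1.5.1", "v1.8"]:
    print(candidate, "needs upgrade" if predates_patch(candidate) else "patched")
```

Tuple comparison treats "1.5" as older than "1.5.1", which is the conservative answer for this purpose.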

If you’d rather not worry about running upgrades again, switch over to Roadie! We’ll keep your instance safe through regular upgrades and extra security layers. Request a demo!

Backstage consolidating its role in the Cloud native ecosystem

Thu, 03 Nov 2022 06:00:00 GMT

Last week, the very first BackstageCon brought along news from influential firms voicing the maturity that the Backstage project has achieved. At the moment, Thoughtworks, Red Hat, Gartner, VMware, and the Linux Foundation endorse Backstage as a viable solution for improving the developer experience of growing engineering teams through a Developer Portal.

In the most recent Tech Radar, Thoughtworks moved Backstage to Adopt in their Platform quadrant. This means that their advisors “feel strongly that the industry should be adopting” Backstage. And in fact, Thoughtworks offers Backstage as part of their digital transformation offering to their customers.

On the other hand, Red Hat announced that they will start actively participating in the Backstage community. They will be contributing to the framework to support OpenShift and improve the Kubernetes experience in Backstage.

Additionally, VMware has been a commercial partner of Backstage, offering it as a UI for a security product in their Tanzu suite. VMware has committed an Open Source team dedicated to contributing back to Backstage Open Source.

Earlier this year, Gartner reported on Backstage as a solution for Developer Portals. The report also mentions Roadie as a solution to tackle the complexity of adopting the framework via the self-hosted route.

Finally, the Linux Foundation is also betting on Backstage after having witnessed the success it has brought to most adopters in the Cloud native space. The Linux Foundation is investing in creating learning resources, starting with an Introduction Course to the framework.

For Roadie customers, this news means you’re about to get all the benefits from the substantial contributions of a robust community without having to do anything. It also shows you’re on the right track by adopting Backstage through Roadie!

Wrap up: BackstageCon and KubeCon NA 2022

Mon, 31 Oct 2022 06:00:00 GMT

Backstage made its way to the center stage last week in Detroit, as maintainers, contributors, and adopters deepened their relationship and shared their excitement about the framework with the wider Cloud Native community.

Monday: BackstageCon

During the first conversations about setting up a Backstage-exclusive conference, the CNCF event organizers said we could aim for 50-100 attendees because it’d be the first time. Well, BackstageCon ended up being the largest co-located event at KubeCon + Cloud Native NA 2022 with 150+ attendees!

The event was co-hosted by Suzanne from Spotify and Martina from Roadie. They did a fantastic job at curating the line-up. All talks are available on YouTube, so check them out!

Roadie had a very orange table at BackstageCon, where we greeted everyone.

Of course, we enjoyed hanging out with each other after the conference. Here’s a picture of Roadie having dinner with Spotify folks, including the Backstage maintainers we all love.

Tuesday: Backstage Project Meeting

At first, a 4-hour meeting to talk about Backstage seemed a bit much. But once we got started, it turned out to be enough to only scratch the surface on topics like the sources of truth for the Catalog, frontend performance, maintainership, and adoption challenges. Having adopters, maintainers, and partners in the same room made ideas flow endlessly and led to actionable tasks to make it easier to adopt and contribute to Backstage in the near future.

In the evening, Frontside and Roadie had what is, without doubt, the best food anybody attending KubeCon enjoyed, at Yemen Café.

Wednesday-Friday: KubeCon

Roadie had a booth at KubeCon, where we explained Backstage and how we offered a managed version to people attending the event. We also were giving away organic cotton Backstage t-shirts, hand-printed in Barcelona. We heard they were the best t-shirts of the conference in terms of comfort!

We had the wonderful chance to meet and talk with our customers attending the event. Here we are with MyFitnessPal:

See you next time!

We all had a blast last week in Detroit and aligned on how to take Backstage further as it gains popularity. The next KubeCon is in Amsterdam from the 17th to the 21st of April, 2023. Hope to see you there!

Roadie.io is boosting Backstage Developer Portals for Scale-Ups with Scorecards

Mon, 24 Oct 2022 06:00:00 GMT

Dublin, Ireland, October 24th, 2022. Roadie.io, the company offering a CNCF Backstage SaaS option, is extending the most popular Developer Portal framework by introducing Tech Insights.

The Open Source version of Backstage is used by industry leaders such as HP, VMware, and Expedia Group. But, as highlighted by Gartner, it requires significant effort and dedicated staff to stand up and maintain. Instead of self-hosting Backstage, dozens of scale-ups like Netlify, Snyk, and MyFitnessPal have adopted it through Roadie’s managed option. Now, Roadie is expanding its offering by introducing Tech Insights, which doesn’t have an equivalent in the Open Source Backstage.

Tech Insights lets Roadie users create Scorecards to keep track of quality standards within their organization. This is useful for assessing software maturity and security compliance across all services.

Roadie users can define the criteria to be measured in their services via a Scorecard.

Given that Roadie Backstage users already have all their software assets registered in their Software Catalog, extracting insights is a valuable next step in any Developer Portal.

Roadie users can get an overview of the health of their ecosystem across services.

Roadie is currently working with a handful of design partners to develop Tech Insights to ensure it brings out the most value for leadership and developers.

Roadie now keeps the catalog in sync with your GitHub with the webhooks API!

Tue, 04 Oct 2022 22:00:00 GMT

As a Roadie user, editing a Backstage YAML file in your GitHub repo will result in those changes almost immediately appearing in your Catalog. Our team designed and implemented a GitHub integration based on webhooks to replace the default poll-based discovery shipped in Backstage.

Previously, we relied on Backstage’s default behavior for keeping the catalog up to date. This was a pull-based approach where Roadie polled your GitHub and kept the catalog in sync.

By default, the polling interval was set to 2 minutes. This is a long time to wait while you are in the middle of editing your scaffolder templates and still figuring things out.

Polling large catalogs would also result in many requests being sent to the GitHub APIs. This could result in rate limiting and a degraded user experience.

With this release, we are utilizing the GitHub webhooks API to get notifications when you change your Backstage YAML files.

We also added a new feature to the GitHub integrations settings page to be able to manually trigger a sync with your GitHub repos. This is useful if you added a catalog-info.yaml file to a repository where you did not have the Roadie GitHub app installed.

The benefits for Roadie users

We believe this new webhooks-based approach brings a number of benefits:

We eliminated the usage of Location entities for discovery. We can spare the additional fetches for the whole organization repositories for every configured github-discovery Location entity.
It results in an almost immediate reaction from the catalog when you push something to your configured branches.
Now you can safely rename your catalog files in your GitHub repo. (This will result in a deleted filename for the old file and an added one for the new file)
It can refresh your API entity when a referenced file (e.g. an OpenAPI or gRPC definition) is changed, if it is hosted in GitHub.

Read on for more juicy technical details about the implementation.

Tech Stuff

Let me walk you through this journey to implement and roll out instant updates for Roadie users!

The Past

Before webhooks, we relied on the default implementation of auto-discovery from Backstage. This used the processing loop, and the provided processors to ingest entities from GitHub organizations. We used the GithubDiscoveryProcessor from OSS Backstage.

It works like this:

This processor is configured and added to your catalog builder.
Whenever an entity is processed, the processor is evaluated to decide whether it should run.
The processor executes its logic when the processed entity is a Location entity and its type is github-discovery.
It fetches all of your repositories from your organization then creates an optional Location entity for every repo.
These Location entities then will be processed and they are going to fetch the files and emit the entities that they found in the target paths.

This processor has 2 main drawbacks:

It is tied to the processing loop so you cannot set a different interval for it. This is a problem if you’re being rate limited by GitHub. There is no option to lengthen the loop duration.
It makes unnecessary requests towards GitHub API by fetching all of the repositories every time it runs.

The present

We built a Roadie-specific entity provider which can act on the incoming GitHub webhooks.

It uses your configured Roadie Backstage GitHub app to forward the GitHub push events from your organization’s repositories to our servers.

The GitHub webhooks API sends Roadie arrays of the modified, added, and deleted files. These indicate what happened in the event. The provider handles modifications differently from additions/deletions.

When a modification event happens:

Get the event from GitHub
Get all the modified filenames in this push event
Trigger a refresh on the Backstage database

We try to refresh with every filename and let the database decide if there is a matching entity to schedule the refresh for. We implemented it this way because it lets us provide an instant refresh on API entities: when a referenced $text placeholder’s value is managed in GitHub and you change that OpenAPI descriptor, we refresh the API entity that it belongs to.
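The modification path described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Roadie's implementation: it assumes the push event carries a `commits` list where each commit has a `modified` array (the shape GitHub uses for push payloads), and the `refresh` callback is a hypothetical stand-in for whatever schedules a refresh in the catalog.

```python
from typing import Callable, Dict, List


def modified_filenames(push_event: Dict) -> List[str]:
    """Collect every modified path in a push event, without duplicates."""
    seen: List[str] = []
    for commit in push_event.get("commits", []):
        for path in commit.get("modified", []):
            if path not in seen:
                seen.append(path)
    return seen


def handle_modifications(push_event: Dict, refresh: Callable[[str], None]) -> None:
    """Request a refresh per filename; the catalog decides whether it matters."""
    for path in modified_filenames(push_event):
        refresh(path)


example_event = {
    "commits": [
        {"modified": ["catalog-info.yaml"]},
        {"modified": ["catalog-info.yaml", "docs/openapi.yaml"]},
    ]
}
requested: List[str] = []
handle_modifications(example_event, requested.append)
print(requested)
```

Note that the handler does not filter filenames itself. Passing every path through is what allows a change to a referenced descriptor, such as the hypothetical docs/openapi.yaml here, to refresh the API entity that points at it.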

When the event contains additions/deletions:

Get the event from GitHub
Construct a set of filenames for added files
Construct a set of filenames for deleted files
Filter these based on the configuration
Create an optional Location entity for these files with proper location annotations

This path is pretty similar to the previous discovery. We create Location entities whose spec.target points to the file that we got in the GitHub event. We do this for every added/deleted file that matches your configuration, and we rely on the processing loop to fetch and emit the actual content of the file.
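Here is a minimal Python sketch of the addition/deletion path, under a few loudly stated assumptions. The push event shape (`commits` with `added` and `removed` arrays) follows GitHub's push payload; the glob matching uses Python's `fnmatch` as a stand-in for Roadie's real configuration matching; and the entity naming (`generated-` plus a hash) and the `blob_base_url` parameter are hypothetical choices for this example only.

```python
import hashlib
from fnmatch import fnmatchcase
from typing import Dict, List, Set


def changed_paths(push_event: Dict, key: str) -> Set[str]:
    """Build the set of filenames listed under `key` across all commits."""
    paths: Set[str] = set()
    for commit in push_event.get("commits", []):
        paths.update(commit.get(key, []))
    return paths


def matches_config(path: str, patterns: List[str]) -> bool:
    """Keep only files that match one of the configured target globs."""
    return any(fnmatchcase(path, pattern) for pattern in patterns)


def optional_location(target: str) -> Dict:
    """Describe an optional Location entity whose spec.target is the file."""
    digest = hashlib.sha1(target.encode("utf-8")).hexdigest()
    return {
        "apiVersion": "backstage.io/v1alpha1",
        "kind": "Location",
        "metadata": {"name": f"generated-{digest}"},  # hypothetical naming
        "spec": {"type": "url", "target": target, "presence": "optional"},
    }


def locations_for_push(push_event: Dict, patterns: List[str], blob_base_url: str) -> List[Dict]:
    """Create one Location per added or deleted file that matches the config."""
    added = changed_paths(push_event, "added")
    deleted = changed_paths(push_event, "removed")
    relevant = sorted(p for p in added | deleted if matches_config(p, patterns))
    return [optional_location(f"{blob_base_url}/{p}") for p in relevant]


event = {
    "commits": [
        {"added": ["plugins/foo/catalog-info.yaml", "README.md"],
         "removed": ["utils/bar/catalog-info.yaml"]}
    ]
}
patterns = ["plugins/**/catalog-info.yaml", "utils/**/catalog-info.yaml"]
for location in locations_for_push(event, patterns, "https://example.com/org/repo/blob/main"):
    print(location["spec"]["target"])
```

As in the description above, the sketch stops at creating the Location entities. Fetching the file and emitting the entities inside it is left to the processing loop.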

We removed the polling for entities, and we disabled the possibility to add github-discovery Location entities to the catalog.

Some things to iron out

With the current implementation, some edge cases can be confusing or not work as expected.

Multiple entities in one file (catalog-info.yaml)

apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: valid-same-file-entity-1
spec:
  type: library
  owner: user:kissmikijr
  lifecycle: production
---
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: valid-same-file-entity-2
spec:
  type: library
  owner: user:kissmikijr
  lifecycle: production

This approach can cause undesired behaviour if you end up with a validation error in one of your entities.

If this happens, the catalog will create a Location entity that points to this file, but the processing of the entities won’t finish. This means Backstage will not store the information it needs to trigger refreshes, so even if you fix your validation errors in the next commit, you’ll need to wait for the regular processing loop to handle the refresh.

Registering an Entity via the /register-existing-component page

In this case, because this entity was not added to the catalog via the webhooks, when you delete this file from your GitHub repo the webhook won’t be able to remove it.

Updating this entity will be instant.

Using the Location kind

You may have previously used a Location entity in your repository to register it and let the processing loop find the other targets, like this:

apiVersion: backstage.io/v1alpha1
kind: Location
metadata:
  name: roadie-backstage-plugins
spec:
  targets:
    - ./plugins/**/catalog-info.yaml
    - ./utils/**/catalog-info.yaml

The automatic refreshes will not work on the target entities.

This is a shortcoming of the open source implementation of the refresh handling. It is planned to be fixed. Until then, the best advice is to ditch the top-level Locations and configure the targets in the /administration/settings/integrations/github configuration page.

In this case, you’d add two entries to the Targets:

# Entry 1
https://github.com/RoadieHQ/roadie-backstage-plugins/blob/-/plugins/**/catalog-info.yaml 

# Entry 2
https://github.com/RoadieHQ/roadie-backstage-plugins/blob/-/utils/**/catalog-info.yaml

To configure your targets check out the documentation.
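As a rough illustration, the mapping from the relative targets in the Location entity to the two integration entries can be sketched in Python. The helper name `to_integration_target` is hypothetical, and the `-` used in place of a branch name is simply copied from the entries above.

```python
def to_integration_target(org: str, repo: str, location_target: str) -> str:
    """Turn a relative Location target into a full GitHub URL target."""
    # Drop a leading "./" so the glob can be appended to the repo URL.
    if location_target.startswith("./"):
        relative = location_target[2:]
    else:
        relative = location_target
    # "blob/-/" mirrors the example entries shown above.
    return f"https://github.com/{org}/{repo}/blob/-/{relative}"


# The two targets from the Location entity in this post.
targets = ["./plugins/**/catalog-info.yaml", "./utils/**/catalog-info.yaml"]
entries = [
    to_integration_target("RoadieHQ", "roadie-backstage-plugins", target)
    for target in targets
]
```

Each item in `entries` is one line you would paste into the Targets list.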

Measuring and Improving Software Quality with Roadie Backstage

Thu, 01 Sep 2022 10:00:00 GMT

We are building Tech Insights on top of Roadie Backstage. It will help you ensure that all of your software assets have the support and maintenance they need in order to keep you secure, compliant, productive, agile, and available.

Our customers will be able to use Tech Insights to spot unloved services in production, find teams who need more support in order to produce top quality software, and help engineering organizations understand where the bar is.

Roadie users will be able to create scorecards which define what it means to build quality software within their company. They will be able to apply these scorecards to software in the Backstage catalog. They will see reports to help identify software which needs improvement, and they will be empowered to work with owners to deliver said improvements.

Tech Insights will be the first major proprietary Roadie feature. It is in development as I write this post, and we are already working with design partners on rollout.

Throughout our work with dozens of customers, it has become clear that implementing Backstage and cataloging software is only the first step on a longer journey for most organizations.

If you would like to supercharge your Backstage experience with this feature, please request a demo of Roadie.

Measuring the quality of software

The first step towards determining whether software is of a high enough quality is to define what “quality” means in your organization.

Quality is an aggregate measure which accounts for many dimensions. Quality software is usually easy to operate in production, easy to develop, secure and compliant, amongst many other attributes.

Each of these individual measures is composed of its own factors, each of which is likely tracked in a different tool.

Here are some simple attributes you might check to get a measure of operability:

Software which is easy to operate in production usually has adequate uptime. This might be measured against a service level objective recorded in Datadog.
It is usually connected to a logging tool. This might be Datadog again, or it might be a self-hosted ELK stack.
The software should have an on-call rota associated with it. This would be stored in PagerDuty.
It should have an owner who is responsible for keeping the software up-to-date and deploying it frequently. This would be stored in the Backstage catalog.
It should have runbooks defined. These might be in a Confluence document.

Once we factor in a few of these attributes, our measurement of good software starts to look more complex.

Run the same process for scalable, secure, easy to develop, compliant and any other top level factors and you quickly start to realise that you have to check 40+ factors and integrate with 20+ tools to determine if a piece of software is high quality or not.

Is there enough quality?

The next level of complexity comes from attempting to determine whether or not the current level of quality is “enough”.

The required level of quality will vary depending on the purpose and exposure of the software in question. Take uptime for example. Your payment processing service might need 5 nines of uptime, while some internal reporting tool might be down for 5 months before anyone notices, and that might be ok.

To account for this in the measurement, you will need to attach different expectations to different software components.

This is frequently done by categorizing and dividing software into tiers. Software in tier zero might be mission critical and require the highest levels of quality. Tier 3 might be far less important from an overall business standpoint, and will be subject to less stringent requirements.

We can contextualize the measurements we defined above by defining quality levels, as shown in the diagram below.
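To make the idea of tiered expectations concrete, here is a minimal Python sketch. The tier thresholds, the `Service` class and the service names are all made up for illustration; they are not part of Tech Insights.

```python
from dataclasses import dataclass

# Hypothetical uptime targets (in percent) per tier. Illustrative numbers only.
REQUIRED_UPTIME = {0: 99.999, 1: 99.9, 2: 99.0, 3: 95.0}


@dataclass
class Service:
    name: str
    tier: int
    measured_uptime: float  # percentage over some reporting window


def meets_uptime_bar(service: Service) -> bool:
    """Compare a service's measured uptime against the bar for its tier."""
    return service.measured_uptime >= REQUIRED_UPTIME[service.tier]


# The same measurement passes or fails depending on the tier.
payments = Service("payment-processing", tier=0, measured_uptime=99.95)
reports = Service("internal-reports", tier=3, measured_uptime=99.95)
```

With these assumed thresholds, the same 99.95% uptime fails the tier zero bar for `payments` but comfortably passes the tier 3 bar for `reports`.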

Driving improvements in software quality

Invariably, when you define software excellence and compare the current state of the world against it, you will find gaps. This is a good thing. The next step should be to encourage the improvement of this software.

There can be a multitude of legitimate reasons why a piece of software doesn’t meet a given quality bar. Teams will constantly streamline their priorities to try to match those of the wider business, and quality will sometimes suffer.

The important thing is that this quality gap is visible. We may actively choose not to rectify it, but we should at least be able to spot it and have that conversation.

Teams should be able to independently make realizations like “everyone else in the org seems to have passed the security level 1 bar… perhaps we should think about doing some security work soon”.

We believe Roadie Backstage has a role to play by increasing visibility into organizational software quality. We should nudge people towards increasing software quality, in a thoughtful and conscientious way.

We’re working on Tech Insights at Roadie

You will be able to write Checks which continually test the software in your catalog in an automated way. These Checks can be anything from “Does the software have an SLO set in Datadog” to “Is the log4j version semantically greater than 2.16.0”. Checks can leverage custom data or data from the APIs of standard SaaS tools that you already use.
You will be able to group Checks into Scorecards, and target those scorecards to subsets of the software in the catalog. Perhaps you want to target Tier 0 and Tier 1 Java services. Perhaps you want to target Python services in the data science org.
You will be able to slice and dice reports of software quality so that you can find software which needs more support in order to meet the quality bar.
Teams will be carefully and conscientiously nudged towards improving the quality of their software over time.
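As an example of the kind of logic a Check like “Is the log4j version semantically greater than 2.16.0” needs, here is a small Python sketch. It is not the Tech Insights implementation, the function names are hypothetical, and it only handles plain dotted numeric versions (no pre-release suffixes).

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Split a dotted version like "2.16.0" into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))


def is_version_greater(found: str, minimum: str) -> bool:
    """Compare versions numerically, part by part, rather than as strings."""
    return parse_version(found) > parse_version(minimum)
```

Comparing numerically matters here: as plain strings, "2.9.1" would sort after "2.16.0", which is the wrong answer for a version check.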

We’re already rolling out early versions of this software to our design partners. If you would like to learn more, or participate in the betas, please request a demo of Roadie Backstage today.

Roadie Has Achieved SOC2 Type 2 Compliance

Fri, 15 Jul 2022 14:00:00 GMT

We are delighted to announce that we have achieved SOC2 Type 2 compliance across three areas: Confidentiality, Security, and Availability.

From the start, we built Roadie with security, availability, and privacy as fundamental values, and we recognize them as essential to our success. Our team is made up of people who have worked in large enterprise companies and scale-ups such as Workday, Spotify, and Intercom, so we are no strangers to enabling and ensuring good security practices. We understand that if you build these processes early, they will grow with your company and help you scale securely and reliably.

We have a set of mature and robust security and availability practices at Roadie and wanted to validate them against industry standards. We see this achievement of SOC2 Type 2 compliance as a milestone in our ever-improving security journey.

What this means

A SOC2 Type 2 report is one of the most well-known IT security and compliance auditing accreditations. It is highly comprehensive: it doesn’t look at any one business area in isolation.

An accredited external audit firm scrutinized Roadie’s engineering practices (such as our database security controls, monitoring and alarming, and testing methods) as well as the ecosystem within which these practices live. That ecosystem includes how we train our staff, who we hire, how we restrict access to data, and how we review the vendors we choose to use.

Why is this important?

Simply put, we want this report to give our customers even more peace of mind when choosing to trust Roadie with their data. The SOC2 Type 2 report shows that we have opened our doors to a third-party and allowed them to test and scrutinize our security and availability practices.

As the old but fitting adage goes, “trust but verify.” Our goal is to provide you with confidence that we have robust, mature, and industry-standard practices that are monitored and updated frequently.

A milestone, not the end goal

The comprehensive nature of this audit affirmed our confidence that we are set up with excellent foundational security and availability practices which we can continue to build on as we scale.

We will continue to keep our compliance with SOC2 up to date, and we will undergo an annual audit to test our SOC2 compliance. We also aim to expand our compliance to additional standards as we grow.

If you would like to see a copy of our SOC2 Type 2 report, reach out to legal@roadie.io

Backstage Home plugin on Roadie

Mon, 04 Apr 2022 15:00:00 GMT

The Home plugin provides a view on what’s important to the currently logged in Backstage user.

Until recently, the catalog has been the primary way to interact with Backstage. You could pick a software component from the list, and quickly get a sense of how that component is doing, who owns it etc.

But a single software engineer typically has to manage and track multiple components at once and it seems inconvenient to need to visit the Overview page of each individual component in order to get a sense of what’s happening. Why can’t the info be co-located in one place?

Software engineers don’t just interface with software either. They also need up-to-date information on the latest goings on in their organization, and they even have to (begrudgingly 😃) attend meetings sometimes. Shouldn’t Backstage also be plugged into this part of an engineer’s job?

The pulse of the team

Tackling these problems is the remit of the Home plugin. It’s a Backstage page which you can visit first thing in the morning to take the pulse of the team. It’s a page for bringing together information from many different systems, in a way which is most relevant to you!

Here’s how the Home plugin looks for me in Roadie Backstage:

As you can see, there is important information from a number of sources here.

I have quick access to the components that I own, and the entities that I have “starred” in the catalog. I frequently use these components when I’m doing Backstage demos, so it’s useful to have them within easy reach.
My calendar is available, thanks to the Google Calendar plugin created by Alex Rybchenko from Box (#9719). I can even click the zoom links to go directly to a meeting.
I can see my open review and pull requests on GitHub. If the team need me to review something, I can see that any time I log into Backstage.
Finally I have a Roadie News widget. This displays content from a static markdown URL hosted on GitHub. It can be used to share organization news or information about upcoming events. We’ve also seen our customers use it to bookmark links to important pages outside Backstage.

Where this is going

The Home plugin is super new but we’re already seeing amazing demand from our customer base.

At Roadie, we’re hard at work producing more widgets for the Home page. We’d love to use this space to display the Jira tickets you’re working on or keep you updated on the builds running on your PRs for example.

Hopefully, a good portion of the 60+ open-source Backstage plugins which already exist will end up having Home widgets added.

Once this community work happens, the vision of having a single place to get the pulse of your org, your team, your software and your own work will be realised.

Learn more

The Home plugin is available to all Roadie Backstage users. You may need to enable it in your Administration area before it is visible.

Learn more in our Home plugin documentation.

The Backstage scaffolder is now generally available on Roadie

Wed, 09 Mar 2022 16:30:00 GMT

Introduction

At Roadie, we think deeply about the security of every feature we roll out. This has sometimes slowed us down, or meant that we’ve had to run without some features or plugins available. For example, we’ve written at length about steps we take to properly authenticate access for GitHub Apps.

Last year, to ensure our customers’ security, we made the hard decision to launch Roadie Backstage with the scaffolder disabled. This cost us some customers over the past 6 months, but it was worth it for security.

Today, after months of hard work, we are proud to launch scaffolder support with a completely re-designed and hardened architecture. This new architecture ensures that scaffolder tasks are run in an isolated and ephemeral environment which keeps customer data secure.

The scaffolder is available on all Roadie Backstage environments today.

To use it, start a free trial here, then check out our docs for the scaffolder, and read our walkthrough to learn how to write scaffolder templates.

What is the scaffolder?

Imagine you’re an engineer looking to create a new microservice. You want to get started as quickly as possible, with minimal boilerplate and red-tape to jump through. At the same time, engineering organizations benefit from having consistency in production, and often put gates in place to enforce it.

Instead of creating blockers for engineering teams, Spotify uses the scaffolder to quickly create new microservices, while helping to ensure that production remains mostly consistent.

Engineers can choose a pre-defined software template, fill out a few form fields to provide values like the name of the GitHub repo that the new service will occupy, and click a button to run the template and create the new service.

By making it easier to start new projects, your engineers can move more quickly, while preserving standards and reducing complexity in your tech ecosystem.

See it in action in this 3 minute video where we demo a scaffolder template which creates a GitHub pages website.

Roadie’s scaffolder architecture

Before starting the work to enable the scaffolder, we audited the open source scaffolder actions one by one and found that the Backstage Scaffolder was running them on the API backend process of Backstage. Scaffolder tasks technically had access to the same resources available to the API backend.

Tasks that copy files or access network resources are a little risky even when running Backstage inside your corporate firewall. On a SaaS platform like Roadie, they are unacceptable.

To isolate our scaffolder jobs, we run them in a separate process in a private network on AWS ECS. A single container task is spun up for each execution and destroyed once it completes.

The container can only access its own Backstage database and the public internet. The container does not have access to the network services available to the API backend process and it cannot do things like copy files from the local API backend service’s file system.

We already support the most frequently used scaffolder actions on Roadie. You can fetch pre-defined templates, use them to create GitHub repositories and write to the Backstage catalog.

You can even send HTTP requests to the public internet, using an open source library we created, so that newly templated microservices can automatically register with your SaaS tools like Circle CI or PagerDuty.

The full list of supported scaffolder actions is available inside your Roadie Backstage instance at https://[sub-domain].roadie.so/create/actions.

Next steps

We’re going to continue working on the scaffolder to make it faster, more featured and more secure. We’re already thinking about features such as custom container support and fully custom scaffolder actions.

We’ve also begun publishing more open-source modules for the scaffolder. We’ve already created a general utils package and some dedicated AWS actions.

If there’s something you’d like to see, please reach out on our public Discord channel.

Deploy a GitHub pages website with the Roadie Backstage scaffolder

Fri, 25 Feb 2022 14:00:00 GMT

Introduction

In this tutorial we are going to learn how to deploy a customized GitHub pages website with the Backstage scaffolder on Roadie.

The website we create will be…

Created from a prepared code skeleton.
Published from its own GitHub repo.
Automatically deployed to the public web via GitHub pages.
Customized with information we collect from the user when they run the scaffolder.
Available in the Backstage catalog so others can find it.
Automatically hooked up to a monitoring service.

The skills you will learn include:

How to write scaffolder code skeletons and templates.
How to collect input from the user in the scaffolder UI, and pass it through to the codebase.
How to make HTTP requests from the scaffolder, and use the Roadie Backstage proxy to securely add authentication to the requests.
How to create proxies in Roadie.

This tutorial involves some steps which are specific to Roadie and won’t work on a vanilla Backstage installation. To try Roadie, apply for a free trial on our website. You will also see the best results if you are an admin of your Roadie Backstage instance.

Just show me it working!

If you’d prefer to see the scaffolder in action before working through this tutorial, you can watch this short demo video. It demonstrates the same GitHub Pages scaffolder action we create in the steps that follow.

This video is part of the Backstage Bites series.

Step 1: Create a GitHub repo containing code from a skeleton directory.

A Backstage Scaffolder template will usually consist of at least two things:

A skeleton code structure, which is used to stamp out new websites, services, or other types of component.
A template.yaml file which describes the steps to run during the scaffolding process.

Create a basic skeleton and template

To get started, make a directory structure to hold our template and the code skeleton we will use to stamp out our GitHub pages website.

.
├── skeleton
│   └── index.html
└── template.yaml
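If you would rather create this layout from a script than by hand, a minimal Python sketch could look like the following. The function name and the root directory you pass in are up to you; the files are created empty, ready for the content below.

```python
from pathlib import Path


def create_template_layout(root: Path) -> None:
    """Create the skeleton directory and empty placeholder files."""
    skeleton = root / "skeleton"
    skeleton.mkdir(parents=True, exist_ok=True)
    # Empty files for now. Their content is filled in over the next steps.
    (skeleton / "index.html").touch()
    (root / "template.yaml").touch()
```

For example, `create_template_layout(Path("github-pages-template"))` would create the structure inside a new directory with that (arbitrary) name.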

Put the following HTML into the index.html:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>my website</title>
</head>
<body>
  <h1>Welcome to my website</h1>
  <p>This website was created by following the <a href="https://roadie.io/blog/roadie-backstage-scaffolder-website/">Backstage scaffolder tutorial published by Roadie</a></p>
</body>
</html>

And place the following YAML into template.yaml. We’ll explore how this works at a later point in the tutorial. For now, let’s just get it working.

Optionally, you may wish to edit the spec.owner property to reference the name of a real Group (aka. a team) in your Backstage catalog.

apiVersion: scaffolder.backstage.io/v1beta3
kind: Template
metadata:
  name: github-pages-website
  title: GitHub Pages Website
  description: Create a static HTML website and publish it via GitHub pages.

spec:
  owner: my-group-name
  type: website
  parameters:
    - title: Choose a Source Control Management tool to store your new website in.
      required:
        - repoUrl
      properties: 
        repoUrl:
          title: Repository Location
          type: string
          ui:field: RepoUrlPicker
          ui:options:
            allowedHosts:
              - github.com

  steps:
    - id: template
      name: Fetch Skeleton + Template
      action: fetch:template
      input:
        url: ./skeleton

    - id: publishToGitHub
      name: Publish to GitHub
      action: publish:github
      input:
        allowedHosts: ['github.com']
        # This will be used as the repo description on GitHub.
        description: 'A static HTML website. Just like the good old days.'
        repoUrl: ${{ parameters.repoUrl }}
        defaultBranch: main
        repoVisibility: public

  output:
    remoteUrl: '${{ steps.publishToGitHub.output.remoteUrl }}'

Now that we have a basic template.yaml and skeleton, we need to push it to GitHub where the Roadie Backstage scaffolder can access it. You must install the Roadie GitHub App before proceeding. To learn how to do that, read our Getting Started docs. Turn your directory structure into a Git repository and push it to GitHub.

Import your template into Roadie Backstage

Copy the URL of the template.yaml file on GitHub and go to Roadie Backstage to import it into the catalog.

On the main catalog view, click the “CREATE COMPONENT” button, then click the “REGISTER EXISTING COMPONENT” button.

Paste the URL of the template.yaml into the URL text field and click “ANALYZE”.

Assuming there are no errors, click the “IMPORT” button.

Great. Your template is now imported into Backstage and ready to use.

Click the “Create…” link in the Backstage sidebar to go back to the main scaffolder view. You should see your template is now available.

Run the template

Click “CHOOSE” and you should be taken to a form like this:

Fill in the name of your GitHub Org in the “Owner” field. This template will use the Roadie Backstage GitHub app to run the scaffolder, so the Org must be the same org that the Roadie Backstage GitHub app is installed into.

Choose an arbitrary GitHub repository name and fill it into the “Repository” field. Something like “my-cool-website” will work. This repository doesn’t need to exist. The scaffolder will create it as it runs through the steps.

Click “NEXT STEP” and then “CREATE”.

After 15 or 20 seconds, you should see scaffolder logs begin to appear. These logs will give you information about what the scaffolder is doing, and will display errors if failures occur.

Assuming everything worked correctly, you will see checkmarks appear in the left column of the scaffolder interface as it does its thing.

Click the link labelled “Repo” in the left sidebar to visit the GitHub repo which we just created. You should see it contains an index.html with the same content we placed in the HTML file earlier.

Step 2: Publish the site to GitHub Pages

Having a repo with some code in it is great, but it’s not a website until you can visit it in the browser.

To turn this HTML into a GitHub pages website, we can manually visit the settings of our newly created GitHub repo and click some buttons to publish the site to the web. We don’t like manual work though. Let’s see if we can do this in an automated fashion instead.

To make this work, we’re going to use the Open Source http:backstage:request scaffolder action from Roadie. You can see this package on GitHub. Don’t worry about following any of the installation steps. This package is already installed in Roadie Backstage.

This scaffolder action allows us to send HTTP requests from our template.yaml. We will use it to hit the GitHub API endpoint which can turn on GitHub pages for a GitHub repository.

Using the HTTP Backstage Request scaffolder action

Add a new step called publishToWeb to your template.yaml, commit it and push it to GitHub.

# ... spec and other properties above this point
  steps:
    # ... existing template and publishToGitHub steps here.

    - id: publishToWeb
      name: Publish to web with GitHub Pages
      action: http:backstage:request
      input:
        method: 'POST'
        path: /api/proxy/github/api/repos/${{ (parameters.repoUrl | parseRepoUrl)["owner"] }}/${{ (parameters.repoUrl | parseRepoUrl)["repo"] }}/pages
        headers:
          content-type: 'application/json'
        body:
          source:
            branch: main
            path: '/'

Let’s go through this YAML to learn what it’s doing.

- id: publishToWeb
  name: Publish to web with GitHub Pages
  action: http:backstage:request

The first three lines are relatively self explanatory. We’re adding a new step with an id and a human readable name. Then we’re declaring that we’re going to use the http:backstage:request action. If you ever want a list of all the available actions, just visit /create/actions inside Roadie Backstage.

input:
  method: 'POST'
  path: /api/proxy/github/api/repos/${{ (parameters.repoUrl | parseRepoUrl)["owner"] }}/${{ (parameters.repoUrl | parseRepoUrl)["repo"] }}/pages
  headers:
    content-type: 'application/json'
  body:
    source:
      branch: main
      path: '/'

The input is passed to the http:backstage:request action when it runs. Our input says we want to make a POST request with a particular body which satisfies the requirements of the GitHub API. We also want to pass some headers. These are the same details you might expect to see if we were calling this API endpoint with curl or Postman.
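For comparison, here is roughly what the same call would look like if we made it directly against the GitHub API from Python, without the scaffolder. This is a sketch: the owner, repo and token values are placeholders, the `Bearer` authorization header is an assumption about how you would authenticate a direct call, and the request is only built here, not sent.

```python
import json
import urllib.request


def build_pages_request(owner: str, repo: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a request that enables GitHub Pages for a repo."""
    # Same body as the scaffolder step: publish from the root of the main branch.
    body = {"source": {"branch": "main", "path": "/"}}
    return urllib.request.Request(
        url=f"https://api.github.com/repos/{owner}/{repo}/pages",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # A direct call needs a token, which we would have to store somewhere.
            "Authorization": f"Bearer {token}",
        },
    )


request = build_pages_request("my-org", "my-cool-website", "<GITHUB_TOKEN>")
# To send it for real: urllib.request.urlopen(request)
```

Notice that a direct call has to carry the token itself.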

The path is interesting. Instead of calling the GitHub API directly, we’re proxying through the Roadie Backstage backend. As the request passes through Roadie Backstage, the proxy will transparently add an authentication token to the request. This means we don’t have to hardcode the authentication token into the template.yaml, or ask the user to provide it at runtime.

The first part of the path, /api/proxy/github/api, is the location of the proxy on the Backstage API. Everything after github/api is the path of the API endpoint we want to hit on GitHub.

The GitHub path we want to call is /repos/[owner]/[repository]/pages. Of course, we don’t know the name of the owner and repository until the user runs the template. To work around this, we use a special syntax to parse them out at runtime.

We can get the “Owner” the user types in with ${{ (parameters.repoUrl | parseRepoUrl)["owner"] }} and we can get the “Repository” the user types in with ${{ (parameters.repoUrl | parseRepoUrl)["repo"] }}.
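To get a feel for what this parsing does, here is an approximation in Python. It assumes the RepoUrlPicker stores its value in the form `github.com?owner=my-org&repo=my-cool-website`. The function names are ours and this is not the actual Backstage implementation of the filter.

```python
from urllib.parse import parse_qs, urlsplit


def parse_repo_url(repo_url: str) -> dict:
    """Approximate parseRepoUrl for a value like 'github.com?owner=x&repo=y'."""
    # Prefix a scheme so urlsplit treats "github.com" as the host.
    parts = urlsplit(f"https://{repo_url}")
    query = parse_qs(parts.query)
    return {
        "host": parts.netloc,
        "owner": query["owner"][0],
        "repo": query["repo"][0],
    }


def pages_proxy_path(repo_url: str) -> str:
    """Build the proxied path used by the publishToWeb step."""
    parsed = parse_repo_url(repo_url)
    return f"/api/proxy/github/api/repos/{parsed['owner']}/{parsed['repo']}/pages"


path = pages_proxy_path("github.com?owner=my-org&repo=my-cool-website")
```

Under that assumption, `path` ends up as /api/proxy/github/api/repos/my-org/my-cool-website/pages.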

Storing the GITHUB_TOKEN securely

Our proxy knows how to forward requests over to GitHub, but it doesn’t yet have the correct token to be able to successfully authenticate. Let’s securely store an authentication token in Roadie so the proxy can use it.

Click “Secrets” in the left sidebar of the SETTINGS tab on Roadie Backstage.

At the top of the list you should see a row for “GITHUB_TOKEN”. Visit your GitHub account settings and create a personal access token which has the pages scope. Be sure to authorize it on your GitHub org if you use SSO.

Click the pencil icon on the secrets page to open a dialog box. Paste in the personal access token you just created and click “SAVE”.

Roadie Backstage has to restart to enable this token. This can take a few seconds. Wait for the GITHUB_TOKEN table row to display a green status indicator and the text “Ready” before proceeding.

Putting it all together

Now we have a template which sends an HTTP request to a proxy and we have a secret which the proxy will use to authenticate with GitHub.

Since we pushed the new version of our template.yaml a while ago, the Backstage catalog should have already looped over it and picked up the new version with the call to the GitHub pages API in it.

Go back to your scaffolder template and fill out the details again. Remember, GitHub repos must have unique names, so if you haven’t deleted the repo you created in step 1, you’ll have to choose a different name this time.

Click “NEXT STEP” and “CREATE”.

This time you should see that we have 3 steps due to run in the left sidebar of the Task Activity page. The scaffolder is going to:

Fetch Skeleton + Template
Publish to GitHub
Publish to web with GitHub Pages

Success! Now visit the repo you just created on GitHub and go to the Pages part of the Settings.

Visit the following URL to see your new website:

https://[owner].github.io/[repository]/

It doesn’t look like much but it’s a start!

Step 3: Customize the website

Creating a website from a template is great, but it would be better if we were able to customize it a little. We don’t want our website looking exactly like hundreds of other scaffolded websites 😃.

For customization, the scaffolder lets you pass values through to the template as it is being processed. To see how this works, let’s collect a website name from the user and use it in the title tag and main heading.

First, we have to update the index.html in our website to render a value called website_name. This will be provided by the user when they run the scaffolder task.

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>${{ values.website_name | title }}</title>
</head>
<body>
  <h1>Welcome to ${{ values.website_name }}</h1>
  <p>This website was created by following the <a href="https://roadie.io/blog/roadie-backstage-scaffolder-website/">Backstage scaffolder tutorial published by Roadie</a></p>
</body>
</html>

We also need to update our template to make it ask the user for the website name.

Add the following YAML to the template.yaml in the parameters section, above the item which asks for a repo owner and repository.

- title: Provide some simple information
  required:
    - website_name
  properties:
    website_name:
      title: Website name
      type: string
      description: This will be displayed prominently on your website and in the title tag.

Finally, we need to name and pass the collected website name value down into the templating step so it is available in the index.html.

steps:
  - id: template
    name: Fetch Skeleton + Template
    action: fetch:template
    input:
      url: ./skeleton
      values:
        website_name: ${{ parameters.website_name }}

Commit and push those changes. Once the catalog refreshes the template from GitHub, you should now see a new text field in the interface. If you don’t want to wait, you can manually reimport the template to refresh it.

If we fill that in with a website name and proceed through the rest of the steps as before, we should eventually end up with a new GitHub pages website which has a unique name. In this example, I typed “My Cool Site 3” into the text field.
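If you are curious how the placeholders get filled in, the following Python sketch mimics the substitution for the two expressions used in our index.html. It is a simplified stand-in for the scaffolder’s real templating engine: it only understands `${{ values.NAME }}` with an optional `title` filter, and it uses Python’s `str.title` as an approximation of that filter.

```python
import re

# Matches ${{ values.name }} and ${{ values.name | filter }}.
_PLACEHOLDER = re.compile(r"\$\{\{\s*values\.(\w+)\s*(?:\|\s*(\w+)\s*)?\}\}")

# Only the "title" filter is supported in this sketch.
_FILTERS = {"title": str.title}


def render(template: str, values: dict) -> str:
    """Replace each placeholder with the matching entry from values."""

    def substitute(match):
        name, filter_name = match.group(1), match.group(2)
        text = str(values[name])
        return _FILTERS[filter_name](text) if filter_name else text

    return _PLACEHOLDER.sub(substitute, template)


skeleton = (
    "<title>${{ values.website_name | title }}</title>"
    "<h1>Welcome to ${{ values.website_name }}</h1>"
)
page = render(skeleton, {"website_name": "my cool site 3"})
```

The filter only affects the expression it is attached to, so the title tag is capitalised while the heading shows the name exactly as typed.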

Step 4: Add a catalog-info to the new service

This new website looks promising, but it would be even better if it was automatically added to the Roadie Backstage catalog so that other people in your company could discover it.

To make this happen, we can add a catalog-info.yaml to our skeleton codebase and pass some values into it to set sensible defaults. Once this YAML file is created, we can rely on Roadie Backstage’s auto-discovery mechanism to pick it up.

Create a file named catalog-info.yaml inside the skeleton directory we created earlier. Place the following content in it:

apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: ${{ values.repo_name }}
  title: ${{ values.website_name | title }}
  description: A static HTML website. Just like the good old days.
  links:
    - url: https://${{ values.repo_owner}}.github.io/${{ values.repo_name }}/
      title: Live website
  annotations:
    github.com/project-slug: ${{ values.repo_owner }}/${{ values.repo_name }}
spec:
  type: website
  owner: engineering
  lifecycle: experimental

You can see that this file relies on a few values which contain information about the website we are templating. In order to have access to these values, we need to pass them in from the template.yaml. Edit the step named “template” to pass in the values.

steps:
  - id: template
    name: Fetch Skeleton + Template
    action: fetch:template
    input:
      url: ./skeleton
      values:
        website_name: ${{ parameters.website_name }}
        repo_name: ${{ (parameters.repoUrl | parseRepoUrl)["repo"] }}
        repo_owner: ${{ (parameters.repoUrl | parseRepoUrl)["owner"] }}

Next time we run this template, we will see that an extra file called catalog-info.yaml is created in the newly scaffolded GitHub repo. Roadie Backstage’s auto discovery will automatically find this file and use it to populate the Backstage catalog.

You may notice that we have specified some links in the catalog-info.yaml. This metadata will automatically populate the Links Backstage plugin with a link to our website on the public internet. Clicking the link will take the user to GitHub pages.

If you don’t see this card, ask a Roadie Backstage admin to add the EntityLinkCard to your website component overview page.

Step 5: Register the service with Better Uptime

Calling the GitHub Pages API was relatively simple, because there was already a proxy in place to use. But what if we want to call an authenticated endpoint which doesn’t have a default proxy?

To see how this works, we’re going to create our own proxy for the website monitoring service, Better Uptime, and then use the HTTP Request scaffolder action to set up a website ping when our scaffolder template is executed.

Roadie has no affiliation with Better Uptime. They have an easy-to-use UI and a free tier, which is useful for a tutorial like this.

Create a monitor in Better Uptime

The first thing you will need to do is to sign up for Better Uptime. Their free plan allows more than enough features to get through this tutorial.

Once you have an account, you will need to take note of your API token. Click Integrations in the sidebar, then APIs in the secondary header. Click the Copy to clipboard link.

To see how the Better Uptime API works, let’s create a monitor for https://google.com as a test.

Run the following curl command, making sure to substitute <API_TOKEN> with the API token you copied from the Better Uptime integrations page.

curl -X POST \
     --header 'Authorization: Bearer <API_TOKEN>' \
     --header 'Content-Type: application/json' \
     --url https://betteruptime.com/api/v2/monitors \
     --data '{"url":"https://google.com"}'

Assuming that runs successfully, you should see a monitor has been created in Better Uptime.
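If you prefer to see the same request outside of curl, the sketch below builds the equivalent POST with Python’s standard library. It only constructs the request and does not send it; the helper name build_monitor_request is hypothetical, and <API_TOKEN> is still a placeholder you would need to replace.

```python
import json
import urllib.request


def build_monitor_request(api_token: str, site_url: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a monitor."""
    body = json.dumps({"url": site_url}).encode("utf-8")
    return urllib.request.Request(
        "https://betteruptime.com/api/v2/monitors",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
    )


request = build_monitor_request("<API_TOKEN>", "https://google.com")
print(request.get_method(), request.full_url)
# To actually send it, pass the request to urllib.request.urlopen(request).
```

The three ingredients are the same as in the curl command: the bearer token, the JSON content type, and a JSON body containing the URL to monitor.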

Now that we know how to do that with curl and Google, let’s see how to do it with the scaffolder and our GitHub Pages website.

Calling the Better Uptime API from the scaffolder

To set up monitoring for the website we create with the scaffolder, all we need to do is reimplement this curl command in our template.yaml using the http:backstage:request scaffolder action.

The YAML step to accomplish that looks like this:

steps:
  # all of the previous steps already discussed

  - id: registerInBetterUptime
    name: Register in Better Uptime
    action: http:backstage:request
    input:
      method: 'POST'
      path: /api/proxy/betteruptime/monitors
      headers:
        content-type: 'application/json'
      body:
        url: https://${{ (parameters.repoUrl | parseRepoUrl)["owner"] }}.github.io/${{ (parameters.repoUrl | parseRepoUrl)["repo"] }}/

In this case, we are sending a POST request to the path /api/proxy/betteruptime/monitors. This path doesn’t currently exist, so let’s create it using Roadie’s proxy UI.

Create a proxy for Better Uptime

To create a proxy, click “Administration” at the bottom of the main sidebar, then go to the “SETTINGS” tab and click “Proxy” in the minor left sidebar.

Click “ADD PROXY” to add a proxy.

There are a lot of fields and options available here. We don’t need to understand all of them to complete this tutorial, so we will focus on the most important ones.

“Path” is the path on which we will be able to call our proxy. This must match a part of the path we added to our template.yaml earlier. In our case, this is /betteruptime.

“Target” is the URL which we want to forward requests on to. The root of the Better Uptime API is at https://betteruptime.com/api/v2, so in this case that is our Target.

Click the Advanced Settings to see some more options.

The “Allowed Methods” field can be used to restrict the HTTP methods that the proxy will accept. For example, if we choose GET and POST, then the proxy will refuse to forward DELETE requests. In this case, we only need to enable POST requests.

The “Headers” section can be used to add headers to the request as it is being forwarded on to the Target. This is how we will add an authentication token to our requests.

To make a Better Uptime proxy for our scaffolder template request, fill out the following:

Path = /betteruptime
Target = https://betteruptime.com/api/v2
Allowed Methods = POST
Check the “Secure” and “Change Origin” checkboxes at the bottom of the page.
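The Path, Target and Allowed Methods settings above can be pictured with a small Python sketch. This is only a mental model of how a request path is mapped onto the Target, assuming the proxy is mounted under /api/proxy as in our template.yaml; ProxyRule and resolve are hypothetical names and this is not Roadie’s actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProxyRule:
    path: str                   # e.g. "/betteruptime"
    target: str                 # e.g. "https://betteruptime.com/api/v2"
    allowed_methods: frozenset  # e.g. frozenset({"POST"})


def resolve(rule: ProxyRule, method: str, request_path: str) -> str:
    """Return the upstream URL for a request, or raise if it is not allowed."""
    prefix = "/api/proxy" + rule.path
    if not (request_path == prefix or request_path.startswith(prefix + "/")):
        raise LookupError(f"{request_path} does not match {prefix}")
    if method.upper() not in rule.allowed_methods:
        raise PermissionError(f"{method} is not allowed on {prefix}")
    # Everything after the proxy prefix is appended to the target.
    return rule.target + request_path[len(prefix):]


rule = ProxyRule("/betteruptime", "https://betteruptime.com/api/v2", frozenset({"POST"}))
print(resolve(rule, "POST", "/api/proxy/betteruptime/monitors"))
# https://betteruptime.com/api/v2/monitors
```

In this model, a DELETE request to the same path would be rejected because only POST is in the allowed set.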

To add the authentication token to the request, create a header called authorization and set its value to Bearer ${CUSTOMER_TOKEN_1}. When the request is proxied, ${CUSTOMER_TOKEN_1} will be replaced with an actual token, but we don’t want to store that here in plain text. Instead, we will use Roadie’s secure secrets functionality for this.

We also need to add a Content-Type header which specifies application/json.

Our proxy settings now look something like this:

Click “SAVE” and then “APPLY & RESTART” to create the proxy. You will need to wait approximately 3 minutes for the settings to be applied.

Securely store the Better Uptime token

In the previous section, we created an authorization header with the placeholder ${CUSTOMER_TOKEN_1}. In order for this placeholder to be replaced with the actual authentication token, we have to store the token in the Roadie Secrets area, just like we did previously with the GITHUB_TOKEN.

To store the token securely, visit the Secrets page in the Roadie Backstage Administration area and set the CUSTOMER_TOKEN_1 to the value of the API token we got from Better Uptime.
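Conceptually, the placeholder replacement works like simple string templating. The sketch below uses Python’s string.Template, which happens to share the ${NAME} syntax; render_headers and the example secret value are hypothetical, and the real substitution happens inside Roadie when the request is proxied.

```python
from string import Template


def render_headers(headers: dict, secrets: dict) -> dict:
    """Replace ${NAME} placeholders in header values with secret values."""
    return {name: Template(value).substitute(secrets) for name, value in headers.items()}


configured = {
    "authorization": "Bearer ${CUSTOMER_TOKEN_1}",
    "content-type": "application/json",
}
# Hypothetical secret value, for illustration only.
rendered = render_headers(configured, {"CUSTOMER_TOKEN_1": "example-token"})
print(rendered["authorization"])  # Bearer example-token
```

The benefit of this design is that the proxy configuration only ever contains the placeholder, never the token itself.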

Re-run the scaffolder job

Assuming those steps all worked correctly, we should now be able to re-run our scaffolder template and create a GitHub pages website which is hooked up to Better Uptime.

Go back to the Roadie Backstage scaffolder, choose your GitHub pages template and fill in the website name and GitHub owner and repo name again. Remember to use unique values. Hit CREATE and watch the magic happen.

This time, 4 steps run together, including the new Better Uptime step.

If we check out the Better Uptime monitors page, we should see that our website is registered. Initially it will be reporting as “DOWN”. That happens because it takes GitHub pages a minute or two to publish the website. Once it goes live, Better Uptime will update and go green.

Conclusion

Now that you’ve learned how to use the templates, proxy and secrets together, you should be able to apply this knowledge to other tasks. For example:

Try to add some basic TechDocs to the skeleton so that the docs show up in Backstage automatically.
Register the website with an error tracking tool like Sentry.
Enable branch protection on the GitHub repo created by your scaffolder template.

The scaffolder is capable of automating all of these tasks to help your org be more productive and more standardized.

Future work in Roadie

We love this functionality already, but we also believe we have more work to do to make it the best it can be. Here are some of the areas we will be looking to improve next:

Custom secrets so you can add your own secrets and choose their labels.
Custom scaffolder actions so you can run completely custom code in the scaffolder.
Open source scaffolder packages such as our utils package and an AWS package.

To learn about these new features and releases as they roll out, please join our Backstage Weekly newsletter.

10 reasons to get Backstage from Roadie

Thu, 27 Jan 2022 16:00:00 GMT

At Roadie, we ❤️ Backstage. We’ve talked to countless companies who have adopted and are getting value from the technology.

Self-hosted Backstage remains the right choice for many organizations. Large organizations with thousands of developers and abundant resources will be able to staff a team to deploy, customize and manage Backstage.

For companies with a few hundred developers, it’s not so easy. Every engineering hour spent on internal tools is an hour that could be spent delivering customer-facing value. We believe Roadie is the right answer in this situation.

Below are 10 reasons why Roadie might be right for your company.

1) Roadie cuts the time to value

Internal friction on the path to production can mean it takes longer than you expect to get a production deployment of self-hosted Backstage running. Consideration must be given to the TechDocs pipeline, search, authentication, config management and other complications.

When adopting new tools, speed and momentum are important. If the migration and ramp up takes too long, the initiative will lose steam before it even gets going.

Before joining Roadie, software engineer Miklós Kiss worked on the Backstage implementation at Prezi. He spent multiple weeks working with a colleague to get Backstage deployed to production there.

A big chunk of the time went to understanding what Backstage is, how it works, what plugins are and how to utilize them. We spent time fighting with the GitHub rate limit, figuring out the authentication, adding telemetry into it etc. It was a lot of work!

Miklós Kiss

With Roadie, in less than an hour you can go from clicking the “Request a free trial” button to having a catalog populated with components, basic TechDocs for documentation, and plugins installed and integrated.

2) We handle the upgrades

Once you’ve gotten it running, you have to keep it up to date. We’ve all seen examples of self-hosted software which is way out of date and generally unmaintained.

At Roadie, we upgrade every Backstage instance approximately once per week, and you’re typically not much more than 2 weeks behind the latest release.

We can do this in a cost-effective way because we have economies of scale working in our favor. Doing the same for a self-hosted instance costs valuable engineering time each week, and you may want to keep your engineers focussed on customer-facing work.

Upgrades can and do go wrong. At Roadie, we perform automated and manual verifications against each release to try to ensure it is going to roll out cleanly. Broken features cause frustrated users, and nobody wants to adopt a tool they believe will be flaky.

3) We help you adopt Backstage

Of course, deployment and maintenance is only half the battle. The other side of the challenge is adoption. We’ve built features into Backstage to help your engineering teams get the most from the technology.

For example:

We automatically syntax check and validate your Backstage metadata YAMLs before they’re ingested. This catches errors which can prevent components from showing up in the catalog as expected.
We provide a Locations Log where users can go to understand why their components are not appearing as they expect.
We’ve produced simple, user focussed documentation which helps your end users get started quickly.

These measures work together with a healthy dose of support provided by our customer communications channels. We’d like your org to get maximum value from Backstage while ensuring you don’t need a full-time staff employed to answer questions from your engineers.

If you want to learn more about the features Backstage provides, check out our Ultimate Guide to Backstage by Spotify.

4) We prioritize security

We take security seriously at Roadie. Our founding team comes from enterprise companies like Workday and Spotify and we understand what it takes to keep data and processes safe.

Some of our security measures include:

Thoughtful and careful design of every part of our architecture. For example, you can read about how we designed our GitHub integration to prevent cross-org access.
Multiple GitHub apps so you can choose the level of access you grant to Roadie Backstage.
Frequent and extensive third-party pen-testing. We often provide the reports to security teams as we go through procurement.
Running certification programs. We’re working on SOC 2 Type II and expect to be certified by September 2022.

At Roadie, we’re always working to improve security. Please request a demo if you would like to hear more about what we’re doing to keep customer data safe.

5) You don’t have to edit the code

Many people expect that the Backstage repository works like a standard UI application. You clone the repository, run it and start using it immediately.

In reality, it’s more like create-react-app. It’s a framework or set of components and plugins that you can compose together to make a developer portal for your organization.

This means that changes like adding a plugin to the Backstage interface require editing the code, committing and re-deploying.

At Roadie, we’ve built a drag and drop composer on top of the normal Backstage plugins, so adding a plugin takes a couple of clicks.

Configuration is handled in a similar way. Want to set up the Kubernetes plugin? Just head to the administration area and add a cluster via the UI.

6) 22+ plugins work straight out of the box

Backstage wouldn’t be much without its plugins. From TechDocs documentation to Kubernetes integration, it’s the plugins which give Backstage much of its discoverability value and power.

We support all of the best Backstage plugins and if there is something you don’t see, we typically integrate it for you within a couple of hours.

Not only do we support all the best plugins, we actually built some of them. We’ve created 12+ open-source Backstage plugins which are free for the community to use. Our open-source team is always looking for inspiration, and we frequently take customer feedback on board when deciding where to focus our efforts.

7) You can bring your own plugins

Every company has home-grown tools and technologies that only make sense in the context of the place they were invented. Sometimes our customers want to build Backstage plugins around these tools so they can be more easily discovered by other engineers in their company.

At Roadie, every Growth Plan customer gets a private artefact repository where they can publish vanilla Backstage plugins. These plugins integrate and run inside Roadie Backstage just like all the normal open-source plugins. You don’t have to do any special magic to your plugins to make this work. Just npm publish them using your normal npm workflow, and we handle the rest.

8) We track the community

With 50+ pull requests being merged into the project each week, hardly a week goes by when there isn’t a new feature or plugin released.

Your teams have customer-facing work they’re trying to get done and they won’t have time to follow all of this work and understand how they can integrate it and get value from it.

At Roadie, we eat, drink and breathe Backstage, so we know what’s happening. We uptake and integrate significant new features for you, so you can stay focussed on what you do best.

If you do want to keep your finger on the pulse, we publish a regular newsletter which tracks the project and can help you stay up to date with the most important changes.

9) We’ve got the scaffolder

Early versions of Roadie didn’t have the scaffolder because we knew it needed special consideration to run safely in a multi-tenanted environment.

After months of hard work, we’re ready: in early 2022, we’re making the scaffolder generally available on Roadie.

By making it easier to start new projects, your engineers get to the good part of coding features faster. And your organization’s best practices are built into the templates, encouraging standards and reducing complexity in your tech ecosystem.

10) We’re here to support you

We understand that most teams don’t want to go it alone. That’s why we do our best to support your company on its Backstage journey.

Every customer gets a dedicated support channel in Slack or Discord. If something is not working as expected, we’re there to help you debug it. Anyone in your company is free to join the conversation.

We also meet each one of our customers on a regular cadence so they have a place to make requests, get support, and influence our roadmap. Feedback delivered in these meetings feeds directly into our planning process.

We’re constantly working to improve Roadie. If you want to hear more about an item on this list, or ask us anything at all, why not request a demo on our website?

Roadie's response to recent log4j vulnerabilities

Wed, 22 Dec 2021 16:00:00 GMT

Roadie is not impacted by the log4j vulnerabilities, CVE-2021-44228 or CVE-2021-45046, also known as log4shell.

On December 9th, 2021 CVE-2021-44228 was announced, impacting versions 2.x of log4j (also known as log4j2). This issue was believed to be fixed in log4j 2.15.0, however on December 14th, 2021 CVE-2021-45046 was announced, and log4j 2.16.0 was released, fixing the additional exploitation vectors.

Roadie is written in TypeScript and JavaScript and therefore does not make use of the Java logging library, log4j or the Java Virtual Machine. There is one component in our stack, PlantUML, which is written in Java, but it does not make use of log4j.

SaaS

Roadie’s SaaS platform was not impacted by the log4j vulnerabilities. As a TypeScript application, we do not make use of log4j directly. While thoroughly examining our cloud environment, we determined that we are not running any impacted software in a way that is publicly available.

We have taken the following steps to ensure our infrastructure is not vulnerable:

Audited our cloud environment to ensure we are not running log4j in any application code directly.
Upgraded all AWS EC2 Node Groups to the latest AMI version provided by Amazon.
Hotpatched all AWS ECS containers with the mitigations provided by Amazon.
Audited our sub-processors to ensure they are taking steps to mitigate the vulnerability in their own software stacks.

Links to sub-processor responses:

AWS - upgrades applied
Auth0 - not vulnerable
Google Analytics - not vulnerable
Functional Software - not vulnerable
Amplitude - upgrades applied
Intercom - upgrades applied

Open Source

Roadie’s OSS code is not impacted by the log4j vulnerabilities. As TypeScript applications, our Open Source code does not make use of log4j directly.

OAuth Token Exchange: AWS → GCP

Fri, 03 Dec 2021 21:00:00 GMT

When working with multiple cloud providers, managing authentication can become difficult, even more so when the clouds need to communicate with each other. In this blog post, I will talk about my experience of exchanging AWS identity tokens for GCP OAuth tokens.

Normally, when trying to gain access to another AWS account, we use cross account federation. With this cross account federation, we authorize access to certain AWS principals (roles etc). This is done by assuming a “role”. This “role” is exclusively controlled by the owner’s account. The account owners can determine exactly what access the external account has. With this, we are able to provide a secure way for two (or more) AWS accounts to communicate with each other.

Now between cloud providers, this is a lot more complicated. Each cloud provider has their own method of authentication as well as authorization. This is where the difficulty lies when trying to exchange an AWS role identity token for a GCP token.

Thankfully, AWS provides a mechanism for adding authentication to API requests sent over HTTP. It adds AWS-specific headers or query parameters, which are then used to confirm the identity of the caller.

GetCallerIdentity

When working with cloud providers, it is often difficult to know exactly which identity a service is using, and in many cases an identity may change as a result of a specific behaviour. AWS provides an easy way to check this through the Security Token Service (STS), and more specifically its GetCallerIdentity API. This call returns details about the caller, including its unique Identity and Access Management (IAM) identifier, the Amazon Resource Name (ARN). Using this ARN, we are able to pinpoint a user or a service, which is valuable when trying to confirm the identity of a caller.
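As an illustration of what such an ARN contains, here is a small Python sketch that picks apart an assumed-role ARN of the kind GetCallerIdentity returns for temporary role credentials. The helper name parse_assumed_role_arn and the example ARN are hypothetical; the layout assumed is arn:aws:sts::<account>:assumed-role/<role>/<session>.

```python
from typing import NamedTuple, Optional


class AssumedRole(NamedTuple):
    account_id: str
    role_name: str
    session_name: Optional[str]


def parse_assumed_role_arn(arn: str) -> AssumedRole:
    """Parse 'arn:aws:sts::<account>:assumed-role/<role>[/<session>]'."""
    # An ARN has six colon-separated fields; the region field is empty for STS.
    prefix, _partition, service, _region, account_id, resource = arn.split(":", 5)
    if prefix != "arn" or service != "sts":
        raise ValueError(f"not an STS ARN: {arn}")
    kind, _, rest = resource.partition("/")
    if kind != "assumed-role":
        raise ValueError(f"not an assumed-role ARN: {arn}")
    role_name, _, session = rest.partition("/")
    return AssumedRole(account_id, role_name, session or None)


# Hypothetical ARN, for illustration only.
caller = parse_assumed_role_arn("arn:aws:sts::000000000000:assumed-role/some-role/my-session")
print(caller.account_id, caller.role_name, caller.session_name)
```

The account ID and role name extracted here are the same pieces of information that the GCP configuration later in this post matches against.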

GCP

Service account

Service accounts are a concept that is shared throughout the GCP ecosystem. They are used to gain access to certain resources and have permissions attached to them. With service accounts, we are able to restrict and regulate the actions that a particular user or service is allowed to perform. For our investigation, we will be acting on behalf of a service account.

Creating a service account

For the sake of simplicity, I will be using the gcloud CLI, although this can very easily be configured through the GCP console.

$ gcloud iam service-accounts create aws-service-account-demo \
  --description="A service account that AWS can access" \
  --display-name="aws-service-account-demo"

Workload identities

As stated before, using service accounts allows us to restrict access and assign specific permissions to a service or application. In order to gain access to these service accounts, we need some way of verifying our identity. This can be done in two ways: using long-lived tokens or short-lived ones. For this document we will be using the short-lived ones. This concept is referred to as workload identity federation.

Enabling IAM services on your GCP project

For the access token exchange flow to work, we must enable some services on our GCP project. This can once again be configured through the console, but for simplicity the gcloud CLI is favoured.

$ gcloud services enable sts.googleapis.com
$ gcloud services enable iamcredentials.googleapis.com
$ gcloud resource-manager org-policies allow constraints/iam.workloadIdentityPoolAwsAccounts \
    <aws-account-id> --organization=<gcp org>

Note: you will also need to ensure that you have the Workload Identity Pool Admin (roles/iam.workloadIdentityPoolAdmin) and Service Account Admin (roles/iam.serviceAccountAdmin) roles on the project.

Creating a workload identity provider

Here we will create a workload identity provider for our token exchange with AWS.

First, let’s create the pool:

$ gcloud iam workload-identity-pools create aws-pool \
    --location="global" \
    --description="Workload identity pool for aws connectivity." \
    --display-name="AWS pool"

Then the provider:

$ gcloud iam workload-identity-pools providers create-aws aws-test-account \
    --location="global"  \
    --workload-identity-pool="aws-pool" \
    --account-id="<your account>" \
    --display-name="Test AWS provider"  \
    --description="The Identity Provider for AWS test service"

Note: if you would like to be more explicit about which AWS role can access this workload identity, add the following (replacing the account ID and role name):

--attribute-condition="'arn:aws:sts::000000000000:assumed-role/some-role' == attribute.aws_role"  \

Combining workload federation and service accounts

Now that we have the ability to gain an access token using the workload federation, we need to allow the workload provider to assume the service account role.

$ gcloud iam service-accounts add-iam-policy-binding aws-service-account-demo@example-project.iam.gserviceaccount.com \
   --role roles/iam.workloadIdentityUser \
   --member "principalSet://iam.googleapis.com/projects/$gcp_project_number/locations/global/workloadIdentityPools/aws-pool/subject/arn:aws:sts::${aws_account_id}:assumed-role/$role_name"

Note: if you want to allow all resources in the workload identity pool to assume the service account, replace the --member field with the following:

--member "principalSet://iam.googleapis.com/projects/$gcp_project_number/locations/global/workloadIdentityPools/aws-pool/*"

Token exchange flow

Combining the configuration and knowledge above, we are now ready to initiate the token exchange flow. The steps are as follows:

The user logs in to their AWS ecosystem and receives a set of AWS-specific temporary credentials.
User signs their GetCallerIdentity request with their personal temporary credentials.
User sends a POST request to GCP at https://sts.googleapis.com/v1/token. The request payload contains the signed request, which will be used to verify the user’s identity.
GCP calls the AWS STS API to verify the credentials in the payload. If the signed request matches the user’s identity, GCP will return a federated workload identity token (this is a one-time token).
User now exchanges the one time token for an ephemeral service account OAuth token. This is done using a POST request to https://iamcredentials.googleapis.com/v1/projects/-/serviceAccounts/$SA-NAME@$PROJECT-ID.iam.gserviceaccount.com:generateAccessToken
User is now able to access all the resources the service account has access to.
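The payloads exchanged in the steps above can be sketched in Python as follows. This is a rough outline only: the signing of the GetCallerIdentity request is not shown (the signed headers are passed in), the function names are hypothetical, and the field names reflect our understanding of Google’s STS token exchange API, so please check them against the current GCP documentation before relying on them.

```python
import json
from urllib.parse import quote, unquote

STS_URL = "https://sts.googleapis.com/v1/token"


def build_subject_token(signed_headers: dict) -> str:
    """Serialise an already-signed GetCallerIdentity request as a subject token."""
    request = {
        "url": "https://sts.amazonaws.com?Action=GetCallerIdentity&Version=2011-06-15",
        "method": "POST",
        # Assumption: these are the headers produced when signing the request.
        "headers": [{"key": k, "value": v} for k, v in signed_headers.items()],
    }
    return quote(json.dumps(request), safe="")


def build_exchange_body(audience: str, subject_token: str) -> dict:
    """Body of the POST to STS_URL that returns the federated token."""
    return {
        "audience": audience,
        "grantType": "urn:ietf:params:oauth:grant-type:token-exchange",
        "requestedTokenType": "urn:ietf:params:oauth:token-type:access_token",
        "scope": "https://www.googleapis.com/auth/cloud-platform",
        "subjectTokenType": "urn:ietf:params:aws:token-type:aws4_request",
        "subjectToken": subject_token,
    }


def access_token_url(service_account_email: str) -> str:
    """URL used to swap the federated token for a service account token."""
    return (
        "https://iamcredentials.googleapis.com/v1/projects/-/serviceAccounts/"
        f"{service_account_email}:generateAccessToken"
    )


# Hypothetical header values; real ones come from signing the request.
token = build_subject_token({"host": "sts.amazonaws.com", "x-amz-date": "20211203T000000Z"})
decoded = json.loads(unquote(token))  # round-trip to show the token is just encoded JSON
body = build_exchange_body("<workload identity provider resource name>", token)
print(decoded["method"], body["grantType"])
```

The final call to the generateAccessToken URL is authorised with the federated token as a bearer token, and its response contains the short-lived service account OAuth token.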

The flow above is also demonstrated here in a POC JavaScript project.

We have also mocked up a library for you to use at your own discretion: cloud-token-exchanger.

Resources

Information on AWS GetCallerIdentity

Information on AWS Signed requests

GKE workload identity federation conditions

GKE workload identity federation configuration of pools

GKE workload identity federation token exchange flow

How to model monorepos in Backstage

Mon, 22 Nov 2021 21:00:00 GMT

We’ve been onboarding an increased number of awesome engineering organisations to our SaaS Backstage platform recently, and one question comes up again and again… “Does Backstage support monorepos?”

The good news is that Backstage does support monorepos. In fact, there are multiple different ways to represent monorepos in Backstage, each with its own setup, benefits and drawbacks. This post will teach you everything you need to know to get your monorepo code loaded and represented the way you want.

This post will be applicable whether you’re using the Roadie hosted Backstage platform or self-hosted Backstage.

Huge thanks to Enrique Amodeo Rubio, Staff Engineer at Contentful (linkedin). He did a lot of the hard work of testing the various monorepo representations in Backstage, and was kind enough to share some tips with us. These tips formed the basis of this guide.

Combined vs Split monorepo representations

There are two approaches to representing monorepos in Backstage: combined and split.

Combined monorepos

Combined monorepos present as a single entity in Backstage. When you look at your Backstage catalog, you’ll see one row to represent the monorepo, regardless of the number of sub-components contained within.

It will only have one Backstage metadata file and one set of TechDocs which describe the entire monorepo.

Split monorepos

Split monorepos treat each component of the monorepo as an individual Backstage entity. A split monorepo will have multiple associated catalog entries, one for each sub-component within the monorepo. It will contain many Backstage metadata files and many sets of TechDocs.

Which option should I use?

Combined monorepos make sense when the entire monorepo is owned by a single team. The Backstage project itself is perhaps a good example of this. It’s a relatively large monorepo but it’s owned by a single team of maintainers.

The combined monorepo has a single place in Backstage to find the documentation for all of the sub-components. If the components are tightly coupled or frequently used together, this approach might make it easier to browse all of the docs at once.

Split monorepos make sense when different components within the monorepo are owned by different teams. For example, a monorepo which co-locates many different backend services which expose different HTTP APIs and are owned by different teams within a company.

Split monorepos also make more sense when each component in the monorepo exposes its own HTTP API spec.

Summary

Use combined monorepos when the monorepo contains tightly coupled components, all of which expose one or zero HTTP APIs, and which are owned by a single team.
Use split monorepos when the monorepo contains loosely coupled components which each have their own HTTP API and their own owners.

Setting up your YAML files

Combined monorepo setup

The combined monorepo representation is easier to set up in Backstage because it requires less YAML configuration.

Simply create a top level catalog-info.yaml file, of the Component kind, in the root of the monorepo. Name it after the monorepo.

---
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: combined-monorepo
  description: All our components represented as a monorepo
  annotations:
    github.com/project-slug: RoadieHQ/sample-combined-monorepo
spec:
  type: service
  owner: engineering
  lifecycle: production

Here’s a public GitHub repository which demonstrates this setup.

Split monorepo setup

The split monorepo setup uses a single metadata file with the Location kind in the root of the monorepo, and many metadata files with the Component kind in the subdirectories. The Location acts as a pointer to each of the components in the sub-directories, ensuring they can be managed from a single location.

Assuming we have a monorepo structure something like this:

.
└── services
    ├── banana-service
    │   └── src
    └── pricing-service
        └── src

Then we would create one metadata file for each component and co-locate it with the component code. In this example they are called backstage.yaml files.

.
└── services
    ├── banana-service
    │   ├── backstage.yaml
    │   └── src
    └── pricing-service
        ├── backstage.yaml
        └── src

Then we would create a metadata file containing a Location at the root of the repo:

---
apiVersion: backstage.io/v1alpha1
kind: Location
metadata:
  name: split-monorepo
spec:
  type: url
  targets:
    - ./services/pricing-service/backstage.yaml
    - ./services/banana-service/backstage.yaml

Using this setup, each team can independently manage their own backstage.yaml files and individual components can be added or removed from the Backstage catalog simply by updating the catalog-info.yaml file in the root of the monorepo.
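If your monorepo has many components, keeping the targets list in sync by hand can get tedious. The following Python sketch shows one way the list could be generated; the function names are hypothetical, it assumes every component uses a file called backstage.yaml, and it builds a throwaway directory tree purely for demonstration.

```python
import tempfile
from pathlib import Path


def location_targets(repo_root: Path, filename: str = "backstage.yaml") -> list:
    """Return './relative/path' targets for every metadata file under repo_root."""
    found = sorted(p.relative_to(repo_root) for p in repo_root.rglob(filename))
    return ["./" + p.as_posix() for p in found]


def render_location(name: str, targets: list) -> str:
    """Render a Location entity as YAML text, without needing a YAML library."""
    lines = [
        "---",
        "apiVersion: backstage.io/v1alpha1",
        "kind: Location",
        "metadata:",
        f"  name: {name}",
        "spec:",
        "  type: url",
        "  targets:",
    ]
    lines += [f"    - {target}" for target in targets]
    return "\n".join(lines) + "\n"


# Demonstration against a temporary copy of the example layout.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for service in ("pricing-service", "banana-service"):
        (root / "services" / service).mkdir(parents=True)
        (root / "services" / service / "backstage.yaml").touch()
    demo_targets = location_targets(root)
    print(render_location("split-monorepo", demo_targets))
```

Note that this naive search also descends into directories such as node_modules, so a real script would probably want to skip those.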

Here’s an example of a monorepo set up with the split monorepo representation.

Using TechDocs in monorepos

TechDocs is used slightly differently in each of the two possible representations, and the results can be quite different.

Combined monorepo representation

The combined monorepo representation makes use of the mkdocs-monorepo-plugin created by Spotify. This plugin supports having multiple sets of MkDocs TechDocs within one monorepo.

Within Backstage, the TechDocs automatically render with a nested sidebar so the reader can browse through the documentation for each component in one place. Here we can see two services, the calculator and candle service, represented in the documentation of the Combined monorepo.

To set up TechDocs in the combined monorepo fashion, create a docs directory and mkdocs.yml file in the sub-directory of each component.

.
└── services
    ├── calculator-service
    │   ├── docs
    │   ├── mkdocs.yml
    │   └── src
    └── candle-service
        ├── docs
        ├── mkdocs.yml
        └── src

The markdown documentation files live in each docs directory and the mkdocs.yml file points to them as normal.

# Note: Whitespace is not currently supported in this site_name
site_name: calculator-service

nav:
  - Home: index.md

plugins:
  - techdocs-core

To create the nested sidebar effect, create one more mkdocs.yml file in the root of the monorepo, at the same level as the catalog-info.yaml.

In it, include the monorepo plugin and use the !include directive to pull in each of the mkdocs.yml files in the sub-directories.

As a bonus, you can also reference markdown files in a docs directory at the root of your monorepo, as we are doing below. These root-level docs might be a good place to talk about the nature of the monorepo and the components contained within.

site_name: Root docs

nav:
  - Home: index.md
  - Subdirectory docs:
    - Calculator Service: '!include ./services/calculator-service/mkdocs.yml'
    - Candle Service: '!include ./services/candle-service/mkdocs.yml'

plugins:
  - monorepo
  - techdocs-core
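For monorepos with many components, the root mkdocs.yml can also be generated rather than written by hand. The sketch below is one possible approach in Python; render_root_mkdocs is a hypothetical helper, and it simply reproduces the file shown above from a mapping of sidebar labels to component directories.

```python
def render_root_mkdocs(site_name: str, services: dict) -> str:
    """Render a root mkdocs.yml; services maps a sidebar label to a directory."""
    lines = [
        f"site_name: {site_name}",
        "",
        "nav:",
        "  - Home: index.md",
        "  - Subdirectory docs:",
    ]
    for label, directory in services.items():
        # Each entry pulls in the mkdocs.yml that lives next to the component.
        lines.append(f"    - {label}: '!include ./{directory}/mkdocs.yml'")
    lines += ["", "plugins:", "  - monorepo", "  - techdocs-core"]
    return "\n".join(lines) + "\n"


root_config = render_root_mkdocs(
    "Root docs",
    {
        "Calculator Service": "services/calculator-service",
        "Candle Service": "services/candle-service",
    },
)
print(root_config)
```

Adding a new component’s docs to the sidebar then only requires one more entry in the mapping.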

Lastly, add the techdocs-ref annotation to the catalog-info.yaml file in the monorepo.

---
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: combined-monorepo
  description: A combined monorepo
  annotations:
    # ..
    backstage.io/techdocs-ref: dir:.
spec:
  type: service
  owner: engineering
  lifecycle: experimental

Split monorepo representation

As we saw in the introduction, the split monorepo representation results in each monorepo component having its own entity in Backstage. As you can imagine, each component gets its own set of TechDocs, just like a non-monorepo component would.

To set up docs in the split monorepo fashion, simply create an mkdocs.yml file and docs directory in the sub-directory of each component.

.
└── services
    ├── banana-service
    │   ├── backstage.yaml
    │   ├── mkdocs.yml
    │   ├── docs
    │   └── src
    └── pricing-service
        ├── backstage.yaml
        ├── mkdocs.yml
        ├── docs
        └── src

The markdown documentation files live in each docs directory and the mkdocs.yml file points to them as normal.

site_name: Pricing Service

nav:
  - Home: index.md

plugins:
  - techdocs-core # required to style your docs like Backstage

Don’t forget to add the techdocs-ref annotation to each backstage.yaml file.

apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: pricing-service
  title: Pricing service
  description: Component for Pricing service
  annotations:
    # ...
    backstage.io/techdocs-ref: dir:.
spec:
  type: service
  owner: engineering
  lifecycle: production

Conclusion

Whether you end up using the combined or split monorepo representation, Backstage can certainly support your needs.

Have you got other tips for using monorepos with Backstage? We’d love to mention them here and credit you. Please email support@roadie.io with your ideas.

Plugins migration to monorepo

Mon, 23 Aug 2021 16:00:00 GMT

Contributing to the Backstage community has been one of the top goals in our roadmap. We have focused on developing plugins for developers with the goal of making their job more efficient. Over time, we produced multiple plugins contained within their own repositories. This is sometimes referred to as a multirepo approach as opposed to a monorepo with a single repository that contains multiple plugins. Our multirepo setup was a reasonable approach to begin with.

Although a number of teams have embraced monorepos, there are reasons why we have stayed away up until now. We started to face challenges with the increasing number of plugins that we maintain. One of the main challenges was with dependency management across all of our repositories which eventually became very complex. Instead, we wanted to have an automated, simple solution that would not be so time consuming and would give us a solid ground for additional features we have in mind. So, we made a decision to migrate all of our plugins to the RoadieHQ/backstage-roadie-plugins monorepo.

Improvements

There are a number of improvements we introduced by moving to a monorepo.

1) Better control of dependency management.

As mentioned previously, we wanted to simplify internal and third-party dependency management. Having plugins in different repositories raised concerns about having diamond dependency conflicts and challenges of having different versions of the same dependency in different repositories.

Testing specific versions of a dependency is easier because a monorepo gives us the ability to test for breaking changes and backwards compatibility across the entire codebase when an update is needed. It is easier and faster to follow Backstage team updates so that we can make sure our plugins work with the latest versions of the Backstage packages.

2) Better visibility of all the plugins.

It is easier for contributors to test against other plugins and possibly make multiple plugin changes in a single commit or pull request. It can also help encourage more collaboration and code reuse.

3) One place to store all configs and tests.

We can reuse and improve CI/CD configuration and tests across all of our plugins at the same time without needing to have separate and sometimes duplicated configuration per plugin.

4) Easier to keep track of upstream updates.

We created a workflow that runs periodically to check for updates from the Backstage team. The workflow automatically creates a pull request with updates to the package versions for all of the plugins. The workflow also runs checks to ensure everything works as expected once the changes are merged to the main branch.

Challenges

The monorepo approach is not without its challenges. We believed we would stumble across a few, especially in terms of building and publishing packages.

1) Build Pipelines

Ensuring builds are efficient and practical is a challenge regardless of the team size or codebase. The monorepo approach results in a lot of source code in one place. We recognized that it may take more time for CI to run all required tasks in order to approve every pull request. Ultimately, we did not see a substantial increase in build time for our monorepo.

2) Manage publishing of the packages

Although all plugins are contained within a single source code repository, each plugin is individually published to NPM. We needed a tool that would allow us to publish multiple packages but also optimize the workflow to ensure only packages that have changes are published.

We decided to use Lerna to manage our monorepo. We settled on a semi-automatic build and publish workflow. The package versioning is done manually and the publishing is done automatically. Lerna helps with detecting changes in the packages and only publishes the ones that have updated versions.
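To make the selection step concrete, here is a minimal TypeScript sketch of the idea. It is not Lerna's implementation: packagesToPublish, the package names and the version numbers are all hypothetical, and the already-published versions are passed in as a plain object rather than fetched from NPM.

```typescript
interface PackageVersion {
  name: string;
  version: string;
}

// Returns the names of packages whose local version differs from the
// version recorded as published (or that have never been published).
function packagesToPublish(
  local: PackageVersion[],
  published: Record<string, string>,
): string[] {
  return local
    .filter(pkg => published[pkg.name] !== pkg.version)
    .map(pkg => pkg.name);
}

// Hypothetical example: only plugin-b was bumped manually.
const localPackages: PackageVersion[] = [
  { name: 'plugin-a', version: '1.0.0' },
  { name: 'plugin-b', version: '1.1.0' },
];
const publishedVersions: Record<string, string> = {
  'plugin-a': '1.0.0',
  'plugin-b': '1.0.0',
};

console.log(packagesToPublish(localPackages, publishedVersions)); // [ 'plugin-b' ]
```

Because the version bump is the only signal, a package is never republished by accident: if nobody edits its version, it is skipped.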

Conclusion

All of the plugins we developed and maintain are gradually being migrated to the RoadieHQ/backstage-roadie-plugins repository.

Plugin users will not notice any difference with how they consume our plugins from NPM. This migration does make a difference for plugin contributors. You can read more about contributing in our CONTRIBUTING.md file.

This type of structural change is always a bit difficult at the start but we are confident it will result in a better experience for our plugin users. We always welcome contributions to our plugins and hope that this change will also make it easier to contribute.

GitHub Apps - How to avoid leaking your customer’s source code with GitHub apps

Thu, 19 Aug 2021 11:16:00 GMT

Security, tenant isolation and protecting our customers’ intellectual property are important to us at Roadie. While investigating options for integrating with GitHub APIs, we recognized that you have to work hard to do it securely. There are a number of ways to access GitHub APIs. It is quite easy to integrate with them incorrectly and potentially leak data between customers.

In fact we found (and reported) a vulnerability in a handful of major SaaS products that allowed a user to access the resources of an organisation that the user was not a member of.

Using default settings with GitHub Apps may put you at risk of leaking data between GitHub App installations.

Roadie provides its customers a hosted and managed Backstage environment. Backstage is a platform that helps you build developer portals on top of a centralised software catalog. Your developers can extend Backstage by creating new or customizing existing frontend and backend plugins in order to build a developer portal that meets your users’ needs. Read our Ultimate Guide to Spotify Backstage to learn more.

Backstage is a three tier application. The frontend tier runs in the browser and is built with React. The backend tier is built with Express running on Node.js. This tech stack ensures developers have access to a large community and ecosystem of packages and tools that make it even easier to extend their Backstage implementations.

The Backstage backend includes the Backstage Software Catalog to help bring visibility to your software components. The software catalog is a core component of Backstage. It can discover and index Git repositories hosted on GitHub and GitHub Enterprise using GitHub REST and GraphQL APIs. Indexed data is exposed via backend REST APIs to the frontend.

Backstage plugins can be added to the frontend to extend the core software catalog and integrate with external services. For example, the GitHub Pull Request plugin allows you to see a list of open pull requests associated with an item in the software catalog. Frontend plugins such as this one can use GitHub APIs to retrieve information and render it directly in Backstage. This is just one very specific yet common use of GitHub.

When it comes to integrating with GitHub APIs, there are at least six ways to authenticate. Some of these authentication methods are suitable for the frontend, the backend or both. Let’s go through each.

Type	Frontend	Backend
Personal Access Token (PAT)	No	Yes
GitHub OAuth Apps	Yes	No
GitHub Apps acting as a GitHub app itself	No	No
GitHub Apps acting as an Installation	No	Yes
GitHub Apps acting as an OAuth provider	Yes	No
Anonymously	Yes	Yes

Personal Access Token (PAT)

Any GitHub user can create a Personal Access Token (PAT) via their GitHub developer settings. The user chooses a set of permissions to allow for the token. This token can be used to access any resource from the GitHub API on behalf of that user. It is a sensitive, long-lived token that should not be used in a frontend application. Technically it could be used on a backend application to retrieve data from GitHub APIs. However, this approach is concerning from a security perspective. The backend would essentially be using credentials that allow it to act on behalf of the token’s owner. This is not desirable because the backend, and in turn its token, is shared by all users of the application. In order to mitigate this, you could create a special user in GitHub, and create a PAT for the application. This can be difficult to manage and maintain.

GitHub OAuth Apps

GitHub provides a way to create an OAuth app that can be used to login application users via a web frontend. Let’s disregard the details about how the OAuth token negotiation works in this article. Effectively, the frontend application sends the user to GitHub to get a token from the OAuth app. The user is then redirected back to the frontend application with a code that can be exchanged for a token. This token can be used by the frontend application to call GitHub APIs on behalf of the user. Any permissions that apply to the user also apply to the API requests made with the token. This is a reasonable mechanism for frontend only uses.
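The first step of that redirect can be sketched in TypeScript as follows. This is a minimal sketch under stated assumptions: buildAuthorizeUrl is a hypothetical helper (not part of any GitHub SDK), and the client id, redirect URI and state values are placeholders. The URL is GitHub's web authorization endpoint, and the state value is a random string the frontend should check when the user comes back.

```typescript
interface AuthorizeParams {
  clientId: string;    // the OAuth app's client id (placeholder below)
  redirectUri: string; // where GitHub sends the user back, with ?code=...
  state: string;       // unguessable value used to detect forged callbacks
}

// Builds the URL the frontend sends the user to in order to start the login.
function buildAuthorizeUrl(params: AuthorizeParams): string {
  const pairs: Array<[string, string]> = [
    ['client_id', params.clientId],
    ['redirect_uri', params.redirectUri],
    ['state', params.state],
  ];
  const query = pairs
    .map(pair => `${encodeURIComponent(pair[0])}=${encodeURIComponent(pair[1])}`)
    .join('&');
  return `https://github.com/login/oauth/authorize?${query}`;
}

console.log(
  buildAuthorizeUrl({
    clientId: 'my-client-id',
    redirectUri: 'https://example.com/callback',
    state: 'random-state-value',
  }),
);
```

The code that GitHub appends to the redirect URI is short-lived and is exchanged for the user's token in a separate request, which is the part this article skips.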

GitHub Apps

GitHub allows developers to create what is referred to as a GitHub app. A GitHub app can be installed on a GitHub organization or a personal GitHub account. Once installed, the GitHub app can request a new token for each installation of the app.

We believe that although it can be difficult to implement correctly, GitHub Apps is the best way to provide GitHub API access to a backend application.

“GitHub Apps is the best way to provide GitHub API access to a backend application”

As we mentioned earlier and alluded to in the title, it can be difficult to get GitHub Apps configured in a way that ensures customer isolation between installations of a GitHub App. This is especially true for a multi-tenanted application such as the one we provide to our customers.

So let’s take a look at the three ways a GitHub app can be used to authenticate to GitHub APIs:

Acting as a GitHub app itself

The GitHub App has a private key that is used to generate a GitHub App token. This token can be used for a subset of the GitHub APIs. One of the available APIs can be used to retrieve a list of its app installations and request GitHub to generate a token for each installation. This GitHub App private key is very sensitive. Suppose your service has two customers, and two installations of the GitHub app. Technically speaking, that private key can be used to retrieve a token for both customers and then read and write data for both customers with GitHub APIs. As such the token should only be used minimally for the purposes of retrieving an installation token.

In order to generate a GitHub App token, the GitHub App encodes a JWT with the GitHub App ID and signs it with the private key of the GitHub App. Here is what it looks like in Ruby:

require "openssl"
require "jwt" # provided by the jwt gem

GITHUB_APP_PRIVATE_KEY_FILE = "private-key.pem"
GITHUB_APP_ID = "12345678"

private_key = OpenSSL::PKey::RSA.new(File.read(GITHUB_APP_PRIVATE_KEY_FILE))

payload = {
  iat: Time.now.to_i - 60,        # issued 60 seconds in the past to allow for clock drift
  exp: Time.now.to_i + (10 * 60), # expires after 10 minutes
  iss: GITHUB_APP_ID              # the GitHub App ID is the issuer
}

GITHUB_TOKEN = JWT.encode(payload, private_key, "RS256")

The token can then be used to list installations of that GitHub App:

curl -X GET https://api.github.com/app/installations \
     -H "Authorization: Bearer ${GITHUB_TOKEN}"

[
  {
    "id": 12345678,
    "account": {
      "login": "AcmeInc",
      "id": 12345678,
      …
    }
  },
  {
    "id": 12345679,
    "account": {
      "login": "SomeCorporation",
      "id": 12345679,
      …
    }
  }
]

And retrieve a token for a specific GitHub App installation:

curl -X POST https://api.github.com/app/installations/12345678/access_tokens \
     -H "Authorization: Bearer ${GITHUB_TOKEN}"

{
  "token": "ghs_<redacted>",
  "expires_at": "2021-08-17T13:16:07Z",
  "permissions": {
    "members": "read",
    "organization_administration": "read",
    "actions": "read",
    "contents": "read",
    "metadata": "read",
    "security_events": "read"
  },
  "repository_selection": "selected"
}

Once the GitHub App has retrieved a token for a specific installation, it can call GitHub APIs. The set of APIs that it is allowed to access is configured in the GitHub App and requested during the installation of the app. This installation-specific token should only be used in a backend.

It is incumbent upon the GitHub App owner to make sure that the resources retrieved from one installation are only available to members of the organization of the installation.
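One way to honour that responsibility is to never let a request choose its own installation id. The TypeScript below is a minimal sketch of that idea, not a description of our production code: InstallationStore, its methods and the tenant names are hypothetical, and a real backend would use a database rather than an in-memory object.

```typescript
// Hypothetical in-memory store mapping each customer (tenant) to the
// installation id that was verified for them at install time.
class InstallationStore {
  // Object.create(null) avoids inherited keys such as "constructor".
  private readonly installationByTenant: Record<string, number> =
    Object.create(null);

  save(tenantId: string, installationId: number): void {
    this.installationByTenant[tenantId] = installationId;
  }

  // Looks the installation up from the authenticated tenant only.
  // Callers never pass an installation id in, so one tenant cannot
  // obtain another tenant's token by guessing an id.
  installationFor(tenantId: string): number {
    const installationId = this.installationByTenant[tenantId];
    if (installationId === undefined) {
      throw new Error(`No verified installation for tenant "${tenantId}"`);
    }
    return installationId;
  }
}

const store = new InstallationStore();
store.save('acme-inc', 12345678);
store.save('some-corporation', 12345679);

console.log(store.installationFor('acme-inc')); // 12345678
```

The tenant id here is assumed to come from the user's authenticated session, never from a query parameter or request body.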

Acting as an OAuth provider

The GitHub App can also act as an OAuth provider, so that users of the backend can retrieve a token in a web frontend. The GitHub App’s settings contain the credentials required to allow the web frontend to login users with GitHub in the same way that the GitHub OAuth App does. Once the user’s browser has retrieved the token from GitHub, the web frontend application can gather data and perform actions on behalf of the user in GitHub.

Acting as an Installation

We saw that the GitHub App’s private key can be used to retrieve an access token for any installation. So how do developers of large multi-tenanted SaaS software make sure that users are only accessing data from installations that they are supposed to? For example, if one of the customers chooses to install the GitHub App, then no other customers should be able to see that customer’s data.

“Avoid using the Setup URL Callback and validate the ownership of installations ids”

So how do we make sure that a customer is allowed to use a GitHub App installation?

GitHub Apps allow a developer to provide a URL to which the user is redirected after installation. The setting is called “Setup URL”. When the GitHub App installation completes, the user is redirected to the Setup URL with the id of the installation.

The problem is that GitHub does not provide any means for the installing application to verify the ownership of the installation. Installation ids are 8 digits and not considered secure. They can be guessed easily. Consider the highlighted message in the sequence diagram below. In this setup, it is very easy to bombard the application with any guessed or known installation id.

We found (and reported) this vulnerability in a handful of major SaaS products. This would have allowed us to access other GitHub organizations.

The way to ensure that this does not occur is to verify the user’s identity and organisations and compare whether or not they are allowed to access/install the app. This can be done by setting “Request User Authorization” and providing a “Callback URL” on your app.

With this setting enabled, users are forced to login and the callback contains a code that can be exchanged for an auth token for the user. The backend should use this code to validate that the user is allowed to access this installation.

So what does this validation look like? Here is what we have done in our backend Express application:

router.post('/installations', async (req, res) => {
  const code = req.query.code as string;
  // Query parameters arrive as strings, but GitHub returns numeric installation ids.
  const installationId = Number(req.query.installation_id);
  const setupAction = req.query.setup_action as string;

  if (setupAction !== 'install') {
    logger.error(`Action is not of type 'install'. Got: ${setupAction}`);
    res.sendStatus(400);
    return;
  }

  // Exchange the code for a client that acts as the logged-in user.
  // `appOctokit` is assumed to be an Octokit instance configured with
  // createAppAuth from @octokit/auth-app.
  const userGitHub: Octokit = (await appOctokit.auth({
    type: 'oauth-user',
    code: code,
    factory: (options: OAuthAppAuthOptions) => {
      return new Octokit({
        authStrategy: createOAuthUserAuth,
        auth: options,
      });
    },
  })) as Octokit;

  // List the installations that this user is able to access.
  const { data } = await userGitHub.request('GET /user/installations');

  if (data.installations.some(installation => installation.id === installationId)) {
    // The following is pseudocode for storing the installation id.
    database.saveInstallationId(installationId);
    res.sendStatus(201);
  } else {
    res.sendStatus(403);
  }
});

This works because we have ensured that the request contains a code. Only a user who has logged into GitHub could have access to this code, and then we can validate that the user is a member of this organization before persisting the installation.

In conclusion

If you are rolling out features that require GitHub API access to your customers, be mindful of how you are doing it. We hope you will appreciate how easy it is to unintentionally and unexpectedly expose a customer’s GitHub data to unauthorized users.

Backstage TechDocs - How to embed lucid chart diagrams

Tue, 03 Aug 2021 10:30:00 GMT

TechDocs is the core Backstage feature which transforms markdown documentation into HTML and displays it inside Backstage where your engineering teams can find it.

You can easily embed diagrams from Lucidchart and other external sources in TechDocs. Start by exporting the generated iframe from the external application. For example, if you are using Lucidchart, you can click the Share button in the top right.

This will show a dialog as follows.

Click advanced and then click embed.

You can choose to adjust the size of the embedded diagrams.

Copy the html snippet and click the “Activate Embedded Code” button.

Now copy the code snippet into your TechDocs files as it is, and you will get diagrams in your TechDocs that update when the diagrams are changed in Lucidchart.

How to model software in Backstage

Tue, 29 Jun 2021 21:00:00 GMT

Backstage’s service catalog serves as a metadata store for useful information about the software assets being used and developed in your organization.

It can also group software in ways which makes logical sense to the humans who build and use the software. Grouping software makes it easier to understand the overall architecture and can highlight previously unseen dependencies.

A more understandable architecture is easier to onboard engineers onto and faster to repair when problems occur.

To see how different types of software asset are represented in Backstage, we’re going to model part of the architecture you might find in a hypothetical ride-sharing company. We’ll also see how Backstage models the relationships between software and how it can diagram the network of dependencies.

Here’s how our hypothetical architecture looks. We have two backend services. One of them, Passenger Backend, is dependent on two important libraries, the Core Queuing Library and the Core Auth Module. The second backend service, Trips Counter, calls the API of the Passenger Backend.

This model doesn’t demonstrate all of the modeling capabilities that Backstage has to offer. We have omitted Resources, which typically represent shared infrastructure, like a Kubernetes cluster. We have also omitted the sub-component relationship because it has a very niche use-case in the fat-client world.

We have also ignored the other fundamental pillar of modeling in Backstage — humans and the teams they group themselves in. Backstage provides User and Group concepts for this purpose. They are outside the scope of this document.

Modeling components, the basics

Let’s start with a simple concept like a typical backend service, the Passenger Backend. This could be a NodeJS or Go application perhaps. It probably has some API endpoints, some business logic, a connection to a database and a bunch of libraries installed into it.

Backstage represents services like this using three properties, the kind, type and name.

kind: Component
type: service
name: passenger-backend

Components are one of the fundamental building blocks in Backstage. A component is a single logical unit of code which is owned by a person or group of people. Assuming you don’t use monorepos, it might correlate to a single GitHub or GitLab repository. It has a type which indicates how it might be used. To represent a backend service like our Passenger Backend, we use the type service.

A good rule of thumb is to draw the boundaries between pieces of software by considering their ownership. If a codebase, or a part of a codebase, is owned by a team, that’s probably a component you want to model by itself.

Adding libraries

Components can depend on other components. For example, our Passenger Backend has two important libraries installed into it. The Core Queuing Library is used to pass jobs over a shared queuing service and the Core Auth Module is used to authenticate incoming requests.

The Core Queuing Library is represented as a library component.

kind: Component
type: library
name: core-queuing-library

The relationship between the Core Queuing Library and the Passenger Backend is defined by a property on Passenger Backend.

kind: Component
type: service
name: passenger-backend
dependsOn:
  - core-queuing-library

Once that relationship is defined, we can show it off in Backstage by adding the EntityDependsOnComponentsCard to the interface.

You typically wouldn’t attempt to represent all dependencies of a service like this. Some services will have hundreds of libraries they depend on, and trying to account for all of them will introduce too much fragility into the model.

However, it might be appropriate to indicate a dependency on important libraries which are developed in-house and are found in lots of other components across the company.

Once we indicate that the Passenger Backend depends on the Core Queuing Library, Backstage has enough information to establish an inverse relationship. If we add the EntityDependencyOfComponentsCard and visit the Core Queuing Library in the Backstage catalog, we should see that it is a dependency of Passenger Backend.
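To see why one direction of the relationship is enough, here is a minimal TypeScript sketch that derives the inverse from the forward declarations. It only illustrates the idea and is not how Backstage implements it: the Component shape and the dependencyOf function are hypothetical simplifications of the key properties shown above.

```typescript
interface Component {
  name: string;
  dependsOn?: string[];
}

// Builds a lookup from a component name to the names of the
// components that declare a dependency on it.
function dependencyOf(components: Component[]): Record<string, string[]> {
  const inverse: Record<string, string[]> = Object.create(null);
  for (const component of components) {
    for (const target of component.dependsOn || []) {
      const dependents = inverse[target] || [];
      dependents.push(component.name);
      inverse[target] = dependents;
    }
  }
  return inverse;
}

const catalog: Component[] = [
  {
    name: 'passenger-backend',
    dependsOn: ['core-queuing-library', 'core-auth-module'],
  },
  { name: 'core-queuing-library' },
  { name: 'core-auth-module' },
];

console.log(dependencyOf(catalog)['core-queuing-library']); // [ 'passenger-backend' ]
```

Only the Passenger Backend declares anything, yet the library can still list who depends on it.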

Representing an API

The Passenger Backend service exposes a RESTful HTTP API so that other software in the company can communicate with it. They may use this API to look up the current location of a passenger for example.

APIs are represented in Backstage using the same three properties as components.

kind: API
type: openapi
name: passenger-api

The type specifies the specification language you are using to describe your API. We’ve specified OpenAPI here but others like GraphQL and gRPC are supported.

Once we have defined the API, we can indicate that the Passenger Backend service exposes it.

kind: Component
type: service
name: passenger-backend
providesApis:
  - passenger-api

Once we have indicated this relationship, we can show it off in the Backstage UI by adding the EntityProvidedApisCard. We would typically add this card to a tab on the Passenger Backend component so that people can look that component up in the catalog in order to read its API definition.

Combining things into systems

Together, the Passenger Backend and Passenger API make up a logical system. They are a group of entities with a well defined purpose, providing and managing information on passengers.

To represent them as a logical group in Backstage, we can define a system. Systems don’t have types, they are just systems.

kind: System
name: passengers

We can declare that the Passenger API and the Passenger Backend are part of the system by adding the system property to their definitions.

kind: Component
type: service
name: passenger-backend
system: passengers
providesApis:
  - passenger-api

kind: API
type: openapi
name: passenger-api
system: passengers

Once the system exists in Backstage, it will get its own page in the UI where we can represent its relationships. For example, we can add the EntityHasApisCard to see the APIs which are part of this system.

Similarly, we can add the EntityHasComponentsCard to see the components which are part of the system.

It’s important to note that the Core Queuing Library and the Core Auth Module are not considered to be part of the Passengers system. This is because they are shared libraries which are used in a large number of components throughout the org. They are probably owned and developed by a different organization within our ride-sharing company.

Now that we have defined a system, Backstage can diagram it for us. When we add the EntitySystemDiagramCard, we see something like the following:

Consuming APIs

Of course, it takes more than just a Passengers system to make a ride-sharing company hum. Those passengers need to go on trips, and we need to count the trips to see how rich we’re going to get. Let’s add the Trips system into Backstage, give it a Component and connect it up to the Passenger API.

The key properties required to represent this in Backstage are as follows:

kind: System
name: trips
---
kind: Component
type: service
name: trips-backend
consumesApis:
  - passenger-api

When we look up the Passenger API in Backstage, we can now see that the trips-backend is a downstream dependency.

For complex systems, it would be quite onerous to track and compile these dependencies manually. We are hopeful that the community will develop integrations into technologies like API gateways and service meshes so that dependencies can be inferred and represented in Backstage automatically.

Business domains

After some time, upper management decides that our ride sharing company should branch out into food delivery. To achieve this vision, they establish a new arm of the company.

To differentiate the systems we have created to move passengers around from those required to move takeout around, we can create a Domain in Backstage. Domains represent collections of systems which make up a coherent business unit.

Conclusion

With some simple labels like kind, type and name and a handful of relationships like dependsOn, providesApis and consumesApis, complex software architectures can be accurately modeled in Backstage.

Of course, it’s up to you to decide how granularly you want to represent your software. It’s totally fine to add components to Backstage and to choose not to group them into systems or domains. APIs are probably the second most useful concept to include since they indicate the interfaces between components.

To learn more about this topic, please refer to the Backstage documentation on entities and well-known relations.

Developer portals are a superpower

Wed, 12 May 2021 21:00:00 GMT

Last week, Cloud Economist and AWS guru Corey Quinn wrote a blog post declaring that developer portals are an anti-pattern. He mentioned Backstage, and explained why he believed that it was taking the industry in the wrong direction.

Despite generally excellent commentary on all things tech, in this case Corey’s arguments are mistaken.

Corey’s case against developer portals, and specifically Backstage, is centred around two main arguments:

Building in-house tooling to wrangle cloud services “robs a company’s engineers of an opportunity to develop reusable skills.”
“Developer portals inherently lag the underlying service’s capabilities”.

Let’s look at each argument in turn, and see why Backstage, with its vibrant open-source community, is part of a better engineering future.

To learn more about Backstage in general, and understand what it can do for your engineering organization, check out our Ultimate Guide to Spotify Backstage.

In-house tooling

Corey’s first argument is that

building in-house tooling to wrangle cloud services […] robs a company’s engineers of an opportunity to develop reusable skills.

This broad argument can be applied to any in-house tool, not just developer portals and Backstage. Building a bespoke continuous integration tool will rob engineers of an opportunity to learn how to use GitHub Actions or Circle CI for example.

It’s odd to see Backstage mentioned in this context because Backstage is actually part of the solution to this problem, rather than an exacerbating factor.

Backstage’s open source nature means that it can be deployed inside any company. If you use Roadie then you can use it as a SaaS tool just like GitHub Actions, Circle CI or any other reusable tool.

If the project eventually turns out to be as successful as something like Kubernetes, you will be able to leave a company which has Backstage, join a new one, and fire up Backstage on day one to learn about the ecosystem around you.

In fact, Backstage brings an opinionated UI/UX which increases the chance that skills will be transferable between companies, even if the internals are customized to the tools and cloud vendor of each company’s choosing.

Capability lag

Corey’s second point is that

developer portals inherently lag the underlying service’s capabilities

This is true of any downstream technology dependency. Features must be released in the upstream project before they can be exposed to users. Amazon’s Elastic Kubernetes Service will lag new Kubernetes releases, AWS Lambda will lag new NodeJS versions. Yet, thousands of companies use these services every day.

Backstage is not trying to completely hide underlying technologies from its users. If you have a special case or you need a cutting edge feature, you are absolutely free to jump into the PagerDuty UI or call the Kubernetes API directly. Backstage doesn’t block this.

Backstage’s goal is to handle the use cases which make up 80% of work. Reading docs, checking who is on call, re-triggering builds and so on.

The fact that Backstage is open-source software will help ensure that this lag is minimised. An array of open-source plugins are already being created by the community. If a feature is not supported, you can add it for yourself and for everyone else who is using that plugin. At Roadie, we are actively funding the maintenance and improvement of these plugins.

Each day, a large proportion of Spotify’s engineering organization choose to use Backstage, not because they are forced to, but because it adds value for them.

Proofpoints

As evidence of the apparent ills of developer portals, Corey offers up the fact that he hasn’t seen Backstage deployed in any company other than Spotify.

The reality is that Expedia Group, Zalando, and American Airlines have all chosen Backstage for their internal developer portal. The adopters list has many more participants listed.

Let’s be clear, we are still early in the curve of Backstage adoption. The open-source version is just over a year old. It was released early and with limited functionality in place. Spotify are rapidly iterating, alongside the community and with their input, rather than simply dropping a finished product.

This development style means that open-source Backstage isn’t quite baked enough for some companies. That is ok. The community is flourishing, the CNCF is backing it, and Spotify and Roadie are heavily invested in building a powerhouse project.

Roadie

Of course, I’m biased in my belief that Backstage will succeed. I spent years working on a developer portal and service catalog at Workday, and I’ve seen the value first hand, both for the business and the end user.

Our vision is to make Backstage as ubiquitous, powerful, and pleasant to use as GitHub. Backstage will be a reusable skill for engineers all over the world. They will use it because it improves their work lives and gives them access to the information they need to do their jobs. Long live developer portals.

If you share this vision, join us.

Backstage TechDocs - How it works

Sun, 18 Apr 2021 21:00:00 GMT

TechDocs is the core Backstage feature which transforms markdown documentation into HTML and displays it inside Backstage where your engineering teams can find it.

There are two ways to set up TechDocs in Backstage: the Basic approach and the Recommended approach. But how do they work, and which should you use?

Read on to find out.

Prerequisites

Docker installed and running locally on your machine.
The git version control system and a GitHub account.

Basic TechDocs

First let’s see what the basic experience gets us and how it works.

Use git to clone the main Backstage repo. We have used this point in the history but most versions should work. Run yarn install, yarn tsc and yarn build to prepare the codebase and then start it with yarn dev.

Backstage should shortly be running on http://localhost:3000. Sign in as a guest, add this sample-service to your Backstage catalog and navigate to its docs tab.

Once the loading process completes, you should see some docs. Simple!

Let’s take a look at what actually happened under the hood.

How basic TechDocs works

Docs were generated and displayed because Backstage detected the backstage.io/techdocs-ref annotation contained in the catalog-info.yaml file of our sample-service. This tells Backstage that there are docs available which it should show to the user.

There are three actors involved in building the docs: the preparer, the generator and the publisher.

The preparer cloned the sample service repository into a temp directory on our local machine so the docs can be accessed.

The generator then downloaded the spotify/techdocs image from Docker Hub. This image contains Python, a few dependencies and a Python library called mkdocs-techdocs-core. The generator ran this library against the docs directory of the sample-service in order to convert the markdown files located there into HTML, CSS and JS files.

The mkdocs-techdocs-core library is a wrapper around two other libraries, MkDocs and Material for MkDocs.

MkDocs is a static-site generator which takes a directory and some config and uses it to create a documentation website containing HTML, CSS and JS.
Material for MkDocs is a MkDocs theme which emulates the Material UI design pattern.

So MkDocs is generating a static website and Material for MkDocs is styling it. What’s next?

Once the documentation site has been stamped out into a temp directory, it must be moved somewhere where Backstage can access it.

The publisher is responsible for this step and by default it chooses to move the documentation to plugins/techdocs-backend/static/docs/default/Component/sample-service-1/ . If you open this directory you will find a sensible structure containing HTML, JS and CSS files. You should notice a clear similarity between these files and the docs you see in Backstage.
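The publish location above follows a namespace/kind/name layout. As a minimal sketch in Python, the directory for any entity could be derived as shown here. The docs_directory helper is hypothetical and not part of Backstage; it only mirrors the path quoted above.

```python
from pathlib import PurePosixPath

# Root used by the basic (local) publisher, as quoted in the text above.
STATIC_DOCS_ROOT = PurePosixPath("plugins/techdocs-backend/static/docs")


def docs_directory(namespace: str, kind: str, name: str) -> PurePosixPath:
    """Return the directory holding the generated docs for one entity.

    Hypothetical helper: it mirrors the namespace/kind/name layout seen in
    the example path and is not an actual Backstage function.
    """
    return STATIC_DOCS_ROOT / namespace / kind / name


# The sample service from this walkthrough.
print(docs_directory("default", "Component", "sample-service-1"))
```

Seeing the layout spelled out makes it easier to find the generated files for other entities in your catalog.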

Now that the files are on the filesystem, the TechDocs frontend can simply request them and insert them into the browser’s DOM as a shadow-DOM. That’s how they end up in the page where you can see them.

Limitations of basic TechDocs

This basic architecture is easy to get started with but it has a number of downsides:

Docker must be available in the place where you want to generate the docs. This may not be viable if Backstage is running in an environment like a Kubernetes pod. You can use the MkDocs binary instead, but then you end up with non-core Backstage dependencies in your Dockerfile.
It’s slow on the first request because TechDocs must generate the docs and place them in the filesystem.
When running multiple Backstage backends, TechDocs may generate and store the docs once for each backend. This leads to extra slowness for the end users.
Backstage is pulling down the entire source code of the component to the local filesystem to generate the docs. This may not match your security expectations.

For these reasons, the TechDocs team recommends a CI driven architecture for generating and storing docs.

The idea is that a process, like a GitHub action or other CI build, runs every time there is a change to the markdown files which contain our documentation. This process uses the mkdocs-techdocs-core library to convert the markdown files to a static website just like before. However, instead of writing the resulting HTML, CSS and JS files to the local filesystem, it pushes them to an object store like an AWS S3 bucket. From here, Backstage can request them when needed and render them in the browser for the user.
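To make the upload step more concrete, here is a rough Python sketch of how a CI job might map the generated files to object keys before pushing them to a bucket. It assumes the bucket reuses the namespace/kind/name layout from the basic setup; the object_keys function and the prefix value are illustrative names, and the actual upload call is left out on purpose.

```python
from pathlib import Path
from typing import Dict


def object_keys(site_dir: Path, entity_prefix: str) -> Dict[str, Path]:
    """Map every generated file under site_dir to the key it would get.

    entity_prefix is assumed to look like "default/Component/sample-service".
    Directories are skipped because an object store only holds files.
    """
    keys: Dict[str, Path] = {}
    for path in sorted(site_dir.rglob("*")):
        if path.is_file():
            relative = path.relative_to(site_dir).as_posix()
            keys[f"{entity_prefix}/{relative}"] = path
    return keys
```

A CI job would then iterate over the returned mapping and upload each file under its key with whichever storage client it prefers.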

Converting to the recommended architecture

To convert our basic setup to the recommended architecture, we need to make a few changes. We’re using AWS in this example but Google Cloud Platform, Azure and a host of other platforms are supported. We’re also using GitHub Actions but CircleCI and others should work too.

We need the following things:

An AWS S3 bucket to store our docs, and credentials to authenticate uploading and downloading.
A process to convert markdown docs to HTML, CSS and JS and to push the resulting files to our bucket.
Configuration to tell Backstage to pull the docs from S3 instead of generating them with Docker.

Follow the official AWS documentation to create an AWS S3 bucket. Acquire an Access Key ID and Secret Access Key which will authenticate requests to your bucket. You will also need to note the region your bucket lives in.

Add a GitHub Action to your component to do the markdown to HTML conversion and push to S3. The official Backstage docs have a really good example of the code required. Don’t forget to create secrets in your GitHub repo to store the bucket name and AWS credentials you created earlier.

Edit the app-config.yaml file in your Backstage repo.

Change techdocs.builder to external to tell Backstage to stop generating docs locally.
Change techdocs.publisher.type to awsS3.
Set techdocs.publisher.awsS3.bucketName to the name of your bucket.

The techdocs section of your app-config.yaml should now look like this:

techdocs:
  builder: 'external' # Alternatives - 'local'
  generators:
    techdocs: 'docker' # Alternatives - 'local'
  publisher:
    type: 'awsS3' # Alternatives - 'local' or 'googleGcs' or 'azureBlobStorage' or 'openStackSwift'. Read documentation for using alternatives.
    awsS3:
      bucketName: 'demo.roadie.so'

Restart Backstage with the AWS credentials present in the environment variables:

env AWS_ACCESS_KEY_ID=xxx AWS_SECRET_ACCESS_KEY=yyy AWS_REGION=ppp yarn dev
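If Backstage cannot read from the bucket, a missing variable is a likely first suspect. The short Python sketch below checks for the three variables used in the command above before you start the app. The missing_variables helper is an illustrative name, not something Backstage or AWS provides.

```python
import os
from typing import List, Mapping

# The variables passed on the command line above.
REQUIRED_AWS_VARIABLES = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION")


def missing_variables(env: Mapping[str, str]) -> List[str]:
    """Return the required variable names that are unset or empty in env."""
    return [name for name in REQUIRED_AWS_VARIABLES if not env.get(name)]


# Report on the current shell environment without printing any secret values.
missing = missing_variables(os.environ)
print("Missing:", ", ".join(missing) if missing else "nothing")
```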

From now on, when you merge a change to the default branch of your GitHub repo, a GitHub action will run to generate and publish docs to S3. From there, Backstage will request them and show them to the user.

Conclusion

Converting your TechDocs from the basic to the recommended setup brings a number of advantages and it only takes a few minutes to switch over one repo.

Deploying Backstage application to AWS ECS Fargate

Wed, 17 Feb 2021 16:00:00 GMT

In this tutorial, we’re going to deploy a basic Backstage application to AWS. The application will be using a stack of AWS resources to its advantage. We’ll set up a database to run PostgreSQL on AWS RDS, store our environment variables to AWS SSM Parameter Store, route our traffic through an AWS Application Load Balancer and last but not least, run our Backstage application on AWS Fargate compute engine.

We’ll be using the AWS console for most of the actions to scaffold the application, but all steps can be done using either aws-cli or infrastructure as code tools like Terraform or Pulumi.

Prerequisites

To complete this tutorial, you will need:

Docker installed and running on your local machine.
NodeJS and Yarn installed on your local machine.
AWS account with permissions to create IAM policies, RDS databases, Load Balancers, ECS Fargate Clusters and managed ECR repositories.
AWS CLI set up locally with your AWS credentials.

Step 1 - Spinning up your RDS Database instance

To run properly, Backstage needs a database to store and handle data. In an AWS environment we can spin up an RDS PostgreSQL database to handle that for us.

Let’s navigate to the AWS RDS console and do just that. We’ll start off by clicking the big orange button labelled ‘Create database’.

We select the standard create option and select PostgreSQL as our database engine. For templates, we can for now go with the free tier one if it is still available for your AWS account.

On the settings section we will set up our database name and master username, and finally generate a password using our favorite password manager. These are good items to temporarily store somewhere, because we will be needing them later. For this deployment the database instance does not yet have to be big and beefy so we will go with the free tier T2.micro instance.

We can leave the Storage, Availability & durability and Database authentication sections at their default values and focus our attention on the Connectivity section. In this section we will select our preferred VPC and subnets. If nothing special is needed, you can use the default VPC for now as well as the default subnet group. Ideally your database subnet should not be accessible from the internet, or even be able to access the internet itself, but securing networking within AWS is out of scope for this tutorial.

We do want to create a new security group for our instance though. We’ll name it backstage_rds_SG and select 5432 as our port. AWS will automatically create a new security group for us that grants access to the database port from our IP address. We will later change this IP to point to the security group of our Fargate service.

After these selections we can click Create database and wait for it to become available.

Step 2 - Setting up proper policies to run Fargate containers

Before we can start shipping our Backstage container to AWS we need to have a few prerequisites set up for the task to be able to run properly. We’ll want good logging so we’ll give the task permissions to write to CloudWatch. We also want to be able to read environment variables stored in Systems Manager Parameter Store, so we’ll create a policy to do just that as well. Additionally we are creating a private repository for our container images so we’ll create a policy to be able to pull those down. All of these policies will be attached to the AWS IAM Role that we will assign to the running container.

To set up these policies and roles, let’s go to the AWS IAM Management Console. There we will first go to the Policies section and click to create a new policy. The policy JSON to read SSM Parameters is the following:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ssm:GetParameters"
      ],
      "Resource": "*"
    }
  ]
}

We should additionally restrict the star-scoped resource to match only needed parameters for this application. That could be something like arn:aws:ssm:[REGION]:[ACCOUNT_ID]:parameter/roadie/backstage/*, depending on the namespace we choose to use in later steps.
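As a sketch of that narrowing, the Python snippet below builds the same policy with the resource limited to one parameter namespace. The region, account ID and namespace are placeholder values, and ssm_read_policy is just an illustrative helper name.

```python
import json


def ssm_read_policy(region: str, account_id: str, namespace: str) -> dict:
    """Build the SSM read policy, scoped to parameters under one namespace."""
    resource = f"arn:aws:ssm:{region}:{account_id}:parameter/{namespace.strip('/')}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["ssm:GetParameters"],
                "Resource": resource,
            }
        ],
    }


# Placeholder values; substitute your own region, account ID and namespace.
print(json.dumps(ssm_read_policy("eu-west-1", "123456789012", "roadie/backstage"), indent=2))
```

The printed JSON can be pasted into the policy editor in place of the star-scoped version.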

We also want our logs from Fargate to go to some place where we can see them so we’ll create another policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents",
        "logs:DescribeLogStreams"
      ],
      "Resource": [
        "arn:aws:logs:*"
      ]
    }
  ]
}

Again, if we want to write only to a predefined log group and stream, we should scope the resource section to match that. We can also leave CreateLogGroup out in that case since the Fargate task doesn’t need permissions to create it.
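The tightening described above could look roughly like the following Python sketch. It takes the broad policy, drops logs:CreateLogGroup and limits the resource to a single log group. The log group name /ecs/backstage and the account details are assumptions for illustration, and the ARN uses the commonly seen log-group:NAME:* form, which is worth checking against the AWS documentation for your setup.

```python
import copy

# The broad policy from the text above.
BROAD_LOGS_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents",
                "logs:DescribeLogStreams",
            ],
            "Resource": ["arn:aws:logs:*"],
        }
    ],
}


def scope_logs_policy(policy: dict, region: str, account_id: str, log_group: str) -> dict:
    """Return a copy of policy limited to one existing log group."""
    scoped = copy.deepcopy(policy)  # leave the original policy untouched
    statement = scoped["Statement"][0]
    # The log group already exists, so the task no longer needs to create it.
    statement["Action"] = [a for a in statement["Action"] if a != "logs:CreateLogGroup"]
    statement["Resource"] = [f"arn:aws:logs:{region}:{account_id}:log-group:{log_group}:*"]
    return scoped


# Hypothetical log group name and placeholder account details.
scoped_policy = scope_logs_policy(BROAD_LOGS_POLICY, "eu-west-1", "123456789012", "/ecs/backstage")
```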

The last policy we want to create is to allow Fargate to download the Docker images we have pushed to our private ECR.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ecr:BatchCheckLayerAvailability",
        "ecr:BatchGetImage",
        "ecr:GetDownloadUrlForLayer",
        "ecr:GetAuthorizationToken"
      ],
      "Resource": "*"
    }
  ]
}

Finally, now that we have our policies set up, we can create a Role that we can attach to the running Fargate task.

We’ll jump into the Roles section of the IAM console and click the ‘Create role’ button. We select the trusted entity type ‘Elastic Container Service’ and the use case ‘Elastic Container Service Task’. On the next page, where a list of permissions is displayed, we select the three policies we created above.

When we navigate to the role we should make sure that the correct trust policy JSON has been assigned to it. Because we chose the ‘Elastic Container Service Task’ use case, and because this role is assumed by our Fargate tasks (as both the task role and the task execution role in Step 6), our trust policy should point to ecs-tasks.amazonaws.com only. A role that trusts only ecs.amazonaws.com cannot be assumed by a task.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": [
          "ecs-tasks.amazonaws.com"
        ]
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

Now we have all the prerequisites on IAM side ready for our deployment.

Step 3 - Defining our environment in Systems Manager Parameter Store

We know the connection details for our database, so next we will store them. We’ll also store our GitHub token in the same way, as an environment variable (make sure you have created a GitHub token as explained here, if you want to use GitHub with your Backstage). There are a few different ways to pass environment variables to running containers in AWS ECS. We will be using AWS Systems Manager Parameter Store to save them in a safe place from which they can be loaded into the running container. AWS ECS can also load environment variables from a flat file stored in S3, or take them directly (unsecured) in the task definition.

Let’s navigate to the Parameter Store and populate the needed values in there. By clicking Create Parameter we can create values for the credentials of the RDS instance we created previously, as well as for the GitHub token.

At least the DB_PASSWORD and GITHUB_TOKEN should be of type SecureString, so they are encrypted. We’ll be using just the default KMS Key in this case to encrypt the values, but it might be worthwhile to generate a specific key for these parameters.

In the end we should have a short list of parameters that we can use later.

Note that we have not defined the database port to be retrieved from parameter store here. That might be something you want to do if the ports change regularly or are non-standard, but is not really necessary.

Step 4 - Creating a Load Balancer for our Backstage service

The last scaffolding bit we want to do to support our Fargate Backstage is to set up a load balancer in standby to wait for our Fargate service to attach itself to it. We do this step a bit prematurely just to have a good static URL available to point to when we eventually start building the actual Backstage application.

Let’s navigate to AWS Load Balancer section in the console and spin one up.

We want to create an Application Load Balancer that is internet facing. It is a good idea to select all subnets, so the load balancer is able to target our containers even if they are spread out across different AWS Availability Zones. For now the only listener we attach to the load balancer is listening to HTTP traffic through port 80, but if you have a domain that you control and can create certificates for, you should be using HTTPS and port 443.

On the next step we configure the security of our balancer. If you didn’t choose to use HTTPS protocol, AWS will show you a little warning to do so. If you did, you should be putting in your certificate details for the domain name you have available.

We’ll continue onwards to setting up our security groups. In this case we want to create a new one for the load balancer. The only thing we need to configure (in this setup without HTTPS) is for this group to allow traffic to port 80 from everywhere (0.0.0.0/0, ::/0). If you want to restrict access to your Backstage instance, you can define an IP range of your office network or VPN, or your personal public IP.

For target groups we just change the port to 7000, which will be the one our Backstage instance will be using and give the target group a name. We will not be registering any actual targets yet. That will be handled when we spin up our Fargate services.

The load balancer will take a few minutes to spin up. While waiting for that we will take note of the DNS name of the balancer. This will be the entry we’ll modify our application configuration with. Of course, if you have added a CNAME/Alias entry of your own domain to point to the load balancer, you should use that instead.

Step 5 - Creating your Backstage image

To deploy the Backstage application we want to have it packaged into a Docker image with the configuration best suited to our environment. We’ll start this journey in the Backstage repository. For more information on how to scaffold the initial application you can take a look at the post to get Backstage running with Docker compose. For this post we start the same way and scaffold a fresh Backstage application by running npx @backstage/create-app. After we have figured out a good name for the app and selected PostgreSQL as our database provider, we are ready to massage our configuration files to match what we want our environment to look like.

If we take a look at the default app-config.yaml file we see a few environment variables that are needed to get the app running properly. For our use case, based on the default app-config.yaml file, these environment variables are:

POSTGRES_HOST
POSTGRES_PORT
POSTGRES_USER
POSTGRES_PASSWORD
GITHUB_TOKEN

These happen to be the same items we created in AWS Parameter Store previously, so it looks like we are on the right track.

A lot of the values in the default configuration file are not necessary and can be removed. Things like default catalog locations can be removed since that section depends a lot on the way you want to configure your Backstage instance. For this tutorial, we will leave the whole configuration file as is.

Previously we created a load balancer to have a more stable DNS entry we can use as the application entrypoint. We’ll add that in to our application configuration by modifying the app-config.production.yaml. We’ll also turn off HTTP->HTTPS redirection from our Content-Security-Policy for now since our Load Balancer only supports port 80. This would be something that can be omitted for more secure environments where HTTPS is set up.

The whole file would eventually look something like this:

app:
  baseUrl: http://roadie-fargate-loadbalancer-123456789.eu-west-1.elb.amazonaws.com

backend:
  baseUrl: http://roadie-fargate-loadbalancer-123456789.eu-west-1.elb.amazonaws.com
  listen:
    port: 7000
  csp:
    upgrade-insecure-requests: false # For tutorial purposes only

That is all the configuration needed to build an image we can run on Fargate. To create the actual deployable we can rely on the built-in build-image command that produces a Docker image with the current content of our workspace. We run yarn build-image and wait for the Backstage CLI to do its thing. By running docker images we can see that the previous command has created a Docker image for us with the repository name backstage and the tag latest.

To run containers in Fargate we need to store the Docker image somewhere where the ECS service can download it from. For that we will create a new repository in ECR which we can use as the home for our container. The easiest way to do this is to use the aws-cli tool. Note that we are using eu-west-1 region throughout this tutorial, so be sure to change to your preferred region accordingly.

aws ecr create-repository --repository-name fargate-backstage --region eu-west-1

AWS responds with the configuration of the repository, which we can then use to tag and push our image. Here is the JSON output from the command:

{
    "repository": {
        "repositoryArn": "arn:aws:ecr:eu-west-1:123456789012:repository/fargate-backstage",
        "registryId": "123456789012",
        "repositoryName": "fargate-backstage",
        "repositoryUri": "123456789012.dkr.ecr.eu-west-1.amazonaws.com/fargate-backstage",
        "createdAt": "2021-02-16T13:56:38+01:00",
        "imageTagMutability": "MUTABLE",
        "imageScanningConfiguration": {
            "scanOnPush": false
        },
        "encryptionConfiguration": {
            "encryptionType": "AES256"
        }
    }
}

From that configuration we grab the repositoryUri and use that to tag our Backstage image with the correct repository path and version number. We’ll trust that this first iteration of our image is production ready, so we bravely start versioning from number 1.0.0.

docker tag backstage:latest 123456789012.dkr.ecr.eu-west-1.amazonaws.com/fargate-backstage:1.0.0
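If you script this step, the full image reference can be derived from the create-repository output rather than copied by hand. The Python sketch below assumes the JSON shown earlier has been captured as a string; image_reference is an illustrative helper name.

```python
import json


def image_reference(create_repository_output: str, version: str) -> str:
    """Combine the repositoryUri from the AWS CLI output with a version tag."""
    repository = json.loads(create_repository_output)["repository"]
    return f"{repository['repositoryUri']}:{version}"


# Trimmed copy of the output shown above; only repositoryUri is needed.
SAMPLE_OUTPUT = """
{
    "repository": {
        "repositoryName": "fargate-backstage",
        "repositoryUri": "123456789012.dkr.ecr.eu-west-1.amazonaws.com/fargate-backstage"
    }
}
"""

print(image_reference(SAMPLE_OUTPUT, "1.0.0"))
```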

Now we are ready to push our image to our newly created repository and move on to scaffolding other AWS resources. First, let’s log in to ECR:

aws ecr get-login-password --region eu-west-1 | docker login --username AWS --password-stdin 123456789012.dkr.ecr.eu-west-1.amazonaws.com

and then push the image up to AWS ECR

docker push 123456789012.dkr.ecr.eu-west-1.amazonaws.com/fargate-backstage:1.0.0

Step 6 - Defining our Fargate tasks

All the supporting configuration should now be done, and we can finally move to define the actual container, service and task that will be running our Backstage instance.

We’ll start off by creating a new cluster in AWS ECS. Clusters in ECS are mostly just for namespacing purposes, but they are tied to a specific VPC, so make sure you choose the same VPC where the load balancer and RDS database are.

The next step is to create a task definition. This will contain the settings for our Fargate instance, our container definitions and the configuration on how we pass in our environment variables. We’ll select Fargate as the type of task and fill in the needed values. The role for the task itself as well as the task execution role should be the one we created earlier for this purpose. We’ll select half a vCPU and 1GB of memory for this first iteration and see how the service behaves. These can be updated later if there is a need for more resources.

Our task of course needs a container, so we will create a new container definition by clicking ‘Add container’. We’ll give our container a descriptive name and on the image textfield add our freshly created and pushed 123456789012.dkr.ecr.eu-west-1.amazonaws.com/fargate-backstage:1.0.0 Docker image. For port mappings we’ll add a single item, exposing port 7000 from the container. Other values on this section can be left as default.

A little bit further down, in the environment section, we will add a few lines to retrieve our env variables from Parameter Store. The environment variable names came from our app-config.yaml file and were:

POSTGRES_HOST
POSTGRES_PORT
POSTGRES_USER
POSTGRES_PASSWORD
GITHUB_TOKEN

Most of the environment variables we define will use the ‘ValueFrom’ type to retrieve needed information. For these we add the key and point the value to the ARN of the corresponding parameter in Parameter Store. The port value alone is passed in as plain text.
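In the task definition JSON, these console choices end up as two lists on the container definition: secrets (name plus valueFrom) and environment (name plus value). The Python sketch below assembles both. The parameter ARNs use a placeholder account and the /roadie/backstage namespace suggested earlier, and container_environment is a hypothetical helper.

```python
from typing import Dict, List


def container_environment(
    parameter_arns: Dict[str, str], plain_values: Dict[str, str]
) -> Dict[str, List[Dict[str, str]]]:
    """Build the secrets and environment lists for a container definition."""
    return {
        # Values resolved from Parameter Store when the task starts.
        "secrets": [
            {"name": name, "valueFrom": arn}
            for name, arn in sorted(parameter_arns.items())
        ],
        # Values stored as plain text in the task definition itself.
        "environment": [
            {"name": name, "value": value}
            for name, value in sorted(plain_values.items())
        ],
    }


# Placeholder ARN prefix; substitute your own region, account and namespace.
PARAMETER_PREFIX = "arn:aws:ssm:eu-west-1:123456789012:parameter/roadie/backstage"
SECRET_NAMES = ["POSTGRES_HOST", "POSTGRES_USER", "POSTGRES_PASSWORD", "GITHUB_TOKEN"]

definition = container_environment(
    {name: f"{PARAMETER_PREFIX}/{name}" for name in SECRET_NAMES},
    {"POSTGRES_PORT": "5432"},
)
```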

We’ll also click on autoconfigure CloudWatch Logs to be able to see the logs from the running container.

That is all the configuration needed for the task definition for now.

The final step to start up these tasks is to create an ECS service within our cluster that points to the task definition we have created. We’ll navigate back to the cluster we have created and on the services tab click Create.

We’ll select the Fargate launch type and pick the task definition we just created. It is a good idea to set the Platform Version to 1.4.0 since ‘Latest’ counterintuitively actually points to ‘1.3.0’. We can leave deployments as a rolling update for now and keep the Task Tagging config at its default values.

The next step in the wizard is the networking configuration. We’ll choose the same VPC that our cluster, RDS and load balancer are in and select a few (or all, since our LB supports all of them) of the subnets from the dropdown. We want to assign a public IP to the service in this case because we are accessing a regional AWS service, SSM Parameter Store, and we don’t have a VPC endpoint set up for it.

We will create one more security group for this service. The security group doesn’t really need to accept traffic from anywhere other than our load balancer. It is a good practice to keep the firewall as secure as possible, so we’ll configure the new security group to allow access only to port 7000 and only from one source group, our load balancer security group.

Note that AWS doesn’t really make UX around this too easy by deciding to display security group ids only. You need to navigate to either security groups in the VPC console or directly to your Load Balancer to see what the id of the security group is.

Now that our security group allows access from our load balancer, we can set the radio button in the Load Balancing section to Application Load Balancer. We select our load balancer from the drop down, select our container from the second drop down and add it to the load balancer. Most of the values are autopopulated for us. We’ll choose the target group we created and let ECS register the service as a target in it.

The rest of the settings we can leave as defaults and just click through the wizard. ECS will automatically start spinning up our service. When we navigate to our ECS service and its tasks tab we should be able to see ECS trying hard to provision our containers.

It will take a few minutes before it reaches ‘RUNNING’ status. Unfortunately it doesn’t seem to stay in ‘RUNNING’ status for too long and instead ends up in a loop of starting a new task and failing, one after another.

We can investigate and debug why the running container fails to stay up by checking CloudWatch logs that our task has written. In cases where the task doesn’t start at all we can take a look at the task itself from the ECS pages to see what prevents it from starting. These could be something like IAM policy issues or perhaps a wrong URL to the image that we have defined.

When we take a look at the logs we can see that the container starts up and Backstage itself within the container tries its best to start up. It fails with a Knex timeout error, telling us that Knex is unable to connect to the database.

There is one thing we want to do to fix that. In the first step we spun up an RDS database and created a new security group for it. This security group does not allow our container to access the database, so we need to make some modifications to it. We can navigate to the security group via RDS and modify its inbound rules. We will add a new line allowing traffic to port 5432 from the security group that we created for our Fargate service. After clicking save, the change to the firewall is immediate, and the next task ECS spins up for us should be able to connect to the database and stay up and running.
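For anyone automating this instead of using the console, the inbound rule can be expressed as a single permission entry. The Python sketch below builds it in the shape that the EC2 AuthorizeSecurityGroupIngress API accepts in its IpPermissions list (as far as I know, boto3's authorize_security_group_ingress takes the same structure). The security group ID is a made-up placeholder and database_ingress_rule is an illustrative helper; no AWS call is made here.

```python
def database_ingress_rule(source_security_group_id: str, port: int = 5432) -> dict:
    """Describe an inbound TCP rule that admits traffic from one security group."""
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        # Reference the Fargate service's group instead of an IP range.
        "UserIdGroupPairs": [{"GroupId": source_security_group_id}],
    }


# Placeholder ID for the Fargate service security group.
rule = database_ingress_rule("sg-0123456789abcdef0")
print(rule)
```

Referencing the source group rather than an IP range keeps the rule valid even as tasks are replaced and get new addresses.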

And that should be it!

We can now navigate to our load balancer URL and we should be seeing a running Backstage instance with default data scaffolded for us.

Conclusion

Getting Backstage up and running on AWS Fargate requires multiple steps and configurations but provides a secure and manageable Backstage instance after the initial configuration is done. There are a few avenues where this solution can evolve from here. Things like high availability and monitoring are worth thinking about when spinning up a Backstage instance, and will eventually bring more complexity into the solution. With this tutorial you can get going and start experimenting with Backstage before moving into more complex architectures.

How to deploy Backstage on KIND Kubernetes

Mon, 01 Feb 2021 21:00:00 GMT

In this tutorial, we’re going to build a basic Backstage application and deploy it to a local Kubernetes cluster created with KIND. The application will be able to store data, such as the services in the Backstage catalog, in an in-memory SQLite database.

Prerequisites

To complete this tutorial, you will need:

Docker installed and running on your local machine.
NodeJS installed on your local machine.
The Yarn package manager installed. You can use npm if you like, although you will have to modify the shell commands somewhat.
The KIND Kubernetes cluster manager installed. You can skip this requirement if you already have a Kubernetes cluster which you wish to install Backstage into.
The Kubernetes kubectl command line tool, for interfacing with the cluster we will create.

Step 1 - Scaffold a Backstage application

To run Backstage on Kubernetes, we first need to scaffold a Backstage application to work with. The main Backstage codebase does ship with a sample application we can run, but best practices dictate that we should create our own so we can customize it with our company name and other attributes.

Backstage requires a database to store information about the components, websites and other entities you want to track in the catalog. There are two built-in database options, SQLite and PostgreSQL. We’re going to use SQLite for this tutorial.

It is simpler and quicker to get set up with Backstage and SQLite. The downside is that our data will be stored in memory, and will be lost if we upgrade or restart our Backstage instance or Kubernetes pod.

This tutorial uses version 0.3.7 of the Backstage CLI to create this application. You may see different results if you’re using a different version.

npx @backstage/create-app --version
npx: installed 67 in 5.094s
0.3.7

npx @backstage/create-app
npx: installed 67 in 4.944s
? Enter a name for the app [required] scaffolded-app-sqlite
? Select database for the backend [required] SQLite

Creating the app...

 Checking if the directory is available:
  checking      scaffolded-app-sqlite ✔

 Creating a temporary app directory:
  creating      temporary directory ✔

 Preparing files:
  templating    .gitignore.hbs ✔
  copying       .eslintrc.js ✔
  copying       app-config.production.yaml ✔
  templating    app-config.yaml.hbs ✔
  templating    catalog-info.yaml.hbs ✔
  copying       README.md ✔
  copying       lerna.json ✔
  templating    package.json.hbs ✔
  copying       tsconfig.json ✔
  copying       .eslintrc.js ✔
  copying       Dockerfile ✔
  copying       README.md ✔
  templating    package.json.hbs ✔
  copying       index.test.ts ✔
  copying       index.ts ✔
  copying       types.ts ✔
  copying       app.ts ✔
  copying       auth.ts ✔
  copying       catalog.ts ✔
  copying       proxy.ts ✔
  copying       scaffolder.ts ✔
  copying       techdocs.ts ✔
  copying       .eslintrc.js ✔
  copying       cypress.json ✔
  templating    package.json.hbs ✔
  copying       apple-touch-icon.png ✔
  copying       android-chrome-192x192.png ✔
  copying       favicon-16x16.png ✔
  copying       favicon-32x32.png ✔
  copying       favicon.ico ✔
  copying       index.html ✔
  copying       manifest.json ✔
  copying       robots.txt ✔
  copying       safari-pinned-tab.svg ✔
  copying       .eslintrc.json ✔
  copying       app.js ✔
  copying       App.test.tsx ✔
  copying       App.tsx ✔
  copying       LogoFull.tsx ✔
  copying       LogoIcon.tsx ✔
  copying       apis.ts ✔
  copying       index.tsx ✔
  copying       plugins.ts ✔
  copying       setupTests.ts ✔
  copying       sidebar.tsx ✔
  copying       EntityPage.tsx ✔

 Moving to final location:
  moving        scaffolded-app-sqlite ✔

 Building the app:
  executing     yarn install ✔
  executing     yarn tsc ✔

🥇  Successfully created scaffolded-app-sqlite

See https://backstage.io/docs/tutorials/quickstart-app-auth to know more about enabling auth providers

Step 2 - Building a Docker image

Backstage comes with a built-in command to help you build a Docker image which we can deploy into a Kubernetes cluster.

Change into the scaffolded-app-sqlite directory which we just created, and use yarn to run a command which will build the Docker image.

yarn build-image
yarn run v1.22.10
$ yarn workspace backend build-image
$ backstage-cli backend:build-image --build --tag backstage
# Lots of output omitted...
=> => naming to docker.io/library/backstage    0.0s
✨  Done in 177.33s.

We should now see that an image has been built successfully.

docker images
REPOSITORY         TAG       IMAGE ID       CREATED         SIZE
backstage          latest    7b452013e713   3 minutes ago   1.1GB

And we can run it using Docker directly.

docker run -p 7000:7000 backstage
2021-01-31T16:41:18.319Z backstage info Initializing http server
2021-01-31T16:41:18.322Z backstage info Listening on :7000

Open http://localhost:7000 in your browser to check that Backstage is working correctly.

Step 3 - Create a KIND Kubernetes cluster

Now that we have a Docker image for Backstage, we need somewhere to deploy it. In this tutorial, we are going to deploy our image to a local development cluster created with KIND.

Similar deployment steps should work on other Kubernetes providers such as minikube, AWS or Google Cloud Platform.

Use kind to create a Kubernetes cluster to work with. We need some special settings on our cluster so we can configure ingress in the cluster with Nginx. Use this snippet from the KIND docs.

kind create cluster
Creating cluster "kind" ...
 ✓ Ensuring node image (kindest/node:v1.19.1) 🖼
 ✓ Preparing nodes 📦
 ✓ Writing configuration 📜
 ✓ Starting control-plane 🕹️
 ✓ Installing CNI 🔌
 ✓ Installing StorageClass 💾
Set kubectl context to "kind-kind"
You can now use your cluster with:

kubectl cluster-info --context kind-kind

Once this completes, your kubectl command line utility should be automatically configured to use this newly created cluster.

» kubectl config get-contexts
CURRENT      NAME           CLUSTER        AUTHINFO       NAMESPACE
*            kind-kind      kind-kind      kind-kind

The Backstage Docker image we built previously is not automatically shared with our KIND Kubernetes cluster. Before we can use it, we have to load it into the cluster. This is covered in the KIND docs.

kind load docker-image backstage:latest
Image: "backstage:latest" with ID "sha256:fe0c8bf5323b46fc145cab5832e6df4d7871d1cfd230e497d025e5bb5bdd2c05" not yet present on node "kind-control-plane", loading...

Now that the image is loaded, we can create a Backstage deployment and a service to expose it on an IP inside the cluster. Save the following YAML into a file called manifest.yaml.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: backstage
spec:
  replicas: 1
  selector:
    matchLabels:
      app: backstage
  template:
    metadata:
      labels:
        app: backstage
    spec:
      containers:
        - name: backstage
          imagePullPolicy: Never
          image: docker.io/library/backstage:latest
          ports:
            - containerPort: 7000
---
kind: Service
apiVersion: v1
metadata:
  name: backstage-service
spec:
  selector:
    app: backstage
  ports:
    - port: 7000

You’ll notice that we have set the imagePullPolicy to Never. This stops Kubernetes from trying to pull the Backstage Docker image from a remote registry instead of using the one we loaded onto the cluster earlier. Because the image is tagged latest, Kubernetes would otherwise default to always pulling it, and since we never pushed the image to a registry, the pull (and our deployment) would fail.
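To make the default behaviour concrete, here is a minimal sketch of the rule Kubernetes documents for containers that omit imagePullPolicy: images with no tag or the latest tag are always pulled, while other tags and digests are only pulled if not already present. The function name default_pull_policy is our own invention for illustration, and the parsing is deliberately simplified.

```python
def default_pull_policy(image: str) -> str:
    """Return the policy Kubernetes applies when imagePullPolicy is omitted.

    Simplified illustration of the documented defaults, not Kubernetes code.
    """
    if "@" in image:
        # Image pinned by digest, e.g. backstage@sha256:...
        return "IfNotPresent"
    # Only look for a tag in the last path segment, so a registry port
    # such as localhost:5000/backstage is not mistaken for a tag.
    last_segment = image.rsplit("/", 1)[-1]
    tag = last_segment.split(":", 1)[1] if ":" in last_segment else None
    return "Always" if tag in (None, "latest") else "IfNotPresent"


for ref in ("docker.io/library/backstage:latest", "backstage", "backstage:1.0.0"):
    print(f"{ref} -> {default_pull_policy(ref)}")
```

Our manifest uses the latest tag, which is why we override the default explicitly.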

We apply this change to the cluster with the following command.

kubectl apply -f manifest.yaml

We can double-check that the change was applied successfully by inspecting our backstage Kubernetes pod.

kubectl get pods
NAME                         READY   STATUS    RESTARTS   AGE
backstage-64d46b7886-r7l7r   1/1     Running   0          8m14s

We know this is running successfully because the STATUS is Running.

Step 4 - Access Backstage in the browser

Our local KIND Kubernetes cluster doesn’t provide a way to access Backstage from our local machine, which is outside the cluster.

To work around this, we will forward a port inside the cluster to one on our local machine using the built-in port forwarding feature of kubectl. Use the pod name reported by kubectl get pods; yours will differ from the one shown here.

kubectl port-forward backstage-64d46b7886-r7l7r 7000:7000

As before, open http://localhost:7000 in your browser to view Backstage. It looks like nothing has changed, but this page is being rendered inside our Kubernetes cluster and exposed to the browser.
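Pod names carry a generated suffix, so the name passed to port-forward has to be looked up each time the pod is recreated. The sketch below shows one way to pick it out of the kubectl get pods table. The helper find_running_pod is hypothetical, and the sample text is the output shown earlier in this tutorial.

```python
from typing import Optional


def find_running_pod(kubectl_output: str, name_prefix: str) -> Optional[str]:
    """Return the first pod whose name starts with name_prefix and is Running."""
    lines = kubectl_output.strip().splitlines()
    for line in lines[1:]:  # skip the NAME/READY/STATUS header row
        columns = line.split()
        if len(columns) < 3:
            continue
        name, status = columns[0], columns[2]
        if name.startswith(name_prefix) and status == "Running":
            return name
    return None


SAMPLE_OUTPUT = """\
NAME                         READY   STATUS    RESTARTS   AGE
backstage-64d46b7886-r7l7r   1/1     Running   0          8m14s
"""

pod = find_running_pod(SAMPLE_OUTPUT, "backstage-")
print(f"kubectl port-forward {pod} 7000:7000")
```

In a real script you could feed the function the stdout of subprocess.run(["kubectl", "get", "pods"], capture_output=True, text=True). kubectl also accepts a resource reference such as deployment/backstage in place of a pod name, which avoids the lookup entirely.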

Conclusion

In this tutorial you learned how to get Backstage running in a local Kubernetes cluster and expose it to your browser.

Using GitHub Auth with Backstage

Wed, 05 Aug 2020 21:00:00 GMT

Update Sept 2021: Backstage now supports GitHub authentication via GitHub apps. If you are using a GitHub app, you do not need to follow the steps described below. They are only valid if you are using a GitHub Personal Access Token with Backstage.

GitHub is one of the most popular Backstage authentication mechanisms going. There’s a good reason for this: Backstage ultimately needs to pull service catalog information from YAML files, those YAML files usually live in git, and the git repos usually live on GitHub.

Setting up GitHub authentication can be a little tricky, but this post will tell you everything you need to know.

There are basically two steps:

Create an OAuth application on GitHub,
Pass the identity information from this application to Backstage.

Let’s get into it.

Create an OAuth application on GitHub

To create an OAuth app for local development, visit your OAuth Apps settings page on GitHub. Click the “New OAuth App” button and you’ll see a form you have to fill out.

Enter the following values:

Application name: Backstage local development
Homepage URL: http://localhost:3000
Application description: Login to Backstage on localhost
Authorization callback URL: http://localhost:7000/api/auth/github/handler/frame

Your form should now look something like this:

The tricky thing is that the Homepage URL should point to the Backstage frontend, because that’s what your users will consider to be “Backstage”, but the Authorization callback URL must point to the Backstage backend.

When GitHub authenticates a user, it will call out to the application Backend, with some authentication parameters included in the URL query string. Backstage will check these parameters and then server-side render a confirmation page for the user.
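To give a feel for what arrives at the callback URL, here is a rough sketch of reading those query string parameters. In GitHub’s OAuth web flow the redirect carries a code parameter and echoes back the state value the application originally sent. This is not Backstage’s actual implementation; the function name and the example values are made up for illustration.

```python
from urllib.parse import parse_qs, urlparse


def extract_oauth_code(callback_url: str, expected_state: str) -> str:
    """Validate the state parameter and return the temporary OAuth code."""
    query = parse_qs(urlparse(callback_url).query)
    state = query.get("state", [""])[0]
    if state != expected_state:
        # A mismatched state suggests the request did not originate from us.
        raise ValueError("state mismatch: possible forged callback")
    codes = query.get("code")
    if not codes:
        raise ValueError("callback is missing the 'code' parameter")
    return codes[0]


# Hypothetical callback request, using the callback URL configured above.
example = "http://localhost:7000/api/auth/github/handler/frame?code=abc123&state=xyz"
print(extract_oauth_code(example, expected_state="xyz"))
```

The backend would then exchange the code, together with the Client ID and Client Secret, for an access token.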

Once you submit that form, GitHub provides you with a Client ID and Client Secret for your OAuth application.

Note these down; you’ll need them in the next step.

Tell Backstage about your OAuth application

Go back to the command line where you run the Backstage backend and pass the Client ID and Client Secret into Backstage when you start it up.

# starting in the root of your Backstage repo
» cd packages/backend
» env AUTH_GITHUB_CLIENT_ID=eafc816045b5533ba581 AUTH_GITHUB_CLIENT_SECRET=34922f6547991760e8f5219a529a9c00b0fd44ea yarn start

That’s all there is to it. When Backstage starts up and opens on http://localhost:3000, you’ll be able to log in via GitHub.

Running the Backstage service catalog with Docker Compose

Tue, 09 Jun 2020 21:00:00 GMT

In this tutorial, we’re going to build and run a basic Backstage application with Docker Compose. The application will be able to store data in a PostgreSQL database, and connect to GitHub to pull in repositories. We will also make a config change in the Backstage application and re-run it.

Just want to get started quickly? Check out our community Backstage Docker image.

Prerequisites

To complete this tutorial, you will need:

Docker and Docker Compose installed and running on your local machine.
Node.js installed on your local machine.
The Yarn package manager installed. You can use npm if you like, although you will have to modify the shell commands somewhat.

Step 1 - Scaffold a Backstage application

To run Backstage on Docker Compose, we need to create a Backstage instance to work with. The main Backstage codebase does ship with a sample application we can run, but best practices dictate that we should create our own so we can configure it with our company name and other attributes.

Backstage comes with a CLI for creating Backstage instances. Let’s use it to scaffold a new instance and configure it for PostgreSQL. We’ll call this instance scaffolded-app, but you can choose a name that makes more sense for you.

This tutorial uses version 0.3.2 of the Backstage CLI to create this application. You may see different results if you’re using a different version.

» npx @backstage/create-app --version
0.3.2

» npx @backstage/create-app
npx: installed 68 in 14.197s
? Enter a name for the app [required] scaffolded-app
? Select database for the backend [required] PostgreSQL

Creating the app...

 Checking if the directory is available:
  checking      scaffolded-app ✔

 Creating a temporary app directory:
  creating      temporary directory ✔

 Preparing files:
  copying       README.md ✔
  copying       .npmignore ✔
  copying       lerna.json ✔
  templating    app-config.yaml.hbs ✔
  templating    package.json.hbs ✔
  copying       tsconfig.json ✔
  copying       .eslintrc.js ✔
  copying       cypress.json ✔
  templating    package.json.hbs ✔
  copying       .eslintrc.js ✔
  copying       android-chrome-192x192.png ✔
  copying       favicon-16x16.png ✔
  copying       apple-touch-icon.png ✔
  copying       favicon-32x32.png ✔
  copying       favicon.ico ✔
  copying       manifest.json ✔
  copying       index.html ✔
  copying       safari-pinned-tab.svg ✔
  copying       robots.txt ✔
  copying       App.tsx ✔
  copying       App.test.tsx ✔
  copying       index.tsx ✔
  copying       apis.ts ✔
  copying       plugins.ts ✔
  copying       sidebar.tsx ✔
  copying       setupTests.ts ✔
  copying       .eslintrc.json ✔
  copying       app.js ✔
  copying       .eslintrc.js ✔
  copying       Dockerfile ✔
  copying       README.md ✔
  templating    package.json.hbs ✔
  copying       index.ts ✔
  copying       types.ts ✔
  copying       index.test.ts ✔
  copying       auth.ts ✔
  copying       catalog.ts ✔
  copying       identity.ts ✔
  copying       proxy.ts ✔
  copying       scaffolder.ts ✔
  copying       techdocs.ts ✔

 Moving to final location:
  moving        scaffolded-app ✔

 Building the app:
  executing     yarn install ✔
  executing     yarn tsc ✔
  executing     yarn build ✔

🥇  Successfully created scaffolded-app

If we cd into the newly created scaffolded-app directory, we can see the directory structure which was generated for us.

» ls -al
total 1776
drwxr-xr-x    19 myuser  staff     608  9 Jan 20:20 .
drwxr-xr-x     3 myuser  staff      96  9 Jan 19:17 ..
-rw-r--r--     1 myuser  staff      36  9 Jan 19:17 .eslintrc.js
-rw-r--r--     1 myuser  staff     420  9 Jan 19:17 .gitignore
-rw-r--r--     1 myuser  staff      93  9 Jan 19:17 README.md
-rw-r--r--     1 myuser  staff     184  9 Jan 19:17 app-config.production.yaml
-rw-r--r--     1 myuser  staff    3250  9 Jan 19:17 app-config.yaml
-rw-r--r--     1 myuser  staff     399  9 Jan 19:17 catalog-info.yaml
drwxr-xr-x     4 myuser  staff     128  9 Jan 19:19 dist-types
-rw-r--r--     1 myuser  staff     116  9 Jan 19:17 lerna.json
drwxr-xr-x  1698 myuser  staff   54336  9 Jan 19:19 node_modules
-rw-r--r--     1 myuser  staff    1339  9 Jan 19:17 package.json
drwxr-xr-x     4 myuser  staff     128  9 Jan 19:17 packages
-rw-r--r--     1 myuser  staff     272  9 Jan 19:17 tsconfig.json
-rw-r--r--     1 myuser  staff  829904  9 Jan 19:19 yarn.lock

The main bulk of the application is in the packages directory. This contains two subdirectories.

» ls -al packages
total 0
drwxr-xr-x   4 myuser  staff  128  9 Jan 19:17 .
drwxr-xr-x  19 myuser  staff  608  9 Jan 22:23 ..
drwxr-xr-x  10 myuser  staff  320  9 Jan 19:40 app
drwxr-xr-x   9 myuser  staff  288  9 Jan 19:50 backend

The app subdirectory contains the frontend UI of Backstage. The backend subdirectory, as you might expect, contains the API layer and the parts that connect to the database.

Step 2 - Building a Docker image

Backstage comes with a built-in command to help you build a Docker image which you can run with Docker Compose.

For simple deployments, the Backstage backend has the ability to serve the frontend app to the browser, so you only have to build one Docker image.

» yarn workspace backend build-image
yarn workspace v1.22.10
yarn run v1.22.10
$ backstage-cli backend:build-image --build --tag backstage
# Lots of output omitted...
=> => naming to docker.io/library/backstage    0.0s
✨  Done in 114.02s.

Check the image has been built successfully.

» docker images
REPOSITORY         TAG       IMAGE ID       CREATED         SIZE
backstage          latest    7b452013e713   3 minutes ago   1.1GB

Now that we have a Docker image, let’s try to run it.

» docker run backstage
2021-01-09T19:51:13.883Z backstage info Loaded config from app-config.yaml, app-config.production.yaml
2021-01-09T19:51:13.887Z backstage info Created UrlReader predicateMux{readers=azure{host=dev.azure.com,authed=false},bitbucket{host=bitbucket.org,authed=false},github{host=github.com,authed=false},gitlab{host=gitlab.com,authed=false},fallback=fetch{}}
Backend failed to start up, Error: connect ECONNREFUSED 127.0.0.1:5432

This fails because the Backstage backend cannot connect to port 5432. Backstage needs to connect to the database in order to store catalog items and other data. It expects to find PostgreSQL running on port 5432. When it can’t, it fails and bails out.
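ECONNREFUSED simply means nothing accepted the TCP connection. Inside the container, 127.0.0.1 refers to the container itself, so a database running elsewhere would not be found there either. The sketch below is a small, generic reachability check, not part of Backstage, that you could use to confirm whether anything is listening on a given host and port.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False


print("PostgreSQL reachable:", can_connect("127.0.0.1", 5432))
```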

To fix this, let’s use Docker Compose to make PostgreSQL available to our Backstage backend.

Step 3 - Adding PostgreSQL

Below is a simple docker-compose.yaml file which runs the Backstage image we just created and a default PostgreSQL database. Create this file inside your Backstage application and save it.

version: '3'
services:
  backstage:
    image: backstage
    environment:
      # This value must match the name of the PostgreSQL service defined below (db).
      POSTGRES_HOST: db
      POSTGRES_USER: postgres
    ports:
      - '7000:7000'

  db:
    image: postgres
    restart: always
    environment:
      # NOT RECOMMENDED for a production environment. Trusts all incoming
      # connections.
      POSTGRES_HOST_AUTH_METHOD: trust

Once you’ve done that, you can use Docker Compose to start both of these Docker images.

» docker-compose up
Creating network "blog-post-test_default" with the default driver
Creating blog-post-test_db_1        ... done
Creating blog-post-test_backstage_1 ... done
Attaching to blog-post-test_backstage_1, blog-post-test_db_1
# Lots of output omitted...
backstage_1  | Backend failed to start up, Error: Failed to initialize github scaffolding provider, Missing required config value at 'scaffolder.github.token'
blog-post-test_backstage_1 exited with code 1

It still fails, but we’ve made progress. Backstage has successfully connected to the database and then failed because of a missing GitHub token.

Step 4 - Configuring GitHub

Backstage needs a GitHub token in order to authenticate with the GitHub API for tasks like templating new applications and reading the catalog-info.yaml files it uses to store metadata.

Head over to the GitHub docs to learn how to create a Personal Access Token. If you don’t want to use GitHub, you can use a nonsense value like abc in place of the GitHub token value.

Once you have your token, pass it into Backstage via the environment variables.

version: '3'
services:
  backstage:
    image: backstage
    environment:
      POSTGRES_HOST: db
      POSTGRES_USER: postgres
      # Add your token here
      GITHUB_TOKEN: <your-github-token>
    ports:
      - '7000:7000'

  db:
    image: postgres
    restart: always
    environment:
      POSTGRES_HOST_AUTH_METHOD: trust

Once that’s done, let’s give it one more go.

» docker-compose up
Creating network "blog-post-test_default" with the default driver
Creating blog-post-test_db_1        ... done
Creating blog-post-test_backstage_1 ... done
Attaching to blog-post-test_backstage_1, blog-post-test_db_1
# Lots of output omitted...
backstage_1  | 2021-01-09T22:42:27.061Z backstage info Initializing http server
backstage_1  | 2021-01-09T22:42:27.065Z backstage info Listening on :7000

Hurray! 🎉 Now, if you visit localhost:7000, you should see Backstage.

Step 5 - Making a change

Our Backstage instance isn’t quite as perfect as it could be. You’ll notice the header says “My Company Service Catalog”. Let’s change that to include the name of our company, Roadie.

This is a simple change to make. Fire up your text editor and open the app-config.yaml file.

In there, you’ll see the following two lines:

organization:
  name: My Company

Simply change “My Company” to something like “Roadie”, rebuild the Docker image, run docker-compose up and refresh your browser window to see the change.
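If you would rather script the edit than open an editor, the same change can be sketched as a small text substitution. This is only an illustration: set_organization_name is a hypothetical helper, and it assumes the name key sits on the line directly after organization:, as in the two lines shown above.

```python
import re


def set_organization_name(config_text: str, new_name: str) -> str:
    """Replace the value of organization.name in app-config.yaml text."""
    pattern = re.compile(r"^(organization:\n[ \t]+name:[ \t]*).*$", re.MULTILINE)
    # A function replacement avoids surprises if new_name contains backslashes.
    return pattern.sub(lambda m: m.group(1) + new_name, config_text, count=1)


original = "organization:\n  name: My Company\n"
print(set_organization_name(original, "Roadie"))
```

To apply it to the real file, you could read app-config.yaml with pathlib.Path.read_text, pass the contents through the function, and write the result back with write_text.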

Conclusion

In this tutorial you learned how to get Backstage running locally and change its configuration. As a next step, you may wish to try adding the Lighthouse plugin to the deployment.

How to use the Backstage Lighthouse plugin

Sat, 23 May 2020 21:00:00 GMT

Lighthouse is an open-source, automated tool for improving the quality of web pages. You give it the URL of a web page, and it loads the page and runs tests to check the page’s quality.

You can use it via the PageSpeed Insights website. Simply enter a URL in the box, hit Analyze, and a few seconds later you will have a quality score for the website behind the URL, information about how long the page took to load and some suggestions about what to do better.

You can also use Lighthouse via the Chrome DevTools, from the command line, or as a Node.js module.

» npm install -g lighthouse
» lighthouse https://www.davidtuite.com/

Lighthouse with Backstage

In your company, there may be many teams making websites for different purposes. It is useful to track the quality of these websites over time to ensure that code changes are not hurting performance or accessibility.

If customers complain that your website is slow, it can be helpful to look back over Lighthouse results to figure out if and when the performance drop occurred. You can then look at commits around this time to pinpoint the cause of the slowness.

Backstage has a Lighthouse plugin available which makes it easy to run tests against the websites your company produces.

You can track the results of Lighthouse tests over time to see if your site is performing better or worse as you make changes.
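As a rough illustration of what “looking back over results” can mean in practice, the sketch below scans a list of audit scores for the first sharp drop. The data, the first_regression helper and the 10-point threshold are all invented for this example; they are not part of the Lighthouse plugin.

```python
from typing import List, Optional, Tuple

Audit = Tuple[str, float]  # (ISO date of the audit, performance score 0-100)


def first_regression(history: List[Audit], max_drop: float = 10.0) -> Optional[Audit]:
    """Return the first audit whose score fell by more than max_drop points."""
    for previous, current in zip(history, history[1:]):
        if previous[1] - current[1] > max_drop:
            return current
    return None


# Hypothetical audit history for one website, oldest first.
history = [
    ("2020-05-01", 92.0),
    ("2020-05-08", 90.0),
    ("2020-05-15", 71.0),
    ("2020-05-22", 73.0),
]
print(first_regression(history))
```

The date of the returned audit tells you which commits to inspect first.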

Running Lighthouse with Backstage

To use Lighthouse with Backstage, you need three things:

A Backstage instance you can run locally with the Lighthouse Plugin installed.
A running Lighthouse microservice which actually executes the Lighthouse tests before sending the results back to the plugin.
A PostgreSQL database for the Lighthouse microservice to talk to.

Let’s set them up in reverse order.

PostgreSQL

Assuming you already have PostgreSQL installed and running, you can create a database for Lighthouse with the following command. The database this creates will have no password, which is fine for local development.

» createdb -O [username] -U [username] -w lighthouse_audit_service

You can verify this database by logging into it with psql.

» psql -h localhost -p 5432 -U [username] -d lighthouse_audit_service
psql (11.5)
Type "help" for help.

lighthouse_audit_service=#

Fantastic.

Lighthouse microservice

We’re going to run the Lighthouse microservice with Docker. We’ll have to pass a few environment variables to our docker run command. The easiest way to do this is by putting them in a file.

Create a file called development.env with the following variables.

LAS_PORT=3003
LAS_CORS=true
PGUSER=[username]
PGDATABASE=lighthouse_audit_service
PGHOST=host.docker.internal

LAS_PORT tells the Lighthouse microservice which port to expose to incoming HTTP requests. It’s important that this port matches the one defined in the Backstage Lighthouse plugin. Otherwise the plugin will never receive a response to its Lighthouse testing requests.
PGHOST is important because our Lighthouse microservice is running inside Docker but our PG database is exposing a port on localhost. We have to use a special Docker DNS name to allow this.
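Since a port mismatch fails silently, it can be worth checking the two values against each other. The following sketch parses the env file contents and compares LAS_PORT with the port in the URL the plugin is configured to call. Both helper names are hypothetical, and the parser only handles simple KEY=value lines.

```python
from typing import Dict
from urllib.parse import urlparse


def parse_env_file(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    env: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        env[key.strip()] = value.strip()
    return env


def ports_match(env: Dict[str, str], plugin_api_url: str) -> bool:
    """Check that LAS_PORT equals the port in the plugin's API URL."""
    return urlparse(plugin_api_url).port == int(env["LAS_PORT"])


ENV_TEXT = """\
LAS_PORT=3003
LAS_CORS=true
PGUSER=[username]
PGDATABASE=lighthouse_audit_service
PGHOST=host.docker.internal
"""

env = parse_env_file(ENV_TEXT)
print("Ports match:", ports_match(env, "http://localhost:3003"))
```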

With that file defined, we should now be able to run the Lighthouse microservice like this:

» docker run -p 3003:3003 -p 5432:5432 --env-file development.env spotify/lighthouse-audit-service:latest
yarn run v1.22.0
$ node ./cjs/run.js
info: building express app... {"service":"lighthouse-audit-service","timestamp":"2020-05-23T19:03:00.202Z"}
info: running db migrations... {"service":"lighthouse-audit-service","timestamp":"2020-05-23T19:03:00.290Z"}
info: listening on port 3003 {"service":"lighthouse-audit-service","timestamp":"2020-05-23T19:03:00.320Z"}

It might take a few seconds to start up when you run it for the first time because it will have to download the Docker image from the internet. When it starts, it automatically runs some database migrations to prepare your database.

Backstage with the Lighthouse Plugin

Luckily for us, Backstage comes with the Lighthouse Plugin installed and enabled so it’s easy to try it out.

Follow the Getting Started Guide to get Backstage installed.

If you open packages/app/src/plugins.ts in your favorite code editor you should see that the Lighthouse plugin is already installed.

export { plugin as LighthousePlugin } from '@backstage/plugin-lighthouse';

In packages/app/src/apis.ts you should see that the Lighthouse plugin is configured to send requests to port 3003.

builder.add(lighthouseApiRef, new LighthouseRestApi('http://localhost:3003'));

Now, run Backstage with yarn start and visit http://localhost:3000/lighthouse. You should see the Backstage Lighthouse interface.

Awesome! If you run an audit a few times on the same website you can see the trend over time. Perfect for ensuring that your websites are staying responsive and accessible.