Automating your software catalog is a critical step toward making it reliable. The most common failure mode is that catalog information goes out of date, and then everyone stops using the platform.
There are several effective ways to keep your catalog up to date. In this post, we’ll cover a few options that use automation at various levels.
Update the API Docs During CI/CD
Internal APIs can change frequently, and updating the API docs every time a new endpoint is added, or even a new field is added to a response, can be cumbersome. External APIs usually have an automated way to update their API documentation and security policies. Some internal APIs have one too, but it’s less common.
How to automate?
APIs can be described using OpenAPI specification (Swagger) files. The Swagger file holds the specification of the API so users know how to use it: for example, which fields are mandatory in the request and which data and fields are included in the response. Most common API frameworks can generate a Swagger file automatically. For example, flask-restplus generates a Swagger/OpenAPI file from your Flask application.
On each deploy, update the catalog’s API documentation from the Swagger file. This ensures that the API documentation is always up to date.
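As a minimal sketch of that deploy step, the function below turns an already-generated OpenAPI/Swagger document into a small payload a catalog could store. The payload shape (`service`, `api_version`, `endpoints`) and the function name are assumptions made for illustration. How the payload is sent depends on your catalog, so the upload itself is left out.

```python
import json

# HTTP verbs that OpenAPI allows as operation keys under a path.
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def build_catalog_api_entry(service_name, spec):
    """Summarize an OpenAPI/Swagger document (as a dict) for the catalog.

    The returned structure is hypothetical; adapt it to your catalog's schema.
    """
    endpoints = []
    for path, operations in sorted(spec.get("paths", {}).items()):
        for method, operation in sorted(operations.items()):
            if method.lower() not in HTTP_METHODS:
                continue  # skip path-level keys such as "parameters"
            endpoints.append({
                "method": method.upper(),
                "path": path,
                "summary": operation.get("summary", ""),
            })
    return {
        "service": service_name,
        "api_version": spec.get("info", {}).get("version", "unknown"),
        "endpoints": endpoints,
    }


# Example: in a real pipeline the spec would be loaded from the generated file.
example_spec = {
    "info": {"title": "Orders API", "version": "1.4.0"},
    "paths": {
        "/orders": {
            "get": {"summary": "List orders"},
            "post": {"summary": "Create an order"},
        },
    },
}
print(json.dumps(build_catalog_api_entry("orders", example_spec), indent=2))
```

Running this as the last step of the deploy job means the catalog entry is regenerated from the same file the service itself publishes, so the two cannot drift apart.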
Query Owners / Team Members
One question that always comes back is “Who maintains this service/repo?”. It can be a very easy question to answer or a very hard one. It gets trickier as the company grows and areas of responsibility are no longer as clear as they used to be.
How to automate?
Many git repositories have a MAINTAINERS or CODEOWNERS file. These files tell us who owns the code and, based on that, who has permission to review and accept pull requests. Because every software component has a git repo, we can update the software owners in our catalog from this file and stay up to date.
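Here is a rough sketch of reading owners out of a CODEOWNERS file, assuming the common layout of one rule per line: a path pattern followed by one or more owners. The function names are made up for illustration. The parser is deliberately simplified: it treats everything after a `#` as a comment and does not evaluate which pattern wins for a given path.

```python
def parse_codeowners(text):
    """Return a list of (pattern, owners) rules from CODEOWNERS content.

    Simplification: anything after '#' is treated as a comment, so escaped
    '#' characters inside patterns are not handled.
    """
    rules = []
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue  # blank line or comment-only line
        pattern, *owners = line.split()
        rules.append((pattern, owners))
    return rules


def catalog_owners(text):
    """Collect every distinct owner in the file, in first-seen order."""
    seen = {}
    for _pattern, owners in parse_codeowners(text):
        for owner in owners:
            seen.setdefault(owner, None)
    return list(seen)


example = """\
# Default owners for everything in the repo
*        @acme/platform-team
/docs/   @alice @acme/platform-team
"""
print(catalog_owners(example))
```

A scheduled job or a merge hook could run this against each repository and write the result into the owner field of the matching catalog entry.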
Fetch Information Using Integrations
Earlier in this post, we covered how to query information from your git repo and push it into the catalog. In this section, we rely on other tools to provide live, up-to-date information. By querying live data, we ensure our catalog is up to date. For example, say we pick a service and want to know how many security vulnerabilities it has. We could trigger a scan job once a day or on every pull request merge, but that’s not enough, because the data can change between scans.
How to automate?
Add integrations and plugins to your software catalog so it can query relevant data from multiple sources, just as it queries some of your configuration from git. If you are using Backstage or a similar tool, plugins and integrations are already available. If you build your own catalog, make sure to add some.
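For a home-built catalog, one possible shape for such integrations is a set of named fetch functions that are called when an entry is viewed. The sketch below assumes that design; `enrich_entry` and the two stand-in integrations are hypothetical and return canned data instead of calling a real scanner or tracker.

```python
def enrich_entry(entry, integrations):
    """Return a copy of a catalog entry with live data from each integration.

    `integrations` maps an integration name to a function that takes the
    service name and returns a dict of live fields.
    """
    live = {}
    for name, fetch in integrations.items():
        try:
            live[name] = fetch(entry["name"])
        except Exception as exc:
            # One failing source should not hide the rest of the entry.
            live[name] = {"error": str(exc)}
    return {**entry, "live": live}


# Stand-in integrations; real ones would query the external tool's API.
def fake_vulnerability_scanner(service_name):
    return {"open_vulnerabilities": 3}


def fake_broken_source(service_name):
    raise RuntimeError("source unavailable")


entry = {"name": "orders", "owner": "@acme/platform-team"}
print(enrich_entry(entry, {
    "security": fake_vulnerability_scanner,
    "incidents": fake_broken_source,
}))
```

Because the data is fetched at view time rather than copied in by a nightly job, the entry reflects whatever the source tool reports at that moment.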
Add Operations Workflows
Operations workflows allow users to perform operational activities from the catalog, for example restarting a service or rolling back to the previous version. Users value being able to go to a single catalog and run a task that would otherwise take them a long time. Here we add automation that completes a task in order to gain adoption. If users do not update their catalog information:
The workflow will stop working
They will get more involved in taking care of the catalog
Let’s take an example. A user who is on call is woken up in the middle of the night, opens your portal, and tries to restart a service, but the labels on the service are out of date, so the automation doesn’t know which service to restart. The user also finds out that the service details are wrong. The user will quickly update the information in order to retrigger the automation.
How to automate?
The point here is that if users have an operations workflow, they will be motivated to use and update your catalog. So make sure you add your most impactful operations workflows.
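To make the on-call scenario concrete, here is a small sketch of a restart workflow that refuses to run when the catalog labels it depends on are missing, and tells the user which ones to fix. The label names, the exception, and the injected `restart` callable are all assumptions made for illustration. The real restart mechanism depends on your infrastructure.

```python
# Hypothetical labels the automation needs in order to find the service.
REQUIRED_LABELS = ("cluster", "namespace", "deployment")


class CatalogDataError(Exception):
    """Raised when a catalog entry is too incomplete to run a workflow."""


def restart_service(entry, restart):
    """Restart the service described by a catalog entry.

    `restart` is a callable supplied by the caller that performs the actual
    restart; it receives the cluster, namespace, and deployment labels.
    """
    labels = entry.get("labels", {})
    missing = [key for key in REQUIRED_LABELS if not labels.get(key)]
    if missing:
        raise CatalogDataError(
            f"Cannot restart {entry['name']}: update these catalog labels "
            f"first: {', '.join(missing)}"
        )
    restart(labels["cluster"], labels["namespace"], labels["deployment"])
    return f"restart requested for {entry['name']}"


# Example with a stand-in restart function that only records the call.
calls = []
entry = {
    "name": "orders",
    "labels": {"cluster": "prod-1", "namespace": "shop", "deployment": "orders"},
}
print(restart_service(entry, lambda *target: calls.append(target)))
```

The error message is the important part: it points the user straight at the stale fields, so fixing the catalog becomes the fastest way to get the task done.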
Map Relations Between Software Components
The relations between software components are rarely up to date. No one remembers to update dependencies when they add a feature that uses an API. On the other hand, application relationships are extremely valuable for understanding what’s going on in your application or system.
How to automate?
There are plenty of systems that map dependencies based on distributed tracing. They capture calls from one service to another and record them as dependencies. In complex systems, you will usually find a tracing tool such as Datadog, Tempo, New Relic, or Dynatrace. What you need to do is periodically (for example, once a day) query the tracing data and update your dependencies accordingly.
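As an illustration of the mapping step, the sketch below derives service-to-service dependencies from a batch of spans. It assumes each span has already been exported as a dict with `trace_id`, `span_id`, `parent_id`, and `service` keys. That field layout is an assumption, and every tracing tool has its own query API and span format.

```python
def service_dependencies(spans):
    """Return sorted (caller, callee) pairs inferred from parent/child spans.

    A dependency is recorded whenever a span's parent belongs to a
    different service than the span itself.
    """
    # Span IDs are only guaranteed unique within a trace, so key on both.
    service_by_span = {
        (span["trace_id"], span["span_id"]): span["service"] for span in spans
    }
    edges = set()
    for span in spans:
        parent_key = (span["trace_id"], span.get("parent_id"))
        parent_service = service_by_span.get(parent_key)
        if parent_service and parent_service != span["service"]:
            edges.add((parent_service, span["service"]))
    return sorted(edges)


example_spans = [
    {"trace_id": "t1", "span_id": "a", "parent_id": None, "service": "frontend"},
    {"trace_id": "t1", "span_id": "b", "parent_id": "a", "service": "orders"},
    {"trace_id": "t1", "span_id": "c", "parent_id": "b", "service": "orders"},
    {"trace_id": "t1", "span_id": "d", "parent_id": "c", "service": "payments"},
]
print(service_dependencies(example_spans))
```

A daily job could replace each service's dependency list in the catalog with the pairs computed from the most recent window of traces.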
Summary
Without automation, your catalog will become outdated and lose its value for your users. There are many ways to automate your catalog maintenance - pick the ones that are most important for you and use them :-)