Puppet: Introducing Puppet Data Service

by Dimitri Tischenko|25 May 2022

What is this blog post about?

I work on the Solutions Architects team here at Puppet. We are sometimes the first line of defense in solving some of our customers' most challenging or unique problems, and every so often, we see trends in problems that we'd previously considered to be edge cases. For example, customers often ask us two questions:

How can we connect external configuration data to Puppet?
How can we let service owners change their own configuration quickly, without onboarding them to Git approval processes?

These questions (and others) have inspired the Solutions Architects team to develop Puppet Data Service. Puppet Data Service (PDS) is an add-on for Puppet Enterprise that provides a REST API and database storage for configuration data. This article explains why PDS exists and how it can help your daily Puppet automation practices.

Management summary

PDS provides:

A database for trusted node data
A database-backed Hiera backend
A REST API for managing the above

By providing these features, PDS enables new workflows for managing configuration without using Git processes, such as self-service workflows for infrastructure owners, DevOps teams or end users of infrastructure.

Please note: This software is not supported by Puppet and does not qualify for Puppet Support plans. It's provided without guarantee or warranty. Its status is experimental.

Puppet configuration 101

(Note: Skip this section if you are an experienced Puppet Enterprise (PE) admin and know the secrets of facts, trusted facts, and Hiera.)

As a seasoned practitioner, you know that one of the secrets of a successful configuration management strategy is keeping a Single Source of Truth. (This principle is sometimes also called “DRY” or “Don't Repeat Yourself.”) The idea is that each piece of your configuration should be stored in one and only one predictable location. This is important because when you need to change that configuration, you want to modify it quickly in a single place and have all dependent configurations update automatically. It is also important to realize that Puppet code is usually not the right place to store (hardcoded) configurations; you want to be flexible and make your code reusable across different teams, segments, and platforms within your infrastructure.

Puppet Enterprise has many options for storing, maintaining, and retrieving configuration data to be used in code. In this article, we will focus on the two most frequently used configuration points: trusted facts and Hiera.

Facts and trusted facts

Puppet comes with many built-in facts which are available on any Puppet node. Puppet teams can also write their own custom facts.

A disadvantage of using facts is that they are supplied by the node at the start of a Puppet run, meaning that a user with admin access to a node can potentially tamper with their value on that node, possibly impersonating another node type and thus changing its configuration.

To prevent this from happening, Puppet offers trusted facts. Those facts are built into the node certificate before signing and cannot be changed later without invalidating the certificate's signature. The price of this tamper resistance is that updating them is a heavy process: it requires creating and signing a new certificate for the node.
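
The reason signed data resists tampering can be illustrated with a toy example. The sketch below is only an analogy: it uses an HMAC from Python's standard library as a stand-in for the certificate signature, whereas real Puppet certificates are X.509 certificates signed with the CA's private key. The key and the fact values are made up for illustration.

```python
import hashlib
import hmac
import json

# Hypothetical signing secret; a real CA signs with a private key instead.
CA_KEY = b"demo-ca-key"


def sign(trusted_facts: dict) -> str:
    """Return a signature over a canonical encoding of the facts."""
    payload = json.dumps(trusted_facts, sort_keys=True).encode()
    return hmac.new(CA_KEY, payload, hashlib.sha256).hexdigest()


def verify(trusted_facts: dict, signature: str) -> bool:
    """Check that the facts still match the signature issued at signing time."""
    return hmac.compare_digest(sign(trusted_facts), signature)


# Example values only; they are fixed at signing time.
facts = {"pp_role": "webserver", "pp_datacenter": "ams1"}
signature = sign(facts)

# A local admin edits the role after signing.
tampered = dict(facts, pp_role="database")

print(verify(facts, signature))     # the original data still verifies
print(verify(tampered, signature))  # the edited data no longer verifies
```

Changing even one value means a new signature is needed, which is why updating a trusted fact involves issuing a new certificate.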

Hiera data

So our node sends all its facts to the Puppet server and asks it to provide configuration. The Puppet server compiles a catalog using modules and classes available to it which are assigned to the node using the Node Classifier (usually, the PE Console). Classes reside within Puppet modules written in the Puppet language.

Puppet modules (such as puppetlabs-mysql for managing MySQL server, or puppetlabs-ntp for managing the NTP service) are designed to be reusable. This means that they are generic and do not contain configuration information specific to anyone's environment.

Hiera is a hierarchical database for configuration data looked up by Puppet during catalog compilation. This way, you achieve a single source of truth again; all "logic" for your configuration is in Puppet code, and all "data" for your configuration is inside your Hiera database. This allows the general-purpose Forge modules to be customized to your specific needs without changing their code.

By default, the Hiera database consists of text files in YAML format, stored on the Puppet server. You determine how the files are organized and in which order they need to be looked up.
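
The idea of an ordered lookup can be sketched in a few lines of Python. This is a simplified model, not Hiera's implementation: the dictionaries stand in for YAML files, and the file names, keys, and node attributes are all hypothetical. It shows only the basic "first match wins" behavior, walking from the most specific level to the most general.

```python
from typing import Any, Optional

# Hypothetical stand-ins for YAML files on the Puppet server.
DATA_FILES = {
    "nodes/web01.example.com": {"ntp::servers": ["10.0.0.1"]},
    "roles/webserver": {"apache::port": 8080},
    "common": {"ntp::servers": ["pool.ntp.org"], "apache::port": 80},
}

# Lookup order, most specific first; placeholders are filled in per node.
HIERARCHY = ["nodes/{certname}", "roles/{role}", "common"]


def lookup(key: str, node: dict) -> Optional[Any]:
    """Return the first value found while walking the hierarchy."""
    for level in HIERARCHY:
        path = level.format(**node)
        data = DATA_FILES.get(path, {})
        if key in data:
            return data[key]
    return None


web01 = {"certname": "web01.example.com", "role": "webserver"}
web02 = {"certname": "web02.example.com", "role": "webserver"}

print(lookup("ntp::servers", web01))  # node-level value wins
print(lookup("ntp::servers", web02))  # no node file, falls through to common
print(lookup("apache::port", web02))  # role-level value wins over common
```

Because the generic value lives only in the common level and each override lives only at its own level, every piece of data still has exactly one home.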

Usually, Hiera data is part of a Git repository we call the control repo. To change Hiera data, you need to submit a Git pull (or merge) request, get it approved, have the change tested, and finally have it deployed. This is well-suited to the Puppet platform team, but less so for end users, such as DevOps teams or service owners who just want to quickly change a configuration item inside their application.

PDS to the rescue

Puppet Data Service, or PDS for short, is a Puppet Enterprise extension that provides you with an additional way to store and access configuration data for Puppet.

PDS solves the problems described above by offering the following features for Puppet teams and users:

It provides an alternative source (database) for trusted node data, appearing to Puppet as trusted facts that the local node admin cannot tamper with. Site administrators can update this trusted data without requiring certificate changes on every update!
It provides an additional Hiera backend, so that arbitrary data can be stored in the PDS database and benefit from the simplified management workflows.
It allows management of this data with either an easy-to-use CLI tool or a simple and robust REST API.
Since you can add any data you wish, you can also store class parameters in PDS. This lets service owners provide classification that is merged with the centralized classification from the site administrators.

Now service owners or DevOps teams can start managing their own configuration data without going through the full Git pull request and approval process. You can also combine the PDS Hiera backend with the original flat-file YAML backend, giving the PDS backend priority but keeping the YAML backend as a fallback.
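
The priority-with-fallback arrangement can be modeled roughly as follows. This is a sketch under stated assumptions, not PDS code: `DictBackend` is a hypothetical in-memory stand-in for both the PDS database and the YAML files, and the keys are invented. The point is only that a value written to the higher-priority backend takes effect immediately, while untouched keys keep coming from the fallback.

```python
from typing import Optional


class DictBackend:
    """A minimal key-value backend; stands in for either data source."""

    def __init__(self, name: str, data: Optional[dict] = None):
        self.name = name
        self.data = dict(data or {})

    def get(self, key: str):
        """Return (found, value) so a stored None is not mistaken for a miss."""
        if key in self.data:
            return True, self.data[key]
        return False, None


def lookup(key: str, backends: list):
    """Ask each backend in priority order; return (backend name, value)."""
    for backend in backends:
        found, value = backend.get(key)
        if found:
            return backend.name, value
    return None, None


pds = DictBackend("pds")  # starts empty
yaml_files = DictBackend("yaml", {"myapp::log_level": "info", "myapp::workers": 4})
backends = [pds, yaml_files]  # PDS stand-in first, YAML stand-in as fallback

print(lookup("myapp::log_level", backends))  # served by the YAML fallback
pds.data["myapp::log_level"] = "debug"       # a service owner's change
print(lookup("myapp::log_level", backends))  # now served by the PDS stand-in
print(lookup("myapp::workers", backends))    # untouched keys still fall back
```

In this model the Git-managed data remains the baseline, and the database layer carries only the overrides that service owners choose to make.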

To that end, PDS provides:

A REST API service
A back-end database
A CLI for testing and scripting
A Hiera backend
A trusted external command for node data
A Puppet module for simple installation and configuration
RPM packages used by the module
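
To give a feel for what a trusted external command does, here is a minimal sketch. It assumes the general contract of Puppet's `trusted_external_command` setting: the server runs an executable with the node's name as its argument and reads a JSON hash from its output, which then appears under `$trusted['external']`. The node names and data below are hypothetical, and the in-memory dictionary merely stands in for a database; the actual PDS command is not shown in this article.

```python
import json
import sys

# Hypothetical node data; a real command would query a database instead.
NODE_DATA = {
    "web01.example.com": {"role": "webserver", "owner": "team-web"},
    "db01.example.com": {"role": "database", "owner": "team-data"},
}


def render(certname: str) -> str:
    """Return the node's data as a JSON object; raise KeyError if unknown."""
    return json.dumps(NODE_DATA[certname], sort_keys=True)


def main(certname: str) -> int:
    """Print the node's JSON and return an exit code (non-zero if unknown)."""
    try:
        print(render(certname))
    except KeyError:
        print(f"unknown node: {certname}", file=sys.stderr)
        return 1
    return 0


# Run only when called with exactly one argument, the node's certname.
if __name__ == "__main__" and len(sys.argv) == 2:
    sys.exit(main(sys.argv[1]))
```

Because the data is looked up on the server side at compile time, the node's local admin never gets a chance to alter it, and changing it does not touch the node's certificate.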
