Chef is a systems and cloud infrastructure automation framework that makes it easy to deploy servers and applications to any physical, virtual, or cloud location, no matter the size of the infrastructure. Chef relies on abstract definitions (known as cookbooks and recipes) that are written in Ruby and are managed like source code. Each definition describes how a specific part of your infrastructure should be built and managed. Chef then applies those definitions to servers and applications, as specified, resulting in a fully automated infrastructure. When a new node is brought online, the only thing that Chef needs to know is which cookbooks and recipes to apply.
The following diagram shows the relationships between the various elements of a Chef organization, including the nodes, the server, and the workstations. These elements work together to provide Chef the information and instruction that it needs so that it can do its job. As you are reviewing the rest of this doc, use the icons in the tables to refer back to this image.
Chef comprises three main elements: a server, one (or more) nodes, and at least one workstation.
Cookbooks are also a very important element of Chef and are treated as a separate component (alongside the server, nodes, and the workstation) across the documentation. In general, cookbooks are authored and managed from the workstation, moved to the Chef Server, and then pulled down to nodes by the chef-client during each Chef run.
The following sections discuss these elements (and their various components) in more detail.
A node is any server or virtual server that is configured to be maintained by a chef-client. A node can be physical or cloud-based. A Chef organization comprises any combination of physical and cloud-based nodes. A chef-client runs on each node. Ohai is used to collect data about the system so that it is available to the chef-client during every Chef run.
There are two types of nodes that Chef can manage:
A cloud-based node is hosted in an external cloud-based service, such as Amazon Virtual Private Cloud, OpenStack, Rackspace, Google Compute Engine, Linode, or Windows Azure. Plugins are available for Knife that provide support for external cloud-based services. Knife can use these plugins to create instances on cloud-based services. Once created, Chef can be used to deploy, configure, and maintain those instances.
A physical node is typically a server or a virtual machine, but it can be any active device attached to a network that is capable of sending, receiving, and forwarding information over a communications channel. In other words, a physical node is any active device attached to a network that can run a chef-client and communicate with a Chef Server.
Some important components on nodes include:
A chef-client is an agent that runs locally on every node that is registered with the Chef Server. When a chef-client is run, it will perform all of the steps that are required to bring the node into the expected state, including:
Chef uses RSA public key-pairs to authenticate a chef-client with the Chef Server every time a chef-client needs access to data that is stored on the Chef Server. This prevents any node from accessing data that it shouldn’t and ensures that only nodes that are properly registered with the Chef server can be managed by Chef.
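The key-pair idea described above can be sketched with Ruby's standard OpenSSL library. This is only an illustration of signing with a private key and verifying with the matching public key; it is not the actual request format used by the Chef Server API. The helper names (sign_request, valid_request?), the request fields, and the example paths are all hypothetical.

```ruby
require 'openssl'
require 'base64'

# Hypothetical helper: the client signs a description of its request
# with the private key that only the client holds.
def sign_request(private_key, method, path, timestamp)
  canonical = "#{method}\n#{path}\n#{timestamp}"
  signature = private_key.sign(OpenSSL::Digest.new('SHA256'), canonical)
  Base64.strict_encode64(signature)
end

# Hypothetical helper: the server rebuilds the same description and checks
# the signature against the public key it has on file for that client.
def valid_request?(public_key, signature, method, path, timestamp)
  canonical = "#{method}\n#{path}\n#{timestamp}"
  public_key.verify(OpenSSL::Digest.new('SHA256'),
                    Base64.strict_decode64(signature), canonical)
end

client_key  = OpenSSL::PKey::RSA.new(2048)  # generated when the client registers
server_copy = client_key.public_key         # the only part the server needs

sig = sign_request(client_key, 'GET', '/nodes/web01', '2013-01-01T00:00:00Z')
puts valid_request?(server_copy, sig, 'GET', '/nodes/web01', '2013-01-01T00:00:00Z') # prints true
puts valid_request?(server_copy, sig, 'GET', '/nodes/db01', '2013-01-01T00:00:00Z')  # prints false
```

Because the signature covers the request details, a signature issued for one request cannot be reused for a different one, and a client without a registered key cannot produce a valid signature at all.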
Ohai is a tool that is used to detect certain properties about a node’s environment and provide them to the chef-client during every Chef run. The types of properties Ohai reports on include:
When Chef configures the node object during each Chef run, these attributes are used by the chef-client to ensure that certain properties remain unchanged. (These properties are also referred to as automatic attributes.) Ohai is part of the required configuration on each node that is registered with the Chef Server.
A workstation is a computer that is configured to run Knife, to synchronize with the Chef repository, and to interact with a single Chef Server. The workstation is the location from which most users of Chef will do most of their work, including:
Some important components of workstations include:
Knife is a command-line tool that provides an interface between a local Chef repository and the Chef Server. Knife helps users of Chef to manage:
Chef uses RSA public key-pairs to authenticate Knife with the Chef Server every time Knife attempts to access the Chef Server. This ensures that each instance of Knife is properly registered with the Chef Server and that only trusted users can make changes to the data.
The Chef repository is the location in which the following data objects are stored:
The Chef repository is located on a workstation and should be synchronized with a version control system, such as git. All of the data in the Chef repository should be treated like source code.
Knife is used to upload data to the Chef Server from the Chef repository. Once uploaded, that data is used by Chef to manage all of the nodes that are registered with the Chef Server and to ensure that the correct cookbooks, environments, roles, and other settings are applied to nodes correctly.
Chef assumes that system administrators and developers know best about how the infrastructure should be put together. Chef makes as few decisions on its own as possible. When a decision must be made, Chef uses a reasonable default setting that can be easily changed by the system administrators and developers, most often by defining attributes in cookbooks that take precedence over the default attributes present on nodes.
The Chef Server acts as a hub for configuration data. The Chef Server stores cookbooks, the policies that are applied to cookbooks, and metadata that describes each registered node in the infrastructure. Nodes use the chef-client to ask the Chef Server for configuration details, such as recipes, templates, and file distributions. The chef-client then does as much of the configuration work as possible on the nodes themselves (and not on the Chef Server). This scalable approach distributes the configuration effort throughout the organization.
There are three types of Chef servers:
Hosted Chef is a version of a Chef Server that is hosted by Opscode. Hosted Chef is cloud-based, scalable, and available (24x7/365), with resource-based access control. Hosted Chef has all of the automation capabilities of Chef, but without requiring it to be set up and managed from behind the firewall.
Hosted Chef is based on the idea that an infrastructure management tool should be built around a collection of API primitives. Using an API to talk to a cloud provider (such as Amazon Virtual Private Cloud, Windows Azure, or Rackspace) makes it possible to treat those primitives as building blocks. Chef only needs to know about the desired state, how it should get there, and what the proper functionality of that desired state should be.
Private Chef is a version of a Chef Server that is designed to provide all of the infrastructure automation capabilities of Chef, set up and managed from within the organization.
Private Chef evolved out of a need for customers to have the same functionality provided by Hosted Chef, but located behind the firewall. Private Chef is the same as Hosted Chef in every other way. Hosted Chef is the largest Private Chef deployment in the world.
The open source Chef Server is a free version of the Chef Server that contains much of the same functionality as Hosted Chef, but requires that each instance be configured and managed locally, including performing data migrations, applying updates to the open source Chef Server, and ensuring that the open source Chef Server scales as the local infrastructure it is supporting grows. The open source Chef Server includes support from the Chef community, but does not include support directly from Opscode.
An API client is any machine that has permission to use the Chef Server API to communicate with the Chef Server. An API client is typically a node (on which the chef-client runs) or a workstation (on which Knife runs), but can also be any other machine configured to use the Chef Server API.
In addition to node objects, policy, and cookbooks, a Chef Server includes:
Search indexes allow queries to be made for any type of data that is indexed by the Chef Server, including data bags (and data bag items), environments, nodes, and roles. Chef has a defined query syntax that supports search patterns like exact, wildcard, range, and fuzzy. A search is a full-text query that can be done from several locations, including from within a recipe, by using the search subcommand in Knife, by using the search functionality in the Management Console, or by using the /search or /search/INDEX endpoints in the Chef Server API. The search engine is based on Apache Solr and is run from the Chef Server.
The Chef manager is a web-based interface that provides users of Chef a way to manage the following from the Chef Server:
For Chef, two important aspects of nodes are groups of attributes and run-lists. An attribute is a specific piece of data about the node, such as a network interface, a file system, the number of clients a service running on a node is capable of accepting, and so on. A run-list is an ordered list of recipes and/or roles that are run in exactly the order listed. The node object, which consists of the run-list and the node attributes, is a JSON file that is stored on the Chef Server. The chef-client gets a copy of the node object from the Chef Server during each Chef run and places an updated copy on the Chef Server at the end of each Chef run.
Some important node objects include:
An attribute is a specific detail about a node. Attributes are used by Chef to understand:
Attributes are defined by:
During every Chef run, the chef-client builds the attribute list using:
After the node object is rebuilt, all of the attributes are compared, and then the node is updated based on attribute precedence. At the end of every Chef run, the node object that defines the current state of the node is uploaded to the Chef Server so that it can be indexed for search.
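Attribute precedence can be pictured as a layered merge in which higher levels win over lower ones. The sketch below is plain Ruby, not Chef's own code, and it assumes a simplified set of levels (default, normal, override, automatic, from lowest to highest). The method names and the sample attribute values are hypothetical.

```ruby
# Precedence levels, lowest to highest. This is a simplified subset
# chosen for illustration.
PRECEDENCE = [:default, :normal, :override, :automatic].freeze

# Recursively merge nested hashes so that values from `higher` win.
def deep_merge(lower, higher)
  lower.merge(higher) do |_key, low, high|
    low.is_a?(Hash) && high.is_a?(Hash) ? deep_merge(low, high) : high
  end
end

# Collapse the per-level attribute hashes into the single view
# that a recipe would read.
def resolve_attributes(levels)
  PRECEDENCE.reduce({}) do |merged, level|
    deep_merge(merged, levels.fetch(level, {}))
  end
end

levels = {
  default:   { 'apache' => { 'port' => 80, 'workers' => 4 } }, # e.g. a cookbook attribute file
  override:  { 'apache' => { 'port' => 8080 } },               # e.g. a role or an environment
  automatic: { 'platform' => 'ubuntu' }                        # e.g. data collected by Ohai
}

resolved = resolve_attributes(levels)
puts resolved['apache']['port']    # the override value wins over the default
puts resolved['apache']['workers'] # the default survives where nothing overrides it
puts resolved['platform']
```

The merge is deep, so overriding one key under 'apache' leaves its sibling keys untouched.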
A run-list is an ordered list of roles and/or recipes that are run in an exact order. A run-list is always specific to the node on which it runs, though it is possible for many nodes to have run-lists that are similar or even identical. The items within a run-list are maintained using Knife and are uploaded to the Chef Server and stored as part of the node object for each node. Chef always configures a node in the exact order specified by its run-list and will never run the same recipe twice.
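The two guarantees described above (exact order, and no recipe run twice) can be sketched as a small expansion routine. This is an illustrative model in plain Ruby, not the chef-client's implementation. The role names, recipe names, and the expand_run_list method are hypothetical; only the recipe[...] and role[...] item notation follows the run-list convention.

```ruby
# Hypothetical role definitions: each role carries its own run-list.
ROLES = {
  'base'      => ['recipe[ntp]', 'recipe[users]'],
  'webserver' => ['role[base]', 'recipe[apache]', 'recipe[ntp]']
}.freeze

# Expand a run-list into an ordered list of recipe names.
# A recipe keeps the position of its first appearance and is never added
# twice; a role is expanded only once, which also guards against role cycles.
def expand_run_list(run_list, roles, recipes = [], visited_roles = [])
  run_list.each do |item|
    kind, name = item.match(/\A(recipe|role)\[(.+)\]\z/).captures
    if kind == 'recipe'
      recipes << name unless recipes.include?(name)
    elsif !visited_roles.include?(name)
      visited_roles << name
      expand_run_list(roles.fetch(name), roles, recipes, visited_roles)
    end
  end
  recipes
end

p expand_run_list(['role[webserver]', 'recipe[users]', 'recipe[logrotate]'], ROLES)
# prints ["ntp", "users", "apache", "logrotate"]
```

Here the ntp recipe is named by both roles and the users recipe is named twice in total, yet each one appears once in the expanded list, at the position where it was first encountered.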
Policy settings can be used to map the capabilities of Chef to business and operational requirements, such as process and workflow. Roles define server types, such as “web server” or “database server”. Environments define process, such as “dev”, “staging”, or “production”. Certain types of data, such as passwords, user account data, and other sensitive items, can be placed in data bags, which are located in a secure sub-area of Chef that can only be accessed by nodes that have the correct SSL certificates.
Some important aspects of policy include:
A role is a way to define certain patterns and processes that exist across nodes in a Chef organization as belonging to a single job function. Each role consists of zero (or more) attributes and a run-list. Each node can have zero (or more) roles assigned to it. When a role is run against a node, the configuration details of that node are compared against the attributes of the role, and then the contents of that role’s run-list are applied to the node’s configuration details. When a chef-client runs, it merges its own attributes and run-lists with those contained within each assigned role.
A data bag is a global variable that is stored as JSON data and is accessible from a Chef Server. A data bag is indexed for searching and can be loaded by a recipe or accessed during a search. The contents of a data bag can vary, but they often include sensitive information (such as database passwords).
An environment is a way to map an organization’s real-life workflow to what can be configured and managed when using Chef Server. Every Chef organization begins with a single environment called the _default environment, which cannot be modified (or deleted). Additional environments can be created, such as production, staging, testing, and development. Generally, an environment is also associated with one (or more) cookbook versions.
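To make the earlier description of a data bag concrete, the following sketch parses a data bag item from JSON using only Ruby's standard library. The item contents and the parse_data_bag_item helper are hypothetical; the sketch assumes only that an item is a JSON object identified by an "id" field, and it does not call the Chef Server.

```ruby
require 'json'

# Hypothetical helper: parse the JSON text of a data bag item and
# insist on the "id" field that names the item within its bag.
def parse_data_bag_item(json_text)
  item = JSON.parse(json_text)
  unless item.is_a?(Hash) && item.key?('id')
    raise ArgumentError, 'a data bag item needs an "id" field'
  end
  item
end

# Example item (made-up values) as it might be kept in the Chef repository.
raw_item = <<~JSON
  {
    "id": "mysql",
    "username": "app",
    "password": "example-only"
  }
JSON

item = parse_data_bag_item(raw_item)
puts "#{item['id']}: connect as #{item['username']}"
```

Because the item is plain JSON, it can be versioned in the Chef repository alongside cookbooks and then uploaded with Knife like any other data object.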
A cookbook is the fundamental unit of configuration and policy distribution in Chef. Each cookbook defines a scenario, such as everything needed to install and configure MySQL, and then it contains all of the components that are required to support that scenario, including:
Chef uses Ruby as its reference language for creating cookbooks and defining recipes, with an extended DSL for specific resources. Chef provides a reasonable set of resources, enough to support many of the most common infrastructure automation scenarios; however, this DSL can also be extended when additional resources and capabilities are required.
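How a Ruby DSL can offer resource keywords, and be extended with new ones, can be illustrated with a toy evaluator. This is a sketch of the general Ruby technique (define_method plus instance_eval) and is not Chef's implementation. ToyRecipe, ToyProperties, and the cron_job keyword are hypothetical names invented for this example.

```ruby
# Collects property settings written inside a resource block,
# e.g. `action :start` becomes { action: :start }.
class ToyProperties
  def initialize
    @values = {}
  end

  # Any bare word followed by arguments is recorded as a property.
  def method_missing(prop, *args)
    return super if args.empty?

    @values[prop] = args.length == 1 ? args.first : args
  end

  def to_h
    @values.dup
  end
end

# A toy recipe: evaluating it records resources in the order written.
class ToyRecipe
  attr_reader :resources

  def initialize
    @resources = []
  end

  # Declare a new resource keyword, such as `package` or `service`.
  def self.resource(keyword)
    define_method(keyword) do |name, &block|
      props = ToyProperties.new
      props.instance_eval(&block) if block
      @resources << { type: keyword, name: name, properties: props.to_h }
    end
  end

  resource :package
  resource :service
end

# Extending the toy DSL only requires declaring another keyword.
ToyRecipe.resource :cron_job

recipe = ToyRecipe.new
recipe.instance_eval do
  package 'ntp'
  service 'ntp' do
    action :start
  end
  cron_job 'rotate-logs' do
    minute '0'
  end
end

recipe.resources.each { |r| puts "#{r[:type]} #{r[:name]} #{r[:properties]}" }
```

The recipe body reads as a declaration of what should exist, while the evaluator decides what to do with each collected resource. That separation is what lets a DSL grow new resource types without changing existing recipes.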
Some important components of cookbooks include:
An attribute can be defined in a cookbook (or a recipe) and then used to override the default settings on a node. When a cookbook is loaded during a Chef run, these attributes are compared to the attributes that are already present on the node. When the cookbook attributes take precedence over the default attributes, Chef will apply those new settings and values during the Chef run on the node.
A recipe is the most fundamental configuration element within the Chef environment. A recipe:
A cookbook version represents a specific set of functionality that is different from the cookbook on which it is based. A version may exist for many reasons, such as ensuring that the correct version of a third-party component is being used or providing an update to a cookbook that fixes a bug or adds an improvement. A cookbook version can be defined using syntax and operators; it can be associated with environments, cookbook metadata, or run-lists; and it can be frozen (to prevent unwanted updates from being made). A cookbook version is handled just like a cookbook with regard to how the repository sees it, how it is stored on the Chef Server, how it is pushed out to nodes, and how it is used during a Chef run.
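Version constraints written with operators can be explored with the Gem::Version and Gem::Requirement classes that ship with Ruby. The sketch below assumes that operators such as =, >=, and ~> behave for cookbook versions the way they do for RubyGems versions; it is an analogy for experimentation, not Chef's own resolver. The helper names and the list of available versions are hypothetical.

```ruby
require 'rubygems'

# True when `version` is allowed by `constraint`, e.g. "~> 1.2".
def satisfies?(version, constraint)
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

# Pick the highest available version that the constraint allows,
# or nil when nothing matches.
def best_version(available, constraint)
  available.select { |v| satisfies?(v, constraint) }
           .max_by { |v| Gem::Version.new(v) }
end

available = ['1.0.0', '1.2.3', '1.4.0', '2.0.1'] # made-up cookbook versions

puts best_version(available, '~> 1.2')       # prints 1.4.0 (at least 1.2, below 2.0)
puts best_version(available, '= 1.0.0')      # prints 1.0.0
puts best_version(available, '>= 2.1').inspect # prints nil (no version qualifies)
```

Comparing through Gem::Version rather than as strings matters: as strings, "1.10.0" would sort below "1.4.0", which is the wrong order for versions.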
Chef will run a recipe only when asked. When Chef runs the same recipe more than once, the results will be the same system state each time. When a recipe is run against a system, but nothing has changed on either the system or in the recipe, Chef won’t change anything.
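This behavior (same result on every run, and no changes when nothing needs changing) is usually called idempotence. A minimal sketch follows, using an in-memory hash in place of a real file system. The converge_file method and the example path are hypothetical and are not Chef resources.

```ruby
# Toy "file" resource acting on an in-memory hash instead of a real disk.
# Returns true when it had to change something, false when the system
# was already in the desired state.
def converge_file(system_state, path, desired_content)
  return false if system_state[path] == desired_content # nothing to do

  system_state[path] = desired_content
  true
end

state = {}
puts converge_file(state, '/etc/motd', "Managed by Chef\n") # prints true: file created
puts converge_file(state, '/etc/motd', "Managed by Chef\n") # prints false: already correct
```

The resource describes the desired end state rather than a step to perform, so it first compares the current state with the desired one and acts only on a difference. Running it any number of times leaves the system in the same state.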
In addition to attributes, recipes, and versions, the following items are also part of cookbooks:
The key underlying principle of Chef is that you (the user) know best about what your environment is, what it should do, and how it should be maintained. Chef is designed to not make assumptions about any of those things. Only the individuals on the ground (that’s you and your team) understand the technical problems and what is required to solve them. Only your team can understand the human problems (skill levels, audit trails, and other internal issues) that are unique to your organization and whether any single technical solution is viable.
The idea that you know best about what should happen in your organization goes hand-in-hand with the notion that you still need help keeping it all running. It is rare that a single individual knows everything about a very complex problem, let alone all of the steps that may be required to solve it. The same is true with tools. Chef provides help with infrastructure management and can help solve very complicated problems. Chef also has a large community of users with extensive experience solving complex problems. That community can provide knowledge and support in areas where your organization may lack them and (along with Chef) can help your organization work through complex problems.
For the history of Chef, including where it came from and how it evolved, watch these two (short) videos:
For more information about Opscode, cookbooks for Chef, and the Chef community: