Assignment 3 – Scripting for System Automation COMP9053
1) Identify key Infrastructure Automation tools.
2) For each category, pick 3 of the tools listed & give a high-level description/definition of each tool. The description should consist of a single paragraph.
Terraform is an infrastructure provisioning tool designed with an emphasis on reusability and shareability. It can represent physical hardware, virtual machines, containers, DNS providers and more. Terraform itself is written in Go, but its configuration files are written in HashiCorp Configuration Language (HCL) or JSON. From these files Terraform generates an ‘execution plan’ that describes the steps required to provide the requested infrastructure.
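The idea of an execution plan can be illustrated with a minimal Python sketch. This is not how Terraform is implemented; the plan function and the resource names are hypothetical, and state is modelled as plain dictionaries. The sketch compares the desired configuration with the current state and lists the actions needed to reconcile the two.

```python
def plan(desired, current):
    """Return a sorted list of (action, resource_name) pairs to reach `desired`."""
    actions = []
    for name, spec in desired.items():
        if name not in current:
            actions.append(("create", name))
        elif current[name] != spec:
            actions.append(("update", name))
    for name in current:
        if name not in desired:
            actions.append(("delete", name))
    return sorted(actions)


# Hypothetical resources, for illustration only.
desired = {"web_vm": {"size": "small"}, "dns_record": {"ttl": 300}}
current = {"web_vm": {"size": "tiny"}, "old_vm": {"size": "small"}}

for action, name in plan(desired, current):
    print(action, name)
```

When the desired and current state already match, the plan is empty, which is what allows the same configuration to be applied repeatedly.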
Cobbler is a Linux installation server that allows for rapid setup of network installation environments. It automates many associated Linux tasks so you do not have to switch between various commands and applications when deploying new systems and changing existing ones. Cobbler can help with provisioning and with managing DNS and DHCP.
Chef provisioning is a collection of resources that enable the creation of machines and machine infrastructures using the chef-client. It has a plugin model that allows bootstrap operations to be done against any infrastructure, such as VirtualBox, DigitalOcean, Amazon EC2, LXC, bare metal, and more.
Puppet is an open-source, model-driven management system, where the user defines required system resources in a Puppet ‘manifest’ file, written in Puppet’s own declarative language. These manifest files will often be hosted on a server and provided to the client as configuration instructions.
Ansible covers many aspects of Infrastructure as Code. It utilises YAML to construct ‘Ansible Playbooks’, which allow the user to manage the configuration of machines such as web servers. It is agentless, favouring connections over SSH rather than the client/server model of Puppet.
CFEngine is an open-source configuration management system that has been in development since 1993 and has gone through several major versions since. Developed in the C programming language, it offers lower memory usage, fewer dependencies and faster speeds than similar configuration management tools such as Puppet.
Juju uses ‘charms’ to deploy, manage and scale software and interconnected services across one or more Ubuntu servers and cloud platforms.
Cloudify is an open-source orchestration tool that promotes its usability across a wide range of technologies. For architectures that run across multiple clouds, Cloudify can perform network automation at scale using TOSCA declarative blueprint files.
Heat is OpenStack’s orchestration tool for managing applications deployed on the cloud. It can orchestrate resources such as ports, routers, instances and private networks.
Jenkins is an open-source tool to automate deployment of software through continuous building and testing. Through plugins it can be expanded to integrate with third party applications to handle all steps in the Continuous Integration and Continuous Deployment pipeline.
TravisCI is a hosted Continuous Integration service. Builds are configured using a YAML file (.travis.yml) which contains the build tasks that will be executed when the build runs.
GitLabCI is a hosted Continuous Integration solution fully integrated with GitLab. Builds are defined in a YAML file (.gitlab-ci.yml) and executed by GitLab Runner, which is written in the Go language. These builds can run on Windows, Linux, OSX, FreeBSD, and Docker.
3) For the 3 chosen tools in a given category, highlight the key similarities/difference between them & from this identify environments/users that might be suitable for each tool
Puppet and CFEngine are two of the most widely compared tools in the IaC (Infrastructure as Code) spectrum. Puppet is regarded as the Ops-friendly option, due to its model-driven approach and relatively gentle learning curve, whereas CFEngine is deemed to be more of a dev-friendly implementation. In contrast to the Ruby-built Puppet, CFEngine was developed in C. This means that CFEngine has a dramatically smaller memory footprint, runs faster and has far fewer dependencies. For configuration information, CFEngine uses its own declarative language to create policy statements. Puppet, on the other hand, writes its manifests in its own declarative DSL; the tool itself, and extensions such as custom facts and types, are written in Ruby, so those with some Ruby experience may find themselves in more familiar territory with Puppet.
One of the primary issues for CFEngine is that the learning curve is quite steep. Puppet’s model-driven implementation requires a much smaller learning curve, which makes it a popular choice for sysadmins with limited coding experience. The model-driven implementation also takes on a lot of the burden for dependency management.
Ansible, on the other hand, is seen as more useful for small, fast or non-permanent deployments (like managing a set of web servers for a one-off project). As a much more recent product, its GUI is less developed than, say, Puppet’s, and it offers less of a level of comfort than its two bigger competitors. It is developed in Python, which makes it somewhat slower than the C-based CFEngine but comparable to Puppet, which is built on another high-level language, Ruby. Puppet and Ansible also share the idea of idempotency, which is essential for automation. If written correctly, an automated script should leave the system in the same consistent state no matter how many times it is run and regardless of what else may have changed on that system. This is very difficult to achieve with regular shell scripts, but it is made much easier by a tool like Puppet or Ansible, whose configuration is written in terms of resources and desired state.
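Idempotency can be shown with a short Python sketch. The function names and the sample configuration line are hypothetical and stand in for what a shell script and a configuration management resource would do. The first function blindly appends a line on every run, while the second checks the current state and only changes the file when the desired line is missing.

```python
import os
import tempfile


def append_line(path, line):
    """Non-idempotent: every run adds another copy of the line."""
    with open(path, "a") as f:
        f.write(line + "\n")


def ensure_line(path, line):
    """Idempotent: add the line only if it is missing; return True if changed."""
    existing = []
    if os.path.exists(path):
        with open(path) as f:
            existing = f.read().splitlines()
    if line in existing:
        return False
    with open(path, "a") as f:
        f.write(line + "\n")
    return True


# Run each approach three times against a scratch file.
workdir = tempfile.mkdtemp()
naive = os.path.join(workdir, "naive.conf")
managed = os.path.join(workdir, "managed.conf")
for _ in range(3):
    append_line(naive, "PermitRootLogin no")
    ensure_line(managed, "PermitRootLogin no")

with open(naive) as f:
    naive_count = len(f.read().splitlines())
with open(managed) as f:
    managed_count = len(f.read().splitlines())
print(naive_count, managed_count)  # prints "3 1"
```

After three runs the naive file holds three copies of the line, whereas the managed file holds exactly one, which is the behaviour a desired-state tool aims for.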
Similarities between Puppet and Ansible config: https://dzone.com/articles/puppet-or-ansible-how-to-choose
Both Heat and Juju provide service orchestration on the cloud. Juju can be used to deploy almost anything, including a complete OpenStack cloud, in both virtual and physical environments. In other words, Juju is both a cloud orchestration and a deployment tool. Heat, by contrast, is a CloudFormation-like orchestration engine that provides OpenStack infrastructure resource provisioning and lifecycle management for instances running on the OpenStack compute service, along with related resources such as floating IPs, volumes and virtual routers.
Juju is more than an orchestrator. The basic unit of Juju is the ‘charm’, which is essentially a software package with some configuration information bundled together in a consistent way. Juju allows you to combine charms together into an application running across multiple machines, sort of like a distributed package manager. So, for example, a web application server might be provided with the details of how it can connect to a database deployed on another (or the same) server. It can also provision machines (virtual or actual) to deploy the software on, from any of a number of platforms: OpenStack Nova, EC2, MaaS.
Heat is focused on OpenStack infrastructure – not just Nova servers, but every kind of resource that OpenStack provides (Nova servers, Cinder volumes, Neutron networks, load balancers, security groups, Swift buckets, Trove databases, Sahara clusters and many, many others). Heat also includes Software Deployment resources which allow you to deploy software to Nova servers and link together the configuration of the various deployments in a similar way to Juju. A Heat template can define the entire infrastructure and software deployment of an application, though it can also be broken down into logical units where appropriate.
When first implementing a continuous integration solution, the most pressing matter will be where it needs to be hosted. Jenkins requires developers to run and maintain their own dedicated server, whereas TravisCI is hosted externally (often at a cost, dependent on the user’s needs). Like TravisCI, GitLabCI is also a hosted solution. TravisCI and GitLabCI both work on large-scale projects out of the box, but lack the configurability of Jenkins, which requires a lot more configuration but also offers a lot more extensibility via its plugins. For TravisCI, the user must maintain only a config file, while for Jenkins the user is responsible for maintaining the entire system, although system updates are made simple through its GUI. TravisCI is ideal for open source projects that require testing in multiple environments, and Jenkins is better suited for larger projects that require a high degree of customisation.
Perhaps GitLabCI’s biggest draw is its out-of-the-box integration with GitLab; however, Jenkins does have a plugin that integrates with GitLab, albeit in a slightly less user-friendly way.
4) For each category pick one of the tools & give a detailed description of it.
Chef is an automation tool that provides a way to define infrastructure as code, using a Ruby-based domain-specific language for writing system configurations. In Chef, Nodes are dynamically updated with the configurations held on the Server. There is therefore no need to execute a command on the Chef Server to push the configuration to the Nodes; the Nodes automatically update themselves with the configurations present on the Server.
Chef supports several platforms such as AIX, RHEL/CentOS, FreeBSD, OS X, Solaris, Microsoft Windows and Ubuntu as well as client platforms including Arch Linux, Debian and Fedora. Chef also has the ability to be integrated with several cloud-based platforms such as Internap, Amazon EC2, Google Cloud Platform, OpenStack, SoftLayer, Microsoft Azure and Rackspace to automatically provision and configure new machines.
As outlined in the above diagram, there are three major Chef components:
The Workstation is the location from which all Chef configurations are managed. This machine holds all the configuration data that can later be pushed to the central Chef Server. These configurations are tested on the Workstation before being pushed to the Chef Server. A Workstation includes a command-line tool called Knife, which is used to interact with the Chef Server and was traditionally configured through a knife.rb file (superseded by config.rb as of the 12.0 release of Chef). There can be multiple Workstations that together manage the central Chef Server.
The Chef Server acts as a hub for configuration data. The Chef Server stores Cookbooks, the policies that are applied to Nodes, and metadata that describes each registered Node that is being managed by the Chef-Client.
Nodes use the Chef-Client to ask the Chef Server for configuration details, such as Recipes, Templates, and file distributions. The Chef-Client then does as much of the configuration work as possible on the Nodes themselves (and not on the Chef Server). Each Node has the Chef-Client software installed, which pulls down from the central Chef Server the configuration that is applicable to that Node. This scalable approach distributes the configuration effort throughout the organisation.
A Node can be a cloud-based virtual server or a physical server in your own data center that is managed using the central Chef Server. The main component that needs to be present on the Node is an agent that establishes communication with the central Chef Server. This agent is the Chef-Client.
Chef Client performs the following functions:
• It is responsible for interacting with the central Chef Server.
• It manages the initial registration of the Node to the central Chef Server.
• It pulls down Cookbooks, and applies them on the Node, to configure it.
• It periodically polls the central Chef Server to fetch new configuration items, if any.
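The pull model described above can be sketched in Python. This is only an in-memory simulation: ChefServerStub, NodeClient, converge and the sample policy are hypothetical names invented for the illustration and are not part of Chef. The sketch shows a node registering itself, pulling its policy, applying only what differs, and finding nothing to do on the next polling cycle.

```python
class ChefServerStub:
    """In-memory stand-in for a configuration server (hypothetical)."""

    def __init__(self):
        self.nodes = {}      # node name -> metadata
        self.policies = {}   # node name -> desired settings

    def register(self, name, metadata):
        self.nodes[name] = metadata

    def policy_for(self, name):
        return dict(self.policies.get(name, {}))


class NodeClient:
    """Pulls its policy from the server and applies it locally."""

    def __init__(self, name, server):
        self.name = name
        self.server = server
        self.state = {}
        server.register(name, {"platform": "ubuntu"})  # initial registration

    def converge(self):
        """One polling cycle: fetch policy, apply differences, return changes."""
        desired = self.server.policy_for(self.name)
        changed = {k: v for k, v in desired.items() if self.state.get(k) != v}
        self.state.update(changed)
        return changed


server = ChefServerStub()
node = NodeClient("web01", server)
server.policies["web01"] = {"ntp": "installed"}
first = node.converge()    # applies the new setting
second = node.converge()   # nothing left to change
print(first, second)
```

Because the work of comparing and applying happens in the client, the server in this sketch only stores data, which mirrors the division of labour described in the text.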
Sample config.rb file
Sample metadata.rb file
Puppet configuration consists of a language, client-server processes and the Resource Abstraction Layer. The language allows the description of a server configuration with an abstraction of the resources that an administrator already thinks in: users, groups, packages, files, cron, mount and services.
The relationships between the resources are also specified. For example, a service depends on a configuration file, and that file depends on a package being installed. The relationships provide order as the policy is applied and allow Puppet to restart dependent services when their configurations change.
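The ordering that these relationships provide can be sketched in Python with a topological sort. The resource names and the services_to_restart helper are hypothetical, and this is a simplified model rather than Puppet’s own algorithm. Each resource lists what it depends on, and the sort yields an order in which every dependency is applied first.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each resource maps to the set of resources it depends on.
# Resource names are hypothetical.
dependencies = {
    "package:ntp": set(),
    "file:/etc/ntp.conf": {"package:ntp"},
    "service:ntpd": {"file:/etc/ntp.conf"},
}

# Dependencies come out before the resources that need them.
order = list(TopologicalSorter(dependencies).static_order())
print(order)


def services_to_restart(changed, deps):
    """Return services that directly depend on a changed resource."""
    return sorted(
        name for name, needs in deps.items()
        if name.startswith("service:") and changed in needs
    )


print(services_to_restart("file:/etc/ntp.conf", dependencies))
```

The same dependency data answers both questions raised in the text: in what order to apply resources, and which service to restart when a configuration file changes.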
The resources can also be composed into logical collections. To reuse the previous example, a package, a configuration file and a service can be grouped together. The group can then be reused and treated as a single logical entity in other Puppet code. The client-server setup provides a secure mechanism for transporting the specific configurations from the central description to the individual hosts over HTTPS, with SSL authentication and encryption (the same SSL used to secure online banking and e-commerce). Each host only receives its specifically compiled configuration to apply.
The following functions are performed in the above image:
• Puppet Slave asks for the Puppet Master certificate.
• After the Slave receives the Puppet Master certificate, the Master requests the Slave certificate.
• Once the Master has signed the Slave certificate, the Slave requests its configuration/data.
• The Puppet Agent on the Slave sends its Facts to the Puppet Master. Facts are key/value data pairs that represent some aspect of the Slave’s state, such as its IP address, up-time, operating system, or whether it’s a virtual machine.
• The Puppet Master uses the Facts to compile a Catalog that defines how the Slave should be configured. The Catalog is a document that describes the desired state of each resource that the Puppet Master manages on a Slave.
• The Puppet Master sends the configuration to the Puppet Slave.
• Finally, the Puppet Slave reports back to the Master once the configuration has been applied, and the report is visible in the Puppet dashboard.
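The step in which the Master turns Facts into a Catalog can be sketched in Python. The compile_catalog function, the fact names and the catalog layout are hypothetical simplifications and do not reflect Puppet’s real data formats. The point is only that the same policy can produce a different desired state for each host, depending on the facts that host reports.

```python
def compile_catalog(facts):
    """Build a per-host catalog (desired resource states) from its facts."""
    # The package name differs by OS family; this mapping is an assumption
    # made for the example.
    package = "apache2" if facts["os_family"] == "Debian" else "httpd"
    return {
        "host": facts["hostname"],
        "resources": [
            {"type": "package", "title": package, "ensure": "installed"},
            {"type": "service", "title": package, "ensure": "running"},
        ],
    }


# Facts are key/value pairs describing the host.
facts = {"hostname": "web01", "os_family": "Debian", "is_virtual": True}
catalog = compile_catalog(facts)
print(catalog["resources"][0]["title"])  # prints "apache2"
```

A host reporting a different operating system family would receive a catalog naming a different package, even though the policy on the Master is unchanged.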
The configuration details for every Slave are held on the Puppet Master, written in the native Puppet language. These files are termed Manifests. They are composed of Puppet code, their filenames use the .pp extension, and they are essentially Puppet programs.
Puppet manifest: https://www.digitalocean.com/community/tutorials/getting-started-with-puppet-code-manifests-and-modules
Ansible is a minimalist IT automation tool that has a low learning curve, using YAML for its provisioning scripts. It has a great number of built-in modules, which are Ansible’s way of abstracting system management or configuration tasks such as installing packages and working with templates. In many ways, this is where the real power in Ansible lies. By abstracting commands and state into modules, Ansible is able to make system management idempotent. This is an important concept that makes configuration management tools like Ansible much more powerful and safer than a typical shell script.
Playbooks allow you to organize your configuration and management tasks in simple, human-readable files. Each playbook is defined in a YAML file and contains a list of ‘plays’, each of which maps a group of hosts to an ordered list of tasks. Playbooks can be combined with other playbooks, and their tasks can be organized into Roles, which allow you to define sophisticated infrastructures and then easily provision and manage them.
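The shape of a playbook can be pictured as nested data. In Ansible, a playbook is a list of plays, and each play maps a group of hosts to an ordered list of tasks. The Python sketch below models that structure with plain lists and dictionaries. The "module" and "args" keys are a simplification chosen for this example rather than Ansible’s actual YAML syntax, and the host group and task names are hypothetical.

```python
# A playbook modelled as data: a list of plays, each targeting some hosts
# with an ordered list of tasks. Key names are simplified for illustration.
playbook = [
    {
        "name": "Configure web servers",
        "hosts": "webservers",
        "tasks": [
            {"name": "Install nginx", "module": "package",
             "args": {"name": "nginx", "state": "present"}},
            {"name": "Start nginx", "module": "service",
             "args": {"name": "nginx", "state": "started"}},
        ],
    },
]


def list_tasks(playbook):
    """Return (hosts, task name) pairs in execution order."""
    return [
        (play["hosts"], task["name"])
        for play in playbook
        for task in play["tasks"]
    ]


for hosts, task_name in list_tasks(playbook):
    print(f"{hosts}: {task_name}")
```

Note that each task describes a desired state ("present", "started") rather than a command to run, which is what lets the modules behave idempotently.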
Jenkins coordinates a wide variety of activities, such as checking out and building new versions of code, running tests, and deploying software.
Jobs are the runnable tasks that are controlled and monitored by Jenkins. Examples of jobs include compiling source code, running tests, provisioning a test environment, deploying, archiving, post-build tasks such as reporting, and executing arbitrary scripts. Jenkins jobs can be scheduled to run periodically, often on a nightly basis. Setting up a Jenkins job is straightforward, and jobs can integrate with third-party extensions in order to deploy to external applications. Webhooks can act as triggers to initiate builds when certain criteria are met, for example when a branch is updated on GitHub. Jenkins monitors the execution of the steps and can stop the process if one of the steps fails. Jenkins can also send out notifications in case of a build success or failure.
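The behaviour of stopping at the first failed step and then notifying can be sketched in Python. The run_job function, the step names and the notification messages are hypothetical. This is a simplified model of a build job, not Jenkins code.

```python
def run_job(steps, notify):
    """Run steps in order; stop at the first failure and send a notification."""
    completed = []
    for name, step in steps:
        try:
            step()
        except Exception as exc:
            notify(f"FAILURE in '{name}': {exc}")
            return completed, False
        completed.append(name)
    notify("SUCCESS")
    return completed, True


def failing_tests():
    """Stand-in for a test step that fails."""
    raise RuntimeError("2 tests failed")


messages = []
steps = [
    ("checkout", lambda: None),
    ("build", lambda: None),
    ("test", failing_tests),
    ("deploy", lambda: None),   # never reached in this run
]
completed, ok = run_job(steps, messages.append)
print(completed, ok, messages)
```

In this run the deploy step is skipped because the test step fails, so a broken build never reaches the deployment stage.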
Jenkins stores all the settings, logs and build artefacts in its home directory. Users can view previous builds, console output and workspaces.
Each job is defined in its own config.xml file, stored in the job’s directory under the Jenkins home directory.
Sample config.xml file: https://github.com/cyberswat/jenkins-example-configs/blob/master/jobs/devops%20test%20boulder/config.xml