Chef is an automation platform that "turns infrastructure into code," allowing organizations to version control and deploy services and code to multiple servers in a repeatable fashion. Chef cookbooks are the fundamental unit of configuration and policy distribution on the Chef platform. A cookbook defines a system or application and contains everything required to support those components. A cookbook can contain the following elements:
  • Recipes that define the resources to use and the order in which to apply them.
  • Attribute values
  • Files
  • Templates
  • Extensions to Chef, including custom resources and libraries.
This post will discuss the versioning of cookbooks using semantic versioning practices. Please note that this post is a conglomeration of several blog posts.

Cookbook Versioning

Use semantic versioning when numbering cookbooks. The version is declared in the cookbook's metadata.rb file.
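For context, the version is just one line of the cookbook's metadata. A minimal sketch (the cookbook name and maintainer are illustrative):

```ruby
# metadata.rb
name             'my_cookbook'
maintainer       'Your Name'
license          'Apache-2.0'
description      'Installs and configures my service'
version          '1.2.3'   # MAJOR.MINOR.PATCH
```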

  • Given a version number MAJOR.MINOR.PATCH, increment the: 
    • MAJOR version when you make incompatible API changes, 
    • MINOR version when you add functionality in a backwards-compatible manner, 
    • PATCH version when you make backwards compatible bug fixes 
  • Additional labels for pre-release and build metadata are available as extensions to the MAJOR.MINOR.PATCH format.
Only upload stable cookbooks from master.
Only upload unstable cookbooks to your own fork. Merge to master and bump the version when stable.
Never ever decrement the version of a cookbook!
  • Chef-client will always use the highest-numbered cookbook that is available after considering all constraints. If Chef Server knows about a cookbook with a higher number than the one you just uploaded, then your code is not going to get run. Do not add a version constraint in your test environment to work around this; it will definitely bite you later on. Your build system should fail the build if the cookbook version has not been incremented beyond the last uploaded cookbook. This matters even more if you're publishing to Supermarket.
Bug fixes not affecting the code increment the patch version, backwards compatible additions/changes increment the minor version, and backwards incompatible changes increment the major version.
This system is called "Semantic Versioning." Under this scheme, version numbers and the way they change convey meaning about the underlying code and what has been modified from one version to the next.

Cookbook Versioning Specifications

  1. A normal version number MUST take the form X.Y.Z where X, Y, and Z are non-negative integers, and MUST NOT contain leading zeroes. X is the major version, Y is the minor version, and Z is the patch version. Each element MUST increase numerically. For instance: 1.9.0 -> 1.10.0 -> 1.11.0. 
  2. Once a versioned package has been released, the contents of that version MUST NOT be modified. Any modifications MUST be released as a new version. 
  3. Major version zero (0.y.z) is for initial development. Anything may change at any time. The cookbook should not be considered stable. 
  4. Version 1.0.0 defines the public cookbook. How the version number is incremented after this release depends on how the cookbook changes. 
  5. Patch version Z (x.y.Z | x > 0) MUST be incremented if only backwards compatible bug fixes are introduced. A bug fix is defined as an internal change that fixes incorrect behavior. 
  6. Minor version Y (x.Y.z | x > 0) MUST be incremented if new, backwards compatible functionality is introduced to the public API. It MUST be incremented if any public API functionality is marked as deprecated. It MAY be incremented if substantial new functionality or improvements are introduced within the private code. It MAY include patch level changes. Patch version MUST be reset to 0 when minor version is incremented. 
  7. Major version X (X.y.z | X > 0) MUST be incremented if any backwards incompatible changes are introduced to the public API. It MAY include minor and patch level changes. Patch and minor version MUST be reset to 0 when major version is incremented. 
  8. Precedence refers to how versions are compared to each other when ordered. Precedence MUST be calculated by separating the version into major, minor, patch and pre-release identifiers, in that order (build metadata does not figure into precedence). Precedence is determined by the first difference when comparing each of these identifiers from left to right: major, minor, and patch versions are always compared numerically. Example: 1.0.0 < 2.0.0 < 2.1.0 < 2.1.1. When major, minor, and patch are equal, a pre-release version has lower precedence than a normal version.
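If you want to sanity-check an ordering, Ruby's built-in Gem::Version gives a reasonable approximation of these precedence rules (it rewrites the pre-release dash as ".pre.", so it is not a strict SemVer comparator, but it agrees on the cases above):

```ruby
require 'rubygems'

versions = %w[1.0.0-alpha 1.0.0 2.0.0 2.1.0 2.1.1]
sorted   = versions.map { |v| Gem::Version.new(v) }.sort

puts sorted.map(&:to_s).join(' < ')
# => 1.0.0.pre.alpha < 1.0.0 < 2.0.0 < 2.1.0 < 2.1.1
```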


How should I deal with revisions in the 0.y.z initial development phase?
  • The simplest thing to do is start your initial development release at 0.1.0 and then increment the minor version for each subsequent release. 
How do I know when to release to 1.0.0?
  • If your software is being used in production, it should probably already be 1.0.0. If you have a stable cookbook on which users have come to depend, you should be 1.0.0. If you’re worrying a lot about backwards compatibility, you should probably already be 1.0.0. 
If even the tiniest backwards incompatible changes to the public cookbook require a major version bump, won't I end up at version 42.0.0 very rapidly?
  • This is a question of responsible development and foresight. Incompatible changes should not be introduced lightly to software that has a lot of dependent code. The cost that must be incurred to upgrade can be significant. Having to bump major versions to release incompatible changes means you’ll think through the impact of your changes, and evaluate the cost/benefit ratio involved. 
What do I do if I accidentally release a backwards incompatible change as a minor version?
  • As soon as you realize that you’ve broken the Semantic Versioning spec, fix the problem and release a new minor version that corrects the problem and restores backwards compatibility. Even under this circumstance, it is unacceptable to modify versioned releases. If it’s appropriate, document the offending version and inform your users of the problem so that they are aware of it. 
How should I handle deprecating functionality? 
  • Deprecating existing functionality is a normal part of software development and is often required to make forward progress. When you deprecate part of your public API, you should do two things: (1) update your documentation to let users know about the change, (2) issue a new minor release with the deprecation in place. Before you completely remove the functionality in a new major release there should be at least one minor release that contains the deprecation so that users can smoothly transition to the new API.
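In a cookbook, the minor release that carries the deprecation can announce it at converge time. A hedged sketch (the attribute names here are made up for illustration):

```ruby
# recipes/default.rb
if node['my_cookbook'].key?('foo')
  Chef::Log.warn(
    'my_cookbook: the `foo` attribute is deprecated and will be ' \
    'removed in the next major release; use `bar` instead.'
  )
end
```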

Part of what makes Chef tooling so powerful is its ability to test your product quickly and easily over a variety of different platforms. Using recipes and test-kitchen, a Chef user can call on a variety of different drivers to push their cookbooks to EC2 instances, Docker containers, vSphere and Azure instances in a matter of moments. If a driver exists for the platform, test-kitchen can be used for "local" deployment.

This article will talk about the deployment of infrastructure to EC2 instances through test-kitchen. Note however that EC2 is just an example case to get one started. Several other driver platforms exist, including:

  • kitchen-all: A driver for everything, or “all the drivers in a single Ruby gem”.
  • kitchen-bluebox: A driver for Blue Box.
  • kitchen-cloudstack: A driver for CloudStack.
  • kitchen-digitalocean: A driver for DigitalOcean.
  • kitchen-docker: A driver for Docker.
  • kitchen-dsc: A driver for Windows PowerShell Desired State Configuration (DSC).
  • kitchen-ec2: A driver for Amazon EC2.
  • kitchen-fog: A driver for Fog, a Ruby gem for interacting with various cloud providers.
  • kitchen-google: A driver for Google Compute Engine.
  • kitchen-hyperv: A driver for Hyper-V Server.
  • kitchen-joyent: A driver for Joyent.
  • kitchen-linode: A driver for Linode.
  • kitchen-opennebula: A driver for OpenNebula.
  • kitchen-openstack: A driver for OpenStack.
  • kitchen-pester: A driver for Pester, a testing framework for Microsoft Windows.
  • kitchen-rackspace: A driver for Rackspace.
  • kitchen-terraform: A driver for Terraform.
  • kitchen-vagrant: A driver for Vagrant. The default driver packaged with the Chef development kit.

This list is pulled directly from Chef's Kitchen docs page. Other drivers can also be found in the open community, including kitchen-cloudformation, which was used with some success on a recent project that my company worked on. 

In order to begin using test-kitchen you must install ChefDK. This will not be covered here, but I discussed these steps in a previous blog post here. The kitchen-ec2 driver is installed with ChefDK by default, so no further work is needed to set up the EC2 kitchen driver.

An AWS account will also be required to launch instances. To recap, before starting you will need:

  • An AWS account
  • ChefDK installed on your local computer

First, start by opening your Chef Development tools by double-clicking the icon. 

Next, we will need to create a cookbook to run test-kitchen on. We can do this by running the command chef generate cookbook [cookbook_name]. I will name this cookbook test_cookbook to keep it descriptive. 

When this command runs successfully you will see a number of things returned, including a section which says "Your cookbook is ready. Type 'cd [cookbook_name]' to enter it" 
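In practice the whole step is just two commands (the generator prints the message quoted above when it finishes):

```console
$ chef generate cookbook test_cookbook
$ cd test_cookbook
```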

Since actually designing these cookbooks is out of scope for now, all we are really interested in is the kitchen.yml file. This file is what will give our cookbook the information needed to spin up our instance through test-kitchen. Using your favorite editor (we will be using Visual Studio Code for this demo, which can be downloaded here), open the new cookbook that you just created. 

As we can see in the explorer window, there are a great number of files and directories from which to choose. We will discuss these in a later blog post. What we are most interested in now is the kitchen.yml file. 

The kitchen.yml contains a number of different components and provides a great deal of control when spinning up our instances. We are currently using the EC2 driver; its full documentation can be found here. Since this is just a guide to get us started, let's look at four specific sections: driver, transport, platforms and suites. 

Let's walk through the components.


name is the name of the driver that we are going to use to spin up our instance. In this case, as explained earlier, we will use the ec2 kitchen driver. This driver uses the aws-sdk gem to provision and destroy EC2 instances.

instance_type is the EC2 instance type (also known as size) to use. The default is t2.micro or t1.micro, depending on whether the image is hvm or paravirtual (paravirtual images are incompatible with t2.micro).

aws_ssh_key_id is the ID of the AWS key pair you want to use. The default will be read from the AWS_SSH_KEY_ID environment variable if set, or nil otherwise. If aws_ssh_key_id is specified, it must be one of the KeyName values shown by the AWS CLI: aws ec2 describe-key-pairs. Otherwise, if not specified, you must either have a user pre-provisioned on the AMI or provision the user using user_data. This all gets very technical, but luckily there are some good instructions on how to pull down these keys here.

security_group_ids is an array of EC2 security groups which will be applied to the instance. The default is ["default"].
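Putting those driver settings together, the driver section might look like this (the instance type, key name, and security group IDs are placeholder values):

```yaml
driver:
  name: ec2
  instance_type: t2.medium
  aws_ssh_key_id: my_key
  security_group_ids: ["sg-xxxxxxxx", "sg-yyyyyyyy"]
```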


ssh_key is the private key file for the AWS key pair you want to use. This will allow you to ssh into your instance to run your recipe during the kitchen converge stage and when using kitchen login.
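In the kitchen.yml this lives under the transport section, for example (the key path is a placeholder):

```yaml
transport:
  ssh_key: ~/.ssh/my_key.pem
```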


name, under platforms, is the way to specify the image you want to run on your instance. 
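For example, kitchen-ec2 can look up a stock AMI from just the platform name:

```yaml
platforms:
  - name: ubuntu-16.04
```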


suites is a collection of test suites, with each suite_name grouping defining an aspect of a cookbook to be tested. Each suite_name must specify a run_list.
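For example, using the test_cookbook we generated earlier:

```yaml
suites:
  - name: default
    run_list:
      - recipe[test_cookbook::default]
```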

name is the name of the suite that we are going to run. 

run_list is what we want run on the instance once it is up and running. 

Altogether, our kitchen.yml will look something like this:
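A sketch consistent with the settings described below (the key name, key path, and security group IDs are placeholders to substitute with your own):

```yaml
---
driver:
  name: ec2
  instance_type: t2.medium
  aws_ssh_key_id: my_key
  security_group_ids: ["sg-xxxxxxxx", "sg-yyyyyyyy"]

provisioner:
  name: chef_zero

transport:
  ssh_key: ~/.ssh/my_key.pem

platforms:
  - name: ubuntu-16.04

suites:
  - name: default
    run_list:
      - recipe[test_cookbook::default]
```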

This .yml says we want to provision an Ubuntu 16.04 server on a t2.medium instance. We will use the key my_key as our AWS SSH key to provision the instance. This key will also be used to ssh into the instance once it is provisioned. The instance will have two security groups attached (passed as an array, thus the []) and will run the default recipe for our cookbook. 

Since our cookbook default recipe is completely empty this will be a relatively boring test but we will run it anyway just to get a feel for what test-kitchen will do. 

First, we will launch ChefDK again, if we have closed it, by going to the icon. 

Navigate to the root folder of our project. In this case it is test_cookbook; yours may differ. 

Next we will run berks install. This will download any cookbook dependencies that we might need to get our current cookbook to run. While we are not layering or wrapping any cookbooks currently, this is a good habit to get into so you don't run into problems in the future. NOTE: If you have already run berks install on a cookbook, you will run berks update from that point on; berks install is only for the initial run.

Unfortunately, my connection was not being friendly when this was written, but berks did complete. Next we will actually run test-kitchen. From your command line run kitchen converge. This command will spin up the instance and then apply your cookbook on top of it.

If everything is set up appropriately you should start to see movement in your command prompt.

After everything is deployed and setup you should get the return "Kitchen is finished" letting you know that everything deployed successfully.

In order to log in to our instance we will run kitchen login.

Once in our instance we can make any changes necessary and verify our instance. Type exit to get back out.

Finally, to destroy the instance we will type kitchen destroy. 
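For reference, the whole loop we just walked through is only a handful of commands:

```console
$ berks install      # resolve and fetch cookbook dependencies
$ kitchen converge   # provision the EC2 instance and apply the cookbook
$ kitchen login      # ssh into the instance to poke around
$ exit               # leave the instance
$ kitchen destroy    # terminate the instance
```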

These are just the very first steps of what can be done with kitchen. Test-kitchen is a powerful tool for testing cookbooks locally before pushing to staging or production boxes. 

Recently, I have been working with Chef Automation. Chef, as pulled directly from their literature:

lets you manage ... all (servers) by turning infrastructure into code. Infrastructure described as code is flexible, versionable, human-readable, and testable. Whether your infrastructure is in the cloud, on-premises or in a hybrid environment, you can easily and quickly adapt to your business’s changing needs with Chef

As described in the pull quote above, the benefit of Chef is that everyone from infrastructure engineers to developers can express their product as code, which can be version controlled and deployed on both your internal (on-premises) and external (cloud) solutions.

It's a very powerful tool with huge ramifications for fledgling or aging organizations.

Before we get too deep into the woods on Chef, the philosophy of infrastructure as code, and all the tools Chef opens up to an organization, let's first touch on the installation of the Chef developer tools, which will henceforth be referred to as ChefDK.

Unfortunately, or fortunately, I work in a Windows-based shop, so the following instruction set is designed specifically with the Microsoft OS in mind. 

The following documentation reflects the state of the Chef system at the time of writing. Since Chef at its base is an open source solution, it is constantly changing with product need.

Since I am in love with CLI commands and PowerShell, I avoid the installation guide provided by Chef on their website. While those instructions work fantastically well, I like to script the great majority of my installations so I can call upon them in the future.

With this in mind I am a huge proponent of Chocolatey. Chocolatey easily manages all aspects of Windows software (installation, configuration, upgrade, and uninstallation) and makes scripting installs soooo much easier on Windows. To install Chocolatey:
  1. First, ensure that you are using an administrative PowerShell session.
  2. Run Get-ExecutionPolicy. If it returns Restricted, then run Set-ExecutionPolicy AllSigned or Set-ExecutionPolicy Bypass.
  3. Now run the following command
  4. Set-ExecutionPolicy Bypass; iex ((New-Object System.Net.WebClient).DownloadString('https://chocolatey.org/install.ps1'))
And that's it: after the script has completed its run, Chocolatey will have been installed on your local machine.

From here, installing ChefDK is as simple as finding the package on the Chocolatey site:

From the top right-hand corner of the site, select "Packages".

In the "Search Packages" box type Chef. 

We are looking for the Chef Development Kit, which is the first selection presented. As we can see in the image above, the Chocolatey command that we need to run to install this package is choco install chefdk.

From our PowerShell window, still running in administrator mode, type the above command. 

After being prompted whether you wish for the package to be installed (you do, by the way; hit Y for yes), ChefDK will be installed for you through the Chocolatey tool. 

With the Chocolatey package installed, you can quickly upgrade your ChefDK installation in the future with choco upgrade chefdk, or easily remove the package with choco uninstall chefdk. 
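For reference, the full package lifecycle through Chocolatey:

```powershell
choco install chefdk     # initial install
choco upgrade chefdk     # pull the newest version later on
choco uninstall chefdk   # remove it entirely
```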

Google failed me.... When I was thinking about how to start off this blog post I ran the gamut of jokes, quips and amusing anecdotes, but in the end I felt it fitting to start with my disbelief. In my years of IT, Google has always had all the answers... mind you, not always EXACTLY how I've wanted them, aka: spending an afternoon modifying code, using some random error that was kind of like mine to fix an issue, or getting the answer "you can't do that, you shouldn't have taken on this project/problem in the first place". But this time, nothing. There was plenty of material about JDBC drivers, or Hive with ODBC drivers outside of Amazon Web Services Elastic MapReduce, but nothing that worked for my instance and certainly nothing that was going to help me solve my problem. So I will add to the almighty Google and help it fill the void that I tragically found.

First and foremost, let me explain my situation a little to give you a general overview of what we are doing here. I have an Amazon Web Services (AWS) account and am using it to spin up Elastic MapReduce (EMR) instances. EMR is (according to Amazon):
Amazon EMR is a web service that enables businesses, researchers, data analysts, and developers to easily and cost-effectively process vast amounts of data. It utilizes a hosted Hadoop framework running on the web-scale infrastructure of Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Simple Storage Service (Amazon S3).
Basically, what all this means is EMR is cloud Hadoop. If that doesn't mean anything to you, then you probably aren't really in need of this post to begin with. My issue came when I was trying to use Microsoft Excel to attach to Hive (a component of the Hadoop installation) and view the tables, columns, etc. that were being spit out to me by my Hadoop processes. Mind you, this is an oversimplification of what is going on, but I'd rather get to the good stuff instead of spending all day explaining what very little I know about Hadoop and its processes. So without further ado, here is the process I used to get my ODBC driver set up with Hive on my AWS EMR instance. (That was a mouthful -ed.)

First we will need to download the Hive ODBC driver that is available through AWS. Download the ODBC driver that is necessary for your environment; in my case, that is the 64-bit Windows driver.

Once this has been downloaded we will install it on our machine. This is a pretty straightforward process; just make sure to choose the correct version of the product. Windows is my OS, and 64-bit is what my machine is running at home. Next, accept the EULA, Next, installer location, Next, Install, Finish.

After the installation we will need to launch an EMR cluster; if one is already running, don't worry about this step. NOTE: many times ODBC connections to Hive will call for a Thrift server. By default, if you have Hive installed on your EMR cluster, Thrift will be installed as well and this will be moot.
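If you like launching clusters from the command line, something along these lines will stand up a small Hive-capable cluster (a sketch; the release label, instance sizing and key name are assumptions to adapt to your account):

```console
$ aws emr create-cluster \
    --name "hive-odbc-demo" \
    --release-label emr-5.4.0 \
    --applications Name=Hadoop Name=Hive \
    --instance-type m4.large \
    --instance-count 3 \
    --ec2-attributes KeyName=my_key \
    --use-default-roles
```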

Once we have made sure our new cluster is provisioned and running, we will need to connect to it. We will start with the normal steps that we find on the AWS site for establishing SSH connections:

If using PuTTY, after selecting the Session option on the left, add the Hostname (hadoop@ec2-##-##-##-##.compute-1.amazonaws.com, replacing the hashes with the server's public IP) and the Port (22).

We will then select the plus next to SSH and expand it. From here we will go to Auth. Under "Private key file for authentication", put the key that is associated with the cluster we have launched and added to our "hostname" in the above step. 

Next, click on the "Tunnels" on the left hand side. this is where things will deviate from the original instructions as outlined by Amazon. We will still add our 8157, by default, tunnel for the connection to the web tools for our EMR instance but we will add on additional tunnel. This tunnel will be for the connection we will be making with our ODBC driver to HIVE on the machine. By default this tunnel needs to sit at port 10000. This is determined by the installation version of HIVE on the EMR instance. For our purposes, we will select Local and Auto which can be found BELOW "Destination". In the "Destination" section we will add the hostname of our cluster sans the hadoop@ that proceeds it. We will then at the end of our hostname add :10000 with 10000 being the tunnel we are establishing. ( 10000). In source port we will also add 10000. At the end of the day, prior to clicking "Add", It should all look something like this.  

Once this is to your liking, click "Add" and it will be displayed in the box above "Add new forwarded port:". For the sake of speeding these instructions up we will also add our 8157 port, with Dynamic and Auto selected, as per Amazon's original guidelines. 
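If you would rather skip PuTTY, the equivalent OpenSSH invocation looks something like this (a sketch; substitute your own key file, and the cluster's public DNS name for the hash placeholders):

```console
$ ssh -i ~/.ssh/my_key.pem -N \
      -L 10000:localhost:10000 \
      -D 8157 \
      hadoop@ec2-##-##-##-##.compute-1.amazonaws.com
```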

Once these steps have been completed, select Open to pull up your Hadoop SSH session. 

We will now want to verify that we can reach the port that we just opened up through SSH to the server. The easiest way to do this is with a simple telnet command. So open up command prompt, or PowerShell if you're into that kind of thing, and type telnet localhost 10000. If all goes well you should be presented with an empty prompt box... otherwise you will get an error and will need to check security settings on both your local computer's side and the AWS VPC side. 

Without keen insight into any one given person's environment, getting this telnet session to establish will be different from environment to environment. The best advice I can give you is to make sure your ports are allowed through Windows Firewall, antivirus, AWS VPC security settings (inbound and outbound), and then lastly your company's physical firewall settings. 

Now that the basics are taken care of we can configure the ODBC driver. First, find your ODBC Data Sources (32-bit or 64-bit) program under Administrative Tools and open it. I will be using the 64-bit version of the tool since I have 64-bit Microsoft Excel, which is what I will be accessing the ODBC connection with.

Once this is pulled up, we will create a new ODBC data source by going to "System DSN" and clicking "Add".

In the window presented we will select the "Amazon Hive ODBC Driver" option and click "Finish".

We will now start adding the information we know into the "Amazon Hive ODBC Driver DSN Setup" window. Our "Data Source Name" can be whatever we want; the "Description" is also trivial. For the "Host" we will once again add our hostname sans the hadoop@ prefix (e.g. ec2-##-##-##-##.compute-1.amazonaws.com), and we will change the port to the port that we added for Hive, in our case 10000. For the database we will need to put in whatever database we have created or used (I'm just using default), and for "Hive Server Type" we will put in "Hive Server 1". NOTE: If this does not work, try using "Hive Server 2". I have had problems where "Hive Server 1" does not allow me to put in my user name and password. 

Moving on to Authentication, our mechanism for authentication will be User Name and Password. By default this will be set to user name: emr, password: emr. 

Once these steps are completed click Test. 

If everything is happy, it should return a "TESTS COMPLETED SUCCESSFULLY!" message. Select OK to close the test window and OK again to accept and save the changes to our DSN setup. 

With all these steps completed, it's a simple data source addition into Excel before you are enjoying your Hive data in your Microsoft Excel spreadsheet. You're welcome, Google. 

Back in the day, which was a Wednesday for those of you who are interested, we had a less than wonderful browser called Internet Explorer 8. This browser, like the comedian I stole that last joke from (Dane Cook, if you are interested), was popular only because we didn't know any better and it was easy to get to by default. But like all terrible things, sometimes "tweaking" was necessary in order to get the software to perform like we needed it to. One such "tweak" was the enabling of cookies, which is required more often than not by certain websites. In order to enable cookies in IE 8: 
  1. Open Internet Explorer and go to Tools > Internet Options.
  2. Select the Privacy tab and click "Advanced".
  3. Check "Override automatic cookie handling", set First-party and Third-party Cookies to "Accept", and click OK.