Customers who write their own custom Chef recipes often need to reference information about their environment from the virtual machine that runs those recipes.
When Engine Yard's infrastructure runs Chef, it uses chef-solo. It doesn't make use of chef-server or any other related components from Chef (formerly Opscode). Instead, the platform downloads the latest tar'd and gzipped copy of your recipes as we know them (the ones uploaded via ey recipes upload), removes everything on the instance under /etc/chef-custom, unpacks the archive there, and runs it with chef-solo.
This can happen as a single run on its own using ey recipes apply, or at the end of a full Chef run by clicking "apply" in the Engine Yard Cloud dashboard.
Basic Anatomy of a Chef Run on Engine Yard Cloud
When Chef is run on Engine Yard Cloud, there are two types of recipes:
Main (also called "base") recipes
Custom (your custom chef recipes)
These are run in the order of "main, custom" if you click the "apply" button
on the dashboard. This tells the Engine Yard architecture to form a connection
to the instance, obliterate everything under /etc/chef, download our base recipes
that constitute our "stack", unpack them there, and run them. This happens every
time a base run is fired off.
The second type is the "custom" chef run. This is your custom code that you
upload to Engine Yard Cloud using the ey command from the engineyard
gem. These recipes are uploaded from your computer to our infrastructure. Then,
when the time comes to run those recipes, the same architecture sends a command
to the server(s) in question to obliterate everything under /etc/chef-custom,
download the latest version of your custom chef recipes as we know them, unpack
them in that location, then run that code with Chef.
Knowing this, you should be able to see an obvious caveat off the bat: you can't
(and shouldn't) modify Chef recipes on the instance and expect them to run.
Since the directory where your custom Chef recipes are located will be removed prior
to the next run and replaced with a "fresh" copy of your recipes, you'll lose
any changes you make on the server.
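To make the caveat concrete, here is a small local simulation in plain Ruby. It is not the platform's actual code: the temporary directories only stand in for /etc/chef-custom and for the stored upload, and the helper name simulate_custom_run is hypothetical.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: mimics "wipe the directory, then unpack a fresh copy".
def simulate_custom_run(uploaded_dir, instance_dir)
  FileUtils.rm_rf(instance_dir)                               # everything on the instance goes away
  FileUtils.mkdir_p(instance_dir)
  FileUtils.cp_r(File.join(uploaded_dir, '.'), instance_dir)  # fresh copy of the upload
end

surviving_recipe = nil
Dir.mktmpdir do |tmp|
  uploaded = File.join(tmp, 'uploaded')     # stands in for the recipes you uploaded
  instance = File.join(tmp, 'chef-custom')  # stands in for /etc/chef-custom
  FileUtils.mkdir_p(uploaded)
  File.write(File.join(uploaded, 'default.rb'), "# uploaded version\n")

  simulate_custom_run(uploaded, instance)
  # Someone edits the recipe directly on the instance...
  File.write(File.join(instance, 'default.rb'), "# hand-edited on the server\n")
  # ...and the next run replaces it with the uploaded copy.
  simulate_custom_run(uploaded, instance)

  surviving_recipe = File.read(File.join(instance, 'default.rb'))
end

puts surviving_recipe
```

The hand edit never survives the second run, which is why changes belong in your local copy of the recipes, followed by a fresh upload.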
Finally, there's a bit of a "loophole" to the order in which these recipes are
run as mentioned earlier. If you use the "apply" button, recipes do indeed run
in the "main, custom" order, waiting for a full run of the base recipes that we
provide, regardless of environment state, before running your custom chef
recipes. This can be undesirable in various situations, so you can use the
engineyard gem to run only your custom recipes with the following command:
ey recipes apply -c YourAccountName -e YourEnvironmentName
This will not upload recipes on your computer to the cloud; instead it will
simply instruct our platform to execute a custom chef run on the environment
specified. Base recipes are not run in this particular case.
Every time a custom Chef run happens, a file located at /etc/chef-custom/dna.json is read. That file is put into place by the Engine Yard Cloud automation system and contains a great deal of information about the instances in your environment. Below is an example of what that dna.json file might look like. It serves as a reference for anyone who needs to look up the structure.
Note that the values for these keys have been removed and replaced with descriptive values. In some cases, another key called "_comment" was added to explain an entire "block" of data; these don't exist in the actual dna.json file.
For another reference of this same file, feel free to use this GitHub gist.
Examining the node object
So having this JSON file is great, but how do we actually use it? How do you get at the information contained herein?
When Engine Yard's Chef run starts, it reads this file and places the keys into a hash called node. This hash can then be referenced using Ruby symbols to find the value you're looking for. Here are a few example key/value pairs that are frequently used.
Instance name. For example, if you have a util named "elasticsearch", this value would evaluate to "elasticsearch" on that specific instance.
Instance role. Could be one of util, app, app_master, db_master or db_slave.
Contains an array of hash objects representing the instances in the environment. For example, if you have five instances in your environment total (say an app master, standard application, a database master, database replica and a util), this is going to be an array of 5 hashes, where each object represents one of the five aforementioned hypothetical instances.
Contains an array of hashes that represent all applications deployed on the target environment. Remember, you can deploy more than one application to a single environment in Engine Yard Cloud (although it's not always recommended due to memory contention issues, it's still possible). For most dedicated production environments, this should only have one object in it, but you could have more than one if you deployed more than one app to this environment.
Name of the environment. For example, "myapp_production". This is what you called it when you first created it in the dashboard.
The framework_env variable. Usually something like "staging" or "production". Could be arbitrary depending on what you put in the dashboard (we've seen some people do "development", "test", "qa", "acceptance", etc.).
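As a rough sketch of how these lookups read in Ruby, the snippet below parses a tiny, made-up stand-in for dna.json and reads each of the values described above. The key names (name, instance_role, environment, engineyard) and every value are assumptions for illustration; confirm them against the dna.json on your own instance before relying on them.

```ruby
require 'json'

# Hypothetical, heavily trimmed stand-in for /etc/chef-custom/dna.json.
# Key names and values are assumptions for illustration; check your own file.
dna = JSON.parse(<<~JSON, symbolize_names: true)
  {
    "name": "elasticsearch",
    "instance_role": "util",
    "environment": { "name": "myapp_production", "framework_env": "production" },
    "engineyard": {
      "environment": {
        "instances": [
          { "role": "app_master" },
          { "role": "db_master" },
          { "role": "util", "name": "elasticsearch" }
        ],
        "apps": [ { "name": "myapp" } ]
      }
    }
  }
JSON

# Inside a recipe the same lookups would be made on `node` instead of `dna`.
node = dna
puts node[:name]                         # instance name
puts node[:instance_role]                # instance role
puts node[:environment][:name]           # environment name
puts node[:environment][:framework_env]  # framework_env
puts node[:engineyard][:environment][:instances].length
puts node[:engineyard][:environment][:apps].map { |app| app[:name] }.inspect
```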
A few examples
Personally, I don't learn much from reading documentation - I need to see simple code samples. What follows is my best attempt to give you some simple examples that you can start to use right away.
Bear in mind, Engine Yard's truly epic support team maintains a ready-to-use Chef cookbook repository on GitHub at https://github.com/engineyard/ey-cloud-recipes. THESE ARE NOT OFFICIALLY SUPPORTED so don't ask our support team for help with those recipes. If you find a bug, submit a pull request. These are an easy way to get off the ground, but won't be a perfect fit for every project or team. Use them as training wheels and a scaffold to get you started, then spruce things up your own way for your own purposes later on.
Conditional logic on instance name and role (or type)
Let's say you want a Chef recipe to run on any instance named "elasticsearch" something or other.
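A first, deliberately naive sketch of that check might look like the following. It assumes the instance name lives under the :name key of node, and the helper name elasticsearch_by_name? is hypothetical. A plain hash stands in for node so the snippet runs outside of Chef.

```ruby
# Hypothetical helper: true when the instance name contains "elasticsearch".
# `node` is any hash-like object; the :name key is an assumption.
def elasticsearch_by_name?(node)
  !node[:name].to_s.match(/elasticsearch/).nil?
end

# In a recipe you would wrap your resources in:
#   if elasticsearch_by_name?(node)
#     ...
#   end
named   = elasticsearch_by_name?({ :name => 'elasticsearch' })
unnamed = elasticsearch_by_name?({ :name => nil })
puts named
puts unnamed
```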
This isn't necessarily the best way to go about this, however. We're missing at least one if conditional that we should have to avoid accidents. What if somebody on your team accidentally names one of your database replicas "elasticsearch", editing the wrong object for some reason? You want to be more explicit here - match the instance_role, too:
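A stricter sketch, again with a plain hash standing in for node, checks both values. The :name and :instance_role keys are assumptions about the dna.json layout, and elasticsearch_util? is a hypothetical helper name.

```ruby
# Hypothetical helper: true only for a util instance named exactly "elasticsearch".
def elasticsearch_util?(node)
  node[:instance_role] == 'util' && node[:name] == 'elasticsearch'
end

intended   = elasticsearch_util?({ :instance_role => 'util',     :name => 'elasticsearch' })
accidental = elasticsearch_util?({ :instance_role => 'db_slave', :name => 'elasticsearch' })
puts intended
puts accidental
```

With the role check in place, a mislabeled database replica no longer picks up the recipe.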
However, let's say you want to add more than one elasticsearch machine for some reason. You should probably number them - e.g. elasticsearch_1, elasticsearch_2, and so on. Or, suffix them with an explicit purpose: elasticsearch_customers, elasticsearch_invoices, and so on.
In this case, our String#match statement isn't going to work that well. We need to expand it to do the right thing on the right virtual machines.
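One possible way to expand the match is an anchored pattern that accepts the bare name or the name followed by an underscore suffix. This is a sketch under the same assumptions as before (:name and :instance_role keys); the constant and helper names are hypothetical.

```ruby
# Accepts "elasticsearch", "elasticsearch_1", "elasticsearch_customers", ...
# but rejects names that merely contain the word somewhere else.
ELASTICSEARCH_NAME = /\Aelasticsearch(_\w+)?\z/

# Hypothetical helper: util instances whose name fits the pattern above.
def elasticsearch_node?(node)
  node[:instance_role] == 'util' && !ELASTICSEARCH_NAME.match(node[:name].to_s).nil?
end

candidates = %w[elasticsearch elasticsearch_1 elasticsearch_customers not_elasticsearch]
matches = candidates.select do |name|
  elasticsearch_node?({ :instance_role => 'util', :name => name })
end
puts matches
```

Anchoring with \A and \z keeps an instance such as not_elasticsearch from matching by accident.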
Make your bash prompt more useful
When you SSH to an instance, normally you'll just get a basic prompt that doesn't necessarily contain the information you want to know about. This is fine for small environments, but larger, more complex ones could use a little more... panache.
To build a more informative prompt, we're going to use two files in a single Chef cookbook: a template and the recipe itself.
Create the following directory structure:
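If it helps, the layout can be scaffolded with a few lines of Ruby. The cookbook name bash_prompt and the template file name bashrc.erb are hypothetical; the recipes/ and templates/default/ directories follow Chef's usual cookbook conventions. The demo writes into a temporary directory so it is harmless to run as-is.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: lays out one cookbook with a default recipe and one template.
def scaffold_cookbook(root, cookbook_name, template_name)
  cookbook = File.join(root, 'cookbooks', cookbook_name)
  recipe   = File.join(cookbook, 'recipes', 'default.rb')
  template = File.join(cookbook, 'templates', 'default', template_name)
  [recipe, template].each do |path|
    FileUtils.mkdir_p(File.dirname(path))
    FileUtils.touch(path)
  end
  [recipe, template]
end

created = Dir.mktmpdir do |root|
  # Pass the path of your recipes repository instead of `root` to create the files for real.
  scaffold_cookbook(root, 'bash_prompt', 'bashrc.erb')
    .select { |path| File.file?(path) }
    .map { |path| path.sub("#{root}/", '') }
end

puts created
```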
The first task is to build out our ~/.bashrc. Let's do that with a template now, and then we can populate the variables later.
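A minimal sketch of such a template, rendered here with Ruby's standard ERB library so the result can be previewed outside of Chef. The variable names (@environment_name, @instance_role, @instance_name) and the prompt layout are assumptions for illustration; in Chef, values passed to a template through variables are available as instance variables like these. The TemplateContext class is a hypothetical stand-in for what Chef does when it renders.

```ruby
require 'erb'

# Contents you might place in the cookbook's .erb template file.
BASHRC_TEMPLATE = <<~'TEMPLATE'
  # Managed by Chef: edits made on the instance are lost on the next run.
  export PS1='[<%= @environment_name %>/<%= @instance_role %>:<%= @instance_name %>] \u@\h:\w\$ '
TEMPLATE

# Hypothetical stand-in for Chef's template rendering: exposes each
# variable to the template as an instance variable.
class TemplateContext
  def initialize(variables)
    variables.each { |name, value| instance_variable_set("@#{name}", value) }
  end

  def render(template)
    ERB.new(template).result(binding)
  end
end

rendered = TemplateContext.new(
  :environment_name => 'myapp_production',
  :instance_role    => 'util',
  :instance_name    => 'elasticsearch'
).render(BASHRC_TEMPLATE)

puts rendered
```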
Now we have to tell Chef what those variables are supposed to be. Enter the recipe itself.
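The recipe's main job is to pull the right values out of node and hand them to the template. The sketch below shows that step as plain Ruby with a sample hash in place of node. The key names, the helper name bash_prompt_variables, the deploy user, and the /home/deploy path are all assumptions; adjust them to match your own instances.

```ruby
# Hypothetical helper: collects the values the prompt template needs.
def bash_prompt_variables(node)
  {
    :instance_name    => node[:name],
    :instance_role    => node[:instance_role],
    :environment_name => node[:environment][:name]
  }
end

# In the recipe itself, the hash would be handed to a template resource,
# roughly like this (Chef DSL, left as a comment because it only runs under Chef):
#
#   prompt_variables = bash_prompt_variables(node)
#   template '/home/deploy/.bashrc' do
#     source 'bashrc.erb'
#     owner  'deploy'
#     mode   '0644'
#     variables(prompt_variables)
#   end

sample_node = {
  :name          => 'elasticsearch',
  :instance_role => 'util',
  :environment   => { :name => 'myapp_production' }
}
prompt_variables = bash_prompt_variables(sample_node)
puts prompt_variables.inspect
```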
Now that you have these files written and in the proper directories, use the engineyard gem to upload them to your environment and apply them:
ey recipes upload -c YourAccountName -e YourEnvironmentName --apply
If you look in the dashboard, you should see the instances "spinning", which indicates Chef is running. In a few minutes it should be done, and you can SSH up to an instance to see your new bash prompt.
This is a cursory overview with some examples of the information you can get from dna.json. If you want to dig into it, I would suggest you scp the file from your instance(s) to your local machine and take a look around at it. Hint: an editor like Atom makes collapsing/browsing the huge JSON hash that is dna.json much easier.
You should now have dna.json on your desktop (OS X, desktop Linux distributions). Open it in any editor and poke around.
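If you would rather get a quick overview from the command line, a short Ruby script can print an outline of the keys. This is only a sketch: the DNA_JSON_PATH environment variable and the key_outline helper are hypothetical, and the inline sample is made up so the script still runs when no file is supplied.

```ruby
require 'json'

# Hypothetical helper: returns an indented outline of the keys in a parsed
# JSON document, descending at most `max_depth` levels.
def key_outline(value, max_depth, depth = 0)
  return [] if depth >= max_depth

  case value
  when Hash
    value.flat_map do |key, child|
      ["#{'  ' * depth}#{key}"] + key_outline(child, max_depth, depth + 1)
    end
  when Array
    # Arrays of hashes usually share a shape, so only the first element is outlined.
    key_outline(value.first, max_depth, depth)
  else
    []
  end
end

# Read the downloaded file when DNA_JSON_PATH points at it, otherwise use a tiny made-up sample.
path = ENV['DNA_JSON_PATH']
sample = '{"environment": {"name": "myapp_production"}, "instance_role": "util"}'
json = path && File.exist?(path) ? File.read(path) : sample

outline = key_outline(JSON.parse(json), 2)
puts outline
```

Raising the depth argument shows more of the nesting; two levels is usually enough to find the section you are after.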
Some Common Questions and Answers
"Do I need to 'sudo' my commands in Chef?"
No, you don't. Chef, when it runs on Engine Yard Cloud, will run as the root user so you won't need to "sudo" anything.
"I get an error when running Chef that monit can't do something because it's trying to restart something else"
This commonly happens because monit - our system monitoring daemon - is busy trying to restart something else. I recommend taking a look at the syslog:
tail -f /var/log/syslog
to see if you can find monit repeating a loop of restarting the same thing and failing each time. This usually happens if the init.d system in Gentoo starts to act up. There are some cases where /etc/init.d/<servicename> start will claim a process is already started, but it doesn't show up when running ps auxfww. In that case, work through the following steps:
Run sudo monit unmonitor all -g <groupname>, where groupname is the name of the group given in the service's monitrc file, located at /etc/monit.d/<servicename>.monitrc.
Once you've unmonitored that, run sudo /etc/init.d/<scriptname> zap to force init.d to reset that service to a "stopped" state.
Re-run ps auxfww | grep -i <service, partial string match> to confirm it's not running.
If it isn't, run /etc/init.d/<servicename> start to see if it starts, then verify with ps again.
Once that's done, run monit monitor all -g <groupname> to tell monit to start watching it again.
Run tail -f /var/log/syslog to confirm that monit can start it correctly and doesn't flip out anymore.
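As a checklist, the recovery sequence described above can be written out for a specific service with a few lines of Ruby. The helper only builds the command strings in order and executes nothing. It assumes the init script and the service share one name, and the nginx service and group names in the usage line are purely hypothetical.

```ruby
# Hypothetical helper: returns the recovery commands from the text, in order.
# Nothing is executed; run the commands by hand and check the output of each.
def monit_recovery_commands(service, group)
  [
    "sudo monit unmonitor all -g #{group}",  # stop monit from restarting the service
    "sudo /etc/init.d/#{service} zap",       # reset init.d's idea of the service state
    "ps auxfww | grep -i #{service}",        # confirm it is not running
    "/etc/init.d/#{service} start",          # start it again
    "ps auxfww | grep -i #{service}",        # confirm it is running now
    "monit monitor all -g #{group}",         # hand it back to monit
    "tail -f /var/log/syslog"                # watch for a restart loop
  ]
end

commands = monit_recovery_commands('nginx', 'nginx')
puts commands
```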
"Something's wrong and I'm not sure what. Where are the logs?"
There are two ways to get at your Chef logs. The easiest is to go to the dashboard (https://cloud.engineyard.com/) and click on the environment in question. Any server where there was a Chef failure should show a red exclamation mark. Click the "Custom Logs" link on that server to see the custom logs.
If for some unknown reason that doesn't seem to have the correct information, SSH up to the instance and look at /var/log/chef.custom.log. This is a symbolic link to the latest custom Chef run log which should be date stamped. You can also see the main run log there in the same path - chef.main.log.
Now that you have a better feel for the node object that Engine Yard Cloud's Chef run will make available to you, you can start looking at other documentation we have on our version of Chef.
One thing to make note of is that Engine Yard totally side-steps the need for chef-server by having our own infrastructure setup. In short, it works like this:
You write your own custom Chef code, then use the engineyard gem to upload to us: ey recipes upload.
The gem runs a tar command on the recipes, then gzips them and sends them to Engine Yard, where we securely store the recipes for later use.
When the time comes for running those recipes, our automation goes to your instance(s), deletes everything under /etc/chef-custom (where your custom recipes go), downloads the most recently uploaded copy of your recipe's compressed tarball, decompresses it into /etc/chef-custom, then runs chef-solo against those recipes.
This means that there's no need for knife, chef-server or the like. It also means that there may be some differences between our Chef and the canonical Chef from Opsco-err, "Chef" as they are now known.
Now you are armed with some knowledge about Chef on Engine Yard! Here are some additional resources: