SHOCKER: Vagrant base box using RPMs

I’ve been using some of the publicly available base boxes as a starting point for lots of Vagrant VMs recently, but came unstuck when the version of puppet on the base box was substantially different from our production environment (2.7 vs 2.6.8 in production). (I was working on alt_gem, an alternate package provider for maintaining gems outside the RVM in use by puppet.)

At first I thought it would be simple enough to downgrade puppet on one of my Vagrant VMs, but then I discovered that nearly all of the CentOS/Red Hat Vagrant boxes install ruby & puppet from tarballs, which is, frankly, balls; shouldn’t we be using packages for everything?! (Kris Buytaert says so, so it must be true.)

So instead of ranting, I tweaked an existing veewee CentOS template to install puppet & chef from RPMs: for puppet it uses the official Puppet Labs yum repo, and for chef it uses the FrameOS packages. (I’m a puppet user, so I’ve only tested the puppet side; chef at least passes the “veewee validate” tests.)

You can grab the box here:


To use it in your Vagrant config, make sure this is in your Vagrantfile:

  # Every Vagrant virtual environment requires a box to build off of.
  config.vm.box = "CentOS-56-64-packages"

  # The url from where the '' box will be fetched if it
  # doesn't already exist on the user's system.
  config.vm.box_url = ""


I’ve sent a pull request to Patrick to get the new template included in veewee, and a pull request to Gareth to get the box listed.

Now, time to go back to what I was doing originally before I got side tracked 🙂



An Average Day

There is no such thing as an average or normal day, but here’s what yesterday looked like (the first day back after a 6-day break):

  • 1000-1200 Wiki Gardening – moving some WIP from my head/evernote to team wiki pages
  • 1200-1230 Monitoring & Measuring Catchup – a quick check around on the stuff we don’t get alerted about, checkup on some new nodes I added to cacti before I finished up last week.
  • 1230-1330 Reading – closing out a bunch of open tabs etc
  • 1330-1430 Lunch (the joy of working from home, soup & sandwiches with family:-))
  • 1430-1600 Open CM tickets for upcoming changes, upgrade Puppet Dashboard, publish some more strategy information on our internal wiki.  Work through the dreaded email backlog.
  • 1600-1700 Baby Dr Appointment
  • 1700-1800 Weekly Team conference call (mostly around some major work scheduled for this weekend)
  • 1800-2100 Family Time – help the kids tidy up & get them to bed, get something to eat
  • 2100-2200 Finish up some puppet work, mostly tidying up & committing some work to git; review steps for some upcoming work with a Pacific Time colleague
  • 2200-2300 TV Break  – Michael McIntyre 🙂
  • 2300-0030 More follow-up with PT colleague: ironed out a plan for moving from DAS to NAS for a pilot set of machines, committed the plan to the wiki for tracking, discussed the general meanness of some of the people we work with.
After racking my brain trying to remember what I did, I’ve re-installed RescueTime.

Faking Production – database access

One of our services has been around for a while, a really long time.  It used to get developed in production, so there is an awful lot of work involved in making the app self-contained, to where it can be brought up in a VM and run without access to production or some kind of fake supporting environment.  There’s lots of stuff hard coded in the app (like database server names/IPs etc.), and there’s a lot of code designed to handle inaccessible database servers in some kind of graceful manner.

We’ve been taking bite-sized chunks out of all of this over the last few years, and we’re on the home straight.

One of the handy tricks we used to make this application better self-contained was to avoid changing the database access layer (hint: there isn’t one) and just use iptables to redirect requests for the production database servers to either local empty database schemas on the VM, or shared database servers with realistic amounts of data.

We manage our database pools using DNS (PowerDNS with a MySQL back end).  In production, if you make a DNS request for a pool name, you will get 3+ IPs back: one will be in your datacentre, the others in other datacentres.  The app has logic for selecting the local DB first, and using an offsite DB if there is some kind of connection issue.  We also mark databases as offline by prepending the relevant record in MySQL with OUTOF, so that a request for the pool name returns only the in-service IPs, and a DNS request for the OUTOF-prefixed name returns any DB servers marked out of service.
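The OUTOF convention is easy to poke at from Ruby’s stdlib resolver.  Here’s a minimal sketch; the pool name is a hypothetical placeholder, since the real hostnames are internal:

```ruby
require "resolv"

# Resolve a DB pool name into in-service and out-of-service IP lists,
# following the OUTOF-prefix convention described above.
def pool_status(pool, resolver: Resolv::DNS.new)
  {
    in_service:     resolver.getaddresses(pool).map(&:to_s),
    out_of_service: resolver.getaddresses("OUTOF" + pool).map(&:to_s),
  }
end

# Hypothetical pool name -- the real pool FQDNs are internal.
status = pool_status("db-pool.example.com")
puts "in service:     #{status[:in_service].join(', ')}"
puts "out of service: #{status[:out_of_service].join(', ')}"
```

Handily, `Resolv::DNS#getaddresses` returns an empty array rather than raising when a name doesn’t exist, so a pool with nothing marked OUTOF just comes back empty.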

Why am I telling you all of this?  Well, it’s just not very straightforward for us to update a single config file and have the entire app start using a different database server.  (Fear not, our production databases aren’t actually accessible from the dev environments.)

But what we can do is easily identify the IP:PORT combinations that an application server will try to connect to.  And once we know those, it’s pretty trivial to generate a set of iptables statements that will quietly divert that traffic elsewhere.

Here’s a little ruby that generates some iptables statements to divert access to remote production databases to local ports, where you can either use ssh port-forwarding to forward on to a shared set of development databases, or run several local empty-schema MySQL instances:

require 'rubygems'
require 'socket'

# map FQDNs to local ports
# (placeholder hostnames -- the real pool FQDNs are internal)
fqdn_port = {}
fqdn_port['db-pool-1.example.com'] = 3311
fqdn_port['db-pool-2.example.com'] = 3312
fqdn_port['db-pool-3.example.com'] = 3314

fqdn_port.each do |fqdn, port|
  puts '#'
  puts "# #{fqdn}"
  # addresses for this FQDN
  fqdn_addr = []

  # get the in-service addresses for the FQDN
  addr = TCPSocket.gethostbyname(fqdn)
  addr[3, addr.length].each { |ip| fqdn_addr << ip }

  # ...and any addresses marked out of service
  begin
    addr = TCPSocket.gethostbyname('OUTOF' + fqdn)
    addr[3, addr.length].each { |ip| fqdn_addr << ip }
  rescue SocketError
    # no OUTOF record for this pool
  end

  fqdn_addr.each do |ip|
    puts "iptables -t nat -A OUTPUT -p tcp -d #{ip} --dport 3306 -j DNAT --to{fqdn_port[fqdn]}"
  end
end

And yes, this only generates the statements; pipe the output into bash if you want the commands actually run.  Want to see what it’s going to do?  Just run it.  Simples.

State of the Java Onion

I’m sitting on my flight home from my first devopsdays in Göteborg, so firstly, many thanks to the awesome Patrick Debois, Ulf & the many, many others that put the effort into organising the conference, and to everybody that turned up and made the event so worthwhile! My primary reason for going was to hear other people’s experiences with configuration management and general ops deployment. (I’m in the process of adding puppet to our large legacy LAMP stack.)

I kind of expected to be the fuddy-duddy in the room (my group runs 4 SaaS services: our largest is a LAMP+JBoss SIP stack, plus a Solaris/Tomcat/Oracle/Coherence stack, a Linux/Tomcat/MySQL stack and an Apache/Weblogic/Cognos/Oracle stack, all hosted on our own hardware, how retro), so I was prepared to hear stories of how easy it was to deploy services built on modern interpreted stacks to the cloud.  Instead I was pleasantly surprised to hear that plenty of people are using java application servers of all shapes & sizes in production.  I was less pleased to hear, but somewhat comforted, that everybody running java stacks in production is suffering pain somewhere (damn, no silver bullet to take home).


Deployment Pain

Lots of people were good enough to share their success & horror stories about how their current java stacks get into production; the recurring topics are covered in the sections below.


I think this deserved a talk or open space of its own, but John E. Vincent covered chunks of it in his great tools talk, and it came up in the “deploying java artifacts” open space.

I’ve got some take-away reading to do about tools like Apache Whirr & UrbanCode’s deployment & configuration tools.  Everybody has similar problems: needing a controlled, reliable method of automating the pre & post deployment steps (traffic bleed off, deploy, service verification, data load, back in service) and managing service availability during the deployment (or managing the stress on systems affected by the post-deployment steps).
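Those pre & post deployment steps can be sketched as a simple pipeline.  Everything below is hypothetical scaffolding (the class, step names and host name are mine, not any particular tool’s API); the stubs mark where a real implementation would talk to the load balancer, the artifact repository and a health-check endpoint:

```ruby
# A minimal sketch of a controlled deployment pipeline: each step runs
# in order and is recorded, so a failure shows exactly where you stopped.
class Deployment
  STEPS = [:bleed_off_traffic, :deploy_artifact, :verify_service,
           :load_data, :return_to_service]

  def initialize(host)
    @host = host
    @log  = []
  end

  attr_reader :log

  def run
    STEPS.each do |step|
      @log << step
      send(step)
    end
    @log
  end

  private

  # Stub steps -- hypothetical; replace with real hooks.
  def bleed_off_traffic; end  # e.g. mark host down on the LB, drain sessions
  def deploy_artifact;   end  # e.g. push the war/ear & restart the AS
  def verify_service;    end  # e.g. poll a health-check URL until it's happy
  def load_data;         end  # e.g. post-deploy data load / cache warming
  def return_to_service; end  # e.g. mark host up on the LB again
end

puts Deployment.new("app01").run.inspect
```

The point isn’t the code, it’s that the step list is explicit and ordered, rather than living in somebody’s head.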

Hot/Cold deployments?

In general, hot deployments never seem to work reliably as planned.  They are highly desirable for some services due to session requirements, but most people observed that hot deployments are prone to problems, often leaking memory, so hot deployment is something you can only get away with a few times, if at all, depending on your memory overhead.
The guys from ZeroTurnaround demoed their latest JRebel/LiveRebel tools. JRebel is a developer-focused tool that allows a jar to be hot-updated in a running JaS, for quicker iterative java development. LiveRebel is built on similar technology but aimed at production use, to do hot updates (I’m not sure how this differs from the hot deployment of wars & ears etc., but that’s a gap in my JaS understanding).

war/ear or exploded webapps directory?

Currently we do both, and each has its pros & cons: exploded webapp directories have a tendency to build up undocumented cruft essential to smooth running, and war/ear deploys have a tendency to break your heart with environment issues (what do you mean we need a new build to use a different database server!?).

For our next service going into production, we need to be able to vary the number of tomcats running on a physical host, each running on a different port.  To support this we’ve extended our existing in_service hook, which in our simpler environments just lets the load balancer know that this host is now good to take traffic.  Now it will also build out the multiple tomcat CATALINA_HOME trees from scratch, going as far as grabbing the ant & tomcat tarballs required (version numbers are pulled from a central config db, allowing per-host overrides for piloting versions on individual machines).  The aim here is two-fold: have a clearly documented process for building a working CATALINA_HOME, and be able to dynamically vary our tomcat count without lots of manual preparation.
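The per-instance layout part of that hook looks roughly like this sketch.  The function name, paths and base port are all hypothetical; the real hook also fetches the ant & tomcat tarballs and reads versions from our config db:

```ruby
require "fileutils"

# Lay out N per-instance CATALINA_HOME trees under base_dir, each with
# its own port, so multiple tomcats can run on one physical host.
def tomcat_instances(base_dir, count, base_port: 8080)
  (1..count).map do |n|
    home = File.join(base_dir, "tomcat#{n}")
    # Build the minimal directory skeleton Tomcat expects to find.
    %w[bin conf logs temp webapps work].each do |d|
      FileUtils.mkdir_p(File.join(home, d))
    end
    { home: home, port: base_port + n - 1 }
  end
end

# Demonstrate against a throwaway directory.
require "tmpdir"
Dir.mktmpdir do |dir|
  tomcat_instances(dir, 3).each do |i|
    puts "#{i[:home]} -> port #{i[:port]}"
  end
end
```

In the real hook each instance’s server.xml would also be templated with its assigned port; the useful property is that the whole tree is rebuildable from scratch, so nothing undocumented can accumulate in it.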

Environment & Configuration

Lots of issues & lots of different solutions to this one.  Best case: the war file ships with 3 environments configured, defaulting to production, with an override on the command line for other environments (downside: production passwords are in the artefacts & therefore in source control).

There was some discussion about externalising the config, the various methods for doing so (XML includes in context.xml/server.xml), and providing a standard-ish API to get/set properties (some commercial JaS already do this).

Horror stories included a deploy process that required a start to explode the war, then stop, remove the war, fix the config, and restart.  Another involved a post-restart data load that took 30-40 minutes before the tomcat was ready for traffic again.

General consensus was that involving dev in more of the ops deployment pain helped highlight areas that needed improvement.

Config/Properties APIs

There was a little bit of discussion around APIs for managing configs in a running AS.  Some of the commercial AS already have this, but there was little support around the room for single-vendor solutions, although most agreed that practically nobody changes AS after initial selection; the desire for a single tool was about gaining momentum instead of fragmenting tooling.

No Ops in the Java Servlet Steering Committee?

(I missed the exact names of the standards & people involved; I’ll update this if you have specifics.)
One of the participants in the java artifacts open space is on the Java Community Process mailing list.  He pointed out that there was practically no one representing ops on the ML, and some of the proposed changes horrified the ops people in the room.

the platform/application split

There were some ardent supporters of deploying nothing but packages: people using FPM and other tools to build RPMs and other packages of the AS, with another package for the application.

Part of this also comes down to your orchestration tools and how you run your CM.  Ramon of Hyves highlighted that they don’t run puppet continuously; they run it once a day, due to orchestration requirements and scale (they have 3,000+ application servers in production).

Most people agreed that the CM goes as far as preparing the AS for an application to be dropped in, although most of these environments ran a single AS per host.

Windows 7 Essentials

I’ve just rebuilt my laptop (a combination of McAfee Whole Disk Encryption slowing the current build down & a Crucial RealSSD 128GB that was too cheap to resist forced me to, honest guv), so it’s time to refresh & re-document the essential software list:

  1. Windows 7 Professional 64bit
  2. VistaSwitcher (better alt-tab)
  3. WindowSpace (snap windows to screen edges & other windows, extended keyboard support for moving/resizing)
  4. Launchy
  5. Thunderbird 6
    1. Lightning (required for work calendars)
    2. OBET
    3. Provider for Google Calendar (so I can see my personal calendar)
    4. Google Contacts (sync sync sync)
    5. Mail Redirect (bounce/redirect email to a ticketing system)
    6. Nostalgy (move/copy mail to different folders from the keyboard)
    7. Phoenity Shredder or Littlebird (the default theme is a bit slow, these are lighter and quicker)
    8. Hacked BlunderDelay & mailnews.sendInBackground=true
  6. Chrome + Xmarks
  7. Xmarks for IE
  8. Evernote
  9. Dropbox & Dropbox Folder Sync
  10. PuTTY (remember to export HKEY_CURRENT_USER\Software\SimonTatham\PuTTY\Sessions)
  11. WinSCP
  12. Pidgin + OTR
  13. gVim
  14. Cisco AnyConnect (main work VPN)
  15. Cisco VPNClient (backup & OOB VPN)

I think that’s it for now.

Apple Mac Toolbox

Following up on my recent post on the engineer’s toolbox, I’ve just rebuilt my Apple MacBook (a newer, bigger hard disk was the perfect opportunity for a fresh Snow Leopard install and to fix some annoying iPhoto index & thumbnail corruption), so here’s my list of essentials for my MacBook, in no particular order:

  • Evernote
  • SpanningSync
  • Dropbox
  • Xmarks for Safari
  • Xmarks for Firefox
  • Panic Coda
  • Thunderbird 3.x
  • Adium
  • VMWare Fusion
  • iPhoto
  • iMovie
  • QuickSilver
  • Microsoft Office
  • Google Chrome
  • Flickr Uploader
  • Skype
  • Google Picasa
  • Get iPlayer Automator
  • Cyberduck
  • MacVim
  • Spotify
  • iSquint
  • AudioHub
  • VisualHub
  • SuperSync
  • TweetDeck
  • ClickToFlash
  • Growl

What’s changed?

Matt Johnston commented recently that the recent surge in activity in the community side of the local tech & business scene could be “the ‘real’ end of the ‘Troubles’?”.  It’s definitely a positive thing, I’m delighted the next generation of technologists in Northern Ireland has a growing & diverse community around them.  Something that was sorely lacking in my formative years, where it seemed that the only exposure to technology was from inside the technology firm you worked in.  [and I’m a committed technologist, not a 9-5 salary man].  So what’s changed?

Many of us are of a similar age, all in full-time IT roles from the mid-90s onwards, some for much longer.  Is it the relatively recent additions that have invigorated us, people like the hyper-active Andy McMillen?  What’s caused “the old guard” like Matt to push on with xcake & startvi, or Colm & Norbert to persevere with MobileMondayBelfast, or Darryl & the first Open Coffee Belfast?

Surely none of us would admit to letting Northern Ireland’s previous problems get in the way of the way we lead our lives?

So what changed?  How do we make sure we don’t lose momentum?

How much would your life have changed if you had had the community & adventure surrounding you 10-15 years ago, when you first discovered your passion for technology could pay the bills?  Would you have endured the 10 years in big, faceless corporations?  [How did we get brainwashed into thinking that the best IT career involved one of 3 or 4 companies in NI?]

Would I still be doing what I’m doing now? Probably, but probably not for who I’m doing it for.  And I hope I would have had a much more interesting & independent path here.