I’m siting on my flight home from my first devopsdays in Goteborg, so firstly, many thanks to the awesome Patrick Debois, Ulf & many many others that put the effort in to organising the conference, and everybody that turned up and made the event so worth while! My primary reason for going was to hear other people’s experience with configuration management and general ops deployment experience. (I’m in the process of adding puppet to our large legacy LAMP stack)
I kind of expected to be the fuddy duddy in the room (my group runs 4 SaaS services, our largest is a LAMP+JBoss SIP stack, a Solaris/Tomcat/Oracle/Coherence stack, a Linux/Tomcat/MySQL stack and a Apache/Weblogic/Cognos/Oracle stack, all hosted on our own hardware, how retro), so I was prepared to hear stories of how easy it was to deploy services built on modern interpreted stacks to the cloud, but I was pleasantly surprised to hear that plenty of people are using java application servers of all shapes & sizes in production. I was less pleased to hear, but somewhat comforted, that everybody running java stacks in production is suffering pain somewhere (damn, no silver bullet to take home).
Lots of people were good enough to share their success & horror stories about how their current java stacks get into production, some of the recurring topics:
I think this deserved a talk or open space on it’s own, but John E. Vincent covered chunks of this in his great tools talk, and it came up in the “deploying java artifacts” open space.
I’ve got some take away reading to do about tools like Apache Whirr & UrbanCode’s deployment & configuration tools, but everybody has similar problems, needing a controlled, reliable method of automating the pre & post deployment steps (traffic bleed off, deploy, service verification, data load, back in service) and managing the service availability during the deployment (or managing the stress on systems affected by the post deployment steps)
In general, hot deployments never seem to work as planned reliably, hot deployments are highly desirable for some services due to session requirements, but most people observed that hot deployments are prone to problems, leaking memory on many occasions, leading to hot deployments being something you can only get away with a few times, if at all depending on your memory overhead.
The guys from zeroturnaround demoed their latest jrebel/liverebel tools. JRebel is a developer focused tool that allows a jar to be hot updated in a running JaS, for quicker iterative java development. LiveRebel is built on similar technology but aimed at use in production, to do hot updates (I’m not sure how this differs from the hot deployments of war & ear etc, but that’s a gap in my JaS understanding)
war/ear or exploded webapps directory?
Currently we do both, and each have their pros & cons, exploded webapp directories have a tendency to build up undocumented cruft essential to the smooth running, and war/ear deploys have a tendency to break your heart with environment issues (what do you mean we need a new build to use a different database server!?)
For our next service going into production, we need to be able to vary the number of tomcats running on a physical host, each running on a different port, to support this we’ve extended our existing in_service hook, which in a our simpler environments just lets the load balancer know that this host is now good to take traffic, now it will build out the multiple tomcat CATALINA_HOME trees from scratch, going as far as grabbing the ant & tomcat tarballs required (version numbers pulled from central config db, allowing per host overrides for piloting versions on individual machines), the aim here is 2 fold, have a clearly documented process for building a working CATALINA_HOME and be able to dynamically vary our tomcat count without lots of manual preparation required)
Environment & Configuration
Lots of issues & lots of different solutions to this one, best case, war file ships with 3 environments configured, default to production, over ride on the command line for other environments (down side, production passwords are in the artefacts & therefore in source control).
There was some discussion over externalising the config, various methods (XML includes in comtext.xml/server.xml) and providing an standard-ish API to get/set properties (some commercial JaS already do this).
Horror stories included the deploy process having a start to explode the war, stop, remove the war, fix the config, restart. Another involved a post restart data load that took 30-40min before the tomcat was ready for traffic again.
General concencus was that involving dev in more of the ops deployment pain helped hilight areas that needed some improvement.
There was a little bit of discussion around APIs for managing configs in a running AS, some of the commercial AS already have this, but their was little support around the room for single vendor solutions, although most agreed that practically nobody changes AS after initial selection, the desire for a single tool was focused on having a single tool to gain momentum instead of fragmented tooling.
No Ops in the Java Serverlet Steering committee?
I missed the exact names of the standards & people involved, I’ll update this if you have specifics
One of the participants in the java artifacts open space is on the Java Community Process mailing list, he pointed out that their was practically no one representing ops in the ML, some of the proposed changes for horrified ops people in the room.
the platform/application split
Some ardent supporters of deploying nothing but packages, people using FPM and other tools to build RPMs and other packages of the AS and another package for application.
Part of this also comes down to your orchestration tools and how you run your CM, Ramon of Hyves hilighted that they don’t run puppet continuously, they run once a day, due to orchestration requirements and scale (they have 3,000+ application server in production).
Most people agree that the CM goes as far as preparing the AS for an application to be dropped in, although most of these environments ran a single AS per host.