Quick links:

Arrfab's Blog » Cluster: Rolling updates with Ansible and Apache reverse proxies

Arrfab's Blog » Cluster: Ansible as an alternative to puppet/chef/cfengine and others …

Florian's blog: Imitation is the sincerest form of flattery

Florian's blog: 4 extra seats available in Cloud Bootcamp in Wellington!

Florian's blog: Returning to Paris for OpenStack in Action 2: Production Ready

Florian's blog: Our first Cloud Bootcamp is now Sold Out

Florian's blog: An exciting day for the Ceph community

Florian's blog: More details on OSCON 2012, and your chance to get in cheaper!

Florian's blog: Coming to New Zealand!

Florian's blog: A look back at my first OpenStack Design Summit & Conference

Florian's blog: Speaking at the Percona Live MySQL Conference and Expo

Florian's blog: Speaking at OSCON 2012

Florian's blog: Feature article on Pacemaker in this month’s Linux Journal

Florian's blog: Presentation accepted for OpenStack Spring 2012 Conference

Florian's blog: Announcing the High Performance High Availability Guide documentation project

Florian's blog: On my (ex-)maintainership of the DRBD User’s Guide

Florian's blog: Lots of new stuff on our web site

Florian's blog: Ceph: tickling my geek genes

Florian's blog: Announcing Cloud Jumpstart for OpenStack™ – and your chance to get into LinuxTag for free!

Florian's blog: OpenStack Spring 2012 Design Summit & Conference

Florian's blog: This blog is about to move!

Florian's blog: Speaking at the 2012 Percona Live MySQL Conference

Florian's blog: Last Minute discount now available for High Availability Expert training in Berlin

Florian's blog: Speaking at linux.conf.au, meet us in Ballarat!

Florian's blog: Now available: Slides from Percona Live and Linuxcon Europe

Florian's blog: Ready to roll for Percona Live UK

Florian's blog: Twitter

Florian's blog: Busy weeks ahead!

Florian's blog: Speaking at Percona Live — and you can get there for cheap!

Arrfab's Blog » Cluster: Monitoring DRBD resources with Zabbix on CentOS

Rolling updates with Ansible and Apache reverse proxies

Posted in Arrfab's Blog » Cluster by fabian.arrotin at May 23, 2013 04:36 PM

It's not a secret anymore that I use Ansible to do a lot of things. That goes from simple "one shot" actions with ansible on multiple nodes to "configuration management and deployment tasks" with ansible-playbook. One of the things I also really like about Ansible is that it's a great orchestration tool too.

For example, in some WSOA flows you can have a bunch of servers behind load balancer nodes. When you want to put a backend node/web server node in maintenance mode (to change configuration/update a package/update the app/whatever), you just "remove" that node from the production flow, do what you need to do, verify it's up again and put that node back in production. The principle of "rolling updates" is interesting because the flows stay in production 24/7 while you work through the nodes one at a time.

But what if you're not in charge of the whole infrastructure? For example, you're in charge of some servers, but not of the load balancers in front of them. Let's consider the following situation, and how we can still use Ansible to disable/enable a backend server behind Apache reverse proxies.

So here is the (simplified) situation: two Apache reverse proxies (using the mod_proxy_balancer module) are used to load balance traffic to four backend nodes (JBoss in our simplified case). We can't directly touch those upstream Apache nodes, but we can still interact with them, thanks to the fact that "balancer manager support" is active (and protected!).

Let's have a look at a (simplified) ansible inventory file :

[jboss-cluster]
jboss-1
jboss-2
jboss-3
jboss-4

[apache-group-1]
apache-node-1
apache-node-2

Let's now create a generic (write once, use many times) task to disable a backend node in Apache:

---
##############################################################################
#
# This task can be included in a playbook to pause a backend node
# being load balanced by Apache Reverse Proxies
# Several variables need to be defined :
#   - ${apache_rp_backend_url} : the URL of the backend server, as known by Apache server
#   - ${apache_rp_backend_cluster} : the name of the cluster as defined on the Apache RP (the group the node is member of)
#   - ${apache_rp_group} : the name of the group declared in hosts.cfg containing Apache Reverse Proxies
#   - ${apache_rp_user}: the username used to authenticate against the Apache balancer-manager
#   - ${apache_rp_password}: the password used to authenticate against the Apache balancer-manager
#   - ${apache_rp_balancer_manager_uri}: the URI where to find the balancer-manager Apache mod
#
##############################################################################
- name: Disabling the worker in Apache Reverse Proxies
  local_action: shell /usr/bin/curl -k --user ${apache_rp_user}:${apache_rp_password} "https://${item}/${apache_rp_balancer_manager_uri}?b=${apache_rp_backend_cluster}&w=${apache_rp_backend_url}&nonce=$(curl -k --user ${apache_rp_user}:${apache_rp_password} https://${item}/${apache_rp_balancer_manager_uri} |grep nonce|tail -n 1|cut -f 3 -d '&'|cut -f 2 -d '='|cut -f 1 -d '"')&dw=Disable"
  with_items: ${groups.${apache_rp_group}}

- name: Waiting 20 seconds to be sure no traffic is being sent anymore to that worker backend node
  pause: seconds=20

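The interesting bit is the with_items one: it uses the apache_rp_group variable to know which Apache servers are used upstream (assuming you can have multiple nodes/clusters) and runs that command for every host in the list obtained from the inventory! The nested curl call is there because the balancer-manager expects a nonce: we first fetch the page, extract the current nonce from it, and pass it back in the request that disables the worker.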
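To make the nested pipeline easier to follow, the nonce extraction can be pulled out into its own step. The following is only a sketch in plain shell: the function name extract_nonce and the sample HTML line are assumptions made for illustration (the sample mimics the kind of link the pipeline above expects to find in the balancer-manager page), not output captured from a real proxy.

```shell
#!/bin/sh
# Hypothetical helper: read balancer-manager HTML on stdin and print the
# value of the last "nonce=" parameter found in it.
extract_nonce() {
    sed -n 's/.*nonce=\([^"&]*\).*/\1/p' | tail -n 1
}

# Assumed sample of one worker link as it could appear in the page.
sample_html='<a href="/balancer-manager?b=mycluster&w=https://jboss1.myinternal.domain.org:8443&nonce=1f2e3d4c-aaaa-bbbb-cccc-0123456789ab">worker</a>'

# Print the nonce found in the sample line.
printf '%s\n' "$sample_html" | extract_nonce
```

Against a real proxy, the input would come from the same authenticated curl call used in the task, piped into extract_nonce instead of the grep/cut chain.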

We can now, in the "rolling-updates" playbook, just call the previous tasks (assuming we saved it as ../tasks/apache-disable-worker.yml) :

---
- hosts: jboss-cluster
  serial: 1
  user: root
  tasks:
    - include: ../tasks/apache-disable-worker.yml
    - etc/etc ...
    - wait_for: port=8443 state=started
    - include: ../tasks/apache-enable-worker.yml

But wait! As you've seen, we still need to declare some variables: let's do that in the inventory, under group_vars and host_vars!

group_vars/jboss-cluster :

# Apache reverse proxies settings
apache_rp_group: apache-group-1
apache_rp_user: my-admin-account
apache_rp_password: my-beautiful-pass
apache_rp_balancer_manager_uri: balancer-manager-hidden-and-redirected

host_vars/jboss-1 :

apache_rp_backend_url: 'https://jboss1.myinternal.domain.org:8443'
apache_rp_backend_cluster: nameofmyclusterdefinedinapache

Now when we use that playbook, we'll have a local action that interacts with the balancer manager to disable that backend node while we do maintenance.

I'll let you imagine (and create) a ../tasks/apache-enable-worker.yml file to enable it again (which you'll call at the end of your playbook).
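As a starting point, here is a rough shell sketch of the request such an enable task would have to send. It assumes the balancer-manager accepts dw=Enable the same way it accepts dw=Disable in the task above; the function name balancer_url and the placeholder nonce are invented for the example.

```shell
#!/bin/sh
# Hypothetical helper: print the balancer-manager URL for one action.
# Arguments: proxy host, manager URI, cluster name, worker URL, nonce, action.
balancer_url() {
    printf 'https://%s/%s?b=%s&w=%s&nonce=%s&dw=%s\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

# Same values as in the inventory variables above; the nonce is a placeholder
# (a real one has to be read from the balancer-manager page first).
balancer_url apache-node-1 balancer-manager-hidden-and-redirected \
    nameofmyclusterdefinedinapache \
    'https://jboss1.myinternal.domain.org:8443' \
    PLACEHOLDER-NONCE Enable
```

Under that assumption, the enable task file should amount to a copy of apache-disable-worker.yml with the trailing dw=Disable changed to dw=Enable and the pause dropped.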

Ansible as an alternative to puppet/chef/cfengine and others …

Posted in Arrfab's Blog » Cluster by fabian.arrotin at October 26, 2012 02:02 PM

I already know that I'll be criticized for this post, but I don't care :-). Strangely, my last blog post (which is *very* old ...) was about a puppet dashboard, so why talk about another tool? Well, first I got a new job and some prerequisites have changed. I still like puppet (and I'd even want to be able to use puppet, but that's another story ...) but I was faced with some constraints on a new project. For that specific project, I had to configure a bunch of new Virtual Machines (RHEL6) coming as OVF files. Problem number one was that I can't alter or modify the base image, so I can't push packages (from the distro or third-party repositories). The second issue is that I can't install nor have a daemon/agent running on those machines.

I had a look at the different config tools available, but they all require either a daemon to be started, or at least extra packages to be installed on each managed node (so it's not possible to have puppetd or puppetrun, or to invoke puppet directly through ssh, as puppet can't even be installed; same for saltstack). That's why I decided to give Ansible a try. It had been on my "to-test" list for a long time, and it really fit the bill for that specific project and its constraints: it uses the 'already-in-place' ssh authorization, needs no packages installed on the managed nodes, and, last but not least, has a really gentle learning curve (compared to puppet and others, but that's my personal opinion/experience).

The other good thing with Ansible is that you can start very easily and then slowly add 'complexity' to your playbooks/tasks. For example, I'm still using a flat inventory file, but it's already organized to reflect what we can do in the future (hostnames included in groups, themselves included in parent groups, aka nested groups). Same for variable inheritance: variables are defined at the group level and down to the host level, with host variables overriding those defined at the group level, etc.

The YAML syntax is really easy to understand, so you can quickly have your first playbook being played on a bunch of machines simultaneously (thanks to paramiko/parallel ssh). The number of modules is smaller than the number of puppet resources, but is growing quickly. I also just tested tying the execution of ansible-playbook to Jenkins, so that people who don't have access to the ansible inventory/playbooks/tasks (stored in a VCS, subversion in my case) can use it from a GUI. More to come on Ansible in the future.

Imitation is the sincerest form of flattery

Posted in Florian's blog by Florian Haas at May 29, 2012 10:37 AM

Know how they say that you don’t know you’re doing something right until someone starts imitating you? Well, this is a great time for us.

Someone evidently took a good long read of our hastexo High Availability Expert class agenda, and made this. It’s wonderful for us to see that it’s such an inspiration to others (even the acronym!), and we hope to see more folks doing this in the future.

Meanwhile, I should say this since I haven’t yet mentioned it here on my blog: there’s a new feature in our own HHAX class that we’ve added to our upcoming class in Berlin, and that is the booth arbitration daemon and ticket manager.

Read more…


4 extra seats available in Cloud Bootcamp in Wellington!

Posted in Florian's blog by Florian Haas at May 21, 2012 09:00 PM

Our Cloud Bootcamp for OpenStack™ in Wellington next month just got 4 extra seats! If you’re in New Zealand, Australia, or the Pacific region, and want to learn about OpenStack, now is your chance!

Read more…


Returning to Paris for OpenStack in Action 2: Production Ready

Posted in Florian's blog by Florian Haas at May 10, 2012 11:30 AM

This month, I’m thrilled to go to Paris to talk about highly-available OpenStack. The event I’m speaking at is OpenStack in Action 2: Production Ready, and it’s being organized by French hosting & cloud services provider eNovance.

Read more…


Our first Cloud Bootcamp is now Sold Out

Posted in Florian's blog by Florian Haas at May 08, 2012 09:44 PM

Less than two weeks after it was announced, our inaugural Cloud Bootcamp for OpenStack™ in Wellington, New Zealand is now sold out. Our friends at Catalyst IT have put up a wait list, and we’re currently working on tacking on extra days to meet the excess demand.

This will be fun.

Read more…


An exciting day for the Ceph community

Posted in Florian's blog by Florian Haas at May 03, 2012 11:10 AM

Today, as you’ve probably noticed if you’re following the development of the Ceph stack, something mighty cool has been happening. The ceph.com web site received a major makeover with a slick new design, and the people behind Ceph have announced the launch of a brand new company to drive the Ceph stack, Inktank.

Read more…


More details on OSCON 2012, and your chance to get in cheaper!

Posted in Florian's blog by Florian Haas at May 01, 2012 06:29 PM

A few more details on my speaking slot at this year’s OSCON, titled Highly Available Cloud: OpenStack Integration with Pacemaker.

Read more…


Coming to New Zealand!

Posted in Florian's blog by Florian Haas at April 25, 2012 05:37 AM

hastexo is offering Cloud Bootcamp for OpenStack™ in Wellington. Another fine example of the global OpenStack community at work.

Read more…


A look back at my first OpenStack Design Summit & Conference

Posted in Florian's blog by Florian Haas at April 24, 2012 09:35 AM

I’ve just returned from the OpenStack Folsom Design Summit and Spring 2012 Conference, and am finally getting rid of my jet lag. Here’s a summary of what’s been a mind-blowing conference experience for me.

Read more…


Speaking at the Percona Live MySQL Conference and Expo

Posted in Florian's blog by Florian Haas at April 14, 2012 04:09 PM

This week, I had the pleasure of speaking at the Percona Live MySQL Conference & Expo. This was the first year it was not the O’Reilly MySQL Conference & Expo, and also the first time Oracle was not involved in any way. And what can I say, Terry Erisman and his team at Percona have put together an awe-inspiring conference.

Read more…


Speaking at OSCON 2012

Posted in Florian's blog by Florian Haas at April 03, 2012 09:28 AM

I’ll be speaking at OSCON 2012 in Portland, on high availability in OpenStack.

Read more…


Feature article on Pacemaker in this month’s Linux Journal

Posted in Florian's blog by Florian Haas at April 02, 2012 12:45 PM

I’ve written an article on the Pacemaker stack that’s being featured in this month’s Issue 216 of Linux Journal.

Read more…


Announcing the High Performance High Availability Guide documentation project

Posted in Florian's blog by Florian Haas at March 23, 2012 01:06 PM

A bit of updated information on my limited involvement in the DRBD User’s Guide, and something new that came out of it.

Read more…


On my (ex-)maintainership of the DRBD User’s Guide

Posted in Florian's blog by Florian Haas at March 20, 2012 01:19 PM

Here’s a quick summary of my past and current relationship with the DRBD User’s Guide.

Read more…


Lots of new stuff on our web site

Posted in Florian's blog by Florian Haas at March 13, 2012 05:13 PM

Over the past couple of weeks, we’ve quietly rolled out new content, new functionality, and Hangouts on our web site. Here’s a summary of these nifty little changes.

Read more…


Ceph: tickling my geek genes

Posted in Florian's blog by Florian Haas at March 08, 2012 07:12 PM

Haven’t heard of Ceph, the open-source distributed petascale storage stack? Well, you’ve really been missing out. It’s not just a filesystem. It’s a filesystem, and a striped/replicated block device provider, and a virtualization storage backend, and a cloud object store, and then some.

Read more…


Announcing Cloud Jumpstart for OpenStack™ – and your chance to get into LinuxTag for free!

Posted in Florian's blog by Florian Haas at March 06, 2012 10:15 AM

Yesterday, we announced Cloud Jumpstart for OpenStack™, our brand new training offering with 2 full days of deep-diving into OpenStack. If you have little or no experience with OpenStack, and you want to get your feet wet and your hands dirty real quick, then Cloud Jumpstart for OpenStack is for you. And there’s an extra sweet deal on our first incarnation of this awesome class.

Read more


OpenStack Spring 2012 Design Summit & Conference

Posted in Florian's blog by Florian Haas at February 29, 2012 09:13 AM

This April, right after the MySQL Conference & Expo, I’ll stay around the San Francisco Bay Area for another week, as I’ve been invited to attend the OpenStack Spring 2012 Design Summit & Conference.

Read more


This blog is about to move!

Posted in Florian's blog by Florian Haas at February 28, 2012 01:57 PM

My dear and faithful readers, this will cease to be my primary blog site.

From now on, I’ll be blogging over on the hastexo web site, where you can find my blog at http://www.hastexo.com/blogs/florian. An RSS feed, for those of you who want to update their readers, is at http://www.hastexo.com/blogs/florian/feed.

This is something I’ve been meaning to do for a while, and I’ve finally had the breathing room to do so.

The statements I make on my blog will continue to be my own, rather than “official” hastexo company statements. The same is true for Martin, whose blog is also moving to our web site (RSS).

To ease the transition, I plan to post the opening paragraphs of blog posts popping up over there on this site. It’s just that the “Read More” links will now point to the new primary blog site. I have no intention of taking the WordPress site down anytime soon, so anything recorded here will remain for reference purposes.


Speaking at the 2012 Percona Live MySQL Conference

Posted in Florian's blog by Florian Haas at February 27, 2012 12:39 PM

This year, I have the pleasure of returning to the MySQL Conference & Expo as a speaker. Percona have picked up the torch that O’Reilly had held as the conference organizers, and they’re putting together a 3-day conference this year. I am co-presenting a tutorial with Yves Trudeau from Percona.

Read more


Last Minute discount now available for High Availability Expert training in Berlin

Posted in Florian's blog by Florian Haas at February 07, 2012 05:10 AM

We have exactly one seat still left open in our hastexo High Availability Expert class coming up in Berlin next week. So if you want to learn about GFS2, OCFS2, advanced Pacemaker, GlusterFS and Ceph in one of Europe’s most beautiful cities, now is your chance!

And, we have a Last Minute discount available so you can get in for cheap! You’ll just have to be really, really fast before someone else grabs it. Our web site has the details.


Speaking at linux.conf.au, meet us in Ballarat!

Posted in Florian's blog by Florian Haas at January 04, 2012 10:35 PM

After last year’s talk in Brisbane, where I greatly enjoyed co-presenting with Tim Serong, I have the privilege of returning to Australia for this year’s linux.conf.au in Ballarat, Victoria.

This time I have a brief talk opening up the High Availability and Distributed Storage miniconf on Monday, January 16, and a tutorial entitled High Availability Sprint in the morning on Thursday, January 19. Tim Serong is again joining me for the tutorial, and Pacemaker author Andrew Beekhof will be chiming in too. See you all in Ballarat!


Now available: Slides from Percona Live and Linuxcon Europe

Posted in Florian's blog by Florian Haas at November 01, 2011 08:53 PM

The slides from last week’s talks I (co-)presented at Percona Live and Linuxcon Europe are now available from our web site.

All slides are available entirely free of charge for logged-in users on our web site. To log in, you don’t even need to register — just use your Google Profile, or Google Apps account, or your WordPress account, or anything else that uses OpenID, and you’ll be good to go.

Comments on our slides are, of course, always highly appreciated.


Ready to roll for Percona Live UK

Posted in Florian's blog by Florian Haas at October 23, 2011 06:17 PM

Percona Live MySQL Conference, London, Oct 24th and 25th, 2011

All slides are done, all virtual images are completed and we’re ready to roll for tomorrow’s MySQL High Availability Sprint: Launch the Pacemaker tutorial at Percona Live UK 2011.

This is probably your very last chance to register for PLUK as there are only a handful of tickets left. You can still use my discount code, HaasPLUK11. See you tomorrow!


Twitter

Posted in Florian's blog by Florian Haas at October 19, 2011 10:22 AM

Henceforth, you can find and follow us on Twitter. See you there!


Busy weeks ahead!

Posted in Florian's blog by Florian Haas at October 17, 2011 03:17 PM

I’m speaking at Percona Live, LinuxCon Europe, and linux.conf.au. And I just co-founded a new company.

I have a few busy weeks behind me, and even busier weeks ahead. If you’ve been wondering why recently I haven’t been updating this space too frequently, here’s why:

Yours truly and fellow ex-Linbiters Martin Loschwitz and Andreas Kurz have recently founded hastexo, an independent professional services organization focused on open-source high availability and disaster recovery. We are already offering both on-site and remote consultancy, custom training, and our Availability Checkup package, with more services lined up to be added to our offering.

We’re able to offer direct, 24/7 access to high availability experts with dial-in numbers in Europe, North America and Australia. We’re offering our services under an extremely flexible, versatile payments scheme with an attractive volume discount model. We’re experts in an array of high availability and disaster recovery technologies like Pacemaker, Corosync, Heartbeat, DRBD, highly available virtualization (a.k.a. “enterprise cloud”), and cluster file systems.

And we’ve got a unique, free offering. Have you ever considered hiring a high availability consultant to review your setup or provide expert advice, but were unsure as to the expected cost involved? At hastexo, we can help. You simply go to our Help page (free-of-charge registration required), collect information as instructed, and then just create a ticket in our support system. And we’ll make a qualified estimate as to the amount of effort (and cost) required to fix your issue, or improve your uptime, or both.

And, just in case one of us has previously helped you on a mailing list, on IRC, or at a conference, as we frequently do, then please leave us a message in our Shoutbox. We love to support the high availability community, and we’re thrilled to hear about it when we can help.

Speaking of conferences: next week, I’m doing back-to-back conferences in Europe.

And, for those of you making plans for Ballarat in January: I’ll return to linux.conf.au as a tutorial speaker, together with Andrew Beekhof and Tim Serong. I have also submitted a talk for the High Availability and Distributed Storage miniconf, preceding the main conference. See you there!


Speaking at Percona Live — and you can get there for cheap!

Posted in Florian's blog by Florian Haas at September 08, 2011 06:16 AM

Following my departure from Linbit, I’m honored to be serving a number of speaking requests at conferences over the next few months.

The first I am pleased to announce is my commitment to speak at Percona Live in London this October. The conference venue is the America Square Conference Centre not too far from the iconic Tower of London. My 3-hour tutorial MySQL High Availability Sprint: Launch The Pacemaker! is scheduled for Monday, October 24th at 1pm.

In this tutorial, I’ll show you the simplest, quickest and easiest way to set up MySQL high availability in Pacemaker clusters — once you understand the concept, you’ll be able to pull this sort of thing off in under an hour.

What’s cool is that Percona provide tutorial speakers with discount codes for registration, which we can freely share. Thus, if you register for Percona Live using the discount code HaasPLUK11, you get £40 off the Conference+Tutorials ticket — and if you do so before September 19, you save an additional £135 with Early Bird Registration. This discount is valid regardless of whether you actually come to my tutorial or choose a concurrently scheduled one — so even if my tut is not for you, I can still help you get into the conference cheaper!

I’m thrilled to be doing this and can’t wait to see a bunch of familiar faces in London. And I’d be thrilled to see you!


Monitoring DRBD resources with Zabbix on CentOS

Posted in Arrfab's Blog » Cluster by fabian.arrotin at September 07, 2011 12:10 PM

We use DRBD at work on several CentOS 5.x nodes to replicate data between our two computer rooms (in different buildings but linked with Gigabit fiber). It's true that you can know if something wrong happens at the DRBD level if you have configured the correct 'handlers' and the appropriate notification scripts (have a look for example at the Split Brain notification script). Those scripts are 'cool', but what if you could 'plumb' the DRBD status into your actual monitoring solution? We use Zabbix at $work and I was asked to centralize events from different sources, and Zabbix doesn't directly support monitoring DRBD devices.

But one of the cool things with Zabbix is that it's like a Lego system: you can extend what it does if you know what to query and how to do it. If you want to monitor DRBD devices, the best that Zabbix can do (on the agent side, when the zabbix agent runs as a simple zabbix user with /sbin/nologin as shell) is to query and parse /proc/drbd. So here we go: we need to modify the Zabbix agent configuration to use Flexible User Parameters, like this (in /etc/zabbix/zabbix_agentd.conf):

UserParameter=drbd.cstate[*],cat /proc/drbd |grep $1:|tr [:blank:] \\n|grep cs|cut -f 2 -d ':'|grep Connected |wc -l
UserParameter=drbd.dstate[*],cat /proc/drbd |grep $1:|tr [:blank:] \\n|grep ds|cut -f 2 -d ':'|cut -f 1 -d '/'|grep UpToDate|wc -l

Each key returns 1 when the state is Connected (cstate) or UpToDate (dstate), and 0 otherwise. We then just need to inform the Zabbix server of the actual Connection State (cs) and Disk State (ds). For that we just need to create Application/Items and Triggers ... but what if we could just create a Zabbix Template so that we can simply link that template to a DRBD host? I attach to this post the DRBD Zabbix template (an XML file that you can import in your zabbix setup) and you can just link it to your drbd hosts. Here is the link. That XML file contains both Items (cstate and dstate) and the associated triggers. Of course you can extend it, especially if you use multiple resources/drbd disks. Because we used Flexible parameters, you can for example create a new Zabbix item (based on the template) and monitor the /dev/drbd1 device just by using the drbd.dstate[1] key in that item.
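To see what the two UserParameter pipelines pull out of /proc/drbd, here is a small shell sketch that extracts the same fields from /proc/drbd-style text. The helper name drbd_field and the sample status text are assumptions for illustration (the sample follows the usual DRBD 8.x layout, with made-up values); it is not a replacement for the UserParameter definitions themselves.

```shell
#!/bin/sh
# Hypothetical helper: read /proc/drbd-style text on stdin and print the value
# of one field (for example cs or ds) for the given DRBD minor number.
drbd_field() {
    awk -v dev="$1:" -v key="$2:" '
        $1 == dev {
            for (i = 2; i <= NF; i++)
                if (index($i, key) == 1) {
                    print substr($i, length(key) + 1)
                    exit
                }
        }'
}

# Assumed sample: two resources, the second one waiting for its peer.
sample=' 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
 1: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r----'

printf '%s\n' "$sample" | drbd_field 0 cs   # connection state of /dev/drbd0
printf '%s\n' "$sample" | drbd_field 1 ds   # disk states (local/peer) of /dev/drbd1
```

With the sample above this prints Connected and then UpToDate/DUnknown. The Zabbix keys in the post go one step further and reduce the state to 1 or 0 with grep and wc -l, which is easier to attach a trigger to.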

Happy Monitoring and DRBD'ing ...