This will be my first multi-part blog post, and I am actually not sure just how many parts it will have by the time I am finished. My original intent was to test some failure scenarios whereby I would emulate a WAN link disappearing. That quickly expanded into a more ambitious test: I still wanted to cover failure scenarios, but I also wanted to test with some real-world(ish) values for latency, packet loss, jitter, and so on.
In particular, I wanted to see how a sharded MongoDB cluster would behave with this type of variable performance and what I could do to make things better (I have some interesting ideas there), as well as test some improvements in the 2.6 version. I’ve created configurations like this in labs with hardware (expensive) and with XenServer (paid), but I wanted something others could reproduce and reuse – preferably easily, and at no cost. Hence I decided to see if I could make this work with VirtualBox (I also plan to come up with something similar for Docker/Containers, having read this excellent summary, but that is for later).
My immediate thought was to use Traffic Control directly, but I had a vague recollection of a utility I had used in the past that gave me a nice (if basic) web interface for configuring various options and was fairly easy to set up. A bit of quick Googling got me to WANem, and this was indeed what I had used before. The major drawback at the time was that, being a live CD, it had to be reconfigured after every boot. Hence the first task was to fix that and get it to the point that it was a permanent VM (note: there is a pre-built VMware appliance available for those on that platform).
That was reasonably straightforward, and I wrote up the process over at SuperUser as a Q&A:
Once that was done, it was time to configure the interfaces, make routing work and test that the latency and packet loss settings actually worked (continues after the jump).
To help visualize things, for my initial testing, what I wanted was pretty basic, something like this:
The subnets in the diagram are the defaults that are set up when you create “Host Only” networks (vboxnet0 and vboxnet1 respectively). They suited my needs, so I left them as-is, and in place of the various MongoDB processes, I started with two Ubuntu 12.04 VMs, with one connected to each “Host Only” network. I disabled the DHCP server on the networks, and set their IP addresses (192.168.56.10 and 192.168.57.10) and default gateways (192.168.56.1 and 192.168.57.1) manually. There are plenty of guides on how to go about this, so I’ll just leave that as a basic description and move on to the WANem config.
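For completeness, here is a sketch of the static configuration I used on the first guest – this assumes the classic eth0 interface name and the ifupdown style that Ubuntu 12.04 uses; the second VM is identical except it gets 192.168.57.10 with gateway 192.168.57.1:

```
# /etc/network/interfaces on the first Ubuntu guest (sketch)
auto eth0
iface eth0 inet static
    address 192.168.56.10
    netmask 255.255.255.0
    gateway 192.168.56.1
```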
The first thing I did was configure the eth0 (vboxnet0) and eth1 (vboxnet1) interfaces with the appropriate IP addresses to function as the default gateways for the respective subnets. This meant editing the /etc/network/interfaces file so that it looked like this:
# /etc/network/interfaces -- configuration file for ifup(8), ifdown(8)

# The loopback interface
# automatically added when upgrading
auto lo
iface lo inet loopback

auto eth2
iface eth2 inet dhcp

auto eth0
iface eth0 inet static
    address 192.168.56.1
    netmask 255.255.255.0

auto eth1
iface eth1 inet static
    address 192.168.57.1
    netmask 255.255.255.0
Nothing too fancy, but the interfaces should now be up and pingable from the Ubuntu hosts on each subnet (though not across the subnets – routing is not yet enabled). To enable routing, and to actually make my internet connection via eth2 work, I ended up having to add the lines below to the bottom of /etc/rc.local. I’m not happy with that as a solution (it’s a bit hacky), but it works:
echo "1" > /proc/sys/net/ipv4/ip_forward
route del default gw 192.168.56.1
route add default gw 192.168.1.1
To explain the three lines: for some reason, no matter how I tried to make the setting persist, the ip_forward setting would not survive a reboot, so I eventually set it here out of frustration. I’ve successfully made this persist on many distributions, but not this one – I gave up when I found this workable hack, since getting a converted Knoppix distro to behave is, after all, not the point of this exercise. The second and third lines are required to make the internet connection (via eth2) in the diagram work and function as the default route – despite tinkering with various options in the interfaces file, it persisted in using the eth0 route as the default otherwise.
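For reference, the usual way to persist IP forwarding on most distributions – the approach that would not stick on this image – is a single line in /etc/sysctl.conf:

```
# /etc/sysctl.conf -- normally persists across reboots
# (this would not survive a reboot on the converted Knoppix image)
net.ipv4.ip_forward = 1
```

followed by running sysctl -p to apply it without rebooting.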
There were some initial issues (likely with ARP caching or similar), but once I had pinged each host from each end of the connection, traffic started to flow as expected. I was able to use the WANem web interface to introduce latency, packet loss, and more. Example basic settings can be seen here:
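For anyone who prefers the command line, WANem is essentially a web frontend over tc and netem, so roughly equivalent settings can be applied by hand. A sketch, run as root on the WANem VM – the interface name and values here are illustrative, not the ones from my screenshots:

```shell
# Emulate a degraded WAN link on eth1 (illustrative values):
# 80ms delay with 10ms jitter, plus 1% packet loss
tc qdisc add dev eth1 root netem delay 80ms 10ms loss 1%

# Inspect the active queueing discipline
tc qdisc show dev eth1

# Remove the emulation again when done
tc qdisc del dev eth1 root
```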
Don’t forget to ping the host addresses (in my case 192.168.56.10 and 192.168.57.10) from the WANem shell to kick things off. Starting and stopping WANem gave me the variations in performance I expected. Basically, I now had the test setup I needed; all that remained was to get the MongoDB instances configured as needed and I was good to go.
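That kick-off from the WANem shell amounts to nothing more than pinging each guest once, using the addresses from my setup:

```shell
# From the WANem shell: prime ARP and routing by pinging each guest
ping -c 3 192.168.56.10
ping -c 3 192.168.57.10
```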