Test Suite: Difference between revisions
Revision as of 17:05, 12 September 2014
Libreswan comes with an extensive test suite, written mostly in python, that uses KVM virtual machines and virtual networks. It has replaced the old UML test suite. Apart from KVM, the test suite uses libvirtd and qemu. It is strongly recommended to run the test suite natively on the OS (not in a VM itself) on a machine that has a CPU with virtualization instructions. The PLAN9 filesystem (9p) is used to mount host directories in the guests. NFS is avoided to prevent network lockups when an IPsec test case would cripple the guest's networking.
libvirt 0.9.11 and qemu 1.0 or better are required. RHEL does not support a writable 9p filesystem, so the recommended host/guest OS is Fedora 20.
Preparing the host machine
Nothing apart from the system services requires root access. However, the test suite does require that the user you are using is allowed to run various commands as root via sudo. Additionally, libvirt assumes the VMs are running under the qemu uid, but because we want to share files using the 9p filesystem between host and guests, we want the VMs to run under our own uid. The easiest way to accomplish all of this is to add your user (for example the username "build") to the kvm, qemu and wheel groups. These are the changed lines from /etc/group:
wheel:x:10:root,build
kvm:x:36:root,qemu,build
qemu:x:107:root,qemu,build
Commands to effect this:
sudo usermod -a -G wheel,kvm,qemu root
sudo usermod -a -G wheel,kvm,qemu build
sudo usermod -a -G kvm,qemu qemu
And the file /etc/sudoers would have a line:
%wheel ALL=(ALL) NOPASSWD: ALL
You might need to relogin for all group changes to take effect.
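After logging in again, it can be useful to confirm that the group changes took effect. The following is a minimal sketch; missing_groups is a hypothetical helper name, not part of the libreswan tree, and it only compares a membership list (in the format printed by id -nG) against the groups named above.

```shell
# Print each required group that is absent from a space-separated
# membership list (the format produced by `id -nG`).
# $1 = membership list, remaining arguments = required groups
missing_groups() {
    have=" $1 "
    shift
    for want in "$@"; do
        case "$have" in
            *" $want "*) ;;                    # already a member
            *) printf '%s\n' "$want" ;;        # report the missing group
        esac
    done
}

# Example for the user "build" used in this page:
#   missing_groups "$(id -nG build)" wheel kvm qemu
```

No output means the user is in all three groups.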
Now we are ready to install the various components of libvirtd, qemu and kvm and then start the libvirtd service.
sudo yum install virt-manager virt-install qemu-system-x86 libvirt-daemon-driver-qemu qemu \
    qemu-kvm libvirt-daemon-qemu qemu-img qemu-user libvirt-daemon-qemu libvirt-python racoon strongswan
sudo systemctl enable libvirtd.service
sudo systemctl start libvirtd.service
Because our VMs don't run as qemu, /var/lib/libvirt/qemu needs to be changed using chmod to make it writable for the qemu group. This needs to be repeated if the libvirtd package is updated on the system.
Do not install strongswan-libipsec because you won't be able to run non-NAT strongswan tests!
Various tools are used or convenient to have when running tests:
sudo yum install python-pexpect git tcpdump expect python-setproctitle python-ujson
tcpdump permissions
groupadd tcpdump
grep tcpdump /etc/group*
# add build to group tcpdump
vim /etc/group
ls -lt /sbin/tcpdump
chown root.tcpdump /sbin/tcpdump
setcap "CAP_NET_RAW+eip" /sbin/tcpdump
grep tcpdump /etc/group*
# /etc/group:tcpdump:x:72:build
# /etc/group-:tcpdump:x:72:
# when the installation is complete the following should work
tcpdump -i swan12
The libreswan source tree includes all the components that are used on the host and inside the test VMs. To get the latest source code using git:
git clone https://github.com/libreswan/libreswan
cd libreswan
Creating the VMs
A configuration file called kvmsetup.sh is used to configure a few parameters for the test suite:
cp kvmsetup.sh.sample kvmsetup.sh
This file contains various environment variables used for creating and running the tests. In the example version, the KVMPREFIX= is set to the home directory of the user "build". The POOLSPACE= is where all the VM images will be stored. There should be at least 16GB of free disk space in the pool/ directory. You can change the OSTYPE= if you prefer to use ubuntu guests over fedora guests. We recommend that the host and guest run the same OS - it makes things like running gdb on the host for core dumps created in the guests much easier. The OSMEDIA= can be changed to point to a local distribution mirror.
If you wish to be able to ssh into all the VMs without using a password, add your ssh public key to testing/baseconfigs/all/etc/ssh/authorized_keys. This file is installed as /root/.ssh/authorized_keys on all VMs.
Once the kvmsetup.sh file has been edited, we can create the VMs:
sh testing/libvirt/install.sh
First, a new VM is added to the system called "fedorabase" (or "ubuntubase"). This is an automated minimal install using kickstart. In the "post install" phase of the anaconda installer, this VM runs a "yum update" to ensure we have the latest versions of all packages. In that %post phase we also install various packages that we need to run the tests. This can result in the installer spending a very long time in the "post install" phase. During this time, the VM displays no progress bar. Just be patient.
Once the VM is fully installed, the disk image is converted to QCOW and copied for each test VM: west, east, north, road and nic. A few virtual networks are created to hook up the VMs in isolation. These virtual networks have names like "192_1_2_0" and use bridge interface names like "swan12". Finally, the actual VMs are added to the system's libvirt/KVM system and the "fedorabase" VM is deleted.
Since you will be using ssh a lot to login to these machines, it is recommended to put their names in /etc/hosts:
# /etc/hosts entries for libreswan test suite
192.1.2.45 west
192.1.2.23 east
192.0.3.254 north
192.1.3.209 road
192.1.2.254 nic
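If you rebuild the VMs more than once, you may want to add these entries without creating duplicates. The sketch below assumes a hypothetical helper named add_host_entry; it appends a line only when the host name is not already present in the file.

```shell
# Append "IP NAME" to a hosts file unless NAME is already listed.
# $1 = hosts file, $2 = IP address, $3 = host name
add_host_entry() {
    # Match NAME as a whole word preceded by whitespace.
    if ! grep -qE "[[:space:]]$3([[:space:]]|\$)" "$1" 2>/dev/null; then
        printf '%s %s\n' "$2" "$3" >> "$1"
    fi
}

# Example (run as root, since /etc/hosts is not user-writable):
#   add_host_entry /etc/hosts 192.1.2.45 west
#   add_host_entry /etc/hosts 192.1.2.23 east
```

Running the same call twice leaves a single entry, so the helper is safe to rerun.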
Logging into the VMs
You can login to the VMs in three different ways:
- using ssh (with the above names in /etc/hosts)
- using sudo virsh console <name>
- using virt-manager on the "graphics console"
Using ssh becomes easier if you are running ssh-agent (you probably are) and your public key is known to the virtual machine. This command, run on the host, installs your public key on the root account of the guest machine west. This assumes that west is up (it might not be, but you can put this off until you actually need ssh, at which time the machine would need to be up anyway). Remember that the root password on each guest machine is "swan".
ssh-copy-id root@west
You can use ssh-copy-id for any VM. Unfortunately, the key is forgotten when the VM is restarted.
Preparing the VMs
The VMs came pre-installed with everything, except libreswan. We do not want to use the OS libreswan package because we want to run our own version to test our code changes. Some of the test cases use the NETKEY/XFRM IPsec stack but most test cases use the KLIPS IPsec stack. Login to the first VM and compile and install the libreswan userland and KLIPS ipsec kernel module:
build@host:~/libreswan $ sudo virsh start west
build@host:~/libreswan $ ssh root@west swan-update
swan-update first builds libreswan and then installs libreswan. For the other VMs (except "nic" which never runs IPsec) we only need to install, as libreswan is already built in the first VM.
build@host:~/libreswan $ for vm in east north road; do sudo virsh start $vm; done
(wait for machines to boot)
build@host:~/libreswan $ for vm in east north road; do ssh root@$vm swan-install; done
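The "(wait for machines to boot)" step can be scripted instead of done by hand. This is a minimal sketch, not something shipped with the test suite: wait_until is a hypothetical helper that retries a command until it succeeds, and the example assumes key-based ssh login to the guests already works.

```shell
# Run a command repeatedly until it succeeds or the attempts run out.
# $1 = maximum attempts, $2 = seconds between attempts, rest = command
wait_until() {
    tries=$1
    delay=$2
    shift 2
    while [ "$tries" -gt 0 ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        tries=$((tries - 1))
        # Only sleep if another attempt will follow.
        if [ "$tries" -gt 0 ]; then sleep "$delay"; fi
    done
    return 1
}

# Example: wait for each guest to accept ssh, then install.
#   for vm in east north road; do
#       wait_until 30 5 ssh -o BatchMode=yes -o ConnectTimeout=5 root@$vm true \
#           && ssh root@$vm swan-install
#   done
```

BatchMode=yes makes ssh fail rather than prompt for a password, so the loop cannot hang waiting for input.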
All VMs are now fully provisioned to run test cases.
The directories /source and /testing inside any VM are automatically mounted from the host's libreswan directory. Do not move the libreswan or the pool space directory on the host.
Running a test case
All the test cases involving VMs are located in the libreswan directory under testing/pluto/ . The most basic test case is called basic-pluto-01. Each test case consists of a few files:
- description.txt to explain what this test case actually tests
- ipsec.conf files, named after the host (for host west the file is called west.conf). These can also include configuration files for strongswan or racoon2 for interop testing
- ipsec.secrets files, needed only if non-default configurations are used. These also use the host naming, eg west.secrets, east.secrets
- An init.sh file for each VM that needs to start (eg westinit.sh, eastinit.sh, etc)
- One run.sh file for the host that is the initiator (eg westrun.sh)
- Known good (sanitized) output for each VM (eg west.console.txt, east.console.txt)
- testparams.sh if there are any non-default test parameters
You can run this test case by issuing the following command on the host:
cd testing/pluto/basic-pluto-01/
../../utils/dotest.py
Once the testrun has completed, you will see an OUTPUT/ directory in the test case directory:
$ ls OUTPUT/
east.console.diff  east.console.verbose.txt  RESULT       west.console.txt   west.pluto.log
east.console.txt   east.pluto.log            swan12.pcap  west.console.diff  west.console.verbose.txt
- RESULT is a text file (whose format is sure to change in the next few months) stating whether the test succeeded or failed.
- The diff files show the differences between this testrun and the last known good output.
- Each VM's serial (sanitized) console log (eg west.console.txt)
- Each VM's unsanitized verbose console output (eg west.console.verbose.txt)
- A network capture from the bridge device (eg swan12.pcap)
- Each VM's pluto log, created with plutodebug=all (eg west.pluto.log)
- Any core dumps generated if a pluto daemon crashed
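Since the format of RESULT is expected to change, a quick way to see which hosts deviated from the known good output is to look for non-empty diff files. The sketch below rests on the assumption that an empty .console.diff file means the console output matched; changed_consoles is a hypothetical helper name.

```shell
# List the hosts whose console diff in an OUTPUT/ directory is non-empty.
# $1 = path to the OUTPUT directory
changed_consoles() {
    for d in "$1"/*.console.diff; do
        # Skip empty diffs and the unexpanded pattern when nothing matches.
        [ -s "$d" ] || continue
        basename "$d" .console.diff
    done
}

# Example, run from inside a test case directory:
#   changed_consoles OUTPUT
```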
Diagnosing inside the VM
Once a test run has completed, the VMs shut down the ipsec subsystem. You can use ssh to login as root on any host (password "swan") and rerun the testcase manually. This gives you a chance to repeat a crasher while using gdb. You need three terminals to do this.
Terminal 1: prepare west
ssh root@west
cd /testing/pluto/basic-pluto-01
sh ./westinit.sh
Terminal 2: prepare east
ssh root@east
cd /testing/pluto/basic-pluto-01
sh ./eastinit.sh
Terminal 3: gdb
This assumes that initialization worked and pluto hasn't yet crashed. Pick the side you wish to gdb, ssh in, and start gdb:
ssh root@eastORwest
gdb -p `pidof pluto`
gdb> cont
If pluto wasn't running, gdb would complain: --p requires an argument
When pluto crashes, gdb will show that and await commands. For example, the bt command will show a backtrace.
Terminal 1: start the test
sh ./westrun.sh
Diagnosing inside the VM (alternative version)
Once a testrun has completed, the VMs shut down the ipsec subsystem. You can use ssh to login as root on any host (password "swan") and rerun the testcase manually. This gives you a chance to repeat a crasher while using gdb:
ssh root@east
ipsec setup start
pidof pluto
cd /source/OBJ*
gdb programs/pluto/pluto
gdb> attach <pid>
gdb> cont
In another window, prepare west:
ssh root@west
cd /testing/pluto/basic-pluto-01
sh ./westinit.sh
In still another window, you can login to east and re-trigger the failure. You can either use the root command history using the arrow keys to start ipsec and load the right connection, or you can re-run the "eastinit.sh" file:
ssh root@east
cd /testing/pluto/basic-pluto-01
sh ./eastinit.sh
In the west window, you can either continue with running "westrun.sh" or you can look at westrun.sh and issue the commands manually.
Running all test cases
To run all test cases, you need to be able to compile libreswan on the host (not for any good reason, but "make check" runs "make programs" first). You might need to install some build requirements:
sudo yum install flex bison gmp-devel nss-devel nspr-devel openldap-devel curl-devel pam-devel unbound-devel fipscheck-devel libcap-ng-devel
To run all test cases (which includes compiling and installing libreswan on all VMs, as well as the non-VM based test cases), run:
make check UPDATE=1
Stopping pluto tests gracefully
The tests run for a long time. For example, on one of our machines they currently take 10 hours. If you want to stop a test run between individual pluto tests, you can create a file to indicate this:
echo "" >testing/pluto/stop-tests-now
Be sure to remove the file afterwards.
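To illustrate why the file only takes effect between tests, here is a sketch of a runner loop that checks for a stop file before starting each test. This is not the test suite's actual implementation; run_until_stopped and its arguments are hypothetical.

```shell
# Run each test in turn, stopping before the next one once the
# stop file exists. A test that is already running is never interrupted.
# $1 = stop file, $2 = command to run per test, rest = test names
run_until_stopped() {
    stopfile=$1
    runner=$2
    shift 2
    for t in "$@"; do
        if [ -e "$stopfile" ]; then
            echo "stop requested before $t"
            return 0
        fi
        "$runner" "$t"
    done
}

# Example with a placeholder runner:
#   run_until_stopped testing/pluto/stop-tests-now echo basic-pluto-01 basic-pluto-02
```

Because the check happens only at the top of the loop, a stale stop file would prevent the next run from starting at all, which is why it has to be removed afterwards.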
Updating the VMs
Sometimes you want to update a VM's system or add a package to assist with debugging. This requires an internet connection. While the VMs are completely isolated, the "nic" VM can be configured to give internet access to the machines:
ssh root@nic
ifup eth3
iptables -I POSTROUTING -t nat -o eth3 -j MASQUERADE
route add default gw 192.168.234.1 # may be needed
exit
On the other VMs, change the nameserver entry in /etc/resolv.conf to point to a valid resolver (eg 8.8.8.8 or 193.110.157.123) and the VM will have full internet connectivity.
Do not enable eth3 on "nic" per default, as it will affect the actual test cases that are run.
The /testing/guestbin directory
The guestbin directory contains scripts that are used within the VMs only.
swan-transmogrify
When the VMs were installed, an XML configuration file from testing/libvirt/vm/ was used to configure each VM with the right disks, mounts and nic cards. Each VM mounts the libreswan directory as /source and the libreswan/testing/ directory as /testing . This makes the /testing/guestbin/ directory available on the VMs. At boot, the VMs run /testing/guestbin/swan-transmogrify. This python script compares the nic of eth0 with the list of known MAC addresses from the XML files. By identifying the MAC, it knows which identity (west, east, etc) it should take on. Files are copied from /testing/baseconfigs/ into the VM's /etc directory and the network service is restarted.
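The MAC-based identification can be pictured with a small shell sketch. The real swan-transmogrify is a python script and may work differently; identity_for_mac is a hypothetical helper, and the assumption that each VM's XML file is named after its identity (for example west.xml) is not confirmed by this page.

```shell
# Print the identity whose XML definition mentions the given MAC address.
# $1 = MAC address, $2 = directory of per-VM XML files (assumed <name>.xml)
identity_for_mac() {
    for x in "$2"/*.xml; do
        # Case-insensitive fixed-string match on the MAC address.
        if grep -qiF "$1" "$x" 2>/dev/null; then
            basename "$x" .xml
            return 0
        fi
    done
    return 1
}

# Example inside a guest (on Linux, /sys/class/net/eth0/address holds eth0's MAC):
#   identity_for_mac "$(cat /sys/class/net/eth0/address)" /testing/libvirt/vm
```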
swan-build, swan-install, swan-update
These commands are used to build, install or build+install (update) the libreswan userland and kernel code.
swan-prep
This command is run as the first command of each test case to set up the host. It copies the required files from /testing/baseconfigs/ and the specific test case files onto the VM test machine. It does not start libreswan. That is done in the "init.sh" script.
The swan-prep command takes two options. The --x509 option is required to copy in all the required certificates and update the NSS database. The --46 / --6 option is used to give the host IPv4 and/or IPv6 connectivity. Hosts per default only get IPv4 connectivity as this reduces the noise captured with tcpdump.
fipson and fipsoff
These are used to fake a kernel into FIPS mode, which is required for some of the tests.
Various notes
- Currently, only one test can run at a time.
- You can peek at the guests using virt-manager or you can ssh into the test machines from the host.
- ssh may be slow to prompt for the password. If so, start up the VM "nic".
- Use only one CPU core on the VMs. Multiple CPUs may cause pexpect to mangle output.
- 2014 Mar: DHR needed to do the following to make things work each time he rebooted the host
$ sudo setenforce Permissive
$ ls -ld /var/lib/libvirt/qemu
drwxr-x---. 6 qemu qemu 4096 Mar 14 01:23 /var/lib/libvirt/qemu
$ sudo chmod g+w /var/lib/libvirt/qemu
$ ( cd testing/libvirt/net ; for i in * ; do sudo virsh net-start $i ; done ; )
To improve
- swan-build and swan-install do not stop on a compile/install error and signal it to dotest.py
- install and remove RPM using dotest.py + make rpm support
- add a summarizing script that generates html/json to the git repo
- coredump. It has been a mystery :) systemd or some daemon appears to block coredumps on the Fedora 20 systems.
Sanitizers
- summarize output from tcpdump
- count established IKE, ESP, AH states
- dpd ping sanitizer. DPD tests have unpredictable packet loss for ping.
- look into pluto-testlist-scan.sh.dumb-cert-fragment