Test Suite - KVM: Difference between revisions

From Libreswan
m (install -y)
(firewall)
(41 intermediate revisions by 3 users not shown)
Line 1: Line 1:
== KVM Test framework ==


Libreswan comes with an extensive test suite, written mostly in python, that uses KVM virtual machines and virtual networks. It has replaced the old UML test suite.
Libreswan's test framework can be run using KVM guests, and the kvm scripts. It is strongly recommended to run the test suite on a host machine that has a CPU with virtualisation instructions.
Apart from KVM, the test suite uses libvirtd and qemu. It is strongly recommended to run the test suite natively on the OS (not in a VM itself) on a machine that has a CPU with virtualization instructions.
The PLAN9 filesystem (9p) is used to mount host directories in the guests - NFS is avoided to prevent network lockups when an IPsec test case would cripple the guest's networking.  


{{ ambox | nocat=true | type=important | text = libvirt 0.9.11 and qemu 1.0 or better are required. RHEL does not support a writable 9p filesystem, so the recommended host/guest OS is Fedora }}
To access files on the host file system:


== Test Frameworks ==
* Linux guests (Fedora) use the PLAN9 filesystem (9p)
* BSD guests (FreeBSD, NetBSD, OpenBSD) use NFS via the NAT interface


This page describes the make kvm framework.
For an overview of the tests see [[Test_Suite]]


Instead of using virtual machines, it is possible to use Docker instances.
== Preparing the host machine ==


More information is found in [[Test Suite - Docker]] in this Wiki
=== Enable virtualization in the BIOS ===
 
== Preparing the host machine ==


In the following it is assumed that your account is called "build".
Virtualization needs to be enabled by the BIOS during boot.


=== Add Yourself to sudo ===
=== Add yourself to <tt>sudo</tt> ===


Some of the test scripts need to be run as root.  The test environment assumes this can be done using <tt>sudo</tt> without a password, viz.:
Some of the test scripts need to be run as root.  The test environment assumes this can be done using <tt>sudo</tt> without a password, viz.:


<pre>
sudo pwd
sudo pwd
</pre>


XXX: Surely qemu can be driven without root?
''XXX: Surely qemu can be driven without root?''


This is done by adding a no-password rule to /etc/sudoers.d/.
This is set up by adding an entry under /etc/sudoers.d/ specifying that your account does not need a password to become root:


To set this up, add your account to the wheel group:
echo "$(id -u -n) ALL=(ALL) NOPASSWD: ALL" | sudo dd of=/etc/sudoers.d/$(id -u -n)
 
<pre>
sudo usermod -a -G wheel $(id -u -n)
</pre>
 
and permit wheel to have no-password access:
 
<pre>
echo '%wheel ALL=(ALL) NOPASSWD: ALL' | sudo dd of=/etc/sudoers.d/wheel
sudo chmod ug=r,o= /etc/sudoers.d/wheel
sudo chown root:root /etc/sudoers.d/wheel
</pre>


=== Fight SELinux ===
=== Fight SELinux ===
Line 50: Line 34:
* set SELinux to permissive (recommended)
* set SELinux to permissive (recommended)


<pre>
sudo sed --in-place=.ORIG -e 's/^SELINUX=.*/SELINUX=permissive/' /etc/selinux/config
sudo sed --in-place=.ORIG -e 's/^SELINUX=.*/SELINUX=permissive/' /etc/selinux/config
sudo setenforce Permissive
sudo setenforce Permissive
</pre>


* disable SELinux
* disable SELinux


<pre>
sudo sed --in-place=.ORIG -e 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
sudo sed --in-place=.ORIG -e 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
sudo reboot
sudo reboot
</pre>


* (experimental) label source tree for SELinux
* (experimental) label source tree for SELinux
Line 66: Line 46:
The source tree on the host is shared with the virtual machines.  SELinux considers this a bug unless the tree is labelled with type svirt_image_t.
The source tree on the host is shared with the virtual machines.  SELinux considers this a bug unless the tree is labelled with type svirt_image_t.


<pre>
sudo dnf install policycoreutils-python-utils
sudo dnf install policycoreutils-python-utils
sudo semanage fcontext -a -t svirt_image_t "$(pwd)"'(/.*)?'
sudo semanage fcontext -a -t svirt_image_t "$(pwd)"'(/.*)?'
sudo restorecon -vR /home/build/libreswan
sudo restorecon -vR /home/build/libreswan
</pre>


There may be other things that SELinux objects to.
There may be other things that SELinux objects to.
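The labelling commands above take the tree from <tt>$(pwd)</tt> in one place and from a fixed path in another.  A minimal sketch that prints both commands for a single directory, so they cannot drift apart; the helper name <tt>selinux_label_cmds</tt> is hypothetical and it only prints the commands rather than running them:

```shell
# Hypothetical helper (not part of libreswan): print the two labelling
# commands for one source tree so both refer to the same directory.
# usage: selinux_label_cmds DIRECTORY
selinux_label_cmds() {
    tree=$1
    # Register the svirt_image_t label for the tree and everything below it.
    printf 'sudo semanage fcontext -a -t svirt_image_t "%s(/.*)?"\n' "$tree"
    # Apply the registered label to the files already on disk.
    printf 'sudo restorecon -vR "%s"\n' "$tree"
}

selinux_label_cmds /home/build/libreswan
```

Review the printed commands before pasting them into a root shell.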


=== Install Required Dependencies ===
=== Check that the host has enough entropy ===


Now we are ready to install the various components of libvirtd, qemu and kvm and then start the libvirtd service.
As a rough guide run:


==== Fedora ====
while true ; do cat /proc/sys/kernel/random/entropy_avail ; sleep 3 ; done


To get qemu working (while virt-manager isn't strictly required it's useful on a desktop):
It should have values in the hundreds if not thousands.  If it is in the units or tens then see [[Entropy matters]].
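The rough guide can be turned into a one-shot check.  This is a sketch only: the function name <tt>entropy_status</tt> is hypothetical, and the threshold of 100 is an assumption taken from the "hundreds" rule of thumb, not a value libreswan defines.

```shell
# Sketch only: classify an entropy_avail reading.
# The 100 threshold is an assumption based on the rule of thumb
# (units or tens are too low; hundreds or thousands are fine).
# usage: entropy_status VALUE
entropy_status() {
    avail=$1
    if [ "$avail" -lt 100 ]; then
        echo "low"
    else
        echo "ok"
    fi
}

# Report the host's current state when the kernel exposes the counter.
if [ -r /proc/sys/kernel/random/entropy_avail ]; then
    entropy_status "$(cat /proc/sys/kernel/random/entropy_avail)"
fi
```

A single reading is only a snapshot; the polling loop remains the better way to watch how the value behaves over time.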


<pre>
=== Install Dependencies ===
sudo dnf install -y make git
sudo dnf install -y qemu virt-manager virt-install libvirt-daemon-kvm libvirt-daemon-qemu
sudo dnf install -y python3-pexpect
</pre>


so web pages can be generated:
{| class="wikitable"
 
|-
<pre>
! Why || Fedora !! Mint (debian)
sudo dnf install -y jq nodejs-typescript
|-
</pre>
| Basics
|| sudo dnf install -y make git gitk
|| sudo apt-get install -y make make-doc git gitk
|-
| Python
|| sudo dnf install -y python3-pexpect
|| sudo apt-get install python3-pexpect
|-
| Virtualization
|| sudo dnf install -y qemu virt-install libvirt-daemon-kvm libvirt-daemon-qemu
|| sudo apt install -y qemu virtinst libvirt-clients libvirt-daemon libvirt-daemon-system libvirt-daemon-driver-qemu libosinfo-query qemu-system-x86?
|-
| Boot CDs
|| sudo dnf install -y dvd+rw-tools
|| sudo apt-get install -y dvd+rw-tools
|-
| Web pages
|| sudo dnf install -y jq nodejs-typescript
|| sudo apt-get install -y jq node-typescript
|-
| NFS
|| ???
|| sudo apt-get install -y nfs-kernel-server rpcbind
|}


?why?
=== Enable libvirt ===


<pre>
''If you're switching from the old libvirtd see https://libvirt.org/daemons.html#switching-to-modular-daemons for how to shut down the old daemons.''
sudo dnf install -y bind-dnssec-utils
</pre>


Once all is installed start libvirtd and then check it is running:
Start the "collection of modular daemons that replace functionality previously provided by the monolithic libvirtd daemon":


<pre>
for drv in qemu network nodedev nwfilter secret storage interface
sudo systemctl enable libvirtd
do
sudo systemctl start libvirtd
    sudo systemctl unmask virt${drv}d.service
sudo systemctl status libvirtd
    sudo systemctl unmask virt${drv}d{,-ro,-admin}.socket
</pre>
    sudo systemctl enable virt${drv}d.service
    sudo systemctl enable virt${drv}d{,-ro,-admin}.socket
done
for drv in qemu network nodedev nwfilter secret storage
do
    sudo systemctl start virt${drv}d{,-ro,-admin}.socket
done


There should be no errors and warnings.
There should be no errors and warnings.


On testing and F29, this failed with the error:
This code has bugs:


<pre>
* [https://bugzilla.redhat.com/show_bug.cgi?id=2075736| error: Disconnected from qemu:///system due to keepalive timeout]
error : virQEMUCapsNewForBinaryInternal:4664 : internal error: Failed to probe QEMU binary with QMP: /usr/bin/qemu-system-xtensa: error while loading shared libraries: libbrlapi.so.0.6: cannot open shared object file: No such file or directory
: work-around is to stop the daemons shutting down when idle:
</pre>
: $ grep ARGS /etc/sysconfig/*virt*
 
: /etc/sysconfig/virtnetworkd:VIRTNETWORKD_ARGS=""
and it was found that 'brlapi' needed to be manually installed.
:  /etc/sysconfig/virtqemud:VIRTQEMUD_ARGS=""
 
:  /etc/sysconfig/virtstoraged:VIRTSTORAGED_ARGS=""
==== Debian ====
* [https://bugzilla.redhat.com/show_bug.cgi?id=2111582| libvirtd deadlocks] (fixed)
 
: no work-around
Anyone?
* [https://bugzilla.redhat.com/show_bug.cgi?id=2123828| virtqemud gets slower and slower]
: work-around is to run <tt>sudo systemctl restart virtqemud</tt> when things get slow


{{ ambox | nocat=true | type=important | text = do not install strongswan-libipsec because you won't be able to run non-NAT strongswan tests! }}
=== Add yourself to the KVM/QEMU group ===


=== Setting Users and Groups ===
You need to add yourself to the group that QEMU/KVM uses when writing to /var/lib/libvirt/qemu.  On Fedora it is 'qemu', and on Debian it is 'kvm'.  Something like:


You need to add yourself to the qemu group.  For instance:
sudo usermod -a -G $(stat --format %G /var/lib/libvirt/qemu) $(id -u -n)


<pre>
After this you will need to re-login (or run <tt>sudo su - $(id -u -n)</tt>).
sudo usermod -a -G qemu $(id -u -n)
</pre>
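Because the new group only takes effect after logging in again, it can help to confirm that the current session really has it.  A minimal sketch, assuming a hypothetical helper <tt>has_group</tt> that checks a group name against a list of names:

```shell
# Hypothetical helper: succeed when GROUP appears among the given NAMEs.
# usage: has_group GROUP NAME...
has_group() {
    want=$1
    shift
    for g in "$@"; do
        [ "$g" = "$want" ] && return 0
    done
    return 1
}

# Typical use, taking the group from the directory's ownership:
#   has_group "$(stat --format %G /var/lib/libvirt/qemu)" $(id -nG) \
#       && echo "group active in this session"
```

Run without a user argument, <tt>id -nG</tt> reports the groups of the current process, so the check fails until the re-login has happened.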


You will need to re-login for this to take effect.
=== Make certain that <tt>root</tt> can access the build ===


The path to your build needs to be accessible (executable) by root:
The path to your build needs to be accessible (executable) by root, assuming things are under home:


<pre>
chmod a+x $HOME
chmod a+x ~
</pre>


=== Fix /var/lib/libvirt/qemu ===
=== Fix /var/lib/libvirt/qemu ===
Line 144: Line 142:
{{ ambox | nocat=true | type=important | text = Because our VMs don't run as qemu, /var/lib/libvirt/qemu needs to be changed using chmod g+w to make it writable for the qemu group. This needs to be repeated if the libvirtd package is updated on the system }}
{{ ambox | nocat=true | type=important | text = Because our VMs don't run as qemu, /var/lib/libvirt/qemu needs to be changed using chmod g+w to make it writable for the qemu group. This needs to be repeated if the libvirtd package is updated on the system }}


<pre>
sudo chmod g+w /var/lib/libvirt/qemu
sudo chmod g+w /var/lib/libvirt/qemu
 
</pre>
Arguably we should run libvirt as a normal user instead.


=== Create /etc/modules-load.d/virtio.conf ===
=== Create /etc/modules-load.d/virtio.conf (obsolete since 2022 at least) ===


Several virtio modules need to be loaded into the host's kernel.  This could be done by modprobe ahead of running any virtual machines but it is easier to install them whenever the host boots.  This is arranged by listing the modules in a file within /etc/modules-load.d.  The host must be rebooted for this to take effect.
Several virtio modules need to be loaded into the host's kernel.  This could be done by modprobe ahead of running any virtual machines but it is easier to install them whenever the host boots.  This is arranged by listing the modules in a file within /etc/modules-load.d.  The host must be rebooted for this to take effect.


<pre>
sudo dd <<EOF of=/etc/modules-load.d/virtio.conf
sudo dd <<EOF of=/etc/modules-load.d/virtio.conf
virtio_blk
virtio_blk
virtio-rng
virtio-rng
virtio_console
virtio_console
virtio_net
virtio_net
virtio_scsi
virtio_scsi
virtio
virtio
virtio_balloon
virtio_balloon
virtio_input
virtio_input
virtio_pci
virtio_pci
virtio_ring
virtio_ring
9pnet_virtio
9pnet_virtio
EOF
EOF
</pre>


As of Fedora 28, several of these modules are now built into the kernel and will not show up in /proc/modules (virtio, virtio_rng, virtio_pci, virtio_ring).
As of Fedora 28, several of these modules are built into the kernel and will not show up in /proc/modules (virtio, virtio_rng, virtio_pci, virtio_ring).


=== Ensure that the host has enough entropy ===
=== Debian ===


[[Entropy matters]]
On Debian slack based systems (i.e., Linux Mint 20.3), the default python is too old.  Fortunately python 3.9 is also available vis:


With KVM, a guest system uses entropy from the host through the kernel module "virtio_rng" in the guest's kernel (set above).  This has advantages:
sudo apt-get install python3.9
  echo KVM_PYTHON=python3.9 >> Makefile.inc.local


* entropy only needs to be gathered on one machine (the host) rather than all machines (the host and the guests)
=== BSD ===
* the host is in the Real World and thus has more sources of real entropy
* any hacking to make entropy available need only be done on one machine


To ensure the host has enough randomness, run either jitterentropy-rngd or haveged.
Anyone?
 
Fedora commands for using jitterentropy-rngd (broken on F26, service file specifies /usr/local for path):
<pre>
sudo dnf install jitterentropy-rngd
sudo systemctl enable jitterentropy-rngd
sudo systemctl start jitterentropy-rngd
</pre>
 
Fedora commands for using haveged:
 
<pre>
sudo dnf install haveged
sudo systemctl enable haveged
sudo systemctl start haveged
</pre>


== Download and configure libreswan ==
== Download and configure libreswan ==
Line 203: Line 183:
The libreswan source tree includes all the components that are used on the host and inside the test VMs. To get the latest source code using git:
The libreswan source tree includes all the components that are used on the host and inside the test VMs. To get the latest source code using git:


<pre>
git clone https://github.com/libreswan/libreswan
git clone https://github.com/libreswan/libreswan
cd libreswan
cd libreswan
</pre>


=== Create the Pool directory for storing VM disk images - $(KVM_POOLDIR) ===
=== Create the Pool directory for storing VM disk images - $(KVM_POOLDIR) ===
Line 214: Line 192:
To change the location of the pool directory, set the KVM_POOLDIR make variable in Makefile.inc.local.  For instance:
To change the location of the pool directory, set the KVM_POOLDIR make variable in Makefile.inc.local.  For instance:


<pre>
$ grep KVM_POOLDIR Makefile.inc.local
$ grep KVM_POOLDIR Makefile.inc.local
KVM_POOLDIR=/home/libreswan/pool
KVM_POOLDIR=/home/libreswan/pool
</pre>
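Settings such as KVM_POOLDIR are often appended to Makefile.inc.local from the command line, which can leave duplicate assignments behind.  A minimal sketch of an idempotent append; the helper name <tt>set_make_var</tt> is hypothetical and not part of libreswan:

```shell
# Hypothetical helper: append NAME=VALUE to FILE unless FILE already
# contains an assignment to NAME.  An existing value is left untouched.
# usage: set_make_var FILE NAME VALUE
set_make_var() {
    file=$1
    name=$2
    value=$3
    if [ -f "$file" ] && grep -q "^$name *=" "$file"; then
        return 0
    fi
    printf '%s=%s\n' "$name" "$value" >> "$file"
}

# e.g.: set_make_var Makefile.inc.local KVM_POOLDIR /home/libreswan/pool
```

To change a value that is already set, edit the file by hand; the sketch deliberately never overwrites.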


=== (optional) Use /tmp/pool (tmpfs) to store test VM disk images - $(KVM_LOCALDIR) ===
=== (optional) Use /tmp/pool (tmpfs) to store test VM disk images - $(KVM_LOCALDIR) ===
Line 223: Line 199:
By default, all disk images are stored in $(KVM_POOLDIR) (see above).  That is both the base VM disk image, and the build VM and test VM disk images.  Since only the base VM image needs long-term storage, $(KVM_LOCALDIR) can be used to specify that the build and test images are stored in /tmp:
By default, all disk images are stored in $(KVM_POOLDIR) (see above).  That is both the base VM disk image, and the build VM and test VM disk images.  Since only the base VM image needs long-term storage, $(KVM_LOCALDIR) can be used to specify that the build and test images are stored in /tmp:


<pre>
$ grep KVM_LOCALDIR Makefile.inc.local
$ grep KVM_LOCALDIR Makefile.inc.local
KVM_LOCALDIR=/tmp/pool
KVM_LOCALDIR=/tmp/pool
</pre>


This has the advantage of eliminating physical disk I/O as a bottleneck when accessing VM disk images, but the disadvantage of needing to re-build the images after a reboot.
This has the advantage of eliminating physical disk I/O as a bottleneck when accessing VM disk images, but the disadvantage of needing to re-build the images after a reboot.


=== (optional) Run tests in parallel - $(KVM_PREFIXES) ===
=== (optional) Run tests in parallel - $(KVM_PREFIXES) ===


By default only one test is run at a time.  This can be changed using KVM_PREFIXES make variable which specifies the prefix to prepend to test domains.  The default value is:
By default only one test is run at a time.  This can be changed using the $(KVM_PREFIXES) make variable.  This provides a list of prefixes to be prepended to test domains, creating multiple test groups.  The default value is:


<pre>
KVM_PREFIXES=''
KVM_PREFIXES=''
</pre>


which creates the domains ''east, ''west, et al. (i.e., after expansion east, west, et al.).
which creates the build domains ''fedora-build, ''netbsd-build, et al., and the test domains ''east, ''west, et al. (i.e., after expansion east, west, et al.).


Multiple tests can be run in parallel by specifying more prefixes - a rule of thumb is one prefix per two CPU cores.  For instance, on a 4-core machine, two prefixes can be specified using:
To run tests in parallel, specify multiple prefixes.  For instance two tests can be run in parallel by specifying:


<pre>
KVM_PREFIXES='' 1.
KVM_PREFIXES='' 1.
</pre>


which creates, after expansion, the domains east, west, et al. and 1.east, 1.west, et al.
This will create the build domains ''fedora-build, ''netbsd-build, et al., and the test domains ''east, ''west, et al., and separately 1.east, 1.west, et al.
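The way prefixes turn into domain names can be sketched in a few lines of shell.  This only illustrates the naming; the helper <tt>expand_domains</tt> is hypothetical and is not how the kvm scripts are implemented:

```shell
# Hypothetical illustration: print one domain name per PREFIX and host.
# The empty prefix ('') yields the unprefixed names.
# usage: expand_domains "HOST..." PREFIX...
expand_domains() {
    hosts=$1
    shift
    for prefix in "$@"; do
        for host in $hosts; do
            printf '%s%s\n' "$prefix" "$host"
        done
    done
}

# Two prefixes, '' and 1., give two independent groups of test domains.
expand_domains "east west" '' 1.
```

Each group of domains can run one test at a time, so the number of prefixes is also the number of tests that run in parallel.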


=== (very optional) Boot VMs in parallel - $(KVM_WORKERS) ===
{{ ambox | nocat=true | type=important | text = TODO: generate $(KVM_PREFIXES) from $(KVM_PREFIX) and $(KVM_WORKERS) so that the build domains are prefixed by $(KVM_PREFIX) and the test domains are prefixed by $(KVM_PREFIX), $(KVM_PREFIX)2, ..., $(KVM_PREFIX)$(KVM_WORKERS); only create the first test domains and then create the rest as runtime snapshots. }}


By default one thread is dedicated to booting VMs.  Since booting a VM is very CPU intensive, trying to boot multiple VMs can quickly bog down the machine, causing tests that are being run in parallel to become so slow that they time out.
=== (optional) Parallel builds - $(KVM_WORKERS) ===


So while not recommended, this can be changed using the make variable KVM_WORKERS:
By default, build domains only have one virtual CPU.  Since building is very CPU intensive, this can be increased using $(KVM_WORKERS).


<pre>
<pre>
KVM_WORKERS=2
KVM_WORKERS=2
</pre>
</pre>
{{ ambox | nocat=true | type=important | text = In the past, because many tests were racy (results were sensitive to CPU load), KVM_WORKERS was used to throttle the number of domains being booted in parallel (booting is very CPU intensive).  That is no longer true.  See notes under KVM_PREFIXES above. }}


=== (optional) Generate a web page of the test results ===
=== (optional) Generate a web page of the test results ===
Line 269: Line 240:
</pre>
</pre>


The files can then be viewed using http://file. To disable web page generation, delete the directory <tt>RESULTS/</tt>.
The files can then be viewed using http://file.
 
To disable web page generation, delete the directory <tt>RESULTS/</tt>.


Alternatively, a web server can be installed and configured:
Alternatively, a web server can be installed and configured:


<pre>
sudo dnf install httpd
sudo dnf install httpd
sudo mkdir /var/www/html/results/
sudo systemctl enable httpd
sudo chown $(id -un) /var/www/html/results/
sudo systemctl start httpd
sudo chmod 755 /var/www/html/results/
sudo mkdir /var/www/html/results/
sudo sh -c 'echo "AddType text/plain .diff" >/etc/httpd/conf.d/diff.conf'
sudo chown $(id -un) /var/www/html/results/
 
sudo chmod 755 /var/www/html/results/
# until next reboot
sudo sh -c 'echo "AddType text/plain .diff" >/etc/httpd/conf.d/diff.conf'
sudo firewall-cmd --add-service=http
</pre>
sudo systemctl start httpd
# make it permanent
sudo systemctl enable httpd      # make it permanent
# firewall rule goes here!


and then $(WEB_SUMMARYDIR) used to specify that the web pages should be published under the server directory:
and then $(WEB_SUMMARYDIR) used to specify that the web pages should be published under the server directory:


<pre>
$ grep WEB_SUMMARYDIR Makefile.inc.local
$ grep WEB_SUMMARYDIR Makefile.inc.local
WEB_SUMMARYDIR=/var/www/html/results
WEB_SUMMARYDIR=/var/www/html/results
</pre>


If you want it to be the main page of the website, you can create the file /var/www/html/index.html containing:
If you want it to be the main page of the website, you can create the file /var/www/html/index.html containing:
Line 305: Line 280:
== Running the testsuite ==
== Running the testsuite ==


=== <tt>make kvm-install</tt> - build (update) and install libreswan ===
In the past, the testsuite was driven using <tt>make kvm-...</tt> commands.  That's largely been replaced by the top-level wrapper script <tt>./kvm</tt> which has several advantages over make:


To build the VMs, and build (update) and install libreswan, use:
* it is tab (file name) completion friendly
* it is shell script friendly


<pre>
=== For the impatient: <tt>./kvm install check</tt> ===
make kvm-install
</pre>


To force a scratch build (re-constructing build and test domains, re-generating the certificates, and build/install libreswan from scratch), use:
To build the VMs, and build and install (or update) libreswan, and then run the tests, use:


<pre>
./kvm install check
make kvm-clean
make kvm-install
</pre>


To also force an OS upgrade of the domains, use:
=== Setting up <tt>./kvm</tt> (tab completion) ===


<pre>
If this:
make kvm-purge
make kvm-install
</pre>


(make kvm-demolish wipes out everything)
complete -o filenames -C './kvm' ./kvm


=== (optional) Generate the Certificates ===
is added to  <tt>.bashrc</tt> then tab completion with <tt>./kvm</tt> will include both commands and directories.


The full testsuite requires a number of certificates.  If not present, then <tt>make kvm-test</tt> (see below) will automatically generate them using the domain <tt>build</tt>.
=== Running the testsuite ===


Just note that the certificates have a limited lifetime.
; ./kvm install
: update the KVMs ready for a new test run


Should the test system detect out-of-date certificates then <tt>make kvm-test</tt> will barf.  When this happens, the commands:
; ./kvm check
: run the testsuite, previous results are saved in <tt>BACKUP/-date-</tt>


<pre>
; ./kvm recheck
make kvm-keys-clean
: run the testsuite, but skip tests that already passed
make kvm-keys
</pre>


can be used to force the generation of new certificates.
; ./kvm results
: list the results from the test run


=== <tt>make kvm-test</tt> - run the testsuite ===
; ./kvm diffs
: display differences between the test results and the expected results, exit non-zero if there are any


To (re)run all test cases, use:
the operations can be combined on a single line:


<pre>
./kvm install check recheck diff
make kvm-test
</pre>


(when a test is re-run the previous results are stored in the directory <tt>BACKUP/</tt>).
and individual tests can be selected (see Running a Single Test, below):


To just run tests that previously failed:
./kvm install check diff testing/pluto/*ikev2*
 
<pre>
make kvm-retest
</pre>


And to run a select group of tests, either:
To stop <tt>./kvm</tt> use control-c.


<pre>
=== Updating Certificates ===
make kvm-test KVM_TESTS+=testing/pluto/basic-pluto-01/
</pre>


or:
The full testsuite requires a number of certificates.  If not present, then <tt>./kvm check</tt> will automatically generate them using the domain <tt>build</tt>.  Just note that the certificates have a limited lifetime.  Should the test system detect out-of-date certificates then <tt>./kvm check</tt> will barf.


<pre>
To rebuild the certificates:
./testing/utils/kvmtest.py testing/pluto/basic-pluto-01
</pre>


; ./kvm keys


=== <tt>make kvm-diffs</tt> -- inspect (and update) the test results ===
can be used to force the generation of new certificates.


See kvmresults.py, the following make targets are useful (they can be run while the testsuite is still running):
=== Cleaning up (and general maintenance) ===


<pre>
; ./kvm check-clean
make kvm-diffs
: delete the test results
make kvm-results
</pre>


in addition, test runs can be limited to just the test files that have been modified (but not committed) using:
; ./kvm uninstall
: delete the KVM build and test domains (but don't touch the build tree or test results)


<pre>
; ./kvm clean
kvm-modified
: delete the test results, the KVM build and test domains, the build tree, and the certificates
kvm-modified-check
kvm-modified-recheck
kvm-modified-results
kvm-modified-diffs
</pre>


=== Stopping pluto tests (gracefully) ===
; ./kvm purge
: also delete the test networks (is purge still useful?)


Type control-C; it will eventually stop (but may need to wait for all threads to become idle).  If you grow impatient, just type control-C again.
; ./kvm demolish
: also delete the KVM base domain that was used to create the other domains


To determine if the testsuite is running on a remote machine use:
; ./kvm upgrade
: delete all KVM build and test domains, and then upgrade and transmogrify the base domain ready for a fresh install


<pre>
; ./kvm transmogrify
make kvm-status
: run a fresh transmogrify on the base domain (the base domain is reverted to before the last transmogrify)
</pre>
 
the running test suite can then be killed using:
 
<pre>
make kvm-kill
</pre>


; ./kvm downgrade
: revert the base domain back to before it was upgraded (useful when debugging upgrade and transmogrify)


== Shell and Console Access (Logging In) ==
== Shell and Console Access (Logging In) ==
Line 418: Line 371:
* while SSH takes more to set up, it supports things like proper terminal configuration and file copy
* while SSH takes more to set up, it supports things like proper terminal configuration and file copy


=== Serial Console access using "make kvmsh-HOST" (kvmsh.py) ===
=== Serial Console access using <tt>./kvm sh HOST</tt> (kvmsh.py) ===
 
"kvmsh", is a wrapper around "virsh".  It automatically handles things like booting the machine, logging in, and correctly configuring the terminal:


<pre>
<tt>./kvm sh HOST</tt> is a wrapper around "virsh" that automatically handles things like booting the machine, logging in, and correctly configuring the terminal. Its big advantage is that it always works. For instance:
$ ./testing/utils/kvmsh.py east
[...]
Escape character is ^]
[root@east ~]# printenv TERM
xterm
[root@east ~]# stty -a
...; rows 52; columns 185; ...
[root@east ~]#
</pre>


"kvmsh.py" can also be used to script remote commands (for instance, it is used to run "make" on the build domain):
$ ./testing/utils/kvmsh.py east
[...]
Escape character is ^]
[root@east ~]# printenv TERM
xterm
[root@east ~]# stty -a
...; rows 52; columns 185; ...
[root@east ~]#


<pre>
The script "kvmsh.py" can also be used directly to invoke commands on a guest (this is how <tt>./kvm install</tt> works):
$ ./testing/utils/kvmsh.py east ls
[root@east ~]# ls
anaconda-ks.cfg
</pre>


Finally, "make kvmsh-HOST" provides a short cut for the above; and if you're using multiple build trees (see further down), it will connect to the DOMAIN that corresponds to HOST.  For instance, notice how the domain "a.east" is passed to kvmsh.py in the below:
$ ./testing/utils/kvmsh.py east ls
  [root@east ~]# ls
anaconda-ks.cfg


<pre>
When $(KVM_PREFIXES) contains multiple prefixes, <tt>./kvm sh east</tt> always logs into the first prefix's domain.
$ make kvmsh-east
/home/libreswan/pools/testing/utils/kvmsh.py --output ++compile-log.txt --chdir . a.east
Escape character is ^]
[root@east source]#
</pre>


Limitations:
Limitations:


* no file transfer but files can be accessed via /testing
* no file transfer but files can be accessed via <tt>/pool</tt> and <tt>/testing</tt>


=== Graphical Console access using virt-manager ===
=== Graphical Console access using virt-manager ===
Line 460: Line 401:


While easy to use, it doesn't support cut/paste or mechanisms for copying files.
While easy to use, it doesn't support cut/paste or mechanisms for copying files.


=== Shell access using SSH ===
=== Shell access using SSH ===


While requiring slightly more effort to set up, it provides full shell access to the domains.
While requiring more effort to set up, it provides full shell access to the domains.


Since you will be using ssh a lot to login to these machines, it is recommended to either put their names in /etc/hosts:
Since you will be using ssh a lot to login to these machines, it is recommended to either put their names in /etc/hosts:


<pre>
# /etc/hosts entries for libreswan test suite
# /etc/hosts entries for libreswan test suite
192.1.2.45 west
192.1.2.45 west
192.1.2.23 east
192.1.2.23 east
192.0.3.254 north
192.0.3.254 north
192.1.3.209 road
192.1.3.209 road
192.1.2.254 nic
192.1.2.254 nic
</pre>


or add entries to .ssh/config such as:
or add entries to .ssh/config such as:


<pre>
Host west
Host west
         Hostname 192.1.2.45
         Hostname 192.1.2.45
</pre>


If you wish to be able to ssh into all the VMs created without using a password, add your ssh public key to '''testing/baseconfigs/all/etc/ssh/authorized_keys'''. This file is installed as /root/.ssh/authorized_keys on all VMs
If you wish to be able to ssh into all the VMs created without using a password, add your ssh public key to '''testing/baseconfigs/all/etc/ssh/authorized_keys'''. This file is installed as /root/.ssh/authorized_keys on all VMs


Using ssh becomes easier if you are running ssh-agent (you probably are) and your public key is known to the virtual machine.  This command, run on the host, installs your public key on the root account of the guest machine west.  This assumes that west is up (it might not be, but you can put this off until you actually need ssh, at which time the machine would need to be up anyway).  Remember that the root password on each guest machine is "swan".
Using ssh becomes easier if you are running ssh-agent (you probably are) and your public key is known to the virtual machine.  This command, run on the host, installs your public key on the root account of the guest machine west.  This assumes that west is up (it might not be, but you can put this off until you actually need ssh, at which time the machine would need to be up anyway).  Remember that the root password on each guest machine is "swan".
<pre>
 
ssh-copy-id root@west
ssh-copy-id root@west
</pre>
 
You can use ssh-copy-id for any VM.  Unfortunately, the key is forgotten when the VM is restarted.
You can use ssh-copy-id for any VM.  Unfortunately, the key is forgotten when the VM is restarted.


== How tests work ==
Limitations:
 
* this only works with the default east, et.al. (it does not work with KVM_PREFIXES and/or multiple test directories)
 
== kvm workflows ==
 
(seeing as everyone has a "flow", why not kvm) here are some common workflows, the following commands are used:
 
; ./kvm modified
: list the test directories that have been modified
; ./kvm baseline
: compare test results against a baseline
; ./kvm patch
: update the expected test results
; ./kvm add
: <tt>git add</tt> the modified test results
; ./kvm status
: show the status of the currently running testsuite
; ./kvm kill
: kill the currently running testsuite
 
=== Running a single test ===
 
There are two ways to run an individual test:
# the test to run can be specified on the command line:
#: kvm check testing/pluto/basic-pluto-01
# the test is implied when running <tt>kvm</tt> from a test directory:
#: cd testing/pluto/basic-pluto-01
#: ../../../kvm
 
But there's a catch!  The behaviour is different to a normal test run.
 
When there are multiple tests, as each test finishes:
* pluto is stopped (via post-mortem.sh)
* the domain is shutdown.
This is so that bugs in the shutdown code can be flushed out.
 
However, when there's only one test these steps are skipped:
* pluto is left running (post-mortem.sh is not run)
* the domain is not shutdown
This is so that it is possible to login and look around after the test finishes (but it also means that bugs in shutdown code can be missed).


All the test cases involving VMs are located in the libreswan directory under <tt>testing/pluto/</tt>. The most basic test case is called basic-pluto-01. Each test case consists of a few files:
To override this behaviour, add:
KVMRUNNER_FLAGS += --run-post-mortem
to <tt>Makefile.inc.local</tt>.


* description.txt to explain what this test case actually tests
=== Working on individual tests ===
* ipsec.conf files - for host west this is called west.conf. This can also include configuration files for strongswan or racoon2 for interop testing
* ipsec.secret files - if non-default configurations are used. also uses the host syntax, eg west.secrets, east.secrets.
* An init.sh file for each VM that needs to start (eg westinit.sh, eastinit.sh, etc)
* One run.sh file for the host that is the initiator (eg westrun.sh)
* Known good (sanitized) output for each VM (eg west.console.txt, east.console.txt)
* testparams.sh if there are any non-default test parameters


Once the test run has completed, you will see an OUTPUT/ directory in the test case directory:
The <tt>modified</tt> command can be used to limit the test run to just tests with modified files (according to git):


<pre>
; ./kvm modified install check diff
$ ls OUTPUT/
: install libreswan and then run the testsuite against just the modified tests, and display the differences
east.console.diff east.console.verbose.txt RESULT      west.console.txt          west.pluto.log
; ./kvm modified recheck diff
east.console.txt  east.pluto.log            swan12.pcap west.console.diff  west.console.verbose.txt
: re-run the modified tests that are failing, display differences
</pre>
; ./kvm modified patch add
: update the modified tests applying the latest output and add them to git
 
This workflow comes into its own when updating tests en masse using sed, for instance:
 
sed -i -e 's/PARENT_//' testing/pluto/*/*.console.txt
./kvm modified check
 
=== Checking for regressions ===
 
Start by setting up a base directory.  Give the KVMs unique bN prefixes (only "b1" is needed, but we're in a hurry so add "b2 b3 b4", 4 boot workers, and /tmp/pool for KVM disk images) and kick off a test run:
 
$ git clone https://github.com/libreswan/libreswan base
$ cd base
base$ # base - use bN as the prefix
base$ echo KVM_PREFIXES=b1 b2 b3 b4 >> Makefile.inc.local
base$ echo KVM_WORKERS=4            >> Makefile.inc.local
base$ echo KVM_LOCALDIR=/tmp/pool   >> Makefile.inc.local
base$ mkdir -p ../pool
base$ nohup ./kvm install check &
base$ tail -f nohup.out
 
Next, set up a working directory.  This time the KVMs are given the unique wN prefix, and point KVM_BASELINE back at base:
 
$ git clone https://github.com/libreswan/libreswan work
$ cd work
work$ # work - use wN as the prefix
work$ echo KVM_PREFIXES=w1 w2 w3 w4 >> Makefile.inc.local
work$ echo KVM_WORKERS=4            >> Makefile.inc.local
work$ echo KVM_BASELINE=../base     >> Makefile.inc.local
work$ echo KVM_LOCALDIR=/tmp/pool   >> Makefile.inc.local
work$ mkdir -p ../pool
 
Work then progresses in the work directory and, when ready, the test run is started (here in the background):
 
work$ ed programs/pluto/plutomain.c
/static bool selftest_only = false/ s/false/true/
w
q
work$ gmake && nohup ./kvm install check &
 
as the tests progress, the results can be monitored:
 
work$ ./kvm baseline results
testing/pluto/basic-pluto-01 failed east:baseline-passed,output-different west:baseline-passed,output-different
...
  work$ ./kvm baseline diffs testing/pluto/basic-pluto-01
+whack: Pluto is not running (no "/run/pluto/pluto.ctl")
 
The test run can then be aborted, the problem fixed and tested, and the test run restarted:
 
work$ ./kvm kill
  work$ git checkout -- programs/pluto/plutomain.c
work$ ./kvm install check diff testing/pluto/basic-pluto-01
work$ nohup ./kvm recheck &
 
The output can be fine-tuned using baseline-failed (show differences when the baseline failed, ignoring passed and unresolved) and baseline-passed (show differences when the baseline passed, ignoring failed and unresolved).
 
To override the KVM_BASELINE make variable, use <tt>--baseline DIRECTORY</tt>
 
=== Tracking down regressions (using git bisect) ===
 
Let's assume that the test <tt>basic-pluto-01</tt>, which was working, is now failing.
 
==== The easy way ====
 
This workflow works best when the regression is recent (i.e., the last few commits) and nothing significant has happened in the meantime (for instance, os upgrade, test rename, ...).
 
The command <tt>./kvm install check diff</tt> exits with <tt>git bisect</tt> friendly status codes, which means it can be combined with <tt>git bisect run</tt> to automate regression testing.
 
For instance:
 
git bisect start main ^<suspect-commit>
git bisect run ./kvm install check diff testing/pluto/basic-pluto-01
git bisect visualize
# finally
git bisect reset
 
==== The hard way ====
 
This workflow works best when trying to track down a regression in an older version of libreswan.
 
Two repositories are used:
 
#  <tt>repo-under-test</tt>
#: this contains the sources that will be built and installed into the test domains; it is what git bisect will manipulate
# <tt>testbench</tt>
#: this contains the test scripts used to test <tt>repo-under-test</tt>
 
Start by checking out the two repositories (existing repositories can also be used, carefully):
 
git clone ... /home/repo-under-test
git clone ... /home/testbench
 
Next, add the following to <tt>/home/testbench/Makefile.inc.local</tt> so that the <tt>/source</tt> directory used by <tt>testbench</tt> is pointing at <tt>repo-under-test</tt>:
 
# repo-under-test's sources are built
KVM_SOURCEDIR=/home/repo-under-test
# testbench's testing directory is used
#KVM_TESTINGDIR=/home/testbench/testing
 
Next, (re-)transmogrify the <tt>testbench</tt> so that, within the domains, <tt>/source</tt> points at <tt>repo-under-test</tt>:
 
cd /home/testbench
  ./kvm transmogrify
 
Finally run the tests:
 
cd /home/testbench
git -C /home/repo-under-test bisect start main ^<suspect-commit>
./kvm install check diff testing/pluto/basic-pluto-01
# based on output, pick one:
git -C /home/repo-under-test bisect {good,bad}
# might work: git -C /home/repo-under-test bisect run sh -c 'cd ../testbench && ./kvm install check diff testing/pluto/basic-pluto-01'
git -C /home/repo-under-test bisect visualize
# finally
git -C /home/repo-under-test bisect reset
 
KVM_TESTINGDIR can also be pointed at <tt>repo-under-test</tt>.
 
=== Controlling a test run remotely ===
 
Start the testsuite in the background:
 
nohup ./kvm install check &
 
To determine if the testsuite is still running:
 
./kvm status


* RESULT is a text file (whose format is sure to change in the next few months) stating whether the test succeeded or failed.
and to stop the running testsuite:
* The diff files show the differences between this testrun and the last known good output.
* Each VM's serial (sanitized) console log  (eg west.console.txt)
* Each VM's unsanitized verbose console output (eg west.console.verbose.txt)
* A network capture from the bridge device (eg swan12.pcap)
* Each VM's pluto log, created with plutodebug=all (eg west.pluto.log)
* Any core dumps generated if a pluto daemon crashed


== Debugging inside the VM ==
./kvm kill


=== Debugging pluto on east ===
=== Debugging inside the VM (pluto on east) ===


Terminal 1 - east: log into east, start pluto, and attach gdb


<pre>
./kvm sh east
make kvmsh-east
east# cd /testing/pluto/basic-pluto-01
east# sh -x ./eastinit.sh
east# gdb /usr/local/libexec/ipsec/pluto $(pidof pluto)
(gdb) c
 
</pre>
If pluto isn't running then gdb will complain with: ''<code>--p requires an argument</code>''


Terminal 2 - west: log into west, start pluto and the test


<pre>
./kvm sh west
make kvmsh-west
west# sh -x ./westinit.sh ; sh -x westrun.sh
</pre>
If pluto wasn't running, gdb would complain: ''<code>--p requires an argument</code>''


When pluto crashes, gdb will show that and await commands.  For example, the bt command will show a backtrace.
When pluto crashes, gdb will show that and await commands.  For example, the <tt>bt</tt> command will show a backtrace.


=== Debugging pluto on west ===
TODO:


See above, but also use virt as a terminal.
* stop watchdog eventually killing pluto
* notes for west


=== /root/.gdbinit ===
=== Installing a custom Fedora kernel ===


If you want to get rid of the warning "warning: File "/testing/pluto/ikev2-dpd-01/.gdbinit" auto-loading has been declined by your `auto-load safe-path'"
Assuming the kernel RPMs are in the directory <tt>$(KVM_POOLDIR)/kernel-ipsec/</tt> say, add the following to <tt>Makefile.inc.local</tt>:
 
KVM_FEDORA_KERNEL_RPMDIR = /pool/kernel-ipsec/
<pre>
KVM_FEDORA_KERNEL_ARCH = x86_64
echo "set auto-load safe-path /" >> /root/.gdbinit
KVM_FEDORA_KERNEL_VERSION = -5.18.7-100.aiven_ipsec.fc35.$(KVM_FEDORA_KERNEL_ARCH).rpm
</pre>
and then run:
./kvm upgrade-fedora
(should, like for NetBSD do this during transmogrify?)


== Network Diagram (out-of-date) ==
=== Installing a custom NetBSD kernel ===


[[File:testnet.png]]
Copy the kernel to:
$(KVM_POOLDIR)/$(KVM_PREFIX)netbsd-kernel
and then run:
./kvm transmogrify-netbsd

Revision as of 20:46, 6 September 2022

KVM Test framework

Libreswan's test framework can be run using KVM guests and the kvm scripts. It is strongly recommended to run the test suite on a host machine that has a CPU with virtualisation instructions.

To access files on the host file system:

  • Linux guests (Fedora) use the PLAN9 filesystem (9p)
  • BSD guests (FreeBSD, NetBSD, OpenBSD) use NFS via the NAT interface

For an overview of the tests see Test_Suite

Preparing the host machine

Enable virtualization in the BIOS

Virtualization needs to be enabled by the BIOS during boot.

Add yourself to sudo

Some of the test scripts need to be run as root. The test environment assumes this can be done using sudo without a password, viz.:

sudo pwd

XXX: Surely qemu can be driven without root?

This is set up by adding an entry under /etc/sudoers.d/ specifying that your account does not need a password to become root:

echo "$(id -u -n) ALL=(ALL) NOPASSWD: ALL" | sudo dd of=/etc/sudoers.d/$(id -u -n)

Fight SELinux

SELinux blocks some actions that we need. We have not created any SELinux rules to avoid this. The options are:

  • set SELinux to permissive (recommended)
sudo sed --in-place=.ORIG -e 's/^SELINUX=.*/SELINUX=permissive/' /etc/selinux/config
sudo setenforce Permissive
  • disable SELinux
sudo sed --in-place=.ORIG -e 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
sudo reboot
  • (experimental) label source tree for SELinux

The source tree on the host is shared with the virtual machines. SELinux considers this a bug unless the tree is labelled with type svirt_image_t.

sudo dnf install policycoreutils-python-utils
sudo semanage fcontext -a -t svirt_image_t "$(pwd)"'(/.*)?'
sudo restorecon -vR "$(pwd)"

There may be other things that SELinux objects to.

Check that the host has enough entropy

As a rough guide run:

while true ; do cat /proc/sys/kernel/random/entropy_avail ; sleep 3 ; done

It should show values in the hundreds, if not thousands. If it is in the units or tens then see Entropy matters
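
The rough check above can be wrapped in a small shell function. This is only a sketch: the name entropy_level and the cut-off of 100 are assumptions chosen to match the "hundreds" guideline, and are not defined anywhere in libreswan.

```shell
#!/bin/sh
# Hypothetical helper (not part of libreswan): classify an entropy reading.
# The threshold of 100 is an assumption based on the "hundreds" guideline.
entropy_level() {
    if [ "$1" -ge 100 ]; then
        echo ok
    else
        echo low
    fi
}

# On Linux, classify the kernel's current estimate when it is readable.
if [ -r /proc/sys/kernel/random/entropy_avail ]; then
    entropy_level "$(cat /proc/sys/kernel/random/entropy_avail)"
fi
```

A result of "low" suggests reading Entropy matters before starting a test run.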

Install Dependencies

Why | Fedora | Mint (debian)
Basics | sudo dnf install -y make git gitk | sudo apt-get install -y make make-doc git gitk
Python | sudo dnf install -y python3-pexpect | sudo apt-get install python3-pexpect
Virtualization | sudo dnf install -y qemu virt-install libvirt-daemon-kvm libvirt-daemon-qemu | sudo apt install -y qemu virtinst libvirt-clients libvirt-daemon libvirt-daemon-system libvirt-daemon-driver-qemu libosinfo-query qemu-system-x86?
Boot CDs | sudo dnf install -y dvd+rw-tools | sudo apt-get install -y dvd+rw-tools
Web pages | sudo dnf install -y jq nodejs-typescript | sudo apt-get install -y jq node-typescript
NFS | ??? | sudo apt-get install -y nfs-kernel-server rpcbind

Enable libvirt

If you're switching from the old libvirtd see https://libvirt.org/daemons.html#switching-to-modular-daemons for how to shut down the old daemons.

Start the "collection of modular daemons that replace functionality previously provided by the monolithic libvirtd daemon":

for drv in qemu network nodedev nwfilter secret storage interface
do
   sudo systemctl unmask virt${drv}d.service
   sudo systemctl unmask virt${drv}d{,-ro,-admin}.socket
   sudo systemctl enable virt${drv}d.service
   sudo systemctl enable virt${drv}d{,-ro,-admin}.socket
done
for drv in qemu network nodedev nwfilter secret storage
do
   sudo systemctl start virt${drv}d{,-ro,-admin}.socket
done

There should be no errors and warnings.

This code has bugs:

work-around is to stop the daemons shutting down when idle:
$ grep ARGS /etc/sysconfig/*virt*
/etc/sysconfig/virtnetworkd:VIRTNETWORKD_ARGS=""
/etc/sysconfig/virtqemud:VIRTQEMUD_ARGS=""
/etc/sysconfig/virtstoraged:VIRTSTORAGED_ARGS=""
no work-around
work-around is to run sudo systemctl restart virtqemud when things get slow

Add yourself to the KVM/QEMU group

You need to add yourself to the group that QEMU/KVM uses when writing to /var/lib/libvirt/qemu. On Fedora it is 'qemu', and on Debian it is 'kvm'. Something like:

sudo usermod -a -G $(stat --format %G /var/lib/libvirt/qemu) $(id -u -n)

After this you will need to log in again (or run sudo su - $(id -u -n)).

Make certain that root can access the build

The path to your build needs to be accessible (executable) by root, assuming things are under home:

chmod a+x $HOME

Fix /var/lib/libvirt/qemu

sudo chmod g+w /var/lib/libvirt/qemu

Arguably we should run libvirt as a normal user instead.

Create /etc/modules-load.d/virtio.conf (obsolete since 2022 at least)

Several virtio modules need to be loaded into the host's kernel. This could be done by modprobe ahead of running any virtual machines but it is easier to install them whenever the host boots. This is arranged by listing the modules in a file within /etc/modules-load.d. The host must be rebooted for this to take effect.

sudo dd <<EOF of=/etc/modules-load.d/virtio.conf
virtio_blk
virtio-rng
virtio_console
virtio_net
virtio_scsi
virtio
virtio_balloon
virtio_input
virtio_pci
virtio_ring
9pnet_virtio
EOF

As of Fedora 28, several of these modules are built into the kernel and will not show up in /proc/modules (virtio, virtio_rng, virtio_pci, virtio_ring).

Debian

On Debian-based systems (e.g., Linux Mint 20.3), the default python is too old. Fortunately python 3.9 is also available, viz.:

sudo apt-get install python3.9
echo KVM_PYTHON=python3.9 >> Makefile.inc.local

BSD

Anyone?

Download and configure libreswan

Fetch Libreswan

The libreswan source tree includes all the components that are used on the host and inside the test VMs. To get the latest source code using git:

git clone https://github.com/libreswan/libreswan
cd libreswan

Create the Pool directory for storing VM disk images - $(KVM_POOLDIR)

The pool directory is used to store VM disk images and other configuration files. By default $(top_srcdir)/../pool is used (that is, adjacent to your source tree).

To change the location of the pool directory, set the KVM_POOLDIR make variable in Makefile.inc.local. For instance:

$ grep KVM_POOLDIR Makefile.inc.local
KVM_POOLDIR=/home/libreswan/pool

(optional) Use /tmp/pool (tmpfs) to store test VM disk images - $(KVM_LOCALDIR)

By default, all disk images are stored in $(KVM_POOLDIR) (see above). That is, the base VM disk image as well as the build VM and test VM disk images. Since only the base VM image needs long-term storage, $(KVM_LOCALDIR) can be used to specify that the build and test images are stored in /tmp:

$ grep KVM_LOCALDIR Makefile.inc.local
KVM_LOCALDIR=/tmp/pool

This has the advantage of eliminating physical disk I/O as a bottleneck when accessing VM disk images, but the disadvantage of needing to rebuild the images after a reboot.

(optional) Run tests in parallel - $(KVM_PREFIXES)

By default only one test is run at a time. This can be changed using the $(KVM_PREFIXES) make variable, which provides a list of prefixes to be prepended to the test domain names, creating multiple test groups. The default value is:

KVM_PREFIXES=

which creates the build domains fedora-build, netbsd-build, et.al., and the test domains east, west, et.al. (i.e., after expansion east, west, et.al.).

To run tests in parallel, specify multiple prefixes. For instance two tests can be run in parallel by specifying:

KVM_PREFIXES= 1.

This will create the build domains fedora-build, netbsd-build, et.al., and the test domains east, west, et.al., and separately 1.east, 1.west, et.al.
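
The naming scheme can be illustrated with a few lines of shell. This is only a sketch of how a prefix combines with a host name; the function expand_domains and the second prefix "2." are hypothetical, and the real expansion is performed by the kvm makefiles and scripts, not by this code.

```shell
#!/bin/sh
# Hypothetical illustration only: print the test domain names produced by
# prepending each prefix to each host name.
expand_domains() {
    prefixes=$1   # space-separated list of prefixes, e.g. "1. 2."
    hosts=$2      # space-separated list of host names, e.g. "east west"
    for prefix in $prefixes; do
        for host in $hosts; do
            echo "${prefix}${host}"
        done
    done
}

expand_domains "1. 2." "east west"
```

Each prefix yields one complete group of test domains, and each group can run one test at a time.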

(optional) Parallel builds - $(KVM_WORKERS)

By default, build domains only have one virtual CPU. Since building is very CPU intensive, this can be increased using $(KVM_WORKERS).

 KVM_WORKERS=2

(optional) Generate a web page of the test results

See the nightly test results for an example.

To create the web directory RESULTS/ and populate it with the current test results use:

make web

The files can then be viewed using http://file.

To disable web page generation, delete the directory RESULTS/.

Alternatively, a web server can be installed and configured:

sudo dnf install httpd
sudo mkdir /var/www/html/results/
sudo chown $(id -un) /var/www/html/results/
sudo chmod 755 /var/www/html/results/
sudo sh -c 'echo "AddType text/plain .diff" >/etc/httpd/conf.d/diff.conf'
# until next reboot
sudo firewall-cmd --add-service=http
sudo systemctl start httpd

# make it permanent
sudo systemctl enable httpd
# firewall rule goes here!

and then $(WEB_SUMMARYDIR) is used to specify that the web pages should be published under the server directory:

$ grep WEB_SUMMARYDIR Makefile.inc.local
WEB_SUMMARYDIR=/var/www/html/results

If you want it to be the main page of the website, you can create the file /var/www/html/index.html containing:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
  <head>
    <meta http-equiv="REFRESH" content="0;url=/results/">
  </head>
  <BODY>
  </BODY>
</HTML>

Running the testsuite

In the past, the testsuite was driven using make kvm-... commands. That's largely been replaced by the top-level wrapper script ./kvm which has several advantages over make:

  • it is file-name (tab) completion friendly
  • it is shell script friendly

For the impatient: ./kvm install check

To build the VMs, and build and install (or update) libreswan, and then run the tests, use:

./kvm install check

Setting up ./kvm (tab completion)

If this:

complete -o filenames -C './kvm' ./kvm

is added to .bashrc then tab completion with ./kvm will include both commands and directories.

Running the testsuite

./kvm install
update the KVMs ready for a new test run
./kvm check
run the testsuite, previous results are saved in BACKUP/-date-
./kvm recheck
run the testsuite, but skip tests that already passed
./kvm results
list the results from the test run
./kvm diffs
display differences between the test results and the expected results, exit non-zero if there are any

the operations can be combined on a single line:

./kvm install check recheck diff

and individual tests can be selected (see Running a Single Test, below):

./kvm install check diff testing/pluto/*ikev2*

To stop ./kvm use control-c.

Updating Certificates

The full testsuite requires a number of certificates. If they are not present, ./kvm check will automatically generate them using the build domain. Note that the certificates have a limited lifetime: should the test system detect out-of-date certificates, ./kvm check will fail.

To force the generation of new certificates, use:

./kvm keys

Cleaning up (and general maintenance)

./kvm check-clean
delete the test results
./kvm uninstall
delete the KVM build and test domains (but don't touch the build tree or test results)
./kvm clean
delete the test results, the KVM build and test domains, the build tree, and the certificates
./kvm purge
also delete the test networks (is purge still useful?)
./kvm demolish
also delete the KVM base domain that was used to create the other domains
./kvm upgrade
delete all KVM build and test domains, and then upgrade and transmogrify the base domain ready for a fresh install
./kvm transmogrify
run a fresh transmogrify on the base domain (the base domain is reverted to before the last transmogrify)
./kvm downgrade
revert the base domain back to before it was upgraded (useful when debugging upgrade and transmogrify)

Shell and Console Access (Logging In)

There are several different ways to gain shell access to the domains.

Each method, depending on the situation, has both advantages and disadvantages. For instance:

  • while make kvmsh-host provides quick access to the console, it doesn't support file copy
  • while SSH takes more effort to set up, it supports things like proper terminal configuration and file copy

Serial Console access using ./kvm sh HOST (kvmsh.py)

./kvm sh HOST is a wrapper around "virsh" that automatically handles things like booting the machine, logging in, and correctly configuring the terminal. Its big advantage is that it always works. For instance:

$ ./testing/utils/kvmsh.py east
[...]
Escape character is ^]
[root@east ~]# printenv TERM
xterm
[root@east ~]# stty -a
...; rows 52; columns 185; ... 
[root@east ~]#

The script "kvmsh.py" can also be used directly to invoke commands on a guest (this is how ./kvm install works):

$ ./testing/utils/kvmsh.py east ls
[root@east ~]# ls
anaconda-ks.cfg

When $(KVM_PREFIXES) contains multiple prefixes, ./kvm sh east always logs into the first prefix's domain.

Limitations:

  • no file transfer but files can be accessed via /pool and /testing

Graphical Console access using virt-manager

"virt-manager", a gnome tool can be used to access individual domains.

While easy to use, it doesn't support cut/paste or mechanisms for copying files.

Shell access using SSH

While requiring more effort to set up, it provides full shell access to the domains.

Since you will be using ssh a lot to login to these machines, it is recommended to either put their names in /etc/hosts:

# /etc/hosts entries for libreswan test suite
192.1.2.45 west
192.1.2.23 east
192.0.3.254 north
192.1.3.209 road
192.1.2.254 nic

or add entries to .ssh/config such as:

Host west
       Hostname 192.1.2.45
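
The .ssh/config entries for all the test domains can be generated rather than typed by hand. The sketch below reuses the addresses from the /etc/hosts list above; the function name ssh_config_stanzas is hypothetical, and "User root" is an assumption based on the guests being accessed through the root account.

```shell
#!/bin/sh
# Sketch: emit a .ssh/config stanza for each test domain.
# Addresses are copied from the /etc/hosts entries listed above.
# "User root" is an assumption (the guests are accessed as root).
ssh_config_stanzas() {
    while read -r name address; do
        printf 'Host %s\n\tHostname %s\n\tUser root\n' "$name" "$address"
    done <<EOF
west 192.1.2.45
east 192.1.2.23
north 192.0.3.254
road 192.1.3.209
nic 192.1.2.254
EOF
}

ssh_config_stanzas
```

Review the output before appending it to .ssh/config, since existing Host entries with the same names would take precedence or conflict.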

If you wish to be able to ssh into all the VMs created without using a password, add your ssh public key to testing/baseconfigs/all/etc/ssh/authorized_keys. This file is installed as /root/.ssh/authorized_keys on all VMs

Using ssh becomes easier if you are running ssh-agent (you probably are) and your public key is known to the virtual machine. The command below, run on the host, installs your public key on the root account of the guest machine west. This assumes that west is up (it might not be, but you can put this off until you actually need ssh, at which time the machine would need to be up anyway). Remember that the root password on each guest machine is "swan".

ssh-copy-id root@west

You can use ssh-copy-id for any VM. Unfortunately, the key is forgotten when the VM is restarted.

Limitations:

  • this only works with the default east, et.al. (it does not work with KVM_PREFIXES and/or multiple test directories)

kvm workflows

Here are some common workflows (seeing as everyone has a "flow", why not kvm). The following commands are used:

./kvm modified
list the test directories that have been modified
./kvm baseline
compare test results against a baseline
./kvm patch
update the expected test results
./kvm add
git add the modified test results
./kvm status
show the status of the currently running testsuite
./kvm kill
kill the currently running testsuite

Running a single test

There are two ways to run an individual test:

  1. the test to run can be specified on the command line:
    ./kvm check testing/pluto/basic-pluto-01
  2. the test is implied when running kvm from a test directory:
    cd testing/pluto/basic-pluto-01
    ../../../kvm

But there's a catch! The behaviour is different to a normal test run.

When there are multiple tests, as each test finishes:

  • pluto is stopped (via post-mortem.sh)
  • the domain is shut down.

This is so that bugs in the shutdown code can be flushed out.

However, when there's only one test these steps are skipped:

  • pluto is left running (post-mortem.sh is not run)
  • the domain is not shut down

This is so that it is possible to login and look around after the test finishes (but it also means that bugs in shutdown code can be missed).

To override this behaviour, add:

KVMRUNNER_FLAGS += --run-post-mortem

to Makefile.inc.local.

Working on individual tests

The modified command can be used to limit the test run to just tests with modified files (according to git):

./kvm modified install check diff
install libreswan and then run the testsuite against just the modified tests, and display the differences
./kvm modified recheck diff
re-run the modified tests that are failing, display differences
./kvm modified patch add
update the modified tests applying the latest output and add them to git

This workflow comes into its own when updating tests en masse using sed, for instance:

sed -i -e 's/PARENT_//' testing/pluto/*/*.console.txt
./kvm modified check

Checking for regressions

Start by setting up a base directory. Give the KVMs unique bN prefixes (only "b1" is needed, but we're in a hurry so add "b2 b3 b4", 4 boot workers, and /tmp/pool for KVM disk images) and kick off a test run:

$ git clone https://github.com/libreswan/libreswan base
$ cd base
base$ # base - use bN as the prefix
base$ echo KVM_PREFIXES=b1 b2 b3 b4 >> Makefile.inc.local
base$ echo KVM_WORKERS=4            >> Makefile.inc.local
base$ echo KVM_LOCALDIR=/tmp/pool   >> Makefile.inc.local
base$ mkdir -p ../pool
base$ nohup ./kvm install check &
base$ tail -f nohup.out

Next, set up a working directory. This time the KVMs are given the unique wN prefix, and point KVM_BASELINE back at base:

$ git clone https://github.com/libreswan/libreswan work
$ cd work
work$ # work - use wN as the prefix
work$ echo KVM_PREFIXES=w1 w2 w3 w4 >> Makefile.inc.local
work$ echo KVM_WORKERS=4            >> Makefile.inc.local
work$ echo KVM_BASELINE=../base     >> Makefile.inc.local
work$ echo KVM_LOCALDIR=/tmp/pool   >> Makefile.inc.local
work$ mkdir -p ../pool

Work then progresses in the work directory and, when ready, the test run is started (here in the background):

work$ ed programs/pluto/plutomain.c
/static bool selftest_only = false/ s/false/true/
w
q
work$ gmake && nohup ./kvm install check &

as the tests progress, the results can be monitored:

work$ ./kvm baseline results
testing/pluto/basic-pluto-01 failed east:baseline-passed,output-different west:baseline-passed,output-different
...
work$ ./kvm baseline diffs testing/pluto/basic-pluto-01
+whack: Pluto is not running (no "/run/pluto/pluto.ctl")

The test run can then be aborted, the problem fixed and tested, and the test run restarted:

work$ ./kvm kill
work$ git checkout -- programs/pluto/plutomain.c
work$ ./kvm install check diff testing/pluto/basic-pluto-01
work$ nohup ./kvm recheck &

The output can be fine-tuned using baseline-failed (show differences when the baseline failed, ignoring passed and unresolved) and baseline-passed (show differences when the baseline passed, ignoring failed and unresolved).

To override the KVM_BASELINE make variable, use --baseline DIRECTORY

Tracking down regressions (using git bisect)

Let's assume that the test basic-pluto-01, which was working, is now failing.

The easy way

This workflow works best when the regression is recent (i.e., the last few commits) and nothing significant has happened in the meantime (for instance, os upgrade, test rename, ...).

The command ./kvm install check diff exits with git bisect friendly status codes, which means it can be combined with git bisect run to automate regression testing.

For instance:

git bisect start main ^<suspect-commit>
git bisect run ./kvm install check diff testing/pluto/basic-pluto-01
git bisect visualize
# finally
git bisect reset
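
For reference, git bisect run interprets the exit status of the command it runs as follows: 0 marks the commit good, 125 skips it, any other status from 1 to 127 marks it bad, and anything else aborts the bisection. The sketch below encodes that mapping; the function name bisect_verdict is hypothetical and is not part of ./kvm or git.

```shell
#!/bin/sh
# Hypothetical helper: report how "git bisect run" would treat an exit status.
bisect_verdict() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo good       # commit is marked good
    elif [ "$status" -eq 125 ]; then
        echo skip       # commit cannot be tested and is skipped
    elif [ "$status" -ge 1 ] && [ "$status" -le 127 ]; then
        echo bad        # commit is marked bad
    else
        echo abort      # bisection stops
    fi
}

# Example: classify the status of the last command run by hand.
false
bisect_verdict $?
```

This can be handy when running the test command manually at a commit to see which verdict an automated run would record.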

The hard way

This workflow works best when trying to track down a regression in an older version of libreswan.

Two repositories are used:

  1. repo-under-test
    this contains the sources that will be built and installed into the test domains; it is what git bisect will manipulate
  2. testbench
    this contains the test scripts used to test repo-under-test

Start by checking out the two repositories (existing repositories can also be used, carefully):

git clone ... /home/repo-under-test
git clone ... /home/testbench

Next, add the following to /home/testbench/Makefile.inc.local so that the /source directory used by testbench is pointing at repo-under-test:

# repo-under-test's sources are built
KVM_SOURCEDIR=/home/repo-under-test
# testbench's testing directory is used
#KVM_TESTINGDIR=/home/testbench/testing

Next, (re-)transmogrify the testbench so that, within the domains, /source points at repo-under-test:

cd /home/testbench
./kvm transmogrify

Finally run the tests:

cd /home/testbench
git -C /home/repo-under-test bisect start main ^<suspect-commit>
./kvm install check diff testing/pluto/basic-pluto-01
# based on output, pick one:
git -C /home/repo-under-test bisect {good,bad}
# might work: git -C /home/repo-under-test bisect run sh -c 'cd ../testbench && ./kvm install check diff testing/pluto/basic-pluto-01'
git -C /home/repo-under-test bisect visualize
# finally
git -C /home/repo-under-test bisect reset

KVM_TESTINGDIR can also be pointed at repo-under-test.

Controlling a test run remotely

Start the testsuite in the background:

nohup ./kvm install check &

To determine if the testsuite is still running:

./kvm status

and to stop the running testsuite:

./kvm kill

Debugging inside the VM (pluto on east)

Terminal 1 - east: log into east, start pluto, and attach gdb

./kvm sh east
east# cd /testing/pluto/basic-pluto-01
east# sh -x ./eastinit.sh
east# gdb /usr/local/libexec/ipsec/pluto $(pidof pluto)
(gdb) c

If pluto isn't running then gdb will complain with: --p requires an argument

Terminal 2 - west: log into west, start pluto and the test

./kvm sh west
west# sh -x ./westinit.sh ; sh -x westrun.sh

When pluto crashes, gdb will show that and await commands. For example, the bt command will show a backtrace.

TODO:

  • stop watchdog eventually killing pluto
  • notes for west

Installing a custom Fedora kernel

Assuming the kernel RPMs are in the directory $(KVM_POOLDIR)/kernel-ipsec/ say, add the following to Makefile.inc.local:

KVM_FEDORA_KERNEL_RPMDIR = /pool/kernel-ipsec/
KVM_FEDORA_KERNEL_ARCH = x86_64
KVM_FEDORA_KERNEL_VERSION = -5.18.7-100.aiven_ipsec.fc35.$(KVM_FEDORA_KERNEL_ARCH).rpm

and then run:

./kvm upgrade-fedora

(should this, like for NetBSD, be done during transmogrify?)

Installing a custom NetBSD kernel

Copy the kernel to:

$(KVM_POOLDIR)/$(KVM_PREFIX)netbsd-kernel

and then run:

./kvm transmogrify-netbsd