Test Suite - KVM: Difference between revisions

From Libreswan
(Copy existing Testsuite page)
 
(transendental)
 
(98 intermediate revisions by 5 users not shown)
Line 1: Line 1:
== KVM Test framework ==


Libreswan comes with an extensive test suite, written mostly in python, that uses KVM virtual machines and virtual networks. It has replaced the old UML test suite.
Libreswan's test framework can be run using KVM guests, and the <tt>./kvm</tt> script. It is strongly recommended to run the test suite on a host machine that has a CPU with virtualisation instructions.
Apart from KVM, the test suite uses libvirtd and qemu. It is strongly recommended to run the test suite natively on the OS (not in a VM itself) on a machine that has a CPU with virtualization instructions.
The PLAN9 filesystem (9p) is used to mount host directories in the guests - NFS is avoided to prevent network lockups when an IPsec test case would cripple the guest's networking.  


{{ ambox | nocat=true | type=important | text = libvirt 0.9.11 and qemu 1.0 or better are required. RHEL does not support a writable 9p filesystem, so the recommended host/guest OS is Fedora 22 }}
To access files on the host file system:


[[File:testnet.png]]
* Fedora uses the PLAN9 filesystem (9p)
* Other guests (Alpine, Debian, FreeBSD, NetBSD, OpenBSD) use NFS via the NAT interface


== Test Frameworks ==
For an overview of the network and testing see [[Test_Suite]]


This page describes the make kvm framework.


Instead of using virtual machines, it is possible to use Docker instances.
== Preparing the host machine ==
 
=== Check Virtualization is enabled in the BIOS ===
 
Virtualization needs to be enabled by the BIOS during boot.
 
  grep -e vmx -e svm /proc/cpuinfo
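
The check above can be wrapped in a small helper that names the flag it found. This is only a sketch: the function name <tt>virt_flag</tt> is hypothetical, and the flag only shows what the CPU advertises, so the BIOS setting may still need checking by hand.

```shell
# Report which hardware virtualisation flag the CPU advertises, if any.
# The cpuinfo path is a parameter only so the check can be tried on a sample file.
virt_flag() {
    cpuinfo=${1:-/proc/cpuinfo}
    if grep -q -w -e vmx "$cpuinfo" 2>/dev/null; then
        echo vmx        # Intel VT-x
    elif grep -q -w -e svm "$cpuinfo" 2>/dev/null; then
        echo svm        # AMD-V
    else
        echo none
        return 1
    fi
}

virt_flag || echo "no vmx/svm flag found; check the BIOS virtualization setting" >&2
```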
 
=== Add yourself to <tt>sudo</tt> ===
 
Some of the test scripts need to be run as root.  The test environment assumes this can be done using <tt>sudo</tt> without a password, for example:
 
sudo pwd
 
''XXX: Surely qemu can be driven without root?''
 
This is set up by adding an entry under /etc/sudoers.d/ specifying that your account does not need a password to become root:
 
echo "$(id -u -n) ALL=(ALL) NOPASSWD: ALL" | sudo dd of=/etc/sudoers.d/$(id -u -n)
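
A possible way to preview the entry and to confirm afterwards that <tt>sudo</tt> no longer prompts is sketched below. The helper names <tt>sudoers_entry</tt> and <tt>sudo_is_passwordless</tt> are hypothetical, not part of the test framework.

```shell
# Build the sudoers entry for an account (defaults to the current user).
sudoers_entry() {
    printf '%s ALL=(ALL) NOPASSWD: ALL\n' "${1:-$(id -u -n)}"
}

# Succeeds only if sudo works without prompting (-n means non-interactive).
sudo_is_passwordless() {
    sudo -n true 2>/dev/null
}

# Preview the line that would be written for the current account.
sudoers_entry

# After installing the entry, the file can be syntax-checked with:
#   sudo visudo -c -f "/etc/sudoers.d/$(id -u -n)"
# and the result confirmed with:
#   sudo_is_passwordless && echo "sudo needs no password"
```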


More information is found in [[Test Suite - Docker]] in this Wiki
=== Fight SELinux ===


== Preparing the host machine ==
SELinux blocks some actions that we need.  We have not created any SELinux rules to avoid this.  To check the current settings:
 
  getenforce


In the following it is assumed that your account is called "build".
The options are:


=== Add Yourself to sudo ===
* set SELinux to permissive (recommended)


The test scripts rely on being able to use sudo without a password to gain root access. This is done by adding a no-password rule under /etc/sudoers.d/.
sudo sed --in-place=.ORIG -e 's/^SELINUX=.*/SELINUX=permissive/' /etc/selinux/config
sudo setenforce Permissive


XXX: Surely qemu can be driven without root?
* disable SELinux


To set this up, add your account to the wheel group and permit wheel to have no-password access. Issue the following commands as root:
sudo sed --in-place=.ORIG -e 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
sudo reboot


<pre>
* (experimental) label source tree for SELinux
echo '%wheel ALL=(ALL) NOPASSWD: ALL' > /etc/sudoers.d/swantest
chmod 0440 /etc/sudoers.d/swantest
chown root.root /etc/sudoers.d/swantest
usermod -a -G wheel build
</pre>


=== Disable SELinux ===
The source tree on the host is shared with the virtual machines.  SELinux considers this a bug unless the tree is labelled with type svirt_image_t.


SELinux blocks some actions that we need.  We have not created any SELinux rules to avoid this.
sudo dnf install policycoreutils-python-utils
sudo semanage fcontext -a -t svirt_image_t "$(pwd)"'(/.*)?'
  sudo restorecon -vR /home/build/libreswan


Either set it to permissive:
There may be other things that SELinux objects to.


<pre>
=== Check that the host has enough entropy ===
sudo sed --in-place=.ORIG -e 's/^SELINUX=.*/SELINUX=permissive/' /etc/selinux/config
sudo setenforce Permissive
</pre>


Or disabled:
As a rough guide run:


<pre>
while true ; do cat /proc/sys/kernel/random/entropy_avail ; sleep 3 ; done
sudo sed --in-place=.ORIG -e 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
sudo reboot
</pre>


=== Install Required Dependencies ===
It should report values in the hundreds, if not thousands.  If it is in the units or tens then see [[Entropy matters]]
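
The rough guide above can be turned into a one-shot check. In this sketch the cut-off of 100 is an assumption standing in for "hundreds", and the function name <tt>entropy_level</tt> is hypothetical.

```shell
# Classify an entropy_avail reading.
# The threshold of 100 is an assumption, not a value taken from the test framework.
entropy_level() {
    if [ "$1" -ge 100 ]; then
        echo ok
    else
        echo low
    fi
}

# Read the current value once, if the kernel exposes it.
avail=/proc/sys/kernel/random/entropy_avail
if [ -r "$avail" ]; then
    entropy_level "$(cat "$avail")"
fi
```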


Now we are ready to install the various components of libvirtd, qemu and kvm and then start the libvirtd service.
=== Install Dependencies ===


Even virt-manager isn't strictly required.
{| class="wikitable"
|-
! Why || Fedora !! Mint (debian)
|-
| Basics
|| sudo dnf install -y make git gitk patch xmlto python3-pexpect curl tar
|| sudo apt-get install -y make make-doc git gitk xmlto python3-pexpect curl tar
|-
| Virtualization
|| sudo dnf install -y qemu virt-install libvirt-daemon-kvm libvirt-daemon-qemu
|| sudo apt install -y qemu virtinst libvirt-clients libvirt-daemon libvirt-daemon-system libvirt-daemon-driver-qemu libosinfo-query qemu-system-x86?
|-
| Build BSD Boot CDs
|| sudo dnf install -y dvd+rw-tools
|| sudo apt-get install -y dvd+rw-tools
|-
| Build Web Pages
|| sudo dnf install -y jq typescript
|| sudo apt-get install -y jq node-typescript
|-
| Serve Web Server (optional)
|| sudo dnf install -y httpd
|| sudo apt-get install -y ????
|-
| NFS
|| sudo dnf install -y nfs-utils # ???
|| sudo apt-get install -y nfs-kernel-server rpcbind
|-
| Broken makefiles
|| sudo dnf install -y nss-devel # make file invokes pkg-config nss
||
|}


On Fedora 28:
=== Enable libvirt ===


<pre>
''If you're switching from the old libvirtd see https://libvirt.org/daemons.html#switching-to-modular-daemons for how to shut down the old daemons.''
# is virt-manager really needed
sudo dnf install -y qemu virt-manager virt-install libvirt-daemon-kvm libvirt-daemon-qemu
sudo dnf install -y python3-pexpect
</pre>


Once all is installed start libvirtd:
Start the "collection of modular daemons that replace functionality previously provided by the monolithic libvirtd daemon":


<pre>
for drv in qemu network nodedev nwfilter secret storage interface
sudo systemctl enable libvirtd
do
sudo systemctl start libvirtd
    sudo systemctl unmask virt${drv}d.service
</pre>
    sudo systemctl unmask virt${drv}d{,-ro,-admin}.socket
    sudo systemctl enable virt${drv}d.service
    sudo systemctl enable virt${drv}d{,-ro,-admin}.socket
done
for drv in qemu network nodedev nwfilter secret storage
do
    sudo systemctl start virt${drv}d{,-ro,-admin}.socket
done
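
To see whether the units came up, something like the following sketch may help. It assumes the unit names follow the <tt>virt${drv}d</tt> pattern used in the loops above; the helper name <tt>libvirt_units</tt> is hypothetical.

```shell
# Print the systemd units for one modular libvirt daemon,
# e.g. "qemu" gives virtqemud.service and its three sockets.
libvirt_units() {
    drv=$1
    echo "virt${drv}d.service"
    for suffix in "" -ro -admin; do
        echo "virt${drv}d${suffix}.socket"
    done
}

# Show the state of every unit; is-active prints "active", "inactive", etc.
if command -v systemctl >/dev/null 2>&1; then
    for drv in qemu network nodedev nwfilter secret storage interface; do
        for unit in $(libvirt_units "$drv"); do
            state=$(systemctl is-active "$unit" 2>/dev/null || true)
            printf '%s: %s\n' "$unit" "${state:-unknown}"
        done
    done
fi
```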


On Debian?
There should be no errors and warnings.


=== Install Utilities (Optional) ===
=== Stop libvirt daemons shutting down ===


Various tools are used or convenient to have when running tests:
By default the libvirt daemons time out and shut down after 120 seconds (surely systemd will restart them!).  It turns out this hasn't worked so well:


Optional packages to install on Fedora
* [https://bugzilla.redhat.com/show_bug.cgi?id=2213660 libvirt clients hang because virtnetworkd.service misses when virtnetworkd is dead]
: systemd doesn't restart the daemon
* [https://bugzilla.redhat.com/show_bug.cgi?id=2075736 error: Disconnected from qemu:///system due to keepalive timeout]
: the restart is painfully slow with lots of networks which causes the timeout
* [https://bugzilla.redhat.com/show_bug.cgi?id=2111582 libvirtd deadlocks] (fixed)
* [https://bugzilla.redhat.com/show_bug.cgi?id=2123828 virtqemud gets slower and slower]


<pre>
Disabling the timeout and just leaving the daemons running seems to help.  Add the following:
sudo dnf -y install git patch tcpdump expect python-setproctitle python-ujson pyOpenSSL python3-pyOpenSSL
sudo dnf install -y python2-pexpect python3-setproctitle diffstat
</pre>


Optional packages to install on Ubuntu
echo VIRTNETWORKD_ARGS= | sudo dd of=/etc/sysconfig/virtnetworkd
echo VIRTQEMUD_ARGS=    | sudo dd of=/etc/sysconfig/virtqemud
echo VIRTSTORAGED_ARGS= | sudo dd of=/etc/sysconfig/virtstoraged


<pre>
the standard libvirt systemd config files read these settings using EnvironmentFile=
apt-get install python-pexpect git tcpdump  expect python-setproctitle python-ujson \
        python3-pexpect python3-setproctitle
</pre>


{{ ambox | nocat=true | type=important | text = do not install strongswan-libipsec because you won't be able to run non-NAT strongswan tests! }}
=== Add yourself to the KVM/QEMU group ===


=== Setting Users and Groups ===
You need to add yourself to the group that QEMU/KVM uses when writing to /var/lib/libvirt/qemu.  On Fedora it is 'qemu', and on Debian it is 'kvm'.  Something like:


You need to add yourself to the qemu group.  For instance:
sudo usermod -a -G $(stat --format %G /var/lib/libvirt/qemu) $(id -u -n)
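
Whether the account is already in the right group can be checked first. This is a sketch with a hypothetical helper name, <tt>in_group</tt>.

```shell
# Succeed if USER is listed as a member of GROUP in the group database.
in_group() {
    id -nG "$1" | tr ' ' '\n' | grep -qx -- "$2"
}

# Look up the group that owns the directory, as the usermod command above does.
dir=/var/lib/libvirt/qemu
if [ -d "$dir" ]; then
    group=$(stat --format %G "$dir")
    if in_group "$(id -u -n)" "$group"; then
        echo "already in $group"
    else
        echo "not yet in $group"
    fi
fi
```

Note that <tt>id -nG USER</tt> consults the group database, so it reports the new membership straight after <tt>usermod</tt>. A plain <tt>id -nG</tt> reports the groups of the current login session, which only change after logging in again.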


<pre>
After this you will need to re-login (or run <tt>sudo su - $(id -u -n)</tt>).
sudo usermod -a -G qemu $(id -u -n)
</pre>


You will need to re-login for this to take effect.
=== Make certain that <tt>root</tt> can access the build ===


The path to your build needs to be accessible (executable) by root:
The path to your build needs to be accessible (executable) by root, assuming things are under home:


<pre>
chmod a+x $HOME
chmod a+x ~
</pre>


=== Fix /var/lib/libvirt/qemu ===
=== Fix /var/lib/libvirt/qemu ===
Line 116: Line 159:
{{ ambox | nocat=true | type=important | text = Because our VMs don't run as qemu, /var/lib/libvirt/qemu needs to be changed using chmod g+w to make it writable for the qemu group. This needs to be repeated if the libvirtd package is updated on the system }}
{{ ambox | nocat=true | type=important | text = Because our VMs don't run as qemu, /var/lib/libvirt/qemu needs to be changed using chmod g+w to make it writable for the qemu group. This needs to be repeated if the libvirtd package is updated on the system }}


<pre>
sudo chmod g+w /var/lib/libvirt/qemu
sudo chmod g+w /var/lib/libvirt/qemu
</pre>


=== Create /etc/modules-load.d/virtio.conf ===
Arguably we should run libvirt as a normal user instead.


Several virtio modules need to be loaded into the host's kernel.  This could be done by modprobe ahead of running any virtual machines but it is easier to install them whenever the host boots. This is arranged by listing the modules in a file within /etc/modules-load.d.  The host must be rebooted for this to take effect.
=== Enable Tab Completion of <tt>./kvm</tt> ===


<pre>
If this:
sudo dd <<EOF of=/etc/modules-load.d/virtio.conf
virtio_blk
virtio-rng
virtio_console
virtio_net
virtio_scsi
virtio
virtio_balloon
virtio_input
virtio_pci
virtio_ring
9pnet_virtio
EOF
</pre>


As of Fedora 28, several of these modules are now built into the kernel and will not show up in /proc/modules (virtio, virtio_rng, virtio_pci, virtio_ring).
complete -o filenames -C './kvm' ./kvm


=== Ensure that the host has enough entropy ===
is added to  <tt>.bashrc</tt> then tab completion with <tt>./kvm</tt> will include both commands and directories.


[[Entropy matters]]
=== Set up a Web Server (optional) ===


With KVM, a guest system uses entropy from the host through the kernel module "virtio_rng" in the guest's kernel (set above). This has advantages:
If the machine is to run nightly test runs then it can be set up as a web server.
See the [http://testing.libreswan.org nightly test results] for an example.


* entropy only needs to be gathered on one machine (the host) rather than all machines (the host and the guests)
See above for dependencies.  See below for how to configure libreswan.
* the host is in the Real World and thus has more sources of real entropy
* any hacking to make entropy available need only be done on one machine


To ensure the host has enough randomness, run either jitterentropy-rngd or haveged.
To set up the server:


Fedora commands for using jitterentropy-rngd (broken on F26, service file specifies /usr/local for path):
sudo mkdir /var/www/html/results/
<pre>
sudo chown $(id -un) /var/www/html/results/
sudo dnf install jitterentropy-rngd
sudo chmod 755 /var/www/html/results/
sudo systemctl enable jitterentropy-rngd
sudo sh -c 'echo "AddType text/plain .diff" >/etc/httpd/conf.d/diff.conf'
sudo systemctl start jitterentropy-rngd
</pre>


Fedora commands for using haveged:
to run the web server until the next reboot:


<pre>
sudo firewall-cmd --add-service=http
sudo dnf install haveged
sudo systemctl start httpd
sudo systemctl enable haveged
sudo systemctl start haveged
to make the web server permanent:
</pre>
 
sudo systemctl enable httpd
sudo firewall-cmd --add-service=http --permanent
 
If you want it to be the main page of the website, you can create the file /var/www/html/index.html containing:
 
cat <<EOF
<pre>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
  <head>
    <meta http-equiv="REFRESH" content="0;url=/results/">
  </head>
  <BODY>
  </BODY>
</HTML>
</pre>
EOF
 
=== Debian ===
 
=== Override python? ===
 
On Debian slack based systems (i.e., Linux Mint 20.3), the default python is too old.  Fortunately python 3.9 is also available:
 
sudo apt-get install python3.9
 
in addition, the make variable KVM_PYTHON will need to be added to Makefile.inc.local:
 
echo KVM_PYTHON=python3.9 >> Makefile.inc.local
 
=== BSD ===
 
Anyone?
 
== Download and configure libreswan ==


=== Fetch Libreswan ===
=== Fetch Libreswan ===
Line 173: Line 232:
The libreswan source tree includes all the components that are used on the host and inside the test VMs. To get the latest source code using git:
The libreswan source tree includes all the components that are used on the host and inside the test VMs. To get the latest source code using git:


<pre>
git clone https://github.com/libreswan/libreswan
git clone https://github.com/libreswan/libreswan
cd libreswan
cd libreswan
</pre>


=== Experimental: label source tree for SELinux ===
Developers can use Makefile.inc.local to override default build settings.  Create the file:
This only matters if you are using Fedora and have not disabled SELinux.


The source tree on the host is shared with the virtual machines.  SELinux considers this a bug unless the tree is labelled with type svirt_image_t.
touch Makefile.inc.local
<pre>
 
sudo dnf install policycoreutils-python-utils
(packaging systems should not use this, and instead explicitly pass the make variables to the make command)
sudo semanage fcontext -a -t svirt_image_t "$(pwd)"'(/.*)?'
 
sudo restorecon -vR /home/build/libreswan
=== Create $(KVM_POOLDIR) for storing VM disk images ===
</pre>
 
The pool directory is used to store:
 
* VM disk images
* install CD/DVD images
* downloaded packages installed into the VMs
* other files
 
and can get quite large.  It can and should be shared between build trees (this reflects libvirt which has a single name space for domains).  $(KVM_PREFIX) (see further down) addresses the lack of name spaces.
 
By default $(top_srcdir)/../pool (../pool) is used (that is, adjacent to your source tree).  It will need to be created.
 
Alternatively the shared pool directory can be specified explicitly by setting the make variable KVM_POOLDIR in Makefile.inc.local, for example:
 
mkdir -p /home/libreswan/pool
echo KVM_POOLDIR=/home/libreswan/pool >> Makefile.inc.local
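
Because <tt>>></tt> appends every time it is run, repeating the command leaves duplicate assignments in Makefile.inc.local. One way to avoid that is sketched below. The helper <tt>set_make_var</tt> is hypothetical and not part of the libreswan tree.

```shell
# Append NAME=VALUE to a makefile fragment unless NAME is already assigned there.
set_make_var() {
    file=$1
    name=$2
    value=$3
    # Matches NAME=, NAME:=, NAME+= and NAME?= at the start of a line.
    if [ -f "$file" ] && grep -q "^${name}[[:space:]]*[:+?]\{0,1\}=" "$file"; then
        echo "$name already set in $file" >&2
        return 0
    fi
    echo "${name}=${value}" >> "$file"
}

# Example, using the paths from the text above:
#   mkdir -p /home/libreswan/pool
#   set_make_var Makefile.inc.local KVM_POOLDIR /home/libreswan/pool
```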
 
=== Configure $(KVM_LOCALDIR) to store test domain disks in /tmp/pool (tmpfs) (optional) ===
 
By default, all disk images are stored in $(KVM_POOLDIR) (see above).  Since the test VM disk images do not need long-term storage (i.e., survive a reboot), $(KVM_LOCALDIR) can be used to specify that test VM disk images are stored in /tmp as follows:
 
echo KVM_LOCALDIR=/tmp/pool >> Makefile.inc.local
This has the advantage of eliminating physical disk I/O as a bottleneck when accessing VM disk images, but the disadvantage of needing to re-build the test disk images after a reboot.
 
Note: now that the domains are 100% transient this may have zero benefit.
 
=== Configure $(KVM_PREFIX) to allow multiple build trees on a machine (optional) ===
 
By default the domains and networks are assigned names such as linux, east, 198_18_1, et.al.  The problem is that these names are not unique between build trees, and as a result, all build trees try to use the same domains and networks.
 
The "fix" is to define $(KVM_PREFIX) in Makefile.inc.local, giving it a different value in each build tree.  For instance:
 
$ cat libreswan-a/Makefile.inc.local
KVM_PREFIX=a.
$ cat libreswan-b/Makefile.inc.local
KVM_PREFIX=b.
 
will use names such as a.linux et.al. in the first tree and b.linux et.al. in the second tree.
 
For convenience, commands such as:


There may be other things that SELinux objects to.
libreswan-a$ ./kvm sh linux


=== Create the Pool directory - KVM_POOLDIR ===
will log into the current build tree's domain (here a.linux).


The pool directory is used to store KVM disk images and other configuration files.  By default $(top_srcdir)/../pool is used (that is, adjacent to your source tree).
Note: due to limitations in the network stack (interface names have a limit of 16 characters), the prefix needs to be short.


To change the location of the pool directory, set the KVM_POOLDIR make variable in Makefile.inc.local.  For instance:
=== Configure $(KVM_WORKERS) to run things in parallel (Optional) ===


<pre>
By default all operations (building and testing) are serialized (even the VMs are given only one CPU!).  If the host has plenty of cores then the parallelism can be increased using $(KVM_WORKERS). It does the following:
$ grep KVM_POOLDIR Makefile.inc.local
KVM_POOLDIR=/home/libreswan/pool
</pre>


== Serve test results as HTML pages on the test server (optional) ==
- assigns $(KVM_WORKERS) CPUs to the build VMs
- runs <tt>make -j $(KVM_WORKERS)</tt> when building and installing libreswan
- runs $(KVM_WORKERS) tests in parallel


If you want to be able to see the results of testruns in HTML, you can enable a webserver:
To make running tests in parallel possible $(KVM_PREFIX) and the numbers 1..$(KVM_WORKERS) are combined to generate unique domain and network names.  For instance, with:


<pre>
KVM_PREFIX=a.
dnf install httpd
KVM_WORKERS=3
systemctl enable httpd
systemctl start httpd
mkdir /var/www/html/results/
chown build /var/www/html/results/
chmod 755 /var/www/html/results/
cd ~
ln -s /var/www/html/results
</pre>


If you want it to be the main page of the website, you can create the file /var/www/html/index.html containing:
the prefixes a., a2, a3 are used generating the names a.east, a2east, a3east, et.al.
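
The naming rule in this example can be sketched as follows. This is a guess based only on the example above: it assumes the prefix ends in "." and that workers after the first replace that dot with their number. The function name <tt>worker_prefixes</tt> is hypothetical, and the real logic in the makefiles may differ.

```shell
# List the per-worker prefixes for a KVM_PREFIX ending in "." and a worker count.
# Assumption: worker 1 keeps the prefix; workers 2..N use the stem plus the number.
worker_prefixes() {
    prefix=$1
    workers=$2
    stem=${prefix%.}            # "a." becomes "a"
    echo "$prefix"              # worker 1 keeps the prefix unchanged
    n=2
    while [ "$n" -le "$workers" ]; do
        echo "${stem}${n}"
        n=$((n + 1))
    done
}

# Domain names for the host east with KVM_PREFIX=a. and KVM_WORKERS=3.
for p in $(worker_prefixes a. 3); do
    echo "${p}east"
done
```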


<pre>
Note: $(KVM_WORKERS) is ignored when $(KVM_PREFIX) is not set.  This might be a bug.
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
<meta http-equiv="REFRESH" content="0;url=/results/"></HEAD>
<BODY>
</BODY>
</HTML>
</pre>


and then add:
=== Generate a web page of the test results (optional) ===


<pre>
See the [http://testing.libreswan.org nightly test results] for an example and how to set up a web server so results can be viewed remotely.
WEB_SUMMARYDIR=/var/www/html/results
</pre>


To Makefile.inc.local
To initially create the web directory <tt>RESULTS/</tt> and populate it with the current test results use:


== Set up KVM and run the Testsuite (for the impatient) ==
make web


If you're impatient, and want to just run the testsuite using kvm then:
Further test runs will update the <tt>RESULTS/</tt> directory.  The files can then be viewed using http://file.


* install (or update) libreswan (if needed this will create the test domains):
To disable web page generation, delete the directory <tt>RESULTS/</tt>.
: <tt>make kvm-install</tt>
* run the testsuite:
: <tt>make kvm-test</tt>
* list the kvm make targets:
: <tt>make kvm-help</tt>


After that, the following make targets are useful:
To instead publish the results on the web, point <tt>$(WEB_SUMMARYDIR)</tt> at the web directory:


* clean the kvm build tree
$ echo WEB_SUMMARYDIR=/var/www/html/results >> Makefile.inc.local
: <tt>make kvm-clean</tt>
* clean the kvm build tree and all kvm domains
: <tt>make kvm-purge</tt>


== Running the testsuite ==
== Running the testsuite ==


=== Generating Certificates ===
The testsuite is driven using the top-level script <tt>./kvm</tt>
 
 
=== For the impatient: <tt>./kvm install check</tt>  ===
 
To build the VMs, and build and install (or update) libreswan, and then run the tests, use:
 
./kvm install check
 
=== Running the testsuite ===
 
; ./kvm install
: update the KVMs ready for a new test run
 
; ./kvm check
: run the testsuite, previous results are saved in <tt>BACKUP/-date-</tt>
 
; ./kvm recheck
: run the testsuite, but skip tests that already passed
 
; ./kvm results
: list the results from the test run
 
; ./kvm diffs
: display differences between the test results and the expected results, exit non-zero if there are any
 
; ./kvm test-clean
: delete the current test results
 
the operations can be combined on a single line:
 
./kvm test-clean install check recheck diff
 
and individual tests can be selected (see Running a Single Test, below):
 
./kvm install check diff testing/pluto/*ikev2*


The full testsuite requires a number of certificates. The virtual domains are configured for this purpose. Just use:
To stop <tt>./kvm</tt> use control-c or <tt>./kvm kill</tt> from another terminal.
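
For unattended runs the commands above can be chained in a small wrapper that relies on <tt>./kvm diffs</tt> exiting non-zero when results differ. This is a sketch: the function name <tt>run_suite</tt> and the <tt>KVM</tt> override variable are hypothetical, and it assumes <tt>./kvm install check</tt> exits non-zero when the run itself breaks.

```shell
# Build, install and test, then report whether any results differ.
# KVM can be set to point at the script in another build tree.
run_suite() {
    kvm=${KVM:-./kvm}
    # Assumption: a broken build or run makes this command exit non-zero.
    "$kvm" install check || return 2
    if "$kvm" diffs; then
        echo "no differences"
    else
        echo "differences found"
        return 1
    fi
}

# Usage, from the top of the source tree:
#   run_suite
```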


<pre>
=== Updating Certificates ===
make kvm-keys
</pre>


( ''Before pyOpenSSL version 0.15 you couldn't run dist_certs.py without a patch to support creating SHA1 CRLs.
The full testsuite requires a number of certificates. If not present, then <tt>./kvm check</tt> will automatically generate them using the domain <tt>linux</tt>. Just note that the certificates have a limited lifetime.  Should the test system detect out-of-date certificates then <tt>./kvm check</tt> will fail.
A patch for this can be found at'' https://github.com/pyca/pyopenssl/pull/161 )


=== Run the testsuite ===
To rebuild the certificates:


To run all test cases (which include compiling and installing it on all vms, and non-VM based test cases), run:
./kvm keys


<pre>
can be used to force the generation of new certificates.
make kvm-install kvm-test
</pre>


=== Stopping pluto tests (gracefully) ===
=== Maintaining (rebuilding and updating) the Domains ===


If you used "make kvm-test", type control-C; possibly repeatedly.
In normal operation, the only domains of interest are:
 
; build domains (linux, netbsd, ...)
: <tt>./kvm install</tt> uses these for incremental builds
: to force a scratch build run <tt>./kvm uninstall</tt>
 
; test domains (east, west, ...)
: <tt>./kvm install</tt> always rebuilds these
: since these domains are transient, they disappear after a reboot
 
And to clean up everything:
 
./kvm clean
 
Finally, to upgrade the domains:
 
./kvm upgrade
 
Per above, these can be combined:
 
./kvm test-clean install check
./kvm upgrade install check
 
Internally, additional domains are created.
 
The table below lists all the domains and how to manipulate them.  There's no need to delete a domain before rebuilding it.  For instance:
 
./kvm test-clean upgrade install check
 
is equivalent to:
 
./kvm test-clean
./kvm downgrade
./kvm upgrade
./kvm transmogrify
./kvm install
./kvm check
 
There are two variants of each command.  The first creates all the domains, the second only creates the specified domain.
 
{| class="wikitable"
| step || new domain  || create || cloned from || mounts || networks || delete || notes
|-
| base || <tt>linux</tt>-base || ./kvm base<br>./kvm base-<tt>linux</tt>
|| ISOs          || /pool<br>/bench || gateway || ./kvm purge<br>./kvm demolish
|| installs the bare minimum needed to get a domain on the network<br>root's account is hacked so that exit codes appear in the prompt<br>demolish also deletes the gateway
|-
| upgrade || <tt>linux</tt>-upgrade || ./kvm upgrade<br>./kvm upgrade-<tt>linux</tt>
|| <tt>linux</tt>-base    || /pool<br>/bench || gateway || ./kvm downgrade
|| installs and/or upgrades all packages needed to build and test libreswan using a local cache
|-
| transmogrify || <tt>linux</tt> || ./kvm transmogrify<br>./kvm transmogrify-<tt>linux</tt>
|| <tt>linux</tt>-upgrade || /pool<br>/bench<br>/source<br>/testing || gateway || ./kvm uninstall<br>./kvm clean
|| transmogrify the domain adding configuration and other files needed to build and test<br>if necessary install custom kernels and/or save kernels for direct boot
|-
| install || east et.al.  || ./kvm install<br>./kvm install-<tt>linux</tt>
|| <tt>linux</tt> || /source<br>/testing || test networks<br>possibly gateway || ./kvm uninstall<br>./kvm clean
|| install linux and then clone creating test domains<br>
|}
 
=== Mount Points ===
 
In normal operation, the only mount points of interest within a domain are <tt>/source</tt> and <tt>/testing</tt>.  These are configured to point at the current source tree.
 
Internally, the following additional mount points are used:
 
{| class="wikitable"
| mount    || variable        || default          || use when ... || notes
|-
| /testing  || $(KVM_TESTDIR)  || libreswan/testing || running tests  || the tests to run
|-
| /source  || $(KVM_SOURCEDIR) || libreswan/        || during install || the source code to build and install
|-
| /bench    || $(KVM_SOURCEDIR) || libreswan/        || building VMs  || the scripts driving the tests
|-
| /pool    || $(KVM_POOLDIR)  || pool/            || building VMs  || KVMs and caches
|}
 
It is possible, although unusual, to point these at different source trees.  For instance: testing.libreswan uses benchdir (/bench) for the scripts, and rutdir (/source, /testing) for the directory being tested; when testing old code /source can be pointed at an alternative directory that contains the sources that are to be built and tested.


== Shell and Console Access (Logging In) ==
== Shell and Console Access (Logging In) ==
Line 288: Line 465:
* while SSH takes more to set up, it supports things like proper terminal configuration and file copy
* while SSH takes more to set up, it supports things like proper terminal configuration and file copy


=== Serial Console access using "make kvmsh-HOST" (kvmsh.py) ===
=== Serial Console access using <tt>./kvm sh HOST</tt> (kvmsh.py) ===


"kvmsh", is a wrapper around "virsh".  It automatically handles things like booting the machine, logging in, and correctly configuring the terminal:
<tt>./kvm sh HOST</tt> is a wrapper around "virsh" that automatically handles things like booting the machine, logging in, and correctly configuring the terminal.  Its big advantage is that it always works.  For instance:


<pre>
$ ./testing/utils/kvmsh.py east
$ ./testing/utils/kvmsh.py east
[...]
[...]
Escape character is ^]
Escape character is ^]
[root@east ~]# printenv TERM
[root@east ~]# printenv TERM
xterm
xterm
[root@east ~]# stty -a
[root@east ~]# stty -a
...; rows 52; columns 185; ...  
...; rows 52; columns 185; ...  
[root@east ~]#
[root@east ~]#
</pre>


"kvmsh.py" can also be used to script remote commands (for instance, it is used to run "make" on the build domain):
The script "kvmsh.py" can also be used directly to invoke commands on a guest (this is how <tt>./kvm install</tt> works):


<pre>
$ ./testing/utils/kvmsh.py east ls
$ ./testing/utils/kvmsh.py east ls
[root@east ~]# ls
[root@east ~]# ls
anaconda-ks.cfg
anaconda-ks.cfg
</pre>


Finally, "make kvmsh-HOST" provides a short cut for the above; and if you're using multiple build trees (see further down), it will connect to the DOMAIN that corresponds to HOST.  For instance, notice how the domain "a.east" is passed to kvmsh.py in the below:
When $(KVM_PREFIX) (and $(KVM_WORKERS)) is defined <tt>./kvm sh east</tt> can be used to log into $(KVM_PREFIX)east. 
 
<pre>
$ make kvmsh-east
/home/libreswan/pools/testing/utils/kvmsh.py --output ++compile-log.txt --chdir . a.east
Escape character is ^]
[root@east source]#
</pre>


Limitations:
Limitations:


* no file transfer but files can be accessed via /testing
* no file transfer but files can be accessed via <tt>/pool</tt> and <tt>/testing</tt>


=== Graphical Console access using virt-manager ===
=== Graphical Console access using virt-manager ===
Line 330: Line 495:


While easy to use, it doesn't support cut/paste or mechanisms for copying files.
While easy to use, it doesn't support cut/paste or mechanisms for copying files.


=== Shell access using SSH ===
=== Shell access using SSH ===


While requiring slightly more effort to set up, it provides full shell access to the domains.
While requiring more effort to set up, it provides full shell access to the domains.


Since you will be using ssh a lot to login to these machines, it is recommended to either put their names in /etc/hosts:
Since you will be using ssh a lot to login to these machines, it is recommended to either put their names in /etc/hosts:


<pre>
# /etc/hosts entries for libreswan test suite
# /etc/hosts entries for libreswan test suite
192.1.2.45 west
192.1.2.45 west
192.1.2.23 east
192.1.2.23 east
192.0.3.254 north
192.0.3.254 north
192.1.3.209 road
192.1.3.209 road
192.1.2.254 nic
192.1.2.254 nic
</pre>


or add entries to .ssh/config such as:
or add entries to .ssh/config such as:


<pre>
Host west
Host west
         Hostname 192.1.2.45
         Hostname 192.1.2.45
</pre>
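
The <tt>.ssh/config</tt> entries for all five hosts can be generated from the address table above rather than typed by hand. In this sketch the function name <tt>ssh_config_entries</tt> is hypothetical, and the <tt>User root</tt> line is an assumption based on the guests being accessed as root.

```shell
# Emit ssh config stanzas from "address name" pairs read on stdin.
# "User root" is an assumption; drop it if you log in as someone else.
ssh_config_entries() {
    while read -r addr name; do
        [ -n "$name" ] || continue      # skip blank or malformed lines
        printf 'Host %s\n\tHostname %s\n\tUser root\n' "$name" "$addr"
    done
}

ssh_config_entries <<'EOF'
192.1.2.45 west
192.1.2.23 east
192.0.3.254 north
192.1.3.209 road
192.1.2.254 nic
EOF
```

The output can be reviewed and then appended to <tt>~/.ssh/config</tt>.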


If you wish to be able to ssh into all the VMs created without using a password, add your ssh public key to '''testing/baseconfigs/all/etc/ssh/authorized_keys'''. This file is installed as /root/.ssh/authorized_keys on all VMs
If you wish to be able to ssh into all the VMs created without using a password, add your ssh public key to '''testing/baseconfigs/all/etc/ssh/authorized_keys'''. This file is installed as /root/.ssh/authorized_keys on all VMs


Using ssh becomes easier if you are running ssh-agent (you probably are) and your public key is known to the virtual machine.  This command, run on the host, installs your public key on the root account of the guest machine west.  This assumes that west is up (it might not be, but you can put this off until you actually need ssh, at which time the machine would need to be up anyway).  Remember that the root password on each guest machine is "swan".
Using ssh becomes easier if you are running ssh-agent (you probably are) and your public key is known to the virtual machine.  This command, run on the host, installs your public key on the root account of the guest machine west.  This assumes that west is up (it might not be, but you can put this off until you actually need ssh, at which time the machine would need to be up anyway).  Remember that the root password on each guest machine is "swan".
<pre>
ssh-copy-id root@west
</pre>
You can use ssh-copy-id for any VM.  Unfortunately, the key is forgotten when the VM is restarted.


== Run an individual test (or tests) ==
ssh-copy-id root@west


All the test cases involving VMs are located in the libreswan directory under testing/pluto/ . The most basic test case is called basic-pluto-01. Each test case consists of a few files:
You can use ssh-copy-id for any VM.  Unfortunately, the key is forgotten when the VM is restarted.


* description.txt to explain what this test case actually tests
Limitations:
* ipsec.conf files - for host west is called west.conf. This can also include configuration files for strongswan or racoon2 for interop testing
* ipsec.secret files - if non-default configurations are used. also uses the host syntax, eg west.secrets, east.secrets.
* An init.sh file for each VM that needs to start (eg westinit.sh, eastinit.sh, etc)
* One run.sh file for the host that is the initiator (eg westrun.sh)
* Known good (sanitized) output for each VM (eg west.console.txt, east.console.txt)
* testparams.sh if there are any non-default test parameters


You can run this test case by issuing the following command on the host:
* this only works with the default east, et.al. (it does not work with $(KVM_PREFIX) and/or multiple test directories)


Either:
== kvm workflows ==


<pre>
(seeing as everyone has a "flow", why not kvm) here are some common workflows, the following commands are used:
make kvm-test KVM_TESTS+=testing/pluto/basic-pluto-01/
</pre>


or:
; ./kvm modified
: list the test directories that have been modified
; ./kvm baseline
: compare test results against a baseline
; ./kvm patch
: update the expected test results
; ./kvm add
: <tt>git add</tt> the modified test results
; ./kvm status
: show the status of the currently running testsuite
; ./kvm kill
: kill the currently running testsuite


<pre>
=== Running a single test ===
./testing/utils/kvmtest.py testing/pluto/basic-pluto-01
</pre>


multiple tests can be selected with:
There are two ways to run an individual test:


<pre>
# the test to run can be specified on the command line:
make kvm-test KVM_TESTS+=testing/pluto/basic-pluto-*
#: kvm check testing/pluto/basic-pluto-01
</pre>
# the test is implied when running <tt>kvm</tt> from a test directory:
#: cd testing/pluto/basic-pluto-01
#: ../../../kvm
#: ../../../kvm diff


or:
But there's a catch:


<pre>
* in batch mode <tt>pluto</tt> is shut down at the end of the test
./testing/utils/kvmresults.py testing/pluto/basic-pluto-*
: this way additional post-mortem checks, such as for memory leaks and core dumps that rely on <tt>pluto</tt> being stopped, can be performed
</pre>
* in single test mode the system is left running
: this way it is possible to log in and look around the running system and attach a debugger to <tt>pluto</tt> before it is shut down


Once the test run has completed, you will see an OUTPUT/ directory in the test case directory:
To instead force post-mortem, add:


<pre>
  KVMRUNNER_FLAGS += --run-post-mortem
$ ls OUTPUT/
east.console.diff east.console.verbose.txt  RESULT      west.console.txt          west.pluto.log
east.console.txt  east.pluto.log            swan12.pcap  west.console.diff  west.console.verbose.txt
</pre>


* RESULT is a text file (whose format is sure to change in the next few months) stating whether the test succeeded or failed.
to <tt>Makefile.inc.local</tt>.
* The diff files show the differences between this testrun and the last known good output.
* Each VM's serial (sanitized) console log  (eg west.console.txt)
* Each VM's unsanitized verbose console output (eg west.console.verbose.txt)
* A network capture from the bridge device (eg swan12.pcap)
* Each VM's pluto log, created with plutodebug=all (eg west.pluto.log)
* Any core dumps generated if a pluto daemon crashed


== Debugging inside the VM ==
=== Working on individual tests ===


=== Debugging pluto on east ===
The <tt>modified</tt> command can be used to limit the test run to just tests with modified files (according to git):


Terminal 1 - east: log into east, start pluto, and attach gdb
; ./kvm modified install check diff
: install libreswan and then run the testsuite against just the modified tests, display differences
; ./kvm modified recheck diff
: re-run the modified tests that are failing, display differences
; ./kvm modified patch add
: update the modified tests applying the latest output and add them to git


<pre>
This workflow comes into its own when updating tests en masse using sed, for instance:
make kvmsh-east
east# cd /testing/pluto/basic-pluto-01
east# sh -x ./eastinit.sh
east# gdb /usr/local/libexec/ipsec/pluto $(pidof pluto)
(gdb) c
</pre>


Terminal 2 - west: log into west, start pluto and the test
sed -i -e 's/PARENT_//' testing/pluto/*/*.console.txt
./kvm modified check


<pre>
=== Controlling a test run remotely ===
make kvmsh-west
west# sh -x ./westinit.sh ; sh -x westrun.sh
</pre>
If pluto wasn't running, gdb would complain: ''<code>--p requires an argument</code>''


When pluto crashes, gdb will show that and await commands.  For example, the bt command will show a backtrace.
Start the testsuite in the background:


=== Debugging pluto on west ===
./kvm nohup check


See above, but also use virt as a terminal.
To determine if the testsuite is still running:


=== /root/.gdbinit ===
./kvm status


If you want to get rid of the warning "warning: File "/testing/pluto/ikev2-dpd-01/.gdbinit" auto-loading has been declined by your `auto-load safe-path'"
and to stop the running testsuite:


<pre>
./kvm kill
echo "set auto-load safe-path /" >> /root/.gdbinit
</pre>


== Updating the VMs ==
=== Debugging inside the VM (pluto on east) ===


# delete all the copies of the base VM:
Terminal 1 - east: log into east, start pluto, and attach gdb
#: <tt>$ make kvm-purge</tt>
# install again
#: <tt>$ make kvm-install</tt>


== The /testing/guestbin directory ==
./kvm sh east
east# cd /testing/pluto/basic-pluto-01
east# sh -x ./eastinit.sh
east# gdb /usr/local/libexec/ipsec/pluto $(pidof pluto)
(gdb) c


The guestbin directory contains scripts that are used within the VMs only.
If pluto isn't running then gdb will complain with: ''<code>--p requires an argument</code>''


=== swan-transmogrify ===
Terminal 2 - west: log into west, start pluto and the test
 
When the VMs were installed, an XML configuration file from testing/libvirt/vm/ was used to configure each VM with the right disks, mounts and network interfaces. Each VM mounts the libreswan directory as /source and the libreswan/testing/ directory as /testing. This makes the /testing/guestbin/ directory available on the VMs. At boot, the VMs run /testing/guestbin/swan-transmogrify. This python script compares the MAC address of eth0 with the list of known MAC addresses from the XML files. By identifying the MAC, it knows which identity (west, east, etc) it should take on. Files are copied from /testing/baseconfigs/ into the VM's /etc directory and the network service is restarted.
 
=== swan-build, swan-install, swan-update ===
 
These commands are used to build, install or build+install (update) the libreswan userland and kernel code
 
=== swan-prep ===
 
This command is run as the first command of each test case to set up the host. It copies the required files from /testing/baseconfigs/ and the specific test case files onto the VM test machine. It does not start libreswan. That is done in the "init.sh" script.
 
The swan-prep command takes two options.
The --x509 option is required to copy in all the required certificates and update the NSS database.
The --46/--6 option is used to give the host IPv4 and/or IPv6 connectivity. By default hosts only get IPv4 connectivity, as this reduces the noise captured with tcpdump.
 
=== fipson and fipsoff ===
 
These are used to fake a kernel into FIPS mode, which is required for some of the tests.
 
 
== Various notes ==
 
* Currently, only one test can run at a time.
* You can peek at the guests using virt-manager or you can ssh into the test machines from the host.
* ssh may be slow to prompt for the password.  If so, start up the vm "nic"
* On VMs use only one CPU core. Multiple CPUs may cause pexpect to mangle output.
* 2014 Mar: DHR needed to do the following to make things work each time he rebooted the host
<pre>
$ sudo setenforce Permissive
$ ls -ld /var/lib/libvirt/qemu
drwxr-x---. 6 qemu qemu 4096 Mar 14 01:23 /var/lib/libvirt/qemu
$ sudo chmod g+w /var/lib/libvirt/qemu
$ ( cd testing/libvirt/net ; for i in * ; do sudo virsh net-start $i ; done ; )
</pre>
* to make the SELinux enforcement change persist across host reboots, edit /etc/selinux/config
* to remove "169.254.0.0/16 dev eth0  scope link  metric 1002" from "ipsec status output"
<pre> echo 'NOZEROCONF=1' >> /etc/sysconfig/network </pre>
 
=== Need Strongswan 5.3.2 or later ===
The baseline Strongswan needed for our interop tests is 5.3.2.  This isn't part of Fedora or RHEL/CentOS at this time (2015 September).
 
Ask Paul for a pointer to the required RPM files.


Strongswan has dependency libtspi.so.1
./kvm sh west
<pre>
  west# sh -x ./westinit.sh ; sh -x westrun.sh
sudo dnf install trousers
sudo rpm -ev strongswan
sudo rpm -ev strongswan-libipsec
sudo rpm -i strongswan-5.2.0-4.fc20.x86_64.rpm
</pre>


To update to a newer version, place the rpm in the source tree on the host machine.  This avoids needing to connect the guests to the internet.  Then start up all the machines, wait until they are booted, and update the Strongswan package on each machine.  (DHR doesn't know which machines actually need a Strongswan.)
When pluto crashes, gdb will show that and await commandsFor example, the <tt>bt</tt> command will show a backtrace.
<pre>
for vm in west east north road ; do sudo virsh start $vm; done
# wait for booting to finish
for vm in west east north road ; do ssh root@$vm 'rpm -Uv /source/strongswan-5.3.2-1.0.lsw.fc21.x86_64.rpm' ; done
</pre>


== To improve ==
TODO:
* install and remove RPM using swantest + make rpm support
* add summarizing script that generate html/json to git repo
* coredump. It has been a mystery :) systemd or some daemon appears to block coredump on the Fedora 20 systems.
* when running multiple tests from TESTLIST shutdown the hosts before copying OUTPUT dir. This way we get leak detect inf. However, for single test runs do not shut down.


== IPv6 tests ==
* stop watchdog eventually killing pluto
IPv6 test cases seem to work better when IPv6 is disabled on the KVM bridge interfaces the VMs use. The bridges are swanXX and their config files are /etc/libvirt/qemu/networks/192_0_1.xml . Remove the following line from it. Reboot/restart libvirt.
* notes for west


<pre>
=== Running a Custom Kernel ===
libvirt/qemu/networks/192_0_1.xml


<ip family="ipv6" address="2001:db8:0:1::253" prefix="64"/>
==== Custom NetBSD Kernel ====


</pre>
Build the kernel per upstream documentation and then copy it to:


Afterwards ifconfig swan01 should show no IPv6 address at all, not even a link-local fe80:: address. Then the v6 testcases should work.
  $(KVM_POOLDIR)/$(KVM_PREFIX)netbsd-kernel


<br> Please give me feedback if this hack works for you. I shall try to add more info about this.
During transmogrify the stock kernel will be replaced with the above.


== Sanitizers ==
==== Custom Linux Kernel ====
* summarize output from tcpdump
* count established IKE, ESP, AH states (there is a count at the end of "ipsec status" that is not accurate: it counts instantiated connections as loaded).


* dpd ping sanitizer. DPD tests have unpredictable packet loss for ping.
The Linux test domains (east, west, et al.) boot the kernel directly using:


== Publishing Results on the web: http://testing.libreswan.org/results/ ==
$(KVM_POOLDIR)/$(KVM_PREFIX)linux-upgrade.vmlinuz
$(KVM_POOLDIR)/$(KVM_PREFIX)linux-upgrade.initramfs


This is experimental and uses:
These files are re-created whenever <tt>upgrade</tt> is run.  To boot a different kernel, replace the above (or edit the corresponding east.xml et.al. file with the new location).


* CSS
=== Building and testing an old branch ===
* javascript


Two scripts are available:
Old branches have two problems:


* <tt>testing/web/setup.sh</tt>
* the KVM codebase is out-of-date
: sets up the directory <tt>~/results</tt> adding any dependencies
* the OS releases are gone
* <tt>testing/web/publish.sh</tt>
: runs the testsuite and then copies the results to <tt>~/results</tt>


To view this, use file:///.
Here are two ways to get around it:


To get this working with httpd (Apache web server):
==== Using a test-bench ====


<pre>
This workflow works best when working on an old branch (let's say v4.11)
sudo systemctl enable httpd
sudo systemctl start httpd
sudo ln -s ~/results /var/www/html/
sudo sh -c 'echo "AddType text/plain .diff" >/etc/httpd/conf.d/diff.conf'
</pre>


To view the results, use http://localhost/results.
Two repositories are used:


== Speeding up "make kvm-test" by running things in parallel ==
# repo under test aka <tt>RUTDIR</tt>
#: this contains both the sources and the tests
# <tt>testbench</tt>
#: this contains the test scripts used to drive <tt>${RUTDIR}</tt>


Internally kvmrunner.py has two work queues:
Start by checking out the two repositories (existing repositories can also be used, carefully):


* a pool of reboot threads; each thread reboots one domain at a time
RUTDIR=$PWD/v4_maint ; export RUTDIR
* a pool of test threads; each thread runs one test at a time using domains with a unique prefix
git clone https://github.com/libreswan/libreswan.git -b v4_maint ${RUTDIR}
git clone https://github.com/libreswan/libreswan.git testbench


The test threads uses the reboot thread pool as follows:
Next, configure <tt>testbench</tt> so that it compiles, installs, and runs tests from <tt>${RUTDIR}</tt> by setting the <tt>$(KVM_RUTDIR)</tt> make variable:


* get the next test
echo KVM_RUTDIR=$(realpath $RUTDIR)          >> testbench/Makefile.inc.local
* submit required domains to reboot pool
* wait for domains to reboot
* run test
* repeat


By adjusting KVM_WORKERS and KVM_PREFIXES it is possible to:
(<tt>$(KVM_SOURCEDIR)</tt> and <tt>$(KVM_TESTINGDIR)</tt> default to <tt>$(KVM_RUTDIR)</tt>; you can also set <tt>$(KVM_SOURCEDIR)</tt> and <tt>$(KVM_TESTINGDIR)</tt> explicitly.)


* speed up test runs
Now, (re-)transmogrify the <tt>testbench</tt> so that, within the domains, <tt>/source</tt> points at <tt>${RUTDIR}</tt> and <tt>/testing</tt> points at <tt>${RUTDIR}/testing</tt>:
* run independent testsuites in parallel


=== The reboot thread pool - make KVM_WORKERS=... ===
./testbench/kvm transmogrify


Booting the domains is the most CPU intensive part of running a test, and trying to perform too many reboots in parallel will bog down the machine to the point where tests time out and interactive performance becomes hopeless.  For this reason a pre-sized pool of reboot threads is used to reboot domains:
in the command building the fedora domain look for output like:


* the default is 1 reboot thread limiting things to one domain reboot at a time
--filesystem=target=bench,type=mount,accessmode=squash,source=/.../testbench \
* KVM_WORKERS specifies the number of reboot threads, and hence, the reboot parallelism
--filesystem=target=source,type=mount,accessmode=squash,source=${RUTDIR} \
* increasing this allows more domains to be rebooted in parallel
--filesystem=target=testing,type=mount,accessmode=squash,source=${RUTDIR}/testing \
* however, increasing this consumes more CPU resources


To increase the size of the reboot thread pool set KVM_WORKERS.  For instance:
Finally install and then run a test:


<pre>
./testbench/kvm install check diff $RUTDIR/testing/pluto/basic-pluto-01
$ grep KVM_WORKERS Makefile.inc.local
KVM_WORKERS=2
$ make kvm-install kvm-test
[...]
runner 0.019: using a pool of 2 worker threads to reboot domains
[...]
runner basic-pluto-01 0.647/0.601: 0 shutdown/reboot jobs ahead of us in the queue
runner basic-pluto-01 0.647/0.601: submitting shutdown jobs for unused domains: road nic north
runner basic-pluto-01 0.653/0.607: submitting boot-and-login jobs for test domains: east west
runner basic-pluto-01 0.654/0.608: submitted 5 jobs; currently 3 jobs pending
[...]
runner basic-pluto-01 28.585/28.539: domains started after 28 seconds
</pre>


Only if your machine has lots of cores should you consider adjusting this in Makefile.inc.local.
If you prefer you can run <tt>testbench/kvm</tt>:


=== The tests thread pool - make KVM_PREFIXES=... ===
* from the <tt>testbench</tt> directory as <tt>./kvm</tt>
* from the <tt>${RUTDIR}</tt> directory as <tt>../testbench/kvm</tt>


Note that this is still somewhat experimental and has limitations:
just do not run $RUTDIR/kvm.


* stopping parallel tests requires multiple control-c's
==== Reviving the dead OS ====
* since the duplicate domains have the same IP address, things like "ssh east" don't apply; use "make kvmsh-<prefix><domain>" or "sudo virsh console <prefix><domain>" or "./testing/utils/kvmsh.py <prefix><domain>".


Tests spend a lot of their time waiting for timeouts or slow tasks to completeSo that tests can be run in parallel the KVM_PREFIX provides a list of prefixes to add to the host names forming unique domain groups that can each be used to run tests:
Again looking at v4_maint branchCheck it out:


* the default is no prefix limiting things to a single global domain pool
  git checkout ... -b v4_maint
* KVM_PREFIXES specifies the domain prefixes to use, and hence, the test parallelism
* increasing this allows more tests to be run in parallel
* however, increasing this consumes more memory and context switch resources


For instance, setting KVM_PREFIXES in Makefile.inc.local to specify a unique set of domains for this directory:
add the following to Makefile.inc.local:


<pre>
  KVM_PREFIX=v4
$ grep KVM_PREFIX Makefile.inc.local
  KVM_FEDORA_ISO_URL = https://archives.fedoraproject.org/pub/archive/fedora/linux/releases/35/Server/x86_64/iso/Fedora-Server-dvd-x86_64-35-1.2.iso
KVM_PREFIX=a.
$ make kvm-install
[...]
$ make kvm-test
[...]
runner 0.018: using the serial test processor and domain prefix 'a.'
[...]
a.runner basic-pluto-01 0.574: submitting boot-and-login jobs for test domains: a.west a.east
</pre>


And setting KVM_PREFIXES in Makefile.inc.local to specify two prefixes and, consequently, run two tests in parallel:
build fedora-base:


<pre>
  ./kvm base-fedora
$ grep KVM_PREFIX Makefile.inc.local
KVM_PREFIX=a. b.
$ make kvm-install
[...]
$ make kvm-test
[...]
runner 0.019: using the parallel test processor and domain prefixes ['a.', 'b.']
[...]
b.runner basic-pluto-02 0.632/0.596: submitting boot-and-login jobs for test domains: b.west b.east
[...]
a.runner basic-pluto-01 0.769/0.731: submitting boot-and-login jobs for test domains: a.west a.east
</pre>


creates and uses two dedicated domain/network groups (a.east ..., and b.east ...).
login to the base domain:


Finally, to get rid of all the domains use:
  ./kvm sh fedora-base


<pre>
and edit the repos per:
$ make kvm-uninstall
</pre>


or even:
  /etc/yum.repos.d/fedora.repo:name=Fedora $releasever - $basearch
  /etc/yum.repos.d/fedora.repo:baseurl=https://archives.fedoraproject.org/pub/archive/fedora/linux/releases/35/Everything/x86_64/os
  /etc/yum.repos.d/fedora.repo:name=Fedora $releasever - $basearch - Debug
  /etc/yum.repos.d/fedora.repo:baseurl=https://archives.fedoraproject.org/pub/archive/fedora/linux/releases/35/Everything/x86_64/debug/tree/
  /etc/yum.repos.d/fedora.repo:name=Fedora $releasever - Source
  /etc/yum.repos.d/fedora.repo:baseurl=https://archives.fedoraproject.org/pub/archive/fedora/linux/releases/35/Everything/source/tree/
  /etc/yum.repos.d/fedora-updates.repo:name=Fedora $releasever - $basearch - Updates
  /etc/yum.repos.d/fedora-updates.repo:baseurl=https://archives.fedoraproject.org/pub/archive/fedora/linux/updates/35/Everything/x86_64/
  /etc/yum.repos.d/fedora-updates.repo:name=Fedora $releasever - $basearch - Updates - Debug
  /etc/yum.repos.d/fedora-updates.repo:baseurl=https://archives.fedoraproject.org/pub/archive/fedora/linux/updates/35/Everything/x86_64/debug/
  /etc/yum.repos.d/fedora-updates.repo:name=Fedora $releasever - Updates Source
  /etc/yum.repos.d/fedora-updates.repo:baseurl=https://archives.fedoraproject.org/pub/archive/fedora/linux/updates/35/Everything/source/tree/


<pre>
after that:
$ make KVM_PREFIX=b. kvm-uninstall
</pre>


Two domain groups (e.g., KVM_PREFIX=a. b.) seem to give the best results.
  ./kvm install check


=== Recommendations ===
might work


==== Some Analysis ====
=== Tracking down regressions (using git bisect) ===


The test system:
==== The easy way ====


* 4-core 64-bit intel
This workflow works best when the regression is recent (i.e., the last few commits) and nothing significant has happened in the meantime (for instance, os upgrade, test rename, ...).
* plenty of ram
* the file mk/perf.sh


Increasing the number of parallel tests, for a given number of reboot threads:
The command <tt>./kvm install check diff</tt> exits with a <tt>git bisect</tt> friendly status code, which means it can be combined with <tt>git bisect run</tt> to automate regression testing.


[[File:tests-vs-reboots.png]]
For instance:


* having #cores/2 reboot threads has the greatest impact
git bisect start main ^<suspect-commit>
* having more than #cores reboot threads seems to slow things down
git bisect run ./kvm install check diff testing/pluto/basic-pluto-01
git bisect visualize
# finally
git bisect reset
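
What "bisect friendly" means can be sketched as follows. <tt>git bisect run</tt> treats an exit status of 0 as good, 125 as "skip this commit", any other status from 1 to 127 as bad, and anything else as a reason to abort. The helper name <tt>bisect_verdict</tt> below is made up for illustration; it is not part of <tt>./kvm</tt>, and it says nothing about which statuses <tt>./kvm</tt> actually returns.

```shell
# Hypothetical helper (not part of ./kvm): describe how "git bisect run"
# interprets the exit status given as $1.
bisect_verdict() {
    case "$1" in
        0)   echo good ;;      # commit is good/old
        125) echo skip ;;      # commit cannot be tested, skip it
        *)   if [ "$1" -ge 1 ] && [ "$1" -le 127 ]; then
                 echo bad      # commit is bad/new
             else
                 echo abort    # 128 and above stop the bisect
             fi ;;
    esac
}
```

For instance, after a manual test run, <tt>bisect_verdict $?</tt> shows what <tt>git bisect run</tt> would have concluded from that status.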


Increasing the number of reboots, for a given number of test threads:
==== The hard way ====


[[File:reboots-vs-tests.png]]
This workflow works best when trying to track down a regression in an older version of libreswan.


* adding a second test thread has a far greater impact than adding a second reboot thread - contrast top lines
Two repositories are used:
* adding a third and even fourth test thread - i.e., up to #cores - still improves things


Finally here's some ASCII art showing what happens to the failure rate when the KVM_PREFIX is set so big that the reboot thread pool is kept 100% busy:
#  <tt>repo-under-test</tt>
#: this contains the sources that will be built and installed into the test domains and is what git bisect will manipulate
# <tt>testbench</tt>
#: this contains the test scripts used to drive <tt>repo-under-test</tt>


<pre>
Start by checking out the two repositories (existing repositories can also be used, carefully):
                  Fails  Reboots  Time
    ************  127      1    6:35  ****************************************
  **************  135      2    3:33  *********************
  ***************  151      3    3:12  *******************
  ***************  154      4    3:01  ******************
</pre>


Notice how having more than #cores/2 KVM_WORKERS (here 2) has little benefit and failures edge upwards.
git clone https://github.com/libreswan/libreswan.git repo-under-test
git clone https://github.com/libreswan/libreswan.git testbench


==== Desktop Development Directory ====
and then cd to the <tt>repo-under-test</tt> directory:


* reduce build/install time - use only one prefix
  cd repo-under-test
* reduce single-test time - boot domains in parallel
* use the non-prefix domains east et.al. so it is easy to access the test domains using tools like ssh


Let's assume 4 cores:
Next, configure <tt>testbench</tt> so that it compiles and installs libreswan from <tt>repo-under-test</tt> but runs tests from <tt>testbench</tt>.  Do this by pointing the <tt>testbench</tt> <tt>KVM_SOURCEDIR</tt> (<tt>/source</tt>) at <tt>repo-under-test</tt> vis:


<pre>
# remember $PWD is repo-under-test
KVM_WORKERS=2
echo KVM_SOURCEDIR=$(realpath ../repo-under-test)    >>../testbench/Makefile.inc.local
KVM_PREFIX=''
echo KVM_TESTINGDIR=$(realpath ../testbench/testing) >>../testbench/Makefile.inc.local
</pre>


You could also add a second prefix vis:
Now, (re-)transmogrify the <tt>testbench</tt> so that, within the domains, <tt>/source</tt> points at <tt>repo-under-test</tt>:


<pre>
../testbench/kvm transmogrify
KVM_PREFIX= '' a.
</pre>


but that, unfortunately, slows down the build/install time.
in the command building the fedora domain look for output like:


==== Desktop Baseline Directory ====
--filesystem=target=bench,type=mount,accessmode=squash,source=/.../testbench \
--filesystem=target=source,type=mount,accessmode=squash,source=/.../repo-under-test \
--filesystem=target=testing,type=mount,accessmode=squash,source=/.../testbench/testing \


* do not overload the desktop - reduce CPU load by booting sequentially
Finally run the tests (remember testing/pluto/basic-pluto-01 is the test that started failing):
* reduce total testsuite time - run tests in parallel
* keep separate to development directory above


Let's assume 4 cores
# start with the bad commit
git bisect start main
# next checkout and confirm the good commit
# NOTE: run testbench/kvm from repo-under-test directory
git checkout <good-commit>
../testbench/kvm install check diff testing/pluto/basic-pluto-01
git bisect good


* KVM_WORKERS=1
if you're lucky, the test requires no manual intervention and:
* KVM_PREFIX= b1. b2.


==== Dedicated Test Server ====
git bisect run ../testbench/kvm install check diff testing/pluto/basic-pluto-01


* minimize total testsuite time
also works:
* maximize CPU use
* assume only testsuite running


Assuming 4 cores:
# finally
git bisect visualize
git bisect reset


<pre>
TODO: figure out how to get ../testbench/kvm diff to honour KVM_TESTINGDIR so that it can handle a test somewhere other than in <tt>testbench</tt>
* KVM_WORKERS=2
* KVM_PREFIX= '' t1. t2. t3.
</pre>

Latest revision as of 22:55, 18 October 2024

KVM Test framework

Libreswan's test framework can be run using KVM guests, and the ./kvm script. It is strongly recommended to run the test suite on a host machine that has a CPU with virtualisation instructions.

To access files on the host file system:

  • Fedora uses the PLAN9 filesystem (9p)
  • Other guests (Alpine, Debian, FreeBSD, NetBSD, OpenBSD) use NFS via the NAT interface

For an overview of the network and testing see Test_Suite


Preparing the host machine

Check Virtualization is enabled in the BIOS

Virtualization needs to be enabled by the BIOS during boot.

 grep -e vmx -e svm /proc/cpuinfo
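
The grep above prints every matching line, which can be a lot of output. A minimal sketch that instead counts the logical CPUs advertising either flag, assuming a Linux-style cpuinfo layout with one "flags" line per CPU (the function name virt_cpu_count is made up for illustration):

```shell
# Hypothetical helper: count the "flags" lines (one per logical CPU)
# that list Intel VT-x (vmx) or AMD-V (svm).
# $1: cpuinfo file to inspect (defaults to /proc/cpuinfo)
virt_cpu_count() {
    # grep -c prints the number of matching lines (0 when none match)
    grep -E -c '^flags.*[[:space:]](vmx|svm)([[:space:]]|$)' "${1:-/proc/cpuinfo}"
}
```

Running virt_cpu_count with no argument inspects the host; a result of 0 means no CPU reported either flag.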

Add yourself to sudo

Some of the test scripts need to be run as root. The test environment assumes this can be done using sudo without a password vis:

sudo pwd

XXX: Surely qemu can be driven without root?

This is set up by adding an entry under /etc/sudoers.d/ specifying that your account does not need a password to become root:

echo "$(id -u -n) ALL=(ALL) NOPASSWD: ALL" | sudo dd of=/etc/sudoers.d/$(id -u -n)

Fight SELinux

SELinux blocks some actions that we need. We have not created any SELinux rules to avoid this. To check the current settings:

 getenforce

The options are:

  • set SELinux to permissive (recommended)
sudo sed --in-place=.ORIG -e 's/^SELINUX=.*/SELINUX=permissive/' /etc/selinux/config
sudo setenforce Permissive
  • disable SELinux
sudo sed --in-place=.ORIG -e 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
sudo reboot
  • (experimental) label source tree for SELinux

The source tree on the host is shared with the virtual machines. SELinux considers this a bug unless the tree is labelled with type svirt_image_t.

sudo dnf install policycoreutils-python-utils
sudo semanage fcontext -a -t svirt_image_t "$(pwd)"'(/.*)?'
sudo restorecon -vR /home/build/libreswan

There may be other things that SELinux objects to.

Check that the host has enough entropy

As a rough guide run:

while true ; do cat /proc/sys/kernel/random/entropy_avail ; sleep 3 ; done

It should show values in the hundreds if not thousands. If it is in the units or tens then see Entropy matters
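
The rule of thumb above can be turned into a one-shot check. This is only a sketch: the function name entropy_status is made up, and the cut-off of 100 is an assumption read off the "hundreds" guidance, not a value taken from the test framework.

```shell
# Hypothetical helper: classify an entropy_avail reading given as $1.
# The threshold (100) is an assumption based on the "hundreds" guidance.
entropy_status() {
    if [ "$1" -ge 100 ]; then
        echo ok
    else
        echo low
    fi
}
```

On the host it could be used as entropy_status "$(cat /proc/sys/kernel/random/entropy_avail)".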

Install Dependencies

Packages to install, by purpose, for Fedora and for Mint (debian):

Basics
  • Fedora: sudo dnf install -y make git gitk patch xmlto python3-pexpect curl tar
  • Mint (debian): sudo apt-get install -y make make-doc git gitk xmlto python3-pexpect curl tar
Virtualization
  • Fedora: sudo dnf install -y qemu virt-install libvirt-daemon-kvm libvirt-daemon-qemu
  • Mint (debian): sudo apt install -y qemu virtinst libvirt-clients libvirt-daemon libvirt-daemon-system libvirt-daemon-driver-qemu libosinfo-query qemu-system-x86?
Build BSD Boot CDs
  • Fedora: sudo dnf install -y dvd+rw-tools
  • Mint (debian): sudo apt-get install -y dvd+rw-tools
Build Web Pages
  • Fedora: sudo dnf install -y jq typescript
  • Mint (debian): sudo apt-get install -y jq node-typescript
Serve Web Server (optional)
  • Fedora: sudo dnf install -y httpd
  • Mint (debian): sudo apt-get install -y ????
NFS
  • Fedora: sudo dnf install -y nfs-utils # ???
  • Mint (debian): sudo apt-get install -y nfs-kernel-server rpcbind
Broken makefiles
  • Fedora: sudo dnf install -y nss-devel # make file invokes pkg-config nss

Enable libvirt

If you're switching from the old libvirtd see https://libvirt.org/daemons.html#switching-to-modular-daemons for how to shut down the old daemons.

Start the "collection of modular daemons that replace functionality previously provided by the monolithic libvirtd daemon":

for drv in qemu network nodedev nwfilter secret storage interface
do
   sudo systemctl unmask virt${drv}d.service
   sudo systemctl unmask virt${drv}d{,-ro,-admin}.socket
   sudo systemctl enable virt${drv}d.service
   sudo systemctl enable virt${drv}d{,-ro,-admin}.socket
done
for drv in qemu network nodedev nwfilter secret storage
do
   sudo systemctl start virt${drv}d{,-ro,-admin}.socket
done

There should be no errors and warnings.

Stop libvirt daemons shutting down

By default the libvirt daemons timeout and shutdown after 120 seconds (surely systemd will restart them!). It turns out this hasn't worked so well:

  • systemd doesn't restart the daemon
  • the restart is painfully slow with lots of networks which causes the timeout

Disabling the timeout and just leaving the daemons running seems to help. Add the following:

echo VIRTNETWORKD_ARGS= | sudo dd of=/etc/sysconfig/virtnetworkd
echo VIRTQEMUD_ARGS=    | sudo dd of=/etc/sysconfig/virtqemud
echo VIRTSTORAGED_ARGS= | sudo dd of=/etc/sysconfig/virtstoraged

the standard libvirt systemd config files read these settings using EnvironmentFile=

Add yourself to the KVM/QEMU group

You need to add yourself to the group that QEMU/KVM uses when writing to /var/lib/libvirt/qemu. On Fedora it is 'qemu', and on Debian it is 'kvm'. Something like:

sudo usermod -a -G $(stat --format %G /var/lib/libvirt/qemu) $(id -u -n)

After this you will need to log in again (or run sudo su - $(id -u -n)).
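
To confirm the group change took effect, something like the following can be used. It is a sketch only; the function name user_in_group is made up for illustration.

```shell
# Hypothetical helper: succeed if user $1 is a member of group $2
# according to the system's user/group database.
user_in_group() {
    # id -nG prints the user's group names separated by spaces;
    # put them one per line and look for an exact match.
    id -nG "$1" | tr ' ' '\n' | grep -qx -- "$2"
}
```

For example: user_in_group "$(id -u -n)" "$(stat --format %G /var/lib/libvirt/qemu)" && echo ok. Note that id with a user name consults the group database, whereas plain id -nG reports the groups of the current session; if the two disagree, the new login has not happened yet.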

Make certain that root can access the build

The path to your build needs to be accessible (executable) by root, assuming things are under home:

chmod a+x $HOME

Fix /var/lib/libvirt/qemu

sudo chmod g+w /var/lib/libvirt/qemu

Arguably we should run libvirt as a normal user instead.

Enable Tab Completion of ./kvm

If this:

complete -o filenames -C './kvm' ./kvm

is added to .bashrc then tab completion with ./kvm will include both commands and directories.

Set up a Web Server (optional)

If the machine is to run nightly test runs then it can be set up as a web server. See the nightly test results for an example.

See above for dependencies. See below for how to configure libreswan.

To set up the server:

sudo mkdir /var/www/html/results/
sudo chown $(id -un) /var/www/html/results/
sudo chmod 755 /var/www/html/results/
sudo sh -c 'echo "AddType text/plain .diff" >/etc/httpd/conf.d/diff.conf'

to run the web server until the next reboot:

sudo firewall-cmd --add-service=http
sudo systemctl start httpd

to make the web server permanent:

sudo systemctl enable httpd
sudo firewall-cmd --add-service=http --permanent

If you want it to be the main page of the website, you can create the file /var/www/html/index.html containing:

cat <<EOF
 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
 <html>
   <head>
     <meta http-equiv="REFRESH" content="0;url=/results/">
  </head>
   <BODY>
  </BODY>
 </HTML>
 
EOF

Debian

Override python?

On Debian slack based systems (i.e., Linux Mint 20.3), the default python is too old. Fortunately python 3.9 is also available vis:

sudo apt-get install python3.9

in addition, the make variable KVM_PYTHON will need to be added to Makefile.inc.local:

echo KVM_PYTHON=python3.9 >> Makefile.inc.local

BSD

Anyone?

Download and configure libreswan

Fetch Libreswan

The libreswan source tree includes all the components that are used on the host and inside the test VMs. To get the latest source code using git:

git clone https://github.com/libreswan/libreswan
cd libreswan

Developers can use Makefile.inc.local to override default build settings. Create the file:

touch Makefile.inc.local

(packaging systems should not use this, and instead explicitly pass the make variables to the make command)

Create $(KVM_POOLDIR) for storing VM disk images

The pool directory is used to store:

  • VM disk images
  • install CD/DVD images
  • downloaded packages installed into the VMs
  • other files

and can get quite large. It can and should be shared between build trees (this reflects libvirt which has a single name space for domains). $(KVM_PREFIX) (see further down) addresses the lack of name spaces.

By default $(top_srcdir)/../pool (../pool) is used (that is, adjacent to your source tree). It will need to be created.

Alternatively the shared pool directory can be specified explicitly by setting the make variable KVM_POOLDIR in Makefile.inc.local vis:

mkdir -p /home/libreswan/pool
echo KVM_POOLDIR=/home/libreswan/pool >> Makefile.inc.local
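
Because echo ... >> appends unconditionally, running it twice leaves two assignments in Makefile.inc.local. A minimal sketch of an idempotent alternative follows; set_make_var is a hypothetical helper name, not something shipped with libreswan:

```shell
#!/bin/sh
# Sketch: append NAME=VALUE to a make include file only when NAME is not
# already assigned there.  set_make_var is a hypothetical helper, not
# part of libreswan.
set_make_var() {
    file=$1 name=$2 value=$3
    touch "$file"
    # match NAME=, NAME :=, NAME ?= and NAME += at the start of a line
    if grep -Eq "^${name}[[:space:]]*[:?+]?=" "$file"; then
        echo "$name is already set in $file" >&2
        return 0
    fi
    echo "${name}=${value}" >> "$file"
}
```

For instance, set_make_var Makefile.inc.local KVM_POOLDIR /home/libreswan/pool could stand in for the echo above.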

Configure $(KVM_LOCALDIR) to store test domain disks in /tmp/pool (tmpfs) (optional)

By default, all disk images are stored in $(KVM_POOLDIR) (see above). Since the test VM disk images do not need long-term storage (i.e., they do not need to survive a reboot), $(KVM_LOCALDIR) can be used to specify that test VM disk images are stored in /tmp vis:

echo KVM_LOCALDIR=/tmp/pool >> Makefile.inc.local

This has the advantage of eliminating physical disk I/O as a bottleneck when accessing VM disk images, but the disadvantage of needing to re-build the test disk images after a reboot.

Note: now that the domains are 100% transient this may have zero benefit.
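
The disk I/O saving only applies when /tmp really is tmpfs on the host, which is not the case on every distribution. A small sketch for checking; fs_type is a hypothetical helper, and it assumes GNU stat:

```shell
#!/bin/sh
# Sketch: report the type of the filesystem backing a directory.
# fs_type is a hypothetical helper; it relies on GNU coreutils stat.
fs_type() {
    # -f queries the filesystem rather than the file; %T prints its type
    stat -f -c %T "$1"
}

# prints "tmpfs" when /tmp is memory backed
fs_type /tmp
```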

Configure $(KVM_PREFIX) to allow multiple build trees on a machine (optional)

By default the domains and networks are assigned names such as linux, east, 198_18_1, et.al. The problem is that these names are not unique between build trees, and as a result, all build trees try to use the same domains and networks.

The "fix" is to define $(KVM_PREFIX) in Makefile.inc.local, giving it a different value in each build tree. For instance:

$ cat libreswan-a/Makefile.inc.local
KVM_PREFIX=a.
$ cat libreswan-b/Makefile.inc.local
KVM_PREFIX=b.

will use names such as a.linux et.al. in the first tree and b.linux et.al. in the second tree.

For convenience, commands such as:

libreswan-a$ ./kvm sh linux

will log into the current build tree's domain (here a.linux).

Note: due to limitations in the network stack (interface names have a limit of 16 characters), the prefix needs to be short.

Configure $(KVM_WORKERS) to run things in parallel (optional)

By default all operations (building and testing) are serialized (even the VMs are given only one CPU!). If the host has plenty of cores then the parallelism can be increased using $(KVM_WORKERS). It does the following:

  • assigns $(KVM_WORKERS) CPUs to the build VMs
  • runs make -j $(KVM_WORKERS) when building and installing libreswan
  • runs $(KVM_WORKERS) tests in parallel

To make running tests in parallel possible $(KVM_PREFIX) and the numbers 1..$(KVM_WORKERS) are combined to generate unique domain and network names. For instance, with:

KVM_PREFIX=a.
KVM_WORKERS=3

the prefixes a., a2, and a3 are used, generating the names a.east, a2east, a3east, et.al.
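
The naming rule can be sketched in shell as follows. This is inferred purely from the example above (the first worker keeps the prefix, later workers drop the trailing "." and append their number); the real logic lives in the kvm scripts and may differ, and worker_prefix is a hypothetical name:

```shell
#!/bin/sh
# Sketch of the naming rule, inferred from the example (a., a2, a3).
# worker_prefix is a hypothetical helper, not the actual kvm code.
worker_prefix() {
    prefix=$1 worker=$2
    if [ "$worker" -eq 1 ]; then
        # first worker: the prefix is used unchanged
        printf '%s\n' "$prefix"
    else
        # later workers: drop a trailing "." and append the worker number
        printf '%s%s\n' "${prefix%.}" "$worker"
    fi
}

KVM_PREFIX=a.
KVM_WORKERS=3
worker=1
while [ "$worker" -le "$KVM_WORKERS" ]; do
    # prints a.east, a2east, a3east (one per line)
    echo "$(worker_prefix "$KVM_PREFIX" "$worker")east"
    worker=$((worker + 1))
done
```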

Note: $(KVM_WORKERS) is ignored when $(KVM_PREFIX) is not set. This might be a bug.

Generate a web page of the test results (optional)

See the nightly test results for an example and how to set up a web server so results can be viewed remotely.

To initially create the web directory RESULTS/ and populate it with the current test results use:

make web

Further test runs will update the RESULTS/ directory. The files can then be viewed using a file:// URL.

To disable web page generation, delete the directory RESULTS/.

To instead publish the results on the web, point $(WEB_SUMMARYDIR) at the web directory:

$ echo WEB_SUMMARYDIR=/var/www/html/results >> Makefile.inc.local

Running the testsuite

The testsuite is driven using the top-level script ./kvm


For the impatient: ./kvm install check

To build the VMs, and build and install (or update) libreswan, and then run the tests, use:

./kvm install check

Running the testsuite

./kvm install
update the KVMs ready for a new test run
./kvm check
run the testsuite, previous results are saved in BACKUP/-date-
./kvm recheck
run the testsuite, but skip tests that already passed
./kvm results
list the results from the test run
./kvm diffs
display differences between the test results and the expected results, exit non-zero if there are any
./kvm test-clean
delete the current test results

the operations can be combined on a single line:

./kvm test-clean install check recheck diff

and individual tests can be selected (see Running a Single Test, below):

./kvm install check diff testing/pluto/*ikev2*

To stop ./kvm use control-c or ./kvm kill from another terminal.

Updating Certificates

The full testsuite requires a number of certificates. If not present, then ./kvm check will automatically generate them using the domain linux. Just note that the certificates have a limited lifetime. Should the test system detect out-of-date certificates then ./kvm check will barf.

To rebuild the certificates, forcing the generation of new ones, use:

./kvm keys

Maintaining (rebuilding and updating) the Domains

In normal operation, the only domains of interest are:

  • build domains (linux, netbsd, ...)
    ./kvm install uses these for incremental builds
    to force a scratch build run ./kvm uninstall
  • test domains (east, west, ...)
    ./kvm install always rebuilds these
    since these domains are transient, they disappear after a reboot

And to clean up everything:

./kvm clean

Finally, to upgrade the domains:

./kvm upgrade

Per above, these can be combined:

./kvm test-clean install check
./kvm upgrade install check

Internally, additional domains are created.

The table below lists all the domains and how to manipulate them. There's no need to delete a domain before rebuilding it. For instance:

./kvm test-clean upgrade install check

is equivalent to:

./kvm test-clean
./kvm downgrade
./kvm upgrade
./kvm transmogrify
./kvm install
./kvm check

There are two variants of each command. The first creates all the domains, the second only creates the specified domain.

step | new domain | create | cloned from | mounts | networks | delete | notes
base | linux-base | ./kvm base; ./kvm base-linux | ISOs | /pool; /bench | gateway | ./kvm purge; ./kvm demolish | installs the bare minimum needed to get a domain on the network; root's account is hacked so that exit codes appear in the prompt; demolish also deletes the gateway
upgrade | linux-upgrade | ./kvm upgrade; ./kvm upgrade-linux | linux-base | /pool; /bench | gateway | ./kvm downgrade | installs and/or upgrades all packages needed to build and test libreswan using a local cache
transmogrify | linux | ./kvm transmogrify; ./kvm transmogrify-linux | linux-upgrade | /pool; /bench; /source; /testing | gateway | ./kvm uninstall; ./kvm clean | transmogrify the domain adding configuration and other files needed to build and test; if necessary install custom kernels and/or save kernels for direct boot
install | east et.al. | ./kvm install; ./kvm install-linux | linux | /source; /testing | test networks; possibly gateway | ./kvm uninstall; ./kvm clean | install linux and then clone creating test domains

Mount Points

In normal operation, the only mount points of interest within a domain are /source and /testing. These are configured to point at the current source tree.

Internally, the following additional mount points are used:

mount | variable | default | use when ... | notes
/testing | $(KVM_TESTDIR) | libreswan/testing | running tests | the tests to run
/source | $(KVM_SOURCEDIR) | libreswan/ | during install | the source code to build and install
/bench | $(KVM_SOURCEDIR) | libreswan/ | building VMs | the scripts driving the tests
/pool | $(KVM_POOLDIR) | pool/ | building VMs | KVMs and caches

It is possible, although unusual, to point these at different source trees. For instance: testing.libreswan uses benchdir (/bench) for the scripts, and rutdir (/source, /testing) for the directory being tested; when testing old code /source can be pointed at an alternative directory that contains the sources that are to be built and tested.

Shell and Console Access (Logging In)

There are several different ways to gain shell access to the domains.

Each method, depending on the situation, has both advantages and disadvantages. For instance:

  • while make kvmsh-host provides quick access to the console, it doesn't support file copy
  • while SSH takes more effort to set up, it supports things like proper terminal configuration and file copy

Serial Console access using ./kvm sh HOST (kvmsh.py)

./kvm sh HOST is a wrapper around "virsh" that automatically handles things like booting the machine, logging in, and correctly configuring the terminal. Its big advantage is that it always works. For instance:

$ ./testing/utils/kvmsh.py east
[...]
Escape character is ^]
[root@east ~]# printenv TERM
xterm
[root@east ~]# stty -a
...; rows 52; columns 185; ... 
[root@east ~]#

The script "kvmsh.py" can also be used directly to invoke commands on a guest (this is how ./kvm install works):

$ ./testing/utils/kvmsh.py east ls
[root@east ~]# ls
anaconda-ks.cfg

When $(KVM_PREFIX) (and $(KVM_WORKERS)) is defined ./kvm sh east can be used to log into $(KVM_PREFIX)east.

Limitations:

  • no file transfer but files can be accessed via /pool and /testing

Graphical Console access using virt-manager

"virt-manager", a gnome tool, can be used to access individual domains.

While easy to use, it doesn't support cut/paste or mechanisms for copying files.

Shell access using SSH

While requiring more effort to set up, it provides full shell access to the domains.

Since you will be using ssh a lot to login to these machines, it is recommended to either put their names in /etc/hosts:

# /etc/hosts entries for libreswan test suite
192.1.2.45 west
192.1.2.23 east
192.0.3.254 north
192.1.3.209 road
192.1.2.254 nic

or add entries to .ssh/config such as:

Host west
       Hostname 192.1.2.45
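
Rather than typing a Host block per machine, the entries can be generated from the address list above. A minimal sketch; hosts_to_ssh_config is a hypothetical helper, and the User root line is an addition that assumes logins are made as root:

```shell
#!/bin/sh
# Sketch: turn "address name" pairs (the /etc/hosts lines above) into
# ssh_config Host blocks.  hosts_to_ssh_config is a hypothetical helper.
hosts_to_ssh_config() {
    while read -r addr name; do
        # skip blank lines and comments
        case $addr in ''|'#'*) continue ;; esac
        printf 'Host %s\n\tHostname %s\n\tUser root\n' "$name" "$addr"
    done
}

hosts_to_ssh_config <<EOF
# /etc/hosts entries for libreswan test suite
192.1.2.45 west
192.1.2.23 east
192.0.3.254 north
192.1.3.209 road
192.1.2.254 nic
EOF
```

The output can be reviewed and then appended to .ssh/config by hand.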

If you wish to be able to ssh into all the VMs created without using a password, add your ssh public key to testing/baseconfigs/all/etc/ssh/authorized_keys. This file is installed as /root/.ssh/authorized_keys on all VMs.

Using ssh becomes easier if you are running ssh-agent (you probably are) and your public key is known to the virtual machine. This command, run on the host, installs your public key on the root account of the guest machine west. This assumes that west is up (it might not be, but you can put this off until you actually need ssh, at which time the machine would need to be up anyway). Remember that the root password on each guest machine is "swan".

ssh-copy-id root@west

You can use ssh-copy-id for any VM. Unfortunately, the key is forgotten when the VM is restarted.

Limitations:

  • this only works with the default east, et.al. (it does not work with $(KVM_PREFIX) and/or multiple test directories)

kvm workflows

Seeing as everyone has a "flow", why not kvm? Here are some common workflows. The following commands are used:

./kvm modified
list the test directories that have been modified
./kvm baseline
compare test results against a baseline
./kvm patch
update the expected test results
./kvm add
git add the modified test results
./kvm status
show the status of the currently running testsuite
./kvm kill
kill the currently running testsuite

Running a single test

There are two ways to run an individual test:

  1. the test to run can be specified on the command line:
    ./kvm check testing/pluto/basic-pluto-01
  2. the test is implied when running kvm from a test directory:
    cd testing/pluto/basic-pluto-01
    ../../../kvm
    ../../../kvm diff

But there's a catch:

  • in batch mode pluto is shut down at the end of the test
    this way additional post-mortem checks, such as for memory leaks and core dumps that rely on pluto being stopped, can be performed
  • in single test mode the system is left running
    this way it is possible to log in and look around the running system and attach a debugger to pluto before it is shut down

To instead force post-mortem, add:

KVMRUNNER_FLAGS += --run-post-mortem

to Makefile.inc.local.

Working on individual tests

The modified command can be used to limit the test run to just tests with modified files (according to git):

./kvm modified install check diff
install libreswan and then run the testsuite against just the modified tests, display differences
./kvm modified recheck diff
re-run the modified tests that are failing, display differences
./kvm modified patch add
update the modified tests applying the latest output and add them to git

this workflow comes into its own when updating tests en masse using sed, for instance:

sed -i -e 's/PARENT_//' testing/pluto/*/*.console.txt
./kvm modified check
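
Before editing in place, it can help to list which files the substitution would touch. A minimal sketch; files_matching is a hypothetical helper, and the pattern and paths are the ones from the example above:

```shell
#!/bin/sh
# Sketch: list the files a substitution would touch before editing them
# in place.  files_matching is a hypothetical helper.
files_matching() {
    pattern=$1; shift
    # -l prints only the names of the files that contain a match
    grep -l -e "$pattern" -- "$@" 2>/dev/null
}

# Preview, then apply (GNU sed -i, as in the example above):
#   files_matching 'PARENT_' testing/pluto/*/*.console.txt
#   sed -i -e 's/PARENT_//' testing/pluto/*/*.console.txt
```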

Controlling a test run remotely

Start the testsuite in the background:

./kvm nohup check

To determine if the testsuite is still running:

./kvm status

and to stop the running testsuite:

./kvm kill

Debugging inside the VM (pluto on east)

Terminal 1 - east: log into east, start pluto, and attach gdb

./kvm sh east
east# cd /testing/pluto/basic-pluto-01
east# sh -x ./eastinit.sh
east# gdb /usr/local/libexec/ipsec/pluto $(pidof pluto)
(gdb) c

If pluto isn't running then gdb will complain with: --p requires an argument

Terminal 2 - west: log into west, start pluto and the test

./kvm sh west
west# cd /testing/pluto/basic-pluto-01
west# sh -x ./westinit.sh ; sh -x ./westrun.sh

When pluto crashes, gdb will show that and await commands. For example, the bt command will show a backtrace.

TODO:

  • stop watchdog eventually killing pluto
  • notes for west

Running a Custom Kernel

Custom NetBSD Kernel

Build the kernel per upstream documentation and then copy it to:

$(KVM_POOLDIR)/$(KVM_PREFIX)netbsd-kernel

During transmogrify the stock kernel will be replaced with the above.

Custom Linux Kernel

The linux test domains (east, west, et.al.) boot the kernel directly using:

$(KVM_POOLDIR)/$(KVM_PREFIX)linux-upgrade.vmlinuz
$(KVM_POOLDIR)/$(KVM_PREFIX)linux-upgrade.initramfs

These files are re-created whenever upgrade is run. To boot a different kernel, replace the above (or edit the corresponding east.xml et.al. file with the new location).

Building and testing an old branch

Old branches have two problems:

  • the KVM codebase is out-of-date
  • the OS releases are gone

Here are two ways to get around it:

Using a test-bench

This workflow works best when working on an old branch (let's say v4.11)

Two repositories are used:

  1. repo under test aka RUTDIR
    this contains both the sources and the tests
  2. testbench
    this contains the test scripts used to drive ${RUTDIR}

Start by checking out the two repositories (existing repositories can also be used, carefully):

RUTDIR=$PWD/v4_maint ; export RUTDIR
git clone https://github.com/libreswan/libreswan.git -b v4_maint ${RUTDIR}
git clone https://github.com/libreswan/libreswan.git testbench

Next, configure testbench so that it compiles, installs, and runs tests from ${RUTDIR} by setting the $(KVM_RUTDIR) make variable:

echo KVM_RUTDIR=$(realpath $RUTDIR)           >> testbench/Makefile.inc.local

($(KVM_SOURCEDIR) and $(KVM_TESTINGDIR) default to $(KVM_RUTDIR); you can also set $(KVM_SOURCEDIR) and $(KVM_TESTINGDIR) explicitly).

Now, (re-)transmogrify the testbench so that, within the domains, /source points at ${RUTDIR} and /testing points at ${RUTDIR}/testing:

./testbench/kvm transmogrify

in the command building the fedora domain look for output like:

--filesystem=target=bench,type=mount,accessmode=squash,source=/.../testbench \
--filesystem=target=source,type=mount,accessmode=squash,source=${RUTDIR} \
--filesystem=target=testing,type=mount,accessmode=squash,source=${RUTDIR}/testing \

Finally install and then run a test:

./testbench/kvm install check diff $RUTDIR/testing/pluto/basic-pluto-01

If you prefer you can run testbench/kvm:

  • from the testbench directory as ./kvm
  • from the ${RUTDIR} directory as ../testbench/kvm

just do not run $RUTDIR/kvm.

Reviving the dead OS

Again looking at v4_maint branch. Check it out:

 git checkout ... -b v4_maint

add the following to Makefile.inc.local:

 KVM_PREFIX=v4
 KVM_FEDORA_ISO_URL = https://archives.fedoraproject.org/pub/archive/fedora/linux/releases/35/Server/x86_64/iso/Fedora-Server-dvd-x86_64-35-1.2.iso

build fedora-base:

 ./kvm base-fedora

login to the base domain:

 ./kvm sh fedora-base

and edit the repos per:

 /etc/yum.repos.d/fedora.repo:name=Fedora $releasever - $basearch
 /etc/yum.repos.d/fedora.repo:baseurl=https://archives.fedoraproject.org/pub/archive/fedora/linux/releases/35/Everything/x86_64/os
 /etc/yum.repos.d/fedora.repo:name=Fedora $releasever - $basearch - Debug
 /etc/yum.repos.d/fedora.repo:baseurl=https://archives.fedoraproject.org/pub/archive/fedora/linux/releases/35/Everything/x86_64/debug/tree/
 /etc/yum.repos.d/fedora.repo:name=Fedora $releasever - Source
 /etc/yum.repos.d/fedora.repo:baseurl=https://archives.fedoraproject.org/pub/archive/fedora/linux/releases/35/Everything/source/tree/
 /etc/yum.repos.d/fedora-updates.repo:name=Fedora $releasever - $basearch - Updates
 /etc/yum.repos.d/fedora-updates.repo:baseurl=https://archives.fedoraproject.org/pub/archive/fedora/linux/updates/35/Everything/x86_64/
 /etc/yum.repos.d/fedora-updates.repo:name=Fedora $releasever - $basearch - Updates - Debug
 /etc/yum.repos.d/fedora-updates.repo:baseurl=https://archives.fedoraproject.org/pub/archive/fedora/linux/updates/35/Everything/x86_64/debug/
 /etc/yum.repos.d/fedora-updates.repo:name=Fedora $releasever - Updates Source
 /etc/yum.repos.d/fedora-updates.repo:baseurl=https://archives.fedoraproject.org/pub/archive/fedora/linux/updates/35/Everything/source/tree/

after that:

 ./kvm install check

might work

Tracking down regressions (using git bisect)

The easy way

This workflow works best when the regression is recent (i.e., the last few commits) and nothing significant has happened in the meantime (for instance, os upgrade, test rename, ...).

The command ./kvm install check diff exits with git bisect friendly status codes, which means it can be combined with git bisect run to automate regression testing.

For instance:

git bisect start main ^<suspect-commit>
git bisect run ./kvm install check diff testing/pluto/basic-pluto-01
git bisect visualize
# finally
git bisect reset
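
For reference, git bisect run reads the command's exit status as: 0 means good, 125 means skip this commit, and any other value from 1 to 127 means bad. When some commits in the range do not build, a wrapper can map a failed build to "skip" instead of "bad". A minimal sketch; bisect_step is a hypothetical helper, and it assumes the build command exits non-zero when the build fails:

```shell
#!/bin/sh
# Sketch of a "git bisect run" helper.  bisect_step is a hypothetical
# name; it assumes the build command exits non-zero on a failed build.
bisect_step() {
    build=$1 check=$2
    # a commit that cannot be built tells us nothing: ask bisect to skip it
    sh -c "$build" || return 125
    # otherwise the check decides: 0 is good, 1 is bad
    if sh -c "$check"; then
        return 0
    else
        return 1
    fi
}
```

A script ending in bisect_step './kvm install' './kvm check diff testing/pluto/basic-pluto-01' followed by exit $? could then be passed to git bisect run.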

The hard way

This workflow works best when trying to track down a regression in an older version of libreswan.

Two repositories are used:

  1. repo-under-test
    this contains the sources that will be built and installed into the test domains and is what git bisect will manipulate
  2. testbench
    this contains the test scripts used to drive repo-under-test

Start by checking out the two repositories (existing repositories can also be used, carefully):

git clone https://github.com/libreswan/libreswan.git repo-under-test
git clone https://github.com/libreswan/libreswan.git testbench

and then cd to the repo-under-test directory:

 cd repo-under-test

Next, configure testbench so that it compiles and installs libreswan from repo-under-test but runs tests from testbench. Do this by pointing the testbench KVM_SOURCEDIR (/source) at repo-under-test vis:

# remember $PWD is repo-under-test
echo KVM_SOURCEDIR=$(realpath ../repo-under-test)    >>../testbench/Makefile.inc.local
echo KVM_TESTINGDIR=$(realpath ../testbench/testing) >>../testbench/Makefile.inc.local

Now, (re-)transmogrify the testbench so that, within the domains, /source points at repo-under-test:

../testbench/kvm transmogrify

in the command building the fedora domain look for output like:

--filesystem=target=bench,type=mount,accessmode=squash,source=/.../testbench \
--filesystem=target=source,type=mount,accessmode=squash,source=/.../repo-under-test \
--filesystem=target=testing,type=mount,accessmode=squash,source=/.../testbench/testing \

Finally run the tests (remember testing/pluto/basic-pluto-01 is the test that started failing):

# start with the bad commit
git bisect start main
# next checkout and confirm the good commit
# NOTE: run testbench/kvm from repo-under-test directory
git checkout <good-commit>
../testbench/kvm install check diff testing/pluto/basic-pluto-01
git bisect good

if you're lucky, the test requires no manual intervention and:

git bisect run ../testbench/kvm install check diff testing/pluto/basic-pluto-01

also works:

# finally
git bisect visualize
git bisect reset

TODO: figure out how to get ../testbench/kvm diff to honour KVM_TESTINGDIR so that it can handle a test somewhere other than in testbench