Test Suite - Performance

From Libreswan
Revision as of 19:43, 22 October 2018 by Andrew Cagney (talk | contribs) (only performance)


Speeding up "make kvm-test" by running things in parallel

Internally kvmrunner.py has two work queues:

  • a pool of reboot threads; each thread reboots one domain at a time
  • a pool of test threads; each thread runs one test at a time using domains with a unique prefix

The test threads use the reboot thread pool as follows:

  • get the next test
  • submit required domains to reboot pool
  • wait for domains to reboot
  • run test
  • repeat

By adjusting KVM_WORKERS and KVM_PREFIXES it is possible to:

  • speed up test runs
  • run independent testsuites in parallel

The reboot thread pool - make KVM_WORKERS=...

Booting the domains is the most CPU-intensive part of running a test. Trying to perform too many reboots in parallel bogs down the machine to the point where tests time out and interactive performance becomes hopeless. For this reason, a pre-sized pool of reboot threads is used to reboot domains:

  • the default is 1 reboot thread limiting things to one domain reboot at a time
  • KVM_WORKERS specifies the number of reboot threads, and hence, the reboot parallelism
  • increasing this allows more domains to be rebooted in parallel
  • however, increasing this consumes more CPU resources

To increase the size of the reboot thread pool set KVM_WORKERS. For instance:

$ grep KVM_WORKERS Makefile.inc.local
KVM_WORKERS=2
$ make kvm-install kvm-test
[...]
runner 0.019: using a pool of 2 worker threads to reboot domains
[...]
runner basic-pluto-01 0.647/0.601: 0 shutdown/reboot jobs ahead of us in the queue
runner basic-pluto-01 0.647/0.601: submitting shutdown jobs for unused domains: road nic north
runner basic-pluto-01 0.653/0.607: submitting boot-and-login jobs for test domains: east west
runner basic-pluto-01 0.654/0.608: submitted 5 jobs; currently 3 jobs pending
[...]
runner basic-pluto-01 28.585/28.539: domains started after 28 seconds

Only if your machine has lots of cores should you consider adjusting this in Makefile.inc.local.

The tests thread pool - make KVM_PREFIXES=...

Note that this is still somewhat experimental and has limitations:

  • stopping parallel tests requires multiple control-c's
  • since the duplicate domains have the same IP address, things like "ssh east" don't apply; use "make kvmsh-<prefix><domain>" or "sudo virsh console <prefix><domain>" or "./testing/utils/kvmsh.py <prefix><domain>".

Tests spend a lot of their time waiting for timeouts or slow tasks to complete. To let tests run in parallel, KVM_PREFIX provides a list of prefixes to add to the host names, forming unique domain groups that can each be used to run tests:

  • the default is no prefix limiting things to a single global domain pool
  • KVM_PREFIXES specifies the domain prefixes to use, and hence, the test parallelism
  • increasing this allows more tests to be run in parallel
  • however, increasing this consumes more memory and context switch resources

For instance, setting KVM_PREFIXES in Makefile.inc.local to specify a unique set of domains for this directory:

$ grep KVM_PREFIX Makefile.inc.local
KVM_PREFIX=a.
$ make kvm-install
[...]
$ make kvm-test
[...]
runner 0.018: using the serial test processor and domain prefix 'a.'
[...]
a.runner basic-pluto-01 0.574: submitting boot-and-login jobs for test domains: a.west a.east

And setting KVM_PREFIXES in Makefile.inc.local to specify two prefixes and, consequently, run two tests in parallel:

$ grep KVM_PREFIX Makefile.inc.local
KVM_PREFIX=a. b.
$ make kvm-install
[...]
$ make kvm-test
[...]
runner 0.019: using the parallel test processor and domain prefixes ['a.', 'b.']
[...]
b.runner basic-pluto-02 0.632/0.596: submitting boot-and-login jobs for test domains: b.west b.east
[...]
a.runner basic-pluto-01 0.769/0.731: submitting boot-and-login jobs for test domains: a.west a.east

creates and uses two dedicated domain/network groups (a.east ..., and b.east ...).

Finally, to get rid of all the domains use:

$ make kvm-uninstall

or even:

$ make KVM_PREFIX=b. kvm-uninstall

Two domain groups (e.g., KVM_PREFIX=a. b.) seem to give the best results.

Recommendations

Some Analysis

The test system:

  • 4-core 64-bit Intel
  • plenty of RAM
  • the file mk/perf.sh

Increasing the number of parallel tests, for a given number of reboot threads:

[Figure: Tests-vs-reboots.png]

  • having #cores/2 reboot threads has the greatest impact
  • having more than #cores reboot threads seems to slow things down

Increasing the number of reboots, for a given number of test threads:

[Figure: Reboots-vs-tests.png]

  • adding a second test thread has a far greater impact than adding a second reboot thread - contrast top lines
  • adding a third and even fourth test thread - i.e., up to #cores - still improves things

Finally, here's some ASCII art showing what happens to the failure rate when KVM_PREFIX lists so many prefixes that the reboot thread pool is kept 100% busy:

                  Fails  Reboots  Time
     ************  127      1     6:35  ****************************************
   **************  135      2     3:33  *********************
  ***************  151      3     3:12  *******************
  ***************  154      4     3:01  ******************

Notice how having more than #cores/2 KVM_WORKERS (here 2) has little benefit and failures edge upwards.
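
To put numbers on this, the figures in the table above can be turned into speedups relative to a single reboot thread. The sketch below only redoes arithmetic on the table's values. The table does not say whether the Time column is hours:minutes or minutes:seconds, so the code treats it as a generic base-60 pair, which leaves the ratios unaffected. The names `RUNS`, `to_units` and `summarize` are made up for illustration.

```python
# (reboot threads, failures, elapsed time) copied from the table above
RUNS = [
    (1, 127, "6:35"),
    (2, 135, "3:33"),
    (3, 151, "3:12"),
    (4, 154, "3:01"),
]


def to_units(stamp):
    """Convert 'a:bb' to a single count of the smaller unit (base 60)."""
    major, minor = stamp.split(":")
    return int(major) * 60 + int(minor)


def summarize(runs):
    """Return (reboots, speedup vs first row, extra failures vs first row)."""
    _, base_fails, base_time = runs[0]
    base = to_units(base_time)
    return [(reboots, round(base / to_units(time), 2), fails - base_fails)
            for reboots, fails, time in runs]


for reboots, speedup, extra_fails in summarize(RUNS):
    print(f"{reboots} reboot thread(s): {speedup}x faster, {extra_fails:+d} failures")
```

Going from one to two reboot threads gives about a 1.85x speedup for 8 extra failures. Going from two to four only raises that to about 2.18x while adding another 19 failures.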

Desktop Development Directory

  • reduce build/install time - use only one prefix
  • reduce single-test time - boot domains in parallel
  • use the non-prefix domains east et al. so it is easy to access the test domains using tools like ssh

Let's assume 4 cores:

KVM_WORKERS=2
KVM_PREFIX=''

You could also add a second prefix, viz.:

KVM_PREFIX= '' a.

but that, unfortunately, slows down the build/install time.

Desktop Baseline Directory

  • do not overload the desktop - reduce CPU load by booting sequentially
  • reduce total testsuite time - run tests in parallel
  • keep separate to development directory above

Let's assume 4 cores:

  • KVM_WORKERS=1
  • KVM_PREFIX= b1. b2.

Dedicated Test Server

  • minimize total testsuite time
  • maximize CPU use
  • assume only testsuite running

Assuming 4 cores:

  • KVM_WORKERS=2
  • KVM_PREFIX= '' t1. t2. t3.
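
The three recommendations can be collected into one small helper. It is a sketch only: the function `suggest` and its profile names are hypothetical, and scaling to core counts other than 4 is an extrapolation from the #cores/2 and #cores observations in the analysis above, not something measured on this page.

```python
def suggest(profile, cores=4):
    """Return Makefile.inc.local settings for one of the three setups above."""
    half = max(1, cores // 2)  # #cores/2 reboot threads, at least one
    if profile == "development":
        # boot in parallel, single unprefixed domain group
        return {"KVM_WORKERS": half, "KVM_PREFIX": "''"}
    if profile == "baseline":
        # boot sequentially, two prefixed groups kept apart from development
        return {"KVM_WORKERS": 1, "KVM_PREFIX": "b1. b2."}
    if profile == "server":
        # one domain group per core: the unprefixed group plus t1. .. t(cores-1).
        extra = " ".join(f"t{i}." for i in range(1, cores))
        return {"KVM_WORKERS": half, "KVM_PREFIX": f"'' {extra}".strip()}
    raise ValueError(f"unknown profile: {profile}")


for name in ("development", "baseline", "server"):
    for key, value in suggest(name).items():
        print(f"{name}: {key}={value}")
```

With the default of 4 cores this reproduces the values listed in the three sections above.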