Namespace-Magic: Difference between revisions

From Libreswan



Revision as of 02:29, 22 July 2019

Namespaces have been around for a long time; however, they still feel like magic. So I started this page in 2019 to explain the magic. As time passes it may stop being magic, or may even become obsolete. An early attempt in Libreswan with Paul.

FAQ

How to detect that you are inside a namespace

* One way seems to be to look at the interface name: inside a namespace it looks like "eth1@if107", while in a KVM guest it is a plain "eth0:", as in this KVM output.
<pre>
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
    link/ether 52:54:00:9e:81:71 brd ff:ff:ff:ff:ff:ff
</pre>
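The interface-name check above could be scripted roughly as follows. This is only a sketch of the heuristic, not something Libreswan ships: the function name `has_veth_style_name` is made up here, and it only tells a namespace apart from a KVM guest, because the host side of a veth pair also carries an "@ifN" suffix.

```shell
# Hypothetical helper: reads "ip link" (or "ip -o link") output on stdin and
# succeeds if any interface name carries a veth-style "@ifN" suffix.
has_veth_style_name() {
    grep -q -E '^[0-9]+: [^:@ ]+@if[0-9]+:'
}
```

Typical use would be `ip link show | has_veth_style_name && echo "looks like a namespace"`.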

* How to find a veth's peer inside a namespace from the host: link-netns

<pre>
On the host, ip link output:

107: hweste164512@if106: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master brswan12-64512 state UP mode DEFAULT group default qlen 1000
    link/ether 4a:34:cd:0e:0c:13 brd ff:ff:ff:ff:ff:ff link-netns west-ikev2-03-basic-rawrsa

From inside the namespace:

106: eth1@if107: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 02:10:c8:8e:d2:7e brd ff:ff:ff:ff:ff:ff link-netnsid 0

On the host, "link-netns west-ikev2-03-basic-rawrsa" gives you the namespace name.
To see exactly which interfaces are paired, match the indexes in "ip link": "106: eth1@if107" and "107: hweste164512@if106".
</pre>
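To avoid reading the "ip link" output by eye, the link-netns field can be pulled out with awk. This is a minimal sketch that assumes the output format shown above; `peer_netns` is a name invented for this page.

```shell
# Hypothetical helper: $1 is the host-side interface name without the "@ifN"
# suffix. Reads "ip link" output on stdin and prints the peer's namespace name.
peer_netns() {
    awk -v want="$1" '
        # Header line such as "107: hweste164512@if106: <...>": remember the name.
        /^[0-9]+: / { name = $2; sub(/@.*/, "", name); sub(/:$/, "", name); current = name }
        # On any line of the wanted interface, look for "link-netns NAME".
        current == want {
            for (i = 1; i < NF; i++)
                if ($i == "link-netns") { print $(i + 1); exit }
        }
    '
}
```

For example, `ip link show | peer_netns hweste164512` should print the namespace that holds the other end of that veth pair.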


Scaling issues to navigate

route cache filling up

[2936616.607520] Route cache is full: consider increasing sysctl net.ipv[4|6].route.max_size.
[2936616.607908] Route cache is full: consider increasing sysctl net.ipv[4|6].route.max_size.
[2936616.609100] Route cache is full: consider increasing sysctl net.ipv[4|6].route.max_size.

sysctl -a | grep "route.max_size"
net.ipv4.route.max_size = 2147483647

Google suggests flushing the route cache. That does not seem to help. I think routes are stuck inside namespaces.

ip route flush cache

After that, "ip route show cache" is empty.

Running "ip -all netns exec ip link show" shows weird errors:

ip -all netns exec ip link show
netns: east-ikev2-mobike-06
setting the network namespace "east-ikev2-mobike-06" failed: Invalid argument
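To find out which namespaces are in this mangled state, each one can be probed on its own. The following is a rough sketch, not a tested cleanup tool: `find_broken_netns` and the `NETNS_EXEC` override are names invented here, and the probe simply tries to run `true` inside each namespace.

```shell
# Hypothetical helper: reads "ip netns list" output on stdin and prints the
# names of namespaces that cannot be entered.
# NETNS_EXEC is an assumed override hook; the default is "ip netns exec".
find_broken_netns() {
    local ns
    while read -r ns _; do
        [ -n "$ns" ] || continue
        # stdin is redirected so the probe cannot eat the list being read.
        if ! ${NETNS_EXEC:-ip netns exec} "$ns" true </dev/null >/dev/null 2>&1; then
            echo "$ns"
        fi
    done
}
```

Run it as root, for example `ip netns list | find_broken_netns`. Whether deleting the listed namespaces also frees the route cache is untested.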


iptables fails; -w 60 seems to help


sudo /usr/bin/nsenter --mount=/run/mountns/west-nstest-4 --net=/run/netns/west-nstest-4 --uts=/run/utsns/west-nstest-4 /bin/bash -c 'cd /testing/pluto/nstest-4;iptables -I INPUT -m policy --dir in --pol ipsec -j ACCEPT
'
Another app is currently holding the xtables lock. Perhaps you want to use the -w option?

I tried putting the following in /root/.bashrc:

alias iptables="iptables --wait 60 --wait-interval=100000"

That seems to have reduced the failure rate from 50% to 2% when running 500 tests, 10 in parallel. Changing to --wait 120 still causes 1.3% errors. Going above 120 seconds would skew the tests, since they usually time out after 120 seconds.
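One thing worth keeping in mind: bash expands aliases only in interactive shells by default, so an alias may not apply to every command started through "bash -c". A shell function is one possible alternative. The sketch below reuses the same options as the alias; `iptables_wait` and the `IPTABLES_BIN` override are names made up for this page, and the test scripts would have to call the function (or source a file defining it) for it to have any effect.

```shell
# Hypothetical wrapper: runs iptables with the same wait options as the alias.
# IPTABLES_BIN is an assumed override, mainly useful for dry runs.
iptables_wait() {
    "${IPTABLES_BIN:-iptables}" --wait 60 --wait-interval=100000 "$@"
}
```

Example: `iptables_wait -I INPUT -m policy --dir in --pol ipsec -j ACCEPT`.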


Would this work on foo 7/CentOS 7? Not yet: util-linux is too old

There, unshare and/or nsenter do not support the --mount[=file] option.

There seem to be some options.

fedora 28 
unshare -V
unshare from util-linux 2.32.1

-m, --mount[=file]
   Unshare the mount namespace.  If file is specified, then a persistent namespace is created
   by a bind mount

---- old one foo 7 -----
unshare -V
unshare from util-linux 2.23.2

-m, --mount
   Unshare the mount namespace.

test using "sudo unshare --net=/run/netns/east-basic-pluto-01 /usr/bin/bash"
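Rather than comparing version numbers, a script could check the help or man page text for the optional "=file" argument, which is the difference visible in the two excerpts above. A minimal sketch, with `supports_mount_file` as an invented name:

```shell
# Hypothetical helper: reads "unshare --help" (or man page) text on stdin and
# succeeds if the mount option accepts an optional "=file" argument.
supports_mount_file() {
    grep -q -e '--mount\[='
}
```

For example: `unshare --help | supports_mount_file || echo "util-linux too old for persistent mount namespaces"`.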

outstanding issues

reduce the use of iptables

This would go in steps. First, make sure swan-prep creates the LOGDROP target only when a test needs it. Grep the test scripts for LOGDROP to find those tests. Mostly it will then only run on the initiator.
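The grep step could look something like this. It is a sketch under assumptions: it treats /testing/pluto (the path used in the nsenter example earlier on this page) as the test root and assumes each test keeps its scripts directly in its own directory. `tests_needing_logdrop` is an invented name.

```shell
# Hypothetical helper: prints the directories under $1 (default /testing/pluto)
# that contain at least one file mentioning LOGDROP.
tests_needing_logdrop() {
    local testdir=${1:-/testing/pluto}
    local file
    grep -rl LOGDROP "$testdir" | while read -r file; do
        dirname "$file"
    done | sort -u
}
```

The resulting list is the set of tests for which swan-prep would still need to create the target.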

"Error: Peer netns reference is invalid."

It seems that when the namespaces get mangled, any "ip" command outputs this error. For now I am sanitizing it out of the test output.
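One way such sanitizing could be done is a small sed filter. This is an illustration only, not necessarily how the test suite does it, and `sanitize_ip_output` is an invented name.

```shell
# Hypothetical filter: drops the "Peer netns reference" error line and passes
# everything else through unchanged.
sanitize_ip_output() {
    sed '/^Error: Peer netns reference is invalid\.$/d'
}
```

If the message arrives on stderr, merge the streams first, for example `ip link show 2>&1 | sanitize_ip_output`.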

running the tests in a parallel worker pool

convert brctl to "ip link"
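As a starting point for the conversion, here is a sketch that prints the iproute2 equivalent of the common brctl subcommands. `brctl_to_ip` is an invented name, it only prints commands rather than running them, and only the subcommands listed are covered.

```shell
# Hypothetical translator: prints the "ip link" equivalent of a brctl command.
# Usage: brctl_to_ip SUBCOMMAND BRIDGE [INTERFACE]
brctl_to_ip() {
    case "$1" in
        addbr) echo "ip link add name $2 type bridge" ;;
        delbr) echo "ip link delete dev $2 type bridge" ;;
        addif) echo "ip link set dev $3 master $2" ;;
        delif) echo "ip link set dev $3 nomaster" ;;
        show)  echo "ip link show type bridge" ;;
        *)     echo "unsupported brctl subcommand: $1" >&2; return 1 ;;
    esac
}
```

For example, `brctl_to_ip addif brswan12 hweste1` prints `ip link set dev hweste1 master brswan12`.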

support bind mount installation using RPM or make install-base

good to know

nsenter alias/function

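NSENTER()
{
 # Enter the mount, net and UTS namespaces of test host $1 and start a shell.
 local ns=$1
 local nsargs="--mount=/run/mountns/${ns} --net=/run/netns/${ns} --uts=/run/utsns/${ns}"
 # nsargs is left unquoted on purpose so that it splits into separate options.
 sudo /usr/bin/nsenter ${nsargs} /bin/bash
}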

# Then type

NSENTER east-basic-pluto-01
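A variant that runs a single command instead of an interactive shell can also be handy. The sketch below assumes the same /run/mountns, /run/netns and /run/utsns layout as the function above; `ns_args` and `NSRUN` are names invented here.

```shell
# Hypothetical helper: prints the nsenter options for namespace $1, one per line.
ns_args() {
    local ns=$1
    printf '%s\n' "--mount=/run/mountns/${ns}" "--net=/run/netns/${ns}" "--uts=/run/utsns/${ns}"
}

# Hypothetical helper: runs a command inside namespace $1.
NSRUN() {
    local ns=$1
    shift
    # The namespace names contain no whitespace, so splitting $(ns_args) is safe.
    sudo /usr/bin/nsenter $(ns_args "$ns") "$@"
}
```

For example: `NSRUN east-basic-pluto-01 ip link show`.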