FAQ: Difference between revisions

From Libreswan
Revision as of 20:16, 24 October 2016

General Questions

( we will sort this in categories once we have more )

Which RFCs or other standards does libreswan support?

See Implemented Standards

Which ciphers / algorithms does libreswan support?

IKEv1

  • IKE: AES_CBC, 3DES, SERPENT, TWOFISH and SHA2_256, SHA2_384, SHA2_512, SHA1, MD5 with all regular MODP groups (non-ECC)
  • ESP on Linux: AES_GCM, AES_CCM, AES_CTR, AES_CBC, CAMELLIA, 3DES, SERPENT, TWOFISH, CAST5, NULL and SHA2_256, SHA2_256_96 (truncbug), SHA2_384, SHA2_512, AES-XCBC-MAC, SHA1, MD5
  • AH on Linux: SHA2_256, SHA2_256_96 (truncbug), SHA2_384, SHA2_512, AES-XCBC-MAC, SHA1, MD5

IKEv2

  • IKE: AES_GCM, AES_CTR, AES_CBC, 3DES, CAMELLIA_CBC, SERPENT, TWOFISH and SHA2_256, SHA2_384, SHA2_512, SHA1, MD5 with all regular MODP groups (non-ECC) and ESN
  • ESP on Linux: AES_GCM, AES_CCM, AES_CTR, AES_CBC, CAMELLIA, 3DES, SERPENT, TWOFISH, CAST5, NULL and SHA2_256, SHA2_256_96 (truncbug), SHA2_384, SHA2_512, AES-XCBC-MAC, SHA1, MD5
  • AH on Linux: SHA2_256, SHA2_256_96 (truncbug), SHA2_384, SHA2_512, AES-XCBC-MAC, SHA1, MD5

Notes

  • Serpent and Twofish are compile time options and do not come from the NSS library. They use the well-known private allocation numbers from FreeS/WAN.
  • Some algorithms are disabled when running in FIPS mode.

Which IKEv1 and IKEv2 Exchange Modes does libreswan support?

The IANA Registry lists all official Exchange Modes. There are a few IKEv1 Modes that are very common despite never having gotten past the draft stage.

Supported:

Not supported


Should I use the XFRM/NETKEY or KLIPS IPsec stack with libreswan?

At this point we recommend using the NETKEY stack for most deployments. If you are using an embedded platform with a cryptographic hardware offload device, it might be better to use KLIPS.

The NETKEY IPsec stack requires no kernel recompiles on most Linux distributions, so it is the easiest stack to use in most standard deployments. It supports a larger selection of cryptographic algorithms, including the IPsec Suite B algorithms AES CTR, AES GCM and SHA2. It does cause a little additional delay with on-demand IPsec tunnels because it does not implement first+last packet caching. NETKEY supports OCF only using the cryptosoft driver, and lacks native driver support for most cryptographic hardware cards. NETKEY also does not distribute the load of a single IPsec SA over different CPUs. NETKEY has support for Linux VTI for IPsec SA reference tracking.

The KLIPS IPsec stack offers easier debugging with tcpdump and easier iptables firewall rules due to its use of separate ipsecX interfaces. It also plays a little nicer with on-demand tunneling, as it will hold on to the first+last packet sent while the tunnel is being set up, and will release those packets once the IPsec tunnel is established. KLIPS distributes the load of a single IPsec SA over multiple CPUs. It supports all OCF hardware devices when compiled with OCF support. The MAST variant of KLIPS uses IPsec SAref for IPsec SA reference tracking and is also used for L2TP/IPsec deployments requiring SAref tracking, although it is recommended to use VPN_server_for_remote_clients_using_IKEv1_XAUTH instead of L2TP/IPsec.

Can I have an ipsec0 interface with XFRM/NETKEY?

Yes, this is supported as of libreswan-3.18. See Route-based VPN using VTI

Famous vulnerabilities

Libreswan is not vulnerable to the OpenSSL "Heartbleed" exploit

See Libreswan and Heartbleed

Libreswan is not vulnerable to bash CVE-2014-6271 or CVE-2014-7169

No, libreswan is not vulnerable.

Libreswan sanitizes strings that may come from the network, such as XAUTH username, domain and DNS servers, by passing them through the filter functions remove_metachar() and cisco_stringify() before assigning them to environment variables that are passed to the updown scripts that invoke bash. These filters remove dangerous characters including the ' character needed for these bash exploits.

Libreswan is vulnerable to NSS CVE-2014-1568 RSA Signature Forgery

Please upgrade NSS to one of 3.17.1, 3.16.2.1 or 3.16.5.

This only affects libreswan when using X.509 certificates. Raw RSA keys using leftrsasigkey/rightrsasigkey are not affected. Connections using auth=secret (PSK) are also not affected.

See Mozilla Foundation Security Advisory 2014-73

Libreswan is not vulnerable to LogJam / weakdh.org CVE-2015-4000

The IKE protocol never allowed any DH group smaller than MODP768. Libreswan has never supported anything smaller than MODP1024.

Libreswan as a client to a weak server will allow MODP1024 in IKEv1 as the least secure option, and MODP1536 in IKEv2 as the least secure option. However, the default is MODP2048.

Libreswan supports MODP groups up to MODP8192, but these need to be configured explicitly. They might end up in the default proposal set for IKEv2 in the future.

Libreswan also supports the alternative primes for MODP1024 and MODP2048 specified in RFC-5114. None of these will be placed in the default proposal group due to the lack of transparency of where these alternatives came from and why these were needed.

For more details, see "The weak DH and LogJam attack impact on IKE / IPsec (and the *swans)".

Libreswan is not vulnerable to the TLS/IKE SLOTH / TRANSCRIPT attacks CVE-2015-7575

The IKE protocol is not affected, see "The SLOTH attack and IKE/IPsec"

Google Cloud VPN issue

Google Cloud VPN does not support NAT. The libreswan endpoint has to have a real public IP that is not NAT'ed.

Configuration Matters

Using SHA2_256 for ESP connection establishes but no traffic passes (especially Android 6.0)

It seems that Android 6.0 now defaults to ESP with SHA2, but it uses a bad implementation of SHA2. You can work around that using sha2-truncbug=yes, but that would break all non-Android clients that use the proper RFC SHA2 implementation. It might be possible to avoid SHA2 completely and use esp=aes_gcm-null instead (which is also faster).

See the sha2-truncbug man page entry of ipsec.conf for more information. There is also an Android bug 194269 about this issue.

Note that Linux kernels before 2.6.33 all used the broken truncation, so to interoperate with those old kernels, the sha2-truncbug=yes option would need to be set.

libreswan-3.18 and higher prefers sha2_512 over sha2_256 to avoid this issue. A note has also been added to RFC7321bis.

My ssh sessions hang or connectivity is very slow

This could be an MTU issue. The overhead of IPsec encryption (and possibly ESPinUDP encapsulation) yields a slightly smaller usable packet size, which can cause problems. A good sign of MTU problems is that you can log in remotely over the IPsec tunnel using ssh, but issuing "ls -l /usr" causes the session to hang. Try adjusting the MTU with:

iptables -I FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu

If that does not help, try hardcoding it yourself:

iptables -I FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --set-mss 1380

If these settings don't help, adding mtu=1420 to the connection might work, although it will affect all traffic that the connection covers.

As a final alternative, you can try lowering the MTU on the internal interface of your IPsec server so that local PMTU discovery already yields 1440, e.g. using ip link set dev eth1 mtu 1440. This affects not only packets for the VPN tunnel, but all packets received and sent on that interface. Only use this as a last resort.
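
The MTU and MSS values in this answer are related by simple arithmetic: for IPv4 without IP or TCP options, the TCP MSS is the MTU minus 40 bytes (20 for the IPv4 header, 20 for the TCP header), so an MTU of 1420 corresponds to the MSS of 1380 used in the iptables rule. A minimal sketch of that calculation; the helper name mss_for_mtu is hypothetical and not part of libreswan or iptables:

```shell
# Hypothetical helper: derive a TCP MSS from an MTU for IPv4,
# assuming no IP options and no TCP options (20 + 20 header bytes).
mss_for_mtu() {
    echo $(( $1 - 40 ))
}

mss_for_mtu 1420   # the mtu= value suggested for the connection
mss_for_mtu 1440   # the interface MTU from the last-resort example
```

IPv6 and packets carrying TCP options need a larger allowance, so treat the result as an upper bound rather than a value to copy blindly.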

Using auto=route slows down TCP connection establishment when using XFRM

(also known as rhbz#1010347)

This should be fixed in recent kernels (3.x) and has been backported to some older kernels (notably RHEL 6.6).

The issue: ESP packets sometimes arrive very late or do not arrive at all. The issue is most noticeable after restarting the IPsec daemon.

The problem as explained by Herbert Xu:

Your first TCP SYN packet triggers the IPsec lookup, however, the packet itself is dropped. TCP then retransmits but it only gets through after the IPsec SAs are fully instated, resulting in the delay.

What happens in some kernels is that the IPsec trigger occurs in a sleepable context, which means that the sending process will wait for the IPsec SAs to be installed before sending the first SYN. However, this was never meant to be a complete solution to supporting auto=route as it relies on the fact that there must be some sleepable context prior to the SYN packet being sent.

Evidently this is no longer the case for some kernels. Going forward I suggest two courses of action:

1) Do not rely on auto=route. Instead use auto=start and ensure that you synchronously wait for the SAs to complete. For example, ipsec auto --up foo will bring foo up synchronously, while ipsec auto --asynchronous --up foo will not wait and thus may fail.

2) I will take this issue to the IPsec maintainer and the network maintainer to see if we can make adjustments to allow at least the TCP connection case to work with auto=route. However, there is no guarantee that this will be done as we may not be able to insert the requisite sleepable context into the general network stack just so that IPsec auto=route can work.

Longer term for auto=route to be properly supported someone needs to implement packet queueing on larval SAs.

Possible work around:

        echo 0 > /proc/sys/net/core/xfrm_larval_drop
        echo 3 > /proc/sys/net/ipv4/tcp_syn_retries

This means that the first retransmit of the SYN packet (+1s) should make it through, rather than the current behaviour where only the fourth retransmit (+15s) makes it through.

Note that this workaround causes a regression in the ability of connect() to return immediately on a non-blocking socket with an appropriate POSIX-compliant errno, which is why the workaround also sets the TCP SYN retry count to 3.
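
The two echo commands above only last until the next reboot. Paths under /proc/sys map to sysctl keys by dropping the /proc/sys/ prefix and replacing slashes with dots, so the same workaround can be written as sysctl.conf lines. A small sketch of that mapping; the function name proc_to_sysctl is hypothetical, and it assumes no path component itself contains a dot:

```shell
# Hypothetical helper: convert a /proc/sys path into a sysctl key.
# Assumes no path component contains a dot (e.g. a VLAN interface name).
proc_to_sysctl() {
    printf '%s\n' "${1#/proc/sys/}" | tr '/' '.'
}

# Print lines suitable for /etc/sysctl.conf for the workaround above.
echo "$(proc_to_sysctl /proc/sys/net/core/xfrm_larval_drop) = 0"
echo "$(proc_to_sysctl /proc/sys/net/ipv4/tcp_syn_retries) = 3"
```

The printed lines can be appended to /etc/sysctl.conf in the same way as the other sysctl settings in this FAQ.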

PSK doesn't work against Cisco ASA 55xx

While libreswan places very few restrictions on the pre-shared secret, Cisco has an additional restriction: you can't have a question mark '?' in the PSK. Cisco handles that as a help request.

When using hundreds of tunnels on a Xen-based cloud system like AWS, a fraction of tunnels fail regularly

This is a known issue that could be a problem of the aesni kernel module in combination with the Xen hypervisor. Try unloading the aesni.ko kernel module on the Xen server. If you can confirm this fixes your issue (we cannot change the AWS servers), please email the swan-dev list with a confirmation.

My XAUTH authentication via PAM always claims the password is incorrect on CentOS 6

This is an odd bug (feature?) that shows up when you have disabled SELinux in /etc/sysconfig/selinux. Running SELinux in permissive (or enforcing) mode seems to resolve this.

Why is it recommended to disable send_redirects in /proc/sys/net ?

Let's say you have a VPN server in a cloud that you use with your phone. Your phone will set up an IPsec VPN and all its traffic is encrypted and sent to the cloud instance, which decrypts it and sends it on to the internet, using SNAT. Replies it receives are encrypted and sent to your phone.

Your phone will send the VPN server an encrypted packet. The server receives it on eth0 (its only interface!) and decrypts it. The decrypted packet is then ready to be routed. The server looks up which interface it should send the packet out on, and the answer is eth0. Since the packet came in via eth0 and would go out via eth0, the server concludes there must be a better path that does not involve itself, and it sends an ICMP redirect to tell the sender so. It has no idea the packet arrived encrypted and got decrypted.

This is why we recommend disabling "send_redirects" in /etc/sysctl.conf using

net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0

Why is it recommended to disable rp_filter in /proc/sys/net ?

The kernel has a notion of which interface a packet came from and where it will go to, and it determines whether the path through the machine makes sense based on the IP address it sees. If 10.0.2.0/24 lives on eth0 and 1.2.3.4 has eth1 with the default route, then rp_filter will automatically block a 10.0.2.1 packet coming in on eth1. The rp_filter code is an implementation of RFC 3704 https://tools.ietf.org/html/rfc3704. Of course, you should also have firewall rules on the machine that block these packets, and firewall rules on the router in front of the machine.

The problem with IPsec appears when you hand out a 10.0.2.13 address, like via XAUTH/IPsec. A packet with IP a.b.c.d comes in on eth1 for 1.2.3.4, which passes rp_filter, then gets decrypted to 10.0.2.13. Now the packet is still seen as coming from eth1, so rp_filter will drop the packet as 10.0.2.0/24 packets are only expected to originate from eth0.

This is why we recommend disabling "rp_filter" in /etc/sysctl.conf using

net.ipv4.conf.default.rp_filter = 0

A network restart or reboot might be necessary for this entry to be picked up. As a one-shot way to disable it for all interfaces, you can use:

for i in /proc/sys/net/ipv4/conf/*; do echo 0 > "$i/rp_filter"; done
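
To confirm that the setting actually took effect on every interface, the per-interface values can be read back. A minimal sketch; the function name list_nonzero and its optional directory argument are hypothetical, the latter existing only so the function can be tried against a scratch directory instead of the live /proc tree:

```shell
# Hypothetical helper: print the interfaces whose KEY is not 0.
# Usage: list_nonzero KEY [CONF_DIR]
list_nonzero() {
    key=$1
    conf_dir=${2:-/proc/sys/net/ipv4/conf}
    for d in "$conf_dir"/*; do
        # Skip entries that lack the key or are unreadable.
        [ -r "$d/$key" ] || continue
        if [ "$(cat "$d/$key")" != "0" ]; then
            basename "$d"
        fi
    done
}

# Prints nothing once rp_filter is disabled everywhere.
list_nonzero rp_filter
```

The same function can be pointed at send_redirects to check the setting from the previous answer.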

NAT + IPsec is not working

When using NAT on the same Linux machine as IPsec, care must be taken that packets meant for an IPsec remote address are not NATed. A NATed packet would no longer match the IPsec tunnel source and destination IP address ranges.

If you have the following common catch-all NAT rule:

-A POSTROUTING -o eth0 -j MASQUERADE

or

-A POSTROUTING -o eth0 -j SNAT --to-source 1.2.3.4

then either change these rules to only apply with a non-ipsec policy:

-A POSTROUTING -o eth0 -m policy --dir out --pol none -j MASQUERADE

or insert an IPsec skip rule before these:

-A POSTROUTING -o eth0 -m policy --dir out --pol ipsec -j RETURN
-A POSTROUTING -o eth0 -j MASQUERADE


Can I hand out LAN IP addresses in the addresspool?

Yes, but you will need to enable proxyarp on the IPsec server. You can do this globally using the proxyarp entry in /etc/sysctl.conf, for example if your LAN interface is ethX, use

net.ipv4.conf.ethX.proxy_arp=1

Common error messages

ERROR: asynchronous network error report on eth0 (sport=4500) for message to xx.xx.xxx.xxx port 4500, complainant yy.yy.yyy.yyy: Message too long [errno 90, origin ICMP type 3 code 4 (not authenticated)]

These errors are often intermittent, depending on the application data that is getting encrypted. Your NAT'ed IPsec tunnel is using ESPinUDP, and the additional UDP header caused some of your packets to be too big. See the answer on MTU issues above and try lowering your MTU. Use an insanely small MTU like 1300 or 1200 for confirmation. Then try to bring it up higher to what seems to work reliably for you.

ERROR: asynchronous network error report on eth0 (sport=4500) for message to xx.xx.xxx.xxx port 4500, complainant yy.yy.yyy.yyy: No route to host [errno 113, origin ICMP type 3 code 1(not authenticated)]

These errors often happen 15 minutes after the tunnel successfully established. It's most likely that the tunnel was idle and the NAT router removed the nat mapping. Or the NAT router rebooted and lost state. It no longer knows which client to send the packet to. Ensure your connection uses nat-keepalive=yes. Possibly decrease the global keep-alive= value to send more frequent keep-alive packets. Alternatively, enable DPD on the connection to cause some regular traffic on idle tunnels.

ERROR: asynchronous network error report on eth0 (sport=500) for message to xx.xx.xxx.xxx port 500, complainant yy.yy.yyy.yyy: Connection refused [errno 111, origin ICMP type 3 code 3 (not authenticated)]

This error means the other end is not (or no longer) running an IKE daemon. Ensure the IKE daemon is running on the remote system. If you see this error during a negotiation, it could be that the remote IKE daemon crashed or stopped listening. On Mac OSX if the IKE daemon is not allowed to read the proper X.509 certificate, it will only realize this partially into the IKE negotiation and terminate, resulting in this error. It is also possible that the remote IP is actually a NAT device with the IPsec device behind it. In that case, using rekey=no and letting the other end initiate might make this error go away.
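
The three messages above differ mainly in the errno they report, so a log can be triaged on that field. A rough sketch that extracts the errno from such a line and prints the likely cause as described in this section; the function name classify_async_error is hypothetical, and the summaries are only shorthand for the explanations above:

```shell
# Hypothetical helper: map the errno in an "asynchronous network error
# report" log line to the likely cause discussed in this FAQ.
classify_async_error() {
    errno=$(printf '%s\n' "$1" | sed -n 's/.*\[errno \([0-9][0-9]*\),.*/\1/p')
    case "$errno" in
        90)  echo "errno 90: packet too big, lower the MTU" ;;
        113) echo "errno 113: NAT mapping likely lost, check keepalives or DPD" ;;
        111) echo "errno 111: no IKE daemon listening on the remote end" ;;
        *)   echo "unrecognized" ;;
    esac
}

classify_async_error 'ERROR: asynchronous network error report on eth0 (sport=500) for message to xx.xx.xxx.xxx port 500, complainant yy.yy.yyy.yyy: Connection refused [errno 111, origin ICMP type 3 code 3 (not authenticated)]'
```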

error: ignoring informational payload, type NO_PROPOSAL_CHOSEN msgid=00000000

This error means exactly what it says. The IKE proposal(s) sent to the server were rejected. This means there is a configuration mismatch between libreswan and the remote IPsec server. Usually this is a mismatch in the ike= or esp= (phase2alg=) setting, but other options could also be wrong, such as authby=, pfs= or aggrmode=.

ssh gives error: Corrupted MAC on input. Disconnecting: Packet corrupt

This usually indicates MTU issues. You can try lowering the mtu using the mtu= option or by changing the actual mtu on the proper interface on the libreswan server. This error is known to happen on Amazon EC2 AMI types that use PV (xen) instances. Switching to Amazon HVM instances seems to resolve the problem on AWS.

Using aes_gcm or aes_ctr results in ERROR: netlink response for Add SA esp.XXXXXXXX@IPADDRESS included errno 22: Invalid argument

This usually indicates that the ESP algorithm selected using the phase2alg= (esp=) line is not available in the kernel, which in turn usually indicates a kernel bug.

Linux kernels up to 3.2.x have a bug in the aesni-intel driver on x86_64. See rhbz#1176211. The AESNI hardware acceleration kernel module does not properly support 256 or 192 bit keys for AES_GCM. You can either switch to 128 bit keys or blacklist or unload the aesni-intel kernel module. Another alternative is to switch from phase2alg=aes_gcm to phase2alg=aes, although that will cut the performance in half.

Linux kernels to date seem to have a bug in the aes_ctr code on the POWER8BE VM. Use phase2alg=aes there as well to use AES_CBC.

Can't find the private key from the NSS CERT (err -8177)

The old libreswan-3.8 /etc/ipsec.d/nsspassword requires just the password to be entered. In later libreswan versions, you must add the NSS prefix to it. So to specify the password "secret", use:

NSS Certificate DB:secret
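
When migrating an old password file, the only change is the prefix. A minimal sketch that adds it to a line lacking it; the function name add_nss_prefix is hypothetical, and it assumes the default NSS token name shown above:

```shell
# Hypothetical helper: prepend the token name to an old-style
# nsspassword line, leaving already-prefixed lines untouched.
add_nss_prefix() {
    case "$1" in
        "NSS Certificate DB:"*) printf '%s\n' "$1" ;;
        *) printf 'NSS Certificate DB:%s\n' "$1" ;;
    esac
}

add_nss_prefix "secret"
```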

"IPsec encryption transform did not specify required KEY_LENGTH"

This happens when trying to interoperate with old openswan versions that mistakenly do not send the KEY_LENGTH attribute for AES. To work around the problem, on those old implementations, specify "aes128" or "aes256" instead of "aes". For example:

phase2alg=aes256-sha1;modp1536
esp=aes256-sha1;modp1536
ike=aes256-sha1;modp1536


No PARENT proposal selected

This error can happen when there is a mismatch of IKE proposals between the server and client. In libreswan-3.14, modp1024 (group 2) was removed from the default proposal set because of its weakness, but apparently Windows 7 requires it by default.

Using VTI causes "Keys are not allowed with ipip and sit tunnels"

You need to upgrade the iproute package. For RHEL7, see RHBA-2015-2117

Old problems fixed in newer releases

invalid last pad octet:

There is a bug in racoon (also called ipsec-tools) that sends improper oversized padding. Libreswan version 3.14 became more strict and rejected these packets. Libreswan 3.16 allows the bad padding again. Note that racoon is used in various products including older versions of OSX and iOS (up to iOS 7.x).

Module unloading error on shutdown or restart: Module esp4 is in use

ERROR: Module xfrm4_mode_tunnel is in use
ERROR: Module esp4 is in use
FAILURE to unload NETKEY esp4/esp6 module

This has been fixed in libreswan-3.9. Please upgrade.

IPv6 tunnel works manually but fails on freshly booted machine

When one machine reboots and loses state, the other machine still has an encryption policy for the rebooted machine and will insist on receiving only encrypted packets. Obviously, after a reboot the host cannot send encrypted packets. For that reason, an "IKE hole" is present in the host's kernel. This means that any UDP 500 and UDP 4500 packets for IKE are allowed in plaintext even if we have an encryption policy active for that host. On at least the Linux kernel that hole does not include ipv6-icmp Neighbour Discovery packets, which is a unicast reply from the host that did not reboot to the just rebooted host. You can see this in "ipsec status" as:

000 Shunt list:
000 000 2620:52:0:ab0:42f2:e9ff:fe09:a16c/128:136 -58-> 2620:52:0:ab0:ca1f:66ff:fef1:c74c/128:0 => %hold 0 %acquire-netlink

Note protocol 58 (ipv6-icmp)

A workaround is to add the following connection:

conn v6neighbor-hole
        left=::1
        leftsubnet=::0/0
        leftprotoport=58/0
        rightprotoport=58/34816
        rightsubnet=::0/0
        right=::0
        connaddrfamily=ipv6
        authby=never
        type=passthrough
        auto=route
        priority=1

If you wonder where the number "34816" comes from please see the leftprotoport= entry of the ipsec.conf man page.
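
The number is consistent with encoding the ICMPv6 type and code into the 16-bit port field as type * 256 + code: type 136 is Neighbour Advertisement, and 136 * 256 + 0 = 34816. A quick check of that arithmetic; the function name icmp_to_port is hypothetical, and the encoding is stated here as an assumption to be confirmed against the man page:

```shell
# Hypothetical helper: compute the "port" value for an ICMP type/code
# pair, assuming the encoding is type * 256 + code.
icmp_to_port() {
    echo $(( $1 * 256 + $2 ))
}

icmp_to_port 136 0   # Neighbour Advertisement, code 0
```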

libreswan-3.13 adds this connection to /etc/ipsec.d/ as a workaround.

Using IPsec/L2TP with xl2tpd, the pppd ip-down script does not seem to run

Old pppd < 2.4.5 could cause xl2tpd to hang on a hanging pppd, so xl2tpd killed pppd itself to avoid this. But that meant pppd did not get to execute its ip-down script. This behaviour can be tweaked using the define TRUST_PPPD_TO_DIE in the xl2tpd Makefile. Fedora and EPEL packages enable this as of April 2015.

Interop issue with racoon: invalid padding-length octet: 0x23

Racoon has a broken implementation of IKE padding. Libreswan versions 3.12 to 3.14 had strict padding checks that caused these packets to be rejected. These restrictions have been loosened to accommodate the broken racoon in libreswan 3.15 and higher.

On Xen, pluto crashes with: Illegal instruction when using ike=aes_gcm

This is due to the interaction of NSS and Xen (which is possibly lying about the real AES hardware capability of the system). A workaround for this is to disable hardware-accelerated AES_GCM in NSS using:

export NSS_DISABLE_HW_GCM=1

This should probably be placed somewhere more global than just libreswan, as the problem will affect everything that uses the NSS libraries.

IPv6/KLIPS: ipsec_set_dst can't determine the correct routing device on a host connection

This is a kernel bug, see lsw#237. Confirmed affected are kernels 4.1.6 and 3.14.51, but possibly all 3.x and 4.[12].x kernels to date (Sep 28, 2015).