Revision as of 05:10, 26 September 2018

General Questions

( we will sort this in categories once we have more )

Which RFC's or other standards does libreswan support?

See Implemented Standards

Which ciphers / algorithms does libreswan support?

IKEv1

IKE: AES_CBC, 3DES, SERPENT, TWOFISH and SHA2_256, SHA2_384, SHA2_512, SHA1, MD5 with all regular MODP and ECP groups

ESP on Linux: AES_GCM, AES_CCM, AES_CTR, AES_CBC, CAMELLIA, 3DES, SERPENT, TWOFISH, CAST5, NULL and SHA2_256, SHA2_256_96(truncbug) SHA2_384, SHA2_512, AES-XCBC-MAC, SHA1, MD5

AH on Linux: SHA2_256, SHA2_256_96(truncbug) SHA2_384, SHA2_512, AES-XCBC-MAC, SHA1, MD5

IKEv2

IKE: CHACHA20-POLY1305, AES_GCM, AES_CTR, AES_CBC, 3DES, CAMELLIA_CBC, SERPENT, TWOFISH and SHA2_256, SHA2_384, SHA2_512, AES-XCBC-MAC, SHA1, MD5 with all regular MODP groups, NIST ECP groups and Curve25519, and ESN

ESP on Linux: CHACHA20-POLY1305, AES_GCM, AES_CCM, AES_CTR, AES_CBC, CAMELLIA, 3DES, SERPENT, TWOFISH, CAST5, NULL and SHA2_256, SHA2_256_96(truncbug) SHA2_384, SHA2_512, AES-XCBC-MAC, SHA1, MD5

AH on Linux: SHA2_256, SHA2_256_96(truncbug) SHA2_384, SHA2_512, AES-XCBC-MAC, SHA1, MD5

Notes

Serpent and Twofish are compile time options and do not come from the NSS library. They use the well-known private allocation numbers from FreeS/WAN.
Some algorithms are disabled when running in FIPS mode.

Which IKEv1 and IKEv2 Exchange Modes does libreswan support?

The IANA Registry lists all official Exchange Modes. There are a few IKEv1 Modes that are very common despite never gotten past the draft stage.

Supported:

Not supported

IKE_SESSION_RESUME

IKEv1 Revised Mode
IKEv1 Hybrid Mode (aka "Mutual Group Authentication") although there is some unmaintained contributed code

Does libreswan interoperate with Microsoft Windows?

In general, yes it does. For specific features, see Microsoft Product Behaviour

Note IKEv2 Fragmentation is only supported as of Windows 10 April 2018 build. If you see issues when using LTE/4g/5g, try updating to the latest win10.

Another known issue is reconnecting not working, see this techinline blog

Should I use the XFRM/NETKEY or KLIPS IPsec stack with libreswan?

At this point we recommend using the NETKEY/XFRM stack for most deployments. If you are using an embedded platform with a cryptographic hardware offload device, it might be better to use KLIPS.

The NETKEY IPsec stack requires no kernel recompiles on most Linux distributions, so it is the easiest stack to use in most standard deployments. It offers a larger selection of cryptographic algorithm support, including the IPsec Suite B algorithms AES CTR, AES GCM and SHA2. It does cause a little additional delay with on-demand IPsec tunnels because it does not implement first+last packet caching. NETKEY supports OCF only using the cryptosoft driver, and is lacking native driver support for most cryptographic hardware cards. NETKEY also does not distribute the load of a single IPsec SA over different CPU's. NETKEY has support for Linux VTI for IPsec SA reference tracking.

The KLIPS IPsec stack offers easier debugging with tcpdump and easier iptables firewall rules due to its use of separate ipsecX interfaces. It also plays a little nicer with on-demand tunneling as it will hold on the first+last packet sent while the tunnel is being setup, and will release those packets once the IPsec tunnel is established. KLIPS distributes the load of a single IPsec SA over multiple CPU's. It supports all OCF hardware devices when compiled with OCF support. The MAST variant of KLIPS use IPsec SAref for IPsec SA reference tracking and is also used for L2TP/IPsec deployments requiring SAref tracking. Although it is recommended to use VPN_server_for_remote_clients_using_IKEv1_XAUTH instead of L2TP/IPsec.

Can I have an ipsec0 interface with XFFRM/NETKEY?

Yes, this is supported as of libreswan-3.18. See Route-based VPN using VTI

Does libreswan work with OpenVZ virtualization?

Yes it can work. You must run kernel 042stab084.8 or later. You must load the proper kernel modules on the host before booting the container. The easiest way to do this is to install libreswan on the host and then run "ipsec _stackmanager start". You also need to give the container the "net_admin" capability.

How can I debug the kernel?

For KLIPS, you can run ipsec klipsdebug --all | --none For NETKEY/XFRM, see /proc/net/xfrm_stat and its documentations at xfrm_proc.txt

Well known vulnerabilities

Libreswan is not vulnerable to the OpenSSL "Heartbleed" exploit

See Libreswan and Heartbleed

Libreswan is not vulnerable to bash CVE-2014-6271 or CVE-2014-7169

Libreswan sanitizers strings that may come from the network, such as XAUTH username, domain and DNS servers by passing it through filter functions remove_metachar() and cisco_stringify() before assigning it to environment variables that are passed to the updown scripts that invoke bash. These filters remove dangerous characters including the ' character needed for these bash exploits.

Libreswan is vulnerable to NSS CVE-2014-1568 RSA Signature Forgery

Please upgrade NSS to one of 3.17.1, 3.16.1 or 3.16.5.

This only affects libreswan when using X.509 certificates. Raw RSA keys using leftrsasigkey/rightrsasigkey are not affected. Connections using auth=secret (PSK) are also not affected.

See Mozilla Foundation Security Advisory 2014-73

Libreswan is not vulnerable to LogJam / weakdh.org CVE-2015-4000

The IKE protocol never allowed any DH group smaller than MODP768. Libreswan has never supported anything smaller than MODP1024

Libreswan as a client to a weak server will allow MODP1024 in IKEv1 as the least secure option, and MODP1536 in IKEv2 as the least secure option. However, the default is MODP2048.

Libreswan supports MODP group upto MODP8192, the ECP groups and Curve25519.

Libreswan also supports the alternative primes for MODP1024 and MODP2048 specified in RFC-5114. None of these will be placed in the default proposal group due to the lack of transparency of where these alternatives came from and why these were needed.

For more details, see "The weak DH and LogJam attack impact on IKE / IPsec (and the *swans)

Libreswan is not vulnerable to the TLS/IKE SLOTH / TRANSCRIPT attacks CVE-2015-7575

The IKE protocol is not affected, see "The SLOTH attack and IKE/IPsec"

Libreswan is not vulnerable to CVE-2018-15836 (a Bleichenbacher-style signature forgery which involves an RSA padding attack)

libreswan never contained the old RSA code from openswan. It uses NSS for RSA operations, which is not vulnerable. Note all openswan versions in RHEL < 6.8 are also not vulnerable because it uses the NSS based RSA code. RHEL as of 6.8 ships with libreswan instead of openswan.

Libreswan is not vulnerable to CVE-2018-5389 ("Practical Attacks on IPsec IKE")

This CVE is issued along with the paper The Dangers of Key Reuse: Practical Attacks on IPsec IKE. The paper lists two attacks.

The first attack requires the use of two uncommon IKEv1 Authentication Methods called "Encryption with RSA" (value 5) and "Revised encryption with RSA" (value 6). These two modes are not implement by libreswan, which only implements "RSA signatures" (value 3) for IKEv1. The extension of this attack in the paper against IKEv2 assumes RSA key reuse with these unsupported IKEv1 authentication methods, so libreswan is not vulnerable to this attack.

The second attack requires IKEv1 or IKEv2 with weak PreSharedKeys (PSKs). This is nothing new. Basically, you MITM the client (Alice) and so perform a Diffie-Hellman key exchange. Alice will then send the IKE_AUTH exchange packet containing their AUTH payload. Alice's AUTH payload is constructed (as per RFC 7296):

InitiatorSignedOctets = RealMessage1 | NonceRData | MACedIDForI
   GenIKEHDR = [ four octets 0 if using port 4500 ] | RealIKEHDR
   RealIKEHDR =  SPIi | SPIr |  . . . | Length
   RealMessage1 = RealIKEHDR | RestOfMessage1
   NonceRPayload = PayloadHeader | NonceRData
   InitiatorIDPayload = PayloadHeader | RestOfInitIDPayload
   RestOfInitIDPayload = IDType | RESERVED | InitIDData
   MACedIDForI = prf(SK_pi, RestOfInitIDPayload)

The attacker now has all values except the PSK, so it can go offline and try out all the PSK's to see if it can recreate the received AUTH payload, eg:

for every PSK in dictionary
  if (calculate prf(prf(Shared Secret, "Key Pad for IKEv2"), <InitiatorSignedOctets>) == AUTH_of_Alice)
      print (Alice used PSK:%s", PSK)

The IKE RFC's list clearly that PSK's should never be based on short or guessable passwords. Libreswan logs a warning about weak PSK's and refuses to use such weak PSKs in FIPS mode. The IKEv2 RFC clearly states this in three different places:

   Note that it is a common but typically insecure practice to have a
   shared key derived solely from a user-chosen password without
   incorporating another source of randomness.  This is typically
   insecure because user-chosen passwords are unlikely to have
   sufficient unpredictability to resist dictionary attacks and these
   attacks are not prevented in this authentication method.
   (Applications using password-based authentication for bootstrapping
   and IKE SA should use the authentication method in Section 2.16,
   which is designed to prevent off-line dictionary attacks.)  The
   pre-shared key needs to contain as much unpredictability as the
   strongest key being negotiated.

   When using pre-shared keys, a critical consideration is how to assure
   the randomness of these secrets.  The strongest practice is to ensure
   that any pre-shared key contain as much randomness as the strongest
   key being negotiated.  Deriving a shared secret from a password,
   name, or other low-entropy source is not secure.  These sources are
   subject to dictionary and social-engineering attacks, among others.

   As noted above, deriving the shared secret from a password is not secure.
   This construction is used because it is anticipated that people will do it anyway.

We strongly recommend people to use X.509 or raw public keys instead of PSKs. IKEv2 also supports RSA-PSS when using authby=rsa-sha2 so RSA v1.5 and its Bleichenbacher oracles can be avoided altogether.

For those deployments insisting on needing passwords, but without using X.509 and/or EAP authentication modes, there is RFC 6467 Secure Password Framework for IKEv2

   The IPsecME working group was chartered to provide for IKEv2
   a symmetric secure password authentication protocol that
   supports the use of low-entropy shared secrets, and to protect
   against off-line dictionary attacks without requiring the use of
   certificates or the Extensible Authentication Protocol (EAP).

Google Cloud VPN issue

Google Cloud VPN does not support NAT. The libreswan endpoint has to have a real public IP that is not NAT'ed

Configuration Matters

Using SHA2_256 for ESP connection establishes but no traffic passes (especially Android 6.0)

It seems that android 6.0 now defaults to ESP with SHA2, but it uses a bad implementation of SHA2. You can work around that using sha2-truncbug=yes but that would break all non-android clients that use the proper RFC SHA2 implementation. It might be possible to avoid SHA2 completely and use esp=aes_gcm-null instead (which is also faster)

See the sha2-truncbug man page entry of ipsec.conf for more information. There is also an android bug 194269 about this issue.

Note Linux kernels before 2.6.33 all used the broken truncation, so to interop with those old kernels, the sha2-truncbug=yes option would need to be set.

libreswan-3.18 and higher prefers sha2_512 over sha2_256 to avoid this issue. A note has also been added to RFC7321bis.

Microsoft Windows connection attempts fail with NO_POROPOSAL_CHOSEN

Windows uses only insecure defaults for IKEv2. To interop with libreswan, you need to either specify a modp1024 based proposal (eg ike=aes-sha2;modp1024) or change the registry and add a DWORD

HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Rasman\Parameters\NegotiateDH2048_AES256

Note however, that this option seems to only fix the initial IKEv2 Exchange. When Windows starts a rekey via the CREATE_CHILD_SA Exchange, it will again fall back to using modp1024 despite the registry entry. We have reported this to Microsoft under the existing Case Number 35732 - IKEv2 - Diffie-Hellman to MODP-1024. A workaround was added to libreswan 3.24 with the option ms-dh-fallback=yes that allows such a downgrade to happen, provided it is specified as one of the valid proposals in the configuration.

An example ike= and esp= entry that will work securely for modern clients (iphone, OSX, Linux) and also support weaker algorithms as least prefered for Windows is:

ike=aes_gcm256-sha2,aes_gcm128-sha2,aes256-sha2,aes128-sha2,aes256-sha1,aes128-sha1,aes256-sha2;modp1024
esp=aes_gcm256-null,aes_gcm128-null,aes256-sha2_512,aes128-sha2_512,aes256-sha1,aes128-sha1,aes_gcm256-null;modp1024

How do I specify AEAD ciphers like GCM for IKE and IPsec

For IKE, a PRF must be configured. For IPsec, a PRF must not be configured.

# the IKE SA is configured for AES_GCM using a PRF of SHA2
ike=aes_gcm-sha2_256
# the -null can be left out when using version 3.23 or higher
# phase2alg= and esp= are aliases for the same configuration option.
phase2alg=aes_sha256-null

The format of the ike= and phase2alg= (esp=) lines are: encr_algo-integ_algo. So for a classic non-AEAD AES CBC with SHA2_256 algorithm set, this would be: ike=aes-sha2_256 and phase2alg=aes-sha2_256. When using an AEAD algorithm such as AES GCM, there is no separate encryption and integrity algorithm. The combined algorithm however is negotiated and specified as if it is an encryption algorithm and with no (separate) integrity algorithm. However, IKE re-uses the integrity algorithm as the PRF to generate key material for the encryption/integrity functions of both IKE encryption and IPsec encryption. This PRF is negotiated along with the encryption and integrity algorithms. Since from a security standpoint, it makes no sense to trust an algorithm for integrity but not trust it for PRF, libreswan re-uses the integrity keyword to negotiate the PRF. It does not allow negotiating a different algorithm for integrity and PRF. When using an AEAD such as aes_gcm, that means we now need to specify a PRF, since the AEAD cannot be used as the PRF. So now the IKE configuration line becomes ike=aes_gcm-sha2_256 where the latter argument denotes the PRF and not the integrity algorithm. Since the IKE PRF also generates the key material for the IPsec SA, when also using an AEAD for the IPsec encryption/integrity, the phase2alg= (esp=) line does not need to specify an integrity algorithm nor a PRF algorithm. Up to libreswan 3.23, the parser would require to specify encryption-integrity, so the way to configure the AEAD was by adding a null for integrity. So that would be phase2alg=aes_gcm-null. As of libreswan 3.23, the trailing -null can be left out, so it can be specified as phae2alg=aes_gcm. Note that the esp= keyword is an alias for the phase2alg= keyword.

My ssh sessions hang or connectivity is very slow

This could be an MTU issue. The overhead of IPsec encryption (and possibly ESPinUDP encapsulation) yields a slightly smaller packet size. This can cause problems. A good way to confirm MTU problems is if you can login remotely over the IPsec tunnel using ssh, but issuing "ls -l /usr" causes the session to hang. Try adjusting the MTU with:

iptables -I FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS  --clamp-mss-to-pmtu

If that does not help, try hardcoding it yourself:

iptables -I FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --set-mss 1380

If these settings don't help, adding mtu=1420 to the connection might work, although it will affect all traffic that the connection covers.

As a last case alternative, you can try lowering the MTU on the internal interface of your IPsec server so that the PMTU discovery locally already goes back to 1440, eg using ip link set dev eth1 mtu 1440. This will not only affect packets for the VPN tunnel, but all packets received and sent on that inerface. Only use this as a last resort.

using auto=route slows down TCP establishments when using XFRM

(also known as rhbz#1010347 )

This should be fixed on recent kernels (3.x) and backported to some older kernels (notably rhel 6.6)

The issue: The ESP packets are arriving sometimes very late or they do not arrive at all. The issues are most noticeable after restarting the IPSec daemon.

The problem as explained by Herbert Xu:

Your first TCP SYN packet triggers the IPsec lookup, however, the packet itself is dropped. TCP then retransmits but it only gets through after the IPsec SAs are fully instated, resulting in the delay.

What happens in some kernels is that the IPsec trigger occurs in a sleepable context, which means that the sending process will wait for the IPsec SAs to be installed before sending the first SYN. However, this was never meant to be a complete solution to supporting auto=route as it relies on the fact that there must be some sleepable context prior to the SYN packet being sent.

Evidently this is no longer the case for some kernels. Going forward I suggest two courses of action:

1) Doo not rely on auto=route. Instead use auto=start and ensure that you synchronously wait for the SAs to complete. For example, ipsec auto --up foo will bring foo up synchronously, while ipsec auto --asynchronous --up foo will not wait and thus may fail.

2) I will take this issue to the IPsec maintainer and the network maintainer to see if we can make adjustments to allow at least the TCP connection case to work with auto=route. However, there is no guarantee that this will be done as we may not be able to insert the requisite sleepable context into the general network stack just so that IPsec auto=route can work.

Longer term for auto=route to be properly supported someone needs to implement packet queueing on larval SAs.

Possible work around:

        echo 0 > /proc/sys/net/core/xfrm_larval_drop
        echo 3 > /proc/sys/net/ipv4/tcp_syn_retries

This means that the first retransmit of the SYN packet (+1s) should make it through, rather than the current behaviour where only the fourth retransmit (+15s) makes it through.

Note that this workaround causes a regression on the connect() call to immediately return on a non-blocking socket with an appropriate POSIX compliant errno, which is why the workaround also sets the TCP SYN retry count to 3.

PSK doesn't work against cisco ASA 55xx

While libreswan has very little restrictions to Pre-shared secret Cisco has additional restriction, you can't have question mark '?' in psk. Cisco handles that as help request.

When using hundreds of tunnels on a xen based cloud system like AWS, a fraction of tunnels fail regularly

This is a known issue that could be a problem of the aesni kernel module in combination with the xen hypervisor. Try unloading the aesni.ko kernel module on the xen server. If you can confirm this fixes your issue (we cannot change the AWS servers), please email the swan-dev list with a confirmation.

My XAUTH authentication via PAM always claims the password is incorrect on centos6

This is an odd bug (feature?) that shows up when you have disabled selinux in /etc/sysconfig/selinux. Running selinux in permissive (or enforcing) mode seems to resolve this.

Why is it recommended to disable send_redirects in /proc/sys/net ?

Let's say you have a VPN server in a cloud that you use with your phone. Your phone will setup an IPsec VPN and all its traffic is encrypted and send to the cloud instance, which decrypts it and sends it on the internet, using SNAT. Replies it receives are encrypted and send to your phone.

Your phone will send the VPN server an encrypted packet. The server receives it on eth0 (its only interface!) and decrypts it. The decrypted packet is then ready to get routed. The server looks which interface it should send the packet to. It is destined to go out eth0. Since the packet came in via eth0 and would go out via eth0, the server concludes there clearly must be a better path not involving itself, since it is going out the same interface. It has no idea the packet arrived encrypted and got decrypted.

This is why we recommend disabling "send_redirects" in /etc/sysctl.conf using

net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0

Why is it recommended to disable rp_filter in /proc/sys/net ?

The kernel has a notion of which interface a packet came from and where it will go to and it determines if the path through the machine makes sense based on the IP address it sees. If 10.0.2.0/24 lives on eth0 and 1.2.3.4 has eth1 with the default route, then rp_filter will automatically block a 10.0.2.1 packet coming in on eth1. The rp_filter code is an implementation of RFC-3704 https://tools.ietf.org/html/rfc3704. Of course, you should created had firewall rules on the machine that would block these packets too. AND firewall rules on the router in front of the machine.

The problem with IPsec appears when you hand out a 10.0.2.13 address, like via XAUTH/IPsec. A packet with IP a.b.c.d comes in on eth1 for 1.2.3.4, which passes rp_filter, then gets decrypted to 10.0.2.13. Now the packet is still seen as coming from eth1, so rp_filter will drop the packet as 10.0.2.0/24 packets are only expected to originate from eth0.

This is why we recommend disabling "rp_filter" in /etc/sysctl.conf using

net.ipv4.conf.default.rp_filter = 0

A network restart or reboot might be neccessary for this entry to be picked up. As a one shot disabling for all interfaces, you can use:

for i in /proc/sys/net/ipv4/conf/*; do echo 0 > $i/rp_filter; done

NAT + IPsec is not working

When using NAT on the same linux machine as IPsec, care must be taken that packets meant for an IPsec remote address is not NATed. The NATed packet would no longer match the IPsec tunnel source and destination IP address ranges.

If you have the following common catch-all NAT rule:

-A POSTROUTING -o eth0 -j MASQUERADE

or

-A POSTROUTING -o eth0 -j SNAT --to-source 1.2.3.4

then either change these rules to only apply with a non-ipsec policy:

-A POSTROUTING -o eth0 -m policy --dir out --pol none -j MASQUERADE

or insert a ipsec skip rule before these:

-A POSTROUTING -o eth0 -m policy --dir out --pol ipsec -j RETURN
-A POSTROUTING -o eth0 -j MASQUERADE

Can I hand out LAN IP addresses in the addresspool?

Yes, but you will need to enable proxyarp on the IPsec server. You can do this globally using the proxyarp entry in /etc/sysctl.conf, for example if your LAN interface is ethX, use

net.ipv4.conf.ethX.proxy_arp=1

Common error messages

030 ignoring message from whack with bad magic

This means that the ipsec whack command that is used to talk to the pluto daemon are different versions. The most common cause is that the system has two installs of libreswan. One system install that appears in /usr/libexec/ipsec and one local install in /usr/local/libexec/ipsec. The system started the non-local version, but the user running the ipsec command prefers /usr/local/sbin/ipsec over /usr/sbin/ipsec and thus uses the whack from /usr/local/. You should remove one of the two installs.

Another reason this can happen is that a package upgrade did not properly restart the daemon and the old daemon is running but the user only has the new whack command. Manually killing the pluto daemon and restarting the ipsec service will resolve that. Note that the "shutdown" command is not version-specific, so regular package upgrades (eg via rpm) can install the new whack, then call shutdown with the new whack to the old pluto, without this error appearing.

compile error: ‘SEC_OID_CURVE25519’ undeclared here

The NSS library is too old and does not support CURVE25519. Either upgrade the NSS library or compile with USE_DH31=false to disable CURVE25519 at build time. This currently happens with Debian9 and Debian8

ERROR: asynchronous network error report on eth0 (sport=4500) for message to xx.xx.xxx.xxx port 4500, complainant yy.yy.yyy.yyy: Message too long [errno 90, origin ICMP type 3 code 4 (not authenticated)]

These errors are often intermittent, it depends on your application data that is getting encrypted. Your NAT'ed IPsec tunnel is using ESPinUDP, and the additional UDP header caused some of your packets to be too big. See the previous answer and try lowing your mtu. Use an insanely small mtu like 1300 or 1200 for confirmation. Then try to bring it up higher to what seems to work reliably for you.

ERROR: asynchronous network error report on eth0 (sport=4500) for message to xx.xx.xxx.xxx port 4500, complainant yy.yy.yyy.yyy: No route to host [errno 113, origin ICMP type 3 code 1(not authenticated)]

These errors often happen 15 minutes after the tunnel successfully established. It's most likely that the tunnel was idle and the NAT router removed the nat mapping. Or the NAT router rebooted and lost state. It no longer knows which client to send the packet to. Ensure your connection uses nat-keepalive=yes. Possibly decrease the global keep-alive= value to send more frequent keep-alive packets. Alternatively, enable DPD on the connection to cause some regular traffic on idle tunnels.

ERROR: asynchronous network error report on eth0 (sport=500) for message to xx.xx.xxx.xxx port 500, complainant yy.yy.yyy.yyy: Connection refused [errno 111, origin ICMP type 3 code 3 (not authenticated)]

This error means the other end is not (or no longer) running an IKE daemon. Ensure the IKE daemon is running on the remote system. If you see this error during a negotiation, it could be that the remote IKE daemon crashed or stopped listening. On Mac OSX if the IKE daemon is not allowed to read the proper X.509 certificate, it will only realize this partially into the IKE negotiation and terminate, resulting in this error. It is also possible that the remote IP is actually a NAT device with the IPsec device behind it. In that case, using rekey=no and letting the other end initiate might make this error go away.

error: ignoring informational payload, type NO_PROPOSAL_CHOSEN msgid=00000000

This error means exactly what i says. The IKE proposal(s) sent to the server were rejected. This means there is a configuration mismatch between libreswan and the remote IPsec server. Usually this is a configuration mismatch in the ike= or esp= (phase2alg=) setting. But other options could also be wrong, such as authby= or pfs= or aggrmode=

Microsoft Windows fails to connect, log shows: retransmit response for message ID: 1 exchange ISAKMP_v2_AUTH

You are on a network that is dropping UDP fragments, and your client has no support for IKEv2 fragmentation. This is common on LTE networks when using Windows clients that are not up to date. Microsoft added IKEv2 Fragmentation support in Windows 10 April 2018 Update (v1803) so updating Windows might resolve this issue.

Microsoft Windows Error 13806: IKE failed to find valid machine certificate

You are using a certificate that is missing some required ExtendedKeyUsage ("EKU") attributes. See Interoperability#Windows_Certificate_requirements

ssh gives error: Corrupted MAC on input. Disconnecting: Packet corrupt

This usually indicates MTU issues. You can try lowering the mtu using the mtu= option or by changing the actual mtu on the proper interface on the libreswan server. This error is known to happen on Amazon EC2 AMI types that use PV (xen) instances. Switching to Amazon HVM instances seems to resolve the problem on AWS.

Using aes_gcm or aes_ctr results in ERROR: netlink response for Add SA esp.XXXXXXXX@IPADDRESS included errno 22: Invalid argument

This usually indicates that the ESP algorithm selected using the phasealg= (esp=) line is not available in the kernel. These usually indicate kernel bugs.

Linux kernels up to 3.2.x have a bug in the aesni-intel driver on x86_64. See rhbz#1176211 The AESNI hardware acceleration kernel module does not properly support 256 or 192 bit keys for AES_GCM. You can either switch to 128 bit keys or blacklist or unload the aesni-intel kernel module. Another alternative is to switch from phase2alg=aes_gcm to phase2alg=aes, although that will cut the performance in half.

Linux kernels to date seem to have a bug in the aes_ctr code on the POWER8BE VM - use phase2alg=aes there as well to use AES_CBC,

Can't find the private key from the NSS CERT (err -8177)

The old libreswan-3.8 /etc/ipsec.d/nsspassword requires just the password to be entered. In later libreswan's, you must add the NSS prefix to it. So to specify the password "secret", use:

NSS Certificate DB:secret

ESP DH algorithm MODP3072 is invalid as PFS policy is disabled

libreswan before version 3.25 allowed invalid configurations with pfs=no while specifying a PFS group for esp (eg esp="aes-sha2;modp3072") and it would ignore the PFS group. This is no longer allowed. Either use pfs=yes (the recommended and default) or remove the modp item from any ah= / esp= / phaesalg= option.

"IPsec encryption transform did not specify required KEY_LENGTH"

This happens when trying to interoperate with old openswan versions that mistakenly do not send the KEY_LENGTH attribute for AES. The work around the problem, on those old implementations, specify "aes128" or "aes256" instead of "aes". For example:

phase2alg=aes256-sha1;modp1536
esp=aes256-sha1;modp1536
ike=aes256-sha1;modp1536

No PARENT proposal selected

This error can happen when there is a mismatch of IKE proposals between the server and client. In libreswan-3.14, the modp1024 (group 2) was removed from the default proposal set because of its weakness, but apparently Windows 7 requires it per default.

Using VTI causes "Keys are not allowed with ipip and sit tunnels"

You need to upgrade the iproute package. For RHEL7, see RHBA-2015-2117

Old problems fixed in newer releases

invalid last pad octet:

There is a bug in racoon (also called ipsec-tools) that sends improper oversized padding. Libreswan version 3.14 became more struct and rejected these packets. Libreswan 3.16 allows the bad padding again. Note that racoon is used in various products including older versions of OSX and iOS (up to iOS 7.x)

Module unloading error on shutdown or restart: Module esp4 is in use

ERROR: Module xfrm4_mode_tunnel is in use
ERROR: Module esp4 is in use
FAILURE to unload NETKEY esp4/esp6 module

This has been fixed in libreswan-3.9. Please upgrade

IPv6 tunnel works manually but fails on freshly booted machine

When one machine reboots and loses state, the other machine still has an encryption policy for the rebooted machine and will insist on receiving only encrypted packets. Obviously, after a reboot the host cannot send encrypted packets. For that reason, an "IKE hole" is present in the host's kernel. This means that any UDP 500 and UDP 4500 packets for IKE are allowed in plaintext even if we have an encryption policy active for that host. On at least the Linux kernel that hole does not include ipv6-icmp Neighbour Discovery packets, which is a unicast reply from the host that did not reboot to the just rebooted host. You can see this in "ipsec status" as:

000 Shunt list:
000 000 2620:52:0:ab0:42f2:e9ff:fe09:a16c/128:136 -58-> 2620:52:0:ab0:ca1f:66ff:fef1:c74c/128:0 => %hold 0 %acquire-netlink

Note protocol 58 (ipv6-icmp)

A workaround is to add the following connection:

conn v6neighbor-hole
        left=::1
        leftsubnet=::0/0
        leftprotoport=58/0
        rightprotoport=58/34816
        rightsubnet=::0/0
        right=::0
        connaddrfamily=ipv6
        authby=never
        type=passthrough
        auto=route
        priority=1

If you wonder where the number "34816" comes from please see the leftprotoport= entry of the ipsec.conf man page.

libreswan-3.13 to 3.22 installs this connection per default in /etc/ipsec.d/ libreswan as of 3.23 loads this as a buildin connection automatically.

Using IPsec/L2TP with xl2tpd, the pppd ip-down script does not seem to run

Old pppd < 2.4.5 could cause xl2tpd to hang on a hanging pppd, so xl2tpd killed pppd itself to avoid this. But that meant pppd did not get to execute its ip-down script. This behaviour can be tweaked using the define TRUST_PPPD_TO_DIE in the xl2tpd Makefile. Fedora and EPEL packages enable this as of April 2015.

Interop issue with racoon: invalid padding-length octet: 0x23

Racoon has a broken implementation of IKE padding. Libreswan version 3.12 to 3.14 had strict padding checks that caused these packets to be rejected. These restrictions have been loosened to accomadate the broken racoon in libreswan 3.15 and higher

on xen pluto crashes with: Illegal instruction when using ike=aes_gcm

This is due to the interaction of NSS and Xen (which is possibly lying about the real AES hardware capability of the system. A workaround for this is to disable AES_GCM encryption in NSS using:

export NSS_DISABLE_HW_GCM=1

This should probably be placed somewhere more global than just libreswan, as it will affect everything that is using the nss libraries.

IPv6/KLIPS: ipsec_set_dst can't determine the correct routing device on a host connection

This is a kernel bug, see lsw#237 Confirmed affected are kernel 4.1.6 and 3.14.51 but possible all 3.x and 4.[12].x kernels to date (Sep 28, 2015)

L2TP / Transport Mode connections fail after system update

There is a kernel bug in Linux kernel 4.14 with the XFRM/NETKEY code. Downgrade to 4.13 or upgrade to 4.15rc1 or later. Be aware that some kernels could contain backports of the faulty 4.14 kernel.

High Speed IPsec performance issues

Kernels used a really small crypto queue by default (100), which was also hard coded. Recent kernels (4.x and RHEL7 kernels) can now be configured to increase this queue length:

echo 'options cryptd cryptd_max_cpu_qlen=1000' > /etc/modprobe.d/cryptd.conf
reboot

To check your current queue length:

# modprobe cryptd
# dmesg | grep cryptd
[ 4865.043558] cryptd: max_cpu_qlen set to 1000

@@ Line 20: / Line 20: @@
 === IKEv2 ===
-* IKE: AES_GCM, AES_CTR, AES_CBC, 3DES, CAMELLIA_CBC, SERPENT, TWOFISH and SHA2_256, SHA2_384, SHA2_512, AES-XCBC-MAC, SHA1, MD5 with all regular MODP groups,  NIST ECP groups and Curve25519, and ESN
+* IKE: CHACHA20-POLY1305, AES_GCM, AES_CTR, AES_CBC, 3DES, CAMELLIA_CBC, SERPENT, TWOFISH and SHA2_256, SHA2_384, SHA2_512, AES-XCBC-MAC, SHA1, MD5 with all regular MODP groups,  NIST ECP groups and Curve25519, and ESN
-* ESP on Linux: AES_GCM, AES_CCM, AES_CTR, AES_CBC, CAMELLIA, 3DES, SERPENT, TWOFISH, CAST5, NULL and SHA2_256, SHA2_256_96(truncbug) SHA2_384, SHA2_512, AES-XCBC-MAC, SHA1, MD5
+* ESP on Linux: CHACHA20-POLY1305, AES_GCM, AES_CCM, AES_CTR, AES_CBC, CAMELLIA, 3DES, SERPENT, TWOFISH, CAST5, NULL and SHA2_256, SHA2_256_96(truncbug) SHA2_384, SHA2_512, AES-XCBC-MAC, SHA1, MD5
 * AH on Linux: SHA2_256, SHA2_256_96(truncbug) SHA2_384, SHA2_512, AES-XCBC-MAC, SHA1, MD5
@@ Line 375: / Line 375: @@
 Another reason this can happen is that a package upgrade did not properly restart the daemon and the old daemon is running but the user only has the new whack command. Manually killing the pluto daemon and restarting the ipsec service will resolve that. Note that the "shutdown" command is not version-specific, so regular package upgrades (eg via rpm) can install the new whack, then call shutdown with the new whack to the old pluto, without this error appearing.
+== compile error: ‘SEC_OID_CURVE25519’ undeclared here ==
+The NSS library is too old and does not support CURVE25519. Either upgrade the NSS library or compile with USE_DH31=false to disable CURVE25519 at build time. This currently happens with Debian9 and Debian8
 == ERROR: asynchronous network error report on eth0 (sport=4500) for message to xx.xx.xxx.xxx port 4500, complainant yy.yy.yyy.yyy: Message too long [errno 90, origin ICMP type 3 code 4 (not authenticated)] ==

FAQ: Difference between revisions