FAQ: Difference between revisions
Paul Wouters (talk | contribs) No edit summary |
Revision as of 10:26, 3 February 2020
General Questions
( we will sort this in categories once we have more )
Which RFCs or other standards does libreswan support?
Which ciphers / algorithms does libreswan support?
IKEv1
- IKE: AES_CBC, 3DES, SERPENT, TWOFISH and SHA2_256, SHA2_384, SHA2_512, SHA1, MD5 with all regular MODP and ECP groups
- ESP on Linux: AES_GCM, AES_CCM, AES_CTR, AES_CBC, CAMELLIA, 3DES, SERPENT, TWOFISH, CAST5, NULL and SHA2_256, SHA2_256_96(truncbug), SHA2_384, SHA2_512, AES-XCBC-MAC, SHA1, MD5
- AH on Linux: SHA2_256, SHA2_256_96(truncbug), SHA2_384, SHA2_512, AES-XCBC-MAC, SHA1, MD5
IKEv2
- IKE: CHACHA20-POLY1305, AES_GCM, AES_CTR, AES_CBC, 3DES, CAMELLIA_CBC, SERPENT, TWOFISH and SHA2_256, SHA2_384, SHA2_512, AES-XCBC-MAC, SHA1, MD5 with all regular MODP groups, NIST ECP groups and Curve25519, and ESN
- ESP on Linux: CHACHA20-POLY1305, AES_GCM, AES_CCM, AES_CTR, AES_CBC, CAMELLIA, 3DES, SERPENT, TWOFISH, CAST5, NULL and SHA2_256, SHA2_256_96(truncbug), SHA2_384, SHA2_512, AES-XCBC-MAC, SHA1, MD5
- AH on Linux: SHA2_256, SHA2_256_96(truncbug), SHA2_384, SHA2_512, AES-XCBC-MAC, SHA1, MD5
Notes
- Serpent and Twofish are compile time options and do not come from the NSS library. They use the well-known private allocation numbers from FreeS/WAN.
- Some algorithms are disabled when running in FIPS mode.
Which IKEv1 and IKEv2 Exchange Modes does libreswan support?
The IANA Registry lists all official Exchange Modes. There are a few IKEv1 Modes that are very common despite never having gotten past the draft stage.
Supported:
- IKEv1 Main Mode (PSK, raw RSA, X509)
- IKEv1 Aggressive Mode (PSK, raw RSA, X509)
- IKEv1 XAUTH/RSA and XAUTH/PSK with ModeConfig (aka "Cisco IPsec mode")
Not supported
- IKEv1 Revised Mode
- IKEv1 Hybrid Mode (aka "Mutual Group Authentication") although there is some unmaintained contributed code
Does libreswan interoperate with Microsoft Windows?
In general, yes it does. For specific features, see Microsoft Product Behaviour
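Note that IKEv2 fragmentation is only supported as of the Windows 10 April 2018 Update (v1803). If you see issues when using LTE/4G/5G, try updating to the latest Windows 10 release.
Another known issue is that reconnecting does not work; see this techinline blog post: https://blog.techinline.com/2018/06/01/vpn-stuck-on-connecting-windows-10/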
Microsoft and L2TP (xl2tpd)
It seems that newer xl2tpd versions only interoperate with Microsoft clients when the l2tp_ppp kernel module is loaded. Some distributions blacklist the l2tp_netlink and/or l2tp_ppp modules from auto-loading, so check the module blacklist for your distribution. To see if the modules load properly, use:
modprobe l2tp_netlink
modprobe l2tp_ppp
lsmod | grep l2tp
You should see the l2tp modules in the output of the last command.
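The blacklist check mentioned above can be sketched in shell. This is a minimal sketch, not a libreswan tool: the helper name is_blacklisted is hypothetical, it assumes your distribution keeps its blacklist entries as "blacklist NAME" lines in *.conf files under /etc/modprobe.d, and it only matches the underscore spelling of the module name.

```shell
#!/bin/sh
# Hypothetical helper: succeed if MODULE has a "blacklist MODULE" line in
# any *.conf file under DIR (DIR defaults to /etc/modprobe.d, an assumption).
is_blacklisted() {
    module=$1
    dir=${2:-/etc/modprobe.d}
    grep -Eqs "^[[:space:]]*blacklist[[:space:]]+${module}[[:space:]]*(#.*)?$" "$dir"/*.conf
}

# Report the state of both modules named in the text above.
for m in l2tp_netlink l2tp_ppp; do
    if is_blacklisted "$m"; then
        echo "$m is blacklisted"
    else
        echo "$m is not blacklisted"
    fi
done
```

If a module is reported as blacklisted, removing or commenting out that line and then re-running the modprobe commands above is the usual next step.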
Should I use the XFRM/NETKEY or KLIPS IPsec stack with libreswan?
At this point we recommend using the NETKEY/XFRM stack for most deployments. If you are using an embedded platform with a cryptographic hardware offload device, it might be better to use KLIPS.
The NETKEY IPsec stack requires no kernel recompiles on most Linux distributions, so it is the easiest stack to use in most standard deployments. It offers a larger selection of cryptographic algorithm support, including the IPsec Suite B algorithms AES CTR, AES GCM and SHA2. It does cause a little additional delay with on-demand IPsec tunnels because it does not implement first+last packet caching. NETKEY supports OCF only using the cryptosoft driver, and is lacking native driver support for most cryptographic hardware cards. NETKEY also does not distribute the load of a single IPsec SA over different CPUs. NETKEY has support for Linux VTI for IPsec SA reference tracking.
The KLIPS IPsec stack offers easier debugging with tcpdump and easier iptables firewall rules due to its use of separate ipsecX interfaces. It also plays a little nicer with on-demand tunneling as it will hold on to the first+last packet sent while the tunnel is being set up, and will release those packets once the IPsec tunnel is established. KLIPS distributes the load of a single IPsec SA over multiple CPUs. It supports all OCF hardware devices when compiled with OCF support. The MAST variant of KLIPS uses IPsec SAref for IPsec SA reference tracking and is also used for L2TP/IPsec deployments requiring SAref tracking, although it is recommended to use VPN_server_for_remote_clients_using_IKEv1_XAUTH instead of L2TP/IPsec.
Can I have an ipsec0 interface with XFRM/NETKEY?
Yes, this is supported as of libreswan-3.18. See Route-based VPN using VTI
Does libreswan work with OpenVZ virtualization?
Yes it can work. You must run kernel 042stab084.8 or later. You must load the proper kernel modules on the host before booting the container. The easiest way to do this is to install libreswan on the host and then run "ipsec _stackmanager start". You also need to give the container the "net_admin" capability.
How can I debug the kernel?
For KLIPS, you can run ipsec klipsdebug --all to enable all debugging, or ipsec klipsdebug --none to disable it.
For NETKEY/XFRM, see /proc/net/xfrm_stat and its documentation in xfrm_proc.txt.
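Most counters in /proc/net/xfrm_stat are normally zero, so it can help to print only the ones that are not. The following is a minimal sketch; the function name xfrm_nonzero and its optional file argument are assumptions made for illustration, and it assumes each line holds a counter name followed by its value.

```shell
#!/bin/sh
# Hypothetical helper: print only the XFRM counters with a non-zero value.
# The optional argument (a file path) exists so the function can be tried
# on a saved copy of the statistics.
xfrm_nonzero() {
    stat_file=${1:-/proc/net/xfrm_stat}
    [ -r "$stat_file" ] || { echo "cannot read $stat_file" >&2; return 1; }
    awk '$2 > 0 { printf "%-28s %s\n", $1, $2 }' "$stat_file"
}

xfrm_nonzero || true
```

Running it before and after reproducing a problem shows which counter moved; xfrm_proc.txt explains what each counter means.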
Well known vulnerabilities
Libreswan is not vulnerable to the OpenSSL "Heartbleed" exploit
Libreswan is not vulnerable to bash CVE-2014-6271 or CVE-2014-7169
Libreswan sanitizes strings that may come from the network, such as the XAUTH username, domain and DNS servers, by passing them through the filter functions remove_metachar() and cisco_stringify() before assigning them to environment variables that are passed to the updown scripts that invoke bash. These filters remove dangerous characters including the ' character needed for these bash exploits.
Libreswan is vulnerable to NSS CVE-2014-1568 RSA Signature Forgery
Please upgrade NSS to one of 3.17.1, 3.16.5 or 3.16.2.1.
This only affects libreswan when using X.509 certificates. Raw RSA keys using leftrsasigkey/rightrsasigkey are not affected. Connections using auth=secret (PSK) are also not affected.
See Mozilla Foundation Security Advisory 2014-73
Libreswan is not vulnerable to LogJam / weakdh.org CVE-2015-4000
The IKE protocol never allowed any DH group smaller than MODP768. Libreswan has never supported anything smaller than MODP1024
Libreswan as a client to a weak server will allow MODP1024 in IKEv1 as the least secure option, and MODP1536 in IKEv2 as the least secure option. However, the default is MODP2048.
Libreswan supports MODP groups up to MODP8192, the ECP groups and Curve25519.
Libreswan also supports the alternative primes for MODP1024 and MODP2048 specified in RFC-5114. None of these will be placed in the default proposal group due to the lack of transparency of where these alternatives came from and why these were needed.
For more details, see "The weak DH and LogJam attack impact on IKE / IPsec (and the *swans)"
Libreswan is not vulnerable to the TLS/IKE SLOTH / TRANSCRIPT attacks CVE-2015-7575
The IKE protocol is not affected, see "The SLOTH attack and IKE/IPsec"
Libreswan is not vulnerable to CVE-2016-5361 (IKEv1 protocol is vulnerable to DoS amplification attack)
This attack basically spoofs IKEv1 packets from different IPs. Since the IKEv1 protocol has the responder also retransmitting packets, one spoofed packet can generate a response packet that is retransmitted a number of times. This flaw is inherent to the IKEv1 protocol and was addressed in IKEv2.
Nevertheless, libreswan has changed its implementation to not retransmit as responder in these specific cases of receiving a "first packet". Since in IKEv1 the initiator is also responsible for retransmission, this should not break any real IKEv1 clients.
Libreswan is not vulnerable to CVE-2018-15836 (a Bleichenbacher-style signature forgery which involves an RSA padding attack)
libreswan never contained the old RSA code from openswan. It uses NSS for RSA operations, which is not vulnerable. Note that all openswan versions in RHEL < 6.8 are also not vulnerable because they use the NSS-based RSA code. RHEL as of 6.8 ships with libreswan instead of openswan.
Libreswan is not vulnerable to CVE-2018-5389 ("Practical Attacks on IPsec IKE")
This CVE is issued along with the paper The Dangers of Key Reuse: Practical Attacks on IPsec IKE. The paper lists two attacks.
The first attack requires the use of two uncommon IKEv1 Authentication Methods called "Encryption with RSA" (value 5) and "Revised encryption with RSA" (value 6). These two modes are not implemented by libreswan, which only implements "RSA signatures" (value 3) for IKEv1. The extension of this attack in the paper against IKEv2 assumes RSA key reuse with these unsupported IKEv1 authentication methods, so libreswan is not vulnerable to this attack.
The second attack requires IKEv1 or IKEv2 with weak PreSharedKeys (PSKs). This is nothing new. Basically, you MITM the client (Alice) and so perform a Diffie-Hellman key exchange. Alice will then send the IKE_AUTH exchange packet containing their AUTH payload. Alice's AUTH payload is constructed (as per RFC 7296):
InitiatorSignedOctets = RealMessage1 | NonceRData | MACedIDForI
GenIKEHDR = [ four octets 0 if using port 4500 ] | RealIKEHDR
RealIKEHDR = SPIi | SPIr | . . . | Length
RealMessage1 = RealIKEHDR | RestOfMessage1
NonceRPayload = PayloadHeader | NonceRData
InitiatorIDPayload = PayloadHeader | RestOfInitIDPayload
RestOfInitIDPayload = IDType | RESERVED | InitIDData
MACedIDForI = prf(SK_pi, RestOfInitIDPayload)
The attacker now has all values except the PSK, so it can go offline and try every PSK in a dictionary to see if it can recreate the received AUTH payload, e.g.:
for every PSK in dictionary:
    if prf(prf(PSK, "Key Pad for IKEv2"), <InitiatorSignedOctets>) == AUTH_of_Alice:
        print("Alice used PSK: %s", PSK)
The IKE RFCs clearly state that PSKs should never be based on short or guessable passwords. Libreswan logs a warning about weak PSKs and refuses to use such weak PSKs in FIPS mode. The IKEv2 RFC clearly states this in three different places:
Note that it is a common but typically insecure practice to have a shared key derived solely from a user-chosen password without incorporating another source of randomness. This is typically insecure because user-chosen passwords are unlikely to have sufficient unpredictability to resist dictionary attacks and these attacks are not prevented in this authentication method. (Applications using password-based authentication for bootstrapping and IKE SA should use the authentication method in Section 2.16, which is designed to prevent off-line dictionary attacks.) The pre-shared key needs to contain as much unpredictability as the strongest key being negotiated.
When using pre-shared keys, a critical consideration is how to assure the randomness of these secrets. The strongest practice is to ensure that any pre-shared key contain as much randomness as the strongest key being negotiated. Deriving a shared secret from a password, name, or other low-entropy source is not secure. These sources are subject to dictionary and social-engineering attacks, among others.
As noted above, deriving the shared secret from a password is not secure. This construction is used because it is anticipated that people will do it anyway.
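For deployments that keep using PSKs, the secret can be generated from a random source instead of being derived from a password. The sketch below is one way to do that with standard tools; the function name gen_psk and the default of 32 random bytes (256 bits) are assumptions chosen for illustration, not a libreswan requirement.

```shell
#!/bin/sh
# Hypothetical helper: emit NBYTES (default 32) random bytes from
# /dev/urandom as one lowercase hex string.
gen_psk() {
    nbytes=${1:-32}
    od -An -tx1 -N "$nbytes" /dev/urandom | tr -d ' \n'
    echo
}

psk=$(gen_psk 32)
# Only the length is printed here so the secret does not end up in a log.
echo "generated a ${#psk}-character hex PSK"
```

The resulting string has no dictionary structure, so the offline guessing attack described above has nothing to work with.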
We strongly recommend people to use X.509 or raw public keys instead of PSKs. IKEv2 also supports RSA-PSS when using authby=rsa-sha2 so RSA v1.5 and its Bleichenbacher oracles can be avoided altogether.
For those deployments insisting on needing passwords, but without using X.509 and/or EAP authentication modes, there is RFC 6467 Secure Password Framework for IKEv2
The IPsecME working group was chartered to provide for IKEv2 a symmetric secure password authentication protocol that supports the use of low-entropy shared secrets, and to protect against off-line dictionary attacks without requiring the use of certificates or the Extensible Authentication Protocol (EAP).
Libreswan is not vulnerable to "The Deviation Attack" presented at TrustCom 2019
The paper is available at https://hal.inria.fr/hal-01980276/document
It is also known as "A Novel Denial-of-Service Attack Against IKEv2"
The attack described is a theoretical and extremely impractical attack that simply does not work against any IKE implementation. Various people at the IETF IPsec Working Group tried to convince the authors of this before final publication of the paper. See: https://mailarchive.ietf.org/arch/msg/ipsec/-xT8RclsMtdmFNzCAAR2PSGAPN0
Libreswan is not vulnerable to CVE-2019-14899 "Inferring and hijacking VPN-tunneled TCP connections"
Vulnerability disclosure: https://seclists.org/oss-sec/2019/q4/122
The Linux IPsec implementation (XFRM) is a "policy based VPN" and does not accept unencrypted packets for IP ranges for which it has an IPsec encryption policy, irrespective of the rp_filter setting. When using VTI or XFRMi to create a "routing based VPN", AND disabling rp_filter protection for spoofed traffic, libreswan is still not vulnerable as it places the obtained VPN client IP address on the loopback device with a non-global scope of 50, resulting in the unencrypted packet still being dropped.
An additional defense can still be deployed in libreswan using the tfc=1000 (or tfc=1500) option which causes all outgoing ESP traffic to be padded to 1000 bytes (or the path MTU when specifying more than what would otherwise fit) ensuring that nothing can be learned from the size of the encrypted ESP packet.
Google Cloud VPN issue
Google Cloud VPN does not support NAT. The libreswan endpoint has to have a real public IP that is not NAT'ed
Build issues
Missing devel packages for CentOS8
Some devel packages have moved into the PowerTools repository. To enable this, run:
dnf config-manager --set-enabled PowerTools
Configuration Matters
Using SHA2_256 for ESP connection establishes but no traffic passes (especially Android 6.0)
It seems that Android 6.0 now defaults to ESP with SHA2, but it uses a bad implementation of SHA2. You can work around that using sha2-truncbug=yes, but that would break all non-Android clients that use the proper RFC SHA2 implementation. It might be possible to avoid SHA2 completely and use esp=aes_gcm-null instead (which is also faster).
See the sha2-truncbug man page entry of ipsec.conf for more information. There is also an android bug 194269 about this issue.
Note Linux kernels before 2.6.33 all used the broken truncation, so to interop with those old kernels, the sha2-truncbug=yes option would need to be set.
libreswan-3.18 and higher prefers sha2_512 over sha2_256 to avoid this issue. A note has also been added to RFC7321bis.
Microsoft Windows connection attempts fail with NO_PROPOSAL_CHOSEN
Windows uses only insecure defaults for IKEv2. To interop with libreswan, you need to either specify a modp1024 based proposal (eg ike=aes-sha2;modp1024) or change the registry and add a DWORD
HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Rasman\Parameters\NegotiateDH2048_AES256
Note however, that this option seems to only fix the initial IKEv2 Exchange. When Windows starts a rekey via the CREATE_CHILD_SA Exchange, it will again fall back to using modp1024 despite the registry entry. We have reported this to Microsoft under the existing Case Number 35732 - IKEv2 - Diffie-Hellman to MODP-1024. A workaround was added to libreswan 3.24 with the option ms-dh-fallback=yes that allows such a downgrade to happen, provided it is specified as one of the valid proposals in the configuration.
An example ike= and esp= entry that will work securely for modern clients (iPhone, OSX, Linux) and also supports weaker algorithms as least preferred for Windows is:
ike=aes_gcm256-sha2,aes_gcm128-sha2,aes256-sha2,aes128-sha2,aes256-sha1,aes128-sha1,aes256-sha2;modp1024
esp=aes_gcm256-null,aes_gcm128-null,aes256-sha2_512,aes128-sha2_512,aes256-sha1,aes128-sha1,aes_gcm256-null;modp1024
See also How to configure DH for IKEv2 in Windows
iOS (Apple) phone devices trying and failing to rekey in 8 minutes
This is a bug in Apple devices. These devices do not use pfs=yes when configured manually on the phone (as opposed to via a .mobileconfig provisioning profile). Either use pfs=no or enable the ms-dh-fallback=yes option.
How do I specify AEAD ciphers like GCM for IKE and IPsec
For IKE, a PRF must be configured. For IPsec, a PRF must not be configured.
# the IKE SA is configured for AES_GCM using a PRF of SHA2
ike=aes_gcm-sha2_256
# the -null can be left out when using version 3.23 or higher
# phase2alg= and esp= are aliases for the same configuration option.
phase2alg=aes_gcm-null
The format of the ike= and phase2alg= (esp=) lines is: encr_algo-integ_algo. So for a classic non-AEAD AES CBC with SHA2_256 algorithm set, this would be: ike=aes-sha2_256 and phase2alg=aes-sha2_256.
When using an AEAD algorithm such as AES GCM, there is no separate encryption and integrity algorithm. The combined algorithm is however negotiated and specified as if it is an encryption algorithm with no (separate) integrity algorithm.
However, IKE re-uses the integrity algorithm as the PRF to generate key material for the encryption/integrity functions of both IKE encryption and IPsec encryption. This PRF is negotiated along with the encryption and integrity algorithms. Since from a security standpoint it makes no sense to trust an algorithm for integrity but not trust it for PRF, libreswan re-uses the integrity keyword to negotiate the PRF. It does not allow negotiating a different algorithm for integrity and PRF.
When using an AEAD such as aes_gcm, that means we now need to specify a PRF, since the AEAD cannot be used as the PRF. So the IKE configuration line becomes ike=aes_gcm-sha2_256, where the latter argument denotes the PRF and not the integrity algorithm.
Since the IKE PRF also generates the key material for the IPsec SA, when also using an AEAD for the IPsec encryption/integrity, the phase2alg= (esp=) line does not need to specify an integrity algorithm nor a PRF algorithm. Before libreswan 3.23, the parser required encryption-integrity to be specified, so the way to configure the AEAD was by adding a null for integrity: phase2alg=aes_gcm-null. As of libreswan 3.23, the trailing -null can be left out, so it can be specified as phase2alg=aes_gcm. Note that the esp= keyword is an alias for the phase2alg= keyword.
My ssh sessions hang or connectivity is very slow
This could be an MTU issue. The overhead of IPsec encryption (and possibly ESPinUDP encapsulation) yields a slightly smaller packet size. This can cause problems. A good way to confirm MTU problems is if you can login remotely over the IPsec tunnel using ssh, but issuing "ls -l /usr" causes the session to hang. Try adjusting the MTU with:
iptables -I FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu
If that does not help, try hardcoding it yourself:
iptables -I FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --set-mss 1380
If these settings don't help, adding mtu=1420 to the connection might work, although it will affect all traffic that the connection covers.
As a last resort, you can try lowering the MTU on the internal interface of your IPsec server so that local PMTU discovery already reports 1440, e.g. using ip link set dev eth1 mtu 1440. This affects not only packets for the VPN tunnel, but all packets received and sent on that interface.
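The relation between an MTU and the TCP MSS that fits inside it is simple arithmetic, sketched below. The function name mss_for_mtu is hypothetical, and the calculation assumes IPv4 with no IP or TCP options (20 bytes of header each).

```shell
#!/bin/sh
# Hypothetical helper: largest TCP MSS that fits in an IPv4 packet of the
# given MTU, assuming a 20-byte IPv4 header and a 20-byte TCP header.
mss_for_mtu() {
    echo $(( $1 - 20 - 20 ))
}

mss_for_mtu 1420   # prints 1380
mss_for_mtu 1440   # prints 1400
```

Under these assumptions an MTU of 1420 corresponds to the MSS of 1380 used in the iptables rule above. IPv6 or TCP options such as timestamps reduce the usable payload further.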
using auto=ondemand slows down TCP establishments when using XFRM
(also known as rhbz#1010347 )
This should be fixed on recent kernels (3.x) and backported to some older kernels (notably rhel 6.6)
The issue: The ESP packets are arriving sometimes very late or they do not arrive at all. The issues are most noticeable after restarting the IPsec daemon.
The problem as explained by Herbert Xu:
Your first TCP SYN packet triggers the IPsec lookup, however, the packet itself is dropped. TCP then retransmits but it only gets through after the IPsec SAs are fully instated, resulting in the delay.
What happens in some kernels is that the IPsec trigger occurs in a sleepable context, which means that the sending process will wait for the IPsec SAs to be installed before sending the first SYN. However, this was never meant to be a complete solution to supporting auto=route as it relies on the fact that there must be some sleepable context prior to the SYN packet being sent.
Evidently this is no longer the case for some kernels. Going forward I suggest two courses of action:
1) Do not rely on auto=route. Instead use auto=start and ensure that you synchronously wait for the SAs to complete. For example, ipsec auto --up foo will bring foo up synchronously, while ipsec auto --asynchronous --up foo will not wait and thus may fail.
2) I will take this issue to the IPsec maintainer and the network maintainer to see if we can make adjustments to allow at least the TCP connection case to work with auto=route. However, there is no guarantee that this will be done as we may not be able to insert the requisite sleepable context into the general network stack just so that IPsec auto=route can work.
Longer term for auto=route to be properly supported someone needs to implement packet queueing on larval SAs.
Possible work around:
echo 0 > /proc/sys/net/core/xfrm_larval_drop
echo 3 > /proc/sys/net/ipv4/tcp_syn_retries
This means that the first retransmit of the SYN packet (+1s) should make it through, rather than the current behaviour where only the fourth retransmit (+15s) makes it through.
Note that this workaround causes a regression on the connect() call to immediately return on a non-blocking socket with an appropriate POSIX compliant errno, which is why the workaround also sets the TCP SYN retry count to 3.
PSK doesn't work against cisco ASA 55xx
While libreswan places very few restrictions on the pre-shared secret, Cisco has an additional restriction: the PSK cannot contain a question mark ('?'), because Cisco interprets it as a help request.
When using hundreds of tunnels on a xen based cloud system like AWS, a fraction of tunnels fail regularly
This is a known issue that could be a problem of the aesni kernel module in combination with the xen hypervisor. Try unloading the aesni.ko kernel module on the xen server. If you can confirm this fixes your issue (we cannot change the AWS servers), please email the swan-dev list with a confirmation.
My XAUTH authentication via PAM always claims the password is incorrect on centos6
This is an odd bug (feature?) that shows up when you have disabled selinux in /etc/sysconfig/selinux. Running selinux in permissive (or enforcing) mode seems to resolve this.
Why is it recommended to disable send_redirects in /proc/sys/net ?
Let's say you have a VPN server in a cloud that you use with your phone. Your phone will set up an IPsec VPN and all its traffic is encrypted and sent to the cloud instance, which decrypts it and sends it on to the internet, using SNAT. Replies it receives are encrypted and sent to your phone.
Your phone will send the VPN server an encrypted packet. The server receives it on eth0 (its only interface!) and decrypts it. The decrypted packet is then ready to get routed. The server looks up which interface it should send the packet to. It is destined to go out eth0. Since the packet came in via eth0 and would go out via eth0, the server concludes there clearly must be a better path not involving itself, and it sends an ICMP redirect to the sender. It has no idea the packet arrived encrypted and got decrypted.
This is why we recommend disabling "send_redirects" in /etc/sysctl.conf using
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0
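The "default" entry only applies to interfaces created after it is set, so interfaces that already exist may still have redirects enabled. The sketch below lists them; the function name redirects_enabled and its optional directory argument are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: list every entry under the conf directory whose
# send_redirects value is still non-zero. The optional argument lets the
# function run against a copy of the directory tree.
redirects_enabled() {
    conf=${1:-/proc/sys/net/ipv4/conf}
    for f in "$conf"/*/send_redirects; do
        [ -r "$f" ] || continue
        if [ "$(cat "$f")" != 0 ]; then
            iface=${f%/send_redirects}
            echo "${iface##*/}"
        fi
    done
}

redirects_enabled
```

An empty result means redirects are off everywhere. Any name that is printed can be cleared by writing 0 to its send_redirects file, or by applying the sysctl.conf settings and restarting networking.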
Why is it recommended to disable rp_filter in /proc/sys/net ?
The kernel has a notion of which interface a packet came from and where it will go to, and it determines if the path through the machine makes sense based on the IP address it sees. If 10.0.2.0/24 lives on eth0 and 1.2.3.4 has eth1 with the default route, then rp_filter will automatically block a 10.0.2.1 packet coming in on eth1. The rp_filter code is an implementation of RFC 3704 https://tools.ietf.org/html/rfc3704. Of course, you should also have firewall rules on the machine that block these packets, and firewall rules on the router in front of the machine.
The problem with IPsec appears when you hand out a 10.0.2.13 address, like via XAUTH/IPsec. A packet with IP a.b.c.d comes in on eth1 for 1.2.3.4, which passes rp_filter, then gets decrypted to 10.0.2.13. Now the packet is still seen as coming from eth1, so rp_filter will drop the packet as 10.0.2.0/24 packets are only expected to originate from eth0.
This is why we recommend disabling "rp_filter" in /etc/sysctl.conf using
net.ipv4.conf.default.rp_filter = 0
A network restart or reboot might be necessary for this entry to be picked up. As a one-shot way to disable it for all interfaces, you can use:
for i in /proc/sys/net/ipv4/conf/*; do echo 0 > "$i/rp_filter"; done
NAT + IPsec is not working
When using NAT on the same Linux machine as IPsec, care must be taken that packets meant for an IPsec remote address are not NATed. A NATed packet would no longer match the IPsec tunnel source and destination IP address ranges.
If you have the following common catch-all NAT rule:
-A POSTROUTING -o eth0 -j MASQUERADE
or
-A POSTROUTING -o eth0 -j SNAT --to-source 1.2.3.4
then either change these rules to only apply with a non-ipsec policy:
-A POSTROUTING -o eth0 -m policy --dir out --pol none -j MASQUERADE
or insert an IPsec skip rule before these:
-A POSTROUTING -o eth0 -m policy --dir out --pol ipsec -j RETURN
-A POSTROUTING -o eth0 -j MASQUERADE
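To check an existing ruleset for this problem, the rule listing can be scanned for a catch-all NAT rule that is not preceded by an IPsec exemption. This is a rough sketch under several assumptions: the function name check_nat_rules is hypothetical, it expects rules in the "-A POSTROUTING ..." form shown above on standard input, and it ignores which interface each rule applies to.

```shell
#!/bin/sh
# Hypothetical helper: read POSTROUTING rules on stdin and flag any
# MASQUERADE/SNAT rule that neither carries "--pol none" nor comes after
# an "--pol ipsec ... -j RETURN" skip rule. Exit status 1 means a flagged
# rule was found.
check_nat_rules() {
    awk '
        /--pol ipsec/ && /-j RETURN/ { skip = 1 }
        /-j (MASQUERADE|SNAT)/ && !/--pol none/ && !skip {
            print "NAT rule would also catch IPsec traffic: " $0
            bad = 1
        }
        END { exit bad + 0 }
    '
}

# Demonstration with the catch-all rule from the text above.
check_nat_rules <<'EOF' || echo "ruleset needs an IPsec exemption"
-A POSTROUTING -o eth0 -j MASQUERADE
EOF
```

On a live system the input could come from iptables -t nat -S POSTROUTING, which needs root. The check is a heuristic only and does not replace reading the rules.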
Can I hand out LAN IP addresses in the addresspool?
Yes, but you will need to enable proxyarp on the IPsec server. You can do this globally using the proxyarp entry in /etc/sysctl.conf, for example if your LAN interface is ethX, use
net.ipv4.conf.ethX.proxy_arp=1
No acceptable ECDSA/RSA-PSS ASN.1 signature
This is an interop issue between libreswan and strongswan. When using RFC 7427 style authentication, libreswan only allows RSA-PSS and not RSA-v1.5 based signatures. As per RFC 8247, it is expected that any implementation doing RFC 7427 MUST support RSA-PSS and MAY support RSA-v1.5. Strongswan unfortunately defaults to using RSA-v1.5 when configured with authby=rsasig, even if it received an RSA-PSS signature. To work around this problem on strongswan, the ipsec.conf should be changed to contain:
conn example
    authby=rsasig
    rightauth=ike:rsa/pss-sha512-sha384-sha256
    leftauth=ike:rsa/pss-sha512-sha384-sha256
    [...]
Common error messages
030 ignoring message from whack with bad magic
This means that the ipsec whack command used to talk to the pluto daemon and the running pluto daemon are different versions. The most common cause is that the system has two installs of libreswan: one system install that appears in /usr/libexec/ipsec and one local install in /usr/local/libexec/ipsec. The system started the non-local version, but the user running the ipsec command prefers /usr/local/sbin/ipsec over /usr/sbin/ipsec and thus uses the whack from /usr/local/. You should remove one of the two installs.
Another reason this can happen is that a package upgrade did not properly restart the daemon and the old daemon is running but the user only has the new whack command. Manually killing the pluto daemon and restarting the ipsec service will resolve that. Note that the "shutdown" command is not version-specific, so regular package upgrades (eg via rpm) can install the new whack, then call shutdown with the new whack to the old pluto, without this error appearing.
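A quick way to spot a double install is to look for an ipsec command in both locations named above. The sketch below does that; the function name find_ipsec_installs is hypothetical, and the two directories are the ones from the text, so adjust them if your packages install elsewhere.

```shell
#!/bin/sh
# Hypothetical helper: print every executable file named "ipsec" found
# in the directories given as arguments.
find_ipsec_installs() {
    for dir in "$@"; do
        if [ -f "$dir/ipsec" ] && [ -x "$dir/ipsec" ]; then
            echo "$dir/ipsec"
        fi
    done
}

found=$(find_ipsec_installs /usr/sbin /usr/local/sbin)
count=$(printf '%s\n' "$found" | grep -c . || true)
if [ "$count" -gt 1 ]; then
    echo "more than one ipsec command found:"
    echo "$found"
fi
# Show which one the current shell would actually run.
command -v ipsec || echo "ipsec is not in PATH"
```

If two commands are listed and the one reported by command -v is not the one the service manager starts, that mismatch is the likely source of the bad magic message.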
compile error: ‘SEC_OID_CURVE25519’ undeclared here
The NSS library is too old and does not support CURVE25519. Either upgrade the NSS library or compile with USE_DH31=false to disable CURVE25519 at build time. This currently happens with Debian9 and Debian8
ERROR: asynchronous network error report on eth0 (sport=4500) for message to xx.xx.xxx.xxx port 4500, complainant yy.yy.yyy.yyy: Message too long [errno 90, origin ICMP type 3 code 4 (not authenticated)]
These errors are often intermittent; they depend on the application data that is getting encrypted. Your NAT'ed IPsec tunnel is using ESPinUDP, and the additional UDP header caused some of your packets to be too big. See the previous answer and try lowering your MTU. Use an insanely small MTU like 1300 or 1200 for confirmation. Then try to bring it up higher to what seems to work reliably for you.
ERROR: asynchronous network error report on eth0 (sport=4500) for message to xx.xx.xxx.xxx port 4500, complainant yy.yy.yyy.yyy: No route to host [errno 113, origin ICMP type 3 code 1(not authenticated)]
These errors often happen 15 minutes after the tunnel successfully established. It's most likely that the tunnel was idle and the NAT router removed the nat mapping. Or the NAT router rebooted and lost state. It no longer knows which client to send the packet to. Ensure your connection uses nat-keepalive=yes. Possibly decrease the global keep-alive= value to send more frequent keep-alive packets. Alternatively, enable DPD on the connection to cause some regular traffic on idle tunnels.
ERROR: asynchronous network error report on eth0 (sport=500) for message to xx.xx.xxx.xxx port 500, complainant yy.yy.yyy.yyy: Connection refused [errno 111, origin ICMP type 3 code 3 (not authenticated)]
This error means the other end is not (or no longer) running an IKE daemon. Ensure the IKE daemon is running on the remote system. If you see this error during a negotiation, it could be that the remote IKE daemon crashed or stopped listening. On Mac OSX if the IKE daemon is not allowed to read the proper X.509 certificate, it will only realize this partially into the IKE negotiation and terminate, resulting in this error. It is also possible that the remote IP is actually a NAT device with the IPsec device behind it. In that case, using rekey=no and letting the other end initiate might make this error go away.
error: ignoring informational payload, type NO_PROPOSAL_CHOSEN msgid=00000000
This error means exactly what it says. The IKE proposal(s) sent to the server were rejected. This means there is a configuration mismatch between libreswan and the remote IPsec server. Usually this is a configuration mismatch in the ike= or esp= (phase2alg=) setting. But other options could also be wrong, such as authby=, pfs= or aggrmode=.
Microsoft Windows fails to connect, log shows: retransmit response for message ID: 1 exchange ISAKMP_v2_AUTH
You are on a network that is dropping UDP fragments, and your client has no support for IKEv2 fragmentation. This is common on LTE networks when using Windows clients that are not up to date. Microsoft added IKEv2 Fragmentation support in Windows 10 April 2018 Update (v1803) so updating Windows might resolve this issue.
Microsoft Windows Error 13806: IKE failed to find valid machine certificate
You are using a certificate that is missing some required ExtendedKeyUsage ("EKU") attributes. See Interoperability#Windows_Certificate_requirements
ssh gives error: Corrupted MAC on input. Disconnecting: Packet corrupt
This usually indicates MTU issues. You can try lowering the mtu using the mtu= option or by changing the actual mtu on the proper interface on the libreswan server. This error is known to happen on Amazon EC2 AMI types that use PV (xen) instances. Switching to Amazon HVM instances seems to resolve the problem on AWS.
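One way to find a workable MTU value is to send pings that may not be fragmented and lower the size until they get through. The sketch below only does the arithmetic for the ping payload size; the helper name icmp_payload_for_mtu is hypothetical, and it assumes IPv4 without IP options (20 byte IP header plus 8 byte ICMP header).

```shell
#!/bin/sh
# Largest ICMP echo payload that fits in one unfragmented IPv4 packet:
# MTU minus 20 bytes of IPv4 header (no options) and 8 bytes of ICMP header.
icmp_payload_for_mtu() {
    # $1: candidate MTU in bytes
    echo $(( $1 - 28 ))
}

# Example with Linux iputils ping (the host name is a placeholder):
# ping -M do -c 3 -s "$(icmp_payload_for_mtu 1400)" remote.example.com
```

If the ping for a given MTU fails while a smaller one succeeds, the smaller value is a candidate for the mtu= option.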
Using aes_gcm or aes_ctr results in ERROR: netlink response for Add SA esp.XXXXXXXX@IPADDRESS included errno 22: Invalid argument
This usually indicates that the ESP algorithm selected using the phase2alg= (esp=) line is not available in the kernel. Such errors usually point to kernel bugs.
Linux kernels up to 3.2.x have a bug in the aesni-intel driver on x86_64. See rhbz#1176211. The AESNI hardware acceleration kernel module does not properly support 256 or 192 bit keys for AES_GCM. You can either switch to 128 bit keys or blacklist or unload the aesni-intel kernel module. Another alternative is to switch from phase2alg=aes_gcm to phase2alg=aes, although that will cut the performance in half.
Linux kernels to date seem to have a bug in the aes_ctr code on the POWER8BE VM. Use phase2alg=aes there as well to use AES_CBC.
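To see whether the kernel has registered a given algorithm at all, you can look at /proc/crypto. The following sketch uses a hypothetical helper, kernel_has_alg, that searches a /proc/crypto-style listing for an exact algorithm name. It only shows what is currently registered: an algorithm provided by a module that is not loaded yet will not appear, and a listed algorithm can still be affected by the driver bugs described above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if an algorithm name is listed in a
# /proc/crypto-style file.
kernel_has_alg() {
    # $1: algorithm name as the kernel spells it, e.g. "gcm(aes)"
    # $2: file to read (defaults to /proc/crypto)
    awk -v alg="$1" '$1 == "name" && $3 == alg { found = 1 } END { exit !found }' "${2:-/proc/crypto}"
}

# Example:
# kernel_has_alg 'rfc4106(gcm(aes))' && echo "AES_GCM for ESP is registered"
```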
Can't find the private key from the NSS CERT (err -8177)
The old libreswan-3.8 /etc/ipsec.d/nsspassword requires just the password to be entered. In later libreswan versions, you must add the NSS prefix to it. So to specify the password "secret", use:
NSS Certificate DB:secret
ESP DH algorithm MODP3072 is invalid as PFS policy is disabled
libreswan before version 3.25 allowed invalid configurations with pfs=no while specifying a PFS group for esp (e.g. esp="aes-sha2;modp3072") and it would ignore the PFS group. This is no longer allowed. Either use pfs=yes (the recommended and default) or remove the modp item from any ah= / esp= / phase2alg= option.
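If you decide to keep pfs=no, the modp item has to be removed from the proposal string. As a minimal sketch, the hypothetical helper below strips any ";modpNNNN" suffix from a proposal value so you can see what the cleaned-up setting would look like. It only handles the modp spelling of the group and is not a libreswan tool.

```shell
#!/bin/sh
# Hypothetical helper: drop every ";modpNNNN" PFS group from an
# esp=/ah= proposal string. Other group spellings are left untouched.
strip_pfs_group() {
    # $1: proposal string, e.g. 'aes-sha2;modp3072'
    printf '%s\n' "$1" | sed 's/;modp[0-9][0-9]*//g'
}

# Example:
# strip_pfs_group 'aes-sha2;modp3072'
```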
"IPsec encryption transform did not specify required KEY_LENGTH"
This happens when trying to interoperate with old openswan versions that mistakenly do not send the KEY_LENGTH attribute for AES. To work around the problem with those old implementations, specify "aes128" or "aes256" instead of "aes". For example:
phase2alg=aes256-sha1;modp1536
esp=aes256-sha1;modp1536
ike=aes256-sha1;modp1536
No PARENT proposal selected
This error can happen when there is a mismatch of IKE proposals between the server and client. In libreswan-3.14, modp1024 (group 2) was removed from the default proposal set because of its weakness, but apparently Windows 7 requires it by default.
Using VTI causes "Keys are not allowed with ipip and sit tunnels"
You need to upgrade the iproute package. For RHEL7, see RHBA-2015-2117
Old problems fixed in newer releases
invalid last pad octet:
There is a bug in racoon (also called ipsec-tools) that sends improper oversized padding. Libreswan version 3.14 became more strict and rejected these packets. Libreswan 3.16 allows the bad padding again. Note that racoon is used in various products including older versions of OSX and iOS (up to iOS 7.x).
Module unloading error on shutdown or restart: Module esp4 is in use
ERROR: Module xfrm4_mode_tunnel is in use
ERROR: Module esp4 is in use
FAILURE to unload NETKEY esp4/esp6 module
This has been fixed in libreswan-3.9. Please upgrade.
IPv6 tunnel works manually but fails on freshly booted machine
When one machine reboots and loses state, the other machine still has an encryption policy for the rebooted machine and will insist on receiving only encrypted packets. Obviously, after a reboot the host cannot send encrypted packets. For that reason, an "IKE hole" is present in the host's kernel. This means that any UDP 500 and UDP 4500 packets for IKE are allowed in plaintext even if we have an encryption policy active for that host. On at least the Linux kernel, that hole does not include ipv6-icmp Neighbour Discovery packets, which are sent as a unicast reply from the host that did not reboot to the just-rebooted host. You can see this in "ipsec status" as:
000 Shunt list:
000
000 2620:52:0:ab0:42f2:e9ff:fe09:a16c/128:136 -58-> 2620:52:0:ab0:ca1f:66ff:fef1:c74c/128:0 => %hold 0 %acquire-netlink
Note protocol 58 (ipv6-icmp)
A workaround is to add the following connection:
conn v6neighbor-hole
    left=::1
    leftsubnet=::0/0
    leftprotoport=58/0
    rightprotoport=58/34816
    rightsubnet=::0/0
    right=::0
    connaddrfamily=ipv6
    authby=never
    type=passthrough
    auto=route
    priority=1
If you wonder where the number "34816" comes from please see the leftprotoport= entry of the ipsec.conf man page.
libreswan-3.13 to 3.22 installs this connection by default in /etc/ipsec.d/. libreswan as of 3.23 loads this as a built-in connection automatically.
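To check quickly whether a host is affected, you can filter the "ipsec status" output for protocol 58 shunts that sit in %hold. This is a rough sketch: the helper name icmpv6_hold_shunts is hypothetical, and the match assumes the shunt line format shown in the status example above.

```shell
#!/bin/sh
# Hypothetical helper: read "ipsec status" output on stdin and print any
# shunt entries for protocol 58 (ipv6-icmp) that are in %hold.
icmpv6_hold_shunts() {
    grep -e ' -58-> ' | grep -e '%hold'
}

# Example:
# ipsec status | icmpv6_hold_shunts
```

No output means no such shunt is currently installed.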
Using IPsec/L2TP with xl2tpd, the pppd ip-down script does not seem to run
Old pppd < 2.4.5 could cause xl2tpd to hang on a hanging pppd, so xl2tpd killed pppd itself to avoid this. But that meant pppd did not get to execute its ip-down script. This behaviour can be tweaked using the define TRUST_PPPD_TO_DIE in the xl2tpd Makefile. Fedora and EPEL packages enable this as of April 2015.
Interop issue with racoon: invalid padding-length octet: 0x23
Racoon has a broken implementation of IKE padding. Libreswan versions 3.12 to 3.14 had strict padding checks that caused these packets to be rejected. These restrictions have been loosened in libreswan 3.15 and higher to accommodate the broken racoon.
on xen pluto crashes with: Illegal instruction when using ike=aes_gcm
This is due to the interaction of NSS and Xen (which is possibly lying about the real AES hardware capability of the system). A workaround for this is to disable AES_GCM encryption in NSS using:
export NSS_DISABLE_HW_GCM=1
This should probably be placed somewhere more global than just libreswan, as it will affect everything that is using the nss libraries.
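Wherever you end up setting the variable, it is worth confirming that the running daemon actually received it. On Linux the initial environment of a process is readable as NUL-separated entries in /proc/PID/environ. The sketch below uses a hypothetical helper, environ_has_var, that checks such a file for a variable name. Reading another user's environ file normally requires root.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a NUL-separated environ-style file
# contains an entry for the given variable name.
environ_has_var() {
    # $1: environ-style file, e.g. /proc/PID/environ
    # $2: variable name to look for
    tr '\0' '\n' < "$1" | grep -q "^$2="
}

# Example (assumes a single running pluto process):
# environ_has_var "/proc/$(pidof pluto)/environ" NSS_DISABLE_HW_GCM && echo "set"
```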
IPv6/KLIPS: ipsec_set_dst can't determine the correct routing device on a host connection
This is a kernel bug, see lsw#237. Confirmed affected are kernels 4.1.6 and 3.14.51, but possibly all 3.x and 4.[12].x kernels to date (Sep 28, 2015).
L2TP / Transport Mode connections fail after system update
There is a kernel bug in Linux kernel 4.14 with the XFRM/NETKEY code. Downgrade to 4.13 or upgrade to 4.15rc1 or later. Be aware that some kernels could contain backports of the faulty 4.14 kernel.
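A quick check of the running kernel release can tell you whether you are on a 4.14 kernel. The helper below, is_kernel_4_14, is a hypothetical sketch that only looks at the version string. It cannot detect kernels with other version numbers that carry a backport of the faulty code.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a kernel release string is a 4.14 release.
is_kernel_4_14() {
    # $1: kernel release string, e.g. the output of "uname -r"
    case "$1" in
        4.14|4.14.*|4.14-*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
# is_kernel_4_14 "$(uname -r)" && echo "running a 4.14 kernel"
```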
High Speed IPsec performance issues
Kernels used a really small crypto queue by default (100), which was also hard coded. Recent kernels (4.x and RHEL7 kernels) can now be configured to increase this queue length:
echo 'options cryptd cryptd_max_cpu_qlen=1000' > /etc/modprobe.d/cryptd.conf
reboot
To check your current queue length:
# modprobe cryptd
# dmesg | grep cryptd
[ 4865.043558] cryptd: max_cpu_qlen set to 1000
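If you want to use the value in a script, the number can be pulled out of the kernel log line. This is a minimal sketch with a hypothetical helper name, cryptd_qlen_from_log. It assumes the log message has the same wording as in the dmesg output above and prints the most recent value it finds.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log lines on stdin and print the last
# reported cryptd max_cpu_qlen value, if any.
cryptd_qlen_from_log() {
    sed -n 's/.*cryptd: max_cpu_qlen set to \([0-9][0-9]*\).*/\1/p' | tail -n 1
}

# Example:
# dmesg | cryptd_qlen_from_log
```

Empty output means the message was not found, for example because the cryptd module is not loaded or the log buffer has wrapped.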