XFRM pCPU

Goal: scalable IPsec throughput with CPU encryption (no HW offload)

The idea, called per-CPU SA for the outgoing direction, was discussed at the Linux IPsec workshop 2019 in Prague. During the following days a small group of people worked on a prototype of the userspace (Libreswan) and kernel (xfrm) changes. Libreswan uses the terminology "clones", while the kernel so far calls it pCPU. These names may change.

How to test

Libreswan source with pCPU support, branch clones-3:

git clone --single-branch --branch clones-3 https://github.com/antonyantony/libreswan

Sample config, ipsec.conf:

conn westnet-eastnet
        rightid=@east
        leftid=@west
        left=192.1.2.45
        right=192.1.2.23
        rightsubnet=192.0.2.0/24
        leftsubnet=192.0.1.0/24
        authby=secret
        clones=2
        auto=add

ipsec auto --up westnet-eastnet
taskset 0x1 ping -n -c 2 -I 192.0.1.254 192.0.2.254
taskset 0x2 ping -n -c 2 -I 192.0.1.254 192.0.2.254

ipsec trafficstatus

ipsec whack --trafficstatus
006 #2: "westnet-eastnet-0", type=ESP, add_time=1234567890, inBytes=0, outBytes=0, id='@east'
006 #4: "westnet-eastnet-1", type=ESP, add_time=1234567890, inBytes=168, outBytes=168, id='@east'
006 #3: "westnet-eastnet-2", type=ESP, add_time=1234567890, inBytes=168, outBytes=168, id='@east'

NOTE: both SA #3 and #4 have outgoing traffic.

Kernel source, branch pcpu-2:

git clone -b pcpu-2 https://github.com/antonyantony/linux

Kernel / xfrm plans

  • Release private branch on Steffen's repository to get wider testing.
  • Kernel support for rekey, possibly with reference counting of the linked list on the head SA. One could rekey in any order, either the head SA or a sub SA.
  • Ben would like to add a feature to bind a sub SA to a head SA.


Libreswan Plans

  • Currently clones=n is supported. Both sides should have the same number.
  • Support asymmetric configurations, e.g. 8 clones on the initiator and 4 on the responder.
  • Add rekey support.
  • Fix bugs in down and delete.
  • Don't allow a clone instance on its own to be add|delete|down via the unaliased name.
  • Test interop with a version without pCPU support. Ideally we should detect this and not install clones. It could be that we will install clones and only the last one would be used.

nCPU < nSAs

Let's say there are 4 CPUs and the number of clones configured is 8. The head SA's list only has 4 places for sub SAs; the other 4 are used only as receive SAs. As I understand from the IETF (Tero), the initiator could do this: the initiator is committing to receive. The IKE daemon will install 8 receive SAs.

You need extra flags on XFRM_MSG_UPDSA and XFRM_MSG_GETSA when dealing with outgoing SAs.

XFRM_MSG_GETSA | XFRM_MSG_UPDSA

Both the head SA and the sub SAs need extra attributes (a C sketch follows this list).

  • Head SA: set XFRMA_SA_EXTRA_FLAGS to XFRM_SA_PCPU_HEAD.
  • Sub SA: set XFRMA_SA_EXTRA_FLAGS to XFRM_SA_PCPU_SUB and XFRMA_SA_PCPU to <sub-sa-id>. Sub SA IDs are u32 values counted from 0.
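
A minimal C sketch of how an IKE daemon could append these attributes to an XFRM_MSG_UPDSA request. XFRMA_SA_EXTRA_FLAGS is a u32 attribute in mainline linux/xfrm.h; XFRM_SA_PCPU_HEAD, XFRM_SA_PCPU_SUB and XFRMA_SA_PCPU are assumed to be defined by the pcpu-2 prototype headers. Building and sending the rest of the message is omitted.

#include <stdint.h>
#include <string.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <linux/xfrm.h>

/* Append a u32 attribute to a netlink request; maxlen is the size of the
 * buffer holding the whole message. Returns 0 on success, -1 if it does
 * not fit. */
static int nl_add_u32(struct nlmsghdr *n, size_t maxlen, int type, uint32_t value)
{
	int len = RTA_LENGTH(sizeof(value));
	struct rtattr *rta;

	if (NLMSG_ALIGN(n->nlmsg_len) + RTA_ALIGN(len) > maxlen)
		return -1;

	rta = (struct rtattr *)((char *)n + NLMSG_ALIGN(n->nlmsg_len));
	rta->rta_type = type;
	rta->rta_len = len;
	memcpy(RTA_DATA(rta), &value, sizeof(value));
	n->nlmsg_len = NLMSG_ALIGN(n->nlmsg_len) + RTA_ALIGN(len);
	return 0;
}

/* Head SA: mark it as the per-CPU head SA. */
static int mark_head_sa(struct nlmsghdr *n, size_t maxlen)
{
	return nl_add_u32(n, maxlen, XFRMA_SA_EXTRA_FLAGS, XFRM_SA_PCPU_HEAD);
}

/* Sub SA: mark it as a sub SA and attach its id, counted from 0. */
static int mark_sub_sa(struct nlmsghdr *n, size_t maxlen, uint32_t sub_sa_id)
{
	if (nl_add_u32(n, maxlen, XFRMA_SA_EXTRA_FLAGS, XFRM_SA_PCPU_SUB))
		return -1;
	return nl_add_u32(n, maxlen, XFRMA_SA_PCPU, sub_sa_id);
}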

The XFRM_MSG_GETSA call only changes for sub SAs (see the sketch after the list below).

  • Sub SA: set XFRMA_SA_EXTRA_FLAGS to XFRM_SA_PCPU_SUB and XFRMA_SA_PCPU to <sub-sa-id>.
  • Also set XFRMA_SRCADDR to the source address.
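
Continuing the sketch above and reusing its nl_add_u32 helper, the lookup of an outgoing sub SA could attach the same flags plus the source address. XFRMA_SRCADDR and xfrm_address_t are in mainline linux/xfrm.h; the pCPU names are again assumed to come from the prototype branch.

/* XFRM_MSG_GETSA for a sub SA: identify the sub SA and its source address. */
static int getsa_sub_attrs(struct nlmsghdr *n, size_t maxlen,
			   uint32_t sub_sa_id, const xfrm_address_t *src)
{
	int len = RTA_LENGTH(sizeof(*src));
	struct rtattr *rta;

	if (nl_add_u32(n, maxlen, XFRMA_SA_EXTRA_FLAGS, XFRM_SA_PCPU_SUB) ||
	    nl_add_u32(n, maxlen, XFRMA_SA_PCPU, sub_sa_id))
		return -1;

	/* XFRMA_SRCADDR carries the SA's source address as an xfrm_address_t. */
	if (NLMSG_ALIGN(n->nlmsg_len) + RTA_ALIGN(len) > maxlen)
		return -1;
	rta = (struct rtattr *)((char *)n + NLMSG_ALIGN(n->nlmsg_len));
	rta->rta_type = XFRMA_SRCADDR;
	rta->rta_len = len;
	memcpy(RTA_DATA(rta), src, sizeof(*src));
	n->nlmsg_len = NLMSG_ALIGN(n->nlmsg_len) + RTA_ALIGN(len);
	return 0;
}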


Receiver side RSS support

Once we had running code we soon realized that Receive Side Scaling (RSS, https://www.kernel.org/doc/Documentation/networking/scaling.txt) is necessary. The receiver NIC should be able to steer different flows into separate queues, otherwise the receiver gets overwhelmed. Some cards we initially tested did not support RSS for ESP flows, only for TCP and UDP. While figuring out RSS we tried ESP-in-UDP encapsulation; along with the ESP-in-UDP GRO patches we could see the flows getting distributed on the receiver.

Ideally you should be able to run the following:

 ethtool -N <nic> rx-flow-hash esp4 

Most NICs we tested seem to only support tcp and udp, not esp4 or esp6.

Another argument is that, if the NIC is protocol agnostic, 16 bits of the ESP packet's SPI are aligned with the UDP port number and should provide enough entropy.

 ethtool -N eno2 rx-flow-hash udp4 sdfn