MAST
(Note that the mastX interface is no longer being developed. Instead, NETKEY/XFRM with VTI will be worked on.)
The MAST design
ipsecX interfaces are augmented with a single mastX interface.
Of the three mastX modes that were discussed years ago, only one has been implemented. That is the mode in which the nfmark is set to 0x80000000 | ((SAref & 0x7fff) << 16). The SAref is then extracted and used to look up an SA in the SA table that RGB wrote years ago.
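The mark layout can be sketched in C as follows. This is only an illustration, not the KLIPS code: the function and macro names are made up for this example, and it assumes the intended grouping is (SAref & 0x7fff) << 16, which agrees with the 0x80120000 mark used in the iptables section.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only. */
#define MAST_NFMARK_IS_SAREF 0x80000000u /* top bit: the mark carries an SAref */
#define MAST_SAREF_MASK      0x7fffu     /* 15 bits of SAref */
#define MAST_SAREF_SHIFT     16

/* Build the nfmark that selects the SA with the given SAref. */
uint32_t mast_saref_to_nfmark(uint32_t saref)
{
    return MAST_NFMARK_IS_SAREF | ((saref & MAST_SAREF_MASK) << MAST_SAREF_SHIFT);
}

/* Non-zero if the mark has the "this is an SAref" bit set. */
int mast_nfmark_has_saref(uint32_t nfmark)
{
    return (nfmark & MAST_NFMARK_IS_SAREF) != 0;
}

/* Recover the SAref from a mark built by mast_saref_to_nfmark(). */
uint32_t mast_nfmark_to_saref(uint32_t nfmark)
{
    return (nfmark >> MAST_SAREF_SHIFT) & MAST_SAREF_MASK;
}
```

With these definitions, an SAref of 0x12 maps to the mark 0x80120000 and back again.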
SAref extension
The SAref is now a proper extension, so that our SA create message is identical to stock PFKEYv2. This also makes it easier to #include the KAME/NETKEY linux/pfkeyv2.h. It turns out that NETKEY has skb->sp, which is:
    struct sec_path {
        atomic_t refcnt;
        int len;
        struct sec_decap_state x[XFRM_MAX_DEPTH];
    };
This has been extended with:
    typedef unsigned int xfrm_sec_unique_t;
    xfrm_sec_unique_t ref;    /* reference to high-level policy */
After checking the nfmark, we then check the skb->sp->ref value and, if it is non-null, use that instead for the SA lookup.
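A minimal sketch of that precedence, for illustration only. The function name is hypothetical, SAREF_NULL stands in for IPSEC_SAREF_NULL and is assumed here to be 0, and the mark decoding assumes the SAref sits in bits 16 to 30 as in the 0x80120000 example mark.

```c
#include <stdint.h>

#define SAREF_NULL 0u /* assumption: stand-in for IPSEC_SAREF_NULL */

/*
 * Pick the SAref used for the SA lookup.
 * nfmark  : the packet's netfilter mark
 * have_sp : non-zero if the skb has a sec_path attached
 * sp_ref  : the sec_path's ref field (ignored when have_sp is zero)
 */
uint32_t select_saref(uint32_t nfmark, int have_sp, uint32_t sp_ref)
{
    uint32_t saref = SAREF_NULL;

    /* First choice: an SAref encoded in the nfmark. */
    if (nfmark & 0x80000000u)
        saref = (nfmark >> 16) & 0x7fffu;

    /* A non-null sec_path ref overrides whatever the mark said. */
    if (have_sp && sp_ref != SAREF_NULL)
        saref = sp_ref;

    return saref;
}
```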
The other two modes for mastX are (for the record):
b) punt all traffic into a well-defined SA# ("PPP" mode). This would be useful for building a virtual leased line, and would eliminate the GRE layer that is currently required to make BGP-over-IPsec work.
c) a mode where it has a link-layer value, and for each tunnel that is up, it populates the neighbour cache with something that is meaningful to it.
sec_path addition
In ipsec_rcv(), skb->sp is set up to contain a sec_path that has its ref set correctly.
UDP sockets
For UDP sockets, we set up a new IP-level option that permits the skb->sp to be mapped to a new "ancillary" data message. This message is currently pretty primitive, consisting of two "ref"s.
These are generally called "ref" (or refme) and "refhim". The first is refme, and is the ref of the SA on which the packet arrived. The second is refhim, and is the ref of an SA on which a reply can be sent.
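On the receiving side, a userspace program would pick the two refs out of the control data returned by recvmsg(). The sketch below is an assumption-laden illustration: the function and struct names are invented, the payload is assumed to be two unsigned ints in the order described above (refme, then refhim), and the option number is passed in as a parameter because this text does not give it.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical holder for the two refs. */
struct saref_pair {
    unsigned int refme;  /* SA on which the packet arrived */
    unsigned int refhim; /* SA on which a reply can be sent */
};

/*
 * Scan the ancillary data of a received message for the IP-level
 * refinfo option. Returns 1 and fills *out if found, 0 otherwise.
 * refinfo_opt is the option number (not specified here; an assumption).
 */
int extract_refinfo(struct msghdr *msg, int refinfo_opt, struct saref_pair *out)
{
    struct cmsghdr *cmsg;

    for (cmsg = CMSG_FIRSTHDR(msg); cmsg != NULL; cmsg = CMSG_NXTHDR(msg, cmsg)) {
        if (cmsg->cmsg_level == IPPROTO_IP &&
            cmsg->cmsg_type == refinfo_opt &&
            cmsg->cmsg_len >= CMSG_LEN(2 * sizeof(unsigned int))) {
            unsigned int refs[2];

            /* memcpy avoids alignment assumptions about CMSG_DATA(). */
            memcpy(refs, CMSG_DATA(cmsg), sizeof(refs));
            out->refme = refs[0];
            out->refhim = refs[1];
            return 1;
        }
    }
    return 0;
}
```

The caller is expected to have set msg_control and msg_controllen before calling recvmsg(), so that the kernel has somewhere to put the message.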
TCP sockets
For TCP sockets, the TCP layer will be taking care of this stuff. (For SCTP, it will have to be a mix!)
xl2tpd support
In the case of xl2tpd, the "refhim" is used as an additional index into the call/tunnel list, letting us distinguish two hosts that appear to have the same outer IP. Actually, one could probably use *ONLY* refhim.
The "ref" is not used, although it is recorded the first time, since it currently changes during rekeys. (actually, it might already be kept during rekeys)
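The idea of using refhim as an extra lookup key can be sketched like this. It is not xl2tpd's actual data structure: the struct, its fields, and the function name are invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified tunnel record. */
struct tunnel_sketch {
    uint32_t peer_addr;  /* outer IPv4 address, as seen by us */
    uint16_t peer_port;  /* outer UDP port */
    unsigned int refhim; /* SAref on which replies to this peer are sent */
};

/*
 * Find the tunnel for a packet. Two peers behind the same NAT can share
 * peer_addr (and even peer_port), so refhim is part of the key.
 */
struct tunnel_sketch *find_tunnel(struct tunnel_sketch *list, size_t count,
                                  uint32_t addr, uint16_t port,
                                  unsigned int refhim)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (list[i].peer_addr == addr &&
            list[i].peer_port == port &&
            list[i].refhim == refhim)
            return &list[i];
    }
    return NULL;
}
```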
The problem is that we have to keep the other (older) SAs around when we are doing rekeys, since there may still be packets in flight. In fact, the client could even, say, load balance among the currently valid SAs... might occur for some HA system...
IPsec SAs are all properly reference counted.
pluto handling
KLIPS can assign SArefs if the passed SAref is IPSEC_SAREF_NULL, and this is fine for new SAs. However, we need to know the outgoing SA when we create the incoming SA, so that the incoming SA can reference it.
Normally, pluto creates the incoming SA during QUICK_R1 (when it receives I1, and sends R1). It then creates the outgoing SA when it gets I2.
Now, if pluto doesn't know the outgoing SA's "refhim", it creates the outgoing SA immediately. (It doesn't "eroute" it yet)
It then creates the incoming SA.
Then at I2, if it hasn't created the outgoing SA, it creates it. Why might that happen? Well, at a rekey, we already know the refhim that we want to use, so we can just create it normally.
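The ordering above can be modelled as a tiny state sketch. This is not pluto code: the struct, the function names, and the way the kernel-assigned ref is passed in are all invented for illustration, and SAREF_NULL stands in for IPSEC_SAREF_NULL (assumed to be 0).

```c
#include <stdbool.h>

#define SAREF_NULL 0u /* assumption: stand-in for IPSEC_SAREF_NULL */

/* Hypothetical per-exchange bookkeeping. */
struct quick_sketch {
    unsigned int refhim;   /* ref of the outgoing SA, SAREF_NULL if unknown */
    bool outgoing_created;
    bool incoming_created;
};

/* Responder receives I1 and sends R1. */
void sketch_on_r1(struct quick_sketch *st, unsigned int kernel_assigned_ref)
{
    if (st->refhim == SAREF_NULL) {
        /* refhim unknown: create the outgoing SA right away (no eroute
         * yet) so that the kernel hands us a ref for it. */
        st->refhim = kernel_assigned_ref;
        st->outgoing_created = true;
    }
    /* The incoming SA can now be created with a reference to refhim. */
    st->incoming_created = true;
}

/* Responder receives I2. */
void sketch_on_i2(struct quick_sketch *st)
{
    /* Rekey case: refhim was already known at R1, so the outgoing SA
     * was not created early and is created here, as usual. */
    if (!st->outgoing_created)
        st->outgoing_created = true;
}
```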
For transport-mode with the MAST kernel driver, we actually don't eroute ANYTHING.
We can find out the previous refhim by looking at the state referenced by: st->st_connection->newest_ipsec_sa
If the gateway has already EXPIRED the state, then we have a problem. I don't have a solution for this problem yet. I think that we will have to have some kind of mapping: {public-key, SA-tuple} => refhim
iptables
For tunnel mode SAs, we can now use iptables. The _updown.mast script does:
    iptables -I PREROUTING 1 -j IPSEC -t mangle
    iptables -I IPSEC 1 -s 192.0.1.0/24 -d 192.0.2.0/24 -j MARK --set-mark 0x80120000
Except that it turns out iptables doesn't grok 0x, and thinks of the set-mark value as being signed... so _updown.mast uses /bin/printf to do the right thing.
We then do:
    ip rule add from all fwmark 0x80000000/0x80000000 lookup 50
    ip route add 0.0.0.0/0 dev $PLUTO_INTERFACE table 50
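The value/mask form of the fwmark selector means that only the masked bits are compared, so every mark with the top bit set goes to table 50, whatever SAref it carries. A small sketch of that comparison (the function name is made up for this example):

```c
#include <stdint.h>

/* Non-zero if a packet mark matches an "fwmark value/mask" rule:
 * only the bits selected by mask are compared. */
int fwmark_rule_matches(uint32_t mark, uint32_t value, uint32_t mask)
{
    return ((mark ^ value) & mask) == 0;
}
```

For example, the mark 0x80120000 set by the MARK rule above matches 0x80000000/0x80000000, while an unmarked packet (mark 0) does not.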
updown wrapper
The updown script calls _updown.mast when the mast stack is used.
transport mode UDP
For transport mode UDP sends, when the IPSEC_REFINFO option is attached using sendmsg(), we basically fill in the outgoing flow information. We figure out which mastX device to force things to by looking up the given ref value using ipsec_sa_getbyid(), and finding the attached mastX device.
(Oh, yeah, each SA can now indicate which device it should go in or out of, but only mast0 is supported by pluto)
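From userspace, attaching the option to a send could look roughly like the sketch below. Treat it as an assumption-heavy illustration: the function name is invented, the payload is assumed to be a single unsigned int holding the SAref to send on, and the option number is a parameter because this text does not state it.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Fill in msg's control data with one IP-level refinfo option carrying
 * the given SAref, ready for sendmsg().
 * ctrl / ctrl_size : caller-supplied, suitably aligned control buffer
 * refinfo_opt      : option number (not specified here; an assumption)
 * Returns 0 on success, -1 if the buffer is too small.
 */
int attach_refinfo(struct msghdr *msg, void *ctrl, size_t ctrl_size,
                   int refinfo_opt, unsigned int ref)
{
    struct cmsghdr *cmsg;

    if (ctrl_size < CMSG_SPACE(sizeof(ref)))
        return -1;

    memset(ctrl, 0, ctrl_size);
    msg->msg_control = ctrl;
    msg->msg_controllen = CMSG_SPACE(sizeof(ref));

    cmsg = CMSG_FIRSTHDR(msg);
    cmsg->cmsg_level = IPPROTO_IP;
    cmsg->cmsg_type = refinfo_opt;
    cmsg->cmsg_len = CMSG_LEN(sizeof(ref));
    memcpy(CMSG_DATA(cmsg), &ref, sizeof(ref));
    return 0;
}
```

The caller still has to set msg_name and msg_iov as for any other sendmsg() call; only the control data is handled here.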