22 November 2021
16:00 (UTC + 1)
ALEXANDER AZIMOV: Welcome back, now we are starting our last Plenary session for today. I will be chairing, with Wolfgang Tremmel, and there will be three talks, one lightning talk.
Now, a few words about our first talk. We are all familiar with Internet exchange points, and the first talk will be about an exchange, though not an Internet exchange but an IP exchange, and it's also not about purchasing or leasing address space. Please, Andra Lutu from Telefonica will tell you about IPX operations.
ANDRA LUTU: Hi everyone. Can you hear me and see me? Can everyone see the slides? Excellent. Thank you so much.
Thank you so much for the opportunity. It's great to be here, thanks for joining us today everyone to hear about some insights that we wanted to share from operating an IP exchange provider.
I am part of Telefonica Research, and the work I'm talking about today is joint work with my colleagues from Telefonica Research, Marcella from Madrid and Fabian from Northwestern University.
In this talk, you will not hear a lot about IXPs, but you'll hear a lot about mobile roaming and how mobile operators essentially enable this function through the deployment of the so‑called IP exchange model. So, roaming, as I'm sure you all know very well, is one of the basic functions of cellular service, allowing a user to connect to any other network in the world with the same device and the same mobile subscription from their home network. This function is actually highly important to enable continuous service, both nationally and internationally, for users travelling abroad.
With roaming, people and their devices can actually remain connected anywhere they go. And roaming has been receiving some attention in the past five years, you know, with people travelling much more globally, at least prior to the pandemic, and regulations actually trying to lower the cost of roaming. For example, we saw in the past years, as you know, the European "Roam like at Home" regulation that made it possible for people to access cellular service at the same rate in any country in the European Union as in their home country. And we actually learned that it's not only people and their devices that roam; it's also the Internet of Things that actually draws much benefit from exploiting roaming and the infrastructure that is mostly meant for people. So, IoT verticals actually require cellular connectivity and they leverage roaming to ease their operations, essentially.
So what we see is that they actually purchase their connectivity services from one cellular connectivity provider and deploy their devices basically anywhere in the world through the use of mobile roaming. This means that essentially a smart vehicle in Spain might actually be connected via a SIM card provided by a cellular operator from, let's say, Germany. This is true not only for things that need to move but also for things that don't really need to move, like, for example, smart meters, which still require a cellular connection in order to connect to their application server.
Cellular IoT is now the fastest growing mobile device category, growing even faster than smartphones, so there are going to be more things connected than people connected. And lately we have heard a lot of talk of the so‑called global SIM, which essentially enables IoT devices to connect anywhere in the world, all while using one single provider that actually knows how to leverage the roaming function of the cellular ecosystem.
So, this all brings me to say that, despite all this growing importance of the roaming function, little effort has actually gone, at least to our knowledge, into research around this ecosystem. So this brings me to what we will actually discuss today: what makes roaming possible. The goal today is to shed some light on this IP exchange model and on the ecosystem that different stakeholders build around it.
We take the perspective of one particular IPX provider that operates a worldwide roaming platform, and we will look into signalling traffic patterns as well as the performance of the platform that actually enables the tunnelling of the data across the world. And finally, we will dissect a small sample of data roaming traffic to give insights into the service performance of this platform.
So the IP exchange is essentially a private world, it's a walled garden, it's a private platform that global carriers, which we will also call IPX providers today, operate in order to allow their customers, such as mobile network operators or content providers, to operate worldwide. Currently, we are aware of about 30 different IPX providers overall. We know that they all peer together in a full mesh to build the so‑called IPX network, and this network of approximately 30 entities enables any of their customers to gain global coverage for their end users. Essentially, they are connecting, you know, more than 800 mobile network operators worldwide.
It's worth mentioning that this IPX network by‑passes the public Internet, so it's hard to measure and observe with end‑to‑end measurements. Here, what we are trying to do is present a first detailed analysis of what operations look like in one such IPX provider.
So, to understand more about what this means, what services an IPX provider actually offers and what their performance is, we actually take a look at one specific IPX provider. And of course the insights I will bring today come from the point of view of a global telco that actually operates a tier 1 IP network with a global footprint. It has national networks, lots of coverage, submarine capacity and submarine fibre optic cables, and this network actually supports what we were saying before, this IP exchange platform and all the services that it offers. For the IP exchange platform, this particular carrier has strategic deployments mostly in two major regions, namely Europe and the Americas. And I'm just going to show you today, here, a high‑level picture of how two roaming partners interconnect to allow their devices to connect. So the services we analyse that the IPX provider actually gives are the Diameter signalling, which allows 4G devices to connect anywhere in the world; the SCCP signalling, which essentially supports 2G and 3G devices; data roaming, which is the act of actually connecting while roaming; and the machine‑to‑machine platform, which essentially supports IoT devices.
So I show here this very simplified scheme of the two interconnected parties, the visited network and the home network, interconnecting via the IPX provider, and all the functions that the IPX provider basically enables for the connectivity of the end user devices. So mobile devices such as smartphones, or IoT devices, which can vary from connected cars to smart meters and so on, connect to the radio networks of different visited networks across the world, and in order to enable this, the IPX provider supports the SCCP signalling and the Diameter signalling functions. So this essentially allows different core network functions within the 2G and 3G core, respectively the 4G core, of the two roaming partners to exchange messages, to talk to each other, such that they are allowed to authenticate the roaming user and enable data roaming and other services in the visited network. So the signalling service is basically one of the fundamental services that the IPX provider offers, essentially fulfilling this function of allowing an end device to connect to a visited network.
Now, once the user is authenticated and can connect to the local visited network, the IPX provider actually enables the data roaming function. So this function allows the end user device to actually establish a data connection, usually using the GPRS Tunnelling Protocol, GTP, which the IPX provider supports by actually creating this tunnel for the data communications.
Now, the data roaming service is actually active both for people with their smartphone devices and, as I was saying before, for IoT devices. Because we have this mixture within the same ecosystem, a separate service was born, which is the so‑called M2M platform, which essentially means a dedicated bunch of resources that the IPX provider basically has for the machine‑to‑machine communications of IoT devices.
So in order to give our insights, what we do is basically sample the ‑‑ all these interfaces of the IPX provider and we build a vast, massive dataset with all these four types of services that I just mentioned before.
So, now that we have our dataset that captures these four major services, our goal is to further analyse it and understand some of the traffic patterns as well as the performance of the whole platform. So, first, we start by looking at the SCCP and Diameter signalling traffic patterns.
So the data I am showing here was initially captured in December 2019. In both of the plots I am showing here, we basically captured the time series over two weeks, both for Diameter and SCCP dialogues, and for the SCCP dialogues we look at the Mobile Application Part (MAP). In the upper right‑hand side plot, we see that MAP signalling accounts for a much higher volume of traffic than Diameter signalling in terms of just the total number of messages that these devices trigger within our sample. And again, this is expected, because what we know is that the number of devices that are actually roaming worldwide and depend on 2G and 3G technology is much higher than the number that depend on 4G and other technology.
On average per device, we also observe a larger volume of MAP messages than Diameter messages. This is noticeable by visually comparing, in the lower right‑hand side plot, the red shaded area, corresponding to MAP messages, to the green shaded area, which corresponds to the Diameter messages. So, we're representing, again, the average number of dialogue messages per device per hour over the same two‑week period that I previously mentioned. We guess that this difference can be due either to the application that these mobile devices actually serve, so, for example, think about comparing a connected car traversing a low coverage area with a smart meter, or to the implementation of the protocols, which can be slightly different between Diameter and MAP.
Now, the analysis of this signalling dataset also allows us to observe where these different devices actually operate in terms of geolocation. So this enables us to build international mobility matrices showing the movement of devices from home countries to different visited countries. So, we show here two such mobility matrices, one corresponding to December 2019 and another one corresponding to July 2020. In both matrices, on the X axis we have the home country and on the Y axis the visited country, and the values in each column add up to 100%. We observe here that there are two major visited international hubs, the lines that are highlighted, one in the UK and the other one in the US, and this only makes sense because, again, this platform operates on top of an international transit network that the telco carrier actually operates, and essentially this network is deployed around these two geographies, Europe and the Americas.
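The column normalisation behind such a mobility matrix can be sketched as follows. This is only a minimal illustration, not the pipeline used in the study: the function name `mobility_matrix`, the record format (one home/visited pair per device) and the sample country codes are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def mobility_matrix(records):
    """Build a home -> visited matrix whose columns (one per home country) sum to 100%.

    `records` is an iterable of (home_country, visited_country) pairs,
    one pair per roaming device (hypothetical input format).
    """
    counts = defaultdict(Counter)
    for home, visited in records:
        counts[home][visited] += 1

    matrix = {}
    for home, visited_counts in counts.items():
        total = sum(visited_counts.values())
        # Share of this home country's devices seen in each visited country.
        matrix[home] = {v: 100.0 * n / total for v, n in visited_counts.items()}
    return matrix


# Made-up sample: four devices with Spanish SIMs, two with Venezuelan SIMs.
sample = [("ES", "UK"), ("ES", "UK"), ("ES", "US"), ("ES", "DE"),
          ("VE", "CO"), ("VE", "CO")]
matrix = mobility_matrix(sample)
print(matrix["ES"])
```

Each inner dictionary corresponds to one column of the matrix (one home country), so a visited country that attracts devices from many home countries shows up as a highlighted row.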
Now, this type of mobility matrix not only allows us to capture these operational artefacts but also the impact of socio‑economic factors on the international mobility of people, so specifically we can capture the impact of the Venezuela‑Colombia migration crisis, which is the point we are showing on the vertical.
So now we can continue the analysis with the traffic patterns and actually look at the GTP control plane. So, for the data roaming activity analysis, again we capture here only IoT devices. Specifically, we look now at the GPRS Tunnelling Protocol that is used by mobile devices to request data communications from the IPXP, and we look at the control plane. That means that we don't look at the data flowing through the tunnels; we look at the requests that the end user devices make in order to establish communication. So, specifically we focus on one IoT customer, which uses SIM cards from one mobile operator in Spain. We only capture a sample of about 2 million devices operating worldwide in July 2020. So, in the upper part, in the line in red, we can see the breakdown of devices triggering data communication per visited country. So, we know that the majority of the devices operate in this case in the UK, Mexico, Peru, Germany or the US, again confirming what we saw before: the operational footprint of this particular IPXP, which depends on the underlying network that the carrier operates, essentially in Europe and the Americas.
We now look at the hourly time series of the number of active devices per visited country. So we see how this evolves in time: we have time on the X axis, and on the Y axis we have the number of devices. So we note a clear daily pattern emerging, but also a weekend/weekday pattern, so the greyed‑out area actually corresponds to the weekend. Also an interesting note is that, for example, devices in Germany, which is the green line in the bottom part, are slightly irregular, which is due to the fact that these are devices that actually have higher mobility than the devices in the other countries.
And finally, what we look at are some IPXP performance metrics from analysing the actual data that is flowing through the GTP tunnels, which are created in order to enable the data roaming service we mentioned.
So, specifically we look here at the GTP user plane traffic, which allows us to measure the round‑trip time in both the uplink and the downlink direction. So the uplink direction allows us to capture basically the impact of the packet gateway in the home network, and also the latency of the Internet path towards the application server, depending on the application of the device.
And the downlink direction allows us to capture the impact of the serving gateway, so basically in the visited network, and also the impact of the radio access network. So for both metrics, what we actually do is break down the devices per visited country, and we show the time series of the average RTT in milliseconds per device per hour over a two‑week period in July 2020. So, again, all devices here are IoT devices, which we essentially use as sensors in this case to capture the performance of the IPX platform. We note again that daily patterns emerge, as well as weekend/weekday patterns, for specific visited locations, the US in this case, so this is the red line here on both of these graphs.
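The aggregation described here, average RTT per device per hour, broken down by visited country, can be sketched like this. It is a minimal sketch under stated assumptions: the sample format (Unix timestamp, visited country, device identifier, RTT in milliseconds) and the function name `hourly_mean_rtt` are hypothetical, and averaging per device first and then across devices is one possible reading of "per device per hour".

```python
from collections import defaultdict


def hourly_mean_rtt(samples):
    """Average RTT (ms) per visited country and hour.

    `samples` is an iterable of (unix_ts, visited_country, device_id, rtt_ms).
    RTTs are first averaged per device within each hour, then across devices,
    so that a chatty device does not dominate the hourly value.
    """
    per_device = defaultdict(list)
    for ts, country, device, rtt_ms in samples:
        hour = ts - ts % 3600  # start of the hour the sample falls in
        per_device[(country, hour, device)].append(rtt_ms)

    per_bucket = defaultdict(list)
    for (country, hour, _device), rtts in per_device.items():
        per_bucket[(country, hour)].append(sum(rtts) / len(rtts))

    return {key: sum(means) / len(means) for key, means in per_bucket.items()}


# Made-up measurements: two devices in the UK during the first hour,
# one of them seen again in the second hour.
rtt_samples = [
    (0, "UK", "meter-a", 100.0),
    (10, "UK", "meter-a", 200.0),
    (20, "UK", "meter-b", 50.0),
    (3600, "UK", "meter-a", 80.0),
]
series = hourly_mean_rtt(rtt_samples)
print(series)
```

Plotting the resulting values per country over two weeks would give time series of the kind shown on the slides; the uplink and downlink directions would simply be two separate inputs.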
And I would like to highlight in particular here the devices in the UK, which present a very interesting synchronised behaviour that we actually looked at in detail, and these are essentially smart meters with a pre‑programmed behaviour. And we actually conjecture that this pre‑programmed, synchronised behaviour explains the periodic peaks that we observe in RTT in both uplink and downlink. So again, the UK here is the blue line, with the peaks that you observe here.
So, with this, I would like to actually conclude our short dive into this hidden world of the IPX ecosystem, and actually for more results and, you know, analysis on this dataset, I do encourage you to take a look at some of the work we published.
So, before I let you go, I actually wanted to explain why we wanted to bring this content here and the open questions that we are basically looking to answer. We'd love to hear from the audience here, and from the community. So we believe our work does open the door, to our community at least, for an entire area of research around this IPX ecosystem and its operations.
So, specifically, one challenge we think needs addressing or discussion is how we could actually learn from the peering fabric of the Internet and the success that this peering fabric has had: how could we learn from it, and how could we transpose all these lessons to the cellular ecosystem? And moreover, as we are now deploying 5G networks worldwide, we are still in the phase of defining what roaming would look like in 5G and in next‑generation networks. And we believe that it is essential that we answer this question with a very clear vision that actually improves upon the prior approaches that we have seen in LTE or in 3G. And as more devices leverage roaming, and we have seen this is the case for IoT, we believe that it is of critical importance to build an ecosystem that ensures security and privacy, and again, this is an opaque network that does come with this promise. However, there are many talks out there and there are many points to show that, you know, there are still many weaknesses that attackers can exploit within this ecosystem.
So we believe that, you know, anomaly detection and network intelligence approaches can actually help better operate this ecosystem and help take this ecosystem in the direction of security and privacy.
So, with this, I am very keen to open the conversation with everybody here. I thank you again for your attention and, especially, I thank you for allowing me to take up so much of your time. I hope that we have been able to stir your interest, and I hope that you will, you know, join us in working towards answering some of these challenges that I highlight here.
So thank you.
BENEDIKT STOCKEBRAND: Thank you, Andra. You can leave your video on for questions, but at the moment I don't see any ‑‑ oh, there is a question. The question is from Daniel Karrenberg from the RIPE NCC:
"Note the synchronised behaviour of clients. Was this synchronised to wall clock, UTC and in which way exactly? For example, second zero of the top of an hour."
ANDRA LUTU: Thank you so much for the question. So they were actually synchronised to request a data connection around midnight every day. So that is basically the peak that we observed there.
BENEDIKT STOCKEBRAND: Okay. Next question from Eliot Lear from Cisco Systems. The question is:
"Smart meters and roaming is interesting. Is this an aggregation provider with some sort of agreed APN with multiple providers?"
ANDRA LUTU: Yeah, that is correct. So we do see that this type of global provider essentially aggregates, you know, services across the world. They purchase this ‑‑ or they offer this global thing which essentially depends on the roaming function, and they use the IPXP in order to leverage all the agreements. So, like that, they don't need to just go in each country and, you know, negotiate an agreement; they could just piggyback on the IPXPs or the existing agreements and essentially deploy their services worldwide. So, yeah. Does that answer the question?
BENEDIKT STOCKEBRAND: Any more questions? Remind you, you can also use your microphone or camera button and ask the question over your microphone. Okay. I do not see any more questions. So, thank you again, Andra, for your presentation, and see you around, see you in SpatialChat later on.
ANDRA LUTU: Thank you so much and I'll be around, I am keen to discuss more with everybody here. So again, thank you so much for your time and for allowing me to tell you about our experience. Bye!
BENEDIKT STOCKEBRAND: Okay. The next speaker is Max Franke. He is from the Technical University of Berlin and he is talking about any eyeballs and using happy eyeballs for load balancing. Max, the floor is yours.
MAX FRANKE: Thank you. I am going to talk about any eyeballs, which is a new way to use existing happy eyeballs implementations as a way to achieve load balancing. So just to give a quick overview.
We are going to use the already existing code that is present on many clients nowadays to establish a way to give fine‑grained load balancing control to server operators, and we especially want to enable Anycast‑based load balancing, to give them basically a second layer of load balancing beyond the usual catchment‑based load balancing. So just to quickly recap what happy eyeballs is:
It is a way to quickly fall back from IPv6 to IPv4 addresses. So you have a client trying to resolve the host name, and the host name resolves to two IP addresses, one IPv6, one IPv4. Now, you would first try the IPv6 address and, if it takes too long or you notice that it fails, you would just immediately fall back to the IPv4 address. This basically allows you to encourage IPv6 adoption without the client getting any visible delay.
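The fallback just described can be sketched in a few lines. This is a simplified illustration rather than a faithful implementation of any particular client: real connection attempts are replaced by a hypothetical stand‑in (`make_connect`), the addresses are documentation addresses, and the 250 ms default for `attempt_delay` is an assumption borrowed from the recommended connection attempt delay in RFC 8305.

```python
import asyncio


async def happy_eyeballs(candidates, connect, attempt_delay=0.25):
    """Return the first address in `candidates` that connects.

    Addresses are tried in order (IPv6 first). The next attempt starts as soon
    as the previous one fails, or after `attempt_delay` seconds without an answer.
    """
    remaining = list(candidates)
    pending = {}  # running attempt (task) -> address
    try:
        while remaining or pending:
            if remaining:
                addr = remaining.pop(0)
                pending[asyncio.create_task(connect(addr))] = addr
            # Only bound the wait while there is still a fallback left to start.
            timeout = attempt_delay if remaining else None
            done, _ = await asyncio.wait(
                set(pending), timeout=timeout, return_when=asyncio.FIRST_COMPLETED
            )
            for task in done:
                addr = pending.pop(task)
                if task.exception() is None:
                    return addr
        raise ConnectionError("all candidates failed")
    finally:
        for task in pending:  # abandon attempts that lost the race
            task.cancel()


def make_connect(behaviour):
    """Hypothetical stand-in for a TCP connect: address -> (delay_s, succeeds)."""
    async def connect(addr):
        delay, succeeds = behaviour[addr]
        await asyncio.sleep(delay)
        if not succeeds:
            raise ConnectionRefusedError(addr)
    return connect


V6, V4 = "2001:db8::1", "192.0.2.1"  # documentation addresses
slow_v6 = make_connect({V6: (1.0, True), V4: (0.01, True)})
chosen = asyncio.run(happy_eyeballs([V6, V4], slow_v6, attempt_delay=0.05))
print(chosen)
```

With a slow IPv6 path the client ends up on the IPv4 address, and with an IPv6 address that refuses the connection it falls back immediately; that second case is the behaviour a server could rely on when it deliberately rejects a request.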
It's, nowadays, present in a lot of operating systems and most of the major browsers, Chrome, Safari, Firefox, etc., it's basically everywhere nowadays and we are going to use that system to give it a different purpose other than IPv6 to IPv4 fallback and adoption.
So again, back to a quick overview. The focus so far with happy eyeballs was just: what is the client doing? The client is the one in control, the client is the one doing all the work, and now we want to look at it from the server side. If I am a server operator and I know that a lot of clients on the Internet nowadays have happy eyeballs, is there maybe something I can gain from that knowledge, and is there some way I can utilise all of that existing code?
So, the goal is to give the power of two choices to the servers, to basically selectively reject individual requests that come to a server, because I, as a server, know the client has happy eyeballs, so if I reject the request, there is a different server that got resolved from the same host name that will then handle the request.
And by doing so, I get implicit load balancing control. On the server, obviously, I have to know that there is a different server available that will be able to handle that request, which brings us to our requirements. For this to work, I have to have a global overview of my network and its current state, so I need to know all the node information on my network, all the servers, and I need to know their current state. I also obviously need to have more than one node, otherwise I wouldn't be able to load‑balance anything. And I need to have at least two pairs of IP addresses, so one pair of IPv4 and IPv6 and a second pair of IPv4 and IPv6. The more address pairs you have, the better the load balancing gets, because you have more choices.
I guess, for the minimum, you need two pairs.
So, let's quickly look at the principle. So what you can see here is that we have two nodes, the most basic case in our network. One is located in Los Angeles; it has an IPv4 address and an IPv6 address. One is located in London; it also has an IPv4 and an IPv6 address. And then we also have our load balance manager, which is located somewhere and which is, as I mentioned before, the component that has the global overview of the network. Now we get a lot of traffic from Europe, because maybe it's the morning there, and all of our traffic would flow to the IPv6 interface of the London node, because we assume that the RTT to London is shorter than to LA. All of our clients are using happy eyeballs, so they give a preference to IPv6 addresses, so all the traffic would end up on the IPv6 address in London and LA wouldn't get any traffic. Obviously that's maybe not exactly desired all the time, because if we want to balance load across nodes, we want to also utilise the LA node and not have it idle all the time.
What happens now is that the ‑‑ both nodes send their status reports to the load balance manager, which will then make a decision on what to do about this situation.
So, it will now instruct the London node to turn off its IPv6 interface, for example, which will lead to the happy eyeballs implementations on all the clients noticing that basically the connection to the IPv6 address on the London node is not possible any more, and they will just fall back to alternatives. Some of them might go to the IPv6 address in LA because, even though the RTT is higher than the one to the London node, happy eyeballs gives a preference to IPv6, so they would move that traffic there, whereas the others would go to the IPv4 interface on the London node, as the RTT for them to LA would just be way too long.
And by doing so, we get some form of load balancing established without having to do anything on the clients.
So, just for a bit more architecture: we have the different nodes, servers with different IPs. We have one load balancing manager. Each node sends a status report to the load balancing manager periodically; here it says one second, but obviously that is something that could be fine‑tuned. You can see the status report packet at the bottom there, and again it's not that big, it wouldn't create a lot of overhead on the network to send it. The LBM will then decide what to do about the current state of the network and which nodes to shut down or which ones to turn on. The algorithm by which this is done is obviously very implementation‑specific and depends on what you actually want to achieve; this is more about showing how the core principle works, so the fine‑grained load balancing algorithm isn't that relevant here. It has to guarantee at all times that at least one node is able to handle traffic. Obviously, where all nodes are at capacity, it can't do anything. But in every other case, the LBM has to ensure that a connection is possible for the clients. So there will not be a client that can't establish a connection because, even for a short period of time, all nodes are rejecting all the traffic.
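One possible decision rule for the load balance manager can be sketched as follows. The talk leaves the actual algorithm open, so everything here is an assumption made for illustration: the function name `decide`, the report format (a load fraction per node interface), the interface labels and the simple threshold policy. The only property taken from the talk is the guarantee that at least one interface keeps accepting traffic.

```python
def decide(reports, threshold=0.60):
    """Map each reporting interface to True (accept) or False (reject).

    `reports` maps an interface label such as "london-v6" to its current load
    as a fraction between 0 and 1 (hypothetical status report content).
    """
    decisions = {iface: load < threshold for iface, load in reports.items()}

    if reports and not any(decisions.values()):
        # Never reject everywhere at once: keep the least-loaded interface
        # open so that every client can still establish a connection.
        least_loaded = min(reports, key=reports.get)
        decisions[least_loaded] = True
    return decisions


# Morning in Europe: the London IPv6 interface is above the threshold.
morning = decide({"london-v6": 0.70, "london-v4": 0.05,
                  "la-v6": 0.10, "la-v4": 0.00})
print(morning)
```

In this sketch the manager would tell London to stop accepting on IPv6, and the clients' happy eyeballs implementations would then move to the LA IPv6 address or the London IPv4 address on their own.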
Some of the advantages, as I said before there is no implementation required on the client side. We just use the existing code that is present everywhere now, which makes implementation and especially deployment quite easy.
There is very low overhead, especially on the nodes, on the servers, because they only have to send small status reports once in a while. The load balance manager might have a bit more overhead, but it's not that important.
And as I said at the beginning, it's especially useful for Anycast because you can get a more fine‑grained load balancing control than the usual catchment‑based one where you basically have to rely on BGP choosing the node, which the traffic goes to, and also, you get the choice between both servers and protocols. So if you want to say for some reason you don't want to get any IPv6 traffic at the moment, you can just turn ‑‑ just do that because you can just turn off your IPv6 interface and happy eyeballs will then just use IPv4, and the other way around as well.
Some of the drawbacks. There are obviously more drawbacks, and this is one of the reasons why I'm presenting this today: this came out of my master's thesis, and I want to get feedback on it from a bigger audience.
Happy eyeballs has to be present on all the clients, which, as I said in the beginning, is not that far off from reality, but obviously there are always going to be some cases where there is no happy eyeballs, so you would have to think about a fallback mechanism for those clients. And it can increase the latency slightly for the clients: as we saw with the London and LA nodes, some of the clients that are now going to LA have a bigger RTT than they would necessarily have to have, but that's probably not going to be too noticeable for them.
So, we also implemented this. We implemented it in Rust; it's research code, it's not optimised or anything like that. There were quite a few abstractions made for simulating the load and stuff like that. But it works. We proved it works: we deployed it on AWS, six nodes around the world in six different locations, and then we just associated the 12 IP addresses with a host name in DNS, and when you went to that website from different places in the world, you would go to different servers.
Also, we want to deploy it on Anycast to actually prove that it works with catchments as well, so on real live Anycast infrastructure, but it's not that easy to get a prefix, or any infrastructure like that, so if you have any ideas about where we could deploy it, I would be thankful if you could give me a shout. So, here is one of the results. As I said, we ran it a bunch of times. For this one we just set the threshold to 60%, so once a node reaches 60%, the load balance manager will instruct that node to no longer accept traffic. On the left we see the load in percent for the different nodes, and at the bottom it's time in seconds. In this case we just tested it from a single location, so we only had one client, and that was me, located in Berlin. As you can see, the traffic first goes to the Frankfurt node and then to London; as soon as a node reached 60%, it was instructed to shut off and the traffic was then received by the other node. As you see, two of the nodes aren't getting any traffic at all, because the RTT is too long and there is never enough traffic generated for them to even become necessary.
All right. Thanks for your attention. If you have any questions or comments, please feel free to ask.
ALEXANDER AZIMOV: Thank you. I see at least one person in the queue.
SPEAKER: Thanks for the presentation. I would like to comment on the assumption that happy eyeballs is everywhere. It's not true. It's not true for all browsers, even. So, I would suggest not to go this path if possible, because it assumes that happy eyeballs will just take over and process the rejects, as you did mention. So please be very careful about this idea. Research the audience you target. If it's not just web browsers, if it's, for example, any client with wget, there is no happy eyeballs. If there is an API client, there is no happy eyeballs usually on API. Like, the programming languages, please be very careful about that. That's, like, a comment.
And then a suggestion: Happy eyeballs was created as part of the ‑‑ as the market was kind of afraid that IPv6 is not reliable enough. That's why happy eyeballs was invented ten years ago, and eventually, happy eyeballs is effectively hiding problems with IPv4 and even IPv6 networks. I think we, as the community, should be more targeting the happy eyeballs removal eventually rather than keeping it alive and using it for client side decision‑making. So that's from me. Thank you very much.
MAX FRANKE: Thanks.
ALEXANDER AZIMOV: There is another question. It comes from Peter van Dijk from PowerDNS:
"You mentioned that many clients are happy, with these browsers, OSes and others. Have you also found clients that are not happy?"
MAX FRANKE: I guess that goes back to the previous comment as well. So, before I did this, I did a big background study on the current state of happy eyeballs. Actually, surprisingly, there are still a lot of happy eyeballs implementations getting added, for example, to like Cora Labris (?) from ‑‑ okay, but, to the point. Wget has it, and a lot of programming languages are still adding it, and you can see, obviously, that the quality of the implementations differs. For example, the Apple ecosystem has the implementation of happy eyeballs that is closest to the spec, to the RFC. The one in Chrome, for example, is somewhat bare‑bones, but it still works even with the more bare‑bones fallback mechanisms from IPv6 to IPv4, even if they aren't implemented exactly as the spec wants them to be. So, we tried with a bunch of different browsers ‑ Chrome, Safari, Firefox, all the big ones ‑ and it worked for all of them, on mobile and on desktop operating systems. Of course, there are probably some, as I said, corner cases and stuff like that where it's not supported, but, from what I can tell, it works pretty well with all the browsers.
ALEXANDER AZIMOV: Okay. And I don't see any more questions in the line and no people in the line. So, I'd like to thank you very much for the report, and feel free to contact people that are maybe discussing it in the general. Thank you for your wrap‑up.
And now we are moving to the last lightning talk for the day. So, it's good that we are living in a world where RPKI is getting momentum, but there is a route origin validation adoption level... and we are finding operational issues that were not noticed previously. Please welcome Randy Bush from IIJ Research and Arrcus with his findings.
RANDY BUSH: You are offering me the slide deck that I need?
ALEXANDER AZIMOV: You can't see the slides?
RANDY BUSH: I am getting the wrong set. Got it.
And it's a cast of 1,000, and, also, I should credit Saku for evidently discovering this. I hate to tell you, but if you are running BGP validation and dropping invalids, your neighbours are going to see this presentation and find out they really don't like you.
This is an oversimplified BGP to explain this. We have ‑‑ by the way, the peers on the left and the right are the same peers, of course. The peers send BGP information, you ideally keep it in an Adj‑RIB‑in, and then you process policy, you decide what to send out to which neighbours and, of course, you push the best paths down to the forwarding plane.
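[Editorial illustration] The pipeline just described can be sketched roughly as follows. This is a minimal sketch, not any vendor's implementation: the names Path, Speaker, receive and set_policy are hypothetical, and "best path" is reduced to "shortest AS path" purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    prefix: str      # e.g. "192.0.2.0/24"
    as_path: tuple   # AS numbers, origin last
    peer: str        # neighbour that sent this path


class Speaker:
    """Toy BGP speaker that keeps a full Adj-RIB-In (hypothetical model)."""

    def __init__(self, import_policy):
        self.import_policy = import_policy
        self.adj_rib_in = {}   # (peer, prefix) -> Path, stored BEFORE policy
        self.loc_rib = {}      # prefix -> best Path, what goes to forwarding

    def receive(self, path):
        self.adj_rib_in[(path.peer, path.prefix)] = path
        self._select(path.prefix)

    def set_policy(self, import_policy):
        # Policy changed: re-run it over stored paths, no help from peers needed.
        self.import_policy = import_policy
        for prefix in {p.prefix for p in self.adj_rib_in.values()}:
            self._select(prefix)

    def _select(self, prefix):
        candidates = [p for p in self.adj_rib_in.values()
                      if p.prefix == prefix and self.import_policy(p)]
        if candidates:
            # Assumption: shortest AS path wins; real best path has many more steps.
            self.loc_rib[prefix] = min(candidates, key=lambda p: len(p.as_path))
        else:
            self.loc_rib.pop(prefix, None)
```

Because rejected paths stay in adj_rib_in, a later policy change can bring them back into the decision without asking any neighbour to resend anything.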
This is resource‑intensive, to say the least. So, in the late eighties, RAM and CPU were constrained. Cisco 004... developed data structures and algorithms to get rid of Adj‑RIB‑in, which was a big hunk of data and a complex tree traversal problem. So what he did was he merged the Adj‑RIB‑in and kept the interesting part of the data as part of the mechanism of the BGP algorithms themselves. The policy and best path parts are all pretty much the same. But you'll notice the blank space over on the left of the screen here, he saved space. So, this was pretty radical, has been copied by many vendors, and has interesting consequences. We're going to concentrate on one, which is how it affects route origin validation.
So if I have not kept the Adj‑RIB‑in and policy changes, I need to re‑evaluate all the paths, but all the paths aren't in the Adj‑RIB‑in because we got rid of it. So, [Enke] came up with route refresh. It's negotiated at the beginning of the BGP session, in OPEN, and it means that, oops, policy has changed, I send a route refresh message to my peers, and they all send all the paths to me again, and I can run full policy. Because policy changes so rarely, right, it's operators reconfiguring policy on the router, this is kind of okay; it doesn't scale too unreasonably.
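[Editorial illustration] The route refresh behaviour described above can be modelled in a few lines. This is a sketch under stated assumptions: Peer, LeanSpeaker, route_refresh and change_policy are hypothetical names, and a "refresh" is modelled simply as the peer handing back its whole table.

```python
class Peer:
    """Toy neighbour holding the table it advertises to us."""

    def __init__(self, name, table):
        self.name = name
        self.table = dict(table)     # prefix -> AS path tuple
        self.refreshes_served = 0

    def route_refresh(self):
        # A refresh makes the peer resend everything it advertises.
        self.refreshes_served += 1
        return list(self.table.items())


class LeanSpeaker:
    """Toy speaker WITHOUT Adj-RIB-In: paths rejected by policy are forgotten."""

    def __init__(self, peers, policy):
        self.peers = list(peers)
        self.policy = policy
        self.accepted = {}           # (peer name, prefix) -> AS path
        for peer in self.peers:
            # Initial table exchange at session start (not a refresh).
            self._ingest(peer, peer.table.items())

    def _ingest(self, peer, paths):
        for prefix, as_path in paths:
            key = (peer.name, prefix)
            if self.policy(prefix, as_path):
                self.accepted[key] = as_path
            else:
                self.accepted.pop(key, None)

    def change_policy(self, policy):
        # Rejected paths were thrown away, so every peer must resend its table.
        self.policy = policy
        for peer in self.peers:
            self._ingest(peer, peer.route_refresh())
```

Every policy change costs one full table from every peer, which is tolerable only as long as policy changes are rare.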
But RPKI data are new and different policy, okay. This is ‑‑ you now have a second data source, not just BGP, but ROAs and possibly ASPA, or BGPSEC, or whatever, but RPKI‑based policy.
RPKI data change very frequently, on the order of an hour or minutes.
Operator policy and RPKI policy are done before best path is calculated, right, so what happens is that the RPKI changes come in and they hit the BGP mechanism that doesn't have the Adj‑RIB‑in, and guess what? Route refresh has to be issued.
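[Editorial illustration] For readers less familiar with RPKI‑based policy, route origin validation classifies each path by comparing its prefix and origin AS against the current set of ROAs. The sketch below follows the valid / invalid / not‑found logic of RFC 6811 in simplified form; the function name validate and the tuple layout of a ROA are assumptions made for this example.

```python
import ipaddress


def validate(prefix, origin_as, roas):
    """Classify a route against ROAs given as (roa_prefix, max_length, asn) tuples."""
    net = ipaddress.ip_network(prefix)
    covered = False
    for roa_prefix, max_length, asn in roas:
        roa_net = ipaddress.ip_network(roa_prefix)
        # A ROA only matters if it covers the announced prefix (same address family).
        if net.version != roa_net.version or not net.subnet_of(roa_net):
            continue
        covered = True
        if asn == origin_as and net.prefixlen <= max_length:
            return "valid"
    # Covered by at least one ROA but matched none: invalid. Otherwise unknown.
    return "invalid" if covered else "not-found"
```

The result depends on the ROA set as well as on the path itself, so a path's state can change without any BGP message arriving.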
So if there were invalids, in other words if I previously had state where paths were dropped because ROAs made them invalid, they are not in the Adj‑RIB‑in to compete in the next round if the RPKI data changes. So the router issues a route refresh to all its peers. And this is not infrequent, to be polite about it. The refresh is horrible, okay. If you had had Adj‑RIB‑in, a new ROA would only affect the sub‑tree, a very scalable operation. Route refresh gets you a full table, and I don't know which peers I need it from, so guess what? I get the whole tree, and while I am processing that, head‑of‑line blocking stops incremental updates, so if you have a slow convergence router, and let's remember that I probably chose to eliminate the Adj‑RIB‑in because I have small resources, and that's going to be a small weak router, then you have really killed convergence.
And as John says, a change in import policy, an operator changes an import policy, that usually only affects a single peer and therefore it will only require a route refresh towards that one peer. A ROA change, I don't know who it affects if I don't have Adj‑RIB‑in. So I'm going to run it against all my peers. Great, I have just successfully DDoSed myself.
The result is not theory. SEACOM, Mark Tinka's company, sends the cable all the way up through Africa, through Europe and AMS‑IX, so on and so forth, here is a de‑peering notice from AT&T. Okay. This is a real whack. Think about an IXP with hundreds of members doing route origin validation and issuing route refresh to the route servers every few minutes. Talk about DDoS. But what's more fun is reverse it and think about the route servers sending route refresh to hundreds of peers every time they get new RPKI data. Now, it turns out you probably don't have to worry about this because the route servers turn out not to really implement ROV, but that's a side subject, we're finding out what people's implementations are, and they are fine.
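[Editorial illustration] The difference in work between the two cases can be made concrete with a toy count. This is a sketch only: the function names and the idea of counting "paths reprocessed" are assumptions for illustration, and the prefixes are documentation ranges.

```python
import ipaddress


def covered_by(prefix, roa_prefix):
    """True if prefix is equal to or more specific than roa_prefix."""
    net = ipaddress.ip_network(prefix)
    roa_net = ipaddress.ip_network(roa_prefix)
    return net.version == roa_net.version and net.subnet_of(roa_net)


def paths_reprocessed(tables, roa_prefix, have_adj_rib_in):
    """Count paths touched when one ROA changes.

    tables maps a peer name to the list of prefixes that peer advertises.
    """
    if have_adj_rib_in:
        # Only stored paths under the ROA's prefix need to be re-evaluated.
        return sum(covered_by(p, roa_prefix)
                   for prefixes in tables.values() for p in prefixes)
    # No stored paths: route refresh to every peer, each resends its whole table.
    return sum(len(prefixes) for prefixes in tables.values())
```

With real tables of hundreds of thousands of prefixes per peer, the second number is the full table times the number of peers, for every RPKI data change.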
So what's the solution? The first is keep a full Adj‑RIB‑in. Okay. Juniper does this by default. Cisco, you have to turn it on: inbound soft, okay. But that can be resource‑intensive on old hardware and, you know, you do have to advertise this stuff. But on anything that has real RAM and real processors, there is no excuse not to keep the full Adj‑RIB‑in. So if you have something that has the resources, if it's not on by default, then turn it on. This is the real solution.
But the Internet draft I am describing is a hack, if you did not keep the Adj‑RIB‑in. If a path was marked invalid by route origin validation, don't throw it away, keep it in your back pocket and mark it as dead, and by "dead" I mean do not let it compete for best path, okay. But keep it in your back pocket because then you can revive it if it needs to compete. You don't have to send the route refresh. This is a lot smaller set than full Adj‑RIB‑in, unless one of your neighbours does a 7007 or dumps a full table on you in some way, and then route origin validation is going to say all those are invalid and all of a sudden your back pocket has a real problem holding it. So probably you want to do a pre‑policy max‑prefix for sure, so that they can't dump the full routing table on you, redigest it in faults. Okay.
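[Editorial illustration] The "back pocket" idea can be sketched as below. This is a loose model of the approach as described in the talk, not the text of the Internet draft: Router, live, dead, update_roas and the max_prefix handling are hypothetical, and for brevity every known path is re‑checked on a ROA change rather than only the affected sub‑tree.

```python
import ipaddress


def rov_state(prefix, origin, roas):
    """Simplified origin validation; roas are (roa_prefix, max_length, asn)."""
    net = ipaddress.ip_network(prefix)
    covered = False
    for roa_prefix, max_length, asn in roas:
        roa_net = ipaddress.ip_network(roa_prefix)
        if net.version == roa_net.version and net.subnet_of(roa_net):
            covered = True
            if asn == origin and net.prefixlen <= max_length:
                return "valid"
    return "invalid" if covered else "not-found"


class Router:
    """Toy router with no full Adj-RIB-In that retains ROV-invalid paths."""

    def __init__(self, roas, max_prefix):
        self.roas = list(roas)
        self.max_prefix = max_prefix   # pre-policy limit per peer
        self.received = {}             # peer -> number of distinct prefixes seen
        self.live = {}                 # (peer, prefix) -> origin, may compete
        self.dead = {}                 # (peer, prefix) -> origin, kept but barred

    def receive(self, peer, prefix, origin):
        key = (peer, prefix)
        if key not in self.live and key not in self.dead:
            count = self.received.get(peer, 0) + 1
            if count > self.max_prefix:
                # Checked before any policy, so a leak cannot fill the dead set.
                raise RuntimeError(f"max-prefix exceeded for peer {peer}")
            self.received[peer] = count
        self.live.pop(key, None)
        self.dead.pop(key, None)
        self._place(key, origin)

    def _place(self, key, origin):
        invalid = rov_state(key[1], origin, self.roas) == "invalid"
        (self.dead if invalid else self.live)[key] = origin

    def update_roas(self, roas):
        # New RPKI data: re-check retained paths locally, no route refresh sent.
        self.roas = list(roas)
        known = {**self.live, **self.dead}
        self.live.clear()
        self.dead.clear()
        for key, origin in known.items():
            self._place(key, origin)
```

A path that becomes valid after a ROA change moves from dead back to live and can compete for best path again, and a path that becomes invalid moves the other way, all without contacting any neighbour.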
The third solution is: Don't run RPKI policy if you can't do one of the first solutions. Yeah, please turn off dropping invalids. No, this is very ‑‑ makes us sad.
How do you test? Well, different vendors have different ways to show how frequently you have sent route refresh and what the counters are. There is no MIB. There are CLI and YANG queries on most devices.
We didn't know this was happening for years, that's because we don't measure our networks very well. Shame on us!
And that's it. Questions?
WOLFGANG TREMMEL: Thank you, Randy. There was one question from the chat from Job Snijders:
"Do you know off the top of your head which OSS or COTS BGP implementations do not have an Adj‑RIB‑in enabled by default?"
RANDY BUSH: Two things: One is, since I do not know all implementations, I cannot correctly answer the question. The second thing is, one of the occasionally popular vendors, I think they are in California, they begin with a C, has no Adj‑RIB‑in by default.
WOLFGANG TREMMEL: Okay.
RANDY BUSH: But if you want to see something really novel, take a look at Bird.
WOLFGANG TREMMEL: Okay. Any more questions? If not, there is a question from me. Well, I do BGP a bit, and the question is: Why do you think ‑‑ well, memory has become so cheap, and why do you think router vendors simply do not use kind of SSD just to cache that information and use that instead of requesting the neighbour to resend everything?
RANDY BUSH: Well, let me pretend I work on Tasman Drive, and I have been shipping, in these images, this version of software, but let's be specific. Let's not be silly. The IOS VR, for a decade or more, I can't all of a sudden change the default, because customers, millions of customers who did not have it hard‑configured in their configuration will upgrade their router software and, blam, everything changes. So, I can't ship a fix to turn on Adj‑RIB‑in.
WOLFGANG TREMMEL: Okay. Are there any more questions? Okay. It looks like not. Thank you ‑‑ oh, there is one more question from Carlos Martinez: Does presence/absence of the Adj‑RIB‑in have anything to do with what the vendors call soft reconfiguration?
RANDY BUSH: Yes. Cisco, by default, ships no Adj‑RIB‑in, okay. So, if you want to turn it on, it's called soft reconfiguration on the input. Watch out, there is soft reconfiguration on the output also. That's different.
WOLFGANG TREMMEL: Okay.
RANDY BUSH: But yes, Carlos, indeed, you are correct.
WOLFGANG TREMMEL: All right. So, I see no more questions. Thank you, Randy, for this presentation. And also thanks to all presenters of this session and, with that, I think the Plenary sessions for today are over. I would like to thank all presenters. I would like to thank the stenographers. Please rate the talks, it helps us in the Programme Committee to make our decisions for the next RIPE meeting, and also, if you are interested in working in the Programme Committee, please contact us, there is a PC election going on, just put your name into the hat.
And with that, I say goodbye, and we will see each other I guess in the social in half an hour, and thank you and see you tomorrow for Address Policy which starts at 10:30, and the next Plenary will be on Friday.
LIVE CAPTIONING BY
MARY McKEON, RMR, CRR, CBC