RIPE 83
.
Plenary session
.
22 November 2021
.
11:00 a.m. (UTC + 1)
.
.
FRANZISKA LICHTBLAU: Good morning, everyone, and welcome to our fourth virtual RIPE meeting. My name is Franziska, I am the Chair of the RIPE PC, and the PC will be responsible for the Plenary sessions over the coming days. If you have any questions, don't hesitate to ask us, ping us on SpatialChat, you can nominate yourself to work with us, and I hope you all have a very enjoyable RIPE meeting.
So, let's get started. Our first speaker is Giovane. He works as a data scientist at SIDN Labs and is also a researcher at TU Delft. He is one of the folks who tries to bring two communities together, namely the operations and the research communities. So welcome, Giovane. He will talk about responsible disclosure.
.
GIOVANE MOURA: Thanks, Franziska. Can you hear me well? All right. Perfect.
So, let's get started.
So good morning, everybody, or evening or afternoon, depending on where you are. Today, I am going to be talking about responsible disclosure. My name is Giovane and I am at SIDN Labs, and recently ‑‑ well, last year ‑‑ we came across this DNS vulnerability, TsuNAME, and we carried out responsible disclosure on it, so I want to share a little bit of our experience in this case.
.
We actually wrote a paper on this, which appeared at the last ACM IMC conference; here is the link. I am also including a link to the video related to this paper, because today I'm not going to be talking about TsuNAME per se, but only about the responsible disclosure part rather than the details of the vulnerability. For those, I would refer you to the paper and the video that we presented previously. The idea today is just to share experience that may be useful to other people.
So let's say, you know, you are a researcher or an operator or a developer or an enthusiast or whatever, and you come across a vulnerability in a product ‑‑ maybe a protocol, software, or hardware. At least for me, I have been working in research for a little while and I had never actually come across a vulnerability before, so it was a rare event for all of us on this paper. But the first thing that comes to mind is: what do you do in this case? Everybody talks about responsible disclosure, but does it really work in practice? For this particular vulnerability, we carried out responsible disclosure, so let's see how that actually panned out.
So, the goal today is to share experiences that may help people who come across this in the future, and also to share our mistakes. We're not perfect. None of us had ever done this before, so we can also show where we made mistakes, but also what went well.
And a disclaimer here is that our sample size is only one. Again, we never came across this before. So, let's see how it goes here.
Now, disclosing a vulnerability. If you come across a vulnerability, there are pretty much four things you can do, besides not telling anybody. One thing you can do is what we call private disclosure. You notify a vendor about it and say: hey, there is a vulnerability in your product, please fix it. I am going to cover the details later, but with that, you put the vendor on the spot and you let them know about it.
.
The other thing that you can do is public disclosure, which is pretty much what OpenBSD recommends. You tell everybody and the vendor at the same time about this vulnerability. And there is another one, created later by researchers and the tech industry, which is called responsible disclosure; it's a way of combining both. You first notify the vendor, give them a heads‑up and some extra time so they can fix the product, and say: I am going to disclose this publicly at a certain date.
The fourth thing you can do is go rogue. You can actually try to sell your vulnerability, and there are a lot of security agencies that are going to be interested in buying it to exploit others. We don't recommend that. It's not ethical and we don't cover it in the paper or in this talk.
With private disclosure, you just tell the vendor and the vendor has the power to decide what they are going to do about it, and I think this approach is pretty much defunct. Well, not entirely, if you think about it, because many people ‑‑ including in this particular case, and I am going to talk about them later ‑‑ came across this before and notified the vendors, but it didn't have much impact in practice because the issues continued. And in the past, vendors could simply go after the researchers for it. If you are interested, here is a link on this; it discusses the issues and problems related to private disclosure, namely that you leave the power entirely in the hands of the vendor.
The alternative is public disclosure: tell everyone at the same time. It's a damn good idea, according to Schneier. It brings public scrutiny to these vulnerabilities ‑‑ it's pretty much like open source code, everyone can look into it. That is, according to Schneier, the only reason vendors patch their systems: to protect their reputation. The issue with going directly to public disclosure of a vulnerability is that patches may not be available at the time of the disclosure, so some people can exploit it.
.
I mean, last month, the National Cyber Security Centre in the Netherlands decided to publicly disclose an RPKI vulnerability. I don't have much information on that, but here is the link to the discussion with the vendors if you are interested, just to see a real case of public disclosure. They kind of stepped back and contacted the vendors first, but then they very quickly released the information publicly.
I think public disclosure is great. It's a way to tell people what's going on.
Later, researchers came up with the idea of combining both: you first notify a vendor so they can patch their systems, and then you do the public disclosure. And according to Schneier, this only works nowadays because public disclosure came first and laid the foundations for it.
In a nutshell, what happens with TsuNAME is this: you have DNS resolvers, which are DNS servers trying to resolve a domain name, and they need to fetch information from servers called authoritative servers. There is a configuration error and a bug that, when combined, cause resolvers and clients to overwhelm the targets, the authoritative servers, which can potentially bring them down. The interesting thing is the asymmetry in this vulnerability: the bugs are in the resolvers, which are run by, you know, Google Public DNS or Cisco or whatever, but it's the authoritative servers that pay the price, and those are run by different companies operating different top‑level domains ‑‑ we run .nl, VeriSign runs .com ‑‑ a different sort of institutions or companies. After we publicly disclosed this, one European ccTLD shared a graph with us showing that they had ten times their normal traffic due to the TsuNAME vulnerability. Two of their domains were misconfigured, and each line in the graph represents a different server in a different colour; you can see how the traffic increased, and the sharp drop around 11 a.m. is when they fixed the issue.
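To make the misconfiguration part more concrete, here is a minimal sketch ‑‑ the domain names and the helper function are hypothetical, and this is not the detection tool mentioned later in the talk ‑‑ of the kind of cyclic delegation that vulnerable resolvers end up chasing in a loop:

```python
# Minimal sketch (not the actual TsuNAME tooling) of a cyclic delegation:
# two domains whose NS records point at names hosted under each other, so
# neither can ever be resolved and buggy resolvers keep retrying.

def parent_zone(ns_name: str) -> str:
    """Return the domain an NS hostname lives under, e.g. ns1.foo.example -> foo.example."""
    return ns_name.split(".", 1)[1]

def find_cyclic_delegations(ns_records: dict[str, set[str]]) -> set[tuple[str, str]]:
    """ns_records maps a delegated domain to the set of its NS hostnames.

    Reports pairs of domains that delegate to name servers hosted under each
    other, i.e. a cyclic dependency neither delegation can escape.
    """
    cycles = set()
    for dom_a, servers_a in ns_records.items():
        for dom_b, servers_b in ns_records.items():
            if dom_a >= dom_b:  # look at each unordered pair only once
                continue
            a_points_to_b = any(parent_zone(ns) == dom_b for ns in servers_a)
            b_points_to_a = any(parent_zone(ns) == dom_a for ns in servers_b)
            if a_points_to_b and b_points_to_a:
                cycles.add((dom_a, dom_b))
    return cycles

# Hypothetical example: cat.example delegates to servers under dog.example
# and vice versa, so resolution of either one loops forever.
zone = {
    "cat.example": {"ns1.dog.example", "ns2.dog.example"},
    "dog.example": {"ns1.cat.example", "ns2.cat.example"},
}
print(find_cyclic_delegations(zone))  # {('cat.example', 'dog.example')}
```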
So that's the timeline we used for the disclosure. First, we contacted Google ‑‑ it took them a little while ‑‑ and said: hey, folks, we are disclosing this to you privately first, but we are going to carry out responsible disclosure: first we tell you, then we tell a selected group, DNS‑OARC, which was very helpful with this because it is an established DNS community of trusted parties where you can share private information. And the day before we released this information, Google fixed their entire system, which was great. Google was one of the most affected systems here and it took them less than three months to fix it, and Cisco OpenDNS fixed it 40 days after our disclosure at DNS‑OARC; we also discussed it in other venues as well. Later, we carried out the public disclosure of the vulnerability on a public website.
So, moving on. What did we learn in this process? These are the key lessons that we would like to share with you.
It worked. Responsible disclosure really worked in this case. Google and Cisco, which are major public DNS providers, fixed their DNS software; there was enough time for them to react and fix it, and we also obtained self‑reports from other vendors. So it really worked. But again, we cannot generalise our sample‑size‑of‑one experience to all vulnerabilities; in our case, though, we can say it worked.
One of the mistakes we made is that, when we notified Google in the private disclosure, we did not set a date for the public disclosure, and that was a mistake, because people work to deadlines. We all work with deadlines; if there is no deadline, you will have other things to do and put stuff on the back burner. My suggestion is: if you find a vulnerability, set a deadline in the very first mail you send. It obviously depends on the severity of the vulnerability, and you have to weigh those risks, but 90 days, which is what you find in the recommendations for responsible disclosure, is more than enough, and that was the case for us. So set a disclosure date from the start.
And when in doubt, we also recommend disclosing. I mean, when we came across this, we had no real evidence of any large DDoS having been carried out using TsuNAME, and we believed this vulnerability had existed for years and years, so we asked: is this really worth disclosing? People should already have known about it. And the answer is: yes. As a researcher, you don't have a complete view of the Internet and you don't know what other people are going through, and it's also important to let others take responsibility for their own products and figure things out.
Not disclosing any of this would simply be security by obscurity, which is the thing you want to avoid if you do Internet security. And it's better to be safe than sorry: if you disclose, people can fix it; if you don't disclose, attacks can occur that you could have helped prevent.
And one of the advantages of the disclosure in this particular case is that, together with the statement, we released a tool to find this sort of bug in DNS zone files, and when we disclosed at DNS‑OARC, a lot of people came together and helped improve the tool, and we are thankful to all of them. They made it faster, better, more precise ‑‑ you can leverage the power of the community. It started with us, a researcher finding this out, but it became a community effort, and we are really thankful to all the people who helped us with that.
.
One thing that I didn't know beforehand: it takes a lot of time, energy and patience to do a disclosure. In our case, it affected not only resolver operators, but also folks who run authoritative name servers, so we had two different crowds, with developers and operators in each of them, and we relied on different venues with parties we could trust to disclose it. We did disclosures in four languages and in different venues, and it takes time and effort. You have to be prepared for that.
.
And another thing that we learned is that trust is essential. Once we contacted people by e‑mail, we asked for their PGP keys, and we were very transparent and open from the beginning. But I also recommend you check with the legal folks in your company, to cover yourself. Just in case; it doesn't hurt. I am not a lawyer, I don't know the legal framework, so I think it's better to check with them too.
Another thing you will learn ‑‑ here in this slide you have a cat which is not very happy ‑‑ is that you cannot make everybody happy. When you disclose something like this, you don't know how people will react, and, processing everything afterwards, I think there were three types of reaction:
.
One was people who were very positive about it. These were the vendors and operators who had already suffered TsuNAME attacks themselves. We were not the first ones to see this. They came forward and told us: thanks so much for doing this, we have seen this before, we privately disclosed it to the vendors and they didn't fix it, so we're thankful you came forward with it.
.
We also had negative feedback, people were accusing us of fear‑mongering and saying there are other ways of doing DDoS, nobody is going to do that.
We also had people who were indifferent, saying "I don't care". You have got to be prepared for that, and I think that's okay, because the goal here is not to make everybody happy. You are not trying to be the smartest guy in the room; what you are trying to do is get folks to fix their software and mitigate the vulnerability, and when Google and Cisco fix it, that protects all their users. I think that's the main goal here. So be prepared not to make everybody happy.
Another thing you can try to do is make the most of constructive feedback. After the public disclosure, I tweeted this, and it turned out to be by far my most popular tweet ‑‑ I am not really an avid Twitter user. Right after that, Randy Bush responded that it's a shame that cycle prevention was not in the early DNS RFCs ‑‑ oh, wait, it was. I have a deep respect for Randy, so we looked into the feedback: was it really in the standards? It turned out he was partially right. We had missed four RFCs, and if you are a researcher, missing related work is one of the worst mistakes you can make, but it turned out that none of these RFCs fully addressed this issue. The feedback that Randy gave us motivated us to write a new Internet draft, which we submitted to the DNS Operations Working Group within the IETF, and we got some feedback there on the list. We are actually working on a new version right now. It's not clear what's going to happen with the draft because it's at a very early stage, but at least we're getting some momentum in the discussions there. So, make the most of the constructive feedback that you get.
One of the last things I can talk about is taxes. This is an extra thing that was not even a goal here. In the process of disclosing this, Google awarded us a bug bounty ‑‑ they give you some cash ‑‑ and the funny thing is, they said: here is the link, if you want to get the money, this is what you have to do: fill in an eight‑page form with 30 sections, which you can download as a PDF. This is not Google; this is the Internal Revenue Service, the US tax authority, and you have to pay 30% tax, and then you have to talk to your bank to receive it here in Europe, where I live, where it's euros, and your bank won't make that easy. We didn't actually want the money; we were happy with the bounty, but we wanted to donate it, because it's not our money and we would like to give it back to the community somehow. So I wrote a mail to Google: can you just donate this to a charity of our choice? And they said: there is an app for that, click here, I'll forward it to you. And then we could donate with one click and it went through.
That was a funny thing. I wonder about the people who make a living out of bug bounties and have to go through all that; but that was not our goal, there was an app for that, and we could simply donate it to a charity ‑‑ we chose Wikipedia in this case.
So, I think, as a summary, we can say that responsible disclosure worked. It took more effort and time than we anticipated ‑‑ it was uncharted terrain for us ‑‑ but I think, overall, we had positive responses. My recommendation for researchers: you should try responsible disclosure first if you have the chance, rather than simply doing private disclosure, which is something I don't recommend. In this particular case, some people came forward and said: we reached out to the vendors before, they knew about it, but we didn't manage to get them to fix it. I'm not going to say which vendors they were, but with a public disclosure date set, the vendors get the peer pressure and the public pressure to actually fix it. So try responsible disclosure, because it worked here: Google and others actually fixed their stuff. It was a positive outcome.
We have an IETF draft now under review and I can say we have a slightly safer DNS because of that, because of the community engagement, because of the process, because of the help of everybody else in the process.
And I think that's what I had to say. If there are any questions, I'd be happy to take them.
JAN ZORZ: Okay, thank you for this very nice presentation. And I think we have two questions in the Q&A section.
So, Brett Carr is saying: "A comment not a question: We are interested in ensuring that vendors also do responsible disclosure to customers of fixes in a staged manner so that operators of critical Internet services get to know in advance of the public exposure. This allows critical operators to patch their systems before a critical vulnerability is well known. Some vendors do this already, but many do not."
.
Any comments, or should I go to Erik Bais that has the next question?
GIOVANE MOURA: I don't think I have specific comments. I agree with him.
JAN ZORZ: Okay. So Erik Bais is saying: "On the tax issue, most of the software companies have local offices, like a Dutch Google entity or similar for your location; that might fix the US tax issue."
GIOVANE MOURA: Oh, I didn't know that. I mean, I was just thinking: we do this, and I got scared of those forms, and I asked: can you donate this? And they said yes. So I didn't even try to get further than that, but... yeah... thanks. I just didn't look much into it; that was the first thing I did, and, sorry, I was not willing to fill in such a long form. We wanted to donate it anyway.
JAN ZORZ: We would prefer if people asked their questions themselves so that we don't have to read them out ‑‑ just ask for the microphone. And... oh, Peter wrote a novel. Peter, would you like to ask it yourself, or do you want me to read it? Let me read it:
.
Peter Hessler from the OpenBSD Project: "I would like to make some clarifications about OpenBSD's stance on disclosure. First, for security issues that we ourselves discover in OpenBSD: we develop the fix, prepare the patches, and only then do we announce the fixes to the public. When outside groups contact OpenBSD developers about a security issue, normally our developers will agree to reasonable conditions so we can fix our own software and assist others to fix their software. OpenBSD is not a legal entity and cannot sign any NDAs or contracts; only the individual developers can."
GIOVANE MOURA: Thanks for sharing that. In our particular case, we didn't impose any legal NDA or anything on the people we contacted. We just trusted them. I think trust is key. If you go the legal route, it makes things much more difficult, and, I mean, in our case I already knew people in the companies, which makes things easier. On top of that, when we talked to DNS‑OARC, participants have to follow certain rules, and confidentiality is one of those, so we already had a trusted place in which to disclose it. But even with the people we contacted by mail, we never imposed an NDA or anything. Trust was a choice. Maybe it was a naive choice, but that's what we did.
JAN ZORZ: Okay. Are there any other questions?
FRANZISKA LICHTBLAU: I don't see any.
FRANZISKA LICHTBLAU: Daniel Karrenberg from the RIPE NCC poses a question to the audience, so listen up: "Related to Brett's suggestion" ‑‑ the first one that Jan read out ‑‑ "is there a mechanism to reach all operators of critical infrastructure worldwide, and who determines who is such an operator?" So I see there is some basis for discussion; maybe we can have that later in SpatialChat, or you can come back to Daniel in the chat.
Oh, somebody is asking to send audio? Oh, no it looks like a mistake.
Okay. So ‑‑
JAN ZORZ: No more questions?
GIOVANE MOURA: Maybe I can ask something of the crowd here, the RIPE crowd. Does anybody here have previous experience of disclosure? Because in my discussions with operators, when we privately disclosed this at DNS‑OARC, I can name four other people who knew about it and had notified vendors. I think what really holds people back from doing responsible disclosure is the effort ‑‑ you have to do a lot of work for it. They only notify the vendor, because they are operators and that's their day job: we are having this problem here, can you fix your software? That's where they stop, and they don't really have much incentive, in my understanding ‑‑ I cannot speak for them ‑‑ to go further. Because I am a researcher, I have more, you know, ground to do that, and it took a lot of time. I think one of the issues with responsible disclosure is that you have to take on a lot of work yourself, and a lot of operators don't have the time and energy for that. So my question to the crowd is: have you folks ever done private disclosure? If you did, be aware that the vendors might not fix it and that might hit other people in the future. Does anybody have any experience with that ‑‑ public, private or responsible disclosure? I'd like to hear what people have to say about it.
FRANZISKA LICHTBLAU: We have five minutes left, so if anybody wants to tell their story, request audio and we're happy to talk to you.
JAN ZORZ: Somebody asked to send the audio, but then ‑‑ again ‑‑ hi, Michael.
MICHAEL RICHARDSON: I am a receiver of responsible disclosure for tcpdump, so let me tell you how it looks from the other side of things. We get dozens of fly‑by reports per month from people running fuzzers. They would like us to assign them a CVE number as soon as possible so they can collect their Google bounty. They almost never respond to requests for clarification. They don't usually have test data that demonstrates the bug. Sometimes it occurs only when you use the Clang 32‑bit compiler on a 64‑bit platform and test it on this other machine, and it will take us somewhere between two and 20 hours to validate each report, as volunteers on an open source project. So I can't imagine what it's like for other vendors. And so, if you are not getting that response from vendors, this is part of the problem, and I am sorry to say that the bug bounties are the cause of it. We now have an ecosystem of people running fuzzers ‑‑ in some cases using a lot of electricity, not bitcoin level, but quite a bit ‑‑ who are basically depending on these bug bounties to pay for their electricity and their stuff. And so it's really difficult.
And then, as an Open Source project, I now have to maintain a separate private git tree in order to not publicly disclose the fix. It's really hard to collaborate with people. We set up separate infrastructure simply to be able to do that, and, as somebody said in the chat, we'd love to preferentially disclose to Red Hat and Ubuntu and these people so that the packages are ready. The funniest part: Apple, for instance ‑‑ we disclosed to Apple and they took six to ten months to deal with it. Are we supposed to wait for an acknowledgment from them before we publicly disclose, or not? We have had people ask us: could you please agree to the 90‑day public disclosure? And our comment is: no. You might as well not even report it to us if you are going to make us take 90 days; just go public, because we are not going to fix it in 90 days based on the number of things in the queue. And about half the time it's already fixed upstream ‑‑ it's fixed in our master anyway; they are reporting against some version that is not the latest or whatever.
So, I don't have an answer for this, except that it would be nice to work together. It would be nice if someone would set up a secret decoder ring so that I could recognise clueful people from unclueful people. If you could create that, I'll join it. Until such time, I don't know what we're going to do. It's just too much work. We need, like, literally secretaries to sort it all out. And, you know, if operators don't have the resources, I am going to say that our side doesn't have the resources either. I don't think Apple or Google have the resources ‑‑ or they are not spending them anyway; they would rather pay money on bug bounties. Anyway, there you go.
GIOVANE MOURA: Thanks a lot for this feedback. I had no idea that bug bounties provided this sort of negative incentive ‑‑ people trying to find any sort of minor issue just to make a living out of it. Thanks for sharing that; I hadn't heard it before. Maybe it's because the parties I disclosed to were solid companies that have been there forever ‑‑ these are big software projects and they probably had the resources to fix it.
MICHAEL RICHARDSON: I would also add that there is some other new entity that says they will give 20% of the bug bounty to the project. I don't know if it's real. I have responded with scepticism to this person ‑‑ great idea, I don't know if 20% is the right number; I would say it depends on whether the report comes with a patch. If someone wants to send us a patch and a demonstration ‑‑ here is how to break it ‑‑ that's always great. That's always the hardest part: like 90% of the problem is reproducing the bug, and once you can reproduce it, you can usually fix it in five minutes. And then you go through the process of: how far do I release this and to which branches do I release it? So my opinion is, whoever writes the patch should get 80%. And if all you did is report it, I'm like: great, you may have been one of ten people who reported it, and who cares?
GIOVANE MOURA: Yeah, in your specific case your code is Open Source, so someone can actually write a patch; that was not the case for us. And when I said that it takes a lot of energy and effort to do public disclosure, I was not implying the kind of lazy approach of the people you describe; I am talking about trying to reproduce every single scenario, trying to get to the bottom of it, and that took a lot of effort ‑‑ to reproduce the experiments. It's all in the paper, but it took a lot of energy, too. If you do not do the homework, I am sorry, but that's not the goal: you want to improve the software and protect people, not make a living out of it while giving trouble to developers. But thanks for sharing that.
JAN ZORZ: Thank you. We need to go on because it's 11:32. Thank you for your presentation and thanks everybody for contributing to the discussion.
Now, Franziska, what do we have? One more speaker. So, let me see here, I have a short introduction. Isabelle Hamchaoui and Alexander Ferrieux work on quality of service and traffic management at Orange and, in order to stay connected to reality in the field, they are used to troubleshooting real problems for Orange affiliates. In this context, they designed new methods and tools for troubleshooting in the harshest conditions, in particular for encrypted protocols such as QUIC. Let's see how Orange is troubleshooting with QUIC.
ALEXANDER FERRIEUX: Okay. So, maybe in this audience everybody already knows about QUIC. In that case, please speak up right now; otherwise, maybe we need to do a short introduction.
With my colleague, Isabelle, as we have just been introduced, we are really concerned about making the networks work as designed. So, it's a question of debugging, and we feel that debugging is a bit of a forgotten constraint of protocol design today, especially with new protocols.
So what is QUIC? QUIC is something that appeared in the aftermath of the Snowden scandal in '13 or '14, when everybody started being a bit touchy about privacy, of course, and also about subtler aspects of privacy, like linkability and so on. In this context, some players became aware that something was needed to fight those breaches, and also to improve on existing protocols.
So, they designed something based on a rework of TCP ‑‑ basically an advanced TCP, but with deep encryption. So, QUIC was born. And it has steadily grown in importance; it is now nearly 30% of the traffic that we are seeing at Orange.
After a very smooth start at Google, it gained momentum and has now reached standard status: it became an RFC published this year by the IETF, RFC 9000, and it is now a standard transport layer.
What is it exactly? A transport layer ‑‑ Layer 4 in the model ‑‑ is something that does end‑to‑end connectivity. End‑to‑end means that nobody in between should intervene, and it is also supposed to give some key assets to the end points. The key assets are assured delivery and a robust control of flow.
These two items, which are depicted here ‑‑ error control, which means assured delivery, and flow control, which means you cannot send faster than the receiver accepts ‑‑ are very important components of TCP, which is the mainstream Layer 4 protocol on the Internet. It seems to be something really easy to achieve, but it's not.
When you do error and flow control, you need to expose some elements of information to do it. That allows intervening elements like middleboxes to do some nasty stuff with it. Middleboxes will legitimately interfere with the transport layer at some point ‑‑ for example, for acceleration or simply because of lack of resources ‑‑ and firewalls started to be very touchy about new options like TCP Fast Open, which never saw the light of day because of the intervention of the nasty firewalls.
Globally, this is known as ossification. Ossification is a brake on progress that was completely accidental but is, somehow, a consequence of exposing the important information elements at the transport layer.
So, the answer designed to fight this was QUIC. Basically, it started with what you see here, which is the current full application stack over TCP: a TLS encryption layer above TCP, and the application above that. In this scheme, the critical information elements, which are flow and congestion control, are exposed at the TCP layer, below the encryption. That was the problem. So, what QUIC did was move those functions up one layer: flow control and congestion control move up into the QUIC layer, which is the layer above UDP and is itself completely encrypted.
Thanks to this, the key information elements are now hidden. So that's good. No more ossification.
So, yes, it was the initial mindset that led to the construction of QUIC at Google and then at the IETF. But does it really work that way for everybody?
.
Well, not quite. With TCP, as an operator, we have developed methods to debug the network and look at issues ‑‑ at loss, for example, or RTT build‑up. We take advantage of the exposed information elements to do this, for example the sequence number that you can see on the left, which allows us to quickly determine whether loss appears upstream or downstream of the measurement point. So we have a measurement point, for example, somewhere in the middle of your network, and you critically need to know whether to pursue the investigation on the right or on the left. To do that, you need some signal that tells you: okay, loss is upstream, or loss is downstream. This is no longer possible in QUIC.
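As an illustration of what such a mid‑network measurement point can do with TCP today, here is a minimal sketch ‑‑ not Orange's actual tooling, and the trace format is invented ‑‑ of localising loss from exposed sequence numbers: a segment that fills an old hole was lost upstream of the monitor, while a duplicate segment means the original passed the monitor and was lost downstream:

```python
# Minimal sketch of passive TCP loss localisation at a mid-network monitor:
#  - a segment filling a hole we skipped earlier never reached us the first
#    time, so it was lost upstream of the monitor;
#  - a segment seen twice passed us and was retransmitted anyway, so the
#    loss happened downstream of the monitor.

def localise_loss(segments):
    """segments: iterable of (seq, length) tuples in capture order for one flow."""
    seen = set()    # sequence numbers already observed
    highest = -1    # highest byte offset seen so far
    upstream = downstream = 0
    for seq, length in segments:
        if seq in seen:
            downstream += 1        # duplicate: original passed us, lost later
        elif seq + length <= highest:
            upstream += 1          # filling an old hole: original never reached us
        seen.add(seq)
        highest = max(highest, seq + length)
    return upstream, downstream

# Hypothetical capture: bytes 100-199 were skipped and only show up later
# (upstream loss); bytes 300-399 are seen twice (downstream loss).
trace = [(0, 100), (200, 100), (300, 100), (100, 100), (400, 100), (300, 100)]
print(localise_loss(trace))  # (1, 1)
```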
As we have just said, none of these information elements are exposed. So, what shall we do?
.
Of course, there are a few methods that could be envisioned which work without looking at the information elements, such as drop counters, or two‑point measurements, where you put two separate measurement points to observe packets and correlate them, or active measurements, where you inject some packets into your own flows with special tags, or key disclosure, where you ask the client to expose its cryptographic keys just for you. None of these methods scales up, and we thought that we needed to do something at the protocol level to re‑enable the methods that we have shown work very well in TCP. So, to do that, we proposed a mechanism, which we called the loss bits, to detect and localise loss without any packet numbers, without anything else.
To do that, we put some reference patterns in the packets, and we need only 2 bits in the QUIC header to do it.
The first bit that we are exposing, which is called the upstream loss bit, is just a square signal with a period of N packets: N packets of 1, N packets of 0, and so on ‑‑ very simple, right? With this, at the measurement point, we can just count the number of successive ones and zeros and compare with N, assuming N is known in advance. This gives you the amount of upstream loss.
The second bit that we are proposing is the end‑to‑end loss bit. Here, we need the contribution of the protocol stack itself ‑‑ the sender stack on the right. Upon a loss, the sender must be aware of it, of course, because it needs to do assured delivery, so it will retransmit; it knows it has lost some packets. We ask the stack to mark one packet with this bit for each such loss. So, just by counting them ‑‑ simple unary counting ‑‑ we can also assess the amount of end‑to‑end loss.
Having both of them, by difference we also know the downstream loss. With this, we can recover the capability that we mentioned before.
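A minimal sketch of the observer‑side arithmetic, assuming a toggle period of N = 64 packets (the exact encoding and edge handling in the actual draft may differ):

```python
# Observer-side sketch of the two proposed loss bits (assumed N = 64).
N = 64  # the sender flips the "upstream loss" bit every N packets (square wave)

def estimate_loss(packets):
    """packets: list of (q_bit, l_bit) pairs as seen, in order, at the monitor.

    Returns (upstream, end_to_end, downstream) loss estimates in packets.
    """
    # Upstream loss: each run of identical q_bit values should be exactly N
    # packets long; any shortfall is packets lost before they reached us.
    runs, current, run_len = [], None, 0
    for q, _ in packets:
        if q == current:
            run_len += 1
        else:
            if current is not None:
                runs.append(run_len)
            current, run_len = q, 1
    upstream = sum(N - r for r in runs[1:] if r < N)  # skip the partial first run

    # End-to-end loss: the sender marks one packet with the l_bit for every
    # packet it had to declare lost, so a simple unary count recovers it.
    end_to_end = sum(l for _, l in packets)

    downstream = max(end_to_end - upstream, 0)
    return upstream, end_to_end, downstream

# Hypothetical trace: the second run of q-values is only 62 packets long
# (2 lost upstream), and the sender set the l_bit on 3 packets in total.
trace = [(1, 0)] * 64 + [(0, 0)] * 62 + [(1, 0)] * 64 + [(0, 1)] * 3
print(estimate_loss(trace))  # (2, 3, 1) -> 1 packet lost downstream
```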
That seems pretty efficient and fairly simple, right? So what we did is present it to the IETF in 2019, and we did not come empty‑handed. We came with a large‑scale experiment with a partner, run across four countries. The large scale was made possible because, as you can see, these bits only involve the sender's side, so there was no need to change a single bit in client devices, and we could deploy it at large scale just by changing the sender side.
.
So we exposed this to the IETF. We showed the scale of the experiment and we received a very lukewarm reception. So it's not enough to do a simple mechanism. It's not enough to be very simple. It's not enough to deploy widely to be heard at IETF.
So, as a summary, I'd say that we have seen some very serious threats against the basic day‑to‑day activity of debugging the network. We are very wary that QUIC will completely blind us in many instances. We brought this to the IETF and got very mixed responses: some support from Akamai and others, but from very few operators ‑‑ we would like you to join in supporting it.
We met a lukewarm reception from Google and Microsoft, and fierce opposition from strong players like Facebook and Mozilla who, by the way, call the shots regarding QUIC at the IETF.
So that's where we are now.
Now, one side question would be: is loss really critical? Why are we doing all this work against all this opposition? Why worry about loss in the Internet today? You may know about BBR, which is a new congestion control for TCP, also used in QUIC, which makes loss a bit less important: mild loss doesn't affect BBR TCP or BBR QUIC like it did earlier congestion controls. So should we throw away all these methods? No, because there are also stronger losses, and they happen fairly frequently, and we still need to locate the fault ‑‑ locating is the name of the game here. We can, of course, measure loss with various methods, but locating it is very hard.
So we think that our mechanism, which is very light ‑‑ it's only 2 bits on the sender side ‑‑ should be pursued and pushed, together with other methods. We don't know exactly how, and maybe, in this audience, we could get some hints.
Here are some references to the IETF work. And that's it for now.
FRANZISKA LICHTBLAU: Thank you. So, do we have any questions? Any people who want to speak up and ask us all questions about QUIC? I see Daniel ‑‑
BENEDIKT STOCKEBRAND: What are the reasons given by Facebook and Mozilla to oppose your idea?
ALEXANDER FERRIEUX: Okay, there are basically two parts. One part is just privacy aversion: they feel that, in some very contrived scenario somebody could imagine, this one bit could be used to slightly increase linkability. We think that doesn't hold. And the other reason, which I think was rather bad faith, is that there were other fish to fry. When you are writing a big protocol with such ambitions as QUIC, you admittedly need to solve thousands of more important issues than this. So it got deferred, and we were told to come back for V2.
BENEDIKT STOCKEBRAND: Okay. That's not very nice to hear, but if you want to do a proper job, you have to basically get everything right, not just the big stuff while leaving out all the rest. Okay. Thank you.
JAN ZORZ: Okay. Any other questions? I see Sascha commenting that Franziska is very quiet. Nothing in Q&A?
FRANZISKA LICHTBLAU: We don't have any questions. Come on, people. Now we have time...
JAN ZORZ: We still have 12 minutes.
FRANZISKA LICHTBLAU: Ah, there is somebody.
JAN ZORZ: Somebody was asking to send audio; please click that button again so we can catch you and allow you. It's interesting, you have to be on the participants' list if you want to see who is asking for audio. If you are on any other panel, you don't see that.
FRANZISKA LICHTBLAU: We cannot see that. Okay. So...
JAN ZORZ: Okay, Brian Trammell.
BRIAN TRAMMELL: Hi. How is it going? Thanks a lot for the talk; it's really good to see this here at RIPE. I wouldn't have asked this, but people are like "please ask questions". I was wondering: in the trial that you had with Akamai, there is a little bit of measurability in the protocol, in the spin bit, just for the latency thing. Did you look at the correlation between the latency and the loss on the links that you were looking at? That would help answer your last question there, why are we looking at loss ‑‑ a question that I also posed when we were doing the spin bit work earlier. I just realised that, in the experiment you are doing here, you are looking at whether you can actually see the loss when it's on the network. I believe you did have the spin bit. Was the spin bit enabled on these links as well?
ALEXANDER FERRIEUX: Not at that time, because as you know very well ‑‑ the client says no.
BRIAN TRAMMELL: Right. Okay. Okay, then I guess my question is a comment. It would be really interesting, for future work here, to look at the correlation between the loss and the latency signals, because that would give a good answer to the 'is loss useful' question. I think you'd see that loss probably is, in the cases you care about.
ALEXANDER FERRIEUX: There are some cases where we know they are correlated ‑‑ if you think about some bottleneck with a shaper and some RTT build‑up, then there is a telltale sign of future loss, which is RTT. But that is a tiny corner case, and the dominant case is not that at all. The dominant case is completely independent cross traffic, or transmission loss, or faulty equipment and so on, and these do not build up RTT at all.
BRIAN TRAMMELL: I am saying it would be good to see that like actually sort of with these two things on the same inband signal. Right. We have information about that with active measurement. It would be interesting to see that sort of with this passive measurement as well. Probably, we should take this offline. I just realised this when I was looking at the talk again, I haven't seen this stuff for a while but it would be an interesting thing to run.
ALEXANDER FERRIEUX: Okay.
BRIAN TRAMMELL: Thanks.
JAN ZORZ: Okay. Any other questions? Apparently not. We are nine minutes earlier. So ‑‑
FRANZISKA LICHTBLAU: There are things in the chat panel. Usually we don't read those, but somebody is asking: "Thank you for your presentation. Is your proposal somewhere to review?"
ALEXANDER FERRIEUX: Okay. Yes, the IETF draft, which is in the references ‑‑ in that slide there. The first link gives you a detailed explanation, and the presentation about the experiment is the second link.
FRANZISKA LICHTBLAU: Okay. Thank you.
JAN ZORZ: Apparently, we'll have eight more minutes for coffee. Please, everybody, do rate the presentations. This is the only way that we, as a Programme Committee, can get some feedback on how the presenters did, and we also have the elections to the Programme Committee coming up this time, so please vote for your candidates and please put forward some nominations.
FRANZISKA LICHTBLAU: You can send your nomination to pc [at] ripe [dot] net and we would love to get many of them so we can have a diverse and nice committee again.
JAN ZORZ: Yes, so when is the next session? At 1:00, right? I believe we should close here.
FRANZISKA LICHTBLAU: Let's have some coffee.
JAN ZORZ: Unless anybody else has anything else to say, we'll come back at 1:00.
FRANZISKA LICHTBLAU: Okay. So see you in a couple of minutes.
(Coffee break)