Here's a keynote I gave at RubyConf Mini last year: Learning DNS in 10 years.
It's about strategies I use to learn hard things. I just noticed that they'd
released the video the other day, so I'm just posting it now even though I gave
the talk 6 months ago.
Here's the video, as well as the slides and a transcript of (roughly) what I
said in the talk.
the video
the transcript
But this talk is
not really about DNS. I mean, this is a Ruby conference, right? So this talk is
really about learning hard things, and DNS is an example of something that was
hard for me to learn.
It took me maybe 16 years from the first time
that I bought a domain name and set up my DNS records to when I really
felt like I understood how the system worked.
And one thing I want to say at the beginning of this talk is that I think that
taking 16 years to learn something like DNS is kind of normal. The idea
that "I should understand this already" is a bit silly. For me, I was doing
other stuff for most of the 16 years! There was other stuff I wanted to learn.
And so, this talk is not about how you should learn about any particular
thing. I don't care if you learn how DNS works! It's really about
how to approach learning something hard that's a priority for you to learn.
So, we're going to talk about learning through
a series of tiny deep dives. My favourite way of learning things is to do
nothing, most of the time.
That's why it takes 10 years.
So for six months I'll do nothing and then I'll furiously learn something
for maybe 30 minutes or three hours or a day. And then I'll declare
success and go back to doing nothing for months. I find this works really well
for me.
Here are some of the strategies we're going to talk about for doing these tiny deep dives.
First, we're going to start briefly by talking about what DNS is.
Next, we're going to talk about spying on DNS.
Then we're gonna talk about being confused, which is my main mode. (I'm always confused about something!)
Then we'll talk about reading the specification, we're going to
do some experiments, and we're going to implement our own terrible version
of DNS.
And so what's DNS, really briefly? DNS stands for the Domain Name System. And
every time you go to a website like www.example.com,
your browser
needs to look up that website's IP address. So DNS translates
domain names into IP addresses. It looks up other information about domain
names too, but we're mostly just going to talk about IP addresses today.
I want to briefly sell why I think DNS is cool, because we're going to be
talking about it a lot.
One cool thing about DNS is that it's this invisible system that controls the
entire internet.
For example, you're on your phone, you're using Google Maps, it needs to know,
where is maps.google.com, right? Or on your
computer, where's reddit.com? What's the IP address? And if we
didn't have DNS, the entire internet
would collapse.
I think it's fun to learn how this behind-the-scenes stuff works.
The other thing about DNS I find interesting is that it's really old. There's
this document (RFC
1035) which defines how DNS works, that was written in 1987. And if
you take that document and you write a program that works
the way that document says to work, your program will work. And I think
that's kind of wild, right?
The basics haven't changed since before I was born. So if you're a little slow
about learning about it, that's okay: it's not going to change out from under
you.
Next I want to talk about spying on DNS, which is one of my favourite ways to
learn about things.
I'm going to talk about two spy tools for DNS: dig and Wireshark.
dig is a tool for making DNS queries. We talked about, you know, how your
browser needs to look up the IP address for maps.google.com. We
can do that in dig!
When we run dig maps.google.com, it prints out 5 fields. Let's
talk about what these 5 fields are.
I've used example.com instead of maps.google.com on this slide, but the fields
are the same. Let's talk about 4 of them:
We have the domain name, no big deal
The Time To Live, which is how long to cache that record for, so this is one day
You have the record type, A stands for address because this is an IP address
And you have the content, which is the IP address
But I think that the funniest field in a DNS record
is this field in the middle, IN, which stands for INternet. I guess in 1987, they thought that we might be on a lot of
different networks. So they made an option for it. In reality, we're all on the
internet. And every DNS query has class set to "internet". There are a couple of
other query classes (CHAOS and HESIOD), which almost nobody actually uses.
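To make the five fields concrete, here is a minimal sketch that splits one line of dig's answer output into the fields described above. The sample line is hypothetical (the IP is a documentation address, not a real answer), and the `DnsRecord` struct and `parse_dig_answer_line` method are names made up for this illustration.

```ruby
# The five fields of a dig answer line, in the order dig prints them.
DnsRecord = Struct.new(:name, :ttl, :record_class, :type, :content)

# Splits a line like "example.com.  86400  IN  A  192.0.2.1" on whitespace.
def parse_dig_answer_line(line)
  name, ttl, record_class, type, content = line.strip.split(/\s+/, 5)
  DnsRecord.new(name, Integer(ttl), record_class, type, content)
end

# Hypothetical sample line, not real dig output.
record = parse_dig_answer_line("example.com.\t\t86400\tIN\tA\t192.0.2.1")
puts "#{record.name} #{record.type} record, cached for #{record.ttl / 3600} hours"
```

A TTL of 86400 seconds works out to the "one day" mentioned above.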
We can also kind of poke around on the
internet with dig. We've talked about A records to look up IP addresses.
But there are
other kinds of records like TXT records. So we're going to look at a TXT record
really quickly just because I think this is very fun. We're going to look at twitter.com's TXT records.
So TXT records are something that people use for domain verification, for
example to prove to Google that you own twitter.com.
So what you can do is you can set this DNS
record google-site-verification. Google will tell you what to set
it to, you'll set it, and then Google will believe you.
I think it's kind of fun that you can
kind of poke around with DNS and see that Twitter is using
Miro or Canva or Mixpanel, that's all public. It's like a little peek into what
people are doing inside their companies.
Oh, the other thing about dig is that by default, dig's output looks like
this, which is very ugly and unreadable. There's a lot of nonsense here.
So dig has a configuration file (~/.digrc), where you can put +noall +answer
and
then your dig responses look much nicer (like they did in the screenshots
above) instead of having a lot of nonsense in them. Whenever possible, I try to
make my tools behave in a more human way.
The other thing I want to talk about is Wireshark, which
is my favourite computer networking tool in the universe for spying on
all things computer networks. In this case, DNS queries. So let's go look at
Wireshark.
When we make a DNS query like this and look up example.com, Wireshark can capture it.
When you
start looking in the guts of things, I think it can be a bit scary at first. Like
what do all these numbers mean? It kind of seems
like a lot. So when I'm looking at something new, I try to start by looking at stuff
that I understand.
For example, I know that example.com is a domain name,
right? So we should be able to use Wireshark to go find that domain name in the
DNS query. If we click into the "query" part of the DNS packet, we can see 3
fields that we recognize. First, the domain name.
We can also see the type ("A").
And the third one is the class, which
is INternet, which is always the same. What I find comforting here is that in
the query, there are really only 2 important fields: a DNS query is just saying "I'd like
the IP address for example.com". There's just two fields. And that always
makes me feel a little bit better about understanding something.
A quick caveat: your browser might be using encrypted DNS, and spying on your
DNS queries with Wireshark won't work if your DNS is encrypted. But there's
lots of non-encrypted DNS to spy on.
The second thing I want to talk about for learning new things is to
notice when you're confused about something.
I want to tell you a story, "the case of the mysterious caching", of something
that happened to me with DNS that really confused me.
First, I want to talk to you a little
bit more about how DNS works. So on the left here, you have your
browser. And when your browser makes a DNS query, it asks a server called a
resolver. And all you need to know about the resolver is that it's a cache, which
as we know is like the worst thing in computer science. So the resolver is a cache,
and it gets its information from the source of truth, which has the real answers.
So your browser talks to a resolver, which is a cache.
At the time of this story, I had this mental model for how I thought about
DNS, which is that if I set a TTL (the cache time) of 5 minutes when configuring my DNS records,
then I would never have to wait more than 5 minutes. Something you need to
know about me is that I'm a very impatient person. And I hate waiting. So this
model was mostly working for me at the time, though there are a few other very
important caveats that we're not going to get into.
But one day I was setting up a new subdomain for some new project. Let's say it
was new.jvns.ca. So I set it up. I made its DNS records, and I refreshed the
page. And it wasn't working. So I figured, that's fine, my model says I only
have to wait 5 minutes, right? Because that's what I was used to. But I
waited 5 minutes and it still didn't work.
And I was like, oh, no. My mental model was broken! I did not feel good.
And often when this happens to me, and I think for most of us, if something
weird happens with a computer, you let it go, right? You might decide okay, I
don't have time to go into a deep investigation here. I'll just wait longer.
But sometimes I
have a lot of energy, and maybe I'm feeling mad, like "the computer
can't beat me today"! Because there's a reason that this is happening, right? And I
want to find out what it is. So this day, for some reason, I had a lot
of energy.
So I started Googling furiously. And I found a useful comment on Stack
Overflow.
The Stack Overflow comment talked about something called negative caching.
What's that?
And so here's what it said might be going on. The first time I opened the
website (before the DNS records had been set up), the DNS servers returned a
negative answer, saying hey, this domain doesn't exist yet. The code for that is
NXDOMAIN, which is like a 404 for DNS.
And the resolver cached that negative NXDOMAIN response. So the fact that it
didn't exist was cached.
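Here is a toy model of that behaviour, as a sketch only: a resolver cache that remembers "doesn't exist" answers as well as real ones. The `ToyResolver` class, the TTL values, and the 192.0.2.10 address are all assumptions made for illustration, and time is passed in by hand so the example runs instantly.

```ruby
# A toy resolver: a cache in front of a "source of truth" hash of name => IP.
class ToyResolver
  def initialize(authoritative, negative_ttl:, positive_ttl: 300)
    @authoritative = authoritative
    @negative_ttl = negative_ttl   # seconds to remember NXDOMAIN
    @positive_ttl = positive_ttl   # seconds to remember a real answer
    @cache = {}                    # name => [answer, expires_at]
  end

  def lookup(name, now:)
    answer, expires_at = @cache[name]
    return answer if expires_at && now < expires_at  # still cached

    answer = @authoritative.fetch(name, :nxdomain)
    ttl = answer == :nxdomain ? @negative_ttl : @positive_ttl
    @cache[name] = [answer, now + ttl]
    answer
  end
end

records = {}
resolver = ToyResolver.new(records, negative_ttl: 10_800)
p resolver.lookup("new.jvns.ca", now: 0)       # visited before the record exists
records["new.jvns.ca"] = "192.0.2.10"          # the record is created afterwards
p resolver.lookup("new.jvns.ca", now: 300)     # 5 minutes later: still the cached negative answer
p resolver.lookup("new.jvns.ca", now: 10_800)  # negative entry expired: the real answer
```

The middle lookup is the confusing moment from the story: the record exists, but the resolver is still serving its cached NXDOMAIN.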
So my next question was: how long do I have to wait for the cache to expire?
This brings us to another learning technique.
I think maybe the
most upsetting learning technique to me is to read a very boring
technical document. I'm very impatient. I kind of hate
reading boring things. And so when I read something very boring, I like to
bring a specific question. So in this case, I had a specific question, which is
how long do I have to wait for the cache to expire?
In networking, everything has a specification. The boring technical documents
are called RFCs, which stands for "request for comments". I find this name a bit funny,
because for DNS, some of the main RFCs are RFC 1034 and 1035. These were written in 1987,
and the comment period ended in 1987. You can definitely no longer make
comments. But anyway, that's what they're called.
I personally kind of love
RFCs because they're like the ultimate answer to many questions. There's a
great series of HTTP RFCs, 9110 to 9114. DNS actually has a million
different RFCs, it's very upsetting, but the answers are often there. So I went
looking. And I think I went looking because when I read comments on
StackOverflow, I don't always trust them. How do I know if they're accurate? So
I wanted to go to an authoritative source.
So I found this document called RFC 2308. In section 3, it has this very boring
sentence: the TTL of this record is set to the minimum of the MINIMUM field of the
SOA record and the TTL of the SOA itself. It indicates how long a resolver may
cache the negative answer.
So, um, okay, cool. What does that mean, right? Luckily, we only have one
question: I don't need to read the entire boring document. I just need to
analyze this one sentence and figure it out.
So it's saying that the cache time depends on two fields. I want to show you
the actual data it's talking about, the SOA record.
Let's look at what happens when we run dig +all asdfasdfasdfasdfasdf.jvns.ca
It says that the domain doesn't exist, NXDOMAIN. But it also returns this
record called the SOA record, which has some domain metadata. And there are two
fields here that are relevant.
Here. I put this on a slide to try to make it a little bit clearer. This slide
is a bit messed up, but there's this field at the end that's called the MINIMUM
field, and there's the TTL, time to live of the record, that I've tried to
circle.
And what it's saying is that if a record doesn't exist, the amount of time the
resolver should cache "it doesn't exist" for is the minimum of those two numbers.
In this case, both of those numbers are 10800. So that's how long we have to
wait. We have to wait 10,800 seconds. That's 3 hours.
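The arithmetic from that RFC sentence is small enough to write down. This is just a sketch of the rule as quoted above; the method name `negative_cache_seconds` is made up, and the two inputs are the numbers read off the SOA record.

```ruby
# How long a resolver may cache "it doesn't exist":
# the smaller of the SOA record's own TTL and its MINIMUM field.
def negative_cache_seconds(soa_ttl:, soa_minimum:)
  [soa_ttl, soa_minimum].min
end

seconds = negative_cache_seconds(soa_ttl: 10_800, soa_minimum: 10_800)
puts "wait #{seconds} seconds (#{seconds / 3600} hours)"
```

If the two numbers differed, say a TTL of 300 and a MINIMUM of 10800, the wait would be the smaller one, 300 seconds.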
And so I waited three hours and then everything worked. And I found this
kind of fun to know because often if you look up DNS advice it will
say something like, if something has gone wrong, you need to wait 48 hours. And I
do not want to wait 48 hours! I hate waiting. So I love it when I
can use my brain to figure out that I can wait for less time.
Sometimes when I find my mental model is broken, it feels like I don't know
anything.
But in this case, and I think in a lot of cases, there's often just a few
things I'm missing. Like this negative caching thing is kind of weird, but
it really was the one thing I was missing. There are a few more important facts about how
DNS caching works that I haven't mentioned, but I haven't run into more
problems I didn't understand since then. Though I'm sure there's something I
don't know.
So sometimes learning one small thing really can solve all your problems.
I want to say briefly that there's a solution to this negative caching problem.
We talked about how if you visit a domain that doesn't exist, that gets
cached. The solution is that if you haven't set up your domain's DNS, don't visit
the domain! Only visit it after you set it up. So I've learned to do that and
now I almost never have this problem anymore. It's great.
The next thing I want to talk about is doing experiments.
So let's say we want to do some experiments with caching.
I think most people don't want to make experimental changes to their domain
names, because they're worried about breaking something. Which I think is very understandable.
Because I was really into DNS, I wanted to experiment with DNS. And I also
wanted other people to experiment with DNS without having to worry about
breaking something. So I made this little website with my friend, Marie, called
Mess With DNS.
The idea is, if you don't want to do DNS experiments on your domain, you
can do them on my domain. And if you mess something up, it's my problem, it's
not your problem. And there have been no problems, so that's
fine.
So let's use Mess With DNS to do a little DNS experimentation.
The way this works is you get a little subdomain. This one is
chair131.messwithdns.com. And then you can make DNS records on it and try
things out. Here we're making a record for test.chair131.messwithdns.com, with
type A, the IP 7.7.7.7, and TTL 3000 seconds.
What we would expect to see is that if we make a query to the resolver, then it
asks the source of truth, which we control. And we should expect
the resolver to make only one query, because it's cached. So I want to do an
experiment and see if it's true that we get only one query.
So I'm going to make a few queries for it, with dig @1.1.1.1 test.chair131.messwithdns.com.
I've queried it a bunch of times, maybe 10 or 20.
Oh, cool. This isn't what I expected to see. This is fun, though, that's great.
We made about 20 queries for that DNS record. The server logs all queries it
receives, so we can count them.
Our server received 1, 2, 3, 4, 5, 6, 7, 8 queries. That's kind of fun. 8 is less than 20.
One reason I like to do demos live on stage is that sometimes what
happens isn't exactly what I think will happen. When I do this exact experiment
at home, I just get 1 query from the resolver.
So we only saw like eight queries here. And I assume that this is
because the resolver we're talking to, 1.1.1.1, has more than one
independent cache, I guess there are 8 caches. This makes sense to me because
Cloudflare's network is distributed: the exact machines I'm talking to here
in Providence are not the same as the ones in Montreal.
This is interesting because it complicates your idea about how caching works a
little bit, right? Like maybe a given DNS resolver actually has like eight
caches and which one you get is random, and you're not always talking
to the same one. I think that's what's going on here.
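That guess can be sketched as a small simulation. This is a toy model, not a description of how Cloudflare's resolver is actually built: it assumes each query lands on a randomly chosen independent cache, and that the record's TTL outlasts the experiment. The method name `count_upstream_queries` is made up for this illustration.

```ruby
# Counts how many queries reach the authoritative server when client queries
# are spread at random across several independent caches.
def count_upstream_queries(client_queries:, cache_count:, rng: Random.new)
  warm = Array.new(cache_count, false)  # has this cache stored the record yet?
  upstream = 0
  client_queries.times do
    i = rng.rand(cache_count)
    next if warm[i]   # cache hit: the authoritative server sees nothing
    warm[i] = true    # cache miss: ask upstream once, then remember the answer
    upstream += 1
  end
  upstream
end

puts count_upstream_queries(client_queries: 20, cache_count: 1)  # one cache: always 1
puts count_upstream_queries(client_queries: 20, cache_count: 8)  # eight caches: between 1 and 8
```

With a single cache the server only ever sees one query, which matches the "at home" result. With several caches the count can go up to the number of caches, which fits seeing about 8 queries out of 20.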
We can also do the same experiment, but ask Google's resolver, 8.8.8.8, instead
of Cloudflare's resolver.
And we're seeing a similar thing here to what we saw with Cloudflare, there are
maybe 4 independent caches.
We could also do an experiment with negative caching, but no, I'm not going to
do this demo. Sorry. I could just see it going downhill. The problem is that
there's too many different caches, and I really want there to be one cache, but
there's like seven. That's fine, let's move on.
Now I'm going to talk
about my favourite strategy for learning about stuff, which is to
write my own very bad version of the thing. And I want to say that writing my
very bad implementation gives me a really unreasonable amount of confidence.
So you might think that writing DNS software is hard, right? But it's
easier than you might think, as long as you keep your expectations low.
To make DNS queries, the first thing we need to do is
make a network connection. Let's do that.
These 4 lines of Ruby connect to 8.8.8.8, the Google DNS resolver, on UDP
port 53. Now we're like halfway there. So after we've made a connection,
we need to send Google a DNS query. You might be thinking, Julia, I
don't know how to write a DNS query.
But there's no problem. We can copy one from something else that knows what a
DNS query looks like. AKA Wireshark.
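The slide with the connection code isn't reproduced in this transcript, so here is a minimal sketch of what connecting to a resolver over UDP can look like with Ruby's standard `socket` library. The `connect_udp` helper is a made-up name, and this is not necessarily the same four lines that were on the slide.

```ruby
require 'socket'

# Returns a UDP socket whose default destination is host:port.
# UDP is connectionless, so nothing is sent until we call `send` on it.
def connect_udp(host, port)
  sock = UDPSocket.new
  sock.connect(host, port)
  sock
end

# Usage (not run here): sock = connect_udp('8.8.8.8', 53)
```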
So if I right click on this DNS query, it's very small, but I'm clicking on
"copy", and then "copy as hex stream".
You might not know what this means yet, but this is a DNS query. And
you might think that like, hey, you can't just copy and paste something and
then send the exact same thing and it'll reply, but you can. And it works.
Here's what the code looks like to send this hex string we copied and pasted to 8.8.8.8.
So we take this hex string that we copied and pasted, paste it into our
tiny Ruby program, and use `.pack` to convert it into a string of bytes and send it.
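A sketch of that step, under some assumptions: the hex string below is reconstructed from the fields described in this talk (query ID b962, one question, example.com, type 1, class 1) with the flags assumed to be 0x0100, so it may not be byte-for-byte what was on the slide. The helper names `hex_to_bytes` and `send_query` are made up, and `send_query` is defined but not called here because it needs network access.

```ruby
require 'socket'

# Hex stream for an A-record query for example.com (reconstructed, see above).
HEX_QUERY = 'b96201000001000000000000076578616d706c6503636f6d0000010001'

# 'H*' turns each pair of hex digits into one raw byte.
def hex_to_bytes(hex)
  [hex].pack('H*')
end

# Sends the raw query over UDP and returns the raw response bytes.
def send_query(bytes, server: '8.8.8.8', port: 53)
  sock = UDPSocket.new
  sock.connect(server, port)
  sock.send(bytes, 0)
  response, _sender = sock.recvfrom(1024)
  response
ensure
  sock&.close
end

query = hex_to_bytes(HEX_QUERY)
puts "#{query.bytesize} bytes ready to send"
```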
Now we run the Ruby program.
Let's go to Wireshark and look for the packet we just sent. And we can see it there! There's some other noise in between, so I'm going to stop the capture.
We can see that it's the same packet because the query ID matches, B962.
So we sent a query to Google's server and we got a response, right? It
was like, this is totally legitimate. There's no problem. It doesn't know that we copied and pasted it and that we don't know what it means!
But we do want to know what this means, right? And so we're going to take this hex string and split it into 2 parts.
The first part is the header. And the second part is the
question, which contains the actual domain name we're looking up.
We're going to see how to construct these in Ruby, but first
I want to talk about what a byte is for
one second. So this (b9) is the hexadecimal representation of a byte. The way
I like to figure out what that means is to just type it into IRB: if
you type in 0xB9 it'll print out 185, so this is the number 185.
So the header is 12 bytes.
These 12 bytes correspond to
six numbers, which are two bytes each. So the first number is
b962,
which is the query ID. The next number is the flags, which
basically in this case means this is a query, like hello, I have a
question. And then there's 4 more numbers: the number of questions, and then
the number of answers in each of three answer sections. We do not have any answers. We only have a question. So
we're saying, hello, I have one question. That's what the header means.
And the way that we can do this in Ruby is we can make a little array that has the
query ID, and then these numbers which
correspond to the other header fields: the flags, then 1 for 1
question, and then three zeroes for each of the three sections of answers.
And then we need to tell Ruby to take these six numbers and
represent them as bytes. So n here means each
of these is supposed to be represented as two bytes, and it also means to use big endian byte order.
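The header described above can be sketched like this. The method name `make_question_header` is made up, and the flags value 0x0100 (a standard query with "recursion desired" set) is an assumption, since the transcript only says the flags mean "this is a query".

```ruby
# Builds the 12-byte header: six numbers, each packed as 2 big-endian bytes ('n').
def make_question_header(query_id)
  flags = 0x0100  # assumed value: standard query, recursion desired
  # query ID, flags, 1 question, then zero for each of the three answer sections
  [query_id, flags, 1, 0, 0, 0].pack('nnnnnn')
end

header = make_question_header(0xb962)
puts header.bytesize       # 12
puts header.unpack1('H*')  # b96201000001000000000000
```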
Now let's talk about the question.
I broke up the question section here. There are two parts
you might recognize from example.com:
there's example, and com.
The way it works is that first you have a number (like 7), and then a
7-character string, like "example". The number tells you how many characters to
expect in each part of the domain name. So it's 7, example, 3, com, 0.
And then at the end, you
have two more fields for the type and the class. Class 1 is code for
"internet". And type 1 is code for "IP address", because we want to look up the
IP address.
So we can write a little bit of code to do this. If we want to translate
example.com into 7 example 3 com 0, we can split the domain on
a dot and then get each part's length and concatenate that together and put a 0 on
the end. It's just a little bit of Ruby to encode a domain name.
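A sketch of that encoding step (the method name `encode_domain_name` is made up, and the slide's own code isn't in the transcript, so the details here may differ from it):

```ruby
# Encodes "example.com" as 7 example 3 com 0:
# each part is prefixed with its length as a single byte, then a 0 byte ends the name.
def encode_domain_name(domain)
  domain.split('.').map { |label| [label.bytesize].pack('C') + label.b }.join + "\0".b
end

puts encode_domain_name('example.com').unpack1('H*')
# 076578616d706c6503636f6d00
```

In the hex output, 07 is followed by the 7 bytes of "example", 03 by the 3 bytes of "com", and 00 marks the end.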
And then we can wrap all this up
together: we make a random query ID, then we make
the header, encode the domain name, and then we add the type
and the class, 1 and 1, and then we can just
concatenate everything together and that's our query.
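Putting the pieces together might look like the following sketch. The helper names and the 0x0100 flags value are assumptions, as before; the helpers are repeated so the block runs on its own.

```ruby
TYPE_A = 1    # "IP address"
CLASS_IN = 1  # "internet"

def make_question_header(query_id)
  # query ID, flags (assumed 0x0100), 1 question, zero for each answer section
  [query_id, 0x0100, 1, 0, 0, 0].pack('nnnnnn')
end

def encode_domain_name(domain)
  domain.split('.').map { |label| [label.bytesize].pack('C') + label.b }.join + "\0".b
end

# Header + encoded name + type + class, all concatenated into one byte string.
def make_dns_query(domain, type: TYPE_A, query_id: rand(65_536))
  make_question_header(query_id) +
    encode_domain_name(domain) +
    [type, CLASS_IN].pack('nn')
end

puts make_dns_query('example.com', query_id: 0xb962).unpack1('H*')
# b96201000001000000000000076578616d706c6503636f6d0000010001
```

The query ID is fixed here only so the output is reproducible; leaving it out picks a random 16-bit ID, as described above.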
There's definitely more work to do here to print out the response, but I wrote
a 120-line Ruby script that parses the response too, and I want to show you a quick demo of it working.
What domain should we look up? rubyconfmini.com. All right, let's do it. Hey, it works!
We're at the end! Let's do a recap.
Okay. Let's go over the ways we've talked about learning things!
First, spy on it. I find that when I look at things
to see what's really happening under the hood, and when I look at
what's in the bytes, you know, what's going on? It's often not as
complicated as I think. Like, oh, there's just the domain name and the
type. It really makes me feel a lot more confident that I understand that thing.
I try to notice when I'm confused, and I want to say again that
noticing when you're confused is something that we don't
always have time for, right? It's something to do when you have the energy. For
example there's this weird DNS query I saw in one of the demos today that I
don't understand, but I ignored it because, well, I'm giving a talk. But maybe one day I'll feel like looking at it.
We talked about reading the specification, which, there are few times I feel
more powerful than when I'm in a discussion with someone, and I KNOW that I have the right answer because, well, I read the specification!
It's a really nice way to feel sure.
I love to do experiments to check that my understanding of stuff is right. And
often I learn that my understanding of something is wrong! I had an example in
this talk that I was going to include and I did an experiment to check that
that example was true, and it wasn't! And now I know that. I love that
experiments on computers are very fast and cheap and usually have no
consequences.
And then the last thing we talked about, and truly my favourite, but the most
work, is implementing your own terrible version. For me,
the confidence I get from writing a terrible DNS implementation that works
on 11 different domains is unmatched. If my thing works at all, I feel like,
wow, you can't tell me that I don't know how DNS works! I implemented it! And
it doesn't matter if my implementation is "bad" because I know that it works!
I've tested it. I've seen it with my own eyes. And I think that just feels
amazing. And there are also no consequences because you're never going to run
it in production. So it doesn't matter if it's terrible. It just exists to give
you huge amounts of confidence in yourself. And I think that's really nice.
That's all for me. Thank you for listening.
thanks to the organizers!
Thanks to the RubyConf Mini organizers for doing such a great job with the
conference – it was the first conference I'd been to since 2019, and I had a
great time.
a quick plug for "How DNS Works"
If you liked this talk and want to spend less than 10 years learning about
how DNS works, I spent 6 months condensing everything I know about DNS into 28
pages. It's here and you can get it for $12: How DNS Works.