BOLD 2003: Development and the Internet
PART 2 - Follow the Header (or "Around the World in 900 Milliseconds")
To help you keep everything straight in this section, we suggest that you refer to this handy network diagram that depicts the transactions discussed below. You might want to print the diagram, or, if you're reading online, open a new browser window with it.
Andrew is in Ulaanbaatar, Mongolia, relaxing with a cup of delicious airag (fermented mare's milk) and checking email at the Chingis Khan cybercafe. He's carrying his laptop, which he's attached to the Ethernet hub in the cafe.
The first thing Andrew's laptop has to do is to get an IP address from the local network.
As soon as he plugs it into the Ethernet hub, Andrew's machine automatically requests an IP address from the Windows NT "gateway" server at the cafe, using a protocol called DHCP. [Note 4] The gateway machine is connected to a local Internet service provider (ISP), Magicnet, via a 36kbit per second dial-up modem. The gateway machine has a unique IP assigned by Magicnet's DHCP server: 184.108.40.206. The gateway is using Network Address Translation (NAT) to assign a temporary, non-public IP address to each machine at the cafe, including Andrew's machine. In other words, because the gateway uses NAT, Andrew's laptop thinks it has the IP address 192.168.0.5; the rest of the world thinks that Andrew's machine has the gateway machine's public IP address. The gateway does the translation from public IP address to non-public IP address: that is, the gateway machine receives all the Internet traffic for all users at the cybercafe using its single public IP address, and then distributes the appropriate packets to the various machines at the cafe that requested them, according to the non-public IP addresses the gateway assigned via DHCP.
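The translation step described above can be sketched in a few lines of Python. This is a toy model, not the cafe gateway's actual software. It assumes port-based translation, which the text does not spell out. The class name `NatGateway`, the starting port number, and the public address 203.0.113.1 (a placeholder from the range reserved for documentation) are all hypothetical.

```python
class NatGateway:
    """Toy port-based NAT: many private hosts share one public address."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.outbound = {}  # (private_ip, private_port) -> public_port
        self.inbound = {}   # public_port -> (private_ip, private_port)

    def translate_out(self, private_ip, private_port):
        """Rewrite the source of an outgoing packet to the public address."""
        key = (private_ip, private_port)
        if key not in self.outbound:
            # First packet from this host/port: reserve a public port for it.
            self.outbound[key] = self.next_port
            self.inbound[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.outbound[key]

    def translate_in(self, public_port):
        """Return the private host a reply belongs to, or None if unknown."""
        return self.inbound.get(public_port)


# Placeholder public address; 192.168.0.5 is the laptop's address in the text.
gateway = NatGateway("203.0.113.1")
src = gateway.translate_out("192.168.0.5", 1025)
reply_target = gateway.translate_in(src[1])
```

Here `src` is what the outside world sees as the sender, and `reply_target` is how the gateway finds its way back to Andrew's laptop when a reply arrives.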
Next, Andrew's laptop has to convert his email into packets and send them over the Internet to his mailserver. At that point, Andrew's mailserver will attempt to deliver the packets to their destination.
On his laptop, Andrew is running Microsoft Outlook, a standard email program that supports three email-related protocols: SMTP, POP3 and IMAP. SMTP -- the Simple Mail Transfer Protocol -- is a protocol used for sending mail from one machine to another. When Andrew types a message to Ethan's email address, Outlook sends a series of SMTP commands to his mailserver, a machine at Harvard Law School called cyber-mail.law.harvard.edu. When Andrew hits send, his laptop must first break the email message down into a set of packets and then use SMTP to send them to cyber-mail with instructions about where they should ultimately be delivered. While Outlook is smart enough to format this message into valid SMTP commands, it leans on part of the Windows operating system -- the TCP/IP stack -- to translate the SMTP messages into valid IP packets.
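To make "a series of SMTP commands" a little more concrete, here is a minimal sketch of the command lines a mail client might send for one message. It only builds the text and does not touch the network. The function name `smtp_commands`, the host name, and the two email addresses are hypothetical placeholders, not Andrew's or Ethan's real details.

```python
def smtp_commands(client_host, sender, recipient, body):
    """Return the command lines a client would send for one message."""
    commands = [
        f"HELO {client_host}",      # introduce ourselves to the mailserver
        f"MAIL FROM:<{sender}>",    # who the message is from
        f"RCPT TO:<{recipient}>",   # where it should ultimately be delivered
        "DATA",                     # the message text follows
    ]
    for line in body.splitlines():
        # A leading dot is doubled so it is not mistaken for end-of-message.
        commands.append("." + line if line.startswith(".") else line)
    commands.append(".")            # a lone dot ends the message text
    commands.append("QUIT")
    return commands


# Hypothetical host and addresses, for illustration only.
example = smtp_commands(
    "laptop.example", "andrew@example.edu", "ethan@example.org",
    "Greetings from Ulaanbaatar!",
)
```

In a real session the client waits for the server's numbered reply after each command before sending the next one; the sketch leaves that back-and-forth out.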
Andrew's packets go to the gateway machine through the Ethernet connection, to the Magicnet server via a modem, and then through a gateway machine at Magicnet. In the next seven tenths of a second, they take an epic journey through 23 machines in Mongolia, China, Hong Kong, San Jose, New York, Washington DC and Boston.
Here's the itinerary for Andrew's intrepid packets -- the path from his laptop to his Harvard mailserver:
1 cobalt03.mn (Datacomm Mongolia) (22.214.171.124)
These twenty-three computers are routers. A router's job is extremely simple - it moves packets to the next machine in the right direction as quickly as possible. Because all they do is move packets, routers are able to process millions of packets a minute. Each router has a "routing table", which is a set of rules that determine which next machine to forward packets to, based on the final destination of a packet. The actual construction of routing tables is a fascinating subject, far beyond the scope of this discussion - an excellent introduction from networking engineers at Agilent is available here.
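A routing table lookup can be sketched as follows. The sketch assumes longest-prefix matching, the usual rule for IP forwarding: among all the entries that cover the destination, the most specific one wins. The function name `next_hop`, the table entries, and the hop names are hypothetical, and the addresses come from ranges reserved for documentation.

```python
import ipaddress


def next_hop(routing_table, destination):
    """Pick the next hop for a destination using longest-prefix match."""
    dest = ipaddress.ip_address(destination)
    best = None  # (network, hop) of the most specific match so far
    for prefix, hop in routing_table:
        network = ipaddress.ip_network(prefix)
        if dest in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, hop)
    return best[1] if best else None


# Hypothetical table: a default route plus two more specific routes.
table = [
    ("0.0.0.0/0", "upstream-provider"),
    ("198.51.100.0/24", "customer-a"),
    ("198.51.100.128/25", "customer-b"),
]
```

Real routers do the same job with specialized data structures and hardware, which is how they keep up with the packet rates mentioned above; the linear scan here is only for readability.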
Let's unpack this itinerary.
Router #1, cobalt03.mn, is one of the computers Magicnet uses to route traffic out of Mongolia. Cobalt03 is attached to a high-capacity phone line that connects Ulaanbaatar and China Satnet's NOC (Network Operations Center) in Beijing. China Satnet is a Network Service Provider, a company that sells Internet capacity to Internet Service Providers, like Magicnet. Network Service Providers, in turn, buy connectivity from global backbone providers, the companies that operate the huge fiber-optic cables that link continents together. China Satnet routes packets from router #2, which handles traffic to and from Mongolia, through routers #3 and #4 to router #5, all of which are on Satnet's network. Router #5 routes traffic between Satnet and Opentransit, the backbone arm of France Telecom. Opentransit sees that the packets need to get to the US, specifically to a network served by Qwest, and calculates a sensible route for the packets. On Opentransit's network, the packets head through Hong Kong (#6, #7), across the Pacific to San Jose (#8, #9), across the continent to New York (#10, #11) and then to computer #12 in Ashburn, Virginia.
Routers #12 and #13 are worth special note. They dwell in a building owned by Equinix, a company that specializes in network-neutral Internet peering exchanges. Network service and backbone providers need to transfer data from one network to another. Historically, in the United States, this happened at places called Metropolitan Area Exchanges (MAEs), where dozens of networks terminated their lines and transferred packets. As the Net grew, the MAEs grew unwieldy - the amounts of data that needed to be exchanged overwhelmed the capacity of the MAE switches and led to very slow data transfer. More importantly, large network providers quickly learned that MAEs put them at an economic disadvantage. Imagine that a tiny Internet provider - Joe's Internet Backbone (JIB) - wants to connect to MCI WorldCom at a MAE. There are a few hundred computers attached to Joe's backbone; there are several million attached to the MCI backbone. It's significantly more likely that a user of Joe's network will want to reach a site on the MCI network than vice versa. As a result, if MCI connects to JIB, it will end up carrying most of Joe's traffic, and absorbing (without compensation) the costs of getting Joe's traffic to its destination.
To avoid the congestion at the MAEs and to escape the MCI/JIB situation, network providers started moving to a model called "private peering". In private peering, two networks agree to interconnect -- meaning that they agree to establish a direct link between their two networks. They agree on a place to put their routers, they each buy machines and connect them via fiber optic cable or gigabit Ethernet. And they usually strike a financial deal that compensates the larger network for its larger costs of carriage. If networks have a similar number of users, they might decide to interconnect without exchanging money; if one network is substantially smaller, it might have to pay several millions of dollars for the privilege of interconnecting with the larger. Network providers work extremely hard to keep the financial details of their peering arrangements secret, so it is often very difficult to know who's paying whom and how much.
Both routers #12 and #13 are located at an Equinix facility in Ashburn, Virginia. Router #12 is owned by Opentransit; Router #13 is owned by Qwest. So as of router #13, we're now on Qwest's network. The packets fly through a set of Qwest machines in the Washington DC area (#13, #14, #15), near JFK airport in New York City (#16, #17), and then Boston (#18, #19). These machines have the word "core" in their names, implying they are core nodes in Qwest's network -- tremendously fast computers attached to enormous communications lines. Router #19 is an "edge" machine - our packets are now ready to get off the Qwest backbone and to be routed to a connected network in the Boston area. Router #20 is owned by Harvard University and interconnects the Harvard and Qwest networks. In a very real sense, Harvard and Qwest are interconnected at this point much the way Qwest and Opentransit were connected in Virginia. However, Harvard isn't a peer to Qwest - it doesn't run its own backbone extending anywhere beyond Cambridge, Mass. -- and hence Harvard acts like a customer and bears all the costs associated with the connectivity and with the "peering" point. Harvard does have an awfully big network, though, and distinguishes between its edge machines (#21) and core machines (#22, #23).
Let's check the stopwatch -- those last four paragraphs? Seven tenths of a second. The duration of a single human heartbeat.
Machine #24 in this chain is also an edge machine, cyber-mail.law.harvard.edu. Unlike the core routers on the network, this machine has numerous jobs beyond the forwarding of packets. One is to run a mailserver, a piece of software that receives and distributes incoming email to users, and routes outgoing email to other mailservers. When Andrew's packets are received by the mailserver, it determines that Andrew's email is addressed to Ethan and needs to be sent to the mailserver named geekcorps.org. Accordingly, it starts sending IP packets to geekcorps.org. Think of it as the two mailservers (cyber-mail.law.harvard.edu and geekcorps.org) striking up a conversation:
geekcorps.org: 220 SMTP Service Ready
Downright mannerly, isn't it? Keep in mind that each of those messages is contained within an IP packet. And each IP packet has to wend its complex way from cyber-mail.law.harvard.edu to geekcorps.org. This connection spans 18 computers and three networks, and takes 12 hundredths of a second.
1 core-nw-gw-vl216.fas.harvard.edu (126.96.36.199)
The three asterisks from router #18 signify a machine that failed to identify itself through traceroute, in this case, a Verio router in Sterling, Virginia, where the Verio data center is located. The Geekcorps machine is actually a small part of a large server owned by Verio; that single machine provides web, ftp and mail services for several dozen separate domain names. The mailserver on the Verio machine receives the email from Andrew and appends it to a "spool file", a text file containing all the uncollected email for a particular user. The mailserver now considers its job done - it couldn't care less whether Ethan retrieves the mail, so long as it's been correctly spooled.
So much for Andrew's email -- in its packetized IP form, it traversed something like 45 different machines in at least four countries before reaching Ethan's mailserver.
Having finished a tasty meal of banku and okra stew, Ethan is ready to check his email at the Geekcorps office in Accra, the capital of Ghana. In contrast with the fairly straightforward path of Andrew's packets (laptop to Harvard mailserver to Geekcorps mailserver), Ethan's take a somewhat convoluted journey. Instead of pointing a mail client like Outlook at his mail account on Geekcorps, he points a web browser at Hotmail. His web browser speaks a protocol called HTTP, hypertext transfer protocol, which is used most commonly to transfer web pages and web data from server to client. When Ethan types www.hotmail.com into his browser, his computer actually sends a message that looks like this to the Hotmail server:
The server responds with something like this:
HTTP/1.1 200 OK
Once again, these polite little exchanges are taking place via IP packets routed around the world. A reasonable guess for the routing of these packets suggests approximately 18 hops in eight tenths of a second. [Note 7] Because a webpage is built of several files - the HTML file and the associated image files - and because many of these files are too big to fit in just one packet, there are dozens of transactions involved with assembling a webpage, meaning it can take several seconds to load. Following is the likely routing of packets from the Hotmail webserver (located on the Microsoft Network) to Ethan's office in Accra, Ghana:
1. vlan701.law13-msfc-b.us.msn.net (188.8.131.52)
Ethan navigates through Hotmail and sends a request (using a new command: POST) to the Hotmail server to check his email at Geekcorps. A small program running on the Hotmail server reads the parameters that Ethan has set earlier (using other HTTP POST commands) and composes a message using the POP3 protocol to the mailserver running on geekcorps.org. They, too, have a polite little exchange:
Geekcorps: +OK POP3 server ready < geekcorps.org >
And yes, again, all these polite conversations are carried out through IP packets transmitted through thirteen machines in under a hundredth of a second. Following is an educated guess at the path the packets might take to get from geekcorps.org to hotmail.com:
1. www.geekcorps.org (184.108.40.206)
Not by a long shot. For the sake of simplicity, we've been ignoring an important part of the process -- the translation of hostnames to IP addresses. When Andrew sends an email to Ethan's address at geekcorps.org, his computer needs to know the IP address for the destination machine, namely, the mailserver at geekcorps.org. His computer doesn't automatically know that geekcorps.org is located at 220.127.116.11. It needs to send a query to the Domain Name System (DNS) to determine what IP address is currently associated with geekcorps.org.
Deployed in the late 1980s, the DNS is a highly distributed Internet directory service that allows the use of easy-to-remember domain names instead of numerical IP addresses. Domain names are used to identify connected computers ("hosts") and services on the Internet. For the Internet's routing systems, domain names are completely useless -- it's the IP address that tells a router the destination of a given packet. But human Internet users need identifiers that they can readily remember. The DNS naming hierarchy starts with the top-level domains, such as .com, .net, .gov, .info, .jp, .ca, .cn, .mn, .gh, etc. There are approximately 258 top-level domains, 15 of which are generic three-or-more-letter strings, and 243 of which are country- or territory-specific two-letter strings. Nearly all of the top-level domains are operated independently by organizations called "registries," located all over the world.
Domain names are written as strings of characters (or "labels") separated by dots: for example, cyber.law.harvard.edu is the domain name for the Berkman Center. As we've said, the DNS is organized hierarchically, meaning that the holder of a domain name has the ability to create as many sub-domains as she/he chooses; likewise, the holders of those sub-domains can create whatever sub-domains they choose, and so forth. Thus, each domain name label potentially represents a different holder of authority and a different registrant. In other words, using the example of the Berkman Center's domain name, .edu is the top-level domain, representing the registry for educational institutions; .harvard.edu is the second-level domain, registered by Harvard University; .law.harvard.edu is a third-level domain, delegated by Harvard University to Harvard Law School; cyber.law.harvard.edu is a fourth-level domain, delegated by Harvard Law School to the Berkman Center. In DNS parlance, each of these labels or levels is called a "zone", defined to include all of the sub-domains beneath it -- thus, you can speak of the .edu zone (which includes all second-level domains under .edu, and their sub-domains), the harvard.edu zone (which includes all third-level domains under harvard.edu, and their sub-domains), the law.harvard.edu zone, the cyber.law.harvard.edu zone, and so forth.
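The nesting of zones can be read directly off a domain name by peeling labels from the right. The short sketch below does exactly that; the function name `enclosing_zones` is a hypothetical helper, not part of any DNS library.

```python
def enclosing_zones(domain_name):
    """List the zones a name falls under, from the top-level domain down."""
    labels = domain_name.strip(".").split(".")
    # Build progressively longer suffixes: "edu", "harvard.edu", and so on.
    return [".".join(labels[i:]) for i in range(len(labels) - 1, -1, -1)]


zones = enclosing_zones("cyber.law.harvard.edu")
# zones == ["edu", "harvard.edu", "law.harvard.edu", "cyber.law.harvard.edu"]
```

Each entry in the list corresponds to one potential holder of authority in the example above: the .edu registry, Harvard University, Harvard Law School, and the Berkman Center.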
One of the really great things about the DNS is its distributed nature -- naming authority is spread far and wide throughout the system, and up and down the hierarchy -- which means that each domain name registrant has the flexibility to change the IP addresses associated with its domain name any time she/he wants. This means that Google can change the IP addresses associated with google.com whenever it wants or needs to, without waiting for anyone's permission, and without Internet users noticing the difference.
To resolve a domain name into an IP address, your computer needs to find the nameserver that is authoritative for that domain name. (A nameserver is an Internet computer that has both the software and the data -- one or more zone files -- to be able to translate domain names to IP addresses). To find the relevant authoritative nameserver, though, your computer must first work its way down the DNS hierarchy. At the very top of the DNS is a list of the 258 top-level domains, known as the "root zone file." The root zone file is published to the Internet continuously through a system of 13 DNS root nameservers located around the world -- two are in Europe, one in Japan, and the rest in the United States. They are labeled with letters, A to M. One of the amazing things about the DNS is the fact that the DNS root nameservers are all still run by different volunteer organizations -- universities, government agencies, non-profit consortia, and huge networking corporations. Without them, the Internet, as we know it, would come to a screeching halt. Fortunately, the DNS root nameserver system is designed with lots and lots of redundancy, so that very bad things could happen to most of the DNS root nameservers without any noticeable effect on the operation of the Internet.
The root nameservers maintain identical, synchronized copies of the root zone file, which tells your computer where to find the authoritative nameservers for the 258 top-level domains. In turn, each top-level domain registry maintains a top-level domain zone file, which is a list of all the second-level domains and the names and IP addresses of their authoritative nameservers. For example, the .com registry is a huge online database of domain names and the names and IP addresses of the nameservers that are authoritative for them. Once you have found the authoritative nameserver for the domain name you want to resolve, you query that nameserver for the IP address associated with that domain name. The nameserver will give you the answer, and your computer will proceed to use the IP address as the destination for your communication.
Back to Andrew. When Andrew's laptop obtained an IP address from the cybercafe gateway via DHCP, it was also assigned a pair of local DNS nameservers. In order to translate geekcorps.org into an IP address, his laptop first sends a DNS lookup request to one of these assigned nameservers (there are two for redundancy, in case one server is unavailable). If the DNS nameserver has previously looked up the IP address for geekcorps.org, the answer will be "cached", that is, stored in a local table for quick lookup.
If the address isn't cached locally, Andrew's DNS nameserver needs to venture forth into the DNS to find it. To simplify just a bit, Andrew's DNS nameserver first sends a query to the closest DNS root nameserver (M, in Tokyo), asking it for the authoritative nameservers for the .org top-level domain. M responds with a list of the nameservers for .org. Andrew's DNS nameserver then sends a query to one of the .org nameservers (a7.nstld.com, located at 18.104.22.168), asking it for the authoritative nameservers for geekcorps.org. That nameserver responds with a list of two nameservers for geekcorps.org, including ns11a.verio-web.com, located at 22.214.171.124. Andrew's DNS nameserver now has the location of the nameserver that is authoritative for geekcorps.org!
Andrew's DNS nameserver now queries the Verio nameserver, which reports that geekcorps.org -- at the moment -- is associated with 126.96.36.199. Andrew's DNS nameserver caches the IP address associated with geekcorps.org so that it doesn't have to perform another series of lookups immediately afterwards. The cache expires fairly quickly, though, so that the registrant of the geekcorps.org domain can change its IP address on short notice.
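The root-to-registry-to-authoritative walk, plus the expiring cache, can be sketched as a small simulation. Nothing here touches the real DNS. The three dictionaries stand in for the root, .org and Verio nameservers; the server labels and the address 192.0.2.10 are hypothetical placeholders; and the class name `CachingResolver` is invented for illustration. One further simplification: in the real DNS the cache lifetime is set by the zone's own records, whereas this sketch uses a single fixed lifetime.

```python
import time

# Hypothetical delegation data standing in for real nameservers.
ROOT = {"org": "org-nameserver"}
TLD_SERVERS = {"org-nameserver": {"geekcorps.org": "verio-nameserver"}}
AUTHORITATIVE = {"verio-nameserver": {"geekcorps.org": "192.0.2.10"}}


class CachingResolver:
    """Toy resolver: walks root -> TLD -> authoritative, then caches."""

    def __init__(self, ttl_seconds=300, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.cache = {}    # name -> (address, expiry time)
        self.queries = 0   # how many simulated queries were sent

    def resolve(self, name):
        cached = self.cache.get(name)
        if cached and cached[1] > self.clock():
            return cached[0]                      # still fresh: no queries
        tld = name.rsplit(".", 1)[-1]
        tld_server = self._ask(ROOT, tld)         # 1. ask a root nameserver
        auth_server = self._ask(TLD_SERVERS[tld_server], name)  # 2. ask .org
        address = self._ask(AUTHORITATIVE[auth_server], name)   # 3. ask Verio
        self.cache[name] = (address, self.clock() + self.ttl)
        return address

    def _ask(self, server_data, question):
        self.queries += 1
        return server_data[question]


resolver = CachingResolver(ttl_seconds=60)
first = resolver.resolve("geekcorps.org")    # three simulated queries
second = resolver.resolve("geekcorps.org")   # answered from the cache
```

The second lookup costs nothing because the answer is still in the cache; once the lifetime runs out, the resolver repeats all three queries and picks up any change the registrant has made.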
And yes, as you've guessed, all these DNS queries are polite, well-mannered exchanges carried out through IP packets, all of which need to be routed across the Internet. Count all the machines involved with these DNS lookups and Ethan and Andrew's simple exchange of email may well involve over 100 computers in 6 countries.