The Internet—with the upper case "I"—works in mysterious ways. By definition, the Internet refers to the global network of physically or remotely connected computers. It is basically one big graph of connected computers, where each node represents a single computer and each edge represents the (physical or wireless) connection between two computer nodes.
NOTE: This is not to be confused with the World Wide Web. The Web refers to the interactive platform that the Internet enables. The Internet simply facilitates connections between computers all around the world. Only once a computer joins the "interconnected network" of computers (hence the term "Internet") can it access the Web, the interactive platform on which websites, services, data, and information are virtually located. In other words, the Internet is essentially the highway to the World Wide Web.
The connection between computers is what makes the Internet such a powerful technology. Before the Internet, computers were standalone devices that did not care about anything but their own hardware.
With the advent of the Internet, computers suddenly had the ability to communicate with each other. They could finally send files and emails without the use of some external storage hardware such as a floppy disk.
Although computers had to be physically connected via a mess of cumbersome wires, it was nonetheless a revolutionary development in computer technology because it paved the way to the notion of computers working together towards a common goal.
Nowadays, the Internet is more ubiquitous than ever, perhaps even pervasive in some aspects, but that is a discussion for another day. Thanks to the invention of wireless connections through Wi-Fi, the Internet has become more accessible than ever. It is now so easy to use that it has cemented itself into daily life, to the point that it has become a necessity for the daily functioning of society itself.
Because the Internet plays such a significant role in daily life, it is important to understand the basics of how it works. That is the objective of this article and the succeeding installments of this series.
Beginners in web development often ignore the fundamentals in favor of "trendy" and "flashy" technologies such as React, Node.js, ES6 Promises, and Flask. To fill in the gaps in knowledge, this series will serve as a basic, high-level overview of fundamental concepts in web development across its many "sub-disciplines", including (but not limited to) front-end development, back-end development, networking, security, and workflows. Of course, the pedantic jargon and nitty-gritty details will have to be omitted in order to stay true to the heart of the topic. Analogies may also be employed, albeit rather roughly.
With that said, the first article in this series, How the Internet Works, will discuss the two core stakeholders of an Internet connection to the World Wide Web: the server and the client.
A common misconception about servers is that they are mostly large-scale mainframes stored in large rooms that cost a lot to maintain. Although this is true to some extent, servers are not always large supercomputers, as often portrayed in Hollywood films and popular media. In fact, a server can be any computer as long as it can accept and establish connections with other computers. A server can even be as simple as a smartphone, which would be rather inefficient but nonetheless feasible.
NOTE: The word "computer" is loosely used here because a "computer" nowadays can arguably refer to a plethora of devices: PCs, laptops, tablets, Raspberry Pis, smartphones, and smart-appliances ("Internet of Things" devices) to name a few.
At its very core, the Internet connects two computers together. One of the two computers "owns" or "hosts" some data while the other "receives" that data (upon "requesting" it). The server literally serves the client. Indeed, that is the basic relationship of the server and the client. The server owns some data, resource, or service while the client requests to retrieve it.
To put it more concretely, the client is typically the user of a website, whereas the server serves the files of a website. The two computers essentially communicate with each other. It takes two to tango. 🕺💃
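The relationship described above can be tried out on a single machine. The following is a minimal sketch, not a production setup: it uses Python's standard library to start a tiny server on the local machine and then acts as the client that requests a page from it. The page content, the `HomePageHandler` class, and the `fetch_home_page` function are all made up for illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

HOME_PAGE = b"<h1>Hello from the server!</h1>"  # hypothetical page content


class HomePageHandler(BaseHTTPRequestHandler):
    """Server side: owns the data and serves it upon request."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(HOME_PAGE)))
        self.end_headers()
        self.wfile.write(HOME_PAGE)

    def log_message(self, format, *args):
        pass  # silence the default request logging


def fetch_home_page():
    """Start a local server, act as the client, and return (status, body)."""
    # Port 0 asks the operating system for any free port.
    server = HTTPServer(("127.0.0.1", 0), HomePageHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # Client side: request the resource and read the response.
        connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        connection.request("GET", "/")
        response = connection.getresponse()
        body = response.read()
        connection.close()
        return response.status, body
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    status, body = fetch_home_page()
    print(status, body.decode())
```

Here both roles live on the same computer, which is enough to show that "server" and "client" describe what a program does rather than how big the machine is.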
Now it is time to discuss what exactly happens when two computers on the Internet seek to communicate with each other. For example, let's say that a person wishes to access the DEV website using their favorite Internet browser. That person is now considered to be the client.[1] Naturally, the client wishes to request the home page of DEV from one of the servers that host the entire DEV website.
NOTE: A web page and a website are two different ideas. A web page refers to the document being visually displayed to the user through an Internet browser. On the other hand, a website can refer to a bunch of related web pages, services, and platforms.
When the user tells the browser to visit the DEV website, the browser generates a small text file that contains the instructions for retrieving the home page. Over the Internet, the browser sends this text file to a DEV server. After the server processes the instructions, it sends the home page back to the browser over the Internet.
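In practice, the "small text file" is an HTTP request: a few lines of plain text. The sketch below builds a minimal one by hand to show roughly what it looks like. Real browsers send many more headers, and the `build_get_request` helper is hypothetical, written only for this illustration.

```python
def build_get_request(host: str, path: str = "/") -> str:
    """Build a bare-bones HTTP/1.1 GET request as plain text."""
    lines = [
        f"GET {path} HTTP/1.1",  # what to do, and with which resource
        f"Host: {host}",         # which website the request is meant for
        "Accept: text/html",     # the kind of response the client prefers
        "Connection: close",     # close the connection after the response
        "",                      # an empty line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines)  # HTTP separates lines with CRLF


if __name__ == "__main__":
    print(build_get_request("dev.to"))
```

The first line carries the actual instruction ("get me the page at `/`"), while the remaining lines give the server some context about the request.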
In a nutshell, that is the high-level overview of how the server serves the client the home page. But... how does the browser even know where to send the text file in the first place? The Internet is such a huge network. How can one possibly know where to correctly send data?
On the Internet, every computer in the network, whether it is a client or a server, possesses a presumably unique Internet Protocol (IP) address.[2] The IP address works in a similar fashion to real home addresses, where the unit numbers, street names, cities, ZIP codes, and countries indicate the true physical location of a building. Staying true to this analogy, IP addresses indicate the true virtual location of a computer on the Internet.
IP addresses come in many forms, but the most common one follows the IPv4 format. You have probably seen one before. IPv4-compliant addresses usually come in the form of four 8-bit integers delimited by a period (.). For example, a valid IP address may look like w.x.y.z, where w, x, y, and z are placeholders for actual 8-bit integers.
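To make the format concrete, here is a small sketch using Python's built-in `ipaddress` module. Since each of the four parts is an 8-bit integer (0 to 255), the whole address fits in a single 32-bit number. The helper names are hypothetical, and the sample address comes from a range reserved for documentation, so it does not belong to any real server.

```python
import ipaddress


def is_valid_ipv4(text: str) -> bool:
    """Return True if `text` is a well-formed IPv4 address."""
    try:
        ipaddress.IPv4Address(text)
        return True
    except ValueError:
        return False


def describe_ipv4(text: str):
    """Return the four 8-bit parts and the equivalent 32-bit integer."""
    address = ipaddress.IPv4Address(text)
    parts = list(address.packed)  # four bytes, one per dotted part
    return parts, int(address)


if __name__ == "__main__":
    print(is_valid_ipv4("192.0.2.1"))   # every part fits in 8 bits
    print(is_valid_ipv4("256.0.2.1"))   # 256 does not fit in 8 bits
    print(describe_ipv4("192.0.2.1"))
```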
NOTE: For the sake of brevity, I shall save the discussion of the newer IPv6 format for a future article. For now, let's assume that everyone in the world uses IPv4.
However, even though every computer has an IP address, the browser still cannot reach its intended destination on its own. Why? Well, an IP address indeed indicates the virtual location of a computer on the Internet, but that doesn't mean all computers keep an updated list of all registered IP addresses in the world.
If all computers were to somehow locally store a "phonebook" of IP addresses in storage, it would undoubtedly be an expensive and impractical endeavor. One cannot possibly keep an updated copy of the list of all Internet devices in the whole world.
Besides, such a "phonebook" would definitely occupy a large amount of storage. Needless to say, this would be extremely problematic for lower-end smartphones with limited storage.
This is exactly why the world moved on from phonebooks in the first place. Phonebooks had to be constantly updated. Over the years, each new edition made the phonebook thicker and heavier until it was a hassle to own one.
Thankfully, the Domain Name System (DNS) was designed specifically to tackle the issue of finding the correct IP address to send data to.
The Domain Name System (DNS) is a network of intermediary computer servers that transform the domain name of a website into a valid IP address. A DNS server is basically the Internet's phonebook. Clients rely on a DNS server to transform the domain name dev.to into a valid IP address.
NOTE: Internet Service Providers (ISPs) typically provide the IP address of a reasonable DNS server by default so that you don't have to configure one yourself. Of course, you are always free to explicitly define a DNS server of your choice. Popular options include Cloudflare's DNS and Google's Public DNS.
For example, dev.to may resolve to one IP address while www.google.com may resolve to another, such as 22.214.171.124. Since IP addresses may change every now and then, DNS servers allow us to consistently refer to servers by domain name instead of having to memorize and keep track of their specific IP addresses ourselves.
DISCLAIMER: As mentioned earlier, since IP addresses may change every now and then, it is not recommended to directly access the IP addresses listed above. We can never be certain whether an IP address does indeed point to the correct server. For all we know, a random IP address can lead to a malicious website. This is why the DNS can also be considered a "security feature" of the Internet.
Regardless of domain name, the job of a DNS server is to quickly look up the correct IP address associated with a certain domain name. Given the size of the Internet's "phonebook", it is indeed a marvelous feat of engineering to accomplish this "quickly" and "frequently".
In the case of our example, the user wishes to navigate to dev.to. The user's Internet browser then sends a request to the operating system's default DNS server to transform the domain name dev.to into a valid IP address it can use. After some fancy registry lookups and domain resolution, the DNS server sends back to the client the requested IP address.
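The "phonebook" idea can be sketched with a plain dictionary. This toy resolver is only an illustration of the lookup step, not how real DNS servers are implemented. The addresses in `TOY_RECORDS` are made up and come from ranges reserved for documentation, so they do not point to the real servers. The `real_resolve` function shows how a program would typically ask the operating system's configured DNS server instead, which requires a working network connection.

```python
import socket

# Hypothetical records for illustration only. These addresses are NOT the
# real addresses of these websites.
TOY_RECORDS = {
    "dev.to": "192.0.2.10",
    "example.com": "198.51.100.20",
}


def toy_resolve(domain: str, records=TOY_RECORDS) -> str:
    """Look up a domain name in the toy phonebook."""
    name = domain.lower().rstrip(".")  # domain names are case-insensitive
    if name not in records:
        raise LookupError(f"no record found for {domain!r}")
    return records[name]


def real_resolve(domain: str) -> str:
    """Ask the operating system's DNS resolver (needs network access)."""
    return socket.gethostbyname(domain)


if __name__ == "__main__":
    print(toy_resolve("dev.to"))
```

The key difference is that a real lookup is spread across many cooperating servers, so no single machine has to store the entire phonebook.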
Once the client receives the IP address from the DNS server, the client can finally send the text file (containing the instructions for the home page) to the true virtual location of the DEV server.
When sent over the Internet, the text file is broken up into small units of data called data packets for the sake of network efficiency. The specific size of these packets depends on the network and the software implementation, but each one is usually no larger than a few kilobytes (Ethernet, for instance, commonly caps packets at 1,500 bytes).
Through the Internet highway, each packet is delivered one by one (but not necessarily in sequential order) to the IP address of the DEV server.
During this time, the client waits for the packets to arrive and for the server to respond accordingly. If the server is located close to the client, this shouldn't take too much time. However, if the server is located halfway around the world, then that is where latency, data corruption, and lost packets slow things down, in which case the packets cannot arrive at the server in a timely manner—if at all.
Since the client has to ensure that all packets successfully arrive at the server, the server has to acknowledge receipt of all the packets. If the server fails to do so, then the client sends the packets again until a certain period of time called the timeout expires. When the timeout is reached, the client assumes that the server is not available. It then terminates any further communication.
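The resend-until-acknowledged behavior can be sketched as a simple loop. This is a simplified model under stated assumptions: the timeout is represented as a maximum number of attempts rather than a real clock, and the unreliable network is simulated by a hypothetical `make_flaky_channel` helper that drops a fixed number of deliveries before letting one through.

```python
def make_flaky_channel(drops_before_success: int):
    """Simulate a network that loses the first few deliveries (hypothetical)."""
    state = {"drops_left": drops_before_success}

    def channel(packet: bytes) -> bool:
        if state["drops_left"] > 0:
            state["drops_left"] -= 1
            return False  # packet lost: no acknowledgement comes back
        return True       # packet arrived: the server acknowledges receipt

    return channel


def send_with_retries(packet: bytes, channel, max_attempts: int = 5):
    """Resend until acknowledged; return the attempt count, or None on timeout."""
    for attempt in range(1, max_attempts + 1):
        if channel(packet):
            return attempt
    return None  # give up: assume the server is not available


if __name__ == "__main__":
    print(send_with_retries(b"hello", make_flaky_channel(2)))
    print(send_with_retries(b"hello", make_flaky_channel(10)))
```

In the first call the packet gets through on the third attempt. In the second call the client runs out of attempts and gives up, which mirrors the timeout described above.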
In the optimistic case when the packets do arrive at the server, the server accordingly acknowledges receipt and rearranges the broken data in order to reconstruct the original text file. By then, it can finally read the instructions and send the home page to the client.
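The splitting and reassembly steps can be sketched in a few lines. This is a rough illustration rather than how real network stacks are written: each packet is modeled as a `(sequence number, chunk)` pair, the out-of-order delivery is simulated with a shuffle, and the packet size of 1024 bytes is an arbitrary choice.

```python
import random


def split_into_packets(data: bytes, size: int = 1024):
    """Break data into numbered chunks of at most `size` bytes."""
    return [
        (sequence, data[offset:offset + size])
        for sequence, offset in enumerate(range(0, len(data), size))
    ]


def reassemble(packets) -> bytes:
    """Sort packets by sequence number and join the chunks back together."""
    return b"".join(chunk for _, chunk in sorted(packets))


if __name__ == "__main__":
    original = b"GET / HTTP/1.1\r\nHost: dev.to\r\n\r\n" * 100
    packets = split_into_packets(original)
    random.shuffle(packets)  # simulate packets arriving out of order
    print(reassemble(packets) == original)
```

The sequence number is what lets the receiver restore the original order no matter how the packets arrive.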
In a (rather large) nutshell, that is how a client communicates with the server over the Internet. From a high level, this is what happens under our noses when we browse the Web. More impressive than that is the fact that all of this can happen in as little as a few milliseconds. In worse cases, perhaps it takes a few seconds, but it is nonetheless impressive.
The entire world depends on this global virtual infrastructure we call the "Internet". Most of us don't even stop and think about the revolutionary engineering that makes the modern lifestyle possible. In just a matter of a few decades, we—the human species—have figured out how to reliably transmit signals and messages over unfathomable distances in an impeccably timely manner.
That is the story of the Internet: bringing people together, one computer at a time.
[1] Pedantically speaking, the browser is the client because it is the one truly performing the network requests. The user is merely using the browser's interface to perform the network request. The real heavy lifting is done by the browser; therefore, the browser is the real client.
[2] In reality, all computers connected to the same network (usually via the same router) share the same public IP address. It is the router that truly owns the public IP address. For the data to reach the client, the router simply forwards the data to the correct computer in the local network. This will be the discussion for a future installment in the series.