Cars, televisions, stoves, and refrigerators are machines we use every day, and they are fairly simple to operate. They are essential tools in our lives, yet most of us do not fully understand how they work, and that is not the end of the world. We don't have to understand their mechanisms to use them. Computers and mobile devices are machines most people know how to use too, but they are far more complicated and capable of doing many more things.
Using these devices, we connect to the World Wide Web (WWW), commonly known as the web, where we can access resources such as documents, audio, pictures, and videos. Despite its complexity, the web has become very simple to use over time, and even 7-year-old Larry can open his mom's iPad to watch his favorite cartoon on YouTube.
Have you ever wondered what happens behind the scenes when someone like young Larry plays a video on YouTube, or what moving pieces are involved when you visit a website? You came to the right place. Let us go over the key players that make the web work.
The World Wide Web, also known as the web, was invented in 1989 by a British computer scientist named Sir Tim Berners-Lee. His parents were computer scientists, but Tim was more interested in trains growing up. He got into electronics because he had to build electronic gadgets to control the trains. Eventually, he became more interested in electronics than in trains, which is how he started working on computers and software.
After graduating from Oxford University, Tim became a software engineer at CERN, a physics laboratory in Switzerland. Back then, things were very manual and offline: you had to log on to different computers to get the information stored on each one. Some computers used different programs, so you had two choices: learn a different program on each computer to access the information, or have a coffee chat with a co-worker to ask how it worked.
With millions of computers being connected through the internet, Tim envisioned a technology that would connect the world with information, and he proposed his idea to his supervisor at CERN. Although it was never an official project, his supervisor gave him time to work on the big task.
By 1990, Tim had developed the three fundamental technologies that remain the foundation of the web today: HTML, URI, and HTTP. As the web grew, Tim realized that its true potential would only be reached if anyone, anywhere could use it without cost or permission. CERN agreed to make the web free forever and announced the decision in April 1993. The web has kept growing ever since, with the number of websites totaling 1.8 billion as of April 2021.
“Had the technology been proprietary, and in my total control, it would probably not have taken off. You can’t propose that something be a universal space and at the same time keep control of it.” - Sir Tim Berners-Lee
Enough of the history lesson on the web. It's time to talk about what the offspring of Tim's noble task looks like today. To summarize the flow of the web, a browser sends an HTTP request to a server to access specific content, and the server returns an HTTP response containing the requested data to the browser. We will get into more detail below.
Let's go through each of the key players of the web.
I spend a lot of my free time watching YouTube. It is probably the website I have spent the most time on in my life. To get to the website and watch a video, I go through these steps:
- Open my computer
- Open Google Chrome
- In the browser, I type www.youtube.com
- I get to the website
- Watch a video
In this example, I am the client who requests access to a video that is in YouTube's database. YouTube is serving me by providing the web service to access the video, which makes YouTube the server.
A client is internet-connected computer hardware that uses client software like a web browser. Your computer and mobile device are clients that use browsers like Chrome, Firefox, or Safari. We often refer to the device, browser, and user using the device as clients. Clients can request access to the content that servers store.
On the other hand, a server is computer software and its hardware that serves clients by receiving their requests and returning responses accordingly. Servers can show web pages, send/receive emails, store files and share them, or identify and authorize user accounts.
I like to compare the relationship between a client and a server to that of a patron and a librarian at a public library. The patron can ask the librarian for a book they are looking for, and the librarian will respond with the location of the book if the library has a copy. In the same way, a client can send a request to a server to view a web document.
When humans speak to each other, we use a shared language and follow its grammar to get our messages across. Clients and servers do the same by using the Hypertext Transfer Protocol (HTTP), a request-response protocol whose format both sides expect when exchanging data.
A client communicates with a server by sending an HTTP request containing information on what the client is looking for, and the server responds by returning an HTTP response as the result of that request. HTTP requests and responses both carry HTTP headers, which help clients and servers understand each other. HTTP headers contain information like the client's setup (browser, operating system), browser cookies, and the domain name the client wants to reach.
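To make the request and response format a little more concrete, here is a minimal sketch in Python that builds the text of a GET request and parses a hand-written response. Nothing is sent over the network. The function names `build_request` and `parse_response`, the `example-browser/1.0` user agent, and the sample response (including its cookie value) are all made up for illustration, and real browsers send many more headers than this.

```python
def build_request(host: str, path: str = "/") -> str:
    """Build the text of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",               # request line: method, path, version
        f"Host: {host}",                      # the domain name the client wants to reach
        "User-Agent: example-browser/1.0",    # hypothetical client description
        "Accept: text/html",
        "",                                   # a blank line ends the header section
        "",
    ]
    return "\r\n".join(lines)


def parse_response(raw: str) -> tuple[int, dict[str, str], str]:
    """Split a raw HTTP response into status code, headers, and body."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    status_code = int(status_line.split(" ")[1])
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers, body


request = build_request("www.youtube.com")

# A hand-written response, standing in for what a server might send back.
sample_response = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "Set-Cookie: session=abc123\r\n"
    "\r\n"
    "<html>hello</html>"
)
status, headers, body = parse_response(sample_response)

print(request)
print(status, headers["content-type"], body)
```

The headers sit between the first line and the blank line in both messages, which is how each side knows where the metadata ends and the content begins.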
However, hackers can intercept the data in transit and see what is being exchanged. This could have horrible consequences when you are logging into a bank account, email, or health insurance account. That is why Hypertext Transfer Protocol Secure (HTTPS) was introduced to encrypt the data. With HTTPS, hackers see only meaningless encrypted characters even if they intercept the data. The data can be decrypted only with the secret key shared between the client and the server.
For clients and servers to be able to communicate, they connect to the global system of computer networks called the Internet. We pay a monthly fee to an internet service provider (ISP) to be able to connect to the network. The internet uses the internet protocol suite (TCP/IP) to exchange data packets between computers.
These packets are fragments of data that allow data to be transferred reliably and efficiently. Sending a large file as a single piece would be inefficient, and the speed of the transfer varies with the medium carrying it (optical cable, copper wire, or satellite). Along the way, packets can be lost unexpectedly or arrive out of order. This is where the internet protocol suite comes into play.
The internet protocol suite is a set of communication protocols that ensures data reaches its intended destination. It is named after its two core protocols: TCP and IP.
TCP stands for Transmission Control Protocol. It defines how the data is structured and assigns a number to each data packet being transferred. With these numbers, the receiver can detect packets lost during the transfer, have them sent again, and reassemble everything in the right order as one file. This extra bookkeeping makes TCP more complex, but also very reliable.
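The numbering idea can be sketched in a few lines of Python. This is a toy model, not real TCP: actual TCP numbers bytes rather than whole packets and adds acknowledgements and retransmission on top. The helper names `split_into_packets` and `reassemble`, the packet size, and the sample message are all assumptions for illustration.

```python
import random


def split_into_packets(data: bytes, size: int) -> list[tuple[int, bytes]]:
    """Cut data into chunks and tag each chunk with a sequence number."""
    return [
        (seq, data[start:start + size])
        for seq, start in enumerate(range(0, len(data), size))
    ]


def reassemble(packets, total: int):
    """Rebuild the data in order, or report which sequence numbers are missing."""
    received = dict(packets)
    missing = [seq for seq in range(total) if seq not in received]
    if missing:
        return None, missing
    return b"".join(received[seq] for seq in range(total)), []


message = b"Hello from the server! This is one file split into packets."
packets = split_into_packets(message, size=8)

# Simulate packets arriving out of order.
shuffled = packets[:]
random.Random(0).shuffle(shuffled)
restored, missing = reassemble(shuffled, total=len(packets))

# Simulate packet number 2 getting lost on the way.
lossy = [packet for packet in shuffled if packet[0] != 2]
partial, missing_after_loss = reassemble(lossy, total=len(packets))

print(restored == message, missing)
print(partial, missing_after_loss)
```

Because every packet carries its number, arrival order stops mattering, and a gap in the numbers tells the receiver exactly which piece to ask for again.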
For data to get to the right place, the network needs the address of each computer. IP stands for Internet Protocol, which routes data to the right location. IP addresses are unique numeric computer addresses written as digits separated by periods, like 192.158.1.38 (IPv4). With the web growing every day, a new version of IP, IPv6, was deployed to meet the need for more internet addresses. Where the previous version uses a 32-bit address, IPv6 uses a 128-bit address, which allows about 340 undecillion unique addresses! Here's an example of an IPv6 address:
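As a hedged illustration of the two formats, the sketch below uses Python's standard `ipaddress` module to parse the IPv4 address from the text alongside an IPv6 address. The IPv6 value `2001:db8::1` is an assumption chosen from the range reserved for documentation. It is not an address taken from this post or belonging to a real site.

```python
import ipaddress

ipv4 = ipaddress.ip_address("192.158.1.38")
ipv6 = ipaddress.ip_address("2001:db8::1")  # assumed example from the documentation range

# max_prefixlen is the number of bits in an address of that version.
ipv4_space = 2 ** ipv4.max_prefixlen
ipv6_space = 2 ** ipv6.max_prefixlen

print(ipv4.version, ipv4.max_prefixlen, ipv4_space)
print(ipv6.version, ipv6.max_prefixlen, ipv6_space)

# IPv6 addresses are usually written in a shortened form; this is the full form.
print(ipv6.exploded)
```

The 128-bit figure works out to 2 to the power of 128, which is where the "340 undecillion" number comes from.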
These IP addresses are not very human-readable, though, and we would need address books to keep track of every website's IP address. It would be very inconvenient if we had to look up Google's IP address and type http://220.127.116.11/ in the browser to get there every time. To solve this problem, the Domain Name System (DNS) was introduced. The DNS is like the address book of the internet. We purchase domains, which are more human-readable website addresses like google.com, youtube.com, or facebook.com, from domain registrars. Once a domain is registered for a website, DNS servers are responsible for translating the domain name into an IP address for clients.
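A rough sketch of that lookup from the client's side: Python's standard `socket.getaddrinfo` asks the operating system's resolver to translate a name into addresses. The `resolve` helper is a hypothetical name for illustration, and the example uses `localhost` so it runs without an internet connection. Swapping in a public domain such as www.youtube.com should work on a connected machine, though the addresses returned will vary by time and location.

```python
import socket


def resolve(hostname: str) -> list[str]:
    """Return the unique IP addresses the system resolver reports for hostname."""
    results = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in results:
        ip = sockaddr[0]  # first element of the socket address is the IP string
        if ip not in addresses:
            addresses.append(ip)
    return addresses


# "localhost" resolves on the machine itself, so no network access is needed.
local_addresses = resolve("localhost")
print(local_addresses)
```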
With that being said, let's take a look at an example of a user accessing YouTube's home page and break down what's happening behind the scenes:
- The user opens their laptop (client), which is connected to the internet, and opens Google Chrome (browser)
- The user types the web address www.youtube.com into the browser's address bar
- The browser asks a DNS server to translate the web address into an IP address
- The browser uses the IP address to make an HTTP request to YouTube's server to access the web page
- The YouTube server reads the HTTP request and prepares the data as packets, and TCP numbers each packet (many companies store their data in services like Oracle Cloud or AWS)
- The YouTube server sends an HTTP response with a "200 OK" status code (meaning the request was processed successfully) to the user's browser
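The request and response steps above can be sketched end to end on a single machine with Python's standard library. This is only an illustration under stated assumptions: a tiny local server stands in for YouTube, the address 127.0.0.1 replaces the DNS lookup, and the names `HomePageHandler` and `fetch_home_page` are made up for the example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HomePageHandler(BaseHTTPRequestHandler):
    """Plays the server: answers every GET request with a small HTML page."""

    def do_GET(self):
        body = b"<html><body>Home page</body></html>"
        self.send_response(200)  # status line "200 OK"
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example output quiet


def fetch_home_page() -> tuple[int, str, bytes]:
    """Start a local server, request its home page as a client, then stop it."""
    server = HTTPServer(("127.0.0.1", 0), HomePageHandler)  # port 0 picks a free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        connection.request("GET", "/")  # the client's HTTP request
        response = connection.getresponse()  # the server's HTTP response
        result = (response.status, response.reason, response.read())
        connection.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


status, reason, body = fetch_home_page()
print(status, reason)
print(body.decode())
```

The packet numbering from the TCP step does not appear in the code because the operating system handles it underneath the connection.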
Imagine the world without the web. Imagine doing your school projects without any access to Google, just like a couple of decades ago, when things were simpler and the power of the internet was not available. You would have to rely on offline documents like books, newspapers, or magazines. Researchers had to fly across the country to interview the right people to collect data.
The web has made many things possible by connecting people all over the world to exchange information. You can now watch Netflix anywhere with an internet connection instead of going to a Blockbuster store to rent a DVD, have video chats with friends and family across the globe, or simply Google any information you are looking for. The COVID-19 pandemic in particular showed the power of the web by connecting the world regardless of location. It minimized the damage to our society by enabling remote work, food delivery services, and quick, easy access to COVID-19 guidelines for anyone.
Although you do not have to fully understand how the web works to be able to use it, I hope this post provided a little bit of history and knowledge on how the web operates. Feel free to comment below with additional information! Thank you so much for taking the time to read this blog post.
Follow my blog account or let's connect on LinkedIn to keep up with more tech content!