Introduction
Have you ever wondered what exactly happens when you type www.google.com into your web browser? Depending on your internet connection, the Google homepage usually appears in a fraction of a second.
Did you know that a lot of communication, across several channels and protocols, takes place before the webpage is displayed on your screen?
In this article, we will demystify the basics of what happens behind the scenes when you type a website address into your web browser. We will go over some of the architecture, protocols and services responsible for serving the pages we request.
Here is a simplified diagram showing most of the components we will talk about in this article.
DNS Resolution 🔍
When you enter a Uniform Resource Locator (URL) such as www.google.com, Domain Name System (DNS) resolution is the first step in loading the webpage.
DNS is like the phonebook of the internet. Instead of remembering IP addresses like 192.168.1.1, you can use domain names like google.com to navigate the web. DNS translates human-readable domain names into IP addresses that computers use to identify each other on a network.
How DNS works
Local DNS Cache
Your operating system typically has a local DNS cache that stores recently accessed domain names and their corresponding IP addresses. If you've visited Google recently, your computer might already have this information cached, speeding up the process.
DNS Query
If the domain information is not cached locally, your computer sends a DNS query to a DNS resolver. The resolver is usually provided by your Internet Service Provider (ISP) or a third-party service like Google Public DNS.
Root Servers
The resolver starts by querying a root server, which refers it to the name servers for the .com top-level domain.
Top-Level Domain (TLD) Servers
Once the resolver has the address of a .com TLD server, it queries that server to find the authoritative name server for google.com.
Authoritative Name Server
Finally, the resolver queries the authoritative name server for google.com, which provides the IP address associated with google.com. The resolver returns that address to your computer, allowing it to access the site.
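The lookup chain above can be sketched as a toy model in Python. This is only an illustration, not how a real resolver is implemented: the server names, the lookup tables and the IP address (taken from a range reserved for documentation) are all made-up placeholders.

```python
# Toy model of the DNS lookup chain. Every name and address below is a
# hypothetical placeholder, not real DNS data.
ROOT_SERVERS = {"com": "tld-server.example"}  # TLD -> TLD server
TLD_SERVERS = {"tld-server.example": {"google.com": "ns.google.example"}}
AUTHORITATIVE = {"ns.google.example": {"google.com": "203.0.113.10"}}

local_cache = {}  # stands in for the operating system's DNS cache


def resolve(domain):
    """Return (ip, steps), where steps lists the stages that were consulted."""
    if domain in local_cache:
        return local_cache[domain], ["cache"]

    steps = ["resolver"]
    tld = domain.rsplit(".", 1)[-1]      # "google.com" -> "com"
    tld_server = ROOT_SERVERS[tld]       # the root refers us to the TLD server
    steps.append("root")
    name_server = TLD_SERVERS[tld_server][domain]  # TLD server names the authority
    steps.append("tld")
    ip = AUTHORITATIVE[name_server][domain]        # the authority gives the address
    steps.append("authoritative")

    local_cache[domain] = ip             # remember the answer for next time
    return ip, steps


first = resolve("google.com")
second = resolve("google.com")
print(first)
print(second)
```

The first call walks the whole chain, while the second is answered straight from the cache, which is why repeat visits tend to resolve faster.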
TCP/IP 🌐
Once the IP address for google.com is found, TCP/IP takes over to establish a connection between your computer and the Google web server.
Transmission Control Protocol/Internet Protocol (TCP/IP) is a set of rules for internet communication, ensuring that data gets to where it needs to go.
How TCP/IP Works
- Three-Way Handshake: Your computer and the server exchange synchronization (SYN), synchronization-acknowledgement (SYN-ACK) and acknowledgement (ACK) packets to open the connection.
- Data Transfer: Data is then sent in small packets over the established connection.
- Connection Termination: After the data has been transferred, the connection is closed.
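To make the handshake and the packet-sized transfer more concrete, here is a small simulation in Python. It does not open a real connection (the operating system normally does all of this for you). The function names, the starting sequence numbers and the chunk size are assumptions chosen for illustration.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three segments as (sender, flags, seq, ack) tuples.

    client_isn and server_isn are the initial sequence numbers each side
    picks; here they are arbitrary example values.
    """
    syn = ("client", "SYN", client_isn, None)
    # The server acknowledges the client's SYN by replying with client_isn + 1.
    syn_ack = ("server", "SYN-ACK", server_isn, client_isn + 1)
    # The client acknowledges the server's SYN in the same way.
    ack = ("client", "ACK", client_isn + 1, server_isn + 1)
    return [syn, syn_ack, ack]


def split_into_segments(data, size):
    """Cut a byte string into chunks of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


segments = three_way_handshake(client_isn=100, server_isn=300)
chunks = split_into_segments(b"GET / HTTP/1.1", 5)

for segment in segments:
    print(segment)
print(chunks)
```

Real TCP segments carry far more information (ports, window sizes, checksums), and closing the connection involves its own exchange of packets, which this sketch leaves out.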
Firewall 🛡️
A firewall is a security system that monitors and controls incoming and outgoing network traffic based on a set of predetermined security rules.
How a firewall works
- A firewall inspects incoming and outgoing data packets to determine if they meet the security criteria set by the rules
- Based on those predefined rules, a firewall either allows or blocks the data packets
Firewalls essentially act as a protective barrier between your computer and the internet, ensuring that only safe and authorized traffic can pass through.
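The allow-or-block decision described above can be sketched as a tiny rule matcher. The `Packet` fields, the rule list and the "first match wins, block by default" policy are assumptions made for this example; real firewalls support much richer rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    direction: str  # "in" or "out"
    protocol: str   # e.g. "tcp" or "udp"
    port: int


# Hypothetical rule set: (direction, protocol, port, action).
RULES = [
    ("out", "tcp", 443, "allow"),  # outgoing HTTPS
    ("out", "udp", 53, "allow"),   # outgoing DNS queries
    ("in", "tcp", 23, "block"),    # incoming Telnet
]


def decide(packet, rules=RULES, default="block"):
    """Return the action of the first rule that matches, else the default."""
    for direction, protocol, port, action in rules:
        if (packet.direction, packet.protocol, packet.port) == (direction, protocol, port):
            return action
    return default


print(decide(Packet("out", "tcp", 443)))
print(decide(Packet("in", "tcp", 8080)))
```

Blocking anything that no rule explicitly allows is a common conservative default, but it is a design choice of this sketch rather than something every firewall does.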
HTTPS/SSL 🔐
Once the firewall allows the connection, the connection has to be secured using HTTPS/SSL.
HTTPS (Hypertext Transfer Protocol Secure) is the secure version of HTTP, and SSL (Secure Sockets Layer) is the technology that encrypts the data transmitted between your browser and the web server. Modern browsers and servers use TLS (Transport Layer Security), the successor to SSL, although the name SSL is still widely used.
Learn about SSL
Learn about HTTPS
How HTTPS/SSL operates
- Encryption: SSL encrypts the data transferred between your browser and the web server, making it unreadable to anyone who intercepts the traffic.
- Authentication: SSL certificates verify the identity of a website, ensuring that you are connecting to the legitimate server and not a malicious one.
- Data Integrity: SSL provides data integrity checks to ensure that data hasn't been altered during transmission.
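As a rough illustration of the client side, Python's standard `ssl` module (which, despite its name, negotiates TLS) can build a context that checks the server's certificate and hostname before any data is exchanged. The helper names `make_client_context` and `fetch_status_line` are made up for this sketch, and the second one needs a working internet connection, so it is defined here but not called.

```python
import socket
import ssl


def make_client_context():
    """Build a context with the library's default, verifying settings."""
    # create_default_context() turns on certificate and hostname checks.
    return ssl.create_default_context()


def fetch_status_line(host, port=443, timeout=5.0):
    """Open an encrypted connection and return the first response line.

    Requires network access; `host` would be something like "www.google.com".
    """
    context = make_client_context()
    with socket.create_connection((host, port), timeout=timeout) as raw_sock:
        # wrap_socket performs the handshake and verifies the certificate.
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            request = f"HEAD / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
            tls_sock.sendall(request.encode("ascii"))
            first_line = tls_sock.recv(1024).split(b"\r\n", 1)[0]
            return first_line.decode("ascii", "replace")


context = make_client_context()
print(context.verify_mode == ssl.CERT_REQUIRED)
print(context.check_hostname)
```

If the certificate cannot be verified or does not match the hostname, the handshake fails with an error instead of silently continuing, which is the authentication step in practice.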
Load Balancer ⚖️
Google has many web servers that process requests from client computers like yours, so a load balancer is used to distribute the network traffic across those servers.
A load balancer is a device or software that distributes incoming network traffic across multiple servers to ensure efficient use of resources, maximize throughput, and minimize response time.
Read about load balancers
How a load balancer works
- A load balancer receives incoming requests and distributes them across multiple servers based on predefined algorithms such as round robin.
- Load balancers also monitor the health of servers by sending regular requests to check their status. If a server becomes unavailable or unhealthy, the load balancer redirects the traffic to another server.
- Load balancers also make scaling easier, because servers can be added or removed depending on demand.
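Round robin with a basic notion of server health can be sketched in a few lines. The class name, the server names and the `mark_down`/`mark_up` methods are invented for this example; in a real load balancer, health would be updated by periodic checks rather than by hand.

```python
class RoundRobinBalancer:
    """Hand out servers in turn, skipping any that are marked unhealthy."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._next = 0  # index of the server to try next

    def mark_down(self, server):
        self.healthy.discard(server)

    def mark_up(self, server):
        if server in self.servers:
            self.healthy.add(server)

    def pick(self):
        # Try each server at most once per call, starting where we left off.
        for _ in range(len(self.servers)):
            server = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers available")


balancer = RoundRobinBalancer(["web-1", "web-2", "web-3"])
first_round = [balancer.pick() for _ in range(4)]

balancer.mark_down("web-2")  # pretend a health check just failed
after_failure = [balancer.pick() for _ in range(3)]

print(first_round)
print(after_failure)
```

Once `web-2` is marked down, requests keep rotating between the two remaining servers, and calling `mark_up("web-2")` would bring it back into the rotation.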
Webserver 🖥️
After being directed by the load balancer, the request reaches the web server which hosts the website files.
A web server is software that serves web pages to clients over the internet using the HTTP protocol.
Read about Web server
How the Web server works
- When a request arrives from the client, the web server processes it, retrieves the requested files, and generates a response.
- The web server delivers the requested web pages, images or other resources to the client's web browser.
- In addition to serving static files, web servers can also execute scripts or interact with databases to generate dynamic content.
Examples of popular web servers include Nginx, Apache and Microsoft IIS.
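The request-to-response step for static files can be sketched as a single function. This is a simplified model, not how Nginx or Apache work internally: the in-memory `STATIC_FILES` table stands in for files on disk, and its paths and contents are made up.

```python
# Hypothetical "files" kept in memory: path -> (content type, bytes).
STATIC_FILES = {
    "/index.html": ("text/html", b"<h1>Welcome</h1>"),
    "/style.css": ("text/css", b"h1 { color: teal; }"),
}


def handle_request(method, path):
    """Return (status, headers, body) for a simplified HTTP request."""
    if method != "GET":
        return 405, {"Allow": "GET"}, b""
    if path == "/":
        path = "/index.html"  # serve the index page for the site root
    if path not in STATIC_FILES:
        return 404, {"Content-Type": "text/plain"}, b"Not Found"
    content_type, body = STATIC_FILES[path]
    headers = {"Content-Type": content_type, "Content-Length": str(len(body))}
    return 200, headers, body


home = handle_request("GET", "/")
missing = handle_request("GET", "/nope.html")
print(home)
print(missing)
```

A real web server also parses the raw HTTP request, handles many connections at once and can pass requests on to other programs for dynamic content.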
Application Server 📡
The application server handles the dynamic content and business logic that the request requires.
An application server is software that provides the runtime environment for web applications, allowing them to execute their application code.
Read about Application Server
How the Application Server works
- The application server executes the scripts to generate dynamic content based on user input, database queries or other factors.
- It often integrates with databases or other backend systems to fetch data to be returned to the user.
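In Python, one common way for a server to hand a request to application code is the WSGI interface, so a minimal WSGI application makes a reasonable sketch of dynamic content. The greeting logic and the `call_app` helper, which plays the role of the server, are assumptions made for illustration.

```python
from urllib.parse import parse_qs


def application(environ, start_response):
    """A tiny WSGI app: the response depends on the user's query string."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["stranger"])[0]
    body = f"Hello, {name}!".encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(app, query_string):
    """Stand in for the server: build a request, call the app, collect output."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = headers

    environ = {"REQUEST_METHOD": "GET", "PATH_INFO": "/", "QUERY_STRING": query_string}
    body = b"".join(app(environ, start_response))
    return captured["status"], body


print(call_app(application, "name=Ada"))
print(call_app(application, ""))
```

The same `application` function could be served by any WSGI-compatible server, which is what lets the web server and the application code stay separate.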
NOTE: Web servers are better suited to static content, while application servers handle dynamic content.
Read more about the differences HERE
Database 🗃️
After the application server handles dynamic content, it may interact with the database to retrieve or store information required for the web application.
A database is a structured collection of data organized for efficient retrieval, storage, and management. It stores information such as user profiles, product details, or any other data needed by the web application.
Read about databases
How the database works
- A relational database stores data in tables, with each table containing rows and columns representing individual records and fields, respectively.
- The application server sends queries to the database to retrieve or manipulate data. These queries are written using a query language like SQL.
- The database processes queries, retrieves the requested data and performs the necessary operations like filtering, sorting or aggregating.
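The points above can be tried out with Python's built-in `sqlite3` module and an in-memory database. The `products` table and its rows are made-up sample data, and SQLite is used only because it needs no setup; the article does not say which database any particular website uses.

```python
import sqlite3

# An in-memory database that disappears when the connection is closed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO products (name, price) VALUES (?, ?)",
    [("pen", 1.5), ("notebook", 4.0), ("backpack", 30.0)],
)
conn.commit()

# Filtering and sorting: products under 10, most expensive first.
cheap = conn.execute(
    "SELECT name FROM products WHERE price < ? ORDER BY price DESC", (10,)
).fetchall()

# Aggregating: number of products and their combined price.
total = conn.execute("SELECT COUNT(*), SUM(price) FROM products").fetchone()

print(cheap)
print(total)
```

The `?` placeholders let the database driver insert the values safely, which is generally preferable to building SQL strings by hand.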
Conclusion
In addition to the major components and services discussed in this article, other services such as CDNs (Content Delivery Networks) also play a crucial role in delivering web content. While we've covered the essential components involved in accessing a website like google.com, it is important to recognize the complexity of the underlying infrastructure that ensures the seamless delivery of content to your web browser.
I invite you to share your thoughts and feedback in the comments section. Did this article meet your expectations?
Your input is valuable in shaping future discussions and ensuring that we continue to provide insightful and informative content.
Follow me on GitHub, let's get interactive on Twitter and form great connections on LinkedIn 😊
Happy Learning 🚀