The Anatomy of Web Searches: A Beginner’s Guide

Have you ever wondered what happens behind the scenes when you search for a website on the internet? This blog will guide you through the journey of a web request, from the moment you hit Enter after typing a URL, to the instant a webpage loads on your screen. Whether you're a budding developer or simply curious about the technology you use every day, this exploration is for you. For the purpose of this blog, I will be using https://www.google.com as the example.

Before we proceed, it helps to grasp two fundamental concepts that underpin the rest of the discussion: the TCP/IP protocol suite (including IP addresses) and the role of a client.

What is a client?
A client is a host or device that interacts with a service: it submits a request and waits for a response from the server. The term "client" can refer to both your laptop and your browser. Your laptop is the client device, and the browser is the client application that lets it send requests and receive information from the internet.

What is TCP/IP?
TCP/IP stands for Transmission Control Protocol/Internet Protocol. TCP is a set of rules for ensuring reliable delivery of data packets from one host to another, while IP is a set of rules that governs the addressing and routing of data packets so that they can travel across networks and reach the correct host. An IP address is an identifier assigned to a device connected to a network so that devices can communicate with each other.
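To make the two roles a little more concrete, here is a minimal sketch in Python of a client and a server exchanging bytes over TCP on the local machine. The IP part is the address 127.0.0.1 (the loopback address) plus a port chosen by the operating system; the TCP part is the reliable, ordered delivery of the bytes. The function names and the "echo: " prefix are made up for this illustration.

```python
import socket
import threading


def run_echo_server(server_sock: socket.socket) -> None:
    """Accept one connection, read everything the client sends, echo it back."""
    conn, _client_addr = server_sock.accept()
    with conn:
        chunks = []
        while True:
            chunk = conn.recv(1024)
            if not chunk:  # empty bytes means the client finished sending
                break
            chunks.append(chunk)
        conn.sendall(b"echo: " + b"".join(chunks))


def tcp_round_trip(message: bytes) -> bytes:
    """Send a message to a local server over TCP and return its reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server_sock:
        server_sock.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server_sock.listen(1)
        host, port = server_sock.getsockname()

        server_thread = threading.Thread(target=run_echo_server, args=(server_sock,))
        server_thread.start()

        # The client side: connect to an IP address and port, send, then wait.
        with socket.create_connection((host, port), timeout=5) as client:
            client.sendall(message)
            client.shutdown(socket.SHUT_WR)  # signal that the request is complete
            reply = b""
            while True:
                chunk = client.recv(1024)
                if not chunk:
                    break
                reply += chunk

        server_thread.join()
    return reply


if __name__ == "__main__":
    print(tcp_round_trip(b"hello"))
```

A real browser does the same thing in spirit: it opens a TCP connection to the server's IP address, sends its request, and waits for the response.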

When you press Enter after typing https://www.google.com into your browser, the client first checks its local DNS cache for a record of the IP address of the domain name (here, www.google.com). If the IP address is not found, DNS resolution begins. The following is a simplified overview:

DNS Query: Also referred to as a DNS request. The client sends a query to a recursive DNS server (usually provided by your ISP, or Internet Service Provider) asking for the IP address associated with the requested domain name.
DNS Recursor: After receiving the request, this server acts as a middleman between the client and the DNS nameservers. It queries the root, TLD (top-level domain), and authoritative nameservers, in that order. After receiving the requested IP address from the authoritative nameserver, it sends the response back to the client. During this process, the recursor caches the information it receives for future use.
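The lookup order above can be sketched as a toy simulation in Python. This is only an illustration, not how a real resolver is implemented: the dictionaries stand in for the root, TLD, and authoritative nameservers, the class names are hypothetical, and the address 192.0.2.10 comes from a range reserved for documentation rather than being Google's real address.

```python
# Toy stand-ins for the three levels of nameservers (hypothetical data).
ROOT = {"com": "tld-com"}                                   # root knows who handles .com
TLD = {"tld-com": {"google.com": "ns-google"}}              # TLD knows the authoritative server
AUTHORITATIVE = {"ns-google": {"www.google.com": "192.0.2.10"}}  # documentation-only address


class Recursor:
    """Recursive DNS server: walks root -> TLD -> authoritative and caches answers."""

    def __init__(self):
        self.cache = {}
        self.upstream_lookups = 0  # how many times we had to ask the nameservers

    def resolve(self, hostname):
        if hostname in self.cache:
            return self.cache[hostname]
        self.upstream_lookups += 1
        labels = hostname.split(".")
        tld_server = ROOT[labels[-1]]             # 1. ask the root about "com"
        zone = ".".join(labels[-2:])              # "google.com"
        auth_server = TLD[tld_server][zone]       # 2. ask the TLD server about the zone
        address = AUTHORITATIVE[auth_server][hostname]  # 3. ask the authoritative server
        self.cache[hostname] = address            # remember the answer for next time
        return address


class Client:
    """The browser or operating system: checks its local cache before asking the recursor."""

    def __init__(self, recursor):
        self.local_cache = {}
        self.recursor = recursor

    def lookup(self, hostname):
        if hostname not in self.local_cache:
            self.local_cache[hostname] = self.recursor.resolve(hostname)
        return self.local_cache[hostname]


if __name__ == "__main__":
    client = Client(Recursor())
    print(client.lookup("www.google.com"))  # first call goes through the recursor
    print(client.lookup("www.google.com"))  # second call is served from the local cache
```

In real code you would not walk the hierarchy yourself; the operating system's resolver does it for you, for example through Python's socket.getaddrinfo.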

Now you may be wondering why the client (your browser) cannot simply use the domain name on its own. Domain names are human-readable labels meant for people, while routing on the internet is based on IP addresses. Imagine if, to reach a website, you had to memorize an address like 176.89.229.128. Some addresses are short (8.8.8.8, for example, belongs to Google's public DNS resolver, not to google.com itself), but most are not. And what if you need to use 5 different websites? Trying to memorize IPs would be a hassle!

This is where DNS nameservers come in handy. They act as the internet's phonebook, receiving queries and responding with the IP addresses for a domain name. Now that you understand the basics of DNS requests and responses, I will talk about firewalls.

What is a firewall?
A firewall is a network security system that monitors all incoming and outgoing traffic (requests and responses) and selectively filters it based on a defined set of security policies. While a firewall helps to prevent unauthorized access, secure communication is also needed, which is the job of HTTPS.
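As a rough illustration of "filtering based on a defined set of security policies", here is a small Python sketch of a first-match rule list. The Rule class, the decide function, and the example rules are all hypothetical; real firewalls inspect far more than a source address and a destination port. The addresses come from ranges reserved for documentation.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    action: str           # "allow" or "deny"
    network: str          # source network in CIDR notation
    port: Optional[int]   # destination port, or None to match any port


def decide(rules, source_ip, dest_port, default="deny"):
    """Return the action of the first rule that matches, else the default."""
    addr = ipaddress.ip_address(source_ip)
    for rule in rules:
        in_network = addr in ipaddress.ip_network(rule.network)
        port_matches = rule.port is None or rule.port == dest_port
        if in_network and port_matches:
            return rule.action
    return default


# Example policy (made up): block one network entirely, allow HTTPS from anywhere.
RULES = [
    Rule("deny", "203.0.113.0/24", None),
    Rule("allow", "0.0.0.0/0", 443),
]

if __name__ == "__main__":
    print(decide(RULES, "203.0.113.7", 443))   # blocked network
    print(decide(RULES, "198.51.100.5", 443))  # HTTPS from elsewhere
    print(decide(RULES, "198.51.100.5", 23))   # no rule matches, default applies
```

Rule order matters in this sketch: the deny rule is listed first, so it wins even for traffic on port 443.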

What is HTTPS?
HTTPS stands for Hypertext Transfer Protocol Secure. HTTPS is the result of layering the HTTP protocol on top of the SSL/TLS protocol. SSL/TLS (Secure Sockets Layer/Transport Layer Security) encrypts the data being communicated and authenticates that the site being accessed is the genuine one, which also protects the integrity of the data. This helps against impersonation attacks, where a malicious site pretends to be a legitimate one, although a phishing site hosted on its own look-alike domain can still use HTTPS. If your data is intercepted, the attacker won't be able to decrypt it, which makes the interception useless. Looking at our example, https://www.google.com, you will see the 's' after http, which stands for 'secure' and indicates that the connection is encrypted through SSL/TLS. An SSL/TLS connection is typically indicated by a padlock next to the URL in your browser. Beware of insecure sites!
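Here is a minimal sketch of what the client side of that looks like using Python's standard ssl module. The helper names make_client_context and describe_tls_connection are made up for this post. The default context requires a valid certificate and checks that it matches the hostname, which is the "authenticates the site" part described above. The second function needs internet access, so treat it as something to try yourself rather than a guaranteed result.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a TLS context with certificate and hostname verification enabled."""
    return ssl.create_default_context()


def describe_tls_connection(hostname: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Open a TLS connection and report the protocol version and certificate subject.

    Requires network access; raises ssl.SSLCertVerificationError if the
    certificate cannot be verified for this hostname.
    """
    context = make_client_context()
    with socket.create_connection((hostname, port), timeout=timeout) as raw_sock:
        # wrap_socket performs the TLS handshake on top of the TCP connection.
        with context.wrap_socket(raw_sock, server_hostname=hostname) as tls_sock:
            return {
                "tls_version": tls_sock.version(),
                "subject": tls_sock.getpeercert().get("subject"),
            }


if __name__ == "__main__":
    print(describe_tls_connection("www.google.com"))
```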

We've explored the way a client accesses websites and delved into security layers such as firewalls and HTTPS.

Now, let's take a look at how the client actually receives responses. We'll begin with a simplified overview of load balancers, which ensure efficient traffic management and high availability.

What is a load balancer?
A load balancer is a system that distributes incoming traffic across multiple servers, ensuring high availability, efficient utilization of servers, and high performance. Its primary goal is to keep any single server from crashing under too many requests, while also keeping latency low. It does this by distributing the load using different algorithms, such as round robin or least connections.
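The two algorithms mentioned above can be sketched in a few lines of Python. This is a simplified illustration with made-up class and server names; production load balancers also handle health checks, weights, timeouts, and much more.

```python
import itertools


class RoundRobinBalancer:
    """Hands out servers in a fixed rotation: a, b, c, a, b, c, ..."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastConnectionsBalancer:
    """Sends each new request to the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        # On a tie, min() returns the first server in insertion order.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        """Call when a request finishes so the server's count goes back down."""
        self.active[server] -= 1


if __name__ == "__main__":
    rr = RoundRobinBalancer(["a", "b", "c"])
    print([rr.pick() for _ in range(4)])

    lc = LeastConnectionsBalancer(["a", "b", "c"])
    print([lc.pick() for _ in range(3)])
    lc.release("b")
    print(lc.pick())
```

Round robin ignores how busy each server is, which is fine when requests cost roughly the same. Least connections adapts when some requests take much longer than others.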

Fact: Google's cloud load balancer (GCLB) has an availability SLA of 99.99% per month, even though Google serves over 4 billion users! Amazing.

A Service Level Agreement (SLA) is a commitment a provider makes about its service. Its availability target is expressed as a percentage that indicates the expected uptime of the service within a given period.
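To get a feel for what such a percentage means, you can convert it into allowed downtime. The small Python helper below is a sketch with a made-up function name, and it assumes a 30-day month for simplicity.

```python
def allowed_downtime_minutes(sla_percent: float, days: int = 30) -> float:
    """Minutes of downtime permitted in the period without breaking the SLA."""
    total_minutes = days * 24 * 60  # 43,200 minutes in a 30-day month
    return total_minutes * (100 - sla_percent) / 100


if __name__ == "__main__":
    for sla in (99.0, 99.9, 99.99):
        print(f"{sla}% -> {allowed_downtime_minutes(sla):.2f} minutes per month")
```

For 99.99% over a 30-day month, this works out to about 4.32 minutes of downtime: 43,200 minutes multiplied by 0.01%.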

Now that we understand how web servers stay available even when they have billions of users, let's talk about the servers themselves.

What is a server?
A server is a hardware device or software that handles requests sent over a network and sends out responses. For the scope of this blog, our focus will be on software servers. Among the different server types, such as FTP, proxy, and game servers, we will delve into three essential varieties for a 3-tier application model:

Web server: A web server is a software application or hardware device that stores, processes, and serves web content (HTML, CSS, scripts, etc.) to users over the internet. It operates on the HTTP or HTTPS protocols, which are the foundation for communication on the World Wide Web. So how do you actually see a web page? Your web browser downloads the web files from the web server, builds the DOM from the HTML and the CSSOM from the CSS, combines them into the Render Tree, and then paints and displays the result.
Application server: This sits between the web server and the database server. It provides the business logic behind an application. It is a type of server designed to install, operate, and host applications and associated services for clients, and it interacts with the database to fetch and store data.
Database server: This is a computer system that manages and provides access to databases. It runs a Database Management System (DBMS) and is responsible for storing, retrieving, and managing data in a database, allowing multiple clients (i.e. users or applications) to interact with the database concurrently.
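To show how the three tiers hand work to one another, here is a toy Python sketch in which each tier is just a function. Everything here is invented for illustration: the function names, the greetings table, and its contents. In a real deployment, each tier would usually be a separate program, often on a separate machine. An in-memory SQLite database stands in for the database server.

```python
import sqlite3
from html import escape


def make_database() -> sqlite3.Connection:
    """Database tier: stores the data (an in-memory SQLite database here)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE greetings (name TEXT PRIMARY KEY, message TEXT)")
    db.execute("INSERT INTO greetings VALUES (?, ?)", ("guest", "Hello, world"))
    db.commit()
    return db


def application_tier(db: sqlite3.Connection, name: str) -> str:
    """Application tier: business logic that decides what to show and queries the database."""
    row = db.execute(
        "SELECT message FROM greetings WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else "No greeting found"


def web_tier(db: sqlite3.Connection, path: str) -> str:
    """Web tier: turns a request path into an HTTP response containing HTML."""
    name = path.strip("/") or "guest"
    message = application_tier(db, name)
    body = f"<html><body><h1>{escape(message)}</h1></body></html>"
    return f"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n{body}"


if __name__ == "__main__":
    database = make_database()
    print(web_tier(database, "/"))
```

The browser only ever talks to the web tier. It never sees the application logic or the database directly.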

Below is a schema illustrating the flow of the request created when you type https://www.google.com in your browser and press Enter.

In conclusion, your web browser is performing modern-day magic, weaving web pages into existence in seconds, right before your eyes.

That's it for this beginner's guide. Now you should have a fair understanding of what happens when you type https://www.google.com in your browser and press Enter.