DEV Community

Rajat Kumar Nayak

WebRTC Explained: Behind the Scenes of Seamless Video Calls

Introduction

Hello everyone! 👋👋 Most of us have used video calling applications such as Google Meet, Zoom, or Microsoft Teams. Whether you are a student or a working professional, you have probably used one of them for classes, official meetings, or staying in touch with distant family members.

To build a video conferencing platform, we first need to understand WebRTC. So let us take a closer look at how WebRTC works.

What is WebRTC?

  • WebRTC stands for Web Real-Time Communication. It enables real-time, low-latency communication over the internet.

  • One of the most fascinating features of WebRTC is that it is built into web browsers such as Chrome, Firefox, and Safari, making it readily accessible without additional plugins or installations.

How it works

Let's first understand the fundamental terms we will be using in the next steps before diving into the technical details.

  • NAT (Network Address Translation): NAT maps the private IP addresses used inside a local network to a public IP address.
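
To make the mapping idea concrete, here is a minimal sketch of a NAT translation table. It is a toy model, not how any particular router works: the `SimpleNat` class, the `Endpoint` type, and the starting port of 40000 are all assumptions made up for illustration.

```typescript
// Simplified NAT model: one public IP, one public port per private endpoint.
interface Endpoint {
  ip: string;
  port: number;
}

class SimpleNat {
  private publicIp: string;
  private nextPort: number;
  private outbound = new Map<string, number>(); // "privateIp:port" -> public port
  private inbound = new Map<number, Endpoint>(); // public port -> private endpoint

  constructor(publicIp: string, firstPublicPort: number = 40000) {
    this.publicIp = publicIp;
    this.nextPort = firstPublicPort;
  }

  // Outgoing traffic: replace the private source address with a public one.
  translateOutbound(source: Endpoint): Endpoint {
    const key = `${source.ip}:${source.port}`;
    let publicPort = this.outbound.get(key);
    if (publicPort === undefined) {
      publicPort = this.nextPort++;
      this.outbound.set(key, publicPort);
      this.inbound.set(publicPort, { ip: source.ip, port: source.port });
    }
    return { ip: this.publicIp, port: publicPort };
  }

  // Incoming traffic: deliverable only if an outbound mapping already exists.
  translateInbound(publicPort: number): Endpoint | undefined {
    return this.inbound.get(publicPort);
  }
}
```

In this model, a packet arriving on a public port that no inside device has used yet has nowhere to go, which is roughly why an outside peer cannot simply dial a device behind a NAT.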

You might be wondering why this matters.

When two hosts try to connect directly to each other, NAT devices such as routers, along with firewalls, can block or interfere with the connection. To work around this, WebRTC implementations use several strategies, which we will look at next.

So what the heck is NAT Traversal?

NAT traversal is the process of finding a way for two computers or devices to connect with each other even though they are separated by one or more NAT (Network Address Translation) routers.

So, to facilitate NAT traversal, we use STUN and TURN servers and exchange ICE candidates.

Okay, there's too much jargon in this line 😮‍💨😮‍💨. Let's break it down.

What are ICE Candidates?

WebRTC generates ICE (Interactive Connectivity Establishment) candidates, which are IP address and port combinations that could be used to make a connection. These candidates come from several sources, including local network interfaces, STUN servers, and TURN servers. They help peers negotiate the best way to establish a connection given their NAT setups.
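
An ICE candidate travels between peers as a single line of text. The sketch below parses the standard fields of such a line (foundation, component, transport, priority, address, port, and type). The function and type names are assumptions for illustration, and optional extension fields other than `raddr` and `rport` are ignored.

```typescript
type CandidateType = "host" | "srflx" | "prflx" | "relay";

interface ParsedCandidate {
  foundation: string;
  component: number;
  transport: string;
  priority: number;
  address: string;
  port: number;
  type: CandidateType;
  relatedAddress?: string; // for srflx/relay: the address the candidate was derived from
  relatedPort?: number;
}

function isCandidateType(value: string): value is CandidateType {
  return value === "host" || value === "srflx" || value === "prflx" || value === "relay";
}

// Parses lines of the form:
// candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> [key value]...
function parseIceCandidate(line: string): ParsedCandidate {
  const body = line.trim().replace(/^a=/, "").replace(/^candidate:/, "");
  const parts = body.split(/\s+/);
  if (parts.length < 8 || parts[6] !== "typ") {
    throw new Error("not a valid ICE candidate line");
  }
  const type = parts[7];
  if (!isCandidateType(type)) {
    throw new Error("unknown candidate type: " + type);
  }
  const parsed: ParsedCandidate = {
    foundation: parts[0],
    component: Number(parts[1]),
    transport: parts[2].toLowerCase(),
    priority: Number(parts[3]),
    address: parts[4],
    port: Number(parts[5]),
    type,
  };
  // Anything after the type comes as "key value" pairs.
  for (let i = 8; i + 1 < parts.length; i += 2) {
    if (parts[i] === "raddr") parsed.relatedAddress = parts[i + 1];
    if (parts[i] === "rport") parsed.relatedPort = Number(parts[i + 1]);
  }
  return parsed;
}
```

The `type` field tells you where a candidate came from: `host` is a local interface, `srflx` (server reflexive) is the public address learned from a STUN server, and `relay` is an address allocated on a TURN server.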

STUN:

  • STUN (Session Traversal Utilities for NAT) is a protocol that lets a device behind a NAT discover its public IP address and port. STUN servers are typically located on the public internet.
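
For the curious, here is a rough sketch of what that discovery looks like on the wire. The browser does all of this for you, so treat it as illustration only. It builds the 20-byte header of a STUN Binding Request and decodes the IPv4 `XOR-MAPPED-ADDRESS` value a server sends back. The function names are assumptions, and sending the bytes over UDP is left out.

```typescript
const STUN_MAGIC_COOKIE = 0x2112a442; // fixed value present in every STUN message
const STUN_BINDING_REQUEST = 0x0001;

// Builds a Binding Request with no attributes: type, length, cookie, transaction ID.
function buildBindingRequest(transactionId: Uint8Array): Uint8Array {
  if (transactionId.length !== 12) {
    throw new Error("transaction ID must be exactly 12 bytes");
  }
  const message = new Uint8Array(20);
  const view = new DataView(message.buffer);
  view.setUint16(0, STUN_BINDING_REQUEST); // message type
  view.setUint16(2, 0); // message length: zero, since there are no attributes
  view.setUint32(4, STUN_MAGIC_COOKIE);
  message.set(transactionId, 8);
  return message;
}

// Decodes the value of an IPv4 XOR-MAPPED-ADDRESS attribute (8 bytes):
// reserved byte, family byte, port XOR'd with the top 16 cookie bits, address XOR'd with the cookie.
function decodeXorMappedAddress(value: Uint8Array): { ip: string; port: number } {
  if (value.length < 8 || value[1] !== 0x01) {
    throw new Error("expected an IPv4 XOR-MAPPED-ADDRESS value");
  }
  const view = new DataView(value.buffer, value.byteOffset, value.byteLength);
  const port = view.getUint16(2) ^ (STUN_MAGIC_COOKIE >>> 16);
  const address = (view.getUint32(4) ^ STUN_MAGIC_COOKIE) >>> 0;
  const ip = [
    address >>> 24,
    (address >>> 16) & 0xff,
    (address >>> 8) & 0xff,
    address & 0xff,
  ].join(".");
  return { ip, port };
}
```

The decoded address and port are what the device looks like from the outside, and that pair is what ends up in a server reflexive ICE candidate.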

TURN:

  • If a direct connection cannot be established using the addresses discovered through STUN, usually because of restrictive NAT configurations or firewall rules, ICE can fall back to a TURN (Traversal Using Relays around NAT) server.

  • The TURN server serves as a relay, accepting data from one device and relaying it to another. When direct peer-to-peer connections are not possible, this relayed communication ensures that data can flow between the devices.

  • One more concept is essential for understanding how everything fits together: signaling.

  • Signaling coordinates communication between WebRTC peers. It involves exchanging session control messages such as SDP (Session Description Protocol) offers and answers, and ICE candidates.

  • WebRTC does not mandate a signaling protocol: signaling can run over WebSocket, HTTP, or a custom protocol. The application developer chooses the signaling protocol and server.

  • When a WebRTC peer sends an ICE candidate that includes STUN-derived information to the remote peer through signaling, it helps the remote peer understand how to reach it through NAT devices.
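
Since the signaling format is up to the application, the following is only one possible shape. It sketches a JSON message envelope for offers, answers, and ICE candidates, plus an in-memory stand-in for a signaling server. Every name here (`SignalMessage`, `SignalingHub`, the `kind`/`from`/`to` fields) is an assumption; a real application would carry the same strings over WebSocket or HTTP.

```typescript
type SignalMessage =
  | { kind: "offer"; from: string; to: string; sdp: string }
  | { kind: "answer"; from: string; to: string; sdp: string }
  | { kind: "candidate"; from: string; to: string; candidate: string };

function encodeSignal(message: SignalMessage): string {
  return JSON.stringify(message);
}

// Validates untrusted JSON before treating it as a signaling message.
function decodeSignal(raw: string): SignalMessage {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("signal must be a JSON object");
  }
  const record = data as Record<string, unknown>;
  const kind = record["kind"];
  const from = record["from"];
  const to = record["to"];
  const sdp = record["sdp"];
  const candidate = record["candidate"];
  if (typeof from !== "string" || typeof to !== "string") {
    throw new Error("signal needs string 'from' and 'to' fields");
  }
  if (kind === "offer" && typeof sdp === "string") return { kind: "offer", from, to, sdp };
  if (kind === "answer" && typeof sdp === "string") return { kind: "answer", from, to, sdp };
  if (kind === "candidate" && typeof candidate === "string") {
    return { kind: "candidate", from, to, candidate };
  }
  throw new Error("unknown or malformed signal");
}

// In-memory stand-in for a signaling server: forwards each message to its addressee.
class SignalingHub {
  private handlers = new Map<string, (message: SignalMessage) => void>();

  register(peerId: string, handler: (message: SignalMessage) => void): void {
    this.handlers.set(peerId, handler);
  }

  // Returns false when the addressee is not connected.
  deliver(raw: string): boolean {
    const message = decodeSignal(raw);
    const handler = this.handlers.get(message.to);
    if (handler === undefined) return false;
    handler(message);
    return true;
  }
}
```

Note that the hub never looks inside the SDP or candidate strings. A signaling server only needs to pass them along; the peers themselves make sense of the contents.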

So far, we have seen how ICE candidates help two peers find a way to connect with each other.
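
In the browser, the STUN and TURN servers are handed to WebRTC as a list of ICE servers when the peer connection is created. Here is a small sketch of a helper that builds such a configuration. The helper and its types are assumptions, and the server URLs and credentials in the usage note are placeholders, not real services.

```typescript
interface IceServerConfig {
  urls: string | string[];
  username?: string;
  credential?: string;
}

interface PeerConfig {
  iceServers: IceServerConfig[];
  iceTransportPolicy: "all" | "relay";
}

interface TurnOptions {
  url: string;
  username: string;
  credential: string;
}

// Builds an object shaped like the configuration accepted by RTCPeerConnection.
function buildPeerConfig(
  stunUrl: string,
  turn?: TurnOptions,
  relayOnly: boolean = false
): PeerConfig {
  if (!/^stuns?:/.test(stunUrl)) {
    throw new Error("STUN URL must start with stun: or stuns:");
  }
  const iceServers: IceServerConfig[] = [{ urls: stunUrl }];
  if (turn !== undefined) {
    if (!/^turns?:/.test(turn.url)) {
      throw new Error("TURN URL must start with turn: or turns:");
    }
    iceServers.push({
      urls: turn.url,
      username: turn.username,
      credential: turn.credential,
    });
  }
  if (relayOnly && turn === undefined) {
    throw new Error("relay-only mode needs a TURN server");
  }
  // "relay" restricts ICE to TURN candidates; "all" lets it try direct paths first.
  return { iceServers, iceTransportPolicy: relayOnly ? "relay" : "all" };
}
```

In a browser you would pass the result straight to the constructor, for example `new RTCPeerConnection(buildPeerConfig("stun:stun.example.org:3478"))`.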

Now, for a video calling application, we need to transmit media, that is audio and video, from one peer to another.

So how is this possible? 🤔🤔

The first step is for the peers to agree on what will be sent and how, and that is where SDP comes in.

Now what is SDP?

SDP (Session Description Protocol) is a text-based format used to describe the media capabilities and preferences of a device.
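
An SDP description is plain text made of `key=value` lines. As a rough sketch, the function below pulls out the media sections (`m=` lines) and the codecs listed for each (`a=rtpmap` lines). The function and type names are assumptions, and real SDP carries many more attributes than this reads.

```typescript
interface SdpCodec {
  payloadType: number;
  name: string;
  clockRate: number;
}

interface SdpMedia {
  kind: string; // "audio", "video", ...
  port: number;
  protocol: string;
  codecs: SdpCodec[];
}

function parseSdpMedia(sdp: string): SdpMedia[] {
  const sections: SdpMedia[] = [];
  for (const line of sdp.split(/\r?\n/)) {
    // m=<media> <port> <protocol> <formats...> starts a new media section.
    const media = /^m=(\S+) (\d+) (\S+)/.exec(line);
    if (media !== null) {
      sections.push({
        kind: media[1],
        port: Number(media[2]),
        protocol: media[3],
        codecs: [],
      });
      continue;
    }
    // a=rtpmap:<payload type> <codec name>/<clock rate>[/<channels>]
    const rtpmap = /^a=rtpmap:(\d+) ([^/\s]+)\/(\d+)/.exec(line);
    const current = sections[sections.length - 1];
    if (rtpmap !== null && current !== undefined) {
      current.codecs.push({
        payloadType: Number(rtpmap[1]),
        name: rtpmap[2],
        clockRate: Number(rtpmap[3]),
      });
    }
  }
  return sections;
}
```

Feeding it a description with one `m=audio` and one `m=video` section gives back two entries, each with the codecs that peer is willing to use.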

  • After getting access to the device's camera and microphone, the calling peer describes its media setup in SDP and sends it to the other peer through signaling. This message is referred to as an SDP offer.

  • The SDP offer describes the sender's media capabilities, network settings, and audio and video preferences.

  • The receiving peer responds with an SDP answer, indicating its own media capabilities, preferences, and any adjustments needed to match the offer.

  • The SDP offer/answer exchange ensures that the media settings, codecs, and other parameters for the call are agreed upon by both peers.

  • The offer/answer exchange can also be repeated during the call to change the configuration if necessary.
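
The core of the offer/answer idea can be sketched as a set intersection: the answer keeps only the codecs that both sides support. This is a deliberately simplified model with made-up names. Real browsers do this internally and also match codec parameters, which this sketch ignores.

```typescript
interface Codec {
  name: string;
  clockRate: number;
}

// Returns the codecs from the offer that the answering peer also supports,
// keeping the order in which the offer listed them.
function negotiateCodecs(offered: Codec[], supported: Codec[]): Codec[] {
  return offered.filter((offer) =>
    supported.some(
      (own) =>
        own.name.toLowerCase() === offer.name.toLowerCase() &&
        own.clockRate === offer.clockRate
    )
  );
}
```

If the result is empty, the two peers have no codec in common for that kind of media, so that stream cannot be set up.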

Once the SDP exchange succeeds and the peer connection is established, the browsers start streaming audio and video directly to each other. This media is encrypted using the Secure Real-time Transport Protocol (SRTP) to ensure privacy and security.

By now you have seen all the major steps, so let us summarize them briefly.

  • The peers that want to connect first have to get through NAT traversal.

  • It is implemented with the help of ICE Candidates and STUN/TURN Servers.

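  • Once ICE candidates have been exchanged, the peers have a route to each other, and they also need to exchange media information. In practice the two steps overlap: candidates are gathered while the SDP offer and answer are being exchanged.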

  • The browser asks for permission to use media devices such as the microphone and camera. An SDP offer is then created and sent to the remote peer, which replies with an SDP answer.

  • The media is transmitted in encrypted form (SRTP) to protect it from eavesdropping and tampering.

In this way, after the exchange of media information, the browsers start streaming audio and video directly to each other.
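
The offer/answer part of this summary can be sketched as a tiny state machine. The state names mirror the `signalingState` values that browsers expose on `RTCPeerConnection`; the function and type names are assumptions, and provisional answers and rollback are left out to keep it short.

```typescript
type SignalingState = "stable" | "have-local-offer" | "have-remote-offer";
type DescriptionSide = "local" | "remote";
type DescriptionType = "offer" | "answer";

// Returns the state a peer moves to after applying a local or remote description.
function nextSignalingState(
  state: SignalingState,
  side: DescriptionSide,
  type: DescriptionType
): SignalingState {
  if (state === "stable" && type === "offer") {
    // The caller sets its own offer; the callee receives that offer as remote.
    return side === "local" ? "have-local-offer" : "have-remote-offer";
  }
  if (state === "have-local-offer" && side === "remote" && type === "answer") {
    return "stable"; // the caller received the answer
  }
  if (state === "have-remote-offer" && side === "local" && type === "answer") {
    return "stable"; // the callee created and applied its answer
  }
  throw new Error("invalid transition: " + side + " " + type + " while " + state);
}
```

Both peers start and end in `stable`: the caller goes through `have-local-offer`, and the callee goes through `have-remote-offer`.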

So I hope you enjoyed reading the article and that I was able to help you understand what goes on behind the scenes of WebRTC to some extent.
