Calvin Nguyen

Posted on Jul 20, 2020 • Edited on Sep 8, 2020

WebRTC - The technology that powers Google Meet/Hangout, Facebook Messenger and Discord

#programming #webdev #javascript #security

Here is what happens during a P2P connection, and all you need to know about Web Real-Time Communication

Everything you need to know about Web RTC in 9 minutes

History of Real-time Communication

Around 2010, real-time communication in the browser was only available through additional software, plugins, or Adobe Flash.
In 2013, the first cross-browser video call between Chrome & Firefox was introduced.
In 2014, the first cross-browser data transfer occurred, opening the way to a new trend of real-time communication on the client side.

Today, it is known as WebRTC, which we use every day in Chrome, Mozilla Firefox, Opera, Safari, Edge, iOS, and Android.

Overview

WebRTC stands for Web Real-Time Communication. It is a networking technology introduced by Google in 2011 to enable real-time audio, video, and data transmission between web browsers and native apps.

"Its mission is to enable rich, high-quality RTC applications to be developed for the browser, mobile platforms, and IoT devices, and allow them all to communicate via a common set of protocols."

WebRTC allows web apps to create peer-to-peer (P2P) communication. WebRTC is a vast topic, so in this post, we'll focus on the following topics:

Why do developers & companies love WebRTC?
What happens during the P2P connection
- Signaling
- NATs & ICE
- STUN & TURN Server
- VP9 Video Codec
JavaScript APIs
Security

Why do developers & companies love WebRTC?

Free open source
- It provides browsers with end-to-end direct communication and allows developers to facilitate this connection easily.
Speed Enhancement
- Media no longer needs to be routed through a server, which reduces latency and bandwidth consumption.
- Direct communication improves the speed of data transfer & file sharing.
No third party app required
- Requires no additional software, plugins, or continuous server involvement (well, a server is involved, but only in the beginning; you will see why later).
- Can easily be embedded in any website to connect peers across the internet.
Easy to implement
- Less time and effort to facilitate peer-to-peer (P2P) connection.
- All the functionality is available on the client side. Developers just need to download a WebRTC-compatible browser and use it.
Compatible
- Supported by most popular browsers: Microsoft Edge, Google Chrome, Mozilla Firefox, Safari, Opera, Vivaldi.
- Supported by Android, Chrome OS, Firefox OS, BlackBerry 10, iOS, Tizen.
Provide a safe connection across many browsers
- Encryption is mandatory for all of the WebRTC components.
- Since it is not a plugin, it runs inside the browser's sandbox without creating a new process, which makes it much harder for malware to get into the user's system.
- No need to keep track of updates. Because browsers update automatically, the user gets a patch as soon as it is available.

What happens during the P2P connection

Image by PubNub

To connect two browsers, WebRTC performs five steps to set up a P2P connection.

Signal processing to remove ambient noise from the audio or video.
Codec handling to compress and decompress the audio or video.
Routing from one peer to another through firewalls, NATs, and relays using Interactive Connectivity Establishment (ICE).
Encrypting user data before it is transmitted across connections.
Managing bandwidth according to what each peer's connection can provide.

Signaling

P2P connections in the browser are set up with the help of a signaling server, which ensures all peers agree to the session.
Information like session keys, error messages, media metadata, codecs, bandwidth, and public IP addresses and ports is shared between peers to create the connection.
Through the server, both peers determine what media format to use and what each peer wants to send to the other.
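To make the role of the server more concrete, here is a minimal sketch of a signaling relay. It is an in-memory stand-in, not a real server: the `SignalingRoom` class, the peer names, and the placeholder SDP strings are all assumptions made for illustration. In practice the same messages would usually travel over WebSocket or HTTP.

```javascript
// In-memory stand-in for a signaling server (hypothetical; a real one would
// typically relay these messages over WebSocket or HTTP).
class SignalingRoom {
  constructor() {
    this.peers = new Map(); // peerId -> callback invoked with relayed messages
  }

  join(peerId, onMessage) {
    this.peers.set(peerId, onMessage);
  }

  // Relay a message from one peer to every other peer in the room.
  // Only session metadata passes through here, never the media itself.
  send(fromId, message) {
    for (const [peerId, onMessage] of this.peers) {
      if (peerId !== fromId) onMessage({ from: fromId, ...message });
    }
  }
}

const room = new SignalingRoom();
const inboxCalvin = [];
const inboxElana = [];
room.join("calvin", (msg) => inboxCalvin.push(msg));
room.join("elana", (msg) => inboxElana.push(msg));

// Calvin proposes a session and Elana replies. The sdp strings are placeholders.
room.send("calvin", { type: "offer", sdp: "<offer SDP placeholder>" });
room.send("elana", { type: "answer", sdp: "<answer SDP placeholder>" });

console.log(inboxElana[0].type, inboxCalvin[0].type); // prints "offer answer"
```

In a browser, the offer and answer would come from `createOffer()` and `createAnswer()` on an `RTCPeerConnection`; the relay only has to pass them along.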

Network Address Translations (NATs) and ICE

A NAT, typically running on a device like a home router, translates the private IP addresses of the devices behind it to a public IP address. Firewalls and NATs get in the way of direct connections by blocking specific protocols or ports. The solution WebRTC uses is a framework called ICE.
ICE establishes a P2P connection over the internet by trying all possible connection paths in parallel and selecting the most efficient one that works. To do so, it relies on two types of servers: STUN & TURN
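One piece of how ICE ranks paths can be shown with a little arithmetic. The sketch below uses the candidate priority formula and the recommended type preferences from the ICE specification (RFC 8445). The function name and the default values for local preference and component ID are choices made for this example. Real ICE also runs connectivity checks on candidate pairs, which this sketch leaves out.

```javascript
// Recommended type preferences from the ICE specification (RFC 8445).
const TYPE_PREFERENCE = { host: 126, prflx: 110, srflx: 100, relay: 0 };

// priority = 2^24 * typePreference + 2^8 * localPreference + (256 - componentId)
function candidatePriority(type, localPreference = 65535, componentId = 1) {
  return 2 ** 24 * TYPE_PREFERENCE[type] + 2 ** 8 * localPreference + (256 - componentId);
}

// Rank three candidate types: direct (host), discovered via STUN (srflx),
// and relayed via TURN (relay).
const ranked = ["relay", "srflx", "host"]
  .map((type) => ({ type, priority: candidatePriority(type) }))
  .sort((a, b) => b.priority - a.priority);

console.log(ranked.map((c) => c.type).join(" > ")); // prints "host > srflx > relay"
```

The ordering matches the intuition of the post: a direct path is preferred, a STUN-discovered path comes next, and a TURN relay is the last resort.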

STUN Servers

ICE first tries to connect with the help of a STUN (Session Traversal Utilities for NAT) server, to get a direct link using the endpoint's public network address.

A STUN server provides the requestor a public IP address to communicate with others. Its purpose is to help a requestor answer the question, "What's my IP Address?"

How STUN servers work

To set up a connection with other peers, an endpoint needs to know its public IP address so it can share it with them.

When an endpoint (Calvin) is behind a NAT/firewall, it can only identify its local IP address, and the other endpoint (Elana) cannot reach that local IP from outside the NAT/firewall.
This endpoint asks the STUN server for its public IP address and the type of NAT it is behind.
The other endpoint (Elana) can attempt the connection between the two using the public IP address reported by the STUN server.
If successful, media will flow directly between the endpoints without a third party or another server.
Once the address is known, the STUN server is no longer involved in the session; it simply waits for the next query.
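The result of the STUN query shows up as a "server reflexive" (`srflx`) ICE candidate, which is what gets shared with the other peer. The sketch below parses such a candidate line into its fields. The `parseCandidate` helper and the sample line are made up for illustration, and the addresses come from ranges reserved for documentation.

```javascript
// Parse the fields of an ICE candidate line:
// candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> ...
function parseCandidate(line) {
  const parts = line.replace(/^a=/, "").replace(/^candidate:/, "").trim().split(/\s+/);
  const [foundation, component, transport, priority, address, port, typKeyword, type] = parts;
  if (typKeyword !== "typ") throw new Error("not an ICE candidate line");
  const candidate = {
    foundation,
    component: Number(component),
    transport: transport.toLowerCase(),
    priority: Number(priority),
    address,
    port: Number(port),
    type, // "host", "srflx" (learned via STUN), "prflx", or "relay" (via TURN)
  };
  // Optional pairs such as "raddr 192.168.1.20 rport 54321" give the local
  // address that the public one was mapped from.
  for (let i = 8; i + 1 < parts.length; i += 2) {
    if (parts[i] === "raddr") candidate.relatedAddress = parts[i + 1];
    if (parts[i] === "rport") candidate.relatedPort = Number(parts[i + 1]);
  }
  return candidate;
}

// Illustrative line; 203.0.113.0/24 is reserved for examples.
const sample =
  "candidate:842163049 1 udp 1694498815 203.0.113.7 46154 typ srflx raddr 192.168.1.20 rport 54321";
const parsed = parseCandidate(sample);
console.log(parsed.type, parsed.address, parsed.port); // prints "srflx 203.0.113.7 46154"
```

In a browser, lines like this arrive through the `icecandidate` event of an `RTCPeerConnection`, in the `candidate` property of each `RTCIceCandidate`.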

Limitations - Symmetric NAT

However, the approach above sometimes fails, because the public IP address and port seen by the other peer differ from the ones the STUN server reported.

This situation is called "symmetric NAT": the public address reported by the STUN server is not enough to establish connectivity, because the NAT assigns a different port mapping for each destination.

Some routers use symmetric NAT, which is meant to add another security layer to the endpoint and to prevent strangers from connecting to your device. A symmetric NAT not only translates the IP address from private to public but also translates ports.

In other words, the router will only accept connections from known peers that the user previously connected to. Hence, another solution is needed to ensure that the connection between two peers succeeds: the TURN server.
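A toy model helps show why the address learned from STUN stops being useful behind a symmetric NAT. Everything in this sketch is an assumption for illustration: the `ToyNat` class, the starting port, and the addresses. It does not describe how any specific router is implemented.

```javascript
// Toy model of NAT port mapping (illustration only, not a real NAT).
class ToyNat {
  constructor(publicIp, symmetric) {
    this.publicIp = publicIp;
    this.symmetric = symmetric;
    this.mappings = new Map();
    this.nextPort = 40000; // arbitrary starting port for the sketch
  }

  // Return the public "ip:port" that a destination sees for an outgoing packet.
  translate(privateAddr, destination) {
    // A symmetric NAT keys the mapping on the destination too, so each new
    // destination gets a fresh public port.
    const key = this.symmetric ? `${privateAddr}->${destination}` : privateAddr;
    if (!this.mappings.has(key)) {
      this.mappings.set(key, `${this.publicIp}:${this.nextPort++}`);
    }
    return this.mappings.get(key);
  }
}

const local = "192.168.1.20:5000";
const stunServer = "198.51.100.1:3478";
const remotePeer = "203.0.113.50:6000";

const coneNat = new ToyNat("203.0.113.7", false);
const symmetricNat = new ToyNat("203.0.113.7", true);

// Compare what STUN reports with what the remote peer would actually see.
console.log(coneNat.translate(local, stunServer) === coneNat.translate(local, remotePeer)); // prints true
console.log(symmetricNat.translate(local, stunServer) === symmetricNat.translate(local, remotePeer)); // prints false
```

With the non-symmetric NAT, the mapping the STUN server observed is the same one the remote peer can use. With the symmetric NAT, the remote peer would need a port that was never reported, which is why a relay becomes necessary.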

Why STUN servers are useful

As a protocol, STUN is super fast, lightweight, and straightforward. Because it is only needed while the connection is being set up, it helps peers connect quickly and start exchanging data in real time.

It is a bit like downloading over a wired LAN, which is faster than downloading over Wi-Fi. Most importantly, it allows the media to travel directly between both endpoints. Public STUN servers are also available for free.

TURN Servers

A TURN (Traversal Using Relays around NAT) server acts as a relay in case the direct peer-to-peer connection fails. While STUN servers are only used to establish the connection, TURN servers remain active throughout the session.

A TURN server keeps relaying the media between the WebRTC peers. That is why the term "relay" is used to define TURN.

How TURN servers work

This relay server is used to relay traffic if the direct connection attempted through STUN fails, and it also provides STUN's functions.

The TURN server is a STUN server with relaying capability built in. More specifically, TURN is used to relay audio, video, and data streams between peers, not signaling data.

Follow the steps for STUN Servers.
If STUN fails, the endpoint creates a connection with a TURN server and informs its peers to send their data to that server, which is in charge of forwarding it to the endpoint.

One main reason why STUN is always tried first is that running a TURN server is expensive: every relayed stream consumes the server's bandwidth, which adds up quickly when HD video is streamed.
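In the browser, both kinds of servers are declared in the configuration object passed to `RTCPeerConnection`. Here is a minimal sketch, assuming a placeholder TURN server: `turn.example.org` and its credentials are made up, and `buildRtcConfig` is a helper name chosen for this example. The STUN URL is the public server operated by Google.

```javascript
// Build the configuration object passed to `new RTCPeerConnection(config)`.
// The TURN URL and credentials below are placeholders, not a real service.
function buildRtcConfig({ relayOnly = false } = {}) {
  return {
    iceServers: [
      { urls: "stun:stun.l.google.com:19302" }, // public STUN server operated by Google
      {
        urls: "turn:turn.example.org:3478", // hypothetical TURN server
        username: "demo-user",
        credential: "demo-secret",
      },
    ],
    // "all" lets ICE try direct paths first; "relay" forces traffic through TURN.
    iceTransportPolicy: relayOnly ? "relay" : "all",
  };
}

const config = buildRtcConfig();
// In a browser: const pc = new RTCPeerConnection(config);
console.log(config.iceTransportPolicy); // prints "all"
```

Setting `relayOnly` to true is mostly useful for testing that the TURN server works, since it forces every stream through the relay.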

VP9 Video Codec

One of the main reasons people start using WebRTC is video streaming. As live video becomes more mainstream and higher in quality, data has to be transferred faster, or the packets have to be smaller, for the stream to get through easily.

That is where the VP9 video codec comes in, compressing and decompressing the video. It helps video stream faster and look clearer. By supporting VP8, Safari 12.1 can exchange live video with other peers.

VP9, the successor to VP8, is an open video compression format developed by Google. VP8 was originally created by On2 Technologies, which Google acquired in 2010.

The main feature is to conceal packet loss and clean up noisy images, as well as capture and playback capabilities across multiple platforms.

Because VP9 compresses more efficiently than VP8, users can stream a 720p video with less bandwidth, which makes packet loss and delay less likely. It can also support a 1080p video call at roughly the bandwidth VP8 needs for 720p, which helps on poor connections and reduces data usage and costs for users.
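When several codecs are available, an app can ask the browser to prefer one of them. The sketch below reorders a codec list so that VP9 comes first. The `preferCodec` helper and the sample list are assumptions for illustration; the objects only mirror the shape of the codec capabilities a browser reports.

```javascript
// Move every codec whose mimeType matches `preferred` to the front,
// keeping the relative order of the rest.
function preferCodec(codecs, preferred) {
  const wanted = preferred.toLowerCase();
  const first = codecs.filter((c) => c.mimeType.toLowerCase() === wanted);
  const rest = codecs.filter((c) => c.mimeType.toLowerCase() !== wanted);
  return [...first, ...rest];
}

// Sample capability list (illustrative values only).
const sampleCodecs = [
  { mimeType: "video/VP8", clockRate: 90000 },
  { mimeType: "video/H264", clockRate: 90000 },
  { mimeType: "video/VP9", clockRate: 90000 },
];

const ordered = preferCodec(sampleCodecs, "video/VP9");
console.log(ordered.map((c) => c.mimeType).join(", ")); // prints "video/VP9, video/VP8, video/H264"

// In a browser that supports it, the same helper could feed setCodecPreferences:
//   const { codecs } = RTCRtpReceiver.getCapabilities("video");
//   transceiver.setCodecPreferences(preferCodec(codecs, "video/VP9"));
```

The preference is only a request. The codec actually used is still negotiated with the other peer, so both sides need to support VP9 for it to be selected.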

JavaScript APIs

There are three main JavaScript APIs that handle audio capturing, video conferencing, and data transmission:

MediaStream

Gives access to input devices such as the user's microphone and web camera in order to capture and stream audio and video.
When a developer integrates WebRTC into their website, they can set constraints on how they want the audio and video streamed. Constraints can cover the frame rate, the size of the video frame, the resolution, and much more.
This API was provided as part of HTML 5, whereas the other two APIs are explicitly offered for WebRTC.
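Here is a minimal sketch of what such constraints can look like. The default values and the helper names `buildConstraints` and `startCapture` are choices made for this example. The capture call itself only works in a browser, in a secure context, after the user grants permission.

```javascript
// Build a constraints object for getUserMedia. `ideal` values are hints the
// browser tries to honor; it may fall back to the closest supported settings.
function buildConstraints({ width = 1280, height = 720, frameRate = 30 } = {}) {
  return {
    audio: { echoCancellation: true, noiseSuppression: true },
    video: {
      width: { ideal: width },
      height: { ideal: height },
      frameRate: { ideal: frameRate, max: 60 },
    },
  };
}

// Request the camera and microphone and show the result in a <video> element.
// Browser only: requires HTTPS (or localhost) and the user's permission.
async function startCapture(videoElement) {
  const stream = await navigator.mediaDevices.getUserMedia(buildConstraints());
  videoElement.srcObject = stream;
  return stream;
}

console.log(JSON.stringify(buildConstraints({ frameRate: 15 }).video.frameRate));
// prints {"ideal":15,"max":60}
```

Using `ideal` rather than `exact` keeps the request from failing outright on devices that cannot match the requested resolution or frame rate.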

RTCPeerConnection

Sends the audio and video captured by getUserMedia in real time across the internet to another WebRTC endpoint.
Has features to connect to a remote peer, maintain and monitor the connection, and close the connection once it is no longer needed.
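Those three responsibilities (send, monitor, close) can be sketched in a few lines. The helper names `wirePeerConnection` and `hangUp` and the `onStatus` callback are assumptions for this example. The `pc` argument is expected to be an `RTCPeerConnection` and `stream` a `MediaStream`, and the signaling part is left out.

```javascript
// Attach local tracks to a peer connection and report its state changes.
// `onStatus` is a hypothetical callback supplied by the app.
function wirePeerConnection(pc, stream, onStatus) {
  for (const track of stream.getTracks()) {
    pc.addTrack(track, stream); // send each captured track to the remote peer
  }
  // connectionState moves through values such as "connecting", "connected",
  // "disconnected", and "failed".
  pc.addEventListener("connectionstatechange", () => onStatus(pc.connectionState));
}

// End the call: release the camera and microphone, then close the connection.
function hangUp(pc, stream) {
  stream.getTracks().forEach((track) => track.stop());
  pc.close();
}
```

In a browser, `pc` would come from `new RTCPeerConnection(config)` and `stream` from `getUserMedia`; taking both as parameters keeps the helpers easy to try with stand-in objects.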

RTCDataChannel

Transmit arbitrary data. Each data channel is associated with an RTCPeerConnection.
Built-in security (DTLS) and congestion control.
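A data channel is created from an existing peer connection. The sketch below opens one and queues messages until the channel is ready. The `openChatChannel` helper, the "chat" label, and the queueing behavior are choices made for this example, and `pc` is expected to be an `RTCPeerConnection`.

```javascript
// Open a data channel and send text once it is ready.
// The "chat" label is arbitrary; the remote peer receives the channel
// through its own "datachannel" event.
function openChatChannel(pc, onMessage) {
  const channel = pc.createDataChannel("chat", { ordered: true });
  const pending = [];

  channel.addEventListener("open", () => {
    // Flush anything queued while the channel was still connecting.
    while (pending.length > 0) channel.send(pending.shift());
  });
  channel.addEventListener("message", (event) => onMessage(event.data));

  return {
    send(text) {
      if (channel.readyState === "open") channel.send(text);
      else pending.push(text);
    },
    close() {
      channel.close();
    },
  };
}
```

Queueing is one possible design: calling `send()` on a channel that is not open yet throws, so the wrapper holds messages until the "open" event fires.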

Security

One of the security risks in any real-time communication application arises during the transmission of data. For that reason, encryption is a mandatory feature of WebRTC and is enforced on all components.

WebRTC uses two standardized encrypting protocols:

Datagram Transport Layer Security (DTLS)

A standardized protocol that is built into the browser. It's used to encrypt data streams. It is based on Transport Layer Security (TLS).
Runs over the User Datagram Protocol (UDP), so it preserves the semantics of the underlying transport.
TLS is the successor of Secure Sockets Layer (SSL); DTLS brings equivalent protection to datagram traffic, so WebRTC data is encrypted end to end between the peers.

Secure Real-time Transport Protocol (SRTP)

Used to encrypt media streams.
It is an extension to Real-Time Transport Protocol (RTP), which does not have any built-in security mechanisms. Adds confidentiality, integrity, and message authentication to RTP.
Downside: While it provides encryption for the RTP packets, it does not encrypt the header.

Steps to secure a link between 2 peers

The signaling process is initiated to exchange the metadata between two peers.
ICE check is performed, and ICE establishes a channel between parties.
DTLS handshake is performed. If there are media transported, the SRTP uses the keys that were exported at the DTLS handshake step.
All peers have secure channels with keys that are not known publicly.
Exchange Keys between the peers.
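One detail connects the signaling step to the DTLS handshake: the session description exchanged during signaling carries a fingerprint of each peer's certificate, and the handshake only succeeds if the certificate presented matches it. The sketch below pulls that fingerprint out of an SDP string. The `extractFingerprints` helper and the shortened sample SDP are made up for illustration.

```javascript
// Extract DTLS fingerprints ("a=fingerprint:<hash function> <hex bytes>")
// from an SDP string.
function extractFingerprints(sdp) {
  const prefix = "a=fingerprint:";
  return sdp
    .split(/\r?\n/)
    .filter((line) => line.startsWith(prefix))
    .map((line) => {
      const [algorithm, value] = line.slice(prefix.length).trim().split(/\s+/);
      return { algorithm: algorithm.toLowerCase(), value: value.toUpperCase() };
    });
}

// Shortened, made-up SDP fragment; a real sha-256 fingerprint has 32 bytes.
const sampleSdp = [
  "v=0",
  "m=application 9 UDP/DTLS/SCTP webrtc-datachannel",
  "a=setup:actpass",
  "a=fingerprint:sha-256 12:AB:34:CD:56:EF",
].join("\r\n");

const fingerprints = extractFingerprints(sampleSdp);
console.log(fingerprints[0].algorithm, fingerprints[0].value); // prints "sha-256 12:AB:34:CD:56:EF"
```

This is also why the signaling channel itself should be protected (for example with HTTPS or secure WebSockets): anyone able to rewrite the fingerprint in transit could impersonate a peer.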

Applications that use WebRTC

Google Meet / Google Hangouts
Facebook Messenger
Discord
Amazon Chime
....

For more, you can check out this link for a list of apps that use WebRTC
http://www.webrtcworld.com/webrtc-list.aspx


Top comments (10)

Tilak Madichetti • Jul 20 '20 • Edited

Dude I don't have a pro account on medium - I can read only 3 art. / month. If you really want to help the community here, post stuff that we can at least read. Don't use this site to drive traffic to your posts on a paid platform . I am not saying you should completely avoid it but at least your post here should contain something - Not just a link . C'mon man - its not cool

manish srivastava • Jul 21 '20

copy link and paste in incognito window....may help

Calvin Nguyen • Jul 20 '20 • Edited

Hi, Sure. Thanks for giving me some insight My apologies for forgetting about that. I'm new here, so just trying to give some useful links :)

yellow1912 • Sep 13 '20

Yup I really hate medium now. People who really want to share for free (yup I understand some people want to make money and that is great too) should not post there anymore.

David Dal Busco • Jul 21 '20

Nice post 👍

We use WebRTC for the communication between the remote control and the presentations created with our editor or developer kit. I'm always surprised how responsive it is.

If someone want to give it a try: