Unsurprisingly, the Covid-19 pandemic revealed exactly how useful audio-video conferencing software could be. Having to shelter in place and work remotely, people worldwide had to manage everything from doctor’s appointments, fitness classes, and dating to Friday night drinks and live concerts on video apps.
Naturally, video conferencing software has seen massive growth in usage during the pandemic. There was a 627% increase in downloads of video chat and online conference apps in North America. A 121% increase in daily active users was also observed. The explosion of Zoom and Google Meet is common knowledge by now. Even Skype for Business, GoToMeeting, and JoinMe apps saw downloads increase by 66%, 85%, and 43%, respectively, in March 2020.
App Categories with Significant Growth
As it stands, users expect certain apps to have built-in audio-video communication capabilities. The numbers attest to this. Video conferencing improves communication for 99% of people. Video meetings boost productivity by 50%. By every metric, video conferencing capabilities improve user experience.
However, embedding audio-video communication into an app from scratch requires time, effort, and investment. Instead, it's much easier to implement audio-video using an SDK designed for that purpose.
Since there are multiple products and SDKs aimed at helping devs build comprehensive video conferencing into their software, choosing the right platform is more complicated than it seems.
To help devs and product managers make a more informed decision, this article will break down the main features and limitations of five tools that provide the infrastructure for embedding in-app audio-video functionality: Agora, Twilio, Jitsi, Zoom, and 100ms.
Note: This piece will compare tools on six parameters: ease of integration, error handling, scalability, cost of support, plugins for easy feature development, and pricing.
Agora is an API-first SaaS company that started by providing real-time audio and video broadcast APIs. Now, it has expanded to a platform that allows customers to create rich, in-app audio-video features, such as real-time recording and messaging, embedded video, and video chat as well as interactive live video streaming.
- Cross-platform SDKs that are highly customizable.
- Using low code UIKit libraries, users can embed a real-time video UI with a few lines of code.
- Agora uses its own Software Defined Real-time Network (SD-RTN™), a real-time transmission network. Unlike a traditional carrier network, the SD-RTN™ is not limited by device type, phone numbers, or a network provider’s coverage radius.
- Multiple extensions (interactive whiteboard, cloud recording, Agora analytics) enable the easy addition of new and useful features to an app.
- Offers official SDKs for React Native, Electron, Unity, Cocos, and Flutter.
1. Ease of integration: Agora offers both pre-built and custom SDKs. The pre-built version can be installed and activated in a few lines of code but is neither customizable nor extensible. It comes with two pre-defined permissions (roles) for peers within a call: host and participant.
With the custom SDK, developers have to handle low-level publish-subscribe abstractions. This adds overhead in handling network exceptions, bandwidth management, and writing role-based infrastructure. Developers will also have to manually configure the permissions for different roles in the call (teacher vs. student, instructor vs. learner, etc.).
2. Error Handling: Agora does not provide built-in handling for disconnections or for device edge cases such as app backgrounding and microphone switching. Devs must write extra code to set these up.
3. Scalability: Within a single call, Agora allows a maximum of 17 hosts and a total of 10,000 participants, including the hosts.
4. Cost of Support: Agora’s free Starter plan offers Tickets/Email support, Online Documentation, and KB Access. Services like code review, guaranteed response times, live developer consultation, and training must be paid for (the lowest price being $1200/month and the highest being $4900/month).
5. Plugins for easy feature development: Agora offers multiple plugins for the easier development of feature-rich apps. Users can access Agora’s plugin marketplace for a large number of integrations with various functions. However, adding all plugins requires extra coding effort.
6. Pricing: Starts at $4, but Agora’s pricing policy is quite layered and will require close examination before purchase.
- Pricing for Agora goes up exponentially as the number of video participants increases. Anytime the download quality on a call exceeds 720p (which can happen by screen-sharing between two peers), charges switch to HD prices. Agora’s pricing plans can also be complicated for users.
- You cannot have more than 17 hosts in a meeting.
- Integrating the SDK is quite complicated and time-consuming.
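Where disconnection handling is left to the application, as described under error handling above, one common approach is to retry the connection with exponential backoff. The following is a minimal Python sketch of that idea and not Agora's API: `connect` stands in for whatever join call the SDK in use provides, and `reconnect_with_backoff`, `base_delay`, `max_delay`, and `max_attempts` are hypothetical names chosen for illustration.

```python
import time
from typing import Callable


def reconnect_with_backoff(
    connect: Callable[[], None],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `connect` until it succeeds; return the number of attempts used.

    `connect` is a placeholder for the SDK's real join/rejoin call and is
    assumed to raise ConnectionError on failure.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            connect()
            return attempt
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up and let the app show an error state
            # Double the wait after each failure, capped at max_delay.
            sleep(min(base_delay * 2 ** (attempt - 1), max_delay))
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting; a production app would likely also listen for network-change events rather than rely on timers alone.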
Twilio originally offered an API for automating traditional phone calls and SMS text messages. Today, it provides programmable APIs to help developers build business communication (in-app and otherwise) across the customer journey. They allow devs to integrate audio and video interactions into multiple platforms.
Once the Twilio API has been integrated, interactions can take the form of SMS, WhatsApp, Voice, Video, email, and even IoT.
- Communications APIs to implement messaging, voice chats, and video conversations within the software or outside its UI (SMS, WhatsApp messages, etc.)
- Programmable connectivity features (Chat, Voice API, Video) for generating virtual phone numbers, initiating SIP trunking, and messaging.
- Use case-based APIs that allow the abstraction required for tasks related to authentication, message control, and call routing.
1. Ease of integration: Twilio provides web, iOS, and Android SDKs. The SDK does not support other frameworks like Flutter and React Native. When using multiple audio and video inputs, devs must manually configure them, for which they have to write extra code.
Much like Agora, devs have to handle low-level publish-subscribe abstractions to set up Twilio integrations. They must also manually define permissions for all actors within a call.
2. Error Handling: Manual configuration is required to build bandwidth management. Twilio offers extensive call insights to track and analyze errors.
3. Scalability: Twilio supports a maximum of 50 hosts within a call and a maximum of 50 participants, including hosts. You can switch to Twilio Live for HLS streaming and accommodate unlimited participants in a call. However, a separate SDK integration is required to stream video via HLS.
4. Cost of Support: Twilio’s free support plan offers API status notifications and email support during business hours. Users have to pay for more services like 24/7 live chat support, support escalation line, quarterly status review, and guaranteed response times. The price depends on the support plan, usually a percentage of the monthly plan or a certain minimum amount (lowest being $250/month and highest being $5000/month).
5. Plugins for easy feature development: No plugins are available.
6. Pricing: Starts at $4, and has a relatively simple pricing policy.
- SDKs are available only for web, iOS, and Android.
- It is not possible to stream video via RTMP.
- Devs must manually compose all recordings.
- Number of participants is limited to 50.
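Defining permissions manually, as both Agora and Twilio require, usually comes down to mapping each role to what it may publish and whose tracks it may subscribe to. Below is a minimal Python sketch of such a mapping. It is not Twilio's or Agora's API: the `Role` class, the `teacher` and `student` roles, and the `can_publish` and `can_subscribe` helpers are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    """Hypothetical role description; not part of any vendor SDK."""
    name: str
    publish: frozenset       # track kinds this role may send
    subscribe_to: frozenset  # role names whose tracks it may receive


ROLES = {
    "teacher": Role("teacher", frozenset({"audio", "video", "screen"}),
                    frozenset({"teacher", "student"})),
    "student": Role("student", frozenset({"audio"}),
                    frozenset({"teacher"})),
}


def can_publish(role_name: str, track_kind: str) -> bool:
    """Return True if the role is allowed to send this kind of track."""
    return track_kind in ROLES[role_name].publish


def can_subscribe(subscriber: str, publisher: str) -> bool:
    """Return True if `subscriber` may receive tracks from `publisher`."""
    return publisher in ROLES[subscriber].subscribe_to
```

An application would consult checks like these before publishing a local track or subscribing to a remote one, which is the kind of role-based infrastructure that has to be written by hand when the SDK ships without predefined roles.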
Jitsi is a collection of open-source projects designed to help users build and implement secure video conferencing options. Among its offerings, Jitsi Meet is best known for providing video conferencing services. Jitsi also comes with meet.jit.si which hosts a free-for-use Jitsi Meet instance and the Jitsi Videobridge which powers and sustains Jitsi’s multi-peer video features.
- Jitsi is an open-source solution with impressive community support.
- Setup is relatively easy and comes with one-click installation.
- The process to set up video/audio calls and multi-user meeting rooms is quite user-friendly.
- Jitsi uses industry-standard physical, administrative, and technical shields to safeguard the confidentiality of users' personal data.
- Jitsi users can choose from service providers across multiple geographies to host the application locally.
- Jitsi supports all available clients (Windows, Linux, Mac, iOS, Android).
1. Ease of integration: Offers both pre-built and custom SDKs. Like Twilio, there are no predefined roles. Devs have to manually configure permissions for peers within a call.
2. Error Handling: Manual configuration is required to build bandwidth and connection management into Jitsi's low-level API. Jitsi also has SDKs for less customizable versions with some in-built connection management.
3. Scalability: With the open-source SDK, Jitsi supports a maximum of 100 hosts in a call, and a maximum of 100 participants, including the hosts. The paid Jitsi SDK from 8x8 supports a maximum of 500 participants (including hosts) in a call.
4. Cost of Support: Users can ask questions to the Jitsi community. For paid support, they will have to approach 8x8, the company that acquired Jitsi in 2018.
5. Plugins for easy feature development: Since Jitsi is fully DIY, plugins may not be available for most features. However, there are some open-source plugins devs can use.
6. Pricing: Largely free of charge for the open-source SDK. However, some costs will be involved for the deployment infrastructure. Users can expect that to be 40-50% of the cost of other paid providers.
For the paid SDK from 8x8, prices are based on the number of monthly active users. For example, JaaS (Jitsi as a Service) Dev supporting 25 monthly active users is free, JaaS Basics supporting 300 users is $99/month, JaaS Standard supporting 1500 users is $499/month, and so on.
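To make the tiering concrete, the three JaaS plans quoted above can be put into a small lookup. This is only a sketch: the `JAAS_TIERS` table and `cheapest_tier` function are hypothetical names, the user counts are assumed to be inclusive upper bounds, and plans beyond JaaS Standard are left out because their figures are not given here.

```python
from typing import Optional, Tuple

# (plan name, monthly active users supported, USD per month), as quoted above.
# Larger plans exist ("and so on") but their figures are not listed here.
JAAS_TIERS = [
    ("JaaS Dev", 25, 0),
    ("JaaS Basics", 300, 99),
    ("JaaS Standard", 1500, 499),
]


def cheapest_tier(monthly_active_users: int) -> Optional[Tuple[str, int]]:
    """Return (plan, price) for the smallest listed plan that fits, else None."""
    for name, max_users, price in JAAS_TIERS:
        if monthly_active_users <= max_users:
            return name, price
    return None  # beyond the plans listed in this article
```

For instance, `cheapest_tier(26)` falls just outside the free plan and lands on JaaS Basics, which is where an app's first paying month would begin under these assumptions.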
- Given its open-source nature, video apps with Jitsi have to be built from scratch. Therefore, it takes a significant amount of time to build a stable product.
Zoom is a cloud-based video conferencing app that offers SDKs for devs to implement audio-video communications and relevant capabilities within new or existing apps. Zoom SDKs are split into the Meeting SDK and the Video SDK.
Zoom SDKs allow for the setup and integration of numerous features such as live chat, webinars, screen sharing, changing background, and multiple collaborative functions. Zoom Meeting SDKs are a solid option for anyone looking to include a slew of video communication features into their software ecosystem.
- Zoom users can simply import libraries and packages for quick implementation of the Zoom meeting platform into applications.
- Zoom supports seven major languages and provides open translation extensibility which opens any app to international growth and improved user experience.
- The Zoom Video SDK comes with fully customizable UI features that developers can expand and modify according to the requirements of their app.
1. Ease of integration: Zoom offers two predefined roles: host and participant. Permissions for these roles cannot be modified.
2. Error Handling: Zoom SDKs come with in-built error handling and bandwidth management, as built into their consumer offering. Developers have to handle only minimal reconnection/bandwidth management.
3. Scalability: Zoom supports a maximum of 300 hosts in a call and a maximum of 1000 participants including the hosts.
4. Cost of Support: Zoom offers three customer support plans: Access, Premier, and Premier+. Pricing for these plans has to be obtained by contacting Zoom.
5. Plugins for easy feature development: Currently, Zoom does not have a plugin marketplace for devs to use.
6. Pricing: Base pricing starts at $3.99. Zoom has a fairly straightforward, uncomplicated pricing policy.
- Zoom is best suited for basic use cases, as it only allows the use of predetermined roles: host and participant. For use cases that require modified permissions for peers, Zoom may pose difficulties.
- The SDK’s footprint size is inordinately large.
100ms provides live audio-video SDKs that enable devs to add powerful, extensible, scalable, and resilient audio-video features into their apps with half a dozen lines of code. It abstracts the business logic of the conference room in Templates and Roles. The client-side SDKs include all edge cases within the SDK rather than leaving it to the application side.
The solution has been built by the team that powered live video infrastructure for some of the world’s largest live events, running billions of minutes a day at Facebook and Disney+Hotstar.
- The 100ms SDK provides pre-built templates to build virtual events, audio rooms, classrooms, and a wide range of use-cases with a few lines of code.
- The SDK is comprehensive, meaning that any piece of code which multiple application developers have to write repeatedly exists in the infrastructure layer.
- By virtue of room templates, the business logic remains on the server-side rather than being burned into client apps.
- The SDK is fully customizable and designed to be modified as required by any app’s UI.
1. Ease of integration: 100ms SDKs are fully customizable. The publish-subscribe logic has been abstracted in the concept of roles. Pre-defined roles are available, and users can customize roles as required with zero coding - right on the 100ms dashboard.
Devs can use roles to build complex video applications with minimal coding. All the code resides on the SDK side. Write a few lines of code on the application side and go live.
2. Error Handling: Pre-built disconnection handling is available in 100ms. Devs also benefit from edge case handling like granular device capture errors, in-built network degradation handling, automatic lowest latency server choice, and more.
3. Scalability: 100ms supports up to 10,000 participants and up to 100 hosts on regular video calls. It provides a single switch between WebRTC and HLS, which means that video calls can be streamed to millions via HLS with one click.
4. Cost of Support: No extra cost is charged for support. Customers can access support via private Slack channels as well as Discord.
5. Plugins for easy feature development: With the 100ms SDK, the most relevant features like whiteboard, hand-raising, media player, chat, and screen share are provided out of the box. Customers also get RTMP Streaming, HLS, and recording capabilities out of the box. Therefore, too many plugins may not be necessary.
6. Pricing: Base pricing starts at $4. Simple, easy-to-understand pricing policy.
- No simulcast is available.
- No hosting servers in ANZ and South America.
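The host and participant limits quoted in the scalability entries for each tool can be collected into a quick filter for shortlisting. The sketch below uses only the per-call figures stated in this article. `CALL_LIMITS` and `providers_for` are hypothetical names, and the table deliberately leaves out HLS streaming options (Twilio Live, 100ms) and the paid Jitsi SDK from 8x8 (500 participants), so treat it as a starting point rather than a complete comparison.

```python
# (provider, max hosts per call, max participants per call including hosts),
# copied from the scalability entries in this article.
CALL_LIMITS = [
    ("Agora", 17, 10_000),
    ("Twilio", 50, 50),
    ("Jitsi (open source)", 100, 100),
    ("Zoom", 300, 1_000),
    ("100ms", 100, 10_000),
]


def providers_for(hosts: int, participants: int) -> list:
    """List providers whose stated per-call limits cover the requirement."""
    return [
        name
        for name, max_hosts, max_participants in CALL_LIMITS
        if hosts <= max_hosts and participants <= max_participants
    ]
```

A call needing 20 hosts and 500 participants, for example, rules out Agora on hosts and Twilio and open-source Jitsi on participants, which shows why both numbers need checking together.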