ZegoTech

How to implement web/app group voice chat from 0 to 1

Since the second half of 2019, a number of apps have drawn attention for their strong performance, such as Yinyu (music-based social networking), Zhiya (voice-based social networking) and Spot (which combines music with maps), showcasing the huge potential of “voice & social networking”. In the spring of 2019, the voice social networking segment entered a new stage, with many innovative products emerging.

By supporting many customers in live broadcasting and social networking, ZEGO has deepened the integration of live broadcast technology with the pan-entertainment industry. Drawing on this understanding of customer needs, the ZEGO ChatRoom Solution offers social networking and pan-entertainment customers an array of voice chat formats. It accommodates the subtly different ways voice social products operate through optimized functions and detailed low-level configuration.

ZEGO Voice ChatRoom Solution
Covering the functions that voice social networking products require, the ZEGO Voice ChatRoom Solution comprises 3 modules:
Mic-to-mic: group voice chat
Playing music: playing background music or ambient sound effects
Background optimization: keeping the app running in the background so that in-game voice chat continues after the user switches apps.

1. Mic-to-mic: group voice chat
Group voice chat generally depends on multi-user mic-to-mic technology. Developing this entirely in-house is difficult, because the platform would:
a. need to independently deploy servers and deal with massive concurrency;
b. need to optimize the encoder and decoder to address echo and noise problems;
c. need mature technology plans to reduce latency and improve sound quality;
d. need to guarantee user experience by being compatible with all network environments.
Built on an independently developed audio and video processing engine, the ZEGO Voice ChatRoom Solution addresses these difficulties of group chat and exposes the related APIs in its SDK, so customers can integrate it quickly and easily. It offers:
Cooperation with leading network operators, rich node resources and unlimited capacity expansion
Multiple mature pre-processing algorithms for echo cancellation and strong noise reduction
Latency below 100 ms worldwide, support for 1080p and adaptation to multiple resolutions
Adaptation to complex network conditions and compatibility with over 5000 models of Android mobile phones
The ZEGO video calling SDK provides both voice and video functions and is easy to extend: after activating the video chatroom, customers can enable mic-to-mic video streaming as needed. Platforms that already offer interactive live streaming can develop new application formats for the voice chatroom.

2. Playing music: playing background music and ambient sound effects
In many scenarios, playing background music in the voice chatroom enhances the user experience. ZEGO voice chatrooms support MP3 and MP4 music files stored locally, on the internet or in the iOS Media Library. A playlist can be played in three modes (sequential, random and loop), and well-designed playback control interfaces are provided.
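The three playback modes can be sketched as follows. This is a minimal illustration, not the ZEGO player API: the `Playlist` class, its method names, and the reading of "loop" as repeating the whole playlist are all assumptions made for the example.

```python
import random
from typing import List, Optional


class Playlist:
    """Hypothetical playlist with sequential, random and loop playback."""

    MODES = ("sequence", "random", "loop")

    def __init__(self, tracks: List[str], mode: str = "sequence",
                 seed: Optional[int] = None):
        if mode not in self.MODES:
            raise ValueError(f"unknown mode: {mode}")
        if not tracks:
            raise ValueError("playlist must contain at least one track")
        self.tracks = list(tracks)
        self.mode = mode
        self.index = -1  # nothing has been played yet
        self._rng = random.Random(seed)

    def next_track(self) -> Optional[str]:
        """Return the next track, or None once a sequential run has finished."""
        if self.mode == "sequence":
            if self.index + 1 >= len(self.tracks):
                return None  # every track was played once, so stop
            self.index += 1
        elif self.mode == "loop":
            # Assumption: "loop" wraps around to the first track.
            self.index = (self.index + 1) % len(self.tracks)
        else:  # random
            self.index = self._rng.randrange(len(self.tracks))
        return self.tracks[self.index]
```

With two tracks, sequential mode yields both and then `None`, while loop mode returns to the first track after the last one.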
ZEGO’s music player mixes the audio being played into the published stream so that all users in the voice chatroom can hear the background music, for example a strong beat track to play along to while gaming with friends.
In entertainment scenarios, ambient sound effects such as applause, whistles and laughter are essential. ZEGO voice chatrooms support playing ambient sounds that do not interfere with the background music, making the chatrooms livelier.
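Conceptually, mixing background music and sound effects into the published stream means summing audio samples from several sources. The sketch below assumes 16-bit PCM frames of equal length and a simple gain-sum-clip approach. It is not how the ZEGO engine is documented to work, and `mix_frames` is a hypothetical helper.

```python
from typing import List, Sequence

INT16_MIN, INT16_MAX = -32768, 32767


def mix_frames(sources: Sequence[Sequence[int]],
               gains: Sequence[float]) -> List[int]:
    """Mix equal-length 16-bit PCM frames: scale by gain, sum, then clip."""
    if not sources:
        raise ValueError("at least one source frame is required")
    if len(sources) != len(gains):
        raise ValueError("one gain per source is required")
    length = len(sources[0])
    if any(len(s) != length for s in sources):
        raise ValueError("all frames must have the same length")
    mixed = []
    for i in range(length):
        total = sum(src[i] * g for src, g in zip(sources, gains))
        # Clip to the 16-bit range so loud passages do not wrap around.
        mixed.append(max(INT16_MIN, min(INT16_MAX, int(round(total)))))
    return mixed


mic = [1000, -1000, 30000]     # speaker's voice
music = [2000, 2000, 30000]    # background music, played at half volume
effect = [0, 500, 0]           # ambient sound effect, e.g. applause
print(mix_frames([mic, music, effect], [1.0, 0.5, 1.0]))  # [2000, 500, 32767]
```

Keeping a separate gain per source is one way to let effects play over the music without either replacing the other.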

3. Background optimization: keeping the app running in the background and in-game voice chat active after switching
The ZEGO Voice ChatRoom Solution supports detailed configuration for different scenarios, optimizing comprehensively for each business model. In gaming, for example, ZEGO delivers low latency and good sound quality, and reduces CPU usage through dedicated encoding and decoding schemes, so that gameplay stays smooth and voice chat is free of noticeable delay.

Five advantages of ZEGO Voice ChatRoom Solution
The ZEGO Voice ChatRoom Solution comprises the functional modules a voice chatroom needs, with a flexible and extensible architecture that makes it easy for users to develop new uses. In addition, based on its self-developed audio and video processing engines, the solution has the following advantages:
1. Business-oriented API design supporting fast integration
The ZEGO Voice ChatRoom Solution adopts a simple, direct, business-oriented API. A developer needs a minimum of 4 lines of code to add voice chat to an app. The built-in APIs, simple but functional, together with scenario modules, let customers integrate easily and efficiently.
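To make the "minimum of 4 lines" idea concrete, here is a rough sketch of what such a call sequence might look like. Everything in it is hypothetical: `VoiceRoomClient` is a stand-in that only records calls, and the four steps (create a client, join a room, publish your own audio, play another speaker's audio) are an assumption about a typical real-time audio flow, not the actual ZEGO SDK API.

```python
from typing import List


class VoiceRoomClient:
    """Hypothetical stand-in for a voice chat SDK; it only records calls."""

    def __init__(self, app_id: int):
        self.app_id = app_id
        self.calls: List[str] = []

    def login_room(self, room_id: str, user_id: str) -> None:
        self.calls.append(f"login_room({room_id}, {user_id})")

    def start_publishing(self, stream_id: str) -> None:
        self.calls.append(f"start_publishing({stream_id})")

    def start_playing(self, stream_id: str) -> None:
        self.calls.append(f"start_playing({stream_id})")


# The "four lines" a caller would write under these assumptions:
client = VoiceRoomClient(app_id=123456)   # 1. create the client
client.login_room("room-1", "alice")      # 2. join the chatroom
client.start_publishing("alice-audio")    # 3. send own microphone audio
client.start_playing("bob-audio")         # 4. receive another speaker's audio
```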
[Figure: illustration of the invocation procedure]
2. Scenario-based detailed configuration
Different scenarios, such as entertainment live broadcast, games, social networking and education, place subtly different requirements on audio configuration in terms of bitrate, sampling rate and number of audio channels. The ZEGO Voice ChatRoom Solution provides 4 sets of recommended configurations for different uses. Developers can therefore choose a suitable set for their requirements without having to learn a large number of parameters, their meanings and how to tune them.
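Scenario presets of this kind can be modelled as a lookup table from scenario name to audio parameters. In the sketch below, the preset names and all numeric values are illustrative assumptions; the article does not list ZEGO's actual recommended configurations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioConfig:
    bitrate_kbps: int
    sample_rate_hz: int
    channels: int


# Illustrative values only; these are NOT ZEGO's published presets.
PRESETS = {
    "entertainment_live": AudioConfig(128, 48000, 2),
    "game": AudioConfig(32, 16000, 1),
    "social": AudioConfig(48, 32000, 1),
    "education": AudioConfig(64, 44100, 1),
}


def config_for(scenario: str) -> AudioConfig:
    """Return the preset for a scenario, with a helpful error if unknown."""
    try:
        return PRESETS[scenario]
    except KeyError:
        raise ValueError(
            f"no preset for scenario {scenario!r}; choose from {sorted(PRESETS)}"
        ) from None
```

A caller then writes `config_for("game")` and never touches the individual parameters.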

3. Mic-to-mic voice chat management
The ZEGO Voice ChatRoom Solution introduces the concept of a microphone position. A chatroom user joins the live broadcast by taking a microphone position (mic connection) and stops by leaving it (mic disconnection). Other operations include moving to another position (mic shift), bringing another user onto the mic, banning a mic and permanently closing a mic position.
The solution exposes only the concept of the microphone position and its related operations, with APIs designed around developers' actual business needs. Stream publishing, stream playing and retries are managed internally, so developers can focus on product design without worrying about the complex underlying technology.
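The microphone-position model can be pictured as a small piece of state with a handful of operations. The sketch below is a hypothetical illustration of that idea; the `MicSeats` class and its method names are assumptions, and the real solution additionally drives stream publishing and playing from this state.

```python
from typing import List, Optional


class SeatError(Exception):
    """Raised when a microphone-position operation is not allowed."""


class MicSeats:
    """Hypothetical model of microphone positions in one chatroom."""

    def __init__(self, seat_count: int):
        self.occupant: List[Optional[str]] = [None] * seat_count
        self.banned: List[bool] = [False] * seat_count  # "mic ban"
        self.closed: List[bool] = [False] * seat_count  # closed position

    def _seat_of(self, user: str) -> Optional[int]:
        return self.occupant.index(user) if user in self.occupant else None

    def _check_free(self, seat: int) -> None:
        if self.closed[seat]:
            raise SeatError("position is closed")
        if self.occupant[seat] is not None:
            raise SeatError("position is occupied")

    def take(self, user: str, seat: int) -> None:
        """Mic connection: the user takes a free position."""
        if self._seat_of(user) is not None:
            raise SeatError("user already holds a position")
        self._check_free(seat)
        self.occupant[seat] = user

    def leave(self, user: str) -> None:
        """Mic disconnection: the user gives up their position."""
        seat = self._seat_of(user)
        if seat is None:
            raise SeatError("user holds no position")
        self.occupant[seat] = None

    def shift(self, user: str, new_seat: int) -> None:
        """Mic shift: move the user to another free position."""
        old = self._seat_of(user)
        if old is None:
            raise SeatError("user holds no position")
        self._check_free(new_seat)
        self.occupant[old], self.occupant[new_seat] = None, user

    def ban(self, seat: int, banned: bool = True) -> None:
        self.banned[seat] = banned

    def close(self, seat: int) -> None:
        """Close a position; any current occupant is removed."""
        self.occupant[seat] = None
        self.closed[seat] = True

    def publishers(self) -> List[str]:
        """Users whose audio should currently be published."""
        return [u for u, b in zip(self.occupant, self.banned)
                if u is not None and not b]
```

In this model, the list returned by `publishers()` is what an internal manager would use to decide which streams to publish and play.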
4. Reduced bandwidth and traffic
The ZEGO Voice ChatRoom Solution uses mic-to-mic technology built on DTX (discontinuous transmission) and VAD (voice activity detection).
With DTX, no signal is transmitted over the network when there is no voice input during mic-to-mic communication, which reduces traffic and the battery usage of users’ mobile phones.
VAD detects whether human speech is present in the communication. When no speech is detected, unnecessary processing is skipped; when speech is detected, the audio is encoded, compressed and transmitted. This uses as little bandwidth as possible and reduces traffic costs.
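The interplay of VAD and DTX can be illustrated with a deliberately simple energy-based detector. This is a sketch under stated assumptions: the RMS threshold of 500 and the helper names are invented for the example. Production codecs use far more robust speech detectors and typically also send occasional comfort-noise updates during silence.

```python
import math
from typing import List, Sequence, Tuple


def rms(frame: Sequence[int]) -> float:
    """Root-mean-square level of one PCM frame."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def is_speech(frame: Sequence[int], threshold: float = 500.0) -> bool:
    """Naive VAD: treat a frame as speech when its level reaches the threshold."""
    return rms(frame) >= threshold


def dtx_filter(frames: Sequence[Sequence[int]],
               threshold: float = 500.0) -> List[Tuple[int, Sequence[int]]]:
    """Naive DTX: keep only (index, frame) pairs that the VAD marks as speech."""
    return [(i, f) for i, f in enumerate(frames) if is_speech(f, threshold)]


frames = [
    [0] * 160,            # silence
    [1000, -1000] * 80,   # loud frame, treated as speech
    [10, -10] * 80,       # faint background noise
]
sent = dtx_filter(frames)
print([i for i, _ in sent])  # [1]
```

Here only one of three frames would be sent, which is where the bandwidth and battery savings come from.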
5. Open APIs for multiple functions
The ZEGO Voice ChatRoom Solution supports voice changing, stereo (3D surround) and reverberation. The open APIs let the platform customize a wide range of sound effects and offer more innovative features, so users have more fun in voice-based social networking.
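As an idea of what a reverberation effect involves, the sketch below implements a single feedback comb filter, one of the simplest building blocks of artificial reverb. It is a generic textbook technique rather than ZEGO's implementation, and `comb_reverb` with its parameters is a hypothetical helper.

```python
from typing import List, Sequence


def comb_reverb(samples: Sequence[float], delay: int,
                feedback: float) -> List[float]:
    """Feedback comb filter: y[n] = x[n] + feedback * y[n - delay]."""
    if delay <= 0:
        raise ValueError("delay must be a positive number of samples")
    if not 0.0 <= feedback < 1.0:
        raise ValueError("feedback must be in [0, 1) to keep the filter stable")
    out: List[float] = []
    for n, x in enumerate(samples):
        # Add a decayed copy of the output from `delay` samples ago.
        echo = feedback * out[n - delay] if n >= delay else 0.0
        out.append(x + echo)
    return out


# A single impulse produces a train of decaying echoes.
print(comb_reverb([1.0, 0.0, 0.0, 0.0, 0.0], delay=2, feedback=0.5))
# [1.0, 0.0, 0.5, 0.0, 0.25]
```

Realistic reverbs combine several such filters with different delays, but the decaying-echo principle is the same.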
With the ZEGO Voice ChatRoom Solution, customers can develop a voice chat product simply and efficiently. By creating more possibilities for voice social networking, it helps to reduce the dominance of a handful of social giants and diversify the social app ecosystem, offering users more choices.
