
Ticorrian Heard for Zoom


Migrating from the Agora Video SDK to the Zoom Video SDK

Transition Guide Agora Web Video SDK to Zoom Web Video SDK

Quick Snapshot of steps:

  1. Install the Zoom Web Video SDK

  2. Import Zoom Video SDK

  3. Replace Agora function join logic with Zoom function join logic

    • Implement backend logic for handling token generation using both SDKs' REST APIs
  4. Implement camera and audio setup using Zoom SDK methods. Also implement methods for controlling the media peripherals (mute and unmute)

  5. Implement SDK listener callbacks for joining, leaving, and video state changes

  6. Implement logic for controlling remote user video in each SDK's listeners

  7. Implement meeting leave logic

Features of this app:

  • Dynamic selection between SDKs in the same app
  • Join a session unique to each service
  • Mute/unmute media peripherals
  • Leave the meeting
  • Check whether the meeting limit is exceeded using the REST APIs

Installing the SDK

Replace Agora's RtcEngine with Zoom's SDK, which powers live voice, video, and streaming integrations, by either adding the CDN link to the index.html file via a script tag or installing the @zoom/videosdk package from the command line.

Access Zoom Video SDK in the Zoom Marketplace

  1. Go to marketplace.zoom.us and log in to your Zoom Video SDK account
  2. Click Build Video SDK in the right corner; you will be taken to your app's credentials page, where you can see your Video SDK credentials and API credentials
  3. Store these credentials in your .env file for local development
ZOOM_SDK_KEY=
ZOOM_SDK_SECRET=
ZOOM_API_KEY=
ZOOM_API_SECRET=
AGORA_APP_ID=
AGORA_APP_CERTIFICATE=
AGORA_CUST_ID=
AGORA_CUST_SECRET=
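
On the Node.js backend described later, these values can be read with the dotenv library; a minimal sketch, assuming a TypeScript server entry file (hypothetical name server.ts):

// server.ts (sketch): load the .env credentials on the server so the keys and
// secrets never reach the browser.
import 'dotenv/config';

const zoomSdkKey = process.env.ZOOM_SDK_KEY;
const zoomSdkSecret = process.env.ZOOM_SDK_SECRET;
const agoraAppId = process.env.AGORA_APP_ID;
const agoraAppCertificate = process.env.AGORA_APP_CERTIFICATE;
// The API/customer credentials are read the same way.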

Import Zoom SDK

Install the Zoom Video SDK package from the command line:

$ npm install @zoom/videosdk

In zoom.service.ts, import the Zoom Video SDK via this statement:

import ZoomVideo from '@zoom/videosdk'
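
After the import, a client is typically created and initialized before joining. A short sketch based on the Zoom Video SDK docs (the init arguments shown here are assumptions, not taken from this app):

import ZoomVideo from '@zoom/videosdk'

// Create the client once and initialize it; 'Global' tells the SDK to load its
// dependent assets from Zoom's CDN.
const client = ZoomVideo.createClient();

async function initZoomClient(): Promise<void> {
  await client.init('en-US', 'Global');
}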

Frontend changes
The Zoom SDK paints and renders video data onto a canvas. Replace the div element with a canvas element with the desired styling and the #videoCanvas reference as shown (in this case we add the canvas element as an option selected via *ngIf):

From:

<div class="video-canvas">    
    <div *ngIf="mode === 'agora'" class="center-video-div" #videoCanvas></div>
</div>

To:

<div class="video-canvas">
    <canvas *ngIf="mode === 'zoom'" width="1920" class="center-video-canvas" height="1080" #videoCanvas></canvas>    
    <div *ngIf="mode === 'agora'" class="center-video-div" #videoCanvas></div>
</div>

Backend

We use a local Node.js server to handle token generation, credential fetching, and secure REST API calls for both services. Credentials are stored in a .env file that is not tracked by git and are loaded into the server with the dotenv library. The server runs the Express framework and a fetch library for parity with the fetch calls in our frontend.

Our endpoints are:
/zoom-session-count - uses the Zoom REST API to fetch the number of users in the session and returns it to the frontend. Utilizes the utility function generateJWTToken and the /videosdk/sessions/ Zoom REST API endpoint.

/zoomtoken - generates a Zoom Video SDK JWT to use when joining a session. The utility function generateJWTToken is called here.

/agora-token - generates an Agora Video SDK token to use when joining a session.

/agora-appid - returns the Agora appID stored in the .env file.

/agora-channel-count - uses the Agora REST API to fetch the number of users in the channel and returns it to the frontend. This uses the /channel/user/ Agora REST API endpoint.
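
For reference, the generateJWTToken helper behind /zoomtoken could be implemented roughly as follows with the jsonwebtoken package. This is a sketch: the payload fields follow the Video SDK JWT format in Zoom's documentation at the time of writing, and the helper in the sample app may differ.

import jwt from 'jsonwebtoken';

// Sketch of a Video SDK session JWT signer (HS256, keyed with the SDK secret).
function generateVideoSdkJwt(topic: string, userIdentity: string, password: string): string {
  const iat = Math.floor(Date.now() / 1000);
  const payload = {
    app_key: process.env.ZOOM_SDK_KEY,
    version: 1,
    tpc: topic,             // session name the client will join
    role_type: 1,           // 1 = host, 0 = participant
    user_identity: userIdentity,
    session_key: password,  // session passcode, if one is used (assumption)
    iat,
    exp: iat + 60 * 60 * 2  // valid for two hours
  };
  return jwt.sign(payload, process.env.ZOOM_SDK_SECRET as string, { algorithm: 'HS256' });
}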

Joining the meeting

We use the same information given in the join form for both services. In ui.service.ts, when the mode is set to Agora, the logic for joining an Agora session is executed:

ui.service.ts

async joinSession(name: string, sessionId: string, password: string): Promise<boolean> {   
    let exceeded: boolean = false;
    switch (this.toggle.mode) {
      case "zoom":
        console.log("joining zoom");
        await this.zoomService.joinSession(name, sessionId, password).then( (e) => { exceeded = e; } );
        break;
      case "agora":
        console.log("joining agora");
        await this.agoraService.joinSession(name, sessionId, password).then( (e) => { exceeded = e; } );  
        break;
    }
    return exceeded;
   }

First, we check whether the limit of four users per session has been exceeded via the Agora REST API:

agora.service.ts

let count!: number;
    let url: string = "http://localhost:3001/agora-channel-count/?channelname=" + sessionId;
    await fetch(url).then( async res => {
      await res.text().then(data => {
            count = parseInt(data);
            console.log(count, data)
        });
      });
    if (count >= 4) return true;

If the limit is exceeded, the exceeded flag is set to true, which is seen by the frontend, and the session is not joined; an alert notifies the user that the limit has been reached. If not, the Agora appID and an Agora token are retrieved from our backend server and the session is joined using the Agora SDK's join function:

agora.service.ts

this.sessionId = sessionId;
    this.agoraEngine = AgoraRTC.createClient({ mode: "rtc", codec: "vp8" });
    url = "http://localhost:3001/agora-appID";
    let settings: RequestInit = {
        mode: "cors",
        method: 'GET',
        headers: {
          "Content-Type": "text/plain"
        }
      };
    await fetch(url, settings).then( async res => {
        await res.text().then(data => {
            this.appID = data.toString().trim();
        });
    });
    url = "http://localhost:3001/agora-token?name=" + name + "&topic=" + sessionId + "&password=" + password;

    await fetch(url, settings).then( async res => {
        await res.text().then(data => {
            this.token = data.toString().trim();
        });
    });

    await this.agoraEngine!.join(this.appID, this.sessionId, this.token, this.uid);

    return false;
  }
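
On the backend, the /agora-token endpoint can be built with Agora's agora-access-token package. A minimal sketch (the uid and expiry choices are assumptions, not necessarily what the sample app does):

import { RtcTokenBuilder, RtcRole } from 'agora-access-token';

// Sketch of the token builder behind /agora-token.
function generateAgoraRtcToken(channelName: string): string {
  const appId = process.env.AGORA_APP_ID as string;
  const appCertificate = process.env.AGORA_APP_CERTIFICATE as string;
  const privilegeExpiredTs = Math.floor(Date.now() / 1000) + 60 * 60 * 2; // two hours

  // uid 0 issues a wildcard token that any uid joining the channel can use.
  return RtcTokenBuilder.buildTokenWithUid(
    appId,
    appCertificate,
    channelName,
    0,
    RtcRole.PUBLISHER,
    privilegeExpiredTs
  );
}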

For remote users, Agora triggers the user-published event whenever a new user publishes their media data.

The Zoom SDK join logic operates the same way. First, given the mode is set to Zoom, the logic for joining a Zoom session is executed in ui.service.ts.

First, we check whether the limit of four users per session has been exceeded via the Zoom REST API:

zoom.service.ts

let count!: number;
    let check: string = "http://localhost:3001/zoom-session-count/?sessionname=" + sessionId;
    await fetch(check).then( async res => {
      await res.text().then(data => {
            count = parseInt(data);
            console.log(count, data)
        });
      });
    if (count >= 4) return true;

If the limit is exceeded, the exceeded flag is set to true, which is seen by the frontend, and the session is not joined; an alert notifies the user that the limit has been reached. If not, the Zoom Video SDK JWT is retrieved from our backend server and the session is joined using the Zoom SDK's join function:

zoom.service.ts

let url: string = "http://localhost:3001/zoomtoken?name=" + name + "&topic=" + sessionId + "&password=" + password;
    let settings: RequestInit = {
        mode: "cors",
        method: 'POST',
        headers: {
          "Content-Type": "text/plain"
        }
      };

    await fetch(url, settings).then( async res => {
        await res.text().then(data => {
            token = data.toString().trim();
        });
    });

    await this.client.join(sessionId, token, name, password).then(() => {
      this.stream = this.client.getMediaStream();
      this.populateParticipantList();
    }).catch((error: any) => {
      console.log(error);
    });

    return false;

For remote users, the user-added event is triggered once a user joins, and that user is added to the participant list by our userAdded callback. Their video and/or audio can be consumed once they set up their peripherals later in the process:

private userAdded = ()=>{
    let participantList: any = this.client.getAllUser();
    participantList.forEach((participant: any, i: number) => {
        this.participants[i].userId = participant.userId;
    });
    console.log(this.participants);
  };

Setting up Media

Media setup in Agora uses a pub/sub model where users publish their video and audio to the channel and other users subscribe to receive that video and audio. In this app, we use a local user object and an array of remote user objects, with matching property names but different track types, to keep track of each user and their peripherals:

private localUser: AgoraParticipant = {
    userId: '',
    PlayerContainer: null,
    AudioTrack: null,
    VideoTrack: null,

  };
  private remoteParticipantGrid: AgoraRemoteParticipant[] = [
    {
      userId: '',
      PlayerContainer: null,
      AudioTrack: null,
      VideoTrack: null
    },
    {
      userId: '',
      PlayerContainer: null,
      AudioTrack: null,
      VideoTrack: null
    },
    {
      userId: '',
      PlayerContainer: null,
      AudioTrack: null,
      VideoTrack: null
    }
  ]
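
For reference, the participant shapes implied by this code might look like the following (hypothetical; inferred from how the fields are used, since the actual interfaces are defined elsewhere in the app):

import {
  IMicrophoneAudioTrack, ICameraVideoTrack,
  IRemoteAudioTrack, IRemoteVideoTrack
} from 'agora-rtc-sdk-ng';

// Local user: holds the microphone/camera tracks created on this device.
interface AgoraParticipant {
  userId: string;
  PlayerContainer: HTMLElement | null;
  AudioTrack: IMicrophoneAudioTrack | null;
  VideoTrack: ICameraVideoTrack | null;
}

// Remote users: hold the subscribed remote tracks for each grid slot.
interface AgoraRemoteParticipant {
  userId: string;
  PlayerContainer: HTMLElement | null;
  AudioTrack: IRemoteAudioTrack | null;
  VideoTrack: IRemoteVideoTrack | null;
}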

After joining the session, we have to set up the local user to publish their video to the channel:

async setupLocalUserView(): Promise<void> {
    console.log(this.videoCanvas, this.appID, this.token, this.uid, this.sessionId);

    this.localUser.PlayerContainer = this.videoCanvas;
    this.localUser.PlayerContainer!.id = this.uid.toString();

    this.createGrid();

    await AgoraRTC.createMicrophoneAndCameraTracks().then( async (audioVideoTracks) => {
      this.localUser.AudioTrack = audioVideoTracks[0];
      this.localUser.VideoTrack = audioVideoTracks[1];
    });

    await this.agoraEngine!.publish([<ILocalTrack>this.localUser.AudioTrack, <ILocalTrack>this.localUser.VideoTrack]);
    this.localUser.VideoTrack!.play(<HTMLElement>this.localUser.PlayerContainer);
    await this.localUser.AudioTrack!.setMuted(true);
    console.log("publish success!");
  }

The local user's media is then played (the track's .play method injects a video element into the player container and plays the video there).

Publishing makes this media visible to other users; we then use the Agora SDK's user-published listener, with the userPublished callback, to subscribe to each existing participant's media as they publish it:

private userPublished = async (user: IAgoraRTCRemoteUser, mediaType: any) => {
    await this.agoraEngine!.subscribe(user, mediaType);

    let userFound: number = this.remoteParticipantGrid.findIndex( (participant: AgoraRemoteParticipant) => { return (participant.userId === user.uid.toString()) });
    let newUser: number = this.remoteParticipantGrid.findIndex( (participant: AgoraRemoteParticipant) => { return (participant.userId === '') });

    let userIndex = (userFound != -1) ? userFound : newUser;

    if (mediaType == "video") {
        this.remoteParticipantGrid[userIndex].VideoTrack = <IRemoteVideoTrack>user.videoTrack;
        this.remoteParticipantGrid[userIndex].AudioTrack = <IRemoteAudioTrack>user.audioTrack;
        this.remoteParticipantGrid[userIndex].userId = user.uid.toString();
        this.remoteParticipantGrid[userIndex].PlayerContainer!.id = user.uid.toString();

        this.remoteParticipantGrid[userIndex].VideoTrack!.play(<HTMLElement>this.remoteParticipantGrid[userIndex].PlayerContainer);
    }
    if (mediaType == "audio") {
        this.remoteParticipantGrid[userIndex].AudioTrack = <IRemoteAudioTrack>user.audioTrack;
        this.remoteParticipantGrid[userIndex].AudioTrack!.play();
      }

      console.log("SUBSCRIBE SUCCESS", this.remoteParticipantGrid);
    }
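
These callbacks are attached to the Agora client's events; a sketch of the registration (event names are from the Agora Web SDK, handler names are the ones used in this service, and the exact place this happens in the app is an assumption):

// Sketch: wire up the Agora event handlers, typically right after createClient().
private registerAgoraListeners(): void {
    this.agoraEngine!.on("user-published", this.userPublished);
    this.agoraEngine!.on("user-unpublished", this.userUnpublished);
    this.agoraEngine!.on("user-left", this.userLeft);
  }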

Each remote user is added to the remoteParticipantGrid array so we can quickly access their data as well as control their video element in the DOM.

Zoom handles this in a simpler way: a single canvas element is used to paint every user's video by passing x/y coordinates to the Zoom Video SDK's render method. After joining the session, a media stream is retrieved from the SDK and used to control the participant video and audio data the SDK delivers to the client:

await this.client.join(sessionId, token, name, password).then(() => {
      this.stream = this.client.getMediaStream();
      this.populateParticipantList();
    }).catch((error: any) => {
      console.log(error);
    });

We use a single participants array to keep track of all users and their data within the session:

private participants: Participant[] = [ //placeholder values
    {userId : '', X : 0, Y : 540},
    {userId : '', X : 960, Y : 540},
    {userId : '', X : 0, Y : 0},
    {userId : '', X : 960, Y : 0}
  ];
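
The Participant shape implied by these placeholders is roughly the following (hypothetical; the actual interface is defined elsewhere in the app):

interface Participant {
  userId: number | string; // '' placeholder until a slot is filled with the SDK's numeric userId
  X: number;               // x-offset on the shared canvas where this user's video is rendered
  Y: number;               // y-offset on the shared canvas
}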

In ui.service.ts, we call the setupCamera and setupMicrophone methods. The SDK finds our video and audio peripherals automatically and starts them upon calling .startVideo and .startAudio, respectively:

async setupCamera(): Promise<void> { 
    await this.stream.startVideo().then( () => {
     this.stream.renderVideo(this.videoCanvas, this.participants[0].userId, 960, 540, this.participants[0].X, this.participants[0].Y,3);
    });
    console.log("camera setup");
  }

  async setupMicrophone(): Promise<void> {
    await this.stream.startAudio().then( () => {

      setTimeout(() => { // I get a "no audio joined" error if I don't use setTimeout; not sure why this is a race condition even though .then is used. I'd rather not use this.
        this.stream.muteAudio();
        console.log("audio muted");
      }, 150);

    });
    console.log("microphone setup");
  }

We also start our SDK event listeners to listen for the user-added, user-removed, and peer-video-state-change events during the session (registration is sketched after the next snippet). Finally, we render the videos of all participants in the session on the canvas:

async renderParticipantsVideo(): Promise<void> {
    console.log("rendering participant videos");
    this.participants.forEach( async (participant) => {
      if (participant.userId !== '') {
        console.log("rendering participant:", participant.userId, participant.X, participant.Y);
        await this.stream.stopRenderVideo(this.videoCanvas, participant.userId);
        await this.stream.renderVideo(this.videoCanvas, participant.userId, 960, 540, participant.X, participant.Y,3);
      }
    });
  }
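
The listener registration mentioned above might look like this (event names follow the Zoom Video SDK web docs; the handler names are the ones used in this service, and the wrapping method is an assumption):

// Sketch: attach the Zoom Video SDK event handlers after joining the session.
private registerZoomListeners(): void {
    this.client.on('user-added', this.userAdded);
    this.client.on('user-removed', this.userRemoved);
    this.client.on('peer-video-state-change', this.userVideoStateChange);
  }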

This approach requires minimal access to the DOM and lets the SDK handle the painting internally; you only have to tell the SDK where on the canvas each video goes.

Controlling Audio and Video

Both Agora and Zoom offer easy ways to control audio and video using their mute and unmute functions. There are also methods for checking the state of the audio and video, which we use to check whether either peripheral is muted. Here we can see the similarities between the Zoom and Agora functions:

agora.service.ts

async toggleAudio(): Promise<void> {
    if (this.localUser.AudioTrack!.muted) {
      await this.localUser.AudioTrack!.setMuted(false);
    } else {
      await this.localUser.AudioTrack!.setMuted(true);
    }
  }

  async toggleVideo(): Promise<void> {
    if (this.localUser.VideoTrack!.muted) {
      await this.localUser.VideoTrack!.setMuted(false);
    } else {
      await this.localUser.VideoTrack!.setMuted(true);
    }
  }

  isMutedAudio() {
    return this.localUser.AudioTrack!.muted;
  }

  isMutedVideo() {
    return this.localUser.VideoTrack!.muted;
  }

zoom.service.ts

async toggleVideo(): Promise<void> {
    if (!this.client.getCurrentUserInfo().bVideoOn) {
      await this.stream.startVideo().then( () => {
        this.stream.renderVideo(this.videoCanvas, this.participants[0].userId, 960, 540, this.participants[0].X, this.participants[0].Y,3);
      });
    } else {
       await this.stream.stopVideo();
    }
  }

  async toggleAudio(): Promise<void> {
    if (this.client.getCurrentUserInfo().muted) {
     await this.stream.unmuteAudio();
    } else {
     await this.stream.muteAudio();
    }
  }

  isMutedAudio(): boolean {
    return this.client.getCurrentUserInfo().muted;
  }

  isMutedVideo(): boolean {
    return !this.client.getCurrentUserInfo().bVideoOn; // bVideoOn is true when video is on, so invert it to report muted
  }

The major difference between the SDKs is how their listeners behave. In Agora, muting video or audio unpublishes that track; the change is picked up by the user-unpublished listener, and because the track is unpublished, subscribed users no longer see or hear the affected media.

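A minimal sketch of such a user-unpublished handler, reusing the remoteParticipantGrid structure from earlier (hypothetical; not taken verbatim from the sample app):

private userUnpublished = (user: IAgoraRTCRemoteUser, mediaType: "audio" | "video") => {
    let userIndex: number = this.remoteParticipantGrid.findIndex( (participant: AgoraRemoteParticipant) => { return (participant.userId === user.uid.toString()) });
    if (userIndex === -1) return;

    if (mediaType === "video") {
        // The remote video track is no longer published, so stop playing it in its container.
        this.remoteParticipantGrid[userIndex].VideoTrack?.stop();
        this.remoteParticipantGrid[userIndex].VideoTrack = null;
    }
    if (mediaType === "audio") {
        this.remoteParticipantGrid[userIndex].AudioTrack?.stop();
        this.remoteParticipantGrid[userIndex].AudioTrack = null;
    }

    console.log("user unpublished", user.uid, mediaType);
  };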

For Zoom, user media is tracked through the stream object retrieved from the SDK. Media state changes are reported to the userVideoStateChange listener, which determines whether to render or stop rendering a user's video based on the event's action:

private userVideoStateChange =  (payload: any) => {
    let userIndex = this.participants.findIndex( (participant) => { return (participant.userId === payload.userId) });
    console.log("video state change:", payload.userId);
    if (payload.action === 'Start') {
      this.stream.renderVideo(this.videoCanvas, payload.userId, 960, 540, this.participants[userIndex].X, this.participants[userIndex].Y,3);
    } else if (payload.action === 'Stop') {
     this.stream.stopRenderVideo(this.videoCanvas, payload.userId);
    }
  };

Leaving the session

Leaving the session or channel is similar in both SDKs. The media peripherals have to be stopped first, and then the session can be left, which triggers the respective leave listener (userRemoved for Zoom, userLeft for Agora). The sequence is shown for each SDK below:

Agora

async leaveSession(): Promise<void> {
    this.localUser.AudioTrack!.close();
    this.localUser.VideoTrack!.close();
    await this.agoraEngine!.leave();
    console.log("You left the channel");
  }

userLeft function:

private userLeft = async (user: IAgoraRTCRemoteUser, reason: string) => {
    let userIndex: number = this.remoteParticipantGrid.findIndex( (participant: AgoraRemoteParticipant) => { return (participant.userId === user.uid.toString()) });
    let parent: HTMLElement|null = document.querySelector("body > app-root > div > app-meeting > div > app-video-client > app-video-canvas > div");
    this.renderer.removeChild(parent, this.remoteParticipantGrid[userIndex].PlayerContainer);
    this.remoteParticipantGrid.splice(userIndex, 1);

    let PlayerContainer: HTMLElement = this.renderer.createElement('div');
    this.renderer.setStyle(PlayerContainer, "width", "550px");
    this.renderer.setStyle(PlayerContainer, "height", "380px");
    this.renderer.setStyle(PlayerContainer, "display", "inline-block");
    this.renderer.appendChild(parent, PlayerContainer);

    let newSlot: AgoraRemoteParticipant = {
      userId: '',
      PlayerContainer: PlayerContainer,
      AudioTrack: null,
      VideoTrack: null
    }

    this.remoteParticipantGrid.push(newSlot);

    console.log("new grid", this.remoteParticipantGrid);
  };

Zoom

async leaveSession(): Promise<void> {

    await this.stream.stopVideo();
    await this.stream.stopAudio();
    console.log("audio and video stopped");

    if (this.client.isHost()){
      console.log("ending session");
      await this.client.leave(true);
    } else {
      await this.client.leave();
    }
  }

userRemoved function:

private userRemoved = async (payload: any)=>{
    console.log("user left,", payload);

    this.participants = [ //reset array
      {userId : '', X : 0, Y : 540},
      {userId : '', X : 960, Y : 540},
      {userId : '', X : 0, Y : 0},
      {userId : '', X : 960, Y : 0} ];

    let participantList: any = this.client.getAllUser();

    console.log("user removed", participantList);

    participantList.forEach((participant: any, i: number) => {
        this.participants[i].userId = participant.userId;
    });
    console.log(this.participants);

    await this.renderParticipantsVideo();
  };

Now we have a fully functional video conferencing app powered by the Zoom Video SDK!