Build in-app calling: Use a voice & video calling API & highly abstracted SDKs to build a real-time communication web app (Next.js)

Subha Chanda
  • Tutorial Type: Basics
  • Reading Time: 20 min
  • Building Time: 45 min

Building calls with a voice & video calling API

In a post-pandemic world, app-based communication has seen accelerated adoption for use cases from marketing and operations to telemedicine and fintech. Voice and video calls, in particular, are more important than ever for both business and personal use.

The WebRTC protocol underlies the implementation of these calls. What is WebRTC? Web Real-Time Communication, or WebRTC, is an open-source communications protocol that allows real-time video, voice, or data communication between browsers, applications, or devices. If you’re building a web app that needs peer-to-peer communication, it’s important to familiarize yourself with how WebRTC works and why WebRTC is important.

It’s a good idea to use an abstraction layer on top of the WebRTC protocol to build and deploy your real-time applications. Let's look at how (and why) to use Sendbird’s video calling API and highly abstracted SDKs to handle the operational complexity of real-time communication requirements for in-app calls and live streaming so you can focus on building your core application.

In this article, you’ll learn how to build a voice and video call application in React using the Sendbird JavaScript SDK.

But first, let’s familiarize ourselves with the WebRTC protocol.

How does the WebRTC protocol work?

The WebRTC protocol is the foundation of Sendbird’s live streaming and Calls products. Let’s take a look under the hood of the WebRTC protocol so you understand what it can do and why Sendbird works well with it.

WebRTC provides protocols (like STUN, ICE, and SDP) and APIs that let you request real-time information from browsers or devices. This enables users to share voice, video, files, chat, and screens in real time. According to the Mozilla Developer Network documentation, “The set of standards that comprise WebRTC makes it possible to share data and perform teleconferencing peer-to-peer, without requiring that the user install plugins or any other third-party software.”

Because WebRTC connections are peer-to-peer, they in theory operate independently of servers, thereby reducing latency and bandwidth consumption for a faster user experience. A signaling server is needed to create the peer-to-peer connection, allowing peers to swap metadata like codec info and network data, but that doesn’t directly affect the voice or video call. If you’d like to understand the basics of peer-to-peer communication, here’s a great introduction to peer-to-peer calls.

WebRTC flow with signaling server, STUN server, and TURN server. Adapted from medialooks

After successfully implementing a signaling server, you can create the peer connection with the RTCPeerConnection API. When creating the connection, you can pass an ICE configuration object containing the STUN or TURN server details as a parameter. Both peers must be configured to listen for ICE candidates produced by their RTCPeerConnection; the signaling server transmits these candidates to the other peer.

Next, the peers need to create and exchange offers. After creating an offer via the createOffer API, the first peer sets it as its local description using the setLocalDescription API and sends the offer to the other peer. The other peer sets the received offer as its remote description with the setRemoteDescription API, then generates an answer using the createAnswer API.

In this exchange, the peer who creates the offer is referred to as the caller, while the peer who receives the offer is the callee. The answer generated is then returned to the caller, and a connection is established.
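To make the flow concrete, here is a rough caller-side sketch of that handshake in browser JavaScript. This is not Sendbird code, and the sendToSignalingServer and onAnswerReceived helpers are hypothetical placeholders for whatever signaling transport you use:

// A rough caller-side sketch of the WebRTC handshake described above.
// sendToSignalingServer() and onAnswerReceived() are hypothetical helpers
// that stand in for your own signaling channel.
async function startCallerConnection() {
  const peerConnection = new RTCPeerConnection({
    iceServers: [{ urls: 'stun:stun.l.google.com:19302' }], // STUN/TURN details
  });

  // Forward ICE candidates to the other peer through the signaling server.
  peerConnection.onicecandidate = (event) => {
    if (event.candidate) {
      sendToSignalingServer({ type: 'candidate', candidate: event.candidate });
    }
  };

  // Create the offer, set it as the local description, and send it to the callee.
  const offer = await peerConnection.createOffer();
  await peerConnection.setLocalDescription(offer);
  sendToSignalingServer({ type: 'offer', sdp: offer });

  // When the callee's answer arrives, set it as the remote description.
  onAnswerReceived(async (answer) => {
    await peerConnection.setRemoteDescription(answer);
  });
}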

Now, both peers can add their media streams to RTCPeerConnection. You can display remote media streams using various methods. Here’s where Sendbird comes in.

Getting started with Sendbird: Using voice & video calling APIs and abstracted SDKs built on WebRTC

Sendbird provides SDKs for iOS, Android, React Native, and JavaScript to integrate voice and video call features into your application. With the Sendbird SDKs, you can quickly add a real-time call feature without having to set up your own signaling server or listen for ICE candidates.

The Sendbird Calls SDKs give you an abstraction layer above the native WebRTC APIs to simplify working with them. Don’t worry about manually creating any handshake methods for communicating with your peers—Sendbird handles it under the hood.

Prerequisites for building a web application: Groundwork to build with a voice and video calling API

To build a Next.js and Tailwind CSS web application for voice and video calling with Sendbird’s Calls SDK, it’s important that you have the following:

  • A working knowledge of JavaScript and React

  • Basic familiarity with Next.js and Tailwind CSS

  • Node.js and npm installed on your machine

Foundations: Building in-app calls with Sendbird’s voice and video calling API & abstracted SDKs

Before you build the necessary components and tie them together, you’ll need to obtain the necessary keys from Sendbird and create a Next.js project.

Obtaining keys from Sendbird

First, create a free Sendbird account. Fill out the registration page, and you’ll be directed to your dashboard.

Sendbird dashboard

To obtain the necessary keys, you’ll have to create a new Sendbird application in the dashboard.

Click Select Application from your dashboard’s sidebar, then click Create Application. Fill out the pop-up modal with an application name and your preferred region. Select either All Products if you plan to use more than just the Calls feature in Sendbird, or simply Calls. Click Confirm.

'Create application' dialog box

You should now be able to see the necessary details required for connecting Sendbird to your application, including the application ID. Copy this ID and save it somewhere—you’ll need it later.

Create a Next.js project and install necessary packages

To integrate the call functionality, you need the Sendbird Calls npm package. It provides methods and event handlers like connectWebSocket, dial, and onEstablished to connect, dial, and manage call sessions.

Remember that Sendbird only works on browsers that support the WebRTC APIs. Browsers like IE or Opera Mini don’t support WebRTC.

WebRTC web browsers chart

Now, let’s create a Next.js project.

Open a folder from your code editor where you want to create the project and run the following command in the terminal:
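One common way to scaffold the project, assuming npm is installed (check the Next.js docs for the current command):

npx create-next-app@latest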

Name your application and press Enter. A new Next.js project is scaffolded for you.

Now, cd into the project directory and install the Tailwind packages.
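The exact Tailwind setup depends on your version. For Tailwind CSS v3, it typically looks like the following, after which you configure the content paths in tailwind.config.js and add the @tailwind directives to your global stylesheet as described in the Tailwind docs:

npm install -D tailwindcss postcss autoprefixer
npx tailwindcss init -p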

Once the Tailwind setup is complete, run the following command to install the sendbird-calls package:
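Assuming npm as your package manager:

npm install sendbird-calls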

Create a new file called .env.local inside the root directory of your application and define a new variable called NEXT_PUBLIC_APP_ID in it. The value of the variable is the application ID you copied from the Sendbird dashboard earlier.

The content of the .env.local file should look like this:
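With YOUR_SENDBIRD_APPLICATION_ID standing in for the ID you copied from the dashboard:

NEXT_PUBLIC_APP_ID=YOUR_SENDBIRD_APPLICATION_ID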

Implementing in-app voice and video calls with a video calling API

With the package installed and the environment variable defined, let’s discuss the application flow before we move on.

A user will authenticate in the application by providing a user ID, which needs to be unique. An authenticated user (in this case, the caller) can input the user ID of another user (the callee) and choose to voice or video call them. Since we’re building for the web with a JavaScript SDK, the callee will receive an incoming call event. An event is an occurrence, recognized by software, that can be triggered by an action performed by a person or by a condition that becomes true. The callee receives a notification of the incoming call and can choose to accept or reject it. If they accept, a connection is established, and both users can see and hear each other.

Let’s build a few components first. This tutorial uses Tailwind CSS to design the UI, but you’re not bound to use a specific UI library. Sendbird lets you use whatever you’re comfortable with.

Login component

Create a new folder called components in your project’s root directory and create a file called Login.js inside it. Paste the following code to the file:
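Here is a minimal sketch of what Login.js can look like; the exact markup and Tailwind classes may differ from the original:

// components/Login.js -- a minimal sketch; markup and styling may vary.
import { useState } from 'react';

const Login = ({ setUserId }) => {
  const [userIdInput, setUserIdInput] = useState('');

  // onSubmit handler: passes the entered user ID up via the setUserId prop.
  const connect = (e) => {
    e.preventDefault();
    setUserId(userIdInput);
  };

  return (
    <form onSubmit={connect} className="flex flex-col items-center gap-4 p-8">
      <input
        type="text"
        placeholder="Enter your user ID"
        value={userIdInput}
        onChange={(e) => setUserIdInput(e.target.value)}
        className="rounded border px-3 py-2"
      />
      <button type="submit" className="rounded bg-purple-600 px-4 py-2 text-white">
        Connect
      </button>
    </form>
  );
};

export default Login;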

The component takes a prop called setUserId. The form’s onSubmit handler, connect, reads the value of the user ID input field and passes it to the setUserId state setter.

Dialer component

Let’s create a dialer for calling other users. Create a new file called Dialer.js inside the components directory. The content for this file should look like this:
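A minimal sketch of Dialer.js, under the same caveat that markup and styling may differ from the original:

// components/Dialer.js -- a minimal sketch; markup and styling may vary.
import { useState } from 'react';

const Dialer = ({ makeCall }) => {
  const [userId, setUserId] = useState('');
  const [userIdError, setUserIdError] = useState('');

  // isVideoCall is true for the video call button, false for the voice call button.
  const initiateCall = (isVideoCall) => {
    if (!userId) {
      setUserIdError('Please enter a user ID to call.');
      return;
    }
    setUserIdError('');
    makeCall({ userId, isVideoCall });
  };

  return (
    <div className="flex flex-col items-center gap-4 p-8">
      <input
        type="text"
        placeholder="User ID to call"
        value={userId}
        onChange={(e) => setUserId(e.target.value)}
        className="rounded border px-3 py-2"
      />
      {userIdError && <p className="text-red-500">{userIdError}</p>}
      <div className="flex gap-4">
        <button onClick={() => initiateCall(false)} className="rounded bg-purple-600 px-4 py-2 text-white">
          Voice call
        </button>
        <button onClick={() => initiateCall(true)} className="rounded bg-purple-600 px-4 py-2 text-white">
          Video call
        </button>
      </div>
    </div>
  );
};

export default Dialer;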

This dialer is a form similar to the login form. It has an input field for the callee’s user ID, and below that are two buttons, one for initiating a voice call and the other for a video call.

This form takes a prop called makeCall, used for calling another user. There are also two React states: one for displaying any error with the entered user ID, and another (userId, with its setter setUserId) for storing the user ID that is passed to the makeCall function.

The initiateCall function takes a single parameter indicating whether the call is a video call. If the user ID input field is empty, it sets an error via the setUserIdError method and displays it in the frontend. Otherwise, the makeCall function takes the userId and a Boolean isVideoCall value and initiates the call. The dialer is ready!

With the Login and Dialer components ready, let’s implement the business logic using Sendbird.

Call component and business logic

The complete application logic will be inside a Call component, so create a new component called Call.js inside your components folder.

You need to import the Sendbird Calls SDK and the other components you built earlier in this article. Start the Call.js component by pasting the following:
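The imports can look like this, assuming the file layout described above:

// components/Call.js -- imports for the Sendbird Calls SDK and the components built earlier.
import { useState, useEffect } from 'react';
import SendBirdCall from 'sendbird-calls';

import Login from './Login';
import Dialer from './Dialer';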

This component requires a few states. Create a React functional component called Call and paste the following states into it:
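The states can be declared as follows; the name videoEnabled for the video on/off state is an assumption, and each state is described in the list below:

// Application state for the Call component.
const [userId, setUserId] = useState(null);
const [call, setCall] = useState(null);
const [authenticated, setAuthenticated] = useState(false);
const [ringing, setRinging] = useState(false);
const [connected, setConnected] = useState(false);
const [connecting, setConnecting] = useState(false);
const [loading, setLoading] = useState(false);
const [muted, setMuted] = useState(false);
const [videoEnabled, setVideoEnabled] = useState(true); // toggled via call.startVideo()/call.stopVideo()
const [defaultCallParams, setDefaultCallParams] = useState({});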

Let’s briefly discuss how each state is used:

  • userId and setUserId: This state holds the user ID of the logged-in user, and setUserId is the setter method for updating it.

  • call and setCall: This state holds information relevant to the call, like details for the callee and caller, application context, and more. This state holds necessary details only when a call is made using the Sendbird SDK’s dial method.

  • authenticated and setAuthenticated: This state checks if the user is authenticated for using the application. Necessary details, like accepting or rejecting a call or changing audio settings, will only be visible to an authenticated user.

  • ringing and setRinging: The ringing state checks if the user is receiving any calls. The buttons to accept or reject the call are shown to the user only when the ringing state is true.

  • connected and setConnected: This state is set to true when a call is connected between the peers. This state shows or hides the buttons for ending the call or changing the voice and video settings along with the authenticated state.

  • connecting and setConnecting: Along with the authenticated state, this state also shows the accept or reject buttons to the user when true.

  • loading and setLoading: This state checks if the application is in a loading state.

  • muted and setMuted: This state checks if the user is muted.

  • startVideo() and stopVideo(): These toggle whether the user's video is on or off.

  • defaultCallParams and setDefaultCallParams: This state sets the default call parameters for calling a user.

Now, let’s create a function to authenticate the user. Define a function called authenticate and paste the following code:
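A sketch of the authenticate function, wrapping the SDK's callback-based authenticate method in a promise:

// Wraps SendBirdCall.authenticate in a promise so it can be awaited.
const authenticate = () => {
  return new Promise((resolve, reject) => {
    SendBirdCall.authenticate({ userId }, (result, error) => {
      if (error) {
        reject(error);
      } else {
        resolve(result);
      }
    });
  });
};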

This function uses the authenticate method from the Sendbird SDK; it passes the user ID and a callback function. If there is an error, the promise is rejected; otherwise, it is resolved. Later in this tutorial, you’ll see this function used to authenticate the user after the Sendbird Calls SDK has been initialized.

Now, create a new function called addEventHandler for handling Sendbird listener events. Paste the following code below the authenticate function:
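A sketch of addEventHandler; the handler ID string is arbitrary, and setDefaultCallHandlers is the helper defined a little later in this tutorial:

// Registers a device-specific listener; only onRinging is handled here for simplicity.
const addEventHandler = () => {
  SendBirdCall.addListener('CALL_LISTENER_ID', {
    onRinging: (call) => {
      // Apply the default call handlers to the incoming call, then flip the
      // ringing state so the incoming-call notification is shown.
      setDefaultCallHandlers(call);
      setRinging(true);
      setCall(call);
    },
  });
};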

The addListener method registers device-specific event handlers. This listener listens for events such as onRinging, onAudioInputDeviceChanged, and so on.

For the sake of simplicity, this tutorial only defines the onRinging listener. This handler receives the incoming call as an argument, applies the default call handler parameters to it, and uses the setter methods to update the ringing and call states, which display the incoming-call notification to the user.

Your next step is to create a function for initiating Sendbird Calls using the init method. Paste the following code below your addEventHandler function:
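A sketch of this function, here called initiateSendbirdCalls to match the useEffect hook below; registering the event handler right after authentication is an assumption of this sketch:

// Initializes the SDK, authenticates the user, and opens the WebSocket connection.
const initiateSendbirdCalls = async () => {
  const APP_ID = process.env.NEXT_PUBLIC_APP_ID;
  SendBirdCall.init(APP_ID);
  try {
    setLoading(true);
    await authenticate();
    await SendBirdCall.connectWebSocket();
    addEventHandler();
    setAuthenticated(true);
  } catch (error) {
    setAuthenticated(false);
  } finally {
    setLoading(false);
  }
};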

The SendBirdCall.init method takes the application ID as a parameter, and the authenticate function is called inside a try-catch block. After authentication, the Sendbird SDK’s connectWebSocket method establishes a connection to the WebSocket server.

For the default call handler options, set the values using the setDefaultCallHandlers setter:
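A sketch of what this can look like, assuming setDefaultCallHandlers is a helper that attaches handlers to a given call object. Only onEnded is described in this tutorial; onConnected is added here as a reasonable default so the UI can switch to the connected view once the call goes through:

// Assigns default handlers to a call object.
const setDefaultCallHandlers = (call) => {
  call.onConnected = () => {
    setRinging(false);
    setConnecting(false);
    setConnected(true);
  };
  call.onEnded = () => {
    setRinging(false);
    setConnected(false);
    setConnecting(false);
    setCall(null);
  };
};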

This code sets different default behavior for different events. For the call.onEnded event, the ringing state is set to false so that the notification no longer shows, the connected state is set to false so that the video elements are hidden, the connecting state is set to false, and the call state is set to null.

You’ll also need to define a useEffect hook. Paste the following code below your existing code:
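A sketch of the hook, assuming the video element IDs match those used in the JSX later in this tutorial:

// Runs whenever userId changes: initializes Sendbird Calls and prepares default
// call parameters. The element IDs match the <video> elements rendered below.
useEffect(() => {
  if (userId) {
    initiateSendbirdCalls();
    setDefaultCallParams({
      callOption: {
        localMediaView: document.getElementById('local_video_element_id'),
        remoteMediaView: document.getElementById('remote_video_element_id'),
        audioEnabled: true,
        videoEnabled: true,
      },
    });
  }
  // eslint-disable-next-line react-hooks/exhaustive-deps
}, [userId]);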

The useEffect hook checks if the userId exists, calls the initiateSendbirdCalls function, and then sets the default call parameters. Inside the default call parameters object, define another object called callOption. The localMediaView key sets the element that displays the caller’s video, and remoteMediaView sets the element that displays the callee’s video. The audio and video inputs are both enabled by setting them to true.

Finally, let’s create the makeCall function for calling another user:
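Here is a sketch of makeCall; attaching the default call handlers to the outgoing call is an addition in this sketch so the caller's UI also reacts to onConnected and onEnded:

// Dials another user. userId here is the callee's ID passed up from the Dialer.
const makeCall = ({ userId, isVideoCall }) => {
  const dialParams = { ...defaultCallParams, userId, isVideoCall };

  SendBirdCall.dial(dialParams, (call, error) => {
    if (error) {
      setCall(null);
      return;
    }
    setDefaultCallHandlers(call);
    setConnecting(true);
    setCall(call);
  });
};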

The makeCall function takes an object containing the userId and isVideoCall values. The dialParams object combines the defaultCallParams with the userId and isVideoCall values.

The call is then initiated using the dial method available from the Sendbird Calls SDK. This method takes the dialParams object as an input and a callback function that receives the call and error as arguments. If there’s an error, the call state is set to null. Otherwise, the connecting state is set to true, and the call state is set to the call value.

Paste the following code below your makeCall function:
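Below is a condensed sketch of that JSX. The exact markup, styling, and copy in the original may differ, and the two video elements are rendered unconditionally here so their IDs exist when the default call parameters are set:

// Render sketch for the Call component.
return (
  <div className="flex min-h-screen flex-col items-center justify-center gap-4">
    {/* Login when no user ID is available */}
    {!userId && <Login userId={userId} setUserId={setUserId} />}

    {/* Simple loading indicator while authenticating */}
    {userId && !authenticated && loading && <p>Authenticating...</p>}

    {/* Incoming call notification with accept/reject buttons */}
    {authenticated && ringing && call && (
      <div className="flex flex-col items-center gap-4">
        <p>{call.isVideoCall ? 'Video Call is Ringing' : 'Voice call is Ringing'}</p>
        <div className="flex gap-4">
          <button onClick={() => call.accept({ callOption: defaultCallParams.callOption })}>
            Accept
          </button>
          <button onClick={() => call.end()}>Reject</button>
        </div>
      </div>
    )}

    {/* Connecting message while the call is being set up */}
    {authenticated && connecting && !connected && <p>Connecting...</p>}

    {/* In-call controls: mute, video toggle (video calls only), end call */}
    {authenticated && connected && call && (
      <div className="flex gap-4">
        <button
          onClick={() => {
            if (muted) {
              call.unmuteMicrophone();
            } else {
              call.muteMicrophone();
            }
            setMuted(!muted);
          }}
        >
          {muted ? 'Unmute' : 'Mute'}
        </button>
        {call.isVideoCall && (
          <button
            onClick={() => {
              if (videoEnabled) {
                call.stopVideo();
              } else {
                call.startVideo();
              }
              setVideoEnabled(!videoEnabled);
            }}
          >
            {videoEnabled ? 'Stop video' : 'Start video'}
          </button>
        )}
        <button onClick={() => call.end()}>End call</button>
      </div>
    )}

    {/* Dialer when the user is authenticated and idle */}
    {authenticated && !ringing && !connecting && !connected && <Dialer makeCall={makeCall} />}

    {/* Video elements referenced by the callOption set in the useEffect hook */}
    <video id="local_video_element_id" autoPlay muted playsInline className="w-64" />
    <video id="remote_video_element_id" autoPlay playsInline className="w-96" />
  </div>
);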

This part of the code is straightforward, but let’s quickly take a look at it.

The JSX shows the different components depending on the state of the application. If the user receives a call, they get a Video Call is Ringing or Voice call is Ringing message depending on the type of the call. Two buttons are displayed for accepting or rejecting the call. The method call.end() rejects a call and the method call.accept() accepts a call.

If you wanted to implement the same functionality using WebRTC, it would have taken a lot more code than simply checking the call object.

When users are connected, they can see three buttons: one for toggling the audio, one for toggling the video, and one for terminating the call. Methods like muteMicrophone, unmuteMicrophone, stopVideo, and startVideo are available on the call object to achieve these features. The methods are bound to their corresponding buttons.

The user sees a connecting message when the call is in a connecting state. If the application is not in a connecting, connected, or ringing state, the Dialer component is shown to the user, and the makeCall function is passed to it for dialing. If the user is not authenticated, the Login component is displayed; it takes the userId and setUserId props.

The video toggle button is only shown if the call is a video call. In that case, two video components are displayed. Take a look at the ids of the components—they’re referenced in the setDefaultCallParams setter in the useEffect function above. The local_video_element_id references the local video of the caller, and the remote_video_element_id references the callee’s video.

When no userId is available, the Login component is displayed.

The application is now ready!

As your final step, you need to import the Call function inside the index.js file. Replace the content of the index.js file with the following code:
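A minimal version can look like this:

// pages/index.js -- renders the Call component on the home page.
import Call from '../components/Call';

export default function Home() {
  return <Call />;
}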

You can start your application by running npm run dev; it should start a server locally on your machine. The application should work as shown in the following GIF:

GIF of application

You can find the complete source code for this application on GitHub.

Simplifying communication with Sendbird’s voice and video calling APIs and abstracted SDKs

We talked earlier about the basic workflow of building a communication app using WebRTC—it requires a signaling server to create a connection, which can be a complicated task. If you’re just starting out with WebRTC, it could take a long time to implement perfectly. And if you want to extend your app’s functionality with both voice and video calls, WebRTC takes even more work.

With reliable voice and video calling APIs, as well as highly abstracted voice and video SDKs, it becomes much easier to use the underlying technology and communication protocols like WebRTC. As you’ve seen in this tutorial, methods like connectWebSocket and authenticate, along with listener events, simplify operational tasks. Depending on whether your app enables video, you can easily create a voice call or video call app without having to understand the nitty-gritty of WebRTC. Keep your attention on your core application!

Extend your app’s functionality with a live streaming SDK

If you want to livestream your video event, the Sendbird Live SDKs can help you extend your app’s communication functionality to give your users a customized experience without relying on a third-party streaming service. Sendbird Live can handle up to 100,000 participants per event, with sub-second WebRTC latency of around 100 ms. You can also add useful (and fun!) features such as screen sharing, camera filters, cloud video recording, and more. Sendbird Live also offers RTMP ingest plus RTMP to WebRTC transcoding for global content delivery to web and mobile devices. This enables greater device compatibility, higher video quality, and a smoother viewing experience for live online events. Check out the docs to see how to start your live stream.

A top-quality app built with Sendbird’s voice and video calling API and JavaScript SDK

This tutorial used Next.js, Tailwind CSS, and the Sendbird JavaScript SDK to implement in-app voice and video calls. Using the Sendbird JavaScript SDK, you now know how to build a voice and video call app in React without building or managing a WebSocket server or media server. You also saw how to customize the user experience with Sendbird’s flexible voice and video API and SDKs.

Make your first call today and get in-depth guidance in the docs. If you need help, please start a conversation in the Sendbird Community or contact us to get personalized help. Our experts are always there for you and happy to answer any questions.

Happy call building! 📱