
Build a WebAR Live Video Streaming Web App

Hey everyone, today I want to walk through how to build a live streaming web app that leverages WebXR to display a live video stream as part of a 3D model that exists in both WebAR and WebVR environments.

In this project we’ll build a simple web app that supports two user roles. The WebVR scene allows users to broadcast themselves into a 3D model that they can rotate and move around a virtual scene. The WebAR scene allows users to view the virtual scene in augmented reality where the 3D model and video stream are controlled by the broadcast user.

Pre-Requisites

  • An Agora developer account (you’ll need an AppID from the Projects section of the Agora Console)
  • A basic understanding of HTML, CSS, and JS
  • A simple local web server, plus a way to serve the pages over https for mobile testing (a tunneling service like ngrok works)

Architecture and Structure Design

In “How To: Build a Live Broadcasting Web App,” the build conformed to the broadcasting scenario. This build will follow a similar structure with two user roles: broadcaster and audience. Just as in the traditional broadcasting scenario, broadcasters will share their camera stream into the channel, while the audience will be able to watch the broadcaster(s), but with a WebXR twist.

In this project, the broadcaster will be represented by a 3D model, and we will play the camera stream as a texture on the face of the model.

(3D model: “Broadcaster” by digitallysavvy on Sketchfab)

For this build I’m planning to use three frameworks: AFrame, because its virtual DOM makes the structure of the scene clear and easy to understand; AR.js, for its cross-platform WebAR support; and Agora for the live streaming, because it makes integrating real-time video simple.

Given the fragmentation that is pervasive throughout the JS landscape, I wanted to write this tutorial using the most basic versions of HTML, CSS, and JS. As we go along, you’ll notice that I’ve mixed Vanilla JS and jQuery. That was intentional: Vanilla JS can be a bit verbose for certain DOM tasks, so I used jQuery to simplify a few things. Within the WebXR-specific code, I stuck to Vanilla JS to keep it clear how everything is connected. We’ll cut one more corner and use Bootstrap, so we don’t have to worry about writing too much custom CSS.

Core Structure (HTML)

In our live broadcast web app, we will have two clients (broadcaster/audience), each with its own UI. The broadcaster UI will use AFrame to create the WebVR portion of the experience, while the audience UI will use AR.js with AFrame to create the WebAR portion.

Broadcaster UI

Let’s start with the basic HTML structure of the Broadcaster UI. There are a few UI elements we must have: a toolbar containing buttons for toggling the audio/video streams and a way to leave the chat, plus an area to act as the 3D canvas for our AFrame scene.

Before we start, it’s worth noting that AFrame implements an entity-component system that virtualizes the DOM. We’ll use AFrame’s custom DOM elements to build our virtual scene, starting with the <a-scene> component that defines our scene, and then its components:

  • <a-assets> to properly load our broadcaster model as part of the scene.
  • <a-sky> to set the scene’s background color.
  • Two <a-entity>s: one for the camera and the other as the camera’s target, which the aframe-orbit-controls-component needs.

We’ll also include Agora’s Video and RTM Web SDKs. We’ll implement them within the Broadcaster Client to display our video stream and allow broadcasters to move the 3D model within the virtual scene.

<html lang="en">
<head>
<title>Agora.io AFrame [HOST] - Live Stream WebVR</title>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<!-- FontAwesome and Bootstrap CSS -->
<link rel="stylesheet" href="https://use.fontawesome.com/releases/v5.7.0/css/all.css" integrity="sha384-lZN37f5QGtY3VHgisS14W3ExzMWZxybE1SJSEsQp9S+oqd12jhcu+A56Ebc1zFSJ" crossorigin="anonymous">
<link href="https://stackpath.bootstrapcdn.com/bootstrap/4.2.1/css/bootstrap.min.css" rel="stylesheet">
<link href="https://cdnjs.cloudflare.com/ajax/libs/animate.css/3.7.0/animate.min.css" rel="stylesheet">
<!-- jQuery and Bootstrap JS -->
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/popper.js/1.14.7/umd/popper.min.js"></script>
<script src="https://stackpath.bootstrapcdn.com/bootstrap/4.2.1/js/bootstrap.min.js"></script>
<!-- AFrame -->
<script src="https://aframe.io/releases/1.0.4/aframe.min.js"></script>
<script src="https://rawgit.com/donmccurdy/aframe-extras/master/dist/aframe-extras.loaders.min.js"></script>
<script src="https://cdn.rawgit.com/tizzle/aframe-orbit-controls-component/master/dist/aframe-orbit-controls-component.min.js"></script>
<!-- Agora.io -->
<script src="js/AgoraRTCSDK-3.0.2.js" type="text/javascript"></script>
<script src="js/agora-rtm-sdk-1.2.2.js" type="text/javascript"></script>
<link rel="stylesheet" type="text/css" href="style.css"/>
</head>
<body>
<div class="container-fluid p-0">
<div id="main-container">
<div id="buttons-container" class="row justify-content-center mt-3">
<div id="audio-controls" class="col-md-2 text-center btn-group">
<button id="mic-btn" type="button" class="btn btn-block btn-dark btn-lg">
<i id="mic-icon" class="fas fa-microphone"></i>
</button>
<button id="mic-dropdown" type="button" class="btn btn-lg btn-dark dropdown-toggle dropdown-toggle-split" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false">
<span class="sr-only">Toggle Dropdown</span>
</button>
<div id="mic-list" class="dropdown-menu dropdown-menu-right">
</div>
</div>
<div id="video-controls" class="col-md-2 text-center btn-group">
<button id="video-btn" type="button" class="btn btn-block btn-dark btn-lg">
<i id="video-icon" class="fas fa-video"></i>
</button>
<button id="cam-dropdown" type="button" class="btn btn-lg btn-dark dropdown-toggle dropdown-toggle-split" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false">
<span class="sr-only">Toggle Dropdown</span>
</button>
<div id="camera-list" class="dropdown-menu dropdown-menu-right">
</div>
</div>
<div class="col-md-2 text-center">
<button id="exit-btn" type="button" class="btn btn-block btn-danger btn-lg">
<i id="exit-icon" class="fas fa-phone-slash"></i>
</button>
</div>
</div>
<div id="full-screen-video">
<a-scene>
<a-assets>
<a-asset-item id="broadcaster" src="/assets/broadcaster.glb"></a-asset-item>
</a-assets>
<a-sky color="#ECECEC"></a-sky>
<a-entity
id="camera"
camera
position="0 1 2.5"
orbit-controls="
autoRotate: false;
target: #target;
enableDamping: true;
dampingFactor: 0.25;
rotateSpeed:0.14;
minDistance:2;
maxDistance:15;"
mouse-cursor="">
</a-entity>
<a-entity id="target"></a-entity>
</a-scene>
</div>
</div>
</div>
</div>
</body>
<script>
$("#mic-btn").prop("disabled", true);
$("#video-btn").prop("disabled", true);
$("#exit-btn").prop("disabled", true);
</script>
<script src="js/agora-broadcaster.js" type="text/javascript"></script>
</html>

At the bottom of the page, we include a short script block to disable the UI controls, along with a script tag that loads our Broadcaster Client. When we build the Broadcaster Client, we’ll enable the UI controls once the user joins the channel.

Audience UI

Now that we have the Broadcaster UI ready, we can build the Audience UI. It will be much simpler than the Broadcaster UI, as we only need to support displaying 3D models with AR.js/AFrame and Agora’s Video and RTM Web SDKs.

Within the scene, we’ll use the <a-assets> element to preload the broadcaster 3D model, but this time the <a-entity> camera is empty, because AR.js will control the camera within the scene (just like all other AR frameworks).

You’ll notice that we have a new element, <a-marker>. This represents the marker image that AR.js tracks for the AR experience. The type='barcode' and value='6' attributes dictate the marker type and the “fingerprint” that AR.js will “track” (more about AR.js Marker Tracking). I chose the barcode type for simplicity.
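
If you’d rather test with one of AR.js’s built-in pattern markers instead of a barcode, only the marker element needs to change; the scene’s detectionMode of mono_and_matrix already detects both kinds. A quick sketch using the stock 'hiro' preset:

<a-marker preset='hiro'>
<!-- broadcaster models get appended here at runtime -->
</a-marker>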

<!doctype HTML>
<html>
<head>
<title>Agora AR.js - Live Streamed WebAR</title>
<meta name="viewport" content="width=device-width, user-scalable=no, minimum-scale=1.0, maximum-scale=1.0">
</head>
<script src="https://aframe.io/releases/1.0.4/aframe.min.js"></script>
<script src="https://rawgit.com/jeromeetienne/AR.js/master/aframe/build/aframe-ar.min.js"></script>
<script src="js/AgoraRTCSDK-3.0.2.js" type="text/javascript"></script>
<script src="js/agora-rtm-sdk-1.2.2.js" type="text/javascript"></script>
<script src="js/agora-audience.js" type="text/javascript"></script>
<script src="https://rawgit.com/donmccurdy/aframe-extras/master/dist/aframe-extras.loaders.min.js"></script>
<body style='margin : 0px; overflow: hidden;'>
<a-scene embedded arjs='sourceType: webcam; debugUIEnabled: false; detectionMode: mono_and_matrix; matrixCodeType: 3x3;'>
<a-assets>
<a-asset-item id="broadcaster" src="/assets/broadcaster.glb"></a-asset-item>
</a-assets>
<a-marker type='barcode' value='6'>
</a-marker>
<a-entity camera></a-entity>
</a-scene>
</body>
</html>

Note: We are using the legacy marker tracking and not the new AR.js NFT Image Tracking because there is an issue with dynamically loaded models and NFT markers. Once this issue is resolved, I plan to update the project to use NFT Image Tracking.

CSS & Styling

Within each of the UIs above I’ve included some Bootstrap classes. Bootstrap is really nice, but we also need a few custom CSS blocks to adjust the elements that we won’t get perfect out of the box. We’ll use the Font Awesome CSS framework because we need icons for the various buttons, and FA makes that really simple.

As I mentioned, Bootstrap is great, but sometimes you still need a little bit of custom CSS. Here are the styling blocks for the style.css referenced above. The CSS is pretty straightforward, without anything really worth noting.

body {
margin: 0;
padding: 0;
}
body .btn:focus{
outline: none !important;
box-shadow:none !important;
}
#buttons-container {
position: absolute;
z-index: 2;
width: 100vw;
}
#buttons-container div {
max-width: 250px;
min-width: 150px;
margin-bottom: 10px;
}
.btn-group button i {
padding-left: 25px;
}
#full-screen-video {
position: absolute;
width: 100vw;
height: 100vh;
}
#mute-overlay {
position: absolute;
z-index: 2;
bottom: 0;
left: 0;
color: #d9d9d9;
font-size: 2em;
padding: 0 0 3px 3px;
display: none;
}
.mute-overlay {
position: absolute;
z-index: 2;
top: 2px;
color: #d9d9d9;
font-size: 1.5em;
padding: 2px 0 0 2px;
display: none;
}
.btn-xlg {
padding: 20px 35px;
font-size: 30px;
line-height: normal;
-webkit-border-radius: 8px;
-moz-border-radius: 8px;
border-radius: 8px;
}
.drop-mini {
width: inherit;
display: inline-block;
}
.close .fa-xs {
font-size: .65em;
}
/* Responsive design */
@media only screen and (max-height: 350px) {
.btn-xlg {
font-size: 1rem;
}
}
@media only screen and (max-height: 400px){
.btn-xlg {
font-size: 1.25rem;
}
}
@media only screen and (max-width: 400px) {
.btn-xlg {
padding: 10px 17px;
}
}

Building the Broadcaster Client

Now we are ready to build our WebVR Broadcaster Client. To get us started, below is a barebones implementation of the Agora RTC and RTM Web SDKs that will serve as our Broadcaster Client.

I’ve included the required callbacks and implemented the key functions, but left a few function bodies empty for us to fill in as we walk through the build. The two I want to highlight for this section are:

  • The createBroadcaster function, which will create the <a-entity> for the 3D model with a video texture.
  • The connectStreamToVideo function, which will connect the Agora video stream as the source for the video texture.

Note: You will need to add your Agora AppID at the top of the script. You can find this in the Projects section of your Agora Console.

// Agora settings
const agoraAppId = ''; // insert Agora AppID here
const channelName = 'WebAR';
var streamCount = 0;
// video profile settings
var cameraVideoProfile = '720p_6'; // 960 × 720 @ 30fps & 750 Kbps
// set log level:
// -- .DEBUG for dev
// -- .NONE for prod
AgoraRTC.Logger.setLogLevel(AgoraRTC.Logger.DEBUG);
// keep track of streams
var localStreams = {
uid: '',
camera: {
camId: '',
micId: '',
stream: {}
},
screen: {
id: '',
stream: {}
},
rtmActive: false
};
// keep track of devices
var devices = {
cameras: [],
mics: []
}
// setup the RTM client and channel
const rtmClient = AgoraRTM.createInstance(agoraAppId);
const rtmChannel = rtmClient.createChannel(channelName);
rtmClient.on('ConnectionStateChange', (newState, reason) => {
console.log('on connection state changed to ' + newState + ' reason: ' + reason);
});
// event listener for receiving a channel message
rtmChannel.on('ChannelMessage', ({ text }, senderId) => {
// text: text of the received channel message; senderId: user ID of the sender.
console.log('AgoraRTM msg from user ' + senderId + ' received: \n' + text);
// [TODO]: Handle RTM msg
});
// create RTC client
var rtcClient = AgoraRTC.createClient({mode: 'live', codec: 'vp8'}); // vp8 to work across mobile devices
rtcClient.init(agoraAppId, () => {
console.log('AgoraRTC client initialized');
joinChannel(); // join channel upon successful init
}, function (err) {
console.log('[ERROR] : AgoraRTC client init failed', err);
});
rtcClient.on('stream-published', function (evt) {
console.log('Publish local stream successfully');
});
// connect remote streams
rtcClient.on('stream-added', (evt) => {
const stream = evt.stream;
const streamId = stream.getId();
console.log('New stream added: ' + streamId);
console.log('Subscribing to remote stream:' + streamId);
// Subscribe to the remote stream
rtcClient.subscribe(stream, (err) => {
console.log('[ERROR] : subscribe stream failed', err);
});
streamCount++; // Increase count of Active Stream Count
createBroadcaster(streamId); // Load 3D model with video texture
});
rtcClient.on('stream-removed', (evt) => {
const stream = evt.stream;
stream.stop(); // stop the stream
stream.close(); // clean up and close the camera stream
console.log('Remote stream is removed ' + stream.getId());
});
rtcClient.on('stream-subscribed', (evt) => {
const remoteStream = evt.stream;
const remoteId = remoteStream.getId();
console.log('Successfully subscribed to remote stream: ' + remoteStream.getId());
// get the designated video element and connect it to the remoteStream
var video = document.getElementById('faceVideo-' + remoteId);
connectStreamToVideo(remoteStream, video);
});
// remove the remote-container when a user leaves the channel
rtcClient.on('peer-leave', (evt) => {
console.log('Remote stream has left the channel: ' + evt.uid);
evt.stream.stop(); // stop the stream
const remoteId = evt.stream.getId();
// Remove the 3D and Video elements that were created
document.getElementById(remoteId).remove();
document.getElementById('faceVideo-' + remoteId).remove();
streamCount--; // Decrease count of Active Stream Count
});
// show mute icon whenever a remote has muted their mic
rtcClient.on('mute-audio', (evt) => {
console.log('mute-audio for: ' + evt.uid);
});
rtcClient.on('unmute-audio', (evt) => {
console.log('unmute-audio for: ' + evt.uid);
});
// show user icon whenever a remote has disabled their video
rtcClient.on('mute-video', (evt) => {
console.log('mute-video for: ' + evt.uid);
});
rtcClient.on('unmute-video', (evt) => {
console.log('unmute-video for: ' + evt.uid);
});
// join a channel
function joinChannel() {
const token = generateToken();
// set the role
rtcClient.setClientRole('host', () => {
console.log('Client role set to host');
}, (e) => {
console.log('setClientRole failed', e);
});
rtcClient.join(token, channelName, 0, (uid) => {
console.log('User ' + uid + ' join channel successfully');
localStreams.uid = uid
createBroadcaster(uid); // Load 3D model with video texture
createCameraStream(uid); // Create the camera stream
joinRTMChannel(uid); // join the RTM channel
}, (err) => {
console.log('[ERROR] : join channel failed', err);
});
}
function leaveChannel() {
rtcClient.leave(() => {
console.log('client leaves channel');
localStreams.camera.stream.stop() // stop the camera stream playback
localStreams.camera.stream.close(); // clean up and close the camera stream
rtcClient.unpublish(localStreams.camera.stream); // unpublish the camera stream
//disable the UI elements
$('#mic-btn').prop('disabled', true);
$('#video-btn').prop('disabled', true);
$('#exit-btn').prop('disabled', true);
}, (err) => {
console.log('client leave failed ', err); //error handling
});
}
// video streams for channel
function createCameraStream(uid) {
const localStream = AgoraRTC.createStream({
streamID: uid,
audio: true,
video: true,
screen: false
});
localStream.setVideoProfile(cameraVideoProfile);
// The user has granted access to the camera and mic.
localStream.on('accessAllowed', () => {
if(devices.cameras.length === 0 && devices.mics.length === 0) {
console.log('[DEBUG] : checking for cameras & mics');
getCameraDevices();
getMicDevices();
}
console.log('accessAllowed');
});
// The user has denied access to the camera and mic.
localStream.on('accessDenied', () => {
console.log('accessDenied');
});
localStream.init(() => {
console.log('getUserMedia successfully');
// Connect the local stream video to the video texture
var video = document.getElementById('faceVideo-' + uid);
connectStreamToVideo(localStream, video);
enableUiControls(localStream);
// publish local stream
rtcClient.publish(localStream, (err) => {
console.log('[ERROR] : publish local stream error: ' + err);
});
// keep track of the camera stream for later
localStreams.camera.stream = localStream;
}, (err) => {
console.log('[ERROR] : getUserMedia failed', err);
});
}
function createBroadcaster(streamId) {
// [TODO]: Load 3D model with video texture
// create video element
// add video element to the DOM
// configure the new broadcaster element
// create the broadcaster
// add broadcaster to the scene
// add event listener for model loaded:
// - search the mesh's children for the face-geo
// - create video texture from video element
// - set node's material map to video texture
}
function connectStreamToVideo(agoraStream, video) {
// [TODO]: Connect the local stream video to the video texture
}
function changeStreamSource (deviceIndex, deviceType) {
console.log('Switching stream sources for: ' + deviceType);
var deviceId;
if (deviceType === 'video') {
deviceId = devices.cameras[deviceIndex].deviceId
} else if(deviceType === 'audio') {
deviceId = devices.mics[deviceIndex].deviceId;
}
localStreams.camera.stream.switchDevice(deviceType, deviceId, () => {
console.log('successfully switched to new device with id: ' + JSON.stringify(deviceId));
// set the active device ids
if(deviceType === 'audio') {
localStreams.camera.micId = deviceId;
} else if (deviceType === 'video') {
localStreams.camera.camId = deviceId;
} else {
console.log('unable to determine deviceType: ' + deviceType);
}
}, () => {
console.log('failed to switch to new device with id: ' + JSON.stringify(deviceId));
});
}
function joinRTMChannel(uid){
console.log('uid:')
console.log(uid)
rtmClient.login({ token: null, uid: String(uid) }).then(() => {
console.log('AgoraRTM client login success');
// join a channel and send a message
rtmChannel.join().then(() => {
// join-channel success
localStreams.rtmActive = true
console.log('RTM Channel join success');
addCameraListener();
}).catch(error => {
// join-channel failure
console.log('failed to join channel for error: ' + error);
});
}).catch(err => {
console.log('AgoraRTM client login failure', err);
});
}
function sendChannelMessage(property, direction){
if (localStreams.rtmActive) {
// use a JSON object to send our instructions in a structured way
const jsonMsg = { };
// build the Agora RTM Message
const msg = {
description: undefined,
messageType: 'TEXT',
rawMessage: undefined,
text: JSON.stringify(jsonMsg)
};
rtmChannel.sendMessage(msg).then(() => {
// channel message-send success
console.log('sent msg success');
}).catch(error => {
// channel message-send failure
console.log('sent msg failure');
});
}
}
// helper methods
function getCameraDevices() {
console.log('Checking for Camera Devices.....')
rtcClient.getCameras ((cameras) => {
devices.cameras = cameras; // store cameras array
cameras.forEach((camera, i) => {
const name = camera.label.split('(')[0];
const optionId = 'camera_' + i;
const deviceId = camera.deviceId;
if(i === 0 && localStreams.camera.camId === ''){
localStreams.camera.camId = deviceId;
}
$('#camera-list').append('<a class=\'dropdown-item\' id= ' + optionId + '>' + name + '</a>');
});
$('#camera-list a').click((event) => {
const index = event.target.id.split('_')[1];
changeStreamSource (index, 'video');
});
});
}
function getMicDevices() {
console.log('Checking for Mic Devices.....')
rtcClient.getRecordingDevices((mics) => {
devices.mics = mics; // store mics array
mics.forEach((mic, i) => {
let name = mic.label.split('(')[0];
const optionId = 'mic_' + i;
const deviceId = mic.deviceId;
if(i === 0 && localStreams.camera.micId === ''){
localStreams.camera.micId = deviceId;
}
if(name.split('Default - ')[1] != undefined) {
name = '[Default Device]' // rename the default mic - only appears on Chrome & Opera
}
$('#mic-list').append('<a class=\'dropdown-item\' id= ' + optionId + '>' + name + '</a>');
});
$('#mic-list a').click((event) => {
const index = event.target.id.split('_')[1];
changeStreamSource (index, 'audio');
});
});
}
// use tokens for added security
function generateToken() {
return null; // TODO: add a token generation
}
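// [Sketch] With your own token server, generateToken could fetch a real token
// instead of returning null. The endpoint below is hypothetical; substitute your own:
//
// async function generateToken() {
//   const response = await fetch('https://your.token.server/rtc/' + channelName);
//   const json = await response.json();
//   return json.token; // joinChannel would then need to await generateToken()
// }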
function rotateModel(uid, direction, send) {
if (send) {
sendChannelMessage('rotation', direction)
}
var model = document.getElementById(uid)
if (direction === 'counter-clockwise') {
model.object3D.rotation.y += 0.1;
} else if (direction === 'clockwise') {
model.object3D.rotation.y -= 0.1;
}
}
function moveModel(uid, direction, send) {
if (send) {
sendChannelMessage('position', direction)
}
var model = document.getElementById(uid)
switch (direction){
case 'forward':
model.object3D.position.z += 0.1
break;
case 'backward':
model.object3D.position.z -= 0.1
break;
case 'left':
model.object3D.position.x -= 0.1
break;
case 'right':
model.object3D.position.x += 0.1
break;
default:
console.log('Unable to determine direction: ' + direction);
}
}
// UI
function toggleBtn(btn){
btn.toggleClass('btn-dark').toggleClass('btn-danger');
}
function toggleScreenShareBtn() {
$('#screen-share-btn').toggleClass('btn-danger');
$('#screen-share-icon').toggleClass('fas').toggleClass('fab').toggleClass('fa-slideshare').toggleClass('fa-times-circle');
}
function toggleVisibility(elementID, visible) {
if (visible) {
$(elementID).attr('style', 'display:block');
} else {
$(elementID).attr('style', 'display:none');
}
}
function toggleMic() {
toggleBtn($('#mic-btn')); // toggle button colors
toggleBtn($('#mic-dropdown'));
$('#mic-icon').toggleClass('fa-microphone').toggleClass('fa-microphone-slash'); // toggle the mic icon
if ($('#mic-icon').hasClass('fa-microphone')) {
localStreams.camera.stream.unmuteAudio(); // enable the local mic
} else {
localStreams.camera.stream.muteAudio(); // mute the local mic
}
}
function toggleVideo() {
toggleBtn($('#video-btn')); // toggle button colors
toggleBtn($('#cam-dropdown'));
if ($('#video-icon').hasClass('fa-video')) {
localStreams.camera.stream.muteVideo(); // enable the local video
console.log('muteVideo');
} else {
localStreams.camera.stream.unmuteVideo(); // disable the local video
console.log('unMuteVideo');
}
$('#video-icon').toggleClass('fa-video').toggleClass('fa-video-slash'); // toggle the video icon
}
function enableUiControls() {
$('#mic-btn').prop('disabled', false);
$('#video-btn').prop('disabled', false);
$('#exit-btn').prop('disabled', false);
$('#mic-btn').click(() => {
toggleMic();
});
$('#video-btn').click(() => {
toggleVideo();
});
$('#exit-btn').click(() => {
console.log('so sad to see you leave the channel');
leaveChannel();
});
// keyboard listeners
$(document).keypress((e) => {
switch (e.key) {
case 'm':
console.log('quick toggle the mic');
toggleMic();
break;
case 'v':
console.log('quick toggle the video');
toggleVideo();
break;
case 'q':
console.log('so sad to see you quit the channel');
leaveChannel();
break;
case 'r':
rotateModel(localStreams.uid, 'counter-clockwise', true)
break;
case 'e':
rotateModel(localStreams.uid, 'clockwise', true)
break;
case 'd':
// move the model forward
moveModel(localStreams.uid, 'forward', true)
break;
case 'x':
// move the model backward
moveModel(localStreams.uid, 'backward', true)
break;
case 'z':
// move the model left
moveModel(localStreams.uid, 'left', true)
break;
case 'c':
// move the model right
moveModel(localStreams.uid, 'right', true)
break;
default: // do nothing
}
});
}

You’ll notice a few empty functions for sending messages with Agora’s RTM SDK. We’ll first focus on adding the Broadcaster and Audience video clients, then we’ll walk through the RTM integration.

Note: In this implementation, I’m using dynamic user IDs, but you can easily connect this to a user management system and use your own UIDs. In the code above, I use the same user ID for both the RTC and RTM clients.
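
For illustration, here’s a minimal sketch of what plugging in your own UID might look like; myUserId is a hypothetical value from your user management system, not something defined in this project:

// hypothetical: a numeric UID supplied by your own user management system
const myUserId = 1234;
rtcClient.join(token, channelName, myUserId, (uid) => {
console.log('User ' + uid + ' joined with a custom UID'); // uid === myUserId
localStreams.uid = uid;
joinRTMChannel(uid); // RTM logs in with String(uid), keeping both clients on the same ID
}, (err) => {
console.log('[ERROR] : join channel failed', err);
});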

Adding the 3D Model

Now that we have our bare bones implementation, let’s implement the createBroadcaster function.

First, we want to create a video element that will be used as part of the Video Texture. We need to set the webkit-playsinline and playsinline attributes to ensure the video doesn’t play fullscreen. Once the video element is ready, we can add it to the DOM as a child of <a-assets>. AFrame’s asset loader will handle loading the video in such a way that we can use it in our virtual scene.

Next we need to configure the settings for our 3D model, such as scale, rotation, and position. While the first time we call this function is when the local user joins the channel, the function is generic and is called whenever a new stream is added to the channel. To allow for multiple hosts, we need to include an offset for the position.

For the gltfModel, we’ll take advantage of AFrame’s DOM virtualization. We’ll pass the id of the 3D model asset that we loaded in the <a-assets> and let AFrame’s loader do the rest. AFrame provides a model-loaded event that we can listen for. Once the model is loaded, we can traverse the mesh’s children to find the mesh named 'face-geo'. This is the name I gave the mesh to make it easier to find.

You’ll notice that we are creating an <a-gltf-model>. This is a component that wraps <a-entity> specifically for loading .gltf and .glb files. In this project we’ll use .glb, because it efficiently packs the model into a single file.
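
For reference, the element that createBroadcaster builds (in the code below) is equivalent to markup like this; the id is illustrative, since the real one comes from the Agora stream at runtime:

<a-gltf-model id="12345" gltf-model="#broadcaster" scale="1 1 1" position="0 -1 0" rotation="0 0 0"></a-gltf-model>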

Once we find the 'face-geo', we need to create a Video Texture and set its minFilter, magFilter, and flipY properties to ensure the video is properly sampled and doesn’t appear flipped. Next we’ll update the 'face-geo' node’s material. We’ll start by setting the .map property to the Video Texture. Then we’ll set the .color to an empty color object to ensure the video doesn’t appear with any tint. Lastly we’ll set the .metalness to 0 so the video doesn’t have any reflections.

function createBroadcaster(streamId) {
// create video element
var video = document.createElement('video');
video.id = "faceVideo-" + streamId;
video.setAttribute('webkit-playsinline', 'webkit-playsinline');
video.setAttribute('playsinline', 'playsinline');
video.setAttribute('poster', '/imgs/no-video.jpg');
console.log(video);
// add video object to the DOM
document.querySelector("a-assets").appendChild(video);
// configure the new broadcaster
const gltfModel = "#broadcaster";
const scale = "1 1 1";
const offset = streamCount;
const position = offset + " -1 0";
const rotation = "0 0 0";
// create the broadcaster element using the given settings
const parent = document.querySelector('a-scene');
var newBroadcaster = document.createElement('a-gltf-model');
newBroadcaster.setAttribute('id', streamId);
newBroadcaster.setAttribute('gltf-model', gltfModel);
newBroadcaster.setAttribute('scale', scale);
newBroadcaster.setAttribute('position', position);
newBroadcaster.setAttribute('rotation', rotation);
parent.appendChild(newBroadcaster);
console.log(newBroadcaster);
// add event listener for model loaded:
newBroadcaster.addEventListener('model-loaded', () => {
var mesh = newBroadcaster.getObject3D('mesh');
mesh.traverse((node) => {
// search the mesh's children for the face-geo
if (node.isMesh && node.name == 'face-geo') {
// create video texture from video element
var texture = new THREE.VideoTexture(video);
texture.minFilter = THREE.LinearFilter;
texture.magFilter = THREE.LinearFilter;
texture.flipY = false;
// set node's material map to video texture
node.material.map = texture
node.material.color = new THREE.Color();
node.material.metalness = 0;
}
});
});
}

Implementing the Real Time Video

Now that our 3D model has been added to the scene, it’s relatively simple to connect the video stream as the source of the Video Texture.

function connectStreamToVideo(agoraStream, video) {
video.srcObject = agoraStream.stream; // add video stream to video element as source
video.onloadedmetadata = () => {
// ready to play video
video.play();
}
}

The important thing to remember is to use the onloadedmetadata event to wait for the stream to load before trying to play it.
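
One defensive variation worth noting: in modern browsers, video.play() returns a Promise that can be rejected by autoplay policies, so a slightly more cautious version of this function (a sketch, not part of the original project) would catch that rejection:

function connectStreamToVideo(agoraStream, video) {
video.srcObject = agoraStream.stream; // add the stream as the video element's source
video.onloadedmetadata = () => {
// ready to play video
const playPromise = video.play();
if (playPromise !== undefined) {
playPromise.catch((err) => {
// autoplay was blocked; playback can resume on the next user gesture
console.log('video.play() was blocked: ' + err);
});
}
}
}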

Test the Broadcaster

At this point we are ready to test the Broadcaster Client. Start your local web server, and navigate to:

/broadcaster.html

If you followed the steps above and everything works, you should see yourself in the 3D model.

Note: You’ll have to accept camera access permissions

Building the Audience Client

Now that we have a working WebVR Broadcaster Client, we need to build the WebAR Audience Client. To get us started, below is a barebones implementation of the Agora RTC and RTM Web SDKs that will serve as our Audience Client.

This part is a bit easier because we are still using AFrame; the only difference is that we’ve added AR.js, and since this is an audience user, they don’t need to publish a stream. I’ve included the Audience Client below, and you’ll notice a few minor differences when we add the model:

  • The model is added as a child of <a-marker> so that it will display tracking to the marker.
  • The scale is much smaller, to make the model fit the screen better, and it’s negative because I needed to flip the UVs (the video was appearing inverted).
  • The model is rotated 180 degrees about the x-axis, which allows it to appear upright when tracking to the marker.

Note: You will need to add your Agora AppID at the top of the script. You can find this in the Projects section of your Agora Console.

// create client
var client = AgoraRTC.createClient({mode: 'live', codec: 'vp8'}); // vp8 to work across mobile devices
const agoraAppId = ''; // insert Agora AppID here
const channelName = 'WebAR';
var streamCount = 0;
// set log level:
// -- .DEBUG for dev
// -- .NONE for prod
AgoraRTC.Logger.setLogLevel(AgoraRTC.Logger.DEBUG);
client.init(agoraAppId, () => {
console.log('AgoraRTC client initialized');
joinChannel(); // join channel upon successful init
}, function (err) {
console.log('[ERROR] : AgoraRTC client init failed', err);
});
// connect remote streams
client.on('stream-added', (evt) => {
const stream = evt.stream;
const streamId = stream.getId();
console.log('New stream added: ' + streamId);
console.log('Subscribing to remote stream:' + streamId);
// Subscribe to the stream.
client.subscribe(stream, (err) => {
console.log('[ERROR] : subscribe stream failed', err);
});
streamCount++;
createBroadcaster(streamId); // create 3d broadcaster
});
client.on('stream-removed', (evt) => {
const stream = evt.stream;
stream.stop(); // stop the stream
stream.close(); // clean up and close the camera stream
console.log("Remote stream is removed " + stream.getId());
});
client.on('stream-subscribed', (evt) => {
const remoteStream = evt.stream;
const remoteId = remoteStream.getId();
console.log('Successfully subscribed to remote stream: ' + remoteStream.getId());
// get the designated video element and add the stream as its video source
var video = document.getElementById("faceVideo-" + remoteId);
connectStreamToVideo(remoteStream, video)
});
// remove the remote-container when a user leaves the channel
client.on('peer-leave', (evt) => {
console.log('Remote stream has left the channel: ' + evt.uid);
evt.stream.stop(); // stop the stream
const remoteId = evt.stream.getId();
document.getElementById(remoteId).remove();
document.getElementById("faceVideo-" + remoteId);
streamCount--;
});
// show mute icon whenever a remote has muted their mic
client.on('mute-audio', (evt) => {
console.log("mute-audio for: " + evt.uid);
});
client.on('unmute-audio', (evt) => {
console.log("unmute-audio for: " + evt.uid);
});
// show user icon whenever a remote has disabled their video
client.on('mute-video', (evt) => {
console.log("mute-video for: " + evt.uid);
});
client.on('unmute-video', (evt) => {
console.log("unmute-video for: " + evt.uid);
});
// join a channel
function joinChannel() {
const token = generateToken();
// set the role
client.setClientRole('audience', () => {
console.log('Client role set to audience');
}, (e) => {
console.log('setClientRole failed', e);
});
client.join(token, channelName, 0, (uid) => {
console.log('User ' + uid + ' join channel successfully');
joinRTMChannel(uid);
}, function(err) {
console.log('[ERROR] : join channel failed', err);
});
}
function leaveChannel() {
client.leave(() => {
console.log('client leaves channel');
}, (err) => {
console.log('client leave failed ', err); //error handling
});
}
// Agora RTM
// setup the RTM client and channel
const rtmClient = AgoraRTM.createInstance(agoraAppId);
const rtmChannel = rtmClient.createChannel(channelName);
rtmClient.on('ConnectionStateChange', (newState, reason) => {
console.log('on connection state changed to ' + newState + ' reason: ' + reason);
});
// event listener for receiving a channel message
rtmChannel.on('ChannelMessage', ({ text }, senderId) => {
// text: text of the received channel message; senderId: user ID of the sender.
console.log('AgoraRTM msg from user ' + senderId + ' received: \n' + text);
});
function joinRTMChannel(uid){
rtmClient.login({ token: null, uid: String(uid) }).then(() => {
console.log('AgoraRTM client login success');
// join a channel and send a message
rtmChannel.join().then(() => {
// join-channel success
console.log('RTM Channel join success');
}).catch(error => {
// join-channel failure
console.log('failed to join channel for error: ' + error);
});
}).catch(err => {
console.log('AgoraRTM client login failure', err);
});
}
function rotateModel(uid, direction) {
var model = document.getElementById(uid)
if (direction === 'counter-clockwise') {
model.object3D.rotation.y += 0.1;
} else if (direction === 'clockwise') {
model.object3D.rotation.y -= 0.1;
}
}
function moveModel(uid, direction) {
var model = document.getElementById(uid)
switch (direction){
case 'forward':
model.object3D.position.z += 0.1
break;
case 'backward':
model.object3D.position.z -= 0.1
break;
case 'left':
model.object3D.position.x -= 0.1
break;
case 'right':
model.object3D.position.x += 0.1
break;
default:
console.log('Unable to determine direction: ' + direction);
}
}
// use tokens for added security
function generateToken() {
return null; // TODO: add a token generation
}
function createBroadcaster(streamId) {
// create video element
var video = document.createElement('video');
video.id = "faceVideo-" + streamId;
video.setAttribute('webkit-playsinline', 'webkit-playsinline');
video.setAttribute('playsinline', 'playsinline');
video.setAttribute('poster', '/imgs/no-video.jpg');
console.log(video);
// add video object to the DOM
document.querySelector("a-assets").appendChild(video);
// configure the new broadcaster
const gltfModel = "#broadcaster";
const scale = "-0.55 -0.55 -0.55"; // invert UVs (hack)
const offset = (streamCount-1);
const position = offset + " 0 0";
const rotation = "180 0 0";
// create the broadcaster element using the given settings
const parent = document.querySelector('a-marker');
var newBroadcaster = document.createElement('a-gltf-model');
newBroadcaster.setAttribute('id', streamId);
newBroadcaster.setAttribute('gltf-model', gltfModel);
newBroadcaster.setAttribute('scale', scale);
newBroadcaster.setAttribute('position', position);
newBroadcaster.setAttribute('rotation', rotation);
parent.appendChild(newBroadcaster);
console.log(newBroadcaster);
// add event listener for model loaded:
newBroadcaster.addEventListener('model-loaded', () => {
var mesh = newBroadcaster.getObject3D('mesh');
mesh.traverse((node) => {
// search the mesh's children for the face-geo
if (node.isMesh && node.name == 'face-geo') {
// create video texture from video element
var texture = new THREE.VideoTexture(video);
texture.minFilter = THREE.LinearFilter;
texture.magFilter = THREE.LinearFilter;
texture.flipY = false;
// set node's material map to video texture
node.material.map = texture
node.material.color = new THREE.Color();
node.material.metalness = 0;
}
});
});
}
function connectStreamToVideo(agoraStream, video) {
video.srcObject = agoraStream.stream; // add video stream to video element as source
video.onloadedmetadata = () => {
// ready to play video
video.play();
}
}

Test the Audience

Let’s start our local web server, and navigate to the Broadcaster Client

/broadcaster.html

This is where you’ll need to either deploy the project to a web server or use a tunneling service (like ngrok) to map your local web server to a secure (https) URL.

On your mobile device navigate to

/index.html

When you run the demo on your mobile device, you’ll have to accept camera access and sensor permissions for AR.js. Point your phone at the marker image and you should see the 3D broadcaster appear.

Note: On iOS, this only works using Safari

Adding the RTM Messages

Ok, so the video in AR is cool, but what would make this more interesting would be adding the ability to move the 3D model within the scene and synchronize that across all clients. Now that we have our Broadcaster and Audience Clients, all we need to do is add in the ability to send and receive messages with Agora’s Real-Time Messaging SDK.

Send RTM Messages

Within our Broadcaster Client, you’ll notice we have key bindings set to control some of the streaming functionality:

  • m to toggle the mic
  • v to toggle the video and
  • q to leave the channel

You’ll also notice we have a few others:

  • e and r to rotate the model
  • d, x, z, and c to move forward, backward, left, and right, respectively

Pressing these keys triggers one of two functions:

  • rotateModel to handle rotations
  • moveModel to handle changes in position

Within each function, you’ll also notice that it calls sendChannelMessage to send an RTM message that lets each of the other clients in the RTM channel know to update the rotation/position.

Let’s fill in the missing pieces so the sendChannelMessage function can pass along our updates to others in the channel. We’ll use a JSON object to structure our message, designating one key for the property we are changing and one for the direction:

const jsonMsg = {
  property: property,
  direction: direction
};

In this case direction is mapped to a set of keywords:

  • counter-clockwise: positive rotation around the y-axis
  • clockwise: negative rotation around the y-axis
  • forward: move positively on the z-axis position
  • backward: move negatively on the z-axis position
  • left: move negatively on the x-axis position
  • right: move positively on the x-axis position

Next let’s package our message into an Agora RTM message. The Agora RTM SDK uses a JSON object to structure messages, the most important element being the text property, as this holds the message that will get passed to the receiver. Since text expects a string value, we’ll use JSON.stringify() to convert our object to a string. For example, pressing e on the broadcaster ends up sending the text '{"property":"rotation","direction":"clockwise"}'.

function sendChannelMessage(property, direction){
if (localStreams.rtmActive) {
// use a JSON object to send our instructions in a structured way
const jsonMsg = {
property: property,
direction: direction
};
// build the Agora RTM Message
const msg = {
description: undefined,
messageType: 'TEXT',
rawMessage: undefined,
text: JSON.stringify(jsonMsg)
};
rtmChannel.sendMessage(msg).then(() => {
// channel message-send success
console.log('sent msg success');
}).catch(error => {
// channel message-send failure
console.log('sent msg failure');
});
}
}

Receive RTM Messages

Now that we are able to send messages into the RTM channel, we also need a way of handling the messages when they are received. Let’s take a look at the .on('ChannelMessage', ... ) callback that exists in both the Broadcaster and Audience Clients.

Note: We listen for this event on the Broadcaster Client so that we can support multiple Broadcasters in each channel.

When we receive the message, we use JSON.parse() to convert it from a string back to a JSON object, which allows us to quickly update the appropriate property. Looking up the model with senderId works because the model element’s id is the sender’s stream id, and (as noted earlier) each user joins RTC and RTM with the same ID.

rtmChannel.on('ChannelMessage', ({ text }, senderId) => {
  // convert from string to JSON
  const msg = JSON.parse(text);  // Handle RTM msg
  if (msg.property === 'rotation') {
    rotateModel(senderId, msg.direction)
  } else if (msg.property == 'position') {
    moveModel(senderId, msg.direction)
  }
});

Test the RTM Messages

Let’s start our local web server, and navigate to the Broadcaster Client

/broadcaster.html

Again, this is where you’ll need to either deploy the project to a web server or use a tunneling service (like ngrok) to map your local web server to a secure (https) URL.

On your mobile device navigate to

/index.html

When you run the demo on your mobile device, you’ll have to accept camera access and sensor permissions for AR.js. Point your phone at the marker image and you should see the 3D Broadcaster appear.

Next on the Broadcaster Client, use e, r, d, x, c, and z to move the Broadcaster around in the virtual scene, and watch as it’s updated in real-time for everyone in the channel.

All Done

Wow, that was intense! Thanks for following and coding along with me; below is a link to the completed project. Feel free to fork and make pull requests with any feature enhancements. Now it’s your turn to take this knowledge and go build something cool.

https://github.com/digitallysavvy/AgoraWebXR

Other Resources

For more information about the Agora Video SDK, please refer to the Agora Video Web SDK API Reference. For more information about the Agora RTM SDK, please refer to the Agora RTM Web SDK API Reference.

I also invite you to join the Agora Developer Slack community.
