Highlighting the Active Speaker using the Agora Flutter SDK

Satyam Parasa is a web and mobile application developer from India. He loves learning new technologies. His enthusiasm for tech led him to start a Flutter blog, flutterant.com.


For a meeting or video chat to work well, several factors must come into play. The Internet connection must be stable, the speaker's words must be clearly heard, and everyone should be aware of who is speaking. We often get confused about who is speaking, which can lead to miscommunication. So highlighting the active speaker is an important feature for any video calling application.

Building a video calling app becomes easy with the Agora SDK. In this tutorial, we are going to implement a feature that highlights the active speaker in a call using the Agora Flutter SDK.

Prerequisites

  • Basic knowledge of Flutter
  • Any IDE (for example, Android Studio or Visual Studio Code)
  • Agora Flutter SDK
  • An Agora developer account (Sign up here)

Project Setup

1. Create a Flutter project in your favorite IDE, and remove the boilerplate code.
2. Add the following required dependencies in the pubspec.yaml file, and install them by running flutter pub get:

dependencies:
  flutter:
    sdk: flutter
  permission_handler: ^8.1.4+2
  agora_rtc_engine: ^4.0.6

3. Create the file structure in the lib folder as shown below:

Project structure

Implementing the Video Calling Interface

As we know, the Flutter application is initialized in the main.dart file, which calls HomePage(), defined in the home_page.dart file:

import 'package:agora_flutter_who_is_speaking/pages/home_page.dart';
import 'package:flutter/material.dart';

void main() {
  runApp(MyApp());
}

class MyApp extends StatelessWidget {
  // This widget is the root of your application.
  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      debugShowCheckedModeBanner: false,
      title: 'Agora flutter Video call ',
      theme: ThemeData(
        primarySwatch: Colors.blue,
      ),
      home: HomePage(),
    );
  }
}

main.dart

Building the Home Page

Let's create a Dart file named home_page.dart. This page takes the channel name as input from the user.

When the user clicks the join button, the app will ask for the camera and mic permissions from the user. If permissions are granted, then the channel name will be passed to call_page.dart:

import 'dart:async';
import 'package:agora_flutter_who_is_speaking/pages/call_page.dart';
import 'package:flutter/material.dart';
import 'package:permission_handler/permission_handler.dart';

class HomePage extends StatefulWidget {
  @override
  State<StatefulWidget> createState() => HomePageState();
}

class HomePageState extends State<HomePage> {
  /// create a channelController to retrieve text value
  final _channelController = TextEditingController();

  /// if channel textField is validated to have an error
  bool _validateError = false;

  @override
  void dispose() {
    // dispose input controller
    _channelController.dispose();
    super.dispose();
  }

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(
        title: Text('Agora Flutter video Call'),
      ),
      body: Padding(
        padding: const EdgeInsets.all(10.0),
        child: Column(
          mainAxisAlignment: MainAxisAlignment.center,
          children: <Widget>[
            Row(
              children: <Widget>[
                Expanded(
                    child: TextField(
                  controller: _channelController,
                  decoration: InputDecoration(
                    errorText:
                        _validateError ? 'Channel name is mandatory' : null,
                    border: UnderlineInputBorder(
                      borderSide: BorderSide(width: 1),
                    ),
                    hintText: 'Channel name',
                  ),
                ))
              ],
            ),
            SizedBox(
              height: 12,
            ),
            Row(
              children: <Widget>[
                Expanded(
                  child: ElevatedButton(
                    onPressed: onJoin,
                    child: Text('Join'),
                  ),
                )
              ],
            )
          ],
        ),
      ),
    );
  }
}

The onJoin() method is triggered when the user clicks the Join button on the home page. It validates the user input and calls the _handleCameraAndMic(..) method to get the camera and mic permissions. After the required permissions are granted, the channel name is passed to call_page.dart.
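A minimal sketch of how onJoin() (the handler wired to the Join button above) might look inside HomePageState, assuming the _handleCameraAndMic helper shown next and the CallPage constructor defined in call_page.dart:

Future<void> onJoin() async {
  // Validate that a channel name was entered
  setState(() {
    _channelController.text.isEmpty
        ? _validateError = true
        : _validateError = false;
  });
  if (_channelController.text.isNotEmpty) {
    // Ask for camera and mic permissions before joining the call
    await _handleCameraAndMic(Permission.camera);
    await _handleCameraAndMic(Permission.microphone);
    // Pass the channel name on to the call page
    await Navigator.push(
      context,
      MaterialPageRoute(
        builder: (context) => CallPage(
          channelName: _channelController.text,
        ),
      ),
    );
  }
}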

Here we used the permission_handler plug-in to request the permissions and check their status:

Future<void> _handleCameraAndMic(Permission permission) async {
  final status = await permission.request();
  print(status);
}

Building the Calling Page

Create a Dart file named call_page.dart. Our entire functionality will be developed on this page: showing the surface views in a grid, highlighting the active speaker in the call, and providing the user interaction buttons for ending the call, muting/unmuting the mic, and switching the camera:

import 'package:agora_flutter_who_is_speaking/model/user.dart';
import 'package:agora_flutter_who_is_speaking/utils/settings.dart';
import 'package:agora_rtc_engine/rtc_engine.dart';
import 'package:agora_rtc_engine/rtc_local_view.dart' as RtcLocalView;
import 'package:agora_rtc_engine/rtc_remote_view.dart' as RtcRemoteView;
import 'package:flutter/material.dart';

class CallPage extends StatefulWidget {
  /// non-modifiable channel name of the page
  final String? channelName;

  const CallPage({Key? key, this.channelName}) : super(key: key);

  @override
  _CallPageState createState() => _CallPageState();
}

class _CallPageState extends State<CallPage> {
  late RtcEngine _engine;
  Map<int, User> _userMap = new Map<int, User>();
  bool _muted = false;
  int? _localUid;

  @override
  void initState() {
    super.initState();
    // initialize agora SDK
    initialize();
  }

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(
        title: Text("Group call"),
      ),
      body: Stack(
        children: [_buildGridVideoView(), _toolbar()],
      ),
    );
  }
}

Let's look at the above code to understand the variables that are used.

Our CallPage constructor holds the channel name, which comes from the home page:

  • _engine is a reference to the RtcEngine class.
  • _userMap maintains the data of the users who have joined the channel.
  • _muted gives the Boolean status of whether the mic is muted.
  • _localUid holds the local user's uid.

As we know, the initState() method is called only once when the stateful widget is added to the widget tree. Here, we call the initialize() method:

Future<void> initialize() async {
  if (APP_ID.isEmpty) {
    print("APP_ID missing, please provide your APP_ID in settings.dart");
    return;
  }
  await _initAgoraRtcEngine();
  _addAgoraEventHandlers();
  await _engine.joinChannel(null, widget.channelName ?? "", null, 0);
}

If we look at the above code snippet, the initialize() method is responsible for calling the following methods:

  • _initAgoraRtcEngine() is responsible for initializing the RtcEngine.
  • _addAgoraEventHandlers() is responsible for event handling.
  • joinChannel() is responsible for joining a specific channel.

Future<void> _initAgoraRtcEngine() async {
  _engine = await RtcEngine.create(APP_ID);
  await _engine.enableVideo();
  await _engine.setChannelProfile(ChannelProfile.Communication);
  // Enables the audioVolumeIndication
  await _engine.enableAudioVolumeIndication(250, 3, true);
}

create(..): The Agora RtcEngine is initialized by the create(..) method, which takes the App ID as its parameter. Our app connects to the Agora engine using the App ID that we got from the Agora developer console (this App ID is kept in the settings.dart file):

/// Define App ID
const APP_ID = "Your App ID";

settings.dart

enableVideo(): Enables the video module. You can call this method either before joining a channel or during a call. If you call this method before joining a channel, the service starts in video mode. If you call it during an audio call, the audio mode switches to video mode.

setChannelProfile(..): We are using Communication as the channel profile, which is the default profile for one-on-one or group calls.

enableAudioVolumeIndication(..): Enables the RtcEngineEventHandler.audioVolumeIndication callback at a set time interval to report which users are speaking and the speakers' volume. This method has three parameters: interval, smooth, and report_vad:

  • int interval: Sets the time interval between two consecutive volume indications. Agora recommends an interval greater than or equal to 200 ms. If the interval is ≤ 0, the volume indication is disabled.
  • int smooth: The smoothing factor that sets the sensitivity of the audio volume indicator. The value ranges between 0 and 10. The greater the value, the more sensitive the indicator. Agora's recommended value is 3.
  • bool report_vad: If the value is true, the RtcEngineEventHandler.audioVolumeIndication callback reports the voice activity status of the local user. Otherwise, it does not detect the voice activity status of the local user. The default value of report_vad is false.

void _addAgoraEventHandlers() {
  _engine.setEventHandler(
    RtcEngineEventHandler(
      error: (code) {
        print("error occurred $code");
      },
      joinChannelSuccess: (channel, uid, elapsed) {
        setState(() {
          _localUid = uid;
          _userMap.addAll({uid: User(uid, false)});
        });
      },
      leaveChannel: (stats) {
        setState(() {
          _userMap.clear();
        });
      },
      userJoined: (uid, elapsed) {
        setState(() {
          _userMap.addAll({uid: User(uid, false)});
        });
      },
      userOffline: (uid, elapsed) {
        setState(() {
          _userMap.remove(uid);
        });
      },
      /// Detecting active speaker by using audioVolumeIndication callback
      audioVolumeIndication: (volumeInfo, v) {
        // core logic will be here
      },
    ),
  );
}

_addAgoraEventHandlers() is responsible for the RtcEngine event callbacks. It sets the event handler by calling setEventHandler(). After setting the engine event handler, we can listen for engine events and receive the statistics of the corresponding RtcEngine.

Before going to the callback methods, create a Dart file named user.dart. This model captures the speaking status of each user.

class User {
  int uid; // reference to user uid
  bool isSpeaking; // reference to whether the user is speaking

  User(this.uid, this.isSpeaking);

  @override
  String toString() {
    return 'User{uid: $uid, isSpeaking: $isSpeaking}';
  }
}

The event handler callback methods include:

  • error() callback is for reporting an error during the SDK runtime.
  • joinChannelSuccess() triggers when the local user joins a specific channel. This method returns the channel name, the uid for the user, and the time elapsed (in ms) for the local user to join the channel:
_localUid = uid;
_userMap.addAll({uid: User(uid, false)});

The local user's uid is assigned to _localUid, and the user data is added to _userMap with the uid as the key and a User object as the value, with the uid and false as the speaking status.

  • leaveChannel() triggers whenever the user leaves the channel. It returns the RtcStats, which include the call duration, the number of bytes transmitted and received, the latency, and so on. Here we clear _userMap when the user leaves the channel.
  • userJoined() triggers when a remote user joins a specific channel. It returns the uid of the user and the time elapsed (in ms) for the remote user to join the channel:

_userMap.addAll({uid: User(uid, false)});

The remote user's data is added to _userMap with the uid as the key and a User object as the value, with the uid and false as the speaking status.

  • userOffline() occurs when a remote user leaves the channel. It returns the uid of the user and the reason why the user went offline. Here we remove the user from _userMap by its uid.
  • audioVolumeIndication() is triggered at the set interval, regardless of whether anyone is speaking. It reports which users are speaking and the speakers' volume, and whether the local user is speaking. By default, this callback is disabled. We can enable it by calling the RtcEngine.enableAudioVolumeIndication method.

The local user will be detected only when report_vad is set to true in the RtcEngine.enableAudioVolumeIndication method.

If the local user calls the RtcEngine.muteLocalAudioStream method, the SDK stops triggering the local user's callback. 20 seconds after a remote speaker calls the RtcEngine.muteLocalAudioStream method, the remote speakers' callback no longer includes information about that remote user. 20 seconds after all remote users call the RtcEngine.muteLocalAudioStream method, the SDK stops triggering the remote speakers' callback.

The audioVolumeIndication() callback is an effective way to highlight all active speakers. If you only want to highlight one speaker, you can use the activeSpeaker() callback instead, which reports the loudest of the active speakers.
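For reference, a minimal sketch of how the activeSpeaker callback inside RtcEngineEventHandler could drive the same _userMap bookkeeping might look like the following (the single uid parameter and the convention that 0 represents the local user are assumptions based on the SDK documentation):

activeSpeaker: (uid) {
  setState(() {
    // Mark only the reported loudest speaker; a uid of 0 means the local user
    _userMap.updateAll((key, user) =>
        User(key, key == uid || (uid == 0 && key == _localUid)));
  });
},

In this tutorial, however, we stick with audioVolumeIndication so that every active speaker can be highlighted at once.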

The AudioVolumeCallback includes two parameters:

  • List<AudioVolumeInfo> speakers: An array containing the user ID and volume information for each speaker. In this callback, the uid of the local user is 0.
  • int totalVolume: The total volume after audio mixing. The value ranges between 0 (lowest volume) and 255 (highest volume):

audioVolumeIndication: (volumeInfo, v) {
  volumeInfo.forEach((speaker) {
    // detecting a speaking person whose volume is more than 5
    if (speaker.volume > 5) {
      try {
        _userMap.forEach((key, value) {
          // Highlighting the local user
          // In this callback, the local user is represented by an uid of 0.
          if ((_localUid?.compareTo(key) == 0) && (speaker.uid == 0)) {
            setState(() {
              _userMap.update(key, (value) => User(key, true));
            });
          }
          // Highlighting a remote user
          else if (key.compareTo(speaker.uid) == 0) {
            setState(() {
              _userMap.update(key, (value) => User(key, true));
            });
          } else {
            setState(() {
              _userMap.update(key, (value) => User(key, false));
            });
          }
        });
      } catch (error) {
        print('Error:${error.toString()}');
      }
    }
  });
},

The above code updates _userMap based on the condition validation. If the speaking person's uid matches a uid in _userMap, the corresponding User object's isSpeaking flag is updated to true; otherwise it is set to false. For example:

_userMap.update(key, (value) => User(key, true))

In this way, _userMap is updated on every callback and maintains each user's data along with their speaking status during the call, so we can update the UI based on the _userMap data:

GridView _buildGridVideoView() {
  return GridView.builder(
    shrinkWrap: true,
    itemCount: _userMap.length,
    gridDelegate: SliverGridDelegateWithFixedCrossAxisCount(
        childAspectRatio: MediaQuery.of(context).size.height / 1100,
        crossAxisCount: 2),
    itemBuilder: (BuildContext context, int index) => Padding(
      padding: const EdgeInsets.all(8.0),
      child: Container(
        child: Container(
            color: Colors.white,
            child: (_userMap.entries.elementAt(index).key == _localUid)
                ? RtcLocalView.SurfaceView()
                : RtcRemoteView.SurfaceView(
                    uid: _userMap.entries.elementAt(index).key)),
        decoration: BoxDecoration(
          border: Border.all(
              color: _userMap.entries.elementAt(index).value.isSpeaking
                  ? Colors.blue
                  : Colors.grey,
              width: 6),
          borderRadius: BorderRadius.all(
            Radius.circular(10.0),
          ),
        ),
      ),
    ),
  );
}

We are using a GridView to show the SurfaceViews based on the _userMap data and to highlight the speaking person in the call. If a user's isSpeaking value is true, meaning that user is currently saying something in the call, the container's border color changes to blue.

The _toolbar() widget adds the user interaction buttons at the bottom of the screen for ending the call, muting/unmuting the mic, and switching the camera:

Widget _toolbar() {
  return Container(
    alignment: Alignment.bottomCenter,
    padding: const EdgeInsets.symmetric(vertical: 48),
    child: Row(
      mainAxisAlignment: MainAxisAlignment.center,
      children: <Widget>[
        RawMaterialButton(
          onPressed: _onToggleMute,
          child: Icon(
            _muted ? Icons.mic_off : Icons.mic,
            color: _muted ? Colors.white : Colors.blueAccent,
            size: 20.0,
          ),
          shape: CircleBorder(),
          elevation: 2.0,
          fillColor: _muted ? Colors.blueAccent : Colors.white,
          padding: const EdgeInsets.all(12.0),
        ),
        RawMaterialButton(
          onPressed: () => _onCallEnd(context),
          child: Icon(
            Icons.call_end,
            color: Colors.white,
            size: 35.0,
          ),
          shape: CircleBorder(),
          elevation: 2.0,
          fillColor: Colors.redAccent,
          padding: const EdgeInsets.all(15.0),
        ),
        RawMaterialButton(
          onPressed: _onSwitchCamera,
          child: Icon(
            Icons.switch_camera,
            color: Colors.blueAccent,
            size: 20.0,
          ),
          shape: CircleBorder(),
          elevation: 2.0,
          fillColor: Colors.white,
          padding: const EdgeInsets.all(12.0),
        )
      ],
    ),
  );
}
  • _onToggleMute is used to toggle the local audio stream by calling the _engine.muteLocalAudioStream() method, which takes a Boolean value:

void _onToggleMute() {
  setState(() {
    _muted = !_muted;
  });
  _engine.muteLocalAudioStream(_muted);
}
  • _onCallEnd is used to disconnect the call and navigate back to the home page:

void _onCallEnd(BuildContext context) {
  Navigator.pop(context);
}
  • _onSwitchCamera() is used to toggle between the front and back cameras by calling the _engine.switchCamera() method:

void _onSwitchCamera() {
  _engine.switchCamera();
}

Finally, in the dispose() method we clear the users, call _engine.leaveChannel() so the user leaves the channel, and call _engine.destroy() to destroy the RtcEngine instance:

@override
void dispose() {
  // clear users
  _userMap.clear();
  // leave the channel and destroy the sdk
  _engine.leaveChannel();
  _engine.destroy();
  super.dispose();
}

Testing

Once we are done with coding, we can test the app on a device. To do that, run the project from your IDE or with the flutter run command.

Conclusion

We have successfully used the Agora Flutter SDK to implement a Flutter video calling app with a feature that highlights the speaking person in a call.

Click this link for the complete source code.

Other Resources

For more about the Agora Flutter SDK, see the developer’s guide here.

For more about the methods discussed in the article, click this link.

You’re welcome to join the Agora Developer Slack community.
