Solution Components and Architecture


Introduction

This document explains the technical details of the Expertflow Voice Recording Solution (VRS), including the terminology and concepts behind it.

Terminology

The following terms are used in this document.

SIP Message

A SIP message is the type of message through which CUCM communicates with VRS.

It can be either a request or a response, and it may or may not carry content.
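The request/response distinction can be illustrated with a minimal sketch. It relies only on general SIP syntax (a response starts with the SIP version, and Content-Length gives the body size); the function name `classify_sip_message` and the returned keys are hypothetical and are not taken from the VRS code.

```python
def classify_sip_message(raw: str) -> dict:
    """Classify a SIP message and report whether it carries content."""
    # Headers and body are separated by an empty line (CRLF CRLF).
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    start_line = lines[0]
    # A response starts with the SIP version; a request starts with the method.
    is_response = start_line.startswith("SIP/2.0")
    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    # Content-Length (compact form "l") is the body size in bytes.
    length = int(headers.get("content-length", headers.get("l", "0")) or 0)
    return {
        "kind": "response" if is_response else "request",
        "start_line": start_line,
        "has_content": length > 0 and bool(body),
    }
```

For example, an INVITE carrying an SDP body would be classified as a request with content, while a `200 OK` with `Content-Length: 0` would be a response without content.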

Call

A call is the complete object that contains all of its sessions. A call can have multiple sessions in case of hold/resume, transfer, or conference.

Session

A session is a single recording that contains the streams (legs) of both parties.

Session_Legs

A session leg is the stream of a single party in a call. A normal call has a minimum of 2 session legs, one for each party's stream.

Calling Number

The end user (party) who placed the call.

Called Number

The end user (party) who received the incoming call.

Force Termination

A flag indicating that the recording for this call is corrupted or incomplete. The recording might be empty or partial.

Zombie Timer

If a call has been terminated for a period equal to or longer than this interval, we mark it Terminated.

Thread Timer

The time interval after which our internal process checks all terminated calls and sends them to the Mixer.

Call_Timeout

If no RTP packets are received from the socket for this amount of time, we mark the call “Force Terminated”.
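The Call_Timeout rule can be sketched as a periodic check over active calls. This is only an illustration under assumptions: the `ActiveCall` class, the status strings, and the function name are hypothetical, and the real Recorder may track RTP activity differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ActiveCall:
    call_id: str
    last_rtp_at: float        # time (seconds) when the last RTP packet arrived
    status: str = "ACTIVE"    # hypothetical status value


def apply_call_timeout(calls: List[ActiveCall], now: float,
                       call_timeout: float) -> List[str]:
    """Mark calls that received no RTP for call_timeout seconds."""
    timed_out = []
    for call in calls:
        silent_for = now - call.last_rtp_at
        if call.status == "ACTIVE" and silent_for >= call_timeout:
            call.status = "FORCE_TERMINATED"  # hypothetical status value
            timed_out.append(call.call_id)
    return timed_out
```

A caller would pass the current clock reading as `now`, for example `time.monotonic()`, each time the check runs.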

Topic

A Kafka topic is essentially a queue through which call information is sent from the Recorder to the Mixer.
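As a rough sketch of what sending call information over a topic could look like, the example below serializes a call to JSON and hands it to a `send` callable. The payload fields, the function names, and the topic name are assumptions made for illustration; the actual message format used between the Recorder and the Mixer is not described in this document.

```python
import json
from typing import Callable, List


def build_call_message(call_id: str, sessions: List[dict],
                       force_terminated: bool = False) -> bytes:
    """Serialize call information to JSON bytes (hypothetical payload layout)."""
    payload = {
        "call_id": call_id,
        "force_terminated": force_terminated,
        "sessions": sessions,
    }
    return json.dumps(payload, sort_keys=True).encode("utf-8")


def publish_call(send: Callable[[str, bytes], object],
                 topic: str, message: bytes) -> None:
    """Hand the message to any producer-like callable taking (topic, value)."""
    send(topic, message)
```

Keeping `send` as a plain callable lets the sketch run without a broker. With the kafka-python client, for instance, the `send` method of a `KafkaProducer` instance could be passed in its place.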

Components


Expertflow Voice Recording Solution consists of 3 components. These components communicate with each other through Apache Kafka.

Recorder

Expertflow Recorder is the main component. It handles the handshake and call recording between Cisco CUCM and the EF Voice Recording Solution.

The Recorder runs multiple internal processes so that every call is recorded without interruption or delay.

Correlation

Expertflow Recorder keeps every detail of a call in in-memory data structures, which makes it easy to correlate calls at runtime. A mapping stores call sessions, call legs, and the complete call, so that at the end of any call we have a complete correlated call object containing all of its sessions and legs.

Each call is identified by the Xrefci ID obtained from the SIP packets. In-memory storage keeps the whole process fast.
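The call, session, and leg mapping described above might be modelled roughly as follows. This is a minimal sketch, not the Recorder's actual data structures: all class and field names are hypothetical, and keying the registry by a single Xrefci ID per call is an assumption based on the description above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SessionLeg:
    party: str       # e.g. calling or called number
    raw_file: str    # path of the raw RTP file for this leg


@dataclass
class Session:
    session_id: str
    legs: List[SessionLeg] = field(default_factory=list)


@dataclass
class Call:
    xrefci_id: str
    sessions: Dict[str, Session] = field(default_factory=dict)


class CallRegistry:
    """In-memory map from Xrefci ID to the correlated call object."""

    def __init__(self) -> None:
        self._calls: Dict[str, Call] = {}

    def add_leg(self, xrefci_id: str, session_id: str, leg: SessionLeg) -> Call:
        # Create the call and session on first sight, then attach the leg.
        call = self._calls.setdefault(xrefci_id, Call(xrefci_id))
        session = call.sessions.setdefault(session_id, Session(session_id))
        session.legs.append(leg)
        return call

    def finish(self, xrefci_id: str) -> Call:
        # Remove and return the complete call once it has ended.
        return self._calls.pop(xrefci_id)
```

A hold/resume would show up here as a second `session_id` under the same call, each session holding its own pair of legs.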

RTP Storage

Expertflow Recorder stores the voice RTP packets of each call leg in a separate raw file. The Recorder then decodes the raw RTP data to PCM according to the codec. Currently only G.711 is supported.
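To make the decoding step concrete, here is a small sketch that strips the RTP header and expands G.711 samples to 16-bit PCM. It assumes the mu-law variant (PCMU) and ignores RTP header extensions and padding; the document does not say which G.711 variant is used, and the function names are hypothetical.

```python
import struct

RTP_HEADER_LEN = 12   # fixed RTP header size in bytes
ULAW_BIAS = 0x84      # bias used by the G.711 mu-law expansion


def rtp_payload(packet: bytes) -> bytes:
    """Return the payload of an RTP packet (no extension/padding handling)."""
    csrc_count = packet[0] & 0x0F          # each CSRC entry adds 4 bytes
    return packet[RTP_HEADER_LEN + 4 * csrc_count:]


def ulaw_byte_to_linear(value: int) -> int:
    """Expand one mu-law byte to a signed 16-bit linear sample."""
    value = ~value & 0xFF                  # mu-law bytes are stored inverted
    sign = value & 0x80
    exponent = (value >> 4) & 0x07
    mantissa = value & 0x0F
    magnitude = (((mantissa << 3) + ULAW_BIAS) << exponent) - ULAW_BIAS
    return -magnitude if sign else magnitude


def ulaw_to_pcm16(data: bytes) -> bytes:
    """Decode mu-law bytes to little-endian 16-bit PCM."""
    samples = [ulaw_byte_to_linear(b) for b in data]
    return struct.pack("<%dh" % len(samples), *samples)
```

Each G.711 byte becomes one 16-bit sample, so the decoded PCM is twice the size of the RTP payload.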

Tagging

The Recorder tags each call with its completion status: either properly recorded or forcefully terminated.

Calls tagged “Force Terminated” were not properly recorded due to a network glitch or some other cause. The recorded file for such a call may contain the complete recording, a partial recording, or no recording.

Metadata

Expertflow Recorder is connected to a MySQL database, where it stores the required metadata.

We store information about correlated calls along with the required parameters.

Mixer

Expertflow Mixer mixes the individually recorded call-leg files into a single session file, based on the correlation information provided by the Recorder. After merging the relevant files into a single file, the Mixer can convert it to a .wav file, depending on the configuration.
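One simple way to mix two decoded legs is to add their samples and write the result as a WAV file. The sketch below assumes both legs are 16-bit little-endian mono PCM at 8000 Hz (the G.711 sample rate) and mixes by summing with clipping; whether the Mixer sums the legs or keeps them as separate channels is not stated here, and the function names are hypothetical.

```python
import struct
import wave


def mix_pcm16(leg_a: bytes, leg_b: bytes) -> bytes:
    """Sum two 16-bit little-endian PCM streams, padding the shorter one."""
    def unpack(data: bytes) -> list:
        count = len(data) // 2
        return list(struct.unpack("<%dh" % count, data[:count * 2]))

    a, b = unpack(leg_a), unpack(leg_b)
    n = max(len(a), len(b))
    a += [0] * (n - len(a))    # pad with silence
    b += [0] * (n - len(b))
    # Clip the sum to the signed 16-bit range to avoid wrap-around.
    mixed = [max(-32768, min(32767, x + y)) for x, y in zip(a, b)]
    return struct.pack("<%dh" % n, *mixed)


def write_wav(target, pcm: bytes, sample_rate: int = 8000) -> None:
    """Write mono 16-bit PCM to a path or a writable binary file object."""
    with wave.open(target, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
```

Calling `write_wav("session.wav", mix_pcm16(agent_pcm, customer_pcm))` would then produce a single playable file for the session, where `agent_pcm` and `customer_pcm` stand for the two decoded legs.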

APIs

The APIs component provides RESTful endpoints that let any third-party application fetch a list of recordings and download or play individual recording files.

Front-end

A UI to search, play, and download recordings. The front-end fetches recordings from the database via the APIs component.


 Component level network diagram


Recording Flow

The recording solution supports Built-in Bridge recording (“BIB recording”), where the recording streams are forked from an agent IP phone to the EF-Recorder. The agent's voice and the customer's voice are sent separately, i.e. stored as separate call legs, and then mixed by EF-Recorder.

EF-Recorder will be configured in CUCM as a SIP trunk device in order to receive calls and recording streams.

Recording is driven from CUCM using SIP. The Recording Solution acts as a SIP server for CUCM and captures every SIP event generated. Based on those events, the recording is done over RTP.

Voice Recording Solution has three main components:

  • Recorder
  • Mixer
  • REST APIs

The Recorder keeps every detail of a call in in-memory data structures, which makes it easy to correlate calls at runtime. A mapping stores call sessions, call legs, and the complete call, so that at the end of any call we have a complete correlated call object containing all of its sessions and legs.

Each call is identified by the Xrefci ID obtained from the SIP packets. In-memory storage keeps the whole process fast.

The Mixer mixes the individually recorded call-leg files into a single session file, based on the correlation information provided by the Recorder.

After merging the relevant files into a single file, the Mixer can convert it to a .wav file, depending on the configuration.

REST APIs are used to fetch and play/download recordings from the database.