Solution Components and Architecture
Introduction
This document explains the technical details of the Expertflow Voice Recording Solution (VRS). It covers the terminology and concepts behind the solution.
Terminology
The following terms are used in this document:
SIP Message | The type of message through which CUCM communicates with VRS. It is either a request or a response, and it may or may not carry content.
Call | A call aggregates all of its sessions. A single call may have several sessions because of hold/resume or transfer/conference scenarios.
Session | A session is a single recording that joins the voice streams of all participants.
Session Leg | A session leg is the voice stream of one participant in the session. A session has at least two session legs.
Calling Number | The end user/party who initiated the SIP call.
Called Number | The end user/party who received the incoming call.
| A flag indicating that the recording for this call is corrupted or incomplete. The recording may be empty or only partially captured.
How we record
Recording is done with the Built-in Bridge (“BIB”), an approach that uses the conference bridge available on almost all Cisco IP phone types.
The recording streams are forked from an agent's IP phone to an open-source, carrier-grade telephony platform known as FreeSWITCH. The agent's and the customer's voices are sent and stored as separate call legs. FreeSWITCH scripts then mix the separate legs and save them as a single audio file.
The recording is initiated by CUCM using SIP commands. FreeSWITCH is configured on CUCM as a SIP trunk device/SIP server so that it can receive calls and recording streams. It captures every SIP event and, based on those events, records the audio carried over RTP.
Recording Flow
The recording solution supports Built-in Bridge recording (“BIB recording”), in which the recording streams are forked from an agent's IP phone to the recorder. The agent's voice and the customer's voice are sent separately, i.e. stored as separate call legs, and then mixed by the recorder.
The recorder is configured in CUCM as a SIP trunk device in order to receive calls and recording streams.
Recording is initiated from CUCM using SIP. The recording solution acts as a SIP server for CUCM and captures every SIP event that CUCM generates. Based on those events, the audio is recorded over RTP.
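As an illustration of the SIP handling described above, the following is a minimal sketch of how an incoming SIP message could be classified as a request or a response, and as carrying content or not. It is not the recorder's actual code: the SipMessage class and the parse_sip_message function are hypothetical names, and the sketch relies only on the standard SIP start-line layout (a response starts with "SIP/2.0", a request starts with the method name).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SipMessage:
    kind: str               # "request" or "response"
    method: Optional[str]   # e.g. "INVITE" or "BYE" (requests only)
    status: Optional[int]   # e.g. 200 (responses only)
    has_content: bool       # True when a body follows the headers


def parse_sip_message(raw: str) -> SipMessage:
    """Classify a raw SIP message (hypothetical helper, illustration only)."""
    # Headers and body are separated by an empty line (CRLF CRLF).
    head, _, body = raw.partition("\r\n\r\n")
    start_line = head.split("\r\n", 1)[0]
    parts = start_line.split(" ", 2)
    has_content = bool(body.strip())
    if start_line.startswith("SIP/2.0 "):
        # Status line: SIP/2.0 <status-code> <reason-phrase>
        return SipMessage("response", None, int(parts[1]), has_content)
    # Request line: <METHOD> <Request-URI> SIP/2.0
    return SipMessage("request", parts[0], None, has_content)


# Example messages (addresses are made up for the illustration).
invite = (
    "INVITE sip:recorder@10.0.0.5 SIP/2.0\r\n"
    "Content-Type: application/sdp\r\n"
    "\r\n"
    "v=0\r\n"
)
ok = "SIP/2.0 200 OK\r\nContent-Length: 0\r\n\r\n"

parsed_invite = parse_sip_message(invite)
parsed_ok = parse_sip_message(ok)
```

A recorder built this way could start or stop capturing the RTP stream depending on the method or status it sees, for example on INVITE and BYE.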
The Voice Recording Solution has three main components:
- Recorder
- Mixer
- REST APIs
The recorder keeps every detail of a call in in-memory data structures, which makes it easy to correlate calls at runtime. A mapping stores call sessions, call legs, and the complete call, so that at the end of any call there is a fully correlated call object containing all of its sessions and legs.
Each call is identified by the Xrefci Id taken from the SIP packets. In-memory storage keeps the whole process fast.
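The in-memory correlation can be pictured with a small sketch. This is an assumption-laden illustration, not the recorder's implementation: the class names (SessionLeg, Session, Call, CallRegistry), the file names, and the way sessions are keyed are all hypothetical. The only detail taken from the text is that a call is looked up by the identifier extracted from the SIP packets.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SessionLeg:
    participant: str   # e.g. "agent" or "customer"
    audio_file: str    # where this leg's stream was written


@dataclass
class Session:
    session_id: str
    legs: List[SessionLeg] = field(default_factory=list)


@dataclass
class Call:
    call_id: str       # identifier taken from the SIP packets
    sessions: Dict[str, Session] = field(default_factory=dict)


class CallRegistry:
    """Hypothetical in-memory map from call identifier to call object."""

    def __init__(self) -> None:
        self._calls: Dict[str, Call] = {}

    def add_leg(self, call_id: str, session_id: str, leg: SessionLeg) -> Call:
        # Create the call and session on first sight, then attach the leg.
        call = self._calls.setdefault(call_id, Call(call_id))
        session = call.sessions.setdefault(session_id, Session(session_id))
        session.legs.append(leg)
        return call

    def finish(self, call_id: str) -> Optional[Call]:
        # Remove and return the fully correlated call when it ends.
        return self._calls.pop(call_id, None)


registry = CallRegistry()
registry.add_leg("call-1", "s1", SessionLeg("agent", "call-1_s1_agent.raw"))
registry.add_leg("call-1", "s1", SessionLeg("customer", "call-1_s1_customer.raw"))
registry.add_leg("call-1", "s2", SessionLeg("agent", "call-1_s2_agent.raw"))
finished = registry.finish("call-1")
```

Because lookups are plain dictionary accesses, attaching a leg to its call takes constant time on average, which fits the goal of correlating calls while they are still in progress.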
The mixer mixes the individually recorded call-leg files into a single session file, using the correlation information provided by the recorder.
After merging the relevant files into a single file, the mixer can convert it into a .wav file, depending on the configuration.
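To make the mixing step concrete, here is a minimal sketch of combining two legs into one mono stream and writing it as a .wav file. It is an illustration only and does not reflect how the mixer is actually implemented: it assumes each leg is already available as a list of signed 16-bit PCM samples at the same sample rate, and the function names and the 8000 Hz default are assumptions.

```python
import struct
import wave
from itertools import zip_longest
from typing import List


def mix_samples(leg_a: List[int], leg_b: List[int]) -> List[int]:
    """Sum two legs sample by sample, clipping to the 16-bit range."""
    mixed = []
    # The shorter leg is padded with silence (0) so no audio is dropped.
    for a, b in zip_longest(leg_a, leg_b, fillvalue=0):
        mixed.append(max(-32768, min(32767, a + b)))
    return mixed


def write_wav(path: str, samples: List[int], sample_rate: int = 8000) -> None:
    """Write mono 16-bit PCM samples to a .wav file."""
    with wave.open(path, "wb") as out:
        out.setnchannels(1)        # single mixed channel
        out.setsampwidth(2)        # 2 bytes = 16-bit samples
        out.setframerate(sample_rate)
        # WAV stores PCM samples as little-endian signed integers.
        out.writeframes(struct.pack(f"<{len(samples)}h", *samples))


agent_leg = [1000, 30000, -30000]
customer_leg = [2000, 10000]
session_audio = mix_samples(agent_leg, customer_leg)
```

Calling write_wav("session.wav", session_audio) would then produce the single session file. Clipping is the simplest way to keep the sum within range; a production mixer might scale the legs instead to avoid distortion.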
REST APIs are used to fetch and play/download recordings from the database.