Core Concepts: Voice


Flex uses the Twilio JavaScript Voice SDK to manage calls in the browser through the Flex UI. The SDK requires a web browser that supports Web Real-Time Communication (WebRTC) and an internet connection. To understand Twilio's requirements for network connectivity, see Voice Client JS and Mobile SDKs' Network Connectivity Requirement.

When you create a new Flex project in the Twilio Console, Twilio automatically provisions a number that accepts incoming calls and SMS. Flex supports both inbound and outbound voice out of the box.

A Flex call can have one or more call legs, where a call leg is the connection between a device and Twilio. For example, in a bridged call scenario, a call can have two legs with Twilio: one from the customer to Twilio and one from Twilio to an agent. The Call Logs page of your Flex project displays each call leg individually.


Inbound Calls

Inbound calls (or incoming calls) are calls made by customers to your contact center. On a successful connection, an inbound call is routed as an incoming call request to an available agent.

To test the default inbound call experience for Flex:

  1. Log in and make yourself available as an agent.
  2. Place a call to the number that came with your Flex instance.

To customize inbound voice, you can set up a scalable Interactive Voice Response (IVR) system.

Interactive Voice Response (IVR) Apps

IVR is an automated telephony system that interacts with customers through voice and touch-tone keypad selections (DTMF tones). It is also commonly known as a phone tree or a phone menu.

You can build IVR systems for your contact center that gather richer call context before calls are routed to your agents. For example, you can prompt your customers to supply general and account-based information to reduce delays and eliminate the need for call transfers. To get started, see the Build an IVR visually with Studio and Set up an IVR using Twilio Studio tutorials.
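As a rough illustration of what such a menu can look like outside Studio, the sketch below builds a TwiML document with the standard library only. It uses the TwiML <Gather> and <Say> verbs to collect one DTMF digit. The action path /ivr/route, the menu wording, and the digit-to-department mapping are hypothetical and only for illustration.

```python
import xml.etree.ElementTree as ET


def build_ivr_menu(action_url: str) -> str:
    """Return TwiML that reads a menu and collects a single DTMF digit."""
    response = ET.Element("Response")
    gather = ET.SubElement(
        response,
        "Gather",
        {"input": "dtmf", "numDigits": "1", "action": action_url, "method": "POST"},
    )
    prompt = ET.SubElement(gather, "Say")
    prompt.text = "For billing, press 1. For support, press 2."
    # Reached only if the caller enters nothing before <Gather> times out.
    fallback = ET.SubElement(response, "Say")
    fallback.text = "We did not receive any input. Goodbye."
    return ET.tostring(response, encoding="unicode")


# Hypothetical mapping from the pressed digit to context passed along with the call.
DEPARTMENTS = {"1": "billing", "2": "support"}


def department_for(digits: str) -> str:
    """Translate the collected digit into a department, defaulting to 'general'."""
    return DEPARTMENTS.get(digits, "general")


print(build_ivr_menu("/ivr/route"))
```

The handler behind the action URL would receive the pressed digit and could attach the resulting department to the call before it reaches an agent. How that context is attached depends on your routing setup and is not shown here.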


Outbound Calls

Outbound calls (or outgoing calls) are calls made by agents to your customers.

With the release of the native Dialpad in Flex UI v1.18.0, agents can initiate outbound calls or transfer an ongoing call to another agent or a supervisor. You can also leverage the StartOutboundCall action (provided by the Actions Framework) to implement use cases like click-to-dial and preview dialers. To start using the native Dialpad, follow the steps in Enabling the Flex Dialpad.


Transfers (Warm and Cold)

A warm transfer involves consulting with the agent receiving the call transfer before the initiating agent can wrap up the call on their end. The Warm Transfer or Consult button is represented as a phone icon in the Flex UI. To learn more about a warm call transfer flow, see Call Control Concepts: Warm Transfer.

A cold transfer does not involve communicating with the receiving agent. When an agent initiates a cold transfer (represented as a right arrow icon), the voice task autocompletes for the agent transferring the call. To learn more about a cold call transfer flow, see Call Control Concepts: Cold Transfer.

For both warm and cold transfers, an agent can transfer the call either to another agent or to a Task Queue.

The native Flex Dialpad supports both warm and cold transfers on outbound calls.


Conference

Voice conferencing lets two or more people in different locations join a single voice call through technology such as a conference bridge. Twilio's Voice Conference lets you manage multi-party calls with 2 to 250 participants. Voice Conferences can be used for standard multi-party audio bridges, inbound contact centers, or outbound dialers. To learn more about the lifecycle of a conference and how to manage conference participants, see Voice Conference.
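The participant range above can be made concrete with a small sketch that builds <Dial><Conference> TwiML and rejects a size outside 2 to 250. It assumes the maxParticipants attribute of the TwiML <Conference> noun. The room name is hypothetical, and Flex creates its own conferences for voice tasks, so this only illustrates the underlying building block.

```python
import xml.etree.ElementTree as ET

MIN_PARTICIPANTS = 2
MAX_PARTICIPANTS = 250


def build_conference_twiml(room_name: str, max_participants: int = MAX_PARTICIPANTS) -> str:
    """Return TwiML that dials the caller into a named conference room."""
    if not MIN_PARTICIPANTS <= max_participants <= MAX_PARTICIPANTS:
        raise ValueError(
            f"max_participants must be between {MIN_PARTICIPANTS} and {MAX_PARTICIPANTS}"
        )
    response = ET.Element("Response")
    dial = ET.SubElement(response, "Dial")
    conference = ET.SubElement(
        dial, "Conference", {"maxParticipants": str(max_participants)}
    )
    conference.text = room_name
    return ET.tostring(response, encoding="unicode")


# "support-room" is a made-up room name.
print(build_conference_twiml("support-room", max_participants=3))
```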


Hold

In telephony, a call may be placed on hold when an agent needs to review additional details, transfer the call, or consult with a supervisor. When a call is on hold, the connection is not terminated, but no verbal communication is possible until the call is removed from hold by the same or another extension on the key phone system. In the Flex UI, the Hold button is represented by a pause icon. To remove a contact from hold status, click Hold again. Under the hood, Flex uses the hold property of the Voice API Participant resource to set the hold status of a call participant.

Twilio Flex allows you to change the hold music or record a message for the caller while a call is on hold. If you're on a paid Flex plan, you can review your contact center's Average Hold Time and other built-in metrics with Flex Insights.
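To make the Participant update more tangible, here is a minimal sketch of the REST request that places a conference participant on hold with custom hold audio. It only builds the URL and form body and sends nothing. It assumes the Hold and HoldUrl parameters of the Conference Participant resource. The SIDs and the hold-music URL are placeholders, and Flex issues the equivalent update for you when an agent clicks Hold.

```python
from typing import Optional, Tuple
from urllib.parse import urlencode

API_BASE = "https://api.twilio.com/2010-04-01"


def build_hold_request(
    account_sid: str,
    conference_sid: str,
    call_sid: str,
    hold: bool,
    hold_url: Optional[str] = None,
) -> Tuple[str, str]:
    """Return (url, form_body) for a POST that updates a participant's hold state."""
    url = (
        f"{API_BASE}/Accounts/{account_sid}"
        f"/Conferences/{conference_sid}/Participants/{call_sid}.json"
    )
    params = {"Hold": "true" if hold else "false"}
    # Custom hold audio only makes sense while placing the participant on hold.
    if hold and hold_url:
        params["HoldUrl"] = hold_url
    return url, urlencode(params)


# Placeholder identifiers; real SIDs come from your account and the live call.
url, body = build_hold_request(
    "ACxxxx", "CFxxxx", "CAxxxx", hold=True,
    hold_url="https://example.com/hold-music.xml",
)
print(url)
print(body)
```

Sending the same request with hold=False removes the participant from hold.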


Mute

In the Flex UI, agents can mute or unmute themselves on a call by clicking Mute (or the microphone icon). Muting and unmuting affect the muted property of the Voice API Participant resource.
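The toggle behavior can be sketched as a tiny helper that returns the form field for the Participant update. It assumes the Muted parameter of the Conference Participant resource. The helper name is hypothetical, and the request itself is not sent.

```python
def mute_toggle_params(currently_muted: bool) -> dict:
    """Return the form field that flips a participant's mute state."""
    return {"Muted": "false" if currently_muted else "true"}


print(mute_toggle_params(currently_muted=False))  # prints {'Muted': 'true'}
```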


Limits

  • Each queue is limited to 100 voice tasks or calls by default. To change the default value, see Update the Call Queue Limit for Flex.
  • Calls and conferences have a four-hour limit.

Call Recording

Recording voice calls is a must-have feature for many contact centers, whether for keeping an audit trail for legal reasons, resolving customer complaints, or helping supervisors coach agents.

The preferred recording mode is dual-channel recording, which means that each party's audio is recorded onto a separate track (typically one for the customer and one for the agent). Follow Enabling Dual-Channel Recordings to enable dual-channel recordings via Studio or custom code. Currently, this is only available for inbound calls.
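For the custom-code route, one possible shape of the request is sketched below. It starts a recording on an in-progress call through the Call Recordings subresource with RecordingChannels set to dual. This is an assumption about how custom code might do it, not a reproduction of the Enabling Dual-Channel Recordings guide. The SIDs and callback URL are placeholders, and the request is only built, not sent.

```python
from typing import Optional, Tuple
from urllib.parse import urlencode

API_BASE = "https://api.twilio.com/2010-04-01"
VALID_CHANNELS = ("mono", "dual")


def build_recording_request(
    account_sid: str,
    call_sid: str,
    channels: str = "dual",
    status_callback: Optional[str] = None,
) -> Tuple[str, str]:
    """Return (url, form_body) for a POST that starts recording a live call."""
    if channels not in VALID_CHANNELS:
        raise ValueError(f"channels must be one of {VALID_CHANNELS}")
    url = f"{API_BASE}/Accounts/{account_sid}/Calls/{call_sid}/Recordings.json"
    params = {"RecordingChannels": channels}
    if status_callback:
        # Optional webhook notified when the recording status changes.
        params["RecordingStatusCallback"] = status_callback
    return url, urlencode(params)


# Placeholder SIDs for illustration only.
url, body = build_recording_request("ACxxxx", "CAxxxx")
print(url)
print(body)  # prints RecordingChannels=dual
```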

The alternative is single-channel recording, where the audio from all participants is mixed into one track. If single-channel recording is sufficient for your use case, you can enable it with a single click on the Flex Settings page in the Console. Note that some Flex Insights features are not available for single-channel recordings (for example, cross-talk analysis and the graphical display of the audio timeline separated per participant).


Voice Insights

Voice Insights lets you dive into the data behind your calls. It provides call quality analytics and aggregation tools for drilling into calls made within the last 30 days. To review the aggregate dashboard and call summary for your Flex instance, visit the Voice Insights page in the Twilio Console. For high-precision call metrics, event streams, and programmatic access, you need to enable the Voice Insights Advanced Features. See the documentation for more details on the advanced features.


Agent-Assisted <Pay>

Twilio <Pay> enables agents to securely capture caller payment information in a PCI-compliant manner during a voice conversation using the Agent-Assisted <Pay> API. When using the Agent-Assisted <Pay> feature within Flex, agents control the payment flow and guide callers by requesting one piece of payment information at a time (e.g., payment card number, expiration date, security code). Agents can continue to converse with callers but will not hear their DTMF input, ensuring the security of the payment information. Once all the payment information is collected, agents complete the <Pay> session and Twilio securely transmits the collected information to the payment processor via your configured <Pay> connector. To get started, see the <Pay> Connector and the Agent-Assisted <Pay> APIs documentation.
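The one-field-at-a-time flow can be sketched as an ordered list of update payloads for a payment session. The sketch assumes the Capture, Status, IdempotencyKey, and StatusCallback parameters of the Payments subresource as described in the Agent-Assisted <Pay> API documentation. The key prefix and callback URL are hypothetical, and nothing is sent to Twilio.

```python
from typing import Dict, List

# Order in which the agent asks the caller for each piece of information.
CAPTURE_STEPS = ("payment-card-number", "expiration-date", "security-code")


def payment_session_updates(key_prefix: str, status_callback: str) -> List[Dict[str, str]]:
    """Return the form payloads for each capture step, then the completing update."""
    updates = []
    for step_number, field in enumerate(CAPTURE_STEPS, start=1):
        updates.append(
            {
                "Capture": field,
                # A distinct key per request lets a retry be recognized as a repeat.
                "IdempotencyKey": f"{key_prefix}-{step_number}",
                "StatusCallback": status_callback,
            }
        )
    updates.append(
        {
            "Status": "complete",
            "IdempotencyKey": f"{key_prefix}-{len(CAPTURE_STEPS) + 1}",
            "StatusCallback": status_callback,
        }
    )
    return updates


# Hypothetical prefix and callback URL.
for payload in payment_session_updates("order-42", "https://example.com/pay-status"):
    print(payload)
```

Each payload would be posted to the active payment session in turn, with the agent waiting for the caller's keypad input before moving to the next step.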


Media Streams

Media Streams gives access to the raw audio stream of your Programmable Voice calls by forking the audio stream in real time and sending it to a destination of your choosing using WebSockets. This enables use cases such as real-time transcriptions, sentiment analysis, voice authentication, and more. Raw audio can also be streamed into Twilio Voice calls from another application, enabling use cases such as conversational IVR or integrations with a regional provider for custom Text-to-Speech. See the Media Streams Overview page to learn more.
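On the receiving end, a WebSocket server gets JSON text messages. The sketch below handles one such message and collects the decoded audio bytes. It assumes the documented message shape, in which an event field names the message type and media messages carry base64-encoded audio in media.payload. The WebSocket server itself is omitted, and the sample payload is made up.

```python
import base64
import json


def handle_stream_message(raw: str, audio: bytearray) -> str:
    """Parse one Media Streams message and append any audio to the buffer.

    Returns the event name so the caller can react to 'start' or 'stop'.
    """
    message = json.loads(raw)
    event = message.get("event", "")
    if event == "media":
        audio.extend(base64.b64decode(message["media"]["payload"]))
    return event


# Made-up sample messages standing in for frames received over the socket.
buffer = bytearray()
sample = json.dumps(
    {"event": "media", "media": {"payload": base64.b64encode(b"\x7f\xff").decode("ascii")}}
)
for frame in ('{"event": "start"}', sample, '{"event": "stop"}'):
    print(handle_stream_message(frame, buffer), len(buffer))
```

The accumulated bytes could then be passed to a transcription or analysis service. Decoding the audio format is outside the scope of this sketch.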

