Voice over BLE

There is no standard way of transmitting voice over BLE, so a custom profile must be used. TI offers two BLE mechanisms for transferring voice frames:

  • TI Voice Profile (VoGP): a custom TI GATT profile implementation in the BLE-Stack that transmits voice frames.
  • Voice over HID (Human Interface Device) over GATT Profile (VoHoGP): a HID over GATT Profile (HoGP) implementation in the BLE-Stack that transmits voice frames via HID reports.

TI Voice Profile (VoGP)

The TI Voice Profile (audio_profile.[ch]) is found in the BLE-Stack component’s audio_profile folder.

The audio data is transmitted using a proprietary service with UUID F000B000-0451-4000-B000-000000000000. This service is composed of the following 2 characteristics:

Note

The characteristics below use the 128-bit TI base UUID of the format F000XXXX-0451-4000-B000-000000000000, where XXXX is their shortened 16-bit UUID. For brevity, this document refers to the characteristics by their 16-bit short UUID.

Name | UUID | Description | GATT Properties
AUDIOPROFILE_START | 0xB001 | Used to transmit a start command before streaming starts and a stop command as the last packet of a stream. | GATT_PROP_READ, GATT_PROP_NOTIFY
AUDIOPROFILE_AUDIO | 0xB002 | The audio stream characteristic. All audio frames are transmitted using this characteristic. | GATT_PROP_READ, GATT_PROP_NOTIFY
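The UUID expansion described in the note can be sketched in plain C. This is a minimal illustration, not code from the BLE-Stack: the function names ti_uuid_to_string and ti_uuid_to_le_bytes are hypothetical. The second helper assumes the usual GATT convention that 128-bit UUIDs are carried in little-endian byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format a 16-bit short UUID as the full TI base UUID
 * F000XXXX-0451-4000-B000-000000000000. out must hold at least 37 bytes.
 * Returns 0 on success, -1 if the buffer is too small. */
int ti_uuid_to_string(uint16_t short_uuid, char *out, size_t out_len)
{
    assert(out != NULL);
    int n = snprintf(out, out_len, "F000%04X-0451-4000-B000-000000000000",
                     (unsigned)short_uuid);
    return (n > 0 && (size_t)n < out_len) ? 0 : -1;
}

/* Hypothetical helper: produce the same UUID as 16 bytes in little-endian
 * order (least significant byte first). */
void ti_uuid_to_le_bytes(uint16_t short_uuid, uint8_t out[16])
{
    /* TI base UUID with the XXXX field zeroed, byte-reversed. */
    static const uint8_t base_le[16] = {
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xB0,
        0x00, 0x40, 0x51, 0x04, 0x00, 0x00, 0x00, 0xF0
    };
    assert(out != NULL);
    memcpy(out, base_le, sizeof base_le);
    out[12] = (uint8_t)(short_uuid & 0xFFu);  /* low byte of XXXX */
    out[13] = (uint8_t)(short_uuid >> 8);     /* high byte of XXXX */
}
```

For example, passing 0xB001 to ti_uuid_to_string gives F000B001-0451-4000-B000-000000000000, the full UUID of AUDIOPROFILE_START.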

GATT_Notification() was selected as the primary vehicle for transmitting voice data over BLE in the voice profile implementation.

Notifications were selected because they have low packet overhead and are asynchronous in nature. These qualities make notifications ideal for voice streaming applications. Before the voice stream begins, the (receiving) peer device must enable notifications by writing 01:00 to the CCCD of both AUDIOPROFILE_START and AUDIOPROFILE_AUDIO. If notifications are not enabled, the remote will not stream voice data.
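The value 01:00 is the 16-bit CCCD value 0x0001 (bit 0 set, notifications enabled) serialized in little-endian order. A minimal sketch of that encoding follows; the names CCCD_NOTIFY and cccd_encode are illustrative and are not taken from the BLE-Stack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CCCD_NOTIFY 0x0001u  /* bit 0 of the CCCD: notifications enabled */

/* Illustrative helper: serialize a CCCD value into the two bytes that a
 * GATT client writes to the descriptor, least significant byte first. */
void cccd_encode(uint16_t value, uint8_t out[2])
{
    assert(out != NULL);
    out[0] = (uint8_t)(value & 0xFFu);
    out[1] = (uint8_t)(value >> 8);
}
```

Encoding CCCD_NOTIFY yields the bytes 0x01, 0x00, which is the 01:00 written to each of the two characteristics.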

The basic flow of a voice transmission is:

  1. CC2640R2F sends a start command (0x04) notification (if enabled) on the AUDIOPROFILE_START characteristic.
  2. CC2640R2F starts streaming voice data. See Sequence diagram for Voice Transmission for more details.
  3. CC2640R2F sends a stop command (0x00) notification on the AUDIOPROFILE_START characteristic.

See the figure below for an illustration of voice transmission over BLE. See BLE Voice Frame Data for more information about the contents of the BLE voice frame.

@startuml
Receiver <- Transmitter: Advertisements
Receiver -> Transmitter: Connect Req
Receiver <-> Transmitter: Voice Service Discovery

Receiver -> Transmitter: Enable Notifications on AUDIOPROFILE_START Char
Receiver -> Transmitter: Enable Notifications on AUDIOPROFILE_AUDIO Char

...Wait until transmitter begins streaming...


Receiver <- Transmitter: GATT_Notification - AUDIOPROFILE_START - start command (0x04)

group Repeat For Each frame in voice stream


Receiver <- Transmitter: GATT_Notification - AUDIOPROFILE_AUDIO -  Metadata + voice data
Receiver <- Transmitter: GATT_Notification - AUDIOPROFILE_AUDIO -  voice data
Receiver <- Transmitter: GATT_Notification - AUDIOPROFILE_AUDIO -  voice data
Receiver <- Transmitter: GATT_Notification - AUDIOPROFILE_AUDIO -  voice data
Receiver <- Transmitter: GATT_Notification - AUDIOPROFILE_AUDIO -  voice data

end

Receiver <- Transmitter: GATT_Notification - AUDIOPROFILE_START - stop command (0x00)

@enduml

Figure 96. Sequence diagram for Voice Transmission
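From the receiver's point of view, the sequence above amounts to a small state machine. The sketch below is one possible way to structure it and is not part of the BLE-Stack; the type voice_rx_t and the voice_rx_* functions are hypothetical. It assumes that the command is the first byte of the AUDIOPROFILE_START notification, and that a frame is 100 bytes long (the default described in BLE Voice Frame Data).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VOICE_CMD_START 0x04u  /* start command from the flow above */
#define VOICE_CMD_STOP  0x00u  /* stop command from the flow above */
#define VOICE_FRAME_LEN 100u   /* assumption: default frame length */

typedef struct {
    bool     streaming;               /* true between start and stop */
    uint8_t  frame[VOICE_FRAME_LEN];  /* frame being reassembled */
    size_t   fill;                    /* bytes collected so far */
    unsigned frames_done;             /* complete frames received */
} voice_rx_t;

void voice_rx_init(voice_rx_t *rx)
{
    assert(rx != NULL);
    memset(rx, 0, sizeof *rx);
}

/* Call for each notification received on AUDIOPROFILE_START. */
void voice_rx_on_start_char(voice_rx_t *rx, const uint8_t *data, size_t len)
{
    assert(rx != NULL);
    if (data == NULL || len == 0) {
        return;
    }
    if (data[0] == VOICE_CMD_START) {
        rx->streaming = true;
        rx->fill = 0;
    } else if (data[0] == VOICE_CMD_STOP) {
        rx->streaming = false;
        rx->fill = 0;
    }
}

/* Call for each notification received on AUDIOPROFILE_AUDIO.
 * Returns true when rx->frame holds a complete frame. Data that arrives
 * outside a stream, or that would overflow the frame, is ignored. */
bool voice_rx_on_audio_char(voice_rx_t *rx, const uint8_t *data, size_t len)
{
    assert(rx != NULL);
    if (!rx->streaming || data == NULL || len == 0 ||
        len > VOICE_FRAME_LEN - rx->fill) {
        return false;
    }
    memcpy(&rx->frame[rx->fill], data, len);
    rx->fill += len;
    if (rx->fill < VOICE_FRAME_LEN) {
        return false;
    }
    rx->fill = 0;
    rx->frames_done++;
    return true;
}
```

With 20-byte notifications, the fifth call to voice_rx_on_audio_char after a start command returns true, at which point the caller can hand rx->frame to the decoder.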

Voice over HID over GATT Profile (VoHoGP)

The Voice over HID over GATT Profile HID service implementation (hidservice.[ch]) can be found under the BLE-Stack component of the optional Example Pack for the SimpleLink CC2640R2 SDK.

In contrast to VoGP, audio data is transmitted using encrypted Consumer Control HID reports instead of a custom, non-encrypted GATT profile. The advantage of using the adopted HID over GATT Profile (HoGP) is that operating systems and Bluetooth Low Energy software stacks generally support this profile natively, which eliminates the need to develop a custom GATT profile.

Tip

The main advantage of transporting data over HID is that modern operating systems typically support this profile natively, which simplifies application development because no custom GATT profile has to be developed.

  • BlueZ users on Linux do not need to recompile the kernel to support a custom profile.
  • Windows 8.1 or later also supports HoGP natively.

Applications using HoGP only need to collect voice frames from the HID reports generated by the operating system. A sample script collecting voice data from HID reports is hosted on TI’s SimpleLink GitHub page.

The HID reports used to transport voice frames follow a paradigm similar to that of the TI Voice Profile. The HID_RPT_ID_VOICE_START_IN report is used to indicate the start and stop of the voice data stream, whereas the voice data itself is sent via the HID_RPT_ID_VOICE_DATA_IN report.

Name | Report ID | Description
HID_RPT_ID_VOICE_START_IN | 0x0A (10) | This HID report is used to transmit a start command before the streaming starts and a stop command as the last packet of a stream.
HID_RPT_ID_VOICE_DATA_IN | 0x0B (11) | This HID report is used to transmit all voice data.

The basic flow of a voice transmission is:

  1. CC2640R2F sends a start command (0x04) on HID_RPT_ID_VOICE_START_IN.
  2. CC2640R2F starts streaming voice data on HID_RPT_ID_VOICE_DATA_IN. See Sequence diagram for Voice Transmission over HID for more details.
  3. CC2640R2F sends a stop command (0x00) on HID_RPT_ID_VOICE_START_IN.

The HID reports used to transport voice are described in the HID Service’s Report Map in the following manner:

Listing 86. Declaring Voice HID Report in the HID Service’s Report Map
0x05, 0x0C,        // Usage Page (Consumer Devices)
0x09, 0x01,        // Usage (Consumer Control)
0xA1, 0x01,        // Collection (Application)
0x85, 0x0A,        //   Report ID (10)
0x15, 0x00,        //   Logical Minimum (0)
0x26, 0xFF, 0x00,  //   Logical Maximum (255)
0x75, 0x08,        //   Report Size (8)
0x95, 0x05,        //   Report Count (5)
0x09, 0x01,        //   Usage (Consumer Control)
0x81, 0x00,        //   Input (Data,Array,Abs,No Wrap,Linear,Preferred State,No Null Position)
0x85, 0x0B,        //   Report ID (11)
0x15, 0x00,        //   Logical Minimum (0)
0x26, 0xFF, 0x00,  //   Logical Maximum (255)
0x75, 0x08,        //   Report Size (8)
0x95, 0x14,        //   Report Count (20)
0x09, 0x01,        //   Usage (Consumer Control)
0x81, 0x00,        //   Input (Data,Array,Abs,No Wrap,Linear,Preferred State,No Null Position)
0xC0               // End Collection
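To check what payload sizes the report map above declares, the descriptor can be walked item by item. The following is a minimal sketch under stated assumptions: hid_input_bits and MAX_REPORT_ID are hypothetical names, only HID short items are handled (long items are not), and only the Report ID, Report Size, Report Count and Input items are interpreted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_REPORT_ID 16  /* illustrative bound, enough for IDs 10 and 11 */

/* Walk the short items of a HID report map and accumulate, per Report ID,
 * the total size in bits of all Input fields. */
void hid_input_bits(const uint8_t *map, size_t len,
                    uint32_t bits[MAX_REPORT_ID])
{
    uint32_t report_size = 0;   /* current global Report Size (bits) */
    uint32_t report_count = 0;  /* current global Report Count */
    uint8_t  report_id = 0;     /* current global Report ID */
    size_t   i = 0;

    assert(map != NULL && bits != NULL);
    for (size_t k = 0; k < MAX_REPORT_ID; k++) {
        bits[k] = 0;
    }

    while (i < len) {
        uint8_t prefix = map[i++];
        size_t  n = prefix & 0x03u;  /* size code 0, 1, 2, or 3 */
        if (n == 3) {
            n = 4;                   /* size code 3 means 4 data bytes */
        }
        if (n > len - i) {
            break;                   /* truncated item: stop parsing */
        }

        uint32_t value = 0;          /* item data is little-endian */
        for (size_t b = 0; b < n; b++) {
            value |= (uint32_t)map[i + b] << (8u * b);
        }
        i += n;

        switch (prefix & 0xFCu) {    /* tag and type bits only */
        case 0x74: report_size = value; break;         /* Report Size */
        case 0x94: report_count = value; break;        /* Report Count */
        case 0x84: report_id = (uint8_t)value; break;  /* Report ID */
        case 0x80:                                     /* Input */
            if (report_id < MAX_REPORT_ID) {
                bits[report_id] += report_size * report_count;
            }
            break;
        default:
            break;                   /* other items are skipped */
        }
    }
}
```

Applied to Listing 86, this gives 40 bits (5 bytes) for report ID 10 and 160 bits (20 bytes) for report ID 11, so each voice data report carries the same 20 bytes as one notification in the TI Voice Profile.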

The voice stream data flow is similar to TI’s Voice Profile, with the exception that now the link between the transmitter and receiver is encrypted. The GATT client, as before, will enable GATT Notifications for all the HID IN reports. See the figure below for an illustration of voice transmission over BLE and see BLE Voice Frame Data for more information about the contents of the BLE voice frame.

@startuml
Receiver <- Transmitter: Advertisements
Receiver -> Transmitter: Connect Req
Receiver <-> Transmitter: (Re-)Establish SMP Pairing and Encryption
Receiver <-> Transmitter: Interrogating HID Service's Report Map

...A HoGP host will enable notifications for all HID IN Report Characteristics...

Receiver -> Transmitter: Enable Notifications on HID_RPT_ID_VOICE_START_IN
Receiver -> Transmitter: Enable Notifications on HID_RPT_ID_VOICE_DATA_IN

...Wait until transmitter begins streaming...


Receiver <- Transmitter: GATT_Notification on HID_RPT_ID_VOICE_START_IN - start command (0x04)

group Repeat For Each Voice Frame in voice stream

Receiver <- Transmitter: GATT_Notification - HID_RPT_ID_VOICE_DATA_IN - Metadata + voice data
Receiver <- Transmitter: GATT_Notification - HID_RPT_ID_VOICE_DATA_IN - voice data
Receiver <- Transmitter: GATT_Notification - HID_RPT_ID_VOICE_DATA_IN - voice data
Receiver <- Transmitter: GATT_Notification - HID_RPT_ID_VOICE_DATA_IN - voice data
Receiver <- Transmitter: GATT_Notification - HID_RPT_ID_VOICE_DATA_IN - voice data

end

Receiver <- Transmitter: GATT_Notification on HID_RPT_ID_VOICE_START_IN - stop command (0x00)

@enduml

Figure 97. Sequence diagram for Voice Transmission over HID

BLE Voice Frame Data

By default, the voice profile sends 20 bytes of application data per notification. It is therefore ideal to choose a PDM driver frame length that is a multiple of 20 bytes.

Recall from PDM Driver Metadata that each frame should also contain 4 bytes of metadata. There is a tradeoff between frame duration and overhead. A total frame length of 100 bytes, including the 4 bytes of metadata, was found to be the best compromise.

The numbered headers in the voice frame shown below are the metadata fields provided by the PDM driver. See PDM Driver Metadata for an explanation of the metadata fields.

../_images/Audio_packetformat.jpg

Figure 98. One audio packet.

../_images/aafig-b8f0328669d9938be8e2ead0794d3deef9ddca85.png

When transmitted over the air, the audio frames are fragmented into 20-byte notifications, so each audio frame is sent as 5 notifications:

../_images/Audio_packedIn5Notification.jpg

Figure 99. One audio frame sent over the air as 5 notifications.
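The fragmentation can be sketched as a simple loop over the frame. This is an illustration only: voice_frame_send, send_fn and count_bytes are hypothetical names, and the callback stands in for whatever call the application uses to queue one notification (the profile itself uses GATT_Notification()).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NOTIFY_PAYLOAD_LEN 20u  /* application bytes per notification */

/* Hypothetical callback that queues one notification payload.
 * Returns 0 on success, nonzero on failure. */
typedef int (*send_fn)(const uint8_t *payload, size_t len, void *ctx);

/* Split one audio frame into notification-sized chunks and pass each
 * chunk to send(). Returns the number of notifications sent, or -1 as
 * soon as send() reports a failure. */
int voice_frame_send(const uint8_t *frame, size_t frame_len,
                     send_fn send, void *ctx)
{
    int sent = 0;

    assert(frame != NULL && send != NULL);
    for (size_t off = 0; off < frame_len; off += NOTIFY_PAYLOAD_LEN) {
        size_t n = frame_len - off;
        if (n > NOTIFY_PAYLOAD_LEN) {
            n = NOTIFY_PAYLOAD_LEN;
        }
        if (send(&frame[off], n, ctx) != 0) {
            return -1;
        }
        sent++;
    }
    return sent;
}

/* Example sink for illustration: counts payload bytes instead of
 * transmitting them. ctx must point to a size_t. */
int count_bytes(const uint8_t *payload, size_t len, void *ctx)
{
    (void)payload;
    *(size_t *)ctx += len;
    return 0;
}
```

Calling voice_frame_send on a 100-byte frame with count_bytes as the sink returns 5 and accounts for all 100 bytes, matching the five notifications in the figure above.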

Modifying the Latency

The built-in flow control in the Bluetooth low energy protocol is used to ensure delivery of full audio frames during streaming.

Since the header of each audio frame contains the information required to decode that frame separately, the safest way to discard data, for example in a noisy environment, is to discard the full frame.

The PDM driver will drop full audio frames when there are no available buffers, and the application will handle one frame at a time until it has been successfully queued up in the TX FIFO within the BLE-Stack.

As mentioned above, the application must service a PDM buffer every 2 ms. If the application requires a longer contiguous chunk of processing time, or a marginal RF environment is causing many retry events, the number of PDM buffers can be adjusted by modifying MINIMUM_PDM_BUFFER_QUEUE_DEPTH.

Each increment of MINIMUM_PDM_BUFFER_QUEUE_DEPTH adds 2 ms of latency, which gives the application more time to process. The cost of the increased latency is increased RAM usage.

The user application should be profiled to find the optimal tradeoff between the expected RF conditions, RAM usage, and latency.

Throughput Requirement for BLE

The general required throughput for sending audio frames has been covered in Throughput Requirement. Here we cover the calculation of the required throughput when BLE-specific headers are taken into consideration.

Data | Calculation | Rate
L2CAP and ATT header | (7 * 5) B / 12 ms | 23.33 kbps
Complete packet overhead | (21 * 5) B / 12 ms | 70 kbps

From Throughput Requirement, we learned that the required throughput for audio frames is 66.67 kbps. After adding the overhead from the BLE headers, the required throughput is 70 + 66.67 = 136.67 kbps.
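The arithmetic behind these figures can be reproduced with a few lines of C. This is a sketch only: the macro and function names are illustrative, and the constants are the ones used in this section (a 100-byte frame every 12 ms, sent as 5 packets with 21 bytes of overhead each).

```c
#include <assert.h>

#define FRAME_BYTES            100.0  /* audio frame incl. 4 B metadata */
#define FRAME_PERIOD_MS        12.0   /* one frame every 12 ms */
#define PACKETS_PER_FRAME      5.0    /* 5 notifications per frame */
#define OVERHEAD_BYTES_PER_PKT 21.0   /* complete packet overhead */

/* Rate in kbps needed to move `bytes` every `period_ms` milliseconds.
 * Bits per millisecond is numerically equal to kilobits per second. */
double rate_kbps(double bytes, double period_ms)
{
    assert(period_ms > 0.0);
    return bytes * 8.0 / period_ms;
}

/* Audio payload rate, roughly 66.67 kbps. */
double audio_kbps(void)
{
    return rate_kbps(FRAME_BYTES, FRAME_PERIOD_MS);
}

/* Per-packet BLE overhead rate, 70 kbps. */
double overhead_kbps(void)
{
    return rate_kbps(OVERHEAD_BYTES_PER_PKT * PACKETS_PER_FRAME,
                     FRAME_PERIOD_MS);
}

/* Total over-the-air rate, roughly 136.67 kbps. */
double total_kbps(void)
{
    return audio_kbps() + overhead_kbps();
}
```

Changing FRAME_BYTES or PACKETS_PER_FRAME shows how a different frame length would shift the balance between payload and header overhead.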

The hid_adv_remote application will try to transmit as many of the available audio notifications as possible in every connection event. This means that the required throughput can be obtained with different connection interval settings, as long as enough packets can be transmitted in each connection event to reach the required rate of ~417 notifications per second.

1 notification = 20 B audio data = 160 bits audio data.

Required audio data throughput = 66670bps.

66670 / 160 ~= 417 notifications per second

Typically, a connection interval of 10 ms can be used, with 3-5 notifications transmitted every connection event.
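The same calculation can be written as a small helper that estimates how many notifications are needed per connection event for a given connection interval. This is a sketch with hypothetical function names. It expresses the connection interval in units of 1.25 ms, as BLE connection parameters do, and rounds up, so the per-event result is an average requirement rather than a guaranteed count.

```c
#include <assert.h>
#include <stdint.h>

/* Integer division rounded up; b must be nonzero. */
static uint64_t ceil_div(uint64_t a, uint64_t b)
{
    return (a + b - 1u) / b;
}

/* Notifications per second needed to carry audio_bps of audio data when
 * each notification holds payload_bytes of application data. */
uint32_t notifications_per_second(uint32_t audio_bps, uint32_t payload_bytes)
{
    assert(payload_bytes > 0);
    return (uint32_t)ceil_div(audio_bps, (uint64_t)payload_bytes * 8u);
}

/* Average number of notifications that must be sent in each connection
 * event, rounded up. conn_interval_units is in units of 1.25 ms, so a
 * 10 ms connection interval is 8 units. */
uint32_t notifications_per_event(uint32_t per_second,
                                 uint32_t conn_interval_units)
{
    uint64_t interval_us = (uint64_t)conn_interval_units * 1250u;
    return (uint32_t)ceil_div((uint64_t)per_second * interval_us, 1000000u);
}
```

With 66670 bps of audio data and 20-byte notifications, notifications_per_second returns 417. For a 10 ms connection interval (8 units), notifications_per_event returns 5, the upper end of the 3-5 range mentioned above.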