Speech Recognizer Project 0_80_00_00
Release Notes
Document Revision: 012216.1228
The Speech Recognizer Project provides a low-resource speech recognition library called MinHMM.
The details of the MinHMM Library are provided in the documentation.
This document is divided into the following sections:
What's New
0_80_00_00
- This is the initial public release of the Speech Recognizer Project.
Revision History
No revision history. This is the first release of the Speech Recognizer Project.
Standalone Installation
- Download the SpeechRecognizer software
- Run the appropriate platform-specific installer: SpeechRecognizer-<version>-<platform>-installer
- Follow the installer directions
- In the installation directory:
  - Consult the MinHMM User's Guide for instructions and important usage information
  - Consult the MinHMM API Guide for the MinHMM API description
  - In the examples directory, the MinHMMDemo program illustrates how to use MinHMM
    - The program includes header files that should be used to allocate memory that MinHMM needs
    - Example CCS and IAR projects are included
Device Support
The following families of devices are supported in this release of the Speech Recognizer Project:
The MinHMM library and MinHMMDemo program have been built for the toolchains in the following chart. Where a toolchain provides debugging support, the library has been tested with the specified debugger. Although tested with a specific debugger, the code examples and the MinHMM library will work with a variety of debuggers.
Resources and Specifications
This section provides estimates of the resources needed for operation of MinHMM.
Memory Resources
The MinHMM library code size is about 23kB. The memory required to store model data depends on the maximum model size required. This release uses one flash sector of 4kB per model. MinHMM requires approximately 3kB of RAM that must persist as long as MinHMM is being used. During model training and recognition search, additional temporary RAM is required for computation. Creating (enrolling) a model requires 4kB of temporary RAM. Updating a model requires 20kB of temporary RAM. Recognition search requires 250 bytes of temporary RAM per model.
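The figures above can be combined into a rough memory budget. The following is a minimal host-side sketch, not part of the MinHMM library: the function and constant names are hypothetical, it assumes 1kB = 1024 bytes, that the library code resides in flash alongside the model sectors, and that only one temporary operation (enroll, update, or search) runs at a time.

```python
# Rough MinHMM memory budget from the figures quoted in these release notes.
# All names are hypothetical; 1 kB is assumed to mean 1024 bytes.
KB = 1024

CODE_FLASH = 23 * KB             # approximate library code size
MODEL_FLASH_SECTOR = 4 * KB      # one flash sector per model
PERSISTENT_RAM = 3 * KB          # must persist while MinHMM is in use
ENROLL_TEMP_RAM = 4 * KB         # temporary RAM to create (enroll) a model
UPDATE_TEMP_RAM = 20 * KB        # temporary RAM to update a model
SEARCH_TEMP_RAM_PER_MODEL = 250  # temporary bytes per model during search


def flash_bytes(num_models):
    """Estimated flash use: library code plus one sector per model."""
    return CODE_FLASH + num_models * MODEL_FLASH_SECTOR


def peak_ram_bytes(num_models, operation):
    """Estimated peak RAM: persistent RAM plus the operation's temporary RAM."""
    temp = {
        "enroll": ENROLL_TEMP_RAM,
        "update": UPDATE_TEMP_RAM,
        "search": num_models * SEARCH_TEMP_RAM_PER_MODEL,
    }[operation]
    return PERSISTENT_RAM + temp


if __name__ == "__main__":
    for op in ("enroll", "update", "search"):
        print(op, peak_ram_bytes(5, op), "bytes of RAM")
    print("flash", flash_bytes(5), "bytes")
```

Under these assumptions, model updating dominates the RAM budget, so an application that never updates models in the field could plan for a much smaller peak.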
Processing Cycles
Processor cycle requirements are a complex function of the trained models, the input speech data and background noise, and the quality of the match between the trained data and the input speech. As a general rule, cycle usage during active recognition search decreases when there is less background noise and the input speech matches a model well. When there is no speech and the background noise is low or moderate and not changing appreciably, MinHMM assumes the signal is background noise and enters a background processing mode. When speech is present, or when other background noises are appreciable and varying, MinHMM enters the active recognition search mode and attempts to match the audio signal to a model, which causes higher cycle usage. The table below provides estimates of typical cycle usage measured while running the MinHMM example demo program on the MSP432P401R operating at a clock rate of 48MHz. Due to variations in models, background noise, audio quality, and other factors, the actual cycles an application requires may differ from these estimates.
Condition | Average MCPS (% of Available Cycles) | Maximum MCPS (% of Available Cycles) |
Background Mode | 2 (4.3%) | 2 (4.3%) |
Active Search Mode (1 model) | 6.2 (13%) | 7.7 (16%) |
Active Search Mode (5 models) | 17.3 (36%) | 22.1 (46%) |
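The percentages in the table follow from dividing the MCPS figure (millions of cycles per second) by the 48MHz clock rate. A minimal sketch of that arithmetic, with hypothetical function names, is shown here. It can be reused to rescale the estimates for a different clock rate, assuming the cycle counts themselves do not change with the clock.

```python
# Convert an MCPS estimate into a share of the available processor cycles.
# Function names are hypothetical; 48 MHz is the clock rate used for the
# measurements in the table.
CLOCK_MHZ = 48.0


def percent_of_cycles(mcps, clock_mhz=CLOCK_MHZ):
    """Share of available cycles, in percent, used by a load of `mcps`."""
    return 100.0 * mcps / clock_mhz


def headroom_mcps(mcps, clock_mhz=CLOCK_MHZ):
    """Cycles (in MCPS) left for the rest of the application."""
    return clock_mhz - mcps


if __name__ == "__main__":
    for label, mcps in (("1 model, average", 6.2), ("5 models, maximum", 22.1)):
        print(f"{label}: {percent_of_cycles(mcps):.1f}% used, "
              f"{headroom_mcps(mcps):.1f} MCPS free")
```

The active-search rows of the table are reproduced to within rounding. The background row comes out slightly lower than the listed 4.3%, presumably because the 2 MCPS figure is itself rounded.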
Power Consumption
The MinHMM speech recognizer consumes arrays of sampled audio data. Typically the arrays consist of 160 samples collected at an 8kHz sample rate, so an array of data is provided to MinHMM every 20ms. The power needed to process these arrays depends on the cycles required to process them, which were discussed in the previous section. The total power consumption of an application also depends heavily on the hardware and application design. For lowest power operation, the design should follow low-power best practices. The hardware should use low-power components for audio collection, such as the microphone and pre-amplifier. The application should shut down peripherals when they are not being used and use the lowest clock rates necessary. Application software should use the low-power modes of the processor, such as LPM3, during idle times. Excluding the external hardware, estimates of processor current consumption for an application following these practices range from 250µA in background mode to 1.8mA in active search mode with five models active.
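The frame timing and the two current estimates above can be turned into a rough average-current figure. The sketch below is illustrative only: the names are hypothetical, and it assumes the processor spends a known fraction of its time in active search with five models and the remainder in background mode, with the average current being the time-weighted mix of the two estimates. Real applications will deviate from this simple model.

```python
# Rough average processor current from the estimates in these release notes.
# Names are hypothetical; the linear time-weighted mix is an assumption.
SAMPLES_PER_FRAME = 160
SAMPLE_RATE_HZ = 8000
BACKGROUND_UA = 250.0        # background mode estimate
ACTIVE_5_MODELS_UA = 1800.0  # active search with five models (1.8 mA)


def frame_period_ms():
    """Time between successive audio arrays passed to the recognizer."""
    return 1000.0 * SAMPLES_PER_FRAME / SAMPLE_RATE_HZ


def average_current_ua(active_fraction):
    """Time-weighted average current for a given share of active search."""
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must be between 0 and 1")
    return ((1.0 - active_fraction) * BACKGROUND_UA
            + active_fraction * ACTIVE_5_MODELS_UA)


if __name__ == "__main__":
    print("frame period:", frame_period_ms(), "ms")
    print("10% active:", average_current_ua(0.1), "uA")
```

For example, an application that is in active search 10% of the time would average roughly 405µA under this model, excluding external hardware.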
The current MinHMM example program demonstrates the use of the MinHMM speech recognizer, but for simplicity of illustration it was not designed for lowest power operation. For example, during idle times it uses LPM0 and the 48MHz clock remains running.
This release is an EA version.
This is build 0_80_00_00 of Speech Recognizer Project.
Additional Resources
For more information, visit www.ti.com.