  • Created: Rony Nandy, Kurt Taylor

  • Contributors: Rony Nandy

  • Packages affected: Speex releases of ports on Android and Ubuntu from Linaro

Summary

This specification covers the porting and optimization of some speech codecs on Linaro running on ARM Cortex-A9 with NEON. The optimization will primarily look into porting the codec to a multi-core system such as the Cortex-A9, which will involve architectural changes in the codec. Secondarily, assembly optimization in NEON will be undertaken.

Assumptions

  • GDB, perf and OProfile are available on Cortex-A9 (required for profiling; otherwise look for a workaround such as DS-5).

Design

  • Survey the speech/audio codecs available as open source and inspect their code to assess the state of NEON optimisation.

List of Codecs/Components

| Sl no | Codec/Component | Format | NEON Optimisation | Source Code | Remarks |
|---|---|---|---|---|---|
| 1 | MPEG1 Layer 1, 2, 3 | .mp1, .mp2, .mp3 | Y | libav/ffmpeg | Well optimised. |
| 2 | HEAACv2, HEAAC, AAC+, AAC | | Y | libav/ffmpeg | Well optimised. |
| 3 | Ogg-Vorbis, Tremor | .ogg | Y | Xiph.Org | Need to check; NEON was not available 2 years back. The Ogg Vorbis version in libav is well optimised with NEON. |
| 4 | Skype SILK | N.A. | Y | Skype | Following up with Skype. SDK released with SILK optimised in NEON by Skype. |
| 5 | Speex | .ogg | N | Xiph.Org | Consolidated code with NEON-optimized patches released by Linaro on Ubuntu and Android. |
| 6 | Opus Interactive Audio Codec | | TBD | Xiph.Org | Under study. |
| 7 | amr-wb | | | http://opencore-amr.git.sourceforge.net/git/gitweb-index.cgi | |

  • Do a quantitative analysis of the selected speech codecs by profiling, and engage with the upstream community to understand the state of each codec and to avoid duplication of work.
  • Check the feasibility of using OpenMP or OpenCL as well, focusing on tool support and the overheads of using these standards.
  • Have the basic selected speech codec running with a test setup on the board. This is required for benchmarking the received code and for checking the conformance of the codec with the standard. Ideally, code coverage should be measured with newly generated test vectors and a complete test setup in place.
  • Check the performance parameters and update them here.

Table 1: Performance Statistics

  • Document the basic design (code analysis and reading of the standard). Since this is a performance optimization specification, the design will not follow a traditional approach.
  • Codec profiled data, listing the functions with their % load and absolute load, is to be updated here.

Speex Performance Figures

| Stream Bit Rate (bit/sec) | Encoder (MCPS) | Decoder (MCPS) |
|---|---|---|
| 4000 | 42.48 | 2.16 |
| 6000 | 30.96 | 2.73 |
| 8000 | 33.84 | 2.88 |
| 11200 | 46.8 | 2.95 |
| 15000 | 38.16 | 3.024 |

Table 2: Codec Profiled Data

| Serial No | Function Name | % of Codec MCPS* | Absolute MCPS | Remarks |
|---|---|---|---|---|
| 1 | | | | |
| 2 | | | | |
| .. | | | | |

*Million Cycles per second. **Frames per second.

Implementation

Codec Block Diagram

C code Changes

  • Port to the target board and measure performance as shown in the performance measurement chart, Table 2. The measurement will read system timers before and after the decoder function call to do the MCPS calculation. The API should be measured in a single thread.
    1. No file I/O inside the codec
    2. Input is an encoded stream
    3. Output is interleaved PCM
    4. General algorithm of measurement

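   Test_Wrapper(args)
   {
       Time1 = Gettime();              /* timestamp before the codec call */
       Codec(args);                    /* decode one frame; no file I/O inside */
       Time2 = Gettime();              /* timestamp after the codec call */
       Time = Time2 - Time1;           /* time spent in this call */
       TotalTime = TotalTime + Time;   /* accumulate over the whole stream */
   }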
  • Calculate the total cycles from TotalTime. This method is to be used for codec performance measurement at the unit-testing level, where the stream is decoded in non-real time. Once the codec is integrated with the middleware, measurements will be taken with tools such as top and powertop to see the system load while decoding happens in real time. Those measurements correlate with the codec MCPS but are outside the scope of this development.
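
The measurement steps above could be sketched in C roughly as follows. This is a minimal sketch, not the project's test harness: it assumes a POSIX system with clock_gettime, and the names codec_fn, dummy_codec, test_wrapper and compute_mcps are hypothetical, introduced only for illustration. The CPU clock frequency and the duration of the audio are assumed to be supplied by the caller.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical signature standing in for the real codec entry point. */
typedef void (*codec_fn)(void *args);

/* Accumulated time spent inside the codec, in nanoseconds. */
static uint64_t total_time_ns = 0;

/* Hypothetical placeholder codec: only counts how often it was called. */
static void dummy_codec(void *args)
{
    int *calls = args;
    (*calls)++;
}

/* Monotonic timestamp in nanoseconds. */
static uint64_t get_time_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Times one codec call and adds the elapsed time to the running total. */
static void test_wrapper(codec_fn codec, void *args)
{
    uint64_t t1 = get_time_ns();
    codec(args);
    uint64_t t2 = get_time_ns();
    total_time_ns += t2 - t1;
}

/* MCPS: millions of CPU cycles spent in the codec per second of audio.
 * busy_ns       time accumulated inside the codec
 * cpu_hz        CPU clock frequency (assumed constant during the run)
 * audio_seconds duration of the audio that was processed */
static double compute_mcps(uint64_t busy_ns, double cpu_hz, double audio_seconds)
{
    assert(audio_seconds > 0.0);
    double busy_seconds = (double)busy_ns / 1e9;
    return busy_seconds * cpu_hz / audio_seconds / 1e6;
}
```

In use, test_wrapper would be called once per frame with the real codec function, and compute_mcps(total_time_ns, cpu_hz, audio_seconds) would be evaluated after the last frame. The result is only meaningful if the CPU frequency is fixed for the run, so frequency scaling would need to be disabled or accounted for.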

Processor specific code changes

  • Profile on the target using perf or another performance measurement tool and update Table 2.
  • Check for alignment issues and align data if required.
  • Check for dual issue in NEON and code using intrinsics if required.
  • Write NEON assembly for the functions that have been identified as hotspots.
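
As an illustration of coding with intrinsics, the sketch below vectorises a 16-bit dot product, the kind of multiply-accumulate loop that might show up as a hotspot in a speech codec. It is an assumption-laden example rather than code taken from Speex: the function names inner_prod_c and inner_prod_neon are hypothetical, the fixed-point scaling a real codec would apply is omitted, and the length is assumed to be a multiple of 4. On targets without NEON the function falls back to the plain C loop, so the two versions can be compared for bit-exactness.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* Plain C reference: sum of x[i] * y[i], accumulated in 32 bits. */
static int32_t inner_prod_c(const int16_t *x, const int16_t *y, size_t len)
{
    int32_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += (int32_t)x[i] * y[i];
    return sum;
}

/* NEON intrinsics version; len must be a multiple of 4.
 * Uses the C reference when NEON is not available. */
static int32_t inner_prod_neon(const int16_t *x, const int16_t *y, size_t len)
{
    assert(len % 4 == 0);
#if HAVE_NEON
    int32x4_t acc = vdupq_n_s32(0);
    for (size_t i = 0; i < len; i += 4) {
        int16x4_t vx = vld1_s16(x + i);   /* load 4 samples of x */
        int16x4_t vy = vld1_s16(y + i);   /* load 4 samples of y */
        acc = vmlal_s16(acc, vx, vy);     /* acc += vx * vy, widened to 32 bits */
    }
    /* Horizontal add of the four 32-bit lanes. */
    int32x2_t pair = vadd_s32(vget_low_s32(acc), vget_high_s32(acc));
    pair = vpadd_s32(pair, pair);
    return vget_lane_s32(pair, 0);
#else
    return inner_prod_c(x, y, len);
#endif
}
```

Keeping the C reference next to the intrinsics version makes it straightforward to check conformance after each optimisation step, and the same comparison can be reused if the function is later rewritten in NEON assembly.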

Test/Demo Plan

Run on the Orion board from the command line, with the decoded stream redirected to the audio driver for playback.

Unresolved issues

N.A.



WorkingGroups/Middleware/Multimedia/Specs/1111/CodecOptimization/Speech (last modified 2012-01-26 16:07:12)