Saturday, January 12, 2013

Simple streaming audio mixer for Android with OpenSL ES - part 1





Today I will write some notes on how to write a simple mixer for streaming audio. I will use OpenSL ES and pure C/C++ code (no Java needed). Building a simple sound mixer for Android covers two areas:
  • working with OpenSL ES: its objects, interfaces, ...,
  • creating the logic that builds buffers with data and sends them to the output.
 The code pieces are taken from my small cross-platform engine, and you can see / hear it in action in the games Deadly Abyss 2 and Mahjong Tris. The class SoundService is responsible for interaction with the sound device of the target platform. When an object of the class is created, all of the needed member variables ("m" prefixed) are cleared. I am showing the constructor here mainly because it lists the variables I will use later:

SoundService::SoundService()
{
 // engine
 mEngineObj = NULL;
 mEngine = NULL;
 // output
 mOutPutMixObj = NULL;
 // sound
 mSoundPlayerObj = NULL;
 mSoundPlayer = NULL;
 mSoundVolume = NULL;
 mSoundQueue = NULL;
 // sound mixer
 mActiveAudioOutSoundBuffer = NULL;
}

Initialization


 Initializing OpenSL ES takes quite a lot of code. OpenSL ES objects are first created without any resources being allocated. According to the OpenSL ES 1.1 Specification (see the khronos.org site), an object is: "an abstraction of a set of resources, assigned for a well-defined set of tasks, and the state of these resources." To allocate the resources, the object must be Realized.
 To access the features an object offers, you have to acquire an interface. Interfaces are defined as: "an abstraction of a set of related features that a certain object provides."
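
 The create / realize / get-interface pattern is easier to read once the calling convention is clear: an OpenSL ES handle is a pointer to a pointer to a table of function pointers, which is why every call looks like (*obj)->Method(obj, ...). Below is a minimal sketch of that convention only. Everything named Demo... is a hypothetical stand-in written for this illustration and is not part of the OpenSL ES API; the state handling is simplified.

```cpp
#include <cassert>
#include <stdint.h>

// Hypothetical stand-ins that only mimic the OpenSL ES calling convention.
enum DemoState : uint32_t { DEMO_UNREALIZED = 1, DEMO_REALIZED = 2 };
enum DemoResult : uint32_t { DEMO_SUCCESS = 0, DEMO_PRECONDITIONS_VIOLATED = 1 };

struct DemoObjectItf_;                               // table of function pointers
typedef const DemoObjectItf_* const* DemoObjectItf;  // handle: pointer to pointer to table

struct DemoObjectItf_
{
    // every method receives the handle itself as its first argument
    DemoResult (*Realize)(DemoObjectItf self, bool async);
    DemoResult (*GetState)(DemoObjectItf self, uint32_t* state);
};

// Concrete object. The table pointer is the first member, so the handle
// (the address of that member) can be converted back to the object.
struct DemoObject
{
    const DemoObjectItf_* table;
    uint32_t state;
};

static DemoObject* fromHandle(DemoObjectItf self)
{
    return reinterpret_cast<DemoObject*>(const_cast<const DemoObjectItf_**>(self));
}

static DemoResult demoRealize(DemoObjectItf self, bool /*async*/)
{
    DemoObject* obj = fromHandle(self);
    if (obj->state != DEMO_UNREALIZED)
        return DEMO_PRECONDITIONS_VIOLATED;
    obj->state = DEMO_REALIZED;  // a real implementation would allocate resources here
    return DEMO_SUCCESS;
}

static DemoResult demoGetState(DemoObjectItf self, uint32_t* state)
{
    assert(state != 0);
    *state = fromHandle(self)->state;
    return DEMO_SUCCESS;
}

static const DemoObjectItf_ kDemoTable = { demoRealize, demoGetState };

// "Create": the object exists afterwards, but it is not realized yet.
static inline DemoObjectItf demoCreate(DemoObject& storage)
{
    storage.table = &kDemoTable;
    storage.state = DEMO_UNREALIZED;
    return &storage.table;
}
```

 With a handle obtained from demoCreate(), a call is written exactly like in the real API: (*obj)->Realize(obj, false). The handle appears twice because the function table is shared, so the method has to be told which object it works on.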

 First we have to create the Engine object, which is the entry point into the OpenSL ES API:

s32 SoundService::start()
{
 LOGI("Starting sound service");

 SLresult result;

 // engine
 const SLuint32 engineMixIIDCount = 1;
 const SLInterfaceID engineMixIIDs[] = {SL_IID_ENGINE};
 const SLboolean engineMixReqs[] = {SL_BOOLEAN_TRUE};

 // create engine
 result = slCreateEngine(&mEngineObj, 0, NULL,
   engineMixIIDCount, engineMixIIDs, engineMixReqs);
 if (result != SL_RESULT_SUCCESS)
  goto ERROR;
 // realize
 result = (*mEngineObj)->Realize(mEngineObj, SL_BOOLEAN_FALSE);
 if (result != SL_RESULT_SUCCESS)
  goto ERROR;
 // get interfaces
 result = (*mEngineObj)->GetInterface(mEngineObj, SL_IID_ENGINE, &mEngine);
 if (result != SL_RESULT_SUCCESS)
  goto ERROR;

 With slCreateEngine we create the Engine object, which is returned in the first parameter. The next two parameters specify optional engine options (a count and an array). The last three parameters refer to the const values you can see in the code listing: the number of interfaces, which interfaces are requested, and whether each of them is required or optional. We request only one interface (SL_IID_ENGINE).
 At this point no resources are allocated yet. We have to Realize the object. The second parameter says whether realization should be asynchronous. We want synchronous realization.
 Now we can cache the interfaces. Here we have only one, and we store it in the mEngine variable (the last parameter of GetInterface is an output parameter).
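
 To illustrate how the three parallel arguments (count, interface IDs, required flags) work together, here is a small sketch of the bookkeeping. It is not OpenSL ES code: resolveInterfaces and the string IDs are hypothetical, and I assume the behaviour described in the specification, where a missing required interface makes creation fail while a missing optional one is simply not exposed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper, not an OpenSL ES function.
// supported[] lists what an imaginary object type offers; ids[] and required[]
// are the parallel arrays a caller passes at creation time.
// granted[i] is set to true when ids[i] would be exposed on the new object.
// Returns false (creation fails) when a required interface is not supported.
inline bool resolveInterfaces(const char* const supported[], std::size_t supportedCount,
                              const char* const ids[], const bool required[],
                              std::size_t count, bool granted[])
{
    assert(count == 0 || (ids != NULL && required != NULL && granted != NULL));
    bool ok = true;
    for (std::size_t i = 0; i < count; i++)
    {
        granted[i] = false;
        for (std::size_t j = 0; j < supportedCount; j++)
            if (std::strcmp(ids[i], supported[j]) == 0)
                granted[i] = true;
        if (!granted[i] && required[i])
            ok = false;  // required but missing: the whole creation fails
    }
    return ok;
}
```

 Marking an interface as required is the safer default when the code cannot work without it, because the failure then shows up at creation time instead of at the later GetInterface call.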

 Next we are going to create the output mix: the object at the end of the chain that sends our data to the HW device. Its creation follows the same logic as for the Engine, but this time we request zero interfaces.

 // mixed output
 // no interfaces are requested, so the count is 0 and both arrays are NULL
 // (empty array initializers "{}" are not standard C++, and new initialized
 // locals here would be jumped over by the "goto ERROR" statements above)

 // create output
 result = (*mEngine)->CreateOutputMix(mEngine, &mOutPutMixObj, 0, NULL, NULL);
 if (result != SL_RESULT_SUCCESS)
  goto ERROR;
 result = (*mOutPutMixObj)->Realize(mOutPutMixObj, SL_BOOLEAN_FALSE);
 if (result != SL_RESULT_SUCCESS)
  goto ERROR;

 return 0;

ERROR:
 LOGE("Starting SoundService failed");
 stop();
 return -1;
}


AudioPlayer


 Now we are going to build the sound player, which will be responsible for keeping the queue of sound data full. It will be created through the Engine interface and will send its output to the output mix we created.

 In the following routine we also encounter pieces of the second area, the mixer logic. When we meet them I will just mention them, describe them briefly, and skip them for now, as the mixer logic will be explained in the second part of this article.

 The initial part is related to the mixer logic: it marks all sound channels of the mixer as unused.

s32 SoundService::startSoundPlayer()
{
 // clear sounds
 for (s32 i = 0; i < SBC_AUDIO_OUT_CHANNELS; i++)
  mSounds[i].mUsed = false;

 First we define the data locator: we say where the data we want to play comes from.

 SLresult result;

 // INPUT
 // audio source with a maximum of two buffers in the queue
 // where the data comes from
 SLDataLocator_AndroidSimpleBufferQueue dataLocatorInput;
 dataLocatorInput.locatorType = SL_DATALOCATOR_ANDROIDSIMPLEBUFFERQUEUE;
 dataLocatorInput.numBuffers = 2;

 We say that the data will be in memory buffers and that we have two of them. If we wanted to play MP3 music stored in a file, we would use SL_DATALOCATOR_ANDROIDFD with different additional parameters.

 Then we define the format of the data that will be stored in memory buffers:

 // format of data
 SLDataFormat_PCM dataFormat;
 dataFormat.formatType = SL_DATAFORMAT_PCM;
 dataFormat.numChannels = 1; // Mono sound.
 dataFormat.samplesPerSec = SL_SAMPLINGRATE_11_025;
 dataFormat.bitsPerSample = SL_PCMSAMPLEFORMAT_FIXED_16;
 dataFormat.containerSize = SL_PCMSAMPLEFORMAT_FIXED_16;
 dataFormat.channelMask = SL_SPEAKER_FRONT_CENTER;
 dataFormat.endianness = SL_BYTEORDER_LITTLEENDIAN;

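 The parameters are mostly self-explanatory. We will create buffers with raw PCM data. Our playback rate will be 11025 Hz, and the data will be mono, 16-bit, little-endian. Note that OpenSL ES expresses samplesPerSec in milliHertz, which is why the SL_SAMPLINGRATE_* constant is used instead of the plain number 11025.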
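
 The format also determines how many bytes a buffer needs and how much playing time it holds, which matters when choosing the buffer size. Here is a small sketch of that arithmetic. PcmFormat, bufferBytes and bufferMillis are hypothetical helpers, and the 2205-sample buffer in the note below is only an example value, not the SBC_AUDIO_OUT_BUFFER_SIZE used in the engine.

```cpp
#include <cassert>
#include <cstddef>
#include <stdint.h>

// Hypothetical description of a PCM stream (mirrors the fields set above).
struct PcmFormat
{
    uint32_t channels;           // 1 = mono
    uint32_t sampleRateMilliHz;  // 11025 Hz is stored as 11025000
    uint32_t bitsPerSample;      // 16 for s16 samples
};

// Size in bytes of a buffer holding the given number of frames
// (one frame = one sample for every channel).
inline std::size_t bufferBytes(const PcmFormat& format, std::size_t frames)
{
    assert(format.bitsPerSample % 8 == 0);
    return frames * format.channels * (format.bitsPerSample / 8);
}

// Playing time of such a buffer in milliseconds.
inline double bufferMillis(const PcmFormat& format, std::size_t frames)
{
    assert(format.sampleRateMilliHz > 0);
    // frames / (rate in Hz) gives seconds; the rate in Hz is milliHz / 1000
    return 1000.0 * 1000.0 * static_cast<double>(frames) / format.sampleRateMilliHz;
}
```

 For the format above, a buffer of 2205 samples would take 4410 bytes and hold 200 ms of sound. With two such buffers in the queue, a newly mixed sound could be delayed by up to roughly twice that time.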

 Now we can combine the location of the data with its format to create an SLDataSource object that describes the input data:

 // combine location and format into source
 SLDataSource dataSource;
 dataSource.pLocator = &dataLocatorInput;
 dataSource.pFormat = &dataFormat;

 We have finished the description of the input, so now we have to describe the output. We will send data to the output mix we created during initialization in the start() method:

 // OUTPUT
 SLDataLocator_OutputMix dataLocatorOut;
 dataLocatorOut.locatorType = SL_DATALOCATOR_OUTPUTMIX;
 dataLocatorOut.outputMix = mOutPutMixObj;

 SLDataSink dataSink;
 dataSink.pLocator = &dataLocatorOut;
 dataSink.pFormat = NULL;

 Now it is time to create the sound player object. The object will be created through the Engine, its input will be as described (raw 16-bit PCM stored in memory buffers), and it will output to dataSink, which forwards the data to the output mix. We again follow the OpenSL ES logic: create the object, realize it (to allocate resources ...), get the interfaces. Notice that we request three interfaces. SL_IID_PLAY allows us to start, stop, and pause playback. SL_IID_BUFFERQUEUE allows us to control the queue of buffers (we have two of them). The last interface, SL_IID_VOLUME, allows us to control the volume:

 // create sound player
 const SLuint32 soundPlayerIIDCount = 3;
 const SLInterfaceID soundPlayerIIDs[] = {SL_IID_PLAY, SL_IID_BUFFERQUEUE, SL_IID_VOLUME};
 const SLboolean soundPlayerReqs[] = {SL_BOOLEAN_TRUE, SL_BOOLEAN_TRUE, SL_BOOLEAN_TRUE};

 result = (*mEngine)->CreateAudioPlayer(mEngine, &mSoundPlayerObj, &dataSource, &dataSink,
   soundPlayerIIDCount, soundPlayerIIDs, soundPlayerReqs);
 if (result != SL_RESULT_SUCCESS)
  goto ERROR;
 result = (*mSoundPlayerObj)->Realize(mSoundPlayerObj, SL_BOOLEAN_FALSE);
 if (result != SL_RESULT_SUCCESS)
  goto ERROR;

 The object is created and realized, so we get all three interfaces:

 // get interfaces
 result = (*mSoundPlayerObj)->GetInterface(mSoundPlayerObj, SL_IID_PLAY, &mSoundPlayer);
 if (result != SL_RESULT_SUCCESS)
  goto ERROR;
 result = (*mSoundPlayerObj)->GetInterface(mSoundPlayerObj, SL_IID_BUFFERQUEUE, &mSoundQueue);
 if (result != SL_RESULT_SUCCESS)
  goto ERROR;
 result = (*mSoundPlayerObj)->GetInterface(mSoundPlayerObj, SL_IID_VOLUME, &mSoundVolume);
 if (result != SL_RESULT_SUCCESS)
  goto ERROR;
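
 A note on the volume interface acquired above: SetVolumeLevel() of SLVolumeItf takes the level in millibels (hundredths of a decibel), not as a linear 0..1 gain. A minimal conversion sketch follows. gainToMillibel is a hypothetical helper, and it assumes that 0 mB is the maximum level; the real maximum should be queried with GetMaxVolumeLevel().

```cpp
#include <cassert>
#include <cmath>
#include <stdint.h>

// Same numeric value as SL_MILLIBEL_MIN, which is treated as silence.
const int16_t kMillibelMin = -32768;

// Converts a linear gain (0.0 = silence, 1.0 = full volume) to millibels.
// Assumption: the maximum level is 0 mB, so gains above 1.0 are clamped.
inline int16_t gainToMillibel(float gain)
{
    assert(gain >= 0.0f);
    if (gain <= 0.0f)
        return kMillibelMin;
    if (gain >= 1.0f)
        return 0;
    // 20 * log10(gain) is decibels; multiplied by 100 it is millibels
    long millibels = std::lround(2000.0 * std::log10(static_cast<double>(gain)));
    if (millibels < kMillibelMin)
        millibels = kMillibelMin;
    return static_cast<int16_t>(millibels);
}
```

 With the interface from the listing above, a call could then look like (*mSoundVolume)->SetVolumeLevel(mSoundVolume, gainToMillibel(0.5f)), which requests about -6 dB.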

 At this point we have initialized the OpenSL ES Engine, we have created the audio player, and we can start sending data (in the defined format) to it. We said we have two memory buffers. We can fill a buffer with data and enqueue it, but how do we know that playback has finished and we should send the next data? We can register a callback routine through the buffer queue interface. When a buffer in the queue finishes playing, our custom routine (soundPlayerCallback) gets called and we can prepare and send the next buffer.

 // register callback for queue
 result = (*mSoundQueue)->RegisterCallback(mSoundQueue, soundPlayerCallback, this);
 if (result != SL_RESULT_SUCCESS)
  goto ERROR;

 If we had only one buffer, the audio could get choppy, because the queue would run out of data while we refill it. So at the very beginning we clear both buffers (fill them with silence) and send both of them to the queue. When the first one finishes playing, our callback gets called and we can fill the first buffer with new data. While we are doing so, the second buffer is still playing. The following snippet is more related to the mixer logic that will be described in the second part. But in short: there are two buffers and one pointer that flips between them.

 // prepare mixer and enqueue 2 buffers
 // clear buffers
 memset(mAudioOutSoundData1, 0, sizeof(s16) * SBC_AUDIO_OUT_BUFFER_SIZE);
 memset(mAudioOutSoundData2, 0, sizeof(s16) * SBC_AUDIO_OUT_BUFFER_SIZE);
 // point to first one
 mActiveAudioOutSoundBuffer = mAudioOutSoundData1;

 // send two buffers
 sendSoundBuffer();
 sendSoundBuffer();

 I was wondering whether the data is copied into the queue upon sending, so that I could have only one buffer. But it seems this is not safe, as the Specification reads: "The buffers that are queued in a player object are used in place and are not required to be copied by the device, although this may be implementation-dependent. The application developer should be aware that modifying the content of a buffer after it has been queued is undefined and can cause audio corruption."
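
 The "two buffers and one flipping pointer" idea can be sketched without any OpenSL ES calls. This is only an illustration under my own assumptions: DemoDoubleBuffer and its fillAndSend() are hypothetical and merely stand in for the prepareSoundBuffer() / Enqueue / swapSoundBuffers() sequence; the real mixer logic comes in the second part.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdint.h>

const std::size_t kDemoBufferSize = 256;  // hypothetical size in samples

class DemoDoubleBuffer
{
public:
    DemoDoubleBuffer() : mActive(mData1)
    {
        // both buffers start as silence
        std::memset(mData1, 0, sizeof(mData1));
        std::memset(mData2, 0, sizeof(mData2));
    }

    // the buffer that is safe to write to right now
    int16_t* active() { return mActive; }

    // Fills the active buffer, "sends" it (returns the pointer that would be
    // enqueued) and flips to the other buffer, so the sent one stays untouched.
    const int16_t* fillAndSend(int16_t value)
    {
        assert(mActive == mData1 || mActive == mData2);
        for (std::size_t i = 0; i < kDemoBufferSize; i++)
            mActive[i] = value;
        const int16_t* sent = mActive;
        mActive = (mActive == mData1) ? mData2 : mData1;
        return sent;
    }

private:
    int16_t mData1[kDemoBufferSize];
    int16_t mData2[kDemoBufferSize];
    int16_t* mActive;
};
```

 After two sends the active pointer is back at the first buffer. That is exactly the moment the first callback arrives, so the buffer being refilled is the one that has just finished playing and never the one still waiting in the queue.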

 Finally we can finish our long routine and start playing:

 // start playing
 result = (*mSoundPlayer)->SetPlayState(mSoundPlayer, SL_PLAYSTATE_PLAYING);
 if (result != SL_RESULT_SUCCESS)
  goto ERROR;

 // no problems
 return 0;

ERROR:
 LOGE("Creating sound player failed");
 return -1;
}


Callback


 The callback routine is as simple as this:

void SoundService::soundPlayerCallback(SLAndroidSimpleBufferQueueItf aSoundQueue, void* aContext)
{
 //LOGE("SOUND CALLBACK called");
 ((SoundService*) aContext)->sendSoundBuffer();
}

and the sendSoundBuffer() routine is the last piece of the mosaic. The routines called from it, prepareSoundBuffer() and swapSoundBuffers(), are related to the mixer logic and do not touch OpenSL ES.

void SoundService::sendSoundBuffer()
{
 SLresult result;

 prepareSoundBuffer();
 result = (*mSoundQueue)->Enqueue(mSoundQueue, mActiveAudioOutSoundBuffer,
   sizeof(s16) * SBC_AUDIO_OUT_BUFFER_SIZE);
 if (result != SL_RESULT_SUCCESS)
  LOGE("enqueue method of sound buffer failed");
 swapSoundBuffers();
}
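
 The callback has to be a plain function (in a class, a static member function) because the queue can only store a function pointer plus an opaque context pointer. Passing this as the context and casting it back is the usual way to get from the C-style callback into the object. A minimal sketch of the pattern, with a hypothetical DemoQueue standing in for the real buffer queue:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical C-style queue: it stores only a function pointer and a context.
typedef void (*DemoQueueCallback)(void* context);

struct DemoQueue
{
    DemoQueueCallback callback;
    void* context;

    DemoQueue() : callback(NULL), context(NULL) {}

    void registerCallback(DemoQueueCallback cb, void* ctx)
    {
        callback = cb;
        context = ctx;
    }

    // simulates the audio system reporting that one buffer finished playing
    void bufferFinished()
    {
        if (callback != NULL)
            callback(context);
    }
};

class DemoPlayer
{
public:
    DemoPlayer() : mBuffersSent(0) {}

    void attach(DemoQueue& queue)
    {
        queue.registerCallback(&DemoPlayer::onBufferFinished, this);
    }

    int buffersSent() const { return mBuffersSent; }

private:
    // Static member: it has no implicit "this", so it fits a C function pointer.
    static void onBufferFinished(void* context)
    {
        assert(context != NULL);
        static_cast<DemoPlayer*>(context)->sendNextBuffer();
    }

    void sendNextBuffer() { mBuffersSent++; }

    int mBuffersSent;
};
```

 In the real service the callback is typically invoked from an internal audio thread rather than from the application thread, so the work done inside it is best kept short and non-blocking.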


Cleaning


So far we have described the initialization, so it is time to show the routines that stop playing and clean up. First, destroying the sound player...:

void SoundService::stopSoundPlayer()
{
 if (mSoundPlayerObj != NULL)
 {
  SLuint32 soundPlayerState;
  (*mSoundPlayerObj)->GetState(mSoundPlayerObj, &soundPlayerState);

  if (soundPlayerState == SL_OBJECT_STATE_REALIZED)
  {
   (*mSoundQueue)->Clear(mSoundQueue);
   (*mSoundPlayerObj)->AbortAsyncOperation(mSoundPlayerObj);
   (*mSoundPlayerObj)->Destroy(mSoundPlayerObj);
   mSoundPlayerObj = NULL;
   mSoundPlayer = NULL;
   mSoundQueue = NULL;
   mSoundVolume = NULL;
  }
 }
}

... and clearing the Engine and sound output:

void SoundService::stop()
{
 // destroy sound player
 LOGI("Stopping and destroying sound player");
 stopSoundPlayer();

 LOGI("Destroying sound output");
 if (mOutPutMixObj != NULL)
 {
  (*mOutPutMixObj)->Destroy(mOutPutMixObj);
  mOutPutMixObj = NULL;
 }

 LOGI("Destroy sound engine");
 if (mEngineObj != NULL)
 {
  (*mEngineObj)->Destroy(mEngineObj);
  mEngineObj = NULL;
  mEngine = NULL;
 }
}
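
 Note the order in stop(): the objects are destroyed in the reverse order of their creation (player, then output mix, then engine), so nothing is destroyed while another object still depends on it. The following sketch only illustrates this ordering rule. TeardownStack is a hypothetical helper that records names and does not touch OpenSL ES.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical bookkeeping helper: remembers creation order and
// reports the order in which the objects should be destroyed.
class TeardownStack
{
public:
    // call right after an object was created successfully
    void created(const std::string& name)
    {
        assert(!name.empty());
        mAlive.push_back(name);
    }

    // "destroys" everything, newest first, and returns the order used
    std::vector<std::string> destroyAll()
    {
        std::vector<std::string> order;
        while (!mAlive.empty())
        {
            order.push_back(mAlive.back());
            mAlive.pop_back();
        }
        return order;
    }

private:
    std::vector<std::string> mAlive;
};
```

 Because only successfully created objects are recorded, the same teardown also works after a partial failure, which is the situation stop() handles with its NULL checks when it is called from the ERROR label in start().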

Conclusion


 So, for now we have:
  • initialized the OpenSL ES Engine and sound output,
  • created an AudioPlayer with defined input and output,
  • registered a callback that will notify us when new data is needed.
 We also know how to stop playing and destroy everything we created.

 So far we are hearing only silence. In the next part I will describe how to fill the buffers with data and how to mix channels to produce some sound.



2 comments:

  1. Thank you for these posts! I'm having to wade into a lot of this stuff at the moment, and this is probably the best resource currently on the web for Android + OpenSL ES.

  2. No, one AudioPlayer is enough. The second part of the article explains how to mix the playing sounds and thus how to create data for the buffers. The sound data has to be stored somewhere too, but it is stored in standard C++ arrays, not in OpenSL buffers. From the OpenSL point of view one player is OK.
    However, this way of playing sounds is not suitable for game audio where you want minimal latency between the user action and the sound playing. I plan to write an article on how to build an OpenSL sound pool in the future.
