Capturing and streaming sound by using DirectSound with C#

I have already written a little about the managed way to use DirectX DirectSound. Today we’ll look at how to capture sound from your microphone, or any other DirectSound capture device (such as an FM receiver), and stream it to your PC speakers or any other DirectSound output device. So, let’s create our first echo service using managed DirectX.

First of all, we should decide which wave format to use for capturing and playback. So, let’s choose something reasonable :)

var format = new WaveFormat {
   SamplesPerSecond = 96000,
   BitsPerSample = 16,
   Channels = 2,
   FormatTag = WaveFormatTag.Pcm
};

Now we should calculate the block align and average bytes per second values for this format. I wonder why this cannot be done automatically…

format.BlockAlign = (short)(format.Channels * (format.BitsPerSample / 8));
format.AverageBytesPerSecond = format.SamplesPerSecond * format.BlockAlign;
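
To sanity-check these two values without touching DirectSound, here is a minimal sketch in plain C#. The PcmMath class and its method names are my own (hypothetical) and are not part of managed DirectX; the sketch only repeats the arithmetic above.

```csharp
using System;

// Hypothetical helper, not part of Managed DirectX: repeats the PCM arithmetic above.
public static class PcmMath
{
    // Bytes occupied by one sample frame (one sample for every channel).
    public static short BlockAlign(short channels, short bitsPerSample)
    {
        return (short)(channels * (bitsPerSample / 8));
    }

    // Bytes of audio produced per second at the given sample rate.
    public static int AverageBytesPerSecond(int samplesPerSecond, short blockAlign)
    {
        return samplesPerSecond * blockAlign;
    }
}
```

For the format chosen above (96000 Hz, 16 bits, 2 channels) this gives a block align of 4 bytes and 384000 bytes per second.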

The next step is to set the sizes of two buffers: one for input and one for output. Both buffers are circular, and the capture buffer should be twice as large as the output buffer. Why? Because we will be notified each time half of the capture buffer has been filled, and each half is copied to the output buffer in one piece. We should also decide on the size of the buffer chunk that we want to be signaled about when it is filled.

_dwNotifySize = Math.Max(4096, format.AverageBytesPerSecond / 8);
_dwNotifySize -= _dwNotifySize % format.BlockAlign;
_dwCaptureBufferSize = NUM_BUFFERS * _dwNotifySize;
_dwOutputBufferSize = NUM_BUFFERS * _dwNotifySize / 2;
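
The same sizing can be tried out in isolation. The sketch below is plain C# with no DirectSound calls. The BufferSizes class is hypothetical, and so is the value of NumBuffers: the article never shows how NUM_BUFFERS is defined, so 2 is only an assumption for illustration.

```csharp
using System;

// Hypothetical helper that mirrors the buffer sizing above.
public static class BufferSizes
{
    // Assumed value: the article does not show the definition of NUM_BUFFERS.
    public const int NumBuffers = 2;

    // At least 4096 bytes, otherwise about 1/8 of a second, rounded down to whole frames.
    public static int NotifySize(int averageBytesPerSecond, int blockAlign)
    {
        int size = Math.Max(4096, averageBytesPerSecond / 8);
        return size - size % blockAlign;
    }

    // The capture buffer holds NumBuffers chunks.
    public static int CaptureBufferSize(int notifySize)
    {
        return NumBuffers * notifySize;
    }

    // The output buffer is half of the capture buffer.
    public static int OutputBufferSize(int notifySize)
    {
        return NumBuffers * notifySize / 2;
    }
}
```

With 384000 bytes per second and a block align of 4, the notify size is 48000 bytes, the capture buffer is 96000 bytes, and the output buffer is 48000 bytes. Note that halving does not guarantee a multiple of the block align for every combination of values, so it is worth checking the result for your own format.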

The next step is to create a CaptureBufferDescription and the actual capture buffer. We’ll enumerate all capture devices and choose the first one whose description contains a given string (captureDescriptor), for example “Mic” :)

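// Pick the first capture device whose description contains captureDescriptor
var cap = default(Capture);
var cdc = new CaptureDevicesCollection();
for (int i = 0; i < cdc.Count; i++) {
   if (cdc[i].Description.ToLower().Contains(captureDescriptor.ToLower())) {
      cap = new Capture(cdc[i].DriverGuid);
      break;
   }
}
// No match leaves cap null, and a capture buffer cannot be created without a device
if (cap == null) {
   throw new InvalidOperationException("No capture device matches: " + captureDescriptor);
}
var capDesc = new CaptureBufferDescription {
   Format = format,
   BufferBytes = _dwCaptureBufferSize
};
_dwCapBuffer = new CaptureBuffer(capDesc, cap);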
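
If you want to test the matching rule on a machine without the device, the lookup can be separated from DirectSound. This is only a sketch: DeviceMatch and FindFirst are hypothetical names, and the list of descriptions is passed in as plain strings instead of coming from CaptureDevicesCollection.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: the same "first description containing the text" rule, on plain strings.
public static class DeviceMatch
{
    // Returns the index of the first description containing 'wanted' (ignoring case), or -1.
    public static int FindFirst(IList<string> descriptions, string wanted)
    {
        for (int i = 0; i < descriptions.Count; i++)
        {
            if (descriptions[i] != null &&
                descriptions[i].IndexOf(wanted, StringComparison.OrdinalIgnoreCase) >= 0)
            {
                return i;
            }
        }
        return -1; // no match: the caller decides how to report it
    }
}
```

Returning -1 instead of a null device makes the "nothing found" case explicit, which is the case that otherwise shows up later as a confusing failure when the capture buffer is created.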

Then we’ll create the output device and buffer. To keep the program simple, we will use the default speakers for output; however, you can choose the output device the same way we did for capturing. Also, because DirectSound needs a window to associate the device with, we have to call the SetCooperativeLevel method. In my case (a windowless application), I’ll use the desktop window. This is why you will have to add System.Windows.Forms as a reference to your project, even if it is a console application. Also, do not forget to set GlobalFocus to true if you want the echo to keep playing even when that window is not focused.

var dev = new Device();
dev.SetCooperativeLevel(Native.GetDesktopWindow(), CooperativeLevel.Priority);

var devDesc = new BufferDescription {
   BufferBytes = _dwOutputBufferSize,
   Format = format,
   DeferLocation = true,
   GlobalFocus = true
};
_dwDevBuffer = new SecondaryBuffer(devDesc, dev);
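
The Native class used above is not shown in this post. A plausible minimal version, assuming it is nothing more than a P/Invoke wrapper around the Win32 GetDesktopWindow function from user32.dll, could look like this (Windows only):

```csharp
using System;
using System.Runtime.InteropServices;

// Assumed shape of the Native helper; the original class is not shown in the post.
public static class Native
{
    // Win32 GetDesktopWindow from user32.dll: returns the handle of the desktop window.
    [DllImport("user32.dll")]
    public static extern IntPtr GetDesktopWindow();
}
```

If your application already has a window, passing that form or control to SetCooperativeLevel is an alternative to using the desktop window.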

Now we will subscribe to buffer notifications and set an AutoResetEvent to be signaled when each half of the capture buffer fills up.

var _resetEvent = new AutoResetEvent(false);
var _notify = new Notify(_dwCapBuffer);
//half&half
var bpn1 = new BufferPositionNotify();
bpn1.Offset = _dwCapBuffer.Caps.BufferBytes / 2 - 1;
bpn1.EventNotifyHandle = _resetEvent.SafeWaitHandle.DangerousGetHandle();
var bpn2 = new BufferPositionNotify();
bpn2.Offset = _dwCapBuffer.Caps.BufferBytes - 1;
bpn2.EventNotifyHandle = _resetEvent.SafeWaitHandle.DangerousGetHandle();

_notify.SetNotificationPositions(new BufferPositionNotify[] { bpn1, bpn2 });
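
To see which byte offsets these two notifications land on, and how the read position moves between the two halves, here is a small sketch in plain C#. NotifySketch and its method names are hypothetical; only the arithmetic is taken from the code above.

```csharp
using System;

// Hypothetical helper: the "half & half" offsets and the wrap-around read position.
public static class NotifySketch
{
    // Last byte of the first half and last byte of the whole capture buffer.
    public static int[] HalfAndEnd(int captureBufferBytes)
    {
        return new[] { captureBufferBytes / 2 - 1, captureBufferBytes - 1 };
    }

    // Start of the next chunk to read, wrapping around the circular buffer.
    public static int NextReadOffset(int offset, int chunkBytes, int captureBufferBytes)
    {
        return (offset + chunkBytes) % captureBufferBytes;
    }
}
```

For a 96000-byte capture buffer the notifications sit at offsets 47999 and 95999, and with 48000-byte chunks the read position alternates between 0 and 48000.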

We are almost done. The only thing left is to start a worker thread that handles the notifications:

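int offset = 0;
_dwCaptureThread = new Thread((ThreadStart)delegate {
   _dwCapBuffer.Start(true); // true = looping capture

   while (IsReady) {
      _resetEvent.WaitOne(); // block until the next notification position is reached
      // copy the freshly captured half into the output buffer
      var read = _dwCapBuffer.Read(offset, typeof(byte), LockFlag.None, _dwOutputBufferSize);
      _dwDevBuffer.Write(0, read, LockFlag.EntireBuffer);
      // move on to the other half, wrapping around the circular buffer
      offset = (offset + _dwOutputBufferSize) % _dwCaptureBufferSize;
      // play the output buffer from its beginning
      _dwDevBuffer.SetCurrentPosition(0);
      _dwDevBuffer.Play(0, BufferPlayFlags.Default);
   }
   _dwCapBuffer.Stop();
});
_dwCaptureThread.Start();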
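
IsReady is not declared anywhere in the snippets above; it is simply a flag that stays true for as long as you want the echo to run. One way to arrange such a flag and a clean shutdown, sketched in plain C# with no DirectSound calls, is shown below. CaptureLoopSketch and its members are hypothetical names, and the onChunk callback stands in for the Read, Write and Play calls.

```csharp
using System;
using System.Threading;

// Hypothetical sketch: the same wait-and-copy loop shape, with no DirectSound involved.
public sealed class CaptureLoopSketch
{
    private readonly AutoResetEvent _resetEvent = new AutoResetEvent(false);
    private volatile bool _isReady;   // written by the caller, read by the worker
    private Thread _worker;

    // Starts the worker; onChunk stands in for the Read/Write/Play calls.
    public void Start(Action onChunk)
    {
        _isReady = true;
        _worker = new Thread(() =>
        {
            while (true)
            {
                _resetEvent.WaitOne();   // in the article, DirectSound signals this event
                if (!_isReady) break;    // woken by Stop, not by new audio
                onChunk();
            }
        });
        _worker.IsBackground = true;     // do not keep the process alive
        _worker.Start();
    }

    // Stands in for a buffer position notification.
    public void Signal()
    {
        _resetEvent.Set();
    }

    // Clears the flag, wakes the worker, and waits up to timeoutMs for it to exit.
    public bool Stop(int timeoutMs)
    {
        _isReady = false;
        _resetEvent.Set();
        return _worker.Join(timeoutMs);
    }
}
```

The flag is marked volatile because one thread writes it and another reads it. Stop also signals the event, so a worker that is blocked in WaitOne wakes up and sees the flag instead of waiting for another notification.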

That’s it. Compile and run. Now, if you speak, you will hear your echo from the PC speakers.

Merry Christmas to all who celebrate, and be good people: do not scare your co-workers with strange sounds. Be polite and turn the volume down :)

29 Responses to “Capturing and streaming sound by using DirectSound with C#”

  1. Read and use FM radio (or any other USB HID device) from C# | Tamir Khason - Just code Says:

    Pingback from  Read and use FM radio (or any other USB HID device) from C# | Tamir Khason – Just code

  2. Dew Drop - Xmas Edition - December 24-25, 2008 | Alvin Ashcraft's Morning Dew Says:

    Pingback from  Dew Drop – Xmas Edition – December 24-25, 2008 | Alvin Ashcraft’s Morning Dew

  3. DotNetKicks.com Says:

    You’ve been kicked (a good thing) – Trackback from DotNetKicks.com

  4. Zain Says:

    I dont think “var” keyword is available in the C# and you have also not specified the types of many variables. it is very difficult to understand this code. can you tell from where does the “IsReady” and NUM_BUFFER comes from

  5. Tamir Says:

    All you need is to run it toward client set of C# 3.5 :) This way everything will work for you

  6. VJ Says:

    I got the FM Tuner from Silicon labs and ran your code. The issue I’m seeing is that the USB radio isn’t showing up as a capture device. My microphone and Webcam are all being properly identified, but not the FM radio. So, when you iterate through the available capture devices, it fails to find the USB radio. Do you have any insights into why this may be happening?

  7. Tamir Says:

    Please see inside “//init cap buffer” and see what the name of the device detected. This might be wary!

  8. VJ Says:

    It appears like the FM Tuner is showing up as a WDM capture device which is not being recognized by Direct Sound’s DevicesCollection.

  9. Tamir Says:

    What OS you are using?

  10. JX Says:

    When i tried to stream audio from a FM receiver, I experience lag every few seconds (depending on the buffer size i set).
    It seems that whenever the secondary buffer finish playing its buffer and tries to regrab the data, there is a short pause.
    Do you have any idea how to solve this problem?

  11. Joseph Says:

    I’m having the same problem as VJ. My web cam microphone and USB microphone do not show in the DevicesCollection. I am using Vista 64 bit SP1. Any Ideas?

  12. Rishabh Ohri Says:

    HI….Can anyone tell me where to put this code..as I’m new to .Net i’m not able to understand where the code needs to be placed.

  13. Nova Says:

    “All you need is to run it toward client set of C# 3.5 This way everything will work for you” That’s NOT true. It produces build errors with ‘Native’, ‘AutoResetEvent’, ‘ThreadStart’, ‘IsReady’…. You need to provide more details including references.

  14. Keane Says:

    _dwNotifySize = Math.Max(4096, format.AverageBytesPerSecond / 8);
    _dwNotifySize -= _dwNotifySize % format.BlockAlign;
    _dwCaptureBufferSize = NUM_BUFFERS * _dwNotifySize;
    _dwOutputBufferSize = NUM_BUFFERS * _dwNotifySize / 2;

    what does the code mean ?? when i run it, the program error..
    really need help.. ASAP..

    sorry for my bad english

  15. Gabriel Says:

    I need help.

    There are two things I didn’t understand in these pieces of code.

    1st – Where does NUM_BUFFERS come from?
    2nd – Where does IsReady come from?

    Thanks

  16. Developer Says:

    It will be very helpfull if u write the usings and if u declar ur variables…

    but without that all your code is worthless…

  17. The Admin Jr Says:

    I am very grateful for code which would have taken me hours to get going otherwise. However, I do not appreciate incomplete code either.

    Guys, the only required references are to Microsoft.DirectX.DirectSound, which should be found in “C:\Windows\Microsoft.NET\DirectX for Managed Code” and then probably the first folder (and System.Windows.Forms or PresentationFramework and PresentationCore if you are making a Class Library)

    Using statements for this to run correctly:
    using System.Windows.Threading;
    using Microsoft.DirectX.DirectSound;

    Some comments to point out that some of the variables you have to specify yourself would have been nice. I would never dream of doing something as completely confusing as that, probably why my tutorials are so popular.

    Another reason could also be that I actually bother to declare my variables (which you obviously made public in the containing class and forgot to mention).

    Personal opinion: 6/10 – Sloppy

  18. The Admin Jr Says:

    Oh, and NUM_BUFFERS should only be one, I assume from what he’s written. IsReady I’m afraid you will have to program yourselves. This should be true while you wish to capture and then output the sound.

    Finally, I really do appreciate this, but there are some statements in here which do not make sense. Work on improving the English on these would be great.

  19. The Admin Jr Says:

    …And another amendment, the Native class does not seem to exist

    (there is a System.Windows.Forms.NativeWindow, but this does not have a function ‘GetDesktopWindow’)

    so instead I used his earlier “((HwndSource)PresentationSource.FromVisual(SourceHost)).Handle”

    where “SourceHost = (FrameworkElement)App.Current.MainWindow.Content” for WPF or simply a control for WinForms (I think)

  20. Luka Says:

    Only Indian developers write code like this – incomplete, smelly and ugly!

  21. Vincent Says:

    I get a problem that the code which
    Format = format

    show that the value are null. May I know why ?

  22. James Carlyle-Clarke Says:

    Dear All,

    Here is an annotated, working version of this code. I’m not finished tidying this up yet but wanted to quickly post it to help Vincent.

    I created a C# Windows Forms project in Visual C# 2008. I added references to Microsoft.DirectX and Microsoft.DirectX.DirectSound. I added start and stop buttons called StartButton and StopButton with click events

    private void StartButton_Click(object sender, EventArgs e)
    private void StopButton_Click(object sender, EventArgs e)

    You may need to change the string in StartEcho(“Input”, this) – look in your Control Panel>Sounds and Audio Devices, under Audio>Sound recording>Default device and choose part or all of that name.

    I’ve made some changes to the code either for comprehension or to get it to work. With the updated code I _was_ getting a sound glitch about twice every time the output buffer filled – ie often. I think the buffer was being overwritten or locked while it was still playing, so I created two output buffers and it seems to work OK now, still the very occasional glitch but it might be my computer (no sound card!) – please post with comments on this when you try it.

    I’m rusty on thread safe variables, and I’ve not had time to do the research. As a result there are some variables that I’m not sure if they are safe or not; I’ve marked them. I think maybe they should be static volatile but it seems to work as is… If anyone can tell me what’s right I’d appreciate it.

    If I continue to improve it (rather than getting on with writing my own code) then I may post back, and I hope others will do the same…

    I hope this is helpful to someone, and I thank the original poster, but suggest better, more complete and clearer code examples in the future.

  23. James Carlyle-Clarke Says:

    Dear All,

    Here is an annotated, working version of this code. I’m not finished tidying this up yet but wanted to quickly post it to help Vincent.

    I created a C# Windows Forms project in Visual C# 2008. I added references to Microsoft.DirectX and Microsoft.DirectX.DirectSound. I added start and stop buttons called StartButton and StopButton with click events

    private void StartButton_Click(object sender, EventArgs e)
    private void StopButton_Click(object sender, EventArgs e)

    You may need to change the string in StartEcho(“Input”, this) – look in your Control Panel>Sounds and Audio Devices, under Audio>Sound recording>Default device and choose part or all of that name.

    I’ve made some changes to the code either for comprehension or to get it to work. With the updated code I _was_ getting a sound glitch about twice every time the output buffer filled – ie often. I think the buffer was being overwritten or locked while it was still playing, so I created two output buffers and it seems to work OK now, still the very occasional glitch but it might be my computer (no sound card!) – please post with comments on this when you try it.

    I’m rusty on thread safe variables, and I’ve not had time to do the research. As a result there are some variables that I’m not sure if they are safe or not; I’ve marked them. I think maybe they should be static volatile but it seems to work as is… If anyone can tell me what’s right I’d appreciate it.

    If I continue to improve it (rather than getting on with writing my own code) then I may post back, and I hope others will do the same…

    I hope this is helpful to someone, and I thank the original author, but suggest better, more complete and clearer code examples in the future.

  24. James Carlyle-Clarke Says:

    Dear All,

    Here is an annotated, working version of this code. It’s the first time I’ve done DirectSound, I don’t understand it yet, and I’m not finished tidying this up yet, but I wanted to quickly post it to help Vincent. Anyway, enough excuses!

    I created a C# Windows Forms project in Visual C# 2008. I added references to Microsoft.DirectX and Microsoft.DirectX.DirectSound. I added start and stop buttons called StartButton and StopButton with click events

    private void StartButton_Click(object sender, EventArgs e)
    private void StopButton_Click(object sender, EventArgs e)

    You may need to change the string in StartEcho(“Input”, this) – look in your Control Panel>Sounds and Audio Devices, under Audio>Sound recording>Default device and choose part or all of that name.

    I’ve made some changes to the code either for comprehension or to get it to work. With the updated code I _was_ getting a sound glitch about twice every time the output buffer filled – ie often. I think the buffer was being overwritten or locked while it was still playing, so I created two output buffers and it seems to work OK now, still the very occasional glitch but it might be my computer (no sound card!) – please post with comments on this when you try it.

    I’m rusty on thread safe variables, and I’ve not had time to do the research. As a result there are some variables that I’m not sure if they are safe or not; I’ve marked them. I think maybe they should be static volatile but it seems to work as is… If anyone can tell me what’s right I’d appreciate it.

    If I continue to improve it (rather than getting on with writing my own code) then I may post back, and I hope others will do the same…

    I hope this is helpful to someone, and I thank the original author, but suggest better, more complete and clearer code examples in the future.

  25. James Carlyle-Clarke Says:

    Apologies for the multiple posts – the web site said the posts had failed and gave an error.

    The last post above is the final version of the three.

    And now for the source code…

  26. James Carlyle-Clarke Says:

    Forgot to say that I also handled the Form’s FormClosing event in Form1_FormClosing()

  27. James Carlyle-Clarke Says:

    using System;
    using System.Collections.Generic;
    using System.ComponentModel;
    using System.Data;
    using System.Drawing;
    using System.Linq;
    using System.Text;
    using System.Windows.Forms;

    using Microsoft.DirectX.DirectSound;

    using System.Threading;

    namespace Echo
    {
    public partial class Form1 : Form
    {
    public Form1()
    {
    InitializeComponent();
    }

    private void StartButton_Click(object sender, EventArgs e)
    {
    if (!StartEcho("Input", this)) // change "Input" to something suitable
    MessageBox.Show("No matching Sound Card was found");
    // These parameters are guessed at – owner WAS Native.GetDesktopWindow()
    }

    private void StopButton_Click(object sender, EventArgs e)
    {
    IsReady = false;
    }

    private void Form1_FormClosing(object sender, FormClosingEventArgs e)
    {
    IsReady = false; // ensures the playback thread shuts down
    }

    // botch – not sure if these and IsReady are thread safe for multiple threads
    public int _dwCaptureBufferSize, _dwOutputBufferSize, _dwNotifySize;
    public CaptureBuffer _dwCapBuffer;
    public SecondaryBuffer[] _dwDevBuffers;
    public Thread _dwCaptureThread;

    public bool IsReady = false;
    // IsReady should be true while you wish to capture and then output the sound.

    private bool StartEcho(string captureDescriptor, Control owner)
    {
    // string captureDescriptor – string for eg “Mic”, “Input”
    // Control owner – maybe Window or Form would do for this – was Native.GetDesktopWindow()
    // if windowless application use desktop window as message broker
    // Returns true for setup done and thread started, false for problem

    // Choose a Wave format, calculating BlockAlign and AverageBytesPerSecond

    var format = new WaveFormat
    {
    SamplesPerSecond = 96000,
    BitsPerSample = 16,
    Channels = 2,
    FormatTag = WaveFormatTag.Pcm
    };

    // Both of these are calculate for All channels
    // BlockAlign = BytesPerSampleAllChannels, AverageBytesPerSecond = BytesPerSecondAllChannels
    format.BlockAlign = (short)(format.Channels * (format.BitsPerSample / 8));
    format.AverageBytesPerSecond = format.SamplesPerSecond * format.BlockAlign;

    // Set the size of input and output buffers

    // Multiplier of both delay and minimum buffer size in units of 1/16th secs,
    int NUM_BUFFERS = 8;

    // Sets _dwNotifySize to enough bytes for 1/16th of a second, all channels
    // Note that this was 1/8th (ie line ended ‘/ 8);’), and output buffer size = capture size/2
    // But this was changed to allow output buffer size to be a multiple of BlockAlign
    _dwNotifySize = Math.Max(4096, format.AverageBytesPerSecond / (8*2));
    // rounds _dwNotifySize to a multiple of BlockAlign (BytesPerSampleAllChannel)
    _dwNotifySize -= _dwNotifySize % format.BlockAlign;

    // Capture buffer is looped – when the end is reached, it starts from the beginning again.
    // Capturing one should be twice as large as output – so that when completed capture
    // is being read to output buffer there is still room to for the buffer to keep filling
    // without overwriting the output. I think.
    _dwCaptureBufferSize = NUM_BUFFERS * _dwNotifySize * 2;
    _dwOutputBufferSize = NUM_BUFFERS * _dwNotifySize;

    // Create CaptureBufferDescriptor and actual capturing buffer
    // Enumerate all devices, choosing one containing the given string (captureDescriptor)
    var cap = default(Capture);
    var cdc = new CaptureDevicesCollection();
    for (int i = 0; i < cdc.Count; i++)
    {
    if (cdc[i].Description.ToLower().Contains(captureDescriptor.ToLower()))
    {
    cap = new Capture(cdc[i].DriverGuid);
    break;
    }
    }

    // Check a matching capture device was found
    if (cap == null)
    return false; // no matching sound card/capture device
    {

    // Make the description and create a CaptureBuffer accordingly
    var capDesc = new CaptureBufferDescription
    {
    Format = format,
    BufferBytes = _dwCaptureBufferSize
    };

    _dwCapBuffer = new CaptureBuffer(capDesc, cap);

    // Create output device and buffers

    // Uses default speakers to output – choose output device in same way as for capturing.
    var dev = new Device();
    // As DirectSound uses any window for a message pump we have to SetCooperativeLevel()
    dev.SetCooperativeLevel(owner, CooperativeLevel.Priority);

    // Set GlobalFocus=True if you want echo even if desktop window is not focused.
    var devDesc = new BufferDescription
    {
    BufferBytes = _dwOutputBufferSize,
    Format = format,
    DeferLocation = true,
    GlobalFocus = true
    };
    // Create two output buffers – this seems to avoid the buffer being locked and written
    // to while it's still playing, helping to avoid a sound glitch on my machine.
    _dwDevBuffers = new SecondaryBuffer[2];
    _dwDevBuffers[0] = new SecondaryBuffer(devDesc, dev);
    _dwDevBuffers[1] = new SecondaryBuffer(devDesc, dev);

    // Set autoResetEvent to be fired when it's filled and subscribe to buffer notifications

    var _resetEvent = new AutoResetEvent(false);
    var _notify = new Notify(_dwCapBuffer);
    // Half&half: one notification halfway through the capture buffer, one at the end
    var bpn1 = new BufferPositionNotify();
    bpn1.Offset = _dwCapBuffer.Caps.BufferBytes / 2 - 1;
    bpn1.EventNotifyHandle = _resetEvent.SafeWaitHandle.DangerousGetHandle();
    var bpn2 = new BufferPositionNotify();
    bpn2.Offset = _dwCapBuffer.Caps.BufferBytes - 1;
    bpn2.EventNotifyHandle = _resetEvent.SafeWaitHandle.DangerousGetHandle();

    _notify.SetNotificationPositions(new BufferPositionNotify[] { bpn1, bpn2 });

    IsReady = true; // ready to capture sound

    // Fire worker thread to take care of messages
    // Note that on a uniprocessor, the new thread may not get any processor time
    // until the main thread is preempted or yields, eg by ending button click event or
    // calling Thread.Sleep(0)

    // botch – not sure if these are thread safe for multiple threads
    int offset = 0;
    int devbuffer = 0;

    // Make a new thread – as countained in the { }
    _dwCaptureThread = new Thread((ThreadStart)delegate
    // *********************************************************************
    {
    _dwCapBuffer.Start(true); // start capture

    // IsReady – This should be true while you wish to capture and then output the sound.
    while (IsReady)
    {
    _resetEvent.WaitOne(); // blocks thread until _dwCapBuffer is half/totally full
    // Read the capture buffer into an array, and output it to the next DevBuffer
    var read = _dwCapBuffer.Read(offset, typeof(byte), LockFlag.None, _dwOutputBufferSize);
    _dwDevBuffers[devbuffer].Write(0, read, LockFlag.EntireBuffer);

    // Update offset
    offset = (offset + _dwOutputBufferSize) % _dwCaptureBufferSize;

    // Play the sound
    _dwDevBuffers[devbuffer].SetCurrentPosition(0);
    _dwDevBuffers[devbuffer].Play(0, BufferPlayFlags.Default);
    devbuffer = 1 - devbuffer; // toggle between 0 and 1
    }
    _dwCapBuffer.Stop(); // stop capture
    // *********************************************************************
    });

    _dwCaptureThread.Start(); // start the new Thread

    return true;
    }
    }

    }
    }

  28. James Carlyle-Clarke Says:

    (Shame about all the formatting…!)

    I’ve just tested it again after some sleep and it’s still glitching – can anyone suggest why? Maybe I’ll try writing code from scratch and see if that helps…

  29. Viktor Says:

    Hello!

    The code works for me! Thx! (windows 7 x64)
    But i have a problem:
    I’m capturing a webcam at the same time and the sound is laging about 0.4-0.7 seconds.
    Can anyone help to “make the sound faster”?
    Any solutions?

    Thx
