deleted 549 characters in body
Martijn Pieters

  1. Arduino and Raspberry Pi are prototyping boards built around small chips, so focus on the chip first. Look for one that comes with a DSP (digital signal processing) toolbox; you may already have a DSP toolbox and not know it. DSP toolboxes provide ready-made algorithms such as the FFT (fast Fourier transform) and IFFT (inverse FFT) for fast frequency-domain analysis.

  2. Focus on your data structure: are your samples in a stack or a queue? You will want a queue for this type of data. A queue looks like:

     Position NO --|1|2|3|4|5|6|7|8|
     Sample Value  |5|7|9|1|2|2|9|8|
    

    Next iteration:

     Position NO --|1|2|3|4|5|6|7|8|
     Sample Value  |0|5|7|9|1|2|2|9|
     ->  First in First out (FIFO)
    

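Notice how things shift toward the right? I think what you described is a "circular" algorithm. The shifting version works like this: overwrite the oldest sample with the second oldest, then the second oldest with the third oldest, and so on, all the way to the front of the queue, where you insert your newest sample. A true circular buffer gets the same result without copying anything: it keeps a write index that wraps around and overwrites the oldest slot.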
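The two variants above can be sketched in Python roughly as follows. This is only an illustration: the `RingBuffer` class and its method names are made up for this example, and the sample values are the ones from the diagram.

```python
from collections import deque


def shift_in(queue, sample):
    """Shifting variant: the newest sample enters at position 1 and the
    oldest one falls off the far end (deque does this when maxlen is set)."""
    queue.appendleft(sample)
    return list(queue)


class RingBuffer:
    """Circular variant (hypothetical helper): nothing is copied, a write
    index wraps around and overwrites the oldest slot."""

    def __init__(self, size):
        self.size = size
        self.data = [0] * size
        self.head = 0  # index of the slot that will be written next

    def push(self, sample):
        self.data[self.head] = sample
        self.head = (self.head + 1) % self.size

    def newest_first(self):
        """Return the contents ordered from newest to oldest."""
        return [self.data[(self.head - 1 - i) % self.size]
                for i in range(self.size)]


# Same data as the diagram: position 1 holds the newest sample.
queue = deque([5, 7, 9, 1, 2, 2, 9, 8], maxlen=8)
print(shift_in(queue, 0))

ring = RingBuffer(8)
for s in [8, 9, 2, 2, 1, 9, 7, 5]:  # oldest sample pushed first
    ring.push(s)
ring.push(0)
print(ring.newest_first())
```

Both print the same "next iteration" row as the diagram. On a small microcontroller the ring-buffer form is usually the cheaper one, since each new sample costs one write instead of a full shift.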

  3. "The code is continuously taking fixed length samples, say 10msec" is not quite the right picture. Think of it this way: the code takes discrete, quantized (amplitude) samples at a fixed sampling rate. At 10,000 samples per second, for example, consecutive samples are 0.1 ms apart.
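To make the "discrete, quantized samples" idea concrete, here is a small Python sketch. The 10,000 samples per second figure is the one used above; the 16-level quantizer, the 440 Hz test tone and the function names are assumptions made just for this example.

```python
import math

SAMPLE_RATE = 10_000  # samples per second, as in the example above
LEVELS = 16           # assumed quantizer resolution (4 bits)


def sample_period_ms(rate):
    """Time between two consecutive samples, in milliseconds."""
    return 1000.0 / rate


def quantize(x, levels=LEVELS):
    """Map an amplitude in [-1.0, 1.0] to an integer level 0..levels-1."""
    x = max(-1.0, min(1.0, x))  # clip out-of-range input
    return min(levels - 1, int((x + 1.0) / 2.0 * levels))


def sample_tone(freq_hz, count, rate=SAMPLE_RATE):
    """Take `count` quantized samples of a sine wave of frequency freq_hz."""
    return [quantize(math.sin(2 * math.pi * freq_hz * i / rate))
            for i in range(count)]


print(sample_period_ms(SAMPLE_RATE))  # spacing between samples in ms
print(sample_tone(440, 10))           # first ten quantized samples
```

Each list entry is one sample: a single number taken at one instant, not a 10 ms chunk of sound.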

What is your sampling frequency? What is the bit depth of your quantizer? Lower numbers will help you free up memory. I would suggest a low sampling rate like 6600 samples per second, which by Nyquist captures frequencies up to 3300 Hz, roughly the telephone voice band. I suspect 4 bits (16 levels) would be sufficient for recognition. So that's 3300 bytes of recording per second. Now do an FFT and keep only the frequency bins you actually need for recognition; if you keep about half of them, you are down to roughly 1650 bytes for one second of sound. These DSP tricks will save a lot of memory.

I don't know who thinks 512 MB is small. At 1650 bytes per second, that is 300,000+ seconds of recording, which is over 3 days solid.
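The arithmetic behind those numbers, as a quick Python check. The function names are just for illustration, and 512 MB is taken as 512 × 1024 × 1024 bytes, which is an assumption about how the memory size is counted.

```python
def bytes_per_second(sample_rate_hz, bits_per_sample):
    """Raw storage rate for uncompressed samples."""
    return sample_rate_hz * bits_per_sample / 8


def recording_seconds(memory_bytes, rate_bytes_per_s):
    """How many seconds of sound fit in the given memory."""
    return memory_bytes / rate_bytes_per_s


MEMORY_BYTES = 512 * 1024 * 1024      # assumed meaning of "512 MB"

raw_rate = bytes_per_second(6600, 4)  # 6600 samples/s at 4 bits each
reduced_rate = raw_rate / 2           # the halved figure from the text
seconds = recording_seconds(MEMORY_BYTES, reduced_rate)
days = seconds / 86_400

print(raw_rate, reduced_rate)
print(round(seconds), round(days, 2))
```

Even at the raw 3300 bytes per second, the same memory still holds well over a day of sound.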

  4. I think you will find the frequency domain (reached with the FFT) a better environment for voice recognition than the raw time-domain samples.
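As a rough illustration of what the frequency domain gives you, the sketch below finds the strongest frequency in a block of samples. To stay dependency-free it uses a plain (slow) discrete Fourier transform rather than a real FFT; the FFT from a DSP toolbox computes the same spectrum much faster. The function names, the 6600 Hz rate and the 500 Hz test tone are assumptions for this example only.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitude of each frequency bin from 0 Hz up to the Nyquist limit.
    Plain O(n^2) DFT, standing in for the FFT."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        mags.append(abs(acc))
    return mags


def dominant_frequency(samples, rate):
    """Frequency in Hz of the strongest bin, ignoring the DC bin."""
    mags = dft_magnitudes(samples)
    k = max(range(1, len(mags)), key=mags.__getitem__)
    return k * rate / len(samples)


RATE = 6600  # samples per second (assumed)
tone = [math.sin(2 * math.pi * 500 * t / RATE) for t in range(66)]
print(dominant_frequency(tone, RATE))
```

With 66 samples at 6600 samples per second each bin is 100 Hz wide, so the 500 Hz tone lands exactly in bin 5. A recognizer would compare patterns across many such bins rather than just picking the largest one.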

I hope I didn't confuse you further :)