Lab 2: Set up PlatformIO

Part 1: Set up PlatformIO with Visual Studio Code

In Visual Studio Code, search for and install PlatformIO IDE under the Extensions tab.

Click the PlatformIO Home button at the bottom left of the window. Then choose the Platforms tab and click Install Embedded Platform:

Search for "k210" and install the Kendryte K210 platform:

During installation, scroll down the page to see supported boards. Clicking the + button shows the ID for each board:

Part 2: Run the voice recognition sample and make your own recordings

Navigate to PIO Home and open the Maix-SpeechRecognizer project. Look at the project structure on the left:

The source files are in src/ folder. Configure the project first by opening platformio.ini. Modify the configurations so that it looks like this:

For board, use the ID of the Maix Bit-Mic board. For upload_speed, use 500000 (500 kbps). For monitor_port and upload_port, use the same port that is selected in Arduino. You can find and copy the port name under PIO Home > Devices.
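As a rough guide, a platformio.ini with these settings might look like the sketch below. The keys board, upload_speed, upload_port and monitor_port come from the description above. The platform and board IDs (kendryte210, sipeed-maix-bit-mic) and the framework line are assumptions, so confirm them against the IDs shown in PIO Home. COM3 is only a placeholder port name.

```ini
; Sketch only: the section name and IDs below are assumptions, check them in PIO Home
[env:sipeed-maix-bit-mic]
platform = kendryte210
board = sipeed-maix-bit-mic
framework = arduino
upload_speed = 500000
; Placeholder: replace with the port listed under PIO Home > Devices
upload_port = COM3
monitor_port = COM3
```

If the project already has an [env:...] section, keep its name and change only the values.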

Now that PlatformIO is configured, compile and upload the program to the Maix Bit board. Click the Upload button in the bottom toolbar:

If this does not work, upload with this command:


platformio run --target upload

You should see output like this:

The program is now running on the Maix Bit board, but it does not respond to voice commands. This is because of an error in the Maix Speech Recognition library. Open the file <userfolder>/.platformio/packages/framework-maixduino/libraries/Maix_Speech_Recognition/src/Maix_Speech_Recognition.cpp (Note: on Windows, the user folder is C:\Users\<username>; on macOS, it is /Users/<username>). Replace both line 58 and line 69 with the following code:


s_tmp = (int16_t)(g_rx_dma_buf[2 * i + 1] & 0xffff);

Now recompile and upload the project. Say "Hey Friday" or "Hey Jarvis". It should now recognize and respond to voice commands (though poorly).
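To see what the replacement line does, here is a minimal standalone sketch that can be compiled on a PC. It assumes that g_rx_dma_buf holds 32-bit words, that index 2 * i + 1 selects the word carrying sample i, and that the audio sample sits in the low 16 bits of that word. None of these details is confirmed by this lab. The helper extract_sample and the buffer fake_rx_buf are hypothetical and not part of the library.

```cpp
#include <cstdint>

// Hypothetical helper mirroring the replacement line: keep the low 16 bits
// of the word at index 2 * i + 1 and reinterpret them as a signed sample.
constexpr int16_t extract_sample(const uint32_t *rx_buf, int i) {
    return (int16_t)(rx_buf[2 * i + 1] & 0xffff);
}

// Made-up stand-in for g_rx_dma_buf. Words at even indices are never read.
constexpr uint32_t fake_rx_buf[] = {
    0xdeadbeef, 0x00000064,  // sample 0: low 16 bits are 0x0064 (100)
    0xdeadbeef, 0x1234ff9c,  // sample 1: low 16 bits are 0xff9c (-100 as int16_t)
};
```

With this buffer, extract_sample(fake_rx_buf, 0) gives 100 and extract_sample(fake_rx_buf, 1) gives -100: the mask discards the upper 16 bits, and the cast to int16_t restores the sign.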

In line 13 of main.cpp, change RECORD_MODE to 1. Re-upload the program and open the Arduino Serial Monitor. You should see the following text in the Serial Monitor:

Start recording... speeking...

Say your own command and observe the output in Serial Monitor. The Maix Speech Recognition library extracts MFCC features from your voice, and outputs them to the serial port. You may make multiple recordings by pressing the reset button on the Maix Bit board.

Part 3: Modify the sample to recognize your own commands

1. Look at voice_model.h. These are sample MFCC frames for each command. Each sample is represented by its frame count (e.g. fram_num_hey_friday_0) and an array of its MFCC features (e.g. hey_friday_0). Notice that there are 2628 elements in each array, the same length as each recording output from the serial port in Part 2.
2. Make your own recordings of two other commands (different from the two provided by the sample) from the serial port. Replace the sample fram_num and MFCC features in voice_model.h with yours.
3. In main.cpp, look for the strings that are displayed after each recognition. Modify them as necessary to reflect your commands in step 2.
4. Change RECORD_MODE back to 0, then re-run the program. Test your changes. Demonstrate your changes to a TA.

Note: You may have noticed that each recording is padded with many zeros. This is because the time limit specified in the Maix Speech Recognition library for each recording is 2.2 seconds. If you do not need the commands to be that long, you can reduce this limit in /Users/<username>/.platformio/packages/framework-maixduino/libraries/Maix_Speech_Recognition/src/util/MFCC.h, line 15:


#define vv_tim_max  2200

Change 2200 to something smaller (unit: milliseconds). If you then compile the program, however, you will see errors like this:


src/voice_model.h:91:1: error: too many initializers for 'const int16_t [1188]' {aka 'const short int [1188]'}

I used 1000 ms for the time limit. The error implies that I should trim the recordings in voice_model.h to a length of 1188. If you want to make new recordings with a different time limit like I did, you may fill the arrays with zeros (the number of zeros should match that indicated in the error message), and set fram_num to zero.
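The required array length can also be estimated before compiling. The sketch below assumes 20 ms frames, a new frame every 10 ms, and 12 MFCC values per frame. These parameters are not stated in this lab. They are inferred only because they reproduce both lengths quoted here (2628 for 2200 ms and 1188 for 1000 ms), so check them against MFCC.h before relying on them. The names model_length, my_command_0 and fram_num_my_command_0 are hypothetical, and the uint16_t type of the frame count is a guess.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed framing parameters (inferred, not taken from MFCC.h).
constexpr int kFrameMs = 20;       // length of one analysis frame
constexpr int kHopMs = 10;         // time step between consecutive frames
constexpr int kMfccPerFrame = 12;  // MFCC values stored per frame

// Number of int16_t elements one recording occupies for a given time limit.
constexpr std::size_t model_length(int time_limit_ms) {
    return static_cast<std::size_t>(
        ((time_limit_ms - kFrameMs) / kHopMs + 1) * kMfccPerFrame);
}

// Both lengths quoted in this lab are reproduced by the assumed parameters.
static_assert(model_length(2200) == 2628, "default 2.2 s limit");
static_assert(model_length(1000) == 1188, "1000 ms limit");

// Hypothetical placeholder entry for a 1000 ms limit: "= {0}" zero-fills
// the whole array, and the frame count is set to zero as described above.
const int16_t my_command_0[model_length(1000)] = {0};
const uint16_t fram_num_my_command_0 = 0;
```

Note that C++ rejects an initializer list only when it has more elements than the array, which is the "too many initializers" error shown above. A shorter list is accepted, and the remaining elements are set to zero.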

You may also want to add additional recordings to improve recognition. Look for addVoiceModel in main.cpp. Learn how to call it by reading the existing code. Refer to its implementation in Maix_Speech_Recognition.cpp for more information.