Support software development to coordinate gestures of CAESAR with voice commands.
Progress was slow this past week, but the problem is now essentially understood. Below is the diagram that shows how voice command coordination and voice response should work to help coordinate CAESAR remotely.
The schematic shows that human voice commands are analyzed by the Pocketsphinx voice recognition program. Pocketsphinx produces a single word or a string of words. This output is matched against the established set of words that correspond to commands sent to the robot. If the matching is unsuccessful, it should trigger the Festival speech synthesizer to produce a voice response asking for clarification or confirmation of the given command.
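The matching step described above can be sketched as follows. This is only a minimal illustration, not our actual program: the vocabulary table, the command names, and the wording of the clarification prompt are all assumptions, and the recognized text is passed in as a plain string rather than read from Pocketsphinx.

```python
# Hypothetical vocabulary: recognized phrase -> command name sent to the robot.
COMMANDS = {
    "go forward": "FORWARD",
    "turn right": "RIGHT",
    "stop": "STOP",
}


def normalize(text):
    """Lowercase and collapse whitespace so 'Go  Forward' matches 'go forward'."""
    return " ".join(text.lower().split())


def match_command(recognized):
    """Return (command, None) on a match, or (None, prompt) when no match is found.

    The prompt is the text that would be handed to the speech synthesizer
    to ask the operator for clarification.
    """
    phrase = normalize(recognized)
    if phrase in COMMANDS:
        return COMMANDS[phrase], None
    return None, "I did not understand '%s'. Please repeat the command." % phrase
```

For example, match_command("Go Forward") returns the command name with no prompt, while an unknown phrase returns no command and a clarification prompt instead.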
When the words match a command familiar to the robot, the Arduino (or other microcontroller) receives the command(s) from Pocketsphinx and then executes them. Such commands could be navigating the robot (e.g. “Go Forward”, “Turn Right”, or “Stop”), manipulating the hands and objects held in them (e.g. “Pick up a yellow ball”, “Drop the ball”), locating objects and reporting their position to the operator (e.g. “Locate the blue block”, “Where is the red rectangle”), and finally responding to voice communication (e.g. “What is your name?”, “How old are you?”, “Where do you live?”, “What is the weather like today?”, etc.).
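One possible way to sort a recognized phrase into the four groups of commands listed above is a simple prefix table. The category names and the prefixes below are assumptions chosen to fit the example phrases; a real vocabulary would need a more careful grammar.

```python
# Hypothetical categories and the phrase prefixes that select them.
CATEGORY_PREFIXES = [
    ("navigate", ("go ", "turn ", "stop")),
    ("manipulate", ("pick up ", "drop ")),
    ("locate", ("locate ", "where is ")),
    ("converse", ("what is ", "how old ", "where do ")),
]


def categorize(phrase):
    """Return the category of a recognized phrase, or 'unknown' if none fits."""
    text = phrase.lower().strip()
    for category, prefixes in CATEGORY_PREFIXES:
        # str.startswith accepts a tuple and checks each prefix in turn.
        if text.startswith(prefixes):
            return category
    return "unknown"
```

Phrases that fall into the "unknown" group are the ones that would trigger a request for clarification.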
Additionally, certain keywords could be used to shut down the program. The robot can be shut down by having the code around Pocketsphinx stop listening when it hears a specific command and bring the robot to a stop (if it was moving). Light signals can give an additional indication that the robot has stopped listening. At this moment Festival can synthesize an acknowledgment of the last command.
Ideally, voice commands could also be used to “wake up” the robot or attract its attention. Again, certain keywords should be used (e.g. “CAESAR wake up”). As in the shutdown sequence, the robot should indicate and acknowledge that it can hear you by having the Festival synthesizer say, for example, “I am listening, sir”, “I am ready”, or “I am waiting for your commands, my friend.”
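The wake-up and shutdown behavior can be thought of as a two-state listener. The sketch below is an assumption about how this might be organized: the wake phrase comes from the example above, but the shutdown phrase, the action names, and the spoken replies are hypothetical, and no real speech or light hardware is involved.

```python
WAKE_PHRASE = "caesar wake up"     # taken from the example in the text
SLEEP_PHRASE = "caesar shut down"  # hypothetical shutdown keyword


class Listener:
    """Tracks whether the robot is currently paying attention to commands."""

    def __init__(self):
        self.awake = False

    def hear(self, phrase):
        """Return (action, reply); reply is text for the synthesizer, or None."""
        text = " ".join(phrase.lower().split())
        if not self.awake:
            if text == WAKE_PHRASE:
                self.awake = True
                return "wake", "I am listening"
            return "ignore", None  # asleep: everything else is ignored
        if text == SLEEP_PHRASE:
            self.awake = False
            # The caller would also stop the motors and switch the lights here.
            return "stop_and_sleep", "Shutting down"
        return "command", None  # awake: pass the phrase on for matching
```

While asleep, the listener ignores everything except the wake phrase, which is one way to keep the robot from reacting to ordinary conversation.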
As a first step, we should establish communication from Pocketsphinx to the Arduino unit. Once this is established, tuning the program will improve the robot's accuracy in recognizing and executing voice commands. The second stage will be communication from the Arduino to Festival and back to Pocketsphinx, so that the robot responds vocally and can interact with humans continuously. The last stage will be fine-tuning the parts and expanding both the vocabulary and the library of commands matched to it, as well as the robot's ability to perform various tasks.
Lisa Ali and Michael Zitolo
This week we continued our work with Caesar, the lab’s robot. We started the week by modifying the program we created last week so that it performs better. Once Caesar’s head was fully assembled, we were able to test our program on it. We tested Caesar’s eyes and calibrated the colors extensively so that detection would be more accurate. We succeeded in having both eyes (two cameras) detect four different colors (blue, yellow, green and red) along with a few different geometric shapes. We researched robot distance detection using two cameras to learn how others have accomplished this task and to help with our own project. Our C++ code grew enormously as we added a lot to it this past week. We also drafted Arduino code to determine coordinates in space and merged existing code (Jared’s code) with ours. We’ve compiled a program that receives coordinates from the two cameras (Caesar’s eyes) and sends those coordinates to an Arduino; the Arduino then communicates that information to an ArbotiX controller, which moves Caesar’s eyes. Overall it was a very productive week for us!
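One common approach to distance detection with two cameras is triangulation from disparity: the same object appears at a different horizontal pixel position in each image, and the shift shrinks as the object moves away. The sketch below is not our C++ or Arduino code. It assumes two parallel, calibrated cameras, and the focal length and baseline values in the usage note are made up for illustration.

```python
def depth_from_disparity(x_left, x_right, focal_px, baseline_cm):
    """Estimate distance (cm) to an object seen by two parallel cameras.

    x_left, x_right: horizontal pixel position of the object in each image.
    focal_px: focal length in pixels (assumed equal for both cameras).
    baseline_cm: distance between the two camera centers.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        # Zero or negative disparity means the match is wrong or the
        # object is effectively at infinity for this setup.
        raise ValueError("disparity must be positive")
    return focal_px * baseline_cm / disparity
```

With a hypothetical focal length of 800 pixels and a 6 cm baseline, an object detected at x = 340 in the left image and x = 300 in the right image would be estimated at 800 * 6 / 40 = 120 cm. If the eyes are turned toward each other rather than parallel, the geometry changes and this formula no longer applies directly.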
This week the design team worked on the app for ASD students. I did further research into Dr. Paul Ekman’s six universal emotions and created a series of hand sketches that convey them, bearing in mind the limitations of Robot-Head’s design, primarily its inability to turn the corners of its mouth. After we agreed that the sketches did not properly convey the happy, sad, and fearful expressions, I reworked them to include smiles and frowns. Once we decided on the final designs, I inked my original pencil sketches and we scanned them in to create our app. We then discussed how to redesign Robot-Head’s mouth so that it could smile and frown more convincingly. I also researched which colors are the most psychologically relaxing in order to choose a background color for our app. I then expanded my paper to include a section on how apps are used with autistic children and found that our project is the first to include an app centering on emotion recognition with robots.