A few weeks before the first-ever Oracle-sponsored Maker Faire, I was experimenting with some of the cognitive vision (image recognition) APIs available. The Google Vision API, IBM Watson Visual Recognition, and the Microsoft Computer Vision API are some of the biggest players in this field right now.
After testing all of them, I found the idea of Microsoft’s CaptionBot really compelling: upload an image to the CaptionBot and it will try to come up with a coherent caption based on a mashup of three of their cognitive services (Computer Vision API + Emotion API + Bing Image API). I wrote an iOS app (with its own Swift framework) to consume this mashup and took it for a spin.
I gave my phone to my kids to test the app. They ran around the house, truly amused that they could point the camera at an object and get a description back.
So when the call came to create a project with my kids for the Oracle Maker Faire, we sat down and started brainstorming. The concept was still fresh in their minds: a computer could make a close guess at what an object is and even guess a facial expression.
They came up with a plan and a name: Emotibot, an emotion-sensing robot. We drove to the closest Home Depot to find materials and came across an LED Glow Ball Lamp that worked perfectly as Emotibot’s head.
We used the following materials to build our robot:
- Raspberry Pi 3
- Raspberry Pi Camera
- Ultrasonic Ping sensor
- Adafruit 16-Channel 12-bit PWM/Servo Driver
- 1 Mini Servo
- 2 blink(1) USB LEDs
- LED Glow Ball Lamp
- Peep n’ Peepers Flashing Eye Lights
- Speaker
- Mic stand
- Hair band (for the mouth)
The robot worked as follows:
- The ultrasonic ping sensor detected when someone came within about 10 inches of the robot (a minimal distance-check sketch follows this list).
- The robot started to talk using festival-lite (flite). The mouth servo was synchronized with the speech by counting the words and flapping the mouth once per spoken word (see the mouth-sync sketch below).
- A picture was snapped and submitted to the Microsoft Emotion API. The JSON result was parsed, and Emotibot spoke the detected emotion with flite.
- Using the blink(1) USB LEDs, the robot changed color to match the emotion (the last sketch below covers this step and the previous one).
- At the end we also added a microphone array for voice interaction, but since we knew the faire would be noisy, we didn’t enable that part during the event.
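For anyone curious about the trigger, here is a rough sketch of the distance check. It assumes the Pi-side code is Python (the post doesn’t say) with the RPi.GPIO library and an HC-SR04-style ping sensor; the pin numbers and the 10-inch threshold below are placeholders to tune for your own wiring.

```python
import time
import RPi.GPIO as GPIO

TRIG_PIN = 23          # placeholder wiring -- adjust to your setup
ECHO_PIN = 24
TRIGGER_INCHES = 10.0  # "someone is close" threshold

GPIO.setmode(GPIO.BCM)
GPIO.setup(TRIG_PIN, GPIO.OUT)
GPIO.setup(ECHO_PIN, GPIO.IN)

def distance_inches():
    # A 10-microsecond pulse on TRIG starts one measurement.
    GPIO.output(TRIG_PIN, True)
    time.sleep(0.00001)
    GPIO.output(TRIG_PIN, False)

    # Time how long ECHO stays high.
    start = time.time()
    while GPIO.input(ECHO_PIN) == 0:
        start = time.time()
    stop = time.time()
    while GPIO.input(ECHO_PIN) == 1:
        stop = time.time()

    # Sound travels roughly 13,504 inches per second; halve for the round trip.
    return (stop - start) * 13504 / 2

def wait_for_visitor():
    while True:
        if distance_inches() <= TRIGGER_INCHES:
            return
        time.sleep(0.2)
```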
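The mouth sync was the crowd favorite, so here is a rough sketch of the idea: start flite in the background and flap the servo once per word. It assumes the legacy Adafruit_PCA9685 Python library for the 16-channel servo driver; the channel number, pulse lengths, and per-word timing are placeholders you would tune by ear.

```python
import subprocess
import time
import Adafruit_PCA9685  # legacy Adafruit 16-channel PWM/servo library

pwm = Adafruit_PCA9685.PCA9685()
pwm.set_pwm_freq(60)     # typical frequency for analog hobby servos

MOUTH_CHANNEL = 0        # placeholder channel on the servo driver
MOUTH_OPEN = 450         # pulse lengths out of 4096 -- tune for your servo
MOUTH_CLOSED = 300
SECONDS_PER_WORD = 0.3   # rough guess at flite's speaking rate

def say(text):
    # Start flite speaking in the background...
    speech = subprocess.Popen(['flite', '-t', text])
    # ...and flap the mouth once per word while it talks.
    for _ in text.split():
        pwm.set_pwm(MOUTH_CHANNEL, 0, MOUTH_OPEN)
        time.sleep(SECONDS_PER_WORD / 2)
        pwm.set_pwm(MOUTH_CHANNEL, 0, MOUTH_CLOSED)
        time.sleep(SECONDS_PER_WORD / 2)
    speech.wait()

say("Hello, I am Emotibot. Let me guess how you feel.")
```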
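Finally, a rough sketch of the Emotion API round trip plus the blink(1) color change. The standalone Emotion API has since been retired into Microsoft’s Face API, so treat the endpoint as historical; the region, subscription key, helper names, color table, and the raspistill capture call are all placeholders, not our exact setup.

```python
import subprocess
import requests  # third-party HTTP library

EMOTION_URL = 'https://westus.api.cognitive.microsoft.com/emotion/v1.0/recognize'
SUBSCRIPTION_KEY = 'YOUR_KEY_HERE'   # placeholder

# Rough emotion-to-color table for the blink(1) LEDs (placeholder choices).
COLORS = {
    'happiness': '0,255,0',
    'sadness':   '0,0,255',
    'anger':     '255,0,0',
    'surprise':  '255,255,0',
    'neutral':   '255,255,255',
}

def snap_photo(path='/tmp/face.jpg'):
    # raspistill is the stock Raspberry Pi camera command-line tool.
    subprocess.call(['raspistill', '-o', path, '-t', '500'])
    return path

def detect_emotion(image_path):
    headers = {
        'Ocp-Apim-Subscription-Key': SUBSCRIPTION_KEY,
        'Content-Type': 'application/octet-stream',
    }
    with open(image_path, 'rb') as f:
        faces = requests.post(EMOTION_URL, headers=headers, data=f.read()).json()
    if not faces:
        return None
    # Each detected face carries a "scores" dict; keep the strongest emotion.
    scores = faces[0]['scores']
    return max(scores, key=scores.get)

def react(emotion):
    # blink1-tool drives the blink(1) USB LEDs from the command line.
    subprocess.call(['blink1-tool', '--rgb', COLORS.get(emotion, '255,255,255')])
    subprocess.call(['flite', '-t', 'You look %s to me.' % emotion])

emotion = detect_emotion(snap_photo())
if emotion:
    react(emotion)
```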
Overall the project was a success. I was able to involve my kids, and they learned some concepts along the way. If anyone is interested in seeing the code, hit me up in the comments and I might put it on GitHub.
@noel: Your daughters are going to take over the world; whether as UI designers, inventors, artists, makers, developers, or CEOs is still TBD. Do you think this is a project that could be scaled for a classroom setting? I have a friend who is an elementary school teacher in Redwood City. She recently set up a “maker lab” at her charter school. They have a 3D printer, laptops, etc. Would this kind of project be feasible in a classroom, or do you think it requires too much technical expertise from the adult/teacher/advisor?
It is a great and complex build overall. There are lots of “sensory interfaces” layered on top of the underlying black magic. I particularly like the eyeball effect and the mouth-voice syncing, since you can actually see them working.
sathyasun72 at gmail.com: Can you share the code on GitHub?