Thursday, February 9, 2017

Tango - I Spy

Project Status: App Idea

Requirements

This app would require a Google Tango device to run. The development difficulty is quite a bit higher than that of the previous app ideas posted on this blog.

The Idea

The gameplay is based on the children's game often called I Spy. The app would scan and learn the room the player is in and use computer vision to secretly select objects within the environment. The player then tries to guess which object the app is thinking of, much like the game played in real life. Not only would this app potentially be fun to play, it could also be used to help train computer vision neural networks.

Gameplay

When you open the app, the game may take a second to acclimate. This is basically how Tango works now, as it often says "Hold Tight," only with the added step that the app will have to use computer vision to quickly find an object in the field of view that it recognizes.

The very first object selected by the app each time it opens will likely be super obvious and easy to guess, which will serve several purposes. For one thing, the quicker it can find an object, the faster the game will start. This saves the player from having to scan large portions of the room before playing. We know that the player will naturally move the device around, so we can scan the room during gameplay. However, if no objects are visible or recognizable right away, we may need to ask the player to slowly look around a bit with the device until one is found.

Another benefit of this quick and easy recognition is that it will help us understand our player. If he/she is not able to quickly guess an obvious object within a small field of vision, then we will know to keep the object selection and gameplay easy, as it is likely that this is a child or someone who would have a hard time guessing in general. On the other hand, if he/she quickly identifies the object, we will know that we can safely step up the difficulty when possible.
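To make the difficulty idea a little more concrete, here is a rough Python sketch of how the app might raise or lower a difficulty level after each round. Everything in it is an assumption for illustration: the `DifficultyTracker` name, the five levels, and the time and wrong-guess thresholds are made up and would need tuning with real players.

```python
from dataclasses import dataclass


@dataclass
class DifficultyTracker:
    """Hypothetical helper that adapts difficulty to the player."""
    level: int = 1               # 1 = easiest
    min_level: int = 1
    max_level: int = 5           # assumed number of levels
    fast_seconds: float = 10.0   # assumed cutoff for a "quick" guess
    slow_seconds: float = 45.0   # assumed cutoff for a "slow" guess

    def record_round(self, seconds_to_guess: float, wrong_guesses: int) -> int:
        """Update the level after one round and return the new level."""
        if seconds_to_guess <= self.fast_seconds and wrong_guesses == 0:
            # Quick, clean guess: step the difficulty up.
            self.level = min(self.max_level, self.level + 1)
        elif seconds_to_guess >= self.slow_seconds or wrong_guesses >= 3:
            # Slow round or many misses: keep things easy.
            self.level = max(self.min_level, self.level - 1)
        return self.level


tracker = DifficultyTracker()
tracker.record_round(seconds_to_guess=5.0, wrong_guesses=0)   # level goes up
tracker.record_round(seconds_to_guess=60.0, wrong_guesses=1)  # level goes back down
```

The level could then drive object selection, for example picking smaller or less obvious objects at higher levels.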

The app will need to provide a hint to start the player off, such as "I spy something red." The player can walk and look around and touch objects to guess whether they have found the object that the app is thinking of. When a guess is wrong, more hints may be provided over time to help guide the player. If the player is having a hard time, the app could even use visual feedback to show whether the player is getting "warmer" or "colder" as the device is moved around.
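The "warmer" or "colder" feedback could be as simple as checking whether the device moved closer to the hidden object since the last update. The sketch below assumes we already have the device position and the object position as (x, y, z) points in meters in the same coordinate frame. The function name and the `dead_zone` value are my own placeholders, not anything from the Tango SDK.

```python
import math


def warmer_or_colder(previous_pos, current_pos, target_pos, dead_zone=0.05):
    """Compare distance to the target before and after a device move.

    Positions are (x, y, z) tuples in meters. dead_zone is an assumed
    tolerance so tiny hand jitters do not flip the feedback.
    """
    before = math.dist(previous_pos, target_pos)
    after = math.dist(current_pos, target_pos)
    change = before - after  # positive means the player got closer
    if change > dead_zone:
        return "warmer"
    if change < -dead_zone:
        return "colder"
    return "same"


# The player steps one meter toward an object three meters away.
print(warmer_or_colder((0, 0, 0), (1, 0, 0), (3, 0, 0)))  # prints "warmer"
```

A fuller version would probably also consider where the camera is pointing, not just where the device is.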

Hints could be basic like "look down" or "try higher," or more complex like "it is shiny," "it may be wooden," or "the object is soft and fuzzy." Furthermore, if it happens to be a pretty common object that the app understands, the clues could be more specific. For a door, for example: "you use it," "it swings," or "it's closed."
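One way to picture the hint system is as a list of sentence templates filled in from whatever the app knows about the object, ordered from vague to specific. This is only a sketch: the attribute keys (`color`, `height`, `material`, `action`) and the `build_hints` function are invented for the example, and real recognition results would look different.

```python
# Templates ordered from the vaguest hint to the most specific one.
# The attribute names are hypothetical placeholders.
HINT_TEMPLATES = [
    ("color", "I spy something {}."),
    ("height", "Try looking {}."),
    ("material", "It may be {}."),
    ("action", "You {} it."),
]


def build_hints(obj):
    """Return hints for every attribute the app knows about the object."""
    return [
        template.format(obj[key])
        for key, template in HINT_TEMPLATES
        if key in obj
    ]


door = {"color": "white", "material": "wooden", "action": "open"}
for hint in build_hints(door):
    print(hint)
# I spy something white.
# It may be wooden.
# You open it.
```

The game could reveal one hint from this list after each wrong guess.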

It is possible to do a little machine learning here. At the end of each round we could let the player point out whether any hints were misleading or flat out wrong. Of course, we could playfully apologize while noting that we are a simple-minded AI character that he/she is helping to teach, promising to try to do better in the future.
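Before any real machine learning, the feedback could simply be counted. The sketch below tracks how often each hint was shown for an object label and how often players flagged it as misleading, then retires hints that get flagged too often. It is plain bookkeeping rather than a neural network, and the class name and the `min_shown` and `max_rate` thresholds are assumptions.

```python
from collections import defaultdict


class HintFeedback:
    """Hypothetical tally of player feedback on hints."""

    def __init__(self):
        self.shown = defaultdict(int)    # (label, hint) -> times shown
        self.flagged = defaultdict(int)  # (label, hint) -> times flagged

    def record(self, label, hint, misleading):
        """Store one piece of end-of-round feedback."""
        key = (label, hint)
        self.shown[key] += 1
        if misleading:
            self.flagged[key] += 1

    def should_retire(self, label, hint, min_shown=5, max_rate=0.5):
        """True once a hint has enough data and is flagged too often."""
        key = (label, hint)
        shown = self.shown[key]
        return shown >= min_shown and self.flagged[key] / shown > max_rate


feedback = HintFeedback()
for misleading in (True, True, True, False, False):
    feedback.record("door", "it is shiny", misleading)
print(feedback.should_retire("door", "it is shiny"))  # prints True
```

The same counts could later become training labels if the project grows into real model training.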

When the correct object is selected, the app could continue with the next object or allow the player to think of an object. Reversing the roles is quite a bit more of a challenge for the developers, but it could always be added later as a feature.

Training Computer Vision

By reversing the roles, allowing the player to pick objects, we now have a nice opportunity to train our neural networks and make our computer vision results better over time.

There are several creative ways to allow the player to provide hints to the app, as it now must guess the object. One way is to start out by letting the user pick from a dynamic color palette specific to this environment. Tapping a color would essentially be saying "I spy something the color..."
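A dynamic palette could be built by grouping the colors the camera has seen into coarse buckets and keeping the most common ones. The sketch below assumes the pixels arrive as plain (r, g, b) tuples with values from 0 to 255. The `build_palette` name and the bucket `step` of 64 are placeholders, and a real app would likely use a proper clustering method instead.

```python
from collections import Counter


def build_palette(pixels, size=5, step=64):
    """Return the most common coarse colors among the given pixels.

    Each channel is snapped to the center of a bucket that is `step`
    values wide, so similar shades count as the same color.
    """
    buckets = Counter()
    for r, g, b in pixels:
        key = tuple((c // step) * step + step // 2 for c in (r, g, b))
        buckets[key] += 1
    return [color for color, _ in buckets.most_common(size)]


# Four reddish pixels and three bluish ones from a pretend camera frame.
pixels = [(250, 10, 10)] * 4 + [(10, 10, 250)] * 2 + [(12, 8, 251)]
print(build_palette(pixels, size=2))  # prints [(224, 32, 32), (32, 32, 224)]
```

Each palette entry would become a tappable swatch, and the tapped color narrows down which objects the app should guess.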

After that point, the app could lead the player to objects it wants to guess and always provide three options to select from, something like: yes, no, or I don't know. "I don't know" could be worded many other ways, but it is basically the option for "what you are asking or guessing doesn't make sense, so please try something else or clarify." The app could continue to select objects or pose questions for the player to answer, such as "Is it food?" The questions would always be geared towards a yes/no answer and would often teach us something about the object in question.
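The three-answer idea maps nicely onto filtering a list of candidate objects. In this sketch a "yes" or "no" removes candidates that contradict the answer, while "I don't know" leaves the list alone so the app can try a different question. The `Answer` enum, the `narrow` function, and the `is_food` attribute are all invented for illustration.

```python
from enum import Enum


class Answer(Enum):
    YES = "yes"
    NO = "no"
    UNKNOWN = "I don't know"


def narrow(candidates, attribute, answer):
    """Keep only the candidates that are consistent with the answer.

    Candidates whose attribute is not known yet are kept, since the
    app cannot rule them out.
    """
    if answer is Answer.UNKNOWN:
        return list(candidates)
    wanted = answer is Answer.YES
    return [
        c for c in candidates
        if c["attributes"].get(attribute) in (wanted, None)
    ]


candidates = [
    {"name": "apple", "attributes": {"is_food": True}},
    {"name": "lamp", "attributes": {"is_food": False}},
    {"name": "mystery object", "attributes": {}},
]
remaining = narrow(candidates, "is_food", Answer.YES)
print([c["name"] for c in remaining])  # prints ['apple', 'mystery object']
```

Once the round ends and the real object is known, the answers could be stored as new attributes for it, which is where the training value comes from.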


Other ways of letting the player select objects would be speech recognition or a keyboard and an input box. Typing doesn't sound very fun in this game format, though, and we are already making our workload hard enough without having to deal with speech recognition for answers that would often fall into the same three categories of yes, no, or I don't know.

The Development

While this would be a really fun project to work on or be involved with, it is way over my skill level at the moment. Hopefully, though, someone will read this and take off running with the idea. Good luck to you, and please keep me posted on the progress. Right now I am simply following tutorials and trying to get anything working with Tango. Let me know if you would like more clarification on this idea. I often tend to rattle off parts of an idea that make sense in my head, but I lose people along the way by leaving out major portions of the inner workings.