The true story of what the robot Pepper can and cannot do

One of the projects of Diatom Enterprises’ R&D department in 2017 was the purchase of the Pepper robot, one of the first publicly available humanoid robots, in order to evaluate the possibility of using its built-in features and of adding supplemental ones.

This is what we knew at the time, based on the promotional booklets:

Pepper is one of the first humanoid robots on the market that is capable of identifying the principal human emotions: joy, anger and surprise. She is able to scan non-verbal language, such as the angle of a person’s head. But one of Pepper’s main advantages is the potential to use custom software development to adjust her behavior, AI, movements and many of the robot’s other exciting features according to your business needs.

AI Software for humanoid robot Pepper

We’re talking to Slava Dubovitsky, who is a certified Pepper developer at Diatom Enterprises – a company located in Latvia.

– The first question has to be about what’s available out of the box. Could the robot be used for business right away, or was it just a development platform? In other words, what could the robot do when taken out of the box?
– This is a very good question. When we purchase a computer device, we expect it to be fully functional right away, at least in terms of running certain programs. When launched for the first time and connected to the Internet, Pepper could turn his head and say “Hi!” There wasn’t anything else he could say in response. After 15 minutes of searching the manufacturer’s website, we found an additional software package that turned Pepper into a talking robot with the intelligence of a 3-year-old child. Pepper was able to answer simple questions about who he was and how old he was, and to say whether he had parents. He was incapable of remembering or recognizing people.

– Which language can the robot be programmed in? How convenient and effective is it?
– Pepper can be programmed in Python 2.7 using Choregraphe, a graphical environment developed especially for it. The latest firmware upgrade also allows programming Pepper for Android in Java. We program in Python. Python is a very common cross-platform language with a large collection of libraries. Choregraphe is cross-platform, too. You get used to it quickly enough. We did not stumble upon any particular difficulties. When programming, all the usual tools of the software development world can be used. In ‘client/server’ terms, Pepper acts as a client and can call the API methods of other services working online.
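
To make the client role concrete, here is a minimal sketch of how a Python program might pass a question to an online service and have Pepper speak the reply. It is an illustration, not Diatom’s code. It assumes the NAOqi Python SDK is installed when run against a robot (otherwise it just prints), and the robot address, the JSON request format, and the `fake_fetch` stand-in for a real online service are all hypothetical.

```python
import json

try:
    # NAOqi Python SDK; present only where the SDK is installed.
    from naoqi import ALProxy
except ImportError:
    ALProxy = None

ROBOT_IP = "192.168.1.10"  # hypothetical address of the robot
ROBOT_PORT = 9559          # default NAOqi port


def make_speaker(ip=ROBOT_IP, port=ROBOT_PORT):
    """Return a function that speaks on the robot, or prints when no SDK is found."""
    if ALProxy is None:
        def speak(text):
            print("Pepper says: " + text)
        return speak
    tts = ALProxy("ALTextToSpeech", ip, port)
    return tts.say


def answer_from_service(question, fetch, speak):
    """Send the question to an online service and speak whatever comes back.

    `fetch` takes a JSON request string and returns a JSON response string.
    The request and response shapes here are assumptions for illustration.
    """
    request_body = json.dumps({"question": question})
    response = json.loads(fetch(request_body))
    reply = response.get("answer", "Sorry, I do not know.")
    speak(reply)
    return reply


def fake_fetch(request_body):
    """Stand-in for a real HTTP call, so the sketch runs without a network."""
    question = json.loads(request_body)["question"]
    return json.dumps({"answer": "You asked: " + question})
```

Calling `answer_from_service("Who are you?", fake_fetch, make_speaker())` would exercise the whole path. In a real setup, `fake_fetch` would be replaced by an HTTP request to whichever service is being used.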

– What about speech recognition? How would you rate the robot in terms of Speech Recognition?
– Pepper has built-in speech recognition support for English and several other languages. However, this does not mean that Pepper can offer you an intelligent reply once it has recognized what you said. For that, additional AI programming is needed.

– What about face recognition, for instance? Could the robot remember and recognize faces?
– According to the creators of Pepper, it should be able to remember and recognize faces. However, in practice, we never did manage to use this feature. We created our own facial recognition system at Diatom. Now we can teach the robot to recognize people it hasn’t met before, and it can reliably get acquainted with and remember new people.

– Based on the experience accumulated, which application type is most suitable for this type of robot?
– Pepper has a built-in tablet that can display a user input interface as well as the robot’s status and mood. In addition, Pepper can operate as a server, running programs for different purposes. I would say that the most common application type for the tablet is the informational and introductory kind, which displays information and invites users to interact through manual input, selection on the tablet, or voice.

– Speaking about the speed of the robot in general. Is it possible to achieve a relatively quick reaction comparable to that of a person?
– This is a very interesting question. On average, Pepper answers spoken questions with a 2-5 second delay, depending on the Internet speed. After all, there are a great number of things happening behind the scenes. Pepper first runs speech recognition on the received audio over the Internet. Once the text is received, it gets analyzed. Then other specialized online AI services are called. After that, the information received is either shown on the tablet or spoken by Pepper. At this point it’s still difficult for Pepper to compete with human reaction time, but this is only a matter of time.
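
The chain of steps described in this answer can be sketched as a small timed pipeline, which also shows where a delay of several seconds can accumulate. This is only an illustration under stated assumptions: the stage names and the `fake_*` functions are hypothetical placeholders for the real recognition, analysis, and AI services, which the interview does not name.

```python
import time


def run_pipeline(audio, recognize, analyze, ask_service, respond):
    """Run the four stages in order and record how long each one takes."""
    stages = [
        ("recognize", recognize),      # audio -> text (online speech recognition)
        ("analyze", analyze),          # text -> structured query
        ("ask_service", ask_service),  # query -> answer from an online AI service
        ("respond", respond),          # answer -> speech or tablet output
    ]
    timings = {}
    value = audio
    for name, stage in stages:
        start = time.time()
        value = stage(value)
        timings[name] = time.time() - start
    timings["total"] = sum(timings.values())
    return value, timings


# Hypothetical stand-ins so the sketch runs without a robot or a network.
def fake_recognize(audio):
    return "what is the weather"


def fake_analyze(text):
    return {"intent": "weather", "text": text}


def fake_service(query):
    return "It is sunny."


def fake_respond(answer):
    return answer
```

Running `run_pipeline(b"", fake_recognize, fake_analyze, fake_service, fake_respond)` returns the final answer together with per-stage timings. With real services, the two online stages would be expected to dominate the total.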

AI, speech recognition, face recognition

– If we talk about humanoid robots, it would be reasonable to assume we expect them to have artificial intelligence. This robot came with basic AI elements. Could you tell more about them?
– It does come with AI, but it’s very inconvenient or even impossible to use. Therefore, here at Diatom we are developing proprietary AI solutions for Pepper. We came up with a software solution that allows Pepper to reply to the most common questions. For instance, Pepper can tell you the weather forecast for your town for the next few days or find the latest news from the world of politics and sports. Thanks to our development, Pepper knows Elton John and other world-famous people. Now Pepper can even dance for you.
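
One simple way to direct common questions to the right handler is keyword routing, sketched below. This is not Diatom’s proprietary solution, which the interview does not describe. It is a deliberately naive stand-in, and the handler functions and keywords are hypothetical.

```python
def route_question(text, handlers, fallback):
    """Pick the first handler whose keyword appears as a word in the question."""
    words = set(text.lower().replace("?", " ").split())
    for keyword, handler in handlers:
        if keyword in words:
            return handler(text)
    return fallback(text)


# Hypothetical handlers; real ones would call online weather or news services.
def weather_handler(text):
    return "Weather lookup would go here."


def news_handler(text):
    return "News lookup would go here."


def fallback_handler(text):
    return "Sorry, I cannot answer that yet."


HANDLERS = [
    ("weather", weather_handler),
    ("news", news_handler),
]
```

For example, `route_question("What is the weather today?", HANDLERS, fallback_handler)` reaches the weather handler, while a question containing neither keyword falls through to the fallback.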

– And the last question… Which features of Pepper, in your opinion, have to be changed to expand its scope of application and, more generally, that of humanoid robots in 2020?
– In my opinion, what’s missing is a generalized AI solution for robots. It would be great to be able to use existing voice assistants like Siri, Cortana and so on. You could connect such an assistant to the robot and he’d have the same abilities as Siri or Cortana. As for additional features, those we can develop and program for you over at Diatom.

– Slava, thank you for this great conversation!