Jesse Vig’s research lies at the intersection of machine learning and human-computer interaction. He is currently working on using machine learning to improve the user experience in conversational AI systems. Jesse completed his Ph.D. in Computer Science at the University of Minnesota and his B.A. in Mathematics at Oberlin College. He develops apps in his free time and recently won first place in Google’s “Actions on Google Developer Challenge” for his 100 Years Ago app.
Can you tell us a little about your 100 Years Ago app?
100 Years Ago is an “Action on Google” – an app that lives within Google Assistant. It’s essentially an interactive version of an old-fashioned radio show that provides historical news and music, but also enables the user to navigate content and converse with a persona from that time period. It’s focused on content from exactly 100 years ago.
I’ve always been interested in history, and I felt that an audio-based interface could provide a novel way to experience historical content. I was inspired by the early radio shows, which faced the same challenge that designers of voice interfaces face today: how do you convey information in a compelling manner without the benefit of a visual display? Prior to television, radio offered many of the same types of shows – not just news and music, but also soap operas, dramas, and variety shows – and they employed a variety of audio effects to produce a compelling experience without a screen. I wanted to take that immersive audio experience and make it interactive.
How did you decide on 100 years?
For the initial release, I wanted to focus on one specific time period because of the challenges of generating content for a broader time range. 1917 was an important year historically – both World War I and the Russian Revolution were unfolding at the time.
I can imagine expanding the radio show idea to other time periods, or extending this approach to other types of educational content beyond history.
How do you envision this app being used in the near and distant future?
Currently the primary goal is to provide a history lesson through an interactive audio experience. Some of the content is focused more on entertainment – for example, the historical figure is more of a caricature than a faithful representation of that person. In the future, I can imagine expanding the app to other time periods and letting users converse with historically accurate figures from those periods.
Here at PARC, you work on conversational AI. What got you into that focus area?
I have a background in machine learning and human-computer interaction, and conversational AI was a natural fit since it combines elements of both.
What projects are you working on now at PARC?
I’m working on using reinforcement learning — learning from experience — to improve conversational agents.
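To make "learning from experience" concrete: one standard formulation treats each dialogue turn as a state, each possible response as an action, and user feedback as a reward, then updates the agent's value estimates after every exchange. The sketch below is a minimal tabular Q-learning illustration – not a description of PARC's actual system – and every state name, action name, and reward value in it is invented for the example.

```python
# Minimal tabular Q-learning sketch for dialogue action selection.
# States, actions, and rewards here are toy placeholders.
from collections import defaultdict

ALPHA, GAMMA = 0.5, 0.9  # learning rate, discount factor

q = defaultdict(float)  # Q-values keyed by (state, action)

def update(state, action, reward, next_state, next_actions):
    """One Q-learning step: nudge Q(s, a) toward
    reward + discounted value of the best next action."""
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])

# Toy episode: the agent asks a clarifying question and the
# user reacts positively, so that action earns reward 1.0.
update("greeting", "ask_clarifying_question", reward=1.0,
       next_state="clarified", next_actions=["answer", "ask_again"])
```

Over many such interactions, actions that tend to lead to positive feedback accumulate higher Q-values, so the agent gradually prefers them – that is the "learning from experience" loop in its simplest form.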
What do you envision for the future of conversational technology?
Conversational technology combines two distinct but related technologies: natural-language interfaces and intelligent agents. In the past few years, we’ve seen rapid advances in the first piece: speech recognition and generation technologies in particular have made great strides due to recent advances in deep learning. Speech interfaces have the advantage that they can be used in situations where the hands and/or eyes are occupied – for example while working around the house or driving a car. As people spend more time in augmented and virtual reality, hands-free speech input will become an equally compelling use case. In the real world, the physical interface is shrinking or disappearing entirely – it can be easier to speak to a smartwatch than to navigate its tiny screen, for example – so I think the use cases for speech interfaces will expand there as well.
In terms of the underlying AI, there have been some advances as well, but much work remains to be done. Most of the existing conversational AI systems focus on answering questions or performing single operations or transactions. As the AI improves, we’ll see intelligent assistants that can engage in more complex interactions in domains ranging from medicine to education to business.
Learn more about conversational technology at our Jan. 18 PARC Forum on conversational agents.
Prior to joining PARC, Jesse worked on recommender systems and social tagging systems; his focus was on designing novel systems that combine the automation of recommender systems with the user control and transparency of tagging systems. He also worked for several years as a medical software developer. He received the best paper award at the 2009 Conference on Intelligent User Interfaces and was nominated for best paper at the 2011 conference. His web design work has been featured in Popular Science magazine, and he was a top-5 finalist for Best NetArt website at the Webby awards.