Overview

The aim of the project was to develop a voice assistant, taking well-known, cutting-edge applications such as Google Assistant and Amazon Alexa as reference points. Our imitation focuses on the basic idea behind this kind of software: an application that interacts with the user by means of speech recognition (Speech-to-Text) and speech synthesis (Text-to-Speech). In other words, our goal was to achieve satisfying results in terms of available functionality and user experience without resorting to AI algorithms or having to process huge amounts of data. As far as the workflow is concerned, we can identify three main phases:

  1. The choice of the hardware platform. We were looking for a device that was versatile and cheap, yet equipped with good computing power. Eventually we decided to work with a Raspberry Pi, mainly because it is widely supported by companies and independent developers; this guaranteed, at least in principle, a range of options for the programming languages and libraries to be used.
  2. The choice of a third-party software library to handle speech recognition and synthesis. In hindsight, this turned out to be the key step of the project, since it deeply influenced how the features were subsequently implemented.
  3. The final phase consisted of coding and gradually testing the assistant, with the aim of improving its usability and the complexity of its features.
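
As a rough illustration of the interaction model described above (speech in, rule-based handling, speech out, with no AI algorithms involved), the following minimal Python sketch may help. It is not the project's actual code: the command table, the handler names and the `listen`/`speak` callables are all hypothetical stand-ins, and the real Speech-to-Text and Text-to-Speech calls would come from whichever third-party library is chosen.

```python
from typing import Callable, Dict

# A handler receives the recognised text and returns the reply to be spoken.
Handler = Callable[[str], str]


def greet(utterance: str) -> str:
    """Hypothetical handler: answer a greeting."""
    return "Hello, how can I help?"


def echo(utterance: str) -> str:
    """Hypothetical handler: repeat whatever follows the keyword 'repeat'."""
    rest = utterance.split("repeat", 1)[1].strip()
    return rest or "Nothing to repeat."


# Hypothetical command table: keyword -> handler. Plain keyword matching
# stands in for the "no AI algorithms" approach mentioned in the overview.
COMMANDS: Dict[str, Handler] = {
    "hello": greet,
    "repeat": echo,
}


def respond(utterance: str) -> str:
    """Pick the first command whose keyword occurs in the recognised text."""
    text = utterance.lower()
    for keyword, handler in COMMANDS.items():
        if keyword in text:
            return handler(text)
    return "Sorry, I did not understand."


def run_once(listen: Callable[[], str], speak: Callable[[str], None]) -> None:
    """One interaction: listen (Speech-to-Text), decide, speak (Text-to-Speech).

    `listen` and `speak` are placeholders for the speech library's calls.
    """
    speak(respond(listen()))


if __name__ == "__main__":
    # Keyboard and screen stand in for microphone and loudspeaker here.
    run_once(lambda: input("You: "), lambda reply: print("Assistant:", reply))
```

Passing `listen` and `speak` in as parameters keeps the command logic independent of the speech library, so the same handlers can be tested from the keyboard before any audio hardware is connected.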

Polytechnic Department of Engineering and Architecture
University of Udine, 2019