CHAPTER 1
INTRODUCTION
1.1 GENERAL
Speech recognition lets us perform tasks using spoken words as input. It is
gradually becoming part of everyday life through voice assistants such as Alexa,
Google Assistant, and Siri. Whether dictating a document to a device, running a web
search by voice, or controlling a computer with spoken commands, speech-to-text
conversion is making our lives faster and more comfortable. It has the potential to
replace traditional human-to-machine input devices such as keyboards. A future in
which humans interact with machines using only speech and bodily movements is not
far off.
1.2 OUTLINE OF THE PROJECT
Humans interact with each other in several ways, such as facial expression, eye
contact, gesture, and, above all, speech. Speech is the primary mode of
communication among human beings and the most natural and efficient way of
exchanging information. Speech-to-text (STT) conversion systems are widely used in
many application areas. In education, STT or speech recognition systems are
particularly helpful for students who are deaf or have speech impairments.
1.3 AVAILABLE TECHNOLOGY FOR SPEECH RECOGNITION
As part of a program of research on speech-to-speech translation, we review some
of the available technologies for speech recognition, the first component in any
voice-based machine translation (MT) system.
Microsoft Speech API
Microsoft Speech API (SAPI) provides access to the speech recognition and speech
synthesis components built into Windows. The API has been released as part of the
OS from Windows 98 onward. The most recent release, Microsoft Speech API 5.4,
supports a small number of languages: American English, British English, Spanish,
French, German, Simplified Chinese, and Traditional Chinese. Because it is a native
Windows API, SAPI is not easy to use without experience in C++ development.