Espeak: A Very Easy and Powerful Festival Alternative
For many years Festival has been the mother of all text-to-speech systems, but it is complex to install, especially if you need other voices. Espeak takes the pain out of an installation that can challenge even the most seasoned Linux user. Ubuntu comes with Espeak pre-installed, and even if it is not installed it only takes seconds to add, unlike Festival, which takes many minutes.
Espeak is great fun to use and very flexible, but really it is only a tool. Espeak, like its cousin Festival, can redirect its output to produce a wav file that can be used again and again on this or other systems.
If you are looking for a screen reader then you should look at Orca. With the current version of Ubuntu it is available under System => Preferences => Assistive Technologies. It lets you navigate your way around your desktop if you are blind; it is truly excellent, totally free, and works very well. (More about Orca in another post.)
Both Espeak and Festival are console tools, that is to say they are started with console commands, which is why this post is in the tools section.
Here are some examples of console commands you could use with Espeak.
This is command mode: entering "espeak" with no parameters lets you type your text, sentence or paragraph, which is spoken when you press the Enter (carriage return) key. From this simple example you can begin to see how one can grab a sentence or paragraph from a letter or the web and have its contents read aloud. Try this yourself: put espeak in command mode if you haven't already done so, then copy and paste a section of a letter or web page into it. You should immediately hear, in perfect clarity, the contents of what was pasted. Be careful not to copy empty lines, as the application treats each one as a carriage return and starts speaking before you are ready.
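The command mode described above can also be driven from a pipe. This is a minimal sketch, assuming espeak is on your PATH; the sample text and the grep filter that drops empty lines are my own additions for illustration, not part of espeak.

```shell
# Interactive use: run "espeak" on its own, type a line, press Enter.
# Non-interactive equivalent: feed the copied text in on standard input.
copied='First line of a letter.

Second line, after an empty one.'

# Drop empty lines first (an optional precaution against early starts).
clean=$(printf '%s\n' "$copied" | grep -v '^[[:space:]]*$')

if command -v espeak >/dev/null 2>&1; then
    printf '%s\n' "$clean" | espeak
else
    # Fall back to printing when espeak is not installed.
    printf '%s\n' "$clean"
fi
```

Filtering before speaking keeps the habit of copy and paste while avoiding the empty-line problem.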
You can also use espeak to pronounce one or more words.
To speak a whole sentence you need only enclose the words in quotes.
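A short sketch of the single word and quoted sentence forms, assuming espeak is installed. The sample phrases and the count_args helper are hypothetical; the helper only shows that quoting hands the sentence to the program as one argument.

```shell
word="hello"
sentence="Hello world, this is espeak speaking"

# count_args: illustrative helper that reports how many arguments it got.
count_args() { echo "$#"; }

n_quoted=$(count_args "$sentence")    # quoted: passed as one argument
n_unquoted=$(count_args $sentence)    # unquoted: split on spaces

if command -v espeak >/dev/null 2>&1; then
    espeak "$word"        # one word
    espeak "$sentence"    # a whole sentence, kept together by the quotes
fi
```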
But then you can also do this.
Well, OK, this is moderately impressive, so how about this then?
Now you can begin to see some of the educational aspects and the power available from some very simple commands: counting in German, great huh! But it's not just German, it is any installed language, and that's quite a lot. Just replace the language code after the "-v".
- English = en
- German = de
- Afrikaans = af
- Bosnian = bs
- Greek = el
- Finnish = fi
- French = fr
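As a sketch of how the codes in this list are used, the loop below counts to five in three of the listed languages with the "-v" option. It assumes espeak is installed with those voices; the choice of languages and the counter variable are mine.

```shell
numbers="1 2 3 4 5"
spoken=0

for voice in en de fr; do
    if command -v espeak >/dev/null 2>&1; then
        # -v selects the voice by its language code
        espeak -v "$voice" "$numbers"
    else
        echo "would run: espeak -v $voice \"$numbers\""
    fi
    spoken=$((spoken + 1))
done
```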
Or, to get a full list of voice languages, ask espeak to list its voices.
This will produce an output similar to that below:
Pty Language     Age/Gender VoiceName              File     Other Langs
 5  af           M          afrikaans              af
 5  bs           M          bosnian                bs
 5  cs           M          czech                  cs
 5  cy           M          welsh-test             cy
 5  de           M          german                 de
 5  el           M          greek                  el
 5  en           M          default                default
 5  en-sc        M          en-scottish            en/en-sc (en 4)
 2  en-uk        M          english                en/en    (en 2)
 5  en-uk-north  M          lancashire             en/en-n  (en-uk 3)
 5  en-uk-rp     M          english_rp             en/en-rp (en-uk 4)
 5  en-uk-wmids  M          english_wmids          en/en-wm
 5  en-us        M          english-us             en/en-r  (en 3)
 5  en-wi        M          en-westindies          en/en-wi (en-uk 4)
 5  eo           M          esperanto              eo
 5  es           M          spanish                es
 5  es-la        M          spanish-latin-american es-la    (es-mx 6)
 5  fi           M          finnish                fi
 5  fr           M          french                 fr
 5  grc          M          greek-ancient          grc
 5  hi           M          hindi-test             hi
 5  hr           M          croatian               hr       (hbs 5)
 5  hu           M          hungarian              hu
 5  id           M          indonesian-test        id
 5  is           M          icelandic-test         is
 5  it           M          italian                it
 5  jbo                     lojban                 jbo
 5  ku           M          kurdish                ku
 5  la           M          latin                  la
 5  mk           M          macedonian-test        mk
 5  nl           M          dutch-test             nl
 5  no           M          norwegian-test         no       (nb 5)
 5  pl           M          polish                 pl
 5  pt           M          brazil                 pt       (pt-br 5)
 5  pt-pt        M          portugal               pt-pt
 5  ro           M          romanian               ro
 5  ru           M          russian_test           ru
 5  sk           M          slovak                 sk
 5  sr           M          serbian                sr
 5  sv           M          swedish                sv
 5  sw           M          swahihi-test           sw
 5  ta           M          tamil                  ta
 5  tr           M          turkish                tr
 5  vi           M          vietnam-test           vi
 5  zh           M          Mandarin               zh
 5  zh-yue       M          cantonese-test         zhy      (yue 5)
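The listing above has the shape of what "espeak --voices" prints. Here is a small sketch, assuming that column layout, of pulling out just the language codes. list_codes is a hypothetical helper name, and the three-line sample stands in for real output so the example runs without espeak.

```shell
# list_codes: skip the header row and print the second column (Language).
list_codes() { awk 'NR > 1 { print $2 }'; }

sample='Pty Language Age/Gender VoiceName File Other Langs
 5  de  M  german  de
 5  fr  M  french  fr'

codes=$(printf '%s\n' "$sample" | list_codes | tr '\n' ' ')

if command -v espeak >/dev/null 2>&1; then
    espeak --voices | list_codes    # codes for every installed voice
    espeak --voices=en              # only the English variants
fi
```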
Provided your word or sentence is written in the desired language, the correct or near-correct pronunciation will be made. Suppose we first write "1 2 3 4 Bonjour" and ask for this in English (espeak's default language).
Whilst the numbers are spoken correctly, espeak can only pronounce the word as written, using the rules of the selected language, be it English, French or any other language supplied. Now if we give the same message using the French voice we get both the numbers and the message in French.
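The comparison can be sketched as two calls on the same phrase, first with the default voice and then with "-v fr". This assumes espeak and its French voice are installed; the variable name is mine.

```shell
phrase="1 2 3 4 Bonjour"

if command -v espeak >/dev/null 2>&1; then
    espeak "$phrase"          # default (English) voice: English numbers
    espeak -v fr "$phrase"    # French voice: numbers and greeting in French
else
    echo "espeak not installed; phrase was: $phrase"
fi
```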
This is all very well, but nothing described above can be termed useful. Fun, yes, but hardly useful unless we can turn the text that is read into a sound file. Well, guess what: you can. So how about this for useful: you have some standard English text in a file called mytext.txt somewhere on your hard disk, perhaps a message you would like spoken rather than read. How do we do this?
Espeak can very rapidly create a file called mytext.wav from the words, sentences or paragraphs in the file named mytext.txt.
We can expand somewhat on this to produce an mp3 file instead; essentially it is still the same.
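A minimal sketch of both conversions, assuming espeak is installed and, for the mp3 step, the lame encoder. It works in a scratch directory with a sample mytext.txt of my own, so nothing on your disk is overwritten.

```shell
# Work in a scratch directory with a sample text file.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' "This is a message I would like spoken rather than read." > mytext.txt

if command -v espeak >/dev/null 2>&1; then
    # -f reads the text from a file, -w writes a wav file instead of speaking
    espeak -f mytext.txt -w mytext.wav
    if command -v lame >/dev/null 2>&1; then
        # --stdout sends the wav data down the pipe; "lame - out.mp3"
        # encodes whatever arrives on standard input
        espeak -f mytext.txt --stdout | lame - mytext.mp3
    fi
else
    echo "espeak is not installed; nothing was converted" >&2
fi
```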
Checking the result with the "file" command should establish that a wav file has indeed been created, and display a message similar to this.
mytext.wav: RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, mono 22050 Hz
The mp3 file, on the other hand, will show a similar output (see below) when you use the same command to check the file you have created.
mytext.mp3: MPEG ADTS, layer III, v2, 32 kBits, 22.05 kHz, Monaural
You can then use mplayer to play either of these files.
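A sketch of the check and playback steps, assuming the "file" utility and mplayer are available. describe is a hypothetical wrapper that avoids an error when a file has not been created yet.

```shell
# describe: report the file type, or say so when the file is missing.
describe() {
    if [ -f "$1" ]; then
        file "$1"
    else
        echo "missing: $1"
    fi
}

describe mytext.wav
describe mytext.mp3

# Playback (only attempted when both mplayer and the file exist).
if command -v mplayer >/dev/null 2>&1 && [ -f mytext.wav ]; then
    mplayer mytext.wav
fi
```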
For something a little more useful, you can have espeak read HTML web pages.
However, this works better with pages written in plain HTML (Hyper Text Markup Language). If you were to point the URL (Uniform Resource Locator) at this or one of the other wiki pages it would not work nearly so well, but by all means give it a try.
For a bit more of the fun element you could give this a try: it uses another application called "fortune", whose output changes every time it is run, so what espeak says changes too.
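The original command is not reproduced here, so this is only one possible approach: fetch the page with curl, strip the tags with a crude sed filter, and pipe the rest to espeak. strip_tags and read_page are hypothetical helper names, and the filter will not cope with tags that span several lines. Espeak's own "-m" option, which ignores < > tags, is an alternative to the sed step.

```shell
# strip_tags: crude HTML-to-text filter, removes anything between < and >.
strip_tags() { sed -e 's/<[^>]*>//g'; }

# read_page: fetch a URL, strip the markup, speak the remainder.
read_page() { curl -s "$1" | strip_tags | espeak; }

# Local demonstration of the filter, no network needed.
html='<p>Hello <b>world</b></p>'
text=$(printf '%s\n' "$html" | strip_tags)
echo "$text"
```

To try it on a real page you would call read_page with an address of your choice, for example read_page "http://example.com/".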
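A sketch of the idea, assuming both fortune and espeak are installed. The fallback phrase is my own, so the example still produces something when fortune is missing.

```shell
# Take one random phrase from fortune, or a stand-in when it is absent.
phrase=$(fortune 2>/dev/null || echo "No fortune today")

if command -v espeak >/dev/null 2>&1; then
    printf '%s\n' "$phrase" | espeak
else
    printf '%s\n' "$phrase"
fi
```

The shortest form of the same thing is simply fortune piped straight into espeak.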
Information source: Text-to-speech-synthesizer
If you still wonder how useful espeak is, then what about creating audiobooks? We have found an entry that might help you achieve just that.
#!/bin/bash
# Convert a TXT, PDF, ODT or DOC file into a spoken MP3 using espeak and lame.
TXT_FILE="$1"
# strip the final extension to build the output name
BASENAME="${TXT_FILE%.*}"
echo "TTS (text-to-speech) ${TXT_FILE}"
ext="${1##*.}"
# if it isn't a TXT file, convert it first
if [ "$ext" != "txt" ] ; then
    TMP_FILE="/tmp/espeakfile-$$.txt"
    # PDF
    if [ "$ext" = "pdf" ] ; then
        echo "converting from PDF to TXT"
        pdftotext "${TXT_FILE}" "${TMP_FILE}"
    fi
    # ODT
    if [ "$ext" = "odt" ] ; then
        echo "converting from ODT to TXT"
        odt2txt --subst=all "${TXT_FILE}" > "${TMP_FILE}"
    fi
    # DOC
    if [ "$ext" = "doc" ] ; then
        echo "converting from DOC to TXT"
        antiword "${TXT_FILE}" > "${TMP_FILE}"
    fi
    TXT_FILE="${TMP_FILE}"
fi
rm -f /tmp/voice.wav
# create a FIFO "named pipe" to save space
mkfifo /tmp/voice.wav
# espeak writes its output to the pipe while lame encodes it on the fly
nice espeak -f "${TXT_FILE}" -w /tmp/voice.wav &
# lame takes the input and output files as plain arguments;
# its -o flag only marks the MP3 as "non-original"
xterm -e nice lame -a --resample 16 -V 9 --vbr-new --lowpass 8 -f /tmp/voice.wav -o "${BASENAME}_VBR.mp3"
# wait for espeak, then remove the pipe and any temporary text file
wait
rm -f /tmp/voice.wav
[ -n "${TMP_FILE}" ] && rm -f "${TMP_FILE}"
echo "...done! Voice saved as ${BASENAME}_VBR.mp3"
Espeak does not natively read or interpret PDF documents; rather, you need to use the script above to convert them first to text and then to an mp3 file. This essentially uses an automated voice to produce what is called an audiobook. Many community projects have been set up around the country that use human voices, an expensive mixer and recording hardware to produce high quality but low volume audiobooks for the blind.
To convert a file to plain text, the one format espeak can support, one of course first needs to download the ebook as a pdf, doc or odt file. The script above uses these tools:
for pdf conversion to text, pdftotext
for odt conversion to text, odt2txt
for doc conversion to text, antiword
Once the book is plain text, you could use one of the methods above to play the audiobook to standard output.
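The three conversions can be sketched as one small function that picks the tool by file extension, using the same pdftotext, odt2txt and antiword calls as the script above. to_text is a hypothetical name, and the book file names in the closing comments are placeholders.

```shell
# to_text SRC DST: convert SRC to plain text in DST, based on its extension.
to_text() {
    src=$1
    dst=$2
    case "${src##*.}" in
        pdf) pdftotext "$src" "$dst" ;;
        odt) odt2txt --subst=all "$src" > "$dst" ;;
        doc) antiword "$src" > "$dst" ;;
        txt) cp "$src" "$dst" ;;
        *)   echo "unsupported format: $src" >&2; return 1 ;;
    esac
}

# Demonstration with a plain text file, which needs no external tool.
demo_dir=$(mktemp -d)
printf '%s\n' "Chapter one." > "$demo_dir/book.txt"
to_text "$demo_dir/book.txt" "$demo_dir/out.txt"

# With a real book you might then run, for example:
#   to_text mybook.pdf mybook.txt && espeak -f mybook.txt
```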
Nine times out of ten such recordings need heavy investment and a great deal of manpower and resources to produce and maintain. This approach, although somewhat cruder in its implementation, could provide an affordable means of producing large quantities of audiobooks on virtually any subject available in PDF, DOC, ODT or TXT format. Such audiobooks can be provided en masse via very economical means of data distribution such as email, and can be played on the computer or downloaded for playing on small personal mp3 walkmans and iPods. If this is a project you would like advice on, or would perhaps like to participate in, feel free to get in touch; I am happy to explain and offer support for such a good cause.
Ever wanted your own speaking clock? Try this,
or you could expand on this and have more than the time read out.
The command "espeak" can also be integrated into your console terminal so that each time you launch a new terminal you hear a fortune. This is accomplished by editing a file called "bash.bashrc". In Ubuntu and its derivatives this file lives in the folder "/etc"; this may differ in other distributions, but you should look for it there first.
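A sketch of a speaking clock, assuming espeak is installed. The wording of the messages and the date formats are my own choices; the expanded version simply adds the day and date.

```shell
now=$(date +'%H:%M')
message="The time is $now"

today=$(date +'%A %d %B %Y')
long_message="Today is $today and the time is $now"

if command -v espeak >/dev/null 2>&1; then
    espeak "$message"         # simple clock
    espeak "$long_message"    # expanded: date as well as time
else
    echo "$message"
    echo "$long_message"
fi
```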
Using vim or gedit, edit this file, but be sure to make a backup of it first.
If you prefer a graphical editor, use gedit instead of vim.
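A sketch of the backup step, shown on a scratch file so it is safe to run as is. backup_file is a hypothetical helper. The commented lines show what the real commands might look like, assuming sudo is how you gain root rights on your system.

```shell
# backup_file: copy a file to FILE.bak, preserving mode and timestamps.
backup_file() { cp -p "$1" "$1.bak"; }

# Demonstration on a scratch file standing in for /etc/bash.bashrc.
demo=$(mktemp)
printf '%s\n' "# demo bashrc content" > "$demo"
backup_file "$demo"

# On the real file (needs root rights), something like:
#   sudo cp -p /etc/bash.bashrc /etc/bash.bashrc.bak
#   sudo vim /etc/bash.bashrc
```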
At the bottom of this file, on its very last line, add the entry that pipes fortune into espeak, then save.
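The exact entry is not reproduced here; a plausible form, given the description, is fortune piped into espeak. The sketch below appends that assumed line to a scratch copy rather than to the real /etc/bash.bashrc.

```shell
# Scratch file stands in for /etc/bash.bashrc so the sketch is safe to run.
rcfile=$(mktemp)
printf '%s\n' "# existing bash.bashrc content" > "$rcfile"

# Assumed wording of the entry: speak a random fortune.
entry='fortune | espeak'

# Append it as the last line of the file.
printf '%s\n' "$entry" >> "$rcfile"
tail -n 1 "$rcfile"
```

Every new terminal would then wait for the fortune to finish speaking before showing a prompt, so you may prefer to keep the phrases short.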
Let's walk through the command. "fortune" is a large database of phrases, some long, some short and some funny, together with the
contact email : linux 'at' soslug.org

