Written: 2022/2/5

Published: 2022/5/16

Using eSpeak on a Raspberry Pi

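eSpeak is a compact, open-source speech synthesizer that supports English and other languages and runs on Linux and Windows. That makes it a good fit for the Linux-based Raspberry Pi, letting it speak the text you give it. eSpeak has also been ported to other platforms, including Android, Mac OSX and Solaris.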

The official site, http://espeak.sourceforge.net, lists the following features:

A command line program (Linux and Windows) to speak text from a file or from stdin.
Includes different Voices (for example male and female), whose characteristics can be altered.
Can produce speech output as a WAV file.
SSML (Speech Synthesis Markup Language) is supported (not complete), and also HTML.
Compact size. The program and its data, including many languages, totals about 2 Mbytes.
Can translate text into phoneme codes, so it could be adapted as a front end for another speech synthesis engine.
Can be used as a front-end to MBROLA diphone voices, see mbrola.html. eSpeak converts text to phonemes with pitch and length information, and MBROLA then produces the speech.
Potential for other languages. Several are included in varying stages of progress. Help from native speakers for these or other languages is welcome.
Development tools are available for producing and tuning phoneme data.
Written in C.
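As the list shows, eSpeak is an executable program rather than a Python library: once installed, it runs as a command-line program under the operating system. To improve the quality of eSpeak's speech and to make it callable from Python, the setup here goes beyond eSpeak itself and consists of the following three steps: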

eSpeak
MBROLA
Python call

Installing eSpeak

On Linux, install eSpeak with apt-get as follows:

pi@raspberrypi:~ $ sudo apt-get install espeak

Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
espeak-data libespeak1 libportaudio2 libsonic0
The following NEW packages will be installed:
espeak espeak-data libespeak1 libportaudio2 libsonic0
0 upgraded, 5 newly installed, 0 to remove and 0 not upgraded.
Need to get 1,217 kB of archives.
After this operation, 2,974 kB of additional disk space will be used.
Do you want to continue? [Y/n] y
Get:1 http://mirror.ossplanet.net/raspbian/raspbian buster/main armhf libportaudio2 armhf 19.6.0-1+deb10u1 [56.7 kB]
Get:2 http://mirror.ossplanet.net/raspbian/raspbian buster/main armhf libsonic0 armhf 0.2.0-7 [9,888 B]
Get:3 http://mirror.ossplanet.net/raspbian/raspbian buster/main armhf espeak-data armhf 1.48.04+dfsg-7+deb10u1 [941 kB]
Get:4 http://mirror.ossplanet.net/raspbian/raspbian buster/main armhf libespeak1 armhf 1.48.04+dfsg-7+deb10u1 [139 kB]
Get:5 http://mirror.ossplanet.net/raspbian/raspbian buster/main armhf espeak armhf 1.48.04+dfsg-7+deb10u1 [70.6 kB]
Fetched 1,217 kB in 4s (287 kB/s)
Selecting previously unselected package libportaudio2:armhf.
(Reading database ... 98949 files and directories currently installed.)
Preparing to unpack .../libportaudio2_19.6.0-1+deb10u1_armhf.deb ...
Unpacking libportaudio2:armhf (19.6.0-1+deb10u1) ...
Selecting previously unselected package libsonic0:armhf.
Preparing to unpack .../libsonic0_0.2.0-7_armhf.deb ...
Unpacking libsonic0:armhf (0.2.0-7) ...
Selecting previously unselected package espeak-data:armhf.
Preparing to unpack .../espeak-data_1.48.04+dfsg-7+deb10u1_armhf.deb ...
Unpacking espeak-data:armhf (1.48.04+dfsg-7+deb10u1) ...
Selecting previously unselected package libespeak1:armhf.
Preparing to unpack .../libespeak1_1.48.04+dfsg-7+deb10u1_armhf.deb ...
Unpacking libespeak1:armhf (1.48.04+dfsg-7+deb10u1) ...
Selecting previously unselected package espeak.
Preparing to unpack .../espeak_1.48.04+dfsg-7+deb10u1_armhf.deb ...
Unpacking espeak (1.48.04+dfsg-7+deb10u1) ...
Setting up libportaudio2:armhf (19.6.0-1+deb10u1) ...
Setting up libsonic0:armhf (0.2.0-7) ...
Setting up espeak-data:armhf (1.48.04+dfsg-7+deb10u1) ...
Setting up libespeak1:armhf (1.48.04+dfsg-7+deb10u1) ...
Setting up espeak (1.48.04+dfsg-7+deb10u1) ...
Processing triggers for man-db (2.8.5-2) ...
Processing triggers for libc-bin (2.28-10+rpt2+rpi1) ...

# Set voice data

Copy source/espeak-data to /home/pi/espeak-data.

The system keeps another copy at this path:

/usr/lib/arm-linux-gnueabihf/espeak-data

* voices

* mbrola (only if MBROLA is installed)

Default settings in /home/pi/espeak-data/voices/default:

name default

language en

gender male

pi@raspberrypi:~ $

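After installation, the default voice is set to language en (English) and gender male. A simple way to test eSpeak is to enter the following command:

pi@raspberrypi:~ $ espeak "Hello World"

If the installation succeeded, the text should be heard from the Raspberry Pi's speaker.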
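
If nothing is heard, it may help to confirm that the espeak binary is actually on the PATH before looking at the audio side. The following is only a minimal sketch: the function name `espeak_status` is hypothetical, and it assumes that `espeak --version` prints a version banner and exits without speaking.

```python
import shutil
import subprocess


def espeak_status(binary="espeak"):
    """Return (path, banner) for the binary, or (None, None) if it is not on PATH."""
    path = shutil.which(binary)
    if path is None:
        return None, None
    # Assumption: --version prints a short banner and exits without speaking.
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    return path, result.stdout.strip()


if __name__ == "__main__":
    path, banner = espeak_status()
    if path is None:
        print("espeak not found; try: sudo apt-get install espeak")
    else:
        print(path)
        print(banner)
```

If the binary is found but there is still no sound, the problem is more likely in the audio output settings than in the eSpeak installation.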

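Installing MBROLA

Installing MBROLA takes more work because it has to be built from its C source code. It takes the following seven steps:

Step 1: Install git, make and gcc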

pi@raspberrypi:~ $ sudo apt-get install git make gcc

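Reading package lists... Done
Building dependency tree
Reading state information... Done
gcc is already the newest version (4:8.3.0-1+rpi2).
gcc set to manually installed.
git is already the newest version (1:2.20.1-2+deb10u3).
make is already the newest version (4.2.1-1.2).
make set to manually installed.
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.

Because gcc, git and make were already installed, nothing was installed or upgraded here.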

Step 2: Download from GitHub

pi@raspberrypi:~ $ git clone https://github.com/numediart/MBROLA.git

Cloning into 'MBROLA'...

remote: Enumerating objects: 283, done.

remote: Total 283 (delta 0), reused 0 (delta 0), pack-reused 283

Receiving objects: 100% (283/283), 403.59 KiB | 184.00 KiB/s, done.

Resolving deltas: 100% (143/143), done.

Step 3: Build with make

pi@raspberrypi:~ $ cd MBROLA

pi@raspberrypi:~/MBROLA $ make

if [ ! -d Bin/Standalone ]; then \

mkdir Bin ; mkdir Bin/LibOneChannel; mkdir Bin/LibMultiChannel ; mkdir Bin/Standalone ; mkdir Bin/Standalone/Standalone ; mkdir Bin/Standalone/Parser ; mkdir Bin/Standalone/Engine ; mkdir Bin/Standalone/Database ; mkdir Bin/Standalone/Misc; \

gcc -DLITTLE_ENDIAN -ansi -pedantic -IParser -IStandalone -IMisc -ILibOneChannel -ILibMultiChannel -IEngine -IDatabase -Wall -DROMDATABASE_STORE -DROMDATABASE_INIT -DSIGNAL -o ./Bin/mbrola Bin/Standalone/Standalone/synth.o Bin/Standalone/Engine/mbrola.o Bin/Standalone/Engine/diphone.o Bin/Standalone/Parser/phone.o Bin/Standalone/Parser/parser_input.o Bin/Standalone/Parser/input_file.o Bin/Standalone/Parser/phonbuff.o Bin/Standalone/Misc/audio.o Bin/Standalone/Misc/vp_error.o Bin/Standalone/Misc/mbralloc.o Bin/Standalone/Misc/common.o Bin/Standalone/Database/database.o Bin/Standalone/Database/database_old.o Bin/Standalone/Database/diphone_info.o Bin/Standalone/Database/little_big.o Bin/Standalone/Database/hash_tab.o Bin/Standalone/Database/zstring_list.o Bin/Standalone/Database/rom_handling.o Bin/Standalone/Database/rom_database.o -lm

pi@raspberrypi:~/MBROLA $

Step 4: Install
Installation only requires placing the mbrola binary produced by make under /usr/bin/. Run the following command:

pi@raspberrypi:~/MBROLA $ sudo cp /home/pi/MBROLA/Bin/mbrola /usr/bin/mbrola

Step 5: Create the shared voice directory
Create the /usr/share/mbrola directory on the system with the following command:

pi@raspberrypi:~/MBROLA $ sudo mkdir /usr/share/mbrola

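Step 6: Get the required voices from GitHub
Under the shared directory, create a directory for each voice you need and fetch that voice from GitHub. Taking the English voice as an example, run the following command: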

pi@raspberrypi:~/MBROLA $ sudo mkdir /usr/share/mbrola/us1

Download the us1 voice from the following repository and copy it into the us1 directory:

https://github.com/numediart/MBROLA-voices

The us1, us2 and us3 voices can all be downloaded from that page.

pi@raspberrypi:~/MBROLA $ sudo cp us1 /usr/share/mbrola/us1
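
After this step the voice database should sit at /usr/share/mbrola/us1/us1, because the `cp` above places the file `us1` inside the directory of the same name. The sketch below checks which voices are in place. The helper names `voice_file` and `installed_voices` are hypothetical, and the layout `<root>/<voice>/<voice>` is an assumption taken from the commands in Steps 5 and 6.

```python
from pathlib import Path

MBROLA_ROOT = Path("/usr/share/mbrola")  # directory created in Step 5


def voice_file(name, root=MBROLA_ROOT):
    """Path where the steps above place a voice database, e.g. .../us1/us1."""
    return Path(root) / name / name


def installed_voices(names, root=MBROLA_ROOT):
    """Return the subset of names whose database file exists under root."""
    return [n for n in names if voice_file(n, root).is_file()]


if __name__ == "__main__":
    print(installed_voices(["us1", "us2", "us3"]))
```

An empty list suggests that the voice file was not copied, or was copied to a different location than the one assumed here.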

Step 7: Test mbrola
Once MBROLA is installed, run the following command to test it:

pi@raspberrypi:~/MBROLA $ espeak "create a direction with the same name"

At this point you should hear eSpeak's original default voice. Now change the command to the following:

pi@raspberrypi:~/MBROLA $ espeak -v mb-us1 "create a direction with the same name"

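This time you should hear the voice provided by MBROLA; the difference should be clearly audible.

Calling eSpeak from Python

Since eSpeak is now installed on the system, calling espeak from the command line is all it takes to produce speech. Using eSpeak from Python is therefore simple: run the same command through cmdLine.run (that is, subprocess.run, with subprocess imported under the alias cmdLine), as follows: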

import subprocess as cmdLine

# Clear the terminal
cmdLine.run("clear")

# Define the words to speak
speech = "Good morning, it is nice day"

# Construct the command line (-x prints the phoneme mnemonics)
command = f'espeak -x -v mb-us1 "{speech}"'
# For more detailed phoneme output, use -X instead
#command = f'espeak -X -v mb-us1 "{speech}"'

print(command)
print("")

# Run espeak and capture what it writes to the terminal
#cmdLine.run(command, shell=True)
result = cmdLine.run(command, shell=True, capture_output=True, text=True)

# Print the terminal result
print("Phoneme Results:")
print(result.stdout)

The output is as follows:

>>>

TERM environment variable not set.

espeak -x -v mb-us1 "Good morning, it is nice day"

g'Ud m'O@nIN

It Iz n'aIs d'eI
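
The phoneme text in `result.stdout` is plain text. In the output above, each clause is on its own line and words are separated by spaces. The following minimal sketch splits such text into words, using the output shown above as sample input; `split_phonemes` is a hypothetical helper, not part of eSpeak.

```python
def split_phonemes(output):
    """Split espeak -x output into a list of clauses, each a list of word phonemes."""
    clauses = []
    for line in output.splitlines():
        words = line.split()
        if words:  # skip blank lines
            clauses.append(words)
    return clauses


# Sample text copied from the output shown in this article
sample = " g'Ud m'O@nIN\n It Iz n'aIs d'eI\n"
print(split_phonemes(sample))
# [["g'Ud", "m'O@nIN"], ['It', 'Iz', "n'aIs", "d'eI"]]
```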

You can also write an eSpeak library module in Python; afterwards, importing that module is enough to call eSpeak easily. The module is as follows:

# Import subprocess to execute the espeak terminal commands
import subprocess as cmdLine


# Name the class
class eSpeak:

    # __init__: remember which voice to use
    def __init__(self, voice='mb-us1'):
        self.voice = voice

    # Speak the text
    def say(self, speech):
        # Define the command line
        command = f'espeak -v {self.voice} "{speech}"'
        # Execute the command in the terminal
        cmdLine.run(command, shell=True, capture_output=True, text=True)

    # Return phonemes; phone is 'X' or 'x' (-q means quiet, no speaking)
    # You still need to assign -v here for the dictionary reference
    def phonemes(self, speech, phone):
        command = f'espeak -{phone} -q -v {self.voice} "{speech}"'
        result = cmdLine.run(command, shell=True, capture_output=True, text=True)
        # Return result
        return result.stdout

    # Send the speech out to a WAV file named by filename
    def wavFile(self, speech, filename):
        command = f'espeak -w {filename} -v {self.voice} "{speech}"'
        cmdLine.run(command, shell=True, capture_output=True, text=True)

    # Speak the contents of a text file named by filename
    def textFile(self, filename):
        command = f'espeak -f {filename} -v {self.voice}'
        cmdLine.run(command, shell=True, capture_output=True, text=True)
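
The methods above assemble one shell string and wrap the text in double quotes, so text that itself contains a double quote, or a file name containing spaces, would break the command. One possible alternative, shown here only as a sketch, is to pass an argument list to `subprocess.run` without `shell=True`. The function names `espeak_args` and `run_espeak` are hypothetical; the flags `-v`, `-x`, `-X`, `-q`, `-w` and `-f` are the ones already used in this article.

```python
import shutil
import subprocess


def espeak_args(voice="mb-us1", text=None, phoneme_flag=None, wav=None, text_file=None):
    """Build an argument list for espeak; no shell quoting is needed."""
    args = ["espeak", "-v", voice]
    if phoneme_flag:   # "x" or "X": print phonemes; -q keeps espeak silent
        args += ["-" + phoneme_flag, "-q"]
    if wav:            # write the speech to a WAV file
        args += ["-w", wav]
    if text_file:      # read the text to speak from a file
        args += ["-f", text_file]
    if text is not None:
        args.append(text)
    return args


def run_espeak(args):
    """Run espeak if it is installed; return its stdout, or None if it is missing."""
    if shutil.which(args[0]) is None:
        return None
    return subprocess.run(args, capture_output=True, text=True).stdout


if __name__ == "__main__":
    print(espeak_args(text='She said "hello"'))
    print(run_espeak(espeak_args(text="Good morning", phoneme_flag="x")))
```

Because each argument is passed to espeak as its own list element, quotes and spaces in the text reach espeak unchanged.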

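Afterword

This article records my own learning experience; I will post again if I make further improvements. Because of work, I do not check reader questions often after posting, so please forgive me if I cannot reply promptly!

If you found this article helpful, feel free to reply with your thoughts and share your own experience. Thank you!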

Philip4G

Philip4G四眼仙機's blog on PIXNET (痞客邦)
