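Building and Running the End-to-End Speech Recognition Toolkit WeNet

I had heard that the end-to-end speech recognition toolkit WeNet gives good results, but testing it with Docker on my test machine did not succeed. Building from source also ran into a few problems, so I am recording them here for future reference.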

II. Download the WeNet source code

# Current directory: /your/folder
git clone https://github.com/wenet-e2e/wenet wenet-e2e/wenet

# Current directory: /your/folder
cd ./wenet-e2e/wenet/runtime/server/x86
mkdir build && cd build && cmake .. && cmake --build .

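IV. Download the WenetSpeech pretrained model

The download procedure is described in a linked article, so it is not repeated here.

Prepare a Mandarin audio clip with a 16 kHz sample rate and 16 bits per sample (s16le).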
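Before decoding, it can help to confirm that a clip really has the required format. The following is a minimal sketch, not part of WeNet: the helper names `wav_format` and `is_16k_16bit` are hypothetical, and it assumes a little-endian host and a WAV file whose `fmt ` chunk directly follows the RIFF/WAVE header (the common layout), so the sample rate sits at byte offset 24 and the bit depth at offset 34.

```shell
#!/usr/bin/env bash
# Print "<sample_rate> <bits_per_sample>" read from a WAV header.
# Assumption: the "fmt " chunk comes first, so the fields are at fixed offsets.
wav_format() {
  local file="$1" rate bits
  # Sample rate: 4-byte unsigned integer at offset 24.
  rate=$(od -A n -t u4 -j 24 -N 4 "$file" | tr -d '[:space:]')
  # Bits per sample: 2-byte unsigned integer at offset 34.
  bits=$(od -A n -t u2 -j 34 -N 2 "$file" | tr -d '[:space:]')
  echo "$rate $bits"
}

# Succeed only if the file is 16 kHz with 16 bits per sample.
is_16k_16bit() {
  [ "$(wav_format "$1")" = "16000 16" ]
}

# Usage (the path is the example file used in this post):
#   is_16k_16bit /your/folder/TestASR-01.wav && echo "format OK"
```

The check only looks at the sample rate and bit depth; it does not verify the channel count or that the data is uncompressed PCM.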

1. Test `decoder_main`

# Current directory: /your/folder/wenet-e2e/wenet/runtime/server/x86/build
export GLOG_logtostderr=1
export GLOG_v=2
time ./decoder_main \
--chunk_size -1 \
--model_path \
/your/folder/SpeechColab/Leaderboard/models/wenet_wenetspeech/assets/final.zip \
--dict_path \
/your/folder/SpeechColab/Leaderboard/models/wenet_wenetspeech/assets/words.txt \
--wav_path \
/your/folder/TestASR-01.wav

The audio is about 2 minutes long. Running time on the test machine:

15.71s user 4.52s system 97% cpu 20.799 total
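One way to read this timing is as a real-time factor: processing time divided by audio duration. The sketch below is only illustrative; the helper name `rtf` is hypothetical, and the 120-second duration is an assumption, since the clip is only described as about 2 minutes long.

```shell
# Real-time factor = processing time / audio duration (lower means faster).
rtf() {
  awk -v elapsed="$1" -v duration="$2" \
    'BEGIN { printf "%.3f\n", elapsed / duration }'
}

# 20.799 s is the total wall-clock time reported above;
# 120 s is an assumed duration for the "about 2 minutes" clip.
rtf 20.799 120   # prints 0.173
```

Under that assumption, decoding took roughly one sixth of the clip's duration.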

2. Test `websocket_server_main`

First start the WebSocket server:

# Current directory: /your/folder/wenet-e2e/wenet/runtime/server/x86/build
export GLOG_logtostderr=1
export GLOG_v=2
./websocket_server_main \
--chunk_size 16 \
--model_path \
/your/folder/SpeechColab/Leaderboard/models/wenet_wenetspeech/assets/final.zip \
--dict_path \
/your/folder/SpeechColab/Leaderboard/models/wenet_wenetspeech/assets/words.txt

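Then open /your/folder/wenet-e2e/wenet/runtime/server/x86/web/templates/index.html in a browser and enter ws://127.0.0.1:10086 in the WebSocket URL input box.

(Figure 1)

Click the 开始录音 (start recording) button to begin recording, then click 停止录音 (stop recording) to get the recognized text.

Test environment: Apple M1, macOS 12.0.1, Xcode 13.1. It also ran under Docker, though not very stably, perhaps because the image is built for x86_64. Windows and Linux have not been tested yet.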

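1. Third-party library download fails

If downloading a third-party library fails during `cmake ..`, download it by some other means and place it in the corresponding directory.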

-- Downloading...
   dst='/your/folder/wenet-e2e/wenet/runtime/server/x86/fc_base/gflags-subbuild/gflags-populate-prefix/src/v2.2.1.zip'
   timeout='none'
   inactivity timeout='none'
-- Using src='https://github.com/gflags/gflags/archive/v2.2.1.zip'
CMake Error at gflags-subbuild/gflags-populate-prefix/src/gflags-populate-stamp/download-gflags-populate.cmake:170 (message):
  Each download failed!

error: downloading 'https://github.com/gflags/gflags/archive/v2.2.1.zip' failed
          status_code: 35

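For example, if the gflags download fails, download https://github.com/gflags/gflags/archive/v2.2.1.zip manually and place it in the /your/folder/wenet-e2e/wenet/runtime/server/x86/fc_base/gflags-subbuild/gflags-populate-prefix/src directory.

The libraries involved are gflags, googletest, boost, and libtorch.

Reminder: the version and the saved file name must match what the build expects.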
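The manual step can be scripted along these lines. This is a sketch under assumptions: the function names `populate_src_dir` and `fetch_dependency` are hypothetical, and the directory layout is taken from the gflags path in the log above; that the other libraries follow the same `<name>-subbuild/<name>-populate-prefix/src` pattern is assumed, so check the path printed in your own error log.

```shell
#!/usr/bin/env bash
# Directory where the build looks for a dependency's downloaded archive.
# Layout copied from the gflags log; assumed to hold for the other libraries.
populate_src_dir() {
  local x86_dir="$1" name="$2"
  echo "${x86_dir}/fc_base/${name}-subbuild/${name}-populate-prefix/src"
}

# Download a URL into that directory, keeping the file name from the URL
# so that it matches what the build expects (e.g. v2.2.1.zip).
fetch_dependency() {
  local x86_dir="$1" name="$2" url="$3" dst
  dst="$(populate_src_dir "$x86_dir" "$name")"
  mkdir -p "$dst"
  # -f: fail on HTTP errors, -L: follow redirects, --retry: retry transient failures.
  curl -fL --retry 3 -o "${dst}/$(basename "$url")" "$url"
}

# Usage:
#   fetch_dependency /your/folder/wenet-e2e/wenet/runtime/server/x86 gflags \
#       https://github.com/gflags/gflags/archive/v2.2.1.zip
```

After the archive is in place, rerun `cmake ..` from the build directory.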

2. C++14-related errors

If `cmake --build .` reports a compile error like the following:

/your/folder/wenet/runtime/server/x86/fc_base/libtorch-src/include/ATen/ATen.h:4:2: error: C++14 or later compatible compiler is required to use ATen.
#error C++14 or later compatible compiler is required to use ATen.

edit the CMakeLists.txt file:

# Current file: /your/folder/wenet-e2e/wenet/runtime/server/x86/CMakeLists.txt
cmake_minimum_required(VERSION 3.14 FATAL_ERROR)

project(wenet VERSION 0.1)

# Add the following two lines
set(CMAKE_CXX_STANDARD 14)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
