
WebAssembly/Rust Tutorial: Pitch-Perfect Audio Processing

Peter is a full-time developer with 15 years of experience building applications for the web and desktop.

With support in all modern browsers, WebAssembly (or "Wasm") is transforming the way we develop user experiences for the web. It is a simple binary executable format that allows libraries or even whole programs written in other programming languages to run in the web browser.

Developers are constantly looking for ways to be more productive, such as:

  • Using a single app codebase for multiple target platforms, while having the app perform well on all of them
  • Creating a UX that is smooth and beautiful on desktop as well as in mobile environments
  • Taking advantage of the open-source library ecosystem to avoid "reinventing the wheel" during app development

For front-end developers, WebAssembly provides all three, answering the search for a web app UI that can truly rival the native mobile or desktop experience. It even allows the use of libraries written in non-JavaScript languages, such as C++ or Go!

In this Wasm/Rust tutorial, we'll create a simple pitch-detector app, like a guitar tuner. It will use the built-in audio capabilities of the browser, and run at 60 frames per second (FPS), even on mobile devices. You don't need to understand the Web Audio API or even be familiar with Rust to follow along with this tutorial; however, comfort with JavaScript is expected.

Note: Unfortunately, at the time of writing, the techniques used in this article (specific to the Web Audio API) do not yet work in Firefox. Therefore, even though Firefox otherwise has excellent Wasm and Web Audio API support, we recommend using Chrome, Chromium, or Edge for this tutorial for the time being.

What This WebAssembly/Rust Tutorial Covers

  • Creating a simple function in Rust and calling it from JavaScript (via WebAssembly)
  • Using the modern AudioWorklet API of the browser for high-performance audio processing in the browser
  • Communicating between workers in JavaScript
  • Tying it all together into a simple React app

Note: If you're more interested in the "how" than the "why" of this article, feel free to skip straight ahead to the tutorial.

Why Wasm?

There are several reasons it might make sense to use WebAssembly:

  • It allows code to be executed in the browser that was written in virtually any imaginable language.
    • This includes making use of existing libraries (numerical, audio processing, machine learning, etc.) that are written in languages other than JavaScript.
  • Depending on the choice of language used, Wasm is able to run at near-native speeds. This has the potential to bring web application performance characteristics closer to the native experiences of mobile and desktop.

Why Not Always Use Wasm?

The popularity of WebAssembly will surely continue to grow; however, it's not suitable for all web development:

  • For simple projects, sticking to JavaScript, HTML, and CSS will likely deliver a working product in a shorter time.
  • Older browsers such as Internet Explorer do not support Wasm directly.
  • Typical use of WebAssembly requires adding tools to your toolchain, such as a language compiler. If your team prioritizes keeping development and continuous-integration tooling as simple as possible, using Wasm will run counter to this.

Why a Wasm/Rust Tutorial Specifically?

Although many programming languages can be compiled to Wasm, I chose Rust for this example. Rust was created by Mozilla in 2010 and is growing in popularity. Rust occupies the top spot for "most loved" language in the 2020 Developer Survey from Stack Overflow. But the reasons to use Rust with WebAssembly go beyond mere trendiness:

  • First and foremost, Rust has a small runtime, meaning less code is sent to the browser when a user visits the site, helping keep the website footprint low.
  • Rust has excellent Wasm support, including high-level interoperability with JavaScript.
  • Rust provides near C/C++-level performance, yet has a very safe memory model. Compared with other languages, Rust performs extra safety checks when compiling your code, dramatically reducing the potential for crashes caused by empty or uninitialized variables. This can lead to simpler error handling and a higher chance of maintaining a good UX when unexpected problems occur.
  • Rust is not garbage collected. This means that Rust code is fully in control of when memory is allocated and cleaned up, allowing consistent performance, a key requirement in real-time systems.

The many benefits of Rust also come with a steep learning curve, so choosing the right programming language depends on a variety of factors, such as the makeup of the team that will develop and maintain the code.

WebAssembly Performance: Maintaining a Silky-smooth Web App

Since we're programming in WebAssembly with Rust, how can we use Rust to gain the performance benefits that led us to Wasm in the first place? For an application with a rapidly updating GUI to feel "smooth" to users, it must be able to refresh the display as regularly as the screen hardware does. That's typically 60 FPS, so our application must be able to redraw its user interface within about 16.7 ms (1,000 ms / 60 FPS).

Our application detects and displays the current pitch in real time, which means the combined detection computation and drawing would have to stay within 16.7 ms per frame. In the next section, we'll take advantage of browser support for analyzing audio on another thread while the main thread does its work. This is a major win for performance, since computation and drawing then each have an entire 16.7 ms at their disposal.
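As a quick sanity check on that arithmetic, the per-frame budget follows directly from the target frame rate (a minimal sketch, not part of the app itself):

```rust
fn main() {
    let fps = 60.0_f64;
    // Each displayed frame gets 1,000 ms divided across 60 frames.
    let frame_budget_ms = 1000.0 / fps;
    assert!((frame_budget_ms - 16.7).abs() < 0.05); // ~16.7 ms per frame
    println!("{:.1} ms per frame", frame_budget_ms);
}
```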

Web Audio Basics

In this application, we'll perform the pitch detection in a high-performance WebAssembly audio module. Further, we'll ensure the computation doesn't run on the main thread.

Why can't we keep things simple and perform the pitch detection on the main thread?

  • Audio processing is often computation-intensive. This is due to the large number of samples that need to be processed every second. For example, detecting audio pitch reliably requires analyzing the spectrum of 44,100 samples per second.
  • JIT compilation and garbage collection in JavaScript happen on the main thread, and we want to avoid these in the audio-processing code for consistent performance.
  • If the time taken to process an audio frame eats significantly into the 16.7 ms frame budget, the UX will suffer from choppy animation.
  • We want our app to run smoothly even on lower-performance mobile devices!

Web Audio worklets allow applications to keep hitting a smooth 60 FPS because the audio processing cannot hold up the main thread. If the audio processing is too slow and falls behind, there will be other effects, such as lagging audio. However, the UX will remain responsive to the user.
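To get a feel for the cadence involved: AudioWorklet processors receive audio in fixed 128-sample blocks (the render quantum, which we rely on later in PitchProcessor.js). At a 44,100 Hz sample rate, a rough back-of-the-envelope sketch of how often the worklet's process() callback fires:

```rust
fn main() {
    let sample_rate_hz = 44_100.0_f64;
    let render_quantum = 128.0; // samples delivered per process() call
    let ms_per_call = render_quantum / sample_rate_hz * 1000.0;
    // ~2.9 ms between calls: a much tighter deadline than the
    // 16.7 ms video-frame budget on the main thread.
    assert!((ms_per_call - 2.9).abs() < 0.05);
    println!("{:.2} ms per process() call", ms_per_call);
}
```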

WebAssembly/Rust Tutorial: Getting Started

This tutorial assumes you have Node.js installed, as well as npx. If you don’t have npx already, you can use npm (which comes with Node.js) to install it:

npm install -g npx

Creating the Web App

We'll use React for this Wasm/Rust tutorial.

In a terminal, we'll run the following commands:

npx create-react-app wasm-audio-app
cd wasm-audio-app

This uses npx to execute the create-react-app command (contained in the corresponding package maintained by Facebook) to create a fresh React application in the directory wasm-audio-app.

create-react-app is a CLI for generating React-based single-page applications (SPAs). It makes it easy to start a new project using React. However, the output project contains boilerplate code that will need to be replaced.

First, although I highly recommend unit testing your application throughout development, testing is beyond the scope of this tutorial. So we'll go ahead and delete src/App.test.js and src/setupTests.js.

Application Overview

Our app will have five main JavaScript components:

  • public/wasm-audio/wasm-audio.js contains the JavaScript bindings to the Wasm module that provides the pitch-detection algorithm.
  • public/PitchProcessor.js is where the audio processing happens. It runs in the Web Audio rendering thread and will consume the Wasm API.
  • src/PitchNode.js contains an implementation of a Web Audio node, which is connected to the Web Audio graph and runs in the main thread.
  • src/setupAudio.js uses web browser APIs to access an available audio-recording device.
  • src/App.js and src/App.css comprise the application user interface.

A flow chart of the pitch-detection application. Blocks 1 and 2 run on the Web Audio thread. Block 1 is the Wasm (Rust) pitch detector, in the file wasm-audio/lib.rs. Block 2 is the Web Audio detection and communication in the file PitchProcessor.js. It asks the detector to initialize, and the detector sends detected pitches back to the Web Audio interface. Blocks 3, 4, and 5 run on the main thread. Block 3 is the Web Audio controller, in the file PitchNode.js. It sends the Wasm module to PitchProcessor.js and receives detected pitches from it. Block 4 is the Web Audio setup in setupAudio.js. It creates a PitchNode object. Block 5 is the web app UI, comprising App.js and App.css. It calls setupAudio.js at startup. It also pauses or resumes the audio recording by sending messages to the PitchNode, from which it receives detected pitches to display to the user.
Overview of the Wasm audio app.

Let's dive straight into the heart of the app and define the Rust code for our Wasm module. We'll then code the various pieces of Web-Audio-related JavaScript and finish with the UI.

1. Pitch Detection With Rust and WebAssembly

Our Rust code will calculate a pitch from an array of audio samples.

Get Rust

You can follow these instructions to set up the Rust development chain.

Install Tools for Building WebAssembly Components in Rust

wasm-pack allows you to build, test, and publish Rust-generated WebAssembly components. If you haven't already, install wasm-pack.

cargo-generate helps get a new Rust project up and running by leveraging a pre-existing Git repository as a template. We'll use it to bootstrap a simple audio analyzer in Rust that can be accessed using WebAssembly from the browser.

Using the cargo tool that comes with the Rust toolchain, you can install cargo-generate:

cargo install cargo-generate

Once installation completes (it may take several minutes), we're ready to create our Rust project.

Creating Our WebAssembly Module

From the root folder of our application, we'll clone the project template:

$ cargo generate --git https://github.com/rustwasm/wasm-pack-template

When prompted for a new project name, we’ll enter wasm-audio.

In the wasm-audio directory, there will now be a Cargo.toml file with the following contents:

[package]
name = "wasm-audio"
version = "0.1.0"
authors = ["Your Name <your.name@example.com>"]
edition = "2018"

[lib]
crate-type = ["cdylib", "rlib"]

[features]
default = ["console_error_panic_hook"]

[dependencies]
wasm-bindgen = "0.2.63"

...

Cargo.toml is used to define a Rust package (which Rust calls a "crate"), serving a similar function for Rust apps as package.json does for JavaScript applications.

The [package] section defines metadata that is used when publishing the package to the official package registry of Rust.

The [lib] section describes the output format from the Rust compilation process. Here, "cdylib" tells Rust to produce a "dynamic system library" that can be loaded from another language (in our case, JavaScript) and including "rlib" tells Rust to add a static library containing metadata about the produced library. This second specifier is not necessary for our purposes - it assists with development of further Rust modules that consume this crate as a dependency - but is safe to leave in.

In [features], we ask Rust to include an optional feature, console_error_panic_hook, to provide functionality that converts the unhandled-errors mechanism of Rust (called a panic) to console errors that show up in the dev tools for debugging.

Finally, [dependencies] lists all crates that this one depends on. The only dependency supplied out of the box is wasm-bindgen, which provides automatic generation of JavaScript bindings to our Wasm module.

Implementing the Pitch Detector in Rust

The purpose of this app is to be able to detect the pitch of a musician's voice or instrument in real time. To ensure this executes as quickly as possible, a WebAssembly module is charged with calculating the pitch. For single-voice pitch detection, we'll use the "McLeod" pitch method, which is implemented in the existing Rust pitch-detection library.

Much like the Node.js package manager (npm), Rust includes its own package manager, called Cargo. This makes it easy to install packages that have been published to the Rust crate registry.

To add the dependency, edit Cargo.toml, adding the line for pitch-detection to the dependencies section:

[dependencies]
wasm-bindgen = "0.2.63"
pitch-detection = "0.1"

This instructs Cargo to download and install the pitch-detection dependency during the next cargo build or, since we're targeting WebAssembly, this will be performed during the next wasm-pack build.

Creating a JavaScript-callable Pitch Detector in Rust

First, we'll add a file defining a handy utility whose purpose we'll discuss later:

Create wasm-audio/src/utils.rs and paste the contents of this file into it.

We'll replace the generated code in wasm-audio/src/lib.rs with the following code, which performs pitch detection via a fast Fourier transform (FFT) algorithm:

use pitch_detection::{McLeodDetector, PitchDetector};
use wasm_bindgen::prelude::*;
mod utils;

#[wasm_bindgen]
pub struct WasmPitchDetector {
  sample_rate: usize,
  fft_size: usize,
  detector: McLeodDetector<f32>,
}

#[wasm_bindgen]
impl WasmPitchDetector {
  pub fn new(sample_rate: usize, fft_size: usize) -> WasmPitchDetector {
    utils::set_panic_hook();

    let fft_pad = fft_size / 2;

    WasmPitchDetector {
      sample_rate,
      fft_size,
      detector: McLeodDetector::<f32>::new(fft_size, fft_pad),
    }
  }

  pub fn detect_pitch(&mut self, audio_samples: Vec<f32>) -> f32 {
    if audio_samples.len() < self.fft_size {
      panic!("Insufficient samples passed to detect_pitch(). Expected an array containing {} elements but got {}", self.fft_size, audio_samples.len());
    }

    // Include only notes that exceed a power threshold which relates to the
    // amplitude of frequencies in the signal. Use the suggested default
    // value of 5.0 from the library.
    const POWER_THRESHOLD: f32 = 5.0;

    // The clarity measure describes how coherent the sound of a note is. For
    // example, the background sound in a crowded room would typically have
    // low clarity, while a ringing tuning fork would have high clarity.
    // This threshold is used to accept detected notes that are clear enough
    // (valid values are in the range 0-1).
    const CLARITY_THRESHOLD: f32 = 0.6;

    let optional_pitch = self.detector.get_pitch(
      &audio_samples,
      self.sample_rate,
      POWER_THRESHOLD,
      CLARITY_THRESHOLD,
    );

    match optional_pitch {
      Some(pitch) => pitch.frequency,
      None => 0.0,
    }
  }
}

Let's examine this in more detail:

#[wasm_bindgen]

wasm_bindgen is a Rust macro that helps implement the binding between JavaScript and Rust. When compiled to WebAssembly, this macro instructs the compiler to create a JavaScript binding to a class. The above Rust code will translate to JavaScript bindings that are simply thin wrappers for calls into and out of the Wasm module. The light layer of abstraction, combined with direct shared memory with JavaScript, is what helps Wasm deliver excellent performance.

#[wasm_bindgen]
pub struct WasmPitchDetector {
  sample_rate: usize,
  fft_size: usize,
  detector: McLeodDetector<f32>,
}

#[wasm_bindgen]
impl WasmPitchDetector {
...
}

Rust has no concept of classes. Rather, an object's data is described by a struct, and its behavior is described through impls or traits.

Why expose the pitch-detection functionality via an object rather than a plain function? Because this way, we only initialize the data structures used by the internal McLeodDetector once, during the creation of the WasmPitchDetector. This keeps the detect_pitch function fast by avoiding expensive memory allocation during operation.
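The same construct-once, reuse-many-times pattern can be sketched in plain Rust, without the pitch-detection crate. Analyzer and its scratch buffer here are hypothetical stand-ins for the detector and its internal FFT buffers:

```rust
// Hypothetical stand-in for a detector that needs internal scratch memory.
struct Analyzer {
    scratch: Vec<f32>, // allocated once in new(), reused by every call
}

impl Analyzer {
    fn new(fft_size: usize) -> Analyzer {
        Analyzer { scratch: vec![0.0; fft_size] }
    }

    // No allocation happens here, so repeated real-time calls stay fast.
    fn analyze(&mut self, samples: &[f32]) -> f32 {
        for (slot, s) in self.scratch.iter_mut().zip(samples) {
            *slot = *s;
        }
        self.scratch.iter().sum() // placeholder "analysis"
    }
}

fn main() {
    let mut analyzer = Analyzer::new(4);
    assert_eq!(analyzer.analyze(&[1.0, 2.0, 3.0, 4.0]), 10.0);
}
```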

pub fn new(sample_rate: usize, fft_size: usize) -> WasmPitchDetector {
  utils::set_panic_hook();

  let fft_pad = fft_size / 2;

  WasmPitchDetector {
    sample_rate,
    fft_size,
    detector: McLeodDetector::<f32>::new(fft_size, fft_pad),
  }
}

When a Rust application encounters an error that it cannot easily recover from, it is quite common to invoke the panic! macro. This instructs Rust to report an error and terminate the application immediately. Making use of panics can be particularly useful for early development, before an error-handling strategy is in place, as it allows you to catch false assumptions quickly.

Calling utils::set_panic_hook() once during setup will ensure panic messages appear in the browser development tools.

Next, we define fft_pad, the amount of zero-padding applied to each analysis FFT. Padding, in combination with the windowing function used by the algorithm, helps “smooth” the results as the analysis moves across the incoming sampled audio data. Using a pad of half the FFT length works well for many instruments.

Finally, Rust returns the result of the last statement automatically, so the WasmPitchDetector struct statement is the return value of new().

The rest of our impl WasmPitchDetector Rust code defines the API for detecting pitches:

pub fn detect_pitch(&mut self, audio_samples: Vec<f32>) -> f32 {
  ...
}

This is what a member function definition looks like in Rust. A public member detect_pitch is added to WasmPitchDetector. Its first argument is a mutable reference (&mut) to an instantiated object of the same type containing the struct and impl fields, but this is passed automatically when calling, as we'll see below.

Additionally, our member function takes an arbitrarily sized array of 32-bit floating-point numbers and returns a single number. Here, that will be the resulting pitch calculated across those samples (in Hz).

if audio_samples.len() < self.fft_size {
  panic!("Insufficient samples passed to detect_pitch(). Expected an array containing {} elements but got {}", self.fft_size, audio_samples.len());
}

The above code detects whether sufficient samples were provided to the function for a valid pitch analysis to be performed. If not, the Rust panic! macro is called, which results in an immediate exit from Wasm, with the error message printed to the browser dev-tools console.

let optional_pitch = self.detector.get_pitch(
  &audio_samples,
  self.sample_rate,
  POWER_THRESHOLD,
  CLARITY_THRESHOLD,
);

This calls into the third-party library to calculate the pitch from the latest audio samples. POWER_THRESHOLD and CLARITY_THRESHOLD can be adjusted to tune the sensitivity of the algorithm.

We end with an implied return of a floating-point value via the match keyword, which works similarly to a switch statement in other languages. Some() and None let us handle such cases appropriately without running into a null-pointer exception.
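For readers new to Rust, here is a minimal, self-contained sketch of that Option-and-match pattern, separate from the pitch library itself (frequency_or_zero is an illustrative name, not part of our module):

```rust
fn frequency_or_zero(optional_pitch: Option<f32>) -> f32 {
    // match is an expression: the value of the selected arm becomes the
    // function's return value, with no explicit `return` needed.
    match optional_pitch {
        Some(frequency) => frequency,
        None => 0.0,
    }
}

fn main() {
    assert_eq!(frequency_or_zero(Some(440.0)), 440.0); // a detected pitch
    assert_eq!(frequency_or_zero(None), 0.0);          // nothing detected
}
```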

Building the WebAssembly Application

When developing Rust applications, the usual build procedure is to invoke a build using cargo build. However, we are generating a Wasm module, so we'll make use of wasm-pack, which provides simpler syntax when targeting Wasm. (It also allows publishing the resulting JavaScript bindings to the npm registry, but that's outside the scope of this tutorial.)

wasm-pack supports a variety of build targets. Because we will consume the module directly from a Web Audio worklet, we will target the web option. Other targets include building for a bundler such as webpack or for consumption from Node.js. We'll run this from the wasm-audio/ subdirectory:

wasm-pack build --target web

If successful, an npm module is created under ./pkg.

This is a JavaScript module with its very own auto-generated package.json. This can be published to the npm registry if desired. To keep things simple for now, we can simply copy and paste this pkg under our folder public/wasm-audio:

cp -R ./wasm-audio/pkg ./public/wasm-audio

With that, we have created a Rust Wasm module ready to be consumed by the web app, or more specifically, by PitchProcessor.

2. Our PitchProcessor Class (Based on the Native AudioWorkletProcessor)

For this app, we'll use an audio-processing standard that has recently gained broad browser compatibility. Specifically, we'll use the Web Audio API and run the expensive computations in a custom AudioWorkletProcessor. Afterwards we'll create the corresponding custom AudioWorkletNode class (which we'll call PitchNode) as a bridge back to the main thread.

Create a new file public/PitchProcessor.js and paste the following code into it:

import init, { WasmPitchDetector } from "./wasm-audio/wasm_audio.js";

class PitchProcessor extends AudioWorkletProcessor {
  constructor() {
    super();

    // Initialized to an array holding a buffer of samples for analysis later -
    // once we know how many samples need to be stored. Meanwhile, an empty
    // array is used, so that early calls to process() with empty channels
    // do not break initialization.
    this.samples = [];
    this.totalSamples = 0;

    // Listen to events from the PitchNode running on the main thread.
    this.port.onmessage = (event) => this.onmessage(event.data);

    this.detector = null;
  }

  onmessage(event) {
    if (event.type === "send-wasm-module") {
      // PitchNode has sent us a message containing the Wasm library to load into
      // our context as well as information about the audio device used for
      // recording.
      init(WebAssembly.compile(event.wasmBytes)).then(() => {
        this.port.postMessage({ type: 'wasm-module-loaded' });
      });
    } else if (event.type === 'init-detector') {
      const { sampleRate, numAudioSamplesPerAnalysis } = event;

      // Store this because we use it later to detect when we have enough recorded
      // audio samples for our first analysis.
      this.numAudioSamplesPerAnalysis = numAudioSamplesPerAnalysis;

      this.detector = WasmPitchDetector.new(sampleRate, numAudioSamplesPerAnalysis);

      // Holds a buffer of audio sample values that we'll send to the Wasm module
      // for analysis at regular intervals.
      this.samples = new Array(numAudioSamplesPerAnalysis).fill(0);
      this.totalSamples = 0;
    }
  }

  process(inputs, outputs) {
    // inputs contains incoming audio samples for further processing. outputs
    // contains the audio samples resulting from any processing performed by us.
    // Here, we are performing analysis only to detect pitches so do not modify
    // outputs.

    // inputs holds one or more "channels" of samples. For example, a microphone
    // that records "in stereo" would provide two channels. For this simple app,
    // we assume either "mono" input or use only the "left" channel if the
    // microphone is stereo.

    const inputChannels = inputs[0];

    // inputSamples holds an array of new samples to process.
    const inputSamples = inputChannels[0];

    // In the AudioWorklet spec, process() is called whenever exactly 128 new
    // audio samples have arrived. We simplify the logic for filling up the
    // buffer by making an assumption that the analysis size is 128 samples or
    // larger and is a power of 2.
    if (this.totalSamples < this.numAudioSamplesPerAnalysis) {
      for (const sampleValue of inputSamples) {
        this.samples[this.totalSamples++] = sampleValue;
      }
    } else {
      // Buffer is already full. We do not want the buffer to grow continually,
      // so instead will "cycle" the samples through it so that it always
      // holds the latest ordered samples of length equal to
      // numAudioSamplesPerAnalysis.

      // Shift the existing samples left by the length of new samples (128).
      const numNewSamples = inputSamples.length;
      const numExistingSamples = this.samples.length - numNewSamples;
      for (let i = 0; i < numExistingSamples; i++) {
        this.samples[i] = this.samples[i + numNewSamples];
      }
      // Add the new samples onto the end, into the 128-wide slot vacated by
      // the previous copy.
      for (let i = 0; i < numNewSamples; i++) {
        this.samples[numExistingSamples + i] = inputSamples[i];
      }
      this.totalSamples += inputSamples.length;
    }

    // Once our buffer has enough samples, pass them to the Wasm pitch detector.
    if (this.totalSamples >= this.numAudioSamplesPerAnalysis && this.detector) {
      const result = this.detector.detect_pitch(this.samples);

      if (result !== 0) {
        this.port.postMessage({ type: "pitch", pitch: result });
      }
    }

    // Returning true tells the Audio system to keep going.
    return true;
  }
}

registerProcessor("PitchProcessor", PitchProcessor);

The PitchProcessor is a companion to the PitchNode but runs in a separate thread so that audio-processing computation can be performed without blocking work done on the main thread.

Mainly, the PitchProcessor:

  • Handles the "send-wasm-module" event sent from PitchNode by compiling and loading the Wasm module into the worklet. Once done, it lets PitchNode know by sending a "wasm-module-loaded" event. This callback approach is needed because all communication between PitchNode and PitchProcessor crosses a thread boundary and cannot be performed synchronously.
  • Also responds to the "init-detector" event from PitchNode by configuring the WasmPitchDetector.
  • Processes audio samples received from the browser audio graph, delegates pitch-detection computation to the Wasm module, and then sends any detected pitch back to PitchNode (which sends the pitch along to the React layer via its onPitchDetectedCallback).
  • Registers itself under a specific, unique name. This way the browser knows, via the base class of PitchNode (the native AudioWorkletNode), how to instantiate our PitchProcessor later when PitchNode is constructed. See setupAudio.js.

The following diagram visualizes the flow of events between the PitchNode and PitchProcessor:

A more detailed flow chart comparing the interactions between the PitchNode and PitchProcessor objects at runtime. During initial setup, the PitchNode sends the Wasm module as an array of bytes to the PitchProcessor, which compiles it and sends a message back to the PitchNode, which finally responds with an event message asking the PitchProcessor to initialize itself. While recording audio, the PitchNode sends nothing, and receives two types of event messages from the PitchProcessor: a detected pitch, or an error, whether produced by Wasm or the worklet.
Runtime event messages.

3. Adding Web Audio Worklet Code

PitchNode.js provides the interface to our custom pitch-detection audio processing. The PitchNode object is the mechanism whereby pitches detected using the WebAssembly module working in the AudioWorklet thread will make their way to the main thread and React for rendering.

In src/PitchNode.js, we'll subclass the built-in AudioWorkletNode of the Web Audio API:

export default class PitchNode extends AudioWorkletNode {
  /**
   * Initialize the Audio processor by sending the fetched WebAssembly module to
   * the processor worklet.
   *
   * @param {ArrayBuffer} wasmBytes Sequence of bytes representing the entire
   * WASM module that will handle pitch detection.
   * @param {number} numAudioSamplesPerAnalysis Number of audio samples used
   * for each analysis. Must be a power of 2.
   */
  init(wasmBytes, onPitchDetectedCallback, numAudioSamplesPerAnalysis) {
    this.onPitchDetectedCallback = onPitchDetectedCallback;
    this.numAudioSamplesPerAnalysis = numAudioSamplesPerAnalysis;

    // Listen to messages sent from the audio processor.
    this.port.onmessage = (event) => this.onmessage(event.data);

    this.port.postMessage({
      type: "send-wasm-module",
      wasmBytes,
    });
  }

  // Handle an uncaught exception thrown in the PitchProcessor.
  onprocessorerror(err) {
    console.log(
      `An error from AudioWorkletProcessor.process() occurred: ${err}`
    );
  }

  onmessage(event) {
    if (event.type === 'wasm-module-loaded') {
      // The Wasm module was successfully sent to the PitchProcessor running on the
      // AudioWorklet thread and compiled. This is our cue to configure the pitch
      // detector.
      this.port.postMessage({
        type: "init-detector",
        sampleRate: this.context.sampleRate,
        numAudioSamplesPerAnalysis: this.numAudioSamplesPerAnalysis
      });
    } else if (event.type === "pitch") {
      // A pitch was detected. Invoke our callback which will result in the UI updating.
      this.onPitchDetectedCallback(event.pitch);
    }
  }
}

The key tasks performed by PitchNode are:

  • Send the WebAssembly module as a sequence of raw bytes (those passed in from setupAudio.js) to the PitchProcessor, which runs on the AudioWorklet thread. This is how the PitchProcessor loads the pitch-detection Wasm module.
  • Handle the event sent by PitchProcessor when it successfully compiles the Wasm, and send it another event that passes pitch-detection configuration information to it.
  • Handle detected pitches as they arrive from the PitchProcessor and forward them to the UI function setLatestPitch() via onPitchDetectedCallback().

Note: This code of the object runs on the main thread, so it should refrain from performing further processing on detected pitches in case that is expensive and causes frame-rate drops.

4. Adding Code to Set Up Web Audio

In order for our web app to access and process live input from the microphone of the client machine, it must:

  1. Obtain the user's permission for the browser to access any connected microphones
  2. Access the microphone's output as an audio-stream object
  3. Attach code to process the incoming audio-stream samples and produce a sequence of detected pitches

In src/setupAudio.js, we’ll do that, and also load the Wasm module asynchronously so we can initialize our PitchNode with it, before attaching our PitchNode:

import PitchNode from "./PitchNode";

async function getWebAudioMediaStream() {
  if (!window.navigator.mediaDevices) {
    throw new Error(
      "This browser does not support web audio or it is not enabled."
    );
  }

  try {
    const result = await window.navigator.mediaDevices.getUserMedia({
      audio: true,
      video: false,
    });

    return result;
  } catch (e) {
    switch (e.name) {
      case "NotAllowedError":
        throw new Error(
          "A recording device was found but has been disallowed for this application. Enable the device in the browser settings."
        );

      case "NotFoundError":
        throw new Error(
          "No recording device was found. Please attach a microphone and click Retry."
        );

      default:
        throw e;
    }
  }
}

export async function setupAudio(onPitchDetectedCallback) {
  // Get the browser audio. Awaits user "allowing" it for the current tab.
  const mediaStream = await getWebAudioMediaStream();

  const context = new window.AudioContext();
  const audioSource = context.createMediaStreamSource(mediaStream);

  let node;

  try {
    // Fetch the WebAssembly module that performs pitch detection.
    const response = await window.fetch("wasm-audio/wasm_audio_bg.wasm");
    const wasmBytes = await response.arrayBuffer();

    // Add our audio processor worklet to the context.
    const processorUrl = "PitchProcessor.js";
    try {
      await context.audioWorklet.addModule(processorUrl);
    } catch (e) {
      throw new Error(
        `Failed to load audio analyzer worklet at url: ${processorUrl}. Further info: ${e.message}`
      );
    }

    // Create the AudioWorkletNode which enables the main JavaScript thread to
    // communicate with the audio processor (which runs in a Worklet).
    node = new PitchNode(context, "PitchProcessor");

    // numAudioSamplesPerAnalysis specifies the number of consecutive audio samples that
    // the pitch detection algorithm calculates for each unit of work. Larger values tend
    // to produce slightly more accurate results but are more expensive to compute and
    // can lead to notes being missed in faster passages i.e. where the music note is
    // changing rapidly. 1024 is usually a good balance between efficiency and accuracy
    // for music analysis.
    const numAudioSamplesPerAnalysis = 1024;

    // Send the Wasm module to the audio node which in turn passes it to the
    // processor running in the Worklet thread. Also, pass any configuration
    // parameters for the Wasm detection algorithm.
    node.init(wasmBytes, onPitchDetectedCallback, numAudioSamplesPerAnalysis);

    // Connect the audio source (microphone output) to our analysis node.
    audioSource.connect(node);

    // Connect our analysis node to the output. Required even though we do not
    // output any audio. Allows further downstream audio processing or output to
    // occur.
    node.connect(context.destination);
  } catch (err) {
    throw new Error(
      `Failed to load audio analyzer WASM module. Further info: ${err.message}`
    );
  }

  return { context, node };
}

This assumes a WebAssembly module is available to be loaded at public/wasm-audio, which we accomplished in the earlier 银河游戏官方首页 section.

5. Defining the Application UI

Let’s define a basic user interface for the pitch detector. We’ll replace the contents of src/App.js with the following code:

import React from "react";
import "./App.css";
import { setupAudio } from "./setupAudio";

function PitchReadout({ running, latestPitch }) {
  return (
    <div className="Pitch-readout">
      {latestPitch
        ? `Latest pitch: ${latestPitch.toFixed(1)} Hz`
        : running
        ? "Listening..."
        : "Paused"}
    </div>
  );
}

function AudioRecorderControl() {
  // Ensure the latest state of the audio module is reflected in the UI
  // by defining some variables (and a setter function for updating them)
  // that are managed by React, passing their initial values to useState.

  // 1. audio is the object returned from the initial audio setup that
  //    will be used to start/stop the audio based on user input. While
  //    this is initialized once in our simple application, it is good
  //    practice to let React know about any state that _could_ change
  //    again.
  const [audio, setAudio] = React.useState(undefined);

  // 2. running holds whether the application is currently recording and
  //    processing audio and is used to provide button text (Start vs Stop).
  const [running, setRunning] = React.useState(false);

  // 3. latestPitch holds the latest detected pitch to be displayed in
  //    the UI.
  const [latestPitch, setLatestPitch] = React.useState(undefined);

  // Initial state. Initialize the web audio once a user gesture on the page
  // has been registered.
  if (!audio) {
    return (
      <button
        onClick={async () => {
          setAudio(await setupAudio(setLatestPitch));
          setRunning(true);
        }}
      >
        Start listening
      </button>
    );
  }

  // Audio already initialized. Suspend / resume based on its current state.
  const { context } = audio;
  return (
    <div>
      <button
        onClick={async () => {
          if (running) {
            await context.suspend();
            setRunning(context.state === "running");
          } else {
            await context.resume();
            setRunning(context.state === "running");
          }
        }}
        disabled={context.state !== "running" && context.state !== "suspended"}
      >
        {running ? "Pause" : "Resume"}
      </button>
      <PitchReadout running={running} latestPitch={latestPitch} />
    </div>
  );
}

function App() {
  return (
    <div className="App">
      <header className="App-header">
        Wasm Audio Tutorial
      </header>
      <div className="App-content">
        <AudioRecorderControl />
      </div>
    </div>
  );
}

export default App;

And we’ll replace App.css with some basic styles:

.App {
  display: flex;
  flex-direction: column;
  align-items: center;
  text-align: center;
  background-color: #282c34;
  min-height: 100vh;
  color: white;
  justify-content: center;
}

.App-header {
  font-size: 1.5rem;
  margin: 10%;
}

.App-content {
  margin-top: 15vh;
  height: 85vh;
}

.Pitch-readout {
  margin-top: 5vh;
  font-size: 3rem;
}

button {
  background-color: rgb(26, 115, 232);
  border: none;
  outline: none;
  color: white;
  margin: 1em;
  padding: 10px 14px;
  border-radius: 4px;
  width: 190px;
  text-transform: capitalize;
  cursor: pointer;
  font-size: 1.5rem;
}

button:hover {
  background-color: rgb(45, 125, 252);
}

With that, we should be ready to run our app, but first there's a gotcha to deal with.

WebAssembly/Rust Tutorial: So Close!

Now when we run yarn and then yarn start, switch to the browser, and attempt to record audio (using Chrome or Chromium, with developer tools open), we're met with some errors:

At wasm_audio.js line 24, the error "Uncaught ReferenceError: TextDecoder is not defined," followed at setupAudio.js line 84 (triggered by an async onClick from App.js line 43) by "Uncaught (in promise) Error: Failed to load audio analyzer WASM module. Further info: Failed to construct 'AudioWorkletNode': AudioWorkletNode cannot be created: The node name 'PitchProcessor' is not defined in AudioWorkletGlobalScope."
Wasm requirements are widely supported, just not yet in the Worklet spec.

The first error, TextDecoder is not defined, occurs when the browser attempts to execute the contents of wasm_audio.js. This in turn results in the failure to load the Wasm JavaScript wrapper, which produces the second error we see in the console.

The underlying cause of the issue is that modules produced by the Wasm package generator of Rust assume that TextDecoder (and TextEncoder) will be provided by the browser. This assumption holds for modern browsers when the Wasm module is being run from the main thread or even a worker thread. However, for worklets (such as the AudioWorklet context needed in this tutorial), TextDecoder and TextEncoder are not yet part of the spec and so are not available.

TextDecoder is needed by the Wasm code generator to convert from the flat, packed, shared-memory representation of Rust to the string format that JavaScript uses. In other words, in order to see strings produced by the Wasm code generator, TextEncoder and TextDecoder must be defined.

This issue is a symptom of the relative newness of WebAssembly. As browser support improves to support common WebAssembly patterns out of the box, these issues will likely disappear.

For now, we are able to work around it by defining a polyfill for TextDecoder.

Create a new file, public/TextEncoder.js, and import it from public/PitchProcessor.js:

import "./TextEncoder.js";

Make sure that this import statement comes before the wasm_audio import.

Lastly, paste this implementation into TextEncoder.js (courtesy of @Yaffle on GitHub).

Firefox Issues

As mentioned earlier, the way we combine Wasm with Web Audio worklets in our app does not work in Firefox. Even with the above shim, clicking the "Start Listening" button will result in this:

Unhandled Rejection (Error): Failed to load audio analyzer WASM module. Further info: Failed to load audio analyzer worklet at url: PitchProcessor.js. Further info: The operation was aborted.
    

That's because Firefox doesn't yet support importing modules from AudioWorklets; for us, that's PitchProcessor.js running in the AudioWorklet thread.

The Completed Application

Once done, we can simply reload the page. The app should load without error. Click "Start Listening" and allow your browser to access your microphone. You'll see a very basic pitch detector written in JavaScript with Wasm:


A screenshot of the app showing its header, "Wasm Audio Tutorial," a blue button with the word Pause on it, and the text "Latest pitch: 1380.1 Hz" underneath that.
Real-time pitch detection.

Programming in WebAssembly with Rust: A Real-time Web Audio Solution

In this tutorial, we built a web application from scratch that performs computationally intensive audio processing using WebAssembly. WebAssembly allowed us to take advantage of the near-native performance of Rust to perform the pitch detection. Further, this work could be performed on another thread, allowing the main JavaScript thread to focus on rendering, supporting silky-smooth frame rates even on mobile devices.

Wasm/Rust and Web Audio Takeaways

  • Modern browsers provide performant audio (and video) capture and processing inside web apps.
  • Rust has great tooling for Wasm, helping recommend it as the language of choice for projects incorporating WebAssembly.
  • Compute-intensive work can be performed efficiently in the browser using Wasm.

Despite the many WebAssembly advantages, there are a couple of Wasm pitfalls to watch out for:

  • Tooling for Wasm within worklets is still evolving. For example, we needed to implement our own versions of the TextEncoder and TextDecoder functionality required for passing strings between JavaScript and Wasm, because they were missing from the AudioWorklet context. Also, importing JavaScript bindings for our Wasm support from an AudioWorklet is not yet available in Firefox.
  • Although the application we developed was very simple, building the WebAssembly module and loading it from the AudioWorklet required significant setup. Introducing Wasm to projects does bring an increase in tooling complexity, which is important to keep in mind.

For your convenience, this GitHub repo contains the final, completed project. If you also do back-end development, you may also be interested in using Rust via WebAssembly within Node.js.



Understanding the Basics

Is WebAssembly a language?

WebAssembly is a programming language, but not one designed to be written directly by humans. Rather, it is compiled from other, higher-level languages into a compact, binary bytecode form for efficient transmission over the web and execution in today's browsers.

What is WebAssembly used for?

WebAssembly allows software written in languages other than JavaScript to run seamlessly in the browser. This lets web developers take advantage of the unique strengths of particular languages, or reuse existing libraries, while retaining the convenience and ubiquity of the web.

What makes WebAssembly fast?

WebAssembly programs use a compact binary representation, so they transfer to the browser faster than JavaScript. High-performance languages like Rust also typically translate into Wasm bytecode that runs fast.

What is WebAssembly written in?

WebAssembly programs use a compact, binary bytecode representation, so compared with JavaScript, they can be transmitted faster over the web. This bytecode is not written directly by humans but is instead produced when compiling code written in higher-level languages such as C/C++ or Rust.

What is the Rust programming language used for?

Rust's robust memory model, good concurrency support, and minimal runtime footprint make it well suited to systems-level software such as operating systems, device drivers, and embedded programs. It is also a capable WebAssembly option for web applications with demanding graphics or data-processing requirements.

Why is Rust so fast?

Rust programs are fast because their code compiles down to optimized machine-level instructions, and Rust does not use garbage collection, giving the programmer full control over how memory is used. This makes for consistent and predictable performance.