WebAssembly (WASM)

Ambi compiles to WASM32 and runs in browsers. This is a first-class target, not an afterthought.

Limitations compared to native

| Feature | Native | WASM |
| --- | --- | --- |
| llama.cpp inference | Yes | No (compile-time blocked) |
| OpenAI API | Yes | Yes (browser fetch) |
| Streaming API | Yes (tokio) | Yes (native fetch + ReadableStream) |
| Custom engine | Yes | Yes |
| spawn_blocking | Thread pool | Inline execution |
| Send + Sync bounds | Enforced | Relaxed (single-threaded) |
| GPU acceleration | Yes | No |

The llama-cpp feature is blocked at compile time for WASM:

rust
#[cfg(all(target_arch = "wasm32", feature = "llama-cpp"))]
compile_error!("llama-cpp not supported on wasm32");

Only openai-api or custom engines work on WASM.
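
If an application wants to reflect the same restriction at runtime (for example, to hide a local-model option in a browser UI), the target check can be mirrored with `#[cfg]`-selected constants. The following is only a sketch: `EngineKind`, `SUPPORTED`, and `is_supported` are hypothetical names invented for illustration and are not part of Ambi's API.

```rust
/// Hypothetical enum naming the engine families from the table above.
/// This type is an illustration only; it is not defined by Ambi.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EngineKind {
    LlamaCpp,
    OpenAiApi,
    Custom,
}

/// On wasm32 only the HTTP-based and custom engines are listed.
#[cfg(target_arch = "wasm32")]
pub const SUPPORTED: &[EngineKind] = &[EngineKind::OpenAiApi, EngineKind::Custom];

/// On native targets every engine family is listed.
#[cfg(not(target_arch = "wasm32"))]
pub const SUPPORTED: &[EngineKind] = &[
    EngineKind::LlamaCpp,
    EngineKind::OpenAiApi,
    EngineKind::Custom,
];

/// Returns true if the given engine family is usable on the current target.
pub fn is_supported(kind: EngineKind) -> bool {
    SUPPORTED.contains(&kind)
}
```

Because the list is chosen at compile time, `is_supported(EngineKind::LlamaCpp)` is `true` in a native build and `false` in a wasm32 build, with no runtime target detection involved.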

Building for WASM

bash
cargo build --target wasm32-unknown-unknown --no-default-features --features openai-api

Or use wasm-pack for a browser-ready package:

bash
wasm-pack build --target web --no-default-features --features openai-api

Runtime polyfills

The runtime module replaces Tokio-specific calls with WASM-compatible alternatives:

  • spawn() → wasm_bindgen_futures::spawn_local()
  • spawn_blocking() → direct synchronous execution (single-threaded)
  • sleep() → gloo_timers::future::sleep()
  • timeout() → future race against a timer
  • SendSync trait → empty marker (no-op in single-threaded context)

You don't need to change any code – the polyfills are applied automatically based on #[cfg(target_arch = "wasm32")].
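
To make the cfg-based swap concrete, here is a simplified, synchronous sketch of two of the substitutions listed above, using only the standard library. It is not Ambi's actual runtime module: `MaybeSendSync` and `run_blocking` are hypothetical names, and a plain OS thread stands in for Tokio's blocking thread pool.

```rust
/// Native targets: the marker requires `Send + Sync`.
#[cfg(not(target_arch = "wasm32"))]
pub trait MaybeSendSync: Send + Sync {}
#[cfg(not(target_arch = "wasm32"))]
impl<T: Send + Sync> MaybeSendSync for T {}

/// wasm32: the marker is empty, so every type satisfies it.
#[cfg(target_arch = "wasm32")]
pub trait MaybeSendSync {}
#[cfg(target_arch = "wasm32")]
impl<T> MaybeSendSync for T {}

/// Native targets: run the closure on another thread and wait for its result.
/// (A real runtime would hand the closure to a thread pool instead.)
#[cfg(not(target_arch = "wasm32"))]
pub fn run_blocking<F, R>(f: F) -> R
where
    F: FnOnce() -> R + Send + 'static,
    R: Send + 'static,
{
    std::thread::spawn(f).join().expect("blocking task panicked")
}

/// wasm32: there is no second thread, so the closure runs inline.
#[cfg(target_arch = "wasm32")]
pub fn run_blocking<F, R>(f: F) -> R
where
    F: FnOnce() -> R,
{
    f()
}
```

Call sites look the same on both targets, for example `run_blocking(|| 2 + 3)`, which is why code written against such a shim needs no per-target changes.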

Cargo.toml for WASM

toml
[dependencies]
ambi = { version = "0.3", default-features = false, features = ["openai-api"] }
tokio = { version = "1", features = ["sync", "macros"] }    # no rt-multi-thread
wasm-bindgen = "0.2"
wasm-bindgen-futures = "0.4"

Note: Tokio's rt-multi-thread feature is not needed for WASM, and enabling it will fail to compile for wasm32.

Streaming in the browser

The OpenAI provider for WASM uses native fetch and ReadableStream APIs for true streaming. The same chat_stream() API works identically in the browser:

rust
use futures::StreamExt;

let mut stream = runner.chat_stream(&agent, &state, "Tell me a story").await?;
while let Some(chunk) = stream.next().await {
    if let Ok(text) = chunk {
        // append to DOM
    }
}

You don't need to add any polyfills yourself: the runtime module automatically swaps Tokio internals for WASM-compatible alternatives.
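
The loop above silently skips any chunk that is an `Err`. One way to keep the partial text while still surfacing a failure is a small accumulator fed from inside the loop. This is a sketch under assumptions: `StreamBuffer` is a hypothetical helper (not part of Ambi), and it assumes each chunk is a `Result<String, E>` whose error type implements `Display`.

```rust
use std::fmt::Display;

/// Hypothetical accumulator for streamed text chunks (illustration only).
#[derive(Debug, Default)]
pub struct StreamBuffer {
    /// Text received so far, in arrival order.
    pub text: String,
    /// Message of the first error seen, if any.
    pub error: Option<String>,
}

impl StreamBuffer {
    /// Feeds one chunk into the buffer.
    /// Returns `true` while the caller should keep reading from the stream,
    /// and `false` once an error chunk has been recorded.
    pub fn push<E: Display>(&mut self, chunk: Result<String, E>) -> bool {
        match chunk {
            Ok(piece) => {
                self.text.push_str(&piece);
                true
            }
            Err(err) => {
                self.error = Some(err.to_string());
                false
            }
        }
    }
}
```

Inside the streaming loop this would be used as `if !buffer.push(chunk) { break; }`, after which `buffer.text` holds whatever arrived before the failure and `buffer.error` can be shown to the user.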

Example

See examples/webAssembly for a complete browser-ready setup with a UI toggle demoing real-time streaming text generation.

Released under the Apache-2.0 License.