WebAssembly (WASM)
Ambi compiles to WASM32 and runs in browsers. This is a first-class target, not an afterthought.
Limitations compared to native
| Feature | Native | WASM |
|---|---|---|
| llama.cpp inference | Yes | No (compile-time blocked) |
| OpenAI API | Yes | Yes (browser fetch) |
| Streaming API | Yes (tokio) | Yes (native fetch + ReadableStream) |
| Custom engine | Yes | Yes |
| spawn_blocking | Thread pool | Inline execution |
| Send + Sync bounds | Enforced | Relaxed (single-threaded) |
| GPU acceleration | Yes | No |
The llama-cpp feature is blocked at compile time for WASM:
#[cfg(all(target_arch = "wasm32", feature = "llama-cpp"))]
compile_error!("llama-cpp not supported on wasm32");
Only openai-api or custom engines work on WASM.
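Downstream code can branch on the same target check. The following is a minimal sketch, assuming a hypothetical helper named `supported_engines` (it is not part of Ambi) that mirrors the feature table above:

```rust
/// Hypothetical helper, not part of Ambi: lists the engine kinds that
/// are usable on the current compilation target, per the table above.
pub fn supported_engines() -> Vec<&'static str> {
    // These two work on both native and wasm32 builds.
    let mut engines = vec!["openai-api", "custom"];
    // `cfg!` expands to a compile-time boolean, so the check costs
    // nothing at runtime.
    if cfg!(not(target_arch = "wasm32")) {
        engines.push("llama-cpp");
    }
    engines
}
```

A UI could use a list like this to hide engine choices that the current build cannot offer.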
Building for WASM
cargo build --target wasm32-unknown-unknown --no-default-features --features openai-api
Or use wasm-pack for a browser-ready package:
wasm-pack build --target web --no-default-features --features openai-api
Runtime polyfills
The runtime module replaces Tokio-specific calls with WASM-compatible alternatives:
- spawn() → wasm_bindgen_futures::spawn_local()
- spawn_blocking() → direct synchronous execution (single-threaded)
- sleep() → gloo_timers::future::sleep()
- timeout() → future race against a timer
- SendSync trait → empty marker (no-op in single-threaded context)
You don't need to change any code – the polyfills are applied automatically based on #[cfg(target_arch = "wasm32")].
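To make the `cfg`-based swap concrete, here is a rough sketch of how two of these polyfills could be written. It is not Ambi's actual runtime module: the names `run_blocking`, `MaybeSendSync`, and `describe` are hypothetical, and a plain `std::thread` stands in for Tokio's blocking pool so the example has no dependencies.

```rust
// Illustrative only: all names here are hypothetical, and std threads
// stand in for Tokio's blocking pool. Ambi's runtime module may differ.

/// Native: run the closure on a separate thread and wait for its result.
#[cfg(not(target_arch = "wasm32"))]
pub fn run_blocking<T, F>(f: F) -> T
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    std::thread::spawn(f).join().expect("blocking task panicked")
}

/// wasm32: there is no second thread, so the closure runs inline.
#[cfg(target_arch = "wasm32")]
pub fn run_blocking<T, F>(f: F) -> T
where
    F: FnOnce() -> T,
{
    f()
}

/// Native: the marker trait requires `Send + Sync`.
#[cfg(not(target_arch = "wasm32"))]
pub trait MaybeSendSync: Send + Sync {}
#[cfg(not(target_arch = "wasm32"))]
impl<T: Send + Sync> MaybeSendSync for T {}

/// wasm32: the marker trait is empty, so every type satisfies it.
#[cfg(target_arch = "wasm32")]
pub trait MaybeSendSync {}
#[cfg(target_arch = "wasm32")]
impl<T> MaybeSendSync for T {}

/// Callers write one bound and get the right requirement per target.
pub fn describe<T: MaybeSendSync + std::fmt::Debug>(value: T) -> String {
    format!("{value:?}")
}
```

Because both variants share one signature shape, calling code such as `run_blocking(|| expensive())` compiles unchanged on either target; only the bounds and the execution strategy differ.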
Cargo.toml for WASM
[dependencies]
ambi = { version = "0.3", default-features = false, features = ["openai-api"] }
tokio = { version = "1", features = ["sync", "macros"] } # no rt-multi-thread
wasm-bindgen = "0.2"
wasm-bindgen-futures = "0.4"
Note: rt-multi-thread is not needed (and won't compile) for WASM.
Streaming in the browser
The OpenAI provider for WASM uses native fetch and ReadableStream APIs for true streaming. The same chat_stream() API works identically in the browser:
use futures::StreamExt;
let mut stream = runner.chat_stream(&agent, &state, "Tell me a story").await?;
while let Some(chunk) = stream.next().await {
    if let Ok(text) = chunk {
        // append to DOM
    }
}
No special WASM polyfills are needed – the runtime module automatically swaps Tokio internals for WASM-compatible alternatives.
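The streaming loop ignores `Err` chunks, which may hide a dropped connection from the user. One possible way to surface the failure is a small accumulator like the sketch below. `Transcript` is a hypothetical type, not part of Ambi, and it assumes each successful chunk is a `String`, which the source does not confirm.

```rust
/// Hypothetical helper, not part of Ambi: accumulates streamed text and
/// remembers the first error instead of silently dropping it.
/// Assumes each successful chunk is a `String`.
pub struct Transcript<E> {
    text: String,
    error: Option<E>,
}

impl<E> Transcript<E> {
    pub fn new() -> Self {
        Self { text: String::new(), error: None }
    }

    /// Feeds one stream item. Returns `false` once an error has been
    /// recorded, which signals the caller to stop reading.
    pub fn push(&mut self, chunk: Result<String, E>) -> bool {
        match chunk {
            Ok(piece) => {
                self.text.push_str(&piece);
                true
            }
            Err(e) => {
                self.error = Some(e);
                false
            }
        }
    }

    /// Text received so far.
    pub fn text(&self) -> &str {
        &self.text
    }

    /// The error that ended the stream, if any.
    pub fn error(&self) -> Option<&E> {
        self.error.as_ref()
    }
}
```

Inside the `while let` loop, the body would become `if !transcript.push(chunk) { break; }`, after which the UI can render `transcript.text()` and show `transcript.error()` if one was recorded.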
Example
See examples/webAssembly for a complete browser-ready setup with a UI toggle demoing real-time streaming text generation.