WebNN Basics

Introduction

The Web Neural Network API (WebNN) provides a low-level, hardware-accelerated interface for running neural network inference in the browser.

In this post, we’ll explore WebNN with a straightforward example. This piece is geared toward those who are new to machine learning, so the example is kept simple and the explanation is streamlined for clarity.

There are four core parts we will focus on:

  1. Context:
    https://www.w3.org/TR/webnn/#mlcontext
  2. GraphBuilder
    https://www.w3.org/TR/webnn/#mlgraphbuilder
  3. Graph
    https://www.w3.org/TR/webnn/#mlgraph
  4. Tensors
    https://www.w3.org/TR/webnn/#mltensor

There are many more parts, but for this post these are the basics.

Description

The Graph

A computational graph can be seen as the framework that governs how operations execute in a machine learning model. It consists of a series of mathematical steps applied to incoming data. The graph includes input nodes for supplying values to be processed, and it produces an output that we often refer to as the model’s “inference.”

Although we often refer to the graph as the model, technically the graph only represents the structure or “blueprint.” A complete model also includes the trained parameters (like weights and biases), making it more than just the graph alone.

Thus, a model = graph + weights + biases (we don’t need to dive too deep into this for now).

Graph Builder

We use a graph builder to define our computational graph step by step. After building the graph, we don’t need the builder again unless we decide to alter or reconstruct the graph. The builder offers multiple functions to set up inputs, outputs, and the mathematical operations that form the model’s structure. Below is a simple code snippet demonstrating how to create two inputs and combine them:

const A = graphBuilder.input("A", descriptor);
const B = graphBuilder.input("B", descriptor);
const C = graphBuilder.add(A, B);  

What this code does:

  • graphBuilder.input: Defines two input tensors, named “A” and “B,” each described by the same descriptor (which typically specifies shape and data type).
  • graphBuilder.add(A, B): Creates a node C that represents the sum of A and B. This becomes one of the operations in the computational graph, and can serve as an output node or feed into further operations.
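
The descriptor referred to above is a plain object giving the operand’s data type and shape, and the builder’s build method compiles the defined operations into an executable graph. Here is a minimal sketch; the scalar shape [1] is just an assumption for this small example:

// Shape [1] means each input holds a single float32 value (an assumption for this example).
const descriptor = { dataType: "float32", shape: [1] };

// After defining A, B, and C as above, compile the graph with "C" as its named output.
const graph = await graphBuilder.build({ C });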

Context

A context acts as a runtime environment for executing the computational graph. Although the graph defines which inputs (such as A and B) produce which outputs (like C), it doesn’t store any data by itself. That’s where tensors come in: they are the multi-dimensional arrays that hold the actual numeric data for inputs, intermediate calculations, and outputs. When you provide values to the graph’s inputs, you’re effectively populating A and B with tensors. The context then runs the operations in the graph—based on the defined computation flow—and finally, you can read the resulting tensor from C, which represents the completed inference or calculation.
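
Creating a context is a single call on navigator.ml. The options below are the same ones used in the full code example at the end of this post; treat the specific values as hints the browser may or may not honor:

// Request an ML context; deviceType and powerPreference are hints to the browser.
const context = await navigator.ml.createContext({
    deviceType: "gpu",           // "cpu", "gpu", or "npu"
    powerPreference: "default"   // "default", "low-power", or "high-performance"
});

// The graph builder is tied to this context.
const graphBuilder = new MLGraphBuilder(context);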

Tensors

Tensors store the actual numeric data needed to perform calculations in a model and are typically created using a context. Each tensor can be configured to be readable, writable, or both, depending on the needs of your application. When creating a tensor, you must specify:

  • The data type of the values it will hold (e.g., float32).
  • Its shape (the number and dimensions of values).
  • Whether it can be read from, written to, or both.

Below is an example of how you might create a tensor using a context’s createTensor function:

async #createTensor(context, operand, writable, readable) {
    return context.createTensor({
        dataType: operand.dataType, shape: operand.shape, writable, readable
    });
}

In this snippet, the operand object comes from the graph builder (for instance, when calling input or add). Here’s how you might use the helper function to create tensors for inputs and outputs:

const inputA = await this.#createTensor(this.#context, A, true, true);
const inputB = await this.#createTensor(this.#context, B, true, true); 
const outputC = await this.#createTensor(this.#context, C, false, true);

inputA and inputB are both readable and writable, so you can set their values before running the graph. outputC is not writable (false), but is readable (true), since you only need to read the results of the computation.

Running the Graph

Below is a code snippet illustrating how to feed input values into tensors, run a graph, and read the resulting output. We retrieve the user input from the DOM, write these values to tensors, execute the graph via the context, and finally display the result back in the DOM.

const value1 = this.shadowRoot.querySelector("#value1").value;
const value2 = this.shadowRoot.querySelector("#value2").value;

this.#context.writeTensor(this.#tensors.inputA, new Float32Array([value1]));
this.#context.writeTensor(this.#tensors.inputB, new Float32Array([value2]));

const inputs = {
    'A': this.#tensors.inputA,
    'B': this.#tensors.inputB
};

const outputs = {
    'C': this.#tensors.outputC
};

this.#context.dispatch(this.#graph, inputs, outputs);
const output = await this.#context.readTensor(this.#tensors.outputC); 
const result = new Float32Array(output)[0];
this.shadowRoot.querySelector("#result").textContent = result;

Why Use Float32Array?

  • Precision Requirements: Most machine learning operations typically use 32-bit floating-point values (also known as float32) to balance precision and memory usage. Using Float32Array ensures that we match these precision requirements.
  • Typed Arrays for Performance: Web APIs like WebNN (and other lower-level graphics or ML APIs) are optimized around typed arrays. These arrays provide a direct way to handle binary data in JavaScript with minimal overhead.
  • Consistency in Data Representation: By always using Float32Array, you maintain a consistent format for all numeric values flowing through your graph, which simplifies data handling and reduces possible conversion errors.
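
You can see the precision point directly in the console: JavaScript numbers are 64-bit floats, so storing them in a Float32Array rounds each value to the nearest 32-bit float.

const values = new Float32Array([0.1, 2, 3]);
console.log(values[0]);          // 0.10000000149011612 (0.1 rounded to float32)
console.log(values.byteLength);  // 12 (3 values × 4 bytes each)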

Conclusion

Working with WebNN boils down to these core steps:

  1. Create a Context: This “wrapper” handles data transfers and executes the graph; the graph builder is created from it.
  2. Build a Graph: Define how inputs flow through various operations to produce an output.
  3. Make Tensors: Assign shapes and data types to your inputs and outputs.
  4. Execute and Read: Write your input values, run the graph, then retrieve and display the result.
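
Put together, those steps map onto just a handful of API calls. Here is a compact, linear sketch of the addition example, using the same assumptions as the snippets above (scalar float32 inputs named A and B, output named C):

// 1. Create a context and a builder.
const context = await navigator.ml.createContext();
const builder = new MLGraphBuilder(context);

// 2. Build the graph: C = A + B.
const descriptor = { dataType: "float32", shape: [1] };
const A = builder.input("A", descriptor);
const B = builder.input("B", descriptor);
const C = builder.add(A, B);
const graph = await builder.build({ C });

// 3. Make tensors for the inputs and the output.
const inputA = await context.createTensor({ dataType: "float32", shape: [1], writable: true });
const inputB = await context.createTensor({ dataType: "float32", shape: [1], writable: true });
const outputC = await context.createTensor({ dataType: "float32", shape: [1], readable: true });

// 4. Write the inputs, run the graph, and read back the result.
context.writeTensor(inputA, new Float32Array([1]));
context.writeTensor(inputB, new Float32Array([2]));
context.dispatch(graph, { A: inputA, B: inputB }, { C: outputC });
const result = new Float32Array(await context.readTensor(outputC))[0]; // 3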

Even though we used a simple addition example, the same workflow extends to more complex neural networks (e.g., CNNs, RNNs) with many layers and trained parameters.

Code Example

import DeviceType from "./device-type.js";
import PowerOptions from "./power-options.js";

/**
 * Class representing a WebNN program.
 */
export class Program {
    #context;
    #graphBuilder;
    #graph;
    #inputTensors = {};
    #outputTensors = {};

    /**
     * Initialize the program.
     * @param {DeviceType} deviceType - The type of device to use.
     * @param {PowerOptions} powerPreference - The power preference for the device.
     * @returns {Promise<void>}
     */
    async init(deviceType = DeviceType.GPU, powerPreference = PowerOptions.DEFAULT) {
        const contextOptions = {
            deviceType,
            powerPreference
        };

        this.#context = await navigator.ml.createContext(contextOptions);
        this.#graphBuilder = new MLGraphBuilder(this.#context);
    }

    /**
     * Dispose of the program resources.
     * @returns {void}
     */
    dispose() {
        this.#context = null;
        this.#graphBuilder = null;
        this.#graph = null;
        this.#inputTensors = null;
        this.#outputTensors = null;
    }

    /**
     * Create a tensor on the context for a given graph operand.
     * @param {MLOperand} operand - The operand whose dataType and shape describe the tensor.
     * @param {boolean} writable - Whether the tensor can be written to.
     * @param {boolean} readable - Whether the tensor can be read back.
     * @returns {Promise<MLTensor>} The created tensor.
     */
    async #createTensor(operand, writable, readable) {
        return await this.#context.createTensor({
            dataType: operand.dataType, shape: operand.shape, writable, readable
        });
    }

    /**
     * Add an input tensor to the program.
     * @param {string} name - The name of the tensor.
     * @param {MLOperand} operand - The graph operand describing the tensor's shape and data type.
     * @returns {Promise<void>}
     */
    async addInputTensor(name, operand) {
        this.#inputTensors[name] = await this.#createTensor(operand, true, false);
    }

    /**
     * Add an output tensor to the program.
     * @param {string} name - The name of the tensor.
     * @param {MLOperand} operand - The graph operand describing the tensor's shape and data type.
     * @returns {Promise<void>}
     */
    async addOutputTensor(name, operand) {
        this.#outputTensors[name] = await this.#createTensor(operand, false, true);
    }

    /**
     * Add a node to the graph.
     * @param {string} action - The graph builder method to call (e.g., "input", "add").
     * @param {...*} args - The arguments for the node.
     * @returns {Object} The created node.
     */
    addToGraph(action, ...args) {
        return this.#graphBuilder[action](...args);
    }

    /**
     * Set the value of a tensor.
     * @param {string} name - The name of the tensor.
     * @param {Array<number>} values - The values to write to the tensor.
     * @returns {Promise<void>}
     */
    async set(name, values) {
        await this.#context.writeTensor(this.#inputTensors[name], new Float32Array(values));
    }

    /**
     * Build the graph with the specified outputs.
     * @param {Object} args - The outputs of the graph.
     * @returns {Promise<void>}
     */
    async build(args) {
        this.#graph = await this.#graphBuilder.build(args);
    }

    /**
     * Run the program and get the result.
     * @returns {Promise<number>} The result of the computation.
     */
    async run() {
        this.#context.dispatch(this.#graph, this.#inputTensors, this.#outputTensors);

        const outputKey = Object.keys(this.#outputTensors)[0];
        const output = await this.#context.readTensor(this.#outputTensors[outputKey]);
        return new Float32Array(output)[0];
    }
}

/**
 * Example usage:
 * 
 * const program = new Program();
 * await program.init();
 * 
 * const descriptor = {dataType: 'float32', shape: [1]};
 * const A = program.addToGraph("input", "A", descriptor);
 * const B = program.addToGraph("input", "B", descriptor);
 * const C = program.addToGraph("add", A, B);
 * await program.build({C});
 * 
 * await program.addInputTensor("A", A);
 * await program.addInputTensor("B", B);
 * await program.addOutputTensor("C", C);
 * 
 * await program.set("A", [1]);
 * await program.set("B", [2]);
 * 
 * const result = await program.run();
 * console.log(result); // Output: 3
 */