
Process API in AI tooling

I am currently working on the third iteration of the process API, this time in Rust.

The first version, written in JavaScript, aimed to expose user-defined processes without requiring users to write code. This introduced the concept of intent-driven development, where users define the intent and the process API handles the execution. This approach has been successfully running in our applications for some time now.

The second version was developed in Python with the goal of creating a well-defined, extendable process API to aid in UI automation testing using Selenium as the driver. This introduced intent-driven testing, allowing test writers to specify what they want to accomplish in the UI without detailing how to do it. The process API contains all the necessary information and emphasizes the separation of concerns. Although there are some minor deviations from the original version, this approach has proven to be groundbreaking, significantly simplifying UI automation.

Rust process API

To be honest, I wanted to see if I could. My previous versions were all written in dynamic languages, and I wanted to explore the challenges and benefits of using a static language.

I chose Rust because we use it for WebAssembly, and it offers excellent safety features along with access to great libraries. With WebAssembly, we can write the process API in Rust and still use it alongside other languages.

My initial tests indicate that this approach is not only feasible but also allows for the possibility of defining process API modules in other languages. For instance, while the core of the process API is written in Rust, modules that define execution logic can be written in JavaScript or Python. This is particularly interesting because it enables a dynamic application that can be extended with plugins, loading scripted modules that the process API can call to execute user intent.

The primary focus remains on developing Rust modules that can utilize features of the main crate, while also allowing for the registration of custom Rust modules.

Initial thoughts

Third time’s the charm! I am thrilled with how the design is progressing. The process API has evolved into a more modular execution pipeline. It no longer has inherent knowledge of processes; it simply knows which modules are registered and how to invoke actions on those modules. Processes are now separate from the core and operate independently. When a process is executed, we pass the process API instance specific to that process. This design allows different versions of the API to be used, enabling processes to call actions without needing to understand the differences between API versions.
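
To make that concrete, here is a minimal sketch of what such a core could look like. The Module trait and the invoke method are my own illustration of the idea; the actual crate's types and signatures may differ.

use std::collections::HashMap;
use serde_json::Value;

// The core's only contract with a module: given an action name and its
// arguments, do the work and optionally return a result value.
trait Module {
    fn execute(&self, action: &str, args: &Value) -> Result<Option<Value>, String>;
}

struct ProcessApi {
    modules: HashMap<String, Box<dyn Module>>,
}

impl ProcessApi {
    fn new() -> Self {
        ProcessApi { modules: HashMap::new() }
    }

    // Modules are registered by name; the core has no knowledge of what they do.
    fn register(&mut self, name: &str, module: Box<dyn Module>) {
        self.modules.insert(name.to_string(), module);
    }

    // Dispatch an action to whichever module was registered under `name`.
    fn invoke(&self, name: &str, action: &str, args: &Value) -> Result<Option<Value>, String> {
        match self.modules.get(name) {
            Some(module) => module.execute(action, args),
            None => Err(format!("no module registered as '{name}'")),
        }
    }
}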

Creating a process API and registering modules

fn create_api() -> ProcessApi {
    let mut api = ProcessApi::new();
    api.register("math", Box::new(MathModule {}));
    api.register("console", Box::new(ConsoleModule {}));
    api
}
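
For context, the two modules registered above might look roughly like this, implementing the illustrative Module trait from the earlier sketch:

use serde_json::Value;

struct MathModule {}

impl Module for MathModule {
    fn execute(&self, action: &str, args: &Value) -> Result<Option<Value>, String> {
        match action {
            "add" => {
                // Read the two operands from the step's (already resolved) args.
                let a = args.get("a").and_then(Value::as_f64).unwrap_or(0.0);
                let b = args.get("b").and_then(Value::as_f64).unwrap_or(0.0);
                Ok(Some(Value::from(a + b)))
            }
            other => Err(format!("math module has no action '{other}'")),
        }
    }
}

struct ConsoleModule {}

impl Module for ConsoleModule {
    fn execute(&self, action: &str, args: &Value) -> Result<Option<Value>, String> {
        match action {
            "print" => {
                if let Some(message) = args.get("message").and_then(Value::as_str) {
                    println!("{message}");
                }
                Ok(None)
            }
            other => Err(format!("console module has no action '{other}'")),
        }
    }
}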

Creating a process from a JSON resource using serde

let process = Process::from(process_json);
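
The Process type itself isn't shown in this post; assuming it simply mirrors the JSON document, a serde-backed definition could look roughly like this (field names follow the JSON shown further down):

use std::collections::HashMap;
use serde::Deserialize;
use serde_json::Value;

// Illustrative only: the real Process type likely carries more state,
// such as the process-level data the steps read and write.
#[derive(Deserialize)]
struct Process {
    steps: HashMap<String, Step>,
}

#[derive(Deserialize)]
struct Step {
    module: String,
    action: String,
    args: Value,
    next_step: Option<String>,
}

impl From<Value> for Process {
    fn from(value: Value) -> Self {
        serde_json::from_value(value).expect("invalid process definition")
    }
}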

Running the process

process.run(&api).expect("process execution failed");

The process API JSON structure looks much the same as in the other versions.

    let process_json: serde_json::Value = serde_json::from_str(r#"
    {
        "steps": {
            "start": {
                "module": "math",
                "action": "add",
                "args": {
                    "a": 1,
                    "b": 2,
                    "$target": "${add_result}"
                },
                "next_step": "print"
            },
            "print": {
                "module": "console",
                "action": "print",
                "args": {
                    "message": "result: ${add_result}"
                }
            }
        }
    }
    "#).unwrap();

You may notice a few changes here that deviate from previous versions:

  1. The use of the “$target” keyword instead of “target”: In the past, there was a conflict when a legitimate argument was named “target” because it also defined where the results of the operation should be saved. Using “$target” resolves this issue.
  2. The use of “${add_result}” to save the result to the process’s “data”: Previously, we could also target the execution context path or an item defined in a loop. Since loops are not supported at the moment, that aspect needs to be reconsidered. In this version we also do not allow access to the execution context, so it makes sense to restrict reading and writing variables to the current process only. A rough sketch of how this resolution could work follows below.
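
Continuing the earlier sketch, the core could handle that resolution along these lines; the function names and the exact placeholder handling are my own illustration, not the crate's actual implementation.

use std::collections::HashMap;
use serde_json::Value;

// Replace "${name}" placeholders in string arguments with values from the
// process's data before the step is handed to a module.
fn resolve_args(args: &Value, data: &HashMap<String, Value>) -> Value {
    match args {
        Value::String(text) => {
            let mut resolved = text.clone();
            for (key, value) in data {
                let placeholder = format!("${{{key}}}");
                let replacement = match value {
                    Value::String(s) => s.clone(),
                    other => other.to_string(),
                };
                resolved = resolved.replace(&placeholder, &replacement);
            }
            Value::String(resolved)
        }
        Value::Object(map) => Value::Object(
            map.iter()
                .map(|(k, v)| (k.clone(), resolve_args(v, data)))
                .collect(),
        ),
        other => other.clone(),
    }
}

// Write an action's result back to the process data under the key named by
// "$target", e.g. "${add_result}" becomes the key "add_result".
fn store_result(args: &Value, result: Value, data: &mut HashMap<String, Value>) {
    if let Some(target) = args.get("$target").and_then(Value::as_str) {
        let key = target.trim_start_matches("${").trim_end_matches('}');
        data.insert(key.to_string(), result);
    }
}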

Rust and AI

One of the steps involves identifying how to run AI using Rust. There are several options available, such as:

  1. RustFormers
  2. Kalosm
  3. Ollama

However, I wasn’t particularly impressed with any of these options. I had concerns regarding their maintenance, the scope of these libraries, and how frequently they are updated with the latest models. For instance, Ollama primarily streams data using console commands.

Eventually, I came across Hugging Face’s Candle crate.

I found this particularly interesting because it seems to be positioning itself as a competitor to general-purpose libraries like PyTorch, but in Rust. Additionally, many of their examples target WebAssembly (Wasm) as a delivery vehicle, enabling models to run in the browser.
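
To give a feel for the crate, here is a minimal candle-core example along the lines of the project's own README; the exact API may shift between versions.

use candle_core::{Device, Tensor};

fn main() -> candle_core::Result<()> {
    let device = Device::Cpu;

    // Two small matrices multiplied on the CPU.
    let a = Tensor::new(&[[1f32, 2.], [3., 4.]], &device)?;
    let b = Tensor::new(&[[5f32, 6.], [7., 8.]], &device)?;
    let c = a.matmul(&b)?;

    println!("{c}");
    Ok(())
}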

While Candle is not a silver bullet, here are some aspects I appreciate:

  • It is created and maintained by Hugging Face.
  • It is updated frequently.
  • General-purpose AI is possible.
  • It has an extensive list of examples.
  • WebAssembly is not an afterthought.
  • It is licensed under Apache 2.0 and MIT.

AI and the process API

The Process API can be executed using a JSON intent document that defines the steps of the process, along with its parameters and process-level variables.

With AI function calling, we can generate JSON results as output from the AI—let’s assume a large language model (LLM) for now. As long as the function calling result JSON is properly formatted, we can use the Process API to execute the intent in a controlled manner. Although there is much to be said about generating code in real-time, the lack of control is concerning. Using tooling provides a more secure environment, and since the Process API is modular, there are no limits to the types of actions you can perform. This includes actions ranging from operating system-level tasks to generating graphics and everything in between.
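
As a sketch of how that could fit together, the function-calling output only needs to be validated and then handed to the same machinery shown earlier; the helper below and its checks are assumptions, not part of the crate today.

// Treat an LLM function-calling result as a process definition.
// `llm_response` stands in for whatever JSON the model returned.
fn process_from_llm(llm_response: &str) -> Result<Process, String> {
    // Parse the model output and make sure it really is a process definition
    // before anything gets executed.
    let intent: serde_json::Value = serde_json::from_str(llm_response)
        .map_err(|e| format!("model did not return valid JSON: {e}"))?;

    serde_json::from_value(intent)
        .map_err(|e| format!("function call result is not a valid process: {e}"))
}

// Usage: run the validated intent against a pre-built API instance.
// let process = process_from_llm(&model_output)?;
// process.run(&api)?;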

The fact that Candle enables running models in the browser means it can interact with the JS Process API in much the same way as with the Rust one. Ideally, the core would shift from JS to WebAssembly (Wasm) built with Rust.
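
As a rough illustration of that direction, exposing the Rust core to JavaScript through wasm-bindgen might look something like this; the exported function and its error handling are assumptions rather than the crate's actual surface.

use wasm_bindgen::prelude::*;

// Hypothetical Wasm entry point, re-using the process API pieces sketched earlier.
#[wasm_bindgen]
pub fn run_process(definition: &str) -> Result<(), JsValue> {
    let json: serde_json::Value = serde_json::from_str(definition)
        .map_err(|e| JsValue::from_str(&e.to_string()))?;

    let api = create_api();
    let process = Process::from(json);

    process
        .run(&api)
        .map_err(|e| JsValue::from_str(&format!("{e:?}")))
}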

Other libraries to take note of

As part of the process API, there are other libraries that would be interesting to include as modules.

  1. Polars – a DataFrame alternative with huge performance benefits due to its multithreading.
  2. V8 – enables running JS in the Rust process API (desktop).
  3. PyO3 – running Python in Rust, or creating Rust functions for Python.
  4. Plotters – drawing graphs.

Looking ahead

The current priority is to establish a solid foundation where all the essential core mechanics are fully functional.

The first module, and probably the most challenging, will involve running various AI models. Once we have the capability to run models and support function calling, we can begin building on several AI-related operations. This includes integrating vector databases for embeddings and fine-tuning models using vector data. Additionally, implementing model quantization will be a valuable feature, but our initial focus should be on getting the models operational.

The other libraries mentioned above support this effort but can also function independently. Not all of them are WASM-based, and the aim is not to have everything in WASM, but rather to enable the feature where feasible.

Our primary focus will be on using frameworks like Dioxus to enable a modular application.