What is the Microsoft Project Bonsai AI toolchain?

You may have been hearing about Project Bonsai from Neal Analytics, Microsoft, or other partners for a while and still be a bit confused about what it is and what it does. This short post explains this new (still in preview) Azure AI technology to help you decide whether it’s appropriate for your business needs.

But first, to set the stage, let’s put Bonsai in perspective with three other key concepts: Autonomous Systems, deep learning (or deep neural networks, i.e., DNN), and (Deep) Reinforcement Learning (DRL).

Autonomous Systems, DNN, DRL, and Microsoft Project Bonsai  

Although many non-practitioners confuse the two, Machine Learning (ML) and Artificial Intelligence (AI) are different concepts, as this video clearly explains.

In short, Artificial Intelligence covers all computer software that tries to mimic human intelligence. This can be achieved through two main software programming strategies:

    • Rule-based AI, which conceptually uses a series of if/then statements to decide the next step. E.g., “If users type the word ‘support’ in their chat sentence, send them to the support page.”
    • ML-based AI, which uses examples of expected behavior (i.e., which output should be generated for a given input) to train a model. There are different types of ML models.
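
To make the rule-based strategy concrete, here is a minimal Python sketch of the “support” example above. The function name and the page identifiers are hypothetical, chosen only for illustration.

```python
def route_message(message: str) -> str:
    """Rule-based routing: a hand-written if/then rule decides the next step."""
    # Hypothetical rule taken from the example: mention "support" -> support page.
    if "support" in message.lower():
        return "support_page"
    # No rule matched, so fall back to a default destination.
    return "default_page"


if __name__ == "__main__":
    print(route_message("I need support with my order"))
    print(route_message("What are your opening hours?"))
```

Every behavior here was written by hand. An ML-based approach would instead learn the mapping from messages to destinations using labeled examples.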

Statistical ML vs. Deep Learning   

ML can be split into two main categories.  

Statistical ML  

In traditional (statistical) ML, statistical methods (regressions, etc.) are used to connect inputs and outputs, often through a simple linear or polynomial relationship. A “best fit” line or curve through a cloud of input (X-axis) and output (Y-axis) data points is a typical statistical ML model.
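
As an illustration of such a “best fit” line, the sketch below fits y = slope * x + intercept by ordinary least squares in plain Python. The data points are made up for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both left unnormalized).
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data points that lie exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)

if __name__ == "__main__":
    print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```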

Deep learning, or Deep Neural Network-based AI  

Though it has existed since the early ’90s, deep learning (DL) has gained considerable popularity and real-life applicability since the early 2010s. It leverages the concept of multi-layered (i.e., “Deep”) “neurons” (or nodes) connecting inputs to outputs.
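
The idea of layered “neurons” can be sketched in a few lines of Python. This toy network has one hidden layer of two nodes and a single output node. All weights and biases are arbitrary illustrative values, not trained ones. A real DNN has many more layers and nodes.

```python
def relu(z):
    """Common activation function: pass positive values, clamp negatives to 0."""
    return max(0.0, z)


def identity(z):
    return z


def dense(inputs, weights, biases, activation):
    """One layer: each node computes activation(weighted sum of inputs + bias)."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs):
    """Pass the inputs through a hidden layer, then an output layer."""
    hidden = dense(inputs, [[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0], relu)
    return dense(hidden, [[1.0, 2.0]], [0.1], identity)


if __name__ == "__main__":
    print(forward([2.0, 1.0]))
```

Training a network means adjusting those weights and biases until the outputs match the expected ones, which is where the large volume of training data comes in.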

DL became practical with the rise of cloud computing and the availability of massive training data sources. All recent groundbreaking AI developments became possible because of DL: speech recognition, natural language processing (NLP), human-level accuracy machine translation, automatic image tagging, and more.

However, DL has a few drawbacks. The first is the computational power required to train models. This drawback can be readily alleviated with hyper-scale cloud solutions.

The second drawback, training data availability, is tougher to mitigate when targeting real-life use cases. Training a DNN from scratch requires at minimum hundreds of thousands of training data points, often several million. And although in some limited scenarios a DNN can learn on its own through trial and error, like a baby learning to walk, this is rarely an option for real-life business use cases.

Therefore, the question that arises to make DL-based AI (referred to as “AI agents” in the rest of this article) a business reality is: how can businesses gather or generate enough training data to build an AI agent that will work for their unique need?

That’s where Deep Reinforcement Learning comes in.

Deep Reinforcement Learning  

DRL is a training approach based on trial and error (like a baby learning to walk). “Deep” refers to its use of DNNs. “Reinforcement Learning” (RL) refers to the use of feedback mechanisms (reward functions) to “nudge” the AI agent’s parameters towards their optimal values.

However, it is virtually impossible to let the “crawling” (to keep the baby analogy) AI agent randomly control a real-life extruder, chemical vat, logistics supply chain, etc.

Therefore, DRL uses advanced simulations to train its agent before deploying it.

DRL cycle
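
The simulate, act, reward, update cycle can be sketched with a deliberately tiny example. Note the simplifications: the “simulator” is a hypothetical five-position corridor, and a lookup table (tabular Q-learning) stands in for the DNN, so this is plain RL rather than DRL. The reward values and training parameters are assumptions picked for the illustration.

```python
import random

N_STATES = 5          # corridor positions 0..4 (hypothetical toy simulator)
GOAL = 4              # reaching position 4 ends the episode
ACTIONS = (-1, +1)    # move left or move right


def step(state, action):
    """Simulator: apply the action, return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), GOAL)
    done = next_state == GOAL
    # Reward function: +1 at the goal, small penalty for every other step.
    reward = 1.0 if done else -0.01
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning: a table replaces the DNN used in real DRL."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            if rng.random() < epsilon:
                a = rng.randrange(2)  # trial and error: explore at random
            else:
                a = 0 if q[state][0] > q[state][1] else 1  # exploit
            nxt, reward, done = step(state, ACTIONS[a])
            # The reward "nudges" the estimate towards a better value.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
            if done:
                break
    return q


q = train()
# Greedy policy learned for each non-goal position.
policy = [ACTIONS[0 if q[s][0] > q[s][1] else 1] for s in range(GOAL)]

if __name__ == "__main__":
    print(policy)
```

All the random exploration happens inside the simulator, which is the reason DRL trains against simulations rather than real equipment.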

Autonomous Systems, Project Bonsai, and DRL  

Designing, training, and deploying a DRL-trained AI agent is quite an involved process.

It requires:

    • The design of an advanced simulator
    • The design of a DNN (the actual AI agent) architecture
    • The definition of an appropriate training reward function
    • Connecting and running the simulation + AI agent in training + reward function loops up to millions of times
    • Testing the AI agent
    • Deploying the AI agent

An Autonomous System can be characterized as an industrial, supply chain, mechanical, or any other process that can autonomously function, including when the external environment changes. An Autonomous System consists of the combination of this core process together with its controlling DRL-trained AI agent.

Microsoft Project Bonsai is a toolchain that allows non-AI specialists to design and train such an AI agent using DRL.

In Project Bonsai, the developer will:

    • Import or connect to the process simulator. This simulator will be physics-based, simulator platform-based, or AI-based. Learn more here.
    • Define the DRL reward function
    • Define the “training curriculum”, i.e., the AI agent (aka “brain”) training script connecting the simulator with the “brain” in training.
    • Test the trained brain
    • Deploy the brain in production (e.g., on an edge device such as Azure Stack Edge running Kubernetes containers)
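
To give a feel for the pieces a developer supplies, here is a generic Python sketch of a simulator, a reward function, and a test run. It is not Bonsai’s API or its curriculum language. Every name, the toy tank dynamics, and the 60-degree target are hypothetical, and a hand-written controller stands in for a trained brain.

```python
from dataclasses import dataclass


@dataclass
class TankSimulator:
    """Hypothetical stand-in for a process simulator (not a Bonsai API)."""
    temperature: float = 20.0

    def reset(self) -> float:
        self.temperature = 20.0
        return self.temperature

    def step(self, heater_power: float) -> float:
        # Toy dynamics: heating raises the temperature, while losses pull
        # it back towards a 20-degree ambient.
        self.temperature += 0.5 * heater_power - 0.1 * (self.temperature - 20.0)
        return self.temperature


def reward(temperature: float, target: float = 60.0) -> float:
    """Reward is highest (0) at the target and falls as the error grows."""
    return -abs(temperature - target)


def evaluate(policy, steps: int = 200) -> float:
    """Test run: drive the simulator with a policy, return the final reward."""
    sim = TankSimulator()
    temp = sim.reset()
    for _ in range(steps):
        temp = sim.step(policy(temp))
    return reward(temp)


def placeholder_brain(temperature: float) -> float:
    # Hand-written proportional controller standing in for a trained brain.
    return max(0.0, min(10.0, 60.0 - temperature))


if __name__ == "__main__":
    print("heater always off:", evaluate(lambda t: 0.0))
    print("placeholder brain:", evaluate(placeholder_brain))
```

Comparing the two scores shows what the test step is for: the same reward function that guides training also gives a simple way to check a candidate controller against a baseline before deployment.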

As you have probably noted, we did not mention anything about the actual DNN design. This is on purpose because, besides managing the training end-to-end, Bonsai also automatically selects the best DNN architecture for the project. It shields the user from needing to know the practical details of neural network architecture (CNN, RNN, perceptron, transformer, etc.), number of network layers, number of nodes per layer, activation functions, etc.

Why use Microsoft Project Bonsai for your next DRL-trained AI project?  

Let’s be honest, we’re biased. Neal experts have been working with the Project Bonsai toolchain since 2020, and it has been a fantastic tool on several projects, from design to deployment, such as the PepsiCo Cheetos extrusion process.

But, in earnest, we are also celebrating our 10 years in the AI/ML space this year, and we know how to recognize great tools when we see them. This is why we have become proponents of Azure MLOps for our data science projects and Project Bonsai for our DRL ones.

Other options exist. Some are partially integrated; others are limited to stand-alone elements that your data scientists will have to stitch together manually with “glue code”.

Our experience over the years, however, has shown us that integrated platforms reduce project risk, development time, and overall concept-to-production rollout cost.

Project Bonsai falls squarely into this category. If you have any questions about it, our experts would love to share more about our experience and the new AI use cases Project Bonsai opens.