Most of us had to learn the hard way that version 4 is basically a complete rewrite of version 3 and requires a fairly different way of designing and coding chatbots. In this post, we’ll walk through those differences and highlight the key concepts, so that you can get started quickly with Bot Framework v4.
Core Concepts of Bot Framework v4
Firstly, let’s examine the different components that make up a bot on Azure.
Bot Architecture on Azure
One of the things that is fairly similar to v3 is the overall bot architecture on Azure.
Let’s start with the user. The user does not connect directly to the bot, but rather talks to it via a Channel through the Channel Connector service in Azure. These connectors are identical to those in v3. They are used to connect our bot to different platforms like Skype, Microsoft Teams, Kik, Facebook Messenger, and so on. Each Channel Connector sends its incoming messages from the user to the /api/messages path of the bot’s API endpoint.
A bot (shown in the middle) is a simple web service that exposes a single API on the /api/messages path, as discussed before. This API is used for receiving incoming messages from the user and answering them.
The bot’s code is based on the Bot Framework as the underlying “engine” and is hosted and executed on an Azure App Service. Similar to v3, a bot in v4 is stateless, and the state of a conversation with a user is persisted after every single message. Azure Blob and CosmosDB are supported as persistence layers for a bot. As a result, a bot can easily be scaled out horizontally in order to handle more incoming messages.
The bot code leverages Azure Cognitive Services, mainly the Language Understanding Service (LUIS) for natural language processing, as well as other services like QnA Maker (for simple question/answer pairs) or the Speech API for being able to process and answer with speech.
Bot Framework v4 has deeper integration with Application Insights. Monitoring the bot and its infrastructure, as well as how users “flow” through conversations with the bot, can therefore be performed with Application Insights. We’ll spend more time on this in the section below.
Bot Deployment Options on Azure
Out of the box, Azure Bot Service currently only supports Azure App Service as the platform for v4 bots. While v3 also supported Azure Functions (after all, the Bot Framework is stateless), I personally see no reason why a v4 Node.js-based bot shouldn’t be able to run in an Azure Function. My personal guess is that it has just not been integrated into the Azure platform yet.
Is this a big pity? Not really – in my experience, most bots have been deployed on Azure App Service anyway.
Bot Framework v4 Concepts
Let’s switch over to the coding side and examine the new concepts of v4. Hint: now might be a good moment to forget most of what you know from v3.
Adapter
The Adapter is a necessary component of every bot. After the web service endpoint (under /api/messages) receives a message from the user (or, more generally speaking, an Activity), it is forwarded to the Adapter. The Adapter unwraps it, performs authentication, maps it to the user, and so on. As an output, it creates a TurnContext object for us, which our actual bot code can then process in the current Turn.
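The division of labor can be sketched in a few lines of plain JavaScript. Note that the class and method names below are made up for illustration; they only mirror the spirit of the real Adapter and TurnContext, not the actual botbuilder API:

```javascript
// Conceptual sketch: what an Adapter does on each request. It unwraps the
// incoming Activity, builds a TurnContext, and hands it to the bot's logic.
class ToyTurnContext {
  constructor(activity) {
    this.activity = activity;   // the event that triggered this turn
    this.responses = [];        // replies collected during the turn
  }
  async sendActivity(text) {
    this.responses.push({ type: 'message', text });
  }
}

class ToyAdapter {
  // botLogic is a function (context) => Promise<void>, like onTurn() in v4
  async processActivity(activity, botLogic) {
    // a real Adapter also authenticates the request and resolves the user here
    const context = new ToyTurnContext(activity);
    await botLogic(context);
    return context.responses; // would be sent back through the Channel Connector
  }
}
```

The key takeaway: our bot code never touches raw HTTP requests; it only ever sees the TurnContext that the Adapter prepared.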
Turn & TurnContext
Bot Framework v4 represents interactions between users and the bot as Turns. Every Activity a user performs generates a new Turn. For example, a message from the user to the bot will imply a new Turn, but there are numerous other activities that also imply a new turn (more on that later).
In each Turn, our bot receives a TurnContext object (generated by the Adapter). The TurnContext contains information about the current conversation, the activity that triggered the turn, the user state, and further data points.
Activities
Activities are the events that our bot receives from its users. Probably the two most prominent activities are message and conversationUpdate. While message is self-explanatory (a message sent from the user to the bot), conversationUpdate is triggered when a user or the bot joins a conversation. Other Activities include contactRelationUpdate (when a user adds or removes the bot to/from their contact list) or typing (triggered when the user is typing). A full list of all supported Activities can be found here.
Middleware
The Adapter produces the TurnContext object by passing the initial request through the Bot Framework’s Middleware. The Middleware is a pipeline that, for example, restores the state of the conversation and potentially performs language understanding or translation. This Middleware pipeline can be extended with additional processing steps and is executed on every incoming message.
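The pipeline follows the common “call next() to continue” pattern. Here is a conceptual sketch in plain JavaScript (the function names are made up; the real botbuilder MiddlewareSet has the same overall shape):

```javascript
// Sketch of a Middleware pipeline: each middleware sees the turn context,
// does its work, and calls next() to pass control further down the pipeline.
function runPipeline(middlewares, context, botLogic) {
  function invoke(i) {
    if (i < middlewares.length) {
      return middlewares[i](context, () => invoke(i + 1));
    }
    return botLogic(context); // the bot's own turn handler runs last
  }
  return invoke(0);
}

// Example middleware: record every incoming message. A stand-in for the real
// state-restoration, translation, or language-understanding steps.
const log = [];
const loggingMiddleware = async (context, next) => {
  log.push(context.activity.text);
  await next(); // without this call, the pipeline would short-circuit
};
```

Because each middleware wraps the rest of the pipeline, it can also run code after next() returns, which is how state gets saved on the way out.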
State and Persistency
A Middleware component automatically restores the conversation state (e.g., which dialog the conversation with the user is currently in) and also restores any custom user state. In contrast to Bot Framework v3, we are now responsible for manually saving both states. Both Azure Blob and CosmosDB are supported targets for persisting state.
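The load-then-explicitly-save cycle can be sketched like this. The class, keys, and state shape are illustrative only; in a real v4 bot, the storage would be Azure Blob or CosmosDB rather than an in-memory Map:

```javascript
// Sketch of v4-style state handling: state lives in an external store, and we
// are responsible for loading it at the start of a turn and saving it at the end.
class ToyConversationState {
  constructor(storage) {
    this.storage = storage; // a Map here; Azure Blob or CosmosDB in production
  }
  async load(conversationId) {
    // return persisted state, or a fresh default for a new conversation
    return this.storage.get(conversationId) || { dialogStack: [], turnCount: 0 };
  }
  async save(conversationId, state) {
    // unlike v3, nothing is persisted unless we explicitly call save
    this.storage.set(conversationId, state);
  }
}
```

Forgetting the save step is a common source of “my bot forgot everything” bugs when migrating from v3.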
Dialogs
This is probably the area where the most changes happened.
First of all, Dialogs are not a “must-use” concept in v4. For bots and assistants that perform single-shot operations, e.g., “turn off the lights” (similar to Alexa), we probably do not need any Dialogs, but can rather just use regular classes and call their methods. However, Dialogs are the way to go for more complex and nested conversations:
A Dialog is composed of one or more WaterfallSteps. This allows for a linear conversation flow, as indicated in this example:
- Dialog starts
- Waterfall Step 1: Bot asks something
- User answers
- Waterfall Step 2: Bot processes the response and asks something else
- User answers
- Waterfall Step 3: Bot processes the response and answers
- Dialog ends
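The flow above can be sketched as an array of step functions plus a persisted step index. This is a conceptual toy, not the real botbuilder WaterfallDialog API; the question texts are made up:

```javascript
// Sketch of the waterfall idea: the dialog is a list of steps, and the
// persisted state records which step the conversation is on. Each incoming
// user message resumes the dialog at the next step.
function makeWaterfall(steps) {
  return {
    // state = { stepIndex }, input = the user's latest message (if any)
    run(state, input) {
      const reply = steps[state.stepIndex](input);
      state.stepIndex += 1;                      // advance to the next step
      const done = state.stepIndex >= steps.length;
      return { reply, done };
    }
  };
}

// A three-step dialog matching the flow above
const dialog = makeWaterfall([
  () => 'What is your name?',                  // Step 1: bot asks
  name => `Hi ${name}! How can I help?`,       // Step 2: process + ask again
  request => `Okay, working on: ${request}`    // Step 3: process + answer
]);
```

Because the step index lives in persisted state, the bot itself stays stateless between messages, which is exactly what makes horizontal scale-out possible.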
In this example, the Dialog would contain three WaterfallStep entries. Steps 1 and 2 would each contain a Prompt. A Prompt is a single-step Dialog that asks the user something. The concept of Prompts is similar to its counterpart in v3, and several built-in Prompts are included. However, custom Prompts with custom Validators can be written for better reusability of code.
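A Prompt with a custom Validator boils down to “ask once, then accept or re-prompt.” The sketch below illustrates this pattern in plain JavaScript; the function names and the age example are invented for illustration, not taken from the botbuilder API:

```javascript
// Sketch of a reusable Prompt with a custom Validator: when the user's answer
// comes back, the validator decides whether to accept it or to re-prompt.
function makePrompt(question, validator, retryText) {
  return {
    ask: () => question,
    handleAnswer(answer) {
      return validator(answer)
        ? { accepted: true, value: answer }
        : { accepted: false, reply: retryText };
    }
  };
}

// Example: an age prompt with a numeric-range validator
const agePrompt = makePrompt(
  'How old are you?',
  answer => Number.isInteger(Number(answer)) && Number(answer) > 0 && Number(answer) < 120,
  'Please enter a valid age.'
);
```

Packaging the validator with the prompt is what makes these units reusable across dialogs, which is the code-reuse win mentioned above.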
How does a Dialog receive data? Similar to v3, either by passing it into the Dialog via the TurnContext object, or by accessing the custom user state data.
Dialogs and Prompts are grouped together in a DialogSet. As a Dialog cannot have child Dialogs any more, DialogSets are the way to group Dialogs and Prompts. In v3, we often used a Root Dialog and routed to the individual sub-dialogs. In v4, we would instead have a DialogSet as the Root Dialog, containing all our sub-Dialogs.
Our actual Bot
Our actual bot is nothing more than a class with a single method called onTurn(), which receives the TurnContext object during each invocation. From there, we decide whether we want to leverage Dialogs or just perform single-shot responses.
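A minimal sketch of that class shape, dispatching on the activity type: the activity types below (message, conversationUpdate) are real v4 activity types, but the reply texts are made up, and the context object is assumed to offer sendActivity() as described earlier:

```javascript
// Sketch of the single-method bot class shape in v4: one onTurn() entry point
// that inspects the activity type and reacts accordingly.
class EchoBot {
  async onTurn(context) {
    switch (context.activity.type) {
      case 'message':
        // single-shot response; a more complex bot would continue a Dialog here
        await context.sendActivity(`You said: ${context.activity.text}`);
        break;
      case 'conversationUpdate':
        await context.sendActivity('Welcome!');
        break;
      default:
        // ignore other activity types (typing, contactRelationUpdate, ...)
        break;
    }
  }
}
```

This one method replaces the controller/dialog-routing boilerplate that v3 spread across several places.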
Monitoring
With the new Application Insights integration for Bot Framework v4, it becomes significantly easier to monitor the bot and its surrounding infrastructure:
Application Insights generates an Application Map based on all events occurring in the infrastructure. For example, we can see that our bot talks to Azure Blob (upper left), has been accessed by the Webchat Channel (top), or uses LUIS for language understanding (upper right). Furthermore, we see which calls generated errors (e.g., HTTP 500 errors) and their associated latencies.
With the User Flow feature in Application Insights, we can see how requests flow through our bot code. This enables us to know which dialogs our users frequently take and allows us to explore where we might have performance bottlenecks or a bumpy flow through the conversation.
In this example, we see how messages flow into the
/api/messages path and then trigger various Waterfall steps in the dialogs.
The Application Insights integration package is available via botbuilder-applicationinsights and requires only a few lines of code for integration.
Authentication
Yes, finally a decent library for performing authentication has been included in the Bot Framework. Full instructions on its setup are available in the Azure documentation here.
Summary
Bot Framework v4 is a complete rewrite of Bot Framework v3; hence, the two are very different in terms of functionality as well as coding patterns. In this post, we’ve looked at most of the changes and discussed the new concepts. With all those differences between the framework versions, there are quite a few things that work better in v4, but also some things that hopefully will improve over time.
Things that are better
Overall, I believe the following capabilities make v4 a more versatile framework for building bots:
- Monitoring has been improved drastically – we can now monitor the bot itself, its infrastructure, and the users’ flow through our bot
- Authentication has been simplified
- We now have a more consistent coding model across the different languages (C#, JS, etc.)
- We can better reuse dialog code
- Overall, Bot Framework v4 offers more freedom in terms of designing and integrating bots
Things that need improvement
On the flip side, there are a few things that hopefully will get some more attention in the future:
- Overall, building a bot in v4 takes a bit more reading to get started
- Documentation is still a bit behind
- Code bloat – Bot Framework v4 requires us to write a lot more lines of code compared to v3. Better code refactoring capabilities will hopefully help to mitigate this, but it will definitely require some thinking and optimisation before it pays off
Last but not least, if you are familiar with Bot Framework v3, you’ll get the hang of v4 after a day or two. Understanding the new concepts for updating state and handling dialogs and turns helped me quickly transition some v3 bots to the new framework version.
If you have any questions or want to give feedback, feel free to reach out via Twitter or just comment below.