Ng Wai Foong

What’s New in Rasa 2.0 - Build Your Own Chatbot

The following post is a simplified version of my original article that was published on Medium.

Introduction

Rasa Open Source is a machine learning framework for building text- and voice-based chatbots. Version 2.0, a major release, came out in October 2020. It aims to unify the training data formats, the configuration files, and the way models are handled. As a result, it introduces a number of breaking changes compared to the previous version.

In this post, we are going to walk through the major differences between Rasa 1.10 and version 2.0, so that you can decide whether migrating an existing Rasa 1.x project is worth the effort.

Folder and Files Hierarchy

The structure of the folders and files is more or less similar to the previous version with the following exceptions:

  • actions.py is no longer in the root directory.
  • There is a new folder called actions where actions.py is located.
  • There is an additional file called rules.yml in the data folder.

Configuration

Version 2.0 now comes with default configurations for both the pipeline and policies. This is a lot more convenient for new users. Having said that, you can still customize it, and the format is exactly the same as the previous version.
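As a rough illustration of such a customization, a config.yml might look like the sketch below. The component and policy names are real Rasa components, but this particular selection and the epochs and max_history values are assumptions chosen for illustration, not the defaults that Rasa picks for you.

```yaml
# config.yml (illustrative sketch, not the built-in default configuration)
language: en

pipeline:
- name: WhitespaceTokenizer
- name: CountVectorsFeaturizer
- name: DIETClassifier
  epochs: 100          # assumed value

policies:
- name: MemoizationPolicy
- name: TEDPolicy
  max_history: 5       # assumed value
  epochs: 100          # assumed value
- name: RulePolicy
```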

Most of the built-in tokenizers now have been standardized with the following properties:

  • intent_tokenization_flag — Flag to check whether to split intents
  • intent_split_symbol — Symbol on which intent should be split
  • token_pattern — Regular expression to detect tokens

The first two keys are used for multi-intent classification, which only works with DIETClassifier at the moment.

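Have a look at the following code snippet for WhitespaceTokenizer:

pipeline:
- name: "WhitespaceTokenizer"
  # flag to check whether to split intents
  intent_tokenization_flag: false
  # symbol on which intent should be split
  intent_split_symbol: "_"
  # regular expression to detect tokens (null means no pattern is applied)
  token_pattern: null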
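To make the multi-intent keys more concrete, here is a minimal sketch of a pipeline that splits intent names on a + symbol. The choice of + and the surrounding components are assumptions for illustration.

```yaml
# config.yml (fragment, illustrative)
pipeline:
- name: WhitespaceTokenizer
  intent_tokenization_flag: true   # split intent names into several labels
  intent_split_symbol: "+"         # assumed separator
- name: CountVectorsFeaturizer
- name: DIETClassifier
```

With a setup like this, a training intent named greet+ask_name (a hypothetical name) should be treated as a combination of the labels greet and ask_name.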

In addition, the case_sensitive property has been moved from tokenizers to featurizers. You can safely ignore this property unless you are using the following components:

  • KeywordIntentClassifier
  • SpacyNLP
  • RegexFeaturizer
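A minimal sketch of where the property lives now, assuming a pipeline that pairs WhitespaceTokenizer with RegexFeaturizer. The value shown is only an example.

```yaml
# config.yml (fragment, illustrative)
pipeline:
- name: WhitespaceTokenizer    # no case_sensitive option here any more
- name: RegexFeaturizer
  case_sensitive: false        # example value: match patterns regardless of case
```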

Policies

Version 2.0 adds a new policy called RulePolicy. It handles the parts of a conversation that must always follow the same fixed behaviour. Rules are slightly different from stories and should be used together with them for better performance. You need to have RulePolicy in your config.yml in order to use rules and forms. It is especially useful for implementing the following features:

  • one-turn interactions, such as FAQs
  • fallback behaviour
  • handling of unwanted behaviour in forms
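For reference, a sketch of how RulePolicy might be declared with its fallback-related options. The threshold value here is illustrative and should be tuned for your own bot.

```yaml
# config.yml (fragment, illustrative)
policies:
- name: RulePolicy
  core_fallback_threshold: 0.3                        # example value
  core_fallback_action_name: action_default_fallback
  enable_fallback_prediction: true
```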

Moreover, the following policies have been deprecated in favor of RulePolicy. You can still implement the same functionalities using RulePolicy and their respective classifiers.

  • Mapping policy
  • Fallback policy
  • Two-stage fallback policy
  • Form policy
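As an example of such a replacement, the old fallback behaviour can be approximated with FallbackClassifier plus a rule. This is a sketch under assumptions: the threshold is arbitrary, and utter_please_rephrase is a hypothetical response that you would have to define in your domain. The two fragments belong to two separate files.

```yaml
# config.yml (fragment)
pipeline:
# ... your other NLU components go here ...
- name: FallbackClassifier
  threshold: 0.7               # example value

policies:
- name: RulePolicy

# data/rules.yml (fragment)
rules:
- rule: Ask the user to rephrase when NLU confidence is low
  steps:
  - intent: nlu_fallback
  - action: utter_please_rephrase   # hypothetical response name
```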

Importers

By default, Rasa uses the RasaFileImporter. You can now create your own data parser to load training data in other formats. This is useful if your training data comes from different sources.

In addition, there is an experimental feature called MultiProjectImporter which allows you to combine the datasets from multiple Rasa projects into one. For example, you can modularize your project into the following sub-projects:

  • restaurant bot
  • room booking bot
  • chitchat bot

and combine them later on to create a full-fledged chatbot.
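A minimal sketch of what the parent project's configuration could look like. The folder names under projects are hypothetical and simply mirror the three sub-projects listed above.

```yaml
# config.yml of the parent project (illustrative)
importers:
- name: MultiProjectImporter

imports:
- projects/restaurant      # hypothetical paths, relative to this file
- projects/room_booking
- projects/chitchat
```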

Domain

The data structure of domain.yml remains the same, except that you should specify the format version at the top of the file. The same applies to all of your training data files. If the key is omitted, Rasa assumes the latest format supported by the installed version, which is 2.0 here.

version: "2.0"

intents:
  - greet
  - goodbye
  - affirm
  - deny
  - mood_great
  - mood_unhappy
  - bot_challenge

NLU

For NLU training data, the new structure for version 2.0 is as follows:

version: "2.0"

nlu:
- intent: greet
  examples: |
    - Hey
    - Hi
    - hello

- intent: goodbye
  examples: |
    - Goodbye
    - bye bye

Metadata

You can also declare additional metadata, which contains arbitrary key-value pairs and is accessible by your custom components. For example, you can declare it as follows:

nlu:
- intent: greet
  metadata:
    sentiment: neutral
  examples:
  - text: |
      hi
  - text: |
      hello

In the example given above, metadata is declared at the intent level, so every example of that intent carries it. You can also declare metadata individually for each example:

nlu:
- intent: greet
  examples:
  - text: |
      hi
    metadata:
      sentiment: neutral
  - text: |
      hello

Retrieval Intent

In the old version, retrieval intents were an experimental feature in which an intent can be divided into smaller sub-intents. This helps a lot when building small talk, as it reduces overhead in your stories. You need to use the / symbol to separate the main intent and its sub-intent. Have a look at the following example for a chitchat intent with two sub-intents:

nlu:
- intent: chitchat/ask_name
  examples: |
    - What is your name?
    - May I know your name?

- intent: chitchat/ask_weather
  examples: |
    - What is the weather?
    - May I know the current weather outside?

Unlike normal intents, the answers have to be placed inside responses.yml instead of domain.yml. The file should be located under the data folder, and its format is exactly the same as the responses section of domain.yml. This means that you can have multiple variations and specify different payloads.

responses:
  utter_chitchat/ask_name:
    - text: "I don't have a name!"
  utter_chitchat/ask_weather:
    - text: "It is sunny"
    - text: "It is raining heavily outside"
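Retrieval intents also need a ResponseSelector component in the NLU pipeline so that a sub-intent can be matched to its response. A minimal sketch, in which the other components and the epochs value are assumptions for illustration:

```yaml
# config.yml (fragment, illustrative)
pipeline:
- name: WhitespaceTokenizer
- name: CountVectorsFeaturizer
- name: DIETClassifier
- name: ResponseSelector
  epochs: 100              # assumed value
```

You would then add a rule that maps the chitchat intent to its retrieval action. The naming of that action has changed between 2.x releases, so it is worth checking the documentation for the version you have installed.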

Entities

The format for entities is still the same as the previous version, together with the experimental role and group labels. If you are not aware of it, role and group labels can be used to distinguish certain concepts in the same entity. Consider the following example:

Book me a flight from [Malaysia]{"entity": "country"} to [Singapore]{"entity": "country"}.

From a human perspective, even though both of the entities refer to country, we know that the first country refers to departure while the second entity refers to destination. We can label it using this experimental feature as follows:

Book me a flight from [Malaysia]{"entity": "country", "role": "departure"} to [Singapore]{"entity": "country", "role": "destination"}.
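In the new YAML format, the annotated sentence simply goes inside the examples block of an intent. A short sketch, where the intent name book_flight is a hypothetical choice:

```yaml
# data/nlu.yml (fragment, illustrative)
version: "2.0"

nlu:
- intent: book_flight      # hypothetical intent name
  examples: |
    - Book me a flight from [Malaysia]{"entity": "country", "role": "departure"} to [Singapore]{"entity": "country", "role": "destination"}.
```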

Stories

You can keep stories and NLU data in a single file, but it is highly recommended to separate them. The new format is a lot more verbose than the old one, but it clearly distinguishes intents from actions. As a result, you are less likely to make mistakes when building your stories.

version: "2.0"

stories:
- story: happy path
  steps:
  - intent: greet
  - action: utter_greet
  - intent: goodbye
  - action: utter_bye

Similar to NLU, you can define metadata inside stories to store relevant information related to the story. metadata is not used in training and will not impact the performance of your stories.
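A small sketch of a story carrying metadata. The keys author and reviewed are arbitrary examples, since any key-value pairs are allowed.

```yaml
# data/stories.yml (fragment, illustrative)
stories:
- story: happy path
  metadata:
    author: Ng Wai Foong   # arbitrary key-value pairs
    reviewed: true
  steps:
  - intent: greet
  - action: utter_greet
```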

Forms

Forms are now defined declaratively as part of your project data rather than as FormAction classes in the Rasa SDK. They require RulePolicy, which is already part of the default configuration. First, define the form in domain.yml as follows:

forms:
  your_form:
    age:
    - type: from_entity
      entity: age

Then you can trigger the form in a story or rule using the action and active_loop fields. While the form is active, Rasa keeps asking for each missing slot by calling either utter_ask_{form_name}_{slot_name} or utter_ask_{slot_name}, until the age slot is filled. You need to define these responses in your domain.

stories:
 - story: form asking for age
   steps:
   - intent: intent_ask_age
   - action: your_form
   - active_loop: your_form
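To round off the form example, here is a sketch of the response that asks for the slot, plus a rule that runs once the form has finished. The wording of the texts and the utter_submit response are assumptions for illustration. The two fragments belong to two separate files.

```yaml
# domain.yml (fragment): the prompt Rasa sends while the age slot is empty
responses:
  utter_ask_age:
  - text: "How old are you?"
  utter_submit:            # hypothetical response name
  - text: "Thanks, I have noted your age."

# data/rules.yml (fragment): runs once the form has filled every slot
rules:
- rule: Submit your_form
  condition:
  - active_loop: your_form
  steps:
  - action: your_form
  - active_loop: null
  - slot_was_set:
    - requested_slot: null
  - action: utter_submit
```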

Rules

Rules describe a small part of a conversation that always follows a fixed path. You can think of them as one-turn interactions in which the same answer is always returned. Unlike stories, rules do not generalize and mostly serve to answer FAQs. They are akin to the triggers used with MappingPolicy in domain.yml in Rasa 1.x. Let’s say the bot should always respond with utter_greet whenever users greet it. You can easily define it using rules as follows:

rules:
- rule: Say `hello` whenever users greet the bot
  steps:
  - intent: greet
  - action: utter_greet
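For comparison, this is roughly how the same behaviour was expressed in Rasa 1.x with MappingPolicy. It is shown only to illustrate what the rule replaces. The two fragments belong to two separate files.

```yaml
# domain.yml in Rasa 1.x (old format, for comparison only)
intents:
- greet:
    triggers: utter_greet

# config.yml in Rasa 1.x
policies:
- name: MappingPolicy
```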

Thanks for reading this post!
