DEV Community

Sarthak Mittal


Story Chatbot — Jina.ai

Jina.ai is a young open source neural search company built from the ground up with deep learning and AI. Jina offers a very easy way to index and query documents, audio, images and videos using state-of-the-art NLP and vision techniques.
Rasa is a well-known conversational AI platform focused on creating chatbots at a rapid pace with customizable NLP and dialog policies.
Semantic search is a way to search content by understanding the meaning of what the user wants rather than just matching keywords. It can be very useful when building chatbots that serve information to users. Rasa currently doesn't have a native way to do search. The closest thing to search in Rasa is Retrieval Actions, which are used to handle FAQs.

Jina.ai is a perfect open source companion for Rasa that closes this gap. Jina understands not only text but also images and other modalities.
There are 3 main components in the Story Chatbot:
Jina search
Rasa chatbot integrated with Jina
Flask based UI using rasa webchat to display the bot
Jina search
For the Jina.ai search in the Story Chatbot, we need to do the following for a search query to work.
create a Jina project using cookiecutter
download the data from Kaggle
index the data using the Jina CLI
start the Jina search process, which uses the indexed data

jina-box search widget
Create a Jina project using cookiecutter
Create a project folder called story_project.
mkdir story_project
cd story_project
Create a new virtual environment for jina.
python3 -m venv env/search_story_env
source env/search_story_env/bin/activate
Install cookiecutter and use the Jina template with the following values when prompted.
pip install -U cookiecutter && cookiecutter gh:jina-ai/cookiecutter-jina
project_name: Search Stories (non-default)
jina_version: 0.5.5
project_slug: search_stories
task_type: nlp (non-default)
index_type: strings (non-default)
public_port: 65482
Navigate to the search folder.
cd search_stories
Add kaggle to the auto-generated requirements.txt. Jina dependencies are already present in requirements.txt. Because nlp was selected during the cookiecutter prompts, the project is pre-populated with torch and transformers dependencies.

Install the requirements.
pip install -r requirements.txt
Download story data from Kaggle
kaggle datasets download -d shubchat/1002-short-stories-from-project-guttenberg
Set up kaggle.json (your Kaggle API token file) for the above command to work.

stories from https://www.gutenberg.org/
Index the data using Jina cli
Index the story data using Jina with the command below.
Optional (default is 500): export MAX_DOCS=1100
python app.py index
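As a rough illustration of how the optional MAX_DOCS setting could be picked up inside a script, here is a minimal sketch. The helper name max_docs_from_env is hypothetical and not taken from the generated app.py; the default of 500 is the value stated above.

```python
import os


def max_docs_from_env(default: int = 500) -> int:
    """Return the document limit from the MAX_DOCS environment variable.

    Hypothetical helper for illustration; falls back to `default`
    when MAX_DOCS is unset or empty.
    """
    raw = os.environ.get("MAX_DOCS")
    if raw is None or not raw.strip():
        return default
    return int(raw)


if __name__ == "__main__":
    print(f"Indexing at most {max_docs_from_env()} documents")
```

With `export MAX_DOCS=1100` set in the shell, the helper returns 1100; without it, the default applies.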
The program invokes the code below in app.py, which loads the flows/index.yml file.

load index.yml file in app.py

index.yml
Flows are easy, high-level abstractions from Jina for tasks like indexing and searching.
Built-in executors are part of jina-hub and are a flexible way to add algorithms. A flow can have many executors in it, from jina-hub or your own custom executors. This is similar to Rasa pipelines, which make it easy to select algorithms.
The first executor mentioned here is the crafter, which splits the text into sentence chunks based on punctuation.
The second one is the encoder, which uses TransformerTorchEncoder with distilbert-base-cased. This is a wrapper around the Hugging Face torch-version transformers.
The third and fourth ones in index.yml are chunk_indexer and doc_indexer, which get data from the encoder and save it to Jina-format files.
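To make the crafter step more concrete, here is a minimal plain-Python sketch of punctuation-based sentence splitting. It is not Jina's implementation; the function name split_into_chunks and the regular expression are assumptions chosen only to illustrate the idea.

```python
import re
from typing import List


def split_into_chunks(text: str) -> List[str]:
    """Split text into sentence-like chunks after '.', '!' or '?'.

    Illustrative only: a real crafter may handle abbreviations,
    quotes and minimum/maximum chunk lengths differently.
    """
    # Split on whitespace that directly follows sentence-ending punctuation.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    # Drop empty pieces (for example, when the input is empty).
    return [part for part in parts if part]


if __name__ == "__main__":
    for chunk in split_into_chunks("Once upon a time. A fox ran! Where to?"):
        print(chunk)
```

Each chunk produced this way would then be passed to the encoder and indexed separately.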
Jina Search process
Now that the data is indexed, the next step is to start the search process, which can access the indexed data, by running the command below. The search flow uses query.yml, which has the same executors as indexing with the addition of a ranker.
python app.py search

query.yml for jina search
This will start the search process and expose it as a POST REST API at the URL below.
http://localhost:65482/api/search
Request
{"top_k":5,"mode":"search","data":["adventure stories"]}
The top 5 results will be returned as JSON because “top_k” is 5 in the request.
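The same request can be sent from Python. The sketch below uses only the standard library and assumes the search process is running locally on the port shown above; the function names are hypothetical, and the structure of the returned JSON is not assumed here.

```python
import json
import urllib.request

# URL taken from the text above; adjust if a different public_port was chosen.
SEARCH_URL = "http://localhost:65482/api/search"


def build_search_payload(query: str, top_k: int = 5) -> dict:
    """Build the request body in the shape shown in the example request."""
    return {"top_k": top_k, "mode": "search", "data": [query]}


def search_stories(query: str, top_k: int = 5, url: str = SEARCH_URL) -> dict:
    """POST the query to the search endpoint and return the parsed JSON."""
    body = json.dumps(build_search_payload(query, top_k)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Requires the search process started with "python app.py search".
    print(json.dumps(search_stories("adventure stories"), indent=2)[:500])
```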
Jina box can also be used to view the results. Jina Search is now ready to use.
Rasa chatbot
To create the Rasa chatbot, we need to do the following.
rasa init to create a skeleton project
create an intent called search_stories
create a rasa story called search_stories with rasa forms
update domain.yml with form details and utterances
create a form action SearchStoriesForm and call the jina rest api
Start the rasa core and action server
rasa init to create a skeleton project
Open a new terminal and navigate to the project folder.
cd story_project
Create a new virtual environment for Rasa.
python3 -m venv env/rasa_env
source env/rasa_env/bin/activate
Create a folder called rasa and add a requirements.txt listing rasa, flask and requests. Install the requirements.

mkdir rasa
cd rasa
pip install -r requirements.txt
Initialize a rasa project using rasa init and train the rasa project.
rasa init --no-prompt
rasa train
Create an intent called search_stories
Create a new intent called search_stories in data/nlu.md as below. This lets Rasa recognize when we ask the chatbot to find a story.

Create a rasa story called search_stories with rasa forms

Use a Rasa form to get the story search string by prompting the user. Rasa forms are a way to get a string from the user after a prompt without running any NLP on it.
Update domain.yml with form details and utterances

Update the intent name, add a slot called search_text, mention the form name, and define what to prompt the user with while searching for books.
Create a form action SearchStoriesForm and call the jina rest api
The last step for the Rasa chatbot is to add a class called SearchStoriesForm, as shown in the git repo. This custom action calls the Jina REST API with the user's search text and returns a carousel to the user with the story links.
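As a sketch of the carousel part only, the helper below turns a list of stories into a generic-template payload. Everything here is an assumption for illustration: the function name build_carousel, the input fields "title" and "url", and the payload layout (a "generic" template with "elements" and "web_url" buttons, as commonly used by webchat carousels). The actual SearchStoriesForm in the repo may build its response differently.

```python
from typing import Dict, List


def build_carousel(stories: List[Dict[str, str]]) -> dict:
    """Build a carousel payload from stories with 'title' and 'url' keys.

    Hypothetical helper: field names and payload layout are assumptions,
    not taken from the original SearchStoriesForm.
    """
    elements = []
    for story in stories:
        elements.append(
            {
                "title": story["title"],
                "buttons": [
                    {
                        "type": "web_url",
                        "title": "Read story",
                        "url": story["url"],
                    }
                ],
            }
        )
    return {
        "type": "template",
        "payload": {"template_type": "generic", "elements": elements},
    }


if __name__ == "__main__":
    sample = [{"title": "A Sample Story", "url": "https://www.gutenberg.org/"}]
    print(build_carousel(sample))
```

Inside a custom action, a payload like this would typically be handed to the dispatcher as an attachment after the Jina results have been mapped to titles and links.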
Start the rasa core and action server
rasa run -p 5007 --cors "*" --debug
python -m rasa run actions
The Rasa chatbot is now ready.
Flask based UI using rasa webchat to display the bot
The last step is to create a Flask-based UI using the rasa webchat component.
Create a file called app.py and add a Jinja template, index.html, in the templates folder. Configure rasa webchat in index.html.

Start the Flask server using the command below.
python app.py
By default, the server starts on port 5000.
