This post was first published on my blog in Turkish.
First of all, I apologize for any grammar mistakes.
Hi! In this article, I will give some information about using Python with Elasticsearch.
What is Elasticsearch?
Elasticsearch is an open-source, RESTful, distributed search and analytics engine built on Apache Lucene.
Using Elasticsearch with Python and Flask
Before starting, I should say that I'll use the Flask framework, and we will send our requests through Postman.
In a real-world application, you would of course design this very differently.
Requirements
A JDK must be installed. In this article, I will not cover how to install the JDK / JRE; the installation docs differ depending on your operating system.
If you don't have a JDK on your system, install it before continuing.
Installation
In this article, I will use Elasticsearch version 6.2.1.
The official download page offers DEB packages for Debian-based systems, RPM packages for Fedora / RedHat and similar, and an MSI installer for Windows. ZIP / TAR archives are available for other systems.
Download Link: https://www.elastic.co/downloads/past-releases/elasticsearch-6-2-1
If you are using Ubuntu, the installation looks like this:
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.2.1.deb
sudo dpkg -i elasticsearch-6.2.1.deb
Since the Java dependency is already installed, we will not get any errors. After that, enable the Elasticsearch service so that it starts at boot, and start it:
sudo systemctl enable elasticsearch
sudo systemctl start elasticsearch
We will not go into advanced configuration because this is a basic-level article. But it is a good step for us to learn how to make some basic changes.
The configuration file is elasticsearch.yml. We will edit this file:
sudo nano /etc/elasticsearch/elasticsearch.yml
Let's find cluster.name with CTRL + W and update its value (uncomment the line if needed). Our cluster's name can be Elton John, if we want. Let's also set node.name:
cluster.name: eltonjohn
node.name: "Our First Node"
After that, we need to save the file. In nano, we use these key combinations:
CTRL + X
Y
Enter
The Elasticsearch service needs to be restarted:
sudo systemctl restart elasticsearch
Let's test it by sending the following curl request from the terminal:
curl -X GET 'http://localhost:9200'
Elasticsearch listens on port 9200 by default, but you can set your own port number. The above request returned the following result:
{
  "name": "Our First Node",
  "cluster_name": "eltonjohn",
  "cluster_uuid": "FArhKpkhSZqjDfs3oSBMHw",
  "version": {
    "number": "6.2.1",
    "build_hash": "7299dc3",
    "build_date": "2018-02-07T19:34:26.990113Z",
    "build_snapshot": false,
    "lucene_version": "7.2.1",
    "minimum_wire_compatibility_version": "5.6.0",
    "minimum_index_compatibility_version": "5.0.0"
  },
  "tagline": "You Know, for Search"
}
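The same check can be done from Python. The sketch below is only an illustration using the standard library: the helper names fetch_cluster_info and summarize_cluster_info are my own, and the sample is an abbreviated copy of the response above so the code runs even without a live server.

```python
import json
from urllib.request import urlopen


def fetch_cluster_info(base_url="http://localhost:9200"):
    """Request the root endpoint and return the decoded JSON body.

    Assumes Elasticsearch is reachable at base_url (hypothetical default).
    """
    with urlopen(base_url, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))


def summarize_cluster_info(info):
    """Pick the fields that confirm our configuration changes took effect."""
    return {
        "node": info["name"],
        "cluster": info["cluster_name"],
        "version": info["version"]["number"],
    }


# Abbreviated copy of the response, so the sketch runs offline.
sample = json.loads("""
{
  "name": "Our First Node",
  "cluster_name": "eltonjohn",
  "version": {"number": "6.2.1", "lucene_version": "7.2.1"},
  "tagline": "You Know, for Search"
}
""")

summary = summarize_cluster_info(sample)
print(summary)
```

With the service running, summarize_cluster_info(fetch_cluster_info()) should report the node and cluster names set in elasticsearch.yml.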
Of course, clients should not send requests to the Elasticsearch endpoint directly, where anyone could see it in the browser's network activity panel.
For this reason, let's put Python and Flask in front of it as follows:
Flask and Elasticsearch Installation with Virtualenv
First, let's set up Flask in a virtualenv:
virtualenv venv
. venv/bin/activate
pip install Flask
Then we will install the elasticsearch-py module:
pip install elasticsearch
More Information: https://elasticsearch-py.readthedocs.io/en/master/
The Python dependencies are now installed. Let's create a file called main.py and import the elasticsearch-py module and Flask:
from datetime import datetime
from flask import Flask, jsonify, request
from elasticsearch import Elasticsearch
Then let's initialize Flask and the Elasticsearch client:
es = Elasticsearch()
app = Flask(__name__)
We will create three methods. Normally, we shouldn't structure an application like this, but here I'm focusing only on Elasticsearch.
@app.route('/', methods=['GET'])
def index():
    results = es.get(index='contents', doc_type='title', id='my-new-slug')
    return jsonify(results['_source'])
This first method fetches a document by its id from the index and document type we specified in Elasticsearch. Since there is no content at this point, the request will fail with a not-found error instead of returning a result; it will work once a document with this id has been indexed.
Note that the index, doc_type and id values are entirely up to you.
The get method returns metadata along with the document; the document itself is under the _source key.
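To make the role of the _source key concrete, here is a small sketch that works on plain dictionaries shaped like a get response. The helper name extract_source and the sample values are hypothetical. Note that with elasticsearch-py a missing document usually surfaces as an exception, so the None branch mainly matters if you inspect a raw REST response.

```python
def extract_source(get_response):
    """Return the stored document from a get-style response, or None."""
    if not get_response.get("found", False):
        return None
    return get_response["_source"]


# Hypothetical response for a document that exists.
found_response = {
    "_index": "contents",
    "_type": "title",
    "_id": "my-new-slug",
    "_version": 1,
    "found": True,
    "_source": {
        "slug": "my-new-slug",
        "title": "My title",
        "content": "My content",
    },
}

# Hypothetical response body for a document that does not exist.
missing_response = {
    "_index": "contents",
    "_type": "title",
    "_id": "unknown-slug",
    "found": False,
}

document = extract_source(found_response)
nothing = extract_source(missing_response)
print(document)
print(nothing)
```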
@app.route('/insert_data', methods=['POST'])
def insert_data():
    slug = request.form['slug']
    title = request.form['title']
    content = request.form['content']

    body = {
        'slug': slug,
        'title': title,
        'content': content,
        'timestamp': datetime.now()
    }

    result = es.index(index='contents', doc_type='title', id=slug, body=body)
    return jsonify(result)
In this step, suppose we post three form values named slug, title, and content. We pass these values to the body parameter of the index() method, which adds the new document.
If you want, you can use a different index name, change the document type, or add a document with a different slug.
If a document with the same id is already indexed, it is overwritten and its version is incremented instead of a new document being added.
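Because the route only accepts POST and reads form fields, a plain browser visit (a GET request) is rejected. As a rough sketch of what Postman sends, the standard-library snippet below builds an equivalent form-encoded POST request. The helper names and sample values are my own, and the base URL assumes the Flask app is running locally on port 5000.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_insert_request(slug, title, content,
                         base_url="http://127.0.0.1:5000"):
    """Build a form-encoded POST request for the /insert_data route."""
    form = urlencode({"slug": slug, "title": title, "content": content})
    return Request(
        base_url + "/insert_data",
        data=form.encode("utf-8"),
        method="POST",
    )


def send(request):
    """Send the request; only works while the Flask app is running."""
    with urlopen(request, timeout=5) as response:
        return response.read().decode("utf-8")


req = build_insert_request("my-new-slug", "My title", "My content")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Calling send(req) twice with the same slug should show the version number in the returned JSON increasing.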
Let's go to the last step. We will learn to search at this point. Let's create our third and last method for this:
@app.route('/search', methods=['POST'])
def search():
    keyword = request.form['keyword']

    body = {
        "query": {
            "multi_match": {
                "query": keyword,
                "fields": ["content", "title"]
            }
        }
    }

    res = es.search(index="contents", doc_type="title", body=body)
    return jsonify(res['hits']['hits'])
In this example, we use a query type called multi_match, which lets us search more than one field of the document at once. The search text comes from a form value called keyword.
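If the query body needs to vary, it can be built by a small function. This is only a sketch: build_multi_match is a hypothetical helper, and the optional boosts argument relies on the field^boost syntax that multi_match accepts for weighting one field above another.

```python
def build_multi_match(keyword, fields=("content", "title"), boosts=None):
    """Build a multi_match search body, optionally weighting some fields."""
    boosts = boosts or {}
    weighted = [
        f"{name}^{boosts[name]}" if name in boosts else name
        for name in fields
    ]
    return {
        "query": {
            "multi_match": {
                "query": keyword,
                "fields": weighted,
            }
        }
    }


# Same body as in the search method.
plain = build_multi_match("content")

# Variant where matches in the title count twice as much.
boosted = build_multi_match("content", boosts={"title": 2})

print(plain)
print(boosted)
```

Either dictionary could be passed as the body argument of es.search.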
In the response, the inner hits key (hits.hits) is a JSON array listing all documents with matching content. Our method returns only that array; the full response from Elasticsearch looks like this:
{
  "_shards": {
    "failed": 0,
    "skipped": 0,
    "successful": 5,
    "total": 5
  },
  "hits": {
    "hits": [
      {
        "_id": "other-my-diff",
        "_index": "contents",
        "_score": 0.2876821,
        "_source": {
          "content": "What kind of content is this?",
          "slug": "other-my-diff",
          "timestamp": "2018-02-11T16:08:10.409353",
          "title": "Very different title second"
        },
        "_type": "title"
      },
      {
        "_id": "other-my",
        "_index": "contents",
        "_score": 0.2876821,
        "_source": {
          "content": "What kind of content?",
          "slug": "other-my",
          "timestamp": "2018-02-11T16:00:51.613402",
          "title": "Very different title"
        },
        "_type": "title"
      }
    ],
    "max_score": 0.2876821,
    "total": 2
  },
  "timed_out": false,
  "took": 3
}
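A client usually wants just the documents and their scores. The sketch below flattens a response of this shape into a simple list. The helper name flatten_hits is hypothetical, and the sample is an abbreviated copy of the response above.

```python
def flatten_hits(search_response):
    """Turn hits.hits into a list of flat dictionaries."""
    rows = []
    for hit in search_response["hits"]["hits"]:
        row = {"id": hit["_id"], "score": hit["_score"]}
        # Copy the stored document fields next to the metadata.
        row.update(hit["_source"])
        rows.append(row)
    return rows


# Abbreviated copy of the search response.
sample_response = {
    "hits": {
        "hits": [
            {
                "_id": "other-my-diff",
                "_score": 0.2876821,
                "_source": {
                    "slug": "other-my-diff",
                    "title": "Very different title second",
                    "content": "What kind of content is this?",
                },
            },
            {
                "_id": "other-my",
                "_score": 0.2876821,
                "_source": {
                    "slug": "other-my",
                    "title": "Very different title",
                    "content": "What kind of content?",
                },
            },
        ],
        "max_score": 0.2876821,
        "total": 2,
    }
}

rows = flatten_hits(sample_response)
for row in rows:
    print(row["id"], row["score"], row["title"])
```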
That is all. Keep in mind that Elasticsearch is a search engine and should not be thought of as a database.
All Code
All of the code is here. If the dependencies are installed and Elasticsearch is running, the following code will work.
from datetime import datetime

from flask import Flask, jsonify, request
from elasticsearch import Elasticsearch

# With no arguments, the client connects to localhost:9200.
es = Elasticsearch()
app = Flask(__name__)


@app.route('/', methods=['GET'])
def index():
    # Fetch a single document by id; the document itself is under _source.
    results = es.get(index='contents', doc_type='title', id='my-new-slug')
    return jsonify(results['_source'])


@app.route('/insert_data', methods=['POST'])
def insert_data():
    # Read the three form values sent by the client.
    slug = request.form['slug']
    title = request.form['title']
    content = request.form['content']

    body = {
        'slug': slug,
        'title': title,
        'content': content,
        'timestamp': datetime.now()
    }

    # The slug is used as the document id, so re-posting it updates the document.
    result = es.index(index='contents', doc_type='title', id=slug, body=body)
    return jsonify(result)


@app.route('/search', methods=['POST'])
def search():
    keyword = request.form['keyword']

    # Search the keyword in both the content and title fields.
    body = {
        "query": {
            "multi_match": {
                "query": keyword,
                "fields": ["content", "title"]
            }
        }
    }

    res = es.search(index="contents", doc_type="title", body=body)
    return jsonify(res['hits']['hits'])


if __name__ == '__main__':
    app.run(port=5000, debug=True)
Thank you for reading
Top comments (4)
I used this code and basically got an error on 127.0.0.1:5000 saying the index doesn't exist. I tried to hit 127.0.0.1:5000/insert_data and got a 405, not allowed. What am I missing?
Hi, I didn't follow the code or run it locally. But from what you describe, when you hit
127.0.0.1:5000/insert_data
you send a GET request, and that route only accepts POST, so you get the not allowed error. You would probably need to do
curl -X POST -d 'slug=my-new-slug' -d 'title=My title' -d 'content=My content' http://127.0.0.1:5000/insert_data
where the form fields are your data matching your es schema.
Hi, great article. However, what if I wanted to use multiple search queries, like
search?slug=blah&title=blahblah&price=2
? Is this possible with Elasticsearch and Flask?
Hi, I am trying to connect Elasticsearch with SQLAlchemy. Do you know how to do it?