DEV Community

Mallikarjun H T


Elasticsearch pipeline

Here's a quick look at how to use Elasticsearch's ingest processors to change data quickly.
Many of the answers available online are hard to follow. I recommend the official documentation, though understanding it takes some trial and error. In this post, I've included some pipeline definitions that convert text to lowercase or uppercase, trim it, and clean it up.

UPPERCASE AND LOWERCASE PROCESSORS

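If you have a string of many words and want to handle each word separately, divide it first. The pipelines below use a 'split processor' to turn the string into an array, then loop through it with 'foreach' to convert each word to uppercase or lowercase. (The uppercase and lowercase processors also accept a plain string field directly, so the split is only needed for word-by-word processing.)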

curl -XPUT "http://localhost:9200/_ingest/pipeline/uppercase_Processor" -H 'Content-Type: application/json' -d'
{
    "processors": [
        {
            "split": {
                "field": "name",
                "separator": " ",
                "target_field": "name",
                "ignore_missing": true
            }
        },
        {
            "foreach": {
                "field": "name",
                "processor": {
                    "uppercase": {
                        "field": "_ingest._value"
                    }
                }
            }
        }
    ]
}'

The lowercase pipeline is identical apart from the processor name:

curl -XPUT "http://localhost:9200/_ingest/pipeline/lowercase_Processor" -H 'Content-Type: application/json' -d'
{
    "processors": [
        {
            "split": {
                "field": "name",
                "separator": " ",
                "target_field": "name",
                "ignore_missing": true
            }
        },
        {
            "foreach": {
                "field": "name",
                "processor": {
                    "lowercase": {
                        "field": "_ingest._value"
                    }
                }
            }
        }
    ]
}'
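
To check what a pipeline does before using it, the simulate API can run it against a sample document without indexing anything. This is a minimal sketch, assuming the uppercase_Processor pipeline above has already been created and that Elasticsearch is reachable at localhost:9200. The sample document and the variable names are made up for illustration.

```shell
#!/usr/bin/env bash
# Assumed values: a local cluster URL and a made-up sample document.
ES_URL="http://localhost:9200"
SAMPLE_DOCS='{
    "docs": [
        { "_source": { "name": "john doe" } }
    ]
}'

# Run the stored pipeline against the sample; nothing is written to any index.
curl -s -XPOST "$ES_URL/_ingest/pipeline/uppercase_Processor/_simulate?pretty" \
    -H 'Content-Type: application/json' -d "$SAMPLE_DOCS" \
    || echo "request failed: is Elasticsearch running at $ES_URL?" >&2
```

For this sample, the name field in the response should come back as an array of uppercased words rather than a single string.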

The result will be an array of values. If you're wondering what the point of doing this is, don't worry: the 'join processor' turns the array back into a single string.

curl -XPUT "http://localhost:9200/_ingest/pipeline/join_Processor" -H 'Content-Type: application/json' -d'
{
    "processors": [
        {
            "join": {
                "field": "name",
                "separator": " "
            }
        }
    ]
}'
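
Since processors in a pipeline run in the order they are listed, the split, foreach, and join steps can also live in a single pipeline. The following is a sketch under that assumption. The pipeline name uppercase_join_Processor and the variable names are hypothetical, chosen only for this example.

```shell
#!/usr/bin/env bash
# Assumed values: a local cluster URL and a hypothetical pipeline name.
ES_URL="http://localhost:9200"
PIPELINE_NAME="uppercase_join_Processor"

# Same three steps as the separate pipelines: split, uppercase each word, join.
PIPELINE_BODY='{
    "processors": [
        { "split": { "field": "name", "separator": " ", "ignore_missing": true } },
        { "foreach": { "field": "name", "processor": { "uppercase": { "field": "_ingest._value" } } } },
        { "join": { "field": "name", "separator": " " } }
    ]
}'

curl -s -XPUT "$ES_URL/_ingest/pipeline/$PIPELINE_NAME" \
    -H 'Content-Type: application/json' -d "$PIPELINE_BODY" \
    || echo "request failed: is Elasticsearch running at $ES_URL?" >&2
```

With this layout a document goes in with a plain string and comes out with a plain string, so there is no intermediate array left in the stored data.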

Do you need to trim leading and trailing spaces?
In JavaScript we can call .trim(). In Elasticsearch, the equivalent is the Trim processor.

curl -XPUT "http://localhost:9200/_ingest/pipeline/trim_Processor" -H 'Content-Type: application/json' -d'
{
    "processors": [
        {
            "trim": {
                "field": "name"
            }
        }
    ]
}'

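Need to remove special characters or convert them to a desired character?
Let's say you want to turn double spaces into single spaces: gsub is the processor to use. The pattern below matches any run of whitespace and replaces it with one space.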

curl -XPUT "http://localhost:9200/_ingest/pipeline/gsub_Processor" -H 'Content-Type: application/json' -d'
{
    "processors": [
        {
            "gsub": {
                "field": "name",
                "pattern": "\\s+",
                "replacement": " "
            }
        }
    ]
}'
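
The trim and gsub steps can be tried together without creating a pipeline first, because the simulate API also accepts an inline pipeline definition. This is a minimal sketch, assuming a cluster at localhost:9200. The sample value and the variable names are made up for illustration.

```shell
#!/usr/bin/env bash
# Assumed values: a local cluster URL and a made-up sample document.
ES_URL="http://localhost:9200"

# Inline pipeline: trim the ends first, then collapse inner whitespace runs.
CLEANUP_REQUEST='{
    "pipeline": {
        "processors": [
            { "trim": { "field": "name" } },
            { "gsub": { "field": "name", "pattern": "\\s+", "replacement": " " } }
        ]
    },
    "docs": [
        { "_source": { "name": "  John   Doe  " } }
    ]
}'

curl -s -XPOST "$ES_URL/_ingest/pipeline/_simulate?pretty" \
    -H 'Content-Type: application/json' -d "$CLEANUP_REQUEST" \
    || echo "request failed: is Elasticsearch running at $ES_URL?" >&2
```

Running trim before gsub means the leading and trailing spaces are removed outright instead of being collapsed to a single space at each end.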

Use this query to run a pipeline over data that is already indexed (a bulk update).

curl -XPOST "http://localhost:9200/index/_update_by_query?pipeline=join_Processor"

To update only the matching documents, add a q parameter in field:value form:

curl -XPOST "http://localhost:9200/index/_update_by_query?pipeline=join_Processor&q=name:your_name"
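
The filter can also be sent as a JSON request body instead of a URL parameter, which tends to be easier to read once the query grows. This is a sketch assuming the same index name (index), field (name), and placeholder value (your_name) used above. The variable names are hypothetical.

```shell
#!/usr/bin/env bash
# Assumed values: a local cluster URL and the placeholder index name from above.
ES_URL="http://localhost:9200"
INDEX="index"

# Only documents whose name field matches the placeholder value are reprocessed.
QUERY_BODY='{
    "query": {
        "match": { "name": "your_name" }
    }
}'

curl -s -XPOST "$ES_URL/$INDEX/_update_by_query?pipeline=join_Processor&pretty" \
    -H 'Content-Type: application/json' -d "$QUERY_BODY" \
    || echo "request failed: is Elasticsearch running at $ES_URL?" >&2
```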

