
How to Export Data from Elasticsearch into a CSV File





This tutorial shows you how to export data from Elasticsearch into a CSV file. Imagine that you have some data in Elasticsearch that you would like to open up in Excel and create pivot tables from the data. This is just one use case where exporting data from Elasticsearch into a CSV file would be useful. 

Install the logstash-input-elasticsearch plugin. You can install it by running the following:

# cd /opt/logstash/
# bin/plugin install logstash-input-elasticsearch
The use of bin/plugin is deprecated and will be removed in a feature release. Please use bin/logstash-plugin.
Validating logstash-input-elasticsearch
Installing logstash-input-elasticsearch
Installation successful

Next you need to install the logstash-output-csv plugin. Install it with:

# bin/plugin install logstash-output-csv

You should be ready to go ahead now. We are going to write an Elasticsearch query in the input section of the Logstash configuration file that will return a bunch of JSON (the results of the query that you just ran).


The logstash-output-csv plugin allows us to specify which fields from the results of our query to include when we output to a CSV file. Below is a skeleton of what the Logstash configuration file will look like. This is just to give you an idea of what a configuration looks like and what each section does.

input {
  elasticsearch {
    hosts => "localhost:9200"
    index => "index-we-are-reading-from"
    query => '
    {
      "query": {
        # Insert your Elasticsearch query here
      }
    }'
  }
}
output {
  csv {
    # These are the fields that you would like to output in CSV format.
    # Each field needs to be one of the fields shown in the output when you
    # run your Elasticsearch query.
    fields => ["field1", "field2", "field3", "field4", "field5"]
    # This is where we store output. We could store output across several files
    # by using a timestamp to determine the filename.
    path => "/tmp/csv-export.csv"
  }
}

If you are lazy when it comes to writing Elasticsearch queries in Sense or using curl, you can use a shortcut. One shortcut I like to use is to build a visualization in Kibana with the data that you would like to have in the output of your query, and then view the query that Kibana generates. You can verify that the query works by adjusting it slightly and running it in Sense. If the query works, you can then create a config file from it and run it.

Let me show you the steps that I make use of. Take note that I am building on a previous blog post where I imported the Ashley Madison data dumps into Elasticsearch. Just adapt the steps for your own use case.

Related Tutorial: How to Import from CSV into Elasticsearch 

I created a query to show a data table of the email field in the aminno_member_email index that I created in the Ashley Madison post. This is what the data table looks like:

While building the visualization, click the upward-facing arrow at the bottom of the visualization to view the request that Kibana builds in order to construct a graph from the data in Elasticsearch. This is where we view the query. See the screenshot below:

This is the query as I got it from Kibana:

{
  "size": 0,
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "query": "*",
          "analyze_wildcard": true
        }
      },
      "filter": {
        "bool": {
          "must": [
            {
              "range": {
                "@timestamp": {
                  "gte": 1443607774987,
                  "lte": 1475230174987,
                  "format": "epoch_millis"
                }
              }
            }
          ],
          "must_not": []
        }
      }
    }
  },
  "aggs": {
    "2": {
      "terms": {
        "field": "email",
        "size": 100,
        "order": {
          "_count": "desc"
        }
      }
    }
  }
}

We can use the query almost directly to view its output by copying and pasting it into Sense. Take note of how I added the index to search and made a few other small changes:

This is what I put into Sense in order to see what the query actually returns:

GET /aminno_member_email/_search?pretty
{
  "size": 0,
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "query": "*",
          "analyze_wildcard": true
        }
      },
      "filter": {
        "bool": {
          "must": [
            {
              "range": {
                "@timestamp": {
                  "gte": 1443607774987,
                  "lte": 1475230174987,
                  "format": "epoch_millis"
                }
              }
            }
          ],
          "must_not": []
        }
      }
    }
  },
  "aggs": {
    "2": {
      "terms": {
        "field": "email",
        "size": 100,
        "order": {
          "_count": "desc"
        }
      }
    }
  }
}

I had to remove the aggregation section in order to be able to use the query in my Logstash config. You can try this query in Sense too; it has no aggregation section and pretty much returns the same thing:

GET /aminno_member_email/_search?pretty
{
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "query": "*",
          "analyze_wildcard": true
        }
      },
      "filter": {
        "bool": {
          "must": [
            {
              "range": {
                "@timestamp": {
                  "gte": 1443607774987,
                  "lte": 1475230174987,
                  "format": "epoch_millis"
                }
              }
            }
          ],
          "must_not": []
        }
      }
    }
  }
}

This is my final configuration file. I will explain shortly what it does and how it works:
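The configuration file itself did not survive in this copy of the post. Reconstructed from the skeleton and the query above, it would look roughly like this; the index, query, email field, and output path are all taken from earlier in the tutorial, so treat any remaining differences from the original as assumptions:

```
input {
  elasticsearch {
    hosts => "localhost:9200"
    index => "aminno_member_email"
    query => '
    {
      "query": {
        "filtered": {
          "query": {
            "query_string": {
              "query": "*",
              "analyze_wildcard": true
            }
          },
          "filter": {
            "bool": {
              "must": [
                {
                  "range": {
                    "@timestamp": {
                      "gte": 1443607774987,
                      "lte": 1475230174987,
                      "format": "epoch_millis"
                    }
                  }
                }
              ],
              "must_not": []
            }
          }
        }
      }
    }'
  }
}
output {
  csv {
    # "email" is the field we built the Kibana data table on
    fields => ["email"]
    path => "/tmp/csv-export.csv"
  }
}
```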

You can then run the configuration file with Logstash:
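The exact command is missing from this copy of the post. Assuming the configuration was saved in the Logstash directory under a filename like es-export.conf (the filename here is hypothetical), it would be along these lines:

```
# cd /opt/logstash/
# bin/logstash -f es-export.conf
```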

Once it runs, Logstash writes a CSV file to the /tmp directory which contains a few things; most notably an email address. You can view the contents of the file while Logstash is running on your configuration file by running:
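The command itself is missing here; a standard way to follow the file as Logstash appends to it is `tail -f`, with the path taken from the configuration above:

```
# tail -f /tmp/csv-export.csv
```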

. From what I've read, this option helps to prevent what has been coined CEMI (CSV Excel Macro Injection). This is where user-submitted data in your application is not properly escaped, which could lead to macros being executed on a victim's computer. You can read more about CEMI here.

This tutorial explains one method for exporting data from Elasticsearch into a CSV file. Naturally, there are many more options for optimization, but we hope this is enough information to give you a good understanding of the methodology. Questions/Comments? Drop us a line below.

