Using Elasticsearch Painless scripting to recursively iterate through JSON fields

Authors: Alexander Marquardt and Honza Kral

Painless is a simple, secure scripting language designed specifically for use with Elasticsearch. It is the default scripting language for Elasticsearch and can safely be used for inline and stored scripts. Among its many use cases, Painless can modify documents as they are ingested into your Elasticsearch cluster. In this use case, you may find that you would like to use Painless to evaluate every field in each document that Elasticsearch receives. However, because of the hierarchical nature of JSON documents, it may not be obvious how to iterate over all of the fields. ...
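
The core idea is a depth-first walk that descends into nested objects and arrays until it reaches leaf values. The following is a minimal Python sketch of that recursive walk (the article itself does this in Painless against the ingest document, but the shape of the recursion is the same):

```python
def walk_fields(obj, path=""):
    """Recursively yield (dotted_path, value) pairs for every leaf field."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            yield from walk_fields(value, f"{path}.{key}" if path else key)
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            yield from walk_fields(value, f"{path}[{index}]")
    else:
        yield path, obj

doc = {"user": {"name": "alice", "tags": ["a", "b"]}, "age": 30}
fields = dict(walk_fields(doc))
# fields == {"user.name": "alice", "user.tags[0]": "a", "user.tags[1]": "b", "age": 30}
```

In an ingest pipeline the same traversal would start from the `ctx` map that Painless exposes, with the per-field logic (masking, type checks, and so on) applied at each leaf.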

November 6, 2020

Understanding and fixing "too many script compilations" errors in Elasticsearch

When using Elasticsearch, in some rare instances you may see an error such as "Too many dynamic script compilations within X minutes". Such an error may be caused by poor script design in which parameters are hard-coded. In other cases it may be due to the script cache being too small or the compilation limit being too low. In this article, I will show how to determine whether these default limits are too low, and how they can be modified. ...
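
As a rough sketch of the kind of change involved: the compilation rate limit is a dynamic cluster setting, so it can be raised with a request like the one below (the `150/5m` value is only an example, not a recommendation — the right value depends on why compilations are happening at all):

```json
PUT _cluster/settings
{
  "persistent": {
    "script.max_compilations_rate": "150/5m"
  }
}
```

Note that raising the limit treats the symptom; if scripts embed values directly instead of passing them through `params`, each distinct value forces a fresh compilation, and parameterizing the script is the real fix.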

October 21, 2020

Converting CSV to JSON in Filebeat

Many organisations use Excel files for creating and storing important data, and for various reasons it may be useful to import such data into Elasticsearch. For example, one may need to get master data that was created in a spreadsheet into Elasticsearch, where it could be used for enriching Elasticsearch documents. Or one may wish to use Elasticsearch and Kibana for analysing a dataset that is only available in a spreadsheet. In such cases, one option is to use Filebeat for uploading such CSV data into an Elasticsearch cluster. ...
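
One possible shape for such a setup, sketched below, uses Filebeat's `decode_csv_fields` and `extract_array` processors to parse each CSV line and map its columns to named fields (the file path and the `name`/`price` field names here are placeholders, and the exact approach in the article may differ):

```yaml
filebeat.inputs:
  - type: log
    paths:
      - /path/to/data.csv
    processors:
      # Parse the raw CSV line in the "message" field into an array.
      - decode_csv_fields:
          fields:
            message: decoded_csv
          separator: ","
      # Map array positions to named fields.
      - extract_array:
          field: decoded_csv
          mappings:
            name: 0
            price: 1
```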

March 17, 2020

Using Logstash and Elasticsearch scripted upserts to transform eCommerce purchasing data

Logstash is a tool that can be used to collect, process, and forward events to Elasticsearch. In order to demonstrate the power of Logstash when used in conjunction with Elasticsearch's scripted upserts, I will show you how to create a near-real-time entity-centric index. Once data is transformed into an entity-centric index, many kinds of analysis become possible with simple (cheap) queries rather than with more computationally intensive aggregations. Note that the approach demonstrated here results in documents similar to those generated by Elasticsearch transforms; however, the documented technique has not been benchmarked against Elasticsearch transforms, as the main goal of this blog is to demonstrate the power and flexibility of Logstash combined with scripted upserts. ...
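
For readers unfamiliar with scripted upserts, the basic building block looks like the request below: with `"scripted_upsert": true` the script runs whether or not the document already exists, accumulating each event into a per-entity document (the index name, id, and field names are illustrative, not the article's):

```json
POST /purchases/_update/user-123
{
  "scripted_upsert": true,
  "script": {
    "source": "ctx._source.total = (ctx._source.total == null ? 0 : ctx._source.total) + params.amount",
    "params": { "amount": 42.5 }
  },
  "upsert": {}
}
```

In the Logstash pipeline, the equivalent request is issued by the elasticsearch output for every incoming event, which is what keeps the entity-centric index near-real-time.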

December 17, 2019

Emulating transactional functionality in Elasticsearch with two-phase commits

Elasticsearch supports atomic create, update, and delete operations at the individual document level, but it has no built-in support for multi-document transactions. Although Elasticsearch does not position itself as a system of record for storing data, in some cases it may be necessary to modify multiple documents as a single cohesive unit. In this blog post we therefore present a two-phase commit protocol which can be used to emulate multi-document transactions. ...
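
The general shape of a two-phase commit, independent of the specific protocol the article develops, can be sketched in a few lines: phase one durably records the intended changes in a transaction document, phase two applies them and marks the transaction complete, so a crash between the phases leaves an intent record from which recovery can finish or roll back the work. (The helper names and in-memory dictionaries below are purely illustrative stand-ins for Elasticsearch indices.)

```python
store = {}          # stands in for an index of ordinary documents
transactions = {}   # stands in for an index of transaction documents

def begin(txn_id, changes):
    # Phase 1: durably record what we intend to change.
    transactions[txn_id] = {"state": "pending", "changes": changes}

def commit(txn_id):
    # Phase 2: apply every change, then mark the unit committed.
    txn = transactions[txn_id]
    for doc_id, fields in txn["changes"].items():
        store.setdefault(doc_id, {}).update(fields)
    txn["state"] = "committed"

begin("t1", {"a": {"balance": 10}, "b": {"balance": -10}})
commit("t1")
```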

December 5, 2019

Converting local time to ISO 8601 time in Elasticsearch

This article is available at: https://www.elastic.co/blog/converting-local-time-to-iso-8601-time-in-elasticsearch

October 16, 2019

Counting unique beats agents sending data into Elasticsearch

When using Beats with Elasticsearch, it may be useful to keep track of how many unique agents are sending data into an Elasticsearch cluster, and how many documents each agent is submitting. Such information could be useful, for example, for detecting whether Beats agents are behaving as expected. In this blog post, I first discuss how to efficiently specify a filter for documents corresponding to a particular time range, followed by several methods for detecting how many Beats agents are sending documents to Elasticsearch within the specified time range. ...
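
One straightforward way to combine a time-range filter with a unique-agent count is a `range` query plus `cardinality` and `terms` aggregations, as sketched below (the index pattern and field name are illustrative — the field that identifies an agent, such as `agent.hostname` or `agent.id`, varies by Beats version):

```json
GET /filebeat-*/_search
{
  "size": 0,
  "query": {
    "range": { "@timestamp": { "gte": "now-1h" } }
  },
  "aggs": {
    "unique_agents": {
      "cardinality": { "field": "agent.hostname" }
    },
    "docs_per_agent": {
      "terms": { "field": "agent.hostname" }
    }
  }
}
```

Note that `cardinality` is an approximate count at high cardinalities, which is usually acceptable for monitoring how many agents are reporting.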

July 18, 2019

Debugging Elasticsearch and Lucene with IntelliJ IDEA

This article is available at: https://www.elastic.co/blog/how-to-debug-elasticsearch-source-code-in-intellij-idea

February 2, 2019

A step-by-step guide to enabling security, TLS/SSL, and PKI authentication in Elasticsearch

This article is available at: https://www.elastic.co/blog/elasticsearch-security-configure-tls-ssl-pki-authentication

November 5, 2018

Using Logstash to drive filtered data from a single source into multiple output destinations

This article is available at: https://www.elastic.co/blog/using-logstash-to-split-data-and-send-it-to-multiple-outputs

August 31, 2018