I recently migrated to Elasticsearch 5.5. My whole system is running again, but I can’t figure out how to re-enable TTL, since it has been removed from ES5. Is there an alternative configuration that keeps documents only for a set amount of time?
It looks like the recommendation from Elasticsearch is to either use time-based indices or externally schedule a process to remove documents based on timestamp.
The _timestamp and _ttl fields were deprecated and are now removed. As a replacement for _timestamp, you should populate a regular date field with the current timestamp on application side. For _ttl, you should either use time-based indices when applicable, or cron a delete-by-query with a range query on a timestamp field.
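For the delete-by-query route, a minimal sketch of what the cron job could run is shown here. It assumes your documents carry a regular date field (the name `created_at` used in the usage note is hypothetical) and that the cluster is reachable over plain HTTP; the helper names are illustrative, not part of any Elasticsearch client.

```python
import json
import urllib.request


def build_expiry_query(timestamp_field, max_age_days):
    """Build a delete-by-query body matching documents older than max_age_days."""
    return {
        "query": {
            "range": {
                # Date math: "now-30d" means 30 days before the current time.
                timestamp_field: {"lt": "now-{}d".format(max_age_days)}
            }
        }
    }


def delete_expired(base_url, index, timestamp_field, max_age_days):
    """POST the expiry query to <index>/_delete_by_query and return the parsed reply."""
    body = json.dumps(build_expiry_query(timestamp_field, max_age_days)).encode("utf-8")
    request = urllib.request.Request(
        "{}/{}/_delete_by_query".format(base_url.rstrip("/"), index),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

A scheduled job could then call something like `delete_expired("http://localhost:9200", "events", "created_at", 30)`; the URL, index name, and retention period are placeholders to adapt to your setup.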
Our recommendation is to use time-based indices. Internally we use daily indices, which allow us to easily control the amount of data in the cluster and also let us change shard counts over time if event volumes change.
The way to do this with the Elasticsearch Sink is to:
Update your alias each day after creating your new index
NOTE: You will need to ensure that your alias points to only one index for the sink to be able to work!
While the removal of TTL involves more management, it does make the cluster a lot more efficient at scale, as it is no longer constantly searching for data to expire.
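As a rough sketch of the daily alias update, the helpers below build a dated index name and the request body for the `_aliases` endpoint. The `prefix-YYYY.MM.DD` naming pattern and the function names are assumptions for illustration. Putting the remove and add actions in one request lets Elasticsearch apply them atomically, so the alias should never point at two indices at once.

```python
import datetime


def daily_index_name(prefix, day):
    """Return a dated index name such as 'events-2017.08.02' (assumed naming scheme)."""
    return "{}-{}".format(prefix, day.strftime("%Y.%m.%d"))


def alias_swap_body(alias, old_index, new_index):
    """Build the body for POST /_aliases that moves the alias to the new index."""
    return {
        "actions": [
            # Detach the alias from yesterday's index...
            {"remove": {"index": old_index, "alias": alias}},
            # ...and attach it to today's index in the same request.
            {"add": {"index": new_index, "alias": alias}},
        ]
    }
```

The resulting body would be sent with `POST /_aliases` once the new index exists. Old daily indices can then be dropped whole when they fall outside your retention window.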