It has unfortunately been a while since I sat down to write for this blog. But the writing bug is persistent – once hooked, you have to write. Serious writing requires original research, data collection/analysis, etc., and so can take a good bit of time depending on the topic. I had been playing with ELK on a routine basis, so for what I thought would be a quick win, I decided to add to the earlier blog post on Building elasticsearch clusters with Vagrant. Well, it did not quite turn out that way, and I had to cover a good bit of ground and publish code to other repos in order for this post to be useful.
To recap, that post used (a) VirtualBox as the means to build the VMs for the cluster, and (b) a shell script to orchestrate the installation & configuration of an elasticsearch cluster on those VMs. In this post we will still use VirtualBox to give us the VMs, but enhance the provisioning in two ways.
- We will build a full ELK stack where application logs are shipped by Beats to a Logstash host for grokking and posting to an ES cluster hooked to Kibana for querying & dashboards. Here is a schematic.
- The provisioning (install & config) of the software for each of E (Elasticsearch), L (Logstash), K (Kibana) and the Filebeat plugin is done via Ansible playbooks. Why? While provisioning with shell scripts is very handy, it is programmatic and can get long-winded when building complex, coupled software systems across a cluster of hosts. Ansible hides much of that and instead presents a more or less declarative way (playbooks!) of orchestrating the provisioning. While there are alternatives, Ansible has become insanely popular lately in the devops world.
=> Download from github to play along with the build out.
1. The Inventory
We need 7 VMs – 2 for applications with Filebeat, 1 ES master node, 2 ES data nodes, and 1 each for Logstash and Kibana. The names and ip addresses for these VMs will be needed both by Vagrant for creating them and by Ansible later for provisioning. So we prepare a single inventory file and use it with both Vagrant & Ansible. Further, this file rations the cpu/memory resources of my 8-core, 16GB memory laptop across these 7 VMs. The file is simply YAML that is processed in Ruby by Vagrant & in Python by Ansible. Our file looks like:
es-master-nodes:
  hosts:
    es-master-1:                     # hostname
      ansible_host: 192.168.33.25    # ip address
      ansible_user: vagrant
      memory: 2048                   # ram to be assigned in MB
      ansible_ssh_private_key_file: .vagrant/machines/es-master-1/virtualbox/private_key

es-data-nodes:
  hosts:
    es-data-1:
      ansible_host: 192.168.33.26
      ansible_user: vagrant
      memory: 2048
      ansible_ssh_private_key_file: .vagrant/machines/es-data-1/virtualbox/private_key
    es-data-2:
      ansible_host: 192.168.33.27
      ansible_user: vagrant
      memory: 2048
      ansible_ssh_private_key_file: .vagrant/machines/es-data-2/virtualbox/private_key

kibana-nodes:
  hosts:
    kibana-1:
      ansible_host: 192.168.33.28
      ansible_user: vagrant
      memory: 512
      ansible_ssh_private_key_file: .vagrant/machines/kibana-1/virtualbox/private_key

logstash-nodes:
  hosts:
    logstash-1:
      ansible_host: 192.168.33.29
      ansible_user: vagrant
      memory: 1536
      ansible_ssh_private_key_file: .vagrant/machines/logstash-1/virtualbox/private_key

filebeat-nodes:
  hosts:
    filebeat-1:
      ansible_host: 192.168.33.30
      ansible_user: vagrant
      memory: 512
      ansible_ssh_private_key_file: .vagrant/machines/filebeat-1/virtualbox/private_key
    filebeat-2:
      ansible_host: 192.168.33.31
      ansible_user: vagrant
      memory: 512
      ansible_ssh_private_key_file: .vagrant/machines/filebeat-2/virtualbox/private_key
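As a quick optional sanity check that Ansible reads this file the same way Vagrant does, you can dump the parsed inventory. This command is not part of the repo, just a convenience, and assumes your Ansible version ships the ansible-inventory utility:

# Show the groups, hosts and host variables as Ansible sees them
ansible-inventory -i inventory.yml --list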
2. The Vagrantfile
The Vagrantfile below builds each of the 7 VMs as per the specs in the inventory.
require 'rbconfig'
require 'yaml'

DEFAULT_BASE_BOX = "bento/ubuntu-16.04"
cpuCap = 10 # Limit to 10% of the cpu

inventory = YAML.load_file("inventory.yml") # Get the names & ip addresses for the guest hosts

VAGRANTFILE_API_VERSION = '2'
Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
  config.vbguest.auto_update = false
  inventory.each do |group, groupHosts|
    next if (group == "justLocal")
    groupHosts['hosts'].each do |hostName, hostInfo|
      config.vm.define hostName do |node|
        node.vm.box = hostInfo['box'] ||= DEFAULT_BASE_BOX
        node.vm.hostname = hostName                                     # Set the hostname
        node.vm.network :private_network, ip: hostInfo['ansible_host']  # Set the IP address
        ram = hostInfo['memory']                                        # Set the memory
        node.vm.provider :virtualbox do |vb|
          vb.name = hostName
          vb.customize ["modifyvm", :id, "--cpuexecutioncap", cpuCap, "--memory", ram.to_s]
        end
      end
    end
  end
end
The VMs are created simply with vagrant up --no-provision, and the cluster is then provisioned with Ansible.
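In case it helps, the full sequence I run from the repo root looks like the following sketch; the ansible ping at the end is just an optional connectivity check, not something the repo requires.

# Build the 7 VMs defined in inventory.yml, without provisioning them yet
vagrant up --no-provision

# Confirm they all came up
vagrant status

# Optional: verify Ansible can reach every VM over SSH using the same inventory
ansible all -i inventory.yml -m ping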
3. The Playbook
The main playbook is simple, delegating the specific app provisioning to roles while overriding some defaults as needed. We override the port variables in the main playbook so we can see that they match up with our schematic for the cluster. Some other variables are overridden in group_vars/* files to keep them from cluttering the main playbook. The cluster is provisioned with
ansible-playbook -i inventory.yml elk.yml
- hosts: es-master-nodes
  become: true
  roles:
    - { role: elastic.elasticsearch, cluster_http_port: 9201, cluster_transport_tcp_port: 9301 }

- hosts: es-data-nodes
  become: true
  roles:
    - { role: elastic.elasticsearch, cluster_http_port: 9201, cluster_transport_tcp_port: 9301 }

- hosts: kibana-nodes
  become: true
  roles:
    - { role: ashokc.kibana, kibana_server_port: 5601, cluster_http_port: 9201 }

- hosts: logstash-nodes
  become: true
  roles:
    - { role: ashokc.logstash, cluster_http_port: 9201, filebeat_2_logstash_port: 5044 }

- hosts: filebeat-nodes
  become: true
  roles:
    - { role: ashokc.filebeat, filebeat_2_logstash_port: 5044 }
The overall layout of the playbook repository is:
.
├── elk.yml
├── group_vars
│   ├── all.yml
│   ├── es-data-nodes.json
│   ├── es-master-nodes.json
│   ├── filebeat-nodes.yml
│   ├── kibana-nodes.yml
│   └── logstash-nodes.yml
├── inventory.yml
├── roles
│   ├── ashokc.filebeat
│   ├── ashokc.kibana
│   ├── ashokc.logstash
│   └── elastic.elasticsearch
└── Vagrantfile
Variables common to all the groups are set in group_vars/all.yml:
public_iface: eth1 # For Vagrant Provider
elk_version: 5.6.1
es_major_version: 5.x
es_apt_key: https://artifacts.elastic.co/GPG-KEY-elasticsearch
es_version: "{{ elk_version }}"
es_apt_url: deb https://artifacts.elastic.co/packages/{{ es_major_version }}/apt stable main
3.1 Elasticsearch
The provisioning of elasticsearch on the master & data nodes is delegated to the excellent role elastic.elasticsearch published by elastic.co. As the role allows for multiple instances of ES on a host, we name each instance as '{{cluster_http_port}}_{{cluster_transport_tcp_port}}', which serves as a unique identifier. The ES cluster itself is taken to be defined by this pair of ports, used by all the master/data members of the cluster. If we rerun the playbook with a separate pair, say 9202 & 9302, we will get a second cluster '9202_9302' (in addition to the '9201_9301' we build here on the first run) on the same set of hosts, and all would work fine.
The master node configuration variables are in group_vars/es-master-nodes.json. The key useful thing here is that the 'discovery.zen.ping.unicast.hosts' and 'network.host' settings for elasticsearch are derived from the information in the inventory file, via the 'masterHosts_transport' variable and the host's interface address.
{ "es_java_install" : true, "es_api_port": "{{cluster_http_port}}", "es_instance_name" : "{{cluster_http_port}}_{{cluster_transport_tcp_port}}", "masterHosts_transport" : "{% for host in groups['es-master-nodes'] %} {{hostvars[host]['ansible_'+public_iface]['ipv4']['address'] }}:{{cluster_trans port_tcp_port}}{%endfor %}", "es_config": { "cluster.name": "{{es_instance_name}}", "http.port": "{{cluster_http_port}}", "transport.tcp.port": "{{cluster_transport_tcp_port}}", "node.master": true, "node.data": false, "network.host": ["{{ hostvars[inventory_hostname]['ansible_' + public_iface]['ipv4']['address'] }}","_local_" ], "discovery.zen.ping.unicast.hosts" : "{{ masterHosts_transport.split() }}" } } |
The data node configuration variables in group_vars/es-data-nodes.json are very similar. The only changes are the addition of 'es_data_dirs' and the flipped 'node.master'/'node.data' flags.
{ "es_data_dirs" : "/opt/elasticsearch", "es_java_install" : true, "es_api_port": "{{cluster_http_port}}", "es_instance_name" : "{{cluster_http_port}}_{{cluster_transport_tcp_port}}", "masterHosts_transport" : "{% for host in groups['es-master-nodes'] %} {{hostvars[host]['ansible_'+public_iface]['ipv4']['address'] }}:{{cluster_trans port_tcp_port}}{%endfor %}", "es_config": { "cluster.name": "{{es_instance_name}}", "http.port": "{{cluster_http_port}}", "transport.tcp.port": "{{cluster_transport_tcp_port}}", "node.master": false, "node.data": true, "network.host": ["{{ hostvars[inventory_hostname]['ansible_' + public_iface]['ipv4']['address'] }}","_local_" ], "discovery.zen.ping.unicast.hosts" : "{{ masterHosts_transport.split() }}" } } |
3.2 Logstash
Logstash is provisioned with the role ashokc.logstash. The default variables for this role are overridden with group_vars/logstash-nodes.yml. The 'logstash_user' & 'logstash_group' variables specify the user & group that own this instance of logstash. The 'esMasterHosts' variable derives the elasticsearch urls from the inventory file; it is used for configuring the elasticsearch output section.
es_java_install: True
update_java: False
logstash_version: "{{ elk_version }}"
logstash_user: logstashUser
logstash_group: logstashGroup
logstash_enabled_on_boot: yes
logstash_install_plugins:
  - logstash-input-beats
esMasterHosts: "{% for host in groups['es-master-nodes'] %} http://{{hostvars[host]['ansible_'+public_iface]['ipv4']['address'] }}:{{cluster_http_port}} {% endfor %}"
logstash_es_urls : "{{ esMasterHosts.split() }}"
A simple elasticsearch output config & filebeat input config are enabled with:
output {
  elasticsearch {
    hosts => {{ logstash_es_urls | to_json }}
  }
}
input {
  beats {
    port => {{filebeat_2_logstash_port}}
  }
}
3.3 Kibana
Kibana is provisioned with the role ashokc.kibana. The default variables for this role are again overridden with group_vars/kibana-nodes.yml. Unlike logstash, it is quite common to run multiple Kibana servers on a single host, with each instance targeting a separate ES cluster. This role allows for that and identifies a Kibana instance by the port it runs on (the 'kibana_instance' variable). The 'kibana_user' & 'kibana_group' variables specify the owner/group for the instance.
kibana_version: "{{ elk_version }}"
kibana_user: kibanaUser
kibana_group: kibanaGroup
kibana_enabled_on_boot: yes
kibana_server_host: 0.0.0.0
kibana_elasticsearch_url : http://{{hostvars[groups['es-master-nodes'][0]]['ansible_'+public_iface]['ipv4']['address'] }}:{{cluster_http_port}}
kibana_instance: "{{kibana_server_port}}"
The template file for ‘kibana.yml‘ below picks up the correct elasticsearch cluster url.
server.port: {{ kibana_server_port }}
server.host: {{ kibana_server_host }}
elasticsearch.url: {{ kibana_elasticsearch_url }}
pid.file: {{ kibana_pid_file }}
logging.dest: {{ kibana_log_file }}
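After the Kibana play runs, the server should answer on its configured port from the laptop. The IP below is kibana-1's address from the inventory, and the status endpoint is what Kibana 5.x exposes; adjust if your version differs.

# Kibana status API; reports the overall state and the plugins loaded
curl http://192.168.33.28:5601/api/status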
3.4 Filebeat
Filebeat is provisioned with the role ashokc.filebeat. The default variables are overridden with group_vars/filebeat-nodes.yml. The 'logstashHostsList' variable figures out the logstash connections to use.
filebeat_version: "{{ elk_version }}"
filebeat_enabled_on_boot: yes
filebeat_user: filebeatUser
filebeat_group: filebeatGroup
logstashHostsList: "{% for host in groups['logstash-nodes'] %} {{hostvars[host]['ansible_'+public_iface]['ipv4']['address'] }}:{{filebeat_2_logstash_port}}{% endfor %}"
filebeat_logstash_hosts: "{{ logstashHostsList.split() }}"
The 'output.logstash' section at the end of the template for the sample filebeat.yml configures the output to our logstash host at the right port.
filebeat.prospectors:
  - type: log
    enabled: true
    paths:
      - /tmp/custom.log
    fields:
      log_type: custom
      type: {{ansible_hostname}}
      from: beats
    multiline.pattern: '^\s[+]{2}\scontinuing .*'
    multiline.match: after

output.logstash:
  hosts: {{ filebeat_logstash_hosts | to_nice_yaml }}
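To confirm the beat actually came up on the application VMs, something like the following should work, assuming the role installs filebeat as a systemd service under its default name.

# Check the filebeat service on the first application VM
vagrant ssh filebeat-1 -c 'sudo systemctl status filebeat'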
4. Logs
The last step is to run an application on the filebeat nodes and watch the logs flow into Kibana. Our application is simply a Perl script that writes the log file /tmp/custom.log. We log in to each of the filebeat hosts and run the following Perl script.
#!/usr/bin/perl -w
use strict ;
no warnings 'once';

my @codes = qw (fatal error warning info debug trace) ;
open(my $fh, ">>", "/tmp/custom.log") ;
$fh->autoflush(1);
my $now = time();
for my $i (1 .. 100) {
  my $message0 = "Type: CustomLog: This is a generic message # $i for testing ELK" ;
  my $nDays = int(rand(5)) ;
  my $nHrs = int(rand(24)) ;
  my $nMins = int(rand(60)) ;
  my $nSecs = int(rand(60)) ;
  my $timeValue = $now - $nDays * 86400 - $nHrs * 3600 - $nMins * 60 - $nSecs ;
  my $now1 = localtime($timeValue) ;
  my $nMulti = int(rand(10)) ;
  my $message = "$now1 $nDays:$nHrs:$nMins:$nSecs $nMulti:$codes[int(rand($#codes))] $message0" ;
  if ($nMulti > 0) {
    for my $line (1 .. $nMulti) {
      $message = $message . "\n ++ continuing the previous line for this log error..."
    }
  }
  print $fh "$message\n" ;
}
close $fh ;
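With the script run on both filebeat hosts, documents should start showing up in elasticsearch. A quick way to check from the laptop, assuming the elasticsearch output plugin's default 'logstash-*' index naming:

# Indices created by logstash, and a document count across them
curl 'http://192.168.33.25:9201/_cat/indices/logstash-*?v'
curl 'http://192.168.33.25:9201/logstash-*/_count?pretty'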
The corresponding sample logstash config file for processing this log would be placed at roles/ashokc.logstash/files/custom-filter.conf
filter {
  if [fields][log_type] == "custom" {
    grok {
      match => [ "message", "(?<matched-timestamp>\w{3}\s+\w{3}\s+\d{1,2}\s+\d{1,2}:\d{1,2}:\d{1,2}\s+\d{4})\s+(?<nDays>\d{1,3}):(?<nHrs>\d{1,2}):(?<nMins>\d{1,2}):(?<nSecs>\d{1,2})\s+(?<nLines>\d{1,2}):(?<code>\w+) Type: (?<given-type>\w+):[^#]+# (?<messageId>\d+)\s+%{GREEDYDATA}" ]
      add_tag => ["grokked"]
      add_field => { "foo_%{nDays}" => "Hello world, from %{nHrs}" }
    }
    mutate {
      gsub => ["message", "ELK", "BULK"]
    }
    date {
      match => [ "timestamp" , "EEE MMM d H:m:s Y", "EEE MMM d H:m:s Y" ]
      add_tag => ["dated"]
    }
  }
}
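If you drop in a new or modified filter file later, it is worth validating the pipeline config on the logstash VM before restarting the service. A sketch, assuming the standard deb install locations for logstash 5.x:

# Parse and validate the pipeline configuration without starting logstash
vagrant ssh logstash-1 -c 'sudo /usr/share/logstash/bin/logstash --config.test_and_exit --path.settings /etc/logstash'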
Conclusion
By placing appropriate filter files for logstash at roles/ashokc.logstash/files and a prospector config file for filebeat at roles/ashokc.filebeat/templates/filebeat.yml.j2, one can use this ELK stack to analyze application logs. A variety of extensions are possible, for example enabling X-Pack login/security, supporting other distributions & versions in the 'ashokc' roles, automated testing, etc. But then there is always more to be done, isn't there?