In the previous post we built a virtual ELK cluster with Vagrant and Ansible, where the individual VMs comprising the cluster were carved out of a single host. While that allowed for self-contained development and testing of all the necessary artifacts, it is not a real-world scenario. The components of the ELK stack usually live on separate, possibly dedicated hosts. Fortunately this does not put us back at square one in our efforts to stand up an ELK cluster in such cases. Having written Ansible roles for each of the software components earlier, we already have an idempotent and reproducible means of delivering software to hosts. What changes, as we swap the provisioner from VirtualBox to something else, is the provisioning of the hosts and the targeting of sub-groups among them for the different roles. Here we choose AWS as the host provisioner and devote the bulk of this post to the mechanics of building the ELK cluster on AWS with Ansible. At the end we touch upon the small modifications needed to our earlier playbook for delivering software to these hosts.
=> Download the code from GitHub to play along with the build-out.
1. The cluster
We prepare a yml file, cluster.yml, with the instance type and count for each group of hosts in the ELK cluster, along with tags that allow us to pull out a specific group of hosts later for software delivery.
```yaml
awsHosts:
  master-nodes:
    instance_type: t2.small    # 1 cpu & 2gb ram
    exact_count: 1             # Makes sure that there will only be one Master node
    instance_tags:
      Name: esMaster
    count_tag:
      Name: esMaster
  data-nodes:
    instance_type: t2.medium   # 2 cpus & 4gb ram
    exact_count: 2             # Makes sure that there will be exactly 2 hosts with the tag 'esData'
    instance_tags:
      Name: esData             # Allows us to refer to all data nodes via "tag_Name_esData" later on
    count_tag:
      Name: esData
  logstash-nodes:
    instance_type: t2.small
    exact_count: 1
    instance_tags:
      Name: logstash
    count_tag:
      Name: logstash
  kibana-nodes:
    instance_type: t2.micro    # 1 cpu & 1gb ram
    exact_count: 1
    instance_tags:
      Name: kibana
    count_tag:
      Name: kibana
  filebeat-nodes:
    instance_type: t2.micro
    exact_count: 2
    instance_tags:
      Name: filebeat
    count_tag:
      Name: filebeat
```
2. Provision Hardware
There are a number of ways to have hosts ("EC2 instances") created on AWS for our needs: the AWS console, the AWS CLI, a variety of SDKs, Vagrant, Ansible, etc. Here we opt for Ansible, as that will also be our means of delivering software to these hosts later via roles. So we will have two playbooks: provisionHardware.yml for building a cluster of ssh'able hosts as per the specs in cluster.yml, and provisionSoftware.yml for delivering the ELK software to those hosts. The provisionSoftware.yml playbook is essentially the same as the one we used earlier with Vagrant, save for some minor changes to accommodate the AWS vs. Vagrant differences in targeting hosts and ssh'ing to them.
Ansible has a series of excellent ec2 modules that allow us to orchestrate the cluster set-up on AWS from scratch. These tasks run locally, communicate with AWS via its API, and spin up the hosts as per the specs. Having an AWS account and credentials in hand for API access is a prerequisite, of course. Here is the sequence of steps that we convert to Ansible tasks in the playbook.
- Set up a (non-default) VPC with an Internet Gateway, and a public subnet with routing.
- Set up a security group that allows all communication within the group, and allows access to ports 22 (for ssh) & 5601 (for Kibana) from outside (the Ansible host)
- Generate a key-pair for the SSH access needed to run the provisionSoftware.yml playbook.
- Provision the cluster hosts listed in cluster.yml
```yaml
# ELK setup on AWS
- hosts: localhost
  connection: local
  gather_facts: False

  vars:                           # variables
    - region: us-west-2
    - image: ami-b9ff39d9         # ubuntu xenial
    - cidr: 172.17.0.0/16         # allows for over 65000 hosts
    - local_ip: zzz.zzz.zzz.zzz   # Replace with the IP address of the ansible host

  vars_files:
    - aws-secrets.yml             # api credentials best protected by ansible-vault
    - cluster.yml                 # details of the cluster hosts, counts, tags

  tasks:
    - name: Set up VPC for ELK
      ec2_vpc_net:
        name: ELK_VPC
        cidr_block: "{{cidr}}"
        region: "{{region}}"
        tags:
          Name: ELK non-default VPC
      register: elkVpc

    - name: Set up an Internet Gateway for this non-Default VPC
      ec2_vpc_igw:
        vpc_id: "{{elkVpc.vpc.id}}"
        region: "{{region}}"
        state: present
        tags:
          Name: ELK internet gateway
      register: elkIgw

    - name: Create a subnet for ELK within this VPC
      ec2_vpc_subnet:
        state: present
        vpc_id: "{{elkVpc.vpc.id}}"
        cidr: "{{cidr}}"
        map_public: yes
        tags:
          Name: ELK public subnet
      register: elkSubnet

    - name: Set up routing for the subnet
      ec2_vpc_route_table:
        vpc_id: "{{elkVpc.vpc.id}}"
        region: "{{region}}"
        tags:
          Name: ELK route table
        subnets:
          - "{{ elkSubnet.subnet.id }}"
        routes:
          - dest: 0.0.0.0/0
            gateway_id: "{{ elkIgw.gateway_id }}"
      register: public_route_table

    - name: Set up Security Group for ELK
      ec2_group:
        name: ELK_Security_Group
        description: A security group to be used with ELK stack
        vpc_id: "{{elkVpc.vpc.id}}"
        region: "{{region}}"
        aws_access_key: "{{aws_access_key}}"   # from aws-secrets.yml
        aws_secret_key: "{{aws_secret_key}}"   # from aws-secrets.yml
        tags:
          app_group: ELK
        rules:
          - proto: tcp
            ports:
              - 22
              - 5601
            cidr_ip: "{{local_ip}}/32"         # ports 22 and 5601 are allowed ONLY from this ansible host
          - proto: all
            group_name: ELK_Security_Group     # allow all communication across the cluster hosts
        rules_egress:
          - proto: all
            cidr_ip: 0.0.0.0/0
      register: elkSg

    - name: Set up an ELK key
      ec2_key:                                 # The key-pair used for ssh access to these hosts
        name: "{{key_pair}}"
        region: "{{region}}"
      register: elkKey

    - name: Save the private key
      copy:
        content: "{{ elkKey.key.private_key }}"
        dest: "~/.ssh/{{key_pair}}.pem"        # save the private key
        mode: 0600
      when: elkKey.changed

    - name: Provision a set of instances
      ec2:                                     # Iterate over the dictionary read from the "cluster.yml" file
        aws_access_key: "{{aws_access_key}}"
        aws_secret_key: "{{aws_secret_key}}"
        key_name: "{{key_pair}}"
        vpc_subnet_id: "{{elkSubnet.subnet.id}}"
        group_id: "{{elkSg.group_id}}"
        region: "{{region}}"
        instance_type: "{{item.value.instance_type}}"
        image: "{{image}}"
        wait: true
        exact_count: "{{item.value.exact_count}}"
        instance_tags: "{{item.value.instance_tags}}"
        count_tag: "{{item.value.count_tag}}"
      with_dict: "{{awsHosts}}"
```
The key-pair name and the location of the saved private key are referenced in group_vars/all.yml:

```yaml
... [ OTHER STUFF ] ...

key_pair: ELK_KEY_PAIR
ansible_ssh_private_key_file: ~/.ssh/{{key_pair}}.pem
```
The API credentials themselves go into aws-secrets.yml:

```yaml
---
aws_access_key: YOUR AWS_ACCESS_KEY
aws_secret_key: YOUR AWS_SECRET_KEY
```
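The run scripts below pass --ask-vault-pass, which assumes aws-secrets.yml has been encrypted with ansible-vault. A minimal sketch of doing that:

```bash
# Encrypt the credentials file; you will be prompted for a vault password,
# the same one that --ask-vault-pass asks for when the playbooks run
ansible-vault encrypt aws-secrets.yml

# Edit the encrypted file in place later, if needed
ansible-vault edit aws-secrets.yml
```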
The hardware is then provisioned by running:

```bash
#!/bin/bash
ansible-playbook -v -i 'localhost,' provisionHardware.yml --ask-vault-pass
```
A few notes on the playbook:

- The local_ip variable should be the IP address of your Ansible host. That host gets SSH access to the cluster hosts & access to Kibana.
- The credentials for API access are placed in the file aws-secrets.yml and pulled in via vars_files.
- The security group grants the Ansible host access to ports 22 (ssh) & 5601 (kibana) on all the hosts. There is no app at '5601' on the non-Kibana nodes of course, but perhaps there is no need to be picky about this 🙂
- The generated key-pair is saved to the same location as what is referenced in group_vars/all.yml, as the provisionSoftware.yml playbook needs it.
- The final task reads the cluster.yml dictionary and iterates over it to provision all the hosts.
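Once the play completes, one quick way to confirm the instance counts and note the public IPs is to query AWS by tag. A sketch using the AWS CLI, which is an extra tool not otherwise used in this post, so treat it as an optional check:

```bash
# List the running instances carrying a given tag, e.g. the ES data nodes.
# The region matches the playbook's us-west-2.
aws ec2 describe-instances --region us-west-2 \
    --filters "Name=tag:Name,Values=esData" "Name=instance-state-name,Values=running" \
    --query "Reservations[].Instances[].[InstanceId,InstanceType,PublicIpAddress]" \
    --output table
```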
Dynamic inventory
When working with a cloud provider such as AWS, hosts will come and go over time. If an instance has been terminated, running the provisionHardware.yml playbook again will re-provision it, but its IP address may be different. So it is best to query AWS for the current inventory details at the time the software is provisioned. There are scripts readily available that do this for us. Here are the steps.
- Install the boto module for your Python, e.g. sudo pip install boto
- Download ec2.py and ec2.ini, and make the script executable: chmod ugo+x ec2.py
- Wrap the credentials and the call to ec2.py in a small shell script, getAwsInventory.sh, that Ansible can use as its inventory source:
```bash
#!/bin/bash

export AWS_ACCESS_KEY_ID="YOUR_AWS_ACCESS_KEY_ID"
export AWS_SECRET_ACCESS_KEY="YOUR_AWS_SECRET_ACCESS_KEY"
export EC2_INI_PATH="$PWD/ec2.ini"

$PWD/ec2.py
```
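To sanity-check that the dynamic inventory is returning the groups we expect, the script's output can be inspected directly, or an ad-hoc ping run against one of the tag groups. A sketch, assuming the key-pair and security group set up earlier:

```bash
# Dump the inventory that the script builds; groups such as tag_Name_esMaster
# and tag_Name_esData should show up with the current IP addresses
./getAwsInventory.sh | python -m json.tool | less

# Ad-hoc ping of the ES data nodes through the dynamic inventory
ansible -i ./getAwsInventory.sh tag_Name_esData -m ping -u ubuntu \
    --private-key ~/.ssh/ELK_KEY_PAIR.pem
```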
3. Provision Software
With (a) the infrastructure in place, (b) a way to get a list of the available hosts, and (c) the Ansible roles already developed and tested, all we need to do now is run our playbook provisionSoftware.yml against this infrastructure. Before that, however, there is a bit of housekeeping to account for the differences between Vagrant/VirtualBox hosts & AWS hosts. The main difference is the need for an extra 'pre_tasks' section that runs before the roles are applied.
```yaml
- hosts: security_group_ELK_Security_Group   # a group of all hosts in the cluster
  name: Run a pre_task on all hosts to install python
  user: ubuntu
  gather_facts: False                        # Cannot gather facts until pre_tasks are done
  pre_tasks:
    - name: apt update on ubuntu
      raw: sudo apt-get -y update
      ignore_errors: true
    - name: install python2 on ubuntu
      raw: sudo apt-get -y install python-simplejson
      ignore_errors: true
      notify: Gather facts now               # can gather facts now via a handler
  handlers:
    - name: Gather facts now
      setup:

- hosts: tag_Name_esMaster                   # group of all ES master nodes
  become: true
  roles:
    - { role: elastic.elasticsearch, cluster_http_port: 9201, cluster_transport_tcp_port: 9301 }

- hosts: tag_Name_esData                     # group of all ES data nodes
  become: true
  roles:
    - { role: elastic.elasticsearch, cluster_http_port: 9201, cluster_transport_tcp_port: 9301 }

- hosts: tag_Name_kibana                     # group of all kibana nodes
  become: true
  roles:
    - { role: ashokc.kibana, kibana_server_port: 5601, cluster_http_port: 9201 }

- hosts: tag_Name_logstash                   # group of all logstash nodes
  become: true
  roles:
    - { role: ashokc.logstash, cluster_http_port: 9201, filebeat_2_logstash_port: 5044 }

- hosts: tag_Name_filebeat                   # group of all (application) nodes with filebeat
  become: true
  roles:
    - { role: ashokc.filebeat, filebeat_2_logstash_port: 5044 }
```
- For Ansible modules to run on a target host, that host needs to have the right Python packages. With Vagrant we had chosen a box that was already blessed with those, but the chosen AWS AMI may not have them. So, before any Ansible module can run on an AWS host to apply the roles, we do a 'raw' install of these Python packages on those hosts. This is the job of the 'pre_tasks' play at the top of provisionSoftware.yml, which runs before any role is applied.
- The way we get 'groups' of hosts is by using the instance_tags we specified in cluster.yml. The filenames under group_vars should be changed accordingly as well:
```
group_vars/
├── all.yml
├── tag_Name_esData.json
├── tag_Name_esMaster.json
├── tag_Name_filebeat.yml
├── tag_Name_kibana.yml
└── tag_Name_logstash.yml
```
As for the contents of the above files, a couple of changes are in order with respect to getting the IP address of a host. For example, in the files tag_Name_esData.json & tag_Name_esMaster.json the Vagrant version derived the addresses from the host's interface facts:

```
"masterHosts_transport" : "{% for host in groups['es-master-nodes'] %} {{hostvars[host]['ansible_'+public_iface]['ipv4']['address'] }}:{{cluster_transport_tcp_port}}{% endfor %}",

"network.host": ["{{ hostvars[inventory_hostname]['ansible_' + public_iface]['ipv4']['address'] }}", "_local_" ],
```

whereas on AWS the dynamic inventory supplies ec2_private_ip_address, and the groups are keyed by tag:

```
"masterHosts_transport" : "{% for host in groups['tag_Name_esMaster'] %} {{hostvars[host]['ec2_private_ip_address']}}:{{cluster_transport_tcp_port}}{% endfor %}",

"network.host": ["{{ hostvars[inventory_hostname]['ec2_private_ip_address'] }}", "_local_" ],
```
With similar changes made to the other yml files, we are finally ready to provision the software to the cluster by running:

```bash
#!/bin/bash
ansible-playbook -u ubuntu -v -i ./getAwsInventory.sh provisionSoftware.yml --ask-vault-pass
```
4. Testing
With the ELK cluster up and running, we can generate some logs on the filebeat hosts and watch them flow into Kibana (a quick Elasticsearch-side sanity check is also sketched after these steps). For this we can simply do:
- Find the IP address of a filebeat host and copy genLogs.pl to that host. Log into that host and run the Perl script. Replace “xxx.xxx.xxx.xxx” below with the actual IP.
```bash
scp -i ~/.ssh/ELK_KEY_PAIR.pem ./genLogs.pl ubuntu@xxx.xxx.xxx.xxx:/home/ubuntu/genLogs.pl

ssh -i ~/.ssh/ELK_KEY_PAIR.pem xxx.xxx.xxx.xxx -l ubuntu

./genLogs.pl
```
- Get the IP address of the Kibana host and go to the following URL. Replace yyy.yyy.yyy.yyy below with the actual Kibana IP.
```
http://yyy.yyy.yyy.yyy:5601
```
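As an additional sanity check that the generated logs actually made it into Elasticsearch, one can query the master node directly over ssh. A minimal sketch, assuming the cluster_http_port of 9201 set in the playbook and the default filebeat-* index naming, with mmm.mmm.mmm.mmm standing in for the esMaster host's IP:

```bash
# Cluster health plus the filebeat indices, queried on the ES master node.
# 9201 is the cluster_http_port from provisionSoftware.yml; the filebeat-*
# index pattern is the Filebeat default and is an assumption here.
ssh -i ~/.ssh/ELK_KEY_PAIR.pem ubuntu@mmm.mmm.mmm.mmm \
    "curl -s 'localhost:9201/_cluster/health?pretty' ; curl -s 'localhost:9201/_cat/indices/filebeat-*?v'"
```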
5. Summary
Our objective was to set up an ELK cluster on AWS. We split that into two Ansible playbooks –
- one for provisioning hardware as per specs, and
- the other for provisioning software via previously developed & tested roles
While there may be a number of other ways to skin this cat, it looks like we have achieved our objective… Agree or disagree?