Wednesday, August 29, 2018

Managing MySQL Automated Failover on EC2 Instances with Orchestrator

Orchestrator is a free and open source MySQL high availability and replication management tool. Its major functions include MySQL/MariaDB master failover in seconds (10-20 secs, given our requirements) and easy management of the replication topology, which can be changed by drag and drop or via the Orchestrator CLI.

Orchestrator has the following features:-

1. Discovery:- It actively crawls through the topologies and maps them. It reads the basic replication status and configuration, and provides a slick visualisation of the topologies, including replication problems.

2. Refactoring:- It understands replication rules and knows about binlog file:position, GTID, Pseudo GTID and Binlog Servers. Refactoring a replication topology can be a matter of dragging and dropping a replica under another master.

3. Recovery:- It can detect master and intermediate master failures. It can be configured to perform automated recovery or to let the user choose manual recovery. Master failover is supported with pre/post failure hooks.

MySQL Benchmarking with sysbench

Benchmarking helps to establish the performance parameters of a MySQL database on different instance sizes in the AWS Cloud.

TOOL:               sysbench
MySQL Version:      5.6.39-log MySQL Community Server (GPL)
EC2 Instance type:  r4.xlarge (30 GB RAM, 4 CPU)
DB Size:            25 GB (10 tables)

MySQL 5.6.39-log MySQL Community Server (GPL)
sysbench /usr/share/sysbench/oltp_read_write.lua --threads=32 --events=0 --time=120 --mysql-user=root --mysql-password=XXXXXXXXX --mysql-port=3306 --tables=10 --delete_inserts=10 \
--index_updates=10 --non_index_updates=0 --table-size=10000000 --db-ps-mode=disable --report-interval=5 --mysql-host= run
sysbench 1.0.15 (using bundled LuaJIT 2.1.0-beta2)
Running the test with following options:
Number of threads: 32
Report intermediate results every 5 second(s)
Initializing random number generator from current time
Initializing worker threads...
Threads started!
[ 5s ] thds: 32 tps: 581.73 qps: 26855.28 (r/w/o: 8201.90/14784.53/3870.84) lat (ms,95%): 99.33 err/s: 0.00 reconn/s: 0.00
[ 10s ] thds: 32 tps: 586.70 qps: 26985.76 (r/w/o: 8222.39/14890.24/3873.13) lat (ms,95%): 97.55 err/s: 0.00 reconn/s: 0.00
[ 120s ] thds: 32 tps: 570.20 qps: 26232.19 (r/w/o: 7987.66/14500.71/3745.83) lat (ms,95%): 77.19 err/s: 0.00 reconn/s: 0.00

SQL statistics:
queries performed:
read: 969179
write: 1756892
other: 458342
total: 3184413
transactions: 69227 (576.70 per sec.)
queries: 3184414 (26528.16 per sec.)
ignored errors: 1 (0.01 per sec.)
reconnects: 0 (0.00 per sec.)
General statistics:
total time: 120.0371s
total number of events: 69226
Latency (ms):
min: 10.15
avg: 55.48
max: 232.76
95th percentile: 82.96
sum: 3840451.69
Threads fairness:
events (avg/stddev): 2163.3125/62.64
execution time (avg/stddev): 120.0141/0.01
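
The interval lines in the run above can be summarised programmatically when comparing instance sizes. The following is a minimal sketch, assuming the sysbench 1.0.x report format shown in this post; the names `parse_intervals` and `summarize` are illustrative and are not part of sysbench.

```python
import re

# Pattern for sysbench 1.0.x interval report lines as they appear above, e.g.
# [ 5s ] thds: 32 tps: 581.73 qps: 26855.28 (...) lat (ms,95%): 99.33 ...
INTERVAL_RE = re.compile(
    r"\[\s*(?P<sec>\d+)s\s*\]\s+thds:\s+(?P<thds>\d+)"
    r"\s+tps:\s+(?P<tps>[\d.]+)\s+qps:\s+(?P<qps>[\d.]+)"
    r".*?lat \(ms,95%\):\s+(?P<p95>[\d.]+)"
)


def parse_intervals(report):
    """Return one dict per interval line found in the sysbench output."""
    rows = []
    for line in report.splitlines():
        match = INTERVAL_RE.search(line)
        if match:
            rows.append({
                "sec": int(match.group("sec")),
                "threads": int(match.group("thds")),
                "tps": float(match.group("tps")),
                "qps": float(match.group("qps")),
                "p95_ms": float(match.group("p95")),
            })
    return rows


def summarize(rows):
    """Average throughput and worst 95th-percentile latency over the intervals."""
    if not rows:
        raise ValueError("no interval lines found")
    return {
        "intervals": len(rows),
        "avg_tps": sum(r["tps"] for r in rows) / len(rows),
        "avg_qps": sum(r["qps"] for r in rows) / len(rows),
        "max_p95_ms": max(r["p95_ms"] for r in rows),
    }


# The three interval lines quoted in this post.
SAMPLE = """\
[ 5s ] thds: 32 tps: 581.73 qps: 26855.28 (r/w/o: 8201.90/14784.53/3870.84) lat (ms,95%): 99.33 err/s: 0.00 reconn/s: 0.00
[ 10s ] thds: 32 tps: 586.70 qps: 26985.76 (r/w/o: 8222.39/14890.24/3873.13) lat (ms,95%): 97.55 err/s: 0.00 reconn/s: 0.00
[ 120s ] thds: 32 tps: 570.20 qps: 26232.19 (r/w/o: 7987.66/14500.71/3745.83) lat (ms,95%): 77.19 err/s: 0.00 reconn/s: 0.00
"""

if __name__ == "__main__":
    print(summarize(parse_intervals(SAMPLE)))
```

Feeding the full output of each run through the same functions gives comparable numbers per instance type.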

Upgrading AWS Instances to 5th Generation instances

Note: The latest AWS CLI version is required.
Once you have installed and configured the AWS CLI and created an AMI, follow the steps below:

1) SSH into your Ubuntu instance.

2) Upgrade the kernel on your instance by running the command:
    sudo apt-get update && sudo apt-get install linux-image-generic

3) Stop the instance from the console or AWS CLI.

4) Using the AWS CLI, modify the instance attributes to enable ENA by running the command below:
    aws ec2 modify-instance-attribute --instance-id <instance-id> --ena-support --region ap-southeast-1

5) Using the AWS CLI, modify the instance attributes to change to the desired instance type (for example: m5.large)
    aws ec2 modify-instance-attribute --instance-id <instance-id> --instance-type m5.large --region ap-southeast-1

6) Start your instance from the console or the AWS CLI.

Once the instance boots, confirm that the ENA module is in use on the network interface by running the command below:
    ethtool -i eth0
New AMI (to launch new instances): v7 with ENA support
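
Steps 3 to 6 can be scripted once the kernel has been upgraded. The following is a rough sketch that only builds and prints the AWS CLI invocations unless `dry_run=False` is passed; the function names and the instance ID in the usage line are hypothetical, and the `wait` commands are an addition so that the attribute changes are only attempted on a stopped instance.

```python
import json
import subprocess


def build_upgrade_commands(instance_id, instance_type, region):
    """Return the AWS CLI invocations for steps 3-6 as argv lists."""
    one = ["--instance-id", instance_id, "--region", region]
    many = ["--instance-ids", instance_id, "--region", region]
    return [
        ["aws", "ec2", "stop-instances"] + many,
        ["aws", "ec2", "wait", "instance-stopped"] + many,
        ["aws", "ec2", "modify-instance-attribute", "--ena-support"] + one,
        # The instance type is passed in the JSON structure form {"Value": ...}.
        ["aws", "ec2", "modify-instance-attribute", "--instance-type",
         json.dumps({"Value": instance_type})] + one,
        ["aws", "ec2", "start-instances"] + many,
        ["aws", "ec2", "wait", "instance-running"] + many,
    ]


def run_upgrade(instance_id, instance_type, region, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in build_upgrade_commands(instance_id, instance_type, region):
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)
```

For example, `run_upgrade("i-0123456789abcdef0", "m5.large", "ap-southeast-1")` prints the six commands without touching the instance.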

Sunday, August 26, 2018

Kubernetes Installation Part-1 (kubeadm, kubectl, kubelet)

Kubernetes Installation Requirements

DynamoDB table getting throttled

Usually, when a table is throttled while its consumed capacity is well below the provisioned capacity, it indicates a hot partition or a “burst” of read or write activity. A hot partition is a single partition that is accessed more than the other partitions and therefore consumes more capacity. It can be caused by data that is unevenly distributed across partitions because of a hot partition key. A “burst” of read or write activity occurs when a workload has short periods of high usage, for example a batch data upload.
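
A small simulation makes the hot partition case concrete. This is only an illustration under simplifying assumptions: provisioned capacity is split evenly across partitions, burst and adaptive capacity are ignored, and MD5 stands in for DynamoDB's internal hash function. The workload, key names and partition count are hypothetical.

```python
import hashlib
from collections import Counter


def partition_of(key, partitions):
    """Map a partition key to a partition index with a stable hash.

    DynamoDB's real hash function is internal; MD5 is only a stand-in.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % partitions


def throttled_partitions(writes_per_key, provisioned_wcu, partitions):
    """Return {partition: load} for partitions whose load exceeds an even share."""
    share = provisioned_wcu / partitions
    load = Counter()
    for key, writes in writes_per_key.items():
        load[partition_of(key, partitions)] += writes
    return {p: n for p, n in load.items() if n > share}


# Hypothetical one-second workload: one hot key plus 100 lightly used keys.
workload = {"user-%d" % i: 2 for i in range(100)}
workload["hot-customer"] = 400

total = sum(workload.values())  # 600 writes, below the 1000 provisioned
hot = throttled_partitions(workload, provisioned_wcu=1000, partitions=4)

if __name__ == "__main__":
    print("total writes:", total, "throttled partitions:", hot)
```

The table as a whole consumes 600 of 1000 write units, yet the partition holding `hot-customer` receives at least 400 writes against an even share of 250, so it is the one that throttles.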

Tagging EBS Volumes with the Same Tags as the EC2 Instances in Autoscaling

Propagate the tags from an instance in an autoscaling group to the attached EBS volumes.

Although an autoscaling group allows you to apply tags to its instances, the tags do not propagate to the instance volumes. So some scripting in the user data section is needed during instance launch to properly tag the volumes of instances created by the autoscaling group.

You can use the below script to apply the tags to the EBS volumes in an autoscaling group; it should be placed in the userdata field of the autoscaling group's launch configuration. You also need to attach an IAM role to the instance with permissions such as describe-instances and create-tags.
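
The script itself is not reproduced in this excerpt. As a stand-in, here is a minimal sketch of the same idea using boto3, which is an assumption and not necessarily what the original script used; the helper names are illustrative. Tags in the reserved `aws:` namespace (for example `aws:autoscaling:groupName`) are skipped because users cannot create them.

```python
def copyable_tags(instance_tags):
    """Drop tags in the reserved "aws:" namespace, which users cannot create."""
    return [t for t in instance_tags if not t["Key"].startswith("aws:")]


def attached_volume_ids(instance):
    """Collect the EBS volume IDs from an instance description."""
    return [
        mapping["Ebs"]["VolumeId"]
        for mapping in instance.get("BlockDeviceMappings", [])
        if "Ebs" in mapping
    ]


def tag_volumes(instance_id, region):
    """Copy the instance's tags onto every EBS volume attached to it."""
    import boto3  # imported lazily so the helpers above work without boto3

    ec2 = boto3.client("ec2", region_name=region)
    reservations = ec2.describe_instances(InstanceIds=[instance_id])["Reservations"]
    instance = reservations[0]["Instances"][0]
    tags = copyable_tags(instance.get("Tags", []))
    volumes = attached_volume_ids(instance)
    if tags and volumes:
        ec2.create_tags(Resources=volumes, Tags=tags)
    return volumes, tags
```

When run from user data, the instance ID would typically be read from the instance metadata service before calling `tag_volumes`.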

SSL certificate uploaded in AWS ACM but not available while creating ELB/ALB

You may have created an ACM SSL certificate and found that it is not available in the drop-down list to associate with your load balancer.

The reason you are unable to attach it to your ELB is that the certificate has a key length of RSA-4096.

Although it is possible to import an SSL certificate of 4096 bits into ACM, the certificate types currently supported by ELB are RSA_1024 and RSA_2048. If you are using any other type of certificate, it will unfortunately not be eligible for attachment to your ELB, which means that you won't be able to select it during the ELB creation process.

ACM supports 4096-bit RSA (RSA_4096) certificates, but integrated services (such as ELBs) allow only the algorithms and key sizes they support to be associated with their resources.

Note that ACM certificates (including certificates imported into ACM) are regional resources. Therefore, you must import your certificate into the same region as your ELB in order to associate it with the ELB.
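
Both conditions can be checked before opening the console. The sketch below assumes a certificate description shaped like the `Certificate` structure returned by ACM's DescribeCertificate call (with `CertificateArn` and `KeyAlgorithm` fields). The set of accepted algorithms is taken from this post and reflects the situation at the time of writing; the function names and sample ARNs are hypothetical.

```python
# Key algorithms this post lists as attachable to an ELB at the time of writing.
ELB_KEY_ALGORITHMS = {"RSA_1024", "RSA_2048"}


def arn_region(arn):
    """Extract the region field from arn:partition:service:region:account:resource."""
    return arn.split(":")[3]


def elb_attach_problems(certificate, elb_region):
    """Return the reasons a certificate would not be selectable for the ELB."""
    problems = []
    algorithm = certificate["KeyAlgorithm"]
    if algorithm not in ELB_KEY_ALGORITHMS:
        problems.append("unsupported key algorithm: " + algorithm)
    region = arn_region(certificate["CertificateArn"])
    if region != elb_region:
        problems.append("certificate is in %s, ELB is in %s" % (region, elb_region))
    return problems


if __name__ == "__main__":
    cert = {
        "CertificateArn": "arn:aws:acm:us-east-1:123456789012:certificate/example",
        "KeyAlgorithm": "RSA_4096",
    }
    for problem in elb_attach_problems(cert, "ap-southeast-1"):
        print(problem)
```

An empty list means neither of the two causes described above applies.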

Runbook to resolve some of the most common issues in Linux

Check the inode usage of the particular FS with
df -ih

Check for recently created files after entering the FS which is showing high inode usage
find $1 -type f -print0 | xargs -0 stat --format '%Y :%y %n' | sort -nr | cut -d: -f2- | head

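Check which directory has the most files (count_files is not a standard command; it is assumed to be a user-defined helper that prints the number of files in a directory)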
find . -type d -print0 | xargs -0 -n1 count_files | sort -n
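
Since `count_files` is not defined in this excerpt, the same per-directory count can be obtained without it. The following is a small sketch using only the Python standard library; the function name is illustrative, and it counts every entry that `os.walk` reports as a file directly inside each directory.

```python
import os


def count_files_per_dir(root):
    """Return (count, directory) pairs sorted ascending by count, like sort -n."""
    counts = []
    for dirpath, _dirnames, filenames in os.walk(root):
        counts.append((len(filenames), dirpath))
    return sorted(counts)
```

Calling `count_files_per_dir("/var")[-10:]` would list the ten directories under `/var` holding the most files, with the largest last.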

Creating an SSH config file so as not to pass the key or username for multiple servers

If you are running your servers in different VPCs, each with its own CIDR range of IP addresses, key and username, it is difficult to remember all the keys and usernames while connecting to the servers.
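
One way to keep this manageable is to generate one `Host` block per VPC for `~/.ssh/config`. The sketch below is illustrative only: the address ranges, usernames and key paths are hypothetical. Note that `Host` patterns in ssh_config are wildcards rather than CIDR blocks, so a range such as 10.10.0.0/16 is written as `10.10.*`.

```python
def render_ssh_config(entries):
    """Render Host blocks for ~/.ssh/config from (pattern, user, key_path) tuples."""
    blocks = []
    for pattern, user, key_path in entries:
        blocks.append(
            "Host {0}\n    User {1}\n    IdentityFile {2}\n".format(
                pattern, user, key_path
            )
        )
    return "\n".join(blocks)


# Hypothetical VPC ranges, usernames and key files.
ENTRIES = [
    ("10.10.*", "ubuntu", "~/.ssh/prod-vpc.pem"),
    ("10.20.*", "ec2-user", "~/.ssh/staging-vpc.pem"),
]

if __name__ == "__main__":
    print(render_ssh_config(ENTRIES))
```

With the rendered text saved in `~/.ssh/config`, a plain `ssh 10.10.3.7` would pick up the matching username and key.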

Concepts for Rolling Deployment strategy for applications in production environment

Rolling deployment is a strategy wherein servers are taken out in batches for deployment instead of deploying onto all servers at once. This helps in lowering HTTP 5xx errors and other such issues when the deployment needs to be performed while traffic on the application is high.
Which Applications Qualify for this Deployment:
  • Applications running behind an Elastic Load Balancer.
  • Applications which have been moved to the build-and-deploy model for NodeJS applications (wherein we build the deployment package once instead of doing on-the-fly compilation/installation)
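
The batching logic described above can be sketched as follows. This is a minimal outline, not the deployment tooling used here: the four callables stand for load balancer and deployment operations that the post does not specify, and aborting on the first unhealthy batch is an assumed policy.

```python
def batches(servers, batch_size):
    """Split the server list into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [servers[i:i + batch_size] for i in range(0, len(servers), batch_size)]


def rolling_deploy(servers, batch_size, deregister, deploy, is_healthy, register):
    """Deploy batch by batch; stop at the first batch that fails its health check."""
    done = []
    for batch in batches(servers, batch_size):
        for server in batch:
            deregister(server)  # take the server out of the load balancer
        for server in batch:
            deploy(server)      # install the pre-built deployment package
        if not all(is_healthy(server) for server in batch):
            raise RuntimeError("unhealthy batch, aborting rollout: %r" % (batch,))
        for server in batch:
            register(server)    # put the server back into rotation
        done.extend(batch)
    return done
```

Because only one batch is out of the load balancer at any time, the remaining servers keep serving traffic during the deployment.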

Saturday, August 25, 2018

Switching Roles between Different AWS Accounts

If your infrastructure runs in different AWS accounts, then you don't need to log out and log in to each AWS account individually or use a different browser. You can simply switch roles between the different accounts.

1. Log in to your primary account. This is the entry-level account through which you can switch to any of your other AWS accounts.

2. Click on your username at the top of the screen and choose Switch Role, then choose Switch Role again.

It will ask you for the following information

Account Number:- (Number of the account you want to switch to from the existing account)
Role:- (Role name which has been assigned to your user in IAM)
Display Name:- (Easy to recognise name e.g. Production-Appname)

3. Click on Switch Role and you should be in the other account without logging out from your current account.

4. When done editing, switch back to your own account.

5. You can add any number of accounts under the switch role and move between different accounts seamlessly.
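
The three values from step 2 can also be bundled into a bookmarkable link. The sketch below assumes the console's switch-role link format with `account`, `roleName` and `displayName` query parameters; the account number and role name in the example are hypothetical.

```python
from urllib.parse import urlencode


def switch_role_url(account, role_name, display_name):
    """Build a console switch-role link from the values asked for in step 2."""
    query = urlencode({
        "account": account,
        "roleName": role_name,
        "displayName": display_name,
    })
    return "https://signin.aws.amazon.com/switchrole?" + query


if __name__ == "__main__":
    print(switch_role_url("123456789012", "AdminRole", "Production-Appname"))
```

Opening such a link while logged in to the primary account should pre-fill the Switch Role form.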

Kubernetes Architecture Explanation

Kubernetes Terminology

Kubernetes Master:- The master node in Kubernetes is a node which controls all the other nodes in the cluster. There can be several master nodes in a cluster if it is highly available, or only one. In a single-node cluster, that node is always the master node.

Kube-Apiserver:- It is the main part of the cluster and answers all the API calls. It uses a key-value store such as etcd as persistent storage for the cluster configuration.

ETCD:- etcd is an open source distributed key-value store that provides shared configuration and service discovery for Container Linux clusters. In such a cluster, etcd runs on each machine and gracefully handles leader election during network partitions and the loss of the current leader. It is responsible for storing and replicating all Kubernetes cluster state.

Service Discovery:- It is the automatic detection of devices, and of the services offered by those devices, on a computer network. It uses a service discovery protocol (SDP), a network protocol that helps accomplish service discovery and aims to reduce the configuration effort required from users.

Kube-Scheduler:- It is responsible for scheduling the pods, and their requisite containers, which are going to come up on the cluster. The Kube-Scheduler needs to take into account individual and collective resource requirements, hardware/software/policy constraints, data locality, inter-workload interference and so on. Workload-specific requirements will be exposed through the API as necessary.

Cloud-Controller-Manager:- It is a daemon that embeds cloud-specific control loops. Since cloud providers develop and release at a different pace compared to the Kubernetes project, abstracting the provider-specific code into the cloud-controller-manager binary allows cloud vendors to evolve independently from the core Kubernetes code. It is responsible for things such as persistent storage and routes for networking.

Control loop:- In automation, a control loop is a non-terminating loop that regulates the state of a system.

Kube-Controller-Manager:- It is a daemon which embeds the core control loops that ship with Kubernetes. Each controller is a control loop that watches the shared state of the cluster through the apiserver and makes changes attempting to move the current state towards the desired state. Examples of controllers in Kubernetes are the replication controller, endpoints controller, namespace controller and serviceaccounts controller. The Kube-Controller-Manager can use the Cloud-Controller-Manager to expose them to the cloud.
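
The idea of moving the current state towards the desired state can be shown with a toy reconciler. This is a simplified illustration of the control loop concept and not Kubernetes code: the function names are made up, pods are plain strings, and the callables stand in for calls a real controller would make through the apiserver.

```python
import itertools
import time


def reconcile(desired_replicas, running_pods, start_pod, stop_pod):
    """One pass: start or stop pods until the count matches the desired count."""
    pods = list(running_pods)
    while len(pods) < desired_replicas:
        pods.append(start_pod())   # too few pods: start another one
    while len(pods) > desired_replicas:
        stop_pod(pods.pop())       # too many pods: stop the most recent one
    return pods


def control_loop(get_desired, get_running, start_pod, stop_pod,
                 interval=5.0, passes=None):
    """Reconcile repeatedly; passes=None never terminates, as a controller would."""
    counter = itertools.count() if passes is None else range(passes)
    for _ in counter:
        reconcile(get_desired(), get_running(), start_pod, stop_pod)
        time.sleep(interval)
```

Each pass observes the current state afresh, so the loop also corrects drift caused by pods that crash between passes.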

Node:- Nodes were previously referred to as minions. A node is a worker machine in Kubernetes and can be a VM or a physical machine, depending on the cluster. Each node contains the services necessary to run pods and is managed by the master components. The services on a node include the container runtime, kubelet and kube-proxy.

Kubelet:- The kubelet is the primary "node agent" which runs on each node and works in terms of PodSpecs. A PodSpec is a YAML or JSON object that describes a pod. The kubelet takes a set of PodSpecs that are provided through various mechanisms, primarily through the Kube-Apiserver, and ensures that the containers described in those PodSpecs are running and healthy.

Kube-proxy:- It runs on each node in the cluster. It reflects services as defined in the Kubernetes API on each node and can do simple TCP and UDP stream forwarding or round-robin forwarding across a set of backends. Service cluster IPs and ports are currently found through Docker-links-compatible environment variables specifying the ports opened by the service proxy. There is an optional addon that provides cluster DNS for these cluster IPs. The user must create a service with the apiserver API to configure the proxy.

Kubernetes Features

Creating Ubuntu VMs through Vagrant on a Windows Host Operating System

Important points to consider while creating a ZooKeeper and Kafka cluster

Kafka is a highly scalable and extremely fast queuing service which comes in handy when you have to handle a large volume of messages and build services that work in async mode, tolerating faults in those services without losing data. At the same time, the system needs to be scalable to meet the ever-growing volume of messages pushed to the cluster.

Following are some of the important points to consider while creating a highly available Kafka and ZooKeeper cluster:-

1. If you want to scale your Kafka nodes, you should consider keeping ZooKeeper on separate nodes. This is particularly useful for environments where the Kafka message throughput is extremely large and more brokers will be required after a certain period of time to maintain fault tolerance while keeping the system scalable.

Kafka in itself is a very scalable solution, and in case you are not receiving data in TBs you can consider keeping Kafka and ZooKeeper on the same nodes, which saves instance costs if you are running in the cloud. So the best approach depends on the end use of Kafka: how many messages it will handle over a period of time and the overall scalability required.

2. ZooKeeper nodes save the overall state of the Kafka cluster, so your ZooKeeper cluster can have fewer infrastructure resources than the Kafka cluster. You should use bigger instance sizes and more disk space in the Kafka cluster than in the ZooKeeper cluster.