
Monday, October 30, 2017

Installing Salt on CentOS 7

Install the latest version of Salt directly from the SaltStack repo instead of the EPEL repo: the SaltStack repo carries the latest available release, while the EPEL packages currently lag behind because of dependency issues with older package versions.

Create the following Salt repo file in the /etc/yum.repos.d directory:

vim saltstack.repo
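
A minimal example of what this repo file could contain is shown below; treat the base URL and GPG key location as indicative placeholders and confirm them against the current SaltStack documentation for the release you want before using them.

 [saltstack-repo]
 name=SaltStack repo for RHEL/CentOS 7
 baseurl=https://repo.saltstack.com/yum/redhat/7/x86_64/latest
 enabled=1
 gpgcheck=1
 gpgkey=https://repo.saltstack.com/yum/redhat/7/x86_64/latest/SALTSTACK-GPG-KEY.pub

With the repo file in place, installing the relevant packages (for example salt-master on the master and salt-minion on the minions) with yum pulls the latest version straight from the SaltStack repo:

 yum clean expire-cache
 yum install salt-master salt-minion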


Saturday, October 14, 2017

Comparison between the Amazon AWS Transit VPC and IPsec VPN Environment

-> The AWS Transit VPC is a transit overlay solution that connects multiple spoke VPCs and removes the overhead of configuring a mesh of VPN connections from your datacenter to each individual VPC. Instead, the spoke VPCs only need to connect to the Transit VPC for inter-spoke (VPC-to-VPC) connectivity and to extend the spokes to remote networks/datacenters. This avoids the pain of implementing separate IPsec VPN connections from your datacenters to each VPC in different regions.

-> Transit VPC uses the same IPsec VPN environment to connect different AWS regions as a normal IPsec VPN connection between two regions. I reviewed the output [mtr and ping] you shared ("mtr_result_manual_ipsec.png", "mtr_result_transit_vpc.png"), and you are getting roughly the same latency over the Transit VPC setup and the manual IPsec setup. That is expected, as both setups use an IPsec VPN tunnel between Singapore and Mumbai.

-> VPN tunnels are created over the public Internet (one or more ISPs), so they are subject to everything on the path, such as traffic congestion, ISP limits, latency, fragmentation, MSS, MTU, and TCP windowing, all of which affect the traffic. Any disruption or network change on the Internet (ISP networks) will also affect the VPN traffic.

-> To conclude, both setups use an IPsec VPN connection over the Internet to connect different geographical regions, and no service level agreement is provided for the Transit VPC setup. Traffic in both setups is subject to all adverse conditions on the ISP networks.

-> Transit VPC provides better manageability in terms of architectural design and does not change the traffic behavior. It lets the customer datacenter connect to a single transit hub, which in turn establishes VPN connections with each (spoke) VPC in different regions, instead of establishing separate VPN connections from the remote network to every VPC. Transit VPC also makes it easier to scale your infrastructure.

Wednesday, September 27, 2017

Enabling the JMX port in Kafka

If you want to collect metrics for monitoring Kafka, you need to enable a JMX port for Kafka, which is set to 9999 by default here.

You need to configure this port in kafka/bin/kafka-server-start.sh by exporting JMX_PORT, which can then be used to fetch Kafka metrics. The same port is also used by the Datadog agent to report metrics for the Kafka cluster.

Just add the following line in kafka-server-start.sh, above the existing exec line:

 export JMX_PORT=${JMX_PORT:-9999}  
  exec $base_dir/kafka-run-class.sh $EXTRA_ARGS kafka.Kafka "$@"

Afterwards, restart the Kafka broker service to make the change active.

Verify that the service is listening on port 9999:

# telnet localhost 9999
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.

# netstat -tunlp | grep 9999
tcp        0      0 0.0.0.0:9999            0.0.0.0:*               LISTEN      20031/java

This confirms the JMX port was successfully configured.
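
With the port open, you can also query individual MBeans directly using Kafka's bundled JmxTool. This is only a rough illustration; the MBean shown below is one common broker metric, and the JMX URL assumes the broker is local with JMX on port 9999.

 bin/kafka-run-class.sh kafka.tools.JmxTool \
   --jmx-url service:jmx:rmi:///jndi/rmi://localhost:9999/jmxrmi \
   --object-name kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec

The Datadog agent's Kafka integration points at the same host/port, so no extra broker-side change is needed beyond exporting JMX_PORT.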

Saturday, September 23, 2017

Manually allocating shards when Elasticsearch cluster is red

If you have a large number of shards with replicas and a huge amount of data, it is possible for the ES cluster to go red. The cluster goes red because of issues with primary shards that become unassigned; depending on the situation, there are a number of steps you can take to resolve this.

However, as a last resort you may have to allocate the shards manually. Even then, the recommended approach is to first figure out what is wrong with the cluster, i.e. why it is not assigning the shards.

As a preliminary step, switch replication off; otherwise you will have a comparatively higher number of unassigned shards and recovery can take a long time. If you want to recover quickly, set the replicas to 0 and add them back at a later point in time, as in the sketch below.
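
A rough sketch of those steps, assuming an Elasticsearch 5.x cluster reachable on localhost:9200; the index name, shard number, and node name are placeholders, and allocate_stale_primary explicitly accepts possible data loss, so use it only as the last resort described above.

 # Temporarily drop replicas to reduce the number of unassigned shards
 curl -XPUT 'localhost:9200/_settings' -H 'Content-Type: application/json' -d '{"index":{"number_of_replicas":0}}'

 # Check why shards are unassigned before forcing anything
 curl -XGET 'localhost:9200/_cluster/allocation/explain?pretty'

 # Last resort: force-allocate a stale primary on a specific node
 curl -XPOST 'localhost:9200/_cluster/reroute' -H 'Content-Type: application/json' -d '{
   "commands": [
     { "allocate_stale_primary": { "index": "my-index", "shard": 0, "node": "node-1", "accept_data_loss": true } }
   ]
 }'

 # Once the cluster recovers, add the replicas back
 curl -XPUT 'localhost:9200/_settings' -H 'Content-Type: application/json' -d '{"index":{"number_of_replicas":1}}'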

Monday, September 18, 2017

Adding log rotation in Elasticsearch

Elasticsearch supports log rotation out of the box; you just need to configure log4j2.properties accordingly.

Just copy the configuration below into the following file:

vim /etc/elasticsearch/log4j2.properties

 status = error

 # log action execution errors for easier debugging
 logger.action.name = org.elasticsearch.action
 logger.action.level = debug

 appender.console.type = Console
 appender.console.name = console
 appender.console.layout.type = PatternLayout
 appender.console.layout.pattern = [%d{ISO8601}][%-5p][%-25c{1.}] %marker%m%n

 appender.rolling.type = RollingFile
 appender.rolling.name = rolling
 appender.rolling.fileName = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}.log
 appender.rolling.layout.type = PatternLayout
 appender.rolling.layout.pattern = [%d{ISO8601}][%-5p][%-25c{1.}] %marker%.-10000m%n
 appender.rolling.filePattern = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}-%d{yyyy-MM-dd}.log.gz
 appender.rolling.policies.type = Policies
 appender.rolling.policies.time.type = TimeBasedTriggeringPolicy
 appender.rolling.policies.time.interval = 1
 appender.rolling.policies.time.modulate = true

 appender.rolling.strategy.type = DefaultRolloverStrategy
 appender.rolling.strategy.action.type = Delete
 appender.rolling.strategy.action.basepath = ${sys:es.logs.base_path}
 appender.rolling.strategy.action.condition.type = IfLastModified
 appender.rolling.strategy.action.condition.age = 5D
 appender.rolling.strategy.action.PathConditions.type = IfFileName
 appender.rolling.strategy.action.PathConditions.glob = ${sys:es.logs.cluster_name}-*

 rootLogger.level = info
 rootLogger.appenderRef.console.ref = console
 rootLogger.appenderRef.rolling.ref = rolling

 appender.deprecation_rolling.type = RollingFile
 appender.deprecation_rolling.name = deprecation_rolling
 appender.deprecation_rolling.fileName = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}_deprecation.log
 appender.deprecation_rolling.layout.type = PatternLayout
 appender.deprecation_rolling.layout.pattern = [%d{ISO8601}][%-5p][%-25c{1.}] %marker%.-10000m%n
 appender.deprecation_rolling.filePattern = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}_deprecation-%i.log.gz
 appender.deprecation_rolling.policies.type = Policies
 appender.deprecation_rolling.policies.size.type = SizeBasedTriggeringPolicy
 appender.deprecation_rolling.policies.size.size = 1GB
 appender.deprecation_rolling.strategy.type = DefaultRolloverStrategy
 appender.deprecation_rolling.strategy.max = 4

 logger.deprecation.name = org.elasticsearch.deprecation
 logger.deprecation.level = warn
 logger.deprecation.appenderRef.deprecation_rolling.ref = deprecation_rolling
 logger.deprecation.additivity = false

 appender.index_search_slowlog_rolling.type = RollingFile
 appender.index_search_slowlog_rolling.name = index_search_slowlog_rolling
 appender.index_search_slowlog_rolling.fileName = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}_index_search_slowlog.log
 appender.index_search_slowlog_rolling.layout.type = PatternLayout
 appender.index_search_slowlog_rolling.layout.pattern = [%d{ISO8601}][%-5p][%-25c] %marker%.-10000m%n
 appender.index_search_slowlog_rolling.filePattern = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}_index_search_slowlog-%d{yyyy-MM-dd}.log
 appender.index_search_slowlog_rolling.policies.type = Policies
 appender.index_search_slowlog_rolling.policies.time.type = TimeBasedTriggeringPolicy
 appender.index_search_slowlog_rolling.policies.time.interval = 1
 appender.index_search_slowlog_rolling.policies.time.modulate = true

 logger.index_search_slowlog_rolling.name = index.search.slowlog
 logger.index_search_slowlog_rolling.level = trace
 logger.index_search_slowlog_rolling.appenderRef.index_search_slowlog_rolling.ref = index_search_slowlog_rolling
 logger.index_search_slowlog_rolling.additivity = false

 appender.index_indexing_slowlog_rolling.type = RollingFile
 appender.index_indexing_slowlog_rolling.name = index_indexing_slowlog_rolling
 appender.index_indexing_slowlog_rolling.fileName = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}_index_indexing_slowlog.log
 appender.index_indexing_slowlog_rolling.layout.type = PatternLayout
 appender.index_indexing_slowlog_rolling.layout.pattern = [%d{ISO8601}][%-5p][%-25c] %marker%.-10000m%n
 appender.index_indexing_slowlog_rolling.filePattern = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}_index_indexing_slowlog-%d{yyyy-MM-dd}.log
 appender.index_indexing_slowlog_rolling.policies.type = Policies
 appender.index_indexing_slowlog_rolling.policies.time.type = TimeBasedTriggeringPolicy
 appender.index_indexing_slowlog_rolling.policies.time.interval = 1
 appender.index_indexing_slowlog_rolling.policies.time.modulate = true

 logger.index_indexing_slowlog.name = index.indexing.slowlog.index
 logger.index_indexing_slowlog.level = trace
 logger.index_indexing_slowlog.appenderRef.index_indexing_slowlog_rolling.ref = index_indexing_slowlog_rolling
 logger.index_indexing_slowlog.additivity = false

Monday, August 28, 2017

Listing the complete IP ranges of AWS

If you want to whitelist AWS IPs, use the following command to list them (the example below filters the EC2 ranges in us-east-1):

curl https://ip-ranges.amazonaws.com/ip-ranges.json -s | jq '.prefixes[] | select(.region=="us-east-1" and .service=="EC2").ip_prefix'
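
To dump the complete list rather than a single region/service, drop the select() filter; the -r flag strips the surrounding quotes:

curl https://ip-ranges.amazonaws.com/ip-ranges.json -s | jq -r '.prefixes[].ip_prefix'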



Tuesday, June 20, 2017

docker version and docker info commands

You can use the docker version command, which reports the currently installed Docker version, API version, Go version, and build details for both the client and the server. The docker info command similarly reports server-wide details such as the number of containers and images and the storage driver in use.
 [user@ankit63001 ~]$ docker images  
 REPOSITORY     TAG         IMAGE ID      CREATED       SIZE
  hello-world     latest       1815c82652c0    5 days ago     1.84kB
 [user@ankit63001 ~]$ docker version
 Client:
 Version:   17.05.0-ce
 API version: 1.29
 Go version:  go1.7.5
 Git commit:  89658be
 Built:    Thu May 4 22:06:25 2017
 OS/Arch:   linux/amd64 
 Server:
 Version:   17.05.0-ce
 API version: 1.29 (minimum version 1.12)
 Go version:  go1.7.5
 Git commit:  89658be
 Built:    Thu May 4 22:06:25 2017
 OS/Arch:   linux/amd64
 Experimental: false
 [user@ankit63001 ~]$

Thursday, March 16, 2017

Deploying an EC2 Instance Using Terraform

You can use the following Terraform script to deploy an instance in your AWS account. Terraform will create a t2.medium instance from the official RHEL 7.2 AMI (referenced by its AMI ID) inside the specified subnet, with a 30 GB root block device and a 10 GB EBS volume. The instance uses a predefined key pair, and the specified tags are added to the launched instance.

 provider "aws" {  
  access_key = "AKXXXXXXXXXXXXXXXXX"  
  secret_key = "2YXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXxx"  
  region   = "ap-south-1"  
 }  
 resource "aws_instance" "instance_name" {  
  ami = "ami-cdbdd7a2"  
  count = 1  
  instance_type = "t2.medium"  
  security_groups = ["sg-f70674re"]  
  subnet_id = "subnet-526bcb6d"  
  root_block_device = {  
   volume_type = "standard"  
   volume_size = "30"  
  }   
  ebs_block_device = {  
   device_name = "/dev/sdm"  
   volume_type = "gp2"  
   volume_size = "10"  
  }  
  source_dest_check = true  
  key_name = "Keyname"  
  tags {  
   Name = "tagname"  
  }  
 }  
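
With the file saved (for example as main.tf), the usual workflow for Terraform releases of that era applies; reading the plan output before applying shows exactly what will be created.

 terraform init   # download the AWS provider plugin
 terraform plan   # preview the instance, root volume and EBS volume to be created
 terraform apply  # create the resources in the configured account

Keeping the access and secret keys out of the .tf file (for example via environment variables or a shared credentials file) is generally preferable to hard-coding them as in the sample above.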

Thursday, March 9, 2017

Custom Cloudwatch Alarm Configuration Part-8

As discussed in the previous post, those alarm plugins push the metric data to CloudWatch using a cron running every minute or every 5 minutes, depending on your requirements.

Next we have to create alarms in CloudWatch on the above metrics. The logic is that if a metric crosses the threshold value, an event is triggered, for example sending a mail through SNS alerting that the value has crossed the threshold; when the value comes back below the threshold, the state changes from ALARM to OK, which acts as a recovery.

But unlike doing this from the console, we are going to create the alarms programmatically using the AWS CLI. The script works sequentially: it loops over an array and creates all the relevant alarms.

The most important thing to consider here is the name of the alarm created in CloudWatch. You can use any name, but a programmatically generated name following a meaningful pattern should be used, so that the environment, application, alarm type, and affected service are immediately clear and the team receiving the alert can start working on resolution right away. A rough sketch of creating such alarms in a loop follows.
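
This is only an illustrative sketch, not the exact script from this series; the namespace, metric names, thresholds, SNS topic ARN, and naming pattern are placeholders.

 #!/bin/bash
 # Create one alarm per custom metric, following a predictable naming pattern: <env>-<application>-<metric>
 SNS_TOPIC="arn:aws:sns:ap-south-1:123456789012:ops-alerts"   # placeholder ARN
 NAMESPACE="AppCWMon"                                         # placeholder namespace
 INSTANCE_ID="i-0abc123def456"                                # placeholder instance

 METRICS=("MemoryUsage" "DiskUsage" "ProcessCount")
 THRESHOLDS=(80 85 1)
 OPERATORS=("GreaterThanOrEqualToThreshold" "GreaterThanOrEqualToThreshold" "LessThanThreshold")

 for i in "${!METRICS[@]}"; do
   aws cloudwatch put-metric-alarm \
     --alarm-name "prod-myapp-${METRICS[$i]}" \
     --namespace "${NAMESPACE}" \
     --metric-name "${METRICS[$i]}" \
     --dimensions Name=InstanceId,Value="${INSTANCE_ID}" \
     --statistic Average --period 300 --evaluation-periods 1 \
     --threshold "${THRESHOLDS[$i]}" \
     --comparison-operator "${OPERATORS[$i]}" \
     --alarm-actions "${SNS_TOPIC}" --ok-actions "${SNS_TOPIC}"
 done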

Wednesday, March 8, 2017

Custom Cloudwatch Plugins CW_tcpConnections Part-7

The following CloudWatch plugin can be used to determine the number of established TCP connections.

 #!/bin/bash
#
#  About                : Check TCP connections
#
#  Name                 : cw_tcpconnection.sh

DIR=$(dirname $0);
PLUGIN_NAME='cw_tcpconnection';

# Include configuration file
source ${DIR}/../conf/plugin.conf;


#Get Current Instance ID
INSTANCE_ID=(`wget -q -O - http://169.254.169.254/latest/meta-data/instance-id`);
#Get Hostname
HOST_ID=(`wget -q -O - http://169.254.169.254/latest/meta-data/hostname`);

# Help
usage() {
        echo "Usage: $0 [-n ] [-d ] [-m ] [-h ] [-p ]" 1>&2;
        exit 1;
}

# Logger
logger(){

 SEVERITY=$1;
 MESSAGE=$2;
 DATE=`date +"[%Y-%b-%d %H:%M:%S.%3N]"`;

 echo -e "${DATE} [${SEVERITY}] [${PLUGIN_NAME}] [${INSTANCE_ID}] [${HOST_ID}] ${MESSAGE}" >> ${DIR}/../logs/appcwmon.log;

}

# Process Arguments

if [ $# -eq 0 ]; then
        # When no argument is passed
        logger ERROR "Invalid arguments passed";
        usage;
fi



while getopts ":n:d:m:h:p:" o; do
    case "${o}" in
        n)
            NAMESPACE=${OPTARG}
            if [ -z "${NAMESPACE}" ]; then
                logger ERROR "Invalid Namespace passed";
                usage;
            fi
            ;;
        d)
            DIMENSION=${OPTARG};

            DNAME=${DIMENSION%=*};
            DVALUE=${DIMENSION#*=};

            if [ -z "${DIMENSION}" ] || [ -z "${DNAME}" ] || [ "${DNAME}" == "${DVALUE}" ]; then
                logger ERROR "Invalid dimension passed <${DIMENSION}>";
                usage;
            fi

            # If Dimension name is 'InstanceId' then Value is not required to be passed
            if [ "${DNAME}" != 'InstanceId' ] && [ -z "${DVALUE}" ]; then
                logger ERROR "Invalid dimension passed <${DIMENSION}>";
                usage;
            fi
            ;;
        m)
            METRICS=${OPTARG}
            if [ -z "${METRICS}" ]; then
                logger ERROR "Invalid metrices passed <${METRICS}>";
                usage;
            fi
            ;;
        h)
            HOST=${OPTARG}
            if [ -z "${HOST}" ]; then
                logger ERROR "Invalid hostname passed <${HOST}>";
                usage;
            fi
            ;;
        p)
            PORT=${OPTARG}
            if [ -z "${PORT}" ]; then
                logger ERROR "Invalid port passed <${PORT}>";
                usage;
            fi
            ;;

        *)
            usage
            ;;
    esac
done
shift $((OPTIND-1))

# Input Validation
if [ -z "${NAMESPACE}" ] || [ -z "${DNAME}" ] || [ -z "$METRICS" ] || [ -z "$HOST" ] || [ -z "$PORT" ]; then
                logger ERROR "Invalid argument passed";
    usage
fi


##########################################################
##########################################################


# If "INSTANCE_ID" is passed as Dimension, then use actual AWS Instanec ID as Dimension
if [ "${DNAME}" == "InstanceId" ]; then
        DVALUE=${INSTANCE_ID};
fi


UNIT="Count";

# Count established TCP connections matching the given host and port (reconstructed; the original command was truncated here)
VALUE=$(netstat -ant | grep ESTABLISHED | grep -w "${HOST}" | grep -cw "${PORT}" 2>&1);

OUTPUT=$(/usr/local/bin/aws cloudwatch put-metric-data --namespace ${NAMESPACE} --metric-name ${METRICS} --dimensions ${DNAME}=${DVALUE} --value ${VALUE} --unit ${UNIT} 2>&1);

if [ "$?" -ne "0" ]; then
        logger ERROR "${NAMESPACE} ${METRICS} ${DNAME}=${DVALUE} | value=${VALUE} unit=${UNIT} $HOST $PORT | ${OUTPUT}";
        exit 1;
fi;

logger INFO "${NAMESPACE} ${METRICS} ${DNAME}=${DVALUE} | value=${VALUE} unit=${UNIT} $HOST $PORT";
# Success
exit 0;

Custom Cloudwatch Plugins CW_Rabbitmq Queue Message Length Part-6

The following CloudWatch plugin measures the number of messages in a RabbitMQ queue (unacknowledged, ready, and total), on which alarms can later be configured using the CloudWatch API.


#!/bin/bash
#
#  About                : Check RabbitMQ Queue Message Length
#
#  Name                 : cw_rabbitmq.sh



DIR=$(dirname $0);
PLUGIN_NAME='cw_rabbitmq';

# Include configuration file
source ${DIR}/../conf/plugin.conf;

#Get Current Instance ID
INSTANCE_ID=(`wget -q -O - http://169.254.169.254/latest/meta-data/instance-id`);
#Get Hostname
HOST_ID=(`wget -q -O - http://169.254.169.254/latest/meta-data/hostname`);

# Help
usage() {
        echo "Usage: $0 [-n ] [-d ] [-m ] [-u ] [-p ] [-q ]" 1>&2;
        exit 1;
}

# Logger
logger(){

 SEVERITY=$1;
 MESSAGE=$2;
 DATE=`date +"[%Y-%b-%d %H:%M:%S.%3N]"`;

 echo -e "${DATE} [${SEVERITY}] [${PLUGIN_NAME}] [${INSTANCE_ID}] [${HOST_ID}] ${MESSAGE}" >> ${DIR}/../logs/appcwmon.log;

}

# Process Arguments

if [ $# -eq 0 ]; then
        # When no argument is passed
        logger ERROR "Invalid arguments passed";
        usage;
fi



while getopts ":n:d:m:u:p:q:" o; do
    case "${o}" in
        n)
            NAMESPACE=${OPTARG}
            if [ -z "${NAMESPACE}" ]; then
                logger ERROR "Invalid Namespace passed";
                usage;
            fi
            ;;
        d)
            DIMENSION=${OPTARG};

            DNAME=${DIMENSION%=*};
            DVALUE=${DIMENSION#*=};

            if [ -z "${DIMENSION}" ] || [ -z "${DNAME}" ] || [ "${DNAME}" == "${DVALUE}" ]; then
                logger ERROR "Invalid dimension passed <${DIMENSION}>";
                usage;
            fi

            # If Dimension name is 'InstanceId' then Value is not required to be passed
            if [ "${DNAME}" != 'InstanceId' ] && [ -z "${DVALUE}" ]; then
                logger ERROR "Invalid dimension passed <${DIMENSION}>";
                usage;
            fi
            ;;
        m)
            METRICS=${OPTARG}
            if [ -z "${METRICS}" ]; then
                logger ERROR "Invalid metrices passed <${METRICS}>";
                usage;
            fi
            ;;
        u)
            USERNAME=${OPTARG}
            if [ -z "${USERNAME}" ]; then
                logger ERROR "Invalid username passed <${USERNAME}>";
                usage;
            fi
            ;;
        p)
            PASSWORD=${OPTARG}
            if [ -z "${PASSWORD}" ]; then
                logger ERROR "Invalid password passed <${PASSWORD}>";
                usage;
            fi
            ;;
        q)
            QUEUE=${OPTARG}
            if [ -z "${QUEUE}" ]; then
                logger ERROR "Invalid queue passed <${QUEUE}>";
                usage;
            fi
            ;;


        *)
            usage
            ;;
    esac
done
shift $((OPTIND-1))

# Input Validation
if [ -z "${NAMESPACE}" ] || [ -z "${DNAME}" ] || [ -z "$METRICS" ] || [ -z "$USERNAME" ] || [ -z "$PASSWORD" ] || [ -z "$QUEUE" ]; then
                logger ERROR "Invalid argument passed";
    usage
fi


##########################################################
##########################################################


# If "INSTANCE_ID" is passed as Dimension, then use actual AWS Instanec ID as Dimension
if [ "${DNAME}" == "InstanceId" ]; then
        DVALUE=${INSTANCE_ID};
fi


UNIT="Count";

## Total Message
TOTAL_MESSAGE=$(/usr/local/bin/rabbitmqadmin --username=${USERNAME} --password="${PASSWORD}" list queues name messages | grep -w "${QUEUE} " | awk '{print $4}' 2>&1);

if [ "$?" -ne "0" ]; then
        logger ERROR "${NAMESPACE} ${METRICS} ${DNAME}=${DVALUE} | ${QUEUE} value=NULL unit=${UNIT} | ${TOTAL_MESSAGE}";
        exit 1;
fi;

OUTPUT=$(/usr/local/bin/aws cloudwatch put-metric-data --namespace ${NAMESPACE} --metric-name Total-${METRICS} --dimensions ${DNAME}=${DVALUE} --value ${TOTAL_MESSAGE} --unit ${UNIT} 2>&1);

if [ "$?" -ne "0" ]; then
        logger ERROR "${NAMESPACE} ${METRICS} ${DNAME}=${DVALUE} | ${QUEUE} value=${TOTAL_MESSAGE} unit=${UNIT} | ${OUTPUT}";
        exit 1;
fi;

## Unacknowledged Messages
UNACK_MESSAGE=$(/usr/local/bin/rabbitmqadmin --username=${USERNAME} --password="${PASSWORD}" list queues name messages_unacknowledged | grep -w "${QUEUE} " | awk '{print $4}' 2>&1);
if [ "$?" -ne "0" ]; then
        logger ERROR "${NAMESPACE} ${METRICS} ${DNAME}=${DVALUE} | ${QUEUE} value=NULL unit=${UNIT} | ${UNACK_MESSAGE}";
        exit 1;
fi;

OUTPUT=$(/usr/local/bin/aws cloudwatch put-metric-data --namespace ${NAMESPACE} --metric-name UnACK-${METRICS} --dimensions ${DNAME}=${DVALUE} --value ${UNACK_MESSAGE} --unit ${UNIT} 2>&1);

if [ "$?" -ne "0" ]; then
        logger ERROR "${NAMESPACE} ${METRICS} ${DNAME}=${DVALUE} | ${QUEUE} value=${UNACK_MESSAGE} unit=${UNIT} | ${OUTPUT}";
        exit 1;
fi;

## Ready Message
READY_MESSAGE=$(/usr/local/bin/rabbitmqadmin --username=${USERNAME} --password="${PASSWORD}" list queues name messages_ready | grep -w "${QUEUE} " | awk '{print $4}' 2>&1);
if [ "$?" -ne "0" ]; then
        logger ERROR "${NAMESPACE} ${METRICS} ${DNAME}=${DVALUE} | ${QUEUE} value=NULL unit=${UNIT} | ${READY_MESSAGE}";
        exit 1;
fi;

OUTPUT=$(/usr/local/bin/aws cloudwatch put-metric-data --namespace ${NAMESPACE} --metric-name Ready-${METRICS} --dimensions ${DNAME}=${DVALUE} --value ${READY_MESSAGE} --unit ${UNIT} 2>&1);

if [ "$?" -ne "0" ]; then
        logger ERROR "${NAMESPACE} ${METRICS} ${DNAME}=${DVALUE} | ${QUEUE} value=${READY_MESSAGE} unit=${UNIT} | ${OUTPUT}";
        exit 1;
fi;

## Consumer Count
CONSUMER=$(/usr/local/bin/rabbitmqadmin --username=${USERNAME} --password="${PASSWORD}" list queues name consumers | grep -w "${QUEUE} " | awk '{print $4}' 2>&1);
if [ "$?" -ne "0" ]; then
        logger ERROR "${NAMESPACE} ${METRICS} ${DNAME}=${DVALUE} | ${QUEUE} value=NULL unit=${UNIT} | ${CONSUMER}";
        exit 1;
fi;

OUTPUT=$(/usr/local/bin/aws cloudwatch put-metric-data --namespace ${NAMESPACE} --metric-name ConsumerCount-${METRICS} --dimensions ${DNAME}=${DVALUE} --value ${CONSUMER} --unit ${UNIT} 2>&1);

if [ "$?" -ne "0" ]; then
        logger ERROR "${NAMESPACE} ${METRICS} ${DNAME}=${DVALUE} | ${QUEUE} value=${CONSUMER} unit=${UNIT} | ${OUTPUT}";
        exit 1;
fi;

logger INFO "${NAMESPACE} ${METRICS} ${DNAME}=${DVALUE} | ${QUEUE} value=${TOTAL_MESSAGE},${UNACK_MESSAGE},${READY_MESSAGE},${CONSUMER} unit=${UNIT}";
# Success
exit 0;

   

Custom Cloudwatch Plugins CW_ProcessCount Part-5

You can monitor the number of processes running for a service, and hence whether the service is running on the server, using the following CloudWatch plugin.

 #!/bin/bash
#
#  About                : Check Process Running Status
#
#  Name                 : cw_process.sh

DIR=$(dirname $0);
PLUGIN_NAME='cw_process';

# Include configuration file
source ${DIR}/../conf/plugin.conf;


#Get Current Instance ID
INSTANCE_ID=(`wget -q -O - http://169.254.169.254/latest/meta-data/instance-id`);
#Get Hostname
HOST_ID=(`wget -q -O - http://169.254.169.254/latest/meta-data/hostname`);

# Help
usage() {
        echo "Usage: $0 [-n ] [-d ] [-m ] [-p ]" 1>&2;
        exit 1;
}

# Logger
logger(){

 SEVERITY=$1;
 MESSAGE=$2;
 DATE=`date +"[%Y-%b-%d %H:%M:%S.%3N]"`;

 echo -e "${DATE} [${SEVERITY}] [${PLUGIN_NAME}] [${INSTANCE_ID}] [${HOST_ID}] ${MESSAGE}" >> ${DIR}/../logs/appcwmon.log;

}

# Process Arguments

if [ $# -eq 0 ]; then
        # When no argument is passed
        logger ERROR "Invalid arguments passed";
        usage;
fi



while getopts ":n:d:m:p:" o; do
    case "${o}" in
        n)
            NAMESPACE=${OPTARG}
            if [ -z "${NAMESPACE}" ]; then
                logger ERROR "Invalid Namespace passed";
                usage;
            fi
            ;;
        d)
            DIMENSION=${OPTARG};

            DNAME=${DIMENSION%=*};
            DVALUE=${DIMENSION#*=};

            if [ -z "${DIMENSION}" ] || [ -z "${DNAME}" ] || [ "${DNAME}" == "${DVALUE}" ]; then
                logger ERROR "Invalid dimension passed <${DIMENSION}>";
                usage;
            fi

            # If Dimension name is 'InstanceId' then Value is not required to be passed
            if [ "${DNAME}" != 'InstanceId' ] && [ -z "${DVALUE}" ]; then
                logger ERROR "Invalid dimension passed <${DIMENSION}>";
                usage;
            fi
            ;;
        m)
            METRICS=${OPTARG}
            if [ -z "${METRICS}" ]; then
                logger ERROR "Invalid metrices passed <${METRICS}>";
                usage;
            fi
            ;;
        p)
            PROCESS=${OPTARG}
            if [ -z "${PROCESS}" ]; then
                logger ERROR "Invalid process passed <${PROCESS}>";
                usage;
            fi
            ;;
        *)
            usage
            ;;
    esac
done
shift $((OPTIND-1))

# Input Validation
if [ -z "${NAMESPACE}" ] || [ -z "${DNAME}" ] || [ -z "$METRICS" ] || [ -z "$PROCESS" ]; then
                logger ERROR "Invalid argument passed";
    usage
fi


##########################################################
##########################################################


# If "INSTANCE_ID" is passed as Dimension, then use actual AWS Instanec ID as Dimension
if [ "${DNAME}" == "InstanceId" ]; then
        DVALUE=${INSTANCE_ID};
fi


UNIT="Count";

VALUE=$(ps aux | grep "${PROCESS}" | grep -v cw_process.sh | grep -vc grep 2>&1);

OUTPUT=$(/usr/local/bin/aws cloudwatch put-metric-data --namespace ${NAMESPACE} --metric-name ${METRICS} --dimensions ${DNAME}=${DVALUE} --value ${VALUE} --unit ${UNIT} 2>&1);

if [ "$?" -ne "0" ]; then
        logger ERROR "${NAMESPACE} ${METRICS} ${DNAME}=${DVALUE} | value=${VALUE} unit=${UNIT} | ${OUTPUT}";
        exit 1;
fi;

logger INFO "${NAMESPACE} ${METRICS} ${DNAME}=${DVALUE} | value=${VALUE} unit=${UNIT}";
# Success
exit 0;

   

Tuesday, March 7, 2017

Custom Cloudwatch Plugins CW_Netconnection Part-4

CloudWatch can be used to monitor the established connections to the VM. This helps in tracking connections when your application is network intensive.

#!/bin/bash
#
#  About                : Check Local and Foreign Network Connections
#
#  Name                 : cw_netconnection.sh

DIR=$(dirname $0);
PLUGIN_NAME='cw_netconnection';

# Include configuration file
source ${DIR}/../conf/plugin.conf;


#Get Current Instance ID
INSTANCE_ID=(`wget -q -O - http://169.254.169.254/latest/meta-data/instance-id`);
#Get Hostname
HOST_ID=(`wget -q -O - http://169.254.169.254/latest/meta-data/hostname`);

# Help
usage() {
        echo "Usage: $0 [-n ] [-d ] [-m ] [-s ] -t [ LOCAL | FOREIGN ] -p " 1>&2;
        exit 1;
}

# Logger
logger(){

 SEVERITY=$1;
 MESSAGE=$2;
 DATE=`date +"[%Y-%b-%d %H:%M:%S.%3N]"`;

 echo -e "${DATE} [${SEVERITY}] [${PLUGIN_NAME}] [${INSTANCE_ID}] [${HOST_ID}] ${MESSAGE}" >> ${DIR}/../logs/appcwmon.log;

}

# Process Arguments

if [ $# -eq 0 ]; then
        # When no argument is passed
        logger ERROR "Invalid arguments passed";
        usage;
fi



while getopts ":n:d:m:p:s:t:" o; do
    case "${o}" in
        n)
            NAMESPACE=${OPTARG}
            if [ -z "${NAMESPACE}" ]; then
                logger ERROR "Invalid Namespace passed";
                usage;
            fi
            ;;
        d)
            DIMENSION=${OPTARG};

            DNAME=${DIMENSION%=*};
            DVALUE=${DIMENSION#*=};

            if [ -z "${DIMENSION}" ] || [ -z "${DNAME}" ] || [ "${DNAME}" == "${DVALUE}" ]; then
                logger ERROR "Invalid dimension passed <${DIMENSION}>";
                usage;
            fi

            # If Dimension name is 'InstanceId' then Value is not required to be passed
            if [ "${DNAME}" != 'InstanceId' ] && [ -z "${DVALUE}" ]; then
                logger ERROR "Invalid dimension passed <${DIMENSION}>";
                usage;
            fi
            ;;
        m)
            METRICS=${OPTARG};
            if [ -z "${METRICS}" ]; then
                logger ERROR "Invalid metrices passed <${METRICS}>";
                usage;
            fi
            ;;
        s)
            STATE=${OPTARG}
            if [ "${STATE}" != "ESTABLISHED" ] && [ "${STATE}" != "LISTEN" ] && [ "${STATE}" != "TIME_WAIT" ]; then
                logger ERROR "Invalid connection state passed <${STATE}>";
                usage;
            fi
            ;;
        t)
            TYPE=${OPTARG}
            if [ "${TYPE}" != "LOCAL" ] && [ "${TYPE}" != "FOREIGN" ]; then
                logger ERROR "Invalid connection type passed <${TYPE}>";
                usage;
            fi
            ;;
        p)
            PORT=${OPTARG}
            if [ -z "${PORT}" ]; then
                logger ERROR "Invalid process passed <${PORT}>";
                usage;
            fi
            ;;
        *)
            usage
            ;;
    esac
done
shift $((OPTIND-1))

# Input Validation
if [ -z "${NAMESPACE}" ] || [ -z "${DNAME}" ] || [ -z "$METRICS" ] || [ -z "$PORT" ] || [ -z "${STATE}" ] || [ -z "${TYPE}" ]; then
                logger ERROR "Invalid argument passed";
    usage
fi


##########################################################
##########################################################


# If "INSTANCE_ID" is passed as Dimension, then use actual AWS Instanec ID as Dimension
if [ "${DNAME}" == "InstanceId" ]; then
        DVALUE=${INSTANCE_ID};
fi


UNIT="Count";

if [ "${TYPE}" == "LOCAL" ]; then
        VALUE=$(netstat -alntp | grep ${STATE} | grep -v grep | awk '{print $4}' | awk -F[:] '{print $2}' | grep -cw ${PORT} 2>&1);
else
        echo ${TYPE};
        VALUE=$(netstat -alntp | grep ${STATE} | grep -v grep | awk '{print $5}' | awk -F[:] '{print $2}' | grep -cw ${PORT} 2>&1);
fi;

if [ "$VALUE" -ne "$VALUE" ] 2>/dev/null; then
        logger ERROR "${NAMESPACE} ${METRICS} ${DNAME}=${DVALUE} | ${STATE} ${TYPE} ${PORT} | value=NULL unit=${UNIT} | ${VALUE}";
        exit 1;
fi;

OUTPUT=$(/usr/local/bin/aws cloudwatch put-metric-data --namespace ${NAMESPACE} --metric-name ${METRICS} --dimensions ${DNAME}=${DVALUE} --value ${VALUE} --unit ${UNIT} 2>&1);

if [ "$?" -ne "0" ]; then
        logger ERROR "${NAMESPACE} ${METRICS} ${DNAME}=${DVALUE} | ${STATE} ${TYPE} ${PORT} value=${VALUE} unit=${UNIT} | ${OUTPUT}";
        exit 1;
fi;

logger INFO "${NAMESPACE} ${METRICS} ${DNAME}=${DVALUE} | ${STATE} ${TYPE} ${PORT} value=${VALUE} unit=${UNIT}";
# Success
exit 0;

   

Custom Cloudwatch Plugins CW_MemoryUsage Part-3

The following plugin pushes the memory consumption of the VM to CloudWatch, which you can use to set alarms; combined with CloudWatch Events it can also drive autoscaling or other automated actions.

#!/bin/bash
#
#  About                : Check used memory in percentage
#
#  Name                 : cw_memory.sh


DIR=$(dirname $0);
PLUGIN_NAME='cw_memory';

# Include configuration file
source ${DIR}/../conf/plugin.conf;


#Get Current Instance ID
INSTANCE_ID=(`wget -q -O - http://169.254.169.254/latest/meta-data/instance-id`);
#Get Hostname
HOST_ID=(`wget -q -O - http://169.254.169.254/latest/meta-data/hostname`);

# Help
usage() {
        echo "Usage: $0 [-n ] [-d ] [-m ] " 1>&2;
        exit 1;
}

# Logger
logger(){

 SEVERITY=$1;
 MESSAGE=$2;
 DATE=`date +"[%Y-%b-%d %H:%M:%S.%3N]"`;

 echo -e "${DATE} [${SEVERITY}] [${PLUGIN_NAME}] [${INSTANCE_ID}] [${HOST_ID}] ${MESSAGE}" >> ${DIR}/../logs/appcwmon.log;

}

# Process Arguments

if [ $# -eq 0 ]; then
        # When no argument is passed
        logger ERROR "Invalid arguments passed";
        usage;
fi


while getopts ":n:d:m:a:" o; do
    case "${o}" in
        n)
            NAMESPACE=${OPTARG}
            if [ -z "${NAMESPACE}" ]; then
                logger ERROR "Invalid Namespace passed";
                usage;
            fi
            ;;
        d)
            DIMENSION=${OPTARG};
            DNAME=${DIMENSION%=*};
            DVALUE=${DIMENSION#*=};

            if [ -z "${DIMENSION}" ] || [ -z "${DNAME}" ] || [ "${DNAME}" == "${DVALUE}" ]; then
                logger ERROR "Invalid dimension passed <${DIMENSION}>";
                usage;
            fi

            # If Dimension name is 'InstanceId' then Value is not required to be passed
            if [ "${DNAME}" != 'InstanceId' ] && [ -z "${DVALUE}" ]; then
                logger ERROR "Invalid dimension passed <${DIMENSION}>";
                usage;
            fi
            ;;
        m)
            METRICS=${OPTARG}
            if [ -z "${METRICS}" ]; then
                logger ERROR "Invalid metrics passed <${METRICS}>";
                usage;
            fi
            ;;
        *)
            usage
            ;;
    esac
done
shift $((OPTIND-1))

# Input Validation
if [ -z "${NAMESPACE}" ] || [ -z "${DNAME}" ] || [ -z "$METRICS" ]; then
    logger ERROR "Invalid arguments passed";
    usage
fi


##########################################################
##########################################################


# If "INSTANCE_ID" is passed as Dimension, then use actual AWS Instanec ID as Dimension
if [ "${DNAME}" == "InstanceId" ]; then
        DVALUE=${INSTANCE_ID};
fi

##########################

UNIT="Percent"

# Get Total Memory
#MEM_TOTAL=$(free -m | grep 'Mem' | awk '{print $2}' 2>&1);
MEM_TOTAL=$(awk '/^MemTotal/ {print $2}' /proc/meminfo 2>&1);

if [ "$?" -ne "0" ]; then
        logger ERROR "${NAMESPACE} ${METRICS} ${DNAME}=${DVALUE} | value=NULL | ${MEM_TOTAL}";
        exit 1;
fi;

MEM_FREE=$(awk '/^MemFree/ {print $2}' /proc/meminfo 2>&1);

if [ "$?" -ne "0" ]; then
        logger ERROR "${NAMESPACE} ${METRICS} ${DNAME}=${DVALUE} | value=NULL | ${MEM_FREE}";
        exit 1;
fi;

MEM_BUFFER=$(awk '/^Buffers/ {print $2}' /proc/meminfo 2>&1);

if [ "$?" -ne "0" ]; then
        logger ERROR "${NAMESPACE} ${METRICS} ${DNAME}=${DVALUE} | value=NULL | ${MEM_BUFFER}";
        exit 1;
fi;

MEM_CACHED=$(awk '/^Cached/ {print $2}' /proc/meminfo 2>&1);

if [ "$?" -ne "0" ]; then
        logger ERROR "${NAMESPACE} ${METRICS} ${DNAME}=${DVALUE} | value=NULL | ${MEM_CACHED}";
        exit 1;
fi;

# Memory Used (In Percentage)
let "VALUE=((MEM_TOTAL - (MEM_FREE + MEM_BUFFER + MEM_CACHED))*100)/MEM_TOTAL";

if [ "$?" -ne "0" ]; then
        logger ERROR "${NAMESPACE} ${METRICS} ${DNAME}=${DVALUE} | value=NULL | ${VALUE}";
        exit 1;
fi;

## Pushing Cloudwatch Metric data
OUTPUT=$(/usr/local/bin/aws cloudwatch put-metric-data --namespace ${NAMESPACE} --metric-name ${METRICS} --dimensions ${DNAME}=${DVALUE} --value ${VALUE} --unit ${UNIT} 2>&1);

if [ "$?" -ne "0" ]; then
        logger ERROR "${NAMESPACE} ${METRICS} ${DNAME}=${DVALUE} | value=${VALUE} unit=${UNIT} | ${OUTPUT}";
        exit 1;
fi;

logger INFO "${NAMESPACE} ${METRICS} ${DNAME}=${DVALUE} | value=${VALUE} unit=${UNIT}";

# Success
exit 0;


Monday, March 6, 2017

Custom Cloudwatch Plugins CW_DiskUsage Part-2

Below is the plugin for monitoring the disk usage of a specific mount point via CloudWatch. It goes in the bin folder of the plugin directory; create a file named cw_diskusage.sh with the following script.

 #!/bin/bash
#
#  About                : Percent of Disk usage by Mount based on Mount name
#
#  Name                 : cw_diskusage.sh

DIR=$(dirname $0);
PLUGIN_NAME='cw_diskusage';

# Include configuration file
source ${DIR}/../conf/plugin.conf;


#Get Current Instance ID
INSTANCE_ID=(`wget -q -O - http://169.254.169.254/latest/meta-data/instance-id`);
#Get Hostname
HOST_ID=(`wget -q -O - http://169.254.169.254/latest/meta-data/hostname`);

# Help
usage() {
        echo "Usage: $0 [-n ] [-d ] [-m ] [-f Mount Point]" 1>&2;
        exit 1;
}

# Logger
logger(){

 SEVERITY=$1;
 MESSAGE=$2;
 DATE=`date +"[%Y-%b-%d %H:%M:%S.%3N]"`;

 echo -e "${DATE} [${SEVERITY}] [${PLUGIN_NAME}] [${INSTANCE_ID}] [${HOST_ID}] ${MESSAGE}" >> ${DIR}/../logs/appcwmon.log;

}

# Process Arguments

if [ $# -eq 0 ]; then
        # When no argument is passed
        logger ERROR "Invalid arguments passed";
        usage;
fi


while getopts ":n:d:m:f:" o; do
    case "${o}" in
        n)
            NAMESPACE=${OPTARG}
            if [ -z "${NAMESPACE}" ]; then
                logger ERROR "Invalid Namespace passed";
                usage;
            fi
            ;;
        d)
            DIMENSION=${OPTARG};

            DNAME=${DIMENSION%=*};
            DVALUE=${DIMENSION#*=};

            if [ -z "${DIMENSION}" ] || [ -z "${DNAME}" ] || [ "${DNAME}" == "${DVALUE}" ]; then
                logger ERROR "Invalid dimension passed <${DIMENSION}>";
                usage;
            fi

            # If Dimension name is 'InstanceId' then Value is not required to be passed
            if [ "${DNAME}" != 'InstanceId' ] && [ -z "${DVALUE}" ]; then
                logger ERROR "Invalid dimension passed <${DIMENSION}>";
                usage;
            fi
            ;;
        m)
            METRICS=${OPTARG}
            if [ -z "${METRICS}" ]; then
                logger ERROR "Invalid metric passed <${METRICS}>";
                usage;
            fi
            ;;
        f)
            MOUNT_POINT=${OPTARG}
            if [ -z "${MOUNT_POINT}" ]; then
                logger ERROR "Invalid mount point passed <${MOUNT_POINT}>";
                usage;
            fi
            ;;
        *)
            usage
            ;;
    esac
done
shift $((OPTIND-1))

# Input Validation
if [ -z "${NAMESPACE}" ] || [ -z "${DNAME}" ] || [ -z "$METRICS" ] || [ -z "$MOUNT_POINT" ]; then
    logger ERROR "Invalid arguments passed";
    usage
fi


##########################################################
##########################################################


# If "INSTANCE_ID" is passed as Dimension, then use actual AWS Instanec ID as Dimension
if [ "${DNAME}" == "InstanceId" ]; then
        DVALUE=${INSTANCE_ID};
fi


##########################################################

#Check if mount point is valid
if ! grep -qs ${MOUNT_POINT} /proc/mounts; then
    logger ERROR "Mount point <${MOUNT_POINT}> not found";
    exit 1;
fi;

VALUE=$(df "${MOUNT_POINT}" -m | sed -n 2p | awk '{print $5}' | cut -d% -f1 2>&1);
UNIT="Percent";

## Pushing Cloudwatch Metric data

OUTPUT=$(/usr/local/bin/aws cloudwatch put-metric-data --namespace ${NAMESPACE} --metric-name ${METRICS} --dimensions ${DNAME}=${DVALUE} --value ${VALUE} --unit ${UNIT} 2>&1);

if [ "$?" -ne "0" ]; then
        logger ERROR "${NAMESPACE} ${METRICS} ${DNAME}=${DVALUE} | value=${VALUE} unit=${UNIT} | ${OUTPUT}";
        exit 1;
fi;

logger INFO "${NAMESPACE} ${METRICS} ${DNAME}=${DVALUE} | value=${VALUE} unit=${UNIT}";
# Success
exit 0;


Custom Cloudwatch Plugins Part-1

CloudWatch is a hosted monitoring service provided by AWS to monitor the different resources in your cloud infrastructure. AWS exposes various metrics (data points) about each resource to describe its state on a per-minute basis; these can be used for monitoring and to raise an alarm whenever a certain threshold is crossed. You can combine CloudWatch with SNS to send a notification whenever the state of an alarm changes.

Further, you can configure events and take actions on these alarms. The limitation is that AWS only provides certain built-in metrics, and there are times when you want to monitor resources that AWS does not cover, such as your services, established connections, processes, memory, etc. For this you need to create your own custom CloudWatch metrics, which you push to CloudWatch using the AWS CLI.

Once the metric is available in CloudWatch, you can put alarms on it. You need to push the metric regularly using a scheduler (cron) so that the alarm state stays OK; if CloudWatch does not receive data for the metric, the alarm state changes to INSUFFICIENT_DATA and no alarm is raised.

We are going to follow a common format while creating our custom CloudWatch plugins: each plugin provides the metric data that tells CloudWatch about the state of a resource, and the alarms configured on those metrics form the overall monitoring. You need to be familiar with bash scripting to use these plugins.

We are going to use the following directory structure and design our scripts around it, starting with the appcwplugins directory, which contains the following directories:

1. bin:- Executable CloudWatch plugins go in this directory; they contain no configuration.
2. conf:- Alarm configuration for the CloudWatch CLI, the Access/Secret key for the AWS CLI, and proxy details live in this directory.
3. extra:- Log rotation for the CloudWatch alarms and the start/stop service for the CloudWatch alarms go in this directory.
4. logs:- CloudWatch plugin logs are written to this directory.
5. script:- A script to copy the application/system logs to S3 and a script to push the metrics to CloudWatch, executed by cron.




Tuesday, February 28, 2017

Custom Cloudwatch RDS Monitoring Plugins Part-2

In Part 1 we discussed the executable RDS monitoring script, which lets you pass any SQL statement, take its output, and push the result set to CloudWatch, where alarms on it work as custom monitoring metrics and fire whenever the threshold is crossed.

In our use case the result of the SQL execution is 0, which means there is no error on the RDS. If there is any error, an error message is returned and the result is non-zero, which causes CloudWatch to trigger an alarm.

Further, the SQL output is posted in the email body and sent to the DBA and DevOps DLs.

In this post we are covering the configuration to be used along with the previous executable script. Once it is configured, you can schedule the script with cron on any server that has the AWS CLI set up, create the alarms, and trigger alerts on the RDS; a hypothetical cron entry is sketched below.
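
As an illustration only (the script path, schedule, and arguments here are hypothetical, not the exact ones used in this series), the cron entry could look something like this:

 # Run the RDS custom-query check every 5 minutes
 */5 * * * * /opt/appcwplugins/bin/cw_rds.sh -n AppCWMon -d DBInstanceIdentifier=mydb -m BlockedConnections >> /opt/appcwplugins/logs/appcwmon.log 2>&1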

Monday, February 27, 2017

Custom Cloudwatch RDS Monitoring Plugins Part-1

Monitoring RDS instances is necessary for detecting issues. AWS RDS provides out-of-the-box metrics, i.e. system-level metrics, but there are occasions when you want to monitor things like blocked connections, advanced queues, etc. You can use the CloudWatch plugin below to monitor anything in RDS based on a custom query.

The plugin works on the logic that if the query executed on RDS does not return any error rows, the count is 0, which means OK; if something is wrong, error rows are returned, the count is non-zero, and that means ALARM.

If you are using sharding, you need to execute the same query on all your shard databases. The script can run against one database or against a number of databases in the sharded case. A rough sketch of the core idea follows.
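
A minimal sketch of that logic, assuming a MySQL-compatible RDS endpoint and a configured AWS CLI; the endpoint, credentials, query, namespace, and metric name are placeholders rather than the actual plugin from this series.

 #!/bin/bash
 RDS_HOST="mydb.xxxxxx.ap-south-1.rds.amazonaws.com"   # placeholder endpoint
 DB_USER="monitor"                                     # placeholder user
 DB_PASS="secret"                                      # placeholder password
 QUERY="SELECT trx_mysql_thread_id FROM information_schema.innodb_trx WHERE trx_wait_started IS NOT NULL;"

 # Count the rows returned by the check query; 0 means OK, non-zero should raise the alarm
 ERROR_COUNT=$(mysql -h "${RDS_HOST}" -u "${DB_USER}" -p"${DB_PASS}" -N -e "${QUERY}" | wc -l)

 # Push the count to CloudWatch so an alarm can trigger on any non-zero value
 aws cloudwatch put-metric-data --namespace AppRDSMon --metric-name BlockedTransactions \
   --dimensions DBInstance=mydb --value "${ERROR_COUNT}" --unit Count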

Saturday, February 18, 2017

Why Security is a DevOps Concern

Since DevOps deals with rapid releases of the application over short periods of time using a combination of CI/CD and automation, security plays a very significant role in keeping the overall process safe, so that you do not leave loopholes that someone can take advantage of to penetrate your systems or insert malicious code.

Following are the key ways in which you can bring security into your day-to-day activities:

1. Security as part of the team
Someone within the team should take responsibility for security and decide, as and when required, whether to secure things yourselves or to involve the security team.

2. Understand the risks
Understanding the risks helps you involve security in your day-to-day operations and close the loopholes. Once you understand the risks, you will naturally take the necessary steps to fix them.

3. Security is part of everything
Security forms the core of everything, whether it is your network, systems, code, monitoring, etc.

4. User experience is important
End-user experience matters: if you enforce overly complex passwords, users will write them down, and that can easily be exploited to gain access to your systems. Always weigh the user experience against the security policy you are enforcing.

Application Security Principles

If you are using the cloud to power your web or mobile applications, understanding security is a key aspect of delivering a good business application.

Following are the summarized security principles:

1. Data-in-transit protection
Consumer data transiting networks should be adequately protected against tampering and eavesdropping. This can be done with SSL certificates (encryption) combined with network protection tools such as VPNs.

2. Asset protection
The assets storing or processing the data should be protected against physical tampering, loss, and damage. Beyond the cloud provider's restricted physical access, securing access with key-based authentication, storing data in encrypted form, and backing up the data all help.

3. Separation between consumers
Prevent one malicious or compromised consumer from affecting the service or data of another. This can be done through per-user profiling, authentication, and database controls so that each consumer can access only their own account and data.

Thursday, February 16, 2017

Eternal Bash History for user command auditing in Linux

There are times when you need to track the commands executed by users. This includes all system users irrespective of team, so that if things go wrong it can easily be traced who executed which command.

This also helps resolve disputes within a team when two users both claim they did not execute a command. And if you are installing something or doing some new configuration, you can refer back to the commands you executed.

Place the following configuration in /etc/bashrc:

 if [ "$BASH" ]; then  
 export HISTTIMEFORMAT="%Y-%m-%d_%H:%M:%S "
 export PROMPT_COMMAND="${PROMPT_COMMAND:+$PROMPT_COMMAND ; }"'echo "`date +'%y.%m.%d-%H:%M:%S:'`" $USER "("$ORIGINAL_USER")" "COMMAND: " "$(history 1 | cut -c8-)" >> /var/log/bash_eternal_history'
 alias ehistory='cat /var/log/bash_eternal_history'
 readonly PROMPT_COMMAND
 readonly HISTSIZE
 readonly HISTFILE
 readonly HOME
 readonly HISTIGNORE
 readonly HISTCONTROL
 fi

The output is appended to a file under the /var/log directory. Execute the following commands to create the log file:

 touch /var/log/bash_eternal_history  
 chmod 777 /var/log/bash_eternal_history
 chattr +a /var/log/bash_eternal_history



Tuesday, February 7, 2017

Pulling the Messages from Amazon SQS Queue to a file with python

Amazon SQS is a high-throughput messaging queue from AWS.

You can send any type of messages or logs to SQS and then use a consumer (script) to pull those messages from SQS and take an action based on them. One use case: push all your ELB logs to SQS, and from SQS send them anywhere you like, including your event notifier (SIEM) tools, batch processing, automation, etc.

The following generalized Python script pulls 10 messages at a time from SQS (the per-poll maximum allowed by SQS) and writes them to a file. If you want to increase the throughput, you simply run more processes of the script; for example, several parallel processes will get you to roughly 50 messages a minute. Note that neither Python nor SQS imposes a hard limit here, so you can scale out to n processes, but the underlying operating system is the limiting factor, depending on the available CPU and I/O capacity.

After downloading these logs you may analyze them, forward them elsewhere through syslog, or write your own automation scripts around them.
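
The Python script itself is not reproduced in this post; as an illustration of the same polling idea using only the AWS CLI and jq (the queue URL and output file below are placeholders), a consumer loop could look like this:

 #!/bin/bash
 QUEUE_URL="https://sqs.ap-south-1.amazonaws.com/123456789012/my-queue"   # placeholder queue
 OUT_FILE="/var/log/sqs_messages.log"

 while true; do
   # Long-poll up to 10 messages at a time (the SQS per-request maximum)
   BATCH=$(aws sqs receive-message --queue-url "${QUEUE_URL}" \
             --max-number-of-messages 10 --wait-time-seconds 20 --output json)

   # Append the message bodies to the file
   echo "${BATCH}" | jq -r '.Messages[]?.Body' >> "${OUT_FILE}"

   # Delete what has been written so it is not delivered again
   echo "${BATCH}" | jq -r '.Messages[]?.ReceiptHandle' | while read -r RH; do
     aws sqs delete-message --queue-url "${QUEUE_URL}" --receipt-handle "${RH}"
   done
 done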

[Solved] S3 Bucket Creation Fails with IllegalLocationConstraintException Error

While creating a bucket using s3api, the bucket creation fails with the following error message:


An error occurred (IllegalLocationConstraintException) when calling the CreateBucket operation: The unspecified location constraint is incompatible for the region specific endpoint this request was sent to.


The error came up specifically in the Mumbai region, while the same command worked in the Singapore region.

Not Working

aws s3api create-bucket --bucket bucketname --region ap-south-1

Working

aws s3api create-bucket --bucket bucketname --region ap-southeast-1


The reason for this error is that an additional parameter has to be passed for bucket creation using s3api in Mumbai, i.e. --create-bucket-configuration LocationConstraint=ap-south-1. Once you pass it, you should be able to create the bucket from the command line.

Working

aws s3api create-bucket --bucket bucketname --region ap-south-1 --create-bucket-configuration LocationConstraint=ap-south-1

Output

{
"Location": "http://bucketname.s3.amazonaws.com/"
}


RDS Alerts

RDS forms a crucial part of the web application, and any problem with it can lead to application downtime, reduced performance, 5xx errors, and a degraded user experience. RDS monitoring therefore plays an important role. Below is a list of parameters that can be monitored to measure normal RDS operation. Some of the monitoring metrics are provided by AWS and the rest can be created using custom scripts.

Thresholds depend on the size of the RDS instance (CPU cores, memory, etc.); the values below are only indicative.

1. CPU Utilization:- CPU utilization increases as the workload and processing on the RDS increase. Alert threshold: [CPU Utilization] >= 80% for 5 minutes.

2. Database Connections:- If database connections grow beyond a limit, you should be alerted, because if the application cannot get free connections, those requests will fail with connectivity errors. Alert threshold: [Database Connections] >= 10000.

3. Disk Queue Depth:- The disk queue depth increases significantly if your RDS is doing a lot of I/O, which results in increased latency. The disk queue represents the pending I/O operations for the volume. Provisioning more IOPS or faster storage can help in this scenario.

4. Free Storage Space:- Represents the amount of storage space available on the RDS instance. Alert threshold: [Free Storage Space] < 2048 GB.

Wednesday, February 1, 2017

Using EOF to execute multiple commands on a remote server

You can run Linux commands on a remote machine using a loop in a bash script.

If you want to run multiple commands on the remote server, you can use EOF (a here-document), which opens a buffer into which you enter the commands you want to execute on the remote machine. Once you are done entering commands, you close the buffer with EOF again.

The heredoc lets you redirect its contents to another command. In our case we redirect it to sudo -i so that the commands execute as the root user.


 for i in `cat file.txt`;do echo "###$i####";ssh -t -i key.pem -p 22 ec2-user@$i 'sudo -i << \EOF   
 export https_proxy=http://proxy.example.com:3128;export http_proxy=http://proxy.example.com:3128;  
 yum install sendmail -y  
 EOF'; done  

Monday, January 30, 2017

Auto AMI Backup of an EC2 instance


 
 #!/bin/bash  
 #Script Details  
 #Script to create an AMI of the server (based on the cron time period) and delete AMIs older than 3 days.  
 #The time period can be controlled via cron, e.g. daily, every 3 days, or weekly AMI creation.  
 #The retention time for the AMI is 3 days by default; this can be customized to your requirement.  
 #Deleting an AMI also removes the associated snapshots.  
 #You need to pass the instance ID as an argument to the script.  
 #Credentials are fetched from the config file of the user.  
 #Uses the SNS configuration for sending the AMI status.  
 #The instance name is determined from the Name tag assigned to the instance.  
 #If the Name tag is not found, the script exits and sends an error mail.  
 #The AMI backup name is the instance name followed by the date in YYYYMMDD format.  
 #The backed-up AMI gets additional tags to identify the necessary information, as follows:  
 #The instance ID from which this AutoAMI was created  
 #The date tag on which this AutoAMI was created  


 instance_list=$1  
 DATE=`date +%Y%m%d`  
 From="[email protected] "  
 To="[email protected]"  
 mail_body=/tmp/ami_report  
 echo -e "----------------------------------\n  `date`  \n----------------------------------" &gt; $mail_body  
 for instance_id in ${instance_list//,/ }; do  
 #Get the instance name from the instance id.  
 instance_name=$(aws ec2 describe-instances --instance-ids $instance_id --query 'Reservations[].Instances[].[Tags[?Key==`Name`].Value[]]' --output text)  
 if [[ $instance_name == "" ]] ; then  
 echo -e "Instance-ID ($instance_id) scheduled for auto AMI creation doesn't exist. Please check." | /bin/mail -A ses -s "$instance_id scheduled for AMI doesn't exist" -r $From $To  
 exit  
 else  
 #Create the AMI name.  
 ami_name=$(echo "$instance_name-$DATE")  
 #To create AMI from the instance  
 ami_id=$(aws ec2 create-image --instance-id "$instance_id" --name "$ami_name" --description "Auto AMI from $instance_name ($instance_id)" --no-reboot --output text)  
 #Tag the AMI.  
 aws ec2 create-tags --resources $ami_id --tags Key=Instance_id,Value=$instance_id Key=Date,Value=$DATE  
 if [[ $ami_id != "" ]];then  
 echo -e "$ami_id ($ami_name) created successfully from $instance_name ($instance_id).\n" &gt;&gt; $mail_body  
 else  
 echo -e "AMI creation failed from $instance_name ($instance_id). Please check.\n" &gt;&gt; $mail_body  
 fi  
 #############Auto Delete 3 days old AMI.#############  
 DATE_d=`date +%Y%m%d --date '3 days ago'`  
 ami_name_d=$(echo "$instance_name-$DATE_d")  
 #Find the AMI need to be Deregister.  
 ami_id_d=$(aws ec2 describe-images --filters Name=name,Values=$ami_name_d Name=tag-key,Values=Instance_id Name=tag-value,Values=$instance_id --query 'Images[*].{ID:ImageId}' --output text)  
 if [[ $ami_id_d != "" ]]; then  
 #Find the snapshots attached to the AMI need to be Deregister.  
 aws ec2 describe-images --image-ids $ami_id_d --query 'Images[].BlockDeviceMappings[].Ebs.SnapshotId' --output text > /tmp/snap.txt  
 #Deregistering the AMI  
 aws ec2 deregister-image --image-id $ami_id_d  
 #Deleting snapshots attached to AMI  
 for i in `cat /tmp/snap.txt`;do aws ec2 delete-snapshot --snapshot-id $i ; done  
 echo -e "$ami_id_d deleted with attached snapshot `cat /tmp/snap.txt`\n" &gt;&gt; $mail_body  
 fi  
 fi  
 done  
 cat $mail_body | /bin/mail -A ses -s "Auto backup report `date +%d%b%y`" -r $From $To  


Example 

Add a crontab entry on a server that has the AWS access and secret keys configured:
 # crontab -e

 # Auto AMI creation & Deletion
 # Every 3rd day at 10pm # Test-Server(i-ejhc4dfer45), Test-Server2(i-ejhc4dfer58)

0 22 */3 * * /opt/aws_automation/autoamibackup.sh i-ejhc4dfer45,i-ejhc4dfer58