Friday, November 13, 2020

[Solved] Network Split and High Erlang process on one node in the Rabbitmq Cluster

Problem:- A network split occurred in the RabbitMQ cluster, dividing the cluster of node1, node2 and node3 in two. The Erlang process count on one node was also continuously high and hitting the upper limit. After the network split, the main cluster node hung.

Cause:- The network split and the high Erlang process count can occur if requests are not spread evenly across the nodes and the application instead uses a single server as its endpoint. As a result, the Erlang process count stayed high on that node and the node hung; it was hard even to restart the process.


1. Once a network split has occurred, stop RabbitMQ on all the nodes using the following command.

service rabbitmq-server stop  

Tuesday, September 22, 2020

[Solved] OCI runtime create failed: container_linux.go:348: starting container process caused "process_linux.go:301: running exec setns process for init caused \"exit status 40\"": unknown

 Problem:- OCI runtime create failed: container_linux.go:348: starting container process caused "process_linux.go:301: running exec setns process for init caused \"exit status 40\"": unknown


This problem can occur when too much memory is used by the cache. Running echo 1 > /proc/sys/vm/drop_caches should solve the problem; it clears the page cache only.

Saturday, September 12, 2020

[Resolved] Vagrant failed to initialize at a very early stage


Vagrant failed to initialize at a very early stage:

The version of powershell currently installed on this host is less than
the required minimum version. Please upgrade the installed version of
powershell to the minimum required version and run the command again.

  Installed version: N/A
  Minimum required version: 3


1. The issue is with PowerShell.
2. Search for PowerShell and open the shell.
3. Run the following command:
# update-help
This updates the help files for the installed PowerShell modules.
4. The PowerShell problem should now be resolved. Go back and run Vagrant again; it should work.

Monday, August 24, 2020

[Resolved] Failed to get system container stats for "/system.slice/docker.service"


Failed to get system container stats for "/system.slice/docker.service": failed to get cgroup stats for "/system.slice/docker.service": failed to get cgroup stats for "/system.slice/docker.service": failed to get container info for "/system.slice/docker.service": unknown container "/system.slice/docker.service"


While installing Kubernetes and starting the kubelet service, the kubelet service fails with the above message and will not start.


After installing Docker and the kubelet, only enable the kubelet service; do not start it. Run the kubeadm init command as follows:
 kubeadm init  
Once the kubeadm command has completed, it automatically starts the kubelet on the system. The kubelet requires some additional files that only become available after kubeadm init has run.

Tuesday, July 21, 2020

[Solved] Unknown table 'COLUMN_STATISTICS' in information_schema

mysqldump: Couldn't execute 'SELECT COLUMN_NAME,                       JSON_EXTRACT(HISTOGRAM, '$."number-of-buckets-specified"')                FROM information_schema.COLUMN_STATISTICS                WHERE SCHEMA_NAME = 'uadmin' AND TABLE_NAME = 'uauth_group';': Unknown table 'COLUMN_STATISTICS' in information_schema (1109) 
It is due to a new flag that is enabled by default in mysqldump 8. You have to disable it by adding --column-statistics=0, after which the command becomes something like:
 mysqldump --column-statistics=0 --host=<server> --user=<user> --password=<password>  

Saturday, June 27, 2020

[Solved] Unknown configuration section 'hostmanager'

While working with Vagrant recently, I came across this issue:

Unknown configuration section 'hostmanager'

I had defined hostmanager in my Vagrantfile, but it is an additional plugin that you need to install before you can use it.

Run the following command to resolve this issue
 vagrant plugin install vagrant-hostmanager  

Wednesday, June 24, 2020

Authorising AWS using temporary credentials from a role

Using a long-lived access key and secret key can result in significant security issues if they are compromised.

So it is better to use role-based authentication instead. Running scripts directly under a role is not always easy, though, so you can use temporary credentials issued for the role to authenticate to AWS services. These credentials are short-lived: assume-role issues them for one hour by default, and 15 minutes is the minimum duration you can request.

This comes in handy when configuring jobs in Jenkins, running shell scripts, etc.

Below is the process to achieve this.

 aws sts assume-role --role-arn arn:aws:iam::189786521149:role/ec2fullpermission --role-session-name "Session1" --profile prod2 > temp-creds.txt  
 # set the temporary credentials as the default AWS credentials in your console session  
 export AWS_ACCESS_KEY_ID=`cat temp-creds.txt | grep -w AccessKeyId | awk '{print $2}' | sed 's/"//g;s/,//g'`  
 export AWS_SECRET_ACCESS_KEY=`cat temp-creds.txt | grep -w SecretAccessKey | awk '{print $2}' | sed 's/"//g;s/,//g'`  
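 # AWS_SESSION_TOKEN is the name current AWS CLI and SDK versions read; AWS_SECURITY_TOKEN is the older name  
 export AWS_SESSION_TOKEN=`cat temp-creds.txt | grep -w SessionToken | awk '{print $2}' | sed 's/"//g;s/,//g'`  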
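
The grep/awk/sed pipeline above depends on how the CLI happens to lay out its JSON output. As an alternative, here is a minimal Python sketch that reads the same assume-role output with a JSON parser. The function name credentials_to_env and the sample values are hypothetical; the Credentials keys are the ones the pipeline above greps for. It emits AWS_SESSION_TOKEN, the variable name that current AWS CLI and SDK versions read.

```python
import json
import shlex


def credentials_to_env(assume_role_output: str) -> dict:
    """Map the JSON printed by `aws sts assume-role` to environment variable names."""
    creds = json.loads(assume_role_output)["Credentials"]
    return {
        "AWS_ACCESS_KEY_ID": creds["AccessKeyId"],
        "AWS_SECRET_ACCESS_KEY": creds["SecretAccessKey"],
        "AWS_SESSION_TOKEN": creds["SessionToken"],
    }


# Hypothetical sample in the shape of temp-creds.txt; the values are placeholders.
sample = json.dumps({
    "Credentials": {
        "AccessKeyId": "ASIAEXAMPLE",
        "SecretAccessKey": "secretExample",
        "SessionToken": "tokenExample",
    }
})

for name, value in credentials_to_env(sample).items():
    # Print export lines that a shell could eval; shlex.quote guards special characters.
    print(f"export {name}={shlex.quote(value)}")
```

In practice you would pass the contents of temp-creds.txt instead of the sample string and eval the printed lines in your shell session.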

Sunday, June 14, 2020

Container Security

When deploying a network for an application, traffic usually flows as follows:

Internet (User) Network ----> DMZ (demilitarized zone) ----> Internal Network

Internet (User) Network :- Web applications usually receive requests over the Internet from end users.

DMZ (demilitarized zone) :- This zone is isolated from the internal network and usually holds the web servers/load balancers, so a request only proceeds to the internal network once the user is logged in and the request is verified to be genuine. In the cloud, load balancers are usually deployed in a public subnet, and the web servers can then reside in a private subnet.

Internal Network :- This is the private network that comprises the application servers and the database servers. Web servers cannot connect directly to the database servers; they have to go through the application servers, which in turn connect to the database servers.

Saturday, June 13, 2020

Understanding Cloud Agreements

It is important to understand the components of the Cloud Agreement.

There are two major cloud service agreements:
1. Acceptable Use Policy (AUP)
2. Service Level Agreements (SLA)

1. Acceptable Use Policy (AUP) :-
An acceptable use policy should be implemented in on-premises solutions to educate users about the accepted and prohibited actions on those systems.

The cloud service provider can thus use the AUP to release itself from legal liability if the customer carries out unlawful actions in the cloud environment.

AUPs mostly describe what counts as a violation of the policy and the punitive actions that can be taken if the AUP is not followed. A violation of the AUP may negatively impact the reputation of the CSP (cloud service provider).

For example, an AUP may prohibit running any type of vulnerability scanner software in the cloud.

2. Service Level Agreement(SLA):-
This document outlines all the services the CSP provides to its customers and can include vital information that directly affects the solutions deployed in the cloud, such as availability, serviceability and performance. An SLA usually specifies thresholds and the financial repercussions of not meeting them. A well-designed SLA helps resolve conflicts between the provider and the customer.

SLA thresholds can be defined, and violations identified, by collecting and monitoring key metrics. CSPs usually do not provide these metrics by default, so the customer needs to ask for them specifically; the burden of proof is on the customer who wants to push back on an SLA violation.

SLAs are often non-negotiable documents that strictly limit the liability of the provider.

Friday, May 22, 2020

[Solved] OutofMemory Exception on Java Application running on Docker Containers

We recently came across an issue where a Java application was frequently hitting an OutOfMemory exception.

Java-based applications usually use the parameters -XX:MaxRAMPercentage / -XX:MinRAMPercentage to restrict heap utilization to a percentage (from 1 to 100) of the available memory, which holds good when you run these applications on virtual instances like EC2.

But when you run them in containers, the VM allocates a larger fraction of memory to the Java heap. To turn off this behaviour, set -XX:-UseContainerSupport.

When -XX:MaxRAMPercentage / -XX:InitialRAMPercentage are used with -XX:+UseContainerSupport, the corresponding heap setting is determined based on the memory limit of the container.
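
To make the arithmetic concrete, here is a minimal Python sketch of how a percentage setting translates into a heap size when it is applied to a container memory limit. The function name max_heap_bytes and the 2 GiB limit are assumptions for illustration; a real JVM also applies alignment and other ergonomics, so treat the result as an approximation.

```python
def max_heap_bytes(container_limit_bytes: int, max_ram_percentage: float) -> int:
    """Approximate the max heap derived from a container memory limit."""
    if not 0 < max_ram_percentage <= 100:
        raise ValueError("percentage must be greater than 0 and at most 100")
    return int(container_limit_bytes * max_ram_percentage / 100)


GIB = 1024 ** 3
MIB = 1024 ** 2

# Hypothetical example: a container limited to 2 GiB with -XX:MaxRAMPercentage=75.
print(max_heap_bytes(2 * GIB, 75.0) // MIB, "MiB")  # prints 1536 MiB
```

Whatever is left of the limit (512 MiB in this example) has to cover everything outside the heap, such as metaspace, thread stacks and native buffers.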

Saturday, May 16, 2020

Installing Terraform on Centos Linux

1. Download the Terraform Linux build from the HashiCorp releases site using wget:
wget https://releases.hashicorp.com/terraform/0.12.25/terraform_0.12.25_linux_amd64.zip

2. Install unzip if it is not already installed:
yum install unzip

3. Unzip the zip file into /usr/local/bin:
unzip terraform_0.12.25_linux_amd64.zip -d /usr/local/bin/

4. Verify that Terraform has been installed successfully:
[root@localhost ~]# terraform -v
Terraform v0.12.25

Wednesday, March 4, 2020

Using Netcat to check connectivity to mysql on port 3306

Netcat is a Linux tool that can be very powerful if used correctly.

If you want to check that MySQL's default port, 3306, is reachable, you can validate it with netcat:

[ankit.mittal@bastion.test2]# nc -vz master-db.unixcloudfusion.in 3306
Connection to master-db.unixcloudfusion.in 3306 port [tcp/mysql] succeeded!
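
Where netcat is not installed, the same TCP-level check can be sketched in a few lines of Python. The function name port_is_open and the 3-second timeout are assumptions for illustration. Like nc -z, it only confirms that the port accepts a connection, not that MySQL authentication will succeed.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False
```

Calling port_is_open("master-db.unixcloudfusion.in", 3306) would mirror the nc check above.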

Tuesday, March 3, 2020

[Solved] Message: Field 'id' doesn't have a default value

Message: Field 'id' doesn't have a default value

The error occurred when I inserted into a table that had the id column as its primary key.


The error indicates that a value has to be assigned to id, since it must be unique for every row and the column has no default.

Adding AUTO_INCREMENT to the id column in the table definition resolved the issue.


[Solved] ERROR 1227 (42000) at line 18: Access denied; you need (at least one of) the SUPER privilege(s) for this operation


ERROR 1227 (42000) at line 18: Access denied; you need (at least one of) the SUPER privilege(s) for this operation

Scenario:- I took a backup of a MySQL table using mysqldump and tried to restore it into Amazon RDS.

Cause:- The error occurs when the database has binary logging enabled and the mysqldump file contains a stored object (trigger, view, function or event).

If any of the create statements do not include the "NO SQL", "READS SQL DATA" or "DETERMINISTIC" keywords, MySQL cannot write that object and the import fails.

Solution:- Change the following values in the parameter group:

log_bin_trust_function_creators = 1
global_log_bin_trust_function_creators = 1 (More relaxed permission for allowing import of all objects)

Thursday, February 27, 2020

[Solved] com/okta/tools/WithOkta : Unsupported major.minor version


 Exception in thread "main" java.lang.UnsupportedClassVersionError: com/okta/tools/WithOkta : Unsupported major.minor version 52.0  
      at java.lang.ClassLoader.defineClass1(Native Method)  
      at java.lang.ClassLoader.defineClass(ClassLoader.java:808)  
      at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)  
      at java.net.URLClassLoader.defineClass(URLClassLoader.java:442)  
      at java.net.URLClassLoader.access$100(URLClassLoader.java:64)  
      at java.net.URLClassLoader$1.run(URLClassLoader.java:354)  
      at java.net.URLClassLoader$1.run(URLClassLoader.java:348)  
      at java.security.AccessController.doPrivileged(Native Method)  
      at java.net.URLClassLoader.findClass(URLClassLoader.java:347)  
      at java.lang.ClassLoader.loadClass(ClassLoader.java:430)  
      at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:323)  
      at java.lang.ClassLoader.loadClass(ClassLoader.java:363)  
      at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:482)  


The main reason for this error is that an older version of the JDK is being used. In my case it was:

 java -version  
 java version "1.7.0_191"  
 OpenJDK Runtime Environment (amzn- u191-b01)  
 OpenJDK 64-Bit Server VM (build 24.191-b01, mixed mode)  

But the application requires at least JDK version 1.8: class file version 52.0 in the error message corresponds to Java 8.

To resolve the issue, upgrade the JDK to version 1.8 or later.
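
To see which Java release a class file targets, you can read its header. The following is a minimal Python sketch, not part of the Okta tooling; the function name class_file_java_version is hypothetical. It relies on the class file layout: a 4-byte magic number 0xCAFEBABE followed by 2-byte minor and major versions, all big-endian.

```python
import struct


def class_file_java_version(header: bytes) -> int:
    """Return the Java release that matches a .class file's major version."""
    magic, _minor, major = struct.unpack(">IHH", header[:8])
    if magic != 0xCAFEBABE:
        raise ValueError("not a Java class file")
    # Major 51 is Java 7 and major 52 is Java 8; from Java 5 on, release = major - 44.
    return major - 44


# First 8 bytes of a class compiled for Java 8: magic, minor 0, major 52 (0x34).
java8_header = bytes.fromhex("cafebabe00000034")
print(class_file_java_version(java8_header))  # prints 8
```

For a real class you would pass the first 8 bytes of the .class file extracted from the jar.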

Wednesday, February 26, 2020

Learnings Shared in Kubernetes Conference in Delhi 2020

1. Kubernetes implementation on cloud and on-premise are very different.

2. Learn enough Linux internals for a solid understanding of how to operate Kubernetes in a production environment.

3. Install and operate kubernetes using only community tools.

4. Deploy a community Kubernetes cluster manually on VMs from scratch.

5. Design and implement CI/CD pipelines for independent deployments.

6. Figure out governance strategies to independently develop, configure and operate each microservice in a kubernetes cluster.

7. Configure Istio in a flexible manner to govern east-west traffic.

8. Run all K8s processes as Docker containers rather than binaries.

9. In the absence of open internet, start with docker registry first and populate all necessary images.

10. Use kubespray to setup RHEL VMs
--> Use Ansible playbooks for opinionated provisioning
--> Sets up Calico overlay networking

Tuesday, February 18, 2020

[Solved] Difference between the Variable vs Global variable in Amazon RDS

I recently faced an issue after changing the RDS parameters and then querying them from within a MySQL RDS instance on AWS: the session values differed from the global values.

 mysql> SHOW VARIABLES WHERE Variable_name LIKE 'character_set_%' OR Variable_name LIKE 'collation%';  
 | Variable_name      | Value                   |  
 | character_set_client   | utf8                   |  
 | character_set_connection | utf8                   |  
 | character_set_database  | utf8mb4                  |  
 | character_set_filesystem | binary                  |  
 | character_set_results  | utf8                   |  
 | character_set_server   | utf8mb4                  |  
 | character_set_system   | utf8                   |  
 | character_sets_dir    | /rdsdbbin/mysql-5.7.22.R5/share/charsets/ |  
 | collation_connection   | utf8_general_ci              |  
 | collation_database    | utf8mb4_unicode_ci            |  
 | collation_server     | utf8mb4_unicode_ci            |  
 11 rows in set (0.01 sec)  
 mysql> SHOW GLOBAL VARIABLES WHERE Variable_name LIKE 'character_set_%' OR Variable_name LIKE 'collation%';  
 | Variable_name      | Value                   |  
 | character_set_client   | utf8mb4                  |  
 | character_set_connection | utf8mb4                  |  
 | character_set_database  | utf8mb4                  |  
 | character_set_filesystem | binary                  |  
 | character_set_results  | utf8mb4                  |  
 | character_set_server   | utf8mb4                  |  
 | character_set_system   | utf8                   |  
 | character_sets_dir    | /rdsdbbin/mysql-5.7.22.R5/share/charsets/ |  
 | collation_connection   | utf8mb4_unicode_ci            |  
 | collation_database    | utf8mb4_unicode_ci            |  
 | collation_server     | utf8mb4_unicode_ci            |  
 11 rows in set (0.00 sec)  

The session variables are overridden because the client auto-detects which character set to use based on the operating system settings.

To reproduce the case, I used two different MySQL clients running on separate machines. One was installed on an Ubuntu subsystem running on my local machine and the other on an Ubuntu Linux server running on an EC2 instance. With the MySQL client on my local machine the variables were not overridden. However, with the client on the Ubuntu Linux server running on EC2 the session variables were overridden.

To fix this, set the 'skip-character-set-client-handshake' parameter to 1 in your custom parameter group. The server then ignores the character set information sent by the client, so the session character set variables take the same values as your global variables.
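
To spot which variables a client has overridden, you can compare the two result sets. Here is a minimal Python sketch; the function name overridden_variables is hypothetical and the dictionaries are typed in from the SHOW VARIABLES and SHOW GLOBAL VARIABLES output above rather than fetched from a live connection.

```python
def overridden_variables(session: dict, global_: dict) -> dict:
    """Return {name: (session_value, global_value)} for variables that differ."""
    return {
        name: (value, global_[name])
        for name, value in session.items()
        if name in global_ and global_[name] != value
    }


# Values copied from the two query results shown above.
session_vars = {
    "character_set_client": "utf8",
    "character_set_connection": "utf8",
    "character_set_database": "utf8mb4",
    "character_set_results": "utf8",
    "character_set_server": "utf8mb4",
    "collation_connection": "utf8_general_ci",
    "collation_database": "utf8mb4_unicode_ci",
    "collation_server": "utf8mb4_unicode_ci",
}
global_vars = {
    "character_set_client": "utf8mb4",
    "character_set_connection": "utf8mb4",
    "character_set_database": "utf8mb4",
    "character_set_results": "utf8mb4",
    "character_set_server": "utf8mb4",
    "collation_connection": "utf8mb4_unicode_ci",
    "collation_database": "utf8mb4_unicode_ci",
    "collation_server": "utf8mb4_unicode_ci",
}

for name, (sess, glob) in overridden_variables(session_vars, global_vars).items():
    print(f"{name}: session={sess} global={glob}")
```

With the values above, only the client, connection and results character sets and the connection collation are reported, which are the settings negotiated during the client handshake.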