
Wednesday, March 20, 2019


[Solved] Error restarting cluster: wait: waiting for k8s-app=kube-proxy: timed out waiting for the condition

Error:-
Error restarting cluster: wait: waiting for k8s-app=kube-proxy: timed out waiting for the condition

Solution:-
This occurred during the minikube installation. To resolve it, delete the existing installation and start again:
 ./minikube delete  
 ./minikube start
That should resolve this error.
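Once the cluster is back up, a quick sanity check (my addition, not part of the original fix, assuming kubectl is installed and pointed at the minikube cluster) is to confirm minikube is healthy and the kube-proxy pod from the error message is actually running:

 ./minikube status  
 kubectl get pods -n kube-system -l k8s-app=kube-proxy  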

[Solved] Unable to start VM: create: precreate: exec: "docker": executable file not found in $PATH

Error:-
Unable to start VM: create: precreate: exec: "docker": executable file not found in $PATH

Occurrence:-
Occurred during the minikube installation.

Resolution:-
Docker was not installed on the VM, so I installed it using the get.docker.com convenience script:
 curl -fsSL https://get.docker.com/ | sh  
The script automatically detects the operating system and installs Docker for you.
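Before retrying minikube, it is worth confirming that the Docker daemon is actually up (these verification commands are my addition):

 sudo systemctl enable --now docker  
 docker --version  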

[Solved] Unable to start VM: create: precreate: VBoxManage not found. Make sure VirtualBox is installed and VBoxManage is in the path

Error:-
Unable to start VM: create: precreate: VBoxManage not found. Make sure VirtualBox is installed and VBoxManage is in the path

Occurence:-
This error came up during the minikube installation on a VirtualBox VM.

Cause/Resolution:- 
Minikube and a Vagrant VM do not play well together, since that amounts to running one type-2 hypervisor inside another type-2 hypervisor.
It generally makes sense to run minikube on Linux; if you are on a Windows machine and need a Linux environment, VirtualBox is the usual way to get one.

The solution is to set minikube's vm-driver to none, so it runs directly on the host instead of trying to create another VM:
 ./minikube config set vm-driver none  

That should solve your problem.
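Note that with the none driver, minikube runs the Kubernetes components directly on the host, so it needs Docker installed locally and is typically started as root (this caveat is my addition, based on how the none driver behaves):

 sudo ./minikube start --vm-driver none  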


growpart fails to extend disk volume (attempt to resize /dev/xvda failed. sfdisk output below)

Error:-

attempt to resize /dev/xvda failed. sfdisk output below:
|
| Disk /dev/xvda: 104433 cylinders, 255 heads, 63 sectors/track
| Old situation:
| Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0
|
|    Device Boot Start     End   #cyls    #blocks   Id  System
| /dev/xvda1   *      1   78324   78324  629137530   83  Linux
| /dev/xvda2          0       -       0          0    0  Empty
| /dev/xvda3          0       -       0          0    0  Empty
| /dev/xvda4          0       -       0          0    0  Empty
| New situation:
| Units = sectors of 512 bytes, counting from 0
|
|    Device Boot    Start       End   #sectors  Id  System
| /dev/xvda1   *     16065 1677716144 1677700080  83  Linux
| /dev/xvda2             0         -          0   0  Empty
| /dev/xvda3             0         -          0   0  Empty
| /dev/xvda4             0         -          0   0  Empty
| Successfully wrote the new partition table
|
| Re-reading the partition table ...
| BLKRRPART: Device or resource busy
| The command to re-read the partition table failed.
| Run partprobe(8), kpartx(8) or reboot your system now,
| before using mkfs
| If you created or changed a DOS partition, /dev/foo7, say, then use dd(1)
| to zero the first 512 bytes:  dd if=/dev/zero of=/dev/foo7 bs=512 count=1
| (See fdisk(8).)
FAILED: failed to resize
***** WARNING: Resize failed, attempting to revert ******
Re-reading the partition table ...
BLKRRPART: Device or resource busy
The command to re-read the partition table failed.
Run partprobe(8), kpartx(8) or reboot your system now,
before using mkfs
***** Appears to have gone OK ****

Resolution:-

# growpart /dev/xvda 1

If you are wondering whether you are doing something wrong: there is absolutely nothing wrong with the above command.

As the output shows, the new partition table was written successfully. The failure happens afterwards: the kernel is asked to re-read the partition table while the disk is busy (BLKRRPART: Device or resource busy), so it never picks up the new size. I tried multiple suggested fixes, most of them aimed at sfdisk, but my growpart was already the latest version and the issue still persisted.

At this point you will need to restart the server to fix the issue. If it is a production server you may have to take the appropriate approvals first, as there is no other way around it; once the server comes back up, the kernel re-reads the partition table and the partition shows its increased size.
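After the reboot you can confirm the partition grew and then extend the filesystem on top of it. The follow-up below is a sketch (the filesystem type is an assumption; use resize2fs for ext4 or xfs_growfs for XFS):

 lsblk /dev/xvda                # the partition should now show the new size  
 sudo resize2fs /dev/xvda1      # for ext4  
 sudo xfs_growfs /              # for XFS, grow by mount point instead  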




[Solved] invalid principal in policy

Problem:- I created an S3 bucket policy modeled on another policy above it, but when I tried to save it AWS returned "Invalid principal in policy" and would not let me save the policy.


Cause:- I had given the wrong ARN in the Principal element; logically everything else was correct. AWS appears to validate on the backend that the principal actually exists, and since no such ARN existed it refused to save the policy in the first place.

Wrong ARN in my case:-
"AWS": "arn:aws:iam::446685876341:role/something-something-test-role"


Right ARN in my case:-
"AWS": "arn:aws:iam::446685876341:role/service-role/something-something-test-role"


Resolution:- Once I corrected the ARN to include the service-role path, the error was resolved.
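A quick way to catch this up front (my addition, assuming the AWS CLI is configured) is to look up the role and copy its exact ARN, path included:

 aws iam get-role --role-name something-something-test-role --query 'Role.Arn'  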

Monday, March 4, 2019

[Solved] url_helper.py[WARNING]: Calling 'http://169.254.169.254/2009-04-04/meta-data/instance-id' failed [75/120s]: unexpected error ['NoneType' object has no attribute 'status_code']

Issue:- I was enabling ENA support for CentOS 7.1 on an EC2 instance when I received the following error
url_helper.py[WARNING]: Calling 'http://169.254.169.254/2009-04-04/meta-data/instance-id' failed [75/120s]: unexpected error ['NoneType' object has no attribute 'status_code']

Because of this, the network card was not coming up on the instance, so cloud-init's url_helper.py script could not reach the metadata service to fetch the instance-id or obtain an IP address. When the instance finally booted with no IP assigned, the SSH-based instance status checks failed.

I was getting the following logs, which confirmed it:

Cloud-init v. 0.7.5 running 'init' at Mon, 04 Mar 2018 06:33:38 +0000. Up 5.17 seconds.
cis-info: +++++++++++++++++++++++Net device info++++++++++++++++++++++++
cis-info: +--------+-------+-----------+-----------+-------------------+
cis-info: | Device |   Up  |  Address  |    Mask   |     Hw-Address    |
cis-info: +--------+-------+-----------+-----------+-------------------+
cis-info: | ens5:  | False |     .     |     .     | 06:f7:b8:fc:f1:20 |
cis-info: |  lo:   |  True | 127.0.0.1 | 255.0.0.0 |         .         |
cis-info: +--------+-------+-----------+-----------+-------------------+
cis-info: ++++++++++++++++++++++++++Route info+++++++++++++++++++++++++++
cis-info: +-------+-------------+---------+---------+-----------+-------+
cis-info: | Route | Destination | Gateway | Genmask | Interface | Flags |
cis-info: +-------+-------------+---------+---------+-----------+-------+
cis-info: +-------+-------------+---------+---------+-----------+-------+
2018-03-03 22:33:38,836 - url_helper.py[WARNING]: Calling 'http://169.254.169.254/2009-04-04/meta-data/instance-id' failed [0/120s]: unexpected error ['NoneType' object has no attribute 'status_code']


Cause:-
The AWS documentation mentions adding GRUB_CMDLINE_LINUX="net.ifnames=0" to /boot/grub2/grub.cfg, but that did not work for me.

Solution:-
I instead added the setting to /etc/default/grub and regenerated the grub configuration.

After that the problem was resolved and I was able to successfully upgrade the instance to 5th-generation instance types.
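Concretely, the change looked something like this (a sketch; the existing GRUB_CMDLINE_LINUX contents on your system will differ):

 # In /etc/default/grub, append net.ifnames=0 to the kernel command line  
 GRUB_CMDLINE_LINUX="crashkernel=auto rhgb quiet net.ifnames=0"  
 # Then regenerate the grub configuration  
 sudo grub2-mkconfig -o /boot/grub2/grub.cfg  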

After the change I got the following output in the logs

Cloud-init v. 0.7.5 running 'init' at Mon, 04 Mar 2018 07:43:28 +0000. Up 8.73 seconds.
cis-info: ++++++++++++++++++++++++++Net device info++++++++++++++++++++++++++
cis-info: +--------+------+-------------+---------------+-------------------+
cis-info: | Device |  Up  |   Address   |      Mask     |     Hw-Address    |
cis-info: +--------+------+-------------+---------------+-------------------+
cis-info: | ens5:  | True | 10.98.16.98 | 255.255.255.0 | 06:f7:b8:fc:f1:20 |
cis-info: |  lo:   | True |  127.0.0.1  |   255.0.0.0   |         .         |
cis-info: +--------+------+-------------+---------------+-------------------+
cis-info: +++++++++++++++++++++++++++++++Route info+++++++++++++++++++++++++++++++
cis-info: +-------+-------------+------------+---------------+-----------+-------+
cis-info: | Route | Destination |  Gateway   |    Genmask    | Interface | Flags |
cis-info: +-------+-------------+------------+---------------+-----------+-------+
cis-info: |   0   |   0.0.0.0   | 10.98.16.1 |    0.0.0.0    |    ens5   |   UG  |
cis-info: |   1   |  10.98.16.0 |  0.0.0.0   | 255.255.255.0 |    ens5   |   U   |
cis-info: +-------+-------------+------------+---------------+-----------+-------+
Cloud-init v. 0.7.5 running 'modules:config' at Mon, 04 Mar 2018 07:43:30 +0000. Up 10.16 seconds.

[Solved] /etc/default/grub: line 60: serial: command not found

Issue:- When I ran the command below, it resulted in the following error:
$ sudo grub2-mkconfig -o /boot/grub2/grub.cfg
/etc/default/grub: line 60: serial: command not found

Cause:- At some point you mistakenly ran grub2-mkconfig -o /etc/default/grub, which overwrote your grub defaults file with a generated configuration. Now, when you try to generate the grub configuration as shown above, it errors out while reading that mangled defaults file.

Resolution:- Manually edit the defaults file and restore the following content:
vi /etc/default/grub
GRUB_TIMEOUT=5
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="crashkernel=auto rhgb quiet"
GRUB_DISABLE_RECOVERY="true"
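With the defaults restored, the original command should now complete cleanly:

 sudo grub2-mkconfig -o /boot/grub2/grub.cfg  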

Friday, March 1, 2019

[Solved] Rate Limiting Errors in the Awscli

Error:- An error occurred (Throttling) when calling the DescribeLoadBalancers operation (reached max retries: 2): Rate exceeded
Error:- An error occurred (Throttling) when calling the GenerateCredentialReport operation (reached max retries: 4): Rate exceeded


Cause:- These errors occur when your request rate exceeds the throttling limits AWS imposes on its services. AWS then drops the excess requests, so automation scripts may stop functioning, or some requests in a batch may never complete, which can cascade into further issues.

Solution:-
1. Create a models folder in your awscli path, i.e. ~/.aws/models

mkdir ~/.aws/models

2. Create a retry configuration with an increased retry limit inside the retry json file "~/.aws/models/_retry.json"
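A minimal sketch of such a file, assuming botocore's legacy retry-model format (the key names below come from botocore's bundled _retry.json, not from this post; verify them against your botocore version):

# Sketch based on botocore's retry model; adjust max_attempts to taste
cat > ~/.aws/models/_retry.json <<'EOF'
{
  "retry": {
    "__default__": {
      "max_attempts": 20,
      "delay": {
        "type": "exponential",
        "base": "rand",
        "growth_factor": 2
      }
    }
  }
}
EOF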

[Solved] Error: Driver 'pcspkr' is already registered, aborting

pcspkr is the PC speaker driver, so it is safe to disable. You can do so as follows:

Solution:-
echo "blacklist pcspkr" > /etc/modprobe.d/blacklist-pcspkr.conf