
Saturday, November 20, 2021

[Solved] Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)

 Issue:- 

The issue occurred when I deployed the metrics server in the Kubernetes cluster and then tried to list CPU utilization per node:

$ kubectl top nodes


Error:- 

Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)


Resolution:-

This can easily be resolved by editing the metrics-server deployment and adding hostNetwork: true after the dnsPolicy: ClusterFirst line:

kubectl edit deployments.apps -n kube-system metrics-server
hostNetwork: true
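For reference, the relevant part of the pod template after the edit looks like this (a sketch; the other fields of the metrics-server deployment are omitted):

```yaml
spec:
  template:
    spec:
      dnsPolicy: ClusterFirst
      hostNetwork: true
```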



Sunday, November 14, 2021

[Solved] UI_set_result:result too small:ui_lib.c:869:You must type in 4 to 1024 characters

 Issue:- 

The issue occurred when I tried to generate a certificate key using the following command:

sudo openssl genrsa -des3 -out server.key 1024

Error:- 

UI_set_result:result too small:ui_lib.c:869:You must type in 4 to 1024 characters

Cause:-

The -des3 flag instructs openssl to encrypt server.key with a passphrase, and openssl requires that passphrase to be between 4 and 1024 characters. Typing a shorter passphrase at the prompt produces this error. If you don't need an encrypted key, leave off the -des3 flag; the resulting key is exactly the same, only without the passphrase.

Resolution:-

So I updated the command to drop the -des3 flag:

sudo openssl genrsa -out server.key 1024
which worked fine.
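If you do want the key encrypted, you can keep -des3 and supply a passphrase of valid length (4 to 1024 characters) non-interactively. Note that pass:mysecretpass below is a placeholder passphrase of my own, not from the original post:

```shell
# Generate an encrypted RSA key, passing a valid-length passphrase on the command line
openssl genrsa -des3 -passout pass:mysecretpass -out server.key 1024
# Verify the key decrypts with that passphrase and is internally consistent
openssl rsa -in server.key -passin pass:mysecretpass -noout -check
```

The -passout option avoids the interactive prompt entirely, which also makes the command usable in scripts.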

[Solved] ssl received a record that exceeded the maximum permissible length centos

 Issue:- 

The issue occurred when I tried to set up a reverse proxy in nginx:

upstream dashboard {
    server 172.31.4.205:30545;
}

server {
    listen 443;
    listen [::]:443;

    ssl_certificate /etc/nginx/ssl/server.crt;
    ssl_certificate_key /etc/nginx/ssl/server.key;

    location / {
        proxy_set_header X-Forwarded-Host $host:$server_port;
        proxy_set_header X-Forwarded-Server $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_pass https://dashboard;
    }
}
   

Error:- 

ssl received a record that exceeded the maximum permissible length centos
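The original post stops at the error, but the config above suggests the likely cause: listen 443 without the ssl parameter makes nginx speak plain HTTP on port 443, so the browser's TLS handshake fails with exactly this error. A probable fix (my assumption, since the original does not show a resolution) is to enable TLS on the listeners:

```nginx
server {
    listen 443 ssl;
    listen [::]:443 ssl;

    ssl_certificate /etc/nginx/ssl/server.crt;
    ssl_certificate_key /etc/nginx/ssl/server.key;

    # rest of the server block unchanged
}
```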

Friday, November 5, 2021

[Solved] nginx unknown directive "upstream"

Issue:- 

The issue occurred when I was trying to use an upstream block with proxy_pass, i.e. using nginx as a reverse proxy to expose a Kubernetes application.


Error:- 

unknown directive "upstream" in /etc/nginx/nginx.conf:1
configuration file /etc/nginx/nginx.conf test failed


Cause:-

In nginx, the upstream directive is only valid in the http context.


Resolution:-

So instead of placing the upstream block at the top of nginx.conf (outside any context), use it in default.conf, which nginx includes inside the http context. I ultimately preferred to proxy the request directly in default.conf, which got rid of the upstream altogether, although the upstream also worked fine when I tested it. I made the change under /etc/nginx/sites-enabled/default.conf, which is a soft link to /etc/nginx/sites-available/default.conf.
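For reference, a minimal sketch of the working placement: the upstream block sits inside the http context (files under conf.d and sites-enabled are included there), reusing the upstream name and address from the dashboard example above:

```nginx
http {
    # valid here: upstream is an http-context directive
    upstream dashboard {
        server 172.31.4.205:30545;
    }
    include /etc/nginx/conf.d/*.conf;
}
```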

[Solved] ValidationError(Deployment.spec): missing required field "selector" in io.k8s.api.apps.v1.DeploymentSpec

 Issue:- 

The issue occurred when I tried to apply the following Deployment YAML file on Kubernetes:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: jenkins
  namespace: jenkins
spec:
  template:
    metadata:
      labels:
        app: jenkins
    spec:
      volumes:
      - name: jenkins-home
        persistentVolumeClaim:
          claimName: pvc-nfs-pv1
      containers:
      - name: jenkins
        image: jenkins/jenkins:lts
        lifecycle:
          postStart:
            exec:
              command: ["/bin/sh", "-c", "/var/jenkins_home/ip-update.sh"]
        env:
        - name: JAVA_OPTS
          value: -Djenkins.install.runSetupWizard=false
        ports:
        - name: http-port
          containerPort: 8080
        - name: jnlp-port
          containerPort: 50000
        volumeMounts:
        - name: jenkins-home
          mountPath: /var/jenkins_home

Error:- 

ValidationError(Deployment.spec): missing required field "selector" in io.k8s.api.apps.v1.DeploymentSpec
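The original post omits the resolution, but the error message says exactly what is missing: since apps/v1, a Deployment spec requires a selector whose matchLabels match the pod template labels. Adding the following under spec should fix the validation error:

```yaml
spec:
  selector:
    matchLabels:
      app: jenkins
  template:
    metadata:
      labels:
        app: jenkins
```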

Saturday, October 16, 2021

[Solved] Jenkins won't listen on port 8080 in the browser

 Issue:- 

The issue occurs when installing Jenkins on a CentOS 8 EC2 instance in AWS.

 # yum install jenkins -y


Error:- 

Installation succeeds and the jenkins service starts (verified with the netstat command), but the Jenkins installer still does not open in the browser.


Cause:-

On CentOS 8, firewalld is blocking the connection to Jenkins. You have to explicitly allow the ports in CentOS 8 to expose them. Alternatively, if you are already using a network firewall such as an AWS security group, you can simply disable firewalld, which resolves the issue.


Resolution:-

So in order to overcome this challenge, simply disable firewalld, since the security groups are already doing this work for us:

 # firewall-cmd --state 
 # systemctl stop firewalld 
 # systemctl status firewalld    
 # systemctl disable firewalld 
Restart Jenkins if required, and that should solve your issue.

[Solved] DB_RUNRECOVERY: Fatal error, run database recovery

 Issue:-

The issue occurs when installing java-1.8.0-openjdk-devel on a CentOS 8 EC2 instance in AWS.

 # yum install java-1.8.0-openjdk-devel -y
This command took some time to run, during which I thought it was stuck and pressed Ctrl+C to stop it.

When I tried to run it again, it showed that there was a pid already running.

So I killed the pid, and when I tried to install again I got the following error.


Error:-
error: rpmdb: BDB0113 Thread/process 3446/140301746613120 failed: BDB1507 Thread died in Berkeley DB library
error: db5 error(-30973) from dbenv->failchk: BDB0087 DB_RUNRECOVERY: Fatal error, run database recovery
error: cannot open Packages index using db5 -  (-30973)
error: cannot open Packages database in /var/lib/rpm
Error: Error: rpmdb open failed


Cause:-

The error occurred because the command was forcefully terminated while it was making changes to the rpmdb, before cleanup could happen, which left the rpmdb corrupted. Once the rpm database is corrupted you won't be able to use yum or rpm to install packages; you need to recreate the rpmdb to fix the issue.


Resolution:-

To recover from this error, recreate the rpm database using the following commands. Follow the steps below to back up and rebuild the rpmdb:

mkdir /var/lib/rpm/backup
cp -a /var/lib/rpm/__db* /var/lib/rpm/backup/
rm -f /var/lib/rpm/__db.[0-9][0-9]*
rpm --quiet -qa
rpm --rebuilddb
yum clean all

Tuesday, September 7, 2021

[Solved] Error: Package: jenkins-2.303.1-1.1.noarch (jenkins) Requires: daemonize --skip-broken to work

Issue:- 

The issue occurs when installing Jenkins on a CentOS 7 EC2 instance in AWS.

 # yum install jenkins -y


Error:- 

 --> Finished Dependency Resolution  
 Error: Package: jenkins-2.303.1-1.1.noarch (jenkins)  
       Requires: daemonize  
  You could try using --skip-broken to work around the problem  
  You could try running: rpm -Va --nofiles --nodigest  


Cause:-

Basically, daemonize runs a command as a daemon on CentOS. The package is missing in the CentOS version you are running, which is why this error appears. Think of it as a dependency that Jenkins requires to run; since daemonize is missing, the install fails.


Resolution:-

daemonize doesn't ship in the default repository, which is why yum cannot resolve it. Install daemonize from the EPEL (Extra Packages for Enterprise Linux) repository:

 # yum install epel-release -y  
 # yum install daemonize -y  
 Then you can continue installing Jenkins as  
 # yum install jenkins -y  



Tuesday, August 17, 2021

Terraform Certification Details

1. The duration of the Terraform exam is 1 hour.

2. You will get 50 to 60 questions and will be tested on Terraform version 0.12 and higher; if you have worked only on versions older than 0.12, note that there have been considerable changes in both syntax and logic.

3. The exam is online proctored and the whole certification process is quite hands-off.

4. You will have to register on the HashiCorp website, from where you will be redirected to the exam portal, and make sure your system meets the requirements for the online exam.

5. The certification expires 2 years from the day you passed the exam.

Friday, August 6, 2021

[Solved] kubelet isn't running or healthy.The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error

Description:-

The issue came up while I was setting up a Kubernetes cluster on an AWS CentOS VM. Although I had followed the same steps every time, this particular time they resulted in the error below.

[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.

Issue:-

If you google this issue you will find the suggestion that it occurs because of swap, but that was not the case for me; the AWS instance definitely had no swap configured.

In this particular case, Docker was using the cgroupfs cgroup driver, which I changed to systemd, and voila, it got resolved.

Create the file as follows:

# vim /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
# systemctl restart docker
# systemctl status docker

That's it, your problem should be resolved now. You can verify the driver with `docker info | grep -i cgroup`, then run the kubeadm reset command followed by kubeadm init to reinitialise your Kubernetes cluster, which will work this time.


Thursday, August 5, 2021

Preventing Googlebot from crawling certain website pages via Nginx

Sometimes you might want to stop Googlebot from crawling certain pages, and you can use the robots.txt file to disallow them.

But at times, during a migration or while testing newer changes, you allocate a small share of traffic to new endpoints to verify that things are working. The newer pages might be missing certain components that Googlebot relies on from an SEO perspective.

Also, allocating only part of the traffic to the newer pages might cause the bot to view them differently and mark them as copied content, which can affect search results.

So you can also prevent and control Googlebot from crawling pages at the nginx webserver level itself.

First, two important things:

1. Google has multiple bots; nobody knows them all, although Google gives some idea about its bots. One thing is common: they all have "google" in the user agent.

2. This is not a replacement for robots.txt; rather, we implement it because of the partitioning/allocation of a small share of traffic to the new site, which gradually increases over time. We don't want both sites to be simultaneously visible, and we will remove this once the complete migration has occurred.

So you can detect Googlebot with the help of the $http_user_agent variable that nginx provides and look for the string "google" in it. If the user agent contains "google", you can treat the request as coming from Googlebot.

Based on the above, we can control Googlebot via the user agent in nginx, and restrict or proxy particular site pages using this approach.

So in the location directive you can send a 420 error for Googlebot, and reuse this error condition in all your if statements wherever required:

 location = / {  
   error_page 420 = @google_bot;  
   # Checking for google bot  
   if ($http_user_agent ~* (google)) {  
     return 420;  
   }  
 }  
You can also proxy_pass and make Googlebot always land on the old page:
location @google_bot {
    proxy_pass $scheme://unixcloudfusion;
}

[Resolved] Kubernetes showing older version of master after successful upgrade

 Issue:- I recently upgraded my Kubernetes cluster:

 # kubeadm upgrade apply v1.21.3  
 [upgrade/config] Making sure the configuration is correct:  
 [upgrade/config] Reading configuration from the cluster...  
 [addons] Applied essential addon: CoreDNS  
 [addons] Applied essential addon: kube-proxy  
 [upgrade/successful] SUCCESS! Your cluster was upgraded to "v1.21.3". Enjoy!  
 [upgrade/kubelet] Now that your control plane is upgraded, please proceed with upgrading your kubelets if you haven't already done so.  

So the upgrade message clearly shows the cluster was upgraded to "v1.21.3" on the master node. However, when I ran the command to verify:

$ kubectl get nodes -o wide
NAME                            STATUS     ROLES                  AGE   VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION                CONTAINER-RUNTIME
k8smaster.unixcloudfusion.in    Ready      control-plane,master   9d    v1.21.2   172.31.36.208   <none>        CentOS Linux 7 (Core)   3.10.0-1160.31.1.el7.x86_64   docker://20.10.7
k8sworker1.unixcloudfusion.in   NotReady   <none>                 9d    v1.21.3   172.31.39.6     <none>        CentOS Linux 7 (Core)   3.10.0-1160.31.1.el7.x86_64   docker://20.10.7
k8sworker2.unixcloudfusion.in   Ready      <none>                 9d    v1.21.3   172.31.46.144   <none>        CentOS Linux 7 (Core)   3.10.0-1160.31.1.el7.x86_64   docker://20.10.7

Even after the update, the master still showed v1.21.2.

Resolution:-

The cluster shows the old version because you have not yet updated the kubelet; kubectl get nodes reports each node's kubelet version, which the kubelet publishes to the API server (backed by etcd, which stores all the configuration). Run the command below to update kubelet and kubectl, then restart the kubelet with `sudo systemctl daemon-reload && sudo systemctl restart kubelet`:

$ sudo yum install -y kubelet kubectl --disableexcludes=kubernetes
Loaded plugins: fastestmirror, versionlock
Loading mirror speeds from cached hostfile
 * base: mirror.centos.org
 * epel: repos.del.extreme-ix.org
 * extras: mirror.centos.org

Updated:
  kubectl.x86_64 0:1.21.3-0
  kubelet.x86_64 0:1.21.3-0

Complete!

Once the update is complete for both kubectl and kubelet, verify the version again:

$ kubectl get nodes -o wide
NAME                            STATUS     ROLES                  AGE   VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION                CONTAINER-RUNTIME
k8smaster.unixcloudfusion.in    Ready      control-plane,master   9d    v1.21.3   172.31.36.208   <none>        CentOS Linux 7 (Core)   3.10.0-1160.31.1.el7.x86_64   docker://20.10.7
k8sworker1.unixcloudfusion.in   NotReady   <none>                 9d    v1.21.3   172.31.39.6     <none>        CentOS Linux 7 (Core)   3.10.0-1160.31.1.el7.x86_64   docker://20.10.7
k8sworker2.unixcloudfusion.in   Ready      <none>                 9d    v1.21.3   172.31.46.144   <none>        CentOS Linux 7 (Core)   3.10.0-1160.31.1.el7.x86_64   docker://20.10.7
So the issue is resolved and it shows the new version.

Saturday, July 24, 2021

[Solved] Can't join Existing kubernetes cluster could not find a JWS signature in the cluster-info ConfigMap for token ID

 [Error] 

error execution phase preflight: couldn't validate the identity of the API Server: could not find a JWS signature in the cluster-info ConfigMap for token ID "nwuv63"

To see the stack trace of this error execute with --v=5 or higher


[Cause]

The token you are using in the kubeadm command to join the worker node to the existing Kubernetes cluster has expired, so you need to generate a new valid token on the master node. (You can also print the complete join command in one step with `kubeadm token create --print-join-command`.)


[Resolution]

 $ kubeadm token create  
 mq4znm.pb5so8esum3watjl  


You can check the details of the token like its expiring time as


 $ kubeadm token list  

 TOKEN                    TTL     EXPIRES               USAGES                  DESCRIPTION        EXTRA GROUPS  
 mq4znm.pb5so8esum3watjl  23h     2021-07-24T18:26:37Z  authentication,signing  <none>             system:bootstrappers:kubeadm:default-node-token  

Thursday, February 25, 2021

[Solved] Restructuring the CDN Logs

 Problem:- CloudFront logs are stored in the following format:

 distributionid-year-month-date-hour.gz  

So if you want to analyse these logs you need something like Athena, which can run your queries directly over the S3 bucket that stores them.

But Athena works best with partitioned data, which simply means storing data in a structured layout (e.g. a folder structure). This allows you to restrict Athena to the limited data you want to analyze; otherwise, by default, it will scan the entire dataset and cost you more while reading GBs of data you don't need.

By default Athena tries to "read all" the data, but if you have partitioned it like year/month/day then you can register partitions like:

  year=2021/month=02/day=25 -- s3://logs/2021/02/25  
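In Athena, registering such a partition can be done with ALTER TABLE; a sketch, where the table name cloudfront_logs is taken from the query below and the bucket path follows the mapping above:

```sql
ALTER TABLE cloudfront_logs ADD IF NOT EXISTS
  PARTITION (year = '2021', month = '02', day = '25')
  LOCATION 's3://logs/2021/02/25/';
```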

This allows you to simply use a WHERE clause on the partition keys to restrict Athena to the data you are interested in:

  SELECT uri, count(1)   
   FROM cloudfront_logs  
   WHERE status = 404   
    AND (year || month || day || hour) > '20200225'  

Sunday, February 7, 2021

[Solved] cannot load such file -- puppetserver/ca/cli (LoadError)

Issue:- When trying to list the CA certificates in Puppet, I get the error below:
 # /opt/puppetlabs/bin/puppetserver ca list 
Ignoring executable-hooks-1.3.2 because its extensions are not built. Try: gem pristine executable-hooks --version 1.3.2 
Ignoring gem-wrappers-1.3.2 because its extensions are not built. Try: gem pristine gem-wrappers --version 1.3.2 
Traceback (most recent call last): 
 2: from /opt/puppetlabs/server/apps/puppetserver/cli/apps/ca:3:in `<main>'
 1: from /opt/puppetlabs/puppet/lib/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require' 
/opt/puppetlabs/puppet/lib/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require': cannot load such file -- puppetserver/ca/cli (LoadError) 


Resolution:- The above error occurs when you are not using the sudo command. Even if you are running as root, you still need to use sudo to resolve this issue:

 # sudo /opt/puppetlabs/bin/puppetserver ca list 
Requested Certificates: puppetclient.unixcloudfusion.in (SHA256) B3:51:2A:59:CE:68:29:B0:68:9B:C1:53:59:28:97:45:AE:B6:61:97:64:DE:AE:64:40:D7:BE:93:78:65:42:1D