Posts

Showing posts from 2022

[Solved] sqlalchemy.exc.OperationalError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '127.0.0.1'

Issue:- When launching a container from the application image, the application needs to connect to the MySQL database running on the host machine. But when you try to connect using localhost or 127.0.0.1, you get the following error.
Error:- sqlalchemy.exc.OperationalError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '127.0.0.1' ([Errno 111] Connection refused)")
Effect:- The application container went down because the application could not connect to the MySQL database.
Resolution:- Since the MySQL database is running on the host machine, use --network=host, which makes the container share the host's network stack instead of Docker's bridge network; the container can then reach the host database because both are on the same network. docker run -d --network=host project_app1:latest
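A minimal sketch of that fix plus a quick verification, assuming hypothetical container and image names (app1, project_app1) and the default MySQL port 3306:

docker run -d --name app1 --network=host project_app1:latest
# On the host: confirm MySQL is actually listening on 127.0.0.1:3306 (with host networking this is the
# same listener the container sees).
ss -ltn | grep 3306
# The SQLAlchemy connection error should no longer appear in the application logs.
docker logs app1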

[Solved] Failed to pull image rpc error: code = Unknown desc = context deadline exceeded

Issue:- When creating a tomcat:9 pod in minikube, the image pull failed with the error below.
Error:- Warning  Failed  58s  kubelet  Failed to pull image "tomcat:9": rpc error: code = Unknown desc = context deadline exceeded
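The excerpt stops before the post's fix; two workarounds commonly used for image-pull timeouts in minikube, offered here as assumptions rather than the post's exact resolution:

# Pull the image on the host and load it into the minikube node, bypassing the in-cluster pull timeout:
docker pull tomcat:9
minikube image load tomcat:9
# Or pull directly inside the minikube node (assumes the Docker container runtime):
minikube ssh -- docker pull tomcat:9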

Detailed overview of the ISTIO Service Mesh


[Solved] warning: containerd.io.rpm: error: Failed dependencies:container-selinux >= 2:2.74 is needed by containerd.io

Issue:- When installing the containerd RPM on CentOS 7, a dependency issue related to container-selinux prevents containerd from being installed.
Error:- [root@kubemaster ~]# rpm -ivh containerd.io-1.6.8-3.1.el7.x86_64.rpm
warning: containerd.io-1.6.8-3.1.el7.x86_64.rpm: Header V4 RSA/SHA512 Signature, key ID 621e9f35: NOKEY
error: Failed dependencies:
container-selinux >= 2:2.74 is needed by containerd.io-1.6.8-3.1.el7.x86_64
Effect:- Was not able to install containerd on CentOS 7.
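The excerpt ends before the resolution; a likely route (an assumption on my part) is to install container-selinux from the CentOS 7 extras repository first and then retry the local RPM:

# container-selinux is shipped in the CentOS 7 "extras" repo:
yum install -y --enablerepo=extras container-selinux
# Retry the containerd package once the dependency is satisfied:
rpm -ivh containerd.io-1.6.8-3.1.el7.x86_64.rpm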

[Solved] No package containerd available.

Issue:- When installing containerd on CentOS 7 using the yum package manager, it gives the error mentioned below.
Error:- No package containerd available.
Effect:- Was not able to install containerd on CentOS 7.
Resolution:- Download the RPM for containerd from the following link: https://download.docker.com/linux/centos/7/x86_64/stable/Packages/
wget https://download.docker.com/linux/centos/7/x86_64/stable/Packages/containerd.io-1.6.8-3.1.el7.x86_64.rpm
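To finish the install after the download, a short sketch of the usual follow-up steps (assumed, not quoted from the post):

# yum localinstall resolves the remaining dependencies (e.g. container-selinux) from the enabled repos:
yum localinstall -y containerd.io-1.6.8-3.1.el7.x86_64.rpm
systemctl enable containerd
systemctl start containerd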

Generating a token in Kubernetes using the kubeadm command for adding worker nodes

Issue:- Kubeadm prints a join command, including a token, when you first create a Kubernetes cluster. But what if you don't have that token handy later, when you need to add worker nodes to increase cluster capacity?
Solution:- You can run the following command, which generates the full join command that can be used to add worker nodes to the master in the future.
[centos@kubemaster ~]$ kubeadm token create --print-join-command
kubeadm join 172.31.98.106:6443 --token ix1ien.29glfz1p04d7ymtd --discovery-token-ca-cert-hash sha256:1f202db500d698032d075433176dd62f5d0074453daa12ccdfffd637a966a771
Once the token has been generated, run the printed command on the worker node to add it to the Kubernetes cluster.
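As a quick sanity check before running the join on the worker (a small illustrative addition, not from the post):

# List existing bootstrap tokens and their expiry on the control-plane node:
kubeadm token list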

[Solved] PersistentVolumeClaim pending while installing Elasticsearch using Helm

Issue:- When installing Elasticsearch using Helm, the Elasticsearch container fails: the PersistentVolumeClaims for the multi-master nodes stay in the Pending state, and the container remains Pending as well.
Error:- The persistent volume claim remains in the Pending state.
Effect:- Was not able to install Elasticsearch because the persistent volume claim was never ready for Elasticsearch.
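Typical first diagnostics for a Pending claim (illustrative commands, not necessarily the post's resolution; replace <pending-claim-name> with the claim reported by kubectl get pvc):

kubectl get pvc
# The Events section usually says why it is stuck, e.g. "no persistent volumes available" or a missing StorageClass:
kubectl describe pvc <pending-claim-name>
# Dynamic provisioning needs a StorageClass with a working provisioner (ideally marked as default):
kubectl get storageclass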

[Solved] stacktrace":ElasticsearchException[failed to bind service]; nested: AccessDeniedException[/usr/share/elasticsearch/data/nodes];

Issue:- When installing Elasticsearch using Helm, the Elasticsearch container fails with the exception AccessDeniedException[/usr/share/elasticsearch/data/nodes];
Error:- "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-0", "message": "uncaught exception in thread [main]", "stacktrace": ["org.elasticsearch.bootstrap.StartupException: ElasticsearchException[failed to bind service]; nested: AccessDeniedException[/usr/share/elasticsearch/data/nodes];"
Effect:- Was not able to install Elasticsearch; the Elasticsearch pod keeps crashing again and again because the health check does not pass and the failing liveness probe restarts the pod repeatedly.
Resolution:- Follow these steps to resolve the issue:
1. The issue occurs because the elasticsearch user does not have permission on the /usr/share/elasticsearch/data/nodes directory.
2. But you cannot directly use kubectl ...
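A sketch of a common way to clear this permission error, assuming the volume is backed by a hostPath/local directory on the node and the official image's elasticsearch user (uid/gid 1000); the path below is hypothetical and this is not necessarily the post's exact procedure:

# On the node that backs the PersistentVolume:
sudo chown -R 1000:1000 /data/elasticsearch
# Delete the crashing pod so the StatefulSet recreates it against the corrected directory:
kubectl delete pod elasticsearch-master-0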

Terraform variables, input, output, and local variable theory Part2


[Solved] too early for operation, device not yet seeded or device model not acknowledged

Issue:- When installing terragrunt using snap, got the following error.
Error:- error: too early for operation, device not yet seeded or device model not acknowledged
Effect:- Was not able to install terragrunt; the installation failed at that point.
[root@aafe920be71c ~]# snap install terragrunt
error: too early for operation, device not yet seeded or device model not acknowledged
Resolution:- Follow these steps to resolve the issue:
1. Check the status of the snapd service, which was inactive in my case:
[root@aafe920be71c ~]# systemctl status snapd.seeded.service
● snapd.seeded.service - Wait until snapd is fully seeded
Loaded: loaded (/usr/lib/systemd/system/snapd.seeded.service; disabled; vendor preset: disabled)
Active: inactive (dead)
2. Now start the snapd service:
[root@aafe920be71c ~]# systemctl status snapd.seeded.service
● snapd.seeded.service - Wait until snapd is fully seeded
Loaded: loaded (/usr/lib/syst...
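Since the excerpt is cut off, here is a sketch of the commands typically used at step 2 (my assumption, not a quote from the post):

# Enable and start snapd, wait for the device to be seeded, then retry the install:
systemctl enable snapd.service snapd.seeded.service
systemctl start snapd.service snapd.seeded.service
snap wait system seed.loaded
snap install terragrunt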

Terraform theory part1

Image

[Resolved] ERROR Uncaught exception in thread 'kafka-admin-client-thread | adminclient-1': (org.apache.kafka.common.utils.KafkaThread) java.lang.OutOfMemoryError: Java heap space

Issue:- When trying to delete a topic in the Amazon MSK Kafka cluster, got the following error.
Error:- ERROR Uncaught exception in thread 'kafka-admin-client-thread | adminclient-1': (org.apache.kafka.common.utils.KafkaThread) java.lang.OutOfMemoryError: Java heap space
Effect:- Was not able to delete the topic in the MSK Kafka cluster due to the above error message.
ERROR Uncaught exception in thread 'kafka-admin-client-thread | adminclient-1': (org.apache.kafka.common.utils.KafkaThread) java.lang.OutOfMemoryError: Java heap space
at java.base/java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:61)
at java.base/java.nio.ByteBuffer.allocate(ByteBuffer.java:348)
at org.apache.kafka.common.memory.MemoryPool$1.tryAllocate(MemoryPool.java:30)
at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:112)
at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:424)
at org.apache.kafka.common.network.KafkaChannel.r...
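Two mitigations commonly tried for this error, offered as assumptions since the excerpt stops before the post's resolution: give the Kafka CLI tools a larger heap, and make sure the client's security settings match the broker port (this OutOfMemoryError in the admin client is a frequent symptom of a plaintext client hitting a TLS listener). The broker hostname, topic, and properties file below are hypothetical:

# Raise the heap used by the Kafka CLI tools before retrying:
export KAFKA_HEAP_OPTS="-Xms256M -Xmx1G"
# Pass a client config whose security.protocol matches the port you connect to (9094 is MSK's TLS port):
bin/kafka-topics.sh --bootstrap-server b-1.example.kafka.us-east-1.amazonaws.com:9094 \
  --command-config client-ssl.properties --delete --topic my-topic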

[Resolved] default.svc.cluster.local: Name or service not known

Issue:- After creating a service, when I tried to verify whether the DNS name for the service resolves, I got the following error.
Error:- my-service.default.svc: Name or service not known
Effect:- I was unable to confirm whether the service DNS was actually resolving or whether there was some other issue, as the service itself was not accessible via curl or the browser.
[centos@kubemaster service]$ nslookup my-service.default.svc
-bash: nslookup: command not found
[centos@kubemaster service]$ dig nslookup my-service.default.svc
-bash: dig: command not found
[centos@kubemaster service]$ ping nslookup my-service.default.svc
ping: my-service.default.svc: Name or service not known
[centos@kubemaster service]$ ping my-service.default.svc
ping: my-service.default.svc: Name or service not known
Resolution:- Follow these steps:
1. Create a pod with the DNS utilities installed on it so that the nslookup command can be run inside the pod. kubect...
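A minimal sketch of step 1 (the image is the dnsutils image used in the Kubernetes DNS-debugging documentation; note that service DNS names only resolve from inside the cluster, not from the node's shell):

kubectl run dnsutils --image=registry.k8s.io/e2e-test-images/jessie-dnsutils:1.3 \
  --restart=Never --command -- sleep infinity
# Resolve the service from inside the cluster:
kubectl exec -it dnsutils -- nslookup my-service.default.svc.cluster.local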

[Resolved] groupVersion shouldn't be empty

Issue:- When creating simple resources like a Pod, ReplicaSet, Deployment, etc., got the groupVersion error specified below.
Error:- groupVersion shouldn't be empty
Effect:- Not able to create the resource because of the above error.
apiversion: v1
kind: Pod
metadata:
  name: pod2
spec:
  containers:
  - name: c1
    image: nginx
Resolution:- If you look at the above configuration closely, you will find that apiversion has been specified incorrectly. It should have been apiVersion, so a single capital letter can make the difference that causes this error. The same error will occur if you forget to mention apiVersion in the configuration or misspell it. The configuration below will work fine.
apiVersion: v1
kind: Pod
metadata:
  name: pod2
spec:
  containers:
  - name: c1
    image: nginx
Explanation:- apiVersion is hardcoded in Kubernetes, so if you misspell it, do not use it, or make a e...
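A quick way to catch this class of typo before it reaches the API server (illustrative; pod2.yaml is a hypothetical filename):

# Client-side validation flags a missing or misspelled apiVersion without creating anything in the cluster:
kubectl apply --dry-run=client --validate=true -f pod2.yaml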

Understanding Docker TOCTOU Vulnerability


[Resolved] Metric client health check failed: the server is currently unable to handle the request (get services dashboard-metrics-scraper). Retrying in 30 seconds.

Issue:- The issue is with the Dashboard service. When deploying the Dashboard in Kubernetes using the YAML, it gives the following error.
Error:- Metric client health check failed: the server is currently unable to handle the request (get services dashboard-metrics-scraper). Retrying in 30 seconds.
Effect:- Because the Dashboard is not able to reach the dashboard-metrics-scraper service, the Dashboard UI does not load and times out after a while.
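A couple of first checks commonly used here (illustrative, not necessarily the post's fix), assuming the standard kubernetes-dashboard namespace and labels from the upstream manifest:

# Confirm the metrics-scraper Deployment, Service and pod exist and are healthy:
kubectl -n kubernetes-dashboard get deploy,svc,pods -l k8s-app=dashboard-metrics-scraper
# Look at the scraper's logs for errors:
kubectl -n kubernetes-dashboard logs deploy/dashboard-metrics-scraper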

[Resolved] Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)

Issue:- When installing the metrics server in Kubernetes, getting the following error.
Error:- Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)
Effect:- Due to the above error the metrics server does not work.
[centos@kubemaster dashboard]$ kubectl top nodes
W0529 10:18:25.234815   13218 top_node.go:119] Using json format to get metrics. Next release will switch to protocol-buffers, switch early by passing --use-protocol-buffers flag
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)
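On kubeadm-style lab clusters this often comes down to metrics-server failing kubelet TLS verification; one commonly used (but insecure) workaround, stated here as an assumption rather than the post's resolution:

# Add --kubelet-insecure-tls to the metrics-server container args:
kubectl -n kube-system edit deploy metrics-server
# Once the pod is healthy the aggregated API should report Available=True:
kubectl get apiservice v1beta1.metrics.k8s.io
kubectl top nodes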

Analysing various threats in application tampering and security and SLSA mitigation of such threats


Linux Fundamentals for Devops/Developers/Beginners Part1


Understanding the Levels of assurances in SLSA - Part2


Understanding Supply chain Levels for Software Artifacts SLSA Part1


Understanding Dockershim removal by kubernetes with version v1.24


Understanding CI/CD devops pipelines with GITLAB with realworld analogy ...


[Resolved] An error occurred (Throttling) when calling the DescribeLoadBalancers operation (reached max retries: 4): Rate exceeded

Issue:- If you have a big infrastructure and a lot of automation in place, your AWS CLI calls might hit the API rate limits, which can result in an error like the one below.
Error:- An error occurred (Throttling) when calling the DescribeLoadBalancers operation (reached max retries: 4): Rate exceeded
Effect:- The command or script that you ran fails because the call rate limit for the AWS resource was reached and the retries were exhausted. If you are running the command manually you can simply run it again, but if it is part of a script that has no retry logic of its own, the failure becomes a bigger problem.
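One mitigation (an assumption here, since the excerpt does not include the post's resolution) is to let the AWS CLI retry more aggressively with adaptive backoff:

# Both variables are honoured by the AWS CLI/SDKs; adjust the attempt count to your tolerance:
export AWS_RETRY_MODE=adaptive
export AWS_MAX_ATTEMPTS=10
aws elbv2 describe-load-balancers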

Weaknesses/Limitations of gRPC Part5


Strengths of gRPC


Understanding gRPC Architecture Part3


Basic gRPC concepts Part2


Introduction to gRPC for fast and scalable api development - Part1


Understanding kubernetes components kubectl, daemon, apiserver, apiversi...


Terraform modules practices for scalable architecture implementation & i...


Terraform modules practices for scalable architecture implementation


[Resolved] from setuptools_rust import RustExtension ModuleNotFoundError: No module named 'setuptools_rust'

Issue:- Issue with the cryptography package and Rust during the Ansible installation on CentOS.
Error:- Downloading https://files.pythonhosted.org/packages/3d/5f/addb8b91fd356792d28e59a8275fec833323cb28604fb3a497c35d7cf0a3/cryptography-37.0.1.tar.gz (585kB)
100% |████████████████████████████████| 593kB 2.0MB/s
Complete output from command python setup.py egg_info:
=============================DEBUG ASSISTANCE==========================
If you are seeing an error here please try the following to successfully install cryptography:
Upgrade to the latest pip and try again. This will fix errors for most users.
See: https://pip.pypa.io/en/stable/installing/#upgrading-pip
=============================DEBUG ASSISTANCE==========================
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-build-nfv80r3s/cryptography/...
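The debug-assistance banner in the log points at the usual fix; a short sketch of that route (standard PyPI package names, not quoted from the post):

# Upgrading pip/setuptools lets a prebuilt cryptography wheel be used instead of compiling with Rust:
python3 -m pip install --upgrade pip setuptools wheel
# Only needed if a source build is still attempted:
python3 -m pip install setuptools_rust
python3 -m pip install ansible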

[Resolved] Error response from daemon: invalid MountType: "=bind"

Issue:- Unable to deploy the visualizer service in Docker Swarm.
Error:- Error response from daemon: invalid MountType: "=bind"
Effect:- The following command failed with the above error:
# docker service create --name=viz --publish=8080:8080/tcp --constraint=node.role==manager --mount=type==bind,src=/var/run/docker.sock,dst=/var/run/docker.sock dockersamples/visualizer
Resolution:-
# docker service create --name=viz --publish=8080:8080/tcp --constraint=node.role==manager --mount=type=bind,src=/var/run/docker.sock,dst=/var/run/docker.sock dockersamples/visualizer
Explanation:- Use --mount=type=bind (a single equals sign), not type==bind; the extra = makes Docker parse the mount type as "=bind", which is invalid.

All About IPAM and its use in cloud, DevOps, and VPC troubleshooting


Preventing DDOS Attacks Using AWS WAF Rule based WEBACL PART 3


Preventing DDOS Attacks Using AWS WAF Rule based WEBACL PART 2


Preventing DDOS Attacks Using AWS WAF Rule based WEBACL


Understanding Kubernetes Canary Deployment With Architecture Diagram Part-II


Understanding the Concept of Canary Deployment in Kubernetes Part-1


Troubleshooting and Logging in Distroless Images


Signing the Docker Images using Cosign - Part 4


Comparing distroless vs distro-based vs alpine Docker images on the basis of vulnerability scans


Hands-on Node Application built on a Distroless Docker Image - Part 2


Hands-on Distroless Installation from Scratch - Part 1


Understanding Distroless Container Images


[Solved] Intermittent / burst logs in the Newrelic / ELK

Issue:- Although the application was writing logs continuously and the shipper was shipping them, the logs were missing for particular periods, and bursts of logs with spikes were observed in the New Relic/ELK.
Error:- The following graph shows the actual issue of intermittent or bursty logs in ELK.
Effect:- Due to the non-availability of the logs it became difficult to troubleshoot issues, as the logs were delayed and sometimes missed entirely.
Resolution:- Printing only error logs, or the logs required for troubleshooting, helps to overcome this issue.
Explanation:- More than 1 million event logs were being posted per hour, due to which the disk was becoming a bottleneck and events were being pushed into the New Relic/ELK in bursts. Lowering the volume to error logs, or the logs required for troubleshooting, should help overcome this issue of intermittent logs in New Relic/ELK.