Runbook to resolve some of the most common issues in Linux

Sunday, August 26, 2018

Check inode usage on the affected file system:

df -ih
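To pick out the affected file systems automatically, the df output can be filtered on its IUse% column. The sketch below assumes GNU df with the -P (one line per file system) and -i options; inode_hogs and the 80 percent threshold are illustrative, not part of the original runbook.

```shell
# inode_hogs THRESHOLD
# Reads `df -Pi` output on stdin and prints "<mount point> <IUse%>" for every
# file system whose inode usage is at or above THRESHOLD percent.
# Hypothetical helper; mount points containing spaces are not handled.
inode_hogs() {
    awk -v limit="$1" '
        NR > 1 && $5 != "-" {      # skip the header and entries without inode data
            use = $5
            sub(/%/, "", use)
            if (use + 0 >= limit + 0) print $6, $5
        }'
}
```

Usage: df -Pi | inode_hogs 80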

List the most recently modified files under the path showing high inode usage ($1 is the path to inspect, for example a script argument):

find "$1" -type f -print0 | xargs -0 stat --format '%Y :%y %n' | sort -nr | cut -d: -f2- | head

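Find the directories that hold the most files. Note that count_files is not a standard utility; the command assumes a helper of that name on the PATH that prints the file count for the directory it is given.

find . -type d -print0 | xargs -0 -n1 count_files | sort -n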
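A minimal sketch of a count_files helper, which the runbook uses but does not define. The output format (count, then directory) is an assumption chosen so that sort -n orders by count, and count_files_tree is a hypothetical wrapper that stands in for the xargs call, since xargs cannot invoke shell functions.

```shell
# count_files DIR
# Hypothetical helper: prints "<count> <DIR>" for the regular files located
# directly inside DIR (subdirectories are counted on their own).
# File names containing newlines would be miscounted.
count_files() {
    printf '%s %s\n' "$(find "$1" -maxdepth 1 -type f | wc -l | tr -d ' ')" "$1"
}

# count_files_tree [ROOT]
# Applies count_files to every directory under ROOT (default: current
# directory) and sorts by count, so the fullest directories come last.
count_files_tree() {
    find "${1:-.}" -type d | while IFS= read -r dir; do
        count_files "$dir"
    done | sort -n
}
```

Usage: count_files_tree /var | tail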

Count the inodes under each top-level directory:

for i in /*; do echo "$i"; find "$i" | wc -l; done
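The loop above prints each count on a separate line from its directory and also descends into other mounted file systems. A variant that stays on one file system and sorts the result might look like this; inodes_per_dir is a hypothetical name, and -xdev is the standard find option that prevents crossing mount points.

```shell
# inodes_per_dir [ROOT]
# Prints "<entries> <directory>" for each directory directly under ROOT
# (default: /), largest first. Every file, directory, or link found counts
# as one entry, which approximates the inodes in use (hard links to the
# same file are counted once per name). Hidden directories are skipped.
inodes_per_dir() {
    root="${1:-/}"
    for dir in "${root%/}"/*/; do
        [ -d "$dir" ] || continue
        printf '%s %s\n' "$(find "$dir" -xdev | wc -l | tr -d ' ')" "${dir%/}"
    done | sort -rn
}
```

Usage: inodes_per_dir / | head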

Check memory usage on the server (free -m reports in MiB, free -g in GiB).

htop

free -m

free -g
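free takes its figures from /proc/meminfo, so the same check can be scripted. The sketch below reports available memory as a percentage of the total; it assumes a kernel that exposes MemAvailable (3.14 or later), and mem_available_pct is an illustrative name.

```shell
# mem_available_pct [MEMINFO_FILE]
# Prints MemAvailable as a whole-number percentage of MemTotal.
# Reads /proc/meminfo unless another file in the same format is given.
mem_available_pct() {
    awk '
        /^MemTotal:/     { total = $2 }
        /^MemAvailable:/ { avail = $2 }
        END { if (total > 0) printf "%d\n", avail * 100 / total }
    ' "${1:-/proc/meminfo}"
}
```

Usage: mem_available_pct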

If a large share of memory is occupied by cache, clear it with the following commands (run as root), after checking with your vertical.

sync

echo 3 > /proc/sys/vm/drop_caches

List the processes consuming the most memory:

ps -eo pid,ppid,cmd,%mem,%cpu --sort=-%mem | head

Count the open files per process (PID and command name), highest first:

lsof | awk '{ print $2 " " $1; }' | sort -rn | uniq -c | sort -rn | head -20

Check whether the number of currently open files exceeds the system-wide limit by comparing the output of

lsof | wc -l

with

cat /proc/sys/fs/file-max
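lsof | wc -l tends to overstate the count, because lsof also lists entries such as memory-mapped files and current directories that do not consume file handles. The kernel's own counter is in /proc/sys/fs/file-nr, whose three fields are the allocated handles, the allocated but unused handles, and the maximum (the same value as file-max). A minimal sketch that turns it into a usage percentage, with file_handle_usage as a hypothetical name:

```shell
# file_handle_usage [FILE_NR]
# Prints "<allocated> <maximum> <percent used>" from /proc/sys/fs/file-nr,
# or from another file in the same three-field format.
file_handle_usage() {
    awk '{
        pct = ($3 > 0) ? int($1 * 100 / $3) : 0
        print $1, $3, pct "%"
    }' "${1:-/proc/sys/fs/file-nr}"
}
```

Usage: file_handle_usage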

Check the hard and soft per-process limits as well.

ulimit -Hn

ulimit -Sn

Check the I/O wait on the server.

iotop

pidstat -d 2 5

iostat -txk 5
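The iowait figure these tools report can also be derived from the aggregate cpu line of /proc/stat, where the fifth counter is the time spent waiting for I/O. A sketch that compares two snapshots taken a few seconds apart; iowait_pct and the snapshot file arguments are illustrative.

```shell
# iowait_pct SNAPSHOT1 SNAPSHOT2
# Each snapshot is a file holding the first ("cpu") line of /proc/stat.
# Prints the percentage of CPU time spent in iowait between the two.
iowait_pct() {
    awk '
        $1 == "cpu" {
            total = 0
            # user, nice, system, idle, iowait, irq, softirq, steal
            for (i = 2; i <= 9 && i <= NF; i++) total += $i
            if (NR == FNR) { t0 = total; w0 = $6 }   # first snapshot
            else           { t1 = total; w1 = $6 }   # second snapshot
        }
        END { if (t1 > t0) printf "%.1f\n", (w1 - w0) * 100 / (t1 - t0) }
    ' "$1" "$2"
}
```

Usage: head -n 1 /proc/stat > /tmp/stat.1; sleep 5; head -n 1 /proc/stat > /tmp/stat.2; iowait_pct /tmp/stat.1 /tmp/stat.2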

Check that the file system has not run out of inodes.

for i in /*; do echo "$i"; find "$i" | wc -l; done

df -ih

Check dmesg to see which processes are performing block reads/writes or dirtying inodes.

Also check the nofile limit in /etc/security/limits.conf; a process could be requesting more files than it is permitted to open.
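Entries in limits.conf have four columns: domain, type (soft or hard), item, and value. A small sketch that lists only the nofile entries and skips comments; nofile_limits is a hypothetical name, and the default path assumes the usual PAM location, /etc/security/limits.conf (a drop-in file under /etc/security/limits.d/ can be passed as the argument instead).

```shell
# nofile_limits [LIMITS_FILE]
# Prints "<domain> <type> <value>" for each uncommented nofile entry.
nofile_limits() {
    awk '!/^[[:space:]]*#/ && $3 == "nofile" { print $1, $2, $4 }' \
        "${1:-/etc/security/limits.conf}"
}
```

Usage: nofile_limits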

Check CPU usage and the load average.

htop
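A load average is easier to judge relative to the number of CPU cores. The sketch below divides the 1-minute figure from /proc/loadavg by a core count; load_per_core is an illustrative name, and reading a result well above 1.0 as a sign of saturation is a rule of thumb rather than a hard limit.

```shell
# load_per_core CORES [LOADAVG_FILE]
# Prints the 1-minute load average divided by CORES, to two decimals.
# Reads /proc/loadavg unless another file in the same format is given.
load_per_core() {
    awk -v cores="$1" 'cores > 0 { printf "%.2f\n", $1 / cores }' \
        "${2:-/proc/loadavg}"
}
```

Usage: load_per_core "$(nproc)"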

Use pstree to look for any suspicious processes or unusually high number of a particular service. You can compare the process listing with a similarly loaded server to do a quick check.

Use netstat to look for any suspicious connections, or too many connections from one particular IP.
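Counting connections per remote address makes a single noisy IP stand out. The sketch below assumes the column layout of netstat -ntu (numeric TCP and UDP sockets, with the foreign address in the fifth column); conns_per_ip is a hypothetical name.

```shell
# conns_per_ip
# Reads `netstat -ntu` output on stdin and prints "<count> <remote address>",
# busiest address first. The port is stripped from the foreign address.
conns_per_ip() {
    awk '$1 ~ /^(tcp|udp)/ { sub(/:[0-9]+$/, "", $5); print $5 }' |
        sort | uniq -c | sort -rn |
        awk '{ print $1, $2 }'     # drop the padding added by uniq -c
}
```

Usage: netstat -ntu | conns_per_ip | head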

List the processes consuming the most CPU:

ps -eo pcpu,pid,user,args --sort=-pcpu | head

Check the free space on the affected file system.

df -h

For a detailed breakdown, use the du command:

du -sh * | sort -hr | head -n10

Check for large files that are still open but have been deleted from the file system.

lsof -nP | grep '(deleted)'

If system logs or Nginx logs are taking up much of the space, run logrotate.

logrotate -f /etc/logrotate.d/nginx

Check the health status of the Elasticsearch cluster with an API call.

curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'

If the status is yellow:

All primary shards are allocated, but at least one replica is missing. No data is missing, so search results will still be complete. However, your high availability is compromised to some degree. If more shards disappear, you might lose data. Think of yellow as a warning that should prompt investigation.

If the status is red:

At least one primary shard (and all of its replicas) is missing. This means that you are missing data: searches will return partial results, and indexing into that shard will return an exception.
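For use in a script, the status field can be pulled out of the health response and mapped to the cases described above. This is a sketch that uses sed rather than a JSON parser, so it assumes the response contains a "status" field in the usual form; es_health_status and es_health_action are hypothetical names.

```shell
# es_health_status
# Reads a _cluster/health JSON response on stdin and prints the status
# (green, yellow, or red).
es_health_status() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# es_health_action STATUS
# Prints a one-line summary of what the status means.
es_health_action() {
    case "$1" in
        green)  echo "all shards allocated" ;;
        yellow) echo "replica missing: investigate" ;;
        red)    echo "primary shard missing: data unavailable" ;;
        *)      echo "unknown status" ;;
    esac
}
```

Usage: es_health_action "$(curl -s 'http://localhost:9200/_cluster/health' | es_health_status)"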
