DEV Community

Sagar R Ravkhande

Linux Troubleshooting Scenarios - Part 1

Troubleshooting starts with understanding the issue, and it helps to follow a step-by-step approach instead of guessing. This holds whether you are a Software Developer, DevOps Engineer, or Architect. Unix/Linux is widely used, so you should know the common issues and the correct approach to resolve them.

Let’s discuss a few of them:

Issue 1: Server is not reachable or unable to connect

Approach / Solution:

├── Ping the server by Hostname and IP Address
│ ├── Hostname/IP Address is pingable
│ │ ├── The issue might be on the client side, as the server is
        reachable
│ ├── Hostname is not pingable but IP Address is pingable
│ │ ├── Could be a DNS issue
│ │ │ ├── Check /etc/hosts
│ │ │ ├── Check /etc/resolv.conf
│ │ │ ├── Check /etc/nsswitch.conf
│ │ │ ├── (Optional) DNS can also be defined in
          /etc/sysconfig/network-scripts/ifcfg-<interface>
│ ├── Neither Hostname nor IP Address is pingable
│ │ ├── Ping another server on the same network to see whether the
        problem is network-wide
│ │ │ ├── Other server is pingable: the issue is with that
          host/server, not the network
│ │ │ ├── Other server is not pingable: might be a network-wide
          issue
│ │ ├── If the server is powered on, log in through the virtual
        console and check the uptime
│ │ ├── Check that the server has an IP address and that the network
        interface is UP
│ │ │ ├── (Optional) Also check IP-related information in
          /etc/sysconfig/network-scripts/ifcfg-<interface>
│ │ ├── Ping the gateway and check the routes
│ │ ├── Check SELinux and firewall rules
│ │ ├── Check the physical cable connection
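
The first branches of the tree above can be sketched in Python. This is only a rough illustration, not a definitive tool: the function names and the result labels are hypothetical, `can_ping` assumes the Linux iputils `ping` flags (`-c` for count, `-W` for the reply timeout in seconds), and the "address-mismatch" case is an extra that the tree does not cover.

```python
import socket
import subprocess
from typing import Optional


def can_ping(target: str, timeout_s: int = 2) -> bool:
    """Send one echo request using the system ping (Linux iputils flags)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), target],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def resolve(hostname: str) -> Optional[str]:
    """Return the IPv4 address the local resolver gives, or None on failure."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:
        return None


def classify(host_ok: bool, ip_ok: bool,
             neighbor_ok: Optional[bool] = None) -> str:
    """Map ping results onto the branches of the tree (labels are made up)."""
    if host_ok and ip_ok:
        return "client-side"
    if ip_ok:
        # IP answers but the name does not: look at DNS and /etc/hosts.
        return "dns"
    if host_ok:
        # Not a branch in the tree: the name answers but the IP does not,
        # so the IP under test is probably not what the name resolves to.
        return "address-mismatch"
    if neighbor_ok is None:
        return "ping-a-neighbor"
    # A reachable neighbor points at the host itself, otherwise the network.
    return "host" if neighbor_ok else "network"
```

For example, `classify(can_ping("app01.example.com"), can_ping("192.0.2.10"))` (both targets are placeholders) returns `"dns"` when only the IP address answers, which is the cue to look at /etc/hosts, /etc/resolv.conf and /etc/nsswitch.conf.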


Issue 2: Unable to connect to a website or an application

Approach / Solution:

├── Ping the server by Hostname and IP Address
│ ├── False: Follow the Issue 1 diagram above, "Server is not
      reachable or unable to connect"
│ ├── True: Check the service availability by using the telnet
      command against the service port
│ │ ├── True: Service is running and reachable
│ │ ├── False: Service is not reachable or not running
│ │ │ ├── Check the service status using systemctl or other
          commands
│ │ │ ├── Check the firewall/SELinux
│ │ │ ├── Check the service logs
│ │ │ ├── Check the service configuration
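
The port check in the tree above does not strictly need telnet. As a stand-in, the sketch below opens a plain TCP connection with Python's `socket.create_connection`, which answers the same question: is something listening on that port and reachable from here? The function names and the wording of the returned steps are hypothetical.

```python
import socket
from typing import List


def port_open(host: str, port: int, timeout_s: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def next_steps(ping_ok: bool, port_ok: bool) -> List[str]:
    """Return the follow-up checks from the tree for the given results."""
    if not ping_ok:
        return ["follow the 'Server is not reachable' diagram"]
    if port_ok:
        return ["service is reachable; look at the application or client"]
    return [
        "check the service status (systemctl status <service>)",
        "check the firewall/SELinux",
        "check the service logs",
        "check the service configuration",
    ]
```

A call such as `port_open("www.example.com", 443)` (placeholder host) returns `False` both when the service is down and when a firewall drops the traffic, so the follow-up checks in the tree are still needed to tell those cases apart.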


Issue 3: Unable to SSH as root or any other user

Approach / Solution:

├── Ping the server by Hostname and IP Address
│ ├── False: Follow the Issue 1 diagram above, "Server is not
      reachable or unable to connect"
│ ├── True: Check the service availability by using the telnet
      command against the SSH port
│ │ ├── True: Service is running and reachable
│ │ │ ├── Issue might be on the client side
│ │ │ ├── User might be disabled or have a no-login shell, root
          login might be disabled, or other configuration might
          block access
│ │ ├── False: Service is not reachable or not running
│ │ │ ├── Check the service status using systemctl or other
          commands
│ │ │ ├── Check the firewall/SELinux
│ │ │ ├── Check the service logs
│ │ │ ├── Check the service configuration
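
Two of the server-side causes in the tree above (a no-login shell and disabled root login) can be checked by reading /etc/passwd and /etc/ssh/sshd_config. The sketch below works on text you pass in, so it can be tried on copies of those files. The function names are hypothetical, the list of no-login shells is an assumption that covers common distributions, and `Match` blocks and `Include` files are deliberately not handled.

```python
from typing import Optional

# Assumed set of shells that refuse interactive logins.
NOLOGIN_SHELLS = {"/sbin/nologin", "/usr/sbin/nologin", "/bin/false"}


def shell_blocks_login(passwd_line: str) -> bool:
    """True if the 7th field of an /etc/passwd entry is a no-login shell."""
    fields = passwd_line.strip().split(":")
    return len(fields) == 7 and fields[6] in NOLOGIN_SHELLS


def sshd_option(config_text: str, keyword: str) -> Optional[str]:
    """Return the first value set for keyword, skipping comments.

    sshd uses the first value it finds for a keyword, so later duplicates
    are ignored here as well. Parsing stops at the first Match block.
    """
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if parts[0].lower() == "match":
            break
        if parts[0].lower() == keyword.lower() and len(parts) == 2:
            return parts[1].strip()
    return None
```

If `sshd_option(text, "PermitRootLogin")` returns `"no"`, root logins over SSH are refused no matter what the client does. A result of `None` only means the keyword is not set in that file, so the sshd default applies.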


Issue 4: Disk space is full, or disk space needs to be added/extended

Approach / Solution:

├── Detect system performance degradation
│ ├── Application getting slow/unresponsive
│ ├── Commands fail to run (for example, because the / filesystem
      is full)
│ ├── Logs cannot be written, among other symptoms
├── Analyse the issue
│ ├── Use the df command to find the filesystem that is out of space
├── Action
│ ├── After finding the specific filesystem, use the du command in
      that filesystem to find which files/directories are large
│ ├── Compress/remove big files
│ ├── Move the items to another partition/server
│ ├── Check the health status of the disks using the badblocks
      command (for example, # badblocks -v /dev/sda)
│ ├── Check which devices are I/O bound using iostat (iotop or
      pidstat -d shows per-process I/O)
│ ├── Move a file/dir to another filesystem and create a link to it
├── New disk addition
│ ├── Simple partition
│ │ ├── Add disk to VM
│ │ ├── Check the new disk with the lsblk command (df only lists
        mounted filesystems)
│ │ ├── Use fdisk to create the partition. An LVM partition is
        better, as it can be extended later
│ │ ├── Create filesystem and mount it
│ │ ├── Add an /etc/fstab entry to make the mount persistent
│ ├── LVM Partition
│ │ ├── Add disk to VM
│ │ ├── Check the new disk with the lsblk command
│ │ ├── Use fdisk to create an LVM partition
│ │ ├── Create the PV, VG, and LV (pvcreate, vgcreate, lvcreate)
│ │ ├── Create filesystem and mount it
│ │ ├── Add an /etc/fstab entry to make the mount persistent
│ ├── Extend LVM partition
│ │ ├── Add disk, and create LVM partition
│ │ ├── Add LVM partition (PV) to the existing VG
│ │ ├── Extend LV and resize the filesystem
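
The df and du steps in the tree above can be approximated from Python, and the "Extend LVM partition" branch can be written down as a command plan. This is a sketch under a few assumptions: the function names are hypothetical, `used_percent` will not match the Use% column of df exactly because df accounts for reserved blocks, `largest_files` reports apparent file sizes and crosses mount points (unlike `du -x`), and `extend_lv_plan` only builds command strings for review and never runs them.

```python
import os
import shutil
from typing import List, Tuple


def used_percent(path: str) -> float:
    """Rough equivalent of df's Use% for the filesystem holding path."""
    total, used, _free = shutil.disk_usage(path)
    return 100.0 * used / total if total else 0.0


def largest_files(root: str, count: int = 10) -> List[Tuple[int, str]]:
    """Walk root and return the biggest regular files as (bytes, path)."""
    sizes = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            try:
                if not os.path.islink(full):
                    sizes.append((os.path.getsize(full), full))
            except OSError:
                continue  # file vanished or is unreadable
    return sorted(sizes, reverse=True)[:count]


def extend_lv_plan(partition: str, vg: str, lv: str) -> List[str]:
    """Commands for extending an LV with a new PV; built, not executed."""
    return [
        f"pvcreate {partition}",                     # turn the partition into a PV
        f"vgextend {vg} {partition}",                # add the PV to the existing VG
        f"lvextend -r -l +100%FREE /dev/{vg}/{lv}",  # grow the LV and its filesystem
    ]
```

For instance, `largest_files("/var/log", 5)` lists the five biggest files under /var/log, and `extend_lv_plan("/dev/sdb1", "data_vg", "data_lv")` prints a plan for made-up device and volume names that you would replace with your own after checking them with lsblk, vgs and lvs.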


Issue 5: Filesystem corrupted

Approach / Solution:

├── Filesystem corruption is one of the errors that can leave the
    system unable to boot
├── Check /var/log/messages, dmesg, and other log files
├── If the logs show bad sectors, run fsck
│ ├── True:
│ │ ├── Reboot the system into rescue mode by booting it from the
        installation ISO attached as a CD-ROM
│ │ ├── Proceed with option 1, which mounts the original root
        filesystem under /mnt/sysimage
│ │ ├── Edit the fstab entries, or create a new fstab with the help
        of blkid, and reboot

