Linux Network Troubleshooting for Beginners: From ‘Cannot Connect’ to Root Cause in 10 Minutes
Table of Contents
- The Art of Systematic Network Troubleshooting
- The Troubleshooting Stack: Work From Bottom to Top
- Step 1: Is the Network Interface Up?
- Step 2: Can You Reach the Gateway?
- Step 3: Can You Reach the Internet?
- Step 4: Can You Reach the Specific Server? Use ping and traceroute
- Step 5: Is the Service Running and Listening?
- Step 6: Is DNS Resolving Correctly?
- Step 7: Is a Firewall Blocking the Connection?
- Step 8: Debug the Application Layer with curl -v
- A Complete Real-World Scenario: SSH Not Connecting
- Your Network Troubleshooting Quick Reference
The Art of Systematic Network Troubleshooting
“I can’t connect to the server.” Six words that could mean a hundred different things. Is the server down? Is the network route broken? Is a firewall blocking you? Is the service not running? Is DNS failing? Is it a TLS certificate issue? Each of these has a completely different fix, and if you start guessing randomly, you will waste enormous amounts of time.
Professional network troubleshooting is not about trying random things until something works. It is about systematically eliminating possible causes from the bottom of the networking stack to the top, from physical connectivity up through the application layer. This guide gives you an exact, repeatable process that will get you from “cannot connect” to root cause in 10 minutes or less.
The Troubleshooting Stack: Work From Bottom to Top
Network problems exist at different layers, and each layer depends on the one below it. There is no point debugging DNS if the network interface is down. There is no point checking if a service is running if there is a firewall blocking all traffic to the server. Always troubleshoot from the lowest layer upward:
- Is the network interface up and configured?
- Can I reach the local gateway (next hop)?
- Can I reach the internet (external connectivity)?
- Can I reach the specific server?
- Is the service running and listening?
- Is DNS resolving correctly?
- Is a firewall blocking the connection?
- Is the application responding correctly?
Each step in this list depends on all the previous steps succeeding. Let’s work through each one.
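The checklist above can be sketched as a small script that runs each probe in order and flags the first failing layer. The helper name check and the specific probes are illustrative, not a standard tool:

```shell
#!/bin/sh
# Run the bottom-up checks in order and report each result.
# The helper name `check` and the probes chosen are illustrative.
check() {
  desc=$1; shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS: $desc"
  else
    echo "FAIL: $desc  (start debugging at this layer)"
  fi
}

check "interface has an IP address"  sh -c 'ip addr show | grep -q "inet "'
check "internet reachable (8.8.8.8)" ping -c 1 -W 2 8.8.8.8
check "DNS resolves (example.com)"   getent hosts example.com
```

The value of a script like this is not automation for its own sake: it forces the checks to run in stack order, so the first FAIL line points at the layer to debug.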
Step 1: Is the Network Interface Up?
The most basic check: is your network interface even configured and active?
ip addr show
# or the shorter version:
ip a
Look at the output carefully. You are looking for your main network interface (usually eth0, ens3, enp0s3, or ens192 depending on your system). The interface should show:
- UP in the flags (like <BROADCAST,MULTICAST,UP,LOWER_UP>)
- An inet line with an IP address (like inet 192.168.1.50/24)
If the interface shows DOWN instead of UP, bring it up:
ip link set eth0 up
If there is no IP address, the interface is up but not configured. Check if DHCP is running:
dhclient eth0
# or for systemd-networkd systems:
networkctl status
Also check if the link is physically connected:
ip link show eth0
# Look for: LOWER_UP meaning physical carrier is present
# If you see NO-CARRIER, the cable is unplugged or the port is down
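As a sketch, the carrier check can be scripted by grepping the flags field. The sample output line below is fabricated for illustration; on a live system you would pipe ip link show eth0 into the same grep:

```shell
# Fabricated sample of `ip link show` output for an unplugged interface
sample='3: eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 state DOWN'

# NO-CARRIER in the flags means no physical link is detected
if printf '%s\n' "$sample" | grep -q 'NO-CARRIER'; then
  echo "no carrier: cable unplugged or switch port down"
fi
```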
Step 2: Can You Reach the Gateway?
If your interface is up and has an IP address, find your default gateway and ping it:
ip route show
# Look for the line starting with "default via"
# Example: default via 192.168.1.1 dev eth0
# Ping the gateway:
ping -c 4 192.168.1.1
If your gateway is unreachable, the problem is local: either your network configuration is wrong, or there is a problem between you and the gateway (a switch, a VLAN, a physical cable). On a cloud VM, an unreachable gateway usually means a network configuration error in the instance or at the cloud provider level.
If the gateway responds, you have confirmed basic local network connectivity. Move up the stack.
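Finding and pinging the gateway can be combined into one step. The routing line below is a made-up sample to show the parsing; on a real system you would feed ip route show default into the same awk:

```shell
# Sample `ip route` line (values are illustrative)
route_line='default via 192.168.1.1 dev eth0 proto dhcp metric 100'

# The third field of the "default via" line is the gateway IP
gw=$(printf '%s\n' "$route_line" | awk '/^default via/ {print $3; exit}')
echo "gateway: $gw"

# On a live system:
#   gw=$(ip route show default | awk '/^default via/ {print $3; exit}')
#   ping -c 4 "$gw"
```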
Step 3: Can You Reach the Internet?
Try pinging a well-known, highly reliable IP address. Use an IP address, not a hostname, to avoid DNS confusion at this step:
ping -c 4 8.8.8.8
# Google's public DNS, queried by IP to keep DNS out of the picture
If this fails but your gateway succeeds, the problem is upstream of your gateway: your ISP, your cloud provider’s routing, or a routing table issue on the gateway itself. If you are on a cloud VM, check the cloud console for network/VPC configuration.
If this succeeds, you have internet connectivity. Now check if you can reach your specific target.
Step 4: Can You Reach the Specific Server? Use ping and traceroute
ping -c 4 target.server.com
# or by IP:
ping -c 4 203.0.113.50
Pay attention to what ping output tells you. There are several distinct failure modes:
- 100% packet loss, no response: The target is either down, unreachable, or filtering ICMP ping packets.
- Partial packet loss (e.g., 25% loss): There is a flaky network path between you and the target. Something along the route is dropping packets intermittently.
- “Network is unreachable”: Your local routing table has no route to this destination.
- “Host is unreachable”: Your gateway sent back an ICMP unreachable message; it knows where the destination should be but cannot reach it.
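These failure modes can be told apart mechanically from ping's summary line. A minimal sketch, using a made-up summary string; on a live system you would capture real ping output instead:

```shell
# Sample summary line as printed by Linux ping (values are illustrative)
summary='4 packets transmitted, 3 received, 25% packet loss, time 3004ms'

# Extract the loss percentage (the only %-suffixed number in the line)
loss=$(printf '%s\n' "$summary" | grep -o '[0-9]*%' | tr -d '%')

case $loss in
  0)   echo "clean path" ;;
  100) echo "total loss: target down, unreachable, or filtering ICMP" ;;
  *)   echo "partial loss ($loss%): flaky path, run traceroute" ;;
esac
```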
If ping fails or shows packet loss, use traceroute (or tracepath) to find exactly where packets stop:
traceroute target.server.com
# On some systems use:
tracepath target.server.com
Traceroute shows each hop between you and the destination. Look for where the asterisks (* * *) start appearing; that is where packets stop getting through. If the last successful hop is inside your network, the problem is local. If it is at a distant router, it is an upstream network issue.
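Locating the first silent hop can also be scripted. The trace below is fabricated for illustration; a real run would pipe traceroute output into the same awk:

```shell
# Fabricated traceroute output: hops 3 onward stop responding
trace=' 1  192.168.1.1  0.512 ms
 2  10.0.0.1  1.204 ms
 3  * * *
 4  * * *'

# First field is the hop number; "*" in the second field means no reply
first_dead=$(printf '%s\n' "$trace" | awk '$2 == "*" {print $1; exit}')
echo "replies stop at hop $first_dead"
```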
Step 5: Is the Service Running and Listening?
You can reach the server’s IP, but still cannot connect to the application. The next question: is the service actually listening on the expected port?
Check from the server itself using ss (the modern replacement for netstat):
ss -tlnp
# -t: TCP sockets
# -l: listening sockets only
# -n: show numbers, not service names
# -p: show the process name/PID
The output looks like:
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
LISTEN 0 128 0.0.0.0:80 0.0.0.0:* users:(("nginx",pid=1234,fd=6))
LISTEN 0 128 0.0.0.0:443 0.0.0.0:* users:(("nginx",pid=1234,fd=7))
LISTEN 0 128 127.0.0.1:5432 0.0.0.0:* users:(("postgres",pid=5678,fd=3))
This tells you immediately whether your service is running and what address it is listening on. Notice that postgres is listening on 127.0.0.1:5432, the loopback interface only. If you try to connect to PostgreSQL from another server, it will fail even though the service is running, because it is only accepting local connections. That is a configuration issue, not a service issue.
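A quick way to spot loopback-only listeners is to filter the local-address column. A sketch against sample lines (on a live system, pipe ss -tln into the same awk):

```shell
# Sample `ss -tln`-style lines (ports are illustrative)
ss_output='LISTEN 0 128 0.0.0.0:80      0.0.0.0:*
LISTEN 0 128 127.0.0.1:5432  0.0.0.0:*'

# Column 4 is Local Address:Port; a 127.* address is loopback-only
printf '%s\n' "$ss_output" | awk '$4 ~ /^127\./ {print "loopback-only listener:", $4}'
```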
If the port is not listed at all, the service is not running. Start it:
systemctl start nginx
systemctl status nginx # Check whether it is running, or why it failed to start
Step 6: Is DNS Resolving Correctly?
DNS failures are common and produce symptoms that look like connectivity problems. If you can reach a server by IP but not by hostname, DNS is your culprit.
dig target.server.com
# or
nslookup target.server.com
With dig, look at the ANSWER SECTION. It should contain an A record (for IPv4) with the server’s IP address. If you get NXDOMAIN, the hostname does not exist in DNS. If the status is NOERROR but the ANSWER SECTION is empty, the name exists but has no A record; if the query times out entirely, your DNS server is not responding.
Try using a specific DNS server to isolate whether the problem is in your local resolver or in DNS more broadly:
dig @8.8.8.8 target.server.com
# Forces the query to Google's DNS instead of your configured resolver
If dig @8.8.8.8 works but your local DNS does not, your local DNS server has a problem. Check /etc/resolv.conf to see which DNS server you are configured to use:
cat /etc/resolv.conf
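To list just the resolver addresses, filter the nameserver lines. The file content below is a made-up sample; note that on systemd-resolved hosts this file often points at the local 127.0.0.53 stub rather than a real upstream server:

```shell
# Sample /etc/resolv.conf content (illustrative)
resolv='nameserver 127.0.0.53
options edns0 trust-ad'

# Print only the resolver IPs
printf '%s\n' "$resolv" | awk '/^nameserver/ {print $2}'
# Live system:  awk '/^nameserver/ {print $2}' /etc/resolv.conf
```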
On systemd systems, DNS is managed by systemd-resolved:
resolvectl status
# On older releases the same information comes from:
systemd-resolve --status
Step 7: Is a Firewall Blocking the Connection?
The service is running, DNS is resolving, but connections are still refused or timing out. A firewall is a likely suspect.
Check the local firewall on the server using iptables or ufw:
# For ufw (Ubuntu/Debian):
ufw status verbose
# For iptables directly:
iptables -L -n -v
# For firewalld (RHEL/CentOS/Fedora):
firewall-cmd --list-all
Look for rules that DROP or REJECT traffic to the port you are trying to connect on. A common pattern is a firewall that has rules to allow traffic on ports 22, 80, and 443 but drops everything else.
If you need to allow a port through ufw:
ufw allow 8080/tcp
ufw reload
For iptables directly:
iptables -I INPUT -p tcp --dport 8080 -j ACCEPT
# -I inserts at the top of the chain, so the rule takes effect
# even if a catch-all DROP rule sits at the bottom
Also consider cloud provider firewalls: AWS security groups, GCP firewall rules, Azure NSGs. These exist outside the VM’s operating system and are not visible from inside the VM. If your Linux firewall shows no blocking rules but connections still fail from outside, check the cloud provider’s firewall console.
Step 8: Debug the Application Layer with curl -v
If everything below is working (network up, server reachable, service listening, firewall open) but the application is not behaving correctly, use curl -v for verbose HTTP debugging:
curl -v http://target.server.com/
# or for HTTPS:
curl -v https://target.server.com/
The verbose output shows you every step of the connection: DNS resolution, TCP connection, TLS handshake, HTTP request headers sent, response headers received, and the response body. This reveals issues like SSL certificate mismatches, HTTP redirect loops, wrong content types, and authentication failures that you cannot see with lower-level tools.
Common curl diagnostic flags:
- curl -I: fetch headers only (faster for quick checks)
- curl -k: skip TLS certificate verification (useful to test whether a certificate issue is the problem)
- curl --resolve hostname:port:IP: force a specific IP, for testing before DNS is updated
- curl -x proxy:port: test through a specific proxy
A Complete Real-World Scenario: SSH Not Connecting
Let’s walk through a real scenario: you cannot SSH to a server at web01.example.com. The connection just hangs. Here is the exact diagnostic process:
Step 1: Check your own network interface with ip addr show. Your interface is up with an IP. Good.
Step 2: Try pinging the target: ping -c 4 web01.example.com. You get a response, so the server is reachable and DNS resolves. The problem is not network connectivity to the server.
Step 3: Try connecting to the SSH port specifically:
nc -zv web01.example.com 22
# or
telnet web01.example.com 22
This either connects (port is open) or fails. If it times out, a firewall is silently dropping packets. If it is refused immediately, nothing is listening on port 22 (or a firewall is actively rejecting the connection).
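If nc and telnet are not installed, bash's built-in /dev/tcp gives a rough substitute; the timeout-versus-refusal distinction is what matters. A sketch (the function name check_port is made up):

```shell
# Rough port probe using bash's /dev/tcp (no nc required).
# timeout exits 124 when the connection attempt hangs.
check_port() {
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
  rc=$?
  if [ "$rc" -eq 0 ]; then
    echo "open"
  elif [ "$rc" -eq 124 ]; then
    echo "timed out (likely firewalled)"
  else
    echo "refused (nothing listening)"
  fi
}

check_port 127.0.0.1 22
```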
Step 4: Get on the server via console (cloud provider’s serial console, or physical access) and check SSH:
systemctl status sshd
ss -tlnp | grep :22
You find: sshd is running and listening. Back to the firewall hypothesis.
Step 5: Check the firewall on the server:
iptables -L INPUT -n -v
You find a rule: DROP all -- 0.0.0.0/0 0.0.0.0/0 as the last rule in the INPUT chain, and there is no rule specifically allowing port 22 traffic before it. Someone added a default-drop policy without first adding allow rules.
Fix:
iptables -I INPUT 1 -p tcp --dport 22 -j ACCEPT
# -I INPUT 1 inserts the rule at the top of the INPUT chain
SSH connections now work immediately. Total time from “cannot connect” to root cause: about 8 minutes using systematic elimination.
Your Network Troubleshooting Quick Reference
- Interface check: ip addr show, ip link show
- Gateway check: ip route show, then ping <gateway IP>
- Internet reachability: ping 8.8.8.8
- Route tracing: traceroute <target>
- Port/service check: ss -tlnp on the server, nc -zv <host> <port> from the client
- DNS check: dig <hostname>, dig @8.8.8.8 <hostname>
- Firewall check: ufw status or iptables -L -n -v
- Application debug: curl -v http://target/
Network troubleshooting feels like magic to those who do not have a systematic approach, and like basic methodology to those who do. Start at the bottom of the stack, eliminate one layer at a time, and you will find your answer every time.
About Ramesh Sundararamaiah
Red Hat Certified Architect
Expert in Linux system administration, DevOps automation, and cloud infrastructure. Specializing in Red Hat Enterprise Linux, CentOS, Ubuntu, Docker, Ansible, and enterprise IT solutions.