Why QA Engineers Need Linux
When I started as a QA engineer, I assumed Linux was something only backend developers and sysadmins needed to know. That assumption lasted about two weeks into my first job. The test environments ran on Ubuntu servers. Logs lived on remote Linux hosts. The CI pipeline ran in Docker containers based on Alpine Linux. The Android devices under test exposed a Linux shell through ADB.
The reality of modern QA work is that Linux is everywhere in the toolchain:
- CI/CD agents — GitHub Actions, Jenkins, and GitLab CI jobs most commonly run on Linux runners
- Application servers — most backend systems run on Linux, so reproducing server-side issues requires Linux access
- Docker and Kubernetes — containers are almost always Linux-based, so the shell inside is a Linux shell regardless of your host OS
- Android devices — ADB shell gives you a Linux shell on every Android device under test
- Log files — server logs, application logs, and system logs all live on Linux hosts and need Linux tools to process efficiently
- Network testing — tools like tcpdump, netstat, ss, and nmap are standard on Linux and are most at home there
You do not need to be a Linux administrator. But fluency in the commands covered in this article will make you significantly more effective at every stage of the testing lifecycle.
Architecture Overview
A QA engineer interacts daily with four layers of a Linux system: the file system, processes, the network, and logs.
Each of those four domains maps directly to a set of commands you will use repeatedly in your QA work. The sections below cover each domain in depth.
File System Navigation
These are the commands you use dozens of times per session. Master them first.
# Print working directory (where am I?)
pwd
# /home/honnesh/projects/automation
# List files (long format, human-readable sizes, show hidden files)
ls -lah
# Change directory
cd /var/log/nginx
cd ~ # go to home directory
cd - # go to previous directory
# Find files by name recursively
find /var/log -name "*.log" -type f
# Find files modified in the last 24 hours
find /var/log -name "*.log" -mtime -1
# Find files larger than 100MB
find / -size +100M -type f 2>/dev/null
# Locate a file by name (uses a pre-built index — much faster than find)
locate nginx.conf
# Update the locate database
sudo updatedb
The find command is especially useful in QA work when you need to locate log files generated by a test run, find configuration files on a remote server, or identify which test report files were created during the last build.
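As a minimal sketch of the last case, one way to find the reports written by the current build is to compare against a marker file touched when the build starts. The directory layout and file names here are hypothetical, and the back-dating with touch -d assumes GNU coreutils.

```shell
# Sketch: list report files written after a build started.
# All paths are hypothetical; a temp directory stands in for a real workspace.
workdir=$(mktemp -d)
mkdir -p "$workdir/reports"

# A report left over from an earlier build, back-dated so it is clearly older.
touch -d '2 days ago' "$workdir/reports/old-report.html"

# Marker touched when the build starts (back-dated slightly to keep the demo deterministic).
touch -d '1 minute ago' "$workdir/build-start.marker"

# A report produced by the current build.
touch "$workdir/reports/new-report.html"

# -newer keeps only files modified more recently than the marker file.
new_reports=$(find "$workdir/reports" -name "*.html" -type f -newer "$workdir/build-start.marker")
echo "$new_reports"

rm -rf "$workdir"
```

Compared with -mtime -1, a marker file ties the search to the build itself instead of to a fixed 24-hour window.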
File Operations
Reading, copying, moving, and modifying files are daily activities during log analysis, environment setup, and test data management.
# Read entire file
cat application.log
# Read large file one page at a time (q to quit, / to search)
less application.log
# Read first 20 lines
head -20 application.log
# Read last 50 lines
tail -50 application.log
# Follow a log file in real time (Ctrl+C to stop)
tail -f /var/log/nginx/error.log
# Copy a file
cp config.template.yaml config.yaml
# Copy a directory recursively
cp -r test-results/ test-results-backup/
# Move / rename
mv old-report.html new-report.html
# Remove file (no confirmation — be careful)
rm test-output.tmp
# Remove directory recursively
rm -rf old-reports/
# Create directories
mkdir -p reports/sprint-42/smoke
# Change file permissions (rwxr-xr-x)
chmod 755 run-tests.sh
# Make a script executable
chmod +x run-tests.sh
When investigating a defect, I would tail -f the application log on the test server while reproducing the issue, then use grep and awk to filter down from millions of lines to the specific error window. Being comfortable with these tools meant the difference between a 10-minute investigation and a 2-hour one.
grep Deep Dive
grep is the single most important log analysis tool in a QA engineer's arsenal. It searches for patterns in files or stdin and prints matching lines. Learn these flags and you can find anything in any log file.
# Basic pattern search
grep "ERROR" application.log
# Case-insensitive search
grep -i "error" application.log
# Recursive search through all files in a directory
grep -r "NullPointerException" /var/log/tomcat/
# Show line numbers with matches
grep -n "FAILED" test-output.log
# Invert match — show lines that do NOT match
grep -v "DEBUG" application.log
# Count matching lines
grep -c "ERROR" application.log
# Show only the matching part (not the whole line)
grep -o "session_id=[a-z0-9]*" access.log
# Extended regex — match multiple patterns
grep -E "ERROR|WARN|FATAL" application.log
# Fixed string (no regex interpretation — faster)
grep -F "user.email@example.com" access.log
# Show 3 lines of context before and after each match
grep -C 3 "OutOfMemoryError" application.log
# Show 5 lines before the match
grep -B 5 "Connection refused" application.log
# Show 5 lines after the match
grep -A 5 "Connection refused" application.log
# Search for a pattern and keep matches highlighted when piping to less (--color=auto turns colour off in a pipe)
grep --color=always "ERROR" application.log | less -R
# Pipe: find all unique error messages in a log
grep "ERROR" application.log | sort | uniq -c | sort -rn | head -20
The pipe chain grep | sort | uniq -c | sort -rn is one of the most valuable log analysis patterns. It extracts all error lines, groups identical ones together, counts occurrences, and sorts by frequency — giving you an instant ranked list of your most common errors.
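One caveat: uniq -c only groups lines that are exactly identical, so when every line starts with a timestamp, each error is counted once. The sketch below strips the timestamp before sorting. The log format (date, time, level, message) and the sample lines are assumptions made up for illustration.

```shell
# Sketch: rank error messages by frequency, ignoring the per-line timestamp.
# The log format (date, time, level, message) is assumed for illustration.
log=$(mktemp)
cat > "$log" <<'EOF'
2026-05-05 09:30:01 ERROR Connection refused
2026-05-05 09:30:02 INFO Request served
2026-05-05 09:30:05 ERROR Connection refused
2026-05-05 09:31:10 ERROR Timeout waiting for lock
2026-05-05 09:31:12 ERROR Connection refused
EOF

# cut drops the first two space-separated fields (date and time),
# so identical messages collapse into one group under uniq -c.
ranked=$(grep "ERROR" "$log" | cut -d' ' -f3- | sort | uniq -c | sort -rn)
echo "$ranked"

rm -f "$log"
```

If the timestamp has a different shape, adjust the field list passed to cut or replace it with an awk expression that prints only the message.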
awk and sed for Log Transformation
awk processes text line by line and excels at extracting specific fields. sed performs stream editing — find-and-replace operations on text streams.
# awk: print specific fields from space-separated log
# Nginx access log: IP, method, URL, status code, response size
awk '{print $1, $6, $7, $9, $10}' access.log
# awk: filter lines where the 9th field (status code) is 500
awk '$9 == 500 {print}' access.log
# awk: calculate average response size in bytes from field 10
awk '{sum += $10; count++} END {if (count) print "Avg:", sum/count, "bytes"}' access.log
# awk: extract timestamps and error messages from application log
awk -F'|' '{print $1, $4}' application.log
# awk: count requests per IP address
awk '{print $1}' access.log | sort | uniq -c | sort -rn | head -10
# sed: replace all occurrences of a string in a file
sed -i 's/staging.api.example.com/api.example.com/g' config.yaml
# sed: delete lines matching a pattern
sed '/^#/d' config.yaml # remove comment lines
# sed: print only lines 100 to 200 of a large file
sed -n '100,200p' large-log-file.log
# sed: extract timestamps (first 19 characters of each line)
sed 's/^\(.\{19\}\).*/\1/' application.log | head -5
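Field extraction and arithmetic combine naturally in awk. As a rough sketch, the block below computes the share of 5xx responses in an access log. It assumes the default nginx "combined" format, where field 9 is the status code, and the sample lines are invented.

```shell
# Sketch: compute the share of 5xx responses in an access log.
# Assumes the default nginx "combined" format, where field 9 is the status code.
access_log=$(mktemp)
cat > "$access_log" <<'EOF'
10.0.0.1 - - [05/May/2026:09:30:01 +0000] "GET /health HTTP/1.1" 200 12 "-" "curl/8.0"
10.0.0.2 - - [05/May/2026:09:30:02 +0000] "POST /login HTTP/1.1" 500 87 "-" "curl/8.0"
10.0.0.1 - - [05/May/2026:09:30:03 +0000] "GET /products HTTP/1.1" 200 512 "-" "curl/8.0"
10.0.0.3 - - [05/May/2026:09:30:04 +0000] "GET /cart HTTP/1.1" 502 0 "-" "curl/8.0"
EOF

# Count lines whose status is in the 500 range, then divide by the total line count (NR).
error_rate=$(awk '
  $9 >= 500 && $9 < 600 { errors++ }
  END { if (NR) printf "%.1f", 100 * errors / NR }
' "$access_log")
echo "5xx rate: ${error_rate}%"

rm -f "$access_log"
```

A custom log_format moves the fields around, so check the position of the status code before relying on $9.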
Process Management
Understanding which processes are running, how much resource they consume, and how to control them is essential when testing applications on Linux servers.
# List all running processes
ps aux
# Find a specific process by name
ps aux | grep nginx
ps aux | grep "java"
# Real-time process viewer (q to quit)
top
htop # more user-friendly (install with: apt install htop)
# Kill a process by PID
kill 12345
# Force kill (SIGKILL — immediate termination)
kill -9 12345
# Kill all processes matching a name
pkill nginx
pkill -f "python run_tests.py"
# Run a process in the background
./run-long-test.sh &
# List background jobs
jobs
# Bring a background job to foreground
fg %1
# Run a process that survives after you log out (no hangup)
nohup ./run-overnight-tests.sh > overnight.log 2>&1 &
# Check if a port is being used
lsof -i :8080
ss -tlnp | grep 8080
The nohup pattern is particularly useful in QA when you need to kick off a long test run on a remote server and then disconnect your SSH session. Without nohup, the process would typically receive a hangup signal (SIGHUP) and be killed when you log out.
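A small extension of this pattern is to record the PID so you can check on the run later. The block below is only a sketch: the job is a stand-in (a one-second sleep), and the file names are hypothetical.

```shell
# Sketch: start a long job detached from the terminal and keep its PID for later.
# The job is a stand-in (a short sleep); the file names are hypothetical.
job_dir=$(mktemp -d)
nohup sh -c 'sleep 1; echo "run finished"' > "$job_dir/overnight.log" 2>&1 &
echo $! > "$job_dir/overnight.pid"

# Later, possibly from a new SSH session: kill -0 sends no signal,
# it only reports whether the process still exists.
if kill -0 "$(cat "$job_dir/overnight.pid")" 2>/dev/null; then
  echo "still running"
fi

# In this demo we simply wait for the job; from another session you would poll with kill -0.
wait "$(cat "$job_dir/overnight.pid")"
job_output=$(cat "$job_dir/overnight.log")
echo "$job_output"

rm -rf "$job_dir"
```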
Network Commands
Network commands let you diagnose connectivity issues, inspect API responses, and understand how your application communicates over the network.
# Test connectivity to a host
ping -c 4 api.example.com
# Make an HTTP request and see the response
curl https://api.example.com/health
# GET with headers
curl -H "Authorization: Bearer token123" https://api.example.com/user
# POST with JSON body
curl -X POST https://api.example.com/login \
-H "Content-Type: application/json" \
-d '{"email":"test@example.com","password":"Pass123"}'
# Show full request and response including headers (-v verbose)
curl -v https://api.example.com/products
# Timing breakdown of an HTTP request
curl -w "\nDNS: %{time_namelookup}s\nConnect: %{time_connect}s\nTTFB: %{time_starttransfer}s\nTotal: %{time_total}s\n" \
-o /dev/null -s https://api.example.com/products
# Download a file
wget https://example.com/large-dataset.csv
# Show listening TCP and UDP sockets and the processes that own them (sudo may be needed to see process names)
netstat -tulnp
# Modern replacement for netstat
ss -tulnp
# Trace the network path to a host
traceroute api.example.com
# DNS lookup
dig api.example.com
nslookup api.example.com
# Port scan a host (install nmap first)
nmap -p 80,443,8080 api.example.com
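The curl -w timing breakdown is invaluable during API performance investigation. It reports the elapsed time at which DNS resolution, the TCP connection, and the first response byte each completed, plus the total, letting you pinpoint which phase of the request is slow. The command above does not print the TLS handshake; for HTTPS, add %{time_appconnect} to the format string to include it.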
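curl reports each time_* value as seconds elapsed since the request started, so the time spent in one phase is the difference between neighbouring values. The sketch below does that subtraction. The function name phase_durations and the sample numbers are made up for illustration.

```shell
# Sketch: turn curl's cumulative timings into per-phase durations.
# phase_durations is a hypothetical helper; the input numbers below are invented.
phase_durations() {
  # Arguments: time_namelookup time_connect time_starttransfer time_total
  awk -v dns="$1" -v conn="$2" -v ttfb="$3" -v total="$4" 'BEGIN {
    # "wait" spans connection established to first byte; for HTTPS this
    # includes the TLS handshake as well as server processing time.
    printf "dns=%.3f tcp=%.3f wait=%.3f download=%.3f\n",
      dns, conn - dns, ttfb - conn, total - ttfb
  }'
}

# In practice the four values could be captured with:
#   curl -s -o /dev/null -w "%{time_namelookup} %{time_connect} %{time_starttransfer} %{time_total}" URL
phase_durations 0.012 0.045 0.230 0.260
```

With these sample values the largest share is the wait for the first byte, which would point at the server (or TLS setup) and not at DNS or the network path.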
Log Analysis
Most QA defect investigations start and end with logs. Linux gives you a powerful set of tools for navigating and analysing logs at scale.
# Follow a log file in real time
tail -f /var/log/application/app.log
# Follow multiple log files simultaneously
tail -f /var/log/nginx/error.log /var/log/application/app.log
# View systemd service logs
journalctl -u nginx
journalctl -u nginx --since "2026-05-05 09:00" --until "2026-05-05 10:00"
journalctl -u nginx -n 100 -f # last 100 lines, then follow
# View kernel messages
dmesg | tail -50
dmesg | grep -i "error\|fail"
# Search compressed log archives (no need to decompress first)
zgrep "ERROR" /var/log/application/app.log.2.gz
# Count errors per hour from a timestamped log (assumes each line starts with a timestamp like 2026-05-05 09:30:00)
grep "ERROR" app.log | awk '{print substr($0,1,13)}' | sort | uniq -c
# Extract all unique HTTP status codes from nginx access log
awk '{print $9}' /var/log/nginx/access.log | sort | uniq -c | sort -rn
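# Print lines from the first line matching the start timestamp through the first line matching the end timestamp
awk '/2026-05-05 09:30/,/2026-05-05 09:45/' application.log
# Show each line containing "Exception" plus the 20 lines after it (usually the stack trace), capped at 100 lines of output
grep -A 20 "Exception" application.log | head -100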
I have used grep, awk, and tail -f to correlate events across logs from multiple services during a test run. One particular investigation involved a Fire TV device registration failure that only happened during peak concurrent load. By tailing logs across three services simultaneously (the registration service, the identity service, and the device state service) and filtering for a specific device serial number, I identified that the failure was caused by a race condition in the identity token refresh flow. The entire investigation took under 30 minutes because I could slice the logs precisely with shell tools.
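The same kind of correlation can be done after the fact on saved logs. The sketch below builds one time-ordered view of a single device across two service logs. The service names, log format, and serial number are hypothetical and are not taken from the investigation described above.

```shell
# Sketch: build one time-ordered view of a single device across several service logs.
# Service names, log format, and the serial number are hypothetical.
logs=$(mktemp -d)
cat > "$logs/registration.log" <<'EOF'
2026-05-05T09:30:01 registration device=G0A1B2 register request received
2026-05-05T09:30:04 registration device=G0A1B2 register failed: invalid token
EOF
cat > "$logs/identity.log" <<'EOF'
2026-05-05T09:30:02 identity device=G0A1B2 token refresh started
2026-05-05T09:30:02 identity device=ZZZ999 token refresh started
2026-05-05T09:30:05 identity device=G0A1B2 token refresh completed
EOF

# grep -h drops the filename prefix so every line starts with its timestamp;
# ISO 8601 timestamps sort correctly as plain text.
timeline=$(grep -h "device=G0A1B2" "$logs"/*.log | sort)
echo "$timeline"

rm -rf "$logs"
```

Reading the merged timeline top to bottom makes ordering problems visible, such as a failure logged by one service before another service finishes the step it depends on. This only works when the services share a sortable timestamp format and reasonably synchronised clocks.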
Shell Scripting for Test Automation
Shell scripts let you automate repetitive tasks: running test suites, collecting results, generating reports, and managing test environments.
#!/bin/bash
# run-smoke-tests.sh — Run smoke tests and notify on failure
set -e # exit immediately if any command fails (commands tested by if/while are exempt)
set -u # treat unset variables as errors
ENVIRONMENT=${1:-staging}
TEST_SUITE="tests/smoke/"
REPORT_DIR="reports/$(date +%Y-%m-%d)"
TIMESTAMP=$(date +%H%M%S)
mkdir -p "$REPORT_DIR"
echo "=== Running smoke tests against $ENVIRONMENT ==="
echo "Started at: $(date)"
# Variables
BASE_URL="https://${ENVIRONMENT}.api.example.com"
REPORT_FILE="${REPORT_DIR}/smoke_${TIMESTAMP}.html"
# Run pytest with HTML report
if pytest "$TEST_SUITE" \
    --base-url="$BASE_URL" \
    --html="$REPORT_FILE" \
    --self-contained-html \
    -v; then
  echo "All smoke tests PASSED"
  EXIT_CODE=0
else
  echo "Smoke tests FAILED — check $REPORT_FILE"
  EXIT_CODE=1
fi
# Loop example: retry failed tests up to 3 times
RETRY_COUNT=0
MAX_RETRIES=3
while [ "$RETRY_COUNT" -lt "$MAX_RETRIES" ] && [ "$EXIT_CODE" -ne 0 ]; do
  RETRY_COUNT=$((RETRY_COUNT + 1))
  echo "Retry attempt $RETRY_COUNT of $MAX_RETRIES..."
  if pytest "$TEST_SUITE" --last-failed --base-url="$BASE_URL"; then
    EXIT_CODE=0
  fi
done
# Conditional
if [ "$EXIT_CODE" -eq 0 ]; then
  echo "Tests passed after $RETRY_COUNT retries"
else
  echo "Tests still failing after $MAX_RETRIES retries"
fi
# Function example
send_slack_alert() {
  local message=$1
  # With set -u, an unset SLACK_WEBHOOK_URL would abort the script here,
  # so default it to empty and skip the alert instead
  if [ -z "${SLACK_WEBHOOK_URL:-}" ]; then
    echo "SLACK_WEBHOOK_URL is not set; skipping Slack alert" >&2
    return 0
  fi
  curl -s -X POST "$SLACK_WEBHOOK_URL" \
    -H "Content-Type: application/json" \
    -d "{\"text\": \"$message\"}"
}
if [ "$EXIT_CODE" -ne 0 ]; then
  send_slack_alert "Smoke tests FAILED on $ENVIRONMENT at $(date)"
fi
exit "$EXIT_CODE"
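The retry loop in the script is tied to pytest. As a sketch of how the same idea could be made reusable, the block below wraps it in a function that retries any command. The names retry and flaky are hypothetical helpers for illustration and are not part of the script above.

```shell
# Sketch: a reusable retry helper (hypothetical, not part of the script above).
# Usage: retry MAX_ATTEMPTS command [args...]
retry() {
  local max_attempts=$1
  shift
  local attempt=1
  # Run the command until it succeeds or the attempt budget is used up.
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "Giving up after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    echo "Retrying (attempt $attempt of $max_attempts)..." >&2
  done
  return 0
}

# Demo: a stand-in for a flaky command that succeeds on its third call.
calls_file=$(mktemp)
echo 0 > "$calls_file"
flaky() {
  local n=$(( $(cat "$calls_file") + 1 ))
  echo "$n" > "$calls_file"
  [ "$n" -ge 3 ]
}

retry 5 flaky && echo "succeeded after $(cat "$calls_file") calls"
```

In a test script the call might look like retry 3 pytest tests/smoke/ --last-failed, though retries can also hide genuinely flaky tests, so it is worth logging how often they are needed.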
Cron Jobs for Scheduled Test Runs
Cron is the Linux task scheduler. It runs commands or scripts at specified times — perfect for nightly regression runs, scheduled smoke tests, or periodic performance checks.
# Edit your crontab
crontab -e
# Cron syntax: minute hour day-of-month month day-of-week command
# ┌─ minute (0-59)
# │ ┌─ hour (0-23)
# │ │ ┌─ day of month (1-31)
# │ │ │ ┌─ month (1-12)
# │ │ │ │ ┌─ day of week (0-7, 0 and 7 = Sunday)
# │ │ │ │ │
# * * * * * command
# Run smoke tests every day at 7:00 AM
0 7 * * * /home/honnesh/run-smoke-tests.sh staging >> /var/log/qa/smoke.log 2>&1
# Run regression suite every night at 11:30 PM Mon-Fri
30 23 * * 1-5 /home/honnesh/run-regression.sh >> /var/log/qa/regression.log 2>&1
# Run k6 load test every Sunday at 2:00 AM (cron runs with a minimal PATH, so use the absolute path to k6 if the command is not found)
0 2 * * 0 k6 run /home/honnesh/tests/load-test.js >> /var/log/qa/perf.log 2>&1
# List your cron jobs
crontab -l
SSH and SCP for Remote Access
SSH (Secure Shell) lets you connect to remote Linux servers. SCP (Secure Copy) lets you transfer files securely over SSH.
# Connect to a remote server
ssh honnesh@192.168.1.100
ssh honnesh@test-server.company.com
# Connect using a key file
ssh -i ~/.ssh/id_rsa honnesh@test-server.company.com
# Run a single command on a remote server without an interactive session
ssh honnesh@test-server.company.com "tail -50 /var/log/application/app.log"
# Create an SSH key pair
ssh-keygen -t ed25519 -C "honnesh.qa@example.com"
# Copy your public key to a server (enables passwordless login)
ssh-copy-id honnesh@test-server.company.com
# Copy a file TO the remote server
scp test-data.csv honnesh@test-server.company.com:/home/honnesh/data/
# Copy a file FROM the remote server
scp honnesh@test-server.company.com:/var/log/application/app.log ./
# Copy a directory recursively
scp -r test-results/ honnesh@test-server.company.com:/home/honnesh/results/
# SSH tunnelling (local port forwarding): connections to local port 8080 are forwarded to port 8080 on the remote server
ssh -L 8080:localhost:8080 honnesh@test-server.company.com
SSH tunnelling is particularly useful in QA when you need to access an internal web application or API that is only accessible from within the server's network. With -L, SSH listens on a port on your local machine and forwards each connection through the server to the target host and port, so you can reach the internal service via localhost.
Disk and Memory Commands
Monitoring disk space and memory usage helps you diagnose environment issues — full disks, memory leaks, and resource starvation are common culprits behind intermittent test failures.
# Show disk usage of filesystems (human-readable)
df -h
# Show the total size of each file and directory in /var/log
du -sh /var/log/*
# Find the 10 largest entries in /var/log
du -sh /var/log/* | sort -rh | head -10
# Show memory usage
free -h
# Real-time memory and CPU stats
vmstat 1 10 # print stats every 1 second, 10 times
# Detailed memory information
cat /proc/meminfo
# Show top memory-consuming processes
ps aux --sort=-%mem | head -20
# Show top CPU-consuming processes
ps aux --sort=-%cpu | head -20
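Since full disks are a common cause of intermittent failures, one option is a pre-flight check at the start of a test run. The block below is a minimal sketch: the function name disk_usage_percent and the 90% threshold are illustrative choices, not recommendations from a specific tool.

```shell
# Sketch: warn when the filesystem holding a path is nearly full.
# The function name and the 90% threshold are illustrative choices.
disk_usage_percent() {
  # df -P prints a header plus one POSIX-format line; field 5 is the used percentage.
  df -P "$1" | awk 'NR == 2 { gsub("%", "", $5); print $5 }'
}

usage=$(disk_usage_percent /)
if [ "$usage" -ge 90 ]; then
  echo "Disk at ${usage}% - clean up before running tests" >&2
else
  echo "Disk at ${usage}% - OK"
fi
```

In a real pipeline the warning branch would probably exit non-zero so the run stops before producing misleading failures.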
Linux Distro Comparison for QA
| Distro | Package Manager | Best QA Use Case | Docker Image Size | Notes |
|---|---|---|---|---|
| Ubuntu 22.04 LTS | apt | CI runners, Selenium Grid, general testing | ~77MB | Most tool support, best documentation, LTS lifecycle |
| Debian 12 | apt | Stable server environments | ~49MB | Extremely stable, slower update cycle than Ubuntu |
| Alpine Linux 3.x | apk | Docker containers, microservices testing | ~7MB | Tiny image, uses musl libc (some binary compatibility issues) |
| CentOS Stream / RHEL | dnf / yum | Enterprise environments, compliance testing | ~230MB | Common in corporate/financial environments |
| Amazon Linux 2 | yum | AWS EC2-based test environments | ~200MB | Optimised for AWS, good integration with EC2 and ECS |
10 Most Useful QA-Specific One-Liners
These commands solve real problems that come up repeatedly in QA work. Bookmark them.
- Find all ERROR lines logged during the previous clock hour:
  grep "$(date -d '1 hour ago' '+%Y-%m-%d %H')" app.log | grep "ERROR"
- Count HTTP status codes in nginx access log:
  awk '{print $9}' /var/log/nginx/access.log | sort | uniq -c | sort -rn
- Watch a file for new content matching a pattern (live grep):
  tail -f app.log | grep --line-buffered "ERROR"
- Find which process is listening on a specific port:
  ss -tlnp | grep :8080 # or lsof -i :8080
- Extract all unique IP addresses from an access log:
  awk '{print $1}' access.log | sort -u
- Check whether an API endpoint is up (with -f, curl exits non-zero on HTTP 4xx/5xx responses and on connection failures):
  curl -sf https://api.example.com/health && echo "UP" || echo "DOWN"
- Kill all processes matching a pattern (safe preview first):
  pgrep -fa "pytest" # preview what will be killed
  pkill -f "pytest" # then kill
- Find test report files created in the last 24 hours:
  find ./reports -name "*.html" -mtime -1 -ls
- Monitor memory usage of a specific process every 2 seconds:
  watch -n 2 'ps -p $(pgrep -f "node server.js") -o pid,vsz,rss,%mem,cmd'
- Compress and transfer test results to a remote server in one command:
  tar czf - reports/ | ssh honnesh@test-server.company.com 'cat > /home/honnesh/reports.tar.gz'