Advanced Bash Scripting: Professional Automation and System Integration

Advanced Bash scripting transforms simple shell commands into powerful automation tools used in enterprise environments. This comprehensive guide covers professional scripting techniques, error handling strategies, system integration patterns, and best practices used by DevOps engineers and system administrators worldwide.

Table of Contents

  1. Advanced Scripting Fundamentals
  2. Professional Error Handling
  3. Advanced Parameter Processing
  4. Arrays and Data Structures
  5. Modular Functions and Libraries
  6. Advanced File Operations
  7. Network and API Integration
  8. Parallel Processing and Job Control
  9. Security and Input Validation
  10. System Integration Scripts
  11. Testing and Debugging
  12. Best Practices and Patterns

1. Advanced Scripting Fundamentals

Robust Script Template

#!/usr/bin/env bash

#############################################################################
# Script Name: professional-template.sh
# Description: Professional Bash script template with best practices
# Author: Your Name
# Version: 1.0.0
# Date: YYYY-MM-DD
#############################################################################

# Strict mode - exit on errors, undefined variables, pipe failures
set -euo pipefail
IFS=$'\n\t'

# Constants
readonly SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
readonly SCRIPT_NAME="$(basename "${BASH_SOURCE[0]}")"
readonly LOG_DIR="/var/log/${SCRIPT_NAME%.sh}"
readonly LOG_FILE="${LOG_DIR}/$(date +%Y%m%d).log"
readonly PID_FILE="/var/run/${SCRIPT_NAME%.sh}.pid"

# Color codes for output
readonly RED='\033[0;31m'
readonly GREEN='\033[0;32m'
readonly YELLOW='\033[1;33m'
readonly NC='\033[0m' # No Color

# Create log directory
mkdir -p "$LOG_DIR"

# Logging functions
log() {
    echo "[$(date +"%Y-%m-%d %H:%M:%S")] [INFO] $*" | tee -a "$LOG_FILE"
}

log_error() {
    echo -e "${RED}[$(date +"%Y-%m-%d %H:%M:%S")] [ERROR] $*${NC}" | tee -a "$LOG_FILE" >&2
}

log_success() {
    echo -e "${GREEN}[$(date +"%Y-%m-%d %H:%M:%S")] [SUCCESS] $*${NC}" | tee -a "$LOG_FILE"
}

log_warning() {
    echo -e "${YELLOW}[$(date +"%Y-%m-%d %H:%M:%S")] [WARNING] $*${NC}" | tee -a "$LOG_FILE"
}

# Cleanup function
cleanup() {
    local exit_code=$?
    log "Cleaning up..."
    rm -f "$PID_FILE"
    trap - EXIT
    exit "$exit_code"
}

# Error trap
error_exit() {
    log_error "Error on line $1"
    cleanup
}

# Setup traps
trap 'error_exit $LINENO' ERR
trap cleanup EXIT INT TERM

# Check if script is already running
if [[ -f "$PID_FILE" ]]; then
    old_pid=$(cat "$PID_FILE")
    if ps -p "$old_pid" > /dev/null 2>&1; then
        log_error "Script already running with PID $old_pid"
        exit 1
    else
        log_warning "Stale PID file found, removing..."
        rm -f "$PID_FILE"
    fi
fi

# Write current PID
echo $$ > "$PID_FILE"

# Main script logic goes here
main() {
    log "Script started"

    # Your code here

    log_success "Script completed successfully"
}

main "$@"

2. Professional Error Handling

Comprehensive Error Handling

#!/bin/bash

# Advanced error handling with retry logic
retry() {
    local max_attempts="$1"
    local delay="$2"
    local command="${*:3}"
    local attempt=1

    while [ $attempt -le $max_attempts ]; do
        log "Attempt $attempt of $max_attempts: $command"

        if eval "$command"; then
            log_success "Command succeeded on attempt $attempt"
            return 0
        fi

        if [ $attempt -lt $max_attempts ]; then
            log_warning "Command failed, retrying in ${delay}s..."
            sleep "$delay"
        fi

        ((attempt++))
    done

    log_error "Command failed after $max_attempts attempts"
    return 1
}

# Example usage
retry 3 5 "curl -sf https://api.example.com/health"

# Graceful degradation
backup_with_fallback() {
    local primary_backup="/mnt/backup"
    local fallback_backup="/tmp/backup"

    if ! mount | grep -q "$primary_backup"; then
        log_warning "Primary backup location not available, using fallback"
        backup_location="$fallback_backup"
    else
        backup_location="$primary_backup"
    fi

    tar -czf "${backup_location}/backup-$(date +%Y%m%d).tar.gz" /opt/app/ || {
        log_error "Backup failed"
        return 1
    }

    log_success "Backup completed to $backup_location"
}

# Error context preservation
run_with_context() {
    local context="$1"
    shift

    {
        "$@"
    } || {
        log_error "Failed in context: $context"
        log_error "Command: $*"
        log_error "Working directory: $(pwd)"
        log_error "User: $(whoami)"
        log_error "Environment: $(env | grep -E '^(PATH|HOME|USER)=')"
        return 1
    }
}
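
# Example usage (the deployment commands below are illustrative)
run_with_context "syncing build artifacts" rsync -a ./build/ /opt/app/
run_with_context "reloading web server" systemctl reload nginx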

Signal Handling and Cleanup

#!/bin/bash

# Temporary file management
TEMP_FILES=()

create_temp_file() {
    local temp_file=$(mktemp)
    TEMP_FILES+=("$temp_file")
    echo "$temp_file"
}

cleanup_temp_files() {
    log "Cleaning up temporary files..."
    for temp_file in "${TEMP_FILES[@]}"; do
        if [[ -f "$temp_file" ]]; then
            rm -f "$temp_file"
            log "Removed: $temp_file"
        fi
    done
}

# Advanced signal handling
declare -A SIGNAL_NAMES=(
    [1]="SIGHUP"
    [2]="SIGINT"
    [3]="SIGQUIT"
    [15]="SIGTERM"
)

signal_handler() {
    local signal=$1
    log_warning "Received ${SIGNAL_NAMES[$signal]}, shutting down gracefully..."

    # Perform cleanup
    cleanup_temp_files

    # Stop background jobs
    jobs -p | xargs -r kill 2>/dev/null

    # Save state if needed
    save_state

    exit $((128 + signal))
}

save_state() {
    local state_file="/var/lib/myapp/state.json"
    cat > "$state_file" << EOF
{
    "timestamp": "$(date -Iseconds)",
    "pid": $$,
    "status": "interrupted"
}
EOF
}

# Set up signal traps
for signal in "${!SIGNAL_NAMES[@]}"; do
    trap "signal_handler $signal" "$signal"
done

3. Advanced Parameter Processing

Getopts with Long Options

#!/bin/bash

# Advanced argument parsing with validation
usage() {
    cat << EOF
Usage: $0 [OPTIONS] SOURCE DESTINATION

Advanced backup script with multiple options

OPTIONS:
    -h, --help              Show this help message
    -v, --verbose           Enable verbose output
    -c, --compress LEVEL    Compression level (1-9, default: 6)
    -e, --exclude PATTERN   Exclude pattern (can be used multiple times)
    -r, --retention DAYS    Retention period in days (default: 30)
    -n, --dry-run           Perform dry run without actual backup
    -p, --parallel JOBS     Number of parallel jobs (default: 1)
    -m, --email EMAIL       Send notification to email

EXAMPLES:
    $0 -v -c 9 /opt/app /backup/app
    $0 --compress 5 --exclude "*.log" --retention 7 /data /backup/data
    $0 -vn --parallel 4 /home /backup/home

EOF
    exit 1
}

# Default values
VERBOSE=false
COMPRESS_LEVEL=6
EXCLUDE_PATTERNS=()
RETENTION_DAYS=30
DRY_RUN=false
PARALLEL_JOBS=1
EMAIL=""

# Parse arguments
parse_args() {
    while [[ $# -gt 0 ]]; do
        case $1 in
            -h|--help)
                usage
                ;;
            -v|--verbose)
                VERBOSE=true
                shift
                ;;
            -c|--compress)
                COMPRESS_LEVEL="$2"
                if ! [[ "$COMPRESS_LEVEL" =~ ^[1-9]$ ]]; then
                    log_error "Invalid compression level: $COMPRESS_LEVEL"
                    exit 1
                fi
                shift 2
                ;;
            -e|--exclude)
                EXCLUDE_PATTERNS+=("$2")
                shift 2
                ;;
            -r|--retention)
                RETENTION_DAYS="$2"
                if ! [[ "$RETENTION_DAYS" =~ ^[0-9]+$ ]]; then
                    log_error "Invalid retention days: $RETENTION_DAYS"
                    exit 1
                fi
                shift 2
                ;;
            -n|--dry-run)
                DRY_RUN=true
                shift
                ;;
            -p|--parallel)
                PARALLEL_JOBS="$2"
                shift 2
                ;;
            -m|--email)
                EMAIL="$2"
                if ! [[ "$EMAIL" =~ ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+.[a-zA-Z]{2,}$ ]]; then
                    log_error "Invalid email address: $EMAIL"
                    exit 1
                fi
                shift 2
                ;;
            -*)
                log_error "Unknown option: $1"
                usage
                ;;
            *)
                break
                ;;
        esac
    done

    # Validate required positional arguments
    if [[ $# -lt 2 ]]; then
        log_error "Missing required arguments: SOURCE and DESTINATION"
        usage
    fi

    SOURCE="$1"
    DESTINATION="$2"

    # Validate paths
    if [[ ! -d "$SOURCE" ]]; then
        log_error "Source directory does not exist: $SOURCE"
        exit 1
    fi
}

parse_args "$@"

# Use parsed arguments
$VERBOSE && log "Verbose mode enabled"
log "Compression level: $COMPRESS_LEVEL"
log "Retention days: $RETENTION_DAYS"
[[ ${#EXCLUDE_PATTERNS[@]} -gt 0 ]] && log "Exclude patterns: ${EXCLUDE_PATTERNS[*]}"

4. Arrays and Data Structures

Advanced Array Operations

#!/bin/bash

# Associative arrays for configuration
declare -A SERVER_CONFIG=(
    [hostname]="web01.example.com"
    [port]="8080"
    [max_connections]="1000"
    [timeout]="30"
)

# Indexed arrays for lists
PACKAGES=(
    "nginx"
    "postgresql"
    "redis"
    "nodejs"
)

# Array manipulation functions
array_contains() {
    local needle="$1"
    shift
    local haystack=("$@")

    for item in "${haystack[@]}"; do
        [[ "$item" == "$needle" ]] && return 0
    done
    return 1
}

array_remove() {
    local remove="$1"
    shift
    local -n arr=$1
    local new_array=()

    for item in "${arr[@]}"; do
        [[ "$item" != "$remove" ]] && new_array+=("$item")
    done

    arr=("${new_array[@]}")
}

# Array sorting
array_sort() {
    local -n arr=$1
    local IFS=$'\n'
    arr=($(sort <<<"${arr[*]}"))
}

# Example usage
if array_contains "nginx" "${PACKAGES[@]}"; then
    log "Nginx is in the package list"
fi

array_remove "redis" PACKAGES
array_sort PACKAGES

# Iterate with index
for i in "${!PACKAGES[@]}"; do
    log "Package $((i+1)): ${PACKAGES[$i]}"
done

# Multi-dimensional array simulation
declare -A SERVERS
SERVERS[web1_ip]="192.168.1.10"
SERVERS[web1_port]="80"
SERVERS[web2_ip]="192.168.1.11"
SERVERS[web2_port]="80"

get_server_config() {
    local server_name="$1"
    local property="$2"
    local key="${server_name}_${property}"
    echo "${SERVERS[$key]}"
}

# Usage
web1_ip=$(get_server_config "web1" "ip")
log "Web1 IP: $web1_ip"

JSON Processing

#!/bin/bash

# Parse JSON with jq
parse_json_config() {
    local config_file="$1"

    if ! command -v jq &>/dev/null; then
        log_error "jq is not installed"
        return 1
    fi

    # Read values from JSON
    local database_host=$(jq -r '.database.host' "$config_file")
    local database_port=$(jq -r '.database.port' "$config_file")

    # Process array from JSON
    local -a servers
    mapfile -t servers < <(jq -r '.servers[]' "$config_file")

    for server in "${servers[@]}"; do
        log "Processing server: $server"
    done
}

# Create JSON output
create_json_report() {
    local output_file="$1"

    cat > "$output_file" << EOF
{
    "timestamp": "$(date -Iseconds)",
    "hostname": "$(hostname)",
    "status": "success",
    "metrics": {
        "cpu_usage": $(top -bn1 | grep "Cpu(s)" | awk '{print $2}' | cut -d'%' -f1),
        "memory_used": $(free -m | awk 'NR==2{print $3}'),
        "disk_usage": $(df -h / | awk 'NR==2{print $5}' | sed 's/%//')
    },
    "services": [
        $(systemctl is-active nginx postgresql redis | jq -Rs 'split("
")[:-1] | map({"name": ., "status": .})')
    ]
}
EOF

    log "JSON report created: $output_file"
}
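
For reference, here is a minimal config file that parse_json_config above would accept. The field names are inferred from the jq paths used; the values are illustrative:

{
    "database": {
        "host": "db.example.com",
        "port": 5432
    },
    "servers": [
        "web01.example.com",
        "web02.example.com"
    ]
}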

5. Modular Functions and Libraries

Function Library Pattern

# lib/common.sh - Reusable function library

# Check if library already loaded
[[ -n "${__COMMON_LIB_LOADED__:-}" ]] && return 0
readonly __COMMON_LIB_LOADED__=1

# Validation functions
validate_ip() {
    local ip="$1"
    local pattern='^([0-9]{1,3}\.){3}[0-9]{1,3}$'

    [[ "$ip" =~ $pattern ]] || return 1

    IFS='.' read -ra octets <<< "$ip"
    for octet in "${octets[@]}"; do
        ((octet > 255)) && return 1
    done

    return 0
}

validate_email() {
    local email="$1"
    [[ "$email" =~ ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+.[a-zA-Z]{2,}$ ]]
}

validate_url() {
    local url="$1"
    [[ "$url" =~ ^https?://[a-zA-Z0-9.-]+(:[0-9]+)?(/.*)?$ ]]
}

# File operations
safe_remove() {
    local file="$1"

    if [[ -f "$file" ]]; then
        local backup="${file}.backup.$(date +%Y%m%d%H%M%S)"
        cp "$file" "$backup"
        rm "$file"
        log "Removed $file (backup: $backup)"
    fi
}

ensure_directory() {
    local dir="$1"
    local mode="${2:-0755}"

    if [[ ! -d "$dir" ]]; then
        mkdir -p "$dir"
        chmod "$mode" "$dir"
        log "Created directory: $dir"
    fi
}

# String manipulation
to_uppercase() {
    echo "$1" | tr '[:lower:]' '[:upper:]'
}

to_lowercase() {
    echo "$1" | tr '[:upper:]' '[:lower:]'
}

trim() {
    local var="$1"
    var="${var#"${var%%[![:space:]]*}"}"
    var="${var%"${var##*[![:space:]]}"}"
    echo "$var"
}

# Usage in main script
#!/bin/bash
source "$(dirname "${BASH_SOURCE[0]}")/lib/common.sh"

if validate_ip "192.168.1.1"; then
    log "Valid IP address"
fi

6. Advanced File Operations

File Processing Patterns

#!/bin/bash

# Process large files efficiently
process_large_file() {
    local input_file="$1"
    local batch_size=1000
    local line_count=0
    local batch=()

    while IFS= read -r line; do
        batch+=("$line")
        ((line_count++))

        if ((line_count % batch_size == 0)); then
            process_batch "${batch[@]}"
            batch=()
        fi
    done < "$input_file"

    # Process remaining lines
    [[ ${#batch[@]} -gt 0 ]] && process_batch "${batch[@]}"
}

process_batch() {
    local lines=("$@")
    # Process batch of lines
    for line in "${lines[@]}"; do
        # Your processing logic
        echo "Processing: $line"
    done
}

# Atomic file operations
atomic_write() {
    local target_file="$1"
    local content="$2"
    local temp_file="${target_file}.tmp.$$"

    # Write to temporary file
    echo "$content" > "$temp_file"

    # Verify write was successful
    if [[ ! -f "$temp_file" ]]; then
        log_error "Failed to create temporary file"
        return 1
    fi

    # Atomic rename
    mv "$temp_file" "$target_file"
    log "Atomically updated: $target_file"
}

# File locking
with_file_lock() {
    local lock_file="$1"
    local timeout="${2:-10}"
    shift 2
    local command="$@"

    # Create lock file with timeout
    local count=0
    while ! mkdir "$lock_file" 2>/dev/null; do
        ((count++))
        if ((count >= timeout)); then
            log_error "Failed to acquire lock: $lock_file"
            return 1
        fi
        sleep 1
    done

    # Execute command with lock
    eval "$command"
    local exit_code=$?

    # Release lock
    rmdir "$lock_file"

    return $exit_code
}

# Usage
with_file_lock "/tmp/myapp.lock" 30 "critical_operation"
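
As an alternative to the mkdir-based lock above, the flock(1) utility from util-linux blocks on a file descriptor instead of polling. A minimal sketch, assuming flock is available:

with_flock() {
    local lock_file="$1"
    local timeout="${2:-10}"
    shift 2

    (
        # Wait up to $timeout seconds for an exclusive lock on fd 9
        flock -w "$timeout" 9 || {
            log_error "Failed to acquire lock: $lock_file"
            exit 1
        }
        "$@"
    ) 9>"$lock_file"
}

with_flock "/tmp/myapp.flock" 30 critical_operation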

Directory Synchronization

#!/bin/bash

# Intelligent directory sync
sync_directories() {
    local source="$1"
    local destination="$2"
    local exclude_patterns="${3:-}"

    local rsync_opts=(
        -avz
        --delete
        --progress
        --stats
        --human-readable
    )

    # Add exclude patterns
    if [[ -n "$exclude_patterns" ]]; then
        while IFS= read -r pattern; do
            rsync_opts+=(--exclude="$pattern")
        done <<< "$exclude_patterns"
    fi

    # Perform dry run first
    log "Performing dry run..."
    if rsync --dry-run "${rsync_opts[@]}" "$source/" "$destination/"; then
        log "Dry run successful, proceeding with actual sync"

        if rsync "${rsync_opts[@]}" "$source/" "$destination/"; then
            log_success "Synchronization completed"
            return 0
        else
            log_error "Synchronization failed"
            return 1
        fi
    else
        log_error "Dry run failed, aborting"
        return 1
    fi
}

# Example usage
exclude_patterns="*.log
*.tmp
.git/
node_modules/"

sync_directories "/opt/app" "/backup/app" "$exclude_patterns"

7. Network and API Integration

REST API Interaction

#!/bin/bash

# API client functions
api_call() {
    local method="$1"
    local endpoint="$2"
    local data="${3:-}"
    local headers="${4:-}"

    local base_url="https://api.example.com"
    local auth_token="${API_TOKEN:-}"

    local curl_opts=(
        -s
        -X "$method"
        -H "Content-Type: application/json"
        -H "Authorization: Bearer $auth_token"
    )

    # Add custom headers
    if [[ -n "$headers" ]]; then
        while IFS= read -r header; do
            curl_opts+=(-H "$header")
        done <<< "$headers"
    fi

    # Add data for POST/PUT
    if [[ -n "$data" ]]; then
        curl_opts+=(-d "$data")
    fi

    # Make request and capture response
    local response
    local http_code

    response=$(curl "${curl_opts[@]}" -w '\n%{http_code}' "${base_url}${endpoint}")
    http_code=$(echo "$response" | tail -n1)
    response=$(echo "$response" | sed '$d')

    # Check HTTP status
    if [[ "$http_code" =~ ^2[0-9]{2}$ ]]; then
        echo "$response"
        return 0
    else
        log_error "API call failed with status $http_code"
        log_error "Response: $response"
        return 1
    fi
}

# Example usage
get_user() {
    local user_id="$1"
    api_call "GET" "/users/$user_id"
}

create_user() {
    local name="$1"
    local email="$2"

    local data
    data=$(cat <<EOF
{
    "name": "$name",
    "email": "$email"
}
EOF
)

    api_call "POST" "/users" "$data"
}

Network Monitoring

#!/bin/bash

# Check service availability
check_service() {
    local host="$1"
    local port="$2"
    local timeout="${3:-5}"

    if timeout "$timeout" bash -c "cat < /dev/null > /dev/tcp/$host/$port" 2>/dev/null; then
        log_success "Service $host:$port is available"
        return 0
    else
        log_error "Service $host:$port is not available"
        return 1
    fi
}

# Monitor multiple services
monitor_services() {
    local -A services=(
        [web]="example.com:80"
        [database]="db.example.com:5432"
        [cache]="cache.example.com:6379"
    )

    local failed_services=()

    for service_name in "${!services[@]}"; do
        local host_port="${services[$service_name]}"
        local host="${host_port%:*}"
        local port="${host_port#*:}"

        if ! check_service "$host" "$port"; then
            failed_services+=("$service_name")
        fi
    done

    if [[ ${#failed_services[@]} -gt 0 ]]; then
        log_error "Failed services: ${failed_services[*]}"
        send_alert "Services down: ${failed_services[*]}"
        return 1
    fi

    log_success "All services are healthy"
    return 0
}
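
To run these checks periodically, a cron entry is the simplest option; the script path and interval below are illustrative:

*/5 * * * * /usr/local/bin/monitor-services.sh >> /var/log/monitor-services.log 2>&1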

8. Parallel Processing and Job Control

GNU Parallel

#!/bin/bash

# Process files in parallel
process_files_parallel() {
    local file_list=("$@")
    local max_jobs=4

    export -f process_single_file
    export -f log

    printf '%s\n' "${file_list[@]}" |
        parallel -j "$max_jobs" --bar process_single_file
}

process_single_file() {
    local file="$1"
    log "Processing $file in parallel (PID: $$)"
    # Your processing logic
    sleep 2
    echo "Completed: $file"
}

# Background job management
run_background_jobs() {
    local -a pids=()

    for i in {1..5}; do
        {
            log "Background job $i started"
            sleep $((RANDOM % 10))
            log "Background job $i completed"
        } &
        pids+=($!)
    done

    # Wait for all jobs to complete
    for pid in "${pids[@]}"; do
        wait "$pid"
        log "Job with PID $pid finished"
    done

    log "All background jobs completed"
}

# Process queue with worker pool
declare -a TASK_QUEUE=()
declare -i WORKER_COUNT=4

add_task() {
    TASK_QUEUE+=("$1")
}

worker() {
    local task="$1"
    log "Worker $$ processing: $task"
    # Process task
    sleep 2
    log "Worker $$ completed: $task"
}

process_queue() {
    while [[ ${#TASK_QUEUE[@]} -gt 0 ]]; do
        # Throttle: a counter decremented inside a background subshell would
        # never be visible in this shell, so count running jobs directly
        while (( $(jobs -rp | wc -l) >= WORKER_COUNT )); do
            sleep 0.1
        done

        local task="${TASK_QUEUE[0]}"
        TASK_QUEUE=("${TASK_QUEUE[@]:1}")

        worker "$task" &
    done

    wait
}

# Add tasks
for i in {1..20}; do
    add_task "Task-$i"
done

# Process all tasks
process_queue

9. Security and Input Validation

Input Sanitization

#!/bin/bash

# Sanitize user input
sanitize_input() {
    local input="$1"
    local sanitized

    # Remove dangerous characters
    sanitized="${input//[^a-zA-Z0-9._-]/}"

    echo "$sanitized"
}
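
# Example: path separators and shell metacharacters are stripped
# (the input below is purely illustrative)
user_file=$(sanitize_input '../etc/passwd; rm -rf /')
echo "$user_file"   # -> ..etcpasswdrm-rf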

# SQL injection prevention (for mysql/psql command line)
safe_sql_query() {
    local table="$1"
    local id="$2"

    # Validate inputs: the table must be a plain identifier, the ID numeric
    if ! [[ "$table" =~ ^[A-Za-z_][A-Za-z0-9_]*$ ]]; then
        log_error "Invalid table name: $table"
        return 1
    fi
    if ! [[ "$id" =~ ^[0-9]+$ ]]; then
        log_error "Invalid ID: $id"
        return 1
    fi

    # Interpolation is safe here only because both values were
    # strictly validated above
    mysql -e "SELECT * FROM $table WHERE id = $id"
}

# Command injection prevention
safe_execute() {
    local command="$1"
    shift
    local args=("$@")

    # Whitelist allowed commands
    local -a allowed_commands=(
        "ls"
        "cat"
        "grep"
        "find"
    )

    if ! array_contains "$command" "${allowed_commands[@]}"; then
        log_error "Command not allowed: $command"
        return 1
    fi

    # Execute with explicit arguments (no shell expansion)
    "$command" "${args[@]}"
}

# Path traversal prevention
safe_file_access() {
    local user_path="$1"
    local base_dir="/var/www"

    # Resolve to an absolute path (fails if the file does not exist)
    local real_path
    real_path=$(readlink -f "$user_path") || return 1

    # Check if within the allowed directory; the trailing slash prevents
    # a sibling such as /var/www-evil from passing the prefix test
    if [[ "$real_path" != "$base_dir"/* ]]; then
        log_error "Access denied: $user_path"
        return 1
    fi

    # Safe to access
    cat "$real_path"
}

Secure Credential Management

#!/bin/bash

# Read password securely
read_password() {
    local prompt="$1"
    local password

    read -sp "$prompt: " password
    echo >&2
    echo "$password"
}
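
# Example: read -p writes the prompt to stderr, so the password can be
# captured with command substitution without echoing it to the terminal
db_password=$(read_password "Enter database password")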

# Store credentials encrypted
store_credential() {
    local service="$1"
    local username="$2"
    local password="$3"

    local cred_file="$HOME/.credentials/$service"
    ensure_directory "$(dirname "$cred_file")" 0700

    # Encrypt and store
    echo "$username:$password" | openssl enc -aes-256-cbc -salt -pbkdf2 
        -out "$cred_file"

    chmod 600 "$cred_file"
    log "Credentials stored for $service"
}

# Retrieve credentials
get_credential() {
    local service="$1"
    local cred_file="$HOME/.credentials/$service"

    if [[ ! -f "$cred_file" ]]; then
        log_error "No credentials found for $service"
        return 1
    fi

    # Decrypt and return
    openssl enc -aes-256-cbc -d -pbkdf2 -in "$cred_file"
}

# Use environment variables securely
load_env_file() {
    local env_file="$1"

    if [[ ! -f "$env_file" ]]; then
        log_error "Environment file not found: $env_file"
        return 1
    fi

    # Load without executing arbitrary code
    while IFS='=' read -r key value; do
        # Skip comments and empty lines
        [[ "$key" =~ ^#.*$ ]] && continue
        [[ -z "$key" ]] && continue

        # Remove quotes from value
        value="${value%"}"
        value="${value#"}"

        export "$key=$value"
    done < "$env_file"
}
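
For reference, a sample .env file that load_env_file understands; the keys shown are illustrative:

# Database settings
DB_HOST="db.example.com"
DB_PORT=5432
API_TOKEN="replace-me"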

10. System Integration Scripts

Complete System Monitoring Script

#!/bin/bash

# Complete monitoring script
readonly SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "$SCRIPT_DIR/lib/common.sh"

# Configuration
readonly ALERT_EMAIL="admin@example.com"
readonly THRESHOLDS_FILE="/etc/monitoring/thresholds.conf"
readonly STATE_FILE="/var/lib/monitoring/state.json"

# Load thresholds
load_thresholds() {
    [[ -f "$THRESHOLDS_FILE" ]] && source "$THRESHOLDS_FILE"

    CPU_THRESHOLD="${CPU_THRESHOLD:-80}"
    MEMORY_THRESHOLD="${MEMORY_THRESHOLD:-90}"
    DISK_THRESHOLD="${DISK_THRESHOLD:-85}"
    LOAD_THRESHOLD="${LOAD_THRESHOLD:-5.0}"
}

# Check system resources
check_cpu() {
    local cpu_usage=$(top -bn2 -d 0.5 | grep "Cpu(s)" | tail -1 | awk '{print $2}' | cut -d'%' -f1)

    if (($(echo "$cpu_usage > $CPU_THRESHOLD" | bc -l))); then
        log_warning "High CPU usage: ${cpu_usage}%"
        send_alert "CPU Alert" "CPU usage is ${cpu_usage}% (threshold: ${CPU_THRESHOLD}%)"
        return 1
    fi

    log "CPU usage: ${cpu_usage}%"
    return 0
}

check_memory() {
    local mem_total=$(free -m | awk 'NR==2{print $2}')
    local mem_used=$(free -m | awk 'NR==2{print $3}')
    local mem_percent=$((mem_used * 100 / mem_total))

    if ((mem_percent > MEMORY_THRESHOLD)); then
        log_warning "High memory usage: ${mem_percent}%"
        send_alert "Memory Alert" "Memory usage is ${mem_percent}% (threshold: ${MEMORY_THRESHOLD}%)"
        return 1
    fi

    log "Memory usage: ${mem_percent}%"
    return 0
}

check_disk() {
    local failed=false

    while IFS= read -r line; do
        local usage=$(echo "$line" | awk '{print $5}' | sed 's/%//')
        local mount=$(echo "$line" | awk '{print $6}')

        if ((usage > DISK_THRESHOLD)); then
            log_warning "High disk usage on $mount: ${usage}%"
            send_alert "Disk Alert" "Disk usage on $mount is ${usage}% (threshold: ${DISK_THRESHOLD}%)"
            failed=true
        fi
    done < <(df -h | grep -vE '^Filesystem|tmpfs|cdrom')

    [[ "$failed" == "true" ]] && return 1
    return 0
}

check_load() {
    local load_avg=$(uptime | awk -F'load average:' '{print $2}' | awk '{print $1}' | sed 's/,//')

    if (($(echo "$load_avg > $LOAD_THRESHOLD" | bc -l))); then
        log_warning "High load average: $load_avg"
        send_alert "Load Alert" "Load average is $load_avg (threshold: $LOAD_THRESHOLD)"
        return 1
    fi

    log "Load average: $load_avg"
    return 0
}

check_services() {
    local -a required_services=("nginx" "postgresql" "redis")
    local failed_services=()

    for service in "${required_services[@]}"; do
        if ! systemctl is-active --quiet "$service"; then
            log_error "Service $service is not running"
            failed_services+=("$service")

            # Attempt restart
            if systemctl restart "$service"; then
                log_success "Successfully restarted $service"
            else
                log_error "Failed to restart $service"
            fi
        fi
    done

    if [[ ${#failed_services[@]} -gt 0 ]]; then
        send_alert "Service Alert" "Services down: ${failed_services[*]}"
        return 1
    fi

    return 0
}

# Send alert
send_alert() {
    local subject="$1"
    local message="$2"
    local hostname=$(hostname)

    local email_body=$(cat <<EOF
System alert from $hostname

$message

Generated: $(date)
EOF
)

    echo "$email_body" | mail -s "[$hostname] $subject" "$ALERT_EMAIL"
}

# Persist check results
save_state() {
    cat > "$STATE_FILE" << EOF
{
    "timestamp": "$(date -Iseconds)",
    "checks": {
        "cpu": $1,
        "memory": $2,
        "disk": $3,
        "load": $4,
        "services": $5
    }
}
EOF
}

# Main monitoring logic
main() {
    log "Starting system monitoring"

    load_thresholds

    local cpu_ok=0
    local mem_ok=0
    local disk_ok=0
    local load_ok=0
    local services_ok=0

    check_cpu && cpu_ok=1
    check_memory && mem_ok=1
    check_disk && disk_ok=1
    check_load && load_ok=1
    check_services && services_ok=1

    save_state $cpu_ok $mem_ok $disk_ok $load_ok $services_ok

    log "System monitoring completed"
}

main "$@"

11. Testing and Debugging

Unit Testing with BATS

#!/usr/bin/env bats

# tests/test_common.bats

setup() {
    source "$BATS_TEST_DIRNAME/../lib/common.sh"
}

@test "validate_ip: valid IP should return 0" {
    run validate_ip "192.168.1.1"
    [ "$status" -eq 0 ]
}

@test "validate_ip: invalid IP should return 1" {
    run validate_ip "999.999.999.999"
    [ "$status" -eq 1 ]
}

@test "validate_email: valid email should return 0" {
    run validate_email "test@example.com"
    [ "$status" -eq 0 ]
}

@test "validate_email: invalid email should return 1" {
    run validate_email "invalid-email"
    [ "$status" -eq 1 ]
}

@test "array_contains: existing item should return 0" {
    local arr=("apple" "banana" "cherry")
    run array_contains "banana" "${arr[@]}"
    [ "$status" -eq 0 ]
}

# Run tests with: bats tests/

Debugging Techniques

#!/bin/bash

# Enable different debug levels
DEBUG_LEVEL="${DEBUG_LEVEL:-0}"

debug() {
    local level="$1"
    shift

    if ((DEBUG_LEVEL >= level)); then
        echo "[DEBUG $level] $*" >&2
    fi
}

# Use throughout script
debug 1 "Entering function: ${FUNCNAME[1]}"
debug 2 "Variable value: var=$var"
debug 3 "Full environment: $(env)"

# Trace execution
enable_trace() {
    set -x
    PS4='+(${BASH_SOURCE}:${LINENO}): ${FUNCNAME[0]:+${FUNCNAME[0]}(): }'
}

# Profile script execution
profile_start() {
    PROFILE_START=$(date +%s%N)
}

profile_end() {
    local PROFILE_END=$(date +%s%N)
    local DURATION=$(( (PROFILE_END - PROFILE_START) / 1000000 ))
    log "Execution time: ${DURATION}ms"
}

# Usage
profile_start
# Your code here
profile_end

12. Best Practices and Patterns

Script Organization

project/
├── bin/
│   └── main-script.sh          # Main entry point
├── lib/
│   ├── common.sh              # Common functions
│   ├── logging.sh             # Logging utilities
│   └── validation.sh          # Input validation
├── config/
│   ├── default.conf           # Default configuration
│   └── production.conf        # Production settings
├── tests/
│   ├── test_common.sh         # Unit tests
│   └── integration_test.sh    # Integration tests
├── docs/
│   └── README.md              # Documentation
└── .editorconfig              # Editor configuration

Code Style Guidelines

#!/bin/bash

# 1. Use meaningful variable names
readonly DATABASE_CONNECTION_STRING="postgresql://localhost/mydb"  # Good
readonly DB="postgresql://localhost/mydb"                          # Bad

# 2. Use functions for code organization
process_user_data() {  # Good: descriptive function name
    local user_id="$1"
    # Process user
}

proc() {  # Bad: unclear function name
    # Do something
}

# 3. Always quote variables
echo "$variable"      # Good
echo $variable        # Bad

# 4. Use [[ ]] for conditionals
if [[ "$var" == "value" ]]; then  # Good
    echo "Match"
fi

if [ "$var" == "value" ]; then    # Less preferred
    echo "Match"
fi

# 5. Check command success
if command_that_might_fail; then  # Good
    log "Success"
else
    log_error "Failed"
    exit 1
fi

command_that_might_fail  # Bad: ignores errors

# 6. Use readonly for constants
readonly MAX_RETRIES=3  # Good
MAX_RETRIES=3          # Bad

# 7. Local variables in functions
my_function() {
    local temp_var="value"  # Good
    temp_var="value"        # Bad: pollutes global scope
}

Conclusion

Advanced Bash scripting is a powerful skill that enables automation of complex system administration tasks. Key takeaways:

  • Error Handling: Use strict mode (set -euo pipefail) and comprehensive error trapping
  • Modularity: Organize code into reusable functions and libraries
  • Security: Always validate and sanitize user input
  • Logging: Implement comprehensive logging for debugging and auditing
  • Testing: Write unit tests and perform thorough testing
  • Documentation: Comment complex logic and maintain README files
  • Performance: Use parallel processing for CPU-intensive tasks
  • Maintainability: Follow consistent coding standards and best practices

Professional Bash scripting combines technical expertise with software engineering principles to create reliable, maintainable automation tools for enterprise environments.


🏷️ Tags: automation bash error-handling scripting system-administration

About Ramesh Sundararamaiah

Red Hat Certified Architect

Expert in Linux system administration, DevOps automation, and cloud infrastructure. Specializing in Red Hat Enterprise Linux, CentOS, Ubuntu, Docker, Ansible, and enterprise IT solutions.