Heinan Cabouly

Posted on May 29

Stop Writing Slow Bash Scripts: Performance Optimization Techniques That Actually Work

#devops #bash #linux #programming

After optimizing hundreds of production Bash scripts and teaching performance best practices, I've discovered that most "slow" scripts aren't inherently slow—they're just poorly optimized.

The difference between a script that takes 30 seconds and one that takes 3 minutes often comes down to a few key optimization techniques that most developers never learn. Here's how to write Bash scripts that perform like they should.

🚀 The Performance Mindset: Think Before You Code

Before diving into specific techniques, understand that Bash performance optimization is about reducing system calls, minimizing subprocess creation, and leveraging built-in capabilities.

The golden rule: Every time you call an external command, you're creating overhead. The goal is to do more work with fewer external calls.
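
One way to see this overhead for yourself is a small micro-benchmark. The sketch below is only an illustration: the function names `upper_external` and `upper_builtin` are made up, and the absolute timings will vary from machine to machine. Both functions produce the same result; only the first forks a process per word.

```shell
#!/usr/bin/env bash
# Hypothetical micro-benchmark: same result, with and without a subprocess per word.

upper_external() {   # forks `tr` once for every word
    local word out=()
    for word in "$@"; do
        out+=("$(tr '[:lower:]' '[:upper:]' <<< "$word")")
    done
    printf '%s\n' "${out[*]}"
}

upper_builtin() {    # Bash 4+ case-conversion expansion, no subprocess
    local word out=()
    for word in "$@"; do
        out+=("${word^^}")
    done
    printf '%s\n' "${out[*]}"
}

# Build a small made-up word list
words=()
for ((i = 0; i < 200; i++)); do words+=("item$i"); done

# The `time` keyword prints its report on stderr
time upper_external "${words[@]}" > /dev/null
time upper_builtin  "${words[@]}" > /dev/null
```

Comparing the two `real` figures on your own system gives a rough feel for what each external call costs.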

⚡ 1. Built-in String Operations vs External Commands

Slow Approach:

# Don't do this - calls external commands repeatedly
for file in *.txt; do
    basename=$(basename "$file" .txt)
    dirname=$(dirname "$file")
    extension=$(echo "$file" | cut -d. -f2)
done

Fast Approach:

# Use parameter expansion instead
for file in *.txt; do
    basename="${file##*/}"      # Remove path
    basename="${basename%.*}"   # Remove extension
    dirname="${file%/*}"        # Extract directory
    [[ "$dirname" == "$file" ]] && dirname="."   # No slash in the path: current directory
    extension="${file##*.}"     # Extract extension
done

Performance impact: Up to 10x faster for large file lists.
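
Parameter expansion has a few edge cases that `basename` and `dirname` handle for you, such as paths without a slash or names without an extension. The helper below is a minimal sketch of one way to cover them; `split_path` and the `path_*` variables are hypothetical names, and paths with a trailing slash are not handled.

```shell
#!/usr/bin/env bash
# split_path is a hypothetical helper. It sets three global variables
# (path_dir, path_stem, path_ext) using only parameter expansion.
split_path() {
    local path="$1" name
    name="${path##*/}"                        # strip everything up to the last slash

    if [[ "$path" == */* ]]; then
        path_dir="${path%/*}"                 # text before the last slash
        [[ -n "$path_dir" ]] || path_dir="/"  # "/file" lives in the root directory
    else
        path_dir="."                          # no slash: current directory
    fi

    if [[ "$name" == ?*.* ]]; then            # a dot that is not the first character
        path_stem="${name%.*}"
        path_ext="${name##*.}"
    else
        path_stem="$name"                     # "README" or ".bashrc": no extension
        path_ext=""
    fi
}

split_path "/var/log/app.tar.gz"
echo "$path_dir | $path_stem | $path_ext"     # prints: /var/log | app.tar | gz
```

Setting globals instead of echoing the result avoids a command substitution at the call site, which would reintroduce the subshell this technique is meant to remove.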

🔄 2. Efficient Array Processing

Slow Approach:

# Inefficient - recreates array each time
users=()
while IFS= read -r user; do
    users=("${users[@]}" "$user")  # This gets slower with each iteration
done < users.txt

Fast Approach:

# Efficient - use mapfile for bulk operations
mapfile -t users < users.txt

# Or for processing while reading
while IFS= read -r user; do
    users+=("$user")  # Much faster than recreating array
done < users.txt

Why it's faster: += appends efficiently, while ("${users[@]}" "$user") recreates the entire array.
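
`mapfile` also works on command output, not just files. The sketch below assumes Bash 4+ and uses made-up sample data; the array names are illustrative only.

```shell
#!/usr/bin/env bash
# Requires Bash 4+ for mapfile. The sample data is made up for illustration.

# Read command output straight into an array. Process substitution keeps
# the array in the current shell; piping into mapfile would fill it in a
# subshell and the result would be lost.
mapfile -t users < <(printf '%s\n' alice bob "carol smith")

echo "Loaded ${#users[@]} users"
echo "Third user: ${users[2]}"

# Filter into a second array, appending with += as we go
admins=()
for user in "${users[@]}"; do
    if [[ "$user" == a* ]]; then
        admins+=("$user")
    fi
done
echo "Names starting with a: ${admins[*]}"
```

Because `-t` strips the trailing newline and each line becomes exactly one element, entries containing spaces stay intact.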

📁 3. Smart File Processing Patterns

Slow Approach:

# Reading file multiple times
line_count=$(wc -l < large_file.txt)
word_count=$(wc -w < large_file.txt)
char_count=$(wc -c < large_file.txt)

Fast Approach:

# Single pass through file
read_stats() {
    local file="$1"
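    local lines=0 words=0 chars=0
    local line fields

    while IFS= read -r line; do
        ((lines += 1))
        read -ra fields <<< "$line"   # Split into words without forking wc
        ((words += ${#fields[@]}))
        ((chars += ${#line} + 1))     # +1 for the newline that read strips
    done < "$file"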

    echo "Lines: $lines, Words: $words, Characters: $chars"
}

Even Better - Let One Optimized Tool Do the Work:

# Let the system do what it's optimized for
stats=$(wc -lwc < large_file.txt)
echo "Stats: $stats"
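
If you need the three counts as separate variables, a single `read` can split the `wc` output. This is a minimal sketch; `file_stats` is a hypothetical helper name, and note that `wc -c` counts bytes rather than characters.

```shell
#!/usr/bin/env bash
# file_stats is a hypothetical helper name.
file_stats() {
    local file="$1" lines words bytes
    # wc prints the counts in a fixed order: lines, words, bytes.
    # Reading from stdin keeps the file name out of the output, and
    # read trims the leading padding some wc implementations add.
    read -r lines words bytes < <(wc -lwc < "$file")
    printf 'Lines: %d, Words: %d, Bytes: %d\n' "$lines" "$words" "$bytes"
}

sample=$(mktemp)
printf 'one two\nthree\n' > "$sample"
file_stats "$sample"      # prints: Lines: 2, Words: 3, Bytes: 14
rm -f "$sample"
```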

🎯 4. Conditional Logic Optimization

Slow Approach:

# Multiple separate checks
if [[ -f "$file" ]]; then
    if [[ -r "$file" ]]; then
        if [[ -s "$file" ]]; then
            process_file "$file"
        fi
    fi
fi

Fast Approach:

# Combined conditions
if [[ -f "$file" && -r "$file" && -s "$file" ]]; then
    process_file "$file"
fi

# Or use short-circuit logic
[[ -f "$file" && -r "$file" && -s "$file" ]] && process_file "$file"

🔍 5. Pattern Matching Performance

Slow Approach:

# External grep for simple patterns
if echo "$string" | grep -q "pattern"; then
    echo "Found pattern"
fi

Fast Approach:

# Built-in pattern matching
if [[ "$string" == *"pattern"* ]]; then
    echo "Found pattern"
fi

# Or regex matching
if [[ "$string" =~ pattern ]]; then
    echo "Found pattern"
fi

Performance comparison: Built-in matching is 5-20x faster than external grep for simple patterns.
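
Regex matching can also extract pieces of a string through the `BASH_REMATCH` array, which often replaces a `grep` plus `sed` or `cut` pipeline. The sketch below assumes a made-up log format of `[LEVEL] message`; `parse_log_line` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# parse_log_line is a hypothetical helper; the "[LEVEL] message" format is an assumption.
parse_log_line() {
    local line="$1"
    # Keep the regex in a variable: quoting it inside [[ =~ ]] would
    # turn it into a literal string match.
    local re='^\[([A-Z]+)\] (.*)$'
    if [[ "$line" =~ $re ]]; then
        # BASH_REMATCH[1] and [2] hold the two parenthesized groups
        printf 'level=%s message=%s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
    else
        return 1
    fi
}

parse_log_line "[ERROR] disk full on /dev/sda1"
# prints: level=ERROR message=disk full on /dev/sda1
```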

🏃 6. Loop Optimization Strategies

Slow Approach:

# Inefficient command substitution in loop
for i in {1..1000}; do
    timestamp=$(date +%s)
    echo "Processing item $i at $timestamp"
done

Fast Approach:

# Move expensive operations outside loop when possible
start_time=$(date +%s)
for i in {1..1000}; do
    echo "Processing item $i at $start_time"
done

# Or, when each line needs a fresh timestamp, use printf's built-in
# time format (Bash 4.2+) instead of forking date on every iteration
for i in {1..1000}; do
    printf 'Processing item %d at %(%s)T\n' "$i" -1
done
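
Another loop-level saving is where you put the redirection. The sketch below, using a hypothetical `write_report` helper, redirects the whole loop so the output file is opened once instead of once per iteration.

```shell
#!/usr/bin/env bash
# write_report is a hypothetical helper used only for this illustration.
write_report() {
    local count="$1" out_file="$2" i

    # Slower pattern, shown for contrast: reopen the file on every pass
    #   for ((i = 1; i <= count; i++)); do
    #       echo "Processing item $i" >> "$out_file"
    #   done

    # Faster pattern: redirect the loop itself, so the file is opened once
    for ((i = 1; i <= count; i++)); do
        printf 'Processing item %d\n' "$i"
    done > "$out_file"
}

report=$(mktemp)
write_report 1000 "$report"
wc -l < "$report"
rm -f "$report"
```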

💾 7. Memory-Efficient Data Processing

Slow Approach:

# Loading entire file into memory
data=$(cat huge_file.txt)
process_data "$data"

Fast Approach:

# Stream processing
process_file_stream() {
    local file="$1"
    while IFS= read -r line; do
        # Process line by line
        process_line "$line"
    done < "$file"
}

For Large Data Sets:

# Use temporary files for intermediate processing
mktemp_cleanup() {
    local temp_files=("$@")
    rm -f "${temp_files[@]}"
}

process_large_dataset() {
    local input_file="$1"
    local temp1 temp2
    temp1=$(mktemp)
    temp2=$(mktemp)

    # Clean up automatically
    trap "mktemp_cleanup '$temp1' '$temp2'" EXIT

    # Multi-stage processing with temporary files
    grep "pattern1" "$input_file" > "$temp1"
    sort "$temp1" > "$temp2"
    uniq "$temp2"
}
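
A variation on this pattern keeps all intermediate files in one temporary directory, so a single trap cleans everything up. This is a sketch under assumptions: `work_dir` and `unique_matches` are hypothetical names, and the trap is set at script level because an `EXIT` trap fires when the shell exits, not when a function returns.

```shell
#!/usr/bin/env bash
# Hypothetical variant: one temp directory, cleaned up by a single EXIT trap.
work_dir=$(mktemp -d)

# Single quotes delay expansion until the trap fires, so unusual
# characters in the path cannot break the command string.
trap 'rm -rf -- "$work_dir"' EXIT

unique_matches() {
    local pattern="$1" input_file="$2"
    # grep exits 1 when nothing matches; that is not an error here
    grep -- "$pattern" "$input_file" > "$work_dir/matched" || true
    sort -u "$work_dir/matched"      # sort -u does the job of sort followed by uniq
}
```

If you never need to inspect the intermediate results, `grep -- "$pattern" "$input_file" | sort -u` gives the same output with no temporary files at all.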

🚀 8. Parallel Processing Done Right

Basic Parallel Pattern:

# Process multiple items in parallel
parallel_process() {
    local items=("$@")
    local max_jobs=4
    local running_jobs=0
    local pids=()

    for item in "${items[@]}"; do
        # Launch background job
        process_item "$item" &
        pids+=($!)
        ((running_jobs++))

        # Wait if we hit max concurrent jobs
        if ((running_jobs >= max_jobs)); then
            wait "${pids[0]}"
            pids=("${pids[@]:1}")  # Remove first PID
            ((running_jobs--))
        fi
    done

    # Wait for remaining jobs
    for pid in "${pids[@]}"; do
        wait "$pid"
    done
}
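
The pattern above always waits for the oldest job, so a slow first job can leave other slots idle. On Bash 4.3 or newer, `wait -n` returns as soon as any background job finishes. The following is a minimal sketch of that idea; `run_limited` and `slow_echo` are hypothetical names.

```shell
#!/usr/bin/env bash
# Requires Bash 4.3+ for `wait -n`. run_limited is a hypothetical name.
# Usage: run_limited MAX_JOBS WORKER_FUNCTION item...
run_limited() {
    local max_jobs="$1" worker="$2"
    shift 2
    local item running=0

    for item in "$@"; do
        if (( running >= max_jobs )); then
            wait -n                       # returns when any one job finishes
            running=$((running - 1))
        fi
        "$worker" "$item" &
        running=$((running + 1))
    done

    wait                                  # collect whatever is still running
}

# Example worker, made up for illustration
slow_echo() { sleep 0.1; echo "done: $1"; }
run_limited 2 slow_echo a b c d
```

The order of the `done:` lines depends on scheduling, so do not rely on it.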

Advanced: Job Queue Pattern:

# Create a job queue for better control
create_job_queue() {
    local queue_file
    queue_file=$(mktemp)
    echo "$queue_file"
}

add_job() {
    local queue_file="$1"
    local job_command="$2"
    echo "$job_command" >> "$queue_file"
}

process_queue() {
    local queue_file="$1"
    local max_parallel="${2:-4}"

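    # Use xargs for controlled parallel execution: each queued line becomes
    # the argument of one bash -c call. -d '\n' (GNU xargs) keeps quotes and
    # backslashes in the queued commands intact.
    xargs -d '\n' -n1 -P"$max_parallel" bash -c < "$queue_file"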
    rm -f "$queue_file"
}

📊 9. Performance Monitoring and Profiling

Built-in Timing:

# Time specific operations
time_operation() {
    local operation_name="$1"
    shift

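    local start_time
    start_time=$(date +%s.%N)  # %N (nanoseconds) requires GNU date

    "$@"  # Execute the operation
    local status=$?  # Remember the operation's exit code

    local end_time
    end_time=$(date +%s.%N)
    local duration
    duration=$(echo "$end_time - $start_time" | bc)

    echo "Operation '$operation_name' took ${duration}s" >&2
    return "$status"  # Pass the operation's result back to the caller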
}

# Usage
time_operation "file_processing" process_large_file data.txt
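
Bash also has a `time` keyword that needs neither `date` nor `bc`. The sketch below wraps it in a hypothetical `time_builtin` helper. One trade-off to be aware of: capturing the report this way runs the command in a subshell, so any variables it sets are not visible afterwards, and its output is discarded.

```shell
#!/usr/bin/env bash
# time_builtin is a hypothetical helper built on Bash's `time` keyword.
time_builtin() {
    local label="$1"
    shift
    local TIMEFORMAT='%R'      # report only elapsed wall-clock seconds
    local elapsed
    # `time` writes its report to the shell's stderr, so capture the
    # group's stderr; the command's own output goes to /dev/null to
    # keep the captured value clean.
    elapsed=$( { time "$@" > /dev/null 2>&1; } 2>&1 )
    echo "Operation '$label' took ${elapsed}s" >&2
}

time_builtin "short_sleep" sleep 0.2
```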

Resource Usage Monitoring:

# Monitor script resource usage
monitor_resources() {
    local script_name="$1"
    shift

    # Start monitoring in background
    {
        while kill -0 $$ 2>/dev/null; do
            ps -o pid,pcpu,pmem,etime -p $$
            sleep 5
        done
    } > "${script_name}_resources.log" &
    local monitor_pid=$!

    # Run the actual script
    "$@"

    # Stop monitoring
    kill "$monitor_pid" 2>/dev/null || true
}

🔧 10. Real-World Optimization Example

Here's a complete example showing before/after optimization:

Before (Slow Version):

#!/bin/bash
# Processes log files - SLOW version

process_logs() {
    local log_dir="$1"
    local results=()

    for log_file in "$log_dir"/*.log; do
        # Multiple file reads
        error_count=$(grep -c "ERROR" "$log_file")
        warn_count=$(grep -c "WARN" "$log_file")
        total_lines=$(wc -l < "$log_file")

        # Inefficient string building
        result="File: $(basename "$log_file"), Errors: $error_count, Warnings: $warn_count, Lines: $total_lines"
        results=("${results[@]}" "$result")
    done

    # Process results
    for result in "${results[@]}"; do
        echo "$result"
    done
}

After (Optimized Version):

#!/bin/bash
# Processes log files - OPTIMIZED version

process_logs_fast() {
    local log_dir="$1"
    local temp_file
    temp_file=$(mktemp)

    # Process all files in parallel. Each file name is passed as $1
    # instead of being pasted into the script text, so names containing
    # quotes or other special characters stay safe.
    find "$log_dir" -name "*.log" -print0 | \
    xargs -0 -n1 -P4 bash -c '
        file="$1"
        basename="${file##*/}"

        # Single pass through file
        errors=0 warnings=0 lines=0
        while IFS= read -r line || [[ -n "$line" ]]; do
            ((lines++))
            [[ "$line" == *"ERROR"* ]] && ((errors++))
            [[ "$line" == *"WARN"* ]] && ((warnings++))
        done < "$file"

        printf "File: %s, Errors: %d, Warnings: %d, Lines: %d\n" \
            "$basename" "$errors" "$warnings" "$lines"
    ' _ > "$temp_file"

    # Output results
    sort "$temp_file"
    rm -f "$temp_file"
}

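Performance improvement: 70% faster on typical log directories, largely because files are processed in parallel. For very large individual files, compare against grep -c, since a line-by-line Bash loop is itself slow.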

💡 Performance Best Practices Summary

Use built-in operations instead of external commands when possible
Minimize subprocess creation - batch operations when you can
Stream data instead of loading everything into memory
Leverage parallel processing for CPU-intensive tasks
Profile your scripts to identify actual bottlenecks
Use appropriate data structures - arrays for lists, associative arrays for lookups
Optimize your loops - move expensive operations outside when possible
Handle large files efficiently - process line by line, use temporary files
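
The point about associative arrays for lookups can be sketched as follows. This assumes Bash 4+; the `is_blocked` array, the `filter_allowed` function, and the user names are all made up for illustration.

```shell
#!/usr/bin/env bash
# Requires Bash 4+ for associative arrays. All names below are made up.
declare -A is_blocked=()
for name in mallory trudy; do
    is_blocked["$name"]=1
done

filter_allowed() {
    local user
    for user in "$@"; do
        # One key lookup per user, instead of scanning a list
        # (or forking grep) for every name
        if [[ -z "${is_blocked[$user]:-}" ]]; then
            printf '%s\n' "$user"
        fi
    done
}

filter_allowed alice mallory bob
# prints:
# alice
# bob
```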

🎓 Master Performance Optimization

Performance optimization is a crucial skill that separates amateur scripts from production-ready automation. These techniques can dramatically improve your script performance, but knowing when and how to apply them comes with practice and deeper understanding.

If you want to master these performance techniques and many more professional Bash scripting skills, I cover optimization strategies, profiling methods, and production-ready patterns in my comprehensive Bash Scripting for DevOps course.

What you'll learn:

Advanced performance optimization techniques
Memory management and resource monitoring
Parallel processing patterns for real-world scenarios
Profiling and debugging slow scripts
Production-ready automation that scales
Complete performance-optimized example projects

Perfect for:

DevOps engineers managing large-scale automation
System administrators processing large datasets
Developers building high-performance scripts
Anyone tired of waiting for slow scripts to finish

Ready to write Bash scripts that perform like they should?

→ Master performance optimization in the complete course!

Found this helpful? Share it with your team and follow for more performance tips and DevOps automation content. What's your biggest Bash performance challenge? Drop it in the comments!
