DEV Community

Cover image for Stop Writing Slow Bash Scripts: Performance - Optimization Techniques That Actually Work
Heinan Cabouly
Heinan Cabouly

Posted on

Stop Writing Slow Bash Scripts: Performance - Optimization Techniques That Actually Work

After optimizing hundreds of production Bash scripts and teaching performance best practices, I've discovered that most "slow" scripts aren't inherently slowβ€”they're just poorly optimized.

The difference between a script that takes 30 seconds and one that takes 3 minutes often comes down to a few key optimization techniques that most developers never learn. Here's how to write Bash scripts that perform like they should.


πŸš€ The Performance Mindset: Think Before You Code

Before diving into specific techniques, understand that Bash performance optimization is about reducing system calls, minimizing subprocess creation, and leveraging built-in capabilities.

The golden rule: Every time you call an external command, you're creating overhead. The goal is to do more work with fewer external calls.


⚑ 1. Built-in String Operations vs External Commands

Slow Approach:

# Don't do this - calls external commands repeatedly
for file in *.txt; do
    basename=$(basename "$file" .txt)
    dirname=$(dirname "$file")
    extension=$(echo "$file" | cut -d. -f2)
done
Enter fullscreen mode Exit fullscreen mode

Fast Approach:

# Use parameter expansion instead
for file in *.txt; do
    basename="${file##*/}"      # Remove path
    basename="${basename%.*}"   # Remove extension
    dirname="${file%/*}"        # Extract directory
    extension="${file##*.}"     # Extract extension
done
Enter fullscreen mode Exit fullscreen mode

Performance impact: Up to 10x faster for large file lists.


πŸ”„ 2. Efficient Array Processing

Slow Approach:

# Inefficient - recreates array each time
users=()
while IFS= read -r user; do
    users=("${users[@]}" "$user")  # This gets slower with each iteration
done < users.txt
Enter fullscreen mode Exit fullscreen mode

Fast Approach:

# Efficient - use mapfile for bulk operations
mapfile -t users < users.txt

# Or for processing while reading
while IFS= read -r user; do
    users+=("$user")  # Much faster than recreating array
done < users.txt
Enter fullscreen mode Exit fullscreen mode

Why it's faster: += appends efficiently, while ("${users[@]}" "$user") recreates the entire array.


πŸ“ 3. Smart File Processing Patterns

Slow Approach:

# Reading file multiple times
line_count=$(wc -l < large_file.txt)
word_count=$(wc -w < large_file.txt)
char_count=$(wc -c < large_file.txt)
Enter fullscreen mode Exit fullscreen mode

Fast Approach:

# Single pass through file
read_stats() {
    local file="$1"
    local lines=0 words=0 chars=0

    while IFS= read -r line; do
        ((lines++))
        words+=$(echo "$line" | wc -w)
        chars+=${#line}
    done < "$file"

    echo "Lines: $lines, Words: $words, Characters: $chars"
}
Enter fullscreen mode Exit fullscreen mode

Even Better - Use Built-in When Possible:

# Let the system do what it's optimized for
stats=$(wc -lwc < large_file.txt)
echo "Stats: $stats"
Enter fullscreen mode Exit fullscreen mode

🎯 4. Conditional Logic Optimization

Slow Approach:

# Multiple separate checks
if [[ -f "$file" ]]; then
    if [[ -r "$file" ]]; then
        if [[ -s "$file" ]]; then
            process_file "$file"
        fi
    fi
fi
Enter fullscreen mode Exit fullscreen mode

Fast Approach:

# Combined conditions
if [[ -f "$file" && -r "$file" && -s "$file" ]]; then
    process_file "$file"
fi

# Or use short-circuit logic
[[ -f "$file" && -r "$file" && -s "$file" ]] && process_file "$file"
Enter fullscreen mode Exit fullscreen mode

πŸ” 5. Pattern Matching Performance

Slow Approach:

# External grep for simple patterns
if echo "$string" | grep -q "pattern"; then
    echo "Found pattern"
fi
Enter fullscreen mode Exit fullscreen mode

Fast Approach:

# Built-in pattern matching
if [[ "$string" == *"pattern"* ]]; then
    echo "Found pattern"
fi

# Or regex matching
if [[ "$string" =~ pattern ]]; then
    echo "Found pattern"
fi
Enter fullscreen mode Exit fullscreen mode

Performance comparison: Built-in matching is 5-20x faster than external grep for simple patterns.


πŸƒ 6. Loop Optimization Strategies

Slow Approach:

# Inefficient command substitution in loop
for i in {1..1000}; do
    timestamp=$(date +%s)
    echo "Processing item $i at $timestamp"
done
Enter fullscreen mode Exit fullscreen mode

Fast Approach:

# Move expensive operations outside loop when possible
start_time=$(date +%s)
for i in {1..1000}; do
    echo "Processing item $i at $start_time"
done

# Or batch operations
{
    for i in {1..1000}; do
        echo "Processing item $i"
    done
} | while IFS= read -r line; do
    echo "$line at $(date +%s)"
done
Enter fullscreen mode Exit fullscreen mode

πŸ’Ύ 7. Memory-Efficient Data Processing

Slow Approach:

# Loading entire file into memory
data=$(cat huge_file.txt)
process_data "$data"
Enter fullscreen mode Exit fullscreen mode

Fast Approach:

# Stream processing
process_file_stream() {
    local file="$1"
    while IFS= read -r line; do
        # Process line by line
        process_line "$line"
    done < "$file"
}
Enter fullscreen mode Exit fullscreen mode

For Large Data Sets:

# Use temporary files for intermediate processing
mktemp_cleanup() {
    local temp_files=("$@")
    rm -f "${temp_files[@]}"
}

process_large_dataset() {
    local input_file="$1"
    local temp1 temp2
    temp1=$(mktemp)
    temp2=$(mktemp)

    # Clean up automatically
    trap "mktemp_cleanup '$temp1' '$temp2'" EXIT

    # Multi-stage processing with temporary files
    grep "pattern1" "$input_file" > "$temp1"
    sort "$temp1" > "$temp2"
    uniq "$temp2"
}
Enter fullscreen mode Exit fullscreen mode

πŸš€ 8. Parallel Processing Done Right

Basic Parallel Pattern:

# Process multiple items in parallel
parallel_process() {
    local items=("$@")
    local max_jobs=4
    local running_jobs=0
    local pids=()

    for item in "${items[@]}"; do
        # Launch background job
        process_item "$item" &
        pids+=($!)
        ((running_jobs++))

        # Wait if we hit max concurrent jobs
        if ((running_jobs >= max_jobs)); then
            wait "${pids[0]}"
            pids=("${pids[@]:1}")  # Remove first PID
            ((running_jobs--))
        fi
    done

    # Wait for remaining jobs
    for pid in "${pids[@]}"; do
        wait "$pid"
    done
}
Enter fullscreen mode Exit fullscreen mode

Advanced: Job Queue Pattern:

# Create a job queue for better control
create_job_queue() {
    local queue_file
    queue_file=$(mktemp)
    echo "$queue_file"
}

add_job() {
    local queue_file="$1"
    local job_command="$2"
    echo "$job_command" >> "$queue_file"
}

process_queue() {
    local queue_file="$1"
    local max_parallel="${2:-4}"

    # Use xargs for controlled parallel execution
    cat "$queue_file" | xargs -n1 -P"$max_parallel" -I{} bash -c '{}'
    rm -f "$queue_file"
}
Enter fullscreen mode Exit fullscreen mode

πŸ“Š 9. Performance Monitoring and Profiling

Built-in Timing:

# Time specific operations
time_operation() {
    local operation_name="$1"
    shift

    local start_time
    start_time=$(date +%s.%N)

    "$@"  # Execute the operation

    local end_time
    end_time=$(date +%s.%N)
    local duration
    duration=$(echo "$end_time - $start_time" | bc)

    echo "Operation '$operation_name' took ${duration}s" >&2
}

# Usage
time_operation "file_processing" process_large_file data.txt
Enter fullscreen mode Exit fullscreen mode

Resource Usage Monitoring:

# Monitor script resource usage
monitor_resources() {
    local script_name="$1"
    shift

    # Start monitoring in background
    {
        while kill -0 $$ 2>/dev/null; do
            ps -o pid,pcpu,pmem,etime -p $$
            sleep 5
        done
    } > "${script_name}_resources.log" &
    local monitor_pid=$!

    # Run the actual script
    "$@"

    # Stop monitoring
    kill "$monitor_pid" 2>/dev/null || true
}
Enter fullscreen mode Exit fullscreen mode

πŸ”§ 10. Real-World Optimization Example

Here's a complete example showing before/after optimization:

Before (Slow Version):

#!/bin/bash
# Processes log files - SLOW version

process_logs() {
    local log_dir="$1"
    local results=()

    for log_file in "$log_dir"/*.log; do
        # Multiple file reads
        error_count=$(grep -c "ERROR" "$log_file")
        warn_count=$(grep -c "WARN" "$log_file")
        total_lines=$(wc -l < "$log_file")

        # Inefficient string building
        result="File: $(basename "$log_file"), Errors: $error_count, Warnings: $warn_count, Lines: $total_lines"
        results=("${results[@]}" "$result")
    done

    # Process results
    for result in "${results[@]}"; do
        echo "$result"
    done
}
Enter fullscreen mode Exit fullscreen mode

After (Optimized Version):

#!/bin/bash
# Processes log files - OPTIMIZED version

process_logs_fast() {
    local log_dir="$1"
    local temp_file
    temp_file=$(mktemp)

    # Process all files in parallel
    find "$log_dir" -name "*.log" -print0 | \
    xargs -0 -n1 -P4 -I{} bash -c '
        file="{}"
        basename="${file##*/}"

        # Single pass through file
        errors=0 warnings=0 lines=0
        while IFS= read -r line || [[ -n "$line" ]]; do
            ((lines++))
            [[ "$line" == *"ERROR"* ]] && ((errors++))
            [[ "$line" == *"WARN"* ]] && ((warnings++))
        done < "$file"

        printf "File: %s, Errors: %d, Warnings: %d, Lines: %d\n" \
            "$basename" "$errors" "$warnings" "$lines"
    ' > "$temp_file"

    # Output results
    sort "$temp_file"
    rm -f "$temp_file"
}
Enter fullscreen mode Exit fullscreen mode

Performance improvement: 70% faster on typical log directories.


πŸ’‘ Performance Best Practices Summary

  1. Use built-in operations instead of external commands when possible
  2. Minimize subprocess creation - batch operations when you can
  3. Stream data instead of loading everything into memory
  4. Leverage parallel processing for CPU-intensive tasks
  5. Profile your scripts to identify actual bottlenecks
  6. Use appropriate data structures - arrays for lists, associative arrays for lookups
  7. Optimize your loops - move expensive operations outside when possible
  8. Handle large files efficiently - process line by line, use temporary files

πŸŽ“ Master Performance Optimization

Performance optimization is a crucial skill that separates amateur scripts from production-ready automation. These techniques can dramatically improve your script performance, but knowing when and how to apply them comes with practice and deeper understanding.

If you want to master these performance techniques and many more professional Bash scripting skills, I cover optimization strategies, profiling methods, and production-ready patterns in my comprehensive Bash Scripting for DevOps course.

What you'll learn:

  • Advanced performance optimization techniques
  • Memory management and resource monitoring
  • Parallel processing patterns for real-world scenarios
  • Profiling and debugging slow scripts
  • Production-ready automation that scales
  • Complete performance-optimized example projects

Perfect for:

  • DevOps engineers managing large-scale automation
  • System administrators processing large datasets
  • Developers building high-performance scripts
  • Anyone tired of waiting for slow scripts to finish

Ready to write Bash scripts that perform like they should?

β†’ Master performance optimization in the complete course!


Found this helpful? Share it with your team and follow for more performance tips and DevOps automation content. What's your biggest Bash performance challenge? Drop it in the comments!

Top comments (0)