0

I have a small shell script that performs a very simple task:

#!/bin/bash
while read LINE; do
  output=$(curl -iL --head "$LINE" | grep laravel)
  if [[ $output ]] && echo "$LINE"
done < url-list.txt

The script reads a file called url-list.txt and should output the subset of URLs from that file whose response headers match grep laravel.

However, I seem to have a syntax error in the script:

$ ./processurls.sh > laravel.txt
./processurls.sh: line 5: syntax error near unexpected token `done'
./processurls.sh: line 5: `done < url-list.txt'

Can someone help me debug this? It should be rather simple, but I am confused about what the correct syntax would be.

Also, is there a better way to do this?

What I am trying to do is determine which sites (out of a list of approximately 1000 sites) are using what platform or framework.

Naturally, I realise that there are limitations to sniffing out the platforms and frameworks, but I need to gather statistics on various sites for the purposes of providing a report to a client.

Specifically, I need to see how many of these sites are using Joomla, Drupal, WordPress, or a PHP framework such as Laravel, CodeIgniter, Symfony2, CakePHP, or FuelPHP.

I assumed that this script could be modified multiple times to replace the grep search with a new term and then run the script with a different output file for each search term.

1
  • so instead of done, I would use then? Commented Apr 21, 2014 at 15:20

4 Answers

3

Your if statement is wrong. It has to be:

if <test>; then
    <block>
fi

What you want to do (<test> && <somethingelse>) isn't an if but a boolean expression:

[[ $output ]] && echo "$LINE"

1

You're missing both then and fi in the if statement:

while read LINE; do
  output=$(curl -iL --head "$LINE" | grep laravel)
  if [[ $output ]]
  then echo "$LINE"
  fi
done < url-list.txt

Or leave out if:

while read LINE; do
  output=$(curl -iL --head "$LINE" | grep laravel)
  [[ $output ]] && echo "$LINE"
done < url-list.txt

You can also leave out the output variable, and just test whether grep was successful.

while read LINE; do
    curl -iL --head "$LINE" | grep -q laravel && echo "$LINE"
done < url-list.txt
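
To see the exit-status test on its own, without touching the network, here is a minimal sketch. The helper name mentions and the sample headers are made up for illustration; the -i flag is my addition so that the match ignores case.

```shell
#!/bin/bash
# Hypothetical helper: succeeds (exit status 0) if the text on stdin
# contains the given term, ignoring case. grep -q prints nothing.
mentions() {
    grep -qi -- "$1"
}

# Made-up response headers standing in for real curl output.
sample_headers=$'HTTP/1.1 200 OK\nSet-Cookie: laravel_session=abc123; path=/'

# The if statement branches on the exit status of the pipeline directly,
# so no intermediate variable is needed.
if printf '%s\n' "$sample_headers" | mentions laravel; then
    echo "match"
else
    echo "no match"
fi
```

In the real loop the printf would be replaced by the curl call, and adding -s to curl should keep its progress meter out of the terminal.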

1

Two mistakes:
1st:

if condition && echo "$LINE"

Should have been

condition && echo "$LINE"

OR

if condition; then echo "$LINE"; fi

2nd:
[[ $output ]] should be [[ "$output" ]]

1 Comment

With [[ ... ]], I think the absence of quotes is permissible, though I don't like the irregularity it introduces in the syntax. Similarly with not needing $ in front of variables in ((...)) and $((...)).
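
A small sketch of the two irregularities mentioned in this comment, using made-up variable names and values:

```shell
#!/bin/bash
value="two words"

# [[ ... ]] does not word-split an unquoted expansion, so this is a single
# non-empty-string test and it succeeds.
if [[ $value ]]; then echo "[[ ]]: non-empty"; fi

# [ ... ] does split it into two arguments, which makes the test fail with
# an error (the message is hidden here).
if [ $value ] 2>/dev/null; then echo "[ ]: non-empty"; else echo "[ ]: failed"; fi

# Inside (( ... )) and $(( ... )) variable names are evaluated without a
# leading $.
count=3
if (( count > 2 )); then echo "count is greater than 2"; fi
total=$(( count + 1 ))
echo "total=$total"
```

One place where quoting still matters inside [[ ... ]] is the right-hand side of ==, where an unquoted expansion is treated as a pattern rather than a literal string.
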
1

About the syntax error…

You don't need the if in the context:

[[ $output ]] && echo "$LINE"

Or you need a then and a fi:

if [[ $output ]]
then echo "$LINE"
fi

And the rest of the question…

The loop you show is not the best way to do things. You want to access each site just once since that will be the slow part of the processing. Therefore, you should be using something like:

shopt -s nocasematch
while IFS= read -r url
do
    output=$(curl -iL --head "$url")
    for tooling in Joomla Drupal WordPress Laravel CodeIgniter Symfony2 CakePHP FuelPHP
    do
        # The * wildcards make this a substring match against the whole response
        if [[ "$output" == *$tooling* ]]
        then echo "$url" >> "$tooling.list"
        fi
    done
done < url-list.txt

See Conditional Constructs and The Shopt Builtin in the Bash manual for more information. Note that $tooling is interpreted as a pattern. If you need a different way of identifying each tool, you can alter the values in the inner list. If need be, you can use different names for the files, and it might be sensible to record both the URL and the response in the file. Generally, there are a lot of tweaks you can make to the script so it works as you want. The key point is that it accesses the web once per site, and then analyzes the response multiple times.
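
As a rough sketch of the "fetch once, analyze many times" idea, the matching step can be pulled into a function and tried on a canned response. The function name classify_response, the tools array, and the sample response are all hypothetical, and the marker strings are placeholders rather than verified fingerprints for each platform.

```shell
#!/bin/bash
shopt -s nocasematch   # make [[ == ]] pattern matches case-insensitive

# Placeholder marker strings; adjust them to whatever actually identifies
# each platform in the responses you see.
tools=(Joomla Drupal WordPress Laravel CodeIgniter Symfony CakePHP FuelPHP)

# Hypothetical helper: print "<tool><TAB><url>" for every marker that
# appears anywhere in the response text.
classify_response() {
    local url=$1 response=$2 tool
    for tool in "${tools[@]}"; do
        # The * on each side turns the comparison into a substring match.
        if [[ $response == *"$tool"* ]]; then
            printf '%s\t%s\n' "$tool" "$url"
        fi
    done
}

# Made-up response; in the real loop it would come from something like
#   response=$(curl -siL --head "$url")
sample=$'HTTP/1.1 200 OK\nX-Generator: Drupal 7\nSet-Cookie: laravel_session=abc'
classify_response "http://example.com" "$sample"
```

Writing tool and URL on one tab-separated line keeps everything in a single report that can be sorted and counted afterwards, instead of one file per tool. Here $tool is quoted, so it is matched literally; leave it unquoted if the list entries are meant to be patterns.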
