I need to fetch a website (following any number of redirects with -L) and save the HTML content in a file named [HTTP_Status_code]_[Website_name].html

Currently I am using two curl calls, one for the body and the other for the headers. Is there any way to combine them into one?

Script:

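while IFS= read -r line; do
  # First curl call: headers only, to extract the status code for the file name
  status=$(curl -I "$line" 2>/dev/null | head -n 1 | cut -d' ' -f2)
  # Second curl call: follow redirects and save the body
  if ! curl -L "$line" -o "${status}_$(basename "$line").html"; then
    echo "$line" >> error.txt
  fi
done < url_list.txt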

EDIT: I need the header of the final response, i.e. the one after the last redirection.

1 Answer

What about this:

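while IFS= read -r line; do
  # Single curl call: -D dumps the headers of every response to tmp_status.txt
  if curl -D tmp_status.txt -L "$line" -o tmp_file.html; then
    # With -L there is one status line per hop; keep the code from the last one
    status=$(awk '/^HTTP/ { code = $2 } END { print code }' tmp_status.txt)
    mv tmp_file.html "${status}_$(basename "$line").html"
  else
    echo "$line" >> error.txt
    # processing from tmp_status
  fi
done < url_list.txt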
  • This needs only one call to curl, but it requires some post-processing.
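
Another option, offered only as a sketch and not as something from the answer above: curl's -w/--write-out option can print the %{http_code} variable, which with -L is the status of the last response received. That would avoid the header dump and the awk step. The helper names output_name and fetch_one are hypothetical, and the sketch assumes mktemp is available.

```shell
#!/bin/sh
# Hypothetical helper: build "<status>_<basename of url>.html".
output_name() {
  printf '%s_%s.html\n' "$1" "$(basename "$2")"
}

# Hypothetical helper: fetch one URL with a single curl call.
# -o saves the body to a temporary file; -w prints the final status code
# to stdout, which is captured in $status.
fetch_one() {
  url=$1
  tmp=$(mktemp) || return 1
  if status=$(curl -sS -L -o "$tmp" -w '%{http_code}' "$url"); then
    mv "$tmp" "$(output_name "$status" "$url")"
  else
    rm -f "$tmp"
    printf '%s\n' "$url" >> error.txt
    return 1
  fi
}

# Possible usage, assuming url_list.txt holds one URL per line:
#   while IFS= read -r url; do fetch_one "$url"; done < url_list.txt
```

Because each URL gets its own temporary file, nothing is shared between iterations, and a failed download leaves no partial output behind.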
  • Hi @Archemar, the file is being overwritten every time, so I am getting the response of the last URL only, but I want the response of each URL. Can you please suggest something? Commented Mar 20, 2015 at 8:45
  • Strange... both tmp_ files should be updated for every curl call. Maybe rm tmp_status.txt before the if? Commented Mar 20, 2015 at 9:46
  • Hi @Archemar, if I do so, then I need to manually save the response in a text file, which will take a lot of time, and I need the responses of one million sites. What I actually need is a text file named after each response code, containing the URLs that returned it. For example, if there are 100 URLs whose response after the last redirection (if there is any redirection) is 200 OK, then I need a text file named 200 containing those 100 URLs. But your code is doing something else. Commented Mar 20, 2015 at 12:00
  • Hey @Archemar : Do you know how to do this? Commented Mar 24, 2015 at 5:50
  • I have done it :) Commented Apr 11, 2015 at 9:18
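
The thread never shows how this was solved. What the Mar 20, 2015 at 12:00 comment describes (one text file per final status code, listing the URLs that returned it) might look roughly like the sketch below. The function names final_status and record_url, the .txt suffix, and the 30-second timeout are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the status code of the final response after
# following redirects. The body is discarded. curl prints 000 when no
# HTTP response was received at all.
final_status() {
  curl -s -L -o /dev/null --max-time 30 -w '%{http_code}' "$1"
}

# Hypothetical helper: append a URL to "<outdir>/<status>.txt".
record_url() {
  status=$1
  url=$2
  outdir=$3
  printf '%s\n' "$url" >> "$outdir/$status.txt"
}

# Possible usage, assuming url_list.txt holds one URL per line:
#   mkdir -p by_status
#   while IFS= read -r url; do
#     record_url "$(final_status "$url")" "$url" by_status
#   done < url_list.txt
```

Appending with >> means each status file accumulates URLs across the whole run, so nothing is overwritten between iterations.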
