
I wrote a simple bash script that takes a list of URLs and outputs a CSV with some data for each one: URL, status code, and target URL:

while read -r url
do
    urlstatus=$(curl -H 'Cache-Control: no-cache' -o /dev/null --silent --head --insecure --write-out '%{http_code} , %{redirect_url}' "$url" -I)
    echo "$url , $urlstatus" >> "$1-out.csv"
done < "$1"

Sometimes a URL has 2 or 3 redirects; I'd like to get them all and print them in the output file.

I've found the -L option and the %{url_effective} variable to get the final URL:

    urlstatus2=$(curl -H 'Cache-Control: no-cache' -o /dev/null --silent --head --insecure --write-out ' , %{url_effective}' "$url" -L -I )

But I'd like to have all URLs from the origin to the final one and add them to the CSV.

  • Do you have an example of a page with more than one redirect so we can test? Commented Mar 4, 2021 at 10:51
  • Yes, of course: aep-beta.onpc.fr/lycees/dom/region/DOM/ECOL 301 -> 301 -> 200 Commented Mar 4, 2021 at 11:04
  • If you make it http, it even has 3 redirections: 302 -> 301 -> 301 -> 200 :-D Commented Mar 4, 2021 at 11:14

2 Answers


Make a recursive function:

#!/bin/bash
get_redirects(){
    local i status url
    i=${2:-1}    # recursion depth, starting at 1
    # Get the status code and redirect target of this hop (tab-separated)
    read -r status url <<< "$(curl -H 'Cache-Control: no-cache' -o /dev/null --silent --head --insecure --write-out '%{http_code}\t%{redirect_url}\n' "$1" -I)"
    printf '%d: %s --> %s\n' "$i" "$1" "$status"
    if [ "$1" = "$url" ] || [ "$i" -gt 9 ]; then
        echo "Recursion detected or more redirections than allowed. Stop."
    else
      case $status in
          30*) get_redirects "$url" "$((i+1))"   # follow the redirect
               ;;
      esac
    fi
}

Usage:

$ get_redirects https://aep-beta.onpc.fr/lycees/dom/region/DOM/ECOL
1: https://aep-beta.onpc.fr/lycees/dom/region/DOM/ECOL --> 301
2: https://aep-beta.onpc.fr/onglet/lycee/dom --> 301
3: https://aep-beta.onpc.fr/onglet/lycee/outre-mer --> 200
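
If the goal is still one CSV line per input URL, as in the question, the function's output can be folded back into the original loop. The following is only a sketch on top of the get_redirects function above; the "$1-out.csv" name comes from the question, while the sed/paste pipeline that joins the hops with commas is an assumption about the desired format:

    # Sketch: one CSV line per input URL, listing every hop of the redirect chain
    while read -r url
    do
        # Strip the "N: " counters and join the hops onto a single comma-separated line
        chain=$(get_redirects "$url" | sed 's/^[0-9]*: //' | paste -sd',' -)
        echo "$url , $chain" >> "$1-out.csv"
    done < "$1"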
  • Thanks, but I forgot to mention that sometimes I deal with infinite redirections (due to errors in the redirection rules)! That case will break the recursive function, I think! Commented Mar 4, 2021 at 11:09
  • What does curl do in this case? You could add a counter and quit after n redirects, or quit directly if "$1" == "$url" (or simply hit Ctrl+C if interactive)... Commented Mar 4, 2021 at 11:09
  • Added a counter to stop after the 10th request. Commented Mar 4, 2021 at 11:27

Building on @pLumo's answer, this function removes the recursion and allows custom arguments to be supplied to curl:

#!/bin/bash
get_redirects(){
  url=$1
  declare -a params=("${@:2}")    # extra arguments handed straight to curl

  for i in {1..100}; do
    current=$url    # the URL requested on this hop
    read -r status url <<< "$(curl "$current" -H 'Cache-Control: no-cache' -o /dev/null --silent --write-out '%{http_code}\t%{redirect_url}\n' "${params[@]}")"
    printf '%d: %s --> %s\n' "$i" "$current" "$status"
    # Stop as soon as the response is not a 3xx redirect
    if (( status < 300 || status >= 400 )); then break; fi
  done
}

Usage:

get_redirects http://example.com/endpoint -X POST -H "Authentication: Bearer ..." -d "param=value"
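
Since the hard-coded --head and --insecure options from the question are no longer built in, they can be passed per call when the old behaviour is wanted, for example (a sketch reusing the test URL from the comments):

    get_redirects https://aep-beta.onpc.fr/lycees/dom/region/DOM/ECOL --head --insecure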
