In my script, I have two HTTP requests. I would like to reuse the connection, so for example this is what I do:

curl -v 'http://example.com?id=1&key1=value1' 'http://example.com?id=1&key2=value2'

Is there any way to store the output of each HTTP request in two different variables? I have been searching but haven't found a solution yet.

I understand I can do the following to store the output in two different files:

curl -v 'http://example.com?id=1&key1=value1' -o output1 'http://example.com?id=1&key2=value2' -o output2
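
One workaround, if you want to keep the single curl invocation (and thus the reused connection) but end up with variables instead of files, is to capture both responses in one variable and split them on a marker. This is only a sketch: the separator string is an arbitrary assumption (it must not occur in the response bodies), and it relies on curl's -w option writing its string after each completed transfer:

```shell
#!/usr/bin/env bash
# Fetch both URLs over one connection; -w appends a separator line
# after each completed transfer. The marker text is arbitrary and
# assumed not to appear in the responses.
resp=$(curl -s -w '\n---CURL-SPLIT---\n' \
  'http://example.com?id=1&key1=value1' \
  'http://example.com?id=1&key2=value2')

# Split the combined output back into two variables.
out1=$(printf '%s\n' "$resp" | awk '/^---CURL-SPLIT---$/{exit} {print}')
out2=$(printf '%s\n' "$resp" | awk 'seen==1 && !/^---CURL-SPLIT---$/{print} /^---CURL-SPLIT---$/{seen++}')
```
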

Edit: here is my use case

I have a cronjob that runs the GNU parallel command below every few minutes, and get_data.sh is run 2000 times because there are 2000 rows in input.csv. I would like to avoid using temp files, to get the best performance.

parallel \
  -a input.csv \
  --jobs 0 \
  --timeout $parallel_timeout \
  "get_data.sh {}"

In get_data.sh:

id=$1
curl -v "http://example.com?id=${id}&key1=value1" -o output1 \
"http://example.com?id=${id}&key2=value2" -o output2

stat1=$(cat output1 | sed '' | cut ..)
stat2=$(cat output2 | awk '')
  • What are you going to do next with the variables? Commented Mar 23, 2020 at 17:50
  • Using commands like sed, awk, cut to get the data I care about Commented Mar 23, 2020 at 18:42
  • And how many hundred shell variables do you have? Commented Mar 23, 2020 at 18:52
  • The script 'get_data.sh' will actually be run 2000 times by parallel. I edited my question; I hope that explains my use case better. Please let me know if you need more info : ) Commented Mar 24, 2020 at 0:12
  • It seems to me you are running maybe 8 or more processes per line of your file (bash, curl, awk, sed, cat etc) making 16,000+ processes. I can't help thinking you would be better off using Python and multi-threading. Failing that, write your temporary files in /tmp which is a RAM-based filesystem and should be faster. Commented Mar 24, 2020 at 8:49

2 Answers


You are looking for parset. It is part of env_parallel, which is part of the GNU Parallel package (https://www.gnu.org/software/parallel/parset.html):

parset myarr \
  -a input.csv \
  --jobs 0 \
  --timeout $parallel_timeout \
  get_data.sh {}

echo "${myarr[3]}"

You can have parset run all combinations - just like you would with GNU Parallel:

echo www.google.com > input.txt
echo www.bing.com >> input.txt

# Search for both foo and bar on all sites
parset output curl https://{1}/?search={2} :::: input.txt ::: foo bar

echo "${output[1]}"
echo "${output[2]}"

If you are doing different processing for foo and bar, you can make functions and run those:

# make all new functions, arrays, variables, and aliases defined after this
# available to env_parset
env_parallel --session

foofunc() {
  id="$1"
  curl -v "http://example.com?id=${id}&key1=value1" | sed '' | cut -f10
}

barfunc() {
  id="$1"
  curl -v "http://example.com?id=${id}&key2=value2" | awk '{print}'
}

# Run both foofunc and barfunc on all sites
env_parset output {1} {2} ::: foofunc barfunc :::: input.txt

echo "${output[1]}"
echo "${output[2]}"
env_parallel --end-session

--session/--end-session and env_parset are needed if you do not want to export -f the functions and variables that you use in the functions.

GNU Parallel uses tempfiles. If your command runs fast, these tempfiles never touch the disk before they are deleted; instead they stay in the disk cache in RAM. You can even force them to stay in RAM by pointing --tmpdir to a ramdisk:

mkdir /dev/shm/mydir
parset output --tmpdir /dev/shm/mydir ...

1 Comment

Yeah, that's very helpful for me; I can store the output of get_data.sh into an array. Can you explain whether parset works for curl -v 'http://example.com?id=1&key1=value1' -o output1 'http://example.com?id=1&key2=value2' -o output2? Instead of the two tmp files output1 and output2, can I save the results in two variables or in an array?

Ok, here is some inspiration:

id=$1
output1=$(curl -v "http://example.com?id=${id}&key1=value1")
output2=$(curl -v "http://example.com?id=${id}&key2=value2")

stat1=$(echo "$output1" | sed '' | cut ..)
stat2=$(echo "$output2" | awk '')

This way you avoid writing stuff to disk.
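
If process count is a concern (see the comment above about 16,000+ processes), the extra cat/echo per extraction can also be dropped by feeding the variables to the tools directly with bash herestrings. A sketch only: the original sed/cut/awk programs are elided in the question, so a tab-delimited field and a line count stand in as placeholder stats here:

```shell
#!/usr/bin/env bash
id=$1
output1=$(curl -s "http://example.com?id=${id}&key1=value1")
output2=$(curl -s "http://example.com?id=${id}&key2=value2")

# Herestrings (<<<) avoid spawning an extra cat/echo for each extraction.
stat1=$(cut -f10 <<<"$output1")            # field 10 is a placeholder
stat2=$(awk 'END{print NR}' <<<"$output2") # line count as a placeholder stat
```
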
