
G'day all,

This is actually the first question I've asked; I use Stack Overflow religiously thanks to its excellent search function, but this time I've hit a wall.

I've been writing a bit of PHP code that takes user input for Australian airports, fetches the PDFs relevant to the aircraft type (for whatever reason the publisher releases them as individual PDFs), and combines them into one PDF file. I've got it working reasonably smoothly now, but the last hitch in the plan is that when you enter lots of airfields (or ones with lots of PDFs) it exceeds max_execution_time and gives me a 500 Internal Server Error. Unfortunately I'm on GoDaddy's shared hosting and can't change this, either in php.ini or in a script with set_time_limit(). This guy had the same problem and I came out as fruitless as he did: PHP GoDaddy maximun execution time not working

Anyway, apart from switching my hosting, my only thought is to break up the PHP code so it doesn't all run at once. The only problem is that I'm running a foreach loop and I haven't the faintest idea where to start.

Here is the code I have for saving the PDFs:

foreach ($pos as $po) {
    // Download each chart PDF and save it to the temp directory,
    // prefixing the filename with a running chart number
    file_put_contents("/dir/temp/$chartNumber$po", file_get_contents("http://www.airservicesaustralia.com/aip/current/dap/$po"));
    $chartNumber++;
}

The array $pos is generated by a regex search of the website and takes very little time; it's the saving of the PDF files that kills me, and if it manages to get them all, the combining can take a bit of time as well with this code:

exec("gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -sOutputFile=/dir/finalpdf/$date.pdf /dir/temp/*.pdf");

My question is: is there any way I can run each iteration of the foreach loop in a separate script, and then pick up where I left off? Or is it time to get new hosting?

Cheers in advance!

  • What's the bottleneck? The fetching of the documents or the ghostscript call? Commented Jan 5, 2016 at 15:25

4 Answers


My suggestion would be to use AJAX requests, splitting the work into one request per file.

Here's how I would approach it:

  1. Make a request that generates the $pos array and returns it as JSON.
  2. Make a request to generate each file, passing $po and its position in the array (assuming that's the $chartNumber); a rough sketch of this endpoint follows the list.
  3. In jQuery, check whether the last file was generated (the request returned true), then call the script that writes the final file and returns the filename for download.
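
A minimal sketch of what the per-file endpoint in step 2 could look like; the script name, parameter names and JSON response are assumptions for illustration, not the asker's actual code:

// generate_chart.php (hypothetical) - called once per AJAX request, so each
// request downloads only a single PDF and stays well under max_execution_time
$po          = basename($_GET['po']);         // one entry from the $pos array
$chartNumber = intval($_GET['chartNumber']);  // its position in the array

$pdf = file_get_contents("http://www.airservicesaustralia.com/aip/current/dap/$po");
$ok  = ($pdf !== false)
    && file_put_contents("/dir/temp/$chartNumber$po", $pdf) !== false;

// The JavaScript side checks this flag and, after the last file comes back,
// calls the script that runs the ghostscript merge
header('Content-Type: application/json');
echo json_encode(array('done' => $ok));

The jQuery loop just fires these requests one after another and, once the last one reports done, hits the endpoint that merges the PDFs.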

But of course the best solution would be to switch to cloud hosting. I personally use digitalocean.com, where I run big PHP fetching scripts without any limitations.


7 Comments

As per answer elsewhere, how is this faster?
It's actually slower :) But to avoid hitting max_execution_time in PHP you would have to call the script in parts, so the execution of each individual part should never exceed max_execution_time
So what do you make the requests from? Where do you assemble the final PDF? If this is a PHP script then it will hit the max execution time :(
Ajax. Basically you'd need to create a basic web interface which would make these requests through javascript, calling the PHP.
Sorry Edvinas - I don't see how that solves the problem of NOT having a PHP script running longer than the max execution time. The PHP script making the AJAX requests still has to wait for the called scripts to execute. The only way to reduce the execution time is to run the slave scripts in parallel

I've taken Edvinas' advice and transferred to digitalocean.com, and the script now runs with no problems whatsoever. I've also managed to reduce the time by downloading each file with parallelcurl, which downloads 5 at a time, so a full 100-page file (larger than I expect I'll ever need) can be downloaded and generated in just under 5 minutes. I guess other than hosting the PDFs on my own server (in which case I might miss updates to the charts), this is about as quick as I can get it to run.
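
For anyone curious, the parallel download looks roughly like this using PHP's built-in curl_multi functions; this is a sketch of the idea rather than the exact parallelcurl code, with a batch size of 5 to match what I used:

$chartNumber = 0;
foreach (array_chunk($pos, 5) as $batch) {   // download 5 charts at a time
    $mh = curl_multi_init();
    $handles = array();

    foreach ($batch as $po) {
        $ch = curl_init("http://www.airservicesaustralia.com/aip/current/dap/$po");
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_multi_add_handle($mh, $ch);
        $handles[$po] = $ch;
    }

    // Run all transfers in this batch concurrently
    do {
        $status = curl_multi_exec($mh, $running);
        if ($running) {
            curl_multi_select($mh);
        }
    } while ($running && $status == CURLM_OK);

    // Save the results and clean up
    foreach ($handles as $po => $ch) {
        file_put_contents("/dir/temp/$chartNumber$po", curl_multi_getcontent($ch));
        curl_multi_remove_handle($mh, $ch);
        curl_close($ch);
        $chartNumber++;
    }
    curl_multi_close($mh);
}

Each batch runs concurrently, so the wall-clock time is roughly that of the slowest file in the batch rather than the sum of all five.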

Thanks for the advice!



Breaking down the operations into batches and running them serially will actually take longer than what you are currently doing. If the performance bottleneck is in creation of the component parts, a better solution would be to generate the parts in parallel.

the combining can take a bit of time as well with this code

Well, the first part of fixing any performance issue should be profiling to identify the bottleneck. Without direct admin access to the host there's not a lot you can do to speed up the execution of a single shell command, but if you can run shell commands then you can run a background job outside of the webserver process group.
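
For example, one way to push the ghostscript call into a background job from PHP; this is a sketch, using the asker's gs command, and the nohup/redirection wrapper is an assumption about what the shared host permits:

// Fire off the merge in the background so the PHP request returns immediately.
// Redirecting output and appending & stops exec() from waiting for ghostscript.
$cmd = "gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite "
     . "-sOutputFile=/dir/finalpdf/$date.pdf /dir/temp/*.pdf";
exec("nohup " . $cmd . " > /dev/null 2>&1 &");

Because the command is backgrounded, the merge no longer counts against the request's max_execution_time; the trade-off is that the request can't report when the final PDF is ready.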



For anyone looking for a way to run a slow PHP script and avoid timeouts, here's how I do it without using AJAX.

This is not for production use of any kind, of course: just for internal jobs, e.g. updating a lot of database fields via PHP when the script is slow.

Just run the script across several page loads and use a $_GET[] parameter value as the iterator.

Here's an example of the code I used with WooCommerce's wc_get_orders() function, which is very slow. My environment only allowed fetching 500 orders at a time without timing out, and I had 11000 orders to update:

// Page number comes from the URL, e.g. ?pageid=1
$page = intval($_GET["pageid"]);

$args = array(
    'limit' => 500,   // the most my environment could handle per run
    'paged' => $page
);

$orders = wc_get_orders( $args );

foreach ($orders as $order) {
    //do something
}

$page++;

// 11000 orders / 500 per page = 22 pages, so stop after page 22
if ($page < 23) {
    header("Location: https://example.com/slow-php-script/?pageid=" . $page);
    exit;
}

Then open your page with the GET parameter, grab a coffee, and relax.

E.g.: https://example.com/slow-php-script/?pageid=1

It redirects every time the script finishes running and uses the number from the GET parameter for the next run. You can use this as an iterator for basically anything.

Hope that will save you some time.

