PHP asynchronous multiple file write and read

I am using a cURL-based PHP application to make requests to another webserver that handles the requests asynchronously. So what I am doing is creating files named <databaseid>.req with the info I will need when the response returns, and that id also serves as the identification in the request. The requests are done using HTTP-XML-POST. The file is written using:
file_put_contents('reqs/<databaseid>.req', $info, FILE_APPEND);
What happens is that while the requests are being generated in bulk (about 1500 per second), the responses start coming back from the webserver. Each response is caught by another script, which takes the <databaseid> from the response and opens the corresponding request file using:
$aResponse = file('reqs/<databaseid>.req');
Now what happens is that in about 15% of requests, the file() call fails and generates an entry in the Apache log like this:
file(reqs/<databaseid>.req): failed to open stream: No such file or directory in <scriptname> on line <xyz>
A cleanup script that runs later has verified that the file did exist.
Any ideas?!!!

There are functions to handle simultaneous file access, such as flock(), but it's normally easier to simply use a database. Any decent DBMS has already worked this out for you.
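For illustration, here is a minimal sketch of how flock() could coordinate the writer and the reader in this setup; the <databaseid> placeholder and the $info variable are taken from the question's description, so this is a hedged sketch rather than a drop-in fix:
// Writer: append the request info under an exclusive lock.
$fp = fopen('reqs/<databaseid>.req', 'c');   // 'c' creates the file if missing, never truncates
if ($fp !== false && flock($fp, LOCK_EX)) {  // block until we hold the exclusive lock
    fseek($fp, 0, SEEK_END);
    fwrite($fp, $info . PHP_EOL);
    fflush($fp);                             // flush before releasing the lock
    flock($fp, LOCK_UN);
    fclose($fp);
}

// Reader: take a shared lock before reading the file back line by line.
$fp = fopen('reqs/<databaseid>.req', 'r');
if ($fp !== false && flock($fp, LOCK_SH)) {
    $aResponse = array();
    while (($line = fgets($fp)) !== false) {
        $aResponse[] = rtrim($line, "\r\n");
    }
    flock($fp, LOCK_UN);
    fclose($fp);
}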

Related

Intermittently failing to open stream (HTTP request)

I am running Windows Server and it hosts my PHP files.
I am using "file_get_contents()" to call another PHP script and return the results. (I have also tried cURL with the same result)
This works fine. However, if I execute my script and then re-execute it almost straight away, I get an error:
"Warning: file_get_contents(http://...x.php): failed to open stream: HTTP request failed!"
So this works fine if I leave a minute or two between calling this PHP file via the browser. But after a successful attempt, if I retry too quickly, then it fails. I have even changed the URL in the line "$html = file_get_contents($url, false, $context);" to an empty file that simply prints out a line, and the HTTP stream still doesn't open.
What could be preventing me to open a new HTTP stream?
I suspect my server is blocking further outgoing streams but cannot find out where this would be configured in IIS.
Any help on this problem would be much appreciated.
**EDIT:** In the script, I am calling a Java program that takes around 1.5 minutes to run, and it is after this that I then call the PHP script that fails.
Also, when it fails, the page hangs for quite some time. During this time, if I open another connection to the initial PHP page, the previous page (still hanging) then completes. It seems like a connection timeout somewhere.
I have set the timeout appropriately in IIS Manager and in PHP.
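One thing worth ruling out while debugging this is the timeout on the stream context passed to file_get_contents(); without an explicit value, a stuck backend call can hang for the default socket timeout. A minimal sketch, assuming the context is created in the same script (the 10-second value is illustrative only):
$context = stream_context_create(array(
    'http' => array(
        'method'        => 'GET',
        'timeout'       => 10,    // give up after 10 seconds instead of hanging
        'ignore_errors' => true,  // return the body even on HTTP error status codes
    ),
));
$html = file_get_contents($url, false, $context);
if ($html === false) {
    // error_get_last() usually explains why the stream could not be opened
    error_log('HTTP request failed: ' . print_r(error_get_last(), true));
}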

Is it possible to start executing PHP script on a multipart/form-data file upload request before file is uploaded?

It should be a common use case, but I can't find out whether it's achievable at all.
I want to validate a multipart/form-data uploaded file extension on server-side - must I wait for the file to fully upload?
I might be missing something, but it doesn't make sense, especially when handling large files.
Can't I execute PHP before file is uploaded, get metadata and maybe cancel the request altogether?
I know I can split this into two separate requests; I'm looking for a single-request solution, if applicable.
You should wait until the file is fully uploaded so you can then validate it. There is no single-request solution.
If you use an Apache/Nginx HTTP server, it executes PHP scripts only after it has finished loading the whole request from the client, which is too late for your use case, as Sergio correctly points out in the other answer.
There is a single-request solution in PHP, but you need to have control over the HTTP requests in your PHP script.
You can choose not to use Apache and instead start an HTTP server from the PHP CLI (either by using the native socket functions or an HTTP server package such as react/socket that uses them under the hood).
require __DIR__ . '/vendor/autoload.php'; // Composer autoloader for react/event-loop and react/socket

$loop = React\EventLoop\Factory::create();
$socket = new React\Socket\Server('127.0.0.1:8080', $loop);
$socket->on('connection', function (React\Socket\ConnectionInterface $connection) {
    // here you can have $connection->on(...) event handlers
});
$loop->run();
Then you can have handlers that handle each chunk of the incoming request (the 'data' event comes from the readable stream interface in the react/stream package, which the connection implements):
$connection->on('data', function ($chunk) {
    echo $chunk;
});
And instead of echoing the chunk, you can validate its contents and call $connection->close() if you need, which effectively terminates the unfinished upload.
But this whole thing is a complex solution, and I'd recommend using it only for an upload service that is completely separated from the application that generates the form page (which can still run under a regular Apache HTTP server, because that's just much easier).
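To illustrate the validation idea from the paragraphs above, here is a rough sketch that rejects an upload as soon as the first chunk looks wrong; the header check is deliberately naive (a real implementation needs an incremental multipart parser) and the blocked extensions are just placeholders:
$connection->on('data', function ($chunk) use ($connection) {
    // Naive check against the multipart headers in the first chunk.
    if (preg_match('/filename="[^"]*\.(exe|bat)"/i', $chunk)) {
        // Send a minimal HTTP error and stop receiving the rest of the body.
        $connection->write("HTTP/1.1 415 Unsupported Media Type\r\nConnection: close\r\n\r\n");
        $connection->end(); // or $connection->close() to drop it without flushing
        return;
    }
    // otherwise keep buffering or streaming the chunks somewhere
});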
You can validate it on the frontend before interacting with the server; in PHP, the script executes only once the request has finished.

Can high load cause PHP to not be able to write to a textfile?

I have a PHP script which is under high load with multiple calls per second (originating from another computer). It's running on PHP 5.5.14 on an IIS server. Every request and response to the script is logged using
file_put_contents('log_2019-09-12.txt', $msg, FILE_APPEND);
Every request and response is also logged on the client computer, and there I see occasional PHP errors like this one:
PHP ERROR 2: file_put_contents(C:\\WWW\\project-x\\logs\\log_2019-09-11.txt): failed to open stream: Permission denied
These seem to happen about every ~140 minutes; usually there are 8 of them in a row, and then things work again for another ~140 minutes, handling several requests per second and logging successfully to the log file.
Could it be that PHP is usually writing to an in-memory file and then actually writes the contents to disk every ~140 minutes, and that's what's causing this error? If so, how can I circumvent it?
Answered by Magnus Eriksson in the comments
Try adding the LOCK_EX flag when writing: file_put_contents($file, $text, FILE_APPEND | LOCK_EX). From the manual: "the LOCK_EX flag to prevent anyone else writing to the file at the same time".
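On Windows, a concurrent writer can surface as "Permission denied" instead of simply blocking, so a small retry loop around the locked write is a common workaround. A minimal sketch, assuming PHP 5.5 as in the question; the function name, retry count and sleep interval are arbitrary illustrative choices:
function appendLog($file, $msg, $retries = 5)
{
    for ($i = 0; $i < $retries; $i++) {
        // Suppress the warning and check the return value instead.
        if (@file_put_contents($file, $msg, FILE_APPEND | LOCK_EX) !== false) {
            return true;
        }
        usleep(50000); // wait 50 ms before retrying
    }
    return false;
}

appendLog('log_2019-09-12.txt', $msg);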
You may simply not have permission to write to that location. Make sure the directory exists and is writable by the web server user, for example mkdir($path, 0755, true), and check the permissions at the point where you write.

php's file_get_contents provokes crash

I am having some trouble with PHP's file_get_contents function.
My setup is basically as follows:
-> From a php page (let's call it a.php), a POST request is sent to another php page (b.php) through file_get_contents.
-> b.php does some stuff with the POST input and then sends another POST request through file_get_contents to itself (b.php)
-> This is repeated a couple of times (for example 4 times), so basically it looks like this:
a.php -> POST request through file_get_contents -> b.php -> POST request through file_get_contents -> b.php -> POST request through file_get_contents -> b.php -> POST request through file_get_contents -> b.php
At the last post request to b.php, the script echoes something to the "poster", he adds something to it etc. etc. all the way back to a.php.
For clarity's sake: in production all those php files will be on different servers, and each server has an added value in the process.
For testing however, all pages are on the same server (and I add "?server=x" to the URL so that the same file uses a different database at every "call").
This works like a charm :) ... unless more than 5 file_get_contents calls are "active" simultaneously ...
This works fine:
a.php->b.php->b.php->b.php->b.php
This doesn't:
a.php->b.php->b.php->b.php->b.php->b.php
As a matter of fact, it crashes my server (it stops responding to ANY HTTP requests), and only restarting Apache unblocks it.
The same happens when I load the working "circuit" a.php->b.php->b.php->b.php->b.php several times from the browser.
Error in the Apache error log:
failed to open stream: HTTP request failed!
I thought it might be related to the POST size being too large, but sending a HUGE POST request through the a.php->b.php->b.php->b.php->b.php circuit works just fine ....
So it looks like somehow only 5 simultaneous file_get_contents are allowed ...
Anyone's got some ideas ?
EDIT: As mentioned below, it looks like the real problem is a deadlock, which will not happen in PRD since there will be no "loop" on the same server ... I solved this issue by using cURL with a timeout instead. When a deadlock is about to occur, the cURL requests simply time out without freezing the server.
However, I'm still interested in an answer to this question: how can I check/reconfigure the number of simultaneous requests in Apache2? It's not in the conf file afaik.
Thanks !!!
I would suggest using cURL instead of file_get_contents in this case, although I've never had an issue with this except(!) when using sessions, since they lock the process up until the session closes.
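A rough sketch of what the cURL-with-timeout approach mentioned in the edit could look like; the URL, payload variable and timeout values are illustrative, and the session_write_close() call only matters if the chained scripts share a session:
// Release the session lock before calling the next script in the chain,
// otherwise the nested request can deadlock waiting on the same session file.
session_write_close();

$ch = curl_init('http://localhost/b.php?server=2');
curl_setopt_array($ch, array(
    CURLOPT_POST           => true,
    CURLOPT_POSTFIELDS     => $payload,
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_CONNECTTIMEOUT => 5,   // fail fast if the connection cannot be made
    CURLOPT_TIMEOUT        => 30,  // abort the whole request after 30 seconds
));
$response = curl_exec($ch);
if ($response === false) {
    error_log('curl error: ' . curl_error($ch));
}
curl_close($ch);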

MongoDB php driver, script ends when inserting data

I'm playing with MongoDB and I'm trying to import .csv files into the DB, and I'm getting a strange error. During the upload process the script just ends for no reason, and when I try to run it again nothing happens; the only solution is to restart Apache. I have already set an unlimited timeout in php.ini. Here is the script.
$dir = "tokens/";
$fileNames = array_diff( scandir("data/"), array(".", "..") );
foreach($fileNames as $filename)
if(file_exists($dir.$filename))
exec("d:\mongodb\bin\mongoimport.exe -d import -c ".$filename." -f Date,Open,Next,Amount,Type --type csv --file ".$dir.$filename."");
I've got around 7000 .csv files, and it manages to insert only about 200 before the script ends.
Can anyone help? Any help would be appreciated.
You are missing back end infrastructure. It is just insane to try to load 7000 files into a database as part of a web request that is supposed to be short lived and is expected, by some of the software components as well as the end user, to only last a few seconds or maybe a minute.
Instead, create a backend service and command and control for this procedure. In the web app, write each file name to be processed to a database table or even a plain text file on the server and then tell the end user that their request has been queued and will be processed within the next NN minutes. Then have a cron job that runs every 5 minutes (or even 1 minute) that looks in the right place for stuff to do and can create reports of success or failure and/or send emails to tell the original requestor that it is done.
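A hedged sketch of the queue-plus-cron idea; the queue file name, the naive queue handling and the worker layout are assumptions, not something given in the answer:
// In the web app: just record the work to be done and return immediately.
file_put_contents('import_queue.txt', $filename . PHP_EOL, FILE_APPEND | LOCK_EX);

// worker.php, run from cron every few minutes: php worker.php
$queueFile = 'import_queue.txt';
if (!file_exists($queueFile)) {
    exit;
}
$pending = file($queueFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
file_put_contents($queueFile, ''); // naive: clear the queue that was just read
foreach ($pending as $csv) {
    exec('mongoimport -d import -c ' . escapeshellarg(pathinfo($csv, PATHINFO_FILENAME))
        . ' -f Date,Open,Next,Amount,Type --type csv --file ' . escapeshellarg('tokens/' . $csv),
        $output, $status);
    if ($status !== 0) {
        error_log('mongoimport failed for ' . $csv); // or email the original requestor
    }
}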
If this is intended as an import script and you are set on using PHP, it would be preferable to at least use the PHP CLI environment instead of performing this task through a web server. As it stands, it appears the CSV files are located on the server itself, so I see no reason to get HTTP involved. This would avoid an issue where the web request terminates and abruptly aborts the import process.
For processing the CSV, I'd start by looking at fgetcsv or str_getcsv. The mongoimport command really does very little in the way of validation and sanitization. Parsing the CSV yourself will allow you to skip records that are missing fields, provide default values where necessary, or take other appropriate action. As you iterate through records, you can collect documents to insert in an array and then pass the results on to MongoCollection::batchInsert() in batches. The driver will take care of splitting up large batches into chunks to actually send over the wire in 16MB messages (MongoDB's document size limit, which also applies to wire protocol communication).
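A minimal sketch of that approach for the PHP CLI, using the legacy MongoCollection driver the answer refers to; the database, collection and file names, the batch size of 1000 and the skip-on-missing-fields rule are all assumptions:
// run from the command line: php import_csv.php
$mongo      = new MongoClient();
$collection = $mongo->selectDB('import')->selectCollection('tokens');

$fields = array('Date', 'Open', 'Next', 'Amount', 'Type');
$batch  = array();

if (($fp = fopen('tokens/example.csv', 'r')) !== false) {
    while (($row = fgetcsv($fp)) !== false) {
        if (count($row) < count($fields)) {
            continue; // skip records that are missing fields
        }
        $batch[] = array_combine($fields, array_slice($row, 0, count($fields)));
        if (count($batch) >= 1000) {
            $collection->batchInsert($batch); // the driver splits this into 16MB wire messages
            $batch = array();
        }
    }
    fclose($fp);
    if ($batch) {
        $collection->batchInsert($batch);
    }
}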
