I have to write a PHP script that works as a client against another HTTP server. This server ignores the HTTP Connection: close header and keeps the TCP connection open unless it is closed by the client. And here is my dilemma: I (the client) have to decide when an HTTP request/response has finished and then close the connection. Simply using:
$data = file_get_contents($url);
... won't work, as file_get_contents returns only once the connection timeout (default 30 seconds) has been reached.
So I have to write my own read loop like this (pseudo code):
$sock = fsockopen(...);
$data = '';
while($line = fgets($sock)) {
    $data .= $line;
    if(http_package_recieved()) {
        break;
    }
}
Unfortunately there is no Content-Length header in the response. My question is what the function
http_package_recieved()
... should look like.
Greets
Thorsten
You can check if $line is empty to see if the server isn't sending anything. You can also set a small read timeout on the socket with stream_set_timeout() , and then inside the loop check stream_get_meta_data() to see if it has been reached in order to break out.
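The read-timeout idea above could be sketched roughly as follows. The helper name read_until_idle and the two-second default are assumptions made for illustration, and treating silence as "end of response" is only a heuristic: a slow server can make it cut the response short.

```php
<?php
// Hypothetical helper: reads from an already-open socket until the peer
// closes the connection or stays silent for $idleSeconds.
function read_until_idle($sock, int $idleSeconds = 2): string
{
    // Every blocking read on this stream now gives up after $idleSeconds.
    stream_set_timeout($sock, $idleSeconds);

    $data = '';
    while (!feof($sock)) {
        $chunk = fread($sock, 8192);
        if ($chunk !== false && $chunk !== '') {
            $data .= $chunk;
        }
        // 'timed_out' is set when the last read hit the timeout.
        $meta = stream_get_meta_data($sock);
        if ($meta['timed_out']) {
            break;
        }
    }
    return $data;
}
```

Call it with the handle returned by fsockopen once the request has been written, then fclose the socket yourself.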
If it doesn't close the connection and it doesn't tell you the total length of the response, you have no way to know whether all the data has been received.
You could specify a maximum time interval between packets, but that won't be reliable.
You'd be better off using a library, such as cURL (http://uk.php.net/manual/en/intro.curl.php), to handle this. The HTTP spec isn't simple: https://www.rfc-editor.org/rfc/rfc2616 (see Section 4.4) and you'd likely miss something crucial.
Where the entity ends is determined by either:
Content-Length header (which you don't have)
HTTP Chunked Transfer Encoding (see Transfer-Encoding: chunked header: do you have one of these?).
It's possible you may have to process this chunked transfer encoding if you get this header. There are libraries to do so.
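If the response does turn out to use chunked transfer encoding, the end of the message is marked by a zero-size chunk, so completeness can be detected without Content-Length. Below is a minimal sketch of a decoder; decode_chunked is a made-up name, the body is assumed to be buffered in memory already, and chunk extensions and trailers are ignored. A library such as cURL handles all of this for you.

```php
<?php
// Hypothetical helper: decodes a chunked body held in memory.
// Returns null while the terminating zero-size chunk has not arrived.
function decode_chunked(string $body): ?string
{
    $decoded = '';
    $offset = 0;
    while (true) {
        $lineEnd = strpos($body, "\r\n", $offset);
        if ($lineEnd === false) {
            return null; // chunk-size line not complete yet
        }
        $sizeLine = substr($body, $offset, $lineEnd - $offset);
        // The size is hexadecimal; drop any ";extension" that follows it.
        $size = (int) hexdec(trim(explode(';', $sizeLine)[0]));
        $offset = $lineEnd + 2;
        if ($size === 0) {
            return $decoded; // last chunk: the message is complete
        }
        if (strlen($body) < $offset + $size + 2) {
            return null; // chunk data not fully received
        }
        $decoded .= substr($body, $offset, $size);
        $offset += $size + 2; // skip the chunk data and its trailing CRLF
    }
}
```

In the read loop from the question, http_package_recieved() could then report true as soon as decode_chunked() returns a string for the body received so far.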
feof($sock) would be OK
Related
I'm trying to work with Paypal's IPN and they state that I should 'return an empty HTTP 200 response'. I need to return this and then for the PHP script to continue processing.
There are loads of examples on stack overflow and other places that show how to use the 'connection: close' header with output buffering to send a response to the client and keep processing. However, no solution I've found works when the content length is set to zero. Whenever I set a content length header of zero, the connection remains open until the script terminates.
How can I terminate a connection early, send no content and continue processing?
UPDATE:
ob_flush() won't help here. It will flush the buffer, but it won't close the connection. If you set the headers connection: close and a non-zero content-length header, then the connection will be closed from the client side when you have sent content-length bytes; this is where ob_flush() can help. However, if you set a content length of zero and flush a buffer without anything in it, then nothing gets sent to the client and the connection remains open.
You have to use a fork mechanism: send the 200 in the parent while processing the data in the child.
You can also use ReactPHP for asynchronous processing or some queuing systems like JmsJobQueue : the processing is done in a separate script that you call and pass arguments to.
I want to have an HTTP GET request sent from PHP. Example:
http://tracker.example.com?product_number=5230&price=123.52
The idea is to do server-side web-analytics: Instead of sending tracking
information from JavaScript to a server, the server sends tracking
information directly to another server.
Requirements:
The request should take as little time as possible, in order to not noticeably delay processing of the PHP page.
The response from tracker.example.com does not need to be checked. As examples, some possible responses from tracker.example.com:
200: That's fine, but no need to check that.
404: Bad luck, but - again - no need to check that.
301: Although a redirect would be appropriate, it would delay processing of the PHP page, so don't do that.
In short: All responses can be discarded.
Ideas for solutions:
In a now deleted answer, someone suggested calling command line curl from PHP in a shell process. This seems like a good idea, only that I don't know if forking a lot of shell processes under heavy load is a wise thing to do.
I found php-ga, a package for doing server-side Google Analytics from PHP. On the project's page, it is mentioned: "Can be configured to [...] use non-blocking requests." So far I haven't found the time to investigate what method php-ga uses internally, but this method could be it!
In a nutshell: What is the best solution to do generic server-side tracking/analytics from PHP?
Unfortunately PHP by definition is blocking. While this holds true for the majority of functions and operations you will normally be handling, the current scenario is different.
The process, which I like to call HTTP-ping, requires only that you touch a specific URI, forcing the specific server to bootstrap its internal logic. Some functions allow you to achieve something very similar to this HTTP-ping by not waiting for a response.
Take note that pinging a URL is a two-step process:
Resolving the DNS name
Making the request
While making the request should be rather fast once the DNS is resolved and the connection is made, there aren't many ways of making the DNS resolve faster.
Some ways of doing an http-ping are:
cURL, by setting CURLOPT_CONNECTTIMEOUT to a low value
fsockopen by closing immediately after writing
stream_socket_client (same as fsockopen) and also adding STREAM_CLIENT_ASYNC_CONNECT
Both cURL and fsockopen block while the DNS name is being resolved, but I have noticed that fsockopen is significantly faster, even in worst-case scenarios.
stream_socket_client on the other hand should fix the problem regarding DNS resolving and should be the optimal solution in this scenario, but I have not managed to get it to work.
One final solution is to start another thread/process that does this for you. Making a system call for this should work, and so should forking the current process. Unfortunately neither is really safe in applications where you can't control the environment on which PHP is running.
System calls are more often than not blocked and pcntl is not enabled by default.
I would call tracker.example.com this way:
get_headers('http://tracker.example.com?product_number=5230&price=123.52');
and in the tracker script:
ob_end_clean();
ignore_user_abort(true);
ob_start();
header("Connection: close");
header("Content-Length: " . ob_get_length());
ob_end_flush();
flush();
// from here the response has been sent. you can now wait as long as you want and do some tracking stuff
sleep(5); //wait 5 seconds
do_some_stuff();
exit;
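I implemented a function for making a fast GET request to a URL without waiting for the response:
function fast_request($url)
{
    $parts = parse_url($url);
    $fp = fsockopen($parts['host'], isset($parts['port']) ? $parts['port'] : 80, $errno, $errstr, 30);
    if (!$fp) {
        return false; // connection failed; $errstr holds the reason
    }
    // parse_url() returns the query string separately from the path, so put them back together
    $path = isset($parts['path']) ? $parts['path'] : '/';
    if (isset($parts['query'])) {
        $path .= '?' . $parts['query'];
    }
    $out = "GET " . $path . " HTTP/1.1\r\n";
    $out .= "Host: " . $parts['host'] . "\r\n";
    $out .= "Content-Length: 0" . "\r\n";
    $out .= "Connection: Close\r\n\r\n";
    fwrite($fp, $out);
    fclose($fp);
    return true;
}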
We were using the fsockopen and fwrite combo, then one day it stopped working, or at least became intermittent. After a little research and testing, and provided you have fopen wrappers enabled, I ended up using file_get_contents and stream_context_create with a timeout set to a hundredth of a second. The timeout parameter accepts floating-point values (https://www.php.net/manual/en/context.http.php). file_get_contents reports a failed or timed-out request as a warning rather than an exception, so I suppress that with @ and also wrap the call in a try...catch block so it fails silently. It works beautifully for our purposes. You can do logging in the catch if needed. The timeout is the key if you don't want the function to block runtime.
function fetchWithoutResponseURL( $url )
{
    $context = stream_context_create([
        "http" => [
            "method" => "GET",
            "timeout" => .01
        ]
    ]);
    try {
        // @ silences the warning raised when the request times out
        @file_get_contents($url, false, $context);
    } catch( Exception $e ) {
        // Fail silently
    }
}
For those of you working with WordPress as a backend -
it is as simple as:
wp_remote_get( $url, array('blocking' => false) );
Came here whilst researching a similar problem. If you have a database connection handy, one other possibility is to quickly stuff the request details into a table, and then have a separate cron-based process that periodically scans that table for new records to process and makes the tracking request, freeing up your web application from having to make the HTTP request itself.
You can use shell_exec, and command line curl.
For an example, see this question
You can actually do this using CURL directly.
I have implemented it both using a very short timeout (CURLOPT_TIMEOUT_MS) and using curl_multi_exec.
Be advised: eventually I quit this method because not every request was correctly made. This could have been caused by my own server, though I haven't been able to rule out the option of curl failing.
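For reference, the short-timeout variant might look roughly like this. It is a sketch, not the code from the answer above: curl_ping and the 200 ms default are assumptions, and as noted, a request that times out this early is not guaranteed to have reached the tracker.

```php
<?php
// Hypothetical helper: fire a GET request and give up after $timeoutMs
// milliseconds, discarding whatever comes back.
function curl_ping(string $url, int $timeoutMs = 200): bool
{
    $ch = curl_init($url);
    if ($ch === false) {
        return false;
    }
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER    => true,   // keep the response out of the page output
        CURLOPT_FOLLOWLOCATION    => false,  // a 301 is simply discarded
        CURLOPT_CONNECTTIMEOUT_MS => $timeoutMs,
        CURLOPT_TIMEOUT_MS        => $timeoutMs,
        CURLOPT_NOSIGNAL          => true,   // often needed for sub-second timeouts
    ]);
    // false means a timeout or connection error occurred
    return curl_exec($ch) !== false;
}
```

A typical call would be curl_ping('http://tracker.example.com?product_number=5230&price=123.52'); the return value can be ignored or logged.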
I needed to do something similar: just ping a URL and discard all responses. I used the proc_open command, which lets you end the process right away using proc_close. I'm assuming you have lynx installed on your server:
<?php
function ping($url) {
    // escapeshellarg() keeps characters such as & in the URL away from the shell
    $proc = proc_open("lynx " . escapeshellarg($url), [], $pipes);
    if (is_resource($proc)) {
        proc_close($proc);
    }
}
?>
<?php
// Create a stream
$opts = array(
'http'=>array(
'method'=>"GET",
'header'=>"Accept-language: en"
)
);
$context = stream_context_create($opts);
// Open the file using the HTTP headers set above
$file = file_get_contents('http://tracker.example.com?product_number=5230&price=123.52', false, $context);
?>
This is something that has been bugging me for a while. I'm building a RESTful API that has to receive files on some occasions.
When using HTTP POST, we can read data from $_POST and files from $_FILES.
When using HTTP GET, we can read data from $_GET and files from $_FILES.
However, when using HTTP PUT, AFAIK the only way to read data is to use the php://input stream.
All good and well, until I want to send a file over HTTP PUT. Now the php://input stream doesn't work as expected anymore, since it has a file in there as well.
Here's how I currently read data on a PUT request:
(which works great as long as there are no files posted)
$handle = fopen('php://input', 'r');
$rawData = '';
while ($chunk = fread($handle, 1024)) {
$rawData .= $chunk;
}
parse_str($rawData, $data);
When I then output rawData, it shows
-----ZENDHTTPCLIENT-44cf242ea3173cfa0b97f80c68608c4c
Content-Disposition: form-data; name="image_01"; filename="lorem-ipsum.png"
Content-Type: image/png; charset=binary
�PNG
���...etc etc...
���,
-----ZENDHTTPCLIENT-8e4c65a6678d3ef287a07eb1da6a5380
Content-Disposition: form-data; name="testkey"
testvalue
-----ZENDHTTPCLIENT-8e4c65a6678d3ef287a07eb1da6a5380
Content-Disposition: form-data; name="otherkey"
othervalue
Does anyone know how to properly receive files over HTTP PUT, or how to parse files out of the php://input stream?
===== UPDATE #1 =====
I have tried only the above method, and don't really have a clue as to what else I can do.
I have gotten no errors using this method, besides that I don't get the desired result of the posted data and files.
===== UPDATE #2 =====
I'm sending this test request using Zend_Http_Client, as follows:
(haven't had any problems with Zend_Http_Client so far)
$client = new Zend_Http_Client();
$client->setConfig(array(
'strict' => false,
'maxredirects' => 0,
'timeout' => 30)
);
$client->setUri( 'http://...' );
$client->setMethod(Zend_Http_Client::PUT);
$client->setFileUpload( dirname(__FILE__) . '/files/lorem-ipsum.png', 'image_01');
$client->setParameterPost(array('testkey' => 'testvalue', 'otherkey' => 'othervalue'));
$client->setHeaders(array(
'api_key' => '...',
'identity' => '...',
'credential' => '...'
));
===== SOLUTION =====
Turns out I made some wrong assumptions, mainly that HTTP PUT would be similar to HTTP POST. As you can read below, DaveRandom explained to me that HTTP PUT is not meant for transferring multiple files on the same request.
I have now moved the transferring of formdata from the body to url querystring. The body now holds the contents of a single file.
For more information, read DaveRandom's answer. It's epic.
The data you show does not depict a valid PUT request body (well, it could, but I highly doubt it). What it shows is a multipart/form-data request body - the MIME type used when uploading files via HTTP POST through an HTML form.
PUT requests should exactly complement the response to a GET request - they send you the file contents in the message body, and nothing else.
Essentially what I'm saying is that it is not your code to receive the file that is wrong, it is the code that is making the request - the client code is incorrect, not the code you show here (although the parse_str() call is a pointless exercise).
If you explain what the client is (a browser, script on other server, etc) then I can help you take this further. As it is, the appropriate request method for the request body that you depict is POST, not PUT.
Let's take a step back from the problem, and look at the HTTP protocol in general - specifically the client request side - hopefully this will help you understand how all of this is supposed to work. First, a little history (if you're not interested in this, feel free to skip this section).
History
HTTP was originally designed as a mechanism for retrieving HTML documents from remote servers. At first it effectively supported only the GET method, whereby the client would request a document by name and the server would return it to the client. The first public specification for HTTP, labelled as HTTP 0.9, appeared in 1991 - and if you're interested, you can read it here.
The HTTP 1.0 specification (formalised in 1996 with RFC 1945) expanded the capabilities of the protocol considerably, adding the HEAD and POST methods. It was not backwards compatible with HTTP 0.9, due to a change in the format of the response - a response code was added, as well as the ability to include metadata for the returned document in the form of MIME format headers - key/value data pairs. HTTP 1.0 also abstracted the protocol from HTML, allowing for the transfer of files and data in other formats.
HTTP 1.1, the form of the protocol that is almost exclusively in use today is built on top of HTTP 1.0 and was designed to be backwards compatible with HTTP 1.0 implementations. It was standardised in 1999 with RFC 2616. If you are a developer working with HTTP, get to know this document - it is your bible. Understanding it fully will give you a considerable advantage over your peers who do not.
Get to the point already
HTTP works on a request-response architecture - the client sends a request message to the server, the server returns a response message to the client.
A request message includes a METHOD, a URI and optionally, a number of HEADERS. The request METHOD is what this question relates to, so it is what I will cover in the most depth here - but first it is important to understand exactly what we mean when we talk about the request URI.
The URI is the location on the server of the resource we are requesting. In general, this consists of a path component, and optionally a query string. There are circumstances where other components may be present as well, but for the purposes of simplicity we shall ignore them for now.
Let's imagine you type http://server.domain.tld/path/to/document.ext?key=value into the address bar of your browser. The browser dismantles this string, and determines that it needs to connect to an HTTP server at server.domain.tld, and ask for the document at /path/to/document.ext?key=value.
The generated HTTP 1.1 request will look (at a minimum) like this:
GET /path/to/document.ext?key=value HTTP/1.1
Host: server.domain.tld
The first part of the request is the word GET - this is the request METHOD. The next part is the path to the file we are requesting - this is the request URI. At the end of this first line is an identifier indicating the protocol version in use. On the following line you can see a header in MIME format, called Host. HTTP 1.1 mandates that the Host: header be included with every request. This is the only header of which this is true.
The request URI is broken into two parts - everything to the left of the question mark ? is the path, everything to the right of it is the query string.
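To make the split concrete, here is a small sketch that uses PHP's parse_url on the example URL above and rebuilds the minimal request from the pieces. The variable names are chosen for illustration only.

```php
<?php
$parts = parse_url('http://server.domain.tld/path/to/document.ext?key=value');

// $parts['host']  is "server.domain.tld"
// $parts['path']  is "/path/to/document.ext"
// $parts['query'] is "key=value"

// The request URI is the path plus "?" plus the query string.
$requestUri = $parts['path'] . '?' . $parts['query'];

// Request line, the mandatory Host header, then a blank line ends the headers.
$request = "GET $requestUri HTTP/1.1\r\n"
         . "Host: {$parts['host']}\r\n"
         . "\r\n";
```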
Request Methods
RFC 2616 (HTTP/1.1) defines 8 request methods.
OPTIONS
The OPTIONS method is rarely used. It is intended as a mechanism for determining what kind of functionality the server supports before attempting to consume a service the server may provide.
Off the top of my head, the only place in fairly common usage that I can think of where this is used is when opening documents in Microsoft office directly over HTTP from Internet Explorer - Office will send an OPTIONS request to the server to determine if it supports the PUT method for the specific URI, and if it does it will open the document in a way that allows the user to save their changes to the document directly back to the remote server. This functionality is tightly integrated within these specific Microsoft applications.
GET
This is by far and away the most common method in every day usage. Every time you load a regular document in your web browser it will be a GET request.
The GET method requests that the server return a specific document. The only data that should be transmitted to the server is information that the server requires to determine which document should be returned. This can include information that the server can use to dynamically generate the document, which is sent in the form of headers and/or query string in the request URI. While we're on the subject - Cookies are sent in the request headers.
HEAD
This method is identical to the GET method, with one difference - the server will not return the requested document, it will only return the headers that would be included in the response. This is useful for determining, for example, if a particular document exists without having to transfer and process the entire document.
POST
This is the second most commonly used method, and arguably the most complex. POST method requests are almost exclusively used to invoke some actions on the server that may change its state.
A POST request, unlike GET and HEAD, can (and usually does) include some data in the body of the request message. This data can be in any format, but most commonly it is a query string (in the same format as it would appear in the request URI) or a multipart message that can communicate key/value pairs along with file attachments.
Many HTML forms use the POST method. In order to upload files from a browser, you would need to use the POST method for your form.
The POST method is semantically incompatible with RESTful APIs because it is not idempotent. That is to say, a second identical POST request may result in a further change to the state of the server. This contradicts the "stateless" constraint of REST.
PUT
This directly complements GET. Where a GET request indicates that the server should return the document at the location specified by the request URI in the response body, the PUT method indicates that the server should store the data in the request body at the location specified by the request URI.
DELETE
This indicates that the server should destroy the document at the location indicated by the request URI. Very few internet facing HTTP server implementations will perform any action when they receive a DELETE request, for fairly obvious reasons.
TRACE
This provides an application-layer mechanism that allows a client to inspect the request it has sent as it looks by the time it reaches the destination server. This is mostly useful for determining the effect that any proxy servers between the client and the destination server may be having on the request message.
CONNECT
HTTP 1.1 reserves the name for a CONNECT method, but does not define its usage, or even its purpose. Some proxy server implementations have since used the CONNECT method to facilitate HTTP tunnelling.
I've never tried using PUT (GET POST and FILES were sufficient for my needs) but this example is from the php docs so it might help you (http://php.net/manual/en/features.file-upload.put-method.php):
<?php
/* PUT data comes in on the stdin stream */
$putdata = fopen("php://input", "r");
/* Open a file for writing */
$fp = fopen("myputfile.ext", "w");
/* Read the data 1 KB at a time
and write to the file */
while ($data = fread($putdata, 1024))
fwrite($fp, $data);
/* Close the streams */
fclose($fp);
fclose($putdata);
?>
Here is the solution that I found to be the most useful.
$put = array();
parse_str(file_get_contents('php://input'), $put);
$put will be an array, just like you are used to seeing in $_POST, except now you can follow true REST HTTP protocol.
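As a quick illustration of what parse_str produces, here is a sketch with a sample body string standing in for the contents of php://input. Note that this only works when the body is URL-encoded (application/x-www-form-urlencoded); a multipart/form-data body like the one shown in the question will not be parsed into fields this way.

```php
<?php
// Sample body, standing in for file_get_contents('php://input')
$rawBody = 'testkey=testvalue&otherkey=othervalue';

$put = array();
parse_str($rawBody, $put);

// $put now holds one entry per key, URL-decoded:
// array('testkey' => 'testvalue', 'otherkey' => 'othervalue')
```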
Use POST and include an X- header to indicate the actual method (PUT in this case). Usually this is how one works around a firewall which does not allow methods other than GET and POST. Simply declare PHP buggy (since it refuses to handle multipart PUT payloads, it IS buggy), and treat it as you would an outdated/draconian firewall.
The opinions as to what PUT means in relation to GET are just that, opinions. The HTTP specification makes no such requirement. It simply states 'equivalent'; it is up to the designer to determine what 'equivalent' means. If your design can accept a multi-file upload PUT and produce an 'equivalent' representation for a subsequent GET for the same resource, that's just fine and dandy, both technically and philosophically, with the HTTP specifications.
Just follow what it says in the DOC:
<?php
/* PUT data comes in on the stdin stream */
$putdata = fopen("php://input", "r");
/* Open a file for writing */
$fp = fopen("myputfile.ext", "w");
/* Read the data 1 KB at a time
and write to the file */
while ($data = fread($putdata, 1024))
fwrite($fp, $data);
/* Close the streams */
fclose($fp);
fclose($putdata);
?>
This should read the whole file that is on the PUT stream and save it locally, then you could do what you want with it.
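The same read-and-write loop can also be expressed with stream_copy_to_stream, which copies in chunks internally. The sketch below wraps it in a helper whose name, save_stream_to_file, is made up for this example; it takes the source stream as a parameter so it can be tried outside a real PUT request.

```php
<?php
// Hypothetical helper: copies everything from $source into the file at
// $destination and returns the number of bytes written.
function save_stream_to_file($source, string $destination): int
{
    $target = fopen($destination, 'wb');
    if ($target === false) {
        throw new RuntimeException("Cannot open $destination for writing");
    }
    // Copies in chunks, so the whole upload is never held in memory at once.
    $bytes = stream_copy_to_stream($source, $target);
    fclose($target);
    if ($bytes === false) {
        throw new RuntimeException("Copy to $destination failed");
    }
    return $bytes;
}
```

In a PUT handler it would be called as save_stream_to_file(fopen('php://input', 'rb'), 'myputfile.ext').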
With PHP and Apache, is it possible to start the PHP script after the headers are received, but before the body is sent?
I mean kind of this:
PUT /sync/testStream.php HTTP/1.1
Host: localhost
User-Agent: test
Content-Length: 500
Content-Type: text/plain
<start the script here>
hello
this is a test
The objective is to lock a file during the script (during its upload actually).
I tried to read from php://input, with no success: the script is started only when the whole body is received.
Here is my script:
<?php
echo "Hello. Start sending data.";
ob_flush();
$file = fopen('php://input', 'r');
if ($file === false) {
die("Could not open php://input");
}
while (($line = fgets($file)) !== false) {
echo $line;
}
echo "Bye";
Any hint is welcome !
I have seen this question on here before (can't find it right now, but I have) and after much discussion/debate, the conclusion was that no, you can't.
This is because the file upload process and receiving of the body of the request is handled by the overlying web server, and PHP is not fired up until after the request has been fully received/parsed/validated.
One point here is that your script appears to be attempting to establish full-duplex communication over HTTP, which is not possible. HTTP is by its very nature a half-duplex protocol: it has a request-response architecture, and you cannot start sending a response before you have completely received the request. Doing so would be a protocol violation.
If you explain exactly what you are trying to do/why you want to lock a file during upload, maybe we can find an alternative solution.
Actually, it's not possible because PHP isn't aware of any request before the HTTP request is completed. In other words, PHP receives the query after the upload is completed.
PHP is executed after the whole request has been received.
To achieve this functionality you could write a custom Apache module.
I have a PHP script on my server that is making a request to another server for an image.
The script is accessed just like a regular image source like this:
<img src="http://example.com/imagecontroller.php?id=1234" />
Browser -> Script -> External Server
The script is doing a CURL request to the external server.
Is it possible to "stream" the CURL response directly back to the client (browser) as it is received on the server?
Assume my script is on a slow shared hosting server and the external server is blazing fast (a CDN). Is there a way to serve the response directly back to the client without my script being a bottleneck? It would be great if my server didn't have to wait for the entire image to be loaded into memory before beginning the response to the client.
Pass the -N/--no-buffer flag to curl. It does the following:
Disables the buffering of the output stream. In normal work situations, curl will use a standard buffered output stream that will have the effect that it will output the data in chunks, not necessarily exactly when the data arrives. Using this option will disable that buffering.
Note that this is the negated option name documented. You can thus use --buffer to enforce the buffering.
Check out Pascal Martin's answer to an unrelated question, in which he discusses using CURLOPT_FILE for streaming curl responses. His explanation for handling "Manipulate a string that is 30 million characters long" should work in your case.
Hope this helps!
Yes you can use the CURLOPT_WRITEFUNCTION flag:
curl_setopt($ch, CURLOPT_WRITEFUNCTION, $callback);
Where $ch is the Curl handler, and $callback is the callback function name.
This command will stream response data from remote site. The callback function can look something like:
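$result = '';
$callback = function ($ch, $str) use (&$result) {
    // use (&$result) binds the outer variable by reference; "global" only works when this code runs in the global scope
    $result .= $str; // $str holds the chunk of data that was just streamed back
    // here you can work with the stream data, either via $result or $str
    return strlen($str); // don't touch this: cURL expects the number of bytes handled
};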
If not interrupted at the end $result will contain all the response from remote site.
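To pass the data on to the browser as it arrives instead of collecting it, the callback can echo each chunk. This is a sketch under assumptions: the names $forwardChunk and stream_remote are invented here, the image's Content-Type header is assumed to have been sent beforehand with header(), and PHP output buffering is assumed to be off, since flush() does not empty ob_start() buffers.

```php
<?php
// Hypothetical write callback: forwards each chunk to the client as soon
// as cURL hands it over, without keeping a copy.
$forwardChunk = function ($ch, string $chunk): int {
    echo $chunk;
    flush(); // push the chunk towards the client right away
    return strlen($chunk); // cURL expects the number of bytes handled
};

// Hypothetical helper: runs the transfer with the given write callback.
function stream_remote(string $url, callable $onChunk): bool
{
    $ch = curl_init($url);
    if ($ch === false) {
        return false;
    }
    curl_setopt($ch, CURLOPT_WRITEFUNCTION, $onChunk);
    return curl_exec($ch) !== false;
}
```

With this in place, the script never holds more than one chunk of the image in memory.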
Not with curl; you could use fsockopen to do streaming.