I want to know: is there any way to get only a limited amount of data through cURL?
Option 1:
curl_setopt ($curl_handle, CURLOPT_HTTPHEADER, array("Range: bytes=0-1000"));
but it's not supported by all servers.
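One way to check whether the server actually honoured the range (a sketch; the URL is just a placeholder) would be to look at the response code, since a server that honours it answers 206 Partial Content:
$ch = curl_init('http://www.example.com/bigfile.bin');
curl_setopt($ch, CURLOPT_RANGE, '0-999');          // same request as the Range header above
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$body = curl_exec($ch);
$code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);
if ($code != 206) {
    // the server ignored the range; $body may be the whole file
}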
Option 2:
I tried the write function from "Having trouble limiting download size of PHP's cURL function", but that approach gives me the error Failed writing body (0 != 11350), and from reading around I found that many say it's a bug.
So, following the above write function, I tried calling curl_close($handle) instead of returning 0, but this throws the error Attempt to close cURL handle from a callback.
Now the only way I can think of is parsing the headers for the content length, but won't that result in two requests: first getting the headers with CURLOPT_NOBODY, then getting the full content?
Option 2: I tried the write function from "Having trouble limiting download size of PHP's cURL function", but that approach gives me the error Failed writing body (0 != 11350), and from reading around I found that many say it's a bug.
It's not clear what you are doing there exactly. If you return 0 then cURL will signal an error, sure, but you will have read all the data you need. Just ignore the error.
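For illustration, a rough sketch of that idea (the URL and the 1000-byte limit are just placeholders): the write callback aborts the transfer once enough data has arrived, and the resulting cURL error is simply ignored.
$downloaded = '';
$limit = 1000;
$ch = curl_init('http://www.example.com/bigfile.bin');
curl_setopt($ch, CURLOPT_WRITEFUNCTION, function ($ch, $data) use (&$downloaded, $limit) {
    $downloaded .= $data;
    // returning anything other than strlen($data) makes cURL abort the transfer
    return (strlen($downloaded) < $limit) ? strlen($data) : 0;
});
curl_exec($ch);   // reports an error once we abort; ignore it
curl_close($ch);
$downloaded = substr($downloaded, 0, $limit);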
Another option, which you don't mention having tried, is to use fopen() with the http:// wrapper. For example:
$h = fopen('http://example.com/file.php', 'r');
$first1000Bytes = fread($h, 1000);
fclose($h);
You could also use fopen() and fgets() to read a line at a time until you believe you've read enough lines, or read a character at a time using fgetc().
Not sure if this is exactly what you're looking for, but it should limit the amount of data fetched from the remote source.
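For example, a rough sketch (assuming around 1000 bytes counts as "enough", and the URL is a placeholder):
$h = fopen('http://example.com/file.php', 'r');
$data = '';
while (!feof($h) && strlen($data) < 1000) {
    $data .= fgets($h, 256);   // reads at most one line (up to 255 bytes) per call
}
fclose($h);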
This seems to solve your problem:
mb_strlen($string, '8bit');
From a comment to this answer I read that "stream_get_contents is low-level" compared to file_get_contents. However, according to the manual, stream_get_contents is
Identical to file_get_contents(), except that stream_get_contents() operates on an already open stream resource and returns the remaining contents in a string, up to maxlength bytes and starting at the specified offset.
Which statement is correct?
Is stream_get_contents really lower level and faster?
Specifically, I am interested in reading local files from disk.
I'm late here, but it might help others.
file_get_contents() loads the file content into memory. It sits there in memory and waits for the program to call echo, at which point it is delivered to the output buffer.
A good usage example is:
echo file_get_contents('file.txt');
stream_get_contents() delivers the content from an already open stream. An example is this:
$handle = fopen('file.txt', 'r');
echo stream_get_contents($handle);
You can see that stream_get_contents() uses an existing stream created by fopen() to get the contents as a string.
file_get_contents() is generally the preferred way, as it doesn't depend on an open stream and is efficient with memory, using memory-mapping techniques. When reading from external sites, you can also set HTTP headers when getting the content. (Refer to https://www.php.net/manual/en/function.file-get-contents.php for more info.)
For larger files/resources, stream_get_contents() may be preferred, as it can deliver the content piecemeal (up to maxlength bytes at a time), as opposed to file_get_contents(), where the entire data is loaded into memory.
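A rough sketch of both points (the URL, header and file names are just placeholders):
// set an HTTP header while fetching a remote resource with file_get_contents()
$context = stream_context_create(array(
    'http' => array('header' => "Accept-Language: en\r\n"),
));
$all = file_get_contents('http://example.com/file.txt', false, $context);

// read a large local file in bounded pieces with stream_get_contents()
$handle = fopen('big-file.txt', 'r');
$firstChunk = stream_get_contents($handle, 8192, 0);   // at most 8192 bytes, starting at offset 0
fclose($handle);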
I'm using fsockopen to connect to an OpenVAS manager and send XML. The code I am using is:
$connection = fsockopen('ssl://'.$server_data['host'], $server_data['port']);
stream_set_timeout($connection, 5);
fwrite($connection, $xml);
$response = '';
while ($chunk = fread($connection, 2048)) {
$response .= $chunk;
}
However, after reading the first two chunks of data, PHP hangs on fread and doesn't time out after 5 seconds. I have tried using stream_get_contents, which gives the same result, BUT if I only use a single fread, it works OK; it's just that I want to read everything, regardless of length.
I am guessing it is an issue with OpenVAS, which doesn't end the stream the way PHP expects it to, but that's a shot in the dark. How do I read the stream?
I believe that fread is hanging up because on that last chunk it is expecting 2048 bytes of information and is probably getting less than that, so it waits until it times out.
You could try to refactor your code like this:
$bytes_to_read = 2048;
while ($chunk = fread($connection, $bytes_to_read)) {
    $response .= $chunk;
    $status = socket_get_status($connection);
    // fread() needs a length greater than zero, so never ask for 0 bytes
    $bytes_to_read = max($status["unread_bytes"], 1);
}
That way, you'll read everything in two chunks.... I haven't tested this code, but I remember having a similar issue a while ago and fixing it with something like this.
Hope it helps!
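Another pattern that might help (a sketch, not tested against OpenVAS): keep the 5-second read timeout you already set, and stop as soon as a read times out or the stream reports EOF:
stream_set_timeout($connection, 5);
$response = '';
while (!feof($connection)) {
    $chunk = fread($connection, 2048);
    $response .= $chunk;
    $meta = stream_get_meta_data($connection);
    if ($meta['timed_out']) {
        break;   // no more data arrived within 5 seconds
    }
}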
I come from a C and C++ background but also play with some web stuff. All us C folks (hopefully) know that calling feof on a FILE* before doing a read is an error. This is something that stings newbies to C and C++ very often. Is this also the case for PHP's implementation?
I figure it has to be, because the file could be a socket or anything else where it is impossible to know the size before finishing reading. But just about every PHP example I've seen (even those found on php.net) looks something like this (and alarms go off in my head):
$f = fopen("whatever.txt", "rb");
while (!feof($f)) {
    echo fgets($f);
}
fclose($f);
I know it is preferable to write it like this and avoid the issue:
$f = fopen("whatever.txt", "rb");
while ($line = fgets($f)) {
    echo $line;
}
fclose($f);
but that's beside the point. I tried testing whether things would fail if I did it "the wrong way", but I could not get it to cause incorrect behavior. This isn't exactly scientific, but I figured it was worth a try.
So, is it incorrect to call feof before an fread in PHP?
There are a couple of ways that PHP could have done this differently than C's version, but I feel they have downsides.
They could have feof() default to "not at EOF" before any read has happened. This is sub-optimal because it may be incorrect for some corner cases.
They could get the file size during the fopen call, but this wouldn't work on all types of file resources, which would yield inconsistent behavior, and it would be slower.
PHP doesn't know whether it's at the end of the file until you've tried to read from it. Try this simple example:
<?php
$fp = fopen("/dev/null","rb");
while (!feof($fp)) {
    $data = fread($fp, 1);
    echo "read " . strlen($data) . " bytes";
}
fclose($fp);
?>
You will get one line reading read 0 bytes. feof() returned false even though you were technically at the end of the file. Usually this doesn't cause a problem because fread($fp, 1) returns no data, and whatever processing you're doing on that handles no data just fine. If you really do need to know whether you're at the end of the file, you do have to do a read first.
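A minimal illustration of that read-first pattern, using the same /dev/null handle as above:
$fp = fopen("/dev/null", "rb");
$data = fread($fp, 1);            // attempt a read first...
if ($data === '' && feof($fp)) {
    echo "already at EOF";        // ...now feof() reflects reality
}
fclose($fp);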
This may be more of a question than an answer, but why not use file_get_contents()? In my experience, if you are reading a file or a stream, this function does the dirty work for you (assuming you want to read the entire resource into a string, or know/can compute a limit and offset). Its sister function file_put_contents() works well in reverse.
For instance, here's an example:
$expected_content = "Hello Stack Overflow!"
$real_content = file_get_contents("/path/to/file.txt");
if ($expected_content != $real_content){
file_put_contents("/path/to/file.txt", $real_content);
}
or a stream:
$expected_content = "Hello Stack Overflow!"
$real_content = file_get_contents("http://host.domain.com/file.txt");
if ($expected_content != $real_content){
$options = array('ftp' => array('overwrite' => true));
$stream = stream_context_create($options);
file_put_contents("ftp://user:pass#host.domain.com/file.txt", $real_content, 0, $stream);
}
Then you don't need to worry about EOF or anything, it does it for you (ftp put gets a bit dicey, but that's ok). Of course, this won't work in all situations...
Is there something that I'm missing from the original question that makes this approach unfeasible?
Your assertion that calling feof before an fread is an error does not hold for PHP; consequently, the question is not valid.
I read a URL with fsockopen() and fread(), and I get this kind of data:
<li
10
></li>
<li
9f
>asd</li>
d
<li
92
Which is totally messed up O_O
--
While using the file_get_contents() function I get this kind of data:
<li></li>
<li>asd</li>
Which is correct! So, what the HELL is wrong? I tried it on my Windows server and my Linux server; both behave the same. And they don't even have the same PHP version.
--
My PHP code is:
$fp = #fsockopen($hostname, 80, $errno, $errstr, 30);
if(!$fp){
return false;
}else{
$out = "GET /$path HTTP/1.1\r\n";
$out .= "Host: $hostname\r\n";
$out .= "Accept-language: en\r\n";
$out .= "Connection: Close\r\n\r\n";
fwrite($fp, $out);
$data = "";
while(!feof($fp)){
$data .= fread($fp, 1024);
}
fclose($fp);
Any help/tips are appreciated; I've been wondering about this the whole day now :/
Oh, and I can't use fopen() or file_get_contents() because the server where my script runs doesn't have fopen wrappers enabled.
I really want to know how to fix this, just out of curiosity, and I don't think I can use any extra libraries on this server anyway.
About your "strange data" problem, this might be because the server you are requesting data from is transferring it in chunked mode.
You can take a look at the HTTP headers when calling the same URL in your browser; one of those headers might look like this:
Transfer-encoding: chunked
Quoting Wikipedia's article on that matter:
Each non-empty chunk starts with the number of octets of the data it embeds (size written in hexadecimal) followed by a CRLF (carriage return and line feed), and the data itself. The chunk is then closed with a CRLF. In some implementations, white space characters (0x20) are padded between chunk-size and the CRLF.
The last chunk is a single line, simply made of the chunk-size (0), some optional padding white spaces and the terminating CRLF. It is not followed by any data, but optional trailers can be sent using the same syntax as the message headers.
The message is finally closed by a final CRLF combination.
This looks close to what you are getting... So I'm guessing this is the problem.
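If you do want to undo the chunking yourself, a rough sketch (assuming $body already holds just the response body, with the headers stripped off) could look like this:
function decode_chunked($body)
{
    $decoded = '';
    $offset = 0;
    while (true) {
        $lineEnd = strpos($body, "\r\n", $offset);
        if ($lineEnd === false) {
            break;                              // malformed or incomplete body
        }
        // the chunk size is hexadecimal; ignore any chunk extension after ';'
        $sizeLine = substr($body, $offset, $lineEnd - $offset);
        $parts = explode(';', $sizeLine);
        $size = hexdec(trim($parts[0]));
        if ($size === 0) {
            break;                              // last chunk reached
        }
        $decoded .= substr($body, $lineEnd + 2, $size);
        $offset = $lineEnd + 2 + $size + 2;     // skip the data and its trailing CRLF
    }
    return $decoded;
}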
As far as I remember, curl knows how to deal with that -- so, the easy way would be to use curl instead of fsockopen and the like.
And using curl is often a better idea than using sockets: it will deal with many problems you might encounter, like this one ;-)
Another idea, if you don't have curl enabled on your server, would be to use some already existing library based on fsockopen -- hoping it would take care of those kinds of things for you already.
For instance, I've worked with Snoopy a couple of times; maybe it already knows how to deal with that?
(Not sure: you'll have to test it yourself, or take a look at the documentation to find out.)
Still, if you want to deal with the mysteries of the HTTP protocol by yourself... Well, I wish you luck!
You probably want to use cURL.
<?php
// create a new cURL resource
$ch = curl_init();
// set URL and other appropriate options
curl_setopt($ch, CURLOPT_URL, "http://www.example.com/");
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// grab the URL contents and store them in $output
$output = curl_exec($ch);
// close cURL resource, and free up system resources
curl_close($ch);
?>
With fsockopen(), you get the raw TCP data, not the HTTP contents. I assume you also see the HTTP headers, right? If it's in chunked encoding, you will get all the chunk headers.
This is a known issue. Someone posted a solution here on how to remove chunk headers.
What I need is an equivalent for PHP's fseek() function. The function works on files, but I have a variable that contains binary data and I want to work on it. I know I could use substr(), but that would be lame - it's used for strings, not for binary data. Also, creating a file and then using fseek() is not what I am looking for either.
Maybe something constructed with streams?
EDIT: Okay, I'm almost there:
$data = fopen('data://application/binary;binary,'.$bin,'rb');
Warning: failed to open stream: rfc2397: illegal parameter
Kai:
You have almost answered yourself here. Streams are the answer. The following manual entry will be enlightening: http://us.php.net/manual/en/wrappers.data.php
It essentially allows you to pass arbitrary data to PHP's file handling functions such as fopen (and thus fseek).
Then you could do something like:
<?php
$data = fopen('data://mime/type;encoding,' . $binaryData, 'rb');
fseek($data, 128);
?>
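As for the "illegal parameter" warning, here is an untested sketch of a workaround: base64-encoding the payload keeps the data URI well-formed regardless of which bytes it contains.
$bin = str_repeat("\xAB\xCD", 128);   // 256 bytes of sample binary data
$h = fopen('data://application/octet-stream;base64,' . base64_encode($bin), 'rb');
fseek($h, 128);
$tail = fread($h, 128);               // bytes 128..255
fclose($h);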
fseek on data in a variable doesn't make sense. fseek just positions the file handle to the specified offset, so the next fread call starts reading from that offset. There is no equivalent of fread for strings.
What's wrong with substr()?
With a file you would do:
$f = fopen(...);
fseek($f, $offset);
$x = fread($f, $len);
with substr:
$x = substr($var, $offset, $len);
I'm guessing, but maybe what is being asked for is a way to access the bytes in a variable through a pointer (using it like an array of bytes, as you could in C, without the memory overhead of putting the data in PHP arrays) and being able to edit them in place without the overhead of copying the data.
Not being able to do this is a BIG problem, but if the operating system caches disk data well, using fseek on a temporary file could be a workaround.
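For what it's worth, PHP strings can already be indexed byte by byte, which covers reading and editing single bytes in place (variable names here are just illustrative):
$bin = "\x00\x01\x02\x03";   // some binary data held in a string
$byte = ord($bin[2]);        // read the byte at offset 2
$bin[2] = chr(0xFF);         // overwrite that byte in place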