I'm using PHP's cURL function to read profiles from steampowered.com. The data retrieved is XML, and only the first roughly 1000 bytes are needed.
The method I'm using is to add a Range header, which I read about in a Stack Overflow answer (curl: How to limit size of GET?). I also tried CURLOPT_RANGE, but that didn't work either.
<?php
$curl_url = 'http://steamcommunity.com/id/edgen?xml=1';
$curl_handle = curl_init($curl_url);
curl_setopt ($curl_handle, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($curl_handle, CURLOPT_CONNECTTIMEOUT, 2);
curl_setopt ($curl_handle, CURLOPT_HTTPHEADER, array("Range: bytes=0-1000"));
$data_string = curl_exec($curl_handle);
echo $data_string;
curl_close($curl_handle);
?>
When this code is executed, it returns the whole thing.
I'm using PHP Version 5.2.14.
The server does not honor the Range header. The best you can do is to cancel the connection as soon as you receive more data than you want. Example:
<?php
$curl_url = 'http://steamcommunity.com/id/edgen?xml=1';
$curl_handle = curl_init($curl_url);
$data_string = "";
function write_function($handle, $data) {
    global $data_string;
    $data_string .= $data;
    // Returning anything other than strlen($data) makes cURL abort the transfer.
    if (strlen($data_string) > 1000) {
        return 0;
    }
    return strlen($data);
}
curl_setopt ($curl_handle, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($curl_handle, CURLOPT_CONNECTTIMEOUT, 2);
curl_setopt ($curl_handle, CURLOPT_WRITEFUNCTION, 'write_function');
curl_exec($curl_handle);
echo $data_string;
Perhaps more cleanly, you could use the http wrapper (this would also use curl if PHP was compiled with --with-curlwrappers). Basically, you would call fread in a loop and then fclose the stream once you have received as much data as you want. If allow_url_fopen is disabled, you could also use a transport stream: open the stream with fsockopen instead of fopen, and send the headers manually.
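The fread loop described above could be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: read_at_most() is a hypothetical helper name, it assumes PHP 7 or later for the type declarations, and it works on any stream that is already open, whether it came from fopen() on an http:// URL or from fsockopen().

```php
<?php
// Hypothetical helper: read at most $limit bytes from an already-open stream.
function read_at_most($stream, int $limit): string
{
    $data = '';
    while (!feof($stream) && strlen($data) < $limit) {
        // Never ask for more than the number of bytes still missing.
        $chunk = fread($stream, $limit - strlen($data));
        if ($chunk === false || $chunk === '') {
            break; // read error, timeout, or nothing left to read
        }
        $data .= $chunk;
    }
    return $data;
}

// Possible usage (needs allow_url_fopen and network access):
// $stream = fopen('http://steamcommunity.com/id/edgen?xml=1', 'r');
// $head   = read_at_most($stream, 1000);
// fclose($stream); // closing early drops the rest of the response
```

Closing the stream as soon as the helper returns is what keeps the rest of the document from being downloaded.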
My PHP cURL request is timing out, as I expected it to, and gives me the error message: "Operation timed out after 120000 milliseconds with 234570 bytes received"
But how do I get the bytes that were received despite the timeout?
$url = "example.com";
$timeout = 120;
$ch = curl_init();
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 0);
curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_FORBID_REUSE, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_URL, $url);
$curl_page = curl_exec($ch);
$error = curl_error($ch);
curl_close($ch);
var_dump($curl_page, $error);
Don't use CURLOPT_RETURNTRANSFER. Use CURLOPT_FILE instead, e.g.:
$outfileh=tmpfile();
$outfile=stream_get_meta_data($outfileh)['uri'];
curl_setopt($ch,CURLOPT_FILE,$outfileh);
curl_exec($ch);
$curl_page=file_get_contents($outfile);
(And don't forget to fclose($outfileh), or you'll have a resource leak. Keep in mind that for files created with tmpfile(), fclose() also deletes the file. The good news is that PHP cleans it up at the end of execution anyway.) Another option is to use CURLOPT_WRITEFUNCTION, e.g.:
$curl_page = '';
curl_setopt ( $ch, CURLOPT_WRITEFUNCTION, function ($ch, $received) use (&$curl_page) {
    $curl_page .= $received;
    return strlen ( $received );
} );
curl_exec($ch);
This has the advantage of less IO: everything is handled entirely in memory, unlike the CURLOPT_FILE approach, which may start writing to disk, depending on the OS IO cache.
I'm currently proxying an endpoint with cURL, but the response from my proxy is more than 10 times larger than the response from the original API. Why is that, and how can I decrease the size? This is all JSON, by the way.
Original API return size = 32.2 KB
cURL return size = 488 KB
And here is my cURL script:
$ch = curl_init();
// set url
$url = 'http://domain.com/api/v1';
// set options
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_IPRESOLVE, CURL_IPRESOLVE_V4 );
curl_setopt($ch, CURLOPT_ENCODING, '');
curl_setopt($ch, CURLOPT_TIMEOUT, 5);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
// $output contains the output string
$output = curl_exec($ch);
// close curl resource to free up system resources
curl_close($ch);
return $output;
ob_start('ob_gzhandler');
PHP output buffer control was the fix for my problem. Thanks to all who tried to help!
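A likely explanation, offered as an assumption rather than something confirmed in the thread: the original API serves its JSON gzip-compressed, and setting CURLOPT_ENCODING to '' makes cURL request compressed content and decode it, so the proxy echoes the uncompressed body. ob_start('ob_gzhandler') then compresses the output again for clients that accept it. The sketch below only illustrates how much repetitive JSON shrinks under gzip; the payload is made up and is not the real API response.

```php
<?php
// Illustrative payload only: 2000 identical records, similar in spirit to a JSON list.
$record = ['id' => 1, 'name' => 'example', 'active' => true];
$json = json_encode(array_fill(0, 2000, $record));

// gzencode() produces the same gzip format that HTTP "Content-Encoding: gzip" uses.
$compressed = gzencode($json);

printf("raw: %d bytes, gzip: %d bytes\n", strlen($json), strlen($compressed));
```

The exact ratio depends on the data, so the numbers printed here say nothing about the 32.2 KB and 488 KB figures in the question.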
I have used the cURL proxy solution to work around the cross-domain (same-origin) restriction, but there is an issue with it.
The contents of my proxy.php file are:
<?php
$url = "http://www.yahoo.com";
$ch = curl_init();
$timeout = 5;
curl_setopt ($ch, CURLOPT_URL, $url);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$file_contents = curl_exec($ch);
curl_close($ch);
echo $file_contents;
?>
And this is how I am trying to execute the PHP script:
$("#tempButton").click(function(){
$("#pageContent").load('http://localhost:8080/proof/proxy.php',function() {
var t = $("#pageContent").html();
alert(t);
});
});
But the variable t shows the contents of the proxy.php file, whereas I expected it to show the contents of yahoo.com, which is the URL set in proxy.php. Am I doing something silly? #FirstTimePHP
Since the variable t shows the source of the file, the server software is evidently not recognising the script as PHP.
There are several reasons why this may happen. Missing opening tags would be one, but you of course have these.
Another potential reason is that PHP has not been loaded as a module in the server software.
Another potential reason is that the server does not parse files with the extension php (this is configurable).
You should start from the basics. Ignore the JavaScript; instead, request the URL manually and see what you get. The chances are you will see the code.
If that happens, make sure the server software (usually Apache) is configured to associate the extension php with the PHP module. Lastly, make sure PHP is actually installed properly.
Make your proxy.php like this.
<?php
if(in_array('curl', get_loaded_extensions())) {
$url = "http://www.yahoo.com";
$ch = curl_init();
$timeout = 5;
curl_setopt ($ch, CURLOPT_URL, $url);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$file_contents = curl_exec($ch);
curl_close($ch);
echo $file_contents;
}
else {
echo 'No cUrl here';
}
?>
I'm having difficulties querying a web form using cURL from a PHP script. I suspect that I'm sending something the web server does not like. To see what cURL really sends, I'd like to see the whole message that goes to the web server.
How can I set-up CURL to give me the full output?
I did
curl_setopt($ch, CURLOPT_VERBOSE, TRUE);
but that only gives me part of the header. The message content is not shown.
Thanks for all the answers! In the end, they tell me that it's not possible. I went down that road and got familiar with Wireshark. Not an easy task, but definitely worth the effort.
Have you tried CURLINFO_HEADER_OUT?
Quoting the PHP manual for curl_getinfo:
CURLINFO_HEADER_OUT - The request string sent. For this to work, add
the CURLINFO_HEADER_OUT option to the handle by calling curl_setopt()
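As a rough illustration of the manual text quoted above, the option is enabled with curl_setopt() before the request and the sent headers are read back with curl_getinfo() afterwards. This is a minimal sketch: sent_request_headers() is a hypothetical helper name, and it assumes PHP 7 or later with the curl extension loaded. Note that it returns only the request line and headers, not the request body.

```php
<?php
// Hypothetical helper: perform a request and return the request headers cURL sent.
function sent_request_headers(string $url)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // Must be enabled before curl_exec(), otherwise nothing is recorded.
    curl_setopt($ch, CURLINFO_HEADER_OUT, true);
    curl_exec($ch);
    // A string such as "GET / HTTP/1.1\r\nHost: ...", or false if unavailable.
    $headers = curl_getinfo($ch, CURLINFO_HEADER_OUT);
    curl_close($ch);
    return $headers;
}

// Possible usage (needs network access):
// echo sent_request_headers('http://example.com/');
```

For a POST, the body is whatever was passed to CURLOPT_POSTFIELDS, so logging that value alongside these headers covers most of the outgoing message.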
If you want the content, can't you just log it? I am doing something similar for my API calls:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, self::$apiURL);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_POST, true); // CURLOPT_POST is a boolean switch, not a field count
curl_setopt($ch, CURLOPT_POSTFIELDS, $dataString);
$logger->info("Sending " . $dataString);
self::$results = curl_exec($ch);
curl_close($ch);
$decoded = json_decode(self::$results);
$logger->debug("Received " . serialize($decoded));
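Or try redirecting cURL's verbose log to a file handle. This only has an effect when CURLOPT_VERBOSE is enabled, and $fp must be a handle you opened yourself, for example with fopen():
curl_setopt($ch, CURLOPT_STDERR, $fp);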
I would recommend using curl_getinfo.
<?php
curl_exec($ch);
$info = curl_getinfo($ch);
if (!empty($info) && is_array($info)) {
    print_r($info);
} else {
    throw new Exception('Curl Info is empty or not an array');
}
?>
I want to connect to a remote file and write its output to a local file. This is my function:
function get_remote_file_to_cache()
{
$the_site="http://facebook.com";
$curl = curl_init();
$fp = fopen("cache/temp_file.txt", "w");
curl_setopt ($curl, CURLOPT_URL, $the_site);
curl_setopt($curl, CURLOPT_FILE, $fp);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
curl_exec ($curl);
$httpCode = curl_getinfo($curl, CURLINFO_HTTP_CODE);
if($httpCode == 404) {
touch('cache/404_err.txt');
}else
{
touch('cache/'.rand(0, 99999).'--all_good.txt');
}
curl_close ($curl);
}
It creates the two files in the "cache" directory, but the problem is that it does not write the data into "temp_file.txt". Why is that?
Actually, the suggestion to use fwrite is partially right.
In order to avoid memory overflow problems with large files (Exceeded maximum memory limit of PHP), you'll need to setup a callback function to write to the file.
NOTE: I would recommend creating a class specifically to handle file downloads and file handles etc. rather than EVER using a global variable, but for the purposes of this example, the following shows how to get things up and running.
so, do the following:
# setup a global file pointer
$GlobalFileHandle = null;
function saveRemoteFile($url, $filename) {
global $GlobalFileHandle;
set_time_limit(0);
# Open the file for writing...
$GlobalFileHandle = fopen($filename, 'w+');
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_FILE, $GlobalFileHandle);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_USERAGENT, "MY+USER+AGENT"); //Make this valid if possible
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); # optional
curl_setopt($ch, CURLOPT_TIMEOUT, 0); # optional: 0 = no limit, 3600 = 1 hour
curl_setopt($ch, CURLOPT_VERBOSE, false); # Set to true to see all the innards
# Only if you need to bypass SSL certificate validation
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
# Assign a callback function to the CURL Write-Function
curl_setopt($ch, CURLOPT_WRITEFUNCTION, 'curlWriteFile');
# Execute the download - note we DO NOT put the result into a variable!
curl_exec($ch);
# Close CURL
curl_close($ch);
# Close the file pointer
fclose($GlobalFileHandle);
}
function curlWriteFile($cp, $data) {
global $GlobalFileHandle;
$len = fwrite($GlobalFileHandle, $data);
return $len;
}
You can also create a progress callback to show how much / how fast you're downloading, however that's another example as it can be complicated when outputting to the CLI.
Essentially, this will take each block of data downloaded, and dump it to the file immediately, rather than downloading the ENTIRE file into memory first.
Much safer way of doing it!
Of course, you must make sure the URL is correct (convert spaces to %20 etc.) and that the local file is writeable.
Cheers,
James.
Let's try sending GET request to http://facebook.com:
$ curl -v http://facebook.com
* Rebuilt URL to: http://facebook.com/
* Hostname was NOT found in DNS cache
* Trying 69.171.230.5...
* Connected to facebook.com (69.171.230.5) port 80 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.35.0
> Host: facebook.com
> Accept: */*
>
< HTTP/1.1 302 Found
< Location: https://facebook.com/
< Vary: Accept-Encoding
< Content-Type: text/html
< Date: Thu, 03 Sep 2015 16:26:34 GMT
< Connection: keep-alive
< Content-Length: 0
<
* Connection #0 to host facebook.com left intact
What happened? It appears that Facebook redirected us from http://facebook.com to the secure https://facebook.com/. Note the response body length:
Content-Length: 0
It means that zero bytes will be written to temp_file.txt. This is why the file stays empty.
Your solution is absolutely correct:
$fp = fopen('file.txt', 'w');
curl_setopt($handle, CURLOPT_FILE, $fp);
curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);
All you need to do is change the URL to https://facebook.com/.
Regarding other answers:
@JonGauthier: No, there is no need to use fwrite() after curl_exec().
@doublehelix: No, you don't need CURLOPT_WRITEFUNCTION for an operation as simple as copying contents to a file.
@ScottSaunders: touch() creates an empty file if it doesn't exist. I think that was the OP's intention.
Seriously, three answers and every single one is invalid?
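As an alternative to editing the URL by hand, cURL can be told to follow the 302 itself. The following is a minimal sketch under stated assumptions, not the asker's code: fetch_to_file() is a hypothetical helper name, it assumes PHP 7 or later with the curl extension, and it deliberately leaves CURLOPT_RETURNTRANSFER unset so that CURLOPT_FILE is the only output target.

```php
<?php
// Hypothetical helper: stream $url into $path and return the final status code.
function fetch_to_file(string $url, string $path): int
{
    $fp = fopen($path, 'w');
    if ($fp === false) {
        throw new RuntimeException("Cannot open $path for writing");
    }
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FILE, $fp);            // body goes straight to the file
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects such as the 302 to https://
    curl_exec($ch);
    // For non-HTTP URLs (e.g. file://) this is 0.
    $status = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    fclose($fp); // flush and release the handle
    return $status;
}

// Possible usage (needs network access and a writable cache/ directory):
// $code = fetch_to_file('http://facebook.com', 'cache/temp_file.txt');
```

With redirects followed, the status code returned is that of the final response, so a 404 check placed after the call still behaves as the question's function intends.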
You need to explicitly write to the file using fwrite, passing it the file handle you created earlier:
if ( $httpCode == 404 ) {
...
} else {
$contents = curl_exec($curl);
fwrite($fp, $contents);
}
curl_close($curl);
fclose($fp);
In your question you have
curl_setopt($curl, CURLOPT_FILE, $fp);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
but from PHP's curl_setopt documentation notes...
It appears that setting CURLOPT_FILE before setting CURLOPT_RETURNTRANSFER doesn't work, presumably because CURLOPT_FILE depends on CURLOPT_RETURNTRANSFER being set.
So do this:
<?php
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FILE, $fp);
?>
not this:
<?php
curl_setopt($ch, CURLOPT_FILE, $fp);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
?>
...stating "CURLOPT_FILE depends on CURLOPT_RETURNTRANSFER being set".
Reference: https://www.php.net/manual/en/function.curl-setopt.php#99082
To avoid memory leak problems:
I ran into this problem as well. It sounds silly, but the solution is to set CURLOPT_RETURNTRANSFER before CURLOPT_FILE!
It seems CURLOPT_FILE depends on CURLOPT_RETURNTRANSFER.
$curl = curl_init();
$fp = fopen("cache/temp_file.txt", "w+");
curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($curl, CURLOPT_FILE, $fp);
curl_setopt($curl, CURLOPT_URL, $url);
curl_exec ($curl);
curl_close($curl);
fclose($fp);
The touch() function doesn't do anything to the contents of the file. It just updates the modification time. Look at the file_put_contents() function.