I'm using a simple method of loading a remote webpage that works fine mostly:
$output = file_get_contents($item['URL']);
$html->loadHTML($output);
After which I can search for tags by type or name or ID, but the problem is that the main content I want is generated after the fact by JS in the last second. When loading in a browser, you don't notice it, but when trying to get it with file_get_contents, I get the page as it exists before the last minute JS runs.
Here's the partial code that loads what I want so you can see what I mean, but it's pretty straightforward: the page I get isn't the "complete" page.
<script type="text/javascript">ImageMachine.prototype.ImageMachine_Generate_Thumbnail = function (thumbnail_image, main_image, closeup_image, type_code) {
var thumbnail,
img;
I tried using CURL too, but no luck.
$header[] = "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
$header[] = "Connection: keep-alive";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $item['URL']);
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
// curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_COOKIE, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_ENCODING, '');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_USERAGENT, $this_time);
$output = curl_exec($ch);
curl_close($ch);
#$html->loadHTML($output);
Is there a way to get the whole thing? I want the same page a browser or user would see if they load the page.
I am calling an API using CURL.
When I run it directly in the browser or via an Ajax request, it runs well and gives XML output. The API I am calling stores the XML in a database table, and only then works properly.
However, when I call it via PHP cURL, their table is not updated.
The PHP cURL code I am using is:
curl_setopt($ch, CURLOPT_URL, $api);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "GET");
$headers = array();
$headers[] = "Accept: application/json";
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
$result = curl_exec($ch);
Sample URL : http://example.com/code=x05a&businessKeyValue=8519ada0-9e2f-e265-8698-5b6145af9704&entity=funded_program_concession&parameters=fpc_funded_program_concession_id%7C8519ada0-9e2f-e265-8698-5b6145af9704&parameters=fps_funded_program_subsidy_id%7Cf320f2d9-7c6a-0b56-2940-5b61147a0f3d
If I open this link in a browser, it works fine, but if I run it via cURL, the API does not receive the XML content.
How can I resolve this issue?
$api='http://example.com/code=x05a&businessKeyValue=8519ada0-9e2f-e265-8698-5b6145af9704&entity=funded_program_concession&parameters=fpc_funded_program_concession_id%7C8519ada0-9e2f-e265-8698-5b6145af9704&parameters=fps_funded_program_subsidy_id%7Cf320f2d9-7c6a-0b56-2940-5b61147a0f3d';
$ch = curl_init();
$headers = array();//put your headers only if you need them
curl_setopt($ch, CURLOPT_URL, $api);
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "GET");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$result= curl_exec($ch);
return $result;
This should work for you. Inside your headers array, make sure you put only the headers you need and no extras. This is a simple cURL GET call, so I guess the above code will work without any trouble.
If you want to accept XML, you can go for Accept: application/xml or Accept: application/xhtml+xml.
If the above methods don't work, provide us with your error.
echo 'Curl error: ' . curl_error($ch);
Add the above line as a checkpoint in your code, after curl_exec(), and see what the error is.
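As a rough sketch of that kind of check, the helper below turns the outcome of a request into a readable message. The function name describeCurlOutcome and its message wording are made up for illustration; only curl_errno(), curl_error() and curl_getinfo() with CURLINFO_HTTP_CODE (shown in the usage comment) are real cURL calls.

```php
<?php
// Hypothetical helper (not part of the original answer): summarizes what
// happened to a cURL request so that failures are not silently ignored.
function describeCurlOutcome($result, int $errno, string $error, int $httpStatus): string
{
    // curl_exec() returns false when the transfer itself failed.
    if ($result === false) {
        return "cURL error ($errno): $error";
    }
    // The transfer worked, but the server may still have answered with an error.
    if ($httpStatus >= 400) {
        return "HTTP error status $httpStatus";
    }
    return "OK ($httpStatus), " . strlen($result) . " bytes received";
}

// Typical use, after curl_exec() and before curl_close():
// $result = curl_exec($ch);
// echo describeCurlOutcome(
//     $result,
//     curl_errno($ch),
//     curl_error($ch),
//     (int) curl_getinfo($ch, CURLINFO_HTTP_CODE)
// );
```

Checking the HTTP status as well as the cURL error is a design choice here: a request can succeed at the transport level and still be rejected by the API.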
Maybe I just need a pair of fresh eyes....
I need to POST to a page behind .htaccess Basic Authentication. I successfully log in and get past the .htBA, then POST to the target page. I know that the script is getting to that page as I'm logging the access. However $_POST is empty -- evident from both checking the var as well as the target script not working the way it should. (I control all pages).
I've tried many combos of the various curl opts below to no avail. I'm not getting any errors from the second hit.
Thanks.
$post_array = array(
'username'=>$u,
'password'=>$p
);
// Login here
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://example.com/admin/login.php');
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:43.0) Gecko/20100101 Firefox/43.0');
curl_setopt($ch, CURLOPT_COOKIEJAR, realpath('temp/cookies.txt') );
curl_setopt($ch, CURLOPT_COOKIEFILE, realpath('temp/cookies.txt'));
curl_setopt($ch, CURLOPT_COOKIESESSION, true);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_MAXREDIRS, 10);
curl_setopt($ch, CURLOPT_REFERER, 'http://example.com/index.php');
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($post_array));
curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_ANY);
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
'method' => 'POST',
"Authorization: Basic ".base64_encode("$username:$password"),
));
$logInFirst = curl_exec ($ch);
/* Don't close handle as need the auth for next page
* load up a new page */
$post_array_2 = array(
'localfile'=>'my_data.csv',
'theater_mode'=>'normal'
);
//curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_COOKIEJAR, realpath('temp/cookies.txt') );
curl_setopt($ch, CURLOPT_COOKIEFILE, realpath('temp/cookies.txt'));
curl_setopt($ch, CURLOPT_COOKIESESSION, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_REFERER, 'http://example.com/admin/post_here.php');
curl_setopt($ch, CURLOPT_URL, 'http://example.com/admin/post_here.php');
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($post_array_2));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
//curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_ANY);
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
'Content-Type: multipart/form-data;',
"Authorization: Basic ".base64_encode("$username:$password"),
));
$runAi = curl_exec($ch);
$run_error = curl_error($ch); echo '<hr>'.$run_error.'<hr>';
curl_close($ch);
Here's the code on the target page (post_here.php), which results in a zero count. So I know that the target script is being hit, and based on the output, there are no POSTs.
$pa = ' There are this many keys in POST: '.count($_POST);
foreach ($_POST as $key => $value) {
$pa .= ' '.$key.':'.$value.' ---- ';
}
The error is on the second request:
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($post_array_2));
// ...
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
'Content-Type: multipart/form-data;',
// ...
You send the header Content-Type: multipart/form-data but the data is encoded as application/x-www-form-urlencoded (by http_build_query()).
The data you want to post on the second request contains 'localfile'=>'my_data.csv'. If you want to upload a file on the second request then the content type is correct (but you don't need to set it manually). Don't use http_build_query() but pass an array to CURLOPT_POSTFIELDS, as is explained in the documentation.
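To make the mismatch concrete, here is a small sketch of what http_build_query() produces for the fields from the question. The variable names $fields and $urlencoded are only for illustration.

```php
<?php
// The fields from the question's second request.
$fields = array(
    'localfile'    => 'my_data.csv',
    'theater_mode' => 'normal',
);

// http_build_query() returns a string. When CURLOPT_POSTFIELDS receives a
// string, the body is sent as application/x-www-form-urlencoded.
$urlencoded = http_build_query($fields);
echo $urlencoded, PHP_EOL; // localfile=my_data.csv&theater_mode=normal

// When CURLOPT_POSTFIELDS receives the array itself, cURL encodes the body as
// multipart/form-data and sets the matching Content-Type header (including
// the boundary), so the header should not be set by hand:
// curl_setopt($ch, CURLOPT_POSTFIELDS, $fields);
```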
Also, for file uploads you have to put an @ in front of the file name and make sure curl is able to find the file. The best way to do this is to use the complete file path:
$post_array_2 = array(
'localfile' => '@'.__DIR__.'/my_data.csv',
'theater_mode' => 'normal'
);
The code above assumes my_data.csv is located in the same directory as the PHP script (which is not recommended). You should use dirname() to navigate from the script's directory to the directory where the CSV file is stored, to compose the correct path.
As the documentation also states, since PHP 5.5 the @ prefix is deprecated and you should use the CURLFile class for file uploads:
$post_array_2 = array(
'localfile' => new CURLFile(__DIR__.'/my_data.csv'),
'theater_mode' => 'normal'
);
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_array_2);
As a side note, when you call curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_ANY); it means curl is allowed to negotiate the authentication method with the server. But you also send the header "Authorization: Basic ".base64_encode("$username:$password") and this removes any negotiation because it forces Authorization: Basic.
Also, in order to negotiate, curl needs to know the (user, password) combination. You should always use curl_setopt($ch, CURLOPT_USERPWD, "$username:$password") to tell it the user and password. Manually crafting the Authorization header is not recommended.
If you are sure Authorization: Basic is the method you need, then you can use curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_BASIC).
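Put together, the two authentication options could be applied along these lines. This is a minimal sketch: the helper name basicAuthOptions and the credentials 'alice' / 'secret' are placeholders, not anything from the question.

```php
<?php
// Hypothetical helper (not from the original answer): the options that let
// cURL build the Authorization: Basic header itself.
function basicAuthOptions(string $username, string $password): array
{
    return array(
        CURLOPT_HTTPAUTH => CURLAUTH_BASIC,              // force Basic, no negotiation
        CURLOPT_USERPWD  => $username . ':' . $password, // credentials for cURL to encode
    );
}

// Placeholder credentials, for illustration only.
$ch = curl_init();
$applied = curl_setopt_array($ch, basicAuthOptions('alice', 'secret'));
```

With these options set, the manually built Authorization header can be dropped from CURLOPT_HTTPHEADER.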
You do not see anything in $_POST because you are sending the header 'Content-Type: multipart/form-data;' with a URL-encoded body. Just remove that header and you should be fine.
If you want to upload a file (i.e. my_data.csv), you need to do the following:
## change the file name in your parameters as follows
'localfile'=> '@'.'./my_data.csv',
## after that, remove http_build_query() from the POST fields
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_array_2);
This will automatically add the multipart header to your POST request.
You can inspect the uploaded file using the $_FILES variable.
Finally, you can observe what curl is doing by enabling verbose mode:
curl_setopt($ch, CURLOPT_VERBOSE, true);
Tip: when using cookies, always close the curl handle after each curl_exec() call. Otherwise curl will probably not write the cookies to the cookie file after every request.
I am working with two systems. Asterisk runs on system 1. I want to run a command in Asterisk and get the result back on system 2. I make a cURL request like the one below. How do I get the value back on system 2?
exec('asterisk -rx "sip show peers"',$sip);
$POST_DATA = array(
'filename'=>$sip,
);
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL,'http://192.168.50.138/test.php');
curl_setopt($curl, CURLOPT_TIMEOUT, 30);
curl_setopt($curl, CURLOPT_POST, 1);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_POSTFIELDS, $POST_DATA);
$response = curl_exec($curl);
curl_close ($curl);
?>
Since you already have
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
in your code, curl_exec should already return the content of the page instead of a boolean.
This is a snippet of a library I use. As pointed out this might not be needed but it helped me out once...
//The content - if true, will not download the contents
curl_setopt($ch, CURLOPT_NOBODY, false);
Also it seems to have some bugs related to CURLOPT_NOBODY (which might explain why you have this issue):
http://osdir.com/ml/web.curl.general/2005-07/msg00073.html
http://curl.haxx.se/mail/curlphp-2008-03/0072.html
I have an affiliate URL like http://track.abc.com/?affid=1234. Opening this link redirects to http://www.abc.com. I want to request http://track.abc.com/?affid=1234 using cURL. How can I get http://www.abc.com with cURL?
If you want cURL to follow redirect headers from the responses it receives, you need to set that option with:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
You may also want to limit the number of redirects it follows using:
curl_setopt($ch, CURLOPT_MAXREDIRS, 3);
So you'd use something similar to this:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://track.abc.com/?affid=1234");
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLOPT_MAXREDIRS, 3);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
$data = curl_exec($ch);
Edit: Question wasn't exactly clear but from the comment below, if you want to get the redirect location, you need to get the headers from cURL and parse them for the Location header:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://track.abc.com/?affid=1234");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_HEADER, true);
$data = curl_exec($ch);
This will give you the headers returned by the server in $data, simply parse through them to get the location header and you'll get your result. This question shows you how to do that.
I wrote a function that will extract any header from a cURL header response.
function getHeader($headerString, $key) {
preg_match('#\s\b' . $key . '\b:\s.*\s#', $headerString, $header);
return substr($header[0], strlen($key) + 3, -2);
}
In this case, you're looking for the value of the header Location. I tested the function by retrieving headers from a TinyURL, that redirects to http://google.se, using cURL.
$url = "http://tinyurl.com/dtrkv";
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$data = curl_exec($ch);
curl_close($ch);
$location = getHeader($data, 'Location');
var_dump($location);
Output from the var_dump.
string(16) "http://google.se"
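Header names are case-insensitive and lines usually end in \r\n, so a slightly more tolerant variant can be sketched as follows. The name findHeader and the sample response string are made up for illustration; this is an alternative under those assumptions, not the answerer's function.

```php
<?php
// Hypothetical helper: returns the value of the first header with the given
// name (matched case-insensitively), or null when it is not present.
function findHeader(string $rawHeaders, string $name): ?string
{
    // ^ and $ match per line (m flag); the optional \r covers CRLF endings.
    $pattern = '/^' . preg_quote($name, '/') . ':[ \t]*(.*?)[ \t]*\r?$/mi';
    if (preg_match($pattern, $rawHeaders, $match) !== 1) {
        return null;
    }
    return $match[1];
}

// Made-up response headers, shaped like what CURLOPT_HEADER returns.
$sampleHeaders = "HTTP/1.1 301 Moved Permanently\r\n"
    . "location: http://google.se\r\n"
    . "Content-Type: text/html\r\n\r\n";

var_dump(findHeader($sampleHeaders, 'Location'));
```

Returning null for a missing header avoids the notice that indexing an empty match array would raise.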
I am trying to parse a page which contains some links. These links, if followed, will redirect to some files to download.
For example, a Download link which redirects to http://example.com/1.pdf.
I don't want to download the file, I just want to get the file link (in this case http://example.com/1.pdf).
I am trying this:
curl_setopt($ch, CURLOPT_RETURNTRANSFER, FALSE); // Return in string
curl_setopt($ch, CURLOPT_URL, $url);
curl_exec($ch);
var_dump(curl_getinfo($ch));
But, it gives me the file contents.
Does anyone have any idea how to do this?
==EDIT==
Thank you guys. I solved it like this:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLINFO_HEADER_OUT, TRUE);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, TRUE);
curl_setopt($ch, CURLOPT_NOBODY, TRUE);
curl_exec($ch);
$info = curl_getinfo($ch);
Now, $info contains the header and I can get the link from it.
The reason the output is being sent to the screen is because you're telling cURL to do so. If you want to store the response in a variable the following line:
curl_setopt($ch, CURLOPT_RETURNTRANSFER, FALSE);
should read:
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
Then, actually retrieve the returned output from curl_exec like so:
$output = curl_exec($ch);
Once you have the returned HTML content from the remote page in the $output variable, you can use DOMDocument or regex (but preferably DOM) to parse out any information you want.
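As an illustration of the DOM route, the sketch below collects the href of every link in an HTML string. The helper name extractLinks and the sample markup are assumptions for the example; in practice the string would be the $output returned by curl_exec().

```php
<?php
// Hypothetical helper: collects the href of every <a> element in an HTML string.
function extractLinks(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = array();
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        if ($anchor->hasAttribute('href')) {
            $links[] = $anchor->getAttribute('href');
        }
    }
    return $links;
}

// Made-up sample page standing in for the cURL response.
$sample = '<html><body><a href="http://example.com/1.pdf">Download</a></body></html>';
print_r(extractLinks($sample));
```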
UPDATE
I can't tell because the question is vaguely worded: is there actually a Location header redirect happening? If so, you'll want to do as @heiko suggests to prevent cURL from following the redirect and retrieve the headers. Then you can easily parse the contents of the Location header:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, FALSE);
curl_setopt($ch, CURLOPT_HEADER, TRUE); // add header output
# make sure to not follow Location: Header
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, FALSE);
# add the response headers to the output, so that you can find the Location header in there!
curl_setopt($ch, CURLOPT_HEADER, TRUE);
Set CURLOPT_RETURNTRANSFER to 1, and use htmlentities() if you want to display the HTML source on your page; otherwise just echo the variable (to display the page [redirects to google]).
<?php
$url = "http://www.google.co.in";
$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // Return in string
curl_setopt($ch, CURLOPT_URL, $url);
$varx = curl_exec($ch);
echo htmlentities($varx);
?>
With the $varx variable, use regular expressions to match the data you want.