I have two questions:
1) When I use cURL, my pictures are not shown. Why?
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://www.google.com'); // include the scheme in the URL
curl_setopt($ch, CURLOPT_HEADER, 0); // don't include response headers in the output
curl_exec($ch); // without CURLOPT_RETURNTRANSFER, this prints the page directly
curl_close($ch);
?>
2) The destination page has a text field. How can I fill it using cURL?
Thanks
I think that:
the images use relative paths, so those paths break when you serve the page through cURL;
you should use the result of curl_exec and write that content into your textarea.
To get the result back from curl_exec, set the CURLOPT_RETURNTRANSFER option, like so:
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
Now you can get the value like so:
$result = curl_exec($ch);
It's all in the manual by the way.
To get the images working, you might want to do a regular-expression (or string) replace: look for the paths inside each src="" attribute and prefix them with, let's say, http://www.google.com (or whatever host is correct). A minimal sketch of that idea follows.
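This sketch assumes the page comes from http://www.google.com and that its images use site-relative paths (a leading /); paths like foo.jpg would need extra handling:
<?php
// Sketch only: fetch the page, then prefix site-relative src paths so the
// images resolve against the original host rather than your own server.
$base = 'http://www.google.com';
$ch = curl_init($base);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$html = curl_exec($ch);
curl_close($ch);
// Turn src="/images/logo.png" into src="http://www.google.com/images/logo.png"
echo str_replace('src="/', 'src="' . $base . '/', $html);
?>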
Related
I have the following code to capture the HTML of a given URL:
$url = "https://fnet.bmfbovespa.com.br/fnet/publico/exibirDocumento?id=77212&cvm=true";
$ch = curl_init();
curl_setopt($ch, CURLOPT_CAINFO, '/etc/ssl/certs/cacert.pem');
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
$html = curl_exec($ch);
echo "$url\n\n";
die($html);
For some reason the result of the following URL is not as expected:
"https://fnet.bmfbovespa.com.br/fnet/publico/exibirDocumento?id=77212&cvm=true"
Instead of the HTML, the result is a giant, meaningless-looking string.
I have successfully used the same code with other pages on the same domain.
I can assure you that the desired page's content is not loaded by any JS/AJAX method (I tested by loading the page with JavaScript disabled).
My question is:
Is there any cURL option I should set to correct this?
My whole site depends on capturing these pages.
Any help would be truly appreciated.
That is base64 encoded; all you need to do is decode it back to plain text, like this:
echo base64_decode($html);
and you will see the HTML.
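For completeness, a sketch that plugs the decode into the code above; the strict-mode flag and the raw fallback are my additions, not part of the original answer:
$html = curl_exec($ch);
// strict mode: base64_decode() returns false if $html is not valid base64
$decoded = base64_decode($html, true);
echo ($decoded !== false) ? $decoded : $html; // fall back to the raw response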
I have a URL like this: https://facebook.com/5. I want to get the HTML of that page, just like view source.
I tried using file_get_contents, but that didn't return the correct content.
Am I missing something?
Is there any other function I can use?
If I can't get the HTML of that page, what special thing did the developers do while coding the site to prevent this?
Warning: this may be off topic, but does this task have to be done using PHP?
Since this sounds like a web-scraping task, I think you would get more mileage out of CasperJS.
With it, you can target precisely what you want to retrieve from the Facebook page, rather than grabbing the whole content, which, as of this writing, I assume is generated by multiple requests and rendered through a virtual DOM.
Please note that I haven't tried retrieving content from Facebook, but I've done this with multiple other services.
Good luck!
You may want to use curl instead: http://php.net/manual/en/curl.examples.php
Edit:
Here is an example of mine:
$url = 'https://facebook.com/5';
$ssl = true;
$ch = curl_init();
$timeout = 3;
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, $ssl);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
$data = curl_exec($ch);
curl_close($ch);
Note that depending on the website's vhost configuration, a trailing slash on the URL can make a difference.
Edit: Sorry for the undefined variable; I copied the snippet out of a helper method I use. Now it should be all right.
Yet another edit:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
Adding this option makes cURL follow the redirects that are apparently happening in your example. Since you said it was an example, I didn't actually run it before. Now I did, and it works.
I have the following URL:
http://www.davesinclairstpeters.com/auto2_inventorylist?i=37647&c=12452&npg=1&ns=50&echo=2
I want to retrieve the content of this URL using cURL, but every time I make the request it returns an error, as if the required parameters were not being passed.
Below is my code
$ch = curl_init(); // start CURL
curl_setopt($ch, CURLOPT_URL, $json_url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HTTPGET, true);
curl_setopt($ch, CURLOPT_FRESH_CONNECT, true);
$response = curl_exec($ch);
That page doesn't give any indication that the information isn't being passed properly. In fact, it tells you that the information has been received; by viewing the source, you can see:
<!--
javax.servlet.forward.request_uri = /auto2_inventorylist
...
javax.servlet.forward.servlet_path = /auto2_inventorylist
...
javax.servlet.forward.query_string = i=37647&c=12452&npg=1&ns=50&echo=2
-->
which tells you the information has in fact been received.
Therefore, it's not a problem with your code, but with the website itself. You should make sure the URL you are using is valid, or contact the website to get more information.
With regard to the code itself: the curl_setopt($ch, CURLOPT_HTTPGET, true); isn't necessary, as it is already the default, and you can also pass the URL as an argument to curl_init. This doesn't impact performance, but it makes for neater code.
$ch = curl_init($json_url); // start CURL
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FRESH_CONNECT, true);
$response = curl_exec($ch);
Your code is perfectly fine. If something wrong is returned, simply paste the URL into your web browser and check the result. In this case the website simply failed for some reason; there's nothing you can do about that, as the problem is NOT on your side.
This URL yields a page of cars with links to more cars. It looks like the URL you're starting with is old, or has some sort of expiration factor that's not obvious.
Not knowing what sort of filtering parameters you're shooting for, it's hard to say what else may be wrong, other than your starting URL being bad.
working url:
http://www.davesinclairlincolnstpeters.com/all-inventory/index.htm?listingConfigId=auto-new%2Cauto-used&compositeType=&year=&make=&start=0&sort=&facetbrowse=true&quick=true&preserveSelectsOnBack=true&searchLinkText=SEARCH&showInvTotals=false&showRadius=false&showReset=true&showSubmit=true&facetbrowseGridUnit=BLANK&showSelections=true&dependencies=model%3Amake%2Ccity%3Aprovince%2Ccity%3Astate&suppressAllConditions=false
I have one problem regarding cURL.
I am trying to send a cURL request to http://whatismyipaddress.com
So, is there any way to get the cURL response as an array?
Right now it displays the HTML page, but I want the response as an array.
Here is my code...
$ipAdd = '121.101.152.170';
$ch = curl_init("http://whatismyipaddress.com/ip/".$ipAdd);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$output = curl_exec($ch);
curl_close($ch);
echo $output;
So currently I am getting the full HTML page, but I want the output as an array or XML.
The "easiest" path is going to be to find the surrounding text & extract based on that.
If you're willing to step your dedication to this up, you can use something like http://simplehtmldom.sourceforge.net/ & go from there.
edit: actually you can use this (it is built into php5) - http://php.net/manual/en/book.dom.php
more specifically, this - http://www.php.net/manual/en/domdocument.loadhtml.php
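A sketch of that approach, assuming $output already holds the HTML from the curl_exec call above; the //table//tr XPath query is a hypothetical selector you would adjust after inspecting the real page source:
// Sketch: parse the fetched HTML with PHP's built-in DOM extension
// and collect the text of each table row into an array.
$doc = new DOMDocument();
libxml_use_internal_errors(true); // real-world HTML is rarely valid markup
$doc->loadHTML($output);
libxml_clear_errors();
$xpath = new DOMXPath($doc);
$details = array();
foreach ($xpath->query('//table//tr') as $row) { // hypothetical selector
    $details[] = trim($row->textContent);
}
print_r($details); // the response, now as an array of row texts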
I have a webpage that requires a login.
I am using cURL to build the HTTP authentication request. It works, but I am not able to grab all the content from my links; I'm missing all the images.
How can I grab the images as well?
<?php
// create cURL resource
$URL = "http://10.123.22.38/nagios/nagvis/nagvis/index.php?map=Nagvis_CC";
// init curl
$ch = curl_init();
// set HTTP authentication options
curl_setopt($ch, CURLOPT_URL, $URL); // load in the destination URL
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_BASIC); // use HTTP Basic authentication
curl_setopt($ch, CURLOPT_USERPWD, "guest:test"); // pass the user name and password
// grab the URL and pass it to the browser
$content = curl_exec($ch);
$result = curl_getinfo($ch);
// close cURL resource, and free up system resources
curl_close($ch);
echo $content;
print_r($result); // curl_getinfo() returns an array; echo would only print "Array"
?>
I'm getting this warning message: Warning: curl_error(): 2 is not a valid cURL handle resource in C:\xampp\htdocs\LiveServices\LoginTest.php on line 24
cURL doesn't get images or any other 'content'; it just gets the raw HTML page. Are you saying you are missing <img /> tags that are present on the original page?
cURL also doesn't parse any CSS or JavaScript, so if the content is modified by those, it won't come through. For example, you may be unable to get an element's background-image unless you do more scraping, that is, fetch the associated CSS file and parse that as well.
The main issue I have is that I cannot see the HTML, so I cannot be sure what the problem is. Having said that, two things occur to me.
The first thing to check is whether the image paths are relative. If they appear in the form ../xyz/foo.jpg or foo.jpg, you will either need to rewrite each image src to a full URL or add a <base> tag to the HTML; a sketch of the latter follows.
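This assumes $content holds the HTML your code fetched, and that the page's <head> tag has no attributes (a simple string replace, so just a sketch):
// Sketch: inject a <base> tag so relative paths resolve against the
// original Nagios host instead of your own server.
$base = 'http://10.123.22.38';
$content = str_replace('<head>', '<head><base href="' . $base . '/">', $content);
echo $content;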
For parsing HTML, use the Simple HTML DOM library; it is faster than rolling your own.
The second issue may be that the images also require the user to be logged in. If that is the case, you would have to download the images too, and either embed them in the content after base64-encoding them or store them temporarily on your server; a sketch of the embedding option follows.
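Reusing the guest:test credentials from your code, the sketch below fetches an image over the same authenticated channel and inlines it as a base64 data URI (the helper name is my own):
// Sketch: download an image with the same Basic auth credentials and
// return a data URI that can be dropped straight into an <img> src.
function fetch_image_as_data_uri($url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_BASIC);
    curl_setopt($ch, CURLOPT_USERPWD, 'guest:test');
    $data = curl_exec($ch);
    $type = curl_getinfo($ch, CURLINFO_CONTENT_TYPE); // e.g. "image/png"
    curl_close($ch);
    return 'data:' . $type . ';base64,' . base64_encode($data);
}
// Usage with the background image from the HTML below:
$img = fetch_image_as_data_uri('http://10.123.22.38/nagios/nagvis/nagvis/images/maps/Nagvis_CC.png');
echo '<img id="backgroundImage" src="' . $img . '"/>';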
Here is some of the HTML. The images that I want to get:
<img id="backgroundImage" style="z-index: 0;" src="/nagios/nagvis/nagvis/images/maps/Nagvis_CC.png"/>
<a href="/nagios/cgi-bin/extinfo.cgi?type=2&host=business_processes&service=NLThirdPartyLive" target="_self">
And a lot of JavaScript.
I tried to use the Simple HTML DOM library, but the output is an array, so nothing useful is displayed:
require("/simplehtmldom/simple_html_dom.php");
$ch = curl_init();
curl_setopt($ch, CURLOPT_USERAGENT, 'WhateverBrowser1.45');
curl_setopt($ch, CURLOPT_URL, 'http://10.123.22.38/nagios/nagvis/nagvis/index.php?map=Nagvis_CC');
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_BASIC); //Normal HTTP request, not SSL
curl_setopt($ch, CURLOPT_USERPWD, "guest:test" ); // Pass the user name and password
curl_setopt ($ch, CURLOPT_TIMEOUT, 60);
$result = curl_exec($ch);
$html= str_get_html($result);
echo $ret= $html->find('table[class=header_table]');
echo $result;