I'm trying to get data from an external website using cURL in PHP, but somehow it's not working.
I've checked phpinfo(), and it shows that cURL is enabled.
But my code is not working.
<?php
if (! function_exists('curl_version')) {
    exit("Enable cURL in PHP");
}
$ch = curl_init();
$timeout = 0; // 100; // set to zero for no timeout
$myHITurl = "http://www.google.com";
curl_setopt($ch, CURLOPT_URL, $myHITurl);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$file_contents = curl_exec($ch);
if (curl_errno($ch)) {
    echo curl_error($ch);
    curl_close($ch);
    exit();
}
curl_close($ch);
// dump output of api if you want during test
echo $file_contents;
?>
It times out.
I'm not using WAMP or XAMPP; the above code runs directly on the server.
I have no idea what's going wrong.
Your code is fine; I have tested it on my own server (a data center in Texas) and it worked.
My guess is that your server's IP is banned. Try to fetch a different URL and see if that works for you. If it does, then you are banned; if it doesn't, then it might be a firewall configuration issue on your server.
Disable SELinux if you are on CentOS, Fedora, or any other Red Hat distro:
nano /etc/selinux/config
Change
SELINUX=enforcing
to
SELINUX=disabled
Related
Has anybody had an issue displaying content from a website that's over HTTPS? The code was working until all the sites on the server got SSL. Maybe it's something to do with the certificate being TLS 1.2? The site I'm trying to fetch now has that kind of certificate.
$data = file_get_contents('https://www.ladygaga.com/');
echo $data;
According to php.net
When using SSL, Microsoft IIS will violate the protocol by closing the
connection without sending a close_notify indicator. PHP will report
this as "SSL: Fatal Protocol Error" when you reach the end of the
data. To work around this, the value of error_reporting should be
lowered to a level that does not include warnings. PHP can detect
buggy IIS server software when you open the stream using the https://
wrapper and will suppress the warning. When using fsockopen() to
create an ssl:// socket, the developer is responsible for detecting
and suppressing this warning.
Source link
Based on OpenSSL changes in PHP 5.6, try this:
$arrContextOptions = array(
    "ssl" => array(
        "verify_peer"      => false,
        "verify_peer_name" => false,
    ),
);
$response = file_get_contents("https://www.ladygaga.com/", false, stream_context_create($arrContextOptions));
echo $response;
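Note that turning off verify_peer and verify_peer_name disables certificate checking entirely. If the real problem is an outdated CA bundle on the server, a safer variant (a sketch only; the cacert.pem path is an assumption, adjust it to your system) keeps verification on and points PHP at a current bundle:

$arrContextOptions = array(
    "ssl" => array(
        "verify_peer"      => true,
        "verify_peer_name" => true,
        "cafile"           => "/etc/ssl/certs/cacert.pem", // hypothetical path to a current CA bundle
    ),
);
$response = file_get_contents("https://www.ladygaga.com/", false, stream_context_create($arrContextOptions));
echo $response;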
Another option would be to use curl as such:
function file_get_contents_curl( $url ) {
$ch = curl_init();
curl_setopt( $ch, CURLOPT_AUTOREFERER, TRUE );
curl_setopt( $ch, CURLOPT_HEADER, 0 );
curl_setopt( $ch, CURLOPT_RETURNTRANSFER, 1 );
curl_setopt( $ch, CURLOPT_URL, $url );
curl_setopt( $ch, CURLOPT_FOLLOWLOCATION, TRUE );
$data = curl_exec( $ch );
curl_close( $ch );
return $data;
}
$data = file_get_contents_curl("https://www.ladygaga.com");
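If the same certificate error shows up with this cURL variant, the equivalent options can be set on the handle inside file_get_contents_curl() before curl_exec(). A hedged sketch (the CA bundle path is an assumption) that keeps verification enabled rather than switching it off:

// Keep TLS verification on, but tell curl where a current CA bundle lives.
curl_setopt( $ch, CURLOPT_SSL_VERIFYPEER, true );
curl_setopt( $ch, CURLOPT_SSL_VERIFYHOST, 2 );
curl_setopt( $ch, CURLOPT_CAINFO, '/etc/ssl/certs/cacert.pem' ); // hypothetical path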
This is a piggyback off this question -- I've discovered some more information, such that the question itself needed to change.
I'm attempting to pass data from a JavaScript SPA to a PHP file (dbPatch.php), which passes it on to another PHP file (mongoPatch_backend.php). dbPatch.php is effectively acting as a middle-man to get the data to the appropriate server.
My javascript fetch looks like this:
const API = PHP_FILE_LOCATION + 'dbPatch.php/';
const query =
"needleKey=" + encodeURIComponent(needleKey) + "&" +
"needle=" + encodeURIComponent(needle) + "&" +
"newData=" + encodeURIComponent(JSON.stringify(newData));
let URI = API;
fetch(URI, {
method: 'POST',
headers: {
'Content-Type': 'application/x-www-form-urlencoded',
},
body: query
}).then.... blah...blah....
This calls my PHP file, dbPatch.php...
<?php
$API = "https://SERVER/php/mongoPatch_backend.php?";
$needleKey = $_REQUEST['needleKey'];
$needle = $_REQUEST['needle'];
$newData = $_REQUEST['newData'];
$postData = "needleKey=".urlencode($needleKey);
$postData .= "&needle=".urlencode($needle);
$postData .= "&newData=".urlencode($newData); //THIS IS THE LINE I TALK ABOUT BELOW
$data = file_get_contents($API.$postData);
echo $data;
?>
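As an aside, the same forwarding can be written with http_build_query(), which URL-encodes every field in one step (a sketch, not the code actually in use here):

// Sketch of the same forwarding using http_build_query (not the original code).
$API = "https://SERVER/php/mongoPatch_backend.php?";
$postData = http_build_query(array(
    'needleKey' => $_REQUEST['needleKey'],
    'needle'    => $_REQUEST['needle'],
    'newData'   => $_REQUEST['newData'],
));
echo file_get_contents($API . $postData);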
dbPatch.php in turn calls my mongoPatch_backend.php file...
<?php
$user = "xxx";
$pwd = 'xxx';
$needleKey = $_REQUEST['needleKey'];
$needle = $_REQUEST['needle'];
$filter = [$needleKey => $needle];
$newData = $_REQUEST['newData'];
$filter = ['x' => ['$gt' => 1]];
$options = [
'projection' => ['_id' => 0],
'sort' => ['x' => -1],
];
$bson = MongoDB\BSON\fromJSON($newData);
$value = MongoDB\BSON\toPHP($bson);
$manager = new MongoDB\Driver\Manager("mongodb://${user}:${pwd}@DBSERVER:27017");
$bulk = new MongoDB\Driver\BulkWrite;
$bulk->update(
[$needleKey => $needle],
['$set' => $value],
['multi' => false, 'upsert' => true]
);
$results = $manager->executeBulkWrite('dbname.users', $bulk);
var_dump($results);
?>
This does not work.
If I call mongoPatch_backend.php directly from the javascript, it DOES work.
This leads me to believe the problem is in the passing of the data located in the dbPatch.php file.
Further, if I call dbPatch.php with different (shorter) 'newData', it DOES work. This leads me to believe it's something to do with the data being passed in (but remember, it works if I call the backend directly... so the data is fine coming out of the JavaScript).
Dumping $newData from dbPatch.php via var_dump($_REQUEST['newData']); gives me JSON data which has been stringified but not character-escaped. It's about 5,000 characters.
Here is the interesting part.
If I change mongoPatch_backend.php to JUST <?php echo "Hello World"; ?>, I STILL do not get anything passed back through dbPatch.php to my SPA. This REALLY makes me think something is wrong in dbPatch.php.
So... I comment out the $postData .= "&newData=".urlencode($newData); line in dbPatch.php... and I DO get the "Hello World" back.
If I just remove .urlencode and instead have $postData .= "&newData=".$newData; I still get nothing back.
So the problem seems to be with putting $newData in my request. mongoPatch_backend.php is not even doing anything with $newData... dbPatch.php, it appears, is simply having trouble sending that data.
Unfortunately, I'm not sure where to go from here... given that I do, indeed, need to send $newData to the backend.
EDIT: In response to suggestions that I use POST instead of GET, I did a search and found this Stack Overflow question: POST data to a URL in PHP.
From that, I now have this:
dbPatch.php:
$url = 'https://SERVERNAME/php/mongoPatch_backend.php';
$myvars = 'myvar1=' . "TEST" . '&myvar2=' . "ALSOTEST";
$ch = curl_init( $url );
curl_setopt( $ch, CURLOPT_POST, 1);
curl_setopt( $ch, CURLOPT_POSTFIELDS, $myvars);
curl_setopt( $ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt( $ch, CURLOPT_HEADER, 0);
curl_setopt( $ch, CURLOPT_RETURNTRANSFER, 1);
$response = curl_exec( $ch );
echo $response;
and I changed my mongoPatch_backend.php to:
<?php
echo "HELLO WORLD";
?>
... and I get nothing as the response (that is, I do not get "HELLO WORLD" from the backend).
My PHP log shows no errors.
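Since curl_exec() simply returns false on failure without writing anything to the PHP log, a quick way to see what is actually happening (a debugging sketch, not the code above) is to inspect the curl error and the HTTP status right after the call:

$response = curl_exec($ch);
if ($response === false) {
    // transport-level failure: DNS, TLS, timeout, refused connection, ...
    echo 'cURL error: ' . curl_error($ch);
} else {
    // the request went through; see what status the backend answered with
    echo 'HTTP status: ' . curl_getinfo($ch, CURLINFO_HTTP_CODE);
}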
My curl config from phpinfo() is:
cURL support enabled
cURL Information 7.59.0
Age 4
Features
AsynchDNS Yes
CharConv No
Debug No
GSS-Negotiate No
IDN Yes
IPv6 Yes
krb4 No
Largefile Yes
libz Yes
NTLM Yes
NTLMWB No
SPNEGO Yes
SSL Yes
SSPI Yes
TLS-SRP No
HTTP2 Yes
GSSAPI No
KERBEROS5 Yes
UNIX_SOCKETS No
PSL No
Protocols dict, file, ftp, ftps, gopher, http, https, imap, imaps, ldap, pop3, pop3s, rtsp, scp, sftp, smb, smbs, smtp, smtps, telnet, tftp
Host x86_64-pc-win32
SSL Version OpenSSL/1.1.0h
ZLib Version 1.2.11
libSSH Version libssh2/1.8.0
I'm not entirely sure why, but this question:PHP - CURL is enabled but not working led me to an example of using cURL that worked.
My dbPatch.php now looks like this and appears to work...
<?php
$url = 'https://SERVERNAME/php/mongoPatch_backend.php';
$params = 'needleKey=' . $_REQUEST['needleKey'] . '&needle=' . $_REQUEST['needle'] . '&newData='. $_REQUEST['newData'];
if (! function_exists ( 'curl_version' )) {
exit ( "Enable cURL in PHP" );
}
$ch = curl_init ();
$timeout = 0; // 100; // set to zero for no timeout
// note: the request goes to $myHITurl below; the $url defined above is unused
$myHITurl = "http://152.61.248.218/php/mongoPatch_backend.php";
curl_setopt ( $ch, CURLOPT_URL, $myHITurl );
curl_setopt ( $ch, CURLOPT_HEADER, 0 );
curl_setopt ( $ch, CURLOPT_RETURNTRANSFER, 1 );
// setting CURLOPT_POSTFIELDS makes this a POST with $params as the body
curl_setopt( $ch, CURLOPT_POSTFIELDS, $params);
curl_setopt ( $ch, CURLOPT_CONNECTTIMEOUT, $timeout );
$file_contents = curl_exec ( $ch );
if (curl_errno ( $ch )) {
echo curl_error ( $ch );
curl_close ( $ch );
exit ();
}
curl_close ( $ch );
echo "$file_contents";
?>
file_get_contents() is primarily meant to read a file into a string -- think of it like opening a text document in Notepad or TextEdit.
For API requests to dynamically rendered PHP files, you'll want to use PHP's cURL library:
http://php.net/manual/en/book.curl.php
http://codular.com/curl-with-php
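A minimal sketch of such a request (the endpoint and fields here are placeholders, not taken from the question):

$ch = curl_init('https://example.com/api/endpoint.php'); // placeholder URL
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query(array('key' => 'value'))); // placeholder fields
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($ch);
curl_close($ch);
echo $response;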
I am working on a site where realtors can create online presentations for clients, using listings from a website called streeteasy.com (a Zillow-owned site). I can successfully scrape everything I want, with one exception: you have to be logged in to see the sold price for closed properties. The login is very low-security as far as I can tell, and it sets a cookie for 10 years when you log in. I used the Chrome developer tools to get the POST data and the POST URL.
You can see the page comes up just fine, but it still says 'Register to see what it closed for about 5 weeks ago' instead of the final sale price. When you are logged in, the message is different. I still cannot get this code to work and I have no idea why. I am testing on my WAMP setup, so the cookie.txt file is not a permissions issue. I even tried creating my own cookie file from my browser cookies and just accessing the page, but still no luck.
$url = "http://streeteasy.com/sale/1253471";
$login_url = 'https://streeteasy.com/nyc/user/sign_in';
$data = 'utf8=%E2%9C%93&authenticity_token=MYCz6A5PK%2B3I3N%2BgHekaNc4IuQEruBrCPBjSxm1B9dg%3D&do=login&return_to=http%3A%2F%2Fstreeteasy.com%2F&origin=&page_category=&page_type=&boundary=&label=&remember=true&return_to_save_search=&login=john%40telesh.com&password=dman4578';
login($login_url,$data);
echo grab_page ($url);
function login($url,$data){
$fp = fopen("cookie.txt", "w");
fclose($fp);
$login = curl_init();
curl_setopt($login, CURLOPT_COOKIEJAR, "cookie.txt");
curl_setopt($login, CURLOPT_COOKIEFILE, "cookie.txt");
curl_setopt($login, CURLOPT_TIMEOUT, 40000);
curl_setopt($login, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($login, CURLOPT_URL, $url);
curl_setopt($login, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']);
curl_setopt($login, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($login, CURLOPT_POST, TRUE);
curl_setopt($login, CURLOPT_POSTFIELDS, $data);
return curl_exec ($login);
}
function grab_page($site){
$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']);
curl_setopt($ch, CURLOPT_TIMEOUT, 40);
curl_setopt($ch, CURLOPT_COOKIEFILE, "cookie.txt");
curl_setopt($ch, CURLOPT_URL, $site);
return curl_exec ($ch);
}
(I hope that is a dummy account...)
First off, to do a successful login you need a valid "authenticity_token", which you get from the login page and which is probably different for each session. Your code has a HARDCODED authenticity_token, which probably expired long ago and was only ever valid in your browser. Second, your login() function is buggy enough that it SHOULD result in a 500 internal server error when calling the script from WAMP, because the output buffer it creates is never ended. Third, for some weird reason, the site seems to REQUIRE the login request to contain a browser-like Accept header, for example Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8 -- but curl, by default, just sends Accept: */*, which it doesn't accept.
So: start a cookie session, grab the sign-in page, parse the authenticity_token out of the sign-in page (actually, just parse out all the "input" tags), log in with the fresh token, and make sure to send Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8 with the login request.
Using hhb_curl from https://github.com/divinity76/hhb_.inc.php/blob/master/hhb_.inc.php, here is a working example:
<?php
declare(strict_types = 1);
require_once ('hhb_.inc.php');
$hc = new hhb_curl ();
$hc->_setComfortableOptions ();
$hc->exec ( 'https://streeteasy.com/nyc/user/sign_in' );
$html = $hc->getResponseBody ();
$domd = @DOMDocument::loadHTML ( $html );
$inputs = array ();
foreach ( $domd->getElementsByTagName ( "input" ) as $input ) {
$inputs [$input->getAttribute ( "name" )] = $input->getAttribute ( "value" );
}
assert ( array_key_exists ( 'authenticity_token', $inputs ) );
$inputs ['login'] = 'john@telesh.com';
$inputs ['password'] = 'dman4578';
var_dump ( $inputs );
$hc->setopt_array ( array (
CURLOPT_POST => true,
CURLOPT_POSTFIELDS => http_build_query ( $inputs ),
CURLOPT_URL => 'https://streeteasy.com/nyc/user/sign_in',
CURLOPT_HTTPHEADER => array (
'Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8'
)
) );
$hc->exec ();
$html = $hc->getResponseBody ();
if (false === strpos ( $html, 'You have successfully logged in' )) {
throw new RuntimeException ( 'login failed! (could not find `You have successfully logged in` in the response body!)' );
}
hhb_var_dump ( $hc->getStdErr (), $hc->getResponseBody () );
It dumps the logged-in HTML at the end, proving that it has logged in. It even verifies this by checking for the "You have successfully logged in" string in the response.
Edit: as for parsing out the "sold for" price, you can use DOMDocument for that. The HTML is messy, so it's a bit cumbersome, but I found a way to get it:
$hc->exec('http://streeteasy.com/sale/1253471');
$html = $hc->getResponseBody ();
$domd = @DOMDocument::loadHTML ( $html );
$sold_for=NULL;
foreach($domd->getElementsByTagName("div") as $div){
if(false!==strpos($div->getAttribute("class"),'status_sold')){
$sold_for=trim($div->nextSibling->nextSibling->textContent);
break;
}
}
var_dump($sold_for);
output:
string(63) "Sold for $16,550,062
as of about 5 weeks ago"
If the token expires that far in the future, why not just hardcode the cookie session into your cURL requests? Simply set the Cookie header, like so:
$headers = [
'Cookie: _actor=eyJpZCI6IitEZG9VenJLc00wVENZSXFYZWlrVlE9PSJ9--4d8449347e46c32eaae1a8189b83881b7abc6e24; _se_t=f9f7ae31-80cd-4698-bdc8-fbdff739009b; _gcl_au=1.1.73614610.1670886677; pxcts=48f6f381-7a72-11ed-bc8c-4d6a52455343; _pxvid=48f6e270-7a72-11ed-bc8c-4d6a52455343; _ga=GA1.2.1781461336.1670886678; __gads=ID=f12dde3526d46fa2:T=1670886679:S=ALNI_MY5HCF64ez59fkAePUrvcVo2YMjbg; zg_anonymous_id=%2222fd5f18-5b53-4213-8734-1983e805f19e%22; google_one_tap=0; ezab_gold_price_widget=no_widget; ezab_ursa_srp_expanded_map=reduced_map; last_search_tab=rentals; se%3Asearch%3Arentals%3Astate=false%7C%7C%7C%7C; se_login_trigger=8; ezab_orca_1999_show_agent_matches=show_agent_matches_immediately; onboarding_flow=false; se%3Asearch%3Ashared%3Astate=300||||false; anon_searcher_stage=initial; tracked_search=; __gpi=UID=000008d033ec5e0d:T=1670886679:RT=1671549920:S=ALNI_Mbh6kKe3MrPpvhniGrkDAtpQwm-Zg; se%3Abig_banner%3Asearch=%7B%229991594%22%3A1%7D; zjs_anonymous_id=%22f9f7ae31-80cd-4698-bdc8-fbdff739009b%22; _uetvid=485c00c07a7211ed8f4eeb1ea6ad9f20; ezab=%7B%22gold_price_widget%22%3A%22no_widget%22%2C%22orca_1999_show_agent_matches%22%3A%22show_agent_matches_immediately%22%2C%22ursa_srp_expanded_map%22%3A%22reduced_map%22%7D; _gid=GA1.2.608953748.1672024129; _pxff_cc=U2FtZVNpdGU9TGF4Ow==; ki_r=; g_state=; zjs_user_id=%222058917%22; remember_user_token=eyJfcmFpbHMiOnsibWVzc2FnZSI6Ilcxc3lNRFU0T1RFM1hTd2lNbDl6U2paTWIzQlJNMEpZYzNNMlVsbDJjMU1pTENJeE5qY3lNREkwTWpJeExqWTBNRFk0TWpJaVhRPT0iLCJleHAiOiIyMDMyLTEyLTI2VDAzOjEwOjIxWiIsInB1ciI6bnVsbH19--fac8f407a824eba1fb7aeac12d574c6bff565a20; user_auth_token=Q6_wPheB4hx2QseGxyy3; se_lsa=2022-12-25+22%3A10%3A41+-0500; _ses=MVUweDEySUpkZEo1c3pQek96OFpQTmgxWUoyTmIyeElaSWxhR0hCS2I2Z1VESnp4b3NnM2VJQm1Kelo2SzcxeHBIQkxBUmJwWE9zc20valdITU5XM2lRUTJhTjlSa3UzR3p0Yk5yTVJnR1pPQ0RuZ1hsZ1F0SzVCb1dEbkoxYWRCN0hYUzAxZlNZckZ1TVNxZ2diQTNzcXVVL2tIb0lNdk9nb09NNWw1V1p0NDFKNEFGcWp3SkQ2S1N4eElObjJqVWVxRm9IOEVyQXVUMEpORURadGxnUT09LS05S3JuL1Y2TWlNOEdaZFJPYWdlZ3RnPT0%3D--c9565ba32e0b2adae10d9076a0b7442b5edb5d81; ki_t=1670886678145%3B1672024129813%3B1672024241463%3B5%3B102; _px3=d3827445b59b0a26866c26cebaa57f08257be464948e0a69625cdc878a36c4af:bIEndpROKJiy9bn1Wb/Ez8jOc1wjprWLUJjlVz8c37ayGumhnRQ1s9VBQ6XmxIo/gzER64vFdxi2f60X6WLoPA==:1000:Re35eAC4PIw8oV16m1QenX871u0lx4QnQfNVNegIoEFZRiehpPIiiTARGdPIwr6O3A/Aey5zb9hn3BxkW0sD4fU+v/V1zNU/uIlbp9PByj+r4dE8usVlgb3G2grMJw5I+x/yH7N2T6qwRgDiyXcYLTRSCtiwnFQoDvvctIGTuOUOXZmpNUC40YzcPijIrv1UAzXyt0oahTIG815/me9QDA=='
];
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
This will send the auth cookie string the server checks against; as long as you pass the cookie and the session is not expired, you will be able to send all HTTP requests successfully.
I wrote a few scrapers for Streeteasy. Know that Distil Networks is the vendor they signed up with to protect their servers from DDoS/scraping/directory fuzzing -- you will likely get banned after a few requests, so make sure to use a MITM proxy to hit the captcha page and whitelist your server.
By the way, if on localhost you're experiencing the same SSL-related errors I used to, add the following to your cURL handle:
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0); // the constant is VERIFYHOST (not VERIFYPHOST); 0 disables the host check
That ought to do the trick; you'll be able to make calls with or without self-signed certs.
$data = file_get_contents("http://randomword.setgetgo.com/get.php");
var_dump($data);
I keep getting false when sending this GET request; does anyone have an idea why?
It works just fine with a simple PHP script I wrote, hosted on the same domain. Might that be the issue, and if so, how do I go about sending a GET request to this API?
I tried using cURL as well, with the same result: it works with my test script but not with the API.
As noted in the various comments above, the code originally posted works fine for me but not for the OP -- most likely due to a restriction placed on various standard PHP functions by the web host. As an alternative, cURL should be able to retrieve the content, unless a similar restriction has been placed on the standard curl functions too.
$url='http://randomword.setgetgo.com/get.php';
$ch = curl_init( $url );
curl_setopt( $ch, CURLOPT_HEADER, 0 );
curl_setopt( $ch, CURLOPT_RETURNTRANSFER, 1 );
curl_setopt( $ch, CURLOPT_USERAGENT, 'curl-wordfetcher' );
$result = curl_exec( $ch );
if( curl_errno( $ch ) ) echo 'Curl error: ' . curl_error( $ch );
curl_close( $ch );
print_r( $result );
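If file_get_contents() keeps returning false for remote URLs, it is also worth checking whether the host has disabled URL wrappers; a quick diagnostic sketch:

// file_get_contents() can only fetch URLs when allow_url_fopen is enabled.
var_dump(ini_get('allow_url_fopen'));   // "1" means remote URLs are allowed
var_dump(function_exists('curl_init')); // true means the cURL extension is available as a fallback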
I read over 20 related questions on this site and searched on Google, but to no avail. I'm new to PHP and am using PHP Simple HTML DOM Parser to fetch a URL. While this script works with local test pages, it just won't work with the URL that I actually need it for.
Here is the code that I wrote for this, following an example file that came with the PHP Simple DOM parser library:
<?php
include('simple_html_dom.php');
$html = file_get_html('http://www.farmersagent.com/Results.aspx?isa=1&name=A&csz=AL');
foreach($html->find('li.name ul#generalListing') as $e)
echo $e->plaintext;
?>
And this is the error message that I get:
Warning: file_get_contents(http://www.farmersagent.com/Results.aspx?isa=1&name=A&csz=AL) [function.file-get-contents]: failed to open stream: Redirection limit reached, aborting in /home/content/html/website.in/test/simple_html_dom.php on line 70
Please guide me on what should be done to make it work. I'm new, so please suggest a simple approach. While reading other questions and their answers on this site, I tried the cURL method to create a handle, but I failed to make it work. The cURL method that I tried keeps returning "Resource" or "Object". I don't know how to pass that to Simple HTML DOM Parser to make $html->find() work properly.
Please help!
Thanks!
Had a similar problem today. I was using cURL and it wasn't returning any error. Tested with file_get_contents() and I got...
failed to open stream: Redirection limit reached, aborting in
After a few searches I ended up with this function, which works in my case...
function getPage ($url) {
    $useragent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.89 Safari/537.36';
    $timeout = 120;
    $dir = dirname(__FILE__);
    // one cookie file per visitor; the cookies/ directory must exist and be writable
    $cookie_file = $dir . '/cookies/' . md5($_SERVER['REMOTE_ADDR']) . '.txt';
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FAILONERROR, true);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file);
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_ENCODING, "");
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_AUTOREFERER, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
    curl_setopt($ch, CURLOPT_MAXREDIRS, 10);
    curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
    curl_setopt($ch, CURLOPT_REFERER, 'http://www.google.com/');
    $content = curl_exec($ch);
    if (curl_errno($ch)) {
        echo 'error:' . curl_error($ch);
        $content = null;
    }
    // close the handle before returning (the original closed it after the return,
    // so the close never ran on success)
    curl_close($ch);
    return $content;
}
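A quick usage sketch of this function (assuming the cookies/ directory next to the script exists and is writable):

// Fetch the page through getPage(); it returns the body, or null after printing a curl error.
$content = getPage('http://www.farmersagent.com/Results.aspx?isa=1&name=A&csz=AL');
if ($content !== null) {
    echo $content;
}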
The website was checking for a valid user agent and for cookies.
The cookie issue was causing it! :)
Peace!
Resolved with:
<?php
$context = stream_context_create(
array(
'http' => array(
'max_redirects' => 101
)
)
);
$content = file_get_contents('http://example.org/', false, $context);
?>
You can also specify a proxy in the middle, if you have one:
$aContext = array('http'=>array('proxy'=>$proxy,'request_fulluri'=>true));
$cxContext = stream_context_create($aContext);
More details at: https://cweiske.de/tagebuch/php-redirection-limit-reached.htm (thanks @jqpATs2w)
Using cURL, you would need to have the CURLOPT_RETURNTRANSFER option set to true in order for curl_exec() to return the body of the request, like this:
$url = 'http://www.farmersagent.com/Results.aspx?isa=1&name=A&csz=AL';
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
// you may set this option if you need to follow redirects. Though I didn't get any in your case
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
$content = curl_exec($curl);
curl_close($curl);
$html = str_get_html($content);
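From there the returned object can be queried the same way as in the question; for example (selector borrowed from the question, may need adjusting):

// Query the parsed document; the selector is taken from the question above.
foreach ($html->find('li.name') as $e) {
    echo $e->plaintext;
}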
I also needed to add the HTTP context option ignore_errors;
see: https://www.php.net/manual/en/context.http.php
$arrContextOptions = array(
"ssl" => array(
// skip error "Failed to enable crypto" + "SSL operation failed with code 1."
"verify_peer" => false,
"verify_peer_name" => false,
),
// skip error "failed to open stream: operation failed" + "Redirection limit reached"
'http' => array(
'max_redirects' => 101,
'ignore_errors' => '1'
),
);
$file = file_get_contents($file_url, false, stream_context_create($arrContextOptions));
Obviously, I only use it for quick debugging purpose on my local environment. It is not for production.
I'm not sure exactly why you redefined the $html object with the value returned by str_get_html(). The object is meant to be used for searching the string; if you overwrite it, the original object no longer exists and cannot be used.
In any case, to search the string returned from cURL:
<?php
$url = 'http://www.example.com/Results.aspx?isa=1&name=A&csz=AL';
include('simple_html_dom.php');
# create object
$html = new simple_html_dom();
#### CURL BLOCK ####
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
# you may set this option if you need to follow redirects.
# Though I didn't get any in your case
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
$content = curl_exec($curl);
curl_close($curl);
# load the HTML string returned by curl into the object.
$html->load($content);
#### END CURL BLOCK ####
# without the curl block above you would just use this instead:
# $html->load_file($url);
# choose the tag to find; you're not looking for attributes here.
$anchor = $html->find('a', 0);
# this grabs the first anchor tag in the given string.
# you output an attribute's contents using the name of the attribute.
echo $anchor->href;
?>
You might be searching for a different tag; the method is the same.
# just outputting different attributes of the matched element
echo $anchor->class;
echo $anchor->id;
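If you need every matching element rather than just the first, find() without an index returns the whole set; a small sketch:

# iterate every anchor instead of only the first match
foreach ($html->find('a') as $a) {
    echo $a->href . "\n";
}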