Curl - Request with wp_remote_get() responds 500, curl_exec responds 200 - php

I am getting an odd PHP cURL error on both my local and production servers (Ubuntu 14.04.2 LTS, PHP 5.5.9-1ubuntu4.11, Apache 2.4.7).
Basically, a request to a remote API returns a status code 500 response only when it is made with wp_remote_get(); the same request returns status 200 with both curl_exec() and a browser.
My debug code:
<?php
$url = 'https://yoast.com?edd_action=activate_license&license=my-license-key-here&item_name=WooCommerce+Yoast+SEO&url=https://google.com';
// this returns status 200:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$result = curl_exec($ch);
curl_close($ch);
echo '<pre>' . print_r($result, true) . '</pre>';
// this returns status 500:
$testResp = wp_remote_get($url);
echo '<pre>' . print_r($testResp, true) . '</pre>';
I cannot figure out why wp_remote_get() gets a 500. I've tried adjusting the args passed to wp_remote_get(), but it still returns a 500.
I've also disabled all plugins while debugging.
Any ideas?

OK, after a bit of debugging, I believe the issue is the default User-Agent string that WordPress sets in wp-includes/class-http.php when it creates the HTTP request for wp_remote_get().
The option has a filter, but the default is created like so:
'user-agent' => apply_filters( 'http_headers_useragent', 'WordPress/' . $wp_version . '; ' . get_bloginfo( 'url' ) ),
So in my case, the 'user-agent' header value was: "WordPress/4.3.1; http://myurl.com"
When I hook into the filter http_headers_useragent and return an empty string, or even a different user-agent string such as: 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8) AppleWebKit/535.6.2 (KHTML, like Gecko) Version/5.2 Safari/535.6.2', the request will return a successful 200 response.
Not sure if the semicolon is the true culprit, but if I remove it and set the user-agent string to just "WordPress/4.3.1", the request is successful as well.
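The filter approach described above might be wired up roughly like this. It is a minimal sketch, assuming the code lives in a theme's functions.php or a small plugin; the callback name my_plain_user_agent is made up for illustration, and cutting the string at the semicolon is only one of the variants reported to work in this answer.

```php
<?php
// Hypothetical callback: keep only the part before the first semicolon,
// e.g. "WordPress/4.3.1; http://myurl.com" becomes "WordPress/4.3.1".
function my_plain_user_agent($user_agent) {
    $parts = explode(';', (string) $user_agent, 2);
    return trim($parts[0]);
}

// Register the callback only when running inside WordPress.
if (function_exists('add_filter')) {
    add_filter('http_headers_useragent', 'my_plain_user_agent');
}
```

Because the filter applies to every request WordPress sends, passing 'user-agent' in the args of a single wp_remote_get() call may be the less invasive choice when only one remote host is affected.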

I had the same problem: wp_remote_get was not working while plain cURL calls succeeded. The problem is indeed the User-Agent. This is my solution, based on chuuke's findings:
$args = array(
'user-agent' => 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8) AppleWebKit/535.6.2 (KHTML, like Gecko) Version/5.2 Safari/535.6.2',
);
$data = wp_remote_get($new_url_signed,$args);
Thanks

Related

file_get_contents does not work for getting json from API in PHP 7

I just upgraded PHP from 5.6 to 7.2.
Before 7.2, file_get_contents worked fine for getting JSON from the API, but after the upgrade to 7.2
it returns false.
file_get_contents($url) => false
The url is like this:
'https://username:password#project_domain/api/json/xxx/?param_a=' . $a . '&param_b='. $b
And I didn't even touch the default setting in php.ini which is probably related to file_get_contents:
allow_url_fopen = On
I googled this but found no straight answer to my problem.
What is the reason for this?
How do I fix it?
Thanks!
$url = "https://www.f5buddy.com/";
$options = array(
'http'=>array(
'method'=>"GET",
'header'=>"Accept-language: en\r\n" .
"User-Agent: Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_0 like Mac OS X; en-us) AppleWebKit/532.9 (KHTML, like Gecko) Version/4.0.5 Mobile/8A293 Safari/6531.22.7\r\n"
));
$context = stream_context_create($options);
$file = file_get_contents($url, false, $context);
$file=htmlentities($file);
echo json_encode($file);
Finally got it working with curl. It only worked when I skipped SSL verification. It is just HTTPS to my own project anyway, so it shouldn't be a security problem.
function getJsonFromAPI($url) {
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
$result = curl_exec($ch);
curl_close($ch);
$data = json_decode($result);
return $data;
}
By the way, I found that file_get_contents only worked for HTTPS requests to external URLs, not for HTTPS connections to the project itself. Any fix for that is appreciated.
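The curl function above turns peer verification off, which also hides the reason the original request failed. The following is a sketch of a variant that keeps verification on and reports the cURL error instead. The names decodeJsonOrFail and getJsonFromApiVerified are hypothetical, and $caBundle is an assumed path to a PEM file that contains the CA chain of the project's certificate.

```php
<?php
// Decode a JSON string and fail loudly instead of silently returning null.
function decodeJsonOrFail($raw) {
    $data = json_decode($raw, true);
    if ($data === null && json_last_error() !== JSON_ERROR_NONE) {
        throw new RuntimeException('Invalid JSON: ' . json_last_error_msg());
    }
    return $data;
}

// Hypothetical variant of getJsonFromAPI() that leaves certificate checks enabled.
function getJsonFromApiVerified($url, $caBundle = null) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    if ($caBundle !== null) {
        // Assumed path to a CA bundle that can validate the server certificate.
        curl_setopt($ch, CURLOPT_CAINFO, $caBundle);
    }
    $result = curl_exec($ch);
    if ($result === false) {
        // curl_error() usually names the actual TLS or connection problem.
        $error = curl_error($ch);
        curl_close($ch);
        throw new RuntimeException('cURL failed: ' . $error);
    }
    curl_close($ch);
    return decodeJsonOrFail($result);
}
```

If the exception message points at the certificate chain, fixing the CA configuration is likely a better long-term route than disabling the check.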

Getting json data from a webpage using PHP

I am trying to fetch a response from here (example url), and first, I thought I should use file_get_contents()
When I tried this, I got the following error:
Warning: file_get_contents(https://steamcommunity.com/market/pricehistory/?country=US&currency=1&appid=730&market_hash_name=SG%20553%20|%20Damascus%20Steel%20(Factory%20New)): failed to open stream: HTTP request failed! HTTP/1.0 400 Bad Request
I know this is because it is converting & to &amp;. I have tried numerous ways to counter this, but they have all failed, and after a quick Google search I came to the conclusion that file_get_contents() converts & to &amp; automatically.
My next step was to try curl. I tried the below code first:
// Get cURL resource
$curl = curl_init();
// Set some options - we are passing in a useragent too here
curl_setopt_array($curl, array(
CURLOPT_RETURNTRANSFER => 1,
CURLOPT_URL => 'http://steamcommunity.com/market/pricehistory/?country=US&currency=1&appid=730&market_hash_name='.$hash,
CURLOPT_USERAGENT => 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/532.2 (KHTML, like Gecko) ChromePlus/4.0.222.3 Chrome/4.0.222.3 Safari/532.2'
));
// Send the request & save response to $resp
$resp = curl_exec($curl);
// Close request to clear up some resources
curl_close($curl);
But this returned ‹ŠŽÿÿ)»L as the response. I wondered if this was to do with json encoding, so I tried putting it through json_decode() but it didn't work.
Next, I tried:
// Get cURL resource
$curl = curl_init();
// Set some options - we are passing in a useragent too here
curl_setopt_array($curl, array(
CURLOPT_RETURNTRANSFER => 1,
CURLOPT_URL => 'http://steamcommunity.com/market/pricehistory/',
CURLOPT_USERAGENT => 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/532.2 (KHTML, like Gecko) ChromePlus/4.0.222.3 Chrome/4.0.222.3 Safari/532.2',
CURLOPT_POST => 1,
CURLOPT_POSTFIELDS => array(
'country' => "US",
'currency' => 1,
'appid' => 730,
'market_hash_name' => "SG%20553%20|%20Damascus%20Steel%20(Factory%20New)"
)
));
// Send the request & save response to $resp
$resp = curl_exec($curl);
// Close request to clear up some resources
curl_close($curl);
But again got the response ‹ŠŽÿÿ)»L.
What does this response mean, and can I parse it? If not, how should I correctly fetch this data? Furthermore, why didn't file_get_contents() work?
I'm pretty sure this is happening because you need some type of access token to access the steam web API.
See this answer on SO.
Essentially, Steam is returning an error with the "400 Bad Request" status. This error can be ignored, however, by doing this:
<?php
$url = "https://steamcommunity.com/market/pricehistory/?country=US&currency=1&appid=730&market_hash_name=SG%20553%20%7C%20Damascus%20Steel%20(Factory%20New)";
$context = stream_context_create(array(
'http' => array(
'ignore_errors'=>true,
'method'=>'GET'
// for more options check http://www.php.net/manual/en/context.http.php
)
));
$response = file_get_contents($url, false, $context);
echo $response; // returns "[]"
?>
Make sure you take a look at this answer on SO.
Your response may be gzip-compressed; try setting CURLOPT_ENCODING. An empty string tells curl to accept every encoding it supports and to decode the response automatically.
curl_setopt($curl, CURLOPT_ENCODING, '');
If you use https don't forget to disable CURLOPT_SSL_VERIFYPEER.
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);
One more thing: if I follow your link in my browser with the debug console open,
I see that your request gets a 400 status code (Bad Request).
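The unreadable response in the question is consistent with gzip data printed as text, since every gzip stream starts with the bytes 0x1f 0x8b. The sketch below checks for those bytes and decodes the body by hand. The function name maybeGunzip is hypothetical, and the compressed input is generated locally rather than fetched from the real endpoint.

```php
<?php
// Return the decompressed body if $body looks like gzip data, else return it unchanged.
function maybeGunzip($body) {
    // Every gzip stream starts with the two magic bytes 0x1f 0x8b.
    if (strncmp($body, "\x1f\x8b", 2) === 0) {
        $decoded = gzdecode($body);
        if ($decoded !== false) {
            return $decoded;
        }
    }
    return $body;
}

// Simulate a compressed API response locally instead of calling the real endpoint.
$compressed = gzencode('{"success":true}');
echo maybeGunzip($compressed), "\n"; // prints {"success":true}
```

With curl, setting CURLOPT_ENCODING as suggested above makes this manual step unnecessary; the helper is mainly useful for inspecting a body that has already been fetched.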
I can't speak for your endpoint, but you can get around the Bad Request error by applying urlencode() to the query value:
$url = 'https://steamcommunity.com/market/pricehistory/?country=US&currency=1&appid=730&market_hash_name=' . urlencode('SG 553 | Damascus Steel (Factory New)');
file_get_contents($url);
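Instead of encoding pieces of the URL by hand, the query string can be assembled from raw values. This is a small sketch using http_build_query(); the function name buildPriceHistoryUrl is hypothetical, and the parameter values are the ones from the question.

```php
<?php
// Build the price-history URL from raw (unencoded) parameter values.
function buildPriceHistoryUrl($marketHashName) {
    $params = array(
        'country' => 'US',
        'currency' => 1,
        'appid' => 730,
        'market_hash_name' => $marketHashName,
    );
    // Passing '&' explicitly avoids a configured arg_separator.output such as "&amp;".
    // PHP_QUERY_RFC3986 encodes spaces as %20 rather than "+".
    $query = http_build_query($params, '', '&', PHP_QUERY_RFC3986);
    return 'https://steamcommunity.com/market/pricehistory/?' . $query;
}

echo buildPriceHistoryUrl('SG 553 | Damascus Steel (Factory New)'), "\n";
```

The explicit separator matters here because a literal "&amp;" in a request URL is one plausible way to end up with the 400 response described in the question.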

file_get_contents from specific URL

I have an API Key that verifies the request URL
If I do
echo file_get_contents('http://myfilelocation.com/?apikey=1234');
RESULT : this api key is not authorized for this domain
However, if I put the requested URL within an iframe with the same URL:
RESULT : this api key is authorized
Obviously, the Server I'm getting the requested JSON return data is working properly because the iframe is outputting the proper information. However, how can I verify that PHP is making the request from the proper domain and URL settings?
By using file_get_contents I am always getting back that the API key is not authorized. However, I'm running the php script from the authorized domain.
Try this PHP code:
<?php
$options = array(
'http'=>array(
'method'=>"GET",
'header'=>"Host: myfilelocation.com\r\n". // Don't forget to replace this with your domain
"Accept-language: en\r\n" .
"User-Agent: Mozilla/5.0 (iPad; U; CPU OS 3_2 like Mac OS X; en-us) AppleWebKit/531.21.10 (KHTML, like Gecko) Version/4.0.4 Mobile/7B334b Safari/531.21.10\r\n"
)
);
$context = stream_context_create($options);
$file = file_get_contents("http://myfilelocation.com/?apikey=1234", false, $context);
?>
file_get_contents doesn't send any referrer information and the API may need it. This may help you:
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://myfilelocation.com/?apikey=1234');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_REFERER, 'http://authorized-domain.here');
$html = curl_exec($ch);
echo $html;
?>
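The same Referer header can also be sent without curl, through a stream context. This is a minimal sketch; the function name buildRefererContext and the referrer URL are placeholders, and whether the API actually checks the Referer header is an assumption carried over from the answer above.

```php
<?php
// Build a stream context that sends a Referer header with file_get_contents().
// $referer is the page the API is assumed to expect the request to come from.
function buildRefererContext($referer) {
    return stream_context_create(array(
        'http' => array(
            'method' => 'GET',
            'header' => "Referer: " . $referer . "\r\n",
        ),
    ));
}

$context = buildRefererContext('http://authorized-domain.example/');
// The actual request is left commented out so the sketch runs without network access:
// $file = file_get_contents('http://myfilelocation.com/?apikey=1234', false, $context);
```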

cURL weird status codes when checking URL

I'm checking for the presence of an XML sitemap on different URLs. If I supply a URL example.com/sitemap.xml, and it has a 301 to www.example.com/sitemap.xml, I get a 301 obviously. If www.example.com/sitemap.xml doesn't exist, I won't see the 404. So, if I get a 301, I execute another cURL request to see if a 404 is returned for www.example.com/sitemap.xml. But for some reason, I get random 404 and 303 status codes.
private function check_http_status($domain,$file){
$url = $domain . "/" . $file;
$curl = new Curl();
$curl->url = $url;
$curl->nobody = true;
$curl->userAgent = 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.1) Gecko/20060601 Firefox/2.0.0.1 (Ubuntu-edgy)';
$curl->execute();
$retcode = $curl->httpCode();
if ($retcode == 301 || $retcode == 302){
$url = "www." . $domain . "/" . $file;
$curl = new Curl();
$curl->url = $url;
$curl->nobody = true;
$curl->userAgent = 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.1) Gecko/20060601 Firefox/2.0.0.1 (Ubuntu-edgy)';
$curl->execute();
$retcode = $curl->httpCode();
}
return $retcode;
}
Have a look at the list of response codes returned - http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html.
Usually a web browser handles these automatically, but since you are doing things manually with curl, you need to understand what each response means. A 301 or 302 means that you should use the alternative URL supplied in the response to access the resource. This may be as simple as adding www to the request, but it may also be more complex, such as a redirect to a different domain altogether.
A 303 (See Other) means the result of your request is available at a different URL, which should be retrieved with GET; it is typically sent in response to a POST.
Well, when you receive a 301 or 302 you should use the location found in the response, not just assume another location and try that.
As you can see in this example, the response from the server contains the new location of the file. Use that for your next request:
http://en.wikipedia.org/wiki/HTTP_301#Example
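Reading the redirect target from the response, as suggested above, could look like the following sketch. It parses a raw header block such as the one curl returns when CURLOPT_HEADER is enabled; the function name extractLocation and the sample headers are made up for illustration.

```php
<?php
// Pull the Location header out of a raw HTTP response header block.
// Returns null when the response carries no redirect target.
function extractLocation($rawHeaders) {
    // Header names are case-insensitive; "m" lets ^ and $ match at each line.
    if (preg_match('/^Location:\s*(.+?)\s*$/mi', $rawHeaders, $m)) {
        return $m[1];
    }
    return null;
}

$headers = "HTTP/1.1 301 Moved Permanently\r\n"
         . "Location: http://www.example.com/sitemap.xml\r\n"
         . "Content-Length: 0\r\n\r\n";
echo extractLocation($headers), "\n"; // prints http://www.example.com/sitemap.xml
```

A Location value may also be relative, in which case it has to be resolved against the URL that was requested before it is used for the next request.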
"followLocation" works very well. Here is how I implemented it:
$url = "http://www.YOURSITE.com//"; // Assign your URL here.
$ch = curl_init(); // Initialize curl.
curl_setopt($ch, CURLOPT_URL, $url); // Pass the URL as the option/target.
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // 1 returns the response as a string; 0 prints it directly.
curl_setopt($ch, CURLOPT_HEADER, 0); // 0 leaves the response headers out of the output.
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // Follow any Location header the server sends.
$response_data = curl_exec($ch); // Execute curl with the target URL.
$http_header = curl_getinfo($ch); // Gets information about the last transfer, i.e. our URL.
// Report URLs that do not return 200 OK.
if ($http_header['http_code'] != 200) {
echo " <b> PAGE NOT FOUND => </b>"; print $http_header['http_code'];
}
// print $http_header['url']; // The effective URL, i.e. the page to which you were redirected.
print $url; // The original URL you were trying to access.
curl_close($ch); // We are done with curl, so close it.

unable to get the website content by using file_get_content in php

When I try to get the website content from the external URL fanpop.com using file_get_contents in PHP, I get empty data. I used the code below to get the contents:
$add_url= "http://www.fanpop.com/";
$add_domain = file_get_contents($add_url);
echo $add_domain;
but I get an empty result for $add_domain. The same code works for other URLs. I also tried sending the request from a browser instead of from the script, and that does not work either.
Below is the same request, but in CURL:
error_reporting(-1);
ini_set('display_errors','On');
$url="http://www.fanpop.com/";
$ch = curl_init();
$header=array('GET /1575051 HTTP/1.1',
'Host: adfoc.us',
'Accept:text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Accept-Language:en-US,en;q=0.8',
'Cache-Control:max-age=0',
'Connection:keep-alive',
'Host:adfoc.us',
'User-Agent:Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.116 Safari/537.36',
);
curl_setopt($ch,CURLOPT_URL,$url);
curl_setopt($ch,CURLOPT_RETURNTRANSFER,true);
curl_setopt($ch,CURLOPT_CONNECTTIMEOUT,0);
curl_setopt( $ch, CURLOPT_COOKIESESSION, true );
curl_setopt($ch,CURLOPT_COOKIEFILE,'cookies.txt');
curl_setopt($ch,CURLOPT_COOKIEJAR,'cookies.txt');
curl_setopt($ch,CURLOPT_HTTPHEADER,$header);
echo $result=curl_exec($ch);
curl_close($ch);
... but the above is not working either. Can anyone tell me what changes I need to make?
The problem with this particular site is that it only serves compressed contents and throws a 404 error otherwise.
Easy fix:
$ch = curl_init('http://www.fanpop.com');
curl_setopt($ch,CURLOPT_ENCODING , "");
curl_exec($ch);
You can also make this work for file_get_contents() but with a substantial amount of effort, as described in this article.
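One possible shape of the file_get_contents() route is sketched below: request gzip explicitly through a stream context, then decode the body when the response headers say it is compressed. This is not the approach from the linked article, which is not reproduced here; the function names isGzipEncoded and fetchWithGzip are hypothetical.

```php
<?php
// True when the response headers (as found in $http_response_header) declare gzip.
function isGzipEncoded(array $headers) {
    foreach ($headers as $line) {
        if (preg_match('/^Content-Encoding:\s*gzip\s*$/i', $line)) {
            return true;
        }
    }
    return false;
}

// Hypothetical helper: ask for gzip and decode the body when the server used it.
function fetchWithGzip($url) {
    $context = stream_context_create(array(
        'http' => array(
            'method' => 'GET',
            'header' => "Accept-Encoding: gzip\r\n",
        ),
    ));
    $body = file_get_contents($url, false, $context);
    if ($body === false) {
        return false;
    }
    // PHP fills $http_response_header in the local scope after an HTTP request.
    $headers = isset($http_response_header) ? $http_response_header : array();
    return isGzipEncoded($headers) ? gzdecode($body) : $body;
}
```

A call such as fetchWithGzip('http://www.fanpop.com/') would exercise it, assuming the site still behaves as described in this answer.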
