My web server ran for months with the same code, which fetches the content of a URL with this line:
$response = file_get_contents('http://femoso.de:8019/api/2/getVendorLogin?' . http_build_query(array('vendor'=>$vendor,'user'=>$login,'pw'=>$pw),'','&'));
But something must have changed, because all of a sudden it stopped working.
Previously the requested URL looked the way it should:
http://femoso.de:8019/api/2/getVendorLogin?vendor=100&user=test&pw=test
but now I get an error in my nginx log saying that I requested the following URL, which returned a 403:
http://femoso.de:8019/api/2/getVendorLogin?vendor=100&amp;user=test&amp;pw=test
I know that something changed on the target server, but that shouldn't affect me, should it?
I have already spent hours reading and searching through Google and Stack Overflow, but none of the suggested approaches, such as
urlencode() or
htmlspecialchars(),
worked for me.
For your information, the environment is a Zend application behind an nginx server on my end and a PHP web service running on Apache on the other end.
Like I said, it changed without any change on my side!
Thanks
Let's find out the culprit!
1) Is it http_build_query? Try replacing:
'http://femoso.de:8019/api/2/getVendorLogin?' . http_build_query(array('vendor'=>$vendor,'user'=>$login,'pw'=>$pw))
with:
"http://femoso.de:8019/api/2/getVendorLogin?vendor={$vendor}&user={$login}&pw={$pw}"
2) Is some kind of post-processing in place? Try replacing '&' with chr(38).
3) Maybe give cURL a try and play with it a little:
$ch = curl_init();
curl_setopt_array($ch, array(
    CURLOPT_URL => 'http://femoso.de:8019/api/2/getVendorLogin?' . http_build_query(array('vendor'=>$vendor,'user'=>$login,'pw'=>$pw)),
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_HEADER => true, // include response header in result
    //CURLOPT_FOLLOWLOCATION => true, // uncomment to follow redirects
    CURLINFO_HEADER_OUT => true, // track request header, see var_dump below
));
$data = curl_exec($ch);
// read the request info before closing the handle
var_dump($data, curl_getinfo($ch, CURLINFO_HEADER_OUT));
curl_close($ch);
exit;
Sounds like your arg_separator.output is set to "&amp;" in your php.ini. Either comment that line out or change it to just "&".
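To check this theory, a minimal sketch along these lines may help. It prints the configured separator and builds the query with an explicit '&', which takes precedence over arg_separator.output for that one call. The parameter values are hypothetical placeholders.

```php
<?php
// Hypothetical example values; the real ones come from the application.
$params = array('vendor' => 100, 'user' => 'test', 'pw' => 'test');

// What separator is PHP configured to use by default?
// If this prints "&amp;", the php.ini setting is the likely culprit.
var_dump(ini_get('arg_separator.output'));

// An explicit third argument overrides arg_separator.output for this call.
$query = http_build_query($params, '', '&');
$url = 'http://femoso.de:8019/api/2/getVendorLogin?' . $query;

echo $url, "\n";
```

If the printed URL is correct here but still arrives with "&amp;" at the other end, the separator setting is probably not the cause and something later in the chain is rewriting the URL.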
I'm no expert, but that's the way the computer reads the address, since & is a special character; it has something to do with encoding. A simple fix would be to filter it with str_replace(), or something along those lines.
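The str_replace() idea could look roughly like the following sketch. The helper name normalize_query_separators is made up for illustration; it only undoes an "&amp;" separator and leaves the rest of the URL untouched.

```php
<?php
// Hypothetical helper: turn HTML-encoded separators back into plain ampersands.
function normalize_query_separators($url)
{
    return str_replace('&amp;', '&', $url);
}

$broken = 'http://femoso.de:8019/api/2/getVendorLogin?vendor=100&amp;user=test&amp;pw=test';
$fixed = normalize_query_separators($broken);

echo $fixed, "\n";
```

This treats the symptom rather than the cause, so it is best seen as a stopgap until the source of the encoding is found.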
This is all of my code:
<html>
<body>
<form>
Playlist to Scrape: <input type="text" name="url" placeholder="Playlist URL">
<input type="submit">
</form>
<?php
if(isset($_GET['url'])){
$source = file_get_contents($_GET['url']);
$regex = '/<a href="(.*?)" class="gothere pl-button" title="/';
preg_match_all($regex,$source,$output);
echo "<textarea cols=100 rows=50>";
$fullUrl = array();
foreach($output[1] as $url){
array_push($fullUrl,"http://soundcloud.com".$url);
}
$final = implode(";",$fullUrl);
echo $final;
echo "</textarea>";
}else{
echo "borks";
}
?>
</body>
</html>
Yesterday, it worked fine.
What the code should do is:
Take a Soundcloud URL, extract the individual songs, and then print them like song1;song2;song3
Again, this worked fine yesterday, and I haven't changed anything since, I think...
I tried commenting out the other code, keeping just $source = file_get_contents($_GET['url']); and echoing $source, but it came back blank, which makes me think the problem is with file_get_contents.
If you have any idea on why this is happening, I would appreciate hearing it. Thanks!
What might have happened is that a new SSL certificate was installed on the server that file_get_contents is trying to access. In our case, the target server had a new SSL certificate installed on its domain from another vendor and another wild-card domain.
Changing our config a little bit fixed the problem.
$opts = array(
    'http' => array(
        'method' => "GET",
        'header' => "Content-Type: application/json\r\n".
                    "Accept: application/json\r\n",
        'ignore_errors' => true
    ),
    // VVVVV The extra config that fixed it
    'ssl' => array(
        'verify_peer' => false,
        'verify_peer_name' => false,
    )
    // ^^^^^
);
$context = stream_context_create($opts);
$result = file_get_contents(THE_URL_WITH_A_CHANGED_CERTIFICATE, false, $context);
I found this solution thanks to this answer, which had even been downvoted.
It certainly explains why file_get_contents suddenly stopped working.
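Disabling verification is not the only option. As a hedged alternative, the sketch below keeps peer verification on and points PHP at a CA bundle instead. The bundle path and the helper name make_verified_context are assumptions; use whichever bundle contains the issuer of the new certificate.

```php
<?php
// Hypothetical helper: build a context that still verifies the certificate.
function make_verified_context($caBundlePath)
{
    return stream_context_create(array(
        'http' => array(
            'method' => 'GET',
            'header' => "Accept: application/json\r\n",
            'ignore_errors' => true,
        ),
        'ssl' => array(
            'verify_peer' => true,
            'verify_peer_name' => true,
            'cafile' => $caBundlePath, // assumed location of a CA bundle
        ),
    ));
}

// Assumed path; it differs between systems.
$context = make_verified_context('/etc/ssl/certs/ca-certificates.crt');
$options = stream_context_get_options($context);
var_dump($options['ssl']['verify_peer']);

// Usage would then be: file_get_contents($url, false, $context);
```

Whether this works depends on the bundle actually covering the new certificate chain, which is not something the sketch can check.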
Your question doesn't have enough information for someone to help you.
To start with, though, I would:
Check that the script is receiving the URL GET parameter correctly (var_dump($_GET['url']);)
Check what PHP fetches from the URL (var_dump(file_get_contents($_GET['url']));)
My guess is that either your server admin turned off the fopen URL wrappers (allow_url_fopen), or the owner of the site you're scraping decided they didn't want you scraping their site and is blocking requests from your PHP scripts.
It also helps to turn error reporting all the way up and set display_errors to 1:
error_reporting(E_ALL);
ini_set('display_errors', 1);
Although if you've been developing without this, chances are there's lots of working-but-warning-worthy code in your application.
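The checks above can be rolled into one small helper. This is only a sketch: the function name fetch_with_diagnostics and the shape of the returned array are made up here, and the path in the example is deliberately one that should not exist.

```php
<?php
// Hypothetical helper: fetch a URL and report why it failed, if it did.
function fetch_with_diagnostics($url)
{
    $report = array(
        'ok' => false,
        'body' => null,
        'error' => null,
        // true when the fopen URL wrappers are enabled
        'allow_url_fopen' => (bool) ini_get('allow_url_fopen'),
    );

    // @ hides the warning from the page; error_get_last() still records it.
    $body = @file_get_contents($url);
    if ($body === false) {
        $last = error_get_last();
        $report['error'] = $last !== null ? $last['message'] : 'unknown error';
        return $report;
    }

    $report['ok'] = true;
    $report['body'] = $body;
    return $report;
}

// A path that should not exist, used only to trigger the failure branch.
$report = fetch_with_diagnostics('/no/such/dir/no-such-file.txt');
var_dump($report['ok'], $report['error']);
```

With a real URL in place of the dummy path, the 'error' entry should show the same message PHP would otherwise print as a warning.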
Good luck.
In my case (I was also frequently downloading a page, though not from SoundCloud) the cause was F5 "bobcmn" JavaScript detection on the server.
When I added something like var_dump($source); to my PHP script to see what the server sent, I saw that the response started with this code: window["bobcmn"] = ...
More here:
https://blog.dotnetframework.org/2017/10/10/understanding-f5-bobcmn-javascript-detection/
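A script can check for this before trying to parse the page. The sketch below assumes the marker appears with straight double quotes in the actual response; the helper name and both sample strings are made up for illustration.

```php
<?php
// Hypothetical helper: does the fetched body look like the F5 script
// challenge rather than the real page?
function looks_like_bobcmn_challenge($source)
{
    return strpos($source, 'window["bobcmn"]') !== false;
}

// Made-up sample bodies, not real server output.
$challenge = '<script>window["bobcmn"] = "placeholder";</script>';
$realPage = '<html><body>Playlist</body></html>';

var_dump(looks_like_bobcmn_challenge($challenge));
var_dump(looks_like_bobcmn_challenge($realPage));
```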
We've gotten permission to periodically copy a webcam image from another site. We use cURL functions elsewhere in our code, but when trying to access this image, we are unable to.
I'm not sure what is going on. The code we use for many other cURL functions is like so:
$image = 'http://island-alpaca.selfip.com:10202/SnapShotJPEG?Resolution=640x480&Quality=Standard'
$options = array(
CURLOPT_URL => $image,
CURLOPT_RETURNTRANSFER => true,
CURLOPT_FOLLOWLOCATION => true,
CURLOPT_CONNECTTIMEOUT => 120,
CURLOPT_TIMEOUT => 120,
CURLOPT_MAXREDIRS => 10
);
$ch = curl_init();
curl_setopt_array($ch, $options);
$cURL_source = curl_exec($ch);
curl_close($ch);
This code doesn't work for the following URL (webcam image), which is accessible in a browser from our location: http://island-alpaca.selfip.com:10202/SnapShotJPEG?Resolution=640x480&Quality=Standard
When I run a test cURL, it just seems to hang for the length of the timeout. $cURL_source never has any data.
I've tried some other cURL examples online, but to no avail. I'm assuming there's a way to build the cURL request to get this to work, but nothing I've tried seems to get me anywhere.
Any help would be greatly appreciated.
Thanks
I don't see any problems with your code. Requests can sometimes fail because of transient network problems. You can retry in a loop until you get a good response, to increase the chances of success.
Something like:
$image = 'http://island-alpaca.selfip.com:10202/SnapShotJPEG?Resolution=640x480&Quality=Standard';
$tries = 3; // max tries to get good response
$retry_after = 5; // seconds to wait before new try
while ($tries > 0) {
    $options = array(
        CURLOPT_URL => $image,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_CONNECTTIMEOUT => 10,
        CURLOPT_TIMEOUT => 10,
        CURLOPT_MAXREDIRS => 10
    );
    $ch = curl_init();
    curl_setopt_array($ch, $options);
    $cURL_source = curl_exec($ch);
    curl_close($ch);
    if ($cURL_source !== false) {
        break;
    } else {
        $tries--;
        sleep($retry_after);
    }
}
Can you fetch the URL from the server where this code is running? Perhaps it has firewall rules in place? You are fetching from a non-standard port: 10202. It must be allowed by your firewall.
I, like the others, found it easy to fetch the image with curl/php.
As was said before, I can't see any problem with the code either. However, you may want to allow a longer timeout for cURL, to make sure that this slow-loading picture finally gets loaded. So, as one possibility, try increasing CURLOPT_TIMEOUT to a very large value, along with the corresponding time limit for PHP script execution. It may help.
Perhaps the best approach is to combine the previous answer's retry loop with this one.
I tried wget on the image URL and it downloads the image and then seems to hang - perhaps the server isn't correctly closing the connection.
However I got file_get_contents to work rather than curl, if that helps:
<?php
$image = 'http://island-alpaca.selfip.com:10202/SnapShotJPEG?Resolution=640x480&Quality=Standard';
$imageData = base64_encode(file_get_contents($image));
$src = 'data: '.mime_content_type($image).';base64,'.$imageData;
echo '<img src="',$src,'">';
Are you sure it's not working? Your code is working fine for me (after adding the missing semicolon after $image = ...).
The reason it might be giving you trouble is because it's not actually an image, it's an MJPEG. It uses an HTTP session that's kept open and with a multipart content (similar to what you see in MIME email), and the server pushes a new JPEG frame to replace the last one on an interval. CURL seems to be happy just giving you the first frame though.
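If the response really is a multipart stream of JPEG frames, one frame can be cut out of whatever has been received so far. The sketch below is an assumption-laden illustration: the helper name is made up, the sample data is synthetic, and it relies only on the fact that JPEG data starts with the bytes FF D8 and ends with FF D9. A frame with an embedded thumbnail could contain an earlier FF D9, so treat this as a starting point.

```php
<?php
// Hypothetical helper: return the first complete JPEG found in a buffer,
// or null if there is none.
function extract_first_jpeg($buffer)
{
    $start = strpos($buffer, "\xFF\xD8"); // start-of-image marker
    if ($start === false) {
        return null;
    }
    $end = strpos($buffer, "\xFF\xD9", $start + 2); // end-of-image marker
    if ($end === false) {
        return null;
    }
    return substr($buffer, $start, $end - $start + 2);
}

// Synthetic multipart-like data standing in for a real camera stream.
$stream = "--boundary\r\nContent-Type: image/jpeg\r\n\r\n"
    . "\xFF\xD8frame-one\xFF\xD9\r\n--boundary\r\n"
    . "Content-Type: image/jpeg\r\n\r\n\xFF\xD8frame-two\xFF\xD9";

$frame = extract_first_jpeg($stream);
var_dump(strlen($frame));
```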
I am developing an app for SoundCloud, written in PHP and I use the php-soundcloud library.
After successfully having instantiated a Services_Soundcloud instance, I can do calls like the following:
echo $soundcloud->get('me');
echo $soundcloud->get('users/12345678');
However, the following call is not working:
echo $soundcloud->get('resolve', array('url' => 'https://soundcloud.com/webfordreams'));
The error I get is:
Services_Soundcloud_Invalid_Http_Response_Code_Exception: The requested URL responded with HTTP code 302. in Services_Soundcloud->_request() (line 933 of ../php-soundcloud/Services/Soundcloud.php).
After several hours of debugging I decided to ask for help, as I really don't understand what I am doing wrong.
Can anybody help me and tell me how to get the proper response?
Leading on from what Francis.G was saying, the syntax you are looking for is:
$soundcloud->get('resolve', array('url' => 'https://soundcloud.com/webfordreams'), array(CURLOPT_FOLLOWLOCATION => true));
The only change is in the definition of the curl-ops array passed into the get() function as the third parameter.
The "HTTP/1.1 302" response that you're getting is the appropriate response according to the api.
If you inspect the full php exception you'll get this in the response body:
[httpBody:protected] => {"status":"302 -Found","location":"https://api.soundcloud.com/users/25071103"}
In order to allow CURL to follow the redirect you need to pass the CURLOPT_FOLLOWLOCATION option while calling soundcloud's get() function.
$result = $scloud->get('resolve', array('url' => 'https://soundcloud.com/webfordreams'), array(CURLOPT_FOLLOWLOCATION => TRUE ));
I'm not too sure about the syntax though but what you eventually wanna end up with is something like this in Soundcloud's _request() function:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
Hopefully someone with more experience on this will come along and give you the proper syntax instead of you having to tinker around with the SoundCloud SDK itself.
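If following the redirect inside the library is not an option, the target can also be read from the response body quoted above and requested separately. This is a minimal sketch that works on the raw JSON string only; the helper name is made up, and how the body is obtained from the exception is left out because it depends on the library.

```php
<?php
// Hypothetical helper: pull the redirect target out of the JSON body
// that accompanies the 302 response. Returns null if it is not there.
function extract_redirect_location($httpBody)
{
    $decoded = json_decode($httpBody, true);
    if (!is_array($decoded) || !isset($decoded['location'])) {
        return null;
    }
    return $decoded['location'];
}

// Body copied from the exception dump quoted in the answer.
$body = '{"status":"302 -Found","location":"https://api.soundcloud.com/users/25071103"}';
$location = extract_redirect_location($body);

echo $location, "\n";
```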
I'm trying to use file_get_contents() to get the response from a server and this error was encountered. Could someone tell me what is the reason and how to fix it? The portion of the code is:
$api = "http://smpp5.routesms.com:8080/bulksms/sendsms?username=$username&password=$password&source=$source&destination=$destin&dlr=$dlr&type=$type&message=$message";
$resp = file_get_contents($api);
The server responded correctly while I pasted the url in the browser.
I learned that this is caused by the server rejecting the client's HTTP version, but I have no idea why that is happening in my case.
Any help is much appreciated. Thanks in advance
I found the problem, and it was a simple coding error -- missing url encoding.
The reason I didn't notice it at first was that the code had been fine before I did some editing; while editing I left out the urlencode() call before calling the server, which left a space in the URL.
This does seem to be the reason this error occurs for most people. So if you encounter it, use urlencode() on every variable used as a URL parameter whose value may contain whitespace. In the case from my question, the fixed code looks like this:
$api = "http://smpp5.routesms.com:8080/bulksms/sendsms?username=$username&password=$password&source=$source&destination=$destin&dlr=$dlr&type=$type&message=" . urlencode($message);
$resp = file_get_contents($api);
Also, thanks for all of your time and responses; they were informative.
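An alternative to calling urlencode() on individual values is to let http_build_query() encode all of them. The following is a sketch under that assumption; the helper name and every example value are made up.

```php
<?php
// Hypothetical helper: build the request URL with every value encoded.
function build_sms_url(array $params)
{
    $base = 'http://smpp5.routesms.com:8080/bulksms/sendsms';
    // http_build_query() URL-encodes each value; '&' is passed explicitly
    // so the result does not depend on arg_separator.output.
    return $base . '?' . http_build_query($params, '', '&');
}

// Made-up example values.
$url = build_sms_url(array(
    'username' => 'demo',
    'password' => 'secret',
    'source' => 'INFO',
    'destination' => '15550100',
    'dlr' => 1,
    'type' => 0,
    'message' => 'Hello world',
));

echo $url, "\n";
```

With the default encoding, the space in the message becomes a plus sign, so no raw whitespace reaches the request line.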
You could create a stream context with the HTTP version set to 1.0 and use that context with file_get_contents:
$options = array(
    'http' => array(
        'protocol_version' => '1.0',
        'method' => 'GET'
    )
);
$context = stream_context_create($options);
$api = "http://smpp5.routesms.com:8080/bulksms/sendsms?username=$username&password=$password&source=$source&destination=$destin&dlr=$dlr&type=$type&message=$message";
$resp = file_get_contents($api, false, $context);
By the way: Don’t forget to escape your URI argument values properly with urlencode.
I ran into the same issue and in my case the culprit was an errant newline/CRLF character at the end of the request URL, which does not get caught by urlencode() (or maybe it does encode it but it still causes the server to produce the error). Once I found the problem the requests began to work again, even without the stream context options.
Hopefully this will help others.
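A quick guard against this kind of stray character might look like the sketch below. The helper name and the returned array shape are made up; it trims the URL and then reports whether any whitespace or control character is still left inside it.

```php
<?php
// Hypothetical helper: strip surrounding whitespace and flag anything
// suspicious that remains inside the URL.
function clean_request_url($url)
{
    $trimmed = trim($url);
    // Space, tab, CR, LF and other control characters never belong in a URL.
    $hasBadChars = preg_match('/[\x00-\x20\x7F]/', $trimmed) === 1;
    return array('url' => $trimmed, 'has_bad_chars' => $hasBadChars);
}

$result = clean_request_url("http://www.example.com/api?x=1\r\n");
var_dump($result);
```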
Can you sniff what's happening on the wire? Seeing the format of the HTTP request as it goes out on the wire would help a lot.
Without seeing that, my best guess would be that the server isn't well-implemented, and is rejecting a HTTP/1.1 request. Try setting --http1.0 on Curl and seeing what happens...
Sometimes we still get an error with
file_get_contents($api);
in that case, try this:
fopen($api,"r");
I was also facing this same issue.
Later I found that, while retrieving the results from MySQL with LIMIT $count,
$count was negative. After fixing that, the URL worked fine.
So the problem was in the URL only; it was not a file_get_contents or HTTP version issue.
I'm trying to get the contents from another file with file_get_contents (don't ask why).
I have two files: test1.php and test2.php. test1.php returns a string, based on the user that is logged in.
test2.php tries to get the contents of test1.php and is being executed by the browser, thus getting the cookies.
To send the cookies with file_get_contents, I create a streaming context:
$opts = array('http' => array('header'=> 'Cookie: ' . $_SERVER['HTTP_COOKIE']."\r\n"));
I'm retrieving the contents with:
$contents = file_get_contents("http://www.example.com/test1.php", false, $opts);
But now I get the error:
Warning: file_get_contents(http://www.example.com/test1.php) [function.file-get-contents]: failed to open stream: HTTP request failed! HTTP/1.1 404 Not Found
Does anybody know what I'm doing wrong here?
edit:
forgot to mention: Without the streaming_context, the page just loads. But without the cookies I don't get the info I need.
First, this is probably just a typo in your question, but the third argument to file_get_contents() needs to be your stream context, NOT the array of options. I ran a quick test with something like this, and everything worked as expected:
$opts = array('http' => array('header'=> 'Cookie: ' . $_SERVER['HTTP_COOKIE']."\r\n"));
$context = stream_context_create($opts);
$contents = file_get_contents('http://example.com/test1.txt', false, $context);
echo $contents;
The error indicates the server is returning a 404. Try fetching the URL from the machine PHP is running on and not from your workstation/desktop/laptop. It may be that your web server is having trouble reaching the site, your local machine has a cached copy, or some other network screwiness.
Be sure you repeat your exact request when running this test, including the cookie you're sending (command line curl is good for this). It's entirely possible that the page in question may load fine in a browser without the cookie, but when you send the cookie the site actually is returning a 404.
Make sure that $_SERVER['HTTP_COOKIE'] has the raw cookie you think it does.
If you're screen scraping, download Firefox and a copy of the LiveHTTPHeaders extension. Perform all the necessary steps to reach whatever page it is you want in Firefox. Then, using the output from LiveHTTPHeaders, recreate the exact same request sequence. Include every header, not just the cookies.
Finally, PHP Curl exists for a reason. If at all possible, (I'm not asking!) use it instead. :)
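To make sure the cookie header contains what you expect, it can also be rebuilt from an array instead of forwarding the raw header. This is a sketch only: the helper name and the sample cookies are made up, and it assumes flat name/value pairs as found in a simple $_COOKIE array.

```php
<?php
// Hypothetical helper: rebuild a Cookie header from an array of cookies.
function build_cookie_header(array $cookies)
{
    $pairs = array();
    foreach ($cookies as $name => $value) {
        // PHP decodes cookie values in $_COOKIE, so encode them again.
        $pairs[] = $name . '=' . rawurlencode($value);
    }
    return 'Cookie: ' . implode('; ', $pairs) . "\r\n";
}

// Made-up cookies; in the real script this would be $_COOKIE.
$header = build_cookie_header(array('PHPSESSID' => 'abc123', 'lang' => 'en US'));

$context = stream_context_create(array('http' => array('header' => $header)));
$options = stream_context_get_options($context);
var_dump($options['http']['header']);
```

Dumping the context options like this is a cheap way to confirm which header will actually be sent before involving the network at all.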
Just to share this information.
When using session_start(), the session file is locked by PHP, so the current script is the only one that can access it. If you try to access it via fsockopen() or file_get_contents(), you can wait a long time, since you are trying to open a file that has been locked.
One way to solve this problem is to call session_write_close() to unlock the file and then lock it again afterwards with session_start().
Example:
<?php
$opts = array('http' => array('header'=> 'Cookie: ' . $_SERVER['HTTP_COOKIE']."\r\n"));
$context = stream_context_create($opts);
session_write_close(); // unlock the file
$contents = file_get_contents('http://120.0.0.1/controler.php?c=test_session', false, $context);
session_start(); // Lock the file
echo $contents;
?>
Since file_get_contents() is a blocking function, the two scripts won't compete for the session file at the same time.
But I'm sure this is not the best way to handle a session across an extra connection.
Btw: it's faster than cURL and fsockopen()
Let me know if you find something better.
Just out of curiosity, are you attempting file_get_contents on a page that has a space in it? I remember trying to use fgc on a URL that had a space in the name and while my web browser parsed it just fine, fgc didn't. I ended up having to use a str_replace to replace ' ' with '%20'.
I would think that this should have been relatively easy to spot, though, as the error would report only half of the filename. Also, I noticed that in one of these posts someone used \r\n while defining the headers. Keep in mind that PHP doesn't interpret these escape sequences in single quotes, but they work fine in double quotes.
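The difference between the two quoting styles is easy to see by comparing string lengths. The header text in this small sketch is just an example.

```php
<?php
// Single quotes keep the backslashes literally: the four characters \ r \ n.
$single = 'Accept: text/html\r\n';
// Double quotes turn \r\n into a real carriage return and line feed.
$double = "Accept: text/html\r\n";

var_dump(strlen($single));
var_dump(strlen($double));
var_dump(substr($double, -2) === chr(13) . chr(10));
```

The single-quoted version comes out two bytes longer and does not end in a real line break, which is why headers built that way are malformed.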
Make sure that file1.php exists on the server. Try opening it in your own browser to make sure!