Background
I am trying to include an RSS feed into an HTML document using PHP.
Code
<?php
include ("feed url");
?>
I have used SSI (a server-side include) to add the include directive to the HTML file like this
<!--#include virtual="rssfeed.php" -->
which works fine after editing the .htaccess file. The problem now is that because my PHP uses include ("feed url"), I am getting this error:
Warning: include() [function.include]: URL file-access is disabled in
the server configuration in path/rssfeed.php on line 2
Warning: include(feed url) [function.include]: failed to open stream:
no suitable wrapper could be found in path/rssfeed.php on line 2
Things to note: I have tried setting php_value allow_url_fopen 1, but with no luck; the files are held on a third-party hosting server, so I do not have a lot of access, and they have blocked me from turning allow_url_fopen on for obvious reasons. So my question is: how do I approach this problem? Any directions will be greatly appreciated.
Thanks everyone for reading.
Your server is configured in such a way that you cannot include from a remote location. This is common in shared hosting environments to help reduce server load and reduce the possibility of malicious code being accidentally executed.
However, if I understand you right, you could not just include the RSS feed using the include() construct anyway, because it is not valid PHP code - include() expects the path to be a valid PHP source code file. What you are doing, if your server allowed you to do it, would result in either useless output or a parse error.
You need to connect to the RSS feed (e.g. using cURL or fsockopen() depending on the level of control you want over the request to the remote site) and parse the feed data so you can output in a sensible format.
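A rough sketch of that approach (the feed URL is a placeholder, a real implementation would want more error handling, and it assumes an RSS 2.0 feed with the usual <channel><item> structure):
<?php
// Fetch the feed over HTTP with cURL instead of include()
$ch = curl_init('http://example.com/rss'); // placeholder feed URL
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body as a string
curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // don't hang forever
$xml = curl_exec($ch);
curl_close($ch);

if ($xml !== false) {
    // Parse the feed and print each item's title as a link
    $feed = simplexml_load_string($xml);
    if ($feed !== false) {
        foreach ($feed->channel->item as $item) {
            echo '<a href="' . htmlspecialchars($item->link) . '">'
               . htmlspecialchars($item->title) . '</a><br />';
        }
    }
}
?>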
include "http://..." is a bad idea because the contents of http://... are evaluated as PHP code which opens your site open to attacks if someone can inject PHP code in the response of that RSS feed.
Use cURL if you want to display data from another site. Here is the example from the PHP manual:
<?php
// create a new cURL resource
$ch = curl_init();
// set URL and other appropriate options
curl_setopt($ch, CURLOPT_URL, "http://www.example.com/");
curl_setopt($ch, CURLOPT_HEADER, 0);
// grab URL and pass it to the browser
curl_exec($ch);
// close cURL resource, and free up system resources
curl_close($ch);
?>
Related
I am trying to test a scraping setup targeting sites of my own using servers of my own, but when I do I get an error from WPRobot saying "Error: Inserting post failed. Cannot create attachment file in "/home/myapp/public_html/wp-content/uploads/2021/03/162153879_434009211024473_384527521975744737_n.jpg" Please set correct permissions." I typically get this notice when the source does not allow other sites to copy its content using cURL requests. To get around this I set up a Puppeteer script to act as a proxy by taking screenshots of image files on the source website (in this case Instagram), but now I get the same error when trying to download the screenshots from my own IIS server.
I have added nothing to my server that is intended to prevent anyone from being able to curl image files from it. In fact, there are several sites that were hosted on the same server recently. WPRobot never had a problem downloading any images from those. I have changed nothing other than uploading a Node.js application to the default website directory after deleting the old websites for the purpose of using this server to test the app before my account with the host is suspended in a few days.
What would cause images on an IIS 10 server to not be accessible via cURL on the default website? Here is an example image for which WPRobot claims correct permissions are not set: http://85.17.219.113/images/2021/3/505648a2-3404-c3e6-618f0fa50fd3.jpg. Could this be due to using an IP address instead of a domain name? Could the length of the file name be an issue?
Also: the image is only 92kb in size. I mention this because I have had issues with WPRobot displaying the same error when trying to download large files, but this file is nowhere near large enough to trigger that.
UPDATE: This appears to be a problem with file_get_contents and/or file_put_contents in PHP and NOT curl. For instance, the following results in an error:
$page = file_get_contents('http://85.17.219.113/images/2021/3/505648a2-3404-c3e6-618f0fa50fd3.jpg');
try {
    file_put_contents('test.jpg', file_get_contents($page));
    echo 'file downloaded';
} catch (Exception $e) {
    echo 'Message: ' . $e->getMessage();
}
That code does not create a file and does not echo an error message either, but when I turned on error reporting I got this:
Fatal error: Uncaught ValueError: file_get_contents(): Argument #1 ($filename) must not contain any null bytes in C:\xampp\apps\wordpress\htdocs\sources\test.php:16 Stack trace: #0 C:\xampp\apps\wordpress\htdocs\sources\test.php(16): file_get_contents('\xFF\xD8\xFF\xE0\x00\x10JFIF\x00\x01\x01\x00\x00...') #1 {main} thrown in C:\xampp\apps\wordpress\htdocs\sources\test.php on line 16
I cannot explain this error. How can that URL translate to a filename with null bytes?
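(A likely explanation, for what it's worth: $page already holds the raw JPEG bytes returned by the first file_get_contents() call, so the nested file_get_contents($page) is being asked to treat the image data, which contains null bytes, as a filename. A minimal corrected sketch using the same URL would fetch it only once:)
$url = 'http://85.17.219.113/images/2021/3/505648a2-3404-c3e6-618f0fa50fd3.jpg';
// Fetch the image once and write the bytes straight to disk
$data = file_get_contents($url);
if ($data !== false) {
    file_put_contents('test.jpg', $data);
    echo 'file downloaded';
}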
This downloads the image just fine:
$url = 'http://85.17.219.113/images/2021/3/505648a2-3404-c3e6-618f0fa50fd3.jpg';
// Image path
$img = 'test.jpg';
// Save image
$ch = curl_init($url);
$fp = fopen($img, 'wb');
curl_setopt($ch, CURLOPT_FILE, $fp);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_exec($ch);
curl_close($ch);
fclose($fp);
Now I need to find the function in WPRobot that is used to cache images and save them to the file system. Does anyone know where that function is in WPRobot? It would be nice to know exactly how that plugin saves images.
The problem turned out to be WPRobot itself not saving the new feed URL, so it was still trying to use the old one that contained the original Instagram image file URLs. Usually this bug does not take so long to detect: when I load the campaign options, it shows the old URL if the update did not work, and all you have to do is keep pasting the new URL and hitting the update button until it finally sticks. I had done this several times and reloading the page showed the new URL in the text area, but when I loaded the page an hour later the old one showed up again.
This really is a horrible bug in WPRobot. When you have to change the URLs in 10-20 campaigns it can take an hour to do it because you keep having to do it over and over again until it finally works.
I've been using simplexml_load_file to fetch RSS feeds from several websites for a while.
Sometimes I get errors from some of these websites, and for about five days I've been getting errors from two specific websites.
Here are the errors from simplexml_load_file:
PHP Warning: simplexml_load_file(http://example.com/feed): failed to open stream: Connection timed out
PHP Warning: simplexml_load_file(): I/O warning : failed to load external entity "http://example.com/feed"
Here are the errors from file_get_contents:
PHP Warning: file_get_contents(http://example.com/page): failed to open stream: HTTP request failed! HTTP/1.1 403 Forbidden
That's how I'm using simplexml_load_file:
simplexml_load_file( $url );
That's how I'm using file_get_contents:
file_get_contents( $url );
Is this because I'm not using a proxy, or because I'm passing invalid arguments?
UPDATE:
The 2 websites are using something like a firewall or a service to check for robots:
Accessing http://example.com/feed securely…
This is an automatic process. Your browser will redirect to your requested content in 5 seconds.
You're relying on an assumption that http://example.com/feed is always going to exist and always return exactly the content you're looking for. As you've discovered, this is a bad assumption.
You're attempting to access the network with your file_get_contents() and simplexml_load_file() calls and finding out that sometimes those calls fail. You must always plan for these calls to fail. It doesn't matter if some websites openly allow this kind of behavior or if you have a very reliable web host. There are circumstances out of your control, such as an Internet backbone outage, that will eventually cause your application to get back a bad response. In your situation, the third party has blocked you. This is one of the failures that happen with network requests.
The first takeaway is that you must handle the failure better. You cannot do this with file_get_contents(), because file_get_contents() was designed to get the contents of files. In my opinion, the PHP implementers who allowed it to make network calls made a very serious mistake in giving it this functionality. I'd recommend using cURL:
function doRequest($url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    $output = curl_exec($ch);
    $httpcode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    // Only treat a 2xx response with a body as success
    if ($output !== false && $httpcode >= 200 && $httpcode < 300) {
        return $output;
    } else {
        throw new Exception('Sorry, an error occurred');
    }
}
Using this, you will be able to handle the errors (and they will happen) better for your own users.
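For example, a sketch of how your feed-loading code could then catch the failure instead of emitting warnings (using the doRequest() function above):
try {
    $body = doRequest($url);
    $feed = simplexml_load_string($body);
    // ... work with $feed ...
} catch (Exception $e) {
    // Log it and show the user something sensible instead of a PHP warning
    error_log('Feed fetch failed: ' . $e->getMessage());
}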
Your second problem is that this specific host is giving you a 403 error. This is probably intentional on their end. I would assume that this is them telling you that they don't want you using their website like this. However, you will need to engage them specifically and ask what you can do. They might ask you to use a real API, they might just ignore you entirely, they might even tell you to pound sand - but there isn't anything that we can do to advise here. This is strictly a problem (or feature) with their software and you must contact them directly for advice.
You could potentially use multiple IP addresses to connect to websites and rotate IPs each time one gets blocked. But doing so would be considered a malicious attack on their service.
I am having trouble getting either a cURL request or file_get_contents() for a https page that returns JSON. Please note that I have looked at multiple StackOverflow topics with similar problems, but none seem to address my specific issue.
BACKGROUND: I am using a MAC running OS X 10.10.3 (Yosemite). I just installed XAMPP 5.6.8 so that I can run a local web server in order to play around with php, etc.
GOAL: I would like to simply display the contents of a JSON object returned by a https GET request. For context, the page is https://api.discogs.com/releases/249504.
ACTIONS: I have read in other posts that, to retrieve HTTPS pages, I must make the following changes to the out-of-the-box php.ini file, which I have done:
- uncomment/turn on allow_url_fopen=On
- uncomment/turn on allow_url_include=on
- uncomment extension=php_openssl.dll
RESULTS:
Using file_get_contents()...
Code:
<?php
$url = "https://api.discogs.com/releases/249504";
$text = file_get_contents($url);
var_dump($text);
?>
Result:
Warning: file_get_contents(): SSL operation failed with code 1. OpenSSL Error messages: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed in /Applications/XAMPP/xamppfiles/htdocs/phpWorkspace/firstPhp/firstPhp.php on line 5
Warning: file_get_contents(): Failed to enable crypto in /Applications/XAMPP/xamppfiles/htdocs/phpWorkspace/firstPhp/firstPhp.php on line 5
Warning: file_get_contents(https://api.discogs.com/releases/249504): failed to open stream: operation failed in /Applications/XAMPP/xamppfiles/htdocs/phpWorkspace/firstPhp/firstPhp.php on line 5
bool(false)
Using cURL...
Code:
<?php
$url = "https://api.discogs.com/releases/249504";
// Initiate curl
$ch = curl_init();
// Will return the response, if false it print the response
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// Set the url
curl_setopt($ch, CURLOPT_URL,$url);
// Execute
$result=curl_exec($ch);
// Closing
curl_close($ch);
var_dump($result);
?>
Result:
bool(false)
Any recommendations would be extremely helpful. I've played around with different cURL calls based on other SO posts; each has either returned NULL, a certificate/authentication error, or false like above.
UPDATE!
I added CURLOPT_SSL_VERIFYPEER to my cURL code and set it to off in order to confirm that I could at least communicate with the server - which helped me identify that I also needed to provide a user agent. However, when I removed SSL_VERIFYPEER, I still had certificate issues.
As a last-ditch effort I uninstalled XAMPP and turned on the PHP/Apache built into Mac OS. Once configured, both the cURL and file_get_contents() calls worked out of the box with the user agent, using the same code as above [frustrating!].
My gut tells me that I had a bad configuration in XAMPP related to my certificates. Interestingly though I didn't see much online as this being an issue with other XAMPP users.
Try adding this
curl_setopt($ch,CURLOPT_SSL_VERIFYPEER,false);
The certificate is not being successfully verified. This may not correct it depending on how severe the issue is.
You should not rely on this as a fix; however, it should tell you whether your system can even communicate with theirs. Reference the post Frankzers posted.
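If the underlying problem is that cURL/OpenSSL on the XAMPP install cannot find a CA bundle, a better long-term fix than disabling verification is to point cURL at one explicitly. A sketch, assuming you have downloaded the Mozilla CA bundle (cacert.pem) somewhere locally - the path and user-agent string below are only examples (the poster found that Discogs also requires a user agent):
$ch = curl_init('https://api.discogs.com/releases/249504');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// Keep peer verification ON and tell cURL where the CA bundle lives
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, true);
curl_setopt($ch, CURLOPT_CAINFO, '/Applications/XAMPP/etc/cacert.pem'); // example path
curl_setopt($ch, CURLOPT_USERAGENT, 'MyTestApp/1.0');                   // example user agent
$result = curl_exec($ch);
if ($result === false) {
    echo 'cURL error: ' . curl_error($ch);
}
curl_close($ch);
var_dump($result);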
I am trying to include a file from a url, however I get the following error thrown up.
Warning: include(): http:// wrapper is disabled in the server configuration by allow_url_include=0 in /Applications/MAMP/htdocs/diningtime/testsite/salims/index1.php on line 58
Warning: include(http://localhost/diningtime/templates/100/index.php): failed to open stream: no suitable wrapper could be found in /Applications/MAMP/htdocs/diningtime/testsite/salims/index1.php on line 58
Warning: include(): Failed opening 'http://localhost/diningtime/templates/100/index.php' for inclusion (include_path='.:/Applications/MAMP/bin/php/php5.4.4/lib/php') in /Applications/MAMP/htdocs/diningtime/testsite/salims/index1.php on line 58
Just wondering if there is any way around this?
My PHP include code is
<?php include "$location/index.php"; ?>
I appreciate your help on this.
You're using a full URL as your include path, which tells PHP to attempt an HTTP request to fetch that file. This is NOT how you do this. Unless that ...100/index.php outputs PHP code, you are going to get some HTML or whatever as the include result, NOT the PHP code in the file. Remember - you're fetching via URL, which means it's an HTTP request, which means the webserver will EXECUTE that script and deliver its output, not simply serve up its source code.
There's no way for the webserver to tell that the HTTP request for that script is an include call from another PHP script on the same server. It could just as easily be a request for that script from some hacker hiding in Russia wanting to steal your source code. Do you want your source code visible to the world like this?
For local files, you should never use a full-blown URL. It's hideously inefficient, and it will not do what you want. Why not simply use
include('/path/to/templates/100/index.php');
instead, which will be a local file-only request, with no HTTP stage included?
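If you do not want to hard-code an absolute filesystem path, something like the following (a sketch based on the directory names in the error messages above) keeps it a plain filesystem include:
<?php include $_SERVER['DOCUMENT_ROOT'] . '/diningtime/templates/100/index.php'; ?>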
I have had a similar issue.
Considering 'Marc B's' post, it is clear that using absolute URLs is not a great idea, even if it is possible by editing the php.ini as 'ficuscr' states. I'm not sure this workaround will work for your specific situation, as it still requires adjustments to each page depending on where it is in your folder structure, but it does make things simpler if, like me, you have a lot of includes in your website.
<?php $root="../";?>
<?php include($root . 'folder-A/file_to_be_included_1.php');?>
<?php include($root . 'folder-A/file_to_be_included_2.php');?>
<?php include($root . 'folder-B/file_to_be_included_3.php');?>
<?php include($root . 'folder-B/file_to_be_included_4.php');?>
<?php include($root . 'folder-B/sub-folder/file_to_be_included_5.php');?>
For me, this means that if I move a page to another location in the folder structure (for example, into a sub-folder of its current location), all I need to do is amend <?php $root="../";?> to become <?php $root="../../";?>.
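A related variant (not from the answer above, just a common alternative) is to prefix includes with __DIR__, which resolves the path relative to the file containing the include rather than the current working directory:
<?php include __DIR__ . '/../folder-A/file_to_be_included_1.php'; ?>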
This problem seems to have been discussed in the past everywhere on google and here, but I have yet to find a solution.
A very simple fopen gives me:
PHP Warning: fopen(http://www.google.ca): failed to open stream: HTTP request failed!
The URL I am fetching has no importance, because even when I fetch http://www.google.com it doesn't work. The exact same script works on a different server. The one failing is Ubuntu 10.04 with PHP 5.3.2. This is not a problem in my script; it's something different on my server, or it might be a bug in PHP.
I have tried setting a user_agent in php.ini, but with no success. My allow_url_fopen is set to On.
If you have any ideas, feel free!
It sounds like your configuration isn't allowed to use file functions, which is common these days because of security concerns. If you have the cURL libraries available to you, I would recommend trying those out.
PHP: cURL
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://www.google.ca/");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$file = curl_exec($ch);
curl_close($ch);
echo $file;
Check that your php.ini config is set to allow fopen to open external URLs:
allow_url_fopen "1"
http://www.php.net/manual/en/filesystem.configuration.php#ini.allow-url-fopen
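A quick way to confirm what the running configuration actually reports (in case a different php.ini is being loaded than you expect) is to check it from the script itself, e.g.:
var_dump(ini_get('allow_url_fopen')); // "1" (or "On") means remote fopen is allowed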
I'm not at all sure whether this is the problem, but I know in the past I've had problems opening URLs with fopen, often due to php.ini's allow_url_fopen or other security settings.
You may want to try cURL in PHP, which often works for me; you'll find an example easily by googling.
Check your phpinfo output - is http present under Registered PHP Streams?
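You can also list the registered wrappers from a script, for example:
print_r(stream_get_wrappers()); // should include "http" (and "https" if OpenSSL is enabled)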
Are you getting "HTTP request failed" without further details? The socket timeout could have expired; it defaults to 60 seconds. See: http://php.net/manual/en/function.socket-set-timeout.php
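If the timeout is the issue, you can raise it for stream-based calls like fopen()/file_get_contents() - a sketch with example values:
// Raise the default socket timeout (in seconds) for this script
ini_set('default_socket_timeout', 120);

// Or set it per request via a stream context
$context = stream_context_create(array('http' => array('timeout' => 120)));
$data = file_get_contents('http://www.google.ca/', false, $context);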