I have been trying to retrieve the contents of a website (https://www.programsgulf.com/) using file_get_contents. Unfortunately, the resulting output is missing many elements (images, formatting, styling, etc.) and basically looks nothing like the original page I'm trying to retrieve.
This has never happened before with any other URL I have tried to retrieve using this same method, but for some reason this particular URL (https://www.programsgulf.com/) refuses to work properly.
The code I'm using is:
<?php
$homepage = file_get_contents('https://www.programsgulf.com/');
echo $homepage;
?>
Am I missing anything? All suggestions on how to get this working properly would be greatly appreciated. Thank you all for your time and consideration.
You can't just echo someone else's HTML and expect it to render the same. Relative asset URLs (scripts, images, stylesheets) now resolve against your domain, where those files don't exist, and any scripts on the page that make requests back to the original site will be blocked by the browser's same-origin policy unless the server has (mis)configured CORS rules. That protection layer exists in every modern browser and you won't overcome it from the client side.
If you really want this to work you have to download each asset server-side, store it locally, and rewrite the links in the HTML to point at your local copies. This is exactly how web-scraping and online-proxy software works.
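Short of mirroring every asset, a lighter-weight sketch is to inject a `<base>` tag so relative URLs resolve against the original host instead of yours. This is an assumption about what the asker needs (it only fixes relative links; it does not proxy anything):

```php
<?php
// Minimal sketch: inject a <base href="..."> right after <head> so that
// relative asset URLs (images, CSS, JS) resolve against the original host.
function absolutize($html, $baseUrl)
{
    $base = '<base href="' . htmlspecialchars($baseUrl, ENT_QUOTES) . '">';
    if (stripos($html, '<head') !== false) {
        // Insert the <base> tag immediately after the opening <head> tag.
        return preg_replace('/(<head[^>]*>)/i', '$1' . $base, $html, 1);
    }
    // No <head> found: prepend the tag as a fallback.
    return $base . $html;
}

// Usage (fetches the live site):
//   $homepage = file_get_contents('https://www.programsgulf.com/');
//   echo absolutize($homepage, 'https://www.programsgulf.com/');
```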
Related
My site is loading images from other sites, and this caused warnings when I implemented HTTPS instead of plain HTTP. I know why this is happening, but I'm wondering how to correct it.
The best solution I have seen is here, but I don't understand how it works.
The poster suggests prepending https://example.com/imageserver?url= to the image URL. This doesn't work, so what am I missing? What is imageserver?
I hope this makes sense, I'm not sure if I'm not just missing something obvious here.
imageserver could be a PHP script that fetches the image and outputs its contents.
A very simple example (not at all safe, since it will proxy any URL):
echo file_get_contents($_GET['url']);
The idea here is that the browser now gets the images from your secure (HTTPS) server instead of the original non-HTTPS server.
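A slightly safer sketch of the same idea restricts the proxy to hosts you explicitly trust, so the script can't be abused as an open proxy. The host name below is a hypothetical placeholder:

```php
<?php
// Only proxy images from hosts you explicitly trust.
function allowed_image_url($url, array $allowedHosts)
{
    $scheme = parse_url($url, PHP_URL_SCHEME);
    $host   = parse_url($url, PHP_URL_HOST);
    return in_array($scheme, array('http', 'https'), true)
        && in_array($host, $allowedHosts, true);
}

$allowed = array('images.example.com'); // hypothetical: list your real hosts
$url = isset($_GET['url']) ? $_GET['url'] : '';

if (allowed_image_url($url, $allowed)) {
    header('Content-Type: image/jpeg'); // or detect the real type first
    echo file_get_contents($url);
} else {
    http_response_code(403);
}
```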
I just transferred a website from their previous host to hosting with me. Obviously, I had to change some of the links pointing to the images to make them display correctly. Unfortunately, it's a huge mess. Some links were stored in the MySQL database, but I got into MySQL and replaced all of those with the correct link. Originally, it linked to
http://localhost/...
I now need it to link to
http://[subdomain].[website].net/
I've gone through every line of code I could find with fgrep on Linux and I can't find where it's inserting localhost. Any ideas where localhost could be stored, if not in the database (as far as I can tell) and not in the physical code? I'm assuming it's a PHP variable somewhere. I'm not sure which, but I already made sure that
<?php echo get_template_directory_uri(); ?>
was set to the correct URI. Any help would be greatly appreciated. Thank you.
EDIT
I tried to replace the database information correctly from a clean copy of the database. I used the serialized-PHP search/replace script and it didn't work. The images are still not showing up, and they're still routing back to
http://localhost
I'm not sure what to do about it. Any more suggestions?
1) Check the page source and see exactly where the image URLs point. Some missing image links may be hardcoded to point to the theme folder or other locations.
2) Did you also move /wp-content/uploads?
3) Dumping the database and doing a find/replace in a text editor will break URLs stored in serialized data. You have to use a tool that correctly deserializes and re-serializes the data; see the interconnectit.com WordPress Serialized PHP Search Replace Tool.
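To see why a plain find/replace corrupts serialized data: PHP's serialize() records the byte length of every string, so changing a URL without fixing the recorded length makes unserialize() fail. A small demonstration (the replacement domain is a made-up example):

```php
<?php
// serialize() stores string lengths, e.g. s:21:"http://localhost/blog"
$row = serialize(array('siteurl' => 'http://localhost/blog'));

// Naive text replacement: the string grows but "s:21" stays, so the
// serialized value is now corrupt.
$broken = str_replace('http://localhost', 'http://sub.example.net', $row);
var_dump(@unserialize($broken)); // false: recorded length no longer matches

// Correct approach: unserialize, edit the real values, re-serialize.
$data = unserialize($row);
$data['siteurl'] = str_replace('http://localhost', 'http://sub.example.net', $data['siteurl']);
$fixed = serialize($data);
var_dump(unserialize($fixed) === $data); // true
```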
If you're sure that you replaced every occurrence of localhost in the database, then the most likely next culprit is the browser cache, so I recommend clearing it just to be sure. The exact method depends on your browser; in Internet Explorer, for example, open the Developer Tools (F12) and go to Cache -> Clear cache for this domain.
I have a site, complete with CMS etc., all working under one domain name. It turns out that for legal reasons one page on this site has to sit on a different domain name. The page is hooked into the same CMS as the rest of the site (built using CodeIgniter). I don't want to have to do another installation just for this page.
Is there any simple way to display just this page under a different domain name without taking it out of the current application?
Thanks a lot
You should look at, in order:
an include() with the correct php.ini configuration (allow_url_include enabled)
a file_get_contents() call, printing the variable into your page
an <iframe src="yoururl">, which would be the easy but unsafe way
the purpose-built cURL library
fopen(), which theoretically allows remote files to be opened, but in my experience it's not that reliable
Look at this site; it seems rather exhaustive regarding your problem.
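For the cURL option above, a minimal sketch with timeouts and error handling might look like this (the URL in the usage comment is a placeholder):

```php
<?php
// Fetch a remote page with cURL instead of include(), which gives you
// timeouts, redirect handling, and explicit errors.
function fetch_page($url)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return a string, don't echo
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow HTTP redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // give up after 10 seconds
    $body = curl_exec($ch);
    if ($body === false) {
        $err = curl_error($ch);
        curl_close($ch);
        throw new RuntimeException("fetch failed: $err");
    }
    curl_close($ch);
    return $body;
}

// echo fetch_page('https://www.example.com/page-to-embed');
```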
Try including the file:
<?php include 'http://www.domain.com/url/to/file/page.html'; ?>
I think what you need here is a symlink, which is something I don't know too much about. My understanding is that the path displayed to the user does not in fact have to have anything to do with where the file is actually stored, meaning you can set this up to have a completely different URL while keeping it as part of your original application.
A simpler option is a redirect: it's one line in your .htaccess file and you're good to go.
include is a possible solution depending on the format of the remote page (i.e., it won't work very well if the remote page has a full DOM structure and you're trying to include it within the DOM structure of your CMS page); more information about that remote page would be needed to determine whether include() alone is enough.
Regardless, if include() does work for you, you must make sure allow_url_include is enabled in php.ini; by default, PHP raises an error when encountering a remote URL in an include statement and the include fails.
What I basically want to do is get content from a website and load it into a div on another website. This should be no problem so far.
The problem is, that the content that should be fetched is located on a different server and i have no source access to it.
I'd prefer a solution using JavaScript or jQuery.
Can I use an .htaccess redirect to fetch the content from a remote server with client-side (JS) techniques?
I will also go with other solutions though.
Thanks a lot in advance!
You can't execute an AJAX call against a different domain, due to the same-origin policy. You can add a <script> tag to the DOM which points at a Javascript file on another domain. If this JS file contains some JSON data that you can use, you're all set.
The only problem is you need to get at the JSON data somehow, which is where JSON-P callbacks come into the picture. If the foreign resource supports JSON-P, it will give you something that looks like
your_callback({ /* JSON data */ });
You then specify your code in the callback.
See JSONP for more.
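On the server side, a JSON-P endpoint is just JSON wrapped in the caller's callback name. A minimal PHP sketch (the callback validation is an assumption, added because an unchecked callback name allows script injection):

```php
<?php
// Build a JSON-P response: wrap the JSON payload in the callback name the
// client passed, so the browser executes it as a script.
function jsonp_response(array $data, $callback)
{
    // Only allow simple identifiers as callback names, to avoid injection.
    if (!preg_match('/^[a-zA-Z_$][a-zA-Z0-9_$]*$/', $callback)) {
        $callback = 'callback';
    }
    return $callback . '(' . json_encode($data) . ');';
}

// header('Content-Type: application/javascript');
// echo jsonp_response(array('title' => 'Hello'), $_GET['callback']);
```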
If JSONP isn't an option, then the best bet is to probably fetch the data server-side, say with a cron job every few minutes, and store it locally on your own site.
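The fetch-and-store-locally approach can be sketched as a small cache helper. This is an assumption about the setup (file-based cache, a 5-minute TTL, placeholder URL):

```php
<?php
// Serve a locally cached copy of a remote resource; only re-fetch when the
// cached file is older than $ttl seconds.
function cached_fetch($url, $cacheFile, $ttl = 300)
{
    if (file_exists($cacheFile) && (time() - filemtime($cacheFile)) < $ttl) {
        return file_get_contents($cacheFile); // fresh enough: use the cache
    }
    $body = file_get_contents($url);
    if ($body !== false) {
        file_put_contents($cacheFile, $body); // refresh the cache
        return $body;
    }
    // Remote fetch failed: fall back to a stale copy if one exists.
    return file_exists($cacheFile) ? file_get_contents($cacheFile) : false;
}

// echo cached_fetch('https://other-site.example/widget.html', '/tmp/widget.cache');
```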
You can use a server-side XMLHTTP request to grab your content from the other server. You can then parse it on your server (a.k.a. screen scraping) and serve up the portion you want along with your web page.
If the content from the other website is just an HTML doc that you want to display on your site, you could also use an iframe to pull it in. You won't have access to any of its content because of browser security rules.
You will likely have to "scrape" the data you need and store it on your server.
This is a great tutorial on how to cache data from an external site. It is actually written to fetch and store XML, so it'll need some modification. Also, if your host doesn't allow remote file_get_contents (allow_url_fopen is disabled), you may have to modify it to use cURL.
While cross-site scripting is generally regarded as negative, I've run into several situations where it's necessary.
I was recently working within the confines of a very limiting content management system. I needed to include database code within the page, but the hosting server didn't have anything usable available. I set up a couple bare-bones scripts on my own server, originally thinking that I could use AJAX to import the contents of my scripts directly into the template of the CMS (thus retaining dynamic images, menu items, CSS, etc.). I was wrong.
Due to the limitations of XMLHttpRequest objects, it's not possible to grab content from a different domain. So I thought iFrame - even though I'm not a fan of frames, I thought that I could create a frame that matched the width and height of the content so that it would appear native. Again, I was blocked by cross-site scripting "protections." While I could indeed load a remote file into the iFrame, I couldn't execute JavaScript to modify its size on either the host page or inside the loaded page.
In this particular scenario, I wasn't able to point a subdomain to my server. I also couldn't create a script on the CMS server that could proxy content from my server, so my last thought was to use a remote JavaScript.
A remote JavaScript works. It breaks when the user has JavaScript disabled, which is a downside; but it works. The "problem" I was having with using a remote JavaScript was that I had to use the JS function document.write() to output any content. Any output that isn't JS causes script errors. In addition to using document.write() for every line, you also have to ensure that the content is escaped - or else you end up with more script errors.
My solution was as follows:
My script received a GET parameter ("page") and then looked for the file ({$page}.php), and read the contents into a variable. However, I had to use awkward buffering techniques in order to actually execute the included scripts (for things like database interaction) then strip the final content of all line break characters (\n) followed by escaping all required characters. The end result is that my original script (which outputs JavaScript) accesses seemingly "standard" scripts on my server and converts their standard output to JavaScript for displaying within the CMS template.
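The buffering and escaping described above might be sketched like this (the pages/ path and whitelist are hypothetical stand-ins; json_encode replaces the hand-rolled newline stripping and escaping):

```php
<?php
// Run a local PHP page, capture its output, and emit it as a single
// document.write() call so it can be pulled in via a remote <script> tag.
function page_as_js($file)
{
    ob_start();
    include $file;            // executes the page's PHP (DB calls, etc.)
    $html = ob_get_clean();   // everything the page printed
    // json_encode handles all quoting and newline escaping in one step.
    return 'document.write(' . json_encode($html) . ');';
}

// Usage sketch -- validate $_GET['page'] against a whitelist first!
// $page = in_array($_GET['page'], array('home', 'news'), true) ? $_GET['page'] : 'home';
// header('Content-Type: application/javascript');
// echo page_as_js(__DIR__ . '/pages/' . $page . '.php');
```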
While this solution works, it seems like there may be a better way to accomplish the same thing. What is the best way to make cross-site scripting work specifically for the purpose of including content from a completely different domain?
You've got three choices:
Create a server side proxy script.
Create a remote script to read in remote dynamic HTML. Use a library like jQuery to make this easier; you can use its load function to inject HTML where needed. EDIT: What I originally meant for option #2 was utilizing JSONP, which requires the server-side script to recognize the "callback=?" param.
Use a client-side Flash proxy and set up a crossdomain.xml file in your server's web root.
Personally, I would call the other domain from the server and fetch and parse the data there for use in your page. That way you avoid any problems, and you get the power of a server-side language/platform for getting and parsing the data.
Not sure if that would work for your specific scenario...hard to know even with your verbose description...
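A sketch of that server-side "get and parse" approach using DOMDocument/DOMXPath (the URL and the 'content' id are hypothetical):

```php
<?php
// Fetch a remote page server-side, then extract just the element you need.
function extract_div($html, $divId)
{
    $doc = new DOMDocument();
    @$doc->loadHTML($html); // @: real-world HTML is rarely perfectly valid
    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query("//div[@id='" . $divId . "']");
    if ($nodes->length === 0) {
        return null; // element not found
    }
    return $doc->saveHTML($nodes->item(0));
}

// $html = file_get_contents('https://other-domain.example/page');
// echo extract_div($html, 'content');
```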
You could try easyXDM, by including very little code, you can pass data or method calls between documents of different domains.
I've come across that YDN server side proxy script before. It says it's built to work with Yahoo's Search APIs.
Will it work with any domain, if you simply trim the Yahoo API code out? Or do you need to replace it with the domain you want it to work with?
Content inside an iframe can be accessed by the parent page's JavaScript if both pages set document.domain to the same value. Note that a page can only set document.domain to a suffix of its own hostname, so this works when the two sites share a parent domain (e.g. a.example.com and b.example.com can both set document.domain = 'example.com'); it will not work for two unrelated domains.
Eg:
Site A contains an iframe with src='Site B/home.php', and home.php looks like this:
<?php
// ... PHP stuff ...
?>
<script type="text/javascript">document.domain = 'example.com';</script>