Convert external resource to https

My site loads images from other sites, and this causes warnings now that I have switched from plain HTTP to HTTPS. I know why this is happening, but I'm wondering how to fix it.
The best solution I have seen is here, but I don't understand how it works.
The poster suggests prepending https://example.com/imageserver?url= to the image URL. This doesn't work. So what am I missing? What is imageserver?
I hope this makes sense; I may just be missing something obvious here.

imageserver could be a PHP script that fetches the image and outputs its contents.
A very simple example (not very safe):
echo file_get_contents($_GET['url']);
The idea here is that the browser now gets the images from your secure server instead of the original non-https server.
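The one-liner above will fetch any URL it is given, including local files. A slightly safer sketch, assuming you know in advance which hosts you embed images from: the host names, the function names and the imageserver.php file name are all made up for illustration, and the type check assumes the fileinfo extension is enabled.

```php
<?php
// Hypothetical allowlist: only the hosts you actually embed images from.
const ALLOWED_IMAGE_HOSTS = ['images.example.org', 'cdn.example.net'];

/**
 * Returns true only for http(s) URLs whose host is on the allowlist.
 */
function isAllowedImageUrl(string $url, array $allowedHosts): bool
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return false;
    }
    if (!in_array(strtolower($parts['scheme']), ['http', 'https'], true)) {
        return false;
    }
    return in_array(strtolower($parts['host']), $allowedHosts, true);
}

/**
 * Fetches the remote image and relays it to the browser.
 */
function serveRemoteImage(string $url, array $allowedHosts): void
{
    if (!isAllowedImageUrl($url, $allowedHosts)) {
        http_response_code(400);
        return;
    }
    $body = @file_get_contents($url);
    if ($body === false) {
        http_response_code(502);
        return;
    }
    // Detect the type from the bytes instead of trusting the remote server.
    $type = (new finfo(FILEINFO_MIME_TYPE))->buffer($body);
    if ($type === false || strpos($type, 'image/') !== 0) {
        http_response_code(415);
        return;
    }
    header('Content-Type: ' . $type);
    echo $body;
}

// In imageserver.php (hypothetical file name):
// serveRemoteImage($_GET['url'] ?? '', ALLOWED_IMAGE_HOSTS);
```

Note that file_get_contents follows HTTP redirects by default, so a stricter version would also restrict or disable redirects through a stream context.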

Related

file_get_contents not returning entire site

I have been trying to retrieve the contents of a website (https://www.programsgulf.com/) using file_get_contents. Unfortunately, the resulting output is missing many elements (images, formatting, styling, etc.) and looks nothing like the original page I'm trying to retrieve.
This has never happened with any other URL I have tried to retrieve using this same method, but for some reason this particular URL (https://www.programsgulf.com/) refuses to work properly.
The code I'm using is:
<?php
$homepage = file_get_contents('https://www.programsgulf.com/');
echo $homepage;
?>
Am I missing anything? All suggestions on how to get this working properly would be greatly appreciated. Thank you all for your time and consideration.

You can't just echo someone else's HTML and expect it to work. Assets referenced with relative URLs (scripts, images, stylesheets) are now resolved against your domain, where they don't exist. In addition, resources the browser fetches under CORS rules (web fonts, XHR/fetch calls) are blocked by the same-origin policy unless the remote server sends permissive CORS headers. That is a protection layer in every modern browser, and you won't overcome it from the page itself.
If you really want this to work, you have to download each asset on the server side, store it locally, and replace the links in the HTML with links to your local copies. This is exactly how web scraping and online proxy software works.
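A first step toward that is finding out which assets the page references. A minimal sketch, assuming the DOM extension is available; the function name extractAssetUrls is made up for illustration, and it only looks at img, script and link tags (no CSS url() references, no srcset):

```php
<?php
/**
 * Returns the distinct asset URLs (img src, script src, link href)
 * found in an HTML string, exactly as written in the markup.
 */
function extractAssetUrls(string $html): array
{
    if (trim($html) === '') {
        return [];
    }
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $wanted = ['img' => 'src', 'script' => 'src', 'link' => 'href'];
    $urls = [];
    foreach ($wanted as $tag => $attribute) {
        foreach ($doc->getElementsByTagName($tag) as $node) {
            $value = $node->getAttribute($attribute);
            if ($value !== '') {
                $urls[] = $value;
            }
        }
    }
    return array_values(array_unique($urls));
}
```

Each returned URL would still have to be resolved against the original site's address, downloaded, and swapped for the local path in the markup.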

Incorrect URL in Wordpress leads to broken images

I just transferred a website from its previous host to hosting with me. Obviously, I had to change some of the links that pointed to the images to make them display correctly. Unfortunately, it's a huge mess. Some links were stored in the MySQL database, so I went into MySQL and replaced all of those with the correct link. Originally, it linked to
http://localhost/...
I now need it to link to
http://[subdomain].[website].net/
I've gone through every line of code I could find with fgrep on Linux and I can't find where it's inserting localhost. Any ideas where localhost could be stored, if not in the database (as far as I can tell) and not in the code on disk? I'm assuming it's a PHP variable somewhere. I'm not sure which, but I already made sure that
<?php echo get_template_directory_uri(); ?>
was set to the correct URI. Any help would be greatly appreciated. Thank you.
EDIT
I tried to replace the database information correctly, starting from a clean copy of the database. I used the serialize PHP script and it didn't work. The images are still not showing up and they're still routing back to
http://localhost
I'm not sure what to do about it. Any more suggestions?

1) Check page source and see exactly where the image URLs point to. Some missing image links may be hardcoded to point to the theme folder or other locations.
2) Did you also move /wp-content/uploads?
3) Dumping the database and doing a find/replace with a text editor will break URLs that are in serialized data. You have to use a tool to correctly deserialize/re-serialize data. See interconnectit.com WordPress Serialized PHP Search Replace Tool
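To illustrate point 3: serialized PHP strings carry a length prefix, so a plain text replace that changes the string length makes the value unreadable. A minimal sketch of the deserialize/replace/re-serialize approach follows. The function names and the http://sub.example.net address are made up for illustration, array keys and objects are left untouched, and this is not the code of the tool mentioned above.

```php
<?php
/**
 * Recursively replaces $search with $replace in every string value.
 */
function replaceDeep($value, string $search, string $replace)
{
    if (is_string($value)) {
        return str_replace($search, $replace, $value);
    }
    if (is_array($value)) {
        foreach ($value as $key => $item) {
            $value[$key] = replaceDeep($item, $search, $replace);
        }
    }
    return $value; // other types (ints, objects, ...) pass through unchanged
}

/**
 * Replaces text inside a database value that may or may not be serialized.
 */
function replaceInSerialized(string $stored, string $search, string $replace): string
{
    // allowed_classes => false: never instantiate objects from database content.
    $data = @unserialize($stored, ['allowed_classes' => false]);
    if ($data === false && $stored !== serialize(false)) {
        // Not serialized data: a plain string replace is safe here.
        return str_replace($search, $replace, $stored);
    }
    // Re-serializing recomputes the length prefixes.
    return serialize(replaceDeep($data, $search, $replace));
}

// Usage with a made-up option value:
$stored = serialize(['logo' => 'http://localhost/wp-content/uploads/logo.png']);
$fixed  = replaceInSerialized($stored, 'http://localhost', 'http://sub.example.net');
```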

If you're sure that you replaced every occurrence of localhost in the database, the most likely remaining culprit is the browser cache, so I recommend clearing it just to be sure. The method depends on your browser; in Internet Explorer, for example, open the Developer Tools (F12) and go to Cache->Erase cache for this domain.

How to check if there is anything at URL?

Part of my site requires users to input URLs, but if they mistype the URL or deliberately enter a non-existent one, I end up with a bad record in my database.
E.g. in Chrome, if there isn't anything at a URL you get the error message "Oops! Google Chrome could not find fdsafadsfadsf.com" (this is the case I'm referring to).
This could be solved by checking the URL to see if there is anything there. The only way I can think of is loading the external URL in a PHP file and then parsing its content, but I hope there is a method that doesn't put unneeded strain on my server.
What other ways exist to check if there is anything at a particular URL?

I would just make a HEAD request. This will work with most servers, and avoids downloading the entire page, so it is very efficient.
http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html
All you have to do is parse the status code returned. If it is 200, then you're good.
Example implementation with cURL here: http://icfun.blogspot.com/2008/07/php-get-server-response-header-by.html
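A minimal sketch of such a HEAD request, assuming the cURL extension is installed. The function names and the five-second default timeout are choices made for illustration, and any 2xx status after following redirects is treated as "something is there", not only 200:

```php
<?php
/**
 * Decides whether a final HTTP status code means the URL exists.
 */
function isReachableStatus(int $status): bool
{
    return $status >= 200 && $status < 300;
}

/**
 * Sends a HEAD request and reports whether the URL answered successfully.
 */
function urlResponds(string $url, int $timeoutSeconds = 5): bool
{
    $handle = curl_init($url);
    if ($handle === false) {
        return false;
    }
    curl_setopt_array($handle, [
        CURLOPT_NOBODY         => true, // HEAD: headers only, no body download
        CURLOPT_RETURNTRANSFER => true, // do not print anything
        CURLOPT_FOLLOWLOCATION => true, // judge the final target of redirects
        CURLOPT_MAXREDIRS      => 5,
        CURLOPT_CONNECTTIMEOUT => $timeoutSeconds,
        CURLOPT_TIMEOUT        => $timeoutSeconds,
    ]);
    // curl_exec() returns false on DNS failures, timeouts and the like.
    $completed = curl_exec($handle) !== false;
    $status = (int) curl_getinfo($handle, CURLINFO_HTTP_CODE);

    return $completed && isReachableStatus($status);
}

// Usage (performs a real network request):
// var_dump(urlResponds('http://fdsafadsfadsf.com'));
```

Some servers answer HEAD differently from GET, so a stricter check might retry with a small GET before rejecting the URL.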

You can use PHP's get_headers($url), which returns false if there is no answer.

If you're willing to include a tiny Flash embed, you can make a cross-domain AJAX call from the client to see if anything useful is at the destination. This would avoid any server involvement at all.
http://jimbojw.com/wiki/index.php?title=Introduction_to_Cross-Domain_Ajax

I would use cURL to do this, that way you can specify a timeout on it.
See the comments on: http://php.net/manual/en/function.get-headers.php

How do I get this URL without considering the Apache settings?

Hello, I have this URL that I need to get with PHP:
http://www.domain.com/forum/#forum/General-discussions-0.htm
The problem is that this is not a real URL, but the mask created by the .htaccess.
I need to get the visible URL and not the real path of the file, because I need to compare it with some PHP variables I have.
In fact the real path will look like this:
http://domain.com/modules/boonex/forum/index.php
In that form it is totally useless to me.
How do I get the first URL as it is?

You can't get that from http://www.domain.com/forum/#forum/General-discussions-0.htm. Everything after the fragment marker (#) is not even sent to the server, so there is no way to retrieve it except through a delayed update with JavaScript. All that is sent to the server is http://www.domain.com/forum/, and in the onload event of your document you can then load something in with JavaScript.
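PHP can only see the fragment when it is handed the full URL as a string, for example if JavaScript posts it back explicitly. A small sketch of that distinction, using the URL from the question:

```php
<?php
$url = 'http://www.domain.com/forum/#forum/General-discussions-0.htm';

// Given the whole string, parse_url() can split off the fragment.
$path     = parse_url($url, PHP_URL_PATH);     // "/forum/"
$fragment = parse_url($url, PHP_URL_FRAGMENT); // "forum/General-discussions-0.htm"

// During a normal page request, however, the browser only asks for the
// path, so $_SERVER['REQUEST_URI'] would hold "/forum/" and nothing more.
```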

Look into the source code; the site may not have real URLs at all. The part after # is for Ajax-based navigation. It may mean that there are no real URLs on that site, and if there are, they should be extracted from <a href="someurl">, as they might be masked using JavaScript.

With
file_get_contents();
for example. Neither the user nor your server needs to care about the .htaccess.
It is the server processing the request that has to direct you to the correct address.
However, everything after # is left out of the request, so in this case you have no chance of getting it without the real URL.
As @Wrikken said, there is no way to get the part of the URL after the # fragment marker.

Efficient Method for Preventing Hotlinking via .htaccess

I need to confirm something before I go accuse someone of ... well I'd rather not say.
The problem:
We allow users to upload images and embed them within text on our site. In the past we allowed users to hotlink to our images as well, but due to server load we unfortunately had to stop this.
Current "solution":
The method the programmer used to solve our "too many connections" issue was to rename the file that receives and processes image requests (image_request.php) to image_request2.php, and replace the contents of the original with
<?php
header("HTTP/1.1 500 Internal Server Error") ;
?>
Obviously this has caused all images with their src attribute pointing to the original image_request.php to be broken, and is also the wrong code to be sending in this case.
Proposed solution:
I feel a more elegant solution would be:
In .htaccess
If the request is for image_request.php
Check referrer
If referrer is not our site, send the appropriate header
If referrer is our site, proceed to image_request.php and process image request
What I would like to know is:
Compared to simply returning a 500 for each request to image_request.php:
How much more load would be incurred if we were to use my proposed alternative solution outlined above?
Is there a better way to do this?
Our main concern is that the site stays up. I am not willing to agree that breaking all internally linked images is the best / only way to solve this. I refuse to tell our users that because of something WE changed they must now manually change the embed code in all their previously uploaded content.

OK, then you can use Apache's mod_rewrite to prevent hotlinking:
http://www.cyberciti.biz/faq/apache-mod_rewrite-hot-linking-images-leeching-howto/

Using mod_rewrite will probably give you less load than running a PHP script. I think your solution would be lighter.
Make sure that you only block access in step 3 if the referer header is not empty. Some browsers and firewalls block the referer header completely and you wouldn't want to block those.
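The same rule (block only when a referer is present and points elsewhere) can be sketched in PHP for the top of image_request.php. This is an illustration of the logic, not the mod_rewrite approach itself, and it runs PHP for every request, which is the heavier option. The function name and the example.com host are made up.

```php
<?php
/**
 * Returns true when the request carries a referer from another site.
 * Requests without a referer are let through, as advised above.
 */
function isHotlink(?string $referer, string $ownHost): bool
{
    if ($referer === null || $referer === '') {
        return false; // header stripped by a browser or firewall: allow
    }
    $host = parse_url($referer, PHP_URL_HOST);
    if (!is_string($host)) {
        return true; // a referer we cannot parse is treated as foreign
    }
    $host    = strtolower($host);
    $ownHost = strtolower($ownHost);
    $suffix  = '.' . $ownHost;

    // Accept the site itself and its subdomains (www.example.com etc.).
    $isOwn = $host === $ownHost || substr($host, -strlen($suffix)) === $suffix;
    return !$isOwn;
}

// At the top of image_request.php, with a made-up host name:
// if (isHotlink($_SERVER['HTTP_REFERER'] ?? null, 'example.com')) {
//     http_response_code(404);
//     exit;
// }
```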

I assume you store image paths in database with ids of images, right?
And then you query database for image path giving it image id.
I suggest you install Memcached on the server and cache user requests. It's easy to do in PHP. After that you can look at the server load and decide whether you need to stop hotlinking at all.

Your increased load is equal to that of a string comparison in PHP (zilch).
The obfuscation solution doesn't even solve the problem to begin with, as it doesn't stop future hotlinking from happening. If you do check the referrer header, make absolutely certain that all major mainstream browsers will set the header as you expect. It's an optional header, and the behavior might vary from browser to browser for images embedded in an HTML document.
You likely have sessions enabled for all requests (whether they're authenticated or not) -- as a backup plan, you can also rename your session cookie name to something obscure (edit: obscurity here actually doesn't matter as long as the cookie is set for your host only (and it is)) and check that a cookie by that name is set in image_request.php (no cookie set would indicate that it's a first-request to your site). Only use that as a fallback or redundancy check. It's worse than checking the referrer.
If you were generating the IMG HTML on the fly from markdown or something else, you could use a private-key hash strategy with a short-lived expiry time attached to the query string. Completely airtight, but it seems way over the top for what you're doing.
Also, there is no "appropriate header" for lying to a client about the availability of a resource ;) Just send a 404.
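The private-key hash strategy mentioned above could look roughly like this. It is a sketch under assumptions: the query parameter names (id, expires, sig), the function names and the secret are invented for illustration, and HMAC-SHA256 is one reasonable choice of hash.

```php
<?php
/**
 * Builds an image URL that is only valid until $expires (Unix timestamp).
 */
function signImageUrl(string $imageId, int $expires, string $secret): string
{
    $signature = hash_hmac('sha256', $imageId . '|' . $expires, $secret);
    return 'image_request.php?' . http_build_query([
        'id'      => $imageId,
        'expires' => $expires,
        'sig'     => $signature,
    ]);
}

/**
 * Checks the query parameters of an incoming request (e.g. $_GET).
 */
function isValidImageRequest(array $query, string $secret, int $now): bool
{
    foreach (['id', 'expires', 'sig'] as $name) {
        if (!isset($query[$name]) || !is_string($query[$name])) {
            return false;
        }
    }
    if (!ctype_digit($query['expires']) || (int) $query['expires'] < $now) {
        return false; // malformed or expired link
    }
    $expected = hash_hmac('sha256', $query['id'] . '|' . $query['expires'], $secret);
    // hash_equals() compares in constant time.
    return hash_equals($expected, $query['sig']);
}

// When rendering a page: links valid for the next five minutes.
// echo signImageUrl('42', time() + 300, $secretFromConfig);
// In image_request.php:
// if (!isValidImageRequest($_GET, $secretFromConfig, time())) { http_response_code(404); exit; }
```

Existing embeds that lack a signature would fail this check, so it only suits image tags that are generated fresh on every page view.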
