file_get_contents not returning entire webpage - php

I've been trying to retrieve the contents of a webpage (http://3sk.tv) using file_get_contents. Unfortunately, the resulting output is missing many elements (images, formatting, styling, etc.) and looks nothing like the original page I'm trying to retrieve.
This has never happened with any other URL I have tried to retrieve using this same method, but for some reason this particular URL (http://3sk.tv) refuses to work properly.
The code I'm using is:
<?php
$homepage = file_get_contents('http://3sk.tv');
echo $homepage;
?>
Am I missing anything? All suggestions on how to get this working properly would be greatly appreciated. Thank you all for your time and consideration.

That's normal behaviour: you are only grabbing the HTML file itself, not the related images, stylesheets, etc.

I have one quick workaround to fix relative paths: the <base> tag.
http://www.w3schools.com/tags/tag_base.asp
Just add a <base> tag to the output:
<?php
$homepage = file_get_contents('http://3sk.tv');
echo str_replace(
'<head>',
'<head><base href="http://3sk.tv" target="_blank">',
$homepage
);
?>
It should help.

This is to be expected. If you look at the source code, you'll notice many places which do not have a full URL (e.g. lib/dropdown/dropdown.css). The browser resolves such a path relative to the URL of the page, so on the original site it becomes http://3sk.tv/lib/dropdown/dropdown.css. On your website, however, it becomes YOURURL.COM/lib/dropdown/dropdown.css, which does not exist. This will be the case for much of the content.
So, you can't just print another website's source and expect it to work. It needs to be the same URL.
The best way to embed another website is usually to just use an iframe or some alternative.
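To make the resolution rule described in this answer concrete, here is a minimal sketch of how a relative reference is resolved against the URL of the page it appears on. The function name resolve_url is hypothetical, and the logic is deliberately simplified (it does not handle ../ segments, query-only references, or fragments), so treat it as an illustration rather than a complete resolver.

```php
<?php
// Hypothetical helper: resolve $ref the way a browser roughly would,
// relative to the URL of the page that contains it (simplified).
function resolve_url(string $pageUrl, string $ref): string
{
    // Already absolute (has a scheme such as http:)? Leave it alone.
    if (preg_match('#^[a-z][a-z0-9+.-]*:#i', $ref)) {
        return $ref;
    }

    $parts = parse_url($pageUrl);
    if (!is_array($parts) || !isset($parts['scheme'], $parts['host'])) {
        return $ref; // Cannot resolve against a malformed page URL.
    }

    $origin = $parts['scheme'] . '://' . $parts['host'];
    if (isset($parts['port'])) {
        $origin .= ':' . $parts['port'];
    }

    if (strpos($ref, '//') === 0) {
        return $parts['scheme'] . ':' . $ref;   // protocol-relative
    }
    if (strpos($ref, '/') === 0) {
        return $origin . $ref;                  // root-relative
    }

    // Path-relative: replace everything after the last slash of the page path.
    $path = $parts['path'] ?? '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);
    return $origin . $dir . $ref;
}

// resolve_url('http://3sk.tv', 'lib/dropdown/dropdown.css')
//   gives http://3sk.tv/lib/dropdown/dropdown.css
// resolve_url('http://YOURURL.COM/page.php', 'lib/dropdown/dropdown.css')
//   gives http://YOURURL.COM/lib/dropdown/dropdown.css
```

The same relative path therefore points at two different hosts depending on where the HTML is served from, which is why the echoed copy loses its stylesheets and images.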

The webpage is not completely generated server-side, but it relies heavily on JavaScript after the HTML part loads. If you are looking for rendering the page as it looks in browser, you may need a headless browser instead - see e.g. this binding to PhantomJS: http://jonnnnyw.github.io/php-phantomjs/

Related

PHP's file_get_contents: dealing with relative paths inside the result

I'm trying to solve a cross-domain issue, so I'm implementing a script that gets a URL from a GET parameter and opens it with file_get_contents. It works fine until the page tries to load relative paths like this one (the following line is inside index.html):
<script src="js/custom_script.js" />
If I use preg_replace to rewrite the HTML, replacing js/custom_script.js with http://content.domain/js/custom_script.js, it also works. The problem is that I don't always know how many levels of links there are in the page I'm trying to open: for example, index.html could have a button leading to another page with its own relative paths.
Is there an elegant solution to this problem?
I would use the base HTML tag, and instead of scraping through the whole source, just insert it right before the </head> tag.
It could be really simple, with just one line of code: echo str_replace("</head>", "<base href=\"http://content.domain/\" target=\"_blank\"></head>", $source);
Using the base tag makes the browser handle all the nested links for you: it specifies a default URL and a default target for all links on a page.
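As a rough sketch of how the base tag idea could fit into the script from the question, the code below validates the URL taken from the GET parameter and injects a base tag pointing at the fetched page. The function names is_fetchable_url and with_base_tag and the parameter name url are assumptions made for illustration. Unlike the one-liner above, this sketch places the tag right after the opening head tag, because the HTML specification requires base to come before any element that carries a URL attribute.

```php
<?php
// Hypothetical helper: accept only well-formed http(s) URLs, since fetching
// arbitrary user-supplied URLs (file://, php://, ...) is dangerous.
function is_fetchable_url(string $url): bool
{
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
    return in_array($scheme, ['http', 'https'], true);
}

// Hypothetical helper: insert a <base> tag after the first opening <head>,
// matching case-insensitively and tolerating attributes on the tag.
function with_base_tag(string $source, string $pageUrl): string
{
    $base = '<base href="' . htmlspecialchars($pageUrl, ENT_QUOTES) . '" target="_blank">';
    return preg_replace_callback(
        '~<head\b[^>]*>~i',
        function (array $m) use ($base): string {
            return $m[0] . $base;
        },
        $source,
        1 // only the first match
    );
}

// Usage inside the proxy script (the "url" parameter name is an assumption).
if (isset($_GET['url'])) {
    $url = (string) $_GET['url'];
    if (!is_fetchable_url($url)) {
        http_response_code(400);
        exit('Invalid URL');
    }
    $source = file_get_contents($url);
    if ($source !== false) {
        echo with_base_tag($source, $url);
    }
}
```

Using the fetched page's own URL as the base means every nested relative link resolves correctly no matter how deep the page sits, so no per-level regex is needed.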

Adding and styling external file

There is a file on another site that I do not own, with a URL in the following format:
http://example.com/path/with/some/variables
I want to include this file in one of my own pages. I could use iframe to do this, but I also want to change the CSS of something within the included file. To my knowledge, I can't do this with this method.
However, I can't seem to be able to successfully add this via PHP either, with something like:
<?php include 'http://example.com/path/with/some/variables'; ?>
I'm not sure what other methods exist that can do this, but surely this must be possible.
Also, I'm aware of the security implications of using include in a situation like this.
Use readfile:
<?php readfile('http://example.com/path/with/some/variables'); ?>
Yeah, security limitations won't allow you to do this directly in an iframe by manipulating the DOM of the iframed file.
To do it in PHP, you could create a PHP script to read the contents of the URL and add an external CSS file that you've created, to override whatever you want. So:
myreader.php:
<?php
$contents = file_get_contents("http://example.com/path/with/some/variables");
$contents = preg_replace("/<head>/", "<head>\n<link rel='stylesheet' type='text/css' href='mystyle.css'>", $contents, 1);
echo $contents;
and then create mystyle.css:
body {
color : red !important;
}
Finally, either just point your browser to myreader.php, or if you still want it in an iframe, point the iframe src to myreader.php.
PS: Stealing is wrong :)
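If the regular expression turns out to be too brittle (for example when the head tag carries attributes or is written in upper case), the same idea can be sketched with PHP's DOM extension. This is only an illustration under stated assumptions: the function name add_stylesheet is hypothetical, the DOM extension must be enabled, and DOMDocument::loadHTML may mis-decode pages that do not declare their character set.

```php
<?php
// Hypothetical helper: append a stylesheet <link> to the end of <head>,
// so that its rules come after (and can override) the page's own CSS.
function add_stylesheet(string $html, string $cssHref): string
{
    if ($html === '') {
        return $html;
    }

    $doc = new DOMDocument();

    // Real-world HTML is rarely valid; collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $head = $doc->getElementsByTagName('head')->item(0);
    if ($head === null) {
        return $html; // Nothing to attach to; return the input unchanged.
    }

    $link = $doc->createElement('link');
    $link->setAttribute('rel', 'stylesheet');
    $link->setAttribute('type', 'text/css');
    $link->setAttribute('href', $cssHref);
    $head->appendChild($link);

    return $doc->saveHTML();
}

// Usage, mirroring myreader.php:
// echo add_stylesheet(file_get_contents('http://example.com/path/with/some/variables'), 'mystyle.css');
```

Appending the link at the end of head, rather than at the start, lets ordinary cascade order do part of the overriding work, so fewer !important rules may be needed.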
You can use file_get_contents
<?php $content = file_get_contents('http://example.com/path/with/some/variables'); ?>
See the PHP documentation for file_get_contents.

PHP + Smarty: Parse PHP+HTML into a String?

I am using PHP in combination with Smarty Templates to generate pages serverside. Currently, I am loading a page as follows:
$smarty->assign('app', file_get_contents("some_content.php"));
Here some_content.php contains HTML with PHP tags and code inside those tags.
I would like the PHP content inside this file to be executed within the current scope (that of the script reading the file), so that a particular function I've defined is available. How would I go about doing so? All the information I can find is about the eval(...) function, which doesn't seem to be able to cope with the HTML/PHP mixture. Would I need to perform a find/eval/replace operation to achieve the desired result, or is there a more elegant way of doing this?
In my opinion, the short snippet of code you posted suggests that something is generally wrong there :)
But nevertheless you can achieve whatever you are trying to achieve by doing the following:
ob_start();
include("some_content.php");
$result = ob_get_clean();
$smarty->assign('app', $result);
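The output-buffering pattern above can be wrapped in a small reusable function. The sketch below is one possible shape, not the answerer's code: the name render_php_file and the optional $vars array are assumptions added for illustration. Functions defined by the calling script remain available inside the included file, because PHP functions are global.

```php
<?php
// Hypothetical helper: run a mixed PHP/HTML file and return its output
// as a string instead of sending it to the browser.
function render_php_file(string $path, array $vars = []): string
{
    // Expose $vars as local variables to the included file; EXTR_SKIP
    // prevents them from overwriting $path or $vars themselves.
    extract($vars, EXTR_SKIP);

    ob_start();
    try {
        include $path;
    } catch (Throwable $e) {
        ob_end_clean(); // Discard partial output before re-throwing.
        throw $e;
    }
    return (string) ob_get_clean();
}

// Usage with Smarty, as in the question:
// $smarty->assign('app', render_php_file('some_content.php'));
```

Because the include happens inside a function, the file sees only the variables passed through $vars rather than the caller's local variables, which keeps the template's inputs explicit.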
Ich, I'm such a dummkopf. There is an answer right on the PHP manual for eval, right under my nose. Here is the answer I neglected to notice.
You can use the {literal}...{/literal} Smarty tags to display any content in Smarty templates as is. They are used to pass JavaScript and other special content through unparsed.

How to know php file is loaded from source code

I'm working with my JS files. What I have now is a single PHP file that sends a JS header; if a variable is set, it includes the real JS file, which works fine.
The "home" page has the script tag for the php-js file:
<head>
<script type="text/javascript" language="javascript" src="bootstrap.php"></script>
</head>
The bootstrap.php file has something like:
if(isset($hostData) && !empty($hostData)) {
include('bootstrap.js');
}else {
echo "document.write('<center><bold>PLEASE DO SOMETHING...!</bold></center>');";
}
All that seems to be fine. However, when viewing the source code (CTRL+U), the browser shows the "bootstrap.php" part as a link. If clicked, it obviously goes to http://mydomain/bootstrap.php and the JS code can be easily seen, which is exactly what I don't want...
So my question is: is there any PHP way to know whether the file is being loaded from the browser's "rendering view" or from the browser's "source code view"?
Any help is truly appreciated =)
In short, no. You can't hide your script source from your users. The best you can do is obfuscate it using tools like YUICompressor.
There's no way you can hide the javascript code. It needs to be executed by the client, and even if you try to hide it by formatting your code badly, tools like firebug can easily introspect the code and pull out the code.
To be honest I don't think you can actually hide it like that. I'm assuming the best thing you've got to go on is the useragent string but I'm assuming if you "view source" in a browser it would still send the regular headers.
The only way I can think of adding the JS include without it appearing when in view source mode is to actually load the external file via javascript (you could even break the path of the js file into variables so it isn't really human readable) which I would not advise.
If someone wants to get at your JavaScript, they will; there is no way of avoiding it.
and the js code can be easily seen, which is exactly what i don't want...
You don't want the JS to be seen, but you do want to use it???
There IS something wrong with your code though if you want the js file to be used in your page.
You need to include / require the file:
<script type="text/javascript" language="javascript" src="<?php include bootstrap.php ?>"></script>
Otherwise the browser will load the contents of the bootstrap file, but you want to run the code inside it (which can only be done at the server).
Also:
change:
include('bootstrap.js');
to
echo bootstrap.js;
EDIT
by re-reading your question (and other answers) that's exactly what you want: make your JS code invisible (correct me if wrong).
The answer to that is: no, it cannot be done.
You can try to obfuscate the code but it will take someone who wants to see it seconds to 'decode'.
Try using $_SERVER['HTTP_REFERER'], which holds the URL of the page that requested this file.
I'm really sorry for disappearing from here...
The best solution I decided to implement is quite simple: don't show ANY URL or PHP files within the JS code. Over the last few months I've used a single PHP file to do all the necessary database queries, and a stored procedure dynamically generates all the URLs the JS needs.
That way the URLs vary every time, and what I've called the "poor logic" is free for users to view or copy. I don't mind that as long as the server data is secure.
THANKS ALL FOR YOUR VALUABLE ANSWERS!!!

Can I source a php file as javascript?

I am using a WP template that allows me to incorporate arbitrary HTML. Unfortunately, I have to use this particular widget and can't use other WP widgets.
I have on my webserver /some/path/serve_image.php that spits out a random HREF'd IMG SRC with a caption and some other info from a MySQL query.
Now...how can I say "take that output and treat it as HTML"? If I just put "/some/path/serve_image.php" I get that literal string.
I tried:
<script type="javascript" src="/some/path/serve_image.php"></script>
but that didn't work. I tried changing everything in serve_image.php to be document.write() calls and that didn't seem to work either. I'm not the world's greatest JS guy...
So if I have a URL on the net that spits out some HTML and I want to include that HTML in my web page, what's the best way to do that? Sort of like what Google does with Adsense - you source their show_ads.js.
Why not? Add
header('Content-Type: application/javascript');
And output JavaScript Like:
echo("var image = \"".$images[array_rand($images)]."\";");
echo("$('img.randim').attr('src', image);");
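When PHP generates JavaScript like this, values are safer to embed with json_encode than with hand-written quotes, since it escapes quotes and slashes for you. The following is a minimal sketch under assumptions: the function name random_image_js is hypothetical, the img.randim selector is taken from the answer above, and jQuery is assumed to be loaded on the page.

```php
<?php
// Hypothetical helper: build the JavaScript that sets a random image
// as the src of the elements matched by $selector (requires jQuery).
function random_image_js(array $images, string $selector = 'img.randim'): string
{
    if ($images === []) {
        return "/* no images available */\n";
    }

    $image = $images[array_rand($images)];

    // json_encode produces valid, correctly escaped JavaScript string literals.
    return 'var image = ' . json_encode($image) . ";\n"
         . '$(' . json_encode($selector) . ").attr('src', image);\n";
}

// Usage in serve_image.php (send the header before any output):
// header('Content-Type: application/javascript');
// echo random_image_js($images);
```

The page would then load the script with a script tag whose src points at the PHP file, and the browser runs whatever JavaScript the PHP script printed.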
No. JavaScript and PHP are two completely separate languages. In fact, if it was JavaScript, you aren't even loading it the right way.
<script type="text/javascript"></script>
The way you're trying to do it would throw a parse error, because the browser would try to run the PHP script's output as JavaScript. Some browsers would even reject it, because PHP output is sent as text/html by default, while JavaScript should be application/javascript.
PHP has to be done server side, so loading it in the client just doesn't work.
What I think you're after is this:
<?php
require('/some/path/serve_image.php');
?>
Just place that wherever you want the image to be.
