I have this PHP script, which works well and converts almost all pages nicely, but on a few pages it fails to convert a relative URL to an absolute URL. It gives the wrong result for the links below.
$url = 'http://www.lowridermagazine.com/girls/1201_lrms_cat_cuesta_lowrider_girls_model/photo_01.html';
// Example of a relative link of the page above.
$relative = 'photo_01.html';
// Parse the URL the crawler was sent to.
$url = parse_url($url);
if(FALSE === filter_var($relative, FILTER_VALIDATE_URL))
{
// If the link isn't a valid URL then assume it's relative and
// construct an absolute URL.
print $url['scheme'].'://'.$url['host'].'/'.ltrim($relative, '/');
}
else
{
print $relative;
}
It works nicely for the URL http://www.santabanta.com/photos/shriya/9830084.htm but fails for the URL above.
Any idea where I am making a mistake?
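The snippet above joins the host directly with the relative link, so the directory part of the page URL (here /girls/1201_lrms_cat_cuesta_lowrider_girls_model/) is dropped. A minimal sketch of resolving against the page's directory instead follows. The function name resolve_relative_url is hypothetical, and the handling is deliberately simplified (no ports, user info, or query-only references), so it is not a full RFC 3986 resolver.

```php
<?php
// Hypothetical helper: resolve $relative against the page URL $base.
// Assumes $base is an absolute URL with a scheme and a host.
function resolve_relative_url(string $base, string $relative): string
{
    // A link that already carries a scheme is returned unchanged.
    if (parse_url($relative, PHP_URL_SCHEME) !== null) {
        return $relative;
    }
    $parts = parse_url($base);
    $root = $parts['scheme'] . '://' . $parts['host'];

    // Protocol-relative link (//host/path): reuse the base scheme.
    if (strpos($relative, '//') === 0) {
        return $parts['scheme'] . ':' . $relative;
    }
    if (strpos($relative, '/') === 0) {
        // Root-relative link: attach directly to the host.
        $path = $relative;
    } else {
        // Document-relative link: keep the base path up to its last slash.
        $basePath = $parts['path'] ?? '/';
        $path = substr($basePath, 0, strrpos($basePath, '/') + 1) . $relative;
    }
    // Collapse "." and ".." segments; never climb above the root.
    $out = [];
    foreach (explode('/', $path) as $segment) {
        if ($segment === '..') {
            if (count($out) > 1) {
                array_pop($out);
            }
        } elseif ($segment !== '.') {
            $out[] = $segment;
        }
    }
    return $root . implode('/', $out);
}
```

Calling resolve_relative_url($url, $relative) with the original string $url (before it is overwritten by parse_url) keeps the /girls/.../ directory in the result, while root-relative links such as /index.html still resolve against the host alone.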
Related
Hey, so I'm trying to link my current subdomain to a new domain with a specific URL format.
For example, my current subdomain is:
http://current.example.org/
What I would like to do is redirect to the new domain whenever a specific URL format is requested:
http://newexample.org/url?=http://current.example.org/somefolder
Any help will be appreciated, thanks.
Updated answer per comment (you may need to do some tweaking; I'm not sure of the exact value of $_SERVER['REQUEST_URI']):
if (strpos($_SERVER['REQUEST_URI'], "somefolder") !== false){
// redirect
header('Location: http://newexample.org/url?='.$_SERVER['REQUEST_URI']);
exit;
}
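One thing to keep in mind with the redirect above: $_SERVER['REQUEST_URI'] normally holds only the path and query string (for example /somefolder), not the scheme or host. A rough sketch of building the full current URL and encoding it follows. The helper name build_redirect_target and the fallback host are assumptions for illustration, and the encoded form assumes the receiving site decodes the value.

```php
<?php
// Hypothetical helper: returns the redirect target, or null when no redirect applies.
// $server is expected to look like $_SERVER.
function build_redirect_target(array $server, string $needle = 'somefolder'): ?string
{
    $requestUri = $server['REQUEST_URI'] ?? '/';
    if (strpos($requestUri, $needle) === false) {
        return null;
    }
    // Detect HTTPS the usual way; some servers set HTTPS to "off" for plain HTTP.
    $isHttps = !empty($server['HTTPS']) && $server['HTTPS'] !== 'off';
    $scheme = $isHttps ? 'https' : 'http';
    // The fallback host is an assumption taken from the question's example.
    $host = $server['HTTP_HOST'] ?? 'current.example.org';
    $current = $scheme . '://' . $host . $requestUri;
    // urlencode() keeps the embedded URL from being read as part of the outer query string.
    return 'http://newexample.org/url?=' . urlencode($current);
}
```

In a page this could be used as: $target = build_redirect_target($_SERVER); and, when $target is not null, header('Location: ' . $target); followed by exit;.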
Original answer follows:
If the URL contains 'somefolder', then append it to the redirect URL, you mean?
$url = 'http://current.example.org/somefolder' ;
if (FALSE !== strpos($url, 'somefolder')) {
$url = 'http://newexample.org/url?=' . $url;
}
// curl connect using $url
I am currently setting up a site, which requires some sort of "proxy" work. Basically through $_GET['url'] I can grab a site's content using file_get_contents($url). However, when links are shown like: <a href="images/image.png".../>, they will link to my site instead of theirs, which makes all images, links, etc. load from my site, which returns a 404 not found error.
I have not been able to find anything about this anywhere. This is how I do the "proxying" in principle, though not as a final product:
$url = $_GET['url'];
$content = file_get_contents($url);
echo $content;
What could I possibly do to change this, so that links don't depend on what the browser sees but on where they actually come from (the site link in $_GET['url'])? That basically means turning relative links into absolute ones. Thanks!
You would have to know what their site is in order to make a request from it.
To do this, you can parse the url:
$urlParsed = parse_url($url);
$urlHostOnly= $urlParsed['scheme'] . "://" . $urlParsed['host'] . "/";
Then comes the tricky part: you have to prepend the host-only URL to each link.
Most links in HTML are in href and src attributes, so here is a simple replacer to deal with those.
$content = file_get_contents($url);
$replaced_content = preg_replace(
"/(href|src)=\"((?!https?:\/\/)[^\"]*)\"/",
"$1=\"$urlHostOnly$2\"",
$content
);
Now that you have the replaced content, echo it to the client:
echo $replaced_content;
Note: There can be some conflicts with stylesheets and SSL if you do not specify the correct protocol (http / https) when entering the URL.
See: http://i.imgur.com/tz6Hn28.png for an example of this.
It seems I've solved this thanks to advice from a friend.
//grabs the URL of the site I am working with (the $_GET['url'] site basically)
$fullUrl = basename($url);
//Replaces <head> with <head> followed by a base-tag, which has the href attribute of the website.
//This will make all relative links absolute to that base-tag href.
$content = str_replace("<head>", "<head>\n<base href='http://" . $fullUrl . "' />", $content);
echo $content;
Voilà, this site now functions perfectly.
EDIT: Okay, it did not work perfectly, for some reason. If the URL linked to a file like help.asp, basename() would return help.asp. I went a different route:
function addhttp($url) {
if (!preg_match("~^(?:f|ht)tps?://~i", $url)) {
$url = "http://" . $url;
}
return $url;
}
$url = addhttp($url);
preg_match('/^(?:https?:\/\/)?(?:[^@\n]+@)?(?:www\.)?([^:\/\n]+)/', $url, $fullUrl);
$fullUrl = $fullUrl[1];
No more wrong URLs being loaded. This all works... for now.
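Both basename() and a host-only regex discard the directory of the proxied page, so document-relative links on pages below the site root may still resolve incorrectly. As an alternative sketch, parse_url() can be used to build the base href from the scheme, host, and directory. The function name inject_base_tag is hypothetical, and the sketch assumes the page URL already has a scheme and host (for example after addhttp()) and that the HTML contains a head tag.

```php
<?php
// Hypothetical helper: insert a <base> tag pointing at the directory of $pageUrl.
function inject_base_tag(string $html, string $pageUrl): string
{
    $parts = parse_url($pageUrl);
    $path = $parts['path'] ?? '/';
    // Keep the directory portion, so "help.asp" is dropped but its folder stays.
    $dir = substr($path, 0, strrpos($path, '/') + 1);
    $base = $parts['scheme'] . '://' . $parts['host'] . $dir;
    $tag = '<base href="' . htmlspecialchars($base, ENT_QUOTES) . '" />';
    // Insert after the first <head ...> tag, whatever its case or attributes.
    // A callback avoids "$" in the URL being treated as a backreference.
    return preg_replace_callback(
        '/<head(\s[^>]*)?>/i',
        fn(array $m): string => $m[0] . "\n" . $tag,
        $html,
        1
    );
}
```

With this, $content = inject_base_tag($content, $url); would replace the str_replace() call, and a page such as http://example.com/docs/help.asp would get http://example.com/docs/ as its base.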
So I want to redirect a page in a directory, for example auth/login.php, to another page, at auth/. The page is actually called auth/index.php but I don't want this to show up to the client. How can I do this using this PHP code:
header("Location: /auth/");
However, I want the filename to be relative, e.g. ../ instead of hard-coded as auth. Is this possible, and how?
To direct the users to your current directory, you should extract the script's directory from $_SERVER['SCRIPT_NAME'], and then compare it to the actual HTTP request ($_SERVER['REQUEST_URI']). If they're not the same - redirect to the current directory. If they're the same - you should start your actual script:
<?php
$currentScript = $_SERVER['SCRIPT_NAME'];
$pathInfo = pathinfo($currentScript);
$currentDir = $pathInfo['dirname'].'/';
if ($currentDir != $_SERVER['REQUEST_URI'])
{
header('Location: '.$currentDir);
return ;
}
// Here be real site data
On my website, users can put a URL in their profile.
This URL can be http://www.google.com or www.google.com or google.com.
If I just insert $url in my PHP code, the link is not always absolute.
How can I force the link in the <a> tag to be absolute?
If you prefix the URL with // it will be treated as an absolute one. For example:
<a href="//google.com">Google</a>.
Keep in mind this will use the same protocol the page is being served with (e.g. if your page's URL is https://path/to/page the resulting URL will be https://google.com).
Use a protocol, preferably http://:
<a href="http://google.com">Google</a>
Ask users to enter the URL in this format, or prepend http:// if it is missing.
If you prefix the URL only with //, it will use the same protocol the page is being served with.
<a href="//google.com">Google</a>
I recently had to do something similar.
if (strpos($url, 'http') === false) {
$url = 'http://' .$url;
}
Basically, if the URL doesn't contain 'http', add it to the front of the string (prefix).
Or we can do this with a regex:
$http_pattern = "/^https?:\/\/[\w]+/i";
if (!preg_match($http_pattern, $url, $match)){
$url = 'http://' .$url;
}
Thank you to @JamesHilton for pointing out a mistake. Thank you!
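Note that the strpos() check looks for 'http' anywhere in the string, so an input like example.com/http-guide would be left without a scheme. A small sketch that anchors the check at the start and also accepts protocol-relative input follows; the function name ensure_scheme and the default of http are assumptions, not part of the answers above.

```php
<?php
// Hypothetical helper: prefix a scheme only when the URL does not already start with one.
function ensure_scheme(string $url, string $default = 'http'): string
{
    $url = trim($url);
    // Protocol-relative input such as //google.com: attach only the scheme.
    if (strpos($url, '//') === 0) {
        return $default . ':' . $url;
    }
    // Anchored, case-insensitive check for http:// or https://.
    if (preg_match('~^https?://~i', $url) === 1) {
        return $url;
    }
    return $default . '://' . $url;
}
```

For example, ensure_scheme('www.google.com') gives http://www.google.com, while http://www.google.com is returned unchanged.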
With this code:
if(empty($aItemInfo['url'])) {
$url = '<p> </p>';
} else {
$url = ' | <a href="'.$aItemInfo['url'].'">LINK</a>';
}
I've got this as output:
http://localhost/tester/www.google.com
In the DB there is only www.google.com, and of course it's fictional.
What am I doing wrong?
You need to add http:// in your code before using the URL in the <a> tag.
If all your URLs are stored without http://, use this code:
$url = 'http://'.$aItemInfo['url'];
Then use $url
Not too sure what you're trying to link to. If you're linking to an external site, you'll need to add http:// in front of the link. If not, the link will be appended to the current URL, as shown above.
Links can be relative or absolute paths. If you don't include the "http://" part, the browser assumes it is a relative path. Add href="http://'.$aItemInfo['url'].'".
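Putting the advice above together, a minimal sketch of building the link might look like the following. The function name render_link is hypothetical, and the markup is assumed from the question's snippet. It adds http:// only when the stored value lacks a scheme and escapes the value before it goes into the attribute.

```php
<?php
// Hypothetical helper: build the link markup from a stored URL such as "www.google.com".
function render_link(string $storedUrl, string $label = 'LINK'): string
{
    $storedUrl = trim($storedUrl);
    if ($storedUrl === '') {
        // Same placeholder the question uses for an empty URL.
        return '<p> </p>';
    }
    // Add a scheme only when one is missing, so the browser treats the link as absolute.
    $href = preg_match('~^https?://~i', $storedUrl) === 1
        ? $storedUrl
        : 'http://' . $storedUrl;
    // Escape both values, since they are user-supplied data placed into HTML.
    return ' | <a href="' . htmlspecialchars($href, ENT_QUOTES) . '">'
        . htmlspecialchars($label, ENT_QUOTES) . '</a>';
}
```

Here $url = render_link($aItemInfo['url'] ?? ''); would stand in for the if/else in the question.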