Fix link addresses in a website's HTML code - php

I have been working on a tool lately. It grabs all the link addresses from a website.
My problem is that links in the HTML code are sometimes written differently:
/index.php
index.php
http://www.website.com/index.php
I need to make all links the same:
/index.php -> http://www.website.com/index.php
index.php -> http://www.website.com/index.php
http://www.website.com/index.php -> http://www.website.com/index.php
Thanks for help.

Using preg_replace to fix relative URLs
Requires:
$domain = the subject site's domain
$path = the document or string in which you are looking for relative links.
Returns:
$url = the document or string with the links within it converted to proper URLs using the given domain.
Code:
$url = preg_replace('/<a\shref="([\/\?\w\.=\&]+)"([\s]rel="(\w+)")*>/', '<a href="http://' . $domain . '$1" rel="$3">', $path);
Note: a relative path without a leading slash (index.php) is appended directly to the domain, so add the slash yourself in that case.
Good luck, let me know how it goes.

Welcome to GoogleOverflow.com.
Here is the complete tutorial for parsing links in HTML using PHP and regex: http://www.the-art-of-web.com/php/parse-links/

Here's a function which will return the absolute URL given the base (current) URL and a relative one.
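The function itself did not survive in this copy of the answer. As a stand-in, here is a minimal sketch (not the original code) of what such a function might look like. The name resolve_url is hypothetical; the sketch assumes $base is a well-formed absolute URL and does not handle ../ segments, fragments, or query-only references.

```php
<?php
// Hypothetical helper: resolve $relative against the absolute URL $base.
// Assumes $base has at least a scheme and a host.
function resolve_url(string $base, string $relative): string
{
    // Already absolute (has a scheme such as http or https): return unchanged.
    if (parse_url($relative, PHP_URL_SCHEME) !== null) {
        return $relative;
    }

    $parts = parse_url($base);
    $root  = $parts['scheme'] . '://' . $parts['host']
           . (isset($parts['port']) ? ':' . $parts['port'] : '');

    // Protocol-relative link (//host/path): reuse the base scheme.
    if (strncmp($relative, '//', 2) === 0) {
        return $parts['scheme'] . ':' . $relative;
    }

    // Root-relative link (/index.php): append to scheme + host.
    if ($relative !== '' && $relative[0] === '/') {
        return $root . $relative;
    }

    // Document-relative link (index.php): keep the base path up to the last "/".
    $path = $parts['path'] ?? '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);

    return $root . $dir . $relative;
}
```

With the base http://www.website.com/, all three link forms from the question resolve to http://www.website.com/index.php.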

You need to check for the existence of a base tag. If you find one, it specifies the base URL (otherwise, the base URL is the same path the browser points to, up to the last /).

Related

Php get website url after https:// or www. or subdomain

I want to get the correct URL with PHP without any errors. Example links:
https://example.com/
https://example.com/search/
https://example.com/search/?q=test
https://it.example.com/
https://it.example.com/search/
https://it.example.com/search/?q=test
So I want to show each link without the scheme or subdomain: https://example.com/ should show example.com, https://example.com/search/ should show example.com/search/, https://it.example.com/search/?q=test should show example.com/search/?q=test, and so on, without any errors. Thanks.
Looks like you need $_SERVER['REQUEST_URI']:
$link = "$_SERVER[HTTP_HOST]$_SERVER[REQUEST_URI]";
Have a look at the PHP documentation too: http://php.net/manual/en/reserved.variables.server.php
This will return the HTTP Host without https, and will also get you the request_uri with query strings etc.
parse_url() will also give you each element, and then you can build up the string you need:
http://php.net/manual/en/function.parse-url.php
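The parse_url() approach could be sketched roughly as follows. The function name display_url is hypothetical, and keeping only the last two host labels is a naive assumption: it drops "it." or "www." as the question asks, but gives wrong results for domains such as example.co.uk.

```php
<?php
// Hypothetical helper: strip the scheme and subdomain from a URL for display.
function display_url(string $url): string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['host'])) {
        return $url; // not something we can parse; leave it alone
    }

    // ASSUMPTION: the registrable domain is the last two labels of the host.
    $labels = explode('.', $parts['host']);
    $host   = implode('.', array_slice($labels, -2));

    // Drop a bare "/" path so https://example.com/ becomes example.com.
    $path = $parts['path'] ?? '';
    if ($path === '/') {
        $path = '';
    }

    $out = $host . $path;
    if (isset($parts['query'])) {
        $out .= '?' . $parts['query'];
    }

    return $out;
}
```

For the current request, the input could be built as "https://" . $_SERVER['HTTP_HOST'] . $_SERVER['REQUEST_URI'] before passing it to the helper.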

How to get value after / in URL

I am trying to get the value after the / in a URL in PHP.
I have tried using $_GET['va'], but this only works for the following,
http://localhost:8080/default?va=xxx
What I'm trying to get is this,
http://localhost:8080/default/xxx
How do I get the xxx value after a / in PHP.
Any help is greatly appreciated.
Edit
Thanks to everyone who answered, I wasn't very clear in stating what I wanted. It appears what I was looking for is known as a pretty URL or a clean URL.
I solved this by following Pedro Amaral Couto's answer.
Here is what I did:
I modified the .htaccess file on my Apache server and added the following code.
RewriteEngine On
RewriteRule ^([a-zA-Z0-9]+)$ default.php?page=$1
RewriteRule ^([a-zA-Z0-9]+)/$ default.php?page=$1
Then I modified my default.php file to read $_GET['page']:
<?php
echo $_GET['page'];
?>
And it returned the value after the "/".
You want to make what is called "pretty URLs" (and other names).
You need to configure the HTTP server appropriately, otherwise you'll get a 404 error. If you're using Apache, that means enabling mod_rewrite and adding rewrite rules (RewriteEngine On plus RewriteRule lines with regular expressions) to .htaccess.
There's already a question on Stack Overflow about this:
Pretty URLs with .htaccess
Here are some other relevant articles that may help:
http://www.desiquintans.com/cleanurls
https://medium.com/#ArthurFinkler/friendly-urls-for-static-php-files-using-htaccess-3264e7622373
You can see how it's done in Wordpress:
https://codex.wordpress.org/Using_Permalinks#Where.27s_my_.htaccess_file.3F
If you follow those, you won't need to change the PHP code, you may use $_GET to retrieve "xxx".
You are looking for: $_SERVER['REQUEST_URI']
The URI which was given in order to access this page; for instance, '/index.html'.
basename(parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH));
The $_GET global variable only holds values parsed from the query string.
What you're looking for is the $_SERVER['REQUEST_URI'] global variable:
$url = $_SERVER['REQUEST_URI'];
$url will now contain the request path (plus any query string), for example /default/xxx. Use explode('/', $url) to break it up into an array of segments and parse it from there.
Example:
$pieces = explode('/', $url);
// $url starts with "/", so $pieces[0] is an empty string;
// for /default/xxx, $pieces[1] is "default" and $pieces[2] is "xxx"
echo $pieces[2];
You can do this in two ways.
1- Follow these steps:
Get the URL
Explode it by /
Get the last element of the array
2- Add an .htaccess rule that maps the value to a query variable:
RewriteRule ^default/(.*) page.php?variable=$1
and then read it with $_GET['variable']
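The first approach (get the URL, explode by /, take the last element) might look like the sketch below. The function name last_path_segment is hypothetical; the sketch ignores the query string and any trailing slash, and assumes the server already routes /default/xxx to your script.

```php
<?php
// Hypothetical helper: return the last non-empty path segment of a request URI.
function last_path_segment(string $requestUri): string
{
    // Keep only the path, so "?va=1" and similar are ignored.
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path)) {
        return '';
    }

    // Split on "/" and drop empty pieces caused by leading or trailing slashes.
    $segments = array_values(array_filter(explode('/', $path), 'strlen'));

    return $segments === [] ? '' : end($segments);
}

// Typical use inside a request:
// $value = last_path_segment($_SERVER['REQUEST_URI']);
```

For /default/xxx this returns xxx, which is the value the question asks for.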

anchor tag weird behaviour in codeigniter

I have a controller named "job_classified". The problem is that when I click
<a href="google.com">click me</a>
in a view, it opens http://localhost/my_project/job_classified/google.com instead of google.com.
What is actually the problem? I tried other CodeIgniter URI functions, but they didn't work for me. Can someone guide me on how to do it correctly?
Why your code doesn't work is explained here: https://stackoverflow.com/a/2005097/4248328
So do it like this:
click me
Try this:
click me
Please change the URL to:
<a href="http://google.com">click me</a>
Reason: a URL is treated as relative if no http or https scheme is given.
In your case the browser treats google.com as a relative file in the same directory.
In CodeIgniter you can do this in a couple of ways.
Using the base_url from the url helper
Note: the base url in config.php must be set
https://www.codeigniter.com/user_guide/helpers/url_helper.html#base_url
Some Name
Some Name
You can also use anchor():
https://www.codeigniter.com/user_guide/helpers/url_helper.html#anchor
<?php echo anchor('job_classified', 'Job Classified');?>
<?php echo anchor('controller/function', 'Job Classified');?>
How to remove index.php from url
https://github.com/wolfgang1983/htaccess_for_codeigniter
Always prepend external URLs with a double slash (for example //google.com). That way you don't need to think about whether the location should be http or https: the browser will automatically use the same scheme as the current page.

$_SERVER['HTTP_HOST'] appends dot to the end of URL

I am calling a function that reads $_SERVER['HTTP_HOST'] and renders an anchor element whose href is built from that value.
On my mobile theme on Android devices this function appends a dot to the end of the URL, so it looks like www.example.com. and this makes some other functions work improperly.
Upon debugging I realized that it is precisely $_SERVER['HTTP_HOST'] that has this wrong value.
Has anyone had this problem, or does anyone have an idea how to fix it?
I don't think it's a PHP issue, but this code can resolve your problem:
trim($_SERVER['HTTP_HOST'], '.')
In the Domain Name System, and most notably, in DNS zone files, a fully qualified domain name is specified with a trailing dot. For example,
somehost.example.com. specifies an absolute domain name that ends with an empty top level domain label.
So PHP is actually returning a valid value. As for how to deal with it, use sudhakar's suggestion.
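If only the trailing dot should go, rtrim() is a slightly narrower alternative to trim(). The sketch below is one possible way to normalise the host before building links; canonical_host is a hypothetical name, and lowercasing is an extra step not mentioned in the answers (host names are case-insensitive).

```php
<?php
// Hypothetical helper: normalise a host name taken from the Host header.
function canonical_host(string $host): string
{
    // Remove the trailing dot of a fully qualified name and lowercase the rest.
    return rtrim(strtolower($host), '.');
}

// Typical use inside a request:
// $host = canonical_host($_SERVER['HTTP_HOST']);
```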

Dynamic links linking to root domain

I am building a site directory and I am having some trouble linking to sites.
The directory currently stores the site domain in a table and calls it through a foreach loop listing 25 separate domains on the page, but when I click on the links I am greeted with
localhost/directory (my site root) /linkeddomain.com
Rather than just displaying linkeddomain.com
I put http://www. in front of the array call
href='http://www.".$row['siteurl']."
However, this is useless for production, because if anyone enters their domain as http://www.theirdomain.com it will come out as http://www.http://www.theirdomain.com
Does anyone know how to fix this issue?
Thanks in advance
Luke
Make sure that the base URL is always the full URL, including the scheme and a subdomain (if applicable).
So:
$base_url = "http://livesite.com";
$base_url = "http://localhost/john/customerX";
$base_url = "https://secure.livesite.com";
If all your links are prefixed by the base URL you should be fine.
Note that in all URLs I left off the trailing /. You can choose to include it, just make sure you always do it the same way: have a clear normalized form.
You can just check to see if you need to add http to the url or not.
if (!preg_match('#^https?://#i', $url)) {
    $url = "http://" . $url;
}
That will only add it if it is needed.
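The check above can be wrapped in a small helper and applied while rendering the directory. This is only a sketch: ensure_scheme and render_links are hypothetical names, the rows are assumed to be arrays with a siteurl key as in the question, and http is assumed as the default scheme.

```php
<?php
// Hypothetical helper: add a scheme to $url only when it has none.
function ensure_scheme(string $url, string $default = 'http'): string
{
    $url = trim($url);

    // Already has a scheme (http://, https://, ...): leave untouched.
    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $url)) {
        return $url;
    }

    // Protocol-relative (//domain.com): just add the scheme name.
    if (strncmp($url, '//', 2) === 0) {
        return $default . ':' . $url;
    }

    return $default . '://' . $url;
}

// Hypothetical rendering loop; $rows is assumed to come from the directory table.
function render_links(array $rows): string
{
    $html = '';
    foreach ($rows as $row) {
        // Escape the value before putting it into the attribute and link text.
        $href  = htmlspecialchars(ensure_scheme($row['siteurl']), ENT_QUOTES);
        $html .= "<a href='{$href}'>{$href}</a>\n";
    }

    return $html;
}
```

Normalising once, when the domain is stored, would also work; doing it at render time as here keeps the stored data untouched.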
