Remove characters from the beginning and end of a string - PHP

I want to output only MYID from the URL. What I did so far:
$url = "https://whatever.expamle.com/display/MYID?out=1234567890?Browser=0?OS=1";
echo substr($url, 0, strpos($url, "?out="));
output: https://whatever.expamle.com/display/MYID
$url = preg_replace('#^https?://whatever.expamle.com/display/#', '', $url);
echo $url;
output: MYID?out=1234567890?Browser=0?OS=1
How can I combine this? Thanks.

For a more general solution, we can use regex with preg_match_all:
$url = "https://whatever.expamle.com/display/MYID?out=1234567890?Browser=0?OS=1";
preg_match_all("/\/([^\/]+?)\?/", $url, $matches);
print_r($matches[1][0]); // MYID

When the string is always a Uniform Resource Locator (URL), as in your question,
for example the following string:
$url = "https://whatever.expamle.com/display/MYID?out=1234567890?Browser=0?OS=1";
you can benefit from parsing it first:
$parts = parse_url($url);
and then making use of the fact that MYID is the last path component:
$str = preg_replace(
'~^.*/(?=[^/]*$)~' /* everything but the last path component */,
'',
$parts['path']
);
echo $str, "\n"; # MYID
and then depending on your needs, you can combine with any of the other parts, for example just the last path component with the query string:
echo "$str?$parts[query]", "\n"; # MYID?out=1234567890?Browser=0?OS=1
The point is: if the string already represents structured data, use a dedicated parser to divide it into smaller pieces. It is then easier to get to the result you're looking for.
It is even easier, and works without a regular expression, with the basename() function, which returns the path's last component. Note that on Windows basename() treats the backslash as a separator in addition to the forward slash, so results can differ there if the path contains a backslash:
echo basename(parse_url($url, PHP_URL_PATH)),
'?',
parse_url($url, PHP_URL_QUERY),
"\n"
;
https://php.net/parse_url
https://php.net/preg_replace
https://www.php.net/manual/en/regexp.reference.assertions.php
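If you need this in more than one place, the "parse first, then take the last path component" idea can be wrapped in a small function. This is only a sketch: the name last_path_segment is made up for illustration, and it assumes you want null when the URL has no usable path.

```php
<?php
// Hypothetical helper (not from the answers above): returns the last
// non-empty path segment of a URL, or null if there is none.
function last_path_segment(string $url): ?string
{
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return null; // malformed URL or no path component
    }
    // Split on "/" and drop empty pieces caused by leading/trailing slashes.
    $segments = array_values(array_filter(explode('/', $path), 'strlen'));
    return $segments ? end($segments) : null;
}

echo last_path_segment('https://whatever.expamle.com/display/MYID?out=1234567890?Browser=0?OS=1'), "\n"; // MYID
```

Splitting on "/" by hand avoids the platform-dependent behaviour of basename().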

Related

SPLIT URL in PHP

I have the URL below in my code and I want to split it and get the number from it.
For example, from the URL below I need to fetch 123456:
https://review-test.com/#/c/123456/
I have tried this and it is not working:
$completeURL = https://review-test.com/#/c/123456/ ;
list($url, $number) = explode('#c', preg_replace('/^.*\/+/', '', $completeURL));
Use parse_url
It's specifically made for this sort of thing.
You can do this without using regex also -
$completeURL = 'https://review-test.com/#/c/123456/' ;
list($url, $number) = explode('#c', str_replace('/', '', $completeURL));
echo $number;
If you want to get the /c/123456/ params, you will need to execute the following:
$url = 'https://review-test.com/#/c/123456/';
$url_fragment = parse_url($url, PHP_URL_FRAGMENT);
$fragments = explode('/', $url_fragment);
$fragments = array_filter(array_map('trim', $fragments));
$fragments = array_values($fragments);
PHP_URL_FRAGMENT makes parse_url return the component of the URL after the #.
After parse_url you end up with a string like this: '/c/123456/'
explode('/', $url_fragment) returns an array with empty elements where the leading and trailing '/' were.
array_filter($fragments) removes those empty elements. array_map with trim
strips surrounding whitespace first. That makes no difference for this input,
but with real data it is a sensible precaution.
If you var_dump the result at this point, you can see that the keys are no
longer consecutive, which is why the array is reindexed with array_values($fragments).
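The steps above can be folded into one function. A minimal sketch, assuming you want an empty array when there is no fragment; the name fragment_segments is hypothetical.

```php
<?php
// Hypothetical helper: splits the part of a URL after "#" into its
// non-empty, trimmed segments, reindexed from 0.
function fragment_segments(string $url): array
{
    $fragment = parse_url($url, PHP_URL_FRAGMENT);
    if (!is_string($fragment)) {
        return []; // no fragment present (or URL could not be parsed)
    }
    $segments = array_map('trim', explode('/', $fragment));
    // 'strlen' keeps a literal "0" segment, which a bare array_filter() would drop.
    return array_values(array_filter($segments, 'strlen'));
}

$segments = fragment_segments('https://review-test.com/#/c/123456/');
echo $segments[1], "\n"; // 123456
```

Passing 'strlen' as the callback is a deliberate choice here: array_filter() without a callback removes every falsy value, including the string "0".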
You should try this: basename
basename — Returns trailing name component of path
<?php
echo basename("https://review-test.com/#/c/123456/");
?>
Demo : http://codepad.org/9Ah83qaP
Alternatively, you can use a plain regex to fetch the number directly from the string:
preg_match('!\d+!', "https://review-test.com/#/c/123456/", $matches);
print_r($matches);
Simply:
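$parts = explode('/', rtrim($completeURL, '/'));
$tmp = end($parts);
It explodes the string by '/' (after trimming the trailing slash, which would otherwise leave an empty last element) and takes the last element.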
If you have no other option than regex, for your example data you could use preg_match to split your url instead of preg_replace.
An approach could be to
Capture the first part as a group (.+\/)
Then capture your number as a group (\d+)
Followed by a forward slash at the end of the line \/$
This will take the last number from the url followed by a forward slash.
Then you could use list and skip the first item of the $matches array because that will contain the text that matched the full pattern.
$completeURL = "https://review-test.com/#/c/123456/";
preg_match('/(.+\/)(\d+)\/$/', $completeURL, $matches);
list(, $url, $number) = $matches;
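One thing worth adding when you use this: check the return value of preg_match before unpacking $matches, otherwise list() runs on an empty array when the URL does not end in a number. A small sketch along those lines, using the same pattern and URL as above:

```php
<?php
$completeURL = 'https://review-test.com/#/c/123456/';

// preg_match() returns 1 on a match, 0 on no match and false on error.
if (preg_match('/(.+\/)(\d+)\/$/', $completeURL, $matches) === 1) {
    // Skip $matches[0] (the full match) and unpack the two groups.
    list(, $url, $number) = $matches;
    echo $url, "\n";    // https://review-test.com/#/c/
    echo $number, "\n"; // 123456
} else {
    echo "No trailing number found\n";
}
```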

Extract specific part of URL from string

I need to extract only part of a URL with PHP, but I am struggling to set the point where the extraction should stop. I used a regex to extract the entire URL from a longer string like this:
$regex = '/\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|$!:,.;]*[A-Z0-9+&@#\/%=~_|$]/i';
preg_match_all($regex, $href, $matches);
The result is the following string:
http://www.cambridgeenglish.org/test-your-english/&amp;sa=U&amp;ei=a4rbU8agB-zY0QWS_IGYDw&amp;ved=0CFEQFjAL&amp;usg=AFQjCNGU4FMUPB2ZuVM45OoqQ39rJbfveg
Now I want to extract only this bit: http://www.cambridgeenglish.org/test-your-english/. I basically need to get rid of everything from &amp onwards.
Does anyone have an idea how to achieve this? Do I need to run another regex, or can I add it to the initial one?
I would suggest you abandon regex and let PHP's own parse_url function do this for you:
http://php.net/manual/en/function.parse-url.php
$parsed = parse_url($url);
$my_url = $parsed['scheme'] . '://' . $parsed['host'] . $parsed['path'];
To get the substring of the path up to the &amp, try:
$parsed = parse_url($url);
$my_url = $parsed['scheme'] . '://' . $parsed['host'] . substr($parsed['path'], 0, strpos($parsed['path'], '&amp'));
The regex below gets rid of everything from the string &amp onwards. Your PHP code would be:
<?php
echo preg_replace('~&amp.*$~', '', 'http://www.cambridgeenglish.org/test-your-english/&amp;sa=U&amp;ei=a4rbU8agB-zY0QWS_IGYDw&amp;ved=0CFEQFjAL&amp;usg=AFQjCNGU4FMUPB2ZuVM45OoqQ39rJbfveg');
?> //=> http://www.cambridgeenglish.org/test-your-english/
Explanation:
&amp Matches the string &amp.
.* Matches any character zero or more times.
$ End of the line.
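If no regex is needed at all, the same cut can be done with strpos and substr. The one thing to watch is that strpos returns false when the marker is missing, and passing that straight to substr gives you an empty string. A minimal sketch; cut_before is a made-up name, and the sample URL is a shortened stand-in that is assumed to contain a literal &amp; as in the question.

```php
<?php
// Hypothetical helper: returns $url up to (not including) the first
// occurrence of $marker, or $url unchanged when the marker is absent.
function cut_before(string $url, string $marker): string
{
    $pos = strpos($url, $marker);
    return $pos === false ? $url : substr($url, 0, $pos);
}

// Shortened sample URL (assumption: the string contains a literal "&amp;").
$url = 'http://www.cambridgeenglish.org/test-your-english/&amp;sa=U&amp;ved=0CFEQFjAL';
echo cut_before($url, '&amp'), "\n"; // http://www.cambridgeenglish.org/test-your-english/
```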

Function to shorten a specific string

I have this string:
$str="http://ecx.images-amazon.com/images/I/418lsVTc0aL._SL110_.jpg";
Is there a built-in php function that can shorten it by removing the ._SL110_.jpg part, so that the result will be:
http://ecx.images-amazon.com/images/I/418lsVTc0aL
No, there is no built-in URL shortener function in PHP. If you want something similar, you can use a substring function, or create a function that generates a short link, stores the long and short values somewhere in a database, and displays only the short one.
Well, it depends on whether you need a regexp replace (if you don't know the complete value) or whether a simple str_replace like the one below is enough:
$str = str_replace("._SL110_.jpg", "", "http://ecx.images-amazon.com/images/I/418lsVTc0aL._SL110_.jpg");
You can use preg_replace().
For example preg_replace("/\.[^\.]+\.jpg$/i", "", $str);
I would recommend using:
$tmp = explode("._", $str);
and then using $tmp[0] for your purpose, if you make sure the part you want to get rid of is always separated by "._" (dot-underscore) symbols.
You can try
$str = "http://ecx.images-amazon.com/images/I/418lsVTc0aL._SL110_.jpg";
echo "<pre>";
A.
echo strrev(explode(".", strrev($str), 3)[2]) , PHP_EOL;
B.
echo pathinfo($str, PATHINFO_DIRNAME) . "/" . strstr(pathinfo($str, PATHINFO_FILENAME), ".", true), PHP_EOL;
C.
echo preg_replace(sprintf("/\.[^.]+\.%s$/i", preg_quote(pathinfo($str, PATHINFO_EXTENSION), "/")), "", $str), PHP_EOL;
Output
http://ecx.images-amazon.com/images/I/418lsVTc0aL
You could do this: substr($data, 0, strpos($data, "._")), if what you want is to strip everything from "._" onwards.
No, there is not (at least not directly). URL shorteners usually generate a unique ID and store it together with your original URL. When you open such a URL, a script looks up the given ID and then redirects to the target URL.
If you just want to cut off a portion of your string, then, assuming the filename format is as you have shown, look for the first dot and substr() up to that position. Or
$tmp = explode('.', $filename);
$shortName = $tmp[0];
If the suffix ._SL110_.jpg is always there, then simply str_replace('._SL110_.jpg', '', $filename) could work.
EDIT
The above was an example for the filename only. The whole code would be:
$url = "http://ecx.images-amazon.com/images/I/418lsVTc0aL._SL110_.jpg";
$urlTmp = explode('/', $url);
$fileNameTmp = explode( '.', $urlTmp[ count($urlTmp)-1 ] );
$urlTmp[ count($urlTmp)-1 ] = $fileNameTmp[0];
$newUrl = implode('/', $urlTmp );
printf("Old: %s\nNew: %s\n", $url, $newUrl);
gives:
Old: http://ecx.images-amazon.com/images/I/418lsVTc0aL._SL110_.jpg
New: http://ecx.images-amazon.com/images/I/418lsVTc0aL
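The explode/implode round trip above can also be written with two position lookups. This is just a sketch under the same assumption as the answer (the URL ends in a file name after its last slash); strip_filename_suffix is a hypothetical name.

```php
<?php
// Hypothetical helper: removes everything from the first "." of the
// file name onward, leaving scheme, host and directories untouched.
// Assumes the URL has a file name after its last "/".
function strip_filename_suffix(string $url): string
{
    $slash = strrpos($url, '/');
    $start = $slash === false ? 0 : $slash + 1; // where the file name begins
    $dot = strpos($url, '.', $start);           // first dot inside the file name
    return $dot === false ? $url : substr($url, 0, $dot);
}

echo strip_filename_suffix('http://ecx.images-amazon.com/images/I/418lsVTc0aL._SL110_.jpg'), "\n";
// http://ecx.images-amazon.com/images/I/418lsVTc0aL
```

Searching for the dot only after the last slash is what keeps the dots in the host name out of the picture.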

Function to remove GET variable with php

I have this URI:
http://localhost/index.php?properties&status=av&page=1
I am fetching the basename of the URI using the following code:
$basename = basename($_SERVER['REQUEST_URI']);
The above code gives me the following string:
index.php?properties&status=av&page=1
I want to remove the last variable from the string, i.e. &page=1. Please note that the value of page will not always be 1. Keeping this in mind, I want to trim the variable this way:
Trim from the end of the string back to the nearest delimiter, i.e. &
Update:
I would like to remove &page=1 from the string, no matter which position it is in.
How do I do this?
Instead of hacking around with regular expressions, you should parse the string as a URL (which is what it is):
$string = 'index.php?properties&status=av&page=1';
$parts = parse_url($string);
$queryParams = array();
parse_str($parts['query'], $queryParams);
Now just remove the parameter
unset($queryParams['page']);
and rebuild the url
$queryString = http_build_query($queryParams);
$url = $parts['path'] . '?' . $queryString;
There are many roads that lead to Rome. I'd do it with a RegEx:
$myString = 'index.php?properties&status=av&page=1';
$myNewString = preg_replace("/\&[a-z0-9]+=[0-9]+$/i","",$myString);
if you only want the &page=1-type parameters, the last line would be
$myNewString = preg_replace("/\&page=[0-9]+/i","",$myString);
if you also want to get rid of the possibility that page is the only or first parameter:
$myNewString = preg_replace("/[\&]*page=[0-9]+/i","",$myString);
Thank you guys, but I think I have found a better solution. @KingCrunch suggested a solution, which I extended and converted into a function. The function below can remove (unset) any URI variable without any regex hacks. I am posting it as it might help someone.
function unset_uri_var($variable, $uri) {
$parseUri = parse_url($uri);
$arrayUri = array();
parse_str($parseUri['query'], $arrayUri);
unset($arrayUri[$variable]);
$newUri = http_build_query($arrayUri);
$newUri = $parseUri['path'].'?'.$newUri;
return $newUri;
}
Now consider the following URI:
index.php?properties&status=av&page=1
//To remove properties variable
$url = unset_uri_var('properties', basename($_SERVER['REQUEST_URI']));
//Outputs index.php?status=av&page=1
//To remove page variable
$url = unset_uri_var('page', basename($_SERVER['REQUEST_URI']));
//Outputs index.php?properties=&status=av
Hope this helps someone, and thank you @KingCrunch for your solution :)
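Two edge cases are worth a guard if you reuse the function: a URI without any query string, and a query that is empty after the variable has been removed (which would leave a dangling "?"). A hedged sketch of such a variant follows. remove_query_var is a hypothetical name, and like the original it only rebuilds path and query, so it is meant for relative URIs such as the basename above.

```php
<?php
// Hypothetical variant of unset_uri_var() with guards for a missing
// query string and for an empty query after removal.
function remove_query_var(string $variable, string $uri): string
{
    $parts = parse_url($uri);
    if ($parts === false || !isset($parts['query'])) {
        return $uri; // nothing to remove
    }
    parse_str($parts['query'], $params);
    unset($params[$variable]);
    $path = $parts['path'] ?? '';
    $query = http_build_query($params);
    // Only append "?" when parameters remain.
    return $query === '' ? $path : $path . '?' . $query;
}

echo remove_query_var('page', 'index.php?properties&status=av&page=1'), "\n";
// index.php?properties=&status=av
echo remove_query_var('page', 'index.php?page=1'), "\n";
// index.php
```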
$pos = strrpos($_SERVER['REQUEST_URI'], '&');
$url = $pos === false ? $_SERVER['REQUEST_URI'] : substr($_SERVER['REQUEST_URI'], 0, $pos);
Documentation for strrpos.
A regex that matches the page parameter wherever it appears in the query string: /(&|(?<=\?))page=.*?(?=&|$)/. Here's example code:
$regex = '/(&|(?<=\?))page=.*?(?=&|$)/';
$urls = array(
'index.php?properties&status=av&page=1',
'index.php?properties&page=1&status=av',
'index.php?page=1',
);
foreach($urls as $url) {
echo preg_replace($regex, '', $url), "\n";
}
Output:
index.php?properties&status=av
index.php?properties&status=av
index.php?
Regex explanation:
(&|(?<=\?)) -- either match a & or a ?, but if it's a ?, don't put it in the match and just ignore it (you don't want urls like index.php&status=av)
page=.*? -- matches page=[...]
(?=&|$) -- look for a & or the end of the string ($), but don't include them for the replacement (this group helps the previous one find out exactly where to stop matching)
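As the third output line shows, the replacement can leave a trailing "?" behind, and when page is the first of several parameters it leaves "?&". If that matters to you, a small cleanup step after the regex takes care of both. A sketch only; strip_page_param is a made-up name.

```php
<?php
// Hypothetical wrapper around the regex above: removes "page" and then
// tidies up a leftover "?&" or a trailing "?".
function strip_page_param(string $url): string
{
    $url = preg_replace('/(&|(?<=\?))page=.*?(?=&|$)/', '', $url);
    $url = str_replace('?&', '?', $url); // "page" was the first of several parameters
    return rtrim($url, '?');             // "page" was the only parameter
}

echo strip_page_param('index.php?page=1&status=av'), "\n"; // index.php?status=av
echo strip_page_param('index.php?page=1'), "\n";           // index.php
```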
You could use a regex (as Chris suggests), but it's not the most efficient solution (the regex engine adds overhead). It's easy to do with some string parsing:
<?php
//$url="http://localhost/index.php?properties&status=av&page=1";
$base=basename($_SERVER['REQUEST_URI']);
echo "Basename yields: $base<br />";
//Find the last ampersand
$lastAmp=strrpos($base,"&");
//Filter, catch no ampersands found
$removeLast=($lastAmp===false?$base:substr($base,0,$lastAmp));
echo "Without Last Parameter: $removeLast<br />";
?>
The trick is: can you guarantee that page will always be at the end? If it is, great. If it isn't, what you asked for may not always solve the problem.

PHP preg_match between text and the first occurrence of -

I'm trying to grab the 12345 out of the following URL using preg_match.
$url = "http://www.somesite.com/directory/12345-this-is-the-rest-of-the-url.html";
$beg = "http://www.somesite.com/directory/";
$close = "\-";
preg_match("($beg(.*)$close)", $url, $matches);
I have tried multiple combinations of . * ? \b
Does anyone know how to extract 12345 out of the URL with preg_match?
Two things: first, you need preg_quote, and second, you need delimiters. Using your construction method:
$url = "http://www.somesite.com/directory/12345-this-is-the-rest-of-the-url.html";
$beg = preg_quote("http://www.somesite.com/directory/", '/');
$close = preg_quote("-", '/');
preg_match("/($beg(.*?)$close)/", $url, $matches);
But I would write the pattern slightly differently:
preg_match('/directory\/(\d+)-/i', $url, $match);
It only matches the directory part, is far more readable, and ensures that you only get digits back (no other characters).
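For completeness, here is how the result of that pattern could be read back out. The captured digits are in $match[1] ($match[0] holds the whole match), and it is worth checking the return value first. A minimal sketch:

```php
<?php
$url = 'http://www.somesite.com/directory/12345-this-is-the-rest-of-the-url.html';

// preg_match() returns 1 on a match, 0 on no match and false on error.
if (preg_match('/directory\/(\d+)-/i', $url, $match) === 1) {
    $id = (int) $match[1]; // the digits captured by (\d+)
    echo $id, "\n";        // 12345
} else {
    echo "No id found\n";
}
```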
This doesn't use preg_match but would achieve the same thing and would execute faster:
$url = "http://www.somesite.com/directory/12345-this-is-the-rest-of-the-url.html";
$url_segments = explode("/", $url);
$last_segment = array_pop($url_segments);
list($id) = explode("-", $last_segment);
echo $id; // Prints 12345
Too slow, I am ^^.
Well, if you are not stuck on preg_match, here is a fast and readable alternative:
$num = (int)substr($url, strlen($beg));
(Looking at your code, I guessed that the number you are looking for is a numeric ID, as is typical for URLs like that, and will not be "12abc" or anything else.)
