I am trying to strip everything from the last ? of a given URL onward, including the ? itself. I am currently working with preg_replace, but with no luck. This is the regex I am using to single out the last ?: #\/[^?]*$#. Also, is there a faster way using substr?
Example link:
preg_replace('#\/[^?]*$#', '', $post="www.example.com?26sf213132aasdf1312sdf31")
Desired Output
www.example.com
Here's how to do it with substr and strrpos:
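$post = "www.example.com?26sf213132aasdf1312sdf31";
$pos = strrpos($post, '?');
// strrpos() returns false when there is no "?"; keep the string unchanged in that case
$result = ($pos === false) ? $post : substr($post, 0, $pos);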
Use \? at the start of the regex instead of \/:
\?[^?]*$
\? matches a ?
[^?]*$ matches anything other than a ? until the end of string anchored by $
Example http://regex101.com/r/sW6jE7/3
$post = "www.example.com?26sf213132aasdf1312sdf31";
$res = preg_replace('/\?[^?]*$/', '', $post);
echo $res;
will give the output
www.example.com
EDIT
If you want to remove the entire query string from the URL, a slight modification of the regex will do the work:
\?.*$
which removes the first question mark and everything that follows it.
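The two patterns only behave differently when the string contains more than one ?. Here is a minimal sketch of that difference; the URL with two question marks is a made-up input, not taken from the question:

```php
<?php
// Hypothetical input with two question marks, to show where the patterns differ.
$url = 'www.example.com/page?first=1?second=2';

// Strips from the LAST "?" to the end of the string.
$lastOnly = preg_replace('/\?[^?]*$/', '', $url);

// Strips from the FIRST "?" to the end, i.e. the whole query string.
$wholeQuery = preg_replace('/\?.*$/', '', $url);

echo $lastOnly, "\n";   // www.example.com/page?first=1
echo $wholeQuery, "\n"; // www.example.com/page
```

For a URL with a single ?, as in the question, both patterns give the same result.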
Simply match everything from the start up to the ? symbol.
preg_match('/^[^?\n]*/', $post="www.example.com?26sf213132aasdf1312sdf31", $match);
echo $match[0];
Output:
www.example.com
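Try this; it works fine:
$host_url = "www.example.com?26sf213132aasdf1312sdf31";
$part_url = strrpos($host_url, '?');
// strrpos() returns false when there is no "?"; keep the URL unchanged in that case
$result = ($part_url === false) ? $host_url : substr($host_url, 0, $part_url);
echo $result;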
Although the OP tagged this as regex, surprisingly nobody has suggested explode(), which is much easier. Note that it splits at the first ?, not the last.
$post = "www.example.com?26sf213132aasdf1312sdf31";
$tokens = explode('?', $post);
echo $tokens[0]; // www.example.com
Related
I know it may sound like a common question, but I have difficulty understanding this process.
So I have this string:
http://domain.com/campaign/tgadv?redirect
And I need to get only the word "tgadv". But I don't know that the word is "tgadv", it could be whatever.
Also the url itself may change and become:
http://domain.com/campaign/tgadv
or
http://domain.com/campaign/tgadv/
So what I need is to create a function that will get whatever word is after campaign and before any other particular character. That's the logic..
The only certain thing is that the word will come after the word campaign/ and that any other character that will be after the word we are searching is a special one ( i.e. / or ? )
I tried understanding preg_match but really cannot get any good result from it..
Any help would be highly appreciated!
I would not use a regex for that. I would use parse_url and basename:
$bits = parse_url('http://domain.com/campaign/tgadv?redirect');
$filename = basename($bits['path']);
echo $filename;
However, if you want a regex solution, use something like this (note that this pattern only matches when the URL contains a ?):
$pattern = '~(.*)/(.*)(\?.*)~';
preg_match($pattern, 'http://domain.com/campaign/tgadv?redirect', $matches);
$filename = $matches[2];
echo $filename;
Actually, preg_match sounds like the perfect solution to this problem. I assume you are having problems with the regex?
Try something like this:
<?php
$url = "http://domain.com/campaign/tgadv/";
$pattern = "#campaign/([^/\?]+)#";
preg_match($pattern, $url, $matches);
// $matches[1] will contain tgadv.
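To check the pattern against all three URL shapes from the question, something like the following sketch could be used. The loop and the $words array are only there for illustration:

```php
<?php
// The three URL shapes mentioned in the question.
$urls = [
    'http://domain.com/campaign/tgadv?redirect',
    'http://domain.com/campaign/tgadv',
    'http://domain.com/campaign/tgadv/',
];

$words = [];
foreach ($urls as $url) {
    // Capture everything after "campaign/" up to the next "/" or "?".
    if (preg_match('#campaign/([^/?]+)#', $url, $matches)) {
        $words[] = $matches[1];
    }
}

print_r($words); // every entry is "tgadv"
```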
$path = "http://domain.com/campaign/tgadv?redirect";
$url_parts = parse_url($path);
// strrchr() returns "/tgadv" (including the slash), so trim the leading "/"
$tgadv = ltrim(strrchr($url_parts['path'], '/'), '/');
You don't really need a regex to accomplish this. You can do it using stripos() and substr().
For example:
$str = '....Your string...';
$offset = stripos($str, 'campaign/');
if ( $offset === false ){
//error, 'campaign/' wasn't found
}
$offset += strlen('campaign/');
$newStr = substr($str, $offset);
At this point $newStr will have all the text after 'campaign/'.
You then just need to use a similar process to find the special character position and use substr() to strip the string you want out.
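One way to finish the process described above is strcspn(), which gives the length of the leading part of a string that contains none of the listed characters. This is only a sketch; the sample URL and the variable names $needle, $rest and $word are assumptions for illustration:

```php
<?php
$str = 'http://domain.com/campaign/tgadv/asd?redirect'; // sample input (assumed)
$needle = 'campaign/';

$offset = stripos($str, $needle);
if ($offset === false) {
    $word = null; // 'campaign/' wasn't found
} else {
    // Everything after 'campaign/': "tgadv/asd?redirect"
    $rest = substr($str, $offset + strlen($needle));
    // strcspn() returns the length of the leading part containing no "/" or "?".
    $word = substr($rest, 0, strcspn($rest, '/?'));
}

echo $word; // tgadv
```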
You can also just use the good old string functions in this case, no need to involve regexps.
First find the string /campaign/, then take the substring with everything after it (tgadv/asd/whatever/?redirect), then find the next / or ? after the start of the string, and everything in between will be what you need (tgadv).
I am trying to create a regular expression to do the following (within a preg_replace)
$str = 'http://www.site.com&ID=1620'; // input
$str = 'http://www.site.com'; // desired result
How would I write a preg_replace to simply remove the &ID=1620 from the string (taking into account that the ID could have a variable length)?
thanks in advance
You could use...
$str = preg_replace('/[?&;]ID=\d+/', '', $str);
I'm assuming this is meant to be a normal URL, hence the [?&;]. If that's the case, the & should be a ?.
If it's part of a larger list of GET params, you are probably better off using...
parse_str($str, $params);
unset($params['ID']);
$str = http_build_query($params);
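Note that parse_str() expects only the query string, not a full URL. If $str holds a complete URL, a sketch along these lines may work; the sample URL is an assumption, and the rebuild step only handles the scheme, host, path and query parts:

```php
<?php
// Sample URL (assumed); "ID" is the parameter to drop.
$str = 'http://www.site.com/page?spam=eggs&ID=1620&foo=bar';

$parts = parse_url($str);
$params = [];
parse_str($parts['query'] ?? '', $params); // parse only the query string
unset($params['ID']);

// Rebuild the URL from the pieces this sample is known to have.
$result = $parts['scheme'] . '://' . $parts['host'] . ($parts['path'] ?? '');
if ($params) {
    $result .= '?' . http_build_query($params);
}

echo $result; // http://www.site.com/page?spam=eggs&foo=bar
```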
I'm guessing that & is not allowed as a character in the ID value. In that case, you can use
$result = preg_replace('/&ID=[^&]+/', '', $subject);
or (possibly better, thanks to PaulP.R.O.):
$result = preg_replace('/[?&]ID=[^&]+/', '', $subject);
This will remove &ID= (the second version would also remove ?ID=) plus any amount of characters that follow until the next & or end of string. This approach makes sure that any following attributes will be left alone:
$str = 'http://www.site.com?spam=eggs&ID=1620&foo=bar';
will be changed into
$str = 'http://www.site.com?spam=eggs&foo=bar';
You can just use parse_url
(that is if the URL is of the form: http://something.com?id1=1&id2=2):
$url = parse_url($str);
echo "http://{$url['host']}";
I'd like to replace more than one forward slash with one forward slash.
Examples:
this/is//an//example -> this/is/an/example
///another//example//// -> /another/example/
example.com///another//example//// -> example.com/another/example/
Thanks!
EDIT: This will be used to fix URLs that have more than one forward slash.
try
preg_replace('#/+#','/',$str);
or
preg_replace('#/{2,}#','/',$str);
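Tip: for such a simple replacement you can use str_replace, as it
replaces all occurrences of the search string with the replacement string. A single pass does not fully collapse runs of three or more slashes, so repeat it until the string stops changing:
str_replace('//','/',$str);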
You might want to use regex:
$modifiedString = preg_replace('|/{2,}|','/',$strToModify);
I use the {2,} instead of + to avoid replacing single '/'.
Use a regex to replace one or more /-es with /:
$string = preg_replace('#/+#', '/', $string);
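Since the question mentions fixing URLs, keep in mind that a plain /+ replacement would also collapse the // after a scheme such as http:. A possible sketch that skips a run of slashes directly following a colon is shown below; the function name collapse_slashes is made up for this example:

```php
<?php
// Collapse runs of slashes, but leave the "//" right after a scheme such as "http:".
// collapse_slashes() is a hypothetical helper name.
function collapse_slashes(string $s): string
{
    // (?<!:) skips a run that starts directly after a colon.
    return preg_replace('#(?<!:)/{2,}#', '/', $s);
}

echo collapse_slashes('this/is//an//example'), "\n";            // this/is/an/example
echo collapse_slashes('///another//example////'), "\n";         // /another/example/
echo collapse_slashes('http://example.com///another//x'), "\n"; // http://example.com/another/x
```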
I see you want to create a valid url... you might want to check out realpath, or maybe even better the snippet in the first comment:
$path = '../gallery/index/../../advent11/app/';
$pattern = '/\w+\/\.\.\//';
while(preg_match($pattern, $path)) {
$path = preg_replace($pattern, '', $path);
}
// $path == '../advent11/app/'
As you can see this also solves ../-es :)
I am currently building a breadcrumb. It works, for example, for
http://localhost/researchportal/proposal/
<?php
$url_comp = explode('/',substr($url,1,-1));
$end = count($url_comp);
print_r($url_comp);
foreach($url_comp as $breadcrumb) {
$landing="http://localhost/";
$surl .= $breadcrumb.'/';
if(--$end)
echo '
<a href='.$landing.''.$surl.'>'.$breadcrumb.'</a>»';
else
echo '
'.$breadcrumb.'';
};?>
But when I typed in http://localhost////researchportal////proposal//////////
All the formatting was gone as it confuses my code.
I need to have the site path in an array like ([1]->researchportal, [2]->proposal)
regardless of how many slashes I put.
So can $url_comp = explode('/',substr($url,1,-1)); be turned into a regular expression to get my desired output?
You don't need regex. Look at htmlentities() and stripslashes() in the PHP manual. A regex will return a boolean value of whatever it says, and won't really help you achieve what you are trying to do. All the regex can let you do is say if the string matches the regex do something. If you put in a regex requiring at least 2 characters between each slash, then any time anyone puts more than one consecutive slash in there, the if statement will stop.
http://ca3.php.net/manual/en/function.stripslashes.php
http://ca3.php.net/manual/en/function.htmlentities.php
Found this on the php manual.
It uses simple str_replace statements, modifying this should achieve exactly what your post was asking.
<?
function stripslashes2($string) {
$string = str_replace("\\\"", "\"", $string);
$string = str_replace("\\'", "'", $string);
$string = str_replace("\\\\", "\\", $string);
return $string;
}
?>
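For the breadcrumb question above, splitting on runs of slashes with preg_split() may be closer to what was asked. A minimal sketch, assuming $url holds only the path part of the address:

```php
<?php
// Path with repeated slashes, as in the question (assumed to be the path part only).
$url = '////researchportal////proposal//////////';

// Split on runs of "/" and drop the empty pieces.
$url_comp = preg_split('#/+#', $url, -1, PREG_SPLIT_NO_EMPTY);

print_r($url_comp); // researchportal, proposal
```

The resulting array is indexed from 0, so researchportal is at [0] and proposal at [1].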
Those regular expressions drive me crazy. I'm stuck with this one:
test1:[[link]] test2:[[gold|silver]] test3:[[out1[[inside]]out2]] test4:this|not
Task:
Remove all [[ and ]], and where there is an option split (|), choose the last option, so the output should be:
test1:link test2:silver test3:out1insideout2 test4:this|not
I came up with (PHP)
$text = preg_replace("/\\[\\[|\\]\\]/",'',$text); // remove [[ or ]]
This works for part 1 of the task, but I think I should do the option split before that. My best attempt:
$text = preg_replace("/\\[\\[(.*\|)(.*?)\\]\\]/",'$2',$text);
Result:
test1:silver test3:[[out1[[inside]]out2]] this|not
I'm stuck. Could someone with a few free minutes help me? Thanks!
I think the easiest way to do this would be multiple passes. Use a regular expression like:
\[\[(?:[^\[\]]*\|)?([^\[\]]+)\]\]
This will replace option strings to give you the last option from the group. If you run it repeatedly until it no longer matches, you should get the right result (the first pass will replace [[out1[[inside]]out2]] with [[out1insideout2]] and the second will ditch the brackets).
Edit 1: By way of explanation,
\[\[ # Opening [[
(?: # A non-capturing group (we don't want this bit)
[^\[\]] # Non-bracket characters
* # Zero or more of them
\| # A literal '|' character representing the end of the discarded options
)? # This group is optional: if there is only one option, it won't be present
( # The group we're actually interested in ($1)
[^\[\]] # All the non-bracket characters
+ # Must be at least one
) # End of $1
\]\] # End of the grouping.
Edit 2: Changed expression to ignore ']' as well as '[' (it works a bit better like that).
Edit 3: There is no need to know the number of nested brackets as you can do something like:
$oldtext = "";
$newtext = $text;
while ($newtext != $oldtext)
{
$oldtext = $newtext;
$newtext = preg_replace('/\[\[(?:[^\[\]]*\|)?([^\[\]]+)\]\]/', '$1', $oldtext);
}
$text = $newtext;
Basically, this keeps running the regular expression replace until the output is the same as the input.
Note that I don't know PHP, so there are probably syntax errors in the above.
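As a PHP sketch of the same multi-pass idea, the loop can also rely on the optional count argument of preg_replace() instead of comparing strings. The variable names here are only illustrative:

```php
<?php
$text = 'test1:[[link]] test2:[[gold|silver]] test3:[[out1[[inside]]out2]] test4:this|not';
$pattern = '/\[\[(?:[^\[\]]*\|)?([^\[\]]+)\]\]/';

// Repeat until a pass makes no replacement; $count is filled in by preg_replace().
do {
    $text = preg_replace($pattern, '$1', $text, -1, $count);
} while ($count > 0);

echo $text; // test1:link test2:silver test3:out1insideout2 test4:this|not
```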
This is impossible to do in one regular expression since you want to keep content in multiple "hierarchies" of the content. It would be possible otherwise, using a recursive regular expression.
Anyways, here's the simplest, most greedy regular expression I can think of. It should only replace if the content matches your exact requirements.
You will need to escape all backslashes when putting it into a string (\ becomes \\.)
\[\[((?:[^][|]+|(?!\[\[|]])[^|])++\|?)*]]
As others have already explained, you use this with multiple passes. Keep looping while there are matches, performing replacement (only keeping match group 1.)
Difference from other regular expressions here is that it will allow you to have single brackets in the content, without breaking:
test1:[[link]] test2:[[gold|si[lv]er]]
test3:[[out1[[in[si]de]]out2]] test4:this|not
becomes
test1:link test2:si[lv]er
test3:out1in[si]deout2 test4:this|not
Why try to do it all in one go? Remove the [[]] first and then deal with options; do it in two lines of code.
When trying to get something going favour clarity and simplicity.
Seems like you have all the pieces.
Why not just simply remove any brackets that are left?
$str = 'test1:[[link]] test2:[[gold|silver]] test3:[[out1[[inside]]out2]] test4:this|not';
$str = preg_replace('/\\[\\[(?:[^|\\]]+\\|)+([^\\]]+)\\]\\]/', '$1', $str);
$str = str_replace(array('[', ']'), '', $str);
Well, I didn't stick to just regex, because I'm of a mind that trying to do stuff like this with one big regex leads you to the old joke about "Now you have two problems". However, give something like this a shot:
$str = 'test1:[[link]] test2:[[gold|silver]] test3:[[out1[[inside]]out2]] test4:this|not';
$reg = '/(.*?):(.*?)( |$)/';
preg_match_all($reg, $str, $m);
foreach($m[2] as $pos => $match) {
if (strpos($match, '|') !== FALSE && strpos($match, '[[') !== FALSE ) {
$opt = explode('|', $match);
$match = $opt[count($opt)-1];
}
$m[2][$pos] = str_replace(array('[', ']'),'', $match );
}
foreach($m[1] as $k=>$v) $result[$k] = $v.':'.$m[2][$k];
This is C# using only verbatim (non-escaped) strings, hence you will have to double the backslashes in other languages.
String input = "test1:[[link]] " +
"test2:[[gold|silver]] " +
"test3:[[out1[[inside]]out2]] " +
"test4:this|not";
String step1 = Regex.Replace(input, @"\[\[([^|\]]+)\|([^\]]+)\]\]", @"[[$2]]");
String step2 = Regex.Replace(step1, @"\[\[|\]\]", String.Empty);
// Prints "test1:link test2:silver test3:out1insideout2 test4:this|not"
Console.WriteLine(step2);
$str = 'test1:[[link]] test2:[[gold|silver]] test3:[[out1[[inside]]out2]] test4:this|not';
$s = preg_split("/\s+/",$str);
foreach ($s as $k=>$v){
$v = preg_replace("/\[\[|\]\]/","",$v);
$j = explode(":",$v);
$j[1]=preg_replace("/.*\|/","",$j[1]); // note: this also drops text before a | outside [[...]], so test4:this|not becomes test4:not
print implode(":",$j)."\n";
}