PHP preg_match getting string between 2 chars - php

Please consider this string:
$string = 'hello world /foo bar/';
The end result I wish to obtain:
$result1 = 'hello world';
$result2 = 'foo bar';
What I've tried:
preg_match('/\/(.*?)\//', $string, $match);
The trouble is that this only returns "foo bar" and not "hello world". I could strip "/foo bar/" from the original string, but in my real use case that would take two additional steps.

A regular expression only matches what you tell it to match. So you need to have it match everything, including the /s, and capture the two parts you want in groups.
This should do it:
$string = 'hello world /foo bar/';
preg_match('~(.+?)\h*/(.*?)/~', $string, $match);
print_r($match);
PHP Demo: https://eval.in/507636
Regex101: https://regex101.com/r/oL5sX9/1 (the regex101 version escapes the / delimiters; the PHP code uses ~ as the delimiter instead)
Index 0 is the entire match, 1 is the first group, and 2 is the second group. So the text between the /s is $match[2] and hello world is $match[1]. The \h* matches any horizontal whitespace before the /; if you want that whitespace in the first group, remove the \h*. The . matches whitespace too (excluding newlines unless you add the s modifier).
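As a minimal sketch of how the two groups could be used, the pattern can be wrapped in a small function. The name split_on_slashes and the added ^ anchor are assumptions made for this illustration; they are not part of the answer above.

```php
<?php
// Hypothetical helper: splits "text /inner/" into the text before the
// slashes and the text between them. Returns null when nothing matches.
function split_on_slashes(string $input): ?array
{
    // Group 1: text before the first "/", without trailing horizontal whitespace.
    // Group 2: text between the first pair of "/".
    if (preg_match('~^(.+?)\h*/(.*?)/~', $input, $m) !== 1) {
        return null;
    }
    return ['before' => $m[1], 'between' => $m[2]];
}

print_r(split_on_slashes('hello world /foo bar/'));
```

Returning null on a failed match keeps the caller from reading $match indexes that do not exist.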

$result = explode("/", $string);
results in
$result[0] == 'hello world ';
$result[1] == 'foo bar';
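You might want to trim the trailing space from 'hello world '. Because the string ends with a /, explode() also returns an empty string as a third element. More info here: http://php.net/manual/de/function.explode.php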

To avoid the trailing space, use the code below.
$string = 'hello world /foo bar/';
$returnValue = str_replace(' /', '/', $string);
$pieces = explode("/", $returnValue);
To print the parts, add these lines to your code.
echo $pieces[0]; // hello world
echo $pieces[1]; // foo bar
https://eval.in/507650
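A possible variation on the explode() approach, shown only as a sketch: trimming every piece and dropping the empty ones avoids both the trailing space and the empty last element. The variable names are illustrative.

```php
<?php
$string = 'hello world /foo bar/';

// explode() yields 'hello world ', 'foo bar' and '' (the string ends with "/").
$pieces = array_map('trim', explode('/', $string));

// Drop empty pieces and reindex from 0.
$pieces = array_values(array_filter($pieces, function ($piece) {
    return $piece !== '';
}));

print_r($pieces);
```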

Related

How to not perform preg_replace if subject starts with quote

I'm trying to convert plain links to HTML links using preg_replace. However it's replacing links that are already converted.
To combat this I'd like it to ignore the replacement if the link starts with a quote.
I think a positive lookahead may be needed but everything I've tried hasn't worked.
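$string = '<a href="http://www.example.com">test</a> http://www.example.com';
$string = preg_replace("/((https?:\/\/[\w]+[^ \,\"\n\r\t<]*))/is", "<a href=\"$1\">$1</a>", $string);
var_dump($string);
The above outputs:
<a href="<a href="http://www.example.com">http://www.example.com</a>">test</a> <a href="http://www.example.com">http://www.example.com</a>
When it should output:
<a href="http://www.example.com">test</a> <a href="http://www.example.com">http://www.example.com</a>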
You might get along with lookarounds.
Lookarounds are zero-width assertions that make sure to match/not to match anything immediately around the string in question. They do not consume any characters.
That being said, a negative lookbehind might be what you need in your situation:
(?<![">])\bhttps?://\S+\b
In PHP this would be:
<?php
$string = 'I want to be transformed to a proper link: http://www.google.com ';
$string .= 'But please leave me alone ';
$string .= '(https://www.google.com).';
$regex = '~ # delimiter
(?<![">]) # a neg. lookbehind
https?://\S+ # http:// or https:// followed by not a whitespace
\b # a word boundary
~x'; # verbose to enable this explanation.
$string = preg_replace($regex, "<a href='$0'>$0</a>", $string);
echo $string;
?>
See a demo on ideone.com. However, maybe a parser is more appropriate.
Since you can use Arrays in preg_replace, this might be convenient to use depending on what you want to achieve:
<?php
$string = 'test http://www.example.com';
$rx = array("&(<a.+https?:\/\/[\w]+[^ \,\"\n\r\t<]*>)(.*)(<\/a\>)&si", "&(\s){1,}(https?:\/\/[\w]+[^ \,\"\n\r\t<]*)&");
$rp = array("$1$2$3", "$2");
$string = preg_replace($rx,$rp, $string);
var_dump($string);
// DUMPS:
// 'testhttp://www.example.com'
The Idea
You can split your string at the already existing anchors, and only parse the pieces in between.
The Code
$input = '<a href="http://www.example.com">test</a> http://www.example.com';
// Split the string at existing anchors.
// The PREG_SPLIT_DELIM_CAPTURE flag includes the delimiters in the result set;
// it is the fourth argument, so the limit (-1 = no limit) has to be passed first.
$parts = preg_split('/(<a.*?>.*?<\/a>)/is', $input, -1, PREG_SPLIT_DELIM_CAPTURE);
// Use array_map to parse each piece, and then join all pieces together
$output = join(array_map(function ($key, $part) {
// Because the delimiters are included in the result set,
// every $part with an odd key is an existing anchor: leave it untouched.
return $key % 2
? $part
: preg_replace("/((https?:\/\/[\w]+[^ \,\"\n\r\t<]*))/is", "<a href=\"$1\">$1</a>", $part);
}, array_keys($parts), $parts));

Remove hashtags from the end of a sentence

I would like to remove all words at the end of a text that start with a space and a # sign.
URLs or hashtags within a sentence should not be removed.
Example text:
hello world #dontremoveme foobar http://example.com/#dontremoveme #remove #removeme #removeüäüö
I tried this but it removes all hashtags:
$tweet = "hello world #dontremoveme foobar http://example.com/#dontremoveme #remove #removeme #removeüäüö";
preg_match_all("/(#\w+)/", $tweet, $matches);
var_dump( $matches );
My idea is to check every word starting at the end of the text for a leading # with a space in front, until it's no longer the case.
How to translate that into a regular expression?
You could use something like this: ( #[^# ]+?)+$ and replace the match with an empty string.
Since you have non-ASCII characters, the negated class [^# ] (which matches any character except # and a space) handles them without having to list them.
The following regex matches all words starting with [Space]# at the end of the line. (The g flag is regex tester notation; PHP has no g modifier, and preg_replace already replaces every match.)
/( #\S+)*$/g
https://regex101.com/r/eH4bJ2/1
This will do the job:
$tweet = "hello world #dontremoveme foobar http://example.com/#dontremoveme #remove #removeme #removeüäüö";
$res = preg_replace("/ #\p{L}+\b(?!\s+\p{L})/u", '', $tweet);
echo $res,"\n";
Output:
hello world #dontremoveme foobar http://example.com/#dontremoveme
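The idea behind the answers above (remove only the run of " #tag" words that reaches the end of the text) could also be sketched like this. The function name strip_trailing_hashtags is hypothetical, and using \S+ assumes that anything up to the next whitespace counts as part of the tag.

```php
<?php
// Hypothetical helper: removes the run of hashtags that ends the text.
function strip_trailing_hashtags(string $text): string
{
    // (?:\s+#\S+)+  one or more " #tag" words
    // \s*\z         optional trailing whitespace, then the very end of the string
    // u modifier    treat pattern and subject as UTF-8
    return preg_replace('/(?:\s+#\S+)+\s*\z/u', '', $text);
}

$tweet = "hello world #dontremoveme foobar http://example.com/#dontremoveme #remove #removeme #removeüäüö";
echo strip_trailing_hashtags($tweet), "\n";
```

Hashtags inside the sentence survive because the match has to reach the end of the string, and the # in the URL is never preceded by whitespace.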

Regex in PHP: Replacing text between strings

Okay I have made some progress on a problem I am solving, but need some help with a small glitch.
I need to remove all characters from the filenames in the specific path images/prices/ BEFORE the first digit, except for where there is from_, in which case remove all characters from the filename BEFORE from_.
Examples:
BEFORE AFTER
images/prices/abcde40.gif > images/prices/40.gif
images/prices/UgfVe5559.gif > images/prices/5559.gif
images/prices/wedsxcdfrom_88457.gif > images/prices/from_88457.gif
What I've done:
$pattern = '%images/(.+?)/([^0-9]+?)(from_|)([0-9]+?)\.gif%';
$replace = 'images/\\1/\\3\\4.gif';
$string = "AAA images/prices/abcde40.gif BBB images/prices/wedsxcdfrom_88457.gif CCC images/prices/UgfVe5559.gif DDD";
$newstring = str_ireplace('from_','733694521548',$string);
while(preg_match($pattern,$newstring)){
$newstring=preg_replace($pattern,$replace,$newstring);
}
$newstring=str_ireplace('733694521548','from_',$newstring);
echo "Original:\n$string\n\nNew:\n$newstring";
My expected output is:
AAA images/prices/40.gif BBB images/prices/from_88457.gif CCC images/prices/5559.gif DDD
But instead I am getting:
AAA images/prices/40.gif BBB images/from_88457.gif CCC images/5559.gif DDD
The prices/ part of the path is missing from the last two paths.
Note that the AAA, BBB etc. portions are just placeholders. In reality the paths are scattered all across a raw HTML file parsed into a string, so we cannot rely on any pattern in between occurrences of the text to be replaced.
Also, I know the method I am using of substituting from_ is hacky, but this is purely for a local file operation and not for a production server, so I am okay with it. However if there is a better way, I am all ears!
Thanks for any assistance.
You can use lookaround assertions:
preg_replace('~(?<=/)(?:([a-z]+)(?=\d+\.gif)|(\w+)(?=from_))~i', '', $value);
Explanation:
(?<=/) # If preceded by a '/':
(?: # Begin group
([a-z]+) # Match letters from a-z, one or more times
(?=\d+\.gif) # If followed by digit(s) and '.gif'
| # OR
(\w+) # Match word characters, one or more times
(?=from_) # If followed by 'from_'
) # End group
Code:
$pattern = '~(?<=/)(?:([a-z]+)(?=\d+\.gif)|(\w+)(?=from_))~i';
echo preg_replace($pattern, '', $string);
Demo
You can use this regex for replacement:
^(images/prices/)\D*?(from_)?(\d+\..+)$
And use this expression for replacement:
$1$2$3
RegEx Demo
Code:
$re = '~^(images/prices/)\D*?(from_)?(\d+\..+)$~m';
$str = "images/prices/abcde40.gif\nimages/prices/UgfVe5559.gif\nimages/prices/wedsxcdfrom_88457.gif";
$result = preg_replace($re, '$1$2$3', $str);
You can also try lookarounds. Just replace the match with an empty string.
(?<=^images\/prices\/).*?(?=(from_)?\d+\.gif$)
regex101 demo
Sample code: (directly from above site)
$re = "/(?<=^images\\/prices\\/).*?(?=(from_)?\\d+\\.gif$)/m";
$str = "images/prices/abcde40.gif\nimages/prices/UgfVe5559.gif\nimages/prices/wedsxcdfrom_88457.gif";
$subst = '';
$result = preg_replace($re, $subst, $str);
If the string is not multi-line, use \b as a word boundary instead of ^ and $, which match the start and end of the line/string.
(?<=\bimages\/prices\/).*?(?=(from_)?\d+\.gif\b)
$arr = array(
'images/prices/abcde40.gif',
'images/prices/UgfVe5559.gif',
'images/prices/wedsxcdfrom_88457.gif'
);
foreach($arr as $str){
echo preg_replace('#images/prices/.*?((from_|\d).*)#i','images/prices/$1',$str);
}
DEMO
EDIT:
$str = 'AAA images/prices/abcde40.gif BBB images/prices/wedsxcdfrom_88457.gif CCC images/prices/UgfVe5559.gif DDD';
echo preg_replace('#images/prices/.*?((from_|\d).*?\s|$)#i','images/prices/$1',$str), PHP_EOL;
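For paths scattered through a larger string, a single replacement without the placeholder trick from the question might look like the sketch below. The function name clean_price_paths is hypothetical, and the pattern assumes the filenames contain no whitespace, slashes, or digits before the part that should be kept.

```php
<?php
// Hypothetical helper: strips the unwanted filename prefix in every
// images/prices/ path found anywhere in the input.
function clean_price_paths(string $html): string
{
    // Group 1: the fixed directory prefix.
    // [^/\s\d]*? lazily skips the unwanted leading characters.
    // Group 2: an optional "from_", then the digits and the ".gif" extension.
    $pattern = '~(images/prices/)[^/\s\d]*?((?:from_)?\d+\.gif)~i';
    return preg_replace($pattern, '$1$2', $html);
}

$string = "AAA images/prices/abcde40.gif BBB images/prices/wedsxcdfrom_88457.gif CCC images/prices/UgfVe5559.gif DDD";
echo clean_price_paths($string), "\n";
```

Because the skipped prefix is matched lazily, the optional from_ is tried at every position before another character is skipped, so it ends up in the kept group rather than being discarded.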

PHP string replace match whole word

I would like to replace only complete words using PHP.
Example:
If I have
$text = "Hello hellol hello, Helloz";
and I use
$newtext = str_replace("Hello",'NEW',$text);
The new text should look like
NEW hellol hello, Helloz
PHP returns
NEW hellol hello, NEWz
Thanks.
You want to use regular expressions. The \b matches a word boundary.
$text = preg_replace('/\bHello\b/', 'NEW', $text);
If $text contains UTF-8 text, you'll have to add the Unicode modifier "u", so that non-latin characters are not misinterpreted as word boundaries:
$text = preg_replace('/\bHello\b/u', 'NEW', $text);
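When the word to replace comes from a variable, it may be worth escaping it before building the pattern. The following is only a sketch: replace_whole_word is a hypothetical helper, and it assumes the word begins and ends with a word character, since \b otherwise does not behave as a whole-word test.

```php
<?php
// Hypothetical helper: replaces $word only where it occurs as a whole word.
function replace_whole_word(string $text, string $word, string $replacement): string
{
    // preg_quote() escapes regex metacharacters in $word and the "/" delimiter.
    $pattern = '/\b' . preg_quote($word, '/') . '\b/u';

    // A callback returns the replacement verbatim, so "$1" or "\\1" inside
    // $replacement is not interpreted as a backreference.
    return preg_replace_callback($pattern, function (array $m) use ($replacement) {
        return $replacement;
    }, $text);
}

echo replace_whole_word('Hello hellol hello, Helloz', 'Hello', 'NEW'), "\n";
```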
Multiple words in a string can be replaced like this:
$String = 'Team Members are committed to delivering quality service for all buyers and sellers.';
echo $String;
echo "<br>";
$String = preg_replace(array('/\bTeam\b/','/\bfor\b/','/\ball\b/'),array('Our','to','both'),$String);
echo $String;
Result: Our Members are committed to delivering quality service to both buyers and sellers.
Array replacement list: In case your replacement strings are substituting each other, you need preg_replace_callback.
$pairs = ["one"=>"two", "two"=>"three", "three"=>"one"];
$r = preg_replace_callback(
"/\w+/", # only match whole words
function($m) use ($pairs) {
if (isset($pairs[$m[0]])) { # optional: strtolower
return $pairs[$m[0]];
}
else {
return $m[0]; # keep unreplaced
}
},
$source
);
For efficiency, /\w+/ could be replaced with a key list such as /\b(one|two|three)\b/i.
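The key-list variant could be built from the array itself, roughly as follows. This is a sketch under assumptions: swap_words is a hypothetical name, and the match is kept case-sensitive so that every match is guaranteed to exist as a key in $pairs.

```php
<?php
// Hypothetical helper: swaps whole words according to $pairs in one pass,
// so a replacement is never replaced again.
function swap_words(string $source, array $pairs): string
{
    // Build "one|two|three" from the keys, escaping regex metacharacters.
    $alternatives = implode('|', array_map(function ($word) {
        return preg_quote((string) $word, '/');
    }, array_keys($pairs)));

    $pattern = '/\b(?:' . $alternatives . ')\b/';

    return preg_replace_callback($pattern, function (array $m) use ($pairs) {
        return $pairs[$m[0]];
    }, $source);
}

$pairs = ["one" => "two", "two" => "three", "three" => "one"];
echo swap_words('one two three, someone', $pairs), "\n";
```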
You can also use the T-Regx library, which quotes $ and \ characters in the replacement:
<?php
$text = pattern('\bHello\b')->replace($text)->all()->with('NEW');

preg_match all words start with an #?

I'm not very familiar with regular expressions, so I have to ask:
How can I find out with PHP whether a string contains a word starting with #?
E.g. I have a string like "This is for #codeworxx".
I'm sorry, but I have no starting point for that :(
Hope you can help.
Thanks,
Sascha
Okay, thanks for the answers, but I made a mistake: how do I implement this in eregi_replace?
$text = eregi_replace('/\B#[^\B]+/','\\1', $text);
This does not work. Why? Don't I have to use the same expression as the pattern?
Match anything which has some whitespace in front of a # followed by something other than whitespace:
$ cat 1812901.php
<?php
echo preg_match("/\B#[^\B]+/", "This should #match it");
echo preg_match("/\B#[^\B]+/", "This should not# match");
echo preg_match("/\B#[^\B]+/", "This should match nothing and return 0");
echo "\n";
?>
$ php 1812901.php
100
Break your string up like this:
$string = 'simple sentence with five words';
$words = explode(' ', $string );
Then you can loop through the array and check whether the first character of each word equals "#":
if ($stringInTheArray[0] == "#")
Assuming you define a word as a sequence of characters with no whitespace between them, this should be a good starting point for you:
$subject = "This is for #codeworxx";
$pattern = '/\s*#(.+?)(?:\s|$)/';
preg_match($pattern, $subject, $matches);
print_r($matches);
Explanation:
\s*#(.+?)(?:\s|$) - look for anything starting with #, and group all the following letters, numbers, and anything else which is not whitespace (space, tab, newline), up to the closest whitespace or the end of the string.
See the output of the $matches array for accessing the inner groups and the regex results.
@OP, no need for a regex. Just use PHP string functions:
$mystr='This is for #codeworxx';
$str = explode(" ",$mystr);
foreach($str as $k=>$word){
if(substr($word,0,1)=="#"){
print $word;
}
}
Just in case this is helpful to someone in the future:
/((?<!\S)#\w+(?!\S))/
This will match any word made up of word characters (letters, digits, underscore) that starts with "#". It will not match words with "#" anywhere but the start of the word.
Matching cases:
#username
foo #username bar
foo #username1 bar #username2
Failing cases:
foo#username
#username$
##username
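To collect every matching hashtag rather than only test for one, the same pattern can be passed to preg_match_all. This is a sketch: find_hashtags is a hypothetical name, and the u modifier is an addition that assumes UTF-8 input.

```php
<?php
// Hypothetical helper: returns all "#word" tokens that stand alone,
// i.e. are delimited by whitespace or the string boundaries.
function find_hashtags(string $text): array
{
    // (?<!\S) no non-whitespace character directly before the "#"
    // (?!\S)  no non-whitespace character directly after the word
    preg_match_all('/(?<!\S)#\w+(?!\S)/u', $text, $matches);
    return $matches[0];
}

print_r(find_hashtags('foo #username1 bar #username2'));
```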
