How can I match the three words in the following string with a Perl compatible regular expression?
word1#$word2#$word3
I don't know the actual words "word1, word2 and word3" in advance. I only know the separator, which is #$.
And I can't use the word boundary, as I have a multibyte encoding. This means, for instance, that the string can contain non-ASCII characters like umlauts, which are not matched by the \w character class.
Try this regular expression:
/(\w+)#\$(\w+)#\$(\w+)/
Edit: After you provided us with more information (see the comments to this answer):
/((?:[^#]+|#[^$])*)#\$((?:[^#]+|#[^$])*)#\$((?:[^#]+|#[^$])*)/
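To see how the second pattern behaves, here is a minimal PHP sketch. The sample input is my own assumption (not from the question): the words contain umlauts and the middle word contains a lone # to show that only the full #$ sequence acts as a separator.

```php
<?php
// Hypothetical sample input: umlauts plus a lone '#' inside the second word.
$input = 'Größe#$a#b#$Übung';

// Same pattern as above: each group takes everything up to the next "#$".
$pattern = '/((?:[^#]+|#[^$])*)#\$((?:[^#]+|#[^$])*)#\$((?:[^#]+|#[^$])*)/';

// preg_match() returns 1 on a match and fills $m with the capture groups.
$matched = preg_match($pattern, $input, $m);

// Drop the full match at index 0 and keep only the three words.
$words = array_slice($m, 1);
print_r($words);
```

Because the separator consists of ASCII characters only, this should be safe for UTF-8 input even without the /u modifier; other multibyte encodings would need checking.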
#!/usr/bin/perl
use strict;
use warnings;
my $x = 'word1#$word2#$word3';
print $_, "\n" for split /#\$/, $x;
$str = explode('#$', $str);
Regex is overkill for this.
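A short sketch of the explode() approach, assuming UTF-8 input (the sample words are made up for illustration). Note the single quotes around the separator, so PHP does not try to interpolate a variable after the $.

```php
<?php
// Hypothetical input with non-ASCII words, assumed to be UTF-8.
$str = 'Äpfel#$Birnen#$Öl';

// explode() splits on the literal two-character separator, no regex involved.
$parts = explode('#$', $str);
print_r($parts);
```

Since the number of words is not hard-coded, this also works for strings with more or fewer than three words.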
A split function might be useful, although it depends on what you want to do with the line.
Here is an example, though:
my $line = 'word1#$word2#$word3';
my @words = split /#\$/, $line;
This will work for any string that has two #$ separators:
/([^#]+)\#\$([^#]+)\#\$([^#]+)/
/([^#]*?)#\$([^#]*?)#\$([^#]*)/
$str="Your LaTeX document can \DIFaddbegin \DIFadd{test}\DIFaddend be easily
and the text can have multiple lines in it like this\DIFaddbegin \DIFadd{test2}
\DIFaddend"
I need to convert all \DIFaddbegin \DIFadd{test}\DIFaddend to \added{test}.
I tried
$o= preg_replace_callback('/\\DIFaddbegin\\s\DIFadd{(.*?)}\\DIFaddend/',
function($m) {return preg_replace('/$m[0]/','\added{$m[1]}',$m[0]);},$str);
But no luck. Which would be correct pattern for this? And also even if the string contains new line character the pattern should work.
You don't need a callback; using preg_replace() is fine for this task. To match a single backslash you need to double-escape it, which means writing \\\\ in the pattern string. To match possible whitespace between the substrings, you can use \s*, which means whitespace "zero or more" times.
$str = preg_replace('~\\\\DIFaddbegin\s*\\\\DIFadd({[^}]*})\s*\\\\DIFaddend~', '\added$1', $str);
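As a quick check, here is that replacement applied to the sample text from the question. This is only a sketch: the text is put into a nowdoc (my choice, not part of the question) so that the backslashes stay literal, and it assumes the file uses plain \n line endings.

```php
<?php
// Sample text from the question; a nowdoc keeps every backslash literal.
$str = <<<'TEX'
Your LaTeX document can \DIFaddbegin \DIFadd{test}\DIFaddend be easily
and the text can have multiple lines in it like this\DIFaddbegin \DIFadd{test2}
\DIFaddend
TEX;

// \\\\ in the PHP string becomes \\ in the regex, which matches one backslash.
// \s* also matches the line break before the second \DIFaddend.
$result = preg_replace(
    '~\\\\DIFaddbegin\s*\\\\DIFadd({[^}]*})\s*\\\\DIFaddend~',
    '\added$1',
    $str
);

echo $result, "\n";
```

The capture group includes the braces, which is why the replacement is '\added$1' rather than '\added{$1}'.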
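Try this (the lazy (.*?) stops at the first closing brace, and \s* allows a line break before \DIFaddend):
$new_str = preg_replace("/\\\\DIFaddbegin\s*\\\\DIFadd\{(.*?)\}\s*\\\\DIFaddend/s", "\\added{\$1}", $str);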
I am trying to parse a badly formed html table:
A couple of lines of this are:
Food:</b> Yes<b><br>
Pool: </b>Beach<b></b><b><br>
Centre:</b> Yes<b><br>
After spending a lot of time on this with XPath, I think it is probably better to split the above text into lines using preg_split and parse from there.
The pattern I think would work uses:
<\b><\br>*: <\b>
my code is as follows:
$pattern='</b></br>*:</b>';
$pattern=preg_quote($pattern,'#');
$chars = preg_split($pattern, $output);
print_r($chars);
I am getting the following error:
Delimiter must not be alphanumeric or backslash
What am I doing wrong?
Try this:
$pattern='</b></br>*:</b>';
$pattern=preg_quote($pattern,'#');
$chars = preg_split('#'.$pattern.'#', $output);
print_r($chars);
The preg_quote function just makes it safely escaped, it doesn't actually add the delimiters for you.
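To illustrate the division of labour between preg_quote() and the delimiters, here is a small sketch. The input string and the literal separator '<b><br>' are assumptions made up for the example, not the asker's real data.

```php
<?php
// Hypothetical input: two table cells run together on one line.
$output = 'Food:</b> Yes<b><br>Pool: </b>Beach<b><br>';

// Hypothetical literal separator; preg_quote() escapes its regex metacharacters
// and, via the second argument, the delimiter character '#'.
$quoted = preg_quote('<b><br>', '#');

// The delimiters still have to be added by hand around the quoted text.
$chunks = preg_split('#' . $quoted . '#', $output, -1, PREG_SPLIT_NO_EMPTY);
print_r($chunks);
```

PREG_SPLIT_NO_EMPTY just drops the empty piece after the final separator.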
As other people will surely point out, using regular expressions is not a good way to parse HTML :)
Your regular expression is also not going to match what you hope. Here's a version that will probably work for your input:
$in = " Pool: </b>Beach<b></b><b><br>";
$out = explode(':', strip_tags($in));
$key = trim($out[0]);
$value = trim($out[1]);
echo "$key = $value\n";
This removes all the HTML, then splits on the colon, and then removes any surrounding whitespace.
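The same idea extended to the three lines from the question might look like the sketch below. The $lines array and the $table name are assumptions for illustration; in practice the lines would come from wherever the HTML is read.

```php
<?php
// The three sample lines from the question, assumed to be already split up.
$lines = [
    'Food:</b> Yes<b><br>',
    'Pool: </b>Beach<b></b><b><br>',
    'Centre:</b> Yes<b><br>',
];

$table = [];
foreach ($lines as $line) {
    // Limit of 2 keeps any further colons inside the value.
    $pair = explode(':', strip_tags($line), 2);
    if (count($pair) === 2) {
        $table[trim($pair[0])] = trim($pair[1]);
    }
}

print_r($table);
```

Lines without a colon are skipped rather than producing an undefined-index notice.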
Your pattern needs to start and end with a delimiter; looks like you're using # if I'm reading this correctly, so you should have $pattern = '#</b></br>.*:</b>#';.
Also, you're mixing things up; * is not a simple wildcard in regex. If you mean "any number of any characters," the pattern you need is .*. I've included this above.
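A tiny sketch of the difference between * and .* (the sample text is made up): in the first pattern the * only repeats the > in front of it, so it cannot skip over the word "Pool".

```php
<?php
// Hypothetical sample text.
$text = '<br>Pool:</b>';

// '>*' means "zero or more '>' characters", so 'Pool' is never consumed.
$star = preg_match('#<br>*:</b>#', $text);

// '.*' means "zero or more of any character", which covers 'Pool'.
$dotStar = preg_match('#<br>.*:</b>#', $text);

var_dump($star, $dotStar);
```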
Let's say I have the following string:
getPasswordLastChangedDatetime
How would I be able to split that up by capital letters so that I would be able to get:
get
Password
Last
Changed
Datetime
If you only care about ASCII characters:
$parts = preg_split("/(?=[A-Z])/", $str);
The (?=...) construct is called a lookahead.
This works if the parts only contain a capital character at the beginning. It gets more complicated if you have things like getHTMLString. This could be matched by:
$parts = preg_split("/((?<=[a-z])(?=[A-Z])|(?=[A-Z][a-z]))/", $str);
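For reference, a short sketch running both patterns; the two input strings are the ones mentioned above.

```php
<?php
// Simple case: every part starts with exactly one capital letter.
$simple = preg_split('/(?=[A-Z])/', 'getPasswordLastChangedDatetime');

// Acronym case: split after a lowercase letter, or before a capital
// that is followed by a lowercase letter.
$acronym = preg_split('/((?<=[a-z])(?=[A-Z])|(?=[A-Z][a-z]))/', 'getHTMLString');

print_r($simple);
print_r($acronym);
```

Both patterns only ever match empty positions, so no characters are lost in the split.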
Asked this a little too soon, found this:
preg_replace('/(?!^)[[:upper:]]/',' \0',$test);
For instance:
(?:^|\p{Lu})\P{Lu}*
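One way to use this pattern from PHP is with preg_match_all() and the /u modifier, so that \p{Lu} sees whole UTF-8 characters. This is a sketch; the sample identifier with an umlaut is made up, and the input is assumed to start with a lowercase letter.

```php
<?php
// Hypothetical identifier with a non-ASCII capital letter (UTF-8 source file).
$name = 'getÄnderungsDatum';

// /u switches the pattern and subject to UTF-8, enabling \p{Lu} (uppercase letter)
// and \P{Lu} (anything that is not an uppercase letter) on multibyte characters.
preg_match_all('/(?:^|\p{Lu})\P{Lu}*/u', $name, $m);

$parts = $m[0];
print_r($parts);
```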
No need for an overcomplicated solution. This does it:
preg_replace('/([A-Z])/',"\n".'$1',$string);
This doesn't take care of acronyms, of course.
Use this: [a-z]+|[A-Z][a-z]* or \p{Ll}+|\p{Lu}\p{Ll}*
preg_split("/(?<=[a-z])(?=[A-Z])/", $password);
preg_split('#(?=[A-Z])#', 'asAs')
Can anyone give me a quick summary of the differences please?
To my mind, are they both doing the same thing?
str_replace replaces a specific occurrence of a string, for instance "foo" will only match and replace that: "foo". preg_replace will do regular expression matching, for instance "/f.{2}/" will match and replace "foo", but also "fey", "fir", "fox", "f12", etc.
[EDIT]
See for yourself:
$string = "foo fighters";
$str_replace = str_replace('foo','bar',$string);
$preg_replace = preg_replace('/f.{2}/','bar',$string);
echo 'str_replace: ' . $str_replace . ', preg_replace: ' . $preg_replace;
The output is:
str_replace: bar fighters, preg_replace: bar barhters
:)
str_replace will just replace a fixed string with another fixed string, and it will be much faster.
The regular expression functions allow you to search for and replace with a non-fixed pattern called a regular expression. There are many "flavors" of regular expression, which are mostly similar but differ in certain details; the one we are talking about here is Perl Compatible Regular Expressions (PCRE).
If they look the same to you, then you should use str_replace.
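One place where the two visibly differ is with characters that have a special meaning in a pattern. A small sketch (the version string is made up for illustration):

```php
<?php
$version = '1.2.3';

// str_replace() treats '.' as a plain character.
$fixed = str_replace('.', '-', $version);

// In a regular expression '.' means "any character", so everything is replaced.
$pattern = preg_replace('/./', '-', $version);

// preg_quote() escapes the dot, which makes preg_replace() behave like str_replace() here.
$escaped = preg_replace('/' . preg_quote('.', '/') . '/', '-', $version);

echo "$fixed | $pattern | $escaped\n";
```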
str_replace searches for plain-text occurrences, while preg_replace searches for patterns.
I have not tested this myself, but it is probably worth testing: according to some sources, preg_replace is 2x faster on PHP 7 and above.
See more here: preg_replace vs string_replace.
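If you want to measure it on your own setup, a rough timing sketch could look like this. The subject string and iteration count are arbitrary assumptions, hrtime() needs PHP 7.3 or later, and the numbers will vary by machine and PHP version, so treat the result as an indication only.

```php
<?php
// Arbitrary test data and loop count; adjust to something representative.
$subject = str_repeat('foo fighters ', 1000);
$iterations = 1000;

$start = hrtime(true);                       // nanoseconds, monotonic clock
for ($i = 0; $i < $iterations; $i++) {
    $viaStr = str_replace('foo', 'bar', $subject);
}
$strNs = hrtime(true) - $start;

$start = hrtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $viaPreg = preg_replace('/foo/', 'bar', $subject);
}
$pregNs = hrtime(true) - $start;

printf("str_replace: %.2f ms, preg_replace: %.2f ms\n", $strNs / 1e6, $pregNs / 1e6);
```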
This is my first question on this wonderful website.
Let's say I have a string $a="some text..%PROD% more text". There will be just one %..% in the string. I need to replace the PROD between the % signs with the content of another variable. So I used to do:
$a = str_replace('%PROD%',$var,$a);
but now the PROD between the % signs has started arriving in different cases, so I could get prod or Prod. So I made the entire string uppercase before doing the replacement, but the side effect is that the other letters in the original string also became uppercase. Someone suggested that I use a regular expression. But how?
Thanks,
Rohan
You can make use of the str_ireplace function. It's similar to str_replace but is case-insensitive during matching.
$x = 'xxx';
$str = 'abc %Prod% def';
$str = str_ireplace('%PROD%',$x,$str); // $str is now "abc xxx def"
Just use str_ireplace(). It's a case-insensitive version of str_replace(), and much more efficient for a simple replacement than regular expressions (also much more straightforward).
You could use a regular expression, but PHP also conveniently has a case-insensitive version of str_replace, str_ireplace
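For completeness, here is a sketch of both routes side by side. The variable names and sample values are assumptions. The regex version uses the /i modifier and preg_replace_callback(), so that any $ or \ inside the replacement text is inserted literally rather than being read as a backreference.

```php
<?php
// Hypothetical sample values.
$a = 'some text..%Prod% more text';
$var = 'widget';

// Case-insensitive plain-string replacement.
$viaString = str_ireplace('%PROD%', $var, $a);

// Regex equivalent: /i makes the match case-insensitive; the callback
// returns the replacement text unchanged.
$viaRegex = preg_replace_callback(
    '/%prod%/i',
    function () use ($var) {
        return $var;
    },
    $a
);

echo $viaString, "\n", $viaRegex, "\n";
```

Only the placeholder is touched, so the case of the rest of the string is preserved.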