regex function[filename] pattern and function[string_with_escaped_characters] pattern - php

I'm trying to script and parse a file.
Please help with a regex in PHP to find and replace the following patterns:
From: "This is a foo[/www/bar.txt] within a foo[/etc/bar.txt]"
To: "This is a bar_txt_content within a bar2_txt_content"
Something along those lines:
$subject = "This is a foo[/www/bar.txt] within a foo[/etc/bar.txt]";
$pattern = '/regex-needed/';
preg_match($pattern, $subject, $matches);
foreach($matches as $match) {
$subject = str_replace('foo['.$match[0].']', file_get_contents($match[0]), $subject);
}
And my second request is to have:
From: 'This is a foo2[bar bar \] bar bar].'
To: "this is a returned"
Something along those lines:
$subject = 'This is a foo2[bar bar \] bar bar].';
$pattern = '/regex-needed/';
preg_match($pattern, $subject, $matches);
foreach($matches as $match) {
$subject = str_replace('foo2['.$match[0].']', my_function($match[0]), $subject);
}
Please help in constructing these patterns...

If you always have a structure like foo[ ... ], then it is very easy:
foo\[([^]]+)\]
That is .NET syntax, but I'm sure the expression is simple enough for you to convert.
Description of the regex:
Match the characters “foo” literally «foo»
Match the character “[” literally «[»
Match the regular expression below and capture its match into backreference number 1 «([^]]+)»
Match any character that is NOT a “]” «[^]]+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the character “]” literally «]»
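For PHP, the same expression can be dropped into preg_replace_callback. A minimal sketch, assuming the bracket contents are file paths; expand_foo and the $loader callback are names made up for illustration, and the array of fake files stands in for file_get_contents() so the example runs without touching the disk:

```php
<?php
// Replace every foo[...] with whatever $loader returns for the captured path.
// $loader is a hypothetical stand-in for file_get_contents().
function expand_foo(string $subject, callable $loader): string
{
    // [^]]+ : one or more characters that are not "]"
    // (a "]" placed first in a character class is taken literally).
    return preg_replace_callback(
        '/foo\[([^]]+)\]/',
        function (array $m) use ($loader): string {
            return $loader($m[1]); // $m[1] is the text between the brackets
        },
        $subject
    );
}

// Fake file contents, used only for this illustration.
$fake_files = [
    '/www/bar.txt' => 'bar_txt_content',
    '/etc/bar.txt' => 'bar2_txt_content',
];

echo expand_foo(
    'This is a foo[/www/bar.txt] within a foo[/etc/bar.txt]',
    function (string $path) use ($fake_files): string {
        return $fake_files[$path] ?? '';
    }
), "\n";
```

Using a callback means all occurrences are handled in one pass, so no loop with str_replace is needed. To read real files, pass 'file_get_contents' as the loader, ideally after checking that the path is one you expect.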

Luc,
this should help you get started.
http://php.net/manual/en/function.preg-replace.php
You may have to set up a loop and increase a counter, using preg_replace with a limit of 1 to replace only the first instance.
In order to match foo[/www/bar.txt], the regex should be something like:
foo\[\/www\/([A-Za-z0-9]*)\.txt\]
The backslashes are there to cancel the special meaning of some characters in your regexp.
It will match foo[/www/<some file name>.txt], and ${1} will contain the filename without the .txt, because round brackets form groups which can be used in the replacement expression. ${1} will contain what was matched in the first pair of round brackets, ${2} what was matched in the second pair, etc.
Therefore your replacement expression should be something like '${1}_txt_content', or in the second iteration '${1}2_txt_content'.
[A-Za-z0-9]* means any alphanumeric character 0 or more times; you may want to replace the * with a + if you want at least 1 character.
So try:
$pattern = '/foo\[\/www\/([A-Za-z0-9]*)\.txt\]/';
$replace = '${1}_txt_content'; // single quotes, so PHP does not try to interpolate ${1}
$total_count = 1;
do {
    // Replace only the first remaining match and keep the result for the next pass.
    $subject = preg_replace($pattern, $replace, $subject, 1, $count);
    $replace = '${1}' . ++$total_count . '_txt_content';
} while ($count != 0);
echo $subject;
(warning, this is my first ever PHP program, so it may have mistakes as I cannot test it ! but I hope you get the idea)
Hope that helps !
Tony
PS: I am not a PHP programmer but I know this works in C#, for example, and looking at the PHP documentation it seems that it should work.
PS2: I always keep this website bookmarked for reference when I need it: http://www.regular-expressions.info/

$pattern = '/\[([^\]]+)\]/';
preg_match_all($pattern, $subject, $matches);
print_r($matches['1']);

found the correct regex I needed for escaping:
'/foo\[[^\[]*[^\\\]\]/'
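For the second request (foo2[...] with an escaped \] inside), another way to write it is to allow "backslash plus any character" as a unit inside the brackets. This is only a sketch: my_function is the placeholder name from the question, given a dummy body here, and expand_foo2 is a made-up wrapper name.

```php
<?php
// Placeholder for whatever should process the bracket contents (name taken from the question).
function my_function(string $inner): string
{
    return 'returned';
}

// Hypothetical wrapper that replaces each foo2[...] with my_function()'s result.
function expand_foo2(string $subject): string
{
    // Inside the brackets accept either
    //   [^\]\\]  a character that is neither "]" nor a backslash, or
    //   \\.      a backslash followed by any character, so "\]" does not end the match.
    // The backslashes are doubled once more because this is a PHP single-quoted string.
    $pattern = '/foo2\[((?:[^\]\\\\]|\\\\.)*)\]/';

    return preg_replace_callback(
        $pattern,
        function (array $m): string {
            return my_function($m[1]); // $m[1] still contains the escaped "\]"
        },
        $subject
    );
}

echo expand_foo2('This is a foo2[bar bar \] bar bar].'), "\n";
```

The captured text keeps its backslashes, so if my_function needs the unescaped form it would have to strip them itself, for example with stripslashes().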

Find next word after colon in regex

I get the following output from a Laravel console command:
Some text as: 'Nerad'
I tried:
$regex = '/(?<=\bSome text as:\s)(?:[\w-]+)/is';
preg_match_all( $regex, $d, $matches );
but it returns nothing.
My guess is that the single quotes are the problem and that I need to change the regex.
Any ideas?
Note that you get no match because the ' before Nerad is not matched, nor checked with the lookbehind.
If you need to check the context, but avoid including it into the match, in PHP regex, it can be done with a \K match reset operator:
$regex = '/\bSome text as:\s*\'\K[\w-]+/i';
The output array structure will be cleaner than when using a capturing group and you may check for unknown width context (lookbehind patterns are fixed width in PHP PCRE regex):
$re = '/\bSome text as:\s*\'\K[\w-]+/i';
$str = "Some text as: 'Nerad'";
if (preg_match($re, $str, $match)) {
echo $match[0];
} // => Nerad
Just anchor at the end of the string and capture the word in a group. Group 1 will hold the required string.
/:\s*'(\w+)'$/
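A short sketch of how that end-anchored pattern could be used in PHP; last_quoted_word is a hypothetical helper name, and the sample line is the one from the question:

```php
<?php
// Returns the quoted word at the end of the line, or null when there is none.
function last_quoted_word(string $line): ?string
{
    // The single quotes are escaped because the pattern sits in a single-quoted PHP string.
    if (preg_match('/:\s*\'(\w+)\'$/', $line, $m)) {
        return $m[1]; // group 1: the word between the quotes
    }
    return null;
}

var_dump(last_quoted_word("Some text as: 'Nerad'"));
var_dump(last_quoted_word("Some text as: Nerad"));
```

Note that \w+ does not accept hyphens, unlike the [\w-]+ used in the question, so the class would need widening if the names can contain them.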

Php select from string

Hi, I'm new to PHP and I need a little help.
I need to take the text that is between asterisks (*) in a PHP string and put it between HTML tags:
$text = "this is an *example*";
But I really don't know how, and I need help.
Personally I would use explode; you can then piece the sentence back together if the example appears in the middle of a sentence.
<?php
$text = "this is an *example*";
$pieces = explode("*", $text);
// $pieces[0] is the text before the first *, $pieces[1] the text between the asterisks
echo $pieces[1];
?>
Edit:
Since you're looking for what basically amounts to custom BB Code, use this:
$text = "this is an *example*";
$find = '~[\*](.*?)[\*]~s';
$replace = '<span style="color: green">$1</span>';
echo preg_replace($find,$replace,$text);
You can put this in a function and have it parse any text that gets passed to it. You can also turn the find and replace variables into arrays and add more codes.
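Something like the following would do that. It is only a sketch: parse_codes is a made-up name, and the second code (_text_ to <em>) is an extra rule added purely to show how the arrays grow.

```php
<?php
// Hypothetical helper: applies a list of "codes" (pattern => replacement) to the text.
function parse_codes(string $text): string
{
    $find = [
        '~\*(.*?)\*~s', // *text*, same idea as the pattern in the answer
        '~_(.*?)_~s',   // _text_, an illustrative extra code
    ];
    $replace = [
        '<span style="color: green">$1</span>',
        '<em>$1</em>',
    ];

    // preg_replace() accepts parallel arrays and applies each pair in order.
    return preg_replace($find, $replace, $text);
}

echo parse_codes('this is an *example* with _emphasis_'), "\n";
```

Because the pairs are applied one after another, a later pattern also sees the HTML produced by an earlier one, so the order of the codes can matter. If the text comes from users, it should probably be passed through htmlspecialchars() before the codes are applied.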
You really should use a DOM parser for things like this, but if you can guarantee it will always be the * character, you can use some regex:
$text = "this is an *example*";
$regex = '/(?<=\*)(.*?)(?=\*)/';
$replacement = 'ostrich';
$new_text = preg_replace($regex, $replacement, $text);
echo $new_text;
Returns
this is an *ostrich*
Here is how the regex works:
Positive Lookbehind (?<=\*)
\* matches the character * literally (case sensitive)
1st Capturing Group (.*?)
.*? matches any character (except for line terminators)
*? Quantifier — Matches between zero and unlimited times, as few times as possible, expanding as needed (lazy)
Positive Lookahead (?=\*)
\* matches the character * literally (case sensitive)
This regex essentially starts and ends by looking at what is ahead of and behind the search character you specified and leaves those characters intact during the replacement with preg_replace().
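To see the difference the lookarounds make, here is a small comparison against a pattern that consumes the asterisks. It is a sketch with made-up variable names, not part of the original answer:

```php
<?php
$text = 'this is an *example*';

// Lookarounds only assert the asterisks, so they are left in the result.
$kept = preg_replace('/(?<=\*)(.*?)(?=\*)/', 'ostrich', $text);

// Matching the asterisks directly consumes them, so the replacement decides what surrounds the text.
$consumed = preg_replace('/\*(.*?)\*/', '<b>$1</b>', $text);

// With more than one starred word, the lookaround version also matches the text
// between a closing * and the next opening *, because nothing was consumed.
$several = preg_replace('/(?<=\*)(.*?)(?=\*)/', 'ostrich', 'this is *a* and *b*');

echo $kept, "\n";
echo $consumed, "\n";
echo $several, "\n";
```

The third call is the thing to watch out for: for text with several starred words, the consuming pattern is usually the safer choice when the goal is to wrap each word in a tag.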

Twitter handle regular expression PHP [duplicate]

I'm not very firm with regular expressions, so I have to ask you:
How can I find out with PHP whether a string contains a word starting with #?
E.g. I have a string like "This is for #codeworxx".
I'm so sorry, but I have NO starting point for that :(
Hope you can help.
Thanks,
Sascha
Okay, thanks for the results, but I made a mistake: how do I implement this in eregi_replace?
$text = eregi_replace('/\B#[^\B]+/','\\1', $text);
does not work.
Why? Do I not have to enter the same expression as the pattern?
Match anything which has some whitespace in front of a # followed by something other than whitespace:
$ cat 1812901.php
<?php
echo preg_match("/\B#\S+/", "This should #match it");
echo preg_match("/\B#\S+/", "This should not# match");
echo preg_match("/\B#\S+/", "This should match nothing and return 0");
echo "\n";
?>
$ php 1812901.php
100
break your string up like this:
$string = 'simple sentence with five words';
$words = explode(' ', $string );
Then you can loop through the array and check if the first character of each word equals "#":
if ($stringInTheArray[0] == "#")
Assuming you define a word as a sequence of characters with no whitespace between them, this should be a good starting point for you:
$subject = "This is for #codeworxx";
$pattern = '/\s*#(.+?)(?:\s|$)/';
preg_match($pattern, $subject, $matches);
print_r($matches);
Explanation:
\s*#(.+?)(?:\s|$) - look for anything starting with #, group all the following letters, numbers, and anything else which is not whitespace (space, tab, newline), up to the closest whitespace or the end of the string.
See the output of the $matches array for accessing the inner groups and the regex results.
#OP, no need regex. Just PHP string methods
$mystr='This is for #codeworxx';
$str = explode(" ",$mystr);
foreach($str as $k=>$word){
if(substr($word,0,1)=="#"){
print $word;
}
}
Just in case this is helpful to someone in the future:
/((?<!\S)#\w+(?!\S))/
This will match any word made of letters, digits, or underscores that starts with "#". It will not match words with "#" anywhere but the start of the word.
Matching cases:
#username
foo #username bar
foo #username1 bar #username2
Failing cases:
foo#username
#username$
##username
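A quick way to check those cases in PHP is preg_match_all. This is a sketch; find_hashtags is a hypothetical helper name, and the outer capturing group is dropped because the whole match is what we want anyway:

```php
<?php
// Returns every "#word" that stands on its own, i.e. is delimited by whitespace
// or by the start/end of the string.
function find_hashtags(string $text): array
{
    // (?<!\S) : not preceded by a non-whitespace character
    // (?!\S)  : not followed by a non-whitespace character
    preg_match_all('/(?<!\S)#\w+(?!\S)/', $text, $matches);
    return $matches[0];
}

print_r(find_hashtags('foo #username1 bar #username2'));
print_r(find_hashtags('foo#username'));
print_r(find_hashtags('#username$'));
print_r(find_hashtags('##username'));
```

The first call returns both handles; the other three return empty arrays, matching the failing cases listed above.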

preg_match_all regex issue for url routing

For URL routing I have
Pattern:
/^\/stuff\/other-stuff\/(?:([^\/]\+?))$/i
Subject:
/stuff/other-stuff/foo-AB123456.html
Why is $num_matches equal to 0?
$num_matches = preg_match_all($patern, $subject, $matches);
Help would be greatly appreciated :)
Because of this:
[^\/]\+?
[^\/] matches a single character that is not a slash, and the escaped \+? then matches an optional literal + sign, so the group can never cover the whole of foo-AB123456.html. The + must not be escaped when it is meant as a quantifier; escape it only when you want to match a literal +.
The corrected regex should be:
^\/stuff\/other-stuff\/(?:(.+?))$
Demo here: http://regex101.com/r/aV9cR0
It will match foo-AB123456.html in the first capture.
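The effect of the escaped + can be checked side by side. This is a sketch with made-up variable names; the second pattern is a variant that keeps the asker's negated class [^\/] and only unescapes the +, rather than switching to .+? as above:

```php
<?php
$subject = '/stuff/other-stuff/foo-AB123456.html';

// Original pattern: \+ is a literal plus sign, so the group matches one
// non-slash character and an optional "+", never the whole file name.
$escaped = '/^\/stuff\/other-stuff\/(?:([^\/]\+?))$/i';

// Same pattern with the + left unescaped, so it acts as a quantifier.
$quantified = '/^\/stuff\/other-stuff\/(?:([^\/]+?))$/i';

$escaped_count = preg_match_all($escaped, $subject, $m1);
$quantified_count = preg_match_all($quantified, $subject, $m2);

echo $escaped_count, "\n";    // number of matches with the escaped plus
echo $quantified_count, "\n"; // number of matches with the quantifier
print_r($m2[1]);              // contents of capture group 1
```

The escaped version reports 0 matches, while the unescaped one reports 1 and captures foo-AB123456.html.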
$patern= "#^/stuff/other-stuff/([^/]+)$#i";
$subject = "/stuff/other-stuff/foo-AB123456.html";
preg_match_all($patern, $subject, $matches);
print_r($matches[1]);
It looks to me like your regex could be simplified to something like:
(?i)^/stuff/other-stuff/[\w./-]+$
It would work like this:
<?php
$regex = '~(?i)^/stuff/other-stuff/([\w./-]+)$~';
$string = "/stuff/other-stuff/foo-AB123456.html";
$hit = preg_match($regex, $string, $m);
echo $m[0]."<br />";
echo $m[1]."<br />";
?>
Output:
/stuff/other-stuff/foo-AB123456.html
foo-AB123456.html
Note that this could be done in a number of different ways.
Here are some details about the regex.
The ~ delimiter is nicer than the original / because you don't have to escape the slashes.
The parentheses in ([\w./-]+) capture the end of the url into Group 1. This is why $m[1] yields foo-AB123456.html
After the final slash, [\w./-]+ matches one or more letters, digits, underscores, dashes, dots and forward slashes. The dash is placed last in the class so that it is read as a literal character rather than as part of a range. This is a "mini-spec" for what characters we expect there. If you want to allow anything at all, you could go with a simple dot.

