The capturing group in the regular expression isn't output - php

<?php
$string = 'This is my regular expression';
$array = array();
preg_match('/^.*((my)? regular (expression)?)$/i', $string, $array);
var_dump($array);
?>
After execution of this script I have:
array (size=4)
0 => string 'This is my regular expression' (length=29)
1 => string ' regular expression' (length=19)
2 => string '' (length=0)
3 => string 'expression' (length=10)
Why doesn't it output the capturing group (my)?

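That is because of the greedy quantifier .* before the group. It first consumes the whole string and then backtracks only as far as it has to. Since (my)? is optional, the match succeeds as soon as " regular expression" can be matched, so "my" stays inside .* and group 2 is empty. Use the non-greedy quantifier .*? instead: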
<?php
$string = 'This is my regular expression';
$array = array();
preg_match('/^.*?((my)? regular (expression)?)$/i', $string, $array);
var_dump($array);
?>
Output:
array (size=4)
0 => string 'This is my regular expression' (length=29)
1 => string 'my regular expression' (length=21)
2 => string 'my' (length=2)
3 => string 'expression' (length=10)
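To see the difference side by side, here is a minimal sketch that runs both patterns from above on the same string. The variable names $greedy and $lazy are only illustrative.

```php
<?php
$string = 'This is my regular expression';

// Greedy .* swallows "my" before the optional group gets a chance to match.
preg_match('/^.*((my)? regular (expression)?)$/i', $string, $greedy);

// Lazy .*? expands one character at a time, so (my)? can match "my".
preg_match('/^.*?((my)? regular (expression)?)$/i', $string, $lazy);

var_dump($greedy[2]); // empty string
var_dump($lazy[2]);   // "my"
```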

Related

How to combine Regex with removing space and # character

I have this array, from which I need to remove whitespace and the # (hashtag) character:
array (size=7)
0 => string 'darwin' (length=6)
1 => string ' #nature' (length=8)
2 => string ' explore' (length=8)
3 => string ' galapagos' (length=10)
4 => string 'karma' (length=5)
foreach ($feedSinglePosts["hashtags_list"] as $key=>&$item) {
$item = preg_replace('/(\s|^)/', '', $item);
$item = preg_replace('/\#+/', '', $item);
}
The regex above works well, but I want to do it in one line if possible.
When I use /(\s|^)\#+/ it outputs this:
array (size=7)
0 => string 'darwin' (length=6)
1 => string 'nature' (length=6)
2 => string ' explore' (length=8)
3 => string ' galapagos' (length=10)
4 => string 'karma' (length=5)
How can I write a one-liner regex that removes both the whitespace and the hashtag?
It appears that the characters will be at the beginning or end. If so then no need for loops or regex:
array_walk($feedSinglePosts["hashtags_list"], function(&$v) { $v = trim($v, "\n\r #"); });
If you need to remove them anywhere:
$feedSinglePosts["hashtags_list"] = str_replace(["\n","\r"," ","#"], "", $feedSinglePosts["hashtags_list"]);
A non-regex way with array_walk() and trim():
<?php
$array = ['darwin' ,' #nature',' explore', ' galapagos','karma'];
function remove_hash_space(&$value,$key){
$value = trim($value,'# ');
}
array_walk($array, 'remove_hash_space');
print_r($array);
?>
DEMO: https://3v4l.org/nOadH
Or with a single-line array_map():
$array = array_map(function($e){return trim($e,'# ');},$array);
DEMO: https://3v4l.org/OaS1F
You may use
$arr = ['darwin',' #nature',' explore',' galapagos','karma'];
print_r( preg_replace('~^[\s#]+~', '', $arr) );
// => Array ( [0] => darwin [1] => nature [2] => explore [3] => galapagos [4] => karma )
The ^[\s#]+ pattern matches 1 or more occurrences (+) of whitespace or # characters ([\s#]) at the start of the string (^).
If your strings may contain some weird Unicode whitespace, consider adding the u modifier: ~^[\s#]+~u.
If you only need to handle horizontal whitespace, replace \s with \h.
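If the whitespace or # characters can appear anywhere in the string, the two preg_replace calls from the question can be merged into one character class. This is a sketch that assumes the sample array from the question; $clean is an illustrative name.

```php
<?php
$arr = ['darwin', ' #nature', ' explore', ' galapagos', 'karma'];

// [\s#]+ matches any run of whitespace or # characters, wherever it occurs.
// preg_replace accepts an array subject and returns an array.
$clean = preg_replace('/[\s#]+/', '', $arr);

print_r($clean);
```

Unlike the anchored ^[\s#]+ pattern, this also strips characters in the middle of a value, so it is only appropriate when the values never contain meaningful spaces.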

PHP Regex Match getting unexpected output

I'm trying to create a simple PHP script that retrieves info from a string and puts it into an array. I've looked around on some sites for multi-capture regex with one pattern but can't seem to get the output I'm looking for.
Currently this is my script.
$input = "username: jack number: 20";
//$input = file_get_contents("test.txt");
preg_match_all("/username: ([^\s]+)|number: ([^\s]+)/", $input, $data);
var_dump($data);
Which produces this output:
0 =>
array (size=2)
0 => string 'username: jack' (length=14)
1 => string 'number: 20' (length=10)
1 =>
array (size=2)
0 => string 'jack' (length=4)
1 => string '' (length=0)
2 =>
array (size=2)
0 => string '' (length=0)
1 => string '20' (length=2)
I'm looking to get the data into the form of:
0 =>
array (size=x)
0 => string 'jack'
1 =>
array (size=x)
0 => string '20'
Or two different arrays where the keys correspond to the same user/number combo
You can use match-reset \K:
preg_match_all('/\b(?:username|number):\h*\K\S+/', $input, $data);
print_r($data[0]);
Array
(
[0] => jack
[1] => 20
)
RegEx Breakup:
\b => a word boundary
(?:username|number) => matches username or number. (?:..) is non-capturing group
:\h* => matches a colon followed by optional horizontal whitespace
\K => match reset, causes regex engine to forget matched data
\S+ => match 1 or more non-space chars
Or else you can use a capturing group to get your matched data like this:
preg_match_all('/\b(?:username|number):\h*(\S+)/', $input, $data);
print_r($data[1]);
Array
(
[0] => jack
[1] => 20
)
(?<=username:|number:)\s*(\S+)
You can use a lookbehind here. See the demo:
https://regex101.com/r/mG8kZ9/10
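For the second form the question asks about (two arrays whose keys line up per user), one option is to match each username/number pair in a single pattern with named groups. This is only a sketch: it assumes every username is immediately followed by its number, and the second record (jill, 31) is made-up sample data.

```php
<?php
// Hypothetical input with two records; only the first comes from the question.
$input = "username: jack number: 20 username: jill number: 31";

// Each match consumes one username/number pair, so index i in both
// result arrays refers to the same record.
preg_match_all(
    '/username:\h*(?<username>\S+)\h+number:\h*(?<number>\S+)/',
    $input,
    $data
);

print_r($data['username']);
print_r($data['number']);
```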

how to pull elseif - preg_match_all

I need advice on how to pull content from this string.
$string = '{elseif "xxx"=="xxx"} text {elseif "xx2"!="xx2"}
text text
text
{elseif ....} text';
//or 'xxx'=='xxx'
$regex = "??";
preg_match_all($regex, $string, $out, PREG_SET_ORDER);
var_dump($out);
My idea of the var_dump output is:
array
0 =>
array
0 => string 'xxx' (length=3)
1 => string '==' (length=2)
2 => string 'xxx' (length=3)
3 => string 'text' (length=4)
1 =>
array
1 => string 'xx2' (length=)
2 => string '!=' (length=)
3 => string 'xx2' (length=)
4 => string 'text text
text' (length=)
2 =>
array
...
The output does not have to look exactly like this, but it should have the same content.
My attempt:
$regex = "~{elseif ([\"\'](.*)[\"\'])(!=|==|===|<=|<|>=|>)([\"\'](.*)[\"\'])}(.*)~sU";
But I get wrong output or none at all.
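Do you mean something like this? The pattern is written as a single-quoted string so the inner double quotes need no escaping, and without a g modifier, which PHP does not support (preg_match_all already finds every match).
$regex = '/\{\s*elseif\s*("[^"]+")\s*([^"]+)\s*("[^"]+")\s*\}\s*([^{]*)\s*/i';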
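A runnable sketch of this approach follows. It uses a variant of the answer's pattern with the quotes moved outside the capturing groups, so the groups hold xxx rather than "xxx"; it only handles double-quoted operands, and the trimming step and the $rows name are additions for illustration.

```php
<?php
$string = '{elseif "xxx"=="xxx"} text {elseif "xx2"!="xx2"}' . "\ntext text\ntext";

// Groups: 1 = left operand, 2 = operator, 3 = right operand, 4 = body text.
$regex = '/\{\s*elseif\s*"([^"]+)"\s*([^"\s]+)\s*"([^"]+)"\s*\}\s*([^{]*)/i';

preg_match_all($regex, $string, $out, PREG_SET_ORDER);

$rows = [];
foreach ($out as $match) {
    // Drop the full match at index 0 and trim whitespace around each part.
    $rows[] = array_map('trim', array_slice($match, 1));
}

var_dump($rows);
```

PREG_SET_ORDER groups the captures per match, which is the shape the question's desired output has.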

the fastest way to replace (and store in array) links in the text with their order numbers

There is a $str string that may contain html text including <a >link</a> tags.
I want to store links in array and set the proper changes in the $str.
For example, with this string:
$str="some text <a href='/review/'>review</a> here <a class='abc' href='/about/'>link2</a> hahaha";
we get:
linkArray[0]="<a href='/review/'>review</a>";
positionArray[0] = 10;//position of the first link in the string
linkArray[1]="<a class='abc' href='/about/'>link2</a>";
positionArray[1]=45;//position of the second link in the string
$changedStr="some text [[0]] here [[1]] hahaha";
Is there a faster way (in terms of performance) to do that than walking through the whole string with a for loop?
This can be done with preg_match_all and the PREG_OFFSET_CAPTURE flag.
For example:
$str="some text <a href='/review/'>review</a> here <a class='abc' href='/about/'>link2</a> hahaha";
preg_match_all("|<[^>]+>(.*)</[^>]+>|U",$str,$out,PREG_OFFSET_CAPTURE);
var_dump($out);
Here the output array is $out. PREG_OFFSET_CAPTURE captures the offset in the string where the pattern starts.
The above code will output:
array (size=2)
0 =>
array (size=2)
0 =>
array (size=2)
0 => string '<a href='/review/'>review</a>' (length=29)
1 => int 10
1 =>
array (size=2)
0 => string '<a class='abc' href='/about/'>link2</a>' (length=39)
1 => int 45
1 =>
array (size=2)
0 =>
array (size=2)
0 => string 'review' (length=6)
1 => int 29
1 =>
array (size=2)
0 => string 'link2' (length=5)
1 => int 75
For more information, see http://php.net/manual/en/function.preg-match-all.php
To build $changedStr, let $out be the output array from preg_match_all:
$count= 0;
foreach($out[0] as $result) {
$temp=preg_quote($result[0],'/');
$temp ="/".$temp."/";
$str =preg_replace($temp, "[[".$count."]]", $str,1);
$count++;
}
var_dump($str);
This gives the output :
string 'some text [[0]] here [[1]] hahaha' (length=33)
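As an alternative to calling preg_replace once per link, the placeholders can be produced in a single pass with preg_replace_callback. The sketch below is not the answer's method: it swaps in a pattern restricted to <a> tags, and it assumes links are not nested. Regex-based HTML handling is fragile in general, so treat it as an illustration for simple markup only.

```php
<?php
$str = "some text <a href='/review/'>review</a> here <a class='abc' href='/about/'>link2</a> hahaha";
$pattern = '~<a\b[^>]*>.*?</a>~is';

// Collect every link together with its byte offset in the original string.
$linkArray = [];
$positionArray = [];
preg_match_all($pattern, $str, $matches, PREG_OFFSET_CAPTURE);
foreach ($matches[0] as [$link, $offset]) {
    $linkArray[] = $link;
    $positionArray[] = $offset;
}

// Replace each link with its running index in one pass over the string.
$index = 0;
$changedStr = preg_replace_callback($pattern, function ($m) use (&$index) {
    return '[[' . $index++ . ']]';
}, $str);

var_dump($linkArray, $positionArray, $changedStr);
```

The offsets reported by PREG_OFFSET_CAPTURE are byte offsets, which matters if the text contains multibyte characters.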
I would use a regular expression for this; see:
http://weblogtoolscollection.com/regex/regex.php
try them here:
http://www.solmetra.com/scripts/regex/index.php
And use this:
http://php.net/manual/en/function.preg-match-all.php
Find the regular expression that covers every case you may encounter: preg_match_all, given the right pattern, returns an array containing every link you want.
Edit:
In your case, assuming you want to keep the "<a>" tags, this may work:
$arr = array();
preg_match_all('/<a.*.a>/', '{{your data}}', $arr, PREG_PATTERN_ORDER);
Input example:
test
Lkdlasdk
llkdla
xx
Output with the above regexp:
Array
(
[0] => Array
(
[0] => test
[1] => Lkdlasdk
[2] => xx
)
)
Hope this helps

php - How do I convert a string to an associative array of its keywords

Take this string as an example: "will see you in London tomorrow and Kent the day after tomorrow".
How would I convert this to an associative array that contains the keywords as keys, whilst preferably missing out the common words, like this:
Array ( [tomorrow] => 2 [London] => 1 [Kent] => 1)
Any help greatly appreciated.
I would say you could:
1. Split the string into an array of words with explode or preg_split, depending on how complex your word separators are.
2. Use array_filter to keep only the entries (i.e. words) you want; the callback function has to return false for every word you want to drop.
3. Use array_count_values on the resulting list of words, which counts how many times each word is present in the array.
EDIT : and, just for fun, here's a quick example :
First of all, the string, that gets exploded into words :
$str = "will see you in London tomorrow and Kent the day after tomorrow";
$words = preg_split('/\s+/', $str, -1, PREG_SPLIT_NO_EMPTY);
var_dump($words);
Which gets you :
array
0 => string 'will' (length=4)
1 => string 'see' (length=3)
2 => string 'you' (length=3)
3 => string 'in' (length=2)
4 => string 'London' (length=6)
5 => string 'tomorrow' (length=8)
6 => string 'and' (length=3)
7 => string 'Kent' (length=4)
8 => string 'the' (length=3)
9 => string 'day' (length=3)
10 => string 'after' (length=5)
11 => string 'tomorrow' (length=8)
Then, the filtering :
function filter_words($word) {
// a pretty simple filter ^^
if (strlen($word) >= 5) {
return true;
} else {
return false;
}
}
$words_filtered = array_filter($words, 'filter_words');
var_dump($words_filtered);
Which outputs :
array
4 => string 'London' (length=6)
5 => string 'tomorrow' (length=8)
10 => string 'after' (length=5)
11 => string 'tomorrow' (length=8)
And, finally, the counting :
$counts = array_count_values($words_filtered);
var_dump($counts);
And the final result :
array
'London' => int 1
'tomorrow' => int 2
'after' => int 1
Now, up to you to build up from here ;-)
Mainly, you'll have to work on :
A better exploding function, that deals with punctuation (or deal with that during filtering)
An "intelligent" filtering function, that suits your needs better than mine
Have fun !
You could keep a table of common words and go through your string one word at a time. If the word is in the table, skip it; otherwise add it to your associative array, or add 1 to its count if it is already there.
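A minimal sketch of that loop, assuming the common-word table is a plain PHP array (the list below is illustrative, not exhaustive) and that words are separated by whitespace:

```php
<?php
$str = 'will see you in London tomorrow and Kent the day after tomorrow';

// Hypothetical table of common words; flipping it makes lookups a cheap isset().
$commonWords = array_flip(['will', 'see', 'you', 'in', 'and', 'the', 'day', 'after']);

$keywords = [];
foreach (preg_split('/\s+/', $str, -1, PREG_SPLIT_NO_EMPTY) as $word) {
    // Compare case-insensitively, but keep the original spelling as the key.
    if (isset($commonWords[strtolower($word)])) {
        continue;
    }
    $keywords[$word] = ($keywords[$word] ?? 0) + 1;
}

print_r($keywords);
```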
Using a blacklist of words to exclude:
$str = 'will see you in London tomorrow and Kent the day after tomorrow';
$skip_words = array( 'in', 'the', 'will', 'see', 'and', 'day', 'you', 'after' );
// get words in sentence that aren't to be skipped and count their values
$words = array_count_values( array_diff( explode( ' ', $str ), $skip_words ) );
print_r( $words );
