I have this code:
$a='-t40-';
preg_match('/^-t(.*?)-$/', $a,$match);
var_dump($match);
Result:
array(2) { [0]=> array(1) { [0]=> string(5) "-t40-" }
[1]=> array(1) { [0]=> string(2) "40" } }
If I add some text after the last "-", the pattern no longer matches.
If $a='-t40-some text'; I need a result similar to:
array(3) { [0]=> array(1) { [0]=> string(5) "-t40-" }
[1]=> array(1) { [0]=> string(2) "40" }
[2]=> array(1) { [0]=> string(9) "some text" }}
How can I edit the pattern to capture "some text"?
Thanks in advance.
$a='-t40-some text';
preg_match('/^-t(.*?)-(.*?)$/', $a,$match);
var_dump($match);
Output:
array(3) {
[0]=>
string(14) "-t40-some text"
[1]=>
string(2) "40"
[2]=>
string(9) "some text"
}
Explanation:
^ : beginning of the string
-t : literally "-t"
(.*?) : group 1, zero or more of any character except newline, non-greedy
- : literally "-"
(.*?) : group 2, zero or more of any character except newline, non-greedy
$ : end of the string
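As a variation on the pattern explained above, here is a minimal sketch that wraps the match in a function. The name parseTag and the 'id'/'text' keys are made up for illustration. It uses [^-]* for the first group, which cannot run past the second "-", so no lazy quantifier is needed.

```php
<?php
// Hypothetical helper: splits "-t<id>-<trailing text>" into its two parts.
function parseTag(string $input): ?array
{
    // Group 1 stops at the next "-"; group 2 takes the rest and may be
    // empty, so the original "-t40-" input still matches.
    if (preg_match('/^-t([^-]*)-(.*)$/', $input, $m) !== 1) {
        return null; // input does not follow the expected format
    }
    return ['id' => $m[1], 'text' => $m[2]];
}

var_dump(parseTag('-t40-some text'));
var_dump(parseTag('-t40-'));
```

Returning null for input that does not match lets the caller tell "no match" apart from "matched with empty trailing text".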
I'm splitting a string following this format:
| + anything goes here + single space
The following regular expression corresponds to said pattern:
/(\|\S*)/
Using preg_split with PREG_SPLIT_DELIM_CAPTURE oddly returns each delimiter and its trailing space as two separate parts. Is there a flag or option to combine them?
$string = "|one |two |three this is a phrase |four";
$result = preg_split('/(\|\S*)/', $string, NULL, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
What I get:
array(7) {
[0]=>
string(4) "|one"
[1]=>
string(1) " "
[2]=>
string(4) "|two"
[3]=>
string(1) " "
[4]=>
string(6) "|three"
[5]=>
string(18) " this is a phrase "
[6]=>
string(5) "|four"
}
What I want:
array(5) {
[0]=>
string(5) "|one "
[1]=>
string(5) "|two "
[2]=>
string(7) "|three "
[3]=>
string(17) "this is a phrase "
[4]=>
string(5) "|four"
}
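Simply capture the whitespace that follows the word as part of the delimiter. Either of these patterns does that (\h matches horizontal whitespace only, \s matches any whitespace):
/(\|\S*\h*)/ or /(\|\S*\s*)/
So your code will be: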
<?php
$string = "|one |two |three this is a phrase |four";
// -1 means "no limit"; passing NULL here is deprecated as of PHP 8.1
$result = preg_split('/(\|\S*\s*)/', $string, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
var_dump($result);
Regex 101: https://regex101.com/r/m5M7Dv/1
Result
array(5) { [0]=> string(5) "|one " [1]=> string(5) "|two " [2]=> string(7) "|three " [3]=> string(17) "this is a phrase " [4]=> string(5) "|four" }
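For comparison, here is a sketch of a different technique that should produce the same five pieces for this input: rather than splitting, it matches each piece directly with preg_match_all. This is an alternative, not the approach from the answer above.

```php
<?php
$string = "|one |two |three this is a phrase |four";

// Each match is either a "|word" token with its trailing whitespace,
// or a run of text that contains no pipe at all.
preg_match_all('/\|\S*\s*|[^|]+/', $string, $matches);
$result = $matches[0];

var_dump($result);
```

Because every piece is matched explicitly, no flags are needed and there are no empty strings to filter out.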
I have been trying for a few hours to find the right regular expression in PHP to match letters of any language while rejecting spaces.
I tried this:
[^\p{L}]
It mostly works, but it seems to allow spaces.
Then I tried this:
[^\w_-]
and it still seems to allow spaces.
Can anyone help with this, please?
You need to specify the Unicode modifier u to get Unicode character properties in PCRE.
For example...
$pattern = "/([\p{L}]+)/u";
$string = "你好,世界!Привет мир! !مرحبا بالعالم";
if (preg_match_all($pattern, $string, $match)) {
var_dump($match);
}
Gives us...
array(2) {
[0]=>
array(6) {
[0]=>
string(6) "你好"
[1]=>
string(6) "世界"
[2]=>
string(12) "Привет"
[3]=>
string(6) "мир"
[4]=>
string(10) "مرحبا"
[5]=>
string(14) "بالعالم"
}
[1]=>
array(6) {
[0]=>
string(6) "你好"
[1]=>
string(6) "世界"
[2]=>
string(12) "Привет"
[3]=>
string(6) "мир"
[4]=>
string(10) "مرحبا"
[5]=>
string(14) "بالعالم"
}
}
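Building on the u modifier, here is a minimal sketch of the two operations the question seems to be after: checking that a string contains only letters, and stripping everything else (spaces included). The function names are hypothetical. Note that \p{L} does not cover combining marks (\p{M}), which may matter for some scripts.

```php
<?php
// Hypothetical helper names, for illustration only.

// True when $s consists solely of Unicode letters
// (no spaces, digits, or punctuation).
function isLettersOnly(string $s): bool
{
    // \z anchors at the very end, so a trailing newline is rejected too.
    return preg_match('/^\p{L}+\z/u', $s) === 1;
}

// Removes every character that is not a Unicode letter, spaces included.
function stripNonLetters(string $s): string
{
    // preg_replace returns null on error (e.g. invalid UTF-8 input).
    return preg_replace('/[^\p{L}]+/u', '', $s) ?? '';
}

var_dump(isLettersOnly('Привет'));
var_dump(isLettersOnly('Привет мир'));
var_dump(stripNonLetters('Привет мир!'));
```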
I need to parse the following text:
First: 1
Second: 2
Multiline: blablablabla
bla2bla2bla2
bla3b and key: value in the middle if strting
Fourth: value
A value is either a single-line or a multiline string, and it may itself contain a "key: blablabla" substring. Such a substring should be ignored (not parsed as a separate key-value pair).
Please help me with a regex or another algorithm.
Ideal result would be:
$regex = "/SOME REGEX/";
$matches = [];
preg_match_all($regex, $html, $matches);
// $matches has all parsed key-value pairs, including multiline values
Thank you.
I tried with simple regexes but result is incorrect, because I don't know how to handle multilines:
$regex = "/(.+?): (.+?)/";
$regex = "/(.+?):(.+?)\n/";
...
You can do it with this pattern ($txt holds the text to parse):
$pattern = '~(?<key>[^:\s]+): (?<value>(?>[^\n]*\R)*?[^\n]*)(?=\R\S+:|$)~';
preg_match_all($pattern, $txt, $matches, PREG_SET_ORDER);
print_r($matches);
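As a usage sketch for the pattern above, the named groups can be folded into an associative array. The variable $pairs and the sample text (taken from the question, joined with "\n") are assumptions for illustration.

```php
<?php
// Sample input from the question, assumed to use "\n" line endings.
$txt = "First: 1\nSecond: 2\nMultiline: blablablabla\nbla2bla2bla2\n"
     . "bla3b and key: value in the middle if strting\nFourth: value";

$pattern = '~(?<key>[^:\s]+): (?<value>(?>[^\n]*\R)*?[^\n]*)(?=\R\S+:|$)~';
preg_match_all($pattern, $txt, $matches, PREG_SET_ORDER);

// Collect key => value; the "key: value" inside the multiline value is
// not split off because it does not start a line.
$pairs = [];
foreach ($matches as $m) {
    $pairs[$m['key']] = $m['value'];
}

print_r($pairs);
```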
You can sort of do it, as long as you consider a single word followed by a colon at the start of a line to be a new key start:
$data = 'First: 1
Second: 2
Multiline: blablablabla
bla2bla2bla2
bla3b and key: value in the middle if strting
Fourth: value';
preg_match_all('/^([a-z]+): (.*?)(?=(^[a-z]+:|\z))/ims', $data, $matches);
var_dump($matches);
This gives the following result:
array(4) {
[0]=>
array(4) {
[0]=>
string(10) "First: 1
"
[1]=>
string(11) "Second: 2
"
[2]=>
string(86) "Multiline: blablablabla
bla2bla2bla2
bla3b and key: value in the middle if strting
"
[3]=>
string(13) "Fourth: value"
}
[1]=>
array(4) {
[0]=>
string(5) "First"
[1]=>
string(6) "Second"
[2]=>
string(9) "Multiline"
[3]=>
string(6) "Fourth"
}
[2]=>
array(4) {
[0]=>
string(3) "1
"
[1]=>
string(3) "2
"
[2]=>
string(75) "blablablabla
bla2bla2bla2
bla3b and key: value in the middle if strting
"
[3]=>
string(5) "value"
}
[3]=>
array(4) {
[0]=>
string(7) "Second:"
[1]=>
string(10) "Multiline:"
[2]=>
string(7) "Fourth:"
[3]=>
string(0) ""
}
}
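The captured values above keep their trailing line breaks. A small sketch of one way to tidy that up, assuming the same pattern and "\n" line endings: trim each value and pair it with its key via array_combine. The $pairs name is made up here.

```php
<?php
$data = "First: 1\nSecond: 2\nMultiline: blablablabla\nbla2bla2bla2\n"
      . "bla3b and key: value in the middle if strting\nFourth: value";

preg_match_all('/^([a-z]+): (.*?)(?=(^[a-z]+:|\z))/ims', $data, $matches);

// Group 1 holds the keys, group 2 the raw values (with trailing newlines).
$pairs = array_combine($matches[1], array_map('trim', $matches[2]));

var_dump($pairs);
```

Group 3 only exists because of the lookahead's inner parentheses and can be ignored.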
I want to grab all IDs (integers) from several URLs within a text. These URLs could look like these:
http://url.tld/index.php/p1
http://url.tld/p2#abc
http://url.tld/index.php/Page/3-xxx
http://url.tld/Page/4
For this, I've built two regexes (the URLs are enclosed in a [url] BBCode tag):
\[url\](http\://url\.tld/index\.php/p(\d+).*?)\[/url\]
\[url\](http\://url\.tld(?:/index\.php)?/Page/(\d+).*?)\[/url\]
If I run preg_match_all with each regex separately, I get an array that looks like this (which is correct):
array(3) {
[0]=>
array(2) {
[0]=>
string(62) "[url]http://url.tld/index.php/Page/6-fdgfh/[/url]"
[1]=>
string(50) "[url]http://url.tld/Page/7[/url]"
}
[1]=>
array(2) {
[0]=>
string(51) "http://url.tld/index.php/Page/6-fdgfh/"
[1]=>
string(39) "http://url.tld/Page/7"
}
[2]=>
array(2) {
[0]=>
string(1) "6"
[1]=>
string(1) "7"
}
}
But if I combine both regexes with a pipe:
\[url\](http\://url\.tld/index\.php/p(\d+).*?|http\://url\.tld(?:/index\.php)?/Page/(\d+).*?)\[/url\]
it builds an array like this (which is wrong):
array(4) {
[0]=>
array(3) {
[0]=>
string(71) "[url]http://url.tld/index.php/p9-abc#hashtag[/url]"
[1]=>
string(62) "[url]http://url.tld/index.php/Page/6-fdgfh/[/url]"
[2]=>
string(50) "[url]http://url.tld/Page/7[/url]"
}
[1]=>
array(3) {
[0]=>
string(60) "http://url.tld/index.php/p9-abc#hashtag"
[1]=>
string(51) "http://url.tld/index.php/Page/6-fdgfh/"
[2]=>
string(39) "http://url.tld/Page/7"
}
[2]=>
array(3) {
[0]=>
string(1) "9"
[1]=>
string(0) ""
[2]=>
string(0) ""
}
[3]=>
array(3) {
[0]=>
string(0) ""
[1]=>
string(1) "6"
[2]=>
string(1) "7"
}
}
====
So, my question is: How can I fix this? What I need is the array structure from the first example, while using both regular expressions as one regular expression, because I need a consistent structure to do a preg_replace_callback later.
I think you're looking for the Branch Reset group:
\[url]((?|http://url\.tld/index\.php/p(\d+).*?|http://url\.tld(?:/index\.php)?/Page/(\d+).*?))\[/url]
Or, for the line-noise-challenged among us (this layout needs the x modifier so that whitespace in the pattern is ignored):
\[url]
(
(?|
http://url\.tld/index\.php/p(\d+)[^[]*
|
http://url\.tld(?:/index\.php)?/Page/(\d+)[^[]*
)
)
\[/url]
This captures the numbers in group #2, no matter which part of the regex matched it. The whole URL is still captured in group #1.
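A short sketch of how the branch reset pattern might be used, with sample text assembled from the URLs in the question. The "~" delimiter and the "[id=N]" replacement format are assumptions for illustration.

```php
<?php
$text = '[url]http://url.tld/index.php/p9-abc#hashtag[/url] '
      . '[url]http://url.tld/index.php/Page/6-fdgfh/[/url] '
      . '[url]http://url.tld/Page/7[/url]';

// (?| ... ) resets group numbering per alternative, so both (\d+)
// groups are group 2.
$pattern = '~\[url]((?|http://url\.tld/index\.php/p(\d+).*?'
         . '|http://url\.tld(?:/index\.php)?/Page/(\d+).*?))\[/url]~';

preg_match_all($pattern, $text, $m);
$ids = $m[2];

// The same group index works in a callback, whichever branch matched.
$replaced = preg_replace_callback($pattern, function (array $hit) {
    return '[id=' . $hit[2] . ']'; // hypothetical replacement format
}, $text);

var_dump($ids, $replaced);
```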
<!--:en-->Apvalus šviestuvas<!--:-->
<!--:ru-->Круглый Светильник<!--:-->
<!--:lt-->Round lighting<!--:-->
I need to get the content between <!--:lt--> and <!--:-->.
I have tried:
$string = "<!--:en-->Apvalus šviestuvas<!--:--><!--:ru-->Круглый Светильник<!--:--><!--:lt-->Round lighting<!--:-->";
preg_match('<!--:lt-->+[a-zA-Z0-9]+<!--:-->$', $string, $match);
var_dump($match);
Something is wrong with the syntax and logic. How can I make this work?
preg_match("/<!--:lt-->([a-zA-Z0-9 ]+?)<!--:-->/", $string, $match);
added delimiters
added a match group
added ? to make it ungreedy
added [space] (there is a space in Round lighting)
Your result should be in $match[1].
A cooler and more generic variation is:
preg_match_all("/<!--:([a-z]+)-->([^<]+)<!--:-->/", $string, $match);
Which will match all of them. Gives:
array(3) { [0]=> array(3) { [0]=> string(37) "<!--:en-->Apvalus šviestuvas<!--:-->" [1]=> string(53) "<!--:ru-->Круглый Светильник<!--:-->" [2]=> string(32) "<!--:lt-->Round lighting<!--:-->" } [1]=> array(3) { [0]=> string(2) "en" [1]=> string(2) "ru" [2]=> string(2) "lt" } [2]=> array(3) { [0]=> string(19) "Apvalus šviestuvas" [1]=> string(35) "Круглый Светильник" [2]=> string(14) "Round lighting" } }
Use this pattern: (?<=<!--:lt-->)(.*?)(?=<!--:-->)
The lazy .*? stops at the first closing <!--:-->, so it still works when the lt block is not the last one in the string.
<?php
$string = "<!--:en-->Apvalus šviestuvas<!--:--><!--:ru-->Круглый Светильник<!--:--><!--:lt-->Round lighting<!--:-->";
preg_match('~(?<=<!--:lt-->)(.*?)(?=<!--:-->)~', $string, $match);
var_dump($match);
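To generalize the answers above to any language code, here is a minimal sketch. The function name getTranslation is hypothetical. It builds the pattern with preg_quote and uses the s and u modifiers so that multi-line and non-ASCII content is handled.

```php
<?php
// Hypothetical helper: returns the text of the <!--:xx--> block for $lang,
// or null when the string has no block for that language.
function getTranslation(string $source, string $lang): ?string
{
    // preg_quote escapes the language code; "~" is the delimiter used here.
    $pattern = '~<!--:' . preg_quote($lang, '~') . '-->(.*?)<!--:-->~su';
    return preg_match($pattern, $source, $m) === 1 ? $m[1] : null;
}

$string = "<!--:en-->Apvalus šviestuvas<!--:--><!--:ru-->Круглый Светильник<!--:--><!--:lt-->Round lighting<!--:-->";

var_dump(getTranslation($string, 'lt'));
var_dump(getTranslation($string, 'ru'));
var_dump(getTranslation($string, 'de'));
```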