Can someone help me with a regular expression to get the year and month from a text string?
Here is an example text string:
http://www.domain.com/files/images/2012/02/filename.jpg
I'd like the regex to return 2012/02.
This regex pattern would match what you need:
(?<=\/)\d{4}\/\d{2}(?=\/)
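A minimal sketch of how that pattern could be used from PHP, assuming preg_match and the URL from the question. The ~ delimiter is my choice so the slashes need no escaping, and the function name extractYearMonth is made up for illustration.

```php
<?php
// Hypothetical helper: returns "YYYY/MM" from a URL, or null if no such segment exists.
function extractYearMonth(string $url): ?string
{
    // Four digits, a slash, two digits, with a slash on either side (lookarounds).
    if (preg_match('~(?<=/)\d{4}/\d{2}(?=/)~', $url, $m)) {
        return $m[0];
    }
    return null;
}

echo extractYearMonth('http://www.domain.com/files/images/2012/02/filename.jpg'), PHP_EOL; // 2012/02
```

Because the slashes sit in lookarounds, they are checked but not included in the returned match.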
Depending on your situation and how much your strings vary - you might be able to dodge a bullet by simply using PHP's handy explode() function.
A simple demonstration - Dim the lights please...
$str = 'http://www.domain.com/files/images/2012/02/filename.jpg';
print_r( explode("/",$str) );
Returns :
Array
(
[0] => http:
[1] =>
[2] => www.domain.com
[3] => files
[4] => images
[5] => 2012 // Jack
[6] => 02 // Pot!
[7] => filename.jpg
)
The explode() function splits a string according to a "delimiter" that you provide. In this example I have used the / (slash) character.
So you see - you can just grab the values at indexes 5 and 6 to get the date values.
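Putting those two indexes back together could look like the sketch below. It assumes every URL has the same number of path segments before the date, as in the example; the variable names are mine.

```php
<?php
$str = 'http://www.domain.com/files/images/2012/02/filename.jpg';

$parts = explode('/', $str);

// Indexes 5 and 6 hold the year and month for URLs shaped like the example.
$yearMonth = implode('/', array_slice($parts, 5, 2));

echo $yearMonth, PHP_EOL; // 2012/02
```

If the directory depth can vary, the fixed indexes stop being reliable and the regex approach is the safer choice.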
I have tried several times to write a pattern that validates that a given string is a natural number and splits it into single digits.
Given my limited understanding of regex, the closest I could come up with is:
^([1-9])([0-9])*$ or ^([1-9])([0-9])([0-9])*$, or something like that.
It only captures the first and last digits, or the first, second and last digits.
I wonder what I need to know to solve this problem. Thanks.
You may use a two step solution like
if (preg_match('~\A\d+\z~', $s)) { // if a string is all digits
print_r(str_split($s)); // Split it into chars
}
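If "natural number" should also rule out a leading zero, as the [1-9] in the question's attempts suggests, the same two-step idea can be tightened. This is a sketch under that assumption; splitNaturalNumber is a made-up name.

```php
<?php
// Hypothetical helper: the digits of $s as an array, or an empty array
// if $s is not a run of digits starting with 1-9.
function splitNaturalNumber(string $s): array
{
    if (preg_match('~\A[1-9]\d*\z~', $s)) {
        return str_split($s); // one array element per character
    }
    return [];
}

print_r(splitNaturalNumber('1205')); // elements: 1, 2, 0, 5
print_r(splitNaturalNumber('0125')); // empty array
```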
A one step regex solution:
(?:\G(?!\A)|\A(?=\d+\z))\d
Details
(?:\G(?!\A)|\A(?=\d+\z)) - either the end of the previous match (\G(?!\A)) or (|) the start of the string (\A) that is followed by 1 or more digits up to the end of the string ((?=\d+\z))
\d - a digit.
PHP demo:
$re = '/(?:\G(?!\A)|\A(?=\d+\z))\d/';
$str = '1234567890';
if (preg_match_all($re, $str, $matches)) {
print_r($matches[0]);
}
Output:
Array
(
[0] => 1
[1] => 2
[2] => 3
[3] => 4
[4] => 5
[5] => 6
[6] => 7
[7] => 8
[8] => 9
[9] => 0
)
Let's take an example of following string:
$string = "length:max(260):min(20)";
In the above string, :max(260):min(20) is optional. I want to get it if it is present otherwise only length should be returned.
I have following regex but it doesn't work:
/(.*?)(?::(.*?))?/se
It doesn't return anything in the array when I use the preg_match function.
Remember, the string can be something other than the one above. Maybe like this:
$string = "number:disallow(negative)";
Is there a problem in my regex, or does PHP simply not return anything? Dumping the result of preg_match gives int 1, which means the string matches the regex.
Fully Dumped:
int 1
array (size=2)
0 => string '' (length=0)
1 => string '' (length=0)
Your pattern starts with a lazy any-character match (.*?), and everything after it is optional, so the engine is satisfied with an empty match at position zero. That is why preg_match returns 1 while both array entries are empty strings. If you change preg_match to preg_match_all you can see every match it finds along the string.
Another problem is the regular expression itself: it makes the engine do far more work than needed. Also, the e modifier is deprecated as of PHP 5.5 and was removed in PHP 7.0, and it only ever applied to the preg_replace function.
Don't use the s modifier either! It's not needed here.
This works in your case:
/([^:]+)(:.*)?/
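A rough sketch of how that pattern behaves on the strings from the question. Because the second group is optional, it is missing from the result array when there is no colon, so the sketch falls back to an empty string; splitRule is a made-up name.

```php
<?php
// Hypothetical helper: returns [name, rest], where rest is '' when nothing follows the name.
function splitRule(string $rule): array
{
    if (!preg_match('/([^:]+)(:.*)?/', $rule, $m)) {
        return ['', ''];
    }
    // $m[2] is not set at all when the optional group did not take part in the match.
    return [$m[1], $m[2] ?? ''];
}

print_r(splitRule('length:max(260):min(20)'));   // length, :max(260):min(20)
print_r(splitRule('length'));                    // length, (empty string)
print_r(splitRule('number:disallow(negative)')); // number, :disallow(negative)
```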
I tried to prepare a regex that should solve your issue and also add some value.
It not only matches the optional elements but also captures them as key/value pairs.
Regex
/(?<=:|)(?'prop'\w+)(?:\((?'val'.+?)\))?/g
Test string
length:max(260):min(20)
length
number:disallow(negative)
Result
MATCH 1
prop [0-6] length
MATCH 2
prop [7-10] max
val [11-14] 260
MATCH 3
prop [16-19] min
val [20-22] 20
MATCH 4
prop [24-30] length
MATCH 5
prop [31-37] number
MATCH 6
prop [38-46] disallow
val [47-55] negative
EDIT
I think I understand what you meant by a duplicated array with different keys: it was caused by the named captures (prop and val).
Here is a revision without named capture groups:
Regex
/(?<=:|)(\w+)(?:\((.+?)\))?/
Sample code
$str = "length:max(260):min(20)";
$str .= "\nlength";
$str .= "\nnumber:disallow(negative)";
preg_match_all("/(?<=:|)(\w+)(?:\((.+?)\))?/",
$str,
$matches);
print_r($matches);
Result
Array
(
[0] => Array
(
[0] => length
[1] => max(260)
[2] => min(20)
[3] => length
[4] => number
[5] => disallow(negative)
)
[1] => Array
(
[0] => length
[1] => max
[2] => min
[3] => length
[4] => number
[5] => disallow
)
[2] => Array
(
[0] =>
[1] => 260
[2] => 20
[3] =>
[4] =>
[5] => negative
)
)
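For a single rule string, the same name/value idea can also be folded into a nested array. This is only a sketch: it assumes option values never contain a colon (it splits on : first), parseRuleOptions is a made-up name, and the per-part pattern is a simplified variant of the one above.

```php
<?php
// Hypothetical helper: "length:max(260):min(20)" becomes
// ['name' => 'length', 'options' => ['max' => '260', 'min' => '20']].
function parseRuleOptions(string $rule): array
{
    $parts = explode(':', $rule);
    $name = array_shift($parts); // the first part is the rule name
    $options = [];
    foreach ($parts as $part) {
        // A word, optionally followed by a value in parentheses.
        if (preg_match('/^(\w+)(?:\((.+?)\))?$/', $part, $m)) {
            $options[$m[1]] = $m[2] ?? null; // null when the option has no value
        }
    }
    return ['name' => $name, 'options' => $options];
}

print_r(parseRuleOptions('length:max(260):min(20)'));
print_r(parseRuleOptions('number:disallow(negative)'));
```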
I was trying to create a regular expression to extract all MP3/OGG links from an example text, but I couldn't. This is the example text I'm trying to extract the MP3/OGG links from:
this is a example word http://domain.com/sample.mp3 and second file is https://www.mydomain.com/sample2.ogg. then this is a link for third file Download
and PHP part:
$Word = "this is a example word http://domain.com/sample.mp3 and second file is https://www.mydomain.com/sample2.ogg. then this is a link for third file Download";
$Pattern = '/href=\"(.*?)\".mp3/';
preg_match_all($Pattern,$Word,$Matches);
print_r($Matches);
i tried this too:
$Pattern = '/href="([^"]\.mp3|ogg)"/';
$Pattern = '/([-a-z0-9_\/:.]+\.(mp3|ogg))/i';
so i need your help to fix this code and extract all MP3/OGG links from that example word.
Thank you guys.
To retrieve all links, you can use:
((https?:\/\/)?(\w+?\.)+?(\w+?\/)+\w+?\.(mp3|ogg))
((https?:\/\/)? - optional http:// or https://
(\w+?\.)+? - matches the domain labels, each followed by a dot
(\w+?\/)+ - matches the final domain label and any directories, each followed by a forward slash
\w+?\.(mp3|ogg)) - matches a filename ending in .mp3 or .ogg (the dot is escaped so that it only matches a literal dot)
In the string you provided there are several unescaped quotation marks. With those corrected and my regex added in:
$Word = "this is a example word http://domain.com/sample.mp3 and second file is https://www.mydomain.com/sample2.ogg. then this is a link for third file <a href=\"http://seconddomain.com/files/music.mp3\">Download</a>";
$Pattern = '/((https?:\/\/)?(\w+?\.)+?(\w+?\/)+\w+?\.(mp3|ogg))/im';
preg_match_all($Pattern,$Word,$Matches);
var_dump($Matches[0]);
Produces the following output:
array (size=3)
0 => string 'http://domain.com/sample.mp3' (length=28)
1 => string 'https://www.mydomain.com/sample2.ogg' (length=36)
2 => string 'http://seconddomain.com/files/music.mp3' (length=39)
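An alternative sketch that anchors on the scheme and simply refuses to cross whitespace, quotes or angle brackets, so it also accepts hyphens and other characters that \w would miss. It assumes every link starts with http:// or https://. The function name and the href markup in the sample string are my assumptions, chosen to reproduce the third URL in the output above.

```php
<?php
// Hypothetical helper: all absolute http(s) URLs ending in .mp3 or .ogg.
function findAudioLinks(string $text): array
{
    // [^\s"'<>]+ keeps the match inside one URL; \b stops right after the extension.
    preg_match_all('~https?://[^\s"\'<>]+\.(?:mp3|ogg)\b~i', $text, $m);
    return $m[0];
}

$text = 'word http://domain.com/sample.mp3 and https://www.mydomain.com/sample2.ogg. '
      . 'then <a href="http://seconddomain.com/files/music.mp3">Download</a>';

print_r(findAudioLinks($text));
```

The trailing full stop after sample2.ogg is left out of the match because the pattern has to end on the extension.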
..extract all MP3/OGG links from that example word.
e.g.:
(?<=https?://(.+)?)\.(mp3|ogg)
$1 - uri
$2 - extension
Updated:
:( Yes, in PHP (tested on v5.5) a search with:
(?<=https?://(.+)?)\.(mp3|ogg)
hits a restriction:
Compilation failed: lookbehind assertion is not fixed length at offset n
So use the mirrored variant instead:
(?<=p1(.+)?)p2 - match p2 only if p1 was matched before it (lookbehind, which must be fixed length in PHP)
p2(?=(.+)p3) - match p2 only if p3 is matched after it (lookahead, which works with a non-fixed length such as .+? in PHP)
For your sample:
//p2(?=.*p3)
preg_match_all("#https?://(?=(.+?)\.(mp3|ogg))#im", $Word, $Matches);
/*
[0] => Array
(
[0] => http://
[1] => https://
[2] => http://
)
[1] => Array
(
[0] => domain.com/sample
[1] => www.mydomain.com/sample2
[2] => seconddomain.com/files/music
)
[2] => Array
(
[0] => mp3
[1] => ogg
[2] => mp3
)
*/
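One caveat with .+? inside the lookahead: on the same line it will happily run across a URL that is not an MP3/OGG until it reaches a later one. Below is a sketch of a tighter variant that forbids whitespace and quotes in the captured part and then rebuilds the full URLs from the groups; the sample text and the function name are made up.

```php
<?php
// Hypothetical helper: rebuilds full URLs from the scheme match and the two lookahead groups.
function audioUrlsViaLookahead(string $text): array
{
    preg_match_all('#https?://(?=([^\s"\']+?)\.(mp3|ogg))#i', $text, $m, PREG_SET_ORDER);
    $urls = [];
    foreach ($m as $set) {
        // $set[0] is the scheme, $set[1] the host and path, $set[2] the extension.
        $urls[] = $set[0] . $set[1] . '.' . $set[2];
    }
    return $urls;
}

print_r(audioUrlsViaLookahead('see http://a.com/x.jpg and http://b.com/y.mp3'));
// only http://b.com/y.mp3 is returned; the .jpg link is skipped
```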
For the life of me, I can't figure out how to write the regex to split this.
Let's say we have the sample text:
15HGH(Whatever)ASD
I would like to break it down into the following groups (numbers, individual letters, and the contents of parentheses):
15
H
G
H
Whatever
A
S
D
It can have any combination of the above such as:
15HGH
12ABCD
ABCD(Whatever)(test)
So far, I have gotten it to break apart either the numbers/letters or just the parenthesis part broken away. For example, in this case:
<?php print_r(preg_split( "/(\(|\))/", "5(Test)(testing)")); ?>
It will give me
Array
(
[0] => 5
[1] => Test
[2] => testing
)
I am not really sure what to put in the regex to match on only numbers and individual characters when combined. Any suggestions?
I don't know whether preg_match_all would suit you, but:
$text = '15HGH(Whatever)ASD';
preg_match_all("/([a-z]+)(?=\))|[0-9]+|([a-z])/i", $text, $out);
echo '<pre>';
print_r($out[0]);
Array
(
[0] => 15
[1] => H
[2] => G
[3] => H
[4] => Whatever
[5] => A
[6] => S
[7] => D
)
I've got this (I wasn't sure how to write the \n in the replacement, but the substitution itself works):
(\d+|\w|\([^)]++\))
Not much to explain: it first tries to match a number, then a single word character, and failing that, a whole word between parentheses. (The parentheses can't be nested.)
Check this out using preg_match_all():
$string = '15HGH(Whatever)(Whatever)ASD';
preg_match_all('/\(([^\)]+)\)|(\d+)|([a-z])/i', $string, $matches);
$results = array_merge(array_filter($matches[1]),array_filter($matches[2]),array_filter($matches[3]));
print_r($results);
\(([^\)]+)\) --> Matches everything between parentheses
\d+ --> Numbers only
[a-z] --> Single letters only
i --> Case insensitive
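Note that merging the groups one after another returns the pieces grouped by type (parenthesised parts first, then numbers, then letters) rather than in input order, and array_filter without a callback also drops a bare "0". If order matters, a sketch along these lines keeps the tokens in sequence; splitTokens is a made-up name.

```php
<?php
// Hypothetical helper: tokens in input order, with parenthesised content unwrapped.
function splitTokens(string $s): array
{
    preg_match_all('/\(([^)]+)\)|\d+|[a-z]/i', $s, $matches, PREG_SET_ORDER);
    $tokens = [];
    foreach ($matches as $m) {
        // $m[1] exists only when the parenthesised alternative matched.
        $tokens[] = (isset($m[1]) && $m[1] !== '') ? $m[1] : $m[0];
    }
    return $tokens;
}

print_r(splitTokens('15HGH(Whatever)ASD'));
// elements in order: 15, H, G, H, Whatever, A, S, D
```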
Greetings All
I am trying to get the values in the 4th column from the left for this url. I can get all the values, but it skips the first one (30, I think, is the value at the top right now).
My regex is
~<td align="center" class="row2">.*([\d,]+).*</td>~isU
NOTE: HTML PARSING IS NOT AN OPTION RIGHT NOW AS THIS IS PART OF A HUGE SYSTEM AND CANNOT BE CHANGED
Thanking you
Imran
You could just use:
/([\d,]+)/
As the javascript function can be exploited as a "regex selection point"
If you want your regex to work, make the wildcards non-greedy explicitly, i.e. change .* to .*? and drop the U modifier (U inverts greediness, so with it .*? would become greedy again).
Also, the align attribute of the first matching cell is wrapped in single quotes ('') in the HTML rather than double quotes (""), for some weird, inconsistent reason. Try this:
|<td align=["\']center["\'] class="row2">.*?([\d,]+).*?</td>|is
Edit:
$a = file_get_contents('http://www.zajilnet.com/forum/index.php?showforum=31');
preg_match_all('|<td align=["\']center["\'] class="row2">.*?([\d,]+).*?</td>|is',$a,$m);
print_r($m[1]);
Result:
Array
(
[0] => 30
[1] => 16
[2] => 56
[3] => 14
[4] => 96
[5] => 4
[6] => 0
[7] => 17
[.... and more....]
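The effect of accepting either quote style can be checked offline on a small snippet. The HTML below is made up for illustration (it is not taken from the forum page), and the ~ delimiter is my choice.

```php
<?php
// Made-up sample: the first cell uses single quotes around the align value.
$html = <<<HTML
<tr><td align='center' class="row2">30</td></tr>
<tr><td align="center" class="row2"> 1,204 </td></tr>
HTML;

$strict  = '~<td align="center" class="row2">.*?([\d,]+).*?</td>~is';
$lenient = '~<td align=["\']center["\'] class="row2">.*?([\d,]+).*?</td>~is';

preg_match_all($strict, $html, $a);
preg_match_all($lenient, $html, $b);

print_r($a[1]); // only 1,204: the single-quoted cell is skipped
print_r($b[1]); // 30 and 1,204
```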