Ignore character using regex (pcre) PHP - php

I am trying to capture the date from a file name such as file-2018-02-19-second.json.
I am using the regex .*file-(..........).*.json to capture the date. It captures 2018-02-19, but I want to ignore the "-" characters and capture only 20180219. How can I do it?

If your filenames always have the same format, you can convert your string to a DateTime instance using DateTime::createFromFormat:
$date = DateTime::createFromFormat('*-Y-m-d-*.*', 'file-2018-02-19-second.json');
echo $date->format('Ymd');
You can find the different format characters and their meanings in the php manual.
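DateTime::createFromFormat returns false when the string does not fit the format, so it may be worth checking the result before calling format(). A minimal sketch, assuming a hypothetical helper name (compactDateFromFileName is not part of the answer above):

```php
<?php
// Hypothetical helper: returns the date as Ymd, or null if the name does not fit the format.
function compactDateFromFileName(string $fileName): ?string
{
    // '*' skips bytes up to the next separator or digit, so it covers "file", "second" and "json".
    $date = DateTime::createFromFormat('*-Y-m-d-*.*', $fileName);

    // createFromFormat() returns false on a parse failure.
    return $date === false ? null : $date->format('Ymd');
}

echo compactDateFromFileName('file-2018-02-19-second.json'), "\n"; // 20180219
var_dump(compactDateFromFileName('notes.txt'));                    // NULL
```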

$fileName = 'file-2018-02-19-second.json';
preg_match('#([12]\d{3}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01]))#', $fileName, $output);
if (!empty($output)) {
    // Strip the dashes from the captured date
    $date = str_replace('-', '', $output[1]);
    echo $date;
}
Hope this helps!
related link: https://www.regextester.com/96683
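The pattern above limits the month and day to plausible ranges, but it still accepts impossible dates such as 2018-02-31. If that matters, one option is to pass the captured parts to checkdate(). A small sketch, assuming a hypothetical function name (validCompactDate) and a simplified pattern:

```php
<?php
// Hypothetical helper: extracts Y-m-d from a name and rejects dates that do not exist.
function validCompactDate(string $fileName): ?string
{
    if (!preg_match('#(\d{4})-(\d{2})-(\d{2})#', $fileName, $m)) {
        return null; // no date-shaped substring at all
    }

    // checkdate() takes month, day, year (in that order) as integers.
    if (!checkdate((int) $m[2], (int) $m[3], (int) $m[1])) {
        return null; // e.g. 2018-02-31
    }

    return $m[1] . $m[2] . $m[3];
}

echo validCompactDate('file-2018-02-19-second.json'), "\n"; // 20180219
var_dump(validCompactDate('file-2018-02-31-second.json'));  // NULL
```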

Option 1 - Match & Replace
<?php
$fn = 'file-2018-02-19-second.json';
preg_match('/.*file-\K\d{4}-\d{2}-\d{2}/', $fn, $o);
echo isset($o[0]) ? preg_replace('/\D+/', '', $o[0]) : 'No match found';
Option 2 - Group & Concatenate
<?php
$fn = 'file-2018-02-19-second.json';
preg_match('/.*file-(\d{4})-(\d{2})-(\d{2})/', $fn, $o);
echo isset($o[1], $o[2], $o[3]) ? $o[1].$o[2].$o[3] : 'No match found';
Explanation of Patterns
.*file-\K\d{4}-\d{2}-\d{2}
.* Match any character any number of times
file- Match this literally
\K Resets the starting point of the match. Any previously consumed characters are no longer included in the final match.
\d{4} Match any digit exactly 4 times
- Match this literally
\d{2} Match any digit exactly 2 times
- Match this literally
\d{2} Match any digit exactly 2 times
The second pattern \D+ simply matches any non-digit character one or more times for replacement.
The last pattern (from option 2) is really just the simplified version of the first pattern I described, but groups each number part into capture groups.
Result: 20180219
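To make the effect of \K concrete, here is a short comparison of the same pattern with and without it. This is only an illustrative sketch; the variable names are made up for the example.

```php
<?php
$fn = 'file-2018-02-19-second.json';

// Without \K the full match still contains the "file-" prefix.
preg_match('/.*file-\d{4}-\d{2}-\d{2}/', $fn, $withoutK);

// With \K everything consumed before it is dropped from the reported match.
preg_match('/.*file-\K\d{4}-\d{2}-\d{2}/', $fn, $withK);

echo $withoutK[0], "\n";                          // file-2018-02-19
echo $withK[0], "\n";                             // 2018-02-19
echo preg_replace('/\D+/', '', $withK[0]), "\n";  // 20180219
```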

This question appears to be solely about data extraction, not about data validation.
Code:
$file = 'file-2018-02-19-second.json';
$date = preg_replace('~\D+~', '', $file);
echo $date;
Output:
20180219
If you need slightly more accuracy/validation than that (leveraging the location of file-), you can use \G to extract the date components before imploding them.
Code:
$file = 'somefile-2018-02-19-second.json';
// \G(?!^) continues from the end of the previous match, so only the numeric segments directly after "file-" are collected
echo preg_match_all('~(?:file|\G(?!^))-\K\d+~', $file, $out) ? implode($out[0]) : 'fail';
// same result as earlier method
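The value of \G shows up when the file name contains other numbers further along. The sketch below compares a \G-chained pattern with a plain "dash then digits" pattern; both patterns and the function names are assumptions written for this illustration, not necessarily what the answer intended.

```php
<?php
// Collects only the numeric segments that directly follow "file-" without a gap.
function chainedDigits(string $name): string
{
    // \G(?!^) matches where the previous match ended, but never at the very start.
    $pattern = '~(?:file|\G(?!^))-\K\d+~';
    return preg_match_all($pattern, $name, $out) ? implode('', $out[0]) : 'fail';
}

// Collects every "-digits" segment, wherever it appears.
function looseDigits(string $name): string
{
    return preg_match_all('~-\K\d+~', $name, $out) ? implode('', $out[0]) : 'fail';
}

$name = 'somefile-2018-02-19-second-7.json'; // hypothetical name with a trailing number

echo chainedDigits($name), "\n"; // 20180219
echo looseDigits($name), "\n";   // 201802197
```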

Related

Replace all occurrences using preg_replace

The code below works perfectly:
$string = '(test1)';
$new = preg_replace('/^\(+.+\)+$/','word',$string);
echo $new;
Output:
word
If the code is this:
$string = '(test1) (test2) (test3)';
How to generate output:
word word word?
Why doesn't my regex work?
^ and $ are anchors, which means the match must start at the beginning of the string and extend to the end of it.
. matches any character except a newline and + means one or more. Regex quantifiers are greedy by default, so .+ matches as much as possible, whereas we want each match to stop at the first ). So the pattern needs to change a bit.
You can use
\([^)]+\)
$string = '(test1) (test2) (test3)';
$new = preg_replace('/\([^)]+\)/','word',$string);
echo $new;
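For comparison, a quick sketch of how the greedy, negated-class and lazy variants behave on the same input (the variable names are only for this example):

```php
<?php
$string = '(test1) (test2) (test3)';

// Greedy: .+ runs to the last ")", so the whole string is one match.
$greedy = preg_replace('/\(.+\)/', 'word', $string);

// Negated class: [^)]+ cannot cross a ")", so each group matches separately.
$negated = preg_replace('/\([^)]+\)/', 'word', $string);

// Lazy quantifier: .+? stops at the first ")" it can.
$lazy = preg_replace('/\(.+?\)/', 'word', $string);

echo $greedy, "\n";  // word
echo $negated, "\n"; // word word word
echo $lazy, "\n";    // word word word
```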

Regular expression currency format with dots and comma

My goal is getting something like that: 150.000,54 or 48.876,05 which means my commas are decimal starters.
Here's my code so far :
<?php
//cut numbers after comma if there are any, after 2 digits
$matchPattern = '/[0-9]+(?:\,[0-9]{2}){0,2}/';
//remove everything except numbers, commas and dots
$repl1 = preg_replace("/[^a-zA-Z0-9,.]/", "", $input);
//let there be a 0 before comma to have values like 0,75
$repl2 = preg_replace("/^[0]{1}$/", "",$repl1);
//now i need you here to help me for the expression putting dots after each 3 numbers, until the comma:
$repl3 = preg_replace("/regexphere$/", ".", $repl2);
preg_match($matchPattern, $repl3, $matches);
echo($matches[0]);
?>
I know preg_replacing 3 times is stupid but I am not good at writing regular expressions. If you have a better idea, don't just share it but also explain. I know a little of the types : http://regexone.com/lesson/0
Thank you in advance.
--------UPDATE--------
So I need inputs like 0000,45 to become 0,45 and inputs like 010101,84 to become 1,84.
When this is done, I'm done.
$input = Input::get('userinput');
$repl1 = preg_replace("/[^0-9,.]/", "", $input);
$repl2 = preg_replace("/^0/", "",$repl1);
$repl3 = str_replace(".","",$repl2);
preg_match('/[0-9]+(?:\,[0-9]{2}){0,2}/', $repl3, $matches);
$repl4 = preg_replace('/(\d)(?=(\d{3})+(?!\d))/', '$1.', $matches[0]);
return $repl4;
----UPDATE----
Here's what i get so far : https://ideone.com/5qmslB
I just need to remove the zeroes before the comma, before the numbers.
I am not sure this is the best way, but I hope it is helpful.
Here is the updated code that I used with a fake $input:
<?php
$input = "textmdwrhfejhg../,2222233333,34erw.re.ty";
//cut numbers after comma if there are any, after 2 digits
$matchPattern = '/[0-9]+(?:\,[0-9]{2}){0,2}/';
//remove everything except numbers, commas and dots
$repl1 = trim(preg_replace("/[^0-9,.]/", "", $input), ".,");
echo "$repl1" . "\n";
//let there be a 0 before comma to have values like 0,75, remove the 0
$repl2 = preg_replace("/^0/", "",$repl1);
echo "$repl2" . "\n";
//The expression putting dots after each 3 numbers, until the comma:
$repl3 = preg_replace('/(\d)(?=(?:\d{3})+(?!\d))/', '$1.', $repl2);
echo "$repl3" . "\n";
The expression putting dots after each 3 numbers is
(\d)(?=(?:\d{3})+(?!\d))
In plain terms:
(\d) - A capturing group that we'll use in the replacement pattern, matching a single digit that....
(?=(?:\d{3})+(?!\d)) - is followed by groups of 3 digits. External (?=...) is a look-ahead construction that checks but does not consume characters, (?:\d{3})+ is a non-capturing group (no need to keep the matched text in memory) that matches 3 digits exactly (due to the limiting quantifier {3}) 1 or more times (due to the + quantifier), and (?!\d) is a negative look-ahead checking that the next character after the last matched 3-digit group is not a digit.
This does not work in case we have more than 3 digits after a decimal separator. With regex, I can only think of a way to support 4 digits after decimal with (?<!,)(\d)(?=(?:\d{3})+(?!\d)). Not sure if there is a generic way without variable-width look-behind in PHP (as here, we also need a variable-width look-ahead, too). Thus, you might consider splitting the $repl2 value by comma, and only pass the first part to the regex. Then, combine. Something like this:
// split() was removed in PHP 7; explode() does the same job for a literal delimiter
$spl = explode(',', $repl2); // $repl2 is 1234,123456
$repl3 = preg_replace('/(\d)(?=(?:\d{3})+(?!\d))/', '$1.', $spl[0]);
$repl3 .= "," . $spl[1]; // "1.234" + "," + "123456"
echo "$repl3" . "\n"; // 1.234,123456
Update:
The final code I have come up with:
$input = "textmdwrhfejhg../0005456,2222233333,34erw.re.ty";
//Here's mine :
$repl1 = trim(preg_replace("/[^0-9,.]/", "", $input), '.,');
echo "$repl1\n";
// Remove all leading zeros, keeping one that directly precedes the comma
// Input : 000549,569 Output : 549,569
$repl2 = preg_replace("/^0+(?!,)/", "",$repl1);
$repl3 = str_replace(".","",$repl2);
preg_match('/[0-9]+(?:\,[0-9]{2}){0,2}/', $repl3, $matches);
$repl4 = preg_replace('/(\d)(?=(\d{3})+(?!\d))/', '$1.', $matches[0]);
echo $repl4;
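The steps discussed in this thread can also be folded into one function that treats the integer and decimal parts separately, which avoids the problem of dots being inserted after the comma. This is a rough sketch under a few assumptions that are not in the posts above: the name formatAmount is hypothetical, leading zeros are simply stripped, and the decimals are truncated and zero-padded to two digits.

```php
<?php
// Hypothetical helper: "010101,84" -> "10.101,84", "0000,45" -> "0,45".
function formatAmount(string $raw): ?string
{
    // Keep digits and commas only; existing dots are dropped and regrouped below.
    $clean = preg_replace('/[^0-9,]/', '', $raw);
    if (!preg_match('/\d/', $clean)) {
        return null; // nothing numeric in the input
    }

    // Split at the first comma into integer and decimal parts.
    $parts = explode(',', $clean, 2);

    // Strip leading zeros, but keep a single 0 for values like 0,45.
    $int = ltrim($parts[0], '0');
    if ($int === '') {
        $int = '0';
    }

    // Assumption: truncate to two decimals and pad with zeros.
    $dec = isset($parts[1]) ? str_replace(',', '', $parts[1]) : '';
    $dec = str_pad(substr($dec, 0, 2), 2, '0');

    // Insert a dot before every group of three digits in the integer part.
    $int = preg_replace('/(\d)(?=(?:\d{3})+(?!\d))/', '$1.', $int);

    return $int . ',' . $dec;
}

echo formatAmount('0000,45'), "\n";   // 0,45
echo formatAmount('150000,54'), "\n"; // 150.000,54
```

Working on strings throughout, rather than converting to float, sidesteps rounding surprises for money values.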

Removing all characters and numbers except last variable with dash symbol

Hi, I want to use preg_replace in PHP to remove all characters, letters and numbers alike, except the trailing part that consists of a dash (-) followed by digits. Here's my code:
echo preg_replace('/(.+)(?=-[0-9])|(.+)/','','asdf1245-10');
I expect the result will be
-10
The problem is that the code above does not work reliably. I checked the pattern on http://www.regextester.com/ and it seems to work there, but on http://www.phpliveregex.com/ it doesn't work at all. I don't know why. Can anyone help me figure it out?
Thanks a lot
Here is a way to go:
echo preg_replace('/^.+?(-[0-9]+)?$/','$1','asdf1245-10');
Output:
-10
and
echo preg_replace('/^.+?(-[0-9]+)?$/','$1','asdf124510');
Output:
<nothing>
My first thought is to use explode in this case and keep it simple, like the following code.
$string = 'asdf1245-10';
$array = explode('-', $string);
$result = '-' . end($array);
// $result is '-10'
Another way:
$result = preg_match('~\A.*\K-\d+\z~', $str, $m) ? $m[0] : '';
pattern details:
\A # start of the string anchor
.* # zero or more characters
\K # discard all on the left from match result
-\d+ # the dash and the digits
\z # end of the string anchor
echo preg_replace('/(\w+)(-\w+)/','$2', 'asdf1245-10');
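Since only the tail of the string is wanted, it can also be matched directly instead of removing everything in front of it. A minimal sketch, assuming a hypothetical function name (trailingDashNumber) and that an empty string is the desired result when there is no trailing dash-number:

```php
<?php
// Hypothetical helper: returns the trailing "-digits" part, or '' if there is none.
function trailingDashNumber(string $s): string
{
    // \z anchors at the very end of the string.
    return preg_match('/-\d+\z/', $s, $m) ? $m[0] : '';
}

echo trailingDashNumber('asdf1245-10'), "\n"; // -10
var_dump(trailingDashNumber('asdf124510'));   // string(0) ""
```

Note that the explode approach behaves differently on input without a dash: for 'asdf124510' it yields '-asdf124510', because explode returns the whole string as the only element.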

Make two simple regex's into one

I am trying to make a regex that will look behind .txt and then behind the "-" and get the first digit .... in the example, it would be a 1.
$record_pattern = '/.txt.+/';
preg_match($record_pattern, $decklist, $record);
print_r($record);
.txt?n=chihoi%20%283-1%29
I want to write this as one expression but can only seem to do it as two. This is the first time working with regex's.
You can use this:
$record_pattern = '/\.txt.+-(\d)/';
Now, the first group contains what you want.
Your regex would be,
\.txt[^-]*-\K\d
You don't need any groups. The pattern matches from .txt up to the literal -. Because of the \K, the characters matched so far are discarded; in our case that is the string .txt?n=chihoi%20%283-. The reported match is then just the first digit after the -.
Your PHP code would be,
<?php
$mystring = ".txt?n=chihoi%20%283-1%29";
$regex = '~\.txt[^-]*-\K\d~';
if (preg_match($regex, $mystring, $m)) {
    $yourmatch = $m[0];
    echo $yourmatch; // => 1
}
?>
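The two suggested patterns agree on the sample input but diverge once the string contains more than one dash. The sketch below illustrates this; the function names and the second input are made up for the example.

```php
<?php
// Greedy version: .+ backtracks to the LAST "-" that is followed by a digit.
function digitGreedy(string $s): ?string
{
    return preg_match('/\.txt.+-(\d)/', $s, $m) ? $m[1] : null;
}

// Strict version: [^-]* stops at the FIRST "-", which must be followed by a digit.
function digitStrict(string $s): ?string
{
    return preg_match('~\.txt[^-]*-\K\d~', $s, $m) ? $m[0] : null;
}

$single = '.txt?n=chihoi%20%283-1%29';
$multi  = '.txt?n=chi-hoi%20%283-1%29'; // hypothetical input with an extra dash

echo digitGreedy($single), ' ', digitStrict($single), "\n"; // 1 1
echo digitGreedy($multi), "\n";                             // 1
var_dump(digitStrict($multi));                              // NULL
```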

Extract last section of string

I have a string like this:
[numbers]firstword[numbers]mytargetstring
I would like to know how to extract "mytargetstring", taking the following into account:
a.) Numbers are numerical digits for example, my complete string with numbers:
12firstword21mytargetstring
b.) Numbers can be any digits, for example above are two digits each, but it can be any number of digits like this:
123firstword21567mytargetstring
Regardless of the number of digits, I am only interested in extracting "mytargetstring".
By the way "firstword" is fixed and will not change with any combination.
I am not very good in Regex so I appreciate someone with strong background can suggest how to do this using PHP. Thank you so much.
This will do it (or should do)
$input = '12firstword21mytargetstring';
preg_match('/\d+\w+\d+(\w+)$/', $input, $matches);
echo $matches[1]; // mytargetstring
It breaks down as
\d+\w+\d+(\w+)$
\d+ - One or more numbers
\w+ - followed by 1 or more word characters
\d+ - followed by 1 or more numbers
(\w+)$ - followed by 1 or more word characters that end the string. The brackets mark this as a group you want to extract
preg_match("/[0-9]+[a-z]+[0-9]+([a-z]+)/i", $your_string, $matches);
print_r($matches);
You can do it with preg_match and pattern syntax.
$string = '2firstword21mytargetstring';
if (preg_match('/\d(\D*)$/', $string, $match)) {
    // \d  - any digit character
    // \D  - any non-digit character
    // *   - 0 or more
    // $   - end of string
    var_dump($match[1]);
}
Try it like,
$parts = preg_split('/\d+/', "12firstword21mytargetstring");
print_r($parts);
echo '<br/>';
// end() takes its argument by reference, so pass a variable rather than a function result
echo 'Final string is: ' . end($parts);
Tested on http://writecodeonline.com/php/
You don't need regex for that:
// Walk backwards until the last digit; $i >= 0 so index 0 is checked too
for ($i = strlen($string) - 1; $i >= 0; $i--) {
    if (is_numeric($string[$i])) break;
}
$extracted_string = substr($string, $i + 1);
This is probably about as fast as it gets and likely faster than a regex, which you don't need for this simple case.
A simple solution:
$keywords = preg_split("/[\d,]+/", "hypertext123language2434programming");
echo($keywords[2]);
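Because the question states that "firstword" is fixed, the pattern can also spell it out, which makes a non-matching string fail instead of returning something unexpected. A small sketch, assuming a hypothetical function name (targetAfterFirstword) and that the target contains no digits:

```php
<?php
// Hypothetical helper: returns the text after the second number, or null if the shape differs.
function targetAfterFirstword(string $s): ?string
{
    // digits, the literal "firstword", digits, then non-digits up to the end of the string
    return preg_match('/^\d+firstword\d+(\D+)\z/', $s, $m) ? $m[1] : null;
}

echo targetAfterFirstword('12firstword21mytargetstring'), "\n";     // mytargetstring
echo targetAfterFirstword('123firstword21567mytargetstring'), "\n"; // mytargetstring
var_dump(targetAfterFirstword('firstword21mytargetstring'));        // NULL
```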
