Make two simple regexes into one - php

I am trying to write a regex that finds .txt, then the "-" that follows it, and gets the first digit after that "-". In the example below, that would be 1.
$record_pattern = '/.txt.+/';
preg_match($record_pattern, $decklist, $record);
print_r($record);
.txt?n=chihoi%20%283-1%29
I want to write this as one expression but can only seem to do it as two. This is my first time working with regexes.

You can use this:
$record_pattern = '/\.txt.+-(\d)/';
Now, the first group contains what you want.
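A minimal sketch of reading that group, using the sample string from the question. The function name firstDigitAfterDash is only illustrative and not part of the original answer:

```php
<?php
// Hypothetical helper: returns the digit that directly follows a "-"
// somewhere after ".txt", or null when the pattern does not match.
function firstDigitAfterDash(string $decklist): ?string
{
    if (preg_match('/\.txt.+-(\d)/', $decklist, $record)) {
        return $record[1]; // capture group 1 holds the digit
    }
    return null;
}

echo firstDigitAfterDash('.txt?n=chihoi%20%283-1%29'); // 1
```

Because .+ is greedy, a string containing several "-digit" pairs would yield the last one. The sample string has only one, so this does not matter here.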

Your regex would be,
\.txt[^-]*-\K\d
You don't need any groups. The pattern matches from .txt up to the literal -. The \K then discards everything matched so far from the reported match; in our case it discards the string .txt?n=chihoi%20%283-. What remains in the match is the first digit right after the -.
Your PHP code would be,
<?php
$mystring = ".txt?n=chihoi%20%283-1%29";
$regex = '~\.txt[^-]*-\K\d~';
if (preg_match($regex, $mystring, $m)) {
    $yourmatch = $m[0];
    echo $yourmatch; // => 1
}
?>


Regular Expression That Contains At Least One Of Each

I'm trying to capitalize "words" that have at least one number, letter, and special character such as a period or dash.
Things like: 3370.01b, 6510.01.b, m-5510.30, and drm-2013-c-004914.
I don't want it to match things like: hello, sk8, and mixed-up
I'm trying to use lookaheads, as suggested, but I can't get it to match anything.
$output = preg_replace_callback('/\b(?=.*[0-9]+)(?=.*[a-z]+)(?=.*[\.-]+)\b/i', function($matches){return strtoupper($matches[0]);}, $input);
You can use this regex to match the strings you want,
(?=\S*[a-z])(?=\S*\d)[a-z\d]+(?:[.-][a-z\d]+)+
Explanation:
(?=\S*[a-z]) - This lookahead ensures that there is at least one letter in the upcoming word
(?=\S*\d) - This lookahead ensures that there is at least one digit in the upcoming word
[a-z\d]+(?:[.-][a-z\d]+)+ - This part matches an alphanumeric word containing at least one special character, . or -
Here is a PHP demo based on your code,
$input = '3370.01b, 6510.01.b, m-5510.30, and drm-2013-c-004914 hello, sk8, and mixed-up';
$output = preg_replace_callback('/(?=\S*[a-z])(?=\S*\d)[a-z\d]+(?:[.-][a-z\d]+)+/i', function($matches){return strtoupper($matches[0]);}, $input);
echo $output;
Prints,
3370.01B, 6510.01.B, M-5510.30, and DRM-2013-C-004914 hello, sk8, and mixed-up
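To see which individual tokens the pattern accepts, it can be anchored with ^ and $ and tried against the samples from the question. This is only a sketch; the helper name isMixedToken is made up for illustration:

```php
<?php
// Hypothetical helper: true when $token consists of alphanumeric runs joined
// by "." or "-" and contains at least one letter and at least one digit.
function isMixedToken(string $token): bool
{
    $pattern = '/^(?=\S*[a-z])(?=\S*\d)[a-z\d]+(?:[.-][a-z\d]+)+$/i';
    return preg_match($pattern, $token) === 1;
}

foreach (['3370.01b', 'm-5510.30', 'hello', 'sk8', 'mixed-up'] as $token) {
    echo $token, ' => ', isMixedToken($token) ? 'match' : 'no match', PHP_EOL;
}
```

"sk8" fails because it has no . or - separator, and "mixed-up" fails because it has no digit.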
Regular expression:
https://regex101.com/r/sdmlL8/1
(?=.*\d)(.*)([-.])(.*)
PHP code:
https://ideone.com/qEBZQc
$input = '3370.01b';
$output = preg_replace_callback('/(?=.*\d)(.*)([-.])(.*)/i', function($matches){return strtoupper($matches[0]);}, $input);
I don't think you ever captured anything to put into matches...
$input = '3370.01b foo';
$output = preg_replace_callback('/(?=.*[0-9])(?=.*[a-z])(\w+(?:[-.]\w+)+)/i', function($matches){return strtoupper($matches[0]);}, $input);
echo $output;
Output
3370.01B foo
Sandbox
https://regex101.com/r/syJWMN/1

preg_match how to return matches?

According to PHP manual "If matches is provided, then it is filled with the results of search. $matches[0] will contain the text that matched the full pattern, $matches[1] will have the text that matched the first captured parenthesized subpattern, and so on."
How can I return a value from a string with only knowing the first few characters?
The string is dynamic and its contents will always change, but the first four characters will always be the same.
For example how could I return "Car" from this string "TmpsCar". The string will always have "Tmps" followed by something else.
From what I understand I can return using something like this
preg_match('/(Tmps+)/', $fieldName, $matches);
echo($matches[1]);
Should return "Car".
Your regex is flawed. Use this:
preg_match('/^Tmps(.+)$/', $fieldName, $matches);
echo($matches[1]);
$matches = []; // Initialize the matches array first
if (preg_match('/^Tmps(.+)/', $fieldName, $matches)) {
    // if the regex matched the input string, echo the first captured group
    echo($matches[1]);
}
Note that this task could easily be accomplished without regex at all (and likely with better performance), for example with substr(), or with str_starts_with() and str_ends_with() in PHP 8 and later.
"The string will always have "Tmps" followed by something else."
You don't need a regular expression, in that case.
$result = substr($fieldName, 4);
If the first four characters are always the same, just take the portion of the string after that.
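If the prefix might occasionally be missing, a small guard can be added before cutting it off. A sketch, where stripPrefix is a hypothetical helper name and not a PHP built-in:

```php
<?php
// Hypothetical helper: returns the part of $value after $prefix,
// or null when $value does not start with $prefix.
function stripPrefix(string $value, string $prefix): ?string
{
    // strncmp() compares only the first strlen($prefix) bytes.
    if (strncmp($value, $prefix, strlen($prefix)) !== 0) {
        return null;
    }
    return substr($value, strlen($prefix));
}

echo stripPrefix('TmpsCar', 'Tmps'); // Car
```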
An alternative way is using the explode function
$fieldName = "TmpsCar";
$matches = explode("Tmps", $fieldName);
if (isset($matches[1])) {
    echo $matches[1]; // prints "Car"
}
If the text you are searching contains more than just the string starting with Tmps, you can use the \w+ pattern, which matches one or more "word" characters.
This results in a regular expression like this:
/Tmps(\w+)/
and altogether in PHP:
$text = "This TmpsCars is a test";
if (preg_match('/Tmps(\w+)/', $text, $m)) {
    echo "Found:" . $m[1]; // this would print Cars
}

How to hook first hashtaged word in text

I have a PHP variable named $caption. Its value sometimes contains #hashtag words, as on Instagram or Twitter, for example:
The Caipirinha is similar to a mojito, except there’s no mint … and there are a lot more limes. #Rio #Olympics #RiodeJaneiro #Caipirinha
I just want to grab the first hashtagged word in the text (#Rio in this example).
How can I grab the first hashtagged word in the text ($caption) in PHP?
Thanks for your answers.
You can use preg_match with a regex to match the first #word. Something like:
$string = 'The Caipirinha is similar to a mojito, except there’s no mint … and there are a lot more limes. #Rio #Olympics #RiodeJaneiro #Caipirinha';
preg_match('/#\S+/', $string, $firsthashedword);
echo $firsthashedword[0];
Should do it. \S is any non-whitespace character. The + is a quantifier meaning there must be at least one non-whitespace character after the #. Once it encounters a whitespace the match stops.
PHP Demo: https://eval.in/619122
Regex Demo: https://regex101.com/r/mE8lB6/1
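If the goal is to mark up the first hashtag rather than just read it, the same pattern can be reused with preg_replace() and a limit of 1. A sketch, assuming bold tags are the wanted markup; boldFirstHashtag is a hypothetical name:

```php
<?php
// Hypothetical helper: wraps only the first "#word" in <b> tags.
function boldFirstHashtag(string $caption): string
{
    // The fourth argument (limit = 1) stops after the first replacement;
    // $0 in the replacement refers to the whole match.
    return preg_replace('/#\S+/', '<b>$0</b>', $caption, 1);
}

echo boldFirstHashtag('This is a sample result #cool #muchwow');
// This is a sample result <b>#cool</b> #muchwow
```

A caption without any hashtag is returned unchanged.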
<?php
$result = "This is a sample result #cool #muchwow";
$split = explode("#", $result);
if (isset($split[1])) {
    // rtrim() drops the space that separates the first hashtag from the next one
    $tag = '#' . rtrim($split[1]);
    echo str_replace($tag, '<b>' . $tag . '</b>', $result);
}
?>
Output:
This is a sample result <b>#cool</b> #muchwow
explode() sets $split to an array of the pieces of text between the # characters.
$split[0] holds the text before the first #, so in your example the elements after it would be 'Rio', 'Olympics', and so on (each with its trailing space), and the first hashtag is $split[1].
With that, a simple str_replace() wraps the first hashtag in HTML bold tags.
Check it out here: https://eval.in/619127

Extract last section of string

I have a string like this:
[numbers]firstword[numbers]mytargetstring
I would like to know how to extract "mytargetstring", taking into account the following:
a.) The numbers are numeric digits. For example, my complete string with numbers:
12firstword21mytargetstring
b.) The numbers can have any number of digits. The example above has two digits each, but there can be any number of digits, like this:
123firstword21567mytargetstring
Regardless of the number of digits, I am only interested in extracting "mytargetstring".
By the way "firstword" is fixed and will not change with any combination.
I am not very good at regex, so I would appreciate it if someone with a strong background could suggest how to do this in PHP. Thank you so much.
This will do it (or should do)
$input = '12firstword21mytargetstring';
preg_match('/\d+\w+\d+(\w+)$/', $input, $matches);
echo $matches[1]; // mytargetstring
It breaks down as
\d+\w+\d+(\w+)$
\d+ - One or more digits
\w+ - followed by 1 or more word characters
\d+ - followed by 1 or more digits
(\w+)$ - followed by 1 or more word characters that end the string. The parentheses mark this as a group you want to extract
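Since the question states that "firstword" is fixed, the pattern could also spell it out literally, which makes the intent easier to read. A sketch under that assumption; extractTarget is a hypothetical helper name:

```php
<?php
// Hypothetical helper: returns the text after "<digits>firstword<digits>",
// or null when the input does not have that shape.
function extractTarget(string $input): ?string
{
    if (preg_match('/^\d+firstword\d+(.+)$/', $input, $matches)) {
        return $matches[1];
    }
    return null;
}

echo extractTarget('123firstword21567mytargetstring'); // mytargetstring
```

This assumes the target itself does not begin with a digit; otherwise those leading digits would be consumed by the second \d+.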
preg_match("/[0-9]+[a-z]+[0-9]+([a-z]+)/i", $your_string, $matches);
print_r($matches);
You can do it with preg_match and pattern syntax.
$string = '2firstword21mytargetstring';
if (preg_match('/\d(\D*)$/', $string, $match)) {
    // \d     -- any digit character
    // (\D*)  -- 0 or more non-digit characters (captured)
    // $      -- end of string
    var_dump($match[1]);
}
Try it like this:
$parts = preg_split('/\d+/', "12firstword21mytargetstring");
print_r($parts);
echo '<br/>';
// end() takes its argument by reference, so pass a variable rather than the function result
echo 'Final string is: '.end($parts);
Tested on http://writecodeonline.com/php/
You don't need regex for that:
for ($i = strlen($string) - 1; $i >= 0; $i--) {
    if (is_numeric($string[$i])) break;
}
$extracted_string = substr($string, $i + 1);
This is probably about as fast as it gets, and likely faster than using regex, which you don't need for this simple case.
Here is a simple solution:
$keywords = preg_split("/[\d,]+/", "hypertext123language2434programming");
echo($keywords[2]);

Identifying a random repeating pattern in a structured text string

I have a string that has the following structure:
ABC_ABC_PQR_XYZ
Where PQR has the structure:
ABC+JKL
and
ABC itself is a string that can contain alphanumeric characters and a few other characters like "_", "-", "+", "." and follows no set structure:
e.g. qWe_rtY-asdf or pkl123
so, in effect, the string can look like this:
qWe_rtY-asdf_qWe_rtY-asdf_qWe_rtY-asdf+JKL_XYZ
My goal is to find out what string constitutes ABC.
I was initially just using
$arrString = explode("_",$string);
to return $arrString[0] before I was made aware that ABC ($arrString[0]) itself can contain underscores, thus rendering it incorrect.
My next attempt was exploding it on "_" anyway and then comparing each of the exploded string parts with the first part until I get a semblance of a pattern:
function getPatternABC($string)
{
    $count = 0;
    $pattern = "";
    $arrString = explode("_", $string);
    foreach ($arrString as $expString)
    {
        if (strcmp($expString, $arrString[0]) !== 0 || $count == 0)
        {
            $pattern = $pattern . "_" . $arrString[$count];
            $count++;
        }
        else break;
    }
    return substr($pattern, 1);
}
This works great - but I wanted to know if there was a more elegant way of doing this using regular expressions?
Here is the regex solution:
'^([a-zA-Z0-9_+-]+)_\1_\1\+'
What this does is match (starting from the beginning of the string) the longest possible sequence consisting of the characters inside the square brackets (edit that per your spec). The sequence must appear exactly twice, each time followed by an underscore, and then must appear once more followed by a plus sign (this is actually the first half of PQR with the delimiter before JKL). The rest of the input is ignored.
You will find ABC captured as capture group 1.
So:
$input = 'qWe_rtY-asdf_qWe_rtY-asdf_qWe_rtY-asdf+JKL_XYZ';
$result = preg_match('/^([a-zA-Z0-9_+-]+)_\1_\1\+/', $input, $matches);
if ($result) {
    echo $matches[1]; // capture group 1 holds ABC
}
Sure, just make a regular expression that matches your pattern. In this case, something like this:
preg_match('/^([a-zA-Z0-9_+.-]+)_\1_\1\+JKL_XYZ$/', $string, $match);
Your ABC is in $match[1].
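The backreference approach can be wrapped in a small function and tried against both sample values of ABC from the question. This is a sketch only; findAbc is a hypothetical name, and the pattern checks just the repeated prefix up to the + sign:

```php
<?php
// Hypothetical helper: returns ABC for strings shaped like ABC_ABC_ABC+JKL_XYZ,
// or null when no prefix repeats that way.
function findAbc(string $input): ?string
{
    // \1 is a backreference: it must match the exact text captured by group 1.
    if (preg_match('/^([a-zA-Z0-9_+.-]+)_\1_\1\+/', $input, $matches)) {
        return $matches[1];
    }
    return null;
}

echo findAbc('qWe_rtY-asdf_qWe_rtY-asdf_qWe_rtY-asdf+JKL_XYZ'), PHP_EOL; // qWe_rtY-asdf
echo findAbc('pkl123_pkl123_pkl123+JKL_XYZ'), PHP_EOL;                   // pkl123
```

The regex engine starts with the longest candidate for group 1 and backtracks until the two backreferences line up, which is why underscores inside ABC are not a problem.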
If the presence of underscores in these strings has a low frequency, it may be worth checking to see if a simple explode() will do it before bothering with regex.
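<?php
$str = 'ABC_ABC_PQR_XYZ';
if (substr_count($str, '_') == 3) {
    // exactly three underscores means ABC itself contains none
    $abc = explode('_', $str)[0];
} else {
    // regexy_function() is a placeholder for the regex-based approach
    $abc = regexy_function($str);
}
?>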
