I'm working on a PHP based application extension that will extend a launcher style app via the TVRage API class to return results to a user wherever they may be. This is done via Alfred App (alfredapp.com).
I would like to add the ability to include show name followed by S##E##:
example: Mike & Molly S01E02
The show name can change, so I can't stop it there, but I want to separate the S##E## from the show name. This will allow me to use that information to continue the search via the API. Even better, if there was a way to grab the numbers, and only the numbers between the S and the E (in the example 01) and the numbers after E (in the example 02) that would be perfect.
I was thinking the best function is strpos but after looking closer that searches for a string within a string. I believe I would need to use a regex to correctly do this. That would leave me with preg_match. Which led me to:
$regex = ?;
preg_match( ,$input);
Problem is I just don't understand Regular Expressions well enough to write it. What regular expression could be used to separate the show name from the S##E## or get just the two separate numbers?
Also, if you have a good place to teach regular expressions, that would be fantastic.
Thanks!
You can turn it around and use strrpos to look for the last space in the string and then use substr to get two strings based on the position you found.
Example:
$your_input = trim($input); // make sure there are no spaces at the end (and the beginning)
$last_space_at = strrpos($your_input, " ");
$show = substr($your_input, 0, $last_space_at); // everything before the last space
$episode = substr($your_input, $last_space_at + 1);
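The strrpos/substr approach above can be wrapped in a small function that also covers input with no space at all. This is only a sketch: the function name split_at_last_space and the array keys are made up for illustration.

```php
<?php
// Hypothetical helper: split "Show Name S01E02" at the last space.
function split_at_last_space(string $input): ?array
{
    $input = trim($input);
    $pos = strrpos($input, ' ');
    if ($pos === false) {
        return null; // no space found, so there is nothing to split
    }
    return [
        'show'    => rtrim(substr($input, 0, $pos)), // text before the last space
        'episode' => substr($input, $pos + 1),       // text after the last space
    ];
}

print_r(split_at_last_space('Mike & Molly S01E02'));
```

Note that this does not check that the last word really looks like S##E##; it only splits.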
Regex:
$text = 'Mike & Molly S01E02';
preg_match("/(.+)(S\d{2}E\d{2})/", $text, $output);
print_r($output);
Output:
Array
(
[0] => Mike & Molly S01E02
[1] => Mike & Molly
[2] => S01E02
)
If you want the digits separately:
$text = 'Mike & Molly S01E02';
preg_match("/(.+)S(\d{2})E(\d{2})/", $text, $output);
print_r($output);
Output:
Array
(
[0] => Mike & Molly S01E02
[1] => Mike & Molly
[2] => 01
[3] => 02
)
Explanation:
. --> Match any single character (except a newline, by default)
.+ --> Match any character one or more times
\d --> Match a digit
\d{2} --> Match exactly 2 digits
The parentheses capture each part as a group, so it shows up as its own entry in $output.
www.regular-expressions.info is a good place to learn regex.
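One detail in the patterns above: (.+) also swallows the space before the S, so $output[1] is really "Mike & Molly " with a trailing space. A possible variant, as a sketch only, anchors the pattern and keeps the separator out of the show name. The function name parse_episode and the array keys are assumptions for illustration, not part of any API.

```php
<?php
// Hypothetical helper; name and returned keys are illustrative only.
function parse_episode(string $input): ?array
{
    // ^ and $ anchor the whole input, \s+ keeps the separator out of the
    // show name, and the i flag also accepts lowercase "s01e02".
    if (!preg_match('/^(.+?)\s+S(\d{2})E(\d{2})$/i', trim($input), $m)) {
        return null; // input does not end in S##E##
    }
    return [
        'show'    => $m[1],
        'season'  => (int) $m[2],
        'episode' => (int) $m[3],
    ];
}

print_r(parse_episode('Mike & Molly S01E02'));
```

Casting to int turns "01" into 1, which may or may not be what the API expects; drop the casts to keep the zero-padded strings.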
Related
I am using PHP and want to extract phone/mobile numbers from a string. The string contains phone numbers in multiple formats, like:
$str = '(123) 456-7890 or (123)456-7890 and 1234567890 test "123.456.7890" another test "123 456 7890"';
I wrote one regex:
$phoneMatches = '';
$str = '(123) 456-7890 or (123)456-7890 or 1234567890 or "123.456.7890" or "123 456 7890"';
$phonePattern = '/\b[0-9]{3}\s*[-]?\s*[0-9]{3}\s*[-]?\s*[0-9]{4}\b/';
preg_match_all($phonePattern, $str, $phoneMatches);
echo "<pre>";
print_r($phoneMatches);
exit;
but it gives me output like this:
Array
(
[0] => Array
(
[0] => 1234567890
[1] => 123 456 7890
)
)
That is only two matches, but I want every phone and mobile number format in the text to be matched using only ONE regular expression.
Thanks
I know I'm late, and I'm not sure if this is what you wanted, but I came up with this solution:
[+()\d].*?\d{4}(?!\d)
Demonstration: regex101.com
Explanation:
[+()\d] - We start by matching anything that might represent the start of a phone number.
.*?\d{4} - Then we match anything (using a lazy quantifier) until we reach four ending digits. Just a little note: I considered this as a rule, but it might not always apply. You'd then need to modify the regex to include other cases.
(?!\d) - This is a negative lookahead and it means that we don't want any matches followed by a digit character. I used this to avoid some half-matches.
Another observation is that this regex doesn't validate any phone number. You could have anything in between the matches, mainly because of this part: .*?\d{4}. This will work depending on what kind of situation you intend to use it.
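A minimal sketch of that pattern applied to the string from the question with preg_match_all; the variable names are taken from the question, and nothing here validates the numbers:

```php
<?php
$str = '(123) 456-7890 or (123)456-7890 or 1234567890 or "123.456.7890" or "123 456 7890"';

// Start at +, (, ) or a digit, then lazily consume up to the first run of
// four digits that is not followed by another digit.
preg_match_all('/[+()\d].*?\d{4}(?!\d)/', $str, $phoneMatches);

// $phoneMatches[0] holds the full matches, one per phone number found.
print_r($phoneMatches[0]);
```

On this input the pattern picks up all five formats, including the two inside double quotes.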
I have some alerts set up that are emailed to me regularly, and those emails contain content that looks like this:
2002 Volkswagen Eurovan Clean title - $2000
That is the general consistent format. Those are also links that are clickable.
I have a script that's setup already that will extract the links from the body string properly, but what I am looking for is basically the year and the price from those titles that come in. There is the possibility of more than one being listed within the email.
So my question is, how can I use preg_match_all to properly grab all the possibilities so that I can then explode them to get the first piece of data (year) and the last piece of data (price)? Would I take the approach to see if I can match based on digits as it's presumed the format will generally be the same?
You can match the 4 digits starting with 19 or 20 and name that capture year, match the $ followed by digits and name that capture price, and use the anchors ^ and $ if these values are always at the beginning and end of the string:
^(?'year'\b(?:19|20)\d{2}\b)|(?'price'\$\d+)$
See demo
Sample IDEONE code:
$re = "/^(?'year'\\b(?:19|20)\\d{2}\\b)|(?'price'\\$\\d+)$/";
$str = "2002 Volkswagen Eurovan Clean title - \$2100";
preg_match_all($re, $str, $matches);
print_r(array_filter($matches["year"]));
print_r(array_filter($matches["price"]));
Output:
Array
(
[0] => 2002
)
Array
(
[1] => $2100
)
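Since the question mentions that one email can list several titles, here is a sketch of a variant that uses the m modifier so ^ and $ apply per line, and captures year and price together for each listing. It assumes one title per line; the second listing is made-up sample data, and the $body and $listings names are illustrative.

```php
<?php
// Assumed input: one listing per line. The second line is invented sample data.
$body = "2002 Volkswagen Eurovan Clean title - \$2000\n"
      . "1998 Honda Civic Runs great - \$1500";

// m: ^ and $ match at every line start and end.
// year: 4 digits starting with 19 or 20; price: the digits after the last $ on the line.
$re = '/^(?<year>(?:19|20)\d{2})\b.*\$(?<price>\d+)\s*$/m';

// PREG_SET_ORDER groups the captures per match instead of per group.
preg_match_all($re, $body, $matches, PREG_SET_ORDER);

$listings = [];
foreach ($matches as $m) {
    $listings[] = ['year' => (int) $m['year'], 'price' => (int) $m['price']];
}
print_r($listings);
```

Because both values come from the same match, there is no need to explode the title or to filter empty entries afterwards.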
I am trying to group words of 4 or more characters with words of 3 or less characters using preg_match_all() in PHP. I am doing this for a keyword search function where users can enter things like "An elephant" and I cannot have any results come back that have just "An" in them.
Therefore instead of breaking the keywords apart by spaces, (e.g. "An", "elephant") I need to put the keywords of three or less characters with the next or previous keyword. (e.g. "An elephant", "History of")
In order to accomplish this I am trying to use conditional sub patterns but I am not sure if I am really on the right track here.
Here's the best I've got so far:
(\s\S{1,3}\s*)?(?(1)\S+)
Yet I seem to also be matching a whole bunch of empty spaces as well.
Can someone please point me in the right direction?
In the case of "History of elephants" I am trying to get it to create two matches: "History of", and "elephants".
I cannot simply omit the "stop words" because they are important in this case. The real-life use case is searching for course titles such as "Calculus A" and in that case "A" is important.
See if this would match your needs:
\b(?:[\w'-]{1,3}\W+[\w'-]{4,}|[\w'-]{4,}\W+[\w'-]{1,3}|[\w'-]{4,})\b
Starts at \b word boundaries where it...
[\w'-]{1,3}\W+[\w'-]{4,} matches 1-3 word characters, followed by \W+ one or more non-word characters, followed by [\w'-]{4,}\b 4 or more word characters.
|[\w'-]{4,}\W+[\w'-]{1,3} or matches first the 4+ words followed by shorter ones.
|[\w'-]{4,} or matches any words with at least 4 characters. (reduce if needed)
Test at regex101.com; Regex FAQ
Also note the problem with input such as "I visited Calculus A, you in Calculus B?". It outputs: I visited, Calculus A, in Calculus, because preceding words take priority.
And a PHP-example ($out[0] would hold the matches)
$str = "
An elephant in the garden
history of elephants
Algebra A B-movies";
$pattern = '~\b(?:
[\w\'-]{1,3}\W+[\w\'-]{4,}|
[\w\'-]{4,}\W+[\w\'-]{1,3}|
[\w\'-]{4,}
)\b~x';
if(preg_match_all($pattern, $str, $out)) {
print_r($out[0]);
}
outputs to:
Array
(
[0] => An elephant
[1] => the garden
[2] => history of
[3] => elephants
[4] => Algebra A
[5] => B-movies
)
Test at eval.in (link expires soon)
There are some complications with what you're trying to do, because it gives rise to ambiguities. Is History of elephants [History of] [elephants] or [History] [of elephants]? You're probably better off just excluding a set of specific stop words, or words that meet some criteria.
If you want to exclude words of 3 or fewer characters, you might try the following.
You say you're already splitting the keywords at spaces, so you should have an array of words. You can array_filter that array based on word length (> 3 chars), and you will have the list of words you want to use.
$words = array('no', 'na', 'sure', 'definitely');
function length_filter($word) {
return mb_strlen($word) > 3;
};
$longer_than_3 = array_filter($words, 'length_filter');
print_r($longer_than_3);
// Array
// (
// [2] => sure
// [3] => definitely
// )
I'm not good with regular expressions, but today I ran into a situation I can't avoid: I need a regular expression that matches the case below:
|hhas.jpg||sd22-9393das.png||8jjas.png||IMG00338-20110109.jpg|
I tried this regex: /(?<=\|)(\w|\d+\.\w+)(?=\|)/i but I'm not getting the desired results.
I want to match all the strings between two | signs using PHP's preg_match function, i.e. hhas.jpg, sd22-9393das.png, etc.
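You can use the following regex (in PHP, run it with preg_match_all; the g flag used by the online demo is not a PCRE modifier there):
/\|([^|]*)\|/i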
Demo
Matched strings:
1. hhas.jpg
2. sd22-9393das.png
3. 8jjas.png
4. IMG00338-20110109.jpg
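In PHP the pattern can be run like this; a minimal sketch, with $str and $matches as illustrative variable names:

```php
<?php
$str = '|hhas.jpg||sd22-9393das.png||8jjas.png||IMG00338-20110109.jpg|';

// [^|]* matches everything up to the next pipe, so each match is one
// |name| pair and group 1 is the name without the pipes.
preg_match_all('/\|([^|]*)\|/', $str, $matches);

print_r($matches[1]);
```

Each match consumes its closing pipe, and the next match starts at the second pipe of the || pair, so no empty entries appear for this input.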
Use this..
preg_match_all('/\|(.*?)\||/', $str, $matches);
print_r(array_filter($matches[1]));
OUTPUT :
Array
(
[0] => hhas.jpg
[1] => sd22-9393das.png
[2] => 8jjas.png
[3] => IMG00338-20110109.jpg
)
Demonstration
Your expression:
/(?<=\|)(\w|\d+\.\w+)(?=\|)/i
is pretty well written, but it has a few minor flaws:
\w on its own matches only one character.
The alternative \d+\.\w+ matches only in that exact order: a run of digits, then a ., then letters, digits, or underscores.
Better to change your regex to:
/(?<=\|)(.*?)(?=\|)/i
This gives you anything that sits between the | characters. (There is no g flag in PHP; use preg_match_all to get every match.)
Also, IMHO, using lookarounds for such a problem is overkill. Better to use:
/\|(.*?)\|/i
Try it without a regular expression:
explode('||', trim('|hhas.jpg||sd22-9393das.png||8jjas.png||IMG00338-20110109.jpg|', '|'));
Output:
Array ( [0] => hhas.jpg [1] => sd22-9393das.png [2] => 8jjas.png [3] => IMG00338-20110109.jpg )
This regex finds the right string, but only returns the first result. How do I make it search the rest of the text?
$text =",415.2109,520.33970,495.274100,482.3238,741.5634
655.3444,488.29980,741.5634";
preg_match("/[^,]+[\d+][.?][\d+]*/",$text,$data);
echo $data;
Follow up:
I'm pushing the initial expectations of this script, and I'm at the point where I'm pulling out more verbose data. Wasted many hours with this...can anyone shed some light?
Here's my string:
155.101.153.123:simple:mass_mid:[479.0807,99.011, 100.876],mass_tol:[30],mass_mode: [1],adducts:[M+CH3OH+H],
130.216.138.250:simple:mass_mid:[290.13465,222.34566],mass_tol:[30],mass_mode:[1],adducts:[M+Na],
And here's my regex:
"/mass_mid:[((?:\d+)(?:.)(?:\d+)(?:,)*)/"
I'm really banging my head on this one! Can someone tell me how to exclude the mass_mid:[ prefix from the results and keep the comma-separated values?
Use preg_match_all rather than preg_match
From the PHP Manual:
(`preg_match_all`) searches subject for all matches to the regular expression given in pattern and puts them in matches in the order specified by flags.
After the first match is found, the subsequent searches are continued on from end of the last match.
http://php.net/manual/en/function.preg-match-all.php
Don't use a regex. Use explode (or preg_split) to split your input on the commas; the old split function was deprecated in PHP 5.3 and removed in PHP 7.
Regexes are not a magic wand you wave at every problem that happens to involve strings.
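A sketch of that non-regex route, assuming the line break in the input should be treated like a comma; the $parts and $values names are illustrative:

```php
<?php
$text = ",415.2109,520.33970,495.274100,482.3238,741.5634\n655.3444,488.29980,741.5634";

// Turn line breaks into commas so a single explode covers both separators.
$parts = explode(',', str_replace(["\r\n", "\n"], ',', $text));

// Trim stray whitespace, drop empty pieces (the leading comma produces one),
// and reindex the result from 0.
$values = array_values(array_filter(array_map('trim', $parts), 'strlen'));

print_r($values);
```

The values stay strings here; cast them with (float) if numbers are needed, keeping in mind that this drops trailing zeros such as in 520.33970.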
Description
To extract a list of numeric values that may each include a single decimal point, you could use this regex:
\d*\.?\d+
PHP Code Example:
<?php
$sourcestring=",415.2109,520.33970,495.274100,482.3238,741.5634
655.3444,488.29980,741.5634";
preg_match_all('/\d*\.?\d+/im',$sourcestring,$matches);
echo "<pre>".print_r($matches,true);
?>
yields matches
$matches Array:
(
[0] => Array
(
[0] => 415.2109
[1] => 520.33970
[2] => 495.274100
[3] => 482.3238
[4] => 741.5634
[5] => 655.3444
[6] => 488.29980
[7] => 741.5634
)
)
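For the follow-up about the mass_mid:[...] lists, one possible approach (a sketch, not a tested solution for every input) is to escape the square brackets, capture only what sits between them, and then split the captured text on commas. The unescaped [ in the original pattern starts a character class, which is likely why it misbehaves. The $lines and $result names are illustrative.

```php
<?php
$lines = [
    '155.101.153.123:simple:mass_mid:[479.0807,99.011, 100.876],mass_tol:[30],mass_mode: [1],adducts:[M+CH3OH+H],',
    '130.216.138.250:simple:mass_mid:[290.13465,222.34566],mass_tol:[30],mass_mode:[1],adducts:[M+Na],',
];

$result = [];
foreach ($lines as $line) {
    // \[ and \] match literal brackets; the capture group holds only the
    // text between them, so the "mass_mid:[" label is left out.
    if (preg_match('/mass_mid:\[([^\]]*)\]/', $line, $m)) {
        // Split the captured list on commas and trim the stray spaces.
        $result[] = array_map('trim', explode(',', $m[1]));
    }
}

print_r($result);
```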