Regular Expression separation

I'm trying to write a regular expression that selects only the first of two strings separated by a colon.
For example:
hello:howareyou
I want the regex to return only hello.
Similarly, I would want another one to return howareyou, but I should be able to figure that out once I understand the first part.
Thank you!
EDIT:
So far I have tried (?:[^"<:]|"[^"]*"|<[^>]*)* but that merely splits the two.

You could simply use explode(':', $str), but if you insist on using a regular expression, you can do that as well with preg_match('/(.+?):(.+)/', $str, $matches) which will return the first part in $matches[1] and the second part in $matches[2].
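A minimal sketch comparing both approaches, assuming the input contains a single colon separator (the variable names $parts, $first and $second are illustrative, not from the question):

```php
<?php
$str = 'hello:howareyou';

// Non-regex approach: split on the colon into at most two parts.
$parts = explode(':', $str, 2);

// Regex approach: the lazy first group stops at the first colon.
$first = $second = null;
if (preg_match('/(.+?):(.+)/', $str, $matches)) {
    $first  = $matches[1];
    $second = $matches[2];
}

echo $parts[0], ' ', $parts[1], "\n"; // hello howareyou
echo $first, ' ', $second, "\n";      // hello howareyou
```

Checking the return value of preg_match avoids reading $matches when the input has no colon at all.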

Related

preg_replace with Regex - find number-sequence in URL

I'm a regex newbie, so sorry for this "simple" question:
I've got a URL like the following:
http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-146370543.aspx
What I'm trying to achieve is getting the number sequence (aka Job-ID) right before the ".aspx" with preg_replace.
I've already figured out that the regex for finding it could be
(?!.*-).*(?=\.)
Now preg_replace needs the opposite of that regular expression. How can I achieve that? Also worth mentioning:
The URL can have multiple numbers in it. I only need the sequence right before ".aspx". Also, there could be some php attributes behind the ".aspx" like "&mobile=true"
Thank you for your answers!

You can use:
$re = '/[^-.]+(?=\.aspx)/i';
preg_match($re, $input, $matches);
//=> 146370543
This matches a run of characters that are neither hyphens nor dots and that is immediately followed by .aspx, which is checked with the lookahead (?=\.aspx).
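As a rough check, the lookahead pattern can be run against the URL from the question both with and without trailing parameters. The "&mobile=true" suffix is taken from the question; the $urls and $ids names are illustrative:

```php
<?php
$re = '/[^-.]+(?=\.aspx)/i';

$base = 'http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-'
      . 'Mainz-Rheinland-Pfalz-Deutschland-146370543.aspx';

$urls = [$base, $base . '&mobile=true'];

$ids = [];
foreach ($urls as $url) {
    // The lookahead only requires ".aspx" right after the match,
    // so anything following ".aspx" does not affect the result.
    if (preg_match($re, $url, $matches)) {
        $ids[] = $matches[0];
    }
}

print_r($ids); // both entries are 146370543
```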

You can just use preg_match (you don't need preg_replace, since you don't want to change the original string) and capture the number before .aspx. Assuming .aspx is always at the end of the string, the simplest way I could think of is shown below. If parameters can follow .aspx, drop the $ anchor from the pattern.
<?php
$string = "http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-146370543.aspx";
$regex = '/([0-9]+)\.aspx$/';
preg_match($regex, $string, $results);
print $results[1];
?>
A short explanation:
$results contains an array of matches. The first element holds the text matched by the complete regex, which is 146370543.aspx in this example. The second element holds the group captured by the parentheses around [0-9]+.

You can get the opposite by using this regex:
(\D*)\d+(.*)
Demo output:
MATCH 1
1. [0-100] `http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-`
2. [109-114] `.aspx`
And if you just want the number from that URL, you can use this regex:
(\d+)
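Since the question asks for preg_replace specifically, one possible sketch is to match the entire URL and replace it with the captured number. The pattern below is an assumption for illustration and is not taken from the answers above:

```php
<?php
$url = 'http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-'
     . 'Mainz-Rheinland-Pfalz-Deutschland-146370543.aspx&mobile=true';

// Match the whole string, capture the digits between a hyphen and ".aspx",
// then replace everything with that captured group.
$id = preg_replace('/^.*-(\d+)\.aspx.*$/', '$1', $url);

echo $id, "\n"; // 146370543
```

If the pattern does not match, preg_replace returns the input unchanged, so it is worth validating the result (for example with ctype_digit) before using it as an ID.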

very complex RegEx

so there's a string,
<?php
$string = <<<STR
/\!##$%^&*()?.,djasijdiwqpk,=-c./zcxzo123154897kp02ldz.,world90iops02&&&8ks
STR;
I want to replace everything with nothing, except the word "world" and the numbers 1 and 3.
I just want to get "world13" or "world31" from that string USING regular expressions.
I have already implemented a basic solution
via strpos() and substr(), and this works as expected. But I need to do this via RegExp.
The question is:
Is it possible to extract that word using RegEx?

~(world(?:31|13))~i. The 'i' makes the regex case insensitive. The ?: makes the group non-capturing, so it doesn't appear as a separate entry in the matches array. Note that this only matches when 31 or 13 directly follows "world". Wouldn't say it's very complex, by the way :) If you want every 1 and 3 in there, you can use ~(world|1|3)~i.

Is it possible to extract that word using RegEx?
Yes. You can use this regular expression:
(world)
I know that, but I can't extract world13 or world31
Ah, I understand! You can use:
$string = preg_replace('/^.*$/s', 'world13', $string);

A simple solution is to find things you need and then join them to a string.
preg_match_all('/world|[13]/', $string, $matches);
$ret = join($matches[0]);
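A small sketch of this find-and-join idea, run against the string from the question. Matches come back in the order they appear in the input, so a plain join does not start with "world"; the second step, which rebuilds the result in a fixed order, is an illustrative addition and not part of the answer above:

```php
<?php
$string = '/\!##$%^&*()?.,djasijdiwqpk,=-c./zcxzo123154897kp02ldz.,world90iops02&&&8ks';

// Collect every "world", "1" and "3" in order of appearance.
preg_match_all('/world|[13]/', $string, $matches);
$inOrder = implode('', $matches[0]);

// Rebuild the wanted string in a fixed order from the tokens that were found.
$result = '';
foreach (['world', '1', '3'] as $token) {
    if (in_array($token, $matches[0], true)) {
        $result .= $token;
    }
}

echo $inOrder, "\n"; // 131world
echo $result, "\n";  // world13
```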

preg_match returning weird results

I am searching a string for urls...and my preg_match is giving me an incorrect amount of matches for my demo string.
String:
Hey there, come check out my site at www.example.com
Function:
preg_match("#(^|[\n ])([\w]+?://[\w]+[^ \"\n\r\t<]*)#ise", $string, $links);
echo count($links);
The result comes out as 3.
Can anybody help me solve this? I'm new to REGEX.

$links is the array of sub matches:
If matches is provided, then it is filled with the results of search. $matches[0] will contain the text that matched the full pattern, $matches[1] will have the text that matched the first captured parenthesized subpattern, and so on.
The matches of the two groups plus the match of the full regular expression results in three array items.
You may instead want all matches, which you can get with preg_match_all.
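To illustrate the difference, here is a sketch using the pattern from the question with the e modifier dropped (it only ever applied to preg_replace and was removed in PHP 7). The sample strings and variable names are assumptions made for the example:

```php
<?php
$pattern = '#(^|[\n ])(\w+?://\w+[^ "\n\r\t<]*)#is';

// preg_match: one match, but $links has one entry per group plus the full match.
$one = 'Hey there, come check out my site at http://www.example.com';
preg_match($pattern, $one, $links);
$countSingle = count($links); // full match + 2 groups

// preg_match_all: the return value is the number of full matches.
$two = 'See http://a.example.com and https://b.example.org now';
$found = preg_match_all($pattern, $two, $all);
$urls = $all[2]; // second group of every match, i.e. the URLs

echo $countSingle, "\n"; // 3
echo $found, "\n";       // 2
print_r($urls);
```

So count($links) reports how many groups the pattern has, not how many URLs were found.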

If you use preg_match_all (as Gumbo suggested), please note that if you run your regex against this string, it will match both the value of the anchor's "href" attribute and the link text, which in this case happens to contain a URL. This makes TWO matches.
It would be wise to run array_unique on your result set :)

In addition to the advice on how to use preg_match, I believe there is something seriously wrong with the regular expression you are using. You may want to try something like this instead:
preg_match("_([a-zA-Z]+://)?([0-9a-zA-Z$-\_.+!*'(),]+\.)?([0-9a-zA-Z]+)+\.([a-zA-Z]+)_", $string, $links);
This should handle most cases (although it wouldn't work if there was a query string after the top-level domain). In the future, when writing regular expressions, I recommend the following web-sites to help: http://www.regular-expressions.info/ and especially http://regexpal.com/ for testing them as you're writing them.

Splitting a string containing <if> <else> with regexps

I'm very poor with regexps but this should be very simple for someone who knows regexps.
Basically I will have a string like this:
<if>abc <else>xyz
I would like a regexp so if the string contains <if> <else>, it splits the string into two parts and returns the two strings after <if> and <else>. In the above example it might return an array with the first element being abc, second xyz. I'm open to approaches not using regexps too.
Any thoughts?

// $subject is your variable containing the string you want to run the regex on
$result = preg_match('/<if>(.*)<else>(.*)/i', $subject, $matches);
// $matches[0] has the full matched text
// $matches[1] has if part
// $matches[2] has else part
The /i at the end makes the search case insensitive, so it accepts both <if> <else> and <IF> <elSE>.
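A quick sketch of the effect of the i modifier, using a mixed-case variant of the string from the question. The trim() calls are an addition to drop the space before <else>:

```php
<?php
$subject = '<IF>abc <elSE>xyz';

// Without /i the mixed-case tags do not match.
$sensitive = preg_match('/<if>(.*)<else>(.*)/', $subject);

// With /i the same pattern matches regardless of case.
$insensitive = preg_match('/<if>(.*)<else>(.*)/i', $subject, $matches);

$ifPart   = trim($matches[1]);
$elsePart = trim($matches[2]);

echo $sensitive, ' ', $insensitive, "\n"; // 0 1
echo $ifPart, ' ', $elsePart, "\n";       // abc xyz
```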

Regular Expressions will work for this in simple cases. For more complicated cases (where nesting occurs), you may find it easier to parse the string and use a simple stack to grab the data you need.

The regular expression you are looking for is:
/<if>(.*)<else>(.*)/
You'd have a problem if there are multiple <else>'s, though. It can be edited depending on what you want the result to be.
Also note that this regular expression would work even if there is something before <if>.
In PHP, you'd want code like the following:
<?php
$string = '<if>abc <else>xyz';
// $matches[1] holds the text after <if>, $matches[2] the text after <else>
if (preg_match("/<if>(.*)<else>(.*)/", $string, $matches)) {
    echo trim($matches[1]), "\n";
    echo trim($matches[2]), "\n";
}
?>

Simple RegEx PHP

Since I am completely useless at regex and this has been bugging me for the past half an hour, I think I'll post this up here as it's probably quite simple.
hey.exe
hey2.dll
pomp.jpg
In PHP I need to extract what's between the <a> tags example:
hey.exe
hey2.dll
pomp.jpg

Avoid using '.*' even if you make it ungreedy, until you have some more practice with RegEx. I think a good solution for you would be:
'/<a[^>]+>([^<]+)<\/a>/i'
Note the '/' delimiters - you must use the preg suite of regex functions in PHP. It would look like this:
preg_match_all($pattern, $string, $matches);
// matches get stored in '$matches' variable as an array
// matches in between the <a></a> tags will be in $matches[1]
print_r($matches);

This appears to work:
$pattern = '/<a.*?>(.*?)<\/a>/';

([^<]*)

I found this regular expression tester to be helpful.

Here is a very simple one:
<a.*>(.*)</a>
However, you should be careful if you have several matches in the same line, e.g.
hey.exehey2.dll
In this case, the correct regex would be:
<a.*?>(.*?)</a>
Note the '?' after the '*' quantifier. By default, quantifiers are greedy, which means they consume as many characters as they can (meaning the first regex would return only "hey2.dll" in this example). By appending a question mark, you make them ungreedy, which should better fit your needs.
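The difference can be seen in a short sketch. The markup in $html is an assumed example with two links on one line, and ~ is used as the delimiter so the slash in </a> needs no escaping:

```php
<?php
$html = '<a href="a">hey.exe</a><a href="b">hey2.dll</a>';

// Greedy: the first .* runs to the last possible ">", so only the
// text of the final link ends up in the capture group.
preg_match('~<a.*>(.*)</a>~', $html, $greedy);

// Ungreedy: each quantifier stops as early as possible, so every
// link is matched on its own.
preg_match_all('~<a.*?>(.*?)</a>~', $html, $lazy);

echo $greedy[1], "\n"; // hey2.dll
print_r($lazy[1]);     // hey.exe, hey2.dll
```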
