PHP Regex: Select all except last occurrence

I'm trying to replace all \n's sans that final one with \n\t in order to nicely indent for a recursive function.
This
that
then
thar
these
them
should become:
This
    that
    then
    thar
    these
them
This is what I have: preg_replace('/\n(.+?)\n/','\n\t$1\n',$var);
It currently spits this out:
This
that
then
thar
these
them
Quick Overview:
Need to indent every line less the first and last line using regex, how can I accomplish this?

You can use a lookahead:
$var = preg_replace('/\n(?=.*?\n)/', "\n\t", $var);
See it working here: ideone
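For reference, here is a minimal sketch of that lookahead run against the sample text from the question. It assumes the input does not end with a trailing newline; if it did, that trailing newline would count as the "last" one and "them" would be indented as well.

```php
<?php
// Sample input from the question (no trailing newline assumed).
$var = "This\nthat\nthen\nthar\nthese\nthem";

// Replace a newline only when the line after it is itself followed by a
// newline, so the final newline in the string is left untouched.
$indented = preg_replace('/\n(?=.*?\n)/', "\n\t", $var);

echo $indented, "\n";
```

Because "." does not match a newline by default, the lookahead only inspects the single line that follows each newline.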

After fixing a quotes issue, your output is actually like this:
This
    that
then
    thar
these
them
Use a positive lookahead to stop that trailing \n from getting eaten by the search regex. Your "cursor" was already set beyond it so only every other line was being rewritten; your match "zones" overlapped.
echo preg_replace('/\n(.+?)(?=\n)/', "\n\t$1", $input);
// \n = newline, (.+?) = line text, (?=\n) = lookahead, "\n\t$1" = replacement
Live demo.
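To make the overlap problem concrete, the sketch below runs both patterns side by side on the question's sample text. The variable names are only for illustration.

```php
<?php
$input = "This\nthat\nthen\nthar\nthese\nthem";

// Consuming version: each match swallows the trailing "\n", so the next
// search starts after it and only every other line gets indented.
$consuming = preg_replace('/\n(.+?)\n/', "\n\t$1\n", $input);

// Lookahead version: the trailing "\n" is only checked, not consumed,
// so it is still available as the start of the next match.
$lookahead = preg_replace('/\n(.+?)(?=\n)/', "\n\t$1", $input);

echo $consuming, "\n---\n", $lookahead, "\n";
```

The first result indents only "that" and "thar"; the second indents every line between the first and the last.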

preg_replace('/\n(.+?)(?=\n)/',"\n\t$1",$var);
Modified the second \n to be the lookahead (?=\n), otherwise you'd run into issues with regex not recognizing overlapping matches.
http://ideone.com/1JHGY

Let the downvoting begin, but why use regex for this?
<?php
$e = explode("\n",$oldstr);
$str = $e[count($e) - 1];
unset($e[count($e) - 1]);
$str = implode("\n\t",$e)."\n".$str;
echo $str;
?>
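Actually, str_replace has a "count" parameter, but it turns out to be output-only: it reports how many replacements were made and cannot limit them, which is why I couldn't get it to work with php 5.3.0 (I had even found a bug report). preg_replace does accept a limit, so this should work:
<?php
// Replace every newline except the last one.
$limit = substr_count($oldstr, "\n") - 1;
$newstr = $limit > 0 ? preg_replace('/\n/', "\n\t", $oldstr, $limit) : $oldstr;
?>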

Related

How to cut out everything from a string except certain part of it in php?

Let's say I have string like this:
Village_name(315|431 K64)
What I want to do is when I paste that into let's say text box, and click a button, all I will be left with is 315|431.
Is there a way of doing this?
Use the below regex and then replace the match with \1.
(\d+\|\d+)|.
It captures the number|number part and matches all the remaining chars one at a time. Replacing every match with \1 leaves only the number|number part.
DEMO
In php, you may use this also.
(?:\d+\|\d+)(*SKIP)(*F)|.
The substring matched by \d+\|\d+ is matched first, and the following (*SKIP)(*F) makes that match fail and skips past it. Now the . after the pipe symbol matches all the chars except number|number, because we already skipped that part.
DEMO
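Both patterns can be tried from PHP with preg_replace. This is only a sketch using the sample string from the question; the variable names are made up for the example.

```php
<?php
$str = 'Village_name(315|431 K64)';

// Approach 1: capture the coordinates, match every other character singly,
// and replace each match with group 1 (empty when the "." branch matched).
$viaCapture = preg_replace('/(\d+\|\d+)|./', '$1', $str);

// Approach 2: let (*SKIP)(*F) step over the coordinates, then delete
// every remaining character matched by ".".
$viaSkip = preg_replace('/\d+\|\d+(*SKIP)(*F)|./', '', $str);

echo $viaCapture, "\n", $viaSkip, "\n";
```

Both calls should leave just 315|431 for this input.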
I know this question has been answered and the answer has been accepted. But I still want to suggest this answer, as you really don't need PHP to do what you want. Just use Javascript. It's enough:
var str = 'Village_name(315|431 K64)';
var pattern = /\((\w+\|\w+) /;
var res = str.match(pattern);
document.write(res[1]);
Please try this:-
<?php
$str = 'Village_name(315|431 K64)';
preg_match_all('/(?:\d+\|\d+)/', $str, $matches);
echo "<pre>"; print_r($matches); echo "</pre>"; // print the complete match array
foreach ($matches[0] as $match) { // $matches[0] holds every full-pattern match
echo $match;
}
?>
Output:- http://prntscr.com/74ddg9
Note:- explode can work with some adjustment, but only if the format is always exactly like the one you gave. So go for preg_match_all; it's the safer choice.

preg_replace with Regex - find number-sequence in URL

I'm a regex-noobie, so sorry for this "simple" question:
I've got a URL like the following:
http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-146370543.aspx
What I'm trying to achieve is getting the number sequence (aka Job-ID) right before the ".aspx" with preg_replace.
I've already figured out that the regex for finding it could be
(?!.*-).*(?=\.)
Now preg_replace needs the opposite of that regular expression. How can I achieve that? Also worth mentioning:
The URL can have multiple numbers in it. I only need the sequence right before ".aspx". Also, there could be some php attributes behind the ".aspx" like "&mobile=true"
Thank you for your answers!
You can use:
$re = '/[^-.]+(?=\.aspx)/i';
preg_match($re, $input, $matches);
// $matches[0] => 146370543
This matches a run of characters that are neither a hyphen nor a dot and that is immediately followed by .aspx, using a lookahead (?=\.aspx).
RegEx Demo
You can just use preg_match (you don't need preg_replace, as you don't want to change the original string) and capture the number before the .aspx, which is at the end of your example URL, so the simplest way I could think of is:
<?php
$string = "http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-146370543.aspx";
$regex = '/([0-9]+)\.aspx$/';
preg_match($regex, $string, $results);
print $results[1];
?>
A short explanation:
$results contains an array of results. The first element holds the text matched by the complete regex, which would be 146370543.aspx in this example. The second element contains the group captured by the parentheses around [0-9]+.
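Since the question mentions that something like "&mobile=true" may follow the ".aspx", here is a sketch of the same capture without the "$" anchor. The second URL is a hypothetical variant built from that description, not a real address from the question.

```php
<?php
$urls = array(
    'http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-146370543.aspx',
    // Hypothetical variant with a trailing parameter, as described in the question.
    'http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-146370543.aspx&mobile=true',
);

$ids = array();
foreach ($urls as $url) {
    // Capture the digits directly before ".aspx"; without a "$" anchor,
    // any text after ".aspx" is allowed.
    if (preg_match('/(\d+)\.aspx/i', $url, $m)) {
        $ids[] = $m[1];
    }
}

print_r($ids);
```

Both URLs should give the same Job-ID, because only the digits touching ".aspx" are captured.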
You can get the opposite by using this regex:
(\D*)\d+(.*)
Working demo
MATCH 1
1. [0-100] `http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-`
2. [109-114] `.aspx`
Even if you just want the number for that url you can use this regex:
(\d+)

PHP Regex to remove everything after a character

So I've seen a couple articles that go a little too deep, so I'm not sure what to remove from the regex statements they make.
I've basically got this
foo:bar all the way to anotherfoo:bar;seg98y34g.?sdebvw h segvu (anything goes really)
I need a PHP regex to remove EVERYTHING after the colon. The first part can be any length (but it never contains a colon), so in both cases above I'd end up with
foo and anotherfoo
after doing something like this horrendous example of pseudo-code
$string = 'foo:bar';
$newstring = regex_to_remove_everything_after_":"($string);
EDIT
after posting this, would an explode() work reliably enough? Something like
$pieces = explode(':', 'foo:bar');
$newstring = $pieces[0];
explode would do what you're asking for, but you can make it one step by using current.
$beforeColon = current(explode(':', $string));
I would not use a regex here (that involves some work behind the scenes for a relatively simple action), nor would I use strpos with substr (as that would, effectively, be traversing the string twice). Most importantly, this provides the person who reads the code with an immediate, "Ah, yes, that is what the author is trying to do!" instead of, "Wait, what is happening again?"
The only exception to that is if you happen to know that the string is excessively long: I would not explode a 1 Gb file. Instead:
$beforeColon = substr($string, 0, strpos($string,':'));
I also feel substr isn't quite as easy to read: in current(explode you can see the delimiter immediately with no extra function calls, and there is only one occurrence of the variable (which makes it less prone to human errors). Basically I read current(explode as "I am taking the first piece of anything prior to this string" as opposed to substr, which is "I am getting a substring starting at the 0 position and continuing until this string."
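One difference worth knowing: strpos returns false when the string has no colon, so the substr version would come back empty where current(explode would return the whole string. A minimal sketch of a guarded version follows; beforeColon is a hypothetical helper name, not a built-in.

```php
<?php
// Hypothetical helper: returns the part of $string before the first colon,
// or the whole string when it contains no colon.
function beforeColon($string)
{
    $pos = strpos($string, ':');
    // strpos returns false (not 0) when the needle is missing,
    // so compare with === before using the position.
    return $pos === false ? $string : substr($string, 0, $pos);
}

echo beforeColon('foo:bar'), "\n";
echo beforeColon('anotherfoo:bar;seg98y34g.?sdebvw h segvu'), "\n";
echo beforeColon('nocolon'), "\n";
```

With the guard in place, the helper behaves like the explode version for all three inputs.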
Your explode solution does the trick. If you really want to use regexes for some reason, you could simply do this:
$newstring = preg_replace("/(.*?):(.*)/", "$1", $string);
A bit more succinct than other examples:
current(explode(':', $string));
You can use RegEx that m.buettner wrote, but his example returns everything BEFORE ':', if you want everything after ':' just use $2 instead of $1:
$newstring = preg_replace("/(.*?):(.*)/", "$2", $string);
You could use something like the following. demo: http://codepad.org/bUXKN4el
<?php
$s = 'anotherfoo:bar;seg98y34g.?sdebvw h segvu';
// array_shift takes its argument by reference, so give it a real variable
$parts = explode(':', $s);
$result = array_shift($parts);
echo $result;
?>
Why do you want to use a regex?
list($beforeColon) = explode(':', $string);

How to use php preg_split with an html string

I am trying to parse a badly formed html table:
A couple of lines of this are:
Food:</b> Yes<b><br>
Pool: </b>Beach<b></b><b><br>
Centre:</b> Yes<b><br>
After spending a lot of time on this with Xpath, I think it is probably better to split the above text into lines using preg_split and parse from there.
The pattern I think would work uses:
<\b><\br>*: <\b>
my code is as follows:
$pattern='</b></br>*:</b>';
$pattern=preg_quote($pattern,'#');
$chars = preg_split($pattern, $output);
print_r($chars);
I am getting the following error:
Delimiter must not be alphanumeric or backslash
What am I doing wrong?
Try this:
$pattern='</b></br>*:</b>';
$pattern=preg_quote($pattern,'#');
$chars = preg_split('#'.$pattern.'#', $output);
print_r($chars);
The preg_quote function just makes it safely escaped, it doesn't actually add the delimiters for you.
As other people will surely point out, using regular expressions is not a good way to parse HTML :)
Your regular expression is also not going to match what you hope. Here's a version that will probably work for your input:
$in = " Pool: </b>Beach<b></b><b><br>";
$out = explode(':', strip_tags($in));
$key = trim($out[0]);
$value = trim($out[1]);
echo "$key = $value\n";
This removes all the HTML, then splits on the colon, and then removes any surrounding whitespace.
Your pattern needs to start and end with a delimiter; looks like you're using # if I'm reading this correctly, so you should have $pattern = '#</b></br>.*:</b>#';.
Also, you're mixing things up; * is not a simple wildcard in regex. If you mean "any number of any characters," the pattern you need is .*. I've included this above.
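Putting the pieces together, one possible sketch is to split on the <br> tags with a properly delimited pattern and then clean up each line with strip_tags. The sample input is the three lines from the question; the $pairs array is just an illustrative way to collect the result.

```php
<?php
// The three sample lines from the question, joined with newlines.
$output = "Food:</b> Yes<b><br>\nPool: </b>Beach<b></b><b><br>\nCentre:</b> Yes<b><br>";

$pairs = array();
// "#" is the pattern delimiter; split on <br> (or <br/>) plus surrounding whitespace.
foreach (preg_split('#\s*<br\s*/?>\s*#i', $output, -1, PREG_SPLIT_NO_EMPTY) as $line) {
    // Drop the remaining tags, then split "key: value" on the first colon only.
    $parts = explode(':', strip_tags($line), 2);
    if (count($parts) === 2) {
        $pairs[trim($parts[0])] = trim($parts[1]);
    }
}

print_r($pairs);
```

This still treats the HTML as plain text, so it only holds up as long as every row keeps the "label: value<br>" shape.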

Replace array values in string with regex?

Say I have the following array and string:
$array = array('$AA', '$AB', '$AC', '$ZZ');
$string = 'String mentioning $AA and $AB and $CZ and $MARTASS';
I want to check $string for matches against $array. Every word in $string that begins with "$" should be checked. In the example, a match is found for $AA and $AB; not for $CZ. The desired output would be:
String mentioning {MATCH} and {MATCH} and {NO-MATCH}
Is this possible with one regex or is it better to write several lines of PHP? Any input is kindly received :)
Should be possible with two find-and-replaces, done in this order:
first:
\b(($AA)|($AB)|($AC)|($ZZ))\b ---> {MATCH}
second:
\b$\w+\b ---> {NO-MATCH}
I'm not sure this is in PHP syntax, but it shouldn't be too hard to get there. \b is a word separator boundary, which I believe is allowed in PHP.
Edit: You might need to escape $, not sure as it's grouped.
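In PHP the two passes might look like the sketch below. It differs from the patterns above in two assumed details: the $ is escaped with preg_quote, and the leading \b is dropped, because \b does not match between a space and $ (both are non-word characters). A (?!\w) lookahead stands in for the trailing \b.

```php
<?php
$array  = array('$AA', '$AB', '$AC', '$ZZ');
$string = 'String mentioning $AA and $AB and $CZ and $MARTASS';

// Escape each entry so "$" is treated literally, then join into alternatives.
$alternatives = implode('|', array_map(function ($v) {
    return preg_quote($v, '/');
}, $array));

// Pass 1: known names. (?!\w) stops $AA from matching inside a longer name.
$step1 = preg_replace('/(?:' . $alternatives . ')(?!\w)/', '{MATCH}', $string);

// Pass 2: whatever $name is still left did not match.
$step2 = preg_replace('/\$\w+/', '{NO-MATCH}', $step1);

echo $step2, "\n";
```

The order matters: running the second pass first would turn every $name into {NO-MATCH} before the known ones could be recognised.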
Yes it is possible. Have a look at the examples in the preg_replace_callback() documentation. You would use a replace call of the form:
function substituteVar($matches) {
...
}
...
$newString = preg_replace_callback("/\\$(\w+)/", 'substituteVar', $string);
I think I'll leave the content of the substituteVar() as an "exercise for the reader". :-)
This should work...
<?php
$string = 'String mentioning $AA and $AB and $CZ and $MARTASS';
$known = array('$AA', '$AB', '$AC', '$ZZ');
// create_function was removed in PHP 8, so use an anonymous function instead
echo preg_replace_callback('/\$\S+/',
function ($m) use ($known) { return in_array($m[0], $known, true) ? '{MATCH}' : '{NO-MATCH}'; },
$string
);
?>
The regex matches $ followed by one or more non-space characters (\S+), and the callback then checks whether the matched string is in the array, which is brought into scope with use.
I wouldn't bother using a regex here. A simple scan of the string from start to finish, looking for the '$' character and then performing a binary search on the array, would be much simpler and faster.
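A rough sketch of a regex-free version is below. Note that it swaps in a hash lookup (array_flip plus isset) for the binary search, and it assumes words are separated by single spaces.

```php
<?php
$array  = array('$AA', '$AB', '$AC', '$ZZ');
$string = 'String mentioning $AA and $AB and $CZ and $MARTASS';

// Turn the values into keys so each lookup is a simple isset() check.
$known = array_flip($array);

// Assumption: words are separated by single spaces.
$words = explode(' ', $string);
foreach ($words as $i => $word) {
    // Only words that start with "$" are candidates.
    if ($word !== '' && $word[0] === '$') {
        $words[$i] = isset($known[$word]) ? '{MATCH}' : '{NO-MATCH}';
    }
}

$result = implode(' ', $words);
echo $result, "\n";
```

Punctuation attached to a word (for example "$AA,") would need extra handling that this sketch leaves out.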
