PHP Regex to remove everything after a character - php

So I've seen a couple articles that go a little too deep, so I'm not sure what to remove from the regex statements they make.
I've basically got this
foo:bar all the way to anotherfoo:bar;seg98y34g.?sdebvw h segvu (anything goes really)
I need a PHP regex to remove EVERYTHING after the colon. The first part can be any length (but it never contains a colon), so in both cases above I'd end up with
foo and anotherfoo
after doing something like this horrendous example of pseudo-code
$string = 'foo:bar';
$newstring = regex_to_remove_everything_after_":"($string);
EDIT
after posting this, would an explode() work reliably enough? Something like
$pieces = explode(':', 'foo:bar');
$newstring = $pieces[0];

explode would do what you're asking for, but you can make it one step by using current.
$beforeColon = current(explode(':', $string));
I would not use a regex here (that involves some work behind the scenes for a relatively simple action), nor would I use strpos with substr (as that would, effectively, be traversing the string twice). Most importantly, this provides the person who reads the code with an immediate, "Ah, yes, that is what the author is trying to do!" instead of, "Wait, what is happening again?"
The only exception to that is if you happen to know that the string is excessively long: I would not explode a 1 Gb file. Instead:
$beforeColon = substr($string, 0, strpos($string,':'));
I also feel substr isn't quite as easy to read: in current(explode you can see the delimiter immediately with no extra function calls, and there is only one occurrence of the variable (which makes it less prone to human error). Basically I read current(explode as "I am taking the first occurrence of anything prior to this string" as opposed to substr, which is "I am getting a substring starting at position 0 and continuing until this string."
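One difference between the two approaches is what happens when the string has no colon at all: explode() then returns the whole string as its only element, while the unguarded substr/strpos one-liner yields an empty string because strpos() returns false. A minimal sketch of a guarded version follows; beforeColon() is a hypothetical helper name, not something from the answers above.

```php
<?php
// beforeColon() is a hypothetical helper name used only for this sketch.
function beforeColon(string $s): string
{
    $pos = strpos($s, ':');
    // strpos() returns false when the delimiter is absent, so compare with ===.
    return $pos === false ? $s : substr($s, 0, $pos);
}

$viaExplode = current(explode(':', 'anotherfoo:bar;seg98y34g'));
$viaSubstr  = beforeColon('anotherfoo:bar;seg98y34g');
$noColon    = beforeColon('nocolonhere');

var_dump($viaExplode, $viaSubstr, $noColon);
```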

Your explode solution does the trick. If you really want to use regexes for some reason, you could simply do this:
$newstring = preg_replace("/(.*?):(.*)/", "$1", $string);
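If a regex is wanted anyway, the capture groups are not strictly needed. As a sketch, assuming the goal is only to drop the first colon and everything after it, the match can simply be replaced with an empty string:

```php
<?php
$string = 'anotherfoo:bar;seg98y34g.?sdebvw h segvu';

// ':' followed by everything else; the s modifier lets '.' match newlines too.
$newstring = preg_replace('/:.*/s', '', $string);

echo $newstring, "\n";
```

A string without a colon is returned unchanged, because the pattern never matches.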

A bit more succinct than other examples:
current(explode(':', $string));

You can use RegEx that m.buettner wrote, but his example returns everything BEFORE ':', if you want everything after ':' just use $2 instead of $1:
$newstring = preg_replace("/(.*?):(.*)/", "$2", $string);

You could use something like the following. demo: http://codepad.org/bUXKN4el
<?php
$s = 'anotherfoo:bar;seg98y34g.?sdebvw h segvu';
// array_shift() takes its argument by reference, so it needs a variable.
$parts = explode(':', $s);
$result = array_shift($parts);
echo $result;
?>

Why do you want to use a regex?
list($beforeColon) = explode(':', $string);

Related

php regex replace double backslash with single

I do not want to use stripslashes() because I only want to replace "\\" with "\".
I tried preg_replace("/\\\\/", "\\", '2\\sin(\\pi s)\\Gamma(s)\\zeta(s) = i\\oint_C \\frac{-x^{s-1}}{e^x -1} \\mathrm{d}x');
Which, to my disappointment, returns: 2\\sin(\\pi s)\\Gamma(s)\\zeta(s) = i\\oint_C \\frac{-x^{s-1}}{e^x -1} \\mathrm{d}x
Various online regex testers indicate that the above should work. Why is it not?
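First, as many other people have stated, regular expressions may be too heavy a tool for this job. The pattern you are using cannot work as written, though: each backslash has to be escaped once for the PHP string literal and once more for the regex engine, so "/\\\\/" matches a single backslash. Matching two literal backslashes takes eight of them in the source:
$newstr = preg_replace('/\\\\\\\\/', '\\\\', $mystr);
Note also that preg_replace returns a new string and does not modify the existing one in place, which may be what you are getting hung up on.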
You can also use the less expensive str_replace in this case:
$newstr = str_replace('\\\\', '\\', $mystr);
This approach costs much less CPU time and memory since it doesn't need to compile a regular expression for a simple task such as this.
You don't need to use regex for this, use
$newstr = str_replace("\\\\", "\\", $mystr);
See the str_replace docs
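The escaping layers are easy to miscount, so here is a small sketch that checks them on a short sample value (the sample string is an assumption made for illustration). In a single-quoted PHP literal, each \\ pair produces one backslash; the regex engine then needs its own \\ for every literal backslash it should match.

```php
<?php
// Single-quoted literal: four characters a, \, \, b.
$mystr = 'a\\\\b';

// Plain string replacement: two backslashes become one.
$viaStr = str_replace('\\\\', '\\', $mystr);

// Regex version: eight backslashes in the source give the PCRE pattern \\\\,
// which matches two literal backslashes.
$viaRegex = preg_replace('/\\\\\\\\/', '\\\\', $mystr);

var_dump(strlen($mystr), $viaStr, $viaRegex);
```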

Regex for PHP seems simple but is killing me

I'm trying to make a replace in a string with a regex, and I really hope the community can help me.
I have this string :
031,02a,009,a,aaa,AZ,AZE,02B,975,135
And my goal is to remove the opposite of this regex
[09][0-9]{2}|[09][0-9][A-Za-z]
i.e.
a,aaa,AZ,AZE,135
(to see it in action : http://regexr.com?3795f )
My final goal is to preg_replace the first string to only get
031,02a,009,02B,975
(to see it in action : http://regexr.com?3795f )
I'm open to all solutions, but I admit that I'd really like to make this work with a preg_replace if possible (it has become something of a personal challenge).
Thanks for all help !
As @Taemyr pointed out in comments, my previous solution (using a lookbehind assertion) was incorrect, as it would consume 3 characters at a time even while substrings weren't always 3 characters.
Let's use a lookahead assertion instead to get around this:
'/(^|,)(?![09][0-9]{2}|[09][0-9][A-Za-z])[^,]*/'
The above matches the beginning of the string or a comma, then checks that what follows does not match one of the two forms you've specified to keep, and given that this condition passes, matches as many non-comma characters as possible.
However, this is identical to @anubhava's solution, meaning it has the same weakness, in that it can leave a leading comma in some cases. See this Ideone demo.
ltriming the comma is the clean way to go there, but then again, if you were looking for the "clean way to go," you wouldn't be trying to use a single preg_replace to begin with, right? Your question is whether it's possible to do this without using any other PHP functions.
The answer is yes. We can take
'/(^|,)foo/'
and distribute the alternation,
'/^foo|,foo/'
so that we can tack on the extra comma we wish to capture only in the first case, i.e.
'/^foo,|,foo/'
That's going to be one hairy expression when we substitute foo with our actual regex, isn't it. Thankfully, PHP supports recursive patterns, so that we can rewrite the above as
'/^(foo),|,(?1)/'
And there you have it. Substituting foo for what it is, we get
'/^((?![09][0-9]{2}|[09][0-9][A-Za-z])[^,]*),|,(?1)/'
which indeed works, as shown in this second Ideone demo.
Let's take some time here to simplify your expression, though. [0-9] is equivalent to \d, and you can use case-insensitive matching by adding /i, like so:
'/^((?![09]\d{2}|[09]\d[a-z])[^,]*),|,(?1)/i'
You might even compact the inner alternation:
'/^((?![09]\d(\d|[a-z]))[^,]*),|,(?1)/i'
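As a quick check, the compacted pattern can be run against the sample string from the question. This is only a sketch; the variable names are chosen for illustration.

```php
<?php
$s = '031,02a,009,a,aaa,AZ,AZE,02B,975,135';

// First alternative: an unwanted element at the very start plus its trailing comma.
// Second alternative: a comma followed by an unwanted element; (?1) reuses group 1.
$pattern = '/^((?![09]\d(\d|[a-z]))[^,]*),|,(?1)/i';

$result = preg_replace($pattern, '', $s);
echo $result, "\n";
```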
Try it in more steps:
$newList = array();
foreach (explode(',', $list) as $element) {
    // Keep only the elements that match one of the two allowed forms.
    if (preg_match('/^(?:[09][0-9]{2}|[09][0-9][A-Za-z])$/', $element)) {
        $newList[] = $element;
    }
}
$list = implode(',', $newList);
You still have your regex, see! Personal challenge completed.
Try matching what you want to keep and then joining it with commas:
preg_match_all('/[09][0-9]{2}|[09][0-9][A-Za-z]/', $input, $matches);
$result = implode(',', $matches[0]); // $matches[0] holds the full-pattern matches
The problem you'll be facing with preg_replace is the extra commas you'll have to strip, because you don't just want to remove aaa, you actually want to remove aaa, or ,aaa. Now what happens when you have things to remove both at the beginning and at the end of the string? You can't just say "I'll just strip the comma before", because that might lead to an extra comma at the beginning of the string, and vice versa. So basically, unless you want to mess with lookaheads and/or lookbehinds, you'd better do this in two steps.
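One way to sketch the two-step idea is to split on commas, filter, and join again, which avoids the stray-comma problem entirely. The example below uses preg_grep() for the filtering step; anchoring the pattern with ^ and $ is an assumption here, namely that a whole element has to match one of the two allowed forms.

```php
<?php
$input = '031,02a,009,a,aaa,AZ,AZE,02B,975,135';

// Step 1: split into elements and keep those that match an allowed form.
$kept = preg_grep('/^[09][0-9](?:[0-9]|[A-Za-z])$/', explode(',', $input));

// Step 2: join the survivors back together; no commas are left over.
$result = implode(',', $kept);
echo $result, "\n";
```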
This should work for you:
$s = '031,02a,009,a,aaa,AZ,AZE,02B,975,135';
echo ltrim(preg_replace('/(^|,)(?![09][0-9]{2}|[09][0-9][A-Za-z])[^,]+/', '', $s), ',');
OUTPUT:
031,02a,009,02B,975
Try this:
preg_replace('/(^|,)[1-8a-z][^,]*/i', '', $string);
this will remove all substrings starting with the start of the string or a comma, followed by a non allowed first character, up to but excluding the following comma.
As per @GeoffreyBachelet's suggestion, to remove residual commas, you should do:
trim(preg_replace('/(^|,)[1-8a-z][^,]*/i', '', $string), ',');

Replace from one custom string to another custom string

How can I replace a string starting with 'a' and ending with 'z'?
basically I want to be able to do the same thing as str_replace but be indifferent to the values in between two strings in a 'haystack'.
Is there a built in function for this? If not, how would i go about efficiently making a function that accomplishes it?
That can be done with Regular Expression (RegEx for short).
Here is a simple example:
$string = 'coolAfrackZInLife';
$replacement = 'Stuff';
$result = preg_replace('/A.*Z/', $replacement, $string);
echo $result;
The above example will return coolStuffInLife
A little explanation of the given RegEx /A.*Z/:
- The slashes indicate the beginning and end of the RegEx;
- A and Z are the start and end characters between which you need to replace;
- . matches any single character;
- * means zero or more of the preceding item (in our case, any characters at all);
- You can optionally use + instead of *, which will match only if there is something in between.
Take a look at Rubular.com for a simple way to test your RegExs. It also provides short RegEx reference
$string = "I really want to replace aFGHJKz with booo";
$new_string = preg_replace('/a[a-zA-Z]+z/', 'boo', $string);
echo $new_string;
Be wary of the regex, are you wanting to find the first z or last z? Is it only letters that can be between? Alphanumeric? There are various scenarios you'd need to explain before I could expand on the regex.
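The "first z or last z" question comes down to greedy versus lazy quantifiers. A small sketch on a made-up sample string shows the difference:

```php
<?php
$s = 'xaBBzCCzy';

// Greedy: .* runs as far as it can, so the match ends at the LAST z.
$greedy = preg_replace('/a.*z/', '-', $s);

// Lazy: .*? stops as early as it can, so the match ends at the FIRST z.
$lazy = preg_replace('/a.*?z/', '-', $s);

var_dump($greedy, $lazy);
```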
use preg_replace so you can use regex patterns.

Obtain first line of a string in PHP

In PHP 5.3 there is a nice function that seems to do what I want:
strstr($input, "\n", true)
Unfortunately, the server runs PHP 5.2.17 and the optional third parameter of strstr is not available. Is there a way to achieve this in previous versions in one line?
For the relatively short texts, where lines could be delimited by either one ("\n") or two ("\r\n") characters, the one-liner could be like
$line = preg_split('#\r?\n#', $input, 2)[0];
for any sequence before the first line feed, even if it is an empty string,
or
$line = preg_split('#\r?\n#', ltrim($input), 2)[0];
for the first non-empty string.
However, for the large texts it could cause memory issues, so in this case strtok mentioned below or a substr-based solution featured in the other answers should be preferred.
When this answer was first written, almost a decade ago, it had a few subtle shortcomings:
- it was too localized, following the original post in assuming that the line delimiter is always a single "\n" character, which is not always the case. Using PHP_EOL is not the solution, as we may be dealing with outside data that is not affected by the local system settings;
- it assumed that we need the first non-empty string;
- there was no way to use either explode() or preg_split() in one line, hence a trick with strtok() was proposed. However, shortly after, thanks to the Uniform Variable Syntax, proposed by Nikita Popov, it became possible to use one of these functions in a neat one-liner.
As this question has gained some popularity, it is better to cover all the possible edge cases in the answer. For historical reasons, here is the original solution:
$str = strtok($input, "\n");
that will return the first non-empty line from the text in the unix format.
However, given that the line delimiters could be different and the behavior of strtok() is not that straightforward ("Delimiter characters at the start or end of the string are ignored", as the man page for the original strtok() function in C puts it), I would now advise using this function with caution.
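The difference between these variants shows up on input that starts with blank lines and uses "\r\n" line endings. The sample input below is made up for illustration:

```php
<?php
$input = "\n\nfirst\r\nsecond";

// strtok() skips the leading "\n" characters but keeps the "\r" before the next "\n".
$viaStrtok = strtok($input, "\n");

// preg_split() treats "\r\n" or "\n" as the delimiter; the first line here is empty.
$viaSplit = preg_split('#\r?\n#', $input, 2)[0];

// With ltrim() the leading blank lines are dropped first.
$viaTrim = preg_split('#\r?\n#', ltrim($input), 2)[0];

var_dump($viaStrtok, $viaSplit, $viaTrim);
```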
It's late but you could use explode.
<?php
$lines=explode("\n", $string);
echo $lines[0];
?>
$first_line = substr($fulltext, 0, strpos($fulltext, "\n"));
or something thereabouts would do the trick. Ugly, but workable.
try
substr($input, 0, strpos($input, "\n"))
echo str_replace(strstr($input, "\n"), '', $input);
list($line_1, $remaining) = explode("\n", $input, 2);
Makes it easy to get the top line and the content left behind if you wanted to repeat the operation. Otherwise use substr as suggested.
This one does not depend on the type of line-break character:
(($pos=strpos($text,"\n"))!==false) || ($pos=strpos($text,"\r"));
$firstline = substr($text,0,(int)$pos);
$firstline now contains the first line of the text, or an empty string if no line-break character was found (or if a line break is the first character of the text).
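A related sketch that also returns the whole text when there is no line break at all, and that avoids a trailing "\r" on "\r\n" input, uses strcspn() to measure the run of characters before the first "\r" or "\n". firstLine() is a hypothetical helper name; the body is a single expression and uses no syntax newer than PHP 5.2.

```php
<?php
// firstLine() is a hypothetical helper name used only for this sketch.
function firstLine($text)
{
    // strcspn() returns the length of the initial segment containing
    // neither "\r" nor "\n"; with no line break it is the full length.
    return substr($text, 0, strcspn($text, "\r\n"));
}

var_dump(firstLine("one\r\ntwo"), firstLine("one\ntwo"), firstLine("no break"));
```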
try this:
substr($text, 0, strpos($text, chr(10)))
You can use strpos combined with substr. First you find the position where the character is located and then you return that part of the string.
$pos = strpos($input, "\n");
if ($pos !== false) {
echo substr($input, 0, $pos);
} else {
echo 'String not found';
}
Is this what you want ?
Later edit:
I didn't notice the one-line restriction, so this is not applicable as it stands. You can combine the two functions in a single line as others suggested, or you can create a custom function that is called in one line of code, as wanted. Your choice.
String manipulation often has to deal with values that start with a blank line, so decide whether you really want to keep blank lines at the start and end of the string or trim them. Also, to avoid OS-specific mistakes, use PHP_EOL to find the newline character in a cross-platform-compatible way (When do I use the PHP constant "PHP_EOL"?).
$lines = explode(PHP_EOL, trim($string));
echo $lines[0];
A quick way to get the first n lines of a string, as a string, while keeping the line breaks.
Example: the first 6 lines of $multilinetxt
echo join("\n", array_slice(explode("\n", $multilinetxt), 0, 6));
This can be quickly adapted to catch a particular block of text, for example lines 10 to 13 (offset 9, length 4):
echo join("\n", array_slice(explode("\n", $multilinetxt), 9, 4));
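When picking a block of lines this way, keep in mind that array_slice() takes an offset and a length, not a start and an end index. A short sketch on generated sample text (the 15 numbered lines are an assumption made for illustration):

```php
<?php
// Sample text: fifteen lines containing "1" through "15".
$multilinetxt = implode("\n", range(1, 15));
$lines = explode("\n", $multilinetxt);

// First six lines: offset 0, length 6.
$firstSix = implode("\n", array_slice($lines, 0, 6));

// Lines 10 to 13: offset 9 (zero-based), length 4.
$tenToThirteen = implode("\n", array_slice($lines, 9, 4));

echo $firstSix, "\n---\n", $tenToThirteen, "\n";
```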

PHP Regex: Select all except last occurrence

I'm trying to replace all \n's sans that final one with \n\t in order to nicely indent for a recursive function.
This
that
then
thar
these
them
should become:
This
    that
    then
    thar
    these
them
This is what I have: preg_replace('/\n(.+?)\n/','\n\t$1\n',$var);
It currently spits this out:
This
that
then
thar
these
them
Quick Overview:
Need to indent every line less the first and last line using regex, how can I accomplish this?
You can use a lookahead:
$var = preg_replace('/\n(?=.*?\n)/', "\n\t", $var);
See it working here: ideone
After fixing a quotes issue, your output is actually like this:
This
    that
then
    thar
these
them
Use a positive lookahead to stop that trailing \n from getting eaten by the search regex. Your "cursor" was already set beyond it so only every other line was being rewritten; your match "zones" overlapped.
echo preg_replace('/\n(.+?)(?=\n)/', "\n\t$1", $input);
// pattern parts: \n = newline, (.+?) = line text, (?=\n) = lookahead; "\n\t$1" = replacement
Live demo.
preg_replace('/\n(.+?)(?=\n)/',"\n\t$1",$var);
Modified the second \n to be the lookahead (?=\n), otherwise you'd run into issues with regex not recognizing overlapping matches.
http://ideone.com/1JHGY
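To see the overlap problem directly, the consuming pattern and the lookahead pattern can be run side by side on the sample text from the question. This is a sketch; the variable names are chosen for illustration.

```php
<?php
$var = "This\nthat\nthen\nthar\nthese\nthem";

// Consuming the trailing \n means the next match cannot start on it,
// so only every other line gets indented.
$consuming = preg_replace('/\n(.+?)\n/', "\n\t$1\n", $var);

// The lookahead checks for the trailing \n without consuming it,
// so every line except the first and the last gets indented.
$lookahead = preg_replace('/\n(.+?)(?=\n)/', "\n\t$1", $var);

echo $consuming, "\n---\n", $lookahead, "\n";
```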
Let the downvoting begin, but why use regex for this?
<?php
$e = explode("\n",$oldstr);
$str = $e[count($e) - 1];
unset($e[count($e) - 1]);
$str = implode("\n\t",$e)."\n".$str;
echo $str;
?>
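Actually, str_replace has a "count" parameter, but it is an output parameter: it receives the number of replacements performed and cannot be used to limit them. preg_replace does accept a limit, so this should work:
<?php
// Replace every "\n" except the last one.
$limit = substr_count($oldstr, "\n") - 1;
$newstr = $limit > 0 ? preg_replace('/\n/', "\n\t", $oldstr, $limit) : $oldstr;
?>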
