I am working on a project that cleans text with PHP.
Say I have a string like this:
$string = 'Life is like. . . . . . a box of chocolate. . . . . '; //each dot separated by space
How can I get output like this?
$string ='Life is like a box of chocolate';
You can use preg_replace
$string = 'Life is like. . . . . . a box of chocolate. . . . . ';
echo preg_replace('/[.,]/', '', $string);
A simple preg_replace can strip the dots, but to get exactly your expected result, with single spaces between the words, you need a slightly more complex regex:
echo preg_replace(array("/(?<!\w) *\. */", "/(?<=\w)\. */"), array("", " "), $string);
Here I use an array of two regexes to match two different cases:
1. A dot preceded and followed by 0 or more spaces, where the whole match is not preceded by a word character (\w in regex means a letter, a digit, or an underscore, and (?<!\w) is a negative lookbehind). This is replaced by an empty string.
2. A dot followed by 0 or more spaces and preceded by a word character. This is replaced by a single space.
Case 2 matches and removes the dot in 'like. ' but leaves a single space after the word.
This gives your expected result, 'Life is like a box of chocolate', apart from one trailing space that trim() removes.
You can learn more about regex (short for regular expression) from tutorials on the web.
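A minimal sketch of the two-pattern approach described above. The helper name stripSpacedDots is hypothetical, and the trim() call is an addition that is not part of the original answer; it drops the space the second pattern leaves at the end of the string.

```php
<?php
// Hypothetical helper: remove runs of space-separated dots from a sentence.
function stripSpacedDots(string $text): string
{
    $patterns = [
        '/(?<!\w) *\. */', // dot not directly preceded by a word character: delete it
        '/(?<=\w)\. */',   // dot directly after a word character: keep one space
    ];
    // The patterns are applied in order, each on the result of the previous one.
    $cleaned = preg_replace($patterns, ['', ' '], $text);
    // trim() is an addition here: it removes the space left after "chocolate".
    return trim($cleaned);
}

echo stripSpacedDots('Life is like. . . . . . a box of chocolate. . . . . ');
```

The order of the two patterns matters: the first one removes every dot except the one glued to a word, and the second one turns that remaining dot into the single separating space.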
Isn't this a simple str_replace()?
str_replace(". ", "", $text); // note: this also removes the space after 'like', giving 'Life is likea box of chocolate'
Given an address stored as a single string with newlines delimiting its components like:
1 Street\nCity\nST\n12345
The goal would be to replace all newline characters except the first one with spaces in order to present it like:
1 Street
City ST 12345
I have tried methods like:
[$street, $rest] = explode("\n", $input, 2);
$output = "$street\n" . preg_replace('/\n+/', ' ', $rest);
I have been trying to achieve the same result with a one-liner regular expression, but could not figure out how.
I would suggest not solving this with a complicated regex and keeping it simple, as below. You can split the string on \n, shift off the first element, and implode the rest with a space.
<?php
$input = explode("\n","1 Street\nCity\nST\n12345");
$input = array_shift($input) . PHP_EOL . implode(" ", $input);
echo $input;
You could use a regex trick here by reversing the string, and then replacing every occurrence of \n provided that we can lookahead and find at least one other \n:
$input = "1 Street\nCity\nST\n12345";
$output = strrev(preg_replace("/\n(?=.*\n)/", " ", strrev($input)));
echo $output;
This prints:
1 Street
City ST 12345
You can use a lookbehind pattern to ensure that the matching line is preceded with a newline character. Capture the line but not the trailing newline character and replace it with the same line but with a trailing space:
preg_replace('/(?<=\n)(.*)\n/', '$1 ', $input)
Demo: https://onlinephp.io/c/5bd6d
You can use an alternation pattern that matches either the first two lines or a newline character, capture the first two lines without the trailing newline character, and replace the match with what's captured and a space:
preg_replace('/(^.*\n.*)\n|\n/', '$1 ', $input)
Demo: https://onlinephp.io/c/2fb2f
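A short sketch for checking the two one-liners above against the sample address. The variable names are illustrative only; the patterns are the ones given in the answers.

```php
<?php
$input = "1 Street\nCity\nST\n12345";

// Lookbehind variant: only a line that follows a newline loses its trailing newline.
$viaLookbehind = preg_replace('/(?<=\n)(.*)\n/', '$1 ', $input);

// Alternation variant: the first branch keeps the first newline inside group 1;
// for every later newline group 1 is empty, so only the space is inserted.
$viaAlternation = preg_replace('/(^.*\n.*)\n|\n/', '$1 ', $input);

echo $viaLookbehind, "\n---\n", $viaAlternation, "\n";
```

Both variants leave the first newline in place and turn the remaining ones into spaces.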
Here is another method that avoids regex. The regex approaches are correct as long as their conditions are met; splitting on the newline works as long as the address has four lines:
$string = explode("\n", "1 Street\nCity\nST\n12345");
echo $string[0] . "<br>";
echo $string[1] . " " . $string[2] . " " . $string[3];
I have been investigating, but I cannot find a solution to this.
The idea is to replace the characters $_ in a string with something else.
If you remove the dollar sign from the $search variable, it kind of works (but not in a desirable way).
It is not working because the dollar sign is a special character, but I cannot find out how to escape it.
This is what I have:
$search = '$_'; // with '_' or '[$_]' it returns "$1" instead of "1"
$replace = 1;
$regex = '#".*?"(*SKIP)(*FAIL)|\b' . $search . '\b#s';
$fullInput = '"$_" $_';
$r = preg_replace([$regex], $replace, $fullInput);
echo $r . PHP_EOL;
// Output with current code : "$_" $_
// Output with '_' or '[$_]': "$_" $1
//
// Expected result: "$_" 1
Note that text between quotes should not be replaced.
You may use this regex for search:
"[^\\"]*(?:\\.|[^\\"]*)*"(*SKIP)(*F)|\$_
and replace it with [$0]
Pattern before | matches a double quoted string allowing an escaped quote in between. Pattern after | matches $_.
RegEx Details:
"[^\\"]*(?:\\.|[^\\"]*)*": Match a double quoted string. We allow escaped characters in this match.
(*SKIP)(*F): Skip and fail this match
|: OR
\$_: Match literal text $_
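As a side note on why the pattern in the question never matched: an unescaped $ is an end-of-line anchor, and even when it is escaped, \b only matches between a word character and a non-word character. Since $ is not a word character, there is no boundary between a space (or a quote) and the $. A small sketch, with illustrative variable names:

```php
<?php
$fullInput = '"$_" $_';

// \b before \$ would need a word character right before the "$";
// here the preceding character is a quote or a space, so nothing matches.
$withBoundary = preg_match('/\b\$_\b/', $fullInput);

// Without \b, and with preg_quote() escaping the "$", the text is found.
$withoutBoundary = preg_match('/' . preg_quote('$_', '/') . '/', $fullInput);

var_dump($withBoundary);    // int(0)
var_dump($withoutBoundary); // int(1)
```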
Code:
$search = '$_';
$replace = '[$0]';
$regex = '/"[^\\\\"]*(?:\\\\.|[^\\\\"]*)*"(*SKIP)(*F)|' . preg_quote($search, '/') . '/'; // four backslashes in a single-quoted PHP string reach the regex engine as two
$fullInput = '"$_" $_';
$r = preg_replace($regex, $replace, $fullInput);
echo $r . PHP_EOL;
Output:
"$_" [$_]
I want a regex that finds the word abc in a paragraph in the following scenarios:
abc
'abc'
"abc"
abc's
def'abc
`abc`
(abc)
[abc]
{abc}
abc<br/> (any tag can appear after abc)
<br/>abc (any tag can appear before abc)
abc:
abc;
abc,
#abc (Here apart from # it can be any special character)
After finding all those occurrences of abc, I want to replace them with this:
<span class='clsIgnoreWord'>abc</span>.
Which regex would find these scenarios and replace them with the span wrapper above?
The code I have tried only handles a plain word replacement, and I want all of the above scenarios covered by one regex.
The code I tried for a single word is:
preg_replace(
"/\b" . $string . "\b/",
"<span class='clsIgnoreWord'>" . $string . "</span>",
$paragraphText
);
Try this:
$word = 'abc';
$subject = "here there def'abc";
$replacement = '<span class="clsIgnoreWord">abc</span>';
$pattern = "/([^\w^\ ^\>]|\<br\/?\>|(\w+\'?))?(" . preg_quote($word) . "(\'s)?)(\<br\/?\>|[^\w^\ ^\<])?/";
$result = preg_replace($pattern, $replacement, $subject);
Regex explanation:
The first group ([^\w^\ ^\>]|\<br\/?\>|(\w+\'?))? is optional and matches one of:
a single non-word character other than a space, ^, or >: the [^\w^\ ^\>] part, or
a <br> or <br/> tag: the \<br\/?\> part, or
a run of word characters optionally followed by ': the (\w+\'?) part.
The second group (" . preg_quote($word) . "(\'s)?) matches the value of $word, optionally followed by 's. preg_quote() escapes any special characters in $word.
The third group (\<br\/?\>|[^\w^\ ^\<])? is optional and matches one of:
a single non-word character other than a space, ^, or <: the [^\w^\ ^\<] part, or
a <br> or <br/> tag: the \<br\/?\> part.
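A minimal sketch of a simpler alternative, not the answer's own pattern. It assumes that "finding abc" means any occurrence that is not glued to another letter or digit, and it uses lookarounds so the surrounding characters (quotes, brackets, tags, 's) are never part of the match and therefore stay in the text. The helper name wrapIgnoredWord is hypothetical.

```php
<?php
// Hypothetical helper: wrap each standalone occurrence of $word in a span,
// leaving the characters around it untouched.
function wrapIgnoredWord(string $text, string $word): string
{
    // Assumption: "standalone" means not directly next to a letter or digit.
    $pattern = '/(?<![A-Za-z0-9])' . preg_quote($word, '/') . '(?![A-Za-z0-9])/';
    // $0 is the matched word itself, so nothing outside the word is replaced.
    return preg_replace($pattern, '<span class="clsIgnoreWord">$0</span>', $text);
}

echo wrapIgnoredWord("def'abc (abc) abc's abcd", 'abc');
```

With this sketch the word inside def'abc, (abc), and abc's is wrapped, while abcd is left alone because the word is followed by another letter.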
I've got this format: 00-0000 and would like to get to 0000-0000.
My code so far:
<?php
$string = '11-2222';
echo $string . "\n";
echo preg_replace('~(\d{2})[-](\d{4})~', "$1_$2", $string) . "\n";
echo preg_replace('~(\d{2})[-](\d{4})~', "$100$2", $string) . "\n";
The problem is - the 0's won't be added properly (I guess preg_replace thinks I'm talking about argument $100 and not $1)
How can I get this working?
You could try the below.
echo preg_replace('~(?<=\d{2})-(?=\d{4})~', "00", $string) . "\n";
This replaces the hyphen with 00, which keeps things simple.
or
preg_replace('~-(\d{2})~', "$1-", $string)
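The replacement string "$100$2" is not read as group 1 followed by 00. PHP takes up to two digits after the $, so it is interpreted as the content of capturing group 10, followed by a literal 0 and the content of capturing group 2.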
In order to force it to use the content of capturing group 1, you can specify it as:
echo preg_replace('~(\d{2})[-](\d{4})~', '${1}00$2', $string) . "\n";
Note that I specify the replacement string as a single-quoted string. In a double-quoted string, PHP would attempt (and fail) to expand a variable named 1.
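A short sketch of the ${1} form in use. The second call is an assumption about the wanted 0000-0000 format: it simply puts the hyphen back in the replacement. Variable names are illustrative.

```php
<?php
$string = '11-2222';

// ${1} delimits the group number, so the zeros that follow are literal text.
// Single quotes keep PHP itself from touching the "$".
$joined = preg_replace('~(\d{2})-(\d{4})~', '${1}00$2', $string);

// Assumption: if the hyphen should stay, add it to the replacement.
$hyphenated = preg_replace('~(\d{2})-(\d{4})~', '${1}00-$2', $string);

echo $joined, "\n";     // 11002222
echo $hyphenated, "\n"; // 1100-2222
```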
How to replace part of a string based on a "." (period) character
only if it appears after/before/between a word(s),
not when it is before/between any number(s).
Example:
This is a text string.
-Should be able to replace the "string." with "string ."
(note the Space between the end of the word and the period)
Example 2
This is another text string. 2.0 times longer.
-Should be able to replace "string." with "string ."
(note the Space between the end of the word and the period)
-Should Not replace "2.0" with "2 . 0"
It should only do the replacement if the "." appears at the end/start of a word.
Yes - I've tried various bits of regex.
But everything I do results in either nothing happening,
or the numbers are fine, but I take the last letter from the word preceding the "."
(thus instead of "string." I end up with "strin g.")
Yes - I've looked through numerous posts here - I have seen nothing that deals with this requirement, nor with the "strange" problem of grabbing the character before the ".".
You can use a lookbehind (?<=REXP)
preg_replace("/(?<=[a-z])\./i", "XXX", "2.0 test. abc") // <- "2.0 testXXX abc"
which only matches if the text before the match position matches the corresponding regex (in this case [a-z]). You can use a lookahead (?=REXP) in the same way to test the text after the match.
Note: There is also a negative lookbehind (?<!REXP) and a negative lookahead (?!REXP), which reject a match if REXP does match before or after.
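Applied to the sentence from the question, a sketch of the lookbehind approach could look like this. The replacement ' .' is chosen here to produce the spacing the question asks for. Because the letter is only looked at and not matched, it is not consumed or moved, which avoids the "strin g." effect.

```php
<?php
$input = 'This is another text string. 2.0 times longer.';

// Positive lookbehind: the period must directly follow a letter.
// The period in "2.0" follows a digit, so it is left alone.
$spaced = preg_replace('/(?<=[a-z])\./i', ' .', $input);

echo $spaced;
```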
$input = "This is another text string. 2.0 times longer.";
echo preg_replace("/(^\.|\s\.|\.\s|\.$)/", " $1", $input);
http://ideone.com/xJQzQ
I'm not too good with regex, but this is what I would do to accomplish the task with basic PHP. Basically, explode the entire string on its "." characters, look at each piece to see whether its last character is a letter or a number, add a space if it's a letter, and then put the pieces back together.
<?php
$string = "This is another text string. 2.0 times longer.";
print("$string<br>");
$pieces = explode(".", $string);
// The last piece has no "." after it, so stop one short of the end.
for ($i = 0; $i < count($pieces) - 1; $i++) {
    // Add a space only when the piece ends in a letter, so "2.0" stays intact.
    if (ctype_alpha(substr($pieces[$i], -1))) {
        $pieces[$i] .= " ";
    }
}
$newstring = implode(".", $pieces);
print("$newstring<br>");
?>
It should only do the replacement if the "." appears at the end/start of a word.
search: /([a-z](?=\.)|\.(?=[a-z]))/
replace: "$1 "
modifiers: ig (case insensitive, globally)
Test in Perl:
use strict;
use warnings;
my $samp = '
This is another text string. 2.0 times longer.
I may get a string of "this and that.that and this.this 2.34 then that 78.01."
';
$samp =~ s/([a-z](?=\.)|\.(?=[a-z]))/$1 /ig;
print "$samp";
Input:
This is another text string. 2.0 times longer.
I may get a string of "this and that.that and this.this 2.34 then that 78.01."
Output:
This is another text string . 2.0 times longer .
I may get a string of "this and that . that and this . this 2.34 then that 78.01."
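Since the question is about PHP, here is a sketch of the same search and replace with preg_replace. It assumes only the i modifier is needed, because preg_replace already replaces every match; the variable names are illustrative.

```php
<?php
$samp = 'I may get a string of "this and that.that and this.this 2.34 then that 78.01."';

// Same pattern as the Perl test: a letter followed by ".", or a "." followed by a letter.
// The lookaheads keep the neighbouring character out of the match.
$result = preg_replace('/([a-z](?=\.)|\.(?=[a-z]))/i', '$1 ', $samp);

echo $result;
```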