Remove all characters after a character using a regular expression in PHP

I need to remove all the characters after the character which I select in a string.
Here is my string
$string = "Blow the fun in the sun.paste a random text";
$new_string = preg_replace("/paste.*/","",$string,-1,$count);
echo $new_string;
My output is Blow the fun in the sun.
If my string is like this
$string = "Blow the fun in the sun.paste \n a random text";
$new_string = preg_replace("/paste.*/","",$string,-1,$count);
echo $new_string;
My output is
Blow the fun in the sun.
a random text
But, I need my output as Blow the fun in the sun. even if there are \n or \t or some other special characters in my strings. How can I match this, while taking those special characters into consideration?

You will need the s flag (DOTALL) to make the dot match newlines:
$new_string = preg_replace("/paste.*/s", "", $string, -1, $count);
Without the s flag, the dot does not match newline characters, so .* stops at the first \n and the text after it is left in place.
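To see the difference the s modifier makes, here is a small sketch that runs the question's string through both patterns. The variable names $without_s and $with_s are only illustrative.

```php
<?php
// The question's input, with a newline after "paste".
$string = "Blow the fun in the sun.paste \n a random text";

// Without "s", the dot stops at the newline, so the text after it survives.
$without_s = preg_replace("/paste.*/", "", $string);

// With "s" (DOTALL), the dot also matches newlines, so everything from
// "paste" to the end of the string is removed.
$with_s = preg_replace("/paste.*/s", "", $string);

var_dump($without_s);
var_dump($with_s);
```

Only the second call leaves the string ending at "sun.".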

Related

PHP rtrim all trailing special characters

I'm making a function that detects and removes all trailing special characters from a string. It should convert strings like:
"hello-world"
"hello-world/"
"hello-world--"
"hello-world/%--+..."
into "hello-world".
Does anyone know a trick for this that doesn't require writing a lot of code?
Just for fun
[^a-z\s]+
Explanation:
[^x]: one character that is not x
\s: "whitespace character": space, tab, newline, carriage return, vertical tab
+: one or more
PHP:
$re = "/[^a-z\\s]+/i";
$str = "Hello world\nhello world/\nhello world--\nhellow world/%--+...";
$subst = "";
$result = preg_replace($re, $subst, $str);
Try this:
$string = preg_replace('/[^A-Za-z0-9\-]/', '', $string); // Removes special chars.
or, to keep apostrophes as well:
preg_replace('/[^A-Za-z0-9\-\']/', '', $string); // Also keeps apostrophes.
You could use a regex like this, depending on your definition of "special characters":
function clean_string($input) {
return preg_replace('/\W+$/', '', $input);
}
It replaces any characters that are not a word character (\W) at the end of the string $ with nothing. \W will match [^a-zA-Z0-9_], so anything that is not a letter, digit, or underscore will get replaced. To specify which characters are special chars, use a regex like this, where you put all your special chars within the [] brackets:
function clean_string($input) {
return preg_replace('/[\/%.+-]+$/', '', $input);
}
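As a quick check of the \W+$ approach, the sketch below runs the four sample strings from the question through it. The helper name strip_trailing_specials is hypothetical, and the rtrim() call is an alternative I am adding for comparison, not something proposed in the answers.

```php
<?php
// Hypothetical helper: strip trailing characters that are not letters,
// digits, or underscores.
function strip_trailing_specials(string $input): string {
    return preg_replace('/\W+$/', '', $input);
}

$samples = ["hello-world", "hello-world/", "hello-world--", "hello-world/%--+..."];
$cleaned = array_map('strip_trailing_specials', $samples);
print_r($cleaned);

// Non-regex alternative: rtrim() with an explicit list of characters to strip.
$trimmed = rtrim("hello-world/%--+...", "/%.+-");
echo $trimmed, "\n";
```

rtrim() is enough when the set of special characters is small and known in advance; the regex version is easier when "special" simply means "anything that is not a word character".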
This is what you are looking for:
([^\n\w\d \"]*)$
It removes, from the end of the string, anything that is not a word character (letter, digit, or underscore), a space, or a newline.
Just call it like this:
preg_replace('/([^\n\w\s]*)$/', '', $string);

Text file as single string in PHP code

my text file is like this:
atagatatagatagtacataacta\n
actatgctgtctgctacgtccgta\n
ctgatagctgctcgctactacgat\n
gtcatgatctgatctacgatcaga\n
I need this file as a single string (a single line), both in the same order and in reverse order, like this:
atagatatagatagtacataactaactatgctgtctgctacgtccgtactgatagctgctcgctactacgatgtcatgatctgatctacgatcaga
and reversed (for which I didn't write code because I need help).
I am using:
<?php
$re = "/[AG]?[AT][AT]GAGG[ATC]GC[GA]?[ATGC]/";
$str = file_get_contents("filename.txt");
trim($str);
preg_match($re, $str, $matches);
print_r($matches);
?>
You can remove spaces and newlines using preg_replace, and you can reverse a string using strrev.
$yourString = "atagatatagatagtacataacta\n actatgctgtctgctacgtccgta\n ctgatagctgctcgctactacgat\n gtcatgatctgatctacgatcaga\n";
$stringWithoutSpaces = preg_replace("/\s+/", "", $yourString);
$stringReversed = strrev($stringWithoutSpaces);
echo $stringReversed;
http://php.net/manual/de/function.preg-replace.php
http://php.net/manual/en/function.strrev.php
Explanation:
preg_replace replaces every part of $yourString that matches the search pattern "/\s+/" with the empty string "". The \s in the pattern stands for any whitespace character (tab, line feed, carriage return, space, form feed), and the + makes it match runs of several whitespace characters, not just one.
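Applied to the four lines from the question, the two steps can be sketched as follows. The string $contents is a stand-in for the result of file_get_contents("filename.txt"), so the example runs without a file on disk.

```php
<?php
// Stand-in for file_get_contents("filename.txt"): the four lines from the question.
$contents = "atagatatagatagtacataacta\nactatgctgtctgctacgtccgta\n"
          . "ctgatagctgctcgctactacgat\ngtcatgatctgatctacgatcaga\n";

// Drop every run of whitespace (including the newlines) to get one line.
$forward = preg_replace('/\s+/', '', $contents);

// Reverse the single-line string.
$reverse = strrev($forward);

echo $forward, "\n", $reverse, "\n";
```

Both $forward and $reverse can then be passed to preg_match() in place of the raw file contents.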

PHP Regex: Remove words less than 3 characters

I'm trying to remove all words of less than 3 characters from a string, specifically with RegEx.
The following doesn't work because it is looking for double spaces. I suppose I could convert all spaces to double spaces beforehand and then convert them back after, but that doesn't seem very efficient. Any ideas?
$text='an of and then some an ee halved or or whenever';
$text=preg_replace('# [a-z]{1,2} #',' ',' '.$text.' ');
echo trim($text);
Removing the Short Words
You can use this:
$replaced = preg_replace('~\b[a-z]{1,2}\b~', '', $yourstring);
Explanation
\b is a word boundary: it matches a position where one side is a word character (letter, digit, or underscore) and the other side is not (for instance a space character, or the beginning of the string)
[a-z]{1,2} matches one or two letters
\b another word boundary
Replace with the empty string.
Option 2: Also Remove Trailing Spaces
If you also want to remove the spaces after the words, we can add \s* at the end of the regex:
$replaced = preg_replace('~\b[a-z]{1,2}\b\s*~', '', $yourstring);
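For illustration, here are both options run against the sample text from the question. The variable names $words_only and $with_spaces are just placeholders.

```php
<?php
$text = 'an of and then some an ee halved or or whenever';

// Option 1: remove the short words only; the spaces around them stay behind.
$words_only = preg_replace('~\b[a-z]{1,2}\b~', '', $text);

// Option 2: also swallow the whitespace that follows each short word.
$with_spaces = preg_replace('~\b[a-z]{1,2}\b\s*~', '', $text);

var_dump($words_only);
var_dump($with_spaces);
```

Option 1 leaves runs of spaces where the short words used to be, so it usually needs a follow-up pass to collapse whitespace; option 2 avoids that for this input.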
Reference
Word Boundaries
You can use the word boundary tag: \b:
Replace: \b[a-z]{1,2}\b with ''
Use this
preg_replace('/(\b.{1,2}\s)/','',$your_string);
Although some of the solutions here worked, they had a problem with my language's "multichar characters", such as "ch". A simple explode and implode worked for me.
$maxWordLength = 3;
$string = "my super string";
$exploded = explode(" ", $string);
foreach($exploded as $key => $word) {
if(mb_strlen($word) < $maxWordLength) unset($exploded[$key]);
}
$string = implode(" ", $exploded);
echo $string;
// outputs "super string"
To me, it seems that this works fine with most PHP versions:
$string2 = preg_replace("/\b[a-zA-Z0-9]{1,2}\b/i", "", trim($string1));
Where [a-zA-Z0-9] is the accepted range of letters and digits.

Regex to insert dot (.) after characters, before new line

I'm reformatting some text, and sometimes I have a string, where there is a sentence which is not ended by a dot.
I'm running various checks for this purpose, and one more I'd like is to "Add dot after last character before new line".
I'm not sure how to form the regular expression for this:
$string = preg_replace("/???/", ".\n", $string);
Try this one:
$string = preg_replace("/(?<![.])(?=[\n\r]|$)/", ".", $string);
The negative lookbehind (?<![.]) checks that the previous character is not a dot.
The positive lookahead (?=[\n\r]|$) checks that the next character is a newline, or that the end of the string has been reached.
like this I suppose:
<?php
$string = "Add dot after last character before new line\n";
$string = preg_replace("/(.)$/", "$1.\n", $string);
print $string;
?>
This way the dot will be added after the word line in the sentence and before the \n.
demo : http://ideone.com/J4g7tH
I'd do:
$string = "Add dot after last character before new line\n";
$string = preg_replace("/([^.\r\n])$/s", "$1.", $string);
Thanks for all the answers, but none of them really caught all scenarios right.
I fumbled my way to a good solution using the word boundary assertion \b:
// Add a dot at every word boundary that is followed by a newline.
$string = preg_replace("/\b\n/", ".\n", $string);
Note that \b must be written without square brackets: inside a character class, [\b] means a backspace character rather than a word boundary.
This works for me:
$content = preg_replace("/(\w+)(\n)/", "$1.$2", $content);
It will match a word immediately followed by a new line, and add a dot in between.
Will match:
Hello\n
Will not match:
Hello \n
or
Hello.\n
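The cases discussed in this thread (a word directly before the newline, a trailing space, an existing dot, Windows line endings, and a last line without a newline) can be checked with a lookaround-based sketch. This is my own variant rather than one of the posted answers, and the helper name add_missing_dots is hypothetical. It assumes a dot is wanted only directly after a word character.

```php
<?php
// Hypothetical helper: insert a dot at every position that directly follows
// a word character and directly precedes a line break (\n or \r\n) or the
// very end of the string.
function add_missing_dots(string $text): string {
    return preg_replace('/(?<=\w)(?=\r?\n|\z)/', '.', $text);
}

echo add_missing_dots("Hello\nWorld.\nFoo \nBar"), "\n";
```

Because both parts of the pattern are zero-width assertions, nothing is consumed and the replacement only inserts the dot. Lines that already end in a dot or in a space are left alone.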

Remove all non-matching characters in PHP string?

I've got text from which I want to remove all characters that ARE NOT the following.
desired_characters =
0123456789!&',-./abcdefghijklmnopqrstuvwxyz\n
The last is a \n (newline) that I do want to keep.
To match all characters except the listed ones, use an inverted character set [^…]:
$chars = "0123456789!&',-./abcdefghijklmnopqrstuvwxyz\n";
$pattern = "/[^".preg_quote($chars, "/")."]/";
Here preg_quote is used to escape certain special characters so that they are interpreted as literal characters.
You could also use character ranges to express the listed characters:
$pattern = "/[^0-9!&',-.\\/a-z\n]/";
In this case it doesn’t matter if the literal - in ,-. is escaped or not. Because ,-. is interpreted as character range from , (0x2C) to . (0x2E) that already contains the - (0x2D) in between.
Then you can remove those characters that are matched with preg_replace:
$output = preg_replace($pattern, "", $str);
$string = 'This is anexample $tring! :)';
$string = preg_replace('/[^0-9!&\',\-.\/a-z\n]/', '', $string);
echo $string; // hisisanexampletring!
^ This is case-sensitive, hence the capital T is removed from the string. To allow capital letters as well, use:
$string = preg_replace('/[^0-9!&\',\-.\/A-Za-z\n]/', '', $string);
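Putting the preg_quote approach together, a minimal runnable sketch might look like this. The variable names $allowed and $input and the second test line are made up for the example.

```php
<?php
// The characters to keep, as listed in the question (the last one is a newline).
$allowed = "0123456789!&',-./abcdefghijklmnopqrstuvwxyz\n";

// Build a negated character class; preg_quote() escapes characters such as
// "-", "." and the "/" delimiter so they are taken literally.
$pattern = "/[^" . preg_quote($allowed, "/") . "]/";

$input  = 'This is anexample $tring! :)' . "\nLine 2";
$output = preg_replace($pattern, "", $input);

var_dump($output);
```

The newline between the two lines survives because it is part of $allowed, while spaces, capital letters, "$", ":" and ")" are all removed.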
