Regex for breaking repeated characters - php

I have a comment box on my site. If a user enters a run of more than 20 characters (any characters) without putting a space in it, I want a space to be inserted into it.
Like: "asdasdasdasdasdasdasdasd"
Parsed: "asdasdasdasdasdasdas dasd"
I think it can be done with string comparison, but I'd like a regex that matches it, or a full solution. Thanks for any help.

It is called word wrapping.
http://php.net/manual/en/function.wordwrap.php
From the examples:
<?php
$text = "A very long woooooooooooord.";
$newtext = wordwrap($text, 8, " ", true);
echo "$newtext\n";
?>
output:
A very long wooooooo ooooord.

The function wordwrap does this job well. But here is a regex based solution:
$str = "asdasdasdasdasdasdasdasd";
$str = preg_replace('/(.{20})/','$1 ',$str);
This will add a trailing space even if the input is exactly 20 characters long. If you don't want that, use:
$str = preg_replace('/(.{20})(?=.)/','$1 ',$str);
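Both patterns above count any 20 characters, spaces included, so ordinary sentences would also get extra spaces. Here is a minimal sketch that only breaks unbroken runs, assuming single-byte text. The function name breakLongRuns and its $limit parameter are made up for illustration:

```php
<?php
// Hypothetical helper: insert a space after every $limit consecutive
// non-whitespace characters, but only if more non-whitespace follows.
function breakLongRuns(string $text, int $limit = 20): string
{
    // \S{N} matches N non-whitespace characters in a row;
    // (?=\S) is a lookahead, so no space is added at the end of a run.
    return preg_replace('/(\S{' . $limit . '})(?=\S)/', '$1 ', $text);
}

echo breakLongRuns("asdasdasdasdasdasdasdasd"), "\n";     // asdasdasdasdasdasdas dasd
echo breakLongRuns("short words stay as they are"), "\n"; // unchanged
```

Using \S instead of . is the only real change from the answer's second pattern.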

Related

How can I remove the numeric suffix in PHP?

For example, if I want to get rid of the repeating numeric suffix from the end of an expression like this:
some_text_here_1
Or like this:
some_text_here_1_5
and I want to end up with something like this:
some_text_here
What's the best, most flexible solution?
$newString = preg_replace("/_?\d+$/","",$oldString);
It is using regex to match an optional underscore (_?) followed by one or more digits (\d+), but only if they are the last characters in the string ($) and replacing them with the empty string.
To remove any number of repeated numeric suffixes, just wrap the whole regex (except the $) in a capture group and put a + after it:
$newString = preg_replace("/(_?\d+)+$/","",$oldString);
If you only want to remove a numeric suffix when it follows an underscore (e.g. you want some_text_here14 to stay unchanged, but some_text_here_14 to be changed), then it should be:
$newString = preg_replace("/(_\d+)+$/","",$oldString);
Updated to fix more than one suffix
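To compare the three patterns side by side, here is a small sketch that runs each of them over the sample strings from the question. The labels and the $results array are only there for illustration:

```php
<?php
// The three patterns from the answer above, keyed by a short description.
$patterns = [
    'one suffix'          => '/_?\d+$/',
    'repeated suffixes'   => '/(_?\d+)+$/',
    'underscore required' => '/(_\d+)+$/',
];
$inputs = ['some_text_here_1', 'some_text_here_1_5', 'some_text_here14'];

$results = [];
foreach ($patterns as $label => $pattern) {
    foreach ($inputs as $input) {
        // Replace the matched suffix (if any) with the empty string.
        $results[$label][$input] = preg_replace($pattern, '', $input);
        printf("%-20s %-20s => %s\n", $label, $input, $results[$label][$input]);
    }
}
```

Only the third pattern leaves some_text_here14 untouched, and only the first leaves a suffix behind on some_text_here_1_5 (it returns some_text_here_1).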
strrpos() is far better than a regex for such a simple string problem.
$str = "some_text_here_13_15";
// Strip trailing "_<number>" parts one at a time; stop when there is no "_" left.
while (($pos = strrpos($str, "_")) !== false && is_numeric(substr($str, $pos + 1))) {
    $str = substr($str, 0, $pos);
}
echo $str;
strrpos() finds the last "_" in $str; if the text after it is numeric, that suffix is removed.
https://3v4l.org/OTdb9
To give you an idea of why I think regex is not a good solution here, this is the performance:
Regex:
https://3v4l.org/Tu8o2/perf#output
0.027 seconds for 100 runs.
My code with added numeric check:
https://3v4l.org/dkAqA/perf#output
0.003 seconds for 100 runs.
Oddly enough, this new code performs even better than before. Regex is comparatively slow here; trust me on that.
You be the judge of what is best.
First, use preg_replace() with the regex /\d+/ to remove all digits. Then trim any underscores from the right using rtrim(), providing _ as the second parameter. Be aware that this removes digits anywhere in the string, not only in the suffix.
I've combined the two in the following example:
$string = "some_text_here_1";
echo rtrim(preg_replace('/\d+/', '', $string), '_'); // some_text_here
I've also created an example of this at 3v4l here.
Hope this helps! :)
$reg = '#_\d+$#';
$replace = '';
echo preg_replace($reg, $replace, $string);
This would give:
abc_def_ghi_123 > abc_def_ghi
abc_def_1 > abc_def
abc_def_ghi > abc_def_ghi
abc_def_ > abc_def_
abc_123_def > abc_123_def
To handle a case like abc_def_123_345 > abc_def, one could change the line to
$reg = '#(?:_\d+)+$#';

Remove text after link

So I have a #mentions function on my site, where users type the mention themselves, but they can do something like:
#foo Hello This is some mention text included.
I would like to remove just the text (everything after #foo). The content comes through streamitem_content:
$json['streamitem_content_usertagged'] =
preg_replace('/(^|\s)#(\w+)/', '\1#$1',
$json['streamitem_content']);
Give this a try
$json['streamitem_content'] = '#foo Hello This is some mention text included.';
$json['streamitem_content_usertagged'] =
preg_replace('/#(\w+)/', '#$1',
$json['streamitem_content']);
echo $json['streamitem_content_usertagged'];
Output:
#foo Hello This is some mention text included.
preg_replace() only replaces what it matches, so you don't need to match content you aren't interested in. If you do want to capture multiple parts of a string, note that capture groups are numbered in order, one per pair of (). So this
preg_replace('/(^|\s)#(\w+)/', '\1#$1',
$json['streamitem_content']);
would actually be
preg_replace('/(^|\s)#(\w+)/', '$1#$2',
$json['streamitem_content']);
Update:
$json['streamitem_content'] = '#foo Hello This is some mention text included.';
$json['streamitem_content_usertagged'] =
preg_replace('/#(\w+).*$/', '#$1',
$json['streamitem_content']);
echo $json['streamitem_content_usertagged'];
Output:
#foo
If the content you want to remove after #foo can extend over multiple lines, use the s modifier.
Regex101 Demo: https://regex101.com/r/tX1rO0/1
In short, the regex says: find a #, then capture all consecutive a-zA-Z0-9_ characters. Whatever follows those characters, up to the end of the string, is matched and discarded.
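To see what the s modifier changes, here is a short sketch using a made-up two-line value for the content (the line break is an assumption, not from the question):

```php
<?php
// Assumed sample: the mention text continues on a second line.
$content = "#foo Hello\nThis is some mention text included.";

// Without s, "." stops at the newline and "$" cannot match there,
// so the pattern does not match and nothing is replaced.
$singleLine = preg_replace('/#(\w+).*$/', '#$1', $content);

// With s, "." also matches newlines, so everything after the tag is dropped.
$dotAll = preg_replace('/#(\w+).*$/s', '#$1', $content);

var_dump($singleLine === $content); // bool(true)
var_dump($dotAll);                  // string(4) "#foo"
```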
You can use this:
preg_replace('/^\s*#(\w+)/', '#$1',
$json['streamitem_content']);
This removes the leading white space, and includes the # in the hyperlink's text (not the link argument).
If you need to keep the leading white space intact:
preg_replace('/^(\s*)#(\w+)/', '$1#$2',
$json['streamitem_content']);
You could use explode() and str_replace(). They might have a speed advantage over preg_replace().
Assuming the line is available as a variable (e.g. $mention):
$mention = $json['streamitem_content'];
$mention_parts = explode(" ", $mention);
$the_part_you_want = str_replace('#','', $mention_parts[0]);
// or you could use $the_part_you_want = ltrim($mention_parts[0], '#');
$json['streamitem_content_usertagged'] = '#' . $the_part_you_want;
or use trim($mention_parts[0]); to remove any whitespace if it is unwanted.
You could use fewer variables and reuse $mention as array but this seemed a clearer way to illustrate the principle.

PHP str_replace can't give me the output I want

I want to replace a string at a particular position. For that I used the PHP str_replace() function, but I can't get the output I want. Here is what I'm after.
$str = "hello 8-7-2015 world -12";
// Here I want to replace - with ' desh ', but only in the date. I detect that by checking the character before the -: if it is a space the replacement should be 'minus', otherwise 'desh'.
$key = strpos($str, "-");
if($key !== false){
$a = substr($str, $key-1 , 1);
if($a != " "){
$str = str_replace("-","desh",$str);
}else{
$str = str_replace("-","minus",$str);
}
}
I get output like: hello 8 desh 7 desh 2015 world desh 12. Every - has become desh, but the last one should be minus 12. The other values are okay and should not be changed.
In other words, the replacement should depend on the position.
Your code (with an if) doesn't loop over the string looking for all occurrences, so that should have raised an alert flag with you when all the occurrences were changed.
What it does is to find the first occurrence, which isn't preceded by a space, then it executes:
str_replace("-","desh",$str);
which replaces all occurrences within the string. In order to do what you want, all you need is:
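$str = str_replace(" -"," minus",$str);
$str = str_replace("-","desh",$str);
This will first take care of all - characters preceded by a space, turning them into " minus". Note that str_replace() returns the new string instead of modifying $str, so the result has to be assigned back.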
The second line will then take care of all the remaining - characters, replacing them with "desh".
Just as an aside, if you're doing this to be able to "speak" the words (in the sense of a text-to-speech (TTS) program), you probably want spaces on either sides of the words you're adding. You can achieve that with a very small modification:
$str = str_replace(" -"," minus ",$str);
$str = str_replace("-"," desh ",$str);
That may make it easier for your TTS code to handle the words.
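Put together as something runnable, the two-step replacement might look like the sketch below. The function name speakDashes is made up for illustration, and it assumes that a - preceded by a space is always a minus sign:

```php
<?php
// Hypothetical helper wrapping the two ordered replacements.
function speakDashes(string $text): string
{
    // Step 1: a "-" right after a space is treated as a minus sign.
    $text = str_replace(" -", " minus ", $text);
    // Step 2: every "-" that is left sits inside a date.
    return str_replace("-", " desh ", $text);
}

echo speakDashes("hello 8-7-2015 world -12"), "\n";
// hello 8 desh 7 desh 2015 world minus 12
```

The order matters: running step 2 first would turn every - into desh and leave nothing for step 1 to find.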
There's no point in your condition, since str_replace() acts on the whole string regardless of your $key variable.
$str = str_replace(" -"," minus ",$str);
$str = str_replace("-"," desh ",$str);
The truth is that you don't even need the condition. Simply run the first str_replace(), whose search term has a leading space, and then the second, which doesn't (the order is important).
You can use regex:
$str = preg_replace(
['/(\d{1,2})-(\d{1,2})-(\d{2,4})/','/-(\d+)/'],
['$1 desh $2 desh $3', 'minus $1'],
$str);
Check this.
First you have to extract the date from the string, then change its format as you want, and after that join it back together with the other parts.
// Assumes the date is the second word and the negative number is the fourth.
$parts = explode(' ', $str);
$parts[1] = str_replace("-", " desh ", $parts[1]);
$parts[3] = str_replace("-", "minus ", $parts[3]);
$str = implode(' ', $parts);

Regex to trim text between tags

I expected this to be a simple regex but I guess my head isn't screwed on this morning!
I'm taking the source code of a page and tidying it up with a bunch of other preg_replaces, so by the time we get to the regex below, the result is already a single line string with things like comments stripped out, etc.
All I'm looking to do now is trim the text between > and < characters to remove extra whitespace. I.e.
<p> hello world </p>
should become
<p>hello world</p>
I figured this would do the trick, but it seems to do nothing?
$data = trim(preg_replace('/>(\s*)([^\s]*?)(\s*)</', '>$2<', $data));
Cheers.
Here's a ridiculous way to do it lol:
$str = "<p> hello world </p>";
$strArr = explode(" ", $str);
$strArr = array_filter($strArr);
var_dump(implode(" ",$strArr));
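Use the power of arrays to remove the white spaces lol. Note that this only collapses runs of spaces into one; the single spaces next to the tags stay where they are.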
You can use the /e modifier to call trim() while replacing. Note that /e was deprecated in PHP 5.5 and removed in PHP 7.0, where preg_replace_callback() takes its place.
$data = preg_replace('/>([^<]*)</e', '">" . trim("$1") . "<"', $data);
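On PHP versions without the /e modifier, the same idea can be sketched with preg_replace_callback(). The function name trimBetweenTags is made up for illustration, and the pattern is the one from the line above:

```php
<?php
// Hypothetical helper: trim whatever sits between a ">" and the next "<".
function trimBetweenTags(string $html): string
{
    return preg_replace_callback(
        '/>([^<]*)</',
        function (array $m): string {
            // $m[1] is the text between the two angle brackets.
            return '>' . trim($m[1]) . '<';
        },
        $html
    );
}

echo trimBetweenTags("<p> hello world </p>"), "\n"; // <p>hello world</p>
```

Like any regex over HTML, this is only safe on simple, already-tidied markup such as the single-line string described in the question.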
A regex could be:
>\s+(.*[^\s])\s+<
but don't use it; there are better ways to reach that goal (for example, HTML Tidy).
You may use this snippet of code.
$x = '<p> hello world </p>';
$foo = preg_replace('/>\s+/', '>', $x); //first remove space after ">" symbol
$foo = htmlentities(preg_replace('/\s+</', '<', $foo)); //now remove space before "<" symbol
echo $foo;

How could I combine these regex rules?

I'm detecting #replies in a Twitter stream with the following PHP code using regexes.
$text = preg_replace('!^#([A-Za-z0-9_]+)!', '#$1', $text);
$text = preg_replace('! #([A-Za-z0-9_]+)!', ' #$1', $text);
How can I best combine these two rules without false flagging email#domain.com as a reply?
OK, on second thought: to avoid flagging whatever#email, the character before the # has to be a non-word character (or the start of the string), because a word character in front of the # could be part of an email address. That leads to:
!(^|\W)#([A-Za-z0-9_]+)!
but then you have to use $2 instead of $1.
Since the ^ does not have to stand at the beginning of the RE, you can use grouping and | to combine those REs.
If you don't want to re-insert the whitespace you captured, you have to use a "positive lookbehind":
$text = preg_replace('/(?<=^|\s)#(\w+)/',
'#$1', $text);
or "negative lookbehind":
$text = preg_replace('/(?<!\S)#(\w+)/',
'#$1', $text);
...whichever you find easier to understand.
Here's how I'd do the combination
$text = preg_replace('!(^| )#([A-Za-z0-9_]+)!', '$1#$2', $text);
$text = preg_replace('/(^|\W)#(\w+)/', '$1#$2', $text);
preg_replace('%(?<!\S)#([A-Za-z0-9_]+)%', '#$1', $text);
(?<!\S) is loosely translated to "no preceding non-whitespace character". Sort of a double-negation, but also works at the start of the string/line.
This won't consume any preceding character, won't use any capturing group, and won't match strings such as "foo-#host.com", which is a valid e-mail address.
Tested:
Input = 'foo bar baz-#qux.com bee #def goo#doo #woo'
Output = 'foo bar baz-#qux.com bee #def goo#doo #woo'
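Since the link markup is not visible in the test above, here is a sketch that makes the matches visible. The square brackets are a stand-in chosen purely for illustration, and markReplies is a made-up name:

```php
<?php
// Hypothetical helper: wrap each detected reply in [ ] so matches are easy to see.
function markReplies(string $text): string
{
    // (?<!\S) succeeds at the start of the string or right after whitespace.
    return preg_replace('/(?<!\S)#(\w+)/', '[#$1]', $text);
}

echo markReplies('foo bar baz-#qux.com bee #def goo#doo #woo'), "\n";
// foo bar baz-#qux.com bee [#def] goo#doo [#woo]
```

Only #def and #woo are preceded by whitespace, so baz-#qux.com and goo#doo are left alone.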
Hu, guys, don't push too far... Here it is :
!^\s*#([A-Za-z0-9_]+)!
I think you can use alternation: look for the beginning of the string or a whitespace character.
'!(?:^|\s)#([A-Za-z0-9_]+)!'
