Limit the number of line breaks in text in PHP

Assume a string is like
Section 1
Section 2 (after 1 line break)
Section 3 (after 2 line breaks)
Section 4 (after 4 line breaks)
Section 5 (after 1 line break)
My intention is to allow only N line breaks and replace the remaining ones with a space in PHP. For example, if N=3 then the text above would be output like this:
Section 1
Section 2 (after 1 line break)
Section 3 (after 2 line breaks) Section 4 (after 4 line breaks) Section 5 (after 1 line break)
My code is below but I am looking for a better way:
function limitedBreaks($str = '', $n = 5)
{
$str = nl2br($str);
$chars = str_split($str);
$counter = 0;
foreach ($chars as $key => $char) {
if ($char == "<br/>")
if ($counter > $n) {
$chars[$key] = ' ';
} else {
$counter += 1;
}
}
return implode($chars, ' ');
}

This really is a job better suited for regex than exploding and iterating.
$str = preg_replace('~\v+~', "\n\n", $str);
Here \v+ matches one or more vertical whitespace characters, and each such run is replaced with two \n newlines. You will get a standard one-line gap between your content (non-linebreak-only) lines. This results in:
Section 1
Section 2 (after 1 line break)
Section 3 (after 2 line breaks)
Section 4 (after 4 line breaks)
Section 5 (after 1 line break)
If you only want to target runs of more than N breaks (N=3 here), use e.g. \v{4,}, assuming the EOL is \n, to match every run of 4 or more newlines. If your file uses Windows EOLs (\r\n), be aware that you need to double the count, or use (\r\n|\n){4,}, since \r and \n each count as one match of \v.
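The point about Windows line endings can be checked with a small sketch. It uses PCRE's \R escape (which treats \r\n as a single line break) as an alternative to the (\r\n|\n) group; the variable names are made up for illustration.

```php
<?php
// "\r\n" is two characters, and \v counts each of them separately.
$windows = "a\r\n\r\nb"; // only 2 line breaks, but 4 vertical-space characters
$unix    = "a\n\n\n\nb"; // 4 line breaks

$vMatchesWindows = preg_match('~\v{4,}~', $windows); // 1: a false positive
$rMatchesWindows = preg_match('~\R{4,}~', $windows); // 0: \R sees 2 line breaks
$rMatchesUnix    = preg_match('~\R{4,}~', $unix);    // 1: 4 real line breaks

// Replace runs of 4 or more line breaks with a single space.
$merged = preg_replace('~\R{4,}~', ' ', $unix);
echo $merged, "\n";
```

If the input may mix both EOL styles, \R avoids having to double the count by hand.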
That's the basic idea. Seeing as you want to replace 4+ newlines with horizontal space, merging the lines instead of normalizing line break quantity, you would simply:
$str = preg_replace('~(\r\n|\n){4,}~', " ", $str);
This would give you:
Section 1
Section 2 (after 1 line break)
Section 3 (after 2 line breaks) Section 4 (after 4 line breaks)
Section 5 (after 1 line break)
Here the gap with 4 or more EOLs was substituted with space and merged with the preceding line. The rest of the "acceptably spaced" lines are still in their places.
However it seems that you want to merge all subsequent rows into a single line after any gap of 4+ EOLs. Is that really the requirement? The first example I posted is a fairly standard operation for normalizing content with irregular linebreaks; especially "level items" like section headings!
OP: thanks for explaining your use case, makes sense. This can be regex-ed without loops:
$str = preg_replace_callback('~(\r\n|\n){4,}(?<therest>.+)~s', function($match) {
return ' ' . preg_replace('~\v+~', ' ', $match['therest']);
}, $str);
Here we capture (as named group therest) all the content that follows four or more linebreaks using preg_replace_callback, and inside the callback we preg_replace all vertical spaces in "the rest" with a single horizontal space. This results in:
Section 1
Section 2 (after 1 line break)
Section 3 (after 2 line breaks) Section 4 (after 4 line breaks) Section 5 (after 1 line break) Section 17 after a hundred bundred line breaks
For convenience, here's the regex above wrapped in a function:
function fuse_breaks(string $str): string {
$str = preg_replace_callback('~(\r\n|\n){4,}(?<therest>.+)~s', function($match) {
return ' ' . preg_replace('~\v+~', ' ', $match['therest']);
}, $str);
return $str;
}
// Usage:
$fused_text = fuse_breaks($source_text);
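If the threshold of 4 should be configurable, the same idea can be sketched with the quantifier built from a parameter. The name fuse_breaks_at and the $threshold parameter are hypothetical, not part of the answer above.

```php
<?php
// Sketch: like fuse_breaks(), but the minimum run length is a parameter.
function fuse_breaks_at(string $str, int $threshold = 4): string
{
    $threshold = max(1, $threshold); // guard against {0,} or negative counts
    $pattern = '~(?:\r\n|\n){' . $threshold . ',}(?<therest>.+)~s';

    $result = preg_replace_callback($pattern, function (array $match): string {
        // Everything after the first long gap is flattened onto one line.
        return ' ' . preg_replace('~\v+~', ' ', $match['therest']);
    }, $str);

    return $result ?? $str; // preg_replace_callback returns null on a regex error
}

$text = "S1\nS2\n\nS3\n\n\n\nS4\nS5";
echo fuse_breaks_at($text, 4), "\n";
```

With $threshold = 2 the first blank line already triggers the merge, so everything from "S3" onward ends up on the "S2" line.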

Your example with N=3 shows either 4 line breaks (if the empty lines count) or 2 line breaks.
To make things clearer, here is a function limitedLines, which reduces the text to a maximum number of lines:
$str = "
line 1
line 2
line 3
line 4
line 5
line 6
";
function limitedLines(string $str = '', int $maxLines = 5): string {
$maxLines = $maxLines < 1 ? 1 : $maxLines;
$arr = explode("\n", trim($str));
while (count($arr) > $maxLines) {
$last = array_pop($arr);
$arr[array_key_last($arr)] .= ' ' . trim($last);
}
return implode("\n", $arr);
}
$result = limitedLines($str, 3);
print_r($result);
This will print:
line 1
line 2
line 3 line 4 line 5 line 6
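A loop-free variant is also possible, as a rough sketch: explode() accepts a limit, and with a positive limit the last element holds the remainder of the string. The function name limitedLinesViaExplode is made up here, and its whitespace handling is only similar to limitedLines, not guaranteed identical.

```php
<?php
// Sketch: keep the first $maxLines - 1 lines, merge everything else into the last one.
function limitedLinesViaExplode(string $str, int $maxLines = 5): string
{
    // With a positive limit, the final element contains the rest of the string.
    $parts = explode("\n", trim($str), max(1, $maxLines));

    // Turn the remaining line breaks (and the whitespace around them) into single spaces.
    $last = array_key_last($parts);
    $parts[$last] = preg_replace('~\s*\n\s*~', ' ', $parts[$last]);

    return implode("\n", $parts);
}

$str = "line 1\nline 2\nline 3\nline 4\nline 5\nline 6";
echo limitedLinesViaExplode($str, 3), "\n";
```

Like the original, this relies on array_key_last(), so it assumes PHP 7.3 or newer.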

Related

PHP str_replace with an offset

I have the following output:
Item
Length : 130
Depth : 25
Total Area (sq cm): 3250
Wood Finish: Beech
Etc: etc
I want to remove the Total Area (sq cm): and the 4 digits after it from the string, currently I am trying to use str_replace like so:
$tidy_str = str_replace( $totalarea, "", $tidy_str);
Is this the correct function to use and if so how can I include the 4 random digits after this text? Please also note that this is not a set output so the string will change position within this.
You can practice php regex at http://www.phpliveregex.com/
<?php
$str = '
Item
Length : 130
Depth : 25
Total Area (sq cm): 3250
Wood Finish: Beech
Etc: etc
';
echo preg_replace("/Total Area \(sq cm\): [0-9]*\\n/", "", $str);
Item
Length : 130
Depth : 25
Wood Finish: Beech
Etc: etc
This will do it.
$exp = '/Total Area \(sq cm\): \d+/';
echo preg_replace($exp, '', $tidy_str);
Try with this:
$tidy_str = preg_replace('/(Total Area \(sq cm\): )([0-9\.,]*)/' , '', $tidy_str);
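The regex approaches above can be combined into one sketch that removes the whole line, including its line break, wherever it appears in the string. The function name removeTotalAreaLine is hypothetical, and the sample text mirrors the question.

```php
<?php
// Sketch: drop the "Total Area (sq cm): NNNN" line, wherever it sits in the text.
function removeTotalAreaLine(string $text): string
{
    // ^ with the m flag anchors at the start of any line;
    // \R? also consumes the line break after the digits, if there is one.
    $result = preg_replace('/^Total Area \(sq cm\):[ \t]*\d+[ \t]*\R?/m', '', $text);

    return $result ?? $text; // preg_replace returns null on a regex error
}

$tidy_str = "Item\nLength : 130\nTotal Area (sq cm): 3250\nWood Finish: Beech\nEtc: etc";
echo removeTotalAreaLine($tidy_str), "\n";
```

Because \d+ matches any number of digits, this does not depend on the area always having exactly four.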
You are looking for substr_replace:
$strToSearch = "Total Area (sq cm):";
$totalAreaIndex = strpos($tidy_str, $strToSearch);
echo substr_replace($tidy_str, '', $totalAreaIndex, strlen($strToSearch) + 5); // 5 = space plus 4 numbers.
If you want to remove the newline too, check whether the line ends with \n or \r\n: add one to the length for \n (strlen($strToSearch) + 6) or two for \r\n (strlen($strToSearch) + 7).
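The offset arithmetic above can be made less fragile. The following is only a sketch: it measures the run of digits with strspn() instead of assuming four, and it returns the text unchanged when the label is missing. The function name removeLabelledNumber is made up for illustration.

```php
<?php
// Sketch: remove "<label> <digits><EOL>" using string functions only.
function removeLabelledNumber(string $text, string $label): string
{
    $start = strpos($text, $label);
    if ($start === false) {
        return $text; // label not present: nothing to remove
    }

    $end = $start + strlen($label);
    $end += strspn($text, " \t", $end);        // spaces or tabs after the label
    $end += strspn($text, '0123456789', $end); // however many digits follow

    // Swallow the line break as well: 2 characters for \r\n, 1 for \n.
    if (substr($text, $end, 2) === "\r\n") {
        $end += 2;
    } elseif (substr($text, $end, 1) === "\n") {
        $end += 1;
    }

    return substr_replace($text, '', $start, $end - $start);
}

$tidy_str = "Depth : 25\nTotal Area (sq cm): 3250\nWood Finish: Beech";
echo removeLabelledNumber($tidy_str, 'Total Area (sq cm):'), "\n";
```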

PHP Remove Unused Line Breaks Using nl2br

How to remove unused line breaks using nl2br function for this example:
Hello
Nice
Expect output display:
Hello
Nice
and another example:
remove this unused line
remove this unused line
remove this unused line
Hello
remove this unused line
remove this unused line
remove this unused line
remove this unused line
Expect output display:
Hello
In other words, if there are more than 3 consecutive line breaks, they should be collapsed into a single line break.
Here is my PHP code:
nl2br($string);
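Old school, but I hope this works:
$lines = explode("\n", $str);
$result = [];
foreach ($lines as $val) {
// trim() also catches lines that only contain spaces or a leftover "\r"
if (trim($val) !== '') {
$result[] = rtrim($val, "\r");
}
}
$final = implode("\n", $result); //if you want line break use "<br>"
echo $final;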
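The same filtering can be sketched more compactly with preg_split() and array_filter(), followed by the nl2br() call from the question. The function name removeBlankLines is hypothetical, and the arrow function assumes PHP 7.4 or newer.

```php
<?php
// Sketch: drop blank lines first, then let nl2br() add the <br /> tags.
function removeBlankLines(string $text): string
{
    $lines = preg_split('/\R/', $text); // \R matches \r\n, \n or \r
    $kept  = array_filter($lines, fn(string $line): bool => trim($line) !== '');

    return implode("\n", $kept);
}

$string = "\n\n\nHello\n\n\n\nNice\n\n";
echo nl2br(removeBlankLines($string)), "\n";
```

Note that this removes every blank line, which matches the expected output in the question; keeping one blank line for long gaps would need a different rule.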

Regex to insert line breaks after specific series of token

I am trying to convert a multi-line string, to insert line breaks \n after two final closing parenthesis occur in series, ie: )) becomes ))\n.
There is also likely to be a ')' just prior to the '))', effectively creating ')))'.
These two or three parenthesis may or may not be "spread out" by indeterminate lengths of whitespace, eg )), ) ), ))), ) ) ), )) ), ) )) and so on.
I've tried the following:
//Example message
$message = '(item (name 286) (Index 31) (Image "item001") (class money coin) (code 4 110 0) (country 2) (plural 1) (buy 0))
(item(name 7904)(Index 7904) (specialty (Dex 10(defense 55)(hp 3500)(dodge 71) ))
(item(name 7905)(Index 7905)(country 2)
(level 80)(specialty(hp 3400) ) )
(item(name 7906)(Index 7906)(level 80) (specialty(Str 10)) ) ';
// Converts all lines into one line
$message = preg_replace("/[\r\n]*/","",$message);
// Replace '))' with '))\n' - doesn't work.
$message = preg_replace("/[)s+)]s*/","\n",$message);
$InititemLines = explode("\n", $message);
for ($line = 0; $line < count($InititemLines); $line++) {
echo "Line #<b>{$line}</b> : " . $InititemLines[$line] . "<br />\n";
}
To convert all lines into one, I used:
$message = preg_replace("/[\r\n]*/","",$message);
Then, to replace )) with ))\n, I tried the following (but it doesn't work):
$message = preg_replace("/[)s+)]s*/","))\n",$message);
I want the output to be like this:
Line #0: (item (name 286) (Index 31) (Image "item001") (class money coin) (code 4 11 0 0) (country 2) (plural 1) (buy 0))
Line #1: (item(name 7904)(Index 7904) (specialty (Dex 10)(defense 55)(hp 3500)(dodge 71) ))
Line #2: (item(name 7905)(Index 7905)(country 2)(level 80)(specialty(hp 3400) ) )
Line #3: (item(name 7906)(Index 7906)(level 80) (specialty(Str 10)) )
This will insert a line break after the closing "))" or ")))" that ends each item:
$message = preg_replace('/(\)?\s*\)\s*\))/', "$1\n", $message);
This regex means:
find an optional single closing parenthesis ')'. We escape it as '\)' because the parenthesis has a special meaning in regex. It's optional because of the trailing '?'.
followed by 0 or more whitespace characters, denoted as '\s'; "0 or more" is denoted by '*',
followed by another ')',
followed by another 0 or more whitespace characters,
followed by another ')'.
Then we surround the whole pattern with a pair of '(' and ')', meaning "group this section, so we can reference it later". We do this so the replacement "$1\n" puts back exactly what was matched, followed by a line break. The '/' at each end are the pattern delimiters that preg_replace requires.
A tidier result might be (depending on your requirements) to subsequently also strip any remaining spaces from before every ')':
$message = preg_replace('/[ \t]+\)/', ')', $message);
This regex means:
find one or more spaces or tabs,
followed by a closing ')'.
The whole match is replaced with a bare ')'.
(In your example, I believe this will strip all the excess whitespace, while leaving the spaces in your strings alone.)
Thank you, I got it working with:
$message = preg_replace("/(\s*\)\s*\)?\s*\))/", "$1\n", $message );
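Putting the steps from this thread together, a minimal end-to-end sketch could look like the following. It uses the final pattern quoted above and assumes that "))" (optionally spread out by whitespace) only occurs at the end of an item; the function name splitItems and the shortened sample data are made up.

```php
<?php
// Sketch: flatten the message, break after each closing "))", return one item per line.
function splitItems(string $message): array
{
    // 1. Join everything into a single line.
    $flat = preg_replace('/[\r\n]+/', '', $message);

    // 2. Re-insert a line break after every "))", ") )", ")))", ") ) )", ...
    $withBreaks = preg_replace('/(\s*\)\s*\)?\s*\))/', '$1' . "\n", $flat);

    // 3. Split on those breaks and tidy the edges of each line.
    return array_map('trim', explode("\n", trim($withBreaks)));
}

$message = "(item (name 286) (buy 0))\n"
    . "(item(name 7905)(country 2)\n"
    . "(level 80)(specialty(hp 3400) ) )\n"
    . "(item(name 7906) (specialty(Str 10)) ) ";

foreach (splitItems($message) as $line => $item) {
    echo "Line #{$line} : {$item}\n";
}
```

Data with "))" nested in the middle of an item would be split too early, so a real parser would be needed in that case.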

In database cell find keyword to pull information out of 'quotes' keeping line breaks to put into separate <li> tags

I am using php to pull information out of a database, but I have come to a stumbling block with this issue:
I have an extra column, where the cells contain a few different values laid out like keyword='lots of data'. This is an example of a cell's content:
eTestOne='This would be one sentence.
This would be another.'
eTestTwo='1 Test
1 Other Test
2 Other Tests'
eTestThree='This would be another entry'
So basically I need to find the keyword eTestTwo and pull out the info from the 'quotes' keeping the line breaks to put into separate <li> tags.
I have got sort of halfway I think. This is the php I have so far:
$pos = stripos($info['extra'], "eTestTwo='");
echo substr($info['extra'], $pos + 9 );
But this doesn't strip off the extra information after the closing quote mark and doesn't help me distinguish between each line to put into their own <li>:
1 Test
1 Other Test
2 Other Tests'
eTestThree='This would be another entry'
The final output I am trying to achieve is:
<li>1 Test</li>
<li>1 Other Test</li>
<li>2 Other Tests</li>
EDIT : Just to clarify, the cell can contain more or less values in no particular order. So the solution really needs to rely on finding the keyword eTestTwo and somehow grabbing the info from between its quotes.
Edit 2:
$parts = explode("eTestTwo" , $info['extra']);
$temp = explode("'" ,$parts[1]);
/*
temp[0] = "="
temp[1] = "1 Test
1 Other Test
2 Other Tests"
temp[2] = "...."
*/
$lines = preg_split('/\R/', trim($temp[1])); // \R matches \r\n, \n or \r
unset($temp);
foreach ($lines as $line) {
echo "<li>" . $line . "</li>";
}
Old answer:
$parts = explode("=" , $info['extra']);
/*
parts[0] = eTestOne
parts[1] = 'This would be one sentence.
This would be another.'
parts[2] = eTestTwo
parts[3] = '1 Test
1 Other Test
2 Other Tests'
...
*/
$lines = explode("\n" , $parts[3]);
foreach($lines as $line)
echo "<li>".$line."</li>";
//It will work only if there are no '=' chars in the quotes.
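As an alternative sketch, a single regex can grab the text between the keyword's quotes regardless of where the keyword sits in the cell. It assumes the quoted values contain no apostrophes; the function name keywordToListItems is made up for illustration.

```php
<?php
// Sketch: find keyword='...' and wrap each line of the value in <li> tags.
function keywordToListItems(string $cell, string $keyword): string
{
    // preg_quote() keeps any special characters in the keyword literal.
    $pattern = '/' . preg_quote($keyword, '/') . "='([^']*)'/";
    if (!preg_match($pattern, $cell, $m)) {
        return ''; // keyword not found in this cell
    }

    $items = [];
    foreach (preg_split('/\R/', trim($m[1])) as $line) {
        $items[] = '<li>' . htmlspecialchars(trim($line)) . '</li>';
    }

    return implode("\n", $items);
}

$extra = "eTestOne='This would be one sentence.\nThis would be another.'\n"
    . "eTestTwo='1 Test\n1 Other Test\n2 Other Tests'\n"
    . "eTestThree='This would be another entry'";

echo keywordToListItems($extra, 'eTestTwo'), "\n";
```

Unlike the explode("=") approach, this is not affected by '=' characters inside the quoted text.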

Convert Single Line Comments To Block Comments

I need to convert single line comments (//...) to block comments (/*...*/). I have nearly accomplished this in the following code; however, I need the function to skip any single line comment that is already inside a block comment. Currently it matches any single line comment, even when the single line comment is in a block comment.
## Convert Single Line Comment to Block Comments
function singleLineComments( &$output ) {
$output = preg_replace_callback('#//(.*)#m',
function ($match) {
return "/* " . trim($match[1]) . " */";
}, $output
);
}
As already mentioned, "//..." can occur inside block comments and string literals. So if you create a small "parser" with the aid of a bit of regex trickery, you could first match either of those things (string literals or block comments), and after that, test if "//..." is present.
Here's a small demo:
$code ='A
B
// okay!
/*
C
D
// ignore me E F G
H
*/
I
// yes!
K
L = "foo // bar // string";
done // one more!';
$regex = '@
("(?:\\\\.|[^\r\n\\\\"])*+") # group 1: matches double quoted string literals
|
(/\*[\s\S]*?\*/) # group 2: matches multi-line comment blocks
|
(//[^\r\n]*+) # group 3: matches single line comments
@x';
preg_match_all($regex, $code, $matches, PREG_SET_ORDER | PREG_OFFSET_CAPTURE);
foreach($matches as $m) {
if(isset($m[3])) {
echo "replace the string '{$m[3][0]}' starting at offset: {$m[3][1]}\n";
}
}
Which produces the following output:
replace the string '// okay!' starting at offset: 6
replace the string '// yes!' starting at offset: 56
replace the string '// one more!' starting at offset: 102
Of course, there are more string literals possible in PHP, but you get my drift, I presume.
HTH.
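To go from locating the comments to actually converting them, the same three alternatives can be fed to preg_replace_callback. This is only a sketch under the demo's assumptions (double-quoted strings only, and no "*/" inside a single line comment); the function name convertLineComments is hypothetical.

```php
<?php
// Sketch: rewrite "// ..." as "/* ... */", skipping strings and existing block comments.
function convertLineComments(string $code): string
{
    // Group 1: double-quoted string, group 2: block comment, group 3: line comment.
    $regex = '~("(?:\\\\.|[^\r\n\\\\"])*+")|(/\*[\s\S]*?\*/)|(//[^\r\n]*+)~';

    $result = preg_replace_callback($regex, function (array $m): string {
        if (($m[3] ?? '') !== '') {
            // Only group 3 is rewritten: strip the leading "//" and wrap the rest.
            return '/* ' . trim(substr($m[3], 2)) . ' */';
        }
        return $m[0]; // strings and block comments pass through untouched
    }, $code);

    return $result ?? $code; // null would indicate a regex error
}

$code = "a // one\n/* b // two */\n\$s = \"x // y\"; // three";
echo convertLineComments($code), "\n";
```

Matching strings and block comments first is what keeps the "//" inside them from being treated as comments.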
You could try a negative look behind: http://www.regular-expressions.info/lookaround.html
## Convert Single Line Comment to Block Comments
function sinlgeLineComments( &$output ) {
$output = preg_replace_callback('#^((?:(?!/\*).)*?)//(.*)#m',
function ($match) {
// $match[1] is the code before the comment, $match[2] the comment text
return $match[1] . "/* " . trim($match[2]) . " */";
}, $output
);
}
However, I worry about strings that contain //. Something like:
$x = "some string // with slashes";
Would get converted.
If your source file is PHP, you could use tokenizer to parse the file with better precision.
http://php.net/manual/en/tokenizer.examples.php
Edit:
Forgot about the fixed length, which you can overcome by nesting the expression. The above should work now. I tested it with:
$foo = "// this is foo";
sinlgeLineComments($foo);
echo $foo . "\n";
$foo2 = "/* something // this is foo2 */";
sinlgeLineComments($foo2);
echo $foo2 . "\n";
$foo3 = "the quick brown fox";
sinlgeLineComments($foo3);
echo $foo3 . "\n";
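The tokenizer suggestion mentioned above can be sketched as follows, assuming the input is PHP source that starts with an opening tag and that the tokenizer extension is available. The function name convertPhpLineComments is made up; token_get_all() and T_COMMENT are the standard tokenizer API.

```php
<?php
// Sketch: let PHP's own tokenizer decide what is a comment and what is a string.
function convertPhpLineComments(string $source): string
{
    $out = '';
    foreach (token_get_all($source) as $token) {
        if (is_string($token)) {
            $out .= $token; // single-character tokens such as ";" or "="
            continue;
        }

        [$id, $text] = $token;
        if ($id === T_COMMENT && strncmp($text, '//', 2) === 0) {
            // Older PHP versions include the trailing newline in the token; keep it.
            $body = rtrim($text, "\r\n");
            $eol  = substr($text, strlen($body));
            $text = '/* ' . trim(substr($body, 2)) . ' */' . $eol;
        }
        $out .= $text;
    }

    return $out;
}

$source = "<?php\n\$x = \"a // b\"; // note\n/* keep // this */\n";
echo convertPhpLineComments($source);
```

Strings and block comments arrive as their own tokens, so the "//" inside them is never touched.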
