Split a long string not using space - php

If I have a sentence like this:
$msg = "hello how are you?are you fine?thanks..";
and I wish to separate it into 3 parts (or whatever number).
So I'm doing this:
$msglen = strlen($msg);
$separate = (int) ($msglen / 3); // length of each part
$a = 0;
for ($i = 0; $i < 3; $i++)
{
    // take the next piece without overwriting the piece length
    $part = substr($msg, $a, $separate);
    $a = $a + $separate;
}
So the output should be..
hello how are
[a space here->] you?are you [<-a space here]
fine?thanks..
So is it possible to split in the middle of a word instead of leaving a space at the start or end of a separated piece?
Such as "thank you" -> "than" and "k you" instead of "thank" " you ".
I'm writing a conversion function, and a space at the start or end of a piece affects the conversion. The space itself is needed for the conversion, so I can't ignore or delete it.
Thanks.

I take it you can't use trim because the message formed by the joined up strings must be unchanged?
That could get complicated. You could make something that tests for a space after the split, and if a space is detected, makes the split one character earlier. Fairly easy, but what if you have two spaces together? Or a single lettered word? You can of course recursively test this way, but then you may end up with split strings of lengths that are very different from each other.
You need to properly define the constraints you want this to function within.
Please state exactly what you want to do - do you want each section to be equal? Is the splitting in between words of a higher priority than this, so that the lengths do not matter much?
EDIT:
Then, if you aren't worried about the length, you could do something like this (starting with Erik's code and then adjusting the lengths by moving characters around the spaces):
$msg = "hello how are you?are you fine?thanks..";
$parts = split_without_spaces ($msg, 3);
function split_without_spaces ($msg, $parts) {
    $parts = str_split(trim($msg), (int) ceil(strlen($msg)/$parts));
    /* Used trim above to make sure that there are no spaces at the start
       and end of the message, we can't do anything about those spaces */
    // Looping to (count($parts) - 1) because the last part will not need manipulation
    for ($i = 0; $i < (count($parts) - 1) ; $i++ ) {
        $k = $i + 1;
        // Checking the last character of the split part and the first of the next part for a space
        if (substr($parts[$i], -1) == ' ' || $parts[$k][0] == ' ') {
            // If we move characters from the first part to the next:
            $num1 = 1;
            $len1 = strlen($parts[$i]);
            // Searching for the last two consecutive non-space characters
            while ($parts[$i][$len1 - $num1] == ' ' || $parts[$i][$len1 - $num1 - 1] == ' ') {
                $num1++;
                if ($len1 - $num1 - 2 < 0) return false;
            }
            // If we move characters from the next part to the first:
            $num2 = 1;
            $len2 = strlen($parts[$k]);
            // Searching for the first two consecutive non-space characters
            while ($parts[$k][$num2 - 1] == ' ' || $parts[$k][$num2] == ' ') {
                $num2++;
                if ($num2 >= $len2 - 1) return false;
            }
            // Compare to see what we can do to move the lowest no of characters
            if ($num1 > $num2) {
                $parts[$i] .= substr($parts[$k], 0, $num2);
                $parts[$k] = substr($parts[$k], -1 * ($len2 - $num2));
            }
            else {
                $parts[$k] = substr($parts[$i], -1 * ($num1)) . $parts[$k];
                $parts[$i] = substr($parts[$i], 0, $len1 - $num1);
            }
        }
    }
    return ($parts);
}
This takes care of multiple spaces and single-letter words; however, if they exist, the lengths of the parts may be very uneven. It could get messed up in extreme cases: if you have a string made up mainly of spaces, it could return one part as empty, or return false if it can't manage the split at all. Please test it thoroughly.
EDIT2:
By the way, it'd be far better for you to change your approach in some way :) I seriously doubt you'd actually have to use a function like this in practice. Well.. I hope you do actually have a solid reason to, it was somewhat fun coming up with it.

If you simply want to eliminate leading and trailing spaces, consider calling trim on each result of your split.
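As a minimal sketch of that suggestion, assuming a fixed piece length of 13 bytes (the sample message is 39 bytes long) and assuming the dropped spaces do not matter, which the asker says is not the case for them:

```php
<?php
$msg = "hello how are you?are you fine?thanks..";

// Split into 13-byte pieces, then strip leading/trailing whitespace from each.
$parts = array_map('trim', str_split($msg, 13));

print_r($parts);
```

Note that the trimmed pieces no longer join back into the original message, because the spaces at the cut points are gone.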

If you want to split the string into exact thirds it is not known where the cut will be, maybe in a word, maybe between words.
Your code can be simplified to:
$msg = "hello how are you?are you fine?thanks..";
$parts = str_split($msg, ceil(strlen($msg)/3));
Note that ceil() is needed, otherwise you might get 4 elements out because of rounding.
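To illustrate the rounding issue, here is a small sketch with a made-up 40-byte string (40 is not divisible by 3). The variable names are only for this example:

```php
<?php
$msg = str_repeat('x', 40); // 40 bytes, not divisible by 3

// Rounding the chunk length down gives 13, which leaves a 1-byte leftover piece.
$roundedDown = str_split($msg, intdiv(strlen($msg), 3));

// Rounding up gives 14, so the last piece is simply shorter (12 bytes).
$roundedUp = str_split($msg, (int) ceil(strlen($msg) / 3));

echo count($roundedDown), "\n"; // 4
echo count($roundedUp), "\n";   // 3
```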

You're probably looking for str_split, chunk_split or wordwrap.
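A rough comparison of the three functions on the message from the question may help in choosing between them. The width of 13 and the "|" separator are arbitrary choices for this sketch:

```php
<?php
$msg = "hello how are you?are you fine?thanks..";

// str_split(): array of fixed-size pieces, cutting wherever the count falls.
$fixed = str_split($msg, 13);

// chunk_split(): one string, with the separator appended after every 13 bytes
// (including after the last piece).
$joined = chunk_split($msg, 13, "|");

// wordwrap(): breaks at spaces where it can; the fourth argument (true) forces
// a cut inside any word that is longer than the width.
$wrapped = wordwrap($msg, 13, "\n", true);

print_r($fixed);
echo $joined, "\n";
echo $wrapped, "\n";
```

Only str_split() and chunk_split() keep every original character; wordwrap() replaces the space at each break with the break string.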

Related

Split a comma separated string but only split by a comma

Hi I have a long string
0BV,0BW,100,102,108,112,146,163191,192,193,1D94,19339,1A1,1AA,1AE,1AFD,1AG,1AKF.......
I want to show it on a page by breaking it into substrings,
like
0BV,0BW,100,102,108,112,146
163191,192,193,1D94,19339
1A1,1AA,1AE,1AFD,1AG,1AKF
What I want to do is create substrings of 100 characters each, but if the 100th character is not a comma, I want to find the next comma in the string and split there.
I tried array_chunk() to split by item count, but since the resulting substrings have different lengths, it looks uneven on the page:
$db_ocode = $row["option_code"];
$exclude_options_array = explode(",", $row["option_code"]);
$exclude_options_chunk_array = array_chunk($exclude_options_array, 25);
$exclude_options_string = '';
foreach ($exclude_options_chunk_array as $exclude_options_chunk)
{
    $exclude_options_string .= implode(",", $exclude_options_chunk);
    $exclude_options_string .= "<br />"; // "</br>" is not a valid tag
}
Please help. Thanks in advance.
Take the string, set the cutoff position. If that position does not contain a comma then find the first comma after that position and cut off there. Simple
<?php
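$string = "0BV,0BW,100,102,108,112,146,163191,192,193,1D94,19339,1A1,1AA,1AE,1AFD";
$cutoff = 30;
if ($string[$cutoff] != ",") {
    // move the cutoff forward to the next comma, or to the end if none is left
    $next = strpos($string, ",", $cutoff);
    $cutoff = ($next === false) ? strlen($string) : $next;
}
echo substr($string, 0, $cutoff);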
(.{99})(?=,),|([^,]*),
Instead of splitting, you can grab the captures, which is much easier. See the demo for 20 characters.
https://regex101.com/r/sH8aR8/37
Using Hanky Panky's answer I was able to put together a complete solution to my problem. Thank you very much, Hanky Panky. If my code is not efficient, please edit it.
$string = "0BV,0BW,100,102,108,112,146,163191,192,193,1D94,19339,1A1,1AA,1AE,1AFD";
for ($start = 0; $start < strlen($string); ) {
    $cutoff = 30;
    if (isset($string[$start+$cutoff]) && $string[$start+$cutoff] != ",")
    {
        $cutoff = strpos($string, ",", $start+$cutoff);
    }
    else if (($start+$cutoff) >= strlen($string))
    {
        $cutoff = strlen($string);
    }
    else if ($start >= 30)
    {
        $cutoff = $start + $cutoff;
    }
    echo substr($string, $start, $cutoff-$start)."\n";
    $start = $cutoff+1;
}
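The loop above could also be folded into a reusable function. The following is only a sketch: the name chunk_at_commas and its parameters are made up for illustration, and it adds one guard the loop lacks, namely taking the rest of the string when strpos() finds no further comma.

```php
<?php
// Hypothetical helper: split $csv into pieces of at least $minLen bytes,
// always cutting at a comma (the comma itself is dropped).
function chunk_at_commas(string $csv, int $minLen): array
{
    $chunks = [];
    $start  = 0;
    $total  = strlen($csv);

    while ($start < $total) {
        $probe = $start + $minLen;
        // Look for the first comma at or after the probe position.
        // strpos() returns false when none is left; then take the remainder.
        $comma = ($probe < $total) ? strpos($csv, ',', $probe) : false;
        $end   = ($comma === false) ? $total : $comma;

        $chunks[] = substr($csv, $start, $end - $start);
        $start    = $end + 1; // skip past the comma
    }

    return $chunks;
}

$string = "0BV,0BW,100,102,108,112,146,163191,192,193,1D94,19339,1A1,1AA,1AE,1AFD";
echo implode("\n", chunk_at_commas($string, 30)), "\n";
```

With a minimum length of 30 this produces the same three lines as the loop: the first ends at 163191, the second at 1AE, and the last is 1AFD.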
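In case you want it in Python (3):
ln = 0
i = 1
s = '0BVAa,0BW,100,102,108,112,146,163191,192,193,1D94,19339,1A1,1AA,1AE,1AFD,1AG,1AKF'
for item in s:
    print(item, end='')
    ln = ln + len(item)
    # start a new line at the first comma after each multiple of 10 characters
    if ln // 10 >= i and item == ',':
        print()
        i = i + 1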

preg_match admits only two consecutive lowercases

I want to check if password contains:
minimum 2 lower cases
minimum 1 upper case
minimum 2 selected special characters
The problem is that when I verify this, it accepts two lowercase letters only if they are consecutive, like this: paSWORD.
If I enter pASWORd, it returns an error.
This is the code:
preg_match("/^(?=.*[a-z]{2})(?=.*[A-Z])(?=.*[_|!|@|#|$|%|^|&|*]{2}).+$/", $password)
I don't see where the problem is and how to fix it.
You're looking for [a-z]{2} in your regex. That is two consecutive lowercases!
I will go out on a limb and suggest that it is probably better to individually check each of your three conditions in separate regexes rather than trying to be clever and do it in one.
I've added some extra parentheses, which may get your original idea to work for non-consecutive lowercase letters and special characters, but I think the expression is overly complex.
preg_match("/^(?=(.*[a-z]){2})(?=.*[A-Z])(?=(.*[_!@#$%^&*]){2}).+$/", $password)
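The suggestion to check each condition separately could look roughly like the sketch below. The function name password_meets_rules is hypothetical, and the special-character set is an assumption (taken to be _ ! @ # $ % ^ & *). It counts matches with preg_match_all() instead of using lookaheads.

```php
<?php
// Hypothetical helper: one simple character class per rule.
// The special-character set below is an assumption about the intended rules.
function password_meets_rules(string $password): bool
{
    // preg_match_all() returns the number of matches found.
    $lower   = preg_match_all('/[a-z]/', $password);
    $upper   = preg_match_all('/[A-Z]/', $password);
    $special = preg_match_all('/[_!@#$%^&*]/', $password);

    return $lower >= 2 && $upper >= 1 && $special >= 2;
}

var_dump(password_meets_rules('pASWORd!*')); // bool(true): lowercase letters need not be adjacent
var_dump(password_meets_rules('paSWORD'));   // bool(false): no special characters
```

Keeping the rules apart also makes it easy to report which one failed.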
You can use this pattern to check the three rules:
preg_match("/(?=.*[a-z].*[a-z])(?=.*[A-Z])(?=.*[_!@#$%^&*].*[_!@#$%^&*])/", $password);
but if you want to allow only letters and these special characters, you must add:
preg_match("/^(?=.*[a-z].*[a-z])(?=.*[A-Z])(?=.*[_!@#$%^&*].*[_!@#$%^&*])[a-zA-Z_!@#$%^&*]+$/", $password);
A way without regex:
$str = '*MauriceAimeLeJambon*';
// 26 lowercase letters, then 26 uppercase letters, then the allowed special characters
$chars = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_!@#$%^&*';
$state = array('lower' => 2, 'upper' => 1, 'special' => 2);
$res = false;
$strlength = strlen($str);
for ($i=0; $i<$strlength; $i++) {
    $pos = strpos($chars, $str[$i]);
    if ($pos !== false) {
        // classify by position first, then count down what is still required
        if ($pos<26) { if ($state['lower']) $state['lower']--; }
        elseif ($pos<52) { if ($state['upper']) $state['upper']--; }
        elseif ($state['special']) $state['special']--;
    } else { $res = false; break; }
    $res = !$state['lower'] && !$state['upper'] && !$state['special'];
}
var_dump($res);
(This version gives the same result as the second pattern. If you want the same result as the first pattern, just remove the else {} and move the last line of the loop body outside the for loop.)

PHP: How to break a string by words within a character limit and near line breaks

I am using a terrible wrapper of PDFLib that doesn't handle the problem PDFLib has with cells that are more than the character limit (Which is around 1600 characters per cell).
So I need to break a large paragraph into smaller strings that fit neatly into the cells, without breaking up words, and as close to the end of the line as possible.
I am completely stumped about how to do this efficiently (I need it to run in a reasonable amount of time)
Here is my code, which cuts the block up into substrings based on character length alone, ignoring the word and line requirements I stated above:
SPE_* functions are static functions from the wrapper class,
SetNextCellStyle calls are used to draw a box around the outline of the cells
BeginRow is required to start a row of text.
EndRow is required to end a row of text, it must be called after BeginRow, and if the preset number of columns is not completely filled, an error is generated.
AddCell adds the string to the second parameter number of columns.
function SPE_divideText($string,$cols,$indent,$showBorders=false)
{
    $strLim = 1500;
    $index = 0;
    // index of the last 1500-character chunk
    $maxIndex = (int) ceil(strlen($string) / 1500) - 1;
    $retArr = array();
    // the third argument of substr() is a length, not an end position
    while (substr($string, $strLim - 1500, 1500) != FALSE)
    {
        $retArr[$index] = substr($string, $strLim - 1500, 1500);
        $strLim += 1500;
        SPE_BeginRow();
        SPE_SetNextCellStyle('cell-padding', '0');
        if ($indent > 0)
        {
            SPE_Empty($indent);
        }
        if ($showBorders)
        {
            SPE_SetNextCellStyle('border-left','1.5');
            SPE_SetNextCellStyle('border-right','1.5');
            if ($index == 0)
            {
                SPE_SetNextCellStyle('border-top','1.5');
            }
            if ($index == $maxIndex)
            {
                SPE_SetNextCellStyle('border-bottom','1.5');
            }
        }
        SPE_AddCell($retArr[$index],$cols-$indent);
        SPE_EndRow();
        $index++;
    }
}
Thanks in advance for any help!
Something like this should work.
function substr_at_word_boundary($string, $chars = 100)
{
preg_match('/^.{0,' . $chars. '}(?:.*?)\b/iu', $string, $matches);
$new_string = $matches[0];
return ($new_string === $string) ? $string : $new_string;
}
$string = substr_at_word_boundary($string, 1600);
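Note that the regex above returns only the first piece, and its lazy .*? can run past the limit to reach the end of a word. As an alternative sketch without regex, the loop below cuts the whole text into pieces that never exceed the limit, breaking at the last space or newline that fits. The function name split_at_word_boundaries is made up, lengths are counted in bytes (so multibyte text would need the mb_* functions), and a single word longer than the limit is cut hard.

```php
<?php
// Hypothetical helper: split $text into pieces of at most $limit bytes,
// cutting at the last space or newline that fits. Assumes $limit >= 1.
function split_at_word_boundaries(string $text, int $limit): array
{
    $chunks = [];

    while (strlen($text) > $limit) {
        // Look one byte past the limit so whitespace right at the limit counts.
        $window = substr($text, 0, $limit + 1);

        // strrpos() returns false when not found; the cast turns that into 0.
        $cut = max((int) strrpos($window, ' '), (int) strrpos($window, "\n"));
        if ($cut === 0) {
            $cut = $limit; // no usable whitespace: fall back to a hard cut
        }

        $chunks[] = rtrim(substr($text, 0, $cut));
        $text     = ltrim(substr($text, $cut));
    }

    if ($text !== '') {
        $chunks[] = $text;
    }

    return $chunks;
}

print_r(split_at_word_boundaries("the quick brown fox jumps", 10));
```

Each returned piece could then be passed to a cell in turn; with a limit of 1500 this stays under the roughly 1600-character limit mentioned in the question.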

How to compare two very large strings [duplicate]

How can I compare two large strings of about 50 KB each using PHP? I want to highlight the bits that differ.
Differences between two strings can also be found using XOR:
$s = 'the sky is falling';
$t = 'the pie is failing';
$d = $s ^ $t;
echo $s, "\n";
for ($i = 0, $n = strlen($d); $i != $n; ++$i) {
echo $d[$i] === "\0" ? ' ' : '#';
}
echo "\n$t\n";
Output:
the sky is falling
    ###      #
the pie is failing
The XOR operation will result in a string that has '\0' where both strings are the same and something not '\0' if they're different. It won't be faster than just comparing both strings character by character, but it'd be useful if you want to just know the first character that's different by using strspn().
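The strspn() idea can be sketched as follows: the XOR result starts with a run of "\0" bytes for as long as the two strings agree, so the length of that run is the offset of the first difference. Keep in mind that XOR on two strings only covers the length of the shorter one.

```php
<?php
$s = 'the sky is falling';
$t = 'the pie is failing';

// Count the leading "\0" bytes of the XOR result.
$firstDiff = strspn($s ^ $t, "\0");

if ($firstDiff === min(strlen($s), strlen($t))) {
    echo "no difference within the common length\n";
} else {
    echo "first difference at offset $firstDiff\n"; // offset 4 ('s' vs 'p')
}
```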
Do you want to output like diff?
Perhaps this is what you want https://github.com/paulgb/simplediff/blob/5bfe1d2a8f967c7901ace50f04ac2d9308ed3169/simplediff.php
ADDED:
Or if you want to highlight every character that is different, you can use a PHP script like this:
for ($i = 0; $i < strlen($string1); $i++) {
    if ($string1[$i] != $string2[$i]) {
        echo "Char $i is different ({$string1[$i]}!={$string2[$i]})<br />\n";
    }
}
Perhaps if you can tell us in detail how you would like to compare, or give us some examples, it would be easier for us to decide the answer.
A small modification of @Alvin's script:
I tested it on my local server with a 50 KB lorem ipsum string in which I replaced every "a" with "4", and it highlights them. It runs pretty fast.
<?php
$string1 = "This is a sample text to test a script to highlight the differences between 2 strings, so the second string will be slightly different";
$string2 = "This is 2 s4mple text to test a scr1pt to highlight the differences between 2 strings, so the first string will be slightly different";
// collect the marked-up characters of each string
$string3 = array();
$string4 = array();
for ($i = 0; $i < strlen($string1); $i++) {
    if ($string1[$i] != $string2[$i]) {
        $string3[$i] = "<mark>{$string1[$i]}</mark>";
        $string4[$i] = "<mark>{$string2[$i]}</mark>";
    }
    else {
        $string3[$i] = "{$string1[$i]}";
        $string4[$i] = "{$string2[$i]}";
    }
}
$string3 = implode("", $string3);
$string4 = implode("", $string4);
echo "$string3". "<br />". $string4;
?>

PHP Number Cleaner RegEx

I have a function I use in PHP to work with numbers. The intent is to clean the number and, optionally, convert nulls to zero. It began for me for use in prep for sql, but is now used in more places. Here it is:
function clean_num ($num, $null_to_zero = true) {
    $num = preg_replace("/[^-0-9.0-9$]/","",$num);
    if (strlen($num) == 0)
        $num = ($null_to_zero) ? 0 : null;
    else if (strlen($num) == 1 && ($num == '-' || $num == '.'))
        $num = ($null_to_zero) ? 0 : null;
    return $num;
}
Does anyone have any ideas on a faster, better way of doing this? It works, and the regex is simple enough and should cover all the cases I need, but a different regex might do the same without the extra junk. Regex is not my strength. Thanks!
The regex [^-0-9.0-9$] matches any char that is
not a hyphen
not a digit
not a .
not a $
There is no need to have two 0-9 ranges in the character class, so effectively your regex is [^-0-9.$] or [^-\d.$].
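A quick way to convince yourself that the shorter class behaves the same is to run both over a few inputs. The sample values below are made up for this sketch; without the /u modifier, \d matches only the ASCII digits 0-9, so the two classes are equivalent here.

```php
<?php
// Made-up sample inputs for comparing the two character classes.
$samples = ['$1,234.50', '-42abc', 'n/a', '  7 ', '-', ''];

foreach ($samples as $sample) {
    $original   = preg_replace('/[^-0-9.0-9$]/', '', $sample);
    $simplified = preg_replace('/[^-\d.$]/', '', $sample);

    // Both patterns keep only hyphens, digits, dots and dollar signs.
    printf("%-12s => %-10s %s\n",
        var_export($sample, true),
        var_export($simplified, true),
        $original === $simplified ? 'same' : 'DIFFERENT');
}
```

Note that both versions keep the $ character, so '$1,234.50' becomes '$1234.50' rather than a bare number.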
