PHP - Display output without cutting off words

I have a search result that strictly counts the number of characters before and after the SEARCH TERM when cutting off the full string. Unfortunately, this causes the output to cut off words in the middle (with an ellipsis before and after the counted characters).
I am trying to have the search result cut off the full string ONLY at white space vs. in the middle of a word.
Here is the function:
private function _highlight_results(){
$GLOBALS['_SEARCH_SUMMARY_LENGTH'] = 24;
foreach($this->results as $url => &$this_result){
if(!$this_result['url_display'] && $this_result['url']){
$this_result['url_display'] = $this_result['url'];
}
foreach($this_result['search_term'] as $search_term){
$search_term = preg_quote($search_term,'/');
foreach(array('title','summary','url_display') as $highlight_item){
if($this_result[$highlight_item] && preg_match('/'.$search_term.'/i',$this_result[$highlight_item])){
if($highlight_item != 'url_display' && strlen($this_result[$highlight_item]) > $GLOBALS['_SEARCH_SUMMARY_LENGTH']){
$half_window = ceil(($GLOBALS['_SEARCH_SUMMARY_LENGTH']-strlen($this->_search_term))/2);
preg_match('/(.{0,'.$half_window.'})('.$search_term.')(.{0,'.$half_window.'})/i',$this_result[$highlight_item],$matches);
// want to even out the strings a bit so if highlighted term is at end of string, put more characters in front.
$before_limit = $after_limit = ($half_window - 2);
if(strlen($matches[1])>=$before_limit && strlen($matches[3])>=$after_limit){
// leave limit alone.
}else if(strlen($matches[1])<$before_limit){
$after_limit += $before_limit - strlen($matches[1]);
$before_limit = strlen($matches[1]);
preg_match('/(.{0,'.($before_limit+2).'})('.$search_term.')(.{0,'.($after_limit+2).'})/i',$this_result[$highlight_item],$matches);
}else if(strlen($matches[3])<$after_limit){
$before_limit += $after_limit - strlen($matches[3]);
$after_limit = strlen($matches[3]);
preg_match('/(.{0,'.($before_limit+2).'})('.$search_term.')(.{0,'.($after_limit+2).'})/i',$this_result[$highlight_item],$matches);
}
$this_result[$highlight_item] = (strlen($matches[1])>$before_limit) ? '...'.substr($matches[1],-$before_limit) : $matches[1];
$this_result[$highlight_item] .= $matches[2];
$this_result[$highlight_item] .= (strlen($matches[3])>$after_limit) ? substr($matches[3],0,$after_limit).'...' : $matches[3];
}
}else if(strlen($this_result[$highlight_item]) > $GLOBALS['_SEARCH_SUMMARY_LENGTH']){
$this_result[$highlight_item] = substr($this_result[$highlight_item],0,$GLOBALS['_SEARCH_SUMMARY_LENGTH']).'...';
}
}
}
foreach($this_result['search_term'] as $search_term){
$search_term = preg_quote($search_term,'/');
foreach(array('title','summary','url_display') as $highlight_item){
$this_result[$highlight_item] = preg_replace('/'.$search_term.'/i','<span id="phpsearch_resultHighlight">$0</span>',$this_result[$highlight_item]);
}
}
}
}
Here's what I was thinking...
Just before displaying the string output, the script should loop through the string using a function that looks for an ellipsis followed immediately by a character, then removes the characters AFTER the ellipsis and keeps looping until a white space is found. Then, the next loop would look for a character followed immediately by an ellipsis, then remove the characters BEFORE the ellipsis and keep looping until a white space is found.
Here's some very sad pseudo code of my description above:
WHILE (not the end of the string) {
// NOT SURE IF I NEED A FOREACH LOOP HERE TO CHECK EACH CHAR
IF ( ^ ('...' and an immediate char are found) ) {
delete chars until a white space is found;
// if '...' is deleted along with the chars, then put the '...' back in:
//string .= '...' . string;
}
IF ( $ (a char and an immediate '...' are found) ) {
delete chars until a white space is found;
// if '...' is deleted along with the chars, then put the '...' back in:
//string .= string . '...';
}
}
PRINT string;
I think you can get the idea of what I'm looking for from the stuff above. I have researched and tested wordwrap() but still have not found THE answer.

Here's an approach that should work fine and also be quite performant. The only drawback is that it breaks words only on spaces as it stands, and this cannot be trivially fixed because there is no strrspn function to complement strspn (but one could be easily written and used to extend this solution).
function display_short($str, $limit, $ellipsis = '...') {
// if all of it fits there's nothing to do
if (strlen($str) <= $limit) {
return $str;
}
// $ellipsis will count towards $limit
$limit -= strlen($ellipsis);
// find the last space ("word boundary")
$pos = strrpos($str, ' ', $limit - strlen($str));
// if none found, prefer breaking into the middle of
// "the" word instead of just giving up
if ($pos === false) {
$pos = $limit;
}
return substr($str, 0, $pos).$ellipsis;
}
Test with:
$string = "the quick brown fox jumps over the lazy dog";
for($limit = 10; $limit <= strlen($string); $limit += 10) {
echo display_short($string, $limit), PHP_EOL;
}
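The same word-boundary idea can be applied to both ends of an excerpt centered on the search term, which is what the question describes. The following is only a sketch under assumptions: excerpt_around() and its $radius parameter are hypothetical names, it treats only a plain space as a word boundary, and it counts bytes, so it assumes single-byte text.

```php
<?php
// Hypothetical helper: returns the search term with up to $radius bytes of
// context on each side, trimmed so that no word is cut in half.
function excerpt_around($text, $term, $radius = 12, $ellipsis = '...') {
    $pos = stripos($text, $term);
    if ($pos === false) {
        return $text; // term not present: nothing to center on
    }
    $len      = strlen($text);
    $term_end = $pos + strlen($term);
    $start    = max(0, $pos - $radius);
    $end      = min($len, $term_end + $radius);

    // Left edge: if we landed inside a word, skip forward past the next space.
    if ($start > 0 && $text[$start - 1] !== ' ') {
        $space = strpos($text, ' ', $start);
        $start = ($space !== false && $space < $pos) ? $space + 1 : $pos;
    }
    // Right edge: if we landed inside a word, back up to the previous space.
    // A negative offset makes strrpos() search backward from that position.
    if ($end < $len && $text[$end] !== ' ') {
        $space = strrpos($text, ' ', $end - $len);
        $end = ($space !== false && $space >= $term_end) ? $space : $term_end;
    }
    return ($start > 0 ? $ellipsis : '')
         . substr($text, $start, $end - $start)
         . ($end < $len ? $ellipsis : '');
}

echo excerpt_around('the quick brown fox jumps over the lazy dog', 'fox', 8), PHP_EOL;
```

Because partial words are dropped rather than completed, the excerpt can come out shorter than the requested radius, but it never exceeds it.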

Related

Regex : Removing all extra whitespace, line-breaks, and empty spaces

Update: It seems like it is just a regex problem.
I am trying to remove all extra whitespace, line breaks, and empty spaces from a user's story, using a function that grabs only the first 100 characters.
The issue is that although the 100-character limit works, the removal of whitespace, line breaks, and empty spaces is not applied:
function aboutme_echo($x, $length)
{
if(strlen($x) <= $length)
{
echo $x;
}
else
{
$y = substr($x,0,$length) . '...';
echo $y;
}
}
aboutme_echo((preg_replace("/\s+/"," ", $aboutme)), 100);
Example String: 🤣🤣🤣WHAT?! That's crazy!
Long story short,
😛😛someone reached out to me who had a pharma virus.
😎I have the opportunity to rebuild their site, but I can't rush the planning and staging, but i...
Something like this maybe?
function cleanExcerpt($string) {
$pattern = '/\s+/';
$replace = ' ';
$cleanstring = trim(preg_replace($pattern,$replace,$string));
return strtok(wordwrap($cleanstring, 100, "...\n"), "\n");
}
To use it, pass in a string and echo the function's return value, like this:
$string = "Example String: 🤣🤣🤣WHAT?! That's crazy! Long story short, 😛😛someone reached ";
echo cleanExcerpt($string);
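One thing to watch with the example string: substr() and wordwrap() count bytes, and each emoji takes several bytes in UTF-8, so a byte-based cut can land in the middle of a character. A possible multibyte-aware variant is sketched below. It assumes the mbstring extension is loaded and that the input is valid UTF-8; clean_excerpt_mb() is a hypothetical name, and unlike the wordwrap() version it cuts at a character count rather than at a word boundary.

```php
<?php
// Hypothetical helper: collapses whitespace, then truncates by characters
// (not bytes). Assumes mbstring is available and $text is valid UTF-8.
function clean_excerpt_mb($text, $length = 100, $ellipsis = '...') {
    // The /u modifier makes PCRE treat pattern and subject as UTF-8.
    $clean = trim(preg_replace('/\s+/u', ' ', $text));
    if (mb_strlen($clean, 'UTF-8') <= $length) {
        return $clean; // short enough: no ellipsis needed
    }
    // rtrim() avoids a dangling space right before the ellipsis.
    return rtrim(mb_substr($clean, 0, $length, 'UTF-8')) . $ellipsis;
}

echo clean_excerpt_mb("🤣🤣🤣WHAT?!   That's\ncrazy!", 12), PHP_EOL;
```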

PHP: How to Properly Use Strpos() to Find a Word in a String

If one is experienced in PHP, then one knows how to find whole words in a string and their position using a regex and preg_match() or preg_match_all. But, if you're looking instead for a lighter solution, you may be tempted to try with strpos(). The question emerges as to how one can use this function without it detecting substrings contained in other words. For example, how to detect "any" but not those characters occurring in "company"?
Consider a string like the following:
"Will *any* company do *any* job, (are there any)?"
How would one apply strpos() to detect each appearance of "any" in the string? Real life often involves more than merely space delimited words. Unfortunately, this sentence didn't appear with the non-alphabetical characters when I originally posted.
I think you could probably just replace all the non-word characters you care about with spaces (what about hyphenations, for example?) and then test for " word ":
var_dump(firstWordPosition('Will company any do any job, (are there any)?', 'any'));
var_dump(firstWordPosition('Will *any* company do *any* job, (are there any)?', 'any'));
function firstWordPosition($str, $word) {
// There are others, maybe also pass this in or array_merge() for more control.
// Double quotes are needed so that \n, \r and \t are real control characters.
$nonchars = ["'",'"','.',',','!','?','(',')','^','$','#',"\n","\r","\t",];
// You could also do a strpos() with an if and another argument passed in.
// Note that we're padding $str with spaces to match at the beginning/end.
$pos = stripos(str_replace($nonchars, ' ', " $str "), " $word ");
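// " $word " starts at the space before the word, and $str was padded with one
// leading space, so $pos is already the word's offset in the original $str.
return $pos;
}
The first call gives 13 (offset from 0).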
https://3v4l.org/qh9Rb
<?php
$subject = "any";
$b = " ";
$delimited = "$b$subject$b";
$replace = array("?","*","(",")",",",".");
$str = "Will *any* company do *any* job, (are there any)?";
echo "\nThe string: \"$str\"";
$temp = str_replace($replace,$b,$str);
while ( ($pos = strpos($temp,$delimited)) !== false )
{
echo "\nThe subject \"$subject\" occurs at position ",($pos + 1);
for ($i=0,$max=$pos + 1 + strlen($subject); $i <= $max; $i++) {
$temp[$i] = $b;
}
}
The script defines a word boundary as a blank space. If the string has non-alphabetical characters, they are replaced with blank space and the result is stored in $temp. As the loop iterates and detects $subject, each of its characters changes into a space in order to locate the next appearance of the subject. Considering the amount of work involved one may wonder if such effort really pays off compared to using a regex with a preg_ function. That is something that one will have to decide themselves. My purpose was to show how this may be achieved using strpos() without resorting to the oft repeated conventional wisdom of SO which advocates using a regex.
There is an option if you are loath to create a replacement array of non-alphabetical characters, as follows:
<?php
function getAllWholeWordPos($s,$word){
$b = " ";
$delimited = "$b$word$b";
// Start with an empty array so count() and foreach work when nothing matches.
$retval = array();
for ($i=0, $max = strlen( $s ); $i < $max; $i++) {
if ( !ctype_alpha( $s[$i] ) ){
$s[$i] = $b;
}
}
while ( ( $pos = stripos( $s, $delimited) ) !== false ) {
$retval[] = $pos + 1;
for ( $i=0, $max = $pos + 1 + strlen( $word ); $i <= $max; $i++) {
$s[$i] = $b;
}
}
return $retval;
}
$whole_word = "any";
$str = "Will *$whole_word* company do *$whole_word* job, (are there $whole_word)?";
echo "\nString: \"$str\"";
$result = getAllWholeWordPos( $str, $whole_word );
$times = count( $result );
echo "\n\nThe word \"$whole_word\" occurs $times times:\n";
foreach ($result as $pos) {
echo "\nPosition: ",$pos;
}
Note that this updated example wraps the logic in a function and uses stripos(), a variant of strpos() with the added benefit of being case insensitive. Despite the more labor-intensive coding, it performs quickly.
Try the following code
<!DOCTYPE html>
<html>
<body>
<?php
echo strpos("I love php, I love php too!","php");
?>
</body>
</html>
Output: 7
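For comparison with the strpos()-based answers, the regex approach the question alludes to can be sketched as follows. This is a minimal sketch, not a benchmark: whole_word_positions() is a hypothetical name, and the \b assertion uses PCRE's notion of a word character (letters, digits, underscore), which may differ from the blank-space boundary used above.

```php
<?php
// Hypothetical helper: byte offsets of every whole-word, case-insensitive
// occurrence of $word in $haystack.
function whole_word_positions($haystack, $word) {
    // \b matches between a word character and a non-word character.
    $pattern = '/\b' . preg_quote($word, '/') . '\b/i';
    preg_match_all($pattern, $haystack, $matches, PREG_OFFSET_CAPTURE);
    // Each entry of $matches[0] is [matched text, offset]; keep the offsets.
    return array_column($matches[0], 1);
}

print_r(whole_word_positions('Will *any* company do *any* job, (are there any)?', 'any'));
```

On the sample sentence this reports offsets 6, 23 and 44 and skips the "any" inside "company".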

PHP - Turn this string: "adc 25...123.50 xyz" into 2 variables: "25" and "123.50"?

The title pretty much sums up what I am trying to accomplish.
I have a string that could consist of letters, numbers, or characters like ")" and "*". It may also include a numeric range separated by three dots "...", e.g. "25...123.50".
An example of this string could be:
peaches* 25...123.50 +("apples") or -(peaches*) apples* 25...123.50
Now, what I would like to do is capture the numbers before and after the three dots, so I end up with 2 variables, 25 and 123.50. I would then like to trim the string so that I end up with a string that excludes the number values:
peaches* +("apples") or -(peaches*) apples*
So essentially:
$string = 'peaches* 25...123.50 +("apples")';
if (preg_match("/\.\.\./", $string ))
{
# How do i get the left value (could or could not be a decimal, using .)
$from = 25;
# How do i get the right value (could or could not be a decimal, using .)
$to = 123.50;
# How do i remove the value "here...here" is this right?
$clean = preg_replace('/'.$from.'\.\.\.'.$to.'/', '', $string);
$clean = preg_replace('/ /', ' ', $string);
}
If anyone could provide me with some input on the best way to go about this complicated task it would be greatly appreciated! Any suggestions, advice, input, feedback or comments are most welcome, Thank you!
This preg_match should work:
$str = 'peaches* 25...123.50 +("apples")';
if (preg_match('~(\d+(?:\.\d+)?)\.{3}(\d+(?:\.\d+)?)~', $str, $arr))
print_r($arr);
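Building on that pattern, the remaining steps from the question (assigning the two values and stripping the ranges from the string) might look like the sketch below. The variable names $from, $to and $clean follow the question; taking the values from the first range only and collapsing leftover whitespace are assumptions about what is wanted.

```php
<?php
$str     = 'peaches* 25...123.50 +("apples") or -(peaches*) apples* 25...123.50';
$pattern = '~(\d+(?:\.\d+)?)\.{3}(\d+(?:\.\d+)?)~';

$from = $to = null;
if (preg_match($pattern, $str, $m)) {
    $from = $m[1]; // number left of the three dots (first range found)
    $to   = $m[2]; // number right of the three dots
}

// Remove every "number...number" range, then collapse the leftover whitespace.
$clean = trim(preg_replace('/\s+/', ' ', preg_replace($pattern, '', $str)));

var_dump($from, $to, $clean);
```

The captured values are strings; cast them with (float) if numeric comparison is needed.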
Pseudo code
In a loop:
Perform a strpos for "..." and substr at that position. Then go back from the end of that substring (character by character), checking to see if each is_numeric or a period. On the first non-numeric/non-period occurrence, you grab a substring from the beginning of the original string to that point (store it temporarily). Then start checking for is_numeric or period in the other direction. Grab a substring and add it to the other substring you stored. Repeat.
It's not a regex, but it will accomplish the same goal nonetheless.
Some php
$my_string = "blah blah abc25.4...123.50xyz blah blah etc";
$found = 1;
while($found){
$found = $cursor = strpos($my_string , "...");
if(!empty($found)){
//Go left
$char = ".";
while(is_numeric($char) || $char == "."){
$cursor--;
$char = substr($my_string , $cursor, 1);
}
// Keep everything from the start up to and including the last non-numeric character
$left_substring = substr($my_string , 0, $cursor + 1);
//Go right
$cursor = $found + 2;
$char = ".";
while(is_numeric($char) || $char == "."){
$cursor++;
$char = substr($my_string , $cursor, 1);
}
$right_substring = substr($my_string , $cursor);
//Combine the left and right
$my_string = $left_substring . $right_substring;
}
}
echo $my_string;

How to cut a text after a number of chars and obtain text spread on multiple rows?

I have a very long text, and I need to cut it after every N chars, so that in the end I obtain text rendered on multiple rows, without any of the words being cut.
So, if I have a text with a length of 1000 chars saved on 1 line, and I cut it every 100 chars, I will end up with text spread over 10 lines.
I tried something, but I got stuck.
foreach does not work because the text is not seen as an array; also, I did not make sure to keep the words intact in my test.
Has anyone tried this? Or is there a link to a solution?
public static function cut_line_after_n_chars($str, $n = 70) {
$result = '';
$pos = 0;
foreach ($str as $c) {
$pos++;
if ($pos == $n) {
$result .= $c + '<br/>';
$pos = 0;
}
else
$result .= $c;
}
return $result;
}
It sounds like you need wordwrap.
http://php.net/manual/en/function.wordwrap.php
This inserts a line break at word boundaries so that no line exceeds the given width, without cutting off words. You can then explode() the result on that break string to get an array of pieces and format them as you like.
EDIT
If you still need each of your lines to be 100 characters, you can use str_pad to add extra spaces onto each row.
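A minimal sketch of that suggestion, assuming single-byte text and using "\n" as the break string (the sample sentence and the $width value are placeholders):

```php
<?php
$text  = 'the quick brown fox jumps over the lazy dog';
$width = 10;

// wordwrap() returns ONE string with "\n" inserted at word boundaries;
// explode() then turns it into an array of lines.
$lines = explode("\n", wordwrap($text, $width, "\n"));

// Optional: right-pad every line with spaces to exactly $width characters.
$padded = array_map(function ($line) use ($width) {
    return str_pad($line, $width);
}, $lines);

echo implode("<br/>\n", $lines), PHP_EOL;
```

By default wordwrap() leaves a single word longer than the width intact; passing true as its fourth argument forces such words to be split.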
Use explode() function to get array of words from your string.
$words = explode( ' ', $str );
$length = 0;
foreach( $words as $word ) {
// Your loop code goes here.
}
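One way the loop body could be filled in is a greedy fill: keep appending words to the current line until the next word would push it past the limit. This is only a sketch; wrap_words() is a hypothetical name, lengths are counted in bytes, and a single word longer than $n simply ends up on a line of its own.

```php
<?php
// Hypothetical helper: splits $str into lines of at most $n bytes without
// breaking words (greedy line filling).
function wrap_words($str, $n = 70) {
    $lines = array();
    $line  = '';
    foreach (explode(' ', $str) as $word) {
        if ($line === '') {
            $line = $word;                       // first word on the line
        } elseif (strlen($line) + 1 + strlen($word) <= $n) {
            $line .= ' ' . $word;                // word still fits
        } else {
            $lines[] = $line;                    // line is full: start a new one
            $line = $word;
        }
    }
    if ($line !== '') {
        $lines[] = $line;                        // flush the last line
    }
    return $lines;
}

echo implode("<br/>\n", wrap_words('the quick brown fox jumps over the lazy dog', 10)), PHP_EOL;
```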

Get the current + the next word in a string

This is what I am trying to get:
Given the text "My longest text to test", when I search for e.g. "My", I should get "My longest".
I tried it with this code, which first computes the end position of the input and then searches for the next ' ' to cut at.
$length = strripos($text, $input) + strlen($input)+2;
$stringpos = strripos($text, ' ', $length);
$newstring = substr($text, 0, strpos($text, ' ', $length));
But this only works the first time; after that it cuts right after the current input, meaning
"My lon" gives "My longest" and not "My longest text".
How must I change this to get the right result, always including the next word? Maybe I need a break, but I cannot find the right solution.
UPDATE
Here is my workaround until I find a better solution. As I said, working with array functions does not work, since partial words should work too. So I extended my previous idea a bit. The basic idea is to distinguish between the first time and the next. I improved the code a bit.
function get_title($input, $text) {
$length = strripos($text, $input) + strlen($input);
$stringpos = stripos($text, ' ', $length);
// Find next ' '
$stringpos2 = ($stringpos === false) ? false : stripos($text, ' ', $stringpos+1);
if ($stringpos2 === false) {
// No second space after the input: return the whole text
$newstring = $text;
} else {
$newstring = substr($text, 0, $stringpos2);
}
return $newstring;
}
Not pretty, but hey, it seems to work ^^. Anyway, maybe one of you has a better solution.
You can try using explode
$string = explode(" ", "My longest text to test");
$key = array_search("My", $string);
echo $string[$key] , " " , $string[$key + 1] ;
You can take it to the next level by matching case-insensitively with preg_match_all
$string = "My longest text to test in my school that is very close to mY village" ;
var_dump(__search("My",$string));
Output
array
0 => string 'My longest' (length=10)
1 => string 'my school' (length=9)
2 => string 'mY village' (length=10)
Function used
function __search($search,$string)
{
$result = array();
preg_match_all('/' . preg_quote($search, '/') . '\s+\w+/i', $string, $result);
return $result[0];
}
There are simpler ways to do that. String functions are useful if you don't want to look for something specific, but cut out a pre-defined length of something. Else use a regular expression:
preg_match('/My\s+\w+/', $string, $result);
print $result[0];
Here the My looks for the literal first word. And \s+ for some spaces. While \w+ matches word characters.
This adds some new syntax to learn, but it is less brittle than workarounds and shorter than the string-function code needed to accomplish the same.
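The same regex idea can be stretched to cover the partial input from the question ("My lon" should give "My longest text"). The sketch below is one possible reading of that requirement: current_and_next_word() is a hypothetical name, the input is matched case-insensitively anywhere in the text (even inside a longer word), and a "word" is taken to be any run of non-whitespace.

```php
<?php
// Hypothetical helper: completes the word that $input ends in and appends
// the following word, if there is one. Returns null when $input is not found.
function current_and_next_word($input, $text) {
    // \S* finishes the partially typed word; (?:\s+\S+)? adds the next word.
    $pattern = '/' . preg_quote($input, '/') . '\S*(?:\s+\S+)?/i';
    return preg_match($pattern, $text, $m) ? $m[0] : null;
}

echo current_and_next_word('My', 'My longest text to test'), PHP_EOL;
echo current_and_next_word('My lon', 'My longest text to test'), PHP_EOL;
```

The two calls print "My longest" and "My longest text" respectively.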
An easy method would be to split it on whitespace and grab the current array index plus the next one:
// Word to search for:
$findme = "text";
// Using preg_split() to split on any amount of whitespace
// (the case-insensitive comparison happens in array_search() below)
$words = preg_split('/\s+/', "My longest text to test");
// Find the word in the array with array_search()
// calling strtolower() with array_map() to search case-insensitively
$idx = array_search(strtolower($findme), array_map('strtolower', $words));
if ($idx !== FALSE) {
// If found, print the word and the following word from the array
// as long as the following one exists.
echo $words[$idx];
if (isset($words[$idx + 1])) {
echo " " . $words[$idx + 1];
}
}
// Prints:
// "text to"
