Regex : Removing all extra whitespace, line-breaks, and empty spaces - php

Update: It seems like it is just a regex problem.
I am trying to remove all extra whitespace, line breaks, and empty spaces from a user's story, using a function that grabs only the first 100 characters.
The 100-character limit works, but the whitespace, line breaks, and empty spaces are not removed:
function aboutme_echo($x, $length)
{
if(strlen($x) <= $length)
{
echo $x;
}
else
{
$y = substr($x,0,$length) . '...';
echo $y;
}
}
aboutme_echo((preg_replace("/\s+/"," ", $aboutme)), 100);
Example String: 🤣🤣🤣WHAT?! That's crazy!
Long story short,
😛😛someone reached out to me who had a pharma virus.
😎I have the opportunity to rebuild their site, but I can't rush the planning and staging, but i...

Something like this maybe?
function cleanExcerpt($string) {
$pattern = '/\s+/';
$replace = ' ';
$cleanstring = trim(preg_replace($pattern,$replace,$string));
return strtok(wordwrap($cleanstring, 100, "...\n"), "\n");
}
To use it, pass in a string and echo the return value:
$string = "Example String: 🤣🤣🤣WHAT?! That's crazy! Long story short, 😛😛someone reached ";
echo cleanExcerpt($string);
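Both substr() and wordwrap() count bytes, so a string full of emoji (four bytes each in UTF-8) can be cut in the middle of a character. The following is a minimal sketch of a variant that counts code points instead, assuming the input is valid UTF-8. The function name cleanExcerptUtf8 and its parameters are hypothetical, not part of the answer above.

```php
<?php
// Hypothetical helper: collapse whitespace, then keep at most $limit
// UTF-8 code points. Assumes $text is valid UTF-8.
function cleanExcerptUtf8(string $text, int $limit = 100, string $ellipsis = '...'): string
{
    // The /u modifier makes PCRE treat pattern and subject as UTF-8.
    // preg_replace() returns null on error, so fall back to the input.
    $clean = trim(preg_replace('/\s+/u', ' ', $text) ?? $text);

    // This pattern matches only if there are MORE than $limit code points;
    // group 1 then holds exactly the first $limit of them.
    $pattern = sprintf('/^(.{%d})./su', $limit);
    if (preg_match($pattern, $clean, $m) === 1) {
        return rtrim($m[1]) . $ellipsis;
    }

    // Short enough already: return it without an ellipsis.
    return $clean;
}

echo cleanExcerptUtf8("😛😛  someone\n\nreached   out", 10), "\n";
// prints "😛😛 someone..."
```

If the mbstring extension is available, mb_strlen() and mb_substr() would be the more conventional way to do the same truncation.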

Related

Get the current + the next word in a string

This is what I am trying to get. Given the text
My longest text to test
when I search for e.g. My, I should get My longest.
I tried the following: first compute the length up to the end of the input, then search for the next ' ' and cut there.
$length = strripos($text, $input) + strlen($input)+2;
$stringpos = strripos($text, ' ', $length);
$newstring = substr($text, 0, strpos($text, ' ', $length));
But this only works the first time; after that it cuts right after the current input. That is,
My lon gives My longest and not My longest text.
How must I change this to get the right result, so that I always get the next word? Maybe I need a break, but I cannot find the right solution.
UPDATE
Here is my workaround until I find a better solution. As I said, working with array functions does not work, since partial words should match too. So I extended my previous idea a bit. The basic idea is to distinguish between the first space after the input and the one after it.
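function get_title($input, $text) {
    // Position just past the last (case-insensitive) occurrence of $input
    $length = strripos($text, $input) + strlen($input);
    // Find the first ' ' after the input
    $stringpos = stripos($text, ' ', $length);
    if ($stringpos === false) {
        return $text;
    }
    // Find the next ' ', which ends the following word
    $stringpos2 = stripos($text, ' ', $stringpos + 1);
    if ($stringpos2 === false) {
        // The following word is the last one, so return everything
        return $text;
    }
    return substr($text, 0, $stringpos2);
}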
Not pretty, but hey, it seems to work ^^. Anyway, maybe one of you has a better solution.
You can try using explode
$string = explode(" ", "My longest text to test");
$key = array_search("My", $string);
echo $string[$key] , " " , $string[$key + 1] ;
You can take it to the next level by searching case-insensitively with preg_match_all
$string = "My longest text to test in my school that is very close to mY village" ;
var_dump(__search("My",$string));
Output
array
0 => string 'My longest' (length=10)
1 => string 'my school' (length=9)
2 => string 'mY village' (length=10)
Function used
function __search($search,$string)
{
$result = array();
preg_match_all('/' . preg_quote($search, '/') . '\s+\w+/i', $string, $result);
return $result[0];
}
There are simpler ways to do that. String functions are useful if you don't want to look for something specific, but just cut out a pre-defined length of something. Otherwise, use a regular expression:
preg_match('/My\s+\w+/', $string, $result);
print $result[0];
Here, My matches the literal first word, \s+ matches one or more whitespace characters, and \w+ matches word characters.
This adds some new syntax to learn, but it is less brittle than workarounds and lengthier string-function code that accomplish the same.
An easy method would be to split it on whitespace and grab the current array index plus the next one:
// Word to search for:
$findme = "text";
// Using preg_split() to split on any amount of whitespace
$words = preg_split('/\s+/', "My longest text to test");
// Find the word in the array with array_search()
// calling strtolower() with array_map() to search case-insensitively
$idx = array_search(strtolower($findme), array_map('strtolower', $words));
if ($idx !== FALSE) {
// If found, print the word and the following word from the array
// as long as the following one exists.
echo $words[$idx];
if (isset($words[$idx + 1])) {
echo " " . $words[$idx + 1];
}
}
// Prints:
// "text to"
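The question's update says that partial words should match as well (My lon should give My longest text). One way to cover that with a single regular expression is sketched below: quote the input, let \S* finish the current word, then optionally take one more word. The function name currentAndNextWord is hypothetical, and returning null when nothing matches is an assumption about the desired behaviour.

```php
<?php
// Hypothetical helper: returns the (possibly partial) input completed to a
// full word, plus the following word if there is one; null if not found.
function currentAndNextWord(string $input, string $text): ?string
{
    // preg_quote() with the delimiter keeps user input from being read as regex.
    // \S* completes the current word; (?:\s+\S+)? optionally adds the next word.
    $pattern = '/' . preg_quote($input, '/') . '\S*(?:\s+\S+)?/i';

    return preg_match($pattern, $text, $m) === 1 ? $m[0] : null;
}

$text = 'My longest text to test';
echo currentAndNextWord('My', $text), "\n";      // prints "My longest"
echo currentAndNextWord('My lon', $text), "\n";  // prints "My longest text"
```

Unlike the array-based answers, this matches the first occurrence anywhere in the text, including inside a word, which may or may not be what you want.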

How to split up a string after a certain number of characters in PHP

I am using PHP to return the user's browser user agent. The problem is where I want to print it: I don't want each line to be longer than about 30 characters. Is there a way to break the returned variable (from the function that I call to get the string) into substrings of a certain length? Since UA strings have different lengths, I am not sure what to expect.
This is the PHP code where I return the user agent:
function __toString() {
return "Browser Name: {$this->getBrowser()} \n" .
" Browser Version: {$this->getVersion()} \n" .
" Browser User Agent String: {$this->getUserAgent()} \n" .
" Platform: {$this->getPlatform()} ";
}
In particular, I mean the call to $this->getUserAgent(). I output the result using this:
<?php require_once('browser.php'); $browser = new Browser(); echo $browser . "\n"; ?>
Right now, the name, version and platform calls output the way I want them to (because none are anywhere near as long as the UA string).
So in short, how do I split up the returned user string so that it won't exceed a certain number of characters per line? Ideally, I'd like to store them into temporary variables, because I have to add spaces in between the words. For example, where it says "Platform", there are spaces before it, so it lines up vertically with Browser Version, and then spaces so that the result of all the returned strings from the functions line up.
In case anyone wants the Github code for above to see what I am doing, the function calls are in this on lines 339-243, and the echoed results go to this on line 152.
At this point I am very very close
Just need help adding spaces before the wrapped text (see my answer below)
This is what I have right now:
$text1 = $this->getUserAgent();
$UAline1 = substr($text1, 0, 26);
$text2 = $this->getUserAgent();
$towrapUA = str_replace($UAline1, '', $text2);
$wordwrapped = chunk_split($towrapUA, 26, "\n\r");
The only issue at this point is: how do I get a constant number of spaces before each of the wrapped lines? I need (let's say) 20 spaces before all of the wrapped lines for formatting.
Try this:
$str = chunk_split($string, 30, "\r\n");
// Splits a string after X chars (in this case 30) and adds a line break
You can also try it using regex:
$str = preg_replace("/(.{30})/", "$1\r\n", $string);
Or, as suggested in the comments above, wordwrap() does something similar, but breaks at word boundaries where it can:
$str = wordwrap($string, 30, "<br />\n");
More info:
http://php.net/manual/en/function.chunk-split.php
http://us2.php.net/wordwrap
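To make the difference between the two functions concrete, here is a small sketch that runs both on the same input. The user agent string is made up for illustration.

```php
<?php
// Sample user agent string, invented for this example.
$ua = 'Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0';

// chunk_split() cuts every 30 bytes, even in the middle of a word, and also
// appends the separator after the final chunk, hence the rtrim().
$fixed = rtrim(chunk_split($ua, 30, "\n"), "\n");

// wordwrap() prefers to break at spaces; passing true as the fourth argument
// forces a cut inside any single word that is longer than the width.
$wrapped = wordwrap($ua, 30, "\n", true);

echo $fixed, "\n\n", $wrapped, "\n";
```

In both results no line is longer than 30 bytes; the wordwrap() version simply tends to have shorter, more readable lines because it breaks at spaces first.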
EDIT:
Based on your edited question, it looks like this is what you're looking for:
$text1 = $this->getUserAgent();
$UAline1 = substr($text1, 0, 26); // First line, printed next to the label
$towrapUA = substr($text1, 26); // Everything after the first 26 characters
$space = str_repeat(' ', 20); // Padding placed before every wrapped line
$wordwrapped = explode("\n", wordwrap($towrapUA, 26, "\n")); // Split at "\n"
// Prefix each line with the padding; implode() adds no line break after the last line
$string = $space . implode("\n" . $space, $wordwrapped);
echo $string;
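The same idea can be packaged as a reusable function that indents continuation lines by the width of the label, so the wrapped text lines up under the first line. This is only a sketch: the name wrapWithIndent and its parameters are hypothetical, and it counts bytes, which is fine for ASCII user agent strings.

```php
<?php
// Hypothetical helper: prints "$label$value", wrapping $value at $width
// bytes and indenting continuation lines to line up under the first one.
function wrapWithIndent(string $label, string $value, int $width = 30): string
{
    // The continuation indent is as wide as the label itself.
    $indent = str_repeat(' ', strlen($label));

    // true = also cut single words that are longer than $width.
    $lines = explode("\n", wordwrap($value, $width, "\n", true));

    return $label . implode("\n" . $indent, $lines);
}

echo wrapWithIndent('UA: ', 'aaaa bbbb cccc', 9), "\n";
// prints:
// UA: aaaa bbbb
//     cccc
```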

PHP - Display output without cutting off words

I have a search result that strictly counts the number of characters before and after the SEARCH TERM when cutting off the full string. Unfortunately, this causes the output to cut off words in the middle (with an ellipsis before and after the excerpt).
I am trying to have the search result cut off the full string ONLY at white space rather than in the middle of a word.
Here is the function:
private function _highlight_results(){
$GLOBALS['_SEARCH_SUMMARY_LENGTH'] = 24;
foreach($this->results as $url => &$this_result){
if(!$this_result['url_display'] && $this_result['url']){
$this_result['url_display'] = $this_result['url'];
}
foreach($this_result['search_term'] as $search_term){
$search_term = preg_quote($search_term,'/');
foreach(array('title','summary','url_display') as $highlight_item){
if($this_result[$highlight_item] && preg_match('/'.$search_term.'/i',$this_result[$highlight_item])){
if($highlight_item != 'url_display' && strlen($this_result[$highlight_item]) > $GLOBALS['_SEARCH_SUMMARY_LENGTH']){
$half_context = ceil(($GLOBALS['_SEARCH_SUMMARY_LENGTH']-strlen($this->_search_term))/2);
preg_match('/(.{0,'.$half_context.'})('.$search_term.')(.{0,'.$half_context.'})/i',$this_result[$highlight_item],$matches);
// want to even out the strings a bit so if highlighted term is at end of string, put more characters in front.
$before_limit = $after_limit = ($half_context - 2);
if(strlen($matches[1])>=$before_limit && strlen($matches[3])>=$after_limit){
// leave limit alone.
}else if(strlen($matches[1])<$before_limit){
$after_limit += $before_limit - strlen($matches[1]);
$before_limit = strlen($matches[1]);
preg_match('/(.{0,'.($before_limit+2).'})('.$search_term.')(.{0,'.($after_limit+2).'})/i',$this_result[$highlight_item],$matches);
}else if(strlen($matches[3])<$after_limit){
$before_limit += $after_limit - strlen($matches[3]);
$after_limit = strlen($matches[3]);
preg_match('/(.{0,'.($before_limit+2).'})('.$search_term.')(.{0,'.($after_limit+2).'})/i',$this_result[$highlight_item],$matches);
}
$this_result[$highlight_item] = (strlen($matches[1])>$before_limit) ? '...'.substr($matches[1],-$before_limit) : $matches[1];
$this_result[$highlight_item] .= $matches[2];
$this_result[$highlight_item] .= (strlen($matches[3])>$after_limit) ? substr($matches[3],0,$after_limit).'...' : $matches[3];
}
}else if(strlen($this_result[$highlight_item]) > $GLOBALS['_SEARCH_SUMMARY_LENGTH']){
$this_result[$highlight_item] = substr($this_result[$highlight_item],0,$GLOBALS['_SEARCH_SUMMARY_LENGTH']).'...';
}
}
}
foreach($this_result['search_term'] as $search_term){
$search_term = preg_quote($search_term,'/');
foreach(array('title','summary','url_display') as $highlight_item){
$this_result[$highlight_item] = preg_replace('/'.$search_term.'/i','<span id="phpsearch_resultHighlight">$0</span>',$this_result[$highlight_item]);
}
}
}
}
Here's what I was thinking...
Just before displaying the string output, the script should loop through the string using a function that 'looks for' an ellipsis followed immediately by a character, then removes the characters AFTER the ellipsis until a white space is found. Then, the next loop would 'look for' a character followed immediately by an ellipsis, and remove the characters BEFORE the ellipsis until a white space is found.
Here's some very sad pseudo code of my description above:
WHILE (not the end of the string) {
// NOT SURE IF I NEED A FOREACH LOOP HERE TO CHECK EACH CHAR
IF ( ^ ('...' and an immediate char are found) ) {
delete chars until a white space is found;
// if '...' is deleted along with the chars, then put the '...' back in:
//string .= '...' . string;
}
IF ( $ (a char and an immediate '...' are found) ) {
delete chars until a white space is found;
// if '...' is deleted along with the chars, then put the '...' back in:
//string .= string . '...';
}
}
PRINT string;
I think you can get the idea of what I'm looking for from the stuff above. I have researched and tested wordwrap() but still have not found THE answer.
Here's an approach that should work fine and also be quite performant. The only drawback is that it breaks words only on spaces as it stands, and this cannot be trivially fixed because there is no strrspn function to complement strspn (but one could be easily written and used to extend this solution).
function display_short($str, $limit, $ellipsis = '...') {
// if all of it fits there's nothing to do
if (strlen($str) <= $limit) {
return $str;
}
// $ellipsis will count towards $limit
$limit -= strlen($ellipsis);
// find the last space ("word boundary")
$pos = strrpos($str, ' ', $limit - strlen($str));
// if none found, prefer breaking into the middle of
// "the" word instead of just giving up
if ($pos === false) {
$pos = $limit;
}
return substr($str, 0, $pos).$ellipsis;
}
Test with:
$string = "the quick brown fox jumps over the lazy dog";
for($limit = 10; $limit <= strlen($string); $limit += 10) {
print_r(display_short($string, $limit));
}
See it in action.
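display_short() trims only the end of a string, while the question needs an excerpt around a search term that is trimmed to whitespace on both sides. The following is a rough sketch of that, under a few assumptions: words are separated by single spaces, lengths are counted in bytes, and only the first occurrence of the term matters. The name excerptAroundTerm and its parameters are hypothetical.

```php
<?php
// Hypothetical helper: keep up to $context bytes on each side of the first
// occurrence of $term, but never start or end in the middle of a word.
function excerptAroundTerm(string $text, string $term, int $context = 12, string $ellipsis = '...'): string
{
    if ($term === '') {
        return $text;
    }
    $pos = stripos($text, $term);
    if ($pos === false) {
        return $text;
    }

    $termEnd = $pos + strlen($term);
    $start = max(0, $pos - $context);
    $end = min(strlen($text), $termEnd + $context);

    // Window starts inside a word: move forward to the start of the next
    // word, but never past the search term itself.
    if ($start > 0 && $text[$start - 1] !== ' ') {
        $space = strpos($text, ' ', $start);
        $start = ($space === false || $space >= $pos) ? $pos : $space + 1;
    }

    // Window ends inside a word: move back to the previous space, but never
    // before the end of the search term.
    if ($end < strlen($text) && $text[$end] !== ' ') {
        $space = strrpos(substr($text, 0, $end), ' ');
        $end = ($space === false || $space < $termEnd) ? $termEnd : $space;
    }

    $excerpt = trim(substr($text, $start, $end - $start));

    // Add an ellipsis only on the sides where text was actually removed.
    return ($start > 0 ? $ellipsis : '') . $excerpt . ($end < strlen($text) ? $ellipsis : '');
}

echo excerptAroundTerm('the quick brown fox jumps over the lazy dog', 'fox', 8), "\n";
// prints "...brown fox jumps..."
```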

Smarter word-wrap in PHP for long words?

I'm looking for a way to make word-wrap in PHP a bit smarter, so that it doesn't break before a long word and leave any preceding short words alone on one line.
Let's say I have this (the real text is always completely dynamic, this is just to show):
wordwrap('hello! heeeeeeeeeeeeeeereisaverylongword', 25, '<br />', true);
This outputs:
hello!
heeeeeeeeeeeeeeereisavery
longword
See, it leaves the small word alone on the first line.
How can I get it to output something more like this:
hello! heeeeeeeeeeee
eeereisaverylongword
So it utilizes any available space on each line. I have tried several custom functions, but none have been effective (or they had some drawbacks).
I've had a go at the custom function for this smart wordwrap:
function smart_wordwrap($string, $width = 75, $break = "\n") {
// split on problem words over the line length
$pattern = sprintf('/([^ ]{%d,})/', $width);
$output = '';
$words = preg_split($pattern, $string, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
foreach ($words as $word) {
if (false !== strpos($word, ' ')) {
// normal behaviour, rebuild the string
$output .= $word;
} else {
// work out how many characters would be on the current line
$wrapped = explode($break, wordwrap($output, $width, $break));
$count = $width - (strlen(end($wrapped)) % $width);
// fill the current line and add a break
$output .= substr($word, 0, $count) . $break;
// wrap any remaining characters from the problem word
$output .= wordwrap(substr($word, $count), $width, $break, true);
}
}
// wrap the final output
return wordwrap($output, $width, $break);
}
$string = 'hello! too long here too long here too heeeeeeeeeeeeeereisaverylongword but these words are shorterrrrrrrrrrrrrrrrrrrr';
echo smart_wordwrap($string, 11) . "\n";
EDIT: Spotted a couple of caveats. One major caveat with this (and also with the native function) is the lack of multibyte support.
How about
$string = "hello! heeeeeeeeeeeeeeereisaverylongword";
$break = 25;
echo implode(PHP_EOL, str_split($string, $break));
Which outputs
hello! heeeeeeeeeeeeeeere
isaverylongword
str_split() converts the string to an array of $break size chunks.
implode() joins the array back together as a string using the glue which in this case is an end of line marker (PHP_EOL) although it could as easily be a '<br/>'
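Like wordwrap(), str_split() works on bytes, so it can split a multibyte character in two. A minimal sketch of the same chunking idea that counts UTF-8 code points instead is shown below. The function name chunkUtf8 is hypothetical, and the input is assumed to be valid UTF-8.

```php
<?php
// Hypothetical helper: split $text into pieces of at most $width UTF-8
// code points and join them with $break.
function chunkUtf8(string $text, int $width, string $break = "\n"): string
{
    // /u = treat the subject as UTF-8, /s = let "." match newlines too.
    $pattern = sprintf('/.{1,%d}/su', $width);

    // preg_match_all() returns false on error (e.g. invalid UTF-8);
    // in that case hand the input back unchanged.
    if (preg_match_all($pattern, $text, $m) === false) {
        return $text;
    }

    return implode($break, $m[0]);
}

echo chunkUtf8('héllo wörld', 4, '|'), "\n";
// prints "héll|o wö|rld"
```

On PHP 7.4 or later with the mbstring extension, mb_str_split() could replace the regular expression.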
This is also a solution (for browsers etc.):
$string = 'hello! heeeeeeeeeeeeeeeeeeeeeereisaverylongword';
echo preg_replace('/([^\s]{20})(?=[^\s])/', '$1'.'<wbr>', $string);
It inserts a <wbr> after every 20 consecutive non-whitespace characters, as long as another non-whitespace character follows.
<wbr> means "word break opportunity", so the browser only breaks there if it has to (dictated by the width of the element/browser/viewer/other). It's invisible otherwise.
This is good for fluid/responsive layouts where there is no fixed width, and it does not wrap oddly like PHP's wordwrap().
You can use CSS to accomplish this.
word-wrap: break-word;
That will break the word for you. Here is a link to see it in action:
http://www.css3.info/preview/word-wrap/
This should do the trick...
$word = "hello!" . wordwrap('heeeeeeeeeeeeeeereisaverylongword', 25, '<br />', true);
echo $word;

how to CaPiTaLiZe every other character in php?

I want to CaPiTaLiZe $string in PHP, don't ask why :D
I did some research and found good answers here; they really helped me.
But in my case I want to capitalize every odd character (1, 3, 5...) in EVERY word.
For example, with my custom function I'm getting the result "TeSt eXaMpLe", but I want to get "TeSt ExAmPlE".
See that in the second example the word "example" starts with a capital "E"?
So, can anyone help me? : )
Well I would just make it an array and then put it back together again.
<?php
$str = "test example";
$str_implode = str_split($str);
$caps = true;
foreach($str_implode as $key=>$letter){
if($caps){
$out = strtoupper($letter);
if($out <> " ") //not a space character
$caps = false;
}
else{
$out = strtolower($letter);
$caps = true;
}
$str_implode[$key] = $out;
}
$str = implode('',$str_implode);
echo $str;
?>
Demo: http://codepad.org/j8uXM97o
I would use regex to do this, since it is concise and easy to do:
$str = 'I made some research and found good answers here, they really helped me.';
$str = preg_replace_callback('/(\w)(.?)/', 'altcase', $str);
echo $str;
function altcase($m){
return strtoupper($m[1]).$m[2];
}
Outputs: "I MaDe SoMe ReSeArCh AnD FoUnD GoOd AnSwErS HeRe, ThEy ReAlLy HeLpEd Me."
Example
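Here's a one-liner that should work. It uses preg_replace_callback(), because the /e modifier was removed in PHP 7.0:
echo preg_replace_callback('/(\w)(.)?/', function ($m) { return strtoupper($m[1]) . strtolower($m[2] ?? ''); }, 'test example');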
http://codepad.org/9LC3SzjC
Try:
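function capitalize($string){
    $words = array();
    foreach(explode(" ", $string) as $w){
        $word = "";
        foreach(str_split($w) as $k => $v){
            // Uppercase letters at even 0-based offsets, i.e. the 1st, 3rd, 5th... character of each word
            $word .= ($k % 2 == 0 && ctype_alpha($v)) ? strtoupper($v) : $v;
        }
        $words[] = $word;
    }
    // implode() avoids a trailing space after the last word
    return implode(" ", $words);
}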
echo capitalize("I want to CaPiTaLiZe string in php, don't ask why :D");
//I WaNt To CaPiTaLiZe StRiNg In PhP, DoN'T AsK WhY :D
Edited: Fixed the lack of special characters in the output.
This task can be performed without using capture groups -- just use ucfirst().
This is not built to process multibyte characters.
Grab a word character then, optionally, the next character. From the fullstring match, only change the case of the first character.
Code: (Demo) (or Demo)
$strings = [
"test string",
"lado lomidze needs a solution",
"I made some research and found 'good' answers here; they really helped me."
]; // if not already all lowercase, use strtolower()
var_export(preg_replace_callback('/\w.?/', function ($m) { return ucfirst($m[0]); }, $strings));
Output:
array (
0 => 'TeSt StRiNg',
1 => 'LaDo LoMiDzE NeEdS A SoLuTiOn',
2 => 'I MaDe SoMe ReSeArCh AnD FoUnD \'GoOd\' AnSwErS HeRe; ThEy ReAlLy HeLpEd Me.',
)
For other researchers, if you (more simply) just want to convert every other character to uppercase, you could use /..?/ in your pattern, but using regex for this case would be overkill. You could more efficiently use a for() loop and double-incrementation.
Code (Demo)
$string = "test string";
for ($i = 0, $len = strlen($string); $i < $len; $i += 2) {
$string[$i] = strtoupper($string[$i]);
}
echo $string;
// TeSt sTrInG
// ^-^-^-^-^-^-- strtoupper() was called here
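The loop above runs over the whole string, so the pattern does not restart at each word. One possible way to combine it with the per-word requirement from the question is sketched here: match each run of letters and apply the same double-increment loop inside the callback. The function name alternateCapsPerWord is hypothetical, and the sketch handles ASCII letters only.

```php
<?php
// Hypothetical helper: alternate upper/lower case, restarting with an
// uppercase letter at the beginning of every run of ASCII letters.
function alternateCapsPerWord(string $text): string
{
    $result = preg_replace_callback('/[A-Za-z]+/', function (array $m): string {
        // Lowercase first so mixed-case input gives a predictable result.
        $word = strtolower($m[0]);
        // Uppercase the 1st, 3rd, 5th... letter of this run.
        for ($i = 0, $len = strlen($word); $i < $len; $i += 2) {
            $word[$i] = strtoupper($word[$i]);
        }
        return $word;
    }, $text);

    // preg_replace_callback() returns null on error; fall back to the input.
    return $result ?? $text;
}

echo alternateCapsPerWord('test example'), "\n";
// prints "TeSt ExAmPlE"
```

Because an apostrophe ends a run of letters, "don't" becomes "DoN'T", which matches the output of the answers above.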
