I have an application that can only deal with text up to 100 characters in length per line.
However, I do not want to split mid-word, as that doesn't look very nice. I therefore need to find the last space before the 100th character, split there, and add each piece to the array.
I was thinking strrpos would work, but I am unsure how to continue through the rest of the text so that everything ends up in one array.
$textToDraw = 'this is a message that is over 100 characters long just to see how well that the breaks work';
$characterLimit = substr($textToDraw, 0, 100);
$textBeforeLimit = strrpos($characterLimit, ' ', 0);
Thanks
UPDATE. This is the current code I have to split the text into an array and then draw each line. However I need it to cut on the space before 100 characters - and not on a hardcoded 100 character limit.
for ($i = 0; $i < count($textToDraw); $i++) {
$splitPoint = 100;
if ( strlen($textToDraw[$i]) > $splitPoint ) {
$newTextLines = str_split($textToDraw[$i], $splitPoint);
array_splice($textToDraw, $i, 1, $newTextLines);
$i = $i + count($newTextLines) - 1;
}
}
foreach ($textToDraw as $actualTextToDraw) {
$page->drawText($actualTextToDraw, $this->x , $this->y , 'UTF-8');
}
You can use the PHP function wordwrap() as below. It wraps a string to a given number of characters using a string break character.
<?php
$textToDraw = 'this is a message that is over 100 characters long just to see how well that the breaks work';
$newtext = wordwrap($textToDraw, 100, "<br />\n");
echo $newtext;
?>
Try wordwrap().
Wraps a string to a given number of characters using a string break character.
Example adapted from the documentation page:
<?php
$text = "The quick brown fox jumped over the lazy dog.";
$newtext = wordwrap($text, 20, "<br />\n");
echo $newtext;
// Outputs:
// The quick brown fox<br />
// jumped over the lazy<br />
// dog.
Update for new info in question:
Apply it to your needs:
$text = "Here's some example text that may or may not be really really long.";
$linedText = wordwrap($text, 20, "\n");
$lines = explode("\n", $linedText);
// Do whatever with $lines.
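For the updated question, where $textToDraw is already an array of lines, the same idea can be applied to each element. This is only a minimal sketch: wrapLines is a hypothetical helper name, and the sample text and width of 20 are taken from the documentation example above rather than from the real data. The fourth argument of wordwrap() (true) also force-breaks single words that are longer than the limit.

```php
<?php
// Hypothetical helper: re-wrap every line of an array so that no element
// exceeds $limit bytes, breaking on spaces where possible.
function wrapLines(array $lines, int $limit = 100): array
{
    $wrapped = [];
    foreach ($lines as $line) {
        // wordwrap() puts "\n" at the last space that fits within $limit;
        // passing true also cuts single words longer than $limit.
        foreach (explode("\n", wordwrap($line, $limit, "\n", true)) as $part) {
            $wrapped[] = $part;
        }
    }
    return $wrapped;
}

$textToDraw = wrapLines(['The quick brown fox jumped over the lazy dog.'], 20);
print_r($textToDraw);
```

Note that wordwrap() counts bytes, not characters, so UTF-8 text with multibyte characters may wrap earlier than the visible length suggests.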
I have this text: http://pastebin.com/2Zgbs7hi
I want to remove the HTML code from it and display just the plain text, but keep at least one line break wherever there are currently several line breaks.
I have tried:
$ticket["summary"] = 'pastebin example';
$TicketSummaryDisplay = nl2br($ticket["summary"]);
$TicketSummaryDisplay = stripslashes($TicketSummaryDisplay);
$TicketSummaryDisplay = trim(strip_tags($TicketSummaryDisplay));
$TicketSummaryDisplay = preg_replace('/\n\s+$/m', '', $TicketSummaryDisplay);
echo $TicketSummaryDisplay;
That displays as plain text, but it shows it all as one big block of text with no line breaks at all.
Maybe this will save you some time.
<?php
libxml_use_internal_errors(true); //crazy o tags
$html = file_get_contents('http://pastebin.com/raw.php?i=2Zgbs7hi');
$dom = new DOMDocument;
$dom->loadHTML($html);
$result='';
foreach ($dom->getElementsByTagName('p') as $node) {
if (strstr($node->nodeValue, 'Legal Disclaimer:')){
break;
}
$result .= $node->nodeValue;
}
echo $result;
This example should successfully store text from html into an array of strings.
After stripping all the tags, you can use preg_split with the \R special character (matches any newline sequence) to convert the string into an array. That array will have several blank values, and also a number of HTML non-breaking space entities, so we filter the array with array_filter() (it removes all items that do not satisfy the filter condition; in our case, empty values). There is a problem with the &nbsp; entity: &nbsp; and the space character are not the same, they have different character codes, so trim() will not remove &nbsp;. Here are two possible solutions: the first, uncommented part only replaces &nbsp; and checks for whitespace characters, while the second, commented one decodes all HTML entities and also checks for spaces.
PHP:
$text = file_get_contents( 'http://pastebin.com/raw.php?i=2Zgbs7hi' );
$text = strip_tags( $text );
$array = array_filter(
preg_split( '/\R/', $text ),
function( &$item ) {
$item = str_replace( '&nbsp;', ' ', $item );
return trim( $item );
// $item = html_entity_decode( $item );
// return trim( str_replace( "\xC2\xA0", ' ', $item ) );
}
);
foreach( $array as $value ) {
echo $value . '<br />';
}
Array output:
Array
(
[8] => Hi,
[11] => Ashley has explained that I need to ask for another line and broadband for the wifi to work, please can you arrange this.
[13] => Regards
[23] => Legal Disclaimer:
[24] => This email and its attachments are confidential. If you received it by mistake, please don’t share it. Let us know and then delete it. Its content does not necessarily represent the views of The Dragon Enterprise
[25] => Centre and we cannot guarantee the information it contains is complete. All emails are monitored and may be seen by another member of The Dragon Enterprise Centre's staff for internal use
)
Now you should have a clean array containing only items that have a value. By the way, line breaks in HTML are expressed through <br />, not through \n. Your example, viewed as a response in a web browser, still contains the \n characters, but they are only visible in the page source. I hope I did not miss the point of the question.
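If a single string is wanted instead of an array, the same steps can collapse each run of line breaks into exactly one. This is a rough sketch under assumptions: plainTextWithBreaks is a hypothetical helper name, the sample markup is invented for illustration, and the input is assumed to be valid UTF-8 (the u modifier requires that).

```php
<?php
// Hypothetical helper: strip tags, then collapse every run of line breaks
// (plus the whitespace around them) into a single "\n".
function plainTextWithBreaks(string $html): string
{
    $text = strip_tags($html);
    // Decode entities such as &nbsp; into real characters, then map the
    // UTF-8 non-breaking space to an ordinary space so it can be trimmed.
    $text = html_entity_decode($text, ENT_QUOTES, 'UTF-8');
    $text = str_replace("\xC2\xA0", ' ', $text);
    // \R matches any newline sequence; the \s* on both sides swallows
    // blank and space-only lines in between.
    $collapsed = preg_replace('/\s*\R\s*/u', "\n", $text);
    if ($collapsed === null) {
        $collapsed = $text; // regex failed (e.g. invalid UTF-8): keep as-is
    }
    return trim($collapsed);
}

$sample = "<p>Hi,</p>\r\n\r\n&nbsp;\r\n<p>Regards</p>\n\n\n<p>Legal Disclaimer:</p>";
// nl2br() turns the remaining "\n" into <br /> so a browser shows the breaks.
echo nl2br(htmlspecialchars(plainTextWithBreaks($sample)));
```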
Try this to get text output with line breaks:
<?php
$ticket["summary"] = file_get_contents('http://pastebin.com/raw.php?i=2Zgbs7hi');
$TicketSummaryDisplay = nl2br($ticket["summary"]);
echo strip_tags($TicketSummaryDisplay,'<br>');
?>
You are asking how to add line breaks to your "one big block of text with no line breaks at all".
Short answer
After you have stripped the HTML tags, apply wordwrap with the desired text-block length:
$text = wordwrap($text, 90, "<br />\n");
I really wonder, why nobody suggested that function before.
There is also chunk_split, which doesn't take words into account and just splits after a certain number of characters, breaking words in the process. That's not what you want, I guess.
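To make the difference between the two functions concrete, here is a small comparison. The sample string and the width of 8 are arbitrary choices for illustration only.

```php
<?php
$text = 'The quick brown fox';

// wordwrap() breaks at the last space that still fits within the width.
$wrapped = wordwrap($text, 8, "\n");

// chunk_split() cuts every 8 bytes regardless of word boundaries and
// also appends the separator after the final chunk.
$chunked = chunk_split($text, 8, "\n");

echo $wrapped, "\n---\n", $chunked;
```

With these inputs wordwrap() keeps every word intact, while chunk_split() cuts "quick" in two.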
PHP
<?php
$text = file_get_contents('http://pastebin.com/raw.php?i=2Zgbs7hi');
/**
* Returns string without html tags; also takes
* control chars, spaces and "&nbsp;" into account.
*/
function dropHtmlTags($string) {
// remove html tags
//$string = preg_replace ('/<[^>]*>/', ' ', $string);
$string = strip_tags($string);
// control characters and "&nbsp;"
$string = str_replace("\r", '', $string); // remove
$string = str_replace("\n", ' ', $string); // replace with space
$string = str_replace("\t", ' ', $string); // replace with space
$string = str_replace("&nbsp;", ' ', $string);
// remove multiple spaces
$string = preg_replace('/ {2,}/', ' ', $string);
$string = trim($string);
return $string;
}
$text = dropHtmlTags($text);
// The Answer: insert line breaks after 95 chars,
// to get rid of the "one big block of text with no line breaks at all"
$text = wordwrap($text, 95, "<br />\n");
// if you want to insert line-breaks before the legal disclaimer,
// uncomment the next line
//$text = str_replace("Regards Legal Disclaimer", "<br /><br />Regards Legal Disclaimer", $text);
echo $text;
?>
Result
first section shows your text block
second section shows the text with wordwrap applied (code from above)
Hello it can be done as follows:
$abc= file_get_contents('http://pastebin.com/raw.php?i=2Zgbs7hi');
$abc = strip_tags($abc);
echo $abc;
Please, let me know whether it works
you may use
<?php
$a= file_get_contents('a.txt');
echo nl2br(htmlspecialchars($a));
?>
<?php
$handle = @fopen("pastebin.html", "r");
if ($handle) {
while (!feof($handle)) {
$buffer = fgetss($handle, 4096); // note: fgetss() was removed in PHP 8.0; strip_tags(fgets($handle, 4096)) does the same there
echo $buffer;
}
fclose($handle);
}
?>
output is
Hi,
Ashley has explained that I need to ask for another line and broadband for the wifi to work, please can you arrange this.
Regards
Legal Disclaimer:
This email and its attachments are confidential. If you received it by mistake, please don’t share it. Let us know and then delete it. Its content does not necessarily represent the views of The Dragon Enterprise
Centre and we cannot guarantee the information it contains is complete. All emails are monitored and may be seen by another member of The Dragon Enterprise Centre's staff for internal use
You can probably write additional code to convert &nbsp; to spaces etc.
I'm not sure I understood everything correctly, but this seems to be your expected result:
$txt = file_get_contents('http://pastebin.com/raw.php?i=2Zgbs7hi');
var_dump(preg_replace("/(\&nbsp\;(\s{1,})?)+/", "\n", trim(strip_tags(preg_replace("/(\s){1,}/", " ", $txt)))));
//more readable
$txt = preg_replace("/(\s){1,}/", " ", $txt);
$txt = trim(strip_tags($txt));
$txt = preg_replace("/(\&nbsp\;(\s{1,})?)+/", "\n", $txt);
The strip_tags() function strips HTML and PHP tags from a string, if that is what you are trying to accomplish.
Examples from the docs:
<?php
$text = '<p>Test paragraph.</p><!-- Comment --> Other text';
echo strip_tags($text);
echo "\n";
// Allow <p> and <a>
echo strip_tags($text, '<p><a>');
?>
The above example will output:
Test paragraph. Other text
<p>Test paragraph.</p> Other text
Maybe you guys can help:
I have a variable called $bio with bio data.
$bio = "Hello, I am John, I'm 25, I like fast cars and boats. I work as a blogger and I'm way cooler then the author of the question";
I search $bio using a set of functions that look for a certain word, let's say "author", and add a span class around that word, so I get:
$bio = "Hello, I am John, I'm 25, I like fast cars and boats. I work as a blogger and I'm way cooler then the <span class=\"highlight\">author</span> of the question";
I use a function to limit the text to 85 chars:
$bio = limit_text($bio,85);
The problem occurs when there are more than 80 chars before the word "author" in $bio.
When limit_text() is applied, I won't see the highlighted word author.
What I need is for the limit_text() function to work as normal, but to append all the words wrapped in the highlight span class at the end.
Something like this:
*"This is the limited text to 85 chars, but there are no words with the span class highlight so I am putting to be continued ... **author**, **author2** (and all the other words that have a span class highlight around them separate by comma "*
Hope you understood what I mean, if not, please comment and I'll try to explain better.
Here is my limit_text() function:
function limit_text($text, $length){ // Limit Text
if(strlen($text) > $length) {
$stringCut = substr($text, 0, $length);
$text = substr($stringCut, 0, strrpos($stringCut, ' '));
}
return $text;
}
UPDATE:
$xturnons = str_replace(",", ", ", $xturnons);
$xbio = str_replace(",", ", ", $xbio);
$xbio = customHighlights($xbio,$toHighlight);
$xturnons = customHighlights($xturnons,$toHighlight);
$xbio = limit_text($xbio,85);
$xturnons = limit_text($xturnons,85);
The customHighlights function which adds the span class highlighted:
function addRegEx($word){ // Highlight Words
return "/" . $word . '[^ ,\,,.,?,\.]*/i';
}
function highlight($word){
return "<span class='highlighted'>".$word[0]."</span>";
}
function customHighlights($searchString,$toHighlight){
$searchFor = array_map('addRegEx',$toHighlight);
$result = preg_replace_callback($searchFor,'highlight',$searchString);
return $result;
}
This change to your limit_text function will take the text and cut it if it's longer than the given $length. If you pass a $needle to it, it will search for the first occurrence of it and end your sentence with it.
Also, if the text is cut before its actual length, it will add $addition to it, while still preserving the limit of $length characters.
I've included a usage example and a sample output, based on your example, below:
<?php
/**
* $text - The text to cut from
* $length - The amount of characters that should be returned
* $needle - If needle is given and found in the text, and it is
* at least $length far from the start of the string - it will end the sentence with it.
* $addition - If the sentence was cut in the middle, will add it to the end of it.
**/
function limit_text($text, $length, $needle="", $addition="...") {
if(strlen($text) > $length) {
$length -= strlen($addition);
$start = 0;
$trimLast = true;
if (!empty($needle)) {
$needleStart = strpos($text, $needle);
if ($needleStart > $length) {
$length -= strlen($needle);
$start = $needleStart + strlen($needle) - $length;
$trimLast = false;
}
}
$stringCut = substr($text, max(0, $start), $length);
if ($start > 0) {
$stringCut = substr($stringCut, strpos($stringCut, ' ')+1);
}
if ($trimLast) {
$lastWhitespace = strrpos($stringCut, ' ');
$stringCut = substr($stringCut, 0, $lastWhitespace);
}
// split into words (so we won't replace words that contain it in the middle)
// and wrap $needle with <span class="highlighted"></span>
if (!empty($needle)) {
$words = explode(" ", $stringCut);
$needles = array_keys($words, $needle);
foreach ($needles as $needleKey) {
$words[$needleKey] = "<span class=\"highlighted\">$needle</span>";
}
$stringCut = implode(" ", $words);
}
$text = $stringCut.$addition;
}
return $text;
}
$bio = "Hello, I am John, I'm 25, I like fast cars and boats. I work as a blogger and I'm way cooler then the author of the question";
$text = limit_text($bio, 85, "author");
var_dump($text);
Output:
string (111) "fast cars and boats. I work as a blogger and I'm way cooler then the <span class="highlighted">author</span>..."
First, you need to make sure you don't break words apart by shortening the string. Then you need to append all of the <span class="highlight"> tokens to the end of the shortened string. Here is what I came up with (in about 8 lines!):
function limit_text($text, $length){
if( strlen( $text) < $length) {
return $text;
}
// Truncate the string without breaking words
list( $wrapped) = explode("\n", wordwrap( $text, $length));
// Get the span of text occurring after the wrapped string
$remainder = substr( $text, strlen( $wrapped));
// Add the "to be continued" to $wrapped
$wrapped .= ' to be continued ... ';
// Now, grab all of the <span class="highlight"></span> tags in the $remainder
preg_match_all( '#<span class="highlight">[^<]+</span>#i', $remainder, $matches);
// Add the <span> tags to the end of the string, separated by a comma, if present
$wrapped .= implode( ', ', $matches[0]);
return $wrapped;
}
Now, with your original test:
$bio = "Hello, I am John, I'm 25, I like fast cars and boats. I work as a blogger and I'm way cooler then the <span class=\"highlight\">author</span> of the question";
$bio = limit_text( $bio,85);
var_dump( htmlentities( $bio));
This outputs:
string(165) "Hello, I am John, I'm 25, I like fast cars and boats. I work as a blogger and I'm way to be continued ... <span class="highlight">author</span>"
Now, another test with multiple <span> tags:
$bio = 'Hello, what about a <span class="highlight">span tag</span> before the limit? Or what if I have many <span class="highlight">span tags</span> <span class="highlight">after</span> <span class="highlight">the</span> limit?';
$bio = limit_text( $bio,85);
var_dump( htmlentities( $bio));
This outputs:
string(308) "Hello, what about a <span class="highlight">span tag</span> before the limit? Or what to be continued ... <span class="highlight">span tags</span>, <span class="highlight">after</span>, <span class="highlight">the</span>"
If you have more test cases, or have a modification to the function above, let me know and I can fix it!
Judging from your requirements, this should do what you want:
function get_highlighted_string($s)
{
return '<span class="highlight">' . htmlspecialchars($s) . '</span>';
}
function limit_text($text, $max_length, array $keywords = array(), $continued = '...')
{
// highlights to put after the cut string
$extra = array();
// highlight keywords
if ($keywords) {
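// quote every keyword with the '~' delimiter (not only the first one)
$re = '~\b(' . join('|', array_map(function ($keyword) { return preg_quote($keyword, '~'); }, $keywords)) . ')\b~i';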
// get all matches and capture their positions as well
if (preg_match_all($re, $text, $matches, PREG_OFFSET_CAPTURE)) {
// we reverse the matches by position to make replacement easier
foreach (array_reverse($matches[1]) as $match) {
// $match[0] = match
// $match[1] = start position
$match_len = strlen($match[0]);
if ($match[1] + $match_len <= $max_length) {
// still fits in cut string
$match_replacement = get_highlighted_string($match[0]);
$text = substr_replace($text, $match_replacement, $match[1], $match_len);
// update max length
$max_length = $max_length - $match_len + strlen($match_replacement);
} else {
// will not fit in the cut string, so we place it outside
array_unshift($extra, get_highlighted_string($match[0]));
}
}
}
// use wordwrap and strcspn to cut the string by word boundaries
if (strlen($text) > $max_length) {
$text = substr($text, 0, strcspn(wordwrap($text, $max_length, "\0"), "\0")) . " $continued";
}
}
if ($extra) {
// append what we couldn't fit in the cut string
$text .= ' ' . join(', ', $extra);
}
return $text;
}
Example:
echo limit_text("Hello, I like fast cars and boats. I work as a blogger I'm way cooler then the author of the question", 85, array('author', 'question'));
Hello, I like fast cars and boats. I work as a blogger I'm way cooler then the <span class="highlight">author</span> ... <span class="highlight">question</span>
In the example, the cut-off is exactly at author, so that highlight comes before the ..., while the question keyword gets put behind it.
Another example:
echo limit_text("Hello, I am John, I'm 25, I like fast cars and boats. I work as a blogger and I'm way cooler then the author of the question", 85, array('author', 'question'));
Hello, I am John, I'm 25, I like fast cars and boats. I work as a blogger and I'm way ... <span class="highlight">author</span>, <span class="highlight">question</span>
Both keywords are beyond the 85 character marker, so they are appended at the back, comma separated.
Let me know if this works for you :)
I have a simple text with HTML tags, for example:
Once <u>the</u> activity reaches the resumed state, you can freely add and remove fragments to the activity. Thus, <i>only</i> while the activity is in the resumed state can the <b>lifecycle</b> of a <hr/> fragment change independently.
I need to replace some parts of this text while ignoring its HTML tags. For example, the string Thus, <i>only</i> while should be replaced with my string Hello, <i>its only</i> while. The text and the strings to be replaced are dynamic. I need your help with my preg_replace pattern.
$text = '<b>Some html</b> tags with <u>and</u> there are a lot of tags <i>in</i> this text';
$arrayKeys= array('Some html' => 'My html', 'and there' => 'is there', 'in this text' => 'in this code');
foreach ($arrayKeys as $key => $value)
$text = preg_replace('...$key...', '...$value...', $text);
echo $text; // output should be: <b>My html</b> tags with <u>is</u> there are a lot of tags <i>in</i> this code
Please help me to find solution. Thank you
Basically, we're going to build dynamic arrays of patterns and replacements from the plain text using regex. This code only matches what was originally asked for, but you should be able to get an idea of how to edit the code from the way I've spelled it all out. We catch either an opening or a closing tag, plus surrounding whitespace, as a pass-through group and replace the text around it. This is set up for two- and three-word combinations.
<?php
$text = '<b>Some html</b> tags with <u>and</u> there are a lot of tags <i>in</i> this text';
$arrayKeys= array(
'Some html' => 'My html',
'and there' => 'is there',
'in this text' =>'in this code');
function make_pattern($string){
$patterns = array(
'!(\w+)!i',
'#^#',
'! !',
'#$#');
$replacements = array(
"($1)",
'!',
//This next line is where we capture the possible tag or
//whitespace so we can ignore it and pass it through.
'(\s?<?/?[^>]*>?\s?)',
'!i');
$new_string = preg_replace($patterns,$replacements,$string);
return $new_string;
}
function make_replacement($replacement){
$patterns = array(
'!^(\w+)(\s+)(\w+)(\s+)(\w+)$!',
'!^(\w+)(\s+)(\w+)$!');
$replacements = array(
'$1\$2$3\$4$5',
'$1\$2$3');
$new_replacement = preg_replace($patterns,$replacements,$replacement);
return $new_replacement;
}
foreach ($arrayKeys as $key => $value){
$new_Patterns[] = make_pattern($key);
$new_Replacements[] = make_replacement($value);
}
//For debugging
//print_r($new_Patterns);
//print_r($new_Replacements);
$new_text = preg_replace($new_Patterns,$new_Replacements,$text);
echo $new_text."\n";
echo $text;
?>
Output
<b>My html</b> tags with <u>is</u> there are a lot of tags <i>in</i> this code
<b>Some html</b> tags with <u>and</u> there are a lot of tags <i>in</i> this text
Here we go. This piece of code should work, assuming you respect just two constraints:
Pattern and replacement must have the same number of words. (Logical, since you want to keep position.)
You must not split a word around a tag. (<b>Hel</b>lo World won't work.)
But if these are respected, this should work just fine !
<?php
// Splits a string in parts delimited with the sequence.
// '<b>Hey</b> you' becomes '~-=<b>~-=Hey~-=</b>~-= you' that make us get
// array ("<b>", "Hey", "</b>", " you")
function getTextArray ($text, $special) {
$text = preg_replace ('#(<.*>)#isU', $special . '$1' . $special, $text); // Adding spaces to make explode work fine.
return preg_split ('#' . $special . '#', $text, -1, PREG_SPLIT_NO_EMPTY);
}
$text = "
<html>
<div>
<p>
<b>Hey</b> you ! No, you don't have <em>to</em> go!
</p>
</div>
</html>";
$replacement = array (
"Hey you" => "Bye me",
"have to" => "need to",
"to go" => "to run");
// This is a special sequence that you must be sure to find nowhere in your code. It is used to split sequences, and will disappear.
$special = '~-=';
$text_array = getTextArray ($text, $special);
// $restore is the array that will finally contain the result.
// Now we're only storing the tags.
// We'll be storing the text later.
//
// $clean_text is the text without the tags, but with the special sequence instead.
$restore = array ();
$clean_text = ''; // initialise before appending with .= below
for ($i = 0; $i < sizeof ($text_array); $i++) {
$str = $text_array[$i];
if (preg_match('#<.+>#', $str)) {
$restore[$i] = $str;
$clean_text .= $special;
}
else {
$clean_text .= $str;
}
}
// Here comes the tricky part.
// We wanna keep the position of each part of the text so the tags don't
// move after.
// So we're making the regex look like (~-=)*Hey(~-=)* you(~-=)*
// And the replacement look like $1Bye$2 me $3.
// So that we keep the separators at the right place.
foreach ($replacement as $regex => $newstr) {
$regex_array = explode (' ', $regex);
$regex = '(' . $special . '*)' . implode ('(' . $special . '*) ', $regex_array) . '(' . $special . '*)';
$newstr_array = explode (' ', $newstr);
$newstr = "$1";
for ($i = 0; $i < count ($regex_array) - 1; $i++) {
$newstr .= $newstr_array[$i] . '$' . ($i + 2) . ' ';
}
$newstr .= $newstr_array[count($regex_array) - 1] . '$' . (count ($regex_array) + 1);
$clean_text = preg_replace ('#' . $regex . '#isU', $newstr, $clean_text);
}
// Here we re-split one last time.
$clean_text_array = preg_split ('#' . $special . '#', $clean_text, -1, PREG_SPLIT_NO_EMPTY);
// And we merge with $restore.
for ($i = 0, $j = 0; $i < count ($text_array); $i++) {
if (!isset($restore[$i])) {
$restore[$i] = $clean_text_array[$j];
$j++;
}
}
// Now we reorder everything, and make it go back to a string.
ksort ($restore);
$result = implode ($restore);
echo $result;
?>
Will output Bye me ! No, you don't need to run!
[EDIT] Now supports a custom separator pattern, which avoids adding useless spaces.
I am looking for a quick way to replace text in a string that is between two tags.
The string contains <!-- Model # Start --> <!-- Model # End --> Tags.
I just want to replace what is between the tags, I believe preg_replace() will do this but I am not sure how to make it work.
To use preg_replace(), pass in a regular expression, the replacement text, and the original string; the string with the replacements applied is returned. There is not much more to say about that function, as you need to understand regular expressions to use it.
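For the comment markers from the question, a regex version could look roughly like this. It is only a sketch: the surrounding text and both model numbers are made up for illustration.

```php
<?php
$html = 'Part: <!-- Model # Start -->ABC-123<!-- Model # End --> in stock';

// Capture both markers so they can be put back around the new text.
// The s modifier lets "." span line breaks; ".*?" is lazy, so each
// Start/End pair is handled on its own.
$pattern = '/(<!-- Model # Start -->).*?(<!-- Model # End -->)/s';
$result  = preg_replace($pattern, '${1}XYZ-789${2}', $html);

echo $result;
// Part: <!-- Model # Start -->XYZ-789<!-- Model # End --> in stock
```

If the replacement text comes from user input and may contain $ or \ characters, preg_replace_callback() is the safer choice, because its return value is inserted literally.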
Here is a programmatic solution. It is possibly not the most efficient code, but it gives you an indication of what it is doing.
$tagOne = "[";
$tagTwo = "]";
$replacement = "Greg";
$text = "Hello, my name is [NAME]";
$startTagPos = strrpos($text, $tagOne);
$endTagPos = strrpos($text, $tagTwo);
$tagLength = $endTagPos - $startTagPos + 1;
$text = substr_replace($text, $replacement, $startTagPos, $tagLength);
echo $text;
Outputs: Hello, my name is Greg.
$tagOne = "[";
$tagTwo = "]";
$replacement = "Greg";
$text = "Hello, my name is [NAME] endie ho [NAME] \n [NAME]";
$textLength = strlen($text);
for ($i = 0; $i < $textLength; $i++) {
$startTagPos = strrpos($text, $tagOne);
$endTagPos = strrpos($text, $tagTwo);
$tagLength = $endTagPos - $startTagPos + 1;
if ($startTagPos<>0) $text = substr_replace($text, $replacement, $startTagPos, $tagLength);
}
echo $text;
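Please note the if statement that checks whether a tag position is left. The loop runs once per character of the text, which is far more often than there are tags; without the check, once all tags are replaced the start position evaluates to 0 for the remaining iterations, and the character at position 0 (here 'H') would get replaced with Greg :)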
The above outputs
Hello, my name is Greg endie ho Greg Greg
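The loop above runs once per character of the text. A variant that scans forward with strpos and stops as soon as no opening tag is left could look like the sketch below; replaceTagged is a hypothetical helper name, and both delimiters are assumed to be non-empty strings.

```php
<?php
// Hypothetical helper: replace every $open...$close span (delimiters
// included) with $replacement, scanning from left to right.
function replaceTagged(string $text, string $open, string $close, string $replacement): string
{
    $offset = 0;
    while (($start = strpos($text, $open, $offset)) !== false) {
        $end = strpos($text, $close, $start + strlen($open));
        if ($end === false) {
            break; // unmatched opening delimiter: leave the rest untouched
        }
        $length = $end - $start + strlen($close);
        $text   = substr_replace($text, $replacement, $start, $length);
        // Continue after the inserted text so it is never scanned again.
        $offset = $start + strlen($replacement);
    }
    return $text;
}

echo replaceTagged("Hello, my name is [NAME] endie ho [NAME] \n [NAME]", '[', ']', 'Greg');
```

Because the result of strpos is compared with !== false, a tag at position 0 is handled correctly as well.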