I have the contents of a textarea stored in a PHP string after it is submitted by the user. I would like to reformat that string so that it displays as a list when it is echoed. In other words, I need to insert UL and /UL at the beginning and end, respectively, and LI and /LI at the beginning and end of each line.
Before I mess with my code, I was wondering if anyone knows whether this is even possible. Are carriage returns sent when a textarea is submitted? Any help/comments would be much appreciated.
[EDIT]
I have defined some variables to give myself all the necessary HTML stuff. The 'repertoire' variable is the original string containing text sent from user input.
$repertoire = ($_POST['repertoire']);
$list_start = '<UL>';
$list_end = '</UL>';
$list_start_line = '<LI>';
$list_end_line = '</LI>';
The following is an example of what would be submitted by the user, and therefore, what would constitute the original $repertoire string:
Luciano Berio - Circles
Mike Svoboda - Piangero la sorte mia
Nicholas von Ritter-Zahony - New Piece
Stefano Gervasoni - Due Poesie Francesi di Rilke
So we would at least need the following:
$repertoire_formatted = $list_start . $repertoire . $list_end;
...but I don't know how to substitute <LI> for line breaks; also, I cannot know in advance the length of the string or of each line.
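You can use a regex to select every line and wrap it in <li></li>. Excluding \r as well as \n keeps stray carriage returns out of the items, since browsers submit textarea line breaks as \r\n:
$html = preg_replace("/([^\r\n]+)/", "<li>$1</li>", $repertoire);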
$html = "<ul>\n$html</ul>";
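If you would rather avoid the regex replace, here is a minimal sketch of the same idea that splits the text into lines and escapes each one. The function name text_to_list is made up for illustration, and it assumes the input is UTF-8 and that blank lines should be skipped.

```php
<?php
// Hypothetical helper (not from the original posts): turn multi-line text into a <ul>.
function text_to_list(string $text): string
{
    // \R matches any line-break sequence (\r\n, \n or \r).
    $lines = preg_split('/\R/', $text);
    $items = '';
    foreach ($lines as $line) {
        $line = trim($line);
        if ($line === '') {
            continue; // skip blank lines
        }
        // Escape the user's text so it cannot inject markup into the page.
        $items .= '<li>' . htmlspecialchars($line, ENT_QUOTES, 'UTF-8') . "</li>\n";
    }
    return "<ul>\n" . $items . "</ul>";
}

echo text_to_list("Luciano Berio - Circles\r\nMike Svoboda - Piangero la sorte mia");
```

Escaping each line is a design choice on my part: the regex version above echoes whatever the user typed, including any HTML.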
Check result in demo
Related
I'm not sure what the terminology is, but basically I have a site that uses the "tag-it" system. Currently you can click on a tag and it takes the user to
topics.php?tags=example
My question is what sort of scripting or coding would be required to be able to add additional links?
topics.php?tags=example&tags=example2
or
topics.php?tags=example+example2
Here is the code in how my site is linked to tags.
header("Location: topics.php?tags={$t}");
or
<?php echo strtolower($fetch_name->tags);?>
Thanks for any hints or tips.
You cannot really pass tags twice as a GET parameter, although you can pass it as an array:
topics.php?tags[]=example&tags[]=example2
Assuming this is what you want, try:
$string = "topics.php?";
foreach ($tags as $t)
{
    // use the same parameter name as above (tags[]) and URL-encode each value
    $string .= "tags[]=" . urlencode($t) . "&";
}
$string = substr($string, 0, -1);
We iterate through the array, appending each value to $string. The last line removes the extra & that is left after the final iteration.
There is also another option that looks a bit dirtier but might be better, depending on your needs:
$string = "topics.php?tags[]=" . implode("&tags[]=", array_map('urlencode', $tags));
Note: make sure the $tags array is not empty.
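As a further sketch of the array approach, PHP's built-in http_build_query can assemble the query string and handle the encoding for you. The $tags values here are made up for illustration.

```php
<?php
// Example tag list (assumed data, not from the original post).
$tags = ['example', 'example 2'];

// http_build_query URL-encodes keys and values and joins them with "&".
$url = 'topics.php?' . http_build_query(['tags' => $tags]);

// On the receiving side PHP turns tags[...] parameters back into an array;
// parse_str is used here only to demonstrate that round trip.
parse_str(parse_url($url, PHP_URL_QUERY), $params);
print_r($params['tags']);
```

With an empty $tags array http_build_query returns an empty string, so the URL ends in a bare "?" rather than breaking.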
topics.php?tags=example&tags=example2
will not work as expected in the back end: PHP keeps only the last value of a repeated parameter.
You have to put all the data in one variable:
topics.php?tags=example+example2
This looks good; you can read it in the back end and explode it on the + sign:
// topics.php
<?php
...
// $_GET has already decoded each + to a space, so re-encode before splitting on +
$tags = urlencode($_GET['tags']);
$tags_arr = explode('+', $tags); // array of all tags
$current_tags = ""; //make this accessible in the view;
if($tags){
$current_tags = $tags ."+";
}
//show your data
?>
Edit:
You can create the front-end tag links like this:
<a href="topics.php?tags=<?php echo $current_tags ;?>horror">
horror
</a>
I have an issue where I have displayed up to 400 characters of a string that is pulled from the database, however, this string is required to contain HTML Entities.
By chance, the client has created the string to have the 400th character to sit right in the middle of a closing P tag, thus killing the tag, resulting in other errors for code after it.
I would prefer this closing P tag to be removed entirely as I have a "...read more" link attached to the end which would look cleaner if attached to the existing paragraph.
What would be the best approach for this to cover all HTML Entity issues? Is there a PHP function that will automatically close off/remove any erroneous HTML tags? I don't need a coded answer, just a direction will help greatly.
Thanks.
Here's a simple way you can do it with DOMDocument. It's not perfect, but it may be of interest:
<?php
function html_tidy($src){
libxml_use_internal_errors(true);
$x = new DOMDocument;
$x->loadHTML('<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />'.$src);
$x->formatOutput = true;
$ret = preg_replace('~<(?:!DOCTYPE|/?(?:html|body|head))[^>]*>\s*~i', '', $x->saveHTML());
return trim(str_replace('<meta http-equiv="Content-Type" content="text/html;charset=utf-8">','',$ret));
}
$brokenHTML[] = "<p><span>This is some broken html</spa";
$brokenHTML[] = "<poken html</spa";
$brokenHTML[] = "<p><span>This is some broken html</spa</p>";
/*
<p><span>This is some broken html</span></p>
<poken html></poken>
<p><span>This is some broken html</span></p>
*/
foreach($brokenHTML as $test){
echo html_tidy($test);
}
?>
Though take note of Mike 'Pomax' Kamermans's comment.
Why not simply remove the last word of the excerpt? Whether that word is complete or cut off, you remove it, and you can be sure the content is still clean. Here is an example of what the code would look like:
while($row = $req->fetch(PDO::FETCH_OBJ)){
//extract 400 first characters from the content you need to show
$extraction = substr($row->text, 0, 400);
// find the last space in this extraction
$last_space = strrpos($extraction, ' ');
//take content from the first character to the last space and add (...)
echo substr($extraction, 0, $last_space) . ' ...';
}
Just remove the last broken tag and then call strip_tags:
$str = "<p>this is how we do</p";
$str = substr($str, 0, strrpos($str, "<"));
$str = strip_tags($str);
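The two ideas above (drop the broken tag, then cut at the last space) can be combined. This is only a sketch under some assumptions: the helper name safe_excerpt is invented, the excerpt is wanted as plain text, and the content is single-byte (use the mb_* functions otherwise).

```php
<?php
// Hypothetical helper: plain-text excerpt of at most $limit bytes of $html.
function safe_excerpt(string $html, int $limit = 400): string
{
    $cut = substr($html, 0, $limit);

    // If the cut ends inside a tag (a "<" with no ">" after it), drop that partial tag.
    $lastOpen = strrpos($cut, '<');
    if ($lastOpen !== false && strpos($cut, '>', $lastOpen) === false) {
        $cut = substr($cut, 0, $lastOpen);
    }

    $text = strip_tags($cut);

    // Only when the source was actually shortened: back up to the last space
    // so the excerpt does not end in half a word.
    if (strlen($html) > $limit) {
        $lastSpace = strrpos($text, ' ');
        if ($lastSpace !== false) {
            $text = substr($text, 0, $lastSpace);
        }
    }

    return rtrim($text) . ' ...';
}

echo safe_excerpt('<p>this is how we do</p');
```

Unlike cutting at the last "<" unconditionally, this only removes a tag when it is really incomplete.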
If I have a string of HTML, maybe like this...
<h2>Header</h2><p>all the <span class="bright">content</span> here</p>
And I want to manipulate the string so that all words are reversed for example...
<h2>redaeH</h2><p>lla eht <span class="bright">tnetnoc</span> ereh</p>
I know how to extract the string from the HTML and manipulate it by passing to a function and getting a modified result, but how would I do so whilst retaining the HTML?
I would prefer a non-language specific solution, but it would be useful to know php/javascript if it must be language specific.
Edit
I also want to be able to manipulate text that spans several DOM elements...
Quick<em>Draw</em>McGraw
warGcM<em>warD</em>kciuQ
Another Edit
Currently, I am thinking to somehow replace all HTML nodes with a unique token, whilst storing the originals in an array, then doing a manipulation which ignores the token, and then replacing the tokens with the values from the array.
This approach seems overly complicated, and I am not sure how to replace all the HTML without using REGEX which I have learned you can go to the stack overflow prison island for.
Yet Another Edit
I want to clarify an issue here. I want the text manipulation to happen over x number of DOM elements - so for example, if my formula randomly moves letters in the middle of a word, leaving the start and end the same, I want to be able to do this...
<em>going</em><i>home</i>
Converts to
<em>goonh</em><i>gmie</i>
So the HTML elements remain untouched, but the string content inside is manipulated (as a whole - so goinghome is passed to the manipulation formula in this example) in any way chosen by the manipulation formula.
If you want to achieve a similar visual effect without changing the text, you could cheat with CSS:
h2, p {
direction: rtl;
unicode-bidi: bidi-override;
}
this will reverse the text
example fiddle: http://jsfiddle.net/pn6Ga/
I ran into this situation a long time ago and used the following code. Here is a rough version:
<?php
function keepcase($word, $replace) {
$replace[0] = (ctype_upper($word[0]) ? strtoupper($replace[0]) : $replace[0]);
return $replace;
}
// regex - match the contents grouping into HTMLTAG and non-HTMLTAG chunks
$re = '%(</?\w++[^<>]*+>) # grab HTML open or close TAG into group 1
| # or...
([^<]*+(?:(?!</?\w++[^<>]*+>)<[^<]*+)*+) # grab non-HTMLTAG text into group 2
%x';
$contents = '<h2>Header</h2><p>the <span class="bright">content</span> here</p>';
// walk through the content, chunk by chunk, replacing words in non-HTMLTAG chunks only
$contents = preg_replace_callback($re, 'callback_func', $contents);
function callback_func($matches) { // here's the callback function
if ($matches[1]) { // Case 1: this is a HTMLTAG
return $matches[1]; // return HTMLTAG unmodified
}
elseif (isset($matches[2])) { // Case 2: a non-HTMLTAG chunk.
// reverse each word in the chunk, keeping the case of its first letter
// (the /e modifier was removed in PHP 7, so a callback is used instead)
return preg_replace_callback('/\w+/', function ($m) {
return keepcase($m[0], strrev($m[0]));
}, $matches[2]);
}
exit("Error!"); // never get here
}
echo ($contents);
?>
Parse the HTML with something that will give you a DOM API to it.
Write a function that loops over the child nodes of an element.
If a node is a text node, get the data as a string, split it on words, reverse each one, then assign it back.
If a node is an element, recurse into your function.
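The steps above can be sketched in PHP with DOMDocument. The function names are made up, and the sketch assumes ASCII text (strrev works on bytes) and an HTML fragment rather than a full page.

```php
<?php
// Recursively reverse every word found in text nodes below $node.
function reverse_words_in_node(DOMNode $node): void
{
    foreach ($node->childNodes as $child) {
        if ($child instanceof DOMText) {
            // Text node: reverse each word and assign the result back.
            $child->data = preg_replace_callback('/\w+/', function ($m) {
                return strrev($m[0]);
            }, $child->data);
        } elseif ($child instanceof DOMElement) {
            // Element: recurse into its children.
            reverse_words_in_node($child);
        }
    }
}

// Hypothetical wrapper: parse a fragment, transform it, serialize it again.
function reverse_words_in_html(string $html): string
{
    libxml_use_internal_errors(true); // keep parser warnings quiet
    $doc = new DOMDocument();
    $doc->loadHTML('<body>' . $html . '</body>');
    $body = $doc->getElementsByTagName('body')->item(0);

    reverse_words_in_node($body);

    // Serialize only the children of <body>, not the wrapper itself.
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo reverse_words_in_html('<h2>Header</h2><p>all the <span class="bright">content</span> here</p>');
```

Because each text node is handled on its own, a word that is split across elements (like Quick<em>Draw</em>McGraw) is reversed piece by piece, not as a whole.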
You could use jQuery:
$('div *').each(function(){
text = $(this).text();
text = text.split('');
text = text.reverse();
text = text.join('');
$(this).text(text);
});
See here - http://jsfiddle.net/GCAvb/
I implemented a version that seems to work quite well - although I still use (rather general and shoddy) regex to extract the html tags from the text. Here it is now in commented javascript:
Method
/**
* Manipulate text inside HTML according to passed function
* @param html the html string to manipulate
* @param manipulator the function to manipulate with (will be passed single word)
* @returns manipulated string including unmodified HTML
*
* Currently limited in that manipulator operates on words determined by regex
* word boundaries, and must return same length manipulated word
*
*/
var manipulate = function(html, manipulator) {
var block, tag, words, i,
final = '', // used to prepare return value
tags = [], // used to store tags as they are stripped from the html string
x = 0; // used to track the number of characters the html string is reduced by during stripping
// remove tags from html string, and use callback to store them with their index
// then split by word boundaries to get plain words from original html
words = html.replace(/<.+?>/g, function(match, index) {
tags.unshift({
match: match,
index: index - x
});
x += match.length;
return '';
}).split(/\b/);
// loop through each word and build the final string
// appending the word, or manipulated word if not a boundary
for (i = 0; i < words.length; i++) {
final += i % 2 ? words[i] : manipulator(words[i]);
}
// loop through each stored tag, and insert into final string
for (i = 0; i < tags.length; i++) {
final = final.slice(0, tags[i].index) + tags[i].match + final.slice(tags[i].index);
}
// ready to go!
return final;
};
The function defined above accepts a string of HTML, and a manipulation function to act on words within the string regardless of if they are split by HTML elements or not.
It works by first removing all HTML tags, and storing the tag along with the index it was taken from, then manipulating the text, then adding the tags into their original position in reverse order.
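For anyone who wants the same approach in PHP, here is a rough port as a sketch. The name manipulate_text is my own, it needs PHP 7.4+ (for the flags argument of preg_replace_callback), and like the JavaScript version it assumes the manipulator returns a string of the same length.

```php
<?php
function manipulate_text(string $html, callable $manipulator): string
{
    $tags = [];    // list of [offset in the tag-free text, tag]
    $removed = 0;  // number of bytes stripped so far

    // Strip the tags, remembering where each one sat in the tag-free text.
    $text = preg_replace_callback('/<[^>]+>/', function ($m) use (&$tags, &$removed) {
        [$tag, $offset] = $m[0];
        $tags[] = [$offset - $removed, $tag];
        $removed += strlen($tag);
        return '';
    }, $html, -1, $count, PREG_OFFSET_CAPTURE);

    // Manipulate each word of the plain text.
    $text = preg_replace_callback('/\w+/', function ($m) use ($manipulator) {
        return $manipulator($m[0]);
    }, $text);

    // Put the tags back, last one first, so earlier offsets stay valid.
    foreach (array_reverse($tags) as [$offset, $tag]) {
        $text = substr($text, 0, $offset) . $tag . substr($text, $offset);
    }
    return $text;
}

echo manipulate_text('Quick<em>Draw</em>McGraw', 'strrev');
```

Note that the tags go back at their original character offsets, so reversing QuickDrawMcGraw leaves the em element around characters 5 to 8 of the reversed word rather than around the reversed "Draw".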
Test
/**
* Test our function with various input
*/
var reverse, rutherford, shuffle, text, titleCase;
// set our test html string
text = "<h2>Header</h2><p>all the <span class=\"bright\">content</span> here</p>\nQuick<em>Draw</em>McGraw\n<em>going</em><i>home</i>";
// function used to reverse words
reverse = function(s) {
return s.split('').reverse().join('');
};
// function used by rutherford to return a shuffled array
shuffle = function(a) {
return a.sort(function() {
return Math.round(Math.random()) - 0.5;
});
};
// function used to shuffle the middle of words, leaving each end undisturbed
rutherford = function(inc) {
var m = inc.match(/^(.?)(.*?)(.)$/);
return m[1] + shuffle(m[2].split('')).join('') + m[3];
};
// function to make word Title Cased
titleCase = function(s) {
return s.replace(/./, function(w) {
return w.toUpperCase();
});
};
console.log(manipulate(text, reverse));
console.log(manipulate(text, rutherford));
console.log(manipulate(text, titleCase));
There are still a few quirks, like the heading and paragraph text not being recognized as separate words (because they are in separate block level tags rather than inline tags) but this is basically a proof of method of what I was trying to do.
I would also like it to handle manipulation formulas that actually add or remove text, rather than just replacing or moving it (so the string length would vary after manipulation), but that opens up a whole new can of worms I am not yet ready for.
Now I have added some comments to the code, and put it up as a gist in javascript, I hope that someone will improve it - especially if someone could remove the regex part and replace with something better!
Gist: https://gist.github.com/3309906
Demo: http://jsfiddle.net/gh/gist/underscore/1/3309906/
(outputs to console)
And now finally using an HTML parser
(http://ejohn.org/files/htmlparser.js)
Demo: http://jsfiddle.net/EDJyU/
You can use setInterval to change it at a fixed interval, for example:
const TITTLE = document.getElementById("Tittle") //Let's get the div
setInterval(()=> {
let TITTLE2 = document.getElementById("rotate") //we get the element at the moment of execution
let spanTittle = document.createElement("span"); // we create the new element "span"
spanTittle.setAttribute("id","rotate"); // attribute to new element
(TITTLE2.textContent == "TEXT1") // We compare which string is in the span
? spanTittle.appendChild(document.createTextNode(`TEXT2`))
: spanTittle.appendChild(document.createTextNode(`TEXT1`))
TITTLE.replaceChild(spanTittle,TITTLE2) //finally, replace the old span for a new
},2000)
<html>
<head></head>
<body>
<div id="Tittle">TEST YOUR <span id="rotate">TEXT1</span></div>
</body>
</html>
Sorry for not being able to make the title clearer.
Basically I can type text onto my page, where all HTML-TAGS are stripped, except from a couple which I've allowed.
What I want though is to be able to type all the tags I want, to be displayed as plain text, but only if they're within 'code' tags. I'm aware I'll probably use htmlentities, but how can I do it to only affect tags within the 'code' tag?
Can it be done?
Thanks in advance guys.
For example I have $_POST['content'] which is what's shown on the web page. And is the variable with all the output I'm having problems with.
Say I post a paragraph of text, it will be echoed out with all tags stripped except for a few, including the 'code' tag.
Within the code tag I put code, such as HTML information, but this should be displayed as text. How can I escape the HTML tags to be displayed as plain text within the 'code' tag only?
Below is an example of what I may type:
Hi there, this is some text and this is a picture <img ... />.
Below I will show you the code how to do this image:
<code>
<img src="" />
</code>
Everything within the <code> tags should be displayed as plain text so that it won't get removed by PHP's strip_tags, but only HTML tags within the <code> tags.
If it's STRICTLY code tags, then it can be done quite easily.
First, split your string on any occurrence of '<code>' or '</code>' (for example with preg_split).
For example, the string:
Hello <code> World </code>
should become a 3-item array: {"Hello ", " World ", ""}. Items at even indexes are outside the code tags, and items at odd indexes are inside them.
Now loop through the array starting at 0 and incrementing by 2. On each element you hit, run your current script (to remove all but the allowed tags).
Now loop through the array starting at 1 and incrementing by 2. On each element you hit, just run htmlspecialchars.
Implode your array (putting the <code> and </code> tags back around the odd-indexed items), and now you have a string where anything inside the tags is completely sanitized and anything outside the tags is partially sanitized.
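The split-and-alternate idea can be sketched like this. The function name and the default list of allowed tags are assumptions for illustration, and it relies on the code tags being plain, balanced and not nested.

```php
<?php
// Hypothetical helper: strip tags outside <code>...</code>, escape everything inside.
function sanitize_with_code(string $input, string $allowed = '<b><i><a>'): string
{
    // Even-indexed parts are outside code tags, odd-indexed parts are inside.
    $parts = preg_split('#</?code>#i', $input);

    $out = '';
    foreach ($parts as $i => $part) {
        if ($i % 2 === 0) {
            // Outside: keep only the allowed tags.
            $out .= strip_tags($part, $allowed);
        } else {
            // Inside: show the markup as text and put the code tags back.
            $out .= '<code>' . htmlspecialchars($part, ENT_QUOTES, 'UTF-8') . '</code>';
        }
    }
    return $out;
}

echo sanitize_with_code('Hi <script>x</script><code><img src="" /></code> bye');
```

If a code tag is left unclosed, the even/odd bookkeeping shifts, so treat this as a starting point rather than a complete sanitizer.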
This is the solution I found which works perfectly for me.
Thanks everyone for their help!
function code_entities($matches) {
return str_replace($matches[1],htmlentities($matches[1]),$matches[0]);
}
$content = preg_replace_callback('/<code.*?>(.*?)<\/code>/imsu', 'code_entities', $_POST['content']);
Here is some sample code that should do the trick:
$parsethis = '';
$parsethis .= "Hi there, this is some text and this is a picture <img src='http://www.google.no/images/srpr/logo3w.png' />\n";
$parsethis .= "Below I will show you the code how to do this image:\n";
$parsethis .= "\n";
$parsethis .= "<code>\n";
$parsethis .= " <img src='http://www.google.no/images/srpr/logo3w.png' />\n";
$parsethis .= "</code>\n";
$pattern = '#(<code[^>]*>(.*?)</code>)#si';
$finalstring = preg_replace_callback($pattern, "handle_code_tag", $parsethis);
echo $finalstring;
function handle_code_tag($matches) {
$ret = '<pre>';
$ret .= str_replace(array('<', '>'), array('&lt;', '&gt;'), $matches[2]);
$ret .= '</pre>';
return $ret;
}
What it does:
First, using preg_replace_callback, I match everything inside <code></code> and send it to my callback function handle_code_tag, which escapes all less-than and greater-than signs inside the content. The matches array will contain the full matched string in [1] and the match for (.*?) in [2]. In the #si modifiers, s means that . also matches across line breaks and i means case-insensitive.
The rendered output looks like this in my browser:
So I am trying to parse an XML file and display first 150 words of an article with READ MORE link. It doesn't exactly parse 150 words though. I am also not sure how to make it so it does not parse IMG tag code, etc... the code is below
// Script displays 3 most recent blog posts from blog.pinchit.com (blog.pinchit.com/api/read)
// The entries on homepage show the first 150 words of description and "READ MORE" link
// PART 1 - PARSING
// if it was a JSON file
// $string=file_get_contents("http://blog.pinchit.com/api/read");
// $json_a=json_decode($string,true);
// var_export($json_a);
// XML Parsing
$file = "http://blog.pinchit.com/api/read";
$posts_to_display = 3;
$posts = array();
// get all the file nodes
if(!$xml=simplexml_load_file($file)){
trigger_error('Error reading XML file',E_USER_ERROR);
}
// counter for posts member array
$counter = 0;
// Accessing elements within an XML document that contain characters not permitted under PHP's naming convention
// (e.g. the hyphen) can be accomplished by encapsulating the element name within braces and the apostrophe.
foreach($xml->posts->post as $post){
//post's title
$posts[$counter]['title'] = $post->{'regular-title'};
// post's full body
$posts[$counter]['body'] = $post->{'regular-body'};
// post's body's first 150 words
//for some reason, I am not sure if it's exactly 150
$posts[$counter]['preview'] = substr($posts[$counter]['body'], 0, 150);
//strip all the html tags so it doesn't mess up the page
$posts[$counter]['preview'] = strip_tags($posts[$counter]['preview']);
//post's id
$posts[$counter]['id'] = $post->attributes()->id;
$posts_to_display--;
$counter++;
//exit the for loop after we parse out all the articles that we want
if ($posts_to_display == 0 ) break;
}
// Displays all of the posts
foreach($posts as $post){
echo "<b>" . $post['title'] . "</b>";
echo "<br/>";
echo $post['preview'];
echo " <a href='http://blog.pinchit.com/post/" . $post['id'] . "'>Read More</a>";
echo "<br/><br/>";
}
Here are how results look now.
Editor's Pick: Club Sportiva
Nothing makes you feel as totally free and in control as a day behind the wheel of a sleek, sophisticated, sexy sports car. It’s no surprise Read More
Pinchy Drinks & Rocks: The Hotel Utah Saloon
Hotel Utah Read More
Monday Menu: Spicy Grapefruit, Paprika, Creamsicles
Feeling summery and savory today, and we have to admit it took a lot to resist the urge to make this an all appetizers, all desserts, or all drinks Read More
The HTML tags are counting against your character total. Strip the tags out first, then take your preview sample:
$preview = strip_tags($posts[$counter]['body']);
$posts[$counter]['preview'] = substr($preview, 0, 150).'...';
Also, one usually adds an ellipsis ("...") to the end of truncated text to indicate that it continues.
Note that this has the potential disadvantage of removing tags you DO want, like <p> and <br>. If you want to preserve those, you can pass them as the second argument for strip_tags:
$preview = strip_tags($posts[$counter]['body'], '<br><p>');
$posts[$counter]['preview'] = substr($preview, 0, 150).'...';
BUT, be forewarned that XML-style tags might throw this off (<br />). If you're dealing with XML/HTML mixed, you might need to elevate your tag filtering using something like htmLawed, but the concept remains the same - get rid of the HTML before you truncate.
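Since the question asks for 150 words while substr counts characters, here is a small sketch that truncates by word count instead. The name preview_words is invented, and it assumes that words are separated by whitespace once the tags are stripped.

```php
<?php
// Hypothetical helper: first $maxWords words of the tag-free text.
function preview_words(string $html, int $maxWords = 150): string
{
    // Get rid of the HTML before counting anything.
    $text = trim(strip_tags($html));

    // Split on runs of whitespace, dropping empty pieces.
    $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);

    if (count($words) <= $maxWords) {
        return implode(' ', $words); // short enough, no ellipsis needed
    }
    return implode(' ', array_slice($words, 0, $maxWords)) . '...';
}

echo preview_words('<p>one <b>two</b> three four</p>', 3);
```

One caveat: strip_tags removes tags without adding spaces, so the last word of one paragraph and the first word of the next can end up glued together.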
Looking at the tag <regular-body> it seems to contain HTML. Therefore I would recommend trying to parse that into a DOMDocument ( http://www.php.net/manual/en/domdocument.loadhtml.php ). You then would be able to loop through all the items and ignore certain tags (ex. ignore <img> but keep <p>). After that, you can then render out what you want and truncate it to 150 characters.