regex find <table> and modify - php

I have code on a page, generated by a CMS, that looks like the example below.
The user can create a table, but I have to wrap it in a <div>.
<p>Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et
dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo
dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem</p>
<table>
<thead>
<tr><td></td></tr>
...
</tbody>
</table>
<p>Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et
dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo
dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem</p>
<table>
<thead>
<tr><td></td></tr>
...
</tbody>
</table>
...
My goal now is to wrap every <table> in a <div class="table">.
I've tried it with a regex and got this result:
function smarty_modifier_table($string) {
preg_match_all('/<table.*?>(.*?)<\/table>/si', $string, $matches);
echo "<pre>";
var_dump($matches);
}
/* result
array(2) {
[0]=> string(949) "<table>...</table>"
[1]=> string(934) "<thead>...</tbody>"
}
array(2) {
[0]=> string(949) "<table>...</table>"
[1]=> string(934) "<thead>...</tbody>"
}
*/
First of all, I do not understand why the second entry, [1]=> string(934) "<thead>...</tbody>", appears,
and second, how do I put the modified array back into the string in the right place?

If your html is really simple like this, the following would probably work:
print preg_replace('~<table.+?</table>~si', "<div class='table'>$0</div>", $html);
If, however, you can have nested tables:
<table>
<tr><td> <table>INNER!</table> </td></tr>
</table>
this expression will fail miserably - that's why using regexes to parse html is not recommended. To handle complex html it's better to use a parser library, for example, XML DOM:
$doc = new DOMDocument();
$doc->loadHTML($html);
$body = $doc->getElementsByTagName('body')->item(0);
foreach($body->childNodes as $s) {
if($s->nodeType == XML_ELEMENT_NODE && $s->tagName == 'table') {
$div = $doc->createElement("div");
$div->setAttribute("class", "table");
$body->replaceChild($div, $s);
$div->appendChild($s);
}
}
This one handles nested tables correctly.
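The loop above only visits direct children of <body>. If tables deeper in the markup (including nested ones) should be wrapped as well, the following is a minimal sketch using DOMXPath. The function name wrap_tables and the temporary wrap-root id are made up for illustration, and character encoding handling is left out.

```php
<?php
// Hypothetical helper: wraps every <table>, nested ones included,
// in <div class="table">. Assumes the input has no element with id "wrap-root".
function wrap_tables(string $html): string
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $doc->loadHTML('<div id="wrap-root">' . $html . '</div>');
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // Copy the matches into an array before changing the tree.
    $tables = iterator_to_array($xpath->query('//table'));
    foreach ($tables as $table) {
        $div = $doc->createElement('div');
        $div->setAttribute('class', 'table');
        $table->parentNode->replaceChild($div, $table);
        $div->appendChild($table);
    }

    // Serialize only the children of the temporary root element.
    $root = $xpath->query('//div[@id="wrap-root"]')->item(0);
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

Calling wrap_tables($string) inside the Smarty modifier and returning the result would replace the preg_match_all experiment from the question.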

$buffer = preg_replace('%<table>(.*?)</table>%si', '<div class="table"><table>$1</table></div>', $buffer);

Thank you all for your incredibly fast and perfect help!
So it works for me.
$result = preg_replace('~<table.+?</table>~si', "<div class='table'>$0</div>", $string);
return $result;
regards
Torsten

Related

how to get line by line from txt file and replace variable

I want to replace "\n" with " on the first and last line of the email:
$chekclist = $_POST['emaillist'];
$rwina = explode("\n", "$chekclist");
$i = 0;
$count = 1;
foreach ($rwina as $key => $email[i])
Actually you cannot do that because \n is where the line ends.
I'm assuming that you want your email to format like:
"Lorem ipsum dolor sit amet
consectetur adipiscing elit
sed do eiusmod tempor incididunt
ut labore et dolore magna aliqua."
But the text you'll get from $_POST['emaillist'] will be formatted like this:
Lorem ipsum dolor sit amet \n
consectetur adipiscing elit\n
sed do eiusmod tempor incididunt \n
ut labore et dolore magna aliqua. \n
So if you want to replace \n with " it will be like this:
Lorem ipsum dolor sit amet"
consectetur adipiscing elit
sed do eiusmod tempor incididunt
ut labore et dolore magna aliqua."
But there is a way to achieve what you are looking for if I'm assuming it right :p
So here's the code:
$chekclist = $_POST['emaillist']; // Get email text
$rwina = explode("\n", "$chekclist"); // Make array
$count = count($rwina); // Count array values
for ($i = 0; $i < $count; $i++) {
if ($i == 0) {
echo '"' . $rwina[$i] . '<br>';
} else if ($i == ($count - 1)) {
echo $rwina[$i] . '"<br>';
} else {
echo $rwina[$i]. '<br>';
}
}
Let me know if this is what you are looking for :)
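The same idea can be packed into a function that returns the string instead of echoing it. This is only a sketch: the name quote_block is made up, and it assumes the form input may contain Windows-style line endings.

```php
<?php
// Hypothetical helper: puts a double quote before the first line
// and after the last line of a multi-line text.
function quote_block(string $text): string
{
    // Split on \r\n, \r or \n so textarea input from any platform works.
    $lines = preg_split('/\r\n|\r|\n/', trim($text));
    $last = count($lines) - 1;
    $lines[0] = '"' . $lines[0];
    $lines[$last] = $lines[$last] . '"';
    return implode("\n", $lines);
}

// Escape the text and convert the line breaks to <br> for HTML output.
echo nl2br(htmlspecialchars(quote_block("first\r\nsecond\r\nthird")));
```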

highlight text with surrounding words

I want to highlight given keywords in a given string, together with a random number of surrounding words.
Example sentence:
Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed.
Example keyword:
dolore magna
Desired result:
(mark 0-4 words before and after the keyword)
Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et **dolore magna** aliquyam erat, sed.
What did I try?
( [\w,\.-\?]+){0,5} ".$myKeyword." (.+ ){2,5}
and
([a-zA-Z,. ]+){1,3} ".$n." ([a-zA-Z,. ]+){1,3}
Any ideas how to improve this and make it more robust?
For highlighting, use the preg_replace function. Here's an idea:
$s = "dolore magna";
$str = preg_replace(
'/\b(?>[\'\w-]+\W+){0,4}'.preg_quote($s, "/").'(?:\W+[\'\w-]+){0,4}/i',
'<b>$0</b>', $str);
Test the pattern at regex101 or the PHP code at eval.in.
echo $str;
Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed.
Using i flag for caseless matching - drop if not wanted. First group ?> atomic for performance.
As word character I used ['\w-] (\w shorthand for word character, ' and -)
\W matches a character, that is not a word character (negated \w)
\b matches a word boundary. Used it for better performance.
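A minimal sketch of how the pattern above could be wrapped in a reusable function. The name highlight_context and the $words parameter are assumptions for illustration; the pattern itself is the one from this answer with the word count made configurable.

```php
<?php
// Hypothetical helper: wraps the keyword plus up to $words words
// on each side in <b>...</b>.
function highlight_context(string $text, string $keyword, int $words = 4): string
{
    $pattern = '/\b(?>[\'\w-]+\W+){0,' . $words . '}'
             . preg_quote($keyword, '/')          // escape regex metacharacters
             . '(?:\W+[\'\w-]+){0,' . $words . '}/i';
    return preg_replace($pattern, '<b>$0</b>', $text);
}

echo highlight_context('one two three four five six seven', 'four', 1);
```

For a random amount of context, random_int(0, 4) could be passed as the third argument. Note that the quantifiers are greedy, so $words is an upper bound that is used whenever enough words are available.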
I think this would accomplish what you are after. Please see the demo for an explanation of everything the regex is doing, or post a comment if you have a question.
Regex:
((?:[\w,.\-?]+\h){0,5})\b' . $term . '\b((?:.+\h){2,5})
Demo: https://regex101.com/r/vG8qT2/1
PHP:
<?php
$string = 'Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed.';
$term = 'dolore magna';
$min = 0;
$max = 5;
// Pass the delimiter to preg_quote() so a "~" inside $term is escaped as well
preg_match('~((?:[\w,.\-?]+\h){'.$min.','.$max. '})\b' . preg_quote($term, '~') . '\b((?:.+\h){'.$min.','.$max.'})~', $string, $matches);
print_r($matches);
Demo: https://eval.in/410063
Note the captured values will be in $matches[1] and $matches[2].

how to get first some line from a paragraph

I have a paragraph stored in a database and I want to get only the first five lines from it. How can I do this?
Should I first convert the array to a string? If yes, how?
If it's a string, then I can do this with:
$str='mayank kumar swami mayank kumar swami';
$var= strlen($str);
for($i=0;$i<8;$i++){
echo $str[$i];
}
Or how can I get only the first 200 words from the database with SQL?
I know it can be done easily with CSS, as shown in "Show first line of a paragraph", but I want to do this with PHP or an SQL query.
What I am doing:
$article_result = mysql_query("SELECT * FROM article ORDER BY time DESC LIMIT 1",$connection);
if($article_result){
while($row = mysql_fetch_array($article_result))
{
echo "<div class=\"article_div\" >";
echo "<h4 id=\"article_heading\"><img src=\"images/new.png\" alt=\"havent got\" style=\"padding-right:7px;\">".$row['article_name']."</h4>";
echo"<h5 class=\"article_byline\">";
echo" by";
echo"{$row['authore']}</h5>";
echo" <div id=\"article_about\"><p>{$row['content']}</p></div>";
//here i want to get only 2000 word from database (content)
echo "</div>";
}
}
There are a number of solutions to this problem.
If you want to split it by number of words, something similar to what user247245 posted:
function get_x_words($string,$x=200) {
$parts = explode(' ',$string);
if (sizeof($parts)>$x) {
$parts = array_slice($parts,0,$x);
}
echo implode(' ',$parts);
}
My preferred method however is getting all the full words up until a certain point (e.g. 200 characters):
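function chop_string($string,$x=200) {
$string = strip_tags(stripslashes($string)); // convert to plaintext
$pos = strpos(wordwrap($string, $x), "\n"); // position of the first line break
if ($pos === false) {
return $string; // short enough already, nothing to chop
}
return substr($string, 0, $pos);
}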
The above will chop the string at 200 characters, however will only chop it after the end of a word (so you won't get half a word returned at the end)
Are we talking words, lines or letters?
If words:
$a = explode(' ',$theText);
if (sizeof($a)>200) $a = array_slice($a,0,200);
echo implode(' ',$a);
regards,
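If lines are meant rather than words, the same explode/array_slice idea applies. The following is only a sketch: first_lines is a made-up name, and it assumes the stored text contains real line breaks.

```php
<?php
// Hypothetical helper: returns the first $n lines of a text.
function first_lines(string $text, int $n = 5): string
{
    // Split on \r\n, \r or \n, then keep at most $n pieces.
    $lines = preg_split('/\r\n|\r|\n/', $text);
    return implode("\n", array_slice($lines, 0, $n));
}

echo first_lines("one\ntwo\nthree", 2);
```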
You can use substring function in mysql
SELECT SUBSTRING('Quadratically',1,5);
returns
Quadr
I suggest you do it in SQL, as that reduces the amount of data transferred between your database server and your application server.
So your query becomes:
$article_result = mysql_query("SELECT article_name, authore, SUBSTRING(content,1,200) as content FROM article ORDER BY time DESC LIMIT 1",$connection);
Try This:
<?php
echo substr("mayank kumar swami mayank kumar swami", 0, 6);
?>
Result Output: mayank
<?php
$about_vendor ="Lorem ipsum dolor sit amet, consectetur adipisicing elit. Adipisci officia excepturi quisquam mollitia, obcaecati cupiditate, quaerat est quibusdam nostrum esse culpa voluptates eum, et architecto animi. Voluptates enim tenetur minus! Lorem ipsum dolor sit amet, consectetur adipisicing elit. Laboriosam magni exercitationem non at error possimus, voluptas aut, aperiam sint pariatur illo libero vel aspernatur tempora laborum. Harum nesciunt quos at. Lorem ipsum dolor sit amet, consectetur adipisicing elit. Vitae quidem saepe voluptates minus delectus, dolores, repellat maiores quae consectetur quasi qui voluptas eius odit autem optio cupiditate nesciunt iste ducimus!";
// Convert string to array
$convert_to_array = explode(' ',$about_vendor);
// Total length of array
$total_length_of_array = count($convert_to_array);
?>
<p class="tip_par" style="text-align:justify;">
<!-- specify how many words you want to show; here, the first 30 words -->
<?php for($i=0;$i<30 && $i<$total_length_of_array;$i++) {
echo $convert_to_array[$i].' ';
} ?>
<!-- If there are more than 30 words, the rest is shown when the "more" link below is toggled -->
<span style="color:#C1151B;"> <span data-toggle="collapse" data-target="#demo" style="cursor:pointer;"><?php if($total_length_of_array >30) {
echo "more";
} ?></span>
<div id="demo" class="collapse tip_par" style="padding-top: 0px;">
<?php for($i=30;$i<$total_length_of_array;$i++) {
echo $convert_to_array[$i].' ';
} ?>
</div>
</span> </p>

Interpolate a constant (not variable) into 'heredoc'

Consider:
<?php
define('my_const', 100);
echo <<<MYECHO
<p>The value of my_const is {my_const}.</p>
MYECHO;
?>
If I put a variable inside the braces, it prints out. But not the constant. How can I do it?
You can also approach the problem by assigning the value of the constant to a variable.
Personally I do it that way because if you have lots of constants in your string then your sprintf() call can be quite messy. It's also then harder to scan through the string and see what is doing what. Plus, by assigning the variables individually, you can see what is taking on what value.
An example would be:
$const = MY_CONST; // "CONST" itself is a reserved word and cannot be used as a constant name
$variable = VARIABLE;
$foo = (new Foo)->setFooProperty(12)->getFooProperty();
$bar = (123 - 456) * 10;
$ten = 1 + 2 + 1 + (5 - 4);
echo <<<EOD
Lorem ipsum dolor sit amet, **$variable** adipiscing elit.
Duis gravida aliquet dolor quis gravida.
Nullam viverra urna a velit laoreet, et ultrices purus condimentum.
Ut risus tortor, facilisis sed porta eget, semper a augue.
Sed adipiscing erat non sapien commodo volutpat.
Vestibulum nec lectus sed elit dictum accumsan vel adipiscing libero.
**$const** vehicula molestie sapien.
Ut fermentum quis risus ut pellentesque.
Proin in dignissim erat, eget molestie lorem. Mauris pretium aliquam eleifend.
**$foo** vitae sagittis dolor, quis sollicitudin leo.
Etiam congue odio sit amet sodales aliquet.
Etiam elementum auctor tellus, quis pharetra leo congue at. Maecenas sit amet ultricies neque.
Nulla luctus enim libero, eget elementum tellus suscipit eu.
Suspendisse tincidunt arcu at arcu molestie, a consequat velit elementum.
Ut et libero purus. Sed et magna vel elit luctus rhoncus.
Praesent dapibus consectetur tortor, vel **$bar** mauris ultrices id.
Mauris pulvinar nulla vitae ligula iaculis ornare.
Praesent posuere scelerisque ligula, id tincidunt metus sodales congue.
Curabitur lectus urna, porta sed molestie ut, mollis vitae libero.
Vivamus vulputate congue **$ten**.
EOD;
Use sprintf()
define('my_const', 100);
$string = <<< heredoc
<p>The value of my_const is %s.</p>
heredoc;
$string = sprintf($string, my_const);
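With several constants, numbered placeholders keep the sprintf() call readable and let a value be reused. The sketch below uses made-up constants (SITE_NAME, MAX_ITEMS) and a nowdoc (PHP 5.3+) instead of a heredoc, so that the $s in %1$s is not treated as a variable.

```php
<?php
// Hypothetical constants for illustration.
define('SITE_NAME', 'Example');
define('MAX_ITEMS', 100);

// Nowdoc: no variable interpolation, so "%1$s" stays a literal placeholder.
$template = <<<'TPL'
<p>%1$s shows up to %2$d items. Welcome to %1$s!</p>
TPL;

// %1$s refers to the first argument, %2$d to the second.
$html = sprintf($template, SITE_NAME, MAX_ITEMS);
echo $html;
```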
Here is a little trick to allow double-quoted strings and heredocs to contain arbitrary expressions in curly braces syntax, including constants and other function calls. It uses the fact that a function name can be assigned to a variable and then called within heredoc:
<?php
// Declare a simple function
function _placeholder($val) { return $val; }
// And assign it to something short and sweet
$_ = '_placeholder';
// Or optionally, for PHP version >= 5.3,
// use a closure (anonymous function) like so:
$_ = function ($val){return $val;};
// Our test values
define('abc', 'def');
define('ghi', 3);
$a = 1;
$b = 2;
function add($a, $b) { return $a+$b; }
// Usage
echo "b4 {$_(1+2)} after\n"; // Outputs 'b4 3 after'
echo "b4 {$_(abc)} after\n"; // Outputs 'b4 def after'
echo "b4 {$_(add($a, $b)+ghi*2)} after\n"; // Outputs 'b4 9 after'
$text = <<<MYEND
Now the same in heredoc:
b4 {$_(1+2)} after
b4 {$_(abc)} after
b4 {$_(add($a, $b)+ghi*2)} after
MYEND;
echo $text;
You can also use the get_defined_constants function. It returns all currently defined constants in an array, which you can use in your HEREDOC string:
// Let's say there is FOO and BAR defined
$const = get_defined_constants();
$meta = <<< EOF
my awesome string with "{$const['FOO']}" and "{$const['BAR']}" constants
EOF;
You may use the "constant" function.
for example:
<?php
define('CONST1', 100);
define('CONST2', 200);
$C= 'constant';
echo <<<MYECHO
<p>The value of CONST1 is: {$C('CONST1')},
and CONST2 is:{$C('CONST2')}.</p>
MYECHO;
?>
Put your defined constant into a simple variable and include it in the heredoc, as in the following example:
<?php
define('my_const', 100);
$variable = my_const;
echo <<<MYECHO
<p>The value of my_const is {$variable}.</p>
MYECHO;
?>
Not everybody will like the use of shorthand echo tags, but this is still an option:
<?php
define('my_const', 100);
?>
<p>The value of my_const is <?= my_const ?>.</p>

Convert everything between <tag></tag> to HTML entities with PHP

How could I convert everything between a tag to HTML entities:
Lorem ipsum dolor sit amet, consetetur sadipscing elitr,
sed diam nonumy eirmod tempor invidunt ut labore et dolore
magna aliquyam erat, sed diam voluptua.
<code class="highlight sql">
CREATE TABLE `comments`
</code>
<h1>Next step</h1>
Lorem ipsum dolor sit amet, consetetur sadipscing elitr,
sed diam nonumy eirmod tempor invidunt ut labore et
dolore magna aliquyam erat, sed diam voluptua.
At vero eos et accusam et justo duo dolores et ea rebum.
<b>Stet clita kasd gubergren, no sea takimata sanctus</b> est Lorem
dolor sit amet. Lorem ipsum dolor sit amet, consetetur
sadipscing elitr, sed diam nonumy eirmod tempor invidunt
ut labore et dolore magna aliquyam erat, sed diam voluptua:
<code class="highlight php">
<?php
$host = "localhost";
?>
</code>
Lorem ipsum dolor sit amet, consetetur sadipscing elitr.
Note: That example above is a string which I could convert in PHP.
This comes down to a regex for me. And before you start shouting: it is possible to reliably match & replace subsets of html, as long as there are no nested tags.
This is the easy way tbh: a regex that matches from the start of a tag to its end, and a function applied to the matches that encodes what we need and replaces it.
Here's the code:
<?php
$string = 'Lorem ipsum dolor sit amet, consetetur sadipscing elitr,
sed diam nonumy eirmod tempor invidunt ut labore et dolore
magna aliquyam erat, sed diam voluptua.
<code class="highlight sql">
CREATE TABLE `comments`&
</code>
<h1>Next step</h1>
Lorem ipsum dolor sit amet, consetetur sadipscing elitr,
sed diam nonumy eirmod tempor invidunt ut labore et
dolore magna aliquyam erat, sed diam voluptua.
At vero eos et accusam et justo duo dolores et ea rebum.
<b>Stet clita kasd gubergren&, no sea takimata sanctus</b> est Lorem
dolor sit amet. Lorem ipsum dolor sit amet, consetetur
sadipscing elitr, sed diam nonumy " eirmod " tempor invidunt
ut labore et dolore magna aliq&uyam erat, sed diam voluptua:
<code class="highlight php">
<?php
* $host = "localhost";
?>&
</code>
Lorem ipsum dolor sit amet, consetetur sadipscing elitr.';
echo preg_replace("/(<code[^>]*?>)(.*?)(<\/code>)/se", "
stripslashes('$1').
htmlentities(stripslashes('$2')).
stripslashes('$3')
", $string);
And here's a working testcase on codepad
http://codepad.org/MhKwfOQl
This will work as long as there are no nasty nested tags / corrupted html.
I would still advise you to try and make sure you save the data as you want to make it visible, encoded where needed.
If you want to replace between a different set of tags change the regex.
Update: It seemed that $host was being parsed by PHP... and of course we don't want this. This happened because PHP evaluates the replacement string as PHP code, which then executes the given functions and passes the matched strings into those functions, and if such a string is enclosed in double quotes, PHP will parse it too... heh, what a hassle.
Another problem then arises: PHP escapes single and double quotes in the matches so they won't generate parse errors. This meant that any quotes in the matches had to have their slashes stripped too... resulting in the pretty long replacement string.
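Note that the /e modifier was deprecated in PHP 5.5 and removed in PHP 7.0, so the code above no longer runs on current versions. A sketch of the same idea with preg_replace_callback follows; the function name encode_code_blocks is made up for illustration. Because the matches reach the callback as plain strings, no stripslashes() workaround is needed.

```php
<?php
// Hypothetical helper: encodes everything between <code ...> and </code>.
function encode_code_blocks(string $html): string
{
    return preg_replace_callback(
        '/(<code[^>]*>)(.*?)(<\/code>)/s',
        function (array $m): string {
            // $m[1] is the opening tag, $m[2] the body, $m[3] the closing tag.
            return $m[1] . htmlentities($m[2], ENT_QUOTES, 'UTF-8') . $m[3];
        },
        $html
    );
}
```

As with the original, this only holds as long as <code> elements are not nested.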
Although a regular expression or parser may give you a solution to this puzzle, I think you may be going about your goal the wrong way.
Taken from the comments below the question:
@Poru How is that string generated?
@Phil: Fetched from database. It's the content of a tutorial. It's an own development "CMS".
If you are storing this string in a database, and its function is to return HTML content, you should store the content ready to serve as HTML, which means you must escape the appropriate characters with their equivalent HTML entities.
This was the advice already offered to you in this question: https://stackoverflow.com/questions/7059776/include-source-code-in-html-valid/7059834
The characters that must be escaped are explained here (among other various references):
http://php.net/manual/en/function.htmlspecialchars.php
The translations performed are:
'&' (ampersand) becomes '&amp;'
'"' (double quote) becomes '&quot;' when ENT_NOQUOTES is not set.
"'" (single quote) becomes '&#039;' only when ENT_QUOTES is set.
'<' (less than) becomes '&lt;'
'>' (greater than) becomes '&gt;'
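These translations can be checked with a short script. The sample string is made up, and the flags are passed explicitly because the default differs between PHP versions.

```php
<?php
// Made-up sample containing all five special characters.
$raw = '<a href="x">Tom & Jerry\'s</a>';

// ENT_COMPAT converts double quotes only; ENT_QUOTES converts both kinds.
$noSingle   = htmlspecialchars($raw, ENT_COMPAT, 'UTF-8');
$withSingle = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');

echo $noSingle, "\n", $withSingle, "\n";
```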
If in fact this is the case, and this string is supposed to be HTML output and has no other function, it doesn't make any sense to save it as invalid HTML, or at least not what you intend it to be.
If you must store your code examples unescaped, consider a separate database table for these snippets, and simply run htmlspecialchars() on them before outputting it to the HTML document. You could even assign a language to each record, and use the appropriate syntax highlighting tool for each case automatically.
What you are attempting, in my opinion, is not the appropriate solution to this particular problem, in this context. Escaping the characters and having your HTML content ready to be output to screen in its current form is the way to go.
$dom = new DOMDocument;
$dom->loadHTML(...);
$tags = $dom->getElementsByTagName('tag');
foreach($tags as $tag) {
$tag->nodeValue = htmlentities($tag->nodeValue);
}
echo $dom->saveHTML();
