PHP: php and .html file separation - php

I'm currently working on separating HTML and PHP code. Here is the code that currently works for me.
code.php
<?php
$data['#text#'] = 'A';
$html = file_get_contents('test.html');
echo $html = str_replace(array_keys($data),array_values($data),$html);
?>
test.html
<html>
<head>
<title>TEST HTML</title>
</head>
<body>
<h1>#text#</h1>
</body>
</html>
OUTPUT: A
It searches for #text# and replaces it with the array value A. This works for me.
Now I'm working on code that searches for "id" attributes in the HTML file. If it finds the "id" in the ".html" file, it should put the array value between the > and </ of that element.
Example: <div id="test"> **array value here** </div>
test.php
<?php
$data['id="test"'] = 'A';
$html = file_get_contents('test.html');
foreach ($data as $search => $value)
{
if (strpos($html, $search) !== false)
{
echo 'FOUND';
echo $value;
}
}
?>
test.html
<html>
<head>
<title>TEST</title>
</head>
<body>
<div id="test" ></div>
</body>
</html>
My problem is that I don't know how to put the array values between the > and </ of every match found in the .html file.
Desired OUTPUT: <div id="test" >A</div>

function callbackInsert($matches)
{
    global $data;
    // Leave the element unchanged when no value is defined for this id.
    if (!isset($data[$matches[3]])) {
        return $matches[0];
    }
    return $matches[1].$matches[3].$matches[4].$data[$matches[3]].$matches[6];
}
$data['test'] = 'A';
$html = file_get_contents('test.html');
// A single pass handles every id; the result has to be assigned or echoed.
echo preg_replace_callback('#(<([a-zA-Z]+)[^>]*id=")(.*?)("[^>]*>)([^<]*?)(</\\2>)#ism', 'callbackInsert', $html);
Warning: the code is not tested and could be improved, particularly regarding the global keyword and which items are allowed between > and <.
Regular expression explanation:
(<([a-zA-Z]+) - the start of any HTML tag, up to and including the last letter of the tag name
[^>]* - any other attributes inside the opening tag
id=")(.*?)(" - the id attribute and its value
[^>]* - any remaining attributes inside the opening tag
>) - the end of the opening tag
([^<]*?) - the existing content, which may be anything except another tag (a tag would start with <)
(</\\2>) - the closing tag, matched against the 2nd capture group, i.e. the tag name of the opening tag
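As an alternative to a regular expression, the same insertion can be sketched with PHP's built-in DOM extension. This is a minimal sketch, not the answer's method: the helper name fillById is hypothetical, the HTML is inlined instead of read from test.html, and it assumes the ids contain no double quotes.

```php
<?php
// Hypothetical helper: puts $data[id] inside every element that has that id.
function fillById(string $html, array $data): string
{
    $dom = new DOMDocument();
    // Suppress the warnings libxml raises for imperfect real-world markup.
    @$dom->loadHTML($html);
    $xpath = new DOMXPath($dom);

    foreach ($data as $id => $value) {
        // Assumption: $id contains no double quotes.
        foreach ($xpath->query('//*[@id="' . $id . '"]') as $node) {
            // Remove whatever the element currently contains.
            while ($node->firstChild) {
                $node->removeChild($node->firstChild);
            }
            // A text node is escaped automatically when the document is saved.
            $node->appendChild($dom->createTextNode($value));
        }
    }
    return $dom->saveHTML();
}

$html = '<html><head><title>TEST</title></head><body><div id="test" ></div></body></html>';
$out = fillById($html, ['test' => 'A']);
echo $out;
```

Because the document is parsed instead of pattern matched, this also copes with nested tags and with attributes in any order, at the cost of the parser normalizing the markup on output.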

Use view (.phtml) files to generate content dynamically. This is native to PHP (no third-party library required).
See this answer: What is phtml, and when should I use a .phtml extension rather than .php?
and this:
https://stackoverflow.com/questions/62617/whats-the-best-way-to-separate-php-code-and-html

Related

Find preceding element using PHP Simple HTML Parser

I have some HTML that is set up like the following (this can be different though!):
<table></table>
<h4>Content</h4>
<table></table>
I'm using PHP Simple HTML DOM Parser to loop over a section of code set up like this.
How can I say something like: "Find the table and the preceding h4, grab the text from the h4 if it exists, and leave it blank if it doesn't"?
If I just use $html->find('div[class=product-table] h4'); then it ignores the fact that there was no title for the first table.
This is my full code for context:
$table_rows = $html->find('div[class=product-table] table');
$tablecounter = 1;
foreach ($table_rows as $table){
$tablevalue[] =
array(
"field_5b3f40cae191b" => "Table",
);
}
update_field( $field_key, $tablevalue, $post_id );
Update:
I've found in the documentation that you can use prev_sibling() so I've tried $table_title = $html->find('div[class=product-table] table')->prev_sibling('h4'); but can't seem to get it to work.
I've simplified the example to hopefully show the situation you're after. It assumes that the <h4> tag comes immediately before the <table> tag, and it uses the prev_sibling() of each table tag you find.
require_once 'simple_html_dom.php';
$source = "<html>
<body>
<div class='product-table'>
<table>t1</table>
<h4>Content</h4>
<table>t2</table>
</div>
</body>
</html>";
$html = str_get_html($source);
$table_rows = $html->find('div[class=product-table] table');
foreach ($table_rows as $table){
$prev = $table->prev_sibling();
if ( !empty($prev) && $prev->tag == "h4") {
echo "h4=".(string)$prev->innertext().PHP_EOL;
}
echo "content=".(string)$table.PHP_EOL;
}
This outputs:
content=<table>t1</table>
h4=Content
content=<table>t2</table>
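The same lookup can also be sketched without a third-party library, using the built-in DOMDocument and DOMXPath classes. This is an illustrative alternative, not part of the answer above; the sample markup and the $titles array are assumptions made for the example.

```php
<?php
$source = "<html><body><div class='product-table'>
<table><tr><td>t1</td></tr></table>
<h4>Content</h4>
<table><tr><td>t2</td></tr></table>
</div></body></html>";

$dom = new DOMDocument();
$dom->loadHTML($source);
$xpath = new DOMXPath($dom);

$titles = [];
foreach ($xpath->query("//div[@class='product-table']/table") as $table) {
    // Nearest preceding element sibling, kept only when it is an <h4>.
    $h4 = $xpath->query('preceding-sibling::*[1][self::h4]', $table)->item(0);
    // Leave the title blank when the table has no heading directly before it.
    $titles[] = $h4 ? trim($h4->textContent) : '';
}
print_r($titles);
```

Each entry in $titles lines up with one table, so a table without a heading gets an empty string instead of being skipped.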

Scrape an H1 element on the current page in PHP

I'm currently working with WordPress. I have a hook that runs before the <title> element is populated with text that a user enters in the dashboard.
Now I want the default title of each page to equal the text of the <h1> element on the current page. A fragment of the callback function for the hook I'm working with would look like:
if (!$seoTitle) {
$seoTitle = '<....>';
}
return $seoTitle;
I want seoTitle to default to the text of an <h1> element on the current page. Is it doable? How can I achieve this?
I'm not totally sure how you get your HTML but you could parse it with the built in DOM parser.
<?php
$html = "<!DOCTYPE html>
<html>
<head>
<title>Page Title</title>
</head>
<body>
<h1>This is a Heading one</h1>
<p>This is a paragraph.</p>
<h1>This is a Heading two</h1>
<p>This is a paragraph.</p>
<h1>This is a Heading three</h1>
<p><a href='testwww'> This is a paragraph.</a></p>
</body>
</html>";
$dom = new DOMDocument();
$dom->loadHTML($html);
//If you want to get it from a website you could do the following:
//$dom->loadHTML(file_get_contents('http://www.w3schools.com/'));
// iterate through the html to get all h1 text
foreach($dom->getElementsByTagName('h1') as $heading) {
$h1 = $heading->nodeValue;
echo $h1 . "<br>";
}
?>
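To plug this into the callback fragment from the question, the lookup can be wrapped in a small function that returns only the first heading. This is a sketch under assumptions: firstH1Text is a hypothetical name, and how the page HTML reaches the hook is not covered here.

```php
<?php
// Hypothetical helper: returns the text of the first <h1>, or '' if there is none.
function firstH1Text(string $html): string
{
    $dom = new DOMDocument();
    // Suppress the warnings libxml raises for imperfect real-world markup.
    @$dom->loadHTML($html);
    $h1 = $dom->getElementsByTagName('h1')->item(0);
    return $h1 ? trim($h1->textContent) : '';
}

$html = '<html><body><h1>This is a Heading one</h1><p>Text</p><h1>Heading two</h1></body></html>';

$seoTitle = ''; // nothing was entered in the dashboard
if (!$seoTitle) {
    $seoTitle = firstH1Text($html);
}
echo $seoTitle;
```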
Assuming you have the HTML content in a variable and run this after the page has been fully generated, take a look at the example below:
<?php
$htmlContent = '<html><body><h1>HELLO</h1></body></html>'; // change this to what you need
// Capture the text of the first <h1>; fall back to an empty string when there is none.
$seoTitle = preg_match('/<h1[^>]*>(.*?)<\/h1>/is', $htmlContent, $m) ? trim(strip_tags($m[1])) : '';
echo $seoTitle; // will output: HELLO
?>
echo "<h1>".(string)$seoTitle."</h1>";
This should work. You can also break out of PHP with ?>, write regular HTML, and then switch back into PHP when you want to echo the variable.

How to select Content of ALL div's with PHP

I want to select the contents of every DIV tag in PHP.
Imagine we have this HTML page:
<html>
<body>
<div class="one">Content1</div>
<span>blah..</span>
<div class="two">Content2</div>
</body>
</html>
Now, I want the content of every DIV tag. For example, from that HTML code I want Content1 in one variable, Content2 in another variable, and so on.
I just need to access the parts easily.
Every page has a random number of DIV tags, so I need flexible code that detects the DIV tags and puts the content of each one in an array or any other type of variable.
How do I do it?
DOMDocument
$divs = array();
$HTML = '<html>
<body>
<div class="one">Content1</div>
<span>blah..</span>
<div class="two">Content2</div>
</body>
</html>';
$doc = new DOMDocument();
$doc->loadHTML($HTML);
foreach($doc->getElementsByTagName('div') as $div) {
array_push($divs, $div->textContent);
}
var_dump($divs);
example
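If the parts need to be addressed by name instead of by position, a small variation keys the array by each div's class attribute. This is only a sketch: it assumes class values are unique per page, and the fallback key 'div' plus a counter is an invented convention for divs without a class.

```php
<?php
$HTML = '<html>
<body>
<div class="one">Content1</div>
<span>blah..</span>
<div class="two">Content2</div>
</body>
</html>';

$doc = new DOMDocument();
$doc->loadHTML($HTML);

$divs = [];
$n = 0;
foreach ($doc->getElementsByTagName('div') as $div) {
    // getAttribute() returns an empty string when the attribute is missing.
    $class = $div->getAttribute('class');
    $key = $class !== '' ? $class : 'div' . $n;
    $divs[$key] = trim($div->textContent);
    $n++;
}
print_r($divs);
```

With this layout, $divs['one'] and $divs['two'] give direct access to each part.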
Try using the strip_tags() function:
http://php.net/manual/en/function.strip-tags.php
You can download PHP Simple HTML DOM Parser
and access the div tags like this:
$html = file_get_html('urltopage.com');
foreach($html->find('div') as $e)
echo $e->innertext . '<br>';

DOMDocument remove script tags from HTML source

I used @Alex's approach here to remove script tags from an HTML document using the built-in DOMDocument. The problem is that when I have a script tag with inline JavaScript and another script tag that links to an external JavaScript file, not all script tags are removed from the HTML.
$result = '
<!doctype html>
<html>
<head>
<meta charset="utf-8">
<title>
hey
</title>
<script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/1.9.1/jquery.min.js"></script>
<script>
alert("hello");
</script>
</head>
<body>hey</body>
</html>
';
$dom = new DOMDocument();
if($dom->loadHTML($result))
{
$script_tags = $dom->getElementsByTagName('script');
$length = $script_tags->length;
for ($i = 0; $i < $length; $i++) {
if(is_object($script_tags->item($i)->parentNode)) {
$script_tags->item($i)->parentNode->removeChild($script_tags->item($i));
}
}
echo $dom->saveHTML();
}
The above code outputs:
<html>
<head>
<meta charset="utf-8">
<title>hey</title>
<script>
alert("hello");
</script>
</head>
<body>
hey
</body>
</html>
As you can see from the output, only the external script tag was removed. Is there anything I can do to ensure all script tags are removed?
Your error is actually trivial. The DOMNodeList returned by getElementsByTagName() is live: it is updated automatically whenever the document changes, most notably when nodes are removed. This is written on a couple of lines in the PHP doc, but is mostly swept under the carpet.
If you loop up to a length you read beforehand and remove nodes inside the loop, the list shrinks underneath you. In your code, removing item(0) leaves a single script in the list, so item(1) no longer exists on the second iteration and the inline script is never removed. I had to write my own library to counteract this and a few other quirks.
The solution:
if($dom->loadHTML($result))
{
    // Re-query on every pass and always remove the first remaining script.
    while (($r = $dom->getElementsByTagName("script")) && $r->length) {
        $r->item(0)->parentNode->removeChild($r->item(0));
    }
    echo $dom->saveHTML();
}
I'm not actually looping - just popping the first element one at a time. The result: http://sebrenauld.co.uk/domremovescript.php
To avoid the surprises of a live node list, which gets shorter as you delete nodes, you could work on a copy of it in an array using iterator_to_array:
foreach(iterator_to_array($dom->getElementsByTagName('script')) as $node) {
    $node->parentNode->removeChild($node);
}
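A third way to deal with a live list is to walk it backwards, so that each removal only affects indexes that have already been visited. The following is a self-contained sketch of that idea using a shortened version of the question's markup; it is an illustration, not something proposed in the answers above.

```php
<?php
$result = '<!doctype html><html><head><meta charset="utf-8"><title>hey</title>
<script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/1.9.1/jquery.min.js"></script>
<script>alert("hello");</script>
</head><body>hey</body></html>';

$dom = new DOMDocument();
$dom->loadHTML($result);

$scripts = $dom->getElementsByTagName('script');
// Start from the last item: removing it never shifts the items before it.
for ($i = $scripts->length - 1; $i >= 0; $i--) {
    $node = $scripts->item($i);
    $node->parentNode->removeChild($node);
}

$clean = $dom->saveHTML();
echo $clean;
```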

Get HTML source code of page with PHP

If I have the html file:
<!doctype html>
<html>
<head></head>
<body>
<!-- Begin -->
Important Information
<!-- End -->
</body>
</html>
How can I use PHP to get the string "Important Information" from the file?
If you already have the parsing sorted, just use file_get_contents(). You can pass it a URL and it will return the content found at the URL, in this case, the html. Or if you have the file locally, you pass it the file path.
In this simple example you can open the file and call fgets() until you find a line with <!-- Begin -->, then save the lines until you find <!-- End -->.
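A minimal sketch of that line-by-line approach could look like the following. The function name textBetweenMarkers is hypothetical, the temporary file only stands in for the real HTML file, and the code assumes each marker sits on a line of its own.

```php
<?php
// Hypothetical helper: collects the lines between two marker lines of a file.
function textBetweenMarkers(string $path, string $begin, string $end): string
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        return '';
    }
    $lines = [];
    $inside = false;
    while (($line = fgets($handle)) !== false) {
        if (strpos($line, $begin) !== false) {
            $inside = true;      // start collecting after the begin marker
            continue;
        }
        if (strpos($line, $end) !== false) {
            break;               // stop at the end marker
        }
        if ($inside) {
            $lines[] = trim($line);
        }
    }
    fclose($handle);
    return implode("\n", $lines);
}

// Demo with a temporary file that mirrors the HTML from the question.
$path = tempnam(sys_get_temp_dir(), 'html');
file_put_contents($path, "<!doctype html>\n<html>\n<head></head>\n<body>\n<!-- Begin -->\nImportant Information\n<!-- End -->\n</body>\n</html>\n");
$text = textBetweenMarkers($path, '<!-- Begin -->', '<!-- End -->');
unlink($path);
echo $text;
```

Reading line by line avoids loading the whole file into memory, which only matters for large files; for small ones the strpos()/substr() variant is just as good.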
If your HTML is in a variable you can just do:
<?php
$begin = strpos($var, '<!-- Begin -->') + strlen('<!-- Begin -->'); // You can hardcode this as 14 (the length of your 'needle')
$end = strpos($var, '<!-- End -->');
$text = substr($var, $begin, ($end - $begin));
echo $text;
?>
You can see the output here.
You can fetch the HTML like this:
// file_get_html() is a function from a third-party library
// Create DOM from URL or file
$html = file_get_html('http://www.example.com/');
For any further operations on the DOM, read the following docs:
http://de.php.net/manual/en/book.dom.php
