How to select the content of all divs with PHP - php

I want to select the contents of every DIV tag in PHP.
Suppose we have this HTML page:
<html>
<body>
<div class="one">Content1</div>
<span>blah..</span>
<div class="two">Content2</div>
</body>
</html>
Now I want the content of every DIV tag. For example, from that HTML I want Content1 in one variable, Content2 in another, and so on.
I just need easy access to each part.
Every page has a different number of DIV tags, so I need flexible code that detects the DIV tags and puts the content of each one in an array or some other kind of variable.
How can I do this?

Use DOMDocument:
$divs = array();
$HTML = '<html>
<body>
<div class="one">Content1</div>
<span>blah..</span>
<div class="two">Content2</div>
</body>
</html>';
$doc = new DOMDocument();
$doc->loadHTML($HTML);
foreach($doc->getElementsByTagName('div') as $div) {
array_push($divs, $div->textContent);
}
var_dump($divs);

Try the strip_tags() function:
http://php.net/manual/en/function.strip-tags.php
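For comparison, here is a minimal sketch of what strip_tags() does with the sample page from the question. It removes the markup but returns one string for the whole document, so the text of the span is mixed in and the div contents are not kept apart. The variable names are illustrative only.

```php
<?php
// Sample page from the question.
$html = '<html>
<body>
<div class="one">Content1</div>
<span>blah..</span>
<div class="two">Content2</div>
</body>
</html>';

// strip_tags() removes the tags and keeps the text between them.
$text = strip_tags($html);

// Split on whitespace runs to see which pieces of text survived.
$pieces = preg_split('/\s+/', trim($text));
print_r($pieces);
```

The span text ends up in the result alongside the two div contents, which is why a DOM parser is the better fit when only the div elements are wanted.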

You can download the PHP Simple HTML DOM Parser
and access the div tags like this:
$html = file_get_html('urltopage.com');
foreach($html->find('div') as $e)
echo $e->innertext . '<br>';

Related

How to stop adding new values to an array after one value is added to that array?

I have some HTML files that contain the same tags with different strings between them. I want to get the string from a specific tag, and once the first match is found, that string should be the only one added to the array. See the code below for details.
The html:
<!DOCTYPE html>
<html>
<head></head>
<body>
<h1>Some Text</h1>
<p>This is the first Paragraph</p>
<ul>
<li></li>
<li></l1>
</ul>
<p>This is the second Pharagraph</p>
</body>
</html>
The HTML files will contain more elements.
I want to get the text inside the first <p> only, without wasting time searching the whole HTML file when I need just one value from a specific tag.
The PHP:
//Loop inside all the HTML files inside a folder
$files = glob("files/*.html");
foreach($files as $file){
//Get the whole content of each HTMl file
$content = file_get_contents($file);
//Search for specific tag
preg_match_all('#<p>(.*?)</p>#', $content, $matches);
}
I only want to add the value of the first match to $matches.
I can't edit the HTML to add a class or id to the tags I want to read, because I didn't create the files and I can't edit them all manually.
I don't mind using another approach, as long as it does what I want: take only the first match and then stop searching the file.
You can do this with DOMDocument.
<?php
$html = '<!DOCTYPE html>
<html>
<head></head>
<body>
<h1>Some Text</h1>
<p>This is the first Paragraph</p>
<ul>
<li></li>
<li></l1>
</ul>
<p>This is the second Pharagraph</p>
</body>
</html>';
$err = libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();
libxml_use_internal_errors($err);
// find all p tags, select the first, get its value
$pValue = $dom->getElementsByTagName('p')->item(0)->nodeValue;
//This is the first Paragraph
echo $pValue;
https://3v4l.org/kjFoC
So if you wanted to add to your code, perhaps do it like:
<?php
function getFirstParagraph($src) {
$err = libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($src);
libxml_clear_errors();
libxml_use_internal_errors($err);
$p = $dom->getElementsByTagName('p')->item(0);
// item(0) is null when the file has no <p> at all
return $p !== null ? $p->nodeValue : null;
}
//Loop inside all the HTML files inside a folder
$files = glob("files/*.html");
foreach($files as $file){
//Get the whole content of each HTMl file
$content = file_get_contents($file);
//Store only the text of the first <p> of this file
$matches[] = getFirstParagraph($content);
}
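If you would rather stay with a regular expression, a sketch along these lines uses preg_match(), which stops at the first match, instead of preg_match_all(). It assumes the paragraphs are plain <p> tags without attributes, as in the sample. The helper name firstParagraphByRegex is made up for illustration.

```php
<?php
// Hypothetical helper: returns the text of the first <p>...</p>, or null if there is none.
function firstParagraphByRegex(string $content): ?string
{
    // preg_match() returns after the first match; the "s" flag lets "." span newlines.
    if (preg_match('#<p>(.*?)</p>#s', $content, $m) === 1) {
        return $m[1];
    }
    return null;
}

$sample = '<h1>Some Text</h1><p>This is the first Paragraph</p><p>This is the second</p>';
$first = firstParagraphByRegex($sample);
echo $first, "\n";
```

A regular expression like this is fragile against attributes or nested markup, so the DOMDocument version remains the safer choice for HTML you do not control.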

Append HTML into div tag with PHP

I want to append a PHP variable containing HTML text to a preloaded div element in the same file. I am using simplified examples to show what I am trying to achieve.
<?php
$htmlString = "<p>Hello World!</p>";
?>
$htmlString is generated by a PHP function, so the value above is just sample HTML standing in for the real output. I am trying to put $htmlString inside this div element:
<div id="demo"><h1>Test</h1></div>
I have tried the following but it does not work:
<?php
$dom = new domDocument;
$dom->loadHTML($html);
$div_tag = $dom->getElementById('demo');
echo $dom->saveHTML($div_tag);
?>
I want to produce this output:
<div id="demo"><h1>Test</h1><p>Hello World!</p></div>
You can call PHP in between HTML tags:
<div id="demo"><h1>Test</h1><?php echo $htmlString ?></div>
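If the div has to be modified through the DOM instead, a minimal sketch could look like the following. It assumes the markup is available as a string in $html (the question's snippet never defines that variable) and that $htmlString is well-formed XML, which createDocumentFragment()/appendXML() require.

```php
<?php
$html = '<div id="demo"><h1>Test</h1></div>';   // assumed source markup
$htmlString = '<p>Hello World!</p>';

$dom = new DOMDocument();
$dom->loadHTML($html);

// Locate the target div by its id attribute.
$xpath = new DOMXPath($dom);
$div = $xpath->query('//div[@id="demo"]')->item(0);

// Parse the fragment and append it as the last child of the div.
$fragment = $dom->createDocumentFragment();
$fragment->appendXML($htmlString);
$div->appendChild($fragment);

// Serialize only the div, not the html/body wrapper that loadHTML() adds.
$result = $dom->saveHTML($div);
echo $result;
```

The serializer may insert line breaks between block elements, so the output can differ from the desired string in whitespace only.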

Scrape an H1 element on the current page in PHP

I'm currently working with WordPress. I have a hook that runs before the <title> element is populated with text that a user enters in the dashboard.
Now I want the default title of each page to equal the text of an <h1> element on the current page. A fragment of the callback function for the hook I'm working with looks like this:
if (!$seoTitle) {
$seoTitle = '<....>';
}
return $seoTitle;
I want $seoTitle to default to the text of an <h1> element on the current page. Is it doable? How can I achieve this?
I'm not totally sure how you get your HTML, but you could parse it with the built-in DOM parser.
<?php
$html = "<!DOCTYPE html>
<html>
<head>
<title>Page Title</title>
</head>
<body>
<h1>This is a Heading one</h1>
<p>This is a paragraph.</p>
<h1>This is a Heading two</h1>
<p>This is a paragraph.</p>
<h1>This is a Heading three</h1>
<p><a href='testwww'> This is a paragraph.</a></p>
</body>
</html>";
$dom = new DOMDocument();
$dom->loadHTML($html);
//If you want to get it from a website you could do the following:
//$dom->loadHTML(file_get_contents('http://www.w3schools.com/'));
// iterate through the html to get all h1 text
foreach($dom->getElementsByTagName('h1') as $heading) {
$h1 = $heading->nodeValue;
echo $h1 . "<br>";
}
?>
Assuming you have your HTML content in a variable and run this after the page has been fully generated, take a look at the example below:
<?php
$htmlContent = '<html><body><h1>HELLO</h1></body></html>'; // change this to what you need
$seoTitle = preg_replace('/(.*)<h1>([^>]*)<\/h1>(.*)/is', '$2', $htmlContent);
echo $seoTitle; // will output: HELLO
?>
echo "<h1>".(string)$seoTitle."</h1>";
Should work. You can also break out of PHP with ?>, type regular HTML, and then switch back into PHP when you want to echo the variable.
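Putting the DOM approach together with the callback fragment from the question, a hedged sketch might look like this. The helper name firstHeadingText and the 'Untitled' fallback are assumptions for illustration, and how the page HTML reaches the callback is left open, as it is in the question.

```php
<?php
// Hypothetical helper: text of the first <h1> in $html, or $fallback if there is none.
function firstHeadingText(string $html, string $fallback = ''): string
{
    // Silence warnings about imperfect markup, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $h1 = $dom->getElementsByTagName('h1')->item(0);
    return $h1 !== null ? trim($h1->textContent) : $fallback;
}

// Stand-in for the current page's markup.
$page = '<html><body><h1> This is a Heading one </h1><h1>Two</h1></body></html>';

$seoTitle = '';
if (!$seoTitle) {
    $seoTitle = firstHeadingText($page, 'Untitled');
}
echo $seoTitle, "\n";
```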

Select Content of div using php

I have a div named "main" in my page. At the end of the page I put code that converts HTML into a PDF using PHP. I want to select the content of that div (it contains paragraphs, charts, tables etc.).
How can I do that?
The code below shows how to get a DIV tag's content using PHP.
PHP Code:
<?php
$file = "test.html";
$source = new DOMDocument();
$source->loadHTMLFile($file);
$xpath = new DOMXPath($source);
// Select every div whose id attribute is "test"
$nodes = $xpath->query("//div[@id='test']");
foreach ($nodes as $node) {
print "The Type of the element is: " . $node->nodeName . "\n";
print "<b><pre><code>";
// Print the text of each child node of the div
foreach ($node->childNodes as $child) {
print $child->nodeValue;
}
print "</code></pre></b>";
}
?>
This selects the DIV tag with the ID "test"; you can replace it with the ID you need.
test.html
<div id="test">This is my content</div>
Output:
The Type of the element is: div
This is my content
You should put the PHP code into a separate file from the HTML and use something like DOMDocument to get the content of the div.
$dom = new DOMDocument();
$dom->loadHTMLFile('yourfile.html');
...
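As one possible way to continue from here, the sketch below collects the inner HTML of an element by id, which is the kind of string a PDF converter would typically take. The helper name innerHtmlById and the sample markup are assumptions; the id "main" comes from the question.

```php
<?php
// Hypothetical helper: inner HTML of the element with the given id, or null if not found.
function innerHtmlById(string $html, string $id): ?string
{
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    // $id is interpolated into the query: fine for fixed ids, not for untrusted input.
    $node = $xpath->query("//*[@id='$id']")->item(0);
    if ($node === null) {
        return null;
    }

    // Serialize each child so the wrapping element itself is left out.
    $inner = '';
    foreach ($node->childNodes as $child) {
        $inner .= $dom->saveHTML($child);
    }
    return $inner;
}

$page = '<html><body><div id="main"><p>Report</p><table><tr><td>1</td></tr></table></div></body></html>';
$main = innerHtmlById($page, 'main');
echo $main, "\n";
```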
You cannot directly interact with the HTML DOM via PHP.
What you could do is use a form with an input containing your content. When the form is submitted, you can access the data via PHP.
But maybe you want to use Javascript for that task?
Nevertheless, a quick'n'dirty PHP example:
<form action="" method="post">
<textarea name="content">hello world</textarea>
</form>
<?php
if (isset($_POST['content'])) {
echo $_POST['content'];
}
?>

PHP or Javascript: Simply Remove and Replace HTML Code

I have this code on my page, but the links have different names and ids:
<div class="myclass">
<a href="http://www.example.com/?vstid=00575000&veranstaltung=http://www.example.com/page.html">
Example Text</a>
</div>
How can I remove it and replace it with this:
<div class="myclass">Sorry no link</div>
using PHP or JavaScript? I tried it with str.replace.
Thank you!
I assume you mean dynamically? You won't be able to do this with PHP because it runs on the server and has nothing to do with the HTML once it's been output to the screen.
See: http://www.tizag.com/javascriptT/javascript-innerHTML.php for the JavaScript.
Or you could use jQuery, which is nicer than trying to write a cross-browser-compatible JavaScript script yourself.
$('.myclass').html('Sorry...');
If the page is still on the server before you need to make the replacement, do this:
<?php if (allowed_to_see_link()) { ?>
<div class="myclass">
<a href="http://www.example.com/?vstid=00575000&veranstaltung=http://www.example.com/page.html">
Example Text</a>
</div>
<?php } else { ?>
non-link-text
<?php } ?>
You also need to write the named function (allowed_to_see_link() here).
You might want to clarify what you are trying to do. If that is your file, you can simply open it in an editor and remove the portions. If you want to modify HTML with PHP, you can use the native DOM extension:
$dom = new DOMDocument;
$dom->loadHTML($htmlString);
$xPath = new DOMXPath($dom);
foreach( $xPath->query('//div[@class="myclass"]/a') as $link) {
$link->parentNode->replaceChild(new DOMText('Sorry no link'), $link);
}
echo $dom->saveHTML();
The above code would replace any direct <a> element children of any <div> elements that have a class attribute of myclass with the Textnode "Sorry no link".
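One caveat: a test like @class="myclass" only matches when the class attribute is exactly that string. A sketch of the common XPath idiom for matching one class among several follows; the sample markup with the extra "box" class is made up for illustration.

```php
<?php
$htmlString = '<div class="box myclass"><a href="http://www.example.com/page.html">Example Text</a></div>';

$dom = new DOMDocument();
$dom->loadHTML($htmlString);
$xPath = new DOMXPath($dom);

// Pad the class list with spaces so " myclass " matches only the whole class token.
$query = '//div[contains(concat(" ", normalize-space(@class), " "), " myclass ")]/a';

foreach ($xPath->query($query) as $link) {
    // Swap each matching link for a plain text node.
    $link->parentNode->replaceChild($dom->createTextNode('Sorry no link'), $link);
}

$result = $dom->saveHTML();
echo $result;
```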
