Remove a child element from an element's content with simple_html_dom (PHP)

This is the sample HTML:
<div class="leadContent">
<span> sentence 1 </span>
sentence 2
</div>
I just want to get sentence 2 (not the span tag or its content).
Is there any way to do this with simple_html_dom?
$html->find('div.leadContent', 0)->innertext;

You can remove the span tag by setting its outertext to an empty string:
$html->find('div.leadContent span', 0)->outertext = '';
Then read the remaining content:
$html->find('div.leadContent', 0)->innertext;
Edit: you can also do it this way:
$div = $html->find('div.leadContent', 0);
$div->find('span',0)->outertext = '';
After that, $div->innertext holds only the remaining content.

You can also select the text node directly (the index 2 depends on where that text node falls in your document):
echo $html->find('text',2)->plaintext;
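The same idea can be sketched without simple_html_dom, using PHP's built-in DOM extension: walk the direct children of the div and keep only the text nodes, so text inside the span is skipped. The helper name directText and the XPath string are assumptions made for illustration, not part of the answers above.

```php
<?php
// Sketch: return only the text that sits directly inside the first node
// matched by $query, ignoring text that belongs to child elements.
// directText is a hypothetical helper name.
function directText(string $html, string $query): string
{
    libxml_use_internal_errors(true); // keep parser warnings quiet
    $doc = new DOMDocument();
    $doc->loadHTML($html);

    $xpath = new DOMXPath($doc);
    $node = $xpath->query($query)->item(0);
    if ($node === null) {
        return ''; // nothing matched
    }

    $text = '';
    foreach ($node->childNodes as $child) {
        // Keep text nodes only; <span> and other elements are skipped.
        if ($child->nodeType === XML_TEXT_NODE) {
            $text .= $child->nodeValue;
        }
    }
    return trim($text);
}

$html = '<div class="leadContent"><span> sentence 1 </span> sentence 2 </div>';
echo directText($html, '//div[@class="leadContent"]'); // prints "sentence 2"
```

Unlike setting outertext to an empty string, this leaves the parsed document unchanged.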

Related

Simple HTML DOM - Skip certain element

I want to ignore the contents of the <a> inside the <h3> element and get only the text of the <h3>.
<h3>
144.000 TL
<a class="emlak-endeksi-link trackClick trackId_emlak-endeksi-link" id="emlakEndeksiLink">
Emlak Endeksi</a>
</h3>
For example, I only want to get 144.000 TL and ignore Emlak Endeksi.
foreach ($html1->find('div.classifiedInfo h3') as $price) {
$ilanlar['price'] = $price->plaintext;
}
I'm not very familiar with Simple HTML DOM, but selecting the text node (see http://simplehtmldom.sourceforge.net/manual.htm#frag_find_textcomment) should help:
$ilanlar['price'] = $price->find('text', 0)->plaintext;
Maybe removing the <a> tag helps:
$str = <<<str
<h3>
144.000 TL
<a class="emlak-endeksi-link trackClick trackId_emlak-endeksi-link" id="emlakEndeksiLink">
Emlak Endeksi</a>
</h3>
str;
$html = str_get_html($str);
// Find first <h3>
$h3 = $html->find('h3', 0);
// Find first <a> inside the <h3>, or use $h3->find('a') to find all of them
$a = $h3->find('a', 0);
// Remove <a> tag
$a->outertext = '';
// Output: "144.000 TL"
print trim($h3->innertext);
You can do it with a regular expression:
preg_match_all('/<h3>([^\n]*\n+)+<a([^\n]*\n+)+<\/h3>/', $content, $output);
// $output[1] is an array of captures, one entry per match
echo trim($output[1][0]);
https://regex101.com/r/qM5Nlk/1
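As an alternative to both removing the <a> tag and using a regular expression, the text node can be selected with XPath through PHP's built-in DOMXPath. This is only a sketch: the function name ownText is hypothetical, and the sample markup is shortened from the question.

```php
<?php
// Sketch: collect the non-blank text nodes that are direct children of
// every <$tag> element. ownText is a hypothetical helper name.
function ownText(string $html, string $tag): array
{
    libxml_use_internal_errors(true); // keep parser warnings quiet
    $doc = new DOMDocument();
    $doc->loadHTML($html);

    $xpath = new DOMXPath($doc);
    // text() matches direct text children; normalize-space() drops
    // nodes that contain only whitespace.
    $nodes = $xpath->query('//' . $tag . '/text()[normalize-space()]');

    $texts = [];
    foreach ($nodes as $node) {
        $texts[] = trim($node->nodeValue);
    }
    return $texts;
}

$html = <<<HTML
<h3>
144.000 TL
<a class="emlak-endeksi-link" id="emlakEndeksiLink">
Emlak Endeksi</a>
</h3>
HTML;

print_r(ownText($html, 'h3'));
```

The link text is excluded because it is a child of the <a> element, not of the <h3> itself.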

How to access a span tag without a class name

I have this code in my SimpleHtmlDom project.
How can I access these span tags without a class name?
<div class="somename">
<span>This text i need </span>
<span>This text i need too </span>
</div>
How can I echo those span tags?
I already tried this:
$html->find(".somename",0)->innertext;
I believe you are using simple_html_dom.php. If that is the case then:
$html->find("span",0)->innertext;
should give you the first span
$html->find("span",1)->innertext;
should give you the second span
$html->find("span");
should give you all spans in an array; loop over it and read innertext from each element, because the array itself has no innertext property
If you only want the text of the span without any nested markup, use plaintext instead of innertext
If you want to search specifically for spans in a div with the class somename, you can do it like this:
foreach ($html->find("div[class=somename] span") as $span) echo $span->innertext;
Reference: http://simplehtmldom.sourceforge.net/manual.htm
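Use XPath to get those span tags (note that SimpleXMLElement only accepts well-formed XML, so this works only if your HTML is well-formed):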
$xml = new SimpleXMLElement($yourHtmlContents);
$result = $xml->xpath('//span');
$firstSpan = (string) $result[0];
$secondSpan = (string) $result[1];
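For HTML that is not well-formed XML, a similar XPath lookup can be sketched with DOMDocument, whose loadHTML is more forgiving than SimpleXMLElement. The function name spanTexts is hypothetical, and the class name is assumed to be a plain identifier because it is inserted into the XPath expression unescaped.

```php
<?php
// Sketch: return the trimmed text of every <span> inside a <div> that
// carries the given class. spanTexts is a hypothetical helper name.
function spanTexts(string $html, string $class): array
{
    libxml_use_internal_errors(true); // keep parser warnings quiet
    $doc = new DOMDocument();
    $doc->loadHTML($html);

    $xpath = new DOMXPath($doc);
    // The concat/normalize-space idiom matches one class among several,
    // e.g. class="box somename".
    $query = sprintf(
        '//div[contains(concat(" ", normalize-space(@class), " "), " %s ")]//span',
        $class
    );

    $texts = [];
    foreach ($xpath->query($query) as $span) {
        $texts[] = trim($span->textContent);
    }
    return $texts;
}

$html = '<div class="somename"><span>This text i need </span>'
      . '<span>This text i need too </span></div><span>unrelated</span>';

foreach (spanTexts($html, 'somename') as $text) {
    echo $text, PHP_EOL;
}
```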

Replace DIVs without attributes with P

I have a string which contains html.
Example:
$content="<div>content<div style=''>some<div>another</div></div> <div>test</div> </div>";
I want to replace all divs without attributes with paragraphs.
I tried
$content = preg_replace( '/<div>(.*?)<\/div>/', '<p>$1</p>', $content);
but it returns:
<p>content<div style=''>some<div>another</p></div> <p>test</p> </div>
which is not what I want. I want to replace every div without attributes with a p.
What should I do?
Thank you!
$content = str_replace("<div>","<p>",$content);
// Note: this replaces every closing tag, including the one that belongs to <div style=''>
$content = str_replace("</div>","</p>",$content);
echo $content;
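Because string replacement cannot tell which closing tag belongs to which opening tag, one possible alternative is to parse the markup and swap the elements in the DOM. The following is a sketch under assumptions: divsToParagraphs and the wrapper id gr-wrapper are hypothetical names, and the serialized output may differ from the input in whitespace and attribute quoting.

```php
<?php
// Sketch: turn every attribute-less <div> into a <p>, leaving divs that
// have attributes alone. divsToParagraphs and "gr-wrapper" are
// hypothetical names chosen for this example.
function divsToParagraphs(string $html): string
{
    libxml_use_internal_errors(true); // keep parser warnings quiet
    $doc = new DOMDocument();
    // Wrap the fragment so it can be located again after parsing.
    $doc->loadHTML('<div id="gr-wrapper">' . $html . '</div>');
    $xpath = new DOMXPath($doc);

    // Copy the matches into an array before modifying the tree.
    $targets = iterator_to_array($xpath->query('//div[not(@*)]'));
    foreach ($targets as $div) {
        $p = $doc->createElement('p');
        while ($div->firstChild !== null) {
            $p->appendChild($div->firstChild); // moves the child into <p>
        }
        $div->parentNode->replaceChild($p, $div);
    }

    // Serialize only the children of the wrapper.
    $wrapper = $xpath->query('//div[@id="gr-wrapper"]')->item(0);
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$content = "<div>content<div style=''>some<div>another</div></div> <div>test</div> </div>";
echo divsToParagraphs($content);
```

Keep in mind that a <p> containing a <div> or another <p> is not valid HTML, so a browser may re-parse the result differently from how it is serialized here.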

How to select the content of all divs with PHP

I want to select the contents of every div tag in PHP.
Imagine we have this HTML page:
<html>
<body>
<div class="one">Content1</div>
<span>blah..</span>
<div class="two">Content2</div>
</body>
</html>
Now I want the content of every div tag. For example, from that HTML I want Content1 in one variable, Content2 in another, and so on.
I just need to access the parts easily.
Every page has a different number of div tags, so I need flexible code that detects the div tags and puts the content of each one in an array or some other variable.
How can I do it?
Use DOMDocument:
$divs = array();
$HTML = '<html>
<body>
<div class="one">Content1</div>
<span>blah..</span>
<div class="two">Content2</div>
</body>
</html>';
$doc = new DOMDocument();
$doc->loadHTML($HTML);
foreach($doc->getElementsByTagName('div') as $div) {
array_push($divs, $div->textContent);
}
var_dump($divs);
Try using the strip_tags() function:
http://php.net/manual/en/function.strip-tags.php
You can download the PHP Simple HTML DOM Parser
and access the div tags like this:
$html = file_get_html('urltopage.com');
foreach($html->find('div') as $e)
echo $e->innertext . '<br>';

Zend_Dom_Query query element issue

I have a div that doesn't have a class or id. Is it possible to select a div element when I know its innerText? For example:
<div class="thishere"></div>
<div>Search on a this text</div>
If not, the div before it has a class; how do I find its next sibling?
$selector = new Zend_Dom_Query($response->getBody());
$nodes = $selector->query('????');
Using JavaScript, you can loop through every element on the page and find the div with the special class. The next element in the loop will then be the second div, and you can read its contents using element.innerHTML.
$text = <<<text
<div class="thishere"></div>
<div>Search on a this text</div>
text;
$selector = new Zend_Dom_Query ($text);
$nodes = $selector->queryXpath('//div[contains(text(),"Search on a this text")]');
foreach ($nodes as $node)
{
...
}
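The second part of the question, finding the next sibling of the div that has a class, can be sketched with the XPath following-sibling axis. This example uses PHP's built-in DOMXPath so that it runs on its own; the same expression could presumably be passed to queryXpath as in the answer above, but that is an untested assumption. The function name nextSiblingDivText is hypothetical.

```php
<?php
// Sketch: return the text of the first <div> that follows a <div> with
// the given class attribute. nextSiblingDivText is a hypothetical name.
function nextSiblingDivText(string $html, string $class): ?string
{
    libxml_use_internal_errors(true); // keep parser warnings quiet
    $doc = new DOMDocument();
    $doc->loadHTML($html);

    $xpath = new DOMXPath($doc);
    // following-sibling::div[1] is the nearest later sibling <div>.
    $query = sprintf('//div[@class="%s"]/following-sibling::div[1]', $class);
    $node = $xpath->query($query)->item(0);

    return $node === null ? null : trim($node->textContent);
}

$text = <<<HTML
<div class="thishere"></div>
<div>Search on a this text</div>
HTML;

echo nextSiblingDivText($text, 'thishere'); // prints "Search on a this text"
```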
