This question already has answers here:
Notice: Trying to get property of non-object error
(3 answers)
Closed 7 years ago.
I've been getting an error message for the following piece of code (I'm trying to get the content inside the 'article' tags on a certain web page):
function getTextFromLink($url) {
$html = new DOMDocument();
$html->loadHTML($url);
$text = $html->getElementsByTagName('article')->item(0)->textContent;
return $text;
}
It says that I'm trying to get the property of a non-object on the line with
$text = $html->getElementsByTagName('article')->item(0)->textContent;
I'm fairly new to php and DOM; what am I missing here?
You have two problems in your code:
The immediate problem is that $html->getElementsByTagName('article')->item(0) is not an object. It is null, because the HTML you're parsing doesn't contain any article elements. You could have found this out by following Devon's advice and inspecting the value of $html->getElementsByTagName('article')->item(0) with var_dump().
Why doesn't your HTML contain any article elements? The underlying problem is that loadHTML() parses HTML from a string. When you call $html->loadHTML($url);, PHP parses the contents of the string variable $url itself as HTML code and gives you a DOMDocument representing the result. Given that you named the variable $url, I'm pretty sure that's not what you want.
What you probably want instead is loadHTMLFile(), which loads HTML code from a named file (or, apparently, a URL) rather than from a PHP string.
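A minimal sketch of the fix described above, assuming the page is reachable from PHP (loading a URL this way depends on the allow_url_fopen setting). The null check and the libxml error handling are additions that are not in the original code; they are there because HTML5 tags such as article tend to trigger parser warnings, and because item(0) returns null when nothing matches.

```php
<?php
// Sketch only: returns the text of the first <article>, or null if there is none.
function getTextFromLink(string $url): ?string
{
    // Collect libxml parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $loaded = $doc->loadHTMLFile($url); // loads from a file path or URL, not from a string

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        return null;
    }

    // item(0) is null when the document has no <article> element.
    $article = $doc->getElementsByTagName('article')->item(0);

    return $article === null ? null : $article->textContent;
}
```

Callers can then test the return value against null before using it, which avoids the "property of non-object" notice.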
This question already has answers here:
Remove a child with a specific attribute, in SimpleXML for PHP
(18 answers)
Closed 7 years ago.
I know there are a lot of answers out there for this exact question, but none of them seem to help me solve my problem.
I have an XML file on my server, and I need to use PHP SimpleXML to remove an element from the document. After some googling I found a number of answers saying to use unset() and then save the XML.
So I came up with this:
function deleteCourse($course){
$xml = self::getxml(); # get the XML file
unset($xml->xpath('course[name = "'.$course.'"]'));
$xml->asXml("data.xml");
}
Now whenever I run this I get this error: PHP Fatal error: Can't use method return value in write context in blahblahLink on line 92
Line 92 is unset($xml->xpath('course[name = "'.$course.'"]'));
I really hope somebody can help me out with this.
unset() won't work on a method's return value; it needs a variable or an array element. Store the result of xpath() in a variable first and call unset() on that instead.
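A hedged sketch of how that advice could look with SimpleXML. The function signature (taking the SimpleXMLElement as a parameter and returning a count) is an assumption made for illustration, and the name comparison is done in PHP rather than inside the XPath string so that quotes in the course name cannot break the query. The unset($node[0]) form is a commonly used SimpleXML idiom for removing the element a variable points to.

```php
<?php
// Sketch only: removes every <course> child whose <name> equals $course.
// Returns the number of elements removed.
function deleteCourse(SimpleXMLElement $xml, string $course): int
{
    $removed = 0;

    // Keep the xpath() result in a variable; unset() cannot take a method call.
    $courses = $xml->xpath('course') ?: [];

    foreach ($courses as $node) {
        if ((string) $node->name === $course) {
            // Self-reference idiom: removes this element from its parent.
            unset($node[0]);
            $removed++;
        }
    }

    return $removed;
}
```

After the call, $xml->asXML("data.xml") writes the modified document back, as in the question.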
This question already has answers here:
Simple XML - Dealing With Colons In Nodes
(4 answers)
Closed 9 years ago.
I'm trying to navigate an XML block similar to the one below ($doc) using PHP's simplexml_load_string, and I use xpath on $doc to get only the 'Day' block like this:
$myday = $doc->xpath ('//Day');
That lets me access all data from the block as an object, meaning
$myday->AdultsCount;
returns 1 and
$myday->Id;
returns "6a0".
However, I can't access the "SpecialDeals" content, either using:
$myday->SpecialDeals
or using:
$myday->SpecialDeals->a:string
What is the right syntax in this case?
<Days>
<DaysId>687</DaysId>
<Day>
<AdultsCount>1</AdultsCount>
<Availability>Available</Availability>
<Id>6a0</Id>
<RoomType>Studio</RoomType>
<SpecialDeals xmlns:a="http://microsoft.com/2003/Arrays">
<a:string>Best Day Ever</a:string>
</SpecialDeals>
</Day>
<DaysPrice>247.4</DaysPrice>
</Days>
You can access tags with colons in them (the part before the colon is a namespace prefix) using the children() method:
echo $xml->Day->SpecialDeals->children('a', true)->string[0];
This SitePoint article explains namespaces in detail.
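For illustration, here is a short sketch that applies children() to a trimmed copy of the XML from the question. The variable names $xmlString, $byPrefix and $byUri are made up for this example. Note that xpath() returns an array, so the first match is taken with [0] before any properties are read.

```php
<?php
// Trimmed copy of the XML from the question.
$xmlString = <<<XML
<Days>
  <DaysId>687</DaysId>
  <Day>
    <AdultsCount>1</AdultsCount>
    <Id>6a0</Id>
    <SpecialDeals xmlns:a="http://microsoft.com/2003/Arrays">
      <a:string>Best Day Ever</a:string>
    </SpecialDeals>
  </Day>
</Days>
XML;

$doc = simplexml_load_string($xmlString);

// xpath() returns an array of matches; pick the first <Day>.
$days = $doc->xpath('//Day');
$myday = $days[0];

// Select the namespaced children by prefix (second argument true) ...
$byPrefix = (string) $myday->SpecialDeals->children('a', true)->string[0];

// ... or by the namespace URI, which does not depend on the prefix chosen in the document.
$byUri = (string) $myday->SpecialDeals->children('http://microsoft.com/2003/Arrays')->string[0];

echo $byPrefix, PHP_EOL;
```

Selecting by URI is usually the more robust choice, since a document is free to bind a different prefix to the same namespace.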
This question already has answers here:
Reg expression to remove empty Tags (any of them)?
(3 answers)
Closed 9 years ago.
As mentioned in the title, I'd like to remove all empty elements from an XML document.
By empty I mean elements that don't have any text nodes in them or in their children.
Is it possible to do that with phpQuery?
I used Gordon's code from the answer in this topic: Reg expression to remove empty Tags (any of them)?
First I tried simply passing his XPath query to the phpQueryObject::find() method, but it gave me a warning saying the query is incorrect. I don't know why, since it uses DOMXPath and should work.
Anyway, the solution was still quite simple.
$pqDoc = phpQuery::newDocument(); // phpQueryObject created some way. Doesn't matter here.
$xp = new DOMXPath($pqDoc->getDOMDocument());
// Select elements with no child nodes at all, or whose text content is only whitespace.
foreach ($xp->query('//*[not(node()) or normalize-space() = ""]') as $node) {
    $node->parentNode->removeChild($node);
}
Now the empty elements are removed, and you can keep using your phpQueryObject, since the loop worked on the very DOMDocument it refers to.
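The same XPath approach can be tried without phpQuery. The sketch below uses only DOMDocument and DOMXPath; the function name removeEmptyElements and its return value are assumptions made for this example. Be aware that the expression also matches elements that carry only attributes, so it fits the "no text anywhere inside" definition given above and nothing narrower.

```php
<?php
// Sketch only: removes elements with no child nodes or only whitespace text.
// Returns the number of elements removed.
function removeEmptyElements(DOMDocument $doc): int
{
    $xp = new DOMXPath($doc);
    $nodes = $xp->query('//*[not(node()) or normalize-space() = ""]');

    $removed = 0;
    foreach ($nodes as $node) {
        // The query result is a snapshot, so removing nodes while looping is safe here.
        if ($node->parentNode !== null) {
            $node->parentNode->removeChild($node);
            $removed++;
        }
    }

    return $removed;
}

$doc = new DOMDocument();
$doc->loadXML('<root><keep>text</keep><empty/><outer><inner>  </inner></outer></root>');
$removedCount = removeEmptyElements($doc);
$result = $doc->saveXML($doc->documentElement);
```

Nested empty elements are each counted, because both the outer and the inner element match the expression.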
This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
How to parse and process HTML with PHP?
I'm trying to scrape a page with PHP using file_get_contents().
This page has some JSON wrapped in a bit of HTML. I'd like to strip out this HTML to be able to use json_decode() on the scraped string so I can deal with the JSON separately.
Is there any clean way to do that? A quick search didn't really lead to anything.
Thanks
Parsing or stripping HTML content is always tricky, because the common regex-based solutions can break if the HTML markup is malformed, and they are painfully slow. I would suggest using this little HTML DOM parser class:
http://simplehtmldom.sourceforge.net/
edited & added from subcomment:
Okay, this is a bad one, because the inline JavaScript is not properly wrapped in CDATA tags. Otherwise something like this might work:
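$html = new simple_html_dom();
$html->load_file('your-external-file');
foreach ($html->find('script') as $obj) {
    // strpos() returns 0 for a match at the very start, so compare against false explicitly.
    if (isset($obj->innertext) && strpos($obj->innertext, 'window._jscalls') !== false) {
        echo $obj->innertext;
    }
}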
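To get from the script text to json_decode(), something like the following sketch might work. It swaps in PHP's built-in DOMDocument for the Simple HTML DOM class, and it assumes the script contains a single JSON object literal (for example window._jscalls = {...};), which the original post does not confirm. The function name extractJsonFromScripts is hypothetical.

```php
<?php
// Sketch only: finds the first <script> containing $marker and decodes the
// outermost {...} in it. Assumes that span is valid JSON.
function extractJsonFromScripts(string $html, string $marker): ?array
{
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('script') as $script) {
        $code = $script->textContent;
        if (strpos($code, $marker) === false) {
            continue;
        }

        // Take everything from the first "{" to the last "}".
        $start = strpos($code, '{');
        $end = strrpos($code, '}');
        if ($start === false || $end === false || $end < $start) {
            continue;
        }

        $data = json_decode(substr($code, $start, $end - $start + 1), true);
        if (is_array($data)) {
            return $data;
        }
    }

    return null;
}
```

If the script holds several objects or JavaScript that is not strict JSON, this simple brace search will not be enough and null is returned.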
This question already has answers here:
Getting actual value from PHP SimpleXML node [duplicate]
(4 answers)
Closed 8 years ago.
I am using simplexml_load_string for XML packets. In my scenario, the XML string I want to convert is known as k.
My problem, however, is that when I use k, the tags (<k>, </k>) still remain in the result instead of being parsed away.
For example, I use
$x->k, and I get back <k>DATA I WANT HERE</k>.
How do I get rid of these?
What the code does: It connects to a game and logs in.
SimpleXML has no InnerNode property: $x->k->InnerNode would look for a child element named InnerNode and come back empty. To get the value without the tags, cast the element to a string:
(string)$x->k
I tried this and seem to be getting the string.
<?php
$str = "<msg t='sys'><body action='rndK' r='-1'><k>qH~e9Gmt</k></body></msg>";
$xml = simplexml_load_string( $str );
echo $xml->body->k; // gives 'qH~e9Gmt'
?>
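To make the difference visible, the sketch below reads the same element three ways. That the tags in the question came from serializing the element with asXML() (or a similar dump) is an assumption, since the question does not show that part of the code.

```php
<?php
$str = "<msg t='sys'><body action='rndK' r='-1'><k>qH~e9Gmt</k></body></msg>";
$xml = simplexml_load_string($str);

$element = $xml->body->k;                 // still a SimpleXMLElement, not a string
$withTags = $element->asXML();            // serialized markup, tags included
$value = (string) $element;               // text content only
$action = (string) $xml->body['action'];  // attributes are read with array syntax

echo $value, PHP_EOL;
```

Casting early is a good habit when the value is stored, compared, or sent over the network, because many functions do not convert a SimpleXMLElement the way echo does.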