I have the following template file, named 'test.html':
<div class='title'>TEST</div>
And I have the following PHP code:
<?
include "simplehtmldom/simple_html_dom.php";
$dom = file_get_html( "test.html" );
echo $dom->outertext;
?>
So far so good, this displays the file test.html. But when I try to change something I get an error:
<?
include "simplehtmldom/simple_html_dom.php";
$dom = file_get_html( "test.html" );
$dom->find('.title')->innertext = "changed";
echo $dom->outertext;
?>
Warning: Attempt to assign property of non-object in E:\internet\test.php on line 4
I believe I'm following the manual exactly, so what is going wrong here? Obviously $dom->find('.title') didn't return a valid element, but the question is: why? Shouldn't it find the DIV?
First:
You left out the index of the element you want. Without an index, find() returns an array of matching elements, and an array has no innertext property, so find()->innertext cannot be assigned.
Repaired code:
<?php
error_reporting(E_ALL);
include "simplehtmldom/simple_html_dom.php";
$dom = file_get_html( "test.html" );
$dom->find('.title',0)->innertext = "changed";
echo $dom->outertext;
Second:
I wouldn't recommend the Simple HTML DOM library, because it is old and out of date.
Take a look at the QueryPath library, which does the same job and is in better condition.
http://querypath.org/
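If a third-party library is not a requirement, the same edit can also be sketched with PHP's bundled DOM extension. This is a swapped-in technique, not Simple HTML DOM or QueryPath; the markup is the test.html content from the question, inlined as a string so the example runs on its own. As with find(), the query returns a list, so one node has to be picked out before it can be changed.

```php
<?php
// Contents of test.html from the question, inlined for a self-contained sketch.
$source = "<div class='title'>TEST</div>";

$doc = new DOMDocument();
// These flags ask libxml not to wrap the fragment in <html><body> or add a doctype.
$doc->loadHTML($source, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

$xpath = new DOMXPath($doc);
// Match any element whose class list contains "title" (XPath equivalent of ".title").
$nodes = $xpath->query("//*[contains(concat(' ', normalize-space(@class), ' '), ' title ')]");

// query() returns a DOMNodeList; item(0) is the first match, or null if there is none.
$title = $nodes->item(0);
if ($title !== null) {
    $title->nodeValue = 'changed';
}

$result = trim($doc->saveHTML());
echo $result, PHP_EOL;
```

The null check matters for the same reason the index does in Simple HTML DOM: the selector may match nothing, and assigning to a property of a missing node fails.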
Related
I would like to scrape the table t-02 from https://wyniki.tge.pl/wyniki/rdn/ with HTML DOM Parser.
I wrote some simple code, but I got this error:
Fatal error: Call to a member function find() on null in /Users/piotrek/Sites/foo/index.html on line 3
My first code:
<?php
include ("simple_html_dom.php");
$html=file_get_html("https://wyniki.tge.pl/pl/wyniki/rdn/");
$tables=$html->find("table[#class=t-02]");
foreach($tables->find("tr") as $a) {
foreach($a->find("td") as $element) {
echo $element;
}
}
?>
I changed the code to print innertext and it worked:
<?php
include_once ("simple_html_dom.php");
$html=file_get_html("https://wyniki.tge.pl/pl/wyniki/rdn/");
$title=$html->find("table[#class=t-02]",0)->innertext;
echo $title
?>
I only changed the code, so what was wrong with the first approach? What caused the fatal error?
The error says that the $html object is null, so there is nothing to call the find method on. Is the simple_html_dom.php file in the same directory as your script?
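Separately from whether the page loaded, the two snippets differ in one more way: the first calls find() on the result of a find() without an index, which in Simple HTML DOM is an array of matches, while the second asks for match 0. A rough sketch of the "pick the first table, then walk rows and cells" pattern, with a check at each step, might look like the following. It uses PHP's bundled DOM extension rather than Simple HTML DOM so that it runs on its own, and the table markup is a hypothetical stand-in, not data from the real page.

```php
<?php
// Hypothetical stand-in for the downloaded page; the real table content is not known here.
$markup = '<table class="t-02"><tr><td>a</td><td>b</td></tr><tr><td>c</td><td>d</td></tr></table>';

$cells = array();
$doc = new DOMDocument();

// loadHTML() returns false on failure, so check before querying.
if ($doc->loadHTML($markup)) {
    $xpath = new DOMXPath($doc);

    // The query returns a list of tables; take the first one explicitly.
    $table = $xpath->query('//table[@class="t-02"]')->item(0);

    if ($table !== null) {
        // Walk the rows of that one table, then the cells of each row.
        foreach ($xpath->query('.//tr', $table) as $row) {
            foreach ($xpath->query('./td', $row) as $cell) {
                $cells[] = trim($cell->textContent);
            }
        }
    }
}

echo implode(', ', $cells), PHP_EOL;
```

With Simple HTML DOM the equivalent guard would be to test the return value of file_get_html() and of find(..., 0) before calling find() again on either.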
I'm fairly new to PHP, and I have a problem defining a function that returns an array containing price and description strings.
I am using the "simple html dom" PHP files, which make parsing easier.
The function I created requires 2 arguments: the link (from which it will grab data) and the id (used to pick the proper CSS selector).
This is get_product_details.php:
<?
require_once 'simple_html_dom.php';
$priceMatchTable=('span[id=our_price_display]');
$descMatchTable=('div[id=short_description_content]');
function get_prod_details( $link , $id ) {
global $priceMatchTable, $descMatchTable;
$html = file_get_html($link);
$result['price'] = $html->find($priceMatchTable[$id],0);
$result['desc'] = $html->find($descMatchTable[$id],0);
return $result;
}
And this is the main PHP file:
<?php
include 'get_product_details.php';
$link = 'http://micromedia.tn/barette-memoire/1170-barette-m%C3%A9moire-1go-ddr-ii.html';
$id = 0;
$result = get_prod_details($link, $id);
echo $result['price'];
?>
Finally I get this output and error:
find($priceMatchTable[$id],0); $result['desc'] = $html->find($descMatchTable[$id],0); return $result; }
Fatal error: Call to undefined function get_prod_details() in C:\xampp\htdocs\dom\index.php on line 8
Best regards!
This may sound silly, but is
include 'get_product_details.php';
really pointing towards "get_product_details.php"?
Comment out (//) the function call in your index.php and add a simple echo to your "get_product_details.php" to see whether the file gets included.
I think you need something like:
include '/path/from/root_to_your/directory/get_product_details.php';
If you're trying this on Windows, it will look something like:
include 'C:\Documents\something\get_product_details.php';
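The check described above can be sketched as a small script. The helper name locate_include is hypothetical, and the sketch assumes get_product_details.php is meant to sit next to the calling script. The raw source printed before the fatal error in the question would also be consistent with the file being found but not executed as PHP, which can happen when a file opens with the short tag <? and short_open_tag is disabled; that is a guess from the output, not something the question confirms.

```php
<?php
// Hypothetical helper: returns the full path of $file inside $baseDir, or null if it is missing.
function locate_include($file, $baseDir)
{
    $path = rtrim($baseDir, '/\\') . DIRECTORY_SEPARATOR . $file;
    return is_file($path) ? $path : null;
}

// __DIR__ is the directory of the current script, so the lookup does not
// depend on the working directory or on include_path.
$path = locate_include('get_product_details.php', __DIR__);

if ($path === null) {
    echo "get_product_details.php was not found next to this script\n";
} else {
    require_once $path;
    // If the file was read but its code never ran, the function will still be missing.
    echo function_exists('get_prod_details')
        ? "get_prod_details() is available\n"
        : "file included, but get_prod_details() is not defined (check the opening tag)\n";
}
```

Building the path from __DIR__ avoids hard-coding an absolute path, which keeps the include working when the project moves between machines.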
How can I make this code dynamic in PHP? My address (URL) is variable.
<?php
$pagecontents = file_get_contents("http://google.com");
$html = htmlentities($pagecontents);
echo $html;
?>
I'm not sure I understood what the goal is, but if you want to do the same thing you showed in the question for multiple sites, you can do it with a simple loop:
$sites = ["http://aa.com/a", "http://aa.com/b"]; // use array(...) before PHP 5.4
foreach($sites as $url) {
$pagecontents = file_get_contents($url);
echo htmlentities($pagecontents);
}
If this is not what you are looking for, please rewrite the question so that it clearly explains what you want to do!
I am using Simple HTML DOM parser, but when I use file_get_html() it returns an empty page. The page is not empty; you can check by opening it in a browser. Here is my code:
include "11/simple_html_dom.php";
$link = "http://www.flipkart.com/transcend-storejet-25m3-2-5-inch-1-tb-external-hard-disk/p/itmd72p3y3zcsbku?pid=ACCD72ZXFC6ZRTST&srno=b_1&ref=549d7873-2897-4bd5-8451-776337341be8";
$html = file_get_html($link);
if(!empty($html)){
echo $html->find("span.fk-font-verybig") ;
}
else{
echo 'file is empty';
}
Any help would be appreciated.
Try this:
In your simple_html_dom.php, change the line define('MAX_FILE_SIZE',600000); to define('MAX_FILE_SIZE',900000); or to an even larger value, depending on the size of your file.
This sometimes happens when the HTML file is larger than the defined limit: file_get_html() then returns an empty result without reporting any error.
I hope it works.
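Instead of echo $html->find("span.fk-font-verybig");
try echo $html->find("span.fk-font-verybig", 0);
Without an index, find() returns an array of matches, which cannot be echoed directly; passing 0 returns the first matching element.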
I'm experimenting with autoblogging (i.e., RSS-driven blog posting) using WordPress, and all that's missing is a component to automatically fill in the content of the post with the content that the RSS item's URL links to (RSS is irrelevant to the solution).
Using standard PHP 5, how could I create a function called fetchHTML([URL]) that returns the HTML content of a webpage that's found between the <body>...</body> tags?
Please let me know if there are any prerequisite "includes".
Thanks.
Okay, here's a DOM parser code example as requested.
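<?php
function fetchHTML( $url )
{
    $content = file_get_contents($url);
    if ($content === false) { return ''; } // download failed
    $html = new DOMDocument();
    // Collect parse warnings from malformed real-world markup instead of printing them.
    libxml_use_internal_errors(true);
    $html->loadHTML($content); // the document must be loaded before it can be queried
    libxml_clear_errors();
    $body = $html->getElementsByTagName('body')->item(0); // first <body>, or null
    if ($body === null) { return ''; }
    // Serialize each child of <body> so the markup is kept, not only the text.
    $inner = '';
    foreach ($body->childNodes as $child) { $inner .= $html->saveHTML($child); }
    return $inner;
}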
Assuming that it will always be <body> and not <BODY> or <body style="width:100%"> or anything except <body> and </body>, and with the caveat that you shouldn't use regex to parse HTML, even though I'm about to, here ya go:
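<?php
function fetchHTML( $url )
{
    $content = file_get_contents( $url );
    if ( $content === false ) {
        return ''; // download failed
    }
    // Capture everything between a plain <body> and </body>, newlines included.
    if ( ! preg_match( '/<body>([\s\S]{1,})<\/body>/m', $content, $match ) ) {
        return ''; // no plain <body>...</body> pair found
    }
    return $match[1];
} // fetchHTML
?>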
If you echo fetchHTML([some url]);, you'll get the HTML between the body tags.
Please note original caveats.
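Some of those caveats can be relaxed with a slightly more tolerant pattern. The following is only a sketch: the helper name extract_body and the sample string are hypothetical, it works on a string rather than a URL so it can be tried without a network connection, and it is still a regex, so it can still be fooled by things like a literal </body> inside a script or comment.

```php
<?php
// Hypothetical helper: returns the markup between <body ...> and </body>, or null if absent.
function extract_body($html)
{
    // \b[^>]* allows attributes on the tag, "i" ignores case (<BODY>),
    // and "s" lets "." match newlines. The greedy .* runs to the last </body>.
    if (preg_match('/<body\b[^>]*>(.*)<\/body>/is', $html, $match) === 1) {
        return $match[1];
    }
    return null;
}

// Hypothetical input with an upper-case tag and an attribute.
$sample = "<html><BODY style=\"width:100%\">\n<p>Hello</p>\n</BODY></html>";

var_dump(extract_body($sample));
var_dump(extract_body('<p>no body tag here</p>'));
```

Returning null when nothing matches lets the caller tell "no body found" apart from an empty body.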
I think you're better off using a class like SimpleDom -> http://sourceforge.net/projects/simplehtmldom/ to extract the data, as you then don't need to write such complicated regular expressions.