Table scraping with HTML DOM Parser - php

I would like to scrape the table t-02 from https://wyniki.tge.pl/wyniki/rdn/ with HTML DOM Parser.
I wrote some simple code, but it produced this error:
Fatal error: Call to a member function find() on null in /Users/piotrek/Sites/foo/index.html on line 3
My first code:
<?php
include ("simple_html_dom.php");
$html=file_get_html("https://wyniki.tge.pl/pl/wyniki/rdn/");
$tables=$html->find("table[#class=t-02]");
foreach($tables->find("tr") as $a) {
foreach($a->find("td") as $element) {
echo $element;
}
}
?>
I changed the code to print innertext and it worked:
<?php
include_once ("simple_html_dom.php");
$html=file_get_html("https://wyniki.tge.pl/pl/wyniki/rdn/");
$title=$html->find("table[#class=t-02]",0)->innertext;
echo $title
?>
I only changed the code, so what was wrong with the first approach? What caused the fatal error?

The error says that the $html object is null, so there is nothing to call the find method on. Is the simple_html_dom.php file in the same directory as your script?
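For comparison, here is a minimal sketch of the same row-and-cell loop written with PHP's built-in DOM extension (DOMDocument and DOMXPath) instead of Simple HTML DOM. The markup in $sample is made up and only stands in for the real page; the class name t-02 comes from the question. The point it illustrates is that a query gives back a list of matches, so you loop over that list and only call element methods on the individual items.

```php
<?php
// Hypothetical sample markup standing in for the real page (not actual data).
$sample = '<table class="t-02">'
        . '<tr><td>1</td><td>150.00</td></tr>'
        . '<tr><td>2</td><td>148.50</td></tr>'
        . '</table>';

$doc = new DOMDocument();
// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($sample);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$rows = [];

// query() returns a list of nodes, so iterate over it.
foreach ($xpath->query('//table[@class="t-02"]//tr') as $tr) {
    $cells = [];
    // The second argument makes the query relative to the current row.
    foreach ($xpath->query('./td', $tr) as $td) {
        $cells[] = trim($td->textContent);
    }
    $rows[] = $cells;
}

foreach ($rows as $cells) {
    echo implode(' | ', $cells), PHP_EOL;
}
```

To run this against a live page you would replace $sample with the downloaded HTML, after checking that the download actually succeeded.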

Related

Extract the content of the script with PHP Simple HTML DOM

I want to extract the content of the script with the ID __NEXT_DATA__ on this page using PHP Simple HTML DOM. The code I wrote is this:
foreach($html_base->getElementsByTagName('script') as $element) {
if (isset($element->id)){
$id = $element->id;
if ($id == "__NEXT_DATA__"){
$f = $element->nodeValue;
echo $f;
break;
}
}
}
Unfortunately, it gives me the following error:
Undefined property: DOMElement::$id
You can consult the Simple HTML DOM documentation, but here's my suggestion:
$html = file_get_html("url");
$script = $html->find("script[id=__NEXT_DATA__]", 0)->innertext;
The second parameter, 0, is the index into the search results. Because only one script has this id, you can take the first result.
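The loop in the question uses getElementsByTagName and nodeValue, which belong to PHP's built-in DOM extension rather than Simple HTML DOM. If you want to stay with the built-in DOM, a minimal sketch would read the attribute with getAttribute() instead of accessing $element->id, which is what raises the error above. The markup in $sample is made up for illustration.

```php
<?php
// Hypothetical sample markup standing in for the fetched page.
$sample = '<html><head><script>var a = 1;</script>'
        . '<script id="__NEXT_DATA__" type="application/json">{"page":"/"}</script>'
        . '</head><body></body></html>';

$doc = new DOMDocument();
// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($sample);
libxml_clear_errors();

$json = null;
foreach ($doc->getElementsByTagName('script') as $element) {
    // getAttribute() returns an empty string when the attribute is absent,
    // so no isset() check is needed.
    if ($element->getAttribute('id') === '__NEXT_DATA__') {
        $json = $element->nodeValue;
        break;
    }
}

echo $json, PHP_EOL;
```

$json stays null when no script with that id exists, so check it before passing it to json_decode().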

PHP - Simple HTML DOM Parser - Getting FATAL ERROR when html is OK

I am trying to parse a table and output its plaintext in another table. I have gotten this far:
<?php
if (url_exists($url))
{
$html = file_get_html($url);
}
else
{
echo "URL doesn't exist.";
}
if ($html && is_object($html) && isset($html->nodes))
{
// Everything checks out
$table = $html->find('table[border]');
if (!empty($table))
{
$row = $table->find('tr');
}
}
else
{
echo "Fetched page is not ok.";
}
?>
This returns an error:
Fatal error: Call to a member function find() on a non-object in /var/www/html/jsudimak/mailman/webdev-test1.php on line 78
Line 78 is this one: $row = $table->find('tr');
This means that:
the HTML is valid
the table I am trying to parse is also valid
Therefore, I am bewildered that the find() call still raises this error.
I have looked into the cause of this error extensively over the past few days and have yet to find a solution. I have also tried some other parsing tools, but still no luck. Please help me with this, fellow debuggers!
By the way, I am using the Simple HTML Dom Parser to parse the table.
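Use $table = $html->find('table[border]')[0];
The documentation says that, unless you specify an index in the function find(), it will return an array. Your $table is therefore an array, and calling $table->find('tr') on it fails because an array is not an object.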

PHP error "Call to undefined function" using simple html dom

I'm fairly new to PHP, and I have a problem defining a function that returns an array containing a price string and a description string.
I am using the "simple html dom" PHP files, which make parsing easier.
The function I created takes 2 arguments: the link (from which it will grab data) and the id (used to pick the proper CSS selector).
This is get_product_details.php:
<?
require_once 'simple_html_dom.php';
$priceMatchTable=('span[id=our_price_display]');
$descMatchTable=('div[id=short_description_content]');
function get_prod_details( $link , $id ) {
global $priceMatchTable, $descMatchTable;
$html = file_get_html($link);
$result['price'] = $html->find($priceMatchTable[$id],0);
$result['desc'] = $html->find($descMatchTable[$id],0);
return $result;
}
And this is the main PHP file:
<?php
include 'get_product_details.php';
$link = 'http://micromedia.tn/barette-memoire/1170-barette-m%C3%A9moire-1go-ddr-ii.html';
$id = 0;
$result = get_prod_details($link, $id);
echo $result['price'];
?>
Finally I get an error which says:
find($priceMatchTable[$id],0); $result['desc'] = $html->find($descMatchTable[$id],0); return $result; }
Fatal error: Call to undefined function get_prod_details() in C:\xampp\htdocs\dom\index.php on line 8
Best regards!
This may sound silly, but is
include 'get_product_details.php';
really pointing to "get_product_details.php"?
Comment out (//) the function call in your index.php and add a simple echo to your "get_product_details.php" to see whether the file gets included.
I think you need something like:
include '/path/from/root_to_your/directory/get_product_details.php';
If you're trying this on Windows, it will look something like:
include 'C:\Documents\something\get_product_details.php';
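The check suggested above can be sketched in a few lines. The file and function names ending in _demo are hypothetical stand-ins so the snippet can run on its own; in your project you would point $path at the real get_product_details.php. One possible cause that the thread does not confirm: the included file starts with the short tag <?, which PHP only treats as code when short_open_tag is enabled. The source text printed before the fatal error would be consistent with that, so the sketch prints that setting too.

```php
<?php
// Hypothetical stand-in for get_product_details.php, written to the temp directory.
$path = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'get_product_details_demo.php';
file_put_contents($path, "<?php\nfunction get_prod_details_demo() { return 'ok'; }\n");

// require_once stops with a fatal error if the file is missing,
// whereas include only emits a warning and carries on.
require_once $path;

// Was the file really loaded, and did it define the function?
$included = in_array(realpath($path), array_map('realpath', get_included_files()), true);
$defined  = function_exists('get_prod_details_demo');

var_dump($included, $defined);

// Files opening with "<?" are only executed when this setting is enabled;
// otherwise PHP outputs their contents as plain text.
var_dump(ini_get('short_open_tag'));

unlink($path);
```

If $included is true but $defined is false, the file was found but PHP did not execute the function definition as code.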

How to detect whether an element exists or not

Hello everyone, I am fetching data using Simple HTML DOM.
This is my PHP code, which fetches data from the site:
include('simple_html_dom.php');
$html = new simple_html_dom();
$html->load_file($this->main_url.$lin->link);
if($html){
//check if language heading h2 exist then process forward
if($html->find('h2.channel-title',0)){
// fetch data from tables
}
}
The line if($html->find('h2.channel-title',0)) gives me a fatal error inside Simple HTML DOM's find function when h2.channel-title does not exist.
On many pages <h2 class="channel-title"> English Links</h2> exists, so I wrote my code for that case; my foreach loop then runs fine and fetches all the data.
But when the <h2 class="channel-title">English Links</h2> tag does not exist, it gives me this error:
Fatal error: Call to a member function find() on a non-object in C:\xampp\apps\wordpress\htdocs\wp-content\plugins\autobot\engine\simple_html_dom.php on line 1113
Please help, I am stuck. If h2.channel-title exists I want to run my foreach code, otherwise run something else, without an error that stops my whole script. :(
This might help:
$html = new simple_html_dom();
$html->load_file($this->main_url.$lin->link);
if($html) {
$var = $html->find('h2.channel-title',0);
if(isset($var)) {
// fetch data from tables
} else{
//do something
}
}
var_dump($html);
Which library are you using?
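The "check first, then branch" idea can also be sketched with PHP's built-in DOM extension, which is a swapped-in technique and not the Simple HTML DOM code from the question. channelTitle() is a hypothetical helper name and the two markup strings are made up. It returns null when the heading is missing, so the caller can choose a branch without triggering a fatal error.

```php
<?php
// Hypothetical helper: returns the heading text, or null when the page
// has no <h2> carrying the class "channel-title".
function channelTitle(string $markup): ?string
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $loaded = $doc->loadHTML($markup);
    libxml_clear_errors();
    if (!$loaded) {
        return null;
    }

    $xpath = new DOMXPath($doc);
    // Match the class even when the element carries several classes.
    $query = '//h2[contains(concat(" ", normalize-space(@class), " "), " channel-title ")]';
    // item(0) returns null when the list is empty, so test it before use.
    $node = $xpath->query($query)->item(0);

    return $node === null ? null : trim($node->textContent);
}

$with    = channelTitle('<div><h2 class="channel-title"> English Links</h2></div>');
$without = channelTitle('<div><p>No heading here</p></div>');

echo $with ?? '(no heading)', PHP_EOL;
echo $without ?? '(no heading)', PHP_EOL;
```

The calling code would run the foreach branch when the result is not null and the fallback branch otherwise.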

getting an error reading simpleDomObject

I have the following template file, named 'test.html'
<div class='title'>TEST</div>
And I have the following PHP code:
<?
include "simplehtmldom/simple_html_dom.php";
$dom = file_get_html( "test.html" );
echo $dom->outertext;
?>
So far so good, this displays the file test.html. But when I try to change something I get an error:
<?
include "simplehtmldom/simple_html_dom.php";
$dom = file_get_html( "test.html" );
$dom->find('.title')->innertext = "changed";
echo $dom->outertext;
?>
Warning: Attempt to assign property of non-object in E:\internet\test.php on line 4
I believe I'm following the manual exactly, so what is going wrong here? Obviously $dom->find('.title') didn't return a valid element, but why? Shouldn't it find the DIV?
First:
You left out the index of the found element, so find() returns an array and there is no find()->innertext property to assign to.
Repaired code:
<?php
error_reporting(E_ALL);
include "simplehtmldom/simple_html_dom.php";
$dom = file_get_html( "test.html" );
$dom->find('.title',0)->innertext = "changed";
echo $dom->outertext;
Second:
I wouldn't recommend the Simple HTML DOM library, because it's old and outdated.
Take a look at the QueryPath library, which does the same job and is in better condition.
http://querypath.org/
