Here is a basic example of how simple_html_dom works in a standalone PHP file.
test.php:
include ('simple_html_dom.php');
$url = "http://www.google.com";
$html = new simple_html_dom();
$html->load_file($url);
print $html;
If I execute it with the command: php test.php
It correctly dumps the HTML of the website (in this example, google.com).
Now let's take a look at a basic example of code using a Symfony task:
class parserBasic extends sfBaseTask {
public function configure()
{
$this->namespace = 'parser';
$this->name = 'basic';
}
public function execute($arguments = array(), $options = array())
{
$url = "http://www.google.com";
$html = new simple_html_dom();
$html->load_file($url);
print $html;
}
}
This file is located under: <appname>/lib/task
I don't need to include the library in the file because, being under the lib/task folder, it gets loaded automatically.
I execute the task using this command: php symfony parser:basic
And I get the following error message:
PHP Fatal error:
Call to a member function innertext() on a non-object in
/home/<username>/<appname>/lib/task/simple_html_dom.php on line 1688
Any suggestions?
The problem comes from Symfony.
In fact, if an error occurs while simple_html_dom loads the file, it does not report anything; it just returns false.
For example, if you do this in your task:
$url = "http://www.google.com";
$html = new simple_html_dom();
$res = $html->load_file($url);
if (false === $res)
{
throw new Exception("load_file failed.");
}
print $html;
You will get an exception. If you tweak simple_html_dom to display an error when loading a file, around line 1085:
// load html from file
function load_file()
{
$args = func_get_args();
$this->load(call_user_func_array('file_get_contents', $args), true);
// Throw an error if we can't properly load the dom.
if (($error=error_get_last())!==null) {
// I added this line to see any errors
var_dump($error);
$this->clear();
return false;
}
}
You will see:
array(4) {
["type"]=>
int(8)
["message"]=>
string(79) "ob_end_flush(): failed to delete and flush buffer. No buffer to delete or flush"
["file"]=>
string(62) "/home/.../symfony.1.4/lib/command/sfCommandApplication.class.php"
["line"]=>
int(541)
}
I usually get this error (which is, in fact, a notice) when using tasks. The problem is here, in sfCommandApplication, with ob_end_flush:
/**
* Fixes php behavior if using cgi php.
*
* @see http://www.sitepoint.com/article/php-command-line-1/3
*/
protected function fixCgi()
{
// handle output buffering
#ob_end_flush();
ob_implicit_flush(true);
To fix that, I comment out the line, as in #ob_end_flush();, and everything works fine. I know it's an ugly fix, but it works. Another way to fix it is to disable notices in PHP (in php.ini), like this:
; Report all errors except E_NOTICE
error_reporting = E_ALL ^ E_NOTICE
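The effect described above, where an unrelated earlier notice makes a later load look like a failure, can be reproduced without Symfony. The following is only a sketch: load_with_error_check() is a hypothetical stand-in that mimics the error_get_last() check in the quoted load_file(), and error_clear_last() assumes PHP 7.0 or later.

```php
<?php
// Hypothetical stand-in for the library's load_file(): it reports failure
// whenever error_get_last() is non-null, like the code quoted above.
function load_with_error_check($path)
{
    $data = file_get_contents($path);
    if (error_get_last() !== null) {
        return false;
    }
    return $data;
}

// Record an unrelated notice before the load, much as the ob_end_flush()
// call does in a task. The @ hides the message but the error is still stored.
@trigger_error('unrelated notice', E_USER_NOTICE);

// Create a small file that can definitely be read.
$tmp = tempnam(sys_get_temp_dir(), 'dom');
file_put_contents($tmp, '<p>ok</p>');

// The read succeeds, but the stale notice makes the check report failure.
$stale = load_with_error_check($tmp);

// Forget the earlier notice (PHP 7.0+), then try again.
error_clear_last();
$fresh = load_with_error_check($tmp);

unlink($tmp);
```

Here $stale is false even though the file was read, while $fresh holds the file contents. Clearing the last error just before the load is an alternative to editing the framework or changing error_reporting.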
I have a simple AJAX call that retrieves text from a file, pushes it into a table, and displays it. The call works without issue when testing on a Mac running Apache 2.2.26/PHP 5.3 and on an Ubuntu box running Apache 2.2.1.6/PHP 5.3. It does not work on RedHat running Apache 2.2.4/PHP 5.1. Naturally, the RedHat box is the only place where I need it to be working.
The call returns 200 OK but no content. Even if nothing is found in the file (or it's inaccessible), the table header is echoed so if permissions were a problem I would still expect to see something. But to be sure, I verified the file is readable by all users.
Code has been redacted and simplified.
My ajax function:
function ajax(page,targetElement,ajaxFunction,getValues)
{
xmlhttp=new XMLHttpRequest();
xmlhttp.onreadystatechange=function()
{
if (xmlhttp.readyState===4 && xmlhttp.status===200)
{
document.getElementById(targetElement).innerHTML=xmlhttp.responseText;
}
};
xmlhttp.open('GET','/appdir/dir/filedir/'+page+'_funcs.php?function='+ajaxFunction+'&'+getValues+'&'+new Date().getTime(),false);
xmlhttp.setRequestHeader('cache-control','no-cache');
xmlhttp.send();
}
I call it like this:
ajax('pagename','destelement','load_info');
And return the results:
// Custom file handler
function warn_error($errno, $errstr) {
// Common function for warning-prone functions
throw new Exception($errstr, $errno);
}
function get_file_contents() {
// File operation failure would return a warning
// So handle specially to suppress the default message
set_error_handler('warn_error');
try
{
$fh = fopen(dirname(dirname(__FILE__))."/datafile.txt","r");
}
catch (Exception $e)
{
// Craft a nice-looking error message and get out of here
$info = "<tr><td class=\"center\" colspan=\"9\"><b>Fatal Error: </b>Could not load customer data.</td></tr>";
restore_error_handler();
return $info;
}
restore_error_handler();
// Got the file, so read and return its contents
// Start with an empty array so an empty file still returns an array
$info = array();
while (!feof($fh))
{
$line = fgets($fh);
// Be sure to avoid empty lines in our array
if (!empty($line))
{
$info[] = explode(",",$line);
}
}
fclose($fh);
return $info;
}
function load_info() {
// Start the table
$content = "<table>
<th>Head1</th>
<th>Head2</th>
<th>Head3</th>
<th>Head4</th>";
// Get the data
// Returns all contents in an array if successful,
// Returns an error string if it fails
$info = get_file_contents();
if (!is_array($info))
{
// String was returned because of an error
echo $content.$info;
exit();
}
// Got valid data array, so loop through it to build the table
foreach ($info as $detail)
{
list($field1,$field2,$field3,$field4) = $detail;
$content .= "<tr>
<td>$field1</td>
<td>$field2</td>
<td>$field3</td>
<td>$field4</td>
</tr>";
}
$content .= "</table>";
echo $content;
}
Where it works, the response header indicates the connection as keep-alive; where it fails, the connection is closed. I don't know if that matters.
I've looked all over SO and the net for some clues but "no content" issues invariably point to same-origin policy problems. In my case, all content is on the same server.
I'm at a loss as to what to do/where to look next.
file_get_contents() expects a parameter. It does not know what you want, so it returned false. Also, you used get_file_contents() which is the wrong order.
This turned out to be a PHP version issue. In the load_info function I was using filter_input(INPUT_GET,"value"), but that was not available in PHP 5.1. I pulled that from my initial code post because I didn't think it was part of the problem. Lesson learned.
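For reference, filter_input() comes from the filter extension, which is bundled with PHP from 5.2.0 onward, so it is missing on a stock PHP 5.1. A minimal sketch of a version-independent way to read the parameter is shown below; get_param() is a hypothetical helper name, not part of the original code, and it does no validation or sanitizing.

```php
<?php
// Hypothetical helper: read a GET parameter without relying on the
// filter extension, so it also works on PHP 5.1.
function get_param($name, $default = null)
{
    return isset($_GET[$name]) ? $_GET[$name] : $default;
}

// Simulate the query string of the AJAX request for this example.
$_GET['function'] = 'load_info';

$function = get_param('function');       // value taken from the query string
$missing  = get_param('value', 'none');  // falls back to the default
```

A check such as function_exists('filter_input') could be used to pick between the two approaches at runtime.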
I want to write tests for a quite large and complicated project (better late than never).
I made the "code" runnable via bootstrap and tests, but I have some problems with exit commands inside the project ...
I have a test class like this:
class website_call_direct_Test extends PHPUnit_Framework_TestCase
{
public function execute(array $req){
$_REQUEST = array();
error_reporting(E_ALL & ~E_DEPRECATED & ~E_NOTICE & ~E_STRICT);
foreach($req as $k => $v){
$_REQUEST[$k] = $v;
}
$_GET = &$_REQUEST;
$_POST = &$_REQUEST;
ob_start();
include(G::$baseDir ."/index.php");
$output = ob_get_clean();
return $output;
}
/**
* @runInSeparateProcess
*/
public function testPrecaution()
{
$req = array();
$req['ajaxreq'] = 1;
$req['m'] = "precaution";
$req['type'] = "list";
$req['mode'] = "default";
$req['wnd'] = "new";
$output = $this->execute($req);
echo $output;
//SOME validation with the output
// strpos() takes the haystack first, then the needle
$this->assertFalse(strpos($output, "..."));
throw new Exception();
}
}
The problem is that the exception is never thrown because the script ends with an exit at several points. I know I can test some classes directly, but I want to ensure that some calls with some variables do not produce errors or exceptions. Is there any workaround besides removing every exit in the project?
Looks like you're testing the whole app. I don't think there is any way of disabling the exit statements.
The workaround (and the recommended way of testing whole apps) is to access the site using curl and check that the HTML contains what you expect.
Even better if you use a scraping solution that will let you query the HTML using selectors or XPath, like this one.
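As a rough illustration of the XPath idea, the sketch below uses PHP's built-in DOMDocument and DOMXPath classes rather than any particular scraping library. count_matches() is a hypothetical helper, and the HTML string stands in for a response body that a real test would fetch over HTTP, for example with curl.

```php
<?php
// Hypothetical helper: count the nodes in an HTML string that match an XPath query.
function count_matches($html, $xpath)
{
    $doc = new DOMDocument();
    // Suppress warnings about imperfect real-world markup.
    @$doc->loadHTML($html);
    $finder = new DOMXPath($doc);
    return $finder->query($xpath)->length;
}

// Stand-in for the fetched page; the markup is invented for this example.
$html = '<html><body><ul id="list"><li>a</li><li>b</li></ul></body></html>';

// Count the list items, and check that no element carries an "error" class.
$items  = count_matches($html, '//ul[@id="list"]/li');
$errors = count_matches($html, '//*[@class="error"]');
```

Because the page is requested from outside the application, any exit inside it only ends that request, not the test runner.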
I have a class similar to this
class x {
function __construct($file){
$this->readData = new splFileObject($file);
}
function a (){
//do something with $this->readData;
}
function b(){
//do something with $this->readData;
}
}
$o = new x('example.txt');
echo $o->a(); //this works
echo $o->b(); //this does not work.
It seems that only whichever method is called first works; if both are called, only the first call produces output. I think the problem is tied to my lack of understanding of how the new object gets constructed.
The constructor runs once, when the class is instantiated, so the file object is created only once but accessed twice. Those are different actions. If you want the file to be read afresh each time, create a method that reads the file and call it from all the other methods.
I tested your code and it worked normally. I suggest looking at the logs to see whether any error appears. If the file does not exist, your code will stop.
Look for this error in your Apache logs:
PHP Fatal error: Uncaught exception 'RuntimeException' with message 'SplFileObject::__construct(example.txt): failed to open stream
Answering your comment, this can be a way:
<?php
class x {
private $defaultFile = "example.txt";
private function readDefaultFile(){
$file = $this->defaultFile;
return new splFileObject($file);
}
function a (){
$content = $this->readDefaultFile();
return $content ;
}
function b(){
$content = $this->readDefaultFile();
return $content ;
}
}
$o = new x();
echo $o->a();
echo $o->b();
Both methods will return an SplFileObject.
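Another possibility, assuming the symptom comes from the file pointer being left at the end of the file after the first method has read it, is to keep the single SplFileObject from the constructor and call rewind() before each read. This is only a sketch; the Reader class, its readAll() method, and the sample file are invented for the example.

```php
<?php
// Hypothetical class: one shared SplFileObject, rewound before every read.
class Reader
{
    private $file;

    public function __construct($path)
    {
        $this->file = new SplFileObject($path);
    }

    public function readAll()
    {
        // Move the pointer back to the start; without this, a second call
        // would begin at end-of-file and return an empty array.
        $this->file->rewind();
        $lines = array();
        while (!$this->file->eof()) {
            $lines[] = rtrim($this->file->fgets(), "\r\n");
        }
        return $lines;
    }
}

// Sample file for the example (no trailing newline).
$tmp = tempnam(sys_get_temp_dir(), 'spl');
file_put_contents($tmp, "one\ntwo");

$reader = new Reader($tmp);
$first  = $reader->readAll();
$second = $reader->readAll();

unset($reader);
unlink($tmp);
```

Both calls return the same two lines, so methods a() and b() could each start with a rewind() instead of opening the file again.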
I'd like to parse an xml file with PHP. I used this code:
$images = parsage("Rambod_catalog.xml", "Thumbnail");
$prices = parsage("Rambod_catalog.xml", "Retail_Price");
echo sizeof($images);
function getindex($element, $liste) {
for($i=0;$i<sizeof($liste);$i++) {
if($liste[$i] == $element)return $i;
}
return 0;
}
function parsage($document, $noeud) {
$document_xml = new DomDocument;
$document_xml->load($document);
$elements = $document_xml->getElementsByTagName($noeud);
return $elements;
}
But I got this warning:
Warning: DOMDocument::load() [domdocument.load]: I/O warning :
failed to load external entity "/Rambod_catalog.xml"
So, what is the problem? How can I fix my code?
"/Rambod_catalog.xml"
The path you have given is invalid. What you have written there, unfortunately, refers to the filesystem root. Either the file is stored in the current directory and you delete the leading slash, or you specify a full path (include_path is at times unreliable).
You should refer to files like CONSTANT_TO_DOCUMENT_ROOT.DIRECTORY_SEPARATOR.$fileName in order to be flexible.
Also see getcwd() and the $_SERVER global variable and, if need be, print_r or var_dump the reserved globals to find what you might use for CONSTANT_TO_DOCUMENT_ROOT.
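A minimal sketch of that advice, assuming the XML file sits next to the script and PHP 5.3 or later (for __DIR__): local_path() is a hypothetical helper, and __DIR__ plays the role of CONSTANT_TO_DOCUMENT_ROOT here.

```php
<?php
// Hypothetical helper: resolve a file name relative to this script's directory,
// dropping any leading slash so the path no longer points at the filesystem root.
function local_path($fileName)
{
    return __DIR__ . DIRECTORY_SEPARATOR . ltrim($fileName, '/\\');
}

$path = local_path('/Rambod_catalog.xml');

// Only parse the file if it is really there; otherwise leave the count as null.
$thumbnails = null;
if (is_readable($path)) {
    $doc = new DOMDocument();
    $doc->load($path);
    $thumbnails = $doc->getElementsByTagName('Thumbnail')->length;
}
```

Checking is_readable() first turns the I/O warning into a condition the script can handle itself.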
I'm using a library called Simple HTML DOM
One of its methods loads the URL into a DOM object:
function load_file()
{
$args = func_get_args();
$this->load(call_user_func_array('file_get_contents', $args), true);
// Throw an error if we can't properly load the dom.
if (($error=error_get_last())!==null) {
$this->clear();
return false;
}
}
In order to test error handling, I created this code:
include_once 'simple_html_dom.php';
function getSimpleHtmlDomLoaded($url)
{
$html = false;
$count = 0;
while ($html === false && ($count < 10)) {
$html = new simple_html_dom();
$html->load_file($url);
if ($html === false) {
echo "Error loading url!\n";
sleep(5);
$count++;
}
}
return $html;
}
$url = "inexistent.html";
getSimpleHtmlDomLoaded($url);
The idea behind this code is to try again if the URL fails to load; if it still fails after 10 attempts, it should return false.
However, it seems that with a nonexistent URL, the load_file method never returns false.
Instead I get the following warning message:
PHP Warning: file_get_contents(inexistent.html): failed to open stream
Any idea how to fix this?
Note: Preferably I would like to avoid hacking into the library.
Change your following code:
$html->load_file($url);
if ($html === false) {
for this one:
$ret = $html->load_file($url);
if ($ret === false) {
because you were checking the object instance instead of the value returned by the load_file() method.
By adding the @ sign before a method call, any warnings get suppressed. If you use this, always be sure to check for errors yourself, as you do now, and be sure that no other method is available for preventing the warnings and/or errors.
You should check whether the actual data saved by the load() method equals FALSE, rather than checking the object instance $html.
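Putting these points together, the retry logic can be sketched without touching the library. The helper below is an illustration under assumptions: load_with_retries() and the stand-in loaders are hypothetical names, closures need PHP 5.3 or later, and the loader is any callable that returns false on failure.

```php
<?php
// Hypothetical retry helper: call $loader until it returns something other
// than false, or give up after $maxAttempts tries.
function load_with_retries($loader, $url, $maxAttempts = 10, $delaySeconds = 0)
{
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $result = call_user_func($loader, $url);
        if ($result !== false) {
            return $result;
        }
        // Wait between attempts, but not after the last one.
        if ($delaySeconds > 0 && $attempt < $maxAttempts) {
            sleep($delaySeconds);
        }
    }
    return false;
}

// Stand-in loader that fails twice and then succeeds.
$calls = 0;
$flaky = function ($url) use (&$calls) {
    $calls++;
    return $calls < 3 ? false : "<html>$url</html>";
};

$page  = load_with_retries($flaky, 'inexistent.html');
$never = load_with_retries(function ($url) { return false; }, 'inexistent.html', 4);
```

With Simple HTML DOM, the loader would presumably create a new simple_html_dom object, call @$html->load_file($url), and return false when that call returns false, or the object otherwise.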