SAX-based parser not seeking content - php

I am trying to implement a SAX based parser but somehow it only recognizes the start of the element and the end, the content is not provided in the logs. The variable holding the XML is filled correctly, I checked through a simple log.
Here is the code:
<?php
function startElementHandler($parser, $name, $attribs) {
if($name == "id"){
$id = TRUE;
}
}
function endElementHandler($parser,$name){
$id = FALSE;
}
function characterDataHandler($parser,$data){
if($id == TRUE){
echo $data;
}
}
global $id;
$id = FALSE;
$parser = xml_parser_create();
xml_set_element_handler($parser, "startElementHandler","endElementHandler");
xml_set_character_data_handler($parser,"characterDataHandler");
$xml = file_get_contents("http://itunes.apple.com/de/rss/topfreeapplications/limit=100/xml");
xml_parse($parser,$xml);
//xml_parser_free($parser);
?>
Any suggestion how I could recieve the content? Maybe I am missing something strange I am not aware of at the moment.
best regards
tim

Per your comment, $id never becomes true. Maybe you want attribs to have an id and not the name of the element. For example, if you have the XML
<div id="x"> blah </div>
You get
$name="div", $attribs={"id":"x"}
(this came out a bit of php-python, but i hope you get my point)
Is that really your bug?
According to http://www.phpcatalyst.com/php-compare-strings.php you should always compare strings using ===. Is that your bug?

You only used the xml_set_element_handler-callbacks. Those only:
Set up start and end element handlers
If you also want to retrieve the content of those tags, you'll also need to register the xml_set_character_data_handler-callback. Because this one:
Set up character data handler

Related

PHP methods outputting HTML with incorrect structure?

I have a strange situation that I've never encountered where the returned HTML from my class methods doesn't output in the correct structure. For some unknown reason, everything is nested inside the HTML within the first loop, second loop, etc.
index.php
<body>
some html...
<div class="usb-port-container">
<?= $generate->output('short_name',id,'hostname'); ?>
</div>
some html...
<div class="usb-port-container">
<?= $generate->output('short_name',id,'hostname'); ?>
</div>
some html...
</body>
class.generate.php
function __construct() {
$this->getCopy = new getCopyData();
$this->getDrive = new getDriveData();
$this->helper = new helper();
}
function output ($short_name,$id,$hostname) {
$portCount = $this->getCopy->portCount($hostname, 'total');
for ($port = 0; $port < $portCount; ++$port) {
if ($this->getCopy->probePort($hostname,$port)) {
$default = $this->usbPortDefault($short_name,$id,$hostname,$port);
return $default;
} else {
$details = $this->usbPortDetails($short_name,$id,$hostname,$port);
return $details;
}
}
}
function usbPortDefault($short_name,$id,$hostname,$port) {
$port_details = '<div>
<div>----</div>
<div>----</div>
</div>';
return $port_details;
}
function outputRecords($hostname,$port) {
//get records data from database
$records = $this->getCopy->records($hostname,$port);
//create empty variable for HTML
$records_html = "";
foreach ($records as $key => $value) {
if ($value) {
$records_html .= '<div><span class="copy-group">' .$key. '</span><span class="copy-instance">' .$value. '</span></div>';
} else {
return '<div>Unable to get any group or instance</div>';
}
}
return $records_html;
}
function usbPortDetails($short_name,$id,$hostname,$port) {
$port_info = '';
$port_info .= 'bunch of HTML';
$port_info .= $this->outputRecords($hostname,$port);
$port_info .= 'bunch of HTML';
return $port_info;
}
My best guess as to the problem, is that there is an issue with the way I am returning the HTML or something with output buffering. I actually don't know if I need it within my class, but I've taken it out and issue is the same. I've also tried adding output buffering to index.html.
Here is a snippet of the source HTML. The first discrepancy I notice is that the <div class="server-details"></div> highlighted in blue doesn't belong there, it should be inside <div class="dc-container"></div> and adjacent to the prior <div class="server-details"></div>. After port-data-left should be port-data-right but it's nowhere to be found. I'm almost convinced at this point that there's a missing closing tag somewhere but I can't find it. It's been several years since I seriously did any development :D
EDIT: After further investigation, it appears that the final $port_info is not outputting and may be causing this problem. Is there an issue with $port_info .= $this->outputRecords($hostname,$port); being there?
Since you reuse you object instance using $generate, the constructor and destructor methods are only called once in this script.
The constructor starts a new output buffer using ob_start(), which means everything outputted by the output method will be buffered until you flush it.
The issue is, the flush only happens in the destructor, with ob_end_flush(), which is only executed once, after the last output call.
Because you echo the result of the method and start a buffer at the same time, the output must indeed be weird and some nesting / repetition occurs because the next output still adds to the buffer.
As pointed out in the comment, the easiest solution is to turn off the output buffering in the class.
You also need to clear $port_info at the beginning of the method to make this work, although it should be fine, unless $port_info is a global var?
This was probably not cleared in the initial Class in order to make it work when called several times, to concatenate the results (and buffer them, before outputting them all at once)
A good usage of buffers in the middle of a page is to redirect the output of a function to a variable.
For example, you could have a function that echoes code but you don't want it echoed.
You could then do something like:
Some HTML
<?php
ob_start();
call_to_some_function_that_normally_outputs_code();
$myvar = ob_get_clean();
?>
Rest of the HTML

PHP return value after XML exploration

I got a PHP array with a lot of XML users-file URL :
$tab_users[0]=john.xml
$tab_users[1]=chris.xml
$tab_users[n...]=phil.xml
For each user a <zoom> tag is filled or not, depending if user filled it up or not:
john.xml = <zoom>Some content here</zoom>
chris.xml = <zoom/>
phil.xml = <zoom/>
I'm trying to explore the users datas and display the first filled <zoom> tag, but randomized: each time you reload the page the <div id="zoom"> content is different.
$rand=rand(0,$n); // $n is the number of users
$datas_zoom=zoom($n,$rand);
My PHP function
function zoom($n,$rand) {
global $tab_users;
$datas_user=new SimpleXMLElement($tab_users[$rand],null,true);
$tag=$datas_user->xpath('/user');
//if zoom found
if($tag[0]->zoom !='') {
$txt_zoom=$tag[0]->zoom;
}
... some other taff here
// no "zoom" value found
if ($txt_zoom =='') {
echo 'RAND='.$rand.' XML='.$tab_users[$rand].'<br />';
$datas_zoom=zoom($r,$n,$rand); } // random zoom fct again and again till...
}
else {
echo 'ZOOM='.$txt_zoom.'<br />';
return $txt_zoom; // we got it!
}
}
echo '<br />Return='.$datas_zoom;
The prob is: when by chance the first XML explored contains a "zoom" information the function returns it, but if not nothing returns... An exemple of results when the first one is by chance the good one:
// for RAND=0, XML=john.xml
ZOOM=Anything here
Return=Some content here // we're lucky
Unlucky:
RAND=1 XML=chris.xml
RAND=2 XML=phil.xml
// the for RAND=0 and XML=john.xml
ZOOM=Anything here
// content founded but Return is empty
Return=
What's wrong?
I suggest importing the values into a database table, generating a single local file or something like that. So that you don't have to open and parse all the XML files for each request.
Reading multiple files is a lot slower then reading a single file. And using a database even the random logic can be moved to SQL.
You're are currently using SimpleXML, but fetching a single value from an XML document is actually easier with DOM. SimpleXMLElement::xpath() only supports Xpath expression that return a node list, but DOMXpath::evaluate() can return the scalar value directly:
$document = new DOMDocument();
$document->load($xmlFile);
$xpath = new DOMXpath($document);
$zoomValue = $xpath->evaluate('string(//zoom[1])');
//zoom[1] will fetch the first zoom element node in a node list. Casting the list into a string will return the text content of the first node or an empty string if the list was empty (no node found).
For the sake of this example assume that you generated an XML like this
<zooms>
<zoom user="u1">z1</zoom>
<zoom user="u2">z2</zoom>
</zooms>
In this case you can use Xpath to fetch all zoom nodes and get a random node from the list.
$document = new DOMDocument();
$document->loadXml($xml);
$xpath = new DOMXpath($document);
$zooms = $xpath->evaluate('//zoom');
$zoom = $zooms->item(mt_rand(0, $zooms->length - 1));
var_dump(
[
'user' => $zoom->getAttribute('user'),
'zoom' => $zoom->textContent
]
);
Your main issue is that you are not returning any value when there is no zoom found.
$datas_zoom=zoom($r,$n,$rand); // no return keyword here!
When you're using recursion, you usually want to "chain" return values on and on, till you find the one you need. $datas_zoom is not a global variable and it will not "leak out" outside of your function. Please read the php's variable scope documentation for more info.
Then again, you're calling zoom function with three arguments ($r,$n,$rand) while the function can only handle two ($n and $rand). Also the $r is undiefined, $n is not used at all and you are most likely trying to use the same $rand value again and again, which obviously cannot work.
Also note that there are too many closing braces in your code.
I think the best approach for your problem will be to shuffle the array and then to use it like FIFO without recursion (which should be slightly faster):
function zoom($tab_users) {
// shuffle an array once
shuffle($tab_users);
// init variable
$txt_zoom = null;
// repeat until zoom is found or there
// are no more elements in array
do {
$rand = array_pop($tab_users);
$datas_user = new SimpleXMLElement($rand, null, true);
$tag=$datas_user->xpath('/user');
//if zoom found
if($tag[0]->zoom !='') {
$txt_zoom=$tag[0]->zoom;
}
} while(!$txt_zoom && !empty($tab_users));
return $txt_zoom;
}
$datas_zoom = zoom($tab_users); // your zoom is here!
Please read more about php scopes, php functions and recursion.
There's no reason for recursion. A simple loop would do.
$datas_user=new SimpleXMLElement($tab_users[$rand],null,true);
$tag=$datas_user->xpath('/user');
$max = $tag->length;
while(true) {
$test_index = rand(0, $max);
if ($tag[$test_index]->zoom != "") {
break;
}
}
Of course, you might want to add a bit more logic to handle the case where NO zooms have text set, in which case the above would be an infinite loop.

Get hard-coded values from component into plugin in Joomla 3.x

I have a custom component, in fact several. Each will have raw and hard-coded html at the beginning and end of their /view/default.php
I have a system plugin that needs to get this html and in some cases change it to something else, that can be managed in the back end. As a content plugin this works fine on all com_content articles, but it is ignored by components, my understanding is system plugins can do this but i can't get the data into the plugin and return it
example of component text ($text1, $text2 are defined at the top of the document)
JPluginHelper::importPlugin( 'system' );
JPluginHelper::importPlugin('plgSystemMyplugin');
$dispatcher =& JDispatcher::getInstance();
$data = array($text1, $text2); // any number of arguments you want
$data = $dispatcher->trigger('onBeforeRender', $data);
<article>
<div class="spacer" style="height:25px;"></div>
<div class="page_title_text">
<h1>title</h1>
<?php var_dump($data); ?>
</div>
<section>
my plugin:
jimport( 'joomla.plugin.plugin' );
class plgSystemMyplugin extends JPlugin {
function onBeforeRender() {
if (JFactory::getDocument()->getType() != 'html') {
return;
}
else {
$document=JFactory::getDocument();
$document->addCustomTag('<!-- System Plugin has been included (for testing) -->');
$document=JResponse::getBody();
$bob=JResponse::getBody();
$db = &JFactory::getDbo();
$db->setQuery('SELECT 1, 2 FROM #__table');
$results = $db->loadRowList();
$numrows=count($results);
if($numrows >0) {
foreach($results as $regexes) {
$document = str_replace($regexes[0],$regexes[1],$document);
}
return $document;
}
else {
$document = 'error with plugin';
}
JResponse::setBody($document);
return $document;
}
}
}
at the moment $data returns an array with a key 1 and value (string) of "" (blank/empty).
but not the data from the database I am expecting.
in simple terms I have {sometext} in my file and my database and it should return <p>my other text</p>
can you help?
thanks
Ok. Well looking at this deeper there is a couple of issues that jump out. The biggest being that you save getBody into a variable named $bob but then switch everywhere to using $document which is the object form above, not the content.
Also, you had a return $document hanging out in the middle of the code that prevented you from seeing that you were going to set $document as the new body. Probably should be more like below:
$bob=JResponse::getBody();
$db = &JFactory::getDbo();
$db->setQuery('SELECT 1, 2 FROM #__table');
$results = $db->loadRowList();
$numrows=count($results);
if($numrows >0) {
foreach($results as $regexes) {
$bob = str_replace($regexes[0],$regexes[1],$bob);
}
}
else {
$bob = 'error with plugin';
}
JResponse::setBody($bob);
return $document;
}
Original Thoughts:
Two thoughts to get you started. I'm not sure that this will actually fully answer the question, but should get you moving in the right direction.
First, you should not have to trigger the system plugin. They are system plugins, so the system will take care of that for you. If you wanted to use content plugins in your component (which you can definitely do!) then you would have to trigger them like your first set of code. In this case, don't bother with the entire dispatch section.
Second, your plugin looks set up to grab the body from the JDocument correctly, so that should work.
The likely issue is that the entire system plugin is just not being triggered. Make sure that it is installed and everything is named correctly. It has to be at plugins/system/myplugin/myplugin.php based on this name and make sure that the xml file with this also references myplugin as the plugin name. If not, the system won't find the class but likely won't throw an error. It will just skip it. This gives me trouble every time.
To do some checking just to make sure it gets called, I usually throw an echo or var_dump near the top of the file and just inside the function. Confirm that the function is at least getting called first and you should be most of the way to getting this to work.

Passing Object Operators As Strings (PHP)

I'm building a script that takes the contents of several (~13) news feeds and parses the XML data and inserts the records into a database. Since I don't have any control over the structure of the feeds, I need to tailor an object operator for each one to drill down into the structure in order to get the information I need.
The script works just fine if the target node is one step below the root, but if my string contains a second step, it fails ( 'foo' works, but 'foo->bar' fails). I've tried escaping characters and eval(), but I feel like I'm missing something glaringly obvious. Any help would be greatly appreciated.
// Roadmaps for xml navigation
$roadmap[1] = "deal"; // works
$roadmap[2] = "channel->item"; // fails
$roadmap[3] = "deals->deal";
$roadmap[4] = "resource";
$roadmap[5] = "object";
$roadmap[6] = "product";
$roadmap[8] = "channel->deal";
$roadmap[13] = "channel->item";
$roadmap[20] = "product";
$xmlSource = $xmlURL[$fID];
$xml=simplexml_load_file($xmlSource) or die(mysql_error());
if (!(empty($xml))) {
foreach($xml->$roadmap[$fID] as $div) {
include('./_'.$incName.'/feedVars.php');
include('./_includes/masterCategory.php.inc');
$test = sqlVendors($vendorName);
} // end foreach
echo $vUpdated." records updated.<br>";
echo $vInserted." records Inserted.<br><br>";
} else {
echo $xmlSource." returned an empty set!";
} // END IF empty $xml result
While Fosco's solution will work, it is indeed very dirty.
How about using xpath instead of object properties?
$xml->xpath('deals/deal');
PHP isn't going to magically turn your string which includes -> into a second level search.
Quick and dirty hack...
eval("\$node = \"\$xml->" . $roadmap[$fID] . "\";");
foreach($node as $div) {

PHP IF Statement not working - variable does contain correct data, however

I am trying to use an IF statement to check if a variable contains a piece of text, if so, echo yes, if not, echo no. Simple I know, but it does not seem to be working for me. I am very new to PHP, so any replies should be understandable for a monkey! :P
Here is my code. Bear in mind I am working with PHP simple html dom.
<?php
include 'simple_html_dom.php';
$html = file_get_html('http://www.amazon.co.uk/New-Apple-iPod-touch-Generation/dp/B0040GIZTI/ref=br_lf_m_1000333483_1_1_img?ie=UTF8&s=electronics&pf_rd_p=229345967&pf_rd_s=center-3&pf_rd_t=1401&pf_rd_i=1000333483&pf_rd_m=A3P5ROKL5A1OLE&pf_rd_r=1ZW9HJW2KN2C2MTRJH60');
$stock_data = $html->find('span[class=availGreen]',0);
if ( $stock_data == "In stock." ) {
echo "Yes";
}
echo "No";
?>
Now I have tested if the variable did contain the text, using echo $stock_data; it prints the correct text, however the if statement still comes back and echo's No.
Any help, apologies if this is something so simple.
Use the function
var_dump($stock_data);
Maybe there are non-human readable characters in it and you're not checking against them. You can always trim those
I tried your code, and $stock_data returns:
<span class="availGreen">In stock.</span>
and not:
In stock.
I suggest that as a new developer, you should get tools like Firebug for Firefox, or Chrome, which can help you debug the underlying errors...although in this case, since html simply outputs the text within the span tags, the only way to know what $stock_data equals to would be to view the source.
For contains piece of text use this strpos or stripos
And here is right if statement
if ( $stock_data == "In stock." ) {
echo "Yes";
} else {
echo "No";
}
I'm not sure what file_get_html() is but you may want to use file_get_contents() instead (probably with DOMDocument and DOMXpath).
$html = file_get_contents('http://www.amazon.co.uk/New-Apple-iPod-touch-Generation/dp/B0040GIZTI/ref=br_lf_m_1000333483_1_1_img?ie=UTF8&s=electronics&pf_rd_p=229345967&pf_rd_s=center-3&pf_rd_t=1401&pf_rd_i=1000333483&pf_rd_m=A3P5ROKL5A1OLE&pf_rd_r=1ZW9HJW2KN2C2MTRJH60');
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXpath($dom);
$in_stock = $xpath->query("/html/body/div[#id='divsinglecolumnminwidth']/form[#id='handleBuy']/table[3]/tbody/tr[3]/td/div/span");
if(!empty($in_stock)) {
echo "yes";
} else {
echo "no";
}
If the same span is occupied when the item is NOT in stock you can test the value like this:
if($in_stock->item(0)->nodeValue == "In stock.") { echo "yes"; } else { echo "no"; }
You may also want to use preg_match if the string "In stock." changes at all.
I don't know much about the library you are using, but if it is returning an object that implements a __toString() (used by echo), then perhaps this will work:
if ( (string) $stock_data == "In stock." ) {
Otherwise, check for member functions that return the string you are looking for.
I should add this from the manual:
It is worth noting that before PHP 5.2.0 the __toString method was only called when it was directly combined with echo() or print().
So if you are running a more recent PHP, the part of my answer regarding casting to string won't apply. (There still may be some member function you can call to get at the string data minus HTML and special chars.)

Categories