The following is my code:
<?php
include('simple_html_dom.php');
$rowdate;
$html = new simple_html_dom();
$html->load_file('http://www.forexfactory.com/calendar.php');
foreach($html->find('.calendar_row') as $e)
{
$date=$e->find('span.date');
if ($date[0] != "")
{
$rowdate=$date[0];
}
$time=$e->find('.time');
$currency=$e->find('.currency');
$impact=$e->find('.impact');
$event=$e->find('.event');
echo $rowdate;echo ",";
echo $time[0];echo ",";
echo $currency[0];echo ",";
echo $impact[0];echo ",";
echo $event[0];
echo "<br>";
}
The above code works fine, except that $impact is not displayed at all. If you open the URL directly in your browser and view the source, you can see that the impact class is present within each calendar_row.
Can anyone please tell me what I am doing wrong?
Instead of:
$impact = $e->find('.impact');
echo $impact[0];
You want:
$impact = $e->find('.impact', 0);
echo $impact;
And you probably really want:
$impact = $e->find('.impact span', 0)->class;
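When an index is passed as the second argument, find() returns a single element rather than an array, and ->class reads that element's class attribute. Read the simple_html_dom documentation if you don't understand why.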
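To see why reading the class attribute helps, here is a minimal sketch. It uses PHP's built-in DOMDocument and DOMXPath instead of simple_html_dom so that it runs without extra files, and the row markup is an assumption made for illustration (the real page may differ). It assumes the impact cell holds only an empty span whose class carries the value, which would explain why echoing the cell shows nothing visible.

```php
<?php
// Hypothetical row markup; the real page's structure may differ.
$html = '<table><tr class="calendar_row">'
      . '<td class="currency">USD</td>'
      . '<td class="impact"><span class="high"></span></td>'
      . '</tr></table>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // keep parser warnings out of the output
$doc->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

// The cell has no text content, so printing it shows nothing visible.
$cell = $xpath->query('//td[@class="impact"]')->item(0);
$text = trim($cell->textContent);

// The value lives in the class attribute of the nested span.
$span = $xpath->query('.//span', $cell)->item(0);
$impact = $span->getAttribute('class');

var_dump($text, $impact);
```

With simple_html_dom, the equivalent of the last step is the ->class property shown in the answer above.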
I have been working on a web crawler. It worked for a few sites,
but when I tried it with this particular site it returned nothing: no output and no error.
I wonder what went wrong.
The code is:
<?php
require_once('dom/simple_html_dom.php');
$html = file_get_html('http://www.studentdoc.com/phpBB2/viewforum.php?f=18&sid=2a150b97528c8ec47600692cc77daaf3');
$elementCount=0;
foreach($html->find('dl.icon a') as $elemen) {
foreach($elemen->find('dt a') as $element) {
$elementCount++;
$element->href = "http://www.usmleforum.com" . $element->href;
echo '<li target="_blank" class="itemtitle">';
if($elementCount < 5 && $elementCount > 2 && rand(0,1) == 1) {
echo '<span class="item_new">new</span>';
}
echo $element;
echo '</li>';
if($elementCount==12){
break;
}
}
}
?>
Please see the link below for the HTML structure:
http://www.studentdoc.com/phpBB2/viewforum.php?f=18&sid=2a150b97528c8ec47600692cc77daaf3
Any help is appreciated.
There is no DOM element matching dl.icon a dt a. You probably want to fetch dl.icon dt a, so remove the a from the argument of the first find() call.
Always try to debug your code before asking questions. A simple echo "A"; die(); placed after each statement in turn will be very helpful :)
In this case the second foreach always iterates over 0 elements.
I have an RSS feed that I am reading. I need to retrieve certain data from a field in this feed.
This is the example feed data :
<content:encoded><![CDATA[
<b>When:</b><br />
Weekly Event - Every Thursday: 1:30 PM to 3:30 PM (CT)<br /><br />
<b>Where:</b><br />
100 West Street<BR>2nd floor<BR>Gainesville<BR>
<br>.....
How do I pull out the data for When: and Where: respectively? I attempted to use regex but I am unsure if I am not accessing the data correctly or if my regex expression is wrong. I'm not set on using regex.
This is my code:
foreach ($x->channel->item as $event) {
$eventCounter++;
$rowColor = ($eventCounter % 2 == 0) ? '#FFFFFF' : '#F1F1F1';
$content = $event->children('http://purl.org/rss/1.0/modules/content/');
$contents = $content->encoded;
echo '<tr style="background-color:' . $rowColor . '">';
echo '<td>';
//echo "<a id=buttonRed href='$event->link' title='$event->title' target='_blank'>" . $event->title . "</a>";
echo "" . $event->title . "";
echo '</td>';
echo '<td>';
$re = '%when\:\s*</b>\s*(.|\s)<br \/><br \/>$/i';
if (preg_match($re, $contents, $matches)) {
$date = $matches;
}
echo $date;
echo '</td>';
echo '<td>';
$re = '/^When\:<\/b>()$/';
if (preg_match($re, $contents, $matches)) {
$location = $matches;
}
echo $location;
echo '</td>';
echo '<td>';
echo "<a id=buttonRed href='$event->link' title='$event->title' target='_blank'>Click Here To Register</a>";
echo '</td>';
echo '</tr>';
}
The two $re patterns are just my attempts to get the data out using different regular expressions. Let me know where I am going wrong. Thanks.
The following should more or less get you there. (I wrote this off the top of my head and it does not exactly follow your XML syntax, but you get the idea.)
<?php
$str = "<root><b>When:</b> whenwhen <b>Where:</b> wherewhere</root>";
$doc = new DOMDocument();
$doc->loadXML($str);
$when = $where = "";
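$target = null;
$i = 0; // counts the <b> labels seen so far
foreach ($doc->documentElement->childNodes as $node) {
// Text nodes have no tagName, so check the node type first.
if ($node->nodeType === XML_ELEMENT_NODE && $node->tagName == "b") {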
if (++$i == 1) {
$target = &$when;
} else {
$target = &$where;
}
}
if ($target !== null && $node->nodeType === XML_TEXT_NODE) {
$target .= $node->nodeValue;
}
}
var_dump($when, $where);
I had a problem like this and I ended up using YQL. Take a good look at the page-scraping code given there, especially the select command. Then go to the console and put in your own select statement, specifying the feed URL and the XPath to the nodes you want. Select JSON format. Then go down to the bottom of the page, get the REST query URL, and use it in a jQuery JSONP request. MAGIC!
Please don't extract data from XML documents with regex.
The long answer is here, for example: https://stackoverflow.com/a/335446/313145
The short answer is: regex is not easier to use, and it will break often.
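As an illustration of the parser-based route, here is a minimal sketch that pulls the When: and Where: values out of the sample fragment from the question. It uses PHP's built-in DOMDocument and DOMXPath; extractFields() is a hypothetical helper name, and the sketch assumes each label is a b element followed by the text that belongs to it.

```php
<?php
// Sample fragment copied from the question's <content:encoded> CDATA.
$contents = '<b>When:</b><br />
Weekly Event - Every Thursday: 1:30 PM to 3:30 PM (CT)<br /><br />
<b>Where:</b><br />
100 West Street<BR>2nd floor<BR>Gainesville<BR>';

// extractFields() is a hypothetical helper, not part of any library.
function extractFields(string $html): array
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // tolerate sloppy feed markup
    // Wrap the fragment so there is one known container to query.
    $doc->loadHTML('<div id="wrap">' . $html . '</div>');
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $fields = [];
    foreach ($xpath->query('//div[@id="wrap"]/b') as $label) {
        $key = rtrim(trim($label->textContent), ':');
        $parts = [];
        // Collect the text nodes that follow, up to the next <b> label.
        for ($n = $label->nextSibling; $n !== null; $n = $n->nextSibling) {
            if ($n->nodeType === XML_ELEMENT_NODE && $n->nodeName === 'b') {
                break;
            }
            if ($n->nodeType === XML_TEXT_NODE && trim($n->nodeValue) !== '') {
                $parts[] = trim($n->nodeValue);
            }
        }
        $fields[$key] = implode(', ', $parts);
    }
    return $fields;
}

$fields = extractFields($contents);
echo $fields['When'], "\n", $fields['Where'], "\n";
```

In the question's loop, the string passed in would be (string) $contents. Joining the address lines with a comma is only one choice; keep $parts as an array if the lines are needed separately.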
Why won't my script return the div with the id of "pp-featured"?
<?php
# create and load the HTML
include('lib/simple_html_dom.php');
$html = new simple_html_dom();
$html->load("http://maps.google.com/maps/place?cid=6703996311168776503&q=hills+garage&hl=en&view=feature&mcsrc=google_reviews&num=20&start=0&ved=0CFUQtQU&sa=X&ei=sCq_Tr3mJZToygTOmuCGCg");
$ret = $html->find('div[id=pp-featured]');
# output it!
echo $ret->save();
?>
This gets me on my way. Thanks for your help.
<?php
include_once 'lib/simple_html_dom.php';
$url = "http://maps.google.com/maps/place?cid=6703996311168776503&q=hills+garage&hl=en&view=feature&mcsrc=google_reviews&num=20&start=0&ved=0CFUQtQU&sa=X&ei=sCq_Tr3mJZToygTOmuCGCg";
$html = file_get_html($url);
$ret = $html->find('div[id=pp-reviews]');
foreach($ret as $story)
echo $story;
?>
When called without an index, find() always returns an array, because more than one item may match the selector.
If you expect only one match, you should check that the page you're analyzing is behaving as expected.
Suggested solution:
<?php
include_once 'lib/simple_html_dom.php';
$url = "http://maps.google.com/maps/place?cid=6703996311168776503&q=hills+garage&hl=en&view=feature&mcsrc=google_reviews&num=20&start=0&ved=0CFUQtQU&sa=X&ei=sCq_Tr3mJZToygTOmuCGCg";
$html = file_get_html($url);
$ret = $html->find('div[id=pp-reviews]');
if(count($ret)==1){
echo $ret[0]->save();
}
else{
echo "Something went wrong";
}
I need to find the links in a part of some HTML code and replace each of them with one of two absolute base URLs followed by the link as it appears on the page.
I have found a lot of ideas and tried a lot of different solutions, but luck isn't on my side with this one. Please help me out!
Thank you!
This is my code:
<?php
$url = "http://www.oxfordreference.com/views/SEARCH_RESULTS.html?&q=android";
$raw = file_get_contents($url);
$newlines = array("\t","\n","\r","\x20\x20","\0","\x0B");
$content = str_replace($newlines, "", html_entity_decode($raw));
$start = strpos($content,'<table class="short_results_summary_table">');
$end = strpos($content,'</table>',$start) + 8;
$table = substr($content,$start,$end-$start);
echo "{$table}";
$dom = new DOMDocument();
$dom->loadHTML($table);
$dom->strictErrorChecking = FALSE;
// Get all the links
$links = $dom->getElementsByTagName("a");
foreach($links as $link) {
$href = $link->getAttribute("href");
echo "{$href}";
if (strpos("http://oxfordreference.com", $href) == -1) {
if (strpos("/views/", $href) == -1) {
$ref = "http://oxfordreference.com/views/"+$href;
}
else
$ref = "http://oxfordreference.com"+$href;
$link->setAttribute("href", $ref);
echo "{$link->getAttribute("href")}";
}
}
$table12 = $dom->saveHTML;
preg_match_all("|<tr(.*)</tr>|U",$table12,$rows);
echo "{$rows[0]}";
foreach ($rows[0] as $row){
if ((strpos($row,'<th')===false)){
preg_match_all("|<td(.*)</td>|U",$row,$cells);
echo "{$cells}";
}
}
?>
When I run this code I get an htmlParseEntityRef: expecting ';' warning for the line where I load the HTML.
var links = document.getElementsByTagName("a"); will get you all the links.
And this will loop through them:
for(var i = 0; i < links.length; i++)
{
links[i].href = "newURLHERE";
}
You should use jQuery; it is excellent for link replacement. Rather than explaining it here, please look at this answer:
How to change the href for a hyperlink using jQuery
I recommend scrappedcola's answer, but if you don't want to do it on the client side you can use regex to replace:
ob_start();
//your HTML
//end of the page
$body=ob_get_clean();
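// preg_replace() returns the new string, so assign the result back to $body.
// $1 keeps everything up to href= and only the quoted URL is swapped.
$body = preg_replace('/(<a[^>]*href=)"[^"]*"/', '$1"NewURL"', $body);
echo $body;
You can use a backreference ($1), as above, or the callback version (preg_replace_callback) to modify the output as you like.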
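The callback version could look roughly like the sketch below. absolutizeLinks() and the sample markup are hypothetical names and data made up for illustration; the sketch assumes double-quoted href attributes and simply prefixes every link that does not already carry a scheme.

```php
<?php
// absolutizeLinks() is a hypothetical helper for illustration.
function absolutizeLinks(string $html, string $base): string
{
    return preg_replace_callback(
        '/(<a\b[^>]*\bhref=")([^"]*)(")/i',
        function (array $m) use ($base): string {
            $href = $m[2];
            // Leave links that already start with a scheme untouched.
            if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $href)) {
                return $m[0];
            }
            // Join base and path with exactly one slash between them.
            return $m[1] . rtrim($base, '/') . '/' . ltrim($href, '/') . $m[3];
        },
        $html
    );
}

// Hypothetical sample markup.
$body = '<a href="/views/ENTRY.html">one</a> <a href="http://example.com/x">two</a>';
$result = absolutizeLinks($body, 'http://oxfordreference.com');
echo $result;
```

Choosing between two different base URLs, as the question asks, would go inside the callback, where the original href is available in $m[2].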
I am creating an address book xml feed from a MySQL database, everything is working fine, but I have a section tag which gets the first letter of the surname and pops it in that tag. I only want this to display if it has changed, but for some reason my brain isn't working this morning!
Current code:
<?php
echo "<?xml version=\"1.0\" encoding=\"UTF-8\"?>";
echo "<data>";
do {
$char = $row_fetch["surname_add"];
$section = $char[0];
//if(changed???){
echo "<section><character>".$section."</character>";
//}
echo "<person>";
echo "<name>".$row_fetch["firstname_add"]." ".$row_fetch["surname_add"]."</name>";
echo "<title>".$row_fetch["title_add"]."</title>";
echo "</person>";
//if(){
echo "</section>";
//}
} while ($row_fetch = mysql_fetch_assoc($fetch));
echo "</data>";
?>
Any help on this welcome, don't know why I can't think of it!
And if you still want to generate the XML manually, I suppose something like this will work:
$section = "NoSectionStartedYet";
while ($row_fetch = mysql_fetch_assoc($fetch)) {
$char = $row_fetch["surname_add"];
if ($char[0] != $section)
{
if ($section != "NoSectionStartedYet")
{
echo "</section>";
}
$section = $char[0];
echo "<section><character>".$section."</character>";
}
echo "<person>";
echo "<name>".$row_fetch["firstname_add"]." ".$row_fetch["surname_add"]."</name>";
echo "<title>".$row_fetch["title_add"]."</title>";
echo "</person>";
}
echo "</section>";
To be sure that your XML is well-formed, it is better to build a DOM tree. Here is an example from the PHP manual:
<?php
$doc = new DOMDocument;
$node = $doc->createElement("para");
$newnode = $doc->appendChild($node);
echo $doc->saveXML();
?>
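Applied to the address book, the DOM approach might look like the following sketch. The $rows array is hypothetical data standing in for the MySQL result set and is assumed to be sorted by surname already; a new section element is started only when the first letter changes.

```php
<?php
// Hypothetical rows standing in for the MySQL result, sorted by surname.
$rows = [
    ['firstname_add' => 'Ann', 'surname_add' => 'Adams', 'title_add' => 'Dr'],
    ['firstname_add' => 'Bob', 'surname_add' => 'Avery', 'title_add' => 'Mr'],
    ['firstname_add' => 'Cy',  'surname_add' => 'Brown', 'title_add' => 'Ms'],
];

$doc = new DOMDocument('1.0', 'UTF-8');
$data = $doc->appendChild($doc->createElement('data'));
$section = null; // the <section> element currently being filled
$current = null; // first letter of that section

foreach ($rows as $row) {
    // substr() is byte-based; fine for the ASCII sample data used here.
    $letter = substr($row['surname_add'], 0, 1);
    if ($letter !== $current) {
        // The letter changed, so open a new section.
        $current = $letter;
        $section = $data->appendChild($doc->createElement('section'));
        $character = $section->appendChild($doc->createElement('character'));
        $character->appendChild($doc->createTextNode($letter));
    }
    $person = $section->appendChild($doc->createElement('person'));
    $name = $person->appendChild($doc->createElement('name'));
    $name->appendChild($doc->createTextNode($row['firstname_add'] . ' ' . $row['surname_add']));
    $title = $person->appendChild($doc->createElement('title'));
    $title->appendChild($doc->createTextNode($row['title_add']));
}

$xml = $doc->saveXML();
echo $xml;
```

Because every value goes in through createTextNode(), characters such as & or < in the data are escaped automatically, and there are no closing tags to keep track of by hand.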