CURL to grab an XML file associated with this URL - php

I am trying to use cURL to grab the XML file associated with this URL, and then parse that XML file using DOMXPath.
There are no output errors at this point; it just doesn't display anything. I tried to catch some errors but couldn't figure it out. Any direction would be amazing.
<?php
if (!function_exists('curl_init')) {
    die('Sorry cURL is not installed!');
}

function tideTime() {
    $ch = curl_init("http://tidesandcurrents.noaa.gov/noaatidepredictions/NOAATidesFacade.jsp?datatype=XML&Stationid=8721138");
    $fp = fopen("8721138.xml", "w");
    curl_setopt($ch, CURLOPT_FILE, $fp);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_exec($ch);
    curl_close($ch);
    fclose($fp);
    $dom = new DOMDocument();
    #$dom->loadHTML($ch);
    $domx = new DOMXPath($dom);
    $entries = $domx->evaluate("//time");
    $arr = array();
    foreach ($entries as $entry) {
        $tide = $entry->nodeValue;
    }
    echo $tide;
}
?>

You're trying to load the cURL resource handle as the DOM, which it is not. The curl functions either output directly or return the response as a string.
$ch = curl_init("http://tidesandcurrents.noaa.gov/noaatidepredictions/NOAATidesFacade.jsp?datatype=XML&Stationid=8721138");
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the response as a string instead of printing it
$data = curl_exec($ch);
curl_close($ch);
file_put_contents("8721138.xml", $data); // optionally keep the local copy
$dom = new DOMDocument();
$dom->loadXML($data); // the response is XML, so load it as XML rather than HTML
// the rest of the code
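The "rest of the code" is then the DOMXPath part from the question, run against the freshly loaded document rather than the cURL handle. A minimal sketch, assuming the feed really does contain <time> elements:
$xpath = new DOMXPath($dom);
$entries = $xpath->evaluate("//time");
$times = array();
foreach ($entries as $entry) {
    $times[] = $entry->nodeValue; // collect every matched <time> value instead of overwriting a single variable
}
echo implode(", ", $times);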

It also looks like your XPath expression matches nothing. Make sure a //time element actually exists in the file you saved. Are you sure what you grabbed is really an XML file, or did you just save the response with an .xml extension?

If we look at that page, the XML seems to be generated by JavaScript. Have a look at http://tidesandcurrents.noaa.gov/noaatidepredictions/NOAATidesFacade.jsp?datatype=XML&Stationid=8721138&text=datafiles%2F8721138%2F09122011%2F877%2F&imagename=images/8721138/09122011/877/8721138_2011-12-10.gif&bdate=20111209&timelength=daily&timeZone=2&dataUnits=1&interval=&edate=20111210&StationName=Ponce Inlet, Halifax River&Stationid_=8721138&state=FL&primary=Subordinate&datum=MLLW&timeUnits=2&ReferenceStationName=GOVERNMENT CUT, MIAMI HARBOR ENTRANCE&HeightOffsetLow=*1.00&HeightOffsetHigh=* 1.18&TimeOffsetLow=33&TimeOffsetHigh=5&pageview=dayly&print_download=true&Threshold=&thresholdvalue=
Maybe you can grab that instead.

Related

Trouble writing results in a csv file

I've written a script in PHP to fetch links from the main page of Wikipedia and write them to a CSV file. The script does fetch the links accordingly. However, I can't write the populated results to a CSV file. When I execute my script, it does nothing, and there is no error either. Any help will be highly appreciated.
My try so far:
<?php
include "simple_html_dom.php";

$url = "https://en.wikipedia.org/wiki/Main_Page";

function fetch_content($url)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_BINARYTRANSFER, 1);
    $htmlContent = curl_exec($ch);
    curl_close($ch);
    $dom = new simple_html_dom();
    $dom->load($htmlContent);
    $links = array();
    foreach ($dom->find('a') as $link) {
        $links[] = $link->href . '<br>';
    }
    return implode("\n", $links);
    $file = fopen("itemfile.csv", "w");
    foreach ($links as $item) {
        fputcsv($file, $item);
    }
    fclose($file);
}
fetch_content($url);
?>
1. You are using return in your function; that's why nothing gets written to the file, as the code stops executing after that.
2. Simplify your logic with the code below:
$file = fopen("itemfile.csv", "w");
foreach ($dom->find('a') as $link) {
    fputcsv($file, array($link->href));
}
fclose($file);
So the full code needs to be:
<?php
// comment out these two lines once the script is working properly
error_reporting(E_ALL);
ini_set('display_errors', 1); // these two lines check for and display all errors

include "simple_html_dom.php";

$url = "https://en.wikipedia.org/wiki/Main_Page";

function fetch_content($url)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_BINARYTRANSFER, 1);
    $htmlContent = curl_exec($ch);
    curl_close($ch);
    $dom = new simple_html_dom();
    $dom->load($htmlContent);
    $file = fopen("itemfile.csv", "w");
    foreach ($dom->find('a') as $link) {
        fputcsv($file, array($link->href));
    }
    fclose($file);
}
fetch_content($url);
?>
The reason the file does not get written is that you return out of the function before that code can even be executed.
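To see why, here is a tiny standalone sketch (not from the question) showing that anything placed after a return statement is simply never reached:
function demo()
{
    return "done";      // execution leaves the function here
    echo "never runs";  // unreachable: this line is skipped entirely
}
echo demo(); // prints "done"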

Retrieving a single line from a html table from another website?

I am trying to learn how to use cURL, but I do not fully understand how it works yet. How can I use cURL (or other functions) to access just one (the top) data entry of a table? So far I am only able to retrieve the entire website. How can I echo only the whole table, and specifically the first entry? My code is:
<?php
$ch = curl_init("http://www.w3schools.com/html/html_tables.asp");
curl_setopt($ch, CURLOPT_FILE, $fp);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_exec($ch);
curl_close($ch);
?>
Using cURL is a good start, but it's not going to be enough. As hanky suggested, you also need to use DOMDocument, and you can include DOMXPath as well.
Sample Code:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://www.w3schools.com/html/html_tables.asp');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
libxml_use_internal_errors(true);
$html = curl_exec($ch); // the whole document (as a string) goes in here
$dom = new DOMDocument();
$dom->loadHTML($html); // load it
libxml_clear_errors();
$xpath = new DOMXpath($dom);
// point it to the particular table:
// the table with a class named 'reference', second row (first data row), get its td cells
$table_row = $xpath->query('//table[@class="reference"]/tr[2]/td');
foreach ($table_row as $td) {
    echo $td->nodeValue . ' ';
}
Should output:
Jill Smith 50
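If you later want every data row rather than only the first one, a small sketch along the same lines (still assuming the table keeps its reference class) would be:
// skip the header row, then walk the cells of each remaining row
$rows = $xpath->query('//table[@class="reference"]/tr[position() > 1]');
foreach ($rows as $row) {
    $cells = array();
    foreach ($row->getElementsByTagName('td') as $td) {
        $cells[] = trim($td->nodeValue);
    }
    echo implode(' | ', $cells) . "\n";
}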

Parsing XML with SimpleXML returns nothing

I'm currently trying to parse the MapQuest Traffic API, but when I try to display an incident, nothing appears, and if I check the result with empty() in PHP, it comes back as empty.
Here's the code:
<?php
$mysongs = simplexml_load_file("http://www.mapquestapi.com/traffic/v1/incidents?key=Fmjtd%7Cluuan1u2nh%2C2a%3Do5-96rw5u&callback=handleIncidentsResponse&boundingBox=$_GET[a], $_GET[b], $_GET[c], $_GET[d]&filters=construction,incidents&inFormat=kvp&outFormat=xml");
echo $mysongs->Incidents[0]->Incident[0]->fullDesc;
?>
The parameters I'm passing: ?a=33.352532499999995&b=-118.2324383&c=34.352532499999995&d=-117.2324383.
Thanks in advance!
Here simplexml_load_file is not loading all of your XML data, so I fetched the feed with cURL, saved it to a file named test.xml, and then loaded the data from test.xml. Now you can print whatever data you need.
<?php
$a = $_GET['a'];
$b = $_GET['b'];
$c = $_GET['c'];
$d = $_GET['d'];
$xml_feed_url = 'http://www.mapquestapi.com/traffic/v1/incidents?key=Fmjtd|luuan1u2nh%2C2a%3Do5-96rw5u&callback=handleIncidentsResponse&boundingBox='.$a.','.$b.','.$c.','.$d.'&filters=construction,incidents&inFormat=kvp&outFormat=xml';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $xml_feed_url);
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$xml = curl_exec($ch);
curl_close($ch);
$xml2 = new SimpleXMLElement($xml);
$xml2->asXML("test.xml"); // save the fetched XML to a local file
$mysongs = simplexml_load_file("test.xml");
print_r($mysongs);
?>
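From there, accessing an incident should work the way the question attempted it. A minimal sketch, assuming the response really uses the Incidents/Incident/fullDesc structure shown in the question:
if (isset($mysongs->Incidents->Incident)) {
    foreach ($mysongs->Incidents->Incident as $incident) {
        echo $incident->fullDesc . "\n"; // print each incident description
    }
} else {
    echo "No incidents returned for this bounding box.";
}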

get information from html table using Curl

I need to get some information about some plants and put it into a MySQL table.
My knowledge of cURL and the DOM is pretty much nil, but I've come to this:
set_time_limit(0);
include('simple_html_dom.php');
$ch = curl_init ("http://davesgarden.com/guides/pf/go/1501/");
curl_setopt($ch, CURLOPT_USERAGENT,"Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.1) Gecko/2008070208 Firefox/3.0.1");
curl_setopt($ch, CURLOPT_HTTPHEADER, array("Accept-Language: es-es,en"));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_BINARYTRANSFER,1);
curl_setopt($ch, CURLOPT_TIMEOUT,0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
$data = curl_exec ($ch);
curl_close ($ch);
$html= str_get_html($data);
$e = $html->find("table", 8);
echo $e->innertext;
Now I'm really lost about how to move on from this point. Can you please guide me?
Thanks!
This is a mess.
But at least it's a (somewhat) consistent mess.
If this is a one-time extraction and not a rolling project, I'd personally use a quick and dirty regex on this instead of simple_html_dom; you'll be there all day fiddling with the tags otherwise.
For example, this regex pulls out the majority of title/data pairs (note the ~ delimiter, so the forward slashes in the closing tags don't need escaping):
$pattern = '~<b>(.*?)</b>\s*<br>(.*?)</?(td|p)>~si';
You'll need to do some pre- and post-cleaning before it gets them all, though.
I don't envy you having this task...
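As a rough sketch of how that pattern could be applied to the HTML already fetched into $data above (untested against the live page, so treat it as a starting point):
if (preg_match_all($pattern, $data, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $m) {
        $title = trim(strip_tags($m[1])); // e.g. "Family:"
        $value = trim(strip_tags($m[2]));
        echo $title . ' ' . $value . "\n";
    }
}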
Your best bet will be to wrap this in PHP ;)
Yes, this is an ugly hack for ugly HTML code.
<?php
ob_start();
system("
/usr/bin/env links -dump 'http://davesgarden.com/guides/pf/go/1501/' |
/usr/bin/env perl -lne 'm/((Family|Genus|Species):\s+\w+\s+\([\w-]+\))/ && \
print $1'
");
$out = ob_get_contents();
ob_end_clean();
print $out;
?>
Use Simple HTML DOM and you will be able to access any element, or any element's content, that you wish. Its API is very straightforward.
You can try something like this:
<?php
$ch = curl_init("http://www.digionline.ir/Allprovince/CategoryProducts/cat=10301");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$page = curl_exec($ch);

$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($page);
libxml_clear_errors();

$xpath = new DOMXpath($dom);
$data = array();

// get all table rows which are not header rows
$table_rows = $xpath->query('//table[@id="tbl-all-product-view"]/tr[@class!="rowH"]');
foreach ($table_rows as $row => $tr) {
    foreach ($tr->childNodes as $td) {
        $data[$row][] = preg_replace('~[\r\n]+~', '', trim($td->nodeValue));
    }
    $data[$row] = array_values(array_filter($data[$row]));
}

echo '<pre>';
print_r($data);
?>
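Since the end goal is to put the scraped rows into a MySQL table, here is a minimal follow-up sketch using PDO; the connection details, the plants table, and its columns are hypothetical placeholders for your own schema:
// hypothetical credentials and schema; adjust to your own database
$pdo = new PDO('mysql:host=localhost;dbname=garden;charset=utf8', 'user', 'password');
$stmt = $pdo->prepare('INSERT INTO plants (col1, col2, col3) VALUES (?, ?, ?)');
foreach ($data as $row) {
    // pad or trim each scraped row to the three columns the statement expects
    $stmt->execute(array_slice(array_pad($row, 3, null), 0, 3));
}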

simplexml_load_file from file not ending with .xml

I'm trying to parse an XML file, starting with simplexml_load_file to load the contents. The file comes from a WordPress site, via an XML feed generated by a .php file.
The problem is that it can never load the XML file. I'm not sure what I can do to make this work. Here is the code:
<?php
$url = "http://marshallmashup.usc.edu/feed.php";
$ch = curl_init();
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$result = curl_exec($ch);
curl_close($ch);
$rss = simplexml_load_string($result);
if (!($rss = simplexml_load_file($url, NULL, LIBXML_NOERROR | LIBXML_NOWARNING))) {
    echo 'unable to load XML file';
} else {
    echo 'XML file loaded successfully';
}
?>
First of all, after this line:
$result = curl_exec($ch);
you should add this one:
$result = utf8_encode($result);
That said, you'll have no problem with simplexml_load_string($result), which will correctly build the document from the string you give it, i.e. the feed fetched from the PHP page. You can see the result by calling var_dump($rss); after the statement $rss = simplexml_load_string($result);.
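If it still fails, a small sketch for surfacing the parser errors instead of silencing them (using the same $result as above):
libxml_use_internal_errors(true);
$rss = simplexml_load_string($result);
if ($rss === false) {
    // print every libxml parse error so you can see why the feed is rejected
    foreach (libxml_get_errors() as $error) {
        echo trim($error->message) . "\n";
    }
    libxml_clear_errors();
} else {
    var_dump($rss);
}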
