<?php
$url = "http://www.justdial.com/Delhi-NCR/Pizza-Outlets-%3Cnear%3E-Okhla/";
$ptr = fopen("op.txt","w");
$data = file_get_contents($url);
print_r($data);
$result = htmlentities($data);
$doc = new DOMDocument();
#$doc->loadHTML($result);
$finder = new DOMXPath($doc);
$node = $finder->query("//h3[contains(#class, 'r')]");
?>
Above is the code which I have written to fetch the source code of justdial. The only output which I get is the first pizza outlet.How can I fetch all the results which are shown on the justdial website.
Thanks in advance.
All items are part of the '<div id="tab_block">' html element which its content is built by javascript/AJAX calls, so they cannot appear in the HTML file you load via file_get_contents() since you will get the HTML definition only without the javascript code being interpreted.
However, it means you can access the items/database directly by code if you know the endpoints.
For example (url are shown as when I tested them)
This url will return the first few items of the complete list. It will return something like (in JSON format):
[{docid: "011PXX11.XX11.151106170721.W5H9",…}, {docid: "011PXX11.XX11.140302105210.Y9N8",…},…]
0: {docid: "011PXX11.XX11.151106170721.W5H9",…}
disp_pic: "http://images.jdmagicbox.com/delhi/h9/011pxx11.xx11.151106170721.w5h9/catalogue/6cf575ffbd1090f5a314d2cf40451c88.jpg"
docid: "011PXX11.XX11.151106170721.W5H9"
1: {docid: "011PXX11.XX11.140302105210.Y9N8",…}
disp_pic: "http://images.jdmagicbox.com/delhi/n8/011pxx11.xx11.140302105210.y9n8/catalogue/ecfd2106644df17013e98bb60f40c527.jpg"
docid: "011PXX11.XX11.140302105210.Y9N8"
video: "http://videos.jdmagicbox.com/delhi/n8/011pxx11.xx11.140302105210.y9n8/video/fc2a62242ae03c74c15436dbcc04c33a_m.jpg"
...
the docid can be used to do further query on a particular item, while the disp_pic url will return the image
This url will return the image of the 1st item too but use some parameters
In any case, I just scratch the surface of the whole issue to demonstrate how to proceed. You would need to understand the site logic to read the complete dataset, but it would be easier to contact the webmaster and ask him to describe its API/endpoints for you to access the data. As well as asking him permission to use it even if the 'API' is not protected.
Once you know the endpoint, structure and data description, you can use a PHP library like mashape\unirest to do queries like this:
Unirest\Request::verifyPeer (false) ;
$response =Unirest\Request::get (
'http://www.justdial.com/functions/sortbyphotosnew.php?contractid=011PXX11.XX11.151106170721.W...,
array ( 'Accept' => 'application/json' ),
null
) ;
if $response->code == 200 then the $response->body is your JSON object containing the document array.
Related
Im trying to load search result from an library api using Search and Retrieve via URL (SRU) at : https://data.norge.no/data/bibsys/bibsys-bibliotekbase-bibliografiske-data-sru
If you see the search result links there, its looks pretty much like XML but when i try like i have before with xml using the code below, it just returns a empty object,
SimpleXMLElement {#546}
whats going on here?
My php function in my laravel project:
public function bokId($bokid) {
$apiUrl = "http://sru.bibsys.no/search/biblio?version=1.2&operation=searchRetrieve&startRecord=1&maximumRecords=10&query=ibsen&recordSchema=marcxchange";
$filename = "bok.xml";
$xmlfile = file_get_contents($apiUrl);
file_put_contents($filename, $xmlfile); // xml file is saved.
$fileXml = simplexml_load_string($xmlfile);
dd($fileXml);
}
If i do:
dd($xmlfile);
instead, it echoes out like this:
Making me very confused that i cannot get an object to work with. Code i present have worked fine before.
It may be that the data your being provided ha changed format, but the data is still there and you can still use it. The main problem with using something like dd() is that it doesn't work well with SimpleXMLElements, it tends to have it's own idea of what you want to see of what data there is.
In this case the namespaces are the usual problem. But if you look at the following code you can see a quick way of getting the data from a specific namespace, which you can then easily access as normal. In this code I use ->children("srw", true) to say fetch all child elements that are in the namespace srw (the second argument indicates that this is the prefix and not the URL)...
$apiUrl = "http://sru.bibsys.no/search/biblio?version=1.2&operation=searchRetrieve&startRecord=1&maximumRecords=10&query=ibsen&recordSchema=marcxchange";
$filename = "bok.xml";
$xmlfile = file_get_contents($apiUrl);
file_put_contents($filename, $xmlfile); // xml file is saved.
$fileXml = simplexml_load_string($xmlfile);
foreach ( $fileXml->children("srw", true)->records->record as $record) {
echo "recordIdentifier=".$record->recordIdentifier.PHP_EOL;
}
This outputs...
recordIdentifier=792012771
recordIdentifier=941956423
recordIdentifier=941956466
recordIdentifier=950546232
recordIdentifier=802109055
recordIdentifier=910941041
recordIdentifier=940589451
recordIdentifier=951721941
recordIdentifier=080703852
recordIdentifier=011800283
As I'm not sure which data you want to retrieve as the title, I just wanted to show the idea of how to fetch data when you have a list of possibilities. In this example I'm using XPath to look in each <srw:record> element and find the <marc:datafield tag="100"...> element and in that the <marc:subfield code="a"> element. This is done using //marc:datafield[#tag='100']/marc:subfield[#code='a']. You may need to adjust the #tag= bit to the datafield your after and the #code= to point to the subfield your after.
$fileXml = simplexml_load_string($xmlfile);
$fileXml->registerXPathNamespace("marc","info:lc/xmlns/marcxchange-v1");
foreach ( $fileXml->children("srw", true)->records->record as $record) {
echo "recordIdentifier=".$record->recordIdentifier.PHP_EOL;
$data = $record->xpath("//marc:datafield[#tag='100']/marc:subfield[#code='a']");
$subData=$data[0]->children("marc", true);
echo "Data=".(string)$data[0].PHP_EOL;
}
One of my sources for data recently changed how they are providing a json file to me, they added something before the actual output, and I am having trouble getting the values to display on my landing page.
Old json output
string(6596) "[{"id":239,"solution_id":3486," etc...
New json output
string(6614) "{"picker_offers":[{"id":239,"solution_id":3486," etc...
For my landing page I am using the following:
$datastream = json_decode($result);
foreach($datastream as $component) {
$productid = $component->id;
I was able to successfully output the data to php from their old output, but I am not sure how to get around the value "picker_offers" that is being passed as part of the json file, but it isn't part of the actual data to output.
How can I not include that "picker_offers", or what can I do to be able to read the data? With this new output there is an extra curly bracket wrapper called "picker_offers" around the entire output.
Thank you very much
Solution 1 : if you want to remove picker_offers
$datastream = json_decode($result);
$picker_offers = $datastream->picker_offers;
unset($datastream->picker_offers);
$datastream = $picker_offers;
foreach($datastream as $component) {
$productid = $component->id;
}
Solution 2 : if you don't want to remove picker_offers
$datastream = json_decode($result);
foreach($datastream->picker_offers as $component) {
$productid = $component->id;
}
First, I am pretty clueless with PHP, so be kind! I am working on a site for an SPCA (I'm a Vet and a part time geek). The PHP accesses an xml file from a portal used to administer the shelter and store images, info. The file writes that xml data to JSON and then I use the JSON data in a handlebars template, etc. I am having a problem getting some data from the xml file to outprint to JSON.
The xml file is like this:
</DataFeedAnimal>
<AdditionalPhotoUrls>
<string>doc_73737.jpg</string>
<string>doc_74483.jpg</string>
<string>doc_74484.jpg</string>
</AdditionalPhotoUrls>
<PrimaryPhotoUrl>19427.jpg</PrimaryPhotoUrl>
<Sex>Male</Sex>
<Type>Cat</Type>
<YouTubeVideoUrls>
<string>http://www.youtube.com/watch?v=6EMT2s4n6Xc</string>
</YouTubeVideoUrls>
</DataFeedAnimal>
In the PHP file, written by a friend, the code is below, (just part of it), to access that XML data and write it to JSON:
<?php
$url = "http://eastbayspcapets.shelterbuddy.com/DataFeeds/AnimalsForAdoption.aspx";
if ($_GET["type"] == "found") {
$url = "http://eastbayspcapets.shelterbuddy.com/DataFeeds/foundanimals.aspx";
} else if ($_GET["type"] == "lost") {
$url = "http://eastbayspcapets.shelterbuddy.com/DataFeeds/lostanimals.aspx";
}
$response_xml_data = file_get_contents($url);
$xml = simplexml_load_string($response_xml_data);
$data = array();
foreach($xml->DataFeedAnimal as $animal)
{
$item = array();
$item['sex'] = (string)$animal->Sex;
$item['photo'] = (string)$animal->PrimaryPhotoUrl;
$item['videos'][] = (string)$animal->YouTubeVideoUrls;
$item['photos'][] = (string)$animal->PrimaryPhotoUrl;
foreach($animal->AdditionalPhotoUrls->string as $photo) {
$item['photos'][] = (string)$photo;
}
$item['videos'] = array();
$data[] = $item;
}
echo file_put_contents('../adopt.json', json_encode($data));
echo json_encode($data);
?>
The JSON output works well but I am unable to get 'videos' to write out to the JSON file as the 'photos' do. I just get '/n'!
Since the friend who helped with this is no longer around, I am stuck. I have tried similar code to the foreach statement for photos but am getting nowhere. Any help would be appreciated and the pets would appreciate it as well!
The trick with such implementations is to always look what you have got by dumping data structures to a log file or command line. Then to take a look at the documentation of the data you see. That way you know exactly what data you are working with and how to work with it ;-)
Here it turns out that the video URLs you are interested in are placed inside an object of type SimpleXMLElement with public properties, which is not really surprising if you look at the xml structure. The documentation of class SimpleXMLElement shows the method children() which iterates through all children. Just what we are looking for...
That means a clean implementation to access those sets should go along these lines:
foreach($animal->AdditionalPhotoUrls->children() as $photo) {
$item['photos'][] = (string)$photo;
}
foreach($animal->YouTubeVideoUrls->children() as $video) {
$item['videos'][] = (string)$video;
}
Take a look at this full and working example:
<?php
$response_xml_data = <<< EOT
<DataFeedAnimal>
<AdditionalPhotoUrls>
<string>doc_73737.jpg</string>
<string>doc_74483.jpg</string>
<string>doc_74484.jpg</string>
</AdditionalPhotoUrls>
<PrimaryPhotoUrl>19427.jpg</PrimaryPhotoUrl>
<Sex>Male</Sex>
<Type>Cat</Type>
<YouTubeVideoUrls>
<string>http://www.youtube.com/watch?v=6EMT2s4n6Xc</string>
<string>http://www.youtube.com/watch?v=hgfg83mKFnd</string>
</YouTubeVideoUrls>
</DataFeedAnimal>
EOT;
$animal = simplexml_load_string($response_xml_data);
$item = [];
$item['sex'] = (string)$animal->Sex;
$item['photo'] = (string)$animal->PrimaryPhotoUrl;
$item['photos'][] = (string)$animal->PrimaryPhotoUrl;
foreach($animal->AdditionalPhotoUrls->children() as $photo) {
$item['photos'][] = (string)$photo;
}
$item['videos'] = [];
foreach($animal->YouTubeVideoUrls->children() as $video) {
$item['videos'][] = (string)$video;
}
echo json_encode($item);
The obvious output of this is:
{
"sex":"Male",
"photo":"19427.jpg",
"photos" ["19427.jpg","doc_73737.jpg","doc_74483.jpg","doc_74484.jpg"],
"videos":["http:\/\/www.youtube.com\/watch?v=6EMT2s4n6Xc","http:\/\/www.youtube.com\/watch?v=hgfg83mKFnd"]
}
I would however like to add a short hint:
In m eyes it is questionable to convert such structured information into an associative array. Why? Why not a simple json_encode($animal)? The structure is perfectly fine and should be easy to work with! The output of that would be:
{
"AdditionalPhotoUrls":{
"string":[
"doc_73737.jpg",
"doc_74483.jpg",
"doc_74484.jpg"
]
},
"PrimaryPhotoUrl":"19427.jpg",
"Sex":"Male",
"Type":"Cat",
"YouTubeVideoUrls":{
"string":[
"http:\/\/www.youtube.com\/watch?v=6EMT2s4n6Xc",
"http:\/\/www.youtube.com\/watch?v=hgfg83mKFnd"
]
}
}
That structure describes objects (items with an inner structure, enclosed in json by {...}), not just arbitrary arrays (sets without a structure, enclosed in json by a [...]). Arrays are only used for the two unstructured sets of strings in there: photos and videos. This is much more logical, once you think about it...
Assuming the XML Data and the JSON data are intended to have the same structure. I would take a look at this: PHP convert XML to JSON
You may not need for loops at all.
Hi I have never used xml but need to now, so I am trying to quickly learn but struggling with the structure I think. This is just to display the weather at the top of someones website.
I want to display Melbourne weather using this xml link ftp://ftp2.bom.gov.au/anon/gen/fwo/IDV10753.xml
Basically I am trying get Melbourne forecast for 3 days (what ever just something that works) there is a forecast-period array [0] to [6]
I used this print_r to view the structure:
$url = "linkhere";
$xml = simplexml_load_file($url);
echo "<pre>";
print_r($xml);
and tried this just to get something:
$url = "linkhere";
$xml = simplexml_load_file($url);
$data = (string) $xml->forecast->area[52]->description;
echo $data;
Which gave me nothing (expected 'Melbourne'), obviously I need to learn and I am but if someone could help that would be great.
Because description is an attribute of <area>, you need to use
$data = (string) $xml->forecast->area[52]['description'];
I also wouldn't rely on Melbourne being the 52nd area node (though this is really up to the data maintainers). I'd go by its aac attribute as this appears to be unique, eg
$search = $xml->xpath('forecast/area[#aac="VIC_PT042"]');
if (count($search)) {
$melbourne = $search[0];
echo $melbourne['description'];
}
This is a working example for you:
<?php
$forecastdata = simplexml_load_file('ftp://ftp2.bom.gov.au/anon/gen/fwo/IDV10753.xml','SimpleXMLElement',LIBXML_NOCDATA);
foreach($forecastdata->forecast->area as $singleregion) {
$area = $singleregion['description'];
$weather = $singleregion->{'forecast-period'}->text;
echo $area.': '.$weather.'<hr />';
}
?>
You can edit the aforementioned example to extract the tags and attributes you want.
Always remember that a good practice to understand the structure of your XML object is printing out its content using, for instance, print_r
In the specific case of the XML you proposed, cities are specified through attributes (description). For this reason you have to read also those attributes using ['attribute name'] (see here for more information).
Notice also that the tag {'forecast-period'} is wrapped in curly brackets cause it contains a hyphen, and otherwise it wouldn generate an error.
I'm fairly new to php although I've been programming for a couple years.
I'm working on a project and the end goal is to load certain elements of an xml file into an oracle table on a nightly basis. I have a script which runs nightly and saves a the file on my local machine. I've searched endlessly for answers but have been unsuccessful.
Here is an aggregated example of the xml file.
<?xml version="1.0" encoding="UTF-8" ?>
<Report account="7869" start_time="2012-02-23T00:00:00+00:00" end_time="2012-02-23T15:27:59+00:00" user="twilson" more_sessions="false">
<Session id="ID742247692" realTimeID="4306650378">
<Visitor id="5390643113837">
<ip>128.XXX.XX.XX</ip>
<agent>MSIE 8.0</agent>
</Visitor>
</Session>
<Session id="ID742247695" realTimeID="4306650379">
<Visitor id="7110455516320">
<ip>173.XX.XX.XXX</ip>
<agent>Chrome 17.0.963.56</agent>
</Visitor>
</Session>
</Report>
One thing to note is that the xml file will contain several objects which I will need to load into my table and the above example would just be for two rows of data. I'm familiar with the whole process of connecting and loading data into oracle and have setup similar scripts which perform ETL of txt. and csv. files using php. Unfortunately for me in this case the data is stored in xml. The approach I've taken when loading a csv. file is to load the data into an array and proceed from there.
I'm pretty certain that I can use something similar and perhaps create variable for each or something similar but am not really too sure how to do that with an xml. file.
$xml = simplexml_load_file('C:/Dev/report.xml');
echo $xml->Report->Session->Visitor->agent;
In the above code i'm trying to just return the agent associated with each visitor. This returns an error 'Trying to get property of non-object in C:\PHP\chatTest.php on line 11'
The end result would be for me to load the data into a table similar to the example I provided would be to load two rows into my table which would look similar to below however I think I can handle that if i'm able to get the data into an array or something similar.
IP|AGENT
128.XXX.XX.XX MSIE 8.0
173.XX.XX.XXX Chrome 17.0.963.56
Any help would be greatly appreciated.
Revised Code:
$doc = new DOMDocument();
$doc->load( 'C:/Dev/report.xml' );
$sessions = $doc->getElementsByTagName( "Session" );
foreach( $sessions as $session )
{
$visitors = $session->getElementsByTagName( "Visitor" );
foreach( $visitors as $visitor )
$sessionid = $session->getAttribute( 'realTimeID' );
{
$ips = $visitor->getElementsByTagName( "ip" );
$ip = $ips->item(0)->nodeValue;
$agents = $visitor->getElementsByTagName( "agent" );
$agent = $ips->item(0)->nodeValue;
echo "$sessionid- $ip- $agent\n";
}}
?>
The -> operator in PHP means that you are trying to invoke a field or method on an object. Since Report is not a method within $xml, you are receiving the error that you are trying to invoke a property on a non-object.
You can try something like this (don't know if it works, didn't test it and haven't written PHP for a long time, but you can google it):
$doc = new DOMDocument();
$doc->loadXML($content);
foreach ($doc->getElementsByTagName('Session') as $node)
{
$agent = $node->getElementsByTagName('Visitor')->item(0)->getElementsByTagName('agent')->item(0)->nodeValue;
}
edit:
Adding stuff to an array in PHP is easy as this:
$arr = array();
$arr[] = "some data";
$arr[] = "some more data";
The PHP arrays should be seen as a list, since they can be resized on the fly.
I was able to figure this out using simplexml_load_file rather than the DOM approach. Although DOM works after modifying the Leon's suggestion the approach below is what I would suggest.
$xml_object = simplexml_load_file('C:/Dev/report.xml');
foreach($xml_object->Session as $session) {
foreach($session->Visitor as $visitor) {
$ip = $visitor->ip;
$agent = $visitor->agent;
}
echo $ip.','.$agent."\n";
}