Use PHP to load XML Data into Oracle - php

I'm fairly new to PHP, although I've been programming for a couple of years.
I'm working on a project where the end goal is to load certain elements of an XML file into an Oracle table on a nightly basis. I have a script which runs nightly and saves the file on my local machine. I've searched endlessly for answers but have been unsuccessful.
Here is a shortened example of the XML file.
<?xml version="1.0" encoding="UTF-8" ?>
<Report account="7869" start_time="2012-02-23T00:00:00+00:00" end_time="2012-02-23T15:27:59+00:00" user="twilson" more_sessions="false">
<Session id="ID742247692" realTimeID="4306650378">
<Visitor id="5390643113837">
<ip>128.XXX.XX.XX</ip>
<agent>MSIE 8.0</agent>
</Visitor>
</Session>
<Session id="ID742247695" realTimeID="4306650379">
<Visitor id="7110455516320">
<ip>173.XX.XX.XXX</ip>
<agent>Chrome 17.0.963.56</agent>
</Visitor>
</Session>
</Report>
One thing to note is that the XML file will contain several Session objects which I will need to load into my table; the example above would produce just two rows of data. I'm familiar with the whole process of connecting to Oracle and loading data, and have set up similar scripts which perform ETL of .txt and .csv files using PHP. Unfortunately, in this case the data is stored in XML. The approach I've taken when loading a .csv file is to load the data into an array and proceed from there.
I'm pretty certain that I can do something similar here, perhaps creating a variable for each element, but I'm not really sure how to do that with an XML file.
$xml = simplexml_load_file('C:/Dev/report.xml');
echo $xml->Report->Session->Visitor->agent;
In the above code I'm just trying to return the agent associated with each visitor. This returns the error 'Trying to get property of non-object in C:\PHP\chatTest.php on line 11'.
The end result would be to load the data into a table. For the example I provided, that means two rows which would look similar to the ones below. I think I can handle that part if I'm able to get the data into an array or something similar.
IP|AGENT
128.XXX.XX.XX|MSIE 8.0
173.XX.XX.XXX|Chrome 17.0.963.56
Any help would be greatly appreciated.
Revised Code:
$doc = new DOMDocument();
$doc->load( 'C:/Dev/report.xml' );
$sessions = $doc->getElementsByTagName( "Session" );
foreach( $sessions as $session )
{
    // realTimeID is an attribute of <Session>, so read it once per session
    $sessionid = $session->getAttribute( 'realTimeID' );
    $visitors = $session->getElementsByTagName( "Visitor" );
    foreach( $visitors as $visitor )
    {
        $ips = $visitor->getElementsByTagName( "ip" );
        $ip = $ips->item(0)->nodeValue;
        $agents = $visitor->getElementsByTagName( "agent" );
        // read the agent from $agents, not from $ips
        $agent = $agents->item(0)->nodeValue;
        echo "$sessionid- $ip- $agent\n";
    }
}

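The -> operator in PHP accesses a property or method of an object. simplexml_load_file() returns the root element of the document, so $xml already is the <Report> element. $xml->Report therefore looks for a child named Report, finds none, and further down the chain you end up reading a property of something that is not an object, which is the error you are receiving.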
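A quick check along these lines can make the point visible. It is only a sketch: the XML is a shortened copy of the sample from the question, loaded from a string instead of from C:/Dev/report.xml, and the variable names are made up for illustration.

```php
<?php
// Shortened copy of the sample report from the question (assumption:
// the real file has the same shape, just with more sessions).
$xmlString = <<<XML
<?xml version="1.0" encoding="UTF-8" ?>
<Report account="7869">
  <Session id="ID742247692" realTimeID="4306650378">
    <Visitor id="5390643113837">
      <ip>128.XXX.XX.XX</ip>
      <agent>MSIE 8.0</agent>
    </Visitor>
  </Session>
</Report>
XML;

$xml = simplexml_load_string($xmlString);

// The object returned by the loader is the root element itself.
$rootName = $xml->getName();

// So children of <Report> are addressed directly, without ->Report.
$firstAgent = (string) $xml->Session[0]->Visitor->agent;

echo $rootName, PHP_EOL;
echo $firstAgent, PHP_EOL;
```

In other words, dropping ->Report from the original chain should be enough to reach the first agent.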
You can try something like this (untested, and it has been a long time since I wrote PHP, so check the details against the manual):
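$doc = new DOMDocument();
// load() reads a file; use loadXML($content) instead if the XML is already in a string
$doc->load('C:/Dev/report.xml');
foreach ($doc->getElementsByTagName('Session') as $node)
{
    // first <agent> inside the first <Visitor> of this <Session>
    $agent = $node->getElementsByTagName('Visitor')->item(0)->getElementsByTagName('agent')->item(0)->nodeValue;
}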
edit:
Adding items to an array in PHP is as easy as this:
$arr = array();
$arr[] = "some data";
$arr[] = "some more data";
PHP arrays can be treated as lists, since they are resized on the fly.

I was able to figure this out using simplexml_load_file rather than the DOM approach. The DOM version works after modifying Leon's suggestion, but the approach below is what I would suggest.
$xml_object = simplexml_load_file('C:/Dev/report.xml');
foreach($xml_object->Session as $session) {
    foreach($session->Visitor as $visitor) {
        $ip = $visitor->ip;
        $agent = $visitor->agent;
        // echo inside the inner loop so every visitor is printed, not only the last one per session
        echo $ip.','.$agent."\n";
    }
}
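Since the question mentions getting the data into an array before loading it, here is a minimal sketch of that step. The XML is the sample from the question embedded as a string, and the array keys (session_id, ip, agent) are names chosen for illustration only.

```php
<?php
// Sample data copied from the question; a real run would use
// simplexml_load_file() on the nightly file instead.
$xmlString = <<<XML
<Report account="7869">
  <Session id="ID742247692" realTimeID="4306650378">
    <Visitor id="5390643113837">
      <ip>128.XXX.XX.XX</ip>
      <agent>MSIE 8.0</agent>
    </Visitor>
  </Session>
  <Session id="ID742247695" realTimeID="4306650379">
    <Visitor id="7110455516320">
      <ip>173.XX.XX.XXX</ip>
      <agent>Chrome 17.0.963.56</agent>
    </Visitor>
  </Session>
</Report>
XML;

$report = simplexml_load_string($xmlString);

$rows = [];
foreach ($report->Session as $session) {
    foreach ($session->Visitor as $visitor) {
        // Cast to string: SimpleXML hands back objects, not plain text.
        $rows[] = [
            'session_id' => (string) $session['realTimeID'],
            'ip'         => (string) $visitor->ip,
            'agent'      => (string) $visitor->agent,
        ];
    }
}

// One line per row, pipe-separated.
foreach ($rows as $row) {
    echo implode('|', $row), PHP_EOL;
}
```

From here each element of $rows can be handed to the same insert logic already used for the CSV loads, for example by binding the values with oci_bind_by_name; that part is left out because it depends on the table layout.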

Related

Loading a Search and Retrieve via URL (SRU) in php with simplexml_load_string returns an empty object

I'm trying to load search results from a library API using Search and Retrieve via URL (SRU) at: https://data.norge.no/data/bibsys/bibsys-bibliotekbase-bibliografiske-data-sru
If you look at the search result links there, the response looks pretty much like XML, but when I load it the way I have loaded XML before, using the code below, it just returns an empty object:
SimpleXMLElement {#546}
What's going on here?
My PHP function in my Laravel project:
public function bokId($bokid) {
$apiUrl = "http://sru.bibsys.no/search/biblio?version=1.2&operation=searchRetrieve&startRecord=1&maximumRecords=10&query=ibsen&recordSchema=marcxchange";
$filename = "bok.xml";
$xmlfile = file_get_contents($apiUrl);
file_put_contents($filename, $xmlfile); // xml file is saved.
$fileXml = simplexml_load_string($xmlfile);
dd($fileXml);
}
If i do:
dd($xmlfile);
instead, it echoes out like this:
This confuses me, because I cannot get an object to work with. The code I present has worked fine before.
It may be that the data you are being provided has changed format, but the data is still there and you can still use it. The main problem with using something like dd() is that it doesn't work well with SimpleXMLElement objects; it tends to have its own idea of which parts of the data to show.
In this case the namespaces are the usual problem. But if you look at the following code you can see a quick way of getting the data from a specific namespace, which you can then easily access as normal. In this code I use ->children("srw", true) to say fetch all child elements that are in the namespace srw (the second argument indicates that this is the prefix and not the URL)...
$apiUrl = "http://sru.bibsys.no/search/biblio?version=1.2&operation=searchRetrieve&startRecord=1&maximumRecords=10&query=ibsen&recordSchema=marcxchange";
$filename = "bok.xml";
$xmlfile = file_get_contents($apiUrl);
file_put_contents($filename, $xmlfile); // xml file is saved.
$fileXml = simplexml_load_string($xmlfile);
foreach ( $fileXml->children("srw", true)->records->record as $record) {
echo "recordIdentifier=".$record->recordIdentifier.PHP_EOL;
}
This outputs...
recordIdentifier=792012771
recordIdentifier=941956423
recordIdentifier=941956466
recordIdentifier=950546232
recordIdentifier=802109055
recordIdentifier=910941041
recordIdentifier=940589451
recordIdentifier=951721941
recordIdentifier=080703852
recordIdentifier=011800283
As I'm not sure which data you want to retrieve as the title, I just wanted to show the idea of how to fetch data when you have a list of possibilities. In this example I'm using XPath to look in each <srw:record> element and find the <marc:datafield tag="100"...> element and in that the <marc:subfield code="a"> element. This is done using .//marc:datafield[@tag='100']/marc:subfield[@code='a']; the leading . keeps the search inside the current record instead of starting again from the top of the document. You may need to adjust the @tag= bit to the datafield you're after and the @code= to point to the subfield you're after.
$fileXml = simplexml_load_string($xmlfile);
foreach ( $fileXml->children("srw", true)->records->record as $record) {
    echo "recordIdentifier=".$record->recordIdentifier.PHP_EOL;
    // register the marc prefix on the element the query is run from
    $record->registerXPathNamespace("marc","info:lc/xmlns/marcxchange-v1");
    $data = $record->xpath(".//marc:datafield[@tag='100']/marc:subfield[@code='a']");
    // not every record has to contain the requested field
    if (!empty($data)) {
        echo "Data=".(string)$data[0].PHP_EOL;
    }
}
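To try the namespace handling without calling the live service, a self-contained sketch like the following may help. The XML is a made-up, heavily trimmed response: the record identifiers (A1, A2) and the author value are placeholders, and only the two namespace URIs are taken from the kind of response discussed above.

```php
<?php
// Hypothetical, trimmed-down SRU response; the real feed has many more fields.
$xmlString = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<srw:searchRetrieveResponse xmlns:srw="http://www.loc.gov/zing/srw/">
  <srw:records>
    <srw:record>
      <srw:recordIdentifier>A1</srw:recordIdentifier>
      <srw:recordData>
        <marc:record xmlns:marc="info:lc/xmlns/marcxchange-v1">
          <marc:datafield tag="100">
            <marc:subfield code="a">Ibsen, Henrik</marc:subfield>
          </marc:datafield>
        </marc:record>
      </srw:recordData>
    </srw:record>
    <srw:record>
      <srw:recordIdentifier>A2</srw:recordIdentifier>
      <srw:recordData>
        <marc:record xmlns:marc="info:lc/xmlns/marcxchange-v1"/>
      </srw:recordData>
    </srw:record>
  </srw:records>
</srw:searchRetrieveResponse>
XML;

$response = simplexml_load_string($xmlString);

$authors = [];
foreach ($response->children('srw', true)->records->record as $record) {
    // Make the marc prefix known to XPath queries run from this record.
    $record->registerXPathNamespace('marc', 'info:lc/xmlns/marcxchange-v1');
    // The leading "." limits the search to the current record.
    $hits = $record->xpath(".//marc:datafield[@tag='100']/marc:subfield[@code='a']");
    // Records without the field are stored as null.
    $authors[(string) $record->recordIdentifier] = $hits ? (string) $hits[0] : null;
}

foreach ($authors as $id => $author) {
    echo $id, ': ', $author ?? '(no field 100$a)', PHP_EOL;
}
```

The second record deliberately has no datafield 100, to show why the result of xpath() is worth checking before indexing into it.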

simple_xml_load_string if nothing is returned

I am attempting to only run a loop if xml results actually exist. I am getting the xml results via:
$albums = simplexml_load_string(curl_get($api_url . '/videos.xml'));
What I want to be able to do is that on the next line say:
if($albums = hasAValue())
// Loop
Any ideas? Or a way to check before I load the XML data?
Side note: This is using the Vimeo API.
No, you need to go further down into the result using its namespace: register the namespace, work down to the body with an XPath expression, and carry on from there.
$albums->registerXPathNamespace('soap', 'http://schemas.xmlsoap.org/soap/envelope/');
To be specific, let me know the XML response you are getting and I will show you the output.
UPDATED
$albums = simplexml_load_string("#your response#");
// count the children of the object that was just loaded
echo count($albums->children());
The dirty way:
$albums = @simplexml_load_string(curl_get($api_url . '/videos.xml'));
if ($albums)
{
...
}
This is dirty because of the Error Control Operator @ which is used to "deal" with the error cases (e.g. problem fetching the remote location).
The alternative is to differentiate more here:
$xml = curl_get($api_url . '/videos.xml');
$albums = NULL;
if ($xml)
{
$albums = simplexml_load_string($xml);
}
if ($albums)
{
...
}
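A sketch of a third variant, which avoids the @ operator by letting libxml collect parse errors internally. The helper name loadXmlOrNull() and the <videos>/<video>/<title> element names are made up for illustration; the actual Vimeo response may look different.

```php
<?php
/**
 * Hypothetical helper: returns the parsed document, or null when the
 * input is empty or not well-formed XML.
 */
function loadXmlOrNull(string $raw): ?SimpleXMLElement
{
    if (trim($raw) === '') {
        return null;
    }
    // Collect libxml errors internally instead of emitting warnings.
    $previous = libxml_use_internal_errors(true);
    $xml = simplexml_load_string($raw);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $xml === false ? null : $xml;
}

// Placeholder input; in the question this would be the curl_get() result.
$albums = loadXmlOrNull('<videos><video><title>Demo</title></video></videos>');

// Only loop when parsing worked and there is at least one child element.
if ($albums !== null && count($albums->children()) > 0) {
    foreach ($albums->video as $video) {
        echo $video->title, PHP_EOL;
    }
}
```

The comparison against false is strict on purpose: a SimpleXMLElement for an empty element with no attributes evaluates to false in a plain if ($albums) test, even though parsing succeeded.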

Store XML Data to MySQL

I have an XML file. I want to save all the data from the XML file to the database
The file structure of XML is like
<STORY>
<BYLINE>abc</BYLINE>
<STORYID>123456</STORYID>
</STORY>
The code for storing data to database that I am using is
$dom = new DOMDOcument();
$dom->loadXML(equitymarketnews/$zname);
$xpath = new DOMXpath($dom);
$res = $xpath->query("//STORY/");
$allres = array();
foreach($res as $node){
$result = array();
$byline = mysql_real_escape_string($node->getElementsByTagName("BYLINE")->item(0)->nodeValue);
$storyid = mysql_real_escape_string($node->getElementsByTagName("STORYID")->item(0)->nodeValue);
}
$sql12="insert into equitymarketnews values('$byline','$storyid')";
mysql_query($sql12);
I am getting nothing in my database. All values are blanks.
Where am I going wrong?
I think something is wrong with this line
$res = $xpath->query("//STORY/");
I want to store the data (i.e. abc and 123456) from the XML file in a database table.
I don't really know what your question is, but assuming that the code you posted does not work as you expect, one thing I noticed is the insertion of the record:
$sql12="insert into equitymarketnews values('$byline','$storyid','$pubdate','$author','$cat','$subcat','$titleline','$subtitleline','$storymain','$flag')";
mysql_query($sql12);
Shouldn't it be inside your foreach loop? Otherwise at most the values from the last node are inserted.
In my opinion it should look something like this:
foreach($res as $node){
$result = array();
$byline = mysql_real_escape_string($node->getElementsByTagName("BYLINE")->item(0)->nodeValue);
$storyid = mysql_real_escape_string($node->getElementsByTagName("STORYID")->item(0)->nodeValue);
$sql12="insert into equitymarketnews values('$byline','$storyid')";
mysql_query($sql12);
}
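The suspicion about the XPath line in the question looks justified: "//STORY/" ends in a slash and is not a valid XPath expression, and loadXML() expects an XML string, while load() is the method that takes a file name. The sketch below shows the parsing half only, under a few assumptions: the sample XML is inline, the <STORIES> wrapper element is invented so the document has a single root, and no database call is made.

```php
<?php
// Inline sample; <STORIES> is a hypothetical root element.
// For a file on disk, $dom->load('path/to/file.xml') would be used instead.
$xmlString = <<<XML
<STORIES>
  <STORY>
    <BYLINE>abc</BYLINE>
    <STORYID>123456</STORYID>
  </STORY>
  <STORY>
    <BYLINE>def</BYLINE>
    <STORYID>123457</STORYID>
  </STORY>
</STORIES>
XML;

$dom = new DOMDocument();
$dom->loadXML($xmlString);
$xpath = new DOMXPath($dom);

$rows = [];
// Note: no trailing slash in the expression.
foreach ($xpath->query('//STORY') as $story) {
    $rows[] = [
        // string(...) evaluates relative to the current <STORY> node.
        'byline'  => $xpath->evaluate('string(BYLINE)', $story),
        'storyid' => $xpath->evaluate('string(STORYID)', $story),
    ];
}

foreach ($rows as $row) {
    echo $row['byline'], ' / ', $row['storyid'], PHP_EOL;
}
```

Each entry of $rows would then be inserted inside the loop. The mysql_* functions used in the question were removed in PHP 7, so current code would go through mysqli or PDO with a prepared statement rather than building the SQL string by hand.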
You can also import XML data directly with the mysql client. I don't have enough experience with it to provide a full code sample, but the MySQL docs cover it in some detail.
Essentially, you can do something like this:
LOAD XML LOCAL INFILE 'address.xml' INTO TABLE equitymarketnews ROWS IDENTIFIED BY '<STORY>';

Parsing XML with PHP (simplexml)

Firstly, may I point out that I am a newcomer to all things PHP so apologies if anything here is unclear and I'm afraid the more layman the response the better. I've been having real trouble parsing an xml file in to php to then populate an HTML table for my website. At the moment, I have been able to get the full xml feed in to a string which I can then echo and view and all seems well. I then thought I would be able to use simplexml to pick out specific elements and print their content but have been unable to do this.
The xml feed will be constantly changing (structure remaining the same) and is in compressed format. From various sources I've identified the following commands to get my feed in to the right format within a string although I am still unable to print specific elements. I've tried every combination without any luck and suspect I may be barking up the wrong tree. Could someone please point me in the right direction?!
$file = fopen("compress.zlib://$url", 'r');
$xmlstr = file_get_contents($url);
$xml = new SimpleXMLElement($url,null,true);
foreach($xml as $name) {
echo "{$name->awCat}\r\n";
}
Many, many thanks in advance,
Chris
PS The actual feed
Since no one followed my closevote, I think I can just as well put my own comments as an answer:
First of all, SimpleXml can load URIs directly and it can do so with stream wrappers, so your three calls in the beginning can be shortened to (note that you are not using $file at all)
$merchantProductFeed = new SimpleXMLElement("compress.zlib://$url", null, TRUE);
To get the values you can either use the implicit SimpleXml API and drill down to the wanted elements (like shown multiple times elsewhere on the site):
foreach ($merchantProductFeed->merchant->prod as $prod) {
echo $prod->cat->awCat , PHP_EOL;
}
or you can use an XPath query to get at the wanted elements directly
$xml = new SimpleXMLElement("compress.zlib://$url", null, TRUE);
foreach ($xml->xpath('/merchantProductFeed/merchant/prod/cat/awCat') as $awCat) {
echo $awCat, PHP_EOL;
}
Live Demo
Note that fetching all $awCat elements from the source XML is rather pointless though, because all of them have "Bodycare & Fitness" for value. Of course you can also mix XPath and the implicit API and just fetch the prod elements and then drill down to their various children.
Using XPath should be somewhat faster than iterating over the SimpleXmlElement object graph. Though it should be noted that the difference is negligible (read 0.000x vs 0.000y) for your feed. Still, if you plan to do more XML work, it pays off to familiarize yourself with XPath, because it's quite powerful. Think of it as SQL for XML.
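A small side-by-side sketch of the two approaches, run against a made-up miniature feed. Only the element names are taken from the code above; the second category value is a placeholder.

```php
<?php
// Hypothetical miniature version of the feed structure.
$xmlString = <<<XML
<merchantProductFeed>
  <merchant>
    <prod>
      <cat><awCat>Bodycare &amp; Fitness</awCat></cat>
    </prod>
    <prod>
      <cat><awCat>Placeholder Category</awCat></cat>
    </prod>
  </merchant>
</merchantProductFeed>
XML;

$feed = simplexml_load_string($xmlString);

// 1) Implicit API: walk down the tree element by element.
$viaTraversal = [];
foreach ($feed->merchant->prod as $prod) {
    $viaTraversal[] = (string) $prod->cat->awCat;
}

// 2) XPath: address the wanted elements directly.
$viaXPath = [];
foreach ($feed->xpath('/merchantProductFeed/merchant/prod/cat/awCat') as $awCat) {
    $viaXPath[] = (string) $awCat;
}

echo implode(', ', $viaTraversal), PHP_EOL;
echo implode(', ', $viaXPath), PHP_EOL;
```

Both loops should collect the same values; which one reads better is mostly a matter of how deep the wanted elements sit.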
For additional examples see
A simple program to CRUD node and node values of xml file and
PHP Manual - SimpleXml Basic Examples
Try this...
$url = "http://datafeed.api.productserve.com/datafeed/download/apikey/58bc4442611e03a13eca07d83607f851/cid/97,98,142,144,146,129,595,539,147,149,613,626,135,163,168,159,169,161,167,170,137,171,548,174,183,178,179,175,172,623,139,614,189,194,141,205,198,206,203,208,199,204,201,61,62,72,73,71,74,75,76,77,78,79,63,80,82,64,83,84,85,65,86,87,88,90,89,91,67,92,94,33,54,53,57,58,52,603,60,56,66,128,130,133,212,207,209,210,211,68,69,213,216,217,218,219,220,221,223,70,224,225,226,227,228,229,4,5,10,11,537,13,19,15,14,18,6,551,20,21,22,23,24,25,26,7,30,29,32,619,34,8,35,618,40,38,42,43,9,45,46,651,47,49,50,634,230,231,538,235,550,240,239,241,556,245,244,242,521,576,575,577,579,281,283,554,285,555,303,304,286,282,287,288,173,193,637,639,640,642,643,644,641,650,177,379,648,181,645,384,387,646,598,611,391,393,647,395,631,602,570,600,405,187,411,412,413,414,415,416,649,418,419,420,99,100,101,107,110,111,113,114,115,116,118,121,122,127,581,624,123,594,125,421,604,599,422,530,434,532,428,474,475,476,477,423,608,437,438,440,441,442,444,446,447,607,424,451,448,453,449,452,450,425,455,457,459,460,456,458,426,616,463,464,465,466,467,427,625,597,473,469,617,470,429,430,615,483,484,485,487,488,529,596,431,432,489,490,361,633,362,366,367,368,371,369,363,372,373,374,377,375,536,535,364,378,380,381,365,383,385,386,390,392,394,396,397,399,402,404,406,407,540,542,544,546,547,246,558,247,252,559,255,248,256,265,259,632,260,261,262,557,249,266,267,268,269,612,251,277,250,272,270,271,273,561,560,347,348,354,350,352,349,355,356,357,358,359,360,586,590,592,588,591,589,328,629,330,338,493,635,495,507,563,564,567,569,568/mid/2891/columns/merchant_id,merchant_name,aw_product_id,merchant_product_id,product_name,description,category_id,category_name,merchant_category,aw_deep_link,aw_image_url,search_price,delivery_cost,merchant_deep_link,merchant_image_url/format/xml/compression/gzip/";
$zd = gzopen($url, "r");
$data = gzread($zd, 1000000);
gzclose($zd);
if ($data !== false) {
$xml = simplexml_load_string($data);
foreach ($xml->merchant->prod as $pr) {
echo $pr->cat->awCat . "<br>";
}
}
<?php
$xmlstr = file_get_contents("compress.zlib://$url");
$xml = simplexml_load_string($xmlstr);
// you can traverse the xml tree however you want
foreach ($xml->merchant->prod as $line) {
// $line->cat->awCat -> you can use this
}
more information here
Use print_r($xml) to see the structure of the parsed XML feed.
Then it becomes obvious how you would traverse it:
foreach ($xml->merchant->prod as $prod) {
print $prod->pId;
print $prod->text->name;
print $prod->cat->awCat; # <-- which is what you wanted
print $prod->price->buynow;
}
$url = 'your url here';
$f = gzopen ($url, 'r');
$xml = new SimpleXMLElement (fread ($f, 1000000));
foreach($xml->xpath ('//prod') as $name)
{
echo (string) $name->cat->awCatId, "\r\n";
}
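One thing to watch in the gzopen() variants above: a single read with a fixed length such as 1000000 stops at that many bytes, so a larger feed would be cut off and fail to parse. The sketch below reads the whole stream in two different ways. It assumes the zlib extension is available and builds its own small compressed file in the temp directory, so no network access or real feed URL is involved.

```php
<?php
// Build a small gzip-compressed XML file so the sketch runs offline.
$xmlString = '<merchantProductFeed><merchant><prod><cat>'
    . '<awCat>Bodycare &amp; Fitness</awCat>'
    . '</cat></prod></merchant></merchantProductFeed>';
$path = tempnam(sys_get_temp_dir(), 'feed');
file_put_contents($path, gzencode($xmlString));

// Option 1: let the stream wrapper decompress while reading everything.
$viaWrapper = file_get_contents("compress.zlib://$path");

// Option 2: gzopen() plus a loop, so nothing is cut off at a byte limit.
$handle = gzopen($path, 'r');
$viaLoop = '';
while (!gzeof($handle)) {
    $viaLoop .= gzread($handle, 8192);
}
gzclose($handle);
unlink($path);

$xml = simplexml_load_string($viaLoop);
echo $xml->merchant->prod->cat->awCat, PHP_EOL;
```

For a remote feed the same two patterns apply with the URL in place of $path, provided URL wrappers are enabled in the PHP configuration.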

Not finding elements using getElementsByTagName() using DOMDocument

I'm trying to loop through multiple <LineItemInfo> products contained within a <LineItems> within XML I'm parsing to pull product Ids out and send emails and do other actions for each product.
The problem is that it's not returning anything. I've verified that the XML data is valid and it does contain the necessary components.
$itemListObject = $orderXML->getElementsByTagName('LineItemInfo');
var_dump($itemListObject->length);
var_dump($itemListObject);
The output of the var_dump is:
int(0)
object(DOMNodeList)#22 (0) {
}
This is my first time messing with this and it's taken me a couple of hours but I can't figure it out. Any advice would be awesome.
EDIT:
My XML looks like this... except with a lot more tags than just ProductId
<LineItems>
<LineItemInfo>
<ProductId href='[URL_TO_PRODUCT_XML]'>149593</ProductId>
</LineItemInfo>
<LineItemInfo>
<ProductId href='[URL_TO_PRODUCT_XML]'>149593</ProductId>
</LineItemInfo>
</LineItems>
Executing the following code does NOT get me the ProductId
$itemListObject = $orderXML->getElementsByTagName('LineItemInfo');
foreach ($itemListObject as $element) {
$product = $element->getElementsByTagName('ProductId');
$productId = $product->item(0)->nodeValue;
echo $productId.'-';
}
EDIT #2
As a side note, calling
$element->item(0)->nodeValue
on $element instead of $product caused my script to stop executing without any errors being logged by the server. It's a pain to debug when you have to run a credit card to find out whether it's functioning or not.
DOMDocument stuff can be tricky to get a handle on, because functions such as print_r() and var_dump() don't necessarily perform the same as they would on normal arrays and objects (see this comment in the manual).
You have to use various functions and properties of the document nodes to pull out the data. For instance, if you had the following XML:
<LineItemInfo attr1="hi">This is a line item.</LineItemInfo>
You could output various parts of that using:
$itemListObjects = $orderXML->getElementsByTagName('LineItemInfo');
foreach($itemListObjects as $node) {
echo $node->nodeValue; //echos "This is a line item."
echo $node->attributes->getNamedItem('attr1')->nodeValue; //echos "hi"
}
If you had a nested structure, you can follow basically the same procedure using the childNodes property. For example, if you had this:
<LineItemInfo attr1="hi">
<LineItem>Line 1</LineItem>
<LineItem>Line 2</LineItem>
</LineItemInfo>
You might do something like this:
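$itemListObjects = $orderXML->getElementsByTagName('LineItemInfo');
foreach($itemListObjects as $node) {
    if ($node->hasChildNodes()) {
        foreach($node->childNodes as $c) {
            // childNodes also contains the whitespace text nodes between tags, so keep elements only
            if ($c->nodeType === XML_ELEMENT_NODE) {
                echo $c->nodeValue .",";
            }
        }
    }
}
//you'll get output of "Line 1,Line 2,"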
Hope that helps.
EDIT for specific code and XML
I ran the following code in a test script, and it seemed to work for me. Can you be more specific about what's not working? I used your code exactly, except for the first two lines that create the document. Are you using loadXML() over loadHTML()? Are there any errors?
$orderXML = new DOMDocument();
$orderXML->loadXML("
<LineItems>
<LineItemInfo>
<ProductId href='[URL_TO_PRODUCT_XML]'>149593</ProductId>
</LineItemInfo>
<LineItemInfo>
<ProductId href='[URL_TO_PRODUCT_XML]'>149593</ProductId>
</LineItemInfo>
</LineItems>
");
$itemListObject = $orderXML->getElementsByTagName('LineItemInfo');
foreach ($itemListObject as $element) {
$product = $element->getElementsByTagName('ProductId');
$productId = $product->item(0)->nodeValue;
echo $productId.'-';
}
//outputs "149593-149593-"
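XML element names are case-sensitive, so getElementsByTagName('LineItemInfo') will not match <lineItemInfo>. Many XML formats use lower camel case (i.e. "lineItemInfo" instead of "LineItemInfo"), so check that the case in your code matches the document exactly.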
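A short sketch of that check, using an inline document shaped like the one in the question. The second product id and the example.com href values are placeholders invented for the demonstration.

```php
<?php
$orderXML = new DOMDocument();
// Inline sample; ids and hrefs are hypothetical.
$orderXML->loadXML("
<LineItems>
  <LineItemInfo>
    <ProductId href='http://example.com/p/149593'>149593</ProductId>
  </LineItemInfo>
  <LineItemInfo>
    <ProductId href='http://example.com/p/149595'>149595</ProductId>
  </LineItemInfo>
</LineItems>
");

// Tag names are matched case-sensitively for XML documents.
$exactCase = $orderXML->getElementsByTagName('LineItemInfo')->length;
$wrongCase = $orderXML->getElementsByTagName('lineItemInfo')->length;

$products = [];
foreach ($orderXML->getElementsByTagName('LineItemInfo') as $element) {
    $product = $element->getElementsByTagName('ProductId')->item(0);
    if ($product === null) {
        continue; // this line item has no ProductId
    }
    $products[] = [
        'id'   => trim($product->nodeValue),
        'href' => $product->getAttribute('href'),
    ];
}

echo "exact case: $exactCase, wrong case: $wrongCase", PHP_EOL;
foreach ($products as $p) {
    echo $p['id'], ' -> ', $p['href'], PHP_EOL;
}
```

If the length comes back as 0 for the exact tag name too, the next things to check would be whether the XML was actually loaded into the document and whether the element lives in a namespace.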
