How to parse this OFX file? - php
This is an original OFX file as it comes from my bank (no worries, there's nothing sensitive; I cut out the middle part with all the transactions).
Open Financial Exchange (OFX) is a
data-stream format for exchanging
financial information that evolved
from Microsoft's Open Financial
Connectivity (OFC) and Intuit's Open
Exchange file formats.
Now I need to parse this. I already saw that question, but this is not a duplicate because I am interested in how to do this.
I am sure I could figure out some clever regexps that would do the job, but that is ugly and error-prone (if the format changes, some fields may be missing, the formatting or whitespace may differ, etc.).
OFXHEADER:100
DATA:OFXSGML
VERSION:102
SECURITY:NONE
ENCODING:USASCII
CHARSET:1252
COMPRESSION:NONE
OLDFILEUID:NONE
NEWFILEUID:NONE
<OFX>
<SIGNONMSGSRSV1>
<SONRS>
<STATUS>
<CODE>0
<SEVERITY>INFO
</STATUS>
<DTSERVER>20110420000000[+1:CET]
<LANGUAGE>ENG
</SONRS>
</SIGNONMSGSRSV1>
<BANKMSGSRSV1>
<STMTTRNRS>
<TRNUID>1
<STATUS>
<CODE>0
<SEVERITY>INFO
</STATUS>
<STMTRS>
<CURDEF>EUR
<BANKACCTFROM>
<BANKID>20404
<ACCTID>02608983629
<ACCTTYPE>CHECKING
</BANKACCTFROM>
<BANKTRANLIST>
<DTSTART>20110207
<DTEND>20110419
<STMTTRN>
<TRNTYPE>XFER
<DTPOSTED>20110205000000[+1:CET]
<TRNAMT>-6.12
<FITID>C74BD430D5FF2521
<NAME>unbekannt
<MEMO>BILLA DANKT 1265P K2 05.02.UM 17.49
</STMTTRN>
<STMTTRN>
<TRNTYPE>XFER
<DTPOSTED>20110207000000[+1:CET]
<TRNAMT>-10.00
<FITID>C74BE0F90A657901
<NAME>unbekannt
<MEMO>AUTOMAT 13177 KARTE2 07.02.UM 10:22
</STMTTRN>
............................. goes on like this ........................
<STMTTRN>
<TRNTYPE>XFER
<DTPOSTED>20110418000000[+1:CET]
<TRNAMT>-9.45
<FITID>C7A5071492D14D29
<NAME>unbekannt
<MEMO>HOFER DANKT 0408P K2 18.04.UM 18.47
</STMTTRN>
</BANKTRANLIST>
<LEDGERBAL>
<BALAMT>1992.29
<DTASOF>20110420000000[+1:CET]
</LEDGERBAL>
</STMTRS>
</STMTTRNRS>
</BANKMSGSRSV1>
</OFX>
I currently use this code, which gives me the desired result:
<?php
$files = array();
$files[] = '***_2011001.ofx';
$files[] = '***_2011002.ofx';
$files[] = '***_2011003.ofx';
system('touch file.csv && chmod 777 file.csv');
$fp = fopen('file.csv', 'w');
foreach ($files as $file) {
    echo $file."...\n";
    $content = file_get_contents($file);
    // flatten the file so that each transaction becomes one run of tags
    // (note: this also strips the spaces inside NAME and MEMO values)
    $content = str_replace("\n", "", $content);
    $content = str_replace(" ", "", $content);
    $regex = '|<STMTTRN><TRNTYPE>(.+?)<DTPOSTED>(.+?)<TRNAMT>(.+?)<FITID>(.+?)<NAME>(.+?)<MEMO>(.+?)</STMTTRN>|';
    echo preg_match_all($regex, $content, $matches, PREG_SET_ORDER)." matches... \n";
    foreach ($matches as $match) {
        echo ".";
        array_shift($match); // drop the full match, keep the six captured fields
        fputcsv($fp, $match);
    }
    echo "\n";
}
echo "done.\n";
fclose($fp);
This is really ugly, and if this were a valid XML file I would personally kill myself for that, but how can I do it better?
Your code seems fine, considering that the file isn't XML: OFX 1.x is an SGML-style format (the header says DATA:OFXSGML) in which leaf elements have no closing tags. The only thing you could do is write a more generic SAX-like parser. That is, you go through the input stream one block at a time (where a block can be anything, e.g. a line or a fixed number of characters) and call a callback function every time you encounter an <ELEMENT>. You can even go as far as building a parser class where you can register callback functions that listen for specific elements.
It will be more generic and less "ugly" (for some definition of "ugly"), but it will be more code to maintain. It is nice to do, and nice to have, if you need to parse this file format a lot (or in a lot of different variations). If your posted code is the only place you do this, then just KISS.
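A minimal sketch of such a callback-based parser, in case it helps to see the idea in code. The class name OfxStreamParser and its on()/parse() methods are made up for this illustration (they are not part of any library), and the sketch assumes one tag per line, as in the sample file, and that a tag with nothing after it opens an aggregate.

```php
<?php
// Hypothetical SAX-like reader for OFX 1.x; a sketch, not a library API.
final class OfxStreamParser
{
    private $listeners = array(); // element name => list of callbacks

    // Leaf elements pass their value to the callback; aggregates such as
    // STMTTRN pass the array of leaf fields collected before their closing tag.
    public function on($element, callable $callback)
    {
        $this->listeners[strtoupper($element)][] = $callback;
    }

    public function parse($handle)
    {
        $stack = array(); // one field buffer per currently open aggregate
        while (($line = fgets($handle)) !== false) {
            $line = trim($line);
            if (preg_match('~^</([A-Za-z0-9.]+)>$~', $line, $m)) {
                // closing tag: the innermost aggregate is complete
                $fields = array_pop($stack);
                $this->emit($m[1], $fields === null ? array() : $fields);
            } elseif (preg_match('~^<([A-Za-z0-9.]+)>(.*)$~', $line, $m)) {
                if ($m[2] === '') {
                    $stack[] = array(); // opening tag of an aggregate
                } else {
                    if ($stack) {
                        $stack[count($stack) - 1][$m[1]] = $m[2];
                    }
                    $this->emit($m[1], $m[2]); // leaf element with a value
                }
            }
            // header lines such as "VERSION:102" match neither pattern
        }
    }

    private function emit($element, $payload)
    {
        $key = strtoupper($element);
        if (isset($this->listeners[$key])) {
            foreach ($this->listeners[$key] as $callback) {
                call_user_func($callback, $payload);
            }
        }
    }
}

// Demo on an in-memory excerpt of the sample file.
$handle = fopen('php://memory', 'w+');
fwrite($handle, "OFXHEADER:100\n<OFX>\n<STMTTRN>\n<TRNTYPE>XFER\n<TRNAMT>-6.12\n"
    . "<MEMO>BILLA DANKT 1265P K2 05.02.UM 17.49\n</STMTTRN>\n</OFX>\n");
rewind($handle);

$rows = array();
$parser = new OfxStreamParser();
$parser->on('STMTTRN', function (array $txn) use (&$rows) {
    $rows[] = $txn;
});
$parser->parse($handle);
fclose($handle);
print_r($rows);
```

For a real file you would pass fopen('myfile.ofx', 'r') instead of the in-memory stream. Because the file is read line by line, memory use stays flat however many transactions it contains.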
// Load the data string ($fLoc is the path to the OFX file)
$str = file_get_contents($fLoc);
$MArr = array(); // Final assembled master array
// Fetch all transactions
preg_match_all("/<STMTTRN>(.*)<\/STMTTRN>/msU", $str, $m);
if ( !empty($m[1]) ) {
    $recArr = $m[1];
    unset($str, $m);
    // Parse each transaction record
    foreach ( $recArr as $rec ) {
        $_arr = array();
        // [^>]+ keeps the key from swallowing a ">" that appears in the value
        preg_match_all("/^\s*<(?'key'[^>]+)>(?'val'.*)\s*$/m", $rec, $m);
        foreach ( $m["key"] as $i => $key ) {
            $_arr[$key] = trim($m["val"][$i]); // Reassemble array key => val
        }
        array_push($MArr, $_arr);
    }
}
print_r($MArr);
function close_tags($x)
{
    // append the missing closing tag after every leaf element's value
    return preg_replace('/<([A-Za-z0-9.]+)>([^<\r\n]+)/', '<\1>\2</\1>', $x);
}
$ofx = file_get_contents('myfile.ofx');
$body = '<OFX>'.explode('<OFX>', $ofx)[1]; // strip the header
$xml = close_tags($body); // make valid XML
$reader = new SimpleXMLElement($xml);
foreach ($reader->xpath('//STMTTRN') as $txn): // find and loop through all STMTTRN tags, note the double forward slash
    // get the tag contents by casting as (string) to invoke the SimpleXMLElement::__toString() method
    $trntype = (string)$txn->TRNTYPE;
    $dtposted = (string)$txn->DTPOSTED;
    $trnamt = (string)$txn->TRNAMT;
    $name = (string)$txn->NAME;
    $memo = (string)$txn->MEMO;
endforeach;
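One caveat with the tag-closing approach: a raw & in a value (for example a MEMO such as "H & M") is not valid XML, so SimpleXMLElement would refuse to parse the result. A possible variation is sketched below. The helper name close_and_escape_tags and the sample memo are made up for this example, and it assumes the bank sends a bare & rather than an already escaped &amp;.

```php
<?php
// Hypothetical variant of close_tags() that also escapes "&" in values.
function close_and_escape_tags($sgml)
{
    return preg_replace_callback(
        '/<([A-Za-z0-9.]+)>([^<\r\n]+)/',
        function ($m) {
            // "<" cannot occur in the value (the pattern excludes it),
            // so "&" is the only character that has to be escaped here
            $value = str_replace('&', '&amp;', trim($m[2]));
            return '<' . $m[1] . '>' . $value . '</' . $m[1] . '>';
        },
        $sgml
    );
}

// Made-up excerpt in the same shape as the sample file.
$body = "<OFX>\n<STMTTRN>\n<TRNAMT>-6.12\n<MEMO>H & M DANKT\n</STMTTRN>\n</OFX>";
$reader = new SimpleXMLElement(close_and_escape_tags($body));
$txn = $reader->xpath('//STMTTRN')[0];
$memo = (string) $txn->MEMO;
$amount = (string) $txn->TRNAMT;
echo $amount, ' ', $memo, PHP_EOL;
```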
Related
Replacing characters in a returned Header Location in PHP
I'm trying to filter the returned header location after a POST using the PHP code below. The returned header location is required for further processing of the payment status and for saving the status in the db. The API provider seems unsupportive, since they don't reply on time or fail to reply at all.
$output = curl_exec($curl);
$lines = explode("\n",$output);
$out = array();
$headers = true;
foreach ($lines as $l){
    $l = trim($l);
    if ($headers && !empty($l)){
        if (strpos($l,'location') !== false){
            $p = explode(' ',$l);
            $out['Headers']['location'] = trim($p[1]);
            $url = json_encode($out['Headers']['location']);
            echo json_encode($out['Headers']['location']);
        }
    }
}
The echo output is as below:
"https:\/\/sandbox.kopokopo.com\/api\/v1\/payments\/c122c1d2-8e07-48d3-8c9d-597829447fda"
How do I make the output a valid URL without the "\"? I'd really appreciate your assistance.
Your json_encode call is causing the problem, by escaping the / values. Demo: https://3v4l.org/S3VWD There's no need to encode a single string like this. JSON is mainly useful when you have a more complex set of information (e.g. multiple separate data items) that you want to output in a structured way. echo $out['Headers']['location']; is all you need in this case.
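To illustrate the difference, here is a small sketch using the URL from the question. It is only a demonstration of json_encode's behaviour, not the asker's full script: by default json_encode escapes every / as \/, the JSON_UNESCAPED_SLASHES flag (PHP 5.4+) turns that off, and json_decode restores the original string either way.

```php
<?php
$location = 'https://sandbox.kopokopo.com/api/v1/payments/c122c1d2-8e07-48d3-8c9d-597829447fda';

$escaped   = json_encode($location);                         // slashes become \/
$unescaped = json_encode($location, JSON_UNESCAPED_SLASHES); // slashes left alone
$roundTrip = json_decode($escaped);                          // back to the plain URL

echo $escaped, PHP_EOL;
echo $unescaped, PHP_EOL;
echo $location, PHP_EOL; // the plain string is all that is needed for a redirect
```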
PHP foreach statement issue with accessing XML data
First, I am pretty clueless with PHP, so be kind! I am working on a site for an SPCA (I'm a vet and a part-time geek). The PHP accesses an XML file from a portal used to administer the shelter and store images and info. The script writes that XML data to JSON, and then I use the JSON data in a Handlebars template, etc. I am having a problem getting some data from the XML file to print out to JSON. The XML file is like this:
<DataFeedAnimal>
<AdditionalPhotoUrls>
<string>doc_73737.jpg</string>
<string>doc_74483.jpg</string>
<string>doc_74484.jpg</string>
</AdditionalPhotoUrls>
<PrimaryPhotoUrl>19427.jpg</PrimaryPhotoUrl>
<Sex>Male</Sex>
<Type>Cat</Type>
<YouTubeVideoUrls>
<string>http://www.youtube.com/watch?v=6EMT2s4n6Xc</string>
</YouTubeVideoUrls>
</DataFeedAnimal>
In the PHP file, written by a friend, the code below (just part of it) accesses that XML data and writes it to JSON:
<?php
$url = "http://eastbayspcapets.shelterbuddy.com/DataFeeds/AnimalsForAdoption.aspx";
if ($_GET["type"] == "found") {
    $url = "http://eastbayspcapets.shelterbuddy.com/DataFeeds/foundanimals.aspx";
} else if ($_GET["type"] == "lost") {
    $url = "http://eastbayspcapets.shelterbuddy.com/DataFeeds/lostanimals.aspx";
}
$response_xml_data = file_get_contents($url);
$xml = simplexml_load_string($response_xml_data);
$data = array();
foreach($xml->DataFeedAnimal as $animal) {
    $item = array();
    $item['sex'] = (string)$animal->Sex;
    $item['photo'] = (string)$animal->PrimaryPhotoUrl;
    $item['videos'][] = (string)$animal->YouTubeVideoUrls;
    $item['photos'][] = (string)$animal->PrimaryPhotoUrl;
    foreach($animal->AdditionalPhotoUrls->string as $photo) {
        $item['photos'][] = (string)$photo;
    }
    $item['videos'] = array();
    $data[] = $item;
}
echo file_put_contents('../adopt.json', json_encode($data));
echo json_encode($data);
?>
The JSON output works well, but I am unable to get 'videos' to write out to the JSON file as the 'photos' do. I just get '/n'!
Since the friend who helped with this is no longer around, I am stuck. I have tried similar code to the foreach statement for photos but am getting nowhere. Any help would be appreciated and the pets would appreciate it as well!
The trick with such implementations is to always look at what you have got by dumping data structures to a log file or the command line, and then to look at the documentation for the data you see. That way you know exactly what data you are working with and how to work with it ;-)
Here it turns out that the video URLs you are interested in are placed inside an object of type SimpleXMLElement with public properties, which is not really surprising if you look at the XML structure. The documentation of class SimpleXMLElement shows the method children(), which iterates through all children. Just what we are looking for...
That means a clean implementation to access those sets should go along these lines:
foreach($animal->AdditionalPhotoUrls->children() as $photo) {
    $item['photos'][] = (string)$photo;
}
foreach($animal->YouTubeVideoUrls->children() as $video) {
    $item['videos'][] = (string)$video;
}
Take a look at this full and working example:
<?php
$response_xml_data = <<<EOT
<DataFeedAnimal>
<AdditionalPhotoUrls>
<string>doc_73737.jpg</string>
<string>doc_74483.jpg</string>
<string>doc_74484.jpg</string>
</AdditionalPhotoUrls>
<PrimaryPhotoUrl>19427.jpg</PrimaryPhotoUrl>
<Sex>Male</Sex>
<Type>Cat</Type>
<YouTubeVideoUrls>
<string>http://www.youtube.com/watch?v=6EMT2s4n6Xc</string>
<string>http://www.youtube.com/watch?v=hgfg83mKFnd</string>
</YouTubeVideoUrls>
</DataFeedAnimal>
EOT;
$animal = simplexml_load_string($response_xml_data);
$item = [];
$item['sex'] = (string)$animal->Sex;
$item['photo'] = (string)$animal->PrimaryPhotoUrl;
$item['photos'][] = (string)$animal->PrimaryPhotoUrl;
foreach($animal->AdditionalPhotoUrls->children() as $photo) {
    $item['photos'][] = (string)$photo;
}
$item['videos'] = [];
foreach($animal->YouTubeVideoUrls->children() as $video) {
    $item['videos'][] = (string)$video;
}
echo json_encode($item);
The output of this is:
{
"sex":"Male",
"photo":"19427.jpg",
"photos":["19427.jpg","doc_73737.jpg","doc_74483.jpg","doc_74484.jpg"],
"videos":["http:\/\/www.youtube.com\/watch?v=6EMT2s4n6Xc","http:\/\/www.youtube.com\/watch?v=hgfg83mKFnd"]
}
I would however like to add a short hint: in my eyes it is questionable to convert such structured information into an associative array. Why? Why not a simple json_encode($animal)? The structure is perfectly fine and should be easy to work with! The output of that would be:
{
"AdditionalPhotoUrls":{
"string":[ "doc_73737.jpg", "doc_74483.jpg", "doc_74484.jpg" ]
},
"PrimaryPhotoUrl":"19427.jpg",
"Sex":"Male",
"Type":"Cat",
"YouTubeVideoUrls":{
"string":[ "http:\/\/www.youtube.com\/watch?v=6EMT2s4n6Xc", "http:\/\/www.youtube.com\/watch?v=hgfg83mKFnd" ]
}
}
That structure describes objects (items with an inner structure, enclosed in JSON by {...}), not just arbitrary arrays (sets without a structure, enclosed in JSON by [...]). Arrays are only used for the two unstructured sets of strings in there: photos and videos. This is much more logical, once you think about it...
Assuming the XML data and the JSON data are intended to have the same structure, I would take a look at this: PHP convert XML to JSON. You may not need for loops at all.
How can I break out segments of a 'PHP' file into raw PHP, -and- possible raw HTML -in order-
So I've got a concept of how to do this, but actually implementing it is a bit of a stumper for me, mostly due to my lack of regex experience. Let's get into it. I'd like to 'parse' a 'PHP' file that could contain something like the following:
<?php
function Something() { }
?>
<html>
<body>
<? Something(); ?>
</body>
</html>
<?php // Some more code or something ?>
If interpreted exactly, the above is worthless gibberish, but it is a good example of what I'd like to be able to parse or interpret...
The idea is that I would read the contents of the above file and break it out into an ordered array of its respective pieces, while tracking what 'type' each 'segment' is, so that I can either simply echo it or run an 'eval()' on it. Effectively, I'd like to end up with an array something like this:
$FileSegments = array();
$FileSegments[0]['type'] = "PHP";
$FileSegments[0]['content'] = " function Something() { }";
$FileSegments[1]['type'] = "HTML";
$FileSegments[1]['content'] = " <html> <body>";
$FileSegments[2]['type'] = "PHP";
$FileSegments[2]['content'] = "Something();";
And so on...
The initial idea was to simply 'include()' or 'require()' the file in question and grab its output from the output buffer, but it dawned on me that I would like to be able to inject some 'top level' variables into each one of these files before evaluating the code. To do this, I would have to 'eval()' my injected code together with the contents of the file after said injection. But in order to do this with the ability to handle raw HTML in the file too, I would basically have to write a temporary clone of the whole file that just had my injected code written before the actual contents... Cumbersome, and slow. I hope you're all following here... If not, I can clarify...
The only other piece I feel I should note before finalizing this question is that I would like to retain any variables or symbols in general (for instance the 'Something()' function) created in segments 0 and 2, for instance, and pass them down to segment '4'... I feel like this might be achievable using the extract method and then manually writing in those pieces of data before my next segment executes, but again I'm shooting a little in the dark on that. If anyone has a better approach, or can give me some brief code for just extracting these 'segments' out of a file, I would be ecstatic. Cheers.
ETA: It dawns on me that I can probably pose this question a little more simply: if there isn't a 'simple' way to do the above, is there a way to handle a string in the exact same way that 'require()' and 'include()' handle a file?
<?php
$str = file_get_contents('filename.php');
// get values from starting characters
$php_full = array_filter(explode('<?php', $str));
$php = array_filter(explode('<?', $str));
$html = array_filter(explode('?>', $str));
// remove values after last expected characters
$php_full_result = array();
foreach ($php_full as $key => $value) {
    $php_full_result[] = substr($value, 0, strpos($value, '?>'));
}
$php_result = array();
foreach ($php as $key => $value) {
    if( strpos($value,'php') !== 0 ) {
        $php_result[] = substr($value, 0, strpos($value, '?>'));
    }
}
$html_result = array();
$html_result[] = substr($str, 0, strpos($str, '<?'));
foreach ($html as $key => $value) {
    $html_result[] = substr($value, 0, strpos($value, '<?'));
}
$html_result = array_filter($html_result);
echo '<pre>'; print_r($php_full_result); echo '</pre>';
echo '<pre>'; print_r($php_result); echo '</pre>';
echo '<pre>'; var_dump($html_result); echo '</pre>';
?>
This will give you three arrays of the file segments you want. It is not the exact format you asked for, but you can easily adapt these arrays to your needs. For the "I'd like to break all of my '$GLOBALS' variables out into their 'simple' names" part, you can use extract, like extract($GLOBALS);
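If the segments are needed in their original order, PHP's own tokenizer may be a simpler route than explode(), because it also copes with a closing tag that appears inside a string literal. The following is only a sketch: split_php_segments is a made-up helper name, and the sample uses full opening tags throughout, since the short form is only recognised when short_open_tag is enabled.

```php
<?php
// Hypothetical helper: split source text into ordered PHP / HTML segments.
function split_php_segments($source)
{
    $segments = array();
    $php = null; // buffer for the PHP block being read, null while in HTML
    foreach (token_get_all($source) as $token) {
        $id   = is_array($token) ? $token[0] : null;
        $text = is_array($token) ? $token[1] : $token;
        if ($id === T_INLINE_HTML) {
            $segments[] = array('type' => 'HTML', 'content' => $text);
        } elseif ($id === T_OPEN_TAG) {
            $php = '';
        } elseif ($id === T_OPEN_TAG_WITH_ECHO) {
            $php = 'echo '; // the short echo tag is shorthand for echo
        } elseif ($id === T_CLOSE_TAG) {
            // a closing tag implies a semicolon, so append one before any eval()
            $segments[] = array('type' => 'PHP', 'content' => trim($php));
            $php = null;
        } elseif ($php !== null) {
            $php .= $text;
        }
    }
    if ($php !== null) { // the source ended while still inside a PHP block
        $segments[] = array('type' => 'PHP', 'content' => trim($php));
    }
    return $segments;
}

$source = "<?php function Something() { } ?>\n<html>\n<body>\n"
        . "<?php Something(); ?>\n</body>\n</html>\n";
$segments = split_php_segments($source);
print_r($segments);
```

The result is a single list in file order, each entry carrying 'type' and 'content', which is close to the $FileSegments layout described in the question.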
Breaking foreach at certain string/ Reading through text file and generating XML
I don't know if this is the right way to go about it, but right now I am dealing with a very large text file of membership details. It is really inconsistent, but it typically conforms to this format:
Name School Department Address Phone Email &&^ (indicating the end of the individual record)
What I want to do with this information is read through it and then format it into XML. So right now I have a foreach reading through the long file like this:
<?php
$textline = file("asrlist.txt");
foreach($textline as $showline){
    echo $showline . "<br>";
}
?>
And that's where I don't know how to continue. Can anybody give me some hints on how I could organize these records into XML?
Here is a straightforward solution using SimpleXML:
$text = file_get_contents('asrlist.txt'); // read the whole file as one string
$members = explode('&&^', $text); // building array $members
$xml = new SimpleXMLElement('<?xml version="1.0" encoding="UTF-8"?><members></members>');
$fieldnames = array('name','school','department','address','phone','email');
// set $fieldsep to the character(s) that separate fields from each other in your text file
$fieldsep = "\n"; // a wild guess...
foreach ($members as $member) {
    $m = explode($fieldsep, trim($member)); // build array $m; $m[0] would contain "name" etc.
    $xmlmember = $xml->addChild('member');
    foreach ($m as $key => $data) $xmlmember->addChild($fieldnames[$key], $data);
} // foreach $members
$xml->asXML('mymembers.xml');
For reading and parsing the text file, CSV-related functions could be a good alternative, as mentioned by other users.
To read big files you can use fgetcsv
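The same streaming idea also works without CSV functions. The sketch below swaps fgetcsv for plain fgets, because the records in the question end with a &&^ marker rather than being comma-separated. It assumes one field per line and the marker on a line of its own, which the question does not confirm. The function name read_records and the sample names are made up.

```php
<?php
// Hypothetical generator: yields one record (a list of lines) at a time,
// so the whole file never has to be held in memory.
function read_records($handle, $terminator = '&&^')
{
    $record = array();
    while (($line = fgets($handle)) !== false) {
        $line = trim($line);
        if ($line === $terminator) {
            if ($record) {
                yield $record;
            }
            $record = array();
        } elseif ($line !== '') {
            $record[] = $line;
        }
    }
    if ($record) {
        yield $record; // last record without a trailing terminator
    }
}

// Demo with made-up data in an in-memory stream.
$handle = fopen('php://memory', 'w+');
fwrite($handle, "Jane Doe\nSome School\nPhysics\n&&^\nJohn Roe\nOther School\n&&^\n");
rewind($handle);
$records = iterator_to_array(read_records($handle), false);
fclose($handle);
print_r($records);
```

Each yielded record could then be handed to SimpleXMLElement::addChild() or written out with XMLWriter, one member at a time.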
If && works as a delimiter for records in that file, you could start by replacing it with </member><member>. Prepend <member> to the whole file and append </member> at the end. You will then have something XML-like.
How to replace? You might find Unix tools like sed useful:
sed 's/&&/\<\/member\>\<member\>/' <input.txt >output.xml
You can also accomplish it with PHP, using str_replace():
foreach($textline as $showline){
    echo str_replace( '&&', '</member><member>', $showline ) . "<br>";
}
Parsing XML with PHP (simplexml)
Firstly, may I point out that I am a newcomer to all things PHP, so apologies if anything here is unclear, and I'm afraid the more layman the response the better. I've been having real trouble parsing an XML file into PHP to then populate an HTML table for my website. At the moment, I have been able to get the full XML feed into a string which I can then echo and view, and all seems well. I then thought I would be able to use SimpleXML to pick out specific elements and print their content, but have been unable to do this.
The XML feed will be constantly changing (structure remaining the same) and is in compressed format. From various sources I've identified the following commands to get my feed into the right format within a string, although I am still unable to print specific elements. I've tried every combination without any luck and suspect I may be barking up the wrong tree. Could someone please point me in the right direction?!
$file = fopen("compress.zlib://$url", 'r');
$xmlstr = file_get_contents($url);
$xml = new SimpleXMLElement($url,null,true);
foreach($xml as $name) {
    echo "{$name->awCat}\r\n";
}
Many, many thanks in advance, Chris
PS The actual feed
Since no one followed my close vote, I think I can just as well put my own comments as an answer:
First of all, SimpleXml can load URIs directly, and it can do so with stream wrappers, so your three calls in the beginning can be shortened to (note that you are not using $file at all)
$merchantProductFeed = new SimpleXMLElement("compress.zlib://$url", null, TRUE);
To get the values, you can either use the implicit SimpleXml API and drill down to the wanted elements (as shown multiple times elsewhere on the site):
foreach ($merchantProductFeed->merchant->prod as $prod) {
    echo $prod->cat->awCat , PHP_EOL;
}
or you can use an XPath query to get at the wanted elements directly:
$xml = new SimpleXMLElement("compress.zlib://$url", null, TRUE);
foreach ($xml->xpath('/merchantProductFeed/merchant/prod/cat/awCat') as $awCat) {
    echo $awCat, PHP_EOL;
}
Live Demo
Note that fetching all awCat elements from the source XML is rather pointless though, because all of them have "Bodycare & Fitness" for value. Of course you can also mix XPath and the implicit API and just fetch the prod elements and then drill down to their various children.
Using XPath should be somewhat faster than iterating over the SimpleXmlElement object graph, though it should be noted that the difference is negligible (read 0.000x vs 0.000y) for your feed. Still, if you plan to do more XML work, it pays off to familiarize yourself with XPath, because it's quite powerful. Think of it as SQL for XML.
For additional examples see A simple program to CRUD node and node values of xml file and PHP Manual - SimpleXml Basic Examples
Try this... $url = "http://datafeed.api.productserve.com/datafeed/download/apikey/58bc4442611e03a13eca07d83607f851/cid/97,98,142,144,146,129,595,539,147,149,613,626,135,163,168,159,169,161,167,170,137,171,548,174,183,178,179,175,172,623,139,614,189,194,141,205,198,206,203,208,199,204,201,61,62,72,73,71,74,75,76,77,78,79,63,80,82,64,83,84,85,65,86,87,88,90,89,91,67,92,94,33,54,53,57,58,52,603,60,56,66,128,130,133,212,207,209,210,211,68,69,213,216,217,218,219,220,221,223,70,224,225,226,227,228,229,4,5,10,11,537,13,19,15,14,18,6,551,20,21,22,23,24,25,26,7,30,29,32,619,34,8,35,618,40,38,42,43,9,45,46,651,47,49,50,634,230,231,538,235,550,240,239,241,556,245,244,242,521,576,575,577,579,281,283,554,285,555,303,304,286,282,287,288,173,193,637,639,640,642,643,644,641,650,177,379,648,181,645,384,387,646,598,611,391,393,647,395,631,602,570,600,405,187,411,412,413,414,415,416,649,418,419,420,99,100,101,107,110,111,113,114,115,116,118,121,122,127,581,624,123,594,125,421,604,599,422,530,434,532,428,474,475,476,477,423,608,437,438,440,441,442,444,446,447,607,424,451,448,453,449,452,450,425,455,457,459,460,456,458,426,616,463,464,465,466,467,427,625,597,473,469,617,470,429,430,615,483,484,485,487,488,529,596,431,432,489,490,361,633,362,366,367,368,371,369,363,372,373,374,377,375,536,535,364,378,380,381,365,383,385,386,390,392,394,396,397,399,402,404,406,407,540,542,544,546,547,246,558,247,252,559,255,248,256,265,259,632,260,261,262,557,249,266,267,268,269,612,251,277,250,272,270,271,273,561,560,347,348,354,350,352,349,355,356,357,358,359,360,586,590,592,588,591,589,328,629,330,338,493,635,495,507,563,564,567,569,568/mid/2891/columns/merchant_id,merchant_name,aw_product_id,merchant_product_id,product_name,description,category_id,category_name,merchant_category,aw_deep_link,aw_image_url,search_price,delivery_cost,merchant_deep_link,merchant_image_url/format/xml/compression/gzip/"; $zd = gzopen($url, "r"); $data = gzread($zd, 1000000); gzclose($zd); if ($data !== false) { $xml = 
simplexml_load_string($data); foreach ($xml->merchant->prod as $pr) { echo $pr->cat->awCat . "<br>"; } }
<?php
$xmlstr = file_get_contents("compress.zlib://$url");
$xml = simplexml_load_string($xmlstr);
// you can traverse the xml tree however you want
foreach ($xml->merchant->prod as $line) {
    // $line->cat->awCat -> you can use this
}
more information here
Use print_r($xml) to see the structure of the parsed XML feed. Then it becomes obvious how you would traverse it:
foreach ($xml->merchant->prod as $prod) {
    print $prod->pId;
    print $prod->text->name;
    print $prod->cat->awCat; # <-- which is what you wanted
    print $prod->price->buynow;
}
$url = 'your url here';
$f = gzopen($url, 'r');
// note: this reads at most 1,000,000 uncompressed bytes of the feed
$xml = new SimpleXMLElement(fread($f, 1000000));
gzclose($f);
foreach ($xml->xpath('//prod') as $name) {
    echo (string) $name->cat->awCatId, "\r\n";
}
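A note on the fixed read size used in some of the snippets above: gzread() and fread() with a length of 1000000 stop after that many uncompressed bytes, so a larger feed would be cut off and fail to parse. Reading through the compress.zlib:// wrapper with file_get_contents() pulls in the whole stream instead. The sketch below demonstrates this on a temporary gzip file. The tiny feed is made up and only mirrors the element names used in the answers.

```php
<?php
// Build a small gzip-compressed feed in a temporary file (made-up content).
$path = tempnam(sys_get_temp_dir(), 'feed');
file_put_contents($path, gzencode(
    '<merchantProductFeed><merchant>'
    . '<prod><cat><awCat>Bodycare &amp; Fitness</awCat></cat></prod>'
    . '</merchant></merchantProductFeed>'
));

// Read and decompress the whole stream, with no fixed byte limit.
$xmlstr = file_get_contents("compress.zlib://$path");
$xml = simplexml_load_string($xmlstr);

$categories = array();
foreach ($xml->merchant->prod as $prod) {
    $categories[] = (string) $prod->cat->awCat;
}
unlink($path);
print_r($categories);
```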