I'm using a class to load the HTML of a page and extract the content of an HTML table. The page is encoded in windows-1250, so I use iconv to convert it to UTF-8.
All of this is done in one class, which I call like this: $tableHtml = suplUpdater::getTableHtml(someParams,...);. When I echo that variable directly, everything looks fine. However, I want to parse the table rows with PHP's DOMDocument to save them to a database. The code looks like this:
$tableData = suplUpdater::getTableHtml(1400450400);
//echo($tableData);
$document = new DOMDocument();
$document->loadHTML($tableData);
$rows = $document->getElementsByTagName('tr');
$rows->item(0)->parentNode->removeChild($rows->item(0));//first row is just a header
$output = array();
foreach ($rows as $row) {
$currentOutput = array();
foreach ($row->childNodes as $cell) {
if ($cell->nodeType == 1) {
$currentOutput[] = $cell->nodeValue;
}
}
$output[] = $currentOutput;
}
When I do var_dump($output);, I get an array, but the strings in it have broken encoding. Where could the problem be? If needed, I can provide the source table data.
EDIT:
When I copy the table HTML to a txt file encoded in UTF-8 and load it with file_get_contents('tableHtml.txt'), I get the same result.
EDIT:
I have uploaded sample data here: http://anagmate.moxo.cz/data.txt
EDIT:
A screenshot of the echo and var_dump output is here: http://anagmate.moxo.cz/supl.png
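One possible cause, offered as a guess since the sample data is not reproduced here: DOMDocument::loadHTML() generally treats markup without a charset declaration as ISO-8859-1, so UTF-8 bytes get misread. Below is a minimal sketch of a common workaround, assuming the string is already valid UTF-8. The $tableData literal is a hypothetical stand-in for what suplUpdater::getTableHtml() returns.

```php
<?php
// Hypothetical stand-in for the UTF-8 string returned by suplUpdater::getTableHtml().
$tableData = '<table><tr><th>Jméno</th></tr><tr><td>Příliš žluťoučký kůň</td></tr></table>';

$document = new DOMDocument();
// Declare the encoding up front so the parser does not fall back to ISO-8859-1.
$document->loadHTML(
    '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">' . $tableData
);

$cells = $document->getElementsByTagName('td');
echo $cells->item(0)->nodeValue, "\n";
```

If the text comes out intact with the declaration and garbled without it, the missing charset hint is the likely culprit.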
I'm trying to convert some XML files I have to CSV using PHP SimpleXML class. However, I'm unable to achieve the result I want, because one parent could have several child elements with the same name. My current XML file is as follows:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<club>
<name>Green Riders</name>
<membership>Free</membership>
<boardMember>
<name>James F.</name>
<position>CEO</position>
</boardMember>
<boardMember>
<name>Helen D.</name>
<position>Associate Director</position>
</boardMember>
</club>
<club>
<name>Broken Dice</name>
<membership>Paid</membership>
<boardMember>
<name>Patrick B.</name>
<position>CEO</position>
</boardMember>
</club>
</root>
The CSV output I was hoping to achieve is as such:
club,name,membership,boardMember>Name,boardMember>position
Green Riders,Free,James F.,CEO
Green Riders,Free,Helen D., Associate Director
Broken Dice,Paid,Patrick B., CEO
Is there any way to achieve this without hard-coding the element names into the script (i.e. make it work on any generic XML file)?
I'm really hoping this is possible, given that I'll have more than 25 XML variants, so it would be really inefficient to write a dedicated script for each.
Thanks!
Since every child node's data needs to become a row in the CSV together with its parent's data, you can first capture and store the parent (club) data, then traverse the children and print their data with the parent's data preceding it.
Please check the following code:
$xml = simplexml_load_file("your_xml_file.xml") or die("Error: Cannot create object");
$csv_delimeter = ",";
$csv_new_line = "\n";
foreach($xml->children() as $n) {
$club_data = array();
$club_data[] = $n->name;
$club_data[] = $n->membership;
if (isset($n->boardMember)) {
foreach ($n->boardMember as $boardMember) {
$boardMember_data = $club_data;
$boardMember_data[] = $boardMember->name;
$boardMember_data[] = $boardMember->position;
echo implode($csv_delimeter, $boardMember_data).$csv_new_line;
}
}
else {
echo implode($csv_delimeter, $club_data).$csv_new_line;
}
}
After testing with the example xml data, it generated the following type of output:
Green Riders,Free,James F.,CEO
Green Riders,Free,Helen D., Associate Director
Broken Dice,Paid,Patrick B., CEO
You can set different values based on your scenario for:
$csv_delimeter = ",";
$csv_new_line = "\n";
There are no strict rules for CSV output: the delimiter can be ",", ";" or "|", and the new line can also be "\r\n".
The code prints CSV rows one by one on the fly. If you need to save the CSV data to a file, then unless the XML data is large, a better approach than writing rows one by one is to build the entire array first and write it in one go (disk access is costly). You will find plenty of simple PHP array-to-CSV examples on the net.
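As a rough sketch of the file-based variant mentioned above, the rows can be collected in an array and then written with PHP's built-in fputcsv(). The $rows literal and the output path are assumptions for illustration; in practice the array would be filled inside the loops shown earlier.

```php
<?php
// Hypothetical rows, standing in for data collected while looping over the XML.
$rows = [
    ['Green Riders', 'Free', 'James F.', 'CEO'],
    ['Green Riders', 'Free', 'Helen D.', 'Associate Director'],
    ['Broken Dice', 'Paid', 'Patrick B.', 'CEO'],
];

// Assumed output location; adjust to your environment.
$path = sys_get_temp_dir() . '/clubs.csv';

$handle = fopen($path, 'w');
foreach ($rows as $row) {
    // fputcsv() takes care of quoting fields that contain spaces, commas or quotes.
    fputcsv($handle, $row, ',', '"', '\\');
}
fclose($handle);
```

Note that fputcsv() quotes fields containing spaces, so the file will not look exactly like the hand-built output above, but it parses back to the same values.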
It is not really possible. XML is a nested structure, and flattening it to rows loses information. You can define some default mapping for XML structures, but that gets really complex really fast. So it is far easier (and less time consuming) to define the mapping by hand.
A Reusable Conversion
function readXMLAsRecords(string $xml, array $map) {
// load the xml
$document = new DOMDocument();
$document->loadXml($xml);
$xpath = new DOMXpath($document);
// iterate the elements defining the rows
foreach ($xpath->evaluate($map['row']) as $row) {
$line = [];
// get the field values from the current $row
foreach ($map['columns'] as $name => $expression) {
$line[$name] = $xpath->evaluate($expression, $row);
}
// return a line
yield $line;
}
}
The Mapping
With DOMXpath::evaluate() Xpath expressions can return strings. So we need one expression that returns the boardMember nodes and a list of expressions for the fields.
$map = [
'row' => '/root/club/boardMember',
'columns' => [
'club_name' => 'string(parent::club/name)',
'club_membership' => 'string(parent::club/membership)',
'board_member_name' => 'string(name)',
'board_member_position' => 'string(position)'
]
];
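To make the difference between the two kinds of expressions concrete, here is a small sketch of what DOMXPath::evaluate() returns for a location path versus a string() call. The inline XML is a shortened, assumed version of the sample document.

```php
<?php
$document = new DOMDocument();
$document->loadXML(
    '<root><club><name>Green Riders</name>'
    . '<boardMember><name>James F.</name></boardMember></club></root>'
);
$xpath = new DOMXPath($document);

// A location path yields a DOMNodeList of matching elements.
$members = $xpath->evaluate('/root/club/boardMember');

// string() casts the first matching node to a plain PHP string,
// evaluated relative to the context node passed as second argument.
$clubName = $xpath->evaluate('string(parent::club/name)', $members->item(0));

var_dump($members instanceof DOMNodeList);
var_dump($clubName);
```

This is why the 'row' expression is a plain path while every column expression is wrapped in string().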
To CSV
readXMLAsRecords() returns a generator, so you can use foreach on it:
$csv = fopen('php://stdout', 'w');
fputcsv($csv, array_keys($map['columns']));
foreach (readXMLAsRecords($xml, $map) as $record) {
fputcsv($csv, $record);
}
Output:
club_name,club_membership,board_member_name,board_member_position
"Green Riders",Free,"James F.",CEO
"Green Riders",Free,"Helen D.","Associate Director"
"Broken Dice",Paid,"Patrick B.",CEO
I'm pulling data from my MySQL database and using it to create XML files. The idea is that each row of data pulled is written to its own XML file.
Here's my sample code:
//sql statement here
$results = $fetch_data->fetchAll(PDO::FETCH_ASSOC);
foreach($results as $data)
{
$invoice_no = $data["invoice_no"];
$doc = new DOMDocument();
$doc->formatOutput = true;
$Order = $doc->appendChild($doc->createElement('Order'));
$OrderHead = $Order->appendChild($doc->createElement('OrderHead'));
$Schema = $doc->createElement('Schema');
$Schema->appendChild($doc->createElement('Version', '3.05'));
$OrderHead->appendChild($Schema);
$CrossReference = $doc->createElement('CrossReference', $invoice_no);
$OrderHead->appendChild($CrossReference);
//more xml code...
$doc->save($invoice_no.".xml", LIBXML_NOEMPTYTAG);
}
As shown above, I'm looping through the data from the database to create the XML, and I want to save each file prefixed by its $invoice_no. But only one XML file is created; more specifically, only the last record pulled from the database ends up in an XML file. This happens even though I'm creating the file inside my loop.
Where could I be going wrong?
I want to know if it's possible to get a cell by its name in an xls document. I have this information in an Excel file:
Normally, I get the value of the cell containing "ASUS" like this:
$objPHPExcel->getActiveSheet()->getCell('B3')->getValue();
My problem is that users send me this Excel file, and sometimes the rows are out of order: the value in B3 sometimes appears in a different row, such as B6, B7 or B5. How can I get the "ASUS" cell by looking it up through its label cell "Modelo"?
There is nothing built into PHPExcel to do a search, but the following example should work for your problem.
$foundInCells = array();
$searchTerm = 'ASUS';
foreach ($objPHPExcel->getWorksheetIterator() as $CurrentWorksheet) {
$ws = $CurrentWorksheet->getTitle();
foreach ($CurrentWorksheet->getRowIterator() as $row) {
$cellIterator = $row->getCellIterator();
$cellIterator->setIterateOnlyExistingCells(true);
foreach ($cellIterator as $cell) {
if ($cell->getValue() == $searchTerm) {
$foundInCells[] = $ws . '!' . $cell->getCoordinate();
}
}
}
}
var_dump($foundInCells);
You'd have to loop through the entire file to find it, just like searching for a value in a 2D array.
Take a look at this answer
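The 2D-array idea can be sketched without PHPExcel at all. The example below assumes the sheet has already been read into a plain array and that the value sits in the cell directly to the right of its label; both the layout and the helper name findValueByLabel() are hypothetical, since the original spreadsheet is not shown.

```php
<?php
// Hypothetical sheet contents: row number => column letter => cell value.
$sheet = [
    5 => ['A' => 'Marca',  'B' => 'Example Brand'],
    6 => ['A' => 'Modelo', 'B' => 'ASUS'],
    7 => ['A' => 'Serie',  'B' => 'X123'],
];

/**
 * Returns the value in the cell to the right of the first cell
 * whose content equals $label, or null if the label is not found.
 */
function findValueByLabel(array $sheet, string $label): ?string
{
    foreach ($sheet as $row) {
        $cells = array_values($row);
        foreach ($cells as $i => $value) {
            if ($value === $label && isset($cells[$i + 1])) {
                return (string) $cells[$i + 1];
            }
        }
    }
    return null;
}

echo findValueByLabel($sheet, 'Modelo'), "\n";
```

Because the lookup goes through the label, it keeps working when the row moves from B3 to B5, B6 or B7.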
Hello I have the following xml results that are returned from a remote site
<ResultSet totalResultsAvailable="1">
<Product orderNo="5321" partNo="A2345" truckable="1">
<Manufacturer id="22">WIDGET 4 U</Manufacturer>
<Model id="356">ACME 500</Model>
<Years>95-98</Years>
<ProductType id="23" categoryID="4">Cool Red Widgest</ProductType>
<Material id="6">shiny stuff</Material>
<PartNo>A2345</PartNo>
<Code/>
</Product>
</ResultSet>
I am simply trying to pull the XML results and write them to a new CSV file, but I get this warning:
Warning: Invalid argument supplied for foreach() in /home/myServer/public_html/xmlParser2.php on line 14
Here is my code:
<?
echo 'Write XML to CSV';
$basenameLong ='http://thisIsTheURLto.com/myFeed/?key=123456789&mode=getProducts;
$fileNameCSV = 'xmlParseContent.csv';
$feedContent = '';
echo '<br/>Starting......';
$feedContent = file_get_contents($basenameLong);
$fh = fopen($fileNameCSV, 'w+'); //create new CSV file if not exists else append
foreach($feedContent->ResultSet->Product as $product) {
fputcsv($f, get_object_vars($product),',','"');
}
fclose($fh);
?>
I know this code is very elementary, but can you help me find the issue? I am a novice and I don't see it.
This line is wrong :
fputcsv($f, get_object_vars($product),',','"');
if you want to put blank values, try doing this :
fputcsv($f, get_object_vars($product),'','','');
Your problem is that you never parse your XML file. Replace file_get_contents with simplexml_load_file and it should work.
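A minimal sketch of the parsing step, using an inline copy of the sample response instead of the remote URL (the string is parsed with simplexml_load_string() here; simplexml_load_file() would be the equivalent for a URL or path). Two details are worth noting as observations on the code in the question: the parsed root element is already ResultSet, so the products are reached as $xml->Product, and the handle opened with fopen() is $fh, not $f.

```php
<?php
// Inline stand-in for the remote feed content.
$feedContent = <<<XML
<ResultSet totalResultsAvailable="1">
  <Product orderNo="5321" partNo="A2345" truckable="1">
    <Manufacturer id="22">WIDGET 4 U</Manufacturer>
    <Model id="356">ACME 500</Model>
    <Years>95-98</Years>
    <ProductType id="23" categoryID="4">Cool Red Widgest</ProductType>
    <Material id="6">shiny stuff</Material>
    <PartNo>A2345</PartNo>
    <Code/>
  </Product>
</ResultSet>
XML;

$fileNameCSV = 'xmlParseContent.csv';

// Parse the XML text into a SimpleXMLElement; its root is <ResultSet>.
$xml = simplexml_load_string($feedContent);

$fh = fopen($fileNameCSV, 'w');
foreach ($xml->Product as $product) {
    // Cast each child element to a string so fputcsv() receives plain values.
    $fields = [];
    foreach ($product->children() as $child) {
        $fields[] = trim((string) $child);
    }
    fputcsv($fh, $fields, ',', '"', '\\');
}
fclose($fh);
```

Casting the children explicitly avoids passing attribute arrays or nested objects to fputcsv(), which get_object_vars() would include.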
Using PHP to convert XML to CSV is fairly easy, at least in the situations I've encountered so far. In my case, it would save me significant work if I could simply convert structured XML data into CSV data. Typically, I want to convert only the data in a particular xpath of the original XML document. The PHP function below will load an XML file and convert the elements in the specified xpath to simple csv data.
function xml2csv ($xmlFile, $xPath) {
// Start with an empty buffer so the .= below never touches an undefined variable
$csvData = '';
// Load the XML file
$xml = simplexml_load_file($xmlFile);
// Jump to the specified xpath
$path = $xml->xpath($xPath);
// Loop through the specified xpath
foreach($path as $item) {
// Loop through the elements in this xpath
foreach($item as $key => $value) {
$csvData .= '"' . trim($value) . '"' . ',';
}
// Trim off the extra comma
$csvData = trim($csvData, ',');
// Add an LF
$csvData .= "\n";
}
// Return the CSV data
return $csvData;
}
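The function above wraps every value in quotes but does not escape quotes inside a value. A hedged variant that delegates quoting to fputcsv() might look like the sketch below; the name xmlStringToCsv() and the choice to take the XML as a string rather than a file path are assumptions made to keep the example self-contained.

```php
<?php
function xmlStringToCsv(string $xmlString, string $xPath): string
{
    $xml = simplexml_load_string($xmlString);

    // Write into a temporary in-memory stream so fputcsv() handles the quoting.
    $buffer = fopen('php://temp', 'w+');
    foreach ($xml->xpath($xPath) as $item) {
        $fields = [];
        // Iterating a SimpleXMLElement walks its child elements.
        foreach ($item as $value) {
            $fields[] = trim((string) $value);
        }
        fputcsv($buffer, $fields, ',', '"', '\\');
    }

    // Read everything back as one string.
    rewind($buffer);
    $csv = stream_get_contents($buffer);
    fclose($buffer);

    return $csv;
}

echo xmlStringToCsv(
    '<items><item><name>Say "hi"</name><qty>2</qty></item></items>',
    '/items/item'
);
```

Embedded double quotes are doubled in the output, which is the convention most CSV readers expect.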
I have the following data being generated from a google spreadsheet rss feed.
いきます,go,5
きます,come,5
かえります,"go home, return",5
がっこう,school,5
スーパー,supermarket,5
えき,station,5
ひこうき,airplane,5
Using PHP I can do the following:
$url = 'http://google.com.....etc/etc';
$data = file_get_contents($url);
echo $data; // This prints all Japanese symbols
But if I use:
$url = 'http://google.com.....etc/etc';
$handle = fopen($url, 'r');
while($row = fgetcsv($handle)) {
print_r($row); // Outputs [0]=>,[1]=>'go',[2]=>'5', etc, i.e. the Japanese characters are skipped
}
So it appears the Japanese characters are skipped when using either fopen or fgetcsv.
My file is saved as UTF-8, it sends the PHP header that sets it as UTF-8, and there is a meta tag in the HTML head to mark it as UTF-8. I don't think it's the document itself, because it can display the characters through the file_get_contents method.
Thanks
I can't add a comment to the answer from Darien, so I'm posting here.
I reproduced the problem; after changing the locale, the problem was solved.
You must install the Japanese locale on the server before trying to repeat this.
Ubuntu
Add a new row to the file /var/lib/locales/supported.d/local
ja_JP.UTF-8 UTF-8
And run command
sudo dpkg-reconfigure locales
Or
sudo locale-gen
Debian
Just execute "dpkg-reconfigure locales" and select the necessary locales (ja_JP.UTF-8)
I don't know how to do it on other systems; try searching for the keywords "locale-gen locale" together with your server OS.
In the PHP file, before opening the CSV file, add this line
setlocale(LC_ALL, 'ja_JP.UTF-8');
This looks like it might be the same as PHP Bug 48507.
Have you tried changing your PHP locale setting prior to running the code and resetting it afterwards?
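A sketch of that set-and-restore pattern follows. It uses an in-memory CSV as a hypothetical stand-in for the remote feed and switches only LC_CTYPE, which is the category the PHP manual mentions for fgetcsv(); whether ja_JP.UTF-8 is available depends on the server, so the locale name is an assumption.

```php
<?php
// Hypothetical in-memory CSV standing in for the remote feed.
$handle = fopen('php://memory', 'w+');
fwrite($handle, "いきます,go,5\nかえります,\"go home, return\",5\n");
rewind($handle);

// Passing '0' returns the current setting without changing it.
$previous = setlocale(LC_CTYPE, '0');

// Try the Japanese UTF-8 locale; setlocale() returns false if it is not installed.
setlocale(LC_CTYPE, 'ja_JP.UTF-8');

$rows = [];
while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
    $rows[] = $row;
}
fclose($handle);

// Put back whatever locale was active before.
setlocale(LC_CTYPE, $previous);

print_r($rows);
```

Restoring the previous value keeps the change from leaking into other code that depends on the locale.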
You might want to consider this library. I remember using it some time back, and it is much nicer than the built-in PHP functions for handling CSV files. がんばって!
Maybe iconv character encoding conversion will help you:
http://php.net/manual/en/function.iconv.php
You can do that by hand not using fgetcsv and friends:
<?php
$file = file('http://google.com.....etc/etc');
foreach ($file as $row) {
$row = preg_split('/,(?!(?:[^",]|[^"],[^"])+")/', trim($row));
foreach ($row as $n => $cell) {
$cell = str_replace('\\"', '"', trim($cell, '"'));
echo "$n > $cell\n";
}
}
Alternatively, you can opt for a fancier, closure-based way:
<?php
$file = file('http://google.com.....etc/etc');
array_walk($file, function (&$row) {
$row = preg_split('/,(?!(?:[^",]|[^"],[^"])+")/', trim($row));
array_walk($row, function (&$cell) {
$cell = str_replace('\\"', '"', trim($cell, '"'));
});
});
foreach ($file as $row) foreach ($row as $n => $cell) {
echo "$n > $cell\n";
}