How can I find the values of the namespaced elements content:encoded and dc:creator with the following code?
Unfortunately, I cannot use SimplePie, MagpieRSS, or even SimpleXML.
I know I have to use $doc->getElementsByTagName, but I cannot figure out where.
<?php
function rss_to_array($tags, $array, $url) {
$doc = new DOMdocument();
#$doc->load($url);
$rss_array = array();
foreach($tags as $tag) {
if ($doc->getElementsByTagName($tag)) {
foreach($doc->getElementsByTagName($tag) AS $node) {
$items = array();
foreach($array AS $key => $values) {
$items[$key] = array();
foreach($values as $value) {
if ($itemsCheck = $node->getElementsByTagName($value)) {
for( $j=0 ; $j < $itemsCheck->length; $j++ ) {
if (($attribute = $itemsCheck->item($j)->nodeValue) != "") {
$items[$key][] = $attribute;
} else if ($attribute = $itemsCheck->item($j)->getAttribute('term')) {
$items[$key][] = $attribute;
} else if ($itemsCheck->item($j)->getAttribute('rel') == 'alternate') {
$items[$key][] = $itemsCheck->item($j)->getAttribute('href');
}
}
}
}
}
array_push($rss_array, $items);
}
}
}
return $rss_array;
}
$rss_item_tags = array('item', 'entry');
$rss_tags = array(
'title' => array('title'),
'description' => array('description', 'content', 'summary'),
'link' => array('link', 'feedburner'),
'category' => array('category')
);
$rssfeed = rss_to_array($rss_item_tags, $rss_tags, $url);
echo '<pre>';
print_r($rssfeed);
echo '</pre>';
exit;
?>
For RSS feeds, try using simplexml_load_file. It creates an object out of the XML and, since RSS 2.0 feeds share the same channel/item structure, you can do something like:
$feed = simplexml_load_file('your_rss_url_here');
for($i=0; $i < 10; $i++){
// this is assuming there are 10 pieces of content for each RSS you're loading
$link = $feed->channel->item[$i]->link;
// do each for pubdate, author, description, title, etc.
}
http://php.net/manual/en/book.simplexml.php
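Since the question rules out SimpleXML, here is a minimal DOM-only sketch of reading the namespaced elements. It assumes the feed declares the usual namespace URIs for content: and dc: (the ones shown in the sample below); the inline sample feed and the helper name first_ns_value are made up for illustration, and a real feed would be loaded with $doc->load($url) instead.

```php
<?php
// Hypothetical sample feed, used here so the example runs without a network call.
$xml = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <item>
      <title>First post</title>
      <dc:creator>Alice</dc:creator>
      <content:encoded><![CDATA[<p>Hello</p>]]></content:encoded>
    </item>
  </channel>
</rss>
XML;

// Namespace URIs conventionally bound to the content: and dc: prefixes.
const NS_CONTENT = 'http://purl.org/rss/1.0/modules/content/';
const NS_DC = 'http://purl.org/dc/elements/1.1/';

// Hypothetical helper: text of the first matching namespaced child, or ''.
function first_ns_value(DOMElement $parent, string $ns, string $localName): string
{
    $nodes = $parent->getElementsByTagNameNS($ns, $localName);
    return $nodes->length > 0 ? $nodes->item(0)->nodeValue : '';
}

$doc = new DOMDocument();
$doc->loadXML($xml);

$items = [];
foreach ($doc->getElementsByTagName('item') as $item) {
    $items[] = [
        'creator' => first_ns_value($item, NS_DC, 'creator'),
        'content' => first_ns_value($item, NS_CONTENT, 'encoded'),
    ];
}
print_r($items);
```

The point is that getElementsByTagNameNS matches on the namespace URI plus the local name (creator, encoded), so it does not matter which prefix the feed actually uses.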
I want to create a PHP script that converts XML to CSV. I get the XML from a URL and build the CSV with the following code. The problem is that the fields are written vertically instead of horizontally.
For example, my XML looks like:
<product>
<id>1001</id>
<sku>product1</sku>
<name>Product 1 Name</name>
<manufacturer>My Company</manufacturer>
</product>
<product>
<id>1002</id>
<sku>product2</sku>
<name>Product 2 Name</name>
<manufacturer>My Company</manufacturer>
</product>
<product>
<id>1003</id>
<sku>product3</sku>
<name>Product 3 Name</name>
<manufacturer>My Company</manufacturer>
</product>
And I get something like:
id,1001
sku,product1
name,"product 1"
manufacturer,My Company
id,1002
sku,product2
name,"product 2"
manufacturer,My Company
id,1003
sku,product3
name,"product 3"
manufacturer,My Company
instead of this (which is what I want):
"id","sku","name","manufacturer"
"1001","product1","Product 1","My Company"
"1002","product2","Product 2","My Company"
"1003","product3","Product 3","My Company"
My current code is:
file_put_contents("products.xml", fopen("https://xml.mysite.com/get.asp?xml=products&key=myxml", 'r'));
if (file_exists('products.xml')){
$xml = simplexml_load_file('products.xml');
file_put_contents("products.csv", "");
$f = fopen('products.csv', 'w');
createCsv($xml, $f);
fclose($f);
}
function createCsv($xml,$f){
foreach ($xml->children() as $item) {
$hasChild = (count($item->children()) > 0)?true:false;
if(!$hasChild){
$put_arr = array($item->getName(),$item);
fputcsv($f, $put_arr ,',','"');
} else {
createCsv($item, $f);
}
}
}
What can I do, please?
SimpleXML (and DOM) can use XPath to fetch elements from an XML document. You need one expression for the rows and a list of expressions for the columns.
function readRowsFromSimpleXML(
SimpleXMLElement $element, string $rowExpression, array $columnExpressions
): Generator {
foreach ($element->xpath($rowExpression) as $rowNode) {
$row = [];
foreach ($columnExpressions as $column => $expression) {
$row[$column] = (string)($rowNode->xpath($expression)[0] ?? '');
}
yield $row;
}
}
$rows = readRowsFromSimpleXML(
simplexml_load_file('products.xml'),
'//product',
$columns = [
'id' => './id',
'sku' => './sku',
'name' => './name',
'price' => './price',
'manufacturer' => './manufacturer'
]
);
readRowsFromSimpleXML(...) returns a Generator. It does not read the data yet; that only happens when you iterate it, for example with foreach().
Addressing the row and column data explicitly keeps the output more stable. It even works if an element is missing. I added a price column to show this.
To put this into a CSV, you have to iterate the generator:
$fh = fopen('php://stdout', 'w');
fputcsv($fh, array_keys($columns));
foreach ($rows as $row) {
fputcsv($fh, array_values($row));
}
Output:
id,sku,name,price,manufacturer
1001,product1,"Product 1 Name",,"My Company"
1002,product2,"Product 2 Name",,"My Company"
1003,product3,"Product 3 Name",,"My Company"
This works with more complex expressions as well. For example reading a currency attribute of the price element or multiple images:
$columns = [
'id' => './id',
'sku' => './sku',
'name' => './name',
'manufacturer' => './manufacturer',
'price' => './price',
'currency' => './price/@currency',
'image0' => '(./image)[1]',
'image1' => '(./image)[2]'
];
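As a quick check of how such expressions behave, here is a small sketch run against a single made-up product element. The sample XML (with a currency attribute and two image elements) and the image2 column are assumptions added for illustration, not part of the original feed.

```php
<?php
// Hypothetical product record with an attribute and repeated child elements.
$product = simplexml_load_string(
    '<product><id>1001</id><price currency="EUR">9.99</price>'
    . '<image>a.jpg</image><image>b.jpg</image></product>'
);

$expressions = [
    'price'    => './price',
    'currency' => './price/@currency', // attribute axis uses @
    'image0'   => '(./image)[1]',      // XPath positions start at 1
    'image1'   => '(./image)[2]',
    'image2'   => '(./image)[3]',      // no third image: falls back to ''
];

$row = [];
foreach ($expressions as $column => $expression) {
    // xpath() returns an array; take the first match or an empty string.
    $row[$column] = (string)($product->xpath($expression)[0] ?? '');
}
print_r($row);
```

A missing node just produces an empty string, so every row keeps the same set of columns.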
If you need to aggregate values, add a callback to the column definition.
function readRowsFromSimpleXML(
SimpleXMLElement $element, string $rowExpression, array $columnExpressions
): Generator {
foreach ($element->xpath($rowExpression) as $rowNode) {
$row = [];
foreach ($columnExpressions as $column => $options) {
if (is_array($options)) {
[$expression, $callback] = $options;
} else {
$expression = $options;
$callback = null;
}
$values = $rowNode->xpath($expression);
if ($callback) {
$row[$column] = $callback($values);
} else {
$row[$column] = (string)($values[0] ?? '');
}
}
yield $row;
}
}
$rows = readRowsFromSimpleXML(
simplexml_load_file('products.xml'),
'//product',
$columns = [
'id' => './id',
'sku' => './sku',
// ...
'categories' => [ './category', fn ($values) => implode(',', $values) ]
]
);
Complex configuration arrays are difficult to maintain. A more encapsulated approach would be a class. The following class works with SimpleXML and DOM. The fields/columns are added with a method.
class XMLRecordsReader implements \IteratorAggregate {
private $_source;
private $_expression = './*';
private $_fields = [];
public function __construct($source) {
if ($source instanceof \SimpleXMLElement) {
$this->_source = dom_import_simplexml($source);
return;
}
if ($source instanceof \DOMNode) {
$this->_source = $source;
return;
}
throw new \InvalidArgumentException('Need SimpleXMLElement or DOMNode $source.');
}
public function setExpression(string $expression): self {
$this->_expression = $expression;
return $this;
}
public function addField(string $name, string $expression, ?callable $mapper = null): self {
$this->_fields[$name] = [$expression, $mapper];
return $this;
}
public function getIterator(): \Generator {
$xpath = new DOMXpath(
$this->_source instanceof DOMDocument ? $this->_source : $this->_source->ownerDocument
);
foreach ($xpath->evaluate($this->_expression, $this->_source) as $node) {
$record = [];
foreach ($this->_fields as $field => $options) {
[$expression, $mapper] = $options;
$values = $xpath->evaluate($expression, $node);
if ($mapper) {
$record[$field] = $mapper($values);
} else if ($values instanceof DOMNodeList) {
$value = $values[0] ?? null;
$record[$field] = $value->textContent ?? '';
} else {
$record[$field] = (string)($values ?? '');
}
}
yield $record;
}
}
}
$reader = new XMLRecordsReader(
simplexml_load_file('products.xml'),
);
$reader
->addField('id', './id')
->addField('sku', './sku')
->addField('name', './name')
->addField('manufacturer', './manufacturer')
->addField('price', './price')
->addField('currency', './price/@currency')
->addField('image0', '(./image)[1]')
->addField('image1', '(./image)[2]')
->addField(
'categories',
'./category',
fn (\DOMNodeList $values) => implode(
',',
array_map(
fn (\DOMNode $node) => $node->textContent,
iterator_to_array($values)
)
)
);
var_dump(iterator_to_array($reader));
I am revamping an application that uses PHP on the server side and outputs JSON.
{"by":"Industrie LLC","dead":false,"descendants":396,"id":"396","kids":[1,396],"score":396,"time":"396","title":"Industrie LLC","type":"comment","url":"www.nytimes.com"}
As it is, I am only getting the last row of the MySQL data. I know it has something to do with the loops, but I have no idea what specifically.
My PHP code is here:
$sql_metro_company_doc_legal = "SELECT * FROM ".$configValues['CONFIG_DB_TBL_PRE']."posts where post_type='company'";
$res_metro_company_doc_legal = $dbSocket->query($sql_metro_company_doc_legal);
while($row_metro_company_doc_legal = $res_metro_company_doc_legal->fetchRow()) {
$notice2[] = $row_metro_company_doc_legal[5];
$notice8[] = strtotime($row_metro_company_doc_legal[0]);
$notice9[] = $row_metro_company_doc_legal[0];
$notice3[] = $row_metro_company_doc_legal[0];
$notice = array("id" => "".$row_metro_company_doc_legal[1]."","title"=>"".$row_metro_company_doc_legal[0]."");
$notice10[] = $row_metro_company_doc_legal[0];
$notice6[] = $row_metro_company_doc_legal[0];
$notice11[] = $row_metro_company_doc_legal[5];
$notice7[] = strtotime($row_metro_company_doc_legal[2]);
$notice12[] = 'www.nytimes.com';
$notice7[] = "comment";
}
foreach ($notice2 as $status2) {
$_page['by'] = $status2;
}
foreach ($notice8 as $status8) {
$_page['dead'] = $status8;
}
foreach ($notice9 as $status9) {
$_page['descendants'] = (int)$status9;
}
foreach ($notice3 as $status3) {
$_page['id'] = $status3;
}
foreach ($notice as $status) {
$_page['kids'][] = (int)$status;
}
foreach ($notice10 as $status10) {
$_page['score'] = (int)$status10;
}
foreach ($notice6 as $status6) {
$_page['time'] = $status6;
}
foreach ($notice11 as $status11) {
$_page['title'] = $status11;
}
foreach ($notice7 as $status7) {
$_page['type'] = $status7;
}
foreach ($notice12 as $status12) {
$_page['url'] = $status12;
}
foreach ($notice4 as $status4) {
$_page['parent'] = (int)$status4;
}
foreach ($notice5 as $status5) {
$_page['text'] = $status5;
}
//sets the response format type
header("Content-Type: application/json");
//converts any PHP type to JSON string
echo json_encode($_page);
You need to make a 2-dimensional array in $_page.
$_page = array();
foreach ($notice2 as $i => $status) {
$_page[] = array(
'by' => $status,
'dead' => $notice8[$i],
'descendants' => (int)$notice9[$i],
'id' => $notice3[$i],
// and so on for the rest
);
}
header ("Content-type: application/json");
echo json_encode($_page);
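If you do not need the intermediate $noticeN arrays at all, you could also build one entry per row directly inside the fetch loop. This is only a sketch: the $rows array below is a hypothetical stand-in for what the database returns, and the field mapping is made up for illustration.

```php
<?php
// Hypothetical rows standing in for the database result set.
$rows = [
    ['id' => 1, 'title' => 'Industrie LLC'],
    ['id' => 2, 'title' => 'Example Ltd'],
];

$page = [];
foreach ($rows as $row) {
    // One associative array per row, appended to the outer list.
    $page[] = [
        'by'   => $row['title'],
        'id'   => (string)$row['id'],
        'type' => 'comment',
        'url'  => 'www.nytimes.com',
    ];
}

$json = json_encode($page);
echo $json, PHP_EOL;
```

Because each row is appended with $page[], nothing is overwritten and json_encode emits a JSON array of objects.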
Despite the xpath being correct (to the best of my knowledge), this code is still outputting strangely.
By this I mean that lots of was_price and now_price values are not being scraped from the page and so are returning as £.
Any idea what's wrong?
Here's the site I'm scraping from.
Code:
function scrape($list_url, $shop_name, $photo_location, $photo_url_root, $product_location, $product_url_root, $was_price_location, $now_price_location, $gender, $country, mysqli $con)
{
$html = file_get_contents($list_url);
$doc = new DOMDocument();
libxml_use_internal_errors(TRUE);
if(!empty($html))
{
$doc->loadHTML($html);
libxml_clear_errors(); // remove errors for yucky html
$xpath = new DOMXPath($doc);
/* FIND LINK TO PRODUCT PAGE */
$products = array();
$row = $xpath->query($product_location);
/* Create an array containing products */
if ($row->length > 0)
{
foreach ($row as $location)
{
$product_urls[] = $product_url_root . $location->getAttribute('href');
}
}
else { echo "product location is wrong<br>";}
$imgs = $xpath->query($photo_location);
/* Create an array containing the image links */
if ($imgs->length > 0)
{
foreach ($imgs as $img)
{
$photo_url[] = $photo_url_root . $img->getAttribute('src');
}
}
else { echo "photo location is wrong<br>";}
$was = $xpath->query($was_price_location);
/* Create an array containing the was price */
if ($was->length > 0)
{
foreach ($was as $price)
{
$stripped = preg_replace("/[^0-9,.]/", "", $price->nodeValue);
$was_price[] = "£".$stripped;
}
}
else { echo "was price location is wrong<br>";}
$now = $xpath->query($now_price_location);
/* Create an array containing the sale price */
if ($now->length > 0)
{
foreach ($now as $price)
{
$stripped = preg_replace("/[^0-9,.]/", "", $price->nodeValue);
$now_price[] = "£".$stripped;
}
}
else { echo "now price location is wrong<br>";}
$result = array();
/* Create an associative array containing all the above values */
foreach ($product_urls as $i => $product_url)
{
$result[] = array(
'product_url' => $product_url,
'shop_name' => $shop_name,
'photo_url' => $photo_url[$i],
'was_price' => $was_price[$i],
'now_price' => $now_price[$i]
);
}
echo json_encode($result);
}
else
{
echo "this is empty";
}
}
$list_url = "http://www.asos.com/Women/Sale/70-Off-Sale/Cat/pgecategory.aspx?cid=16903&pge=0&pgesize=1002&sort=-1";
$shop_name = "ASOS";
$photo_location = "//ul[@id='items']/li/div[@class='categoryImageDiv']/*[1]/img";
$photo_url_root = "";
$product_location = "//ul[@id='items']/li/div[@class='categoryImageDiv']/*[1]";
$product_url_root = "http://www.asos.com";
$was_price_location = "//ul[@id='items']/li/div[@class='productprice']/span[@class='price' or @class='recRP rrp']"; // leave recRP rrp
$now_price_location = "//ul[@id='items']/li/div[@class='productprice']/span[@class='prevPrice previousprice' or @class='price outlet-current-price']"; // leave outlet-current-price
$gender = "f";
$country = "UK";
scrape($list_url, $shop_name, $photo_location, $photo_url_root, $product_location, $product_url_root, $was_price_location, $now_price_location, $gender, $country, $con);
I counted the number of matches on the page: there are 1563 hits for your was_price expression and only 1440 for your now_price expression. This tells me that either your XPath does not match in 100% of the cases or that some of the articles only have one price.
So you have to make sure that all of your XPath expressions return the same number of results, so that: products = was_price = now_price = images
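One way to keep the values aligned is to loop over each product node first and then run the price expressions relative to that node, so a missing price becomes null instead of shifting all later entries. This is a rough sketch: the HTML snippet, the class names was and now, and the helper first_price are invented for the example and will not match the real page.

```php
<?php
// Hypothetical markup: the second product has no "was" price.
$html = '<ul id="items">'
    . '<li><span class="was">&pound;20.00</span><span class="now">&pound;10.00</span></li>'
    . '<li><span class="now">&pound;5.00</span></li>'
    . '</ul>';

// Hypothetical helper: first match of a relative expression, digits only, or null.
function first_price(DOMXPath $xpath, DOMNode $context, string $expression): ?string
{
    $nodes = $xpath->query($expression, $context);
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return preg_replace('/[^0-9,.]/', '', $nodes->item(0)->nodeValue);
}

$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

$result = [];
foreach ($xpath->query("//ul[@id='items']/li") as $li) {
    // The leading "." makes each expression relative to the current <li>.
    $result[] = [
        'was_price' => first_price($xpath, $li, ".//span[@class='was']"),
        'now_price' => first_price($xpath, $li, ".//span[@class='now']"),
    ];
}
echo json_encode($result), PHP_EOL;
```

With this shape there is exactly one result entry per product, regardless of how many prices each one has.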
I have an XML file that contains around 60 books, which I need to sort in ascending order by borrowedcount using PHP. So far my code shows all the books but does not sort them. Any help would be really appreciated.
PHP
<?php
$xmlBookDoc = new DOMDocument();
$xmlBookDoc->load("books.xml");
$books = $xmlBookDoc->getElementsByTagName("item");
foreach($books as $list)
{
$course = $list->getElementsByTagName("course");
$course = $course->item(0)->nodeValue;
//HERE is where the GET function will be
if ($course == "CC150")
{
print_r($array);
$id = $list->getAttribute("id");
echo "<b>Book ID: </b> $id <br>";
$title = $list->getElementsByTagName("title");
$title = $title->item(0)->nodeValue;
echo "<b>Title: </b> $title <br>";
$isbn = $list->getElementsByTagName("isbn");
$isbn = $isbn->item(0)->nodeValue;
echo "<b>ISBN: </b> $isbn <br>";
$borrowed = $list->getElementsByTagName("borrowedcount");
$borrowed = $borrowed->item(0)->nodeValue;
echo "<b>Borrowed Count: </b> $borrowed <br>";
echo "<br>";
}
}
//print $xmlBookDoc->saveXML();
?>
xml file
<?xml version="1.0" encoding="utf-8"?>
<bookcollection>
<items>
<item id="51390">
<title>Management of systems development /</title>
<isbn>0091653215</isbn>
<url>http://library.hud.ac.uk/catlink/bib/51390</url>
<borrowedcount>45</borrowedcount>
<courses>
<course>CC140</course>
<course>CC210</course>
</courses>
</item>
<item id="483">
<title>Database systems management and design /</title>
<isbn>0877091153</isbn>
<url>http://library.hud.ac.uk/catlink/bib/483</url>
<borrowedcount>28</borrowedcount>
<courses>
<course>CC140</course>
</courses>
</item>
<item id="585842">
<title>E-learning skills /</title>
<isbn>0230573126</isbn>
<url>http://library.hud.ac.uk/catlink/bib/585842</url>
<borrowedcount>5</borrowedcount>
<courses>
<course>CC157</course>
</courses>
</item>
My solution:
$books = array();
$xml = simplexml_load_file('books.xml');
foreach($xml->items->item as $item) {
$books[] = array(
'id' => (string)$item->attributes()->id,
'title' => (string)$item->title,
'isbn' => (string)$item->isbn,
'course' => (string)$item->courses->course[0],
'borrowed_count' => intval($item->borrowedcount)
);
}
array_sort_by_column($books, 'borrowed_count');
var_dump($books);
And the sorting function:
function array_sort_by_column(&$array, $column, $direction = SORT_ASC) {
$reference_array = array();
foreach($array as $key => $row) {
$reference_array[$key] = $row[$column];
}
array_multisort($reference_array, $direction, $array);
}
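As an alternative to array_multisort, a comparison callback with usort does the same job in one call on PHP 7.4 or later. This is just a sketch with a few made-up rows shaped like the $books array above.

```php
<?php
// Hypothetical rows in the same shape as the $books array built above.
$books = [
    ['title' => 'Management of systems development /', 'borrowed_count' => 45],
    ['title' => 'E-learning skills /', 'borrowed_count' => 5],
    ['title' => 'Database systems management and design /', 'borrowed_count' => 28],
];

// The spaceship operator returns -1, 0 or 1, which is what usort expects.
usort($books, fn (array $a, array $b) => $a['borrowed_count'] <=> $b['borrowed_count']);

print_r(array_column($books, 'borrowed_count'));
```

Swapping $a and $b in the callback gives descending order.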
file '1.php':
<?php
include 'books.php';
$b=new books();
$arr=$b->load('books.xml'); //1. load books from xml to array
usort($arr, array('books','cmp')); //2. sort array
$b->save('out.xml',$arr); //3. save array to xml
?>
file 'books.php':
<?php
class books
{
//load books from xml to array
public function load($fname)
{
$doc=new DOMDocument();
if($doc->load($fname)) $res=$this->parse($doc);
else throw new Exception('error load XML');
return $res;
}
static public function cmp($a, $b)
{
if ($a['fields']['borrowedcount'] == $b['fields']['borrowedcount']) {
return 0;
}
return ($a['fields']['borrowedcount'] < $b['fields']['borrowedcount']) ? -1 : 1;
}
private function parse($doc)
{
$xpath = new DOMXpath($doc);
$items = $xpath->query("items/item");
$result = array();
foreach($items as $item)
{
$result[]=array('id'=>$item->getAttribute('id'), 'fields'=>$this->parse_fields($item));
}
return $result;
}
private function parse_fields($node)
{
$res=array();
foreach($node->childNodes as $child)
{
if($child->nodeType==XML_ELEMENT_NODE)
{
$res[$child->nodeName]=$this->get_value($child);
}
}
return $res;
}
private function get_value($node)
{
if($node->nodeName=='courses')
{
$res=array();
foreach($node->childNodes as $child)
{
if($child->nodeType==XML_ELEMENT_NODE)
{
$res[]=$child->nodeValue;
}
}
return $res;
}
else
{
return $node->nodeValue;
}
}
//save array to xml
public function save($fname, $rows)
{
$doc = new DOMDocument('1.0','utf-8');
$doc->formatOutput = true;
$bc = $doc->appendChild($doc->createElement('bookcollection'));
$items = $bc->appendChild($doc->createElement('items'));
foreach($rows as $row)
{
$item=$items->appendChild($doc->createElement('item'));
$item->setAttribute('id',$row['id']);
foreach($row['fields'] as $field_name=>$field_value)
{
$f=$item->appendChild($doc->createElement($field_name));
if($field_name=='courses')
{
foreach($field_value as $course_val)
{
$course=$f->appendChild($doc->createElement('course'));
$course->appendChild($doc->createTextNode($course_val));
}
}
else
{
$f->appendChild($doc->createTextNode($field_value));
}
}
}
file_put_contents($fname, $doc->saveXML());
}
}
?>
This is my code. It creates an XML file from MySQL data.
My problem:
for($i=0; $i<count($str_exp1); $i++) // HERE
{
$str_exp2 = explode(",", $str_exp1[$i]);
$newnode->setAttribute("lat", $str_exp2[0]);
$newnode->setAttribute("lng", $str_exp2[1]);
}
The for loop does not output all the data; it only shows the latest entry. I can't find where the problem is.
P.S. Sorry for my English.
$doc = new DOMDocument("1.0");
$node = $doc->createElement("marker");
$parnode = $doc->appendchild($node);
$result = mysql_query("SELECT * FROM usersline");
if(mysql_num_rows($result)>0)
{
header("Content-type: text/xml");
while ($mar = mysql_fetch_array($result))
{
$node = $doc->createElement("line");
$newnode = $parnode->appendChild($node);
$newnode->setAttribute("id_line", $mar['id_line']);
$newnode->setAttribute("color", $mar['colour']);
$newnode->setAttribute("width", $mar['width']);
$node = $doc->createElement("point");
$newnode = $parnode->appendChild($node);
$str_exp1 = explode(";", $mar['coordinats']);
for($i=0; $i<count($str_exp1); $i++) // HERE
{
$str_exp2 = explode(",", $str_exp1[$i]);
$newnode->setAttribute("lat", $str_exp2[0]);
$newnode->setAttribute("lng", $str_exp2[1]);
}
}
$xmlfile = $doc->saveXML();
echo $xmlfile;
}
else
{
echo "<p>No lines found!</p>";
}
Your problem is that you set multiple values on the same node, so you always overwrite the attribute values with the latest lat/lng pair.
Instead, you need to add a new element for each lat/lng pair, because an XML element cannot have duplicate attributes.
Here is some example code based on your question. As you can see, I introduce some functions to keep things more modular:
$result = $db->query("SELECT * FROM usersline");
if (!$result || !count($result)) {
echo "<p>No lines found!</p>";
return;
}
$doc = new DOMDocument("1.0");
$doc->loadXML('<marker/>');
$marker = $doc->documentElement;
foreach ($result as $mar) {
$line = $doc->createElement('line');
$attributes = array_map_array(['id_line', 'colour' => 'color', 'width'], $mar);
element_add_attributes($line, $attributes);
foreach (coordinates_to_array($mar['coordinats']) as $latlong) {
$point = $doc->createElement('point');
element_add_attributes($point, $latlong);
$line->appendChild($point);
}
$marker->appendChild($line);
}
header("Content-type: text/xml");
echo $doc->saveXML();
function element_add_attributes(DOMElement $element, array $attributes)
{
foreach ($attributes as $name => $value) {
if (!is_string($name)) continue;
$element->setAttribute($name, $value);
}
}
function array_map_array(array $map, array $array)
{
$result = array();
foreach ($map as $alias => $name) {
$source = is_string($alias) ? $alias : $name;
$result[$name] = $array[$source];
}
return $result;
}
function coordinates_to_array($coordinates)
{
$result = array();
$coordinatePairs = explode(";", $coordinates);
foreach ($coordinatePairs as $coordinatePair) {
list($pair['lat'], $pair['lng']) = explode(',', $coordinatePair, 2) + ['', ''];
$result[] = $pair;
}
return $result;
}
I hope this example is helpful and shows you some ways to break a problem apart so that your code becomes simpler and more stable.
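Stripped of the helper functions and the database, the core idea (one point element per lat/lng pair) can be sketched like this. The coordinate string is a made-up sample in the same "lat,lng;lat,lng" format as the coordinats column.

```php
<?php
// Hypothetical coordinate string: pairs separated by ";", lat/lng by ",".
$coordinates = '55.1,37.2;55.3,37.4';

$doc = new DOMDocument('1.0');
$doc->formatOutput = true;
$line = $doc->appendChild($doc->createElement('line'));

foreach (explode(';', $coordinates) as $pair) {
    // Pad with empty strings so a malformed pair does not raise a notice.
    [$lat, $lng] = explode(',', $pair, 2) + ['', ''];
    // A fresh <point> per pair, so earlier values are never overwritten.
    $point = $line->appendChild($doc->createElement('point'));
    $point->setAttribute('lat', $lat);
    $point->setAttribute('lng', $lng);
}

echo $doc->saveXML();
```

Each pass through the loop creates its own element, which is why all pairs end up in the output instead of only the last one.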
To make use of $db->query(...) first define a class that has the query method:
class DB {
public function query($sql) {
$dbhandle = mysql_query($sql);
$result = array();
while ($mar = mysql_fetch_array($dbhandle)) {
$result[] = $mar;
}
return $result;
}
}
Then instantiate it:
$db = new DB();
You can then use the code above for that part.
As for the problem with the PHP 5.4 short array notation, for example in this line:
$attributes = array_map_array(['id_line', 'colour' => 'color', 'width'], $mar);
First of all extract the array out of it:
$mapping = ['id_line', 'colour' => 'color', 'width'];
$attributes = array_map_array($mapping, $mar);
Then define the array with the array( and ) notation instead of [ and ]:
$mapping = array('id_line', 'colour' => 'color', 'width');
$attributes = array_map_array($mapping, $mar);
Do so as well in other places, e.g.
['', '']
becomes
array('', '')
and similar.
Replace your code with this:
$str_exp1 = explode(";", $mar['coordinats']);
$newnode->setAttribute("lat", $str_exp1[0]);
$newnode->setAttribute("lng", $str_exp1[1]);