I have an XML file that contains the SKU of products. I also have a folder that corresponds to this XML file.
Snippet from XML:
<Feed>
<Product>
<ItemCode>ALT-AAB-BL</ItemCode>
<BaseItemCode>ALT-AAB</BaseItemCode>
<StockCheckCode>ALT-AAB-BL</StockCheckCode>
</Product>
<Product>
<ItemCode>ALT-AAB-L</ItemCode>
<BaseItemCode>ALT-AAB</BaseItemCode>
<StockCheckCode>ALT-AAB-L</StockCheckCode>
</Product>
<Product>
<ItemCode>ALT-AAB-N</ItemCode>
<BaseItemCode>ALT-AAB</BaseItemCode>
<StockCheckCode>ALT-AAB-N</StockCheckCode>
</Product>
</Feed>
I have been trying this with PHP, but I am a junior and don't know where to start, so here is some pseudocode.
if $domelement->ItemCode != filename.jpg {
delete filename.jpg;
}
Yes, this pseudocode is terrible. I am able to load the .xml file and manipulate its data.
I basically want to delete the files that are not present in the XML file and preserve the rest. I know how to append .png to the ItemCode if I need to.
<?php
$dom = new DOMDocument;
$dom->preserveWhiteSpace = false;
$dom->formatOutput = true;
$dom->load('altitude.xml');
$xpath = new DOMXPath($dom);
$query = sprintf('/Feed/Product/BaseItemCode');
foreach($xpath->query($query) as $record) {
//delete file that is not present in BaseItemCode
}
I just want the files not present in xml->BaseItemCode (which I will append with .png or .jpg) to be deleted from the folder.
You need two lists: a whitelist built from the XML and a list of all files in the directory.
$dom = new DOMDocument;
$dom->preserveWhiteSpace = false;
$dom->formatOutput = true;
$dom->load('altitude.xml');
$xpath = new DOMXPath($dom);
$query = '/Feed/Product/BaseItemCode';
$xmlList = [];
foreach($xpath->query($query) as $record) {
// $record is the <BaseItemCode> element itself, so read its text content
$code = trim($record->textContent);
$xmlList[] = $code . '.jpg';
$xmlList[] = $code . '.png'; // If you can, use a smarter way
}
$directory = '/full/path/to/dir';
$dirList = array_diff(scandir($directory), array('..', '.'));
$filesToDelete = array_diff($dirList, $xmlList);
foreach ($filesToDelete as $file) {
unlink($directory . DIRECTORY_SEPARATOR . $file);
}
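The "smarter way" mentioned in the comment could be to compare names without their extension, so that one whitelist entry covers every image type. A minimal sketch, assuming the image files are named exactly after the code; the helper name findOrphanFiles is hypothetical and not part of the original answer:

```php
<?php
// Hypothetical helper: returns the file names whose base name (without
// extension) is not in the list of allowed codes.
function findOrphanFiles(array $allowedCodes, array $fileNames): array
{
    // Flip the whitelist so each lookup is a key check.
    $allowed = array_flip(array_map('trim', $allowedCodes));
    $orphans = [];
    foreach ($fileNames as $fileName) {
        // pathinfo() drops the directory part and the last extension.
        $base = pathinfo($fileName, PATHINFO_FILENAME);
        if (!isset($allowed[$base])) {
            $orphans[] = $fileName;
        }
    }
    return $orphans;
}

// Sample data for illustration only.
$codes = ['ALT-AAB-BL', 'ALT-AAB-L'];
$files = ['ALT-AAB-BL.jpg', 'ALT-AAB-L.png', 'OLD-ITEM.jpg'];
print_r(findOrphanFiles($codes, $files));
```

The returned names could then be passed to unlink(); printing them first is a cheap dry run before anything is deleted.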
@justinas's method worked. I had a stray space at the end of every array element imported from my CSV file. I converted my XML to a CSV file and used it as an array.
<?php
$csv = file('convertcsv.csv');
function test_alter(&$item1, $key, $prefix)
{
$item1 = "$item1$prefix";
}
array_walk($csv, 'test_alter', '.png');
//var_dump($csv);
$directory = 'img';
$dirList = array_diff(scandir($directory), array('..', '.'));
$filesToDelete = array_diff($dirList, $csv);
foreach ($filesToDelete as $file) {
unlink($directory . DIRECTORY_SEPARATOR . $file);
}
echo "done";
?>
Can anyone tell me why there is a blank space after every single element in the array if you use:
$csv = file('convertcsv.csv');
as an array?
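The trailing character is most likely the line ending: file() keeps the newline at the end of each element unless the FILE_IGNORE_NEW_LINES flag is passed. A small sketch using a temporary file (the file contents are made up for illustration):

```php
<?php
// Write a throwaway file with one code per line.
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "ALT-AAB-BL\nALT-AAB-L\n");

// Default behaviour: each element still ends with "\n".
$raw = file($path);

// With these flags, line endings and empty lines are dropped.
$clean = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

var_dump($raw[0] === "ALT-AAB-BL\n");
var_dump($clean[0] === 'ALT-AAB-BL');

unlink($path);
```

If the array is already loaded, array_map('trim', $csv) is another way to strip the whitespace before appending the extension.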
Related
The following PHP script gives the count of elements in a single XML file in the folder uploads. But I have a number of XML files in the folder. What should I modify in the script so that I get the result in tabular format, with the file name and element count for every XML file in the folder?
<?php
$doc = new DOMDocument;
$xml = simplexml_load_file("uploads/test.xml");
//file to SimpleXMLElement
$xml = simplexml_import_dom($xml);
print("Number of elements: ".$xml->count());
?>
You first load the XML file into a SimpleXMLElement and then pass it to simplexml_import_dom() before calling count() on it. simplexml_import_dom() converts a DOM node into a SimpleXMLElement, and count() exists only on SimpleXMLElement, not on DOMElement. Since simplexml_load_file() already returns a SimpleXMLElement, the import is not necessary.
You can use a GlobIterator to iterate the files:
$directory = __DIR__.'/uploads';
// get an iterator for the XML files
$files = new GlobIterator(
$directory.'/*.xml', FilesystemIterator::CURRENT_AS_FILEINFO
);
$results = [];
foreach ($files as $file) {
// load file using absolute file path
// the returned SimpleXMLElement wraps the document element node
$documentElement = simplexml_load_file($file->getRealPath());
$results[] = [
// file name without path
'file' => $file->getFilename(),
// "SimpleXMLElement::count()" returns the number of children of an element
'item-count' => $documentElement->count(),
];
}
var_dump($results);
With DOM you can use Xpath to fetch specific values from the XML.
$directory = __DIR__.'/uploads';
// get an iterator for the XML files
$files = new GlobIterator(
$directory.'/*.xml', FilesystemIterator::CURRENT_AS_FILEINFO
);
// only one document instance is needed
$document = new DOMDocument();
$results = [];
foreach ($files as $file) {
// load the file into the DOM document
$document->load($file->getRealPath());
// create an Xpath processor for the loaded document
$xpath = new DOMXpath($document);
$results[] = [
'file' => $file->getFilename(),
// use an Xpath expression to fetch the value
'item-count' => $xpath->evaluate('count(/*/*)'),
];
}
var_dump($results);
The Xpath Expression
Get the document element /*
Get the child elements of the document element /*/*
Count them count(/*/*)
* is a universal selector for any element node. If you can, you should be more specific and use the actual element names (e.g. /list/item).
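To see the difference between the wildcard and a specific path, here is a small sketch on an inline document. The element names list, item and other are made up for illustration:

```php
<?php
// Inline sample document; the element names are assumptions.
$xml = '<list><item>a</item><item>b</item><other/><item>c</item></list>';

$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXPath($document);

// Any child element of the document element.
$allChildren = $xpath->evaluate('count(/*/*)');
// Only <item> children of <list>.
$itemsOnly = $xpath->evaluate('count(/list/item)');

var_dump($allChildren, $itemsOnly);
```

The wildcard count includes the other element, while the specific path counts only the item elements. evaluate() returns the count as a float, so cast it with (int) if an integer is needed.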
First, create a function with the logic you have:
function getXML($path) {
// simplexml_load_file() already returns a SimpleXMLElement,
// so no DOMDocument or simplexml_import_dom() call is needed
return simplexml_load_file($path);
}
Note that I:
have converted the path into a parameter, so you can reuse the same logic for your files
separated the parsing of XML from showing it
returned the XML itself, so you can get the count or you can do whatever else you may want with it
This is how you can get the files of a given path:
$files = array_diff(scandir('uploads'), array('.', '..'));
We get all files except for . and .., which are surely not of interest here. Read more about scandir here: https://www.php.net/manual/en/function.scandir.php
You receive an array of filenames on success, so let's loop over it and perform the logic you need:
$xmls = [];
foreach ($files as $file) {
if (str_ends_with($file, '.xml')) {
$xmls[] = $file . "\t" . getXML('uploads/' . $file)->count();
}
}
echo implode("\n", $xmls);
EDIT
As @Juan kindly explained in the comment section, one can use
$files = glob("./uploads/*.xml");
instead of scandir. That way we no longer need the call to array_diff or the if inside the loop. Note that glob returns each path including the ./uploads/ prefix, so it is passed to getXML as-is:
$xmls = [];
foreach ($files as $file) {
$xmls[] = $file . "\t" . getXML($file)->count();
}
echo implode("\n", $xmls);
I have this XML code, and I need to get every value from <g:detailed_image>. I've tried, but what I get is only the first value of <g:detailed_image>. I wonder what's wrong with my code.
Here's the xml code:
<item>
<g:detailed_images>
<g:detailed_image>hat.png</g:detailed_image>
<g:detailed_image>tie.png</g:detailed_image>
<g:detailed_image>eye_glass.png</g:detailed_image>
<g:detailed_image>watch.png</g:detailed_image>
</g:detailed_images>
</item>
<item>
<g:detailed_images>
<g:detailed_image>shoe.png</g:detailed_image>
<g:detailed_image>socks.png</g:detailed_image>
<g:detailed_image>hand_gloves.png</g:detailed_image>
<g:detailed_image>scarf.png</g:detailed_image>
</g:detailed_images>
</item>
And this is my code:
foreach($xpath->evaluate('//item') as $item)
{
$detailed_images = $xpath->evaluate('g:detailed_images', $item);
foreach ($detailed_images as $img)
{
$simg = $xpath->evaluate('string(g:detailed_image)', $img);
echo 'image = ';
echo $simg;
}
}
My result is:
image = hat.png
image = shoe.png
While what I want is this:
image = hat.png
image = tie.png
image = eye_glass.png
image = watch.png
image = shoe.png
image = socks.png
image = hand_gloves.png
image = scarf.png
Thanks for the help.
As you can see, you're only getting the first detailed_image of each detailed_images, because the XPath string() function converts only the first node of a node set. Keeping your approach, you would need to query g:detailed_image as a node list and add another foreach over the result. But you don't need that much XPath querying to get those elements. A single query is enough:
//item/g:detailed_images/g:detailed_image
PHP Code:
$dom = new DOMDocument;
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);
foreach($xpath->evaluate('//item/g:detailed_images/g:detailed_image') as $item) {
var_dump($item->nodeValue);
}
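For the g: prefix to resolve in an XPath expression, it has to be bound to a namespace URI; registering it explicitly avoids relying on where the document declares it. A self-contained sketch, assuming a made-up namespace URI and an items wrapper element (neither appears in the question):

```php
<?php
// Sample feed; the namespace URI and the <items> wrapper are assumptions.
$xml = <<<XML
<items xmlns:g="http://example.com/ns/g">
  <item>
    <g:detailed_images>
      <g:detailed_image>hat.png</g:detailed_image>
      <g:detailed_image>tie.png</g:detailed_image>
    </g:detailed_images>
  </item>
  <item>
    <g:detailed_images>
      <g:detailed_image>shoe.png</g:detailed_image>
    </g:detailed_images>
  </item>
</items>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);
// Bind the prefix used in the expression to the document's namespace URI.
$xpath->registerNamespace('g', 'http://example.com/ns/g');

$images = [];
foreach ($xpath->evaluate('//item/g:detailed_images/g:detailed_image') as $node) {
    $images[] = $node->textContent;
}
echo implode("\n", $images), "\n";
```

With a real feed, the URI passed to registerNamespace() must match the one declared for the g prefix in that document.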
I receive this error while I am trying to combine my XML files. I read other questions and answers, but I could not find a solution for my code. I cannot increase the RAM of the computer. Here is my code:
public function mergeXml ($filename,$source){
$events = array();
// open each xml file in this directory
foreach(glob("$source/*.xml") as $files) {
// get the contents of the the current file
$events[] =$files; // throw all files into an array .
}
// Replace the strings below with the actual filenames, add or decrease as fit
$out = new \DOMDocument();
$root = $out->createElement("documents");
foreach ($events as $file) { //get each file from array
$obj = new \DOMDocument();
$obj->load($file); //load files to obj.
$xpath = new \DOMXPath($obj);
foreach ($xpath->query("/*/node()") as $node) {
$root->appendChild($out->importNode($node, true));
}
}
$out->appendChild($root);
file_put_contents("$source/$filename.xml",$out->saveXML());
}
I have been stuck for days and can't seem to figure this out.
I am reading a directory, taking the filenames, and creating an XML file.
The issue is that I only want to use video or image files. The XML is then read into a table with pagination, so I only want files I will use, not .txt, .php or other .xml files in that directory.
Here is the code I have now:
if ($handle = opendir('recordings')) {
while (false !== ($entry = readdir($handle))) {
if ($entry != "." && $entry != "..") {
$loadThisUrl = "recordings/recordings.xml";
$xmldoc = new DomDocument( '1.0' );
$xmldoc->preserveWhiteSpace = false;
$xmldoc->formatOutput = true;
if( $xml = file_get_contents( $loadThisUrl ) ) {
$xmldoc->loadXML( $xml, LIBXML_NOBLANKS );
// find the channels tag
$root = $xmldoc->getElementsByTagName('channels')->item(0);
// create the <channel> tag
$channel= $xmldoc->createElement('channel');
// add ID to <channel> tag
$id = $xmldoc->getElementsByTagName('channel')->length;
$idAttribute = $xmldoc->createAttribute("id");
$idAttribute->value = $id;
// add name to <channel> tag
$nameAttribute = $xmldoc->createAttribute("name");
$nameAttribute->value = $entry;
// add url to <channel> tag
$urlAttribute = $xmldoc->createAttribute("url");
$urlAttribute->value = $entry;
// add category to <channel> tag
$categoryAttribute = $xmldoc->createAttribute("category");
$categoryAttribute->value = "recordings";
// add quality to <channel> tag
$qualityAttribute = $xmldoc->createAttribute("quality");
$qualityAttribute->value = "best";
$channel->appendChild($idAttribute);
$channel->appendChild($urlAttribute);
$channel->appendChild($nameAttribute);
$channel->appendChild($categoryAttribute);
$channel->appendChild($qualityAttribute);
// add the product tag before the first element in the <headercontent> tag
$root->insertBefore( $channel, $root->firstChild );
$xmldoc->save($loadThisUrl);
}
}
}
closedir($handle);
}
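One way to keep only media files is to check each entry's extension against a whitelist before adding it to the XML. A minimal sketch; the helper name isMediaFile and the extension list are assumptions to adapt:

```php
<?php
// Hypothetical helper: true when the file name has a whitelisted extension.
function isMediaFile(string $name): bool
{
    // Assumed list of video and image extensions; extend as needed.
    $allowed = ['mp4', 'mkv', 'avi', 'mov', 'jpg', 'jpeg', 'png', 'gif'];
    // Compare case-insensitively so "clip.MP4" also matches.
    $extension = strtolower(pathinfo($name, PATHINFO_EXTENSION));
    return in_array($extension, $allowed, true);
}

// Sample directory entries for illustration only.
$entries = ['show.MP4', 'notes.txt', 'recordings.xml', 'cover.png', 'index.php'];
$media = array_values(array_filter($entries, 'isMediaFile'));
print_r($media);
```

Inside the readdir loop, the same check could sit next to the test for . and .., skipping any entry for which isMediaFile($entry) returns false.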
I will try to explain as well as possible what I'm trying to do.
I have a folder on a server with about 100 xml files. These xml files are content pages with text and references to attachment filenames on the server that will be pushed to a wiki through an API.
It's all working fine 1 XML file at a time but I want to loop through each one and run my publish script on them.
I tried with opendir and readdir and although it doesn't error it only picks up the one file anyway.
Could someone give me an idea what I have to do. I'm very new to PHP, this is my first PHP project so my code is probably not very pretty!
Here's my code so far.
The functions that gets the XML content from the XML file:
<?php
function gettitle($file)
{
$xml = simplexml_load_file($file);
$xmltitle = $xml->xpath('//var[@name="HEADLINE"]/string');
return $xmltitle[0];
}
function getsummary($file)
{
$xml = simplexml_load_file($file);
$xmlsummary = $xml->xpath('//var[@name="summary"]/string');
return $xmlsummary[0];
}
function getsummarymore($file)
{
$xml = simplexml_load_file($file);
$xmlsummarymore = $xml->xpath('//var[@name="newslinetext"]/string');
return $xmlsummarymore[0];
}
function getattachments($file)
{
$xml = simplexml_load_file($file);
$xmlattachments = $xml->xpath('//var[@name="attachment"]/string');
return $xmlattachments[0];
}
?>
Here's the main publish script which pushes the content to the wiki:
<?php
// include required classes for the MindTouch API
include('../../deki/core/dream_plug.php');
include('../../deki/core/deki_result.php');
include('../../deki/core/deki_plug.php');
//Include the XML Variables
include('loadxmlfunctions.php');
//Path to the XML files on the server
$path = "/var/www/dekiwiki/skins/importscript/xmlfiles";
// Open the XML file folder
$dir_handle = @opendir($path) or die("Unable to open $path");
// Loop through the files
while ($xmlfile = readdir($dir_handle)) {
if($xmlfile == "." || $xmlfile == ".." || $xmlfile == "index.php" )
continue;
//Get XML content from the functions and put in the initial variables
$xmltitle = gettitle($xmlfile);
$xmlsummary = getsummary($xmlfile);
$xmlsummarymore = getsummarymore($xmlfile);
$xmlattachments = getattachments($xmlfile);
//Build the variables for the API from the XML content
//Create the page title - replace spaces with underscores
$pagetitle = str_replace(" ","_",$xmltitle);
//Create the page path variable
$pagepath = '%252f' . str_replace("'","%27",$pagetitle);
//Strip HTML from the $xmlsummary and xmlsummarymore
$summarystripped = strip_tags($xmlsummary . $xmlsummarymore, '<p><a>');
$pagecontent = $summarystripped;
//Split the attachments into an array
$attachments = explode("|", $xmlattachments);
//Create the variable with the filenames
$pagefilenames = '=' . $attachments;
$pagefilenamefull = $xmlattachments;
//Create the variable with the file URL - Replace the URL below to the correct one
$pagefileurl = 'http://domain/skins/importscript/xmlfiles/';
//authentication
$username = 'admin';
$password = 'password';
// connect via proxy
$Plug = new DreamPlug('http://domain/@api');
// setup the deki api location
$Plug = $Plug->At('deki');
//authenticate with the following details
$authResult = $Plug->At('users', 'authenticate')->WithCredentials($username, $password)->Get();
$authToken = $authResult['body'];
$Plug = $Plug->With('authtoken', $authToken);
// Upload the page content - http://developer.mindtouch.com/Deki/API_Reference/POST:pages//%7Bpageid%7D//contents
$Plug_page = $Plug->At('pages', '=Development%252f' . $pagetitle, 'contents')->SetHeader('Expect','')->Post($pagecontent);
// Upload the attachments - http://developer.mindtouch.com/MindTouch_Deki/API_Reference/PUT:pages//%7Bpageid%7D//files//%7Bfilename%7D
for($i = 0; $i < count($attachments); $i++){
$Plug_attachment = $Plug->At('pages', '=Development' . $pagepath, 'files', '=' . $attachments[$i])->SetHeader('Expect','')->Put($pagefileurl . $attachments[$i]);
}
}
//Close the XMl file folder
closedir($dir_handle);
?>
Thanks for any help!
To traverse a directory of XML files you can just do:
$files = glob("$path/*.xml");
foreach($files as $file)
{
$xml = simplexml_load_file($file);
$xmltitle = gettitle($xml);
$xmlsummary = getsummary($xml);
$xmlsummarymore = getsummarymore($xml);
$xmlattachments = getattachments($xml);
}
I also recommend you make a minor adjustment to your code so simplexml doesn't need to parse the same file four times to get the properties you need:
function gettitle($xml)
{
$xmltitle = $xml->xpath('//var[@name="HEADLINE"]/string');
return $xmltitle[0];
}
function getsummary($xml)
{
$xmlsummary = $xml->xpath('//var[@name="summary"]/string');
return $xmlsummary[0];
}
function getsummarymore($xml)
{
$xmlsummarymore = $xml->xpath('//var[@name="newslinetext"]/string');
return $xmlsummarymore[0];
}
function getattachments($xml)
{
$xmlattachments = $xml->xpath('//var[@name="attachment"]/string');
return $xmlattachments[0];
}
Try changing your while loop to the following and see if that helps:
while (false !== ($xmlfile = readdir($dir_handle)))
Let me know.
EDIT:
With the old loop, an entry whose name evaluates to false (for example a file or directory named "0") would have stopped the loop early. The strict comparison above is the form the PHP manual recommends for looping over a directory with readdir.
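The difference shows up with an entry named "0", which PHP treats as false in a loose comparison. A sketch that builds a temporary directory to illustrate the strict loop (the helper name listEntries and the file names are hypothetical):

```php
<?php
// Hypothetical helper: lists directory entries using the strict comparison.
function listEntries(string $dir): array
{
    $entries = [];
    $handle = opendir($dir);
    if ($handle === false) {
        return $entries;
    }
    // readdir() returns false only when no entries are left, so a name
    // like "0" does not end the loop.
    while (false !== ($entry = readdir($handle))) {
        if ($entry === '.' || $entry === '..') {
            continue;
        }
        $entries[] = $entry;
    }
    closedir($handle);
    // readdir() order is filesystem-dependent, so sort for stable output.
    sort($entries, SORT_STRING);
    return $entries;
}

// Throwaway directory containing a file literally named "0".
$dir = sys_get_temp_dir() . '/readdir_demo_' . uniqid();
mkdir($dir);
touch($dir . '/0');
touch($dir . '/page.xml');

$found = listEntries($dir);
var_dump($found);

// Clean up the temporary files.
unlink($dir . '/0');
unlink($dir . '/page.xml');
rmdir($dir);
```

Note that readdir() returns bare entry names, so the directory has to be prepended (for example $path . '/' . $xmlfile) before a name is handed to simplexml_load_file().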