Cron not creating file - php

With the following code I'm creating an XML file from data in my database:
<?php
//include 'config.php';
include '/var/www/html/folder/config.php';
$now=date('Y-m-d h:i:s');
echo "Date: ".$now."<br><br>";
$sql="SELECT * FROM awards WHERE active=3";
$result=mysql_query($sql);
// create doctype
$dom = new DOMDocument("1.0");
// create root element
$root = $dom->createElement("data");
$dom->appendChild($root);
$dom->formatOutput=true;
while($data=mysql_fetch_array($result)){
echo $data['title'];
// create ITEM
$item = $dom->createElement("item");
$root->appendChild($item);
// ID DOM
$subitem = $dom->createElement("id");
$item->appendChild($subitem);
$text = $dom->createTextNode($data['id']);
$subitem->appendChild($text);
// title DOM
$subitem = $dom->createElement("title");
$item->appendChild($subitem);
$text = $dom->createTextNode($data['title']);
$subitem->appendChild($text);
}
if(unlink ("api/2.xml")){
echo "deleted<br>";
}
if($dom->save("api/2.xml")){
echo "created";
}
?>
This works without problems when I execute it manually: the file 2.xml is created.
But when I add it to the crontab, the log shows that the cron job is being executed (I get the date echoed at the beginning of the script and also the titles echoed inside the while loop), yet the 2.xml file is not created.
Any clue why it is not created?

If you migrate a script to cron then you always need to check two things:
File permissions: the cron job might get executed with different rights (reminder: root is not the solution to everything).
Implicit paths: the cron job will have a different working directory.
We can't check the file permissions for you, but I can tell you that you're using implicit (relative) paths, which most likely cannot work in that form:
if(unlink("api/2.xml")){
echo "deleted<br>";
}
if($dom->save("api/2.xml")){
echo "created";
}
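Under cron, api/2.xml is resolved against the job's working directory (typically the user's home directory) rather than the script's folder, so the file either lands in some other api folder floating around in your filesystem or the save fails because that folder does not exist. Use absolute paths and you're good to go.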
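A minimal sketch of the absolute-path fix, assuming the api folder sits next to the script. pathFromScript is a hypothetical helper name, not part of the original code; __DIR__ is PHP's built-in constant for the directory of the current file, which does not depend on cron's working directory.

```php
<?php
// Hypothetical helper: turn a path relative to this script into an absolute one.
// __DIR__ is the directory of the current file, regardless of the working directory.
function pathFromScript(string $relative): string
{
    return __DIR__ . '/' . ltrim($relative, '/');
}

$target = pathFromScript('api/2.xml');
echo "Will write to: $target\n";

// In the script from the question, the last lines would then become:
// if (is_file($target) && unlink($target)) { echo "deleted<br>"; }
// if ($dom->save($target) !== false)       { echo "created"; }
```

If a script uses many relative paths, calling chdir(__DIR__) once at the top is another option.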


Remove Prestashop orphan images not stored in DB

I need to clean up a shop that has been running PrestaShop (currently 1.7) for many years.
With this script I removed all the images in the DB not connected to any product.
But there are many files not listed in the DB. For example, I currently have 5 image sizes in the settings, so new products show 6 files in the folder (the 5 sizes plus the imageID.jpg file), but some old products had up to 18 files. Many of these old products have been deleted, but in the folder I still find all the other formats, like "2026-small-cart.jpg".
So I tried creating a script that loops through the folders, checks the image files in them and verifies whether each id_image is stored in the DB.
If not, I can delete the file.
It works, but obviously the loop is huge and it stops working as soon as I change the starting path folder.
I've tried to reduce the DB queries by storing some data (so that all the images with the same id are handled with a single DB query), but it still crashes when I change the starting path.
It only works with two nested loops (really few...).
Here is the code. Any idea for a better way to get the result?
Thanks!
$shop_root = $_SERVER['DOCUMENT_ROOT'].'/';
include('./config/config.inc.php');
include('./init.php');
$image_folder = 'img/p/';
$image_folder = 'img/p/2/0/3/2/'; // TEST, existing product
$image_folder = 'img/p/2/0/2/6/'; // TEST, product deleted from DB but files in folder
//$image_folder = 'img/p/2/0/2/'; // test, not working...
$scan_dir = $shop_root.$image_folder;
// will check only images...
global $imgExt;
$imgExt = array("jpg","png","gif","jpeg");
// to avoid multiple queries for the same image id...
global $lastID;
global $delMode;
echo "<h1>Examined folder: $image_folder</h1>\r\n";
function checkFile($scan_dir,$name) {
global $lastID;
global $delMode;
$path = $scan_dir.$name;
$ext = substr($name,strripos($name,".")+1);
// if is an image and file name starts with a number
if (in_array($ext,$imgExt) && (int)$name>0){
// avoid extra queries...
if ($lastID == (int)$name) {
$inDb = $lastID;
} else {
$inDb = (int)Db::getInstance()->getValue('SELECT id_product FROM '._DB_PREFIX_.'image WHERE id_image ='.((int) $name));
$lastID = (int)$name;
$delMode = $inDb;
}
// if haven't found an id_product in the DB for that id_image
if ($delMode<1){
echo "- $path has no related product in the DB I'll DELETE IT<br>\r\n";
//unlink($path);
}
}
}
function checkDir($scan_dir,$name2) {
echo "<h3>Elements found in the folder <i>$scan_dir$name2</i>:</h3>\r\n";
$files = array_values(array_diff(scandir($scan_dir.$name2.'/'), array('..', '.')));
foreach ($files as $key => $name) {
$path = $scan_dir.$name;
if (is_dir($path)) {
// new loop in the subfolder
checkDir($scan_dir,$name);
} else {
// is a file, I'll check if must be deleted
checkFile($scan_dir,$name);
}
}
}
checkDir($scan_dir,'');
I would create two files with lists of images.
The first file is the result of a query from your database of every image file referenced in your data.
mysql -BN -e "select distinct id_image from ${DB}.${DB_PREFIX}image" > all_image_ids
(set the shell variables for DB and DB_PREFIX first)
The second file is every image file currently in your directories. Include only files that start with a digit and have an image extension.
find img/p -type f \( -name '[0-9]*.jpg' -o -name '[0-9]*.png' -o -name '[0-9]*.gif' -o -name '[0-9]*.jpeg' \) > all_image_files
For each filename, check if it's in the list of image ids. If not, then output the command to delete the file.
while read -r filename ; do
# strip the directory name and keep only the leading digits as the image id
b=$(basename "$filename")
image_id=${b%%[!0-9]*}
# -x matches whole lines only, so id 20 does not match 2026
grep -qx "${image_id}" all_image_ids || echo "rm '${filename}'"
done < all_image_files > files_to_delete
Read the file files_to_delete to visually check that the list looks right. Then run that file as a shell script:
sh files_to_delete
Note I have not tested this solution, but it should give you something to experiment with.
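The same set-difference idea can also be sketched in PHP, for anyone who prefers to stay inside the shop's own language. This is only an illustration: imageIdFromFile and findOrphans are hypothetical names, and the file list and id list are hard-coded here, whereas in a real run they would come from a directory scan and from the image table.

```php
<?php
// Hypothetical helper: extract the leading image id from a file name such as
// "2026-small-cart.jpg". Returns null if the name does not start with a digit
// or does not have an image extension.
function imageIdFromFile(string $path): ?int
{
    if (!preg_match('/^(\d+)[^\/]*\.(?:jpe?g|png|gif)$/i', basename($path), $m)) {
        return null;
    }
    return (int) $m[1];
}

// Hypothetical helper: return the files whose image id is not in $knownIds.
function findOrphans(iterable $files, array $knownIds): array
{
    $known = array_flip($knownIds);   // id => position, for fast isset() lookups
    $orphans = [];
    foreach ($files as $file) {
        $id = imageIdFromFile($file);
        if ($id !== null && !isset($known[$id])) {
            $orphans[] = $file;
        }
    }
    return $orphans;
}

// Sample data standing in for the directory scan and the DB query.
$files = [
    'img/p/2/0/2/6/2026.jpg',
    'img/p/2/0/2/6/2026-small-cart.jpg',
    'img/p/2/0/3/2/2032.jpg',
    'img/p/index.php',
];
$knownIds = [2032];

// Only list the candidates; review the list before deleting anything.
print_r(findOrphans($files, $knownIds));
```

Loading all ids once and checking them in memory avoids one DB query per file, which is the part that made the original loop so heavy.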

cache folder not functioning, strange file names being saved (php)

I have a project where I scrape data from a few sites and output it on a single site. To help with load times, I'm trying to set it up so that once every 10 minutes my main website does a full data scrape and stores the results in a folder called "cache" in the root folder. Any time I refresh the main site after such a scrape, it should pull from the cache, which would make load times quite fast.
The trouble is that load times haven't changed, which they really should with this method, so I'm doing something wrong. I would appreciate any help. I can confirm the data IS being stored in the cache, because I see the files automatically appearing there. So the issue has to be in the code that is supposed to read the data back from the cache: the data is stored every 10 minutes, but it is never read.
Part of me wonders if the issue is with how the filenames are being saved in the cache; right now they seem to be random values. For example, one is named f32dd7f0b85eb4c1be0bb9a417cc29ea553d898e.html
I'd think it needs to be saved under a specific file name, but I'm not sure how to achieve that. The code at the end of my PHP files seems to specify this, so I'm not sure what the issue is. The code that is supposed to be doing this is at the bottom of the post.
I'm really new to PHP, and honestly have only gotten this far thanks to some very nice and helpful people. I'm close, but not quite there yet with this cache framework.
global.php in root folder:
<?php
$_cache_time =600; //10 minutes
$_cache_dir="./cache"; //cache dir
function deleteBlankInArray($var){
return !ctype_space($var)&&!empty($var);
}
function cache_start($filename)
{
global $_cache_dir,$_cache_time;
$cachefile = $_cache_dir.'/'.sha1($filename).'.html';
ob_start();
if(file_exists($cachefile) && (time( )-$_cache_time <
filemtime($cachefile)))
{
include($cachefile);
ob_flush();
return true;
}
return false;
}
function cache_end($filename)
{
global $_cache_dir,$_cache_time;
$cachefile = $_cache_dir.'/'.sha1($filename).'.html';
$fp = fopen($cachefile, 'w');
fwrite($fp, ob_get_contents());
fclose($fp);
ob_flush();
}
My main website is an XHTML site. It references these PHP pages like this:
<?php include 's&pcurrent.php';?>
<?php include 'news.php';?>
It includes and outputs multiple PHP files, which is why load times are slow when nothing is pulled from the cache.
And lastly, this is an example of one of the PHP files that are being "included". This one is called litecoinchange.php
<?php
error_reporting(E_ALL^E_NOTICE^E_WARNING);
include_once "global.php";
//filename of the file
if(!cache_start("litecoinchange.php")){
$doc = new DOMDocument;
// We don't want to bother with white spaces
$doc->preserveWhiteSpace = false;
$doc->strictErrorChecking = false;
$doc->recover = true;
$doc->loadHTMLFile('https://coinmarketcap.com/');
$xpath = new DOMXPath($doc);
$query = "//tr[@id='id-litecoin']";
$entries = $xpath->query($query);
foreach ($entries as $entry) {
$result = trim($entry->textContent);
$ret_ = explode(' ', $result);
//make sure every element in the array don't start or end with blank
foreach ($ret_ as $key=>$val){
$ret_[$key]=trim($val);
}
//delete the empty element and the element is blank "\n" "\r" "\t"
//I modify this line
$ret_ = array_values(array_filter($ret_,deleteBlankInArray));
//echo the last element
echo $ret_[7];
//filename of the file
cache_end("litecoinchange");
}
}
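One detail in the listing above may explain the symptom: cache_start() is called with "litecoinchange.php" but cache_end() with "litecoinchange". The cache file name is sha1() of that string, so the file that gets written is never the file that gets read (the 40-character file names are those SHA-1 hashes, not random values). Below is a minimal sketch that rules out such a mismatch by passing the key only once. The function cached and its parameters are hypothetical names, and the closure merely stands in for the scraping code.

```php
<?php
// Hypothetical wrapper: return the cached HTML for $key, or build it with $produce.
// The same $key names the file for both the read and the write.
function cached(string $dir, string $key, int $ttl, callable $produce): string
{
    $file = $dir . '/' . sha1($key) . '.html';
    if (is_file($file) && time() - filemtime($file) < $ttl) {
        return file_get_contents($file);   // still fresh: skip the expensive work
    }
    $html = $produce();
    file_put_contents($file, $html, LOCK_EX);
    return $html;
}

// Demo in a throwaway directory; the closure stands in for the scraping code.
$dir = sys_get_temp_dir() . '/cache_demo_' . uniqid();
mkdir($dir, 0755, true);
$calls = 0;
$scrape = function () use (&$calls): string {
    $calls++;
    return '<p>price</p>';
};

echo cached($dir, 'litecoinchange.php', 600, $scrape), "\n";
echo cached($dir, 'litecoinchange.php', 600, $scrape), "\n";
echo "scraper ran $calls time(s)\n";
```

In the original listing, cache_end() also sits inside the foreach loop, so nothing is written when the XPath query returns no rows; that may be worth checking too.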

php scraper scripts need to be changed

This script harvests links from a seed URL and only prints them in the command shell (or browser) rather than saving them anywhere. I want the script to store its output in a .txt file in the folder where the script resides. What would be an efficient way to do that? Please give me some hints.
<?php
# Initialization
include("LIB_http.php"); // http library
include("LIB_parse.php"); // parse library
include("LIB_resolve_addresses.php"); // address resolution library
include("LIB_exclusion_list.php"); // list of excluded keywords
include("LIB_simple_spider.php"); // spider routines used by this app.
set_time_limit(3600); // Don't let PHP timeout
$SEED_URL = "http://www.schrenk.com"; // First URL spider downloads
$MAX_PENETRATION = 1; // Set spider penetration depth
$FETCH_DELAY = 1; // Wait one second between page fetches
$ALLOW_OFFISTE = false; // Don't allow spider to roam from the SEED_URL's domain
$spider_array = array();
# Get links from $SEED_URL
echo "Harvesting Seed URL \n";
$temp_link_array = harvest_links($SEED_URL);
$spider_array = archive_links($spider_array, 0, $temp_link_array);
# Spider links in remaining penetration levels
for($penetration_level=1; $penetration_level<=$MAX_PENETRATION; $penetration_level++)
{
$previous_level = $penetration_level - 1;
for($xx=0; $xx<count($spider_array[$previous_level]); $xx++)
{
unset($temp_link_array);
$temp_link_array = harvest_links($spider_array[$previous_level][$xx]);
echo "Level=$penetration_level, xx=$xx of ".count($spider_array[$previous_level])." <br>\n";
$spider_array = archive_links($spider_array, $penetration_level, $temp_link_array);
}
}
?>
Use PHP's file_put_contents function with the FILE_APPEND flag, which appends to the file instead of overwriting it.
$file = 'file_name.txt';
file_put_contents($file, $text_to_write_to_file, FILE_APPEND);
Ref: http://www.php.net/manual/en/function.file-put-contents.php
I would recommend first creating a variable to store the output in the script. So at the top (under the $spider_array=array() ) add:
$output = "";
Then change all the lines with echo to use $output .= instead.
This will store all the content sent to the screen or the browser into the $output variable.
Now at the bottom of the script, after everything has been scraped and the spider is finished, save the output to a file:
$filename = date('Y_m_d_H_i_s') . '.txt';
$filepath = dirname(__FILE__);
file_put_contents($filepath . '/' . $filename, $output);
This should save the output in a file within the same folder as the script, with a date/time file name. (This code was written using examples from php.net; the exact implementation may need a bit of debugging, but this should get you close enough.)
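If rewriting every echo is inconvenient, output buffering is a different technique that reaches the same result: ob_start() collects everything that is echoed, and ob_get_clean() hands it back as a string. The sketch below is an assumption-laden illustration rather than a drop-in patch. captureToFile is a hypothetical helper name, and the closure stands in for the spider code.

```php
<?php
// Hypothetical helper: run $task, capture everything it echoes,
// and save the captured text to a timestamped .txt file in $dir.
function captureToFile(callable $task, string $dir): string
{
    ob_start();
    try {
        $task();
    } finally {
        $output = ob_get_clean();   // stop buffering even if $task throws
    }
    $path = $dir . '/' . date('Y_m_d_H_i_s') . '.txt';
    if (file_put_contents($path, $output) === false) {
        throw new RuntimeException("Could not write $path");
    }
    return $path;
}

// Stand-in for the spider: the echo lines stay exactly as they were.
$spider = function (): void {
    echo "Harvesting Seed URL \n";
    echo "Level=1, xx=0 of 3 \n";
};

// Pass __DIR__ instead to save next to the script, as the question asks.
$path = captureToFile($spider, sys_get_temp_dir());
echo "Saved to $path\n";
```

The trade-off is that nothing appears on screen while the spider runs, so for long crawls appending line by line with FILE_APPEND may be the more practical choice.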

Do While LOOP - best approach

I have the right PHP scripting to create a random number and make a new folder on the server with that number as its name. If the folder exists the script stops. What I can't figure out though is how to direct the script to generate a new random number if the folder already exists and try again until it finds an unused number/folder. I think a do while is what I'm looking for, but I'm not sure whether I have written it correctly (I don't want to test it on the server for fear of creating a forever-looping mkdir command).
Here is the one-off code being used
<?php
$clientid = rand(1,5);
while (!file_exists("clients/$clientid"))
{
mkdir("clients/$clientid", 0755, true);
exit("Your new business ID is($clientid)");
}
echo ("The client id is $clientid");
?>
Here is the do while I am contemplating - is this correct or do I need to do this a different way?
<?php
$clientid = rand(1,5);
do {mkdir("clients/$clientid", 0755, true);
exit("Your new business ID is($clientid)");}
while (!file_exists("clients/$clientid"));
echo ("The client id is $clientid");
?>
The problem is that you only generate a new number once, outside the loop, so a retry loop would keep testing the same number and never terminate. Invert the condition and generate a new number on each iteration:
$clientid = rand(1,5);
while (file_exists("clients/$clientid"))
{
// While we are in here, the file exists. Generate a new number and try again.
$clientid = rand(1,5);
}
// We are now guaranteed that we have a unique filename.
mkdir("clients/$clientid", 0755, true);
exit("Your new business ID is($clientid)");
I would do something like this:
<?php
$filename = md5(time().rand()) . ".txt";
while(is_file("clients/$filename")){
$filename = md5(time().rand()) . ".txt";
}
touch("clients/$filename");
A useful tip for when you're testing code in a while loop: create a variable as a safety counter and increment it on each pass, so that if your other logic causes an infinite loop it still breaks out, like this:
$safetyCount = 0;
while (yourLogic && $safetyCount < 500){
//more of your logic
$safetyCount++;
}
Obviously, if you need the limit to be lower or higher than 500, set it to whatever you like; this just makes sure you'll not kill your machine. :)
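The retry loop and the safety counter can be combined, and mkdir() itself can serve as the existence test: it returns false when the folder already exists, so a successful call both checks and claims the id in one step. This is a minimal sketch under those assumptions. createClientDir and its parameters are hypothetical names, and the demo uses a temporary directory instead of clients/.

```php
<?php
// Hypothetical helper: try random ids until mkdir() succeeds or attempts run out.
// Returns the new id, or null if no free id was found within $maxAttempts tries.
function createClientDir(string $base, int $min, int $max, int $maxAttempts = 500): ?int
{
    if (!is_dir($base) && !mkdir($base, 0755, true) && !is_dir($base)) {
        throw new RuntimeException("Cannot create $base");
    }
    for ($i = 0; $i < $maxAttempts; $i++) {
        $id = random_int($min, $max);
        // mkdir() fails if the folder already exists (or on permission errors),
        // so a true result means this id is new and now reserved.
        if (@mkdir("$base/$id", 0755)) {
            return $id;
        }
    }
    return null;   // every attempt collided: the id range is probably exhausted
}

// Demo with a range of exactly one id, so the outcome does not depend on chance.
$base = sys_get_temp_dir() . '/clients_' . uniqid();
$first = createClientDir($base, 1, 1);        // 1: the only id is free
$second = createClientDir($base, 1, 1, 10);   // null: the only id is taken
var_dump($first, $second);
```

With a range as small as rand(1,5), the bounded loop matters: once all five folders exist, an unbounded while (file_exists(...)) loop would spin forever.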

Best way to move entire directory tree in PHP

I would like to move a file from one directory to another. However, the trick is that I want to move the file together with its entire path.
Say I have file at
/my/current/directory/file.jpg
and I would like to move it to
/newfolder/my/current/directory/file.jpg
So as you can see I want to retain the relative path, so that in the future I can move it back to /my/current/directory/ if required. One more thing: /newfolder is empty, so there is no pre-made structure and I can copy anything in there (another file may be copied to /newfolder/my/another/folder/anotherfile.gif). Ideally I would like to create a method that does the magic when I pass it the original path to the file. The destination is always the same: /newfolder/...
You may try something like this if you're in a Unix/Linux environment:
$original_file = '/my/current/directory/file.jpg';
$new_file = "/newfolder{$original_file}";
// create the new directory structure (note the "-p" option of mkdir)
$new_dir = dirname($new_file);
if (!is_dir($new_dir)) {
$command = 'mkdir -p ' . escapeshellarg($new_dir);
exec($command);
}
echo rename($original_file, $new_file) ? 'success' : 'failed';
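The same steps can be done without shelling out, since PHP's own mkdir() accepts a third argument that creates missing parent directories. This is a rough sketch under that assumption. moveWithPath is a hypothetical helper name, and the demo builds its own files in the temporary directory rather than touching /my or /newfolder.

```php
<?php
// Hypothetical helper: move $file below $destRoot, recreating its full path there.
// Returns the new location of the file.
function moveWithPath(string $file, string $destRoot): string
{
    $target = rtrim($destRoot, '/') . '/' . ltrim($file, '/');
    $dir = dirname($target);
    // The third argument (true) creates missing parent directories, like "mkdir -p".
    if (!is_dir($dir) && !mkdir($dir, 0755, true) && !is_dir($dir)) {
        throw new RuntimeException("Cannot create $dir");
    }
    if (!rename($file, $target)) {
        throw new RuntimeException("Cannot move $file to $target");
    }
    return $target;
}

// Demo in a throwaway directory.
$root = sys_get_temp_dir() . '/move_demo_' . uniqid();
$source = $root . '/my/current/directory/file.jpg';
mkdir(dirname($source), 0755, true);
file_put_contents($source, 'demo');

$moved = moveWithPath($source, $root . '/newfolder');
echo $moved, "\n";
```

Because the original path is kept intact below the destination root, moving a file back only requires stripping that root prefix from its new location.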
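You can also call mv directly, as long as the destination directory already exists:
<?php
$from = '/my/current/directory/file.jpg';
$to = '/newfolder/my/current/directory/file.jpg';
exec('mv ' . escapeshellarg($from) . ' ' . escapeshellarg($to), $output, $exitCode);
// exec() reports mv's exit status in $exitCode; 0 means success
if ($exitCode === 0) {
echo "success";
} else {
echo "fail";
}
?>
Please note that this uses exec rather than the backtick operator: backticks return the command's output, not its exit status, so they cannot tell you whether mv succeeded.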
