Repetitively Retrieve Data from Site via PHP - php

When accessing http://www.example.net, a CSV file is downloaded with the most current data regarding that site. I want to have my site, http://www.example.com, access http://www.example.net on an hour by hour basis in order to get updated information.
I want to then use the updated information stored in the CSV file to compare changes from data in previous CSV files. I obviously have no idea what the best plan of attack would be so any help would be appreciated. I am just looking for a general outline of how I should proceed, but the more information the better.
By the way, I'm using a LAMP bundle, so PHP and MySQL solutions are preferred.

I think the easiest way to handle this would be a cron job that runs every hour (or a scheduled task if you are on Windows) and downloads the CSV with curl or file_get_contents(). Once you have downloaded the CSV, you can import the new data into your MySQL database.
The CSV should have some kind of timestamp on every row so you can easily separate new and old data.
Also, handling XML would be better than plain CSV.
A better way to set this up would be to create a web service on http://www.example.com and have http://www.example.net push updates to it in real time. But that requires you to have access to both websites.
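If the CSV has no timestamp column, one option is to compare the fresh download line by line against the previous copy. The sketch below is a minimal illustration in shell. The function name new_rows and the file names are hypothetical, and it assumes each record sits on a single line.

```shell
#!/bin/bash
# Hypothetical helper: print the lines of the new CSV that do not appear
# anywhere in the previous CSV (whole-line, fixed-string comparison).
new_rows() {
    local old="$1" new="$2"
    # On the first run there is no previous data, so every line is new.
    if [ ! -s "$old" ]; then
        cat "$new"
        return 0
    fi
    # -F fixed strings, -x whole-line match, -v invert, -f patterns from file.
    # grep exits with 1 when nothing is printed, which is not an error here.
    grep -Fxv -f "$old" "$new" || true
}
```

It could be called as new_rows data_previous.csv data.csv > changes.csv, and the resulting rows imported into MySQL. A row whose value changed shows up as a new line, because the comparison is on whole lines.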

Depending on the OS you're using, you're looking at a scheduled task (Windows) or a cron job (*nix) to kick up a service/app that would pull the new CSV and compare it to an older copy.

You'll definitely want to go the route of a cron job. I'm not exactly sure what you want to do with the differences, but if you just want an email, here is one potential (and simplified) option:
wget http://uri.com/file.txt && diff file.txt file_previous.txt | mail -s "Differences" your@email.com && mv file.txt file_previous.txt
Try this command by itself from your command line (I'm guessing you are using a *nix box) to see if you can get it working. From there, I would save this to a shell file in the directory where you want to save your CSV files.
cd /path/to/directory
vi process_csv.sh
And add the following:
#!/bin/bash
cd /path/to/directory
wget http://uri.com/file.txt
diff file.txt file_previous.txt | mail -s "Differences" your@email.com
mv file.txt file_previous.txt
Save and close the file. Make the new shell script executable:
chmod +x process_csv.sh
From there, start investigating the cronjob route. It could be as easy as checking to see if you can edit your crontab file:
crontab -e
With luck, you'll be able to enter your cronjob and save/close the file. It will look something like the following:
01 * * * * /path/to/directory/process_csv.sh
I hope you find this helpful.
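One caveat with the script above is that it sends a mail even when nothing changed, and it does not check that the download succeeded. A slightly more defensive variant might look like the following sketch. The function name report_and_rotate is an assumption for illustration, and the mail step is left to the caller so the comparison can be tried without a mail setup.

```shell
#!/bin/bash
# Hypothetical helper: print the differences between the previous copy and a
# fresh download, then keep the download as the new baseline.
# Returns 1 (and changes nothing) if the download is missing or empty.
report_and_rotate() {
    local new="$1" prev="$2"
    if [ ! -s "$new" ]; then
        echo "download missing or empty: $new" >&2
        return 1
    fi
    if [ -f "$prev" ]; then
        # diff exits with 1 when the files differ; that is expected here.
        diff "$prev" "$new" || true
    fi
    mv -f "$new" "$prev"
}
```

In process_csv.sh the output could be captured with changes=$(report_and_rotate file.txt file_previous.txt) and piped to mail only when $changes is non-empty.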

Related

Cronjob to detect last modified files

I want to run a cronjob every minute to detect all files that were changed in the last minute in a specific directory (with about 300.000 inodes) and export this file list to a csv.
Is it possible to run an optimized command to do that? I can't run a "find" with a sort flag in this directory because it is huge, and it would probably take more than a minute to go through all the files.
Is there any command that can do that? Or a program I can run in the background on the server that logs every changed file as it is changed? If there is a way to do this from PHP, that is fine too; I can create a cron job to execute a PHP script, no problem.
There is a Linux utility called incron that can be used similar to normal cron, but rather than events being time based, they work off of inotify and are fired from file events.
You can find the Ubuntu man page here: http://manpages.ubuntu.com/manpages/intrepid/man5/incrontab.5.html
I personally have not had to use it for anything too complex, but it roughly goes like this:
Install it:
sudo apt-get install incron
Open the editor to add an entry:
incrontab -e
Put something like this:
/var/www/myfolder IN_MODIFY curl https://www.example.com/api/file-updated/$#
The first part is the file or folder to watch. The second part is the event. And the third part is the command.
I think that $# is the placeholder for the name of the file that triggered the event ($@ is the watched path).
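Since the goal in the question was a CSV list of changed files, the incron command could be a small script that appends one line per event instead of calling a URL. This is a sketch under assumptions: the function name log_change, the script and CSV locations, and the incrontab line in the comment are illustrative. It relies on incron's $@ (watched path) and $# (file name) placeholders.

```shell
#!/bin/bash
# Hypothetical handler for an incrontab entry such as:
#   /var/www/myfolder IN_MODIFY /usr/local/bin/log_change.sh $@ $#
# where log_change.sh would end with:
#   log_change "$1" "$2" /var/log/changed_files.csv
log_change() {
    local dir="$1" file="$2" csv="$3"
    # One CSV row per event: ISO-8601 UTC timestamp, then the full path.
    # Note: a path containing commas or quotes would need CSV escaping.
    printf '%s,%s/%s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "${dir%/}" "$file" >> "$csv"
}
```

Because each event appends a row as it happens, nothing has to scan the whole directory once a minute.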

Running php script with cron: Could not open logfile, no such file or directory

I have a Raspberry Pi set up as a seedbox. I have a cron job that will run every 10 minutes, check for finished files using transmission-remote -l, grep for entries that are done (100%), get the names of the folders, copy these to the external drive, and then delete the original files on my Pi.
For every action that is done (a torrent is added, a torrent is finished, a file transfer has started, a file transfer has finished and the files have been deleted) an entry is written to my logfile.log, which is located in the same directory as all scripts, /home/pi/dev/. Inside that folder I have a subfolder, logs, that keeps logfiles on all moves from the Pi to the external drive. These logfiles are all named after the folder/file being moved.
As I said, every 10 minutes, torrentfinished.php is run through the cron job:
*/10 * * * * php -f /home/pi/dev/torrentfinished.php
All output from the job is sent to my mail at /var/mail/pi.
Now, if I run the script manually, I can run it from anywhere by writing
php -f /home/pi/dev/torrentfinished.php
I have some debug lines written in right beneath the execution of each command. (I use shell_exec to run the commands because I'm more comfortable writing in php than bash).
It'll output
Started transfer
Wrote transfer to logfile
In logfile an entry is then added, with the text $timestamp : started transfer of data from torrent $torrentname. A separate file is created in logs/$torrentname.log. Basically, everything works perfectly.
However, when the cron job runs, I get the following output in /var/mail/pi
Unable to open logfile: No such file or directory
Started transfer
Wrote transfer to logfile
But as you've probably guessed, nothing happens. The files remain in their spot on the Pi and are not transferred. In addition, nothing is written to logfile or logs/$torrentname.log.
I've been wracking my brain over this, and been using chmod 777 on more files than could possibly be considered necessary nor safe, simply to make sure this isn't a permissions issue. I may have missed something of course, but I wouldn't think so. I've also tried renaming the file logfile to something else, but I still get the same error.
I have no more ideas on what to do, so if any of you have ideas, please do tell!
When you use this:
php -f /home/pi/dev/torrentfinished.php
You are in the /home/pi/dev/ directory, and the logfile is written to /home/pi/dev/logs.
When the script is run by cron, the working directory is different (usually the home directory of the crontab's owner, not the script's directory), so relative paths no longer resolve.
Try using the __DIR__ or __FILE__ constants to build the logfile path.
It might help to see the actual php code. My first step here would be to have the code print the path to logfile so I could figure out why it thinks it doesn't exist (are you using a relative path or some environment variable, because cron tends to run in a sanitized environment).
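The same idea applies if the job is wrapped in a shell script rather than fixed inside PHP: build paths from the script's own location instead of from whatever working directory cron happens to use. A minimal sketch, where the helper name path_beside is hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: return the absolute path of a file that lives in the
# same directory as the given script, regardless of the current directory.
path_beside() {
    local script="$1" name="$2" dir
    # cd runs inside the command substitution, so the caller's directory
    # is left untouched.
    dir=$(cd "$(dirname "$script")" && pwd) || return 1
    printf '%s/%s\n' "$dir" "$name"
}
```

Inside a wrapper script this could be used as logfile=$(path_beside "$0" logfile.log). A simpler alternative is to change directory in the crontab entry itself, for example */10 * * * * cd /home/pi/dev && php -f torrentfinished.php.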

Bash/Shell Script for automatic backup of website

I'm brand new to shell scripting and have been searching for examples of how to create a backup script for my website, but I'm unable to find anything, or at least anything I understand.
I have a Synology Diskstation server that I'd like to use to automatically (through its scheduler) take backups of my website.
I currently am doing this via Automator on my Mac in conjunction with the Transmit FTP program, but making this a command line process is where I struggle.
This is what I'm looking to do in a script:
1) Open a URL without a browser (this URL creates a mysql dump of the databases on the server to be downloaded later). example url would be http://mywebsite.com/dump.php
2) Use FTP to download all files from the server. (Currently Transmit FTP handles this as a sync function and only downloads files where the remote file date is newer than the local file. It also will remove any local files that don't exist on the remote server).
3) Create a compressed archive of the files from step 2, named as website_CURRENT-DATE
4) Move archive from step 3 to a specific folder and delete any file in this specific folder that's older than 120 Days.
Right now I don't know how to do step 1, or the synchronization in step 2 (I see how I can use wget to download the whole site, but that seems as though it will download everything each time it runs, even if it's not been changed).
Steps 3 and 4 are probably easy to find via searching, but I haven't searched for that yet since I can't get past step 1.
Thanks!
Also FYI my web-host doesn't do these types of backups, so that's why I like to do my own.
Answering each of your questions in order, then:
1) Several options, the most common of which would be one of wget http://mywebsite.com/dump.php or curl http://mywebsite.com/dump.php.
2) Since you have ssh access to the server, you can very easily use rsync to grab a snapshot of the files on disk with e.g. rsync -essh --delete --stats -zav username@mywebsite.com:/path/to/files/ /path/to/local/backup.
3) Once you have the snapshot from rsync, you can make a compressed, dated copy with cd /path/to/local/backup; tar czvf /path/to/archives/website-$(date +%Y-%m-%d).tgz *
4) find /path/to/archives -mtime +120 -type f -exec rm -f '{}' \; will remove all backups older than 120 days.
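Steps 3 and 4 can be combined into one small function. This is a sketch rather than a finished backup script. The function name archive_and_prune, its arguments, and the website_ prefix (taken from the naming in the question) are assumptions.

```shell
#!/bin/bash
# Hypothetical helper: archive the synced snapshot into a dated .tgz and
# delete archives older than a given number of days.
# Keep the archive directory outside the snapshot directory, otherwise tar
# would try to include the archive in itself.
archive_and_prune() {
    local src="$1" dest="$2" days="$3"
    local archive
    archive="$dest/website_$(date +%Y-%m-%d).tgz"
    mkdir -p "$dest" || return 1
    # -c create, -z gzip, -f output file; -C switches into the snapshot
    # directory first so the archive holds relative paths.
    tar -czf "$archive" -C "$src" . || return 1
    # -mtime +N matches files last modified more than N days ago.
    find "$dest" -type f -name 'website_*.tgz' -mtime +"$days" -exec rm -f {} +
}
```

A scheduled run could then call archive_and_prune /path/to/local/backup /path/to/archives 120 after the rsync step has finished.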

Running a php script with a .bat file

I need to run a php script at midnight every night on my server. On a linux system I'd set up a cron job, but I'm stuck with a windows system.
I know I have to set up a task using the windows task scheduler, and that the task will need to run a .bat file which in turn will run the php file, but I'm stuck trying to write the .bat file.
What I currently have is:
@echo off
REM this command runs the nightly cron job
start "C:\Program Files (x86)\PHP\v5.3\php.exe" -f C:\inetpub\wwwroot\sitename\crons\reminder-email.php
But when I try to manually run the .bat file to test it, I get a Windows alert saying:
"Windows cannot find '-f'. Make sure you typed the name correctly, and then try again."
What have I missed?
The START command optionally accepts a title for the created window as its first argument; in this case, it thinks that C:\Program Files (x86)\PHP\v5.3\php.exe is the title to display and -f (the second argument) is the executable you want to run.
You can therefore fix this by providing a placeholder title, e.g.
start "email reminder task" "C:\Program Files (x86)\PHP\v5.3\php.exe" -f C:\inetpub\wwwroot\sitename\crons\reminder-email.php
Or, preferably, you can ditch the START command altogether (you aren't using any of its unique facilities) and just run PHP directly:
"C:\Program Files (x86)\PHP\v5.3\php.exe" -f C:\inetpub\wwwroot\sitename\crons\reminder-email.php
Actually, you don't even need a batch-file.
You can run the php-script from the task scheduler.
Just let the task scheduler run php.exe and set the location of the php-file as the argument of the task.
Can I suggest a small change.
echo off
REM This adds the folder containing php.exe to the path
PATH=%PATH%;C:\Program Files (x86)\PHP\v5.3
REM Change Directory to the folder containing your script
CD C:\inetpub\wwwroot\sitename\crons
REM Execute
php reminder-email.php
PS. Putting Apache, MySQL or PHP in Program Files is a bad idea. Don't use Windows folders with spaces in their names.
How about this?
set php="C:\Program Files (x86)\PHP\v5.3\php.exe"
%php% -f C:\inetpub\wwwroot\sitename\crons\reminder-email.php
Sadly this took close to 10 hours to figure out. Some protocols such as WinRM and PSEXEC must pass the logged in user account regardless of the credentials supplied (that or PHP overrides whatever you type in). At any rate, to get PSEXEC, WinRM or just kicking off batch files that connect to remote computers, you will need to change the IIS run as account to a user with rights to those resources (service account). I realize this is a huge security hole, but that is by design. PHP is secure so that it can't be hacked easily, you have to override their security by specifying a run as account. Not the same thing as your app pool account - although your IIS credentials can use your app pool account.
Try like this guys!
cd E:\xampp\htdocs\my-project
E:
php artisan schedule:run

Folder monitoring and event triggering according to folder status in php

There is a folder into which XML files are being copied at no particular time, whenever an event happens. I want a PHP way to inspect the folder's status so that when an XML file arrives, an event is triggered (e.g. a call to the XML parser). So which is the best way (in PHP) to monitor a folder and trigger events according to its status? Thanks!
Haven't tried it, but maybe Inotify can help you:
inotify is a Linux kernel subsystem that acts to extend filesystems to notice changes to the filesystem, and report those changes to applications.
There's a PHP extension for inotify, see InotifyDocs and inotifyPECL.
Another alternative if you're running on linux is to use a PHP-independent daemon to monitor a directory for changes. You can use dnotify for it (obsoleted by inotify), something like:
dnotify -a -r -b -s /path/ -e <command>;
It will execute the command each time one of the files in the folder is modified (-a -r -b -s = any access/recursive directory lookup/run in background/no output).
Related:
Read file change in php (linux equivalent of tail -f )
How to efficiently monitor a directory for changes on linux?
I think the simplest way to do this is to use a cron job to examine the folder every minute. The other option is to trigger your PHP script from the script or program that copies the new XML file to the directory.
Cron enables you to run your script every minute. If you want an instant response, you should write a shell script (http://aplawrence.com/Unixart/watchdir.html) that constantly runs in the background, or maybe a Perl daemon, to detect new files and trigger your PHP script to examine the changes.
You should take a look at FAM (File Alteration Monitor). PHP 4 based binary extension (beta status); documentation.
Use cron to regularly scandir() the directory.
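A cron-driven poll needs some way to remember what it has already seen. One common approach, sketched here in shell with hypothetical names (new_xml_files and the marker file), is to compare modification times against a marker file that is updated on every run. It assumes copied files get a fresh modification time on arrival.

```shell
#!/bin/bash
# Hypothetical helper: list .xml files in a directory that were modified after
# the marker file was last updated, then advance the marker.
new_xml_files() {
    local dir="$1" marker="$2"
    local next="$marker.next"
    # Take the new timestamp before scanning, so a file that arrives during
    # the scan is picked up by the next run (possibly reported twice, but
    # never missed).
    touch "$next"
    if [ -f "$marker" ]; then
        find "$dir" -maxdepth 1 -type f -name '*.xml' -newer "$marker"
    else
        # First run: no marker yet, so report every .xml file.
        find "$dir" -maxdepth 1 -type f -name '*.xml'
    fi
    mv -f "$next" "$marker"
}
```

Each path printed by new_xml_files /path/to/incoming /path/to/.last_scan could then be handed to the PHP script that parses the XML.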
