Sorry if this is a duplicate question... I've searched around and found similar advice, but nothing that helps my exact problem. And please excuse the noob questions; cron is a new thing for me.
I have a CodeIgniter script that scrapes the HTML DOM of another site and stores some of it in a database. I'd like to run this script at a regular interval, which has led me to looking into cron jobs.
The page I have is at myserver.com/index.php/update
I realize I can set up a cron job that requests this page with curl. If I want to be a bit more secure, I can append a token to the URL, like:
myserver.com/index.php/update/asdfh2784fufds
And check for that in my CI controller.
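Something like this in the controller, I imagine (the token value is obviously made up for the example):

public function update($token = '')
{
    // Reject requests that do not carry the expected secret token.
    if ($token !== 'asdfh2784fufds') {
        show_404();
        return;
    }
    // ... scraping and database work goes here ...
}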
This seems like it would be mostly secure, but doesn't seem like the "right" way to do things.
I've looked into running CI from the command line, and can execute basic pages like:
php index.php mycontroller
But when I try to do:
php index.php update
It doesn't work. I suspect this is because it needs to use HTTP to scrape the DOM of the outside page.
So, my question:
How do I securely run a CodeIgniter script with a cron job that needs HTTP access?
You have a couple of options. The easiest would be to have your script verify that $_SERVER['REMOTE_ADDR'] is the local machine before executing.
Another would be to use https and have wget or curl use HTTP authentication.
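A minimal sketch of the first approach, assuming the cron job runs on the same machine as the web server:

if ($_SERVER['REMOTE_ADDR'] !== '127.0.0.1') {
    // Refuse anything that did not originate on this machine.
    exit('Not allowed');
}
// ... proceed with the scrape ...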
What exactly went wrong?
What error did it throw?
I have used CI from the command line before without any problems.
Don't forget that if you are not in the folder where the script is located, you need to specify the full path to it.
Something like:
php /path/to/ci_folder/index.php update
Also, in your controller you can add:

if ($this->input->is_cli_request()) {
    // run the script
} else {
    echo 'Not allowed';
}
This will run what is needed only if the php script is running on the command line.
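To schedule it, a crontab entry along these lines would do (the interval and path are just examples):

*/30 * * * * /usr/bin/php /path/to/ci_folder/index.php update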
Hope it helped.
Related
Can I run a PHP script from the command line with the following usage:
php http://phpfile.php username password config_file
When I run this, it says it cannot open the input file http://phpfile.php.
If there is a way, what would be the best way to execute this within a PHP script?
Thanks
This is, practically speaking, not possible. You cannot execute a PHP script hosted on someone else's server from your CLI.
WHY?
Consider this case: Facebook has a PHP script which adds a comment to the database. What would be the outcome if someone could execute this script from their local command line and go about adding comments to the database? It would really mess up their systems. Or consider something like a file-hosting script; you can very well imagine the outcome if anyone could delete any file from their own CLI.
Solution!
The file can be executed if:
Either you have the script saved locally and run it,
Or you make a GET or POST request to the script with the required data and let it do the work.
Summing up
You can execute it only if the owner allows it (via GET and POST requests).
Refer to these:
Do a post request using
Guide to making requests
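For illustration, a minimal POST request using PHP's cURL extension might look like this (the URL and fields are placeholders):

$ch = curl_init('http://example.com/script.php');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, array('username' => 'user', 'password' => 'pass'));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($ch);
curl_close($ch);
echo $response;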
Not sure if I understand the use/purpose of PHP entirely, but it seems to me that a .php file only executes when it is called/executed by something before it, which could be an HTML or a PHP file.
A thought: is it possible to write a PHP file that is activated on its own, for example over a span of time, so that every week or so it would do something?
Thanks for looking...
You are looking for a cron job. This lets you save a scheduled entry on your remote server that executes based on the criteria you set. It can execute a variety of files, and PHP files are definitely among them.
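For example, a crontab entry like the following (the schedule and path are placeholders) would run a script once a week:

0 0 * * 0 /usr/bin/php /path/to/script.php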
As mentioned by nathan, you will be looking for a cron job. This is a server-side setting that will call a URL (or run a script) at a set interval.
You seem to not really understand how PHP works. PHP scripts run server-side before data is sent to the client, and they run once each time a client accesses the script.
What my page does is:
download an array from a different server (first.php)
the PHP script parses the values
the parsed values are sent with an AJAX call
on the next (AJAX-called) page (second.php) there are some MySQL queries
if the values pass a condition, they are written to the database
So, when I run my first.php, it loads second.php and everything's fine.
But what I want to know is: is it possible to have cron do this?
If not, what should I do?
Thanks.
There are certain things you need to understand in this regard.
The first is that PHP can run as either a web server module or as a standalone executable. When you run it as a web server module, you open it from the browser, and all the related web technologies (HTML/CSS/JS) get parsed and work in unison.
When you run it from the command line using cron, like say /usr/bin/php mywebpage.php, the PHP executable DOES NOT parse or understand the other web technologies, so your page will fail.
There are two workarounds for this:
Rewrite only those web-enabled parts so that the AJAX/JS work gets handled by PHP. The basic rule of thumb is that a CLI PHP script should contain ONLY core PHP. This is the preferred way: move the AJAX calls into the same file and make it a single execution flow, like any regular program (see the sketch after this list).
If for some reason you cannot do the above, you can try something like this:
/path/to/browser http://mysite/mywebpage.php. Here you are running a browser executable and handing it the webpage URL, so the page executes within the browser's environment, which can parse and understand the AJAX/JS calls.
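A rough sketch of the first workaround, with the AJAX round-trip replaced by inline code (the URL, table, and condition are made-up stand-ins for the question's actual logic):

// combined.php -- single execution flow suitable for CLI/cron
// Step 1: download the array from the other server.
$raw = file_get_contents('http://other-server.example.com/data.php');

// Step 2: parse the values (stand-in for the original parsing logic).
$values = json_decode($raw, true);

// Steps 3-4: what second.php did via AJAX now happens inline.
$db = new mysqli('localhost', 'user', 'pass', 'mydb');
foreach ($values as $value) {
    if ($value['score'] > 0) { // only values that pass the condition
        $stmt = $db->prepare('INSERT INTO results (score) VALUES (?)');
        $stmt->bind_param('i', $value['score']);
        $stmt->execute();
    }
}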
Yes, you can create a cron job in the following way:
1) Download the array from the different server (first.php).
2) Parse the values with the PHP script in first.php.
3) Include the second file, second.php, via include_once; it executes the MySQL queries.
4) If everything is correct, insert the values into the database.
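In outline, first.php would then end with something like this (a sketch; second.php is assumed to work on $values):

// first.php
$raw = file_get_contents('http://other-server.example.com/data.php');
$values = json_decode($raw, true);

// Pull in second.php so its queries run in this same CLI process,
// instead of being reached through an AJAX call.
include_once 'second.php';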
It sounds like you need a standalone JavaScript shell. There are a number listed at:
https://developer.mozilla.org/en-US/docs/JavaScript/Shells
The way my site is setup, I need to manually visit two URLs to trigger the mail system. One URL compiles a list of emails, another sends them off.
I'd like to automate this using a cron job, but here's the problem: I am using the Kohana framework, and I don't think copy-pasting the code from the controllers will work.
The easiest way to accomplish what I am doing is to have the two URLs visited every 5 minutes or so. Is it possible to "visit" (for a lack of better word) sites in PHP?
Yes. If you just use file_get_contents or access the URL with cURL, it will be considered "visited", since either way a plain GET request is made.
file_get_contents($url1);
file_get_contents($url2);
If you just want to 'visit' a web site you could retrieve it via file_get_contents(), or if you have the curl extension installed you could fire up a curl request at your URLS.
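With the cURL extension, the equivalent would be roughly this (the URL is a placeholder):

$ch = curl_init('http://example.com/mailer/compile');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_exec($ch);
curl_close($ch);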
If you are running the cron job on the same machine as the server you can call Kohana on the command line using this syntax.
/usr/bin/php index.php --uri=controller/action
Replace controller/action with the route you wish to call.
Note that any $_SERVER variables are not defined when you invoke Kohana in this manner.
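So a cron entry for this setup might look like the following (the interval and path are examples):

*/5 * * * * cd /path/to/kohana && /usr/bin/php index.php --uri=controller/action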
I'm using some PHP scripts from FeedForAll to join together RSS feeds (RSSmesh) and display them as HTML (RSS2HTML).
Because I intend to run these scripts fairly intensively and don't want the resulting HTTP requests and bandwidth to count towards my hosting quota, I am in the process of moving to running them on the web host's server in an umbrella PHP "batch" script, and calling this script via cron (this is a Linux server, by the way).
Here's a (working) sample request over HTTP:
http://www.mydomain.com/a/rss2htmlcore/rss2html2.php?XMLFILE=http://www.mydomain.com/a/myapp/xmlcache/feed.xml&TEMPLATE=template.html
This will produce the desired HTML output. An example of how I want this to work on the command line:
/srv/customers/mycustomer#/mydomain.com/www/a/rss2htmlcore/rss2html2-cli.php /srv/customers/mycustomer#/mydomain.com/www/a/myapp/xmlcache/feed.xml /srv/customers/mycustomer#/mydomain.com/www/a/template.html
This is with the correct shebang line added to "rss2html2-cli.php". I could just as well specify the executable ("/usr/local/bin/php") in the request, I doubt it makes a difference because I am able to run another script (that I wrote myself) either way without problems.
Now, RSS2HTML and RSSmesh are different in that, for starters, they include secondary files -- for example, both include an XML parser script -- and I suspect that this is where I am getting a bit in over my head.
Right now I'm calling exec() from the "umbrella" batch script, like so:
exec("/srv/customers/mycustomer#/mydomain.com/www/a/rss2htmlcore/rss2html2-cli.php /srv/customers/mycustomer#/mydomain.com/www/a/myapp/xmlcache/feed.xml /srv/customers/mycustomer#/mydomain.com/www/a/template.html", $output)
But no output is being produced. What's the best way to go about this and what "gotchas" should I keep in mind? Is exec() the right way to approach this? It works fine for the other (simple) script but that writes its own output. For this I want to get the output and write it to a file from within the umbrella script if possible. I've also tried output buffering but to no avail.
Do I need to pay attention to anything specific with regard to the includes? Right now they're specified in the scripts as include_once("FeedForAll_XMLParser.inc.php"); and the specified files are indeed in the same folder.
Further info:
- This is a Linux server.
- I have no direct access to the shell, so I can't test things directly on a command line; everything is via crontab.
- I will admit that support for the FeedForAll scripts leaves a lot to be desired, but I'd like to keep using their scripts if at all possible, if only because I know them and have been using them for a while. I have looked into SimplePie, but the FFA scripts do some things that I've seen no obvious solutions for with SimplePie, like limiting the number of items per individual feed (RSSmesh) or limiting the description length (RSS2HTML).
- Yahoo! Pipes is out; they cache their data for too long for my application.
Should you want to take a look at the code, here are the scripts as txt files. RSS2HTML2 and RSSmesh are the FeedForAll scripts; FeedForAll_XMLParser... is the included parser. Note that I have not yet amended these to handle $argv etc.; I have, however, done so in "scraper-universal-rss-cli", which works fine with CLI.
If anyone has any thoughts to share on this it would be very much appreciated. Thank you in advance.
I think the $hideErrors = 0; line in rss2html is not helping. Since isset() is used to check whether errors should be displayed, you should comment that line out; setting the variable to 0 does nothing, because a variable set to 0 still counts as set for isset().
Re-run and see if it throws up some errors for you.
Use wget or curl to issue the request against the local web server. Don't use CLI.
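For example, a cron entry along these lines (the schedule and output path are placeholders; the URL is the working request from the question) would fetch the page and save the resulting HTML:

0 * * * * wget -q -O /path/to/output.html "http://www.mydomain.com/a/rss2htmlcore/rss2html2.php?XMLFILE=http://www.mydomain.com/a/myapp/xmlcache/feed.xml&TEMPLATE=template.html"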