PHP efficiency question: database call vs. file write vs. calling a C++ executable

What I wish to achieve is to log information about each and every visit to every page of my website (IP address, browser, referring page, etc.). That part is easy to do.
What I am interested in is doing it in a way that causes minimum overhead (runtime) in the PHP scripts. What is the best approach efficiency-wise:
1) Log all information to a database table
2) Write to a file (from php directly)
3) Call a C++ executable that will write this info to a file in parallel, so the script can continue execution without waiting for the file write to occur (is this even possible?)
I may be trying to optimize unnecessarily/prematurely, but any thoughts or ideas on this would be appreciated. (I think the efficiency of file writes/logging could become a real concern at, say, 100 visits per minute...)
Thanks & Regards,
JP

You already have a C++ executable that logs every hit to your site: it's called the web server, and it writes an access log for you.

1) Robust, but could be a pain to set up and maintain.
2) Be careful with concurrency: what happens if two users hit your PHP script simultaneously and the file is already open for writing? (An exclusive lock on the append avoids this; see the sketch below.)
3) Same as 2, except the failure will occur inside the C++ executable.
I would suggest using a logging framework such as log4php.
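If you do go with option 2 and write from PHP directly, here is a minimal sketch; the log path is an assumption, and FILE_APPEND combined with LOCK_EX takes care of the concurrent-write problem from point 2:

$ua  = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '-';
$ref = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '-';
$line = date('c') . "\t" . $_SERVER['REMOTE_ADDR'] . "\t" . $ua . "\t" . $ref . "\n";
// append atomically under an exclusive lock so parallel requests cannot interleave
file_put_contents('/var/log/mysite/visits.log', $line, FILE_APPEND | LOCK_EX);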


PHP - Multiple scripts at once (AJAX)

After asking this question, someone pointed me in the right direction: a second script could not execute at all while the first one was still running.
I usually build apps that rely on AJAX calls to PHP pages, and today I found that writing to a file with fwrite() in one PHP script while trying to read that same file with fread() (to get progress feedback) in another AJAX call ended up with the second script only executing once the first one had finished.
Even trying to echo a simple "hello" (echo "hello"; exit;) would not show anything on the page until the first script was finished.
So I'm asking: is this the normal configuration? Is this the default on every installation of PHP? Is there some setting in php.ini that I can change?
Or does it have to do with the server (in my case, Microsoft IIS 10)? Can someone shed some light on how to execute multiple PHP scripts from different AJAX calls at once (or before the others finish)?
I know I'm not giving much information about my setup, but I don't even know where to look.
Thank you everyone for your time and help!
As Luis said, it could be a write lock on the file you're trying to modify. However, there is another possibility: if you're using file-based sessions (rather than a database), or a framework that uses file-based sessions, then this behavior could also be the result of session locking. My money is on Luis' answer, though: you should probably be using a database rather than a file unless you have a solid reason not to.
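If it does turn out to be session locking, the usual fix is to release the lock before the slow work begins, so parallel AJAX calls stop being serialized. A minimal sketch; long_running_work() is a placeholder for whatever the first script spends its time on:

session_start();
// read anything you need from the session first...
session_write_close(); // releases the session file lock held by this request
long_running_work();   // placeholder for the slow part (e.g. the fwrite() loop)
echo "done";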

Scheduled task (cronjob) with PHP

I'm creating a website that requires a file to be generated and stored on the server periodically (an XML feed for iTunes). The page is generated using ExpressionEngine. I discovered that the website's current server has a very restricted cPanel and doesn't have access to cron.
So I'm considering two options: find an alternative way to access the cron jobs (if they are available at all), or find an alternative way to create regularly scheduled tasks.
Regarding the first option, how would I go about determining whether a server has cron available? I'm not sure how useful this would be anyway, since I don't think the server allows shell access (it's a very basic setup for people who aren't tech savvy).
Regarding the second option, a friend mentioned to me that the functionality of cron jobs can simply be done in PHP. How would I go about that?
Or am I perhaps overthinking this? The page in ExpressionEngine that outputs the XML file is domain.com/itunes/itunes_feed. It just has some EE tags that output the relevant XML, and the resulting page is in .xml format. Is it enough to just submit the above URL to iTunes, or does it have to be a URL to an actual pre-existing file on the server?
Option 1
Simply contact your hosts and ask them do they support cron jobs, and if so, how to set up.
Option 2
I only just set up my own set of cron jobs yesterday.
1. Create a PHP file that runs the code you want.
2. Set up an account on https://www.easycron.com/
3. Upload your PHP file to easycron.
4. Set the times at which you would like your PHP code to run.
Simple as that! Does that make sense?
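As for your friend's suggestion of doing cron's job in PHP itself, the usual trick is lazy regeneration: on each request, rebuild the file only if the cached copy is stale. A sketch under assumptions (the cache location, the one-hour interval, and fetching your own EE page over HTTP):

$cache  = __DIR__ . '/itunes_feed.xml';
$maxAge = 3600; // regenerate at most once per hour
if (!file_exists($cache) || time() - filemtime($cache) > $maxAge) {
    // let ExpressionEngine render the feed, then store the result
    $xml = file_get_contents('http://domain.com/itunes/itunes_feed');
    if ($xml !== false) {
        file_put_contents($cache, $xml);
    }
}

The catch is that regeneration only happens when a request comes in, which is usually acceptable for a feed.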

Emulating PHP's CLI in a browser

I'm considering the idea of a browser-based PHP IDE and am curious about the possibility of emulating the command line through the browser, but I'm not familiar enough with developing tools for the CLI to know if it's something that could be done easily or at all. I'd like to do some more investigation, but so far haven't been able to find very many resources on it.
From a high level, my first instinct is to set up a text input which would feed commands to a PHP script via AJAX and return any output onto the page. I'm just not familiar enough with the CLI to know how to interface with it in that context.
I don't need actual code, though that would be useful too, but I'm looking for more of which functions, classes or APIs I should investigate further. Ideally, I would prefer something baked into PHP (assume PHP 5.3) and not a third-party library. How would you tackle this? Are there any resources or projects I should know about?
Edit: The use case for this would be a localhost or development server, not a public facing site.
Call a backend function through RPC or a direct POST from JavaScript, and have it do things in this order:
Write the PHP code to a file (with a random name) in a folder (with a random name), where it will sit alone, be executed, and then be deleted at the end of execution.
The current PHP process should not run the code in that file itself. Instead, it needs exec permissions (safe_mode off) to launch a separate interpreter with a locked-down configuration: exec('php -c /path/to/security_tight/php.ini ' . $tempfile) (see php -?).
Catch any output and send it back to the browser; that way you are protected from any weird errors. Instead of exec() I recommend popen(), so you can kill the process and control the timeout while waiting for it to finish (if you do kill the process, you can easily send an error back to the browser).
You need lax/normal security (the same as for the entire IDE backend) for the normal PHP process that runs when called through the browser.
You need strict, paranoid security for the php.ini and the PHP process that runs the temporary script (go ahead and even separate it onto another machine that has no network/internet access and has its state reverted to factory settings every hour, just to be sure).
Don't use eval(); it is not suitable for this scenario. An attacker could jump out into your application and use your current permissions and variable state against you.
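A sketch of that flow; the php.ini path is the one from above, and everything else (the directory naming, the 5-second budget) is an assumption. proc_open() is used rather than plain popen() so the child can be killed on timeout:

// $code holds the submitted PHP source
$dir  = sys_get_temp_dir() . '/ide_' . uniqid();
mkdir($dir, 0700);
$file = $dir . '/snippet.php';
file_put_contents($file, $code);

$cmd   = 'php -c /path/to/security_tight/php.ini ' . escapeshellarg($file);
$pipes = array();
$proc  = proc_open($cmd, array(1 => array('pipe', 'w'), 2 => array('pipe', 'w')), $pipes);

stream_set_blocking($pipes[1], false);
stream_set_blocking($pipes[2], false);
$output = '';
$start  = time();
while (true) {
    $output .= stream_get_contents($pipes[1]);   // collect whatever is ready
    $output .= stream_get_contents($pipes[2]);
    $status  = proc_get_status($proc);
    if (!$status['running']) {
        break;                                   // the snippet finished on its own
    }
    if (time() - $start > 5) {
        proc_terminate($proc);                   // kill a runaway snippet
        $output .= "\n[timed out]";
        break;
    }
    usleep(100000);
}
fclose($pipes[1]);
fclose($pipes[2]);
proc_close($proc);
unlink($file);
rmdir($dir);
echo $output;   // send the result back to the browser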
The basic version would be:
your script outputs a form with a line input
the form action points back to your script
the script takes the input from the form and passes it to eval()
any output from eval() is passed on to the browser
the form is output again
The problem is that defined functions and variables are lost between requests.
What you could do is append each line that is entered to the session. For example:
session_start();
if (!isset($_SESSION['script'])) { $_SESSION['script'] = ''; }
$inputline = $_GET['line'];
$_SESSION['script'] .= $inputline . PHP_EOL; // accumulate the whole script so far
eval($_SESSION['script']);                   // re-run it from the top
This way, on each request the full PHP script accumulated so far is executed (and of course you will get the full output each time).
Another option would be to create some kind of daemon (basically an instance of a php -a call) that runs on the server in the background, receives your input from the browser, and passes the output back.
You could connect this daemon to two FIFO devices (one for input, one for output) and communicate with it via simple fopen() calls.
For each user of your script, a new daemon process has to be spawned.
Needless to say, it is important to secure your script against abuse.
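The web-facing side of that FIFO idea might look like this; the FIFO paths are made up, and the daemon itself (which reads one pipe and writes the other) is assumed to be running already:

// one-time setup, e.g. in the daemon's start script:
//   posix_mkfifo('/tmp/php_repl_in', 0600);
//   posix_mkfifo('/tmp/php_repl_out', 0600);

// forward one line of input to the daemon
$in = fopen('/tmp/php_repl_in', 'w'); // blocks until the daemon opens the read end
fwrite($in, $_POST['line'] . PHP_EOL);
fclose($in);

// read back one line of output and hand it to the browser
$out = fopen('/tmp/php_repl_out', 'r');
echo fgets($out);
fclose($out);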
Recently I read about php.js, a PHP interpreter written in JavaScript, so you could write and execute PHP code using only your browser. I'm not sure if this is what you need in the end, but it sounds interesting.
We've tested some products at my university for ssh-accessing our lab servers and used some of the Web-SSH-Tools - they basically do exactly what you want. The Shell-In-A-Box-Project may be bound to any interpreter you like and may be used with an interactive php-interpreter, if desired (on the demo-page, they used a basic-interpreter). The project may serve as a basis for a true PHP-IDE. These have the advantage of being capable of interacting with any console-based editor as well (e.g. vi, emacs or nano), as well as being able to give administrative commands (e.g. creating folders, changing ownerships or ACLs or rebooting a service).
Mozilla also has a full-featured web-based IDE called Bespin, which is highly extensible and configurable.
As you stated that the page is not for the public, you will of course have to protect it with authentication and SSL to combat session hijacking.

How are AJAX progress indicators for modern PHP web applications implemented?

I've seen many web apps that implement progress bars, however, my question is related to the non-uploading variety.
Many PHP web applications (phpBB, Joomla, etc.) implement a "smart" installer to not only guide you through the installation of the software, but also keep you informed of what it's currently doing. For instance, if the installer was creating SQL tables or writing configuration files, it would report this without asking you to click. (Basically, sit-back-and-relax installation.)
Another good example is Joomla's Akeeba Backup (formerly Joomla Pack). When you perform a backup of your Joomla installation, it makes a full archive of the installation directory. This, however, takes a long time and hence requires progress updates. But the server itself has a limit on PHP script execution time, so it seems that either:
The backup script is able to bypass it.
Some temp data is stored so that the archive is appended to (if archive appending is possible).
Client scripts call the server's PHP every so often to perform actions.
My general guess (not specific to Akeeba) is #3, that is:
Web page JS -> POST foo/installer.php?doaction=1 SESSID=foo2
Server -> ERRCODE SUCCESS
Web page JS -> POST foo/installer.php?doaction=2 SESSID=foo2
Server -> ERRCODE SUCCESS
Web page JS -> POST foo/installer.php?doaction=3 SESSID=foo2
Server -> ERRCODE SUCCESS
Web page JS -> POST foo/installer.php?doaction=4 SESSID=foo2
Server -> ERRCODE FAIL Reason: Configuration.php not writable!
Web page JS -> Show error to user
I'm 99% sure this isn't the case, since that would create a very nasty dependency on the user having JavaScript enabled.
I guess my question boils down to the following:
How are long-running PHP scripts (on web servers, of course) handled, and how do they "stay alive" past the PHP maximum execution time? If they don't "cheat", how do they split up the task at hand? (I notice that Akeeba Backup does acknowledge the PHP maximum execution time limit, but I don't want to dig too deep to find such code.)
How is the progress displayed via AJAX+PHP? I've read that people use a file to indicate progress, but to me that seems "dirty" and puts a bit of strain on I/O, especially on live servers with 10,000+ visitors running the aforementioned script.
The environment for this script is where safe_mode is enabled, and the limit is generally 30 seconds. (Basically, a restrictive, free $0 host.) This script is aimed at all audiences (will be made public), so I have no power over what host it will be on. (And this assumes that I'm not going to blame the end user for having a bad host.)
I don't necessarily need code examples (although they are very much appreciated!), I just need to know the logic flow for implementing this.
Generally, this sort of thing is stored in the $_SESSION variable. As far as the execution timeout goes, what I typically do is have a JavaScript timer that, every x seconds, fetches a small PHP status script and writes its output into a status div. That script doesn't "wait" or anything like that; it merely grabs the current status from the session (which is updated by the script(s) actually performing the installation), then outputs it in whatever fancy form I see fit (status bar, etc.).
I wouldn't recommend any direct file I/O for status updates. You're correct that it is messy and inefficient. I'd say $_SESSION is definitely the way to go here.
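A sketch of that pattern with made-up file names; note the session_write_close() calls, without which the polling script would block on the session lock while the worker runs:

// worker.php: performs the installation in chunks
session_start();
$steps = get_install_steps();              // hypothetical list of callables, one per chunk
foreach ($steps as $i => $step) {
    $_SESSION['progress'] = (int) (100 * $i / count($steps));
    session_write_close();                 // let status.php read the session meanwhile
    $step();                               // do this chunk of work
    session_start();                       // re-open before the next update
}
$_SESSION['progress'] = 100;

// status.php: fetched by the JavaScript timer every few seconds
session_start();
echo isset($_SESSION['progress']) ? $_SESSION['progress'] : 0;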

Server file read/write concurrency issue

I have a web service serving data from a MySQL database. I would like to create a cache file to improve performance. The idea is that once in a while we read data from the DB and generate a text file. My question is:
What if a client-side user is accessing the file while we are generating it?
We are using LAMP. In PHP, flock() handles concurrency problems, but my understanding is that it only applies when two PHP processes access the file simultaneously. Our case is different.
I don't know whether this will cause issues at all. If so, how can I prevent it?
Thanks,
Don't use locking. If your cache file is /tmp/cache.txt, then you should always regenerate the cache to /tmp/cache2.txt and then do a
mv /tmp/cache2.txt /tmp/cache.txt
or
rename('/tmp/cache2.txt', '/tmp/cache.txt');
The mv/rename operation is atomic as long as it happens inside the same filesystem; no locking is needed.
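In PHP, the whole regenerate-and-swap might look like this (build_cache_from_db() is a placeholder for your DB query):

// write the fresh copy under a temporary name...
file_put_contents('/tmp/cache2.txt', build_cache_from_db());
// ...then swap it in; readers see either the old file or the new one, never a partial write
rename('/tmp/cache2.txt', '/tmp/cache.txt');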
There are all sorts of optimisation options here:
1) Are you using the MySQL query cache? That can take a huge load off the database to start with.
2) You could pull the file through a web proxy like Squid (or Apache configured as a reverse caching proxy). I do this all the time and it's a really handy technique: generate the file by fetching it from a URL using wget, for example (that way you can put it in a cron job). The web proxy takes care of either delivering the same file that was there before or regenerating it as needed.
3) You don't want to be rolling your own file-locking solution in this scenario.
Depending on your scenario, you could also consider caching pages in something like memcache, which is fantastic for high-traffic scenarios but possibly beyond the scope of this question.
You can use A/B switching to avoid this issue.
E.g.: keep two copies of the cache file, A and B, and have the program read them via a symlink, C.
When the program rebuilds the cache, it modifies the file that is not "current", i.e. if C links to A, it updates B. Once the update is complete, it switches the symlink to B.
Next time, it updates A and switches the symlink back to A once the update is complete.
This way clients never read a file while it is being updated.
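A sketch of that switch; the paths are assumptions (and build_cache_from_db() is again a placeholder). Note that the symlink itself is replaced via rename(), so the repointing is atomic too:

$link    = '/var/cache/feed.txt';                  // the symlink C that clients read
$current = readlink($link);
$target  = ($current === '/var/cache/feed_A.txt')
    ? '/var/cache/feed_B.txt'
    : '/var/cache/feed_A.txt';

file_put_contents($target, build_cache_from_db()); // update the copy not in use
symlink($target, $link . '.new');                  // stage the new link next to C
rename($link . '.new', $link);                     // atomically repoint C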
When a client-side user accesses the file, it is read as it is at that moment.
flock() is for when two PHP processes access the file simultaneously.
I would solve it like this:
While generating the new text file, save it to a temporary file (cache.tmp); that way the old file (cache.txt) is still accessed as before.
When generation is done, delete the old file and rename the new one.
To avoid problems during that short window, your code should check whether cache.txt exists and retry for a short period of time if it doesn't.
Trivial, but that should do the trick.
