A little php script (logical help needed) - php

I am a .NET developer building an application for a company, and I need a small PHP script to support it.
My app needs to check some information from the internet that changes randomly, almost every second. My plan is to write a PHP script that gives the app the information it needs, using a simple text file instead of a MySQL database (I am free to use MySQL as well), with two PHP pages: for example, writer.php and reader.php.
writer.php is very simple: it saves the submitted data to the text file I want to use as a database.
reader.php reads the text file, outputs it as plain text, and empties the file on every read. This page will be read by my app.
That's all there is to it.
Now the logical questions:
reader.php will be read by 40 clients at the same time. Will there be any conflicts?
Will this method be faster than a MySQL database?
Will this method consume more resources than a MySQL database?

You will have to lock the file for the duration of each write (PHP's flock() function). This may slow things down when many clients connect at the same time, because while one request holds the lock, everyone else has to wait. Another problem that may appear when writing a lot of data is that the backlog of waiting writers can grow without bound if write requests arrive faster than they complete.
MySQL seems to be the better idea, as it caches both write and read requests, and it is designed to avoid simultaneous-access conflicts.
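If you do stay with a text file, the locking described above could look roughly like the sketch below. The function names append_message() and read_and_clear() are made up for illustration, and the code assumes a platform where flock() is supported.

```php
<?php
// Hypothetical helpers; names and file layout are assumptions, not from the question.

// writer.php side: append one line while holding an exclusive lock.
function append_message(string $path, string $line): bool
{
    $handle = fopen($path, 'c+'); // read/write, create if missing, never truncate on open
    if ($handle === false) {
        return false;
    }
    if (!flock($handle, LOCK_EX)) { // blocks until other lock holders are done
        fclose($handle);
        return false;
    }
    fseek($handle, 0, SEEK_END);
    fwrite($handle, $line . PHP_EOL);
    fflush($handle);
    flock($handle, LOCK_UN);
    fclose($handle);
    return true;
}

// reader.php side: return the contents and empty the file in one locked step.
function read_and_clear(string $path): string
{
    $handle = fopen($path, 'c+');
    if ($handle === false) {
        return '';
    }
    if (!flock($handle, LOCK_EX)) {
        fclose($handle);
        return '';
    }
    $contents = stream_get_contents($handle);
    ftruncate($handle, 0); // empty the file while the lock is still held
    fflush($handle);
    flock($handle, LOCK_UN);
    fclose($handle);
    return $contents === false ? '' : $contents;
}
```

Note that with read-and-empty semantics, when several clients call reader.php at nearly the same moment, only the first one to get the lock sees a given batch of data; the others get an empty response.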

Related

PHP variable authentication

Is it a bad idea to write a file with php for authentication?
An example:
A user submits a login form. If the credentials are invalid, the PHP script writes a new file whose filename is the attempted username and whose contents hold a variable containing the number of attempts. That file would then be included on the next login attempt, and if the number of login attempts is 2 or whatever, a reCAPTCHA is displayed.
Are there any obvious flaws with such a technique? I see most suggest using a database to store the login attempts and such, and I have no problem with doing it that way, but I was just curious.
A file is just another form of a database. If you implement this solution carefully, there is no real difference between implementing this via a database or via files.
The problem is the extra overhead of managing the sessions via files and writing all the code to do this properly.
In the end a database operates on files as well except for databases in memory.
Handling files yourself is not very efficient. Databases solve some complicated problems that you will face when reading and writing files yourself, as discussed in comparisons of databases vs. flat files.

Potentially write to same file in PHP multiple times at once?

I am using PHP's fputcsv to log votes in an application we are making. The saving part of my code roughly resembles this:
$handle = fopen('votes.csv', 'a');
fputcsv($handle, $data);
fclose($handle);
This works flawlessly in tests. However, I have a small concern. When deployed in production, it's possible that many many users will be making a request to this script at the same time. I am curious as to how PHP will handle this.
Will I potentially have problems, and lose votes because of that? If so, what can I do to prevent it? Is the solution more complex than simply using a database? And finally, how can I test for this situation, where a lot of requests would be made at the same time? Are there things that already exist to test this sort of stuff?
Writing to a file can cause issues with concurrent users. If you instead insert into a database, you can let the database itself handle the queue. If you run out of connections, it is easily tracked and you can see the load on the db as you go.
An insert into a database will be less resource-heavy than an append to a file. Having said that, you would need a pretty heavy load for either to matter, but with a database you have the built-in query queue to absorb a good portion of the concurrent stress.
When you send a request to a database, it actually goes into a queue for processing. It only fails to execute if there is a timeout in your PHP code (basically, PHP is told to stop waiting for the database to respond, and you can control this via PHP and Apache settings), so you have a fantastic built-in buffer.
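If the CSV file has to stay, one common mitigation is to hold an exclusive flock() around each append. The following is only a sketch: log_vote() is a hypothetical name, and the delimiter, enclosure, and escape arguments are passed explicitly just to avoid relying on defaults.

```php
<?php
// Hypothetical helper; in the question's code, $path would be 'votes.csv'.
function log_vote(string $path, array $row): bool
{
    $handle = fopen($path, 'a'); // append mode, creates the file if needed
    if ($handle === false) {
        return false;
    }
    if (!flock($handle, LOCK_EX)) { // wait until no other request is writing
        fclose($handle);
        return false;
    }
    $written = fputcsv($handle, $row, ',', '"', '\\');
    fflush($handle);                // push the row to disk before unlocking
    flock($handle, LOCK_UN);
    fclose($handle);
    return $written !== false;
}
```

To see how this behaves under simultaneous requests, a load tool such as ApacheBench (for example `ab -n 1000 -c 50` against the script's URL) can be used, followed by counting the rows in the file.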

max_execution_time Alternative

So here's the lowdown:
The client I'm developing for is on HostGator, which has limited max_execution_time to 30 seconds, and it cannot be overridden (I've tried, and confirmed this via their support and wiki).
What the code does is take an uploaded file and...
loop through the XML
get all feed download links within the file
download each XML file
individually loop through each XML array of each file and insert the information of each item into the database based on where it is from (i.e. the filename)
Now is there any way I can queue this somehow or split the workload into multiple files possibly? I know the code works flawlessly and checks to see if each item exists before inserting it but I'm stuck getting around the execution_limit.
Any suggestions are appreciated, let me know if you have any questions!
The time limit applies when PHP scripts are executed through a web server; the CLI defaults max_execution_time to 0 (no limit), so if you execute the script from the CLI or as a background process, it should work fine.
Note that executing an external script is somewhat dangerous if you are not careful enough, but it's a valid option.
Check the following resources:
Process Control Extensions
And specifically:
pcntl-exec
pcntl-fork
Did you know you can trick the max_execution_time by registering a shutdown handler? Within that code you can run for another 30 seconds ;-)
Okay, now for something more useful.
You can add a small queue table in your database to keep track of where you are in case the script dies mid-way.
After getting all the download links, you add those to the table
Then you download one file and process it; when you're done, you check it off (delete it from the queue)
Upon each run you check if there's still work left in the queue
For this to work you need to request that URL a few times; perhaps use JavaScript to keep reloading until the work is done?
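The queue steps above can be sketched as follows. The table name feed_queue, its single url column, both function names, and the 20-second default budget are all assumptions made for illustration; any PDO connection (MySQL in the question's case) would be passed in as $db.

```php
<?php
// Sketch only: table, column, and function names are hypothetical.

// Store the download links so that progress survives a killed request.
function enqueue_links(PDO $db, array $urls): void
{
    $db->exec('CREATE TABLE IF NOT EXISTS feed_queue (url VARCHAR(255) PRIMARY KEY)');
    $insert = $db->prepare('INSERT INTO feed_queue (url) VALUES (:url)');
    foreach ($urls as $url) {
        $insert->execute([':url' => $url]);
    }
}

// Work through the queue until it is empty or the time budget is spent.
// Returns how many URLs are still waiting, so the caller knows whether
// another request is needed.
function process_queue(PDO $db, callable $processFeed, float $budgetSeconds = 20.0): int
{
    $start  = microtime(true);
    $delete = $db->prepare('DELETE FROM feed_queue WHERE url = :url');

    while (microtime(true) - $start < $budgetSeconds) {
        $url = $db->query('SELECT url FROM feed_queue LIMIT 1')->fetchColumn();
        if ($url === false) {
            break; // nothing left to do
        }
        $processFeed($url);                  // download and import one feed
        $delete->execute([':url' => $url]);  // check it off only after success
    }

    return (int) $db->query('SELECT COUNT(*) FROM feed_queue')->fetchColumn();
}
```

The page that triggers the work could echo the returned count, and the reloading JavaScript would stop once it reaches 0. The budget only limits when a new feed is started, so it needs to leave enough headroom under the 30-second limit for the slowest single feed.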
I am in such a situation. My approach is similar to Jack's
accept that execution time limit will simply be there
design the application to cope with sudden exit (look into register_shutdown_function)
identify all time-demanding parts of the process
continuously save progress of the process
modify your components so that they are able to start from an arbitrary point, e.g. a position in an XML file, or to continue downloading from your list of XML links still to be fetched
For the task I made two modules, Import for the actual processing; TaskManagement for dealing with these tasks.
For invoking TaskManager I use CRON. Whether that is an option depends on what your web hosting offers you. There's also WebCron.
The advantage of Jack's JavaScript method is that it only adds requests if needed. With scheduled runs, if there are no tasks to be executed, the script runtime will be very short (so the concern is perhaps overstated*), but still. The downsides are that it requires the user to wait the whole time and not close the tab/browser, it needs JS support, etc.
*) Likely much less demanding than 1 click of 1 user in such moment
Then of course look into performance improvements, caching, skipping what's not needed/hasn't changed etc.
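The "continuously save progress" and "start from an arbitrary point" ideas can be sketched like this. The checkpoint file holding a single integer position, and the names load_checkpoint() and import_items(), are assumptions for illustration; $items is assumed to be a plain 0-indexed list.

```php
<?php
// Hypothetical sketch: checkpoint format and function names are assumptions.

// Read the saved position, or start from 0 if nothing has been saved yet.
function load_checkpoint(string $file): int
{
    return is_file($file) ? (int) file_get_contents($file) : 0;
}

// Import items starting at the saved position, saving progress after each one.
// Returns true when everything is done, false when the time budget ran out.
function import_items(array $items, string $checkpointFile, callable $importItem, float $budgetSeconds): bool
{
    $start    = microtime(true);
    $position = load_checkpoint($checkpointFile);
    $total    = count($items);

    while ($position < $total) {
        if (microtime(true) - $start >= $budgetSeconds) {
            return false; // out of time; the next run resumes at $position
        }
        $importItem($items[$position]);
        $position++;
        // Persist progress immediately, so a sudden exit loses at most one item.
        file_put_contents($checkpointFile, (string) $position, LOCK_EX);
    }
    return true;
}
```

If the script dies between importing an item and writing the checkpoint, that one item is imported again on the next run, so the import step should tolerate repeats (the question's existence check before inserting already does).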

Background PHP worker

I have a script that takes a while to run: it has to take data from the DB and transfer it to other servers.
At the moment it runs immediately after the form is submitted, and the page only reports that the data has been sent once the transfer has finished.
I was wondering, is there any way to make it not run the process in front of the client?
I don't want a cron job, as the data needs to be sent right away, just without the client having to wait for the page to load.
A couple of options:
Exec the PHP script that does the DB work from your webpage but do not wait for the output of the exec. Be VERY careful with this, don't blindly accept any input parameters from the user without sanitising them. I only mention this as an option, I would never do it myself.
Have your DB updating script running all the time in the background, polling for something that triggers its update. For example, it could check whether /tmp/run.txt exists and start the DB update if it does. You can then create run.txt from your webpage and return without waiting for a response.
Create your DB update script as a daemon.
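For the first option, a minimal sketch of launching a worker without waiting for it might look like this. It assumes a Unix-like host where a `php` CLI binary is on the PATH and exec() is not disabled; the function names and the worker script path are hypothetical.

```php
<?php
// Hypothetical helpers; assumes a Unix-like shell and a `php` binary on the PATH.

// Build the shell command, escaping the script path and every argument.
function build_background_command(string $script, array $args = []): string
{
    $parts = ['php', escapeshellarg($script)];
    foreach ($args as $arg) {
        $parts[] = escapeshellarg((string) $arg);
    }
    // Discarding the output and ending with "&" lets exec() return immediately.
    return implode(' ', $parts) . ' > /dev/null 2>&1 &';
}

// Start the worker and return without waiting for it to finish.
function start_in_background(string $script, array $args = []): void
{
    exec(build_background_command($script, $args));
}
```

Passing only a validated record ID (for example an integer cast from the form data) rather than raw user input keeps the warning above manageable; escapeshellarg() is a second line of defence, not a substitute for validation.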
Here are some things you can take a look at:
How much data are you transferring, and by transfer is it more like a copy-and-paste the data only, or are you inserting the data from your db into the destination server and then deleting the data from your source?
You can try analyzing your SQL to see if there's any room for optimization.
Then you can check your php code as well to see if there's anything, even the slightest, that might aid in performing the necessary tasks faster.
Where are the source and destination database servers located (in terms of network and geography, if you happen to know), and how fast can they communicate over the network?

Security implications of writing files using PHP

I'm currently trying to create a CMS using PHP, purely in the interest of education. I want the administrators to be able to create content, which will be parsed and saved on the server storage in pure HTML form to avoid the overhead that executing PHP script would incur. Unfortunately, I could only think of a few ways of doing so:
Setting write permission on every directory where the CMS should want to write a file. This sounds like quite a bad idea.
Setting write permissions on a single cached directory. A PHP script could then include or fopen/fread/echo the content from a file in the cached directory at request-time. This could perhaps be carried out in a Mediawiki-esque fashion: something like index.php?page=xyz could read and echo content from cached/xyz.html at runtime. However, I'll need to ensure the sanity of $_GET['page'] to prevent nasty variations like index.php?page=http://www.bad-site.org/malicious-script.js.
I'm personally not too thrilled by the second idea, but the first one sounds very insecure. Could someone please suggest a good way of getting this done?
EDIT: I'm not in favour of fetching data from the database. The only time I would want to fetch data from the database is when the content is being cached. Secondly, I do not have access to memcached or any PHP accelerator.
Since you're building a CMS, you'll have to accept that if the user wants to do evil things to visitors, they very likely can. That's true regardless of where you store your content.
If the public site is all static content, there's nothing wrong with letting the CMS write the files directly. However, you'll want to configure the web server to not execute anything in any directory writable by the CMS.
Even though you don't want to hit the database every time, you can set up a cache to minimize database reads. Zend_Cache works very nicely for this, and can be used quite effectively as a stand-alone component.
You should put your pages in a database and retrieve them using parameterized SQL queries.
I'd go with the second option but modify it so the files are retrieved using mod_rewrite rather than a custom php function.
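Whichever way the cached files end up being served, if a PHP script maps a page parameter to a file, the name can be checked against a strict whitelist before it touches the filesystem. The sketch below is one way to do that; cached_page_path(), the allowed character set, and the 64-character limit are assumptions chosen for illustration.

```php
<?php
// Hypothetical helper: resolve a requested page name to a file in the cache
// directory, or return null if the name is not acceptable.
function cached_page_path(string $cacheDir, string $page): ?string
{
    // Allow only letters, digits, underscore, and hyphen. This rejects
    // slashes, dots, colons, and therefore URLs and "../" traversal.
    if (!preg_match('/\A[a-z0-9_-]{1,64}\z/i', $page)) {
        return null;
    }
    $path = rtrim($cacheDir, '/') . '/' . $page . '.html';
    return is_file($path) ? $path : null;
}

// Possible use in index.php (commented out so this file only defines the helper):
// $path = cached_page_path(__DIR__ . '/cached', $_GET['page'] ?? '');
// if ($path === null) { http_response_code(404); exit; }
// readfile($path);
```

Using readfile() rather than include means the cached file is sent as-is and never executed as PHP, which fits the advice about not executing anything from a directory the CMS can write to.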
