I'm using Zend Framework and Zend_Cache with the File backend.
According to the ZF manual the recommended place to store the cache files would be under /data/cache but I'm thinking it would make more sense to store them under /temp/cache. Why is /data/cache preferred?
Here is a link to the part of ZF manual I mentioned:
http://framework.zend.com/manual/en/project-structure.project.html
I guess you're talking about these recommendations: Recommended Project Directory Structure.
The interesting parts are:
data/: This directory provides a place to store application data that
is volatile and possibly temporary. The disturbance of data in this
directory might cause the application to fail. Also, the information
in this directory may or may not be committed to a subversion
repository. Examples of things in this directory are session files,
cache files, sqlite databases, logs and indexes.
temp/: The temp/ folder is set aside for transient application data. This information would not typically be committed to the
applications svn repository. If data under the temp/ directory were
deleted, the application should be able to continue running with a
possible decrease in performance until data is once again restored or
recached.
As you can see, Zend doesn't recommend storing your Zend_Cache data only in data/cache/; it could just as well go under the temp/ directory. The real questions are: should these cache files be committed, and are they necessary for the application to run correctly? Once you have answered those, you know where to put your cache files. In my opinion, in most cases cached data should be stored under the temp/ directory.
Finally, remember that this is only a recommendation; you are always free to do it the way you want.
I can't find the part of the Zend_Cache manual that recommends using data/cache as the cache directory; maybe you could link to it. I did find some examples that use ./temp/.
Either way, Zend_Cache doesn't care where you decide to store the cache files; it is up to you. You just need to make sure that the directory is readable and writable by PHP.
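As a minimal sketch of that last point, the check below uses only plain PHP functions and does not touch Zend_Cache itself. The helper name ensure_cache_dir() and the directory location are assumptions made up for illustration.

```php
<?php
// Hypothetical helper: make sure a cache directory exists and is usable by PHP.
function ensure_cache_dir(string $dir): bool
{
    // Create the directory (and any missing parents) if it is not there yet.
    if (!is_dir($dir) && !mkdir($dir, 0775, true) && !is_dir($dir)) {
        return false;
    }
    // The File backend needs both read and write access.
    return is_readable($dir) && is_writable($dir);
}

// Assumed location, chosen only so the sketch runs anywhere.
$cacheDir = sys_get_temp_dir() . '/myapp/temp/cache';
$ok = ensure_cache_dir($cacheDir);
```

If $ok is false, the path or its permissions need fixing before handing the directory to the cache backend.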
Currently I'm storing the configuration for my PHP scripts in variables and constants within another PHP script (e.g. config.php).
So each time a script is called, it includes the configuration script to gain access to the values of the variables/constants.
Since INI files are easier for other scripts to parse, I thought about storing my configuration values in such a file and reading it using parse_ini_file().
As I understand it, PHP keeps script files in memory, so including a script file does (usually) not cause I/O. (Or does Zend do the caching? Or are the sources not cached at all?)
What about reading custom INI files? I know that there is caching for .user.ini (see user_ini.cache_ttl), but does PHP also cache custom INI files, or does a call to parse_ini_file() always cause I/O?
Summary
The time required to load configuration directives (which is not the same as the time needed by the app to perform those directives) is usually negligible - below one millisecond for most "reasonably sized" configurations. So don't worry - INI, PHP, or JSON are, performance wise, all equally good choices. Even if PHP were ten times faster than JSON, that would be like loading in 0.001s instead of 0.01s; very few will ever notice.
That said, there are considerations when deciding where to store config data.
.ini vs .php config storage
Time to load: mostly identical unless caching is involved (see below), and as I said, not really important.
ease of use: .ini is easier to read and modify for a human. This may be an advantage, or a disadvantage (if the latter, think integrity check).
data format: PHP can store more structured data than .ini files, unless really complicated workarounds are used. But consider the possibility of using JSON instead of INI.
More structured data means that you can more easily create a "super configuration" PHP or JSON holding the equivalent of several INI files, while keeping information well isolated.
automatic redundancy control: PHP file inclusion can be streamlined with require_once.
user modifications: there are visual INI and JSON editors that let a user modify an INI or JSON file while keeping it, at least, syntactically valid. Not so for PHP (you would need to roll your own).
Caching
The PHP core does not do caching. Period. That said, you'll never use the PHP core alone: it will be loaded as a (fast)CGI, an Apache module, et cetera. Also you might not use a "barebones" installation but you could have (chances are that you will have) several modules installed.
Both the "loader" part and the "module" part might do caching, and if both do, the result can be unnecessary duplication or conflicts, so it is worth checking this out:
the file (but this does not change between INI, JSON and PHP files) will be cached into the filesystem I/O subsystem layer and, unless memory is really at a premium, will be loaded from there (on a related note, this is one of the reasons why not all filesystems are equally good for all websites).
if you need the configuration in several files, and use require_once in all of them, the configuration will be loaded once only, as soon as it is needed. This is not caching, but it is a performance improvement nonetheless.
several modules exist (Zend, opcache, APC, ...) that will cache all PHP files, configuration included. They will not cache INI files, though.
the caching done by modules (e.g. opcache) can (a) ignore further modifications to the file system, which means that after modifying a PHP file you'll need to reload or invalidate the cache somehow (how to do this varies from module to module); (b) implement shortcuts that might conflict with either the file system data management or its file structure (famously, opcache can ignore the path part of a file, which allows much faster performance unless you have two files with the same name in different directories, in which case it risks loading one instead of the other).
Performance enhancement: cache digested data instead of config directives
Quite often, depending on some config directive, you will have to perform one of several nontrivial operations. Then you will use the results for the actual output.
What slows down the workflow in this case is not reading whether, say, "config.layout" is "VERTICAL" or "HORIZONTAL", but actually generating the layout (or whatever else). In this case you might reap huge benefits by storing the generated object somewhere:
serialized inside a file (e.g. cache/config.layout.vertical.html.gz). You will probably need to deploy some kind of 'stale data check' if the layout changes, or some kind of cache invalidation procedure. (For layouts specifically, you could check out Twig, which also does parameterized template caching).
inside a keystore, such as Redis.
in an RDBMS such as MySQL (even if that's overkill - you'd use it as a keystore, basically).
faster NoSQL alternatives such as MongoDB.
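The file-based option with a 'stale data check' could be sketched as follows. This is only an illustration under assumptions: the helper name cached_layout(), the file names, and the rule "regenerate when the config file is newer than the cached copy" are made up here, and a real application may need a finer invalidation rule.

```php
<?php
// Hypothetical helper: return the cached output, regenerating it when the
// config file has been modified after the cached copy was written.
function cached_layout(string $cacheFile, string $configFile, callable $generate): string
{
    $fresh = is_file($cacheFile)
        && filemtime($cacheFile) >= filemtime($configFile);
    if ($fresh) {
        return file_get_contents($cacheFile);
    }
    $output = $generate();
    // Write under a temporary name, then rename, so readers never see a partial file.
    $tmp = $cacheFile . '.' . uniqid('', true) . '.tmp';
    file_put_contents($tmp, $output);
    rename($tmp, $cacheFile);
    return $output;
}

// Usage with throwaway files (all names are assumptions).
$dir = sys_get_temp_dir() . '/layout_cache_' . uniqid();
mkdir($dir);
$config = $dir . '/config.ini';
file_put_contents($config, "layout = VERTICAL\n");

$calls = 0;
$build = function () use (&$calls) {
    $calls++;   // counts how often the expensive step really runs
    return '<div class="vertical">...</div>';
};

$first  = cached_layout($dir . '/config.layout.vertical.html', $config, $build);
$second = cached_layout($dir . '/config.layout.vertical.html', $config, $build);
```

The second call reads the stored file, so the expensive generator runs only once until the config file changes.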
Additional options
You will probably want to read about client caching and headers, and possibly explore whatever options your hosting offers (load balancers, HTTP caches such as Varnish, etc.).
parse_ini_file() uses standard operations to convert the file into an array.
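If the repeated parsing ever matters, one possible workaround is to keep the INI file as the editable source and store the parsed array as a plain PHP file, which an opcode cache is able to hold. The sketch below assumes made-up names (load_config(), the temp file paths); whether an opcode cache notices a rewritten file depends on its settings.

```php
<?php
// Hypothetical helper: parse the INI file only when the PHP copy is missing or older.
function load_config(string $iniFile, string $phpCache): array
{
    if (is_file($phpCache) && filemtime($phpCache) >= filemtime($iniFile)) {
        return require $phpCache;   // plain PHP file that returns an array
    }
    $config = parse_ini_file($iniFile, true);   // true = keep [sections]
    if ($config === false) {
        throw new RuntimeException("Cannot parse $iniFile");
    }
    // var_export() produces valid PHP source for the array.
    file_put_contents($phpCache, '<?php return ' . var_export($config, true) . ';', LOCK_EX);
    return $config;
}

// Usage with throwaway files (paths are assumptions).
$ini = sys_get_temp_dir() . '/app_' . uniqid() . '.ini';
file_put_contents($ini, "[database]\nhost = localhost\nport = 3306\n");
$fromIni   = load_config($ini, $ini . '.php');   // parses the INI file
$fromCache = load_config($ini, $ini . '.php');   // loads the generated PHP file
```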
First Some Background
I'm planning out the architecture for a new PHP web application and trying to make it as easy as possible to install. As such, I don't care what web server the end user is running so long as they have access to PHP (setting my requirement at PHP5).
But the app will need some kind of database support. Rather than working with MySQL, I decided to go with an embedded solution. A few friends recommended SQLite - and I might still go that direction - but I'm hesitant since it needs additional modules in PHP to work.
Remember, the aim is ease of installation ... most lay users won't know what PHP modules their server has or even how to find their php.ini file, let alone enable additional tools.
My Current Objective
So my current leaning is to go with a filesystem-based data store. The "database" would be a folder, each "table" would be a specific subfolder, and each "row" would be a file within that subfolder. For example:
/public_html
/application
/database
    /table
        1.data
        2.data
    /table2
        1.data
        2.data
There would be other files in the database as well to define schema requirements, relationships, etc. But this is the basic structure I'm leaning towards.
I've been pretty happy with the way Microsoft built their Office Open XML file format (.docx/.xlsx/etc). Each file is really a ZIP archive of a set of XML files that define the document.
It's clean, easy to parse, and easy to understand.
I'd like to actually set up my directory structure so that /database is really a ZIP archive that resides on the server - a single, portable file.
But as the data store grows in size, won't this begin to affect performance on the server? Will PHP need to read the entire archive into memory to extract it and read its composite files?
What alternatives could I use to implement this kind of file structure but still make it as portable as possible?
SQLite has been enabled by default since PHP 5, so almost all PHP 5 users should have it.
I think there will be tons of problems with the zip approach, for example adding a file to a relatively large zip archive is very time consuming. I think there will be horrible concurrency and locking issues.
Reading zip files requires a php extension anyway, unless you went with a pure PHP solution. The downside is most php solutions WILL want to read the whole zip into memory, and will also be way slower than something that is written in C and compiled like the zip extension in PHP.
I'd choose another approach, or make SQLite/MySQL a requirement. If you use PDO for PHP, then you can allow the user to choose SQLite or MySQL and your code is no different as far as issuing queries. I think 99%+ of webhosts out there support MySQL anyway.
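A rough sketch of the PDO point: only the DSN changes between back ends, while the prepare/execute calls stay the same. The factory name open_store(), the table, and the MySQL DSN in the comment are assumptions for illustration; note that DDL such as auto-increment columns still differs between SQLite and MySQL.

```php
<?php
// Hypothetical factory: the DSN is the only thing that differs between back ends.
function open_store(string $dsn, ?string $user = null, ?string $pass = null): PDO
{
    $pdo = new PDO($dsn, $user, $pass);
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    return $pdo;
}

// SQLite in memory here; a MySQL DSN such as 'mysql:host=localhost;dbname=app'
// (assumed names) plus credentials would be passed the same way.
$db = open_store('sqlite::memory:');
$db->exec('CREATE TABLE notes (id INTEGER PRIMARY KEY, body VARCHAR(255))');

// The query code below does not depend on which driver is behind $db.
$insert = $db->prepare('INSERT INTO notes (body) VALUES (:body)');
$insert->execute([':body' => 'hello']);

$select = $db->prepare('SELECT body FROM notes WHERE id = :id');
$select->execute([':id' => 1]);
$body = $select->fetchColumn();
```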
Using a real database will also help your performance. It's worth loading the extra modules (most PHP installations have at least the mysql module and probably sqlite as well), because those modules are written in C, run much faster than PHP code, and have been optimized for speed. Using SQLite will help keep your web app portable, if you're willing to deal with sqlite BS.
Zip archives are great for data exchange. They aren't great for fast access, though, and they're awful for rewriting content. Both of these are extremely important for a database used by a web application.
Your proposed solution also has some specific performance issues -- the list of files in a zip archive is internally stored as a "flat" list, so accessing a file by name takes O(n) time relative to the size of the archive.
I am currently building a caching component for my applications. It will have support for different adapters:
APC
Memcached
Files
For all of them, I need to generate a cache key. What's the best way to do this? I am considering concatenating the function name and arguments and then running md5() on the result. Is this a good strategy?
Finally, when caching objects as files to disk, how should the cache files be organized? I have a feeling that having a cache folder and just throwing all the cache files in there would probably be pretty bad performance.
The application will be hosted on Linux and Windows servers.
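Both md5() and sha1() fit your need to name cache files, since they both perform well.
When saving the cache files to the file system, you can take a cue from how git stores its files: it uses the first two hex characters of the hash as a subdirectory name, so no single directory grows too large.
Useful links: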
Benchmark: http://www.cryptopp.com/benchmarks.html
How git stores objects: http://book.git-scm.com/7_how_git_stores_objects.html
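Putting the two ideas together, a minimal sketch might look like this. The helper names cache_key() and cache_path() and the base directory are assumptions; serialize() is used instead of plain concatenation so that argument types and boundaries stay distinct, and it will not work for arguments that cannot be serialized (closures, for example).

```php
<?php
// Hypothetical helper: build a deterministic key from a function name and its arguments.
function cache_key(string $function, array $args): string
{
    // serialize() keeps types and order, so f(42) and f('42') get different keys.
    return md5($function . ':' . serialize($args));
}

// Hypothetical helper: spread cache files over subdirectories, git-style,
// using the first two hex characters of the key as the directory name.
function cache_path(string $baseDir, string $key): string
{
    return $baseDir . '/' . substr($key, 0, 2) . '/' . substr($key, 2);
}

$key  = cache_key('getUser', [42]);
$path = cache_path('/var/cache/app', $key);   // assumed base directory
```

The same key can be passed unchanged to APC or Memcached; only the file adapter needs cache_path().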
How about http://php.net/manual/en/function.uniqid.php?
I was wondering if it would be possible to somehow speed up symfony templates by loading the files into memcached and then, instead of doing an include, grabbing them from memory. Has anyone tried this? Would it work?
Have you looked at the view cache already? This built-in system makes it possible to cache the output from actions, has a lot of configuration options, and can be overridden on a per-action (and per-component) level. It works on a file level by default, but I think it can be configured so that the action output is cached to memcached (or you would have to write this part yourself).
If you want really lightning fast pages, you should also look at the sfSuperCachePlugin, which stores the output as an HTML file in your public HTML folder. That way Apache can directly serve the pages, and doesn't need to start up PHP and symfony to generate the output.
Sorry for not having more time to give an explanation here but you can review the notes at:
http://www.symfony-project.org/book/1_2/12-Caching
under the heading:
Alternative Caching storage
Quote from the page:
"By default, the symfony cache system stores data in files on the web server hard disk. You may want to store cache in memory (for instance, via memcached) or in a database (notably if you want to share your cache among several servers or speed up cache removal). You can easily alter symfony's default cache storage system because the cache class used by the symfony view cache manager is defined in factories.yml."
good luck!
PHP 5.3 has a new feature called PHAR, similar to JAR in Java. It's basically an archive of PHP files. What are its advantages? I can't understand how they can be helpful in the web scenario.
Is there any use other than "ease of deployment", i.e. deploying an entire application by just copying one file?
There are tremendous benefits for open source projects (in no particular order).
Easier deployment means easier adoption. Imagine: You install a CMS, forum, or blog system on your website by dragging it into your FTP client. That's it.
Easier deployment means easier security. Updating to the latest version of a software package will be much less complicated if you have only one file to worry about.
Faster deployment. If your webhost doesn't give you shell access, you don't need to unzip before uploading, which cuts out per-file transfer overhead.
Innate compartmentalization. Files that are part of the package are clearly distinguished from additions or customizations. You know you can easily replace the archive but you need to backup your config and custom templates (and they aren't all mixed together).
Easier libraries. You don't need to figure out how to use the PEAR installer, or find out whether this or that library has a nested directory structure, or whether you have to include X, Y, or Z (in that order?). Just upload, include archive, start coding.
Easier to maintain. Not sure whether updating a library will break your application? Just replace it. Broken? Revert one file. You don't even need to touch your application.
What you see is what you get. Chances are, someone is not going to go to the trouble of fudging with an archive, so if you see one installed on a system you maintain, you can be fairly confident that it doesn't have someone's subtly buggy random hacks thrown in. And a hash can quickly tell you what version it is or whether it's been changed.
Don't poo-poo making it easier to deploy things. It won't make any difference for homegrown SaaS, but for anyone shipping or installing PHP software packages it's a game-changer.
In theory it should also improve loading speed. If you have a lot of files that need to be included, replacing them with a single include will save you time on file opening operations.
In my experience, loosely packaged PHP source files sitting in a production environment invite tinkering with live code when a fix is needed. Deploying in a .phar file discourages this behaviour and helps reinforce better practices, i.e. build and test in a local environment, then deploy to production.
The advantage is mainly ease of deployment. You deploy an entire application by just copying one file.
Libraries can also be used without being expanded.
Any tool that works on a single file "suddenly" works with all files of an application at once.
E.g. transport: You can upload the entire application through a single input/file element without additional steps.
E.g. signing an application: checksum/sign the file -> checksum/signature for the whole application.
...
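The checksum idea can be sketched with PHP's hash functions alone. The helper name archive_matches() is an assumption, and a temporary file stands in for a real .phar so the example runs without building one.

```php
<?php
// Hypothetical helper: compare an archive against a published SHA-256 checksum.
function archive_matches(string $file, string $expectedSha256): bool
{
    $actual = hash_file('sha256', $file);
    return $actual !== false
        && hash_equals(strtolower($expectedSha256), $actual);
}

// Stand-in for a real .phar file.
$archive = tempnam(sys_get_temp_dir(), 'app');
file_put_contents($archive, 'pretend this is app.phar');
$published = hash('sha256', 'pretend this is app.phar');

$intact = archive_matches($archive, $published);

// Any change to the single file changes the checksum for the whole application.
file_put_contents($archive, ' tampered', FILE_APPEND);
$modified = archive_matches($archive, $published);
```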