I am creating a new PHP framework depending on Zend Framework.
It will be a general purpose MVC framework for web development.
I am worried about 2 aspects:
Logging:
Should I use logging? Are there any substantial performance problems when using logging?
Caching database queries:
I am caching some queries from database.
I am concerned about caching user-related information, such as users' personal info.
If I cache such data, a cache file will be generated in my data folder for every user. Now suppose 10,000 - 20,000 users are online within a 2-hour span. This means there will be up to 20,000 files in my folder.
My question is: will this affect the performance of my server? Is there an upper limit on how many files a folder can hold on a server?
Do not use a file-based cache. File system operations are exceptionally slow: http://imgur.com/X1Hi1.gif . Use memcached. Contrary to what the above post says, you don't need a lot of memory: the amount you need is proportional to how much you want to store, and memcached can evict data based on how recently it was accessed.
1) You definitely want logging. I'd recommend xdebug, available at http://www.xdebug.org/. You can read more about the performance overhead on their site. (Plus, it integrates nicely with Eclipse's PHP version.)
2) I'm not really sure I'd want to cache much user information, but memcache is probably one of the better choices for caching in PHP (http://se2.php.net/memcache). And yeah, there's no limit on the number of files, and you'll probably not be going over the 32-bit file size limit either =)
Caching is a real problem: it's almost impossible to get right from a user/programmer perspective. I wouldn't cache things as simple as user data; the database already caches it. Focus on complex queries and complete web pages (or parts of them).
Unless you have a page like Stack Overflow, where I see very few ways to cache anything, look hard at your log files to see what users do on your site, and you will soon find some hotspots.
I don't recommend Memcache unless you have a lot of memory (> 8 GB) on your machine. Memcache works best if you throw in dedicated Memcache servers with 16 GB that do nothing other than caching.
For smaller sites, hardware and requirements, consider APC: it is a very low-overhead data cache and it speeds up PHP execution at the same time (you don't want to run a production server without a bytecode cache).
Related
I've just started using Yii and managed to finish my first app. Unfortunately, launch day is close and I want this app to be super fast. So far, the only way of speeding it up I've come across is standard caching. What other ways are there to speed up my app?
First of all, read Performance Tuning in the official guide. Additionally:
Check HTTP caching.
Update your PHP. Each major version gives you a good boost.
Use Redis (or at least the database) for sessions (default PHP sessions use files and are blocking).
Consider using nginx instead of (or together with) Apache. It serves content much better.
Consider using CDN.
Tweak your database.
These are all general things that are relatively easy to do. If it's not acceptable afterwards, do not assume. Profile.
1. Following best practices
In this recipe, we will see how to configure Yii for best performance and will see some additional principles of building responsive applications. These principles are both general and Yii-related. Therefore, we will be able to apply some of these even without using Yii.
Getting ready
Install APC (http://www.php.net/manual/en/apc.installation.php)
Generate a fresh Yii application using yiic webapp
2. Speeding up session handling
Native session handling in PHP is fine in most cases. There are at least two possible reasons why you will want to change the way sessions are handled:
When using multiple servers, you need to have a common session storage for both servers
Default PHP sessions use files, so the maximum performance possible is limited by disk I/O
3. Using cache dependencies and chains
Yii supports many cache backends, but what really makes Yii cache flexible is the dependency and dependency chaining support. There are situations when you cannot just simply cache data for an hour because the information cached can be changed at any time.
In this recipe, we will see how to cache a whole page and still always get fresh data when it is updated. The page will be dashboard-type and will show the five latest articles added and a total calculated for an account. Note that an operation cannot be edited once it has been added, but an article can.
4. Profiling an application with Yii
If all of the best practices for deploying a Yii application are applied and you still do not have the performance you want, then most probably, there are some bottlenecks with the application itself. The main principle while dealing with these bottlenecks is that you should never assume anything and always test and profile the code before trying to optimize it.
If most of your app is cacheable you should try a proxy like varnish.
Go for general PHP and MySQL performance tuning.
1) Memcache
Memcached is an open source distributed memory object caching system. It helps speed up dynamic web applications by reducing database server load.
2) MySQL performance tuning
3) Web server performance tuning for PHP
This is my first question here, and it concerns optimizing a specific website.
A few months ago, we launched [site] for one of our clients; it is a kind of community website.
Everything works great, but the website is getting bigger and pages are now loading slowly.
The server specs:
PHP 5.2.1 (I think we need to upgrade to 5.3 to make use of the new garbage collector)
Apache 2.2
Quad-core Xeon processor @ 2.8 GHz and 4 GB DDR3 RAM.
XCache 1.3 (we added this a few months ago)
MySQL 5.1 (we are using InnoDB as the engine)
CodeIgniter framework
Here is what we did so far and what we intend to do further :
Besides XCache, we don't really use a caching mechanism, because most of the content is served live. We also didn't want to optimize prematurely, because we didn't know what to expect in terms of traffic.
On the other hand, we have installed memcached and we want to implement a cache system based on it.
Regarding the database structure, we have reached 3NF with most of our tables. Yes, we have some slow queries (which we plan to optimize), but I think that is because the tables that produce them are quite big: blog comments (~44,408 rows), user log tracking (~725,837 rows), user comments (~698,964 rows), etc. The entire database is 697.4 MB in size for now.
Also, here are some stats for January 2011:
Monthly unique visitors: 127.124
Monthly unique views: 4.829.252
Monthly unique visits: 242.708
Daily average:
Unique new visitors: 7.533
Unique new views : 179.680
Just let me know if you need more details.
Any advice is highly appreciated.
Thank you.
When it comes to performance issues, there is no golden rule or labelled sticky note telling you up front that the problem is in the database. What I would suggest is performance profiling; there are many free and paid tools on the Internet that let you do so.
Start with the web server layer and make sure everything is configured correctly and optimized as far as possible.
Then move on to the next layer (which I assume is your database). From a layman's perspective, whenever someone mentions InnoDB or MySQL, we assume indexes have been created to optimize search operations. How the indexes are used also matters, because indexing the wrong thing can make matters worse. My advice is to get a DBA or an equivalent person to troubleshoot in a staging environment.
Another thing you could look at is the content, from web page content to database data. Show and keep data only where it is needed, do not store unnecessary information in the database, and use a smart layout on the web page. Cutting a second or two can make a big difference in usability and response time.
It is very hard to go into detail here without in-depth information about your application, its architecture and your environment, but the above are some common directions people take to troubleshoot this kind of problem.
Good luck!
This site has excellent resources http://www.websiteoptimization.com/
The books that are mentioned are excellent. There are just too many techniques to list here and we do not know what you have tried so far.
Sorry for the delay, guys. I have been very busy trying to find the issue, and I found it.
Well, the problem was mostly Apache. I had an access log of almost 300 GB, which was parsed at midnight to generate Webalizer stats. The website was very, very slow mostly while this was happening. I disabled Webalizer for the domain and cleared the logs, and what do you know, it is very fast again, no matter what hour you access it.
I now have just a few slow queries that I intend to fix today.
I also updated to CI 2.0 Reactor as suggested and started to use the memcached driver.
Who knew that Apache logs could be so problematic...
Based on the stats, I don't think you are hitting load problems... on a hunch, I would look to the database first. Database partitioning might be a good place to start.
But you should really do some profiling of your application first. How much time is spent in the application versus the database? Are there application methods that use lots of time and just need some tweaking? Are database queries written inefficiently? Do you need more or better database indices?
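As a rough sketch of the application-versus-database split described above, the timer below accumulates wall-clock time per section. `SectionTimer` and the section names are hypothetical, the `usleep()` call only stands in for a real query, and PHP 7 or later is assumed. A real profiler such as Xdebug gives far more detail; this only shows the idea.

```php
<?php
// Hypothetical helper: sums wall-clock seconds spent in named sections.
final class SectionTimer
{
    private $totals = [];

    // Runs $work, adds its duration to $section, and returns its result.
    public function measure(string $section, callable $work)
    {
        $start = microtime(true);
        try {
            return $work();
        } finally {
            $elapsed = microtime(true) - $start;
            $this->totals[$section] = ($this->totals[$section] ?? 0.0) + $elapsed;
        }
    }

    public function totals(): array
    {
        return $this->totals;
    }
}

$timer = new SectionTimer();

// Stand-in for a database call (assumption: a real query would go here).
$rows = $timer->measure('database', function () {
    usleep(2000);
    return [1, 2, 3];
});

// Stand-in for application work such as rendering a view.
$html = $timer->measure('application', function () use ($rows) {
    return implode(',', $rows);
});

foreach ($timer->totals() as $section => $seconds) {
    printf("%-12s %.4f s\n", $section, $seconds);
}
```

Wrapping the query layer and the view rendering in two such sections is usually enough to see which side deserves attention first.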
Everything looks pretty good. If upgrading CodeIgniter is an option, the new CodeIgniter 2.0 (Reactor) adds support for memcache (new Cache driver with file system, APC and memcache support). Even though you're already using XCache, these new additions may be worth looking at.
When cache objects weren't enough for our multi-domain platform that saw huge traffic, we went the route of throwing more hardware at it-- ram, servers/database. Then we moved to database clustering to handle single account forecasted heavy load. And now switching from apache to nginx... It's a never ending battle, but what worked for us was being smart about what we cached and increasing server memory then distributing this load across servers...
Cache as many database calls as you can. In my CI application I have a settings table that rarely changes, so I cache all calls made to it as I am constantly querying the settings table.
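A minimal sketch of caching calls to a rarely changing settings table, as described above. `SettingsCache` is a hypothetical class, the plain array stands in for whatever backend you use (file, APC, memcached), and the loader closure stands in for the real query. PHP 7 or later is assumed.

```php
<?php
// Hypothetical cache-aside wrapper around a settings lookup.
final class SettingsCache
{
    private $store = [];   // stand-in for a real cache backend
    private $loader;       // callable that actually queries the database

    public function __construct(callable $loader)
    {
        $this->loader = $loader;
    }

    public function get(string $name)
    {
        if (!array_key_exists($name, $this->store)) {
            // Cache miss: this is the only place the database is hit.
            $this->store[$name] = ($this->loader)($name);
        }
        return $this->store[$name];
    }

    // Call this whenever the setting is changed so stale data is dropped.
    public function invalidate(string $name): void
    {
        unset($this->store[$name]);
    }
}

$queries = 0;
$settings = new SettingsCache(function (string $name) use (&$queries) {
    $queries++;                       // stands in for a SELECT on the settings table
    return 'value-of-' . $name;
});

$first  = $settings->get('site_title');
$second = $settings->get('site_title');   // served from the cache, no second query
```

The important part is the explicit `invalidate()` call on writes; without it the cached setting outlives the change.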
Cache your views and even your controllers as well. I tend to cache basically as much as I can in my CI applications and then refresh the cache when a file changes.
Only autoload important libraries, models and helpers. I've seen people autoload up to 10 libraries and, on top of that, a few helpers and then a model. You only really need to autoload the database and session libraries if you are using them.
Regarding point number 3, are you autoloading many things in your config/autoload.php file by any chance? It might help speed things up to load only what you need in your controllers as you need it, with the exception, of course, of the session and database libraries.
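For illustration, a trimmed config/autoload.php might look like the sketch below. Which libraries your application actually needs on every request is an assumption here; the `email` library and `url` helper in the comments are only examples.

```php
<?php
// config/autoload.php (sketch): keep the autoload lists as short as possible.
$autoload['libraries'] = array('database', 'session');
$autoload['helper']    = array();
$autoload['model']     = array();

// Everything else is loaded on demand inside the controller method that needs it:
//   $this->load->library('email');
//   $this->load->helper('url');
```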
Trying to get to grips with the different types of cache engines File, APC, Xcache, Memcache. Anybody know of any good resources/links?
Note I am using Linux, PHP and mysql
There are 2 types of caching terminology thrown around in PHP.
First is an opcode cache:
http://en.wikipedia.org/wiki/PHP_accelerator
Second is a data cache:
http://simas.posterous.com/php-data-caching-techniques
A few of the technologies cross into both realms, but the basic idea is simple: keep as much data as possible in RAM and precompiled, because compiling and hard-disk seeks are very expensive. Disk seeks happen when finding a file to compile, querying the DB for data, or looking for a temp file, and every time one happens it slows down the user experience.
Memcached is generally the way to go, but it has some "features": once you save some data to the cache, there is no guarantee that it will be available later, as memcached dynamically removes old entries to make way for new ones. It's also fairly basic: you'll need to roll your own system for handling timeouts and preventing cascading, but it's all fairly simple. There's tons of info in the Memcached FAQ, or feel free to ask and I'll post some code examples. Memcached can also act as a session handler, which is great if you have lots of users or more than one server.
Otherwise, disk caching is good if you only have one server or don't mind generating separate caches on each server. It is generally faster than memcached as it doesn't have the network overhead (unless you have memcached on the same server). There are plenty of good local caching options, but probably the best are PEAR Cache_Lite (file-based) and APC (shared memory).
APC also has the added advantage that it can cache your compiled PHP code which may help on high-performance websites.
I'm running a php/mysql-driven website with a lot of visits and I'm considering the possibility of caching result-sets in shared memory in order to reduce database load.
However, right now MySQL's query cache is enabled and it seems to be doing a pretty good job since if I disable query caching, the use of CPU jumps to 100% immediately.
Given that situation, I don't know if caching result sets (or even the generated HTML code) locally in shared memory with PHP will result in any noticeable performance improvement.
Does anyone out there have any experience on this matter?
PS: Please avoid suggesting heavy-artillery solutions like memcached. Right now I'm looking for simple solutions that don't require too much time to implement, deploy and maintain.
Edit:
I see my comment about memcached diverted answers from the actual point, which is whether caching DB queries in the application layer would have a noticeable performance impact, considering that the results of those queries are already being cached at the DB level.
I know you didn't want to hear about memcached, but it is one of the best solutions for what you're trying to do. Depending on your site usage, there can be massive improvements in performance. By simply using memcached's session handler over my database session handler, I was able to cut the load in half and cut back on request serving times by over 30%.
Realistically, memcached is a simple solution. It's already integrated with PHP (if you have the extension loaded), and it requires virtually no configuration (I simply had to add memcached as a service on my linux box, which is done in one or two shell commands).
I would suggest storing session data (and anything that lends itself to caching) in memcache. For dynamic pages (such as stack overflow homepage), I would recommend caching output for a couple of seconds to prevent flooding.
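A minimal sketch of caching rendered output for a couple of seconds. To stay self-contained it stores the output in a local file rather than in memcache, which is a swap from what is recommended above; `cachedOutput()` and its parameters are hypothetical names, and PHP 7 or later is assumed.

```php
<?php
// Hypothetical helper: returns cached output if it is younger than $ttlSeconds,
// otherwise calls $render, captures what it prints, and stores the result.
function cachedOutput(string $cacheFile, int $ttlSeconds, callable $render): string
{
    clearstatcache(true, $cacheFile);   // make sure filemtime() is not stale
    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $ttlSeconds) {
        return file_get_contents($cacheFile);
    }

    ob_start();                         // capture everything $render prints
    $render();
    $html = ob_get_clean();

    file_put_contents($cacheFile, $html, LOCK_EX);
    return $html;
}

// Usage (sketch): serve the homepage from cache for up to 2 seconds.
//   echo cachedOutput('/path/to/cache/home.html', 2, function () {
//       echo '<h1>Latest questions</h1>';
//   });
```

Even a TTL of one or two seconds means a burst of requests triggers a single render instead of one per request.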
A decent single box solution is file-based caching, but you have to sweep them out manually. Other than that, you could use APC, which is very fast and in-memory (still have to expire them yourself though).
As soon as you scale past one web server, though, you're going to need a shared cache, which is memcached. Why are you so adamant about not deploying this? It's not hard, and it's just going to save you time down the road. You can either start using memcache now and be done with it, or you could use one of the above methods for now and then end up switching to memcache later anyways, resulting in even more work. Plus, you don't have to deal with running a cronjob or some other ugly hack to get cache expiration features: it does that for you.
The mysql query cache is nice, but it's not without issues. One of the big ones is it expires automatically every time the source data is changed, which you probably don't want.
I'm involved in a project that will end up creating around 10 million new pages on an existing site. The site, and the new project, are built with CodeIgniter and connecting to MySQL.
I've never dealt with a site of this size before, and I'm concerned about how we should handle caching. Has anyone dealt with caching on a PHP site of this size that could give me some pointers? I'm used to the CodeIgniter caching system and similar, but the number of cache files that would create worries me.
Any suggestions would be appreciated.
I haven't done anything on that scale, but I don't see a problem with file-based caching as long as the caching mechanism isn't completely dumb, and you're using a modern filesystem. Distributing cache files throughout a directory tree is smart enough.
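A minimal sketch of distributing cache files throughout a directory tree. The function names and the two-level layout are assumptions, not a particular library's scheme; PHP 7 or later is assumed.

```php
<?php
// Hypothetical helper: map a cache key to a path two directory levels deep,
// so no single folder has to hold every cache file.
function cachePathFor(string $baseDir, string $key): string
{
    $hash = md5($key);                               // 32 hex characters
    return $baseDir
        . '/' . substr($hash, 0, 2)                  // 256 first-level folders
        . '/' . substr($hash, 2, 2)                  // 256 folders under each
        . '/' . $hash . '.cache';
}

function cacheWrite(string $baseDir, string $key, $value): void
{
    $path = cachePathFor($baseDir, $key);
    $dir  = dirname($path);
    if (!is_dir($dir)) {
        @mkdir($dir, 0777, true);                    // create intermediate folders
    }
    file_put_contents($path, serialize($value), LOCK_EX);
}

// Returns the cached value, or null when nothing is stored for $key.
function cacheRead(string $baseDir, string $key)
{
    $path = cachePathFor($baseDir, $key);
    return is_file($path) ? unserialize(file_get_contents($path)) : null;
}
```

With 256 x 256 leaf folders, 10 million entries average out to roughly 150 files per folder.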
If you're worried, that's good. Of course, I would suggest writing a wrapper around CI's built-in mechanism, so that you can easily swap it out for something else (Like Zend_Cache, possibly with a beefy memcached server, or some smarter file-based system of your own design).
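One way such a wrapper could look, as a sketch: application code depends on a small interface, and the concrete backend can be replaced later. `CacheBackend` and `ArrayCacheBackend` are hypothetical names and not part of CodeIgniter or Zend_Cache; PHP 7.1 or later is assumed.

```php
<?php
// Hypothetical interface: application code talks to this, never to a concrete cache.
interface CacheBackend
{
    public function get(string $key);                               // null on a miss
    public function set(string $key, $value, int $ttlSeconds): void;
    public function delete(string $key): void;
}

// In-process array backend, handy for tests. A file, APC or memcached backend
// would implement the same three methods.
final class ArrayCacheBackend implements CacheBackend
{
    private $items = [];

    public function get(string $key)
    {
        if (!isset($this->items[$key])) {
            return null;
        }
        list($value, $expiresAt) = $this->items[$key];
        if (time() >= $expiresAt) {          // entry has expired
            unset($this->items[$key]);
            return null;
        }
        return $value;
    }

    public function set(string $key, $value, int $ttlSeconds): void
    {
        $this->items[$key] = [$value, time() + $ttlSeconds];
    }

    public function delete(string $key): void
    {
        unset($this->items[$key]);
    }
}
```

Swapping backends then means changing one constructor call rather than every place that reads or writes the cache.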
There are several layers of caching available to PHP and CodeIgniter, but you shouldn't have to worry about the number of cached files on a standard linux server (various file systems can handle hundreds of millions of files per mount point). But to pick your caching method, you need to measure carefully.
Options:
Opcode caching (Zend, eAccelerator, and more)
CodeIgniter view caching (configured per view)
CodeIgniter read query caching
General web caching
Optimize your database
(and so on)
Additionally, you can improve the file caches by using memory file systems and in-memory tables.
The real question is, how do you pick caching strategies? Capacity planning. You model your system (users, accounts, pages, files), simulate, measure, and add caches based on best theories. Measure again. Produce new theories and measurements until you have approaches that fit your desired scale.
In my experience, view caching and web caching are a big gain for widely read sites (WPSuperCache, for example). Opcode caching (and other forms of minimisation) is useful for heavily dynamic sites, as is database performance tuning.
FYI: If the system runs on a Windows server: a folder on a FAT32 volume can hold at most approx. 65.000 files, including cache folders. NTFS volumes do not have this per-folder limit.
All big guys use APC.
The number of webpages is not relevant.
The relevant number is the number of hits (pageviews).
And if you design for speed ditch the Windows machines.