I've been maintaining a PHP/SQL e-commerce application, and the customer has called about their TTFB (time to first byte) consistently jumping to almost 3 seconds.
What I've tried:
Creating a test.php page that just echoes some text produces a 30ms TTFB.
Went back through the commit history and checked whether any recent changes could have been the culprit.
The test page loading quickly leads me to believe that it's some sort of query or logic that runs on all the other pages of the site (auth?), but none of the commits made since the jump in TTFB seem to be responsible. How could this just happen out of nowhere?
This is actually a performance optimisation problem that can be solved with profiling.
For profiling you can use Xdebug or other tools that are out there; however, I personally didn't find one that helped when facing a similar situation, so I just made a simple profiling module adapted to the app.
What you want to do is mimic the production setup as closely as possible on a local or staging server: server settings, DB contents, etc. Then measure the execution time starting from the first line of index.php through the key parts of the app, for example the DB read/write class and the HTTP request class, and write the data to a DB so you can generate a profiling report.
For each route and/or operation you want to see how many DB queries were made and how long they took to execute, how many API calls were made (if applicable), and so on. The goal is to end up with a good idea of which part of the execution flow takes how long.
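To make this concrete, here is a minimal sketch of such a profiling module (the Profiler class name, the checkpoint labels, and the profiler_log table are all assumptions you would adapt to your app):

// profiler.php - record elapsed time at named checkpoints and log one row per checkpoint
class Profiler {
    private $start;
    private $marks = array();

    public function __construct() {
        $this->start = microtime(true);
    }

    // seconds elapsed since the start of the request, stored under a label
    public function mark($label) {
        $this->marks[$label] = microtime(true) - $this->start;
    }

    // write the collected timings to a (hypothetical) profiler_log table
    public function save(PDO $db, $route) {
        $stmt = $db->prepare('INSERT INTO profiler_log (route, label, seconds) VALUES (?, ?, ?)');
        foreach ($this->marks as $label => $seconds) {
            $stmt->execute(array($route, $label, $seconds));
        }
    }
}

// usage at the top of index.php (checkpoint names are illustrative):
// $profiler = new Profiler();
// ... bootstrap / auth ...
// $profiler->mark('after_auth');
// ... run controller, DB reads/writes ...
// $profiler->mark('after_controller');
// $profiler->save($pdo, $_SERVER['REQUEST_URI']);

Querying profiler_log grouped by route and label then gives you the kind of report described above.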
Related
I developed a site using Zend Framework 2. It is basically a price comparison site that integrates with many of the top affiliate networks out there. I wrote a script that checks prices from each affiliate network, and then updates my local DB with that price. Depending on which affiliate network I am contacting, I may be making an API call (Amazon or CJ.com), or I may be looking at an XML product feed (Pepperjam or LinkShare). The XML product feed would be hosted locally.
At present, there are around 3,500 SKUs that I am checking with this script. The vast majority of them (95%+) target an XML product feed. I would estimate that this script should take somewhere in the neighborhood of 10 minutes to complete. Some of the XML files I am looking at are around 8 MB in size.
I have tested this script thoroughly in my local environment and taken great lengths to make sure that there is no memory leak or something of that nature which would cause performance issues. As an example, I made sure to use data streams where possible to avoid putting the XML file in memory over and over, etc. Suffice to say, the script runs locally without issue.
This script is intended to be run as a cron job, however I do have a way to trigger it via the secure admin interface ad-hoc. Locally, this is how I initiate the script to run, and everything goes rather smoothly.
When I deploy my code to the shared hosting account, I am having all sorts of problems. In order to troubleshoot, I attached logging to various stages of this script to track when it starts, how it progresses, and when each step completes, etc. All of this is being logged to a MySQL database.
Problem #1: If I run the script ad-hoc via an HTTP request, I find that it will run for a couple of minutes, and then the script starts again (so there are now apparently two instances running). Wait another couple of minutes and a third one will start, and so on. Here is an example from when I triggered the script to run at 10:09pm via an HTTP request.
Screenshot of process manager
Needless to say, I DO NOT run it via an HTTP request because it only serves to get me in trouble with my web hosting provider :)
Problem #2: When the script runs on the server, triggered via a cron job, it fails to complete. I have taken the production copy of the database locally along with the XML files, and it runs fine, so it should not be a problem with bad data exposing bad code. My observation is that the script runs for nearly the exact same amount of time before it aborts, is terminated, or whatever. The last record updated is generally timestamped around 4 minutes and 30 seconds or so (if memory serves) after the script is triggered. The SKU list is constantly changing, so the record it ends on differs, but the time of the last update is nearly the same each time. Nothing is being logged in the error logs. I monitored server resources via the SSH top command and there is nothing out of the ordinary. CPU usage is in check and memory usage does not go up.
I have a shared hosting account through Bluehost. My thoughts were that perhaps it was a script max execution time issue. I extended the max execution time in the script itself and via php.ini. Made no difference.
So I guess what I am looking for is some fresh ideas on where to go next. What questions should I be asking my hosting company so they can help me get to the bottom of this? They are only somewhat helpful, to say the least. Could it be some limitation on my hosting account? Is the script triggering some sort of automatic monitor that kills it? What types of Apache settings could be problematic for a script of this nature? php.ini settings? Absolutely any input you can provide would be helpful.
And why, when triggered via HTTP, would it keep spinning up new instances? I guess I could live without running it manually and only run it via a cron job, but that isn't working either. So... interested in hearing the community's thoughts on this. Thanks!
I haven't seen your script, nor have I worked with your host, so everything below is just a guess - and a suggestion.
Given your description, I would say you're right that your script might be getting killed by a timeout when run from cron. I'm not sure why it keeps spawning new instances of your script when you execute it manually via an HTTP request, but that may also be related to a timeout (e.g. if they have logic that restarts a script when it has not produced any output within a certain time, or something like that).
You can follow up with your hosting provider about running long-running (or memory-consuming) scripts in their environment; they might already have an FAQ or document that covers this topic.
Let me suggest an option in case your provider is unable to help.
From what you said, I expect your script runs an SQL query to get a list of SKUs, and then slowly iterates over this list, performing some job on every item (and eventually dies for whatever reason, as we learned).
How about creating a table (or a file - any kind of persistent storage on the server) that stores the last record ID the script processed, or NULL if the script completed successfully? That way you can make the script start from the last processed record (if the last processed record had id = 1000, add ... WHERE id > 1000 to the main query that fetches SKUs), and you won't really care whether the script completed on its first attempt or not (if not, it will pick up from the very point where it was killed on its second try).
Alternatively, to extend this approach, you can limit one invocation to a certain number of records (e.g. 100 or 1000), again saving the last processed record ID in the database or somewhere else.
The main idea is: if the script fails to process all SKUs at once, just make it restartable so that it does not lose its progress.
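A rough sketch of that restartable approach, assuming a PDO connection in $db, a one-row checkpoint table called sku_checkpoint, and a hypothetical processSku() function that does the per-item work:

// resume from the last processed ID (0 means start from the beginning)
$lastId = (int) $db->query('SELECT last_id FROM sku_checkpoint')->fetchColumn();

// optionally cap one invocation at a fixed batch size
$stmt = $db->prepare('SELECT id, sku FROM skus WHERE id > ? ORDER BY id LIMIT 500');
$stmt->execute(array($lastId));

$save = $db->prepare('UPDATE sku_checkpoint SET last_id = ?');

foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
    processSku($row['sku']);            // hypothetical per-SKU work (API call or feed lookup)
    $save->execute(array($row['id']));  // persist progress so a kill loses at most one item
    $lastId = $row['id'];
}

// nothing left after the highest processed ID? reset so the next run starts over
$left = $db->prepare('SELECT COUNT(*) FROM skus WHERE id > ?');
$left->execute(array($lastId));
if ($left->fetchColumn() == 0) {
    $save->execute(array(0));
}

Run it from cron every few minutes and it will chew through the full SKU list across several invocations, no matter where the host kills it.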
I run a PHP script that uses the Wikipedia API to locate Wikipedia pages about certain movies, based on a long list of titles and years of release. This takes 1-2 seconds per query on average, and I do about 5 queries per minute. This has been working well for years. But since February 11 it has suddenly become very slow: 30 seconds per query seems to be the norm now.
This is an example for a random movie in my list, and the link my script loads with file_get_contents():
http://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=content&format=yaml&rvsection=0&titles=Deceiver_(film)
I can put this link in my browser directly and it takes no more than a few seconds to load and open. So I don't think the Wikipedia API servers have suddenly become slow. When I open the link to my PHP script on my web server in my browser, it takes between 20 and 40 seconds before the page loads and the result of one query is shown. When I run the PHP script from the command line of my web server, I get the same slow loading times. My script still manages to save some results to the database now and then, so I'm probably not blocked either.
Other parts of my php scripts have not slowed down. There is a whole bunch of calculations done with the results of the wikipedia api, and all that is still working at a regular speed. So my webserver is still healthy. The load is always pretty low, and I'm not even using this server for something else. I have restarted apache, but found no difference in loading times.
My questions:
Has something changed in the Wikipedia API system recently? Perhaps my way of using it is outdated and I need to use something new?
Where could I look for the cause of this slow loading? I've dug through error files and tested what I could test, but I don't even know where something goes wrong. If I knew what to look for, I might be able to fix it easily.
git log --until=2013-02-11 --merges in operations/puppet.git shows no changes on that day. Sounds like a DNS issue or similar on your end.
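If you want to see where those 30 seconds are actually going, one quick diagnostic (a sketch, using the same URL from the question) is to fetch it with cURL from the web server and print the timing breakdown - a large name-lookup time would point at DNS:

$url = 'http://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=content&format=yaml&rvsection=0&titles=Deceiver_(film)';

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_USERAGENT, 'timing-test/1.0'); // Wikipedia's API policy asks for a User-Agent
curl_exec($ch);

printf("dns:        %.3fs\n", curl_getinfo($ch, CURLINFO_NAMELOOKUP_TIME));
printf("connect:    %.3fs\n", curl_getinfo($ch, CURLINFO_CONNECT_TIME));
printf("first byte: %.3fs\n", curl_getinfo($ch, CURLINFO_STARTTRANSFER_TIME));
printf("total:      %.3fs\n", curl_getinfo($ch, CURLINFO_TOTAL_TIME));
curl_close($ch);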
I'm trying to make an app that performs actions for all of the app user's friends. The problem is that I haven't yet found a platform on which I can develop such an app.
At first I tried PHP on Heroku, and my code worked, but because I have many friends the loop ran for more than 30 seconds, the request timed out, and the operation stopped in the middle.
I don't mind using any platform I just want it to work!
Python, C++, PHP. They all are fine for me.
Thanks in advance.
Let's start with the fact that you can change the timeout settings. It depends on where the restriction is set; it can be on the PHP side, as explained in the set_time_limit function documentation:
Set the number of seconds a script is allowed to run. If this is reached, the script returns a fatal error. The default limit is 30 seconds or, if it exists, the max_execution_time value defined in the php.ini.
but it can also be set on the server itself.
Another issue is that routers along the route also have their own timeout limits, so in my experience ~60 seconds is about the practical maximum.
As for what you want to do: the problem is not which language/technology you use, but the fact that you're making a lot of HTTP requests to Facebook, which take a bit of time. I believe this is your bottleneck, and if that's the case there's not much you can improve by choosing something other than PHP (though you could go with non-blocking I/O, which should improve the I/O performance).
With that said, PHP is not always the best solution; it depends on the task at hand.
Java or any other compiled language should perform better than a scripted language (PHP, Python), and if you go with C++ you will top them all, but will you feel comfortable programming your app in C++?
Choose the language/technology you feel most "at home" with. If you have a selection to choose from, then figure out what you need from your app and research which option will perform better for those needs.
Edit
Last time I checked, the maximum number of friends was limited to 5,000.
If you need to run a Graph request per user friend, then there's simply no way you can do that without keeping the user waiting far too long, regardless of timeouts.
You have two options as I see it:
Make the client asynchronous: you can use WebSockets, Comet, or even issue an AJAX request every x seconds to get the computed data.
That way you don't need to worry about timeouts and the user can start getting content quickly.
Use the JavaScript API to make the Graph requests; that way you completely avoid timing out, plus you offload a huge amount of networking from your servers.
This option might not be available for you if you need your servers for the computation, if for example you depend on data from your db.
As for the "no Facebook SDK for C++" issue, though I don't think it's even relevant, it's not a problem.
All Facebook SDKs are simply wrappers around HTTPS requests, so implementing your own SDK is not that hard, though I hate thinking about doing it in C++ - but then again, I hate thinking about doing anything in C++.
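To illustrate the "just a wrapper around HTTPS" point, here is a minimal PHP sketch of a Graph API call without any SDK (the /me/friends endpoint and the $accessToken value are placeholders for whatever your app actually uses):

$accessToken = 'YOUR_ACCESS_TOKEN';  // hypothetical token obtained through the usual OAuth flow
$url = 'https://graph.facebook.com/me/friends?access_token=' . urlencode($accessToken);

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($ch);
curl_close($ch);

$friends = json_decode($response, true);  // the same JSON an SDK would parse for you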
As all my requests go through an index script, I tried to time the response time of all my requests.
It's simply the difference between the start time (start of the script) and the end time (end of the script).
I cache my data in Memcached, and users are all served from Memcached.
I mostly get a response time of under a second, but at times there are weird spikes of more than a second. The worst case can go up to 200+ seconds.
I was wondering: if mobile users have a slow connection, does that show up in my response time?
I am serving primarily mobile users.
Thanks!
No, it's the runtime of your script. It does not count the latency to the user; that's something the underlying web server worries about. Something in your script just takes very long. I recommend you profile your script to find out what that is. Xdebug is a good way to do so.
If you're measuring in PHP (which it sounds like you are), that's the time it takes for the page to be generated on the server side, not the time it takes to be downloaded.
Drop timers in throughout the page and try to break it down to the section that is causing the huge 200+ second delay.
You could even add a small script that emails you details of how long each section took, if the spikes don't happen often enough to catch yourself.
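A quick sketch of the "drop timers in" idea, assuming you are fine with collecting timings in an array and mailing them only when a request crosses some threshold (the 5-second threshold and the address are placeholders):

$timings = array();
$t = microtime(true);

// ... section 1: fetch data from memcached ...
$timings['cache'] = microtime(true) - $t;
$t = microtime(true);

// ... section 2: build and render the page ...
$timings['render'] = microtime(true) - $t;

// only email when the request was unusually slow
if (array_sum($timings) > 5) {
    mail('you@example.com', 'Slow request: ' . $_SERVER['REQUEST_URI'], print_r($timings, true));
}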
It could be that the script cannot finish because a client downloads the results very, very slowly. If you don't use a front-end server like nginx, the first thing to do is to try it.
Someone already mentioned xdebug, but normally you would not want to run xdebug in production. I would suggest using xhprof to profile pages on development/staging/production. You can turn on xhprof conditionally, which makes it really easy to run on production.
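A sketch of turning xhprof on conditionally, assuming the extension is installed; the ?profile=1 switch and the output path are placeholders, and in production you would gate this on something less guessable (or sample a small percentage of requests):

// at the very top of index.php
if (isset($_GET['profile']) && extension_loaded('xhprof')) {
    xhprof_enable(XHPROF_FLAGS_CPU + XHPROF_FLAGS_MEMORY);

    register_shutdown_function(function () {
        $data = xhprof_disable();  // per-function wall time, CPU and memory
        // raw profile data; unserialize() it later or feed it to the bundled report scripts
        file_put_contents('/tmp/xhprof_' . uniqid() . '.myapp.xhprof', serialize($data));
    });
}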
That question may appear strange.
But every time I have worked on PHP projects in the past, I have run into this sort of bad experience:
Scripts stop running after 10 seconds. This results in very bad database inconsistencies (a bad example for a deletion loop: a user is about to delete a photo album; the album object gets deleted from the database, and then halfway through deleting the photos the script gets killed right where it is, and 10,000 photos are left with no reference).
It's not transaction-safe. I've never found a way to do something reliably, to ensure it actually gets done. If the script gets killed, it gets killed - right in the middle of a loop, it just gets killed. That never happened on Tomcat with Java. Java runs and runs and runs, however long it takes.
Lots of newsletter scripts try to get around that problem by splitting the job into many batches, i.e. sending 100 at a time, then reloading the page (oh man, really stupid), doing the next batch, and so on. More often than not something hangs or the script takes longer than 10 seconds, and your platform is crippled.
But then, I hear that very big projects use PHP, like StudiVZ (the German Facebook clone, actually the biggest German website). So there is a tiny light of hope that this bad behavior just comes from unprofessional hosting companies who kill PHP scripts because their servers are so bad. What's the truth about this? Can it be configured in such a way that scripts never get killed just because they take a little longer?
Is PHP suitable for very large projects?
Whenever I see a question like that, I get a bit uneasy. What does very large mean? What may be large to you may be small to me, or vice versa. And that is even assuming we use the same metric. Are you measuring the time to build the project, the complete life cycle of the project, the money involved, the number of people using it, the number of developers building and maintaining it, etc., etc.?
That said, the problems you're describing sound like you don't know your technology well enough. That would be a problem for you regardless of which technology you picked. For example, use database transactions to ensure atomicity, and use asynchronous offline jobs to process long-running tasks (such as dispatching a mailing list).
A lot of the bad behaviour is handled by good frameworks like the Zend Framework.
Anything that takes longer than 10 seconds is really messed up, but you can always raise the execution time with http://de3.php.net/set_time_limit
A lot of big sites are written in PHP: Facebook, Wikipedia, StudiVZ, Digg.com, etc. A lot of the things you are talking about are just configuration matters; maybe you should look into that?
Are you looking for set_time_limit() and ignore_user_abort()?
Performance is not a feature you can just throw in after most of the site is done.
You have to design the site for heavy load.
If a database task normally involves 10K rows, you should be prepared not just for execution time issues, but for other maintenance questions as well.
Worst case: make a consistency tool to check and fix those errors.
Better: instead of physically deleting the images, just flag them and let background services take care of the expensive operations (see the sketch after this list).
Best: you can utilize a job queue service and add this job to the queue.
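A rough sketch of the flag-and-clean-up-later idea from the "Better" option above, assuming a deleted flag column on a hypothetical photos table and a PDO connection in $db; the web request only flips the flag, and a cron-driven worker does the slow deletion in small batches:

// in the web request: cheap and fast, so it easily finishes within any time limit
$db->prepare('UPDATE photos SET deleted = 1 WHERE album_id = ?')->execute(array($albumId));
$db->prepare('DELETE FROM albums WHERE id = ?')->execute(array($albumId));

// in a worker script run from cron: remove flagged photos a batch at a time
$stmt = $db->prepare('SELECT id, path FROM photos WHERE deleted = 1 LIMIT 200');
$stmt->execute();
foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $photo) {
    @unlink($photo['path']);  // remove the file from disk
    $db->prepare('DELETE FROM photos WHERE id = ?')->execute(array($photo['id']));
}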
If you do need to do transactions in php, you can just do:
mysql_query("BEGIN");
/// do your queries here
mysql_query("COMMIT");
The commit command will just complete the transaction.
If any errors occur, you can just rollback with:
mysql_query("ROLLBACK");
Edit: Note this will only work if you are using a database that supports transactions, such as InnoDB
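For what it's worth, here is a sketch of the same pattern using PDO's built-in transaction methods, which also makes it easy to roll back when a query throws (the DSN and credentials are placeholders):

$db = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

try {
    $db->beginTransaction();
    // ... your queries here ...
    $db->commit();    // make all of them permanent at once
} catch (Exception $e) {
    $db->rollBack();  // undo everything since beginTransaction()
    throw $e;
}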
You can configure how much time is allowed for executing a script, either in the php.ini setting or via ini_set/set_time_limit
Instead of StudiVZ (the German Facebook clone), you could look at the actual Facebook, which is entirely PHP. Or Digg. Or many Yahoo sites. Or many, many others.
ignore_user_abort is probably what you're looking for, but you could also add another layer in terms of scheduled maintenance jobs. They basically run on a specified interval and do various things to make sure your data/filesystem are in a state that you want... deleting old/unlinked files is just one of many things you can do.
For these large loops, like deleting photo albums or sending thousands of emails, you're looking for ignore_user_abort and set_time_limit.
Something like this:
ignore_user_abort(true); // the user leaving the webpage will not kill the script
set_time_limit(0);       // the script can take as long as it wants
for ($i = 0; $i < 10000; $i++)
    costly_very_important_operation();
Be careful, however: this could potentially let the script run forever:
ignore_user_abort(true); // the user leaving the webpage will not kill the script
set_time_limit(0);       // the script can take as long as it wants
while (true)
    do_something();
That script will never die, unless you restart your server.
Therefore it is best never to set the time limit to 0.
Technically, no programming language is transaction-safe; it's the database that needs to be transaction-safe. So if the running script/code dies or disconnects, for whatever reason, the transaction will be rolled back.
Putting queries in a loop is a very bad idea unless the loop is specifically designed to run in batches, breaking a much larger set into smaller pieces. Adjusting PHP timers and limits is generally a stop-gap solution; you are still dependent on the client browser if you use the web to kick off a script.
If I have a long process that needs to be kicked off by a browser, I "disconnect" the process from the browser and web server so control is returned to the user while the script runs. PHP scripts run from the command line can run for hours if you want. You can then use AJAX, or reload the page, to check on the progress of the long running script.
There are security concerns with this code, but to "disconnect" a process from PHP running under something like Apache:
exec("nohup /usr/bin/php -f /path/to/script.php > /dev/null 2>&1 &");
But that really has nothing to do with PHP being suitable for large projects or being transaction safe. PHP can be used for large projects, but since by default there is no code that remains "resident" between hits, it can get slow if not designed right. Also, since there is no namespace support, you want to plan ahead if you have a large development team.
It's fine for a Java based system to take a few minutes to startup, initialize and load all the default objects. But this is unacceptable with PHP. PHP will take more planning for larger systems. The question is, when does the time saved in using PHP get wasted by the additional planning time required for a large system?
The reason you most likely experienced bad database inconsistencies in the past is that you were using the MyISAM engine for MySQL (which DOES NOT support transactions). Use InnoDB instead; it supports transactions and performs row-level locking.
Or use PostgreSQL.
Many, many sites are made in PHP. However, you will not hear about the millions of web pages made in PHP that do not exist anymore because they were abandoned. Those sites may have burned all the company's money dealing with the PHP mess, or maybe they went bankrupt because their software was so crappy that customers did not want it… PHP seems good at the start, but it does not scale very well. Yes, there are many huge websites made in PHP, but they are the exception rather than the norm.