I am running a set of cron jobs (every hour) that extract the latest data from one DB and write it into CSVs using PHP.
Recently I have faced something unusual on my EC2 server. I could see the CSV generated with the header only, even though there was data to write. Also, my process logging showed that data was extracted, along with the count of extracted records. The only issue I found was that CPU utilization was at 100% during this scenario. Later, everything went fine once CPU utilization returned to normal.
Then, after 4 days, the CSV was generated with the data twice: only one header, but the same set of data repeated twice in the CSV. My process logging showed the correct count this time as well. Again, the only issue found was that CPU utilization climbed to 100% during that period of time.
Is there any connection between EC2 CPU utilization and this process, maybe something memory related? Has anyone faced similar issues, even in a different cloud?
Please advise.
Thanks
If the job takes more than one hour (because of high CPU utilization, for example), then another instance of the job will start, and you will likely get duplicated results in the CSV file. So you should prevent the cron job from being executed if there is already one running. More information can be found here and here.
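One common way to do this (a minimal sketch, not your actual job script; the lock file path is a placeholder) is to take an exclusive, non-blocking lock at the start of the PHP script and exit if a previous run still holds it:

<?php
// Placeholder lock file path; use any location the cron user can write to.
$lock = fopen('/tmp/csv_export.lock', 'c');

if ($lock === false || !flock($lock, LOCK_EX | LOCK_NB)) {
    // A previous hourly run is still going; skip this run instead of overlapping it.
    exit(0);
}

// ... extract the data and write the CSV here ...

flock($lock, LOCK_UN);
fclose($lock);

Alternatively, the same effect can be had without touching the script by wrapping the crontab entry with the flock(1) utility, e.g. flock -n /tmp/csv_export.lock php /path/to/export.php.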
The problem
I am using Laravel 5.3 to import a huge tab-separated file (more than 1 million rows and more than 25 columns) into a MySQL database using functions in controller code (I am refraining from posting all the code here). While processing the files I am running into the following error:
FatalErrorException in Connection.php line 720:
Maximum execution time of 30 seconds exceeded
Please note that the application imports a different number of rows on different runs before failing.
Question
I know we can fix this using either of the following:
changing php.ini as suggested here
adding ini_set('max_execution_time', 300); at the beginning of public/index.php as suggested here
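For reference, those two options look roughly like this (a sketch; 300 seconds is just an example value):

; in php.ini
max_execution_time = 300

// or near the top of public/index.php
ini_set('max_execution_time', 300);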
A variety of reasons might be behind this, and I am more interested in knowing where exactly it is running out of time. Laravel doesn't provide any more detail than the above. I would really appreciate it if someone could suggest ways to debug this. Things that would help:
Is the time an aggregate of all requests made by a method?
Does memory overload cause this?
Will it help to chunk the data and handle it through multiple requests?
Environment
Laravel 5.3
CentOS 7 on Vagrant
MySQL
It's not a specific operation running out of time. It's... everything, combined, from start to finish.
max_execution_time integer
This sets the maximum time in seconds a script is allowed to run before it is terminated by the parser. This helps prevent poorly written scripts from tying up the server. The default setting is 30.
http://php.net/manual/en/info.configuration.php#ini.max-execution-time
The idea here is that for a web service, generally speaking, only a certain amount of time from request to response is reasonable. Obviously, if it takes 30 seconds (an arbitrary number for "reasonableness") to return a response to a web browser or an API client, something probably isn't working as intended. A lot of requests tying up server resources would result in the server becoming unresponsive to subsequent requests, taking the entire site down.
The max_execution_time parameter is a protective control to mitigate the degradation of a site when a script -- for example -- gets stuck in an endless loop or otherwise runs for an unreasonable amount of time. The script execution is terminated, freeing resources that were being consumed, usually in an unproductive way.
Is the time an aggregate of all requests made by a method?
It's the total runtime for everything in the script -- not one specific operation.
Does memory overload cause this?
Not typically, except perhaps when the system is constrained for memory and uses a swap file, since swap thrashing can consume a great deal of time.
Will it help to chunk the data and handle it through multiple requests?
In this case, yes, it may make sense to work with smaller batches, which (generally speaking) should reduce the runtime. Everything is a tradeoff, as larger batches may or may not be more efficient in terms of processing time per unit of work, which is workload-specific and rarely linear.
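As a rough sketch of what chunked processing could look like here (the file path, table name, and column mapping below are placeholders, not the asker's actual code), rows can be buffered and inserted in batches:

<?php
// Hypothetical chunked import; adjust the path, table, and column mapping.
use Illuminate\Support\Facades\DB;

$handle = fopen(storage_path('app/import.tsv'), 'r');
$chunk  = [];

while (($line = fgets($handle)) !== false) {
    $fields  = explode("\t", rtrim($line, "\r\n"));
    $chunk[] = ['col_a' => $fields[0], 'col_b' => $fields[1]]; // map the real columns here

    if (count($chunk) === 500) {                // one multi-row INSERT per 500 rows
        DB::table('records')->insert($chunk);
        $chunk = [];
    }
}

if ($chunk) {                                   // flush the final partial chunk
    DB::table('records')->insert($chunk);
}
fclose($handle);

Running the import from an Artisan console command (where max_execution_time defaults to 0 on the CLI) rather than a web request also sidesteps the 30-second limit.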
I have a PHP script that powers a social network and similar features.
Normally there isn't any problem; my server is a VPS with:
2.4 GHz CPU
4 Cores
8 GB of RAM
150GB SSD
CentOS 7.1 with cPanel.
The problem is that normally the server can sustain around 30 concurrent users at a CPU load of 30-40%. But sometimes, I don't know for what reason, the load goes really high, to 98-100%, all the time. Even if users log out and there are just 3-4 people on the website, the server load remains at 98-100% until I restart the server.
Using the top command via SSH, I noticed that a PHP process gets created, with the owner of the webspace (created via cPanel) as its user and PHP as its command. The load for this process is from 20% to 27%.
The fact is that the more time passes, the more of these PHP processes get created.
For example, after 30 minutes there is another PHP process with the same characteristics as the first one, and together they take 50-60% of the CPU load. Over time more such processes get created, up to a maximum of 4. (Is that because my CPU has 4 cores?)
If I kill these processes via kill [pid], within 1-2 minutes the server goes back to 3%, even with 10-15 concurrent users.
What is the problem? Is it strictly related to a PHP file, or something else? I even tried logging events on the website to check WHAT actions start these (apparently useless) PHP processes, because if I kill them, the website keeps working very well!
What could be the problem?
(Screenshot of CPU usage omitted.)
Thank you all.
If a process is doing a lot of I/O operations, like database calls, it can considerably increase the CPU load. In your case you are sure which process is causing this high load. Noticing that the load increases over time, you should carefully look at the PHP script for memory leaks, lots of sessions, and lots of nested loops with I/O tucked in between, and try to isolate the reason for it. Good luck.
So I have a XAMPP setup with Zurmo 2.6.5 running on it. Everything works like a charm; the speed at which it pulls up contacts, goes through pages, etc. is considerably fast. I have 2 GB of RAM, and this is the only web app that runs on the machine, so you could call it dedicated, I guess. The problem arises when I attempt to export a fairly decent amount of data to Excel (CSV is the only option available). For example, I tried exporting 200-odd rows of data and it timed out because of the max_execution_time parameter. I increased it first from around 300 to 600, and now finally to 1200. The script keeps running as though there were no end to it :-/
Surprisingly, when I first apply the filter (not many, just one), it takes around 10-15 seconds to display the first 10 records. That indicates the query executes well within the time limit. I have memcached installed, as they suggest, to alleviate performance issues.
I checked Zurmo's forums and the net in general, but unfortunately I did not get even a single hit with reference to this issue. Can any fellow Zurmo developer / power user help me get this resolved?
Much appreciated. Thanks.
I have a server that has 2 quad-core processors (2.4 GHz, 16 GB RAM). I have some PHP scripts that run under very heavy load. Most of these scripts do a few things:
Fetch Data from database (just a single row, from a small table)
Fetch Data from other server (mainly Facebook)
Upload a small photo
Update Database table (this table is very heavily used, and number of rows grows very quickly, almost 2 rows per second)
The problem is that the scripts are taking too much time to execute. I previously had a server with a lower configuration (one quad-core processor, 6 GB RAM) where the scripts took 4-5 seconds to complete. But now the execution time is 30-40 seconds, or even more.
HOW I MEASURE EXECUTION TIME? I record microtime() at the start and the end of the script and subtract the two values. I just needed a rough estimate.
SERVER CONFIGURATION: Here are some parameters set in the Apache config:
server_limit = 350
max_child = 350
keep_alive = off
Other Characteristics:
1. When the server is not under heavy load, execution time is very small
2. The previous server took much less time to execute, even under heavy load
I don't know what other details I should include. Please ask me, and I will post them here.
What should I do to improve this?
Update:
I have figured out that the problem is with the ImageMagick library. I googled and tried a few solutions, like disabling OpenMP, but it hasn't helped much.
I suggest profiling with Xdebug and then analyzing the output with software like KCachegrind. Then you will know what's taking the time.
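As a minimal sketch, assuming an Xdebug 2-era setup (the output directory is a placeholder), the profiler can be enabled in php.ini like this:

; Load Xdebug and turn on the profiler for every request (Xdebug 2 settings)
zend_extension=xdebug.so
xdebug.profiler_enable=1
xdebug.profiler_output_dir=/tmp/xdebug
; The cachegrind.out.* files written there can be opened in KCachegrind or QCacheGrind.

With Xdebug 3 the equivalent settings are xdebug.mode=profile and xdebug.output_dir.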
This could have many reasons:
Are your queries "slow"?
Is the server configuration right?
Is the bandwidth slow?
Is the MySQL server configuration right?
What is the format of the table you insert into?
Is something else (a cron job, for example) hammering the database?
I would post this as a comment, but unfortunately I can't. Please clear up those questions and tell us what you find out ;)
I would start by decoupling the problem. Test each action (fetch from the DB, fetch from Facebook, upload, etc.) separately, as in the sketch below.
At the same time, check whether all the components of your new server environment (packages, versions, config, etc.) are the same as before.
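A rough sketch of that kind of per-action timing (the comments mark where the real calls would go; nothing here is the asker's actual code):

$start = microtime(true);
// ... fetch the single row from the database here ...
error_log('db fetch: ' . round(microtime(true) - $start, 3) . 's');

$start = microtime(true);
// ... call the Facebook / remote API here ...
error_log('remote fetch: ' . round(microtime(true) - $start, 3) . 's');

$start = microtime(true);
// ... upload the photo and update the table here ...
error_log('upload + update: ' . round(microtime(true) - $start, 3) . 's');

Comparing these numbers between the old and the new server should show which step regressed.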
This question has already been posted on the AWS forums but remains unanswered: https://forums.aws.amazon.com/thread.jspa?threadID=94589
I'm trying to perform an initial upload of a long list of short items (about 120 million of them), to retrieve them later by unique key, and it seems like a perfect case for DynamoDB.
However, my current write speed is very slow (roughly 8-9 seconds per 100 writes), which makes the initial upload almost impossible (it would take about 3 months at the current pace).
I have read the AWS forums looking for an answer and have already tried the following things:
I switched from single put_item calls to batch writes of 25 items (the recommended maximum batch write size), and each of my items is smaller than 1 KB (which is also recommended). It is very typical for even 25 of my items together to be under 1 KB, but that is not guaranteed (and shouldn't matter anyway, since as I understand it only the individual item size is important for DynamoDB).
I use the recently introduced EU region (I'm in the UK), specifying its endpoint directly by calling set_region('dynamodb.eu-west-1.amazonaws.com'), as there is apparently no other way to do that in the PHP API. The AWS console shows that the table is in the proper region, so that works.
I have disabled SSL by calling disable_ssl() (gaining 1 second per 100 records).
Still, a test set of 100 items (4 batch write calls of 25 items each) never takes less than 8 seconds to index. Every batch write request takes about 2 seconds, so it's not as if the first one is instant and subsequent requests are slow.
My table's provisioned throughput is 100 write and 100 read units, which should be enough so far (I tried higher limits as well, just in case, with no effect).
I also know that there is some expense in request serialisation, so I can probably use a queue to "accumulate" my requests, but does that really matter that much for batch writes? And I don't think that is the problem, because even a single request takes too long.
I found that some people modify the cURL headers (the "Expect:" header in particular) in the API to speed the requests up, but I don't think that is a proper way to do it, and the API has also been updated since that advice was posted.
The server my application is running on is fine as well: I've read that sometimes the CPU load goes through the roof, but in my case everything is fine; it's just the network request that takes too long.
I'm stuck now - is there anything else I can try? Please feel free to ask for more information if I haven't provided enough.
There are other recent threads, apparently on the same problem, here (no answer so far though).
This service is supposed to be ultra-fast, so I'm really puzzled by that problem in the very beginning.
If you're uploading from your local machine, the speed will be affected by all sorts of traffic, firewalls, etc. between you and the servers. When I call DynamoDB, each request takes 0.3 of a second simply because of the time to travel to and from Australia.
My suggestion would be to create an EC2 instance (server) with PHP, upload the script and all the files to the EC2 server as a block, and then do the dump from there. The EC2 server should have blistering speed to the DynamoDB server.
If you're not confident about setting up EC2 with LAMP yourself, they have a new service, "Elastic Beanstalk", that can do it all for you. When you've completed the upload, simply burn the server -- and hopefully you can do all of that within their "free tier" pricing structure :)
It doesn't solve long-term connectivity issues, but it will cut down the three-month upload!
I would try a multithreaded upload to increase throughput. Maybe add threads one at a time and see if the throughput increases linearly. As a test you can just run two of your current loaders at the same time and see if they both go at the speed you are observing now.
I had good success with the PHP SDK by using the batch method on the AmazonDynamoDB class. I was able to run about 50 items per second from an EC2 instance. The method works by queuing up requests until you call the send method, at which point it executes multiple simultaneous requests using cURL. Here are some good references (a rough sketch follows below):
http://docs.amazonwebservices.com/amazondynamodb/latest/developerguide/LoadData_PHP.html
http://docs.amazonwebservices.com/amazondynamodb/latest/developerguide/LowLevelPHPItemOperationsExample.html
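The references above cover the older SDK's batch queue; as a hedged sketch using the current AWS SDK for PHP v3 instead (an assumption on my part, not the SDK version used in this answer; the table name and $items array are placeholders), a batch write of up to 25 items looks roughly like this:

<?php
require 'vendor/autoload.php';

use Aws\DynamoDb\DynamoDbClient;
use Aws\DynamoDb\Marshaler;

$client = new DynamoDbClient(['region' => 'eu-west-1', 'version' => 'latest']);
$marshaler = new Marshaler();

// $items is a hypothetical array of up to 25 associative arrays (BatchWriteItem's limit).
$requests = [];
foreach ($items as $item) {
    $requests[] = ['PutRequest' => ['Item' => $marshaler->marshalItem($item)]];
}

$result = $client->batchWriteItem(['RequestItems' => ['items' => $requests]]);

// DynamoDB may return items it could not process; retry those, ideally with backoff.
$unprocessed = $result['UnprocessedItems'];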
I think you can also use Hive SQL with Elastic MapReduce to bulk load data from a CSV file. EMR can use multiple machines to spread the workload and achieve high concurrency.