I've been trying to debug an issue with Postgres where CPU usage on the server is far too high. I figured it might be caused by unoptimized queries and the like, but I couldn't find any problem there. I then tried different Postgres settings, tweaking the config. I have finally settled on this configuration:
max_connections = 1000
shared_buffers = 4GB
effective_cache_size = 12GB
work_mem = 4194kB
maintenance_work_mem = 1GB
checkpoint_segments = 32
checkpoint_completion_target = 0.7
wal_buffers = 16MB
default_statistics_target = 100
checkpoint_timeout = 15min
random_page_cost = 0.5
seq_page_cost = 0.2
The server can easily provide these resources. I still couldn't get the overall CPU usage to fall (it hits 40%+ with a single user, 80%+ with 2, and with 3 or more it begins to crawl).
Finally, I wrote the following function as a test:
public function testLoad() {
    // Hard-coded test connection string (placeholder credentials)
    define("DBCONFIG", "host=hostname port=5432 dbname=db user=user password=pwd");
    pg_connect(DBCONFIG) or die('Failed');
    pg_query("select 1");
    echo 'hi';
    die;
}
When I hit this function, it produces the exact same result, i.e. 40% CPU usage while the call is active. Clearly, the issue is not with the queries being fired but with the connection PHP is making to the database. Every user creates a new HTTP request, and every new request creates a new connection to the database, which is where the problem lies.
I plan on having a userbase with around 100 parallel connections at any time, so obviously the current setup will not work for me. Any advice on where I'm going wrong? Some configuration I may be missing?
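For what it's worth, one quick way to check whether per-request connection setup is what's burning the CPU is to repeat the same test over a persistent connection with pg_pconnect() and compare. This is only a sketch, reusing the same placeholder credentials as the test function above:
public function testLoadPersistent() {
    // Persistent connection: reused across requests handled by the same PHP process
    $conninfo = "host=hostname port=5432 dbname=db user=user password=pwd";
    $conn = pg_pconnect($conninfo) or die('Failed');

    pg_query($conn, "select 1");
    echo 'hi';
    die;
}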
So basically what I want to do is delete all keys from the database if its size is less than 1GB. I want to do it from a PHP script with the following code.
<?php
// Connecting to the Redis server
$redis = new Redis();
$redis->connect('127.0.0.1', 6379);
echo "Connected to server successfully";
$redis->select(0);

$memstats = $redis->info("memory");
if ($memstats["used_memory"] < 8589934592) {
    $redis->flushAll();
} else {
    echo "Mem is less than 1GB";
}
?>
I am using phpredis for this and am just not sure about a few things.
Does this used_memory parameter represent the current database size?
Am I doing it right?
If not, how do I get the current database size?
Thank you. I do not have much experience working with Redis.
Does this used_memory parameter represent the current database size?
To my knowledge, there is no way to know the size of a single database in bytes. You can, however, see the combined size of all databases in bytes using the INFO command as you did. If you only care about the overall size of the data, the relevant field is used_memory_dataset; used_memory covers the whole Redis instance, including overhead. Read the INFO documentation for more details.
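For example, with phpredis the two fields could be compared like this (a sketch; used_memory_dataset availability depends on your Redis version):
$redis = new Redis();
$redis->connect('127.0.0.1', 6379);

// INFO "memory" returns an associative array of memory stats
$mem = $redis->info('memory');
echo 'Whole instance (data + overhead): ' . $mem['used_memory'] . " bytes\n";
echo 'Dataset only (all databases):     ' . $mem['used_memory_dataset'] . " bytes\n";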
Am I doing it right?
You are almost doing it right, but 8589934592 is 8 GB, not 1 GB. You should define a constant and then use it like this:
define('ONE_GIGABYTE', 1073741824);

if ($memstats["used_memory"] < ONE_GIGABYTE) {
    $redis->flushAll();
} else {
    echo "Mem is NOT less than 1GB";
}
If not, how do I get the current database size?
Again, to my knowledge, there is no way to know the size of a single database in bytes. You can see the number of keys in the currently-selected database using the command DBSIZE like this: $count = $redis->dbSize();
Refer to the Redis INFO documentation to understand the memory stats.
The total_system_memory_human field shows the total amount of memory the Redis host has, while used_memory_human shows the total amount of memory used, which includes both data and overhead.
Moreover, I would recommend setting an expiry on the keys instead of deleting data based on conditions like this; see the sketch below.
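A minimal sketch of that approach (the key name and TTL here are made up):
// Give each key a TTL when it is written, instead of flushing everything later.
$redis->setex('cache:example-key', 86400, 'some value'); // write with a 1-day TTL

// Or add a TTL to a key that already exists:
$redis->expire('cache:example-key', 86400);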
Summary
This is a script (CakePHP 2.10.18 on a LAMP dedicated server with PHP 5.3) that loads information from 2 MySQL tables and then processes the data to output it to Excel.
Table 1 has users, and Table 2 has info about those users (one record per user). The script's goal is to grab a user's record from Table 1, grab its related info from Table 2, and put it in an Excel row (using the PHPExcel_IOFactory library for this).
The information extracted from those tables is around 8000 records from each; the tables themselves have 100K and 300K total records respectively. All the fields in those tables are ints and small varchars, with the exception of one field in the second table (datos_progreso, seen in the code below), which is a text field containing serialized data, but nothing big.
The issue is that if I run the script for the full 16000 records I get an Internal Server Error (without any real explanation in the logs), but if I run it for 1000 records everything works fine, so this seems to point to a resources issue.
I've tried (among other things that I will explain at the end) increasing the memory_limit from 128M to 8GB (yes you read that right), max_execution_time from 90 to 300 seconds, and max_input_vars from 1000 to 10000, and that isn't solving the problem.
My feeling is that the amount of data isn't big enough to exhaust the resources, but I've tried optimizing the script in several ways and can't get it to work. The only time it works is when I run it on a small portion of the records, as mentioned above.
I would like to know if there's something script-wise or php-configuration-wise I can do to fix this. I can't change the database tables with the information by the way.
Code
This is just the relevant bits of code that I think matter, the script is longer:
$this->Usuario->bindModel(
    array('hasMany' => array(
        'UsuarioProgreso' => array(
            'className' => 'UsuarioProgreso',
            'foreignKey' => 'id_usuario',
            'conditions' => array('UsuarioProgreso.id_campania' => $id_campania)
        )
    ))
);

$usuarios = $this->Usuario->find('all', array(
    'conditions' => array('Usuario.id_campania' => $id_campania, 'Usuario.fecha_registro >' => '2020-05-28'),
    'fields' => array('Usuario.id_usuario', 'Usuario.login', 'Usuario.nombre', 'Usuario.apellido', 'Usuario.provincia', 'Usuario.telefono', 'Usuario.codigo_promocion'),
    'order' => array('Usuario.login ASC')
));
$usuario = null;
$progreso_usuario = null;
$datos_progreso = null;
$i = 2;
foreach ($usuarios as $usuario) {
    if (isset($usuario['UsuarioProgreso']['datos_progreso'])) {
        $datos_progreso = unserialize($usuario['UsuarioProgreso']['datos_progreso']);

        $unit = 1;
        $column = 'G';
        while ($unit <= 60) {
            if (isset($datos_progreso[$unit]['punt'])) {
                $puntuacion = $datos_progreso[$unit]['punt'];
            } else {
                $puntuacion = ' ';
            }
            $objSheet->getCell($column.$i)->setValue($puntuacion);
            $column++;
            $unit++;
        }

        $nivel = 1;
        $unidad_nivel = array(1 => 64, 2 => 68, 3 => 72, 4 => 76, 5 => 80, 6 => 84);
        while ($nivel <= 6) {
            $unidad = $unidad_nivel[$nivel];
            if (isset($datos_progreso[$unidad]['punt'])) {
                $puntuacion = $datos_progreso[$unidad]['punt'];
            } else {
                $puntuacion = ' ';
            }
            $objSheet->getCell($column.$i)->setValue($puntuacion);
            $column++;
            $nivel++;
        }
    }

    // Free the variables
    $usuario = null;
    $progreso_usuario = null;
    $datos_progreso = null;
    $i++;
}
What I have tried
I have tried not using bindModel and instead loading the information from both tables separately: loading all the user info first, looping through it, and on each iteration grabbing that specific user's info from Table 2.
I have also tried something similar to the above, but instead of loading all the user info from Table 1 at once, first loading only their IDs and then looping through those IDs to grab the info from Table 1 and Table 2. I figured this way I would use less memory.
I have also tried not using CakePHP's find(), and instead using fetchAll() with "manual" queries, since after some research it seemed like that would be more memory-efficient (it didn't seem to make a difference).
If there's any other info I can provide that can help understand better what's going on please let me know :)
EDIT:
Following the suggestions in the comments I've implemented this in a shell script and it works fine (takes a while but it completes without issue).
With that said, I would still like to make this work from a web interface. In order to figure out what's going on, and since the error_logs aren't really showing anything relevant, I've decided to do some performance testing myself.
After that testing, these are my findings:
It's not a memory issue since the script is using at most around 300 MB and I've given it a memory_limit of 8GB
The memory usage is very similar whether it's via web call or shell script
It's not a timeout issue since I've given the script 20 minutes limit and it crashes way before that
What other setting could be limiting this/running out that doesn't fail when it's a shell script?
The way I solved this was with a shell script, following the advice in the comments. I've accepted that my originally intended approach was not the right one, and while I haven't been able to figure out exactly what was causing the error, it's clear that running this as a web script was the root of the problem.
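For reference, a minimal sketch of how the export logic can be moved into a CakePHP 2 shell (the class name, arguments and output handling here are illustrative, not the actual implementation):
<?php
// app/Console/Command/ExportShell.php
App::uses('AppShell', 'Console/Command');

class ExportShell extends AppShell {
    public $uses = array('Usuario');

    public function main() {
        // e.g. Console/cake export 123
        $id_campania = $this->args[0];

        // Same bindModel()/find() calls as in the controller go here...
        $usuarios = $this->Usuario->find('all', array(
            'conditions' => array('Usuario.id_campania' => $id_campania)
        ));

        $this->out(count($usuarios) . ' users loaded, building spreadsheet...');
        // ...PHPExcel generation as before, then write the file to disk instead of streaming it.
    }
}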
I developed a webservice (PHP/MySQL) that simply outputs a coupon code as a JSON string.
How it works: the application receives one parameter (email), then queries the database table for a coupon code that has not yet been assigned. A second query then updates that coupon code's row and puts "1" in the assigned column (a SELECT / UPDATE routine).
After that, the JSON is outputed like this:
echo '{"couponCode": "'. $coupon_code . '"}';
That's all.
The problem is that the webservice receives 10000 requests in approximately 1 minute, and this happens only once a day. If I look at the raw Apache logs I can see that it received exactly 10000 requests each time, but in my table only 984 rows have been updated (i.e. 984 coupon codes given out). I tested it multiple times and the number varies between 980 and 986. The log file created by the webservice doesn't show any errors and reflects exactly what was updated in the database: between 980 and 986 new lines each time.
My question is: what happened to the missing requests? Does the server not have enough memory to handle that many requests in such a short period of time? (When I test with 5000 requests it works OK.)
If it helps, here's the function that gets a new coupon code:
function getNewCouponCode($email) {
    $stmt = $this->connector->prepare("SELECT * FROM coupon_code WHERE email = '' ORDER BY id ASC LIMIT 1");
    $stmt2 = $this->connector->prepare("UPDATE coupon_code SET email = :email WHERE id = :id");

    try {
        $this->connector->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
        $this->connector->beginTransaction();
        $this->connector->exec("LOCK TABLES coupon_code WRITE");

        /* TRANSACTION 1 */
        $stmt->execute();
        $result["select"] = $stmt->fetch(PDO::FETCH_ASSOC);
        /* TRANSACTION 1 */

        /* TRANSACTION 2 */
        $stmt2->bindParam(":email", $email);
        $stmt2->bindParam(":id", $result["select"]["id"]);
        $result["update"] = $stmt2->execute();
        /* TRANSACTION 2 */

        $this->connector->commit();
        $this->connector->exec('UNLOCK TABLES');
        return $result;
    } catch (Exception $e) {
        $this->connector->rollBack();
        $this->connector->exec('UNLOCK TABLES');
        $result["error"] = $e->getMessage();
        return $result;
    }
}
Thanks in advance.
986 requests per minute is a pretty significant load for a PHP application designed the way yours is, and for an Apache web server. It sounds like you're running all of this on a single server.
First off, whatever is slamming you 10k times per minute should know to re-try later on if it gets a failure. Why isn't that happening? If that remote system is under your control, see if you can fix that.
Next, you'll find that the threading model of Nginx is much more efficient than Apache's for what you're doing.
Now, on to your application... it doesn't look like you actually need a SELECT and then an UPDATE. Why not just do the UPDATE and check the result? Then it's atomic on its own and you don't need the table locking (which is really going to slow you down). A sketch of that approach is below.
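This is only a sketch of the idea; the column holding the actual code is assumed to be called code, and $pdo stands in for the question's $this->connector:
// One atomic UPDATE claims a single unassigned code for this email.
$claim = $pdo->prepare(
    "UPDATE coupon_code SET email = :email WHERE email = '' ORDER BY id ASC LIMIT 1"
);
$claim->execute(array(':email' => $email));

if ($claim->rowCount() === 1) {
    // The row is already marked as ours, so it can be read back without any lock.
    $read = $pdo->prepare("SELECT code FROM coupon_code WHERE email = :email LIMIT 1");
    $read->execute(array(':email' => $email));
    $coupon_code = $read->fetchColumn();
} else {
    $coupon_code = null; // no unassigned codes left
}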
I have a script that is running on a shared hosting environment where I can't change the available amount of PHP memory. The script is consuming a web service via soap. I can't get all my data at once or else it runs out of memory so I have had some success with caching the data locally in a mysql database so that subsequent queries are faster.
Basically instead of querying the web service for 5 months of data I am querying it 1 month at a time and storing that in the mysql table and retrieving the next month etc. This usually works but I sometimes still run out of memory.
my basic code logic is like this:
connect to web service using soap;
connect to mysql database
query web service and store result in variable $results;
dump $results into mysql table
repeat steps 3 and 4 for each month of data
The same variables are used in each iteration, so I would assume that each batch of results from the web service overwrites the previous one in memory. I tried using unset($results) between iterations, but that didn't do anything. I am outputting the memory used with memory_get_usage(true) each time, and with every iteration the memory used increases.
Any ideas how I can fix this memory leak? If I wasn't clear enough leave a comment and I can provide more details. Thanks!
EDIT:
Here is some code (I am using NuSOAP, not the PHP 5 native SOAP client, if that makes a difference):
$startingDate = strtotime("3/1/2011");
$endingDate = strtotime("7/31/2011");

// connect to database
mysql_connect("dbhost.com", "dbusername", "dbpassword");
mysql_select_db("dbname");

// configure nusoap
$serverpath = 'http://path.to/wsdl';
$client = new nusoap_client($serverpath);

// cache soap results locally, one month at a time
while ($startingDate <= $endingDate) {
    $sql = "SELECT * FROM table WHERE date >= '".date('Y-m-d', $startingDate)."' AND date <= '".date('Y-m-d', strtotime('+1 month', $startingDate))."'";
    $soapResult = $client->call('SelectData', $sql);

    foreach ($soapResult['SelectDateResult']['Result']['Row'] as $row) {
        foreach ($row as &$data) {
            $data = mysql_real_escape_string($data);
        }
        $sql = "INSERT INTO table VALUES('".$row['dataOne']."', '".$row['dataTwo']."', '".$row['dataThree']."')";
        $mysqlResults = mysql_query($sql);
    }

    $startingDate = strtotime('+1 month', $startingDate);
    echo memory_get_usage(true); // MEMORY INCREASES EACH ITERATION
}
Solved it, at least partially. There is a memory leak when using NuSOAP: it writes a debug log to a $GLOBALS variable. Altering this line in nusoap.php freed up a lot of memory.
change
$GLOBALS['_transient']['static']['nusoap_base']->globalDebugLevel = 9;
to
$GLOBALS['_transient']['static']['nusoap_base']->globalDebugLevel = 0;
I'd prefer to just use PHP 5's native SOAP client, but I'm getting strange results that I believe are specific to the web service I am trying to consume. If anyone is familiar with using PHP 5's SOAP client with www.mindbodyonline.com 's SOAP API, let me know.
Have you tried unset() on $startingDate and mysql_free_result() for $mysqlResults?
Also SELECT * is frowned upon even if that's not the problem here.
EDIT: Perhaps also free the SOAP result. Some simple things to start with to see if they help; a sketch is below.
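Something along these lines inside the monthly loop (purely illustrative; whether it helps depends on where the memory is actually being held):
// After the INSERTs for this month's batch are done:
if (is_resource($mysqlResults)) {
    mysql_free_result($mysqlResults); // only applies to SELECT-style result sets, not INSERTs
}
unset($soapResult, $row); // drop references to this month's SOAP payload before the next call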
Afternoon chaps,
Trying to index a 1.7 million row table with the Zend port of Lucene. On small tests of a few thousand rows it worked perfectly, but as soon as I try to push the row count up to a few tens of thousands, it times out. Obviously I could increase the time PHP allows the script to run, but seeing as 360 seconds gets me ~10,000 rows, I'd hate to think how many seconds it would take to do 1.7 million.
I've also tried making the script index a few thousand rows, refreshing, and then running the next few thousand, but doing this clears the index each time.
Any ideas guys?
Thanks :)
I'm sorry to say it, because the developer of Zend_Search_Lucene is a friend and he has worked really hard on it, but unfortunately it's not suitable for creating indexes on data sets of any nontrivial size.
Use Apache Solr to create the indexes. In my testing, Solr creates indexes more than 300x faster than Zend.
You could use Zend_Search_Lucene to issue queries against the index you created with Apache Solr.
Of course you could also use the PHP PECL Solr extension, which I would recommend; a sketch is below.
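A minimal sketch of indexing rows with the PECL Solr extension (the host, port, core path and field names are illustrative, and a matching Solr schema is assumed to exist):
$client = new SolrClient(array(
    'hostname' => 'localhost',
    'port'     => 8983,
    'path'     => '/solr/businesses',
));

// Same MySQL result set as in the code further down
while ($record = $result->fetch_assoc()) {
    $doc = new SolrInputDocument();
    $doc->addField('id', $record['id']);
    $doc->addField('company', $record['company']);
    $doc->addField('psearch', $record['psearch']);
    $client->addDocument($doc);
}

$client->commit();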
Try speeding it up by selecting only the fields you require from that table.
If this is something that runs as a cronjob or a worker, then it should be running from the CLI, and in that case I don't see why raising the timeout would be a bad thing. You only have to build the index once; after that, new records or updates to them are only small updates to your Lucene index. A sketch of resuming an existing index between batches is below.
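A sketch of batch indexing without wiping the index on every run (the index path, batch size and CLI argument are illustrative). Zend_Search_Lucene::create() overwrites an existing index, which is likely why re-running the script cleared it, whereas Zend_Search_Lucene::open() appends to it:
$indexPath = '/path/to/lucene-index';

if (is_dir($indexPath)) {
    $index = Zend_Search_Lucene::open($indexPath);   // resume the existing index
} else {
    $index = Zend_Search_Lucene::create($indexPath); // first run only
}

// Index the next slice of rows, e.g. driven by an offset passed on the CLI.
$offset = isset($argv[1]) ? (int) $argv[1] : 0;
$sql = "SELECT id, company, psearch FROM businesses LIMIT 10000 OFFSET " . $offset;
// ...fetch the rows and $index->addDocument() them as in the code below...

$index->commit();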
Some info for you all - posting as an answer so I can use the code styles.
$sql = "SELECT id, company, psearch FROM businesses";
$result = $db->query($sql); // Run SQL
$feeds = array();
$x = 0;
while ( $record = $result->fetch_assoc() ) {
$feeds[$x]['id'] = $record['id'];
$feeds[$x]['company'] = $record['company'];
$feeds[$x]['psearch'] = $record['psearch'];
$x++;
}
//grab each feed
foreach($feeds as $feed) {
$doc = new Zend_Search_Lucene_Document();
$doc->addField(Zend_Search_Lucene_Field::UnIndexed('id',
$feed["id"]));
$doc->addField(Zend_Search_Lucene_Field::Text('company',
$feed["company"]));
$doc->addField(Zend_Search_Lucene_Field::Text('psearch',
$feed["psearch"]));
$doc->addField(Zend_Search_Lucene_Field::UnIndexed('link',
'http://www.google.com'));
//echo "Adding: ". $feed["company"] ."-".$feed['pcode']."\n";
$index->addDocument($doc);
}
$index->commit();
(I've used google.com as a temp link)
The server it's running on is a local install of Ubuntu 8.10 with 3 GB RAM and a dual Pentium 3.2 GHz CPU.