PHP Laravel: fetching millions of records using MySQL

I have been using Laravel + MySQL for my project and it was working perfectly fine until now. The records keep increasing and have now reached almost a million. The problem is when I try to fetch SMS records from my database using this query:
$smsHistory = SmsLog::where('created_at', '>=', $startDate)
    ->where('created_at', '<=', $endDate)
    ->whereNotNull('gateway_statuscode')
    ->get();
It gives a 500 error without writing anything to the error log. Since I do get results when I decrease the time period, I assume the problem is the sheer volume it cannot handle. What could be a possible solution? I have to deliver this today.
I am not worried about the error log. I want to get a count over those million records, but I have to apply some logic to each one before counting it.
This is the check I have to perform afterwards to see how many SMS credits are in the SMS log:
foreach ($smsHistory as $sms) {
    $sms_content = SmsService::_getSmsContent($sms);
    if ($sms_content->business_id && array_key_exists($sms_content->business_id, $smsCredits)) {
        if (floor(strlen($sms_content->content) / 160) == 0) {
            $smsCredits[$sms_content->business_id]['count'] += 1;
        }
        if (floor(strlen($sms_content->content) / 160) == 1 && floor(strlen($sms_content->content) / 306) == 0) {
            $smsCredits[$sms_content->business_id]['count'] += 2;
        }
        if (floor(strlen($sms_content->content) / 306) == 1 && floor(strlen($sms_content->content) / 459) == 0) {
            $smsCredits[$sms_content->business_id]['count'] += 3;
        }
        if (floor(strlen($sms_content->content) / 459) == 1 && floor(strlen($sms_content->content) / 621) == 0) {
            $smsCredits[$sms_content->business_id]['count'] += 4;
        }
        if (floor(strlen($sms_content->content) / 621) == 1 && floor(strlen($sms_content->content) / 774) == 0) {
            $smsCredits[$sms_content->business_id]['count'] += 5;
        }
        if (floor(strlen($sms_content->content) / 774) == 1 && floor(strlen($sms_content->content) / 927) == 0) {
            $smsCredits[$sms_content->business_id]['count'] += 6;
        }
    }
}
This is the database field: content is the SMS text that I have to fetch and count.

Regarding the 500 error:
If you are getting a 500 error, there should hopefully be some clue in the actual server error log (your Laravel application error log may not have caught it, depending on the error handlers and what the cause was). phpinfo() should show you the location of the physical error log under the error_log setting:
<?php phpinfo(); ?>
If I had to guess, it is possibly something memory related that is causing it to choke. But that is just a guess.
As a side question, why are you trying to retrieve so many at once? Can you split them up somehow?
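One way to split them up: a sketch using Laravel's chunk(), which pages through the result set in fixed-size batches instead of loading everything into memory at once. The query conditions are copied from the question; the batch size of 1000 is just a starting point to tune.

SmsLog::where('created_at', '>=', $startDate)
    ->where('created_at', '<=', $endDate)
    ->whereNotNull('gateway_statuscode')
    ->chunk(1000, function ($smsHistory) use (&$smsCredits) {
        foreach ($smsHistory as $sms) {
            // run the existing per-SMS credit logic from the question here
        }
    });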
Edit based on updated question:
You may need to use some raw expressions here to get what you really want: https://www.laravel.com/docs/4.2/queries#raw-expressions
MySQL, for example, provides the ability to compute a column's length:
FLOOR(CHAR_LENGTH(content) / 160) as len1, FLOOR(CHAR_LENGTH(content) / 306) as len2, FLOOR(CHAR_LENGTH(content) / 459) as len3
So with some work we could probably take your query and let the database system do it all for you. There are probably more efficient ways to do it, and I could say more if I knew the significance of those numbers, but I am trying to help you get onto one possible path.
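For instance, a sketch that pushes the whole credit calculation into MySQL with a CASE expression. The thresholds come from your if-blocks; the sms_logs table name is assumed from the model name, and this supposes the row's content column is the same text _getSmsContent() returns:

$totals = DB::table('sms_logs')
    ->select(DB::raw("business_id, SUM(CASE
            WHEN CHAR_LENGTH(content) < 160 THEN 1
            WHEN CHAR_LENGTH(content) < 306 THEN 2
            WHEN CHAR_LENGTH(content) < 459 THEN 3
            WHEN CHAR_LENGTH(content) < 621 THEN 4
            WHEN CHAR_LENGTH(content) < 774 THEN 5
            ELSE 6
        END) AS credits"))
    ->where('created_at', '>=', $startDate)
    ->where('created_at', '<=', $endDate)
    ->whereNotNull('gateway_statuscode')
    ->groupBy('business_id')
    ->get();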

You should set up an index in MySQL, or implement a search engine like Elasticsearch.
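For the query in the question, a composite index on the filtered columns is the first thing to try. A sketch of a Laravel migration change (table name assumed as sms_logs):

Schema::table('sms_logs', function ($table) {
    // Composite index covering the WHERE clause: range filter on created_at
    // plus the NOT NULL check on gateway_statuscode.
    $table->index(['created_at', 'gateway_statuscode']);
});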

Related

Google Cloud Vision - PHP Error occurred during parsing

I'm using the Vision API Client Library for PHP.
This is my code:
use Google\Cloud\Vision\V1\ImageAnnotatorClient;

putenv("GOOGLE_APPLICATION_CREDENTIALS=/json.json");

$imageAnnotator = new ImageAnnotatorClient();
$fileName = 'textinjpeg.jpg';
$image = file_get_contents($fileName);

$response = $imageAnnotator->labelDetection($image);
$labels = $response->getLabelAnnotations();
if ($labels) {
    echo("Labels:" . PHP_EOL);
    foreach ($labels as $label) {
        echo($label->getDescription() . PHP_EOL);
    }
} else {
    echo('No label found' . PHP_EOL);
}
And I receive this error:
Error occurred during parsing: Fail to push limit. (0)
/srv/www/site.ru/htdocs/vendor/google/protobuf/src/Google/Protobuf/Internal/CodedInputStream.php:345
#0: Google\Protobuf\Internal\CodedInputStream->pushLimit(integer)
/srv/www/site.ru/htdocs/vendor/google/protobuf/src/Google/Protobuf/Internal/CodedInputStream.php:368
#1: Google\Protobuf\Internal\CodedInputStream->incrementRecursionDepthAndPushLimit(integer, integer, integer)
....
....
....
#15: Google\Cloud\Vision\V1\ImageAnnotatorClient->labelDetection(string)
/srv/www/site.ru/htdocs/local/php_interface/GoogleCloud.php:41
This is the place the exception is thrown from:
public function pushLimit($byte_limit)
{
    // Current position relative to the beginning of the stream.
    $current_position = $this->current();
    $old_limit = $this->current_limit;
    // security: byte_limit is possibly evil, so check for negative values
    // and overflow.
    if ($byte_limit >= 0 &&
        $byte_limit <= PHP_INT_MAX - $current_position &&
        $byte_limit <= $this->current_limit - $current_position) {
        $this->current_limit = $current_position + $byte_limit;
        $this->recomputeBufferLimits();
    } else {
        throw new GPBDecodeException("Fail to push limit.");
    }
    return $old_limit;
}
$byte_limit <= $this->current_limit - $current_position is true.
Should I increase current_position? And if so, how can I do it? Do I change something on the server or in the PHP config?
It's a crazy thing!
The error "Fail to push limit" appears from time to time in forums, with various ideas as to where the problem could lie. One cause can be when the source code is built on a local PC via Composer and then transferred to the server via (S)FTP: the FTP program decides, based on the file extension, whether it stores the data on the server in ASCII or binary format.
In vendor/google/protobuf/src/Google/Protobuf/ there are various generated files that have a .php extension but are actually binary! (If you open such a file, you can see it immediately, e.g. vendor/google/protobuf/src/GPBMetadata/Google/Protobuf/Any.php.)
Transferring these files to the server explicitly in binary mode solved it in my case. If in doubt, transfer the complete Google/protobuf module as binary...
You mentioned that $byte_limit <= $this->current_limit - $current_position is true, so either $byte_limit >= 0 or $byte_limit <= PHP_INT_MAX - $current_position must be false.
If $byte_limit <= PHP_INT_MAX - $current_position is false, then increasing $current_position won't turn it true. If you want to tweak the values so the expression evaluates to true, you would need to increase the value of PHP_INT_MAX instead.
If $byte_limit >= 0 is false, then modifying $current_limit won't avoid the exception.
Either way, it seems the error is an issue with the protobuf PHP library, so I'd recommend reporting the issue there rather than trying to modify the values directly.
mbstring.func_overload was set to 2.
This was the cause of the error.
Changing it to 0 fixed it.
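A quick way to check for this: mbstring.func_overload (removed entirely in PHP 8) silently swaps byte-level string functions such as strlen() for multibyte-aware versions, which breaks binary parsing like protobuf decoding. A minimal check from code:

// Should print "0" (or an empty value if mbstring is absent); anything else
// means byte-oriented string functions are being overloaded.
var_dump(ini_get('mbstring.func_overload'));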

Solution for calling a function doing lots of stuff in it by Cron?

function cronProcess() {
    # > 100,000 users
    $users = $this->UserModel->getUsers();
    foreach ($users as $user) {
        # Do lots of database Insert/Update/Delete, HTTP request stuff
    }
}
The problem happens when the number of users reaches ~100,000.
I call the function via cURL from a crontab entry.
So what is the best solution for this?
I do a lot of bulk tasks in CakePHP, some processing millions of records. It's certainly possible; the key, as others suggested, is small batches in a loop.
If this is something you're calling from cron, it's probably easier to use a Shell (< v3.5) or the newer Command class (v3.6+) than cURL.
Here's generally how I paginate large batches, including some helpful optional things like a progress bar, turning off hydration to speed things up slightly, and showing how many users/second the script was able to process:
<?php
namespace App\Command;

use Cake\Console\Arguments;
use Cake\Console\Command;
use Cake\Console\ConsoleIo;

class UsersCommand extends Command
{
    public function execute(Arguments $args, ConsoleIo $io)
    {
        // I'd guess a Finder would be a more Cake-y way of getting users than a custom "getUsers" function:
        // See https://book.cakephp.org/3.0/en/orm/retrieving-data-and-resultsets.html#custom-finder-methods
        $usersQuery = $this->UserModel->find('users');

        // Get a total so we know how many we're gonna have to process (optional)
        $total = $usersQuery->count();
        if ($total === 0) {
            $io->error("No users found, stopping..");
            $this->abort();
        }

        // Hydration takes extra processing time & memory, which can add up in bulk.
        // Optionally, if able, skip it & work with $user as an array not an object:
        $usersQuery->enableHydration(false);

        $io->info("Grabbing $total users for processing");

        // Optionally show the progress so we can visually see how far we are in the process
        $progress = $io->helper('Progress')->init([
            'total' => $total
        ]);

        // Tune this page value to a size that solves your problem:
        $limit = 1000;
        $offset = 0;

        // Simply drawing the progress bar every loop can slow things down; optionally draw it
        // only every n-loops. This sets it to 1/5th the page size:
        $progressInterval = $limit / 5;

        // Optionally track the rate so we can evaluate the speed of the process,
        // helpful when tuning $limit and evaluating enableHydration effects
        $startTime = microtime(true);

        do {
            $users = $usersQuery->limit($limit)->offset($offset)->toArray();
            $count = count($users);
            $index = 0;
            foreach ($users as $user) {
                $progress->increment(1);
                // Only draw occasionally, for speed
                if ($index % $progressInterval === 0) {
                    $progress->draw();
                }
                $index++;

                ### WORK TIME
                # Do your lots of database Insert/Update/Delete, HTTP request stuff etc. here
                ###
            }
            $progress->draw();

            $offset += $limit; // Increment your offset to the next page
        } while ($count > 0);

        $totalTime = microtime(true) - $startTime;
        $io->out("\nProcessed an average " . ($total / $totalTime) . " Users/sec\n");
    }
}
Check out these sections in the CakePHP docs:
Console Commands
Command Helpers
Using Finders & Disabling Hydration
Hope this helps!

Any performance benefits of using apc_store vs apc_add (or vice versa)?

While I understand the differences between apc_store and apc_add, I was wondering if using one or the other has any performance benefits?
One would think apc_store COULD be a bit quicker since it does not need to do a check-if-exists before doing the insert.
Am I correct in my thinking?
Or would using apc_add in situations where we know FOR SURE that the entry does not exist prove to be a bit faster?
Short: apc_store() should be slightly slower than apc_add().
Longer: the only difference between the two is the exclusive flag passed to apc_store_helper(), which in turn leads to the behavior difference in apc_cache_insert().
Here is what happens there:
    if (((*slot)->key.h == key.h) && (!memcmp((*slot)->key.str, key.str, key.len))) {
        if (exclusive) {
            if (!(*slot)->value->ttl || (time_t) ((*slot)->ctime + (*slot)->value->ttl) >= t) {
                goto nothing;
            }
        }
        // THIS IS THE MAIN DIFFERENCE
        apc_cache_remove_slot(cache, slot TSRMLS_CC);
        break;
    } else if ((cache->ttl && (time_t)(*slot)->atime < (t - (time_t)cache->ttl)) ||
               ((*slot)->value->ttl && (time_t) ((*slot)->ctime + (*slot)->value->ttl) < t)) {
        apc_cache_remove_slot(cache, slot TSRMLS_CC);
        continue;
    }
    slot = &(*slot)->next;
}

if ((*slot = make_slot(cache, &key, value, *slot, t TSRMLS_CC)) != NULL) {
    value->mem_size = ctxt->pool->size;
    cache->header->mem_size += ctxt->pool->size;
    cache->header->nentries++;
    cache->header->ninserts++;
} else {
    goto nothing;
}
The main difference is that apc_add() saves one slot removal if the value is already present. Real-world benchmarks would obviously make a lot of sense to confirm that analysis.
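A rough micro-benchmark sketch to that end, assuming the classic APC extension's apc_store()/apc_add() (under APCu the equivalents are apcu_store()/apcu_add()); treat the numbers as a hint, not proof:

$n = 100000;

// Warm the cache first so both functions hit existing entries;
// that is the case where store (remove + reinsert) and add (no-op) diverge.
foreach (['apc_store', 'apc_add'] as $fn) {
    for ($i = 0; $i < $n; $i++) {
        apc_store("bench_$i", $i);
    }
    $t = microtime(true);
    for ($i = 0; $i < $n; $i++) {
        $fn("bench_$i", $i);
    }
    printf("%s over existing keys: %.4f sec\n", $fn, microtime(true) - $t);
}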

Implement atomic counter in Memcached without cas

We have a web page that we want at most 100 people to access concurrently, so we use memcached to implement a global counter.
We are using http://www.php.net/manual/en/class.memcache.php, so there is no CAS; the current code is something like:
$count = $memcache_obj->get('count');
if ($count < 100) {
    $memcache_obj->set('count', $count + 1);
    echo "Welcome";
} else {
    echo "No luck";
}
As you can see, there is a race condition in the above code. Without replacing the memcached extension with one that supports CAS, is it possible to avoid it using PHP code only?
Building on emcconville's answer below, this is non-blocking even without CAS. The NOT_FOUND case is handled with Memcache::add, which only succeeds for a single caller, so there is no window in which two processes both initialize the counter:
$current = $memcache_obj->increment('count');
if ($current === false) {
    // NOT_FOUND, so let's create it.
    // add() will return false if the key has just been created by someone else.
    $memcache_obj->add('count', 0); // <-- no risk of race condition
    // At this point the 'count' key was created by us or someone else (other server/process).
    // increment() will bump 0, or whatever it is at the moment, by 1.
    $current = $memcache_obj->increment('count');
    echo "You are the $current!";
}
if ($current < 100) {
    echo "Hazah! You are under the limit. Congrats!";
} else {
    echo "Ah snap! No luck - you reached the limit.";
    // If you're worried about the value growing too big, just drop it back down:
    // $memcache_obj->decrement('count');
}
If you're concerned about race conditions and the count value is completely arbitrary, you can use Memcache::increment directly before any business logic.
The increment method returns the current value after the incrementation takes place, which you can compare against your limit. Increment will also return false if the key has not yet been set, allowing your application to deal with it as needed.
$current = $memcache_obj->increment('count');
if ($current === false) {
    // NOT_FOUND, so let's create it
    $memcache_obj->set('count', 1); // <-- still a risk of race condition
    echo "You're the first!";
} else if ($current < 100) {
    echo "Hazah! You're under the limit.";
} else {
    echo "Ah snap! No luck.";
    // If you're worried about the value growing too big, just drop it back down:
    // $memcache_obj->decrement('count');
}
function memcache_atomic_increment($counter_name, $delta = 1) {
    $mc = new Memcache;
    $mc->connect('localhost', 11211) or die("Could not connect");
    while (($mc->increment($counter_name, $delta) === false) &&
           ($mc->add($counter_name, ($delta < 0) ? 0 : $delta, 0, 0) === false)) {
        // loop until one of them succeeds
    }
    return intval($mc->get($counter_name));
}
The comments on Memcache::add in the manual include an example locking function; have you tried it out?
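For reference, a sketch of that add()-based locking pattern; the key name, the 5-second expiry, and the retry sleep are illustrative choices, not taken from the manual. add() only succeeds when the key does not already exist, so it works as a mutex around the read-modify-write:

$lockKey = 'count_lock';
// add() fails if the key exists, so only one process acquires the lock;
// the 5-second expiry keeps a crashed process from holding it forever.
while ($memcache_obj->add($lockKey, 1, 0, 5) === false) {
    usleep(10000); // lock held elsewhere; back off briefly and retry
}

// Critical section: the get/set pair is now safe from interleaving.
$count = (int) $memcache_obj->get('count');
if ($count < 100) {
    $memcache_obj->set('count', $count + 1);
}

$memcache_obj->delete($lockKey); // release the lock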

How to deal with invalid page number in Codeigniter?

I'm doing QA on a new app I've created using CI. With pagination implemented, what's the typical way of dealing with invalid page numbers? Nothing stops me from manually changing the offset in the URL. So if the max is 20 and I modify it to 100, what should happen? I'm brainstorming ways to check whether the offset is valid and, if it isn't, redirect somewhere or display an error message (not sure I care to do so).
You could do a check on the number of results returned and show a 404.
if ($this->page->get_pages(20, 100)) {
    $this->load->view('our_view');
} else {
    show_404();
}
If anyone comes across this later, here's how I tackled the issue.
$max_offset = (ceil($config['total_rows'] / $config['per_page']) - 1) * $config['per_page'];
if (($this->start > $max_offset) && ($max_offset >= 0)) {
    redirect("/report/filter");
} else {
    $data['records'] = $this->report_model->filterItem($records, false, $config['per_page'], $this->start);
    $this->load->view('report_view', $data);
}
