Advice To Improve Efficiency Of API Call And Cache - php

So I have the following code:
private function getArtistInfo($artist){
    $artisan = json_decode($artist, true);
    $artistObj = array();
    //fb($artist);
    $artistObj['id'] = $artisan['name']['ids']['nameId'];
    $memcache = new Memcached($artistObj['id']);
    $artistCache = $memcache->getMemcache();
    if($artistCache === false){
        $artistObj['name'] = $artisan['name']['name'];
        $artistObj['image'] = $artisan['name']['images'][0]['url'];
        $initArtist = array('id' => $artistObj['id'], 'name' => $artistObj['name'], 'image' => $artistObj['image']);
        $artistObj = $this->buildArtist($artisan, $artistObj);
        $memcache->setMemcache($artistObj);
    }
    else{
        $initArtist = array('id' => $artistCache['id'], 'name' => $artistCache['name'], 'image' => $artistCache['image']);
    }
    return $initArtist;
}
Now the code works, but getArtistInfo() takes too long to finish when all I want is the $initArtist value. I would like my client to get $initArtist right away once it is constructed, and somehow let the caching of $artistObj run in the background.
So far I have read up on several different topics I thought might be useful: event delegation, callback functions, call_user_func, the observer pattern, threading, Gearman, etc. However, I have no idea which of them would actually do what I want. Please point me in the right direction.
EDIT:
My Memcached class:
class Memcached {
    private static $MEMCACHED_HOST = "localhost";
    private static $MEMCACHED_PORT = "11211";
    private $id, $key, $memcache, $cacheOK;

    function __construct ($id){
        $this->id = $id;
        $this->key = 'artistID_'. $this->id;
        $this->memcache = new Memcache;
        $this->cacheOK = $this->memcache->connect(Memcached::$MEMCACHED_HOST, Memcached::$MEMCACHED_PORT);
    }

    protected function getMemcache(){
        $artistInfo = null;
        if($this->cacheOK === true){
            $artistInfo = $this->memcache->get($this->key);
        }
        if($artistInfo === false){
            return false;
        }
        return $artistInfo;
    }

    public function setMemcache($artistInfo){
        $this->memcache->set($this->key, $artistInfo, 0, 60);
    }
}
My buildArtist() code:
private function buildArtist($artisan, $artistObj){
    $artistObj['amgID'] = $artisan['name']['ids']['amgPopId'];
    $discography = $artisan['name']['discography'];
    foreach($discography as $album){
        $albumID = $album['ids']['amgPopId'];
        preg_match('/(\d+)/', $albumID, $matches);
        $albumObj['amgAlbumID'] = $matches[1];
        $albumObj['title'] = $album['title'];
        $albumObj['releaseDate'] = $album['year'];
        $albumObj['more'] = $this->getMoreMusic($albumObj['title'], $artistObj['name']);
        $artistObj['discography'][] = $albumObj;
    }
    return $artistObj;
}

Well, it's not entirely clear how long "too long" is, or which part of this code is actually slowing you down. For all we know, the slow part isn't the part that stores the data in Memcached.
In any case, once you identify that this is your bottleneck, one thing you can do to accomplish this type of out-of-order execution is use a brokerless message queue like ZeroMQ to accept JSON objects that need to be cached. A separate PHP script can then take on the job of processing and caching these requests asynchronously, outside of any web request. That separate script could be run through a cron job or some other job manager that handles the caching part in parallel.
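As a rough illustration with the php-zmq extension (the endpoint, the worker script and the ArtistBuilder wiring are assumptions, not part of your code): the web request hands the raw artist JSON to a queue and returns $initArtist immediately, while a long-running worker does the expensive building and caching.
// inside getArtistInfo(), instead of building and caching inline:
$queue = new ZMQContext();
$push = $queue->getSocket(ZMQ::SOCKET_PUSH);
$push->connect("tcp://127.0.0.1:5555");   // assumed endpoint
$push->send($artist);                     // raw JSON, handled later by the worker
return $initArtist;                       // the client gets its answer right away

// cache_worker.php - a separate long-running CLI script
$queue = new ZMQContext();
$pull = $queue->getSocket(ZMQ::SOCKET_PULL);
$pull->bind("tcp://127.0.0.1:5555");
$builder = new ArtistBuilder();                    // hypothetical class exposing buildArtist()
while (true) {
    $artist    = $pull->recv();                    // blocks until a job arrives
    $artisan   = json_decode($artist, true);
    $artistObj = array('id' => $artisan['name']['ids']['nameId']);
    $artistObj = $builder->buildArtist($artisan, $artistObj);
    $memcache  = new Memcached($artistObj['id']);  // your wrapper class from the question
    $memcache->setMemcache($artistObj);
}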

You want to use set and get rather than the memcache persistence ID; I'm not even sure what setMemcache and getMemcache are, but they aren't in the extension documentation.
Here's an example from the documentation:
<?php
$m = new Memcached();
$m->addServer('localhost', 11211);

if (!($ip = $m->get('ip_block'))) {
    if ($m->getResultCode() == Memcached::RES_NOTFOUND) {
        $ip = array();
        $m->set('ip_block', $ip);
    } else {
        /* log error */
        /* ... */
    }
}
Please show the code of buildArtist for help on optimizing it.
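In the meantime, a rough sketch of how that documented set/get pattern could apply to your artist cache (the key prefix and 60-second TTL come from your wrapper class; the rest is an assumption, not a drop-in replacement):
$m = new Memcached();
$m->addServer('localhost', 11211);

$key = 'artistID_' . $artistObj['id'];
$artistCache = $m->get($key);
if ($artistCache === false && $m->getResultCode() == Memcached::RES_NOTFOUND) {
    $artistObj = $this->buildArtist($artisan, $artistObj);
    $m->set($key, $artistObj, 60);   // cache for 60 seconds
    $artistCache = $artistObj;
}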

Related

PHP static caching doesn't work

I am new to PHP and caching in general, and am trying to develop a Facebook share counter for WordPress. As advised by a member here, since it's not optimal to make calls to the API and slow down the website each time, I decided to cache the results and went with the static method. Here's the code I am using.
function fb_cache($atts) {
    $url = $atts['url'];
    static $fb_cache = array();
    if (isset($fb_cache[$url])) {
        $fb_count = $fb_cache[$url];
        return $fb_count;
    } else {
        $fb = json_decode( file_get_contents('http://graph.facebook.com/' . $url) );
        $fb_count = $fb->share->share_count;
        $fb_cache[$url] = $fb_count;
        return $fb_count;
    }
}
This doesn't seem to work: the number keeps changing every few seconds, so the API calls are evidently still being made each time. To use it as a plugin, I also have instantiation code at the end.
static function get_instance() {
    static $instance = false;
    if ( ! $instance ) {
        $instance = new self;
    }
}
Can someone tell me what I am doing wrong? I apologize if it's a noob question and if I am using the wrong method altogether.

Update delay on Google Drive api

I'm building a synchronisation service that keeps track of all the changes in a specific Google Drive folder.
Following this link I built this function:
public function whatChangesAreThere()
{
    $response = [];
    $pageToken = $this->lastPageToken;
    while ($pageToken != null) {
        $listChanges = $this->googleDriveService->changes->listChanges($pageToken, [
            'spaces' => 'drive'
        ]);
        collect($listChanges->changes)->each(function ($change) use (&$response) {
            $fileResponse = [];
            $fileResponse["type"] = $change->type;
            if (array_has($change, "file.name")) {
                $fileResponse["fileName"] = $change->file->name;
            }
            $fileResponse["kind"] = $change->kind;
            $fileResponse["removed"] = $change->removed;
            $fileResponse["teamDriveId"] = $change->teamDriveId;
            $fileResponse["time"] = $change->time;
            $response[$change->fileId] = $fileResponse;
        });
        $pageToken = $listChanges->nextPageToken;
    }
    $this->storeNewPageToken($listChanges->newStartPageToken);
    return $response;
}
The function seems to work: when I change something in the file, I get the name of the changed file in my application. The problem is that there can be a delay of 1-2 minutes between changing the file and seeing it listed by this function.
I'm wondering if I'm doing something wrong or if this delay is due to something else.
What do you think?

php zookeeper watcher doesn't work

I'm trying to use ZooKeeper in a PHP app, and I've implemented most of the get($path)/set($path, $value)/getChildren($path) functions following https://github.com/andreiz/php-zookeeper, except for the watch callback, which just doesn't work.
My PHP version is 5.6.14 with thread safety disabled, and I'm using Apache 2.4.
Here is a code snippet:
class Zookeeper_Module {
    private $zookeeper;

    public function __construct(){
        $this->ci = & get_instance();
        $zookeeper_server = $this->ci->config->item('zookeeper_server');
        $this->zookeeper = new Zookeeper($zookeeper_server);
    }

    public function set($path, $value){
        $this->zookeeper->set($path, $value);
    }

    public function get($path, $watch_cb = null){
        return $this->zookeeper->get($path, $watch_cb);
    }

    public function get_watch_cb($event_type = '', $stat = '', $path = ''){
        error_log('hello from get_watcher_cb');
        $value = $this->get($path, array($this, 'get_watch_cb'));
        // update redis cache
        $this->ci->cache->redis->save('some cache key', $value);
    }
}

class MyTest{
    public function get(){
        $zookeeper = new Zookeeper_Module ();
        $value = $zookeeper->get( '/foo/bar', array (
            $zookeeper,
            'get_watch_cb'
        ) );
    }

    public function set(){
        $zookeeper = new Zookeeper_Module ();
        $zookeeper->set( '/foo/bar', 'some value');
    }
}
I can successfully get or set a node value, but I can neither see the watch callback's log line nor get the Redis cache updated.
I wrote a simpler demo, very similar to https://github.com/andreiz/php-zookeeper/wiki, and the watcher works fine there.
The most significant difference is:
while( true ) {
    echo '.';
    sleep(2);
}
While Java has a JVM container hosting the watchers, PHP has no such container, so we have to use a while(true) loop to keep the watchers alive.
So I added a while(true) to my code, and now the watcher works fine.
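Put together, the keep-alive version looks roughly like this (a sketch for a CLI script, assuming the php-zookeeper extension and the Zookeeper_Module class above):
// watcher.php - run from the CLI, not through Apache
$zookeeper = new Zookeeper_Module();

// register the watch; get_watch_cb re-registers itself on every event
$zookeeper->get('/foo/bar', array($zookeeper, 'get_watch_cb'));

// keep the process (and therefore the watch) alive
while (true) {
    sleep(2);
}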
But I don't want a terrible while(true) loop in a web app, so the final solution was to add a small Java app that talks to ZooKeeper and saves the results in Redis, while the PHP app just reads from Redis.

php memory limit garbage collector

Three days of banging my head against a wall.
I developed a PHP script to import big text files and populate a MySQL database. It works perfectly until I reach about 2 million records, but I need to import around 10 million rows split across different files.
My application scans the files in a folder, gets the file extension (I have 4 import procedures for 4 different extensions) and calls the relevant import function.
I have a structure made of these classes:
CLASS SUBJECT1 {
    __DESTRUCT() { $this->childObject = null; }
    public function import_data_1() {   // IMPORT SUBJECT1
        //fopen($file);
        //ob_start();
        //PDO::BeginTransaction();
        //WHILE (FILE) {
        //    PREPARED STATEMENT
        //    FILE READING
        //    GET FILE LINE
        //    EXECUTE INSERT
        //} END WHILE
        //PDO::Commit();
        //ob_clean(); or ob_flush();
        //fclose($file);
        //clearstatcache();
    }
}
CLASS SUBJECT2 { same as SUBJECT1; }
CLASS SUBJECT3 { same as SUBJECT1; }
CLASS SUBJECT4 { same as SUBJECT1; }
and the main class that launches the procedure:
CLASS MAIN {
    switch($ext) {
        case "ext1":
            $SUBJECT1 = new SUBJECT1();
            IMPORT_SUBJECT1();
            unset($SUBJECT1);
            $SUBJECT1 = null;
            break;
        case "ext2": //SAME AS CASE ext1 WITH IMPORT_SUBJECT2();
        case "ext3": //SAME AS CASE ext1 WITH IMPORT_SUBJECT3();
        case "ext4": //SAME AS CASE ext1 WITH IMPORT_SUBJECT4();
    }
}
It works perfectly with some adjustment of the MySQL InnoDB log files (ib_logfile0 and ib_logfile1 are set to 512MB).
The problem is that every time a procedure terminates, PHP does not free the memory. I'm sure the destructor is called (I put an echo inside the __destruct method) and the object is no longer accessible (var_dump says it is NULL). I have tried many ways to free the memory, but now I'm at a dead point.
I also verified
gc_collect_cycles()
at many different points in the code and it always reports 0 cycles, so no objects are referencing each other.
I even tried deleting the class structure and calling all the code sequentially, but I always get this error:
Fatal error: Out of memory (allocated 511180800) (tried to allocate 576 bytes) in C:\php\index.php on line 219 (line 219 is the execute of a prepared statement on the 13th file).
The memory is used in this way:
php script: 52MB
end first file import: 110MB
destructors and unset calling: 110MB
new procedure calling: 110MB
end second file import: 250MB
destructors and unset calling: 250MB
new procedure calling: 250MB
So as you can see, even after unsetting the objects, the memory is not freed.
I tried setting the PHP memory_limit to 1024M, but it fills up really fast and crashes after 20 files.
Any advice?
Many thanks!
EDIT 1:
posting code:
class SUBJECT1{
    public function __destruct()
    {
        echo 'destroying subject1 <br/>';
    }

    public function import_subject1($file,$par1,$par2){
        global $pdo;
        $aux = new AUX();
        $log = new LOG();

        // ---------------- FILES ----------------
        $input_file = fopen($file, "r");

        // ---------------- PREPARED STATEMENTS ----------------
        $PS_insert_data1 = $pdo->prepare("INSERT INTO table (ID,PAR1,PAR2,PARN) VALUES (?,?,?,?) ON DUPLICATE KEY UPDATE ID = VALUES(ID), PAR1 = VALUES(PAR1), PAR2 = VALUES(PAR2), PAR3 = VALUES(PAR3), PARN = VALUES(PARN)");
        $PS_insert_data2 = $pdo->prepare("INSERT INTO table (ID,PAR1,PAR2,PARN) VALUES (?,?,?,?) ON DUPLICATE KEY UPDATE ID = VALUES(ID), PAR1 = VALUES(PAR1), PAR2 = VALUES(PAR2), PAR3 = VALUES(PAR3), PARN = VALUES(PARN)");

        // IMPORT
        if ($input_file) {
            ob_start();
            $pdo->beginTransaction();
            while (($line = fgets($input_file)) !== false) {
                $line = utf8_encode($line);
                $array_line = explode("|", $line);
                // set null values where I need them
                $array_line = $aux->null_value($array_line);
                if(sizeof($array_line) > 32){
                    if(!empty($array_line[25])){
                        $PS_insert_data1->execute(array($array_line[0], $array_line[1], $array_line[2], $array_line[5]));
                    }
                    $PS_insert_data2->execute(array($array_line[10], $array_line[11], $array_line[12], $array_line[15]));
                }
            }
            $pdo->commit();
            flush();
            ob_clean();
            fclose($input_file);
            clearstatcache();
        }
    }
}
I do this iteratively for all files in my folder; the other procedures follow the same concept.
Memory usage still increases, and now it crashes with a white page response. :-\
Personally, I would go about it slightly differently. These are the steps I would take:
Open a PDO connection, set PDO in Exception mode
Get a list of files that I want to read
Create a class that can utilize PDO and the list of files and perform insertions
Prepare the statement ONCE, utilize it many times
Chunk PDO transaction commits to 50 (configurable) inserts - this means that every 50th time I call $stmt->execute(), I issue a commit - which utilizes the HDD better thus making it faster
Read each file line by line
Parse the line and check if it's valid
If yes, add to MySQL, if not - report an error
Now, I've created two classes and an example of how I'd go about it. I only tested up to the reading part, since I don't know your DB structure or what AUX() does.
class ImportFiles
{
    protected $pdo;
    protected $stmt;
    protected $transaction = false;
    protected $trx_flush_count = 50; // Commit the transaction every 50 iterations

    public function __construct(PDO $pdo = null)
    {
        $this->pdo = $pdo;

        $this->stmt = $this->pdo->prepare("INSERT INTO table
            (ID,PAR1,PAR2,PARN)
            VALUES
            (?,?,?,?)
            ON DUPLICATE KEY UPDATE ID = VALUES(ID), PAR1 = VALUES(PAR1), PAR2 = VALUES(PAR2), PAR3 = VALUES(PAR3), PARN = VALUES(PARN)");
    }

    public function import($file)
    {
        if($this->isReadable($file))
        {
            $file = new FileParser($file);
            $this->insert($file);
        }
        else
        {
            printf("\nSpecified file is not readable: %s", $file);
        }
    }

    protected function isReadable($file)
    {
        return (is_file($file) && is_readable($file));
    }

    protected function insert(FileParser $file)
    {
        while($file->read())
        {
            //printf("\nLine %d, value: %s", $file->getLineCount(), $file->getLine());
            $this->insertRecord($file);
            $this->flush($file);
        }

        $this->flush(null);
    }
    // Untested method, no idea whether it does its job or not - might fail
    protected function flush(FileParser $file = null)
    {
        if(!is_null($file) && !($file->getLineCount() % $this->trx_flush_count))
        {
            if($this->pdo->inTransaction())
            {
                $this->pdo->commit();
                $this->pdo->beginTransaction();
            }
        }
        else
        {
            if($this->pdo->inTransaction())
            {
                $this->pdo->commit();
            }
        }
    }

    protected function insertRecord(FileParser $file)
    {
        $check_value = $file->getParsedLine(25);

        if(!empty($check_value))
        {
            $values = [
                $file->getParsedLine(0),
                $file->getParsedLine(1),
                $file->getParsedLine(2),
                $file->getParsedLine(5)
            ];
        }
        else
        {
            $values = [
                $file->getParsedLine(10),
                $file->getParsedLine(11),
                $file->getParsedLine(12),
                $file->getParsedLine(15)
            ];
        }

        $this->stmt->execute($values);
    }
}
class FileParser
{
    protected $fh;
    protected $lineCount = 0;
    protected $line = null;
    protected $aux;

    public function __construct($file)
    {
        $this->fh = fopen($file, 'r');
    }

    public function read()
    {
        $this->line = fgets($this->fh);

        if($this->line !== false) $this->lineCount++;

        return $this->line;
    }

    public function getLineCount()
    {
        return $this->lineCount;
    }

    public function getLine()
    {
        return $this->line;
    }

    public function getParsedLine($index = null)
    {
        $line = $this->line;

        if(!is_null($line))
        {
            $line = utf8_encode($line);
            $array_line = explode("|", $line);

            // set null values where needed
            $aux = $this->getAUX();
            $array_line = $aux->null_value($array_line);

            if(sizeof($array_line) > 32)
            {
                return is_null($index) ? $array_line : (isset($array_line[$index]) ? $array_line[$index] : null);
            }
            else
            {
                throw new \Exception(sprintf("Invalid array size, expected > 32 got: %s", sizeof($array_line)));
            }
        }
        else
        {
            return [];
        }
    }

    protected function getAUX()
    {
        if(is_null($this->aux))
        {
            $this->aux = new AUX();
        }

        return $this->aux;
    }
}
Usage:
$dsn = 'mysql:dbname=testdb;host=127.0.0.1';
$user = 'dbuser';
$password = 'dbpass';

try
{
    $pdo = new PDO($dsn, $user, $password);
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

    $import = new ImportFiles($pdo);

    $files = ['/usr/local/file1.txt', '/usr/local/file2.txt'];

    foreach($files as $file)
    {
        $import->import($file);
    }
} catch (Exception $e)
{
    printf("\nError: %s", $e->getMessage());
    printf("\nFile: %s", $e->getFile());
    printf("\nLine: %s", $e->getLine());
}
SOLVED:
I took this approach; maybe it is useful for someone with a similar problem:
I opened the task manager and looked at the memory usage of the Apache and MySQL processes in these cases:
Read and processed the files without calling the MySQL procedures (memory usage was OK)
Read, processed and inserted into the DB only the files of one extension at a time (all .ext1, all .ext2, ...)
Debugged the procedure with the big memory increase, isolating functions one by one until I found the problematic one
Found the problem and solved it
The problem was that I called a function passing the prepared statement as a parameter. I thought that, once prepared, it was just a "static" object to call. What happens is that if you pass the same prepared statement into a function, the memory grows exponentially.
Hope this helps someone.
Bye!

Laravel REST API and high CPU load

I am developing a simple RESTful API using Laravel 4.
I have set a Route that calls a function of my Controller that basically does this:
If information is in the database, pack it in a JSON object and return a response
Else try to download it (html/xml parsing), store it and finally pack the JSON response and send it.
I have noticed that while doing a total of 1700 requests, only 2 at a time, the CPU load rises to 70-90%.
I am a complete PHP and Laravel beginner and I built the API following this tutorial, so maybe I'm doing something wrong, or it's just a proof of concept lacking optimizations. How can I improve this code? (The entry point is getGames.)
Do you think the root of the problem is Laravel, or would I get the same result with another framework or with raw PHP?
UPDATE 1: I also set up a file cache, but the CPU load is still ~50%.
UPDATE 2: I set the query rate to two every 500ms and the CPU load dropped to 12%, so I guess this code is missing queue handling or something similar.
class GameController extends BaseController{

    private static $platforms=array(
        "Atari 2600",
        "Commodore 64",
        "Sega Dreamcast",
        "Sega Game Gear",
        "Nintendo Game Boy",
        "Nintendo Game Boy Color",
        "Nintendo Game Boy Advance",
        "Atari Lynx",
        "M.A.M.E.",
        "Sega Mega Drive",
        "Colecovision",
        "Nintendo 64",
        "Nintendo DS",
        "Nintendo Entertainment System (NES)",
        "Neo Geo Pocket",
        "Turbografx 16",
        "Sony PSP",
        "Sony PlayStation",
        "Sega Master System",
        "Super Nintendo (SNES)",
        "Nintendo Virtualboy",
        "Wonderswan");
    private function getDataTGDB($name,$platform){
        $url = 'http://thegamesdb.net/api/GetGame.php?';
        if(null==$name || null==$platform) return NULL;
        $url.='name='.urlencode($name);
        $xml = simplexml_load_file($url);

        $data=new Data;
        $data->query=$name;
        $resultPlatform = (string)$xml->Game->Platform;
        $data->platform=$platform;
        $data->save();

        foreach($xml->Game as $entry){
            $games = Game::where('gameid',(string)$entry->id)->get();
            if($games->count()==0){
                if(strcasecmp($platform , $entry->Platform)==0 ||
                    (strcasecmp($platform ,"Sega Mega Drive")==0 &&
                    ($entry->Platform=="Sega Genesis" ||
                     $entry->Platform=="Sega 32X" ||
                     $entry->Platform=="Sega CD"))){
                    $game = new Game;
                    $game->gameid = (string)$entry->id;
                    $game->title = (string)$entry->GameTitle;
                    $game->releasedate = (string)$entry->ReleaseDate;
                    $genres='';
                    if(NULL!=$entry->Genres->genre)
                        foreach($entry->Genres->genre as $genre){
                            $genres.=$genre.',';
                        }
                    $game->genres=$genres;
                    unset($genres);
                    $game->description = (string)$entry->Overview;
                    foreach($entry->Images->boxart as $boxart){
                        if($boxart["side"]=="front"){
                            $game->bigcoverurl = (string)$boxart;
                            $game->coverurl = (string) $boxart["thumb"];
                        }
                        continue;
                    }
                    $game->save();
                    $data->games()->attach($game->id);
                }
            }
            else foreach($games as $game){
                $data->games()->attach($game->id);
            }
        }
        unset($xml);
        unset($url);

        return $this->printJsonArray($data);
    }
    private function getArcadeHits($name){
        $url = "http://www.arcadehits.net/index.php?p=roms&jeu=";
        $url .=urlencode($name);
        $html = file_get_html($url);

        $data = new Data;
        $data->query=$name;
        $data->platform='M.A.M.E.';
        $data->save();

        $games = Game::where('title',$name)->get();
        if($games->count()==0){
            $game=new Game;
            $game->gameid = -1;
            $title = $html->find('h4',0)->plaintext;
            if("Derniers jeux commentés"==$title)
            {
                unset($game);
                return Response::json(array('status'=>'404'),200);
            }
            else{
                $game->title=$title;
                $game->description="(No description.)";
                $game->releasedate=$html->find('a[href*=yearz]',0)->plaintext;
                $game->genres = $html->find('a[href*=genre]',0)->plaintext;
                $minithumb = $html->find('img.minithumb',0);
                $game->coverurl = $minithumb->src;
                $game->bigcoverurl = str_replace("/thumb/","/jpeg/",$minithumb->src);
                $game->save();
                $data->games()->attach($game->id);
            }
        }
        unset($html);
        unset($url);

        return $this->printJsonArray($data);
    }
    private function printJsonArray($data){
        $games = $data->games()->get();
        $array_games = array();
        foreach($games as $game){
            $array_games[]=array(
                'GameTitle'=>$game->title,
                'ReleaseDate'=>$game->releasedate,
                'Genres'=>$game->genres,
                'Overview'=>$game->description,
                'CoverURL'=>$game->coverurl,
                'BigCoverURL'=>$game->bigcoverurl
            );
        }
        $result = Response::json(array(
            'status'=>'200',
            'Game'=>$array_games
        ),200);
        $key = $data->query.$data->platform;
        if(!Cache::has($key))
            Cache::put($key,$result,1440);
        return $result;
    }
    private static $baseImgUrl = "";

    public function getGames($apikey,$title,$platform){
        $key = $title.$platform;
        if(Cache::has($key)) return Cache::get($key);
        if(!in_array($platform,GameController::$platforms)) return Response::json(array("status"=>"403","message"=>"non valid platform"));

        $datas = Data::where('query',$title)
            ->where('platform',$platform)
            ->get();
        //If this query has already been done we return data, otherwise according to $platform
        //we call the proper parser.
        if($datas->count()==0){
            if("M.A.M.E."==$platform){
                return $this->getArcadeHits($title);
            }
            else{
                return $this->getDataTGDB($title,$platform);
            }
        }
        else{
            return $this->printJsonArray($datas->first());
        }
    }
}
?>
You're trying to retrieve data from other people's servers. That puts your CPU "on hold" until the data is fully retrieved. That's what is making your code so CPU expensive (couldn't find a better way to put it =/ ): your script sits waiting until the data is received before the CPU can move on.
I strongly suggest that you make asynchronous calls. That would free your CPU to work on the code while another part of your system fetches the information you need.
I hope that'll be of some help! =D
UPDATE
To give examples, I'd have to refactor your code (and I'm lazy as anything!). But I can tell you this for sure: if you put the request code, the part that calls other sites' XML, onto a queue, you would gain a lot of free CPU time. Every request gets redirected to a queue; once the results are ready, you treat them as you wish. Laravel has a beautiful way of dealing with queues.
What I would do first is use a profiler to find out which parts actually need optimization. You can use, for example, this:
http://xdebug.org/docs/profiler
You also didn't specify what kind of CPU it is, or how many cores you are using. Is it actually a problem that your CPU usage is that high?
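For reference, enabling the Xdebug 2 profiler is just a matter of a few php.ini settings along these lines (the output directory is a placeholder):
; php.ini - Xdebug 2 profiler settings
xdebug.profiler_enable = 0
; profile only requests that carry the XDEBUG_PROFILE GET/POST/cookie trigger
xdebug.profiler_enable_trigger = 1
xdebug.profiler_output_dir = /tmp/xdebug
The resulting cachegrind files can then be opened in KCachegrind or Webgrind to see where the time is spent.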
You should use Laravel's Queue system along with beanstalkd, for example, and then monitor the queue (worker) with artisan queue:listen.
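A rough sketch of what that could look like in Laravel 4 (the job class name, payload keys and the fetchAndStore() helper are made up for illustration, not part of your code): push the slow scraping work onto the queue from the controller, and let a worker started with php artisan queue:listen process it.
// In the controller: queue the slow fetch instead of doing it inline.
Queue::push('FetchGameJob', array('title' => $title, 'platform' => $platform));

// app/jobs/FetchGameJob.php - picked up by "php artisan queue:listen"
class FetchGameJob {
    public function fire($job, $data)
    {
        // Do the slow TGDB / arcadehits fetching and DB inserts here,
        // e.g. by reusing the parsing code from GameController:
        // fetchAndStore($data['title'], $data['platform']);   // hypothetical helper

        $job->delete(); // remove the job from the queue once it is done
    }
}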
