could you please help me to find what cause this process to reach 500MB of memory usage.
It is basically an html page downloader.
Despite the fact that the process is stable (and do not exceed that limit), it' meant to use on low performing machine and I'm not satisfied.
The size of the mysql table 'Sites' is 170MB.
following the script code.
Thanks in advance.
function start() {
try {
global $log;
$db = getConnection();
Zend_Db_Table::setDefaultAdapter($db);
$log->logInfo("logger start");
while (1) {
$sitesTable = new Zend_Db_Table('Sites');
$rowset = $sitesTable->fetchAll();
foreach ($rowset as $row) {
if (time() >= (strtotime($row->lastUpdate) + $row->pollingHours * 60 * 60)) {
db_updateHtml($row);
}
}
}
} catch (Exception $e) {
global $log;
$log->logError($e->getMessage());
}
}
function db_updateHtml($siteRecord) {
try {
if ($siteRecord instanceof Zend_Db_Table_Row) {
$rowwithConnection = $siteRecord;
$url = $siteRecord->url;
$idSite = $siteRecord->idSite;
$crawler = new Crawler();
$sitesTable = new Zend_Db_Table('Sites');
//$rowwithConnection = $sitesTable->fetchRow(
// $sitesTable->select()->where('idSite = ?', $idSite));
$newHtml = HtmlDbEncode($crawler->get_web_page($url));
if (strlen($newHtml) < 10) {
global $log;
$log->logError("Download failed for: url: $url \t idsite: $idSite ");
}
if ($rowwithConnection->isChecked != 0) {
$rowwithConnection->oldHtml = $rowwithConnection->newHtml;
$rowwithConnection->isChecked = 0;
}
$rowwithConnection->newHtml = $crawler->get_web_page($url);
$rowwithConnection->lastUpdate = date("Y-m-d H:i:s");
//$rowwithConnection->diffHtml = getDiff($rowwithConnection->oldHtml, $rowwithConnection->newHtml, false, $rowwithConnection->minLengthChange);
$rowwithConnection->diffHtml = getDiffFromRecord($rowwithConnection, false, $rowwithConnection->minLengthChange);
/* if (strlen($rowwithConnection->diffHtml) > 30) {
$rowwithConnection->lastChanged = $rowwithConnection->lastUpdate;
} */
$rowwithConnection->save();
} else {
$log->logCrit("siteRecord is uninitialized");
}
} catch (Exception $e) {
global $log;
$log->logError($e->getMessage());
}
}
function getDiffFromRecord($row, $force = false, $minLengthChange = 100) {
if ($row instanceof Zend_Db_Table_Row) {
require_once '/var/www/diff/library/finediff.php';
include_once '/var/www/diff/library/Text/Diff.php';
$diff = new AndreaDiff();
$differences = $diff->getDiff($row->oldHtml, $row->newHtml);
if ($diff->isChanged($minLengthChange) || $force) {
$row->lastChanged = $row->lastUpdate;
$row->isChecked = false;
return ($differences);
}
}
return null;
}
function getConnection() {
try {
$pdoParams = array(
PDO::MYSQL_ATTR_USE_BUFFERED_QUERY => true
);
$db = new Zend_Db_Adapter_Pdo_Mysql(array(
'host' => '127.0.0.1',
'username' => 'root',
'password' => 'administrator',
'dbname' => 'diff',
'driver_options' => $pdoParams
));
return $db;
} catch (Exception $e) {
global $log;
$log->logError($e->getMessage());
}
}
1) Try use fetch method, not fetchAll:
foreach($sitesTable->fetch() as $row){
//...
}
2) try to unset all variables which store html code (if you save it in memory), at last iteration i suppose variable $rowwithConnection will have html code inside.
When i want profile php application i use xhprof it will save you a LOT of time. Good Luck!
Related
There are 3 queues in redis.
If I insert a data at a same time with different browsers, both requests are accepted.
I get 2 duplicated records. Any helps?
For example, the first request is on the 1st queue.
At the second request, 2nd queue is searched for finding the data.
But the data doesn't exist in the queue.
The second request is inserted on the queue.
Is it understandable?
public function save($evt_no, $type = "redis") {
$name = '';
$res = true;
try{
$headers = getallheaders();
$bodys = $_REQUEST;
$session = $_SESSION;
foreach ($_FILES as $fileKey=>$fileVal) {
if ($fileVal['error'] == 0) {
try {
$uploaded_row = $this->fileL->upload('queueEventParticipant', $fileKey, array() ,'queue');
} catch (\Exception $ex) {
throw $ex;
}
$bodys['file'][_return_numeric($fileKey)] = $uploaded_row;
}
}
$data = array(
'header' => $headers,
'body' => $bodys,
'session' => $session
);
$data['body']['evt_no'] = $evt_no;
$data = json_encode($data);
if($type == "redis"){
$name = $this->attendQueueRedisService->setNewRsvp($data);
} else {
throw new \Exception("No exist");
}
} catch (\Exception $ex){
log_message("error", $ex->getMessage());
throw $ex;
}
if($res){
return $name;
} else {
return "Error";
}
}
class AttendQueueRedisService extends CI_Model {
private $redisClient;
private $redisConfig;
const PREFIX_DONE = 'done_';
const PREFIX_ERROR = 'error_';
public function __construct() {
parent::__construct();
if ($this->load->config('redis', true, true)) {
$this->redisConfig = $this->config->config['redis'];
}
$this->redisClient = new Redis();
$this->redisClient->connect($this->redisConfig['hostname'], $this->redisConfig['port']);
$this->redisClient->auth($this->redisConfig['auth']);
$this->redisClient->select($this->redisConfig['queue']);
}
public function setNewRsvp($data) {
$key = uniqid('rsvp_'.gethostname().'_',true).':redis';
$this->redisClient->set($key, $data, 60 * 30);
return $key;
}
}
In my project, I have defined two classes Datasource and RedisHelper in Datasource.php file.
Datasource is for getting data from redis.
RedisHelper is for connecting redis.
Here is code of Datasource.php:
<?php
namespace common;
class Datasource {
public static $redises = array();
public function __construct() {}
public static function getRedis($config_name = NULL, $server_region = 'default') {
global $config;
$redis_config = $config['redis'][$config_name];
try {
self::$redises[$config_name] = RedisHelper::instance($config_name, $redis_config, $server_region);
} catch (Exception $e) {
self::$redises[$config_name] = null;
}
return self::$redises[$config_name];
}
}
class RedisHelper {
private $_config_name = "";
private $_redis_config = null;
private $_server_region = null;
public $timeout = 1;
private $_redis = null;
private static $instances = array();
private static $connect_error = 0;
private $call_error = 0;
private function __construct($config_name, $redis_config, $server_region) {
if ($config_name && $redis_config && $server_region) {
$this->_config_name = $config_name;
$this->_redis_config = $redis_config;
$this->_server_region = $server_region;
$this->timeout = isset($this->_redis_config[$server_region]['timeout']) ? $this->_redis_config[$server_region]['timeout'] : $this->timeout;
try {
$this->_redis = new \redis();
$this->_redis->connect($this->_redis_config[$server_region]['host'], $this->_redis_config[$server_region]['port'], $this->timeout);
} catch (Exception $e) {
$this->_redis = null;
}
} else {
$this->_redis = null;
}
}
public static function instance($config_name, $redis_config, $server_region) {
if (!$config_name || !$redis_config) {
return false;
}
$only_key = $config_name . ':' . $server_region;
if (!isset(self::$instances[$only_key])) {
try {
self::$instances[$only_key] = new RedisHelper($config_name, $redis_config, $server_region);
self::$connect_error = 0;
} catch (Exception $e) {
if (self::$connect_error < 2) {
self::$connect_error += 1;
return RedisHelper::instance($config_name, $redis_config, $server_region);
} else {
self::$connect_error = 0;
self::$instances[$only_key] = new RedisHelper(false, false, false);
}
}
}
$redis_config_info = array();
if ($redis_config && isset($redis_config[$server_region]) && isset($redis_config[$server_region]['password'])) {
$redis_config_info = $redis_config[$server_region];
unset($redis_config_info['password']);
}
self::$connect_error = 0;
return self::$instances[$only_key];
}
}
Here is redis.ini.php file:
<?php
$config['redis']['instance1'] = array(
'default' => array(
'host' => '127.0.0.1',
'port' => '6379',
'timeout' => 5,
'pconnect' => 1,
'password' => '',
)
);
$config['redis']['instance2'] = array(
'default' => array(
'host' => '127.0.0.1',
'port' => '6379',
'timeout' => 5,
'pconnect' => 1,
'password' => '',
)
);
Now I want to connect redis with classes, Here is my html code:
<body style="height:100%" >
<div><h3>haha</h3></div>
<?php
include "o1ws1v/class/common/Datasource.php";
include 'o1ws1v/conf/redis.ini.php';
$redis_obj = common\Datasource::getRedis('instance1');
$value = $redis_obj->get("xie");
echo "get key xie is:".$value."\n";
?>
</body>
But It works fail. $redis_obj is nothing in my console. Who can help me?
I am use mgp25/Instagram-API
How can l get my instagram posts with the likes of a particular user?
My code:
set_time_limit(0);
date_default_timezone_set('UTC');
require __DIR__.'/vendor/autoload.php';
$username = 'myInstagramUsername';
$password = 'myInstagramPassword';
$debug = false;
$truncatedDebug = false;
$ig = new \InstagramAPI\Instagram($debug, $truncatedDebug);
try {
$ig->login($username, $password);
} catch (\Exception $e) {
echo 'Something went wrong: '.$e->getMessage()."\n";
exit(0);
}
try {
$userId = $ig->people->getUserIdForName($username);
$act = json_encode($ig->people->getRecentActivityInbox(), true);
???????
} catch (\Exception $e) {
echo 'Something went wrong: '.$e->getMessage()."\n";
}
Worked
set_time_limit(0);
date_default_timezone_set('UTC');
require __DIR__.'/vendor/autoload.php';
$username = 'username';
$password = 'password';
$debug = false;
$truncatedDebug = false;
$ig = new \InstagramAPI\Instagram($debug, $truncatedDebug);
try {
$ig->login($username, $password);
} catch (\Exception $e) {
echo 'Something went wrong: '.$e->getMessage()."\n";
exit(0);
}
try {
$posts = [];
$comments = [];
$userId = $ig->people->getUserIdForName($username);
$maxId = null;
$response = $ig->timeline->getUserFeed($userId, $maxId);
foreach ($response->getItems() as $item) {
foreach($item->getLikers($item->getId()) as $h){
$posts[] = ['id' => $item->getId(), 'username' => $h->username];
}
foreach($ig->media->getComments($item->getId()) as $v){
if(count($v->comments) > 0){
foreach($v->comments as $c){
$comments[] = ['id' => $item->getId(), 'username' => $c->user->username, 'text' => $c->text];
}
}
}
}
print_r($posts);
print_r($comments);
} catch (\Exception $e) {
echo 'Something went wrong: '.$e->getMessage()."\n";
}
Try looping through each item of your profile then get the likes and find the username. Then if the item has a like by that user put it in an item array like so:
// Get the UserPK ID for "natgeo" (National Geographic).
$userId = $ig->people->getUserIdForName('natgeo');
// Starting at "null" means starting at the first page.
$maxId = null;
do {
$response = $ig->timeline->getUserFeed($userId, $maxId);
// In this example we're simply printing the IDs of this page's items.
foreach ($response->getItems() as $item) {
//loop through likes as u can see in [source 1][1] there is some method called 'getLikers()' which u can call on a media object.
foreach($item->getMedia()->getLikers() as $h){
// here do some if with if response user == username
}
}
source 1:https://github.com/mgp25/Instagram-API/blob/master/src/Request/Media.php
source 2:https://github.com/mgp25/Instagram-API/tree/master/examples
source 3:https://github.com/mgp25/Instagram-API/blob/e66186f14b9124cc82fe309c98f5acf2eba6104d/src/Response/MediaLikersResponse.php
By reading the source files this could work i havent tested it yet.
for new version of mgp25 this code work fine
POST UPDATED
$likes = [];
$comments = [];
$userId = $ig->people->getUserIdForName($username);
$maxId = null;
$response = $ig->timeline->getUserFeed($userId, $maxId);
$posts = $response->jsonSerialize();
foreach ($response->getItems() as $item) {
$likers = $ig->media->getLikers($item->getId());
if ($likers != null) {
foreach ($likers->getUsers() as $h) {
$likes[] = ['id' => $item->getId(), 'username' => $h->getUsername()];
}
}
$commentsList = $ig->media->getComments($item->getId());
if ($commentsList != null) {
foreach ($commentsList->getComments() as $c) {
$comments[] = ['id' => $item->getId(), 'username' => $c->getUser()->getUsername(), 'text' => $c->getText()];
}
}
}
updated reference link
Updating my collection record field with 'MongoBinData', exception is triggered:
"document fragment is too large: 21216456, max: 16777216"
I find some web discussion about 'allowDiskUse:true' for aggregate, but nothing ubout 'update'.
Here a part of code in PHP:
try {
$criteria = array( '_id' => $intReleaseId);
$fileData = file_get_contents( $_FILES[ $fileKey]["tmp_name"]);
$mongoBinData = new MongoBinData( $fileData, MongoBinData::GENERIC)
$docItem['data'] = $mongoBinData;
$docItem['fileType'] = $strFileType;
$docItem['fileSize'] = $intFileSize;
$docItem['fileExtension'] = $strFileExtension;
$docItem['fileName'] = $strFileName;
$options = array( "upsert" => true,
'safe' => true, 'fsync' => true,
'allowDiskUse' => true ); // this option doesn't change anything
$reportJson = self::GetCollection('releases')->update( $criteria, $docItem, $options);
...
MongoDb release is db version v3.0.6
Some idea ?
Self resolved.
Use gridFS is immediate and simple.
$_mongo = new MongoClient();
$_db = $_mongo->selectDB($_mDbName);
$_gridFS = $_db->getGridFS();
/* */
function saveFileData($intReleaseId, $binData) {
$criteria = array( '_id' => $intReleaseId);
// if exist or not, remove previous value
try {
$_gridFS->remove( $criteria);
}
catch(Exception $e) {}
// store new file content
$storeByteCompleted = false;
try {
$reportId = $_gridFS->storeBytes(
$binData,
array("_id" => $intReleaseId));
if ($reportId == $intReleaseId) {
$storeByteCompleted = true;
}
catch(Exception $e) {}
return $storeByteCompleted;
}
function loadFileData($intReleaseId) {
$gridfsFile = null;
$binData = null;
try {
$gridfsFile = $_gridFS->get($intReleaseId);
}
catch(Exception $e) {}
if ($gridfsFile != null) {
$binData = $gridfsFile->getBytes()
}
return $binData;
}
That's all.
Im trying to make Cassandra run with PHP on Windows 7 at the moment.
I installed cassandra and thrift...
When I call the cassandra-test.php, I get the following error:
( ! ) Fatal error: Call to undefined method
CassandraClient::batch_insert() in
C:\xampp\htdocs\YiiPlayground\cassandra-test.php on line 75
Call Stack
# Time Memory Function Location
1 0.0014 337552 {main}( ) ..\cassandra-test.php:0
2 0.0138 776232 CassandraDB->InsertRecord(
) ..\cassandra-test.php:304
The cassandra-test.php looks as follows:
<?php
// CassandraDB version 0.1
// Software Projects Inc
// http://www.softwareprojects.com
//
// Includes
$GLOBALS['THRIFT_ROOT'] = 'C:/xampp/htdocs/Yii/kallaspriit-Cassandra-PHP-Client-Library/thrift';
//$GLOBALS['THRIFT_ROOT'] = realpath('E:/00-REGIESTART/Programme/Cassandra/thrift');
require_once $GLOBALS['THRIFT_ROOT'].'/packages/cassandra/Cassandra.php';
require_once $GLOBALS['THRIFT_ROOT'].'/packages/cassandra/cassandra_types.php';
require_once $GLOBALS['THRIFT_ROOT'].'/transport/TSocket.php';
require_once $GLOBALS['THRIFT_ROOT'].'/protocol/TBinaryProtocol.php';
require_once $GLOBALS['THRIFT_ROOT'].'/transport/TFramedTransport.php';
require_once $GLOBALS['THRIFT_ROOT'].'/transport/TBufferedTransport.php';
class CassandraDB
{
// Internal variables
protected $socket;
protected $client;
protected $keyspace;
protected $transport;
protected $protocol;
protected $err_str = "";
protected $display_errors = 0;
protected $consistency = 1;
protected $parse_columns = 1;
// Functions
// Constructor - Connect to Cassandra via Thrift
function CassandraDB ($keyspace, $host = "127.0.0.1", $port = 9160)
{
// Initialize
$this->err_str = '';
try
{
// Store passed 'keyspace' in object
$this->keyspace = $keyspace;
// Make a connection to the Thrift interface to Cassandra
$this->socket = new TSocket($host, $port);
$this->transport = new TFramedTransport($this->socket, 1024, 1024);
$this->protocol = new TBinaryProtocolAccelerated($this->transport);
$this->client = new CassandraClient($this->protocol);
$this->transport->open();
}
catch (TException $tx)
{
// Error occured
$this->err_str = $tx->why;
$this->Debug($tx->why." ".$tx->getMessage());
}
}
// Insert Column into ColumnFamily
// (Equivalent to RDBMS Insert record to a table)
function InsertRecord ($table /* ColumnFamily */, $key /* ColumnFamily Key */, $record /* Columns */)
{
// Initialize
$this->err_str = '';
try
{
// Timestamp for update
$timestamp = time();
// Build batch mutation
$cfmap = array();
$cfmap[$table] = $this->array_to_supercolumns_or_columns($record, $timestamp);
// Insert
$this->client->batch_insert($this->keyspace, $key, $cfmap, $this->consistency);
// If we're up to here, all is well
$result = 1;
}
catch (TException $tx)
{
// Error occured
$result = 0;
$this->err_str = $tx->why;
$this->Debug($tx->why." ".$tx->getMessage());
}
// Return result
return $result;
}
// Insert SuperColumn into SuperColumnFamily
// (Equivalent to RDMBS Insert record to a "nested table")
function InsertRecordArray ($table /* SuperColumnFamily */, $key_parent /* Super CF */,
$record /* Columns */)
{
// Initialize
$err_str = '';
try
{
// Timestamp for update
$timestamp = time();
// Build batch mutation
$cfmap = array();
$cfmap[$table] = $this->array_to_supercolumns_or_columns($record, $timestamp);
// Insert
$this->client->batch_insert($this->keyspace, $key_parent, $cfmap, $this->consistency);
// If we're up to here, all is well
$result = 1;
}
catch (TException $tx)
{
// Error occured
$result = 0;
$this->err_str = $tx->why;
$this->Debug($tx->why." ".$tx->getMessage());
}
// Return result
return $result;
}
// Get record by key
function GetRecordByKey ($table /* ColumnFamily or SuperColumnFamily */, $key, $start_from="", $end_at="")
{
// Initialize
$err_str = '';
try
{
return $this->get($table, $key, NULL, $start_from, $end_at);
}
catch (TException $tx)
{
// Error occured
$this->err_str = $tx->why;
$this->Debug($tx->why." ".$tx->getMessage());
return array();
}
}
// Print debug message
function Debug ($str)
{
// If verbose is off, we're done
if (!$this->display_errors) return;
// Print
echo date("Y-m-d h:i:s")." CassandraDB ERROR: $str\r\n";
}
// Turn verbose debug on/off (Default is off)
function SetDisplayErrors($flag)
{
$this->display_errors = $flag;
}
// Set Consistency level (Default is 1)
function SetConsistency ($consistency)
{
$this->consistency = $consistency;
}
// Build cf array
function array_to_supercolumns_or_columns($array, $timestamp=null)
{
if(empty($timestamp)) $timestamp = time();
$ret = null;
foreach($array as $name => $value) {
$c_or_sc = new cassandra_ColumnOrSuperColumn();
if(is_array($value)) {
$c_or_sc->super_column = new cassandra_SuperColumn();
$c_or_sc->super_column->name = $this->unparse_column_name($name, true);
$c_or_sc->super_column->columns = $this->array_to_columns($value, $timestamp);
$c_or_sc->super_column->timestamp = $timestamp;
}
else
{
$c_or_sc = new cassandra_ColumnOrSuperColumn();
$c_or_sc->column = new cassandra_Column();
$c_or_sc->column->name = $this->unparse_column_name($name, true);
$c_or_sc->column->value = $value;
$c_or_sc->column->timestamp = $timestamp;
}
$ret[] = $c_or_sc;
}
return $ret;
}
// Parse column names for Cassandra
function parse_column_name($column_name, $is_column=true)
{
if(!$column_name) return NULL;
return $column_name;
}
// Unparse column names for Cassandra
function unparse_column_name($column_name, $is_column=true)
{
if(!$column_name) return NULL;
return $column_name;
}
// Convert supercolumns or columns into an array
function supercolumns_or_columns_to_array($array)
{
$ret = null;
for ($i=0; $i<count($array); $i++)
foreach ($array[$i] as $object)
{
if ($object)
{
// If supercolumn
if (isset($object->columns))
{
$record = array();
for ($j=0; $j<count($object->columns); $j++)
{
$column = $object->columns[$j];
$record[$column->name] = $column->value;
}
$ret[$object->name] = $record;
}
// (Otherwise - not supercolumn)
else
{
$ret[$object->name] = $object->value;
}
}
}
return $ret;
}
// Get record from Cassandra
function get($table, $key, $super_column=NULL, $slice_start="", $slice_finish="")
{
try
{
$column_parent = new cassandra_ColumnParent();
$column_parent->column_family = $table;
$column_parent->super_column = $this->unparse_column_name($super_column, false);
$slice_range = new cassandra_SliceRange();
$slice_range->start = $slice_start;
$slice_range->finish = $slice_finish;
$predicate = new cassandra_SlicePredicate();
$predicate->slice_range = $slice_range;
$resp = $this->client->get_slice($this->keyspace, $key, $column_parent, $predicate, $this->consistency);
return $this->supercolumns_or_columns_to_array($resp);
}
catch (TException $tx)
{
$this->Debug($tx->why." ".$tx->getMessage());
return array();
}
}
// Convert array to columns
function array_to_columns($array, $timestamp=null) {
if(empty($timestamp)) $timestamp = time();
$ret = null;
foreach($array as $name => $value) {
$column = new cassandra_Column();
$column->name = $this->unparse_column_name($name, false);
$column->value = $value;
$column->timestamp = $timestamp;
$ret[] = $column;
}
return $ret;
}
// Get error string
function ErrorStr()
{
return $this->err_str;
}
}
// Initialize Cassandra
$cassandra = new CassandraDB("SPI");
// Debug on
$cassandra->SetDisplayErrors(true);
// Insert record ("Columns" in Cassandra)
$record = array();
$record["name"] = "Mike Peters";
$record["email"] = "mike at softwareprojects.com";
if ($cassandra->InsertRecord('mytable', "Mike Peters", $record)) {
echo "Record (Columns) inserted successfully.\r\n";
}
// Print record
$record = $cassandra->GetRecordByKey('mytable', "Mike Peters");
print_r($record);
?>
Any ideas on this, how to fix this?
Thanks a lot!
You really don't want to do Thrift by hand if you can avoid it. Take a look at phpcassa library:
https://github.com/thobbs/phpcassa
Oh, and in the above, looks like you want 'batch_mutate' not 'batch_insert' on ln. 75. That method changed names in versions of cassandra > 0.6.x