I have finished writing the PHP scripts for a project I am working on. My next step is to see whether I can improve the code from a memory standpoint, as some of my scripts eat a lot of memory. I have been researching this, and one suggestion is to NULL and unset variables, but I never see an example of doing this. So here is an example of a common action in my scripts; is this the proper way of doing it?
$query = $dbconn->get_results("SELECT id,name FROM account WHERE active = 1");
if(isset($query))
{
    foreach($query AS $currq)
    {
        $account_id = intval($currq->id);
        $account_name = trim($currq->name);
        //Code to do stuff with this data
        //NULL the variables before looping again
        $account_id = NULL;
        $account_name = NULL;
        //Unset the variables before looping again
        unset($account_id);
        unset($account_name);
    }
    $query = NULL;
    unset($query);
    $currq = NULL;
    unset($currq);
}
Would that be the correct way to free up memory? I read that garbage collection in PHP can be lazy, which is why setting the value to NULL is recommended, as it shrinks the variable right away.
I know this might be too vague for this site, but can anyone let me know whether this is the proper way of freeing up memory? If there is a different way, please provide an example so I can see how it works. Thanks in advance!
Please read up on PHP generators; this is exactly what they are for.
You don't want to fetch all records at once; this would blow holes into your memory like a shotgun.
Instead, you want to fetch your records one at a time: process one, then fetch the next.
Here is an example:
function getAccountData(\PDO $pdo)
{
    $stmt = $pdo->prepare("SELECT id,name FROM account WHERE active = 1");
    $stmt->execute();
    while ($row = $stmt->fetch()) {
        yield $row;
    }
}
foreach (getAccountData($pdo) as $account){
    //process the record for each iteration
    //no need to unset anything
}
Well, if the function $dbconn->get_results returns an array with all the data, then there is no point in using generators, since the memory has already been allocated for the data.
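To make the difference concrete, here is a small self-contained sketch that needs no database. The helpers `rowsAsGenerator()`, `rowsAsArray()` and `peakExtraMemory()` are hypothetical names invented for this illustration, and the fake rows stand in for whatever `get_results` would return. It compares the extra memory held while looping over a fully built array with the memory held while looping over a generator:

```php
<?php
// Hypothetical stand-in for a lazy result set: produces one row at a time.
function rowsAsGenerator(int $count): Generator
{
    for ($i = 1; $i <= $count; $i++) {
        yield ['id' => $i, 'name' => "account $i"];
    }
}

// Hypothetical stand-in for get_results(): builds every row before returning.
function rowsAsArray(int $count): array
{
    return iterator_to_array(rowsAsGenerator($count), false);
}

// Returns the largest amount of extra memory (in bytes) seen while iterating.
function peakExtraMemory(callable $makeRows): int
{
    $baseline = memory_get_usage();
    $peak = 0;
    foreach ($makeRows() as $row) {
        $peak = max($peak, memory_get_usage() - $baseline);
    }
    return $peak;
}

$arrayPeak = peakExtraMemory(function () { return rowsAsArray(20000); });
$generatorPeak = peakExtraMemory(function () { return rowsAsGenerator(20000); });

printf("array: %d bytes, generator: %d bytes\n", $arrayPeak, $generatorPeak);
```

The array figure grows with the number of rows while the generator figure should stay roughly constant, but this only holds when the data source itself is lazy; wrapping an already built array in a generator saves nothing.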
You can also use the mysqli_fetch_assoc function to get one row at a time. It should be more memory efficient than fetching all rows at once. http://php.net/manual/en/mysqli-result.fetch-assoc.php
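A minimal sketch of that row-by-row loop, assuming an already connected `mysqli` object and the `account` table from the question; `processActiveAccounts()` and the `$handle` callback are hypothetical names. Note that `mysqli::query()` buffers the whole result set on the client by default, so the sketch passes `MYSQLI_USE_RESULT` to request an unbuffered result:

```php
<?php
// Streams active accounts one row at a time and returns how many were handled.
function processActiveAccounts(mysqli $db, callable $handle): int
{
    // MYSQLI_USE_RESULT asks for an unbuffered result: rows are pulled from
    // the server as they are fetched instead of being copied up front.
    $result = $db->query(
        "SELECT id, name FROM account WHERE active = 1",
        MYSQLI_USE_RESULT
    );
    if ($result === false) {
        throw new RuntimeException("Query failed: " . $db->error);
    }

    $handled = 0;
    while ($row = $result->fetch_assoc()) {
        $handle((int) $row['id'], trim((string) $row['name']));
        $handled++;
    }

    // An unbuffered result must be freed before the connection is reused.
    $result->free();
    return $handled;
}

// Usage, with placeholder credentials:
// $db = new mysqli('localhost', 'user', 'password', 'dbname');
// processActiveAccounts($db, function (int $id, string $name) { /* work */ });
```

The trade-off is that no other query can run on the same connection until the unbuffered result has been read to the end or freed.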
I have a file that imports data into a SQL database from an API. A problem I encountered is that the API can only return a maximum of 1000 records per call, even though I sometimes need to retrieve large amounts of data, anywhere from 10 to 200,000 records. My first thought was to create a while loop that calls the API until all of the data has been retrieved, and only afterwards insert it into the database.
$moreDataToImport = true;
$lastId = null;
$query = '';
while ($moreDataToImport) {
$result = json_decode(callToApi($lastId));
$query .= formatResult($result);
$moreDataToImport = !empty($result['dataNotExported']);
$lastId = getLastId($result['customers']);
}
mysqli_multi_query($con, $query);
The issue I encountered with this is that I quickly reached the memory limit. The easy solution is to increase the memory limit until it is sufficient. How much memory I need, however, is unknown, because there is always a possibility that I need to import a very large dataset, so I can theoretically always run out of memory. I don't want to remove the memory limit altogether, as that invites problems of its own.
My second solution was, instead of looping through the imported data, to send each batch to my database and then do a page refresh, with a GET parameter specifying the last ID I left off on.
if (isset($_GET['lastId']))
$lastId = $_GET['lastId'];
else
$lastId = null;
$result = json_decode(callToApi($lastId));
$query .= formatResult($result);
mysqli_multi_query($con, $query);
if (!empty($result['dataNotExported'])) {
header('Location: ./page.php?lastId='.getLastId($result['customers']));
}
This solution solves my memory limit issue, however now I have another issue, being that browsers, after 20 redirects (depends on the browser), will automatically kill the program to stop a potential redirect loop, then shortly refresh the page. The solution to this would be to kill the program yourself at the 20th redirect and allow it to do a page refresh, continuing the process.
if (isset($_GET['redirects'])) {
$redirects = $_GET['redirects'];
if ($redirects == '20') {
if ($lastId == null) {
header("Location: ./page.php?redirects=2");
}
else {
header("Location: ./page.php?lastId=$lastId&redirects=2");
}
exit;
}
}
else
$redirects = '1';
Though this solves my issues, I am afraid this is more impractical than other solutions, as there must be a better way to do this. Is this, or the issue of possibly running out of memory my only two choices? And if so, is one more efficient/orthodox than the other?
Do the insert query inside the loop that fetches each page from the API, rather than concatenating all the queries.
$moreDataToImport = true;
$lastId = null;
while ($moreDataToImport) {
    // Decode to an associative array, since the result is read with [] below
    $result = json_decode(callToApi($lastId), true);
    // Build and run the insert for this page only
    $query = formatResult($result);
    mysqli_query($con, $query);
    $moreDataToImport = !empty($result['dataNotExported']);
    $lastId = getLastId($result['customers']);
}
Page your work. Break it up into smaller chunks that will be below your memory limit.
If the API only returns 1000 at a time, then only process 1000 at a time in a loop. In each iteration of the loop you'll query the API, process the data, and store it. Then, on the next iteration, you'll be using the same variables so your memory won't skyrocket.
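That loop can be sketched as follows. The names `importInPages()`, `$fetchPage` and `$storePage` are hypothetical, and the page shape (`customers`, `dataNotExported`, an `id` per customer) is assumed from the question's code; the fake API and in-memory storage below exist only so the sketch runs on its own:

```php
<?php
// Fetches one page at a time and stores it before asking for the next,
// so only a single page is ever held in memory.
function importInPages(callable $fetchPage, callable $storePage): int
{
    $lastId = null;
    $imported = 0;
    do {
        $page = $fetchPage($lastId);          // at most one API page
        $customers = $page['customers'];
        if ($customers === []) {
            break;
        }
        $storePage($customers);               // e.g. one INSERT per page
        $imported += count($customers);
        $lastId = end($customers)['id'];
        $more = !empty($page['dataNotExported']);
    } while ($more);
    return $imported;
}

// Fake API for the demo: 2,500 customers served in pages of 1,000.
$fetchPage = function (?int $lastId): array {
    $total = 2500;
    $start = ($lastId ?? 0) + 1;
    $end = min($start + 999, $total);
    $customers = [];
    for ($id = $start; $id <= $end; $id++) {
        $customers[] = ['id' => $id];
    }
    return ['customers' => $customers, 'dataNotExported' => $end < $total];
};

// Fake storage for the demo: records the size of each stored page.
$pageSizes = [];
$storePage = function (array $customers) use (&$pageSizes): void {
    $pageSizes[] = count($customers);
};

$imported = importInPages($fetchPage, $storePage);
echo $imported, " rows in ", count($pageSizes), " pages\n";
```

Because `$page` and `$customers` are overwritten on every iteration, memory use is bounded by the size of one page rather than by the size of the whole import.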
A couple things to consider:
If this becomes a long running script, you may hit the default script running time limit - so you'll have to extend that with set_time_limit().
Some browsers will consider scripts that run too long to be timed out and will show the appropriate error message.
For processing upwards of 200,000 pieces of data from an API, I think the best solution is to not make this work dependent on a page load. If possible, I'd put this in a cron job to be run by the server on a regular schedule.
If the dataset is dependent on the request (for example, if you're processing temperatures from one of 1000s of weather stations, with the specific station ID set by the user), then consider creating a secondary script that does the work. Calling and forking the secondary script from your primary script will enable your primary script to finish execution while your secondary script executes in the background on your server. Something like:
exec('php path/to/secondary-script.php > /dev/null &');
I've been searching for a suitable PHP caching method for MSSQL results.
Most of the examples I can find suggest storing the results in an array, which would then get included in the page. This seems great unless a request for the content is made at the same time as it is being updated or rebuilt.
I was hoping to find something similar to ASP's application-level variables, but as far as I'm aware, PHP doesn't offer this functionality?
The problem I'm facing is that I need to perform 6 queries per page to populate dropdown boxes. This happens on the vast majority of pages. It's also not an option to combine the queries. The cached data will also need to be rebuilt sporadically, when the system changes. This could be once a day, once a week or once a month. Any advice will be gratefully received, thanks!
You can use a Redis server and the phpredis PHP extension to cache results fetched from the database:
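$redis = new Redis();
$redis->connect('/tmp/redis.sock');
$sql = "SELECT something FROM sometable WHERE condition";
$sql_hash = md5($sql);
// "{$var}" is used here because "${var}" interpolation is deprecated as of PHP 8.2
$redis_key = "dbcache:{$sql_hash}";
$ttl = 3600; // values expire in 1 hour
if ($result = $redis->get($redis_key)) {
    // cache hit: decode the stored JSON back into an array
    $result = json_decode($result, true);
} else {
    // cache miss: query the database and store the result with a TTL
    $result = Db::fetchArray($sql);
    $redis->setex($redis_key, $ttl, json_encode($result));
}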
(Error checks skipped for clarity)
A provider I use has quite unreliable MySQL servers, which are down at least once per week, and this impacts one of the sites I made. I want to guard against these outages in the following way:
dump the MySQL table to a file;
if the connection to the SQL server fails, read the file instead of the server until the server is back.
This will avoid outages from the user's point of view.
In fact, things are not as easy as they seem, so I am asking for your help.
What I did first was save the data in JSON format.
But this ran into problems, because much of the data in the DB is stored as plain text, including escaped complex URLs with long argument lists, which cause issues when decoding the JSON.
CSV and TSV do not work correctly either.
CSV is delimited by commas or semicolons, and those are present in the original content taken from the DB.
The TSV format leaves double quotes that cannot be removed without also removing them from inside the record fields.
Then I tried to serialize each record read from the DB, store it, and retrieve it by unserializing it.
But the result is a bit catastrophic: all the records are stored in the file,
but when I retrieve them, only one is returned. So something is blocking the program from working (code below).
require_once('variables.php');
require_once("database.php");
$file = "database.dmp";
$myfile = fopen($file, "w") or die("Unable to open file!");
$sql = mysql_query("SELECT * FROM song ORDER BY ID ASC");
// output data of each row
while ($row = mysql_fetch_assoc($sql)) {
// store the record into the file
fwrite($myfile, serialize($row));
}
fclose($myfile);
mysql_close();
// Retrieving section
$myfile = fopen($file, "r") or die("Unable to open file!");
// Till the file is not ended, continue to check it
while ( !feof($myfile) ) {
$record = fgets($myfile); // get the record
$row = unserialize($record); // unserialize it
print_r($row); // show if the variable has something on it
}
fclose($myfile);
I tried also to uuencode and also with base64_encode but they were worse choices.
Is there any way to achieve my goal?
Thank you very much in advance for your help
If you have your data layer well decoupled you can consider using SQLite as a fallback storage.
It's just a matter of adding one abstraction more, with the same code accessing the storage and changing the storage target in case of unavailability of the primary one.
-----EDIT-----
You could also try to think about some caching (json/html file?!) strategy returning stale data in case of mysql outage.
-----EDIT 2-----
If it's not too much effort, please consider playing with PDO. I'm quite sure you'll never look back, and believe me, it will help you structure your DB calls with little pain when switching between storages.
Please take the following only as an example; there are much better ways to design this part of the architecture.
Just a small, basic piece of code to demonstrate what I mean:
class StoragePersister
{
    private $driver = 'mysql';

    public function setDriver($driver)
    {
        $this->driver = $driver;
    }

    public function persist($data)
    {
        switch ($this->driver)
        {
            case 'mysql':
                $this->persistToMysql($data);
                break; // without break, execution falls through to the next case
            case 'sqlite':
                $this->persistToSqlite($data);
                break;
        }
    }

    public function persistToMysql($data)
    {
        //query to mysql
    }

    public function persistToSqlite($data)
    {
        //query to Sqlite
    }
}
$storage = new StoragePersister;
$storage->setDriver('sqlite'); //eventually to switch to sqlite
$storage->persist($somedata); // this will use the strategy to call the function based on the storage driver you've selected.
-----EDIT 3-----
please give a look at the "strategy" design pattern section, I guess it can help to better understand what I mean.
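As a rough illustration of how the strategy pattern could apply to the outage scenario, here is a self-contained sketch. All names in it (`SongStorage`, `UnavailableStorage`, `ArrayStorage`, `FallbackStorage`) are hypothetical, and the two backends are in-memory stand-ins for the MySQL and SQLite (or file) storages so the example runs without any database:

```php
<?php
// Common contract: every storage backend exposes the same read method.
interface SongStorage
{
    public function fetchSongs(): array;
}

// Stand-in for the MySQL-backed storage; here it simulates an outage.
class UnavailableStorage implements SongStorage
{
    public function fetchSongs(): array
    {
        throw new RuntimeException('Primary storage is down');
    }
}

// Stand-in for the SQLite or file-backed copy of the data.
class ArrayStorage implements SongStorage
{
    private $songs;

    public function __construct(array $songs)
    {
        $this->songs = $songs;
    }

    public function fetchSongs(): array
    {
        return $this->songs;
    }
}

// Tries the primary strategy first and switches to the fallback on failure.
class FallbackStorage implements SongStorage
{
    private $primary;
    private $fallback;

    public function __construct(SongStorage $primary, SongStorage $fallback)
    {
        $this->primary = $primary;
        $this->fallback = $fallback;
    }

    public function fetchSongs(): array
    {
        try {
            return $this->primary->fetchSongs();
        } catch (RuntimeException $e) {
            return $this->fallback->fetchSongs();
        }
    }
}

$storage = new FallbackStorage(
    new UnavailableStorage(),
    new ArrayStorage([['ID' => 1, 'title' => 'Cached song']])
);
$songs = $storage->fetchSongs();
echo $songs[0]['title'], "\n";
```

The calling code only ever sees `SongStorage`, so swapping or chaining backends does not touch the pages that display the data.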
After the SELECT, you need to build a correct structure for re-inserting the data; then you can serialize it or store it however you want.
For example:
You have a row, and from it you could build: $sqls[] = "INSERT INTO song (field1, field2, ..., fieldN) VALUES (field1_value, field2_value, ..., fieldN_value);";
Then you could serialize $sqls and write it to a file, and when you need it, you could read it, unserialize it and run the queries.
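If the records themselves are serialized into a file one by one, as in the question, each record needs an unambiguous boundary. `serialize()` output carries no trailing newline and may itself contain newlines, so `fgets()` cannot tell where one record ends and the next begins, which is the likely reason only one record came back. A minimal sketch, assuming one base64-encoded record per line; `dumpRows()` and `loadRows()` are hypothetical helper names and the sample rows are made up:

```php
<?php
// Writes each row as one line: base64 keeps newlines out of the payload.
function dumpRows(string $file, iterable $rows): void
{
    $handle = fopen($file, 'w');
    if ($handle === false) {
        throw new RuntimeException("Unable to open $file for writing");
    }
    foreach ($rows as $row) {
        fwrite($handle, base64_encode(serialize($row)) . "\n");
    }
    fclose($handle);
}

// Reads the file back line by line, yielding one row at a time.
function loadRows(string $file): Generator
{
    $handle = fopen($file, 'r');
    if ($handle === false) {
        throw new RuntimeException("Unable to open $file for reading");
    }
    while (($line = fgets($handle)) !== false) {
        $line = rtrim($line, "\n");
        if ($line !== '') {
            yield unserialize(base64_decode($line), ['allowed_classes' => false]);
        }
    }
    fclose($handle);
}

$file = tempnam(sys_get_temp_dir(), 'dump');
$rows = [
    ['ID' => 1, 'title' => "Line one\nline two", 'url' => 'http://example.com/?a=1&b="x";c'],
    ['ID' => 2, 'title' => 'Plain title', 'url' => 'http://example.com/'],
];
dumpRows($file, $rows);
$restored = iterator_to_array(loadRows($file), false);
unlink($file);
var_dump($restored === $rows);
```

Quotes, commas, semicolons and embedded newlines in the fields survive the round trip because the delimiter (a newline) can never appear inside a base64 payload.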
Have you thought about caching your queries in a cache like APC? Also, you may want to use mysqli or PDO instead of mysql (the mysql extension was deprecated in PHP 5.5 and removed in PHP 7.0).
To answer your question, this is one way of doing it.
var_export will export the variable as valid PHP code
require will put the content of the array into the $data variable (because of the return statement)
Here is the code :
$sql = mysql_query("SELECT * FROM song ORDER BY ID ASC");
$content = array();
// collect each row, keyed by its ID
while ($row = mysql_fetch_assoc($sql)) {
    $content[$row['ID']] = $row;
}
mysql_close();
// store the records in the file as PHP code that returns the array
$data = '<?php return ' . var_export($content, true) . ';';
file_put_contents($file, $data);
// Retrieving section
$rows = require $file;
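One refinement worth considering, sketched under the assumption that the dump may be rewritten while other requests are reading it: write to a temporary file first and `rename()` it into place, so a reader should never see a half-written file (on POSIX filesystems a rename within the same directory is atomic). `writeDump()` and `readDump()` are hypothetical helper names, and the sample rows stand in for the query results:

```php
<?php
// Writes the rows as a PHP file that returns an array, replacing any
// previous dump in a single rename step.
function writeDump(string $file, array $rows): void
{
    $tmp = $file . '.' . uniqid('tmp', true);
    $code = '<?php return ' . var_export($rows, true) . ';';
    if (file_put_contents($tmp, $code, LOCK_EX) === false) {
        throw new RuntimeException("Unable to write $tmp");
    }
    if (!rename($tmp, $file)) {
        unlink($tmp);
        throw new RuntimeException("Unable to replace $file");
    }
}

// Loads the dump; returns an empty array when no dump exists yet.
function readDump(string $file): array
{
    if (!is_file($file)) {
        return [];
    }
    return require $file;
}

$file = sys_get_temp_dir() . '/song-dump-' . uniqid() . '.php';
$rows = [
    1 => ['ID' => '1', 'title' => "It's a \"quoted\"; title, with commas"],
    2 => ['ID' => '2', 'title' => 'http://example.com/?a=1&b=2'],
];
writeDump($file, $rows);
$restored = readDump($file);
unlink($file);
var_dump($restored === $rows);
```

Because `var_export` emits valid PHP, quotes, commas and URLs in the data need no extra escaping.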
I'm experiencing a strange problem. I'm caching the output of a query using memcache functions in a file named count.php. This file is called by an Ajax request every second while a user is viewing a particular page. The output is cached for 5 seconds, so if there are 5 hits to this file within that time, I expect the cached result to be returned at least 3-4 times. However, this is not happening; instead, a query goes to the DB every time, as evidenced by an echo statement. But if the file is called from the browser directly by typing the URL (like http://example.com/help/count.php) repeatedly within 5 seconds, data is returned from the cache (again evidenced by the echo statement). Below is the relevant code of count.php
mysql_connect(c_dbhost, c_dbuname, c_dbpsw) or die(mysql_error());
mysql_select_db(c_dbname) or die("Coud Not Find Database");
$product_id=$_POST['product_id'];
echo func_total_bids_count($product_id);
function func_total_bids_count($product_id)
{
$qry="select count(*) as bid_count from tbl_userbid where userbid_auction_id=".$product_id;
$row_count=func_row_count_only($qry);
return $row_count["bid_count"];
}
function func_row_count_only($qry)
{
if($_SERVER["HTTP_HOST"]!="localhost")
{
$o_cache = new Memcache;
$o_cache->connect('localhost', 11211) or die ("Could not connect to memcache");
//$key="total_bids" . md5($product_id);
$key = "KEY" . md5($qry);
$result = $o_cache->get($key);
if (!$result)
{
$qry_result = mysql_query($qry);
while($row=mysql_fetch_array($qry_result))
{
$row_count = $row;
$result = $row;
$o_cache->set($key, $result, 0, 5);
}
echo "From DB <br/>";
}
else
{
echo "From Cache <br/>";
}
$o_cache->close();
return $row_count;
}
}
I'm confused as to why the DB is hit every second when an Ajax request calls this file, but cached data is returned when the URL is typed in the browser. To try the URL method I just replaced $product_id with a valid number (e.g. $product_id=426 in my case). I don't understand what's wrong here, as I expect data to be returned from the cache within 5 seconds of the first hit. I want the data to be returned from the cache. Can someone please help me understand what's happening?
If you're using the address bar, you're doing a GET, but your code is looking for $_POST['...'], so you will end up with an invalid query. So for a start, the results using the address bar won't be what you're expecting. Is your Ajax call actually doing a POST?
Please also note that you've got a SQL injection vulnerability there. Make sure $product_id is an integer.
There are many problems with your code. First of all, you always connect to the database and select a database, even if you don't need it. Second, you should check $result explicitly: Memcache::get() returns false on a miss, so a strict check such as $result === false is more reliable than just !$result, which also treats cached values like 0 or an empty array as a miss.
As above noted, if the 'product_id' is not in the $_POST array, you could use $_REQUEST to also cover $_GET (but you shouldn't, if you are certain it's coming via $_POST).
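Putting these points together, here is a hedged sketch of the cache-aside flow: the product id is cast to an integer, a cache miss is detected with a strict `!== false` check, and the value is returned on both the hit and the miss path (in the question's code `$row_count` is only assigned on a miss). `ArrayCache` is a hypothetical in-memory stand-in that mimics the `get`/`set` calls of `Memcache` so the sketch runs without the extension, and `$countBids` stands in for the database query:

```php
<?php
// Hypothetical stand-in for Memcache: get() returns false on a miss,
// set() takes (key, value, flags, ttl) like Memcache::set().
class ArrayCache
{
    private $items = [];

    public function get(string $key)
    {
        if (!isset($this->items[$key]) || $this->items[$key]['expires'] < time()) {
            return false;
        }
        return $this->items[$key]['value'];
    }

    public function set(string $key, $value, int $flags, int $ttl): bool
    {
        $this->items[$key] = ['value' => $value, 'expires' => time() + $ttl];
        return true;
    }
}

// Returns the bid count, asking $countBids (the database) only on a miss.
function totalBidsCount($cache, callable $countBids, int $productId): int
{
    $key = 'total_bids_' . $productId;
    $cached = $cache->get($key);
    if ($cached !== false) {
        return (int) $cached;              // hit: 0 is a valid cached value
    }
    $count = (int) $countBids($productId); // miss: run the query once
    $cache->set($key, $count, 0, 5);       // keep it for 5 seconds
    return $count;
}

// Demo: the fake query counts how often the "database" is hit.
$dbHits = 0;
$countBids = function (int $productId) use (&$dbHits): int {
    $dbHits++;
    return 42;
};

$_POST['product_id'] = '426';              // simulated request value
$productId = (int) ($_POST['product_id'] ?? 0);

$cache = new ArrayCache();
$first = totalBidsCount($cache, $countBids, $productId);
$second = totalBidsCount($cache, $countBids, $productId);
echo "$first, $second, database hits: $dbHits\n";
```

Keying the cache on the integer id rather than on the raw SQL string also means a malformed request cannot create arbitrary cache keys.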
Is it possible to ask for all data in my database and make objects from it and save it into an array or something, so I just need to call the database once and afterwards I just use my local array? If yes, how is it done?
public function getAllProjects(){
$query="SELECT * FROM projects";
$result=mysql_query($query);
$num=mysql_numrows($result);
while ($row = mysql_fetch_object($result)) {
// save object into array
}
}
public function fetchRow($row){
include("Project.php");
$project = new Project();
$id=$row->id;
$project->setId($id);
$title=$row->title;
$project->setTitle($title);
$infos=$row->info;
$project->setInfo($infos);
$text=$row->text;
$project->setText($text);
$cate=$row->category;
$project->setCategory($cate);
return $project;
}
If I have, for example, this code, how do I store the objects correctly in an array that I can then grab the data from? And why can't I make more than one object of type "Project"?
Let's ignore the fact that you will run out of memory.
If you have everything in an array, you will no longer have the functionality of a relational database.
Try a search over a multi-megabyte, multi-dimensional array in PHP and be prepared for an extended coffee break.
If you are thinking of doing something like that, it is because you feel that the database is slow... In that case you should learn about data normalization and the correct use of indexes.
And no, NoSQL is not the answer.
Sorry to pop your balloon.
Edited to add: What you CAN do is use memcache to store the final product of some expensive processes. Don't bother storing the result of trivial queries; the internal cache of MySQL is very well optimized for those.
You should use the $_SESSION variables in PHP. To use them, add a session_start() call at the beginning of your code. Then you can set variables with $_SESSION['selectaname'] = $yourvar;
Nothing prevents you from running a SQL query like "SELECT username FROM users WHERE id = 1" and then setting $_SESSION['user'] = $queryresult;
Then you'll have this:
echo $_SESSION['user'];
// prints: mylogin
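Returning to the two narrower questions in the post, how to collect the objects in an array and why a second `Project` cannot be created: the `include("Project.php")` inside `fetchRow()` runs on every call, so if that file declares the class (the usual layout), the second call tries to declare it again and PHP stops with a fatal error. Loading the class once, for example with `require_once` at the top of the file, avoids that. A minimal sketch, with a cut-down `Project` class defined inline and plain objects standing in for the rows from the database; `hydrateProjects()` is a hypothetical name:

```php
<?php
// Cut-down stand-in for Project.php; in real code, load it once with
// require_once at the top of the file instead of inside fetchRow().
class Project
{
    private $id;
    private $title;

    public function setId(int $id): void { $this->id = $id; }
    public function setTitle(string $title): void { $this->title = $title; }
    public function getId(): int { return $this->id; }
    public function getTitle(): string { return $this->title; }
}

// Maps one database row (an object with id and title) to a Project.
function fetchRow(stdClass $row): Project
{
    $project = new Project();
    $project->setId((int) $row->id);
    $project->setTitle((string) $row->title);
    return $project;
}

// Collects every row into an array of Project objects, keyed by id.
function hydrateProjects(iterable $rows): array
{
    $projects = [];
    foreach ($rows as $row) {
        $project = fetchRow($row);
        $projects[$project->getId()] = $project;
    }
    return $projects;
}

// Plain objects standing in for what mysql_fetch_object() would return.
$rows = [
    (object) ['id' => '1', 'title' => 'First'],
    (object) ['id' => '2', 'title' => 'Second'],
];
$projects = hydrateProjects($rows);
echo count($projects), ' projects, second title: ', $projects[2]->getTitle(), "\n";
```

The array only lives for the duration of one request, so the memory and search caveats raised in the first answer still apply.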