Flat file databases [closed] - php

What are the best practices around creating flat file database structures in PHP?
There are a lot of more mature PHP flat file frameworks out there that attempt to implement SQL-like query syntax, which is over the top for my purposes in most cases (I would just use a database at that point).
Are there any elegant tricks out there to get good performance and features with a small code overhead?

Well, what is the nature of the flat databases? Are they large or small? Are they simple arrays with arrays in them? If it's something simple, say user profiles built as such:
$user = array("name" => "bob",
"age" => 20,
"websites" => array("example.com","bob.example.com","bob2.example.com"),
"and_one" => "more");
To save or update the db record for that user:
$dir = "../userdata/"; // make sure to put it outside the web root, where the server cannot serve it
file_put_contents($dir.$user['name'],serialize($user));
And to load the record for the user:
function get_user($name){
// unserialize() returns a value, not a variable, so do not return by reference
return unserialize(file_get_contents("../userdata/".$name));
}
But again, this implementation will vary with the application and the nature of the database you need.

You might consider SQLite. It's almost as simple as flat files, but you do get a SQL engine for querying. It works well with PHP too.

In my opinion, using a "Flat File Database" in the sense you're meaning (and the answer you've accepted) isn't necessarily the best way to go about things. First of all, using serialize() and unserialize() can cause MAJOR headaches if someone gets in and edits the file (they can, in fact, put arbitrary code in your "database" to be run each time.)
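One way to blunt the risk described above, sketched under the assumption of PHP 7.0 or later: pass the allowed_classes option to unserialize() so that a tampered file cannot instantiate objects. The helper name load_record() is hypothetical, and accepting only arrays is a choice made for this sketch.

```php
<?php
// Hypothetical helper: load one serialized record without allowing objects.
function load_record(string $path)
{
    $raw = @file_get_contents($path);
    if ($raw === false) {
        return null; // missing or unreadable file
    }
    // allowed_classes => false turns any serialized object into
    // __PHP_Incomplete_Class instead of instantiating the named class.
    $data = @unserialize($raw, ['allowed_classes' => false]);
    // Only arrays are accepted as records in this sketch.
    return is_array($data) ? $data : null;
}
```

This does not stop someone from editing the values in the file; it only keeps the edit from turning into object instantiation.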
Personally, I'd say - why not look to the future? There have been so many times that I've had issues because I've been creating my own "proprietary" files, and the project has exploded to a point where it needs a database, and I'm thinking "you know, I wish I'd written this for a database to start with" - because the refactoring of the code takes way too much time and effort.
From this I've learnt that future proofing my application so that when it gets bigger I don't have to go and spend days refactoring is the way to go forward. How do I do this?
SQLite. It works as a database, uses SQL, and is pretty easy to change over to MySQL (especially if you're using abstracted classes for database manipulation like I do!)
In fact, especially compared with the accepted answer's method, it can drastically cut the memory usage of your app (you don't have to load all the records into PHP).
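A minimal sketch of the SQLite route through PDO, assuming the pdo_sqlite extension is enabled. The users table and its columns are made up for illustration, loosely following the user profile from the accepted answer.

```php
<?php
// ':memory:' keeps the demo self-contained; a real application would pass
// a file path such as 'sqlite:/path/outside/webroot/app.db' instead.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical schema.
$db->exec('CREATE TABLE users (name TEXT PRIMARY KEY, age INTEGER)');

$insert = $db->prepare('INSERT INTO users (name, age) VALUES (:name, :age)');
$insert->execute([':name' => 'bob', ':age' => 20]);
$insert->execute([':name' => 'alice', ':age' => 31]);

// Only the matching row is fetched into PHP; the rest stays in the database.
$select = $db->prepare('SELECT name, age FROM users WHERE name = :name');
$select->execute([':name' => 'bob']);
$user = $select->fetch(PDO::FETCH_ASSOC);
```

Because the queries go through PDO, switching to MySQL later is mostly a matter of changing the DSN, provided the SQL stays portable.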

One framework I'm considering would be for a blogging platform. Since just about any possible view of data you would want would be sorted by date, I was thinking about this structure:
One directory per content node:
./content/YYYYMMDDHHMMSS/
Subdirectories of each node including
/tags
/authors
/comments
As well as simple text files in the node directory for pre- and post-rendered content and the like.
This would allow a simple PHP glob() call (and probably a reversal of the result array) to query on just about anything within the content structure:
glob("content/*/tags/funny");
Would return paths including all articles tagged "funny".
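The glob() idea above could be wrapped roughly like this. The helper name nodes_tagged() is hypothetical, and the sketch assumes each tag is a file or directory named after the tag inside the node's tags subdirectory.

```php
<?php
// Hypothetical helper: list node directories carrying a tag, newest first.
function nodes_tagged(string $contentDir, string $tag): array
{
    // glob() returns matches sorted alphabetically, so timestamped
    // directory names come back oldest first.
    $paths = glob($contentDir . '/*/tags/' . $tag) ?: [];
    // Strip the trailing "/tags/<tag>" to get the node directory itself.
    $nodes = array_map(function ($p) { return dirname($p, 2); }, $paths);
    return array_reverse($nodes);
}
```

Tag names containing glob metacharacters such as * or [ would need escaping before being passed in.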

Here's the code we use for Lilina:
<?php
/**
* Handler for persistent data files
*
* @author Ryan McCue <cubegames@gmail.com>
* @package Lilina
* @version 1.0
* @license http://opensource.org/licenses/gpl-license.php GNU Public License
*/
/**
* Handler for persistent data files
*
* @package Lilina
*/
class DataHandler {
/**
* Directory to store data.
*
* @since 1.0
*
* @var string
*/
protected $directory;
/**
* Constructor, duh.
*
* @since 1.0
* @uses $directory Holds the data directory, which the constructor sets.
*
* @param string $directory
*/
public function __construct($directory = null) {
if ($directory === null)
$directory = get_data_dir();
if (substr($directory, -1) != '/')
$directory .= '/';
$this->directory = (string) $directory;
}
/**
* Prepares filename and content for saving
*
* @since 1.0
* @uses $directory
* @uses put()
*
* @param string $filename Filename to save to
* @param string $content Content to save to cache
*/
public function save($filename, $content) {
$file = $this->directory . $filename;
if(!$this->put($file, $content)) {
trigger_error(get_class($this) . " error: Couldn't write to $file", E_USER_WARNING);
return false;
}
return true;
}
/**
* Saves data to file
*
* @since 1.0
* @uses $directory
*
* @param string $file Filename to save to
* @param string $data Data to save into $file
*/
protected function put($file, $data, $mode = false) {
if(file_exists($file) && file_get_contents($file) === $data) {
touch($file);
return true;
}
if(!$fp = @fopen($file, 'wb')) {
return false;
}
fwrite($fp, $data);
fclose($fp);
$this->chmod($file, $mode);
return true;
}
/**
* Change the file permissions
*
* @since 1.0
*
* @param string $file Absolute path to file
* @param integer $mode Octal mode
*/
protected function chmod($file, $mode = false){
if(!$mode)
$mode = 0644;
return @chmod($file, $mode);
}
/**
* Returns the content of the cached file if it is still valid
*
* @since 1.0
* @uses $directory
* @uses check() Check if cache file is still valid
*
* @param string $filename Unique ID for content type, used to distinguish between different caches
* @return null|string Content of the cached file if valid, otherwise null
*/
public function load($filename) {
return $this->get($this->directory . $filename);
}
/**
* Returns the content of the file
*
* @since 1.0
* @uses $directory
* @uses check() Check if file is valid
*
* @param string $filename Filename to load data from
* @return null|string Content of the file if valid, otherwise null
*/
protected function get($filename) {
if(!$this->check($filename))
return null;
return file_get_contents($filename);
}
/**
* Check a file for validity
*
* Basically just a fancy alias for file_exists(), made primarily to be
* overridden.
*
* @since 1.0
* @uses $directory
*
* @param string $filename Unique ID for content type, used to distinguish between different caches
* @return bool False if the cache doesn't exist or is invalid, otherwise true
*/
protected function check($filename){
return file_exists($filename);
}
/**
* Delete a file
*
* @param string $filename Unique ID
*/
public function delete($filename) {
return unlink($this->directory . $filename);
}
}
?>
It stores each entry as a separate file, which we found is efficient enough for use (no unneeded data is loaded and it's faster to save).

IMHO, you have two... er, three options if you want to avoid homebrewing something:
SQLite
If you're familiar with PDO, you can install a PDO driver that supports SQLite. Never used it, but I have used PDO a ton with MySQL. I'm going to give this a shot on a current project.
XML
Done this many times for relatively small amounts of data. XMLReader is a lightweight, read-forward, cursor-style class. SimpleXML makes it simple to read an XML document into an object that you can access just like any other class instance.
JSON (update)
Good option for smallish amounts of data, just read/write file and json_decode/json_encode. Not sure if PHP offers a structure to navigate a JSON tree without loading it all in memory though.
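For the JSON option, a minimal sketch of the read/write pair might look like this. The function names are hypothetical, and json_decode() does load the whole document into memory, so this only suits the smallish data sets mentioned above.

```php
<?php
// Hypothetical helpers: persist one PHP array per JSON file.
function json_store_save(string $path, array $data): bool
{
    $json = json_encode($data, JSON_PRETTY_PRINT);
    if ($json === false) {
        return false; // value could not be encoded
    }
    // LOCK_EX takes an exclusive lock while writing the file.
    return file_put_contents($path, $json, LOCK_EX) !== false;
}

function json_store_load(string $path): ?array
{
    if (!is_file($path)) {
        return null;
    }
    // true => decode JSON objects as associative arrays.
    $data = json_decode(file_get_contents($path), true);
    return is_array($data) ? $data : null;
}
```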

If you're going to use a flat file to persist data, use XML to structure the data. PHP has a built-in XML parser.

If you want a human-readable result, you can also use this type of file :
ofaurax|27|male|something|
another|24|unknown||
...
This way, you have only one file, you can debug it (and manually fix) easily, you can add fields later (at the end of each line) and the PHP code is simple (for each line, split according to |).
However, the drawbacks are that you have to parse the entire file to search for something (if you have millions of entries, that's not fine) and you have to handle the separator appearing in the data (for example if the nick is WaR|ordz).
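One possible way around the separator problem, as a sketch rather than the poster's own code: let fputcsv() and fgetcsv() do the work with "|" as the delimiter, since they quote any field that contains it. The helper names are hypothetical, and the resulting file differs slightly from the layout above because such fields are wrapped in double quotes.

```php
<?php
// Hypothetical helpers for a pipe-delimited file.
function pipe_write(string $path, array $rows): void
{
    $fh = fopen($path, 'wb');
    foreach ($rows as $row) {
        // Separator, enclosure and escape are passed explicitly.
        fputcsv($fh, $row, '|', '"', '\\');
    }
    fclose($fh);
}

function pipe_read(string $path): array
{
    $rows = [];
    $fh = fopen($path, 'rb');
    // Length 0 means "no line length limit".
    while (($row = fgetcsv($fh, 0, '|', '"', '\\')) !== false) {
        $rows[] = $row;
    }
    fclose($fh);
    return $rows;
}
```

Searching still means reading the file line by line, so the first drawback remains.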

I have written two simple functions designed to store data in a file. You can judge for yourself if it's useful in this case.
The point is to save a PHP variable (whether it's an array, a string or an object) to a file.
<?php
function varname(&$var) {
$oldvalue=$var;
$var='AAAAB3NzaC1yc2EAAAABIwAAAQEAqytmUAQKMOj24lAjqKJC2Gyqhbhb+DmB9eDDb8+QcFI+QOySUpYDn884rgKB6EAtoFyOZVMA6HlNj0VxMKAGE+sLTJ40rLTcieGRCeHJ/TI37e66OrjxgB+7tngKdvoG5EF9hnoGc4eTMpVUDdpAK3ykqR1FIclgk0whV7cEn/6K4697zgwwb5R2yva/zuTX+xKRqcZvyaF3Ur0Q8T+gvrAX8ktmpE18MjnA5JuGuZFZGFzQbvzCVdN52nu8i003GEFmzp0Ny57pWClKkAy3Q5P5AR2BCUwk8V0iEX3iu7J+b9pv4LRZBQkDujaAtSiAaeG2cjfzL9xIgWPf+J05IQ==';
foreach($GLOBALS as $var_name => $value) {
if ($value === 'AAAAB3NzaC1yc2EAAAABIwAAAQEAqytmUAQKMOj24lAjqKJC2Gyqhbhb+DmB9eDDb8+QcFI+QOySUpYDn884rgKB6EAtoFyOZVMA6HlNj0VxMKAGE+sLTJ40rLTcieGRCeHJ/TI37e66OrjxgB+7tngKdvoG5EF9hnoGc4eTMpVUDdpAK3ykqR1FIclgk0whV7cEn/6K4697zgwwb5R2yva/zuTX+xKRqcZvyaF3Ur0Q8T+gvrAX8ktmpE18MjnA5JuGuZFZGFzQbvzCVdN52nu8i003GEFmzp0Ny57pWClKkAy3Q5P5AR2BCUwk8V0iEX3iu7J+b9pv4LRZBQkDujaAtSiAaeG2cjfzL9xIgWPf+J05IQ==')
{
$var=$oldvalue;
return $var_name;
}
}
$var=$oldvalue;
return false;
}
function putphp(&$var, $file=false)
{
$varname=varname($var);
if(!$file)
{
$file=$varname.'.php';
}
$pathinfo=pathinfo($file);
if(file_exists($file))
{
if(is_dir($file))
{
$file=$pathinfo['dirname'].'/'.$pathinfo['basename'].'/'.$varname.'.php';
}
}
file_put_contents($file,'<?php'."\n\$".$varname.'='.var_export($var, true).";\n");
return true;
}

This one is inspiring as a practical solution:
https://github.com/mhgolkar/FlatFire
It uses multiple strategies for handling data...
[Copied from Readme File]
Free or Structured or Mixed
- STRUCTURED
Regular (table, row, column) format.
[DATABASE]
/ \
TX TableY
\_____________________________
|ROW_0 Colum_0 Colum_1 Colum_2|
|ROW_1 Colum_0 Colum_1 Colum_2|
|_____________________________|
- FREE
More creative data storing. You can store data in any structure you want for each (free) element, its similar to storing an array with a unique "Id".
[DATABASE]
/ \
EX ElementY (ID)
\________________
|Field_0 Value_0 |
|Field_1 Value_1 |
|Field_2 Value_2 |
|________________|
recall [ID]: get_free("ElementY") --> array([Field_0]=>Value_0,[Field_1]=>Value_1...
- MIXD (Mixed)
Mixed databases can store both free elements and tables. If you add a table to a free db or a free element to a structured db, flat fire will automatically convert FREE or SRCT to MIXD database.
[DATABASE]
/ \
EX TY

Just pointing out a potential problem with a flat file database with this type of system:
data|some text|more data
row 2 data|bla hbalh|more data
...etc
The problem is that if the cell data contains a "|" or a "\n", the data will be lost. Sometimes it is easier to split on combinations of characters that most people wouldn't use.
For example:
Column splitter: #$% (Shift+345)
Row splitter: ^&* (Shift+678)
Text file: test data#$%blah blah#$%^&*new row#$%new row data 2
Then use explode('^&*', $data) to split the rows, loop over them with foreach, then explode('#$%', $row) again to separate the columns.
Or anything along these lines. Also, I might add that flat file databases are good for systems with small amounts of data (i.e. fewer than 20 rows), but become huge memory hogs for larger databases.
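The two-level split could be sketched as follows, using the separators from the answer above. The helper name parse_table() is hypothetical.

```php
<?php
// Hypothetical helper: split the raw text into rows, then rows into columns.
// Single quotes keep PHP from treating "$" as the start of a variable name.
function parse_table(string $data, string $rowSep = '^&*', string $colSep = '#$%'): array
{
    $table = [];
    foreach (explode($rowSep, $data) as $row) {
        $table[] = explode($colSep, $row);
    }
    return $table;
}
```

With the sample text file above, the first row ends in a column splitter, so it comes back with a trailing empty column.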

Related

Losing the ".json" in the API Explorer documentation

First let me say that the new API Explorer in Restler is great. Very happy about its addition. Now, in typical fashion, let me complain about something that isn't working for me ...
The fact that Restler can return results in multiple formats is a very nice feature but I'm currently not using it (choosing to only use JSON as my return format). In the API Explorer I'd like all references to .json to not show up as this just complicates the look of the service architecture.
Here's a quick example:
class Users {
/**
* Preferences
*
* Preferences returns a dictionary of name-value pairs that provide input to applications that want to make user-specific decisions
*
* @url GET /{user_id}/preferences
**/
function preferences ($user_id , $which = 'all') {
return "$which preferences for {$user_id}";
}
/**
* GET Sensors
*
* Get a list of all sensors associated with a user.
*
* @url GET /{user_id}/sensor
**/
function sensor ($user_id) {
return "sensor";
}
/**
* GET Sensors by Type
*
* @param $user_id The user whose sensors you are interested in
* @param $type The type of sensor you want listed.
*
* @url GET /{user_id}/sensor/{type}
**/
function sensor_by_type ($user_id, $type) {
return "specific sensor";
}
/**
* ADD a Sensor
*
* @param $user_id The user who you'll be adding the sensor to
*
* @url POST /sensor
**/
function postSensor() {
return "post sensor";
}
}
In this example the API Explorer looks like this:
The basic change I'd like is to remove all ".json" references, as the calling structure without the optional .json works perfectly fine.
Also, for those that DO want the .json showing up, there's a secondary problem of WHERE this post-item modifier shows up. In the example above you have .json attaching to the "users" element in the GETs and to the "sensor" element in the POST. This has nothing to do with the HTTP operation; rather, it seems to choose the element which immediately precedes the first variable, which may not be intuitive to the user and actually isn't a requirement in Restler (at least it's my impression that you can attach .json anywhere in the chain and get the desired effect).
We are using safer defaults that will work for everyone, but have made it completely configurable.
If you prefer .json to be added at the end, add the following to index.php (gateway)
use Luracast\Restler\Resources;
Resources::$placeFormatExtensionBeforeDynamicParts = false;
If you prefer not to add .json extension, add the following to index.php
use Luracast\Restler\Resources;
Resources::$useFormatAsExtension = false;

Is it possible for Zend_Translate to return multiple 'pieces' of content from a language file?

Started Googling today to research implementing Zend_Translate in a Zend 1.6.x project I have recently been assigned to, but I am finding it difficult to get to usable/appropriate sources of information.
Implemented simple Array adapter, which works nicely.
Basic overlay of the implementation as follows:
in the Language file:
return array(
'testKey' => 'Hello World!');
in SomeController.php: (added translate to the registry)
public function init()
{
...
$this->_translate = Zend_Registry::get('translate');
...
}
in the view:
echo $translate->_('testKey');
I would like to know if it is possible to retrieve more than just one element from the language array? Something like:
$phraseList= $translate->_('lanKey1','lanKey1'..'n');
//or
$phraseList= $translate->_( array('lanKey1','lanKey1'..'n') );
Or at the least does anyone have resources to point out, or a direction to research in?
Many thanks,
David
No, you can pass only one item at a time.
You can refer to the source code. It's a better resource than the documentation.
/**
* Translates the given string
* returns the translation
*
* @param string $messageId Translation string
* @param string|Zend_Locale $locale (optional) Locale/Language to use, identical with locale
* identifier, @see Zend_Locale for more information
* @return string
*/
public function _($messageId, $locale = null)
{
return $this->translate($messageId, $locale);
}
FYI: Zend_Translate_Adapter
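Since _() takes a single message ID, fetching several phrases comes down to a loop. Below is a sketch of such a wrapper; translate_all() is a hypothetical helper and not part of Zend_Translate, and FakeTranslate is only a stand-in so the example runs without the framework.

```php
<?php
// Hypothetical helper: look up several keys by calling the single-key
// _() method once per key.
function translate_all($translate, array $messageIds, $locale = null): array
{
    $phrases = [];
    foreach ($messageIds as $id) {
        $phrases[$id] = $translate->_($id, $locale);
    }
    return $phrases;
}

// Stand-in for a configured translate object (assumption for this sketch).
class FakeTranslate
{
    private $messages = ['lanKey1' => 'Hello', 'lanKey2' => 'World'];

    public function _($messageId, $locale = null)
    {
        // This stub returns the key itself when no translation exists.
        return $this->messages[$messageId] ?? $messageId;
    }
}

$phraseList = translate_all(new FakeTranslate(), ['lanKey1', 'lanKey2', 'nope']);
```

In the controller from the question, the first argument would be the object pulled from Zend_Registry::get('translate').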

Generating large Excel files from MySQL data with PHP from corporate applications

We're developing and maintaining a couple of systems, which need to export reports in Excel format to the end user. The reports are gathered from a MySQL database with some trivial processing and usually result in ~40000 rows of data with 10-15 columns, we're expecting the amount of data to grow steadily.
At the moment we're using PHPExcel for the Excel generation, but it's not working for us anymore. Once we go above 5000 rows, the memory consumption and loading times become intolerable, and this can't be solved by indefinitely increasing PHP's limits for memory usage and script execution time. Processing of the data is as lean as possible, and the entire problem is PHPExcel being a memory hog. CSV generation would be lighter, but unfortunately we're required to export Excel (and Excel alone) from our services due to user demands such as formatting requirements, so CSV isn't an option.
Any ideas/recommendations for a third party application/module/service/what ever for generating large excels? Doesn't matter if it's a commercial licence, as long as it fits our needs, can be integrated to existing PHP applications and does its job. Our services are generally running on linux/php/mysql and we can do just about whatever we need to do with the servers.
Thanks!
For such a large amount of data I would not recommend tools like PHPExcel or Apache POI (for Java) because of their memory requirements. I struggled with a similar task recently and found a convenient (but maybe a little fiddly) way to inject data into spreadsheets. Server-side generation or updating of Excel spreadsheets can be achieved through simple XML editing. I have an XLSX spreadsheet sitting on the server, and every time data is gathered from the DB, I unzip it using PHP. Then I access the specific XML files that hold the contents of the worksheets that need to be injected, and insert the data manually. Afterwards, I compress the spreadsheet folder in order to distribute it as a regular XLSX file. The whole process is quite fast and reliable. Obviously, there are a few issues and glitches related to the inner organisation of the XLSX/Open XML file (e.g. Excel tends to store all strings in a separate table and use references to this table in worksheets). But when injecting only data like numbers and strings, it is not that hard. If anyone is interested, I can provide some code.
Okay, here goes sample code for this. I have tried to comment what it does, but feel free to ask for further explanation.
<?php
/**
* Class for serverside spreadsheet data injecting
* Reqs: unzip.php, zip.php (containing any utility functions able to unzip files & zip folders)
*
* Author: Poborak
*/
class DataInjector
{
//spreadsheet file, we inject data into this one
const SPREADSHEET_FILE="datafile.xlsx";
// specific worksheet into which data are being injected
const SPREADSHEET_WORKSHEET_FILE="/xl/worksheets/sheet7.xml";
//working directory, spreadsheet is extracted here
const WSPACE_DIR="Wspace";
// query for obtaining data from DB
const STORE_QUERY = "SELECT * FROM stores ORDER BY store_number ASC";
private $dbConn;
private $storesData;
/**
* @param mysqli $dbConn
*/
function __construct(mysqli $dbConn) {
$this->dbConn = $dbConn;
}
/**
* Main method for whole injection process
* First data are gathered from DB and spreadsheet is decompressed to workspace.
* Then injection takes place and spreadsheet is ready to be rebuilt again by zipping.
*
* @return boolean success/fail
*/
public function injectData() {
if (!$this->getStoresInfoFromDB()) return false;
if (!$this->explodeSpreadsheet(self::SPREADSHEET_FILE,self::WSPACE_DIR)) return false;
if (!$this->injectDataToSpreadsheet(self::WSPACE_DIR.self::SPREADSHEET_WORKSHEET_FILE)) return false;
if (!$this->implodeSpreadsheet(self::SPREADSHEET_FILE,self::WSPACE_DIR)) return false;
return true;
}
/**
* Decompress spreadsheet file to folder
*
* @param string $spreadsheet
* @param string $targetFolder
*
* @return boolean success/fail
*/
private function explodeSpreadsheet($spreadsheet, $targetFolder) {
return unzip($spreadsheet,$targetFolder);
}
/**
* Compress source folder to spreadsheet file
*
* @param string $spreadsheet
* @param string $sourceFolder
*
* @return boolean success/fail
*/
private function implodeSpreadsheet($spreadsheet, $sourceFolder) {
return zip($sourceFolder,$spreadsheet);
}
/**
* Loads data from DB to member variable $storesData (as array)
*
* @return boolean success/fail
*/
private function getStoresInfoFromDb() {
unset($this->storesData);
if ($stmt = $this->dbConn->prepare(self::STORE_QUERY)) {
$stmt->execute();
$stmt->bind_result($store_number, $store_regional_manager, $store_manager, $store_city, $store_address);
while ($stmt->fetch()) {
$this->storesData[trim($store_number)] = array(trim($store_regional_manager),trim($store_manager),trim($store_address),trim($store_city));
}
$stmt->close();
}
return true;
}
/**
* Injects data from member variable $storesData to spreadsheet $ws
*
* @param string $ws target worksheet
*
* @return boolean success/fail
*/
private function injectDataToSpreadsheet($ws) {
$worksheet = file_get_contents($ws);
if ($worksheet === false or empty($this->storesData)) return false;
$xml = simplexml_load_string($worksheet);
if (!$xml) return false;
// Loop through the $storesData array containing rows of data
foreach ($this->storesData as $std){
// For each row of data create a new row in the Excel worksheet
$newRow = $xml->sheetData->addChild('row');
// Save each column value of the row into the next cell of the worksheet row
foreach ($std as $cbd){
$newCell = $newRow->addChild('c');
$newCell->addAttribute('t', "inlineStr");
$newIs = $newCell->addChild('is');
// text has to be saved as utf-8 (otherwise the spreadsheet file becomes corrupted)
if (!mb_check_encoding($cbd, 'utf-8')) $cbd = iconv("cp1250","utf-8",$cbd);
$newT = $newIs->addChild('t',$cbd);
}
}
// Save xml data back to worksheet file, reporting failure as false
return file_put_contents($ws, $xml->asXML()) !== false;
}
}
?>
The list of alternatives for PHPExcel that I try to keep up to date is here
If you're after raw speed/memory performance above and beyond anything that PHPExcel can offer, then the only one I'd actually recommend is Ilia's wrapper extension for libXL, because the library is still actively supported.
You can export in CSV format, Excel can handle that. If you have problems writing the file, you can always loop the results (pagination) and append them to the CSV file
Try to convert afterwards using PHPExcel to .xls or .odf format, otherwise leave it at CSV.
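The paginate-and-append idea for CSV could look roughly like this. export_csv() and the $fetchPage callback are hypothetical; the callback is assumed to return the rows for a given offset and limit (for example from a query with LIMIT and OFFSET) and an empty array when the data runs out.

```php
<?php
// Hypothetical helper: write rows to a CSV stream one page at a time so
// the whole result set never sits in memory at once.
function export_csv($out, callable $fetchPage, int $pageSize = 1000): int
{
    $written = 0;
    for ($offset = 0; ; $offset += $pageSize) {
        $rows = $fetchPage($offset, $pageSize);
        if (!$rows) {
            break; // no more data
        }
        foreach ($rows as $row) {
            fputcsv($out, $row, ',', '"', '\\');
            $written++;
        }
    }
    return $written;
}
```

To send the result straight to the browser, the stream could be fopen('php://output', 'wb') after the appropriate headers; whether CSV is acceptable at all is a separate question given the formatting requirements in the post.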
Did you try out the old Pear Excel (aka Spreadsheet_Excel_Writer: http://pear.php.net/package/Spreadsheet_Excel_Writer/redirected)?
Check out the discussion regarding Pear vs PHPExcel:
http://phpexcel.codeplex.com/discussions/240688
Check out OfficeWriter. We recently specifically improved performance for massive datasets for a Fortune 500 financial company. It does way more with the file format than you specifically need (charts and what have you), but the API is pretty easy to use and with the evaluation you could get a POC up quickly. Disclaimer - I'm one of the engineers who built the latest version.
One other downside for you guys is that it's .NET.
What about just printing table?
<?php
header("Content-Type: application/vnd.ms-excel; charset=utf-8");
header("Content-Disposition: attachment; filename=abc.xls"); //File name extension was wrong
header("Expires: 0");
header("Cache-Control: must-revalidate, post-check=0, pre-check=0");
header("Cache-Control: private",false);
echo "<table><tr><td>Test</td><td>Test2</td></tr></table>";

Differentiate between a parameter that can take either file contents or file path in PHP

I'm implementing a function called attach. I'm looking for a way to take a param that is either a file path or a file's contents. Here's what I have:
/**
* @param name - name of the file
* @param file - either the file contents or a file path
*/
function attach($name, $file) {
$attachment = array();
$attachment['name'] = $name;
if(map($file)) {
$attachment['filepath'] = $file;
$attachment['file'] = file_get_contents($file);
} else {
$attachment['file'] = $file;
$attachment['filepath'] = getcwd();
}
}
/**
* @param filepath - can take multiple forms
* ie. ui:form:text.css => ui/form/text.css
* text.css => getcwd().'text.css'
* /ui/form/text.css => /ui/form/text.css
*
* @return if file exists - return file path
* if not found - return false
*/
function map($filepath) {
// ... too long to post
}
The map function allows you to turn namespaces (using ":") into filepaths.
The issue I'm worried about is that if an error is made in the file path (i.e. someone types the file path wrong), I don't want it to think that since the file doesn't exist, it must be file contents.
Also: if possible, I'd rather not edit map() as it would require me to change a bunch of code - consider map as a black box.
Finally: I put this example together quickly - so please do not discuss the shortcomings of getcwd(), and other syntactical issues. I have a more elaborate system in place in map()
Thanks!
Matt Mueller
Even if PHP had better support for method overloading, you'd have a hard time here since both vars (filepath and file contents) would probably be a string. What is preventing you from just creating a couple of wrapper methods, like attach_filepath(..), attach_filecontents(..)? Or, if you are set on having one method, you could add a third param, like:
function attach($name, $file, $filecontents=false) {
I agree that it would probably be a bad idea to try to guess the users' intentions based on the contents of a var that would have the same type in both cases.
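The two-wrapper suggestion might be sketched like this. is_file() stands in for the asker's map() function, which is not shown in the question, and the returned array layout simply follows the attach() code above.

```php
<?php
// The caller states the intent by choosing the function, so nothing is guessed.
function attach_filecontents(string $name, string $contents): array
{
    return ['name' => $name, 'file' => $contents, 'filepath' => null];
}

function attach_filepath(string $name, string $path): array
{
    // is_file() is an assumption here; the real code would call map().
    if (!is_file($path)) {
        // A mistyped path fails loudly instead of being taken for contents.
        throw new InvalidArgumentException("No such file: $path");
    }
    return ['name' => $name, 'file' => file_get_contents($path), 'filepath' => $path];
}
```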
Before adding the attachment your code could use a function like file_exists to determine if what seems like a path actually is a path and the file does in fact exist. Additionally you may want to check if the path refers to a folder or a file.

Is this the best way to use memcache?

I just started playing with memcache(d) last night, so I have a LOT to learn about it.
I want to know if this code is a good way of doing what it is doing, or if I should be using other memcache functions.
I want to show a cached version of something. If the cache does not exist, I generate the content from MySQL, set it into the cache, and show the MySQL result on the page; on the next page load it will check the cache, see that it is there, and show it.
This code seems to do the trick, but there are several different memcache functions. Should I be using other ones to accomplish this?
<?PHP
$memcache= new Memcache();
$memcache->connect('127.0.0.1', 11211);
$rows2= $memcache->get('therows1');
if($rows2 == ''){
$myfriends = findfriend2(); // this function gets our array from mysql
$memcache->set('therows1', $myfriends, 0, 30);
echo '<pre>';
print_r($myfriends); // print the mysql version
echo '</pre>';
}else{
echo '<pre>';
print_r($rows2); //print the cached version
echo '</pre>';
}
?>
Here is the locking function provided in the link posted by @crescentfresh
<?PHP
// {{{ locked_memcache_update($memcache,$key,$updateFunction,$expiryTime,$waitUtime,$maxTries)
/**
* A function to ensure only one thing can update a memcache at a time.
*
* Note that there are issues with the $expiryTime on memcache not being
* fine enough, but this is the best I can do. The idea behind this form
* of locking is that it takes advantage of the fact that
* {@link memcache_add()}'s are atomic in nature.
*
* It would be possible to be a more interesting limiter (say that limits
* updates to no more than 1/second) simply by storing a timestamp or
* something of that nature with the lock key (currently stores "1") and
* not deleting the memcache entry.
*
* @package TGIFramework
* @subpackage functions
* @copyright 2009 terry chay
* @author terry chay <tychay@php.net>
* @param $memcache memcache the memcache object
* @param $key string the key to do the update on
* @param $updateFunction mixed the function to call that accepts the data
* from memcache and modifies it (use pass by reference).
* @param $expiryTime integer time in seconds to allow the key to last before
* it will expire. This should only happen if the process dies during update.
* Choose a number big enough so that $updateFunction will take much less
* time to execute.
* @param $waitUtime integer the amount of time in microseconds to wait before
* checking for the lock to release
* @param $maxTries integer maximum number of attempts before it gives up
* on the locks. Note that if $maxTries is 0, then it will RickRoll forever
* (never give up). The default number ensures that it will wait for three
* full lock cycles to crash before it gives up also.
* @return boolean success or failure
*/
function locked_memcache_update($memcache, $key, $updateFunction, $expiryTime=3, $waitUtime=101, $maxTries=100000)
{
$lock = 'lock:'.$key;
// get the lock {{{
if ($maxTries>0) {
for ($tries=0; $tries< $maxTries; ++$tries) {
if ($memcache->add($lock,1,0,$expiryTime)) { break; }
usleep($waitUtime);
}
if ($tries == $maxTries) {
// handle failure case (use exceptions and try-catch if you need to be nice)
trigger_error(sprintf('Lock failed for key: %s',$key), E_USER_NOTICE);
return false;
}
} else {
while (!$memcache->add($lock,1,0,$expiryTime)) {
usleep($waitUtime);
}
}
// }}}
// modify data in cache {{{
$data = $memcache->get($key, $flag);
call_user_func($updateFunction, $data); // update data
$memcache->set($key, $data, $flag);
// }}}
// clear the lock
$memcache->delete($lock,0);
return true;
}
// }}}
?>
Couple things.
You should be checking for false, not '', using === on the return value from get(). PHP's type conversions save you from doing that here, but IMHO it's better to be explicit about the value you are looking for from a cache lookup.
You've got a race condition there between the empty check and where you set() the db results. From http://code.google.com/p/memcached/wiki/FAQ#Race_conditions_and_stale_data:
Remember that the process of checking
memcached, fetching SQL, and storing
into memcached, is not atomic at all!
The symptoms of this are a spike in the DB CPU when the key expires and (on a high volume site) a bunch of requests simultaneously trying to hit the db and cache the value.
You can solve it by using add(), which is atomic, to take a lock before regenerating the value. See a more concrete example here.
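Both points can be combined in a small cache-aside helper. This is only a sketch: cached() is a hypothetical function, and FakeCache is an in-memory stand-in that borrows the method names of the Memcache class so the example runs without a memcached server (it ignores expiry times).

```php
<?php
// In-memory stand-in for a Memcache object (assumption for this sketch).
class FakeCache
{
    private $data = [];
    public function get($key) { return array_key_exists($key, $this->data) ? $this->data[$key] : false; }
    public function add($key, $value, $flag = 0, $expire = 0)
    {
        if (array_key_exists($key, $this->data)) { return false; }
        $this->data[$key] = $value;
        return true;
    }
    public function set($key, $value, $flag = 0, $expire = 0) { $this->data[$key] = $value; return true; }
    public function delete($key) { unset($this->data[$key]); return true; }
}

// Hypothetical helper: return the cached value, or rebuild it while holding
// an add()-based lock so that only one request hits the database.
function cached($cache, string $key, callable $rebuild, int $ttl = 30)
{
    $value = $cache->get($key);
    if ($value !== false) {
        return $value; // cache hit; strict check so '' and 0 count as values
    }
    if (!$cache->add("lock:$key", 1, 0, 3)) {
        // Another request is rebuilding: wait briefly, then re-read.
        usleep(50000);
        $value = $cache->get($key);
        return $value !== false ? $value : $rebuild();
    }
    $value = $rebuild();
    $cache->set($key, $value, 0, $ttl);
    $cache->delete("lock:$key");
    return $value;
}
```

In the question's code this would be called as cached($memcache, 'therows1', 'findfriend2'). The lock expiry of 3 seconds and the 50 ms wait are arbitrary values chosen for the sketch.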
