PHP Caching with APC

Let's say I cache data in a PHP file as a PHP array, like this:
/cache.php
<?php return (object) array(
    'key' => 'value',
);
And I include the cache file like this:
<?php
$cache = include 'cache.php';
Now, the question is: will the cache file be automatically cached by APC in memory, the way a typical opcode cache handles all .php files?
If I store the data differently, for example in JSON format (cache.json), will it not be automatically cached by APC?
Would apc_store be faster/preferable?

Don't confuse APC's data-caching abilities with its ability to cache compiled intermediate code. APC provides 2 different things:
1. It gives a handy method of caching data structures (objects, arrays etc.), so that you can store/fetch them with apc_store and apc_fetch.
2. It keeps a compiled version of your scripts so that the next time they run, they run faster.
Let's see an example for (1): Suppose you have a data structure which takes 1 second to calculate:
function calculate_array() {
    sleep(1);
    return array('foo' => 'bar');
}
$data = calculate_array();
You can store its output so that you don't have to call the slow calculate_array() again:
function calculate_array() {
    sleep(1);
    return array('foo' => 'bar');
}

if (!apc_exists('key1')) {
    $data = calculate_array();
    apc_store('key1', $data);
} else {
    $data = apc_fetch('key1');
}
which will be considerably faster, taking much less than the original 1 second.
Now, for (2) above: having APC will not make your program run faster than the 1 second that calculate_array() needs. However, if your file additionally needed (say) 100 milliseconds to initialize and execute, simply having APC enabled will cut that to approximately 20 milliseconds, an 80% reduction in initialization/preparation time. This can make quite a difference in production systems, so simply installing APC can have a noticeable positive impact on your script's performance, even if you never explicitly call any of its functions.

If you are just storing static data (as in your example), it would be preferable to use apc_store.
The reasoning behind this is not so much whether the opcode cache is faster or slower, but the fact that you are using include to fetch static data into scope.
Even with an opcode cache, the file will still be checked for consistency on each execution. PHP will not have to parse the contents, but it will have to check whether the file exists and that it hasn't changed since the opcode cache entry was created. Filesystem checks are expensive, even if only to stat a file.
Therefore, of the two approaches I would use apc_store to remove the filesystem checks completely.
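For example, a minimal sketch of that approach (the key name 'static_config' is an arbitrary choice for this illustration):
<?php
// Fetch from shared memory; fall back to building the data once.
$data = apc_fetch('static_config', $success);
if (!$success) {
    $data = (object) array(
        'key' => 'value',
    );
    apc_store('static_config', $data);
}
After the first request, no file is touched at all; the data comes straight out of APC's shared memory.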

Unlike the other answer, I would use the array-file solution (the first one):
<?php return (object) array(
    'key' => 'value',
);
The reason is that with both solutions you are on the safe side, but when you leave the caching to APC itself, you don't have to juggle the apc_*() functions. You simply include the file and use the data. When you set
apc.stat = 0
you avoid the stat calls on every include too. This is useful for production, but remember to clear the system cache on every deployment.
http://php.net/apc.configuration.php#ini.apc.stat
Oh, and not to forget: the file approach works even without APC. That is useful for a development setup, where you usually shouldn't use any caching.
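For completeness, a minimal sketch of generating such a cache file with var_export(), assuming the file name from the question:
<?php
// Write the data out as a PHP file that can simply be include'd later.
// With apc.stat = 0, subsequent includes are served from the opcode
// cache without a filesystem stat.
$data = array(
    'key' => 'value',
);
file_put_contents(
    'cache.php',
    '<?php return ' . var_export($data, true) . ';'
);

// Later, anywhere in the application:
$cache = include 'cache.php';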

Related

PHP Cache Mechanism

I was working on a program which needs a cache system.
The setup: I have a MySQL database with four tables, 'mac', 'src', 'username' and 'main', where 'mac', 'src' and 'username' hold key/value rows whose IDs are foreign keys in the 'main' table. The program inserts into those three tables first and puts their IDs into 'main'.
The 'main' table holds about 18 million rows, and each of the other three about 2 million.
I don't want to run a SELECT every time something needs to be inserted into 'main', so I used an array to cache them:
$hash = ['mac' => [], 'src' => [], 'username' => []];
and store and fetch data like this: $hash['mac']['54:52:00:27:e4:91'];
This approach performs badly once the hash grows beyond 500k entries.
So is there any better way to do this?
PS: I did the same thing in Node.js with an npm module named hashtable, and performance was about 10k inserts every 4 minutes. I've read about PHP arrays and found out they are hash tables, but here the same job is far slower: just 1k inserts take at least 5 minutes.
Assuming you're on a Linux server, see: Creating a RAM disk. Once you have a RAM disk, cache each ID as a file, using a sha1() hash of the MAC address. The RAM disk file is, well, RAM; i.e., a persistent in-memory cache.
<?php
$mac = '54:52:00:27:e4:91';
$cache = '/path/to/ramdisk/'.sha1($mac);

if (is_file($cache)) { // Cached already?
    $ID = file_get_contents($cache); // From the cache.
} else {
    // Run SQL query here and get the $ID.
    // Now cache the $ID.
    file_put_contents($cache, $ID); // Cache it.
}
// Now do your insert here.
To clarify: A RAM disk allows you to use filesystem wrappers in PHP, such as file_get_contents() and file_put_contents() to read/write to RAM.
Other more robust alternatives to consider:
Redis: https://www.tutorialspoint.com/redis/redis_php.htm
Memcached: http://php.net/manual/en/book.memcached.php
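For a rough idea, here is a minimal sketch of the same mac-to-ID lookup using the Memcached extension (the server address and the 'mac:' key prefix are assumptions for this example):
<?php
$mc = new Memcached();
$mc->addServer('127.0.0.1', 11211);

$mac = '54:52:00:27:e4:91';
$id = $mc->get('mac:' . $mac); // returns false on a cache miss
if ($id === false) {
    // Run the SELECT here to look up $id, then cache it:
    $mc->set('mac:' . $mac, $id);
}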
You can use PHP Super Cache, which is very simple and claims to be faster than Redis, Memcached, etc.
require __DIR__.'/vendor/autoload.php';

use SuperCache\SuperCache as sCache;

// Saving a cache value with a key
// sCache::cache('<key>')->set('<value>');
sCache::cache('myKey')->set('Key_value');

// Retrieving a cache value with a key
echo sCache::cache('myKey')->get();
https://packagist.org/packages/smart-php/super-cache

Does file_get_contents use a cache?

I have a function that generates a table with contents from the DB. Some cells have custom HTML which I'm reading in with file_get_contents through a templating system.
The small content is the same, but this action is performed maybe 15 times (I have a limit of 15 table rows per page). So does file_get_contents() cache the content if it sees that it is the same?
file_get_contents() does not have a caching mechanism. However, you can write your own.
Here is a draft :
$cache_file = 'content.cache';

if (file_exists($cache_file)) {
    if (time() - filemtime($cache_file) > 86400) {
        // Too old, re-fetch.
        $cache = file_get_contents('YOUR FILE SOURCE');
        file_put_contents($cache_file, $cache);
    } else {
        // Cache is still fresh, read it.
        $cache = file_get_contents($cache_file);
    }
} else {
    // No cache, create one.
    $cache = file_get_contents('YOUR FILE SOURCE');
    file_put_contents($cache_file, $cache);
}
UPDATE: the previous if case was incorrect; it is now rectified by comparing against the current time. Thanks @Arrakeen.
Like @deceze says, generally the answer is no. However, operating-system-level caches may cache recently used files for quicker access, but I wouldn't count on those being available. If you'd like to cache a file that is being read multiple times per request, consider using a static variable to act as a cache inside a wrapper function.
function my_file_read($filename) {
    static $file_contents = array();
    if (!isset($file_contents[$filename])) {
        $file_contents[$filename] = file_get_contents($filename);
    }
    return $file_contents[$filename];
}
Calling my_file_read($filename) multiple times will only read the file from disk a single time; subsequent calls will read the value from the static variable within the function. Note that you shouldn't count on this approach for large files or ones used only once per page, since the memory used by the static variable will persist until the end of the request. Keeping the contents of files unnecessarily in static variables is a good way to make your script a memory hog.
The correct answer is yes, with a caveat: PHP's filesystem functions use the realpath cache, and you can set "realpath_cache_size = 0" in php.ini to disable that caching if you like. The default caching timeout (realpath_cache_ttl) is 120 seconds. Note that this caches path resolution and stat information, not file contents. It is also separate from the caching typically done by browsers for GET requests (the majority of Web accesses) unless the HTTP headers override it. Caching is not a good idea during development work, since your code may see stale information about a file whose contents you have changed.
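If stale stat data during development is a concern, PHP lets you clear these caches explicitly; a one-line sketch:
// Clears PHP's stat cache; passing true also clears the realpath cache.
clearstatcache(true);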

PHP Memory management and arrays

Do I need to worry about memory leaks with PHP? In particular, I have the following code that is being called from a browser. When the call finishes, is everything cleaned up properly, or do I need to clear the memory created by the first array that was created?
class SomeClass
{
    var $someArray = array();

    function someMethod()
    {
        $this->someArray[1] = "Some Value 1";
        $this->someArray[2] = "Some Value 2";
        $this->someArray[3] = "Some Value 3";

        $this->someArray = array();
        $this->someArray[1] = "Some other Value";
        $this->someArray[2] = "Some other Value";
        $this->someArray[3] = "Some other Value";
    }
}
someMethod();
Thanks,
Scott
Do I need to worry about memory leaks with PHP?
It's possible to have a cyclic reference in PHP where the refcount of the zval never drops to 0. This causes a memory leak (the GC won't clean up structures that still have references to them). This has been fixed in PHP >= 5.3 with the addition of a cycle-collecting garbage collector.
In particular, I have the following code that is being called from a browser. When the call finishes, is everything cleaned up properly, or, do I need to clear the memory created by the first array that was created?
PHP scripts have a request lifecycle (run application, return response, close application), so it shouldn't be a worry. All memory used by your application should be marked as freed when your application finishes, ready to be reused on the next request.
If you're super paranoid, you can always unset things; however, PHP is a garbage-collected language, meaning that unless there is a bug in the core or in an extension, there is never going to be a memory leak.
More information
On a side note, you should use the newer PHP 5 OOP syntax. Also, calling someMethod() directly would be an error; it would need to be $obj->someMethod(), where $obj is an instance of the class.
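For illustration, a minimal sketch of the same class in PHP 5 syntax:
class SomeClass
{
    // Visibility keywords (public/private/protected) replace PHP 4's "var".
    public $someArray = array();

    public function someMethod()
    {
        $this->someArray[1] = "Some Value 1";
    }
}

// Methods are called on an instance, not as bare functions:
$obj = new SomeClass();
$obj->someMethod();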
There actually do exist memory problems if you run mod_php through Apache with the mpm_prefork behavior. The problem is that memory consumed by PHP is not released back to the operating system. The same Apache process can reuse the memory for subsequent requests, but it can't be used by other programs (not even other Apache processes).
One solution is to restart the processes from time to time, for example by setting the MaxRequestsPerChild setting to something rather low (100 or so, maybe lower for lightly loaded servers). The best solution is to not use mod_php at all but instead run PHP through FastCGI.
This is a sysadmin issue though, not a programmer issue.

How to save a PHP variable every 5 minutes without a database

On my website there is a PHP function func1() which gets some info from other resources. It is very costly to run this function.
I want that when Visitor1 comes to my website, func1() is executed and the value $variable1 = func1(); is stored in a text file (or something similar, but not a database).
Then a time interval of 5 min starts, and when Visitor2 visits my website during this interval, he gets the value from the text file without func1() being called.
When Visitor3 comes 20 min later, the function should be called again and the new value stored for 5 minutes.
How can I do this? A small working example would be nice.
Store it in a file, and check the file's timestamp with filemtime(). If it's too old, refresh it.
$maxage = 1200; // 20 minutes...

// If the file doesn't exist yet, or is older than the max age...
if (!file_exists("file.txt") || filemtime("file.txt") < (time() - $maxage)) {
    // Write a new value with file_put_contents()
    $value = func1();
    file_put_contents("file.txt", $value);
} else {
    // Otherwise read the value from the file...
    $value = file_get_contents("file.txt");
}
Note: There are dedicated caching systems out there already, but if you only have this one value to worry about, this is a simple caching method.
What you are trying to accomplish is called caching. Some of the other answers here describe caching at its simplest: to a file. There are many other options for caching, depending on the size of the data, the needs of the application, etc.
Here are some caching storage options:
File
Database/SQLite (yes, you can cache to a database)
MemCached
APC
XCache
There are also many things you can cache. Here are a few:
Plain Text/HTML
Serialized data such as PHP objects
Function Call output
Complete Pages
For a simple yet very configurable way to cache, you can use the Zend_Cache component from the Zend Framework. It can be used on its own, without the whole framework, as described in this tutorial.
I saw somebody suggest using sessions. That is not what you want, since sessions are only available to the current user.
Here is an example using Zend_Cache:
include 'library/Zend/Cache.php';

// Unique cache tag
$cache_tag = "myFunction_Output";

// Lifetime set to 300 seconds = 5 minutes
$frontendOptions = array(
    'lifetime' => 300,
    'automatic_serialization' => true
);
$backendOptions = array(
    'cache_dir' => 'tmp/'
);

// Create cache object
$cache = Zend_Cache::factory('Core', 'File', $frontendOptions, $backendOptions);

// Try to get data from cache
if (!($data = $cache->load($cache_tag))) {
    // Not found in cache, call function and save it
    $data = myExpensiveFunction();
    $cache->save($data, $cache_tag);
} else {
    // Found data in cache, check it out
    var_dump($data);
}
In a text file. The (almost) oldest way of saving stuff. Or set up a cron job to run the script with the function every 5 minutes, independently of visits.
Use caching, such as APC!
If the resource is really big, this may not be the best option and a file may then indeed be better.
Look at:
apc_store
apc_fetch
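A minimal sketch of the pattern with a 5-minute TTL, where func1() is the expensive function from the question and the key name is arbitrary:
$value = apc_fetch('func1_result', $success);
if (!$success) {
    $value = func1();                       // the expensive call
    apc_store('func1_result', $value, 300); // expires after 300 seconds
}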
Good luck!

Can I compile my PHP script to a faster executing format?

I have a PHP script that acts as a JSON API to my backend database.
Meaning, you send it an HTTP request like: http://example.com/json/?a=1&b=2&c=3... it will return a json object with the result set from my database.
PHP works great for this because it's literally about 10 lines of code.
But I also know that PHP is slow, and this is an API that's being called about 40x per second at times, so PHP is struggling to keep up.
Is there a way I can compile my PHP script to a faster-executing format? I'm already using PHP-APC, which is an opcode cache for PHP, as well as FastCGI.
Or, does anyone recommend a language I rewrite the script in so that Apache can still process the example.com/json/ requests?
Thanks
UPDATE: I just ran some benchmarks:
The PHP script takes 0.6 seconds to complete.
If I use the generated SQL from the PHP script above and run the query from the same web server, but directly from the MySQL command line (meaning network latency is still in play), the fetched result set takes only 0.09 seconds.
As you can see, PHP is roughly an order of magnitude slower at generating the results. The network does not appear to be the major bottleneck in this case, though I agree it typically is the root cause.
Before you go optimizing something, first figure out if it's a problem. Considering it's only 10 lines of code (according to you), I very much suspect you don't have a problem. Time how long the script takes to execute. Bear in mind that network latency will typically dwarf trivial script execution times.
In other words: don't solve a problem until you have a problem.
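As a rough sketch of how to time it:
// Measure roughly where the time goes before optimizing anything.
$t0 = microtime(true);
// ... run the query and build the JSON response ...
error_log('request took ' . (microtime(true) - $t0) . ' seconds');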
You're already using an opcode cache (APC). It doesn't get much faster than that. More to the point, it rarely needs to get any faster than that.
If anything you'll have problems with your database. Too many connections (unlikely at 20x per second), too slow to connect or the big one: query is too slow. If you find yourself in this situation 9 times out of 10 effective indexing and database tuning is sufficient.
In the cases where it isn't is where you go for some kind of caching: memcached, beanstalkd and the like.
But honestly 20x per second means that these solutions are almost certainly overengineering for something that isn't a problem.
I've had a lot of luck using PHP, memcached and nginx's memcache module together for very fast results. The easiest way is to just use the full URL as the cache key.
I'll assume this URL:
/widgets.json?a=1&b=2&c=3
Example PHP code:
<?php
$widgets_cache_key = $_SERVER['REQUEST_URI'];

// connect to memcache (requires memcache pecl module)
$m = new Memcache;
$m->connect('127.0.0.1', 11211);

// try to get data from cache
$data = $m->get($widgets_cache_key);
if (empty($data)) {
    // data is not in cache. grab it.
    $r = mysql_query("SELECT * FROM widgets WHERE ...;");
    while ($row = mysql_fetch_assoc($r)) {
        $data[] = $row;
    }
    // now store data for next time.
    $m->set($widgets_cache_key, $data);
}

var_dump(json_encode($data));
?>
That in itself provides a huge performance boost. If you were to then use nginx as a front-end for Apache (put Apache on 8080 and nginx on 80), you could do this in your nginx config:
worker_processes 2;

events {
    worker_connections 1024;
}

http {
    include mime.types;
    default_type application/octet-stream;
    access_log off;
    sendfile on;
    keepalive_timeout 5;
    tcp_nodelay on;
    gzip on;

    upstream apache {
        server 127.0.0.1:8080;
    }

    server {
        listen 80;
        server_name _;

        location / {
            if ($request_method = POST) {
                proxy_pass http://apache;
                break;
            }
            set $memcached_key $request_uri;
            memcached_pass 127.0.0.1:11211;
            default_type text/html;
            proxy_intercept_errors on;
            error_page 404 502 = /fallback;
        }

        location /fallback {
            internal;
            proxy_pass http://apache;
            break;
        }
    }
}
Notice the set $memcached_key $request_uri; line. This sets the memcached key to the same REQUEST_URI (including the query string) that the PHP script uses. So if nginx finds a cache entry for that key, it serves it directly from memory, and you never have to touch PHP or Apache. Very fast.
There is an unofficial Apache memcache module as well. I haven't tried it, but if you don't want to mess with nginx, it may help you too.
The first rule of optimization is to make sure you actually have a performance problem. The second rule is to figure out where the performance problem is by measuring your code. Don't guess. Get hard measurements.
PHP is not going to be your bottleneck. I can pretty much guarantee that. Network bandwidth and latency will dwarf the small overhead of using PHP vs. a compiled C program. And if not network speed, then it will be disk I/O, or database access, or a really bad algorithm, or a host of other more likely culprits than the language itself.
If your database is very read-heavy (I'm guessing it is) then a basic caching implementation would help, and memcached would make it very fast.
Let me change your URL structure for this example:
/widgets.json?a=1&b=2&c=3
For each call to your web service, you'd be able to parse the GET arguments and use those to create a key to use in your cache. Let's assume you're querying for widgets. Example code:
<?php
// a function to provide a consistent cache key for your resource
function cache_key($type, $params = array()) {
    if (empty($type)) {
        return false;
    }
    // order your parameters alphabetically by key.
    ksort($params);
    return sha1($type . serialize($params));
}

// you get the same cache key no matter the order of parameters
var_dump(cache_key('widgets', array('a' => 3, 'b' => 7, 'c' => 5)));
var_dump(cache_key('widgets', array('b' => 7, 'a' => 3, 'c' => 5)));

// now let's use some GET parameters.
// you'd probably want to sanitize your $_GET array, however you want.
$_GET = sanitize($_GET);

// assuming URL of /widgets.json?a=1&b=2&c=3 results in the following func call:
$widgets_cache_key = cache_key('widgets', $_GET);

// connect to memcache (requires memcache pecl module)
$m = new Memcache;
$m->connect('127.0.0.1', 11211);

// try to get data from cache
$data = $m->get($widgets_cache_key);
if (empty($data)) {
    // data is not in cache. grab it.
    $r = mysql_query("SELECT * FROM widgets WHERE ...;");
    while ($row = mysql_fetch_assoc($r)) {
        $data[] = $row;
    }
    // now store data for next time.
    $m->set($widgets_cache_key, $data);
}

var_dump(json_encode($data));
?>
You're already using APC opcode caching which is good. If you find you're still not getting the performance you need, here are some other things you could try:
1) Put a Squid caching proxy in front of your web server. If your requests are highly cacheable, this might make good sense.
2) Use memcached to cache expensive database lookups.
If you're handling database updates, then IMO it's your MySQL performance that needs attention. I would expand the test harness like so:
run mytop on the dbserver
run ab (apache bench) from a client, like your desktop
run top or vmstat on the webserver
And watch for these things:
updates to the table forcing reads to wait (MyISAM engine)
high load on the webserver (could indicate low memory conditions on webserver)
high disk activity on webserver, possibly from logging or other web requests causing random seeking of uncached files
memory growth of your apache processes. If your result sets are getting transformed into large associative arrays, or getting serialized/deserialized, these can become expensive memory allocation operations. Your code might need to avoid calls like mysql_fetch_assoc() and start fetching one row at a time.
I often wrap my DB queries with a little profiler adapter that I can toggle to log unusually long query times, like so:
function query($sql, $dbcon, $thresh) {
    $delta['begin'] = microtime(true);
    $result = $dbcon->query($sql);
    $delta['finish'] = microtime(true);
    $delta['t'] = $delta['finish'] - $delta['begin'];
    if ($delta['t'] > $thresh) {
        error_log("query took {$delta['t']} seconds; query: $sql");
    }
    return $result;
}
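Usage might look like this (a sketch; $db is assumed to be a mysqli or similar object exposing query(), and the 0.5-second threshold is arbitrary):
$db = new mysqli('127.0.0.1', 'user', 'pass', 'mydb');
// Log any query slower than half a second:
$result = query("SELECT * FROM widgets", $db, 0.5);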
Personally, I prefer using xcache to APC, because I like the diagnostics page it comes with.
Chart your performance over time. Track the number of concurrent connections and see if it correlates with performance issues. You can grep the number of HTTP connections out of netstat from a cron job and log that for later analysis.
Consider enabling your mysql query cache, too.
Please see this question. You have several options. Yes, PHP can be compiled to native ELF (and possibly even FatELF) format. The problem is all of the Zend creature comforts.
Since you already have APC installed, it can be used (similar to the memcached recommendations) to store objects. If you can cache your database results, do it!
http://us2.php.net/manual/en/function.apc-store.php
http://us2.php.net/manual/en/function.apc-fetch.php
From your benchmark it looks like the PHP code is indeed the problem. Can you post the code?
What happens when you remove the MySQL code and just put in a hard-coded string representing what you'll get back from the db?
Since it takes 0.60 seconds from PHP and only 0.09 seconds from the MySQL CLI, I would guess that connection creation is taking too much time. PHP creates a new connection per request by default, and that can be slow.
Think about it: depending on your environment and your code, you will:
Resolve the hostname of the MySQL server to an IP
Open a connection to the server
Authenticate to the server
Finally run your query
Have you considered using persistent MySQL connections or connection pooling?
It effectively allows you to jump right to the query step from above.
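A minimal sketch with PDO, where the host and credentials are placeholders:
// PDO reuses an existing connection when PDO::ATTR_PERSISTENT is set,
// skipping the resolve/connect/authenticate steps listed above.
$pdo = new PDO(
    'mysql:host=127.0.0.1;dbname=mydb',
    'user',
    'pass',
    array(PDO::ATTR_PERSISTENT => true)
);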
Caching is great for performance as well. I think others have covered this pretty well already.
