Each time someone lands on my page list.php?id=xxxxx, it re-runs some MySQL queries to return this:
$ids = array(..,..,..); // small array - no longer than 50 numeric records
$thumbs = array(..,..,..); // small array - no longer than 50 text records
$artdesc = "some text not very long"; // text field
Because the database I query is quite big, I would like to cache these results for 24 h, maybe in a file like xxxxx.php in a /cache/ directory, so I can load it with include("xxxxx.php") when it is present (or txt files, or any other way).
Because the data is very simple, I believe this can be done in a few lines of PHP, with no need for memcached or other professional tools.
Because my PHP is very limited, can someone just post the main PHP lines (or code) for this task?
I really would be very thankful!
Caching a PHP array is pretty easy:
file_put_contents($path, '<?php return '.var_export($my_array,true).';?>');
Then you can read it back out:
if (file_exists($path)) $my_array = include($path);
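Putting the two together for the 24-hour case in the question, a minimal sketch could look like this (the cache path and run_my_queries() are placeholders for your own paths and MySQL code):
$path = __DIR__ . '/cache/' . (int)$_GET['id'] . '.php'; // hypothetical cache location
if (file_exists($path) && filemtime($path) > time() - 86400) {
    $my_array = include($path); // cache hit, file is younger than 24h
} else {
    $my_array = run_my_queries($_GET['id']); // placeholder for your MySQL queries
    file_put_contents($path, '<?php return ' . var_export($my_array, true) . ';?>');
}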
You might also want to look into ADOdb, which provides caching internally.
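If you go the ADOdb route, the usual pattern is roughly the following; this is a sketch from memory of the ADOdb docs, so verify CacheExecute() and $ADODB_CACHE_DIR against the version you install (connection details and the query are placeholders):
include 'adodb/adodb.inc.php'; // path depends on where ADOdb is installed
$ADODB_CACHE_DIR = '/path/to/cache'; // where ADOdb writes its cache files
$db = ADONewConnection('mysqli');
$db->Connect('localhost', 'user', 'pass', 'database');
// identical queries are served from the cache file for 86400 seconds (24h)
$rs = $db->CacheExecute(86400, 'SELECT id, thumb FROM articles WHERE id = ?', array($id));
$rows = $rs ? $rs->GetArray() : array();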
Try using serialize:
Suppose you get your data in two arrays, $array1 and $array2. Now what you have to do is store these arrays in files. Storing a string (the third variable in your question) in a file is easy, but to store an array you first have to convert it to a string.
$string_of_array1 = serialize( $array1 );
$string_of_array2 = serialize( $array2 );
The next problem is the naming of the cache files so that you can easily check whether the relevant array is already available in the cache. The best way to do this is to create an MD5 hash of your MySQL query and use it as the cache file name.
$cache_dir = '/path/cache/';
$query1 = 'SELECT many , fields FROM first_table INNER JOIN another_table ...';
$cache1_filename = md5( $query1 );
if( file_exists( $cache_dir . $cache1_filename ) )
{
    if( filemtime( $cache_dir . $cache1_filename ) > ( time( ) - 60 * 60 * 24 ) )
    {
        $array1 = unserialize( file_get_contents( $cache_dir . $cache1_filename ) );
    }
}
if( !isset( $array1 ) )
{
    $array1 = run_mysql_query( $query1 );
    // write the serialized array to its cache file (the file name argument was missing here)
    file_put_contents( $cache_dir . $cache1_filename, serialize( $array1 ) );
}
Repeat the above with the other array, stored in a separate file with the MD5 of the second query as the name of the second cache file.
In the end, you have to decide how long your cache should remain valid. For the same query, records in your MySQL table may change, which makes your file system cache outdated. So you cannot rely on unique file names for unique queries alone.
Important:
Old cache files have to be deleted. You may have to write a routine that checks all files of a directory and deletes those older than n seconds; see the sketch below.
Keep the cache dir outside the webroot.
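A minimal cleanup sketch for that routine, assuming the same cache directory and a 24-hour limit (adjust both to your setup); it can run from cron or at the end of the page:
$cache_dir = '/path/cache/';
$max_age = 60 * 60 * 24; // seconds
foreach ( glob( $cache_dir . '*' ) as $cache_file )
{
    if ( is_file( $cache_file ) && filemtime( $cache_file ) < time() - $max_age )
    {
        unlink( $cache_file ); // drop cache entries older than 24h
    }
}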
Just write a new file named after $_GET['id'] with the contents of the stuff you want cached, and each time check whether that file exists; if not, create it. Something like this:
$id = $_GET['id'];
if (file_exists('/a/dir/' . $id)) {
    $data = file_get_contents('/a/dir/' . $id);
} else {
    // do the MySQL query, set $data to the result
    $handle = fopen('/a/dir/' . $id, 'w+');
    fwrite($handle, $data);
    fclose($handle);
}
Based on @hamid-sarfraz's answer, here is a solution used in a class extending PDO, using json_encode/json_decode instead of serialize:
function get_assoc_c($query, $lifetime = 60*60*24) {
    $c_dir = '/path/to/.cache/';
    $c_filename = md5($query);
    if (file_exists($c_dir . $c_filename)) {
        if (filemtime($c_dir . $c_filename) > (time() - $lifetime)) {
            return json_decode(file_get_contents($c_dir . $c_filename), true);
        }
    }
    if (!isset($content)) {
        if (!file_exists($c_dir))
            mkdir($c_dir);
        $stmt = $this->query($query);
        $content = $stmt->fetchAll(PDO::FETCH_ASSOC);
        file_put_contents($c_dir . $c_filename, json_encode($content));
        return $content;
    }
    return false;
}
Be aware: do not use this with queries whose arguments are interpolated from variables (SQL injection risk); the query string is executed as-is, without prepared parameters.
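A possible usage sketch, assuming the method above sits in a class extending PDO (the class name, credentials and table are made up here):
class CachingPDO extends PDO {
    // ... get_assoc_c() from above goes here ...
}

$db = new CachingPDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');
// first call hits MySQL, subsequent calls within 24h are served from the JSON cache file
$articles = $db->get_assoc_c('SELECT id, thumb FROM articles ORDER BY id');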
I am making a Covid-19 statistics website - https://e-server24.eu/ . Every time somebody enters the website, the PHP script decodes JSON from 3 URLs and stores the data in some variables.
I want to make my website more optimized, so my question is: is there any script that can update the variables' data once per day, not every time someone accesses the website?
Thanks,
I suggest looking into memory object caching.
Many high-performance PHP web apps use caching extensions (e.g. Memcached, APCu, WinCache), accelerators (e.g. APC, Varnish) and caching DBs like Redis. The setup can be a bit involved, but you can get started with a simple roll-your-own solution (inspired by this):
<?php
function cache_set($key, $val) {
    $val = var_export($val, true);
    // HHVM fails at __set_state, so just use object cast for now
    $val = str_replace('stdClass::__set_state', '(object)', $val);
    // Write to temp file first to ensure atomicity
    $tmp = sys_get_temp_dir()."/$key." . uniqid('', true) . '.tmp';
    file_put_contents($tmp, '<?php $val = ' . $val . ';', LOCK_EX);
    rename($tmp, sys_get_temp_dir()."/$key");
}
function cache_get($key) {
    // the cached file defines $val; @ silences the warning when the file does not exist yet
    @include sys_get_temp_dir()."/$key";
    return isset($val) ? $val : false;
}
$ttl_hours = 24;
$now = new DateTime();
// Get results from cache if possible. Otherwise, retrieve it.
$data = cache_get('my_key');
$last_change = cache_get('my_key_last_mod');
// cached and still fresh? (->h alone ignores whole days, so count them in)
$age_hours = ($last_change === false) ? null : ($now->diff($last_change)->days * 24 + $now->diff($last_change)->h);
if ($data === false || $last_change === false || $age_hours >= $ttl_hours) {
    // expensive call to get the actual data; we simply create an object to demonstrate the concept
    $myObj = new stdClass();
    $myObj->name = "John";
    $myObj->age = 30;
    $myObj->city = "New York";
    $data = json_encode($myObj);
    // Add to user cache
    cache_set('my_key', $data);
    $last_change = new DateTime(); // now
    // Add timestamp to user cache
    cache_set('my_key_last_mod', $last_change);
}
echo $data;
Voila.
Furthermore, you could look into client-side caching and many other things, but this should give you an idea.
PS: Most memory cache systems let you define a time-to-live (TTL), which makes this more concise, but I wanted to keep this example dependency-free. Cache cleaning was omitted here; simply delete the temp files.
A simple way to do that:
Create a script which fetches and decodes the JSON data and stores it in your database.
Then set up a cron job that runs it every 24 hours.
And when users visit your site, fetch the data from your database instead of from your API provider.
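As a sketch, the crontab entry could be 0 3 * * * /usr/bin/php /var/www/update_stats.php (run once a day at 03:00), and the script it runs might look roughly like this; the URL, table and column names are placeholders for whatever your API and schema actually use:
<?php
// update_stats.php (hypothetical name) - run by cron, never by visitors
$stats = json_decode(file_get_contents('https://example.org/covid.json'), true);
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');
$stmt = $pdo->prepare('REPLACE INTO covid_stats (country, cases) VALUES (?, ?)');
foreach ($stats as $row) {
    $stmt->execute(array($row['country'], $row['cases']));
}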
I have 1000 JSON files on my server and users request their file with something like mysite.com/request.php?file=id, and I should show them id.json. But before I show it, I should check whether id.json needs to be updated or not.
My json files are something like this:
{
"response": {
......,
"lastupdate":14323342
}
}
In lastupdate I store the last update time in seconds, and if the current time is more than 1 hour past it I should update my file from somewhere else.
Now my question: is it better to save lastupdate in MySQL or in each JSON file?
If I use MySQL I need 1000 rows, and each row needs 2 columns: the first column is id and the second is lastupdate. I have more than 10,000 users each day, and server hardware matters to me.
If you're grabbing a specific file (not opening a "flatfile database" and searching through it), it should be faster than querying a MySQL db.
A thought from a similar case: if the same ID could be requested a number of times in less than an hour, AND if you actually don't need to update the last update time in the "json file" until a user asks for it, then when there's a request to request.php?file=id I'll:
"try" to get and decode the json file, put the decoded to (eg) $data
if the file doesn't exist OR lastupdate is more than an hour old
{
update the lastupdate, and update $data
update the json file
}
show $data
... it's basically simple caching: try to get the data from a file instead of the db, and avoid updating (a db query) when no one is asking.
You can safely rm -rf everything in the "cache folder" anytime.
if ( $data = file_get_contents( 'path_to_cache/' . $id . '.json' ) )
    $data = json_decode( $data );
if ( empty( $data ) || $data->lastupdate < time() - 3600 )
{
    // do query, put to $data
    // encode $data, put to 'path_to_cache/' . $id . '.json'
}
echo json_encode( $data ); // $data was decoded above, so re-encode it for output
... and actually, in my case I don't need to put lastupdate to DB. So I can simply use the json file's filemtime().
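With filemtime() the same idea could look roughly like this (the path and fetch_fresh_data() are placeholders):
$path = 'path_to_cache/' . $id . '.json';
if ( !file_exists( $path ) || filemtime( $path ) < time() - 3600 )
{
    $data = fetch_fresh_data( $id ); // placeholder for the query / remote call
    file_put_contents( $path, json_encode( $data ) ); // rewriting the file also resets its mtime
}
else
{
    $data = json_decode( file_get_contents( $path ) );
}
echo json_encode( $data );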
I have two CSV files, and both have the same data structure.
ID - Join_date - Last_Login
I want to compare them and get the exactly matching records, based on this example:
the first file has 100 records, of which 20 are not included in the 2nd file.
the 2nd file has 120 records.
I want a PHP script to compare these two files and build two separate CSV files.
And I want to remove all extra records from the 2nd file which are not included in the first file.
And remove all records from the first file which are not included in the 2nd file.
Thanks
There is a GNU utility, comm, that will do this really easily. You could exec it from PHP or just run it directly. If you don't have access to comm, the easiest thing to do would be to read both files into arrays (probably via file()) and use array_intersect().
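A rough sketch of the file()/array_intersect() approach; it compares whole lines, so it assumes both files use identical formatting for matching records (the output file names are made up):
$a = file('file1.csv', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
$b = file('file2.csv', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
// keep only the lines present in both files, then write the two cleaned copies
file_put_contents('file1.cleaned.csv', implode(PHP_EOL, array_intersect($a, $b)) . PHP_EOL);
file_put_contents('file2.cleaned.csv', implode(PHP_EOL, array_intersect($b, $a)) . PHP_EOL);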
You can try this for a limited number of CSV records; if you have a very large CSV I would advise you to import it directly into MySQL.
function csvToArray($csvFile, $full = false) {
    $handle = fopen ( $csvFile, "r" );
    $array = array ();
    while ( ($data = fgetcsv ( $handle )) !== FALSE ) {
        $array [] = ($full === true) ? $data : $data[0]; // Full array or only ID
    }
    fclose ( $handle ); // close the file handle (missing in the original)
    return $array;
}
$file1 = "file1.csv" ;
$file2 = "file2.csv" ;
$fileData1 = csvToArray($file1);
$fileData2 = csvToArray($file2);
var_dump(array_diff($fileData1,$fileData2));
var_dump(array_intersect($fileData1,$fileData2));
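To actually build the two cleaned CSV files the question asks for, a possible follow-up sketch using the $full mode and fputcsv() (the output file names are made up):
$ids_in_both = array_intersect($fileData1, $fileData2);

function writeMatching($srcFile, $dstFile, $ids) {
    $out = fopen($dstFile, "w");
    foreach (csvToArray($srcFile, true) as $row) {
        if (in_array($row[0], $ids)) { // keep only rows whose ID exists in the other file
            fputcsv($out, $row);
        }
    }
    fclose($out);
}

writeMatching("file1.csv", "file1.matched.csv", $ids_in_both);
writeMatching("file2.csv", "file2.matched.csv", $ids_in_both);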
I have a bunch of files I need to crunch and I'm worrying about scalability and speed.
The filename and file data (only the first line) are stored in an array in RAM to create some statistical files later in the script.
The files must remain files and can't be put into a database.
The filenames are formatted in the following fashion:
Y-M-D-title.ext (where Y is Year, M for Month and D for Day)
I'm currently using glob to list all the files and create my array.
Here is a sample of the code creating the array for "year" or "month" (it's used in a function with only one parameter -> $period):
[...]
function create_data_info($period=NULL){
    $data = array();
    $files = glob(ROOT_DIR.'/'.'*.ext');
    $size = sizeOf($files);
    $existing_title = array(); // Used so we can handle having the same title twice at different dates.
    if (isSet($period)){
        if ( "year" === $period ){
            for ($i = 0; $i < $size; $i++) {
                $info = extract_info($files[$i], $existing_title);
                // Create the data array with all the data ordered by year/month/day
                $data[(int)$info[5]][] = $info;
                unset($info);
            }
        }elseif ( "month" === $period ){
            for ($i = 0; $i < $size; $i++) {
                $info = extract_info($files[$i], $existing_title);
                $key = $info[5].$info[6];
                // Create the data array with all the data ordered by year/month/day
                $data[(int)$key][] = $info;
                unset($info);
            }
        }
    }
    [...]
}
function extract_info($file, &$existing){
    $full_path_file = $file;
    $file = basename($file);
    $info_file = explode("-", $file, 4);
    $filetitle = explode(".", $info_file[3]);
    $info[0] = $filetitle[0];
    if (!isSet($existing[$info[0]]))
        $existing[$info[0]] = -1;
    $existing[$info[0]] += 1;
    if ($existing[$info[0]] > 0) {
        // We have already found a post with this title;
        // the creation of the cache is based on info[4] data for the filename,
        // so we need to tune it
        $info[0] = $info[0]."-".$existing[$info[0]];
    }
    $info[1] = $info_file[3];
    $info[2] = $full_path_file;
    $post_content = file(ROOT_DIR.'/'.$file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    $info[3] = $post_content[0]; // first line of the file
    unset($post_content);
    $info[4] = filemtime(ROOT_DIR.'/'.$file);
    $info[5] = $info_file[0]; // year
    $info[6] = $info_file[1]; // month
    $info[7] = $info_file[2]; // day
    return $info;
}
So in my script I only call create_data_info(PERIOD) (PERIOD being "year", "month", etc..)
It returns an array filled with the info I need, and then I can loop through it to create my statistics files.
This process is done every time the PHP script is launched.
My question is: is this code optimal (certainly not), and what can I do to squeeze some juice out of it?
I don't know how I can cache this (even if it's possible), as there is a lot of I/O involved.
I can change the tree structure if that would help compared to a flat structure, but from what I found in my tests, flat seems to be best.
I already thought about writing a little "booster" in C that does only the crunching, but since this is I/O bound I don't think it would make a huge difference, and the application would be a lot less compatible for shared hosting users.
Thank you very much for your input; I hope I was clear enough here. Let me know if you need clarification (and forgive my English mistakes).
To begin with, you should use DirectoryIterator instead of the glob function. When it comes to scandir vs opendir vs glob, glob is as slow as it gets.
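For example, the glob() call in create_data_info() could be swapped for something like this sketch (assuming the same ROOT_DIR constant and .ext extension):
$files = array();
foreach (new DirectoryIterator(ROOT_DIR) as $entry) {
    // skip "." and ".." and anything that is not a *.ext file
    if (!$entry->isDot() && $entry->isFile() && $entry->getExtension() === 'ext') {
        $files[] = $entry->getPathname();
    }
}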
Also, when you are dealing with a large number of files you should try to do all your processing inside one loop; PHP function calls are rather slow.
I see you are using unset($info); yet on every loop iteration $info gets a new value. PHP does its own garbage collection, if that's your concern. unset is a language construct, not a function, and should be pretty fast, but when used where it's not needed it still makes the whole thing a bit slower.
You are passing $existing as a reference. Is there a practical benefit to this? In my experience references make things slower.
And lastly, your script seems to do a lot of string processing. You might want to consider some kind of "serialize data and base64 encode/decode" solution, but you should benchmark that specifically; it might be faster or slower depending on your whole code. (My thinking is that serialize/unserialize MIGHT run faster, as these are native PHP functions, whereas custom functions doing string processing are slower.)
My answer was not very I/O related, but I hope it was helpful.
I have a project that needs to create files using fwrite in PHP. What I want to do is make it generic: I want each file to be unique and not overwrite the others.
I am creating a project that will record the text from a PHP form and save it as HTML, so I want the output to be generated-file1.html, generated-file2.html, etc. Thank you.
This will give you a count of the number of HTML files in a given directory:
$filecount = count(glob("/Path/to/your/files/*.html"));
and then your new filename will be something like:
$generated_file_name = "generated-file".($filecount+1).".html";
and then fwrite using $generated_file_name
Although I've had to do a similar thing recently and used uniqid instead. Like this:
$generated_file_name = md5(uniqid(mt_rand(), true)).".html";
I would suggest using the time as the first part of the filename (as that should result in files being listed in chronological/alphabetical order), and then borrowing from @TomcatExodus to improve the chances of the filename being unique (in case two submissions are simultaneous).
<?php
$data = $_POST;
$md5 = md5( serialize( $data ) ); # md5() needs a string, and $_POST is an array
$time = time();
$filename_prefix = 'generated_file';
$filename_extn = 'htm';
$filename = $filename_prefix.'-'.$time.'-'.$md5.'.'.$filename_extn;
if( file_exists( $filename ) ){
    # EXTREMELY UNLIKELY, unless two forms with the same content and at the same time are submitted
    $filename = $filename_prefix.'-'.$time.'-'.$md5.'-'.uniqid().'.'.$filename_extn;
    # IMPROBABLE that this will clash now...
}
if( file_exists( $filename ) ){
    # Handle the Error Condition
}else{
    file_put_contents( $filename , 'Whatever the File Content Should Be...' );
}
This would produce filenames like:
generated_file-1300080525-46ea0d5b246d2841744c26f72a86fc29.htm
generated_file-1300092315-5d350416626ab6bd2868aa84fe10f70c.htm
generated_file-1300109456-77eae508ae79df1ba5e2b2ada645e2ee.htm
If you want to make absolutely sure that you will not overwrite an existing file, you could append a uniqid() to the filename. If you want it to be sequential you'll have to read the existing files from your filesystem and calculate the next increment, which can add I/O overhead.
I'd go with the uniqid() method :)
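For instance, a one-line sketch (with the extra-entropy flag so collisions are even less likely):
$generated_file_name = 'generated-file-' . uniqid('', true) . '.html';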
If your implementation should produce unique form results every time (and therefore unique files), you could hash the form data into a filename, giving you unique paths as well as the opportunity to quickly sort out duplicates:
// capture all posted form data into an array
// validate and sanitize as necessary
$data = $_POST;
// hash data for filename
$fname = md5(serialize($data));
$fpath = 'path/to/dir/' . $fname . '.html';
if(!file_exists($fpath)){
    // write data to $fpath
}
Do something like this:
$i = 0;
while (file_exists("file-".$i.".html")) {
    $i++;
}
$file = fopen("file-".$i.".html", "w"); // fopen needs a mode; "w" opens the file for writing