How to efficiently cache mysql query in php for this table

I have a Posts table containing the fields title, contents, rating, no_of_comments, author_id. When a user downvotes a post, the rating field is decremented, and vice versa. I am also caching the display query, which shows the recent posts and joins to the Authors table. The problem is that the rating field needs to be updated often, i.e. there are a lot of upvotes and downvotes, so I need to rebuild the cache every time a user up- or downvotes a post. I believe this is a waste, because only one field in the entire cached data is updated. Is there any workaround for this issue? By the way, I am using file-based caching.

If you normalised your data structure, you would be able to cache the static part, i.e. the "Post", and keep the "Post Attributes" (or whatever you decide to call them), such as the rating, dynamic.
The number of comments should be derived by counting the rows in a "Comments" table.
A solid data structure will help you out a lot here.
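A minimal sketch of that split, assuming the static post fields come from the file cache and the current ratings are fetched on each request with one small query. The helper name mergeRatings, the 'id' key and the array shapes are illustrative assumptions, not part of the original schema.

```php
<?php
// Hypothetical helper: combine cached (static) post rows with live ratings.
// $cachedPosts: rows read from the file cache, each with an 'id' key (assumed).
// $ratings: map of post id => current rating, e.g. the result of
//           SELECT id, rating FROM posts WHERE id IN (...)
function mergeRatings(array $cachedPosts, array $ratings): array
{
    foreach ($cachedPosts as &$post) {
        // Fall back to 0 when no rating was returned for this post.
        $post['rating'] = isset($ratings[$post['id']]) ? (int) $ratings[$post['id']] : 0;
    }
    unset($post); // break the reference left over from the loop

    return $cachedPosts;
}

$cached = [
    ['id' => 1, 'title' => 'First post', 'author' => 'alice'],
    ['id' => 2, 'title' => 'Second post', 'author' => 'bob'],
];
$live = [1 => 12, 2 => -3];

$posts = mergeRatings($cached, $live);
```

With this arrangement a vote only touches the rating column; the cached file needs rebuilding only when a post is added or edited.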

Alright. So your solution is to open an ajax request to a PHP script that does the following:
UNTESTED
while (mysql_num_rows($result = mysql_query("SELECT * FROM `posts` WHERE `ID` = '$latest_id_prediction'")) == 0) {
    sleep(10); // no new post yet, check again in 10 seconds
}
echo json_encode(mysql_fetch_assoc($result));
This checks every 10 seconds whether there is a new post; if there is, it echoes the post information. Make an ajax request to this page without a timeout, and update the page when you get a reply. Once you have the reply, open a new connection.

Related

Count the number of times post has been viewed

I am working on a project where only the titles of posts are shown on the main page; clicking a title loads the full post on another page, posts.php. The code for this is:
<?php echo $row['title']; ?>
To count post views, I have a hits column in my posts table. Its value is initially 0, and whenever a post is opened it is increased by 1. The code for this, in posts.php, is:
$id = $_GET['postId'];
$sql = "UPDATE posts SET hits = hits + 1 WHERE post_id = $id";
But this is not good practice for tracking post views, because the count increases whenever the page is refreshed. I want a clean system where the count increases by one for every distinct user or visitor, regardless of how many times the same user/visitor views the same post (as on Stack Overflow). Tracking them by IP address or something else, perhaps; just an idea of how these sites do it, or how the approach works, would be enough to get me started.

You cannot solve your problem so simply. Your problem is counting the unique users who view a page (with, perhaps, a time component). You have several problems, as described in the comments. The first is determining what you are counting. For a prototype, IP address is as good as anything else for getting started. However, it has many shortcomings, and you will need to think hard about how to identify a "visitor".
There is a way to solve your problem in SQL, sort of efficiently. But, it requires an additional table at the level of post/visitor. It will have one row per combination, and then you will need to count from there. To quickly get the unique counts, you then need an insert trigger on that table.
Here is a sketch of the code:
create unique index unq_postvisitors_post_visitor on PostVisitors(postid, visitorid);
insert into PostVisitors (postid, visitorid)
    select $postid, $visitorid
    on duplicate key update counter = counter + 1;
delimiter $$
create trigger trig_PostVisitors
after insert on PostVisitors
for each row
begin
    update posts
        set numvisitors = numvisitors + 1
        where posts.post_id = new.postid;
end$$
delimiter ;

The simplest way I use to solve this problem is through cookies.
Whenever your page is opened, check whether a cookie named cookie_name is set, using isset($_COOKIE[$cookie_name]).
If isset returns false, set the cookie with setcookie($cookie_name, $value, $expire);, with an expiry of perhaps 24 hours. The expire argument is a Unix timestamp, so 24 hours from now is time() + 86400. Also increment your visitor counter by 1.
If isset returns true, do nothing.
PHP Cookies Refs
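The cookie check can be sketched as below. This is only an illustration: the helper names and the per-post cookie name viewed_post_<id> are assumptions, and the cookie array is passed in as a parameter so the logic can be exercised without a browser.

```php
<?php
// Hypothetical helper: should this request count as a new view?
// $cookies is normally $_COOKIE.
function isFirstView(array $cookies, int $postId): bool
{
    return !isset($cookies['viewed_post_' . $postId]);
}

// Hypothetical helper: expiry timestamp 24 hours (86400 seconds) after $now.
function viewCookieExpiry(int $now): int
{
    return $now + 86400;
}

// Typical use at the top of posts.php, before any output is sent:
//
//   if (isFirstView($_COOKIE, $id)) {
//       setcookie('viewed_post_' . $id, '1', viewCookieExpiry(time()));
//       // ...run the UPDATE that increments hits here...
//   }
```

A visitor who clears or blocks cookies is counted again, so this gives an approximate count rather than a strict one.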

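Try this, it'll work:
$refreshed = isset($_SERVER['HTTP_CACHE_CONTROL']) ? $_SERVER['HTTP_CACHE_CONTROL'] : '';
if ($refreshed != 'max-age=0') {
    $sql = "UPDATE posts SET hits = hits + 1 WHERE post_id = $id";
}
Put this script on the page. $_SERVER['HTTP_CACHE_CONTROL'] is typically set to max-age=0 when the page is refreshed, so the hit is counted only when the request is not a refresh.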

checking number of records in mySQL table

I am looking for a way to check whether there is a certain number of records in a MySQL table. For example: after a POST request, before putting the data into the database, it first checks how many records there are. If there are, say, 24 records, it deletes the record with the latest date based on timestamp and then inserts the new value from the POST request. Does anyone have an idea how to do this? Looking forward to your answers. Below is the simple code I wrote to insert data from the POST request into the table.
<?php
include("connect.php");
$link=Connection();
$temp1=$_POST["temp1"];
$hum1=$_POST["hum1"];
$query = "INSERT INTO `tempLog` (`temperature`, `humidity`)
VALUES ('".$temp1."','".$hum1."')";
mysql_query($query,$link);
mysql_close($link);
header("Location: index.php");
?>

When you say delete the record with the latest date, I have to assume you mean the oldest record? Your description doesn't tell me the name of your date field, so let's assume it's onDate. You also didn't mention what your primary key is, so let's assume that is just id. If you run the query below before inserting, it will purge all the oldest records, leaving only the newest 23 in the table.
delete from tempLog where id in (
    select id from (
        select @rownum:=@rownum+1 'rowid', t.id from tempLog t, (select @rownum:=0) r order by t.onDate desc
    ) v where v.rowid > 23
);
Of course you should test on data you don't mind losing.
It is best to do a cleanup purge each time rather than removing a single row before adding a new one because, if an exception ever leaves extra rows behind, single-row removal will never bring the table back down to the 24 rows you actually want.
I also want to note that you may want to reconsider this method altogether. Instead, leave the data there and query only the most recent 24 rows when displaying the log. Since you are going to the trouble of collecting the data, you might as well keep it for future reporting. Then, later down the road, if your table gets too large, run a simple daily purge query to delete anything older than a certain threshold.
Hope this helps.

Simple concurrency in PHP?

I have a small PHP function on my website which basically does 3 things:
check if user is logged in
if yes, check if he has the right to do this action (DB Select)
if yes, do the related action (DB Insert/Update)
If I have several users connected at the same time on my website that try to access this specific function, is there any possibility of concurrency problem, like we can have in Java for example? I've seen some examples about semaphore or native PHP synchronization, but is it relevant for this case?
My PHP code is below:
if ( user is logged ) {
sql execution : "SELECT....."
if(sql select give no results){
sql execution : "INSERT....."
}else if(sql select give 1 result){
if(selected column from result is >= 1){
sql execution : "UPDATE....."
}
}else{
nothing here....
}
}else{
nothing important here...
}

Each user who accesses your website is running a dedicated PHP process. So, you do not need semaphores or anything like that. Taking care of the simultaneous access issues is your database's problem.

Not in PHP. But you might have users inserting or updating the same content.
You have to make sure this does not happen.
If they are updating their own user profile, which only that user can access, no collision will occur.
BUT if they are editing content, as in a content management system, they can overwrite each other's edits. Then you have to implement some locking mechanism.
For example (there are a lot of ways), when a user starts editing, write an update on the content that records the current time and the user.
The user then has a lock on the content for maybe 10 minutes. You should show the (in this case) 10-minute countdown to the user in the frontend, plus a cancel button to unlock the content, and ... you probably get the idea.
If another person tries to load the content in those 10 minutes, they get an error: "user xy is already... lock expires at xx:xx"
Hope this helps.
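The lock rule described above could look roughly like this. It is a sketch under assumptions: the helper names, the LOCK_TTL constant and the shape of the $lock array (user plus Unix timestamp) are all hypothetical.

```php
<?php
const LOCK_TTL = 600; // 10 minutes, as in the example above

// Hypothetical helper. $lock is null when nobody holds the lock, otherwise
// ['user' => string, 'locked_at' => int Unix timestamp] (assumed shape).
function canAcquireLock(?array $lock, string $user, int $now): bool
{
    if ($lock === null) {
        return true; // content is not locked
    }
    if ($lock['user'] === $user) {
        return true; // the current holder may keep editing
    }
    // Another user holds the lock: allowed only once it has expired.
    return ($now - $lock['locked_at']) >= LOCK_TTL;
}

// Seconds left on a lock, for the countdown shown in the frontend.
function lockSecondsLeft(array $lock, int $now): int
{
    return max(0, LOCK_TTL - ($now - $lock['locked_at']));
}
```

In a real application the check and the write that takes the lock should probably happen in a single UPDATE with a WHERE clause on the lock columns, so that two requests cannot both pass the check at the same moment.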

In general, it is not safe to decide whether to INSERT or UPDATE based on a SELECT result, because a concurrent PHP process can INSERT the row after you executed your SELECT and saw no row in the table.
There are two solutions. Solution number one is to use REPLACE or INSERT ... ON DUPLICATE KEY UPDATE. These two query types are "atomic" from the perspective of your script, and solve most cases. REPLACE tries to insert the row, but if it hits a duplicate key it deletes the conflicting existing row and inserts the values you provide. INSERT ... ON DUPLICATE KEY UPDATE is a little more sophisticated (it updates the existing row in place), but is used in similar situations. See the documentation here:
http://dev.mysql.com/doc/refman/5.0/en/insert-on-duplicate.html
http://dev.mysql.com/doc/refman/5.0/en/replace.html
For example, if you have a table product_descriptions, and want to insert a product with ID = 5 and a certain description, but if a product with ID 5 already exists, you want to update the description, then you can just execute the following query (assuming there's a UNIQUE or PRIMARY key on ID):
REPLACE INTO product_descriptions (ID, description) VALUES(5, 'some description')
It will insert a new row with ID 5 if it does not exist yet, or replace the existing row with ID 5 if it already exists, which is probably exactly what you want.
If it is not, then approach number two is to use locking, like so:
query('LOCK TABLE users WRITE');
if (num_rows('SELECT * FROM users WHERE ...')) {
    query('UPDATE users ...');
}
else {
    query('INSERT INTO users ...');
}
query('UNLOCK TABLES');

Long polling with PHP and jQuery - issue with update and delete

I wrote a small script which uses the concept of long polling.
It works as follows:
jQuery sends the request with some parameters (say lastId) to php
PHP gets the latest id from database and compares with the lastId.
If the lastId is smaller than the newly fetched Id, then it kills the
script and echoes the new records.
From jQuery, i display this output.
I have taken care of all security checks. The problem is that when a record is deleted or updated, there is no way to know.
The nearest solution I can think of is to count the number of rows and compare it with a saved row count. But then, if I have 1000 records, I have to echo out all 1000 records, which can be a big performance issue.
The CRUD functionality of this application is completely separate and runs on a different server, so I don't get to know which record was deleted.
I don't need any help coding-wise, but I am looking for suggestions to make this work for updates and deletes.
Please note, websockets (my fav) and node.js are not an option for me.

Instead of using a certain ID from your table, you could also check when the table itself was modified the last time.
SQL:
SELECT UPDATE_TIME
FROM information_schema.tables
WHERE TABLE_SCHEMA = 'yourdb'
AND TABLE_NAME = 'yourtable';
If successful, the statement should return something like
UPDATE_TIME
2014-04-02 11:12:15
Then use the resulting timestamp instead of the lastid. I am using a very similar technique to display and auto-refresh logs, works like a charm.
You have to adjust the statement to your needs, and replace yourdb and yourtable with the values needed for your application. It also requires you to have access to information_schema.tables, so check if this is available, too.
Two alternative solutions:
If the solution described above is too imprecise for your purpose (it might lead to issues when the table is changed multiple times per second), you might combine that timestamp with your current lastid mechanism to cover new inserts.
Another way would be to implement a table in which the current state is logged. This is where your ajax requests check the current state. Then create triggers on your data tables that update this table.
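A rough sketch of the long-polling loop driven by the table's modification time. The function waitForChange and its parameters are hypothetical; the callable stands in for whatever code runs the information_schema query shown above, and it may return null if the storage engine does not report UPDATE_TIME.

```php
<?php
// Hypothetical helper: poll until the table's UPDATE_TIME differs from the
// value the client last saw, or until the timeout is reached.
// $fetchUpdateTime is any callable returning the current UPDATE_TIME string
// (or null when the server does not report one).
function waitForChange(callable $fetchUpdateTime, string $lastSeen, int $timeoutSeconds, int $intervalSeconds = 1): ?string
{
    $deadline = time() + $timeoutSeconds;
    do {
        $current = $fetchUpdateTime();
        if ($current !== null && $current !== $lastSeen) {
            return $current; // table changed: the caller re-sends the data
        }
        if ($intervalSeconds > 0) {
            sleep($intervalSeconds);
        }
    } while (time() < $deadline);

    return null; // nothing changed before the timeout
}
```

The client would send the last timestamp it received instead of lastId, and store the returned value for its next request; a null result simply means it should reconnect.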

You can get the highest ID by
SELECT id FROM table ORDER BY id DESC LIMIT 1
but this is not reliable in my opinion, because you can have IDs of 1, 2, 3, 7 and then insert a new row with the ID 5.
Keep in mind: the highest ID, is not necessarily the most recent row.
The current auto increment value can be obtained by
SELECT AUTO_INCREMENT FROM information_schema.tables
WHERE TABLE_SCHEMA = 'yourdb'
AND TABLE_NAME = 'yourtable';
Maybe a timestamp + microtime is an option for you?

I am just learning php as I go along, and I'm completely lost here. I've never really used join before, and I think I need to here, but I don't know. I'm not expecting anyone to do it for me but if you could just point me in the right direction it would be amazing, I've tried reading up on joins but there are like 20 different methods and I'm just lost.
Basically, I hand coded a forum, and it works fine but is not efficient.
I have board_posts (for posts) and board_forums (for forums, the categories as well as the sections).
The part I'm redoing is how I get the information for the last post for the index page. To avoid using joins, I set it up to store the info for the latest post in board_forums. So, say there is a section called "Off Topic": it has fields for "forum_lastpost_username/userid/posttitle/posttime", which I would update when a user posts, etc. But this is bad; I'm trying to grab it all dynamically and get rid of those fields.
Right now my query is just like:
SELECT * FROM board_forums WHERE forum_parent='$forum_id'
And then I have the stuff where I grab the info for that forum (name, description, etc) and all the data for the last post is there:
$last_thread_title = $forumrow["forum_lastpost_title"];
$last_thread_time = $forumrow["forum_lastpost_time"];
$lastpost_username = $forumrow["forum_lastpost_username"];
$lastpost_threadid = $forumrow["forum_lastpost_threadid"];
But I need to get rid of that, and get it from board_posts. The way it's set up in board_posts is that if it's a thread, post_parentpost is NULL; if it's a reply, that field has the id of the thread (the first post of the topic). So, I need to grab the latest post_date, see which user posted that, THEN see if post_parentpost is NULL. If it's NULL, the last post is a new thread, so I can get the title and user right there; if it's not, I need to get the info (title, id) of the first post in that thread, which can be found by looking up the ID in post_parentpost and getting the title from it.
Does that make any sense? If so please help me out :(
Any help is greatly appreciated!!!!

Updating board_forums whenever a post or a reply is inserted is, regarding performance, not the worst idea. To display the index page you only have to select data from one table, board_forums. This is definitely much faster than querying a second table to get the last post's information, even when using a clever join.

You are better off just updating the stats on each action: New Post, Delete Post, etc.
The other instances would not likely require any stats update (deletion of a thread would trigger a forum update, to show one less topic in the topic count).
Think about all the actions a user can take: in most cases you don't need to update any stats, so getting the counts on the fly is very inefficient, and you are right to think so.

It looks like you've already done the right thing.
If you were to join, you'd do it like this:
SELECT * FROM board_forums
JOIN board_posts ON board_posts.forum_id = board_forums.id
WHERE forum_parent = '$forum_id'
The problem with that, is that it gets you every post, which is not useful (and very slow). What you would want to do is something like this
SELECT * FROM board_forums
JOIN board_posts ON board_posts.forum_id = board_forums.id ORDER BY board_posts.id desc LIMIT 1
WHERE forum_parent = '$forum_id'
except SQL doesn't work like that. You can't put ORDER BY or LIMIT inside a plain join like this; getting only the latest post per forum needs a subquery, or fetching every row and scanning them in code (which sucks).
In short, don't worry. Use joins for the actual case where you do want to load all forums and all posts in one hit.

The simple solution will result in numerous queries, some optional, as you've already discovered.
The classic approach to this is to cache the results, and only retrieve it once in a while. The cache doesn't have to live long; even two or three seconds on a busy site will make a significant difference.
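A minimal sketch of such a short-lived cache, assuming a file-based cache and PHP's serialize format. The function name cachedFetch and its parameters are hypothetical; $producer stands in for whatever code runs the expensive "last post" queries.

```php
<?php
// Hypothetical helper: return cached data if the cache file is younger than
// $ttl seconds, otherwise call $producer (e.g. the DB queries) and store it.
// Note: a cached value of false is treated as a miss in this sketch.
function cachedFetch(string $file, int $ttl, callable $producer)
{
    if (is_file($file) && (time() - filemtime($file)) < $ttl) {
        $data = unserialize(file_get_contents($file));
        if ($data !== false) {
            return $data; // fresh cache hit
        }
    }

    $data = $producer();
    // LOCK_EX avoids two requests writing the file at the same time.
    file_put_contents($file, serialize($data), LOCK_EX);

    return $data;
}
```

Called with a TTL of two or three seconds, the forum index would run its queries at most once per interval regardless of how many visitors load the page.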
De-normalizing the data into a table you're already reading anyway will help. This approach saves you figuring out optional queries and can be a bit of a cheap win because it's just one more update when an insert is already happening. But it shifts some data integrity to the application.
As an aside, you might be running into the recursive-query problem with your threads. Relational databases do not store hierarchical data all that well if you use a "simple" algorithm. A better way is something sometimes called 'set trees'. It's a bit hard to Google, unfortunately, so here are some links.
