Handling unread posts in PHP / MySQL - php

For a personal project, I need to build a forum using PHP and MySQL. It is not possible for me to use an already-built forum package (such as phpBB).
I'm currently working through the logic needed to build such an application, but it's been a long day and I'm struggling with the concept of handling unread posts for users. One solution I had was to have a separate table which essentially holds all post IDs and user IDs, to determine if they've been read:
tbl_userReadPosts: user_id, post_id, read_timestamp
Obviously, if a user's ID appears in this table against a post, we know they've read that post. This is great, except if we have thousands of posts per day (which is more than possible in the system being proposed) and thousands of users. This table would become huge within a matter of days, if not hours.
Another option would be to track the user's last activity as a timestamp, and then retrieve all posts made after their last activity was updated. This works in theory, but let's say a user is writing an extremely long post, and in the meantime several members also start new threads or reply to posts in other threads. When the user submits his new post, his last activity would be updated, and so the posts made in the meantime would no longer show up as unread.
Does anyone have experience with this, and how did you tackle it?
I've checked in phpBB and it seems that the system assigns a custom session to each user, and works on that basis, but the documentation is pretty sparse as to how this deals with unread posts.
Thoughts and opinions gratefully received, as always.

Sorry for the quick answer, but I only have a second. You definitely do not want to store the read information in the database; as you've already deduced, this table would become gigantic.
Try something in between what you've already suggested: store the user's last activity and, in conjunction with that, store information about what they've seen in a cookie, to determine which threads/posts they've already read.
This offloads the storage to a client-side cookie, which is far more efficient.

A table holding every user_id and post_id pair is a bad idea, as it grows with the product of users and posts. Imagine if your forum solution grew to a million posts and 50,000 users. Now you could have up to 50 billion records. That'll be a problem.
The trick is to use a table as you said, but have it hold only posts which have been read since this login, out of the posts made between the last login and this login.
All posts made prior to the last login are considered read.
I.e., I last logged in on 4/3/2011, and then I log in today. All posts made before 4/3/2011 are considered read (they're not new to me). All posts between 4/3/2011 and now are unread unless they appear in the read table. The read table is flushed each time I log in.
This way your read posts table should never have more than a couple hundred records for each member.
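A minimal sketch of that rule, assuming post times and the previous login are stored as Unix timestamps and that the per-login read table has already been loaded into an array of post IDs. The function name isUnread and its parameters are hypothetical, not part of any forum package.

```php
<?php
/**
 * Decide whether a post counts as unread for the current login.
 *
 * @param int   $postId         ID of the post being checked
 * @param int   $postedAt       Unix timestamp when the post was made
 * @param int   $previousLogin  Unix timestamp of the user's previous login
 * @param int[] $readSinceLogin post IDs held in the (small) read table
 */
function isUnread(int $postId, int $postedAt, int $previousLogin, array $readSinceLogin): bool
{
    // Anything posted before the previous login is treated as read.
    if ($postedAt < $previousLogin) {
        return false;
    }
    // Newer posts are unread unless the read table lists them.
    return !in_array($postId, $readSinceLogin, true);
}
```

The read table only ever needs rows for posts newer than the previous login, which is what keeps it small.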

Instead of having a new row for every post*user, you can have a field in the user-table that holds a comma-separated string with post-IDs that the user has read.
Obviously the user doesn't need to know that there are unread posts from 2 years ago, so you only display "New post" for posts that were made in the last 24 hours and are not in the comma-separated string.
You could also solve this with a session variable or a cookie.
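As a rough sketch of the comma-separated variant, assuming the field is read from and written back to the user row elsewhere: the helper names markRead and showsNewBadge are hypothetical, and the 24-hour window is passed in seconds.

```php
<?php
// Turn the stored string into a list of integer post IDs.
function parseReadIds(string $readCsv): array
{
    return $readCsv === '' ? [] : array_map('intval', explode(',', $readCsv));
}

// Append a post ID to the stored string unless it is already there.
function markRead(string $readCsv, int $postId): string
{
    $ids = parseReadIds($readCsv);
    if (!in_array($postId, $ids, true)) {
        $ids[] = $postId;
    }
    return implode(',', $ids);
}

// Show "New post" only for recent posts that are not in the stored string.
function showsNewBadge(int $postId, int $postedAt, string $readCsv, int $now, int $windowSeconds = 86400): bool
{
    if ($now - $postedAt > $windowSeconds) {
        return false;
    }
    return !in_array($postId, parseReadIds($readCsv), true);
}
```

IDs older than the window could be pruned from the string whenever it is rewritten, so that it does not grow without bound.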

This method stores the most recently-accessed postID separately for each forumID.
It's not as fine-grained as a solution that keeps track of each post individually, but it shrinks the amount of data that you need to store per user and still provides a decent way to keep track of a user's view history.
<?php
session_start();
//error_reporting(E_ALL);
// debug: clear session
if (isset($_GET['reset'])) { unset($_SESSION['activity']); }
// sample data: db table with your forum ids
$forums = array(
// forumID forumTitle
'1' => 'Public Chat',
'2' => 'Member Area',
'3' => 'Moderator Mayhem'
);
// sample data: db table with your forum posts
$posts = array(
// postID forumID postTitle
'12345' => array( 'fID'=>'1', 'title'=>'Hello World'),
'12346' => array( 'fID'=>'3', 'title'=>'I hate you all'),
'12347' => array( 'fID'=>'1', 'title'=>'Greetings!'),
'12348' => array( 'fID'=>'2', 'title'=>'Car thread'),
'12349' => array( 'fID'=>'1', 'title'=>'I like turtles!'),
'12350' => array( 'fID'=>'2', 'title'=>'Food thread'),
'12351' => array( 'fID'=>'3', 'title'=>'FR33 V1AGR4'),
'12352' => array( 'fID'=>'3', 'title'=>'CAPSLOCK IS AWESOME!!!!!!!!'),
'12353' => array( 'fID'=>'2', 'title'=>'Funny pictures thread'),
);
// sample data: db table with the last read post from each forum
$userhist = array(
// forumID postID
'1' => '12344',
'2' => '12350',
'3' => '12346'
);
// reference for shorter code
$s = &$_SESSION['activity'];
// store user's history into session
if (!isset($s)) { $s = $userhist; }
// mark forum as read
if (isset($_GET['mark'])) {
$mid = (int)$_GET['mark'];
if (array_key_exists($mid, $forums)) {
// sets the last read post to the last entry in $posts
$s[$mid] = array_search(end($posts), $posts);
}
// mark all forums as read
elseif ($mid == 0) {
foreach ($forums as $fid=>$finfo) {
// sets the last read post to the last entry in $posts
$s[$fid] = array_search(end($posts), $posts);
}
}
}
// mark post as read
if (isset($_GET['post'])) {
$pid = (int)$_GET['post'];
if (array_key_exists($pid, $posts)) {
// update activity if $pid is newer
$hist = &$s[$posts[$pid]['fID']];
if ($pid > $hist) {
$hist = $pid;
}
}
}
// link to mark all as read
echo '<p><a href="?mark=0">[Read All]</a></p>' . PHP_EOL;
// display forum/post info
foreach ($forums as $fid=>$finfo) {
echo '<p>Forum: ' . $finfo;
echo ' <a href="?mark=' . $fid . '">[Mark as Read]</a><br>' . PHP_EOL;
foreach ($posts as $pid=>$pinfo) {
if ($pinfo['fID'] == $fid) {
echo '- Post: <a href="?post=' . $pid . '">' . $pid . '</a>';
echo ' - ' . ($s[$fid] < $pid ? 'NEW' : 'old');
echo ' - "' . $pinfo['title'] . '"<br>' . PHP_EOL;
}
}
echo '</p>' . PHP_EOL;
}
// debug: display session value and reset link
echo '<hr><pre>$_SESSION = '; print_r($_SESSION); echo '</pre>' . PHP_EOL;
echo '<hr><a href="?reset">[Reset Session]</a>' . PHP_EOL;
?>
Note: Obviously this example is for demonstration purposes only. Some of the structure and logic may need to be changed when dealing with an actual database.

phpBB2 implements this fairly simply: it just shows you all posts since your last login. This way you don't need to store any information about what the user has actually seen or read.

Related

How can we make multiple php http request at the same time asynchronously?

I'm currently working on a project with my friends, so let me explain.
We have a MySQL database filled with English postcodes from London: one table with universities and one with hosts. We want to calculate the public transport travel time between every host and every university and save it into another table, where each row holds the host postcode, the university postcode and the travel time between the two.
For that we are making HTTP requests to the TfL API, which returns a JSON document with all the travel details (including the travel time); we then decode it and keep only what we want (the travel time).
The problem is that we have quite a big database, with almost 250 hosts and 800 universities. That gives us around 200,000 requests and a process time too long to be usable (around 19 hours, counting the API response time and the PHP processing).
We tried to see if we could use cURL to split the process across multiple loops, so that we could divide the process time by the number of cURL handles, but we can't figure out how to do that.
The final goal is to make a small local app that, when we select a university, gives us the 10 nearest hosts by public transport.
Does anyone have any experience with this kind of thing and can help us?
Here is what we have right now:
//table that collects the shortest travel time of every journey
$tableAllRequest = [];
//postCodeUni list contains all the university objects
foreach ($postCodeUni as $uniPostCode) {
//here we take the postcode from the university object
$uni = $uniPostCode['Postcode'];
//postCodeHost list contains all the host objects
foreach ($postCodeHost as $hostPostCode) {
//here we take the postcode from the host object
$host = $hostPostCode['Postcode'];
//here we make an http request to the tfl api that return us a journey between the two post codes (a json with all the journey details)
$data = json_decode(file_get_contents('https://api.tfl.gov.uk/journey/journeyresults/' . $uni . '/to/' . $host . '?app_key=a59c7dbb0d51419d8d3f9dfbf09bd5cc'), true);
//here we save the multiple duration times (because there is different ways to travel between two point with public transport)
$duration = $data['journeys'];
$tableTemp = [];
foreach ($duration as $durations) {
$durationns = $durations['duration'];
array_push($tableTemp, $durationns);
}
//We then take the shortest one
$min = min($tableTemp);
echo "Shortest travel time: " . $min . " of travel between " . $uni . " and ". $host . " . <br>";
echo "<br>";
//We then save this time in a table that will contain the travel time of all the journeys, for comparison
array_push($tableAllRequest, array($uni . " and ". $host => $min));
}
}
There are many ways to achieve this; however, the easiest, IMO, would be to use Guzzle's async requests (cURL multi interface under the hood). Take a look at this answer: "Guzzle async requests not really async?". Example below:
<?php
use GuzzleHttp\Promise;
use GuzzleHttp\Client;
$client = new Client(['base_uri' => 'http://httpbin.org/']);
// Initiate each request but do not block
$promises = [
'image' => $client->getAsync('/image'),
'png' => $client->getAsync('/image/png'),
'jpeg' => $client->getAsync('/image/jpeg'),
'webp' => $client->getAsync('/image/webp')
];
// Wait on all of the requests to complete. Throws a ConnectException
// if any of the requests fail
// (newer guzzlehttp/promises releases expose these helpers as
// Promise\Utils::unwrap() and Promise\Utils::settle())
$results = Promise\unwrap($promises);
// Alternatively, wait for the requests to complete, even if some of them fail
$results = Promise\settle($promises)->wait();
// Loop through each settled result and fetch data etc
foreach($results as $promiseKey => $result) {
// Rejected requests have no 'value' entry, so skip them
if ($result['state'] !== 'fulfilled') { continue; }
// Data response
$dataOfResponse = ($result['value']->getBody()->getContents());
// Status
echo $promiseKey . ':' . $result['value']->getStatusCode() . "\r\n";
}
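Whichever HTTP client ends up issuing the requests, the URL building and the "shortest duration" step from the question can be pulled out into small functions. The sketch below assumes the response shape used in the question (a journeys list whose items each carry a duration). The function names, the sample postcodes and the batch size of 25 are hypothetical, and the app_key parameter is left out.

```php
<?php
// Build one request URL per university/host pair, keyed by "uni|host".
function buildJourneyUrls(array $uniPostcodes, array $hostPostcodes): array
{
    $urls = [];
    foreach ($uniPostcodes as $uni) {
        foreach ($hostPostcodes as $host) {
            $urls[$uni . '|' . $host] = 'https://api.tfl.gov.uk/journey/journeyresults/'
                . rawurlencode($uni) . '/to/' . rawurlencode($host);
        }
    }
    return $urls;
}

// Return the shortest duration in a decoded response, or null if there is none.
function shortestDuration(?array $data): ?int
{
    if (empty($data['journeys'])) {
        return null;
    }
    $durations = array_column($data['journeys'], 'duration');
    return $durations ? (int) min($durations) : null;
}

// Sample postcodes only; split the URLs into batches of at most 25,
// keeping the "uni|host" keys so results can be matched back to pairs.
$urls = buildJourneyUrls(['WC1E 6BT'], ['E1 6AN', 'SW7 2AZ']);
$batches = array_chunk($urls, 25, true);
```

Each batch could then be turned into a set of getAsync() promises and settled before the next batch starts, which keeps the number of simultaneous connections bounded.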

How do I validate a PHP integer within a variable?

I have integrated Yelp reviews into my directory site with each venue that has a Yelp ID returning the number of reviews and overall score.
Following a successful MySQL query for all venue details, I output the results of the database formatted for the user. The Yelp element is:
while ($searchresults = mysql_fetch_array($sql_result)) {
if ($yelpID = $searchresults['yelpID']) {
require('yelp.php');
if ( $numreviews > 0 ) {
$yelp = '<img src="'.$ratingimg.'" border="0" /> Read '.$numreviews.' reviews on <img src="graphics/yelp_logo_50x25.png" border="0" /><br />';
} else {
$yelp = '';
}
} //END if ($yelpID = $searchresults['yelpID']) {
} //END while ($searchresults = mysql_fetch_array($sql_result)) {
The yelp.php file returns:
$yrating = $result->rating;
$numreviews = $result->review_count;
$ratingimg = $result->rating_img_url;
$url = $result->url;
If a venue has a Yelp ID and one or more reviews then the output displays correctly, but if the venue has no Yelp ID or zero reviews then it displays the Yelp review number of the previous venue.
I've checked the $numreviews variable type and it's an integer.
So far I've tried multiple variations of the "if ( $numreviews > 0 )" statement such as testing it against >=1, !$numreviews etc., also converting the integer to a string and comparing it against other strings.
There are no errors and printing all of the variables returned gives the correct number of reviews for each property with venues having no ID or no reviews returning nothing (as opposed to zero). I've also compared it directly against $result->review_count with the same problem.
Is there a better way to make the comparison or better format of variable to work with to get the correct result?
EDIT:
The statement if ($yelpID = $searchresults['yelpID']) { is not operating as it should. It is identical to other statements in the file, validating row contents which work correctly for their given variable, e.g. $fbID = $searchresults['fbID'] etc.
When I changed require('yelp.php'); to require_once('yelp.php'); all of the venue outputs changed to showing only the first iterated result. Looking through the venues outputted, the error occurs on the first venue after a successful result which makes me think there is a pervasive piece of code in the yelp.php file, causing if ($yelpID = $searchresults['yelpID']) { to be ignored until a positive result is found (a yelpID in the db), i.e. each venue is correctly displayed with a yelp number of reviews until a blank venue is encountered. The preceding venues' number of reviews is then displayed and this continues for each blank venue until a venue is found with a yelpID when it shows the correct number again. The error reoccurs on the next venue output with no yelpID and so on.
Sample erroneous output: (line 1 is var_dump)
string(23) "bayview-hotel-bushmills"
Bayview Hotel
Read 3 reviews on yelp
Benedicts
Read 3 reviews on yelp (note no var_dump output, this link contains the url for the Bayview Hotel entry above)
string(31) "bushmills-inn-hotel-bushmills-2"
Bushmills Inn Hotel
Read 7 reviews on yelp
I suspect this would be a new question rather than clutter/confuse this one further?
END OF EDIT
Note: I'm aware of the need to upgrade to mysqli but I have thousands of lines of legacy code to update. For now I'm working on functionality before reviewing the code for best practice.
Since yelp.php is something of a black box, the best explanation for this behavior would be that it only sets those variables if it finds a match. Updating your code to this should fix that:
unset($yrating, $numreviews, $ratingimg, $url);
require('yelp.php');
I also noticed this peculiar if statement. Do you realize that it's an assignment, or is this a copy/paste error? If you want to test (that's what if is for), use a comparison:
if ($yelpID == $searchresults['yelpID']) {
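A further possibility, offered as an assumption since the surrounding output code is not shown: $yelp is only assigned inside the if ($yelpID = ...) block, so a venue without a Yelp ID would keep whatever value the previous iteration left behind. The sketch below reproduces the loop with hypothetical sample rows and a stand-in lookupReviewCount function in place of yelp.php, and resets the variable at the top of each iteration.

```php
<?php
// Stand-in for yelp.php: returns a review count for a known ID, else 0.
function lookupReviewCount(string $yelpID): int
{
    $counts = [
        'bayview-hotel-bushmills'         => 3,
        'bushmills-inn-hotel-bushmills-2' => 7,
    ];
    return $counts[$yelpID] ?? 0;
}

// Hypothetical rows standing in for the MySQL result set.
$rows = [
    ['name' => 'Bayview Hotel',       'yelpID' => 'bayview-hotel-bushmills'],
    ['name' => 'Benedicts',           'yelpID' => ''],
    ['name' => 'Bushmills Inn Hotel', 'yelpID' => 'bushmills-inn-hotel-bushmills-2'],
];

$output = [];
foreach ($rows as $row) {
    // Reset per-venue state so nothing carries over from the previous row.
    $yelp = '';
    // Assignment inside the condition: true only for a non-empty ID.
    if ($yelpID = $row['yelpID']) {
        $numreviews = lookupReviewCount($yelpID);
        if ($numreviews > 0) {
            $yelp = 'Read ' . $numreviews . ' reviews on yelp';
        }
    }
    $output[$row['name']] = $yelp;
}
```

With the reset in place, the venue without an ID ends up with an empty string instead of the previous venue's review text.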

Random Content Array seems stuck

I have a random content script that has worked perfectly but now seems to have a glitch.
It's the "Spotlight On:" story on the upper lefthand corner at http://fiction.deslea.com/index2.php and the code is as follows:
$storyspotlights = array("bluevial", "biophilia", "real", "edgeofreality",
"limitsofperception", "markofcain", "spokenfor", "closer",
"feildelm", "purgatory", "elemental");
$randomstoryID = array_rand($storyspotlights);
$randomstory = $storyspotlights[$randomstoryID];
switch ($randomstory) {
case ($randomstory == 'closer'):
$storyspotlightheader = "<div class='storyspotlightheader'>Closer</div>";
$storyspotlighttext = "snip";
//some stories snipped
case ($randomstory == 'bluevial'):
$storyspotlightheader = "<div class='storyspotlightheader'>The Blue
Vial</div>";
$storyspotlighttext = "snip";
break;
//more stories snipped
}
print($storyspotlightheader);
print($storyspotlighttext);
My problem is - all the stories from Blue Vial to Spoken For appear when you refresh the page, in random order (although Blue Vial seems to stick a fair bit). These were the stories in the script originally.
Since then I have added the last four to the array and the content generation switch case fragment, but these last four stories never, ever appear in the randomiser. I've literally sat and refreshed for hours. I've confirmed over and over that the updated script is on the server, and even deleted and re-uploaded it.
I did try unset and also $storyspotlights = array() at the beginning of the script at various stages of troubleshooting, but to no avail. I also tried moving the new stories to the start of the array - no change there either.
What am I missing?
It's surprising this works at all. That's not how you use switch..case.
switch (<value to compare>) {
case <value to compare against>:
...
}
That means you write this:
switch ($randomstory) {
case 'closer':
...
}
With what you've written it's actually executing like:
if ($randomstory == ($randomstory == 'closer')) ...
Also make sure you have not actually forgotten some break statements, which would make the code fall through to the next case and indeed make certain cases "more sticky" than others.
Also, I'd simplify the whole thing to this:
$stories = array(
array('header' => '...', 'text' => '...'),
array('header' => '...', 'text' => '...'),
...
);
$story = $stories[array_rand($stories)];
echo $story['header'];
echo $story['text'];

How to make big Facebook API requests faster?

I am working on a Facebook application in PHP that fetches a large amount of location information about the user's friends. The application gets increasingly slow as the user's number of friends increases. But the more friends' information I retrieve, the more accurate the result is.
I have tried to use the following ways to speed up the query:
$facebook->api('/locations?ids=uid1,uid2,uid3,...')
And I used this together with batched requests:
$batched_request = array(
array('method' => 'GET', 'relative_url' => '/locations?ids=uid1,uid2,uid3,...'),
array('method' => 'GET', 'relative_url' => '/locations?ids=uid11,uid12,uid13,...'),
array('method' => 'GET', 'relative_url' => '/locations?ids=uid21,uid22,uid23,...'),
...
);
$batch = $facebook->api('/?batch='.json_encode($batched_request), 'POST');
But still it takes at least 20 seconds to get the location information from a random set of 100 friends of the user.
Actual Code Used
This part is fine. It gets done in just a few seconds.
$number_of_friends = "100"; // Set the maximum number of friends from which their location information is retrieved
$number_of_friends_per_request = 10; // Set the number of friends per request in the batch
$access_token = $facebook->getAccessToken();
// This is the excerpt of another batched request to get the friend ids
$request = '[{"method":"POST","relative_url":"method/fql.query?query=SELECT+uid,+name+FROM+user+WHERE+uid+IN(SELECT+uid2+FROM+friend+WHERE+uid1+=+me()+order+by+rand()+limit+'.$number_of_friends.')"}]';
$post_url = "https://graph.facebook.com/" . "?batch=" . urlencode($request) . "&access_token=" . $access_token . "&method=post";
$post = file_get_contents($post_url);
$decoded_response = json_decode($post, true);
$friends_json = $decoded_response[0]['body'];
$friends_data = json_decode($friends_json, true);
if (is_array($friends_data)) {
foreach ($friends_data as $friend) {
$selected_friend_ids[] = number_format($friend["uid"], 0, '.', ''); // Since there are exceptionally large id numbers
}
}
But this is problematic. It takes too long to receive a response from Facebook.
// Retrieve the locations of the user's friends using batched request
$i = 0;
$batched_request = array();
while ($i < ($number_of_friends/$number_of_friends_per_request)) {
$i++;
$friend_ids_variable_name = 'friend_ids_part_'.$i;
$$friend_ids_variable_name = array_slice($selected_friend_ids, ($i-1)*$number_of_friends_per_request, $number_of_friends_per_request);
if (!empty($$friend_ids_variable_name)) {
$api_string_ids_variable_name = 'api_string_ids_'.$i;
$$api_string_ids_variable_name = implode(',', $$friend_ids_variable_name);
$batched_request[] = array('method' => 'GET', 'relative_url' => '/locations?ids='.$$api_string_ids_variable_name);
}
}
$batch = $facebook->api('/?batch='.json_encode($batched_request), 'POST');
foreach ($batch as $batch_item) {
$body = $batch_item["body"];
$partial_friends_locations = json_decode($body, true);
foreach ($partial_friends_locations as $friend_id => $friend_locations_data) {
$friend_locations = $friend_locations_data["data"];
foreach ($friend_locations as $friend_location) {
// Process location information...
}
}
}
Is there a way to make the above request faster? I added some code to time the request, and the response is pretty slow.
For 100 friends, it takes > 20 seconds on average.
For 200 friends, it takes > 40 seconds on average.
For 400 friends, it takes > 80 seconds on average, and I sometimes receive an error message: "Error Code: 1 Message: An unknown error occurred"
By "faster" I mean either:
Getting the same amount of information in less time, or
Getting more information for the same amount of time.
Why bother with batched requests? You can achieve everything with a single FQL multiquery:
{
"my_friends":
"SELECT uid, name FROM user WHERE uid IN
(SELECT uid2 FROM friend WHERE uid1 = me() ORDER BY rand() LIMIT 100)",
"their_locations":
"SELECT page_id, tagged_uids FROM location_post WHERE tagged_uids IN
(SELECT uid FROM #my_friends)",
"those_places":
"SELECT page_id, name, location FROM page WHERE page_id IN
(SELECT page_id FROM #their_locations)"
}
In the API explorer, this runs in the 800-1200 ms range for me.
Another question: Why do you have the PHP SDK installed, but aren't using it to make these queries?
I too have noticed that Facebook has a slow API response time. I have not actually developed a Facebook application, but perhaps some of what I learned dealing with the Facebook like buttons and comment/share buttons will help you:
What I would suggest is to lace your page with asynchronous API calls via JavaScript to get the data. This way the page loads fast for your user, and the Facebook data is then loaded in the background. You can accomplish this relatively easily with a library like jQuery. Essentially you break off the chunk of code that processes the data into another file and then run that with the jQuery call.
That is the first part. The second part is a little trick to make the user perceive a faster application load time. Always load the first 100 friends first and display the resulting data, and then query a second time (again using asynchronous calls) to fetch the rest of the user's friends.
Again, this is speaking from more of a general perspective, but hopefully it will help!
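Separately from response time, the batching loop in the question can be written without variable variables. A minimal sketch using array_chunk, assuming the same /locations?ids=... request format as the question; the function name buildLocationBatch and the sample IDs are hypothetical. This only tidies how the batch is built and is not expected to change how quickly Facebook responds.

```php
<?php
// Build the batch payload: one /locations request per group of friend IDs.
function buildLocationBatch(array $friendIds, int $perRequest): array
{
    $batch = [];
    foreach (array_chunk($friendIds, $perRequest) as $chunk) {
        $batch[] = [
            'method'       => 'GET',
            'relative_url' => '/locations?ids=' . implode(',', $chunk),
        ];
    }
    return $batch;
}

// Five sample IDs in groups of two give three requests.
$batch = buildLocationBatch(['11', '12', '13', '14', '15'], 2);
```

The resulting array can be passed to json_encode() in the same way as $batched_request in the question.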

Retrieve top 10 friends of user on Facebook (get their UIDs) PHP, Facebook API

How do I get a user's top 10 friends' uids, and wrap each uid with specified string?
From: Get list of top friends for facebook app, I see this:
$statuses = $facebook->api('/me/statuses');
foreach($statuses['data'] as $status){
// processing likes array for calculating fanbase.
foreach($status['likes']['data'] as $likesData){
$frid = $likesData['id'];
$frname = $likesData['name'];
$friendArray[$frid] = $frname;
}
foreach($status['comments']['data'] as $comArray){
// processing comments array for calculating fanbase
$frid = $comArray['from']['id'];
$frname = $comArray['from']['name'];
}
But what does that return? Does it return the user IDs of friends in an array? I would like to get it in an array, the result of the search, so I can wrap each ID using foreach and do what I please with it.
If the above code is enough, should I be calling $frid for the array of top friends? I just need comprehension. :o)
Thank you for your time.
Assume that permissions are granted.
(This happens only after the user allows permission, so assume we already have that.)
Ensure you have the latest SDK by going to https://github.com/facebook/facebook-php-sdk/zipball/master
Unzip and you have a layout as shown below
/facebook-php-sdk
index.php
where index.php is the file used to display the results in the browser.
Include the SDK at the start of the PHP file
require('sdk/src/facebook.php');
Go to https://developers.facebook.com/apps, select your app and get your App ID and App Secret, create an instance within the PHP file
$facebook = new Facebook(array(
'appId' => 'YOUR_APP_ID_HERE',
'secret' => 'YOUR_SECRET_HERE',
));
Then retrieve the $user data so we know that the current user is authenticated
$user = $facebook->getUser();
Before sending any calls check whether the authentication was right
if ($user) {
try {
$user_profile = $facebook->api('/me');
} catch (FacebookApiException $e) {
error_log($e);
$user = null;
}
}
Now make a call to /me/statuses; the documentation is available at https://developers.facebook.com/docs/reference/api/user/#statuses
$statuses = $facebook->api('/me/statuses');
This should return an array of the structure Status message defined at http://developers.facebook.com/docs/reference/api/status/ of all the current user status messages.
Now you need to decide what determines top friends
number of likes + comments
number of comments
number of likes
Let's choose option 1 and give each a weight of 1. That is, a like and a comment are equivalent in value when ranking friends.
So create a variable to hold this, for example $friendArray
Then iterate over all the status messages. Note that the entire JSON response is wrapped in a data element:
{
"data": [
So access $statuses['data'], the foreach will give all status messages as a status item
foreach($statuses['data'] as $status){
Within this loop iterate all the likes and increment the value of each id that appears
foreach($status['likes']['data'] as $likesData){
$frid = $likesData['id'];
// start from 0 the first time this id is seen
$friendArray[$frid] = (isset($friendArray[$frid]) ? $friendArray[$frid] : 0) + 1;
}
Within this loop iterate all the comments and increment the value of each id that appears
foreach($status['comments']['data'] as $comArray){
$frid = $comArray['from']['id'];
// start from 0 the first time this id is seen
$friendArray[$frid] = (isset($friendArray[$frid]) ? $friendArray[$frid] : 0) + 1;
}
At the end of the outer loop foreach($statuses['data'] as $status){ you should have an array $friendArray which holds the scores, keyed by friend ID.
Call arsort http://www.php.net/manual/en/function.arsort.php to sort the array by score in descending order, and you can then loop over the top x scores.
The code you showed isn't a function and is actually missing a closing brace. It does not return anything, as it is not a function.
Caveats: /me/statuses only returns a limited set of status messages per call, so you need to follow the paging links to previous pages to iterate over all your messages. The top friends returned are based only on the weighting I made up above.
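Putting the steps together, here is a minimal sketch that works on an already-fetched $statuses array, so no API call is made. The function names scoreFriends and topFriendIds, the sample data and the <li> wrapper are hypothetical, and the ?? operator assumes PHP 7 or later.

```php
<?php
// Score friends: one point per like and one per comment (option 1).
function scoreFriends(array $statuses): array
{
    $scores = [];
    foreach ($statuses['data'] ?? [] as $status) {
        foreach ($status['likes']['data'] ?? [] as $like) {
            $id = $like['id'];
            $scores[$id] = ($scores[$id] ?? 0) + 1;
        }
        foreach ($status['comments']['data'] ?? [] as $comment) {
            $id = $comment['from']['id'];
            $scores[$id] = ($scores[$id] ?? 0) + 1;
        }
    }
    // Highest score first, keeping the friend IDs as keys.
    arsort($scores);
    return $scores;
}

// Return the top $limit IDs, each wrapped in the given prefix and suffix.
function topFriendIds(array $scores, int $limit, string $prefix, string $suffix): array
{
    // preserve_keys = true, otherwise numeric IDs would be renumbered.
    $ids = array_keys(array_slice($scores, 0, $limit, true));
    return array_map(function ($id) use ($prefix, $suffix) {
        return $prefix . $id . $suffix;
    }, $ids);
}

// Made-up sample in the shape described above.
$sample = ['data' => [
    [
        'likes'    => ['data' => [['id' => '101', 'name' => 'A'], ['id' => '102', 'name' => 'B']]],
        'comments' => ['data' => [['from' => ['id' => '102', 'name' => 'B']]]],
    ],
    [
        'likes' => ['data' => [['id' => '102', 'name' => 'B']]],
    ],
]];
$top = topFriendIds(scoreFriends($sample), 10, '<li>', '</li>');
```

The result is a plain array of wrapped IDs, so it can be iterated with foreach as the question asks.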
