I am working on a social-network-type site in PHP. I have done this once before, and the site outgrew my coding ability to keep up; that was a couple of years back, and now I want to tackle this project again.
Basically, my network has a friend_friend MySQL table that keeps track of who is whose friend; for every confirmed friendship there are two entries in the DB.
Here is that table:
CREATE TABLE IF NOT EXISTS `friend_friend` (
`autoid` int(11) NOT NULL AUTO_INCREMENT,
`userid` int(10) DEFAULT NULL,
`friendid` int(10) DEFAULT NULL,
`status` enum('1','0','3') NOT NULL DEFAULT '0',
`submit_date` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`alert_message` enum('yes','no') NOT NULL DEFAULT 'yes',
PRIMARY KEY (`autoid`),
KEY `userid` (`userid`),
KEY `friendid` (`friendid`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=1657259 ;
I then have a user table, called friend_reg_user, with all the users' info.
Then there is a table for the bulletins that users post; the goal is to show only bulletins from users you are friends with.
Here is the bulletins table:
CREATE TABLE IF NOT EXISTS `friend_bulletin` (
`auto_id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(10) NOT NULL DEFAULT '0',
`bulletin` text NOT NULL,
`subject` varchar(255) NOT NULL DEFAULT '',
`color` varchar(6) NOT NULL DEFAULT '000000',
`submit_date` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`status` enum('Active','In Active') NOT NULL DEFAULT 'Active',
`spam` enum('0','1') NOT NULL DEFAULT '1',
PRIMARY KEY (`auto_id`),
KEY `user_id` (`user_id`),
KEY `submit_date` (`submit_date`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=455144 ;
OK, so to do this I would either run a query on the friend_friend table to get all friends of a user and build them into a string like 1,2,3,4,5,6 (those would be friend ID numbers), and then select from the bulletin table where the bulletin author's ID is in my friend ID list.
The second method is to use JOINs to get all this data at once.
Now, finally, my question: once the site gets very large, when there are millions of friend records and bulletins in the DB, this all slows down. What are my options to speed things up? Is there a better way to do this? I am also planning to change bulletins to cover more than just bulletins and to act more like the user-action feeds the big sites have now, showing status updates, blogs, bulletins, and so on.
What you are looking to do can likely be done in a number of ways. You can have a summary rollup table that combines all of the associated data (friends in this instance) for a given member.
That is a pretty basic approach but it can become much more sophisticated.
Summary rollups act as a persistent caching mechanism. You'll have to keep them up to date by some method: a cron job, MapReduce, etc. You don't want to compute all that data every time you need it; instead, compute it at regular intervals so that it is ready quickly.
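For illustration, a rollup for this case might look something like the sketch below, built from the friend_friend and friend_bulletin tables in the question. The table name, index names, and refresh query are my own invention, a sketch rather than a drop-in implementation:

CREATE TABLE `friend_feed_rollup` (
`userid` int(10) NOT NULL,
`bulletin_id` int(11) NOT NULL,
`submit_date` datetime NOT NULL,
PRIMARY KEY (`userid`,`bulletin_id`),
KEY `userid_date` (`userid`,`submit_date`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

-- Refreshed by cron at whatever interval you can tolerate, not on every page view.
-- In practice you would restrict this to recent bulletins or refresh incrementally.
REPLACE INTO friend_feed_rollup (userid, bulletin_id, submit_date)
SELECT ff.userid, fb.auto_id, fb.submit_date
FROM friend_friend AS ff
JOIN friend_bulletin AS fb ON fb.user_id = ff.friendid
WHERE ff.status = '1';

Reads then become a single cheap indexed lookup: SELECT bulletin_id FROM friend_feed_rollup WHERE userid = ? ORDER BY submit_date DESC LIMIT 30.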
Memcache is a great tool for caching, but it caches data that has to be computed at some point anyway. Unfortunately, Memcache is not persistent: if the memcached server or service dies, so does your data.
You can also explore some cutting-edge technologies such as MongoDB, CouchDB, Project Voldemort, and Neo4j for even more efficient tools.
I'd also recommend looking at the source code of Elgg, the open-source PHP-based social network, at http://www.elgg.org/
Facebook uses memcached to store SQL databases as distributed hash tables. That's probably your best bet.
I cleaned the question a little bit because it was getting very big and unreadable.
Running on my localhost.
As the attached screenshot showed, the query takes 755.15 ms when selecting from the table Job, which contains 15,000 rows (the WHERE conditions match 6,650 of them).
The table Company contains 1000 rows.
The table geo__name contains approximately 84,300 rows and is not giving me any problems, so I believe the issue lies in the database structure or something similar.
The structure of these 2 tables is the following:
Table Job is:
CREATE TABLE `job` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`created_at` datetime NOT NULL,
`updated_at` datetime NOT NULL,
`company_id` int(11) NOT NULL,
`activity_sector_id` int(11) DEFAULT NULL,
`status` int(11) NOT NULL,
`active` datetime NOT NULL,
`contract_type_id` int(11) NOT NULL,
`salary_type_id` int(11) NOT NULL,
`workday_id` int(11) NOT NULL,
`geoname_id` int(11) NOT NULL,
`title` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
`minimum_experience` int(11) DEFAULT NULL,
`min_salary` decimal(7,2) DEFAULT NULL,
`max_salary` decimal(7,2) DEFAULT NULL,
`zip_code` int(11) DEFAULT NULL,
`vacancies` int(11) DEFAULT NULL,
`show_salary` tinyint(1) NOT NULL,
PRIMARY KEY (`id`),
KEY `created_at` (`created_at`,`active`,`status`) USING BTREE,
CONSTRAINT `FK_FBD8E0F823F5422B` FOREIGN KEY (`geoname_id`) REFERENCES `geo__name` (`id`),
CONSTRAINT `FK_FBD8E0F8398DEFD0` FOREIGN KEY (`activity_sector_id`) REFERENCES `activity_sector` (`id`),
CONSTRAINT `FK_FBD8E0F85248165F` FOREIGN KEY (`salary_type_id`) REFERENCES `job_salary_type` (`id`),
CONSTRAINT `FK_FBD8E0F8979B1AD6` FOREIGN KEY (`company_id`) REFERENCES `company` (`id`),
CONSTRAINT `FK_FBD8E0F8AB01D695` FOREIGN KEY (`workday_id`) REFERENCES `workday` (`id`),
CONSTRAINT `FK_FBD8E0F8CD1DF15B` FOREIGN KEY (`contract_type_id`) REFERENCES `job_contract_type` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=15001 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
The table company is:
CREATE TABLE `company` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
`logo` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`created_at` datetime NOT NULL,
`updated_at` datetime NOT NULL,
`website` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`user_id` int(11) NOT NULL,
`phone` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
`cifnif` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
`type` int(11) NOT NULL,
`subscription_id` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `UNIQ_4FBF094FA76ED395` (`user_id`),
KEY `IDX_4FBF094F9A1887DC` (`subscription_id`),
KEY `name` (`name`(191)),
CONSTRAINT `FK_4FBF094F9A1887DC` FOREIGN KEY (`subscription_id`) REFERENCES `subscription` (`id`),
CONSTRAINT `FK_4FBF094FA76ED395` FOREIGN KEY (`user_id`) REFERENCES `user` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1001 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
The query is the following:
SELECT
j0_.id AS id_0,
j0_.status AS status_1,
j0_.title AS title_2,
j0_.min_salary AS min_salary_3,
j0_.max_salary AS max_salary_4,
c1_.id AS id_5,
c1_.name AS name_6,
c1_.logo AS logo_7,
a2_.id AS id_8,
a2_.name AS name_9,
g3_.id AS id_10,
g3_.name AS name_11,
j4_.id AS id_12,
j4_.name AS name_13,
j5_.id AS id_14,
j5_.name AS name_15,
w6_.id AS id_16,
w6_.name AS name_17
FROM
job j0_
INNER JOIN company c1_ ON j0_.company_id = c1_.id
INNER JOIN activity_sector a2_ ON j0_.activity_sector_id = a2_.id
INNER JOIN geo__name g3_ ON j0_.geoname_id = g3_.id
INNER JOIN job_salary_type j4_ ON j0_.salary_type_id = j4_.id
INNER JOIN job_contract_type j5_ ON j0_.contract_type_id = j5_.id
INNER JOIN workday w6_ ON j0_.workday_id = w6_.id
WHERE
j0_.active >= CURRENT_TIMESTAMP
AND j0_.status = 1
ORDER BY
j0_.created_at DESC
When executing the above query I have these results:
In MySQL Workbench: 0.578 sec / 0.016 sec
In Symfony profiler: 755.15 ms
The question is: is this query duration normal? If not, how can I improve the query's speed? It seems like too much.
I also attached screenshots of the Symfony debug toolbar, the fetched data (showing that I'm only getting the data I really need), the EXPLAIN output, and the timeline.
The MySQL server can't handle the load being placed on it. This could be due to resource contention, because it has not been appropriately tuned, or even a problem with your hard drive.
First, I would start by adding the MySQL keyword STRAIGHT_JOIN, which tells MySQL to join the tables in the order I have written them rather than trying to work out the relationships itself. With your dataset being so small, and the query already at about half a second, I don't know if it will help as much here, but on larger datasets I have known it to SIGNIFICANTLY improve performance.
Next, you appear to be fetching lookup descriptions via the PK/FK relationships. Not seeing the indexes on those tables, I would suggest covering indexes that contain both the key and the description, so the join can be satisfied entirely from the index pages, instead of using the index pages to find the actual data pages and reading the description from there.
Last, your job table's index on (created_at, active, status) might perform better rearranged as (status, active, created_at).
With your existing index, think of it this way: each day of data (created_at) goes into its own box. Within each day's box, records are sorted by the active timestamp, and only then by status.
So for each created day, you open a box, then open a secondary box for each active timestamp within it, and only inside those can you finally see which records have status = 1. You have to open every created-day box and every active box inside it: very labor-intensive.
Under the suggested index starting with status, you instead have a very finite number of boxes, one per status. You open only the one box for status = 1; you don't care about the others. Inside it, the records are sub-sorted by the active timestamp, so you can jump directly to those at or after the current timestamp, and from the first matching record onward everything in the box qualifies. Done. And since the index also includes created_at, MySQL can use it to optimize the descending sort.
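In DDL terms, the suggested reordering might look like this (the index name here is my own choice):

ALTER TABLE `job`
DROP INDEX `created_at`,
ADD INDEX `status_active_created` (`status`,`active`,`created_at`);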
To ensure covering indexes exist on the other lookup tables, I suggest the following (a DDL sketch follows the list):
company: (id, name, logo)
activity_sector: (id, name)
geo__name: (id, name)
job_salary_type: (id, name)
job_contract_type: (id, name)
workday: (id, name)
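As a sketch, assuming these indexes do not exist yet (the index names are illustrative; note that InnoDB secondary indexes implicitly include the primary key anyway):

ALTER TABLE `company` ADD INDEX `cover_lookup` (`id`,`name`,`logo`);
ALTER TABLE `activity_sector` ADD INDEX `cover_lookup` (`id`,`name`);
ALTER TABLE `geo__name` ADD INDEX `cover_lookup` (`id`,`name`);
ALTER TABLE `job_salary_type` ADD INDEX `cover_lookup` (`id`,`name`);
ALTER TABLE `job_contract_type` ADD INDEX `cover_lookup` (`id`,`name`);
ALTER TABLE `workday` ADD INDEX `cover_lookup` (`id`,`name`);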
And the MySQL Keyword...
SELECT STRAIGHT_JOIN (rest of query...)
There are several possible reasons why your Symfony page is slow.
1. Server fault
First, it could be the server's fault: server performance may hinder your query time.
2. Data size and deferred rendering
Then comes the data size. As an example, a query on one of my projects has a data size of about 50 MB (currently about 20k rows).
Parsing 50 MB into HTML can take some time, mostly because of loops.
Still, there are solutions for this, like deferred rendering.
Deferred rendering is quite simple: instead of parsing the data in your Twig template, you send all the data to a JavaScript variable and use JavaScript to parse/render it once the DOM is loaded.
3. Query optimisation
As I wrote in a comment, you can check the following question, in which I explained why custom queries are important:
Are Doctrine relations affecting application performance?
In that question, you will read that order matters... It's in fact the most important thing.
While static data in your database is often inserted in the right order, that's rarely the case for dynamic data (data provided by users during the website's life).
That is why using ORDER BY in your query will often speed up the page rendering, as Doctrine won't be doing extra queries on its own.
As an example, one of my sites has about 700 entries displayed on the index.
First, here is the query count while using findAll(): it showed 254 queries (253 duplicates) in 144 ms, plus 39 ms of render time.
Next, using the second parameter of findBy() to add an ORDER BY, I get a much better result: only 1 query, in 8 ms, and about the same render time.
But here I don't use any fields from associations.
The moment I do, Doctrine will run some extra queries, and the query count and time will skyrocket. In the end, it turns back into something like findAll().
And last, there is the custom query: there, the query time went from 8 ms to 38 ms. But unlike the previous query, I get far more data in the result, which prevents Doctrine from doing extra queries.
Again, ORDER BY matters in this query. Without it, I skyrocket back to 84 queries.
4. Partials
When you write a custom query, you can load partial objects instead of full entities.
As you said in your question, the description field seems to slow down your loading speed; with partials, you can avoid loading some fields of the table, which will speed up the query.
First, instead of your regular syntax, this is how you create the query builder:
$em=$this->getEntityManager();
$qb=$em->createQueryBuilder();
Just in case, I prefer to keep $em in a separate variable (in case I want to fetch some class repository, for example).
Then you can start your partial select. Careful: the first select can't include any association fields:
$qb->select("partial job.{id, status, title, minimum_experience, min_salary, max_salary, zip_code, vacancies}")
->from(Job::class, "job");
Then you can add your associations:
$qb->addSelect("company")
->join("job.company", "company");
Or even add a partial association, in case you don't need all of the association's data:
$qb->addSelect("partial activitySector.{id}")
->join("job.activitySector", "activitySector");
$qb->addSelect("partial job.{id, company_id, activity_sector_id, status, active, contract_type_id, salary_type_id, workday_id, geoname_id, title, minimum_experience, min_salary, max_salary, zip_code, vacancies, show_salary");
5. Caches
You could also use various caches, like Zend OPcache for PHP; you will find some advice in this question: Why Symfony3 so slow?
There is also Varnish, an HTTP cache.
That rounds up about everything I can share to lower your loading time.
Hope it proves useful and you will be able to solve your problem.
So many keys; try to minimize the number of keys.
I'm working on a project, and right now I'm implementing a leaderboard. Before I start working on it, I need some advice on the best structure for it.
First of all, the leaderboard will be displayed on two pages: one on each player's home page, which will contain the top 10 teams (the same 10 teams for all players), and the other on the leaderboard page, which will have all the teams with sorting functionality.
The structure of each leaderboard row is the following:
• ranking position
• team name
• team value
• total games the team has won
• total games the team has lost
• total games the team has drawn
• sum of the goals the team has scored
• sum of the goals the team has conceded
• the team's last 4 game results
Below are my database tables.
challenges table
CREATE TABLE `challenges` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`challenge_date` datetime NOT NULL,
`status` varchar(20) COLLATE utf8_unicode_ci NOT NULL,
`created_at` timestamp NULL DEFAULT NULL,
`updated_at` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `challanges_id_index` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
challenges results
CREATE TABLE `challenges_results` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`challenge_id` int(11) NOT NULL,
`team_id` int(11) NOT NULL,
`goals` int(11) NOT NULL,
`result` char(1) DEFAULT NULL,
`challenge_date` datetime NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
In challenges_results, the result column can be W for a win, D for a draw, and L for a defeat.
team values
CREATE TABLE `team_values` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`team_id` int(11) DEFAULT NULL,
`value` double(15,8) DEFAULT '1500.00000000',
`created_at` datetime DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
team
CREATE TABLE `teams` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`avatar` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`founded` date NOT NULL,
`residense_city_id` int(10) unsigned NOT NULL,
`slug` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`primary_color` char(10) COLLATE utf8_unicode_ci NOT NULL,
`secondary_color` char(10) COLLATE utf8_unicode_ci NOT NULL,
`status` varchar(20) COLLATE utf8_unicode_ci NOT NULL,
`created_at` timestamp NULL DEFAULT NULL,
`updated_at` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `teams_slug_unique` (`slug`),
KEY `teams_id_index` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
One team can have many values (team_values), but only the most recent will be displayed.
One team can be in many challenges.
One team can have many results from different challenges.
The leaderboard will work as follows: the teams will be sorted by the highest value from the team_values table. That value is calculated and stored every time the team plays a challenge.
In case two or more teams have the same value, we need to apply the following three rules. The rules need to be executed one by one: for example, if after applying the first rule some teams are still tied on value and goals scored, I apply the second rule, and so on.
• Best offense (highest number of goals scored)
• Best defense (lowest number of goals conceded)
• The most wins in the games between the tied teams
So I came up with three solutions, and I still don't know which one is best, or whether there is a better approach than these three.
The first option I thought of is to use joins, unions, etc. to collect the information from the tables and apply the rules in the same SQL query, executing that SQL every time I want to view the leaderboard. The problem with this solution is that I don't know how efficient it will be if the leaderboard must always be up to date with the latest results; imagine having 10k visitors per day, every one of them executing this query.
The second option is to collect the information and, in case of tied values, use PHP to find the tied teams, apply the rules, and then swap the teams in the array based on the results. Performance-wise, I don't know how efficient this option is.
The third solution is to create another table called leaderboard, in which I store all this information, inserting a row if the team doesn't exist or updating the record based on the result of the latest challenge (e.g., increasing the goals if the team scored). Then I use only the leaderboard table for filtering the data and printing the team rankings. I believe this option is better because I deal with only one table and update a record only when a team finishes a challenge.
We will use a cache, but for now we think the leaderboard should always be up to date rather than updated once a day.
Which solution is better, and why? If there is a better approach, I'm open to suggestions. Thanks.
Since you're running on a shared account on a virtual private server, the chances are very good that you're going to run into cases where you contend for server resources: disk usage, memory usage, CPU processing power.
First and foremost, try to do all your database calculations in MySQL, and only return the data to PHP once you've completed all operations on it; MySQL is optimized for that job, whereas PHP is better at general computing problems.
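For illustration, here is a rough, untested sketch of how the standings could be aggregated entirely in MySQL from the tables you posted. Goals conceded would need a self-join on challenge_id to reach the opponent's row, and the head-to-head tie-break needs separate handling, so both are left out here:

SELECT t.id,
t.name,
tv.value,
SUM(cr.result = 'W') AS wins,
SUM(cr.result = 'L') AS defeats,
SUM(cr.result = 'D') AS draws,
SUM(cr.goals) AS goals_scored
FROM teams AS t
JOIN challenges_results AS cr ON cr.team_id = t.id
JOIN team_values AS tv ON tv.team_id = t.id
-- keep only the most recent value row per team
WHERE tv.id = (SELECT MAX(tv2.id) FROM team_values AS tv2 WHERE tv2.team_id = t.id)
GROUP BY t.id, t.name, tv.value
ORDER BY tv.value DESC, goals_scored DESC;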
One option to take some load off the server would be to use PHP to regenerate a webpage, viewable in the browser, every single time the leaderboard is updated. That way, you run through the calculations only once per update.
If I were building the system and it was never going to reach enterprise-grade level, but instead remain small and functional, I would write such a PHP script early on, because you can save a lot of processing power that way alone.
For what it's worth, if the server is well configured, you shouldn't be worried about 10k user requests a day, unless your code is really terribly written.
EDIT: As an afterthought, you can install a program like memcached (https://memcached.org/), which caches your SQL data in RAM. Sites like LiveJournal and WordPress use it, but you'd need to configure it in a way that works for the rest of the VPS users, unless the box is really high-spec.
I have a small (100-ish rows, 5 columns) table which is displayed in full for a control panel feature. When serving the site through IntelliJ during development, it responds to the initial request but never finishes executing, and thus never serves any content. If I deploy the PHP files to my local web server, it serves the same content with no hesitation at all. Sometimes, when I load parts of the control panel that use no database access, it loads just fine (albeit slowly). I've upped the maximum memory allowed for requests in my cli/php.ini, and also increased the memory available to IntelliJ. My idea64.vmoptions is as follows:
-Xms128m
-Xmx3G
-XX:MaxPermSize=750m
-XX:ReservedCodeCacheSize=200m
-ea
-Dsun.io.useCanonCaches=false
-Djava.net.preferIPv4Stack=true
-Djsse.enableSNIExtension=false
-XX:+UseCodeCacheFlushing
-XX:+UseConcMarkSweepGC
-XX:SoftRefLRUPolicyMSPerMB=50
-Dawt.useSystemAAFontSettings=lcd
If I dump the table, the page loads again, so I assume the problem is related to how much memory IntelliJ allows PHP to use, but I'm quite stumped as to what to look for. The only special thing about the table, as far as I know, is that it uses a very large primary key column. The table structure is as follows:
CREATE TABLE IF NOT EXISTS `links` (
`url` VARCHAR(767) NOT NULL,
`link_group` INT(10) UNSIGNED NOT NULL,
`isActive` TINYINT(1) NOT NULL DEFAULT '1',
`hammer` TINYINT(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`url`),
KEY `group` (`link_group`)
)
ENGINE =InnoDB
DEFAULT CHARSET =utf8mb4,
ROW_FORMAT = COMPRESSED;
The row format is compressed to allow for said large primary key. How should I proceed to find the cause, if not solve the problem?
I tried following Peter's suggestions, to no avail. I'm beginning to think this may just be IntelliJ not being able to serve PHP properly in my case. The new table structure is as follows:
CREATE TABLE IF NOT EXISTS `links` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`url` varchar(767) NOT NULL,
`link_group` int(10) unsigned NOT NULL,
`isActive` tinyint(1) NOT NULL DEFAULT '1',
`hammer` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
UNIQUE KEY `url` (`url`),
KEY `group` (`link_group`),
FULLTEXT KEY `url_2` (`url`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 ROW_FORMAT=COMPRESSED AUTO_INCREMENT=1 ;
Just to be clear, the MySQL performance doesn't seem bad. SELECT * FROM links executes in 0.0005 seconds.
You might want to recreate that table. Your table definition might be causing the unpredictable behaviour.
Try using the TEXT data type for the url field. Also, using the URL as the PRIMARY KEY is not a good idea: use an id field for the primary key, and then add a unique index to the url field if so desired (note that a unique index on a TEXT column requires a prefix length).
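A sketch of that structure, adapted from the original table; the 191-character prefix is an assumption on my part, since a unique index on a TEXT column requires an explicit length and 191 utf8mb4 characters fit within InnoDB's default 767-byte key limit:

CREATE TABLE IF NOT EXISTS `links` (
`id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
`url` TEXT NOT NULL,
`link_group` INT(10) UNSIGNED NOT NULL,
`isActive` TINYINT(1) NOT NULL DEFAULT '1',
`hammer` TINYINT(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
UNIQUE KEY `url` (`url`(191)),
KEY `group` (`link_group`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;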
I'm adding "activity log" to a busy website, which should show user the last N actions relevant to him and allow going to a dedicated page to view all the actions, search them etc.
The DB used is MySQL and I'm wondering how the log should be stored - I've started with a single Myisam table used for FULLTEXT searches, and to avoid extra select queries on every action: 1) an insert to that table happens 2) the APC cache for each is updated, so on the next page request mysql is not used. Cache has a log lifetime and if it's missing, the first AJAX request from user creates it.
I'm caching 3 last events for each user, so when a new event happens, I grab the current cache, add the new event to the beginning and remove the oldest event, so there's always 3 of those in the cache. Every page of the site has a small box displaying those.
Is this a proper setup? How would you recommend implementing this sort of feature?
The schema I have is:
CREATE DATABASE `audit`;
CREATE TABLE `event` (
`eventid` INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`userid` INT UNSIGNED NOT NULL ,
`createdat` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ,
`message` VARCHAR( 255 ) NOT NULL ,
`comment` TEXT NOT NULL
) ENGINE = MYISAM CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER DATABASE `audit` DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE `audit`.`event` ADD FULLTEXT `search` (
`message` ( 255 ) ,
`comment` ( 255 )
);
Based on your schema, I'm guessing that (caching aside) you'll be inserting many records per second and running fairly infrequent queries along the lines of select * from event where userid = ? order by createdat desc, probably with a paging strategy (thus requiring a LIMIT at the end of the query) to show the user their history.
You probably also want to find all users affected by a particular type of event, though more likely in an offline process (e.g., a nightly mail to all users who have updated their password); that might require a query along the lines of select userid from event where message like 'password_updated'.
Are there likely to be many cases where you want to search the body text of the comment?
You should definitely read the MySQL manual on tuning for inserts. If you don't need to search the freetext comment, I'd leave the FULLTEXT index off; I'd also consider a regular index on the message column.
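As a sketch, the two access patterns above could be served by something like the following (the index names are my own):

ALTER TABLE `event`
ADD INDEX `user_history` (`userid`,`createdat`), -- the per-user history page, newest first
ADD INDEX `by_message` (`message`); -- offline scans for a given event message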
It might also make sense to introduce the concept of a message type so you can enforce relational consistency (rather than relying on your code to correctly spell "password_updat3"). For instance, you might have an event_type table with a foreign-key relationship to your event table.
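A sketch of that lookup, with names of my own choosing. Note that MyISAM parses but does not enforce FOREIGN KEY constraints, so on this engine the relationship is by convention only:

CREATE TABLE `event_type` (
`event_type_id` INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
`name` VARCHAR(64) NOT NULL,
UNIQUE KEY `name` (`name`)
) ENGINE = MYISAM CHARACTER SET utf8 COLLATE utf8_unicode_ci;

ALTER TABLE `event`
ADD COLUMN `event_type_id` INT UNSIGNED NOT NULL,
ADD KEY `event_type_id` (`event_type_id`);
-- With InnoDB you could enforce the relationship:
-- ALTER TABLE `event` ADD CONSTRAINT `fk_event_type`
-- FOREIGN KEY (`event_type_id`) REFERENCES `event_type` (`event_type_id`);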
As for caching: I'm guessing users will only visit their history page infrequently. Populating the cache when they visit the site, on the off chance they might visit their history (if I've understood your design), immediately limits the scalability of your solution to how many history records you can fit into your cache; as the history table will grow very quickly for your users, this could quickly become a significant factor.
For data like this, which moves quickly and is rarely visited, caching may not be the right solution.
This is how Prestashop does it:
CREATE TABLE IF NOT EXISTS `ps_log` (
`id_log` int(10) unsigned NOT NULL AUTO_INCREMENT,
`severity` tinyint(1) NOT NULL,
`error_code` int(11) DEFAULT NULL,
`message` text NOT NULL,
`object_type` varchar(32) DEFAULT NULL,
`object_id` int(10) unsigned DEFAULT NULL,
`id_employee` int(10) unsigned DEFAULT NULL,
`date_add` datetime NOT NULL,
`date_upd` datetime NOT NULL,
PRIMARY KEY (`id_log`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=6 ;
My advice would be to use a schemaless storage system; they perform better for high-volume logging data.
Consider:
Redis
MongoDB
Riak
or any other NoSQL system.
I have a social network similar to MySpace, but I use PHP and MySQL. I have been looking for the best way to show users bulletins posted only by themselves and by users they are confirmed friends with.
This involves 3 tables:
friend_friend = stores records of who is whose friend
friend_bulletins = stores the bulletins
friend_reg_user = the main user table with all user data, like name and photo URL
I will post the bulletin and friend table schemas below; for the user table I will list only the fields that matter.
-- Table structure for table friend_bulletin
CREATE TABLE IF NOT EXISTS `friend_bulletin` (
`auto_id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(10) NOT NULL DEFAULT '0',
`bulletin` text NOT NULL,
`subject` varchar(255) NOT NULL DEFAULT '',
`color` varchar(6) NOT NULL DEFAULT '000000',
`submit_date` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`status` enum('Active','In Active') NOT NULL DEFAULT 'Active',
`spam` enum('0','1') NOT NULL DEFAULT '1',
PRIMARY KEY (`auto_id`),
KEY `user_id` (`user_id`),
KEY `submit_date` (`submit_date`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=245144 ;
-- Table structure for table friend_friend
CREATE TABLE IF NOT EXISTS `friend_friend` (
`autoid` int(11) NOT NULL AUTO_INCREMENT,
`userid` int(10) DEFAULT NULL,
`friendid` int(10) DEFAULT NULL,
`status` enum('1','0','3') NOT NULL DEFAULT '0',
`submit_date` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`alert_message` enum('yes','no') NOT NULL DEFAULT 'yes',
PRIMARY KEY (`autoid`),
KEY `userid` (`userid`),
KEY `friendid` (`friendid`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=2657259 ;
friend_reg_user table fields that will be used:
auto_id = the user's ID number
disp_name = the user's display name
pic_url = a thumbnail image path
Bulletins should show all bulletins posted by any user ID in our friend list.
They should also show all bulletins we posted ourselves.
This needs to scale well; the friends table is several million rows.
// Old method 1: uses a subselect
SELECT auto_id, user_id, bulletin, subject, color, fb.submit_date, spam
FROM friend_bulletin AS fb
WHERE (user_id IN (SELECT userid FROM friend_friend WHERE friendid = $MY_ID AND status =1) OR user_id = $MY_ID)
ORDER BY auto_id
// Another old method that I used on accounts with a small number of friends, because it requires another query
// that returns a string of all their friend IDs in this format: $str_friend_ids = "1,2,3,4,5,6,7,8"
select auto_id,subject,submit_date,user_id,color,spam
from friend_bulletin
where user_id=$MY_ID or user_id in ($str_friend_ids)
order by auto_id DESC
I know these are not good for performance, as my site is getting really large, so I have been experimenting with JOINs.
I believe the query below gets everything I need, except it needs to be modified to also fetch bulletins posted by myself. When I add that condition to the WHERE clause, it seems to break and return multiple rows for each bulletin posted; I think it's because the query tries to return results where I am the friend and then also treats me as my own friend, which doesn't work well.
My main point in this whole post, though, is that I am open to opinions on the most performant way to do this task. Many big social networks have a similar feature that returns a list of items posted only by your friends. There have to be faster ways, right? I keep reading that JOINs are not great for performance, but how else can I do this? Keep in mind I do use indexes and have a dedicated database server, but my user base is large; there is no way around that.
SELECT fb.auto_id, fb.user_id, fb.bulletin, fb.subject, fb.color, fb.submit_date, fru.disp_name, fru.pic_url
FROM friend_bulletin AS fb
LEFT JOIN friend_friend AS ff ON fb.user_id = ff.userid
LEFT JOIN friend_reg_user AS fru ON fb.user_id = fru.auto_id
WHERE (
ff.friendid =1
AND ff.status =1
)
LIMIT 0 , 30
First of all, you can try to partition the database so that you're only accessing a table with the primary rows you need; move rows that are used less often to another table.
JOINs can impact performance, but from what I've seen, subqueries are not any better. Try refactoring your query so that you're not pulling all that data at once. It also seems like parts of that query could be run once elsewhere in your app, with the results either stored in variables or cached.
For example, you can cache the array of confirmed friends for each user and just reference that when running the query (as sketched below), updating the cache only when a friend is added or removed.
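A sketch of what that final query could look like, with placeholder IDs standing in for $MY_ID plus the cached friend list. Because your own ID is simply included in the list, friend_friend never has to be joined at read time, which sidesteps the duplicate rows:

SELECT fb.auto_id, fb.user_id, fb.bulletin, fb.subject, fb.color,
fb.submit_date, fru.disp_name, fru.pic_url
FROM friend_bulletin AS fb
JOIN friend_reg_user AS fru ON fru.auto_id = fb.user_id
WHERE fb.user_id IN (42, 1, 2, 3) -- $MY_ID plus cached friend IDs
ORDER BY fb.auto_id DESC
LIMIT 0, 30;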
It also depends on the structure of your systems and your code architecture; your bottleneck may not be entirely in the DB.