I have a page on my site that should show a user only if the column 'active' in my table is not null; otherwise it should skip that user.
Before I go further though: say I have 100 users. Will having this query on my page slow it down a lot? The query is on my index page, so it runs every time someone visits and pulls all of the data from my table...
while($r = $q->fetch(PDO::FETCH_LAZY)){
echo '<div class="user">'.$r["user"].'</div>';
}
The short answer is no. You should not see a significant delay for a reasonable number of users. However, if you are anticipating thousands of users, you would be better off adopting some common practices to keep performance up, like:
Paging: Using a LIMIT clause with your queries and an assortment of links to provide access to all user records. This way, no matter how much your table may grow, performance will stay constantly high.
A simple text search form for the user name. Consider having to browse through a couple hundred users just to reach a particular one, versus having the server do that for you. If you go that way, don't forget to index the fields the search would apply to to ensure optimal performance.
Index the fields you're using as criteria in your query. ndefontenay's answer covers this, and I'm only mentioning it for completeness.
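As a rough illustration of the paging idea, here is a minimal sketch in Python with an in-memory SQLite database standing in for PHP and MySQL. The fetch_page helper, the page size, and the sample rows are assumptions made for the example; only the User table and the active IS NOT NULL filter come from the thread.

```python
import sqlite3

def fetch_page(conn, page, per_page=10):
    """Return one page of active user names (hypothetical helper)."""
    offset = (page - 1) * per_page
    rows = conn.execute(
        "SELECT username FROM User WHERE active IS NOT NULL "
        "ORDER BY username LIMIT ? OFFSET ?",  # ORDER BY keeps pages stable
        (per_page, offset),
    ).fetchall()
    return [r[0] for r in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE User (username TEXT, active INTEGER)")
# 25 made-up users; every fifth one has active = NULL and is skipped
conn.executemany(
    "INSERT INTO User VALUES (?, ?)",
    [(f"user{i:02d}", None if i % 5 == 0 else 1) for i in range(25)],
)

first_page = fetch_page(conn, 1)   # 10 of the 20 active users
third_page = fetch_page(conn, 3)   # past the end, so empty
```

However large the table grows, each request only reads per_page rows, which is the point of the LIMIT clause.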
EDIT: Regarding your question, if what you want is to get all users where the field active is not null, you can do it like this:
SELECT * FROM User WHERE active IS NOT NULL
A small note: In terms of performance, it is best to specify the fields you want to retrieve instead of collecting all fields. If, for example, you wanted to retrieve the user name and email for each user, you could do it like:
SELECT username, email FROM User WHERE active IS NOT NULL
It looks more like a SQL question rather than a PHP question.
Your query looks like this:
select username, fullname, whatever_else from usertable where active = 1;
This should be a really fast query if you create an index on your active column:
create index idx_active
on usertable
(active desc);
I made the index descending so that the value 1 is lined up first (in my example 1 is active 0 inactive and the active column is an integer).
I assume that in the long run you will have far fewer active people than inactive ones, so you will always be querying a small subset of usertable.
Hope this helps.
Related
I have a table article with many articles. I want to track the number of viewers for each article. Here's my idea, of how I plan to do it.
I have a table viewers with columns: id, ip, date, article_id.
article_id is a FOREIGN KEY referring to the id of the article. So, when a user opens an article, his/her IP address is stored in the table.
Is this a decent approach? It is for a personal site and not a very big website.
EDIT: I want to print the number of views on each article page.
It depends on how frequently you need to display the number of viewers. Your general query will be:
select count(*) from viewers
where article_id='10'
With time, your viewers table will grow. Say it has a million records after a year or two. If you are showing the number of viewers on each article page, or displaying the articles with the most viewers, the count will start to hurt performance even though the foreign key is indexed. That will only happen, though, once you have added hundreds of articles with thousands of viewers each.
A better optimized solution may be to keep the number of viewers in the article table and use that to display results. The existing viewers table is still necessary to ensure there are no duplicate entries (the same user reading an article ten times must count as a single entry, not ten).
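A minimal sketch of that counter-plus-viewers design, written in Python with SQLite as a stand-in for MySQL. The view_count column, the UNIQUE (article_id, ip) constraint, and the record_view helper are assumptions for illustration. SQLite spells the duplicate-tolerant insert INSERT OR IGNORE; MySQL's equivalent is INSERT IGNORE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE article (
    id INTEGER PRIMARY KEY,
    title TEXT,
    view_count INTEGER NOT NULL DEFAULT 0   -- assumed cached counter
);
CREATE TABLE viewers (
    id INTEGER PRIMARY KEY,
    ip TEXT NOT NULL,
    date TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    article_id INTEGER NOT NULL REFERENCES article(id),
    UNIQUE (article_id, ip)                 -- one row per IP per article
);
INSERT INTO article (id, title) VALUES (10, 'Example');
""")

def record_view(conn, article_id, ip):
    """Store the viewer once and bump the counter only for new viewers."""
    with conn:  # both statements commit together or not at all
        cur = conn.execute(
            "INSERT OR IGNORE INTO viewers (ip, article_id) VALUES (?, ?)",
            (ip, article_id),
        )
        if cur.rowcount == 1:  # 0 means the row already existed
            conn.execute(
                "UPDATE article SET view_count = view_count + 1 WHERE id = ?",
                (article_id,),
            )

record_view(conn, 10, "203.0.113.5")
record_view(conn, 10, "203.0.113.5")  # repeat visit, ignored
record_view(conn, 10, "203.0.113.9")
view_count = conn.execute(
    "SELECT view_count FROM article WHERE id = 10"
).fetchone()[0]
```

Displaying the count then reads a single column from the article row instead of counting rows in viewers.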
Use a tool like Google Analytics. It does the job in much more detail and you're up and running in minutes. There is more to identifying unique visitors than IP addresses!
If you want an on-premise solution, look at PIWIK, which is a PHP-based analytics platform built for exactly this purpose.
There is one problem with this design: if the same user opens the article again and again, you either have to check before inserting the entry, or you end up inserting the same IP address multiple times with different timestamps.
Most popular sites count one IP address as one view, even if that client or user opens the article several times.
I can think of two solutions:
Keep your approach and add a single check: if the same client has opened the article before, don't insert it again.
Or insert every visit and group by IP when you retrieve the counter.
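The second option, de-duplicating at read time, could look roughly like this. It is a sketch in Python with SQLite standing in for MySQL; the unique_views helper and the sample rows are made up, and COUNT(DISTINCT ip) is used as one way to collapse repeat visits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE viewers ("
    "id INTEGER PRIMARY KEY, ip TEXT, date TEXT, article_id INTEGER)"
)
# Every visit is stored, including repeats from the same IP
conn.executemany(
    "INSERT INTO viewers (ip, date, article_id) VALUES (?, ?, ?)",
    [
        ("203.0.113.5", "2024-01-01", 10),
        ("203.0.113.5", "2024-01-02", 10),  # same visitor again
        ("203.0.113.9", "2024-01-02", 10),
        ("203.0.113.5", "2024-01-03", 11),
    ],
)

def unique_views(conn, article_id):
    """Count each IP once per article (hypothetical helper)."""
    return conn.execute(
        "SELECT COUNT(DISTINCT ip) FROM viewers WHERE article_id = ?",
        (article_id,),
    ).fetchone()[0]

views_for_10 = unique_views(conn, 10)
```

This keeps the full visit history at the cost of a larger table and a more expensive count.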
It depends on what you want to store in your database. If you want to know exactly how many unique users visited a particular article (including date and IP), this is a reasonable way to do it. But if you only need a number to show, you can alter the article table to add a visit_counter column, and set a cookie to avoid incrementing the counter again for the same user.
Try something like this:
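// $link is assumed to be an open mysqli connection, e.g. from mysqli_connect()
// insert (REPLACE only de-duplicates if ip has a UNIQUE or PRIMARY KEY index)
$query = mysqli_query($link, "REPLACE INTO viewers (ip) VALUES ('" . ip2long($_SERVER['REMOTE_ADDR']) . "')");
// retrieve
list($pageviews) = mysqli_fetch_row(mysqli_query($link, "SELECT COUNT(ip) FROM viewers"));
echo $pageviews;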
Read : REPLACE INTO
Yes, this is a good approach if you create some kind of cache for displaying how many views each article has had. It's not optimal to count the views every time a user opens the website.
You can do it on the database side with something like a materialized view (for example https://code.google.com/p/flexviews/):
select article_id, count(*) as views from viewers group by article_id
Or you can cache it in files and refresh every X time.
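One way to picture the caching idea is a small summary table that is rebuilt periodically from the GROUP BY query, which is roughly what a materialized view does for you. The sketch below uses Python with SQLite as a stand-in; the article_views table and the refresh_view_counts helper are assumptions, and how often to refresh is left to a scheduler of your choice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE viewers (id INTEGER PRIMARY KEY, ip TEXT, article_id INTEGER);
CREATE TABLE article_views (article_id INTEGER PRIMARY KEY, views INTEGER);
INSERT INTO viewers (ip, article_id) VALUES
    ('203.0.113.1', 10), ('203.0.113.2', 10), ('203.0.113.3', 10),
    ('203.0.113.1', 11);
""")

def refresh_view_counts(conn):
    """Rebuild the cached counts from the raw viewers table."""
    with conn:  # swap old counts for new ones in a single transaction
        conn.execute("DELETE FROM article_views")
        conn.execute(
            "INSERT INTO article_views (article_id, views) "
            "SELECT article_id, COUNT(*) FROM viewers GROUP BY article_id"
        )

refresh_view_counts(conn)
cached = dict(conn.execute("SELECT article_id, views FROM article_views"))
```

Article pages then read from article_views, and the expensive count runs only when the refresh does.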
To record the users who viewed an article, I suggest using AJAX. When a user opens the page, a separate request calls the site to add him as a viewer. Even if your db is slow, this will not slow down page loading, and web spiders will not be counted.
I have to create a like system (the name won't be "like", Facebook owns it).
So I imagined two ways to store these likes in my database, and I want to know which one is better for a very high-traffic site.
Create a table comment_likes with "id", "comment_id", "user_id" columns. In the comments table store a "like_count", so I don't need to count the likes when I display the number. But likes are easy to give, so people will create lots of them, and if I need to list the likes of a specific comment, I need to read the whole comment_likes table to find all the user_ids. This could be millions of rows in the future. If 1000 users do it at the same time, my system will die.
My second thought was to store likes in the comments table: create a column named "likes" with a list of user_ids like this: 1#34#21#56#....
So when somebody likes/unlikes a comment, just CONCAT or REPLACE his/her id in this column with a #. When I need to list the likes of a specific comment, just explode this list at the #s.
I think the 2nd could be faster and smarter, but what do you think about this?
The first option is much better, because you have the benefits of a relational setup. For example: What if you want to get the comments from the database userId x has liked? With the first setup this is a fast and simple query. In the second case you would have to use a LIKE, which is much slower and inaccurate. (Imagine the userId is 1, and the likes field in the comments table contains #10 - it would return the comment if you would use LIKE '%1%').
And even for a high traffic site; just using an index on commentId would make this a fast operation.
So go for the first option.
If you really doubt the speed of the first option, you could create a "cache" field in the comments table that holds the number of likes, so you don't have to perform a subquery to select the like count.
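To make the comparison concrete, here is a small sketch in Python with SQLite standing in for MySQL. Both storage options hold the same made-up data: user 1 likes only comment 1, and user 10 likes comment 2. The table layout and index name are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE comments (
    id INTEGER PRIMARY KEY,
    body TEXT,
    likes TEXT                       -- option 2: '#'-separated user ids
);
CREATE TABLE comment_likes (         -- option 1: one row per like
    comment_id INTEGER NOT NULL,
    user_id INTEGER NOT NULL,
    PRIMARY KEY (comment_id, user_id)   -- also prevents double likes
);
CREATE INDEX idx_likes_user ON comment_likes (user_id);
INSERT INTO comments (id, body, likes) VALUES
    (1, 'first', '1#34#21'), (2, 'second', '10#56');
INSERT INTO comment_likes VALUES (1, 1), (1, 34), (1, 21), (2, 10), (2, 56);
""")

# Option 1: an exact, indexable lookup of the comments user 1 liked
liked_by_user_1 = [r[0] for r in conn.execute(
    "SELECT comment_id FROM comment_likes WHERE user_id = ? ORDER BY comment_id",
    (1,),
)]

# Option 2: the substring match also hits '10#56', which user 1 never liked
like_matches = [r[0] for r in conn.execute(
    "SELECT id FROM comments WHERE likes LIKE '%1%' ORDER BY id"
)]
```

Here liked_by_user_1 contains only comment 1, while like_matches contains both comments, which is the inaccuracy described above.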
I have recently written a survey application that has done its job, and all the data is gathered. Now I have to analyze the data and I'm having some timing issues.
I have to find out how many people selected each option and display it all.
I'm using this query, which does do its job:
SELECT COUNT(*)
FROM survey
WHERE users = ? AND table = ? AND col = ? AND row = ? AND selected = ?
GROUP BY users,table,col,row,selected
As is evident from the "?", I'm using MySQLi (in PHP) to fetch the data when needed, but I fear this is what makes it so slow.
The table consists of all the columns above (plus a unique ID), and all of them are integers.
To explain some of the fields:
Each survey was divided into 3 or 4 tables (sized from 2x3 to 5x5) with a 1 to 10 happiness grade to select from. (The questions are on the right and top of the table, and you answer where the questions intersect.)
users - age groups
table, row, col - explained above
selected - dooooh explained above
Now, with the surveys complete and around 1 million entries in the table, the query is getting very slow. Sometimes it takes around 3 minutes; sometimes (I guess) the time limit expires and you get no data at all. I also don't have access to the full database, just my empty "testing" one, since the customer is kinda paranoid :S (and his server seems to be a bit slow).
Now (after the initial essay) my questions are: I left indexing out intentionally because, with a lot of data being written during the survey, it would have been a bad idea. But since no new data is coming in at this point, would it make sense to index all the fields of the table? How much sense does it make to index integers that never go above 10? (As you can guess, I haven't got a clue about indexes.) Do I need the primary unique ID in this table?
I read somewhere that indexing may help grouping, but only if you group by the first columns in a table. (Since my ID is first and, from my point of view, useless, can I remove it and gain anything by it?)
Is there another way to write my query that would basically do the same thing but in a shorter period of time?
Thanks for all your suggestions in advance!
Add an index on entries that you "GROUP BY" or do "WHERE". So that's ONE index incorporating users,table,col,row and selected in your case.
Some quick rules:
combine fields to have the WHERE first, and the GROUP BY elements last.
If you have other queries that only use part of it (e.g. users,table,col and selected) then leave the missing value (row, in this example) last.
Don't use too many indexes, as each one marginally slows down updates to the table, so on a really large system you need to balance query speed against the number of indexes.
Edit: do you need the GROUP BY on user, col, row, given that these are already used in the WHERE? If the WHERE has already narrowed them down to single values, you only need to group by "selected".
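A hedged sketch of the single composite index, in Python with SQLite standing in for MySQL. The index name, the count_answers helper, and the generated rows are assumptions. The "table" and "row" columns are double-quoted here because they collide with SQL keywords (MySQL would use backticks), and EXPLAIN QUERY PLAN is SQLite's counterpart to MySQL's EXPLAIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE survey (id INTEGER PRIMARY KEY, users INTEGER, '
    '"table" INTEGER, col INTEGER, "row" INTEGER, selected INTEGER)'
)
# Made-up answers: within each cell, grade s is chosen s times
rows = [
    (u, t, c, r, s)
    for u in (1, 2) for t in (1, 2) for c in (1, 2) for r in (1, 2)
    for s in (1, 2, 3) for _ in range(s)
]
conn.executemany(
    'INSERT INTO survey (users, "table", col, "row", selected) '
    'VALUES (?, ?, ?, ?, ?)',
    rows,
)
# ONE index covering every column used in the WHERE clause
conn.execute(
    'CREATE INDEX idx_survey_lookup '
    'ON survey (users, "table", col, "row", selected)'
)

COUNT_SQL = (
    'SELECT COUNT(*) FROM survey '
    'WHERE users = ? AND "table" = ? AND col = ? AND "row" = ? AND selected = ?'
)

def count_answers(conn, users, table, col, row, selected):
    """Count matching answers; no GROUP BY needed for a single combination."""
    params = (users, table, col, row, selected)
    return conn.execute(COUNT_SQL, params).fetchone()[0]

count = count_answers(conn, 1, 1, 1, 1, 3)
plan = " | ".join(
    r[3] for r in conn.execute("EXPLAIN QUERY PLAN " + COUNT_SQL, (1, 1, 1, 1, 3))
)
```

The plan text should mention idx_survey_lookup, meaning the count is answered from the index rather than by scanning the whole table.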
I am currently using MySQL and MyISAM.
I have a function which returns an array of user IDs, either of friends or of users in general in my application, and a foreach seemed the best way to display them.
Now my issue is that I only have the IDs, so I would need to nest a database call in the loop to get each user's other info (i.e. name, avatar, other fields) based on the user ID.
I do not expect hundreds of thousands of users (this is mainly a hobby project for learning), but how should I do this? I like the flexibility of putting display code in a foreach, but since all I have is an array of IDs, am I out of luck using a single query?
Are there any general structures or tips for displaying the list appropriately?
Is my number of queries (one per user in the list) inappropriate? (Although with pages 0..n of users, 10 at a time, it does not seem as bad, I just realized.)
You could use MySQL's IN() operator, i.e.
SELECT username,email,etc FROM user_table WHERE userid IN (1,15,36,105)
That will return all rows where the userid matches one of those IDs. It gets less efficient the more IDs you add, but the 10 or so you mention should be just fine.
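Building the IN list from an array of IDs might look like the following sketch, in Python with SQLite standing in for PHP and MySQL. The fetch_users helper and the sample rows are assumptions; the table and column names follow the query above. One placeholder is generated per ID so the values are bound rather than concatenated into the SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_table (userid INTEGER PRIMARY KEY, username TEXT, email TEXT)"
)
conn.executemany(
    "INSERT INTO user_table VALUES (?, ?, ?)",
    [(1, "ann", "ann@example.com"), (15, "bob", "bob@example.com"),
     (36, "cy", "cy@example.com"), (40, "dee", "dee@example.com")],
)

def fetch_users(conn, user_ids):
    """Load all requested users with one query instead of one per ID."""
    if not user_ids:
        return {}  # an empty IN () list is not valid SQL
    placeholders = ",".join("?" * len(user_ids))  # e.g. "?,?,?,?"
    rows = conn.execute(
        "SELECT userid, username, email FROM user_table "
        f"WHERE userid IN ({placeholders})",
        list(user_ids),
    ).fetchall()
    return {uid: (name, email) for uid, name, email in rows}

users = fetch_users(conn, [1, 15, 36, 105])  # 105 does not exist
for uid in [1, 15, 36, 105]:
    if uid in users:  # the display loop stays a plain foreach-style loop
        print(uid, users[uid][0])
```

Keying the result by userid lets the display loop keep the original order of the ID array while still issuing a single query.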
Why couldn't you just use a left join to get all the data in 1 shot? It sounds like you are getting a list, but then you only need to get all of a single user's info. Is that right?
Remember databases are about result SETS and while generally you can return just a single row if you need it, you almost never have to get a single row then go back for more info.
For instance a list of friends might be held in a text column on a user's entry.
Whether you expect to have a small database or a large one, I would consider using the InnoDB engine rather than MyISAM. It has somewhat higher processing overhead than MyISAM, but you gain benefits as your hobby grows, such as transactions, row-level locking, and foreign key constraints. Either engine supports JOIN, which will allow you to pull in specific data from multiple tables:
SELECT u.`id`, p.`name`, p.`avatar`
FROM `Users` AS u
LEFT JOIN `Profiles` AS p USING (`id`)
This would return id from Users, plus name and avatar from Profiles where the id of both tables matches.
There are numerous resources online talking about database normalization, you might enjoy: http://www.devshed.com/c/a/MySQL/An-Introduction-to-Database-Normalization/
I'm sketching out a database layout for a website that has the potential to become huge, with hundreds of queries a minute.
I was thinking about doing the following:
user table
id
name
(few more fields)
Pages (this one will become the biggest table)
id
titel
img
text
restaurant (this will be the column that connects the pages to the user table; I was planning on creating an index on this one to increase speed)
So I'm wondering if creating an index on the 'restaurant' column will increase the speed of my queries, or if there is any other way to speed things up?
Thanks in advance!
If you need to do some query like :
select *
from pages
where restaurant = ...
Or like :
select *
from user
inner join pages on pages.restaurant = user.id
where user.name = '...'
Or any other condition on the restaurant column, then you'll probably want to add an index on that column, to avoid scanning every row of the pages table.
But note that useful/necessary indexes will almost always depend on the kind of queries you'll be doing.
Which means that it's not quite possible to accurately guess which indexes you'll need: first, you need to know how you will access your data.
Note : you should read the How MySQL Uses Indexes section of MySQL's manual : it contains stuff that's interesting to know ;-)
As a test, you can always run your query in your preferred tool and add EXPLAIN in front. This will show you what indices are being used and/or which temporary tables had to be created etc.
EXPLAIN select *
from pages
where restaurant = ...
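The effect of the index can be checked the same way in a quick experiment. The sketch below uses Python with SQLite, whose EXPLAIN QUERY PLAN plays the role of MySQL's EXPLAIN; the output format differs between the two, and the query_plan helper and index name are assumptions.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the detail text

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pages (id INTEGER PRIMARY KEY, titel TEXT, restaurant INTEGER)"
)
sql = "SELECT * FROM pages WHERE restaurant = ?"

plan_before = query_plan(conn, sql, (1,))   # no index yet: a full table scan
conn.execute("CREATE INDEX idx_pages_restaurant ON pages (restaurant)")
plan_after = query_plan(conn, sql, (1,))    # now a search using the new index

print(plan_before)
print(plan_after)
```

Comparing the two plans before and after adding an index is a cheap way to confirm that a given query benefits from it.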
If you're using the InnoDB storage engine, you should not just use 'an index' but declare a FOREIGN KEY. That way you also reduce potential integrity problems.
Suggestion: do not use restaurant as a name. Add some more tables and it will be difficult to keep track what references what. Why not call it user_id? (This is a matter of personal preference, though.)