I am creating an MCQ quiz based on PHP and MySQL. Here are the structures of my main tables:
quiz table: quiz_id, quiz_category
category table: id, title...
questions table: id, quiz_id, categoryid, title...
answers table: id, question_id...
To start things off, I have the tables populated with 150+ quizzes, 4 categories, 14,000+ questions and right answers for each.
To save time, for each question, the right answer is pulled from the answers table along with 3 other random answers.
Now when I was testing it with just two quizzes, it worked fine. But with 150 quizzes, several problems have cropped up:
the database is slow, and for later quizzes it takes forever to load the questions
the randomization of answers is no longer working: along with the right answer, the other options show the same entry, making it easy for the user to guess the right answer.
You can see the code I am working with in my previous Stackoverflow query. https://stackoverflow.com/questions/14826573/randomising-questions-and-answers-php-quiz-not-working
Any idea what the ideal queries should be for the quiz program to work?
I will provide some tips on how to improve performance; however, these are generic and may not be complete.
From a brief look at the PHP and SQL statements in your previous question, there are a few logical places for an index. To add an index, please refer to the MySQL manual for more information.
$sql4="select * from answers where question_id=".$row2['id'];
question_id should have an index
$sql2="select * from questions where quiz_id=".$_SESSION['quizid'];
quiz_id should have an index
Adding these two indexes will also improve the selectivity of this query:
$sql3="select * from answers where question_id in (select id from
questions where quiz_id =$row2[quiz_id]) order by rand()";
This will help because previously you would have been performing a full table scan for each query.
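As a rough illustration (the index names are placeholders and I'm assuming a mysqli connection in $db, since your snippets don't show how you connect), the two indexes could be added once like this:

// One-off index creation - run once, e.g. from a small setup script.
// The idx_* names are just placeholders; use whatever fits your conventions.
$db = new mysqli('localhost', 'user', 'pass', 'quizdb');
$db->query("ALTER TABLE answers   ADD INDEX idx_answers_question_id (question_id)");
$db->query("ALTER TABLE questions ADD INDEX idx_questions_quiz_id (quiz_id)");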
Your other issue is that you have a loop, and on each iteration you send another query to the database. You should collect all the information in one query before the loop and then iterate over that result, rather than issuing individual queries on every iteration.
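For example (just a sketch: it assumes a mysqli connection in $db, the quiz id in $_SESSION['quizid'], and a title column on both tables, which may not match your exact column names), you could fetch every question and answer for the quiz in one query and group them in PHP before the loop:

// Fetch all questions and their answers for the current quiz in ONE query,
// then group them in PHP instead of querying inside the loop.
$quizId = (int) $_SESSION['quizid'];
$sql = "SELECT q.id AS question_id, q.title AS question_title,
               a.id AS answer_id, a.title AS answer_title
        FROM questions q
        JOIN answers a ON a.question_id = q.id
        WHERE q.quiz_id = $quizId";
$result = $db->query($sql);

$questions = array();   // question_id => array('title' => ..., 'answers' => array(...))
while ($row = $result->fetch_assoc()) {
    $questions[$row['question_id']]['title']     = $row['question_title'];
    $questions[$row['question_id']]['answers'][] = $row;
}
// Now iterate over $questions - no further queries are needed inside the loop.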
I have two tables named questions and users on my MySQL server.
questions table columns:
id
text
difficulty
category
users table columns:
id
username
Let's say I have 10k users and 20k questions.
Based on difficulty and category I pick a question and ask it to a user, but I don't want to show questions the user has already answered correctly.
The easiest way is to add a user column named solvedquestions holding a comma-separated list (1,2,3,4,5... etc.) and to query against that column when picking the next question, like this:
select * from questions where difficulty=[difficulty] and id not in([user.solvedquestions])
But this query can get very slow, and it's ugly. That's why I want to find an alternative using Elasticsearch:
keep the user's solved-question history in Elasticsearch, and
find the questions that are not yet solved, without using a NOT IN query.
The options that come to mind:
save all question id records with their difficulties and categories in Elasticsearch, query the ids there, and then fetch those ids from MySQL with an IN query (see the sketch below)
keep all the question data in Elasticsearch and search the question texts there too
Note: question difficulties and answers may be updated and synced once per day.
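Purely to illustrate the first option (the Elasticsearch query itself is left out; assume $candidateIds already holds the unsolved question ids it returned), the MySQL side could be as simple as an IN lookup:

// Sketch of option 1: Elasticsearch stores the question ids (with difficulty
// and category) plus the user's solved history, and returns candidate ids the
// user has NOT solved yet. That ES call is omitted - assume its result is in
// $candidateIds. Assumes a mysqli connection in $db.
$candidateIds = array(17, 253, 4021);   // placeholder result from Elasticsearch
$ids = implode(',', array_map('intval', $candidateIds));
$question = $db->query("SELECT * FROM questions WHERE id IN ($ids) LIMIT 1")->fetch_assoc();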
I have a simple form that asks "how are you doing right now at this moment?" and they select #1-10 from a dropdown.
The challenge: the user will answer this question endlessly over time, and I'd like to, if possible, store their ongoing answers in one column of a record with their unique user_id. Since they can potentially have hundreds of submissions to the question, what would be the best way to store and retrieve their answers? There will be an option for them to view their past 5, 10, or even 100 answers so they can see a pattern over time of how they're doing. Their info would probably be displayed in a table going across the screen like:
Here's how you've been doing:
2 4 8 9 4 9 4 etc etc
Is there a way, and is it recommended in this case, to save all their submitted answers to the question in one single table row column? If so, can you give me an idea of the MySQL code to save ... and the code to retrieve it? I would create x number of columns to save each answer if there were a known total, but in this case we don't know how many there will be.
I wasn't able to find a solution to this online.
Yes. As Jeff suggested, if I were you I would create a table (call it temporary_answer) with the fields
user_id, question_id, answer_id, created_datetime
You will then be able to fetch these answers anytime, anywhere by filtering on user_id and created_datetime. I did this when I was developing e-learning sites. I hope this answer helps.
CMIIW.
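To make that concrete (the column names follow the answer above; the types, the index and the LIMIT are my own guesses, and a mysqli connection in $db is assumed), the table and the "last N answers" lookup might look like this:

// Rough sketch of the suggested temporary_answer table and queries.
$db->query("
    CREATE TABLE IF NOT EXISTS temporary_answer (
        id               INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
        user_id          INT UNSIGNED NOT NULL,
        question_id      INT UNSIGNED NOT NULL,
        answer_id        TINYINT UNSIGNED NOT NULL,   -- the 1-10 rating
        created_datetime DATETIME NOT NULL,
        INDEX idx_user_created (user_id, created_datetime)
    )");

// Store one submission.
$userId = 7; $questionId = 1; $rating = 8;            // example values
$db->query("INSERT INTO temporary_answer (user_id, question_id, answer_id, created_datetime)
            VALUES ($userId, $questionId, $rating, NOW())");

// Fetch the user's last 10 answers, newest first.
$result = $db->query("SELECT answer_id, created_datetime
                      FROM temporary_answer
                      WHERE user_id = $userId
                      ORDER BY created_datetime DESC
                      LIMIT 10");
while ($row = $result->fetch_assoc()) {
    echo $row['answer_id'] . ' ';                     // e.g. 2 4 8 9 4 9 4 ...
}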
I have an application (more precisely, a quiz app) where I have saved all 1000 of my quiz questions in a MySQL database. I want to retrieve a random question from this table when a user requests one, which I can easily do using the RAND() function in MySQL. My problem is, I don't want to give the same question to a user two or more times. How can I keep a record of retrieved questions? Do I have to create tables for each and every user? Won't that increase the load time? Please help me; any help would be a big favor.
-regards
If you only need it for a short time, use the user's $_SESSION for that.
If you need it long term (say, so that tomorrow you don't ask the same questions), you'll have to create an additional users-to-questions table (questions2users below), where you'll store the user id and the questions the user has already been asked.
Retrieving a question in both cases only requires a simple NOT IN condition:
$sql = "SELECT * FROM questions
        WHERE id NOT IN (" . implode(",", $_SESSION["asked"]) . ")";
SELECT * FROM questions
WHERE id NOT IN (
SELECT question_id FROM questions2users WHERE userid = 123
)
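To complete the picture (a sketch only: it assumes a mysqli connection in $db, the questions2users table above, and the user id in $_SESSION['userId']), recording what has just been asked could look like this:

// After serving $questionId to the current user, remember it both ways.
$questionId = 123;                                    // the question just served
$userId     = (int) $_SESSION['userId'];

// Short term: keep it in the session.
$_SESSION['asked'][] = $questionId;

// Long term: persist it per user.
$db->query("INSERT INTO questions2users (userid, question_id)
            VALUES ($userId, $questionId)");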
My problem is, I don't want to give the same question to a user two or more times.
How can I keep a record of retrieved questions? Do I have to create tables for each
and every user? Won't that increase the load time?
Yes, but possibly not by much.
You keep a single extra table with (userId, questionId) and insert into it the questions already asked to the various users.
When you ask question 123 to user 456, you run a single INSERT
INSERT INTO askedQuestions (userId, questionId) VALUES (456, 123);
Then you extract questions from questions with a LEFT JOIN
SELECT questions.* FROM questions
LEFT JOIN askedQuestions ON (questions.id = askedQuestions.questionId AND askedQuestions.userId = {$_SESSION['userId']} )
WHERE askedQuestions.userId IS NULL
ORDER BY RAND() LIMIT 1;
If you keep askedQuestions indexed on (userId, questionId), the join will be very efficient.
Notes on RAND()
Selecting from a table like this should not be done with ORDER BY RAND(), which retrieves all the rows in the table before outputting one of them. Normally you would choose a questionId at random and select the question with that questionId, which would be way faster. But here you have no guarantee that the question has not already been asked to that user, so the faster query might fail.
When most questions are still free to ask, you can use
WHERE questions.id IN ( FLOOR(1 + RAND() * N), FLOOR(1 + RAND() * N), FLOOR(1 + RAND() * N), ... )
AND askedQuestions.userId IS NULL LIMIT 1
where N is the number of questions. Chances are that at least one of the random ids you generate will still be free. The IN list will decrease performance a little, so you will have to strike a balance with the number of random terms. When almost all questions have been asked, the chances of a match decrease, and your query might return nothing even with many random terms (also because they will start yielding duplicate ids, in what is known as the Birthday Paradox).
One way to achieve the best of both worlds could be to fix a maximum number of attempts, say three (or better still, a number based on how many questions are left).
For X times, you generate (in PHP) a set of Y random ids between 1 and 1000 and try to retrieve those (userId, questionId) pairs from askedQuestions. The table is thin and indexed, so this is really fast. If one of the ids is not found there, then that questionId is random and still free, and you can run
SELECT * FROM questions WHERE id = {$tuple['questionId']};
which is also very fast. If you succeed X times, i.e. if for X times all Y random questionIds turn out to be registered as already asked, then you run the full query. Most users will be served almost instantly (two very quick queries), and only a few really dedicated users will require more processing. You might want to set up some kind of alert to warn you of users running out of questions.
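Here is a rough sketch of that strategy in PHP (the table and column names follow the answer above; the connection in $db, the session user id, and the X/Y values are assumptions):

// Illustrative sketch of the X-attempts strategy described above.
$userId   = (int) $_SESSION['userId'];
$n        = 1000;   // total number of questions
$attempts = 3;      // X
$perTry   = 5;      // Y

$freeId = null;
for ($i = 0; $i < $attempts && $freeId === null; $i++) {
    // Y random candidate ids, generated in PHP.
    $candidates = array();
    for ($j = 0; $j < $perTry; $j++) {
        $candidates[] = mt_rand(1, $n);
    }
    $list = implode(',', $candidates);

    // Which of these has the user already been asked? (thin, indexed table)
    $asked = array();
    $res = $db->query("SELECT questionId FROM askedQuestions
                       WHERE userId = $userId AND questionId IN ($list)");
    while ($row = $res->fetch_assoc()) {
        $asked[] = (int) $row['questionId'];
    }

    $free = array_diff($candidates, $asked);
    if ($free) {
        $freeId = reset($free);   // first candidate that is still free
    }
}

if ($freeId !== null) {
    $question = $db->query("SELECT * FROM questions WHERE id = $freeId")->fetch_assoc();
} else {
    // Fall back to the full LEFT JOIN ... ORDER BY RAND() query shown earlier.
}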
One solution is to add an ID column to the question table, and when you serve a question to a user, check that ID against the list of questions you have already served to that user.
You can use an in-memory data structure such as a list to keep track of the questions that have been served to a particular user. This way you only need an array of lists, instead of extra tables, to get the job done.
The following description is a simple example with questions and answers, but the logic of my site is similar.
Let's say the tables are:
USERS table: USER_ID, etc
QUESTIONS table: QUESTION_ID, TEXT, CATEGORY, CORRECT_RESPONSE, AVAILABLE
RESPONSES table: QUESTION_ID, USER_ID, RESPONSE_VALUE
PROFILE table: USER_ID, CATEGORY_Questions, YEAR, NUMBER_OF_ANSWERED, Number_OF_CORRECT, POINTS
The questions will be available to be answered by users for a few hours. Every question has the same 3 answer choices: YES/NO/DEPENDS.
So I want a user to click on one of them, store an entry in the RESPONSES table (OK, this query is easy), and then not be able to answer the same question again.
Users will be able to edit their answer for some time, and after this period I want the question to be shown as answered, until the end of the day when I will mark the question as AVAILABLE=NO and it will be removed from the unanswered questions. What is the most efficient way to do this?
There are a lot of ways to achieve this, depending on the context. One of them is to create a boolean bit column called answered and another column AnswerDate (DATETIME or TIMESTAMP). When the user answers a question, record the answer time; then use PHP (or JavaScript) to update the answered flag in the table once the editing period you want has elapsed.
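A sketch of that idea (the answered and AnswerDate columns come from the answer above; the types, the 2-hour edit window, and the mysqli connection in $db are my own assumptions):

// One-off schema change on the RESPONSES table.
$db->query("ALTER TABLE RESPONSES
            ADD COLUMN answered   TINYINT(1) NOT NULL DEFAULT 0,
            ADD COLUMN AnswerDate DATETIME NULL");

// When a user submits or edits a response, record the time.
$questionId = 10; $userId = 7;                        // example values
$db->query("UPDATE RESPONSES
            SET AnswerDate = NOW()
            WHERE QUESTION_ID = $questionId AND USER_ID = $userId");

// Run periodically (cron or on page load): lock responses whose edit window
// has elapsed, so the question shows as answered.
$db->query("UPDATE RESPONSES
            SET answered = 1
            WHERE answered = 0
              AND AnswerDate < NOW() - INTERVAL 2 HOUR");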
I'm currently working on a medium-sized web project, and I've run into a problem.
What I want to do is display a question, together with an image. I have a (global) list of questions, and a (global) list of images, all questions should be asked for all images.
As far as the user can see, the question and image should be chosen at random. However, the statistics from the answers (question/image pairs) will be used for research purposes. This means that the question/image pairs must be chosen such that the answers end up distributed evenly across all questions and across all images.
A user should only be able to answer a specific question/image-pair one time.
I am using a MySQL database and PHP. Currently, I have three database tables:
tbl_images (image_id)
tbl_questions (question_id)
tbl_answers (answer_id, image_id, question_id, user_id)
The other columns are not related to this specific problem.
Solution 1:
Track how many times each image/question has been used (add a column in each table). Always choose the image and question that has been asked the least.
Problem:
What I'm actually interested in is the distribution among questions for a given image and vice versa, not that each question is evened out globally.
Solution 2:
Add another table containing all question/image pairs along with how many times each has been asked. Choose the lowest combination (the first row if the count column is sorted in ascending order).
Problem:
Does not enforce that the user can only answer a question once. Also does not give the user the appearance that the choice is random.
Solution 3:
Same as #2, but store question/image/user_id in table.
Problem:
Performance issues (?), and a lot of space used for each user. There will probably be semi-large amounts of data (thousands of questions/images and at least hundreds of users).
Solution 4:
Choose a question and image truly at random from all available ones. With a large enough number of answers they will be distributed evenly.
Problem:
If I add a new question or image, it will not get more answers than the others and will therefore never catch up. I want an even amount of statistics for all question/image pairs.
Solution 5:
Weighted random. Choose a number of question/image pairs (say about 10-100) truly at random and pick the best (as in, lowest global count) of those that the user has not answered (see the sketch at the end of this question).
Problem:
Does not guarantee that a recently added question or image gets a lot of answers quickly.
Solution #5 is probably the best one I've come up with so far.
Your input is very much appreciated, thank you for your time.
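For what it's worth, here is a rough PHP sketch of how Solution 5 might look against the tables above (the sample size, the NOT EXISTS filter, and the mysqli connection in $db are my own assumptions, not part of the question):

// Sketch of Solution 5 (weighted random) over tbl_images / tbl_questions / tbl_answers.
$userId     = 42;    // current user (example value)
$sampleSize = 50;    // how many random pairs to consider

// 1) Take a random sample of image/question pairs the user has NOT answered,
//    together with how often each pair has been answered globally.
$sql = "SELECT i.image_id, q.question_id,
               (SELECT COUNT(*) FROM tbl_answers a
                WHERE a.image_id = i.image_id
                  AND a.question_id = q.question_id) AS global_count
        FROM tbl_images i
        CROSS JOIN tbl_questions q
        WHERE NOT EXISTS (SELECT 1 FROM tbl_answers mine
                          WHERE mine.image_id = i.image_id
                            AND mine.question_id = q.question_id
                            AND mine.user_id = $userId)
        ORDER BY RAND()
        LIMIT $sampleSize";
$result = $db->query($sql);

// 2) Of that random sample, keep the pair with the fewest answers so far.
$best = null;
while ($row = $result->fetch_assoc()) {
    if ($best === null || $row['global_count'] < $best['global_count']) {
        $best = $row;
    }
}
// $best is the image/question pair to show next, or null if everything is answered.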
From what I understand of your problem, I would go with #1. However, you do not need a new column. I would create an SQL view instead, because it sounds like you'll need to report on things like that anyway. A view is basically a stored SELECT, but it acts much like a table. So you would create a view that keeps the total number of answers for each question/image pair:
DROP VIEW IF EXISTS "main"."view_image_question_count";
CREATE VIEW "view_image_question_count" AS
SELECT image_id, question_id, COUNT(*) AS "total"
FROM answer
GROUP BY image_id, question_id;
Then, you need a quick and easy way to get the next best image/question combo to ask:
DROP VIEW IF EXISTS "main"."view_next_best_question";
CREATE VIEW "view_next_best_question" AS
SELECT a.*, user_id
FROM view_image_question_count a
JOIN answer USING( image_id, question_id )
JOIN question USING(question_id)
JOIN image USING(image_id)
ORDER BY total ASC;
Now, if you need to report on your image-to-question performance, you can do so with:
SELECT * FROM view_image_question_count
If you need the next best image+question to ask for a user, you would call:
SELECT * FROM view_next_best_question WHERE user_id != {USERID} LIMIT 1
The != {USERID} part is there to prevent getting a question the user has already answered. The LIMIT restricts the result to a single row.
Disclaimer: There is probably a lot that could be done to optimize this. I just wanted to post something for thought.
Also, here is the database dump I used for testing. http://pastebin.com/yutyV2GU