I have two tables.
The first table, url_table, stores URLs and contains over 2 million rows:
| link_id | link_url |
The second table, user_table, stores user bookmarks and contains over 3.5 million rows:
| user_id | link_id | is_bookmarked |
is_bookmarked stores 1 or 0, depending on whether the user has bookmarked the link.
Here is the problem.
When a new link is added, these steps are followed:
1) Check whether the URL already exists in url_table, which means going through millions of rows.
2) If it does not exist, add a new row to url_table and user_table.
The database (MySQL) is simply taking too much time because of the enormous row set.
Also, it's a very simple PHP + MySQL app, with no search-assisted indexing programs whatsoever.
Any suggestions to speed this up?
Why not remove the column user_bookmarks.is_bookmarked and use the sole existence of an entry with user_id and link_id as indicator that the link was bookmarked?
A new link has no entries in the user_bookmarks table, because nobody bookmarked it yet. When a user bookmarks a link, you add an entry. When the user removes the bookmark, you remove the row.
To check whether a user bookmarked a link, simply SELECT COUNT(*) FROM user_bookmarks WHERE user_id=? AND link_id=?. The query always returns one row; when the count is 1, the link is bookmarked. When the count is 0, it isn't.
The lookup that precedes the insert into the URL table can be sped up with an appropriate index.
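The row-existence idea above can be sketched as follows. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the helper names add_bookmark, remove_bookmark and is_bookmarked are made up for the example.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for the sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_bookmarks ("
    " user_id INTEGER NOT NULL,"
    " link_id INTEGER NOT NULL,"
    " PRIMARY KEY (user_id, link_id))"  # one row per bookmark, no flag column
)

def add_bookmark(user_id, link_id):
    # INSERT OR IGNORE is SQLite syntax; the MySQL spelling is INSERT IGNORE.
    conn.execute(
        "INSERT OR IGNORE INTO user_bookmarks (user_id, link_id) VALUES (?, ?)",
        (user_id, link_id),
    )

def remove_bookmark(user_id, link_id):
    conn.execute(
        "DELETE FROM user_bookmarks WHERE user_id = ? AND link_id = ?",
        (user_id, link_id),
    )

def is_bookmarked(user_id, link_id):
    # COUNT(*) always returns exactly one row; its value is 0 or 1 here.
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM user_bookmarks WHERE user_id = ? AND link_id = ?",
        (user_id, link_id),
    ).fetchone()
    return count == 1
```

The composite primary key doubles as the index for the lookup, so no separate index is needed in this sketch.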
If you told us what your current schema was (i.e. the CREATE TABLE statements including indexes) rather than just what your column names were, then we might be able to make practical suggestions as to how to improve it.
There's certainly scope for improving the method of adding rows:
Assuming that link_url can be larger than the 767-byte index key limit for an InnoDB table (you didn't say what engine you are using), change the id column to contain an MD5 hash of link_url, with a unique index. Then, when you want to add a record, go ahead and try to insert it using INSERT IGNORE ....
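A minimal sketch of the hash-plus-unique-index approach, assuming Python's built-in sqlite3 as a stand-in for MySQL. The add_link helper is hypothetical, and INSERT OR IGNORE is the SQLite equivalent of MySQL's INSERT IGNORE.

```python
import hashlib
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for the sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE url_table ("
    " link_id TEXT PRIMARY KEY,"  # MD5 hex digest of link_url, unique by definition
    " link_url TEXT NOT NULL)"
)

def add_link(link_url):
    """Insert the URL if it is new; return (link_id, was_inserted)."""
    link_id = hashlib.md5(link_url.encode("utf-8")).hexdigest()
    # No prior SELECT: the unique key rejects duplicates and the insert is skipped.
    cur = conn.execute(
        "INSERT OR IGNORE INTO url_table (link_id, link_url) VALUES (?, ?)",
        (link_id, link_url),
    )
    return link_id, cur.rowcount == 1
```

The duplicate check becomes a single index lookup on a fixed-length key instead of a scan over the URL column.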
I have a table, where I store user uploaded files. There can be 5 different file types: profile picture, cpr file, degree file, video file, background check file.
Table structure is this:
file_id, user_id, file_type, file_size, file_name, file_new_name, file_path, file_cat, date_created
My questions:
Is this structure efficient or should I create 5 different tables?
If I would like to update, let's say, the user's profile picture row, what would be the best way to do it? I came up with a solution that is probably not the best one: I update the row where file_cat = "profile_picture" and user_id=:user_id. Would that put a lot of load on the system?
When a user first signs up, he doesn't have any files. Should I use insert into ... VALUES ... on duplicate key update with a hidden value in a form?
Thank you in advance.
This is three questions not one.
Is this structure efficient or should I create 5 different tables?
One table is good enough
If I would like to update, let's say, the user's profile picture row,
what would be the best way to do it? I came up with a solution
that is probably not the best one: I update the row where file_cat
= "profile_picture" and user_id=:user_id. Would that put a lot of load on the system?
Not if you have a composite index on (file_cat, user_id). If you want to make things a bit leaner, you can store integer constants instead of 'profile_picture' etc., e.g.
profile_picture = 1
cpr = 2
....
background = 6
This would make the tables and indexes a bit smaller. It might make the queries slightly faster.
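As a rough sketch of the composite index and integer category codes, assuming Python's built-in sqlite3 as a stand-in for MySQL. The table name user_files, the index name and the update_file helper are invented for the example, and only a few of the question's columns are included.

```python
import sqlite3

# Hypothetical numeric codes replacing the file_cat strings.
PROFILE_PICTURE = 1
CPR = 2

# In-memory SQLite stands in for MySQL here (an assumption for the sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_files ("
    " file_id INTEGER PRIMARY KEY,"
    " user_id INTEGER NOT NULL,"
    " file_cat INTEGER NOT NULL,"
    " file_name TEXT NOT NULL)"
)
# Composite index covering both columns used in the WHERE clause.
conn.execute("CREATE INDEX idx_cat_user ON user_files (file_cat, user_id)")

def update_file(user_id, file_cat, file_name):
    """Update the matching row(s) and return how many rows changed."""
    cur = conn.execute(
        "UPDATE user_files SET file_name = ? WHERE file_cat = ? AND user_id = ?",
        (file_name, file_cat, user_id),
    )
    return cur.rowcount
```

A return value of 0 tells the caller that the user has no file of that category yet, so an insert is needed instead.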
When a user first signs up, he doesn't have any files. Should I use
insert into ... VALUES ... on duplicate key update with a hidden value
in a form?
No need for that. Not having a record for new users actually makes things easier. You can do a COUNT(*) = 0 query or, better still, an EXISTS query, without having to fetch rows and examine them.
Update:
These EXISTS queries are really useful when you are dealing with JOINs or subqueries, for example to quickly find whether a user has uploaded a profile picture:
SELECT * from users WHERE exists (SELECT * from pictures where pictures.user_id = users.id)
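The EXISTS query above can be tried out with a small sketch like this one. It assumes Python's built-in sqlite3 as a stand-in for MySQL; the sample rows and the helper names has_picture and users_with_pictures are made up.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for the sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE pictures (id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL);
INSERT INTO users (id, name) VALUES (1, 'alice'), (2, 'bob');
INSERT INTO pictures (user_id) VALUES (1);
""")

def has_picture(user_id):
    # EXISTS yields 1 or 0 and can stop at the first matching row.
    (found,) = conn.execute(
        "SELECT EXISTS (SELECT 1 FROM pictures WHERE user_id = ?)",
        (user_id,),
    ).fetchone()
    return bool(found)

def users_with_pictures():
    rows = conn.execute(
        "SELECT name FROM users"
        " WHERE EXISTS (SELECT 1 FROM pictures WHERE pictures.user_id = users.id)"
        " ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]
```

A new user simply has no rows in pictures, so no placeholder record is required.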
If you use the primary key properly then your insert ... on duplicate key update ... query will do everything for you.
For your table you need to define a primary key column. In this case I would say it is your file_id column. When you do your insert, the MySQL server checks whether a row with that file_id value already exists. If so, it updates that row with the new values; otherwise it adds a new row with the new file_id.
It should be easy enough to separate the two, though: make one script for creating new rows and another for updating. Usually you will know whether you are creating or updating in an application. Again, using a primary key correctly will help you out a lot. Using a primary key in your WHERE clause is, I am pretty sure, one of the most efficient ways to update.
https://dev.mysql.com/doc/refman/5.5/en/optimizing-primary-keys.html
I'm trying to build a very simple login system for my site (just as practice for a project I'm working on). The way I've decided to implement it is to use a table with fields for ID, name, password, and username, and to search for the entered information in the existing table.
For registration, it simply inserts the supplied information into the table, and I would like to assign a customer ID number. My idea for assigning an ID number is to find the size of the ID column (which will contain the IDs 1, 2, 3, etc. up to the end) and assign the new registration the length + 1. For this purpose I'll need a way to get the size of the column, but I'm just learning PHP and SQL, so I'm not sure what the syntax would be.
TL;DR: is there a function in SQL that I can use in PHP to get the length of a particular column (i.e. the number of entries stored in that column)?
Set the ID column to be the primary key with AUTO_INCREMENT.
You don't include it in your INSERT query; the value is created on its own.
You'd probably be better off just using an IDENTITY or AUTO_INCREMENT column. The problem with checking for the "size of the column" (by which I assume you mean the count of rows in that column) is that you could end up inserting duplicate IDs, for example:
ID | ...
---------
1
2
4
So if you did a SELECT COUNT(ID)+1 FROM MyTable, it would return 4, and you have an ID collision.
You could do something like SELECT MAX(ID)+1 FROM MyTable, but even then there could be concurrency problems (process A and process B both run that query at the same time, before either has a chance to insert the new ID of 5). You're really best off just letting your RDBMS take care of it.
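The collision described above can be reproduced with a short sketch, assuming Python's built-in sqlite3 as a stand-in for MySQL. SQLite spells the feature AUTOINCREMENT where MySQL uses AUTO_INCREMENT, and the customers table and helper names are invented for the example.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for the sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " username TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO customers (username) VALUES (?)",
    [("a",), ("b",), ("c",), ("d",)],
)
conn.execute("DELETE FROM customers WHERE id = 3")  # leaves IDs 1, 2, 4

def count_plus_one():
    # The hand-rolled approach: with IDs 1, 2, 4 this yields an ID already in use.
    (n,) = conn.execute("SELECT COUNT(id) + 1 FROM customers").fetchone()
    return n

def register(username):
    # The ID is omitted from the INSERT; the database assigns the next value.
    cur = conn.execute("INSERT INTO customers (username) VALUES (?)", (username,))
    return cur.lastrowid
```

Here count_plus_one() proposes an ID that is still taken, while register() lets the database hand out a fresh one.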
I am storing user ID values in a table field separated by a | (user_id1|user_id2|user_id3|user_id17).
A user ID will be added and removed from this field at certain points.
How can I check if the current users ID exists in the field or not using a query?
And it of course needs to be an exact match. Can't look for user_id1 and find user_id17.
I know I could use a SELECT query, explode the field, then use in_array but if there's a way to do it using a query it'd be better.
I guess I'll explain what I am doing: I made a forum for a small private website (7 users), but I am coding it for a larger scale.
My table structure is pretty good: forum_categories, forum_topics, forum_posts. I use foreign keys between the tables for delete and update queries.
What I am seeking help on is marking topics as unread for each user. I could create a new table with topic_id and user_id, each pair being a new row, but that wouldn't be good with a lot of users and topics.
If somebody has a better solution, I am all for it. Or, if you can prove to me that one row per user_id is the best way, then I'll be more than willing to do that.
I think you want to track read messages, not the other way around. If you tracked unread messages, every time you added a user you would have to add that user to every topic's "unread list".
I looked into SMF, as my comment suggested. They are using a separate table to track read messages.
A simple table that holds user_id and topic_id is all you need. When a user reads a topic, make sure there is a row in the table for that user and topic.
Another reason to use a separate table: it's going to be faster to query against two int values in the database than to use LIKE with % wildcards.
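A minimal sketch of such a read-tracking table, assuming Python's built-in sqlite3 as a stand-in for MySQL. The table name topic_reads, the sample topics and the helper names are hypothetical; forum_topics comes from the question.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for the sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE forum_topics (topic_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE topic_reads (
    user_id  INTEGER NOT NULL,
    topic_id INTEGER NOT NULL,
    PRIMARY KEY (user_id, topic_id)
);
INSERT INTO forum_topics (topic_id, title)
VALUES (1, 'Welcome'), (2, 'Rules'), (17, 'Off topic');
""")

def mark_read(user_id, topic_id):
    # Idempotent: reading a topic twice still leaves a single row.
    conn.execute(
        "INSERT OR IGNORE INTO topic_reads (user_id, topic_id) VALUES (?, ?)",
        (user_id, topic_id),
    )

def unread_topics(user_id):
    # A topic is unread when no (user_id, topic_id) row exists for it.
    rows = conn.execute(
        "SELECT t.topic_id FROM forum_topics t"
        " WHERE NOT EXISTS (SELECT 1 FROM topic_reads r"
        "                   WHERE r.user_id = ? AND r.topic_id = t.topic_id)"
        " ORDER BY t.topic_id",
        (user_id,),
    ).fetchall()
    return [topic_id for (topic_id,) in rows]
```

Because the comparison is on whole integers, topic 1 and topic 17 can never be confused, and a new user needs no rows at all to see everything as unread.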
Hi, my question is how to get the first number that is not used in a specific database column. The number must be between 1 and 9999 and must be compared with all numbers in that column, so if the data in my column starts with 5, I want to be able to get the first number that is not used, in this case the number 1. Then, when I create data with number 1, the next number I need to get is 2, and so on. I'm using this to create profiles; the number is the profile number, and every new profile must have the first unused number in the database. How do I do that? I don't know where to start. Could someone put me on the right path to a solution? Thanks.
Edit
But I don't need auto increment. I need the user to be able to choose this number on his own. First, the first free number must be suggested to the user by placing it in the text form. If the user selects a number that is already in the database, my program will let the user know that he is trying to select a number that already exists. I know the basics of MySQL. The problem comes when the user deletes a profile: the deleted number can't be used any more. For that I need a function that returns the first free unused number.
New edit
I'll try to clear up some details. First, this is a program for human resources, and the user creates dossiers for workers. When the user is creating a new dossier, he needs to select the dossier number for this worker, and I need to suggest to the user the first unused number for the new dossier. The dossier number is not the dossier 'id'. The dossier number must be selected manually by the user, or he can let the first free number be given to the new dossier. I think this will clear some things up.
You are probably talking about the auto-increment primary key of table rows. Just insert the data without specifying this "number", and the database will automatically set it to the proper (next free) value.
Do not reuse primary keys (eg you have 1,2,3,4,5 but then delete 3 - if you reuse 3 you will not know at any future point that 3 was some other record that was actually deleted).
This, btw, is very basic database knowledge. Read some introduction tutorials on MySQL or any other SQL relational database.
You are trying to misuse the database.
Maybe you can look at this: Finding the next available id in MySQL
First create a table with values 1 to 9999. Then, run this query once:
delete from table where id IN (select id from profiles)
This way, you get IDs that are not in the profiles table. The first one can be shown to the user. On saving the record, make sure to delete that ID from this table.
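The free-number table described above could look roughly like this. The sketch assumes Python's built-in sqlite3 as a stand-in for MySQL; the table name free_ids, the sample profile numbers and the helpers suggest_number and create_profile are invented for the example.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for the sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE profiles (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE free_ids (id INTEGER PRIMARY KEY)")

# Sample data: profile numbers 2, 3 and 5 are already taken.
conn.executemany("INSERT INTO profiles (id) VALUES (?)", [(2,), (3,), (5,)])
# One-time setup: fill 1..9999, then remove the numbers already in use.
conn.executemany("INSERT INTO free_ids (id) VALUES (?)", [(n,) for n in range(1, 10000)])
conn.execute("DELETE FROM free_ids WHERE id IN (SELECT id FROM profiles)")

def suggest_number():
    """Return the smallest unused number, or None if all are taken."""
    (n,) = conn.execute("SELECT MIN(id) FROM free_ids").fetchone()
    return n

def create_profile(number):
    """Claim the number if it is still free; return False if it is taken."""
    with conn:  # both statements commit together or not at all
        cur = conn.execute("DELETE FROM free_ids WHERE id = ?", (number,))
        if cur.rowcount == 0:
            return False
        conn.execute("INSERT INTO profiles (id) VALUES (?)", (number,))
    return True
```

When a profile is deleted, its number would have to be inserted back into free_ids so it can be suggested again.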
If I understood you correctly, this is what you are looking for.
If you are limited to using values 1 through 9999, I would probably set up the process as follows:
Add another table, id_tracker, with two columns (id and is_used).
Populate id_tracker with IDs 1 through 9999, defaulting is_used to 0.
Update id_tracker.is_used to 1 based on the contents of your table.
Add delete and insert triggers to your table to update id_tracker as necessary.
Select the first empty ID as follows: SELECT id FROM id_tracker WHERE is_used = 0 ORDER BY id LIMIT 1
Here's some SQL to get you started:
create table id_tracker
(id int not null, is_used tinyint default 0, primary key (id));
delimiter |
CREATE TRIGGER your_table_delete_trigger BEFORE DELETE ON your_table
FOR EACH ROW
BEGIN
UPDATE id_tracker SET is_used = 0 WHERE id = OLD.your_table_id;
END;
|
CREATE TRIGGER your_table_insert_trigger AFTER INSERT ON your_table
FOR EACH ROW
BEGIN
UPDATE id_tracker SET is_used = 1 WHERE id = NEW.your_table_id;
END;
|
delimiter ;
** NOTE: the above is for MySQL
Okay, so let's say I have a mysql database table with two columns, one is for id and the other is for password. If I have three rows of data and the id values go from 1 to 3 and I delete row 3 and then create another row of data, I will see id=4 instead of id=3 on the newly created row. I know this has to do with the auto increment value but I was wondering if I can add some code in a php file that will automatically reset all the id numbers such that you start at id=1 and go up to the last id number in increments of 1 after a row has been deleted?
My goal is to create a form where the user enters a password and the system will match the password with a password value in the database. If there is a match, the row with the matched password will be deleted and the column of id numbers will be reordered such that no id numbers are skipped.
Update: I'm making a rotating banner ad system by assigning a random number from 1 to 4 to a variable, so that the PHP file retrieves a random ad from id=1 to id=4 using that variable. If the random number happens to be 3 and id=3 does not exist, there will be a gap in the row of banner ads. If there is a way to work around big gaps in this situation, please tell me. Thanks in advance.
Just execute the following SQL query:
ALTER TABLE `tbl_name` AUTO_INCREMENT = 1;
…but it sounds like a terrible idea, so don't do it. Why is the value of your primary key so important? Uniqueness is far more important, and resetting it undermines that.
You can only use
ALTER TABLE `tbl` AUTO_INCREMENT = #
to reset to a number above the highest value number. If you have 1, 2, 3, and you delete 2, you cannot use this to fill 2. If you delete 3, you could use this to re-use 3 (assuming you haven't put anything higher). That is the best you can do.
ALTER TABLE `table` AUTO_INCREMENT = 1;
However, running this code is not the best idea. There is something wrong with your application if you depend on the column having no gaps. Are you trying to count the number of users? If so, use COUNT(id). Are you trying to deal with other tables? If so, use a foreign key.
If you are dead set on doing this the Wrong Way, you could look for the lowest free number and do the incrementing on your own. Keep in mind the race conditions involved, however.
Also, keep in mind that if you change the actual numbers in the database you will need to change all references to it in other tables and in your code.
Well, you can actually just specify the id number you'd like a record to have as part of your insert statement, for example:
INSERT INTO person VALUES(1,'John','Smith','jsmith#devnull.fake','+19995559999');
And if there's not a primary key collision (no record in the database with id=1), then MySQL will happily execute it.
The ALTER TABLE `tbl` AUTO_INCREMENT = # approach also works, and means you don't have to keep track of the counter.
While you're thinking about this, though, you might want to read some of the discussion on natural vs surrogate keys. The idea of having your id # be specifically important is a bit unusual and might be a sign of a troubled design.
You could do that by:
Inventing a mechanism that provides the next available id when you want to insert (e.g. a transaction involving reading and incrementing an integer column somewhere -- pay special attention to the transaction isolation level!)
Using UPDATE to decrement all ids greater than the one you just deleted (again, with a transaction -- don't forget that foreign keys must be ON UPDATE CASCADE!)
But it begs the question: why do you want this? Is it going to be worth the trouble?
It's almost certain that you can achieve whatever your goal is without such witchery.
Update (to address comment):
To select a random number of rows, you can do e.g. in MySQL
SELECT id FROM banners ORDER BY RAND() LIMIT 5
to select 5 random, guaranteed existing banner ids.
A word of caution: quite a few people view ORDER BY RAND() as a performance hog. However, it is IMHO not quite right to put every case in the same basket. If the number of rows in the table is manageable (I would consider anything below 10K to be not that many), then ORDER BY RAND() provides a very nice and succinct solution. Also, the documentation itself suggests this approach:
However, you can retrieve rows in random order like this:
mysql> SELECT * FROM tbl_name ORDER BY RAND();
ORDER BY RAND() combined with LIMIT is useful for selecting a random sample from a set of rows:
mysql> SELECT * FROM table1, table2 WHERE a=b AND c<d ORDER BY RAND() LIMIT 1000;
RAND() is not meant to be a perfect random generator. It is a fast way to generate random numbers on demand that is portable between platforms for the same MySQL version.
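For the banner case with gaps in the IDs, the random-order query can be sketched as below. It assumes Python's built-in sqlite3 as a stand-in for MySQL, where the function is spelled RANDOM() instead of RAND(); the image column, the sample rows and the random_banner_ids helper are made up.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for the sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE banners (id INTEGER PRIMARY KEY, image TEXT NOT NULL)")
# id 3 is deliberately missing to mimic a deleted banner.
conn.executemany(
    "INSERT INTO banners (id, image) VALUES (?, ?)",
    [(1, "a.png"), (2, "b.png"), (4, "d.png")],
)

def random_banner_ids(limit):
    # The database picks among rows that actually exist, so gaps do not matter.
    # MySQL equivalent: SELECT id FROM banners ORDER BY RAND() LIMIT ?
    rows = conn.execute(
        "SELECT id FROM banners ORDER BY RANDOM() LIMIT ?",
        (limit,),
    ).fetchall()
    return [banner_id for (banner_id,) in rows]
```

Calling random_banner_ids(1) returns one existing banner ID on each page load, with no need to renumber the table after deletions.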