Restructured database, using SQL in phpMyAdmin to move data around - PHP

I've recently been working on normalizing and restructuring my database to make it more effective in the long run. Currently I have around 500 records, and obviously I don't want to lose the users' data.
I assume SQL through phpmyadmin is the easiest way to do this?
So let me give you guys an example.
In my old table I would have something like this:
records   //this table has misc fields, but they are unimportant right now
id | unit
1  | g
With my new one, I have it split across 3 different tables.
records
id
1

units
id | unit
1  | g

record_units
id | record_id | unit_id
1  | 1         | 1
Just to be clear, I am not adding anything into the units table. The table is there as a reference for which id to store in the record_units table.
As you can see it's pretty simple. What changed in the new design is that I started using a lookup table to hold my units, since they would be repeated quite often. I then store that unit id, and the paired record id, in the record_units table so I can later retrieve the fields.
I am not incredibly experienced with SQL, though I'd say my knowledge is average. I know this operation would be quite simple to do with my CakePHP setup, because all my associations are already set up, but I can't do that.
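For reference, here is a rough sketch of how the two new tables could be defined (the column types and foreign keys are just illustrative, not my exact schema):
CREATE TABLE units (
  id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  unit VARCHAR(16) NOT NULL
) ENGINE=InnoDB;

CREATE TABLE record_units (
  id        INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  record_id INT UNSIGNED NOT NULL,
  unit_id   INT UNSIGNED NOT NULL,
  FOREIGN KEY (record_id) REFERENCES records (id),
  FOREIGN KEY (unit_id)   REFERENCES units (id)
) ENGINE=InnoDB;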

If I understand correctly you want to copy related records from your old table to the new tables, in which case you can use something like this:
-- Copy each record's unit text into the units row with the matching id
UPDATE units u
INNER JOIN records r ON u.id = r.id
SET u.unit = r.unit;
This will copy the unit type from your old table to the matching id in the new units table, and then you can do something similar for your third table.
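Alternatively, if the new units and record_units tables start out empty, something along these lines should fill them from the old data (a sketch, assuming units.id is AUTO_INCREMENT and records still has its old unit column):
-- 1. Fill the units lookup table with every distinct unit value
INSERT INTO units (unit)
SELECT DISTINCT unit FROM records;

-- 2. Build the join table by matching each record's old unit text
--    to its new id in units
INSERT INTO record_units (record_id, unit_id)
SELECT r.id, u.id
FROM records r
INNER JOIN units u ON u.unit = r.unit;

-- 3. Once the copy is verified, drop the now-redundant column
ALTER TABLE records DROP COLUMN unit;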

Related

PHP & MySQL performance - One big query vs. multiple small

For my MySQL tables I am using the InnoDB engine, and the structure of my tables looks like this:
Table user
id | username | etc...
----|------------|--------
1 | bruce | ...
2 | clark | ...
3 | tony | ...
Table user-emails
id | person_id | email
----|-------------|---------
1 | 1 | bruce@wayne-ent.com
2 | 1 | ceo@wayne-ent.com
3 | 2 | clark.k@daily-planet.com
To fetch data from the database I've written a tiny framework. E.g. on __construct($id) it checks if there is a person with the given id; if yes, it creates the corresponding model and saves only the field id to an array. During runtime, if I need another field from the model, it fetches only that value from the database, saves it to the array, and returns it. The same goes for the field emails: for that, my code accesses the table user-emails and gets all the emails for the corresponding user.
For small models this works alright, but now I am working on another project where I have to fetch a lot of data at once for a list, and that takes some time. Also I know that many connections to MySQL and many queries are quite stressful for the server, so...
My question now is: Should I fetch all data at once (with left joins etc.) while constructing the model and save the fields as an array or should I use some other method?
Why do people insist on referring to entities and domain objects as "models"?
Unless your entities are extremely large, I would populate the entire entity when you need it. And if "email list" is part of that entity, I would populate that too.
As I see it, the question is more related to "what to do with tables that are related by foreign keys".
Let's say you have Users and Articles tables, where each article has a specific owner associated via a user_id foreign key. In this case, when populating the Article entity, I would only retrieve the user_id value instead of pulling in all the information about the user.
But in your example with Users and UserEmails, the emails seem to be a part of the User entity, and something that you would often call via $user->getEmailList().
TL;DR
I would do this in two queries when populating the User entity:
select all you need from the Users table and apply it to the User entity
select all the user's emails from the UserEmails table and apply them to the User entity.
P.S.
You might want to look at the data mapper pattern for the "how" part.
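In plain SQL the two queries might look something like this (table and column names taken from the question; the id value 1 is just an example):
-- 1. the entity's own fields
SELECT id, username FROM `user` WHERE id = 1;

-- 2. the email list that belongs to the entity
SELECT email FROM `user-emails` WHERE person_id = 1;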
In my opinion you should fetch all your fields at once, and divide queries in a way that makes your code easier to read/manage.
When we're talking about one query or two, the difference is usually negligible unless the combined query (with JOINs or whatever) is overly complex. Usually an index or two is the solution to a very slow query.
If we're talking about one vs hundreds or thousands of queries, that's when the connection/transmission overhead becomes more significant, and reducing the number of queries can make an impact.
It seems that your framework suffers from premature optimization. You are hyper-concerned about fetching too many fields from a row, but why? Do you have thousands of columns or something?
The time-consuming part of your query is almost always the lookup, not the transmission of data. You are causing the database to do the "hard" part over and over again as you pull one field at a time.

PHP and MySQL. Data management for a classified ads website

A visitor opens a URL, for example
/transport/cars/audi/a6
or
/real-estate/flats/some-city/city-district
I plan a separate table for cars and for real estate (a separate table for each top-level category).
Based on the URL (exploded into an array with PHP):
$array[0] - from this value I know which table to SELECT from
And so on for $array[1], $array[2] ...
For example, the RealEstate table may look like:
IdOfAd | RealEstateType | Location1 | Location2 | TextOfAd | and so on
----------------------------------------------------------------------
1 | flat | City1 | CityDistric1 | text.. |
2 | land | City2 | CityDistric2 | text.. |
And mysql query to display ads would be like:
SELECT `TextOfAd`, `and so on...`
FROM RealEstate
WHERE RealEstateType = ? AND Location1 = ? AND Location2 = ?
-- and possibly additional AND .. AND
LIMIT $start, $limit
Thinking about performance. Hopefully, after some time, the number of active ads will be high (also, I plan not to delete expired ads, just change a column value to 0 so they are not displayed by the SELECT, but are still displayed on a direct visit from a search engine).
What do I need to do (change the database design, or SELECT in some other way) if, for example, the number of rows in the table grows to 100,000 or millions?
I am also thinking about moving expired ads to another table (for which performance is not important). For example, from a search engine a user goes to some URL with an expired ad. First select from the main table; if not found, then select from the table of expired ads. Is this some kind of solution?
Two hints:
Use ENGINE=InnoDB when creating your table. InnoDB uses row-level locking, which is MUCH better for bigger tables, as this allows rows to be read much faster, even when you're updating some of them.
ADD INDEX on suitable columns. Indexing big tables can reduce search times by several orders of magnitude. They're easy to forget and a pain to debug! More than once I've been investigating a slow query, realised I forgot a suitable INDEX, added it, and had immediate results on a query that used to take 15 seconds to run.
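Putting both hints together, a rough sketch for the RealEstate table above might look like this (the column types, the Active flag, and the index name are illustrative assumptions):
CREATE TABLE RealEstate (
    IdOfAd         INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    RealEstateType VARCHAR(32) NOT NULL,
    Location1      VARCHAR(64) NOT NULL,
    Location2      VARCHAR(64) NOT NULL,
    TextOfAd       TEXT,
    Active         TINYINT(1) NOT NULL DEFAULT 1  -- 0 = expired, per the question
) ENGINE=InnoDB;

-- A composite index covering the columns the WHERE clause filters on
ALTER TABLE RealEstate
    ADD INDEX idx_type_loc (RealEstateType, Location1, Location2);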

Multiple rows with the same content, make them count as "one"

About
I have this table in my database holding some information saved with a user id and time.
| CONTENT | USER ID | TIME |
| text | 1 | 1405085592 |
| hello | 2 | 1405085683 |
| hey | 1 | 1405086953 |
This example could be a data dump from my database. Now, as you can count, there are three rows. However, I only need to know how many users have some information in my database. Therefore the result I'm really looking for is "two", because only two users have information in the database. User ID 1 owns both "text" (1) and "hey" (3), while user ID 2 has "hello" (2).
In short
I want to count how many users (regardless how many rows of information they have) there are inside my database.
What I tried
I tried to fetch every single row into an array and then use array_unique to count them together. It works fine, but I do not see this as the cleanest or best way to do it.
Then what?
I could use array_unique and just use count to see how many rows there are, but I'm looking for something cleaner. I tried to search for this, but I'm not actually sure what I should search for in order to find what I'm looking for. After being stuck, and since I wanted to learn something new, I decided to post this problem here.
Note
I hope you guys can help me. I have tried to make it clear what I'm looking for and what I tried; if not, please let me know. Sorry if some of the above contains misspelled words, incorrect grammar, or is badly explained. I do not speak English daily, but I try my best.
You are looking for the DISTINCT keyword. Combined with COUNT, it returns the number of unique values in a column:
SELECT COUNT(DISTINCT user_id)
FROM your_table
With the sample data above, this query returns 2.
This query:
SELECT DISTINCT user_id FROM table
will return just one row for every user in the table.

MySQL and PHP - multiple rows or concatenated into 1 row - which is quicker?

When storing relationship data for a user (potentially a thousand friends per user), would it be faster to create a new row for each relationship, or to concatenate all of their friends into a string and then parse that later?
I.e.
Primary id | Friend1ID | Friend2ID
1          | 234       | 5789
2          | 5789      | 234
Where the IDs are references to primary IDs in a 'Users' table.
Or for the 'Users' table to just have a column called friends which may look like this:
Primary id | FriendIDs
234        | 5789.123.8474
5789       | 234
I'm of the understanding that string concatenation and parsing is generally quite slow, so I'd be tempted to lean towards the first method. However, as the number of users grows, this becomes a case of selecting one row and parsing it vs. searching millions of rows for rows which match the WHERE criteria.
Is one method distinctly faster than the other? Particularly as the number of users grows.
You should use a second table to store the friends.
Users Table
----------
userid | username
1 | Bob
2 | Mike
3 | John
Users Friends Table
--------------------
userid | friend_id
1 | 2
3 | 2
Here you can see that Mike is friends with both Bob and John... This is of course a very simple demonstration.
Your second option will not scale. Some people may have hundreds of thousands of friends, and storing each id in a single field is going to cause a headache further down the line: adding friends, removing friends, working out complex relationships between people. Lots of overhead.
Querying millions of records with a WHERE clause on a properly indexed table should take no more than a second; the first option is the better one.
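A minimal sketch of such a friends table (the table name, column types, and constraints here are illustrative assumptions; the composite primary key is the important part):
CREATE TABLE users_friends (
    userid    INT UNSIGNED NOT NULL,
    friend_id INT UNSIGNED NOT NULL,
    PRIMARY KEY (userid, friend_id),
    FOREIGN KEY (userid)    REFERENCES users (userid),
    FOREIGN KEY (friend_id) REFERENCES users (userid)
) ENGINE=InnoDB;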
The "correct" way would probably be keeping multiple rows. This allows for much easier statistical analysis and more complex queries (like friends of friends) without any hacky stuff. Integer storage size is also often smaller than string storage, even though you're repeating one ID - especially if you use an appropriately sized integer store (like mediumint).
It's also more maintainable, scalable (if they start getting a damn lot of friends), and easier to export and import. The speed gain from concatenation, if any, wouldn't be worth the rest of the benefits.
If you wanted, for instance, to search whether Bob is a friend of Jane, this would be a single row lookup in the multiple-row implementation. In the single-row implementation you'd have to get Bob's row, decode the field, and loop through it looking for Jane. DBMS optimisation and indexing make the multiple-row implementation much faster in this case; if you had the primary key as (id, friendid) then it'd be pretty much instantaneous, as the table would probably be hashed on that key.
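For example, using the sketch table above (Bob's id 234 comes from the question; Jane's id of 123 is a made-up value):
-- A single indexed lookup on the composite primary key
SELECT 1
FROM users_friends
WHERE userid = 234      -- Bob
  AND friend_id = 123;  -- Jane (hypothetical id)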
I believe the proper way to do it, which might also be faster, is to use a two-column table:
user | friend
1    | 2
1    | 3
It is simple, it will make querying and updating much easier, and you can have as many relationships as you want.
Don't overcomplicate the problem...
... Asking for the more "correct" way is the wrong question in itself.
It depends on the case.
If your web application has a low access rate, having more rows won't change anything. On the other side of the coin (I'm not a native English speaker), for medium and large applications it's maybe better to keep the number of database accesses to the minimum possible.
To obtain this, as you've already thought, you can concatenate the values, split them when the user logs in, and then put everything into the $_SESSION superglobal.
At least this is what I think.

How to create a query that retrieves data from several tables when there is no relationship between the tables

As described in the title:
How do I create a query that retrieves data from several tables when there is no relationship between the tables?
Example:
I have two table like the following:
Table (Categories)
------------------
Cat_id | cat_name
------------------
1      | Animals
2      | Nature
so on  | so on

Table (Pic_files)
------------------
pic_id | pic_title
------------------
1      | Dog_Walks
2      | red_flower
3      | blue_flower
so on  | so on
What I want is the next two queries combined into one query.
Query 1:
("SELECT cat_name FROM Categories WHERE Cat_id='2'")
Query 2:
("SELECT pic_title FROM Pic_files WHERE Cat_id='2' LIMIT 5 ")
And if I want to print out the data of a specific table, I do this:
foreach ($data as $pic) {
    echo $pic['pic_title']; // to get the data from the Pic_files table
}
and also this:
foreach ($data as $cat) {
    echo $cat['cat_name']; // to get the data from the Categories table
}
In short, I want to combine 2 or more queries into one query and work with the results by column name, and don't forget that the tables don't have any relationship to each other.
You can do this with a cross join:
select *
from (SELECT cat_name FROM Categories WHERE Cat_id = '2') cat cross join
     (SELECT pic_title FROM Pic_files WHERE Cat_id = '2' LIMIT 5) t
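Because the first subquery returns a single row and the second returns up to five, the combined result has up to five rows, each carrying both cat_name and pic_title, so both foreach loops from the question can read their columns from this one result set.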
I think what folks are struggling with here, in giving you an answer, is understanding what you are looking for.
There are certainly PHP functions which will allow you to execute multiple SQL queries with a single call to the database; see mysqli_multi_query as an example. But it doesn't sound like this is what you are looking for. It seems you want a magical way to execute multiple queries on unrelated tables as a single query and then somehow be able to get at the sets of data independently. If you do things like Cartesian joins on unrelated tables, you can end up with orders of magnitude more rows in your result set than what you are looking for, introducing performance problems on both the database and the application.
It sounds like you want to reduce the number of queries for the sake of reducing the number of queries, which sounds like a bad design approach IMO. Certainly you can minimize queries on RELATED tables using joins. But trying to somehow JOIN unrelated tables will probably introduce more complexity and performance problems into your system than you are bargaining for.
Reducing queries on the database is not the be-all and end-all of database performance optimization. You need to make sure you have tables properly indexed for your queries, appropriate data types and sizes for the data you are storing, enough memory allocated to keep your indexes in memory, and a properly configured server that meets your memory, disk capacity, and disk I/O requirements.
Once you have all that maximally optimized, and your system is maxed out, and you have determined that the problem on your system is having two different queries to unrelated tables instead of one combined query, then you should worry about what you are worrying about now.
