Combine Multiple Rows in MySQL into JSON or Serialize - php

I currently have a database structure for dynamic forms as such:
grants_app_id  user_id  field_name  field_value
5              42434    full_name   John Doe
5              42434    title       Programmer
5              42434    email       example#example.com
I found this very difficult to manage, and it filled up the number of rows in the database very quickly. A form can have up to 78 different field_names, so it proved very costly to update the field_values or simply search them. I would like to combine the rows and use either JSON or PHP serialize to greatly reduce the impact on the database. Does anyone have any advice on how I should approach this? Thank you!
This would be the expected output:
grants_app_id  user_id  data
5              42434    {"full_name":"John Doe", "title":"Programmer", "email":"example#example.com"}

It seems you don't have a simple primary key in those rows.
Speeding up the current solution:
create an index for (grants_app_id, user_id)
add an auto-incrementing primary key
switch from field_name to field_id
The index will make retrieving full forms a lot more fun (while taking a bit of extra time on insert).
The primary key allows you to update a row by specifying a single value backed by a unique index, which should generally be really fast.
You probably already have some definition of fields. Add integer IDs and use them to speed up the process, as less data is stored, compared, indexed, ...
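The three changes above can be sketched roughly as follows. This is only an illustration in Python against an in-memory SQLite database so that it runs on its own; the names form_fields, form_values and idx_app_user are assumptions, and in MySQL the surrogate key would be declared with AUTO_INCREMENT instead of SQLite's INTEGER PRIMARY KEY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical lookup table: one row per known form field.
    CREATE TABLE form_fields (
        field_id   INTEGER PRIMARY KEY,
        field_name TEXT NOT NULL UNIQUE
    );
    -- One row per (application, user, field), now with a surrogate key.
    CREATE TABLE form_values (
        id            INTEGER PRIMARY KEY,
        grants_app_id INTEGER NOT NULL,
        user_id       INTEGER NOT NULL,
        field_id      INTEGER NOT NULL REFERENCES form_fields(field_id),
        field_value   TEXT
    );
    -- Composite index so a whole form can be located in one lookup.
    CREATE INDEX idx_app_user ON form_values (grants_app_id, user_id);
""")

fields = ["full_name", "title", "email"]
conn.executemany("INSERT INTO form_fields (field_name) VALUES (?)",
                 [(name,) for name in fields])
field_ids = dict(conn.execute("SELECT field_name, field_id FROM form_fields"))

sample = {"full_name": "John Doe", "title": "Programmer",
          "email": "example#example.com"}
conn.executemany(
    "INSERT INTO form_values (grants_app_id, user_id, field_id, field_value) "
    "VALUES (5, 42434, ?, ?)",
    [(field_ids[name], value) for name, value in sample.items()],
)

def load_form(app_id, user_id):
    """Fetch a full form as {field_name: field_value}."""
    rows = conn.execute(
        """SELECT f.field_name, v.field_value
           FROM form_values v
           JOIN form_fields f ON f.field_id = v.field_id
           WHERE v.grants_app_id = ? AND v.user_id = ?""",
        (app_id, user_id),
    )
    return dict(rows)

def update_value(row_id, new_value):
    """Update exactly one row, addressed by its primary key."""
    conn.execute("UPDATE form_values SET field_value = ? WHERE id = ?",
                 (new_value, row_id))
```

load_form(5, 42434) returns the whole form through the composite index, while update_value touches a single row by its key.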
Switching to a JSON-Encoded variant
Converting arrays to JSON and back can be done by using json_encode and json_decode since PHP 5.2.
How can you switch to JSON?
Possibly the current best way would be to use a PHP-Script (or similar) to retrieve all data from the old table, group it correctly and insert it into a fresh table. Afterwards you may switch names, ... This is an offline approach.
An alternative would be to add a new column and indicate by field_name=NULL that the new column contains the data. Afterwards you are free to convert data at any time or store only new data as JSON.
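The offline approach could look roughly like this. It is a sketch in Python with an in-memory SQLite database standing in for MySQL; the table names form_rows and form_json are made up for the example, and json.dumps plays the role that json_encode would play in a PHP script.

```python
import json
import sqlite3
from itertools import groupby

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Old layout: one row per field.
    CREATE TABLE form_rows (
        grants_app_id INTEGER, user_id INTEGER,
        field_name TEXT, field_value TEXT
    );
    -- New layout (hypothetical name): one row per form, fields as JSON text.
    CREATE TABLE form_json (
        grants_app_id INTEGER, user_id INTEGER, data TEXT,
        PRIMARY KEY (grants_app_id, user_id)
    );
""")
conn.executemany("INSERT INTO form_rows VALUES (?, ?, ?, ?)", [
    (5, 42434, "full_name", "John Doe"),
    (5, 42434, "title", "Programmer"),
    (5, 42434, "email", "example#example.com"),
])

def migrate(conn):
    """Group the old rows by (grants_app_id, user_id) and store one JSON row each."""
    rows = conn.execute(
        "SELECT grants_app_id, user_id, field_name, field_value "
        "FROM form_rows ORDER BY grants_app_id, user_id"
    ).fetchall()
    # groupby needs its input sorted by the grouping key, hence the ORDER BY.
    for (app_id, user_id), group in groupby(rows, key=lambda r: (r[0], r[1])):
        data = {name: value for _, _, name, value in group}
        conn.execute("INSERT INTO form_json VALUES (?, ?, ?)",
                     (app_id, user_id, json.dumps(data)))

migrate(conn)
```

Once the new table has been checked, the tables can be renamed as described above.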
Use JSON?
While it is certainly tempting to have all data in one row, there are some things to remember:
With all fields stored in a single text column, searching for a value inside one field may become a two-phase approach, because a % inside any LIKE pattern can skip into another field's value. Also, LIKE '%field:value%' is not easily optimized by indexing the column.
Changing a single field means rewriting all stored fields. As long as you are sure only one process changes the data at any given time this is OK; otherwise concurrent updates can overwrite each other.
The JSON column needs to be big enough to hold field names + values + separators. This can be a lot. Also, if you miscalculate and one field holds a long value, the data is truncated, with the risk of losing all information in every field after the long value.
So in your case, even with 78 different fields, it may still be better to have one row per form, user and field. (It may even turn out that JSON is more practical for forms with few fields.)
As explained in this question you have to remember that JSON is only some other text to MySQL.
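To illustrate the point about LIKE skipping into other fields, here is a small sketch of the two-phase search. It uses Python with an in-memory SQLite table (form_json and find_by_title are hypothetical names), and it assumes the JSON was written with the default separators of json.dumps.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE form_json (user_id INTEGER PRIMARY KEY, data TEXT)")
forms = {
    1: {"title": "Senior Programmer", "full_name": "John Doe"},
    2: {"title": "Manager", "full_name": "Ann Programmer"},
}
conn.executemany("INSERT INTO form_json VALUES (?, ?)",
                 [(uid, json.dumps(form)) for uid, form in forms.items()])

def find_by_title(conn, needle):
    """Return (candidate ids from LIKE, ids confirmed after decoding).

    Note: needle is not escaped, so % or _ inside it act as wildcards.
    """
    # Phase 1: coarse text match. The second % may run past the end of the
    # title value and match text that belongs to a later field.
    pattern = f'%"title": "%{needle}%'
    candidates = conn.execute(
        "SELECT user_id, data FROM form_json WHERE data LIKE ? ORDER BY user_id",
        (pattern,),
    ).fetchall()
    # Phase 2: decode each candidate and check the actual field.
    confirmed = [uid for uid, data in candidates
                 if needle in json.loads(data).get("title", "")]
    return [uid for uid, _ in candidates], confirmed

candidates, confirmed = find_by_title(conn, "Programmer")
```

Here the LIKE phase returns both users, because for user 2 the wildcard runs from the title into full_name; only the decode step narrows the result to user 1.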

Related

Column varchar and issue about index or fulltext?

Well, I have a varchar column for the password in my table, and in some scripts I run queries like:
length(column_varchar) < 10
My question is: if I put an index on this column, will it help? Or should I use fulltext in this case? Or do I not need an index?
Another question: do I need an index on every column that will be used in a 'where'?
Thanks in advance.
Indexes are used to index content (the field value), not the length of the field, therefore no index can help in the above query. (N.B. you could have a separate field that holds the content length and index that separate field.) Also, the password should be stored in a hashed format, so all password lengths should be the same, or at least should not be a criterion for selection.
No, you should not index all columns that will be used in a where clause. Selecting the optimal index structure is a complicated and very broad topic. Always consider the following points when trying to determine which fields (or combinations of fields) to index:
Indexes speed up selects, but slow down data modification, since you have to update the index as well, not just the column's value.
MySQL usually uses only 1 index per table in a query (the index merge optimization is an exception).
MySQL uses the selectivity of the indexes to determine which one to use. A field that can have 2 values only (yes / no, true / false) is not selective enough, so do not trouble yourself with indexing it.
Always use the explain command to check which indexes your queries use.
You've got two questions here; in general you should split questions up.
Anyway, the first: "Will indexing a column help where you are doing a test for length?"
No, it won't. The only way you could improve the performance here would be to have an additional column that holds the length of the value in column_varchar and index that.
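A rough sketch of that extra column, in Python with an in-memory SQLite table so it can be run as is. The accounts table, the value_length column and the index name are assumptions made for the example; note also that Python's len() counts characters, which corresponds to CHAR_LENGTH() in MySQL rather than the byte-counting LENGTH().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (
        id             INTEGER PRIMARY KEY,
        column_varchar TEXT NOT NULL,
        value_length   INTEGER NOT NULL   -- hypothetical helper column
    );
    -- The index is on the stored length, not on the text itself.
    CREATE INDEX idx_value_length ON accounts (value_length);
""")

def insert_account(value):
    # The application is responsible for keeping value_length in sync.
    conn.execute(
        "INSERT INTO accounts (column_varchar, value_length) VALUES (?, ?)",
        (value, len(value)),
    )

for value in ["short", "exactly10c", "a much longer value"]:
    insert_account(value)

# This condition compares a plain indexed column, so an index can serve it.
short_ids = [row[0] for row in conn.execute(
    "SELECT id FROM accounts WHERE value_length < 10 ORDER BY id")]
```

The cost is that every write has to maintain the helper column as well as its index.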
You wrote in comments that you are holding hashes, so the lengths will all be the same, so I have to guess that some passwords are null and so you don't hash them, or that you are migrating from not hashed to hashed.
The second question: should you index all fields in a where clause. This is not an automatic yes, which is why there are books written about query optimisation.
It depends on how much benefit you will get from the index, and that depends on the nature of the data.
The main trade off is between insert speed and query speed. Indexes slow inserts and speed up queries.
The next thing to consider is selectivity. If the column you are indexing has only three possible values, for example, the index narrows a search very little, so it rarely repays the cost of keeping it updated.
In this specific case, you have evenly distributed data (because it is hashed), you have great selectivity (MD5 has few collisions) and you are expecting to query more often with a single term, so you should definitely be indexing this column.

Is there a way to compress a MySQL column where values repeat very often?

I have an InnoDB table with a VARCHAR column, with tens of thousands of instances of the same text in it. Is there a way to compact it on-the-fly in order to save space? Is some kind of INDEX enough?
Can't InnoDB see that the values are the same, and use less space by internally assigning them some ID or whatever?
If the task is as simple as it seems, then what you are looking for is normalisation.
In simple terms, what you have to do is make this column contain foreign keys to another table, which holds the actual values. Store each new value in the other table, and when a value already exists there you do not need to make another entry for it. Form this relation between the tables and a huge amount of space will be saved in your original table.
I suggest you read up on redundancy and normalisation.
Hope it solves your problem.
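As a sketch of what that normalisation might look like, here is a small Python example on an in-memory SQLite database. The table names texts and items are invented for the illustration, and INSERT OR IGNORE is SQLite's spelling of what MySQL writes as INSERT IGNORE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Each distinct text is stored exactly once.
    CREATE TABLE texts (
        id    INTEGER PRIMARY KEY,
        value TEXT NOT NULL UNIQUE
    );
    -- The original table keeps only a small integer reference.
    CREATE TABLE items (
        id      INTEGER PRIMARY KEY,
        text_id INTEGER NOT NULL REFERENCES texts(id)
    );
""")

def text_id_for(value):
    """Return the id of value in texts, inserting it first if it is new."""
    conn.execute("INSERT OR IGNORE INTO texts (value) VALUES (?)", (value,))
    return conn.execute("SELECT id FROM texts WHERE value = ?",
                        (value,)).fetchone()[0]

def add_item(value):
    conn.execute("INSERT INTO items (text_id) VALUES (?)",
                 (text_id_for(value),))

for value in ["same long text"] * 1000 + ["another text"]:
    add_item(value)

# Reading the text back is a join against the lookup table.
first_text = conn.execute(
    "SELECT t.value FROM items i JOIN texts t ON t.id = i.text_id "
    "WHERE i.id = 1").fetchone()[0]
```

After this runs, texts holds two rows while items holds 1001 integer references.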
You can use MySQL ENUM data type. It stores the values as indexes, but upon select you see the text value.
Here is the documentation:
http://dev.mysql.com/doc/refman/5.7/en/enum.html
Cons are that not all databases support ENUM type so you may find that as a problem if some day you decide to switch databases.
There are also some other limitations pointed out here:
http://dev.mysql.com/doc/refman/5.7/en/enum.html#enum-limits

Storing an index list with MYSQL?

I have a MySQL/PHP performance related question.
I need to store an index list associated with each record in a table. Each list contains 1000 indices. I need to be able to quickly access any index value in the list associated to a given record. I am not sure about the best way to go. I've thought of the following ways and would like your input on them:
Store the list in a string as a comma-separated value list or as JSON. Probably terrible performance, since I need to pull the whole list out of the DB into PHP only to retrieve a single value. Parsing the string won't exactly be fast either... I can store a number of expanded lists in a Least Recently Used cache on the PHP side to reduce load.
Make a list table with 1001 columns that will store the list and its primary key. I'm not sure how costly this is regarding storage? This also feels like abusing the system. And then, what if I need to store 100000 indices?
Only store in SQL the name of the binary file containing my indices and perform an fopen(); fseek(); fread(); fclose() cycle for each access? Not sure how the filesystem cache will react to that. If it goes badly then there are many solutions available to address the issues... but that sounds a bit like overkill, no?
What do you think of that?
What about a good old one-to-many relationship?
records
-------
id int
record ...
indices
-------
record_id int
`index` varchar
Then:
SELECT *
FROM records
LEFT JOIN indices
ON records.id = indices.record_id
WHERE indices.`index` = 'foo'
The standard solution is to create another table, with one row per (record, index), and add a MySQL index to allow fast searching:
CREATE TABLE IF NOT EXISTS `table_list` (
`IDrecord` int(11) NOT NULL,
`item` int(11) NOT NULL,
KEY `IDrecord` (`IDrecord`)
)
Change the item's type according to your needs - I used int in my example.
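Since the question asks for quick access to any single value in a record's list, one possible variation of the table above adds a slot number. The sketch below uses Python with an in-memory SQLite database; the pos column and the helper functions are assumptions for illustration and are not part of the table definition shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE table_list (
        IDrecord INTEGER NOT NULL,
        pos      INTEGER NOT NULL,   -- hypothetical: slot within the record's list
        item     INTEGER NOT NULL,
        PRIMARY KEY (IDrecord, pos)  -- one keyed lookup per (record, slot)
    )
""")

def store_list(record_id, items):
    """Store a whole list, one row per element."""
    conn.executemany(
        "INSERT INTO table_list (IDrecord, pos, item) VALUES (?, ?, ?)",
        [(record_id, pos, item) for pos, item in enumerate(items)],
    )

def item_at(record_id, pos):
    """Fetch a single element without loading the rest of the list."""
    row = conn.execute(
        "SELECT item FROM table_list WHERE IDrecord = ? AND pos = ?",
        (record_id, pos),
    ).fetchone()
    return row[0] if row else None

store_list(1, range(1000, 2000))   # a 1000-element list
store_list(2, [7, 8, 9])
```

Lists of different lengths, including the 100000-element case, need no schema change with this layout.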
The most logical solution would be to put each value in its own tuple. Adding a MySQL index to the table will enable the DBMS to find the value quickly, and should improve performance.
The reasons we're not going with your other options are as follows:
Option 1
Storing multiple values in one MySQL cell is a violation of first normal form, the first stage of database normalisation. You can read up on it here.
Option 3
This has heavy reliance on other files. You want to localize your data storage as much as possible, to make it easier to maintain in the future.

Mysql phpMyAdmin few questions:

I am quite new to the MySQL/phpMyAdmin environment, and I would like some help in a few areas:
1. I need a text field that should hold up to around 500 characters. Does that have to be a "TEXT" field? Is the application responsible for enforcing the length?
2. Indexes. I understand that when I mark a field as "indexed", the field gets a pointer table and, for each command that includes a WHERE, the search is optimized by that field (log n complexity). But what happens if I mark a field as indexed after the fact, say after it has some rows in it? Can I issue a command like "walk through the whole table and index that field"?
3. When I mark fields as indexed, I sometimes get them in phpMyAdmin as having the keyname.
4. For accessing the table by the indexed field when I write PHP, does it take extra effort on my side to use the keyname shown in the "structure" view so that the table is used as indexed, or is that keyname used behind the scenes so that I should not care about it at all?
5. I sometimes get keynames referencing two or more fields together. The fields show one on top of the other. I don't know how it happened, but I need them to index only one field. What is going on?
6. I use UTF-8 values in my DB. When I created it, I think I marked it as utf8_unicode_ci, and some fields are marked as utf8_general_ci. Does it matter? Can I go back and change the whole DB definition to be utf8_general_ci?
I think that was quite a bit,
I thank you in advance!
Ted
First, be aware that this is not per se something about phpMyAdmin, but more about MySQL / databases.
1)
An index means that you make a list (most of the time a tree) of the values that are present. This way you can easily find the row with that value or those values. This tree can be built just as easily after you insert values as before. Mind you, this means that all the "add to index" operations happen in one go, so it is not something you want to do on a "live" table with loads of entries. But you can add an index whenever you want. Just add the index and it will be built, either for an empty table or for a 'used' one.
2)
I don't know what you mean by this. Indexes have a name, it doesn't really matter what it is. A (primary) key is an index, but not all indexes are keys.
3)
You don't need to 'force' mysql to use a key, the optimizer knows best how and when to use keys. If your keys are correct they are used, if they are not correct they can't be used so you can't force it: in other words: don't think about it :)
4)
phpMyAdmin makes a composite key if you mark 2 fields as key at the same time. This is annoying and can be wrong. If you search for both things at once, you can use the composite key, but if you search only for the second one, you can't, since MySQL can only use a leftmost prefix of a composite index. Just mark them as a key one at a time, or use the correct SQL command manually.
5)
you can change whatever you like, but I don't know what will happen with your values. Better check manually :)
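Point 1) above, adding an index to a table that already has rows, can be tried out with a small sketch like this one. It is written in Python with an in-memory SQLite database purely for illustration; the people table and the index name are made up, and PRAGMA index_list is SQLite-specific (in MySQL, SHOW INDEX FROM people would be the rough equivalent).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, city TEXT, name TEXT)")
conn.executemany("INSERT INTO people (city, name) VALUES (?, ?)",
                 [("Oslo", "Ann"), ("Rome", "Bob"), ("Oslo", "Cid")])

query = "SELECT name FROM people WHERE city = ? ORDER BY name"
before = conn.execute(query, ("Oslo",)).fetchall()

# Creating the index later builds it from the rows that already exist.
conn.execute("CREATE INDEX idx_people_city ON people (city)")
after = conn.execute(query, ("Oslo",)).fetchall()

# The query text did not change; the optimizer decides whether to use the index.
index_names = [row[1] for row in conn.execute("PRAGMA index_list(people)")]
```

The same query returns the same rows before and after; only the way the database finds them can change.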
1. If you need a field to contain 500 characters, you can do that with VARCHAR. Just set its length to 500.
2. You don't index field by field, you index a whole column. So it doesn't matter if the table has data in it. All the rows will be indexed.
3. Not a question
4. The indexes will be used whenever they can. You only need to worry about using the same columns that you have indexed in the WHERE section of your query. Read about it here
5. You can add as many columns as you wish in an index. For example, if you add columns "foo", "bar" and "ming" to an index, your database will be speed optimized for searches using those columns in the WHERE clause, in that order. Again, the link above explains it all.
6. I don't know. I'm 100% sure that if you use only UTF-8 values in the database, it won't matter. You can change this later though, as explained in this Stackoverflow question: How to convert an entire MySQL database characterset and collation to UTF-8?
I would recommend you scrap phpMyAdmin for HeidiSQL though. HeidiSQL is a Windows client that manages all your MySQL servers. It has lots of cool functions, like copying a table or database directly from one MySQL server to another. Try it out (it's free).

Website: What is the best way to store a large number of user variables?

I'm designing a website using PHP and MySQL currently and as the site proceeds I find myself adding more and more columns to the users table to store various variables.
Which got me thinking, is there a better way to store this information? Just to clarify, the information is global, can be affected by other users so cookies won't work, also I'd lose the information if they clear their cookies.
The second part of my question is, if it does turn out that storing it in a database is the best way, would it be less expensive to have a large number of columns or rather to combine related columns into delimited varchar columns and then explode them in PHP?
Thanks!
In my experience, I'd rather get the database right than start adding comma-separated fields holding multiple items. Having to sift through multiple comma-separated fields is only going to hurt your program's efficiency and the readability of your code.
Also, if your table is growing too much, then perhaps you need to look into splitting it into multiple tables joined by foreign keys?
I'd create a user_meta table, with three columns: user_id, key, value.
I wouldn't go for the option of grouping columns together and exploding them. It's untidy work and very unmanageable. Instead maybe try spreading those columns over a few tables and using InnoDB's transaction feature.
If you still dislike the idea of frequently updating the database, and if this method complies with what you're trying to achieve, you can use APC's caching function to store (cache) information "globally" on the server.
MongoDB (and its NoSQL cousins) are great for stuff like this.
The database is a perfectly fine place to store such data, as long as they're variables and not, say, huge image files. The database has all the optimizations and specifications for storing and retrieving large amounts of data. Anything you set up on file system level will always be beaten by what the database already has in terms of speed and functionality.
would it be less expensive to have a large number of columns or rather to combine related columns into delimited varchar columns and then explode them in PHP?
It's less a performance question than a maintenance question IMO - it's not fun to manage hundreds of columns. Storing such data - perhaps as serialized objects - in a TEXT field is a viable option - as long as it's 100% sure you will never have to make any queries on that data.
But why not use a normalized user_variables table like so:
id | user_id | variable_name | variable_value
?
It is a bit more complex to query, but provides for a very clean table structure all round. You can easily add arbitrary user variables that way.
If you are doing a lot of queries like SELECT * FROM users WHERE variable257 = 'green' you may have to stick to having specific columns.
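A minimal sketch of working with such a user_variables table, in Python with an in-memory SQLite database so that it is self-contained. The helper names set_variable, get_variables and users_where are hypothetical, as is the UNIQUE constraint on (user_id, variable_name).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE user_variables (
        id             INTEGER PRIMARY KEY,
        user_id        INTEGER NOT NULL,
        variable_name  TEXT NOT NULL,
        variable_value TEXT,
        UNIQUE (user_id, variable_name)   -- assumed: one value per user and name
    )
""")

def set_variable(user_id, name, value):
    """Update the variable if it exists, otherwise insert it."""
    cur = conn.execute(
        "UPDATE user_variables SET variable_value = ? "
        "WHERE user_id = ? AND variable_name = ?",
        (value, user_id, name),
    )
    if cur.rowcount == 0:
        conn.execute(
            "INSERT INTO user_variables (user_id, variable_name, variable_value) "
            "VALUES (?, ?, ?)",
            (user_id, name, value),
        )

def get_variables(user_id):
    """All variables of one user as a dict."""
    return dict(conn.execute(
        "SELECT variable_name, variable_value FROM user_variables "
        "WHERE user_id = ?", (user_id,)))

def users_where(name, value):
    """Ids of users whose variable has the given value."""
    return [row[0] for row in conn.execute(
        "SELECT user_id FROM user_variables "
        "WHERE variable_name = ? AND variable_value = ? ORDER BY user_id",
        (name, value))]

set_variable(1, "theme", "dark")
set_variable(1, "theme", "green")   # overwrites the earlier value
set_variable(1, "lang", "en")
set_variable(2, "theme", "green")
```

Adding a new kind of variable is then just a new variable_name; no ALTER TABLE is needed.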
The database is definitely the best place to store the data. (I'm assuming you were thinking of storing it in flat files otherwise) You'd definitely get better performance and security from using a DB over storing in files.
With regard to storing your data in multiple columns or delimiting them... It's a personal choice, but you should consider a few things:
If you're going to delimit the items, you need to think of what you're going to delimit them with (something that's not likely to crop up within the text you're delimiting)
I often find that it helps to try and visualise whether another programmer of your level would be able to understand what you've done with little help.
Yes, as Pekka said, if you want to perform queries on the data stored you should stick with the separate columns
You may also get a slight performance boost from not retrieving and parsing ALL your data every time if you just want a couple of fields of information
I'd suggest going with the separate columns as it offers you the option of much greater flexibility in the future. And there's nothing worse than having to drastically change your data structure and migrate information down the track!
I would recommend setting up a memcached server (see http://memcached.org/). It has proven to be viable with lots of the big sites. PHP has two extensions that integrate a client into your runtime (see http://php.net/manual/en/book.memcached.php).
Give it a try, you won't regret it.
EDIT
Sure, this will only be an option for data that's frequently used and would otherwise have to be loaded from your database again and again. Keep in mind though that you will still have to save your data to some kind of persistent storage.
A document-oriented database might be what you need.
If you want to stick to a relational database, don't take the naïve approach of just creating a table with oh so many fields:
CREATE TABLE SomeEntity (
ENTITY_ID CHAR(10) NOT NULL,
PROPERTY_1 VARCHAR(50),
PROPERTY_2 VARCHAR(50),
PROPERTY_3 VARCHAR(50),
...
PROPERTY_915 VARCHAR(50),
PRIMARY KEY (ENTITY_ID)
);
Instead define an Attribute table:
CREATE TABLE Attribute (
ATTRIBUTE_ID CHAR(10) NOT NULL,
DESCRIPTION VARCHAR(30),
/* optionally */
DEFAULT_VALUE /* whatever type you want */,
/* end_optionally */
PRIMARY KEY (ATTRIBUTE_ID)
);
Then define your SomeEntity table, which only includes the essential attributes (for example, required fields in a registration form):
CREATE TABLE SomeEntity (
ENTITY_ID CHAR(10) NOT NULL,
ESSENTIAL_1 VARCHAR(30),
ESSENTIAL_2 VARCHAR(30),
ESSENTIAL_3 VARCHAR(30),
PRIMARY KEY (ENTITY_ID)
);
And then define a table for those attributes that you might or might not want to store.
CREATE TABLE EntityAttribute (
ATTRIBUTE_ID CHAR(10) NOT NULL,
ENTITY_ID CHAR(10) NOT NULL,
ATTRIBUTE_VALUE /* the same type as Attribute.DEFAULT_VALUE;
if you didn't create that field, then any type */,
PRIMARY KEY (ATTRIBUTE_ID, ENTITY_ID)
);
Evidently, in your case, that SomeEntity is the user.
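To show how the optional DEFAULT_VALUE might be used when reading attributes back, here is a rough sketch in Python on an in-memory SQLite database. The attribute ids, the sample data and the TEXT column types are assumptions chosen for the example, as is the helper attributes_for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Attribute (
        ATTRIBUTE_ID  TEXT PRIMARY KEY,
        DESCRIPTION   TEXT,
        DEFAULT_VALUE TEXT
    );
    CREATE TABLE EntityAttribute (
        ATTRIBUTE_ID    TEXT NOT NULL REFERENCES Attribute(ATTRIBUTE_ID),
        ENTITY_ID       TEXT NOT NULL,
        ATTRIBUTE_VALUE TEXT,
        PRIMARY KEY (ATTRIBUTE_ID, ENTITY_ID)
    );
    -- Hypothetical sample data.
    INSERT INTO Attribute VALUES ('theme', 'UI theme', 'light');
    INSERT INTO Attribute VALUES ('lang', 'Language', 'en');
    INSERT INTO EntityAttribute VALUES ('theme', 'user1', 'dark');
""")

def attributes_for(entity_id):
    """Every known attribute for one entity, using the default when unset."""
    rows = conn.execute(
        """SELECT a.ATTRIBUTE_ID,
                  COALESCE(ea.ATTRIBUTE_VALUE, a.DEFAULT_VALUE)
           FROM Attribute a
           LEFT JOIN EntityAttribute ea
                  ON ea.ATTRIBUTE_ID = a.ATTRIBUTE_ID
                 AND ea.ENTITY_ID = ?""",
        (entity_id,),
    )
    return dict(rows)
```

Because of the COALESCE, an attribute stored explicitly as NULL also falls back to the default in this sketch.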
Instead of MySQL you might consider using a triplestore or a key-value store.
That way you get the benefits of having all the multithreading, multiuser, performance and caching voodoo figured out, without all the trouble of trying to figure out ahead of time what kind of values you really want to store.
Downsides: it's a bit more costly to figure out the average salary of all the people in Idaho who also own hats.
It depends on what kind of user info you are storing. If it's session-pertinent data, use PHP sessions in coordination with session event handlers to store your session data in a single data field in the DB.
