Query Serialized Data


I have 1000+ rows of SQL data that look like the following:
umeta_id | user_id | meta_key | meta_value
433369 | 44 | all_contests | a:1:{s:12:"all_contests";a:2:{i:5011;a:1:{s:7:"entries";i:3;}i:8722;a:1:{s:7:"entries";i:3;}}}
433368 | 63 | all_contests | a:1:{s:12:"all_contests";a:2:{i:5032;a:1:{s:7:"entries";i:3;}i:8724;a:1:{s:7:"entries";i:3;}}}
When you unserialize one of those values you get an array like the following:
Array
(
[all_contests] => Array
(
[5011] => Array
(
[entries] => 3
)
[8722] => Array
(
[entries] => 3
)
)
)
I am trying to create a leaderboard of all the users for a given contest id. The "all_contests" key holds an array keyed by the ids of all the contests the user has signed up for. Inside each of those is the number of entries for that contest.
The query needs to look at all the rows containing 'all_contests' keys and find the 10 highest entries values for a given contest id.
I'm not even sure that it is possible to reliably search inside of a piece of serialized data the way that I'm looking to.

No, it's impractical to search inside a piece of serialized data.
You can do it with enough meticulous usage of MySQL string functions like INSTR(), SUBSTR(), FIELD(), and so on. But queries like that are time-consuming to develop, and they won't perform well because they cannot use an index.
What you're doing is a variation on a common mistake: storing a comma-separated list in a column of one table.
This avoids creating the intersection table that represents a many-to-many relationship. In other words, you have two tables for users and contests, and you need a third table, in which each row represents one user's participation in one contest.
See my answer to Is storing a delimited list in a database column really that bad?
The answer is about comma-separated lists, but it applies equally to serialized arrays like you're doing.
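If the schema cannot be changed right away, one fallback is to fetch the rows and build the leaderboard in PHP instead of in SQL. The following is only a sketch under assumptions: the function name contestLeaderboard is hypothetical, the rows are assumed to come from something like SELECT user_id, meta_value ... WHERE meta_key = 'all_contests', and the sample entry counts are made up. It reads every matching row, so it does not remove the performance problem described above.

```php
<?php
// Hypothetical helper: rank users by their entries in one contest.
// $rows is assumed to hold user_id and the serialized meta_value per user.
function contestLeaderboard(array $rows, int $contestId, int $limit = 10): array
{
    $scores = [];
    foreach ($rows as $row) {
        // allowed_classes => false: never instantiate objects from stored data.
        $meta = unserialize($row['meta_value'], ['allowed_classes' => false]);
        if (!is_array($meta) || !isset($meta['all_contests'][$contestId]['entries'])) {
            continue; // user is not in this contest, or the value is malformed
        }
        $scores[$row['user_id']] = (int) $meta['all_contests'][$contestId]['entries'];
    }
    arsort($scores); // highest entries first, user ids kept as keys
    return array_slice($scores, 0, $limit, true);
}

// Made-up sample rows in the same shape as the data in the question.
$rows = [
    ['user_id' => 44, 'meta_value' => serialize(['all_contests' => [5011 => ['entries' => 3], 8722 => ['entries' => 3]]])],
    ['user_id' => 63, 'meta_value' => serialize(['all_contests' => [5032 => ['entries' => 7], 8722 => ['entries' => 5]]])],
];

$board = contestLeaderboard($rows, 8722);
print_r($board); // user 63 (5 entries) ranks above user 44 (3 entries)
```

With an intersection table (one row per user per contest) the same result would be a single ORDER BY ... LIMIT 10 query, which is why the redesign is the better long-term fix.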

Related

Mysql: get all distinct values from field serialized by PHP

I have a movies table (has data from a legacy project) with the field genre which contains values serialized by PHP like:
a:3:{i:0;s:9:"Animation";i:1;s:9:"Adventure";i:2;s:5:"Drama";}
I'm working in a search page, & I need to find all unique genres of the current search result to be used as a filter in the page,
as an example, if the search result was these 2 movies:
The Dark Knight (action, crime, drama)
Black Knight (fantasy, adventure, comedy)
I want to know the combination of their genres, which will be:
['action', 'crime', 'drama', 'fantasy', 'adventure', 'comedy']
How do I get the genres array? (I'm using Yii2.)

You should unserialize your data
$data = 'a:3:{i:0;s:9:"Animation";i:1;s:9:"Adventure";i:2;s:5:"Drama";}';
$data = unserialize($data);
print_r($data);
and you will get
Array
(
[0] => Animation
[1] => Adventure
[2] => Drama
)
If you need to search the entire table for "Drama" to decide which shows/movies to display, you could always use wildcards in your search
select * from table where column like '%Drama%'
but of course make sure to take appropriate database precautions.
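To get the combined list the question asks for, the unserialized arrays from each matching row can be merged and de-duplicated in PHP. This is a minimal sketch: the variable $genreColumn stands in for the genre values of the current search result, and the sample data (including the repeated "drama") is made up for illustration.

```php
<?php
// Made-up sample: the genre column of the movies in the search result.
$genreColumn = [
    serialize(['action', 'crime', 'drama']),
    serialize(['fantasy', 'adventure', 'comedy', 'drama']),
];

$genres = [];
foreach ($genreColumn as $serialized) {
    // allowed_classes => false: never instantiate objects from stored data.
    $list = unserialize($serialized, ['allowed_classes' => false]);
    if (is_array($list)) {
        $genres = array_merge($genres, $list);
    }
}

// array_unique keeps the first occurrence; array_values reindexes from 0.
$genres = array_values(array_unique($genres));
print_r($genres);
```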

Suppose you have a serialized string that holds various values in different positions, and you need to find rows by the value stored under one particular key. A plain LIKE '%...%' is easy to write, but it matches the text anywhere in the string, so it returns rows you do not want.
Search from serialized field: Suppose you have the following serialized string in the database:
a:9:{s:2:"m1";s:4:"1217";s:2:"m2";s:8:"9986-961";s:2:"m3";s:19:"1988-03-07 00:00:00";s:2:"m4";s:0:"";s:2:"m5";s:0:"";s:2:"m6";s:0:"";s:2:"m7";s:3:"104";s:2:"m8";s:6:"150000";s:2:"m9";s:18:"Ok Then, Yes It Is";}
And you need the rows in which the m9 value contains 'Yes It Is'.
So the SQL will be like:
SELECT * FROM table WHERE field REGEXP '.*"array_key";s:[0-9]+:"[^"]*array_value[^"]*".*'
Test code:
SELECT * FROM table WHERE field REGEXP '.*"m9";s:[0-9]+:"[^"]*Yes It Is[^"]*".*'
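When the whole value is known (not just part of it), PHP can build the exact text that the key/value pair occupies inside the serialized string, because serialize() writes the length prefix itself. A sketch, assuming both key and value were stored as strings; the function name serializedPairFragment and the table name in the comment are hypothetical.

```php
<?php
// Hypothetical helper: the exact text a string key and string value occupy
// next to each other inside a serialized PHP array.
function serializedPairFragment(string $key, string $value): string
{
    return serialize($key) . serialize($value);
}

$fragment = serializedPairFragment('m9', 'Ok Then, Yes It Is');
echo $fragment, PHP_EOL;

// Escape LIKE wildcards, wrap in % ... %, then bind it as a parameter, e.g.
//   SELECT * FROM some_table WHERE field LIKE ?
$likePattern = '%' . strtr($fragment, ['\\' => '\\\\', '%' => '\\%', '_' => '\\_']) . '%';
```

This only finds exact values. It will not match if the value was stored as an integer (i:...) rather than a string, and it still forces a full scan of the column.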

Storing multiple values in one MYSQL field

I am currently planning out how a table will look in MYSQL database. I want to do something like the below, where plant1 and plant2 would be the columns, and then each of those plants would have characteristics assigned to them as you see below.
Is it possible to display info this way in MYSQL?
Array (
[plant1] => Array (
[image_url] => http://www.example.com/image1.png
[botanical_name] => Foo
[common_name] => Bar
),
[plant2] => Array (
[image_url] => http://www.example.com/image2.png
[botanical_name] => Foo
[common_name] => Bar
)
)

Well, I don't recommend it, but you can store a PHP array in that field (for example with serialize()) and then parse it in PHP when you read the data back from the database. The arrays would look like this:
$plant1 = array('plant1' => array('image_url' => 'http://www.example.com/image1.png', 'botanical_name' => 'Foo', 'common_name' => 'Bar'));
$plant2 = array('plant2' => array('image_url' => 'http://www.example.com/image2.png', 'botanical_name' => 'Foo', 'common_name' => 'Bar'));
You can store $plant1 and $plant2 in your MySQL fields and then, after reading them back into $mysql_plant1 and $mysql_plant2, print them like this to get the wanted result:
print_r(array_merge($mysql_plant1, $mysql_plant2));
The array_merge function joins the arrays together.

You can do it storing as a serialized value (http://www.php.net/manual/en/function.serialize.php) but it's not a good idea.
Instead, create another table to store the characteristics of each plant, like:
**plants**
id_plant
name
**plants_characteristics**
id_plant_characteristic
fk_plant
name
value
In that way you can store it properly.
As an extra point, the structure you want to store looks well suited to NoSQL databases like MongoDB.

Your table design is flawed. What you should have is Plant being a row, with each item in the array being a column in the plants table.
If you need Plants to be a column in another table, then assign the first column of the plants table to be a unique, auto-incrementing key and then put that key in the column of the other table. Eg:
**Garden_Plants**
gp_key
gp_garden_id
gp_plant_id <-- p_id from Plants_Table
**Plants_Table**
p_id
p_botanical_name
p_common_name
p_image_url

You would have a table called plants, with a related table called plantDetails that holds a referencing ID for the plant in question.
This would give one plant with multiple detail rows referencing it.

This doesn't seem optimised for querying the database. However, if you have a specific reason for doing it this way, please let us know.
A more optimised solution:
Have a table called Plants, with fields ID, image_url, botanical_name, common_name.
Store all plants in that table. ID is a unique identifier for that plant, probably an AUTO_INCREMENT column so it cannot be duplicated.
If you need to store some kind of relationship between 2 plants, have a table called plant_relationships with fields for something like plant_ID_1 and plant_ID_2
You can then query your full list of plants directly, or query the relationships table to find plants related to other plants

If you want multiple values in one MySQL field, I would suggest joining your information into one string before storing it in the database. The substrings need to be separated by a character that will never appear inside any of them.
$value1 = 'aaa';
$value2 = 'bbb';
$value3 = 'ccc';
$valueAll = 'aaa:bbb:ccc';
When you get the data out of the DB, you will need to split the string with the : or the other character selected.
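The join and split steps above can be sketched with implode() and explode(). The helper name joinValues is hypothetical, and the check that rejects values containing the separator is an added safeguard, not something the answer specifies.

```php
<?php
// Hypothetical helper: join values with a separator, refusing any value
// that already contains the separator (it could not be split back safely).
function joinValues(array $values, string $separator = ':'): string
{
    foreach ($values as $value) {
        if (strpos((string) $value, $separator) !== false) {
            throw new InvalidArgumentException("Value contains the separator: $value");
        }
    }
    return implode($separator, $values);
}

$valueAll = joinValues(['aaa', 'bbb', 'ccc']); // store this string in the column
$restored = explode(':', $valueAll);           // split it again after reading it back
print_r($restored);
```

Note that this has the same drawbacks as the delimited lists criticised elsewhere on this page: the database cannot index or join on the individual values.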

Get details from another mysql table

I have a table that contains information about a certain month, and one column in each row holds the MySQL row ids of rows in another table, from which I need to grab several pieces of information.
Is there a more efficient way to get the information than exploding the ids and running a separate SQL query for each? Here is an example:
Row ID | Name | Other Sources
1 | Test | 1,2,7
The Other Sources column holds the ids of rows from the other table, which looks like this:
Row ID | Name | Information | Link
1 | John | No info yet? | http://blah.com
2 | Liam | No info yet? | http://blah.com
7 | Steve | No info yet? | http://blah.com
Overall, the information returned would be like the below:
Hi this page is called test... here is a list of our sources
- John (No info yet?) find it here at http://blah.com
- Liam (No info yet?) find it here at http://blah.com
- Steve (No info yet?) find it here at http://blah.com
What I would do is explode Other Sources on "," and then run a separate SQL query for each id. I am sure there must be a better way?

Looks like a classic many-to-many relationship. You have pages and sources - each page can have many sources and each source could be the source for many pages?
Fortunately this is very much a solved problem in relational database design. You would use a 3rd table to relate the two together:
Pages (PageID, Name)
Sources (SourceID, Name, Information, Link)
PageSources (PageID, SourceID)
The key for the "PageSources" table would be both PageID and SourceID.
Then, to get all the sources for a page, for example, you would use this SQL:
SELECT s.*
FROM Sources s INNER JOIN PageSources ps ON s.SourceID = ps.SourceID
AND ps.PageID = 1;
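The three-table design and the join can be tried end to end with PDO. This sketch uses an in-memory SQLite database as a stand-in for MySQL (an assumption, made only so the example is self-contained; it requires the pdo_sqlite extension), with the sample rows taken from the question.

```php
<?php
// In-memory SQLite stands in for MySQL here; the schema follows the answer.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->exec('CREATE TABLE Pages (PageID INTEGER PRIMARY KEY, Name TEXT)');
$pdo->exec('CREATE TABLE Sources (SourceID INTEGER PRIMARY KEY, Name TEXT, Information TEXT, Link TEXT)');
$pdo->exec('CREATE TABLE PageSources (PageID INTEGER, SourceID INTEGER, PRIMARY KEY (PageID, SourceID))');

$pdo->exec("INSERT INTO Pages VALUES (1, 'Test')");
$pdo->exec("INSERT INTO Sources VALUES
    (1, 'John', 'No info yet?', 'http://blah.com'),
    (2, 'Liam', 'No info yet?', 'http://blah.com'),
    (7, 'Steve', 'No info yet?', 'http://blah.com')");
// One row per (page, source) pair replaces the '1,2,7' list.
$pdo->exec('INSERT INTO PageSources VALUES (1, 1), (1, 2), (1, 7)');

// A single query returns every source of the page; no explode(), no loop of queries.
$stmt = $pdo->prepare(
    'SELECT s.Name, s.Information, s.Link
       FROM Sources s
      INNER JOIN PageSources ps ON s.SourceID = ps.SourceID
      WHERE ps.PageID = ?
      ORDER BY s.SourceID'
);
$stmt->execute([1]);

$lines = [];
foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $s) {
    $lines[] = "- {$s['Name']} ({$s['Information']}) find it here at {$s['Link']}";
}
echo implode(PHP_EOL, $lines), PHP_EOL;
```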

Not easily with your table structure. If you had another table like:
ID Source
1 1
1 2
1 7
Then join is your friend. With things the way they are, you'll have to do some nasty splitting on comma-separated values in the "Other Sources" field.

Maybe I'm missing something obvious (been known to), but why are you using a single field in your first table with a comma-delimited set of values rather than a simple join table? The solution, if you do that, is trivial.

The problem with these tables is that a multi-valued column doesn't work well with SQL. Tables in this format are not considered normalized, because multi-valued columns are forbidden in First Normal Form and above.
First Normal Form means...
- There's no top-to-bottom ordering to the rows.
- There's no left-to-right ordering to the columns.
- There are no duplicate rows.
- Every row-and-column intersection contains exactly one value from the applicable domain (and nothing else).
- All columns are regular [i.e. rows have no hidden components such as row IDs, object IDs, or hidden timestamps].
(Chris Date, "What First Normal Form Really Means", pp. 127-8)
Anyway, the best way to do it is to have a many to many relationship. This is done by putting a third table in the middle, like Dominic Rodger does in his answer.

Optimize MySQL search process

Here is scenario 1.
I have a table called "items" with 2 columns, item_id and item_name.
I store my data in this way:
item_id | item_name
Ss001 | Shirt1
Sb002 | Shirt2
Tb001 | TShirt1
Tm002 | TShirt2
... etc. The ids are built this way:
the first letter is the code for the type of clothes, i.e. S for shirt, T for T-shirt;
the second letter is the size, i.e. s for small, m for medium and b for big.
Let's say my items table has 10,000 items and I want fast retrieval. To find a particular shirt, can I use:
Method1:
SELECT * from items WHERE item_id LIKE Sb99;
or should I do it like:
Method2:
SELECT * from items WHERE item_id LIKE S*;
*Store the result, then execute second search for the size, then third search for the id. Like the hash table concept.
What I want to achieve is this: instead of searching all the data, I want to narrow the search by matching the clothes code first, then the size code, and then the id code. Which one is faster in MySQL, and which one is better in the long run? I want to reduce the traffic and not hit the database so often.
Thanks, guys, for solving my first scenario. But another scenario has come up:
Scenario 2:
I am using PHP and MySQL. Continuing from the previous story, suppose my users table structure is like this:
user_id | username | items_collected
U0001 | Alex | Ss001;Tm002
U0002 | Daniel | Tb001;Sb002
U0003 | Michael | ...
U0004 | Thomas | ...
I store items_collected as ids because each user may one day collect hundreds of items. If I stored the names as strings, i.e. Shirt1, pants2, ..., it would require a very large amount of database space (imagine we have 1000 users and some item names are very long).
Would it be easier to maintain if I store the ids?
And let's say I want to display the image, and the image's name is the item's name + ".jpg". How do I do that? Is it something like this:
$result = SELECT items_collected FROM users WHERE user_id = $userid
Using PHP explode:
$itemsCollected = explode(";", $result);
After that, look up each id in the items table to get its name, so the result would be like:
shirt1, pants2 etc
Then, using a loop, go through each value and add ".jpg" to display the image?

The first method will be faster - but IMO it's not the right way of doing it. I'm in agreement with tehvan about that.
I'd recommend keeping the item_id as is, but add two extra fields one for the code and one for the size, then you can do:
select * from items where item_code = 'S' and item_size = 'm'
With indexes the performance will be greatly increased, and you'll be able to easily match a range of sizes, or codes.
select * from items where item_code = 'S' and item_size IN ('m','s')
Migrate the db as follows:
alter table items add column item_code varchar(1) default '';
alter table items add column item_size varchar(1) default '';
update items set item_code = SUBSTRING(item_id, 1, 1);
update items set item_size = SUBSTRING(item_id, 2, 1);
The changes to the code should be equally simple to add. The long term benefit will be worth the effort.
For scenario 2: that is not an efficient way of storing and retrieving data from a database. Used this way, the database is only acting as a storage engine; by encoding multiple values into one field you prevent the relational part of the database from being useful.
What you should do in that case is add another table, call it 'items_collected'. The schema would be along the lines of
CREATE TABLE items_collected (
id int(11) NOT NULL auto_increment PRIMARY KEY,
user_id varchar(10) NOT NULL,
item_id varchar(10) NOT NULL,
FOREIGN KEY (`user_id`) REFERENCES `users`(`user_id`),
FOREIGN KEY (`item_id`) REFERENCES `items`(`item_id`)
);
The foreign keys enforce referential integrity, which is essential: a row cannot point at a user or an item that does not exist.
Then for the example you give you would have multiple records.
user_id | username | items_collected
U0001 | Alex | Ss001
U0001 | Alex | Tm002
U0002 | Daniel | Sb002
U0002 | Daniel | Tb001
U0003 | Michael | ...
U0004 | Thomas | ...

The first optimization would be splitting the id into three different fields:
one for type, one for size, one for the current id ending (whatever the ending means)
If you really want to keep the current structure, go for the result straight away (option 1).

If you want to speed up for results you should split up the column into multiple columns, one for each property.
Step 2 is to create an index for each column. Remember that MySQL generally uses only one index per table in a query. So if you really want speedy queries and your queries vary a lot in which properties they filter on, then you might want to create composite indexes on (type,size,ending), (type,ending,size) etc.
For example a query with
select * from items where type = 'S' and size = 's' and ending = '001'
can benefit from the index (type,size,ending), but:
select * from items where size = 's' and ending = '001'
cannot, because a composite index is only used from its leftmost column onward, so it needs type, then size, then ending. This is why you might want multiple indexes if you really want fast searches.
One other note, generally it is not a good idea to use * in queries, but to select only the columns you need.

You need to have three columns for the model, size and id, and index them this way:
CREATE INDEX ix_1 ON items (model, size, id);
CREATE INDEX ix_2 ON items (size, id);
CREATE INDEX ix_3 ON items (id, model);
Then you'll be able to search efficiently on any subset of the parameters:
model-size-id, model-size and model queries will use ix_1;
size-id and size queries will use ix_2;
model-id and id queries will use ix_3
Index on your column as it is now is equivalent to ix_1, and you can use this index to efficiently search on the appropriate conditions (model-size-id, model-size and model).
Actually, there is a certain access path called INDEX SKIP SCAN that may be used to search on non-first columns of a composite index, but MySQL did not support it at the time of writing, AFAIK (a skip scan access method was added later, in MySQL 8.0.13).
If you need to stick to your current design, you need to index the field and use queries like the following, where :model, :size and :id are bound parameters:
WHERE item_id LIKE CONCAT(:model, '%')
WHERE item_id LIKE CONCAT(:model, :size, '%')
WHERE item_id = CONCAT(:model, :size, :id)
All these queries will use the index, if there is one.
There is no need to split this into multiple queries.

I'm comfortable that you've designed your item_id to be searchable with a "Starts with" test. Indexes will solve that quickly for you.
I don't know MySQL, but in MSSQL an index on a "Size" column that only has the choices S, M, L most probably won't achieve anything. The index won't be used because the values it contains are not sufficiently selective, i.e. it's quicker to just go through all the data than to "find the first S entry in the index, now retrieve the data page for that row ..."
The exception is where the query is covered by the index - i.e. several parts of the WHERE clause (and indeed, all of them and also the SELECT columns) are included in the index. In this instance, however, the first field in the index (in MSSQL) needs to be selective. So put the column with the most distinct values first in the index.
Having said that if your application has a picklist for Size, Colour, etc. you should have those data attributes in separate columns in the record - and separate tables with lists of all the available Colours and Sizes, and then you can validate that the Colour / Size given to a Product is actually defined in the Colour / Size tables. Cuts down the Garbage-in / Garbage-out problem!
Your items_collected column needs to be in a separate table so that it is "normalised". Don't store a delimited list in a single column; store it as individual rows in a separate table.
Thus your USERS table will contain user_id & username.
Your new items_collected table will contain user_id & item_id (and possibly also Date Purchased or Invoice Number).
You can then say "What did Alex buy" (your design has that) and also "Who bought Ss001" (which, in your design, would require ploughing through all the rows in your USERS table and splitting out the items_collected to find which ones contained Ss001 [1])
[1] Note that using LIKE wouldn't really be safe for that because you might have an item_id of "Ss001XXX" which would match WHERE items_collected LIKE '%Ss001%'
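The false match described in note [1] is easy to reproduce in PHP. In this sketch the row value is made up: the user owns the hypothetical item "Ss001XXX" but not "Ss001". A substring test, which is what LIKE '%Ss001%' amounts to, still reports a hit, while comparing whole ids after splitting does not.

```php
<?php
// Made-up items_collected value: the user owns Ss001XXX and Tm002, not Ss001.
$itemsCollected = 'Ss001XXX;Tm002';

// Equivalent of WHERE items_collected LIKE '%Ss001%': plain substring search.
$substringMatch = strpos($itemsCollected, 'Ss001') !== false;

// Splitting on the delimiter and comparing whole ids avoids the false hit.
$exactMatch = in_array('Ss001', explode(';', $itemsCollected), true);

var_dump($substringMatch); // bool(true)  - wrong answer
var_dump($exactMatch);     // bool(false) - correct answer
```

With a separate items_collected table the exact comparison becomes a plain WHERE item_id = 'Ss001', which can also use an index.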

Questions about Php and Mysql Hash Table

I am new to PHP and MySQL. I am handling quite a large amount of data, and it will grow slowly in the future, so I am using a hash table. I have a couple of questions:
1. Does MySQL have a built-in hash table function? If yes, how do I use it?
2. After a couple of days of research I roughly know what a hash table is, but I could not understand how to start creating one. I saw a lot of hash table code on the internet. In most of it, the first step is to create a hashtable class. Does that mean they store the hash table values in a temporary table instead of inserting them into the MySQL database?
For questions 3, 4 & 5, example scenario:
Users can collect items on the website. I would like to use a hash table to insert and retrieve the items that each user has collected.
3. [Important] What would a possible MySQL database structure look like?
e.g. create items and users tables
the items table has: item_id, item_name, and item_hash_value
the users table has: user_id, username, item_name, item_hash_value
I am not sure if the users table is correct?
4. [Important] What are the steps for creating a hash table in PHP and MySQL?
(Any sample code would be great :))
5. [Important] How do I insert and retrieve data from a hash table? I am talking about PHP and MySQL, so I hope the answers can be like: "you can use a MySQL query, i.e. SELECT * from blabla..."

You don't need to worry about using a hashtable with MySQL. If you intend to have a large number of items in memory while you operate on them a hashtable is a good data structure to use since it can find things much faster than a simple list.
But at the database level, you don't need to worry about the hashtable. Figuring out how to best hold and access records is MySQL's job, so as long as you give it the correct information it will be happy.
Database Structure
items table would be: item_id, item_name
Primary key is item_id
users table would be: user_id, username
Primary key is user_id
user_items table would be: user_id, item_id
Primary key is the combination of user_id and item_id
Index on item_id
Each item gets one (and only one) entry in the items table. Each user gets one (and only one) entry in the users table. When a user selects an item, it goes in the user items table. Example:
Users:
1 | Bob
2 | Alice
3 | Robert
Items
1 | Headphones
2 | Computer
3 | Beanie Baby
So if Bob has selected the headphones and Robert has selected the computer and beanie baby, the user_items table would look like this:
User_items (user_id, item_id)
1 | 1 (This shows Bob (user 1) selected headphones (item 1))
3 | 2 (This shows Robert (user 3) selected a computer (item 2))
3 | 3 (This shows Robert (user 3) selected a beanie baby (item 3))
Since the user_id and item_id on the users and items tables are primary keys, MySQL will let you access them very fast, just like a hashmap. On the user_items table having both the user_id and item_id in the primary key means you won't have duplicates and you should be able to get fast access (an index on item_id wouldn't hurt).
Example Queries
With this setup, it's really easy to find out what you want to know. Here are some examples:
Who has selected item 2?
SELECT users.user_id, users.username FROM users, user_items
WHERE users.user_id = user_items.user_id AND user_items.item_id = 2
How many things has Robert selected?
SELECT COUNT(user_items.item_id) FROM user_items, users
WHERE users.user_id = user_items.user_id AND users.username = 'Robert'
I want a list of each user and what they've selected, ordered by the user name
SELECT users.username, items.item_name FROM users, items, user_items
WHERE users.user_id = user_items.user_id AND items.item_id = user_items.item_id
ORDER BY username, item_name
There are many guides to SQL on the internet, such as the W3Schools tutorial.

1) Hashtables do exist in MySQL but are used to keep internal track of keys on tables.
2) Hashtables work by hashing a data cell to create a number of different keys that separate the data by these keys making it easier to search through. The hashtable is used to find what the key is that should be used to bring up the correct list to search through.
For example, you have 100 items, and searching 100 items in a row takes 10 seconds. Suppose you know that they can be separated by type of item, and you break them up into 25 t-shirts, 25 clocks, 25 watches, and 25 shoes. Then when you need to find a t-shirt, you only have to search through the 25 t-shirts, which takes 2.5 seconds.
3) Not sure what your question means, a MySQL database is a binary file that contains all the rows in the database.
4) As in #2 you would need to decide what you want your key to be.
5) #2 you need to know what your key is.
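The bucket idea from point 2 can be sketched in PHP, where an associative array already is a hash table. The item names and the helper findInBucket are made up for illustration: items are grouped by type once, and a lookup then scans only the bucket for that type.

```php
<?php
// Made-up sample items; 'type' plays the role of the key that picks a bucket.
$items = [
    ['type' => 't-shirt', 'name' => 'Plain tee'],
    ['type' => 'clock',   'name' => 'Wall clock'],
    ['type' => 't-shirt', 'name' => 'Striped tee'],
    ['type' => 'shoes',   'name' => 'Sneakers'],
];

// Build the buckets once: type => list of item names.
$buckets = [];
foreach ($items as $item) {
    $buckets[$item['type']][] = $item['name'];
}

// Hypothetical helper: search only the bucket for the given type.
function findInBucket(array $buckets, string $type, string $name): bool
{
    return in_array($name, $buckets[$type] ?? [], true);
}

var_dump(findInBucket($buckets, 't-shirt', 'Striped tee')); // bool(true)
var_dump(findInBucket($buckets, 'clock', 'Striped tee'));   // bool(false)
```

An index on a MySQL column gives a comparable shortcut on the database side, which is why the other answers recommend ordinary tables with keys rather than a hand-built hash table.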

If you think a hash table is the right way to store your data, you may want to use a key-value database like CouchDB instead of MySQL. They show you how to get started with PHP.

I am a new php and mysql programmer. I am handling quite large amount of data, and in future it will grow slowly, thus I am using hash table.
Looking at your original purpose, use "memcache" instead. It scales well while requiring minimal changes to your code: you can add memcache servers as your data grows larger.
