So, there's a field in the db in which I store serialized arrays.
$array = array('count1' => 10, 'count2' => 20, 'count3' => 4);
serialized:
a:3:{s:6:"count1";i:10;s:6:"count2";i:20;s:6:"count3";i:4;}
Would it be possible to pull count1+count2+count3 using a mysql query? I guess I'm looking for something like php's explode. Pretty sure this can't be done, but I thought I'd ask.
I need to pull the highest count1+count2+count3 rows and return the total count. Looping through each row and unserializing wouldn't work since there are TONS of rows.
If you need to access parts of your serialized data via SQL, you need to store them in separate columns.
While it might be possible to use techniques such as regular expressions to access those three values in this string, it would be extremely slow when used in a WHERE criterion as indexes would be useless - not to mention that it would be a huge mess, way worse than using goto in a programming language.
So the solution is to create a new column, then iterate over all rows, unserialize each one, and store the sum in the new column. That might take a while, but you'll only need to do it once.
Depending on your application it might be better to create three columns and store each value separately.
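The one-off migration described above could look roughly like the sketch below. It is only an illustration: Python with an in-memory SQLite database stands in for PHP and MySQL, the table name stats, the columns data and total, and the helper parse_counts are all hypothetical, and the regex parser only handles flat arrays of integers like the one in the question.

```python
import re
import sqlite3

# Matches one  s:<len>:"<key>";i:<int>;  pair of a flat PHP-serialized array.
PAIR = re.compile(r's:\d+:"([^"]*)";i:(-?\d+);')

def parse_counts(serialized):
    """Return {key: int} for a flat serialized array of integers (assumption)."""
    return {key: int(value) for key, value in PAIR.findall(serialized)}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stats (id INTEGER PRIMARY KEY, data TEXT)")
conn.executemany(
    "INSERT INTO stats (data) VALUES (?)",
    [
        ('a:3:{s:6:"count1";i:10;s:6:"count2";i:20;s:6:"count3";i:4;}',),
        ('a:3:{s:6:"count1";i:1;s:6:"count2";i:2;s:6:"count3";i:3;}',),
    ],
)

# One-off migration: add the column, then fill it row by row.
conn.execute("ALTER TABLE stats ADD COLUMN total INTEGER")
rows = conn.execute("SELECT id, data FROM stats").fetchall()
for row_id, data in rows:
    total = sum(parse_counts(data).values())
    conn.execute("UPDATE stats SET total = ? WHERE id = ?", (total, row_id))
conn.execute("CREATE INDEX idx_stats_total ON stats (total)")

# From now on, "rows with the highest total" is plain, indexable SQL.
top = conn.execute(
    "SELECT id, total FROM stats ORDER BY total DESC LIMIT 1"
).fetchall()
```

Once the column exists, new rows would need to write the total at insert time so the two representations do not drift apart.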
Let's say that I have array like the one I posted below and that I need to store it in my MySQL database:
Array(
"Weight" => "10",
"Height" => "17",
"Usage" => "35"
);
Preamble:
I will never update these values
I will never perform a query based on these values
Long story short I only need to store and display this array as it is. Actually I need to use these values to generate graphs. Now I see 2 possible options.
Option 1: even if I will never use a WHERE, ORDER BY, HAVING (...) condition on these values, I store each value separately in a dedicated column (weight, height, usage).
Option 2: I create a single column (stats) where I store a serialized version of the array; then, in order to generate my graphs, I unserialize each row before using it.
The question is: what's the best approach to store this array in terms of effectiveness and performance?
In my opinion the second approach is the best, but let's say that there are many rows and elements involved in the process. I don't understand whether it's faster and lighter to unserialize an array of 20 elements for 100 rows with PHP, or to read plain values stored in 20 columns, considering that I need to save a lot of them very frequently and simultaneously.
"I will never update these values"
"I will never perform a query based on these values"
The second you finalise your code having stored them as serialised values, you'll be asked to perform a query to update anything with a weight above ten.
Just store them in their own columns - not only will this future-proof the code, but it is easier to work with and will take up less drive space in the long run.
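As a rough illustration of why separate columns are easier to work with, the "update anything with a weight above ten" request becomes a single statement. The sketch uses Python with an in-memory SQLite database as a stand-in for PHP and MySQL; the table name measurements and the sample rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE measurements ("
    "id INTEGER PRIMARY KEY, weight INTEGER, height INTEGER, usage INTEGER)"
)
conn.executemany(
    "INSERT INTO measurements (weight, height, usage) VALUES (?, ?, ?)",
    [(10, 17, 35), (12, 20, 40), (8, 15, 30)],
)

# The "unexpected" request: change every row whose weight is above ten.
cur = conn.execute("UPDATE measurements SET usage = 0 WHERE weight > ?", (10,))
updated = cur.rowcount

usages = [r[0] for r in conn.execute("SELECT usage FROM measurements ORDER BY id")]
```

With a serialized column, the same change would mean reading, unserializing, modifying, and rewriting every row in application code.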
I have a dilemma that I'm trying to solve right now. I have a table called "generic_pricing" that has over a million rows. It looks like this....
I have a list of 25000 parts that I need to get generic_pricing data for. Some parts have a CLEI, some have a partNumber, and some have both. For each of the 25000 parts, I need to search the generic_pricing table to find all rows that match either clei or partNumber.
Making matters more difficult is that I have to do matches based on substring searches. For example, one of my parts may have a CLEI of "IDX100AB01", but I need the results of a query like....
SELECT * FROM generic_pricing WHERE clei LIKE 'IDX100AB%';
Currently, my lengthy PHP code for finding these matches loops through the 25000 items. For each item, I run the query above on clei. If a match is found, I use that row for my calculations. If not, I execute a similar query on partNumber to try to find the matches.
As you can imagine, this is very time consuming. And this has to be done for about 10 other tables similar to generic_pricing to run all of the calculations. The system is now bogging down and timing out trying to crunch all of this data. So now I'm trying to find a better way.
One thought I have is to just query the database one time to get all rows, and then use loops to find matches. But for 25000 items each having to compare against over a million rows, that just seems like it would take even longer.
Another thought I have is to get 2 associative arrays of all of the generic_pricing data. i.e. one array of all rows indexed by clei, and another all indexed by partNumber. But since I am looking for substrings, that won't work.
I'm at a loss here for an efficient way to handle this task. Is there anything that I'm overlooking to simplify this?
Do not query the db for all rows and sort them in your app. That will cause a lot more headaches.
Here are a few suggestions:
Use parameterized queries. This allows your db engine to compile the query once and use it multiple times. Otherwise it will have to optimize and compile the query each time.
Figure out a way to make IN work. Instead of using LIKE, try ... LEFT(clei, 8) IN ('IDX100AB', 'IDX100AC', 'IDX101AB', ...)
Do the calculations/math on the db side. Build a stored proc which takes a list of part/clei numbers and outputs the same list with the computed prices. You'll have a lot more control of execution and a lot less network overhead. If not a stored proc, build a view.
Paginate. If this data is being displayed somewhere, switch to processing in batches of 100 or less.
Build a cheat sheet. If speed is an issue try precomputing prices into a separate table nightly, include some partial clei/part numbers if needed. Then use the precomputed lookup table.
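The first two suggestions (parameterized queries and a prefix IN list) can be combined so that many parts are resolved per round trip instead of one query per part. The following is a sketch under assumptions, not the poster's code: it uses Python with an in-memory SQLite database, where substr(clei, 1, 8) plays the role of MySQL's LEFT(clei, 8), and the function fetch_by_clei_prefix, the fixed prefix length of 8, the price column, and the sample rows are all hypothetical.

```python
import sqlite3
from collections import defaultdict

PREFIX_LEN = 8  # assumption: every lookup uses the first 8 characters

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE generic_pricing (clei TEXT, partNumber TEXT, price REAL)")
conn.executemany(
    "INSERT INTO generic_pricing VALUES (?, ?, ?)",
    [
        ("IDX100AB01", "PN-1", 10.0),
        ("IDX100AB02", "PN-2", 12.0),
        ("IDX101AB01", "PN-3", 20.0),
        ("ZZZ999ZZ01", "PN-4", 99.0),
    ],
)

def fetch_by_clei_prefix(conn, cleis, batch_size=500):
    """Group matching rows by CLEI prefix, one query per batch of prefixes."""
    prefixes = sorted({c[:PREFIX_LEN] for c in cleis})
    matches = defaultdict(list)
    for start in range(0, len(prefixes), batch_size):
        batch = prefixes[start:start + batch_size]
        placeholders = ", ".join("?" for _ in batch)
        sql = (
            "SELECT substr(clei, 1, ?) AS prefix, clei, partNumber, price "
            "FROM generic_pricing "
            f"WHERE substr(clei, 1, ?) IN ({placeholders})"
        )
        params = [PREFIX_LEN, PREFIX_LEN, *batch]
        for prefix, clei, part, price in conn.execute(sql, params):
            matches[prefix].append((clei, part, price))
    return matches

result = fetch_by_clei_prefix(conn, ["IDX100AB01", "IDX101AB77"])
```

Applying a function to the column generally keeps a plain index on clei from being used, so if this pattern is adopted it may be worth storing the prefix in its own indexed column, in the spirit of the precomputed lookup table suggested above.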
I'm new to PHP/MySQL. I want to minimize the number of tables I have so I've been thinking of saving an array of IDs (from checkboxes users tick off) as a string instead of in a separate table. What do you use to format the list of IDs as a string so I can easily parse the IDs for future use in my program?
You can use implode() and explode() for joining and separating the values, respectively. You can also try serialize() for storing the values. There are a lot of examples in the links I gave, so they will help you pick a suitable data format.
That's not a good idea. I think a 1:N relation is the better choice. It gives better performance, and the db can enforce integrity checks.
If you still want to, you can build a comma-separated list and then use the following WHERE clause:
where CONCAT(',', FIELDWITHIDS, ',') like '%,13,%'
to find datasets that reference the ID 13. Or you have to use explode() and implode() in PHP.
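The reason for wrapping both the field and the searched ID in commas is easiest to see with a small experiment. This is a sketch in Python with SQLite, where the || operator stands in for MySQL's CONCAT; the table items, the column ids, and the helper rows_referencing are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, ids TEXT)")
conn.executemany(
    "INSERT INTO items (ids) VALUES (?)",
    [("13,20",), ("113,5",), ("7,13",), ("131",)],
)

def rows_referencing(conn, wanted_id):
    """Return row ids whose comma-separated list contains exactly wanted_id."""
    pattern = f"%,{int(wanted_id)},%"
    sql = "SELECT id FROM items WHERE ',' || ids || ',' LIKE ? ORDER BY id"
    return [row[0] for row in conn.execute(sql, (pattern,))]

# Without the delimiters, 13 also matches 113 and 131.
naive = [
    r[0]
    for r in conn.execute("SELECT id FROM items WHERE ids LIKE '%13%' ORDER BY id")
]
exact = rows_referencing(conn, 13)
```

Either way, the pattern starts with a wildcard, so the database has to scan every row to evaluate it.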
Either comma separated values or php serialize (http://php.net/manual/en/function.serialize.php)
That being said, you lose the benefit of being able to do joins or integrity checks in your DB, so you should generally avoid it.
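For comparison, here is a minimal sketch of the separate-table alternative, showing the joins and integrity checks that a string column gives up. It uses Python with SQLite in place of PHP and MySQL, and the tables users, options, and user_options are assumed names for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE options (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE user_options (
        user_id INTEGER NOT NULL REFERENCES users(id),
        option_id INTEGER NOT NULL REFERENCES options(id),
        PRIMARY KEY (user_id, option_id)
    );
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO options VALUES (5, 'red'), (13, 'green');
    INSERT INTO user_options VALUES (1, 5), (1, 13), (2, 13);
""")

# Join: which users ticked option 13?
users_with_13 = [
    r[0]
    for r in conn.execute(
        "SELECT u.name FROM users u "
        "JOIN user_options uo ON uo.user_id = u.id "
        "WHERE uo.option_id = ? ORDER BY u.name",
        (13,),
    )
]

# Integrity check: an ID that does not exist in options is rejected.
try:
    conn.execute("INSERT INTO user_options VALUES (1, 999)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```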
I currently have about 4 different database tables which output to html tables. Each of these tables uses a count query to calculate data from a 5th table.
That's no problem, but what about when I want to sort and order the data, and paginate etc (like with zend). If it were a one page table, I could probably sort an array.
My thought was, to use a ticker. But that would require a new column in all 4 tables and seems like overkill or like there could be a better way.
Sadly, I can't find much info on it (likely because I don't know what to search for).
Advice?
..and please take it easy, I'm new and learning.
Assuming you're using Zend_Db_Table_Row and that you don't need to persist any modifications you might make to these rowsets, you can just append the virtual columns to the row object and have them be accessible via array notation. So if you're doing it all in one query now, just use that same query, and the column should be there.
OTOH, if you're using a Data Mapper pattern, then simply adjust your hydration to look for this "virtual column" and hydrate it if it exists in the result data. Then, in your getter for this property, have it check whether the property is null or some other negative specification, and if it is, execute a calculation query on that single object; otherwise return the already calculated result.
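The Data Mapper variant can be sketched as follows. This is a language-neutral illustration written in Python rather than Zend code: the class Category, the property item_count, the function hydrate, and the injected count_loader callable are all hypothetical names, and the count query itself is replaced by a stub.

```python
class Category:
    """Domain object with a lazily computed 'virtual column' (item_count)."""

    def __init__(self, category_id, name, count_loader, item_count=None):
        self.id = category_id
        self.name = name
        self._count_loader = count_loader  # callable that runs the count query
        self._item_count = item_count      # None means "not hydrated yet"

    @property
    def item_count(self):
        # Run the calculation query only if the result set did not supply it.
        if self._item_count is None:
            self._item_count = self._count_loader(self.id)
        return self._item_count


def hydrate(row, count_loader):
    """Build a Category; pick up item_count if the query returned that column."""
    return Category(row["id"], row["name"], count_loader, row.get("item_count"))


calls = []

def fake_count_query(category_id):
    """Stub standing in for a real COUNT query against the fifth table."""
    calls.append(category_id)
    return 42

with_column = hydrate({"id": 1, "name": "a", "item_count": 7}, fake_count_query)
without_column = hydrate({"id": 2, "name": "b"}, fake_count_query)
```

When the list query already includes the count, sorting and pagination can stay in SQL and the getter never issues an extra query; the fallback only fires for objects loaded without it.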
How can I get the greatest values out of serialized data? For example, I have this in my column 'rating':
a:3:{s:12:"total_rating";i:18;s:6:"rating";i:3;s:13:"total_ratings";i:6;}
How can I select the 3 greatest 'rating' with a query?
thanks a lot
You're probably looking at a pile of SUBSTRING_INDEX(field,':',#offset) calls if you want to do it in SQL. It would be very grisly. Storing a serialized version of an object in the db is a convenience for persistence, but it should not be considered a permanent storage method. If you insist on using the serialized string for queries, you've lost all the power of a relational db and you might as well store the strings in a text file.
The best option is to use the serialized string only for persistence purposes (like remembering what the user was doing last time they visited), and store the data you need for calculations in properly normalized fields and tables. Then you can easily query what you need to know.
The other option is to select all the 'rating' strings from rows whose fields meet certain other criteria (e.g. the date_added field is within the last week), reinstantiate all the objects in your application layer, and compare them there.
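The application-layer option might look something like this sketch. It is written in Python rather than PHP, pulls only the integer 'rating' value out of each string with a regex instead of fully unserializing it, and assumes the rows have already been fetched with whatever other criteria apply; the helper rating_of and the sample rows are made up.

```python
import heapq
import re

# Matches the integer stored under the key "rating" (not "total_rating").
RATING = re.compile(r's:6:"rating";i:(-?\d+);')

def rating_of(serialized):
    """Return the integer 'rating' value, or None if the key is missing."""
    match = RATING.search(serialized)
    return int(match.group(1)) if match else None

# (id, serialized rating column) pairs, as if already selected from the db.
rows = [
    (1, 'a:3:{s:12:"total_rating";i:18;s:6:"rating";i:3;s:13:"total_ratings";i:6;}'),
    (2, 'a:3:{s:12:"total_rating";i:25;s:6:"rating";i:5;s:13:"total_ratings";i:5;}'),
    (3, 'a:3:{s:12:"total_rating";i:8;s:6:"rating";i:4;s:13:"total_ratings";i:2;}'),
    (4, 'a:3:{s:12:"total_rating";i:2;s:6:"rating";i:1;s:13:"total_ratings";i:2;}'),
]

# Keep only rows with a rating, then take the three largest (rating, id) pairs.
rated = [(rating_of(data), row_id) for row_id, data in rows]
top3 = heapq.nlargest(3, [pair for pair in rated if pair[0] is not None])
```

This still reads every candidate row, so it only stays practical while the other criteria keep the selected set small.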