Let's say you've got a table with a timestamp column, and you want to parse that column into two arrays - $date and $time.
Do you, personally:
a) query like this: DATE(timestamp), TIME(timestamp), or perhaps even go as far as HOUR(timestamp), MINUTE(timestamp)
b) grab the timestamp column and parse it out as needed with a loop in PHP
I feel like (a) is easier... but I know that I don't know anything. And it feels a little naughty to make my query hit the same column 2 or 3 times for output...
Is there a best-practice for this?
(a) is probably fine, if it is easier for your code base. I am a big fan of not having to write extra code that is not needed and I love only optimizing when necessary. To me pulling the whole date and then parsing seems like premature optimization.
Always remember that SQL servers have a whole lot of smarts in them for optimizing queries so you don't have to.
So go with a). If you find it is dog slow or causes problems, then switch to b). I suspect that a) will do all you want and you will never think about it again.
I would personally do (b). You're going to be looping the rows anyway, and PHP's strtotime() and date() functions are so flexible that they will handle most of the date/time formatting issues you run into.
I tend to try to keep my database result sets as small as possible, just so I don't have to deal with lots of array indexes after a database fetch. I'd much rather take a single timestamp out of a result row and turn it into several values in PHP than have to deal with multiple representations of the same data in a result row, or edit my SQL queries to get specific formatting.
b) is what I follow and I use it every time. It also gives you the flexibility of being able to control how you want it to appear in your front end. Think about this: if you are following a) and you want to make a change, you will need to change all the queries manually. But if you are using b) you can just call a function on this value (from the DB) and you are good to go. If you ever need to change anything, just change it within this function and voilà! Doesn't that sound like a time saver to you?
Hope that helps.
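For what it's worth, a minimal sketch of approach b) might look like the following. The helper name splitTimestamp and the sample rows are made up for illustration, and it assumes the column comes back as a 'YYYY-MM-DD HH:MM:SS' string.

```php
<?php
// Hypothetical helper: splits a MySQL-style timestamp string into its
// date and time parts. Change the formats here and every caller follows.
function splitTimestamp(string $timestamp): array
{
    $dt = new DateTimeImmutable($timestamp);
    return [
        'date' => $dt->format('Y-m-d'),
        'time' => $dt->format('H:i:s'),
    ];
}

// $rows stands in for the result of a database fetch (sample data).
$rows = [
    ['timestamp' => '2010-03-15 14:30:05'],
    ['timestamp' => '2010-03-16 09:00:00'],
];

$date = [];
$time = [];
foreach ($rows as $row) {
    $parts = splitTimestamp($row['timestamp']);
    $date[] = $parts['date'];
    $time[] = $parts['time'];
}
```

Since all the formatting lives in one function, switching to something like 'd/m/Y' later is a one-line change.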
I would also use b). I think it is important that, if at some point I need the names of the days or the months in another language, I can use PHP's locale support to translate them into the given language. That wouldn't be the case with a).
If you need it in the SQL query itself (e.g. in a WHERE, GROUP BY, ORDER BY, etc), then way a) is preferred. If you rather need it in the code logic (PHP or whatever), then way b) is preferred.
If your PHP code actually does a task which can be done just as well in SQL, then I'd go for SQL as well. In other words, way b) is only preferred if you are going to format the date for pure display purposes.
I think it boils down to this, do you feel more at home writing php code or mysql queries?
I think this is more a question of coding style than technical feasibility, and you get to choose your style.
I have a MySQL database that has large amount of records. For each record there is a field of text called "Comment" and I've put 3 examples below:
"Very fast payment, thank you. "
"love the thank you"
"Fast delivery thank u "
My question is this:
How do I go through each record, look at the contents of the "Comment" field, and then work out what the top 20 words used are?
For example using the 3 comments above the words
"thank" appears 3 times,
"Fast" 2 times
And the rest of the words used are only used once.
I am guessing that I'll need to use PHP to work through each record, explode out using a " " (space), remove characters like commas & full stops, then store the results and then count those.
But I am really not sure on the best approach and not sure how to handle plurals such as "thanks" & "thank". Hence the question :)
Matt
Because they are all in the one column you can't really do much SQL filtering here.
If the data set isn't too huge (i.e. php running out of memory huge) then you should be able to read it into php and process it.
You can use explode to split on spaces and work with the data as a huge array. And you can use preg_match function to do string compare operations, see: http://us3.php.net/preg_match - you should spend some time investigating regular expressions.
It would be easier to use the SQL LIKE operator in the WHERE clause if you were looking for something specific, like SELECT COUNT(comment) FROM ... WHERE comment LIKE '%thank%', but you would have to do that manually for each word.
Also, you may want to consider dumping it out to a file and using unix-based commands like wc which can help you with what you are after. You can also use PHP to interact with these commands if you are in a unix-like environment.
Short of writing the code there isn't much more I can tell you.
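That said, a rough sketch of the explode-and-count idea could look like this. The function name topWords is made up, and the comments array stands in for rows read from the database. It splits with preg_split rather than explode so punctuation is dropped in the same step, and it does not try to merge plurals such as "thanks" and "thank"; that would need a stemming step on top.

```php
<?php
// Hypothetical helper: returns the $limit most frequent words across all
// comments as [word => count], highest count first.
function topWords(array $comments, int $limit = 20): array
{
    $counts = [];
    foreach ($comments as $comment) {
        // Lower-case, then split on anything that is not a letter,
        // digit or apostrophe. Empty pieces are discarded.
        $words = preg_split("/[^a-z0-9']+/", strtolower($comment), -1, PREG_SPLIT_NO_EMPTY);
        foreach ($words as $word) {
            $counts[$word] = ($counts[$word] ?? 0) + 1;
        }
    }
    arsort($counts); // sort by count, descending, keeping the word keys
    return array_slice($counts, 0, $limit, true);
}

// Sample data: the three comments from the question.
$comments = [
    'Very fast payment, thank you. ',
    'love the thank you',
    'Fast delivery thank u ',
];
$top = topWords($comments);
```

For a large table you would feed the rows in batches instead of loading every comment into one array.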
Possible, perhaps. However, MySQL is not really good for this type of querying. If you did attempt this using MySQL, it is likely to take a long time to actually complete and would not be practical if you wanted to run this type of query frequently.
I'd suggest you look into indexing your data using something that is specifically designed for this kind of query. Some kind of Apache Lucene derivative would do nicely; for example, you could use Elasticsearch. Here are the docs from ES that describe the kind of query you are looking to run: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-facets-terms-facet.html
Unlike MySQL, something like ES would execute these kinds of queries very quickly, as it is specifically designed for them.
I will try to make this question as clear as I can, but bear in mind that English is not my first language. I have a web application written in PHP using a MySQL database. In a table I might have thousands of entries, and in each entry I am storing this data:
$hourly_rate,
$minutes
When I process my table through a loop, I calculate the net value using the following formula:
$net_value = $minutes*($hourly_rate/60);
Now the question: should I instead add a $net_value field to my table, calculate the net value on the client side using jQuery, and then upload the result of the calculation to the $net_value field? Which one do you think is the best approach, considering I might have 1000 users accessing the system at the same time?
Thank you for your help,
Donato
It depends on how important the value is. If it's accessed all the time and by a lot of people it may be worth storing in the database.
But I don't suggest using jQuery to do the calculation, do it server-side for better security.
Generally I wouldn't store such simple calculated values in the database. Doing that calculation in PHP takes so little time that it isn't even worth thinking about.
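To illustrate, the server-side version is only a few lines. This is just a sketch: the function name netValue is made up, and rounding to two decimals is my assumption, not something stated in the question.

```php
<?php
// Hypothetical helper using the formula from the question.
// Rounding to 2 decimal places is an assumption for currency-like values.
function netValue(float $hourlyRate, int $minutes): float
{
    return round($minutes * ($hourlyRate / 60), 2);
}

// Example: 90 minutes at an hourly rate of 30.
$net_value = netValue(30.0, 90);
```

Keeping it in one function also means every page that needs the figure computes it the same way.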
There is a good reason to store the calculation in the database. The good reason is that this calculation may be used in more than one place in the application.
I would recommend that you create a view, something like:
create view vw_rates as
select t.*, minutes*(hourly_rate/60) as net_value
from t
By putting it in a view, everyone will be using the same definition. So a report that summarizes by region or time, for example, would use the same definition. In other databases, you can do the same thing using computed columns; MySQL has also supported these, as generated columns, since version 5.7.
From a performance perspective, such a simple calculation on such a small amount of data probably does not make a difference. Do remember, though, that the database can do these types of calculations in parallel if you have multiple threads/processes.
what will be faster?
SELECT * FROM
or
SELECT specified FROM
background: the table has one field (specified), which is also the primary key
In your particular case it may very well be the same, but as a matter of good practice, you should always specify the columns you want.
In addition to the various good reasons Dark Falcon put in a comment, it also creates a form of self-documentation in your application code, since it's directly in the query each field you're expecting.
As a matter of good practice, it's usually better to explicitly specify the columns you want, regardless of the performance implications you're concerned about in this question.
But in general, the answer will depend heavily on your version of mysql. Profile it and see:
explain select * from ...;
explain select specified from ...;
I suspect strongly that this is a case of premature optimization, and that you don't really need to know which is faster.
Imho the explicit version will be faster, because MySQL doesn't need to look up what fields the table contains.
Depending on table structure (including indexes) it may not make a difference -- running some benchmarks and using EXPLAIN SELECT to see where things can be improved will help you along the way. But in general, if you know you only want n fields, only select n fields.
Just specify it, in case in the future more columns are added, but you don't want to retrieve all of them. In any case it is better to be specific.
Write a console app with 2 functions doing the two methods and loop 1000 times on each and print out the average time it took. This would be your fastest way to test the performance.
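A bare-bones version of that console test, sketched in PHP, could look like this. The helper name averageMs is made up, and the two closures are placeholders doing dummy work: swap in code that actually runs your two queries. It assumes PHP 7.3 or later for hrtime().

```php
<?php
// Hypothetical helper: average wall-clock time of $fn over $runs calls,
// in milliseconds.
function averageMs(callable $fn, int $runs = 1000): float
{
    $start = hrtime(true); // monotonic clock, in nanoseconds
    for ($i = 0; $i < $runs; $i++) {
        $fn();
    }
    return (hrtime(true) - $start) / $runs / 1e6;
}

// Placeholders only: replace the bodies with the "SELECT * FROM ..." and
// "SELECT specified FROM ..." queries you want to compare.
$selectStar = function () { return array_sum(range(1, 100)); };
$selectSpecified = function () { return array_sum(range(1, 10)); };

printf("SELECT *:         %.4f ms\n", averageMs($selectStar));
printf("SELECT specified: %.4f ms\n", averageMs($selectSpecified));
```

Keep in mind that the query cache, network latency and connection setup can easily swamp the difference you are trying to measure, so run it a few times.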
Generally it's better, and I think faster, to specify the columns in your SQL query so you avoid fetching data you won't use.
The "select * from" format is, imho, just a shortcut for when you are querying as a DBA and want a quick glance at the table. Even though it will work in application code, I wouldn't recommend it. Listing the columns keeps you, as a programmer, from having to go back and forth to your DB to see which column you want to use or query for. It keeps you in one spot. That's just me, though; this really is up to you and how you want to program.
You are looking at this bass ackwards. Do you need the content of the column or not, if you don't and you get it, that will take longer than not.
Parsing the sql to check the column names, order them and potentially alias them, is trivial compared to flooding the network transferring loads of stuff you don't need.
I am developing an URL bookmark application (PHP and MySQL) for myself. With this application I will store URLs in MySQL.
The question is, should I store URLs in a TEXT column or should I first parse the URL and store its components (host, path, query, fragment) in separate columns in one table? The latter one also gives me the chance of generating statistical data by grouping servers and etc. Or maybe I should store servers in a separate table and use JOIN. What do you think?
Thanks.
I'd go with storing them in TEXT columns to start. As your application grows, you can build up the parsing and analysis functionality if you really want to. From what it sounds like, it's all just pie-in-the-sky functionality right now. Do what you need to get the basic application up and running first so that you have something to work with. You can always refactor and go from there.
The answer depends on how you like to use this data in the future.
If you like to analyze the different parts of the URL splitting them is the way to go.
If not, the INSERT as well as the SELECT will be faster if you store them in just one field.
If you know the URLs are no longer than 255 characters, VARCHAR(255) will be better than TEXT for performance reasons.
If you seriously think that you're going to be using it for getting interesting data, then sure, do it as a series of columns. Honestly, I'd say it'd probably just be easier to do it as a single column though.
Also, don't forget that it's easy for you to convert back and forth if you want to later. Single to multiple is just a SELECT; regex; INSERT [into another table]; multiple to single is just an INSERT ... SELECT with CONCAT.
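If you do decide to split the URL in PHP, you may not even need a regex, since the built-in parse_url() already returns the components the question lists. A small sketch, where the function name urlColumns and the example URL are made up:

```php
<?php
// Hypothetical helper: splits a URL into the columns mentioned in the
// question. Missing components come back as null.
function urlColumns(string $url): array
{
    $parts = parse_url($url);
    if ($parts === false) {
        // parse_url() returns false for seriously malformed URLs
        throw new InvalidArgumentException("Cannot parse URL: $url");
    }
    return [
        'host'     => $parts['host'] ?? null,
        'path'     => $parts['path'] ?? null,
        'query'    => $parts['query'] ?? null,
        'fragment' => $parts['fragment'] ?? null,
    ];
}

// Example with a made-up bookmark.
$cols = urlColumns('http://www.example.com/docs/page.php?id=42&lang=en#top');
```

You could keep the full URL in its single column and fill the extra columns from this only once you actually need the statistics.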
I have a page that will pull many headlines from multiple categories based off a category id.
I'm wondering if it makes more sense to pull all the headlines and then sort them out via PHP if/elseif statements, or if it is better to run multiple queries, one for the headlines of each category.
Why not do it in one query? Something like:
SELECT headline FROM headlines WHERE category_id IN (1, 2, 3, ...);
If you filter your headlines in PHP, think how many you'll be throwing away. If you end up with removing just 10% of the headlines, it won't matter as much as when you'd be throwing away 90% of the results.
These kinds of questions are always hard to answer because the situation determines the best course. There is never a truly correct answer, only better ways. In my experience it doesn't really matter whether you do the work in PHP or in the database, because you should always try to cache the results of any expensive operation using a caching engine such as memcached. That way you are not going to spend a lot of time in the DB or in PHP itself, since the results will be cached and ready for use almost instantly. When it comes down to it, unless you profile your application using a tool like Xdebug, what you think are your performance bottlenecks are just guesses.
It's usually better not to overload the DB, because you might cause a bottleneck if you have many simultaneous queries.
Handling your processing in PHP, on the other hand, usually scales better, as Apache will start additional processes or threads as it needs them to handle multiple requests.
As usual, it all comes down to: "How much traffic is there?"
MySQL can already do the selecting and ordering of the data for you. I suggest to be lazy and use this.
Also, I'd look for a single query that fetches all the categories and their headlines at once. Would an ORDER BY category, publishdate or something similar do?
Every trip to the database costs you something. Returning extra data that you then decide to ignore costs you something. So you're almost certainly better to let the database do your pruning.
I'm sure one could come up with some case where deciding what data you need makes the query hugely complex and thus difficult for the database to optimize, while you could do it in your code easily. But if we're talking about "select headline from story where category='Sports'" followed by "select headline from story where category='Politics'" then "select headline from story where category='Health'" etc, versus "select category, headline from story where category in ('Health','Sports','Politics')", the latter is clearly better.
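With the single-query version, sorting the rows into categories on the PHP side needs no if/elseif chain at all. A minimal sketch, where $rows stands in for the fetched result set and the headlines are invented sample data:

```php
<?php
// $rows stands in for the result of the single
// "select category, headline from story where category in (...)" query.
// The rows below are made-up sample data.
$rows = [
    ['category' => 'Sports',   'headline' => 'Local team wins final'],
    ['category' => 'Politics', 'headline' => 'Council votes on budget'],
    ['category' => 'Sports',   'headline' => 'Marathon route announced'],
    ['category' => 'Health',   'headline' => 'Flu season starts early'],
];

// Group the headlines by category in one pass.
$byCategory = [];
foreach ($rows as $row) {
    $byCategory[$row['category']][] = $row['headline'];
}

// Each category's list is now ready for its own block on the page.
foreach ($byCategory as $category => $headlines) {
    echo $category, ': ', implode(' | ', $headlines), "\n";
}
```

That is one trip to the database, no discarded rows, and the grouping costs a single loop.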
On the topic of "Faster to query in MySQL or to use PHP logic", which is how I ended up on this question 10 years later: I have determined that the correct answer is "it depends". There are just too many examples where using the DB saves processing time over writing PHP code... but there are just as many examples where writing PHP code saves time on excessively complex MySQL queries.
There is no right answer here. If you end up here, like I did, then the best I can suggest is to try to solve your problem with the skills that you have. Start with the query and try to solve it there; if you run into issues, then start thinking about just gathering the data and running the logic through PHP code to come up with a solution.
At the end of the day, you need to solve a problem... if you solve it, but it's not fast enough, then that's another problem... work on optimizing, which may end up meaning that you go back to writing more MySQL logic.
Use the 80/20 rule and try to get things 80% of the way there as quickly as possible. You can go back and optimize once it's workable. Spending all your effort on making it perfect the first time will surely mean you miss your deadline.
That's my $0.02