Some Tips to properly number SQL queries - php

As an amateur PHP developer I often have this problem of mixing up sequence numbers and variables.
SHORT:
What tips should I keep in mind when writing a webpage that has many SQL queries?
I have thought of making a helper function, but I am not sure whether it would do any good.
LONG:
The problem is that I have a PHP page with many SQL queries, each followed by a result, row, row count, and a die() if the query is unsuccessful.
I number them sql1, result1, row1, error1, mysqloutput1, then sql2... and so on.
When I add another query five or six days later, I have to go through the whole code to find out which number was used last and take the next one; many times I just reuse an existing one, and that creates strange problems.
The same problem applies to variables: since the same page loads again and again with different POST and GET ids, keeping track of variables is just too messy.
So what tips would keep the code well sequenced and readable, and the variable names unique and understandable?
I have thought of writing a function that takes the SQL as input and gives the result, row, row count, and error as output.
What would you experienced people suggest?
Thanks.

First of all, you should really try to use relevant names for your variables.
For example, if you are looking up all the users, use variables like $sqlUsers, $queryUsers, $resultUsers/$usersArray.
The idea of a wrapper function would be one way to tackle the problem, as it is then no longer possible to introduce errors in the querying process itself, so if you do not want to switch to object-oriented programming I would probably choose that route.
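For example, such a helper could look roughly like this (a minimal sketch using mysqli; the function name and the shape of the returned array are only one way to do it):

<?php
// Sketch of a query helper: one place for the result/row/count/error handling,
// so no more numbered $sql1/$result1/$row1 variables. The function name and
// the returned array keys are only a suggestion.

function runQuery(mysqli $db, string $sql): array
{
    $result = $db->query($sql);

    if ($result === false) {
        // same effect as the die() calls described in the question
        die('Query failed: ' . $db->error);
    }

    if ($result === true) {
        // INSERT/UPDATE/DELETE: no result set to fetch
        return ['rows' => [], 'row' => null, 'row_count' => $db->affected_rows, 'error' => ''];
    }

    $rows = $result->fetch_all(MYSQLI_ASSOC);   // every row as an associative array

    return [
        'rows'      => $rows,
        'row'       => $rows[0] ?? null,        // first row, if any
        'row_count' => $result->num_rows,
        'error'     => '',
    ];
}

// Usage: descriptive names instead of counters.
$db    = new mysqli('localhost', 'user', 'pass', 'mydb');
$users = runQuery($db, 'SELECT id, name FROM users');
echo $users['row_count'] . " users found\n";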
Another way would be a mix: you could create just a few classes that handle the database interactions for specific domains. For example, a class userDataHandler would have the methods getUserById(), getAllUsers(), et cetera. This way you would still have to write the querying process, but you know exactly where your queries are, you have them in a structured way, there are no problems with naming the variables inside the methods, and your code doesn't get messy because you have different files for different domains -> separation of domain-specific code.
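A rough sketch of that per-domain class idea (the class and method names follow the ones above; the PDO usage and the users table are my assumptions):

<?php
// Rough sketch of the per-domain class idea; the class and method names follow
// the answer above, while the PDO usage and the `users` table are assumptions.

class UserDataHandler
{
    private $pdo;

    public function __construct(PDO $pdo)
    {
        $this->pdo = $pdo;
    }

    public function getUserById($id)
    {
        $stmt = $this->pdo->prepare('SELECT * FROM users WHERE id = ?');
        $stmt->execute([(int) $id]);
        $row = $stmt->fetch(PDO::FETCH_ASSOC);

        return $row === false ? null : $row;
    }

    public function getAllUsers()
    {
        return $this->pdo
            ->query('SELECT * FROM users ORDER BY name')
            ->fetchAll(PDO::FETCH_ASSOC);
    }
}

// Usage: every user-related query lives in this one class/file.
$pdo   = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');
$users = new UserDataHandler($pdo);
$me    = $users->getUserById(42);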
Can you explain your problem with the request variables ($_POST and $_GET) a little more? Why does your site get called with so many different variables? Maybe you could structure your application a little better.

Related

Should a very long function/series of functions be in one php file, or broken up into smaller ones?

At the moment I am writing a series of functions for fetching Dota 2 matches from the Steam API. When someone fetches their games, I have to (for my use) take a history of all of their games (let's say 3 API calls), then all the details from each of those games (so if there are 200 games, another 200 API calls). This takes a long time, and so far I'm programming all of the above in one PHP file, "FetchMatchHistory.php", which is run by the user clicking a button on the web page.
Another thing that makes me feel it should be in one file is that I imagine it is probably good practice to put all of the information (in this case, match history, match details, IDs, etc.) into the database at once, so that there don't have to be null values in the database.
My question is whether a function that takes a very long time should be in just one PHP file ("should" meaning: is generally considered good practice), or whether I should break the separate functions down into smaller files. This is very context dependent, I know, so please forgive me.
Is it common to have API calls spanning several PHP files if that is what you are making? Is there a security/reliability issue with having only one file doing all the leg-work (so to speak)?
Good practice is to group related functions together in a PHP file that describes them, both to organize them better and for caching reasons, since some parts get updated less often than others.
But speaking of performance, I doubt you'll get the improvements you seek just by moving code between files.
Personally, I used to have the habit of putting everything in one file, which consistently led to:
fat files
hard-to-update code
hard-to-read code
trouble finding the thing I want (Ctrl+F meltdown)
wasted bandwidth uploading parts that did not need to be updated
virtually no caching on the server
I don't know if any of the above applies to your app, but breaking the code up into relevant files/places made my life easier.
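For example, the fetcher could be split up roughly like this (every file and function name here is invented for illustration, none of it comes from the Steam API):

<?php
// FetchMatchHistory.php - hypothetical entry point. Every file and function
// name below is invented for illustration; none of it comes from the Steam API.

require_once __DIR__ . '/steam_api.php';   // the HTTP calls to the Steam Web API
require_once __DIR__ . '/match_db.php';    // all the INSERT/UPDATE logic

$accountId = (int) ($_POST['account_id'] ?? 0);

$matchIds = fetchMatchHistory($accountId);     // defined in steam_api.php
foreach ($matchIds as $matchId) {
    $details = fetchMatchDetails($matchId);    // defined in steam_api.php
    saveMatch($details);                       // defined in match_db.php
}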
UPDATE:
About the database practice: you should only query the parts you actually want to update.
I don't understand why you would split that logic across files; that is not going to give you performance. What will give you performance is updating only the relevant parts and having tables with relevant content. Multiple tables make a lot more sense, since you can use them as pointers to the large data contained in other tables, reducing the waste you would get from having just one table.
Also, don't forget that a single table has limitations; I personally try to have as few columns as possible. Keep adding more and one day you can't add any more because of the row-size limit. There is also a maximum number of columns in general, but developers rarely hit that limit; it's the growing per-row content that eats up the limit first.
Whether to split server-side code into multiple files or keep it in a single one is an organizational issue more than a security/reliability one...
I don't think it's more secure to keep your code in separate source files.
It's entirely a matter of how you prefer to organize and maintain your code base.
Usually, I separate it when I can find some kind of "categories" in my code.
Obviously, if you write OO code, the most common choice is to keep each class in a single file...

Do I have too many queries for lookup tables and can anyone suggest an alternative?

I am currently building a codeigniter application that handles a specific type of mammal. When a user is adding a new record (mammal), they are given lists of 'breed types', 'genders', etc. Those are stored in separate database tables.
Currently, to get these, I have separate functions such as:
$this->Mammal->get_list_of_breeds()
$this->Mammal->get_list_of_genders()
Each of these runs a query, and there may be 7 or 8 more different lookups for me to query. Does anyone know if this will significantly slow down my application or cause too many queries on the database? For the most part, the max number of records in any individual table is under 300.
Is there a better way I can be doing this by consolidating the queries into a single function and using php to split the lookup fields?
Any ideas or thoughts are greatly appreciated.
One idea is to take some of the smaller sets of options and put them in arrays, especially if they cannot be changed by the user. Gender, for example, could probably just be in an array. As far as I know, there are only two options. If there are any other similar option sets you could make those arrays too.
But, even 300 records is not a huge amount of data. I take it you aren't building the next Facebook, so just making several clean queries to get the options you need probably won't be a big deal.
Personally, I wouldn't put it all in one table. Big generic tables just seem kind of hokey, and you would still be getting the same amount of data. You could have separate tables and accomplish the same thing by UNIONing the queries.
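If you do want a single round trip, a UNION of the lookup tables split apart in PHP might look roughly like this (assuming each lookup table has id and name columns and an existing PDO connection in $pdo; adapt to CodeIgniter's database layer as needed):

<?php
// One round trip for all the lookup lists, split apart in PHP afterwards.
// Assumes each lookup table has `id` and `name` columns and an existing PDO
// connection in $pdo; the table names are examples.

$sql = "
    SELECT 'breed'  AS lookup, id, name FROM breeds
    UNION ALL
    SELECT 'gender' AS lookup, id, name FROM genders
    UNION ALL
    SELECT 'color'  AS lookup, id, name FROM colors
";

$lookups = [];
foreach ($pdo->query($sql, PDO::FETCH_ASSOC) as $row) {
    $lookups[$row['lookup']][$row['id']] = $row['name'];
}

// $lookups['breed'], $lookups['gender'], ... are now ready to feed the form.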
As you commented yourself, yes indeed you should put everything into one table...
So you'd have a table called mammals
And then you'd have the fields: gender, breeds etc...
Now this is a lot easier when programming in PHP, since you can do one query and then display everything, like this:
$query="SELECT * FROM `mammals`";
$query_exec=mysql_query($query);
while($result=mysql_fetch_array($query_exec))
{
print "gender: ".$result['gender']." breed: ".$result['breed'];
}
Little explanation:
The query gets everything from the table called mammals
Then the while just continues as long as there are still results in the array
The fetch_array call puts each row's data into the variable, and every field can be read via $result['fieldname']
I know this is not a very clear explanation, but my mind also isn't the clearest at this late hour :/

Parsing timestamps - do it in MySQL or in PHP?

Let's say you've got a table with a timestamp column, and you want to parse that column into two arrays - $date and $time.
Do you, personally:
a) query like this: DATE(timestamp), TIME(timestamp), or perhaps even going as far as HOUR(timestamp), MINUTE(timestamp)
b) grab the timestamp column and parse it out as needed with a loop in PHP
I feel like (a) is easier... but I know that I don't know anything. And it feels a little naughty to make my query hit the same column 2 or 3 times for output...
Is there a best-practice for this?
(a) is probably fine if it is easier for your code base. I am a big fan of not writing extra code that is not needed, and I love only optimizing when necessary. To me, pulling the whole timestamp and then parsing it yourself seems like premature optimization.
Always remember that SQL servers have a whole lot of smarts in them for optimizing queries so you don't have to.
So go with a); if you find it is dog slow or causes problems, then go to b). I suspect that a) will do all you want and you will never think about it again.
I would personally do (b). You're going to be looping the rows anyway, and PHP's strtotime() and date() functions are so flexible that they will handle most of the date/time formatting issues you run into.
I tend to try to keep my database result sets as small as possible, just so I don't have to deal with lots of array indexes after a database fetch. I'd much rather take a single timestamp out of a result row and turn it into several values in PHP than have to deal with multiple representations of the same data in a result row, or edit my SQL queries to get specific formatting.
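A rough sketch of that approach (b), assuming an events table with a created_at column and an existing PDO connection (all names are illustrative):

<?php
// Approach b): pull the raw timestamp and split it in PHP.
// The `events` table, the `created_at` column and the $pdo connection are assumptions.

$date = [];
$time = [];

foreach ($pdo->query('SELECT created_at FROM events', PDO::FETCH_ASSOC) as $row) {
    $ts     = strtotime($row['created_at']);   // parse the DB string once
    $date[] = date('Y-m-d', $ts);
    $time[] = date('H:i:s', $ts);
}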
b) is what I follow, and I use it every time. It also gives you the flexibility of controlling how the value appears in your front end. Think about this: if you are following a) and you want to make a change, you will need to change all the queries manually. But if you are using b), you can just call a function on the value (from the DB) and you are good to go. If you ever need to change anything, just change it within this function and voila! Doesn't that sound like a time saver?
Hope that helps.
I would also use b). It is important to me that, if I at some point need the names of the days or the months in another language, I can use PHP's locale support to translate them into the given language; that wouldn't be the case with a).
If you need it in the SQL query itself (e.g. in a WHERE, GROUP BY, ORDER BY, etc.), then approach a) is preferred. If you rather need it in the code logic (PHP or whatever), then approach b) is preferred.
If your PHP code does a task that could be done just as well in SQL, then I'd go for SQL as well. In other words, approach b) is only preferred if you are formatting the date purely for display.
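For example, approach a) is the natural fit when the split has to happen inside the query itself, such as grouping rows per day (the table and column names below are made up; $pdo is an existing connection):

<?php
// Approach a) when the split is needed inside the query itself, e.g. grouping
// rows per day. Table and column names are made up; $pdo is an existing connection.

$sql = "SELECT DATE(created_at) AS day, COUNT(*) AS hits
        FROM events
        GROUP BY DATE(created_at)
        ORDER BY day";

foreach ($pdo->query($sql, PDO::FETCH_ASSOC) as $row) {
    echo $row['day'] . ': ' . $row['hits'] . "\n";
}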
I think it boils down to this: do you feel more at home writing PHP code or MySQL queries?
I think this is more a question of coding style than technical feasibility, and you get to choose your style.

Faster to query in MYSQL or to use PHP logic

I have a page that will pull many headlines from multiple categories based off a category id.
I'm wondering whether it makes more sense to pull all the headlines and then sort them out via PHP if/elseif statements, or whether it is better to run multiple queries that each fetch the headlines from one category.
Why not do it in one query? Something like:
SELECT headline FROM headlines WHERE category_id IN (1, 2, 3, ...);
If you filter your headlines in PHP, think about how many you'll be throwing away. If you end up removing just 10% of the headlines, it won't matter as much as when you'd be throwing away 90% of the results.
These kinds of questions are always hard to answer because the situation determines the best course. There is never a truly correct answer, only better ways. In my experience, it doesn't really matter whether you do the work in PHP or in the database, because you should always try to cache the results of any expensive operation using a caching engine such as memcached. That way you won't spend a lot of time in the DB or in PHP itself, since the results will be cached and ready instantaneously for use. When it comes down to it, unless you profile your application using a tool like Xdebug, what you think are your performance bottlenecks are just guesses.
It's usually better not to overload the DB, because you might cause a bottleneck if you have many simultaneous queries.
However, handling your processing in PHP is usually better, as Apache will spawn additional processes or threads as needed to handle multiple requests.
As usual, it all comes down to: "How much traffic is there?"
MySQL can already do the selecting and ordering of the data for you, so I suggest being lazy and using that.
Also, I'd look for a single query that fetches all the categories and their headlines at once. Would an ORDER BY category, publishdate or something similar do?
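A sketch of that single-query idea, with PHP grouping the rows by category afterwards (the table and column names and the $pdo connection are assumptions):

<?php
// One query for all the wanted categories, ordered so PHP can group the
// headlines without any if/elseif chains. Table/column names and $pdo are assumptions.

$ids = [1, 2, 3];                                       // the category ids for this page
$in  = implode(',', array_fill(0, count($ids), '?'));   // "?,?,?"

$stmt = $pdo->prepare(
    "SELECT category_id, headline
     FROM headlines
     WHERE category_id IN ($in)
     ORDER BY category_id, publish_date DESC"
);
$stmt->execute($ids);

$byCategory = [];
foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
    $byCategory[$row['category_id']][] = $row['headline'];
}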
Every trip to the database costs you something. Returning extra data that you then decide to ignore costs you something. So you're almost certainly better to let the database do your pruning.
I'm sure one could come up with some case where deciding what data you need makes the query hugely complex and thus difficult for the database to optimize, while you could do it in your code easily. But if we're talking about "select headline from story where category='Sports'" followed by "select headline from story where category='Politics'" then "select headline from story where category='Health'" etc, versus "select category, headline from story where category in ('Health','Sports','Politics')", the latter is clearly better.
On the topic of "Faster to query in MYSQL or to use PHP logic", which is how I ended up on this question 10 years later: I have determined that the correct answer is "it depends". There are just too many examples where using the DB saves processing time over writing PHP code... but there are just as many examples where writing PHP code saves time over excessively complex MySQL queries.
There is no right answer here. If you end up here, like I did, then the best I can suggest is to try to solve your problem with the skills that you have. Start with the query and try to solve it there; if you run into issues, then start thinking about just gathering the data and running the logic through PHP code to come up with a solution.
At the end of the day, you need to solve a problem... if you solve it but it's not fast enough, then that's another problem: work on optimizing, which may end up meaning that you go back to writing more MySQL logic.
Use the 80/20 rule and try to get things 80% of the way there as quickly as possible. You can go back and optimize once it's workable. Spending all your effort on making it perfect the first time will surely mean you miss your deadline.
That's my $0.02.

How many MySQL queries should I limit myself to on a page? PHP / MySQL

Okay, so I'm sure plenty of you have built crazy database-intensive pages...
I am building a page that I'd like to pull all sorts of unrelated database information from. Here are some sample different queries for this one page:
article content and info
IF the author is a registered user, their info
UPDATE the article's view counter
retrieve comments on the article
retrieve information for the authors of the comments
if the reader of the article is signed in, query for info on them
etc...
I know these are basically going to be lightning quick, and that I could combine some, but I wanted to make sure this isn't abnormal.
How many fairly normal and un-heavy queries would you limit yourself to on a page?
As many as needed, but not more.
Really: don't worry about optimization (right now). Build it first, measure performance second, and IFF there is a performance problem somewhere, then start with optimization.
Otherwise, you risk spending a lot of time on optimizing something that doesn't need optimization.
I've had pages with 50 queries on them without a problem. A fast query to a non-large (ie, fits in main memory) table can happen in 1 millisecond or less, so you can do quite a few of those.
If a page loads in less than 200 ms, you will have a snappy site. A big chunk of that is being used by latency between your server and the browser, so I like to aim for < 100ms of time spent on the server. Do as many queries as you want in that time period.
The big bottleneck is probably going to be the amount of time you have to spend on the project, so optimize for that first :) Optimize the code later, if you have to. That being said, if you are going to write any code related to this problem, write something that makes it obvious how long your queries are taking. That way you can at least find out you have a problem.
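Something as small as this would do for the timing part (the wrapper name is made up; $pdo is assumed to be an existing PDO connection):

<?php
// A throwaway way to see how long each query takes, as suggested above.
// The wrapper name is made up; $pdo is assumed to be an existing PDO connection.

function timedQuery(PDO $pdo, string $sql): array
{
    $start = microtime(true);
    $rows  = $pdo->query($sql)->fetchAll(PDO::FETCH_ASSOC);
    $ms    = (microtime(true) - $start) * 1000;

    error_log(sprintf('%.1f ms  %s', $ms, $sql));   // written to the PHP error log
    return $rows;
}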
I don't think there is any one correct answer to this. I'd say as long as the queries are fast, and the page follows a logical flow, there shouldn't be any arbitrary cap imposed on them. I've seen pages fly with a dozen queries, and I've seen them crawl with one.
Every query requires a round-trip to your database server, so the cost of many queries grows larger with the latency to it.
If it runs on the same host there will still be a slight speed penalty, not only because a socket sits between your application and the database, but also because the server has to parse your query, build the response, check access, and handle whatever other overhead comes with SQL servers.
So in general it's better to have less queries.
You should try to do as much as possible in SQL, though: don't fetch data as input for some algorithm in your client language when the same algorithm could be implemented without hassle in SQL itself. This will not only reduce the number of your queries but also help a great deal in selecting only the rows you need.
Piskvor's answer still applies in any case.
WordPress, for instance, can run up to 30 queries per page. There are several things you can use to reduce the load on MySQL, one of them being memcached, but for now, as you say, if it's going to be straightforward, just make sure all the data you pull is properly indexed in MySQL and don't worry much about the number of queries.
If you're using a framework (CodeIgniter, for example), you can generally pull page-generation timing data and check what's slowing your site down.
As others have said, there is no single number. Whenever possible, use SQL for what it was built for and retrieve sets of data together.
Generally, an indication that you may be doing something wrong is when you have a SQL query inside a loop.
When possible, use joins to retrieve data that belongs together instead of sending several separate statements.
Always try to make sure your statements retrieve exactly what you need with no extra fields/rows.
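For example, instead of one query for the comments and then one query per comment author (a query inside a loop), a single JOIN fetches both (the table and column names, $pdo and $articleId are guesses based on the question above):

<?php
// Instead of one query for the comments and then one query per comment author
// (a query inside a loop), a single JOIN fetches both. The table and column
// names, $pdo and $articleId are assumptions based on the question above.

$stmt = $pdo->prepare(
    'SELECT c.body, c.created_at, u.username, u.avatar
     FROM comments c
     JOIN users u ON u.id = c.user_id
     WHERE c.article_id = ?
     ORDER BY c.created_at'
);
$stmt->execute([$articleId]);
$comments = $stmt->fetchAll(PDO::FETCH_ASSOC);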
If you need the queries, you should just use them.
What I always try to do is have them all executed at once in the same place, so that different parts of the page (if they're separated...) don't each need to make database connections. I figure it's more efficient to store everything in variables than to have every part of a page connect to the database.
In my experience, it is better to make two queries and post-process the results than to make one that takes ten times longer to run that you don't have to post-process. That said, it is also better to not repeat queries if you already have the result, and there are many different ways this can be achieved.
But all of that is oriented around performance optimization. So unless you really know what you're doing (hint: most people in this situation don't), just make the queries you need for the data you need and refactor it later.
I think you should limit yourself to as few queries as possible. Try to combine queries to multitask and save time.
Premature optimisation is a problem, as people have mentioned before, but that refers to crapping up your code to make it run 'fast'. People take this 'maxim' too far.
If you want to design with scalability in mind, just make sure whatever you do to load data is sufficiently abstracted and the calls are centralized. This will make it easier when you need to implement a shared memory cache, as you'll only have to change a few things in a few places.
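A very small sketch of that idea: centralize the call so that a cache can be swapped in later. Here the cache is just a static array; replacing it with memcached or Redis would only touch this one function (all names are invented for illustration):

<?php
// Very small sketch of "centralize the call so a cache can be swapped in later".
// Here the cache is just a static array; replacing it with memcached/Redis would
// only touch this one function. All names are invented for illustration.

function loadHeadlines(PDO $pdo, $categoryId)
{
    static $cache = [];

    if (!isset($cache[$categoryId])) {
        $stmt = $pdo->prepare('SELECT headline FROM headlines WHERE category_id = ?');
        $stmt->execute([(int) $categoryId]);
        $cache[$categoryId] = $stmt->fetchAll(PDO::FETCH_COLUMN);
    }

    return $cache[$categoryId];
}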
