Checking to see if data exists within a table - PHP

How do I go about looking into a table and checking whether a row exists? The background behind it: the table is called enemies. Every row has a unique id which is set to auto_increment. Each row also has a unique value called monsterid; the monsterid isn't auto_increment.
When a monster dies the row is deleted and replaced by a new row, so the id is always changing, and the monsterid changes too.
In PHP I am using the $_GET method and the monsterid is passed through it.
Basically I am trying to do this:
$monsterID = 334322 //this is the id passed through $_GET
checkMonsterId = "check to see if the monster id exists within the enemies table"
if monsterid exists then
{RUN PHP}
else
{RUN PHP}
If you need any more clarity please ask. Thanks for the help in advance.

Use COUNT! If it returns > 0, it exists; otherwise it doesn't.
select count(*) from enemies where monsterid = 334322
You would use it in PHP like this (after connecting to the database):
$monsterID = mysql_real_escape_string($monsterID);
$res = mysql_query('SELECT COUNT(*) FROM enemies WHERE monsterid = ' . $monsterID) or die(mysql_error());
$row = mysql_fetch_row($res);
if ($row[0] > 0)
{
//Monster exists
}
else
{
//It doesn't
}
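On newer PHP versions where the old mysql_* functions are no longer available, a roughly equivalent sketch using mysqli and a prepared statement (assuming a $mysqli connection object and the enemies/monsterid names from the question) might look like:
// Sketch only: assumes an existing mysqli connection in $mysqli.
$monsterId = (int) ($_GET['monsterid'] ?? 0);
$stmt = $mysqli->prepare('SELECT COUNT(*) FROM enemies WHERE monsterid = ?');
$stmt->bind_param('i', $monsterId);
$stmt->execute();
$stmt->bind_result($count);
$stmt->fetch();
$stmt->close();
if ($count > 0) {
    // Monster exists
} else {
    // It doesn't
}
The prepared statement also removes the need for manual escaping.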

Use count, like
select count(*) from enemies where monsterid = 334322
However, be sure you've added an index on monsterid to the table. If you don't, and it isn't the primary key, the RDBMS will be forced to issue a full table scan (read every row) to give you the value back. On small datasets this doesn't matter, as the table will probably sit in memory anyway, but once the number of rows becomes significant and you're hitting the disk to do the scan, the speed difference can easily be two orders of magnitude or more.
If the number of rows is very small then not indexing is rational, since a non-primary-key index adds overhead when inserting data, but that should be a deliberate decision. (I regularly impress clients who've used a programmer who doesn't understand databases by adding indexes to tables that were fine when the coder created them but slowed to a crawl once loaded with real volumes of data. It's quite amazing how one line of SQL to add an index will buy you guru status in your client's eyes because you made their system usable again.)
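Adding that one line of SQL is straightforward; the index name below is just illustrative:
CREATE INDEX idx_monsterid ON enemies (monsterid);
If monsterid really is unique for every row, a UNIQUE index additionally lets the database enforce that for you.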
If you're doing more complex queries against the database using a subselect, such as finding all locations where there is no monster, then look up the SQL EXISTS clause. It is often overlooked by programmers (the temptation is to return a count of actual values), and using it is generally faster than the alternatives.
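As a sketch of that kind of query, finding all locations with no monster might look like the following; the locations table and the locationid column are assumptions for illustration, only enemies comes from the question:
SELECT l.id
FROM locations l
WHERE NOT EXISTS (
    SELECT 1
    FROM enemies e
    WHERE e.locationid = l.id
);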

Simpler:
select 1 from enemies where monsterid = 334322
If it returns a row, you have a row, if not, you don't.
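From PHP that could be used roughly like this, in the same (now deprecated) mysql_* style as the rest of this thread; the LIMIT 1 is an addition so the server can stop at the first match:
$monsterID = mysql_real_escape_string($_GET['monsterid']);
$res = mysql_query("SELECT 1 FROM enemies WHERE monsterid = '$monsterID' LIMIT 1");
if (mysql_num_rows($res) > 0) {
    // Monster exists
} else {
    // It doesn't
}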

The mysql_real_escape_string is important to prevent SQL injection.
$monsterid = mysql_real_escape_string($_GET['monsterid']);
$result = mysql_query("SELECT COUNT(*) FROM enemies WHERE monsterid = '$monsterid'");
$count = intval(mysql_result($result, 0));
if ($count > 0) {
// monster exists
} else {
// monster doesn't exist
}


What's faster, db calls or resorting an array?

In a site I maintain I have a need to query the same table (articles) twice, once for each category of article. AFAICT there are basically two ways of doing this (maybe someone can suggest a better, third way?):
Perform the db query twice, meaning the db server has to sort through the entire table twice. After each query, I iterate over the cursor to generate html for a list entry on the page.
Perform the query just once and pull out all the records, then sort them into two separate arrays. After this, I have to iterate over each array separately in order to generate the HTML.
So it's this:
$newsQuery = $mysqli->query("SELECT * FROM articles WHERE type='news' ");
while($newRow = $newsQuery->fetch_assoc()){
// generate article summary in html
}
// repeat for informational articles
vs this:
$query = $mysqli->query("SELECT * FROM articles ");
$news = Array();
$info = Array();
while($row = $query->fetch_assoc()){
if($row['type'] == "news"){
$news[] = $row;
}else{
$info[] = $row;
}
}
// iterate over each array separately to generate article summaries
The recordset is not very large, currently <200, and will probably grow to 1000-2000. Is there a significant difference in the times between the two approaches, and if so, which one is faster?
(I know this whole thing seems awfully inefficient, but it's a poorly coded site I inherited and have to take care of without a budget for refactoring the whole thing...)
I'm writing in PHP, no framework :( , on a MySql db.
Edit
I just realized I left out one major detail. On a given page in the site, we will display (and thus retrieve from the db) no more than 30 records at once - but here's the catch: 15 info articles, and 15 news articles. On each page we pull the next 15 of each kind.
You know you can sort in the DB right?
SELECT * FROM articles ORDER BY type
EDIT
Due to the change made to the question, I'm updating my answer to address the newly revealed requirement: 15 rows for 'news' and 15 rows for not-'news'.
The gist of the question is the same: "which is faster... one query or two separate queries". The gist of the answer remains the same: each database roundtrip incurs overhead (extra time, especially over a network connection to a separate database server), so with all else being equal, reducing the number of database roundtrips can improve performance.
The new requirement really doesn't impact that. What the newly revealed requirement really impacts is the actual query to return the specified resultset.
For example:
( SELECT n.*
FROM articles n
WHERE n.type='news'
LIMIT 15
)
UNION ALL
( SELECT o.*
FROM articles o
WHERE NOT (o.type<=>'news')
LIMIT 15
)
Running that statement as a single query is going to require fewer database resources, and be faster than running two separate statements, and retrieving two disparate resultsets.
We weren't provided any indication of what the other values for type can be, so the statement offered here simply addresses two general categories of rows: rows that have type='news', and all other rows that have some other value for type.
That query assumes that type allows for NULL values, and we want to return rows that have a NULL for type. If that's not the case, we can adjust the predicate to be just
WHERE o.type <> 'news'
Or, if there are specific values for type we're interested in, we can specify that in the predicate instead
WHERE o.type IN ('alert','info','weather')
If "paging" is a requirement ("next 15"), the typical pattern we see applied, LIMIT 30,15, can be inefficient. But this question isn't asking about improving the efficiency of "paging" queries; it's asking whether running a single statement or running two separate statements is faster.
And the answer to that question is still the same.
ORIGINAL ANSWER below
There's overhead for every database roundtrip. In terms of database performance, for small sets (like you describe) you're better off with a single database query.
The downside is that you're fetching all of those rows and materializing an array. (But, that looks like that's the approach you're using in either case.)
Given the choice between the two options you've shown, go with the single query. That's going to be faster.
As far as a different approach, it really depends on what you are doing with those arrays.
You could actually have the database return the rows in a specified sequence, using an ORDER BY clause.
To get all of the 'news' rows first, followed by everything that isn't 'news', you could
ORDER BY type<=>'news' DESC
That's MySQL shorthand for the more ANSI-standards-compliant:
ORDER BY CASE WHEN type = 'news' THEN 1 ELSE 0 END DESC
Rather than fetch every single row and store it in an array, you could just fetch from the cursor as you output each row, e.g.
while($row = $query->fetch_assoc()) {
echo "<br>Title: " . htmlspecialchars($row['title']);
echo "<br>byline: " . htmlspecialchars($row['byline']);
echo "<hr>";
}
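Putting the pieces together, a rough sketch of issuing the single UNION ALL statement from above with mysqli and splitting the result into the two lists (mirroring the arrays in the question; adjust column names to your schema):
$sql = "(SELECT * FROM articles WHERE type = 'news' LIMIT 15)
        UNION ALL
        (SELECT * FROM articles WHERE type <> 'news' LIMIT 15)";
$result = $mysqli->query($sql);
$news = array();
$info = array();
while ($row = $result->fetch_assoc()) {
    if ($row['type'] == 'news') {
        $news[] = $row;
    } else {
        $info[] = $row;
    }
}
As noted above, type <> 'news' skips rows where type is NULL; use NOT (type <=> 'news') if those should be included.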
The best way of dealing with a situation like this is to test it for yourself. It doesn't matter how many records you have at the moment; you can simulate whatever amount you'd like, that's never a problem. Also, 1000-2000 is really a small set of data.
I somewhat don't understand why you'd have to iterate over all the records twice. You should never retrieve all the records in a query either way, only the small subset you need to be working with. In a typical site where you manage articles it's usually about 10 records per page MAX. No user will ever go through 2000 articles in a way that would require you to pull all the records at once. Utilize paging and smart querying.
// iterate over each array separately to generate article summaries
Not really sure what you mean by this, but something tells me this data should be stored in the database as well. I really hope you're not generating article excerpts on the fly for every page hit.
It all sounds to me more like a bad architecture design than anything else...
PS: I believe sorting/ordering/filtering of database data should be done on the database server, not in the application itself. You may save some traffic by doing a single query, but it won't help much if you transfer too much data at once that you won't be using anyway.
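As a sketch of the paging idea with LIMIT/OFFSET (the page number and the ordering column are illustrative; in real code the page should come from a validated request parameter):
$page = 1;                        // current page, normally taken from the request
$perPage = 15;
$offset = ($page - 1) * $perPage;
// 'id' below is an assumed ordering column; substitute whatever your articles table uses
$result = $mysqli->query("SELECT * FROM articles WHERE type = 'news' ORDER BY id DESC LIMIT $perPage OFFSET $offset");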

Get next MySQL row ID when any number of previous rows have been deleted

I have the following call to my database to retrieve the last row ID from an AUTO_INCREMENT column, which I use to find the next row ID:
$result = $mysqli->query("SELECT articleid FROM article WHERE articleid=(SELECT MAX(articleid) FROM article)");
$row = $result->fetch_assoc();
$last_article_id = $row["articleid"];
$last_article_id = $last_article_id + 1;
$result->close();
I then use $last_article_id as part of a filename system.
This is working perfectly... until I delete a row, meaning the call retrieves an ID further down the order than the one I want.
A example would be:
ID
0
1
2
3
4-(deleted row)
5-(deleted row)
6-(next ID to be used for INSERT call)
I'd like the filename to be something like 6-0.jpg, however the filename ends up being 4-0.jpg as it targets ID 3 + 1 etc...etc...
Any thoughts on how I get the next MySQL row ID when any number of previous rows have been deleted??
You are making a significant error by trying to predict the next auto-increment value. You do not have a choice, if you want your system to scale... you have to either insert the row first, or rename the file later.
This is a classic oversight I see developers make -- you are coding this as if there would only ever be a single user on your site. It is extremely likely that at some point two articles will be created at almost the same time. Both queries will "predict" the same id, both will use the same filename, and one of the files will disappear, one of the table entries may point to the wrong file, and the other entry will reference a file that does not exist. And you'll be scratching your head asking "how did this happen?!"
Predicting auto-increment values is bad practice. Don't do it. Plan for concurrency.
Also, the information_schema tables are not really tables... they are server internals exposed to the SQL interface. Calls to the "tables" table, and show table status are expensive calls that you do not want to make in production... so don't be tempted to use something you find there.
You can use the insert id that MySQL reports after you insert the new row to retrieve the new key:
$mysqli->query($yourQueryHere);
$newId = $mysqli->insert_id;
Note that insert_id is a property of the mysqli object, not a method, and it requires the id field to be an AUTO_INCREMENT column (typically also the primary key).
As for the filename, you could store it in a variable, then do the query, then change the name and then write the file.
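A minimal sketch of that order of operations, using the article table from the question (the title column is hypothetical, and the file handling is only indicated by a comment):
$mysqli->query("INSERT INTO article (title) VALUES ('My new article')");  // hypothetical column
$newId = $mysqli->insert_id;            // the id MySQL actually assigned to the new row
$filename = $newId . "-0.jpg";          // e.g. "6-0.jpg"
// write or rename the uploaded file to $filename here
Because the id comes from the completed INSERT, two simultaneous requests can never end up with the same filename.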

How to index a query the right way

I am trying to make my DB more optimized and am at the beginning of indexing it, but I'm not sure how to do it right.
I have this query:
$year = date("Y");
$thisYear = $year;
//$nextYear = $thisYear + 1;
$sql = mysql_query("SELECT SUM(points) as userpoints
FROM ".$prefix."_publicpoints
WHERE date BETWEEN '$thisYear" . "-01-01' AND '$thisYear" . "-12-31' AND fk_player_id = $playerid");
$row = mysql_fetch_assoc($sql);
$userPoints = $row['userpoints'];
$sql = mysql_query("SELECT
fk_player_id
FROM ".$prefix."_publicpoints
WHERE date BETWEEN '$thisYear" . "-01-01' AND '$thisYear" . "-12-31'
GROUP BY fk_player_id
HAVING SUM(points) > $userPoints");
$row = mysql_fetch_assoc($sql);
$userWrank = mysql_num_rows($sql)+1;
I am not sure how to index this? I have tried indexing the fk_player_id but it still looks through all the rows (287937).
I have indexed the date field, which gives me this back in EXPLAIN:
id:            1
select_type:   SIMPLE
table:         nf_publicpoints
type:          range
possible_keys: IDXdate
key:           IDXdate
key_len:       3
ref:           NULL
rows:          143969
Extra:         Using where with pushed condition; Using temporary...
I also have 2 calls to the same table... Could that be done in one?
How do I index this and/or could it be done smarter?
You should definitely spend some time reading up on indexing, there's a lot written about it, and it's important to understand what's going on.
Broadly speaking, an index imposes an ordering on the rows of a table.
For simplicity's sake, imagine a table is just a big CSV file. Whenever a row is inserted, it's inserted at the end. So the "natural" ordering of the table is just the order in which rows were inserted.
Imagine you've got that CSV file loaded up in a very rudimentary spreadsheet application. All this spreadsheet does is display the data, and numbers the rows in sequential order.
Now imagine that you need to find all the rows that have some value "M" in the third column. Given what you have available, you have only one option: you scan the table, checking the value of the third column for each row. If you've got a lot of rows, this method (a "table scan") can take a long time!
Now imagine that in addition to this table, you've got an index. This particular index is the index of values in the third column. The index lists all of the values from the third column, in some meaningful order (say, alphabetically) and for each of them, provides a list of row numbers where that value appears.
Now you have a good strategy for finding all the rows where the value of the third column is M! For instance, you can perform a binary search! Whereas the table scan requires you to look at N rows (where N is the number of rows), the binary search only requires that you look at log(N) index entries, in the very worst case. Wow, that's sure a lot easier!
Of course, if you have this index, and you're adding rows to the table (at the end, since that's how our conceptual table works), you need to update the index each and every time. So you do a little more work while you're writing new rows, but you save a ton of time when you're searching for something.
So, in general, indexing creates a tradeoff between read efficiency and write efficiency. With no indexes, inserts can be very fast -- the database engine just adds a row to the table. As you add indexes, the engine must update each index while performing the insert.
On the other hand, reads become a lot faster.
Hopefully that covers your first two questions (as others have answered -- you need to find the right balance).
Your third scenario is a little more complicated. If you're using LIKE, indexing engines will typically help with your read speed up to the first "%". In other words, if you're SELECTing WHERE column LIKE 'foo%bar%', the database will use the index to find all the rows where column starts with "foo", and then need to scan that intermediate rowset to find the subset that contains "bar". SELECT ... WHERE column LIKE '%bar%' can't use the index. I hope you can see why.
Finally, you need to start thinking about indexes on more than one column. The concept is the same, and it behaves similarly to the LIKE stuff: essentially, if you have an index on (a,b,c), the engine will continue using the index from left to right as best it can. So a search on column a might use the (a,b,c) index, as would one on (a,b). However, the engine would need to do a full table scan if you were searching WHERE b=5 AND c=1.
Hopefully this helps shed a little light, but I must reiterate that you're best off spending a few hours digging around for good articles that explain these things in depth. It's also a good idea to read your particular database server's documentation. The way indices are implemented and used by query planners can vary pretty widely.
For more information and examples, visit here: http://blog.sqlauthority.com/category/sql-index/
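For the specific queries in this question, a composite index is worth testing; the index name below is illustrative, while the table and columns come from the query and the EXPLAIN output above:
ALTER TABLE nf_publicpoints ADD INDEX idx_player_date (fk_player_id, date, points);
The first query filters with an equality on fk_player_id and a range on date and only reads points, so an index like this can satisfy it entirely; re-run EXPLAIN afterwards to confirm the optimizer actually picks it.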
Try creating an index on the date column; indexing fk_player_id will not help with this query. If that does not work, paste the EXPLAIN output...
For more information about indexes in Mysql look here: http://hackmysql.com/case1
Why not index the date column, seeing how that's the main criterion that will be evaluated in the lookup?

Efficiently check if any value from a group exists, and which one it is?

I have a list of 100 values (it might scale in the future) that I need to put into a database. However, when one of them already exists I need to know which one it is and grab some info from its row in the table. That is the goal. However, I can't think of an efficient way to do this.
Things I've thought of
Check if value exists. If not, submit. This is done value by value. Advantage: Easy. Disadvantage: Slow (minimum 100 queries, max 200 queries)
Opposite of above. If query fails due to duplicate key constraints, query the value. Same advantages and disadvantages
Insert all values at once. Run duplicate checker. Advantage: 2 (albeit huge) queries. Disadvantage: Difficult, possibly slow
There has to be a better way. Any ideas?
If I understand correctly, another option is to select all rows from the database matching your list in one query, then check array_intersect() in application code to find those which already exist in the database, or array_diff() to find those that don't.
// Your list into a comma-separated string
$id_list = implode(",", $your_list);
$dbexists = array();
$result = mysql_query("SELECT id FROM tbl WHERE id IN ($id_list)");
while ($row = mysql_fetch_assoc($result)) {
$dbexists[] = $row['id'];
}
// Already existing from your set:
$exists = array_intersect($your_list, $dbexists);
// Not yet in the database:
$missing = array_diff($your_list, $dbexists);
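If you then want to add the values that were not found, one follow-up sketch is a single multi-row INSERT (assuming the same tbl table, an integer id column, and the $missing array computed above):
if (!empty($missing)) {
    // build "(1),(2),(3)" from the ids that were not found
    $values = "(" . implode("),(", array_map('intval', $missing)) . ")";
    mysql_query("INSERT INTO tbl (id) VALUES $values");
}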

Insert automatically on new table?

I will create 5 tables, namely data1, data2, data3, data4 and data5. Each table can only store 1000 data records.
When there is a new entry, or when I want to insert new data, I must do a check:
<?php
$data1 = mysql_query("SELECT * FROM data1");
if(mysql_num_rows($data1) > 1000){
$data2 = mysql_query("SELECT * FROM data2");
if(mysql_num_rows($data2) > 1000){
// and so on...
}
}
I think this is not the right way. I mean, if I am user 4500, it would take some time to do all the checks. Is there any better way to solve this problem?
I haven't decided the numbers; it could be 5000 or 10000 records. The reason is flexibility and portability? Well, one of my SQL gurus suggested I do it this way.
Unless your guru was talking about something like Partitioning, I'd seriously doubt his advice. If your database can't handle more than 1000, 5000 or 10000 rows, look for another database. Unless you have a really specific example of how a record limit will help you, it probably won't. With the amount of overhead it adds, it probably only complicates things for no gain.
A properly set up database table can easily handle millions of records. Splitting it into separate tables will most likely increase neither flexibility nor portability. If you accumulate enough records to run into performance problems, congratulate yourself on a job well done and worry about it then.
Read up on how to count rows in mysql.
Depending on what database engine you are using, COUNT(*) operations on InnoDB tables are quite expensive, and those counts should be performed by triggers and tracked in an adjacent information table.
The structure you describe is often designed around a mapping table first. One queries the mapping table to find the destination table associated with a primary key.
You can keep a "tracking" table to keep track of the current table between requests.
Also be on alert for race conditions (use transactions, or ensure only one process is running at a time).
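A sketch of that tracking-table idea; the current_table table and its columns here are hypothetical, not something MySQL provides:
CREATE TABLE current_table (
    id TINYINT NOT NULL PRIMARY KEY,   -- always 1: a single-row table
    table_no INT NOT NULL,             -- e.g. 3 means inserts currently go to data3
    row_count INT NOT NULL             -- maintained on every insert
);
Each insert would read this row inside a transaction, increment row_count, and bump table_no (resetting row_count) once the limit is reached.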
Also, don't do $data1 = mysql_query("SELECT * FROM data1"); with nested ifs; do something like:
$i = 1;
do {
$result = mysql_query("SELECT COUNT(*) FROM data$i");
$rowCount = mysql_result($result, 0);
$i++;
} while ($rowCount >= 1000);
// after the loop, data($i - 1) is the first table with room
I'd be surprised if MySQL doesn't have some fancy-pants way to manage this automatically (or at least, better than what I'm about to propose), but here's one way to do it.
1. Insert record into 'data'
2. Check the length of 'data'
3. If >= 1000,
- CREATE TABLE dataX LIKE data;
(X will be the number of tables you have + 1)
- INSERT INTO dataX SELECT * FROM data;
- TRUNCATE data;
This means you will always be inserting into the 'data' table, and 'data1', 'data2', 'data3', etc are your archived versions of that table.
You can create a MERGE table like this:
CREATE TABLE all_data ([col_definitions]) ENGINE=MERGE UNION=(data1,data2,data3,data4,data5);
Then you would be able to count the total rows with a query like SELECT COUNT(*) FROM all_data.
If you're using MySQL 5.1 or above, you can let the database handle this (nearly) automatically using partitioning:
Read this article or the official documentation
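For illustration only, a rough sketch of what RANGE partitioning on the auto-increment id might look like; the column names and boundaries are made up, so check the official documentation for the exact syntax and restrictions:
CREATE TABLE data (
    id INT NOT NULL AUTO_INCREMENT,
    payload VARCHAR(255),
    PRIMARY KEY (id)
)
PARTITION BY RANGE (id) (
    PARTITION p0 VALUES LESS THAN (1000),
    PARTITION p1 VALUES LESS THAN (2000),
    PARTITION p2 VALUES LESS THAN (3000),
    PARTITION p3 VALUES LESS THAN MAXVALUE
);
All rows still live in one logical table, so the application never has to pick between data1...data5 itself.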
