mysql query to delete records that match multiple fields - php

I have a database that looks like this:
CREATE TABLE cargo (
cargodate int unsigned NOT NULL,
cargoname CHAR(128) NOT NULL,
lattitude double NOT NULL,
longitude double NOT NULL,
user CHAR(64) NOT NULL
) type=MyISAM;
I want to make sure there are no more than 5 entries for this cargo at the same location with the same user. A user can have multiple entries as long as they are in different locations (lattitude, longitude).
How do I make my sql INSERT statement take care of this?
Right now I execute:
INSERT INTO cargo VALUES (UNIX_TIMESTAMP(), '{$cargoname}', '{$lat}', '{$lng}', '{$user}');
I can do a DELETE FROM, but I only want to delete entries if there are more than 5, and in that case I want to delete the oldest ones.
Thanks
Deshawnt

You could use triggers: http://dev.mysql.com/doc/refman/5.0/en/triggers.html
Right after the insert, you can delete all the entries that are no longer needed.

Using a trigger as mentioned by Ruslan Polutsygan would be the most natural solution.
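If you don't want to mess around with them, the alternative is to run:
SELECT * FROM cargo WHERE cargoname = '{$cargoname}' AND lattitude = '{$lat}' AND longitude = '{$lng}' AND user = '{$user}'
ORDER BY cargodate DESC LIMIT 4,18446744073709551615
before you run the insert, delete the rows returned by that query, and then do the insert. The WHERE clause restricts the check to the same cargo, location and user, so only the oldest entries beyond the newest four for that combination are removed.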
As a side-note, the number 18446744073709551615 was taken out of the example from the MySQL documentation:
To retrieve all rows from a certain offset up to the end of the result set, you can use some large number for the second parameter
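The delete-then-insert approach can be sketched end to end. This is a minimal sketch, not the answerer's code: it uses Python's built-in sqlite3 module as a stand-in for MySQL (so it writes LIMIT -1 OFFSET n instead of the large-number trick), and the helper names create_cargo_table and insert_cargo are made up for illustration.

```python
import sqlite3
import time

MAX_ENTRIES = 5  # at most this many rows per (cargoname, location, user)


def create_cargo_table(conn):
    # Same columns as the question's table; SQLite types stand in for MySQL's.
    conn.execute(
        "CREATE TABLE cargo ("
        " cargodate INTEGER NOT NULL,"
        " cargoname TEXT NOT NULL,"
        " lattitude REAL NOT NULL,"
        " longitude REAL NOT NULL,"
        " user TEXT NOT NULL)"
    )


def insert_cargo(conn, cargoname, lat, lng, user, now=None):
    """Insert one row, first deleting the oldest rows so the cap is kept."""
    now = int(time.time()) if now is None else now
    key = (cargoname, lat, lng, user)
    with conn:  # one transaction: commit on success, roll back on error
        # Keep the newest MAX_ENTRIES - 1 matching rows and delete the rest.
        conn.execute(
            "DELETE FROM cargo WHERE rowid IN ("
            " SELECT rowid FROM cargo"
            " WHERE cargoname = ? AND lattitude = ? AND longitude = ? AND user = ?"
            " ORDER BY cargodate DESC"
            " LIMIT -1 OFFSET ?)",
            key + (MAX_ENTRIES - 1,),
        )
        conn.execute("INSERT INTO cargo VALUES (?, ?, ?, ?, ?)", (now,) + key)
```

Calling insert_cargo repeatedly for the same cargo, location and user leaves at most five rows for that combination, while rows for other locations are untouched.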

MySQL - How to check if a value exists before appending to a TEXT Field?

On duplicated record, I want to update the record by appending a string to the TEXT column in table, on the condition that the appending value does not already exist in that TEXT column.
I have come so far with my query
INSERT INTO events (event_id, event_types)
VALUES ("1", "partyEvent")
ON DUPLICATE KEY UPDATE event_types = CONCAT(event_types, ",testEvent")
Is there such a check in MySQL, or is it necessary to fetch the record and do the comparison myself in PHP?
It looks like event_types is a denormalized field, containing a comma-separated sequence of text strings. With respect, this is a notorious database design antipattern. The next programmer to work on your code will be very unhappy indeed.
I'll answer your question even though it pains me.
First of all, how can you tell whether a particular string occurs within a comma-separated set of text strings? FIND_IN_SET() can do this.
FIND_IN_SET('testEvent', event_types)
returns a value greater than zero if 'testEvent' shows up in the column.
So we can use it in your event_types = clause. If FIND_IN_SET comes up with a positive number, you want event_types = event_types, that is, an unchanged value. If not, you want what you have in your question. How to do this? Use IF(condition,trueval,falseval). Try this:
...UPDATE event_types = IF(FIND_IN_SET('testEvent',event_types) > 0,
event_types,
CONCAT(event_types, ',', 'testEvent'))
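The intent described above (leave the column unchanged when the value is already present, otherwise append it) can be written out as a small function. This is only an illustration of the logic, in Python rather than SQL; append_if_absent is a hypothetical name, and it is roughly what you would write if you fetched the record and did the comparison in application code.

```python
def append_if_absent(event_types: str, new_type: str) -> str:
    """Mimic IF(FIND_IN_SET(new_type, event_types) > 0, event_types, CONCAT(...))."""
    items = event_types.split(",") if event_types else []
    if new_type in items:
        # Already present: return the value unchanged.
        return event_types
    # Not present: append it. Unlike CONCAT(event_types, ',', x), this sketch
    # does not emit a leading comma when the stored value is empty.
    return ",".join(items + [new_type])
```

Note that the membership test is on whole list items, so 'test' is not considered present just because 'testEvent' is.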
There's a much better way however. Make a new table called event_types defined like this.
CREATE TABLE event_types (
event_id INT(11) NOT NULL,
event_type VARCHAR(50) NOT NULL,
PRIMARY KEY (event_id, event_type)
)
This has a compound primary key, meaning it cannot have duplicate event_type values for any particular event_id.
Then you will add a type to an event using this query:
INSERT IGNORE INTO event_types (event_id, event_type)
VALUES (1, 'testEvent');
The IGNORE tells MySQL to be quiet if there's already a duplicate.
If you must have your event types comma-separated for processing by some program, this aggregate query with GROUP_CONCAT() will produce them:
SELECT e.event_id, GROUP_CONCAT(t.event_type ORDER BY t.event_type) event_types
FROM events e
LEFT JOIN event_types t ON e.event_id = t.event_id
GROUP BY e.event_id
You can find all the events with a particular type like this.
SELECT event_id FROM event_types WHERE event_type='testEvent'
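To see the normalized design behave as described, here is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL. SQLite spells the statement INSERT OR IGNORE rather than INSERT IGNORE, and the helper names (create_event_types, add_event_type, event_types_csv) are assumptions made for this example.

```python
import sqlite3


def create_event_types(conn):
    # Compound primary key: one row per (event_id, event_type) pair.
    conn.execute(
        "CREATE TABLE event_types ("
        " event_id INTEGER NOT NULL,"
        " event_type TEXT NOT NULL,"
        " PRIMARY KEY (event_id, event_type))"
    )


def add_event_type(conn, event_id, event_type):
    """Add a type to an event; a duplicate is silently skipped."""
    with conn:
        conn.execute(
            "INSERT OR IGNORE INTO event_types (event_id, event_type) VALUES (?, ?)",
            (event_id, event_type),
        )


def event_types_csv(conn, event_id):
    """Rebuild the comma-separated form for programs that still need it."""
    rows = conn.execute(
        "SELECT event_type FROM event_types WHERE event_id = ? ORDER BY event_type",
        (event_id,),
    )
    return ",".join(row[0] for row in rows)
```

Adding the same type twice leaves a single row, so no existence check is needed before inserting.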
Pro tip: Comma separated: bad. Normalized: good.
Don't worry, we've all made this design mistake once or twice.

Mysql where between query optimization

Below is the format of the database of Autonomous System Numbers (downloaded and parsed from this site).
range_start range_end number cc provider
----------- --------- ------ -- -------------------------------------
16778240 16778495 56203 AU AS56203 - BIGRED-NET-AU Big Red Group
16793600 16809983 18144 AS18144
745465 total rows
A Normal query looks like this:
select * from table where 3232235520 BETWEEN range_start AND range_end
Works properly but I query a huge number of IPs to check for their AS information which ends up taking too many calls and time.
I have two indexes:
one on the id column
a combined index on the range_start and range_end columns, since together they make each row unique.
Questions:
Is there a way to query a huge number of IPs in a single query?
multiple where (IP between range_start and range_end) OR where (IP between range_start and range_end) OR ... works but I can't get the IP -> row mapping or which rows are retrieved for which IP.
Any suggestions to change the database structure to optimize the query speed and decrease the time?
Any help will be appreciated! Thanks!
It is possible to query more than one IP address at a time, and there are several approaches we could take. The examples below assume range_start and range_end are defined as integer types.
For a reasonable number of ip addresses, we could use an inline view:
SELECT i.ip, a.*
FROM ( SELECT 3232235520 AS ip
UNION ALL SELECT 3232235521
UNION ALL SELECT 3232235522
UNION ALL SELECT 3232235523
UNION ALL SELECT 3232235524
UNION ALL SELECT 3232235525
) i
LEFT
JOIN ip_to_asn a
ON a.range_start <= i.ip
AND a.range_end >= i.ip
ORDER BY i.ip
This approach will work for a reasonable number of IP addresses. The inline view could be extended with more UNION ALL SELECT to add additional IP addresses. But that's not necessarily going to work for a "huge" number.
When we get "huge", we're going to run into limitations in MySQL... maximum size of a SQL statement limited by max_allowed_packet, there may be a limit on the number of SELECT that can appear.
The inline view could be replaced with a temporary table, built first.
DROP TEMPORARY TABLE IF EXISTS _ip_list_;
CREATE TEMPORARY TABLE _ip_list_ (ip BIGINT NOT NULL PRIMARY KEY) ENGINE=InnoDB;
INSERT INTO _ip_list_ (ip) VALUES (3232235520),(3232235521),(3232235522),...;
...
INSERT INTO _ip_list_ (ip) VALUES (3232237989),(3232237990);
Then reference the temporary table in place of the inline view:
SELECT i.ip, a.*
FROM _ip_list_ i
LEFT
JOIN ip_to_asn a
ON a.range_start <= i.ip
AND a.range_end >= i.ip
ORDER BY i.ip ;
And then drop the temporary table:
DROP TEMPORARY TABLE IF EXISTS _ip_list_ ;
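The temporary-table sequence above (build the list, join, drop) can be wrapped in one function. The following is a sketch under assumptions: it runs against Python's sqlite3 module instead of MySQL, the table is called ip_to_asn as in the queries above, and create_ip_to_asn and lookup_asn are hypothetical helper names.

```python
import sqlite3


def create_ip_to_asn(conn, rows):
    # Columns follow the question's layout: range_start, range_end, number, cc, provider.
    conn.execute(
        "CREATE TABLE ip_to_asn ("
        " range_start INTEGER NOT NULL,"
        " range_end INTEGER NOT NULL,"
        " number INTEGER, cc TEXT, provider TEXT,"
        " PRIMARY KEY (range_start, range_end))"
    )
    conn.executemany("INSERT INTO ip_to_asn VALUES (?, ?, ?, ?, ?)", rows)


def lookup_asn(conn, ips):
    """Return (ip, number, provider) for every ip, with None where no range matches."""
    conn.execute("DROP TABLE IF EXISTS temp.ip_list")
    conn.execute("CREATE TEMP TABLE ip_list (ip INTEGER NOT NULL PRIMARY KEY)")
    # Duplicate addresses in the input are collapsed by the primary key.
    conn.executemany(
        "INSERT OR IGNORE INTO ip_list (ip) VALUES (?)", [(ip,) for ip in ips]
    )
    result = conn.execute(
        "SELECT i.ip, a.number, a.provider"
        " FROM ip_list i"
        " LEFT JOIN ip_to_asn a"
        "   ON a.range_start <= i.ip AND a.range_end >= i.ip"
        " ORDER BY i.ip"
    ).fetchall()
    conn.execute("DROP TABLE temp.ip_list")
    return result
```

Because the query is a LEFT JOIN from the list of addresses, each input IP appears in the output next to its matching row, which gives the IP -> row mapping asked for in the question.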
Some other notes:
Churning database connections is going to degrade performance. There's a significant amount of overhead in establishing and tearing down a connection. That overhead gets noticeable if the application is repeatedly connecting and disconnecting, especially if it's doing that for every SQL statement being issued.
And running an individual SQL statement also has overhead... the statement has to be sent to the server, parsed for syntax and evaluated for semantics; then an execution plan has to be chosen and executed, and a resultset prepared and returned to the client. This is why it's more efficient to process set-wise rather than row-wise. Processing RBAR (row by agonizing row) can be very slow compared to sending a statement to the database and letting it process a set in one fell swoop.
But there's a tradeoff there. With ginormous sets, things can start to get slow again.
Even if you can process two IP addresses in each statement, that halves the number of statements that need to be executed. If you do 20 IP addresses in each statement, that cuts down the number of statements to 5% of the number that would be required a row at a time.
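The arithmetic is easy to check with a small batching helper. This is an illustrative sketch only; batches is a made-up name, and the batch size of 20 is just the figure used above, not a recommended setting.

```python
def batches(items, size):
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# 1000 addresses, 20 per statement: 50 statements instead of 1000 (5%).
ips = list(range(3232235520, 3232235520 + 1000))
statement_count = sum(1 for _ in batches(ips, 20))
```

Each batch would then be sent as one statement, for example as the rows of the inline view or of one multi-row INSERT into the temporary table.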
And the composite index already defined on (range_start,range_end) is appropriate for this query.
FOLLOWUP
As Rick James points out in a comment, the index I earlier said was "appropriate" is less than ideal.
We could write the query a little differently, that might make more effective use of that index.
If (range_start,range_end) is UNIQUE (or PRIMARY) KEY, then this will return one row per IP address, even when there are "overlapping" ranges. (The previous query would return all of the rows that had a range_start and range_end that overlapped with the IP address.)
SELECT t.ip, a.*
FROM ( SELECT s.ip
, s.range_start
, MIN(e.range_end) AS range_end
FROM ( SELECT i.ip
, MAX(r.range_start) AS range_start
FROM _ip_list_ i
LEFT
JOIN ip_to_asn r
ON r.range_start <= i.ip
GROUP BY i.ip
) s
LEFT
JOIN ip_to_asn e
ON e.range_start = s.range_start
AND e.range_end >= s.ip
GROUP BY s.ip, s.range_start
) t
LEFT
JOIN ip_to_asn a
ON a.range_start = t.range_start
AND a.range_end = t.range_end
ORDER BY t.ip ;
With this query, for the innermost inline view query s, the optimizer might be able to make effective use of an index with a leading column of range_start, to quickly identify the "highest" value of range_start (that is less than or equal to the IP address). But with that outer join, and with the GROUP BY on i.ip, I'd really need to look at the EXPLAIN output; it's only conjecture what the optimizer might do; what is important is what the optimizer actually does.
Then, for inline view query e, MySQL might be able to make more effective use of the composite index on (range_start,range_end), because of the equality predicate on the first column, and the inequality condition on MIN aggregate on the second column.
For the outermost query, MySQL will surely be able to make effective use of the composite index, due to the equality predicates on both columns.
A query of this form might show improved performance, or performance might go to hell in a handbasket. The output of EXPLAIN should give a good indication of what's going on. We'd like to see "Using index for group-by" in the Extra column, and we only want to see a "Using filesort" for the ORDER BY on the outermost query. (If we remove the ORDER BY clause, we want to not see "Using filesort" in the Extra column.)
Another approach is to make use of correlated subqueries in the SELECT list. The execution of correlated subqueries can get expensive when the resultset contains a large number of rows. But this approach can give satisfactory performance for some use cases.
This query depends on no overlapping ranges in the ip_to_asn table, and this query will not produce the expected results when overlapping ranges exist.
SELECT t.ip, a.*
FROM ( SELECT i.ip
, ( SELECT MAX(s.range_start)
FROM ip_to_asn s
WHERE s.range_start <= i.ip
) AS range_start
, ( SELECT MIN(e.range_end)
FROM ip_to_asn e
WHERE e.range_end >= i.ip
) AS range_end
FROM _ip_list_ i
) r
LEFT
JOIN ip_to_asn a
ON a.range_start = r.range_start
AND a.range_end = r.range_end
As a demonstration of why overlapping ranges will be a problem for this query, consider a totally goofy, made-up example:
range_start range_end
----------- ---------
.101 .160
.128 .244
Given an IP address of .140, the MAX(range_start) subquery will find .128, the MIN(range_end) subquery will find .160, and then the outer query will attempt to find a matching row range_start=.128 AND range_end=.160. And that row just doesn't exist.
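The same failure can be reproduced in a few lines. This sketch models the two subqueries in plain Python, using the last octets from the made-up example as integers; naive_lookup and containing_ranges are hypothetical names.

```python
def naive_lookup(ranges, ip):
    """Model of the correlated-subquery approach: MAX(range_start) and MIN(range_end)."""
    starts = [start for start, _ in ranges if start <= ip]
    ends = [end for _, end in ranges if end >= ip]
    if not starts or not ends:
        return None
    candidate = (max(starts), min(ends))
    # The outer join only matches if a row with exactly this pair exists.
    return candidate if candidate in ranges else None


def containing_ranges(ranges, ip):
    """Every range that really contains the address."""
    return [(start, end) for start, end in ranges if start <= ip <= end]


overlapping = [(101, 160), (128, 244)]
separate = [(101, 127), (128, 244)]
```

With the overlapping ranges, naive_lookup(overlapping, 140) builds the pair (128, 160), finds no such row and returns None, even though containing_ranges shows that both rows contain the address. With the non-overlapping ranges the same call finds (128, 244).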
This is a duplicate of the question here; however, I'm not voting to close it, as the accepted answer in that question is not very helpful. The answer by Quassnoi is much better (but it only links to the solution).
A linear index is not going to help resolve a database of ranges. The solution is to use geospatial indexing (available in MySQL and other DBMS). An added complication is that MySQL geospatial indexing only works in 2 dimensions (while you have a 1-D dataset) so you need to map this to 2-dimensions.
Hence:
CREATE TABLE IF NOT EXISTS `inetnum` (
`from_ip` int(11) unsigned NOT NULL,
`to_ip` int(11) unsigned NOT NULL,
`netname` varchar(40) default NULL,
`ip_txt` varchar(60) default NULL,
`descr` varchar(60) default NULL,
`country` varchar(2) default NULL,
`rir` enum('APNIC','AFRINIC','ARIN','RIPE','LACNIC') NOT NULL default 'RIPE',
`netrange` linestring NOT NULL,
PRIMARY KEY (`from_ip`,`to_ip`),
SPATIAL KEY `rangelookup` (`netrange`)
) ENGINE=MyISAM DEFAULT CHARSET=ascii;
Which might be populated with....
INSERT INTO inetnum
(from_ip, to_ip
, netname, ip_txt, descr, country
, netrange)
VALUES
(INET_ATON('127.0.0.0'), INET_ATON('127.0.0.2')
, 'localhost','127.0.0.0-127.0.0.2', 'Local Machine', '.',
GEOMFROMWKB(POLYGON(LINESTRING(
POINT(INET_ATON('127.0.0.0'), -1),
POINT(INET_ATON('127.0.0.2'), -1),
POINT(INET_ATON('127.0.0.2'), 1),
POINT(INET_ATON('127.0.0.0'), 1),
POINT(INET_ATON('127.0.0.0'), -1))))
);
Then you might want to create a function to wrap the rather verbose SQL....
DROP FUNCTION `netname2`//
CREATE DEFINER=`root`@`localhost` FUNCTION `netname2`(p_ip VARCHAR(20) CHARACTER SET ascii) RETURNS varchar(80) CHARSET ascii
READS SQL DATA
DETERMINISTIC
BEGIN
DECLARE l_netname varchar(80);
SELECT CONCAT(country, '/',netname)
INTO l_netname
FROM inetnum
WHERE MBRCONTAINS(netrange, GEOMFROMTEXT(CONCAT('POINT(', INET_ATON(p_ip), ' 0)')))
ORDER BY (to_ip-from_ip)
LIMIT 0,1;
RETURN l_netname;
END
And therefore:
SELECT netname2('127.0.0.1');
./localhost
Which uses the index:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE inetnum range rangelookup rangelookup 34 NULL 1 Using where; Using filesort
(and takes around 10msec to find a record from the combined APNIC,AFRINIC,ARIN,RIPE and LACNIC datasets on the very low spec VM I'm using here)
You can compare IP ranges using MySQL. This question might contain an answer you're looking for: MySQL check if an IP-address is in range?
SELECT * FROM TABLE_NAME WHERE (INET_ATON("193.235.19.255") BETWEEN INET_ATON(ipStart) AND INET_ATON(ipEnd));
You will likely want to index your database. This optimizes the time it takes to search your database, similar to the index you will find in the back of a textbook, but for databases:
ALTER TABLE `table` ADD INDEX `name` (`column_id`)
EDIT: Apparently an index on a column cannot be used when that column is wrapped in INET_ATON() in the WHERE clause, so you would have to pick one of these!
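One way around that conflict, assuming the ranges are already stored as integers (as in the question's table), is to convert the dotted address in the application and compare plain integers in SQL. This is a sketch using Python's standard ipaddress module; ip_to_int and int_to_ip are hypothetical helper names.

```python
import ipaddress


def ip_to_int(ip: str) -> int:
    """Same result as MySQL's INET_ATON() for a dotted-quad IPv4 address."""
    return int(ipaddress.IPv4Address(ip))


def int_to_ip(value: int) -> str:
    """Inverse conversion, like INET_NTOA()."""
    return str(ipaddress.IPv4Address(value))
```

The integer can then be passed as an ordinary query parameter, leaving the indexed columns untouched by any function call.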

Mysql count slow when filter by category

Why is my query fast when I run this:
select count(*) as aggregate from `news` where `news`.`deleted_at` is null and `status` = '1'
but slow when I run this?
select count(*) as aggregate from `news` where `news`.`deleted_at` is null and `status` = '1' and `newscategory_id` = '17'
The structure of my news table is shown in an image; have a look here.
Sorry, my reputation is less than 8, so I can't attach the image.
Try adding a composite index on the three columns you are using in your select:
ALTER TABLE news ADD INDEX comp_index (deleted_at, status, newscategory_id);
and check it again.
You can also use EXPLAIN to see whether any of your indexes are actually used.
Indexes are used to find rows with specific column values quickly.
Without an index, MySQL must begin with the first row and then read
through the entire table to find the relevant rows. The larger the
table, the more this costs. If the table has an index for the columns
in question, MySQL can quickly determine the position to seek to in
the middle of the data file without having to look at all the data.
This is much faster than reading every row sequentially.
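Checking whether an index is picked up can be scripted. The sketch below uses Python's sqlite3 module and its EXPLAIN QUERY PLAN statement as a stand-in for MySQL's EXPLAIN; the column types are guesses (the real table structure was only available as an image), and make_news_db and query_plan are hypothetical names.

```python
import sqlite3

COUNT_SQL = (
    "SELECT COUNT(*) FROM news"
    " WHERE deleted_at IS NULL AND status = '1' AND newscategory_id = 17"
)


def make_news_db(with_index):
    """Create an empty news table, optionally with the composite index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE news ("
        " id INTEGER PRIMARY KEY,"
        " deleted_at TEXT, status TEXT, newscategory_id INTEGER)"
    )
    if with_index:
        conn.execute(
            "CREATE INDEX comp_index ON news (deleted_at, status, newscategory_id)"
        )
    return conn


def query_plan(conn, sql):
    """Return the plan's detail column as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)
```

Comparing query_plan(make_news_db(True), COUNT_SQL) with the result for make_news_db(False) shows a search on comp_index in the first case and a full scan of news in the second. The exact wording of the plan differs between engines and versions, so treat the text as a hint rather than a stable format.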
Try to add this in your DB:
CREATE INDEX newCategory_indx ON news (newscategory_id);
CREATE INDEX status_indx ON news (status);
This should give you a quicker result than the query on the previously non-indexed columns.
To learn more about indexes and their importance, visit here.

Whats wrong with these SQL statements?

Problem 1: Using the SQL CREATE TABLE statement, create a table, MOVSTARDIR, with attributes for the movie number, star number, and director number and the 4 acting awards. The primary key is the movie number, star number and director number (all 3), with referential integrity enforced. The director number is the director for that movie, and the star must have appeared in that movie.
Load MOVSTARDIR (from existing tables) using INSERT INTO.
My answer:
CREATE TABLE MOVSTARDIR
(MVNUM SHORT NOT NULL, STARNUM SHORT NOT NULL, DIRNUM SHORT NOT NULL, BESTF TEXT, BESTM TEXT, SUPM TEXT, SUPF TEXT)
ALTER TABLE MOVSTARDIR
ADD PRIMARY KEY (MVNUM,STARNUM,DIRNUM)
INSERT INTO MOVSTARDIR
SELECT MOVIE.MVNUM,STAR.STARNUM,DIRECTOR.DIRNUM... BESTF,BESTM,SUPM,SUPF
FROM MOVSTAR, DIRECTOR, MOVIE
WHERE MOVSTAR.MVNUM=MOVIE.MVNUM
AND MOVIE.DIRNUM=DIRECTOR.DIRNUM
It's giving me an error saying something is wrong with the "create table" statement, and it highlights the word "alter" in the SQL statement. Also, how do I add referential integrity?
Problem 2:List the directors in MOVSTARDIR with the total awards won from the 4 award categories included in the table. List the director name (not number), and the count in each of the 4 categories and the sum for all 4 categories. Group the report by the director name (i.e. one line per director, each director appears once), and order it by the sum (descending). Only show lines where the sum is more than 3.
SELECT DISTINCT DIRNAME, COUNT(BESTF) AS BESTFE, COUNT(BESTM) AS BESTML,
COUNT(SUPM) AS SUPML, COUNT(SUPF) AS SUPFE,
(COUNT(BESTM) + COUNT(BESTF) + COUNT(SUPM) + COUNT(SUPF)) AS TOTAL
FROM MOVSTARDIR, DIRECTOR
WHERE MOVSTARDIR.DIRNUM=DIRECTOR.DIRNUM
AND ((BESTM IS NOT NULL) OR (BESTF IS NOT NULL) OR (SUPM IS NOT NULL)
OR (SUPF IS NOT NULL))
GROUP BY DIRNAME
HAVING (COUNT(BESTM) + COUNT(BESTF) + COUNT(SUPM) + COUNT(SUPF)) > 3
ORDER BY (COUNT(BESTM) + COUNT(BESTF) + COUNT(SUPM) + COUNT(SUPF)) DESC
The problem with this is that it lists all records, not just wins.
If the database is needed, I can send it by email.
For Problem 1:
If you are using mysql, the query for create should be as follows
CREATE TABLE `MOVSTARDIR` (
`MVNUM` SMALLINT NOT NULL ,
`STARNUM` SMALLINT NOT NULL ,
`DIRNUM` SMALLINT NOT NULL ,
`BESTF` TEXT NOT NULL ,
`BESTM` TEXT NOT NULL ,
`SUPM` TEXT NOT NULL ,
`SUPF` TEXT NOT NULL
);
You're missing the semicolon after each of the statements, causing Access to treat the entire text as one statement.
Your tags show MySQL, SQL Server and SQL. The syntax of the SQL can vary according to the RDBMS.
Assuming you are using MySQL, these are the issues with your query.
a. Data type - There is no SHORT in MySQL. You can use SMALLINT
b. You need to add semi colons after each sql statement
Even if you are using any other RDBMS, you need to refer to the corresponding SQL manual and verify that you specify the exact data types.
Access doesn't allow running a batch of queries, only one at a time.
So, run first CREATE TABLE, then ALTER and so on.

incremental counter mysql

My question is pretty simple but answer might be tricky.
I'm in PHP and I want to manage manually a unique ID for my objects.
What is tricky is managing atomicity. I don't want 2 elements to get the same ID.
"Elements" are grouped in "Groups". In each group I want elements ID starting from 1 and grow incrementally for each insert in that group.
My first solution is to have a "lastID" column in the table "Groups" :
CREATE TABLE groups ( id INT AUTO_INCREMENT, lastId INT )
CREATE TABLE elements ( myId INT, multiple values ...)
In order to avoid several elements with the same ID, I have to update lastId and select it in one atomic SQL query.
After that, once it is retrieved, I have a unique ID that can't be picked again and I can insert my element.
My question is how to solve the atomic update-and-select part. My database is MySQL with the MyISAM engine, so there is no transaction support.
UPDATE groups
SET lastId = lastId + 1
WHERE id = 42
SELECT lastId
FROM groups
WHERE id = 42
Is there something more atomic than these 2 requests ?
Thanks
UPDATE groups SET lastId = last_insert_id(lastId + 1) WHERE id = 42
and then you can get your new id with
SELECT last_insert_id()
Using last_insert_id with a parameter will store the value and return it when you call it later.
This method of generating autonumbers works best with MyISAM tables having only a few rows (MyISAM always locks the entire table). It also has the benefit of not locking the table for the duration of the transaction (which will happen if it is an InnoDB table).
This is from the MySQL manual:
If expr is given as an argument to LAST_INSERT_ID(), the value of the
argument is returned by the function and is remembered as the next
value to be returned by LAST_INSERT_ID(). This can be used to simulate
sequences:
Create a table to hold the sequence counter and initialize it:
CREATE TABLE sequence (id INT NOT NULL);
INSERT INTO sequence VALUES (0);
Use the table to generate sequence numbers like this:
UPDATE sequence SET id=LAST_INSERT_ID(id+1);
SELECT LAST_INSERT_ID();
The UPDATE statement increments the sequence counter
and causes the next call to LAST_INSERT_ID() to return the updated
value. The SELECT statement retrieves that value. The
mysql_insert_id() C API function can also be used to get the value.
See Section 21.8.3.37, “mysql_insert_id()”.
You can generate sequences without calling LAST_INSERT_ID(), but the
utility of using the function this way is that the ID value is
maintained in the server as the last automatically generated value. It
is multi-user safe because multiple clients can issue the UPDATE
statement and get their own sequence value with the SELECT statement
(or mysql_insert_id()), without affecting or being affected by other
clients that generate their own sequence values.
One option is for you to use the nifty MyISAM feature that lets auto_increment values be incremented separately for each group.
CREATE UNIQUE INDEX elements_ix1 ON elements (groupId, myID)
myID INT NOT NULL AUTO_INCREMENT
That's more "atomic" than anything that involves updating a separate table. Note that this only works for MyISAM, not InnoDB.
excerpt from http://dev.mysql.com/doc/refman/5.1/en/example-auto-increment.html
MyISAM Notes
For MyISAM tables, you can specify AUTO_INCREMENT on a secondary column in a multiple-column index. In this case, the generated value for the AUTO_INCREMENT column is calculated as MAX(auto_increment_column) + 1 WHERE prefix=given-prefix. This is useful when you want to put data into ordered groups.
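The numbering rule quoted above, MAX(auto_increment_column) + 1 within the given prefix, can be imitated by hand to see what IDs it produces. This is only an emulation of the rule in Python's sqlite3 module, not the MyISAM feature itself; create_elements and insert_element are hypothetical names, and the table has just the two key columns.

```python
import sqlite3


def create_elements(conn):
    conn.execute(
        "CREATE TABLE elements ("
        " groupId INTEGER NOT NULL,"
        " myId INTEGER NOT NULL,"
        " PRIMARY KEY (groupId, myId))"
    )


def insert_element(conn, group_id):
    """Insert a row numbered MAX(myId) + 1 within its group and return that number."""
    with conn:  # the INSERT and the read-back share one transaction
        conn.execute(
            "INSERT INTO elements (groupId, myId)"
            " SELECT ?, COALESCE(MAX(myId), 0) + 1 FROM elements WHERE groupId = ?",
            (group_id, group_id),
        )
        row = conn.execute(
            "SELECT MAX(myId) FROM elements WHERE groupId = ?", (group_id,)
        ).fetchone()
    return row[0]
```

Each group gets its own sequence starting at 1. One consequence of the MAX + 1 rule is that deleting the highest-numbered row of a group makes its number available again.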
I would assume your MySQL installation also has the InnoDB engine, which does support transactions. You just need to change the engine type of your tables.
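With transactions available, the question's two statements can simply be run inside one transaction, doing the UPDATE first so the row stays locked until commit. The sketch below shows that shape using Python's sqlite3 module as a stand-in (SQLite serializes writers, so it is not a demonstration of InnoDB's row locking); create_groups and next_element_id are hypothetical names, and the table name is quoted because GROUPS is a keyword in some SQL dialects.

```python
import sqlite3


def create_groups(conn, group_ids):
    conn.execute(
        'CREATE TABLE "groups" ('
        " id INTEGER PRIMARY KEY,"
        " lastId INTEGER NOT NULL DEFAULT 0)"
    )
    conn.executemany('INSERT INTO "groups" (id) VALUES (?)', [(g,) for g in group_ids])


def next_element_id(conn, group_id):
    """Increment the group's counter and return the new value in one transaction."""
    with conn:  # commit on success, roll back on error
        cur = conn.execute(
            'UPDATE "groups" SET lastId = lastId + 1 WHERE id = ?', (group_id,)
        )
        if cur.rowcount != 1:
            raise KeyError(group_id)  # no such group
        (last_id,) = conn.execute(
            'SELECT lastId FROM "groups" WHERE id = ?', (group_id,)
        ).fetchone()
    return last_id
```

The returned value can then be used as the element's ID when inserting into the elements table.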
