MySQL Delete Rows With Duplicate and Similar Content - php

I am new to PHP and MySQL, and I have run into a problem while studying.
I have a table like this:
+------+--------+--------+
| time | name   | number |
+------+--------+--------+
| 12.1 | google | 10     |
| 12.2 | yahoo  | 15     |
| 12.3 | msn    | 20     |
| 12.1 | google | 10     |
| 12.1 | google | 29     |
| 12.2 | yahoo  | 10     |
+------+--------+--------+
but I want the table to look like this:
+------+--------+--------+
| time | name   | number |
+------+--------+--------+
| 12.2 | yahoo  | 15     |
| 12.3 | msn    | 20     |
| 12.1 | google | 29     |
+------+--------+--------+
When the time and the name are the same, I want to keep only the row with the largest number.
What should I do? This problem has been worrying me, and thank you for answering.

Try this (not tested):
SELECT `time`, `name`, MAX(`number`) AS `number`
FROM tbl
GROUP BY `name`, `time`
Tip: TIME is a MySQL keyword (though not a reserved one), so wrapping it in backticks is a safe habit. Also create a composite index on the name and time columns.

Try this:
SELECT time, name, MAX(number) AS number FROM [table name] GROUP BY time, name
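For anyone who wants to check the grouping idea without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in for MySQL here, and the table name tbl and the TEXT type for the time column are assumptions. The point it illustrates is that the aggregate goes in the select list: a HAVING MAX(number) clause on its own only tests whether the maximum is non-zero, it does not pick the row with the largest value.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (time TEXT, name TEXT, number INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [
        ("12.1", "google", 10),
        ("12.2", "yahoo", 15),
        ("12.3", "msn", 20),
        ("12.1", "google", 10),
        ("12.1", "google", 29),
        ("12.2", "yahoo", 10),
    ],
)

# One row per (time, name) pair, carrying the largest number in that group.
rows = conn.execute(
    "SELECT time, name, MAX(number) AS number "
    "FROM tbl GROUP BY time, name ORDER BY time"
).fetchall()

for row in rows:
    print(row)
```

This only selects the deduplicated rows. To actually remove the others you would still need a DELETE, or you could copy this result into a fresh table.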

Related

MySQL DELETE query works except not in PHP

I have a MySQL query to delete 'near' duplicate rows from a table, and while using test data outside of my project, the query appears to work as intended. When I use the same query with PHP in the project, I get an SQL error. I've been trying all sorts of different combinations of quotes and backticks and I can't seem to get this working.
Any idea what is going on here?
Problem being solved:
This table will sometimes have rows that are nearly identical, differing only in the as_of_date column and the total. Only the most recent date is important, and any older data is no longer needed in this table once newer data comes in.
Table structure with example data:
+----+---------+------+-------------+-------+
| id | account | year | as_of_date | total |
+----+---------+------+-------------+-------+
| 1 | 123 | 2017 | 2017-02-02 | 250 |
| 2 | 123 | 2017 | 2017-11-24 | 790 |
| 3 | 123 | 2018 | 2018-01-30 | 55 |
| 4 | 456 | 2016 | 2016-04-04 | 500 |
| 5 | 456 | 2016 | 2016-10-10 | 300 |
| 6 | 456 | 2017 | 2017-03-12 | 44 |
| 7 | 789 | 2015 | 2015-12-23 | 2000 |
+----+---------+------+-------------+-------+
Expected Outcome:
The desired result is to delete all 'near-duplicate' rows in the table except for the most recent one (as_of_date). So there should only be at most 1 row for any given account and year. The table should look like this after the query is executed:
+----+---------+------+-------------+-------+
| id | account | year | as_of_date | total |
+----+---------+------+-------------+-------+
| 2 | 123 | 2017 | 2017-11-24 | 790 |
| 3 | 123 | 2018 | 2018-01-30 | 55 |
| 5 | 456 | 2016 | 2016-10-10 | 300 |
| 6 | 456 | 2017 | 2017-03-12 | 44 |
| 7 | 789 | 2015 | 2015-12-23 | 2000 |
+----+---------+------+-------------+-------+
The query:
$query = "DELETE FROM `my_table` AS t
WHERE t.as_of_date NOT IN (
SELECT MAX(as_of_date)
FROM (SELECT * FROM `my_table`) AS t2
WHERE t2.account = t.account AND t2.year = t.year
GROUP BY account, `year`
)";
Here is the SQL error:
You have an error in your SQL syntax; check the manual that
corresponds to your MySQL server version for the right syntax to use
near 'AS t
WHERE t.as_of_date NOT IN (
S' at line 1
Don't use table aliases in DELETE FROM. That is, try DELETE FROM my_table WHERE..., omitting AS t.
By the way, the only times you really need backticks are when you have table names that are the same as reserved words or have spaces in them.
SELECT * FROM `SELECT`
or
SELECT * FROM `My Favorite Table`
Wise programmers avoid those situations.
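To illustrate the "no alias on the DELETE target" advice, here is a rough sketch that runs the cleanup against the sample data using Python's sqlite3 module. SQLite stands in for MySQL, so treat it as a demonstration of the logic rather than a drop-in fix: as far as I know, MySQL rejects a subquery that reads directly from the table being deleted from (error 1093), which is presumably why the original query wraps it in a derived table, and that wrapper would still be needed there.

```python
import sqlite3

# SQLite in memory, standing in for the MySQL table from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table ("
    "id INTEGER PRIMARY KEY, account INTEGER, year INTEGER, "
    "as_of_date TEXT, total INTEGER)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?, ?)",
    [
        (1, 123, 2017, "2017-02-02", 250),
        (2, 123, 2017, "2017-11-24", 790),
        (3, 123, 2018, "2018-01-30", 55),
        (4, 456, 2016, "2016-04-04", 500),
        (5, 456, 2016, "2016-10-10", 300),
        (6, 456, 2017, "2017-03-12", 44),
        (7, 789, 2015, "2015-12-23", 2000),
    ],
)

# No alias on the DELETE target. The subquery has its own alias (t2) and
# refers back to the row being examined through the full table name.
conn.execute(
    "DELETE FROM my_table "
    "WHERE as_of_date <> ("
    "  SELECT MAX(t2.as_of_date) FROM my_table AS t2"
    "  WHERE t2.account = my_table.account AND t2.year = my_table.year"
    ")"
)

remaining_ids = [r[0] for r in conn.execute("SELECT id FROM my_table ORDER BY id")]
print(remaining_ids)
```

The dates are stored as ISO-8601 strings, so MAX compares them correctly as text. With a real DATE column the comparison works the same way.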

MySQL Search Double with Currency Format

I have a PHP-based app that stores invoices entered by the user. Currently the invoice amount is stored in a MySQL database table as a double, like so:
+------+--------------+--------------+----------------+----------------+-------------+-----------+---------------+-------------+-------------+-----------------+
| id | date_entered | invoice_date | invoice_number | invoice_amount | client_type | unique_id | supplier_type | supplier_id | category_id | childcare_hours |
+------+--------------+--------------+----------------+----------------+-------------+-----------+---------------+-------------+-------------+-----------------+
| 1 | 1411098397 | 1411048800 | 123 | 0.01 | 0 | 137 | 0 | 139 | 5 | NULL |
| 2 | 1412123404 | 1416920400 | 5093 | 130 | 0 | 168 | 0 | 19 | 18 | NULL |
| 3 | 1412125933 | 1412085600 | 000 | 79 | 0 | 151 | 0 | 177 | 8 | NULL |
| 4 | 1412645652 | 1412600400 | 000 | 60.8 | 0 | 104 | 0 | 179 | 9 | NULL |
| 5 | 1412647563 | 1409320800 | 804560 | 225.5 | 0 | 18 | 0 | 174 | 10 | NULL |
I am also using DataTables to organise the data, with server-side processing performing the lookup and returning the results as JSON.
The issue I am having is that the user attempts to search by price, either by typing $123 or 123.50. This does not work because the SQL is generated like so: SELECT * FROM invoices WHERE invoice_amount LIKE "%$123%";
This obviously fails because the data is stored in the database as a double.
My question is: is there a way to make the SQL (or maybe the PHP) search for the correct value no matter what the client types in?
I don't think there is a generic solution to this problem, but you can strip special characters such as $ from the beginning or end of the search term before it is placed in the query. I'd also recommend comparing the values numerically, using ROUND in PHP as well as in MySQL, rather than using LIKE. LIKE is a string pattern match, so it is the wrong tool for a numeric column.
You can try the query without the "$":
SELECT * FROM invoices WHERE invoice_amount LIKE "%123%";
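A rough sketch of the "clean the input, then compare numerically" idea is below. It is written in Python with sqlite3 purely so it can be run standalone. The helper names parse_amount and find_invoices and the half-cent tolerance are my own assumptions, not anything from DataTables or the original app, and the same approach would carry over to PHP.

```python
import re
import sqlite3


def parse_amount(text):
    """Drop currency symbols, commas and spaces; return a float or None."""
    cleaned = re.sub(r"[^0-9.\-]", "", text)
    try:
        return float(cleaned)
    except ValueError:
        return None


def find_invoices(conn, search_text):
    """Look up invoices whose amount equals the typed value, to the cent."""
    amount = parse_amount(search_text)
    if amount is None:
        return []
    # Numeric comparison with a half-cent tolerance instead of LIKE,
    # using a bound parameter rather than string concatenation.
    return conn.execute(
        "SELECT id, invoice_amount FROM invoices "
        "WHERE ABS(invoice_amount - ?) < 0.005 ORDER BY id",
        (amount,),
    ).fetchall()


# Sample amounts taken from the table in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (id INTEGER PRIMARY KEY, invoice_amount REAL)")
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?)",
    [(1, 0.01), (2, 130), (3, 79), (4, 60.8), (5, 225.5)],
)

print(find_invoices(conn, "$130"))
print(find_invoices(conn, "60.80"))
```

Storing money as DECIMAL rather than DOUBLE would make exact equality safe and remove the need for the tolerance.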

Select rows from specific date with left join

I'm currently working on a simple employee scheduling tool and have a problem with a SQL query. To explain my problem, let's use these two very abstract tables.
The first table simply consists of the employees
employees
======================
empId | name | ...
----------------------
10 | Scott |
11 | Schrute |
12 | Halpert |
13 | Howard |
In the second table you find the assigned tasks to each employee by day.
tasks
==============================================
tasId | name | task | date | ...
----------------------------------------------
10 | Scott | Support | 2014-02-17 |
11 | Scott | Bugfix | 2014-02-18 |
12 | Halpert | Bugfix | 2014-02-17 |
13 | Halpert | Develop | 2014-02-18 |
14 | Howard | Support | 2014-02-17 |
Now I want to know what the employees are working on on Feb 17th or if they have no tasks planned for that day. I use the following SQL query to do that.
SELECT e.name, t.task
FROM employees e LEFT JOIN tasks t ON e.name = t.name
WHERE date IS NULL OR date = DATE('2014-02-17')
The result delivers exactly what I need:
name | task
--------------------
Scott | Support
Schrute | NULL
Halpert | Bugfix
Howard | Support
And now to my problem. If I want to see the tasks of Feb 18th I get this result set:
name | task
--------------------
Scott | Bugfix
Schrute | NULL
Halpert | Develop
The reason for this is obvious: the dates of Howard's tasks are neither NULL nor equal to 2014-02-18. What would be the best way to get the desired result?
I use MySQL and PHP.
(Sorry for the stupid title, I couldn't think of anything better..)
Move your filter to the join predicate:
SELECT e.name, t.task
FROM employees e
LEFT JOIN tasks t ON e.name = t.name AND t.date = DATE('2014-02-18');
Your query as it is will only return people who have no tasks at all, or have tasks on the date specified. It will omit people who have tasks, but not on the date specified. Consider the results of joining your sample data with no where clause:
empId | name | task | date | ...
----------------------------------------|
10 | Scott | Support | 2014-02-17 |
10 | Scott | Bugfix | 2014-02-18 |
11 | Schrute | NULL | NULL |
12 | Halpert | Bugfix | 2014-02-17 |
12 | Halpert | Develop | 2014-02-18 |
13 | Howard | Support | 2014-02-17 |
As you can see there is no record for Howard where Date = 2014-02-18, or where Date is null, this is why no record is returned for Howard. When you add the filter to the join predicate your results become:
empId | name | task | date | ...
----------------------------------------|
10 | Scott | Bugfix | 2014-02-18 |
11 | Schrute | NULL | NULL |
12 | Halpert | Develop | 2014-02-18 |
13 | Howard | NULL | NULL |
Which I think is the desired result.
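The difference between filtering in the WHERE clause and filtering in the join predicate can be reproduced with a small script. This is a sketch using Python's sqlite3 module as a stand-in for MySQL, with the sample rows from the question. The ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (empId INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE tasks (tasId INTEGER PRIMARY KEY, name TEXT, task TEXT, date TEXT)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [(10, "Scott"), (11, "Schrute"), (12, "Halpert"), (13, "Howard")],
)
conn.executemany(
    "INSERT INTO tasks VALUES (?, ?, ?, ?)",
    [
        (10, "Scott", "Support", "2014-02-17"),
        (11, "Scott", "Bugfix", "2014-02-18"),
        (12, "Halpert", "Bugfix", "2014-02-17"),
        (13, "Halpert", "Develop", "2014-02-18"),
        (14, "Howard", "Support", "2014-02-17"),
    ],
)

day = "2014-02-18"

# Date filter in the WHERE clause: Howard is dropped, because his only
# joined row has a non-NULL date that is not the requested day.
where_filtered = conn.execute(
    "SELECT e.name, t.task FROM employees e "
    "LEFT JOIN tasks t ON e.name = t.name "
    "WHERE t.date IS NULL OR t.date = ? ORDER BY e.empId",
    (day,),
).fetchall()

# Date filter in the join predicate: every employee is kept, and the
# task is NULL (None in Python) when nothing is planned for that day.
join_filtered = conn.execute(
    "SELECT e.name, t.task FROM employees e "
    "LEFT JOIN tasks t ON e.name = t.name AND t.date = ? ORDER BY e.empId",
    (day,),
).fetchall()

print(where_filtered)
print(join_filtered)
```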
You can also add this condition:
UNIX_TIMESTAMP(`date`) = 0
It looks like some of your dates are stored as '0000-00-00', which is not NULL, so adding the condition above as an extra OR will return those rows as well.

How to find most common words in a MySQL database table column

I have a table in the following format:
id | title
---+----------------------------
1 | php jobs, usa
3 | usa, php, jobs
4 | ca, mysql developer
5 | developer
I want to get the most popular keywords in the title field. Please advise.
If you have a list of keywords, you can do the following:
select kw.keyword, count(*)
from t join
keywords kw
on concat(', ', t.title, ',') like concat('%, ', kw.keyword, ',%')
group by kw.keyword
order by count(*) desc
As others have mentioned, though, you have a non-relational database design. The keywords in the title should be stored in separate rows, rather than as a comma separated list.
If your data is small (a few hundred thousand rows or less), you can put it into Excel, use the text-to-columns function, rearrange the keywords, and create a new, better table in the database.
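If a spreadsheet feels clumsy, the same splitting can be done in a few lines of script once the titles have been fetched. Here is a minimal sketch in Python. The titles list is just the sample data from the question typed in by hand, and in practice it would come from a SELECT on the table.

```python
from collections import Counter

# Sample titles from the question; normally these come from the database.
titles = [
    "php jobs, usa",
    "usa, php, jobs",
    "ca, mysql developer",
    "developer",
]

# Split each title on commas, trim whitespace, and tally every keyword.
counts = Counter(
    keyword.strip()
    for title in titles
    for keyword in title.split(",")
    if keyword.strip()
)

# Most frequent keywords first.
print(counts.most_common(3))
```

The resulting keyword list is also a convenient starting point for loading a separate keywords table.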
SELECT title, COUNT(*) FROM `table` GROUP BY title
EDIT
Since you've edited and presented a non-normalized table, I would recommend you normalize it.
Have a read of: http://blog.fedecarg.com/2009/02/22/mysql-split-string-function/
You need to modify your database. You should have something like this:
items
+----+---------------+
| id | title |
+----+---------------+
| 1 | something |
| 3 | another thing |
| 4 | yet another |
| 5 | one last one |
+----+---------------+
keywords
+----+-----------------+
| id | keyword |
+----+-----------------+
| 1 | php jobs |
| 2 | usa |
| 3 | php |
| 4 | jobs |
| 5 | ca |
| 6 | mysql developer |
| 7 | developer |
+----+-----------------+
items_to_keywords
+---------+------------+
| item_id | keyword_id |
+---------+------------+
| 1 | 1 |
| 1 | 2 |
| 3 | 2 |
| 3 | 3 |
| 3 | 4 |
| 4 | 5 |
| 4 | 6 |
| 5 | 7 |
+---------+------------+
Do you see the advantage? The ability to make relations is what you should be leveraging here.
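With that layout, "most common keyword" becomes a plain join and GROUP BY. The sketch below loads the keywords and items_to_keywords tables shown above into an in-memory SQLite database (a stand-in for MySQL) and counts usages. The alias names ik, k and uses are my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE keywords (id INTEGER PRIMARY KEY, keyword TEXT)")
conn.execute("CREATE TABLE items_to_keywords (item_id INTEGER, keyword_id INTEGER)")
conn.executemany(
    "INSERT INTO keywords VALUES (?, ?)",
    [
        (1, "php jobs"),
        (2, "usa"),
        (3, "php"),
        (4, "jobs"),
        (5, "ca"),
        (6, "mysql developer"),
        (7, "developer"),
    ],
)
conn.executemany(
    "INSERT INTO items_to_keywords VALUES (?, ?)",
    [(1, 1), (1, 2), (3, 2), (3, 3), (3, 4), (4, 5), (4, 6), (5, 7)],
)

# Count how many items use each keyword; ties are broken alphabetically.
top = conn.execute(
    "SELECT k.keyword, COUNT(*) AS uses "
    "FROM items_to_keywords ik "
    "JOIN keywords k ON k.id = ik.keyword_id "
    "GROUP BY k.id, k.keyword "
    "ORDER BY uses DESC, k.keyword "
    "LIMIT 3"
).fetchall()

print(top)
```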

disable the last row

I have this table in PostgreSQL:
appid | appname | apptype | creationtime | createdby | display
-------+-------------------------------------+---------+--------------+-----------+---------
0 | Custom | -1 | | | t
1000 | Performance/Resource | -2 | | | t
2000 | PING | 0 | | | t
2001 | HTTP | 0 | | | t
2002 | HTTPS | 0 | | | t
2003 | FTP | 0 | | | t
2004 | LDAP | 0 | | | t
2005 | IMAP | 0 | | | t
2006 | POP | 0 | | | t
2007 | SMTP | 0 | | | t
2008 | DNS | 0 | | | t
2009 | NFS | 0 | | | t
2010 | NTP | 0 | | | t
2011 | SSH | 0 | | | t
2012 | TCP | 0 | | | t
2013 | TELNET | 0 | | | t
3000 | Generic Mail (RTT) | 3 | | | t
3001 | Apache Tomcat | 2 | | | t
3002 | JBoss | 2 | | | t
3003 | MySQL | 1 | | | t
3004 | WebSphere | 2 | | | t
4000 | Microsoft Exchange Server 2003 | 3 | | | t
4001 | Exchange Server 2007 /2010 | 3 | | | t
4003 | Microsoft SQL Server 2008 | 1 | | | t
4004 | Microsoft ISA Server 2006 | 99 | | | t
4005 | Microsoft IIS | 4 | | | t
3005 | DB2 | 1 | | | t
3006 | Apache HTTP Server | 4 | | | t
3007 | Oracle | 1 | | | t
3008 | PostgreSQL | 1 | | | t
3009 | WebLogic | 2 | | | t
3010 | Adobe ColdFusion | 2 | | | t
3011 | Sybase | 1 | | | t
4007 | Microsoft SQL Server 2005 (EXPRESS) | 1 | | | t
4008 | Microsoft Team Foundation Server | 2 | | | t
4009 | Microsoft .NET | 99 | | | t
3012 | Apache Tomcat | 2 | | | f
Apache Tomcat appears twice when this table is shown in a dropdown. How do I disable the last row, which is 3012? I have the option to delete it, but this table is used in several places, so I don't want to delete it.
I think the more important question is why it is in your database twice. Depending on what you're using this query for, ignoring one could be just as wrong as deleting one. Maybe it has something to do with that row being the only entry with display f. I guess you could just do WHERE display != 'f'
Tomcat appears twice because it's in the table twice, once with ID 3001 and again with ID 3012. I'd call that a problem with the data, not with the query. Does it really need to be there twice? You say you're reluctant to delete one of them because it's used in several places, but if you don't fix it now, it'll only be harder to fix later.
I'd focus on merging those duplicate rows. Find all the records (in other tables) that refer to one of this pair, and update them to refer to the other one instead. Then you can delete the one that nothing refers to anymore.
Why not use the display column? Sounds like it is for filtering out things that are not supposed to be shown. So, you could do a query like this:
SELECT appid, appname
FROM stat_applications
WHERE appid <> 0
AND appid <> 1000
AND display = 't'
If you don't want to use display, then you could just add 3012 to your list of things to ignore:
SELECT appid, appname
FROM stat_applications
WHERE appid <> 0
AND appid <> 1000
AND appid <> 3012
Or, since your exclusion list is getting longer, use NOT IN:
SELECT appid, appname
FROM stat_applications
WHERE appid NOT IN (0, 1000, 3012)
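Here is a small runnable sketch of the display-based filter, using Python's sqlite3 module with a handful of rows from the table. SQLite has no boolean type, so 1 and 0 stand in for PostgreSQL's t and f here, and that substitution is my assumption. In PostgreSQL you would keep display = 't' as written above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# display is stored as 1/0 here because SQLite lacks a boolean type.
conn.execute(
    "CREATE TABLE stat_applications ("
    "appid INTEGER PRIMARY KEY, appname TEXT, display INTEGER)"
)
conn.executemany(
    "INSERT INTO stat_applications VALUES (?, ?, ?)",
    [
        (0, "Custom", 1),
        (1000, "Performance/Resource", 1),
        (3001, "Apache Tomcat", 1),
        (3002, "JBoss", 1),
        (3012, "Apache Tomcat", 0),  # the duplicate, already flagged hidden
    ],
)

# Rows for the dropdown: skip the two special entries and anything hidden.
dropdown = conn.execute(
    "SELECT appid, appname FROM stat_applications "
    "WHERE appid NOT IN (0, 1000) AND display = 1 "
    "ORDER BY appid"
).fetchall()

print(dropdown)
```

Filtering on the flag keeps row 3012 in the table for whatever else refers to it, while hiding it from the dropdown.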
