I'm looking for a better way to do this, as I have a very large database with one table per symbol (named a, aa, and so on).
I would like to know whether I can query every table in a single statement, ideally in descending order. Otherwise I will have to type thousands of UNIONs, which will be painful to maintain because the database changes often: tables are dropped and others take their place.
Every table has a Date column, and I would like to search by date.
Thank you in advance.
For example:
SELECT * from a where Date = '2017-07-31' union
SELECT * from aa where Date = '2017-07-31' union
SELECT * from aaap where Date = '2017-07-31' union
SELECT * from aabvf where Date = '2017-07-31'
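I mean, you COULD list every table in one FROM clause, but be aware that a comma-separated table list is a cross join rather than a union, and the unqualified date column becomes ambiguous: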
SELECT * FROM a,aa,aaap,aabvf WHERE date='2017-07-21'
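If the set of tables keeps changing, the UNION does not have to be typed by hand; it can be generated from the database's own catalog. Here is a minimal sketch in Python, using the standard-library sqlite3 module as a stand-in for MySQL (in MySQL the table names would come from information_schema.tables instead of sqlite_master). The rows_for_date helper, the close column and the sample values are made up for illustration.

```python
import sqlite3


def rows_for_date(conn, date):
    """Build one UNION ALL over every table and return the rows for one date."""
    # sqlite_master is SQLite's catalog of tables.
    tables = [name for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    selects, params = [], []
    for name in tables:
        # Table names cannot be bound as parameters, so quote them as
        # identifiers; the symbol label and the date are bound with "?".
        ident = '"' + name.replace('"', '""') + '"'
        selects.append(
            f'SELECT ? AS symbol, "Date", close FROM {ident} WHERE "Date" = ?')
        params += [name, date]
    if not selects:
        return []
    sql = " UNION ALL ".join(selects) + " ORDER BY symbol DESC"
    return conn.execute(sql, params).fetchall()


# Hypothetical sample data: three ticker tables with two dates each.
conn = sqlite3.connect(":memory:")
for symbol, close in [("a", 1.0), ("aa", 2.0), ("aaap", 3.0)]:
    conn.execute(f'CREATE TABLE "{symbol}" ("Date" TEXT UNIQUE, close REAL)')
    conn.execute(f'INSERT INTO "{symbol}" VALUES (?, ?)', ("2017-07-28", close))
    conn.execute(f'INSERT INTO "{symbol}" VALUES (?, ?)', ("2017-07-31", close))

result = rows_for_date(conn, "2017-07-31")
```

With thousands of tables the generated statement gets very long, and database engines cap statement size and the number of UNION terms, so in practice the table list would probably have to be processed in batches.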
Ahmed helped me out. As for why my data is structured like that: if you have a better suggestion, I'm open to it.
Basically, my data is organized by symbol, i.e. A and AA are stock tickers.
Each has dates that serve as unique keys for open, high, low, and various other stock measurements.
The reason I want to grab a single date is that it is the top date ("today"), which I display and chart so that I can do various other things with the data.
If you have another method of storing this, I'm open to it.
I wrote a Java program (I'm not normally a web developer) that mines the data and stores it in the form I described. I could change that if you have a better way; I would love to hear it. I would also love to hear any opinions on how to store data faster with MySQL. Currently I have a few hundred threads storing data, each handling one symbol. A thread creates a table named after the ticker if it doesn't exist and writes its data into columns: date (unique key), open, high, and so on. It also performs various other operations on the incoming data and stores the results. Thank you for the answer, and thank you if you have a better method!
PS: Sorry, I didn't mean chart. I display the top date as a table with the corresponding data attached!
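One commonly suggested alternative to a table per ticker is a single table with the symbol as a column, so that "every symbol on one date" becomes an ordinary WHERE clause. The following is only a sketch under assumptions: it uses Python's sqlite3 module in place of MySQL, and the quotes table, its columns and the sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE quotes (
        symbol TEXT NOT NULL,
        date   TEXT NOT NULL,
        open   REAL,
        high   REAL,
        low    REAL,
        PRIMARY KEY (symbol, date)   -- one row per ticker per day
    )""")
# A second index so that lookups by date alone do not scan the whole table.
conn.execute("CREATE INDEX quotes_by_date ON quotes (date)")

# Made-up sample rows: (symbol, date, open, high, low).
sample = [
    ("a", "2017-07-28", 1.0, 1.5, 0.5),
    ("a", "2017-07-31", 1.1, 1.6, 0.6),
    ("aa", "2017-07-31", 2.0, 2.5, 1.5),
    ("aaap", "2017-07-31", 3.0, 3.5, 2.5),
]
# Insert the whole batch in one transaction. INSERT OR REPLACE is SQLite
# syntax; MySQL offers INSERT ... ON DUPLICATE KEY UPDATE for the same purpose.
with conn:
    conn.executemany(
        "INSERT OR REPLACE INTO quotes VALUES (?, ?, ?, ?, ?)", sample)

# The "top date" for every symbol, in descending symbol order.
latest = conn.execute("""
    SELECT symbol, date, open, high, low
    FROM quotes
    WHERE date = (SELECT MAX(date) FROM quotes)
    ORDER BY symbol DESC""").fetchall()
```

With this layout, adding or retiring a ticker means inserting or deleting rows, so no query has to change when the list of symbols does.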
Possible Duplicate:
Which is faster/best? SELECT * or SELECT column1, colum2, column3, etc
I recall reading a number of years ago that in the scenario where you wanted to select everything from a MySQL table it was more efficient and better practice to specify every column rather than use the lazy and less-efficient SELECT * approach.
I am struggling to find any evidence of this online, so I'm not sure whether it still applies to newer versions of MySQL and PHP.
Would it be better to specify every column in my SELECT rather than using SELECT *?
SELECT * FROM golf_course WHERE id = 2;
SELECT * is inherently less efficient because the database has to look up the columns. Further, if you have even one join, you are sending unnecessary repeated data, which wastes database and network resources, particularly if you do it on every query. Finally, SELECT * doesn't specify the order of the columns, so if someone foolish drops and recreates the table with the columns in a different order, you may suddenly have your Social Security number showing up in the first-name column on a form or report. And if someone adds a column that you don't want displayed everywhere (say, for auditing purposes, or notes about the customer that you never want the customer to see), you are in trouble. Further, if you do add columns, you need to decide where they should show up anyway, rather than have them show up willy-nilly everywhere. SELECT * is an extremely bad SQL antipattern.
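The column-order hazard can be reproduced in a few lines. This is a sketch using Python's sqlite3 module rather than MySQL and PHP; the golf_course table name comes from the question, while the name and holes columns and the sample row are invented for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE golf_course (id INTEGER PRIMARY KEY, name TEXT, holes INTEGER)")
conn.execute("INSERT INTO golf_course VALUES (2, 'Example Links', 18)")
star_before = conn.execute(
    "SELECT * FROM golf_course WHERE id = 2").fetchone()

# Someone drops and recreates the table with the columns in another order.
conn.execute("DROP TABLE golf_course")
conn.execute(
    "CREATE TABLE golf_course (id INTEGER PRIMARY KEY, holes INTEGER, name TEXT)")
conn.execute(
    "INSERT INTO golf_course (id, name, holes) VALUES (2, 'Example Links', 18)")

# Code that reads SELECT * by position now sees the values swapped...
star_after = conn.execute(
    "SELECT * FROM golf_course WHERE id = 2").fetchone()
# ...while an explicit column list keeps its order regardless of the table.
explicit_after = conn.execute(
    "SELECT id, name, holes FROM golf_course WHERE id = 2").fetchone()
```

Here star_before and explicit_after hold the values in the same order, while star_after has name and holes swapped.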
Yes.
- Your query will be easier to understand.
- You will select only the columns you need.
- If you list all the columns, it is easy to remove the ones you don't want later; if you start with SELECT *, you will have to type out all the columns when that day comes.
- You'll be able to arrange the columns the way you want and reference them in your code even if more columns are added to the table later.
- You can easily add computations, concatenations, and more.
Performance-wise, I am not sure
IMHO, SELECT * has to read all fields, whereas selecting only the required fields will be more efficient. When you are querying a larger database, performance may suffer with SELECT *.
I am making a social networking site and can't decide on the best way to pull data from various tables to display in a feed. Every time something is stored, a timestamp is stored with it, so I am wondering about the best way to retrieve data from several different tables ordered by timestamp and limited to 20 results per page. Ideally I would like MySQL to query all of the different tables and do the ordering and limiting for me, but because the tables are not all necessarily related, and different data needs to be returned depending on what each table is for, I don't think this is going to be possible. I can query each table individually, of course, but then how do I sort and order all of the information into pages so that all of the different entities appear together in one ordered list? The server-side language I use is PHP with the CodeIgniter framework.
Anyone got any ideas?
Can you establish a common format for what's returned out of all the separate tables? So for example, you would write a query that got back FeedTitle, FeedSummary, and Timestamp:
select top 20 *
from (
    select a.Title as FeedTitle,
           a.A + a.B + a.C as FeedSummary,
           a.Timestamp as TimeStamp
    from a
    union all
    select b.Name + ' married ' + b.Spouse as FeedTitle,
           b.AtPlace as FeedSummary,
           b.TimeStamp as TimeStamp
    from b
) as allFeeds
order by TimeStamp desc
I'm not sure of the exact MySQL syntax; this will work in SQL Server and should be very similar (MySQL uses LIMIT 20 at the end of the query instead of TOP 20, and CONCAT() instead of + to join strings). It's just pseudocode anyway; the idea is that you'd do some of your application logic in the database in order to hopefully gain a performance boost (so you don't have to sort through lots of data in PHP).
Another approach would be to return the last 20 from each table and let the client side sort through them. So send them all to the UI and let jQuery code display the top 20, then let the users select the type of feed dynamically, and they'd see the top 20 stories in any one type or any combination of types.
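To make the common-format idea concrete, here is a rough sketch of the UNION ALL feed with paging. It runs against SQLite through Python's sqlite3 module rather than MySQL and CodeIgniter, so string concatenation uses || where MySQL would use CONCAT(); the posts and weddings tables, their columns, the feed_page helper and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (title TEXT, body TEXT, created_at TEXT);
    CREATE TABLE weddings (name TEXT, spouse TEXT, place TEXT, created_at TEXT);
    INSERT INTO posts VALUES ('Hello', 'First post', '2010-01-02 10:00:00');
    INSERT INTO posts VALUES ('Again', 'Second post', '2010-01-04 09:00:00');
    INSERT INTO weddings VALUES ('Ann', 'Bob', 'City Hall', '2010-01-03 12:00:00');
""")


def feed_page(conn, page, per_page=20):
    """Return one page of the combined feed, newest first (page 0 is the first)."""
    sql = """
        SELECT title AS feed_title, body AS feed_summary, created_at
        FROM posts
        UNION ALL
        SELECT name || ' married ' || spouse, place, created_at
        FROM weddings
        ORDER BY created_at DESC
        LIMIT ? OFFSET ?
    """
    return conn.execute(sql, (per_page, page * per_page)).fetchall()


first_page = feed_page(conn, 0)
second_page = feed_page(conn, 1, per_page=2)
```

Each branch of the union maps its own table onto the same three columns, so the database can sort and page the combined list in one step.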
I know I am writing my queries wrong, and when we get a lot of traffic our database gets hit HARD and the page slows to a crawl...
I think I need to write queries based on a CREATE VIEW covering the last 30 days from CURDATE()? But I'm not sure where to begin, or whether this would be a MORE efficient query for the database.
Anyways, here is a sample query I have written..
$query_Recordset6 = "SELECT `date`, title, category, url, comments
FROM cute_news
WHERE category LIKE '%45%'
ORDER BY `date` DESC";
Any help or suggestions would be great! I have about 11 queries like this, but I am confident if I could get help on one of these, then I can implement them to the rest!!
Putting a wildcard on the left side of a value comparison:
LIKE '%xyz'
...means that an index can not be used, even if one exists. Might want to consider using Full Text Searching (FTS), which means adding full text indexing.
Normalizing the data would be another step to consider - categories should likely be in a separate table.
SELECT `date`, title, category, url, comments
FROM cute_news
WHERE category LIKE '%45%'
ORDER BY `date` DESC
The LIKE '%45%' means a full table scan will need to be performed. Are you perhaps storing a list of categories in the column? If so, creating a new table that stores category and news_article_id will allow an index to be used to retrieve the matching records much more efficiently.
OK, time for psychic debugging.
In my mind's eye, I see that query performance would be improved considerably through database normalization, specifically by splitting the multi-valued category column into a separate table that has two columns: the primary key for cute_news and the category ID.
This would also allow you to directly link said table to the categories table without having to parse it first.
Or, as Chris Date said: "Every row-and-column intersection contains exactly one value from the applicable domain (and nothing else)."
Anything with LIKE '%XXX%' is going to be slow. It's a slow operation.
For something like categories, you might want to separate categories out into another table and use a foreign key in the cute_news table. That way you can have category_id, and use that in the query which will be MUCH faster.
Also, I'm not quite sure why you're talking about using CREATE VIEW. Views will not really help you for speed, not unless it's a materialized view, which MySQL doesn't support natively.
If your database is getting hit hard, the solution isn't to make a view (the view is still basically the same amount of work for the database to do), the solution is to cache the results.
This is especially applicable since, from what it sounds like, your data only needs to be refreshed once every 30 days.
I'd guess that your category column is a list of category values like "12,34,45,78" ?
This is not good relational database design. One reason it's not good is as you've discovered: it's incredibly slow to search for a substring that might appear in the middle of that list.
Some people have suggested using fulltext search instead of the LIKE predicate with wildcards, but in this case it's simpler to create another table so you can list one category value per row, with a reference back to your cute_news table:
CREATE TABLE cute_news_category (
  news_id INT NOT NULL,
  category INT NOT NULL,
  PRIMARY KEY (news_id, category),
  FOREIGN KEY (news_id) REFERENCES cute_news(news_id)
) ENGINE=InnoDB;
Then you can query and it'll go a lot faster:
SELECT n.`date`, n.title, c.category, n.url, n.comments
FROM cute_news n
JOIN cute_news_category c ON (n.news_id = c.news_id)
WHERE c.category = 45
ORDER BY n.`date` DESC
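Moving an existing comma-separated column into the new table could look roughly like the sketch below. It assumes the guess above is right (category holds values like "12,34,45,78") and that cute_news has a news_id key; it uses Python's sqlite3 module instead of PHP and MySQL, and the sample rows and the extra index are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cute_news (
        news_id INTEGER PRIMARY KEY, date TEXT, title TEXT, category TEXT);
    CREATE TABLE cute_news_category (
        news_id  INTEGER NOT NULL REFERENCES cute_news(news_id),
        category INTEGER NOT NULL,
        PRIMARY KEY (news_id, category));
    -- The primary key starts with news_id, so lookups by category alone
    -- get their own index.
    CREATE INDEX cute_news_category_by_category
        ON cute_news_category (category, news_id);
    INSERT INTO cute_news VALUES (1, '2010-05-01', 'One', '12,45');
    INSERT INTO cute_news VALUES (2, '2010-05-02', 'Two', '145,450');
    INSERT INTO cute_news VALUES (3, '2010-05-03', 'Three', '45');
""")

# One-off migration: split each list into one row per category value.
for news_id, categories in conn.execute(
        "SELECT news_id, category FROM cute_news").fetchall():
    values = sorted({int(c) for c in categories.split(",") if c.strip()})
    conn.executemany(
        "INSERT INTO cute_news_category VALUES (?, ?)",
        [(news_id, value) for value in values])
conn.commit()

# The substring search also matches 145 and 450.
like_titles = [title for (title,) in conn.execute(
    "SELECT title FROM cute_news WHERE category LIKE '%45%' ORDER BY date DESC")]

# The join matches category 45 only.
join_titles = [title for (title,) in conn.execute("""
    SELECT n.title
    FROM cute_news n
    JOIN cute_news_category c ON n.news_id = c.news_id
    WHERE c.category = 45
    ORDER BY n.date DESC""")]
```

Besides speed, the sample data shows a correctness difference: the LIKE query returns the article tagged 145 and 450, while the join does not.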
Any answer is a guess unless you show:
- the relevant SHOW CREATE TABLE outputs
- the EXPLAIN output from your common queries.
And Bill Karwin's comment certainly applies.
After all this and further optimizing, copying the last 30 days of data into its own table could still be desirable, in which case you're better off running a daily cron job to do just that.
I'm running a SQL query to get basic details from a number of tables, sorted by the last-update date field. It's terribly tricky, and I'm wondering whether there is an alternative to using the UNION clause... I'm working in PHP and MySQL.
I have a few tables containing news, articles, photos, events, etc., and need to collect all of them in one query to show a simple "what's newly added on the website" kind of thing.
Maybe do it in PHP rather than MySQL - if you want the latest n items, then fetch the latest n of each of your news items, articles, photos and events, and sort in PHP (you'll need the last n of each obviously, and you'll then trim the dataset in PHP). This is probably easier than combining those with UNION given they're likely to have lots of data items which are different.
I'm not aware of an alternative to UNION that does what you want, and hopefully those fetches won't be too expensive. It would definitely be wise to profile this though.
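The fetch-n-from-each-and-trim idea can be sketched as a merge of already-sorted lists. The example is in Python rather than PHP, and it assumes each per-table query has already returned its latest n rows, newest first; the latest_items helper, the dictionary keys and the sample rows are hypothetical.

```python
import heapq
from itertools import islice


def latest_items(n, *sources):
    """Merge lists that are each sorted newest-first and keep the newest n."""
    merged = heapq.merge(
        *sources, key=lambda item: item["updated"], reverse=True)
    return list(islice(merged, n))


# Stand-ins for the results of four separate "ORDER BY updated DESC LIMIT n"
# queries, one per table.
news = [
    {"type": "news", "title": "N2", "updated": "2010-03-04"},
    {"type": "news", "title": "N1", "updated": "2010-03-01"},
]
articles = []
photos = [{"type": "photo", "title": "P1", "updated": "2010-03-03"}]
events = [{"type": "event", "title": "E1", "updated": "2010-03-02"}]

newest = latest_items(3, news, articles, photos, events)
```

Because every input is already sorted, the merge only has to compare the heads of the lists, and each item can keep whatever extra fields its own table provides.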
If you use a JOIN in your query, you can select data from different tables that are related by foreign keys.
You can look at this from another angle: do you absolutely need up-to-date information (that is, should new information appear the moment someone enters it)?
If not, you can have a table holding the results of the query in the format you need (serving as cache), and update this table every 5 minutes or so. Then your query problem becomes trivial, as you can have the updates run as several updates in the background.
Say you've got a database like this:
books
-----
id
name
And you wanted to get the total number of books in the database, easiest possible sql:
"select count(id) from books"
But now you want to get the total number of books last month...
Edit: but some of the books have been deleted from the table since last month.
Well, obviously you can't compute a total for a month that's already past: the "books" table is always current, and some of the records have already been deleted.
My approach was to run a cron job (or scheduled task) at the end of the month and store the total in another table, called report_data, but this seems clunky. Any better ideas?
Add a column whose default value is GETDATE() and call it "DateAdded". Then you can query between any two dates to find out how many books were added during that period, or you can specify a single date to find out how many books had been added before it (all the way back through history).
Per comment: You should not delete, you should soft delete.
I agree with JP, do a soft delete/logical delete. For the one extra AND statement per query it makes everything a lot easier. Plus, you never lose data.
Granted, if extreme size becomes an issue, then yeah, you'll potentially have to start physically moving/removing rows.
My approach was to run a cron job (or scheduled task) at the end of the month and store the total in another table, called report_data, but this seems clunky.
I have used this method to collect and store historical data. It was simpler than a soft-delete solution because:
- The "report_data" table is very easy to generate reports/graphs from
- You don't have to implement special soft-delete code for anything that needs to delete a book
- You don't have to add "and active = 1" to the end of every query that selects from the books table
Because the code to do the historical reporting is isolated from everything else that uses books, this was actually the less clunky solution.
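A minimal sketch of such an end-of-month job is shown below, assuming something like cron calls it on the last day of each month. It uses Python's sqlite3 module; the books and report_data table names come from the question, while the column names, the record_month_total helper and the sample rows are invented.

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE books (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE report_data (
        month_end  TEXT PRIMARY KEY,
        book_count INTEGER NOT NULL);
    INSERT INTO books (name) VALUES ('Book A'), ('Book B'), ('Book C');
""")


def record_month_total(conn, month_end):
    """Store the current number of books under the given month-end date."""
    total = conn.execute("SELECT COUNT(id) FROM books").fetchone()[0]
    with conn:
        # OR REPLACE makes the job safe to run twice for the same month.
        conn.execute(
            "INSERT OR REPLACE INTO report_data (month_end, book_count) "
            "VALUES (?, ?)",
            (month_end.isoformat(), total))
    return total


record_month_total(conn, date(2009, 6, 30))
conn.execute("DELETE FROM books WHERE name = 'Book B'")  # a real delete
record_month_total(conn, date(2009, 7, 31))

history = conn.execute(
    "SELECT month_end, book_count FROM report_data ORDER BY month_end"
).fetchall()
```

The June figure survives the later delete because it was copied out of the live table at the time.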
If you needed data from the previous month then you should not have deleted the old data. Instead you can have a "logical delete."
I would add a status field and some dates to the table.
books
_____
id
bookname
date_added
date_deleted
status (active/deleted)
From there you would be able to query:
SELECT count(id) FROM books WHERE date_added <= '06/30/2009' AND (date_deleted IS NULL OR date_deleted > '06/30/2009')
NOTE: It may not be the best schema, but you get the idea... ;)
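As a rough illustration of counting "as of" a past date with a logical delete, here is a sketch using Python's sqlite3 module. It keeps the date_added and date_deleted columns from the schema above and leaves out status, since a NULL date_deleted already means "active"; the soft_delete and count_as_of helpers and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE books (
        id           INTEGER PRIMARY KEY,
        bookname     TEXT,
        date_added   TEXT NOT NULL,
        date_deleted TEXT);           -- NULL while the book is active
    INSERT INTO books (id, bookname, date_added) VALUES
        (1, 'Book A', '2009-05-01'),
        (2, 'Book B', '2009-06-15'),
        (3, 'Book C', '2009-07-05');
""")


def soft_delete(conn, book_id, day):
    """Mark a book as deleted instead of removing its row."""
    with conn:
        conn.execute(
            "UPDATE books SET date_deleted = ? "
            "WHERE id = ? AND date_deleted IS NULL",
            (day, book_id))


def count_as_of(conn, day):
    """Count books that had been added, and not yet deleted, on the given day."""
    return conn.execute(
        "SELECT COUNT(id) FROM books "
        "WHERE date_added <= ? AND (date_deleted IS NULL OR date_deleted > ?)",
        (day, day)).fetchone()[0]


soft_delete(conn, 2, "2009-07-10")
end_of_may = count_as_of(conn, "2009-05-31")
end_of_june = count_as_of(conn, "2009-06-30")
end_of_july = count_as_of(conn, "2009-07-31")
```

Book B still counts for June because it was deleted in July, which is the case a plain filter on the current status would miss.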
If changing the schema of the tables is too much work I would add triggers that would track the changes. With this approach you can track all kinds of things like date added, date deleted etc.
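The trigger approach might look something like the following sketch, which logs every insert and delete on books into a separate table without touching the books schema. It is written against SQLite through Python's sqlite3 module (MySQL also has CREATE TRIGGER, with somewhat different syntax), and the books_log table and trigger names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE books (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books_log (
        book_id    INTEGER NOT NULL,
        action     TEXT NOT NULL,
        changed_at TEXT DEFAULT CURRENT_TIMESTAMP);

    CREATE TRIGGER books_added AFTER INSERT ON books
    BEGIN
        INSERT INTO books_log (book_id, action) VALUES (NEW.id, 'added');
    END;

    CREATE TRIGGER books_deleted AFTER DELETE ON books
    BEGIN
        INSERT INTO books_log (book_id, action) VALUES (OLD.id, 'deleted');
    END;
""")

# Ordinary application code; it does not need to know about the log.
conn.execute("INSERT INTO books (id, name) VALUES (1, 'Book A')")
conn.execute("INSERT INTO books (id, name) VALUES (2, 'Book B')")
conn.execute("DELETE FROM books WHERE id = 1")
conn.commit()

log = conn.execute(
    "SELECT book_id, action FROM books_log ORDER BY rowid").fetchall()
# Additions minus deletions gives the number of books on record.
net_count = conn.execute(
    "SELECT SUM(CASE action WHEN 'added' THEN 1 ELSE -1 END) FROM books_log"
).fetchone()[0]
```

Restricting that last query to rows with changed_at up to a given date would give the count as of that date.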
Given your problem and your reluctance to change the schema and the code, I would suggest you go with your idea of counting the books at the end of each month and storing the count for the month in another table. You can use a database scheduler to invoke a stored procedure to do this.
You have just taken a baby step down the road of history databases or data warehousing.
A data warehouse typically stores data about the way things were, in a format such that later data is added to current data instead of superseding it. There is a lot to learn about data warehousing. If you are headed down that road in a serious way, I suggest a book by Ralph Kimball or Bill Inmon. I prefer Kimball.
Here are the websites: http://www.ralphkimball.com/
http://www.inmoncif.com/home/
If, on the other hand, your first step into this territory is the only step you plan to take, your proposed solution is good enough.
The only way to do what you want is to add a column to the books table "date_added". Then you could run a query like
select count(id) from books where date_added <= '06/30/2009';