I'm setting up to gather long-term statistics. They will be recorded in little blocks that I'm planning to stick all into one TEXT field, latest first, sorta like this:
[date:03.01.2016,data][date:02.01.2016,data][date:01.01.2016,data]...
It will be more frequent than that (this is just a sample), but it should remain small enough to keep recording for decades, yet big enough to make me want to optimize it.
I'm looking for two things:
Can you append to the front of a field in mysql?
Can you read the field partially, just the first 100 characters for example?
The blocks will be fixed length so I can accurately estimate how many characters I need to download to display statistics for X time period.
The answer to your two questions is "yes":
update t
set field = concat($newval, field)
where id = $id;
And:
select left(field, 100)
from t
where id = $id;
(These assume that you have multiple rows in the table.)
That said, your method of storing the data is absolutely not the right thing to do in a relational database.
Presumably, you want a table that looks something like this:
create table t (
tId int auto_increment primary key,
creationDate date,
data <something>
);
(This may be more complicated if data should be multiple columns.)
Then you insert into the table:
insert into t(creationDate, data)
values ($date, $data);
And you can fetch the most recent row:
select t.*
from t
order by tId desc
limit 1;
All of these are just examples, because your question doesn't give a complete picture of the data.
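Once the data is one row per observation, getting "statistics for X time period" becomes a simple range query instead of a substring read. A minimal sketch, assuming the creationDate column from the example table:
select creationDate, data
from t
where creationDate >= curdate() - interval 30 day
order by creationDate desc;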
I wrote a small script which uses the concept of long polling.
It works as follows:
jQuery sends a request with some parameters (say lastId) to PHP.
PHP gets the latest id from the database and compares it with lastId.
If lastId is smaller than the newly fetched id, the script exits and echoes the new records.
jQuery then displays this output.
I have taken care of all security checks. The problem is that when a record is deleted or updated, there is no way to know it.
The nearest solution I can get is to count the number of rows and match it against some saved row-count variable. But then, if I have 1000 records, I have to echo out all 1000 records, which can be a big performance issue.
The CRUD functionality of this application is completely separate and runs on a different server, so I don't get to know which record was deleted.
I don't need any help coding-wise, but I am looking for suggestions to make this work for updates and deletes as well.
Please note, WebSockets (my favorite) and Node.js are not an option for me.
Instead of using a certain ID from your table, you could also check when the table itself was modified the last time.
SQL:
SELECT UPDATE_TIME
FROM information_schema.tables
WHERE TABLE_SCHEMA = 'yourdb'
AND TABLE_NAME = 'yourtable';
If successful, the statement should return something like
UPDATE_TIME
2014-04-02 11:12:15
Then use the resulting timestamp instead of lastId. I am using a very similar technique to display and auto-refresh logs, and it works like a charm.
You have to adjust the statement to your needs, and replace yourdb and yourtable with the values needed for your application. It also requires you to have access to information_schema.tables, so check if this is available, too.
Two alternative solutions:
If the solution described above is too imprecise for your purpose (it might lead to issues when the table is changed multiple times per second), you could combine that timestamp with your current lastId mechanism to cover new inserts.
Another way would be to add a table in which the current state is logged. This is what your AJAX requests check. Then generate triggers on your data tables which update this state table.
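A rough sketch of that last idea, with hypothetical names (table_state for the state table, yourtable for a data table); you would add matching AFTER INSERT and AFTER UPDATE triggers the same way:
CREATE TABLE table_state (
  table_name  VARCHAR(64) PRIMARY KEY,
  last_change DATETIME NOT NULL
);

INSERT INTO table_state VALUES ('yourtable', NOW());

-- bump the timestamp whenever a row is deleted
CREATE TRIGGER yourtable_after_delete
AFTER DELETE ON yourtable
FOR EACH ROW
  UPDATE table_state SET last_change = NOW() WHERE table_name = 'yourtable';
Your polling query then reads last_change from table_state instead of information_schema.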
You can get the highest ID by
SELECT id FROM table ORDER BY id DESC LIMIT 1
but this is not reliable in my opinion, because you can have IDs of 1, 2, 3, 7 and then insert a new row that gets the ID 5.
Keep in mind: the highest ID, is not necessarily the most recent row.
The current auto increment value can be obtained by
SELECT AUTO_INCREMENT FROM information_schema.tables
WHERE TABLE_SCHEMA = 'yourdb'
AND TABLE_NAME = 'yourtable';
Maybe a timestamp + microtime is an option for you?
I'm building a like system for a website and I'm facing a dilemma.
I have a table where all the items which can be liked are stored. Call it the "item table".
In order to preserve the speed of the server, do I have to:
add a column to the item table.
This means that each time a user likes an item, I have to search (with a regex in my PHP) inside a string holding the IDs of all the users who have already liked it, in order to verify whether the user in question has already liked the item. Depending on the result, I show a different button in my HTML.
Problem > If an item happens to get 3000 likes, I fear the string will become very big and heavy to regex every time there is a like on it...
add a specific new table (LikedBy) and record each like separately, with the ID of the liker, the name of the item and the state of the like (liked or not).
Problem > In this case, I fear for the MySQL server with thousands of rows to analyze each time a new user likes a popular item...
Server version: 5.5.36-cll-lve MySQL Community Server (GPL) by Atomicorp
Should I put the load on the PHP script or the MySQL database? Which is more performant (and scalable)?
If, for some reason, my question does not make sense, could anyone tell me the right way to do the trick?
thx.
You have to create another table, call it likes_table, containing id_user INT and id_item INT. That's how it should be done; if you go with your proposed first solution your database won't be normalized and you'll face too many issues in the future.
To get the like count you just have to:
SELECT COUNT(*) FROM likes_table WHERE id_item='id_item_you_are_looking_for';
To get who liked what:
SELECT id_item FROM likes_table WHERE id_user='id_user_you_are_looking_for';
No regex needed, nothing, and your database is well normalized so the data can be found easily. You can tell MySQL to index id_user and id_item, making the pair unique in likes_table; this way all your queries will run much faster.
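A minimal sketch of that table (the integer types are an assumption, adjust them to match your users and items tables):
CREATE TABLE likes_table (
  id_user INT NOT NULL,
  id_item INT NOT NULL,
  PRIMARY KEY (id_user, id_item),  -- one like per user per item
  KEY idx_item (id_item)           -- speeds up the per-item COUNT(*)
);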
With MySQL you can set the user ID and the item ID as a unique pair. This should improve performance by a lot.
Your table would have these 2 columns: item id, and user id. Every row would be a like.
I have a table in MySQL that I'm accessing from PHP. For example, let's have a table named THINGS:
things.ID - int primary key
things.name - varchar
things.owner_ID - int for joining with another table
My select statement to get what I need might look like:
SELECT * FROM things WHERE owner_ID = 99;
Pretty straightforward. Now, I'd like users to be able to specify a completely arbitrary order for the items returned from this query. The list will be displayed, they can then click an "up" or "down" button next to a row and have it moved up or down the list, or possibly a drag-and-drop operation to move it to anywhere else. I'd like this order to be saved in the database (same or other table). The custom order would be unique for the set of rows for each owner_ID.
I've searched for ways to provide this ordering without luck. I've thought of a few ways to implement this, but help me fill in the final option:
1. Add an INT column and set its value to whatever I need to get rows returned in my order. This presents the problem of scanning row-by-row to find the insertion point, and possibly needing to update the sort column of the preceding/following rows.
2. Have a "next" and "previous" column, implementing a linked list. Once I find my place, I'll just have to update at most 2 rows to insert the row. But this requires scanning for the location from row #1.
3. Some SQL/relational DB trick I'm unaware of...
I'm looking for an answer to #3 because it may be out there, who knows. Plus, I'd like to offload as much as I can on the database.
From what I've read, you need a new table containing the ordering of each user, say it's called user_orderings.
This table should contain the user ID, the ID of the thing and the position of the thing. (user_id, thing_id) should be the PK. This way you need to update this table every time the order changes, but you can get the things for a user in the order he/she wants by using ORDER BY on the user_orderings table and joining it with the things table. It should work.
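A sketch of what that could look like, reusing the things table from the question (the user_orderings column names and the user id 99 are assumptions):
CREATE TABLE user_orderings (
  user_id  INT NOT NULL,
  thing_id INT NOT NULL,
  position INT NOT NULL,
  PRIMARY KEY (user_id, thing_id)
);

SELECT t.*
FROM things t
JOIN user_orderings o ON o.thing_id = t.ID
WHERE o.user_id = 99
ORDER BY o.position;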
The simplest expression of an ordered list is: 3,1,2,4. We can store this as a string in the parent table; so if our table is photos with the foreign key profile_id, we'd place our photo order in profiles.photo_order. We can then consider this field in our order by clause by utilizing the find_in_set() function. This requires either two queries or a join. I use two queries but the join is more interesting, so here it is:
select photos.photo_id, photos.caption
from photos
join profiles on profiles.profile_id = photos.profile_id
where photos.profile_id = 1
order by find_in_set(photos.photo_id, profiles.photo_order);
Note that you would probably not want to use find_in_set() in a where clause due to performance implications, but in an order by clause, there are few enough results to make this fast.
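Reordering then just means rewriting that one string after the user drags a photo around; a sketch with made-up values:
update profiles
set photo_order = '3,1,2,4'
where profile_id = 1;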
I'm having some trouble approaching a +1/-1 voting system in PHP, it should vaguely resemble the SO voting system. On average, it will get about ~100 to ~1,000 votes per item, and will be viewed by many.
I don't know whether I should use:
A database table dedicated for voting, which has the userid and their vote... store their vote as a boolean, then calculate the "sum" of the votes in MySQL.
A text field in the "item" table, containing the uids that already voted (in a serialized array), and also a numeric field that contains the total sum of the votes.
A numeric field in the "item" table, that contains the total sum of the votes, then store whether or not the user voted in a text field (with a serialized array of the poll id).
Something completely different (please post some more ideas!)
I'd probably go with option 3 that you've got listed above. By putting the total number of votes as another column in the item table you can get the total number of votes for an item without doing any more sql queries.
If you need to store which user voted on which item I'd probably create another table with the fields of item, user and vote. item would be the itemID, user would be the userID, vote would contain + or - depending on whether it's an up or down vote.
I'm guessing you'll only need to access this table when a user is logged in to show them which items they've voted on.
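If you ever need to recompute a total from that table rather than rely on a cached column, it is a single aggregate; a sketch, with the table name item_votes being hypothetical:
SELECT item, SUM(CASE WHEN vote = '+' THEN 1 ELSE -1 END) AS score
FROM item_votes
GROUP BY item;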
I recommend storing the individual votes in one table.
In another table, store the summary information, like question/poll ID and tally.
Do one insert into the individual votes table.
For the summary table you can do this:
$votedUpOrDown = ($voted == 1) ? 1 : -1; // +1 for an up-vote, -1 for a down-vote
$query = 'UPDATE summary SET tally = tally + '.$votedUpOrDown.' WHERE pollid = '.$pollId;
I'd go with a slight variant of the first option:
A database table dedicated for voting, which has the userid and their vote... store their vote as a boolean, then calculate the "sum" of the votes in MySQL.
Replace the boolean with an integer: +1 for an up-vote and -1 for a down-vote.
Then, instead of computing the sum over and over again, keep a running total somewhere; every time there is an up-vote, add one to the total and subtract one every time there is a down-vote. You could do this with an insert-trigger in the database or you could send an extra UPDATE thing SET vote_total = vote_total + this_vote to the database when adding new votes.
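For the trigger route, a minimal sketch, assuming a votes table whose vote column holds +1/-1 and a thing table with that vote_total column:
CREATE TRIGGER votes_after_insert
AFTER INSERT ON votes
FOR EACH ROW
  UPDATE thing SET vote_total = vote_total + NEW.vote WHERE id = NEW.thing_id;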
You'd probably want a unique constraint on the thing/userid pair in the vote tracking table too.
Keeping track of individual votes makes it easy to keep people from voting twice. Keeping a running total makes displaying the total quick and easy (and presumably this will be the most common operation).
Adding a simple sanity checker that you can run to ensure that the totals match the votes would be a nice addition as well.
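Such a sanity checker can be a single query. A sketch using the same assumed votes table; it lists every thing whose cached total has drifted from the real sum:
SELECT t.id, t.vote_total, COALESCE(SUM(v.vote), 0) AS actual_total
FROM thing t
LEFT JOIN votes v ON v.thing_id = t.id
GROUP BY t.id, t.vote_total
HAVING t.vote_total <> COALESCE(SUM(v.vote), 0);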
serialized array: Please don't do that, such things make it very difficult to root around the database by hand to check and fix things, serialized data structures also make it very difficult (impossible in some cases) to properly constrain your data with foreign keys, check constraints, unique constraints, and what have you. Storing serialized data structures in the database is usually a bad idea unless the database doesn't need to know anything about the data other than how to give it back to you. Packing an array into a text column is a recipe for broken and inconsistent data in your database: broken code is easy to fix, broken data is often forever.
I know I am writing queries wrong, and when we get a lot of traffic our database gets hit HARD and the page slows to a crawl...
I think I need to write queries based on CREATE VIEW covering the last 30 days from CURDATE(), but I'm not sure where to begin or whether that would even be a more efficient query for the database.
Anyway, here is a sample query I have written:
$query_Recordset6 = "SELECT `date`, title, category, url, comments
FROM cute_news
WHERE category LIKE '%45%'
ORDER BY `date` DESC";
Any help or suggestions would be great! I have about 11 queries like this, but I am confident that if I can get help on one of them, I can apply it to the rest!
Putting a wildcard on the left side of a value comparison:
LIKE '%xyz'
...means that an index cannot be used, even if one exists. You might want to consider using Full Text Searching (FTS), which means adding full-text indexing.
Normalizing the data would be another step to consider - categories should likely be in a separate table.
SELECT `date`, title, category, url, comments
FROM cute_news
WHERE category LIKE '%45%'
ORDER BY `date` DESC
The LIKE '%45%' means a full table scan will need to be performed. Are you perhaps storing a list of categories in the column? If so creating a new table storing category and news_article_id will allow an index to be used to retrieve the matching records much more efficiently.
OK, time for psychic debugging.
In my mind's eye, I see that query performance would be improved considerably through database normalization, specifically by splitting the multi-valued category column into a separate table that has two columns: the primary key for cute_news and the category ID.
This would also allow you to directly link said table to the categories table without having to parse it first.
Or, as Chris Date said: "Every row-and-column intersection contains exactly one value from the applicable domain (and nothing else)."
Anything with LIKE '%XXX%' is going to be slow. It's a slow operation.
For something like categories, you might want to separate categories out into another table and use a foreign key in the cute_news table. That way you can have category_id, and use that in the query which will be MUCH faster.
Also, I'm not quite sure why you're talking about using CREATE VIEW. Views will not really help you with speed, not unless it's a materialized view, which MySQL doesn't support natively.
If your database is getting hit hard, the solution isn't to make a view (the view is still basically the same amount of work for the database to do), the solution is to cache the results.
This is especially applicable since, from what it sounds like, your data only needs to be refreshed once every 30 days.
I'd guess that your category column is a list of category values like "12,34,45,78" ?
This is not good relational database design. One reason it's not good is as you've discovered: it's incredibly slow to search for a substring that might appear in the middle of that list.
Some people have suggested using fulltext search instead of the LIKE predicate with wildcards, but in this case it's simpler to create another table so you can list one category value per row, with a reference back to your cute_news table:
CREATE TABLE cute_news_category (
news_id INT NOT NULL,
category INT NOT NULL,
PRIMARY KEY (news_id, category),
FOREIGN KEY (news_id) REFERENCES cute_news(news_id)
) ENGINE=InnoDB;
Then you can query and it'll go a lot faster:
SELECT n.`date`, n.title, c.category, n.url, n.comments
FROM cute_news n
JOIN cute_news_category c ON (n.news_id = c.news_id)
WHERE c.category = 45
ORDER BY n.`date` DESC
Any answer is a guess; show us:
- the relevant SHOW CREATE TABLE outputs
- the EXPLAIN output from your common queries.
And Bill Karwin's comment certainly applies.
After all this and the optimizing, sampling the data into a table with only the last 30 days could still be desirable, in which case you're better off running a daily cronjob to do just that.
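A sketch of that daily job, assuming a summary table cute_news_recent with the same columns as cute_news (the name is hypothetical):
-- run once a day from cron
TRUNCATE TABLE cute_news_recent;
INSERT INTO cute_news_recent
SELECT *
FROM cute_news
WHERE `date` >= CURDATE() - INTERVAL 30 DAY;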