OK, I have two tables that I am joining, and I select three fields back (timestamp, message, and articleNum). This data is basically a log of when an article was created and modified, and I have to write the logic to process those three fields. To figure out when an article was created and/or last modified, I need to look in the message for keywords ("added" for created and "updated" for modified). I have the query results in an assoc array, and I would eventually like the end result to be an assoc array with the articleNum as the key and a two-value array (created, modified) as the value. There won't always be a modified value, but there will always be a created one. Any idea how I would even start on a problem like this?
EDIT
From what I can tell, it looks like the date is stored as a bigint in Unix seconds. Clarification: the created and modified values are not fields from the table, I need to figure it out from the message field. There will always be one added time but sometimes there could be 0 or more updated messages and I would need to figure out the latest.
EDIT 2
OK, sorry about the wording of the question. After looking at the problem a little longer, I realized I could do this all in two SQL statements. For finding the added date I used:
"SELECT MIN(action_logs.time_added), article.number
FROM action_logs
JOIN article ON action_logs.article_num = article.number
WHERE action_logs.message LIKE '%added%'
GROUP BY article.number"
Could probably do the same thing for last modified, except with a MAX. Thanks for the suggestions though.
This is a poorly worded question. It smells a bit of this
What are the two tables? What is the message field? Presumably events?
Why don't you just have 'created' and 'updated' in the articles tables?
Are you sure your design is sound?
I'm making some assumptions here:
If your table stores a row for each of your messages (added/updated), with at most one 'updated' row per article, then this query would return one row per article, with the columns you need (articleNum, Created, Updated):
SELECT
    A.articleNum,
    MCreated.Timestamp AS Created,
    MUpdated.Timestamp AS Updated
FROM Articles A
JOIN Messages MCreated
    ON MCreated.articleNum = A.articleNum
    AND MCreated.Message = 'added'
LEFT JOIN Messages MUpdated
    ON MUpdated.articleNum = A.articleNum
    AND MUpdated.Message = 'updated'
Try something like this:
$articles = array();
while ($row = mysql_fetch_assoc($result)) {
    $articles[$row['articleId']] = array(
        'created' => $row['created'],
        'modified' => $row['modified']
    );
}
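Since the created and modified times have to be derived from the message text, the loop can classify each row as it goes. The following is a minimal sketch, assuming each row carries the keys timestamp, message and articleNum from the question and that the timestamp is in Unix seconds; the function name summarizeArticleLog is made up for illustration.

```php
<?php
// Build [articleNum => ['created' => int, 'modified' => int|null]] from log rows.
// Assumption: rows look like ['timestamp' => ..., 'message' => ..., 'articleNum' => ...].
function summarizeArticleLog(array $rows): array
{
    $articles = [];
    foreach ($rows as $row) {
        $num  = $row['articleNum'];
        $time = (int) $row['timestamp'];

        if (!isset($articles[$num])) {
            $articles[$num] = ['created' => null, 'modified' => null];
        }

        if (stripos($row['message'], 'added') !== false) {
            // Keep the earliest "added" time as the creation time.
            if ($articles[$num]['created'] === null || $time < $articles[$num]['created']) {
                $articles[$num]['created'] = $time;
            }
        } elseif (stripos($row['message'], 'updated') !== false) {
            // Keep the latest "updated" time as the modification time.
            if ($articles[$num]['modified'] === null || $time > $articles[$num]['modified']) {
                $articles[$num]['modified'] = $time;
            }
        }
    }
    return $articles;
}
```

Articles that were never updated keep 'modified' => null, which matches the case where only an "added" message exists.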
Not sure I understand the question..
You want to look in message and extract Created Time and Modified Time?
If that's the case, what is the date/time format? It should be very easy to find that information using a regular expression.
Ian
Edit -
If you want to extract them from the message that you get out of the query, then regular expressions are a likely candidate for completing this task.
Post a sample message.
Related
I'm currently running a query that pulls a string from my DB, but it has to run for every new row I import, and a file of 100k rows takes almost 4 hours to import. That's way too long. I'm assuming that the SQL that checks whether the record already exists is what is slowing it down.
I've heard about indexing, but I have no clue what it is or how to use it.
This is the current code I'm using:
$sql2 = $pdo->prepare('SELECT * FROM prospects WHERE :ssn = ssn');
$sql2->execute(array(':ssn' => $ssn));
if($sql2->fetch(PDO::FETCH_NUM) > 0){
So every time the PHP script reads a new row, it does this check. The problem is that I can't put it in "on duplicate key" in the SQL code. It has to check before going to any SQL, because if this is empty, then it should continue doing its thing.
What could I do to make this faster? And if an index is the way to go, could someone show me how it's done, by posting examples or linking a guide or php.net page, and how I would read from that index to do what I'm doing in my code?
So you have 100k records and you don't have any index? Then start by creating one:
CREATE INDEX ssn_index ON prospects (ssn)
Now, each time you select from the prospects table with a WHERE condition on the ssn column, MySQL can use the index to find the matching records. If the column is highly selective (there are many different values), the query will be fast.
You can check your execution plan by querying
EXPLAIN SELECT * FROM prospects WHERE :ssn = ssn
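On the PHP side, the check itself can also be made cheaper by preparing the statement once, outside the import loop, and selecting a single constant instead of the whole row. A rough sketch follows; an in-memory SQLite database stands in for the real MySQL connection so the snippet runs on its own, and the helper name prospectExists is an assumption, not something from the question.

```php
<?php
// Sketch only: SQLite in memory replaces the MySQL connection from the question.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE prospects (id INTEGER PRIMARY KEY, ssn TEXT)');
$pdo->exec('CREATE INDEX ssn_index ON prospects (ssn)');
$pdo->exec("INSERT INTO prospects (ssn) VALUES ('111-11-1111'), ('222-22-2222')");

// Prepare once, before looping over the imported rows.
$check = $pdo->prepare('SELECT 1 FROM prospects WHERE ssn = :ssn LIMIT 1');

// Hypothetical helper: true when a prospect with this ssn already exists.
function prospectExists(PDOStatement $check, string $ssn): bool
{
    $check->execute([':ssn' => $ssn]);
    $found = $check->fetchColumn() !== false; // false means no row came back
    $check->closeCursor();
    return $found;
}
```

Inside the import loop, `if (prospectExists($check, $ssn)) { ... }` then replaces the prepare/execute/fetch sequence that currently runs for every row.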
I have two databases I need to work with, db_site and db_forum (these are generic names, FYI).
db_site has a table called main-news, which has a forumurl field which holds a forum thread id and a views field which holds the current pageviews for the article entry in the database. db_forum has a table called forum_threads which has a tid field and a replies field.
I have two things I need to do, one using just the replies and another using the replies and the views. I assume once the former is figured out the latter won't be much more than adding some extra parts, so I'm concerned with the former for the time being.
Not sure how I should approach this since the two tables are in different databases. The login I'm using has access to both of them (AFAIK), so that isn't the problem, it's more of the syntax involved. Would what I'm looking to do be something like this, perhaps?
SELECT
db_forum.forum_threads.replies AS replies
FROM
`db_forum.forum_threads` AS f,
`db_site.main-news` AS s
WHERE
f.tid = s.forumurl
That's a rough guess, from what I can find online about doing this type of query. Any help is appreciated. :)
First of all, you should indent your SQL code properly. That long line was almost unreadable.
SELECT
db_forum.forum_threads.replies AS replies
FROM
`db_forum.forum_threads` AS f,
`db_site.main-news` AS s
WHERE
f.tid = s.forumurl
Then, make use of your table aliases "f" and "s". You introduced them, so you have to use them:
SELECT
f.replies AS replies
FROM
`db_forum.forum_threads` AS f,
`db_site.main-news` AS s
WHERE
f.tid = s.forumurl
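Finally, fix the quoting. Backticks around the whole dotted name make MySQL treat it as a single identifier, so quote each part separately, and only where it is needed (main-news needs it because of the hyphen):
SELECT
    f.replies AS replies
FROM
    db_forum.forum_threads AS f,
    db_site.`main-news` AS s
WHERE
    f.tid = s.forumurl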
If the names of the fields are indicative of their function, then f.tid refers to an identity column while s.forumurl does not. Normally s.forumurl would be a foreign key in this case. Just a guess.
I'm storing a list of items in a serialized array within a field in my database (I'm using PHP/MySQL).
I want to have a query that will select all the records that contain a specific one of these items that is in the array.
Something like this:
select * from table WHERE (an item in my array) = '$n'
Hopefully that makes sense.
Any ideas would be greatly appreciated.
Thanks
As GWW says in the comments, if you need to query things this way, you really ought to be considering storing this data as something other than a big-ole-string (which is what your serialized array is).
If that's not possible (or you're just lazy), you can use the fact that the serialized array is just a big-ole-string, and figure out a LIKE clause to find matching records. The way PHP serializes data is pretty easy to figure out (hint: those numbers indicate lengths of things).
Now, if your serialized array is fairly complex, this will break down fast. But if it's a flat array, you should be able to do it.
Of course, you'll be using LIKE '%...%', so you'll get no help from any indexes, and performance will be very poor.
Which is why folks are suggesting you store that data in some normalized fashion, if you need to query "inside" it.
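To make the hint about lengths concrete: PHP writes each string as s:LENGTH:"value";, so serializing the search term on its own produces exactly the fragment to look for. This is a small sketch under the assumption of a flat array of strings; likePatternFor is a made-up helper name.

```php
<?php
// A flat list of items, as it might be stored in the column.
$items  = ['apple', 'banana', 'cherry'];
$stored = serialize($items);
// $stored is: a:3:{i:0;s:5:"apple";i:1;s:6:"banana";i:2;s:6:"cherry";}

// Hypothetical helper: build a LIKE pattern that matches one serialized string item.
function likePatternFor(string $item): string
{
    // serialize('banana') gives s:6:"banana"; where 6 is the byte length.
    $fragment = serialize($item);
    // Escape LIKE wildcards (and the escape character itself) inside the fragment.
    $escaped = addcslashes($fragment, '%_\\');
    return '%' . $escaped . '%';
}
```

The result, for example `%s:6:"banana";%`, is meant to be bound as a parameter in `WHERE col LIKE ?`. Integers are serialized differently (i:2; rather than s:1:"2";), so the pattern only matches items stored as strings.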
If you have control of the data model, stuffing serialized data in the database will bite you in the long run just about always. However, oftentimes one does not have control over the data model, for example when working with certain open source content management systems. Drupal sticks a lot of serialized data in dumpster columns in lieu of a proper model. For example, ubercart has a 'data' column for all of its orders. Contributed modules need to attach data to the main order entity, so out of convenience they tack it onto the serialized blob. As a third party to this, I still need a way to get at some of the data stuffed in there to answer some questions.
a:4:{s:7:"cc_data";s:112:"6"CrIPY2IsMS1?blpMkwRj[XwCosb]gl<Dw_L(,Tq[xE)~(!$C"9Wn]bKYlAnS{[Kv[&Cq$xN-Jkr1qq<z](td]ve+{Xi!G0x:.O-"=yy*2KP0#z";s:7:"cc_txns";a:1:{s:10:"references";a:1:{i:0;a:2:{s:4:"card";s:4:"3092";s:7:"created";i:1296325512;}}}s:13:"recurring_fee";b:1;s:12:"old_order_id";s:2:"25";}
See that 'old_order_id'? That's the key I need to find out where this recurring order came from, but since not everybody uses the recurring orders module, there isn't a proper place to store it in the database, so the module developer opted to stuff it in that dumpster column.
My solution is to use a few targeted SUBSTRING_INDEX's to chisel off insignificant data until I've sculpted the resultant string into the data gemstone of my desires.
Then I tack on a HAVING clause to find all that match, like so:
SELECT uo.*,
    SUBSTRING_INDEX(
        SUBSTRING_INDEX(
            SUBSTRING_INDEX( uo.data, 'old_order_id' , -1 ),
            '";}', 1),
        '"',-1)
    AS `old order id`
FROM `uc_orders` AS `uo`
HAVING `old order id` = 25
The innermost SUBSTRING_INDEX gives me everything past the old_order_id, and the outer two clean up the remainder.
This complicated hackery is not something you want in code that runs more than once; it is more of a tool to get the data out of a table without having to resort to writing a PHP script.
Note that this could be simplified to merely
SELECT uo.*,
SUBSTRING_INDEX(
SUBSTRING_INDEX( uo.data, '";}' , 1 ),
'"',-1)
AS `old order id`
FROM `uc_orders` AS `uo`
HAVING `old order id` = 25
but that would only work in this specific case (the value I want is at the end of the data blob).
So you mean to use MySQL to search in a PHP array that has been serialized with the serialize command and stored in a database field? My first reaction would be: OMG. My second reaction would be: why? The sensible thing to do is either:
Retrieve the array into PHP, unserialize it and search in it
Forget about storing the data in MySQL as serialized and store it as a regular table and index it for fast search
I would choose the second option, but I don't know your context.
Of course, if you'd really want to, you could try something with SUBSTRING or another MySQL function and try to manipulate the field, but I don't see why you'd want to. It's cumbersome, and it would be an unnecessary ugly hack. On the other hand, it's a puzzle, and people here tend to like puzzles, so if you really want to then post the contents of your field and we can give it a shot.
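The first option (fetch, unserialize, search in PHP) might look roughly like this. It is a sketch only: the rows are assumed to be already fetched as associative arrays, the items are assumed to be strings, and rowsContaining is a hypothetical helper name.

```php
<?php
// Return the rows whose serialized array column contains $needle.
function rowsContaining(array $rows, string $column, string $needle): array
{
    $matches = [];
    foreach ($rows as $row) {
        // allowed_classes => false avoids instantiating objects from stored data.
        $items = unserialize($row[$column], ['allowed_classes' => false]);
        if (is_array($items) && in_array($needle, $items, true)) {
            $matches[] = $row;
        }
    }
    return $matches;
}
```

This reads every row into PHP, so it is only reasonable for small tables; for anything larger, the second option (a proper indexed table) is the one that scales.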
You can do it like this:
SELECT * FROM table_name WHERE some_field REGEXP '.*"item_key";s:[0-9]+:"item_value".*'
But anyway you should consider storing that data in a separate table.
How about you serialize the value you're searching for?
$sql = sprintf("select * from tbl WHERE serialized_col like '%%%s%%'", serialize($n));
or
$sql = sprintf("select * from tbl WHERE serialized_col like '%s%s%s'", '%', serialize($n), '%');
Working with PHP serialized data is obviously quite ugly, but I've got this one-liner mix of MySQL functions that helps to sort that out:
select REPLACE(SUBSTRING_INDEX(SUBSTRING_INDEX(SUBSTRING_INDEX(searchColumn, 'fieldNameToExtract', -1), ';', 2), ':', -1), '"', '') AS extractedFieldName
from tableName as t
having extractedFieldName = 'expressionFilter';
Hope this can help!
Well, i had the same issue, and apparently it's a piece of cake, but maybe it needs more tests.
Simply use the IN statement, but put the field itself as array!
Example:
SELECT id, title, page FROM pages WHERE 2 IN (child_of)
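~ where '2' is the value I'm looking for inside the field 'child_of', which is a serialized array. Be careful, though: 2 IN (child_of) is the same as 2 = child_of, so MySQL casts the whole field to a number and only matches rows whose stored text starts with 2.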
This serialized array was necessary because I cannot duplicate the records just for storing what id they were children of.
Cheers
If I have an attribute_dump field in the log table, and the value in one of its rows is
a:69:{s:9:"status_id";s:1:"2";s:2:"id";s:5:"10215"}
and I want to fetch all rows where status_id equals 2, then the query would be
SELECT * FROM log WHERE attribute_dump REGEXP '.*"status_id";s:[0-9]+:"2".*'
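The pattern can be tried out in PHP before it is run against the table. In this sketch PHP's preg functions stand in for MySQL's REGEXP (the syntax agrees for a pattern this simple), the value is assumed to be stored as a string, and serializedPairPattern is a hypothetical helper name.

```php
<?php
// Build a PCRE pattern equivalent to: "KEY";s:[0-9]+:"VALUE"
function serializedPairPattern(string $key, string $value): string
{
    return '/'
        . preg_quote('"' . $key . '";s:', '/')
        . '[0-9]+:'
        . preg_quote('"' . $value . '"', '/')
        . '/';
}

// Shortened sample value, modeled on the row shown above.
$dump = 'a:2:{s:9:"status_id";s:1:"2";s:2:"id";s:5:"10215";}';

$matchesStatus2 = preg_match(serializedPairPattern('status_id', '2'), $dump) === 1;
$matchesStatus3 = preg_match(serializedPairPattern('status_id', '3'), $dump) === 1;
```

Because the key is matched together with its surrounding quotes, a search for the key id does not accidentally match inside status_id.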
There is a good REGEX answer above, but it assumes a key and value implementation. If you just have values in your serialized array, this worked for me:
value only
SELECT * FROM table WHERE your_field_here REGEXP '.*;s:[0-9]+:"your_value_here".*'
key and value
SELECT * FROM table WHERE your_field_here REGEXP '.*"array_key_here";s:[0-9]+:"your_value_here".*'
For an easy method, use:
column_field_name LIKE '%VALUE_TO_BE_SEARCHED_FOR%'
in the MySQL query.
You may be looking for an SQL IN statement.
http://www.w3schools.com/sql/sql_in.asp
You'll have to break your array out a bit first, though. You can't just hand an array off to MySQL and expect it will know what to do with it. For that, you may try joining it into a comma-separated string with PHP's implode.
http://php.net/manual/en/function.implode.php
Select * from table where table_field like '%"enter_your_value"%'
select * from postmeta where meta_key = 'your_key' and meta_value REGEXP ('6')
foreach( $result as $value ) {
    $hour = unserialize( $value->meta_value );
    if( $hour['date'] < $data['from'] ) {
        $sum = $sum + $hour['hours'];
    }
}
I've got a somewhat complicated question for you cakephp experts.
Basically, I have created a db table called "locations". Every month I will get this table sent to me in csv format from a client. Unfortunately, instead of updating this table, I will have to empty it and reimport all of the records. Unfortunately, I cannot alter this table at all.
Functionality wise, users will have the ability to look at a display of these records, and be able to choose to hide certain ones. This "hidden" attribute must be persistent and survive the month to month purging of all records.
I had all of this working yesterday. What I did was, create a separate table called location_properties (columns were: id(int), location_id(foreign key), is_hidden(boolean)). When showing these records, it would simply check to see if "is_hidden==true".
This was all well and good(AND WORKING!), but then my boss kind of gummed up the works. He told me to delete the "is_hidden" column from the table because it would be more efficient. That I should be able to simply check for the existence of the location_id to hide or show it.
It doesn't appear to be quite that simple. Anyone know how I can pull this off? I've tried everything I can think of.
Your boss is wrong.
It's more efficient to add your column than it is to delete and re-import the locations every month.
Did he say it was less efficient, or did you do an actual benchmark to see if it harms performance too much?
At first glance I see 2 solutions:
1) add a condition array('Location.id' => 'NOT NULL')
2) change join type to right join
I hope this helps
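In plain SQL, "hidden means a row exists in location_properties" comes down to a LEFT JOIN with an IS NULL test. The sketch below is not CakePHP code; it uses PDO with an in-memory SQLite database and made-up sample rows just to show the query shape, keeping the table and column names from the question.

```php
<?php
// Sketch only: in-memory SQLite with sample data.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE locations (id INTEGER PRIMARY KEY, name TEXT)');
$pdo->exec('CREATE TABLE location_properties (id INTEGER PRIMARY KEY, location_id INTEGER)');
$pdo->exec("INSERT INTO locations (id, name) VALUES (1, 'North'), (2, 'South'), (3, 'East')");
// A row here means "this location is hidden".
$pdo->exec('INSERT INTO location_properties (location_id) VALUES (2)');

// Visible locations are the ones with no matching location_properties row.
$sql = 'SELECT l.id, l.name
        FROM locations l
        LEFT JOIN location_properties lp ON lp.location_id = l.id
        WHERE lp.location_id IS NULL
        ORDER BY l.id';
$visible = $pdo->query($sql)->fetchAll(PDO::FETCH_COLUMN, 1); // names only
```

The hidden flag survives the monthly re-import only if the location ids stay the same between imports, since location_properties refers to them by id.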
I am just learning php as I go along, and I'm completely lost here. I've never really used join before, and I think I need to here, but I don't know. I'm not expecting anyone to do it for me but if you could just point me in the right direction it would be amazing, I've tried reading up on joins but there are like 20 different methods and I'm just lost.
Basically, I hand coded a forum, and it works fine but is not efficient.
I have board_posts (for posts) and board_forums (for forums, the categories as well as the sections).
The part I'm redoing is how I get the information about the last post for the index page. To avoid using joins, I store the info for the latest post in board_forums: for a section called "Off Topic", say, I have fields for "forum_lastpost_username/userid/posttitle/posttime", which I update when a user posts, etc. But this is bad, so I'm trying to grab it all dynamically and get rid of those fields.
Right now my query is just like:
SELECT * FROM board_forums WHERE forum_parent='$forum_id'
And then I have the stuff where I grab the info for that forum (name, description, etc) and all the data for the last post is there:
$last_thread_title = $forumrow["forum_lastpost_title"];
$last_thread_time = $forumrow["forum_lastpost_time"];
$lastpost_username = $forumrow["forum_lastpost_username"];
$lastpost_threadid = $forumrow["forum_lastpost_threadid"];
But I need to get rid of that and get it from board_posts. The way it's set up in board_posts is that if a post is a thread, post_parentpost is NULL; if it's a reply, that field has the id of the thread (the first post of the topic). So I need to grab the latest post_date, see which user posted it, THEN see if post_parentpost is NULL. If it's NULL, the last post is a new thread, so I can get the title and user right there. If it's not, I need to get the info (title, id) of the first post in that thread, which can be found by looking up the ID in post_parentpost and getting the title from it.
Does that make any sense? If so please help me out :(
Any help is greatly appreciated!!!!
Updating board_forums whenever a post or a reply is inserted is, regarding performance, not the worst idea. For displaying the index page you only have to select data from one table, board_forums. This is definitely much faster than selecting from a second table to get the last post's information, even when using a clever join.
You are better off just updating the stats on each action, New Post, Delete Post etc.
The other instances would not likely require any stats update (deletion of a thread would trigger a forum update, to show one less topic in the topic count).
Think about all the actions a user would perform: in most cases you don't need to update any stats, so getting the counts on the fly is very inefficient, and you are right to think so.
It looks like you've already done the right thing.
If you were to join, you'd do it like this:
SELECT * FROM board_forums
JOIN board_posts ON board_posts.forum_id = board_forums.id
WHERE forum_parent = '$forum_id'
The problem with that, is that it gets you every post, which is not useful (and very slow). What you would want to do is something like this
SELECT * FROM board_forums
JOIN board_posts ON board_posts.forum_id = board_forums.id ORDER BY board_posts.id desc LIMIT 1
WHERE forum_parent = '$forum_id'
except SQL doesn't work like that. You can't put ORDER BY or LIMIT inside a join like this, so you would need a subquery that picks the latest post per forum, or you have to fetch every row and then scan them in code (which sucks).
In short, don't worry. Use joins for the actual case where you do want to load all forums and all posts in one hit.
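For completeness, one common way to express "latest post per forum" in a single query is a correlated subquery on the join condition. The sketch below runs against an in-memory SQLite database with sample rows; the post_title column is an assumed name, forum_id and id follow the join above, and "latest" is taken to mean the highest post id rather than post_date.

```php
<?php
// Sketch only: in-memory SQLite with sample data.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE board_forums (id INTEGER PRIMARY KEY, forum_name TEXT)');
$pdo->exec('CREATE TABLE board_posts (
    id INTEGER PRIMARY KEY, forum_id INTEGER, post_parentpost INTEGER, post_title TEXT)');
$pdo->exec("INSERT INTO board_forums (id, forum_name) VALUES (1, 'Off Topic'), (2, 'News')");
$pdo->exec("INSERT INTO board_posts (id, forum_id, post_parentpost, post_title) VALUES
    (1, 1, NULL, 'Hello'), (2, 1, 1, 'Re: Hello'), (3, 2, NULL, 'Release notes')");

// p is the newest post in each forum; t is its thread starter when p is a reply.
$sql = 'SELECT f.id AS forum_id,
               p.id AS post_id,
               COALESCE(t.post_title, p.post_title) AS thread_title
        FROM board_forums f
        JOIN board_posts p
          ON p.id = (SELECT MAX(p2.id) FROM board_posts p2 WHERE p2.forum_id = f.id)
        LEFT JOIN board_posts t ON t.id = p.post_parentpost
        ORDER BY f.id';
$lastPosts = $pdo->query($sql)->fetchAll(PDO::FETCH_ASSOC);
```

Forums without any posts drop out of this result because of the inner join; switching it to a LEFT JOIN keeps them, with NULLs in the post columns.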
The simple solution will result in numerous queries, some optional, as you've already discovered.
The classic approach to this is to cache the results, and only retrieve it once in a while. The cache doesn't have to live long; even two or three seconds on a busy site will make a significant difference.
De-normalizing the data into a table you're already reading anyway will help. This approach saves you figuring out optional queries and can be a bit of a cheap win because it's just one more update when an insert is already happening. But it shifts some data integrity to the application.
As an aside, you might be running into the recursive-query problem with your threads. Relational databases do not store hierarchical data all that well if you use a "simple" algorithm. A better way is something sometimes called 'set trees'. It's a bit hard to Google, unfortunately, so here are some links.