Complex sorting on MySQL database - php

I'm facing the following situation.
We've got a CMS with an entity that has translations. These translations are stored in a separate table with a one-to-many relationship, for example newsarticles and newsarticle_translations. The number of available languages is determined dynamically by the same CMS.
When entering a new news article, the editor is required to enter at least one translation; which of the available languages he chooses is up to him.
In the news article overview in our CMS we would like to show a column with the (translated) article title. Since no particular language is mandatory (one of them is, but I don't know which one), I don't really know how to construct my MySQL query to select a title for each news article, regardless of the language it was entered in.
And to make it all a little harder, our manager asked for the possibility to sort on title as well, so fetching the translations in a separate query is ruled out as far as I know.
Does anyone have an idea how to solve this in the most efficient way?
Here are my table schemas, in case they help:
> desc news;
+-----------------+----------------+------+-----+-------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------------+----------------+------+-----+-------------------+----------------+
| id | int(10) | NO | PRI | NULL | auto_increment |
| category_id | int(1) | YES | | NULL | |
| created | timestamp | NO | | CURRENT_TIMESTAMP | |
| user_id | int(10) | YES | | NULL | |
+-----------------+----------------+------+-----+-------------------+----------------+
> desc news_translations;
+-----------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------------+------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| enabled | tinyint(1) | NO | | 0 | |
| news_id | int(1) unsigned | NO | | NULL | |
| title | varchar(255) | NO | | | |
| summary | text | YES | | NULL | |
| body | text | NO | | NULL | |
| language | varchar(2) | NO | | NULL | |
+-----------------+------------------+------+-----+---------+----------------+
PS: I've thought about subqueries and COALESCE() solutions, but those seem like rather dirty tricks. Is there something better that I'm not thinking of?

This is not a fast approach, but I think it gives you what you want.
Let me know how it works, and we can work on speed next :)
select nt.title
from news n
join news_translations nt on (n.id = nt.news_id)
where nt.title is not null
  and nt.language = (
    select max(x.language)
    from news_translations x
    where x.title is not null
      and x.news_id = nt.news_id)
order by nt.title;

Assuming I've read your problem aright, you want to get a list of titles for articles, preferring the "required" language? A query for that might go along the lines of ...
SELECT * FROM (
SELECT nt.`title`, nt.news_id
FROM news n
INNER JOIN news_translations nt ON (n.id = nt.news_id)
WHERE title != ''
ORDER BY
CASE
WHEN nt.language = 'en' THEN 3
WHEN nt.language = 'jp' THEN 2
WHEN nt.language = 'de' THEN 1
ELSE 0 END DESC
) AS t1
GROUP BY `news_id`
This example prefers a title in English (en) if available, Japanese (jp) as a second preference, and German (de) as a third, but will display the first 'other' entry if none of the requested languages are available.
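As a runnable illustration of the "one title per article, then sort" idea, here is a minimal sketch using Python's built-in sqlite3 module in place of MySQL. The sample rows and the preference order (en, then de, then any other language alphabetically) are assumptions made for the example, and the tables keep only the columns the query needs. Instead of relying on how GROUP BY picks a row, it uses a correlated subquery with ORDER BY ... LIMIT 1 to select exactly one translation per article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE news (id INTEGER PRIMARY KEY);
CREATE TABLE news_translations (
    id INTEGER PRIMARY KEY,
    news_id INTEGER NOT NULL,
    title TEXT NOT NULL DEFAULT '',
    language TEXT NOT NULL
);
-- Sample data (invented for this sketch).
INSERT INTO news (id) VALUES (1), (2), (3);
INSERT INTO news_translations (news_id, title, language) VALUES
    (1, 'Zebra crossing', 'en'),
    (1, 'Zebrastreifen', 'de'),
    (2, 'Apfelernte', 'de'),
    (3, 'Marche', 'fr'),
    (3, '', 'en');
""")

# For each article, the subquery picks the id of the single preferred
# translation with a non-empty title; the outer query sorts by that title.
QUERY = """
SELECT n.id, nt.language, nt.title
FROM news n
JOIN news_translations nt ON nt.news_id = n.id
WHERE nt.id = (
    SELECT x.id
    FROM news_translations x
    WHERE x.news_id = n.id AND x.title <> ''
    ORDER BY CASE x.language WHEN 'en' THEN 0 WHEN 'de' THEN 1 ELSE 2 END,
             x.language
    LIMIT 1
)
ORDER BY nt.title
"""
rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Article 3 falls back to its French title because its English row has an empty title. The same query shape should carry over to MySQL, but it has only been exercised against SQLite here.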

Related

Database structure for multiple locations of an organisation

I'm creating an application using PHP (Codeigniter/MySQL) and within the application are organisations.
Each organisation can have multiple locations, regions, departments, etc. (I'm calling these areas.)
Each area has an administrator, and sometimes I will need to escalate things to a higher area.
I've currently got all the data in one table, and I am using a parent_area_id and area_level to determine the parents, children, etc.
But I think this is very inefficient, and I've been pointed towards closure loops, which I have no knowledge of.
Here is the database table. Is this OK, will it be efficient, or is there a better way to do it?
+----------------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------------+-------------+------+-----+---------+----------------+
| area_id | int(12) | NO | PRI | NULL | auto_increment |
| area_title | varchar(40) | NO | | NULL | |
| area_name | varchar(40) | NO | | NULL | |
| address1 | varchar(40) | YES | | NULL | |
| address2 | varchar(40) | YES | | NULL | |
| address3 | varchar(40) | YES | | NULL | |
| town | varchar(20) | YES | | NULL | |
| county | varchar(20) | YES | | NULL | |
| post_code | varchar(10) | YES | | NULL | |
| has_ra | varchar(1) | YES | | 0 | |
| org_id | int(12) | NO | MUL | NULL | |
| parent_area_id | int(8) | YES | | NULL | |
| area_level | int(1) | YES | | NULL | |
+----------------+-------------+------+-----+---------+----------------+
EDIT:
(better explanation of how this is being used)
1) Areas relate to customers of the business only.
2) The areas are the different areas (region, location, department) that a customer might have (South Region, Oxford Office, Accounts Dept).
3) Each area may have many employees allocated.
SO
If I had a regional administrator, for example, they might have the following areas under them:
South Region
    Oxford Office
        Sales Department
        Accounts Department
    London Office
        Marketing
        Planning
SO
If I wanted to get the user_ids of all employees under the regional administrator using the above database structure, I would need to:
1) Query the db to get all area_ids that have the regional administrator's area as their parent_area_id.
2) Loop through each returned area_id and query the db for all area_ids that have that area_id as their parent_area_id.
3) Continue looping through the returned area_ids until we get to the bottom level.
4) Query the db to get all user_ids that have an area_id in any of the records returned above.
SO
That doesn't seem very efficient: it needs multiple SQL queries and programming loops to get a list of users associated with a regional manager.
If that's the most efficient way to do it then fine, but I'm not convinced, and I'm sure there must be an easier way.
There's no serious problem here if you're dealing with a situation where you're escalating one level at a time. I've got no idea how "closure loops" would factor in here; that's programming related, not a database schema concern, and is largely a matter of personal preference.
So long as you don't violate the Zero, One or Infinity Rule of design, you should be okay. Your multiple address fields skirt the line: they might be better represented as a single field that accepts multiple lines of text, but that is also how a lot of databases traditionally represent arbitrary street addresses.
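For the "all users under a regional administrator" case described in the question, the query-and-loop steps can also be expressed as one recursive common table expression while keeping the parent_area_id column. This is a different technique from anything proposed above, shown only as a sketch: it runs on Python's built-in sqlite3 module, and MySQL supports WITH RECURSIVE from version 8.0 onward. The users table, the sample rows, and the exact nesting of the areas are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE areas (
    area_id INTEGER PRIMARY KEY,
    area_title TEXT NOT NULL,
    parent_area_id INTEGER
);
-- Hypothetical table linking employees to areas.
CREATE TABLE users (user_id INTEGER PRIMARY KEY, area_id INTEGER NOT NULL);
INSERT INTO areas VALUES
    (1, 'South Region', NULL),
    (2, 'Oxford Office', 1),
    (3, 'Sales Department', 2),
    (4, 'Accounts Department', 2),
    (5, 'London Office', 1),
    (6, 'Marketing', 5),
    (7, 'North Region', NULL);
INSERT INTO users VALUES (10, 3), (11, 4), (12, 6), (13, 7), (14, 1);
""")

def user_ids_under(area_id):
    """Return the user_ids in the given area or any area nested below it."""
    sql = """
    WITH RECURSIVE subtree(area_id) AS (
        SELECT area_id FROM areas WHERE area_id = ?
        UNION ALL
        SELECT a.area_id
        FROM areas a
        JOIN subtree s ON a.parent_area_id = s.area_id
    )
    SELECT u.user_id
    FROM users u
    JOIN subtree s ON u.area_id = s.area_id
    ORDER BY u.user_id
    """
    return [row[0] for row in conn.execute(sql, (area_id,))]

print(user_ids_under(1))  # everyone under South Region
print(user_ids_under(5))  # everyone under London Office
```

The database walks the hierarchy itself, so the application issues one query regardless of how many levels there are.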

SELECT id FROM table WHERE id=$_GET['id'] AND user1=$user OR user2=$user

I'm trying to build a Facebook-style messaging system (conversations).
This is the conversation table.
DESCRIBE conversation;
+----------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------+-------------+------+-----+---------+----------------+
| c_id | int(11) | NO | PRI | NULL | auto_increment |
| user_one | int(11) | NO | | NULL | |
| user_two | int(11) | NO | | NULL | |
| ip | varchar(30) | NO | | NULL | |
| time | int(11) | NO | | NULL | |
+----------+-------------+------+-----+---------+----------------+
Now before the user can read a conversation, I need to check if the conversation (c_id) exists, and if the user is the owner of the given conversation id. What is the best possible way to write this query?
Example of what I have, which is not working:
$cid = intval($_GET['cid']);
$conv = $this->db->fetchRow('SELECT c_id FROM `conversation` WHERE
user_one=? OR
user_two=? AND
c_id=?',
array($this->user->id, $this->user->id, $cid));
if ($conv) {
// get the conversation replies etc..
}
I see a couple of problems.
One is that you seem to have overlooked that AND has a higher precedence than OR. So the logic of your condition works as if you had written it this way:
WHERE user_one=? OR (user_two=? AND c_id=?)
Whereas I would guess that you intended the logic to work this way:
WHERE (user_one=? OR user_two=?) AND c_id=?
But if that's how you intended it to work, I wonder why you need to search for the user IDs at all, since the condition on c_id=? will select only one row (or zero rows if there's no match), because it's searching for one specific primary key value.
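The precedence difference is easy to see on a small data set. The following sketch uses Python's built-in sqlite3 module rather than the PHP database wrapper from the question, and the three sample conversations are invented for the example. User 7 asks for conversation 3, which belongs to users 8 and 9.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE conversation (
    c_id INTEGER PRIMARY KEY,
    user_one INTEGER NOT NULL,
    user_two INTEGER NOT NULL
);
-- Sample data (invented): user 7 takes part in conversations 1 and 2 only.
INSERT INTO conversation VALUES (1, 7, 8), (2, 7, 9), (3, 8, 9);
""")

user_id, cid = 7, 3

# AND binds tighter than OR, so this is evaluated as
# user_one = ? OR (user_two = ? AND c_id = ?).
unparenthesized = conn.execute(
    "SELECT c_id FROM conversation "
    "WHERE user_one = ? OR user_two = ? AND c_id = ? ORDER BY c_id",
    (user_id, user_id, cid),
).fetchall()

# Explicit parentheses restrict the match to the requested conversation.
parenthesized = conn.execute(
    "SELECT c_id FROM conversation "
    "WHERE (user_one = ? OR user_two = ?) AND c_id = ?",
    (user_id, user_id, cid),
).fetchall()

print(unparenthesized)  # rows for other conversations where user 7 is user_one
print(parenthesized)    # empty: user 7 is not part of conversation 3
```

Without the parentheses the query returns a row even though the user has no access to the requested conversation, so a simple `if ($conv)` check would pass. With them, the user conditions act as the ownership check the question asks for, on top of the primary key lookup.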

Store data in MySQL or a PHP file?

I am working on a project and I ended up with the table below:
+---------------+--------------+------+-----+--------------------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------+--------------+------+-----+--------------------+-------+
| id | int(11) | NO | MUL | NULL | A_I |
| user_id | int(11) | NO | | NULL | |
| info | varchar(255) | NO | | NULL | |
| country | tinyint(3) | NO | | NULL | |
| date_added | timestamp | NO | | 0000-00-00 00:00:00| |
+---------------+--------------+------+-----+--------------------+-------+
Because I wanted to avoid storing countries as varchar all the time, I thought I should use numeric IDs instead. My question is: would it be better to store the country IDs in a table where I give a name to each of them, or to do that in a PHP file? The countries won't change. It will be a list of around 100 countries.
Thanks!
Use a separate country table.
countries table
---------------
id
name
Then you can reference the country ID in your table. That way you make sure only countries from your list are added, you don't need to store strings everywhere, and you can easily change country names or add new ones.
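A minimal sketch of that layout, using Python's built-in sqlite3 module instead of MySQL. The table name user_info and the sample rows are assumptions, since the question does not name its table. The foreign key is what makes sure only countries from the list can be stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE countries (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
-- Hypothetical name for the table from the question.
CREATE TABLE user_info (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    info TEXT NOT NULL,
    country INTEGER NOT NULL REFERENCES countries(id)
);
INSERT INTO countries (id, name) VALUES (1, 'Greece'), (2, 'Japan');
INSERT INTO user_info (user_id, info, country) VALUES (42, 'hello', 2);
""")

# Resolve the numeric ID back to a name with a join.
names = conn.execute(
    "SELECT u.user_id, c.name FROM user_info u "
    "JOIN countries c ON c.id = u.country"
).fetchall()
print(names)

# A country ID that is not in the list is rejected by the foreign key.
try:
    conn.execute(
        "INSERT INTO user_info (user_id, info, country) VALUES (43, 'x', 99)"
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```

Renaming a country is then a single UPDATE on the countries table, with no change to the rows that reference it.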

Conditional Join Statement in MySQL using IF-ELSE

I'm making a notification scheme for my social networking app. I have different kinds of notifications, which are categorized into two groups: friend-related and event-related. Currently, my database schema looks like this:
+---------------------+------------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------------+------------------------+------+-----+---------+----------------+
| notification_id | int(11) | NO | PRI | NULL | auto_increment |
| notification_type | enum('event','friend') | NO | | NULL | |
| notification_date | datetime | NO | | NULL | |
| notification_viewed | bit(1) | NO | | NULL | |
| user_id | int(11) | NO | MUL | NULL | |
+---------------------+------------------------+------+-----+---------+----------------+
Now, I have two different tables for event-related and friend-related notifications. Below is the schema for the event-related notification table:
+-------------------------+----------------------------------------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------------------+----------------------------------------------------+------+-----+---------+-------+
| notification_id | int(11) | NO | PRI | NULL | |
| event_id | int(11) | NO | MUL | NULL | |
| event_notification_type | enum('added','kicked','new-message','info-edited') | NO | | NULL | |
+-------------------------+----------------------------------------------------+------+-----+---------+-------+
And again I have four more tables, one each for the kicked, added, new-message and info-edited types of notification, since each needs its own kind of property (for example, kicked requires a reason).
Now, I want to write a conditional SQL query that joins notification with event_notification if notification_type is event, and with a different table otherwise.
SELECT * FROM notification_table t INNER JOIN event_notification en ON (t.notification_type='event' AND en.notification_id = t.notification_id) INNER JOIN ..... WHERE t.seen = FALSE AND t.user_id = ?
There are going to be so many inner joins; is there a better way of doing it? I think my query is not very optimized either, so I would appreciate any help.
You can use the joins. However, you want to create the query using left outer joins rather than inner joins:
SELECT *
FROM notification_table t
LEFT JOIN event_notification en
  ON (t.notification_type = 'event' AND en.notification_id = t.notification_id)
LEFT JOIN ...
WHERE t.seen = FALSE AND t.user_id = ?
Don't worry about the proliferation of joins. If your tables have proper indexing, they will perform fine.
Do consider changing the data structure so you have only one table for the different notification types. Having a few fields that are not used does not add much performance overhead, especially when you consider the complications of having so many joins and the additional management overhead of having more tables.
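To show how the left joins behave when each notification matches only one of the detail tables, here is a small sketch on Python's built-in sqlite3 module. The friend_notification table, its friend_id column, and the sample rows are assumptions made for the example. The table is called notification here, and the filter uses the notification_viewed column from the DESCRIBE output rather than t.seen.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE notification (
    notification_id INTEGER PRIMARY KEY,
    notification_type TEXT NOT NULL
        CHECK (notification_type IN ('event', 'friend')),
    notification_viewed INTEGER NOT NULL DEFAULT 0,
    user_id INTEGER NOT NULL
);
CREATE TABLE event_notification (
    notification_id INTEGER PRIMARY KEY,
    event_id INTEGER NOT NULL,
    event_notification_type TEXT NOT NULL
);
-- Hypothetical detail table for friend-related notifications.
CREATE TABLE friend_notification (
    notification_id INTEGER PRIMARY KEY,
    friend_id INTEGER NOT NULL
);
INSERT INTO notification VALUES
    (1, 'event', 0, 5), (2, 'friend', 0, 5), (3, 'event', 1, 5);
INSERT INTO event_notification VALUES (1, 100, 'kicked'), (3, 101, 'added');
INSERT INTO friend_notification VALUES (2, 77);
""")

# Each LEFT JOIN only matches rows of its own type; columns from the
# non-matching detail table come back as NULL (None in Python).
rows = conn.execute("""
SELECT t.notification_id, t.notification_type, en.event_id, fn.friend_id
FROM notification t
LEFT JOIN event_notification en
    ON t.notification_type = 'event'
   AND en.notification_id = t.notification_id
LEFT JOIN friend_notification fn
    ON t.notification_type = 'friend'
   AND fn.notification_id = t.notification_id
WHERE t.notification_viewed = 0 AND t.user_id = ?
ORDER BY t.notification_id
""", (5,)).fetchall()
for row in rows:
    print(row)
```

With inner joins the same query would return no rows at all, because no notification has a match in both detail tables.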

Sphinx Search indexing some fields but not others

I'm using generic Sphinx with Python (though I tested this against PHP as well and got the same problem). I have a table where I have several fields I want to be able to search in sphinx against but it seems like only some of the fields get indexed.
Here's my source (dbconfig just has the connection information):
source bill_src : dbconfig
{
sql_query = \
SELECT id,title,official_title,summary,state,chamber,UNIX_TIMESTAMP(last_action) AS bill_date FROM bill
sql_attr_timestamp = bill_date
sql_query_info = SELECT * FROM bill WHERE id=$id
}
Here's the index
index bills
{
source = bill_src
path = /var/data/bills
docinfo = extern
charset_type = sbcs
}
I'm trying to use extended match mode. It seems that title and summary are fine but the official_title, the state and the chamber fields are ignored in the index. So for example if I do:
@official_title Affordable Care Act
I get:
query error: no field 'official_title' found in schema
but the same query with @summary produces results. Any ideas what I'm missing?
EDIT
Here's the table I'm trying to index:
+--------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| bt50_id | int(11) | YES | MUL | NULL | |
| type | varchar(10) | YES | | NULL | |
| title | varchar(255) | YES | | NULL | |
| official_title | text | YES | | NULL | |
| summary | text | YES | | NULL | |
| congresscritter_id | int(11) | NO | MUL | NULL | |
| last_action | datetime | YES | | NULL | |
| sunlight_id | varchar(45) | YES | | NULL | |
| number | int(11) | YES | | NULL | |
| state | char(2) | YES | | NULL | |
| chamber | varchar(45) | YES | | NULL | |
| session | varchar(45) | YES | | NULL | |
| featured | tinyint(1) | YES | | 0 | |
| source_url | varchar(255) | YES | | | |
+--------------------+--------------+------+-----+---------+----------------+
I seem to have fixed the problem, though I'll admit this was all dumb luck, so it might not be the root cause.
First I thought maybe it didn't like the order of the fields in the query. I had the only attribute field last, so I decided to move it to just after the ID:
SELECT id, UNIX_TIMESTAMP(last_action) AS bill_date, \
title,official_title,summary,state,chamber FROM bill
This did not fix the problem.
Secondly, I noticed all the example date fields are converted using UNIX_TIMESTAMP and then aliased to the same name, so instead of UNIX_TIMESTAMP(last_action) AS bill_date I changed it to UNIX_TIMESTAMP(last_action) AS last_action ... the first attempt tripped me up though because it still wasn't working.
Finally I dropped the date altogether and added each field successfully (re-indexing and testing each time). Each time it worked and finally I added the date field on the end and I was able to sort by it and search all the fields. So the final query is:
SELECT \
id,title,official_title,summary,state,chamber, \
UNIX_TIMESTAMP(last_action) AS last_action FROM bill
It seems that attribute fields must come after the full text fields and aliases must be the same name as the actual field name. I find it strange that the date field seemed fine but other fields suddenly disappeared (randomly!).
I hope this helps someone else though I feel it might be some kind of isolated bug that doesn't affect many people. (This is on OSX and sphinx was compiled by hand)
I'm a little rusty on Sphinx, but I believe your source { } block needs a sql_field_string definition.
source bill_src : dbconfig
{
sql_query = \
SELECT \
id,title,official_title,summary,state,chamber, \
UNIX_TIMESTAMP(last_action) AS bill_date \
FROM bill
sql_attr_timestamp = bill_date
sql_field_string = official_title
sql_query_info = SELECT * FROM bill WHERE id=$id
}
According to http://sphinxsearch.com/docs/1.10/conf-sql-field-string.html the sql_field_string declaration will index and store the string for referencing. That's different from a sql_attr_string, which is stored but not indexed.
