Cross Join to DQL - php

I'm trying to convert what I think is a simple MySQL query into Doctrine DQL, but I'm struggling with it:
SELECT (c.prix-aggregates.AVG) AS test
FROM immobilier_ad_blank c
CROSS JOIN (
SELECT AVG(prix) AS AVG
FROM immobilier_ad_blank)
AS aggregates
Purpose: computing a z-score for each row.
The original implementation comes from the question "Calculating Z-Score for each row in MySQL?" (simple).
I thought about creating an association within the entity, but it's not necessary; it's only for stats.
Edit: By the way, I don't want to use raw SQL; I will extract the "subquery" from another query builder expression using getDQL. Otherwise, I will have to rewrite my dynamic query builder to account for raw SQL.
Edit 2:
Tried this
$subQb = $this->_em->createQueryBuilder();
$subQb->addSelect("AVG(subC.prix) as AMEAN")
->from("MomoaIntegrationBundle:sources\Common", "subC");
$subDql = $subQb->getDQL();
$dql = "SELECT c.prix FROM MomoaIntegrationBundle:sources\Common c INNER JOIN ($subDql) AS aggregates";
Raw dql is:
SELECT c.prix FROM MomoaIntegrationBundle:sources\Common c INNER JOIN (SELECT AVG(subC.prix) as AMEAN FROM MomoaIntegrationBundle:sources\Common subC) AS aggregates
Getting this strange error: line 0, col 70 near '(SELECT AVG(subC.prix)': Error: Class '(' is not defined.
Edit 3:
I found kind of a hackish way to make it work, but Doctrine is stubborn about its implementation of entities and seems to forget that STATISTICS do NOT need ENTITIES!
$subQb = $this->_em->createQueryBuilder();
$subQb->addSelect("AVG(subC.prix) as AMEAN")
->from("MomoaIntegrationBundle:sources\Common", "subC");
$sql = "SELECT (c.prix-aggregates.sclr_0) AS test FROM immobilier_ad_blank c CROSS JOIN "
. "({$subQb->getQuery()->getSQL()}) AS aggregates";
$stm = $this->_em->getConnection()->prepare($sql);
$stm->execute();
$data = $stm->fetchAll();
If you have a better solution, I'm all ears! I actually dislike this solution.

Starting with Doctrine 2.4 it is possible to JOIN without using a defined association, for example:
SELECT u FROM User u JOIN Items i WITH u.age = i.price
This one doesn't make any sense but you get the point. The WITH keyword is absolutely required in this case, otherwise it is a syntax error, but you can just provide a dummy condition, like so:
SELECT u FROM User u JOIN Items i WITH 0 = 0
This essentially results in a cross join. Whether this is a good idea in a given situation is a different question, but I have encountered situations where this was indeed very useful.
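The claim that a join with a dummy condition behaves like a cross join can be checked at the SQL level. The following is a minimal sketch in Python with an in-memory SQLite database (a stand-in chosen only because it runs without a server); the users and items tables and their data are hypothetical, and the sketch shows the SQL semantics rather than Doctrine's generated query.

```python
import sqlite3

# In-memory database with two unrelated (hypothetical) tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, age INTEGER);
    CREATE TABLE items (id INTEGER PRIMARY KEY, price INTEGER);
    INSERT INTO users (age) VALUES (20), (30);
    INSERT INTO items (price) VALUES (5), (6), (7);
""")

# Explicit cross join: every user paired with every item.
cross = conn.execute(
    "SELECT u.id, i.id FROM users u CROSS JOIN items i ORDER BY u.id, i.id"
).fetchall()

# Inner join with an always-true condition, mirroring WITH 0 = 0 in DQL.
dummy = conn.execute(
    "SELECT u.id, i.id FROM users u JOIN items i ON 0 = 0 ORDER BY u.id, i.id"
).fetchall()

print(len(cross), cross == dummy)
```

Both queries return the full Cartesian product (2 users x 3 items), which is why the dummy condition works as a cross join.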

For complex queries you might want to consider bypassing DQL and using a native query - especially since you don't need the result in an entity.
$connection = $em->getConnection();
$statement = $connection->prepare("
select c.prix - t1.avg as test
from immobilier_ad_blank c
cross join (
select avg(prix) as avg
from immobilier_ad_blank
) t1
");
$statement->execute();
$results = $statement->fetchAll();
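For reference, here is a rough sketch of the full z-score computation built on the same cross join idea. It uses Python with an in-memory SQLite database as a stand-in for MySQL; the sample prices are made up, and the population variance is computed as AVG(x*x) - AVG(x)^2 because SQLite has no built-in standard deviation function (on MySQL, STDDEV_POP could be used instead).

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE immobilier_ad_blank (id INTEGER PRIMARY KEY, prix REAL);
    INSERT INTO immobilier_ad_blank (prix) VALUES (100), (200), (300);
""")

# Cross join each row with a one-row aggregate subquery (mean and variance).
rows = conn.execute("""
    SELECT c.id,
           c.prix - aggregates.mean AS diff,
           aggregates.variance
    FROM immobilier_ad_blank c
    CROSS JOIN (
        SELECT AVG(prix) AS mean,
               AVG(prix * prix) - AVG(prix) * AVG(prix) AS variance
        FROM immobilier_ad_blank
    ) AS aggregates
    ORDER BY c.id
""").fetchall()

# z-score = (value - mean) / population standard deviation
z_scores = [diff / math.sqrt(variance) for _id, diff, variance in rows]
print(z_scores)
```

The aggregate subquery is evaluated once and its single row is attached to every row of the table, so no association between entities is needed.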

Related

How to improve query performance (using explain command results f.e.)

I'm currently running this query. However, when run outside phpMyAdmin it causes a 504 timeout error. I suspect it has to do with how many rows the query returns or has to access.
I'm not extremely experienced with MySQL and so this was the best I could do:
SELECT
s.surveyId,
q.cat,
SUM((sac.answer_id*q.weight))/SUM(q.weight) AS score,
user.division_id,
user.unit_id,
user.department_id,
user.team_id,
division.division_name,
unit.unit_name,
dpt.department_name,
team.team_name
FROM survey_answers_cache sac
JOIN surveys s ON s.surveyId = sac.surveyid
JOIN subcluster sc ON s.subcluster_id = sc.subcluster_id
JOIN cluster c ON sc.cluster_id = c.cluster_id
JOIN user ON user.user_id = sac.user_id
JOIN questions q ON q.question_id = sac.question_id
JOIN division ON division.division_id = user.division_id
LEFT JOIN unit ON unit.unit_id = user.unit_id
LEFT JOIN department dpt ON dpt.department_id = user.department_id
LEFT JOIN team ON team.team_id = user.team_id
WHERE c.cluster_id=? AND sc.subcluster_id=? AND s.active=0 AND s.prepare=0
GROUP BY user.team_id, s.surveyId, q.cat
ORDER BY s.surveyId, user.team_id, q.cat ASC
The problem with this query is that when it returns a correct result it runs quickly (let's say about 500 ms), but when the result has twice as many rows, it takes more than 5 minutes and then causes a 504 timeout.
The other problem is that I didn't create this database myself, so I didn't set the indices myself. I'm thinking of improving these and therefore I used the explain command:
I see a lot of primary keys and a couple double indices, but I'm not sure if this would affect the performance this greatly.
EDIT: This piece of code takes up all the execution time:
$start_time = microtime(true);
$stmt = $conn->query($query); //query is simply the query above.
while ($row = $stmt->fetch_assoc()){
$resultSurveys["scores"][] = $row;
}
$stmt->close();
$end_time = microtime(true);
$duration = $end_time - $start_time; //value typically the execution time #reallyHigh...
So my question: Is it possible to (greatly?) improve the performance of the query by altering the database keys or should I divide my query into multiple smaller queries?
You can try something like this (although it's not practical for me to test it):
SELECT
sac.surveyId,
q.cat,
SUM((sac.answer_id*q.weight))/SUM(q.weight) AS score,
user.division_id,
user.unit_id,
user.department_id,
user.team_id,
division.division_name,
unit.unit_name,
dpt.department_name,
team.team_name
FROM survey_answers_cache sac
JOIN
(
SELECT
s.surveyId,
sc.subcluster_id
FROM
surveys s
JOIN subcluster sc ON s.subcluster_id = sc.subcluster_id
JOIN cluster c ON sc.cluster_id = c.cluster_id
WHERE
c.cluster_id=? AND sc.subcluster_id=? AND s.active=0 AND s.prepare=0
) AS v ON v.surveyid = sac.surveyid
JOIN user ON user.user_id = sac.user_id
JOIN questions q ON q.question_id = sac.question_id
JOIN division ON division.division_id = user.division_id
LEFT JOIN unit ON unit.unit_id = user.unit_id
LEFT JOIN department dpt ON dpt.department_id = user.department_id
LEFT JOIN team ON team.team_id = user.team_id
GROUP BY user.team_id, v.surveyId, q.cat
ORDER BY v.surveyId, user.team_id, q.cat ASC
So I hope I didn't mess anything up.
Anyway, the idea is that in the inner query you select only the rows you need based on your WHERE condition. This creates a smaller temporary table, as it only pulls 2 fields, both ints.
Then in the outer query you join to the tables that you actually pull the rest of the data from, and do the ordering and grouping. This way you are sorting and grouping on a smaller dataset, and your WHERE clause can run in the most optimal way.
You may even be able to omit some of these tables, as you're only pulling data from a few of them, but without seeing the full schema and how it's related that's hard to say.
But just generally speaking this part (The sub-query)
SELECT
s.surveyId,
sc.subcluster_id
FROM
surveys s
JOIN subcluster sc ON s.subcluster_id = sc.subcluster_id
JOIN cluster c ON sc.cluster_id = c.cluster_id
WHERE
c.cluster_id=? AND sc.subcluster_id=? AND s.active=0 AND s.prepare=0
is what is directly affected by your WHERE clause. So we can optimize this part and then use it to join the rest of the data you need.
An example of removing tables can be easily deduced from the above, consider this
SELECT
s.surveyId,
sc.subcluster_id
FROM
surveys s
JOIN subcluster sc ON s.subcluster_id = sc.subcluster_id
WHERE
sc.cluster_id=? AND sc.subcluster_id=? AND s.active=0 AND s.prepare=0
The cluster table (c) is never used to pull data from, only for the WHERE clause. So isn't
JOIN cluster c ON sc.cluster_id = c.cluster_id
WHERE
c.cluster_id=?
the same as, or equivalent to,
WHERE
sc.cluster_id=?
If so, we can eliminate that join completely.
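The equivalence argued above can be checked on a toy schema. This is a small sketch using Python and an in-memory SQLite database in place of MySQL; the table layouts are guessed from the query (the real schema was not shown), and the equivalence assumes every subcluster.cluster_id points at an existing cluster row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cluster (cluster_id INTEGER PRIMARY KEY);
    CREATE TABLE subcluster (
        subcluster_id INTEGER PRIMARY KEY,
        cluster_id INTEGER NOT NULL REFERENCES cluster (cluster_id)
    );
    CREATE TABLE surveys (
        surveyId INTEGER PRIMARY KEY,
        subcluster_id INTEGER NOT NULL REFERENCES subcluster (subcluster_id),
        active INTEGER,
        prepare INTEGER
    );
    INSERT INTO cluster VALUES (1), (2);
    INSERT INTO subcluster VALUES (10, 1), (20, 2);
    INSERT INTO surveys VALUES (100, 10, 0, 0), (101, 10, 1, 0), (102, 20, 0, 0);
""")
params = (1, 10)  # cluster_id, subcluster_id

# Original form: the cluster table is joined only to filter on its key.
with_join = conn.execute("""
    SELECT s.surveyId, sc.subcluster_id
    FROM surveys s
    JOIN subcluster sc ON s.subcluster_id = sc.subcluster_id
    JOIN cluster c ON sc.cluster_id = c.cluster_id
    WHERE c.cluster_id = ? AND sc.subcluster_id = ? AND s.active = 0 AND s.prepare = 0
""", params).fetchall()

# Simplified form: filter on the foreign key column directly.
without_join = conn.execute("""
    SELECT s.surveyId, sc.subcluster_id
    FROM surveys s
    JOIN subcluster sc ON s.subcluster_id = sc.subcluster_id
    WHERE sc.cluster_id = ? AND sc.subcluster_id = ? AND s.active = 0 AND s.prepare = 0
""", params).fetchall()

print(with_join, without_join)
```

If some subcluster rows referenced a missing cluster, the joined version would drop them while the simplified version would keep them, so the shortcut is only safe when that foreign key is consistent.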
The EXPLAIN result shows signs of a problem:
Using temporary; Using filesort: the ORDER BY needs to create temporary tables to do the sorting.
On the 3rd row, for the user table, type is ALL and key and ref are NULL: this means the whole table has to be scanned each time to retrieve results.
Suggestions:
Add indexes on user.cluster_id and on all fields involved in the ORDER BY and GROUP BY clauses. Keep in mind that the user table seems to be under the changein database (cross-database query).
Add indexes on the user columns involved in JOINs.
Add an index on s.survey_id.
If possible, keep the same sequence for the GROUP BY and ORDER BY clauses.
According to the accepted answer in this question, move the JOIN on the user table to the first position in the join queue.
Carefully read this official documentation. You may need to optimize the server configuration.
PS: query optimization is an art that requires patience and hard work. No silver bullet for that.
Welcome to the fine art of optimizing MySQL!
I think the problem happens when you add this:
JOIN user ON user.cluster_id = sc.subcluster_id
JOIN survey_answers_cache sac ON (sac.surveyId = s.surveyId AND sac.user_id = user.user_id)
The extra condition sac.user_id = user.user_id can easily be inconsistent.
Can you try doing a second join with the user table?
PS: Can you add the output of SHOW CREATE TABLE?

mySql INNER JOIN Confusion

I have been racking my brains for ages trying to figure this out but just can't seem to get my head round it. There is plenty of information out there but I have been unable to get it right.
I understand that using the following method is not good:
WHERE id IN (SELECT.....
I am trying to optimize the query below as I have many of this type in one script and it is making the page dreadfully slow:
$res = mysqli_query($DB, 'SELECT * FROM nde WHERE id IN (SELECT relto FROM ft_n_rel WHERE rel_fr = '.$DB->real_escape_string($rid).' AND rel_ty = '.$DB->real_escape_string($FA).') LIMIT 1');
Trying to optimize that query with an INNER JOIN is completely baffling me.
Can anyone help?
Regards
Sub-selects inside IN (...) clauses are often very slow in MySQL.
Try:
SELECT n.*
FROM nde n
INNER JOIN ft_n_rel f ON n.id = f.relto
WHERE f.rel_fr = '.$DB->real_escape_string($rid).'
AND f.rel_ty = '.$DB->real_escape_string($FA).'
LIMIT 1
@daremachine has a good point as well: run your select with EXPLAIN or EXPLAIN EXTENDED to check whether indexes exist and are properly used.
Update
To check the query with EXPLAIN, prefix your query as follows:
EXPLAIN
SELECT n.*
FROM nde n
INNER JOIN ft_n_rel f ON n.id = f.relto
WHERE f.rel_fr = '.$DB->real_escape_string($rid).'
AND f.rel_ty = '.$DB->real_escape_string($FA).'
LIMIT 1
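One caveat when swapping IN (subquery) for an INNER JOIN: the join can return the same nde row several times if ft_n_rel holds more than one matching relation, whereas IN never duplicates rows (LIMIT 1 hides this in the query above). The sketch below illustrates it with Python and an in-memory SQLite database standing in for MySQL; the column types and sample rows are invented, and values are bound as parameters instead of being concatenated into the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nde (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE ft_n_rel (relto INTEGER, rel_fr INTEGER, rel_ty INTEGER);
    INSERT INTO nde VALUES (1, 'a'), (2, 'b');
    -- Two relations point at the same nde row.
    INSERT INTO ft_n_rel VALUES (1, 7, 3), (1, 7, 3), (2, 8, 3);
""")
params = (7, 3)  # rel_fr, rel_ty (hypothetical values)

in_rows = conn.execute("""
    SELECT * FROM nde
    WHERE id IN (SELECT relto FROM ft_n_rel WHERE rel_fr = ? AND rel_ty = ?)
""", params).fetchall()

join_rows = conn.execute("""
    SELECT n.* FROM nde n
    INNER JOIN ft_n_rel f ON n.id = f.relto
    WHERE f.rel_fr = ? AND f.rel_ty = ?
""", params).fetchall()

distinct_join_rows = conn.execute("""
    SELECT DISTINCT n.* FROM nde n
    INNER JOIN ft_n_rel f ON n.id = f.relto
    WHERE f.rel_fr = ? AND f.rel_ty = ?
""", params).fetchall()

print(in_rows, join_rows, distinct_join_rows)
```

When more than one row is fetched, adding DISTINCT (or grouping by the primary key) keeps the join version equivalent to the IN version.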

Query to get data from relational databases in MySQL

I have this database format below (taken from phpmyadmin, the tables are relational already):
I'm trying to get all "videos.Video_Name, videos.Video_URL" with a certain "tags.Tag_Name" via the "tagmap" relational mapping. I've never really used MySQL before for anything more than SELECTs and DELETEs, and the syntax of JOIN is proving too much to bear; at this point it'd just be faster to ask for help than to keep bashing my head against it.
I know I should be using JOIN but I have no idea of the syntax to accomplish what I want.
The completely invalid query I tried was:
SELECT videos.Video_URL, videos.Video_Name
FROM tagmap
INNER JOIN videos ON videos.Video_ID = tagmap.Video_ID
INNER JOIN tagmap ON tagmap.Tag_ID = tags.Tag_ID
WHERE tags.Tag_Name = '$_GET[tag]'
But it returned no rows.
If your query returned no rows and did not return an error, then it's not "completely invalid".
Indeed, looking at the code, it should do exactly what you say you are trying to achieve. Hence if it's returning no rows then the reason must be that there is no matching data.
Break it down to find out where the data is missing:
SELECT COUNT(*)
FROM tags
WHERE tags.Tag_Name = '$_GET[tag]';
If you get a non-zero value then try....
SELECT COUNT(DISTINCT tagmap.Video_ID), COUNT(*)
FROM tags INNER JOIN tagmap
ON tags.tag_ID=tagmap.tag_ID
WHERE tags.Tag_Name = '$_GET[tag]';
(BTW you might want to read up on SQL Injection).
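To make the SQL injection remark concrete, here is a hedged sketch of the intended three-table lookup with the tag passed as a bound parameter rather than interpolated from $_GET. It uses Python with an in-memory SQLite database as a stand-in for MySQL; the column types and sample rows are assumptions based on the column names in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE videos (Video_ID INTEGER PRIMARY KEY, Video_Name TEXT, Video_URL TEXT);
    CREATE TABLE tags (Tag_ID INTEGER PRIMARY KEY, Tag_Name TEXT);
    CREATE TABLE tagmap (Video_ID INTEGER, Tag_ID INTEGER);
    INSERT INTO videos VALUES (1, 'Intro', 'http://example.com/1'),
                              (2, 'Outro', 'http://example.com/2');
    INSERT INTO tags VALUES (1, 'funny'), (2, 'sad');
    INSERT INTO tagmap VALUES (1, 1), (2, 2);
""")

def videos_for_tag(tag_name):
    # The tag is bound as a parameter, so quotes in it cannot alter the SQL.
    return conn.execute("""
        SELECT v.Video_Name, v.Video_URL
        FROM tagmap tm
        INNER JOIN videos v ON v.Video_ID = tm.Video_ID
        INNER JOIN tags t ON t.Tag_ID = tm.Tag_ID
        WHERE t.Tag_Name = ?
    """, (tag_name,)).fetchall()

print(videos_for_tag("funny"))
print(videos_for_tag("x' OR '1'='1"))  # treated as a literal tag name
```

In PHP the same idea applies through prepared statements with bound parameters (mysqli or PDO).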
Now try this one:
SELECT videos.Video_Name, videos.Video_URL FROM videos,tags,tagmap
WHERE videos.Video_ID = tagmap.Video_ID AND tags.Tag_ID = tagmap.Tag_ID AND
tags.Tag_Name='$_GET[tag]'
Same result with joins:
SELECT videos.Video_Name, videos.Video_URL FROM tagmap
RIGHT JOIN videos ON videos.Video_ID = tagmap.Video_ID
LEFT JOIN tags ON tags.Tag_ID = tagmap.Tag_ID
WHERE tags.Tag_Name = '$_GET[tag]'
Hope it will not give any error.
Thanks.

Get aggregated column from query with multiple joins to single table

I'm struggling with retrieving data in Propel 1.6.7.
Here's the sample of query I want to achieve:
select count(1) as amount, p1.local_id FROM panel_data pd
JOIN panel_data_has_code pp1 on (pd.panel_data_id = pp1.panel_data_id)
JOIN panel_code p1 on (pp1.panel_code_id = p1.panel_code_id AND p1.type='equipment' AND model_id = 'my_model_id')
JOIN panel_data_has_code pp2 on (pd.panel_data_id = pp2.panel_data_id)
JOIN panel_code p2 on (pp2.panel_code_id = p2.panel_code_id AND p2.type='model' AND p2.local_id='my_local_id')
GROUP BY p1.local_id
However, whenever I try to construct the proper criteria in Propel ORM, I run into an issue: apparently Propel always translates a join alias to the table name whenever I try to add a join condition or use the alias in the select method. I've replaced the multiple join conditions with simple filterBy methods (I always use inner joins, so the effect is the same), but I still have an issue with retrieving the grouped column (p1.local_id).
Here's the code I'm working with at the moment:
PanelDataQuery::create()
->select(array( 'p1.localId'))
->withColumn('count(1)', 'amount')
->usePanelDataHasCodeQuery('pp1')->usePanelCodeQuery('p1')->groupByLocalId()->filterByType($type)->endUse()->endUse()
->usePanelDataHasCodeQuery('pp2')->usePanelCodeQuery('p2')->filterByType('model')->filterByLocalId($model)->endUse()->endUse()
->find();
Above statement returns an error:
Unable to execute SELECT statement [SELECT count(1) AS amount, panel_code.LOCAL_ID AS \"p1.localId\" FROM panel_data INNER JOIN panel_data_has_code pp1 ON (panel_data.PANEL_DATA_ID=pp1.PANEL_DATA_ID) INNER JOIN panel_code p1 ON (pp1.PANEL_CODE_ID=p1.PANEL_CODE_ID) INNER JOIN panel_data_has_code pp2 ON (panel_data.PANEL_DATA_ID=pp2.PANEL_DATA_ID) INNER JOIN panel_code p2 ON (pp2.PANEL_CODE_ID=p2.PANEL_CODE_ID) WHERE p1.TYPE=:p1 AND p2.TYPE=:p2 AND p2.LOCAL_ID=:p3 GROUP BY p1.LOCAL_ID] [wrapped: SQLSTATE[42000]: [Microsoft][SQL Server Native Client 11.0][SQL Server]The multi-part identifier \"panel_code.LOCAL_ID\" could not be bound.]
Obviously, select clause causes the problem. Any ideas how to force propel to use table alias instead of translating it to table name?
Thank you in advance for any help.
Found partial solution:
PanelDataQuery::create()
->withColumn('p1.local_id','localId')
->withColumn('count(1)', 'amount')
->select(array( 'localId','amount'))
[...]
This seems to work fine, although I'd appreciate any other solutions. It would still be great to know how to deal with a similar problem in addJoinCondition.
When I replace:
->filterByType('model')->filterByLocalId($model)
From the original query, with:
->addJoinCondition('p2', 'p2.localId = ?', $model)
->addJoinCondition('p2', "p2.type = 'model'");
I get an issue similar to the one in the error above: Propel translates p2.localId to panel_code.local_id, which isn't correct (the query includes multiple joins of the panel_code table under different aliases).

Strange Doctrine behaviour with double innerJoin()

I have a database schema like this:
My database schema: http://i.stack.imgur.com/vFKRk.png
To explain the context: One user writes one message. He can send it to one or more users.
I succeeded in getting the message title and the author for one user. However, Doctrine, which I use for this project, does it with 2 queries. That seems a little strange to me and I'd like to understand why. Normally, we can do it with one SQL query.
My DQL query:
$q = Doctrine_Query::create()
->select('id_me, users_id_us, state_me, type_me, mc.title_mc, us.login_us')
->from('messages m')
->innerJoin('m.messages_content mc')
->innerJoin('mc.Users us')
->where('users_id_us = ?', $user)
->limit($opt['limit'])
->offset($opt['offset'])
->orderBy($opt['order']);
return $q->fetchArray();
SQL queries returned by Doctrine:
SELECT DISTINCT m3.id_me FROM messages m3 INNER JOIN messages_content m4 ON m3.messages_content_id_mc = m4.id_mc INNER JOIN users u2 ON m4.users_id_us = u2.id_us WHERE m3.users_id_us = '6' ORDER BY m3.id_me DESC LIMIT 2
SELECT m.id_me AS m__id_me, m.users_id_us AS m__users_id_us, m.state_me AS m__state_me, m.type_me AS m__type_me, m2.id_mc AS m2__id_mc, m2.title_mc AS m2__title_mc, u.id_us AS u__id_us, u.login_us AS u__login_us FROM messages m INNER JOIN messages_content m2 ON m.messages_content_id_mc = m2.id_mc INNER JOIN users u ON m2.users_id_us = u.id_us WHERE m.id_me IN ('11') AND (m.users_id_us = '6') ORDER BY m.id_me DESC
Why doesn't my Doctrine query produce a query like this:
SELECT m.id_me, m.users_id_us, m.state_me, m.type_me, mc.title_mc, u.login_us FROM messages m JOIN messages_content mc ON mc.id_mc = m.messages_content_id_mc JOIN users u ON u.id_us = mc.users_id_us WHERE m.users_id_us = 6;
Any idea how to transform my DQL query so that it executes only once?
The ORM Limit issue
This has to do with the LIMIT part. :) Doctrine's LIMIT works a bit differently than MySQL's LIMIT does.
MySQL LIMIT just issues the query and stops searching as soon as n rows matching your SQL query are found. In an ORM, this is really unexpected behaviour: in a scalar layout, the SQL SELECT * FROM myModel LEFT JOIN someOtherModel ON someCondition LIMIT 3 might well return only one myModel instance rather than three, since a left join can produce 3 rows for a single myModel.
What does Doctrine do?
If your DQL query is FROM school s INNER JOIN s.students LIMIT 15, it means: give me 15 instances of school that have at least one student associated (hence INNER JOIN), with ALL of their student associates. To do this, Doctrine first asks for DISTINCT school with the exact same query parameters and a LIMIT part, to figure out which 15 school IDs should be returned. After this is done, these IDs are queried next, without the LIMIT part.
How to solve your issue
If you are not having an actual problem, huzzah: the explanation of this behaviour is above. If your query output is other than you expected, make sure you take these steps into consideration. If for instance your DQL is FROM school s INNER JOIN s.students LIMIT 15, and you are wondering why you get more than 15 students, try: FROM students s INNER JOIN s.school LIMIT 15. In MySQL this means basically the same (disregarding the order of the result), though in Doctrine, this means you will get 15 students instead of 15 schools.
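The two-step approach described above can be imitated in plain SQL. The following sketch uses Python with an in-memory SQLite database (a stand-in for MySQL); the school and students tables and their rows are hypothetical, and the queries only approximate what Doctrine generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE school (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE students (id INTEGER PRIMARY KEY, school_id INTEGER, name TEXT);
    INSERT INTO school VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO students VALUES
        (1, 1, 'a1'), (2, 1, 'a2'), (3, 1, 'a3'),
        (4, 2, 'b1'),
        (5, 3, 'c1'), (6, 3, 'c2');
""")
limit = 2  # "give me 2 schools"

# Naive SQL LIMIT counts joined rows, not schools.
naive = conn.execute("""
    SELECT s.id, st.id FROM school s
    JOIN students st ON st.school_id = s.id
    ORDER BY s.id, st.id LIMIT ?
""", (limit,)).fetchall()
naive_schools = {school_id for school_id, _ in naive}

# Step 1: find which school IDs fall inside the limit.
ids = [row[0] for row in conn.execute("""
    SELECT DISTINCT s.id FROM school s
    JOIN students st ON st.school_id = s.id
    ORDER BY s.id LIMIT ?
""", (limit,))]

# Step 2: fetch those schools with ALL their students, without LIMIT.
placeholders = ", ".join("?" for _ in ids)
hydrated = conn.execute(f"""
    SELECT s.id, st.id FROM school s
    JOIN students st ON st.school_id = s.id
    WHERE s.id IN ({placeholders})
    ORDER BY s.id, st.id
""", ids).fetchall()

print(naive_schools, ids, hydrated)
```

The naive query yields rows from a single school, while the two-step version returns both requested schools with every associated student, which is the behaviour the ORM is aiming for.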
This bothered me too with some of the more complex queries. The only solution that I found was to bypass the silliness altogether:
$q = Doctrine_Manager::getInstance()->getCurrentConnection();
$my_result = $q->fetchAssoc(" ... PUT SQL HERE ... ");
The solution which works:
I changed the relation alias and specified the columns participating in the ON clause of the join between messages_content and users.
The correct Doctrine query is:
$q = Doctrine_Query::create()
->select('id_me, users_id_us, state_me, type_me, mc.title_mc, us.login_us')
->from('messages m')
->innerJoin('m.messages_content mc')
->innerJoin('m.Users us ON mc.users_id_us=us.id_us')
->where('users_id_us = ?', $user)
->limit($opt['limit'])
->offset($opt['offset'])
->orderBy($opt['order']);
It gives a SQL query like this:
SELECT m.id_me AS m__id_me, m.users_id_us AS m__users_id_us, m.state_me AS m__state_me, m.type_me AS m__type_me, m2.id_mc AS m2__id_mc, m2.title_mc AS m2__title_mc, u.id_us AS u__id_us, u.login_us AS u__login_us FROM messages m INNER JOIN messages_content m2 ON m.messages_content_id_mc = m2.id_mc INNER JOIN users u ON (m2.users_id_us = u.id_us) WHERE (m.users_id_us = '7') ORDER BY m.id_me DESC LIMIT 2
Tom and Pelle ten Cate, thanks for your participation.
