PHP/MySQL - Analyzing common sets across multiple sets

Let's say I have two tables, people and families.
families has two fields - id and name. The name field contains the family surname.
people has three fields - id, family_id and name - The family_id is the id of the family that that person belongs to. The name field is that person's first name.
It's basically a one to many relationship with one family having many people.
I want to get a list of name sets, ordered by the highest occurrence of the largest set of names across families.
That probably doesn't make much sense...
To explain what I want further, we can score each set of names. The 'score' is the array size * number of occurrences across families.
For example, let's say two names, 'John' and 'Jane' both existed in three families - That set's 'score' would be 2*3 = 6.
How could I get an array of sets of names, and the set's 'score', ordered by each set's score?
Sample Result Set (I've put it in a table layout, but this could be a multi-dimensional array in PHP) - Note this is just randomly thought up and doesn't reflect any statistical name data.
names | occurrences | score
Ben, Lucy | 4 | 8
Jane, John | 3 | 6
James, Rosie, Jack | 2 | 6
Charlie, Jane | 2 | 4
Just to clarify, I'm not interested in sets where:
The number of occurrences is 1 (obviously, just one family).
The set size is 1 (just a common name).
I hope I have explained my somewhat complex problem - if anyone needs clarification please say.

OK, got it:
<?php
require_once('query.lib.php');
$db=new database(DB_TYPE,DB_HOST,DB_USER,DB_PASS,DB_MISC);
$qry=new query('set names utf8',$db);
//Base query, this filters out names that are in just one family (num>1) and uses the family_id column from the question
$sql='select name, cast(group_concat(distinct family_id order by family_id) as char) as famlist, count(distinct family_id) as num from people group by name having num>1 order by num desc';
$qry=new query($sql,$db);
//$qry->result is something like
/*
Array
(
[name] => Array
(
[0] => cathy
[1] => george
[2] => jack
[3] => john
[4] => jane
[5] => winston
[6] => peter
)
[famlist] => Array
(
[0] => 2,4,5,6,8
[1] => 2,3,4,5,8
[2] => 1,3,5,7,8
[3] => 1,2,3,6,7
[4] => 2,4,7,8
[5] => 1,2,6,8
[6] => 1,3,6
)
[num] => Array
(
[0] => 5
[1] => 5
[2] => 5
[3] => 5
[4] => 4
[5] => 4
[6] => 3
)
)
$qry->rows=7
*/
//Initialize
$names=$qry->result['name'];
$rows=$qry->rows;
$lists=array();
for ($i=0;$i<$rows;$i++) $lists[$i]=explode(',',$qry->result['famlist'][$i]);
//Walk the list and populate pairs - this filters out pairs, that are specific to only one family
$tuples=array();
for ($i=0;$i<$rows;$i++) {
for ($j=$i+1;$j<$rows;$j++) {
$isec=array_intersect($lists[$i],$lists[$j]);
if (sizeof($isec)>1) {
//Every tuple consists of the name-list, the family list, the length and the latest used name
$tuples[]=array($names[$i].'/'.$names[$j],$isec,2,$j);
}
}
}
//Now walk the tuples again rolling forward, until there is nothing left to do
//We do not use a for loop just for style
$i=0;
while ($i<sizeof($tuples)) {
$tuple=$tuples[$i];
//Try to combine this tuple with all later names
for ($j=$tuple[3]+1;$j<$rows;$j++) {
$isec=array_intersect($tuple[1],$lists[$j]);
//As with the pairs, keep only sets that are shared by more than one family
if (sizeof($isec)>1) $tuples[]=array($tuple[0].'/'.$names[$j],$isec,$tuple[2]+1,$j);
}
$i++;
}
//We have all the tuples, now we just need to extract the info and prepare to sort - some dirty trick here!
$final=array();
while (sizeof($tuples)>0) {
$tuple=array_pop($tuples);
//name list is in $tuple[0]
$list=$tuple[0];
//count is sizeof($tuple[1])
$count=sizeof($tuple[1]);
//length is in $tuple[2]
$final[]=$tuple[2]*$count."\t$count\t$list";
}
//Sorting and output is all that is left
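//Natural order compares the leading score numerically, so a score of 10 sorts above 8
rsort($final, SORT_NATURAL);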
print_r($final);
?>
Sorry, I just realized that I use a query library I can't include here, but from the comment you can easily build the arrays used in the "Initialize" section.
Basically, starting with the pairs, I keep for each name list an array of the families that contain every name in it, then intersect that array with the family lists of all names not yet tried.
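The same idea can be tried without the query library. The following is a minimal sketch, assuming PHP 7.4 or later and assuming the data has already been loaded into an array that maps each family id to its list of first names; the function name commonNameSets and the sample data are made up for illustration. Like the code above, it enumerates every shared subset rather than only the largest ones, so it can grow exponentially when many names are shared.

```php
<?php
// Hypothetical helper: $families maps family id => list of first names.
function commonNameSets(array $families): array
{
    // Invert to name => list of family ids containing that name.
    $byName = [];
    foreach ($families as $familyId => $names) {
        foreach (array_unique($names) as $name) {
            $byName[$name][] = $familyId;
        }
    }
    // A name found in just one family can never be part of a shared set.
    $byName = array_filter($byName, fn($ids) => count($ids) > 1);
    ksort($byName);
    $names = array_keys($byName);
    $lists = array_values($byName);
    $n = count($names);

    // Queue entries: [names in the set, families containing all of them, index of last name].
    $queue = [];
    for ($i = 0; $i < $n; $i++) {
        $queue[] = [[$names[$i]], $lists[$i], $i];
    }

    $result = [];
    for ($q = 0; $q < count($queue); $q++) {
        [$set, $fams, $last] = $queue[$q];
        // Only try names after the last one used, so each set is built once.
        for ($j = $last + 1; $j < $n; $j++) {
            $isec = array_values(array_intersect($fams, $lists[$j]));
            if (count($isec) > 1) {
                $grown = array_merge($set, [$names[$j]]);
                $queue[] = [$grown, $isec, $j];
                $result[] = [
                    'names' => $grown,
                    'occurrences' => count($isec),
                    'score' => count($grown) * count($isec),
                ];
            }
        }
    }
    // Highest score first; the comparison is numeric.
    usort($result, fn($a, $b) => $b['score'] <=> $a['score']);
    return $result;
}

// Made-up sample data, not taken from the question.
$sets = commonNameSets([
    1 => ['John', 'Jane', 'Ben'],
    2 => ['John', 'Jane', 'Lucy'],
    3 => ['John', 'Jane', 'Ben'],
]);
print_r($sets);
```

Each entry of the returned array carries the names, occurrences and score columns from the sample result set in the question.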

Will this work?
SELECT
f.name AS 'surname',
GROUP_CONCAT(DISTINCT p.name ORDER BY p.name) AS 'names',
COUNT(DISTINCT p.name) AS 'distinct_names',
COUNT(p.id) AS 'occurrences',
COUNT(DISTINCT p.name) * COUNT(p.id) AS 'score'
FROM
families f
LEFT JOIN people p ON ( f.id = p.family_id )
GROUP BY
f.id
ORDER BY
f.name

Related

Pushing pointers to followers with the metadata (MySQL Query)

I’ve seen the following question on StackOverflow, Intelligent MySQL GROUP BY for Activity Streams posted by Christian Owens 12/12/12.
So I decided to try out the same approach, make two tables similar to those of his. And then I pretty much copied his query which I do understand.
This is what I get out from my sandbox:
Array
(
[0] => Array
(
[id] => 0
[user_id] => 1
[action] => published_post
[object_id] => 776286559146635
[object_type] => post
[stream_date] => 2015-11-24 12:28:09
[rows_in_group] => 1
[in_collection] => 0
)
)
Looking at the results in Owens' question, there is something I don't fully get: does he perform additional queries to grab the actual metadata? If so, can it all be done from that single query, or does one need to run separate optimized sub-queries and then loop through the arrays of data to render the stream itself?
Thanks a lot in advance.
This is roughly the result I would like to get:
Array
(
[0] => Array
(
[id] => 0
[user_id] => 1
[fullname] => David Anderson
[action] => hearted
[object_id] => array (
[id] => 3438983
[title] => Grand Theft Auto
[Category] => Games
)
[object_type] => product
[stream_date] => 2015-11-24 12:28:09
[rows_in_group] => 1
[in_collection] => 1
)
)
In "pseudo" code you need something like this
$result = $pdo->query("
SELECT stream.*,
object.id AS o_id,
object.title AS o_title,
COUNT(stream.id) AS rows_in_group,
GROUP_CONCAT(stream.id) AS in_collection
FROM stream
INNER JOIN follows ON stream.user_id = follows.following_user
LEFT JOIN object ON stream.object_id = object.id
WHERE follows.user_id = '0'
GROUP BY stream.user_id,
stream.verb,
stream.object_id,
stream.type,
date(stream.stream_date)
ORDER BY stream.stream_date DESC
");
then parse the result and convert it in PHP
$data = array(); // this will store the end result
while($row = $result->fetch(PDO::FETCH_ASSOC)) {
// PDO keys each column by its name or alias only (never "object.id"),
// so the object columns are aliased above to keep them apart from stream.id
// first copy the selected `object` data into a sub array
$row['object_data']['id'] = $row['o_id'];
$row['object_data']['title'] = $row['o_title'];
// remove the flat selected keys
unset($row['o_id']);
unset($row['o_title']);
// ... (same for any other object columns)
$data[] = $row; // move to the desired array
}
you should get
Array
(
[0] => Array
(
[id] => 0
[user_id] => 1
[fullname] => David Anderson
[verb] => hearted
[object_data] => array (
[id] => 3438983
[title] => Grand Theft Auto
[Category] => Games
)
[type] => product
[stream_date] => 2015-11-24 12:28:09
[rows_in_group] => 1
[in_collection] => 1
)
)
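The copy-and-unset step can also be written once for any number of columns. This is only a sketch, assuming PHP 7 or later and assuming the query aliases every column of the object table with a common prefix such as o_ (o_id, o_title, o_category); the prefix, the helper name nestPrefixed and the sample row are hypothetical.

```php
<?php
// Hypothetical helper: moves every key that starts with $prefix into $row[$target].
function nestPrefixed(array $row, string $prefix, string $target): array
{
    $nested = [];
    foreach ($row as $key => $value) {
        if (strpos($key, $prefix) === 0) {
            // Strip the prefix for the nested key, then drop the flat key.
            $nested[substr($key, strlen($prefix))] = $value;
            unset($row[$key]);
        }
    }
    $row[$target] = $nested;
    return $row;
}

// Made-up row shaped like one fetched with PDO::FETCH_ASSOC.
$row = [
    'id' => 0,
    'user_id' => 1,
    'verb' => 'hearted',
    'o_id' => 3438983,
    'o_title' => 'Grand Theft Auto',
    'o_category' => 'Games',
];
print_r(nestPrefixed($row, 'o_', 'object_data'));
```

A prefix that no stream column shares is the safer choice; a prefix like object_ would also swallow stream.object_id.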
It seems that you want a query that returns the data you already get, plus the user's full name and the data related to the object_id.
I think the best approach is to add some subqueries to your query to extract this data:
Fullname: something like (SELECT fullname FROM users WHERE id = stream.user_id) AS fullname, or some modified version using stream.user_id, as we can't tell from your schema where this fullname comes from;
Object Data: something like (SELECT CONCAT_WS(';', id, title, category_name) FROM objects WHERE id = stream.object_id) AS object_data. As with the fullname, we can't tell from your schema where this object data comes from, but I'm assuming it's an objects table.
If an object has just one title and just one category, the Object Data subquery works well. I don't think an object can have more than one title, but it may have more than one category. In that case, you should GROUP_CONCAT the category names and take one of two paths:
Replace category_name in the CONCAT_WS with the GROUP_CONCAT of all category names;
Select a new column, categories (just a name suggestion), with a subquery that GROUP_CONCATs all category names.
If your tables are like the first two points of my answer, a query like this may select the data, needing only a proper parse (split) in PHP:
SELECT
MAX(stream.id) as id,
stream.user_id,
(select fullname from users where id = stream.user_id) as fullname,
stream.verb,
stream.object_id,
(select concat_ws(';', id, title, category_name) from objects where id = stream.object_id) as object_data,
stream.type,
date(stream.stream_date) as stream_date,
COUNT(stream.id) AS rows_in_group,
GROUP_CONCAT(stream.id) AS in_collection
FROM stream
INNER JOIN follows ON 1=1
AND stream.user_id = follows.following_user
WHERE 1=1
AND follows.user_id = '0'
GROUP BY
stream.user_id,
stream.verb,
stream.object_id,
stream.type,
date(stream.stream_date)
ORDER BY stream.stream_date DESC;
In ANSI SQL you can't reference columns not listed in your GROUP BY, unless they're in aggregate functions. So, I included the id as an aggregation.
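The "proper parse (split)" of the object_data column might look like the sketch below, assuming PHP 7.1 or later. The helper name parseObjectData is hypothetical, and the three field names follow the CONCAT_WS call in the query. Note that CONCAT_WS skips NULL arguments and that a title containing ';' would shift the fields, so this is only dependable when all three columns are non-NULL and free of the separator.

```php
<?php
// Hypothetical helper: splits "id;title;category_name" into a keyed array.
function parseObjectData(?string $packed): ?array
{
    if ($packed === null) {
        return null; // the subquery found no matching object
    }
    // Pad with nulls in case fewer than three fields came back.
    $parts = array_pad(explode(';', $packed, 3), 3, null);
    return [
        'id' => (int) $parts[0],
        'title' => $parts[1],
        'category' => $parts[2],
    ];
}

print_r(parseObjectData('3438983;Grand Theft Auto;Games'));
```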

mysql get result of most frequent value of every post

I found some similar questions with good answers, but I couldn't figure out how to apply them to my specific case. I have a site where users can rate their favorite post from 1-6. Every number is a different category.
Now I need to know the most frequent vote for every single post. So I need to count the votes for every post id and then find the most frequent value for each post id.
After that I want to update every result in another table (I don't know how to do this yet; I'm not that good with MySQL).
These are the two columns where I need to know how often every post occurs in post_id and what the most frequent voting number of every single post is.
Just an example of my table (value = voting):
value | post_id
---------------
3 | 12
1 | 6
4 | 13
2 | 5
6 | 12
5 | 6
I need output like this, to know which category each post is mostly voted for.
post | most voted in this category
---------------
1 | 3
2 | 5
3 | 6
4 | 1
5 | 4
6 | 6
I need this for every post in the table, and then I would need to update every post in another table. I guess I have to do this in a loop.
But I'm already stuck at the first part.
All I have for the first part is this:
<?php global $wpdb;
$test = $wpdb->get_results('SELECT posts_id, value, COUNT(posts_id) AS ActionCount
FROM rating_item_entry_value
GROUP BY posts_id
ORDER BY ActionCount DESC');
echo '<pre>';
print_r($test);
echo '</pre>';
And this is the output I get:
Array
(
[0] => stdClass Object
(
[posts_id] => 0
[value] => 5
[ActionCount] => 7
)
[1] => stdClass Object
(
[posts_id] => 221
[value] => 3
[ActionCount] => 3
)
[2] => stdClass Object
(
[posts_id] => 197
[value] => 5
[ActionCount] => 2
)
[3] => stdClass Object
(
[posts_id] => 164
[value] => 3
[ActionCount] => 1
)
)
for example.
I have no idea how to do this better; I've tried a lot but can't figure it out. Does anyone have a good solution for getting the most frequent number for every single id (and maybe how to save the results in a variable to update every post in another table within a loop)? Thank you so much for any help. Regards
"Most frequent" means a COUNT aggregation with GROUP BY, ordered by frequency. You can map this to your problem:
select
x.amount,
count(*) as times -- I forgot that row
from
X x
group by
x.amount
order by
count(*) DESC
Edit: I think you mean this:
select
post_id,
value,
count(*)
from
your_table
group by
post_id,
value
order by
count(*) desc
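That second query returns one row per (post_id, value) pair, so the most frequent value per post still has to be picked out. One way is to do that in PHP, as in the sketch below. It assumes PHP 7.4 or later and assumes the count(*) column is aliased as votes; the helper name mostVotedPerPost and the sample rows are made up. On a tie, the row seen first wins.

```php
<?php
// Hypothetical helper: $rows are (post_id, value, votes) rows from the grouped query.
function mostVotedPerPost(array $rows): array
{
    $best = []; // post_id => ['value' => ..., 'votes' => ...]
    foreach ($rows as $row) {
        $post = $row['post_id'];
        // Keep the row with the highest vote count seen so far for this post.
        if (!isset($best[$post]) || $row['votes'] > $best[$post]['votes']) {
            $best[$post] = ['value' => $row['value'], 'votes' => $row['votes']];
        }
    }
    // Reduce to post_id => most voted value.
    return array_map(fn($b) => $b['value'], $best);
}

// Made-up rows, not taken from the question's table.
$winners = mostVotedPerPost([
    ['post_id' => 12, 'value' => 3, 'votes' => 2],
    ['post_id' => 12, 'value' => 6, 'votes' => 5],
    ['post_id' => 6,  'value' => 1, 'votes' => 1],
]);
print_r($winners);
```

The resulting post_id => value pairs can then be looped over to update the other table, for instance with $wpdb->update() in WordPress.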

match all values in an array php

The user can search for something using select and checkbox forms. The data is sent in GET variables.
I'm collecting all the possible values in variables and putting them into an array:
$term_taxomony_ids_array = array($term_taxonomy_id_m, $term_taxonomy_id_l, $term_taxonomy_id_t_1, $term_taxonomy_id_t_2, $term_taxonomy_id_t_3, $term_taxonomy_id_t_4, $term_taxonomy_id_t_5, $term_taxonomy_id_t_6, $term_taxonomy_id_t_7, $term_taxonomy_id_t_8);
print_r($term_taxomony_ids_array); would then give
eg:
Array (
[0] => 12
[1] => 14
[2] =>
[3] =>
[4] => 9
[5] =>
[6] => 2
[7] =>
[8] =>
[9] =>
)
How would I make this array simpler by leaving out the empty results altogether (as suggested in a comment)?
I need to find the 'places' in the database that match all the criteria that were selected.
My database table is set up so I have two columns.
1. Object_id 2. term_taxonomy_id.
The term_taxonomy_id are the values in the array.
eg my table looks like this
Object id term_taxonomy_id
2 12
2 3
3 12
3 14
3 9
3 2
4 5
5 9
So only object_id '3' matches all the terms in the array - 12, 14, 9, 2
How would I run a query to find this result?
I'm using a mysql database, phpmyadmin and my site is built on wordpress.
Thanks
Basically:
SELECT object_id, COUNT(term_taxonomy_id) AS cnt
FROM yourtable
WHERE term_taxonomy_id IN (2, 9, 14, 12)
GROUP BY object_id
HAVING cnt = 4
This finds all the object_ids that have one or more matching taxonomy IDs, but then returns only the object_ids that have FOUR matching taxonomy IDs.
Use IN, combined with COUNT() and GROUP BY/HAVING.
$array = array_filter(array_unique($array));
$count = count($array);
$sql = "SELECT id, COUNT(*) FROM table WHERE field IN (" . implode(',', $array) . ") GROUP BY id HAVING COUNT(*) = " . $count;
The SQL might be a bit off (you might have to re-order having and group).
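Since the ids come from GET variables, it may be safer to build the IN list from placeholders than to implode the raw values. The following is a sketch under assumptions, for PHP 7.4 or later: the helper name buildMatchAllQuery is hypothetical, yourtable is the placeholder table name used above, and the returned SQL is meant for a prepared statement (PDO or mysqli). Unlike a bare array_filter(), the filter here only drops empty strings and nulls, so a legitimate id of 0 would survive.

```php
<?php
// Hypothetical helper: returns [sql, params] for "objects that have ALL of these term ids".
function buildMatchAllQuery(array $rawIds): array
{
    // Drop empty form values, cast to int, remove duplicates, re-index.
    $ids = array_filter($rawIds, fn($v) => $v !== '' && $v !== null);
    $ids = array_values(array_unique(array_map('intval', $ids)));
    if ($ids === []) {
        return [null, []]; // nothing selected, nothing to query
    }
    // One "?" per id, bound later by the prepared statement.
    $placeholders = implode(',', array_fill(0, count($ids), '?'));
    $sql = "SELECT object_id FROM yourtable"
         . " WHERE term_taxonomy_id IN ($placeholders)"
         . " GROUP BY object_id"
         . " HAVING COUNT(DISTINCT term_taxonomy_id) = " . count($ids);
    return [$sql, $ids];
}

// The array from the question, with its empty slots.
[$sql, $params] = buildMatchAllQuery([12, 14, '', '', 9, '', 2, '', '', '']);
echo $sql, "\n";
print_r($params);
```

COUNT(DISTINCT ...) is used so that a duplicated row in the table cannot make an object reach the required count by accident.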

Optimizing multiple queries in MySQL and PHP

Questions
How should I write the query (or queries) to get these results?
Should I use a different structure for database tables?
Details
I want to get results from 3 tables:
+------------------------------+-------------------+
| courses | id | <-------+
| | name | |
| | | |
+------------------------------+-------------------+ |
| sections | id | <-------|----------+
| | course_id | <- FK(courses.id) |
| | name | |
+------------------------------+-------------------| |
| resources | id | |
| | section_id | <- FK(sections.id)-+
| | name |
+------------------------------+-------------------+
I want to store results in a PHP Array like this:
Array
(
[courses] => Array
(
[id] => 1
[name] => course 1
[sections] => Array
(
[0] => Array
(
[id] => 1
[course_id] => 1
[name] => course 1 section 1
[resources] => Array
(
[0] => Array
(
[id] => 1
[section_id] => 1
[name] => resource 1
)
)
)
)
)
)
EDIT
What I did:
$courses = DB::query(Database::SELECT,
'select * from courses')->execute($db,false)[0]; // Get all courses as array
foreach($courses as &$course) {
$sections = DB::query(Database::SELECT,
'select * from sections where course_id = '.$course['id']);
$course['sections'] = $sections;
foreach($course['sections'] as &$section) {
$resources = DB::query(...); // Get array of resources
$section['resources'] = $resources;
}
unset($section); // break the reference before the next iteration
}
unset($course);
The database structure is normalized - this is correct and should not be changed.
However, SQL returns de-normalized or "flattened" data for an N+ join: only a set of homogenous records can be returned in a single result-set. (Some databases, like SQL Server, allow returning structure by supporting XML generation.)
Getting the desired array structure in PHP will require one of the following:
1. Separate queries/result-sets (as shown in the post): ick!
There will be about one query per object. While the theoretical bounds might be similar, the practical implementation will be much less efficient, and the overhead will be much greater than for a single query. Remember that every query incurs (at the very least) a round-trip penalty. As such, this is not scalable, although it will likely work just fine for smaller sets of data or for "time insensitive" operations.
2. Re-normalize the resulting structure:
This is trivial to do with support for a "Group By" operation, as found in C#/LINQ. I am not sure how this would be approached [easily] in PHP [1]. This isn't perfect either, but assuming that hashing is used for the grouping, this should scale fairly well; it will definitely be better than #1.
Instead of the above, consider writing the query in such a way that the "flat" result can be used within the current problem/scope, if possible. That is, analyze how the array is to be used, then write the queries around that problem. This is often a better approach that can scale very well.
[1] Related to re-normalizing the data, YMMV:
SQL result to PHP multidimensional array
PHP array to multidimensional array
Group a multidimensional array by a particular value?
You can try something like this
SELECT * FROM (
select c.id, c.name, NULL as parent_id from courses c
union all
select s.id, s.name, s.course_id from sections s
union all
select r.id, r.name, r.section_id from resources r
) AS t
You can't get a multi-dimensional result from MySQL. The query for getting the elements should look like this:
select courses.id as coursesId,courses.name as coursesName,sections.id as sectionsId,sections.name as sectionsName,resources.id as resourcesId, resources.name as resourcesName
from courses
left join sections on courses.id=sections.course_id
left join resources on sections.id=resources.section_id;
But of course it will not give you the array in the shape you want.
If you are familiar with PHP you can use this code. I am writing only the 2nd level; you can write the third level the same way.
$final=array();
$c=-1;
$cid=false;
$cname=false;
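$query = "SELECT c.*,s.*,r.* FROM courses AS c LEFT JOIN sections AS s ON c.id=s.course_id LEFT JOIN resources AS r ON r.section_id =s.id";
//mysql_* was removed in PHP 7; this assumes $this->con is a mysqli connection
$result=mysqli_query($this->con, $query) or die(mysqli_error($this->con));
while($row= mysqli_fetch_row($result)){
//start a new course when the course id (column 0) changes, not the section id
if($cid!==$row[0]){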
$final['cources'][++$c]['id']=$cid=$row[0];
$final['cources'][$c]['name']=$cname=$row[1];
$s=-1;
}
$final['cources'][$c]['sections'][++$s]['id']=$row[2];
$final['cources'][$c]['sections'][$s]['course_id']=$row[3];
$final['cources'][$c]['sections'][$s]['name']=$row[4];
}
echo "<pre>";
print_r($final);
echo "</pre>";
//Output
Array
(
[cources] => Array
(
[0] => Array
(
[id] => 1
[name] => c1
[sections] => Array
(
[0] => Array
(
[id] => 1
[course_id] => 1
[name] => s1-1
)
[1] => Array
(
[id] => 1
[course_id] => 1
[name] => s1-1
)
)
)
[1] => Array
(
[id] => 2
[name] => c2
[sections] => Array
(
[0] => Array
(
[id] => 2
[course_id] => 2
[name] => s1-2
)
)
)
)
)
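The re-normalizing step, with hashing used for the grouping, can be sketched for all three levels as follows. This assumes PHP 7 or later and assumes the rows come from the aliased LEFT JOIN query shown earlier (coursesId, coursesName, sectionsId, sectionsName, resourcesId, resourcesName), fetched as associative arrays; the function name nestCourses and the sample rows are made up.

```php
<?php
// Hypothetical helper: turns flat joined rows into courses -> sections -> resources.
function nestCourses(array $rows): array
{
    $courses = [];
    foreach ($rows as $r) {
        $cid = $r['coursesId'];
        if (!isset($courses[$cid])) {
            $courses[$cid] = ['id' => $cid, 'name' => $r['coursesName'], 'sections' => []];
        }
        $sid = $r['sectionsId'];
        if ($sid === null) {
            continue; // course without sections: the LEFT JOIN produced NULLs
        }
        // Keying by id means a section repeated on several rows is stored once.
        if (!isset($courses[$cid]['sections'][$sid])) {
            $courses[$cid]['sections'][$sid] = [
                'id' => $sid,
                'course_id' => $cid,
                'name' => $r['sectionsName'],
                'resources' => [],
            ];
        }
        if ($r['resourcesId'] !== null) {
            $courses[$cid]['sections'][$sid]['resources'][] = [
                'id' => $r['resourcesId'],
                'section_id' => $sid,
                'name' => $r['resourcesName'],
            ];
        }
    }
    // Re-index so the result uses 0-based lists like the desired array.
    foreach ($courses as &$course) {
        $course['sections'] = array_values($course['sections']);
    }
    unset($course);
    return array_values($courses);
}

// Made-up rows shaped like the joined result.
$nested = nestCourses([
    ['coursesId' => 1, 'coursesName' => 'c1', 'sectionsId' => 1, 'sectionsName' => 's1', 'resourcesId' => 1, 'resourcesName' => 'r1'],
    ['coursesId' => 1, 'coursesName' => 'c1', 'sectionsId' => 1, 'sectionsName' => 's1', 'resourcesId' => 2, 'resourcesName' => 'r2'],
    ['coursesId' => 1, 'coursesName' => 'c1', 'sectionsId' => 2, 'sectionsName' => 's2', 'resourcesId' => null, 'resourcesName' => null],
    ['coursesId' => 2, 'coursesName' => 'c2', 'sectionsId' => null, 'sectionsName' => null, 'resourcesId' => null, 'resourcesName' => null],
]);
print_r($nested);
```

Because sections are keyed by id while grouping, a section that appears once per resource in the flat result is not duplicated in the output.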

MySQL: fetch a row and multiple related rows - possible?

Let's say I have one table: "cars" with 3 fields: id, brand, cost.
There's a second table: "models" with 3 fields: id, brand, model_name.
Each "cars" row can have multiple related "models" rows.
Is it possible to do an sql-select whose output looks like this?
edit: I use PHP for the database queries
array(
[0] => array(
[id] => 1
[brand] => mercedes
[cost] => 1000
[models] => array(
[0] => array(
[id] => 1
[brand] => mercedes
[model_name] => slk
)
[1] => array(
[id] => 2
[brand] => mercedes
[model_name] => clk
)
[2] => array(
[id] => 3
[brand] => mercedes
[model_name] => whatever
)
)
)
)
You need to add a foreign key relation to the models table, say car_id. Then:
SELECT * FROM cars JOIN models ON models.car_id = cars.id;
This will output something similar to what you are looking for.
Assuming you are using PHP, using the output:
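// list the columns explicitly so cars.id is not overwritten by models.id
$query= "SELECT cars.id, cars.brand, cars.cost, models.model_name FROM cars JOIN models ON models.car_id = cars.id";
$r= mysqli_query($dbc, $query);
$carstuff= array();
while ($row= mysqli_fetch_array($r, MYSQLI_ASSOC)) {
$carstuff['id']=$row['id'];
$carstuff['brand']=$row['brand'];
$carstuff['cost']=$row['cost'];
// collect every model name under the 'models' key
$carstuff['models'][]=$row['model_name'];
}
var_dump($carstuff);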
Note that the id, brand and cost are repeatedly overwritten, but that is okay as long as the query returns a single car, because they are overwritten with the same information. I'm not too sure about the cleanliness of the code, but that is the basic idea.
Try this:
Query:
SELECT c.ID, c.brand,c.cost, GROUP_CONCAT(model_name SEPARATOR '","') as models
, GROUP_CONCAT(m.ID SEPARATOR ',') as MID
, GROUP_CONCAT(m.brand SEPARATOR '","') as mbrand
FROM cars c
LEFT OUTER JOIN models m
ON m.brand = c.brand
GROUP BY c.brand;
Output:
ID | BRAND | COST | MODELS | MID | MBRAND
1 | audi | 1000 | m11","m22 | 4,5 | audi","audi
1 | mercedes | 1200 | m1","m2","m3 | 1,2,3 | mercedes","mercedes","mercedes
Now in your PHP code you can process the models, MID and mbrand columns
(by using explode, with the same separator the query used):
$modelArray = explode('","', $row["models"]);
SQLFIDDLE
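To get from one GROUP_CONCAT row back to the nested models array from the question, the three concatenated columns can be split and zipped together. This is a sketch only, for PHP 7 or later: the helper name expandModels is hypothetical, and the row keys (ID, brand, cost, models, MID, mbrand) are assumed to match the column names and aliases of the query above. It also assumes the three lists line up, which is only dependable if each GROUP_CONCAT uses the same ORDER BY (for example ORDER BY m.ID), and that the lists fit within group_concat_max_len (1024 bytes by default).

```php
<?php
// Hypothetical helper: $row is one result row of the GROUP_CONCAT query.
function expandModels(array $row): array
{
    $car = [
        'id' => (int) $row['ID'],
        'brand' => $row['brand'],
        'cost' => (int) $row['cost'],
        'models' => [],
    ];
    if ($row['MID'] === null) {
        return $car; // the LEFT JOIN found no models for this car
    }
    $ids = explode(',', $row['MID']);
    $names = explode('","', $row['models']);
    $brands = explode('","', $row['mbrand']);
    // Zip the three parallel lists into one array per model.
    foreach ($ids as $i => $id) {
        $car['models'][] = [
            'id' => (int) $id,
            'brand' => $brands[$i],
            'model_name' => $names[$i],
        ];
    }
    return $car;
}

// Row copied from the sample output above.
$car = expandModels([
    'ID' => '1',
    'brand' => 'mercedes',
    'cost' => '1200',
    'models' => 'm1","m2","m3',
    'MID' => '1,2,3',
    'mbrand' => 'mercedes","mercedes","mercedes',
]);
print_r($car);
```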
