Check whether sentences share the same words - PHP

tb_content:
=====================================
|id|sentence |sentence_id|content_id|
=====================================
| 1|sentence1|     0     |     1    |
| 2|sentence2|     1     |     1    |
| 3|sentence5|     0     |     2    |
| 4|sentence6|     1     |     2    |
| 5|sentence7|     2     |     2    |
=====================================
tb_word:
================================
|id|word|sentence_id|content_id|
================================
| 1| a  |     0     |     1    |
| 2| b  |     0     |     1    |
| 3| c  |     1     |     1    |
| 4| a  |     1     |     1    |
| 5| e  |     1     |     1    |
| 6| f  |     0     |     2    |
| 7| g  |     1     |     2    |
| 8| h  |     1     |     2    |
| 9| i  |     1     |     2    |
|10| f  |     2     |     2    |
|11| h  |     2     |     2    |
|12| f  |     2     |     2    |
================================
I need to check, for every content_id, whether each sentence contains words that also appear in the other sentences of the same content_id.
For example:
For content_id = 1 there are two sentences, sentence1 and sentence2. From tb_word we can see that sentence1 and sentence2 both contain the word a. If a word occurs 2 or more times across the two sentences, that word is part of the result. So if I print the result, it must be:
00 Array ( [0] => a [1] => b )
01 Array ( [3] => a )
10 Array ( [3] => a )
11 Array ( [0] => c [1] => a [2] => e )
where 00 means sentence_id = 0 compared with sentence_id = 0.
First, I wrote functionTotal to count how many sentences each content_id owns:
function functionTotal() {
    $total = array();
    // Count the sentences of each content_id
    $sql = mysql_query('select content_id, count(*) as RowAmount
        from tb_content Group By content_id') or die(mysql_error());
    while ($row = mysql_fetch_array($sql)) {
        $total[] = $row['RowAmount'];
    }
    return $total;
}
From that function I get the value of $total, and with it I need to check which words (from tb_word) are shared between every possible pair of sentences:
foreach ($total as $content_id => $totals){
    for ($x=0; $x <= ($totals-1); $x++) {
        for ($y=0; $y <= ($totals-1); $y++) {
            $shared = getShared($x, $y);
        }
    }
}
The function getShared is:
function getShared ($x, $y){
    $token = array();
    $shared = array();
    $i = 0;
    if ($x == $y) {
        // Same sentence: return all of its words
        $query = mysql_query("SELECT word FROM `tb_word`
            WHERE sentence_id ='$x' ");
        while ($row = mysql_fetch_array($query)) {
            $shared[$i] = $row['word'];
            $i++;
        }
    } else {
        // Two sentences: keep the words that occur at least twice
        $query = mysql_query("SELECT word, count(word) as jml
            FROM `tb_word` WHERE sentence_id ='$x'
            OR sentence_id ='$y'
            GROUP BY word ");
        while ($row = mysql_fetch_array($query)) {
            $jml = $row['jml'];
            $token[$i] = $row['word'];
            if ($jml >= 2) {
                $shared[$i] = $token[$i];
            }
            $i++;
        }
    }
    return $shared;
}
But the result I get is still wrong: it mixes words from different content_id values. The result must also be grouped by content_id. Sorry for my bad English and my poor explanation. Correct me if I'm wrong, and please help me. Thank you :)

This can actually be done by the DBMS itself, in two steps within one query. First, make a self join to produce the sentence combinations within the same content:
SELECT a.content_id,
a.sentence_id AS sentence_id_1,
b.sentence_id AS sentence_id_2
FROM tb_content AS a
JOIN tb_content AS b
ON ( a.content_id = b.content_id
AND a.sentence_id <= b.sentence_id )
The "<=" keeps same-sentence joins, like "1-1" or "2-2", and yet avoids bidirectional repetitions, like "1-2" and "2-1". Next, you can join the above result with the words and count the number of occurrences, like this:
SELECT s.content_id,
s.sentence_id_1,
s.sentence_id_2,
c.word,
Count(*) AS jml
FROM (SELECT a.content_id,
a.sentence_id AS sentence_id_1,
b.sentence_id AS sentence_id_2
FROM tb_content AS a
JOIN tb_content AS b
ON ( a.content_id = b.content_id
AND a.sentence_id <= b.sentence_id )) AS s
JOIN tb_word AS c
ON ( s.content_id = c.content_id
AND ( c.sentence_id = s.sentence_id_1
OR c.sentence_id = s.sentence_id_2 ) )
GROUP BY s.content_id,
s.sentence_id_1,
s.sentence_id_2,
c.word
HAVING Count(*) >= 2;
The result of the above query gives you the content, the two sentences, the word, and the number of occurrences (which is 2 or more). All you need to do now is collect the result into an array, which, as far as I can see, you already know how to do.
Let me know if I misunderstood your goal.
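Collecting the rows into an array could look like the following minimal sketch. It assumes the rows have already been fetched as associative arrays with the column names used in the query above; the function name collectShared and the "0-1" key format are hypothetical, not part of the original code.

```php
<?php
// Hypothetical helper: group the rows of the query above by content_id
// and by sentence pair, keeping only the shared words.
function collectShared(array $rows): array
{
    $shared = [];
    foreach ($rows as $row) {
        // Key such as "0-1" identifies the compared sentence pair
        $pair = $row['sentence_id_1'] . '-' . $row['sentence_id_2'];
        $shared[$row['content_id']][$pair][] = $row['word'];
    }
    return $shared;
}
```

Keying by content_id first keeps the words of different contents apart, which is the grouping the question asks for.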

How about simply SELECT content_id, word, COUNT(*) as num_appearing FROM tb_word GROUP BY content_id, word?
EDIT: I see the complexity now: your main issue is that the getShared() function has two sentence IDs passed to it, but no content_id to know which content is being analyzed. You're also assuming that content_id and sentence_id numbers are consecutive and start at zero. My code doesn't assume that, and pulls those IDs directly from the database.
<?php
$rs = mysql_query("SELECT * FROM tb_content");
$content = array();
while ($row = mysql_fetch_assoc($rs)) {
    if (!isset($content[$row['content_id']])) $content[$row['content_id']] = array();
    $content[$row['content_id']][] = $row['sentence_id'];
}
foreach($content as $content_id => $sentences) {
    foreach($sentences as $sentence_id) {
        foreach($sentences as $compare) {
            $shared = getShared($content_id, $sentence_id, $compare);
        }
    }
}
function getShared($cid, $s1, $s2) {
    $rs = mysql_query("SELECT `word`, COUNT(*) AS 'num' FROM `tb_word` WHERE `content_id`={$cid} AND `sentence_id` IN ({$s1}, {$s2}) GROUP BY `word`");
    $out = array();
    while ($row = mysql_fetch_assoc($rs)) {
        // Read from the fetched $row, not from the result resource $rs
        if ($row['num'] >= 2) $out[$row['word']] = $row['num'];
    }
    return $out;
}

Related

Group Array items in PHP

I have the following array:
| Student First Name |Student Last Name | Age |Disability|
| Student_First_Name_1 |Student_Last_Name_1 | 30 | 1 |
| Student_First_Name_2 |Student_Last_Name_2 | 28 | 0 |
| Student_First_Name_3 |Student_Last_Name_3 | 21 | 0 |
| Student_First_Name_4 |Student_Last_Name_4 | 20 | 1 |
| Student_First_Name_5 |Student_Last_Name_5 | 22 | 0 |
and I want to group the students by age and disability.
So if my code runs correctly, I'll get the following results:
Student_First_Name_1 : Student_First_Name_4
Student_First_Name_3 : Student_First_Name_5
Student_First_Name_2
But instead I get the following:
Student_First_Name_1 : Student_First_Name_4
Student_First_Name_3 : Student_First_Name_5
Student_First_Name_2 : Student_Last_Name_3
Student_First_Name_2 : Student_Last_Name_5
My code is:
$StudentsForSID = $conn->prepare("SELECT * FROM members WHERE sid = :sid AND level = :level");
$StudentsForSID->execute([ 'sid' => $SelectedSID, 'level' => 'LRN_B1' ]);
while($row = $StudentsForSID->fetch(PDO::FETCH_ASSOC)){
    $TempSelected[] = $row;
}
$count=count($TempSelected);
for($i=0; $i<$count-1; $i++){
    for ($j = $i+1; $j < $count; $j++) {
        if($TempSelected[$i]['disability']==$TempSelected[$j]['disability']){
            if( abs($TempSelected[$i]['age']-$TempSelected[$j]['age']) <= 23 ){
                $Student1 = $TempSelected[$j]['first_name'];
                $Student2 = $TempSelected[$i]['first_name'];
                print_r($Student1.'-'.$Student2.'<br/>');
            }
        }
    }
}
I don't think I explained it very well, so I have edited the question.
What I want:
I want to make groups of 2 students who have the same value in disability and whose age difference is 23 or less.
So from the array of 5 students above I will make 3 groups: 2 groups of 2 students who fulfil the criteria, and 1 group with a single student.
Can you help me?
Thank you
Why don't you group by both columns in your query?
GROUP BY age, disability
This will group your results the way you wanted, so you save the PHP sorting and those nested if and for statements.
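GROUP BY collapses rows that have exactly equal values, so it may not express the "age difference of 23 or less" rule from the question. A minimal PHP sketch of one alternative, assuming greedy pairing is acceptable: each student is paired with the first unused student who matches, and both are then marked as used so that nobody appears twice. The function name pairStudents and the $maxGap parameter are hypothetical, not from the original code.

```php
<?php
// Hypothetical helper: greedily pair students who have the same
// 'disability' value and whose ages differ by at most $maxGap.
// Every student ends up in exactly one group of one or two names.
function pairStudents(array $students, int $maxGap = 23): array
{
    $groups = [];
    $used = [];
    $count = count($students);
    for ($i = 0; $i < $count; $i++) {
        if (isset($used[$i])) {
            continue; // already placed in an earlier pair
        }
        $used[$i] = true;
        $group = [$students[$i]['first_name']];
        for ($j = $i + 1; $j < $count; $j++) {
            if (isset($used[$j])) {
                continue;
            }
            $sameDisability = $students[$i]['disability'] == $students[$j]['disability'];
            $closeInAge = abs($students[$i]['age'] - $students[$j]['age']) <= $maxGap;
            if ($sameDisability && $closeInAge) {
                $group[] = $students[$j]['first_name'];
                $used[$j] = true;
                break; // a group holds at most two students
            }
        }
        $groups[] = $group;
    }
    return $groups;
}
```

Because the pairing is greedy, the exact pairs depend on the row order. For the five sample students it produces two pairs and one single student, although not necessarily the same pairs as listed in the question.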

real rank in mysql and php

I do a simple SQL-Query:
SELECT `name`, `likes`
FROM `social`
WHERE `month` = '2015-01'
ORDER BY `likes` DESC
Then I add a "rank", which is an integer incremented with ++:
$data = array();
$rank = 0;
while ($table_row = mysqli_fetch_assoc($table)) {
$rank++;
$data[$table_row['name']] = $table_row;
$data[$table_row['name']]['rank'] = $rank;
}
The current result is on the left and what I want is on the right:
+------+------+-------+ +------+------+-------+
| rank | name | likes | | rank | name | likes |
+------+------+-------+ +------+------+-------+
| 1 | foo | 123 | | 1 | foo | 123 |
| 2 | mfoo | 33 | | 2 | mfoo | 33 |
| 3 | xfoo | 33 | | 2 | xfoo | 33 |
| 4 | yfoo | 30 | | 4 | yfoo | 30 |
| 5 | zfoo | 29 | | 5 | zfoo | 29 |
+------+------+-------+ +------+------+-------+
How do I get the table on the right? Is there a way to solve it in the query?
EDIT:
This is where I stand now:
select IF(@likes=s.likes, @rownum, @rownum:=@rownum+1) rank2,
    s.domain_name, s.likes,
    (@likes:=s.likes) dummy
from social s,
    (SELECT @rownum:=0) x,
    (SELECT @likes:=0) y
WHERE `month` = '2015-01'
order by likes desc
But the rank is not 100% correct, because I want to skip a rank after a tie instead of counting straight through.
I see what's happening here.
I don't know about best practices, but I'd probably do something like this myself:
$data = array();
$rank = 0;
$lastlike = 1;
$currentlike = 0;
$i = 0;
while ($table_row = mysqli_fetch_assoc($table)) {
$name = $table_row['name'];
$currentlike = $table_row['likes'];
if ($currentlike != $lastlike) $rank++;
$data[$i] = array('rank'=>$rank,'name'=>$name,'likes'=>$currentlike);
$lastlike = $currentlike;
$i++;
}
I've not checked it but you're welcome to try it out and see.
You can do this with SQL or with PHP (test both to see which one is faster).
Note: both solutions give a rank that starts at 0.
SQL:
SELECT
    `name`,
    `likes`,
    -- count the rows of the same month that have more likes
    (SELECT COUNT(*) FROM `social` AS S2
     WHERE S2.`month` = S1.`month` AND S2.likes > S1.likes) AS `rank_2`
FROM `social` AS S1
WHERE `month` = '2015-01'
ORDER BY `likes` DESC
PHP :
$data = array();
$rank = 0;
$likes_pre = -1;
while ($table_row = mysqli_fetch_assoc($table)) {
$likes_cur = $table_row['likes'];
if ($likes_pre > $likes_cur) {
$rank++;
}
$data[$table_row['name']] = $table_row;
$data[$table_row['name']]['rank'] = $rank;
$likes_pre = $likes_cur;
}
Not tested, but it should work.
Try this. The rank should be the same for all rows with the same number of likes:
$data = array();
$rank = 0;
$last_likes = 0;
while ($table_row = mysqli_fetch_assoc($table)) {
    if ($last_likes != $table_row['likes']) {
        $rank++;
    }
    $data[$table_row['name']] = $table_row;
    $data[$table_row['name']]['rank'] = $rank;
    // Remember the likes of this row for the next comparison
    $last_likes = $table_row['likes'];
}
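The question asks for a ranking that skips a place after a tie (1, 2, 2, 4, 5). A minimal sketch of that variant in PHP, assuming the rows are already sorted by likes in descending order and passed in as an array; the function name addCompetitionRank is hypothetical.

```php
<?php
// Hypothetical helper: competition ranking (1, 2, 2, 4, ...) for rows
// that are already sorted by 'likes' in descending order.
function addCompetitionRank(array $rows): array
{
    $ranked = [];
    $rank = 0;
    $position = 0;
    $previousLikes = null;
    foreach ($rows as $row) {
        $position++;
        // A new likes value takes the current position as its rank;
        // tied rows keep the rank of the first row in the tie.
        if ($row['likes'] !== $previousLikes) {
            $rank = $position;
        }
        $row['rank'] = $rank;
        $ranked[$row['name']] = $row;
        $previousLikes = $row['likes'];
    }
    return $ranked;
}
```

Tracking the row position separately from the rank is what makes the rank jump from 2 to 4 after two tied rows.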

update table from array with different values where empty field

I have a problem with my code: the data is not updated and user_id is still NULL.
table A
year | period | code | user_id
2013 | 4 | 1231 |
2013 | 4 | 1232 |
2013 | 4 | 1233 |
2013 | 4 | 1234 |
2013 | 4 | 1235 |
Table B
user_id | user_name | cash
A1 | AB | 10
A2 | BC | 5
A3 | CD | 7
I want to put the user_id from table B into the user_id column of table A when cash >= 7.
Table Result
year | period | code | user_id
2013 | 4 | 1231 | 10
2013 | 4 | 1232 | 7
2013 | 4 | 1233 |
2013 | 4 | 1234 |
2013 | 4 | 1235 |
Here is my code:
$arr = array();
$query = mysql_query("select user_id from tableB where cash >= 7");
while ($arrs = mysql_fetch_array($query)) {
    $arr[] = $arrs[0];
}
$count = count($arr);
for ($i = 0; $i < $count; $i++) {
    $sql = mysql_query("UPDATE tableA SET user_id ='$arr[$i]' WHERE year = 2013 and period = 4 and user_id IS NULL");
}
if ($sql) {
    echo "success";
}
The operator in the query should be >= and not =>
check on this:
$arr[] = $arrs[0];
This now holds the first record of the result set. If you expect only one record with cash >= 7, that might be OK, but that's a risky assumption.
for ($i = 0; $i < $count; $i++) {
$sql = mysql_query("UPDATE tableA SET user_id ='$arr[$i]' WHERE year = 2013 and
period = 4 and user_id IS NULL");
}
Here $arr[$i] iterates over the fields of the first record, giving you the values 'A3', 'CD' and 7 in sequence, so you run three useless updates on your table. Afterwards the column user_id in table A has the value 7 and not 'A3', since 7 is the last value in your loop.
I have a solution for my code:
$arr = array();
$query = mysql_query("select user_id from tableB where cash >= 7");
while ($arrs = mysql_fetch_array($query)) {
    $arr[] = $arrs[0];
}
$q = mysql_query("select code from tableA order by code");
$index = 0;
// Stop once every matching user_id has been assigned to a code
while(isset($arr[$index]) && ($codes = mysql_fetch_row($q))){
    $sql = mysql_query("UPDATE tableA SET user_id ='".$arr[$index++]."' WHERE code='".$codes[0]."'");
}
The result is perfect!
Thanks, all.

get the column name where the data was found in MySQL

Is it possible to get the name of the column in which a particular value was found without looping through the result, maybe with PDO? Or is there another way to do this in MySQL?
The example shows only 3 columns, but my table may have up to 30 columns that need to be checked.
Say I have a table, table1, and want to find the column(s) where 'x' was found:
+----+------+------+------+
| id | Col1 | Col2 | Col3 |
+----+------+------+------+
| 0 | x | b | x |
| 1 | x | x | f |
| 2 | d | x | g |
| 3 | h | j | k |
+----+------+------+------+
Currently I run a SELECT, then loop over each row and check each column of the row for 'x':
$query = "SELECT * FROM table1 WHERE (Col1='x' OR Col2='x' OR Col3='x')";
$result = mysql_query($query);
$foundCols = array();
$rowCnt = 0;
while ($row = mysql_fetch_assoc($result)) {
    $tmpArr = array();
    if ($row['Col1'] == 'x') {
        $tmpArr[] = 'Col1';
    }
    if ($row['Col2'] == 'x') {
        $tmpArr[] = 'Col2';
    }
    if ($row['Col3'] == 'x') {
        $tmpArr[] = 'Col3';
    }
    $foundCols[$rowCnt] = $tmpArr;
    $rowCnt = $rowCnt + 1;
}
Thank you
Try this:
while ($row = mysql_fetch_assoc($result)) {
...
foreach (array('Col1', 'Col2', 'Col3') as $key) {
if ($row[$key] == 'x') {
$tmpArr[] = $key;
}
}
...
}
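With up to 30 columns, listing every column name by hand may become tedious. As a hedged alternative, PHP's array_keys accepts a search value and returns the keys whose value matches it, so the column names can be read straight from each fetched row. A minimal sketch; the function name columnsMatching is hypothetical.

```php
<?php
// Hypothetical helper: return the names of the columns in one fetched
// row whose value is exactly $needle.
function columnsMatching(array $row, string $needle): array
{
    // The third argument enables strict comparison, so a numeric id
    // such as 0 is never treated as equal to a string like 'x'.
    return array_keys($row, $needle, true);
}
```

Inside the fetch loop this could be used as $foundCols[] = columnsMatching($row, 'x'); the loop over rows is still needed, but not the per-column checks.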

Stop Words Filtering for some documents

I am confused about how to filter the words of several documents. I have to check the documents one by one. For example, from tb_tokens:
======================================================================
| tokens_id | tokens_word | tokens_freq| sentence_id | document_id |
======================================================================
| 1 | A | 1 | 0 | 1 |
| 2 | B | 1 | 0 | 1 |
| 3 | C | 1 | 1 | 1 |
| 4 | D | 1 | 0 | 2 |
| ... | | | | |
======================================================================
I have to remove all words that appear on a list of common words like "and", "the", etc. That list is stored in the table tb_stopword. Then I have to remove the words that occur in large numbers across most documents; that list is stored in the table tb_term.
The function cekStopWord:
function cekStopWord ($word) {
    $query = mysql_query("SELECT stoplist_word FROM tb_stopword where stoplist_word = '$word' ");
    // mysql_fetch_row returns false when no row matches
    $row = mysql_fetch_row($query);
    return $row !== false;
}
And a similar function for the second step (removing words that occur in large numbers across most documents):
function cekTerm ($word) {
$query = mysql_query("SELECT term_word FROM tb_term where term_word = '$word' ");
I am confused about how to process every document. I tried to call it by doc_id, but it doesn't work. Here's my code:
//$doc_id is a variable that holds an array of document_id values
$query = mysql_query('SELECT tokens_word, sentence_id, document_id FROM tb_tokens WHERE document_id IN (' . implode(",", $doc_id) . ')') or die(mysql_error());
while ($row = mysql_fetch_array($query)) {
    $word[$row['document_id']][$row['sentence_id']] = $row['tokens_word'];
}
foreach ($word as $doc_id => $words){
    $cekStopWord = cekStopWord($words);
    $cekTerm = cekTerm($words);
    if((preg_match("/^[A-Z, 0-9]/", $words))&& (!$cekStopWord) && (!$cekTerm) ){
        $q = mysql_query("INSERT INTO tb_tagging VALUES ('','$words','','$sentence_id','$doc_id') ");
    }
}
Also, how do I use preg_match on an array?
thank you so much :)
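preg_match works on a single string; for an array, PHP's preg_grep returns the elements that match a pattern. A minimal sketch, assuming the stop words and terms have already been loaded into plain arrays and assuming the intent of the pattern is "starts with an uppercase letter or a digit". The function name filterTokens and its parameters are hypothetical.

```php
<?php
// Hypothetical helper: keep the tokens that start with an uppercase
// letter or a digit and that appear in neither word list.
function filterTokens(array $tokens, array $stopWords, array $terms): array
{
    // preg_grep applies the pattern to every element of the array
    $candidates = preg_grep('/^[A-Z0-9]/', $tokens);
    // Flip the lists into keys so that each lookup is a cheap isset()
    $blocked = array_flip(array_merge($stopWords, $terms));
    $kept = array_filter($candidates, function ($word) use ($blocked) {
        return !isset($blocked[$word]);
    });
    return array_values($kept); // reindex from 0
}
```

Loading both lists once and filtering in memory also avoids running one query per word, which the per-word functions above would do.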
