Condensing, restructuring and adding subarrays - php

I've been scratching my head and failing miserably at coming up with a solution to my array structuring issue. I'm not sure exactly what part would be better to try and fix, the data being returned from SQL or the PHP array after the fact.
My SQL data is returned like this:
$i = 0;
while ( $row = sqlsrv_fetch_array( $stmt, SQLSRV_FETCH_ASSOC ) ) {
$colData[$i] = array(
'name' => $row['FULLNAME'],
'invoice' => $row['CUST_InvoiceNumber_020911544'],
array(
'service' => $row['CUST_Service_052400634'],
'date' => date_normalizer($row['CUST_ServiceDate_064616924']),
'service_amount' => $row['CUST_ServiceAmount_054855553'],
),
'do_all_for' => $row['CUST_DoAllFor_021206685'],
'memo' => $row['CUST_Memo_021614200'],
'paymenttype' => $row['CUST_PAYMENTTYPE_123838203'],
'deposit' => $row['CUST_DEPOSIT_124139703'],
'datepaid' => date_normalizer($row['CUST_DATEPAID_124941578']),
);
$i++;
}
And the resultant array has this structure:
array (
0 =>
array (
'name' => 'ABRAHAM PRETORIS',
'invoice' => '63954',
0 =>
array (
'service' => 'Tree Work',
'date' => '2015-01-22',
'service_amount' => '1305.00',
),
'do_all_for' => '4924.68',
'memo' => 'CHECK #947 $2400',
'paymenttype' => 'VISA',
'deposit' => '4429.48',
'datepaid' => '2015-02-09',
),
1 =>
array (
'name' => 'ABRAHAM PRETORIS',
'invoice' => '63954',
0 =>
array (
'service' => 'DRF',
'date' => '2015-01-22',
'service_amount' => '740.00',
),
'do_all_for' => '4924.68',
'memo' => 'CHECK #947 $2400',
'paymenttype' => 'VISA',
'deposit' => '4429.48',
'datepaid' => '2015-02-09',
),
2 =>
array (
'name' => 'ABRAHAM PRETORIS',
'invoice' => '63954',
0 =>
array (
'service' => 'Stumps',
'date' => '2015-01-26',
'service_amount' => '360.00',
),
'do_all_for' => '4924.68',
'memo' => 'CHECK #947 $2400',
'paymenttype' => 'VISA',
'deposit' => '4429.48',
'datepaid' => '2015-02-09',
),
Notice that I'm getting a new subarray for the same person because the sub-subarray (service, date & service_amount) has multiple values.
What I'm trying to accomplish is condensing the array so that I only have one array for "ABRAHAM PRETORIS" etc, but all of the different services listed as a sub array. I would like it to look like this:
array (
0 =>
array (
'name' => 'ABRAHAM PRETORIS',
'invoice' => '63954',
0 =>
array (
'service' => 'Tree Work',
'date' => '2015-01-22',
'service_amount' => '1305.00',
),
1 =>
array (
'service' => 'DRF',
'date' => '2015-01-22',
'service_amount' => '740.00',
),
2 =>
array (
'service' => 'STUMPS',
'date' => '2015-01-26',
'service_amount' => '360.00',
),
'do_all_for' => '4924.68',
'memo' => 'CHECK #947 $2400',
'paymenttype' => 'VISA',
'deposit' => '4429.48',
'datepaid' => '2015-02-09',
),
I've looked at tons of examples of nested foreach statements and php array functions but I just can't wrap my head around how to loop through and add the additional services to the array then proceed when it's a row with a different name and/or invoice number.
Thanks in advance for the help!!

First, make sure your SQL query has an order by name, invoice. That will ensure all the records you want to group are sequential.
Then you have to create a loop with some additional inner logic:
// Creates an array to hold the final result.
$result = array();
// This var will keep track of name changes.
$current_name = '';
while ( $row = sqlsrv_fetch_array( $stmt, SQLSRV_FETCH_ASSOC ) )
{
    // Let's check if the name changed. This will be true the first
    // time the loop runs.
    if ($current_name != $row['FULLNAME'])
    {
        // On the first record, the if below will not run. On subsequent
        // name changes, it adds the accumulated array to the main result.
        if ($current_name != '') $result[] = $temp;
        // The temp array will be populated with all data that DOES NOT change
        // for the current name.
        $temp = array(
            'name' => $row['FULLNAME'],
            'invoice' => $row['CUST_InvoiceNumber_020911544'],
            'do_all_for' => $row['CUST_DoAllFor_021206685'],
            'memo' => $row['CUST_Memo_021614200'],
            'paymenttype' => $row['CUST_PAYMENTTYPE_123838203'],
            'deposit' => $row['CUST_DEPOSIT_124139703'],
            'datepaid' => date_normalizer($row['CUST_DATEPAID_124941578']),
        );
        // Update the current name.
        $current_name = $row['FULLNAME'];
    }
    // The part that runs only on name changes has finished. From now on, we
    // take care of the data that is accumulated in a sub-array
    // (until the name changes and the block above resets it).
    $temp['sub-array'][] = array(
        'service' => $row['CUST_Service_052400634'],
        'date' => date_normalizer($row['CUST_ServiceDate_064616924']),
        'service_amount' => $row['CUST_ServiceAmount_054855553'],
    );
}
// After the loop, the last temp array needs to be added too
// (skipped when the query returned no rows at all).
if ($current_name != '') $result[] = $temp;
This is the general concept: you create a temporary array to hold the current name, inside which you accumulate the other data. Once the name changes, the accumulated data is dumped to the main result, the temp array is reset, and a new accumulation begins.
I can't test the code right now, so it probably needs some fixes, but this approach works really well, and my point here is to show you the concept, so you can adapt it to your specific needs.
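As a variation on the accumulation idea above, the grouping can also be keyed on name plus invoice, which does not depend on the rows arriving in sorted order. This is only a minimal sketch under assumptions: `$rows` is hypothetical sample data standing in for the `sqlsrv_fetch_array` results, the column names are shortened for illustration, and `groupRows()` and the `'services'` key are made-up names.

```php
<?php
// Hypothetical sample rows standing in for the SQL result set
// (column names shortened for illustration; the second customer is invented).
$rows = array(
    array('FULLNAME' => 'ABRAHAM PRETORIS', 'INVOICE' => '63954', 'SERVICE' => 'Tree Work', 'AMOUNT' => '1305.00'),
    array('FULLNAME' => 'ABRAHAM PRETORIS', 'INVOICE' => '63954', 'SERVICE' => 'DRF', 'AMOUNT' => '740.00'),
    array('FULLNAME' => 'JANE DOE', 'INVOICE' => '70001', 'SERVICE' => 'Stumps', 'AMOUNT' => '360.00'),
);

// Groups rows by name + invoice and collects the per-row service data
// in a 'services' sub-array.
function groupRows(array $rows)
{
    $grouped = array();
    foreach ($rows as $row) {
        $key = $row['FULLNAME'] . '|' . $row['INVOICE'];
        if (!isset($grouped[$key])) {
            // First time this name/invoice pair is seen: store the fixed data.
            $grouped[$key] = array(
                'name' => $row['FULLNAME'],
                'invoice' => $row['INVOICE'],
                'services' => array(),
            );
        }
        // Every row contributes one entry to the sub-array.
        $grouped[$key]['services'][] = array(
            'service' => $row['SERVICE'],
            'service_amount' => $row['AMOUNT'],
        );
    }
    // Drop the helper keys so the result is a plain 0-indexed list.
    return array_values($grouped);
}

$result = groupRows($rows);
print_r($result);
```

The trade-off is that the whole lookup array stays in memory until the end, whereas the change-detection loop only ever holds one group at a time.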

Related

Find element with duplicate key value and add new key using PHP

I have an array
$info = array(
[0] => array(
'id' => 1,
'uid' => '677674e21aed487fd7180da4a7619a9d'
),
[1] => array(
'id' => 1,
'uid' => 'd3c98a10fe4e42fb1fe868008c0f4cc1'
),
[2] => array(
'id' => 1,
'uid' => 'd3c98a10fe4e42fb1fe868008c0f4cc1'
),
[3] => array(
'id' => 1,
'uid' => '658284e5395a29bf34d21f30a854e965'
),
[4] => array(
'id' => 1,
'uid' => '01f33ae45a463e0c1de4ad989b3ccad5'
),
[5] => array(
'id' => 1,
'uid' => '677674e21aed487fd7180da4a7619a9d'
)
)
As you can see, the uids at index 0 and index 5 are the same. Similarly, the uids at index 1 and index 2 are the same.
I want a PHP script by which I can randomly create one hexadecimal color code for duplicate uids. Say something like this.
$info = array(
[0] => array(
'id' => 1,
'uid' => '677674e21aed487fd7180da4a7619a9d',
'col' => 'black'
),
[1] => array(
'id' => 1,
'uid' => 'd3c98a10fe4e42fb1fe868008c0f4cc1',
'col' => 'green'
),
[2] => array(
'id' => 1,
'uid' => 'd3c98a10fe4e42fb1fe868008c0f4cc1',
'col' => 'green'
),
[3] => array(
'id' => 1,
'uid' => '658284e5395a29bf34d21f30a854e965'
),
[4] => array(
'id' => 1,
'uid' => '01f33ae45a463e0c1de4ad989b3ccad5'
),
[5] => array(
'id' => 1,
'uid' => '677674e21aed487fd7180da4a7619a9d',
'col' => 'black'
)
)
How can I do this with the least execution time?
There might be various ways for doing this workout, but due to lack of proper response, I came up with this probable lengthier code. I am posting the answer here for people who might need this.
$uidArray = array(); // blank array to collect each uid
$uidDuplicateArray = array(); // blank array to hold the duplicate uid(s) only
foreach ($all_data as $key => $ad)
{
    // iterate through each item of the list
    // .................
    // ..................
    $uidArray[] = $ad['uid'];
}
foreach (array_count_values($uidArray) as $val => $c)
{
    if ($c > 1)
    {
        // if the count is more than 1, the uid is a duplicate:
        // store it with the uid as key and a random color code as value
        $uidDuplicateArray[$val] = sprintf('#%06X', mt_rand(0, 0xFFFFFF));
    }
}
foreach ($all_data as $keyAgain => $adg)
{
    // iterate through each item of the original data
    if (isset($uidDuplicateArray[$adg['uid']]))
    {
        // if the uid is a key of the duplicate array, copy its color into a new key of the original item
        $all_data[$keyAgain]['color'] = $uidDuplicateArray[$adg['uid']];
    }
}
Each comment associated with each LOC is self explanatory.
The reason I wanted this is to mark the duplicates in the UI.
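The same idea can be packaged as a self-contained function, which may be easier to test. This is a sketch rather than the code above: `colorDuplicates()` is a hypothetical name, the short uids are invented sample data, and it assumes PHP 5.5 or later for `array_column`. Since the colors are random, two different uids could in principle receive the same color.

```php
<?php
// Adds a 'color' key to every item whose uid occurs more than once.
// Items with a unique uid are returned unchanged.
function colorDuplicates(array $items)
{
    // Count how often each uid occurs.
    $counts = array_count_values(array_column($items, 'uid'));
    $colors = array();
    foreach ($items as $i => $item) {
        $uid = $item['uid'];
        if ($counts[$uid] < 2) {
            continue; // unique uid: no color
        }
        if (!isset($colors[$uid])) {
            // First time this duplicate uid is seen: pick a random hex color.
            $colors[$uid] = sprintf('#%06X', mt_rand(0, 0xFFFFFF));
        }
        $items[$i]['color'] = $colors[$uid];
    }
    return $items;
}

// Invented sample data with shortened uids.
$info = array(
    array('id' => 1, 'uid' => 'aaa'),
    array('id' => 1, 'uid' => 'bbb'),
    array('id' => 1, 'uid' => 'bbb'),
    array('id' => 1, 'uid' => 'ccc'),
    array('id' => 1, 'uid' => 'aaa'),
);
$colored = colorDuplicates($info);
print_r($colored);
```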

How to make an array key an array PHP

I'm storing data to an array like this which is inside three nested loops (loops omitted):
$teamDetails[$k] = array(
'side' => $json['data'][$i]['rosters'][$k]['side'],
'gold' => $json['data'][$i]['rosters'][$k]['gold'],
'aces' => $json['data'][$i]['rosters'][$k]['aces_earned'],
'herokills' => $json['data'][$i]['rosters'][$k]['hero_kills'],
'winner' => translateGame($json['data'][$i]['rosters'][$k]['winner']),
'participants'[$j] => array(
'work' => 'it worked',
)
);
How can I make 'participants' an array with the indices coming from $j?
That's easy
$teamDetails[$k] = array(
'side' => $json['data'][$i]['rosters'][$k]['side'],
'gold' => $json['data'][$i]['rosters'][$k]['gold'],
'aces' => $json['data'][$i]['rosters'][$k]['aces_earned'],
'herokills' => $json['data'][$i]['rosters'][$k]['hero_kills'],
'winner' => translateGame($json['data'][$i]['rosters'][$k]['winner']),
'participants' => array(
$j => array(
'work' => 'it worked',
))
);
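One thing to watch for: the assignment above replaces the whole `$teamDetails[$k]` entry each time it runs. The loops are omitted in the question, so this is an assumption, but if the assignment sits inside the `$j` loop, only the last participant would survive. A minimal sketch of appending instead, using invented stand-in data (`$participants`, the names, and the `'name'` key are hypothetical):

```php
<?php
// Hypothetical stand-in for one roster's participants from the decoded JSON.
$participants = array('alice', 'bob', 'carol');

$k = 0;
$teamDetails = array();

// Set the per-team data once, with an empty 'participants' array...
$teamDetails[$k] = array(
    'side' => 'left',
    'participants' => array(),
);

// ...then add one entry per $j without touching the rest of the team data.
foreach ($participants as $j => $name) {
    $teamDetails[$k]['participants'][$j] = array(
        'work' => 'it worked',
        'name' => $name,
    );
}

print_r($teamDetails);
```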

MongoDB $exists not working

OK, I am not sure why this is not working. I know the field is there, because the mydetails field has sub-arrays in it.
function firsttime($uid){
$collection = static::db()->members;
var_dump($collection->findOne(array("_id"=> new MongoId($uid), array("mydetails"=> array('$exists' => true)))));
}
All it returns is NULL.
Is there a better way to find out whether a field exists or not?
In this example I want to see if the field mydetails exists.
It would be nice if I could get either a true or a false return.
Example data:
array (
'_id' => new MongoId("53b9ea3ae7fda8863c8b4568"),
'mydetails' =>
array (
'name' =>
array (
'first' => 'Russell',
'last' => 'Harrower',
),
'email' => 'hidden#ipet.xyz',
'birthday' =>
array (
'day' => '02',
'month' => '02',
'year' => '1988',
),
)
)
You have one array( too many in there. Try this:
$collection->findOne(array("_id"=> new MongoId($uid), "mydetails"=> array('$exists' => true)));
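To see why the extra `array(` matters, it can help to compare the keys of the two query arrays in plain PHP. The sketch below uses a plain string for `_id` instead of `MongoId` so that it runs without the driver; that substitution is purely for illustration.

```php
<?php
// Plain string used instead of MongoId so the snippet runs without the driver.
$uid = '53b9ea3ae7fda8863c8b4568';

// Query as written in the question: the second condition is an unkeyed
// nested array, so PHP stores it under the integer key 0.
$wrong = array("_id" => $uid, array("mydetails" => array('$exists' => true)));

// Corrected query: "mydetails" is a top-level key next to "_id".
$right = array("_id" => $uid, "mydetails" => array('$exists' => true));

print_r(array_keys($wrong)); // keys: _id, 0
print_r(array_keys($right)); // keys: _id, mydetails
```

With the first form, the query carries a condition on a field named 0 rather than on mydetails, which is consistent with nothing being found. For the true/false return the question asks about, comparing the `findOne` result against `null` should be enough, since `findOne` returns `NULL` when no document matches.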

PHP Mongo Aggregation only returns _id

I am trying to return a collection of messages grouped by in_reply_to field, I have this code:
$result = $this->db->Message->aggregate(
array(
array(
'$project' => array('message' => 1, 'in_reply_to'=> 1, 'to_user' => 1, 'from_user' => 1)
),
array(
'$group' => array('_id' => '$in_reply_to'),
),
)
);
print_r($result);exit;
the result is:
Array (
[result] => Array (
[0] => Array (
[_id] => MongoId Object (
[$id] => 53a03d43b3f7e236470041a8
)
)
[1] => Array (
[_id] => MongoId Object (
[$id] => 53a03cbdb3f7e2e8350041bb
)
)
)
[ok] => 1
)
Ideally I'd like the entire Message object, but I did think that $project would be used to specify the returned fields; even so, I don't get the fields I'm specifying.
Any help is greatly appreciated
In order to get all the messages in the thread, you basically want to $push them:
$result = $this->db->Message->aggregate(
array(
array(
'$group' => array(
'_id' => '$in_reply_to',
'messages' => array(
'$push' => array(
'_id' => '$_id',
'message' => '$message',
'to_user' => '$to_user',
'from_user' =>'$from_user'
)
)
)
)
)
);
With MongoDB 2.6 and later you have the $$ROOT variable, which shortens this:
$result = $this->db->Message->aggregate(
array(
array(
'$group' => array(
'_id' => '$in_reply_to',
'messages' => array(
'$push' => '$$ROOT'
)
)
)
)
);
So that puts all of the related messages inside the "messages" array tied to that key.
Just as side note, while you can do this you may as well just sort the results by your "in_reply_to" field and process them that way looking for changes in the value to indicate a new thread.
Sorting with a find would be the fastest way to process, even if it does not conveniently put everything right under the one key.
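The sort-and-detect-changes approach could look roughly like the sketch below. It uses plain arrays as a stand-in for the documents a sorted find would return; the sample documents and the `$threads` name are hypothetical.

```php
<?php
// Hypothetical documents, already sorted by in_reply_to, standing in for
// the results of a find sorted on that field.
$docs = array(
    array('in_reply_to' => 'a', 'message' => 'first'),
    array('in_reply_to' => 'a', 'message' => 'second'),
    array('in_reply_to' => 'b', 'message' => 'third'),
);

$threads = array();
$current = null;
foreach ($docs as $doc) {
    // A change in in_reply_to (or the very first document) starts a new thread.
    if (!$threads || $doc['in_reply_to'] !== $current) {
        $current = $doc['in_reply_to'];
        $threads[] = array('_id' => $current, 'messages' => array());
    }
    // Append the document to the thread opened most recently.
    $threads[count($threads) - 1]['messages'][] = $doc;
}

print_r($threads);
```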
If you want to get additional fields besides the _id field when using the $group operator, you need to include them using one of the available accumulators, such as $first or $last. You can see the full list on the MongoDB $group documentation page.
The query will look like this:
$result = $this->db->Message->aggregate(
array(
array(
'$project' => array(
'message' => 1,
'in_reply_to'=> 1,
'to_user' => 1,
'from_user' => 1
)
),
array(
'$group' => array(
'_id' => '$in_reply_to',
'message' => array('$first' => '$message'),
'to_user' => array('$first' => '$to_user'),
'from_user' => array('$first' => '$from_user')
),
),
)
);
If the message, to_user and from_user values are the same in all documents of a group, using $last instead of $first will produce the same results.

How do I find fuzzy duplicates from this php array?

Before I add clarification, here is some pseudo data. The array I need to iterate is like this:
$ipBodies = array(
'1.2.3.4' => array(
array('id' => 1, 'body' => 'asdfasdfasdf_X'),
array('id' => 2, 'body' => 'asdfasdfasdf_Y'),
array('id' => 3, 'body' => '123456789_X'),
array('id' => 4, 'body' => '123456789_Y'),
),
'5.6.7.8' => array(
array('id' => 13, 'body' => 'foobarbaz_X'),
array('id' => 14, 'body' => 'foobarbaz_Y'),
array('id' => 15, 'body' => 'adsflkjlsdfjlkjlkasdfj'),
array('id' => 16, 'body' => 'foobarbaz_Z'),
),
);
So from this sample data, you can see there are two sets of unique 'fuzzy duplicates' in the 1.2.3.4 array, and only 1 set of 'fuzzy duplicates' in the 5.6.7.8 array.
In the real data, everything is scaled up. The main array will have hundreds of ip addresses, and those arrays could have hundreds of members. Also the body section is larger in the real data.
I've considered that I need to run through each ip address array and create a new array of every combination to a new array, say $pairs, then run similar_text (seems to work well for this) on those to find duplicates, but creating these sets of pairs will be expensive I believe. I think the $pairs array count would end up being the factorial of the count of the array, which could become enormous as the array size increases.
I'm thinking I'd like to end up with an array $dupes that (based on the sample data above) should look like this:
$dupes = array(
'1.2.3.4' => array(
array('1', '2'),
array('3', '4'),
),
'5.6.7.8' => array(
array('13', '14', '16'),
),
);
I really just need some help and advice here so I can start solving the problem. God I hope my explanation made sense. If it didn't, let me know and I'll clarify.
If possible, I recommend using levenshtein instead of similar_text because it's a faster algorithm.
The complexity of the algorithm is O(m*n), where n and m are the
length of str1 and str2 (rather good when compared to similar_text(),
which is O(max(n,m)**3), but still expensive).
The code below uses an associative array to put each element into buckets where the $ip['body'] has a Levenshtein distance of < 2 (which means matches within the same bucket differ from the bucket's first body by at most 1 character; change as needed). Once all elements have been placed into their respective buckets, every bucket with only 1 element is discarded.
$ipBodies = array(
'1.2.3.4' => array(
array('id' => 1, 'body' => 'asdfasdfasdf_X'),
array('id' => 2, 'body' => 'asdfasdfasdf_Y'),
array('id' => 3, 'body' => '123456789_X'),
array('id' => 4, 'body' => '123456789_Y'),
),
'5.6.7.8' => array(
array('id' => 13, 'body' => 'foobarbaz_X'),
array('id' => 14, 'body' => 'foobarbaz_Y'),
array('id' => 15, 'body' => 'adsflkjlsdfjlkjlkasdfj'),
array('id' => 16, 'body' => 'foobarbaz_Z'),
),
);
$counts = [];
foreach($ipBodies as $groupName => $group) {
$counts[$groupName] = [];
foreach($group as $key => $ip) {
foreach($counts[$groupName] as $countGroup => $groupCount) {
if(levenshtein($ip['body'],$countGroup) < 2) {
$counts[$groupName][$countGroup][] = $ip['id'];
continue 2;
}
}
$counts[$groupName][$ip['body']] = [$ip['id']];
}
}
// remove buckets that contain just one id, then reindex each group
foreach ($counts as $groupName => $groupCounts) {
    $counts[$groupName] = array_values(array_filter($groupCounts, function ($ids) {
        return count($ids) > 1;
    }));
}
print_r($counts);
Output
Array
(
[1.2.3.4] => Array
(
[0] => Array
(
[0] => 1
[1] => 2
)
[1] => Array
(
[0] => 3
[1] => 4
)
)
[5.6.7.8] => Array
(
[0] => Array
(
[0] => 13
[1] => 14
[2] => 16
)
)
)
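One caveat for the larger bodies mentioned in the question: before PHP 8.0, levenshtein() returned -1 when either string was longer than 255 characters, and -1 would pass the `< 2` check above. A small guard, sketched here with a hypothetical helper name (`isFuzzyMatch()`) and a configurable threshold, avoids treating that error value as a match:

```php
<?php
// Returns true when the two strings are within $maxDistance edits of each
// other. A negative result (the pre-PHP 8 error value for over-long
// strings) is treated as "no match" rather than as a tiny distance.
function isFuzzyMatch($a, $b, $maxDistance = 1)
{
    $distance = levenshtein($a, $b);
    return $distance >= 0 && $distance <= $maxDistance;
}

var_dump(isFuzzyMatch('foobarbaz_X', 'foobarbaz_Y'));            // one substitution
var_dump(isFuzzyMatch('foobarbaz_X', 'adsflkjlsdfjlkjlkasdfj')); // very different
```

In the bucket loop above, the call `levenshtein($ip['body'], $countGroup) < 2` could be swapped for `isFuzzyMatch($ip['body'], $countGroup)`. On PHP versions before 8.0, bodies over 255 characters would then simply never match, so a different comparison would be needed for them.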
