How to simplify array into only unique values in Python


Because of array depth issues in PHP, this array arrives from Python truncated with an ellipsis ("..."). I'd like to process the array in Python before returning it to PHP.
Clarification: I need to keep the inner sets such as [135, 121, 81] intact. These are R, G, B values, and I'm trying to group sets that occur more than once. The values in each set need to stay together as [1, 2, 3] triples, NOT be flattened to [1,2,3,4,5,6,7,8] as some answers below have suggested.
How would you simplify this 3D numpy.ndarray to a collection of unique RGB triples?
Here is how the array is printed by Python:
[[[135 121 81]
[135 121 81]
[135 121 81]
...,
[135 121 81]
[135 121 81]
[135 121 81]]
[[135 121 81]
[135 121 81]
[135 121 81]
...,
[135 121 81]
[135 121 81]
[135 121 81]]
[[ 67 68 29]
[135 121 81]
[ 67 68 29]
...,
[135 121 81]
[135 121 81]
[135 121 81]]
...,
[[200 170 19]
[200 170 19]
[200 170 19]
...,
[ 67 68 29]
[ 67 68 29]
[ 67 68 29]]
[[200 170 19]
[200 170 19]
[200 170 19]
...,
[116 146 15]
[116 146 15]
[116 146 15]]
[[200 170 19]
[200 170 19]
[200 170 19]
...,
[116 146 15]
[116 146 15]
[116 146 15]]]
Here is the code that I have attempted:
def uniquify(arr)
keys = []
for c in arr:
if not c in keys:
keys[c] = 1
else:
keys[c] += 1
return keys
result = uniquify(items)

Based on the representation of your "array", it looks like you're working with a numpy.ndarray. This becomes quite a simple problem if that is the case -- you can transform it to a 1-D iterable simply by using the .flat attribute. To make it unique, you can just use a set:
set(array.flat)
This will give you a set, but you could easily get a list from it:
list(set(array.flat))
Here's how it works:
>>> array = np.zeros((10,12,42,53))
>>> list(set(array.flat))
[0.0]
As a side note, there's also np.unique which will give you the unique elements of your array as well.
>>> array = np.zeros((10,12),dtype=int)
>>> print array
[[0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0]]
>>> np.unique(array)
array([0])
>>> array[0,5] = 1
>>> array[4,10] = 42
>>> np.unique(array)
array([ 0, 1, 42])
I think I finally got this one figured out:
from itertools import product
items = set(tuple(array[idx+(slice(None),)]) for idx in product(*[range(x) for x in array.shape[:-1]]))
print(items)
Seems to work. Phew!
How this works -- the pieces that you want to keep as triplets are accessed as:
array[X,Y,:]
So, we just need to loop over all of the combinations of X and Y. That is exactly what itertools.product is good for. We can get the valid X and Y in an arbitrary number of dimensions:
[range(x) for x in array.shape[:-1]]
So we pass that to product:
indices_generator = product(*[range(x) for x in array.shape[:-1]])
Now we have something that will generate the first two indices -- we just need to construct a tuple to pass to __getitem__ that numpy will interpret as (X,Y,:). That's easy: we're already getting (X,Y) from indices_generator, so we just need to tack on an empty slice:
all_items = ( array[idx+(slice(None),)] for idx in indices_generator )
Now we can loop over all_items looking for the unique ones with a set:
unique_items = set(tuple(item) for item in all_items)
Now turn this back into a list, or a numpy array or whatever you want for the purposes of passing it back to PHP.
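The index-product idea above can be tried without numpy. This is only a minimal sketch: it assumes a small nested list (the made-up `pixels` below) stands in for the ndarray, so `pixels[x][y]` plays the role of `array[X,Y,:]`.

```python
from itertools import product

# Hypothetical stand-in for the ndarray: a nested list with "shape" (2, 3, 3).
pixels = [
    [[135, 121, 81], [135, 121, 81], [67, 68, 29]],
    [[200, 170, 19], [67, 68, 29], [200, 170, 19]],
]

# Sizes of every axis except the last one (the RGB axis).
leading_shape = (len(pixels), len(pixels[0]))

# product(range(2), range(3)) yields (0, 0), (0, 1), ..., (1, 2).
unique_triples = {
    tuple(pixels[x][y])
    for x, y in product(*(range(n) for n in leading_shape))
}

print(sorted(unique_triples))
```

Each triple is converted to a tuple because lists are not hashable and so cannot be stored in a set.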

Look at the recipes in the itertools documentation. There are flatten and unique_everseen functions that do exactly what you want.
So, you can copy and paste them. Or you can just pip install more-itertools so you just import them. Now, you can flatten the 3D array to 2D, and uniquify the 2D array with unique_everseen…
Except for one problem. The elements of your 2D array are lists, which are not hashable, so you have to convert them to something hashable. But that's easy:
from more_itertools import flatten, unique_everseen
def uniquify(arr3d):
    return unique_everseen(flatten(arr3d), tuple)
That's it.
And if you look at the implementations of those functions while you're pasting them, they're pretty simple. The only real trick here is using a set to hold the values seen so far: sets only hold one copy of each unique element (and can determine whether an element is already in the set very quickly).
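To make the "set of values seen so far" trick concrete, here is a hand-rolled sketch of the same idea. It is not the more-itertools implementation; the helper name `unique_triples_in_order` and the sample data are made up for illustration.

```python
def unique_triples_in_order(arr3d):
    """Yield each inner triple once, in first-seen order (hypothetical helper)."""
    seen = set()                      # hashable keys of triples already yielded
    for row in arr3d:                 # walk the outer two levels explicitly
        for triple in row:
            key = tuple(triple)       # lists are unhashable, tuples are not
            if key not in seen:
                seen.add(key)
                yield list(triple)


sample = [
    [[135, 121, 81], [135, 121, 81]],
    [[67, 68, 29], [135, 121, 81]],
    [[200, 170, 19], [67, 68, 29]],
]
result = list(unique_triples_in_order(sample))
print(result)
```

Membership tests on a set take roughly constant time, which is why this stays fast even when the list of distinct colours grows.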
In fact, if you don't need to preserve the ordering, it's even simpler:
def uniquify(arr3d):
    return set(tuple(x) for x in flatten(arr3d))
As a test, I copied your string and turned it into an actual Python list display, then did this:
inarray = [[[135, 121, 81],
[135, 121, 81],
[135, 121, 81],
[135, 121, 81],
[135, 121, 81],
[135, 121, 81]],
[[135, 121, 81],
[135, 121, 81],
[135, 121, 81],
[135, 121, 81],
[135, 121, 81],
[135, 121, 81]],
[[67, 68, 29],
[135, 121, 81],
[67, 68, 29],
[135, 121, 81],
[135, 121, 81],
[135, 121, 81]],
[[200, 170, 19],
[200, 170, 19],
[200, 170, 19],
[67, 68, 29],
[67, 68, 29],
[67, 68, 29]],
[[200, 170, 19],
[200, 170, 19],
[200, 170, 19],
[116, 146, 15],
[116, 146, 15],
[116, 146, 15]],
[[200, 170, 19],
[200, 170, 19],
[200, 170, 19],
[116, 146, 15],
[116, 146, 15],
[116, 146, 15]]]
for val in uniquify(inarray):
    print(val)
The output was:
[135, 121, 81]
[67, 68, 29]
[200, 170, 19]
[116, 146, 15]
Is that what you wanted?
If you want it as a list of lists, that's just:
array2d = list(uniquify(array3d))
If you used a plain set instead of unique_everseen, these will be tuples instead of lists, so if you need a list of lists:
array2d = [list(val) for val in uniquify(array3d)]

Assuming the Python list looks like [[[1,2,3], [4,5,6]], [[7,8,9]]] (that is, a list of lists of lists of integers):
mylist = [[[1,2,3], [4,5,6]], [[7,8,9]]]
items = set()
for sublist in mylist:
    for subsublist in sublist:
        for item in subsublist:
            items.add(item)
If you then specifically need a list, you can just cast it as so: items = list(items)
A set is a datatype that is similar to a list, but doesn't contain duplicates.
A side effect of the set datatype is that insertion order isn't preserved. If order is important to you, you'll need something like:
mylist = [[[1,2,3], [4,5,6]], [[7,8,9]]]
items = []
for sublist in mylist:
    for subsublist in sublist:
        for item in subsublist:
            if item not in items:
                items.append(item)
Edit: based on your edit, you probably want this:
mylist = [[[1,2,3], [4,5,6]], [[7,8,9], [1,2,3]]]
items = []
for sublist in mylist:
    for item in sublist:
        if item not in items:
            items.append(item)
# items = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

itertools is your friend here:
>>> import itertools
>>> array = [1,1,1,2,2,2,3,3,3,4,5,6,6,6]
>>> [x[0] for x in itertools.groupby(array)]
[1, 2, 3, 4, 5, 6]
For example:
array = [[[135,121,81],
[135,121,81],
[135,121,81],
[135,121,81],
[135,121,81],
[135,121,81]],
[[135,121,81],
[135,121,81],
[135,121,81],
[135,121,81],
[135,121,81],
[135,121,81]],
[[67,68,29],
[135,121,81],
[67,68,29],
[135,121,81],
[135,121,81],
[135,121,81]]]
import itertools
new_array = list()
for inner in array:
    new_inner = [x[0] for x in itertools.groupby(inner)]
    new_array.append(new_inner)
Produces:
[ [ [135, 121, 81] ],
[ [135, 121, 81] ],
[ [67, 68, 29],
[135, 121, 81],
[67, 68, 29],
[135, 121, 81] ] ]
Not quite unique, because groupby only collapses consecutive duplicates, but you can sort inner first to get only unique values.
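A rough sketch of that sort-then-group step follows. It assumes it is acceptable to pool all triples into one list and to lose the original order; the variable names are illustrative only.

```python
from itertools import chain, groupby

array = [
    [[135, 121, 81], [135, 121, 81]],
    [[67, 68, 29], [135, 121, 81]],
    [[67, 68, 29], [135, 121, 81]],
]

# Flatten one level so every triple sits in a single list, then sort so that
# equal triples become neighbours (lists compare element by element).
triples = sorted(chain.from_iterable(array))

# groupby now sees each distinct triple as exactly one run.
unique = [key for key, _group in groupby(triples)]
print(unique)
```

Sorting costs O(n log n), so for large images a set of tuples is likely the cheaper route; this version is mainly useful when sorted output is wanted anyway.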

Related

How to decode from an unknown encoding in PHP?

My question is different from this one. I am trying to fix a broken encoding but I don't know how to proceed.
In my database I have this name:
mysql> select filename from file WHERE filename LIKE 'MAC%';
+-------------------------------------------------+
| filename |
+-------------------------------------------------+
| MAC-1600PVå–æ‰±èª¬æ˜Žæ›¸.pdf |
+-------------------------------------------------+
1 row in set (0.00 sec)
But on my filesystem the file is named:
$ ls files/*MAC*
files/MAC-1600PV取扱説明書.pdf
I have tried to unpack both strings from PHP and the content differ:
The utf-8 sequence read from the filesystem:
=> "MAC-1600PV取扱説明書"
>>> unpack('C*', $u)
...
7 => 48,
8 => 48,
9 => 80,
10 => 86,
11 => 195,
12 => 165,
13 => 226,
14 => 128,
15 => 147,
16 => 195,
17 => 166,
18 => 226,
And for the one read from the database:
...
7 => 48,
8 => 48,
9 => 80,
10 => 86,
11 => 229,
12 => 143,
13 => 150,
14 => 230,
15 => 137,
16 => 177,
So at some-point I lost the original encoding and I have no clue of how to fix my database which is in utf8mb4.
Any advice?

Adding values together based on their position in PHP

Each number has a corresponding value. There are many numbers, which I can demonstrate in a table here with their values:
[N] [V] N=Number V=Value
2 19
4 19
6 19
8 21
10 21
12 22
14 23
16 23
18 23
20 33
22 37
24 42
26 45
28 48
30 50
32 55
34 61
36 66
38 72
40 78
42 155
44 179
46 202
48 233
50 360
There is a process in which a user goes from Number x to Number y. The values from x through y (both ends included) need to be added together. So for example, let's say a user goes from 16 to 38:
[N] [V] N=Number V=Value
2 19
4 19
6 19
8 21
10 21
12 22
14 23
[16][23]--
18 23 |
20 33 |
22 37 |
24 42 |
26 45 |
28 48 |---- All of these values get added together
30 50 |
32 55 |
34 61 |
36 66 |
[38][72]--
40 78
42 155
44 179
46 202
48 233
50 360
So the user's total value would be:
23 + 23 + 33 + 37 + 42 + 45 + 48 + 50 + 55 + 61 + 66 + 72
Total Value = 555
The problem is that I have no idea how to put this together in code: how to assign these values to their specific numbers, and how to add those specific values together to get a result. In PHP I simply do not know where to begin with this.
Also, the approximate values from the numbers can be represented by this equation:
v = 11.218e^(0.057n)
I would imagine this would be useful in making this whole process easier, but I am still not sure how to go about implementing all of this. Any help would be very much appreciated!

Put each number and its corresponding value into an array, using the number as the key and the value as the array value, like this.
<?php
$arr = array(
2=> 19,
4=> 19,
6=> 19,
8=> 21,
10=> 21,
12=> 22,
14=> 23,
16=> 23,
18=> 23,
20=> 33,
22=> 37,
24=> 42,
26=> 45,
28=> 48,
30=> 50,
32=> 55,
34=> 61,
36=> 66,
38=> 72,
40=> 78,
42=> 155,
44=> 179,
46=> 202,
48=> 233,
50=> 360,
);
?>
Loop array with foreach loop like this
<?php
$sum = 0;
foreach($arr as $k => $v) {
    if($k >= 16 && $k <= 38)
        $sum += $v;
}
?>
Another way is to use a for loop: put the numbers and the values in two separate arrays ($n and $v), iterate over the first array ($n), and look up the matching value in the second array ($v) by the same index. Both arrays must have the same number of elements.
Example-
<?php
$n = array(2,4,6,8,10,12,14,16,18,20);
$v = array(19,19,19,21,21,22,23,23,23,33);
$sum = 0;
for($i=0; $i<count($n); $i++) {
    if($n[$i] >= 16 && $n[$i] <= 38)
        $sum += $v[$i];
}
?>

You would put your number and value pairs into a key/value array. So a shortened version of your test data would look like this:
$myDataStore = array(
"2" => "19",
"4" => "19",
"6" => "19",
"8" => "21",
"10" => "21",
"12" => "22",
"14" => "23",
"16" => "23",
"18" => "23",
"20" => "33"
);
Now you need a function to calculate your sum given a range as defined by starting and ending numbers.
function getRangeTotal($array, $startNumber, $endNumber){
    $total = 0;
    foreach($array as $key => $value){
        if($key >= $startNumber && $key <= $endNumber){
            $total = $total + $value;
        }
    }
    return $total;
}
If you run the above function
getRangeTotal($myDataStore, 6, 12);
You'll get 83
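For readers coming from the Python question at the top of this page, the same range sum can be sketched in Python. This is only an illustration: `range_total` and `data` are made-up names, and the dictionary simply repeats the table from the question.

```python
def range_total(values_by_number, start, end):
    """Sum the values whose number lies in [start, end], both ends included."""
    return sum(
        value
        for number, value in values_by_number.items()
        if start <= number <= end
    )


# The number -> value table from the question.
data = {
    2: 19, 4: 19, 6: 19, 8: 21, 10: 21, 12: 22, 14: 23, 16: 23, 18: 23,
    20: 33, 22: 37, 24: 42, 26: 45, 28: 48, 30: 50, 32: 55, 34: 61,
    36: 66, 38: 72, 40: 78, 42: 155, 44: 179, 46: 202, 48: 233, 50: 360,
}

print(range_total(data, 16, 38))  # 555, the total worked out in the question
print(range_total(data, 6, 12))   # 83, matching the PHP example above
```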

Here is how you can do this using foreach
// first store your data in an array.
$kv = array(
2 => 19, 4 => 19, 6 => 19, 8 => 21,
10 => 21, 12 => 22, 14 => 23, 16 => 23,
18 => 23, 20 => 33, 22 => 37, 24 => 42,
26 => 45, 28 => 48, 30 => 50, 32 => 55,
34 => 61, 36 => 66, 38 => 72, 40 => 78,
42 => 155, 44 => 179, 46 => 202,48 => 233,
50 => 360
);
$start = 16;
$end = 32;
$sum = 0; //variable to keep sum
$n = 0; //variable to keep count
//loop through the array
foreach ($kv as $k => $v){
    if ($k >= $start && $k <= $end){ //if key is in your range
        $sum += $v; //add value to sum
        $n++; // increment count
    }
}
$v = 11.218*pow(M_E,0.057*$n); //calculate the approximate values
echo "$sum\n$v\n";
Also refer to : pow and Predefined Constants

PHP MySQL query (sum from different rows)

I have a small MySQL database for an online booking calendar. Now I wanted to get some stats out of this calendar. I am adding to each entry in the calendar values (full paid, partly paid and a status (paid, unpaid, reserved, etc)).
I have attached an image of the screenshot. As you can see, there are 4 different custom_attribute_ids. ID 1 stores the status, ID 2 the full price and ID 3 the price already paid. The entity_id column ties the rows together, so e.g. all 4 entries with entity_id 232 belong together.
I now want to display the following:
1. The Sum of all full prices (so custom_attribute_id 2). This I have done with this code:
$result = mysql_query('SELECT SUM(attribute_value) AS value_sum
FROM custom_attribute_values WHERE custom_attribute_id=2');
$row = mysql_fetch_assoc($result);
$sum = $row['value_sum'];
This works and shows me the sum of all full prices entered in the calendar.
With the same code I am also showing the sum of the partly paid amounts.
Now the problem: I want to show the sum of the attribute_value depending on the status. So the code should sum all values where custom_attribute_id=2 AND the status (the attribute_value with custom_attribute_id=1) of the same entity_id is "Reserviert".
It would be very nice if somebody could help me, or at least let me know whether this is possible. I am not able to redesign the database, as it is dictated by the calendar system.
Here the db as text:
ID    custom_attribute_value_id  attribute_value   entity_id  attribute_category
1124  1                          Anfrage           233        1
1125  2                          1188              233        1
1126  4                          (empty)           233        1
1127  3                          015757817858      233        1
1053  1                          Reserviert        232        1
1054  2                          1700              232        1
1057  3                          017697544266      232        1
1058  4                          (empty)           232        1
1039  2                          573               231        1
1040  3                          088259216300      231        1
1042  1                          Reserviert        231        1
1037  3                          0043676845617203  230        1
1045  2                          2346,50           230        1
1046  1                          Reserviert        230        1
1032  1                          Anfrage           229        1
1033  2                          474               229        1
1034  3                          (empty)           229        1
1027  1                          Anfrage           228        1
1029  3                          (empty)           228        1
1030  2                          588,50            228        1
1024  3                          01729843043       227        1
1025  1                          Reserviert        227        1
1023  2                          990               227        1

You need a self join so you can have both status and related full price in the same row for each entity. Basically you take a part of the table with only status data and another part of the table with only full price data and join them on the same entity id. On the resulting set you can easily get the sum while using WHERE to restrict rows to those with 'Reserviert' status.
SELECT SUM(t2.attribute_value) AS value_sum
FROM custom_attribute_values t1 JOIN custom_attribute_values t2
ON t1.entity_id = t2.entity_id
AND t1.custom_attribute_id = 1
AND t2.custom_attribute_id = 2
WHERE t1.attribute_value = 'Reserviert'
In this query attribute_value holds status info for t1 and full price for t2.
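The self join can be tried out quickly with an in-memory database. This is a sketch under stated assumptions rather than the production setup: it uses Python's built-in sqlite3 module instead of MySQL, loads only the status and full-price rows from the question, writes prices with a decimal point (2346.50 rather than 2346,50), and adds an explicit CAST because the column is stored as text.

```python
import sqlite3

# (id, custom_attribute_id, attribute_value, entity_id): subset of the
# question's rows, with prices written using a decimal point (assumption).
rows = [
    (1124, 1, "Anfrage", 233), (1125, 2, "1188", 233),
    (1053, 1, "Reserviert", 232), (1054, 2, "1700", 232),
    (1042, 1, "Reserviert", 231), (1039, 2, "573", 231),
    (1046, 1, "Reserviert", 230), (1045, 2, "2346.50", 230),
    (1025, 1, "Reserviert", 227), (1023, 2, "990", 227),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE custom_attribute_values ("
    "id INTEGER, custom_attribute_id INTEGER, "
    "attribute_value TEXT, entity_id INTEGER)"
)
conn.executemany("INSERT INTO custom_attribute_values VALUES (?, ?, ?, ?)", rows)

# t1 supplies the status row, t2 the full-price row of the same entity.
total = conn.execute(
    """
    SELECT SUM(CAST(t2.attribute_value AS REAL)) AS value_sum
    FROM custom_attribute_values t1
    JOIN custom_attribute_values t2
      ON t1.entity_id = t2.entity_id
     AND t1.custom_attribute_id = 1
     AND t2.custom_attribute_id = 2
    WHERE t1.attribute_value = 'Reserviert'
    """
).fetchone()[0]
conn.close()

print(total)  # 1700 + 573 + 2346.5 + 990 = 5609.5
```

Prices stored with a decimal comma, as in the question's table, would need to be normalised before summing; how the real calendar system stores them is not known here.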

This should be
$result = mysql_query('SELECT SUM(attribute_value) AS value_sum
FROM custom_attribute_values
WHERE custom_attribute_id=2 AND entity_id IN (
SELECT entity_id FROM custom_attribute_values
WHERE custom_attribute_id=1 AND attribute_value=\'Reserviert\')');

You could use a sub select to get the value like this
SELECT entity AS e,
(SELECT SUM(value) FROM custom_attribute_values WHERE entity = e AND attribute = 2) AS s
FROM custom_attribute_values
WHERE value = 'Reserviert'
Your table and column names weren't clear so I made some up...
The first SELECT picks all the 'Reserviert' entities then the sub select sums the values associated with each picked entity.
This is the data I used, with names serial, attribute, value, entity, x
1124, 1, Anfrage, 233, 1
1125, 2, 1188, 233, 1
1126, 4, , 233, 1
1127, 3, 015757817858, 233, 1
1053, 1, Reserviert, 232, 1
1054, 2, 1700, 232, 1
1057, 3, 017697544266, 232, 1
1058, 4, , 232, 1
1039, 2, 573, 231, 1
1040, 3, 088259216300, 231, 1
1042, 1, Reserviert, 231, 1
1037, 3, 004367684561, 230, 1
1045, 2, 2346.50, 230, 1
1046, 1, Reserviert, 230, 1
1032, 1, Anfrage, 229, 1
1033, 2, 474, 229, 1
1034, 3, , 229, 1
1027, 1, Anfrage, 228, 1
1029, 3, , 228, 1
1030, 2, 588.50, 228, 1
1024, 3, 01729843043, 227, 1
1025, 1, Reserviert, 227, 1
1023, 2, 990, 227, 1

PHP: Most frequent value in array

So I have this JSON Array:
[0] => 238
[1] => 7
[2] => 86
[3] => 79
[4] => 55
[5] => 92
[6] => 55
[7] => 7
[8] => 254
[9] => 9
[10] => 75
[11] => 238
[12] => 89
[13] => 238
I will have more values in the actual JSON file. But by looking at this I can see that 238 and 55 are repeated more often than most other numbers. What I want to do is get the top 5 most repeated values in the array and store them in a new PHP array.

$values = array_count_values($array);
arsort($values);
$popular = array_slice(array_keys($values), 0, 5, true);
array_count_values() gets the count of the number of times each item appears in an array
arsort() sorts the array by number of occurrences in reverse order
array_keys() gets the actual value which is the array key in the results from array_count_values()
array_slice() gives us the first five elements of the results
Demo
$array = [1,2,3,4,238, 7, 86, 79, 55, 92, 55, 7, 254, 9, 75, 238, 89, 238];
$values = array_count_values($array);
arsort($values);
$popular = array_slice(array_keys($values), 0, 5, true);
array (
0 => 238,
1 => 55,
2 => 7,
3 => 4,
4 => 3,
)

The key is to use something like array_count_values() to tally up the number of occurrences of each value.
<?php
$array = [238, 7, 86, 79, 55, 92, 55, 7, 254, 9, 75, 238, 89, 238];
// Get array of (value => count) pairs, sorted by descending count
$counts = array_count_values($array);
arsort($counts);
// array(238 => 3, 55 => 2, 7 => 2, 75 => 1, 89 => 1, 9 => 1, ...)
// An array with the first (top) 5 counts
$top_with_count = array_slice($counts, 0, 5, true);
// array(238 => 3, 55 => 2, 7 => 2, 75 => 1, 89 => 1)
// An array with just the values
$top = array_keys($top_with_count);
// array(238, 55, 7, 75, 89)
?>
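For comparison with the Python question at the top of this page, the same tally-then-slice idea can be sketched with collections.Counter from the standard library. The variable names are illustrative; the order of values with equal counts follows first appearance in Python 3.7+ and may differ from what PHP's arsort() produces.

```python
from collections import Counter

values = [238, 7, 86, 79, 55, 92, 55, 7, 254, 9, 75, 238, 89, 238]

# Counter plays the role of array_count_values(): value -> occurrences.
counts = Counter(values)

# most_common(5) returns the five (value, count) pairs with the highest counts.
top_with_count = counts.most_common(5)

# Keep just the values, like array_keys() in the PHP version.
top = [value for value, _count in top_with_count]
print(top)
```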

Rank array values with potential duplicate values and skipping some positions if there is a tie

I am working with database data that manipulates college students exam results. Basically, I am pulling the records from a MySQL database and pulling one class at any given time. I want to rank the students with the highest performer given the rank of 1.
Here is an illustration;
Marks: 37, 92, 84, 83, 84, 65, 41, 38, 38, 84.
I want to capture MySQL data as a single array. Once I have the data in an array, I should then assign each student a position in the class such as 1/10 (number 1, the 92 score), 4/10 etc. Now the problem is that if there is a tie, then the next score skips a position and if there are 3 scores at one position then the next score skips 2 positions. So the scores above would be ranked as follows;
92 - 1
84 - 2,
84 - 2,
84 - 2,
83 - 5,
65 - 6,
41 - 7,
38 - 8,
38 - 8 ,
37 - 10
The grading system requires that the number of positions (ranks, if you will) be maintained, so we end up with 10 positions in this class, with positions 3, 4 and 9 left unoccupied. (The alternative of filling every number would have given us only 7 positions!)
Is it possible (humanly/programmatically possible) to use PHP to rank the scores above in such a way that it can handle possible ties such as 4 scores at one position? Sadly, I could not come up with a function to do this. I need a PHP function (or something in PHP) that will take an array and produce a ranking as above.
If it's possible to do this with MySQL query data without having it in an array, then that will also be helpful!

I assume the grades are already sorted in descending order by the database; otherwise use rsort($grades);.
Code:
$grades = array(92, 84, 84, 84, 83, 65, 41, 38, 38, 37);
$occurrences = array_count_values($grades);
$grades = array_unique($grades);
$i = 0; // number of scores already ranked
foreach($grades as $grade) {
    echo str_repeat($grade .' - '.($i+1).'<br>',$occurrences[$grade]);
    $i += $occurrences[$grade];
}
Result:
92 - 1
84 - 2
84 - 2
84 - 2
83 - 5
65 - 6
41 - 7
38 - 8
38 - 8
37 - 10
EDIT (Response to discussion below)
Apparently, in case the tie occurs at the lowest score,
the rank of all lowest scores should be equal to the total count of scores.
Code:
$grades = array(92, 84, 84, 84, 83, 65, 41, 38, 37, 37);
$occurrences = array_count_values($grades);
$grades = array_unique($grades);
$i = 0; // number of scores already ranked
foreach($grades as $grade) {
    if($grade == end($grades))$i += $occurrences[$grade]-1;
    echo str_repeat($grade .' - '.($i+1).'<br>',$occurrences[$grade]);
    $i += $occurrences[$grade];
}
Result:
92 - 1
84 - 2
84 - 2
84 - 2
83 - 5
65 - 6
41 - 7
38 - 8
37 - 10
37 - 10

$scores = array(92, 84, 84, 84, 83, 65, 41, 38, 38, 37);
$ranks = array(1);
for ($i = 1; $i < count($scores); $i++)
{
    if ($scores[$i] != $scores[$i-1])
        $ranks[$i] = $i + 1;
    else
        $ranks[$i] = $ranks[$i-1];
}
print_r($ranks);

I needed to end up with a map of values to rank. This method may be more efficient for the original question too.
public static function getGrades($grades)
{
    $occurrences = array_count_values($grades);
    krsort($occurrences);
    $position = 1;
    foreach ($occurrences as $score => $count) {
        $occurrences[$score] = $position;
        $position += $count;
    }
    return $occurrences;
}
If you print_r on $occurrences you get
Array
(
[92] => 1
[84] => 2
[83] => 5
[65] => 6
[41] => 7
[38] => 8
[37] => 10
)
Based on the original answer, so thanks!
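The same score-to-rank map can be sketched in Python, the language of the question at the top of this page. The function name `gapped_ranks` is made up for illustration, and the input is the unsorted list of marks from the question.

```python
def gapped_ranks(scores):
    """Map each distinct score to its rank, leaving gaps after ties."""
    ranks = {}
    ordered = sorted(scores, reverse=True)          # highest score first
    for position, score in enumerate(ordered, start=1):
        # Only the first (best) position of a tied score is recorded.
        ranks.setdefault(score, position)
    return ranks


marks = [37, 92, 84, 83, 84, 65, 41, 38, 38, 84]
ranking = gapped_ranks(marks)
for score, rank in ranking.items():
    print(f"{score} - {rank}/{len(marks)}")
```

Because the position counter advances for every score, including tied ones, the rank after a three-way tie at 2 is 5, which is the skipping behaviour the question asks for.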

Using array_count_values() followed by a foreach() makes two passes over the input array, but this task can be done with one loop, which avoids the extra pass.
Code: (Demo)
// assumed already rsort()ed.
$scores = [92, 84, 84, 84, 83, 65, 41, 38, 38, 37];
$gappedRank = 0;
$result = [];
foreach ($scores as $score) {
    ++$gappedRank;
    $gappedRanks[$score] ??= $gappedRank;
    $result[] = [$score => $gappedRanks[$score]];
}
var_export($result);
For a flat, associative lookup array of scores and their rank, unconditionally increment the counter and only push a new element into the lookup array if the key will be new. (Demo)
$gappedRank = 0;
$lookup = [];
foreach ($scores as $score) {
    ++$gappedRank;
    $lookup[$score] ??= $gappedRank;
}
var_export($lookup);
The first snippet provides "gapped ranking". I have another answer which implements a similar approach but with a different input data structure and with the intent of modifying row data while looping.
Get dense rank and gapped rank for all items in array
In the realm of ranking, there is also "dense ranking". See my time complexity optimized answers at:
Populate multidimensional array's rank column with dense rank number
Add order column to array to indicate rank from oldest to youngest
