Get single column from 2d array [duplicate] - php

This is my array
Array
(
[0] => Array
(
[sample_id] => 3
[time] => 2010-05-30 21:11:47
)
[1] => Array
(
[sample_id] => 2
[time] => 2010-05-30 21:11:47
)
[2] => Array
(
[sample_id] => 1
[time] => 2010-05-30 21:11:47
)
)
And I want to get all the sample_ids in one array. Can someone please help?
Can this be done without for loops (because the arrays are very large)?

$ids = array_map(function($el){return $el["sample_id"];}, $array);
Or in versions before PHP 5.3, which have no anonymous functions:
function get_sample_id($el){return $el["sample_id"];}
$ids = array_map('get_sample_id', $array);
However, this is probably not going to be faster.
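For reference, PHP 5.5 and later ship a built-in for exactly this job, array_column(). A minimal sketch, assuming the data is shaped like the array in the question ($array and $ids are the names used above):

```php
<?php
// Sample data shaped like the array in the question.
$array = array(
    array('sample_id' => 3, 'time' => '2010-05-30 21:11:47'),
    array('sample_id' => 2, 'time' => '2010-05-30 21:11:47'),
    array('sample_id' => 1, 'time' => '2010-05-30 21:11:47'),
);

// array_column() (PHP >= 5.5) pulls one column out of a list of rows.
$ids = array_column($array, 'sample_id');

print_r($ids);
```

It still walks the whole array internally, but there is no per-element callback into PHP code.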

This is a problem I've had MANY times. There isn't an easy way to flatten arrays in PHP (at least not before PHP 5.5 added array_column()). You'll have to loop over them, adding the values to another array. Failing that, rethink how you work with the data so that you can use the original structure and don't need to flatten it.
EDIT: I thought I'd add some metrics. I created an array $data = array(array('key' => value, 'value' => other_value), ...); with 150,000 elements, then ran the three typical ways of flattening it:
$start = microtime(true); // true returns a float, so $end - $start is the elapsed time
$values = array_map(function($ele){return $ele['key'];}, $data);
$end = microtime(true);
This produced a run time of 0.304405; five runs averaged just below 0.30.
$start = microtime(true);
$values = array();
foreach ($data as $value) {
    $values[] = $value['key'];
}
$end = microtime(true);
This produced a run time of 0.167301, with an average of 0.165.
$start = microtime(true);
$values = array();
for ($i = 0; $i < count($data); $i++) { // note: count() is re-evaluated on every iteration
    $values[] = $data[$i]['key'];
}
$end = microtime(true);
This produced a run time of 0.353524, with an average of 0.355.
In every case, using a foreach on the data array was significantly faster. This is likely related to the overhead of executing a function for each element of the array in the array_map() implementation.
Further Edit: I ran this testing with a predefined function. Below are the average numbers over 10 iterations for 'On the Fly' (defined inline) and 'Pre Defined' (string lookup).
Averages:
On the fly: 0.29714539051056
Pre Defined: 0.31916437149048

No array manipulation can be done without a loop.
If you can't see a loop, that doesn't mean it's absent.
I can write a function
function array_summ($array) {
    $ret = 0;
    foreach ($array as $value) $ret += $value;
    return $ret;
}
and then call array_summ($arr) without any visible loop. But don't be fooled by this: there is a loop. Every PHP array function iterates over the array as well; you just don't see it.
So the real solution to look for is not a magic function but a way to reduce these arrays.
Where did the data come from? From a database, most likely.
Consider making the database do all the calculations.
It will save you much more time than any PHP built-in function.

I don't think you'll have any luck doing this without loops. If you really don't want to iterate the whole structure, I'd consider looking for a way to alter your circumstances...
Can you generate the sample_id data structure at the same time the larger array is created?
Do you really need an array of sample_id entries, or is that just a means to an end? Maybe there's a way to wrap the data in a class that uses a cache and a cursor to keep from iterating the whole thing when you only need certain pieces?

Related

PHP Find Gaps over the course of N years across multiple date ranges

Taking a multidimensional array such as
array(
array('begin' => '2006-01-01', 'finish' => '2006-02-28'),
array('begin' => '2006-03-01', 'finish' => '2006-06-30'),
array('begin' => '2006-08-01', 'finish' => '2007-12-30'),
array('begin' => '2007-01-01', 'finish' => '2016-12-30'),
);
I am trying to figure out the best way to process N such arrays with varying ranges and overlaps to see if there are any gaps over the course of N years. My current requirement is down to the month, but I simply cannot wrap my head around this without going through a series of nested foreaches that ultimately paint me into a corner and are way too expensive to process on bigger data sets.
This code should run in O(n) and assumes that the array is ordered as in the example.
I relied on the fact that this date format can be compared as strings, with the same result as if you had worked with full date objects.
// $a is your multidimensional array as above
$gap = array();
for($i=1; $i<sizeof($a); $i++){
$gap[$i] = $a[$i-1]['finish'] < $a[$i]['begin'];
}
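$gap contains an array of booleans indicating at which indexes there is a gap. Note that the strict comparison also flags back-to-back ranges (finish 2006-02-28, begin 2006-03-01) as a gap, and that only the immediately preceding range is checked, so an earlier range that extends further is not taken into account.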
I'm not going to suggest an actual implementation, but more a general approach.
You can usually look at this kind of problem in two ways:
Make super smart code. Craft an algorithm that recursively merges the ranges that overlap, leaving you with an array of discrete ranges. You have gaps if you have more than one row in the array, and the gaps lie between the end of one range and the start of the next. The keyword here is recursively.
Make super dumb code. Build an assoc array with all the months of your interval (that can't be very many, even 10 years is only 120 months) as keys, and "true" as value. Iterate your array, and set the months that appear in a range to "false". Use array_filter and ta-dah! You're left with the months having gaps. The key here is to not use date related functions (they're slow) and instead just go at it arithmetically.
Hope this helps put you on the right track.
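The "super dumb" approach could be sketched roughly as follows. This is only an illustration under stated assumptions: monthIndex() and findGapMonths() are hypothetical names, dates are assumed to be well-formed 'YYYY-MM-DD' strings, a month counts as covered as soon as any range touches it, and intdiv() requires PHP 7.

```php
<?php
// Hypothetical helper: turns 'YYYY-MM-DD' into a running month number
// (year * 12 + zero-based month) using plain string slicing, no date functions.
function monthIndex($date)
{
    return (int) substr($date, 0, 4) * 12 + (int) substr($date, 5, 2) - 1;
}

// Returns the months ('YYYY-MM') inside [$from, $to] not touched by any range.
function findGapMonths(array $ranges, $from, $to)
{
    $first = monthIndex($from);
    $last = monthIndex($to);
    if ($last < $first) {
        return array();
    }

    // Every month of the interval starts out as a gap (true).
    $isGap = array_fill($first, $last - $first + 1, true);

    foreach ($ranges as $range) {
        $begin = max($first, monthIndex($range['begin']));
        $finish = min($last, monthIndex($range['finish']));
        for ($m = $begin; $m <= $finish; $m++) {
            $isGap[$m] = false; // month is covered by this range
        }
    }

    // array_filter() keeps only the entries that are still true.
    $gaps = array();
    foreach (array_keys(array_filter($isGap)) as $m) {
        $gaps[] = sprintf('%04d-%02d', intdiv($m, 12), $m % 12 + 1);
    }
    return $gaps;
}

$ranges = array(
    array('begin' => '2006-01-01', 'finish' => '2006-02-28'),
    array('begin' => '2006-03-01', 'finish' => '2006-06-30'),
    array('begin' => '2006-08-01', 'finish' => '2007-12-30'),
    array('begin' => '2007-01-01', 'finish' => '2016-12-30'),
);

print_r(findGapMonths($ranges, '2006-01-01', '2016-12-31'));
```

For the sample data from the question, the only uncovered month is July 2006. The input does not need to be sorted for this to work.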
Not widely tested and needs some refactoring, but this is the best algorithm I could come up with (I've done something similar with database actually):
define('DAY_SEC', 86400);
$result = [];
foreach ($dates as $date) {
$begin = strtotime($date['begin']);
$finish = strtotime($date['finish']);
$merged = null;
foreach ($result as $idx => $span) {
if ($span['begin'] <= $finish + DAY_SEC && $span['finish'] >= $begin - DAY_SEC) {
if (isset($merged)) {
$min = min($span['begin'], $begin, $result[$merged]['begin']);
$max = max($span['finish'], $finish, $result[$merged]['finish']);
unset($result[$idx]);
} else {
$min = min($span['begin'], $begin);
$max = max($span['finish'], $finish);
$merged = $idx;
}
$result[$merged] = ['begin' => $min, 'finish' => $max];
}
}
if (!isset($merged)) {
$result[] = ['begin' => $begin, 'finish' => $finish];
}
}
foreach ($result as &$span) {
$span['begin'] = date('Y-m-d', $span['begin']);
$span['finish'] = date('Y-m-d', $span['finish']);
}
unset($span); // break the reference left behind by the by-reference foreach
It will result in an array of continuous timespans. It adds timespans one by one, merging each with any previously added timespan it overlaps. If a timespan overlaps more than one period, the consecutive matches are merged into the first one.

Specific data selection from php json

I have the following code :
$json = json_decode(URL, true);
foreach($json as $var)
{
    if($var['id'] == $valdefined)
    {
        $number = $var['count'];
    }
}
With json it looks like this :
[{"id":"1","count":"77937"},
{"id":"2","count":"20"},
{"id":"4","count":"25"},
{"id":"5","count":"11365"}]
This is what the array ($json) looks like after json_decode:
Array ( [0] => Array ( [id] => 1 [count] => 77937 ) [1] => Array ( [id] => 2 [count] => 20 ) [2] => Array ( [id] => 4 [count] => 25 ) [3] => Array ( [id] => 5 [count] => 11365) )
Is there a way to ask for $json[count] where $json[id] = 3, for example?
I'm not sure about a better way, but this is also fine, provided the JSON object is not huge. PHP is pretty fast when looping through decoded JSON. If the object is huge, you may want to split it. What I personally do is turn my JSON into an array of plain objects and sort them; searching is faster on sorted items.
EDIT
Call json_decode($your_thing, true); passing true gives you associative arrays. Then re-index the result so that the id is the key and the count is the value (on PHP 5.5+, array_column($json, 'count', 'id') does this). After that, getting the value for an ID is easy and far more efficient.
If you change the way you build your JSON object so that it looks like this:
{"1":77937,"2":20,"4":25,"5":11365}
and then call json_decode() with the second parameter set to TRUE, i.e. decode the JSON into an array,
you have a usable associative array with the ID as the key, like so:
<?php
$json = '{"1":77937,"2":20,"4":25,"5":11365}';
$json_array = json_decode($json, TRUE);
print_r( $json_array);
?>
Resulting in this array
Array
(
[1] => 77937
[2] => 20
[4] => 25
[5] => 11365
)
on which you can do a simple
$number = $json_array[$valdefined];
Or better still
if ( array_key_exists( $valdefined, $json_array ) ) {
    $number = $json_array[$valdefined];
} else {
    $number = NULL; // or whatever value indicates its non-existence
}
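If you cannot change how the JSON is produced, the same id-keyed map can be built after decoding. A minimal sketch, assuming PHP 5.5+ for array_column(); $raw and $countById are hypothetical names, and the JSON is the sample from the question:

```php
<?php
// The JSON exactly as shown in the question.
$raw = '[{"id":"1","count":"77937"},{"id":"2","count":"20"},'
     . '{"id":"4","count":"25"},{"id":"5","count":"11365"}]';
$json = json_decode($raw, true);

// array_column($rows, $valueKey, $indexKey) builds id => count in one pass.
$countById = array_column($json, 'count', 'id');

$valdefined = '4';
$number = array_key_exists($valdefined, $countById) ? $countById[$valdefined] : null;

var_dump($number);
```

Building the map costs one pass over the array; every lookup after that is a plain key access.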
Short answer to your initial question: why can't you write $json['count'] where $json['id'] = 3? Simply because PHP isn't a query language. The way you formulated the question reads like a simple SQL select query. The database will traverse its indexes and, if need be, perform a full table scan too; Structured Query Language merely spares you from writing out the loops the DB will perform.
Just because you don't write a loop doesn't mean there is no loop (absence of evidence is not evidence of absence). I'm not going to go all Turing on you, but there are only so many things we can do at the machine level. On the lower levels, you just have to take it one step at a time. Often, this means incrementing, checking and incrementing again... AKA recursing and traversing.
PHP will think it understands what you mean by $json['id'], and it'll think you mean for it to return the value that is referenced by id, in the array $json, whereas you actually want $json[n]['id'] to be fetched. To determine n, you'll have to write a loop of sorts. Some have suggested sorting the array. That, too, like any other array_* function that maps/filters/merges means looping over the entire array. There is just no way around that. Since there is no out-of-the-box core function that does exactly what you need to do, you're going to have to write the loop yourself.
If performance is important to you, you can write a more efficient loop. Below you can find a slightly less brute-force loop, a semi-interpolation search (it assumes the array is sorted by id). You could use ternary search here, too; implementing that is something you can work on.
for ($i = 1, $j = count($bar), $h = round($j/2);$i<$j;$i+= $h)
{
if ($bar[++$i]->id === $search || $bar[--$i]->id === $search || $bar[--$i]->id === $search)
{//thanks to short-circuit evaluation, we can check 3 offsets in one go
$found = $bar[$i];
break;
}//++$i, --$i, --$i ==> $i === $i -1, increment again:
if ($bar[++$i]->id > $search)
{// too far
$i -= $h;//return to previous offset, step will be halved
}
else
{//not far enough
$h = $j - $i;//set step the remaining length, will be halved
}
$h = round($h/2);//halve step, and round, in case $h%2 === 1
//optional:
if(($i + $h + 1) === $j)
{//avoid overflow
$h -= 1;
}
}
Where $bar is your json-decoded array.
How this works exactly is explained below, as are the downsides of this approach, but for now, more relevant to your question: how to implement:
function lookup(array $arr, $p, $val)
{
$j = count($arr);
if ($arr[$j-1]->{$p} < $val)
{//the highest id is still less than $val:
return (object) array($p => $val, 'count' => 0, 'error' => 'out of bounds');
}
if ($arr[$j-1]->{$p} === $val)
{//the last element is the one we're looking for?
return $arr[$j-1];
}
if ($arr[0]->{$p} > $val)
{//the lowest value is still higher than the requested value?
return (object) array($p => $val, 'count' => 0, 'error' => 'underflow');
}
for ($i = 1, $h = round($j/2);$i<$j;$i+= $h)
{
if ($arr[++$i]->{$p} === $val || $arr[--$i]->{$p} === $val || $arr[--$i]->{$p} === $val)
{//checks offsets 2, 1, 0 respectively on first iteration
return $arr[$i];
}
if ($arr[$i++]->{$p} < $val && $arr[$i]->{$p} > $val)
{//requested value is in between? don't bother, it won't exist, then
return (object)array($p => $val, 'count' => 0, 'error' => 'does not exist');
}
if ($arr[++$i]->{$p} > $val)
{
$i -= $h;
}
else
{
$h = ($j - $i);
}
$h = round($h/2);
}
}
$count = lookup($json, 'id', 3);//$json must hold objects here, i.e. json_decode() without the true flag
echo $count->count;
//or in a single expression
$count = lookup($json, 'id', 3)->count;//you'll have to return a default value for this one
Personally, I wouldn't return a default-object if the property-value pair wasn't found, I'd either return null or throw a RuntimeException, but that's for you to decide.
The loop basically works like this:
On each iteration, the objects at offset $i, $i+1 and $i-1 are checked.
If the object is found, a reference to it is assigned to $found and the loop ends
The object isn't found. Do either one of these two steps:
ID at offset is greater than the one we're looking for, subtract step ($h) from offset $i, and halve the step. Loop again
ID is smaller than search (we're not there yet): change step to half of the remaining length of the array
A diagram will show why this is a more "clever" way of looping:
|==========x=============================|//suppose x is what we need, offset 11 of a total length 40:
//iteration 1:
012 //checked offsets, not found
|==========x=============================|
//offset + 40/2 == 21
//iteration 2:
012//offsets 20, 21 and 22, not found, too far
|==========x=============================|
//offset - 21 + round(21/2)~>11 === 12
//iteration 3:
123 //checks offsets 11, 12, 13) ==> FOUND
|==========x=============================|
assign offset-1
break;
Instead of 11 iterations, we've managed to find the object we needed after a mere 3 iterations! Though this loop is somewhat more expensive (there's more computation involved), the downsides rarely outweigh the benefits.
This loop, as it stands, has a few blind spots, so in rare cases it will be slower, but on average it performs pretty well. I've tested this loop a couple of times with an array containing 100,000 objects, looking for id random(1,99999), and I haven't seen it take more than .08ms; on average it manages .0018ms, which is not bad at all.
Of course, you can improve on the loop by using the difference between the id at the offset and the searched id, or by breaking if the id at offset $i is greater than the search value and the id at offset $i-1 is less than the search value, to avoid infinite loops. On the whole, though, this is the most scalable and performant lookup algorithm provided here so far.
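For comparison, a textbook binary search is a simpler way to get logarithmic lookups on data sorted by id. This is not the loop described above but a swapped-in standard technique; binaryLookup() is a hypothetical name, ids are assumed to be integers (or numeric strings) sorted ascending, and intdiv() requires PHP 7.

```php
<?php
// Hypothetical helper: classic binary search over rows sorted ascending by $key.
// Returns the matching row, or null when no row has that value.
function binaryLookup(array $rows, $key, $val)
{
    $lo = 0;
    $hi = count($rows) - 1;
    while ($lo <= $hi) {
        $mid = intdiv($lo + $hi, 2);
        $current = (int) $rows[$mid][$key]; // ids arrive as numeric strings
        if ($current === $val) {
            return $rows[$mid];
        }
        if ($current < $val) {
            $lo = $mid + 1; // target can only be in the upper half
        } else {
            $hi = $mid - 1; // target can only be in the lower half
        }
    }
    return null;
}

$json = json_decode(
    '[{"id":"1","count":"77937"},{"id":"2","count":"20"},'
    . '{"id":"4","count":"25"},{"id":"5","count":"11365"}]',
    true
);

$hit = binaryLookup($json, 'id', 4);
echo $hit === null ? 'not found' : $hit['count'];
```

Returning null for a missing id follows the preference stated above over returning a default object.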

Search Array : array_filter vs loop

I am really new to PHP and need a suggestion about array search.
If I want to search for an element inside a multidimensional array, I can either use array_filter or loop through the array and check whether an element matching my criteria is present.
I see both suggestions in many places. Which is faster? Below is a sample array.
Array (
[0] => Array (
[id] => 4e288306a74848.46724799
[question] => Which city is capital of New York?
[answers] => Array (
[0] => Array (
[id] => 4e288b637072c6.27436568
[answer] => New York
[question_id_fk] => 4e288306a74848.46724799
[correct] => 0
)
[1] => Array (
[id] => 4e288b63709a24.35955656
[answer] => Albany
[question_id_fk] => 4e288306a74848.46724799
[correct] => 1
)
)
)
)
I am searching like this.
$thisQuestion = array_filter($pollQuestions, function($q) use ($questionId) {
    return $questionId == $q["id"];
});
I know, the question is old, but I disagree with the accepted answer. I was also wondering, if there was a difference between a foreach() loop and the array_filter() function and found the following post:
http://www.levijackson.net/are-array_-functions-faster-than-loops/
Levi Jackson did a nice job and compared the speed of several loop and array_*() functions. According to him a foreach() loop is faster than the array_filter() function. Although it mostly doesn't make such a big difference, it starts to matter, when you have to process a lot of data.
I made a test script because I was a little skeptical: how can an internal function be slower than a loop?
But it's actually true. Another interesting result is that PHP 7.4 is much faster than 7.2: on my machine roughly 6x to 9x for the loops and over 40x for array_filter.
You can try yourself
<?php
/*** Results on my machine ***
php 7.2
array_filter: 2.5147440433502
foreach: 0.13733291625977
for i: 0.24090600013733
php 7.4
array_filter: 0.057109117507935
foreach: 0.021071910858154
for i: 0.027867078781128
**/
ini_set('memory_limit', '500M');
$data = range(0, 1000000);
// ARRAY FILTER
$start = microtime(true);
$newData = array_filter($data, function ($item) {
return $item % 2;
});
$end = microtime(true);
echo "array_filter: ";
echo $end - $start . PHP_EOL;
// FOREACH
$start = microtime(true);
$newData = array();
foreach ($data as $item) {
if ($item % 2) {
$newData[] = $item;
}
}
$end = microtime(true);
echo "foreach: ";
echo $end - $start . PHP_EOL;
// FOR
$start = microtime(true);
$newData = array();
$numItems = count($data);
for ($i = 0; $i < $numItems; $i++) {
if ($data[$i] % 2) {
$newData[] = $data[$i];
}
}
$end = microtime(true);
echo "for i: ";
echo $end - $start . PHP_EOL;
I know it's an old question, but I'll give my two cents: for me, using a foreach loop was much faster than using array_filter. Using foreach, it took 1.4 seconds to perform a search by id, and using the filter it took 8.6 seconds.
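One reason a loop can win for a search by id is that it can stop at the first match, whereas array_filter() always visits every element. A minimal sketch; findById() is a hypothetical helper, and the second sample row is made up for illustration:

```php
<?php
// Hypothetical helper: returns the first row whose 'id' equals $id, or null.
function findById(array $rows, $id)
{
    foreach ($rows as $row) {
        if ($row['id'] === $id) {
            return $row; // stop at the first match instead of scanning the rest
        }
    }
    return null;
}

$pollQuestions = array(
    array('id' => '4e288306a74848.46724799', 'question' => 'Which city is capital of New York?'),
    array('id' => 'made-up-id', 'question' => 'Another question'),
);

$questionId = '4e288306a74848.46724799';

// array_filter() always walks the whole array and keeps the original keys.
$viaFilter = array_filter($pollQuestions, function ($q) use ($questionId) {
    return $q['id'] === $questionId;
});

$viaLoop = findById($pollQuestions, $questionId);

var_dump($viaLoop === reset($viaFilter));
```

The early return only helps when ids are unique and a single match is wanted; to collect every match, both approaches have to scan the whole array.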
From my own experience, foreach is faster. I think it has something to do with function call overhead, argument checks, copying the return value, etc. With basic syntax, the parsed code is likely to be closer to the compiled/interpreted bytecode and to be optimized better in the core.
The general rule is: the simpler something is, the faster it runs (fewer checks, less functionality), as long as it does everything you need.
Array_Filter
Iterates over each value in the input array passing them to the
callback function. If the callback function returns true, the current
value from input is returned into the result array. Array keys are
preserved.
As for me, both perform the same.

Multiply each integer in a flat array by 60

I have an array called $times. It is a list of small numbers (15,14,11,9,3,2). These will be user submitted and are supposed to be minutes. As PHP time works on seconds, I would like to multiply each element of my array by 60.
I've been playing around with array_walk and array_map but I can't get those working.
You can use array_map:
array_map() returns an array containing all the elements of arr1 after applying the callback function to each one. The number of parameters that the callback function accepts should match the number of arrays passed to the array_map()
Examples with lambda functions for callbacks:
array_map(function($el) { return $el * 60; }, $input);
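Same for PHP < 5.3 (create_function() is deprecated as of PHP 7.2 and removed in PHP 8.0, so use it only on old versions)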
array_map(create_function('$el', 'return $el * 60;'), $input);
Or with bcmul for callback
array_map('bcmul', $input, array_fill(0, count($input), 60));
But there is nothing wrong with just using foreach for this as well.
Just iterate over the array with the foreach statement and multiply:
$new_times = array();
foreach ($times as $value) {
    $new_times[] = $value * 60;
}
You may want to use foreach for simplicity in your case:
foreach( $times as &$val ){ $val *= 60; }
I assume that the values of your array are the ones you want to multiply, not the keys. As the above solution uses references, it will also alter your original array - but as you were aiming at array_map before, I think that you want that.
The solution above is probably easier to understand and most likely faster than using array_map which (obviously) has to call a function for each element.
I'd use array_map only for more complicated things such as advanced sorting algorithms and so on, but definitely not for something as trivial as this.
I'd personally second the suggestion from Gordon to just use a lambda (or created) function and either do:
array_map(function($el) { return $el * 60; }, $input_array);
(PHP >= 5.3) or
array_map(create_function('$el', 'return $el * 60;'), $input_array);
(PHP < 5.3; create_function() was removed in PHP 8.0)
I definitely see no reason to duplicate the array (which can become cumbersome if a lot of values are involved). Also, be aware that using foreach by reference (which, I agree, can be handy) can be dangerous if you're not careful with it, and then maintenance can become daunting anyway, because you have to remember to deal with it every time you work on that code. If you have no reason to optimize at this stage (i.e. your application has no speed problems), don't do it now and don't worry about using array_map. You can think about ease of maintenance now and optimize later, in case you really need to.
In fact, if you iterate by reference and then use foreach again with the same variable, you might get unexpected results (see the example below):
$a=array('1','2','3');
foreach ($a as &$v) {
$v *= 60;
}
print_r($a);
foreach ($a as $v);
print_r($a);
Output is:
Array
(
[0] => 60
[1] => 120
[2] => 180
)
Array
(
[0] => 60
[1] => 120
[2] => 120
)
Probably not what you expect on the second cycle.
This is why I usually avoid the foreach & byref combo when I can.
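If you do want the by-reference loop, the usual remedy is to unset the loop variable right after it. A minimal sketch using the same $a as above:

```php
<?php
$a = array('1', '2', '3');

foreach ($a as &$v) {
    $v *= 60;
}
unset($v); // break the reference so $v no longer aliases the last element

foreach ($a as $v) {
    // a plain by-value loop is now harmless
}

print_r($a);
```

With the unset() in place, the second loop leaves the array untouched, so the last element stays at 180.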
array_walk($myArray, function(&$v) {$v *= 60;});

Indexing an array every cycle (in seconds)

I have to put data into an array every 10 seconds. Is it silly to index this array with modified timestamps,
$a[timestamp] = 54;
$a[timestamp+10] = 34;
or to do it in JavaScript with setInterval(), passing the index via Ajax (which seems very crappy to me)?
Or is there a better option?
Further details:
I have to link real time to the entries in my array: this is my problem. For example, the 3rd cycle runs from 21 sec to 30 sec after the start time.
I have only 15 entries to store.
My present code:
$first_time = (int)date('Hi');
$_SESSION['mypile'][$first_time] = array_fill ($first_time, 15, array('paramA' => 0, 'paramB' => 0));
then, the Ajax part calls this script :
$time = (int)date('Hi');
$_SESSION['mypile'][$time]['paramA'] = calcul_temp($_SESSION['mypile'], $time);
Why would you not use a plain numerically indexed array? If you don't need the timestamp, then:
$a[] = 54;
$a[] = 34;
If you do need the timestamp, then it would make more sense to do something like:
$a[] = array('timestamp' => time(), 'number' => 54);
$a[] = array('timestamp' => time(), 'number' => 34);
Then at each offset you have a more meaningful associative array:
echo 'Timestamp: ' . $a[0]['timestamp'] . ', Number: ' . $a[0]['number'];
If those operations happen in rapid succession, you would probably be better off using microtime().
That seems like a very good solution, though you will have to be careful about memory usage if the script will be running for a long time.
That is fairly silly; if you've set a time interval, simply have your function be called every 10 seconds and add your new number to the next index in the array. Keep track of this index globally or within the scope of the iteration.
$a['timestamp'] = time();
while (true) {
$a['data'][] = getData();
sleep(10);
}
You could make a class of it. The constructor then sets the timestamp, and by implementing the array access and iterator interfaces it can be looped over with foreach and used much like an array. You can add a method to get an array with or without the timestamp, etc.
$dataCycle = new DataCycle();
while(true) {
$dataCycle->addData(getData());
sleep(10);
}
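Such a class might look roughly like this. Everything here is an assumption for illustration: the DataCycle name and addData() come from the snippet above, while toArray(), its parameters, and the fixed 10-second interval are hypothetical. The return type declarations assume PHP 7.4 or later.

```php
<?php
// Hypothetical DataCycle: stores the start time and one value per cycle.
class DataCycle implements ArrayAccess, IteratorAggregate, Countable
{
    private $timestamp;
    private $data = array();

    public function __construct($timestamp = null)
    {
        $this->timestamp = $timestamp === null ? time() : $timestamp;
    }

    public function addData($value)
    {
        $this->data[] = $value;
    }

    // With $withTimestamp, keys become the start time of each cycle.
    public function toArray($withTimestamp = false, $interval = 10)
    {
        if (!$withTimestamp) {
            return $this->data;
        }
        $out = array();
        foreach ($this->data as $i => $value) {
            $out[$this->timestamp + $i * $interval] = $value;
        }
        return $out;
    }

    public function offsetExists($offset): bool
    {
        return isset($this->data[$offset]);
    }

    #[\ReturnTypeWillChange]
    public function offsetGet($offset)
    {
        return $this->data[$offset];
    }

    public function offsetSet($offset, $value): void
    {
        if ($offset === null) {
            $this->data[] = $value; // supports $cycle[] = $value
        } else {
            $this->data[$offset] = $value;
        }
    }

    public function offsetUnset($offset): void
    {
        unset($this->data[$offset]);
    }

    public function getIterator(): Iterator
    {
        return new ArrayIterator($this->data);
    }

    public function count(): int
    {
        return count($this->data);
    }
}

$dataCycle = new DataCycle(1000); // fixed start time to keep the example reproducible
$dataCycle->addData(54);
$dataCycle[] = 34;
print_r($dataCycle->toArray(true));
```

Keeping only the start time and deriving each key from the position avoids storing a timestamp per entry, at the cost of assuming that no cycle is ever skipped.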
OK, so I decided to round my timestamp to 10-second slices. Simple, silly, and it works for me.
Thank you for the ideas.
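The rounding could be done along these lines. This is a guess at the approach rather than the poster's actual code: timeSlot() is a hypothetical helper, and it rounds down to the start of the slot.

```php
<?php
// Hypothetical helper: start of the $interval-second slot containing $timestamp.
function timeSlot($timestamp, $interval = 10)
{
    return $timestamp - ($timestamp % $interval);
}

$a = array();
$a[timeSlot(1275253907)] = 54; // slot starting at 1275253900
$a[timeSlot(1275253913)] = 34; // slot starting at 1275253910

print_r($a);
```

Two values that land in the same slot overwrite each other, which matches the one-entry-per-cycle requirement from the question.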
