Which of these two methods (array_count_values or array_key_exists) is more 'friendly' with regard to memory consumption, resource usage and ease of code processing? And is the difference big enough that a significant performance gain or loss could result?
If more details on how these methods would be used are needed: I will be parsing a very large XML file and using SimpleXML to retrieve certain XML values into an array (to store URLs), like this:
$XMLproducts = simplexml_load_file("products.xml");
foreach($XMLproducts->product as $Product) {
if (condition exists) {
$storeArray[] = (string)$Product->store; //potentially several more arrays will have values stored in them
}}
No duplicate values should end up in the array, and I am debating whether to use array_key_exists or array_count_values to prevent them. I understand array_key_exists is more resource friendly, but my question is how much more resource friendly is it? Is the difference enough to affect performance significantly?
I would much rather use the array_count_values method, primarily because along with preventing duplicate values in the array, it also makes it easy to display how many times each value occurs. For example, if "Ace Hardware" had 3 instances in storeArray, you could easily display "3" next to the URL, like this:
// The foreach loop builds the arrays with values from the XML file, like this:
$storeArray[] = (string)$Product->store;
// Then we later display the filter links like this:
$storeUniques = array_count_values($storeArray);
foreach ($storeUniques as $stores => $amts) {
?>
<?php echo $stores; ?> <?php echo "(" . ($amts) . ")" . "<br>";
}
If there is a big performance gap between the 2 methods, I will go with the array_key_exists method. But the code below only prevents duplicate values from being displayed. Is there a way to also display the number of duplicate values (for each value) that would have occurred?
// The foreach loop builds the arrays with values from the XML file, like this:
$Store = (string)$Product->store;
if (!array_key_exists($Store, $storeArray)) {
$storeArray[$Store] = "<a href='webpage.php?Keyword=".$keyword."&Features=".$features."&store=".$Store."'>" . $Store . "</a>";
}
// Then we later display the filter links like this:
foreach ($storeArray as $storeLinks) {
echo $storeLinks . "<br>";
}
Use isset()!
It's 2.5x faster than array_key_exists.
Source: Here
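One caveat worth keeping in mind before swapping one for the other: isset() and array_key_exists() don't behave the same when a stored value can be null. A minimal sketch with made-up data (the store names are just placeholders):

```php
<?php
// Hypothetical sample data: one key holds a real value, one holds null.
$stores = array('Ace Hardware' => 3, 'Closed Store' => null);

// isset() reports false when the key exists but its value is null.
$issetHit = isset($stores['Closed Store']);

// array_key_exists() only looks at the key, so it reports true.
$existsHit = array_key_exists('Closed Store', $stores);

var_dump($issetHit, $existsHit);
```

If the values are never null (counts, links, empty strings), the two checks give the same answer and isset() is the cheaper one to call.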
I need to check some input string against a huge (and growing) list of strings coming from a CSV file (1000000+). I currently load every string in an array and check against it via in_array(). Code looks like this:
$filter = array();
$filter = ReadFromCSV();
$input = array("foo","bar" /* more elements... */);
foreach($input as $i){
if(in_array($i,$filter)){
// do stuff
}
}
It already takes some time, and I was wondering if there is a faster way to do this?
in_array() checks every element in the array until it finds a match. The average complexity is O(n).
Since you are comparing strings, you could store the filter strings as array keys instead of values and look them up via array_key_exists(), which takes constant time, O(1), on average.
Some code:
$filter = array();
$filter = ReadFromCSV();
$filter = array_flip($filter); // switch key <=> value
$input = array("foo","bar" /* more elements... */);
foreach($input as $i){
if(array_key_exists($i,$filter)){ // array_key_exists();
// do stuff
}
}
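To get a feel for the difference on your own machine, something like the following rough timing sketch could be used. The generated strings are a hypothetical stand-in for ReadFromCSV(), and isset() is used for the key lookup, which works here because array_flip() never produces null values:

```php
<?php
// Hypothetical stand-in for ReadFromCSV(): 100,000 generated strings.
$filter = array();
for ($n = 0; $n < 100000; $n++) {
    $filter[] = 'item' . $n;
}
$input = array('item5', 'item99999', 'missing');

// Variant 1: linear scan of the values for every input string.
$start = microtime(true);
$foundScan = array();
foreach ($input as $i) {
    if (in_array($i, $filter, true)) {
        $foundScan[] = $i;
    }
}
$scanTime = microtime(true) - $start;

// Variant 2: flip once, then do a hash lookup for every input string.
$lookup = array_flip($filter);
$start = microtime(true);
$foundHash = array();
foreach ($input as $i) {
    if (isset($lookup[$i])) {
        $foundHash[] = $i;
    }
}
$hashTime = microtime(true) - $start;

printf("in_array: %.6fs, isset on flipped array: %.6fs\n", $scanTime, $hashTime);
```

Keep in mind that array_flip() itself walks the whole array once, so the flipped version only pays off when you do many lookups against the same list.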
That's what indexes were invented for.
It's not a matter of in_array() speed. As the data grows, you should probably consider loading it into a real DBMS and using indexes.
It is my understanding that using isset to prevent duplicate values from being inserted into an array is the best method with regard to memory consumption, resource usage and ease of code processing. I am currently using the array_count_values, like this:
$XMLproducts = simplexml_load_file("products.xml");
foreach($XMLproducts->product as $Product) {
if (condition exists) {
$storeArray[] = (string)$Product->store; //potentially several more arrays will have values stored in them
}}
$storeUniques = array_count_values($storeArray);
foreach ($storeUniques as $stores => $amts) {
?>
<?php echo $stores; ?> <?php echo "(" . ($amts) . ")" . "<br>";
}
How would I prevent duplicate values from being inserted into an array (similar to the above) using isset()? And is there a big performance difference between the two if the XML file being parsed is very large (5-6MB)?
As you are using the count in your output, you cannot use array_unique() because you would lose that information.
What you could do, is build the array you need in your loop, using the string as your key and counting the values as you go:
$storeArray = array();
foreach($XMLproducts->product as $Product) {
if (condition exists) {
$store = (string)$Product->store;
if (array_key_exists($store, $storeArray))
{
$storeArray[$store]++;
}
else
{
$storeArray[$store] = 1;
}
}
}
Note that this is just to illustrate; you can probably wrap it up in one line.
This way you will not have multiple duplicate strings in your array (assuming that that is your concern) and you don't increase your memory consumption by generating a second (potentially big...) array.
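For what it's worth, here is one way the one-line version could look. This is only a sketch: the store names are made up to stand in for the values read from the XML file, and the ?? operator needs PHP 7 or later.

```php
<?php
// Hypothetical store names standing in for the values read from the XML file.
$stores = array('Ace Hardware', 'Home Depot', 'Ace Hardware', 'Lowes', 'Ace Hardware');

$storeArray = array();
foreach ($stores as $store) {
    // Start from 0 when the key is new, then add one.
    $storeArray[$store] = ($storeArray[$store] ?? 0) + 1;
}

// Display each unique store with its count.
foreach ($storeArray as $name => $count) {
    echo $name . ' (' . $count . ")\n";
}
```

The result has the same shape as what array_count_values() returns (unique value as key, count as value), but it is built in a single pass without an intermediate array full of duplicates.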
I think array_unique and company are considered unfriendly because they have to go through the whole array each time they are called. The code you're trying to write is doing essentially the same thing, so I don't see a problem with using array_unique.
Very simple, no checking required:
foreach($XMLproducts->product as $Product)
$helperArray[(string)$Product->store] = "";
Associative arrays have unique keys by definition. If a key already exists, it is simply overwritten.
Now turn the keys into values:
$storeArray = array_keys($helperArray);
EDIT: to also count the number of occurrences of each <store>, I suggest:
foreach($XMLproducts->product as $Product)
$helperArray[] = (string)$Product->store;
And then:
$storeArray = array_count_values($helperArray);
Result: key = unique store, value = count.
In my PHP application, I take values from the user, and all of these values are stored in an array. For validation, I compare each user input value against my array:
<?php
// Current Code
$masterArray = array(......); // ..... represents some 60-100 different values.
foreach($_POST as $key => $value) {
if(in_array($value, $masterArray)) {
$insertQuery = $mysqli->query("INSERTION stuff or Updating Stuff");
} else {
echo "Are you tampering html-form-data ?";
}
}
?>
But this code is not much use, as it takes quite a long time to insert or update.
Is there a faster function to check whether the values in the slave array exist in the master array?
By slave array I mean the list/array of user input values.
By master array I mean the list of my own values stored in the page.
Thanks
I think I found a better option with array_diff.
Please let me know if I am doing anything wrong below before I put this code on a production page. Thanks a lot for your efforts, @J.David Smith & @grossvogel.
<?php
$masterArray = array(.......); // My Master Array List
$result = array_diff($_POST['checkBox'], $masterArray);
if(count($result) > 0) {
// If they are trying to do some tampering , let them submit all again.
echo 'Something is not Right';
} else {
// If Person is genuine, no waiting just insert them all
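$total = count($_POST['checkBox']);
$insertQuery = "INSERT into notes(user,quote) values ";
$values = array(); // must exist before array_push() is called
for($i=0;$i<$total; $i++) { // < rather than <=, otherwise the loop reads one element past the end
array_push($values, "('someuser','".$mysqli->real_escape_string($_POST['checkBox'][$i])."')");
}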
$finalQuery = $mysqli->query($insertQuery.implode(',', $values));
}
?>
Is my code better? I am testing it on localhost and I don't see much of a difference. I just want expert views on whether I am messing something up before I put this code on a production page.
Update: This looks better and faster than the code in the question.
Thanks
The only other way to do this is to use an associative array with your values as keys (well, you could custom-implement another storage container specifically for this, but that'd be overkill imo). Then, you can check with isset. For example:
$masterArray = array(....); // same thing, but with values as keys instead of values
foreach($_POST as $key => $value) {
if(isset($masterArray[$value])) {
// do stuff
} else {
// do stuff
}
}
I'm kind of curious what the point of doing this is anyway, especially given the statement printed by your echo call. There may be an even better way to accomplish your goal than this.
EDIT: Another method suggested by grossvogel: loop over $masterArray instead of $_POST. If you expect $_POST to be a large set of data consistently (ie most of the time people will select 50+ items), this could be faster. Hashing is already very fast, so you'll have to benchmark it on your code in order to make a decision.
$masterArray = array(...); // either style of definition will work; i'll use yours for simplicity
foreach($masterArray as $value) {
if(isset($_POST[$value])) {
// do stuff
}
}
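If the master list is already written as a plain list of values, it doesn't have to be retyped with the values as keys. A small sketch of converting it once with array_flip() and then validating with isset(); the colour names and the $submitted array are made up for illustration and stand in for the real master list and the posted values:

```php
<?php
// Hypothetical master list and submitted values, for illustration only.
$masterArray = array('red', 'green', 'blue');
$submitted   = array('green', 'purple', 'red');

// Turn the values into keys once: array('red' => 0, 'green' => 1, 'blue' => 2).
$masterLookup = array_flip($masterArray);

$accepted = array();
$rejected = array();
foreach ($submitted as $value) {
    // Hash lookup instead of scanning the master list with in_array().
    if (isset($masterLookup[$value])) {
        $accepted[] = $value;
    } else {
        $rejected[] = $value;
    }
}

print_r($accepted);
print_r($rejected);
```

With only 60-100 master values the gain over in_array() is likely to be small, so it's worth measuring before assuming the lookup is the slow part.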
As an intern I realized that I spend the bulk of my time building and manipulating tables from SQL queries in PHP. My current method is to use two foreach loops:
foreach($query as $record){
foreach($record as $field => $value){
// Things that need to be done on each field-value pair
}
// Things that need to be done on each row
}
Is there a better way of doing this?
Also, I tend to pack data together as a ~ separated list and store it on the server. Is this a bad practice?
I'd rather put some code up for review but I don't want to risk exposing the internals of the company code.
Foreach loops are the best way to iterate through your data. If you are looking to make your code a bit prettier, try using the alternative syntax for control structures (foreach ... endforeach):
<?php foreach($query as $record) : ?>
<?php foreach($record as $field => $value) : ?>
*Things that need to be done on each field-value pair*
<?php endforeach; ?>
*Things that need to be done on each row*
<?php endforeach; ?>
Also, as mentioned in the comments above, you lose a lot of functionality when storing ~ separated data in the db. If you must do this, you can try storing serialized data instead of delimited strings, for example with json_encode() and json_decode():
$myArray = array();
$myArray['User1']['book'] = 'Pride and Prejudice';
$myArray['User1']['favorites'] = 'Water skiing';
$myArray['User2']['book'] = 'Mansfield Park';
$myArray['User2']['favorites'] = array('skateboarding', 'surfing', 'running');
$myArray['description'] = 'Things people like.';
echo '<pre>';
print_r(json_encode($myArray)); //This will convert your array to a string for the db
echo '</pre>';
echo '<pre>';
$myArrayString = json_encode($myArray);
print_r(json_decode($myArrayString)); //This will convert the db string to an object for manipulation
echo '</pre>';
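One detail that's easy to trip over: json_decode() gives you stdClass objects by default, and passing true as the second argument gives associative arrays instead. A short sketch using a trimmed-down version of the data above:

```php
<?php
// Trimmed-down version of the example data.
$myArray = array(
    'User1' => array('book' => 'Pride and Prejudice', 'favorites' => 'Water skiing'),
    'User2' => array('book' => 'Mansfield Park', 'favorites' => array('skateboarding', 'surfing', 'running')),
);

$stored   = json_encode($myArray);        // string that would go into the db
$asObject = json_decode($stored);         // nested stdClass objects
$asArray  = json_decode($stored, true);   // nested associative arrays

echo $asObject->User2->book . "\n";             // object property access
echo $asArray['User2']['favorites'][1] . "\n";  // array access
```

Which form is more convenient depends on the code that consumes it. The array form round-trips back to the original structure, which makes it easier to compare or merge with other arrays.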
There is no built-in way to produce an HTML table from the result of a query. If you find yourself writing that sort of code over and over, then it would be a good candidate to make a reusable class or library. For example:
$table = new HTMLTable();
$table->setData($data);
echo $table->toHTML();
The above is not working code, just an example of how you could create reusable code instead of repeating the same table building code many times.
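To make that a bit more concrete, a bare-bones version of such a class might look like the sketch below. HTMLTable, setData() and toHTML() are hypothetical names taken from the example above, not an existing library, and the sketch assumes the data is a list of associative arrays that all share the same keys.

```php
<?php
// Hypothetical helper class; the name and methods follow the example above, not a real library.
class HTMLTable
{
    private $rows = array();

    // Expects a list of associative arrays, one per row.
    public function setData(array $rows)
    {
        $this->rows = $rows;
    }

    public function toHTML()
    {
        if (count($this->rows) === 0) {
            return '<table></table>';
        }
        $html = '<table><tr>';
        // Use the keys of the first row as column headings.
        foreach (array_keys(reset($this->rows)) as $heading) {
            $html .= '<th>' . htmlspecialchars((string)$heading) . '</th>';
        }
        $html .= '</tr>';
        foreach ($this->rows as $row) {
            $html .= '<tr>';
            foreach ($row as $value) {
                // Escape every cell so data cannot inject markup.
                $html .= '<td>' . htmlspecialchars((string)$value) . '</td>';
            }
            $html .= '</tr>';
        }
        return $html . '</table>';
    }
}

$table = new HTMLTable();
$table->setData(array(
    array('name' => 'tom', 'type' => 'mouse'),
    array('name' => 'jerry', 'type' => 'cat'),
));
echo $table->toHTML();
```

The per-field and per-row work from the question would live inside toHTML(), or in callbacks passed to the class, so the nested loops are written only once.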
I tend to use a while loop with one of the mysqli_fetch_* functions (the older mysql_fetch_* functions were removed in PHP 7). But essentially it's the same as what you do.
$sql = 'SELECT
stuff
FROM
table';
// $link is assumed to be an open connection returned by mysqli_connect()
if ($result = mysqli_query($link, $sql)) {
while ($row = mysqli_fetch_assoc($result)) {
foreach ($row as $key => $value) {
/* Things that need to be done on each field-value pair */
}
/* Things that need to be done on each row */
}
}
As for the ~ separated list: I strongly recommend saving the data in separate DB fields instead of packing it like that. Just create a new table for each such pack.
I have a working script, but I'm sure that my method of managing arrays could be better. I've searched for a solution and haven't found one, but I'm sure that I should be using the functionality of associative arrays to do things more efficiently.
I have two arrays, one from a CSV file and one from a DB. I've created the CSV array as numeric and the DB array as associative (although I'm aware that the difference is blurry in PHP).
I'm trying to find a record in the DB array where the value in one field matches a value in the CSV array. Both arrays are multi-dimensional.
Within each record in each array there is a reference number. It appears once in the CSV array and may appear in the DB array. If it does, I need to take action.
I'm currently doing this (simplified):
$CSVarray = array(
array('reference01', 'blue', 'small'),
array('reference02', 'red', 'large'),
array('reference03', 'pink', 'medium')
);
$Dbarray = array(
0 => array('ref'=>'reference01','name'=>"tom",'type'=>"mouse"),
1 => array('ref'=>'reference02','name'=>"jerry",'type'=>"cat"),
2 => array('ref'=>'reference03','name'=>"butch",'type'=>"dog")
);
foreach ($CSVarray as $CSVrecord) {
foreach ($Dbarray as $DBrecord) {
if ($CSVrecord[$numerickey] == $DBrecord['key']) {
// do something with the various values in the $DBrecord
}
}
}
This is horrible, as the arrays are each thousands of lines.
I don't just want to know if matching values exist, I want to retrieve data from the matching record, so functions like 'array_search ' don't do what I want and array_walk doesn't seem any better than my current approach.
What I really need is something like this (gibberish code):
foreach ($CSVarray as $CSVrecord) {
WHERE $Dbarray['key']['key'] == $CSVrecord[$numerickey] {
do something with the other values in $Dbarray['key']
}
}
I'm looking for a way to match the values using the keys (either numeric or associative) rather than walking the arrays. Can anyone offer any help please?
Use a hash map: take one array and map each record's key to the record it belongs to. Then take the second array and simply iterate over it, checking for each record's key whether the hash map has anything set for it.
Regarding your example:
foreach ($DBarray as $DBrecord){
$Hash[$DBrecord[$key]] = $DBrecord;
}
foreach ($CSVarray as $CSVrecord){
if (isset($Hash[$CSVrecord[$CSVkey]])){
$DBrecord = $Hash[$CSVrecord[$CSVkey]];
//do stuff with $DBrecord and $CSVrecord
}
}
This solution runs in O(n), while yours runs in O(n^2).
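Here is the same idea as a runnable sketch using the sample data from the question. It assumes the reference number sits in column 0 of each CSV row and under the 'ref' key of each DB row, and the third CSV row was changed to 'reference04' so that a non-matching row shows up as well:

```php
<?php
// Sample data modelled on the question; column 0 and the 'ref' key are assumed to hold the reference.
$CSVarray = array(
    array('reference01', 'blue', 'small'),
    array('reference02', 'red', 'large'),
    array('reference04', 'pink', 'medium'),
);
$DBarray = array(
    array('ref' => 'reference01', 'name' => 'tom', 'type' => 'mouse'),
    array('ref' => 'reference02', 'name' => 'jerry', 'type' => 'cat'),
    array('ref' => 'reference03', 'name' => 'butch', 'type' => 'dog'),
);

// One pass to index the DB records by reference number.
$hash = array();
foreach ($DBarray as $DBrecord) {
    $hash[$DBrecord['ref']] = $DBrecord;
}

// One pass over the CSV records; each lookup is a single hash access.
$matches = array();
foreach ($CSVarray as $CSVrecord) {
    $ref = $CSVrecord[0];
    if (isset($hash[$ref])) {
        // Both records are available here; combine whatever fields are needed.
        $matches[$ref] = $hash[$ref]['name'] . ' / ' . $CSVrecord[1];
    }
}

print_r($matches);
```

On PHP 5.5 and later, array_column($DBarray, null, 'ref') should build the same index in one call.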
You can use foreach loops like this too:
foreach ($record as $key => $value) {
switch($key)
{
case 'asd':
// do something
break;
default:
// Default
break;
}
}
A switch may be what you are looking for also :)
Load CSV into the db, and use db (not db array) if possible for retrieval. Index the referenceid field.