Is there a faster way to search for an item in a JSON array - PHP

Let's say that I stored dataID in a JSON file with 1,000,000 records.
My results.json = {"dataID":["1","2","3", ... "1000000"]}
I want to find ID "100000" in the array.
$file = file_get_contents('results.json');
$data = json_decode($file, true);
if (in_array('100000', $data['dataID'])) {
    echo "found";
} else {
    echo "not found";
}
It took about 0.6 seconds to get the result.
Is there a faster way for searching in json array like this?
Please give me an example!
Thank you in advance.
Update:
Although SQL would be much faster, consider that with 1,000,000 records in one table, the more records you have, the more space you use! At least a static file reduces server load and takes less space.
It depends on how your system is designed. Use it in the right place at the right time!

Sure!
$stm = $pdo->prepare("SELECT 1 FROM data WHERE id = ?");
$stm->execute(array(100000));
if ($stm->fetchColumn()) {
    echo "found";
} else {
    echo "not found";
}
You will need to import your array into a database first.
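If it helps, here is a minimal sketch of that one-time import, assuming MySQL, a table named data with an integer primary key id, and the results.json layout from the question:

$pdo = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass');
$pdo->exec("CREATE TABLE IF NOT EXISTS data (id INT PRIMARY KEY)");

// Load the IDs once from the JSON file described in the question.
$ids = json_decode(file_get_contents('results.json'), true)['dataID'];

// Wrap the inserts in a transaction so 1,000,000 rows don't take forever.
$pdo->beginTransaction();
$stm = $pdo->prepare("INSERT IGNORE INTO data (id) VALUES (?)"); // INSERT IGNORE is MySQL-specific
foreach ($ids as $id) {
    $stm->execute(array($id));
}
$pdo->commit();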

Depending on the structure of the data in the results.json file, you may be able to do a simple string search. For example:
$file = file_get_contents('results.json');
if (strpos($file, '"100000"') !== false) {
    echo 'found';
} else {
    echo 'not found';
}
After benchmarking your method I got around 0.78 seconds (on my slow local system); with this method I achieved around 0.03 seconds.
Like I say, it depends on your data structure, but if it does permit this method you'll see significant speed benefits.

Maybe it's possible to predict the outcome; you could then use in_array to search for the value in a much smaller JSON file, as sketched below.
Otherwise you could try alternative search algorithms, but those can be complicated.
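For instance, here is a rough sketch of the "much smaller JSON" idea, assuming the IDs are numeric and you are free to split results.json into bucket files of 100,000 IDs each (results_0.json, results_1.json, ... are hypothetical names):

$id = '100000';
// Pick the bucket file this ID would live in: IDs 1-100000 in results_0.json, etc.
$bucket = (int) floor(($id - 1) / 100000);
$data = json_decode(file_get_contents("results_$bucket.json"), true);

if (in_array($id, $data['dataID'])) {
    echo "found";
} else {
    echo "not found";
}

Each lookup now decodes and scans a tenth of the data.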

A document-based database like MongoDB is what you should try instead of working with plain JSON files, if you are bound to the JSON format.
Note that MongoDB can keep the JSON objects in memory, whereas plain PHP solutions have to parse the file again and again.
I see three performance boosts (a sketch follows the list):
less disk IO
less parsing
index based searches
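A minimal sketch with the official mongodb/mongodb Composer library (the collection layout and names are assumptions, not from the question):

require 'vendor/autoload.php';

$client = new MongoDB\Client('mongodb://localhost:27017');
$collection = $client->test->dataIDs;

// One-time import: one document per ID, plus an index for fast lookups.
// $ids = json_decode(file_get_contents('results.json'), true)['dataID'];
// foreach ($ids as $id) { $collection->insertOne(['dataID' => $id]); }
// $collection->createIndex(['dataID' => 1]);

// The search is now an index hit instead of a file parse plus linear scan.
if ($collection->findOne(['dataID' => '100000'])) {
    echo "found";
} else {
    echo "not found";
}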

Why don't you store the id as keys and then do:
if (isset($data['dataID']['100000'])) {
    // do something
}
Because checking whether a key exists is a lot faster than looping through the array. You can check out this link for further information:
List of Big-O for PHP functions
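For completeness, a small sketch of how you could re-shape the file once so that the IDs become keys (the results_keyed.json name is made up):

// One-time conversion: store the IDs as keys with a dummy value.
$ids = json_decode(file_get_contents('results.json'), true)['dataID'];
file_put_contents('results_keyed.json',
    json_encode(array('dataID' => array_fill_keys($ids, 1))));

// Every lookup is now a hash check instead of a linear scan.
$data = json_decode(file_get_contents('results_keyed.json'), true);
if (isset($data['dataID']['100000'])) {
    echo "found";
} else {
    echo "not found";
}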

Related

How can I most efficiently check for the existence of a single value in an array of thousands of values?

Due to a weird set of circumstances, I need to determine if a value exists in a known set, then take an action. Consider:
An included file will look like this:
// Start generated code
$set = array();
$set[] = 'foo';
$set[] = 'bar';
// End generated code
Then another file will look like this:
require('that_last_file.php');
if(in_array($value, $set)) {
// Do thing
}
As noted, the creation of the array will be from generated code -- a process will create a PHP file which will be included above the if statement with require.
How concerned should I be about the size of this mess -- both in bytes, and array values? It could easily get to 5,000 values. How concerned should I be with the overhead of a 5,000-value array? Is there a more efficient way to search for the value, other than using in_array on an array? How painful is including a 5,000-line file via require?
I know there are ultimately better ways of doing this, but my limitations are that the set creation and logic has to be in an included PHP file. There are odd technical restrictions that prevent other options (i.e. -- a database lookup).
A faster way would be to flip the array once and test the key:
$lookup = array_flip($set);
if (isset($lookup[$value])) {
    // Do thing
}
(Reading array_flip($set)[$value] directly would raise an undefined-index notice whenever the value is missing, and would redo the flip on every check.)
A 5,000-value array really isn't that bad, though, if it's just strings.
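Since the file is generated anyway, here is a sketch of the same idea pushed into the generator (hypothetical, assuming you can change the code that writes the file): emit the values as keys so the consumer does a hash probe instead of a scan:

// Start generated code
$set = array();
$set['foo'] = true;
$set['bar'] = true;
// End generated code

// Consumer side: isset() on a key is O(1), in_array() is O(n).
if (isset($set[$value])) {
    // Do thing
}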

How do I efficiently run a PHP script that doesn't take forever to execute in a WAMP environment...?

I've made a script that pretty much loads a huge array of objects from a MySQL database, and then loads a huge (but smaller) list of objects from the same MySQL database.
I want to iterate over each list to check for irregular behaviour, using PHP. But every time I run the script it takes forever to execute (so far I haven't seen it complete). Are there any optimizations I can make so it doesn't take this long to execute...? There are roughly 64,150 entries in the first list, and about 1,748 entries in the second list.
This is what the code generally looks like in pseudo code.
// an array of size 64000 containing objects in the form of {"id": 1, "unique_id": "kqiweyu21a)_"}
$items_list = [];
// an array of size 5000 containing objects in the form of {"inventory": "a long string that might have the unique_id", "name": "SomeName", "id": 1}
$user_list = [];
Up until this point the results are instant, but when I do the following it takes forever to execute; it seems like it never ends...
foreach ($items_list as $item) {
    foreach ($user_list as $user) {
        if (strpos($user["inventory"], $item["unique_id"]) !== false) {
            echo("Found a version of the item");
        }
    }
}
Note that the echo should rarely happen... The issue isn't with MySQL, as the $items_list and $user_list arrays populate almost instantly. It only starts to take forever when I try to iterate over the lists...
With over 100 million iterations (64,150 × 1,748), adding a break will help somewhat, even though the match rarely happens:
foreach ($items_list as $item) {
    foreach ($user_list as $user) {
        if (strpos($user["inventory"], $item["unique_id"]) !== false) {
            echo("Found a version of the item");
            break;
        }
    }
}
Alternative solution 1, with PHP 5.6: you could use pthreads and split your big array into chunks to pool them into threads; combined with break, this will certainly improve things.
Alternative solution 2: use PHP 7; the performance improvements around array manipulation and loops are big.
Also try to sort your arrays before the loop. It depends on what you are looking for, but very often sorting the arrays beforehand limits the loop time as much as possible when the condition is found.
Your example is almost impossible to reproduce; you need to provide an example that can be replicated. The two loops as given, if they only access arrays, will complete extremely quickly (1-2 seconds). This means that either the strings you're searching are kilobytes or larger (not stated in the question) or something else, such as a database access, is happening while the loops are running.
You can let SQL do the searching for you. Since you don't share the columns you need, I'll only pull the ones I can see:
SELECT i.unique_id, u.inventory
FROM items i, users u
WHERE LOCATE(i.unique_id, u.inventory)
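For reference, a hedged sketch of running that query from PHP with PDO (the table and column names are taken from the pseudo code above, so treat them as assumptions):

$sql = "SELECT i.unique_id, u.inventory
        FROM items i, users u
        WHERE LOCATE(i.unique_id, u.inventory)";

// PDO::query() returns a traversable statement; the database does the matching.
foreach ($pdo->query($sql) as $row) {
    echo "Found a version of the item: " . $row['unique_id'] . "\n";
}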

PHP array optimization for 80k rows

I need help finding a workaround to get past the memory_limit. My limit is 128 MB; from the database I'm getting about 80k rows, and the script stops at around 66k. Thanks for the help.
Code:
$possibilities = [];
foreach ($result as $item) {
    $domainWord = str_replace("." . $item->tld, "", $item->address);
    for ($i = 0; $i + 2 < strlen($domainWord); $i++) {
        $tri = $domainWord[$i] . $domainWord[$i + 1] . $domainWord[$i + 2];
        if (array_key_exists($tri, $possibilities)) {
            $possibilities[$tri] += 1;
        } else {
            $possibilities[$tri] = 1;
        }
    }
}
Your bottleneck, given your algorithm, is most probably not the database query but the $possibilities array you're building.
If I read your code correctly, you get a list of domain names from the database. From each of the domain names you strip off the top-level-domain at the end first.
Then you walk character-by-character from left to right of the resulting string and collect triplets of the characters from that string, like this:
example.com => ['exa', 'xam', 'amp', 'mpl', 'ple']
You store those triplets as the keys of the array, which is a nice idea, and you also count them, which doesn't add to the memory consumption. However, my guess is that the sheer number of possible triplets (with 26 letters and 10 digits that's 36^3 = 46,656 possibilities, each taking at least 3 bytes just for the key, plus PHP's per-element overhead) eats quite a lot of your memory limit.
Probably someone else can tell you how PHP uses memory with its database cursors; I don't know, but you can do one trick to profile your memory consumption.
Put calls to memory_get_usage():
before and after each iteration, so you'll know how much memory was spent on each cursor advancement,
before and after each addition to $possibilities.
And just print them right away, so you can run your code and see in real time what uses your memory and how seriously.
Also, try to unset($item) at the end of each iteration. It may actually help.
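A minimal sketch of that instrumentation (memory_get_usage() is a built-in; the loop body is abbreviated):

foreach ($result as $item) {
    $before = memory_get_usage();

    // ... build the triplets and update $possibilities here ...

    $after = memory_get_usage();
    echo "iteration cost: " . ($after - $before) . " bytes, total: " . $after . " bytes\n";

    unset($item); // as suggested above; may actually help
}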
Knowing which specific database access library you are using to obtain the $result iterator would help immensely.
Given the tiny (pretty useless) code snippet you've provided, I want to give you a MySQL answer, but I'm not certain you're using MySQL?
But
- Optimise your table.
- Use EXPLAIN to optimise your query, and rewrite it to put as much of the logic in the query as possible rather than in the PHP code.
Edit: if you're using MySQL, prepend EXPLAIN before your SELECT keyword and the result will show you an explanation of how the query you give MySQL actually turns into results.
- Rather than calling PHP's strlen() on every loop iteration, you can treat the string as an array of characters, thus:
for ($i = 0; isset($domainWord[$i + 2]); $i++) {
(Use isset() rather than !empty() here, because empty('0') is true in PHP and !empty() would end the loop early on a domain containing the character "0".)
- In your MySQL (if that's what you're using), add a LIMIT clause that breaks the query into 3 or 4 chunks, say 25k rows per chunk, which will fit comfortably within your observed capacity of 66k rows. Burki had this good idea.
At the end of each chunk, clean up all the strings and restart, wrapped in a loop:
$z = 0;
while ($z < 4) {
    // grab a chunk of data from the database; preserve only your output
    $z++;
}
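A sketch of that chunked fetch with PDO and LIMIT/OFFSET (MySQL assumed; the domains table name is made up to match the $item->address / $item->tld fields in the question):

$chunkSize = 25000;
$possibilities = array();

for ($z = 0; $z < 4; $z++) {
    $stm = $pdo->prepare("SELECT address, tld FROM domains LIMIT :lim OFFSET :off");
    $stm->bindValue(':lim', $chunkSize, PDO::PARAM_INT);
    $stm->bindValue(':off', $z * $chunkSize, PDO::PARAM_INT);
    $stm->execute();

    while ($item = $stm->fetch(PDO::FETCH_OBJ)) {
        // ... update $possibilities exactly as in the original loop ...
    }
    $stm = null; // free the result set before the next chunk
}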
But probably more important than any of these: provide enough details in your question!
- What is the data you want to get?
- What are you storing your data in?
- What are the criteria for finding the data?
These answers will help people far more knowledgeable than me show you how to properly optimise your database.

What's the best way to generate large amounts of unique promo codes?

I am trying to create promo codes in large batches (with php/mysql).
Currently my code looks something like this:
$CurrentCodesInMyDB = "asdfasdf,asdfsdfx"; // this is a giant comma delimited string of the current promo codes in the database.
$PromoCodes = "";
for ($i = 0; $i <= 30000; $i++) {
    $NewCode = GetNewCode($PromoCodes, $CurrentCodesInMyDB);
    $PromoCodes .= $NewCode . ","; // this string gets used to allow them to download a txt file
    // insert $NewCode into database here
}

function GetNewCode($CurrentList, $ExistingList)
{
    $NewPromo = GetRandomString();
    if (strpos($CurrentList, $NewPromo) === false && strpos($ExistingList, $NewPromo) === false) {
        return $NewPromo;
    } else {
        return GetNewCode($CurrentList, $ExistingList);
    }
}

function GetRandomString()
{
    return "xc34cv87"; // some random 8-character alphanumeric string
}
When I do batches of 10k it seems to be OK, but the client would like to generate 30k at a time. When I bump the loop up to 30k I've been having issues. Are there any obvious performance tweaks I could make, or maybe a different way to do this?
You probably don't need to have all 30,000 codes loaded into memory in a single giant string. Just create a table in your database, add a unique code field (either a primary key or a unique index), and insert random codes until you have 30,000 successful insertions.
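A hedged sketch of that approach with MySQL and PDO (the promo_codes table is made up, and GetRandomString() is assumed to return a genuinely random 8-character string rather than the stub above):

// Hypothetical table: CREATE TABLE promo_codes (code CHAR(8) PRIMARY KEY);
$stm = $pdo->prepare("INSERT IGNORE INTO promo_codes (code) VALUES (?)");

$inserted = 0;
while ($inserted < 30000) {
    $stm->execute(array(GetRandomString()));
    $inserted += $stm->rowCount(); // rowCount() is 0 when the code already existed
}

The unique index does the duplicate checking in the database instead of string scans in PHP.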
What kind of issues, specifically?
My advice is: don't store the codes in CSV format; instead create a new indexed column and store each code in its own row. Also, use prepared queries.
Doing 60,000 strpos() calls on a ~250 KB string might not be the best idea ever...
If you don't want to do inserts inside the loop (they are also expensive), use an array and in_array() to check for the string. Look in the comments for the in_array() function; someone there points out that you can achieve better performance using array_flip() and then checking for the array key:
http://www.php.net/manual/en/function.in-array.php#96198
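Applied to the code in the question, a sketch of that flipped-array idea (again assuming GetRandomString() really is random):

// Load the existing codes once, as array keys.
$existing = array_flip(explode(',', $CurrentCodesInMyDB));

$newCodes = array();
while (count($newCodes) < 30000) {
    $code = GetRandomString();
    // isset() on keys avoids rescanning a growing comma-separated string.
    if (!isset($existing[$code]) && !isset($newCodes[$code])) {
        $newCodes[$code] = true;
    }
}
$PromoCodes = implode(',', array_keys($newCodes));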

PHP Change Array Over and Over

I have an array:
$num_list = array(42=>'0',44=>'0',46=>'0',48=>'0',50=>'0',52=>'0',54=>'0',56=>'0',58=>'0',60=>'0');
and I want to change specific values as I go through a loop:
while (list($pq, $oin) = mysql_fetch_row($result2)) {
    $num_list[$oin] = $pq;
}
So I want to change, for example, 58 to 403 rather than 0.
However, I always end up getting just the last change and none of the earlier ones. So it always ends up being something like
0,0,0,0,0,0,0,0,0,403
rather than
14,19,0,24,603,249,0,0,0,403
How can I do this so it doesn't overwrite it?
Thanks
Well, you explicitly coded that each entry should be replaced with the value from the database (even with "0").
You could replace the values only when they are non-zero:
while (list($pq, $oin) = mysql_fetch_row($result2)) {
    if ($pq !== "0") $num_list[$oin] = $pq;
}
I'm not completely clear on what you're asking; I thought this was all you wanted. Check this:
while (list($pq, $oin) = mysql_fetch_row($result2)) {
    if ($oin == 58) {
        $num_list[$oin] = $pq;
    }
}
In my simulated tests (although you are very scarce with information), your code works well and produces the result that you want. Check the second query parameter that you put into the array, namely $pq; that may be where your 0,0,0,0,0...403 comes from. The other possibility is that your $oin numbers are not present in the $num_list keys.
I tested your code with the mysqli driver, though, but the fetch_row resource extraction works the same.
Bear in mind one more thing: if your query returns more records than $num_list has entries, and the $oin numbers are not unique, your $num_list may easily be overwritten by the following data; $num_list may also pick up a lot of additional unwanted elements.
Always try to provide the wider context of your problem; there could be many ways to solve it, and help would arrive sooner.
