It seems like a pretty simple issue and one that I thought for sure would have been asked before but I've been searching for some time now and have not found a solution.
When using PHP's file() command, it reads the file and puts each line in an array as a String. Is there any way to read each line as an integer?
(I know that I could just loop through the array and convert each value to an int, but I figure that will slow it down somewhat.)
Nope. The only way is to do it manually. Fortunately, that's easy:
$lines = array_map('intval', file($path));
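For what it's worth, intval() ignores the trailing newline on each line, so this one-liner works on the raw lines. If your file might end with a blank line, a variant using file()'s flags (my addition, not strictly necessary) would be:
$lines = array_map('intval', file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES));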
I'm looking for a way to serialize large arrays to a file in PHP.
Right now I use a simple JSON format. Unfortunately, to store JSON to a file you need to convert it to a string first with json_encode and then write the string to a file. During this process the amount of used memory almost doubles (a bit less in practice, but close). And in some cases that can be a problem if things are happening concurrently.
My question is: is there a PHP library (preferably binary) which can serialize an array to a file (a JSON format would be nice) without converting the object to a string first and thus 'doubling' the memory? If the output can be compressed with GZIP, that would be even better.
Any other suggestions for writing (and reading) large objects without an intermediate format/state are welcome too.
If memory is the only concern
At the risk of being called Captain Obvious, I'd like to suggest a weird approach I like to use when there isn't enough memory and the data only fits into memory once. Also, if garbage collection doesn't kick in, that can be solved by doing the job in several steps, as this article explains.
What I mean is something like this:
function packWithoutExhaustingMemory(array $a) {
    foreach ($a as $key => $value) {
        $a[$key] = gzcompress(serialize($value)); // but only one piece at a time!
    }
    return $a;
}
Again, not sure if this exact piece will do the job but it illustrates the concept.
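To actually get the result onto disk without building one huge string, a rough sketch could write each packed piece as it goes (the $bigArray variable, the file path, the tab separator, and the base64 step are all my own assumptions; base64 because gzcompress() output is binary):
$fh = fopen('/tmp/bigarray.dat', 'w');
foreach (packWithoutExhaustingMemory($bigArray) as $key => $packed) {
    fwrite($fh, $key . "\t" . base64_encode($packed) . "\n"); // one piece per line
}
fclose($fh);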
Hi,
I'm creating a script to load a binary file into an array and then parse the array myself (creating another array with decoded binary data: IA5String, Int, String, etc., basically ASN.1) and then create a .csv.
The problem that slows my script down is loading the hex values into an array, and I'm using this method:
$hex = explode(" ",rtrim(chunk_split(bin2hex(file_get_contents($filename)),2,' ')));
The thing is that the explode() function is taking a lot of time and resources, and I was wondering if there is another, faster or maybe simpler, solution to save some running time.
Thanks
str_split() is another function that converts a string to an array, and it lets you skip the chunk_split()/explode() round trip entirely.
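For example, a minimal sketch (same $filename as in your snippet):
$hex = str_split(bin2hex(file_get_contents($filename)), 2);
str_split() with a length of 2 returns the hex dump already cut into 2-character chunks, so there is nothing left to explode.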
I'm trying to think of the most efficient way to parse a file that stores names, studentids and Facebook ids. I'm trying to get the fbid value, so for this particular line it would be: 1281766051. I thought about using regex for this, but I'm a bit lost as to where to start. I thought about adding all this data to an array and chopping away at it, but it just seems inefficient.
{"name":"John Smith","studentid":"10358595","fbid":"1284556651"}
I apologise if the post is too brief. I'll do my best to add anything that I might have missed out. Thanks.
Well, this seems to be JSON, so the right way would be
$json = json_decode($str);
$id = $json->fbid;
The regex solution would look like this:
preg_match('/"fbid":"(\d+)"/', $str, $matches);
$id = $matches[1];
But I cannot tell you off the top of my head which of these is more efficient. You would have to profile it.
UPDATE:
I performed a very basic check on execution times (nothing too reliable, I just measured 1,000,000 executions of both codes). For your particular input, the difference is rather negligible:
json_decode: 27s
preg_match: 24s
However, if your JSON records get larger (for example, if I add 3 fields to the beginning of the string (so that both solutions are affected)), the difference becomes quite noticeable:
json_decode: 46s
preg_match: 30s
Now, if I add the three fields to the end of the string, the difference becomes even larger (obviously, because preg_match does not care about anything after the match):
json_decode: 45s
preg_match: 24s
Even so, before you apply optimizations like this, perform proper profiling of your application and make sure that this is actually a crucial bottleneck. If it is not, it's not worth obscuring your JSON-parsing code with regex functions.
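For reference, a rough sketch of the kind of timing loop involved (this is a reconstruction, not the exact benchmark above; the iteration count matches the 1,000,000 runs mentioned):
$str = '{"name":"John Smith","studentid":"10358595","fbid":"1284556651"}';
$start = microtime(true);
for ($i = 0; $i < 1000000; $i++) {
    $json = json_decode($str);
    $id = $json->fbid;
}
printf("json_decode: %.2fs\n", microtime(true) - $start);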
That's JSON, use:
$str = '{"name":"John Smith","studentid":"10358595","fbid":"1284556651"}';
$data = json_decode($str);
echo $data->fbid;
Cheers
Use json_decode
$txt='{"name":"John Smith","studentid":"10358595","fbid":"1284556651"}';
$student = json_decode($txt);
echo $student->fbid;
How would I split a string every 4 characters and then have each group of 4 characters put into a separate variable so I can do other stuff with each individual chunk?
Well, I don't really see you putting each of them in a different variable, because then you'll have like 10 variables:
$var1 = ...;
$var2 = ...;
$var3 = ...;
But you could use the str_split function as follows:
$variable = str_split($originalvar, 4);
It will create an array, which you can access from:
$variable[0] to $variable[count($variable) - 1]
If you don't know about arrays, you really should read up on those. They are really great if you want to store a lot of similar information in an object and you don't want to create a new variable for it every time. Also you can loop over them very easily and do some other great stuff with them.
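For example, a quick sketch of looping over the 4-character chunks (variable name matching the snippet above):
foreach (str_split($originalvar, 4) as $i => $chunk) {
    echo "Chunk $i: $chunk\n";
}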
I assume you probably googled this question a bit too. You must have come across http://php.net/manual/en/function.str-split.php. Did you see this page, and if you did, did you have some trouble with it? If so, perhaps we can help you with reading the documentation.
Have a look at the str_split function. The PHP manual will tell you all you need to know.
I have a list of 50,000 ID's in a flat file and need to remove any duplicate ID's. Is there any efficient/recommended algorithm for my problem?
Thanks.
You can use the command line sort program to order and filter the list of ids. This is a very efficient program and scales well too.
sort -u ids.txt > filteredIds.txt
Read into a dictionary line by line, discarding duplicates. When all read, write out to a new file.
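A minimal PHP sketch of that approach, using array keys as the dictionary (the file names here just follow the other answers):
$seen = [];
$in  = fopen('ids.txt', 'r');
$out = fopen('filteredIds.txt', 'w');
while (($line = fgets($in)) !== false) {
    $id = trim($line);
    if ($id !== '' && !isset($seen[$id])) {
        $seen[$id] = true;          // array keys act as the dictionary
        fwrite($out, $id . "\n");
    }
}
fclose($in);
fclose($out);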
I did some experiments once, and the fastest solution I could get in PHP was sorting the items and manually removing all the duplicates.
If performance isn't that much of an issue for you (which I suspect; 50,000 is not that much), then you can use array_unique(): http://php.net/array_unique
I guess if you have a large enough memory allowance, you can put all these IDs in an array:
$array[$id] = $id;
This would automatically weed out the dupes.
You can do:
file_put_contents($file, implode("\n", array_unique(file($file, FILE_IGNORE_NEW_LINES))));
How does it work?
1. Read the file using the function file(), which returns an array.
2. Get rid of the duplicate lines using array_unique().
3. Implode those unique lines with "\n" to get a string.
4. Write the string back to the file using file_put_contents().
This solution assumes that you've got one ID per line in the flat file.
You can do it via array_unique(). In this example I guess your IDs are separated by line breaks; if that's not the case, just change it:
$file = file_get_contents('/path/to/file.txt');
$array = explode("\n",$file);
$array = array_unique($array);
$file = implode("\n",$array);
file_put_contents('/path/to/file.txt',$file);
If you can just explode the contents of the file on a comma (or any delimiter), then array_unique will produce the least (and cleanest) code; otherwise, if you are parsing the file, going with $array[$id] = $id is the fastest and cleanest solution.
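For the comma-delimited case, a minimal sketch (assuming the whole list sits on one line of ids.txt):
$ids = array_unique(explode(',', trim(file_get_contents('ids.txt'))));
file_put_contents('ids.txt', implode(',', $ids));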
If you can use a terminal (or native Unix execution), the easiest way (assuming there is nothing else in the file) is:
sort < ids.txt | uniq > filteredIds.txt