simpleXML: trying to understand what is using memory

Situation:
I download and process multiple XML files. I download the first file and open it with $xml_file = simplexml_load_file( dirname(__FILE__). '/_downloaded_files/filename.xml' );
I go through the file, create variables to insert into MySQL, and run the inserts. Then I do the same with the next XML files.
After I have processed an opened XML file, I unset (set to null) the variables, like $xml_file = null;. I also tried unset($xml_file); and saw no difference. Somewhere I found the advice to use gc_enable(); gc_collect_cycles();, which also made no difference (no effect).
After the MySQL code has executed, I also set all used variables to null.
As a result, with echo '<pre>', print_r(get_defined_vars(), true), '</pre>'; I see something like
[one_variable] =>
[another_variable] => 1
I see many (~100) variables with empty content (or one short value per variable).
But with echo (memory_get_peak_usage(false)/1024/1024); I see 38.01180267334 MiB of memory used.
Can someone advise where the problem is? ~100 empty variables cannot use 38 megabytes... What else may be using the memory?
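One thing worth separating here is current versus peak usage: memory_get_peak_usage() reports the high-water mark of the whole request, so it will not drop after the variables are freed, even if the memory really was released. A minimal sketch (the path is the one from the question, the rest is illustrative):
echo 'before load: ', round(memory_get_usage(false) / 1048576, 2), " MiB\n";

$xml_file = simplexml_load_file(dirname(__FILE__) . '/_downloaded_files/filename.xml');
// ... walk the document and build the MySQL inserts here ...

echo 'after load:  ', round(memory_get_usage(false) / 1048576, 2), " MiB\n";

$xml_file = null; // or unset($xml_file);
gc_collect_cycles();

// Current usage should drop back down; the peak stays at its maximum for the request.
echo 'after unset: ', round(memory_get_usage(false) / 1048576, 2), " MiB\n";
echo 'peak:        ', round(memory_get_peak_usage(false) / 1048576, 2), " MiB\n";
If the "after unset" number is low while the peak stays high, the 38 MiB was simply the largest amount needed at some point (the parsed document plus the variables built from it), not memory that is still in use.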

Related

XML reading script using PHP incompletely reads some elements

I have an XML data source URL from which I am reading the data using fread. It contains student information, from which I am extracting the grades and compiling them into an array.
The problem is that when I run this script locally, it works fine and all the grades are correctly collected in the array. However, when I run it on a shared server, I get some incorrectly read grades in addition to the normal grade names, for example "ergarten". The complete grade name "Kindergarten" is also recorded in the array, which means the problem affects only specific elements.
The first suspect I had in mind was the fread byte length. I changed it to 8192, but without luck.
Here is the relevant code chunk from the php file:
if (!($xml_parser = xml_parser_create())) die("Couldn't create parser.");
xml_set_element_handler($xml_parser, "startElementHandler", "endElementHandler");
xml_set_character_data_handler($xml_parser, "characterDataHandler");
while ($data = fread($fp, 8192)) {
    if (!xml_parse($xml_parser, $data, feof($fp))) {
        break;
    }
}
xml_parser_free($xml_parser);
Any thoughts?
I found the problem and fixed it myself.
The problem was that in the loop where the data was being read in chunks using fread, I was simultaneously feeding that data to the XML parser, and that caused the problem, since the chunks of data do not always contain complete tags. I moved the parsing out of that point so it only runs once all the data has been read by the script.
That solved the problem.
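A minimal sketch of that fix, assuming the handler names and the $fp stream from the question: the loop only accumulates the data, and xml_parse() is called once on the complete document.
// Read the whole response first...
$data = '';
while (!feof($fp)) {
    $data .= fread($fp, 8192);
}
fclose($fp);

// ...then parse it in one go, so no element or character data is split across chunk boundaries.
if (!($xml_parser = xml_parser_create())) die("Couldn't create parser.");
xml_set_element_handler($xml_parser, "startElementHandler", "endElementHandler");
xml_set_character_data_handler($xml_parser, "characterDataHandler");
if (!xml_parse($xml_parser, $data, true)) {
    die(xml_error_string(xml_get_error_code($xml_parser)));
}
xml_parser_free($xml_parser);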

phpseclib 0.3.1 - sftp get - When I leave the local file blank, I don't get the correct content of the file

If I use a local filename, the file is properly copied; however, if you leave the local filename empty, you are supposed to receive the content of the file.
Example code:
$stat = $sftp->get('xmlfile.cml','xmlfile.xml');
print "$stat";
(This works fine)
$xmlcontent = $sftp->get('cp1301080801_status.xml');
print "Content of file = $xmlcontent<>";
(This prints what looks more like the stat of the file instead of the content. It starts with the date (which is the modified timestamp of the file), followed by some numbers and the name of the web server repeated about 10 times, with a number after it that increases each time - like maybe a port number or byte offset.)
It would make things easier if I didn't have to fopen the local file after the transfer. Does anyone have an idea what is going on here?
Can you post a copy of the logs? Here's an example of how to get them:
http://phpseclib.sourceforge.net/ssh/examples.html#logging
Note the define() and the $ssh->getLog() stuff.
As for the specific problem you're having... what does print "$stat" do? It should print "1".
Also, fwiw, you're fetching two different files in your example. My best guess, atm, is that you think you're opening the same file and expecting the content to be the same, when in fact they should be different, and that what you're getting from both of the $sftp->get() calls is, in fact, correct.
The logs will tell us for sure.
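For reference, a minimal sketch of that logging, modelled on the linked examples page (host, credentials and file name are placeholders; the constant and method names assume the phpseclib 0.3.x API):
include 'Net/SFTP.php';

// Logging must be enabled via define() before the connection is made.
define('NET_SSH2_LOGGING', NET_SSH2_LOG_COMPLEX);

$sftp = new Net_SFTP('www.domain.tld');
$sftp->login('username', 'password');

$xmlcontent = $sftp->get('cp1301080801_status.xml');

// Net_SFTP extends Net_SSH2, so the getLog() mentioned above is available here.
echo $sftp->getLog();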

Include -- using a string rather than a filename

I'd like to run include on a string rather than a file, but am unaware of how to achieve this.
//This is the desired functionality
include($filename);
//But I want to do something like this instead.
$file_contents = getFileFromCacheOrSomewhereElse($filename);
include($file_contents); // Doesn't work...
eval($file_contents); // Also incorrect.
Please note: "eval" is not the same as include -- "include" echoes out the contents of the file (and executes any PHP tags) while "eval" executes the string as PHP code.
An example use case is loading a template file from Memcache (as a string), then running include on that string, rather than running include and relying on PHP filecache.
If you can turn on the allow_url_fopen and allow_url_include php.ini settings, then an alternative is the data stream wrapper (manual).
include 'data:text/plain,' . urlencode($file_contents);
eval("?>" . $file_contents . "<?php ");
does it.
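As a usage sketch tying this to the use case above (getFileFromCacheOrSomewhereElse() is the hypothetical helper from the question):
// Hypothetical helper from the question: returns the raw template source as a string.
$file_contents = getFileFromCacheOrSomewhereElse($filename);

// The leading "?>" drops out of PHP mode, so literal markup in the string is echoed
// and any embedded PHP blocks are executed, just as include would do for a file.
eval('?>' . $file_contents);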
Storing PHP code in memcache is not the best idea.
And eval'ing it afterwards is even worse.
Any opcode cache, APC or eAccelerator, will cache your PHP files on the fly, with no strange efforts like this, and even parse them for faster execution.
EDIT: Given the voting results after all these years, I assume that this question is attracting only noobs who have the same strange whim. So I have to repeat: although it defeats your brilliant idea,
just leave your includes as they are.
They will be cached much better and executed much faster by PHP's internal opcode cache.

PHP: Writing and sorting a file

I'm trying to write a PHP function that takes $name and $time, writes them to a txt file (no MySQL), and keeps the file sorted numerically.
For example:
10.2342 bob
11.3848 CandyBoy
11.3859 Minsi
12.2001 dj
Here, Minsi was just added under a faster time, for example.
If the $name already exists in the file, it should only be rewritten if the new time is faster (smaller) than the previous one, and an entry should only be written if its time fits within the top 300 entries, to keep the file small.
File writing isn't my forte, but my guess was to use file() to turn the whole file into an array; to no avail, it didn't work quite like I wanted. Any help would be appreciated.
If your data sets are small, you may consider using var_export()
function dump($filename, array &$data) {
    return file_put_contents($filename, '<?php return ' . var_export($data, true) . ';');
}
// create a data set
$myData = array('alpha', 'beta', 'gamma');
// save a data set
dump('file.dat', $myData);
// load a data set
$myData = require('file.dat');
Perform your sorts using the PHP array_* functions, and dump when necessary. var_export() saves the data as PHP parsable text, which is why the dump() function prepends the string <?php return. Of course, this is really only a viable option when your data sets are going to be small enough that keeping their contents in memory is not unreasonable.
Try creating a multi dimensional array "$timeArray[key][time] = name" and then sort($timeArray)
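A rough sketch of the overall flow the question describes (the 300-entry cap is from the question; the file and function names here are made up for illustration):
function addTime($file, $name, $time, $max = 300) {
    // Load existing "time name" lines into $times[name] => time.
    $times = array();
    if (is_readable($file)) {
        foreach (file($file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) as $line) {
            list($t, $n) = explode(' ', $line, 2);
            $times[$n] = (float) $t;
        }
    }

    // Only keep the new time if the name is new or the time is faster (smaller).
    if (!isset($times[$name]) || $time < $times[$name]) {
        $times[$name] = $time;
    }

    // Sort numerically by time and keep at most $max entries.
    asort($times);
    $times = array_slice($times, 0, $max, true);

    // Write the file back in "time name" format.
    $lines = array();
    foreach ($times as $n => $t) {
        $lines[] = $t . ' ' . $n;
    }
    return file_put_contents($file, implode("\n", $lines) . "\n");
}

addTime('times.txt', 'Minsi', 11.3859);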

PHP - *fast* serialize/unserialize?

I have a PHP script that builds a binary search tree over a rather large CSV file (5MB+). This is nice and all, but it takes about 3 seconds to read/parse/index the file.
Now I thought I could use serialize() and unserialize() to quicken the process. When the CSV file has not changed in the meantime, there is no point in parsing it again.
To my horror I find that calling serialize() on my index object takes 5 seconds and produces a huge (19MB) text file, whereas unserialize() takes an unbearable 27 seconds to read it back. Improvements look a bit different. ;-)
So - is there a faster mechanism to store/restore large object graphs to/from disk in PHP?
(To clarify: I'm looking for something that takes significantly less than the aforementioned 3 seconds to do the de-serialization job.)
var_export should be lots faster, as PHP won't have to process the string at all:
// export the processed CSV to export.php
$php_array = read_parse_and_index_csv($csv); // takes 3 seconds
$export = var_export($php_array, true);
file_put_contents('export.php', '<?php $php_array = ' . $export . '; ?>');
Then include export.php when you need it:
include 'export.php';
Depending on your web server setup, you may have to chmod export.php so it is readable by the web server first.
Try igbinary... it did wonders for me:
http://pecl.php.net/package/igbinary
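For context, igbinary is a drop-in binary replacement for serialize()/unserialize(); a minimal sketch (the file name is made up, and the PECL extension has to be installed):
// Requires the igbinary PECL extension.
file_put_contents('index.bin', igbinary_serialize($index));
// ...and later, instead of unserialize():
$index = igbinary_unserialize(file_get_contents('index.bin'));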
First, you have to change the way your program works: divide the CSV file into smaller chunks. This is an IP datastore, I assume.
Convert all IP addresses to integers.
That way, when a query comes in, you know which part to look in.
PHP's ip2long() and long2ip() functions do this.
So, across the 0 to 2^32 range, split all the IP addresses (say 5000K entries at 50K each) into about 100 smaller files.
This approach gives you quicker serialization.
Think smart, code tidy ;)
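A rough sketch of that bucketing idea (the file name and the 100-way split are illustrative):
$buckets = 100;
$ip      = '192.168.10.5';

// ip2long() can return negative values on 32-bit builds; sprintf('%u', ...) normalizes it.
$n = sprintf('%u', ip2long($ip));

// 2^32 addresses divided into $buckets equal ranges.
$bucket = (int) floor($n / (4294967296 / $buckets));

// Only this one small chunk has to be loaded and unserialized to answer the query.
$file = sprintf('ipdata_%03d.dat', $bucket);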
It seems that the answer to your question is no.
Even if you discover a "binary serialization format" option, most likely even that would be too slow for what you envisage.
So, what you may have to look into using (as others have mentioned) is a database, memcached, or an online web service.
I'd like to add the following ideas as well:
caching of requests/responses
your PHP script does not shut down but becomes a network server that answers queries
or, dare I say it, change the data structure and query method you are currently using
I see two options here.
String serialization, in the simplest form something like
write => implode("\x01", (array) $node);
read => explode() + $node->payload = $a[0]; $node->value = $a[1]; etc.
Binary serialization with pack()
write => pack("fnna*", $node->value, $node->le, $node->ri, $node->payload);
read => $node = (object) unpack("fvalue/nle/nri/a*payload", $data);
It would be interesting to benchmark both options and compare the results.
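A small sketch of the first option as a round trip, assuming a node with the payload/value/le/ri fields used above:
// Write: flatten the node's properties into one \x01-delimited line.
function nodeToLine($node) {
    return implode("\x01", array($node->payload, $node->value, $node->le, $node->ri));
}

// Read: split the line back into the same properties.
function lineToNode($line) {
    $a = explode("\x01", $line);
    $node = new stdClass();
    $node->payload = $a[0];
    $node->value   = (float) $a[1];
    $node->le      = (int) $a[2];
    $node->ri      = (int) $a[3];
    return $node;
}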
If you want speed, writing to or reading from the file system is less than optimal.
In most cases, a database server will be able to store and retrieve data much more efficiently than a PHP script that is reading/writing files.
Another possibility would be something like Memcached.
Object serialization is not known for its performance but for its ease of use, and it's definitely not suited to handling large amounts of data.
SQLite comes with PHP; you could use that as your database. Otherwise, you could try using sessions; then you don't have to serialize anything, you're just saving the raw PHP object.
What about using something like JSON for a format for storing/loading the data? I have no idea how fast the JSON parser is in PHP, but it's usually a fast operation in most languages and it's a lightweight format.
http://php.net/manual/en/book.json.php
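For a quick feel of that approach (the file name is made up), the round trip would be:
// Only applies if the tree/index can be represented as nested arrays,
// since json_encode() does not preserve arbitrary object types.
file_put_contents('index.json', json_encode($index));
$index = json_decode(file_get_contents('index.json'), true); // true => associative arrays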
