Not sure if this is possible, but it's become an academic exercise for me now.
Using the __halt_compiler() trick to embed binary data in a PHP file, I've successfully created a self-opening script which will fseek() to __COMPILER_HALT_OFFSET__ (not too hard, seeing as this precise example is documented in the manual).
Anyway, I've stored a small lump of binary ZIP data (a single folder containing a single file that says "hello world") after my call to __halt_compiler().
What I've tried to do is copy the data directly to the php://temp stream, and have done so with success (if I rewind() and passthru() the temporary stream handle, it dumps the data):
$php = fopen(__FILE__, 'rb');
$tmp = fopen('php://temp', 'r+b');
fseek($php, __COMPILER_HALT_OFFSET__);
stream_copy_to_stream($php, $tmp);
My problem comes when trying to now open php://temp [1] with zip_open():
$zip = zip_open('php://temp');
[1] From what I can see (despite other possibilities, such as a lack of stream support in zip_open()), the problem here is the inherent non-permanence of data in php://memory and php://temp streams between handles. If this can be worked around, perhaps it is in fact possible.
It keeps kicking back error code 11, on which I have found no documentation [2] (seemingly, like most other possible error codes):
var_dump($zip); // int(11)
[2] As @cweiske pointed out, error code 11 = ZipArchive::ER_OPEN, "Can't open file".
Is this a consequence of my attempt at using the php://temp stream, or some other issue? I'm also aware there is an OOP approach (ZipArchive, et al.), but I figured I'd start with the basics.
Any ideas?
11 is the constant ZIPARCHIVE::ER_OPEN, which the manual describes with
Can't open file
Note that the manual does not state that stream wrappers may be used.
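If you want to keep the ZIP approach anyway, one possible workaround (a sketch only, not tested against your exact setup; the entry name is hypothetical) is to copy the embedded bytes to a real temporary file, which the zip functions can open by path:

$php = fopen(__FILE__, 'rb');
fseek($php, __COMPILER_HALT_OFFSET__);

// copy the embedded ZIP bytes to a real file, since the zip functions
// expect an actual path rather than a php://temp stream
$tmpfile = tempnam(sys_get_temp_dir(), 'zip');
$out = fopen($tmpfile, 'wb');
stream_copy_to_stream($php, $out);
fclose($out);
fclose($php);

$zip = new ZipArchive();
if ($zip->open($tmpfile) === true) {
    echo $zip->getFromName('folder/hello.txt'); // hypothetical entry name
    $zip->close();
}
unlink($tmpfile);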
Please think about using PHP's phar extension - it does what you want, and is well tested.
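For illustration, a minimal sketch of the phar approach (assuming the phar extension is enabled and phar.readonly is set to 0 in php.ini so the archive can be created):

// build a phar containing a single file...
$phar = new Phar('bundle.phar');
$phar->addFromString('hello/world.txt', "hello world\n");
$phar->setStub(Phar::createDefaultStub());

// ...and read the embedded file back via the phar:// stream wrapper
echo file_get_contents('phar://bundle.phar/hello/world.txt');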
Related
I am writing some json results in files in PHP on shared hosting (fwrite).
Then I read those files to extract json results (file_get_contents).
It sometimes happens (maybe once in more than a thousand reads) that when I read this file it appears truncated: only a multiple of the first 32768 bytes of the file is readable.
I added some code to copy the file I am reading whenever the JSON string is not valid, and I then end up with two different files: the original one was correctly written (it contains a valid JSON string), while the copy contains only the beginning of the original and has a size of x*32768 bytes.
Would you have any idea of what could be the problem and how to solve this? (I don't know how to investigate further)
Thank you
Without example code it is impossible to give a 'fix my code' answer, but when doing file write/read sort of programming, you should follow a simple process (which, from the description, is missing one fairly critical step!)
First, write to a TEMP file (you are writing to a file, but it is important here to write to a TEMP file first - otherwise, you could have race conditions).
An easy way to do that in PHP:
$yourData = "whateverYourDataIs....";
$goodfilename = 'whateverYourGoodFileNameIsForYourData.json';
$tempfilename = 'tempfile' . time(); // MANY ways to do this (lots of SO posts on it) - just get a unique name every time you write. ('unique' may not be needed if you only occasionally write, but it is a good safety measure to avoid collisions, and time() works for many programs.)

// Now, write your data to the TEMP file. Note that fwrite() needs a file
// handle from fopen(), not a file name.
$handle = fopen($tempfilename, 'wb');
$fwrite = ($handle === false) ? false : fwrite($handle, $yourData);
if ($handle !== false) {
    fclose($handle);
}

if ($fwrite === false) {
    // The write failed, so do whatever 'error' function you may need.
    // Since it failed there should be no complete file, but it is not a bad idea to attempt to delete it.
    @unlink($tempfilename);
}
else {
    // The write succeeded, so let's do a 'sanity check' on the file to make sure it is good JSON (this is a 'paranoid' check, but "better safe than sorry", right?)
    if (json_decode(file_get_contents($tempfilename)) !== null) {
        // We know the file is good JSON, so now RENAME it into place (this is really fast, so collisions are almost impossible). NOTE: see http://php.net/manual/en/function.rename.php comments for some potential challenges and workarounds if you have trouble with rename.
        rename($tempfilename, $goodfilename);
    }
    // Now, the GOOD file will contain your new data - and those read issues are gone! (Though never say 'never' - it may be possible, but very unlikely!)
}
This may or may not be your issue directly, and you will have to adapt it to fit your code, but as a safety factor - and a good way to avoid collisions - it should give you ~100% read success, which I believe is what you are after!
If this doesn't help, then some direct code will be needed to provide a more complete answer.
As suggested by @UlrichEckhardt's comment, it was due to a read/write concurrency problem: I was trying to read a file that was still being written. I solved this by simply waiting and then trying to read the file again.
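For reference, a minimal sketch of that wait-and-retry approach (the file name, retry count, and delay are placeholders):

// hypothetical retry loop: re-read the file after a short pause until it
// decodes as valid JSON, or give up after a few attempts
$result = null;
for ($attempt = 0; $attempt < 5; $attempt++) {
    $raw = file_get_contents('results.json'); // placeholder file name
    if ($raw !== false) {
        $result = json_decode($raw);
        if ($result !== null) {
            break; // valid JSON - the writer has finished
        }
    }
    usleep(100000); // wait 100 ms before retrying
}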
Okay, this is driving me crazy. I have a small file. Here is the dropbox link https://www.dropbox.com/s/74nde57f07jj0zj/transcript.txt?dl=0.
If I try to read the content of the file using Python's f.read(), I can easily read it. But if I try to run the same Python program via PHP's shell_exec(), the read fails. This is the error I get:
Traceback (most recent call last):
File "/var/www/python_code.py", line 2, in <module>
transcript = f.read()
File "/opt/anaconda/lib/python3.4/encodings/ascii.py", line 26, in decode
return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 107: ordinal not in range(128)
I have checked all the permission issues and there is no problem with that.
Can anyone kindly shed some light?
Here is my python code.
f = open('./transcript/transcript.txt', 'r')
transcript = f.read()
print(transcript)
Here is my PHP code.
$output = shell_exec("/opt/anaconda/bin/python /var/www/python_code.py");
Thank you!
EDIT: I think the problem is in the file content. If I replace the content with a simple 'I eat rice', then I can read it from PHP. But the current content cannot be read. I still don't know why.
The problem appears to be that your file contains non-ASCII characters, but you're trying to read it as ASCII text.
Either it is text, but is in some encoding or other that you haven't told us (probably UTF-8, Latin-1, or cp1252, but there are countless other possibilities), or it's not text at all, but rather arbitrary binary data.
When you open a text file without specifying an encoding, Python has to guess. When you're running from inside the terminal or whatever IDE you use, presumably, it's guessing the same encoding that you used in creating the file, and you're getting lucky. But when you're running from PHP, Python doesn't have as much information, so it's just guessing ASCII, which means it fails to read the file because the file has bytes that aren't valid as ASCII.
If you want to understand how Python guesses, see the docs for open, but briefly: it calls locale.getpreferredencoding(), which, at least on non-Windows platforms, reads it from the locale settings in the environment. On a typical linux system that's not new enough to be based on systemd but not too old, the user's shell will be set up for a UTF-8 locale, but services will be set up for C locale. If all of that makes sense to you, you may see a way to work around your problem. If it all sounds like gobbledegook, just ignore it.
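For instance, one possible workaround along those lines (a sketch, assuming the file really is UTF-8 and that the en_US.UTF-8 locale is installed on the server) is to set the locale explicitly in the command passed to shell_exec(), so the Python child process inherits it:

// force a UTF-8 locale for the child process so that Python's
// locale.getpreferredencoding() returns UTF-8 instead of ASCII
$output = shell_exec("LC_ALL=en_US.UTF-8 /opt/anaconda/bin/python /var/www/python_code.py");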
If the file is meant to be text, then the right solution is to just pass the encoding to the open call. For example, if the file is UTF-8, do this:
f = open('./transcript/transcript.txt', 'r', encoding='utf-8')
Then Python doesn't have to guess.
If, on the other hand, the file is arbitrary binary data, then don't open it in text mode:
f = open('./transcript/transcript.txt', 'rb')
In this case, of course, you'll get bytes instead of str every time you read from it, and print is just going to print something ugly like b'aq\x9bz' that makes no sense; you'll have to figure out what you actually want to do with the bytes instead of printing them as a bytes object.
I have a problem with a connection to a telnet server using PHP sockets. I have a working telnet class, but on my other server that class does not work because of the unread_bytes value from stream_get_meta_data(). Did PHP change this in version 5.4? I can't find anything about such a change.
The code that I use:
$buff = '';
while (!feof($this->socket)) {
$buff .= fread($this->socket, 1024);
$stream_meta_data = stream_get_meta_data($this->socket);
if ($stream_meta_data['unread_bytes'] <= 0)
break;
}
Can anyone help me, or tell me what I can change?
You didn't clearly state what your code snippet is supposed to do:
read bytes until the socket connection is shut down, or
read bytes that are available at the moment, on a live connection.
But your comment "feof() dont work correctly" suggests that you're after 2., since a feof() check would be sufficient for 1.; cf. Wez's comment on the "Not a bug" report unread_bytes always 0:
unread_bytes is the number of bytes remaining in PHP's buffering layer after the last read. If you have consumed all data from the buffer on a prior read, unread_bytes will remain at zero until you next read a chunk of data from the network. So, unread_bytes should not be used to determine if more data is pending; you should use either:

feof() to detect end of file.

Don't forget that you can use non-blocking mode here. PHP 4.3 has a new function called stream_select() which behaves like socket_select() from the sockets extension, but works on all files returned by fopen() and fsockopen(). You can use it to test which files are ready for reading/writing and also specify a timeout.
So, if you want 2., you can use stream_select() or socket_select().
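For example, a minimal sketch of option 2. using stream_select() (assuming $this->socket is a connected stream in blocking mode; buffer size and timeout are placeholders):

$buff = '';
$read = array($this->socket);
$write = null;
$except = null;
// wait up to 1 second for data to become readable, then drain it
while (stream_select($read, $write, $except, 1) > 0) {
    $chunk = fread($this->socket, 1024);
    if ($chunk === false || $chunk === '') {
        break; // connection closed, or nothing more to read
    }
    $buff .= $chunk;
    $read = array($this->socket); // stream_select() modifies its arguments
}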
From a comment to this answer I read that "stream_get_contents is low-level" compared to file_get_contents. However, according to the manual, stream_get_contents is
Identical to file_get_contents(), except that stream_get_contents() operates on an already open stream resource and returns the remaining contents in a string, up to maxlength bytes and starting at the specified offset.
Which statement is correct?
Is stream_get_contents really lower level and faster?
Specifically I am interested in reading local files from HD.
I'm late here but it might help others
file_get_contents() loads the entire file content into memory as a string. It sits there in memory until the program does something with it, such as echo it to the output buffer.
A good usage example is:
echo file_get_contents('file.txt');
stream_get_contents() reads the remaining contents of an already open stream. An example is this:
$handle = fopen('file.txt', 'r'); // note: 'r' for reading - 'w+' would truncate the file first
echo stream_get_contents($handle);
As you can see, stream_get_contents() uses an existing stream created by fopen() to get the contents as a string.
file_get_contents() is generally the preferred way, as it doesn't depend on an open stream; the manual says it will use memory-mapping techniques, if supported by your OS, to enhance performance. When reading external sites, you can also set HTTP headers for the request. (Refer to https://www.php.net/manual/en/function.file-get-contents.php for more info.)
For larger files/resources, working with an open stream may be preferable, since you can consume the data in chunks (e.g. with fread()) instead of having the entire contents dumped into memory at once, as with file_get_contents().
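As an illustration, a minimal sketch of that chunked pattern (the file name and chunk size are placeholders):

// read a large file 8 KiB at a time instead of loading it all at once
$handle = fopen('large-file.log', 'rb'); // placeholder file name
if ($handle !== false) {
    while (!feof($handle)) {
        $chunk = fread($handle, 8192);
        // ... process $chunk here ...
    }
    fclose($handle);
}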
I am a PHP programmer and currently I am working with files. I have to parse a file and insert the data into a MySQL database. Since it is a large amount of data, PHP is unable to load or parse the file; I am getting a memory error even though I have increased memory_limit up to 1500MB.
FATAL: emalloc(): Unable to allocate 456185835 bytes
My text file contains plain text and XML data. I have to parse the XML data out of the text file.
eg: <ajax>some text goes here</ajax> non relativ text <ajax>other content</ajax>
In the above example I have to parse the content inside the tags. If anyone can give some advice on separating each tag into an individual file (e.g. 1.txt, 2.txt), that would be great (Perl, C, shell scripting, etc.).
Cough... a 1500 MB memory limit is a sure sign you have gone off the rails.
Where are you getting your file? I assume (given the size) that this is a local file. If you are trying to load the file into a string using file_get_contents(), it is worth noting that the docs are wrong and that said function does not in fact use memory-mapped I/O (cf. bug 52802). So this is not going to work for you.
What you might try is instead falling back to more C-like (but still PHP) constructs, in particular fopen(), fseek(), and fread(). If the file is of a known structure with newlines, you might also consider fgets().
These should allow you to read in bytes in chunks into a reasonable size buffer from which you can do your processing. Since it looks like you are processing tagged strings, you will have to play the usual games of keeping multiple buffers around in which you can accumulate data until processable. This is fairly standard stuff covered in most introductions to, e.g., stream processing in C.
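For instance, a minimal sketch of that pattern (process_complete_tags() is a hypothetical helper you would write for your tag format):

// read the file in chunks, carrying unprocessed bytes over between reads so
// that a tag split across two chunks is not lost
$f = fopen('bigfile.txt', 'rb'); // placeholder file name
$carry = '';
while (!feof($f)) {
    $carry .= fread($f, 65536); // 64 KiB chunks
    // consume any complete <ajax>...</ajax> pairs in $carry and
    // return the unprocessed tail for the next iteration
    $carry = process_complete_tags($carry);
}
fclose($f);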
Note that in PHP (or any other language for that matter), you are also going to have to potentially consider issues of string encoding because, in general, it is no longer the case that 1 byte == 1 character (cf. Unicode).
As you insinuate, PHP may well not be the best language for this task (though it certainly can do it). But your problem isn't really a language-specific one; you are running into a fundamental limitation of handling large files without memory-mapping.
You can actually parse XML with PHP a small block at a time, so you don't actually require much RAM at all:
set_time_limit(0);
define('__BUFFER_SIZE__', 131072); // read 128 KiB at a time
define('__XML_FILE__', 'pf_1360591.xml');

function elementStart($p, $n, $a) {
    // handle opening of elements
}
function elementEnd($p, $n) {
    // handle closing of elements
}
function elementData($p, $d) {
    // handle cdata in elements
}

$xml = xml_parser_create();
xml_parser_set_option($xml, XML_OPTION_TARGET_ENCODING, 'UTF-8');
xml_parser_set_option($xml, XML_OPTION_CASE_FOLDING, 0);
xml_parser_set_option($xml, XML_OPTION_SKIP_WHITE, 1);
xml_set_element_handler($xml, 'elementStart', 'elementEnd');
xml_set_character_data_handler($xml, 'elementData');

$f = fopen(__XML_FILE__, 'r');
if ($f) {
    while (!feof($f)) {
        // feed the parser one buffer-sized chunk at a time; the final
        // argument tells the parser when the last chunk has arrived
        $content = fread($f, __BUFFER_SIZE__);
        xml_parse($xml, $content, feof($f));
    }
    fclose($f);
}
xml_parser_free($xml); // release the parser when done