PHP extension for Linux: reality check needed! - php

Okay, I've written my first functional PHP extension. It worked but it was a proof-of-concept only. Now I'm writing another one which actually does what the boss wants.
What I'd like to know, from all you PHP-heads out there, is whether this code makes sense. Have I got a good grasp of things like emalloc and the like, or is there stuff there that's going to turn around later and try to bite my hand off?
Below is the code for one of the functions. It returns a base64 of a string that has also been Blowfish encrypted. When the function is called, it is supplied with two strings, the text to encrypt and encode, and the key for the encryption phase. It's not using PHP's own base64 functions because, at this point, I don't know how to link to them. And it's not using PHP's own mcrypt functions for the same reason. Instead, it links in the SSLeay BF_ecb_encrypt functions.
PHP_FUNCTION(Blowfish_Base64_encode)
{
char *psData = NULL;
char *psKey = NULL;
int argc = ZEND_NUM_ARGS();
int psData_len;
int psKey_len;
char *Buffer = NULL;
char *pBuffer = NULL;
char *Encoded = NULL;
BF_KEY Context;
int i = 0;
unsigned char Block[ 8 ];
unsigned char * pBlock = Block;
char *plaintext;
int plaintext_len;
int cipher_len = 0;
if (zend_parse_parameters(argc TSRMLS_CC, "ss", &psData, &psData_len, &psKey, &psKey_len) == FAILURE)
return;
Buffer = (char *) emalloc( psData_len * 2 );
pBuffer = Buffer;
Encoded = (char *) emalloc( psData_len * 4 );
BF_set_key( &Context, psKey_len, psKey );
plaintext = psData;
plaintext_len = psData_len;
for (;;)
{
if (plaintext_len--)
{
Block[ i++ ] = *plaintext++;
if (i == 8 )
{
BF_ecb_encrypt( Block, pBuffer, &Context, BF_ENCRYPT );
pBuffer += 8;
cipher_len += 8;
memset( Block, 0, 8 );
i = 0;
}
} else {
BF_ecb_encrypt( Block, pBuffer, &Context, BF_ENCRYPT );
cipher_len += 8;
break;
}
}
b64_encode( Encoded, Buffer, cipher_len );
RETURN_STRINGL( Encoded, strlen( Encoded ), 0 );
}
You'll notice that I have two emalloc calls, for Encoded and for Buffer. Only Encoded is passed back to the caller, so I'm concerned that Buffer won't be freed. Is that the case? Should I use malloc/free for Buffer?
If there are any other glaring errors, I'd really appreciate knowing.

emalloc() allocates memory per request, and it's free()'d automatically when the runtime ends.
You should, however, compile PHP with
--enable-debug --enable-maintainer-zts
It will tell you if anything goes wrong (it can detect memory leaks if you've used the e*() functions and report_memleaks is set in your php.ini).
And yes, you should efree() Buffer.

You'll notice that I have two emalloc calls, for Encoded and for Buffer. Only Encoded is passed back to the caller, so I'm concerned that Buffer won't be freed. Is that the case? Should I use malloc/free for Buffer?
Yes, you should free it with efree before returning.
Although PHP has safety net and memory allocated with emalloc will be freed at the end of the request, it's still a bug to leak memory and, depending you will warned if running a debug build with report_memleaks = On.

Related

Store output of php_execute_script

How can I store the output of php_execute_script() in a character array? php_execute_script() only prints the output.
Here's what I've got:
PHP_EMBED_START_BLOCK(argc, argv)
char string[200];
setbuf(stdout, string);
zend_file_handle file_handle;
zend_stream_init_filename(&file_handle, "say.php");
fflush(stdout);
if (php_execute_script(&file_handle) == FAILURE) {
php_printf("Failed to execute PHP script.\n");
}
setbuf(stdout, NULL);
printf("\"%s\" is what say.php has to say.\n", string);
PHP_EMBED_END_BLOCK()
I've tried redirecting stdout to string, but it doesn't even look like php_execute_script is actually writing to stdout! It just ignores it.
I'm trying to communicate with the PHP embedded script ("say.php") from C using the PHP SAPI after building PHP with embedding enabled.
Thinking that there's no such thing, I made an issue:
https://github.com/php/php-src/issues/9330
#include <sapi/embed/php_embed.h>
char *string[999] = {NULL};
int i = 0;
static size_t embed_ub_write(const char *str, size_t str_length){
string[i] = (char *) str; i++;
return str_length;
}
int main(int argc, char **argv)
{
php_embed_module.ub_write = embed_ub_write;
PHP_EMBED_START_BLOCK(argc, argv)
zend_file_handle file_handle;
zend_stream_init_filename(&file_handle, "say.php");
if (php_execute_script(&file_handle) == FAILURE) {
php_printf("Failed to execute PHP script.\n");
}
for (i = 0; string[i] != NULL; i++) {
printf("%s", string[i]);
}
PHP_EMBED_END_BLOCK()
}
You need to set php_embed_module.ub_write to a callback function that gets called every time PHP makes an echo.
It's weird how PHP doesn't write to stdout...
EDIT: As it turns out, it does. You just need to use pipe.

Alternative for htmlentities($string,ENT_SUBSTITUTE)

I got a bit of a stupid question;
Currently I am making a website for a company on a server which actually has a bit an outdated PHP version (5.2.17). I have a database in which many fields are varchar with characters like 'é ä è ê' and so on, which I have to display in an HTML page.
So as the version of PHP is outdated (and I am not allowed to updated it because there are parts of the site that must keep working and to whom I have no acces to edit them) I can't use the htmlentities function with the ENT_SUBSTITUTE argument, because it was only added after version 5.4.
So my question is:
Does there exist an alternative to
htmlentities($string,ENT_SUBSTITUTE); or do I have to write a function
myself with all kinds of strange characters, which would be incomplete anyway.
Define a function for handling ill-formed byte sequences and call the function before passing the string to htmlentties. There are various way to define the function.
At first, try UConverter::transcode if you don't use Windows.
http://pecl.php.net/package/intl
If you are willing to handle bytes directly, see my previous answer.
https://stackoverflow.com/a/13695364/531320
The last option is to develop PHP extension. Thanks to php_next_utf8_char, it's not hard.
Here is code sample. The name "scrub" comes from Ruby 2.1 (see Equivalent of Iconv.conv("UTF-8//IGNORE",...) in Ruby 1.9.X?)
// header file
// PHP_FUNCTION(utf8_scrub);
#include "ext/standard/html.h"
#include "ext/standard/php_smart_str.h"
const zend_function_entry utf8_string_functions[] = {
PHP_FE(utf8_scrub, NULL)
PHP_FE_END
};
PHP_FUNCTION(utf8_scrub)
{
char *str = NULL;
int len, status;
size_t pos = 0, old_pos;
unsigned int code_point;
smart_str buf = {0};
if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s", &str, &len) == FAILURE) {
return;
}
while (pos < len) {
old_pos = pos;
code_point = php_next_utf8_char((const unsigned char *) str, len, &pos, &status);
if (status == FAILURE) {
smart_str_appendl(&buf, "\xEF\xBF\xBD", 3);
} else {
smart_str_appendl(&buf, str + old_pos, pos - old_pos);
}
}
smart_str_0(&buf);
RETURN_STRINGL(buf.c, buf.len, 0);
smart_str_free(&buf);
}
You don't need ENT_SUBSTITUTE if your encoding is handled correctly.
If the characters in your database are utf-8, stored in utf-8, read in utf-8 and displayed to the user in utf-8 there should be no problem.
Just add
if (!defined('ENT_SUBSTITUTE')) define('ENT_SUBSTITUTE', 0);
and you'll be able to use ENT_SUBSTITUTE into htmlentities.

issue with stream_select() in PHP

I am using stream_select() but it returns 0 number of descriptors after few seconds and my function while there is still data to be read.
An unusual thing though is that if you set the time out as 0 then I always get the number of descriptors as zero.
$num = stream_select($read, $w, $e, 0);
stream_select() must be used in a loop
The stream_select() function basically just polls the stream selectors you provided in the first three arguments, which means it will wait until one of the following events occur:
some data arrives
or reaches timeout (set with $tv_sec and $tv_usec) without getting any data.
So recieving 0 as a return value is perfectly normal, it means there was no new data in the current polling cycle.
I'd suggest to put the function in a loop something like this:
$EOF = false;
do {
$tmp = null;
$ready = stream_select($read, $write, $excl, 0, 50000);
if ($ready === false ) {
// something went wrong!!
break;
} elseif ($ready > 0) {
foreach($read as $r) {
$tmp .= stream_get_contents($r);
if (feof($r)) $EOF = true;
}
if (!empty($tmp)) {
//
// DO SOMETHING WITH DATA
//
continue;
}
} else {
// No data in the current cycle
}
} while(!$EOF);
Please note that in this example, the script totally ignores everything aside from the input stream. Also, the third section of the "if" statement is completely optional.
Does it return the number 0 or a FALSE boolean? FALSE means there was some error but zero could be just because of timeout or nothing interesting has happen with the streams and you should do a new select etc.
I would guess this could happen with a zero timeout as it will check and return immediately. Also if you read the PHP manual about stream-select you will see this warning about using zero timeout:
Using a timeout value of 0 allows you to instantaneously poll the status of the streams, however, it is NOT a good idea to use a 0 timeout value in a loop as it will cause your script to consume too much CPU time.
If this is a TCP stream and you want to check for connection close you should check the return value from fread etc to determine if the other peer has closed the conneciton. About the read streams array argument:
The streams listed in the read array will be watched to see if characters become available for reading (more precisely, to see if a read will not block - in particular, a stream resource is also ready on end-of-file, in which case an fread() will return a zero length string).
http://www.php.net/stream_select
Due to a limitation in the current Zend Engine it is not possible to
pass a constant modifier like NULL directly as a parameter to a
function which expects this parameter to be passed by reference.
Instead use a temporary variable or an expression with the leftmost
member being a temporary variable:
<?php $e = NULL; stream_select($r, $w, $e, 0); ?>
I have a similar issue which is caused by the underlying socket timeout.
Eg. I create some streams
$streams = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
Then fork, and use a block such as the following
stream_set_blocking($pipes[1], 1);
stream_set_blocking($pipes[2], 1);
$pipesToRead = array($pipes[1], $pipes[2]);
while (!feof($pipesToRead[0]) || !feof($pipesToRead[1])) {
$reads = $pipesToRead;
$writes = null;
$excepts = $pipesToRead;
$tSec = null;
stream_select($reads, $writes, $excepts, $tSec);
// while it's generating any kind of output, duplicate it wherever it
// needs to go
foreach ($reads as &$read) {
$chunk = fread($read, 8192);
foreach ($streams as &$stream)
fwrite($stream, $chunk);
}
}
Glossing over what other things might be wrong there, my $tSec argument to stream_select is ignored, and the "stream" will timeout after 60 seconds of inactivity and produce an EOF.
If I add the following after creating the streams
stream_set_timeout($streams[0], 999);
stream_set_timeout($streams[1], 999);
Then I get the result I desire, even if there's no activity on the underlying stream for longer than 60 seconds.
I feel that this might be a bug, because I don't want that EOF after 60 seconds of inactivity on the underlying stream, and I don't want to plug in some arbitrarily large value to avoid hitting the timeout if my processes are idle for some time.
In addition, even if the 60 second timeout remains, I think it should just timeout on my stream_select() call and my loop should be able to continue.

Binary mode writing

I am writing a php program to write a binary file (may be video or image files). I would like to make it as a web service and call it from another application like c#, mac etc.
My code is give below,
<?php
$fileChunk = $_POST["filechunk"];
$vodFolder = 'D:\\HYSA SVN\\Trunk\\workproducts\\source\\hysa_he\\web\\entertainment\\';
$vodFile = $vodFolder . "abcd.mov";
$fh = fopen($vodFile, 'ab');
flock ($fh, LOCK_EX);
$varsize = fwrite($fh, $fileChunk);
fclose($fh);
?>
But when I called the php web service from a c# code, the abcd.mov is creating in the location, but its size is only one kb. I suspects that, the writing in halted when a character ‘&’ found in the binary file. I read the php documentation and found that, fopen with binary mode ‘b’ will solve this issue? But it is not working. Can somebody help me ?
This is my c# code.
BinaryReader b = new BinaryReader(File.Open("d:\\image38kb.jpg", FileMode.Open));
int pos = 0;
int length = (int)b.BaseStream.Length;
byte[] bt = b.ReadBytes(length);
char[] ch = b.ReadChars(length);
HttpWebRequest request = null;
Uri uri = new Uri("http://d0327/streamtest.php");
request = (HttpWebRequest)WebRequest.Create(uri);
NetworkCredential obj = new NetworkCredential("shihab.kb",
"India456*", "tvm");
request.Proxy.Credentials = obj;
request.Method = "POST";
request.ContentType = "application/x-www-form-urlencoded";
request.ContentLength = bt.Length;
using (Stream writeStream = request.GetRequestStream())
{
UTF8Encoding encoding = new UTF8Encoding();
byte[] bytes = encoding.GetBytes("filechunk=");
byte[] rv = new byte[bytes.Length + bt.Length];
System.Buffer.BlockCopy(bytes, 0, rv, 0, bytes.Length);
System.Buffer.BlockCopy(bt, 0, rv, bytes.Length, bt.Length);
writeStream.Write(rv, 0, bt.Length);
}
string result = string.Empty;
using (
HttpWebResponse response = (HttpWebResponse)request.GetResponse())
{
using (Stream responseStream = response.GetResponseStream())
{
using (StreamReader readStream = new StreamReader(responseStream, Encoding.UTF8))
{
result = readStream.ReadToEnd();
}
}
}
Well the problem is not in the fwrite part, that works fine.
However, POST requests in HTTP look like this:
POST /somepage.php HTTP/1.1
Content-Length: 34
variable1=blah&var2=something else
As you can see, variables are divided by ampersands (&) (as well as equal signs (=) for the key - value mapping)... So, even when using POST requests, ampersands are not safe characters.
To solve this, you could try another transfer method, for example, TCP/IP socket connection, or simply escape the ampersands with, say '\x26' (the ASCII value of ampersand) and escape all the backslashes (\) with '\x5C'... You'd have to edit the PHP code to parse these values.

Does PHP copy variables when retrieving from shared memory?

If I run an shm_get_var(), will it return a "reference", keeping the data in shared memory?
I'm wanting to keep an array about 50MB in size in shared memory so that it can be used by multiple processes without having to keep multiple copies of this 50MB array hanging around. If shared memory isn't the answer, does anyone have another idea?
This is the relevant C code snippet from sysvsem.c in PHP 5.2.9 :
/* setup string-variable and serialize */
/* get serialized variable from shared memory */
shm_varpos = php_check_shm_data((shm_list_ptr->ptr), key);
if (shm_varpos < 0) {
php_error_docref(NULL TSRMLS_CC, E_WARNING, "variable key %ld doesn't exist", key);
RETURN_FALSE;
}
shm_var = (sysvshm_chunk*) ((char *)shm_list_ptr->ptr + shm_varpos);
shm_data = &shm_var->mem;
PHP_VAR_UNSERIALIZE_INIT(var_hash);
if (php_var_unserialize(&return_value, (const unsigned char **) &shm_data, shm_data + shm_var->length, &var_hash TSRMLS_CC) != 1) {
PHP_VAR_UNSERIALIZE_DESTROY(var_hash);
php_error_docref(NULL TSRMLS_CC, E_WARNING, "variable data in shared memory is corrupted");
RETURN_FALSE;
}
PHP_VAR_UNSERIALIZE_DESTROY(var_hash);
PHP will have to unserialize the entire value every time you call shm_get, which, on a 50MB array, is going to be really really slow.
How about breaking it up into individual values?
Also you might want to consider using APC's variable cache, which will handle all of the shared memory and locking for you (and will also use a hash table for key lookups)
I'm no expert on this, but would it be possible to write a quick test for this something like the following?
$key = 1234;
//put something small into shared memory
$identifier = shm_attach($key, 1024, 0777);
shm_put_var($identifier, $key, 'shave and a hair cut');
$firstVar = shm_get_var($identifier, $key);
$firstVar .= 'Test String of Doom';
$secondVar = shm_get_var($identifier, $key);
if ($firstVar == $secondVar) {
echo 'shm_get_var passes by reference';
} else {
echo 'shm_get_var passes by value';
}
form the wording of the documentation
shm_get_var() returns the variable
with a given variable_key , in the
given shared memory segment. The
variable is still present in the
shared memory.
I would say yes it's a reference to the shared memory space.
you can use shm_remove()
Check this out: http://php.net/manual/en/function.shm-remove.php

Categories