Error when changing ereg_replace to preg_replace (PHP)

I am working on old sites and updating the deprecated PHP functions. I have the following code, which produces an error when I change ereg_replace to preg_replace.
private function stripScriptTags($string) {
    $pattern = array("'\/\*.*\*\/'si", "'<\?.*?\?>'si", "'<%.*?%>'si", "'<script[^>]*?>.*?</script>'si");
    $replace = array("", "", "", "");
    return ereg_replace($pattern, $replace, $string);
}
This is the error I get:
Fatal error: Allowed memory size of 10000000 bytes exhausted (tried to allocate 6249373 bytes) in C:\Inetpub\qcppos.com\library\vSearch.php on line 403
Is there something else in that line of code that I need to be changing along with the ereg_replace?

So your regexes are as follows:
"'\/\*.*\*\/'si"
"'<\?.*?\?>'si"
"'<%.*?%>'si"
"'<script[^>]*?>.*?</script>'si"
Taking those one at a time: you are first greedily stripping out multiline comments. This is almost certainly where your memory problem is coming from; you should make that quantifier non-greedy.
Next up, you are stripping out anything that looks like a PHP tag. This is done with a lazy quantifier, so I don't see any issue with it. The same goes for the ASP tag and, finally, the script tag.
Leaving aside the potential XSS vectors your regexes leave open, the main issue seems to be coming from your first regex. Try "'\/\*.*?\*\/'si" instead.
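If it helps, here is a sketch of how the whole method might look once that quantifier is fixed and the call is switched to preg_replace (which, unlike ereg_replace, actually accepts arrays of patterns and replacements). Treat it as an illustration rather than a guaranteed drop-in fix:

private function stripScriptTags($string) {
    // Same patterns as before, but the comment pattern is now non-greedy (.*?)
    $pattern = array(
        "'\/\*.*?\*\/'si",                  // /* ... */ comments
        "'<\?.*?\?>'si",                    // <? ... ?> PHP tags
        "'<%.*?%>'si",                      // <% ... %> ASP tags
        "'<script[^>]*?>.*?</script>'si",   // <script> ... </script> blocks
    );
    $replace = array("", "", "", "");
    return preg_replace($pattern, $replace, $string);
}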

Get your memory limit value:
ini_get('memory_limit');
and check your script with memory_get_usage() and memory_get_peak_usage()
(the latter needs PHP 5.2 or higher).
If the limit turns out to be too low, you can raise it with:
ini_set("memory_limit", "128M"); // default was "8M" before PHP 5.2.0, "16M" in PHP 5.2.0, "128M" since PHP 5.3.0
Just put in whatever memory you have available and works for your script. Keep in mind that ini_set() only affects the current script; to change the limit globally you need to edit your php.ini file.
Obviously, if the script needs an insane amount of memory to run, consider the raised limit a monkey patch and start rewriting the code for a smaller memory footprint.
For more info, look at the core php.ini directives in the manual and search for "Resource Limits".
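Pulled together, the checks might look something like this around the replace-heavy call (a rough sketch; $html is a placeholder variable and 128M is just an example value):

echo 'limit:  ' . ini_get('memory_limit') . "\n";
echo 'before: ' . memory_get_usage() . " bytes\n";

$clean = $this->stripScriptTags($html);   // placeholder call to the replace-heavy method

echo 'after:  ' . memory_get_usage() . " bytes\n";
echo 'peak:   ' . memory_get_peak_usage() . " bytes\n";   // needs PHP 5.2+

// If the peak is close to the limit, raise it (for this script only):
ini_set('memory_limit', '128M');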

Since allowing a higher memory allocation didn't help, the functions were updated as shown below (they weren't actually doing anything except causing issues):
private function stripScriptTags($string)
{
    /* $pattern = array("'\/\*.*\*\/'si", "'<\?.*?\?>'si", "'<%.*?%>'si",
                        "'<script[^>]*?>.*?</script>'si");
       $replace = array("", "", "", "");
       return ereg_replace($pattern, $replace, $string);
    */
    return $string;
}

private function clearSpaces($string, $clear_enters = true)
{
    /* $pattern = ($clear_enters == true) ? ("/\s+/") : ("/[ \t]+/");
       return preg_replace($pattern, " ", trim($string));
    */
    return $string;
}

Related

Running out of memory always on the same line

First of all, I am not looking for an answer saying "Check your PHP memory limit" or "You need to add more memory" or that kind of thing... I am on a dedicated machine with 8 GB of RAM; the memory limit is 512 MB. I always get an out-of-memory error on one single line:
To clarify: This part of the code belongs to Joomla! CMS.
function get($id, $group, $checkTime) {
    $data = false;
    $path = $this->_getFilePath($id, $group);
    $this->_setExpire($id, $group);
    if (file_exists($path)) {
        $data = file_get_contents($path);
        if ($data) {
            // Remove the initial die() statement
            $data = preg_replace('/^.*\n/', '', $data); // Out of memory here
        }
    }
    return $data;
}
This is part of Joomla's caching... This function reads the cache file, removes the first line (which blocks direct access to the files), and returns the rest of the data.
As you can see, the line uses preg_replace to remove the first line of the cache file, which is always:
<?php die("Access Denied"); ?>
My question is: this seems like a simple operation (removing the first line from the file content), so could it really consume a lot of memory if the initial $data is huge? If so, what's the best way to work around the issue? I don't mind having the cache files without that die() line; I can take care of security and block direct access to the cache files myself.
Am I wrong?
UPDATE
As suggested by the answers, regex seems to create more problems than it solves. I've tried:
echo memory_get_usage() . "\n";
after the regex, then tried the same statement using substr(). The difference in memory usage is very slight. Almost nothing.
Thanks for your contributions; I am still trying to find out why this happens.
Use substr() to avoid the memory-hungry preg_replace(), like this:
$data = substr($data, strpos($data, '?>') + 3);
As general advice, don't use regular expressions if you can do the same task with other string/array functions; regular expression functions are slower and consume more memory than the core string/array functions.
This is explicitly warned about in the PHP docs too; see for example:
http://www.php.net/manual/en/function.preg-match.php#refsect1-function.preg-match-notes
http://www.php.net/manual/en/function.preg-split.php#refsect1-function.preg-split-notes
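If the + 3 offset feels fragile (it assumes ?> is followed by exactly one newline character), a variation that cuts at the first newline instead may read more clearly; this is only a sketch:

// Drop everything up to and including the first newline;
// leave $data untouched if no newline is found.
$pos = strpos($data, "\n");
if ($pos !== false) {
    $data = substr($data, $pos + 1);
}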
Don't use a string function to replace something in a huge string. You can cycle through the lines of a file and just break after you have found what you're looking for.
check the PHP docs here:
http://php.net/manual/en/function.fgets.php
basically what #cbuckley just said :p
If you just want to remove the first line of a file and return the rest, you can make use of file():
$lines = file($path, FILE_IGNORE_NEW_LINES); // read the file into an array of lines, without trailing newlines
array_shift($lines);                         // drop the first (die) line
$data = implode("\n", $lines);
Instead of using file_get_contents(), which reads the entire file at once (and the result can be too big to run regexes on), you should use fopen() in combination with fgets() (http://php.net/fgets). That function reads the file line by line.
You can then choose to run a regex on a specific line, or in your case just skip the entire line.
So instead of:
$data = file_get_contents($path);
if ($data) {
    // Remove the initial die() statement
    $data = preg_replace('/^.*\n/', '', $data); // Out of memory here
}
try this:
$data = '';
$fileHandler = fopen($path, 'r');
$lineNumber = 0;
while (($line = fgets($fileHandler)) !== false) {
    if ($lineNumber++ != 0) { // Skip the initial die() statement
        $data .= $line; // or echo $line directly so $data doesn't take up too much memory as well
    }
}
fclose($fileHandler);
I suggest you use this in the file including the cached files:
define('INCLUDESALLOW', 1);
and in the file that will be included:
if( !defined('INCLUDESALLOW') ) die("Access Denied");
Then just use include instead of file_get_contents(). Note that this would run any PHP code in the included file; I'm not 100% sure that is what you need.
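Roughly, the writer/reader pair could look like this (a sketch only; $cacheBody is a placeholder for whatever the cache normally stores, and this is not Joomla's actual cache API):

// Writer: prepend a guard line instead of the unconditional die()
$guard = '<?php if (!defined(\'INCLUDESALLOW\')) die("Access Denied"); ?>' . "\n";
file_put_contents($path, $guard . $cacheBody);

// Reader: define the constant, include the file, and capture its output
define('INCLUDESALLOW', 1);
ob_start();
include $path;
$data = ob_get_clean();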
There are times when you will use more memory than PHP has allotted. If you're unable to use less memory by making your code more efficient, you might have to increase the available memory. This can be done in two ways.
The limit can be set to a global default in php.ini:
memory_limit = 32M
Or you can override it in your script like this:
<?php
ini_set('memory_limit', '64M');
...
For more on the PHP memory limit, see this SO question or the ini.memory-limit documentation.

Echoing large string in PHP results in no output at all

I am helping to build a Joomla site (using Joomla 1.5.26). One of the pages is really, really big. As a result, PHP just stops working without any error, and all previously printed strings are discarded. There is no output at all. We have display_errors set to TRUE and error_reporting set to E_ALL.
I found the exact line where PHP breaks. It's in libraries/joomla/application/component/view.php:196
function display($tpl = null)
{
    $result = $this->loadTemplate($tpl);
    if (JError::isError($result)) {
        return $result;
    }
    echo $result;
}
Some information:
Replacing echo $result; with echo strlen($result); works. The length of the string is 257759.
echo substr($result, 0, 103396); is printing partial content.
echo substr($result, 0, 103397); results in no output at all.
echo substr($result, 0, 103396) . "A"; results in no output at all. So splitting the string into chunks is not a solution.
I have checked server performance during the execution of the script. CPU usage is 100% but there's plenty of memory left. The PHP memory limit is 1024M. output_buffering is 4096, but I tried setting it to an unreasonably high number - it dies at the exact same position. The server runs Apache 2.2.14-5ubuntu8.10 and PHP 5.3.2-1ubuntu4.18. PHP runs as a fast_cgi module.
I have never experienced something like that and Google search results in nothing also. Have any of you experienced something like that and know the solution?
Thanks for reading!
Maybe try exploding the string and looping through each line.
You could also try this, found on php.net - echo:
<?php
function echobig($string, $bufferSize = 8192)
{
    // suggest doing a test for Integer & positive bufferSize
    for ($chars = strlen($string) - 1, $start = 0; $start <= $chars; $start += $bufferSize) {
        echo substr($string, $start, $bufferSize);
    }
}
?>
Basically, it seems echo can't handle such large data in one call. Breaking it up somehow should get you where you need to go.
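For instance, the display() method could call it instead of a bare echo (a sketch, assuming echobig() is defined somewhere the view class can reach it):

function display($tpl = null)
{
    $result = $this->loadTemplate($tpl);
    if (JError::isError($result)) {
        return $result;
    }
    echobig($result); // output in 8 KB chunks instead of one huge echo
}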
What about trying print_r rather than echo?
function display($tpl = null)
{
    $result = $this->loadTemplate($tpl);
    if (JError::isError($result)) {
        return $result;
    }
    print_r($result);
}
I have tested this on the CLI and it works fine with PHP 5.4.11 and 5.3.15:
$str = '';
for ($i = 0; $i < 257759; $i++) {
    $str .= 'a';
}
echo $str;
It seems a reasonable assumption that PHP itself works fine, but that the output buffer is too large for Apache/fast_cgi. I would investigate the Apache config further. Do you have any special Apache settings?
Maybe it's this?
Try something like this:
php_flag output_buffering On
Or try to turn on gzip in Joomla!
Or use nginx as reverse proxy or standalone server :^ )
It seems I solved the problem myself. It was a somewhat unexpected thing - faulty HTML formatting. We use a template for the order page, and inside it there is a loop which shows all ordered products. With a few products everything worked great, but when I tried the same with 40 products, the page broke.
However, I still don't understand why the server response would be empty with status code 200.
Thanks for answers, everybody!

PHP file writing problem

I have a script which constantly appends strings to a file.
For example (this is a test script):
$i = 1;
$file = 'wikipedia/test.txt';
$text = 'the quick brown fox jumps over the lazy dog';
while ($i != 0)
{
    file_put_contents($file, $text, FILE_APPEND);
}
But for an unknown reason my program stops appending strings when the text file reaches a size of 2097156 B. It isn't a disk space issue, since I can still create another text file, yet that one is limited to the same exact file size.
I tried using other PHP functions (fwrite, fputs) but it still didn't work out.
Any idea why this problem occurs?
Seems unlikely, but you might have run up against PHP's max_execution_time if its current setting is very low. Try increasing its value in php.ini
Your loop doesn't make sense; it never changes $i.
Try it without the while:
$file = 'wikipedia/test.txt';
$text = 'the quick brown fox jumps over the lazy dog';
file_put_contents($file, $text, FILE_APPEND );
There are several issues that could cause this problem.
You may have encountered the max execution time limit (default: 30 seconds).
You may have exhausted the memory limit (default: varies by version).
Something may have changed on disk (the file permissions may have changed, or you may have exceeded a disk quota).
PHP's error output would be invaluable for identifying which of these issues contributed to the problem; see the sketch below for one way to surface it.
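A quick way to check those three suspects from inside the test script (a sketch, reusing the $file variable from the question; adjust to taste):

error_reporting(E_ALL);
ini_set('display_errors', '1');                 // surface any warnings/fatals

echo 'max_execution_time: ' . ini_get('max_execution_time') . "\n";
echo 'memory_limit:       ' . ini_get('memory_limit') . "\n";
echo 'current file size:  ' . (file_exists($file) ? filesize($file) : 0) . " B\n";
echo 'directory writable: ' . (is_writable(dirname($file)) ? 'yes' : 'no') . "\n";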

Am I using preg_replace correctly (PHP)?

I think I have this right but I would like someone to verify that.
function storageAmmount()
{
    system('stat /web/'.$user."/ | grep Size: > /web/".$user."/log/storage");
    $storage = fopen('/web/'.$user.'/log/storage');
    $storage = str_replace('Size:', " ", $storage);
    $storage = preg_replace('Blocks:(((A-Z), a-z), 1-9)','',$storage);
}
This is the line in the text file:
Size: 4096 Blocks: 8 IO Block: 4096 directory
I am trying to get just the numeric value that follows "Size: "; the word "Size:" and everything else is useless to me.
I am mainly looking at the preg_replace. Is it just me, or is regex a tad bit confusing? Any thoughts? Thanks in advance for any help.
Cheers!,
Phill
Ok,
Here is what the function looks like now:
function storageAmmount()
{
    $storage = filesize('/web/'.$user.'/');
    $final = $storage/(1024*1024*1024);
    return $final;
}
Where would I put the number_format()? I am not really sure if it goes in the equation or in the return statement. I have tried it in both, and all it returns is "0.00".
V1.
function storageAmmount()
{
    $storage = filesize('/web/'.$user.'/');
    $final = number_format($storage/(1024*1024*1024), 2);
    return $final;
}
or V2.
function storageAmmount()
{
    $storage = filesize('/web/'.$user.'/');
    $final = $storage/(1024*1024*1024);
    return number_format($final, 2);
}
Neither works; they both return "0.00". Any thoughts?
Looks like you are trying to get the size of the file in bytes. If so, why not just use PHP's filesize() function, which takes the file name as its argument and returns the size of the file in bytes?
function storageAmmount() {
    $storage = filesize('/web/'.$user);
}
No, you're not using preg_replace properly.
There's a lot of misunderstanding in your code; to correct it would mean I'd have to explain the basics of how Regex works. I really recommend reading a few primers on the subject. There's a good one here: http://www.regular-expressions.info/
In fact, what you're trying to do here with the str_replace and the preg_replace together would be better achieved using a single preg_match.
Something like this would do the trick:
preg_match('/Size: (\d+)/', $storage, $matches);
$storage = $matches[1];
The (\d+) picks up any number of digits and puts them into an element in the $matches array. Putting Size: in front of that forces it to only recognise the digits that are immediately after Size: in your input string.
If your input string is consistently formatted in the way you described, you could also do it without using any preg_ functions at all; just explode() on a space character and pick up the second element. No regex required.
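For example, a sketch of the explode() route, assuming $storage holds the sample line exactly as shown above (single spaces between fields):

$parts = explode(' ', trim($storage));
$size  = $parts[1];   // "4096"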
The best usage of regex is
// preg_match solution
$storage = 'Size: 4096 Blocks: 8 IO Block: 4096 directory';
if (preg_match('/Size: (?P<size>\d+)/', $storage, $matches)) {
    return $matches['size'];
}
But if you are doing it locally, you can use the PHP function stat():
// php stat solution
$f = escapeshellcmd($user);
if ($stat = @stat('/web/'.$f.'/log/storage')) {
    return $stat['size'];
}
Bearing in mind the fact that you're already using string manipulation (don't quite get why - a single regular expression could handle it all), I don't know why you don't proceed down this path.
For example using explode:
function storageAmount($user) {
    system(escapeshellcmd('stat /web/'.$user.'/ | grep Size: > /web/'.$user.'/log/storage'));
    $storageChunks = explode(' ', file_get_contents('/web/'.$user.'/log/storage'));
    return $storageChunks[1];
}
Incidentally:
The $user variable doesn't exist within the scope of your function - you either need to pass it in as an argument as I've done, or make it a global.
You should really use escapeshellcmd on all commands passed to system/exec, etc.
You're using fopen incorrectly. fopen returns a resource which you then need to read from via fread, etc. prior to using fclose. That said, I've replaced this with file_get_contents which does the open/read/close for you.
I'm not really sure what you're attempting to do with the system command, but I've left it as-is. However, you could just get the result of the grep back directly as the last string output rather than having to output it to a file. (This is what the system command returns.) You're also mixing ' and " as string delimiters; just pick one and use it consistently.
I suspect you actually want the final line of a "df --si /web/'.$user.'/'" command, as otherwise you'll always be returning the value 4096 (the size of the directory entry itself).
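Something along these lines is probably what that would look like (a sketch only; which column to return, and whether df is really the right tool for a per-user figure, are assumptions to revisit):

function storageAmount($user) {
    exec('df --si ' . escapeshellarg('/web/' . $user . '/'), $output);
    $lastLine = end($output);                        // last line of df output holds the numbers
    $columns  = preg_split('/\s+/', trim($lastLine));
    return $columns[2];                              // e.g. the "Used" column with --si
}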
Have you tried
$storage = preg_replace('/Blocks:.*/', '', $storage);
?
Even better would be to use
function storageAmmount()
{
    exec('stat /web/'.$user.'/ | grep Size:', $output);
    preg_match("/Size: ([0-9]*).*/", $output[0], $matches);
    return $matches[1];
}
(tested on my machine)

file_get_contents and file_put_contents with large files

I'm trying to get a file's contents, replace some parts of it using regular expressions with preg_replace, and save the result to another file:
$content = file_get_contents('file.txt', true);
$content_replaced = preg_replace('/\[\/m\]{1}\s+(\{\{.*\}\})\s+[\x{4e00}-\x{9fa5}]+/u', 'replaced text', $content);
if ($content_replaced) {
    file_put_contents('file_new.txt', $content_replaced);
    echo "Successful!";
}
else {
    echo "Some error occurred";
}
This piece of code works fine with small files, but when I try the original file, which is about 60 MB, it just keeps giving me the message "Some error occurred".
Any suggestions are greatly appreciated.
Update: no errors in the logs, and the memory limit is set to 1024M.
I've had max/limit issues with file_put_contents.
No idea what the limits might be, but using fwrite solved my troubles and I put down the bottle.
You're probably running out of memory. What's the memory_limit set to? (phpinfo() will tell you). You may be able to increase the memory limit like:
ini_set('memory_limit','128M');
I'm pretty sure you're hitting a regex limit. Heck, some time ago I hit a limit with 1000 chars... with 60 MB of input I bet you will hit regex limits everywhere, even with really simple patterns. I would at least try to simplify the pattern as much as possible, making it ungreedy with .*? instead of .* where possible.
To get more information, just check the return value of preg_last_error().
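For instance, something like this can tell you whether PCRE is bailing out and let you raise the relevant limits (a sketch; $pattern stands in for the pattern from the question, and the limit values are arbitrary examples):

$content_replaced = preg_replace($pattern, 'replaced text', $content);

if ($content_replaced === null) {
    // preg_replace() returns null on failure; PREG_BACKTRACK_LIMIT_ERROR is
    // the usual culprit on very large inputs
    echo 'preg error code: ' . preg_last_error() . "\n";

    ini_set('pcre.backtrack_limit', '10000000');
    ini_set('pcre.recursion_limit', '10000000');
    $content_replaced = preg_replace($pattern, 'replaced text', $content);
}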
