Am I using preg_replace correctly (PHP)? - php

I think I have this right but I would like someone to verify that.
function storageAmmount()
{
system('stat /web/'.$user."/ | grep Size: > /web/".$user."/log/storage");
$storage = fopen('/web/'.$user.'/log/storage');
$storage = str_replace('Size:', " ", $storage);
$storage = preg_replace('Blocks:(((A-Z), a-z), 1-9)','',$storage);
}
This is the line in the text file:
Size: 4096 Blocks: 8 IO Block: 4096 directory
I am trying to get just the numeric value that follows "Size: "; the word Size: and everything else is useless to me.
I am mainly looking at the preg_replace. Is it just me or is regex a tad bit confusing? Any thoughts. Thanks for any help in advance.
Cheers!,
Phill
Ok,
Here is what the function looks like now:
function storageAmmount()
{
$storage = filesize('/web/'.$user.'/');
$final = $storage/(1024*1024*1024);
return $final;
}
Where would I put the number_format()? I am not really sure if it would go in the equation or in the return statement. I have tried it in both and all it returns is "0.00".
V1.
function storageAmmount()
{
$storage = filesize('/web/'.$user.'/');
$final = number_format($storage/(1024*1024*1024), 2);
return $final;
}
or V2.
function storageAmmount()
{
$storage = filesize('/web/'.$user.'/');
$final = $storage/(1024*1024*1024);
return number_format($final, 2);
}
Neither works; they both return "0.00". Any thoughts?

Looks like you are trying to get the size of the file in bytes. If so, why not just use PHP's filesize function, which takes the file name as its argument and returns the size of the file in bytes?
function storageAmmount($user){
// $user has to be passed in; it is not visible inside the function otherwise
return filesize('/web/'.$user);
}

No, you're not using preg_replace properly.
There's a lot of misunderstanding in your code; to correct it would mean I'd have to explain the basics of how Regex works. I really recommend reading a few primers on the subject. There's a good one here: http://www.regular-expressions.info/
In fact, what you're trying to do here with the str_replace and the preg_replace together would be better achieved using a single preg_match.
Something like this would do the trick:
// preg_match returns the number of matches; the captures go into the third argument
preg_match('/Size: (\d+)/', $storage, $matches);
$storage = $matches[1];
The (\d+) picks up any number of digits and puts them into an element in the $matches array. Putting Size: in front of that forces it to only recognise the digits that are immediately after Size: in your input string.
If your input string is consistently formatted in the way you described, you could also do it without using any preg_ functions at all; just explode() on a space character and pick up the second element. No regex required.
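As a rough sketch of the two approaches described above, assuming the input line always has the form shown in the question. The function names extractSizeRegex and extractSizeExplode are made up for illustration and do not come from the original post.

```php
<?php
// Hypothetical helpers; the names are illustrative only.

// Regex approach: capture the digits that follow "Size:".
function extractSizeRegex(string $line): ?int
{
    if (preg_match('/Size:\s*(\d+)/', $line, $matches) === 1) {
        return (int) $matches[1];
    }
    return null; // no "Size:" field found
}

// explode() approach: assumes the line starts with "Size:" and that
// fields are separated by single spaces.
function extractSizeExplode(string $line): ?int
{
    $parts = explode(' ', trim($line));
    if (count($parts) >= 2 && $parts[0] === 'Size:' && ctype_digit($parts[1])) {
        return (int) $parts[1];
    }
    return null;
}

$line = 'Size: 4096 Blocks: 8 IO Block: 4096 directory';
var_dump(extractSizeRegex($line));   // int(4096)
var_dump(extractSizeExplode($line)); // int(4096)
```

The regex version tolerates extra whitespace after "Size:", while the explode() version only works when the format is exactly as shown.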

The best usage of regex is
// preg_match solution
$storage = 'Size: 4096 Blocks: 8 IO Block: 4096 directory';
if (preg_match('/Size: (?P<size>\d+)/', $storage, $matches)) {
return $matches['size'];
}
But if you are doing it locally, you can use the PHP function stat
// php stat solution
$f = escapeshellcmd($user);
if ($stat = @stat('/web/'.$f.'/log/storage')) {
return $stat['size'];
}

Bearing in mind the fact that you're already using string manipulation (don't quite get why - a single regular expression could handle it all), I don't know why you don't proceed down this path.
For example using explode:
function storageAmount($user) {
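// Escape only the arguments; escaping the whole command would also escape | and >
system('stat '.escapeshellarg('/web/'.$user.'/').' | grep Size: > '.escapeshellarg('/web/'.$user.'/log/storage'));
$storageChunks = explode(' ', file_get_contents('/web/'.$user.'/log/storage'));
return $storageChunks[1];
}
Incidentally:
The $user variable doesn't exist within the scope of your function - you either need to pass it in as an argument as I've done, or make it a global.
You should really escape anything you interpolate into commands passed to system/exec, etc. Use escapeshellarg on individual arguments; escapeshellcmd on the whole string would also escape the | and > that this command relies on.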
You're using fopen incorrectly. fopen returns a resource which you then need to read from via fread, etc. prior to using fclose. That said, I've replaced this with file_get_contents which does the open/read/close for you.
I'm not really sure what you're attempting to do with the system command, but I've left it as-is. However, you could just get the result of the grep back directly as the last string output rather than having to output it to a file. (This is what the system command returns.) You're also mixing ' and " as string delimiters. PHP accepts both, but the code is easier to read if you use one consistently.
I suspect you actually want the final line of the output of the 'df --si /web/'.$user.'/' command, as otherwise you'll always be returning the value 4096.

Have you tried
$storage = preg_replace('/Blocks:.*/', '', $storage);
?
Even better would be to use
function storageAmmount($user)
{
exec('stat /web/'.$user.'/ | grep Size:',$output);
preg_match("/Size: ([0-9]*).*/",$output[0],$matches);
return $matches[1];
}
(tested on my machine)


Preg_replace do not exist in php7. What can i do? [duplicate]

This question already has answers here:
Replace preg_replace() e modifier with preg_replace_callback
(3 answers)
I have a form where I get a filename from an input box. I create a directory and change the file's extension from "gpx" to "xml" before I upload the file to my directory.
In PHP 5 I used preg_replace for this, but in PHP 7 it no longer works.
My old code is:
if (!file_exists($dirPath)) {
mkdir($dirPath, 0755,true);
}
$target = '/'.$mappe.'/';
$target = $dirPath .'/'. basename( $_FILES['gpsfilnavn']['name']);
$target = preg_replace("/(\w+).gpx/ie","$1.'.xml'",$target);
$xmlfil = $xmlfil . basename( $_FILES['gpsfilnavn']['name']);
$xmlfil = preg_replace("/(\w+).gpx/ie","$1.'.xml'",$xmlfil);
if(move_uploaded_file($_FILES['gpsfilnavn']['tmp_name'], $target)) {
echo "The file ". basename( $_FILES['gpsfilnavn']['name'])." has been uploaded";
Can anybody tell me what I have to change?
There has been a change since PHP 7 regarding the preg_replace() function.
According to the man-page
7.0.0 Support for the /e modifier has been removed. Use preg_replace_callback() instead.
Maybe this helps you?
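For the extension swap in the question, a minimal sketch might look like the following. It assumes the replacement is a fixed string, in which case the /e modifier was never needed. The function names gpxToXml and gpxToXmlCallback are hypothetical, and the pattern is slightly stricter than the original: the dot is escaped and the match is anchored to the end of the string.

```php
<?php
// Hypothetical helper: swap a trailing ".gpx" extension for ".xml".
function gpxToXml(string $path): string
{
    // '$1.xml' is an ordinary replacement string, so no /e modifier is required.
    return preg_replace('/(\w+)\.gpx$/i', '$1.xml', $path);
}

// Equivalent with preg_replace_callback, for cases where the replacement
// really has to be computed by PHP code.
function gpxToXmlCallback(string $path): string
{
    return preg_replace_callback(
        '/(\w+)\.gpx$/i',
        function (array $m): string {
            return $m[1] . '.xml';
        },
        $path
    );
}

echo gpxToXml('/uploads/tracks/morning_run.gpx'), "\n";
echo gpxToXmlCallback('Trip2.GPX'), "\n";
```

The callback receives the match array, so $m[1] plays the role that $1 played in the old /e replacement string.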
I've stumbled upon this issue today when upgrading a phpBB-based website from PHP5 to PHP7. I came up with three different workarounds that can be used for different scenarios, the second one being the only viable one for mine since I had template-based regexp stored within the filesystem/db instead of static ones, which I couldn't easily alter.
Basically, I went from this:
$input = preg_replace($search, $replace, $input);
to something like this:
$input = preg_replace($search,
function($m) use ($replace) {
$rep = $replace;
for ($i = 1; $i<count($m); $i++) {
$rep = str_replace('\\'.$i, '$m['.$i.']', $rep);
$rep = str_replace('\$'.$i, '$m['.$i.']', $rep);
}
eval('$str='.$rep);
return $str;
},
$input);
Needless to say, this is nothing more than a quick-and-dirty workaround, only "viable" for those who cannot easily refactor their code to use preg_replace_callback properly. Because it relies on a potentially insecure eval() call, it should only be used if the developer has full control over the replacement strings/rules and is able to properly test/debug them. That said, in my specific scenario, I was able to use it to get the job done.
IMPORTANT: the str_replace and/or the eval() could break some replacement rules unless they are "properly" defined taking the above code into account.
For further info regarding this issue and other alternative workarounds you can also read this post on my blog.

Reading a log file into an array reversed, is it best method when looking for keyword near the bottom?

I am reading from log files which can be anything from a small log file up to 8-10 MB of logs. The typical size would probably be 1 MB. The key thing is that the keyword I'm looking for is normally near the end of the document, in probably 95% of cases. I then extract 1000 characters after the keyword.
If i use this approach:
$lines = explode("\n",$body);
$reversed = array_reverse($lines);
foreach($reversed AS $line) {
// Search for my keyword
}
Would it be more efficient than using:
$pos = stripos($body,$keyword);
$snippet_pre = substr($body, $pos, 1000);
What I am not sure about is how stripos works: does it just search through the document one character at a time from the start? If so, then in theory if there are 10,000 characters after the keyword I won't have to read those into memory, whereas the first option would have to read everything into memory even though it probably only needs the last 100 lines. Could I alter it to read 100 lines into memory, then search lines 101-200 if the first 100 were not successful? Or is the operation so light that it doesn't really matter?
I have a second question, which assumes array_reverse is the best approach: how would I extract the next 1000 characters after I have found the keyword? Here is my woeful attempt:
$body = $this_is_the_log_content;
$lines = explode("\n",$body);
$reversed = array_reverse($lines);
foreach($reversed AS $line) {
$pos = stripos($line,$keyword);
$snippet_pre = substr($line, $pos, 1000);
}
The reason I don't think that will work is that each $line might only be a few hundred characters. Would the better solution be to split it every, say, 2,000 characters and also keep the previous $line as a backup variable? Something like this:
$body = $this_is_the_log_content;
$lines = str_split($body, 2000);
$reversed = array_reverse($lines);
$previous_line = $line;
foreach($reversed AS $line) {
$pos = stripos($line,$keyword);
if ($pos) {
$line = $previous_line . ' ' . $line;
$pos1 = stripos($line,$keyword);
$snippet_pre = substr($line, $pos, 1000);
}
}
Am I massively over-complicating this?
I would strongly consider using a tool like grep for this. You can call this command line tool from PHP and use it to search the file for the word you are looking for and do things like give you the byte offset of the matching line, give you a matching line plus trailing context lines, etc.
Here is a link to grep manual. http://unixhelp.ed.ac.uk/CGI/man-cgi?grep
Play with the command a bit on the command line to get it the way you want it, then call it from PHP using exec(), passthru(), or similar depending on how you need to capture/display the content.
Alternatively, you can simply fopen() the file, place the pointer at the end, and move the file pointer backward through the file using fseek(), searching for the string as you go. Once you find your needle, you can then read the file from that offset until you get to the end of the file or the number of log entries you want.
Either of these might be preferable to reading the entire log file into memory and then trying to work with it.
The other thing to consider is whether 1000 characters is meaningful. Typically log files would have lines that vary in length. To me it would seem that you should be more concerned about getting the next X lines from the log file, not the next Y characters. What if a line has 2000 characters, are you saying you only want to get half of it? That may not be meaningful at all.
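The fopen()/fseek() idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name snippetFromEnd, its parameters and the 64 KB chunk size are all made up here, and the snippet works on bytes rather than on lines or characters.

```php
<?php
// Hypothetical helper: scan a file backward in chunks for the last
// case-insensitive occurrence of $keyword, then return up to $length
// bytes starting at the keyword. Returns null if nothing is found.
function snippetFromEnd(string $path, string $keyword, int $length = 1000, int $chunkSize = 65536): ?string
{
    if ($keyword === '' || $length < 1 || $chunkSize < 1) {
        return null;
    }
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        return null;
    }
    $fileSize = fstat($handle)['size'];
    $overlap = strlen($keyword) - 1; // lets a keyword straddling two chunks be seen
    $end = $fileSize;                // exclusive end of the region not yet searched
    $result = null;

    while ($end > 0) {
        $start = max(0, $end - $chunkSize);
        // Read this chunk plus a few bytes of the chunk searched before it.
        $readLen = min($fileSize, $end + $overlap) - $start;
        fseek($handle, $start);
        $chunk = fread($handle, $readLen);
        $pos = strripos($chunk, $keyword);
        if ($pos !== false) {
            fseek($handle, $start + $pos);
            $result = fread($handle, $length);
            break;
        }
        $end = $start; // move one chunk closer to the beginning of the file
    }
    fclose($handle);
    return $result;
}
```

Only one chunk is held in memory at a time, so when the keyword sits near the end of the log the function typically returns after a single read. To return the next X lines instead of Y bytes, the final fread() could be replaced by a loop over fgets().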

Running out of memory always on the same line

First of all, I am not looking for an answer saying "Check your PHP memory limit" or "You need to add more memory" or that kind of thing. I am on a dedicated machine with 8 GB of RAM; the memory limit is 512 MB. I always get an out of memory error on one single line:
To clarify: This part of the code belongs to Joomla! CMS.
function get($id, $group, $checkTime){
$data = false;
$path = $this->_getFilePath($id, $group);
$this->_setExpire($id, $group);
if (file_exists($path)) {
$data = file_get_contents($path);
if($data) {
// Remove the initial die() statement
$data = preg_replace('/^.*\n/', '', $data); // Out of memory here
}
}
return $data;
}
This is part of Joomla's caching. This function reads the cache file, removes the first line (which blocks direct access to the files) and returns the rest of the data.
As you can see the line uses a preg_replace to remove the first line in the cache file which is always :
<?php die("Access Denied"); ?>
My question is: this seems to me a simple process (removing the first line from the file content), so could it consume a lot of memory if the initial $data is huge? If so, what's the best way to work around that issue? I don't mind having the cache files without that die() line; I can take care of security and block direct access to the cache files.
Am I wrong?
UPDATE
As suggested by the posts, regex seems to create more problems than it solves. I've tried:
echo memory_get_usage() . "\n";
after the regex then tried the same statement using substr(). The difference is very slight in memory usage. Almost nothing.
Thanks for your contributions; I am still trying to find out why this happens.
Use substr to avoid the memory hungry preg_replace() , like this:
$data = substr($data, strpos($data, '?>') + 3);
As general advice, don't use regular expressions if you can do the same task with other string/array functions; regular expression functions are slower and consume more memory than the core string/array functions.
The PHP docs explicitly warn about this too; see some examples:
http://www.php.net/manual/en/function.preg-match.php#refsect1-function.preg-match-notes
http://www.php.net/manual/en/function.preg-split.php#refsect1-function.preg-split-notes
Don't use a string function to replace something in a huge string. You can cycle through the lines of a file and just break after you have found what you're looking for.
Check the PHP docs here:
http://php.net/manual/en/function.fgets.php
Basically what @cbuckley just said :p
If you just want to remove the first line of a file and return the rest, you could make use of file():
$lines = file($path);
array_shift($lines);
$data = implode('', $lines); // file() keeps each line's trailing newline
Instead of using file_get_contents(), which reads the entire file at once and may produce a string too big to run regexps on, you should use fopen() in combination with fgets() (http://php.net/fgets). This function reads the file line by line.
You can then choose to do a regexp on a specific line. Or in your case just skip the entire line.
So instead of:
$data = file_get_contents($path);
if($data) {
// Remove the initial die() statement
$data = preg_replace('/^.*\n/', '', $data); // Out of memory here
}
try this:
$data = ''; // initialise before appending
$fileHandler = fopen($path,'r');
$lineNumber = 0;
while (($line = fgets($fileHandler)) !== false) {
if($lineNumber++ != 0) { // Skip the initial die() statement
$data .= $line; // or maybe echo out $line directly so $data doesn't take up too much memory as well.
}
}
fclose($fileHandler);
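A shorter variant of the same idea, as a sketch only: discard the first line with a single fgets() call and read the remainder in one go with stream_get_contents(). The function name readWithoutFirstLine is hypothetical. The remainder still ends up in memory as one string, so this avoids the regex and the extra copy, not the size of the data itself.

```php
<?php
// Hypothetical helper: return a file's contents without its first line,
// or false if the file cannot be opened.
function readWithoutFirstLine(string $path)
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        return false;
    }
    fgets($handle);                       // read and discard the first line
    $data = stream_get_contents($handle); // everything after it
    fclose($handle);
    return $data;
}
```

If the cached data is only going to be sent to the client, fpassthru() on the same handle would avoid building the string at all.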
I suggest you use this in the file including the cached files:
define('INCLUDESALLOW', 1);
and in the file that will be included:
if( !defined('INCLUDESALLOW') ) die("Access Denied");
Then just use include instead of file_get_contents. This would run any PHP code in the included file, though; I'm not 100% sure if that is what you need.
There are times when you will use more memory than the 8 MB PHP has allotted. If you're unable to use less memory by making your code more efficient, you might have to increase your available memory. This can be done in two ways.
The limit can be set to a global default in php.ini:
memory_limit = 32M
Or you can override it in your script like this:
<?php
ini_set('memory_limit', '64M');
...
For more on PHP memory limit you can see This SO question or ini.memory-limit.

error when changing ereg_replace to preg_replace

I am working on old sites and updating the deprecated php functions. I have the following code, which creates an error when I change the ereg to preg.
private function stripScriptTags($string) {
$pattern = array("'\/\*.*\*\/'si", "'<\?.*?\?>'si", "'<%.*?%>'si", "'<script[^>]*?>.*?</script>'si");
$replace = array("", "", "", "");
return ereg_replace($pattern, $replace, $string);
}
This is the error I get:
Fatal error: Allowed memory size of 10000000 bytes exhausted (tried to allocate 6249373 bytes) in C:\Inetpub\qcppos.com\library\vSearch.php on line 403
Is there something else in that line of code that I need to be changing along with the ereg_replace?
So your regexes are as follows:
"'\/\*.*\*\/'si"
"'<\?.*?\?>'si"
"'<%.*?%>'si"
"'<script[^>]*?>.*?</script>'si"
Taking those one at a time, you are first greedily stripping out multiline comments. This is almost certainly where your memory problem is coming from; you should make that quantifier lazy (non-greedy).
Next up, you are stripping out anything that looks like a PHP tag. This is done with a lazy quantifier, so I don't see any issue with it. Same goes for the ASP tag, and finally the script tag.
Leaving aside potential XSS threats not covered by your regex, the main issue seems to be coming from your first regex. Try "'\/\*.*?\*\/'si" instead.
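To illustrate the difference between the greedy and the lazy quantifier, here is a small sketch on a made-up input string. It only shows how the two patterns match; whether the greedy version is really what exhausts the memory limit is the answerer's hypothesis, not something this snippet proves.

```php
<?php
$source = "a /* one */ b /* two */ c";

// Greedy: .* runs to the LAST "*/", so everything from the first "/*" to
// the final "*/" is removed, including the code between the two comments.
$greedy = preg_replace('~/\*.*\*/~s', '', $source);

// Lazy: .*? stops at the FIRST "*/", so each comment is removed on its own.
$lazy = preg_replace('~/\*.*?\*/~s', '', $source);

var_dump($greedy); // "a  c"
var_dump($lazy);   // "a  b  c"
```

The sketch uses ~ as the delimiter so the slashes need no escaping; the patterns are otherwise equivalent to the ones discussed above.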
Get your memory limit value:
ini_get('memory_limit');
and check your script with memory_get_usage() and memory_get_peak_usage()
(this one needs PHP 5.2 or higher).
If this turns out to be too low then you can set it higher with:
ini_set("memory_limit","128M"); // default was "8M" before PHP 5.2.0, "16M" in PHP 5.2.0, "128M" in later versions
Just put in whatever amount of memory you have available and that works for your script. Keep in mind that this limits the individual PHP process; to set it globally you need to adapt your php.ini file.
Obviously if it needs an insane amount to run, then consider it a monkeypatch and start rewriting it for a better memory footprint.
For more info look at the core php.ini directives and search for Resource Limits.
Since allowing higher memory allocation wasn't working, the following functions were updated as follows (due to the fact that they aren't actually doing anything but causing issues):
private function stripScriptTags($string)
{
/* $pattern = array("'\/\*.*\*\/'si", "'<\?.*?\?>'si", "'<%.*?%>'si",
"'<script[^>]*?>.*?</script>'si");
$replace = array("", "", "", "");
return ereg_replace($pattern, $replace, $string);
*/
return $string;
}
private function clearSpaces($string, $clear_enters = true)
{
/*$pattern = ($clear_enters == true) ? ("/\s+/") : ("/[ \t]+/");
return preg_replace($pattern, " ", trim($string));
*/
return $string;
}

Iterate over each line in a string in PHP

I have a form that allows the user to either upload a text file or copy/paste the contents of the file into a textarea. I can easily differentiate between the two and put whichever one they entered into a string variable, but where do I go from there?
I need to iterate over each line of the string (preferably not worrying about newlines on different machines), make sure that it has exactly one token (no spaces, tabs, commas, etc.), sanitize the data, then generate an SQL query based off of all of the lines.
I'm a fairly good programmer, so I know the general idea about how to do it, but it's been so long since I worked with PHP that I feel I am searching for the wrong things and thus coming up with useless information. The key problem I'm having is that I want to read the contents of the string line-by-line. If it were a file, it would be easy.
I'm mostly looking for useful PHP functions, not an algorithm for how to do it. Any suggestions?
preg_split the variable containing the text, and iterate over the returned array:
foreach(preg_split("/((\r?\n)|(\r\n?))/", $subject) as $line){
// do stuff with $line
}
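Building on the split above, the "exactly one token per line" check from the question might be sketched like this. The function name singleTokenLines is hypothetical, and blank lines are silently skipped here, which is an assumption rather than something the question specifies.

```php
<?php
// Hypothetical helper: split text into lines and keep only the lines that
// consist of exactly one token (no whitespace and no commas).
function singleTokenLines(string $text): array
{
    $tokens = [];
    // Try "\r\n" first so a Windows line ending counts as one break.
    foreach (preg_split('/\r\n|\r|\n/', $text) as $line) {
        if ($line === '') {
            continue; // skip blank lines
        }
        if (preg_match('/^[^\s,]+$/', $line) === 1) {
            $tokens[] = $line;
        }
    }
    return $tokens;
}

print_r(singleTokenLines("alpha\r\nbeta gamma\n\ndelta\rep,silon\nzeta"));
// keeps: alpha, delta, zeta
```

The returned tokens would still need to go through bound parameters or proper escaping before being used in an SQL query.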
I would like to propose a significantly faster (and memory efficient) alternative: strtok rather than preg_split.
$separator = "\r\n";
$line = strtok($subject, $separator);
while ($line !== false) {
# do something with $line
$line = strtok( $separator );
}
Testing the performance, I iterated 100 times over a test file with 17 thousand lines: preg_split took 27.7 seconds, whereas strtok took 1.4 seconds.
Note that though the $separator is defined as "\r\n", strtok will separate on either character and, as of PHP 4.1.0, skip empty lines/tokens.
See the strtok manual entry:
http://php.net/strtok
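A small sketch of the behaviour described in the note above, using a made-up input that mixes all three newline styles and contains one blank line:

```php
<?php
$subject = "first\r\nsecond\n\nthird\rfourth";
$lines = [];

// Every character in the second argument is a delimiter on its own,
// so "\r\n" splits on "\r" and on "\n".
$line = strtok($subject, "\r\n");
while ($line !== false) {
    $lines[] = $line;
    $line = strtok("\r\n");
}

print_r($lines); // first, second, third, fourth (the blank line is skipped)
```

Because empty tokens are skipped, this approach cannot be used when blank lines are significant.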
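If your input uses the same newline convention as the server, you can use the PHP predefined constant PHP_EOL (http://php.net/manual/en/reserved.constants.php) with explode to avoid the overhead of the regular expression engine. Note that PHP_EOL is the newline sequence of the platform PHP runs on, so text pasted or uploaded from a different system may not split correctly.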
$lines = explode(PHP_EOL, $subject);
It's overly-complicated and ugly but in my opinion this is the way to go:
$fp = fopen("php://memory", 'r+');
fputs($fp, $data);
rewind($fp);
while($line = fgets($fp)){
// deal with $line
}
fclose($fp);
Potential memory issues with strtok:
One of the suggested solutions uses strtok but doesn't point out a potential memory issue (though it claims to be memory efficient). According to the manual:
Note that only the first call to strtok uses the string argument.
Every subsequent call to strtok only needs the token to use, as it
keeps track of where it is in the current string.
It does this by keeping the string in memory. If you're working with large files, you need to release it once you're done looping through the file.
<?php
function process($str) {
$line = strtok($str, PHP_EOL);
/*do something with the first line here...*/
while ($line !== FALSE) {
// get the next line
$line = strtok(PHP_EOL);
/*do something with the rest of the lines here...*/
}
//the bit that frees up memory
strtok('', '');
}
If you're only concerned with physical files (eg. datamining):
According to the manual, for the file upload part you can use the file() function:
//Create the array
$lines = file( $some_file );
foreach ( $lines as $line ) {
//do something here.
}
foreach(preg_split('~[\r\n]+~', $text) as $line){
if(empty($line) or ctype_space($line)) continue; // skip only spaces
// if(!strlen($line = trim($line))) continue; // or trim by force and skip empty
// $line is trimmed and nice here so use it
}
^ this is how you break lines properly, cross-platform compatible with Regexp :)
Kyril's answer is best considering you need to be able to handle newlines on different machines.
"I'm mostly looking for useful PHP functions, not an algorithm for how
to do it. Any suggestions?"
I use these a lot:
explode() can be used to split a string into an array, given a
single delimiter.
implode() is explode's counterpart, to go from array back to string.
Similar to @pguardiario's answer, but using a more "modern" (OOP) interface:
$fileObject = new \SplFileObject('php://memory', 'r+');
$fileObject->fwrite($content);
$fileObject->rewind();
while ($fileObject->valid()) {
$line = $fileObject->current();
$fileObject->next();
}
SplFileObject doc: https://www.php.net/manual/en/class.splfileobject.php
PHP IO streams: https://www.php.net/manual/en/wrappers.php.php
