preg_match on formula and characters given? - php

I need to be able to tell if there is a match of serials given the following:
$formula = 'XXXXX-XXXXX-XXXXX-XXXXX-XXXXX';
$chars = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';
$serials = array(
'9876-345-ABC',
'7856Y-YURYW-00UEW-YUI23-YYYYY',
'0934Y-R6834-27495-89999-11123'
);
Given the $serials array above, how can I return true for every value that matches the formula, where X is a placeholder for any character in $chars? The hyphens in each serial must also be in the same positions as in the formula.
foreach($serials as $serial)
{
if(preg_match("???", $serial) === 1)
echo 'found';
}
This should echo found for the last two elements of $serials. It seems simple enough, but I still can't wrap my head around regexes no matter how hard I try.

This is certainly not the best solution, but give it a shot and comment.
Assumption: the formula contains only X's (separated by hyphens).
$formula = 'XXX-XX-XXX-X-XXXXX';
$parts = explode('-', $formula); // split() was removed in PHP 7; explode() suffices for a literal delimiter
$chars = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';
$reg = '';
foreach ($parts as $x) {
$reg .= '[' . $chars . ']{' . strlen($x) . '}-';
}
$reg = substr_replace($reg, '', -1);
$serials = array(
'9876-345-ABC',
'7856Y-YUR-00W-YUI23-YYY',
'0934Y-R6834-27495-89999-11123',
'XXX-XX-XXX-X-XXXXX'
);
$reg = '/^' . $reg . '$/';
foreach($serials as $serial) {
if(preg_match($reg, $serial) != 0) {
echo $serial;
echo "\n";
}
}

$formula = 'XXXXX-XXXXX-XXXXX-XXXXX-XXXXX';
$chars = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';
$serials = array(
'9876-345-ABC',
'7856Y-YURYW-00UEW-YUI23-YYYYY',
'0934Y-R6834-27495-89999-11123'
);
foreach($serials as $serial) {
$str = str_replace(str_split($chars), 'X', $serial);
echo $str === $formula ? "yes\n" : "no\n";
}

You could go for (in multiline mode):
^(?:[0-9A-Z]{3,5}-?){3,5}$
# match the start of the line
# open a non-capturing group (?:
# look for a digit (0-9) or an uppercase letter (A-Z)
# ... between 3-5 times
# make the dash optional -?
# and repeat the non-capturing group 3-5 times
# $ makes sure this is the end of the string
Since regex101.com does not seem to be working at the moment, here is a plain-text example of the regex instead. It matches the lines marked with an asterisk at the end:
9876-345-ABC *
7856Y-YURYW-00UEW-YUI23-YYYYY *
0934Y-R6834-27495-89999-11123 *
this-one-not
this one neither
Translated to PHP, this would be:
$regex = '~^(?:[0-9A-Z]{3,5}-?){3,5}$~';
if (preg_match($regex, $string)) {
echo "This is a valid serial";
}

You can do it this way; it collapses each run of successive X's into a single character class followed by a "{n}" quantifier.
/**
* $formula = 'XXXXX-XXXXX-XXXXX-XXXXX-XXXXX';
* $chars = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';
* $serials = array(
* '9876-345-ABC',
* '7856Y-YURYW-00UEW-YUI23-YYYYY',
* '0934Y-R6834-27495-89999-11123'
* );
*/
function checkThisFormula($formula, $chars, array $serials) {
$formulaLength = strlen($formula);
$regex = "";
$charsRegex = "[".$chars."]";
$lastIsX = false;
$nbX = 0;
// let's construct the regex from formula
for($i = 0; $i < $formulaLength; $i++) {
if($formula[$i] === 'X') {
// let's count how many X we see before writing
$nbX++;
$lastIsX = true;
} else {
if($lastIsX) {
// end of successive Xs
$regex .= "[".$chars."]";
if($nbX > 1) {
$regex .= "{".$nbX."}";
}
// reinit X count
$lastIsX = false;
$nbX = 0;
}
// has to be this exact char; preg_quote() escapes it only if it is special in a regex
$regex .= preg_quote($formula[$i], '#');
}
}
if($lastIsX) {
// if the last char is an X, have to write it too !
$regex .= "[".$chars."]";
if($nbX > 1) {
$regex .= "{".$nbX."}";
}
}
// anchor the pattern so the whole serial must match, and add the case-insensitive flag
$regex = '#^' . $regex . '$#i';
$result = array();
// let's loop on every serial to test it
foreach($serials as $serial) {
$result[$serial] = preg_match($regex, $serial);
}
return $result;
}
output :
Array
(
[9876-345-ABC] => 0
[7856Y-YURYW-00UEW-YUI23-YYYYY] => 1
[0934Y-R6834-27495-89999-11123] => 1
)

I think the easiest way would be something like this:
foreach($serials as $serial)
{
if(preg_match("/^([$chars]{5}-){4}[$chars]{5}$/", $serial) == 1)
echo 'found - '.$serial.'<br>';
}
Result would be:
found - 7856Y-YURYW-00UEW-YUI23-YYYYY
found - 0934Y-R6834-27495-89999-11123
I hope that's what you want to do.
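The answers above either hard-code the group sizes or build the pattern by hand. As a rough sketch of the general idea, the formula can be translated character by character into an anchored pattern. The helper name formulaToRegex() is made up for this illustration, and it assumes that X is the only placeholder and that every other formula character must match literally:

```php
<?php
// Hypothetical helper (not from the answers above): 'X' means "any character
// of $chars"; every other formula character has to match literally.
function formulaToRegex(string $formula, string $chars): string
{
    $class = '[' . preg_quote($chars, '/') . ']';
    $pattern = '';
    foreach (str_split($formula) as $symbol) {
        $pattern .= $symbol === 'X' ? $class : preg_quote($symbol, '/');
    }
    // \A and \z anchor the whole string, so partial matches are rejected.
    return '/\A' . $pattern . '\z/';
}

$formula = 'XXXXX-XXXXX-XXXXX-XXXXX-XXXXX';
$chars   = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';
$regex   = formulaToRegex($formula, $chars);

$serials = array(
    '9876-345-ABC',
    '7856Y-YURYW-00UEW-YUI23-YYYYY',
    '0934Y-R6834-27495-89999-11123',
);
foreach ($serials as $serial) {
    echo $serial, ': ', preg_match($regex, $serial) === 1 ? 'found' : 'no match', "\n";
}
```

The anchors matter here: without them, a serial that merely contains a valid serial somewhere inside it would also be reported as found.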

Related

How To Check If A Specific Character Exists In A String In A PHP?

I'm creating a function that checks whether the words in a string carry a specific prefix character. I need to transform the text based on two conditions.
1. If a word is prefixed with +, I have to show the word in the output without the + prefix.
eg:
"we have dinner +today"
output:
"we have dinner today"
2. If a word is prefixed with -, I have to remove that whole word from the output.
eg:
"we have dinner -today"
output:
"we have dinner".
I will also pass an extra parameter called length to this function. Any word shorter than the given length is removed.
eg:
length=4;
eg:
"we have dinner -today"
output:
"have dinner".
eg:
"we have dinner +today"
output:
"have dinner today".
The function I have created so far is:
$fulltext="-through the +use of the +tab +key";
$length=4;
function checkstring($fulltext,$length)
{
$stringArray = explode(" ", $fulltext);
foreach ($stringArray as $value)
{
if(strlen($value) < $length)
$fulltext= str_replace(" ".$value." " ," ",$fulltext);
}
return $fulltext;
}
print_r(checkstring($fulltext,$length));
The output should be "use tab key"
You can iterate over the words and keep only those that satisfy the conditions.
$fulltext = "-through the +use of the +tab +key";
$length = 4;
function checkstring($fulltext, $length)
{
$words = preg_split('~\s~',$fulltext);
$remains = [];
foreach ($words as $word)
{
if (strpos($word, '-') === 0) {
continue;
}
if (strpos($word, '+') === 0) {
$word = substr($word, 1);
}
elseif (strlen($word) < $length) { // only words shorter than $length are dropped
continue;
}
$remains[] = $word;
}
return trim(implode(' ', $remains));
}
echo checkstring($fulltext, $length);
Output :
use tab key
You can use ctype_alnum() to check whether the text contains any symbols at all:
if(ctype_alnum($fulltext)) {
echo 'Does not contain symbols';
} else {
echo 'Contains symbols';
}
One way is to use a regular expression to find each word together with its prefix symbol and check its length. If the prefix is '+', you can then replace the match with the word without the prefix.
Take a look at preg_replace_callback() and the example below.
/* a unix-style command line filter to convert uppercase
* letters at the beginning of paragraphs to lowercase */
<?php
$line = "<p>Start of the paragraph</p>";
$line = preg_replace_callback(
'|<p>\s*\w|',
function ($matches) {
return strtolower($matches[0]);
},
$line
);
echo $line;
?>
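Applied to the question itself, the preg_replace_callback() idea could look roughly like the sketch below. The function name filterWords() and the pattern are assumptions made for this illustration; the rules are the ones stated in the question (drop "-word", keep "+word" without its prefix, drop unprefixed words shorter than $length):

```php
<?php
// Sketch only: filterWords() is a hypothetical name, not code from the answers.
function filterWords(string $text, int $length): string
{
    $filtered = preg_replace_callback(
        '/([+-]?)(\S+)/',
        function (array $m) use ($length): string {
            [, $prefix, $word] = $m;
            if ($prefix === '-') {
                return '';        // drop "-word" entirely
            }
            if ($prefix === '+') {
                return $word;     // keep the word, drop the "+"
            }
            // unprefixed words survive only if they are long enough
            return strlen($word) < $length ? '' : $word;
        },
        $text
    );
    // collapse the gaps left behind by removed words
    return trim(preg_replace('/\s+/', ' ', $filtered));
}

echo filterWords('-through the +use of the +tab +key', 4), "\n"; // use tab key
```

Because every run of non-whitespace characters is handled in a single callback, a word is never touched twice, which avoids the side effects of repeated str_replace() calls on the whole sentence.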
function checkstring($fulltext, $length)
{
$stringArray = explode(" ", $fulltext);
foreach ($stringArray as $value) {
if ($value[0] == "-") {
$fulltext = str_replace($value, " ", $fulltext);
} else if ($value[0] == "+") {
$fulltext = str_replace($value, substr($value, 1), $fulltext);
}
if (strlen($value) < $length)
$fulltext = str_replace(" " . $value . " ", " ", $fulltext);
}
return $fulltext;
}
Here is the complete function
Using regex
function checkstring(string $fulltext, int $length) {
// capture every word that carries a + or - prefix
preg_match_all('/[+-](\w+)/', $fulltext, $matches);
$text = "";
foreach ($matches[1] as $i => $word) {
// note: only +-prefixed words shorter than $length are kept; everything else is dropped
if ($matches[0][$i][0] === "+" && strlen($word) < $length) {
$text .= $word . " ";
}
}
return trim($text);
}
$fulltext="-through the +use of the +tab +key";
$length = 4;
echo checkstring($fulltext,$length);
Output
use tab key

Replace the Nth occurrence of char in a string with a new substring

I want to do a str_replace() but only at the Nth occurrence.
Inputs:
$originalString = "Hello world, what do you think of today's weather";
$findString = ' ';
$nthOccurrence = 8;
$newWord = ' beautiful ';
Desired Output:
Hello world, what do you think of today's beautiful weather
Here is a tight little regex with \K that allows you to replace the nth occurrence of a string without repeating the needle in the pattern. If your search string is dynamic and might contain characters with special meaning, then preg_quote() is essential to the integrity of the pattern.
If you wanted to statically write the search string and nth occurrence into your pattern, it could be:
(?:.*?\K ){8}
or more efficiently for this particular case: (?:[^ ]*\K ){8}
\K tells the regex pattern to "forget" any previously matched characters in the fullstring match. In other words, "restart the fullstring match" or "Keep from here". In this case, the pattern only keeps the 8th space character.
Code: (Demo)
function replaceNth(string $input, string $find, string $replacement, int $nth = 1): string {
$pattern = '/(?:.*?\K' . preg_quote($find, '/') . '){' . $nth . '}/';
return preg_replace($pattern, $replacement, $input, 1);
}
echo replaceNth($originalString, $findString, $newWord, $nthOccurrence);
// Hello world, what do you think of today's beautiful weather
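To see why preg_quote() matters for a dynamic needle, here is a small sketch that uses the same pattern shape with quoting switched on and off. The helper replaceNthDemo() and its $quote flag exist only for this illustration:

```php
<?php
// Hypothetical demo helper: same pattern shape as replaceNth(), quoting optional.
function replaceNthDemo(string $input, string $find, string $replacement, int $nth, bool $quote): string
{
    $needle  = $quote ? preg_quote($find, '/') : $find;
    $pattern = '/(?:.*?\K' . $needle . '){' . $nth . '}/';
    return preg_replace($pattern, $replacement, $input, 1);
}

// Quoted: '.' is a literal dot, so the second dot is replaced.
echo replaceNthDemo('10.20.30', '.', '-', 2, true), "\n";  // 10.20-30

// Unquoted: '.' means "any character", so the second character is replaced.
echo replaceNthDemo('10.20.30', '.', '-', 2, false), "\n"; // 1-.20.30
```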
Another way to look at the question is: "How do I insert a new string after the nth instance of a search string?" Here is a non-regex approach that limits the number of pieces explode() produces, prepends the new string to the last element, then re-joins the elements.
$originalString = "Hello world, what do you think of today's weather";
$findString = ' ';
$nthOccurrence = 8;
$newWord = 'beautiful '; // notice that leading space was removed
function insertAfterNth($input, $find, $newString, $nth = 1) {
$parts = explode($find, $input, $nth + 1);
$parts[$nth] = $newString . $parts[$nth];
return implode($find, $parts);
}
echo insertAfterNth($originalString, $findString, $newWord, $nthOccurrence);
// Hello world, what do you think of today's beautiful weather
I found an answer here - https://gist.github.com/VijayaSankarN/0d180a09130424f3af97b17d276b72bd
$subject = "Hello world, what do you think of today's weather";
$search = ' ';
$occurrence = 8;
$replace = ' nasty ';
/**
* String replace nth occurrence
*
* @param type $search Search string
* @param type $replace Replace string
* @param type $subject Source string
* @param type $occurrence Nth occurrence
* @return type Replaced string
*/
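function str_replace_n($search, $replace, $subject, $occurrence)
{
// quote the needle for use between "/" delimiters
$search = preg_quote($search, '/');
// skip the first ($occurrence - 1) matches, then replace the next one
return preg_replace("/^((?:(?:.*?$search){" . ($occurrence - 1) . "}.*?))$search/", '${1}' . $replace, $subject);
}
echo str_replace_n($search, $replace, $subject, $occurrence);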
$originalString = "Hello world, what do you think of today's weather";
$findString = ' ';
$nthOccurrence = 8;
$newWord = ' beautiful ';
$array = str_split($originalString);
$count = 0;
$num = 0;
foreach ($array as $char) {
if($findString == $char){
$count++;
}
$num++;
if($count == $nthOccurrence){
array_splice($array, $num - 1, 1, $newWord); // replace the matched character instead of inserting after it
break;
}
}
$newString = implode('', $array);
echo $newString;
I would consider something like:
function replaceNth($string, $substring, $replacement, $nth = 1){
$a = explode($substring, $string); $n = $nth-1;
for($i=0,$l=count($a)-1; $i<$l; $i++){
$a[$i] .= $i === $n ? $replacement : $substring;
}
return join('', $a);
}
$originalString = 'Hello world, what do you think of today\'s weather';
$test = replaceNth($originalString, ' ', ' beautiful ' , 8);
$test2 = replaceNth($originalString, 'today\'s', 'good');
First explode the string into parts, then concatenate the parts back together with the search string, except at the given position, where the replacement string is used instead (positions start from 0 here for convenience):
function str_replace_nth($search, $replace, $subject, $number = 0) {
$parts = explode($search, $subject);
$lastPartKey = array_key_last($parts);
$result = '';
foreach($parts as $key => $part) {
$result .= $part;
if($key != $lastPartKey) {
if($key == $number) {
$result .= $replace;
} else {
$result .= $search;
}
}
}
return $result;
}
Usage:
$originalString = "Hello world, what do you think of today's weather";
$findString = ' ';
$nthOccurrence = 7;
$newWord = ' beautiful ';
$result = str_replace_nth($findString, $newWord, $originalString, $nthOccurrence);
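The explode-based answers above split the whole string even when only one position is needed. A different technique, sketched here under the assumption that occurrences are counted from 1 and do not overlap, is to walk the string with strpos() and splice once with substr_replace(). The name replaceNthByOffset() is hypothetical:

```php
<?php
// Hypothetical alternative: locate the nth occurrence with strpos(), then splice.
function replaceNthByOffset(string $subject, string $search, string $replace, int $nth): string
{
    if ($search === '' || $nth < 1) {
        return $subject;
    }
    $offset = 0;
    $pos = false;
    for ($i = 0; $i < $nth; $i++) {
        $pos = strpos($subject, $search, $offset);
        if ($pos === false) {
            return $subject; // fewer than $nth occurrences: leave the string alone
        }
        $offset = $pos + strlen($search);
    }
    return substr_replace($subject, $replace, $pos, strlen($search));
}

echo replaceNthByOffset("Hello world, what do you think of today's weather", ' ', ' beautiful ', 8), "\n";
// Hello world, what do you think of today's beautiful weather
```

If the string contains fewer occurrences than requested, the input comes back unchanged rather than raising a warning.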

How to replace the first occurrence in a string?

I am trying to replace every question mark "?" in a string with the values in an array.
I need to go through the string and replace the first occurrence of '?' with a value, and repeat that for every value in the array.
Here is what I tried
function sprintf2($str='', array $values = array(), $char = '?')
{
if (!$str){
return '';
}
if (count($values) > 0)
{
foreach ($values as $value)
{
$str = preg_replace('/'. $char . '/', $value, $str, 1);
}
}
echo $str;
}
But I am getting the following exception
preg_replace(): Compilation failed: nothing to repeat at offset 0
The following shows how I am calling the function
$bindings = array(10, 500);
$str = "select * from `survey_interviews` where `survey_id` = ? and `call_id` = ? limit 1";
sprintf2($str, $bindings);
What am I doing wrong here? Why do I get this exception?
Use str_replace instead of preg_replace, since you're replacing a literal string, not a regular expression pattern.
However, str_replace always replaces all matches; there's no way to limit it to just the first match (unlike preg_replace, whose fourth argument is a limit). The fourth argument of str_replace is not a limit; it's a variable that gets set to the number of replacements performed. To replace just one match, you can combine strpos and substr_replace:
function sprintf2($str='', array $values = array(), $char = '?')
{
if (!$str){
return '';
}
if (count($values) > 0)
{
$len = strlen($char);
foreach ($values as $value)
{
$pos = strpos($str, $char);
if ($pos !== false) {
$str = substr_replace($str, $value, $pos, $len);
}
}
}
echo $str;
}
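The difference between the two fourth arguments is easy to check with a short experiment. This is only an illustration; the sample string is made up:

```php
<?php
$sql = 'a = ? and b = ?';

// str_replace(): the fourth argument is an output parameter that receives
// the number of replacements; every occurrence is replaced.
$all = str_replace('?', '10', $sql, $count);
echo $all, ' (', $count, " replacements)\n"; // a = 10 and b = 10 (2 replacements)

// preg_replace(): the fourth argument is a limit. The '?' has to be quoted
// because it is a quantifier in a regex.
$first = preg_replace('/' . preg_quote('?', '/') . '/', '10', $sql, 1);
echo $first, "\n"; // a = 10 and b = ?
```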
You need to escape the '?' sign in your regexp using a backslash ( '\?' instead of '?').
But your code can be easily refactored to use preg_replace_callback instead:
$params = array(1, 3);
$str = '? bla ?';
echo preg_replace_callback('#\?#', function() use (&$params) {
return array_shift($params); // take the values in order: first placeholder gets the first value
}, $str);
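A variant of the callback idea that keeps the values in order and tolerates a mismatch between the number of placeholders and values might look like this. The function name bindPlaceholders() is invented for this sketch, and it assumes $char is a non-empty string:

```php
<?php
// Sketch: bindPlaceholders() is a hypothetical name, not part of the question's code.
function bindPlaceholders(string $str, array $values, string $char = '?'): string
{
    $values = array_values($values); // make sure the keys are 0, 1, 2, ...
    $i = 0;
    return preg_replace_callback(
        '/' . preg_quote($char, '/') . '/',
        function (array $m) use ($values, &$i): string {
            // Once the values run out, leave the placeholder untouched.
            return array_key_exists($i, $values) ? (string) $values[$i++] : $m[0];
        },
        $str
    );
}

echo bindPlaceholders('select * from t where a = ? and b = ? limit ?', [10, 500]), "\n";
// select * from t where a = 10 and b = 500 limit ?
```

Since the replacement comes from a callback, characters such as $ or \ inside a value are inserted literally instead of being read as backreferences.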
Hope this code helps you.
function sprintf2($string = '', array $values = array(), $char = '?')
{
if (!$string){
return '';
}
if (count($values) > 0)
{
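$exploded = explode($char, $string);
$lastKey = count($exploded) - 1;
$string = '';
foreach ($exploded as $i => $segment) {
$string .= $segment;
// every segment except the last one was followed by a placeholder
if ($i < $lastKey) {
// fall back to the placeholder itself once the values run out
$string .= array_key_exists($i, $values) ? $values[$i] : $char;
}
}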
}
return $string;
}
$bindings = array(10, 500);
$str = "select * from `survey_interviews` where `survey_id` = ? and `call_id`= ? limit 1";
echo sprintf2($str, $bindings);
Explanation:
In your code, you're using preg_replace, and its first parameter is a regex pattern. The ? you're trying to replace has a special meaning in a regex: it is a quantifier that makes the preceding token match zero or one time. With nothing before it, the pattern fails to compile, so you have to escape it as \?. Not every character needs escaping, so to make your method work for any $char, you have to check whether the character has a special meaning in a regex.
In my code, I split the string on the character you want, then append a value from the array after each part. This is done only while values remain in the array; otherwise an undefined offset error would occur.

PHP rename all variables inside code

I would like to rename all variables within the file to random name.
For example this:
$example = "some $string";
function ($variable2) {
echo $variable2;
}
foreach ($variable3 as $key => $var3val) {
echo $var3val . "somestring";
}
Will become this:
$frk43r = "some $string";
function ($izi34ee) {
echo $izi34ee;
}
foreach ($erew7er as $iure7 => $er3k2) {
echo $er3k2 . "somestring";
}
It doesn't look like an easy task, so any suggestions would be helpful.
I would use token_get_all to parse the document and map a registered random string replacement on all interesting tokens.
To obfuscate all the variable names, replace T_VARIABLE in one pass, ignoring all the superglobals.
Additionally, for the bounty's requisite function names, replace all the T_FUNCTION declarations in the first pass. Then a second pass is needed to replace all the T_STRING invocations because PHP allows you to use a function before it's declared.
For this example, I generated all lowercase letters to avoid case-insensitive clashes to function names, but you can obviously use whatever characters you want and add an extra conditional check for increased complexity. Just remember that they can't start with a number.
I also registered all the internal function names with get_defined_functions to protect against the extremely off-chance possibility that a randomly generated string would match one of those function names. Keep in mind this won't protect against special extensions installed on the machine running the obfuscated script that are not present on the server obfuscating the script. The chances of that are astronomical, but you can always ratchet up the length of the randomly generated string to diminish those odds even more.
<?php
$tokens = token_get_all(file_get_contents('example.php'));
$globals = array(
'$GLOBALS',
'$_SERVER',
'$_GET',
'$_POST',
'$_FILES',
'$_COOKIE',
'$_SESSION',
'$_REQUEST',
'$_ENV',
);
// prevent name clashes with randomly generated strings and native functions
$registry = get_defined_functions();
$registry = $registry['internal'];
// first pass to change all the variable names and function name declarations
foreach($tokens as $key => $element){
// make sure it's an interesting token
if(!is_array($element)){
continue;
}
switch ($element[0]) {
case T_FUNCTION:
$prefix = '';
// this jumps over the whitespace to get the function name
$index = $key + 2;
break;
case T_VARIABLE:
// ignore the superglobals
if(in_array($element[1], $globals)){
continue 2;
}
$prefix = '$';
$index = $key;
break;
default:
continue 2;
}
// check to see if we've already registered it
if(!isset($registry[$tokens[$index][1]])){
// make sure our random string hasn't already been generated
// or just so crazily happens to be the same name as an internal function
do {
$replacement = $prefix.random_str(16);
} while(in_array($replacement, $registry));
// map the original and register the replacement
$registry[$tokens[$index][1]] = $replacement;
}
// rename the variable
$tokens[$index][1] = $registry[$tokens[$index][1]];
}
// second pass to rename all the function invocations
$tokens = array_map(function($element) use ($registry){
// check to see if it's a function identifier
if(is_array($element) && $element[0] === T_STRING){
// make sure it's one of our registered function names
if(isset($registry[$element[1]])){
// rename the variable
$element[1] = $registry[$element[1]];
}
}
return $element;
},$tokens);
// dump the tokens back out to rebuild the page with obfuscated names
foreach($tokens as $token){
echo $token[1] ?? $token;
}
/**
* https://stackoverflow.com/a/31107425/4233593
* Generate a random string, using a cryptographically secure
* pseudorandom number generator (random_int)
*
* For PHP 7, random_int is a PHP core function
* For PHP 5.x, depends on https://github.com/paragonie/random_compat
*
* @param int $length How many characters do we want?
* @param string $keyspace A string of all possible characters
* to select from
* @return string
*/
function random_str($length, $keyspace = 'abcdefghijklmnopqrstuvwxyz')
{
$str = '';
$max = mb_strlen($keyspace, '8bit') - 1;
for ($i = 0; $i < $length; ++$i) {
$str .= $keyspace[random_int(0, $max)];
}
return $str;
}
Given this example.php
<?php
$example = 'some $string';
if(isset($_POST['something'])){
echo $_POST['something'];
}
function exampleFunction($variable2){
echo $variable2;
}
exampleFunction($example);
$variable3 = array('example','another');
foreach($variable3 as $key => $var3val){
echo $var3val."somestring";
}
Produces this output:
<?php
$vsodjbobqokkaabv = 'some $string';
if(isset($_POST['something'])){
echo $_POST['something'];
}
function gkfadicwputpvroj($zwnjrxupprkbudlr){
echo $zwnjrxupprkbudlr;
}
gkfadicwputpvroj($vsodjbobqokkaabv);
$vfjzehtvmzzurxor = array('example','another');
foreach($vfjzehtvmzzurxor as $riuqtlravsenpspv => $mkdgtnpxaqziqkgo){
echo $mkdgtnpxaqziqkgo."somestring";
}
EDIT 4.12.2016 - please see below! (after first answer)
I've just tried to find a solution which can handle both cases: your given case and the example from Elias Van Ootegem.
Of course it should be improved, as mentioned in one of my comments, but it works for your example:
$source = file_get_contents("source.php");
// this should get all Variables BUT isn't right at the moment if a variable is followed by an ' or " !!
preg_match_all('/\$[\$a-zA-Z0-9\[\'.*\'\]]*/', $source, $matches);
$matches = array_unique($matches[0]);
// this array saves all old and new variable names to track all replacements
$replacements = array();
$obfuscated_source = $source;
foreach($matches as $varName)
{
do // generates random string and tests if it already is used by an earlier replaced variable name
{
// generate a random string -> should be improved.
$randomName = substr(md5(rand()), 0, 7);
// ensure that first part of variable name is a character.
// there could also be a random character...
$randomName = "a" . $randomName;
}
while(in_array("$" . $randomName, $replacements));
if(substr($varName, 0,8) == '$GLOBALS')
{
// this handles the case of GLOBALS variables
$delimiter = substr($varName, 9, 1);
if($delimiter == '$') $delimiter = '';
$newName = '$GLOBALS[' .$delimiter . $randomName . $delimiter . ']';
}
else if(substr($varName, 0,8) == '$_SERVER')
{
// this handles the case of SERVER variables
$delimiter = substr($varName, 9, 1);
if($delimiter == '$') $delimiter = '';
$newName = '$_SERVER[' .$delimiter . $randomName . $delimiter . ']';
}
else if(substr($varName, 0,5) == '$_GET')
{
// this handles the case of GET variables
$delimiter = substr($varName, 6, 1);
if($delimiter == '$') $delimiter = '';
$newName = '$_GET[' .$delimiter . $randomName . $delimiter . ']';
}
else if(substr($varName, 0,6) == '$_POST')
{
// this handles the case of POST variables
$delimiter = substr($varName, 7, 1);
if($delimiter == '$') $delimiter = '';
$newName = '$_POST[' .$delimiter . $randomName . $delimiter . ']';
}
else if(substr($varName, 0,7) == '$_FILES')
{
// this handles the case of FILES variables
$delimiter = substr($varName, 8, 1);
if($delimiter == '$') $delimiter = '';
$newName = '$_FILES[' .$delimiter . $randomName . $delimiter . ']';
}
else if(substr($varName, 0,9) == '$_REQUEST')
{
// this handles the case of REQUEST variables
$delimiter = substr($varName, 10, 1);
if($delimiter == '$') $delimiter = '';
$newName = '$_REQUEST[' .$delimiter . $randomName . $delimiter . ']';
}
else if(substr($varName, 0,9) == '$_SESSION')
{
// this handles the case of SESSION variables
$delimiter = substr($varName, 10, 1);
if($delimiter == '$') $delimiter = '';
$newName = '$_SESSION[' .$delimiter . $randomName . $delimiter . ']';
}
else if(substr($varName, 0,5) == '$_ENV')
{
// this handles the case of ENV variables
$delimiter = substr($varName, 6, 1);
if($delimiter == '$') $delimiter = '';
$newName = '$_ENV[' .$delimiter . $randomName . $delimiter . ']';
}
else if(substr($varName, 0,8) == '$_COOKIE')
{
// this handles the case of COOKIE variables
$delimiter = substr($varName, 9, 1);
if($delimiter == '$') $delimiter = '';
$newName = '$_COOKIE[' .$delimiter . $randomName . $delimiter . ']';
}
else if(substr($varName, 1, 1) == '$')
{
// this handles the case of variable variables
$name = substr($varName, 2, strlen($varName)-2);
$pattern = '/(?=\$)\$' . $name . '.*;/';
preg_match_all($pattern, $source, $varDeclaration);
$varDeclaration = $varDeclaration[0][0];
preg_match('/\s*=\s*["\'](?:\\.|[^"\\]])*["\']/', $varDeclaration, $varContent);
$varContent = $varContent[0];
preg_match('/["\'](?:\\.|[^"\\]])*["\']/', $varContent, $varContentDetail);
$varContentDetail = substr($varContentDetail[0], 1, strlen($varContentDetail[0])-2);
$replacementDetail = str_replace($varContent, substr($replacements["$" . $varContentDetail], 1, strlen($replacements["$" . $varContentDetail])-1), $varContent);
$explode = explode($varContentDetail, $varContent);
$replacement = $explode[0] . $replacementDetail . $explode[1];
$obfuscated_source = str_replace($varContent, $replacement, $obfuscated_source);
}
else
{
$newName = '$' . $randomName;
}
$obfuscated_source = str_replace($varName, $newName, $obfuscated_source);
$replacements[$varName] = $newName;
}
// this part may be useful to change hard-coded returns of functions.
// it changes all remaining words in the document which are like the previous changed variable names to the new variable names
// attention: if the variables in the document have common names it could also change text you don't like to change...
foreach($replacements as $before => $after)
{
$name_before = str_replace("$", "", $before);
$name_after = str_replace("$", "", $after);
$obfuscated_source = str_replace($name_before, $name_after, $obfuscated_source);
}
// here you can place code to write back the obfuscated code to the same or to a new file, e.g:
$file = fopen("result.php", "w");
fwrite($file, $obfuscated_source);
fclose($file);
EDIT: there are still some cases left which require some effort.
At least some kinds of variable declarations may not be handled correctly!
Also, the first regex is not perfect; my current version is:
'/\$\$?[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*/'
but this does not capture the index values of predefined variables. Still, I think it has some potential: used as shown here, it finds all 18 variables involved. The next step could be to determine whether a [..] follows the variable name. If so, any predefined variable, as well as cases like $g = $GLOBALS; and any further use of such a $g, would be covered.
EDIT 4.12.2016
Prompted by LSerni and several comments on both the original question and some of the solutions, I also wrote a parsing solution, which you can find below.
It handles an extended example file, which was my aim. If you find any other challenge, please tell me!
new solution:
$variable_names_before = array();
$variable_names_after = array();
$function_names_before = array();
$function_names_after = array();
$forbidden_variables = array(
'$GLOBALS',
'$_SERVER',
'$_GET',
'$_POST',
'$_FILES',
'$_COOKIE',
'$_SESSION',
'$_REQUEST',
'$_ENV',
);
$forbidden_functions = array(
'unlink'
);
// read file
$data = file_get_contents("example.php");
$lock = false;
$lock_quote = '';
for($i = 0; $i < strlen($data); $i++)
{
// check if there are quotation marks
if(($data[$i] == "'" || $data[$i] == '"'))
{
// if first quote
if($lock_quote == '')
{
// remember quotation mark
$lock_quote = $data[$i];
$lock = true;
}
else if($data[$i] == $lock_quote)
{
$lock_quote = '';
$lock = false;
}
}
// detect variables
if(!$lock && $data[$i] == '$')
{
$start = $i;
// detect variable variable names
if($data[$i+1] == '$')
{
$start++;
// increment $i to avoid second detection of variable variable as "normal variable"
$i++;
}
$end = 1;
// find end of variable name
while(ctype_alpha($data[$start+$end]) || is_numeric($data[$start+$end]) || $data[$start+$end] == "_")
{
$end++;
}
// extract variable name
$variable_name = substr($data, $start, $end);
if($variable_name == '$')
{
continue;
}
// check if variable name is allowed
if(in_array($variable_name, $forbidden_variables))
{
// forbidden variable detected, do whatever you want!
}
else
{
// check if variable name already has been detected
if(!in_array($variable_name, $variable_names_before))
{
$variable_names_before[] = $variable_name;
// generate random name for variable
$new_variable_name = "";
do
{
$new_variable_name = random_str(rand(5, 20));
}
while(in_array($new_variable_name, $variable_names_after));
$variable_names_after[] = $new_variable_name;
}
//var_dump("variable: " . $variable_name);
}
}
// detect function-definitions
// the third condition checks if the symbol before 'function' is neither a character nor a number
if(!$lock && strtolower(substr($data, $i, 8)) == 'function' && (!ctype_alpha($data[$i-1]) && !is_numeric($data[$i-1])))
{
// find end of function name
$end = strpos($data, '(', $i);
// extract function name and remove possible spaces on the right side
$function_name = rtrim(substr($data, ($i+9), $end-$i-9));
// check if function name is allowed
if(in_array($function_name, $forbidden_functions))
{
// forbidden function detected, do whatever you want!
}
else
{
// check if function name already has been detected
if(!in_array($function_name, $function_names_before))
{
$function_names_before[] = $function_name;
// generate random name for function
$new_function_name = "";
do
{
$new_function_name = random_str(rand(5, 20));
}
while(in_array($new_function_name, $function_names_after));
$function_names_after[] = $new_function_name;
}
//var_dump("function: " . $function_name);
}
}
}
// this array contains prefixes and suffixes for string literals which
// may contain variable names.
// if string literals as a return of functions should not be changed
// remove the last two inner arrays of $possible_pre_suffixes
// this will enable correct handling of situations like
// - $func = 'getNewName'; echo $func();
// but it will break variable variable names like
// - ${getNewName()}
$possible_pre_suffixes = array(
array(
"prefix" => "= '",
"suffix" => "'"
),
array(
"prefix" => '= "',
"suffix" => '"'
),
array(
"prefix" => "='",
"suffix" => "'"
),
array(
"prefix" => '="',
"suffix" => '"'
),
array(
"prefix" => 'rn "', // return " ";
"suffix" => '"'
),
array(
"prefix" => "rn '", // return ' ';
"suffix" => "'"
)
);
// replace variable names
for($i = 0; $i < count($variable_names_before); $i++)
{
$data = str_replace($variable_names_before[$i], '$' . $variable_names_after[$i], $data);
// try to find strings which equals variable names
// this is an attempt to handle situations like:
// $a = "123";
// $b = "a"; <--
// $$b = "321"; <--
// and also
// function getName() { return "a"; }
// echo ${getName()};
$name = substr($variable_names_before[$i], 1);
for($j = 0; $j < count($possible_pre_suffixes); $j++)
{
$data = str_replace($possible_pre_suffixes[$j]["prefix"] . $name . $possible_pre_suffixes[$j]["suffix"],
$possible_pre_suffixes[$j]["prefix"] . $variable_names_after[$i] . $possible_pre_suffixes[$j]["suffix"],
$data);
}
}
// replace function names
for($i = 0; $i < count($function_names_before); $i++)
{
$data = str_replace($function_names_before[$i], $function_names_after[$i], $data);
}
/**
* https://stackoverflow.com/a/31107425/4233593
* Generate a random string, using a cryptographically secure
* pseudorandom number generator (random_int)
*
* For PHP 7, random_int is a PHP core function
* For PHP 5.x, depends on https://github.com/paragonie/random_compat
*
* @param int $length How many characters do we want?
* @param string $keyspace A string of all possible characters
* to select from
* @return string
*/
function random_str($length, $keyspace = 'abcdefghijklmnopqrstuvwxyz')
{
$str = '';
$max = mb_strlen($keyspace, '8bit') - 1;
for ($i = 0; $i < $length; ++$i)
{
$str .= $keyspace[random_int(0, $max)];
}
return $str;
}
example input file:
$example = 'some $string';
$test = '$abc 123' . $example . '$hello here I "$am"';
if(isset($_POST['something'])){
echo $_POST['something'];
}
function exampleFunction($variable2){
echo $variable2;
}
exampleFunction($example);
$variable3 = array('example','another');
foreach($variable3 as $key => $var3val){
echo $var3val."somestring";
}
$test = "example";
$$test = 'hello';
exampleFunction($example);
exampleFunction($$test);
function getNewName()
{
return "test";
}
exampleFunction(${getNewName()});
output of my function:
$fesvffyn = 'some $string';
$zimskk = '$abc 123' . $fesvffyn . '$hello here I "$am"';
if(isset($_POST['something'])){
echo $_POST['something'];
}
function kainbtqpybl($yxjvlvmyfskwqcevo){
echo $yxjvlvmyfskwqcevo;
}
kainbtqpybl($fesvffyn);
$lmiphctfgjfdnonjpia = array('example','another');
foreach($lmiphctfgjfdnonjpia as $qypdfcpcla => $gwlpcpnvnhbvbyflr){
echo $gwlpcpnvnhbvbyflr."somestring";
}
$zimskk = "fesvffyn";
$$zimskk = 'hello';
kainbtqpybl($fesvffyn);
kainbtqpybl($$zimskk);
function tauevjkk()
{
return "zimskk";
}
kainbtqpybl(${tauevjkk()});
I know some cases remain where variable variable names cause problems; for those you may have to expand the $possible_pre_suffixes array.
You may also want to differentiate between global variables and "forbidden variables".
Well, you can try to write your own, but the number of strange things you have to handle is likely to overwhelm you, and I presume you are more interested in using such a tool than writing and maintaining one yourself. (There are lots of broken PHP obfuscators out there, where people have tried to do this.)
If you want one that is reliable, you do have to base it on a parser, or your tool will mis-parse the text and handle it wrong (this is the first "strange thing"). Regexes simply won't do the trick.
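To illustrate the point, here is a minimal sketch that compares a naive regex with PHP's built-in tokenizer (token_get_all). The tokenizer is not a full parser, and the sample source string is made up for this example, but it already shows the regex reporting a "variable" that is only text inside a string literal.

```php
<?php
// Hypothetical input: '$string' sits inside a single-quoted literal, so it is not a variable.
$src = '<?php $example = \'some $string\'; echo $example;';

// Naive regex: matches anything that looks like a variable, including text in literals.
preg_match_all('/\$[A-Za-z_]\w*/', $src, $m);
$byRegex = array_values(array_unique($m[0]));

// Tokenizer: only real variable tokens are tagged T_VARIABLE.
$byTokens = array();
foreach (token_get_all($src) as $token) {
    // Tokens are arrays of [id, text, line]; single characters such as ';' are plain strings.
    if (is_array($token) && $token[0] === T_VARIABLE) {
        $byTokens[] = $token[1];
    }
}
$byTokens = array_values(array_unique($byTokens));

print_r($byRegex);  // contains both $example and $string
print_r($byTokens); // contains only $example
```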
The Semantic Designs PHP Obfuscator (from my company), taken out of the box, took this slightly modified version of Elias Van Ootegem's example:
<?php
//non-obfuscated
function getVarname()
{//the return value has to change
return (('foobar'));
}
$format = '%s = %d';
$foobar = 123;
$variableVar = (('format'));//you need to change this string
printf($$variableVar, $variableVar = getVarname(), $$variableVar);
echo PHP_EOL;
var_dump($GLOBALS[(('foobar'))]);//note the key == the var
and produced this:
<?php function l0() { return (('O0')); } $l1="%\163 = %d"; $O1=0173; $l2=(('O2')); printf($$l2,$l2=l0(),$$l2); echo PHP_EOL; var_dump($GLOBALS[(('O0'))]);
The key issue in Elias's example is strings that actually contain variable names. In general, there is no way for a tool to know that "x" is a variable name and not just a string containing the letter x. But the programmers know. We insist that such strings be marked [by enclosing them in ((..))], and then the obfuscator can obfuscate their content properly.
Sometimes the string contains variable names and other things; in that case, the programmer has to break up the string into "variable name" content and everything else. This is pretty easy to do in practice, and is the "slight change" I made to his supplied code.
Other strings, not being marked, are left alone. You only have to do this once to the source file. [You can say this is cheating, but no other practical answer will work; the tool cannot know reliably. Halting Problem, if you insist.]
The next thing to get right is reliable obfuscation across multiple files. You can't do this one file at a time. This obfuscator has been used on very big PHP applications (thousands of PHP script files).
Yes, it does use a full PHP parser. Not nikic's.
I ended up with this simple code:
$tokens = token_get_all($src);
$skip = array('$this','$_GET','$_POST','$_REQUEST','$_SERVER','$_COOKIE','$_SESSION');
function renameVars($tokens,$content,$skip){
$vars = array();
foreach($tokens as $token) {
if ($token[0] == T_VARIABLE && !in_array($token[1],$skip))
$vars[generateRandomString()]= $token[1];
}
$vars = array_unique($vars);
$vars2 = $vars;
foreach($vars as $new => $old){
foreach($vars2 as $var){
if($old!=$var && strpos($var,$old)!==false){
continue 2;
}
}
$content = str_replace($old,'${"'.$new.'"}',$content);
//function(${"example"}) will trigger error. This is why we need this:
$content = str_replace('(${"'.$new.'"}','($'.$new,$content);
$content = str_replace(',${"'.$new.'"}',',$'.$new,$content);
$chars = array('a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z');
//for things like function deleteExpired(Varien_Event_Observer $fz5eDWIt1si), Exception,
foreach($chars as $char){
$content = str_replace($char.' ${"'.$new.'"}',$char.' $'.$new,$content);
}
}
return $content;
}
It works for me because the code is simple. I guess it won't work in all scenarios.
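The snippet above calls generateRandomString() without showing it. A minimal stand-in might look like the sketch below; the body, the default length, and the lowercase-only alphabet are assumptions, not the original author's code. It uses random_int, which needs PHP 7 or the random_compat polyfill.

```php
<?php
// Hypothetical helper: the answer does not show its own implementation.
// Lowercase letters only, so '$' . generateRandomString() is always a valid variable name.
function generateRandomString($length = 10)
{
    $letters = 'abcdefghijklmnopqrstuvwxyz';
    $name = '';
    for ($i = 0; $i < $length; $i++) {
        $name .= $letters[random_int(0, strlen($letters) - 1)];
    }
    return $name;
}

echo generateRandomString(12), "\n";
```

Random names can collide; a real tool would probably keep a list of names already handed out and retry on a duplicate.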
I have it working now, but there may still be some vulnerabilities because PHP allows function names and variable names to be generated dynamically.
The first function replaces $_SESSION, $_POST etc. with functions:
function replaceArrayVariable($str, $arr, $function)
{
$str = str_replace($arr, $function, $str);
$lastPos = 0;
while (($lastPos = strpos($str, $function, $lastPos)) !== false)
{
$lastPos = $lastPos + strlen($function);
$currentPos = $lastPos;
$openSqrBrackets = 1;
while ($openSqrBrackets > 0)
{
if ($str[$currentPos] === '[')
$openSqrBrackets++;
elseif ($str[$currentPos] === ']')
$openSqrBrackets--;
$currentPos++;
}
$str[$currentPos - 1] = ')';
}
return $str;
}
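The rewritten source then calls getPost() and getSession(), which this answer does not define. A minimal sketch of what such wrappers could look like follows; the bodies are an assumption and only cover reads, so an assignment like $_SESSION['x'] = 1 would still need separate handling.

```php
<?php
// Hypothetical wrappers: the names come from the answer, the bodies are assumed.
function getPost($key)
{
    return isset($_POST[$key]) ? $_POST[$key] : null;
}

function getSession($key)
{
    return isset($_SESSION[$key]) ? $_SESSION[$key] : null;
}

// After rewriting, $_POST['something'] in the source reads getPost('something').
$_POST['something'] = 'hello';
echo getPost('something'), "\n";
```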
The second renames functions ignoring whitelisted keywords:
function renameFunctions($str)
{
preg_match_all('/[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*/', $str, $matches, PREG_OFFSET_CAPTURE);
$totalMatches = count($matches[0]);
$offset = 0;
for ($i = 0; $i < $totalMatches; $i++)
{
$matchIndex = $matches[0][$i][1] + $offset;
if ($matchIndex === 0 || $str[$matchIndex - 1] !== '$')
{
$keyword = $matches[0][$i][0];
if ($keyword !== 'true' && $keyword !== 'false' && $keyword !== 'if' && $keyword !== 'else' && $keyword !== 'getPost' && $keyword !== 'getSession')
{
$str = substr_replace($str, 'qq', $matchIndex, 0);
$offset += 2;
}
}
}
return $str;
}
Then to rename functions, variables, and non-whitelisted keywords, I use this code:
$str = replaceArrayVariable($str, '$_POST[', 'getPost(');
$str = replaceArrayVariable($str, '$_SESSION[', 'getSession(');
preg_match_all('/\'(?:\\\\.|[^\\\\\'])*\'|.[^\']+/', $str, $matches);
$str = '';
foreach ($matches[0] as $match)
{
if ($match[0] != "'")
{
$match = preg_replace('!\s+!', ' ', $match);
$match = renameFunctions($match);
$match = str_replace('$', '$qq', $match);
}
$str .= $match;
}

match the first & last whole word in a variable

I use PHP's preg_match to check that a string starts and ends with specific given words, for example:
$first_word = 't'; // I want to force 'this'
$last_word = 'ne'; // I want to force 'done'
$str = 'this function can be done';
if(preg_match('/^' . $first_word . '(.*)' . $last_word .'$/' , $str))
{
echo 'true';
}
But the problem is that I want to match whole words at the start and end, not just the first or last characters.
Use \b as a word boundary in the pattern:
$first_word = 't'; // I want to force 'this'
$last_word = 'ne'; // I want to force 'done'
$str = 'this function can be done';
if(preg_match('/^' . $first_word . '\b(.*)\b' . $last_word .'$/' , $str))
{
echo 'true';
}
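The same idea can be wrapped in a small function, sketched here. The function name is made up for illustration, and preg_quote is added so that regex metacharacters in the given words are treated literally.

```php
<?php
// Hypothetical helper name; preg_quote() escapes regex metacharacters in the words.
function startsAndEndsWithWords($first, $last, $str)
{
    // \b after the first word and before the last word forces whole-word matches.
    $pattern = '/^' . preg_quote($first, '/') . '\b.*\b' . preg_quote($last, '/') . '$/s';
    return preg_match($pattern, $str) === 1;
}

var_dump(startsAndEndsWithWords('this', 'done', 'this function can be done')); // bool(true)
var_dump(startsAndEndsWithWords('t', 'ne', 'this function can be done'));      // bool(false)
```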
I would go about this in a slightly different way:
$firstword = 't';
$lastword = 'ne';
$string = 'this function can be done';
$words = explode(' ', $string);
if (preg_match("/^{$firstword}/i", reset($words)) && preg_match("/{$lastword}$/i", end($words)))
{
echo 'true';
}
==========================================
Here's another way to achieve the same thing
$firstword = 'this';
$lastword = 'done';
$string = 'this can be done';
$words = explode(' ', $string);
if (reset($words) === $firstword && end($words) === $lastword)
{
echo 'true';
}
This example always echoes true because $firstword and $lastword match the string; change either one and it will no longer echo true.
I wrote a function that finds the words starting with a given prefix, without using any regex.
You can write one for the end of a word in the same way; I have not included it here because it is long.
<?php
function StartSearch($start, $sentence)
{
$data = explode(" ", $sentence);
$flag = false;
$ret = array();
foreach ($data as $val)
{
for ($i = 0, $j = 0; $i < strlen($val) && $j < strlen($start); $i++)
{
// [] replaces the {} string-offset syntax, which PHP 8 removed
if ($i == 0 && $val[$i] != $start[$j])
break;
if ($flag && $val[$i] != $start[$j])
break;
if ($val[$i] == $start[$j])
{
$flag = true;
$j++;
}
}
if ($j == strlen($start))
{
$ret[] = $val;
}
}
return $ret;
}
$str = 'this function can be done'; // the sample string from the question
print_r(StartSearch("th", $str));
?>
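For the end of a word, a much shorter version is possible with substr. This is a sketch rather than the author's omitted function; the name EndSearch is made up to mirror StartSearch.

```php
<?php
// Hypothetical counterpart to StartSearch(): returns the words that end with $end.
function EndSearch($end, $sentence)
{
    $ret = array();
    $len = strlen($end);
    foreach (explode(" ", $sentence) as $val) {
        // substr() with a negative offset returns the last $len characters of the word.
        if ($len > 0 && strlen($val) >= $len && substr($val, -$len) === $end) {
            $ret[] = $val;
        }
    }
    return $ret;
}

print_r(EndSearch("ne", "this function can be done")); // only "done" ends with "ne"
```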
