Is it possible to write a regex which would take input like 'sqrt(2 * (2+2)) + sin(pi/6)' and transform it into '\sqrt{2 \cdot (2+2)} + \sin(\pi/6)'?
The problem is the 'sqrt' and parentheses in it. It is obvious I can't simply use something like this:
/sqrt\((.?)\)/ -> \\sqrt{$1}
because this code would create something like this '\sqrt{2 \cdot (2+2)) + \sin(\pi/6}'.
My solution: it simply go throw the string converted to char array and tests if a current substring starts with $latex, if it does second for-cycle go from this point in different direction and by parentheses decides where the function starts and ends. (startsWith function)
Code:
public static function formatFunction($function, $latex, $input) {
$input = preg_replace("/" . $function . "\(/", $latex . "{", $input);
$arr = str_split($input);
$inGap = false;
$gap = 0;
for ($i = count($arr) - 1; $i >= 0; $i--) {
if (startsWith(substr($input, $i), $latex)) {
for ($x = $i; $x < count($arr); $x++) {
if ($arr[$x] == "(" || $arr[$x] == "{") { $gap++; $inGap = true; }
else if ($arr[$x] == ")" || $arr[$x] == "}") { $gap--; }
if ($inGap && $gap == 0) {
$arr[$x] = "}";
$inGap = false;
break;
}
}
}
$gap = 0;
}
return implode($arr);
}
Use:
self::formatFunction("sqrt", "\\sqrt",
"sqrt(25 + sqrt(16 - sqrt(49)) + (7 + 1)) + sin(pi/2)");
Output:
\sqrt{25+\sqrt{16-\sqrt{49}}+(7+1)}+\sin (\pi/2)
Note: sin and pi aren't formated by this code, it's only str_replace function...
In general, no regular expression can effectively handle nested parentheses. Sorry to be the bearer of bad news! The MathJAX parser library can interpret LaTeX equations and you could probably add a custom output routine to do what you want.
For TeX questions, you can also try http://tex.stackexchange.com .
Some time ago i soved a similar problem in such way. Maybe it will be helpful for you
$str = 'sqrt((2 * (2+2)) + sin(pi/(6+7)))';
$from = []; // parentheses content
$to = []; // patterns for replace #<number>
$brackets = [['(', ')'], ['{', '}'], ['[', ']']]; // new parentheses for every level
$level = 0;
$count = 1; // count or replace made
while($count) {
$str = preg_replace_callback('(\(([^()]+)\))',
function ($m) use (&$to, &$from, $brackets, $level) {
array_unshift($to, $brackets[$level][0] . $m[1] . $brackets[$level][1]);
$i = '#' . (count($to)-1); // pattern for future replace.
// here it '#1', '#2'.
// Make it so they will be unique
array_unshift($from, $i);
return $i; }, $str, -1, $count);
$level++;
}
echo str_replace($from, $to, $str); // return content back
// sqrt[{2 * (2+2)} + sin{pi/(6+7)}]
I forgot all details, but it, seems, works
Related
I have the following code:
for($a=1; $a<strlen($string); $a++){
for($b=1; $a+$b<strlen($string); $b++){
for($c=1; $a+$b+$c<strlen($string); $c++){
for($d=1; $a+$b+$c+$d<strlen($string); $d++){
$tempString = substr_replace($string, ".", $a, 0);
$tempString = substr_replace($tempString, ".", $a+$b+1, 0);
$tempString = substr_replace($tempString, ".", $a+$b+$c+2, 0);
$tempString = substr_replace($tempString, ".", $a+$b+$c+$d+3, 0);
echo $tempString."</br>";
}
}
}
}
What it does is to make all possible combinatons of a string with several dots.
Example:
t.est123
te.st123
tes.t123
...
test12.3
Then, I add one more dot:
t.e.st123
t.es.t123
...
test1.2.3
Doing the way I'm doing now, I need to create lots and lots of for loops, each for a determined number of dots. I don't know how I can turn that example into a functon or other easier way of doing this.
Your problem is a combination problem. Note: I'm not a math freak, I only researched this information because of interest.
http://en.wikipedia.org/wiki/Combination#Number_of_k-combinations
Also known as n choose k. The Binomial coefficient is a function which gives you the number of combinations.
A function I found here: Calculate value of n choose k
function choose($n, $k) {
if ($k == 0) {return 1;}
return($n * choose($n - 1, $k - 1)) / $k;
}
// 6 positions between characters (test123), 4 dots
echo choose(6, 4); // 15 combinations
To get all combinations you also have to choose between different algorithms.
Good post: https://stackoverflow.com/a/127856/1948627
UPDATE:
I found a site with an algorithm in different programming languages. (But not PHP)
I've converted it to PHP:
function bitprint($u){
$s= [];
for($n= 0;$u > 0;++$n, $u>>= 1) {
if(($u & 1) > 0) $s[] = $n;
}
return $s;
}
function bitcount($u){
for($n= 0;$u > 0;++$n, $u&= ($u - 1));
return $n;
}
function comb($c, $n){
$s= [];
for($u= 0;$u < 1 << $n;$u++) {
if(bitcount($u) == $c) $s[] = bitprint($u);
}
return $s;
}
echo '<pre>';
print_r(comb(4, 6));
It outputs an array with all combinations (positions between the chars).
The next step is to replace the string with the dots:
$string = 'test123';
$sign = '.';
$combs = comb(4, 6);
// get all combinations (Th3lmuu90)
/*
$combs = [];
for($i=0; $i<strlen($string); $i++){
$combs = array_merge($combs, comb($i, strlen($string)-1));
}
*/
foreach ($combs as $comb) {
$a = $string;
for ($i = count($comb) - 1; $i >= 0; $i--) {
$a = substr_replace($a, $sign, $comb[$i] + 1, 0);
}
echo $a.'<br>';
}
// output:
t.e.s.t.123
t.e.s.t1.23
t.e.st.1.23
t.es.t.1.23
te.s.t.1.23
t.e.s.t12.3
t.e.st.12.3
t.es.t.12.3
te.s.t.12.3
t.e.st1.2.3
t.es.t1.2.3
te.s.t1.2.3
t.est.1.2.3
te.st.1.2.3
tes.t.1.2.3
This is quite an unusual question, but I can't help but try to wrap around what you are tying to do. My guess is that you want to see how many combinations of a string there are with a dot moving between characters, finally coming to rest right before the last character.
My understanding is you want a count and a printout of string similar to what you see here:
t.est
te.st
tes.t
t.es.t
te.s.t
t.e.s.t
count: 6
To facilitate this functionality I came up with a class, this way you could port it to other parts of code and it can handle multiple strings. The caveat here is the strings must be at least two characters and not contain a period. Here is the code for the class:
class DotCombos
{
public $combos;
private function combos($string)
{
$rebuilt = "";
$characters = str_split($string);
foreach($characters as $index => $char) {
if($index == 0 || $index == count($characters)) {
continue;
} else if(isset($characters[$index]) && $characters[$index] == ".") {
break;
} else {
$rebuilt = substr($string, 0, $index) . "." . substr($string, $index);
print("$rebuilt\n");
$this->combos++;
}
}
return $rebuilt;
}
public function allCombos($string)
{
if(strlen($string) < 2) {
return null;
}
$this->combos = 0;
for($i = 0; $i < count(str_split($string)) - 1; $i++) {
$string = $this->combos($string);
}
}
}
To make use of the class you would do this:
$combos = new DotCombos();
$combos->allCombos("test123");
print("Count: $combos->combos");
The output would be:
t.est123
te.st123
tes.t123
test.123
test1.23
test12.3
t.est12.3
te.st12.3
tes.t12.3
test.12.3
test1.2.3
t.est1.2.3
te.st1.2.3
tes.t1.2.3
test.1.2.3
t.est.1.2.3
te.st.1.2.3
tes.t.1.2.3
t.es.t.1.2.3
te.s.t.1.2.3
t.e.s.t.1.2.3
Count: 21
Hope that is what you are looking for (or at least helps)....
I have an interface where a user will enter the name of a company. It then compares what they typed to current entries in the database, and if something similar is found it presents them with options (in case they misspelled) or they can click a button which confirms what they typed is definitely new and unique.
The problem I am having is that it is not very accurate and often brings up dozens of "similar" matches that aren't that similar at all!
Here is what I have now, the first large function I didn't make and I am not clear on what exactly it does. Is there are much simpler way to acheive what I want?
// Compares strings and determines how similar they are based on a nth letter split comparison.
function cmp_by_optionNumber($b, $a) {
if ($a["score"] == $b["score"]) return 0;
if ($a["score"] > $b["score"]) return 1;
return -1;
}
function string_compare($str_a, $str_b)
{
$length = strlen($str_a);
$length_b = strlen($str_b);
$i = 0;
$segmentcount = 0;
$segmentsinfo = array();
$segment = '';
while ($i < $length)
{
$char = substr($str_a, $i, 1);
if (strpos($str_b, $char) !== FALSE)
{
$segment = $segment.$char;
if (strpos($str_b, $segment) !== FALSE)
{
$segmentpos_a = $i - strlen($segment) + 1;
$segmentpos_b = strpos($str_b, $segment);
$positiondiff = abs($segmentpos_a - $segmentpos_b);
$posfactor = ($length - $positiondiff) / $length_b; // <-- ?
$lengthfactor = strlen($segment)/$length;
$segmentsinfo[$segmentcount] = array( 'segment' => $segment, 'score' => ($posfactor * $lengthfactor));
}
else
{
$segment = '';
$i--;
$segmentcount++;
}
}
else
{
$segment = '';
$segmentcount++;
}
$i++;
}
// PHP 5.3 lambda in array_map
$totalscore = array_sum(array_map(function($v) { return $v['score']; }, $segmentsinfo));
return $totalscore;
}
$q = $_POST['stringA'] ;
$qLengthMin = strlen($q) - 5 ; // Part of search calibration. Smaller number = stricter.
$qLengthMax = strlen($q) + 2 ; // not in use.
$main = array() ;
include("pdoconnect.php") ;
$result = $dbh->query("SELECT id, name FROM entity_details WHERE
name LIKE '{$q[0]}%'
AND CHAR_LENGTH(name) >= '$qLengthMin'
#LIMIT 50") ; // The first letter MUST be correct. This assumption makes checker faster and reduces irrelivant results.
$x = 0 ;
while($row = $result->fetch(PDO::FETCH_ASSOC)) {
$percent = string_compare(strtolower($q), strtolower(rawurldecode($row['name']))) ;
if($percent == 1) {
//echo 1 ;// 1 signifies an exact match on a company already in our DB.
echo $row['id'] ;
exit() ;
}
elseif($percent >= 0.6) { // Part of search calibration. Higher deci number = stricter.
$x++ ;
$main[$x]['name'] = rawurldecode($row['name']) ;
$main[$x]['score'] = round($percent, 2) * 100;
//array_push($overs, urldecode($row['name']) . " ($percent)<br />") ;
}
}
usort($main, "cmp_by_optionNumber") ;
$z = 0 ;
echo '<div style="overflow-y:scroll;height:175px;width:460px;">' ;
foreach($main as $c) {
if($c['score'] > 100) $c['score'] = 100 ;
if(count($main) > 1) {
echo '<div id="anysuggested' . $z . '" class="hoverdiv" onclick="selectAuto(' . "'score$z'" . ');">' ;
}
else echo '<div id="anysuggested' . $z . '" class="hoverdiv" style="color:#009444;" onclick="selectAuto(' . "'score$z'" . ');">' ;
echo '<span id="autoscore' . $z . '">' . $c['name'] . '</span></div>' ;
$z++ ;
}
echo '</div>' ;
Comparing strings is a huge topic and there are many ways to do it. One very common algorithm is called the Levenshtein difference. This is a native implementation in PHP but none in MySQL. There is however an implementation here that you could use.
You need aproximate/fuzzy string matching.
Read more about
http://php.net/manual/en/function.levenshtein.php, http://www.slideshare.net/kyleburton/fuzzy-string-matching
The best way would be to use some index based search engine like SOLR http://lucene.apache.org/solr/.
I'm trying to extract OUTERMOST special HTML-like tags from a given string. Here's a sample string:
sample string with <::Class id="some id\" and more">text with possible other tags inside<::/Class> some more text
I need to find where in a string a <::Tag starts and where it ends. The problem is it might contain nested tags inside. Is there a simple loop-like algorithm to find the FIRST ocurrence of the <::Tag and the length of the string until the matching <::/Tag>? I've tried a different way, using a simple HTML tag instead and using DomDocument, but it cannot tell me the position of the tag in a string. I cannot use external libraries, i'm just looking for pointers as to how this could be solved. Maybe you've seen an algorithm that does exactly that - i'd like to have a look at it.
Thanks for the help.
P.S. regex solutions will not work since there are nested tags. Recursive regex solutions will not work as well. I'm just looking for a very simple parsing algorighm for this specific case.
What you're talking about here is making a template. Regex for parsing templates is very slow. Instead, your template-reading/processing engine should be doing a string parse. It's not super-easy, but it's also not terribly hard. Still, my advice is use another template library instead of reinventing the wheel.
There's an open-source template engine in PHPBB that you could utilize or learn from. Or, use something like Smarty. If performance is a major deal, have a look at Blitz.
strpos + strrpos (ouch...)
$str = 'sample string with <::Class id="some id" and more">text with possible <::Strong>other<::/Strong> tags inside<::/Class> some more text';
$tag = '<::';
$first = strpos($str, $tag);
$last = strrpos($str, $tag);
$rtn = array();
$cnt = 0;
while ($first<$last)
{
if (!$cnt)
{
$rtn[] = substr($str, 0, $first);
}
++$cnt;
$next = strpos($str, $tag, $first+1);
if ($next)
{
$pos = strpos($str, '>', $first);
$rtn[] = substr($str, $first, $pos-$first+1);
$rtn[] = substr($str, $pos+1, $next-$pos-1);
$first = $next;
}
}
With the $rtn, you can do whatever you want then ... this code is not perfect yet ...
array (
0 => 'sample string with ',
1 => '<::Class id="some id" and more">',
2 => 'text with possible ',
3 => '<::Strong>',
4 => 'other',
5 => '<::/Strong>',
6 => ' tags inside',
7 => '<::/Class> some more text',
)
So basically here's what i came up with. Something like ajreal's solution only not as clean ;] Not even sure if it works perfectly yet, initial testing was successful.
protected function findFirstControl()
{
$pos = strpos($this->mSource, '<::');
if ($pos === false)
return false;
// get the control name
$endOfName = false;
$controlName = '';
$len = strlen($this->mSource);
$i = $pos + 3;
while (!$endOfName && $i < $len)
{
$char = $this->mSource[$i];
if (($char >= 'a' && $char <= 'z') || ($char >= 'A' && $char <= 'Z'))
$controlName .= $char;
else
$endOfName = true;
$i++;
}
if ($controlName == '')
return false;
$posOfEnd = strpos($this->mSource, '<::/' . $controlName, $i);
$posOfStart = strpos($this->mSource, '<::' . $controlName, $i);
if ($posOfEnd === false)
return false;
if ($posOfStart > $pos)
{
while ($posOfStart > $pos && $posOfEnd !== false && $posOfStart < $posOfEnd)
{
$i = $posOfStart + 1;
$n = $posOfEnd + 1;
$posOfStart = strpos($this->mSource, '<::' . $controlName, $i);
$posOfEnd = strpos($this->mSource, '<::/' . $controlName, $n);
}
}
if ($posOfEnd !== false)
{
$ln = $posOfEnd - $pos + strlen($controlName) + 5;
return array($pos, $ln, $controlName, substr($this->mSource, $pos, $ln));
}
else
return false;
}
Not an extendable solution, but it works.
$startPos = strpos($string, '<::Class');
$endPos = strrpos($string, '<::/Class>');
Note my use of strrpos to fix the nesting problem. Also, this will give you the start position of <::/Class>, not the end.
Why don't you just use regular XML and the DOM? Or just an existing template engine like Smarty?
i programmed this php function that takes any text/html string and trims it.
For example:
gen_string("Hello, how are you today?",10);
Returns:
Hello, how...
The problem arises when the function string limit is the same as the position of a special character such as: á, ñ, etc...
In which case:
gen_string("Helló my friend",5);
Returns: Hell�...
Any ideas on how to solve this issue? This is the current function:
# string: advanced substr
function gen_string($string,$min,$clean=false) {
$text = trim(strip_tags($string));
if(strlen($text)>$min) {
$blank = strpos($text,' ');
if($blank) {
# limit plus last word
$extra = strpos(substr($text,$min),' ');
$max = $min+$extra;
$r = substr($text,0,$max);
if(strlen($text)>=$max && !$clean) $r=trim($r,'.').'...';
} else {
# if there are no spaces
$r = substr($text,0,$min).'...';
}
} else {
# if original length is lower than limit
$r = $text;
}
return trim($r);
}
Thanks!
You should use the multibyte string functions to correctly handle unicode characters.
For example you could try using mb_strimwidth to truncate a string to a specified length.
You could also take a different approach and make use of the PCRE regex extension's UTF-8 capabilities (assuming your strings are UTF-8!).
function gen_string($string, $length)
{
$str = trim(strip_tags($string));
$strlen = strlen(utf8_decode($str));
// String is less than limit
if ($strlen <= $length) return $str;
// Shorten string, preserving whole "words" (non-whitespace)
preg_match('/^.{'.($length-1).'}\S*/su', $str, $match);
// Append ellipsis if needed (bytes length is OK to check)
if (strlen($match[0]) !== strlen($str)) $match[0] .= '...';
return $match[0];
}
Aside from the multibyte issue, maybe you can write it shorter
function gen_string($str, $limit) {
if ($str >= strlen($limit))
return $str;
$offset = -(strlen($str) - $limit);
return substr($str, 0, strrpos($str, ' ', $offset)).'...';
}
It will limit the length of the string, so rather than cut it after the first word beyond the limit, it ensures that the length is never larger than the limit.
strlen() cannot be used for UTF-8 string, because it would count also the continuation characters, which should not be counted.
You can try with the following code:
define('PREG_CLASS_UNICODE_WORD_BOUNDARY',
'\x{0}-\x{2F}\x{3A}-\x{40}\x{5B}-\x{60}\x{7B}-\x{A9}\x{AB}-\x{B1}\x{B4}' .
'\x{B6}-\x{B8}\x{BB}\x{BF}\x{D7}\x{F7}\x{2C2}-\x{2C5}\x{2D2}-\x{2DF}' .
'\x{2E5}-\x{2EB}\x{2ED}\x{2EF}-\x{2FF}\x{375}\x{37E}-\x{385}\x{387}\x{3F6}' .
'\x{482}\x{55A}-\x{55F}\x{589}-\x{58A}\x{5BE}\x{5C0}\x{5C3}\x{5C6}' .
'\x{5F3}-\x{60F}\x{61B}-\x{61F}\x{66A}-\x{66D}\x{6D4}\x{6DD}\x{6E9}' .
'\x{6FD}-\x{6FE}\x{700}-\x{70F}\x{7F6}-\x{7F9}\x{830}-\x{83E}' .
'\x{964}-\x{965}\x{970}\x{9F2}-\x{9F3}\x{9FA}-\x{9FB}\x{AF1}\x{B70}' .
'\x{BF3}-\x{BFA}\x{C7F}\x{CF1}-\x{CF2}\x{D79}\x{DF4}\x{E3F}\x{E4F}' .
'\x{E5A}-\x{E5B}\x{F01}-\x{F17}\x{F1A}-\x{F1F}\x{F34}\x{F36}\x{F38}' .
'\x{F3A}-\x{F3D}\x{F85}\x{FBE}-\x{FC5}\x{FC7}-\x{FD8}\x{104A}-\x{104F}' .
'\x{109E}-\x{109F}\x{10FB}\x{1360}-\x{1368}\x{1390}-\x{1399}\x{1400}' .
'\x{166D}-\x{166E}\x{1680}\x{169B}-\x{169C}\x{16EB}-\x{16ED}' .
'\x{1735}-\x{1736}\x{17B4}-\x{17B5}\x{17D4}-\x{17D6}\x{17D8}-\x{17DB}' .
'\x{1800}-\x{180A}\x{180E}\x{1940}-\x{1945}\x{19DE}-\x{19FF}' .
'\x{1A1E}-\x{1A1F}\x{1AA0}-\x{1AA6}\x{1AA8}-\x{1AAD}\x{1B5A}-\x{1B6A}' .
'\x{1B74}-\x{1B7C}\x{1C3B}-\x{1C3F}\x{1C7E}-\x{1C7F}\x{1CD3}\x{1FBD}' .
'\x{1FBF}-\x{1FC1}\x{1FCD}-\x{1FCF}\x{1FDD}-\x{1FDF}\x{1FED}-\x{1FEF}' .
'\x{1FFD}-\x{206F}\x{207A}-\x{207E}\x{208A}-\x{208E}\x{20A0}-\x{20B8}' .
'\x{2100}-\x{2101}\x{2103}-\x{2106}\x{2108}-\x{2109}\x{2114}' .
'\x{2116}-\x{2118}\x{211E}-\x{2123}\x{2125}\x{2127}\x{2129}\x{212E}' .
'\x{213A}-\x{213B}\x{2140}-\x{2144}\x{214A}-\x{214D}\x{214F}' .
'\x{2190}-\x{244A}\x{249C}-\x{24E9}\x{2500}-\x{2775}\x{2794}-\x{2B59}' .
'\x{2CE5}-\x{2CEA}\x{2CF9}-\x{2CFC}\x{2CFE}-\x{2CFF}\x{2E00}-\x{2E2E}' .
'\x{2E30}-\x{3004}\x{3008}-\x{3020}\x{3030}\x{3036}-\x{3037}' .
'\x{303D}-\x{303F}\x{309B}-\x{309C}\x{30A0}\x{30FB}\x{3190}-\x{3191}' .
'\x{3196}-\x{319F}\x{31C0}-\x{31E3}\x{3200}-\x{321E}\x{322A}-\x{3250}' .
'\x{3260}-\x{327F}\x{328A}-\x{32B0}\x{32C0}-\x{33FF}\x{4DC0}-\x{4DFF}' .
'\x{A490}-\x{A4C6}\x{A4FE}-\x{A4FF}\x{A60D}-\x{A60F}\x{A673}\x{A67E}' .
'\x{A6F2}-\x{A716}\x{A720}-\x{A721}\x{A789}-\x{A78A}\x{A828}-\x{A82B}' .
'\x{A836}-\x{A839}\x{A874}-\x{A877}\x{A8CE}-\x{A8CF}\x{A8F8}-\x{A8FA}' .
'\x{A92E}-\x{A92F}\x{A95F}\x{A9C1}-\x{A9CD}\x{A9DE}-\x{A9DF}' .
'\x{AA5C}-\x{AA5F}\x{AA77}-\x{AA79}\x{AADE}-\x{AADF}\x{ABEB}' .
'\x{D800}-\x{F8FF}\x{FB29}\x{FD3E}-\x{FD3F}\x{FDFC}-\x{FDFD}' .
'\x{FE10}-\x{FE19}\x{FE30}-\x{FE6B}\x{FEFF}-\x{FF0F}\x{FF1A}-\x{FF20}' .
'\x{FF3B}-\x{FF40}\x{FF5B}-\x{FF65}\x{FFE0}-\x{FFFD}');
function utf8_strlen($text) {
if (function_exists('mb_strlen')) {
return mb_strlen($text);
}
// Do not count UTF-8 continuation bytes.
return strlen(preg_replace("/[\x80-\xBF]/", '', $text));
}
function utf8_truncate($string, $max_length, $wordsafe = FALSE, $add_ellipsis = FALSE, $min_wordsafe_length = 1) {
$ellipsis = '';
$max_length = max($max_length, 0);
$min_wordsafe_length = max($min_wordsafe_length, 0);
if (utf8_strlen($string) <= $max_length) {
// No truncation needed, so don't add ellipsis, just return.
return $string;
}
if ($add_ellipsis) {
// Truncate ellipsis in case $max_length is small.
$ellipsis = utf8_substr('...', 0, $max_length);
$max_length -= utf8_strlen($ellipsis);
$max_length = max($max_length, 0);
}
if ($max_length <= $min_wordsafe_length) {
// Do not attempt word-safe if lengths are bad.
$wordsafe = FALSE;
}
if ($wordsafe) {
$matches = array();
// Find the last word boundary, if there is one within $min_wordsafe_length
// to $max_length characters. preg_match() is always greedy, so it will
// find the longest string possible.
$found = preg_match('/^(.{' . $min_wordsafe_length . ',' . $max_length . '})[' . PREG_CLASS_UNICODE_WORD_BOUNDARY . ']/u', $string, $matches);
if ($found) {
$string = $matches[1];
}
else {
$string = utf8_substr($string, 0, $max_length);
}
}
else {
$string = utf8_substr($string, 0, $max_length);
}
if ($add_ellipsis) {
$string .= $ellipsis;
}
return $string;
}
function utf8_substr($text, $start, $length = NULL) {
if (function_exists('mb_substr')) {
return $length === NULL ? mb_substr($text, $start) : mb_substr($text, $start, $length);
}
else {
$strlen = strlen($text);
// Find the starting byte offset.
$bytes = 0;
if ($start > 0) {
// Count all the continuation bytes from the start until we have found
// $start characters or the end of the string.
$bytes = -1;
$chars = -1;
while ($bytes < $strlen - 1 && $chars < $start) {
$bytes++;
$c = ord($text[$bytes]);
if ($c < 0x80 || $c >= 0xC0) {
$chars++;
}
}
}
elseif ($start < 0) {
// Count all the continuation bytes from the end until we have found
// abs($start) characters.
$start = abs($start);
$bytes = $strlen;
$chars = 0;
while ($bytes > 0 && $chars < $start) {
$bytes--;
$c = ord($text[$bytes]);
if ($c < 0x80 || $c >= 0xC0) {
$chars++;
}
}
}
$istart = $bytes;
// Find the ending byte offset.
if ($length === NULL) {
$iend = $strlen;
}
elseif ($length > 0) {
// Count all the continuation bytes from the starting index until we have
// found $length characters or reached the end of the string, then
// backtrace one byte.
$iend = $istart - 1;
$chars = -1;
$last_real = FALSE;
while ($iend < $strlen - 1 && $chars < $length) {
$iend++;
$c = ord($text[$iend]);
$last_real = FALSE;
if ($c < 0x80 || $c >= 0xC0) {
$chars++;
$last_real = TRUE;
}
}
// Backtrace one byte if the last character we found was a real character
// and we don't need it.
if ($last_real && $chars >= $length) {
$iend--;
}
}
elseif ($length < 0) {
// Count all the continuation bytes from the end until we have found
// abs($start) characters, then backtrace one byte.
$length = abs($length);
$iend = $strlen;
$chars = 0;
while ($iend > 0 && $chars < $length) {
$iend--;
$c = ord($text[$iend]);
if ($c < 0x80 || $c >= 0xC0) {
$chars++;
}
}
// Backtrace one byte if we are not at the beginning of the string.
if ($iend > 0) {
$iend--;
}
}
else {
// $length == 0, return an empty string.
return '';
}
return substr($text, $istart, max(0, $iend - $istart + 1));
}
}
For your return statement you could try:
return htmlspecialchars(trim($r));
EDIT: I tried your code as you provided it and it ran fine for me without having to use htmlspecialchars(). This is probably due to the face that in the <head> of the page the code was running on, the charset was set to UTF-8. So your options could be to set the encoding of the page like this:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
or to use htmlspecialchars() as above.
I need a way of taking an equation given as a string and finding it's mathematical answer, the big caveat is that I can't use eval().
I know the equation will only ever contain numbers, the four mathematical operators (i.e. * / + -) and parentheses, it may or may not have spaces in the string. Here's a couple of examples.
4 * 4
4+6/3
(3 / 2)*(4+8)
(4+8) * 2
I'm guessing that it's going to have to be done with some kind of regex?
Math expressions aren't regular. They're context-free.
Your best bet is to parse them using well-known math parsing algorithms like the shunting yard algorithm. All you have to worry about is implementing the algorithm in PHP. You might even be able to find PHP implementations of it online.
Just in case anybody's interested here is the algorithm I came up with in PHP for producing Reverse Polish Notation
function convertToRPN($equation)
{
$equation = str_replace(' ', '', $equation);
$tokens = token_get_all('<?php ' . $equation);
$operators = array('*' => 1, '/' => 1, '+' => 2, '-' => 2);
$rpn = '';
$stack = array();
$size = count($tokens);
for($i = 1; $i < $size; $i++) {
if(is_array($tokens[$i])) {
$rpn .= $tokens[$i][1] . ' ';
} else {
if(empty($stack) || $tokens[$i] == '(') {
$stack[] = $tokens[$i];
} else {
if($tokens[$i] == ')') {
while(end($stack) != '(') {
$rpn .= array_pop($stack);
}
array_pop($stack);
} else {
while(!empty($stack) && end($stack) != '(' && $operators[$tokens[$i]] >= $operators[end($stack)]) {
$rpn .= array_pop($stack);
}
$stack[] = $tokens[$i];
}
}
}
}
while(!empty($stack)) {
$rpn .= array_pop($stack);
}
return $rpn;
}