Combining array elements with " mark at both ends - php

I need to combine two or more array elements enclosed inside a double quotation mark...
Example:
- Before -
[
"Foo1",
"\"Foo2",
"Foo3",
"Foo4\"",
"Foo5"
]
-After -
[
"Foo1",
"\"Foo2 Foo3 Foo4\"",
"Foo5"
]

Below, $source and $result are the source and result respectively. I would use array_reduce()
$result = array_reduce($source, function($acc,$i) {
static $append = false; // static so value is remembered between iterations
if($append) $acc[count($acc)-1] .= " $i"; // Append to the last element
else $acc[] = $i; // Add a new element
// Set $append=true if item starts with ", $append=false if item ends with "
if(strpos($i,'"') === 0) $append = true;
elseif (strpos($i,'"') === strlen($i)-1) $append = false;
return $acc; // Value to be inserted as the $acc param in next iteration
}, []);
Live demo

function combine_inside_quotation($array)
{
if(!is_array($array) || 0 === ($array_lenght = count($array))) {
return [];
}
$i = 0;
$j = 0;
$output = [];
$inside_quote = false;
while($i < $array_lenght) {
if (false === $inside_quote && '"' === $array[$i][0]) {
$inside_quote = true;
$output[$j] = $array[$i];
}else if (true === $inside_quote && '"' === $array[$i][strlen($array[$i]) - 1]) {
$inside_quote = false;
$output[$j] .= ' ' . $array[$i];
$j++;
}else if (true === $inside_quote && '"' !== $array[$i][0] && '"' !== $array[$i][strlen($array[$i]) - 1]) {
$output[$j] .= ' ' . $array[$i];
} else {
$inside_quote = false;
$output[$j] = $array[$i];
$j++;
}
$i++;
}
return $output;
}
Try this function it gives the output you mentioned.
Live Demo

Related

PHP recursive function nesting level reached

Good day. I have a parcer function that taker an array of string like this:
['str','str2','str2','*str','*str2','**str2','str','str2','str2']
And recursivelly elevates a level those starting with asterix to get this
['str','str2','str2',['str','str2',['str2']],'str','str2','str2']
And the function is:
function recursive_array_parser($ARRAY) {
do {
$i = 0;
$s = null;
$e = null;
$err = false;
foreach ($ARRAY as $item) {
if (!is_array($item)) { //if element is not array
$item = trim($item);
if ($item[0] === '*' && $s == null && $e == null) { //we get it's start and end if it has asterix
$s = $i;
$e = $i;
} elseif ($item[0] === '*' && $e != null)
$e = $i;
elseif (!isset($ARRAY[$i + 1]) || $s != null && $e != null) { //if there are no elements or asterix element ended we elevate it
$e = $e == NULL ? $i : $e;
$head = array_slice($ARRAY, 0, $s);
$_x = [];
$inner = array_slice($ARRAY, $s, $e - $s + 1);
foreach ($inner as $_i)
$_x[] = substr($_i, 1);
$inner = [$_x];
$tail = array_slice($ARRAY, $e + 1, 999) or [];
$X = array_merge($head, $inner);
$ARRAY = array_merge($X, $tail);
$s = null;
$e = null;
$err = true;
}
} else {
$ARRAY[$i] = recursive_array_parser($ARRAY[$i]); //if the item is array of items we recur.
}
$i++;
if ($err == true) {
break 1;
}
}
} while ($err);
return $ARRAY;
}
When this function runs, i get "Fatal error: Maximum function nesting level of '200' reached, aborting!" error.
I know it has something to do with infinite recursion, but i can't track the particular place where it occurs, and this is strange.
I don't normally rewrite code, but your code can be reduced and simplified while, from what I can see, getting the desired result. See if this works for you:
$a = array('a','b','c','*d','*e','**f','g','*h');
print_r($a);
$a = recursive_array_parser($a);
print_r($a);
function recursive_array_parser($array)
{
$ret = array();
for($i=0; $i<sizeof($array); $i++)
{
if($array[$i]{0}!='*') $ret[] = $array[$i];
else
{
$tmp = array();
for($j=$i; $j<sizeof($array) && $array[$j]{0}=='*'; $j++)
{
$tmp[] = substr($array[$j],1);
}
$ret[] = recursive_array_parser($tmp);
$i = $j-1;
}
}
return $ret;
}
Note that it isn't possible for $array[$i] to be an array, so that check is removed. The recursion takes place on the temp array created when a * is found. The $i is closer tied to $array to reset it properly after parsing the series of * elements.
Here's my solution. No nested loops.
function recursive_array_parser($arr) {
$out = array();
$sub = null;
foreach($arr as $item) {
if($item[0] == '*') { // We've hit a special item!
if(!is_array($sub)) { // We're not currently accumulating a sub-array, let's make one!
$sub = array();
}
$sub[] = substr($item, 1); // Add it to the sub-array without the '*'
} else {
if(is_array($sub)) {
// Whoops, we have an active subarray, but this thing didn't start with '*'. End that sub-array
$out[] = recursive_array_parser($sub);
$sub = null;
}
// Take the item
$out[] = $item;
}
}
if(is_array($sub)) { // We ended in an active sub-array. Add it.
$out[] = recursive_array_parser($sub);
$sub = null;
}
return $out;
}

PHP4 (legacy servers) XML=>JSON converion, a la simplexml_file_load() or simpleXMLelement

I'm developing on a server using an old version of PHP (4.3.9), and I'm trying to convert an XML string into a JSON string. This is easy in PHP5, but a lot tougher with PHP4.
I've tried:
Zend JSON.php
require_once 'Zend/Json.php';
echo Zend_Json::encode($sxml);
Error:
PHP Parse error: parse error, unexpected T_CONST, expecting T_OLD_FUNCTION or T_FUNCTION or T_VAR or '}'
simplexml44
$impl = new IsterXmlSimpleXMLImpl;
$NDFDxmltemp = $impl->load_file($NDFDstring);
$NDFDxml = $NDFDxmltemp->asXML();
Error:
WARNING isterxmlexpatnonvalid->parse(): nothing to read
xml_parse
$xml_parser = xml_parser_create();
xml_set_element_handler($xml_parser, "_start_element", "_end_element");
xml_set_character_data_handler($xml_parser, "_character_data");
xml_parse($xml_parser, $NDFDstring);
Error:
PHP Warning: xml_parse(): Unable to call handler _character_data() in ...
PHP Warning: xml_parse(): Unable to call handler _end_element() in ...
Does anyone have any other alternatives to simplexml_file_load() and new simpleXMLelement in PHP4?
Upgrading PHP is not an option in this particular case, so do not bother bringing it up. Yes, I know its old.
NOTE: This is the XML I'm trying to parse into a multidimensional array OR json.
http://graphical.weather.gov/xml/sample_products/browser_interface/ndfdXMLclient.php?lat=40&lon=-120&product=time-series&begin=2013-10-30T00:00:00&end=2013-11-06T00:00:00&maxt=maxt&mint=mint&rh=rh&wx=wx&wspd=wspd&wdir=wdir&icons=icons&wgust=wgust&pop12=pop12&maxrh=maxrh&minrh=minrh&qpf=qpf&snow=snow&temp=temp&wwa=wwa
XML are quite easy to parse by yourself, anyway here is exists xml_parse in php 4 and domxml_open_file
and here is json for php4 and another one
according parsing xml:
if you know the structure of xml file, you can go even with RegExp, as XML is strict format (I mean all tags must be closed, all attributes in quotes, all special symbols always escaped)
if you parse arbitrary xml file, here is student sample, which works with php4, do not understand all XML features, but can give you "brute-force" like idea:
<?php
define("LWG_XML_ELEMENT_NULL", "0");
define("LWG_XML_ELEMENT_NODE", "1");
function EntitiesToString($str)
{
$s = $str;
$s = eregi_replace(""", "\"", $s);
$s = eregi_replace("<", "<", $s);
$s = eregi_replace(">", ">", $s);
$s = eregi_replace("&", "&", $s);
return $s;
}
class CLWG_dom_attribute
{
var $name;
var $value;
function CLWG_dom_attribute()
{
$name = "";
$value = "";
}
}
class CLWG_dom_node
{
var $m_Attributes;
var $m_Childs;
var $m_nAttributesCount;
var $m_nChildsCount;
var $type;
var $tagname;
var $content;
function CLWG_dom_node()
{
$this->m_Attributes = array();
$this->m_Childs = array();
$this->m_nAttributesCount = 0;
$this->m_nChildsCount = 0;
$this->type = LWG_XML_ELEMENT_NULL;
$this->tagname = "";
$this->content = "";
}
function get_attribute($attr_name)
{
//echo "<message>Get Attribute: ".$attr_name." ";
for ($i=0; $i<sizeof($this->m_Attributes); $i++)
if ($this->m_Attributes[$i]->name == $attr_name)
{
//echo $this->m_Attributes[$i]->value . "</message>\n";
return $this->m_Attributes[$i]->value;
}
//echo "[empty]</message>\n";
return "";
}
function get_content()
{
//echo "<message>Get Content: ".$this->content . "</message>\n";
return $this->content;
}
function attributes()
{
return $this->m_Attributes;
}
function child_nodes()
{
return $this->m_Childs;
}
function loadXML($str, &$i)
{
//echo "<debug>DEBUG: LoadXML (".$i.": ".$str[$i].")</debug>\n";
$str_len = strlen($str);
//echo "<debug>DEBUG: start searching for tag (".$i.": ".$str[$i].")</debug>\n";
while ( ($i<$str_len) && ($str[$i] != "<") )
$i++;
if ($i == $str_len) return FALSE;
$i++;
while ( ($i<strlen($str)) && ($str[$i] != " ") && ($str[$i] != "/") && ($str[$i] != ">") )
$this->tagname .= $str[$i++];
//echo "<debug>DEBUG: Tag: " . $this->tagname . "</debug>\n";
if ($i == $str_len) return FALSE;
switch ($str[$i])
{
case " ": // attributes comming
{
//echo "<debug>DEBUG: Tag: start searching attributes</debug>\n";
$i++;
$cnt = sizeof($this->m_Attributes);
while ( ($i<strlen($str)) && ($str[$i] != "/") && ($str[$i] != ">") )
{
$this->m_Attributes[$cnt] = new CLWG_dom_attribute;
while ( ($i<strlen($str)) && ($str[$i] != "=") )
$this->m_Attributes[$cnt]->name .= $str[$i++];
if ($i == $str_len) return FALSE;
$i++;
while ( ($i<strlen($str)) && ($str[$i] != "\"") )
$i++;
if ($i == $str_len) return FALSE;
$i++;
while ( ($i<strlen($str)) && ($str[$i] != "\"") )
$this->m_Attributes[$cnt]->value .= $str[$i++];
$this->m_Attributes[$cnt]->value = EntitiesToString($this->m_Attributes[$cnt]->value);
//echo "<debug>DEBUG: Tag: Attribute: '".$this->m_Attributes[$cnt]->name."' = '".$this->m_Attributes[$cnt]->value."'</debug>\n";
if ($i == $str_len) return FALSE;
$i++;
if ($i == $str_len) return FALSE;
while ( ($i<strlen($str)) && ($str[$i] == " ") )
$i++;
$cnt++;
}
if ($i == $str_len) return FALSE;
switch ($str[$i])
{
case "/":
{
//echo "<debug>DEBUG: self closing tag with attributes (".$this->tagname.")</debug>\n";
$i++;
if ($i == $str_len) return FALSE;
if ($str[$i] != ">") return FALSE;
$i++;
return TRUE;
break;
}
case ">";
{
//echo "<debug>DEBUG: end of attributes (".$this->tagname.")</debug>\n";
$i++;
break;
}
}
break;
}
case "/": // self closing tag
{
//echo "<debug>DEBUG: self closing tag (".$this->tagname.")</debug>\n";
$i++;
if ($i == $str_len) return FALSE;
if ($str[$i] != ">") return FALSE;
$i++;
return TRUE;
break;
}
case ">": // end of begin of node
{
//echo "<debug>DEBUG: end of begin of node</debug>\n";
$i++;
break;
}
}
if ($i == $str_len) return FALSE;
$b = 1;
while ( ($i<$str_len) && ($b) )
{
//echo "<debug>DEBUG: searching for content</debug>\n";
while ( ($i<strlen($str)) && ($str[$i] != "<") )
$this->content .= $str[$i++];
//echo "<debug>DEBUG: content: ".$this->content."</debug>\n";
if ($i == $str_len) return FALSE;
$i++;
if ($i == $str_len) return FALSE;
if ($str[$i] != "/") // new child
{
$cnt = sizeof($this->m_Childs);
//echo "<debug>DEBUG: Create new child (" . $cnt . ")</debug>\n";
$this->m_Childs[$cnt] = new CLWG_dom_node;
$this->m_Childs[$cnt]->type = LWG_XML_ELEMENT_NODE;
$i--;
if ($this->m_Childs[$cnt]->loadXML($str, $i) === FALSE)
return FALSE;
}
else
$b = 0;
}
$i++;
$close_tag = "";
while ( ($i<strlen($str)) && ($str[$i] != ">") )
$close_tag .= $str[$i++];
//echo "<debug>DEBUG: close tag: ".$close_tag." - ".$this->tagname."</debug>\n";
if ($i == $str_len) return FALSE;
$i++;
$this->content = EntitiesToString($this->content);
//echo "<debug>DEBUG: content: ".$this->content."</debug>\n";
return ($close_tag == $this->tagname);
}
}
class CLWG_dom_xml
{
var $m_Root;
function CLWG_dom_xml()
{
$this->m_Root = 0;
}
function document_element()
{
return $this->m_Root;
}
function loadXML($xml_string)
{
// check xml tag
if (eregi("<\\?xml", $xml_string))
{
// check xml version
$xml_version = array();
if ( (eregi("<\\?xml version=\"([0-9\\.]+)\".*\\?>", $xml_string, $xml_version)) && ($xml_version[1] == 1.0) )
{
// initialize root
$this->m_Root = new CLWG_dom_node;
$i = 0;
return $this->m_Root->loadXML(eregi_replace("<\\?xml.*\\?>", "", $xml_string), $i);
}
else
{
echo "<error>Cannot find version attribute in xml tag</error>";
return FALSE;
}
}
else
{
echo "<error>Cannot find xml tag</error>";
return FALSE;
}
}
}
function lwg_domxml_open_mem($xml_string)
{
global $lwg_xml;
$lwg_xml = new CLWG_dom_xml;
if ($lwg_xml->loadXML($xml_string))
return $lwg_xml;
else
return 0;
}
?>
PHP4 has no support for ArrayAccess nor does it have the needed __get(), __set() and __toString() magic methods which effectively prevents you from creating an object mimicking that feature of SimpleXMLElement.
Also I'd say creating an in-memory structure yourself which is done by Simplexml is not going to work out well with PHP 4 because of it's limitation of the OOP-object-model and garbage collection. Especially as I would prefer something like the Flyweight pattern here.
Which brings me to the point that you're probably more looking for an event or pull-based XML parser.
Take a look at the XML Parser Functions which are the PHP 4 way to parse XML with PHP. You find it well explained in years old PHP training materials also with examples and what not.

php regular comment remove

how remove multi line comment ? ( /* comment php */) from file .php
how remove single comment ? ( // coment ) from file.php
how remove enter end line ?
sample :
single comment line
$G["url"] = "http://".$_SERVER["HTTP_HOST"]
multi comment line
/*
* 310
* - "::"
* - $col
*/
Try using the code posted below, it even ignores comment tokens on strings like " /* comment2 */ "
<?php
$fileIn = "src.php";
$fileOut = "srcCompressed.php";
$text = file_get_contents($fileIn);
$text = compress_php_src($text);
file_put_contents($fileOut, $text);
function compress_php_src($src) {
// Whitespaces left and right from this signs can be ignored
static $IW = array(
T_CONCAT_EQUAL, // .=
T_DOUBLE_ARROW, // =>
T_BOOLEAN_AND, // &&
T_BOOLEAN_OR, // ||
T_IS_EQUAL, // ==
T_IS_NOT_EQUAL, // != or <>
T_IS_SMALLER_OR_EQUAL, // <=
T_IS_GREATER_OR_EQUAL, // >=
T_INC, // ++
T_DEC, // --
T_PLUS_EQUAL, // +=
T_MINUS_EQUAL, // -=
T_MUL_EQUAL, // *=
T_DIV_EQUAL, // /=
T_IS_IDENTICAL, // ===
T_IS_NOT_IDENTICAL, // !==
T_DOUBLE_COLON, // ::
T_PAAMAYIM_NEKUDOTAYIM, // ::
T_OBJECT_OPERATOR, // ->
T_DOLLAR_OPEN_CURLY_BRACES, // ${
T_AND_EQUAL, // &=
T_MOD_EQUAL, // %=
T_XOR_EQUAL, // ^=
T_OR_EQUAL, // |=
T_SL, // <<
T_SR, // >>
T_SL_EQUAL, // <<=
T_SR_EQUAL, // >>=
);
if(is_file($src)) {
if(!$src = file_get_contents($src)) {
return false;
}
}
$tokens = token_get_all($src);
$new = "";
$c = sizeof($tokens);
$iw = false; // ignore whitespace
$ih = false; // in HEREDOC
$ls = ""; // last sign
$ot = null; // open tag
for($i = 0; $i < $c; $i++) {
$token = $tokens[$i];
if(is_array($token)) {
list($tn, $ts) = $token; // tokens: number, string, line
$tname = token_name($tn);
if($tn == T_INLINE_HTML) {
$new .= $ts;
$iw = false;
} else {
if($tn == T_OPEN_TAG) {
if(strpos($ts, " ") || strpos($ts, "\n") || strpos($ts, "\t") || strpos($ts, "\r")) {
$ts = rtrim($ts);
}
$ts .= " ";
$new .= $ts;
$ot = T_OPEN_TAG;
$iw = true;
} elseif($tn == T_OPEN_TAG_WITH_ECHO) {
$new .= $ts;
$ot = T_OPEN_TAG_WITH_ECHO;
$iw = true;
} elseif($tn == T_CLOSE_TAG) {
if($ot == T_OPEN_TAG_WITH_ECHO) {
$new = rtrim($new, "; ");
} else {
$ts = " ".$ts;
}
$new .= $ts;
$ot = null;
$iw = false;
} elseif(in_array($tn, $IW)) {
$new .= $ts;
$iw = true;
} elseif($tn == T_CONSTANT_ENCAPSED_STRING
|| $tn == T_ENCAPSED_AND_WHITESPACE)
{
if($ts[0] == '"') {
$ts = addcslashes($ts, "\n\t\r");
}
$new .= $ts;
$iw = true;
} elseif($tn == T_WHITESPACE) {
$nt = #$tokens[$i+1];
if(!$iw && (!is_string($nt) || $nt == '$') && !in_array($nt[0], $IW)) {
$new .= " ";
}
$iw = false;
} elseif($tn == T_START_HEREDOC) {
$new .= "<<<S\n";
$iw = false;
$ih = true; // in HEREDOC
} elseif($tn == T_END_HEREDOC) {
$new .= "S;";
$iw = true;
$ih = false; // in HEREDOC
for($j = $i+1; $j < $c; $j++) {
if(is_string($tokens[$j]) && $tokens[$j] == ";") {
$i = $j;
break;
} else if($tokens[$j][0] == T_CLOSE_TAG) {
break;
}
}
} elseif($tn == T_COMMENT || $tn == T_DOC_COMMENT) {
$iw = true;
} else {
if(!$ih) {
$ts = strtolower($ts);
}
$new .= $ts;
$iw = false;
}
}
$ls = "";
} else {
if(($token != ";" && $token != ":") || $ls != $token) {
$new .= $token;
$ls = $token;
}
$iw = true;
}
}
return $new;
}
?>
Credits to gelamu function compress_php_src().
There's no reason to remove comments from your code. The overhead from processing them is so incredibly minimal that it doesn't justify the effort.
I think you are probably looking for php_strip_whitespace().

Cutting text without destroying html tags

Is there a way to do this without writing my own function?
For example:
$text = 'Test <span><a>something</a> something else</span>.';
$text = cutText($text, 2, null, 20, true);
//result: Test <span><a>something</a></span>
I need to make this function indestructible
My problem is similar to
This thread
but I need a better solution. I would like to keep nested tags untouched.
So far my algorithm is:
function cutText($content, $max_words, $max_chars, $max_word_len, $html = false) {
$len = strlen($content);
$res = '';
$word_count = 0;
$word_started = false;
$current_word = '';
$current_word_len = 0;
if ($max_chars == null) {
$max_chars = $len;
}
$inHtml = false;
$openedTags = array();
for ($i = 0; $i<$max_chars;$i++) {
if ($content[$i] == '<' && $html) {
$inHtml = true;
}
if ($inHtml) {
$max_chars++;
}
if ($html && !$inHtml) {
if ($content[$i] != ' ' && !$word_started) {
$word_started = true;
$word_count++;
}
$current_word .= $content[$i];
$current_word_len++;
if ($current_word_len == $max_word_len) {
$current_word .= '- ';
}
if (($content[$i] == ' ') && $word_started) {
$word_started = false;
$res .= $current_word;
$current_word = '';
$current_word_len = 0;
if ($word_count == $max_words) {
return $res;
}
}
}
if ($content[$i] == '<' && $html) {
$inHtml = true;
}
}
return $res;
}
But of course it won't work. I thought about remembering opened tags and closing them if they were not closed but maybe there is a better way?
This works perfectly for me:
function trimContent ($str, $trimAtIndex) {
$beginTags = array();
$endTags = array();
for($i = 0; $i < strlen($str); $i++) {
if( $str[$i] == '<' )
$beginTags[] = $i;
else if($str[$i] == '>')
$endTags[] = $i;
}
foreach($beginTags as $k=>$index) {
// Trying to trim in between tags. Trim after the last tag
if( ( $trimAtIndex >= $index ) && ($trimAtIndex <= $endTags[$k]) ) {
$trimAtIndex = $endTags[$k];
}
}
return substr($str, 0, $trimAtIndex);
}
Try something like this
function cutText($inputText, $start, $length) {
$temp = $inputText;
$res = array();
while (strpos($temp, '>')) {
$ts = strpos($temp, '<');
$te = strpos($temp, '>');
if ($ts > 0) $res[] = substr($temp, 0, $ts);
$res[] = substr($temp, $ts, $te - $ts + 1);
$temp = substr($temp, $te + 1, strlen($temp) - $te);
}
if ($temp != '') $res[] = $temp;
$pointer = 0;
$end = $start + $length - 1;
foreach ($res as &$part) {
if (substr($part, 0, 1) != '<') {
$l = strlen($part);
$p1 = $pointer;
$p2 = $pointer + $l - 1;
$partx = "";
if ($start <= $p1 && $end >= $p2) $partx = "";
else {
if ($start > $p1 && $start <= $p2) $partx .= substr($part, 0, $start-$pointer);
if ($end >= $p1 && $end < $p2) $partx .= substr($part, $end-$pointer+1, $l-$end+$pointer);
if ($partx == "") $partx = $part;
}
$part = $partx;
$pointer += $l;
}
}
return join('', $res);
}
Parameters:
$inputText - input text
$start - position of first character
$length - how menu characters we want to remove
Example #1 - Removing first 3 characters
$text = 'Test <span><a>something</a> something else</span>.';
$text = cutText($text, 0, 3);
var_dump($text);
Output (removed "Tes")
string(47) "t <span><a>something</a> something else</span>."
Removing first 10 characters
$text = cutText($text, 0, 10);
Output (removed "Test somet")
string(40) "<span><a>hing</a> something else</span>."
Example 2 - Removing inner characters - "es" from "Test "
$text = cutText($text, 1, 2);
Output
string(48) "Tt <span><a>something</a> something else</span>."
Removing "thing something el"
$text = cutText($text, 9, 18);
Output
string(32) "Test <span><a>some</a>se</span>."
Hope this helps.
Well, maybe this is not the best solution but it's everything I can do at the moment.
Ok I solved this thing.
I divided this in 2 parts.
First cutting text without destroying html:
function cutHtml($content, $max_words, $max_chars, $max_word_len) {
$len = strlen($content);
$res = '';
$word_count = 0;
$word_started = false;
$current_word = '';
$current_word_len = 0;
if ($max_chars == null) {
$max_chars = $len;
}
$inHtml = false;
$openedTags = array();
$i = 0;
while ($i < $max_chars) {
//skip any html tags
if ($content[$i] == '<') {
$inHtml = true;
while (true) {
$res .= $content[$i];
$i++;
while($content[$i] == ' ') { $res .= $content[$i]; $i++; }
//skip any values
if ($content[$i] == "'") {
$res .= $content[$i];
$i++;
while(!($content[$i] == "'" && $content[$i-1] != "\\")) {
$res .= $content[$i];
$i++;
}
}
//skip any values
if ($content[$i] == '"') {
$res .= $content[$i];
$i++;
while(!($content[$i] == '"' && $content[$i-1] != "\\")) {
$res .= $content[$i];
$i++;
}
}
if ($content[$i] == '>') { $res .= $content[$i]; $i++; break;}
}
$inHtml = false;
}
if (!$inHtml) {
while($content[$i] == ' ') { $res .= $content[$i]; $letter_count++; $i++; } //skip spaces
$word_started = false;
$current_word = '';
$current_word_len = 0;
while (!in_array($content[$i], array(' ', '<', '.', ','))) {
if (!$word_started) {
$word_started = true;
$word_count++;
}
$current_word .= $content[$i];
$current_word_len++;
if ($current_word_len == $max_word_len) {
$current_word .= '-';
$current_word_len = 0;
}
$i++;
}
if ($letter_count > $max_chars) {
return $res;
}
if ($word_count < $max_words) {
$res .= $current_word;
$letter_count += strlen($current_word);
}
if ($word_count == $max_words) {
$res .= $current_word;
$letter_count += strlen($current_word);
return $res;
}
}
}
return $res;
}
And next thing is closing unclosed tags:
function cleanTags(&$html) {
$count = strlen($html);
$i = -1;
$openedTags = array();
while(true) {
$i++;
if ($i >= $count) break;
if ($html[$i] == '<') {
$tag = '';
$closeTag = '';
$reading = false;
//reading whole tag
while($html[$i] != '>') {
$i++;
while($html[$i] == ' ') $i++; //skip any spaces (need to be idiot proof)
if (!$reading && $html[$i] == '/') { //closing tag
$i++;
while($html[$i] == ' ') $i++; //skip any spaces
$closeTag = '';
while($html[$i] != ' ' && $html[$i] != '>') { //start reading first actuall string
$reading = true;
$html[$i] = strtolower($html[$i]); //tags to lowercase
$closeTag .= $html[$i];
$i++;
}
$c = count($openedTags);
if ($c > 0 && $openedTags[$c-1] == $closeTag) array_pop($openedTags);
}
if (!$reading) //read only tag
while($html[$i] != ' ' && $html[$i] != '>') { //start reading first actuall string
$reading = true;
$html[$i] = strtolower($html[$i]); //tags to lowercase
$tag .= $html[$i];
$i++;
}
//skip any values
if ($html[$i] == "'") {
$i++;
while(!($html[$i] == "'" && $html[$i-1] != "\\")) {
$i++;
}
}
//skip any values
if ($html[$i] == '"') {
$i++;
while(!($html[$i] == '"' && $html[$i-1] != "\\")) {
$i++;
}
}
if ($reading && $html[$i] == '/') { //self closed tag
$tag = '';
break;
}
}
if (!empty($tag)) $openedTags[] = $tag;
}
}
while (count($openedTags) > 0) {
$tag = array_pop($openedTags);
$html .= "</$tag>";
}
}
It's not idiot proof but tinymce will clear this thing out so further cleaning is not necessary.
It may be a little long but i don't think it will eat a lot of resources and it should be faster than regex.

What can be improved in this PHP code?

This is a custom encryption library. I do not know much about PHP's standard library of functions and was wondering if the following code can be improved in any way. The implementation should yield the same results, the API should remain as it is, but ways to make is more PHP-ish would be greatly appreciated.
Code
<?php
/***************************************
Create random major and minor SPICE key.
***************************************/
function crypt_major()
{
$all = range("\x00", "\xFF");
shuffle($all);
$major_key = implode("", $all);
return $major_key;
}
function crypt_minor()
{
$sample = array();
do
{
array_push($sample, 0, 1, 2, 3);
} while (count($sample) != 256);
shuffle($sample);
$list = array();
for ($index = 0; $index < 64; $index++)
{
$b12 = $sample[$index * 4] << 6;
$b34 = $sample[$index * 4 + 1] << 4;
$b56 = $sample[$index * 4 + 2] << 2;
$b78 = $sample[$index * 4 + 3];
array_push($list, $b12 + $b34 + $b56 + $b78);
}
$minor_key = implode("", array_map("chr", $list));
return $minor_key;
}
/***************************************
Create the SPICE key via the given name.
***************************************/
function named_major($name)
{
srand(crc32($name));
return crypt_major();
}
function named_minor($name)
{
srand(crc32($name));
return crypt_minor();
}
/***************************************
Check validity for major and minor keys.
***************************************/
function _check_major($key)
{
if (is_string($key) && strlen($key) == 256)
{
foreach (range("\x00", "\xFF") as $char)
{
if (substr_count($key, $char) == 0)
{
return FALSE;
}
}
return TRUE;
}
return FALSE;
}
function _check_minor($key)
{
if (is_string($key) && strlen($key) == 64)
{
$indexs = array();
foreach (array_map("ord", str_split($key)) as $byte)
{
foreach (range(6, 0, 2) as $shift)
{
array_push($indexs, ($byte >> $shift) & 3);
}
}
$dict = array_count_values($indexs);
foreach (range(0, 3) as $index)
{
if ($dict[$index] != 64)
{
return FALSE;
}
}
return TRUE;
}
return FALSE;
}
/***************************************
Create encode maps for encode functions.
***************************************/
function _encode_map_1($major)
{
return array_map("ord", str_split($major));
}
function _encode_map_2($minor)
{
$map_2 = array(array(), array(), array(), array());
$list = array();
foreach (array_map("ord", str_split($minor)) as $byte)
{
foreach (range(6, 0, 2) as $shift)
{
array_push($list, ($byte >> $shift) & 3);
}
}
for ($byte = 0; $byte < 256; $byte++)
{
array_push($map_2[$list[$byte]], chr($byte));
}
return $map_2;
}
/***************************************
Create decode maps for decode functions.
***************************************/
function _decode_map_1($minor)
{
$map_1 = array();
foreach (array_map("ord", str_split($minor)) as $byte)
{
foreach (range(6, 0, 2) as $shift)
{
array_push($map_1, ($byte >> $shift) & 3);
}
}
return $map_1;
}function _decode_map_2($major)
{
$map_2 = array();
$temp = array_map("ord", str_split($major));
for ($byte = 0; $byte < 256; $byte++)
{
$map_2[$temp[$byte]] = chr($byte);
}
return $map_2;
}
/***************************************
Encrypt or decrypt the string with maps.
***************************************/
function _encode($string, $map_1, $map_2)
{
$cache = "";
foreach (str_split($string) as $char)
{
$byte = $map_1[ord($char)];
foreach (range(6, 0, 2) as $shift)
{
$cache .= $map_2[($byte >> $shift) & 3][mt_rand(0, 63)];
}
}
return $cache;
}
function _decode($string, $map_1, $map_2)
{
$cache = "";
$temp = str_split($string);
for ($iter = 0; $iter < strlen($string) / 4; $iter++)
{
$b12 = $map_1[ord($temp[$iter * 4])] << 6;
$b34 = $map_1[ord($temp[$iter * 4 + 1])] << 4;
$b56 = $map_1[ord($temp[$iter * 4 + 2])] << 2;
$b78 = $map_1[ord($temp[$iter * 4 + 3])];
$cache .= $map_2[$b12 + $b34 + $b56 + $b78];
}
return $cache;
}
/***************************************
This is the public interface for coding.
***************************************/
function encode_string($string, $major, $minor)
{
if (is_string($string))
{
if (_check_major($major) && _check_minor($minor))
{
$map_1 = _encode_map_1($major);
$map_2 = _encode_map_2($minor);
return _encode($string, $map_1, $map_2);
}
}
return FALSE;
}
function decode_string($string, $major, $minor)
{
if (is_string($string) && strlen($string) % 4 == 0)
{
if (_check_major($major) && _check_minor($minor))
{
$map_1 = _decode_map_1($minor);
$map_2 = _decode_map_2($major);
return _decode($string, $map_1, $map_2);
}
}
return FALSE;
}
?>
This is a sample showing how the code is being used. Hex editors may be of help with the input / output.
Example
<?php
# get and process all of the form data
# $input = htmlspecialchars($_POST["input"]);
# $majorname = htmlspecialchars($_POST["majorname"]);
# $minorname = htmlspecialchars($_POST["minorname"]);
# $majorkey = htmlspecialchars($_POST["majorkey"]);
# $minorkey = htmlspecialchars($_POST["minorkey"]);
# $output = htmlspecialchars($_POST["output"]);
# process the submissions by operation
# CREATE
# $operation = $_POST["operation"];
if ($operation == "Create")
{
if (strlen($_POST["majorname"]) == 0)
{
$majorkey = bin2hex(crypt_major());
}
if (strlen($_POST["minorname"]) == 0)
{
$minorkey = bin2hex(crypt_minor());
}
if (strlen($_POST["majorname"]) != 0)
{
$majorkey = bin2hex(named_major($_POST["majorname"]));
}
if (strlen($_POST["minorname"]) != 0)
{
$minorkey = bin2hex(named_minor($_POST["minorname"]));
}
}
# ENCRYPT or DECRYPT
function is_hex($char)
{
if ($char == "0"):
return TRUE;
elseif ($char == "1"):
return TRUE;
elseif ($char == "2"):
return TRUE;
elseif ($char == "3"):
return TRUE;
elseif ($char == "4"):
return TRUE;
elseif ($char == "5"):
return TRUE;
elseif ($char == "6"):
return TRUE;
elseif ($char == "7"):
return TRUE;
elseif ($char == "8"):
return TRUE;
elseif ($char == "9"):
return TRUE;
elseif ($char == "a"):
return TRUE;
elseif ($char == "b"):
return TRUE;
elseif ($char == "c"):
return TRUE;
elseif ($char == "d"):
return TRUE;
elseif ($char == "e"):
return TRUE;
elseif ($char == "f"):
return TRUE;
else:
return FALSE;
endif;
}
function hex2bin($str)
{
if (strlen($str) % 2 == 0):
$string = strtolower($str);
else:
$string = strtolower("0" . $str);
endif;
$cache = "";
$temp = str_split($str);
for ($index = 0; $index < count($temp) / 2; $index++)
{
$h1 = $temp[$index * 2];
if (is_hex($h1))
{
$h2 = $temp[$index * 2 + 1];
if (is_hex($h2))
{
$cache .= chr(hexdec($h1 . $h2));
}
else
{
return FALSE;
}
}
else
{
return FALSE;
}
}
return $cache;
}
if ($operation == "Encrypt" || $operation == "Decrypt")
{
# CHECK FOR ANY ERROR
$errors = array();
if (strlen($_POST["input"]) == 0)
{
$output = "";
}
$binmajor = hex2bin($_POST["majorkey"]);
if (strlen($_POST["majorkey"]) == 0)
{
array_push($errors, "There must be a major key.");
}
elseif ($binmajor == FALSE)
{
array_push($errors, "The major key must be in hex.");
}
elseif (_check_major($binmajor) == FALSE)
{
array_push($errors, "The major key is corrupt.");
}
$binminor = hex2bin($_POST["minorkey"]);
if (strlen($_POST["minorkey"]) == 0)
{
array_push($errors, "There must be a minor key.");
}
elseif ($binminor == FALSE)
{
array_push($errors, "The minor key must be in hex.");
}
elseif (_check_minor($binminor) == FALSE)
{
array_push($errors, "The minor key is corrupt.");
}
if ($_POST["operation"] == "Decrypt")
{
$bininput = hex2bin(str_replace("\r", "", str_replace("\n", "", $_POST["input"])));
if ($bininput == FALSE)
{
if (strlen($_POST["input"]) != 0)
{
array_push($errors, "The input data must be in hex.");
}
}
elseif (strlen($bininput) % 4 != 0)
{
array_push($errors, "The input data is corrupt.");
}
}
if (count($errors) != 0)
{
# ERRORS ARE FOUND
$output = "ERROR:";
foreach ($errors as $error)
{
$output .= "\n" . $error;
}
}
elseif (strlen($_POST["input"]) != 0)
{
# CONTINUE WORKING
if ($_POST["operation"] == "Encrypt")
{
# ENCRYPT
$output = substr(chunk_split(bin2hex(encode_string($_POST["input"], $binmajor, $binminor)), 58), 0, -2);
}
else
{
# DECRYPT
$output = htmlspecialchars(decode_string($bininput, $binmajor, $binminor));
}
}
}
# echo the form with the values filled
echo "<P><TEXTAREA class=maintextarea name=input rows=25 cols=25>" . $input . "</TEXTAREA></P>\n";
echo "<P>Major Name:</P>\n";
echo "<P><INPUT id=textbox1 name=majorname value=\"" . $majorname . "\"></P>\n";
echo "<P>Minor Name:</P>\n";
echo "<P><INPUT id=textbox1 name=minorname value=\"" . $minorname . "\"></P>\n";
echo "<DIV style=\"TEXT-ALIGN: center\"><INPUT class=submit type=submit value=Create name=operation>\n";
echo "</DIV>\n";
echo "<P>Major Key:</P>\n";
echo "<P><INPUT id=textbox1 name=majorkey value=\"" . $majorkey . "\"></P>\n";
echo "<P>Minor Key:</P>\n";
echo "<P><INPUT id=textbox1 name=minorkey value=\"" . $minorkey . "\"></P>\n";
echo "<DIV style=\"TEXT-ALIGN: center\"><INPUT class=submit type=submit value=Encrypt name=operation> \n";
echo "<INPUT class=submit type=submit value=Decrypt name=operation> </DIV>\n";
echo "<P>Result:</P>\n";
echo "<P><TEXTAREA class=maintextarea name=output rows=25 readOnly cols=25>" . $output . "</TEXTAREA></P></DIV></FORM>\n";
?>
What should be editted for better memory efficiency or faster execution?
You could replace your isHex function with:
function isHex($char) {
return strpos("0123456789ABCDEF",strtoupper($char)) > -1;
}
And your hex2bin might be better as:
function hex2bin($h)
{
if (!is_string($h)) return null;
$r='';
for ($a=0; $a<strlen($h); $a+=2) { $r.=chr(hexdec($h{$a}.$h{($a+1)})); }
return $r;
}
You seem to have a lot of if..elseif...elseif...elseif which would be a lot cleaner in a switch or seperated into different methods.
However, I'm talking more from a maintainability and readability perspective - although the easier it is to read and understand, the easier it is to optimize. If it would help you if I waded through all the code and wrote it in a cleaner way, then I'll do so...

Categories