I'm using an IPv6 class found on GitHub to do some IP manipulation, but I noticed an issue when shortening certain addresses, typically ones ending in 0.
When I enter the address 2001::6dcd:8c74:0:0:0:0, it results in 2001::6dcd:8c74::::.
$address = '2001::6dcd:8c74:0:0:0:0';
// Check to see if address is already compacted
if (strpos($address, '::') === FALSE) {
$parts = explode(':', $address);
$new_parts = array();
$ignore = FALSE;
$done = FALSE;
for ($i = 0; $i < count($parts); $i++) {
if (intval(hexdec($parts[$i])) === 0 && $ignore == FALSE && $done == FALSE) {
$ignore = TRUE;
$new_parts[] = '';
if ($i == 0) {
$new_parts[] = ''; // append a second empty part so a leading run of zeros yields '::'
}
} else if (intval(hexdec($parts[$i])) === 0 && $ignore == TRUE && $done == FALSE) {
continue;
} else if (intval(hexdec($parts[$i])) !== 0 && $ignore == TRUE) {
$done = TRUE;
$ignore = FALSE;
$new_parts[] = $parts[$i];
} else {
$new_parts[] = $parts[$i];
}
}
// Glue everything back together
$address = implode(':', $new_parts);
}
// Remove the leading 0's
$new_address = preg_replace("/:0{1,3}/", ":", $address);
// $this->compact = $new_address;
// return $this->compact;
echo $new_address; // Outputs: 2001::6dcd:8c74::::
To compress IPv6 addresses, let PHP do the formatting with inet_pton and inet_ntop: inet_pton converts the address to its packed binary form, and inet_ntop converts it back to compacted text.
Sample IPv6 address: 2001::6dcd:8c74:0:0:0:0
Test usage:
echo inet_ntop(inet_pton('2001::6dcd:8c74:0:0:0:0'));
Output : 2001:0:6dcd:8c74::
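The same idea can be wrapped in a small reusable function. This is only a sketch: the name compactIpv6 and the choice to return null for invalid input are assumptions made for illustration, not part of the original class. Validating with filter_var first avoids the warning that inet_pton raises on malformed input.

```php
<?php
// Hypothetical helper: returns the compacted form of an IPv6 address,
// or null when the input is not a valid IPv6 address.
function compactIpv6(string $address): ?string
{
    // filter_var() rejects malformed input without emitting a warning.
    if (filter_var($address, FILTER_VALIDATE_IP, FILTER_FLAG_IPV6) === false) {
        return null;
    }

    // Text -> packed binary -> text; inet_ntop() writes the compacted form.
    return inet_ntop(inet_pton($address));
}

var_dump(compactIpv6('2001::6dcd:8c74:0:0:0:0'));
var_dump(compactIpv6('not-an-address'));
```

Returning null rather than false keeps the return type to a nullable string, but either convention would work.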
Without the problem line at the bottom (the preg_replace call) you get 2001::6dcd:8c74:0:0:0:0.
The function now checks whether the address ends in :0 before removing the leading 0's.
if (substr($address, -2) != ':0') {
$new_address = preg_replace("/:0{1,3}/", ":", $address);
} else {
$new_address = $address;
}
Another check is added to keep other valid IPv6 addresses from being malformed.
if (isset($new_parts)) {
if (count($new_parts) < 8 && array_pop($new_parts) == '') {
$new_address .= ':0';
}
}
The new full function looks like this:
// Check to see if address is already compacted
if (strpos($address, '::') === FALSE) {
$parts = explode(':', $address);
$new_parts = array();
$ignore = FALSE;
$done = FALSE;
for ($i = 0; $i < count($parts); $i++) {
if (intval(hexdec($parts[$i])) === 0 && $ignore == FALSE && $done == FALSE) {
$ignore = TRUE;
$new_parts[] = '';
if ($i == 0) {
$new_parts[] = ''; // append a second empty part so a leading run of zeros yields '::'
}
} else if (intval(hexdec($parts[$i])) === 0 && $ignore == TRUE && $done == FALSE) {
continue;
} else if (intval(hexdec($parts[$i])) !== 0 && $ignore == TRUE) {
$done = TRUE;
$ignore = FALSE;
$new_parts[] = $parts[$i];
} else {
$new_parts[] = $parts[$i];
}
}
// Glue everything back together
$address = implode(':', $new_parts);
}
// Check to see if this ends in a shortened :0 before replacing all
// leading 0's
if (substr($address, -2) != ':0') {
// Remove the leading 0's
$new_address = preg_replace("/:0{1,3}/", ":", $address);
} else {
$new_address = $address;
}
// Since new_parts isn't always set, check to see if it's set before
// trying to fix possibly broken shortened addresses ending in 0.
// (Ex: Trying to shorten 2001:19f0::0 will result in unset array)
if (isset($new_parts)) {
// Some addresses (Ex: starting addresses for a range) can end in
// all 0's resulting in the last value in the new parts array to be
// an empty string. Catch that case here and add the remaining :0
// for a complete shortened address.
if (count($new_parts) < 8 && array_pop($new_parts) == '') {
$new_address .= ':0';
}
}
// $this->compact = $new_address;
// return $this->compact;
echo $new_address; // Outputs: 2001::6dcd:8c74:0:0:0:0
It's not the cleanest solution and could possibly have holes in its logic depending on what the address is. If I find any other issues with it I will update this question/answer.
So, I want to check the user's input to see whether it contains any of these characters:
" ' < >
I hope someone can show me a better way with less code.
Thanks!
I used preg_match, but I only managed it with four nested ifs.
/*Checks if the given value is valid*/
private function checkValidInput($input)
{
/*If there is no " */
if(preg_match('/"/', $input) == false)
{
/*If there is no ' */
if(preg_match("/'/", $input) == false)
{
/*If there is no <*/
if(preg_match("/</", $input) == false)
{
/*If there is no >*/
if(preg_match("/>/", $input) == false)
{
return true;
}
else
{
return false;
}
}
else
{
return false;
}
}
else
{
return false;
}
}
else
{
return false;
}
}
You could use a regex character class:
preg_match('#["\'<>]#', $input);
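As a sketch of how the four nested ifs collapse into a single check, the character class can be wrapped in a function that returns a boolean. The names isValidInput and isValidInputNoRegex are hypothetical, and the second variant swaps the regex for strpbrk(), which is an alternative technique rather than what the answer above uses.

```php
<?php
// Hypothetical replacement for checkValidInput(): true when $input
// contains none of the characters " ' < >
function isValidInput(string $input): bool
{
    // preg_match() returns 1 on a match and 0 when nothing matches.
    return preg_match('/["\'<>]/', $input) === 0;
}

// Same check without a regex: strpbrk() returns false when none of the
// listed characters occurs in the string.
function isValidInputNoRegex(string $input): bool
{
    return strpbrk($input, "\"'<>") === false;
}

var_dump(isValidInput('plain text'));
var_dump(isValidInput('<script>'));
```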
Edit:
If you need to check that the input contains all of the characters, use strpos() in a for loop:
function checkInput($val) {
$contains = true;
$required = "<>a";
for($i = 0, $count = strlen($required); $i < $count ; ++$i) {
$contains = $contains && false !== strpos($val, $required[$i]);
}
return $contains;
}
var_dump(checkInput('abcd<>a')); // true
var_dump(checkInput('abcd>a')); // false, doesn't contain <
I am upgrading a codebase that makes use of pass by reference.
Main function
function splitSqlFile(&$ret, $sql)
{
$sql = trim($sql);
$sql_len = strlen($sql);
$char = '';
$string_start = '';
$in_string = false;
for ($i = 0; $i < $sql_len; ++$i) {
$char = $sql[$i];
if ($in_string) {
for (;;) {
$i = strpos($sql, $string_start, $i);
if (!$i) {
$ret[] = $sql;
return true;
}else if ($string_start == '`' || $sql[$i-1] != '\\'){
......
}else {
......
} // end if...elseif...else
} // end for
}
else if ($char == ';') {
$ret[] = substr($sql, 0, $i);
$sql = ltrim(substr($sql, min($i + 1, $sql_len)));
$sql_len = strlen($sql);
if ($sql_len) {
$i = -1;
} else {
// The submitted statement(s) end(s) here
return true;
}
}else if (($char == '"') || ($char == '\'') || ($char == '`')) {
$in_string = true;
$string_start = $char;
} // end else if (is start of string)
// for start of a comment (and remove this comment if found)...
else if ($char == '#' || ($char == ' ' && $i > 1 && $sql[$i-2] . $sql[$i-1] == '--')) {
......
if (!$end_of_comment) {
// no eol found after '#', add the parsed part to the returned
// array and exit
$ret[] = trim(substr($sql, 0, $i-1));
return true;
} else {
.....
} // end if...else
} // end else if (is comment)
} // end for
// add any rest to the returned array
if (!empty($sql) && trim($sql) != '') {
$ret[] = $sql;
}
return true;
}
Calling the function
$sqlUtility->splitSqlFile($pieces, $sql_query);
foreach ($pieces as $piece)
{
.......
}
If splitSqlFile(&$ret, $sql) keeps the "&" in front of $ret, the program runs successfully. If it is removed, so that the signature becomes splitSqlFile($ret, $sql), the foreach starts failing with the 'Invalid argument supplied for foreach()' error, and when I inspect the variable (for example with is_array), it is always NULL.
Why you get the error:
By removing the & from $ret, you are no longer referencing the variable in the function call. In this case, $pieces. So when you do a foreach on $pieces after calling the function, it will error because $pieces is basically a null variable at that point.
function splitSqlFile(&$ret,$sql) {
$ret[] = 'stuff';
}
splitSqlFile($pieces,$sql);
// $pieces will be an array as 0 => 'stuff'
foreach ($pieces as $piece) { } // will not error
vs:
function splitSqlFile($ret,$sql) {
$ret[] = 'stuff';
}
splitSqlFile($pieces,$sql);
// $pieces will be a null variable, since it was never assigned anything
foreach ($pieces as $piece) { } // will error
Alternative to no reference:
So if you want to remove the & and no longer pass by reference, you have to do other changes to the function to get that value back out. And depending on the codebase, this could mean a whole lot of work everywhere that function is used!
Example:
function splitSqlFile($sql) {
$ret = [];
$ret[] = 'stuff';
return array('result'=>true,'pieces'=>$ret); // key must match $result['pieces'] used below
}
// $result will contain multiple things to utilize
// if you will only need that variable once (does not accumulate)
$result = splitSqlFile($sql);
foreach ($result['pieces'] as $piece) { }
// if that variable is added by multiple calls, and displayed later... merge
$pieces = [];
$result = splitSqlFile($sql_1);
$pieces = array_merge($pieces,$result['pieces']);
$result = splitSqlFile($sql_2);
$pieces = array_merge($pieces,$result['pieces']);
foreach ($pieces as $piece) { }
A second example (passing in the array as you go... gets confusing):
function splitSqlFile($pieces_in,$sql) {
$pieces_in[] = 'stuff';
return array('result'=>true,'pieces_out'=>$pieces_in);
}
$pieces = [];
$result = splitSqlFile($pieces,$sql_1);
$pieces = $result['pieces_out'];
$result = splitSqlFile($pieces,$sql_2);
$pieces = $result['pieces_out'];
foreach ($pieces as $piece) { }
As you can see, this changes not only the return value that has to be dealt with but also how the function is called. Again, if this function is used in a thousand places in the code... serious headaches!
Conclusion:
I would honestly keep the reference as it is. It was done that way to make accumulating debug data easier and more direct. Otherwise you have a lot of code changes to make to get rid of the reference.
However that can simply be my opinion on the matter.
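If callers would still prefer a return value, one middle ground is to leave the by-reference function untouched and add a thin wrapper around it. The sketch below is only an illustration: demoSplitSqlFile is a simplified stand-in for the real splitSqlFile (it just splits on semicolons), and splitSqlToArray is a hypothetical wrapper name.

```php
<?php
// Simplified stand-in for the real by-reference splitter (assumption:
// it only splits on ';' and ignores strings and comments).
function demoSplitSqlFile(&$ret, $sql)
{
    foreach (explode(';', $sql) as $statement) {
        $statement = trim($statement);
        if ($statement !== '') {
            $ret[] = $statement;
        }
    }
    return true;
}

// Hypothetical wrapper: callers get an array back, and the existing
// by-reference function stays exactly as it is.
function splitSqlToArray($sql)
{
    $pieces = []; // initialised, so foreach never sees NULL
    demoSplitSqlFile($pieces, $sql);
    return $pieces;
}

foreach (splitSqlToArray('SELECT 1; SELECT 2;') as $piece) {
    echo $piece, "\n";
}
```

Initialising $pieces inside the wrapper also covers the case where the input contains no statements at all, since the foreach then receives an empty array.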
<?php
// test variables
$l1 = "http://youtube.com/channel/";
$l2 = "http://youtube.com/channel/";
$l3 = "http://youtube.com/channel/";
$l4 = "http://youtube.com/channel/";
$fl = "http://youtube.com/channel/";
//set error false as default
$error = "false";
//check if variables are ready for use, if they are, add them to `$l` array
//I do each check as a separate line, as it looks cleaner than 1 long if statement.
$l = [];
if(!empty($l1)) $l[] = $l1;
if(!empty($l2)) $l[] = $l2;
if(!empty($l3)) $l[] = $l3;
if(!empty($l4)) $l[] = $l4;
if(!empty($fl)) $l[] = $fl;
foreach($l as $key => $value) {
//1 line ternary is cleaner than if/else statement
$errorKey = $key < 9? "0{$key}" : $key;
//each row by default has no error
$hasError = 0;
//check if this a valid url
if(!preg_match('|^http(s)?://[a-z0-9-]+(.[a-z0-9-]+)*(:[0-9]+)?(/.*)?$|i', $value)) {
$error = "true";
$hasError = 1;
}
if($hasError) {
//store error in array, to loop through later
$errors[] = $errorKey;
}
}
$search = '?sub_confirmation=1';
$searchUrl = "youtube.com/channel";
if (strpos($l, $searchUrl) !== false && strpos($l, $search) === false) {
$l = $value."".$search;
}
if($error == "false") {
echo $l1;
echo $l2;
echo $l3;
echo $l4;
echo $fl;
}
// deliver the error message
//Check if $error has been set to true at any point
if($error == "true") {
//loop through error array, echo error message if $errorNumber matches.
//at this point we KNOW there was an error at some point, no need to use a switch really
foreach($errors as $errorNumber) {
echo "Something went wrong here $errorNumber :o";
}
}
?>
Hello, my problem is at the end of the code, where the strpos function is. Basically, I want to check each URL once to see if it contains a certain URL, and add something to the end if it does. But I don't want to repeat an if statement four times (the $fl variable doesn't have to be checked). I am quite new to all this, so I hope somebody can help me. I thought about a switch statement, but I guess there is a better way. And if I put it in the foreach above, it doesn't apply to the individual variables, only to the $value variable.
You can assign $value by reference using this foreach header (notice the & in front of $value):
foreach($l as $key => &$value) {
By doing this, every change you make to $value is also applied to the corresponding element of the $l array.
Then at the end of the foreach loop you put this code:
if (strpos($value, $searchUrl) !== false && strpos($value, $search) === false) {
$value .= $search;
}
So your final foreach loop should look like this:
foreach($l as $key => &$value) {
//1 line ternary is cleaner than if/else statement
$errorKey = $key < 9? "0{$key}" : $key;
//each row by default has no error
$hasError = 0;
//check if this a valid url
if(!preg_match('|^http(s)?://[a-z0-9-]+(.[a-z0-9-]+)*(:[0-9]+)?(/.*)?$|i', $value)) {
$error = "true";
$hasError = 1;
}
if($hasError) {
//store error in array, to loop through later
$errors[] = $errorKey;
}
$search = '?sub_confirmation=1';
$searchUrl = "youtube.com/channel";
if (strpos($value, $searchUrl) !== false && strpos($value, $search) === false) {
$value .= $search;
}
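}
unset($value); // break the reference so later writes to $value don't change the last element of $l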
You can read more about using references in foreach loops in the PHP manual: PHP: foreach
Edit:
To apply the changes not only to the elements of the $l array, but also to the original variables $l1, $l2 and so on, you should assign the elements to your array as references too:
$l = [];
if(!empty($l1)) $l[] = &$l1;
if(!empty($l2)) $l[] = &$l2;
if(!empty($l3)) $l[] = &$l3;
if(!empty($l4)) $l[] = &$l4;
if(!empty($fl)) $l[] = &$fl;
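A stripped-down sketch of the by-reference loop on its own, with sample URLs invented for illustration, may make the behaviour easier to see. It also shows unset() after the loop, which keeps the leftover reference from pointing at the last element.

```php
<?php
// Sample data (hypothetical URLs, not from the question).
$urls = [
    'http://youtube.com/channel/abc',
    'http://example.com/',
    'http://youtube.com/channel/xyz?sub_confirmation=1',
];

$search    = '?sub_confirmation=1';
$searchUrl = 'youtube.com/channel';

// &$url makes each iteration work on the array element itself.
foreach ($urls as &$url) {
    if (strpos($url, $searchUrl) !== false && strpos($url, $search) === false) {
        $url .= $search;
    }
}
// Without this, $url would still reference the last element of $urls.
unset($url);

print_r($urls);
```

Only the first URL is changed: the second is not a channel URL, and the third already carries the parameter.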
Personally, I think this is a good candidate for moving to a class. To be honest I'm not 100% sure what you are doing but will try to convert your code to a class.
class L {
public $raw = null;
public $modified = null;
public $error = false;
// create the class
public function __construct($data=null) {
$this->raw = $data;
// Check the raw passed in data
if ($data) {
$this->isUrl();
}
// If there was no error, check the data
if (! $this->error) {
$this->search();
}
}
// Do something ?
public function debug() {
echo '<pre>';
var_dump($this);
echo '</pre>';
}
public function getData() {
return ($this->modified) ? : $this->raw;
}
private function isUrl() {
$this->error = (! preg_match('|^http(s)?://[a-z0-9-]+(.[a-z0-9-]+)*(:[0-9]+)?(/.*)?$|i', $this->raw));
}
// Should a failed search also be an error?
private function search() {
if ($this->raw) {
if ( (strpos($this->raw, "youtube.com/channel") !== false) &&
(strpos($this->raw, "?sub_confirmation=1") === false) ) {
$this->modified = $this->raw ."?sub_confirmation=1";
}
}
}
}
// Test data
$testList[] = "test fail";
$testList[] = "https://youtube.com/searchFail";
$testList[] = "https://youtube.com/channel/success";
$testList[] = "https://youtube.com/channel/confirmed?sub_confirmation=1";
// Testing code
foreach($testList as $key=>$val) {
$l[] = new L($val);
}
foreach($l as $key=>$val) {
// Check for an error
if ($val->error) {
$val->debug();
} else {
echo '<pre>'.$val->getData().'</pre>';
}
}
And the output would be:
object(L)#1 (3) {
["raw"]=>
string(9) "test fail"
["modified"]=>
NULL
["error"]=>
bool(true)
}
https://youtube.com/searchFail
https://youtube.com/channel/success?sub_confirmation=1
https://youtube.com/channel/confirmed?sub_confirmation=1
I'm trying to code a PHP parser to gather professor reviews from ratemyprofessors.com. Each professor has a page that contains all of their reviews. I want to parse each professor's page and extract the comments into a txt file.
This is what I have so far, but it doesn't execute properly: when I run it, the output txt file remains empty. What can be the issue?
<?php
set_time_limit(0);
$domain = "http://www.ratemyprofessors.com";
$content = "div id=commentsection";
$content_tag = "comment";
$output_file = "reviews.txt";
$max_urls_to_check = 400;
$rounds = 0;
$reviews_stack = array();
$max_size_domain_stack = 10000;
$checked_domains = array();
while ($domain != "" && $rounds < $max_urls_to_check) {
$doc = new DOMDocument();
@$doc->loadHTMLFile($domain);
$found = false;
foreach($doc->getElementsByTagName($content_tag) as $tag) {
if (strpos($tag->nodeValue, $content)) {
$found = true;
break;
}
}
$checked_domains[$domain] = $found;
foreach($doc->getElementsByTagName('a') as $link) {
$href = $link->getAttribute('href');
if (strpos($href, 'http://') !== false && strpos($href, $domain) === false) {
$href_array = explode("/", $href);
if (count($domain_stack) < $max_size_domain_stack &&
$checked_domains["http://".$href_array[2]] === null) {
array_push($domain_stack, "http://".$href_array[2]);
}
};
}
$domain_stack = array_unique($domain_stack);
$domain = $domain_stack[0];
unset($domain_stack[0]);
$domain_stack = array_values($domain_stack);
$rounds++;
}
$found_domains = "";
foreach ($checked_domains as $key => $value) {
if ($value) {
$found_domains .= $key."\n";
}
}
file_put_contents($output_file, $found_domains);
?>
The output is empty because the $domain_stack array is never initialized.
The main fix is to initialize the variable:
$domain_stack = array(); // before while ($domain != ...... )
Additionally, fix the other warnings and notices:
// change this
$checked_domains["http://".$href_array[2]] === null
// into
!isset($checked_domains["http://".$href_array[2]])
// another line
// check if key exists
if (isset($domain_stack[0])) {
$domain = $domain_stack[0];
unset($domain_stack[0]);
}
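The three fixes above (initialise the stack, use isset() for the lookup, guard against an empty stack) can be sketched together in isolation. The helper name enqueueDomain and the example.test hosts are assumptions made for this illustration, and array_shift() is used here in place of the unset/array_values pair from the question.

```php
<?php
// Hypothetical helper: queue a domain unless the stack is full, the
// domain was already checked, or it is already waiting in the stack.
function enqueueDomain(array &$stack, array $checked, string $domain, int $maxSize): void
{
    if (count($stack) < $maxSize
        && !isset($checked[$domain])
        && !in_array($domain, $stack, true)) {
        $stack[] = $domain;
    }
}

$domainStack = []; // initialised before the crawl loop starts
$checked     = ['http://a.example.test' => false];

enqueueDomain($domainStack, $checked, 'http://a.example.test', 10); // skipped: already checked
enqueueDomain($domainStack, $checked, 'http://b.example.test', 10); // queued
enqueueDomain($domainStack, $checked, 'http://b.example.test', 10); // skipped: duplicate

// array_shift() removes and returns the first element and reindexes
// the array; on an empty array it returns null.
$next  = array_shift($domainStack);
$after = array_shift($domainStack);

var_dump($next, $after);
```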
I'm participating in one of the Code Golf competitions where the smaller your file size is, the better.
Rather than manually removing all whitespace, etc., I'm looking for a program or website which will take a file, remove all whitespace (including new lines) and return a compact version of the file. Any ideas?
You could use:
sed -E 's/\s\s+/ /g' yourfile > yourpackedfile
There is also this online tool.
You can even do it in PHP (how marvelous is life):
$data = file_get_contents('foobar.php');
$data = preg_replace('/\s\s+/', ' ', $data);
file_put_contents('foobar2.php', $data);
Note that this won't preserve whitespace inside a string literal such as $bar = ' asd aa a';, which might be a problem depending on what you are doing. The online tool seems to handle this properly.
$ tr -d ' \n' <oldfile >newfile
In PowerShell (v2) this can be done with the following little snippet:
(-join(gc my_file))-replace"\s"
or longer:
(-join (Get-Content my_file)) -replace "\s"
It will join all lines together and remove all spaces and tabs.
However, for some languages you probably don't want to do that. In PowerShell for example you don't need semicolons unless you put multiple statements on a single line so code like
while (1) {
"Hello World"
$x++
}
would become
while(1){"HelloWorld"$x++}
when applying aforementioned statements naïvely. It both changed the meaning and the syntactical correctness of the program. Probably not too much to look out for in numerical golfed solutions but the issue with lines joined together still remains, sadly. Just putting a semicolon between each line doesn't actually help either.
This is a PHP function that will do the work for you:
function compress_php_src($src) {
// Whitespace to the left and right of these tokens can be ignored
static $IW = array(
T_CONCAT_EQUAL, // .=
T_DOUBLE_ARROW, // =>
T_BOOLEAN_AND, // &&
T_BOOLEAN_OR, // ||
T_IS_EQUAL, // ==
T_IS_NOT_EQUAL, // != or <>
T_IS_SMALLER_OR_EQUAL, // <=
T_IS_GREATER_OR_EQUAL, // >=
T_INC, // ++
T_DEC, // --
T_PLUS_EQUAL, // +=
T_MINUS_EQUAL, // -=
T_MUL_EQUAL, // *=
T_DIV_EQUAL, // /=
T_IS_IDENTICAL, // ===
T_IS_NOT_IDENTICAL, // !==
T_DOUBLE_COLON, // ::
T_PAAMAYIM_NEKUDOTAYIM, // ::
T_OBJECT_OPERATOR, // ->
T_DOLLAR_OPEN_CURLY_BRACES, // ${
T_AND_EQUAL, // &=
T_MOD_EQUAL, // %=
T_XOR_EQUAL, // ^=
T_OR_EQUAL, // |=
T_SL, // <<
T_SR, // >>
T_SL_EQUAL, // <<=
T_SR_EQUAL, // >>=
);
if(is_file($src)) {
if(!$src = file_get_contents($src)) {
return false;
}
}
$tokens = token_get_all($src);
$new = "";
$c = sizeof($tokens);
$iw = false; // Ignore whitespace
$ih = false; // In HEREDOC
$ls = ""; // Last sign
$ot = null; // Open tag
for($i = 0; $i < $c; $i++) {
$token = $tokens[$i];
if(is_array($token)) {
list($tn, $ts) = $token; // tokens: number, string, line
$tname = token_name($tn);
if($tn == T_INLINE_HTML) {
$new .= $ts;
$iw = false;
}
else {
if($tn == T_OPEN_TAG) {
if(strpos($ts, " ") || strpos($ts, "\n") || strpos($ts, "\t") || strpos($ts, "\r")) {
$ts = rtrim($ts);
}
$ts .= " ";
$new .= $ts;
$ot = T_OPEN_TAG;
$iw = true;
} elseif($tn == T_OPEN_TAG_WITH_ECHO) {
$new .= $ts;
$ot = T_OPEN_TAG_WITH_ECHO;
$iw = true;
} elseif($tn == T_CLOSE_TAG) {
if($ot == T_OPEN_TAG_WITH_ECHO) {
$new = rtrim($new, "; ");
} else {
$ts = " ".$ts;
}
$new .= $ts;
$ot = null;
$iw = false;
} elseif(in_array($tn, $IW)) {
$new .= $ts;
$iw = true;
} elseif($tn == T_CONSTANT_ENCAPSED_STRING
|| $tn == T_ENCAPSED_AND_WHITESPACE)
{
if($ts[0] == '"') {
$ts = addcslashes($ts, "\n\t\r");
}
$new .= $ts;
$iw = true;
} elseif($tn == T_WHITESPACE) {
$nt = @$tokens[$i+1];
if(!$iw && (!is_string($nt) || $nt == '$') && !in_array($nt[0], $IW)) {
$new .= " ";
}
$iw = false;
} elseif($tn == T_START_HEREDOC) {
$new .= "<<<S\n";
$iw = false;
$ih = true; // in HEREDOC
} elseif($tn == T_END_HEREDOC) {
$new .= "S;";
$iw = true;
$ih = false; // in HEREDOC
for($j = $i+1; $j < $c; $j++) {
if(is_string($tokens[$j]) && $tokens[$j] == ";") {
$i = $j;
break;
} else if($tokens[$j][0] == T_CLOSE_TAG) {
break;
}
}
} elseif($tn == T_COMMENT || $tn == T_DOC_COMMENT) {
$iw = true;
} else {
if(!$ih) {
$ts = strtolower($ts);
}
$new .= $ts;
$iw = false;
}
}
$ls = "";
}
else {
if(($token != ";" && $token != ":") || $ls != $token) {
$new .= $token;
$ls = $token;
}
$iw = true;
}
}
return $new;
}
// This is an example
$src = file_get_contents('foobar.php');
file_put_contents('foobar3.php',compress_php_src($src));
If your code editor supports regular expressions, you can try this:
Find this: [\r\n]{2,}
Replace with this: \n
Then Replace All
Notepad++ is quite a nice editor if you are on Windows, and it has a lot of predefined macros, trimming down code and removing whitespace among them.
It can do regular expressions and has a plethora of features to help the code hacker or script kiddie.
Notepad++ website
Run php -w on it!
php -w myfile.php
Unlike a regular expression, this is smart enough to leave strings alone, and it removes comments too.
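The same stripping is available from inside a script through php_strip_whitespace(), which takes a file name and returns the stripped source. The sketch below writes a throwaway sample file first so that it is self-contained; the file contents are made up for illustration.

```php
<?php
// Write a small sample file so the sketch is self-contained.
$path = tempnam(sys_get_temp_dir(), 'golf');
file_put_contents($path, "<?php\n// a comment\n\$bar   =   ' asd   aa';\necho \$bar;\n");

// php_strip_whitespace() returns the source with comments and
// redundant whitespace removed; string literals are left untouched.
$stripped = php_strip_whitespace($path);
unlink($path);

echo $stripped;
```

As with php -w, the result still keeps the single spaces and semicolons the parser needs, so it is a starting point for golfing rather than the smallest possible file.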