I'm trying to write a PHP parser to gather professor reviews from ratemyprofessor.com. Each professor has a page that contains all of their reviews. I want to parse each professor's page and extract the comments into a txt file.
This is what I have so far, but it doesn't execute properly: when I run it, the output txt file remains empty. What could be the issue?
<?php
set_time_limit(0);
$domain = "http://www.ratemyprofessors.com";
$content = "div id=commentsection";
$content_tag = "comment";
$output_file = "reviews.txt";
$max_urls_to_check = 400;
$rounds = 0;
$reviews_stack = array();
$max_size_domain_stack = 10000;
$checked_domains = array();
while ($domain != "" && $rounds < $max_urls_to_check) {
$doc = new DOMDocument();
#$doc->loadHTMLFile($domain);
$found = false;
foreach($doc->getElementsByTagName($content_tag) as $tag) {
if (strpos($tag->nodeValue, $content)) {
$found = true;
break;
}
}
$checked_domains[$domain] = $found;
foreach($doc->getElementsByTagName('a') as $link) {
$href = $link->getAttribute('href');
if (strpos($href, 'http://') !== false && strpos($href, $domain) === false) {
$href_array = explode("/", $href);
if (count($domain_stack) < $max_size_domain_stack &&
$checked_domains["http://".$href_array[2]] === null) {
array_push($domain_stack, "http://".$href_array[2]);
}
};
}
$domain_stack = array_unique($domain_stack);
$domain = $domain_stack[0];
unset($domain_stack[0]);
$domain_stack = array_values($domain_stack);
$rounds++;
}
$found_domains = "";
foreach ($checked_domains as $key => $value) {
if ($value) {
$found_domains .= $key."\n";
}
}
file_put_contents($output_file, $found_domains);
?>
The output is empty because the $domain_stack array is never initialized.
Main fix. Initialize the variable:
$domain_stack = array(); // before while ($domain != ...... )
Additionally, fix the other warnings and notices:
// change this
$checked_domains["http://".$href_array[2]] === null
// into
!isset($checked_domains["http://".$href_array[2]])
// another line
// check if key exists
if (isset($domain_stack[0])) {
$domain = $domain_stack[0];
unset($domain_stack[0]);
}
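Note also that the loadHTMLFile call in the question is commented out, so the DOMDocument stays empty and nothing can be found in it. As a minimal sketch of how the comments could be pulled out once a page is loaded, the snippet below uses DOMXPath on an inline HTML string. The markup (a div with id "commentsection" containing p elements with class "comment") is an assumption made for illustration; the real page structure may differ and the XPath expression would need adjusting.

```php
<?php
// Hypothetical markup: the real page structure may differ.
$html = '<html><body><div id="commentsection">'
      . '<p class="comment">Great lecturer.</p>'
      . '<p class="comment">Tough exams.</p>'
      . '</div></body></html>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc->loadHTML($html);            // for a live page: $doc->loadHTMLFile($url)
libxml_clear_errors();

// Select every comment paragraph inside the comment section.
$xpath = new DOMXPath($doc);
$comments = array();
foreach ($xpath->query('//div[@id="commentsection"]//p[@class="comment"]') as $node) {
    $comments[] = trim($node->textContent);
}

// One comment per line, ready to be written with file_put_contents().
$text = implode("\n", $comments);
```

Writing $text to the output file is then a single file_put_contents($output_file, $text) call.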
I downloaded TS3AntiVPN, but it shows an error. I'm using a Linux server running Debian 9 with Plesk installed.
PHP Warning: Invalid argument supplied for foreach() in
/var/www/vhosts/suspectgaming.de/tsweb.suspectgaming.de/antivpn/bot.php
on line 29
How do I solve this problem?
<?php
require("ts3admin.class.php");
$ignore_groups = array('1',); // now supports one input and array input
$msg_kick = "VPN";
$login_query = "serveradmin";
$pass_query = "";
$adres_ip = "94.249.254.216";
$query_port = "10011";
$port_ts = "9987";
$nom_bot = "AntiVPN";
$ts = new ts3Admin($adres_ip, $query_port);
if(!$ts->getElement('success', $ts->connect())) {
die("Anti-Proxy");
}
$ts->login($login_query, $pass_query);
$ts->selectServer($port_ts);
$ts->setName($nom_bot);
while(true) {
sleep(1);
$clientList = $ts->clientList("-ip -groups");
foreach($clientList['data'] as $val) {
$groups = explode(",", $val['client_servergroups'] );
if(is_array($ignore_groups)){
foreach($ignore_groups as $ig){
if(in_array($ig, $groups) || ($val['client_type'] == 1)) {
continue;
}
}
}else{
if(in_array($ignore_groups, $groups) || ($val['client_type'] == 1)) {
continue;
}
}
$file = file_get_contents('https://api.xdefcon.com/proxy/check/?ip='.$val['connection_client_ip'].'');
$file = json_decode($file, true);
if($file['message'] == "Proxy detected.") {
$ts->clientKick($val['clid'], "server", $msg_kick);
}
}
}
?>
Your first array may be missing data. You can add an if statement to check whether $clientList['data'] actually contains an array:
if (is_array($clientList['data'])) {
}
You can also check, with sizeof(), that the array contains at least one element:
if (is_array($clientList['data']) && sizeof($clientList['data']) > 0) {
}
Code
require "ts3admin.class.php";
$ignore_groups = array('1'); // now supports one input and array input
$msg_kick = "VPN";
$login_query = "serveradmin";
$pass_query = "";
$adres_ip = "94.249.254.216";
$query_port = "10011";
$port_ts = "9987";
$nom_bot = "AntiVPN";
$ts = new ts3Admin($adres_ip, $query_port);
if (!$ts->getElement('success', $ts->connect())) {
die("Anti-Proxy");
}
$ts->login($login_query, $pass_query);
$ts->selectServer($port_ts);
$ts->setName($nom_bot);
while (true) {
sleep(1);
$clientList = $ts->clientList("-ip -groups");
if (is_array($clientList['data']) && sizeof($clientList['data']) > 0) {
foreach ($clientList['data'] as $val) {
$groups = explode(",", $val['client_servergroups']);
if (is_array($ignore_groups)) {
foreach ($ignore_groups as $ig) {
if (in_array($ig, $groups) || ($val['client_type'] == 1)) {
continue;
}
}
} else {
if (in_array($ignore_groups, $groups) || ($val['client_type'] == 1)) {
continue;
}
}
$file = file_get_contents('https://api.xdefcon.com/proxy/check/?ip=' . $val['connection_client_ip'] . '');
$file = json_decode($file, true);
if ($file['message'] == "Proxy detected.") {
$ts->clientKick($val['clid'], "server", $msg_kick);
}
}
} else {
echo "There might be no data in Client List";
}
}
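One more thing to watch in the loop above: a continue inside the inner foreach ($ignore_groups ...) only advances that inner loop, so clients in an ignored group still reach the proxy check. A minimal sketch of one way around this is to move the decision into a helper and call continue from the outer loop. The function name should_skip_client is hypothetical and not part of ts3admin.

```php
<?php
// Hypothetical helper: decides whether a client row should be left alone.
function should_skip_client(array $client, $ignore_groups)
{
    // Query clients (client_type == 1) are never checked.
    if (isset($client['client_type']) && $client['client_type'] == 1) {
        return true;
    }

    $raw    = isset($client['client_servergroups']) ? $client['client_servergroups'] : '';
    $groups = explode(',', $raw);

    // (array) turns a single group id into a one-element array,
    // so both input styles are handled the same way.
    return count(array_intersect((array) $ignore_groups, $groups)) > 0;
}
```

Inside foreach ($clientList['data'] as $val), the two nested checks could then be replaced by: if (should_skip_client($val, $ignore_groups)) { continue; }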
I want to remove all functions ending with _example from my code. I am processing the code using token_get_all. The code I currently have, shown below, changes the opening tags and strips out the comments.
foreach ($files as $file) {
$content = file_get_contents($file);
$tokens = token_get_all($content);
$output = '';
foreach($tokens as $token) {
if (is_array($token)) {
list($index, $code, $line) = $token;
switch($index) {
case T_OPEN_TAG_WITH_ECHO:
$output .= '<?php echo';
break;
case T_COMMENT:
case T_DOC_COMMENT:
$output .= '';
break;
case T_FUNCTION:
// ???
default:
$output .= $code;
break;
}
} else {
$output .= $token;
}
}
file_put_contents($file, $output);
}
I just can't figure out how I can modify it to strip entire functions based on their names.
OK, I wrote new code for your problem:
First, it finds every function and its declaration in your source code.
Second, it checks whether the function name ends with "_example" and, if so, removes that function's code.
$source = file_get_contents($filename); // Obtain source from filename $filename
$tokens = token_get_all($source); // Get php tokens
// Init variables
$in_fnc = false;
$fnc_name = null;
$functions = array();
// Loop $tokens to locate functions
foreach ($tokens as $token){
$t_array = is_array($token);
if ($t_array){
list($t, $c) = $token;
if (!$in_fnc && $t == T_FUNCTION){ // "function": we register one function start
$in_fnc = true;
$fnc_name = null;
$nb_opened = $nb_closed = 0;
continue;
}
else if ($in_fnc && null === $fnc_name && $t == T_STRING){ // we check and store the name of function if exists
if (preg_match('`function\s+'.preg_quote($c).'\s*\(`sU', $source)){ // "function function_name ("
$fnc_name = $c;
continue;
}
}
}
else {
$c = $token; // single char: content is $token
}
if ($in_fnc && null !== $fnc_name){ // we are in declaration of function
$nb_closed += substr_count($c, '}'); // we count number of } to extract later complete code of this function
if (!$t_array){
$nb_opened += substr_count($c, '{') - substr_count($c, '}'); // we count number of { not closed (num "{" - num "}")
if ($nb_closed > 0 && $nb_opened == 0){ // once "}" parsed and all "{" are closed by "}"
if (preg_match('`function\s+'.preg_quote($fnc_name).'\s*\((.*\}){'.$nb_closed.'}`sU', $source, $m)){
$functions[$fnc_name] = $m[0]; // we store entire code of this function in $functions
}
$in_fnc = false; // we declare that function is finished
}
}
}
}
// Ok, now $functions contains all functions found in $filename
$source_changed = false; // Prevents re-write $filename with the original content
foreach ($functions as $f_name => $f_code){
if (preg_match('`_example$`', $f_name)){
$source = str_replace($f_code, '', $source); // remove the function if its name ends with "_example"
$source_changed = true;
}
}
if ($source_changed){
file_put_contents($filename, $source); // replace $filename file contents
}
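An alternative sketch, staying entirely on the token stream instead of going back to regular expressions: when a T_FUNCTION token is followed by a name with the wanted suffix, skip tokens until the brace that closes the body. Counting brace tokens rather than characters means a "{" inside a string literal or comment is not miscounted. The helper name strip_functions_by_suffix is hypothetical, and the sketch assumes plain named functions with a body (no abstract methods, no by-reference function &name()).

```php
<?php
// Hypothetical helper: removes named functions whose name ends with $suffix.
function strip_functions_by_suffix($source, $suffix = '_example')
{
    $tokens = token_get_all($source);
    $count  = count($tokens);
    $output = '';

    for ($i = 0; $i < $count; $i++) {
        $token = $tokens[$i];

        if (is_array($token) && $token[0] === T_FUNCTION) {
            // Look past whitespace for the function name.
            $j = $i + 1;
            while ($j < $count && is_array($tokens[$j]) && $tokens[$j][0] === T_WHITESPACE) {
                $j++;
            }
            $isMatch = $j < $count
                && is_array($tokens[$j])
                && $tokens[$j][0] === T_STRING
                && substr($tokens[$j][1], -strlen($suffix)) === $suffix;

            if ($isMatch) {
                // Skip tokens until the brace that closes the function body.
                // "{$var}" and "${var}" open with array tokens but close with "}".
                $openers = array(T_CURLY_OPEN, T_DOLLAR_OPEN_CURLY_BRACES);
                $depth   = 0;
                for ($k = $j; $k < $count; $k++) {
                    $t = $tokens[$k];
                    if ($t === '{' || (is_array($t) && in_array($t[0], $openers, true))) {
                        $depth++;
                    } elseif ($t === '}') {
                        $depth--;
                        if ($depth === 0) {
                            break;
                        }
                    }
                }
                $i = $k; // resume after the closing brace
                continue;
            }
        }

        // Everything else is copied through unchanged.
        $output .= is_array($token) ? $token[1] : $token;
    }

    return $output;
}
```

This could be combined with the switch from the question by running it on the source before (or after) the comment-stripping pass.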
<?php
// test variables
$l1 = "http://youtube.com/channel/";
$l2 = "http://youtube.com/channel/";
$l3 = "http://youtube.com/channel/";
$l4 = "http://youtube.com/channel/";
$fl = "http://youtube.com/channel/";
//set error false as default
$error = "false";
//check if variables are ready for use, if they are, add them to `$l` array
//I do each check on a separate line, as it looks cleaner than 1 long if statement.
$l = [];
if(!empty($l1)) $l[] = $l1;
if(!empty($l2)) $l[] = $l2;
if(!empty($l3)) $l[] = $l3;
if(!empty($l4)) $l[] = $l4;
if(!empty($fl)) $l[] = $fl;
foreach($l as $key => $value) {
//1 line ternary is cleaner than if/else statement
$errorKey = $key < 9? "0{$key}" : $key;
//each row by default has no error
$hasError = 0;
//check if this a valid url
if(!preg_match('|^http(s)?://[a-z0-9-]+(.[a-z0-9-]+)*(:[0-9]+)?(/.*)?$|i', $value)) {
$error = "true";
$hasError = 1;
}
if($hasError) {
//store error in array, to loop through later
$errors[] = $errorKey;
}
}
$search = '?sub_confirmation=1';
$searchUrl = "youtube.com/channel";
if (strpos($l, $searchUrl) !== false && strpos($l, $search) === false) {
$l = $value."".$search;
}
if($error == "false") {
echo $l1;
echo $l2;
echo $l3;
echo $l4;
echo $fl;
}
// deliver the error message
//Check if $error has been set to true at any point
if($error == "true") {
//loop through error array, echo error message if $errorNumber matches.
//at this point we KNOW there was an error at some point, no need to use a switch really
foreach($errors as $errorNumber) {
echo "Something went wrong here $errorNumber :o";
}
}
?>
Hello, my problem is at the end of the code, where the strpos function is. Basically, I want to check every URL once to see whether it contains a certain URL, and if so, append something to the end of it. But I don't want to repeat an if statement 4 times (the $fl variable doesn't have to be checked). I am quite new to all this, so I hope somebody can help me. I thought about a switch statement, but I guess there is a better way. And if I put the check in the foreach above, it doesn't apply to the individual variables, only to the $value variable.
You can assign $value by reference using this foreach header (notice the & in front of $value):
foreach($l as $key => &$value) {
With this, every change you make to $value is also applied to the corresponding element of the $l array.
Then at the end of the foreach loop you put this code:
if (strpos($value, $searchUrl) !== false && strpos($value, $search) === false) {
$value .= $search;
}
So your final foreach loop should look like this:
foreach($l as $key => &$value) {
//1 line ternary is cleaner than if/else statement
$errorKey = $key < 9? "0{$key}" : $key;
//each row by default has no error
$hasError = 0;
//check if this a valid url
if(!preg_match('|^http(s)?://[a-z0-9-]+(.[a-z0-9-]+)*(:[0-9]+)?(/.*)?$|i', $value)) {
$error = "true";
$hasError = 1;
}
if($hasError) {
//store error in array, to loop through later
$errors[] = $errorKey;
}
$search = '?sub_confirmation=1';
$searchUrl = "youtube.com/channel";
if (strpos($value, $searchUrl) !== false && strpos($value, $search) === false) {
$value .= $search;
}
}
You can read more about using references in foreach loops in the PHP manual page for foreach.
Edit:
To apply the changes not only to the elements of the $l array, but also to the original variables $l1, $l2 and so on, you should assign the elements to your array as references too:
$l = [];
if(!empty($l1)) $l[] = &$l1;
if(!empty($l2)) $l[] = &$l2;
if(!empty($l3)) $l[] = &$l3;
if(!empty($l4)) $l[] = &$l4;
if(!empty($fl)) $l[] = &$fl;
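One caveat with a by-reference foreach: after the loop, $value still refers to the last array element, so it is usually safest to unset($value) right away. A minimal sketch of the pattern, using a made-up $urls array in place of $l:

```php
<?php
// Illustrative data, standing in for the $l array from the question.
$urls = array(
    'http://youtube.com/channel/abc',
    'http://example.com/',
);
$search    = '?sub_confirmation=1';
$searchUrl = 'youtube.com/channel';

foreach ($urls as &$value) {
    // Append the parameter only to channel URLs that do not have it yet.
    if (strpos($value, $searchUrl) !== false && strpos($value, $search) === false) {
        $value .= $search;
    }
}
unset($value); // break the reference so later writes to $value cannot touch $urls

$value = 'unrelated'; // safe: $urls is no longer affected
```

Without the unset, the final assignment would overwrite the last element of $urls.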
Personally, I think this is a good candidate for moving to a class. To be honest I'm not 100% sure what you are doing but will try to convert your code to a class.
class L {
public $raw = null;
public $modified = null;
public $error = false;
// create the class
public function __construct($data=null) {
$this->raw = $data;
// Check the raw passed in data
if ($data) {
$this->isUrl();
}
// If there was no error, check the data
if (! $this->error) {
$this->search();
}
}
// Do something ?
public function debug() {
echo '<pre>';
var_dump($this);
echo '</pre>';
}
public function getData() {
return ($this->modified) ? : $this->raw;
}
private function isUrl() {
$this->error = (! preg_match('|^http(s)?://[a-z0-9-]+(.[a-z0-9-]+)*(:[0-9]+)?(/.*)?$|i', $this->raw));
}
// Should a failed search also be an error?
private function search() {
if ($this->raw) {
if ( (strpos($this->raw, "youtube.com/channel") !== false) &&
(strpos($this->raw, "?sub_confirmation=1") === false) ) {
$this->modified = $this->raw ."?sub_confirmation=1";
}
}
}
}
// Test data
$testList[] = "test fail";
$testList[] = "https://youtube.com/searchFail";
$testList[] = "https://youtube.com/channel/success";
$testList[] = "https://youtube.com/channel/confirmed?sub_confirmation=1";
// Testing code
foreach($testList as $key=>$val) {
$l[] = new L($val);
}
foreach($l as $key=>$val) {
// Check for an error
if ($val->error) {
$val->debug();
} else {
echo '<pre>'.$val->getData().'</pre>';
}
}
And the output would be:
object(L)#1 (3) {
["raw"]=>
string(9) "test fail"
["modified"]=>
NULL
["error"]=>
bool(true)
}
https://youtube.com/searchFail
https://youtube.com/channel/success?sub_confirmation=1
https://youtube.com/channel/confirmed?sub_confirmation=1
I'm writing a script to download files over FTP.
In the form, I need to show files and folders.
With ftp_nlist, they all come back together, but I want to know which is which.
I can't find an easy way to do this:
$contents = ftp_nlist($connection, $rep);
$dossiers =array();
$fichiers = array();
foreach($contents as $content){
//if folder
if (is_folder($content)) $dossiers[] = $content;
//if file
if(is_filex($content)) $fichiers[] = $content;
}
Of course, is_file and is_dir don't work with remote files.
I've found something using ftp_rawlist and the size of each result,
like this:
if($result['size']== 0){ //is dir }
But what about an empty file?
So how can I tell which entries are folders and which are files?
Thanks!
I've had the same problem and this was my solution:
$conn = ftp_connect('my_ftp_host');
ftp_login($conn, 'my_user', 'my_password');
$path = '/';
// Get lists
$nlist = ftp_nlist($conn, $path);
$rawlist = ftp_rawlist($conn, $path);
$ftp_dirs = array();
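for ($i = 0; $i < count($nlist); $i++) // visit every entry, including the last one
{
if (isset($rawlist[$i]) && $rawlist[$i][0] == 'd') // Unix-style listings mark directories with a leading "d"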
{
$ftp_dirs[] = $nlist[$i];
}
}
I know the above code could be optimised and do just one FTP request instead of two but for my purposes this did the work.
For anyone looking for a cleaner solution, I've found a script that parses the output of ftp_rawlist:
Function
function parse_ftp_rawlist($List, $Win = FALSE)
{
$Output = array();
$i = 0;
if ($Win) {
foreach ($List as $Current) {
// ereg() was removed in PHP 7; preg_match() returns 1 only when the line matches
if (preg_match('/([0-9]{2})-([0-9]{2})-([0-9]{2}) +([0-9]{2}):([0-9]{2})(AM|PM) +([0-9]+|) +(.+)/', $Current, $Split)) {
if ($Split[3] < 70) {
$Split[3] += 2000;
}
else {
$Split[3] += 1900;
}
$Output[$i]['isdir'] = ($Split[7] == '');
$Output[$i]['size'] = $Split[7];
$Output[$i]['month'] = $Split[1];
$Output[$i]['day'] = $Split[2];
$Output[$i]['time/year'] = $Split[3];
$Output[$i]['name'] = $Split[8];
$i++;
}
}
return !empty($Output) ? $Output : false;
}
else {
foreach ($List as $Current) {
$Split = preg_split('[ ]', $Current, 9, PREG_SPLIT_NO_EMPTY);
if ($Split[0] != 'total') {
$Output[$i]['isdir'] = ($Split[0][0] === 'd'); // curly-brace string offsets were removed in PHP 8
$Output[$i]['perms'] = $Split[0];
$Output[$i]['number'] = $Split[1];
$Output[$i]['owner'] = $Split[2];
$Output[$i]['group'] = $Split[3];
$Output[$i]['size'] = $Split[4];
$Output[$i]['month'] = $Split[5];
$Output[$i]['day'] = $Split[6];
$Output[$i]['time/year'] = $Split[7];
$Output[$i]['name'] = $Split[8];
$i++;
}
}
return !empty($Output) ? $Output : FALSE;
}
}
Usage
// connect to ftp server
$res_ftp_stream = ftp_connect('my_server_ip');
// login with username/password
$login_result = ftp_login($res_ftp_stream, 'my_user_name', 'my_password');
// get the file list for the current directory
$buff = ftp_rawlist($res_ftp_stream, '/');
// parse ftp_rawlist output
$result = parse_ftp_rawlist($buff, false);
// dump result
var_dump($result);
// close ftp connection
ftp_close($res_ftp_stream);
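If the server supports the MLSD command and PHP 7.2 or later is available, ftp_mlsd() may be a simpler route, since each entry it returns carries a 'type' fact (for example "dir" or "file") alongside the name. The sketch below only shows the sorting step, run against entries shaped like ftp_mlsd() output; the helper name split_mlsd_entries is hypothetical, and whether a given server supports MLSD is an assumption to verify.

```php
<?php
// Hypothetical helper: splits ftp_mlsd()-style entries into folders and files.
function split_mlsd_entries(array $entries)
{
    $dossiers = array(); // folders
    $fichiers = array(); // files

    foreach ($entries as $entry) {
        $type = isset($entry['type']) ? strtolower($entry['type']) : '';
        if ($type === 'dir') {
            $dossiers[] = $entry['name'];
        } elseif ($type === 'file') {
            $fichiers[] = $entry['name'];
        }
        // "cdir" and "pdir" (current and parent directory) are ignored.
    }

    return array($dossiers, $fichiers);
}

// With a live connection (not run here):
// $entries = ftp_mlsd($connection, $rep);
// if ($entries !== false) {
//     list($dossiers, $fichiers) = split_mlsd_entries($entries);
// }
```

Because the type comes from the server rather than from the size column, an empty file is still reported as a file.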
I'm working on a CSV filter again. It searches through about 500 lines of promotional codes and returns the matching amount to an AJAX receiver. The weird thing is that if I enter only 2 letters, instead of looking for an exact match, the PHP script returns a result as soon as it finds a value that contains the letters I entered. I need it to look only for an exact match of the 4-character value.
Here's my code so far:
<?php
// if data are received via POST, with index of 'test'
if (isset($_POST['test'])) {
$promocodevalid = false;
$file = fopen('test.csv', 'r');
$coupon = array($_POST['test']);
$coupondef = $_POST['test']; // get data
$coupon = array_map('preg_quote', $coupon);
$regex = '/'.implode('|', $coupon).'/i';
while (($line = fgetcsv($file)) !== FALSE) {
list($promocode, $amount) = $line;
if(preg_match($regex, $promocode)) {
$validity = 1;
echo $amount."[BRK]".$promocode."[BRK]".$validity;
$promocodevalid = true;
break;
}
}
if(!$promocodevalid) {
$validity = 0;
echo $amount."[BRK]".$promocode."[BRK]".$validity;
}
}
?>
Try to avoid regexes where they are not needed. Look for the str* function that fits instead.
The code above should look like this:
if (isset($_POST['test'])) {
$promocodevalid = false;
$file = fopen('test.csv', 'r');
$coupondef = $_POST['test']; // get data
while (($line = fgetcsv($file)) !== FALSE) {
list($promocode, $amount) = $line;
// remove the strtolower around $promocode if your promo codes are already lowercase,
// but you will probably want to keep $coupondef lowercased.
if(strpos(strtolower($promocode), strtolower($coupondef)) === 0) {
$validity = 1;
echo $amount."[BRK]".$promocode."[BRK]".$validity;
$promocodevalid = true;
break;
}
}
if(!$promocodevalid) {
$validity = 0;
echo $amount."[BRK]".$promocode."[BRK]".$validity;
}
}
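Note that strpos(...) === 0 only checks that the promo code starts with the entered text, so a 2-letter input can still match. If an exact, case-insensitive match is what is needed, strcasecmp() is one option. Below is a minimal sketch under that assumption; the helper name find_promocode is hypothetical, and the CSV is assumed to hold the promo code in the first column and the amount in the second.

```php
<?php
// Hypothetical helper: returns array(promocode, amount) for an exact,
// case-insensitive match, or null when no row matches.
function find_promocode($csvPath, $input)
{
    $file = fopen($csvPath, 'r');
    if ($file === false) {
        return null;
    }

    $match = null;
    // Delimiter, enclosure and escape are passed explicitly.
    while (($line = fgetcsv($file, 0, ',', '"', '\\')) !== false) {
        if (count($line) < 2) {
            continue; // skip blank or malformed rows
        }
        list($promocode, $amount) = $line;
        // strcasecmp() returns 0 only when both strings are equal ignoring case.
        if (strcasecmp(trim($promocode), trim($input)) === 0) {
            $match = array($promocode, $amount);
            break;
        }
    }
    fclose($file);

    return $match;
}
```

The calling code can then echo the amount and validity based on whether the result is null, which also avoids printing $amount and $promocode from an unrelated last row when nothing matched.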