I have a URL with a paging system.
For instance, https://myURL?p=50
I want a script that finds the last available page, say p=187.
I have a function checkEmpty() that tells me whether the page is empty or not.
So for instance:
$myUrl = new URL(50); //https://myURL?p=50
$myUrl->checkEmpty();
//This evaluates to false -> the page exists
$myUrl = new URL(188); //https://myURL?p=188
$myUrl->checkEmpty();
//This evaluates to true -> the page does NOT exist
$myUrl = new URL(187); //https://myURL?p=187
$myUrl->checkEmpty();
//This evaluates to false -> the page exists
I wrote a naive algorithm that, as you might guess, performs too many requests.
My question is:
What algorithm would find the last page with the fewest requests?
EDIT
As requested in the comments, here is the checkEmpty() implementation:
<?php
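public function checkEmpty() : bool
{
    // Marker text the site prints on an empty page (French for "No content available")
    $criteria = "Aucun contenu disponible";
    // strstr() returns false when the marker is absent, so the comparison is the result
    return strstr($this->replace_carriage_return(" ", $this->getHtml()), $criteria) !== false;
}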
Since the upper bound is not known, start at page 1 and double the page number on every step. As soon as you hit a non-existent page, run a binary search between the last existing page + 1 and this new upper bound, where the page doesn't exist.
This way, you get your answer in O(log(n)) requests, where n is the number of existing pages.
<?php
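$lowerBound = 1;
$upperBound = 1;

// Exponential phase: double the page number until an empty page is found.
while (true) {
    $myUrl = new URL($upperBound);
    if ($myUrl->checkEmpty()) {
        break;
    }
    $lowerBound = $upperBound + 1;
    $upperBound <<= 1;
}

// Page $lowerBound - 1 is the last one known to exist (0 if page 1 is empty).
$ans = $lowerBound - 1;

// Binary search for the last existing page in [$lowerBound, $upperBound].
while ($lowerBound <= $upperBound) {
    $mid = $lowerBound + (($upperBound - $lowerBound) >> 1);
    $myUrl = new URL($mid);
    if ($myUrl->checkEmpty()) {
        $upperBound = $mid - 1;
    } else {
        $ans = $mid; // $mid exists, so it is the best candidate so far
        $lowerBound = $mid + 1;
    }
}

echo $ans; // last existing page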
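To try the approach without hitting the network, the same search can be wrapped in a function that receives the emptiness check as a callable. This is only a sketch: findLastPage() and $isEmpty are hypothetical names standing in for the URL class and checkEmpty() from the question, and it assumes pages are contiguous (every page up to the last one exists).

```php
<?php
/**
 * Returns the last existing page, or 0 if page 1 is already empty.
 * $isEmpty is a hypothetical callable standing in for checkEmpty().
 */
function findLastPage(callable $isEmpty): int
{
    $low = 1;
    $high = 1;

    // Exponential phase: double $high until an empty page is found.
    while (!$isEmpty($high)) {
        $low = $high + 1;
        $high *= 2;
    }

    // $high is known to be empty, so only pages below it need checking.
    $high -= 1;
    $ans = $low - 1;

    // Binary search for the last existing page in [$low, $high].
    while ($low <= $high) {
        $mid = intdiv($low + $high, 2);
        if ($isEmpty($mid)) {
            $high = $mid - 1;
        } else {
            $ans = $mid;
            $low = $mid + 1;
        }
    }

    return $ans;
}

// Fake site with 187 pages; count how many "requests" the search makes.
$requests = 0;
$isEmpty = function (int $page) use (&$requests): bool {
    $requests++;
    return $page > 187;
};

$lastPage = findLastPage($isEmpty);
echo "last page: $lastPage, requests: $requests\n";
```

Passing the check as a callable keeps the search logic separate from the HTTP code, which makes it easy to count requests or test edge cases such as a site with no pages at all.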
I have code that shows in-article ads after a specified number of words. The problem is:
If I write a very long article, the ad gets lost in the length of the text, so most of the page shows only text. What I need is 1 or 2 ads that repeat indefinitely, for example every 4 paragraphs and 250 words, depending on the article length.
HERE'S AN EXAMPLE:
This blog does exactly what I'm trying to achieve. As you scroll through the article, you'll see more and more ads load between the paragraphs.
THIS IS MY CURRENT CODE:
// Insert ads after a number of words and after the </p> closing tag.
// https://stackoverflow.com/questions/42801541/insert-text-in-content-after-300-words-but-after-closing-tag-of-a-paragraph
function anunciamentosegundo($content) {
$ad_code = '<script type="application/javascript">Adsense code goes here</script>';
// only inject google ads if post is longer than 1800 characters
$enable_length1 = 1800;
// Insert at the end of the paragraph every 400 words
$after_word1 = 400;
// Maximum of 2 ads
$max_ads = 2;
if (strlen($content) > $enable_length1) {
$len = strlen($content);
$i=0;
// Keep adding until end of content or $max_ads number of ads has been inserted
while($i<$len && $max_ads-->0) {
// Work our way until the appropriate length
$word_cout = 0;
$in_tag = false;
while(++$i < $len && $word_cout < $after_word1) {
if(!$in_tag && ctype_space($content[$i])) {
// Whitespace
$word_cout++;
}
else if(!$in_tag && $content[$i] == '<') {
// Begin tag
$in_tag = true;
$word_cout++;
}
else if($in_tag && $content[$i] == '>') {
// End tag
$in_tag = false;
}
}
// Find the next '</p>'
$i = strpos($content, "</p>", $i);
if($i === false) {
// No more paragraph endings
break;
}
else {
// Add the length of </p>
$i += 4;
// Get ad as string
ob_start();
echo $ad_code ; //would normally get printed to the screen/output to browser
$ad = ob_get_contents();
ob_end_clean();
$content = substr($content, 0, $i) . $ad . substr($content, $i);
// Set the correct i
$i+= strlen($ad);
}
}
}
return $content;
}
add_filter( 'the_content', 'anunciamentosegundo' );
Currently I can show ads after a given number of words and paragraphs, but not every x words and paragraphs. What should I do?
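One possible approach, sketched under assumptions rather than offered as a tested WordPress solution: split the content after each closing </p>, keep running counts of paragraphs and words, and insert the ad whenever both thresholds have been reached since the previous ad. The function name insertRepeatingAds(), its parameters, and the default thresholds are all hypothetical.

```php
<?php
/**
 * Inserts $adCode after a closing </p> whenever at least $everyParagraphs
 * paragraphs and $everyWords words have passed since the previous ad.
 * All names and default thresholds are illustrative assumptions.
 */
function insertRepeatingAds(
    string $content,
    string $adCode,
    int $everyParagraphs = 4,
    int $everyWords = 250
): string {
    // Split right after each closing </p>, keeping the tag with its paragraph.
    $parts = preg_split('#(?<=</p>)#i', $content, -1, PREG_SPLIT_NO_EMPTY);
    $total = count($parts);

    $out = '';
    $paragraphs = 0;
    $words = 0;

    foreach ($parts as $index => $part) {
        $out .= $part;

        // Only chunks that really end a paragraph are counted.
        if (stripos($part, '</p>') === false) {
            continue;
        }

        $paragraphs++;
        // Count whitespace-separated words in the visible text of the chunk.
        $words += count(preg_split('/\s+/', strip_tags($part), -1, PREG_SPLIT_NO_EMPTY));

        // Skip the final chunk so the article does not end with an ad.
        $isLast = ($index === $total - 1);
        if (!$isLast && $paragraphs >= $everyParagraphs && $words >= $everyWords) {
            $out .= $adCode;
            $paragraphs = 0;
            $words = 0;
        }
    }

    return $out;
}
```

Because the counters reset after every insertion, the number of ads grows with the article length instead of stopping at a fixed maximum. In WordPress this could be called from the existing the_content filter callback in place of the character-by-character loop.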
I would like to scrape Google search results up to page 2, but my site returns either a blank page or a timeout.
for($j=0; $j<$acount; $j++){
sleep(60);
for($sp = 0; $sp <= 10; $sp+=10){
$url = 'http://www.google.'.$lang.'/search?q='.$in.'&start='.$sp;
if($sp == 10){
$datenbank = "proxy_work.php";
$datei = fopen($datenbank,"a+");
fwrite($datei, $data);
fwrite ($datei,"\r\n");
fclose($datei);
} else {
$datenbank = "proxy_work.php";
$datei = fopen($datenbank,"w+");
fwrite($datei, $data);
fwrite ($datei,"\r\n");
fclose($datei);
}
}
$html = file_get_html("proxy_work.php");
foreach($html->find('a') as $e){
// $title = $h3->innertext;
$link = $e->href;
if(in_array($endomain, $approveurl)){
}
// if it is not a direct link but url reference found inside it, then extract
if (!preg_match('/^https?/', $link) && preg_match('/q=(.+)&sa=/U', $link, $matches) && preg_match('/^https?/', $matches[1])) {
$link = $matches[1];
} else if (!preg_match('/^https?/', $link)) { // skip if it is not a valid link
continue;
}
}
}
Google search result pages (SERPs) are not like a common website with static HTML. Google protects its data from web scraping. Treat its data like a business directory and consider the following tips for scraping business directories:
IP-proxying.
Imitating human behaviour by using some browser automation tools (Selenium, iMacros and others).
I have a function that scrapes data with curl using a list of proxies. It selects a random proxy each time the function is called. However, sometimes a proxy can fail or time out.
When the connection fails or times out, I would like to repeat the function up to 3 times until the data is returned.
The way I would like to test if the connection is bad is by checking if a string exists in the output like this:
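$check = stripos($page, 'string_to_check');
// stripos() returns false when the string is missing and 0 when it is at the very start
if ($check !== false) {
    return $page; // String found. Return scraped data.
}
else {
    // String not found. Loop the script
}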
How would I get the whole function code to repeat if the string doesn't exist?
$max_tries = 3;
$success = false;
// try up to 3 times
for ($i = 0; $i < $max_tries; $i++) {
    $page = your_scrape_function();
    // stripos() returns false when the string is missing and 0 when it is at the very start
    $check = stripos($page, 'string_to_check');
    if ($check !== false) {
        $success = true;
        break; // String found. Break loop.
    }
}
// double check that the string was actually found and you didn't just exceed $max_tries
if (!$success) {
    die('Error: String not found or scrape unsuccessful.');
}
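The same loop can be packaged as a small helper so the retry logic lives in one place. This is a minimal sketch: fetchWithRetry(), $fetch, and $needle are hypothetical names, and the fake scraper below only stands in for the real curl function.

```php
<?php
/**
 * Calls $fetch up to $maxTries times and returns the first result that
 * contains $needle, or null if no attempt succeeded.
 * fetchWithRetry(), $fetch and $needle are hypothetical names for this sketch.
 */
function fetchWithRetry(callable $fetch, string $needle, int $maxTries = 3): ?string
{
    for ($attempt = 1; $attempt <= $maxTries; $attempt++) {
        $page = $fetch();
        // A failed request may return false, so check the type before searching.
        if (is_string($page) && stripos($page, $needle) !== false) {
            return $page;
        }
    }

    return null;
}

// Usage with a fake scraper that fails twice before succeeding.
$calls = 0;
$fakeScrape = function () use (&$calls) {
    $calls++;
    return $calls < 3 ? false : 'string_to_check and the rest of the page';
};

$page = fetchWithRetry($fakeScrape, 'string_to_check');
if ($page === null) {
    echo "Scrape failed after all attempts\n";
}
```

Returning null instead of calling die() leaves the decision about how to handle a failed scrape to the caller.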
I need to telnet to a Cisco switch using PHP, execute the show interface status command, and get the results. I tried some PHP classes I found on the internet, but none of them could connect to the device. So I tried to write the script myself, but I have the same problem: I can't connect to the device.
The host sends me a banner message and then a new line with username:.
I send my username with \r\n, wait some time, and try to read data, but it looks as if the host is just ignoring my newline characters. This is the response I got (explode('\n') on the response):
Array
(
[0] => %
[1] => User Access Verification
[2] => Username: timeout expired!
)
Why didn't I get the password prompt? I tried both with and without sending telnet headers, with no change. Can anyone please help me?
Here is my code
<?php
$host = "switchName";
$name = "name";
$pass = "pass";
$port = 23;
$timeOut = 15;
$connected = false;
$skipNullLines = true;
$timeout = 125000;
$header1=chr(0xFF).chr(0xFB).chr(0x1F).chr(0xFF).chr(0xFB).chr(0x20).chr(0xFF).chr(0xFB).chr(0x18).chr(0xFF).chr(0xFB).chr(0x27).chr(0xFF).chr(0xFD).chr(0x01).chr(0xFF).chr(0xFB).chr(0x03).chr(0xFF).chr(0xFD).chr(0x03).chr(0xFF).chr(0xFC).chr(0x23).chr(0xFF).chr(0xFC).chr(0x24).chr(0xFF).chr(0xFA).chr(0x1F).chr(0x00).chr(0x50).chr(0x00).chr(0x18).chr(0xFF).chr(0xF0).chr(0xFF).chr(0xFA).chr(0x20).chr(0x00).chr(0x33).chr(0x38).chr(0x34).chr(0x30).chr(0x30).chr(0x2C).chr(0x33).chr(0x38).chr(0x34).chr(0x30).chr(0x30).chr(0xFF).chr(0xF0).chr(0xFF).chr(0xFA).chr(0x27).chr(0x00).chr(0xFF).chr(0xF0).chr(0xFF).chr(0xFA).chr(0x18).chr(0x00).chr(0x41).chr(0x4E).chr(0x53).chr(0x49).chr(0xFF).chr(0xF0);
$header2=chr(0xFF).chr(0xFC).chr(0x01).chr(0xFF).chr(0xFC).chr(0x22).chr(0xFF).chr(0xFE).chr(0x05).chr(0xFF).chr(0xFC).chr(0x21);
function read_string()
{
global $fw,$host,$skipNullLines;
$string = "";
while( !feof($fw) )
{
$read = fgets($fw);
$string .= $read;
// Probably prompt, stop reading
if( strpos($read, ':') !== FALSE || strpos($read, '> (enable)') !== FALSE || strpos($read, $host.'#') !== FALSE)
{ break; }
}
$string = explode("\n", $string);
// Get rid of null lines
$ret = array();
for($i = 0; $i<count($string); $i++)
{
if( trim($string[$i]) == '' && $skipNullLines ) continue;
$ret[] = $string[$i];
}
return $ret;
}
function send_string($string, $force=false)
{
GLOBAL $timeout,$fw;
$string = trim($string);
// execute only strings that are preceded by "show" (if not forced)
if(!$force && strpos($string, 'show ') !== 0)
{
return 1;
}
fputs($fw, $string."\r\n");
echo("SEND:".$string."\r\n");
usleep($timeout);
}
$fw = fsockopen($host, $port, $errno, $errorstr, $timeOut);
if($fw == false)
{
echo("Cant connect");
}
else
{
echo("Connected<br>");
$connected = true;
stream_set_timeout($fw, $timeout);
// fputs($fw, $header1);
// usleep($timeout);
// fputs($fw, $header2);
// usleep($timeout);
print_r(read_string());
send_string("test", true);
print_r(read_string());
}
fclose($fw);
?>
UPDATE
If I send the username first and then read, I get the password prompt. I don't understand why I can't first read the messages from the host and then send my response. The way it works for me now (send the response, then read the prompt) makes no sense! (And I still get the "% Authentication failed." message even with the right name/password.)
...
$connected = true;
stream_set_timeout($fw, $timeout);
send_string("name", true);
send_string("password", true);
print_r(read_string());
...
Okay, so I don't know what the problem was, but after a "few" tests I was able to write this class, which works for me. I don't know why other telnet classes don't work, although they do pretty much the same thing. So if anyone has a similar problem, you can try this:
class TELNET
{
private $host;
private $user;
private $pass;
private $port;
private $connected;
private $connect_timeout;
private $stream_timeout;
private $socket;
public function __construct()
{
$this->port = 23;
$this->connected = false; // connected?
$this->connect_timeout = 10; // timeout while asking for connection
$this->stream_timeout = 380000; // timeout between I/O operations
}
public function __destruct()
{
if($this->connected) { fclose($this->socket); }
}
// Connects to host
// #$_host - address (or hostname) of host
// #$_user - name of user to log in as
// #$_pass - password of user
//
// Return: TRUE on success, otherwise the error string reported by fsockopen()
public function Connect($_host, $_user, $_pass)
{
// If connected successfully
if( ($this->socket = @fsockopen($_host, $this->port, $errno, $errorstr, $this->connect_timeout)) !== FALSE )
{
$this->host = $_host;
$this->user = $_user;
$this->pass = $_pass;
$this->connected = true;
stream_set_timeout($this->socket, 0, 380000);
stream_set_blocking($this->socket, 1);
return true;
}
// otherwise the connection failed
else return $errorstr;
}
// LogIn to host
//
// RETURN: will return true on success, other way returns false
public function LogIn()
{
if(!$this->connected) return false;
// Send name and password
$this->SendString($this->user, true);
$this->SendString($this->pass, true);
// read answer
$data = $this->ReadTo(array('#'));
// did we get the prompt from host?
if( strtolower(trim($data[count($data)-1])) == strtolower($this->host).'#' ) return true;
else return false;
}
// Function will execute command on host and returns output
//
// #$_command - command to be executed, only commands beginning with "show " can be executed, you can change this by adding
// "true" (bool type) as the second argument for function SendString($command) inside this function (3rd line)
//
function GetOutputOf($_command)
{
if(!$this->connected) return false;
$this->SendString($_command);
$output = array();
$work = true;
//
// Read whole output
//
// read_to( array( STRINGS ) ), STRINGS are meant as possible endings of outputs
while( $work && $data = $this->ReadTo( array("--More--","#") ) )
{
// Check whether we actually read any data
$null_data = true;
foreach($data as $line)
{
if(trim($line) != "") {$null_data = false;break;}
}
if($null_data) { break;}
// if device is paging output, send space to get rest
if( trim($data[count($data)-1]) == '--More--')
{
// delete line with prompt (or "--More--")
unset($data[count($data)-1]);
// if second line is blank, delete it
if( trim($data[1]) == '' ) unset($data[1]);
// If first line contains send command, delete it
if( strpos($data[0], $_command)!==FALSE ) unset($data[0]);
// send space
fputs($this->socket, " ");
}
// IF we got the prompt (line ending with #)
// OR the string that we've read has only one line
// THEN we reached the end of the data and stop reading
if( strpos($data[count($data)-1], '#')!==FALSE /* || (count($data) == 1 && $data[0] == "")*/ )
{
// delete line with prompt
unset($data[count($data)-1]);
// if second line is blank, delete it
if( trim($data[1]) == '' ) unset($data[1]);
// If first line contains send command, delete it
if( strpos($data[0], $_command)!==FALSE ) unset($data[0]);
// stop the while loop
$work = false;
}
// get rid of empty lines at the end
for($i = count($data)-1; $i>0; $i--)
{
if(trim($data[$i]) == "") unset($data[$i]);
else break;
}
// add new data to $output
foreach($data as $v)
{ $output[] = $v; }
}
// return output
return $output;
}
// Read from host until the occurrence of any string from $array_of_stops
// #array_of_stops - array that contains strings of texts that may be at the end of output
// RETURNS: output of command as array of lines
function ReadTo($array_of_stops)
{
$ret = array();
$max_empty_lines = 3;
$count_empty_lines = 0;
while( !feof($this->socket) )
{
$read = fgets($this->socket);
$ret[] = $read;
//
// Stop reading after (int)"$max_empty_lines" empty lines
//
if(trim($read) == "")
{
if($count_empty_lines++ > $max_empty_lines) break;
}
else $count_empty_lines = 0;
//
// Does the last line read contain any of the "Stop" strings?
$found = false;
foreach($array_of_stops AS $stop)
{
if( strpos($read, $stop) !== FALSE ) { $found = true; break; }
}
// If so, stop reading
if($found) break;
}
return $ret;
}
// Send string to host
// If force is set to false (default), function sends to host only strings that begin with "show "
//
// #$string - command to be executed
// #$force - force command? Execute even if it does not begin with "show "?
// #$newLine - append a newline character at the end of the command?
function SendString($string, $force=false, $newLine=true)
{
$t1 = microtime(true);
$string = trim($string);
// execute only strings that begin with "show "
// and execute only one command (no new line characters) !
if(!$force && (strpos($string, 'show ') !== 0 || strpos($string, "\n") !== false))
{
return 1;
}
if($newLine) $string .= "\n";
fputs($this->socket, $string);
$t2 = microtime(true);
}
}
// EXAMPLE
$host = "hostname";
$name = "username";
$pass = "password";
$t = new TELNET();
echo("CONNECT:".$t->Connect($host, $name, $pass)."<br>");
echo("LOGIN:".(int)$t->LogIn());
echo("<br>OUTPUT:<br>");
print_r($t->GetOutputOf("show snmp"));
print_r($t->GetOutputOf("show users"));
print_r($t->GetOutputOf("show interface status"));
PS: my device's prompt is "hostname#", so you may need to edit the LogIn() function (and possibly GetOutputOf()) to make this code work with your device's prompt.
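One possible explanation for the behaviour in the question, offered as an assumption rather than a confirmed diagnosis: fgets() returns only once it has seen a newline (or on EOF or a stream timeout), and a prompt such as Username: is typically not followed by one, so a line-based read with a long timeout can sit waiting while the device's login timer expires. A minimal sketch of reading byte by byte until the buffer ends with a known prompt follows. readUntilPrompt() and its parameters are hypothetical names, and the demo uses an in-memory stream instead of a real switch.

```php
<?php
/**
 * Reads from $stream one byte at a time until the buffer ends with one of
 * $prompts, the stream ends, or $maxBytes have been read.
 * readUntilPrompt() is a hypothetical helper, not part of the class above.
 */
function readUntilPrompt($stream, array $prompts, int $maxBytes = 65536): string
{
    $buffer = '';

    while (strlen($buffer) < $maxBytes) {
        $char = fgetc($stream);
        if ($char === false) {
            break; // EOF or read timeout
        }
        $buffer .= $char;

        // Stop as soon as the data read so far ends with a known prompt.
        foreach ($prompts as $prompt) {
            if (substr($buffer, -strlen($prompt)) === $prompt) {
                return $buffer;
            }
        }
    }

    return $buffer;
}

// Demo on an in-memory stream that mimics a login banner without a trailing newline.
$stream = fopen('php://memory', 'w+');
fwrite($stream, "\r\nUser Access Verification\r\n\r\nUsername: ");
rewind($stream);

$banner = readUntilPrompt($stream, ['Username: ', 'Password: ']);
fclose($stream);

print_r(explode("\n", $banner));
```

On a real socket the stream timeout still matters, since fgetc() also returns false when no byte arrives in time, so a short timeout set with stream_set_timeout() would be needed alongside this.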