I'm trying to move to the next page, and for that I wrote a while(true) loop, but it isn't working properly. It gives me no error or anything, it just does nothing.
This is the website link: https://suumo.jp/ms/shinchiku/osaka/sa_osaka/pnz11.html I'm trying to add 1 to the page number on each iteration.
$startID = 1;
while(true) {
$url = "https://suumo.jp/ms/shinchiku/osaka/sa_osaka/pnz1".$startID.".html";
$html = @file_get_contents($url);
if($http_response_header[0] == 'HTTP/1.1 200 OK') {
libxml_use_internal_errors(true);
$parser = new \DOMDocument();
$parser->loadHTML($html);
And end of the code.
$a = $startID+1;
} else {
$this->error("Next page is not found!");
}
By the way, I am scraping the first page with no problem, but it doesn't go to the next page. Any idea why that is happening?
You are not incrementing $startID; you have $a = $startID+1, which leaves $startID unchanged. So on each iteration of your loop $startID is equal to 1. To fix it, you need to increment $startID itself with one of:
$startID += 1;
//or
++$startID;
//or (if you really need $a)
$a = $startID += 1;
And add a break to the else branch, so the loop stops when a page is missing:
} else {
$this->error("Next page is not found!");
break; //exit the loop
}
I should mention that for(;;) is roughly equivalent to while(true). So this:
for($startID=1;;++$startID){ ... }
Is roughly equivalent to all this:
$startID = 1;
while(true){
// ... loop body ...
++$startID;
}
Except it's much prettier, in my opinion. I feel like a lot of coders overlook for in PHP; all three of its expressions are actually optional, too.
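Putting the pieces together, here is a minimal sketch of the corrected pagination loop written with for. The names scrapePages, $fetchPage and $maxPages are hypothetical and not from the question: $fetchPage stands in for the file_get_contents() call plus the status check, and is assumed to return null when a page is missing. The $maxPages cap is an extra safety net I added so the loop cannot run forever.

```php
<?php
// Sketch only: $fetchPage is a hypothetical callable that returns the HTML
// of a page, or null when that page does not exist.
function scrapePages(callable $fetchPage, int $maxPages = 1000): array
{
    $pages = [];
    // The counter is incremented by the loop itself, so it cannot be forgotten.
    for ($startID = 1; $startID <= $maxPages; ++$startID) {
        $html = $fetchPage($startID);
        if ($html === null) {
            break; // next page not found: exit the loop
        }
        $pages[$startID] = $html;
    }
    return $pages;
}

// Usage with a fake fetcher that only knows three pages.
$fake = function (int $id): ?string {
    return $id <= 3 ? "<html>page $id</html>" : null;
};
$result = scrapePages($fake);
echo count($result), " pages fetched\n";
```

In real code the callable would wrap the request and return null whenever the response is not a 200.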
Enjoy.
Related
I have been trying to parse statements/questions from a forum site using a DOM parser.
It's working fine: it extracts all the statements in the forum. So I tried to limit the number of statements extracted using an if condition, but the loop still doesn't stop.
I thought the problem was in how the if condition was structured, so I ran the loop separately, and it worked.
The code is as follows:
<?php
$i = 1;
$elementCount=0;
while(true){
require_once('dom/simple_html_dom.php');
$html = file_get_html('http://www.usmleforum.com/forum/index.php?forum=1&Page='.$i);
foreach ($html->find("tr") as $row) {
$element = $row->find('td.FootNotes2',0);
if ($element == null) { continue; }
$textNode = array_filter($element->nodes, function ($n) {
return $n->nodetype == 3; //Text node type, like in jQuery
});
if (!empty($textNode)) {
$text = current($textNode);
echo $text."<br>";
$elementCount++;
}
}
if($elementCount==12){
break;
}
$i++;
}
?>
So, even after adding the if condition for 12 statements, it still runs forever.
Now the if condition alone:
<?php
$i = 1;
$elementCount=0;
while(true){
echo "harish".$i."<br>";
$elementCount++;
if($elementCount==12){
break;
}
$i++;
}
?>
It works fine and prints only the 12 given statements.
Any help is appreciated...
I'm not sure you are incrementing $elementCount correctly, but since I cannot see the output of your code I can't tell.
I would move the break statement before the $elementCount++ and also echo $elementCount to figure out what's really going on.
The resulting code would be like this:
<?php
$i = 1;
$elementCount=0;
while(true){
require_once('dom/simple_html_dom.php');
$html = file_get_html('http://www.usmleforum.com/forum/index.php?forum=1&Page='.$i);
foreach ($html->find("tr") as $row) {
$element = $row->find('td.FootNotes2',0);
if ($element == null) { continue; }
$textNode = array_filter($element->nodes, function ($n) {
return $n->nodetype == 3; //Text node type, like in jQuery
});
if (!empty($textNode)) {
$text = current($textNode);
echo $text."<br>";
echo $elementCount;
if($elementCount===12){
break 2; // exit both the foreach and the surrounding while
}
$elementCount++;
}
}
$i++;
}
?>
You should structure your code better. Then you would see that you increment $elementCount inside the foreach.
Because you check for the exact value of $elementCount outside the foreach, the counter can rise above 12 within a single page; the == 12 check then never matches and the loop runs forever.
You could just change the loop to
while($elementCount < 12) {...}
so that you check for fewer than 12 rather than exactly 12, or change the condition to:
if ($elementCount >= 12) {...}
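As a minimal sketch of that idea, the function below collects at most a fixed number of statements across several pages. It is not the asker's scraper: collectStatements and the $pages array (one array of row texts per page) are hypothetical stand-ins for the file_get_html() calls, so the example runs without network access.

```php
<?php
// Sketch only: $pages is a hypothetical array of pages, each page being
// an array of already-extracted row texts (empty string = no text node).
function collectStatements(array $pages, int $limit): array
{
    $collected = [];
    foreach ($pages as $rows) {
        foreach ($rows as $text) {
            if ($text === '') {
                continue; // row without a usable text node
            }
            $collected[] = $text;
            // ">=" rather than "==" so overshooting can never be missed,
            // and "break 2" to leave both loops at once.
            if (count($collected) >= $limit) {
                break 2;
            }
        }
    }
    return $collected;
}

// Usage: three pages of five rows each, limit of 12.
$pages = array_fill(0, 3, ['a', 'b', 'c', 'd', 'e']);
$statements = collectStatements($pages, 12);
echo count($statements), "\n";
```

Checking the limit inside the inner loop means the count can never jump past the limit between two checks.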
I'm using json_decode to parse JSON files. In a for loop, I attempt to capture specific cases in the JSON in which one element or another exists. I've implemented a function that seems to fit my needs, but I find that I need to use two for loops to get it to catch both of my cases.
I would rather use a single loop, if that's possible, but I'm stuck on how to get both cases caught in a single pass. Here's a mockup of what I would like the result to look like:
<?php
function extract($thisfile){
$test = implode("", file($thisfile));
$obj = json_decode($test, true);
for ($i = 0; $i <= sizeof($obj['patcher']['boxes']); $i ++) {
//this is sometimes found 2nd
if ($obj['patcher']['boxes'][$i]['box']['name'] == "mystring1") {
}
//this is sometimes found 1st
if ($obj['patcher']['boxes'][$i]['box']['name'] == "mystring2") {
}
}
}
?>
Can anyone tell me how I could catch both cases outlined above within a single iteration?
I clearly could not do something like
if ($obj['patcher']['boxes'][$i]['box']['name'] == "string1" && $obj['patcher']['boxes'][$i]['box']['name'] == "string2") {}
...because that condition would never be met.
Generally, what I do when I have raw data in an order that isn't ideal to work with is run a first loop pass to generate a list of indexes, which I then pass through a second time.
So a quick example from your code:
<?php
function extract($thisfile){
$test = implode("", file($thisfile));
$obj = json_decode($test, true);
$index_mystring2 = array(); //Your list of indexes for the second condition
//1st loop.
for ($i = 0; $i < sizeof($obj['patcher']['boxes']); $i++) {
$box_name = $obj['patcher']['boxes'][$i]['box']['name'];
if ($box_name == "mystring1") {
//Do your code here for condition 1
}
if ($box_name == "mystring2") {
//We push the index onto an array for a later loop.
array_push($index_mystring2, $i);
}
}
//2nd loop
for ($j = 0; $j < sizeof($index_mystring2); $j++) {
//Your code here. Note that $obj['patcher']['boxes'][$index_mystring2[$j]]
// refers to the matching box in your decoded JSON tree
}
}
?>
Granted, you can do this in more generic ways so it's cleaner (i.e., collect the indexes for both the first and second conditions), but I think you get the idea :)
I found that something like what @Jon had mentioned is probably the best way to attack this problem, for me at least:
<?php
function extract($thisfile){
$test = implode("", file($thisfile));
$obj = json_decode($test, true);
$found1 = $found2 = false;
for ($i = 0; $i < sizeof($obj['patcher']['boxes']); $i++) {
//this is sometimes found 2nd
if ($obj['patcher']['boxes'][$i]['box']['name'] == "mystring1") {
$found1 = true;
}
//this is sometimes found 1st
if ($obj['patcher']['boxes'][$i]['box']['name'] == "mystring2") {
$found2 = true;
}
if ($found1 && $found2){
break;
}
}
}
?>
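For comparison, here is a hedged sketch of the same single-pass idea that also records where each name was found. The function name findBoxes and the $wanted parameter are my own; the function is deliberately not called extract, because extract() is already a built-in PHP function and cannot be redeclared. The JSON shape (patcher, boxes, box, name) is taken from the question.

```php
<?php
// Sketch only: returns [name => index] for each wanted name that occurs,
// stopping as soon as every wanted name has been seen once.
function findBoxes(string $json, array $wanted): array
{
    $obj = json_decode($json, true);
    $found = [];
    foreach ($obj['patcher']['boxes'] ?? [] as $i => $entry) {
        $name = $entry['box']['name'] ?? null;
        if (in_array($name, $wanted, true) && !isset($found[$name])) {
            $found[$name] = $i; // remember the first position of this name
        }
        if (count($found) === count($wanted)) {
            break; // both cases caught, no need to keep looping
        }
    }
    return $found;
}

// Usage: "mystring2" happens to come first in this sample.
$json = '{"patcher":{"boxes":['
      . '{"box":{"name":"mystring2"}},'
      . '{"box":{"name":"other"}},'
      . '{"box":{"name":"mystring1"}}]}}';
$found = findBoxes($json, ['mystring1', 'mystring2']);
print_r($found);
```

Using foreach avoids the off-by-one risk of indexing with $i <= sizeof(...).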
I'm trying to return true when my loop has finished, but it does not seem to happen.
I can get it to echo true or false or any text, but returning does nothing.
I wonder if anyone could explain why this is.
Here is (roughly) the function; I have removed the database calls and such, as they're not important.
function loop_me(){
// this part is not important...
$finished = false;
$done = 0;
$userC = 1000;
$page = 0;
$count = 10;
$array = array();
$data = array('1','2','3'); // big array of data...
if($done < $userC){
for($i=0; $i<$count; $i++){
$array[] = $data[$i];
}
// below is the important part...
if($done >= $userC){
$finished = true;
}else{
$page++;
loop_me();
}
}
if($finished){
// If I echo true it outputs 1 (this is fine)
// if I return true I get nothing; this is not good, as I want to do an IF statement on the
// output, which I can't do if it does not return anything.
echo(true);
}
}
OK, so the function with the issue is above. Just to help you out, the basic idea of the function is that it loops through an array of data (not shown above). This data is paginated, so it needs to go to the next 'page' once it's finished with the first, and there are a few pages. What I want is for it to return true when it has finished looping through all of them.
Might be a simple fix.
But I can't work it out.
You call loop_me() recursively, but you also need to return its result; otherwise the value returned by the inner call is discarded.
}else{
$page++;
return loop_me();
}
And of course change echo to return too!
Change your echo(true); to return true; and then call your function:
$var = loop_me();
echo $var; // On success this prints 1, which is how echo renders true.
You should also consider adding a return false for the case where something goes wrong inside the function.
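A minimal sketch of the pattern, assuming the paginated data is available as an array of pages. The names processPages, $pages and $seen are hypothetical and replace the database calls that were left out of the question. The key point is that every recursive call is itself returned, so the final true travels back up to the original caller.

```php
<?php
// Sketch only: walks the pages recursively and returns true once
// every page has been processed.
function processPages(array $pages, array &$seen, int $page = 0): bool
{
    if ($page >= count($pages)) {
        return true; // nothing left to process: report success
    }
    foreach ($pages[$page] as $item) {
        $seen[] = $item; // stand-in for the real per-item work
    }
    // Return the result of the recursive call instead of dropping it.
    return processPages($pages, $seen, $page + 1);
}

// Usage: the return value can now be used in an if statement.
$seen = [];
$finished = processPages([['1', '2'], ['3'], ['4', '5']], $seen);
if ($finished) {
    echo "done, ", count($seen), " items\n";
}
```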
For some reason $pos is always < 0. The indexOf function works great; I use it in other code without problems.
For some reason, even after I add the element with array_push($groups, $tempDon); the next iteration of the loop still returns -1.
$donations = $this->getInstitutionDonations($post->ID);
$groups=array();
foreach( $donations as $don ) : setup_postdata($don);
$pos = $this->indexOf($don, $groups);
print_r($pos);
if($pos < 0)
{
$tempDom = $don;
$tempDon->count = 1;
array_push($groups, $tempDon);
}
else
{
$tempDom = $groups[$pos];
$tempDon->count++;
array_splice($tempDon);
array_push($groups, $tempDon);
echo '<br><br><br>ahhhhhhhhhh<br><br>';
}
endforeach;
protected function indexOf($needle, $haystack) { // conversion of JavaScripts most awesome
for ($i=0;$i<count($haystack);$i++) { // indexOf function. Searches an array for
if ($haystack[$i] == $needle) { // a value and returns the index of the *first*
return $i; // occurance
}
}
return -1;
}
This looks like an issue of poor proofreading to me (note $tempDom vs $tempDon):
$tempDom = $don;
$tempDon->count = 1;
array_push($groups, $tempDon);
Your else block has similar issues.
I also completely agree with @hakre's comment regarding syntax inconsistencies.
EDIT
I'd also like to recommend that you make use of PHP's built-in array_search function in the body of your indexOf method rather than rolling your own.
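A minimal sketch of what that could look like, written as a plain function rather than a method so it runs on its own. It assumes $haystack is a list with integer keys, as $groups is in the question. array_search() returns false when nothing matches, so the result is mapped to -1 to keep the JavaScript-style contract.

```php
<?php
// Sketch only: indexOf built on array_search(). The default (loose)
// comparison matches the == used in the hand-rolled loop.
function indexOf($needle, array $haystack): int
{
    $key = array_search($needle, $haystack);
    // Compare with === because a match at index 0 is also "falsy".
    return $key === false ? -1 : $key;
}

// Usage
$letters = ['a', 'b', 'c'];
echo indexOf('b', $letters), "\n";
echo indexOf('z', $letters), "\n";
```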
I have written a script that uses goto, but the server where I want to run it has an older PHP version (<5.3), so I have to change the code. The structure of the code looks like this:
for($i = 0; $i < 30; $i++) // print 30 articles
{
$x = 0;
// choose a feed from the db
// parse it
a:
foreach($feed->get_items($x, 1) as $item)
{
// create a unique id for the article of the feed
if($id == $dbid)
{
// if this id exists in the db, take the next article of the same feed which is not in the db
$x++;
goto a;
}
else
{
// print the original article you grabbed
}
} // end of foreach
} // end of for
I have tested everything. Do you have any ideas on how I can rewrite this code without goto so that it runs properly?
This question demonstrates why goto should be avoided. It lets you get away without thinking about the algorithm enough.
The standard way to do this is with a flag. I hope you were not expecting a "herezthecode kthxbai" sort of an answer, but in this case the best way to explain it would be to write the code -
for($i=0;$i<30;$i++){
$x=0;
do {
$found = false;
foreach($feed->get_items($x,1) as $item){
// get $id
if($id==$dbid){
$found = true;
break;
}else{
// other things
}
}
$x++;
} while($found);
}
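To make the flag pattern concrete, here is a self-contained sketch that can be run without a feed library. Everything in it is an assumption for illustration: firstNewItem is a hypothetical name, $feedItems is a plain array standing in for what $feed->get_items($x, 1) would return one item at a time, and $knownIds stands in for the ids already stored in the database.

```php
<?php
// Sketch only: returns the first item whose id is not yet known,
// or null when every item in the feed is already stored.
function firstNewItem(array $feedItems, array $knownIds): ?array
{
    $x = 0;
    do {
        $found = false;
        $item = $feedItems[$x] ?? null; // stand-in for get_items($x, 1)
        if ($item !== null && in_array($item['id'], $knownIds, true)) {
            $found = true; // already in the db: try the next article
            $x++;
        }
    } while ($found);
    return $item;
}

// Usage: ids 1 and 2 are already stored, so the third article is returned.
$items = [
    ['id' => 1, 'title' => 'first'],
    ['id' => 2, 'title' => 'second'],
    ['id' => 3, 'title' => 'third'],
];
$next = firstNewItem($items, [1, 2]);
echo $next['title'], "\n";
```

The do/while plays the role of the label, and setting the flag plays the role of the goto.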
Without knowing how the ->get_items() call behaves you could use this brute-force method in lieu of the goto-switch:
for($i = 0; $i < 30; $i++)
{
$x = 0;
$a = 1;
while ($a--)
foreach($feed->get_items($x, 1) as $item)
{
if($id == $dbid)
{
$x++;
$a=1; break;
}
else
{
}
} // end of foreach
} // end of for
The label gets replaced by a while with a self-fulfilling stop condition, and the goto becomes a break that resets the $a stop condition.
Something like this would probably work...
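function loop($feed, $x){
    // $feed and $x are passed in because a function cannot see the caller's local variables
    foreach($feed->get_items($x,1) as $item){
        // get $id and $dbid for $item here
        if($id==$dbid){
            loop($feed, $x+1);
        }else{
        }
    }
}
for($i=0;$i<30;$i++){
    loop($feed, 0);
}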
I'm sorry, I removed all the comments, they were annoying.
Move the declaration of $x outside of the for loop and replace your label/goto combination with a break, like so...
$x=0;
for($i=0;$i<30;$i++) //print 30 articles
{
foreach($feed->get_items($x,1) as $item)
{
// create a unique id for the article of the feed
if($id==$dbid)
{
//if this id exists in the db,take the next article of the same feed which is not in the db
$x++;
continue;
}
else
{
//print the original article you grabbed
}
} // end of foreach
}//end of for
I agree with unset: using break exits the foreach loop, and the for loop keeps iterating.