file_get_contents() loads a different page than the browser - PHP

I am trying to find the next-page link of a particular page (which I call the current page here). The current page my program uses is
http://en.wikipedia.org/wiki/Category:1980_births
The next-page link that I extract from the current page is
http://en.wikipedia.org/w/index.php?title=Category:1980_births&pagefrom=Alexis%2C+Toya%0AToya+Alexis#mw-pages
But when file_get_contents() loads the next-page link, it returns the contents of the current page.
The code is
<?php
$string = file_get_contents("http://en.wikipedia.org/wiki/Category:1980_births"); // get the contents of the current page
preg_match_all("/\(previous page\) \(<a href=\"(.*)\" title/", $string, $matches); // extract the next-page link from the current page contents
foreach ($matches[1] as $match) {
break; // keep only the first match
}
$next_page_link = $match;
$next_page_link = "http://en.wikipedia.org" . $next_page_link; // the extracted link is only a path without the domain name, so I add the domain here; this has no effect on the problem
$string1 = file_get_contents($next_page_link);
echo $next_page_link;
echo $string1;
?>
According to the code, $string1 should hold the content of the next-page link, but instead it holds the current page's content.

In the source of the original web page, the links have entity-encoded ampersands (see Do I encode ampersands in <a href…>?). The browser decodes them when you click the anchor, but your scraping code does not. Compare
http://en.wikipedia.org/ ... &amp;pagefrom=Alexis%2C+Toya%0AToya+Alexis#mw-pages
versus
http://en.wikipedia.org ... &pagefrom=Alexis%2C+Toya%0AToya+Alexis#mw-pages
The first, malformed query string is what you are in fact passing into file_get_contents. You can convert the entities back to regular ampersands like this:
// $next_page_link = $match;
$next_page_link = html_entity_decode($match);
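The decoding step can be tried without any network access. The snippet below is a minimal sketch: the `$href` value is an assumed example of what the regex would capture from the page source, built from the path in the question.

```php
<?php
// Assumed example of the captured href: the ampersand is entity-encoded in the HTML source.
$href = '/w/index.php?title=Category:1980_births&amp;pagefrom=Alexis%2C+Toya%0AToya+Alexis#mw-pages';

// Decode HTML entities so "&amp;" becomes the plain "&" that separates query parameters.
$path = html_entity_decode($href);

// Prepend the domain, as in the question's code.
$next_page_link = 'http://en.wikipedia.org' . $path;
echo $next_page_link, "\n";
```

Only the entity is changed; the percent-encoded parts of the query string (%2C, %0A) are left as they are.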

Related

PHP URL Encode - hyperlink variable

This is a link I have as part of my .php page.
<li>Stats</li>
'GET' is used to obtain the variable from the URL of the current page (not Information.php). The URL for the current page is http://localhost/Current.php?AssignedI=AI03#
AssignedI = AI03 (it is a string).
When selecting the link, and the Information.php opens, the URL is displayed as:
Information.php?AI=%27
I understand that %27 is actually ' when decoded into text. However, how can I make it display the actual AssignedI value, i.e. 'AI03'?
On the Information.php, I have the following code:
$AI = $_GET['AI'];
echo $AI;
This outputs '.
I have been trying to figure this out for a while so any help is much appreciated.
AssignedI = AI03
If I understand correctly, the output you want is
Stats
You do not need all those single and double quotes. I think you are mixing server-side code and client-side code.
Server side:
<?php echo $_GET['AssignedI'] ?>
Your server finds all the PHP tags and replaces them with their output in the HTML, so the echo is written into the link before the browser even reads it.
<li>
Stats
</li>
This should output what you want, without the '. Then, once you click the link, in Information.php
$AI = $_GET['AI'];
echo $AI;
should echo AI03 with no quote.
I solved this an alternative way.
This is my edited hyperlink:
<li>Stats</li>
I obtained the current URL, http://localhost/Current.php?AssignedI=AI03#, using $_SERVER['REQUEST_URI'], and took its last four characters, which contain the 'AI' value, using substr(..., -4).

Complex if and else statement with preg_replace function

Good evening. I am a newbie to programming, and I managed to get URLs in text displayed as links using the code below.
<?php
$textorigen = $row_get_tweets['tweet'];
// URL starting with http://
$reg_exUrl = "/(^|\A|\s)((http|https|ftp|ftps)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,4}(\/\S*)?)/";
if(preg_match($reg_exUrl, $textorigen, $url)) {
// make the urls hyper links
$text_result=preg_replace( $reg_exUrl, "$1$2 ", $textorigen );
echo $textorigen=$text_result;
} else {
// if no urls in the text just return the text
echo $text_result=$textorigen;
}
// URL starting www.
$reg_exUrl = "/(^|\A|\s)((www\.)[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,4}(\/\S*)?)/";
if(preg_match($reg_exUrl, $text_result, $url)) {
// make the urls hyper links
$text_result=preg_replace( $reg_exUrl, "$1$2", $text_result );
echo $textorigen=$text_result;
}
?>
So it works fine, but it duplicates my post if the second if statement is true (i.e. www.anything.com), and if a post has no link, it is not displayed at all. I am fairly sure my if statement is wrong and have spent hours trying to fix it. Kindly help me.
I want it to:
1) Display a www link if posted
2) Display an http link if posted
3) Display the post as it is if no http or www link is posted. Thanks
Kindly note that the preg_replace function works perfectly well.
The problem is that every time the program reaches an echo, the post gets printed again.
So my suggestion is to change every echo $textorigen=$text_result; to just $textorigen=$text_result; (and likewise echo $text_result=$textorigen; to $text_result=$textorigen;), and to put a single echo $textorigen; at the very end of all the replacement logic.
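A minimal sketch of that restructuring follows. The helper name `linkify` is hypothetical, and the anchor markup in the replacement strings is an assumption, since the replacement strings in the question appear to have lost their HTML. The patterns are the ones from the question (without the redundant \A). The function only returns the text, and the caller echoes it once.

```php
<?php
// Hypothetical helper: turns URLs into links and returns the text without echoing.
function linkify(string $text): string
{
    // URLs starting with a scheme (pattern from the question).
    $schemeUrl = '/(^|\s)((http|https|ftp|ftps)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,4}(\/\S*)?)/';
    // Assumed anchor markup; $1 keeps the leading whitespace, $2 is the URL.
    $text = preg_replace($schemeUrl, '$1<a href="$2">$2</a>', $text);

    // URLs starting with www. (pattern from the question); a scheme is added to the href.
    $wwwUrl = '/(^|\s)((www\.)[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,4}(\/\S*)?)/';
    $text = preg_replace($wwwUrl, '$1<a href="http://$2">$2</a>', $text);

    // Text without any URL passes through unchanged.
    return $text;
}

// Single echo at the very end, so the post is printed exactly once.
echo linkify('see www.example.com now'), "\n";
```

Because preg_replace returns the subject unchanged when nothing matches, the preg_match checks and the else branch are not needed in this version.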

Creating a redirect page that handles multiple destinations?

I have a website with about 50 pages, each sends the visitor to a different external url, currently with a direct link to the external domain.
I'm rebuilding the website, and want to have the visitor pass first through an internal redirect page.
Instead of building 50 different redirect pages, how can I create one redirect page that would handle all these 50 different destinations?
I guess I can transform the links into
http://mydomain.com/redirect.php?destination=[1/2/3/4/.../50]
But how do I define (in the easiest way) on the redirect.php,
the matching external URL for each of these destination numbers?
preferably with some simple .txt file, like
1 http://external-1.com/
2 http://external-2.com/
.
.
50 http://external-50.com/
Note though that the external URLs would be composed of some PHP variables as well,
e.g.,
http://external-1.com/?id=<?php print $variable; ?>
Thanks
You could build an array containing all your destination URLs, like:
$target = array(0 => "target1.com/something", 1 => "target2.com/somethingelse");
And then just redirect:
header('Location: ' . $target[$_GET['destination']]);
I am using your example of URL http://mydomain.com/redirect.php?destination=[1/2/3/4/.../50]
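function get_url($lineNo){
// read the file into an array of lines, without the trailing newlines
$array = file("domain.txt", FILE_IGNORE_NEW_LINES);
// file() is zero-indexed, so line N is at index N - 1
return preg_replace('/^\d+\.?\s+/', '', $array[$lineNo - 1]); //removes numbering from result
}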
All you need to do is
http://mydomain.com/redirect.php?destination=<?php echo get_url(46) ?>
This assumes that destination number N is on line N of the file.
EDIT: Did a little more work for you. Now you can write the line nos. as well!
EDIT 2
Q) Would it work if my .txt file looks like this: '1 external-1.com/?id=' and my redirect.php have this command: ''
A) Change the function to this:
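function get_url($lineNo){
// read the file into an array of lines, without the trailing newlines
$array = file("domain.txt", FILE_IGNORE_NEW_LINES);
// file() is zero-indexed, so line N is at index N - 1
$domain = preg_replace('/^\d+\.?\s+/', '', $array[$lineNo - 1]); //removes numbering from result
return $domain.'?id='.$lineNo; //appends the line number as the id parameter of the found domain.
}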
So if the searched domain is 1. http://www.domain-1.com/ then http://www.domain-1.com/?id=$lineNo would be returned.
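As an alternative to indexing by line position, the numbered lines can be parsed into a lookup array keyed by destination number. This is only a sketch: `parse_destinations` and `destination_url` are hypothetical names, and the map is passed in as a string here so the example runs without a domain.txt file.

```php
<?php
// Hypothetical parser for a "number URL" map with one entry per line.
function parse_destinations(string $text): array
{
    $map = [];
    foreach (preg_split('/\R/', trim($text)) as $line) {
        // Accept "1 http://..." as well as "1. http://...".
        if (preg_match('/^(\d+)\.?\s+(\S+)$/', trim($line), $m)) {
            $map[(int) $m[1]] = $m[2];
        }
    }
    return $map;
}

// Returns the URL for a destination number, or null if the number is not in the map.
function destination_url(array $map, string $key): ?string
{
    return $map[$key] ?? null;
}

$map = parse_destinations("1 http://external-1.com/\n2 http://external-2.com/");
$url = destination_url($map, '2');
echo $url ?? 'unknown destination', "\n";
```

In redirect.php the text would come from file_get_contents('domain.txt') and the key from $_GET['destination']; the header('Location: ...') call would then run only when the lookup does not return null.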
You need to provide a parameter that identifies which page to redirect to. For external pages, check your host name. You need to keep a mapping for each external site in an array or in a text file, e.g. "1" => "externalsite.com1". If you want to keep the mapping in a text file, you have to read the file and retrieve the URL by its key. Your redirect PHP will look like:
if(isset($_REQUEST['destination'])){
switch ($_REQUEST['destination']) {
case "1":
//retrieve the URL corresponding to the key "1" and set $redUrl = retrieved URL;
break;
case "2":
//retrieve the URL corresponding to the key "2" and set $redUrl = retrieved URL;
break;
case "3":
//retrieve the URL corresponding to the key "3" and set $redUrl = retrieved URL;
break;
}
header("location:".$redUrl);
}

Issue with & in a string submitted with $_GET

I'm building an "away" page for my website. When a user posts a link to another website, each visitor clicking that link is first redirected to away.php, which shows a notice that I am not responsible for the content of the linked website.
The code in away.php to fetch the incoming browser URI is:
$goto = $_GET['to'];
So far it works, but there is a logical issue with dynamic URIs. For example,
www.mydomain.com/away.php?to=http://example.com
is working, but dynamic URIs like
www.mydomain.com/away.php?to=http://www.youtube.com/watch?feature=fvwp&v=j1p0_R8ZLB0
do not work, since the linked URL contains an &, which ends the $_GET['to'] string too early.
The $goto variable contains only the part up to the first &:
echo $_GET['to'];
===> "http://www.youtube.com/watch?feature=fvwp"
I understand why, but I am looking for a solution, since I have not found one on the internet yet.
Try using urlencode:
$link = urlencode("http://www.youtube.com/watch?feature=fvwp&v=j1p0_R8ZLB0") ;
echo $link;
The function converts characters that have a special meaning in URLs into percent-encoded sequences that can safely be carried as data.
The result looks like this and can be appended to a GET parameter:
http%3A%2F%2Fwww.youtube.com%2Fwatch%3Ffeature%3Dfvwp%26v%3Dj1p0_R8ZLB0
To get the special characters back (for example to output the link), there is the function urldecode.
The htmlentities function may also be useful.
You can test with this:
$link = urlencode("http://www.youtube.com/watch?feature=fvwp&v=j1p0_R8ZLB0") ;
$redirect = "{$_SERVER['PHP_SELF']}?to={$link}" ;
if (!isset($_GET['to'])){
header("Location: $redirect") ;
} else {
echo $_GET['to'];
}
EDIT:
OK, I have a solution for your particular situation.
This solution works only if the to parameter is the last one in the query string.
if (preg_match("/to=(.+)/", $redirect, $parts)){ //We got a parameter TO
echo $parts[1]; //Get everything after TO
}
So, $parts[1] will be your link.
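The urlencode approach from the start of this answer can be checked as a round trip. This is a sketch under the assumption that the link to away.php is generated by your own code, so the target can be encoded at that point; parse_str() is used only to simulate, outside a web request, how PHP fills $_GET.

```php
<?php
// Target link as a user might post it (taken from the question).
$target = 'http://www.youtube.com/watch?feature=fvwp&v=j1p0_R8ZLB0';

// Encode the target before placing it in the "to" parameter.
$awayLink = 'away.php?to=' . urlencode($target);
echo $awayLink, "\n";

// Simulate what away.php receives: PHP decodes query values once when it builds $_GET.
parse_str(parse_url($awayLink, PHP_URL_QUERY), $query);
echo $query['to'], "\n";
```

Since the value arrives already decoded, away.php can read $_GET['to'] directly without calling urldecode on it again.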

Using jquery and ajax to retrieve a set of image path

I am implementing an online book application. It is based on HTML.
Generally, the book is shown as a double page. However, the first page, page 1, is shown as a single page. I need to load six pages each time. For instance, if the user is reading the 4th and 5th pages, I need to get the 2nd, 3rd, 4th, 5th, 6th and 7th pages.
So my idea is to provide the folder name and one page number (i, e.g. i = 5) and return the image paths for pages i-3, i-2, i-1, i, i+1 and i+2.
Here is some sample code I found:
public function getImage(){
$params = $this->getRequest()->getParams();
//d($params);
if ($_SERVER['REQUEST_METHOD'] == 'POST')
{
$src = $params['image_path'];
exit;
}
}
Is it useful? And how can I convert it to accept the folder name and page number? Thank you.
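The page window described above (i-3 through i+2) could be computed along these lines. This is a rough sketch, not tied to the framework in the sample code: the function name `page_window`, the `$lastPage` parameter, and the "<folder>/<page>.jpg" file naming are all assumptions made for illustration.

```php
<?php
// Hypothetical helper: image paths for pages i-3 .. i+2, skipping pages outside the book.
function page_window(string $folder, int $page, int $lastPage): array
{
    $paths = [];
    for ($p = $page - 3; $p <= $page + 2; $p++) {
        if ($p >= 1 && $p <= $lastPage) {
            // Assumed naming scheme: one image per page, named by its page number.
            $paths[] = rtrim($folder, '/') . '/' . $p . '.jpg';
        }
    }
    return $paths;
}

// A jQuery $.ajax caller could receive the list as JSON.
echo json_encode(page_window('books/demo', 5, 40), JSON_UNESCAPED_SLASHES), "\n";
```

Near the start or end of the book the window is simply shorter, for example pages 1 to 3 when i = 1.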
