How can I read the whole contents of a page at a variable address using PHP? - php

How can I make this code dynamic in PHP? My address is variable.
<?php
$pagecontents = file_get_contents("http://google.com");
$html = htmlentities($pagecontents);
echo $html;
?>

I'm not sure I understand the goal, but if you want to do what you showed in the question for multiple sites, a simple loop will do:
$sites = ["http://aa.com/a", "http://aa.com/b"]; // array(...) with earlier PHP versions
foreach ($sites as $url) {
    // file_get_contents() needs the scheme; without it the string is treated as a local path
    $pagecontents = file_get_contents($url);
    echo htmlentities($pagecontents);
}
If this is not what you are looking for, please rephrase the question so it clearly explains what you want to do.
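If "variable address" means the URL only becomes known at runtime, a minimal sketch might wrap the fetch in a function. The helper name `fetchEscaped` and the `url` query parameter are assumptions made for illustration, not something from the question, and remote fetching assumes `allow_url_fopen` is enabled.

```php
<?php
// Hypothetical helper: fetch whatever $url points to and return it HTML-escaped.
// Returns null when the URL is rejected or the fetch fails.
function fetchEscaped(string $url): ?string
{
    // Only allow http(s) so user-supplied input cannot read local files.
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
    if (!in_array($scheme, ['http', 'https'], true)) {
        return null;
    }
    $contents = @file_get_contents($url);
    return $contents === false ? null : htmlentities($contents);
}

// Assumption: the address arrives in a query parameter named "url".
if (isset($_GET['url']) && is_string($_GET['url'])) {
    $html = fetchEscaped($_GET['url']);
    echo $html ?? 'Could not read ' . htmlentities($_GET['url']);
}
```

Restricting the scheme matters once the address comes from user input, because `file_get_contents()` will otherwise happily read local paths.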

Related

Bibliography in PHP with CSL

I am trying to display a bibliography in PHP and allowing the use of CSL to format it, but am coming up short of good examples of how to implement it. Basically, I am looking for a library or script which can take a bibliography, in the form of Bibtex or JSON or similar, and output it as HTML through PHP.
Formatting with CSL, for example through citeproc-php, would accommodate a wide variety of output styles. Does anyone know of any examples of this, or of up-to-date libraries for doing so?
The author of citeproc-php answered an issue on GitHub with some details:
<?php
include 'vendor/autoload.php';
use \AcademicPuma\CiteProc\CiteProc;
$bibliographyStyleName = 'apa';
$lang = "en-US";
$csl = CiteProc::loadStyleSheet($bibliographyStyleName);
$citeProc = new CiteProc($csl, $lang);
$file = file_get_contents("citations.json");
$data = json_decode($file);
echo "<ul>";
foreach ($data as $item) {
echo "<li>".$citeProc->render($item)."</li>";
}
echo "</ul>";
?>
And this works as expected with a sample citations.json from citeproc-js.

PHP Crawler not crawling all elements

I'm trying to make a PHP crawler (for personal use).
The code should display "found" for each eBay auction item that ends in less than 1 hour, but there seems to be a problem: the crawler can't get all the span elements, and the "remaining time" element is a span.
simple_html_dom.php is downloaded and unedited.
<?php include_once('simple_html_dom.php');
//url which i want to crawl -contains GET DATA-
$url = 'http://www.ebay.de/sch/Apple-Notebooks/111422/i.html?LH_Auction=1&Produktfamilie=MacBook%7CMacBook%2520Air%7CMacBook%2520Pro%7C%21&LH_ItemCondition=1000%7C1500%7C2500%7C3000&_dcat=111422&rt=nc&_mPrRngCbx=1&_udlo&_udhi=20';
$html = new simple_html_dom();
$html->load_file($url);
foreach($html->find('span') as $part){
echo $part;
//when i echo $part it does display many span elements but not the remaining time ones
$cur_class = $part->class;
//the class attribute of an auction item that ends in less than an hour is equal with "MINUTES timeMs alert60Red"
if($cur_class == 'MINUTES timeMs alert60Red'){
echo 'found';
}
}
?>
Any answers would be useful, thanks in advance
Looking at the fetched HTML, it seems the class alert60Red is set through JavaScript. The crawler never executes JavaScript, so that class is not in the markup you fetch.
Searching for just MINUTES timeMs looks stable enough.
<?php
include_once('simple_html_dom.php');
$url = 'http://www.ebay.de/sch/Apple-Notebooks/111422/i.html?LH_Auction=1&Produktfamilie=MacBook%7CMacBook%2520Air%7CMacBook%2520Pro%7C%21&LH_ItemCondition=1000%7C1500%7C2500%7C3000&_dcat=111422&rt=nc&_mPrRngCbx=1&_udlo&_udhi=20';
$html = new simple_html_dom();
$html->load_file($url);
foreach ($html->find('span') as $part) {
$cur_class = $part->class;
if (strpos($cur_class, 'MINUTES timeMs') !== false) {
echo 'found';
}
}
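As an alternative to the substring check, a sketch using PHP's bundled DOM extension can match the class tokens regardless of their order or of extra classes. The sample markup below is hypothetical and only imitates the class names mentioned in this thread; the real input would be the fetched page.

```php
<?php
// Hypothetical sample markup; the real page would come from the fetched HTML.
$sample = '<div><span class="MINUTES timeMs alert60Red">45m</span>'
        . '<span class="HOURS timeMs">3h</span>'
        . '<span class="price">EUR 20</span></div>';

// Collect libxml warnings internally; real-world HTML usually triggers some.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($sample);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Match spans whose class list contains both tokens, in any order.
$query = '//span[contains(concat(" ", normalize-space(@class), " "), " MINUTES ")'
       . ' and contains(concat(" ", normalize-space(@class), " "), " timeMs ")]';

$found = [];
foreach ($xpath->query($query) as $span) {
    $found[] = trim($span->textContent);
}
print_r($found);
```

Like the simple_html_dom version, this only sees classes present in the served markup, not ones added later by JavaScript.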
If a snippet of code is included in another PHP file, or HTML is embedded in PHP, your browser cannot see it.
So no web-crawling API can detect it. I think your best bet is to find the location of simple_html_dom.php and try to crawl that file somehow. You may not even be able to get access to it. It's tricky.
You could also try finding the element by ID, if your API has that function.

How can I sanitise the explode() function to extract only the marker I require?

I have some php code that extracts a web address. The object I have extracted is of the form:
WEBSITE?flage=2&fgast=48&frat=1&sort=D&fsrc=2&wid=bf&page=1&id=16123012&source=searchresults
Now in PHP I have called this object $linkHREF
I want to extract only the id element and put it into an array (I'm bootstrapping this process to get multiple ids).
So the command is:
$detailPagePathArray = explode("id=",$linkHREF); #Array
Now the problem is the output of this includes what comes after the id tag, so the output looks like:
echo $detailPagePathArray[0] = WEBSITE?flage=2&fgast=48&frat=1&sort=D&fsrc=2&w
echo $detailPagePathArray[1] = bf&page=1&
echo $detailPagePathArray[2] = 16123012&source=searchresults
The problem is obvious: it first picks up the "id" inside the "wid" marker and cuts there. The secondary problem is that it also picks up everything after the actual "id". I'm only interested in "16123012".
Can you please explain how I can modify my explode command to point it to the particular marker I'm interested in?
Thanks.
Use the built-in functions provided for the purpose.
For example:
<?php
$url = 'http://www.example.com?flage=2&fgast=48&frat=1&sort=D&fsrc=2&wid=bf&page=1&id=16123012&source=searchresults';
$qs = parse_url($url);
parse_str($qs['query'], $vars);
$id = $vars['id'];
echo $id; // 16123012
?>
References:
parse_url()
parse_str()
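Since the question mentions collecting several ids into an array, the same two functions could be wrapped in a small helper, roughly as sketched here. The function name `extractId` and the second sample link are made up for illustration.

```php
<?php
// Hypothetical helper: return the "id" query parameter of a link, or null.
function extractId(string $link): ?string
{
    $query = parse_url($link, PHP_URL_QUERY);
    if (!is_string($query)) {
        return null; // no query string, or an unparseable URL
    }
    parse_str($query, $vars);
    // "wid" is a separate key, so it can never be mistaken for "id".
    return isset($vars['id']) && is_string($vars['id']) ? $vars['id'] : null;
}

$links = [
    'http://www.example.com?flage=2&wid=bf&page=1&id=16123012&source=searchresults',
    'http://www.example.com?wid=bf&page=2', // made-up link without an id
];

$ids = [];
foreach ($links as $link) {
    $id = extractId($link);
    if ($id !== null) {
        $ids[] = $id;
    }
}
print_r($ids);
```

Links without an id are skipped rather than producing empty entries.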
If you are sure that &id=123456 appears only once in your string, then the following works:
$linkHREF = "WEBSITE?flage=2&fgast=48&frat=1&sort=D&fsrc=2&wid=bf&page=1&id=16123012&source=searchresults";
// end() takes its argument by reference, so store the exploded array in a variable first
$parts = explode('&id=', $linkHREF, 2);
$str = current(explode('&', end($parts)));
echo "id = " . $str; // output: id = 16123012

php print whole page

Quite a simple question, I believe: how do I print the whole page into a variable and then use it where I need it?
For instance if the code is:
<?php
$arr = array('hello','mate','world');
foreach ($arr as $a) {print "<p>".$a."</p>"; }
?>
If we go to that page now, we see the array output, but I would prefer to print the whole page into a variable and then, for instance, generate a static page from it.
Maybe file_get_contents or <<<EOT would work, but the page will get more complicated later, so I'm not sure what the best option is.
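Not sure about your exact needs, but output buffering can capture a template:
ob_start();
require('/path/to/templates/foo.php');
$template = ob_get_clean(); // returns the buffer and stops buffering
or the output of any block of code:
ob_start();
// your code
$var = ob_get_clean();
print $var;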
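To connect this to the static-page goal in the question, a sketch along these lines captures the output and writes it to a file. The `capture` helper and the target path in the system temp directory are assumptions chosen for illustration.

```php
<?php
// Hypothetical helper: run a callable, capture everything it prints, return it.
function capture(callable $render): string
{
    ob_start();
    try {
        $render();
    } finally {
        // Always close the buffer, even if $render() throws.
        $output = ob_get_clean();
    }
    return $output;
}

$page = capture(function () {
    $arr = array('hello', 'mate', 'world');
    foreach ($arr as $a) {
        print "<p>" . $a . "</p>";
    }
});

// Assumption: the static copy goes to the system temp directory.
$target = sys_get_temp_dir() . '/static_page.html';
file_put_contents($target, $page);
```

Nothing is sent to the browser here; `$page` holds the markup and can be echoed, cached, or written wherever it is needed.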
Why don't you use Smarty?
Put all the HTML in a template and insert PHP code or variables into it. In the end, $x = $smarty->fetch('template_name'); puts the whole page in the $x variable.

How to write a PHP script to find the number of indexed pages in Google?

I need to find the number of pages Google has indexed for a specific domain name. How do we do that through a PHP script?
So,
foreach ($allresponseresults as $responseresult)
{
$result[] = array(
'url' => $responseresult['url'],
'title' => $responseresult['title'],
'abstract' => $responseresult['content'],
);
}
What do I add to get the estimated number of results, and how do I do that?
I know it is estimatedResultCount, but how do I add it? I access the title, for example, with $result['title'], so how do I get the number and print it?
Thank you :)
I think it would be nicer to Google to use their RESTful Search API. See this URL for an example call:
http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=site:stackoverflow.com&filter=0
(You're interested in the estimatedResultCount value)
In PHP you can use file_get_contents to get the data and json_decode to parse it.
You can find documentation here:
http://code.google.com/apis/ajaxsearch/documentation/#fonje
Example
Warning: The following code does not have any kind of error checking on the response!
function getGoogleCount($domain) {
$content = file_get_contents('http://ajax.googleapis.com/ajax/services/' .
'search/web?v=1.0&filter=0&q=site:' . urlencode($domain));
$data = json_decode($content);
return intval($data->responseData->cursor->estimatedResultCount);
}
echo getGoogleCount('stackoverflow.com');
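Given the warning about missing error checking, the parsing step could be guarded roughly as follows. The helper name `parseEstimatedCount` is hypothetical, and the sample JSON only imitates the fields the function above reads; it is not a captured response.

```php
<?php
// Hypothetical helper: pull estimatedResultCount out of a JSON response body.
// Returns null when the body does not have the expected shape.
function parseEstimatedCount($json): ?int
{
    if (!is_string($json)) {
        return null; // e.g. file_get_contents() returned false
    }
    $data = json_decode($json);
    // isset() is false for invalid JSON, a null responseData, or a missing field.
    if (!isset($data->responseData->cursor->estimatedResultCount)) {
        return null;
    }
    return intval($data->responseData->cursor->estimatedResultCount);
}

// Sample body modelled on the fields read above; not a real response.
$sample = '{"responseData":{"cursor":{"estimatedResultCount":"830000"}}}';
var_dump(parseEstimatedCount($sample));
var_dump(parseEstimatedCount('{"responseData":null}'));
var_dump(parseEstimatedCount(false));
```

The caller can then distinguish "no count available" (null) from a genuine count of zero.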
You'd load http://www.google.com/search?q=domaingoeshere.com with cURL and then parse the response, looking for the <p id="resultStats"> element that holds the result count.
You'd have the resulting HTML stored in a variable $html and then do something like
$arr = explode('<p id="resultStats">', $html);
$bottom = $arr[1];
$middle = explode('</p>', $bottom);
Please note that this is untested and a very rough example. You'd be better off parsing the html with a dedicated parser or matching the line with regular expressions.
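The regular-expression route mentioned above might look like this sketch. The helper name `parseResultStats` is hypothetical, and the sample markup is modelled on the snippets in this thread rather than on live output, so the pattern may need adjusting for the real page.

```php
<?php
// Hypothetical helper: extract the number from a "resultStats" element.
function parseResultStats(string $html): ?int
{
    // Capture the element's content, whatever tag and other attributes it has.
    if (!preg_match('/<(\w+)[^>]*\bid="resultStats"[^>]*>(.*?)<\/\1>/s', $html, $m)) {
        return null;
    }
    // Take the first run of digits and thousands separators, e.g. "8,130".
    if (!preg_match('/\d[\d.,]*/', strip_tags($m[2]), $n)) {
        return null;
    }
    // Drop the separators and convert to an integer.
    return (int) preg_replace('/\D/', '', $n[0]);
}

// Sample markup modelled on the snippets in this thread; not live output.
$html = '<div class="sd" id="resultStats">About 8,130 results</div>';
var_dump(parseResultStats($html));
```

Returning null when the element is absent avoids the undefined-index notice that `$arr[1]` would raise in the explode() version.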
The Google AJAX API's estimatedResultCount doesn't give the right value.
And parsing the HTML result is not a good approach, because Google blocks you after several searches.
Count the number of results for site:yourdomainhere.com - stackoverflow.com has about 830k
// This will give you the count what you see on search result on web page,
//this code will give you the HTML content from file_get_contents
header('Content-Type: text/plain');
$url = "https://www.google.com/search?q=your url";
$html = file_get_contents($url);
if (FALSE === $html) {
throw new Exception(sprintf('Failed to open HTTP URL "%s".', $url));
}
$arr = explode('<div class="sd" id="resultStats">', $html);
$bottom = $arr[1];
$middle = explode('</div>', $bottom);
echo $middle[0];
Output:
About 8,130 results
//vKj
Case 2: you can also use google api, but its count is different:
https://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=ursitename&callback=processResults
https://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=site:google.com
cursor":{"resultCount":"111,000,000","
"estimatedResultCount":"111000000",
