I am trying to write a tool in PHP that detects whether a remote website uses Flash. So far I have written a script that detects whether embed or object tags exist, which gives an indication that Flash may be present, but some sites obfuscate their code, which renders this approach useless.
include_once('simple_html_dom.php');

$flashTotalCount = 0;

function file_get_contents_curl($url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    $data = curl_exec($ch);
    curl_close($ch);
    return $data;
}
$url = "http://example.com"; // the site to check
$html = file_get_contents_curl($url);

// Parse the fetched markup with simple_html_dom before calling find()
$dom = str_get_html($html);

foreach ($dom->find('embed') as $pageEmbed) {
    $flashTotalCount++;
}

foreach ($dom->find('object') as $pageObject) {
    $flashTotalCount++;
}
if ($flashTotalCount == 0) {
    echo "NO FLASH";
} else {
    echo "FLASH";
}
Would anyone know of a way to check whether a website uses Flash, or, if possible, to get header information indicating that Flash is being used?
Any advice would be helpful.
As far as I understand, Flash can be loaded by JavaScript, so you would have to actually execute the web page. For this purpose you'll have to use a tool like this:
http://seleniumhq.org/docs/02_selenium_ide.html#the-waitfor-commands-in-ajax-applications
I don't think it is usable from PHP, though.
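Short of executing the page, a static heuristic can still catch JavaScript-based embedding. Below is a sketch only, not a definitive check; looksLikeFlash() is a hypothetical helper that builds on the file_get_contents_curl() function from the question and scans the raw markup for common Flash signatures such as .swf references, the Flash MIME type, and SWFObject calls:

// Heuristic sketch: scan the raw markup for common Flash signatures.
// This can catch SWFObject-style JavaScript embedding that a plain
// embed/object tag count misses, but it is only a guess, not proof.
function looksLikeFlash($html) {
    $signatures = array(
        '.swf',                            // direct movie references
        'application/x-shockwave-flash',   // the Flash MIME type
        'swfobject',                       // popular JS embedding library
    );
    foreach ($signatures as $sig) {
        if (stripos($html, $sig) !== false) {
            return true;
        }
    }
    return false;
}

$html = file_get_contents_curl($url);
echo looksLikeFlash($html) ? "FLASH" : "NO FLASH";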
Introduction
I am developing my personal presentation site and I want to include my Stack Overflow profile info/posts/data (e.g. top tags, score, and so on).
I found data.stackexchange.com to retrieve the desired data, but I can't understand how I can show this data on my site.
On github.com I found these prerequisites: https://github.com/StackExchange/StackExchange.DataExplorer#prerequisites which basically say that I must be a .NET programmer to be able to display this data, but I am a PHP programmer; I work with Apache, MySQL and PHP.
I know that there are lots of PHP MSSQL functions I can use, but how can I connect to the Stack Exchange database (I think as a guest/limited user), and with which username and password?
Even if this is not too much on-topic here, where can I find more info on how I can display Stack Overflow data on my site?
I recommend checking out http://simplehtmldom.sourceforge.net/
Something like this should get your reputation using the PHP Simple HTML DOM Parser:
include_once('simple_html_dom.php');

$html = file_get_html('https://stackoverflow.com/users/5039442/thetaskmaster');
$reputation = $html->find('.reputation', 0)->plaintext;
Even if CONFUS3D's answer is a good solution, any alteration to the user interface may cause errors on your site.
I suggest you use the Stack Exchange API instead, with which you can retrieve most of the data you will probably need.
Any API query returns a JSON object. I use this PHP class to retrieve that object:
class ApiReader {
    public function getResponse($url) {
        $cH = curl_init();
        curl_setopt($cH, CURLOPT_URL, $url);
        curl_setopt($cH, CURLOPT_HEADER, 0);
        curl_setopt($cH, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($cH, CURLOPT_TIMEOUT, 30);
        curl_setopt($cH, CURLOPT_USERAGENT, "cURL"); // must be a string, not a bare constant
        curl_setopt($cH, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($cH, CURLOPT_ENCODING, "gzip");  // the API compresses its responses
        $result = curl_exec($cH);
        if (curl_errno($cH)) {
            $response = FALSE;
        } else {
            // Only accept the body on a 200 OK
            $status = curl_getinfo($cH, CURLINFO_HTTP_CODE);
            $response = ($status == 200) ? $result : FALSE;
        }
        curl_close($cH);
        return $response;
    }
}
I use this little trick to test the site even when I am offline.
On your host, save all the JSON objects you need to use, then declare two variables: $UInfo_API, containing the API query, and $UInfo_Syn, which gets the content of the saved JSON object:

$UInfo_API = "https://api.stackexchange.com/2.2/users/5039442?site=stackoverflow";
$UInfo_Syn = file_get_contents("yourjsonobject.json");

Then save the result in a variable, checking whether the getResponse() method has failed or not. After that, you have the data on tap.
$sear = new ApiReader();
$uInfo = $sear->getResponse($UInfo_API);
$uInfo = ($uInfo !== FALSE) ? json_decode($uInfo, TRUE) : json_decode($UInfo_Syn, TRUE);
$rep = $uInfo["items"][0]["reputation"];
This is probably a stupid question, but I'm just wondering if this is possible or if I'm supposed to do something else...
When using multi-curl one would use URLs right?
// create both cURL resources
$ch1 = curl_init();
$ch2 = curl_init();
// set URL and other appropriate options
curl_setopt($ch1, CURLOPT_URL, "http://lxr.php.net/");
curl_setopt($ch1, CURLOPT_HEADER, 0);
curl_setopt($ch2, CURLOPT_URL, "http://www.php.net/");
curl_setopt($ch2, CURLOPT_HEADER, 0);
etc.
That's per the curl_multi documentation...
So what if I have some method (I think that's what you call it) that I'm using from a library:

$tags = $instagram->searchTags('tag');

Now that's searching the library for the word 'tag'. But what if I want to be able to do multiple searches?

$tags1 = $instagram->searchTags('tag');
$tags2 = $instagram->searchTags('tagme');

How do I implement this with multi-cURL? Is it just simply replacing the URLs with $tags1 and $tags2?
This is without your class; I don't see why you need that.

function fetchHTML($website) {
    if (function_exists('curl_init')) {
        // Fetch with cURL when the extension is available
        $ch = curl_init($website);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);
        $content = curl_exec($ch);
        curl_close($ch);
    } else {
        // Fall back to file_get_contents()
        $content = file_get_contents($website);
    }
    return $content;
}
$dom1 = new DOMDocument();
$dom1->loadHTML(fetchHTML("http://example1.com"));
$tag1 = $dom1->getElementsByTagName('tagname');

// Use a second document: reloading the same DOMDocument would
// invalidate the live node list obtained from the first page
$dom2 = new DOMDocument();
$dom2->loadHTML(fetchHTML("http://example2.com"));
$tag2 = $dom2->getElementsByTagName('tagname');

/* Will give you a DOM node list for your first tagname */
print_r($tag1);

/* Will give you a DOM node list for your second tagname */
print_r($tag2);
I have looked into that PHP library and found that each instance of the Instagram class uses only one cURL handle, which means you cannot send multiple requests asynchronously.
You can read this article about connection sharing with cURL in PHP to get an idea of how to modify the CurlClient class of the Instagram library. The main idea is to keep a static class member holding a handle from curl_multi_init() and add each new single cURL handle to it as needed.
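A minimal sketch of that idea, standalone and with placeholder URLs rather than real Instagram endpoints:

// Minimal sketch: run two requests concurrently with curl_multi.
// The URLs here are placeholders, not real Instagram endpoints.
$urls = array(
    'https://example.com/tags/tag',
    'https://example.com/tags/tagme',
);

$mh = curl_multi_init();
$handles = array();

foreach ($urls as $i => $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_multi_add_handle($mh, $ch);
    $handles[$i] = $ch;
}

// Drive all transfers until they are finished
$running = null;
do {
    curl_multi_exec($mh, $running);
    curl_multi_select($mh);
} while ($running > 0);

// Collect the results and clean up
$results = array();
foreach ($handles as $i => $ch) {
    $results[$i] = curl_multi_getcontent($ch);
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);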
I am trying to retrieve the content of web pages and check whether a page contains certain error keywords I am monitoring. (Instead of manually loading each URL every time to check on the sites, I hope to do this programmatically and flag errors when they occur.)
I have tried XMLHttpRequest. I am able to get the HTML content, like what I see when I "view source" on the page. But the pages I monitor run on SharePoint and the web parts are dynamically generated. I believe that if an error occurs when loading these parts, I would not be able to flag it, as the HTML I pull will not contain the errors, just the usual paths to the web parts.
cURL seems to do the same. I just read about DOMDocument and I was wondering whether DOMDocument processes the code or just breaks the HTML into a hierarchical structure.
I only wish to have the content of the URL (like what you get when you save a website as txt in IE, not the HTML). Or if I can further process the HTML, that would be good too. How can I do that? Any help will be really appreciated. :)
Why do you want to strip the HTML? It's better to use it!
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
$data = curl_exec($ch);
curl_close($ch);

// Suppress parser warnings from real-world (often invalid) HTML
libxml_use_internal_errors(true);

$oDom = new DomDocument();
$oDom->loadHTML($data);

// Go through the DOM and look for the error (it's similar if it'd be
// <p class="error">error message</p> or whatever)
$errors = $oDom->getElementsByTagName("error"); // or however you get errors

foreach ($errors as $error) {
    if (strstr($error->nodeValue, 'SOME ERROR')) {
        echo 'SOME ERROR occurred';
    }
}
If you don't want to do that, you can just do:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
$data = curl_exec($ch);
curl_close($ch);

if (strstr($data, 'SOME_ERROR')) {
    echo 'SOME ERROR occurred';
}
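And if you really do want the plain text (like "save as txt"), here is a small sketch of one way, assuming $data holds the HTML fetched above: load it into DOMDocument and read textContent, or fall back to strip_tags():

// Sketch: reduce the fetched HTML to plain text before searching it
libxml_use_internal_errors(true);   // real-world HTML is rarely valid
$dom = new DOMDocument();
$dom->loadHTML($data);
$plainText = $dom->textContent;     // every text node, tags stripped

// A cruder alternative with the same goal:
// $plainText = strip_tags($data);

if (strstr($plainText, 'SOME ERROR')) {
    echo 'SOME ERROR occurred';
}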
I found a script in Web Designer magazine that enables you to gather Album Data from a Facebook Fan Page, and put it on your site.
The script utilizes PHP's file_get_contents() function, which works great on my personal server, but is not allowed on the Network Solutions hosting.
In looking through their documentation, they recommended that you use a cURL session to gather the data. I have never used cURL sessions before, and so this is something of a mystery to me. Any help would be appreciated.
The code I "was" using looked like this:
<?php
$FBid = '239319006081415';
$FBpage = file_get_contents('https://graph.facebook.com/'.$FBid.'/albums');
$photoData = json_decode($FBpage);

$albumID = $photoData->data[0]->id;
$albumURL = "https://graph.facebook.com/".$albumID."/photos";
$rawAlbumData = file_get_contents($albumURL);
$photoData2 = json_decode($rawAlbumData);

$photoArray = array();
$a = 0;
foreach ($photoData2->data as $data) {
    $photoArray[$a]["source"] = $data->source;
    $photoArray[$a]["width"]  = $data->width;
    $photoArray[$a]["height"] = $data->height;
    $a++;
}
?>
The code that I am attempting to use now looks like this:
<?php
$FBid = '239319006081415';
$FBUrl = "https://graph.facebook.com/".$FBid."/albums";
$ch = curl_init($FBUrl);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_BINARYTRANSFER, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
$contents = curl_exec($ch);
curl_close($ch);
$photoData = json_decode($contents);
?>
When I try to echo or manipulate the contents of $photoData however, it's clear that it is empty.
Any thoughts?
Try removing curl_setopt($ch, CURLOPT_BINARYTRANSFER, 1); that option has had no effect since PHP 5.1.3, and my otherwise very similar code works without it. I'd also use:
json_decode($contents, true); This puts the results in an array instead of an object. I've had better luck with this approach.
Put it in the "works for me" category.
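For illustration, the difference between the two decode modes looks like this (the data/id fields follow the shape of the album response above):

$obj = json_decode($contents);        // stdClass: $obj->data[0]->id
$arr = json_decode($contents, true);  // array:    $arr['data'][0]['id']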
Try this; it could work:
$ch = curl_init($FBUrl);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_BINARYTRANSFER, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
$contents = curl_exec($ch);
$pageData = json_decode($contents);
//object to array
$objtoarr = get_object_vars($pageData);
curl_close($ch);
Or use jQuery's getJSON() instead. This tip is from the FB Album downloader GreaseMonkey script.
I want to retrieve the HTML code of a link (web page) in PHP. For example, if the link is
https://stackoverflow.com/questions/ask
then I want the HTML code of the page which is served. I want to retrieve this HTML code and store it in a PHP variable.
How can I do this?
If your PHP server allows url fopen wrappers then the simplest way is:
$html = file_get_contents('https://stackoverflow.com/questions/ask');
If you need more control then you should look at the cURL functions:
$c = curl_init('https://stackoverflow.com/questions/ask');
curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
//curl_setopt(... other options you want...)
$html = curl_exec($c);

if (curl_error($c)) {
    die(curl_error($c));
}

// Get the status code
$status = curl_getinfo($c, CURLINFO_HTTP_CODE);

curl_close($c);
Also, if you want to manipulate the retrieved page somehow, you might want to try a PHP DOM parser.
I find PHP Simple HTML DOM Parser very easy to use.
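A minimal sketch of what that looks like, using the $html string fetched above (the selector is just an example):

include_once('simple_html_dom.php');

// Parse the fetched HTML and query it with CSS-like selectors
$dom = str_get_html($html);
echo $dom->find('title', 0)->plaintext; // e.g. print the page title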
Simple way: Use file_get_contents():
$page = file_get_contents('http://stackoverflow.com/questions/ask');
Please note that allow_url_fopen must be true in your php.ini to be able to use URL-aware fopen wrappers.
More advanced way: if allow_url_fopen is false and you cannot change your PHP configuration, use the cURL library (if ext/curl is installed) to connect to the desired page.
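A small sketch combining the two approaches (the fetch_page() name is just for illustration): prefer file_get_contents() and fall back to cURL when allow_url_fopen is off:

// Sketch: use file_get_contents() when URL wrappers are allowed,
// otherwise fall back to cURL (fetch_page is a hypothetical helper)
function fetch_page($url) {
    if (ini_get('allow_url_fopen')) {
        return file_get_contents($url);
    }
    if (function_exists('curl_init')) {
        $c = curl_init($url);
        curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
        $html = curl_exec($c);
        curl_close($c);
        return $html;
    }
    return false; // neither mechanism is available
}

$page = fetch_page('https://stackoverflow.com/questions/ask');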
You may want to check out the YQL libraries from Yahoo: http://developer.yahoo.com/yql
The task at hand is as simple as
select * from html where url = 'http://stackoverflow.com/questions/ask'
You can try this out in the console at: http://developer.yahoo.com/yql/console (requires login)
Also see Chris Heilmann's screencast for some nice ideas about what more you can do: http://developer.yahoo.net/blogs/theater/archives/2009/04/screencast_collating_distributed_information.html
Here are two different, simple ways to get content from a URL:

1) The first method

Enable allow_url_fopen on your hosting (in php.ini or similar); readfile() needs it to open URLs:

<?php
// readfile() streams the file straight to the output and returns the
// number of bytes read, so its return value should not be echoed
readfile("http://example.com/");
?>
or
2) The second method

Enable php_curl, php_imap and php_openssl:

<?php
// you can add other cURL options too
// see here - http://php.net/manual/en/function.curl-setopt.php
function get_dataa($url) {
    $ch = curl_init();
    $timeout = 5;
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)");
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($ch, CURLOPT_MAXREDIRS, 10);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    $data = curl_exec($ch);
    curl_close($ch);
    return $data;
}
$variableee = get_dataa('http://example.com');
echo $variableee;
?>
You can use DOMDocument to pull out content at the level of an individual HTML tag too:
$homepage = file_get_contents('https://www.example.com/');
$doc = new DOMDocument;
$doc->loadHTML($homepage);
$titles = $doc->getElementsByTagName('h3');
echo $titles->item(0)->nodeValue;
look at this function:
http://ru.php.net/manual/en/function.file-get-contents.php
You could use file_get_contents() if you want to store the source as a variable, although cURL is better practice:

$url = file_get_contents('http://example.com');
echo $url;

This solution will display the web page on your site.
include_once('simple_html_dom.php');
$url="http://stackoverflow.com/questions/ask";
$html = file_get_html($url);
You can get the whole HTML code as a parsed DOM object using this code.
Download the 'simple_html_dom.php' file here
http://sourceforge.net/projects/simplehtmldom/files/simple_html_dom.php/download
$output = file("http://www.example.com"); didn't work for me until I enabled allow_url_fopen, allow_url_include, and file_uploads in php.ini for PHP 7.
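Note that file() returns the page as an array of lines; here is a quick sketch if you need it as one string:

// file() splits the page into an array of lines; join them back together
$output = file("http://www.example.com");
$html = implode("", $output);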
I tried this code and it's working for me:

// a scheme (http:// or https://) is required, otherwise PHP looks for a local file
$html = file_get_contents('https://www.google.com');
$myVar = htmlspecialchars($html, ENT_QUOTES);
echo $myVar;