Get any site data in .txt file using curl in php?

Get any site data in .txt file using curl in php? - php

I am trying to get the data of any website in my .txt file using curl in php.
so there is some issue in code. yesterday i get the data in web_content.txt file. But i don not know why and where error generate i am not getting the site url in .txt file.
Here is my code you can check and please help me what should i do to remove this error and get the data.
<?php
if($_REQUEST['url_text'] == "") {
echo "<div class='main_div border_radius'><div class='sub_div'> Please type any site URL.</div></div>";
} else {
$site_url = $_REQUEST['url_text'];
$curl_ref=curl_init();//init the curl
curl_setopt($curl_ref,CURLOPT_URL,$site_url);// Getting site
curl_setopt($curl_ref,CURLOPT_CONNECTTIMEOUT,2);
curl_setopt($curl_ref,CURLOPT_RETURNTRANSFER,true);
$site_data = curl_exec($curl_ref);
if (empty($site_data)) {
echo "<div class='main_div border_radius'><div class='sub_div'> Data is not available.</div></div>";
}
else {
echo "<div class='main_div border_radius'><div class='sub_div'>".$site_data."</div></div>";
}
$fp = fopen("web_content.txt", "w") or die("Unable to open file!");//Open a file in write mode
curl_setopt($curl_ref, CURLOPT_FILE, $fp);
curl_setopt($curl_ref, CURLOPT_HEADER, 0);
curl_exec($curl_ref);
curl_close($curl_ref);
fclose($fp);
}
?>
if(isset($_GET['start_tag_textbox']) && ($_GET['end_tag_textbox'])) {
get_tag_data($_GET['start_tag_textbox'],$_GET['end_tag_textbox']);
}
header('content-type: text/plain');
function get_tag_data ($start,$end) {
$file_data= "";
$tag_data = "";
$file = fopen("web_content.txt","r");
$file_data = fread($file,filesize("web_content.txt"));
fclose($file);
$last_pos = strpos($start,'>');
$start = substr_replace($start,".*>",$last_pos);
preg_match_all('#'.$start.'(.*)'.$end.'#U',$file_data,$matches);
for($i=0; $i<count($matches[1]);$i++){
echo $matches[1][$i]."\n";
}
}
?>
<?php
$url = "http://www.ucertify.com";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$output = curl_exec($ch);
curl_close($ch);
$file = fopen("web_content.txt","w");
fwrite($file,$output);
fclose($file);
?>

Related

How to download google search big image using php script?

I want to download the google search big image. Below is my code, which is working fine for download small image from google search.
<?php
include_once('simple_html_dom.php');
set_time_limit(0);
$fp = fopen('csv/search.csv','r') or die("can't open file");
$csv_data = array();
while($csv_line = fgetcsv($fp)) {
for ($i = 0, $j = count($csv_line); $i < $j; $i++) {
$imgname = $csv_line[$i];
$search_query = $csv_line[$i];
$search_query = urlencode(trim($search_query));
$html=file_get_html('http://images.google.com/images?as_q='. $search_query .'&hl=en&imgtbs=z&btnG=Search+Images&as_epq=&as_oq=&as_eq=&imgtype=&imgsz=m&imgw=&imgh=&imgar=&as_filetype=&imgc=&as_sitesearch=&as_rights=&safe=images&as_st=y');
$image_container = $html->find('div#rcnt', 0);
$images = $html->find('img');
$image_count = 1; //Enter the amount of images to be shown
$i = 0;
foreach($images as $image){
$srcimg = $image->src;
if($i == $image_count) break;
$i++;
$randname = $imgname.".jpg";
$randname = "Images/".$randname;
file_put_contents("$randname", file_get_contents($srcimg));
}
}
}
?>
Any idea?

This worked for me. simple_html_dom.php wouldn't do the trick since the 'big image' is inside a snippet of JSON near each thumbnail in the DOM.
<?php
$search_query = "Some Keyword"; //change this
$search_query = urlencode( $search_query );
$googleRealURL = "https://www.google.com/search?hl=en&biw=1360&bih=652&tbs=isz%3Alt%2Cislt%3Asvga%2Citp%3Aphoto&tbm=isch&sa=1&q=".$search_query."&oq=".$search_query."&gs_l=psy-ab.12...0.0.0.10572.0.0.0.0.0.0.0.0..0.0....0...1..64.psy-ab..0.0.0.wFdNGGlUIRk";
// Call Google with CURL + User-Agent
$ch = curl_init($googleRealURL);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (X11; Linux i686; rv:20.0) Gecko/20121230 Firefox/20.0');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
$google = curl_exec($ch);
$array_imghtml = explode("\"ou\":\"", $google); //the big url is inside JSON snippet "ou":"big url"
foreach($array_imghtml as $key => $value){
if ($key > 0) {
$array_imghtml_2 = explode("\",\"", $value);
$array_imgurl[] = $array_imghtml_2[0];
}
}
var_dump($array_imgurl); //array contains the urls for the big images
die();
?>

I think rather than crawling the page you can you the google custom search api
for more details here the url:
https://developers.google.com/custom-search/json-api/v1/overview

Using cURL and PHP for CACTI in Windows

Recently tasked to monitor external webpage response/loading time via CACTI. I found some PHP scripts that were working (pageload-agent.php and class.pageload.php) using cURL. All was working fine until they requested it to be transferred from LINUX to Windows 2012R2 server. I'm having a very hard time modifying the scripts to work for windows. Already installed PHP and cURL and both working as tested. Here are the scripts taken from askaboutphp.
class.pageload.php
<?php
class PageLoad {
var $siteURL = "";
var $pageInfo = "";
/*
* sets the URLs to check for loadtime into an array $siteURLs
*/
function setURL($url) {
if (!empty($url)) {
$this->siteURL = $url;
return true;
}
return false;
}
/*
* extract the header information of the url
*/
function doPageLoad() {
$u = $this->siteURL;
if(function_exists('curl_init') && !empty($u)) {
$ch = curl_init($u);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_ENCODING, "gzip");
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_NOBODY, false);
curl_setopt($ch, CURLOPT_FRESH_CONNECT, false);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)");
$pageBody = curl_exec($ch);
$this->pageInfo = curl_getinfo($ch);
curl_close ($ch);
return true;
}
return false;
}
/*
* compile the page load statistics only
*/
function getPageLoadStats() {
$info = $this->pageInfo;
//stats from info
$s['dest_url'] = $info['url'];
$s['content_type'] = $info['content_type'];
$s['http_code'] = $info['http_code'];
$s['total_time'] = $info['total_time'];
$s['size_download'] = $info['size_download'];
$s['speed_download'] = $info['speed_download'];
$s['redirect_count'] = $info['redirect_count'];
$s['namelookup_time'] = $info['namelookup_time'];
$s['connect_time'] = $info['connect_time'];
$s['pretransfer_time'] = $info['pretransfer_time'];
$s['starttransfer_time'] = $info['starttransfer_time'];
return $s;
}
}
?>
pageload-agent.php
#! /usr/bin/php -q
<?php
//include the class
include_once 'class.pageload.php';
// read in an argument - must make sure there's an argument to use
if ($argc==2) {
//read in the arg.
$url_argv = $argv[1];
if (!eregi('^http://', $url_argv)) {
$url_argv = "http://$url_argv";
}
// check that the arg is not empty
if ($url_argv!="") {
//initiate the results array
$results = array();
//initiate the class
$lt = new PageLoad();
//set the page to check the loadtime
$lt->setURL($url_argv);
//load the page
if ($lt->doPageLoad()) {
//load the page stats into the results array
$results = $lt->getPageLoadStats();
} else {
//do nothing
print "";
}
//print out the results
if (is_array($results)) {
//expecting only one record as we only passed in 1 page.
$output = $results;
print "dns:".$output['namelookup_time'];
print " con:".$output['connect_time'];
print " pre:".$output['pretransfer_time'];
print " str:".$output['starttransfer_time'];
print " ttl:".$output['total_time'];
print " sze:".$output['size_download'];
print " spd:".$output['speed_download'];
} else {
//do nothing
print "";
}
}
} else {
//do nothing
print "";
}
?>
Thank you. any type of assistance is greatly appreciated.

PHP Get metadata of remote .mp3 file (from URL)

I am trying to get song name / artist name / song length / bitrate etc from a remote .mp3 file such as http://shiro-desu.com/scr/11.mp3 .
I have tried getID3 script but from what i understand it doesn't work for remote files as i got this error: "Remote files are not supported - please copy the file locally first"
Also, this code:
<?php
$tag = id3_get_tag( "http://shiro-desu.com/scr/11.mp3" );
print_r($tag);
?>
did not work either.
"Fatal error: Call to undefined function id3_get_tag() in /home4/shiro/public_html/scr/index.php on line 2"

As you haven't mentioned your error I am considering a common error case undefined function
The error you get (undefined function) means the ID3 extension is not enabled in your PHP configuration:
If you dont have Id3 extension file .Just check here for installation info.

Firstly, I didn’t create this, I’ve just making it easy to understand with a full example.
You can read more of it here, but only because of archive.org.
https://web.archive.org/web/20160106095540/http://designaeon.com/2012/07/read-mp3-tags-without-downloading-it/
To begin, download this library from here: http://getid3.sourceforge.net/
When you open the zip folder, you’ll see ‘getid3’. Save that folder in to your working folder.
Next, create a folder called “temp” in that working folder that the following script is going to be running from.
Basically, what it does is download the first 64k of the file, and then read the metadata from the file.
I enjoy a simple example. I hope this helps.
<?php
require_once("getid3/getid3.php");
$url_media = "http://example.com/myfile.mp3"
$a=getfileinfo($url_media);
echo"<pre>";
echo $a['tags']['id3v2']['album'][0] . "\n";
echo $a['tags']['id3v2']['artist'][0] . "\n";
echo $a['tags']['id3v2']['title'][0] . "\n";
echo $a['tags']['id3v2']['year'][0] . "\n";
echo $a['tags']['id3v2']['year'][0] . "\n";
echo "\n-----------------\n";
//print_r($a['tags']['id3v2']['album']);
echo "-----------------\n";
//print_r($a);
echo"</pre>";
function getfileinfo($remoteFile)
{
$url=$remoteFile;
$uuid=uniqid("designaeon_", true);
$file="temp/".$uuid.".mp3";
$size=0;
$ch = curl_init($remoteFile);
//==============================Get Size==========================//
$contentLength = 'unknown';
$ch1 = curl_init($remoteFile);
curl_setopt($ch1, CURLOPT_NOBODY, true);
curl_setopt($ch1, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch1, CURLOPT_HEADER, true);
curl_setopt($ch1, CURLOPT_FOLLOWLOCATION, true); //not necessary unless the file redirects (like the PHP example we're using here)
$data = curl_exec($ch1);
curl_close($ch1);
if (preg_match('/Content-Length: (\d+)/', $data, $matches)) {
$contentLength = (int)$matches[1];
$size=$contentLength;
}
//==============================Get Size==========================//
if (!$fp = fopen($file, "wb")) {
echo 'Error opening temp file for binary writing';
return false;
} else if (!$urlp = fopen($url, "r")) {
echo 'Error opening URL for reading';
return false;
}
try {
$to_get = 65536; // 64 KB
$chunk_size = 4096; // Haven't bothered to tune this, maybe other values would work better??
$got = 0; $data = null;
// Grab the first 64 KB of the file
while(!feof($urlp) && $got < $to_get) { $data = $data . fgets($urlp, $chunk_size); $got += $chunk_size; } fwrite($fp, $data); // Grab the last 64 KB of the file, if we know how big it is.
if ($size > 0) {
curl_setopt($ch, CURLOPT_FILE, $fp);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RESUME_FROM, $size - $to_get);
curl_exec($ch);
}
// Now $fp should be the first and last 64KB of the file!!
#fclose($fp);
#fclose($urlp);
}
catch (Exception $e) {
#fclose($fp);
#fclose($urlp);
echo 'Error transfering file using fopen and cURL !!';
return false;
}
$getID3 = new getID3;
$filename=$file;
$ThisFileInfo = $getID3->analyze($filename);
getid3_lib::CopyTagsToComments($ThisFileInfo);
unlink($file);
return $ThisFileInfo;
}
?>

Is there an alternative to file_get_contents?

Is there an alternative to file_get_contents? This is the code I'm having issues with:
if ( !$img = file_get_contents($imgurl) ) { $error[] = "Couldn't find the file named $card.$format at $defaultauto"; }
else {
if ( !file_put_contents($filename,$img) ) { $error[] = "Failed to upload $filename"; }
else { $success[] = "All missing cards have been uploaded"; }
}
I tried using cURL but couldn't quite figure out how to accomplish what this is accomplishing. Any help is appreciated!

There are many alternatives to file_get_contents I've posted a couple of alternatives below.
fopen
function fOpenRequest($url) {
$file = fopen($url, 'r');
$data = stream_get_contents($file);
fclose($file);
return $data;
}
$fopen = fOpenRequest('https://www.example.com');// This returns the data using fopen.
curl
function curlRequest($url) {
$c = curl_init();
curl_setopt($c, CURLOPT_URL, $url);
curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
$data = curl_exec($c);
curl_close($c);
return $data;
}
$curl = curlRequest('https://www.example.com');// This returns the data using curl.
You could use one of these available options with the data stored in a variable to preform what you need to.

Using curl as an alternative to fopen file resource for fgetcsv

Is it possible to make curl, access a url and the result as a file resource? like how fopen does it.
My goals:
Parse a CSV file
Pass it to fgetcsv
My obstruction: fopen is disabled
My chunk of codes (in fopen)
$url = "http://download.finance.yahoo.com/d/quotes.csv?s=USDEUR=X&f=sl1d1t1n&e=.csv";
$f = fopen($url, 'r');
print_r(fgetcsv($f));
Then, I am trying this on curl.
$curl = curl_init();
curl_setopt($curl, CURLOPT_VERBOSE, true);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, false);
curl_setopt($curl, CURLOPT_POST, true);
curl_setopt($curl, CURLOPT_POSTFIELDS, $param);
curl_setopt($curl, CURLOPT_URL, $url);
$content = #curl_exec($curl);
curl_close($curl);
But, as usual. $content already returns a string.
Now, is it possible for curl to return it as a file resource pointer? just like fopen? Using PHP < 5.1.x something. I mean, not using str_getcsv, since it's only 5.3.
My error
Warning: fgetcsv() expects parameter 1 to be resource, boolean given
Thanks

Assuming that by fopen is disabled you mean "allow_url_fopen is disabled", a combination of CURLOPT_FILE and php://temp make this fairly easy:
$f = fopen('php://temp', 'w+');
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_FILE, $f);
// Do you need these? Your fopen() method isn't a post request
// curl_setopt($curl, CURLOPT_POST, true);
// curl_setopt($curl, CURLOPT_POSTFIELDS, $param);
curl_exec($curl);
curl_close($curl);
rewind($f);
while ($line = fgetcsv($f)) {
print_r($line);
}
fclose($f);
Basically this creates a pointer to a "virtual" file, and cURL stores the response in it. Then you just reset the pointer to the beginning and it can be treated as if you had opened it as usual with fopen($url, 'r');

You can create a temporary file using fopen() and then fwrite() the contents into it. After that, the newly created file will be readable by fgetcsv(). The tempnam() function should handle the creation of arbitrary temporary files.
According to the comments on str_getcsv(), users without access to the command could try the function below. There are also various other approaches in the comments, make sure you check them out.
function str_getcsv($input, $delimiter = ',', $enclosure = '"', $escape = '\\', $eol = '\n') {
if (is_string($input) && !empty($input)) {
$output = array();
$tmp = preg_split("/".$eol."/",$input);
if (is_array($tmp) && !empty($tmp)) {
while (list($line_num, $line) = each($tmp)) {
if (preg_match("/".$escape.$enclosure."/",$line)) {
while ($strlen = strlen($line)) {
$pos_delimiter = strpos($line,$delimiter);
$pos_enclosure_start = strpos($line,$enclosure);
if (
is_int($pos_delimiter) && is_int($pos_enclosure_start)
&& ($pos_enclosure_start < $pos_delimiter)
) {
$enclosed_str = substr($line,1);
$pos_enclosure_end = strpos($enclosed_str,$enclosure);
$enclosed_str = substr($enclosed_str,0,$pos_enclosure_end);
$output[$line_num][] = $enclosed_str;
$offset = $pos_enclosure_end+3;
} else {
if (empty($pos_delimiter) && empty($pos_enclosure_start)) {
$output[$line_num][] = substr($line,0);
$offset = strlen($line);
} else {
$output[$line_num][] = substr($line,0,$pos_delimiter);
$offset = (
!empty($pos_enclosure_start)
&& ($pos_enclosure_start < $pos_delimiter)
)
?$pos_enclosure_start
:$pos_delimiter+1;
}
}
$line = substr($line,$offset);
}
} else {
$line = preg_split("/".$delimiter."/",$line);
/*
* Validating against pesky extra line breaks creating false rows.
*/
if (is_array($line) && !empty($line[0])) {
$output[$line_num] = $line;
}
}
}
return $output;
} else {
return false;
}
} else {
return false;
}
}

We Keep Coding

PHP, A popular general-purpose scripting language that is especially suited to web development.

Get any site data in .txt file using curl in php? - php

Related

How to download google search big image using php script?

Using cURL and PHP for CACTI in Windows

PHP Get metadata of remote .mp3 file (from URL)

Is there an alternative to file_get_contents?

Using curl as an alternative to fopen file resource for fgetcsv

Categories

Resources