Load All TXT Files In An External Directory - php

I need to load all the .txt files in http://orcahub.com/unchecked-proxy-list/ and combine them into one .txt file on my server, which is a different server from Orcahub.
For some reason it won't work. I can't even get the HTML to run a regex on it.
What I tried:
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://orcahub.com/unchecked-proxy-list');
curl_setopt($ch, CURLOPT_HEADER, FALSE);
curl_setopt($ch, CURLOPT_NOBODY, FALSE); // FALSE keeps the body in the response
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
$st = curl_exec($ch);
//curl_close($ch);
//preg_match_all("/(.*\.txt)/", $st, $out);
var_dump($ch);
?>
UPDATE:
New issue, I get a Server Error 500 when I use the following script:
UPDATE: Found out this issue was from a newline after the URL.
<?php
function disguise_curl($url) {
//Prepare Curl;
$curl = curl_init();
//Setup Headers (Firefox 2.0.0.6);
$header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,";
$header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
$header[] = "Cache-Control: max-age=0";
$header[] = "Connection: keep-alive";
$header[] = "Keep-Alive: 300";
$header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$header[] = "Accept-Language: en-us,en;q=0.5";
$header[] = "Pragma: ";
//Setup Curl;
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_USERAGENT, 'Googlebot/2.1 (+http://www.google.com/bot.html)');
curl_setopt($curl, CURLOPT_HTTPHEADER, $header);
curl_setopt($curl, CURLOPT_REFERER, 'http://orcahub.com/unchecked-proxy-list/');
curl_setopt($curl, CURLOPT_ENCODING, 'gzip,deflate');
curl_setopt($curl, CURLOPT_AUTOREFERER, true);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_TIMEOUT, 60);
//Execute Curl;
$html = curl_exec($curl);
//End Curl;
curl_close($curl);
//Output the HTML;
return $html;
}
function rem_href($x) { return substr(strstr($x, '>'), strlen('>')); }
$response = disguise_curl('http://orcahub.com/unchecked-proxy-list/');
preg_match_all("/<a[\s]+[^>]*?href[\s]?=[\s\"\']+"."(.*?)[\"\']+.*?>"."([^<]+|.*?)?<\/a>/", $response, $matches, PREG_SET_ORDER );
foreach($matches as $value) {
$proxylists[] = 'http://orcahub.com/unchecked-proxy-list/'.rem_href($value[0]);
};
echo $proxylists[0];
$response = disguise_curl($proxylists[0]);
//Server Error 500 Here;
echo $response;
?>

I came across a function on php.net that adds headers to disguise the call; I added a regex to parse the response:
function disguise_curl($url)
{
$curl = curl_init();
// Setup headers - I used the same headers from Firefox version 2.0.0.6
// below was split up because php.net said the line was too long. :/
$header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,";
$header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
$header[] = "Cache-Control: max-age=0";
$header[] = "Connection: keep-alive";
$header[] = "Keep-Alive: 300";
$header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$header[] = "Accept-Language: en-us,en;q=0.5";
$header[] = "Pragma: "; // browsers keep this blank.
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_USERAGENT, 'Googlebot/2.1 (+http://www.google.com/bot.html)');
curl_setopt($curl, CURLOPT_HTTPHEADER, $header);
curl_setopt($curl, CURLOPT_REFERER, 'http://www.google.com');
curl_setopt($curl, CURLOPT_ENCODING, 'gzip,deflate');
curl_setopt($curl, CURLOPT_AUTOREFERER, true);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_TIMEOUT, 10);
$html = curl_exec($curl); // execute the curl command
curl_close($curl); // close the connection
return $html; // and finally, return $html
}
$response = disguise_curl('http://orcahub.com/unchecked-proxy-list/');
preg_match_all("/<a[\s]+[^>]*?href[\s]?=[\s\"\']+"."(.*?)[\"\']+.*?>"."([^<]+|.*?)?<\/a>/", $response, $matches, PREG_SET_ORDER );
foreach($matches as $value) {
var_dump($value);
};
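The regex above depends on attribute order and quoting. As an alternative, here is a minimal sketch that collects the links with PHP's DOMDocument and trims each href, since a stray newline in the URL caused the 500 error mentioned in the question. It assumes the directory page is a plain HTML index with relative links ending in .txt; the function name extract_txt_links and the sample markup are hypothetical.

```php
<?php
// Hypothetical helper: collect absolute URLs for every .txt link in an HTML page.
function extract_txt_links(string $html, string $baseUrl): array
{
    $doc = new DOMDocument();
    // Suppress warnings caused by sloppy real-world markup.
    @$doc->loadHTML($html);

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        // trim() removes stray whitespace or newlines around the href.
        $href = trim($a->getAttribute('href'));
        if (preg_match('/\.txt$/i', $href)) {
            // Naive join: assumes the href is relative to $baseUrl.
            $links[] = rtrim($baseUrl, '/') . '/' . ltrim($href, '/');
        }
    }
    return $links;
}

// Sample markup standing in for the directory listing (an assumption).
$sample = '<html><body><a href="../">Parent</a>'
        . '<a href="list1.txt">list1.txt</a>'
        . "<a href=\"list2.txt\n\">list2.txt</a></body></html>";

$links = extract_txt_links($sample, 'http://orcahub.com/unchecked-proxy-list/');
print_r($links);
```

Each URL in $links could then be fetched with disguise_curl() and appended to a single file, for example with file_put_contents() and the FILE_APPEND flag.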

Related

cURL works on localhost but not on the server

Right now I use cURL to get html5player.setVideoUrlLow, and it works, but the quality is poor. I need html5player.setVideoUrlHigh, but that parameter doesn't appear in the cURL response when I run the script from the server. On localhost it works fine. What am I missing in my code?
I already tried different CURLOPT_USERAGENT values and got the same problem. Thank you!
<?php
function getstring($string,$start,$end)
{
$str = explode($start,$string);
$str = explode($end,$str[1]);
return $str[0];
}
$viewkey = $_GET['viewkey']; // https://mypage.com/view_video.php?viewkey=54501623
$url = "http://www.xvideos.com/embedframe/".$viewkey."";
// $url = "http://www.xvideos.com/video".$viewkey.""; // alternate but same result //
$curl = curl_init();
$header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,";
$header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
$header[] = "Cache-Control: max-age=0";
$header[] = "Connection: keep-alive";
$header[] = "Keep-Alive: 300";
$header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$header[] = "Accept-Language: en-us,en;q=0.5";
$header[] = "Pragma: ";
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_USERAGENT, "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) Ap");
curl_setopt($curl, CURLOPT_HTTPHEADER, $header);
curl_setopt($curl, CURLOPT_REFERER, "http://www.google.com/bot.html");
curl_setopt($curl, CURLOPT_ENCODING, "gzip,deflate");
curl_setopt($curl, CURLOPT_AUTOREFERER, true);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_TIMEOUT, 30);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION,true);
$html = curl_exec($curl);
curl_close($curl);
$VideoUrlLow=getstring($html,"html5player.setVideoUrlLow('","');");
$VideoUrlHigh=getstring($html,"html5player.setVideoUrlHigh('","');");
if($VideoUrlHigh!="")
{
$mp4 = $VideoUrlHigh; // empty on server but work in localhost
} else
{
$mp4 = $VideoUrlLow;
}
header('Location: '.$mp4);
?>
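Note that getstring() above reads $str[1] without checking that the start marker was found, so a missing setVideoUrlHigh silently becomes an empty string (plus a warning). A minimal sketch of a variant that returns null when either marker is absent, which makes it easier to log what the server actually received. The name string_between and the sample input are hypothetical.

```php
<?php
// Hypothetical helper: return the text between $start and $end, or null if either is missing.
function string_between(string $haystack, string $start, string $end): ?string
{
    $from = strpos($haystack, $start);
    if ($from === false) {
        return null; // start marker not present
    }
    $from += strlen($start);
    $to = strpos($haystack, $end, $from);
    if ($to === false) {
        return null; // end marker not present after the start marker
    }
    return substr($haystack, $from, $to - $from);
}

// Made-up response containing only the low-quality URL.
$sample = "html5player.setVideoUrlLow('http://example.com/low.mp4');";

$low  = string_between($sample, "html5player.setVideoUrlLow('", "');");
$high = string_between($sample, "html5player.setVideoUrlHigh('", "');");

// Fall back to the low-quality URL when the high-quality one is absent.
$mp4 = $high ?? $low;
var_dump($mp4);
```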

PHP cURL does not work on this specific URL

I can't find a way to fetch this URL with PHP cURL:
http://www.bvger.ch/publiws/pub/cache.jsf?displayName=A-1695/2006&decisionDate=2007-02-27
Can any of you help me? I tried many approaches without success. For example:
FUNCTION get_data2($url)
{
$curl = curl_init();
$header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,";
$header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
$header[] = "Cache-Control: max-age=0";
$header[] = "Connection: keep-alive";
$header[] = "Keep-Alive: 300";
$header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$header[] = "Accept-Language: en-us,en;q=0.5";
$header[] = "Pragma: "; // browsers keep this blank.
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($curl, CURLOPT_USERAGENT, 'Googlebot/2.1 (+http://www.google.com/bot.html)');
curl_setopt($curl, CURLOPT_HTTPHEADER, $header);
curl_setopt($curl, CURLOPT_REFERER, 'http://www.google.com/bot.html');
curl_setopt($curl, CURLOPT_ENCODING, 'gzip,deflate');
curl_setopt($curl, CURLOPT_AUTOREFERER, true);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_TIMEOUT, 10);
$html = curl_exec($curl); // execute the curl command
curl_close($curl); // close the connection
return $html; // and finally, return $html
}
OR
FUNCTION get_data1($url)
{
$ch = curl_init();
$timeout = 5;
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.7.3) Gecko/20041001 Firefox/0.10.1" );
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$data = curl_exec($ch);
curl_close($ch);
return $data;
}
Both return nothing when echoed.
This worked fine for me...
function get_data2($url)
{
$curl = curl_init();
$header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,";
$header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
$header[] = "Cache-Control: max-age=0";
$header[] = "Connection: keep-alive";
$header[] = "Keep-Alive: 300";
$header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$header[] = "Accept-Language: en-us,en;q=0.5";
$header[] = "Pragma: "; // browsers keep this blank.
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($curl, CURLOPT_USERAGENT, 'Googlebot/2.1 (+http://www.google.com/bot.html)');
curl_setopt($curl, CURLOPT_HTTPHEADER, $header);
curl_setopt($curl, CURLOPT_REFERER, 'http://www.google.com/bot.html');
curl_setopt($curl, CURLOPT_ENCODING, 'gzip,deflate');
curl_setopt($curl, CURLOPT_AUTOREFERER, true);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_TIMEOUT, 10);
$html = curl_exec($curl); // execute the curl command
curl_close($curl); // close the connection
return $html; // and finally, return $html
}
$url = 'http://www.bvger.ch/publiws/pub/cache.jsf?displayName=A-1695/2006&decisionDate=2007-02-27';
echo get_data2($url);
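Don't capitalize "function"; PHP accepts it, but lowercase is the convention. Also, you must have been copying and pasting and missed one line: the CURLOPT_FOLLOWLOCATION call in your get_data2 uses $ch instead of $curl, so that option is never applied to the handle.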
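One way to avoid this kind of handle mix-up is to collect the options in a single array and apply them with curl_setopt_array(), then check the return value of curl_exec(). A minimal sketch under those assumptions; build_curl_options and fetch_page are hypothetical names, and the option set is trimmed for brevity.

```php
<?php
// Hypothetical helper: all options live in one array, so none can be set on the wrong variable.
function build_curl_options(string $url): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_ENCODING       => 'gzip,deflate',
        CURLOPT_TIMEOUT        => 10,
    ];
}

// Hypothetical helper: fetch a page and raise an error instead of returning false silently.
function fetch_page(string $url): string
{
    $curl = curl_init();
    curl_setopt_array($curl, build_curl_options($url));
    $html = curl_exec($curl);
    if ($html === false) {
        $error = curl_error($curl);
        curl_close($curl);
        throw new RuntimeException("cURL failed: $error");
    }
    curl_close($curl);
    return $html;
}

$url = 'http://www.bvger.ch/publiws/pub/cache.jsf?displayName=A-1695/2006&decisionDate=2007-02-27';
$options = build_curl_options($url);
var_dump($options[CURLOPT_FOLLOWLOCATION]);
// Calling fetch_page($url) would perform the actual request.
```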

PHP Curl Function doesn't work

I have a problem with a cURL function. I am fetching Instagram posts with cURL, but the function suddenly stopped working. I tried various changes but I can't make it work.
This is the function:
function cek($url) {
$curl = curl_init();
/**
* Setup headers - I used the same headers from Firefox version 2.0.0.6
* below was split up because php.net said the line was too long.
*/
$header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,";
$header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
$header[] = "Cache-Control: max-age=0";
$header[] = "Connection: keep-alive";
$header[] = "Keep-Alive: 300";
$header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$header[] = "Accept-Language: en-us,en;q=0.5";
$header[] = "Pragma: "; // browsers keep this blank.
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_USERAGENT, 'Googlebot/2.1 (+http://www.google.com/bot.html)');
curl_setopt($curl, CURLOPT_HTTPHEADER, $header);
curl_setopt($curl, CURLOPT_REFERER, 'http://www.google.com');
curl_setopt($curl, CURLOPT_HEADER, 1);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, false);
curl_setopt($curl, CURLOPT_MAXREDIRS, 10); /* Max redirection to follow */
curl_setopt($curl, CURLOPT_ENCODING, 'gzip,deflate');
curl_setopt($curl, CURLOPT_AUTOREFERER, true);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_TIMEOUT, 30);
$html = curl_exec($curl); // execute the curl command
curl_close($curl); // close the connection
return $html; // and finally, return $html
}
function get_curl($url) {
if(function_exists('curl_init')) {
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,$url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
$output = curl_exec($ch);
echo curl_error($ch);
curl_close($ch);
return $output;
} else {
return file_get_contents($url);
}
}
and this is where I call the function:
#$var = cek('https://api.instagram.com/v1/tags/'.instagram_tag.'/media/recent?client_id='.intagram_clientid.'&count=2');
$var = get_curl('https://api.instagram.com/v1/tags/'.instagram_tag.'/media/recent?max_tag_id=11&min_tag_id=1&access_token='.intagram_token.'&callback=?');
$cek = json_decode($var);
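Since curl_exec() returns false on failure and json_decode() returns null for a body that is not valid JSON, checking both helps narrow down where the call breaks. A minimal sketch; decode_response is a hypothetical name and the sample bodies are made up. The second sample assumes the API wraps the payload JSONP-style when a callback parameter is sent, which json_decode() cannot parse.

```php
<?php
// Hypothetical helper: report why a response body could not be decoded.
function decode_response($body): array
{
    if ($body === false || $body === '') {
        return ['ok' => false, 'error' => 'empty response (check curl_error())', 'data' => null];
    }
    $data = json_decode($body);
    if (json_last_error() !== JSON_ERROR_NONE) {
        return ['ok' => false, 'error' => json_last_error_msg(), 'data' => null];
    }
    return ['ok' => true, 'error' => null, 'data' => $data];
}

// Made-up bodies: plain JSON, then a JSONP-style wrapper.
$good = decode_response('{"data":[{"id":"1"}]}');
$bad  = decode_response('callback({"data":[]})');

var_dump($good['ok'], $bad['ok'], $bad['error']);
```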

Fetching the meta title of a Facebook profile through PHP

I'm trying to fetch the meta title of a website with the PHP function below.
It works for all websites except Facebook, where I get this title instead:
"Update Your Browser | Facebook"
function fetchMeta($url)
{
$ch = curl_init();
$header = array();
$header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,";
$header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
$header[] = "Cache-Control: max-age=0";
$header[] = "Connection: keep-alive";
$header[] = "Keep-Alive: 300";
$header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$header[] = "Accept-Language: en-us,en;q=0.5";
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_URL, $url);
$data = curl_exec($ch);
curl_close($ch);
return $data;
}
$sites_html = fetchMeta("https://www.facebook.com/stackexchange");
echo $sites_html;
What do I need to change to make it work?
Try sending a user agent with the request; right now it's blank (or cURL's default):
curl_setopt($ch, CURLOPT_USERAGENT, "Proper user agent string, maybe your browsers?");
Facebook decides you are using an unsupported or old browser because it can't detect a modern user agent.
Try this:
<?php
// Fetching the meta title of a Facebook profile through PHP
ini_set("user_agent","facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"); // EXTRA
function fetchMeta($url)
{
$ch = curl_init();
$timeout = 5; // EXTRA
$header = array();
$header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,";
$header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
$header[] = "Cache-Control: max-age=0";
$header[] = "Connection: keep-alive";
$header[] = "Keep-Alive: 300";
$header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$header[] = "Accept-Language: en-us,en;q=0.5";
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
//curl_setopt($ch, CURLOPT_HEADER, FALSE);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_URL, $url);
// EXTRA
curl_setopt($ch, CURLOPT_USERAGENT, "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)");
curl_setopt($ch, CURLOPT_REFERER, "http://facebook.com");
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); // SSL TRUE or FALSE
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_POST, FALSE);
// EXTRA
$data = curl_exec($ch);
curl_close($ch);
return $data;
}
$sites_html = fetchMeta("https://www.facebook.com/albdroid.official"); //Albdroid Test <Working OK>
//$sites_html = fetchMeta("https://www.facebook.com/stackexchange"); // stackexchange Test <Working OK>
echo $sites_html;
?>
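The functions above return the whole page; the title still has to be pulled out of it. A minimal sketch using DOMDocument, assuming the returned HTML contains a title element. The name extract_title and the sample markup are hypothetical.

```php
<?php
// Hypothetical helper: return the text of the first <title> element, or null if there is none.
function extract_title(string $html): ?string
{
    if ($html === '') {
        return null; // loadHTML() rejects an empty string
    }
    $doc = new DOMDocument();
    // Suppress warnings caused by sloppy real-world markup.
    @$doc->loadHTML($html);

    $titles = $doc->getElementsByTagName('title');
    if ($titles->length === 0) {
        return null;
    }
    return trim($titles->item(0)->textContent);
}

// Made-up page standing in for the fetched HTML.
$sample = '<html><head><title> Update Your Browser | Facebook </title></head><body></body></html>';
$title = extract_title($sample);
var_dump($title);
```

With the functions above, this would be called as extract_title(fetchMeta($url)).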

php curl is not fetching the data from youtube API [duplicate]

Possible Duplicate:
Php - Debugging Curl
I am using cURL to fetch user data from the YouTube API.
The code is:
$url_userInfo = 'https://gdata.youtube.com/feeds/api/users/default?access_token=ya29.AHES6ZS7GMdZf91LbMtoOdhFSFOpTuHHT-t7pSggAp-tS0A';
print_r($url_userInfo);
$ch = curl_init($url_userInfo);
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_TIMEOUT, '3');
$content = curl_exec($ch);
curl_close($ch);
print_r( $content);
If I visit this URL manually, it displays the data as XML, but $content is empty when printed.
Is there a problem with the code?
That's actually an HTTPS link, i.e. it uses SSL, so you need to get cacert.pem and set up cURL for SSL to make it work.
You can get the certificate here!
and you would set it up like so:
$curl = curl_init();
$browser = $_SERVER['HTTP_USER_AGENT'];
$header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,";
$header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
$header[] = "Cache-Control: max-age=0";
$header[] = "Connection: keep-alive";
$header[] = "Keep-Alive: 300";
$header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$header[] = "Accept-Language: en-us,en;q=0.5";
$header[] = "Pragma: ";
curl_setopt($curl, CURLOPT_FAILONERROR,true);
curl_setopt($curl, CURLOPT_USERAGENT, $browser);
curl_setopt($curl, CURLOPT_HTTPHEADER, $header);
curl_setopt($curl, CURLOPT_REFERER, 'http://www.google.com');
curl_setopt($curl, CURLOPT_ENCODING, 'gzip,deflate');
curl_setopt($curl, CURLOPT_AUTOREFERER, false);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 2);
curl_setopt($curl, CURLOPT_TIMEOUT, 20);
curl_setopt($curl, CURLOPT_HTTPAUTH, CURLAUTH_ANY); // HTTP authentication methods to allow; not part of SSL verification
curl_setopt($curl, CURLOPT_CAINFO, "/scripts/cacert.pem"); //path to file
curl_setopt($curl, CURLOPT_URL, $url_userInfo);
$content = curl_exec($curl);
echo curl_error($curl); //display errors under development
curl_close($curl);
print_r( $content );
This sends a proper user agent and verifies the SSL connection against a CA certificate bundle, just like the browser would.
