Can't scrape data from an HTTPS website in PHP

I am trying to get some country names from a website. The website URL starts with https, and I am not able to scrape the data. Please suggest a solution.
Here is my code:
$curl = curl_init('https://testing.co/india');
curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
$page = curl_exec($curl);
if (curl_errno($curl)) {
echo 'Scraper error: ' . curl_error($curl);
exit;
}
curl_close($curl);
$regex = '/<a class="startup-link">(.*?)<\/a>/s';
if (preg_match($regex, $page, $list))
echo $list[0];
else
print "Not found";
I get this error: Scraper error: SSL certificate problem: unable to get local issuer certificate

I ran into this problem today and worked out a solution.
The code below works for me.
$url = "https://www.google.co.in/?gws_rd=ssl";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
// Set so curl_exec returns the result instead of outputting it.
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// Verify the server certificate against the exported CA certificate.
// (With CURLOPT_SSL_VERIFYPEER set to false, CURLOPT_CAINFO would be ignored.)
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2);
curl_setopt($ch, CURLOPT_CAINFO, getcwd() . "/GeoTrustGlobalCA.crt");
// Get the response and close the channel.
$response = curl_exec($ch);
$link = fopen("data.txt", "w+");
fputs($link, $response);
fclose($link);
curl_close($ch);
You have to pass a CA certificate to cURL.
In Mozilla Firefox, click the information icon to the left of the website URL, open the Security tab, and choose View Certificate. Then open the Details tab.
In the Certificate Hierarchy section, click the topmost entry; an Export option appears below it. Export that certificate and save the CA certificate to a location of your choice, making sure to select X.509 Certificate (PEM) as the save format.
For example:
curl_setopt($ch, CURLOPT_CAINFO, getcwd() . "/GeoTrustGlobalCA.crt");
Now save the script and run it. You will get the data.
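The steps above can be sketched as a small helper that refuses to run with a missing CA file, so a wrong path fails loudly instead of surfacing as a certificate error. The function name `verified_curl_options` is hypothetical, and the certificate file name is simply the one used in this answer; treat it as a sketch, not a definitive setup.

```php
<?php
// Build cURL options that keep certificate verification ON and point
// cURL at an exported CA certificate (PEM format).
// verified_curl_options() is a hypothetical helper name.
function verified_curl_options(string $caFile): array
{
    if (!is_readable($caFile)) {
        throw new InvalidArgumentException("CA file not readable: $caFile");
    }
    return [
        CURLOPT_RETURNTRANSFER => true,   // return the body instead of printing it
        CURLOPT_SSL_VERIFYPEER => true,   // check the certificate chain
        CURLOPT_SSL_VERIFYHOST => 2,      // check that the certificate matches the host
        CURLOPT_CAINFO         => $caFile,
    ];
}

// Usage (performs a network request, so it is left commented out):
// $ch = curl_init('https://testing.co/india');
// curl_setopt_array($ch, verified_curl_options(__DIR__ . '/GeoTrustGlobalCA.crt'));
// $page = curl_exec($ch);
```

Using `__DIR__` rather than `getcwd()` ties the path to the script's own directory, which does not change with the working directory of the process.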

Use:
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);
Note that this turns off certificate verification, so it is only suitable for testing.

Related

PHP: how to check if an Openload video exists?

I need a way to check whether an Openload video still exists. Videos are sometimes removed after a DMCA report, and I just want to display the links that no longer work.
Here is a sketch of what I want:
$result = mysqli_query($db, "SELECT videos FROM table");
while ($row = mysqli_fetch_assoc($result)) {
$embedUrl = $row["videos"];
// I want to show only the URLs that are not working
if($embedUrl == false)
echo $embedUrl;
}
An example of a non-working link is here.
Try this. Outputs: 'Video unavailable' if a video doesn't exist.
See comments for step-by-step explanation.
<?php
// Your Openload URL
$url = 'https://openload.co/embed/UgmaOAo1wlg/Horrible.Bosses.2.2014.720p.BluRay.x264.YIFY.mp4';
// Initialize cURL library.
if (($curl = curl_init()) === FALSE)
{
// No handle exists at this point, so curl_errno() cannot be called here.
throw new RuntimeException('curl_init() failed');
}
// Tell cURL which URL to operate on. GET is the default method.
curl_setopt($curl, CURLOPT_URL, $url);
// Optionally specify a path to a certificate store in PEM format.
// curl_setopt($curl, CURLOPT_CAINFO, __DIR__ . '/cacert.pem');
// Given Openload URL is requested over https. Allow for some sanity checking.
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, TRUE);
// Require TLS 1.2, the latest version supported by PHP at the time of this answer.
curl_setopt($curl, CURLOPT_SSLVERSION, CURL_SSLVERSION_TLSv1_2);
// Return response, so we can inspect its contents.
curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
// Openload returns HTTP code 200 if a video wasn't found. Any code >= 400 indicates a different problem.
curl_setopt($curl, CURLOPT_FAILONERROR, TRUE);
// Allow for server-side redirects.
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, TRUE);
// Don't include header in response.
curl_setopt($curl, CURLOPT_HEADER, FALSE);
if (($response = curl_exec($curl)) === FALSE)
throw new RuntimeException("curl_exec() failed for $url: " . curl_error($curl));
// Perform a case-insensitive search for a token that is specific to the 'video not found' page.
if (stripos($response, '<img class="image-blocked" src="/assets/img/blocked.png" alt="blocked">') !== FALSE)
echo 'Video unavailable';
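To tie this back to the loop in the question, here is a minimal sketch that collects only the dead links from a list of URLs. The marker string is the one used in the answer above; `is_unavailable_page`, `filter_unavailable`, and the `$fetch` callback are hypothetical names, and `$fetch` is assumed to wrap the cURL request shown above and return the page HTML.

```php
<?php
// Token that the answer above identifies as specific to the "video not found" page.
const BLOCKED_MARKER = '<img class="image-blocked" src="/assets/img/blocked.png" alt="blocked">';

// True when the HTML looks like the "video not found" page.
function is_unavailable_page(string $html): bool
{
    return stripos($html, BLOCKED_MARKER) !== false;
}

// Return only the URLs whose page contains the marker.
// $fetch is a hypothetical callback: it takes a URL and returns the page HTML.
function filter_unavailable(array $urls, callable $fetch): array
{
    $dead = [];
    foreach ($urls as $url) {
        if (is_unavailable_page($fetch($url))) {
            $dead[] = $url;
        }
    }
    return $dead;
}

// Usage idea: build $urls from the mysqli rows in the question, then
// foreach (filter_unavailable($urls, $fetch) as $url) { echo $url, "\n"; }
```

Passing the fetcher in as a callback keeps the marker check separate from the network code, so the check can be tried against saved HTML first.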

Does Marketo API block curl on a per account basis?

I am trying to connect to the Marketo.com REST API using curl.
I can't get a response from the identity service. I only get an error message,
"[curl] 6: Couldn't resolve host 'MY_CLIENT_ENDPOINT.mktorest.com'",
but I can print the constructed url and paste it into a browser address bar and this will provide the expected response with the access_token element.
I can use curl in PHP and in a terminal to access my Gmail account, so curl is able to access an HTTPS service.
I have tried sending the parameters in the curl URL as a GET request and also by declaring them with curl's -F option as a POST request.
My application uses dchesterton/marketo-rest-api, available on GitHub, but I have also tried a simple PHP curl request just to get the access token.
private function getToken() {
$url = "$this->client_url/identity/oauth/token?grant_type=client_credentials&client_id=$this->client_id&client_secret=$this->client_secret";
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$response = curl_exec($ch);
$errors = curl_error($ch);
curl_close($ch);
file_put_contents($this->logDir . 'access_token_response' . date('Y-m-d') . '.txt', $url . "\n" . $response . "\n", FILE_APPEND);
if ($errors) {
file_put_contents($this->logDir . 'access_token_errors' . date('Y-m-d') . '.txt', $errors . "\n", FILE_APPEND);
}
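// curl_exec() returns the body as a JSON string, so decode it before reading the token.
$data = json_decode($response, true);
return isset($data['access_token']) ? $data['access_token'] : null;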
}
Again, this fails with the same error, but it produces a well-formed URL that I can paste into the browser to get a valid response.
As with every other test mentioned, I have also tried POST instead of GET, both on my localhost and on a test server.
Can anyone explain why this would fail?
Does Marketo block curl on a per-account basis?
I was trying to implement something similar but my code wasn't working. I'm not sure exactly what is failing but I tried your code and it seems to work perfectly after some slight modifications:
$curl = curl_init($url);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_POSTFIELDS, json_encode($request_data));
curl_setopt($curl, CURLOPT_HTTPHEADER, array('Content-Type: application/json'));
$response = curl_exec($curl);
$errors = curl_error($curl);
curl_close($curl);
I hope this helps.
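cURL error 6 means the host name could not be resolved, which happens before any request reaches the server. As a rough way to narrow that down, the sketch below pulls the host out of the constructed URL and checks whether PHP itself can resolve it. `host_of` and `resolves` are hypothetical helper names, and the URL is the placeholder from the question.

```php
<?php
// Extract the host part of a URL, or null if the URL has no host
// (for example when the scheme is missing).
function host_of(string $url): ?string
{
    $host = parse_url($url, PHP_URL_HOST);
    return is_string($host) && $host !== '' ? $host : null;
}

// Ask the system resolver for the host. gethostbyname() returns its
// input unchanged when the lookup fails. Assumes $host is a name, not
// an IP literal.
function resolves(string $host): bool
{
    return gethostbyname($host) !== $host;
}

$url  = 'https://MY_CLIENT_ENDPOINT.mktorest.com/identity/oauth/token';
$host = host_of($url);
// var_export() makes stray whitespace or quotes in the host visible.
var_export($host);
echo "\n";

// Performs a DNS lookup, so it is left commented out:
// var_dump($host !== null && resolves($host));
```

If the browser resolves the name but this check does not, the difference is more likely in the machine's DNS configuration or in how the URL string is built than in anything on the API side.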

PHP curl post request to server using cloudflare (Full SSL) has SSL error and Blank SESSION Cookie

Hi, I'm building a website. Both of these files are on the same server and domain, and I'm using Cloudflare to speed up loading. I'm using the Full SSL option on Cloudflare because I bought my own GeoTrust SSL certificate for my server. I have already upgraded curl on the server to 7.41.0.
One PHP file contains the function.
Function file:
<?php
function get_content($session){
$endpoint = "https://sample.ph/php/resource.php";
// Use one of the parameter configurations listed at the top of the post
$params = array(
"yel" => $session
);
$curl = curl_init();
curl_setopt($curl,CURLOPT_URL,$endpoint);
$strCookie = 'PHPSESSID='.$_COOKIE['PHPSESSID'];
curl_setopt($curl, CURLOPT_COOKIE, $strCookie);
curl_setopt($curl, CURLOPT_POST, 1);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_VERBOSE, true);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, true);
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, 2);
$postData = "";
//This is needed to properly form post the credentials object
foreach($params as $k => $v)
{
$postData .= $k . '='.urlencode($v).'&';
}
$postData = rtrim($postData, '&');
curl_setopt($curl, CURLOPT_POSTFIELDS, $postData);
curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 60);
curl_setopt($curl, CURLOPT_HEADER, 0); // Don’t return the header, just the html
curl_setopt($curl, CURLOPT_CAINFO,"/home/sample/public_html/php/cacert.pem"); // Set the location of the CA-bundle
session_write_close();
$response = curl_exec($curl);
if ($response === FALSE) {
$response = "cURL Error: " . curl_error($curl);
}
// Close the handle before returning; a curl_close() placed after return would never run.
curl_close($curl);
return $response;
}
?>
Resource File
<?php
session_start();
if(isset($_POST['yel'])){
$drcyt_key = dcrypt("{$_POST['yel']}");
if($drcyt_key == $_SESSION['token']){
echo "Success";
}
}
?>
How can I fix this?
First, the SSL verification error. While debugging I sometimes get: cURL Error: SSL certificate problem, verify that the CA cert is OK. Details: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
Sometimes I get: cURL Error: SSL peer certificate or SSH remote key was not OK
When I set CURLOPT_SSL_VERIFYPEER to FALSE (which is not a good idea), a second problem appears: the session cookie is blank on first load.
I hope you can help me. Thank you.
This issue looks like an outdated certificate bundle or an outdated OpenSSL version on the server. Make sure the server has the latest root certificates and the latest version of OpenSSL (including the PHP OpenSSL module).
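A quick way to see what the server is actually running is to print the versions PHP reports. The sketch below is only a diagnostic aid; `tls_environment_report` is a hypothetical name, and the CA path is the one from the question.

```php
<?php
// Collect the TLS-related facts worth checking on the server.
// $caFile is whatever path is passed to CURLOPT_CAINFO.
function tls_environment_report(string $caFile): array
{
    $curl     = function_exists('curl_version') ? curl_version() : [];
    $readable = is_readable($caFile);
    return [
        // Version of libcurl that PHP is linked against.
        'curl_version'     => isset($curl['version']) ? $curl['version'] : null,
        // TLS library (and version) that libcurl uses.
        'curl_ssl_library' => isset($curl['ssl_version']) ? $curl['ssl_version'] : null,
        // OpenSSL version used by PHP's openssl extension, if loaded.
        'php_openssl'      => defined('OPENSSL_VERSION_TEXT') ? OPENSSL_VERSION_TEXT : null,
        'ca_file_readable' => $readable,
        // Last modification date of the CA bundle, as a hint to its age.
        'ca_file_modified' => $readable ? date('Y-m-d', filemtime($caFile)) : null,
    ];
}

print_r(tls_environment_report('/home/sample/public_html/php/cacert.pem'));
```

An old `ca_file_modified` date or an old TLS library in `curl_ssl_library` would point to the causes named in the answer above.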

Want to use cURL instead of simplexml_load_file()

Using the script below, I can parse the API data successfully.
$xml_report_daily=simplexml_load_file("https://api.sitename.com/api/reports/api_get.asp?User=00012345&Key=abcdefghijklmnop&fromDate=11/12/2014&toDate=12/12/2014&mid=25");
foreach ($xml_report_daily as $report_daily):
$trans_id=$report_daily->TRANSID;
$m_id=$report_daily->MID;
$ext_id=$report_daily->EXT;
$user_id=$report_daily->USER;
endforeach;
XML data are something like this:
<DATABASE>
<RECORD>
<TRANSID>1348818</TRANSID>
<MID/>
<EXT>0</EXT>
<USER>00012345</USER>
</RECORD>
.
.
.
so on...
</DATABASE>
But I want to use cURL instead of simplexml_load_file(), so I used the script below, and it returns no data.
$url = "https://api.sitename.com/api/reports/api_get.asp?User=00012345&Key=abcdefghijklmnop&fromDate=11/12/2014&toDate=12/12/2014&mid=25";
$ch = curl_init();
curl_setopt ($ch, CURLOPT_URL, $url);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, false);
$xml = curl_exec($ch);
echo $xml;
Please let me know what I am missing or doing wrong.
Thank you,
OK, here is my complete answer; I hope it will be useful to others.
I used two methods to read XML data from a specific link.
Method 1: Using simplexml_load_file(). allow_url_fopen must be On on the hosting server for this method to work. This method works on both my local machine and the live server.
$xml_report_daily=simplexml_load_file("https://api.sitename.com/api/reports/api_get.asp?User=00012345&Key=abcdefghijklmnop&fromDate=11/12/2014&toDate=12/12/2014&mid=25");
foreach ($xml_report_daily as $report_daily):
$trans_id=$report_daily->TRANSID;
$m_id=$report_daily->MID;
$ext_id=$report_daily->EXT;
$user_id=$report_daily->USER;
echo $trans_id." ".$m_id." ".$ext_id." ".$user_id."<br/>";
endforeach;
Method 2: Using cURL. After following the suggestion here, this method also works on both my local machine and the live server.
$url = "https://api.sitename.com/api/reports/api_get.asp?User=00012345&Key=abcdefghijklmnop&fromDate=11/12/2014&toDate=12/12/2014&mid=25";
$ch = curl_init();
curl_setopt ($ch, CURLOPT_URL, $url);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, false);
$xml = curl_exec($ch);
$xml_report_daily = simplexml_load_string($xml);
foreach ($xml_report_daily as $report_daily):
$trans_id=$report_daily->TRANSID;
$m_id=$report_daily->MID;
$ext_id=$report_daily->EXT;
$user_id=$report_daily->USER;
echo $trans_id." ".$m_id." ".$ext_id." ".$user_id."<br/>";
endforeach;
When using cURL I got no result data, so paul-crovella suggested that I check for errors. I used the script below and found that the problem was the HTTPS (SSL certificate) access, as Raffy Cortez also mentioned.
if(curl_exec($ch) === false)
{ echo 'Curl error: ' . curl_error($ch); }
else
{ echo 'Operation completed without any errors'; }
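The same idea applies to the parsing step: simplexml_load_string() returns false on malformed input, so it helps to check that as well. The sketch below works on an inline sample shaped like the XML in the question, so it runs without the API; `parse_records` is a hypothetical helper name.

```php
<?php
// Parse the <DATABASE><RECORD>...</RECORD></DATABASE> structure into arrays,
// throwing with libxml's messages when the XML is malformed.
function parse_records(string $xml): array
{
    // Collect libxml errors instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = simplexml_load_string($xml);
    if ($doc === false) {
        $messages = array_map(function ($e) {
            return trim($e->message);
        }, libxml_get_errors());
        libxml_clear_errors();
        libxml_use_internal_errors($previous);
        throw new RuntimeException('Invalid XML: ' . implode('; ', $messages));
    }
    libxml_use_internal_errors($previous);

    $rows = [];
    foreach ($doc->RECORD as $record) {
        // Cast to string so the values are plain text, not SimpleXMLElement objects.
        $rows[] = [
            'trans_id' => (string) $record->TRANSID,
            'm_id'     => (string) $record->MID,
            'ext_id'   => (string) $record->EXT,
            'user_id'  => (string) $record->USER,
        ];
    }
    return $rows;
}

$sample = '<DATABASE><RECORD><TRANSID>1348818</TRANSID><MID/>'
        . '<EXT>0</EXT><USER>00012345</USER></RECORD></DATABASE>';
$rows = parse_records($sample);
echo $rows[0]['trans_id'], ' ', $rows[0]['user_id'], "\n"; // prints "1348818 00012345"
```

In the cURL version, the string returned by curl_exec() would be passed to `parse_records()` only after the `=== false` check has succeeded.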
To resolve this HTTPS (SSL certificate) issue, the following link is very helpful; you can use any of the methods described there as needed.
HTTPS and SSL3_GET_SERVER_CERTIFICATE:certificate verify failed, CA is OK
Thank you,
You are calling an HTTPS URL with cURL, so you need to use:
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);

Unable to retrieve and save user’s display picture through cURL

I am trying to retrieve a user's display picture through the Graph API and save it to disk using cURL, but I am unable to succeed and get this error when checking the MIME type of the saved picture:
Notice: exif_imagetype(): Read error! in
//$userPpicture = $user_profile[picture];
//Create image instances
$url = "http://graph.facebook.com/{$userId}/picture?type=large";
$dpImage = 'temp/' . $userId . '_dpImage_' . rand().'.jpg';
echo $dpImage;
function get_data($url) {
$ch = curl_init();
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
$data = curl_exec($ch);
curl_close($ch);
return $data;
}
$returned_content = get_data($url);
file_put_contents($dpImage, $returned_content);
echo "Type: " . exif_imagetype($dpImage);
With this updated code, which uses curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);, I get this error:
Warning: curl_setopt(): CURLOPT_FOLLOWLOCATION cannot be activated when in safe_mode or an open_basedir is set in /var/fog/apps/app12345/myapp.phpfogapp.com/start.php on line 178
If this requires any server-side configuration, I might not be able to do it, as I am using shared cloud storage on phpfog.
Kindly help me with this.
Thank you.
The Graph URL you are using, http://graph.facebook.com/4/picture?type=large, returns an HTTP 302 redirect, not the actual user image. You need to follow the redirect and download the image from the target URL, which looks like this: http://profile.ak.fbcdn.net/hprofile-ak-snc4/49942_4_1525300_n.jpg
As OffBySome points out, you need to follow the 302 redirect served by graph.facebook.com to the final destination, which contains the actual image data.
The simplest way to do that in this case is to add another curl_setopt call with CURLOPT_FOLLOWLOCATION set to true, i.e.
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
Check out http://us3.php.net/manual/en/function.curl-setopt.php for more details.
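On hosts where CURLOPT_FOLLOWLOCATION is refused because of safe_mode or open_basedir, as in the warning quoted in the question, the redirect can be followed by hand. The sketch below is one possible approach, not the answerers' method: it reads the redirect target from curl_getinfo() with CURLINFO_REDIRECT_URL and requests it again. `curl_fetch_once` and `fetch_following_redirects` are hypothetical names.

```php
<?php
// Fetch one URL without following redirects.
// Returns ['body' => string, 'redirect' => string|null].
function curl_fetch_once(string $url): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException("curl_exec() failed for $url: " . curl_error($ch));
    }
    // With FOLLOWLOCATION off, cURL reports the redirect target here
    // (an empty string when the response was not a redirect).
    $redirect = curl_getinfo($ch, CURLINFO_REDIRECT_URL);
    // The handle is released when $ch goes out of scope.
    return ['body' => $body, 'redirect' => $redirect ?: null];
}

// Follow up to $maxHops redirects using $fetchOnce, then return the final body.
function fetch_following_redirects(string $url, callable $fetchOnce, int $maxHops = 5): string
{
    for ($hop = 0; $hop <= $maxHops; $hop++) {
        $result = $fetchOnce($url);
        if ($result['redirect'] === null) {
            return $result['body'];
        }
        $url = $result['redirect'];
    }
    throw new RuntimeException('Too many redirects');
}

// Usage (performs network requests, so it is left commented out):
// $data = fetch_following_redirects($url, 'curl_fetch_once');
// file_put_contents($dpImage, $data);
```

The hop limit guards against redirect loops, which CURLOPT_MAXREDIRS would otherwise handle when cURL follows redirects itself.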