CURL - While loop until captcha is successful? - php

Assume the captcha key is invalid: the script needs to download a new captcha image and re-validate the captcha key. How can that be done?
I have included a short example. Is this the right way to do it?
while (1) {
$postData = http_build_query($data);
$ch = curl_init();
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_CAINFO, getcwd() . "\**********************.crt");
curl_setopt($ch, CURLOPT_URL, "https://domain.com/test" . $form_link);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $postData);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLOPT_COOKIESESSION, TRUE);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookiesPath . "/cookiefile.txt");
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookiesPath . "/cookiefile.txt");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$page = curl_exec($ch);
//Just a quick example
if ($page == "Sucess") {
break;
} else {
$ch = curl_init();
//Some curl code here to Re-download Captcha Image (new image)
$data['captchaText'] = CaptchaToText::Scan("images/captcha.jpg");
}
}

Yes, you are doing it right, but only in the first part.
You already have an initialized cURL resource ($ch).
So you only need to execute the cURL request again with curl_exec($ch) and you will get a new page.
All the cURL options set by curl_setopt() are saved in the resource.
Here is the code:
if ($page == "Sucess") {
break;
} else {
$page = curl_exec($ch);
//Some curl code here to Re-download Captcha Image (new image)
$data['captchaText'] = CaptchaToText::Scan("images/captcha.jpg");
}
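A `while (1)` loop never ends if the captcha keeps failing, so it may help to cap the number of attempts. The following is a minimal sketch, not the asker's actual code: `submitWithCaptchaRetry` and its two callables are hypothetical names, and the sketch solves a captcha before every attempt rather than only after a failure. The success marker "Sucess" is copied from the example above.

```php
<?php
// Hypothetical helper: retry a submission until it succeeds or attempts run out.
// $submit receives the current captcha text and returns the response body.
// $solveCaptcha downloads a fresh captcha image and returns its decoded text.
function submitWithCaptchaRetry(callable $submit, callable $solveCaptcha, int $maxAttempts = 5): ?string
{
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $captchaText = $solveCaptcha();
        $page = $submit($captchaText);
        if ($page === "Sucess") { // same success marker as the example above
            return $page;
        }
    }
    return null; // gave up after $maxAttempts tries
}
```

With a real cURL handle, the `$submit` callable would store the text in `$data['captchaText']`, call `curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($data))` again, and return `curl_exec($ch)`. Setting CURLOPT_POSTFIELDS again matters because the handle keeps the old POST string until it is replaced.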

Related

Login using php

I have tried the following code for login authentication, but it's not working.
<?php
define('URL', 'https://xxxxxxxxxxxx.com');
function authenticate($uname, $pass) {
$url = URL . 'Issue/Bug-5555';
$curl = curl_init();
curl_setopt($curl, CURLOPT_USERPWD, "$uname:$pass");
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, 0); // ssl ensure cert
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, 1); /// ssl ensure cert
$issue_list = (curl_exec($curl));
echo $issue_list;
return $issue_list;
} ?>
You didn't mention what you are trying to achieve with this code.
Are you trying to get ticket info or post an issue? You are just using the API...
Well here's a script that works with the JIRA API.
<?php
$username = 'test';
$password = 'test';
$url = "https://xxxxx.xxxxxxx.net/rest/api/2/project";
$ch = curl_init();
$headers = array(
'Accept: application/json',
'Content-Type: application/json'
);
$test = "This is the content of the custom field.";
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_VERBOSE, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "GET");
//curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_USERPWD, "$username:$password");
$result = curl_exec($ch);
$ch_error = curl_error($ch);
if ($ch_error) {
echo "cURL Error: $ch_error";
} else {
echo $result;
}
curl_close($ch);
?>
This code fetches projects from JIRA.
If you want to create issues, you will have to change the REST URL to /rest/api/2/issue/ and use the "POST" method instead of "GET".
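As a rough illustration of the POST variant, the sketch below builds a JSON body and sends it to /rest/api/2/issue/. The helper names `buildIssuePayload` and `createIssue`, the project key `TEST`, and the summary text are all placeholders, and the field layout is an assumption based on the usual JIRA REST v2 "create issue" body, so check it against your JIRA version.

```php
<?php
// Hypothetical helper: build the JSON body for creating an issue.
// The project key, summary and issue type name are placeholders.
function buildIssuePayload(string $projectKey, string $summary, string $issueType = 'Bug'): string
{
    return json_encode([
        'fields' => [
            'project'   => ['key' => $projectKey],
            'summary'   => $summary,
            'issuetype' => ['name' => $issueType],
        ],
    ]);
}

// Sketch of the request itself; defined here but not called.
function createIssue(string $baseUrl, string $username, string $password, string $payload)
{
    $ch = curl_init(rtrim($baseUrl, '/') . '/rest/api/2/issue/');
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_POST, true);           // POST instead of GET
    curl_setopt($ch, CURLOPT_POSTFIELDS, $payload); // JSON request body
    curl_setopt($ch, CURLOPT_HTTPHEADER, [
        'Accept: application/json',
        'Content-Type: application/json',
    ]);
    curl_setopt($ch, CURLOPT_USERPWD, "$username:$password");
    $result = curl_exec($ch);
    $error  = curl_error($ch);
    return $result === false ? "cURL Error: $error" : $result;
}

// Print the body that would be sent.
echo buildIssuePayload('TEST', 'Example summary'), PHP_EOL;
```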
You are missing a slash in the URL. The $url you pass to cURL currently has value:
https://jira.hawkdefense.comIssue/Bug-5555
After the curl_exec() line, insert this and copy/paste the result here.
if (curl_errno($curl)) {
throw new Exception(curl_error($curl));
}
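One way to avoid the missing-slash problem regardless of how the base URL is written is to normalize both parts before joining them. This is a small sketch; `joinUrl` is a hypothetical helper name, not part of cURL or PHP.

```php
<?php
// Hypothetical helper: join a base URL and a path with exactly one slash,
// whether or not either side already has one.
function joinUrl(string $base, string $path): string
{
    return rtrim($base, '/') . '/' . ltrim($path, '/');
}

// Using the placeholder host from the question.
echo joinUrl('https://xxxxxxxxxxxx.com', 'Issue/Bug-5555'), PHP_EOL;
```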

Get username from page

<?php
/* EDIT EMAIL AND PASSWORD */
$EMAIL = "MY EMAIL"; // here email
$PASSWORD = "MY PASS"; // here password
function cURL($url, $header=NULL, $cookie=NULL, $p=NULL)
{
$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, $header);
curl_setopt($ch, CURLOPT_NOBODY, $header);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_COOKIE, $cookie);
curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
if ($p) {
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "POST");
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $p);
}
$result = curl_exec($ch);
if ($result) {
return $result;
} else {
return curl_error($ch);
}
curl_close($ch);
}
$a = cURL("https://login.facebook.com/login.php?login_attempt=1",true,null,"email=$EMAIL&pass=$PASSWORD");
preg_match('%Set-Cookie: ([^;]+);%',$a,$b);
$c = cURL("https://login.facebook.com/login.php?login_attempt=1",true,$b[1],"email=$EMAIL&pass=$PASSWORD");
preg_match_all('%Set-Cookie: ([^;]+);%',$c,$d);
for($i=0;$i<count($d[0]);$i++)
$cookie.=$d[1][$i].";";
/*
NOW TO JUST OPEN ANOTHER URL EDIT THE FIRST ARGUMENT OF THE FOLLOWING FUNCTION.
TO SEND SOME DATA EDIT THE LAST ARGUMENT.
*/
echo cURL("https://www.facebook.com/search/results.php?q=Funny",null,$cookie,null);
?>
I want to be able to get the usernames on that page.
I can access the page I want, but I can't get past the second step.
I want to extract the usernames listed on that page, and I don't know where to start.
Can anybody help me out with this?
Your method of accessing Facebook is against Facebook's Terms of Service. You should use the Facebook Graph API to get this kind of data.

PHP cron reload woes

I made a script which iterates through a couple of pages of a third-party website looking for data, and I need to run it from cron once a day. The way I currently wrote it, while testing it in a browser, the script reloads itself with JavaScript to go to the next page if the data it seeks isn't found, so it won't work under cron. The problem with simply looping is that I can't run this function multiple times: http_get(), which is defined as follows:
function http_get($target, $ref)
{
return http($target, $ref, $method="GET", $data_array="", EXCL_HEAD);
}
function http($target, $ref, $method, $data_array, $incl_head)
{
# Initialize PHP/CURL handle
$ch = curl_init();
# Process data, if presented
if(is_array($data_array))
{
# Convert data array into a query string (ie animal=dog&sport=baseball)
foreach ($data_array as $key => $value)
{
if(strlen(trim($value))>0)
$temp_string[] = $key . "=" . urlencode($value);
else
$temp_string[] = $key;
}
$query_string = join('&', $temp_string);
}
# HEAD method configuration
if($method == HEAD)
{
curl_setopt($ch, CURLOPT_HEADER, TRUE); // Include http head
curl_setopt($ch, CURLOPT_NOBODY, TRUE); // Don't return body
}
else
{
# GET method configuration
if($method == GET)
{
if(isset($query_string))
$target = $target . "?" . $query_string;
curl_setopt ($ch, CURLOPT_HTTPGET, TRUE);
curl_setopt ($ch, CURLOPT_POST, FALSE);
}
# POST method configuration
if($method == POST)
{
if(isset($query_string))
curl_setopt ($ch, CURLOPT_POSTFIELDS, $query_string);
curl_setopt ($ch, CURLOPT_POST, TRUE);
curl_setopt ($ch, CURLOPT_HTTPGET, FALSE);
}
curl_setopt($ch, CURLOPT_HEADER, $incl_head); // Include head as needed
curl_setopt($ch, CURLOPT_NOBODY, FALSE); // Return body
}
curl_setopt($ch, CURLOPT_COOKIEJAR, COOKIE_FILE); // Cookie management.
curl_setopt($ch, CURLOPT_COOKIEFILE, COOKIE_FILE);
curl_setopt($ch, CURLOPT_TIMEOUT, CURL_TIMEOUT); // Timeout
curl_setopt($ch, CURLOPT_USERAGENT, WEBBOT_NAME); // Webbot name
curl_setopt($ch, CURLOPT_URL, $target); // Target site
curl_setopt($ch, CURLOPT_REFERER, $ref); // Referer value
curl_setopt($ch, CURLOPT_VERBOSE, FALSE); // Minimize logs
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE); // No certificate
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE); // Follow redirects
curl_setopt($ch, CURLOPT_MAXREDIRS, 4); // Limit redirections to four
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE); // Return in string
$return_array['FILE'] = curl_exec($ch);
# Create return array
$return_array['STATUS'] = curl_getinfo($ch);
$return_array['ERROR'] = curl_error($ch);
# Close PHP/CURL handle
curl_close($ch);
# Return results
return $return_array;
}
Is there any way I can get around this? Thanks
I'm not sure what your issue is - you could just have the loop contain the code that calls the functions, not the functions themselves.
Alternatively, use function_exists to test if you've already defined the functions.
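Both suggestions can be sketched together. In the example below, `fetch_page` is a hypothetical stand-in for http_get() that returns canned text so the sketch runs without a network, and `findFirstMatchingPage` is an illustrative name for the loop that replaces the JavaScript reload.

```php
<?php
// Guard the definition so including this file twice does not redeclare it.
if (!function_exists('fetch_page')) {
    // Hypothetical stand-in for http_get(); returns canned content
    // instead of making a real request.
    function fetch_page(int $pageNumber): string
    {
        return $pageNumber === 3 ? 'the data we want' : 'nothing here';
    }
}

// The loop contains only the calls, never the function definitions.
function findFirstMatchingPage(int $maxPages, string $needle): ?int
{
    for ($page = 1; $page <= $maxPages; $page++) {
        if (strpos(fetch_page($page), $needle) !== false) {
            return $page; // stop at the first page that has the data
        }
    }
    return null; // not found within $maxPages
}

var_dump(findFirstMatchingPage(5, 'data we want'));
```

Because the loop runs inside one PHP process, the same script works unchanged from cron.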

How to use CURL instead of file_get_contents?

I use the file_get_contents function to fetch and show external links on a specific page.
On my local machine everything works, but my server doesn't support the file_get_contents function, so I tried to use cURL with the code below:
function file_get_contents_curl($url) {
$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_URL, $url);
$data = curl_exec($ch);
curl_close($ch);
return $data;
}
echo file_get_contents_curl('http://google.com');
But it returns a blank page. What is wrong?
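Try this. http://google.com answers with a redirect, so cURL has to be told to follow it with CURLOPT_FOLLOWLOCATION: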
function file_get_contents_curl($url) {
$ch = curl_init();
curl_setopt($ch, CURLOPT_AUTOREFERER, TRUE);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
$data = curl_exec($ch);
curl_close($ch);
return $data;
}
This should work
function curl_load($url){
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$response = curl_exec($ch);
curl_close($ch);
return $response;
}
$url = "http://www.google.com";
echo curl_load($url);
I ran into the same problem when accessing Google Drive content via a direct link.
The call to file_get_contents returned 302 Moved Temporarily:
//Any google url. This example is fake for Google Drive direct link.
$url = "https://drive.google.com/uc?id=0BxQKKJYjuNElbFBNUlBndmVHHAj";
$html = file_get_contents($url);
echo $html; // prints nothing because of the 302 response
With the code below it worked again:
//Any google url. This example is fake for Google Drive direct link.
$url = "https://drive.google.com/uc?id=0BxQKKJYjuNElbFBNUlBndmVHHAj";
$ch = curl_init();
curl_setopt($ch, CURLOPT_AUTOREFERER, TRUE);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2); // documented values are 0 and 2
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
$html = curl_exec($ch);
curl_close($ch);
echo $html;
I tested it today, 03/19/2018
You can try this. It should work fine.
function curl_tt($url){
$ch = curl_init();
curl_setopt($ch, CURLOPT_AUTOREFERER, TRUE);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2); // documented values are 0 and 2
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
$data = curl_exec($ch);
curl_close($ch);
return $data;
}
echo curl_tt("https://google.com");
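The variants above all return whatever curl_exec() gives back, so a failed request shows up as a blank page. A possible refinement, sketched here under the assumption that an exception is an acceptable way to report failure, is to check for `false` and surface the cURL error. `fetchUrl` and its timeout and redirect limits are illustrative choices, not taken from the answers.

```php
<?php
// Hypothetical helper: fetch a URL, follow redirects, and fail loudly
// instead of returning an empty result.
function fetchUrl(string $url, int $timeoutSeconds = 10): string
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);      // return body as string
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);      // follow 3xx redirects
    curl_setopt($ch, CURLOPT_MAXREDIRS, 5);              // but not forever
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);  // overall time limit
    $body = curl_exec($ch);
    if ($body === false) {
        // curl_exec() returns false on failure; report why.
        throw new RuntimeException('cURL error ' . curl_errno($ch) . ': ' . curl_error($ch));
    }
    return $body; // the handle is released when $ch goes out of scope
}
```

A caller would wrap `fetchUrl('http://google.com')` in a try/catch block and print `$e->getMessage()` to see the reason for a failure. Certificate verification is left at cURL's defaults here rather than being switched off.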

Facebook page scraping

I am trying to scrape a Facebook page ( https://www.facebook.com/pages/PTSD/455847705426 ).
I found this script to log in to Facebook.
<?php
$EMAIL = "me@mail.com";
$PASSWORD = "facebookPassword";
function cURL($url, $header=NULL, $cookie=NULL, $p=NULL)
{
$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, $header);
curl_setopt($ch, CURLOPT_NOBODY, $header);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_COOKIE, $cookie);
curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
if ($p) {
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "POST");
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $p);
}
$result = curl_exec($ch);
if ($result) {
return $result;
} else {
return curl_error($ch);
}
curl_close($ch);
}
$a = cURL("https://login.facebook.com/login.php?login_attempt=1",true,null,"email=$EMAIL&pass=$PASSWORD");
preg_match('%Set-Cookie: ([^;]+);%',$a,$b);
$c = cURL("https://login.facebook.com/login.php?login_attempt=1",true,$b[1],"email=$EMAIL&pass=$PASSWORD");
preg_match_all('%Set-Cookie: ([^;]+);%',$c,$d);
for($i=0;$i<count($d[0]);$i++)
$cookie.=$d[1][$i].";";
/*
NOW TO JUST OPEN ANOTHER URL EDIT THE FIRST ARGUMENT OF THE FOLLOWING FUNCTION.
TO SEND SOME DATA EDIT THE LAST ARGUMENT.
*/
$page_html = cURL("https://www.facebook.com/pages/PTSD/455847705426",null,$cookie,null);
?>
Now the variable $page_html contains only a few posts, and they are buried in very complex markup.
My questions are:
How can I get all the posts?
Is there some other approach which returns complete and clean data?
Is there some way to get all the posts in JSON format?
Please tell me if there are any useful tutorials or articles on this.
Regards
Spend some time reading the developer documentation. You can get all the posts from a page as a JSON object by setting up an app, then querying the Graph API with a page access token.
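Once the JSON arrives, handling it is plain json_decode() work. The sketch below assumes the response has the usual Graph API shape, with a top-level "data" array of posts that may each carry a "message" field; the request itself would be a GET to something like https://graph.facebook.com/{page-id}/posts with an access_token parameter, which you should confirm in the current Graph API documentation. `extractPostMessages` is a hypothetical helper and the sample JSON is made up.

```php
<?php
// Hypothetical helper: collect the "message" field of each post from a
// response shaped like {"data": [...], "paging": {...}}.
function extractPostMessages(string $json): array
{
    $decoded = json_decode($json, true);
    if (!is_array($decoded) || !isset($decoded['data']) || !is_array($decoded['data'])) {
        return []; // not the expected shape
    }
    $messages = [];
    foreach ($decoded['data'] as $post) {
        if (isset($post['message'])) { // some posts have no text
            $messages[] = $post['message'];
        }
    }
    return $messages;
}

// Made-up sample response, only to illustrate the shape.
$sample = '{"data":[{"id":"1_1","message":"First post"},{"id":"1_2"},{"id":"1_3","message":"Third post"}]}';
print_r(extractPostMessages($sample));
```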
