I am trying to use PHP's cURL to post to a site's form and then extract the result, but it is not working. Instead it returns a blank form, as if I had just made a plain GET request to the page.
<?php
$domains = [
'expireddomains.net',
'stackoverflow.com',
'toastup.com'
];
$ccd = '';
foreach ($domains as $domain) {
$ccd .= $domain . '\r\n';
}
// set post fields
$post = [
'removedoubles' => '1',
'removeemptylines' => '1',
'showallwordmatches' => '1',
'wordlist' => 'en-v1',
'camelcasedomains' => $ccd,
'button_submit' => 'Camel+Case+Domains'
];
$ch = curl_init('https://www.expireddomains.net/tools/camel-case-domain-names/');
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $post);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
$headers = [
'Referer: https://www.expireddomains.net/tools/camel-case-domain-names/',
'Content-Type: application/x-www-form-urlencoded',
'Origin: https://www.expireddomains.net'
];
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
// execute!
$response = curl_exec($ch);
// close the connection, release resources used
curl_close($ch);
I confirmed the postdata formatting and names using the network tab in the browser's dev tools to check the request.
Originally I wasn't sending any headers. Then I thought the site might validate the Origin or Referer, but adding those didn't help either.
I also checked that the form doesn't include any hidden fields, such as a CSRF token.
Any ideas?
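For application/x-www-form-urlencoded, build the body with http_build_query and let it encode the values (spaces as +, and so on). Passing an array to CURLOPT_POSTFIELDS makes cURL send multipart/form-data instead, which contradicts the Content-Type header. Also, the separator between domains is |, not newlines.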
<?php
$domains = [
'expireddomains.net',
'stackoverflow.com',
'toastup.com'
];
// set post fields
$post = [
'removedoubles' => 1,
'removeemptylines' => 1,
'showallwordmatches' => 1,
'wordlist' => 'en-v1',
'camelcasedomains' => implode(' | ', $domains),
'button_submit' => 'Camel Case Domains'
];
$ch = curl_init('https://www.expireddomains.net/tools/camel-case-domain-names/');
$headers = array();
$headers[] = 'authority: www.expireddomains.net';
$headers[] = 'pragma: no-cache';
$headers[] = 'cache-control: no-cache';
$headers[] = 'origin: https://www.expireddomains.net';
$headers[] = 'upgrade-insecure-requests: 1';
$headers[] = 'dnt: 1';
$headers[] = 'content-Type: application/x-www-form-urlencoded';
$headers[] = 'user-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.125 Safari/537.36';
$headers[] = 'accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9';
$headers[] = 'sec-fetch-site: same-origin';
$headers[] = 'sec-fetch-mode: navigate';
$headers[] = 'sec-fetch-user: ?1';
$headers[] = 'sec-fetch-dest: document';
$headers[] = 'referer: https://www.expireddomains.net/tools/camel-case-domain-names/';
$headers[] = 'referrer-policy: same-origin';
$headers[] = 'accept-language: en-GB,en-US;q=0.9,en;q=0.8';
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($post));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_ENCODING, 'gzip, deflate');
// execute!
$response = curl_exec($ch);
// close the connection, release resources used
curl_close($ch);
// parse what's in the textarea
$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($response);
libxml_clear_errors();
$result = [];
foreach ($dom->getElementsByTagName('textarea') as $textarea) {
if ($textarea->getAttribute('name') === "camelcasedomains") {
$result = explode(' | ', $textarea->nodeValue);
}
}
print_r($result);
Result:
Array
(
[0] => ExpiredDomains.net
[1] => ExpiredDoMains.net
)
You could probably remove most of the headers if they are not needed. I added them all to match the browser request exactly, but the problem turned out to be the aforementioned encoding.
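As a minimal sketch of why the encoding matters (the field values below are illustrative assumptions, not the site's real requirements): a string produced by http_build_query is sent as application/x-www-form-urlencoded, with spaces and | already encoded. The sketch also shows that '\r\n' in single quotes is four literal characters rather than a line break.

```php
<?php
// Illustrative field values (assumptions, not taken from the site).
$fields = [
    'camelcasedomains' => implode(' | ', ['stackoverflow.com', 'toastup.com']),
    'button_submit'    => 'Camel Case Domains',
];

// http_build_query() URL-encodes each value: spaces become "+", "|" becomes "%7C".
$body = http_build_query($fields);

// Single-quoted '\r\n' is a literal backslash-r-backslash-n (4 bytes);
// only the double-quoted form is a real CRLF (2 bytes).
$literalLength = strlen('\r\n');
$crlfLength    = strlen("\r\n");

echo $body, PHP_EOL;
```

The resulting string can be passed to CURLOPT_POSTFIELDS as-is, so the button label should be written with real spaces rather than pre-encoded as 'Camel+Case+Domains'.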
Related
I'm trying to get some code of mine to work, but I keep getting the following error. Any thoughts on what's going wrong here? I think I have all the quotation marks escaped correctly.
{"errors":[{"message":"json body could not be decoded: invalid
character 'L' after object key:value pair"}],"data":null}
I know my query is correct, as I can run it in the GraphQL playground and get the data.
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://xxxxxxxxxxxx.com/api/v4/endpoint');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, "{\"query\":\"{ search(q: \"LM123\") { results { part { mpn manufacturer { name }}}}\"}");
curl_setopt($ch, CURLOPT_ENCODING, 'gzip, deflate');
$headers = array();
$headers[] = 'Accept-Encoding: gzip, deflate';
$headers[] = 'Content-Type: application/json';
$headers[] = 'Accept: application/json';
$headers[] = 'Connection: keep-alive';
$headers[] = 'Dnt: 1';
$headers[] = 'Origin: https://xxxxxxxxxxxxx.com';
$headers[] = 'Token: xxxxxxxxxxxxxxxxxxxxxxxx';
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
$result = curl_exec($ch);
if (curl_errno($ch)) {
echo 'Error:' . curl_error($ch);
}
curl_close($ch);
echo $result;
If I run a simple query that doesn't search for a term it works perfectly. Like:
curl_setopt($ch, CURLOPT_POSTFIELDS, "{\"query\":\"{ categories { name }}\"}");
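The problem is the double quotes in \"LM123\". Inside a double-quoted PHP string, \" becomes a plain ", so the JSON parser treats the quote before LM123 as the end of the "query" value. It then expects a comma or a closing brace (for example , "other_key": "..."), but finds LM123 instead.
You can try a single-quoted PHP string, which keeps the backslashes so that they reach the JSON parser: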
curl_setopt($ch, CURLOPT_POSTFIELDS, '{"query":"{ search(q: \"LM123\") { results { part { mpn manufacturer { name }}}}"}');
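An alternative that avoids hand-escaping altogether is to let json_encode build the body. This is a minimal sketch; the query string is a hypothetical one modelled on the question, and the commented-out curl_setopt line assumes the same $ch handle as above.

```php
<?php
// Hypothetical GraphQL query modelled on the one in the question.
$query = '{ search(q: "LM123") { results { part { mpn manufacturer { name } } } } }';

// json_encode() escapes the inner double quotes as \" for us.
$body = json_encode(['query' => $query]);
if ($body === false) {
    throw new RuntimeException('JSON encoding failed: ' . json_last_error_msg());
}

echo $body, PHP_EOL;

// With an existing cURL handle this would then be:
// curl_setopt($ch, CURLOPT_POSTFIELDS, $body);
```

Because the body is generated rather than typed, the same code keeps working when the search term itself contains quotes or backslashes.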
I have an issue with cURL. I have found that the problem is in how I build the request: when I dump the request I want to execute and put it in Postman it works, but the function inside my code doesn't and returns 3: 'CURLE_URL_MALFORMAT'. For information, my curl version is 7.79.1.
I have tried different approaches with CURLOPT_RETURNTRANSFER, CURLOPT_POST, etc., but it does not work.
Here is my function:
private function cUrl_CreateContactFastFood($merchant_token, $arr)
{
$status = "active";
$Contact_name = $arr["customerName"];
$Contact_email = '';
if($Contact_email=='') {
$Contact_email= 'FastFoodCustomer#ttt.com';
}
$Contact_phone = $arr["phoneNumber"];
$Contact_address = $arr["endAddressResolved"];
$url = 'http://dostavka-bg.com/api_services/insert_contact';
$curl_params = '?keys=f74192da825962d3b1c2b2aa616ab68b&merchant_token='.$merchant_token.'&name='.$Contact_name.'&email='.$Contact_email.'&phone='.$Contact_phone.'&address='.$Contact_address.'&status='.$status;
$ch = curl_init();
$headers = array(
'Accept: */*',
'User-Agent: ',
'Accept-Encoding: gzip, deflate, br',
'Host: dostavka-bg.com',
'Connection: keep-alive'
);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
curl_setopt($ch, CURLOPT_URL, $url.$curl_params);
$data = curl_exec($ch);
if (curl_errno($ch)) {
echo 'Error:' . curl_error($ch);
}
curl_close($ch);
}
Here is the working version:
private function cUrl_CreateContactFastFood($merchant_token, $arr)
{
$status = "active";
$Contact_name = $arr["customerName"];
$Contact_email = '';//$arr["customer"]["invoicing_details"]["company_address"];
if($Contact_email==''){
$Contact_email= 'FastFoodCustomer#ttt.com';
}
$Contact_phone = $arr["phoneNumber"];
$Contact_address = $arr["endAddressResolved"];
$url = 'http://dostavka-bg.com/api_services/insert_contact';
$curl_params = '?keys=f74192da825962d3b1c2b2aa616ab68b&merchant_token='.urlencode($merchant_token).'&name='.urlencode($Contact_name).
'&email='.urlencode($Contact_email).'&phone='.urlencode($Contact_phone).'&address='.urlencode($Contact_address).'&status='.urlencode($status);
$ch = curl_init();
$headers = array(
'Accept: */*',
'User-Agent: ',
'Accept-Encoding: gzip, deflate, br',
'Host: dostavka-bg.com',
'Connection: keep-alive'
);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
curl_setopt($ch, CURLOPT_URL, $url.$curl_params);
curl_setopt($ch, CURLOPT_POST, 1);
$data = curl_exec($ch);
if (curl_errno($ch)) {
echo 'Error:' . curl_error($ch);
}
$info = curl_version();
curl_close($ch);
return true;
}
The URL you're passing includes several query string parameters. Only some of their values are shown in your example, but some of them probably contain punctuation or other characters that aren't allowed in URLs. Those need to be correctly encoded, either by:
Calling urlencode or rawurlencode on each parameter as you build up the query string.
Using http_build_query to format it all for you, rather than manually concatenating the parts.
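The second option can be sketched as follows. The helper name build_contact_url, the example.com host and the sample values are assumptions made for illustration; only http_build_query itself comes from the answer.

```php
<?php
// Hypothetical helper: appends an encoded query string to a base URL.
function build_contact_url(string $base, array $params): string
{
    // http_build_query() applies urlencode() to every key and value.
    return $base . '?' . http_build_query($params);
}

// Made-up sample values containing a space, a comma and a "#".
$params = [
    'name'    => 'Ivan Petrov',
    'email'   => 'customer#example.com',
    'address' => 'ul. Vitosha 1, Sofia',
];

$url = build_contact_url('http://example.com/api_services/insert_contact', $params);

echo $url, PHP_EOL;
```

Unencoded, the "#" would start a URL fragment and the spaces would make the URL invalid, which is the kind of input that can produce CURLE_URL_MALFORMAT.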
I want to send many requests to a website and find the ID of the last post that exists.
Because my host hits that website's request limit after several requests, I expect the curl request to return an error, so that I can save the last post's ID in my database and continue later.
But after about 200 successful requests, curl returns neither a response nor an HTTP code.
To be specific, I want to get the posts of a Telegram channel from a given ID to the end.
Here is the function that I have written for this purpose:
function get_post_html_content($channel_username, $message_id){
try {
error_log($message_id."\n");
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,
"https://t.me/".$channel_username."/".$message_id."?embed=1");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
curl_setopt($ch, CURLOPT_TIMEOUT, 60);
$headers = array();
$headers[] = 'Pragma: no-cache';
$headers[] = 'Sec-Fetch-Site: same-origin';
$headers[] = 'Origin: https://t.me';
$headers[] = 'Accept-Encoding: gzip,deflate';
$headers[] = 'Accept-Language: en-US,en;q=0.9';
$headers[] = 'Sec-Fetch-Mode: cors';
$headers[] = 'Content-Type: application/x-www-form-urlencoded';
$headers[] = 'Accept: */*';
$headers[] = 'Cache-Control: no-cache';
$headers[] = 'Referrer Policy: no-referrer-when-downgrade';
$headers[] = 'Connection: keep-alive';
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
$content = curl_exec($ch);
if (!$content) {
$errno = curl_errno($ch);
$error = curl_error($ch);
error_log("Curl returned error $errno: $error\n");
curl_close($ch);
return false;
}
$http_code = intval(curl_getinfo($ch, CURLINFO_HTTP_CODE));
error_log("http code: ".$http_code."\n");
} catch (Exception $e) {
error_log($e->getMessage());
}
$content = gzdecode($content);
curl_close($ch);
return $content;
}
The problem is that after HTTP code 200 has been logged several times and the function has returned the content, it suddenly stops logging any HTTP code and doesn't even return false, so I cannot save the last post ID in the database.
So how can I change this function to return false in this situation?
I need to extract the DOM from an external website in PHP. I tried a test URL, but sometimes it shows a lot of Chinese characters :) (more specifically, what I think are Unicode characters).
It's strange: with a different link it works, but if I use the link below and run the PHP script about three times, it stops working on the third try (the first and second times it shows the normal DOM structure).
URL: https://www.csfd.cz/film/300902-bohemian-rhapsody/prehled/
DOM after roughly the third run: https://i.stack.imgur.com/lnM1I.png
Code:
$doc = new \DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTMLFile("https://www.csfd.cz/film/300902-bohemian-rhapsody/prehled/");
dd($doc->saveHTML());
Does anybody know what to do?
I guess it is because the site compresses its responses. You can fetch the page with good old curl, which decompresses the body for you when CURLOPT_ENCODING is set:
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://www.csfd.cz/film/300902-bohemian-rhapsody/prehled/');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_ENCODING, 'gzip, deflate');
$headers = array();
$headers[] = 'Connection: keep-alive';
$headers[] = 'Cache-Control: max-age=0';
$headers[] = 'Save-Data: on';
$headers[] = 'Upgrade-Insecure-Requests: 1';
$headers[] = 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.119 Safari/537.36';
$headers[] = 'Dnt: 1';
$headers[] = 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8';
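// Accept-Encoding is already sent by CURLOPT_ENCODING above; overriding it here could advertise encodings (e.g. br) that cURL may not be able to decode.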
$headers[] = 'Accept-Language: en-US;q=0.8,en;q=0.7,uk;q=0.6';
$headers[] = 'Cookie: nette-samesite=1; developers-ad=1;';
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
$result = curl_exec($ch);
if (curl_errno($ch)) {
echo 'Error:' . curl_error($ch);
}
curl_close ($ch);
$doc = new \DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML($result);
dd($doc->saveHTML());
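To illustrate the compression guess offline: gzip-compressed bytes look like random characters when treated as text, and decoding them restores the HTML. This is a minimal sketch, assuming the zlib extension is available; maybe_gunzip is a hypothetical helper and the HTML string is a stand-in for the real page.

```php
<?php
// Hypothetical helper: decode the body only if it starts with the gzip magic bytes.
function maybe_gunzip(string $body): string
{
    if (strncmp($body, "\x1f\x8b", 2) === 0) {
        $decoded = gzdecode($body);
        if ($decoded !== false) {
            return $decoded;
        }
    }
    return $body; // not gzip (or not decodable): return unchanged
}

// Stand-in for a page that the server chose to send compressed.
$html       = '<html><body><h1>Bohemian Rhapsody</h1></body></html>';
$compressed = gzencode($html);

$restored  = maybe_gunzip($compressed); // compressed input is decoded
$untouched = maybe_gunzip($html);       // plain input passes through
```

With cURL itself, setting CURLOPT_ENCODING to an empty string asks for every encoding the local libcurl build supports and decodes the response automatically, so a helper like this is mainly useful when the body comes from somewhere else, such as loadHTMLFile.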
I am working on a project where I need to import Facebook Page feeds. To access a Page's feed I need a page_access_token, and to generate a page_access_token I need a user access token.
My questions are:
1. How do I generate this user_access_token using cURL? Most of the solutions require an APP_KEY and APP_SECRET. Is it not possible to get a user_access_token without an app?
2. Once I have the user_access_token, how do I use it to get a page access token using cURL?
You can't get ANY token without an app, but you don't need to program anything in order to get a user token. These articles explain everything in detail:
https://developers.facebook.com/docs/facebook-login/access-tokens
http://www.devils-heaven.com/facebook-access-tokens/
https://developers.facebook.com/docs/facebook-login/
For example, you can use the API Explorer to select your App and generate User Tokens.
Wrong question: no tokens are needed.
I just tried it and it took me less than 5 minutes, never having scraped FB before. I saved the page to my server, opened it via my own URL, and it looked just as if I were on FB.
If a browser can load the page with JavaScript disabled, then you can too.
You have to use https://m.facebook.com/, because JavaScript is not required on their mobile site.
What you want to do is not difficult at all.
Just go there in your browser and copy the cookie key/value pairs into the Cookie: HTTP request header. Mine are x'ed out.
<?php
$request = array();
$request[] = 'Host: m.facebook.com';
$request[] = 'User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:39.0) Gecko/20100101 Firefox/39.0';
$request[] = 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8';
$request[] = 'Accept-Language: en-US,en;q=0.5';
$request[] = 'Accept-Encoding: gzip, deflate';
$request[] = 'DNT: 1';
$request[] = 'Cookie: datr=x; fr=x; lu=x; s=xx; csm=x; xs=xx; c_user=x; p=-2; act=x; presence=x; noscript=1';
$request[] = 'Connection: keep-alive';
$request[] = 'Pragma: no-cache';
$request[] = 'Cache-Control: no-cache';
$url = 'https://m.facebook.com/';
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_ENCODING,"");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLINFO_HEADER_OUT, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_FILETIME, true);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 100);
curl_setopt($ch, CURLOPT_VERBOSE, true);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_TIMEOUT,100);
curl_setopt($ch, CURLOPT_FAILONERROR,true);
curl_setopt($ch, CURLOPT_HTTPHEADER, $request);
$data = curl_exec($ch);
$info = '';
$responseHeader = '';
if (curl_errno($ch)){
$data .= 'Retrieve Base Page Error: ' . curl_error($ch);
}
else {
$skip = intval(curl_getinfo($ch, CURLINFO_HEADER_SIZE));
$responseHeader = substr($data,0,$skip);
$data= substr($data,$skip);
$info = curl_getinfo($ch);
$info = var_export($info,true);
}
$cookies = [];
$e = 0; // search offset into the response header
while(true){ // get cookies from response header
$s = stripos($responseHeader,'Set-Cookie: ',$e);
if ($s === false){break;}
$s += 12; // skip past 'Set-Cookie: '
$e = strpos($responseHeader,';',$s);
if ($e === false){break;}
$cookie = substr($responseHeader,$s,$e-$s) ;
$s = strpos($cookie,'=');
$key = substr($cookie,0,$s);
$value = substr($cookie,$s); // keeps the leading '='
$cookies[$key] = $value;
}
$cookie = ''; // format cookies for next request header
$show = '';
$head = '';
$delim = '';
foreach ($cookies as $k => $v){
$cookie .= "$delim$k$v";
$delim = '; ';
}
$fp = fopen("fb.html",'w');
fwrite($fp,"$data\n$info\n$responseHeader");
fclose($fp);
readfile('fb.html');
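The cookie handling above can also be sketched with a regular expression instead of manual offsets. Both function names and the sample header block are assumptions made for illustration; the sketch keeps only each cookie's name and value and ignores attributes such as path or expires.

```php
<?php
// Hypothetical helper: collect name => value pairs from Set-Cookie lines.
function extract_cookies(string $headerBlock): array
{
    $cookies = [];
    // "m": ^ matches at each line start; "i": header names are case-insensitive.
    $pattern = '/^Set-Cookie:\s*([^=;\s]+)=([^;\r\n]*)/mi';
    if (preg_match_all($pattern, $headerBlock, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $match) {
            $cookies[$match[1]] = $match[2];
        }
    }
    return $cookies;
}

// Hypothetical helper: format the pairs as a Cookie: request header line.
function format_cookie_header(array $cookies): string
{
    $pairs = [];
    foreach ($cookies as $name => $value) {
        $pairs[] = "$name=$value";
    }
    return 'Cookie: ' . implode('; ', $pairs);
}

// Made-up response header block for demonstration.
$sample = "HTTP/1.1 200 OK\r\nSet-Cookie: datr=abc; path=/\r\nset-cookie: fr=xyz\r\n\r\n";

$cookies = extract_cookies($sample);
$header  = format_cookie_header($cookies);
```

The resulting line could be appended to the $request array before the next curl_exec call. Alternatively, cURL's own CURLOPT_COOKIEJAR and CURLOPT_COOKIEFILE options store and resend cookies without any manual parsing.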