Doesn't my cURL POST request contain cookies? - PHP

I need to log in to http://auto.vsk.ru/login.aspx by making a POST request to it from my site.
I wrote a JS AJAX function that sends a POST request to a PHP script on my server, which in turn sends the cross-domain request via cURL.
post.php
<?php
function request($url, $post, $cook)
{
    $ch = curl_init();
    $curlConfig = array(
        CURLOPT_URL            => $url,
        CURLOPT_POST           => 1,
        CURLOPT_RETURNTRANSFER => 1,
        CURLOPT_COOKIEFILE     => $cook,
        CURLOPT_COOKIEJAR      => $cook,
        CURLOPT_USERAGENT      => '"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 10.0; Trident/7.0; Touch; .NET4.0C; .NET4.0E; Tablet PC 2.0)"',
        CURLOPT_FOLLOWLOCATION => 1,
        CURLOPT_REFERER        => $url,
        CURLOPT_POSTFIELDS     => $post,
        CURLOPT_HEADER         => 1,
    );
    curl_setopt_array($ch, $curlConfig);
    $result = curl_exec($ch);
    curl_close($ch);
    return $result;
}

$result = request($_POST['url'], $_POST['data'], $_POST['cook']);
if ($result === FALSE)
    echo('error');
else
    echo($result);
?>
JS code:
function postcross(path, data, cook, run)
{
    requestp('post.php', 'url=' + path + '&data=' + data + '&cook=' + cook, run);
}

function requestp(path, data, run)
{
    var http = new XMLHttpRequest();
    http.open('POST', path, true);
    http.setRequestHeader('Content-type', 'application/x-www-form-urlencoded');
    http.onreadystatechange = function()
    {
        if (http.readyState == 4 && http.status == 200)
        {
            run(http);
        }
    };
    http.send(data);
}
postcross('http://auto.vsk.ru/login.aspx', encodeURIComponent('loginandpassword'), 'vskcookies.txt', function(e) {
    document.getElementById('container').innerText = e.responseText;
});
The HTML page I get in the response says two things:
My browser is not Internet Explorer and I should switch to it (actually the site works from Google Chrome, at least I can log in manually).
My browser doesn't support cookies.
Regarding the cookies, this is very similar to this (veeeery long) question. The file vskcookies.txt is created on my server, it actually updates after the POST request, and it stores cookies.
Regarding IE, at first I thought the site checks the browser from JS, but that is wrong, because no JS runs at all: I only read the HTML page as plain text, and it already contains the notification about IE.
So I wondered: what if I am making the cURL request wrong? I wrote a new PHP script that shows request headers; here is the source:
head.php
<?php
foreach (getallheaders() as $name => $value)
{
    echo "$name: $value\n";
}
?>
The result of postcross('http://mysite/head.php', encodeURIComponent('loginandpassword'), 'vskcookies.txt', function(e){ document.getElementById('container').innerText = e.responseText; }):
Host: my site
User-Agent: "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 10.0; Trident/7.0; Touch; .NET4.0C; .NET4.0E; Tablet PC 2.0)"
Accept: */*
Content-Type: application/x-www-form-urlencoded
Referer: mysite/head
X-1gb-Client-Ip: my ip
X-Forwarded-For: ip, ip, ip
X-Forwarded-Port: 443
X-Forwarded-Proto: https
X-Port: 443
Accept-Encoding: gzip
X-Forwarded-URI: /head
X-Forwarded-Request: POST /head HTTP/1.1
X-Forwarded-Host: my site
X-Forwarded-Server: my site
Content-Length: 823
Connection: close
For some reason there is no Cookie: header, but the user agent is IE, as I mentioned.
I also tried replacing the head.php source with
print_r($_COOKIE);
and got an empty array.
Am I doing something wrong, or is it the site's bot protection?
Update 1
It shows cookies only if I pass them explicitly through CURLOPT_COOKIE.
So I think I will leave CURLOPT_COOKIEFILE => $cook as it is, and for CURLOPT_COOKIE pass something built from file_get_contents($cook), although the file contains extra bookkeeping information besides the name=value pairs.
Important Update 2
Okay, probably I'm just being stupid. The response HTML page does contain the messages about IE and disabled cookies, but they are in divs that are display:none and are only shown by JS.
So it seems my attempts fail for other reasons.

Related

Logging into the Amazon SellerCentral with PHP and cURL

I'm trying to find a way to log into an Amazon SellerCentral account via PHP. I found this script
https://github.com/mindevolution/amazonSellerCentralLogin
which in theory should work, but I'm being redirected to the login page every time I run it.
I also tried PhantomJS + CasperJS without any luck: the first problem with that approach is that I need to disable 2-factor authentication, and the second is that I get captchas which I can't solve via code.
Here is the CasperJS code I tried:
var urlBeforeLoggedIn = "https://sellercentral.amazon.com/gp/homepage.html";
var urlAfterLoggedIn = "https://sellercentral.amazon.com/";
var casper = require('casper').create({
    pageSettings: {
        loadImages: false,
        loadPlugins: false,
        userAgent: 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.86 Safari/537.36'
    }
});
casper.start(urlBeforeLoggedIn);
casper.waitForSelector('form[name="signIn"]', function() {
    casper.fillSelectors('form[name="signIn"]', {
        'input[name="email"]': 'some_username',
        'input[name="password"]': 'some_password'
    }, true);
});
casper.waitForUrl(urlAfterLoggedIn, function() {
    this.viewport(3000, 1080);
    this.capture('./testscreenshot.png', {top: 0, left: 0, width: 3000, height: 1080});
});
casper.run();
Not an answer, but too long to post as a comment.
Do not parse HTML with regex; use a proper HTML parser instead, like DOMDocument & DOMXPath. I don't have an account to test with, but this should get you past the first login page, given a correct email & password:
<?php
declare(strict_types=1);
header("content-type: text/plain;charset=utf-8");
$email = "em@ail.com";
$password = "passw0rd";
$ch = curl_init();
curl_setopt_array($ch, array(
    CURLOPT_AUTOREFERER => true,
    CURLOPT_BINARYTRANSFER => true,
    CURLOPT_FOLLOWLOCATION => true,
    CURLOPT_SSL_VERIFYPEER => false,
    CURLOPT_CONNECTTIMEOUT => 4,
    CURLOPT_TIMEOUT => 8,
    CURLOPT_COOKIEFILE => "", // << makes curl save/load cookies across requests..
    CURLOPT_ENCODING => "", // << makes curl accept all supported encodings, gzip/deflate/etc, makes transfers faster
    CURLOPT_USERAGENT => 'whatever; curl/' . (curl_version()['version']) . ' (' . (curl_version()['host']) . '); php/' . PHP_VERSION,
    CURLOPT_RETURNTRANSFER => 1,
    CURLOPT_URL => 'https://sellercentral.amazon.com/gp/homepage.html',
));
$html = curl_exec($ch);
//var_dump($html) & die();
$domd = @DOMDocument::loadHTML($html);
$xp = new DOMXPath($domd);
$form = $xp->query("//form[@name='signIn']")->item(0);
$inputs = [];
foreach ($form->getElementsByTagName("input") as $input) {
    $name = $input->getAttribute("name");
    if (empty($name) && $name !== "0") {
        continue;
    }
    $inputs[$name] = $input->getAttribute("value");
}
assert(isset($inputs['email'], $inputs['password'],
    $inputs['appActionToken'], $inputs['workflowState'],
    $inputs['rememberMe']), "missing form inputs!");
$inputs["email"] = $email;
$inputs["password"] = $password;
$inputs["rememberMe"] = "false";
$login_url = $form->getAttribute("action");
var_dump($inputs, $login_url);
curl_setopt_array($ch, array(
    CURLOPT_URL => $login_url,
    CURLOPT_POST => 1,
    CURLOPT_POSTFIELDS => http_build_query($inputs)
));
$html = curl_exec($ch);
$domd = @DOMDocument::loadHTML($html);
$xp = new DOMXPath($domd);
$loginErrors = [];
// warning-message-box is also used for login *errors*, amazon web devs are just being stupid with the names.
foreach ($xp->query("//*[contains(@id,'error-message-box')]|//*[contains(@id,'warning-message-box')]") as $loginError) {
    $loginErrors[] = preg_replace("/\s+/", " ", trim($loginError->textContent));
}
if (!empty($loginErrors)) {
    echo "login errors: ";
    var_dump($loginErrors);
    die();
}
//var_dump($html);
echo "login successful!";
The important takeaway here is
$domd = @DOMDocument::loadHTML($html);
$xp = new DOMXPath($domd);
$form = $xp->query("//form[@name='signIn']")->item(0);
$inputs = [];
foreach ($form->getElementsByTagName("input") as $input) {
    $name = $input->getAttribute("name");
    if (empty($name) && $name !== "0") {
        continue;
    }
    $inputs[$name] = $input->getAttribute("value");
}
That's how most website login pages can be parsed for login info.
I'm getting captchas which I can't solve via code
The deathbycaptcha API to the rescue: http://www.deathbycaptcha.com/user/api

file_get_contents appends data when downloading a binary file

I wrote a function that allows me to send HTTP GET requests in PHP:
function get_HTTPS_page_with_version($url, $hostname, $use_HTTP_1_0 = false, &$headers = NULL, $follow_redirect = true, $use_SSL = true) {
    $context = Array(
        "http" => Array(
            "method" => "GET",
            "ignore_errors" => true,
            "follow_location" => $follow_redirect,
            "user_agent" => "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36",
            "header" => "Accept-language: en")
    );
    if ($use_SSL) {
        $context["ssl"] = Array(
            "peer_name" => $hostname,
            "SNI_server_name" => $hostname,
            "verify_peer_name" => false,
            "verify_peer" => false);
    }
    if (!$use_HTTP_1_0) {
        $context["http"]["protocol_version"] = "1.1";
        $context["http"]["header"] .= "\r\nHost: $hostname" .
            "\r\nConnection: close";
    }
    $page = file_get_contents($url, false, stream_context_create($context));
    $responseCode = get_HTTP_response_code($http_response_header);
    $headers = $http_response_header;
    return Array($responseCode, $page);
}
The issue is that when I use this function to get a specific file (which is the Verisign Certificate Revocation List), it appends some chars at the beginning:
6639 6361 0a0d --> f9 ca \n\r
and at the end of the file:
0a0d 0d30 0d0a 000a --> \n\r \r0 \r\n NUL\n
I compared the files obtained manually using wget and with this function, and also at the network level using Wireshark, and I can confirm that the file sent by the server is the same in both cases.
I also don't have this problem with the Thawte CRL.
Does anyone have an idea about what might cause this behavior?
EDIT
The Verisign CRL has changed, and file_get_contents now adds different bytes than the ones I listed above, but the result is still the same: the file downloaded by this function does not match the one downloaded manually with wget.
EDIT 2
There is no problem when I set $use_HTTP_1_0 = true.
There is no problem when I add the header Accept-Encoding: gzip, deflate (except that I then get a gzipped file, which I'd prefer to avoid :)).
There is still a problem when I only change Connection from close to Keep-Alive.

Posting data to PHP script and then process it

I'm trying to make an HTTP POST request to my PHP script. However, it doesn't seem to be retrieving the request data.
To simulate a POST request, I used Request Maker and sent it to the URL http://php-agkh1995.rhcloud.com/lta.php with the request data var1=65059.
Using the default URL in the else statement works perfectly fine, but not the other one.
I suspect the request headers are at fault, unless there's a major flaw in my code.
lta.php
$stopid = $_POST['var1'];
$defurl = ""; // Default url
if (!empty($stopid)) {
    $defurl = 'http://datamall2.mytransport.sg/ltaodataservice/BusArrival?BusStopID=$stopid';
} else {
    $defurl = 'http://datamall2.mytransport.sg/ltaodataservice/BusArrival?BusStopID=83139';
}
$curl = curl_init();
curl_setopt_array($curl, array(
    CURLOPT_RETURNTRANSFER => 1,
    CURLOPT_URL => $defurl,
    CURLOPT_USERAGENT => 'Ashton',
    CURLOPT_HTTPHEADER => array('AccountKey: ********', 'UniqueUserId: ******', 'accept: application/json')
));
$resp = curl_exec($curl);
curl_close($curl);
echo($resp); // To test if data is displayed or not
return $resp;
Request headers sent
POST /lta.php HTTP/1.1
Host: php-agkh1995.rhcloud.com
Accept: */*
Content-Length: 10
Content-Type: application/x-www-form-urlencoded
You could use array_key_exists to test for the existence of the POST variable (note the argument order: the key comes first, then the array):
if (array_key_exists('var1', $_POST)) {
    $stopid = $_POST['var1'];
    $defurl = "http://datamall2.mytransport.sg/ltaodataservice/BusArrival?BusStopID=$stopid";
} else {
    ..
}
Note that the URL is in double quotes here so that $stopid is actually interpolated; with the single quotes used in your lta.php, PHP sends the literal text $stopid to the API.
PS: if your $defurl is set to the else-case value by default, you don't even need the else clause.

How do I get twitter posts?

I am trying to get Twitter posts following this tutorial:
https://www.youtube.com/watch?v=tPrsVKudecs
There aren't a lot of tutorials about this online, and Twitter's console doesn't support running queries anymore, as far as I understand.
This is the output I get in Chrome's "Network" tab:
Remote Address:54.666.666.666:80
Request URL:http://666.com/yh/test/tweets_json.php
Request Method:GET
Status Code:500 Internal Server Error
Response Headers
view source
Connection:close
Content-Length:0
Content-Type:text/html
Date:Mon, 15 Jun 2015 13:51:40 GMT
Server:Apache/2.4.7 (Ubuntu)
X-Powered-By:PHP/5.5.9-1ubuntu4.5
Request Headers
view source
Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding:gzip, deflate, sdch
Accept-Language:en-US,en;q=0.8
Cache-Control:max-age=0
Connection:keep-alive
Host:666.com
User-Agent:Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/43.0.2357.124 Safari/537.36
Any ideas why this is happening?
Is there a better, simpler way to do it?
EDIT:
tweets_json.php
<?php
require 'tmhOAuth.php'; // Get it from: https://github.com/themattharris/tmhOAuth
// Use the data from http://dev.twitter.com/apps to fill out this info
// (notice the slight name difference in the last two items)
$connection = new tmhOAuth(array(
    'consumer_key'    => '',
    'consumer_secret' => '',
    'user_token'      => '', // access token
    'user_secret'     => ''  // access token secret
));
// set up parameters to pass
$parameters = array();
if ($_GET['count']) {
    $parameters['count'] = strip_tags($_GET['count']);
}
if ($_GET['screen_name']) {
    $parameters['screen_name'] = strip_tags($_GET['screen_name']);
}
if ($_GET['twitter_path']) {
    $twitter_path = $_GET['twitter_path'];
} else {
    $twitter_path = '1.1/statuses/user_timeline.json';
}
$http_code = $connection->request('GET', $connection->url($twitter_path), $parameters);
if ($http_code === 200) { // if everything's good
    $response = strip_tags($connection->response['response']);
    if ($_GET['callback']) { // if we ask for a jsonp callback function
        echo $_GET['callback'], '(', $response, ');';
    } else {
        echo $response;
    }
} else {
    echo "Error ID: ", $http_code, "<br>\n";
    echo "Error: ", $connection->response['error'], "<br>\n";
}
// You may have to download and copy http://curl.haxx.se/ca/cacert.pem
tmhOAuth.php: https://github.com/themattharris/tmhOAuth/blob/master/tmhOAuth.php
and this PEM bundle: http://curl.haxx.se/ca/cacert.pem
All three files are in the same folder.
In the tutorial, running the query produces JSON output.
I get a blank page.

How do you detect a website visitor's country (Specifically, US or not)?

I need to show different links for US and non-US visitors to my site. This is for convenience only, so I am not looking for a super-high degree of accuracy, and security or spoofing are not a concern.
I know there are geotargeting services and lists, but this seems like overkill since I only need to determine (roughly) if the person is in the US or not.
I was thinking about using JavaScript to get the user's timezone, but this appears to only give the offset, so users in Canada, Mexico, and South America would have the same value as people in the US.
Are there any other bits of information available either in JavaScript, or PHP, short of grabbing the IP address and doing a lookup, to determine this?
There are some free services out there that let you do country- and IP-based geolocation from the client side.
I've used the free wipmania JSONP service; it's really simple to use:
<script type="text/javascript">
// plain JavaScript example
function jsonpCallback(data) {
alert('Latitude: ' + data.latitude +
'\nLongitude: ' + data.longitude +
'\nCountry: ' + data.address.country);
}
</script>
<script src="http://api.wipmania.com/jsonp?callback=jsonpCallback"
type="text/javascript"></script>
Or if you use a framework that supports JSONP, like jQuery you can:
// jQuery example
$.getJSON('http://api.wipmania.com/jsonp?callback=?', function (data) {
alert('Latitude: ' + data.latitude +
'\nLongitude: ' + data.longitude +
'\nCountry: ' + data.address.country);
});
Check the above snippet running here.
The best indicator is probably the HTTP Accept-Language header. It will look something like below in the HTTP request:
GET / HTTP/1.1
Accept: */*
Accept-Language: en-us
User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; WOW64; Trident/4.0; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 3.5.21022; .NET CLR 3.5.30729; MDDC; OfficeLiveConnector.1.4; OfficeLivePatch.0.0; .NET CLR 3.0.30729)
Accept-Encoding: gzip, deflate
Host: www.google.com
Connection: Keep-Alive
You should be able to retrieve this in PHP using the following:
<?php
echo $_SERVER['HTTP_ACCEPT_LANGUAGE'];
?>
I would say that geotargeting is the only method that's even remotely reliable. But there are also cases where it doesn't help at all. I keep getting to sites that think I'm in France, because my company's backbone is there and all Internet traffic goes through it.
The HTTP Accept-Language header is not enough to determine the user's locale. It only tells you what the user selected as their language, which may have nothing to do with where they are. More on this here.
Wipmania.com & PHP
<?php
$site_name = "www.your-site-name.com";

function getUserCountry() {
    global $site_name; // defined above; needed inside function scope
    $fp = fsockopen("api.wipmania.com", 80, $errno, $errstr, 5);
    if (!$fp) {
        // API is currently down, return as "Unknown" :(
        return "XX";
    } else {
        $out  = "GET /" . $_SERVER['REMOTE_ADDR'] . "?" . $site_name . " HTTP/1.1\r\n";
        $out .= "Host: api.wipmania.com\r\n";
        $out .= "Typ: php\r\n";
        $out .= "Ver: 1.0\r\n";
        $out .= "Connection: Close\r\n\r\n";
        fwrite($fp, $out);
        while (!feof($fp)) {
            $country = fgets($fp, 3);
        }
        fclose($fp);
        return $country;
    }
}
?>
@rostislav
or using cURL:
public function __construct($site_name) {
    // create a new cURL resource
    $ch = curl_init();
    // set URL and other appropriate options
    curl_setopt($ch, CURLOPT_POST, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HTTPHEADER, Array("Content-Type: text/xml"));
    curl_setopt($ch, CURLOPT_URL, "http://api.wipmania.com/" . $_SERVER['REMOTE_ADDR'] . "?" . $site_name);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    // grab URL and pass it to the browser
    $response = curl_exec($ch);
    $info = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    if (($response === false) || ($info !== 200)) {
        throw new Exception('HTTP Error calling Wipmania API - HTTP Status: ' . $info . ' - cURL Error: ' . curl_error($ch));
    } elseif (curl_errno($ch) > 0) {
        throw new Exception('HTTP Error calling Wipmania API - cURL Error: ' . curl_error($ch));
    }
    $this->country = $response;
    // close cURL resource, and free up system resources
    curl_close($ch);
}
Simply, we can use the Hostip API:
<?php
$country_code = file_get_contents("http://api.hostip.info/country.php");
if ($country_code == "US") {
    echo "You are in the USA";
} else {
    echo "You are not in the USA";
}
?>
All country codes are listed here:
http://en.wikipedia.org/wiki/ISO_3166-1_alpha-2
My solution, easy and small; in this example I test for the Canada region via the languages fr-CA or en-CA:
if (preg_match("/^[a-z]{2}\-(ca)/i", $_SERVER["HTTP_ACCEPT_LANGUAGE"])) {
    $region = "Canada";
}
Depending on which countries you want to distinguish, time zones can be a very easy way to achieve it, and I assume it's quite reliable, as most people will have the clocks on their computers set right. (Though of course there are many countries you can't distinguish using this technique.)
Here's a really simple example of how to do it:
http://unmissabletokyo.com/country-detector
