How to get a cookie list from a remote URL using PHP

Does anyone know how I could get a list of cookies for an external URL using PHP?
I think this could be done with cURL?
When I use a cookie jar or get_headers, I only see one cookie (for example, PHPSESSID). But when you open the Chrome console (F12) and go to cookie storage, you see a much bigger list, including the Google Analytics cookies, for example. I want to be able to display that full list of cookies, including third-party cookies.
Is there any way to retrieve those cookies? Maybe store them temporarily or something?

If I understand you correctly and you mean getting the cookies that a page sets when you visit it using cURL,
you can use the CURLOPT_COOKIEJAR option (set with curl_setopt) to have cURL manage cookies automatically:
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_COOKIEJAR, "cookie.txt");
curl_setopt($ch, CURLOPT_COOKIEFILE, "cookie.txt");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$cont = curl_exec ($ch);
curl_close ($ch);
Or search the response headers for Set-Cookie: lines, as described in the related question:
"how to get the cookies from a php curl into a variable"
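As a rough sketch of the second approach, the snippet below pulls every Set-Cookie name and value out of a raw header string. The function name parseSetCookieHeaders and the sample headers are made up for illustration; in practice the text would come from curl_exec() with CURLOPT_HEADER enabled. Note that cURL only sees cookies sent in HTTP response headers. Cookies created by JavaScript (such as Google Analytics) or by third-party requests will not appear, because cURL does not execute scripts or load embedded resources.

```php
<?php
// Hypothetical helper: collect cookie name => value pairs from raw response headers.
function parseSetCookieHeaders($rawHeaders)
{
    $cookies = array();
    // Match every Set-Cookie line; capture only the leading name=value pair
    // and ignore attributes such as path or expires.
    $pattern = '/^Set-Cookie:\s*([^=;\r\n]+)=([^;\r\n]*)/mi';
    if (preg_match_all($pattern, $rawHeaders, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            $cookies[trim($m[1])] = $m[2];
        }
    }
    return $cookies;
}

// Sample header text standing in for what curl_exec() returns with CURLOPT_HEADER set.
$sample = "HTTP/1.1 200 OK\r\n"
        . "Set-Cookie: PHPSESSID=abc123; path=/\r\n"
        . "Set-Cookie: lang=en; Max-Age=3600\r\n"
        . "Content-Type: text/html\r\n\r\n";

print_r(parseSetCookieHeaders($sample));
```

Listing the cookies a browser shows in its developer tools would need something that actually runs the page's scripts, such as a headless browser, which is outside what cURL can do.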

Related

Can open PSE API using browser but not using PHP

I'm trying to record data from Philippine Stock Exchange website. I have found that they have an endpoint which is http://www.pse.com.ph/stockMarket/companyInfo.html?method=fetchHeaderData&company=29&security=146
I can access it from any browser, except in incognito mode, where I am shown a page saying Access Denied that never stops loading. When I try to access it using PHP, I'm fairly sure the same thing happens as in the latter case.
I have tried to access it using PHP to no avail. Here are the approaches I tried:
file_get_contents
cURL with user agent
cURL with temporary cookies
Tried all in localhost and in live server.
Code:
$c = tempnam ("/tmp", "CURLCOOKIE");
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://www.pse.com.ph/stockMarket/companyInfo.html");
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_COOKIESESSION, true );
curl_setopt($ch, CURLOPT_COOKIEJAR, $c);
curl_setopt($ch, CURLOPT_COOKIEFILE, $c);
curl_setopt($ch, CURLOPT_POSTFIELDS, "method=fetchHeaderData&ajax=true&company=29&security=146");
curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER["HTTP_USER_AGENT"]);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
var_dump(curl_exec($ch));
curl_close ($ch);
I don't have a clear idea of why this happens. Can someone explain why it happens and what the possible solutions are (PHP only, if possible)?
I have reviewed other developers' approaches to this API (they all implemented it in Java), and it is just a simple POST request. I have not verified whether their code still works, though. I can't post links to their repositories (limited).
SOLUTIONS:
Problem 1. Can't access API
$posts = array(
"method"=>"fetchHeaderData",
"ajax"=>"true",
"company"=>29,
"security"=>146
);
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://www.pse.com.ph/stockMarket/companyInfo.html");
curl_setopt($ch, CURLOPT_POSTFIELDS,$posts);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
var_dump(curl_exec($ch));
curl_close ($ch);
It seems I had two different problems. I can now access and use the API using the code above, with no need for other options. Turning the post data into an array fixed the problem.
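One plausible explanation, not verified against this site, is the request encoding. When CURLOPT_POSTFIELDS receives an array, cURL sends the body as multipart/form-data; when it receives a string, the body is sent as application/x-www-form-urlencoded. The sketch below shows the string form that http_build_query produces from the same array, which is what you would pass if the server expected URL-encoded data instead.

```php
<?php
$posts = array(
    "method"   => "fetchHeaderData",
    "ajax"     => "true",
    "company"  => 29,
    "security" => 146,
);

// Array  => cURL sends multipart/form-data.
// String => cURL sends application/x-www-form-urlencoded.
$encoded = http_build_query($posts, '', '&');

echo $encoded, PHP_EOL;

// Either value can be handed to cURL:
// curl_setopt($ch, CURLOPT_POSTFIELDS, $posts);    // multipart
// curl_setopt($ch, CURLOPT_POSTFIELDS, $encoded);  // urlencoded
```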
Problem 2. Access Denied
As for the Access Denied problem, it is cookie related. Answered below by @Wayne.
Unfortunately, I can't accept two answers.
Try this solution: convert your post data into an array, then pass that array to CURLOPT_POSTFIELDS.
$posts = array(
"method"=>"fetchHeaderData",
"ajax"=>"true",
"company"=>29,
"security"=>146
);
$c = tempnam ("/tmp", "CURLCOOKIE");
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://www.pse.com.ph/stockMarket/companyInfo.html");
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_COOKIESESSION, true );
curl_setopt($ch, CURLOPT_COOKIEJAR, $c);
curl_setopt($ch, CURLOPT_COOKIEFILE, $c);
curl_setopt($ch, CURLOPT_POSTFIELDS,$posts);
curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER["HTTP_USER_AGENT"]);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
var_dump(curl_exec($ch));
curl_close ($ch);
It is because they have set up their server to stop you from doing that. They are protecting the data with a cookie.
Cookie details
When you visit the site http://www.pse.com.ph/stockMarket/companyInfo.html it gives you a cookie as it knows you are a human visitor.
In your browser tools enter
document.cookie
to see your cookie. The site will serve you the data because you have the cookie.
Remove the cookie
document.cookie = "JSESSIONID=; expires=Thu, 01 Jan 1970 00:00:00 UTC; path=/;";
and visit
http://www.pse.com.ph/stockMarket/companyInfo.html?method=fetchHeaderData&company=29&security=146
without first visiting http://www.pse.com.ph/stockMarket/companyInfo.html to get a cookie, and you will get a 403 (Forbidden).
Also, they do not offer JSONP with a callback, so an AJAX request from another site will be blocked by the browser's cross-domain security. Requests for the JSON must come from pages that originate from their domain or an approved domain.
Why would they do that?
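The browser behaviour described above (visit the page, receive a cookie, then request the data) can be imitated with cURL roughly as follows. This is only a sketch: the function name fetchWithSessionCookie is made up, and whether this particular site accepts such a request has not been verified here.

```php
<?php
// Sketch: load the landing page first so the server can set its session cookie,
// then request the data URL on the same handle so that cookie is sent back.
function fetchWithSessionCookie($landingUrl, $dataUrl)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // An empty string turns on cURL's in-memory cookie handling without reading a file.
    curl_setopt($ch, CURLOPT_COOKIEFILE, '');

    // Step 1: visit the page that hands out the cookie.
    curl_setopt($ch, CURLOPT_URL, $landingUrl);
    if (curl_exec($ch) === false) {
        return false;
    }

    // Step 2: reuse the handle; cookies received in step 1 are sent automatically.
    curl_setopt($ch, CURLOPT_URL, $dataUrl);
    return curl_exec($ch);
}
```

It would be called with the landing page as the first argument and the full data URL (including its query string) as the second; the return value is the response body, or false on failure.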
Likely their licence to the information does not allow them to give it to other websites, or they need/want to get paid to provide the information to other websites. Or they have terms of use for the information.
Where can you get the data ... data wants to be free
I don't see anywhere on their site http://www.pse.com.ph where they provide API information or explain how to request permission to access it.
ProgrammableWeb has been the number one source for finding APIs, and they have 96 stock APIs listed. Obviously I cannot just copy their data and paste it here, but one of these APIs may work for you.

Headless Login: login and open remote session with cURL

I am trying to use a button in my PHP web application to launch a logged-in session on another website. In other words, I want my application to:
open a new tab/window (achieved)
go to another website and log in, or
(alternatively) collect the session data needed for the target site to consider the current browser logged in.
This is achieved (in an incomplete manner) with the following code:
$ch = curl_init();
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie);
curl_setopt($ch, CURLOPT_URL,$url);
curl_setopt($ch, CURLOPT_USERAGENT, $agent);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS,$postdata);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION,1);
curl_setopt($ch, CURLOPT_REFERER, $url);
$result = curl_exec ($ch);
curl_close ($ch);
print $result;
This successfully displays the "logged-in" page of the remote site, but whenever I click on any of the remote site's features I get an obvious 404. This is because I am just printing the output of the successful login via cURL, and my browser is not dealing with the remote application on the target website. For example, my address bar says I am at local.dev/loggedin.php instead of secure.targetsite.com/loggedin.php.
This may be helpful: once logged in via the browser, the target website sets a session cookie that keeps the session alive for a certain amount of time. Can my web application just fetch and store the session data from the authentication carried out by cURL and use it to log in?
This might not be possible via cURL.
I was thinking of just parsing the response header for the cookie and using PHP's setcookie(), but it does not work: I get bounced by the remote app as if I had never logged in.
Please be patient; I am not an expert in the use of cURL.
I have done that for a few of my own applications, and it should work for almost anything that can be logged in to via an HTML form submission. You can't use cURL for this because it runs on your web server (whether that is on your local machine or in the cloud somewhere is irrelevant) and not in your browser. Your PHP application needs to open a new tab/window with a page containing an HTML <form> that includes all the necessary fields, method="get" or "post" as appropriate, and action="the destination login URL". Then just add an automatic form submission, e.g. with jQuery $('#form_id').submit() on page load.
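A minimal sketch of that idea is shown below. The helper name renderAutoLoginForm, the form id, the login URL, and the field names are all assumptions for illustration, and plain JavaScript is used in place of jQuery to keep the example self-contained. Keep in mind that any credentials written into the page are visible in its source.

```php
<?php
// Hypothetical helper: build a form that posts itself to the remote login URL,
// so the browser (not the server) performs the login and receives the cookies.
function renderAutoLoginForm($action, array $fields)
{
    $inputs = '';
    foreach ($fields as $name => $value) {
        $inputs .= sprintf(
            '<input type="hidden" name="%s" value="%s">',
            htmlspecialchars((string) $name, ENT_QUOTES, 'UTF-8'),
            htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8')
        );
    }

    return '<form id="auto_login" method="post" action="'
        . htmlspecialchars($action, ENT_QUOTES, 'UTF-8') . '">'
        . $inputs
        . '</form>'
        // Submit as soon as the form has been parsed.
        . '<script>document.getElementById("auto_login").submit();</script>';
}

echo renderAutoLoginForm(
    'https://secure.targetsite.com/login',                 // assumed login URL
    array('username' => 'alice', 'password' => 'secret')   // assumed field names
);
```

Because the browser submits the form itself, the target site's session cookie is set in the browser for the target domain, which is what the cURL approach cannot achieve.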

PHP NTLM session with cURL

First, a little background.
There is a website written in ASP.NET that uses the NTLM protocol to authenticate users who want to log in. It works perfectly well in normal use: users type in the website URL, provide their credentials, authenticate, and maintain a session in the web browser.
What I want to do is create a PHP website that will act as a bot. It is my company's internal website and I am approved to do so. The problem I have run into is managing the session. Users will be able to type their credentials into my PHP website, and my PHP website will authenticate them to the target site using cURL.
The code I got so far is:
$cookie_file_path = dirname(__FILE__) . '/cookies.txt';
$ch = curl_init();
//==============================================================
curl_setopt($ch, CURLOPT_USERPWD, $username. ':' . $password);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_USERAGENT, $user_agent);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_ANY);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file_path);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLINFO_HEADER_OUT, true);
curl_setopt($ch, CURLOPT_FAILONERROR, 0);
curl_setopt($ch, CURLOPT_MAXREDIRS, 100);
//=============================================================
$ret = curl_exec($ch);
The code above logs in to the target website via cURL (which seems to handle the NTLM handshake) and fetches the website's content. It also stores the session ID that is sent back in the cookie file.
What I'm trying to do next is comment out the CURLOPT_USERPWD option, in the hope that on a second execution the script will use the session ID stored in the cookie file to authenticate the previously logged-in user. That way I could discard the user's credentials and not store them anywhere, because it is not safe to store them in a manually created session, a database, or anywhere else.
I need this because the bot will use cron to periodically check whether the website's status has changed and perform some user actions in response. But to do this, the user must first be authenticated, and their username and password must not be stored anywhere, so I have to use the session information established when they initially logged in.
cURL seems NOT to do this. When I execute the script a second time with the CURLOPT_USERPWD option commented out, it does not use the stored cookie to stay authenticated. Instead, it REWRITES the cookie file with irrelevant data sent by the service in response to the UNAUTHORISED access request.
My questions are:
Why doesn't cURL use the stored session information to stay authenticated?
Is there any way to maintain this session with cURL on a website based on the NTLM protocol?
Thanks in advance.
A few months ago I had a problem similar to yours. I was trying to connect to a Navision SOAP API, and Navision uses NTLM authentication. The problem is that curl doesn't natively support NTLM, so you have to do it yourself.
A blog post that helped me a lot in this situation was the following:
http://rabaix.net/en/articles/2008/03/13/using-soap-php-with-ntlm-authentication
** Edit
Sorry, I misread your question.
Your problem is simple.
Ask cURL to include the response headers in the returned output with these lines:
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
You can then extract the Set-Cookie header from the result of curl_exec():
preg_match('/^Set-Cookie:\s*([^;]*)/mi', $ret, $match);
$cookie = $match[1]; // the raw "name=value" pair, e.g. for CURLOPT_COOKIE
Now you can store it somewhere and use it on the second request.
I had the same problem and solved it with the line curl_setopt($ch, CURLOPT_COOKIEFILE, "");. The string must be exactly empty.
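A response may carry more than one Set-Cookie line, while preg_match only captures the first. The sketch below, with a made-up helper name and sample response, collects all of them into a single string suitable for CURLOPT_COOKIE. Whether a cookie alone keeps the user authenticated depends on the target application: NTLM authenticates the connection itself, so the server may still demand credentials.

```php
<?php
// Hypothetical helper: join every Set-Cookie name=value pair into one Cookie header value.
function cookieHeaderFromResponse($rawResponse)
{
    preg_match_all('/^Set-Cookie:\s*([^;\r\n]*)/mi', $rawResponse, $matches);
    return implode('; ', array_map('trim', $matches[1]));
}

// Sample text standing in for curl_exec() output with CURLOPT_HEADER enabled.
$ret = "HTTP/1.1 200 OK\r\n"
     . "Set-Cookie: ASP.NET_SessionId=abc123; path=/; HttpOnly\r\n"
     . "Set-Cookie: theme=dark; path=/\r\n\r\n<html></html>";

$cookie = cookieHeaderFromResponse($ret);
echo $cookie, PHP_EOL;

// On the second request the stored value could be sent back with:
// curl_setopt($ch, CURLOPT_COOKIE, $cookie);
```

Keeping the pairs as raw strings avoids helpers such as parse_str, which rewrite dots in names like ASP.NET_SessionId.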

How to store a cookie fetched by CURL such that it can be accessed by a page loaded in an iFrame

I have a situation whereby when a page loads, I send some authentication data (in this case the associative array $data) which is verified by a script on another domain. Code below:
$cookie_path = 'cookies.txt';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://www.mysite.com/verify');
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_path);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_path);
curl_setopt($ch, CURLOPT_COOKIESESSION, true);
$result = curl_exec($ch);
The site then sets a session (in this case I am using the CodeIgniter framework, and sessions are set like this: $this->session->set_userdata('logged_in', true); ).
However, when I load the external site in an iframe, it does not seem to detect that the session is set and redirects to the login page.
How do I ensure that my session cookie is being sent properly and can be accessed by an iframe?
Your cURL script runs server side and stores the cookie for the second site there, but your browser loads the second site on the client. You can't share cookies across domains.
If you control the site you are attempting to create the session on, you may be able to pass the session ID to the PHP script, then generate the iframe URL dynamically, including the session ID as a query string, eg:
http://www.brainbell.com/tutors/php/php_mysql/Encoding_the_session_ID_as_a_GET_variable.html
Edit
To clarify, if you control the script on the second site, you can modify it to provide the SESSIONID of the authenticated session to your CURL script, which your PHP script making the cURL request can then incorporate into the dynamically generated iFrame src URL.
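As an illustration of the last step, the sketch below appends a session ID to an iframe URL. The helper name buildIframeSrc, the parameter name sid, and the /dashboard path are invented for the example; the second site would have to be written to accept the ID this way. Session IDs in URLs can leak through logs and Referer headers, so this is best treated as a workaround.

```php
<?php
// Hypothetical helper: append a session ID to the iframe URL as a query parameter.
function buildIframeSrc($baseUrl, $paramName, $sessionId)
{
    // Use "&" if the URL already has a query string, "?" otherwise.
    $separator = (strpos($baseUrl, '?') === false) ? '?' : '&';
    return $baseUrl . $separator
        . http_build_query(array($paramName => $sessionId), '', '&');
}

// $sessionId would come from the response of the cURL verification request.
$sessionId = 'abc123';  // placeholder value
$src = buildIframeSrc('http://www.mysite.com/dashboard', 'sid', $sessionId);

echo '<iframe src="' . htmlspecialchars($src, ENT_QUOTES, 'UTF-8') . '"></iframe>', PHP_EOL;
```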
You can set cookies via:
http://php.net/manual/en/function.setcookie.php
However, you can't set cookies for domains outside of your script's domain.

Trying to AVOID an ASP.NET session using cURL

I'm using a web-service from a provider who is being a little too helpful in anticipating my needs. They have given me a HTML snippet to paste on my website, for users to click on to trigger their services. I'd prefer to script this process, so I've got a php script which posts a cURL request to the same url, as appropriate. However, this provider is keeping tabs on my session, and interprets each new request as an update of the first one, rather than each being a unique request.
I've contacted the provider regarding my issue, and they've gone so far as to inform me that their system is working as intended, and that it's impossible for me to avoid using the same ASP.NET session for each subsequent cURL request. While my favored option would be to switch to a different vendor, that doesn't appear to be an option right now. Is there a reliable way to get a new ASP.NET session with each cURL request?
I've tried the following set of CURLOPT's, to no avail:
//initialize curl
$ch = curl_init($url);
//build a string out of the post_vars
$post_str = http_build_query($post_vars);
//set the necessary curl options
curl_setopt($ch, CURLOPT_TIMEOUT, 30);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FAILONERROR, 1);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_str);
curl_setopt($ch, CURLOPT_FAILONERROR, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_COOKIESESSION, 1);
curl_setopt($ch, CURLOPT_FRESH_CONNECT, 1);
curl_setopt($ch, CURLOPT_FORBID_REUSE, 1);
curl_setopt($ch, CURLOPT_USERAGENT, "UZ_".uniqid());
curl_setopt($ch, CURLOPT_REFERER, CURRENT_SITE_URL."index.php?newsession=".uniqid());
curl_setopt($ch, CURLOPT_HTTPHEADER, array("Pragma: no-cache", "Cache-Control: no-cache"));
//execute the call to the backend script, retrieve the results
$xmlstr = curl_exec($ch);
If cURL isn't helping much, why not try other ways to call the service from your script, such as PHP's file() function or file_get_contents()?
If you do not see any difference at all, then the service provider might be using your IP address to track your requests. Try using a proxy as a test.
A normal ASP.NET session is tracked by a cookie called ASP.NET_SessionId. This cookie is sent in the response to your first request. So as long as your cURL requests don't send this ASP.NET cookie back, your requests will have no connection to each other. Use the curl -c option to see which cookies are passing between you and them. Overriding this cookie with a cookie file should work if you confirm that a normal ASP.NET session is being used here.
It is quite poor for a service to rely on sessions (HTTP has much cleaner ways of maintaining state, which REST exploits), so I wouldn't completely rule out the option of switching vendors.
Well, given the options you are using, it seems you have covered the basics. Can you find out how their sessions are set up?
If you know how they set up a session, i.e. what they use (whether it is the IP address or something else), then you can figure out a workaround. Another option is to store the cookies in a different cookie file:
CURLOPT_COOKIEFILE - The name of the file containing the cookie data. The cookie file can be in Netscape format, or just plain HTTP-style headers dumped into a file.
But if all they do is check cookies, your current code should work. If you can figure out the cookie's name, you can pass a blank custom cookie with the request to see if that works. But if you can get information out of them about how their sessions work, that would be best.
Use these two lines to handle the session:
curl_setopt($ch, CURLOPT_COOKIEJAR, "path/to/cookies.txt"); // cookies.txt should be writable
curl_setopt($ch, CURLOPT_COOKIEFILE, "path/to/cookies.txt");
