Using curl to get data from a website - php

I am trying to use cURL to get train information from http://www.indianrail.gov.in/know_Station_Code.html .
I have the following PHP code :
<?php
$fields = array(
'lccp_src_stncode_dis'=>'MANGAPATNAM-+MUM',
'lccp_src_stncode'=>'MUM',
'lccp_dstn_stncode_dis'=>'AMBALA+CITY-+UBC',
'lccp_dstn_stncode'=>'UBC',
'lccp_classopt'=>'SL',
'lccp_day'=>'17',
'lccp_month'=>'8',
'CurrentMonth'=>'7',
'CurrentDate'=>'17',
'CurrentYear'=>'2015'
);
$fields_string = ''; //defining an empty string
foreach($fields as $key=>$value) {
$temp = $key.'='.$value.'&';
$fields_string.$temp;
}
rtrim($fields_string,'&'); //removing the last '&' from the generated string
$curl = curl_init('http://www.indianrail.gov.in/cgi_bin/inet_srcdest_cgi_date.cgi');
curl_setopt($curl,CURLOPT_POST,1);
curl_setopt($curl,CURLOPT_RETURNTRANSFER,1);
curl_setopt($curl,CURLOPT_POSTFIELDS,$fields_string);
$result = curl_exec($curl);
var_dump($result);
?>
The problem is that I get this output on my browser window :
boolean false
I tried using var_dump(curl_error($curl)) and I got the following output :
string 'Empty reply from server' (length=23)
Thanks for any help.

Solution :
$fields = array(
'lccp_src_stncode_dis'=>'MANGAPATNAM-+MUM',
'lccp_src_stncode'=>'MUM',
'lccp_dstn_stncode_dis'=>'AMBALA+CITY-+UBC',
'lccp_dstn_stncode'=>'UBC',
'lccp_classopt'=>'SL',
'lccp_day'=>'17',
'lccp_month'=>'8',
'CurrentMonth'=>'7',
'CurrentDate'=>'17',
'CurrentYear'=>'2015'
);
$fields_string = ''; //defining an empty string
foreach($fields as $key=>$value) {
$temp = $key.'='.urlencode($value).'&'; // urlencode
$fields_string.= $temp; // equal sign
}
$fields_string = rtrim($fields_string, '&'); // rtrim() returns the trimmed string, so assign it
$header = array(
'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7',
'Accept-Language: en-us;q=0.8,en;q=0.6',
'Content-Type: application/x-www-form-urlencoded'
);
$curl = curl_init('http://www.indianrail.gov.in/cgi_bin/inet_srcdest_cgi_date.cgi');
curl_setopt($curl,CURLOPT_POST,1);
curl_setopt($curl,CURLOPT_RETURNTRANSFER,1);
curl_setopt($curl,CURLOPT_HTTPHEADER,$header); // required by the server
curl_setopt($curl, CURLOPT_REFERER, 'http://www.indianrail.gov.in/know_Station_Code.html'); // required by the server
curl_setopt($curl,CURLOPT_USERAGENT,'Mozilla/5.0 (Windows; U; Windows NT 6.1; en; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13'); // required by the server
curl_setopt($curl,CURLOPT_POSTFIELDS,$fields_string);
$result = curl_exec($curl);
//close connection
curl_close($curl);
echo $result;
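The two string-handling fixes in the solution above can be checked in isolation, without touching the network. This is a minimal sketch with a shortened, hypothetical field list; only the concatenation and rtrim() behaviour is being illustrated.

```php
<?php
// Hypothetical, shortened field list for illustration only.
$fields = array('lccp_src_stncode' => 'MUM', 'lccp_classopt' => 'SL');

// Pattern from the question: the concatenation result is discarded
// and the return value of rtrim() is ignored.
$broken = '';
foreach ($fields as $key => $value) {
    $broken . $key . '=' . $value . '&'; // no assignment, so $broken stays ''
}
rtrim($broken, '&');

// Corrected pattern: append with .= and keep what rtrim() returns.
$fixed = '';
foreach ($fields as $key => $value) {
    $fixed .= $key . '=' . urlencode($value) . '&';
}
$fixed = rtrim($fixed, '&');

echo $fixed, "\n"; // lccp_src_stncode=MUM&lccp_classopt=SL
```

With the question's version, cURL was handed an empty POST body, which may be part of why the server gave an empty reply.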

There are a few ways to solve this problem.
1. cURL:
I believe what you were missing is http_build_query(), which converts an array into a URL-encoded query string. Like this:
$fields = array(
'lccp_src_stncode_dis'=>'MANGAPATNAM-+MUM',
'lccp_src_stncode'=>'MUM',
'lccp_dstn_stncode_dis'=>'AMBALA+CITY-+UBC',
'lccp_dstn_stncode'=>'UBC',
'lccp_classopt'=>'SL',
'lccp_day'=>'17',
'lccp_month'=>'8',
'CurrentMonth'=>'7',
'CurrentDate'=>'17',
'CurrentYear'=>'2015'
);
$post_fields = http_build_query($fields);
$ch = curl_init($API_URL);
curl_setopt( $ch, CURLOPT_CUSTOMREQUEST, 'POST');
curl_setopt( $ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt( $ch, CURLOPT_TIMEOUT, 30);
curl_setopt( $ch, CURLOPT_HTTPHEADER, array(
'Content-Type: application/x-www-form-urlencoded'
));
curl_setopt($ch, CURLOPT_ENCODING , "gzip, deflate");
curl_setopt( $ch, CURLOPT_POSTFIELDS, $post_fields);
curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($ch);
print_r($response);
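One thing worth checking when using http_build_query() (or urlencode()) with these values: strings such as 'AMBALA+CITY-+UBC' already look form-encoded, and encoding them again turns each '+' into '%2B'. The sketch below assumes the human-readable station name has spaces where the '+' signs are; that is a guess about the form, not something confirmed by the site.

```php
<?php
// Value exactly as it appears in the array above (already contains '+').
$already_encoded = array('lccp_dstn_stncode_dis' => 'AMBALA+CITY-+UBC');

// Assumption: the plain value has spaces where the '+' signs are.
$plain = array('lccp_dstn_stncode_dis' => 'AMBALA CITY- UBC');

echo http_build_query($already_encoded), "\n"; // lccp_dstn_stncode_dis=AMBALA%2BCITY-%2BUBC
echo http_build_query($plain), "\n";           // lccp_dstn_stncode_dis=AMBALA+CITY-+UBC
```

If the server expects the second form, pass the plain values and let http_build_query() do the encoding once.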
2. Postman
This is one of the best tools for the problem you have. Here is how it works:
Install Postman in your Chrome browser. You can get it from the Chrome store.
Get the Postman Interceptor from the same Chrome store.
Once you have these two, follow these steps to get PHP code for any request your browser makes:
1) Turn on the Interceptor
2) Open the Postman app
3) Turn on the request capturing feature
4) Make your call on the Indian Railways website
5) You'll get your entire call, including the headers and the POST fields, captured in Postman.
6) Click the Generate Code button and choose PHP -> cURL. You'll get the PHP code that makes the exact same request as the browser.
You can copy this code to the clipboard and use it as you wish.
3. Use a Library
There are libraries that handle the request and error handling for you. Guzzle is one such library; see its documentation for details.
Hope it helps! :)

Related

Getting lyrics from Musixmatch

Unfortunately I have no experience coding PHP, just HTML.
I'm trying to retrieve lyrics from Musixmatch, without success.
I used the code below successfully to retrieve bios from Last.fm. To use it with Musixmatch I changed the values (URL, api_key).
Can you give me a little help?
Thanks so much.
Merry Xmas.
<?php
$fields = array(
'q_track' => $track,
'q_artist' => $artist,
'api_key' => 'xxxxxecab2a0072c88ee31b50a4225b');
$fields_string = '';
foreach($fields as $key=>$value) { $fields_string .= $key.'='.$value.'&'; }
rtrim($fields_string,'&');
$ch = curl_init();
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
curl_setopt($ch, CURLOPT_URL,
'http://api.musixmatch.com/ws/1.1/matcher.lyrics.get');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $fields_string);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($ch);
curl_close($ch);
$soap = simplexml_load_string($response); ?>
<div>
<div />
<h3><?php echo $track; ?></h3>
</div>
<br><div><p><?php print nl2br(strip_tags($soap->body->lyrics->lyrics_body)); ?></p><br></div>
I am new to the Musixmatch API. You need to use the query input parameters (q_*) with the API commands that end with 'search', not 'get'. If you use 'get' you need the ID of the object (e.g. track_id=#####). Regards.
An example that works with 'search' and 'q_' inputs:
curl:
curl --url "http://api.musixmatch.com/ws/1.1/track.search?q_artist=Toto&q_track=Rosanna&apikey=########"
partial curl output:
{"message":{"header":{"status_code":200,"execute_time":0.020166873931885,"available":26},"body":{"track_list":[{"track":{"track_id":88430107
1/ You're sending the request using the POST method, which is not supported by the API.
Try replacing the cURL request with this:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,
'http://api.musixmatch.com/ws/1.1/matcher.lyrics.get?'.$fields_string);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($ch);
curl_close($ch);
2/ The API's response type is JSON, so try using json_decode() instead of simplexml_load_string().
Also, I recommend using HTTPS.
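Putting both points together, a minimal sketch might look like the following. The function names are hypothetical, the 'apikey' parameter name follows the curl example in the other answer, and the JSON path message -> body -> lyrics -> lyrics_body is an assumption based on the question's code and the partial output shown above, so check it against a real response.

```php
<?php
// Build the GET URL; http_build_query() URL-encodes the values.
function musixmatch_url(string $track, string $artist, string $apiKey): string {
    $query = http_build_query(array(
        'q_track'  => $track,
        'q_artist' => $artist,
        'apikey'   => $apiKey,
    ));
    return 'https://api.musixmatch.com/ws/1.1/matcher.lyrics.get?' . $query;
}

// Decode the JSON body and pull out the lyrics text.
// The key path is an assumption; null is returned if it is missing.
function extract_lyrics(string $json): ?string {
    $data = json_decode($json, true);
    return $data['message']['body']['lyrics']['lyrics_body'] ?? null;
}

// Plain GET request; returns null on a transport error.
function fetch_lyrics(string $track, string $artist, string $apiKey): ?string {
    $ch = curl_init(musixmatch_url($track, $artist, $apiKey));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    $response = curl_exec($ch);
    return $response === false ? null : extract_lyrics($response);
}
```

In the page template, `nl2br(htmlspecialchars((string) fetch_lyrics($track, $artist, $key)))` would then replace the `$soap->body->...` expression.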

Instagram API retrieve the code using PHP

I am trying to use the Instagram API, but it's really not easy.
According to the API documentation, a code must be retrieved in order to get an access token and then make requests to the Instagram API.
But after a few tries, I have not succeeded.
I have already configured the settings at https://www.instagram.com/developer
I call the URL api.instagram.com/oauth/authorize/?client_id=[CLIENT_ID]&redirect_uri=[REDIRECT_URI]&response_type=code with curl, but I don't get the redirect URI with the code in the response.
Can you help me please ;)!
I would recommend you use one of the existing PHP instagram client libraries like this https://github.com/cosenary/Instagram-PHP-API
I did this not too long ago, here's a good reference:
https://auth0.com/docs/connections/social/instagram
Let me know if it helps!
I wrote this code for a use case like yours; I hope it has no errors. Here is the code, and I'll explain how it works below.
$authorization_url = "https://api.instagram.com/oauth/authorize/?client_id=".$instagram_client_id."&redirect_uri=".$your_website_redirect_uri."&response_type=code";
$username='ig_username';
$password='ig_password';
$_defaultHeaders = array(
'User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:49.0) Gecko/20100101 Firefox/49.0',
'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Accept-Language: en-US,en;q=0.5',
'Accept-Encoding: ',
'Connection: keep-alive',
'Upgrade-Insecure-Requests: 1',
'Cache-Control: max-age=0'
);
$ch = curl_init();
$cookie='/application/'.strtoupper(VERSI)."instagram_cookie/instagram.txt";
curl_setopt( $ch, CURLOPT_POST, 0 );
curl_setopt( $ch, CURLOPT_HTTPGET, 1 );
if(isset($token)){ // $token: an access token, if you already have one
array_push($_defaultHeaders,"Authorization: ".$token);
}
curl_setopt( $ch, CURLOPT_HTTPHEADER, $_defaultHeaders);
curl_setopt( $ch, CURLOPT_HEADER, true);
curl_setopt( $ch, CURLOPT_SSL_VERIFYPEER, false );
curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true );
curl_setopt( $ch, CURLOPT_COOKIEFILE,getcwd().$cookie );
curl_setopt( $ch, CURLOPT_COOKIEJAR, getcwd().$cookie );
curl_setopt($ch,CURLOPT_URL,$authorization_url);
curl_setopt($ch,CURLOPT_FOLLOWLOCATION,true);
$result = curl_exec($ch);
$redirect_uri = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
$form = explode('login-form',$result)[1];
$form = explode("action=\"",$form)[1];
// vd('asd',$form);
$action = substr($form,0,strpos($form,"\""));
// vd('action',$action);
$csrfmiddlewaretoken = explode("csrfmiddlewaretoken\" value=\"",$form);
$csrfmiddlewaretoken = substr($csrfmiddlewaretoken[1],0,strpos($csrfmiddlewaretoken[1],"\""));
//finish getting parameter
$post_param['csrfmiddlewaretoken']=$csrfmiddlewaretoken;
$post_param['username']=$username;
$post_param['password']=$password;
//format instagram cookie from vaha's answer https://stackoverflow.com/questions/26003063/instagram-login-programatically
preg_match_all('/^Set-Cookie:\s*([^;]*)/mi', $result, $matches);
$cookieFileContent = '';
foreach($matches[1] as $item)
{
$cookieFileContent .= "$item; ";
}
$cookieFileContent = rtrim($cookieFileContent, '; ');
$cookieFileContent = str_replace('sessionid=; ', '', $cookieFileContent);
$cookie=getcwd().'/application/'.strtoupper(VERSI)."instagram_cookie/instagram.txt";
$oldContent = file_get_contents($cookie);
$oldContArr = explode("\n", $oldContent);
if(count($oldContArr))
{
foreach($oldContArr as $k => $line)
{
if(strstr($line, '# '))
{
unset($oldContArr[$k]);
}
}
$newContent = implode("\n", $oldContArr);
$newContent = trim($newContent, "\n");
file_put_contents(
$cookie,
$newContent
);
}
// end format
$useragent = "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:49.0) Gecko/20100101 Firefox/49.0";
$arrSetHeaders = array(
'origin: https://www.instagram.com',
'authority: www.instagram.com',
'upgrade-insecure-requests: 1',
'Host: www.instagram.com',
"User-Agent: $useragent",
'content-type: application/x-www-form-urlencoded',
'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Accept-Language: en-US,en;q=0.5',
'Accept-Encoding: deflate, br',
"Referer: $redirect_uri",
"Cookie: $cookieFileContent",
'Connection: keep-alive',
'cache-control: max-age=0',
);
$ch = curl_init();
curl_setopt($ch, CURLOPT_COOKIEJAR, dirname(__FILE__)."/".$cookie);
curl_setopt($ch, CURLOPT_COOKIEFILE, dirname(__FILE__)."/".$cookie);
curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_HTTPHEADER, $arrSetHeaders);
curl_setopt($ch, CURLOPT_URL, $this->base_url.$action);
curl_setopt($ch, CURLOPT_REFERER, $redirect_uri);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($post_param));
sleep(5);
$page = curl_exec($ch);
preg_match_all('/^Set-Cookie:\s*([^;]*)/mi', $page, $matches);
$cookies = array();
foreach($matches[1] as $item) {
parse_str($item, $cookie1);
$cookies = array_merge($cookies, $cookie1);
}
var_dump($page);
Step 1:
We need to get to the login page first.
We can reach it with a cURL GET request to our application's Instagram authorization URL, with CURLOPT_FOLLOWLOCATION set to true so that we are redirected to the login page:
$authorization_url = "https://api.instagram.com/oauth/authorize/?client_id=".$instagram_client_id."&redirect_uri=".$your_website_redirect_uri."&response_type=code";
$username='ig_username';
This is step one from the Instagram documentation.
The first GET request gives us the response page and its final URL, which we store in $redirect_uri. It must be sent as the Referer header when we make the HTTP POST to log in.
After getting the login page, we need to format the cookie. I took this part from vaha's answer (linked in the code comment above).
Step 2:
After we get the login page, we extract the form's action URL and the value of the hidden csrfmiddlewaretoken input.
With those, we POST the parameters to log in.
We must set the redirect URI, and don't forget the cookie jar and the other header settings shown in the code above. After the login POST succeeds, Instagram will call your redirect URI, for example https://www.yourwebsite.com/save_instagram_code. There you must use or save your Instagram code to get the access token, using curl again (I only explain how to get the code :D).
I wrote this in a short time. I'll update it with code I have tested and that works when I have time. Feel free to suggest an edit with working code or a better explanation.
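The string-splitting in Step 2 can also be done with PHP's DOM extension, which tends to be less fragile than explode(). This is only a sketch: the function name is hypothetical, and it assumes 'login-form' is the id or class of the form element and that the token lives in an input named csrfmiddlewaretoken, as the code above implies.

```php
<?php
// Extract the form action and the hidden CSRF token from login-page HTML.
function parse_login_form(string $html): array {
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // Assumption: 'login-form' appears in the form's id or class attribute.
    $form = $xpath->query(
        '//form[contains(@id, "login-form") or contains(@class, "login-form")]'
    )->item(0);
    if ($form === null) {
        return array('action' => null, 'csrfmiddlewaretoken' => null);
    }

    // Look for the hidden token input inside that form only.
    $input = $xpath->query('.//input[@name="csrfmiddlewaretoken"]', $form)->item(0);

    return array(
        'action' => $form->getAttribute('action'),
        'csrfmiddlewaretoken' => $input !== null ? $input->getAttribute('value') : null,
    );
}

// Made-up sample markup, used only to exercise the function.
$sample = '<html><body><form id="login-form" action="/accounts/login/" method="post">'
        . '<input type="hidden" name="csrfmiddlewaretoken" value="abc123">'
        . '</form></body></html>';
$parsed = parse_login_form($sample);
echo $parsed['action'], ' ', $parsed['csrfmiddlewaretoken'], "\n"; // /accounts/login/ abc123
```

The returned values would take the place of $action and $csrfmiddlewaretoken in the code above.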

Scrape website with javascript using cURL

I try to scrape data of this website:
http://ntthnue.edu.vn/tracuudiem
When I fill in the SBD field with 'TS4740' in the browser, I successfully get the result. However, it does not work when I run my code.
Here is my PHP cURL code:
<?php
function getData($id) {
$url = 'http://ntthnue.edu.vn/tracuudiem';
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, ['sbd' => $id]);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$result = curl_exec($ch);
curl_close($ch);
return $result;
}
echo getData('TS4740');
I just got the old page. Can anybody explain why? Thank you!
Make sure you add all the necessary headers and input data. The server that processes this request can do all kinds of checks to see whether it is a "valid" form request, so you need to make your request look as close to a regular browser request as possible.
Use tools like Chrome DevTools to see both the request and response headers sent between your browser and the server, to better understand what your cURL setup should look like. You can also use an app like Postman to simulate the request easily and see what works and what doesn't.
Working example:
<?php
function getData($id) {
$url = 'http://ntthnue.edu.vn/tracuudiem';
$ch = curl_init($url);
$postdata = 'namhoc=2015-2016&kythi_name=Tuy%E1%BB%83n+sinh+v%C3%A0o+l%E1%BB%9Bp+10&hoten=&sbd='.urlencode($id).'&btnSearch=T%C3%ACm+ki%E1%BA%BFm';
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $postdata);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
'Origin: http://ntthnue.edu.vn',
'User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/43.0.2357.130 Safari/537.36',
'Content-Type: application/x-www-form-urlencoded',
'Referer: http://ntthnue.edu.vn/tracuudiem',
));
$result = curl_exec($ch);
curl_close($ch);
return $result;
}
echo getData('TS4740');
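The hand-encoded $postdata string is hard to read and easy to break. As a sketch, the same body can be built from the decoded values with http_build_query(). The helper name is hypothetical, and the Vietnamese strings are simply the percent-encoded values from the working example, decoded as UTF-8.

```php
<?php
// Build the form body from readable values instead of a pre-encoded string.
// This file must be saved as UTF-8 so the literals encode as expected.
function buildPostData(string $id): string {
    return http_build_query(array(
        'namhoc'     => '2015-2016',
        'kythi_name' => 'Tuyển sinh vào lớp 10',
        'hoten'      => '',
        'sbd'        => $id,
        'btnSearch'  => 'Tìm kiếm',
    ));
}

echo buildPostData('TS4740'), "\n";
```

The result can be passed straight to CURLOPT_POSTFIELDS in place of $postdata.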

POST to REST server by php

I want to make a POST request to a server I have, using PHP.
I was thinking of using curl, but all the examples I find URL-encode the data, and I have to send JSON in the request body, not in the URL.
I already have the JSON data in an array: 'key'=>'value'...
I have to add headers, and I think I can do that with this:
curl_setopt($ch,CURLOPT_HTTPHEADER,array('HeaderName: HeaderValue','HeaderName2: HeaderValue2'));
but I don't know how to add my array and post it.
Any idea?
I need to send JSON like this:
[{"a":"q",
"b":"w",
"c":[{
"e":"w",
"r":"t"
}]
}]
Here is how you can post the data using cURL. Since you mentioned you already have the JSON, you can do it like this:
$ch = curl_init();
curl_setopt( $ch, CURLOPT_URL, 'your api end point');
curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POST, TRUE);
curl_setopt($ch, CURLOPT_POSTFIELDS, $postfields); // $postfields is your JSON string, e.g. json_encode($yourArray)
$request_headers = array(
'HeaderName: HeaderValue',
'HeaderName2: HeaderValue2',
'Content-Type: application/json',
'Content-Length: ' . strlen($postfields),
);
curl_setopt($ch, CURLOPT_HTTPHEADER, $request_headers);
$response = curl_exec($ch);
curl_close ($ch);
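Since the question starts from a PHP array rather than a JSON string, here is a small sketch of the missing step. The array mirrors the JSON shown in the question; post_json() and its parameters are hypothetical names, not part of any library.

```php
<?php
// PHP array matching the JSON structure from the question.
$payload = array(
    array(
        'a' => 'q',
        'b' => 'w',
        'c' => array(
            array('e' => 'w', 'r' => 't'),
        ),
    ),
);

// Serialize once; this string is what goes into CURLOPT_POSTFIELDS.
$postfields = json_encode($payload);
echo $postfields, "\n"; // [{"a":"q","b":"w","c":[{"e":"w","r":"t"}]}]

// Hypothetical helper: POST a JSON string and return the response body,
// or false on a transport error.
function post_json(string $url, string $json, array $extraHeaders = array()) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $json);
    curl_setopt($ch, CURLOPT_HTTPHEADER, array_merge(
        array('Content-Type: application/json'),
        $extraHeaders
    ));
    return curl_exec($ch);
}
```

A call would look like `post_json('your api end point', $postfields, array('HeaderName: HeaderValue'))`.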

PHP CURL: Complete manipulation of HTTP Headers not allowed?

I've been working on writing a script which automatically logs me into my school's network, checks if the classes I'm trying to get into are no longer completely full, and if a spot has opened up, registers the class for me. However, I've hit a big snag in just the logging-in process.
Basically, I've been looking at the headers that are sent when I log in and try to replicate them. The problem is I keep getting an error saying "HTTP/1.1 400 Bad Request Content-Type: text/html Date: Sat, 23 Oct 2010 18:42:20 GMT Connection: close Content-Length: 42
Bad Request (Invalid Header Name)".
I'm guessing it has something to do with Host parameter I'm setting being different from what it really is (I set it so it is elion.psu.edu, but when looking at the headers from my script it has changed back to grantbachman.com, where the script is hosted). I guess it'll be best just to show you.
The beginning of the header I'm trying to create:
https://elion.psu.edu/cgi-bin/elion-student.exe/submit
POST /cgi-bin/elion-student.exe/submit HTTP/1.1
Host: elion.psu.edu
The beginning of the header which shows up when I run my script:
http://myDomain.com/myScriptName.php
GET /elionScript.php HTTP/1.1
Host: myDomain.com
Basically, the first line is different, the Host name is different, and it says I'm sending my info with a GET variable instead of a POST variable (even though I set curlopt_post to true). I'm basically looking for any help with altering this info such that the server accepts my script. I'm fresh out of ideas. Thanks.
Oh here's the code I'm using:
$data = array(
"$userIDName" => '********',
"$passName" => '********',
"$submitName" => 'Login+to+eLion',
'submitController' => '',
'forceUnicode' => '%D0%B4%D0%B0',
'sessionKey' => "$sessionValue",
'pageKey' => "$pageKeyValue",
'shopperID' => '');
$contentLength = strlen($userIDName . '=*********&' . $passName . '=********&' . $submitName .'=Login+to+eLion&submitController=&forceUnicode=%D0%B4%D0%B0&sessionKey=' . $sessionValue . '&pageKey=' . $pageKeyValue . '&shopperID=');
$ch = curl_init("https://elion.psu.edu/cgi-bin/elion-student.exe/submit");
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_COOKIE,'sessionKey="$sessionValue";pageKey="$pageKeyValue";BIGipServerelion_prod_pool="$prodPoolValue"');
curl_setopt($ch,CURLOPT_HTTPHEADER,array(
'Host: elion.psu.edu',
'User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.10) Gecko/20100914 Firefox/3.6.10',
'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Accept-Language: en-us,en;q=0.5','Accept-Encoding: gzip,deflate',
'Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7',
'Keep-Alive: 115',
'Referer: https://elion.psu.edu/cgi-bin/elion-student.exe/launch/ELionMainGUI/Student',
"Cookie: sessionKey=$sessionValue; pageKey=$pageKeyValue; BIGipServerelion_prod_pool=$prodPoolValue", 'Content-Type: application/x-www-form-urlencoded',
"Content-Length: $contentLength"));
$contents2 = curl_exec ($ch);
It's also probably important to note that when I run the script, none of the information below the 'Keep-Alive: 115' line is displayed when I view the header.
It seems some code is missing from your question, but try this:
1- Save your certificate on your server
2- Try this code
$pg = curl_init();
// Set the form data for posting the login information
$postData = array();
$postData["username"] = $username;
$postData["password"] = $password;
$postText = "";
foreach( $postData as $key => $value ) {
$postText .= $key . "=" . urlencode($value) . "&";
}
curl_setopt( $pg, CURLOPT_URL, $YOUR_URL );
curl_setopt( $pg, CURLOPT_POST, true );
curl_setopt( $pg, CURLOPT_POSTFIELDS, $postText );
curl_setopt( $pg, CURLOPT_SSL_VERIFYPEER, true );
curl_setopt( $pg, CURLOPT_SSL_VERIFYHOST, 2 );
curl_setopt( $pg, CURLOPT_CAINFO, getcwd() . '/web'); //web is the exported certificate
//curl_setopt( $pg, CURLOPT_VERBOSE, true ); // for debug
//curl_setopt( $pg, CURLOPT_RETURNTRANSFER, true); // if you want the output returned
curl_setopt( $pg, CURLOPT_COOKIEJAR, "cookies.txt" );
curl_setopt( $pg, CURLOPT_COOKIEFILE, "cookies.txt" );
curl_setopt( $pg, CURLOPT_USERAGENT, "Mozilla/5.0 (compatible; MSIE 5.01; Windows NT 5.0)" );
curl_setopt( $pg, CURLOPT_FOLLOWLOCATION, true );
curl_setopt( $pg, CURLOPT_COOKIE, session_name() . '=' . session_id() );
if( ( $response = curl_exec( $pg ) ) === false ) {
echo '*Curl error: ' . curl_error($pg) . "\n";
}
curl_close($pg);
$YOUR_URL: https://elion.psu.edu/cgi-bin/elion-student.exe/launch/ELionMainGUI/Student
The form uses dynamic field names, so it is not as simple as "username" and "password". Check the page source on the website to find the current ones.
Do not forget to add the other hidden fields, as you did in your own array, and update the cookie section too.
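Two details in the question's code are also worth checking. The CURLOPT_COOKIE string there is single-quoted, so PHP sends the variable names literally instead of their values. And passing an array to CURLOPT_POSTFIELDS makes cURL send multipart/form-data, which conflicts with the hand-written Content-Type and Content-Length headers. The sketch below uses made-up session values and a hypothetical 'submit' field name; the real names are dynamic, as noted above.

```php
<?php
// Hypothetical values standing in for ones scraped from the login page.
$sessionValue = 'abc';
$pageKeyValue = 'xyz';

// Single quotes: no interpolation, the text "$sessionValue" is sent as-is.
$single = 'sessionKey="$sessionValue"';

// Double quotes: variables are replaced by their values.
$cookie = "sessionKey=$sessionValue; pageKey=$pageKeyValue";

// Start from decoded values and encode exactly once.
// A string body is sent as application/x-www-form-urlencoded,
// and cURL computes Content-Length itself.
$body = http_build_query(array(
    'submit'       => 'Login to eLion', // hypothetical field name
    'forceUnicode' => 'да',             // decoded form of %D0%B4%D0%B0
    'sessionKey'   => $sessionValue,
));

echo $single, "\n"; // sessionKey="$sessionValue"
echo $cookie, "\n"; // sessionKey=abc; pageKey=xyz
echo $body, "\n";   // submit=Login+to+eLion&forceUnicode=%D0%B4%D0%B0&sessionKey=abc
```

$cookie would go to CURLOPT_COOKIE and $body to CURLOPT_POSTFIELDS; the manual Cookie, Content-Type and Content-Length headers can then be dropped.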
