cURL set language header
I was trying to fetch the source of Facebook's homepage with cURL, but the page came back entirely in Chinese because of where my server is hosted. To switch the language to English I added an Accept-Language header via CURLOPT_HTTPHEADER, but it had no effect. Following an answer I found, this is the PHP cURL code I tried:
<?php
$url = "http://www.facebook.com/";
if(isset($_SERVER['HTTP_USER_AGENT']))
$user_agent = $_SERVER['HTTP_USER_AGENT'];
else
$user_agent = "";
$options = array(
CURLOPT_RETURNTRANSFER => true, // return web page
CURLOPT_HEADER => false, // don't return headers
CURLOPT_FOLLOWLOCATION => true, // follow redirects
CURLOPT_HTTPHEADER => array("Accept-Language: en-US;q=0.6,en;q=0.4"),
CURLOPT_USERAGENT => $user_agent);
$ch = curl_init($url);
curl_setopt_array($ch, $options);
$content = curl_exec($ch);
$err = curl_errno($ch);
$errmsg = curl_error($ch);
$header = curl_getinfo($ch);
curl_close($ch);
echo $content;
?>
But the page was still in Chinese.
How can I solve this problem?
I am trying to make a cURL request. The problem is that the page shows different text depending on the visitor's country, so I would like the request to ask for en_US (English) and get the English text of the website.
Currently I have this code, but it's not getting the US text.
$url = 'http://testurl.com'; // Not the real URL
$options = array(
CURLOPT_RETURNTRANSFER => true, // return web page
CURLOPT_HEADER => false, // don't return headers
CURLOPT_FOLLOWLOCATION => true, // follow redirects
CURLOPT_HTTPHEADER => array("Accept-Language: en-US;q=0.6,en;q=0.4"),
CURLOPT_USERAGENT => Array("User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.15) Gecko/20080623 Firefox/2.0.0.15"),
);
$ch = curl_init($url);
curl_setopt_array($ch, $options);
$content = curl_exec($ch);
$err = curl_errno($ch);
$errmsg = curl_error($ch);
$header = curl_getinfo($ch);
curl_close($ch);
echo htmlspecialchars($content);
To keep this simple: I would like the cURL request to ask for US English, if possible.
Right now the page comes back in Dutch, I think because my hosting server is located in the Netherlands. I would like to change it to English.
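A minimal sketch of how the language preference could be sent, assuming the site honours Accept-Language at all (a server is free to ignore it and pick a language some other way). The helper names build_accept_language() and english_request_options() and the user-agent string are made up for this example. Note that CURLOPT_USERAGENT expects a plain string, not an array.

```php
<?php
// Hypothetical helper (not from the original posts): builds an
// Accept-Language header line from "language tag => quality" pairs.
function build_accept_language(array $prefs): string
{
    arsort($prefs); // highest quality first, keys preserved
    $parts = [];
    foreach ($prefs as $tag => $q) {
        // q=1 is the default weight, so it can be left out
        $parts[] = ($q >= 1.0) ? $tag : sprintf('%s;q=%.1F', $tag, $q);
    }
    return 'Accept-Language: ' . implode(',', $parts);
}

// Hypothetical helper: option array for curl_setopt_array().
function english_request_options(): array
{
    return [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_HTTPHEADER     => [build_accept_language(['en-US' => 1.0, 'en' => 0.8])],
        // A plain string; the value below is only a placeholder
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (compatible; ExampleFetcher/1.0)',
    ];
}
```

The array returned by english_request_options() can be passed to curl_setopt_array($ch, ...) in place of the $options array used in the question.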
I'm trying to log in to ChannelAdvisor, but the output is an error: HTTP/1.1 302 Moved Temporarily.
Last week it ran perfectly: I logged in and retrieved my data. Now when I run it again I get this error.
Here is my code:
$pages = array('home' =>
'https://login.channeladvisor.com/?gotourl=https%3a%2f%2fcomplete.channeladvisor.com%2f',
'login' =>
'https://login.channeladvisor.com/?gotourl=https%3a%2f%2fcomplete.channeladvisor.com%2f',
'data' =>
'https://merchant.channeladvisor.com/AM/MyInventory/View_Inventory.aspx?apid=32001263');
$ch = curl_init();
//Set options for curl session
$options = array(CURLOPT_USERAGENT => 'Mozilla/12.0 (compatible; MSIE 6.0; Windows NT 5.1)',
CURLOPT_SSL_VERIFYPEER => FALSE,
CURLOPT_SSL_VERIFYHOST => 2,
CURLOPT_HEADER => TRUE,
CURLOPT_RETURNTRANSFER => TRUE,
CURLOPT_COOKIEFILE => 'cookies.txt',
CURLOPT_COOKIEJAR => 'cookies.txt');
//Hit home page for session cookie
$options[CURLOPT_URL] = $pages['home'];
curl_setopt_array($ch, $options);
//curl_exec($ch);
//Login
$options[CURLOPT_URL] = $pages['login'];
$options[CURLOPT_POST] = TRUE;
$options[CURLOPT_POSTFIELDS] = 'username=xxxxx@gmail.com&password=xxxxxxx';
$options[CURLOPT_FOLLOWLOCATION] = false;
curl_setopt_array($ch, $options);
curl_exec($ch);
//Hit data page
$options[CURLOPT_URL] = $pages['data'];
curl_setopt_array($ch, $options);
$data = curl_exec($ch);
//Output data
echo $data;
//Close curl session
curl_close($ch);
If you are looking for data from 'https://merchant.channeladvisor.com/AM/MyInventory/View_Inventory.aspx', why not use the API instead?
http://developer.channeladvisor.com/display/cadn/Inventory+Service
If exporting your inventory information is all you're trying to do, ChannelAdvisor has its own inventory export service available in the UI. At the very least, you can automate your code to kick that off and download the exported CSV or tab-delimited file: http://ssc.channeladvisor.com/howto/exporting-inventory
I think you should handle this kind of change on ChannelAdvisor's side by following the redirect.
Since logging in to ChannelAdvisor with cURL is not a common approach, you will need to update your code every time ChannelAdvisor changes its login flow. It reminds me of when cURL was the only way to retrieve Google Analytics data: every time they updated the login system, you had to rewrite your own cURL login method, which was tedious.
See this answer about following a Location header with cURL. Basically:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
$a = curl_exec($ch);
if(preg_match('#Location: (.*)#', $a, $r))
$l = trim($r[1]);
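The regular expression above is case-sensitive and not anchored to the start of a line. A slightly more defensive sketch of the same idea follows; the function name extract_location() is made up for this example, and it assumes $rawHeaders holds the header text returned by curl_exec() when CURLOPT_HEADER is enabled.

```php
<?php
// Hypothetical helper: returns the value of the first Location header found
// in a raw header block, or null when there is none.
function extract_location(string $rawHeaders): ?string
{
    // "i": header names are case-insensitive; "m": ^ and $ match per line
    if (preg_match('/^Location:[ \t]*(.*)$/mi', $rawHeaders, $m) === 1) {
        return trim($m[1]); // trim() also drops a trailing "\r"
    }
    return null;
}
```

The returned URL can then be passed to a second request on the same handle, so the cookies set by the login response are sent along.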
I want to check whether a website is up or down at a particular instance using PHP. I came to know that curl will fetch the contents of the file but I don't want to read the content of the website. I just want to check the status of the website. Is there any way to check the status of the site? Can we use ping to check the status? It is sufficient for me to get the status signals like (404, 403, etc) from the server. A small snippet of code might help me a lot.
Something like this should work:
$url = 'yoururl';
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_exec($ch);
$retcode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);
if (200==$retcode) {
// All's well
} else {
// not so much
}
curl -Is $url | grep HTTP | cut -d ' ' -f2
curl -Is $url outputs just the headers.
grep HTTP filters to the HTTP response header.
cut -d ' ' -f2 trims the output to the second "word", in this case the status code.
Example:
$ curl -Is google.com | grep HTTP | cut -d ' ' -f2
301
function checkStatus($url) {
$agent = "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_8; pt-pt) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27";
// initializes curl session
$ch = curl_init();
// sets the URL to fetch
curl_setopt($ch, CURLOPT_URL, $url);
// sets the content of the User-Agent header
curl_setopt($ch, CURLOPT_USERAGENT, $agent);
// make sure you only check the header - taken from the answer above
curl_setopt($ch, CURLOPT_NOBODY, true);
// follow "Location: " redirects
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
// return the transfer as a string
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
// disable output verbose information
curl_setopt($ch, CURLOPT_VERBOSE, false);
// max number of seconds to allow cURL function to execute
curl_setopt($ch, CURLOPT_TIMEOUT, 5);
// execute
curl_exec($ch);
// get HTTP response code
$httpcode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);
if ($httpcode >= 200 && $httpcode < 300)
return true;
else
return false;
}
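// how to use
//===================
// checkStatus() is declared above as a plain function, so call it directly
// (use $this->checkStatus() only if you move it into a class as a method)
if (checkStatus("http://www.dineshrabara.in"))
    echo "Website is up";
else
    echo "Website is down";
exit;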
Here is how I did it. I set the user agent to minimize the chance of the target banning me and also disabled SSL verification since I know the target:
private static function checkSite( $url ) {
$useragent = $_SERVER['HTTP_USER_AGENT'];
$options = array(
CURLOPT_RETURNTRANSFER => true, // return web page
CURLOPT_HEADER => false, // do not return headers
CURLOPT_FOLLOWLOCATION => true, // follow redirects
CURLOPT_USERAGENT => $useragent, // who am i
CURLOPT_AUTOREFERER => true, // set referer on redirect
CURLOPT_CONNECTTIMEOUT => 2, // timeout on connect (in seconds)
CURLOPT_TIMEOUT => 2, // timeout on response (in seconds)
CURLOPT_MAXREDIRS => 10, // stop after 10 redirects
CURLOPT_SSL_VERIFYPEER => false, // SSL verification not required
CURLOPT_SSL_VERIFYHOST => false, // SSL verification not required
);
$ch = curl_init( $url );
curl_setopt_array( $ch, $options );
curl_exec( $ch );
$httpcode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);
return ($httpcode == 200);
}
Have you seen the get_headers() function? http://it.php.net/manual/en/function.get-headers.php It seems to do exactly what you need.
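A rough sketch of how the result of get_headers() might be turned into a status code. The helper name final_status_code() is made up for this example. It assumes the default indexed array, where each response's status line appears as a plain string such as "HTTP/1.1 200 OK", and it takes the last one because get_headers() follows redirects by default.

```php
<?php
// Hypothetical helper: pulls the final HTTP status code out of an array
// shaped like the one get_headers() returns.
function final_status_code(array $headers): ?int
{
    $code = null;
    foreach ($headers as $line) {
        // One status line per response when redirects were followed
        if (is_string($line) && preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $line, $m)) {
            $code = (int) $m[1];
        }
    }
    return $code;
}

// Example call (needs network access, so it is left commented out):
// $headers = @get_headers('http://www.example.com/');
// $up = $headers !== false && final_status_code($headers) === 200;
```

get_headers() returns false when the request fails, which is why the example checks for that before reading a status code.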
If you use curl directly with the -I flag, it will return the HTTP headers (404 etc) instead of the page HTML. In PHP, the equivalent is the curl_setopt($ch, CURLOPT_NOBODY, 1); option.
ping won't do what you're looking for - it will only tell you if the machine is up (and responding to ping). That doesn't necessarily mean that the webserver is up, though.
You might want to try using the http_head method - it'll retrieve the headers that the webserver sends back to you. If the server is sending back headers, then you know it's up and running.
You cannot test a web server with ping, because ping checks a different service. The machine may be running while the web server daemon has crashed. So curl is your friend: just ignore the content.
This function checks whether a URL exists or not. The check takes at most 300 ms, but you can change that through the cURL option CURLOPT_TIMEOUT_MS.
/*
 * Check if a URL exists
 *
 * @param $url Some URL
 * @param $strict Pass true to accept only an HTTP 200 response code,
 * or pass a custom response code (or an array of codes) like 302, 304 etc.
 *
 * @return boolean true or false
 */
function is_url_exists($url, $strict = false)
{
if (is_int($strict) && $strict >= 100 && $strict < 600 || is_array($strict)) {
if(is_array($strict)) {
$response = $strict;
} else {
$response = [$strict];
}
} else if ($strict === true || $strict === 1) {
$response = [200];
} else {
$response = [200,202,301,302,303];
}
$ch = curl_init( $url );
$options = [
CURLOPT_NOBODY => true,
CURLOPT_FAILONERROR => true,
CURLOPT_RETURNTRANSFER => true,
CURLOPT_NOSIGNAL => true,
CURLOPT_SSL_VERIFYPEER => false,
CURLOPT_SSL_VERIFYHOST => false,
CURLOPT_HEADER => false,
CURLOPT_FOLLOWLOCATION => true,
CURLOPT_VERBOSE => false,
CURLOPT_USERAGENT => ( $_SERVER['HTTP_USER_AGENT'] ?? '' ),
CURLOPT_TIMEOUT_MS => 300, // Timeout in milliseconds
CURLOPT_MAXREDIRS => 2,
];
curl_setopt_array($ch, $options);
$return = curl_exec($ch);
$errno = curl_errno($ch);
$httpcode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);
if (!$errno && $return !== false) {
return ( in_array($httpcode, $response) !== false );
}
return false;
}
You can check any URL: domains, IP addresses, images, files, etc. I think this is the fastest way, and it has proven useful for me.
I've never done something like this before. I'm trying to log into swagbucks.com and retrieve some information, but it's not working. Can someone tell me what's wrong with my script?
<?php
$pages = array('home' =>
'http://swagbucks.com/?cmd=home',
'login' =>
'http://swagbucks.com/?cmd=sb-login&from=/?cmd=home',
'schedule' =>
'http://swagbucks.com/?cmd=sb-acct-account&display=2');
$ch = curl_init();
//Set options for curl session
$options = array(CURLOPT_USERAGENT => 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6',
CURLOPT_HEADER => TRUE,
//CURLOPT_RETURNTRANSFER => TRUE,
CURLOPT_COOKIEFILE => 'cookie.txt',
CURLOPT_COOKIEJAR => 'cookies.txt');
//Hit home page for session cookie
$options[CURLOPT_URL] = $pages['home'];
curl_setopt_array($ch, $options);
curl_exec($ch);
//Login
$options[CURLOPT_URL] = $pages['login'];
$options[CURLOPT_POST] = TRUE;
$options[CURLOPT_POSTFIELDS] = 'emailAddress=lala@yahoo.com&pswd=jblake&persist=on';
$options[CURLOPT_FOLLOWLOCATION] = FALSE;
curl_setopt_array($ch, $options);
curl_exec($ch);
//Hit schedule page
$options[CURLOPT_URL] = $pages['schedule'];
curl_setopt_array($ch, $options);
$schedule = curl_exec($ch);
//Output schedule
echo $schedule;
//Close curl session
curl_close($ch);
?>
But it still doesn't log me in. What's wrong?
Try echoing the response of each request (with CURLOPT_RETURNTRANSFER enabled) to see where something goes wrong.
I suggest you use
curl_setopt($ch, CURLOPT_COOKIEFILE, '/dev/null');
This way cookies are kept in memory for the lifetime of the handle, without the need for a separate file.
It works for me with "persist=1", not "persist=on":
$options[CURLOPT_POSTFIELDS] = 'emailAddress=lala@yahoo.com&pswd=jblake&persist=on'; // doesn't work
$options[CURLOPT_POSTFIELDS] = 'emailAddress=lala@yahoo.com&pswd=jblake&persist=1'; // works
$options[CURLOPT_POSTFIELDS] = 'emailAddress=lala@yahoo.com&pswd=jblake'; // also works
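Rather than concatenating the POST string by hand, the fields could be built with http_build_query(), which percent-encodes special characters in the values. This is only a sketch: the function name build_login_fields() is made up, the field names are copied from the snippets above, and the address and password in the usage line are placeholders.

```php
<?php
// Hypothetical helper: builds the URL-encoded POST body for the login request.
function build_login_fields(string $email, string $password, bool $persist = false): string
{
    $fields = ['emailAddress' => $email, 'pswd' => $password];
    if ($persist) {
        $fields['persist'] = 1; // "1" rather than "on", as reported above
    }
    // Pass the separator explicitly so the arg_separator.output ini setting
    // cannot change the result
    return http_build_query($fields, '', '&', PHP_QUERY_RFC1738);
}

// Usage with placeholder credentials:
// $options[CURLOPT_POSTFIELDS] = build_login_fields('user@example.com', 'secret', true);
```

Passing a string to CURLOPT_POSTFIELDS sends the body as application/x-www-form-urlencoded, while passing an array makes cURL send multipart/form-data, so the string form matches what the snippets above send.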