Check if Youtube video valid? Code Fix - php

I have a database with around 5000 videos, and I noticed that some of them have since been removed, so I decided to write a PHP script to check them in bulk.
Below is the code I implemented, based on most of the answers here, but it doesn't give correct results. It returns a 403 header for about three quarters of the videos, although in practice more than 90% of them are working. Am I missing anything?
foreach ($video as $cat) {
$str = explode("=",$cat->videourl);
$headers = get_headers('http://gdata.youtube.com/feeds/api/videos/' . $str[1]);
if (!strpos($headers[0], '200')) {
print_r($headers[0].'<br>');
$i=$i+1;
print_r("Unpublish".$cat->id. PHP_EOL);
}
else{
print_r("publish".$cat->id. PHP_EOL);
}
}
I'm printing the header here to debug it, and for most videos it gives HTTP/1.0 403 Forbidden.
Edit: I've already checked that the video IDs are passed correctly (so the string processing has no issues).

For anyone trying to achieve this, here is the code. I'd appreciate hearing whether it works for you, as I spent hours getting it working with the new API.
$headers = checkYoutubeId ($str[1]);
if ($headers == false) {
$i=$i+1;
$db->query('UPDATE `ckf_hdflv_upload` SET `published`="0" Where `id`='.$cat->id);
print_r("Unpublished".$cat->id. PHP_EOL);
}
else{
$db->query('UPDATE `ckf_hdflv_upload` SET `published`="1" Where `id`='.$cat->id);
}
}
}
echo('done'.$i);
function checkYoutubeId($id) {
if (!$data = @file_get_contents("http://gdata.youtube.com/feeds/api/videos/".$id)) return false;
if ($data == "Video not found") return false;
if ($data == "Private video") return false;
return true;
}
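A side note on the ID extraction: explode("=", ...) as used in the question returns something like "dQw4w9WgXcQ&feature" as soon as the stored URL carries a second parameter. A minimal sketch of a stricter extraction is below; extractYoutubeId is a hypothetical helper name, and the 11-character ID pattern is an assumption about what the stored URLs look like, not something the original script checks.

```php
<?php
// Hypothetical helper (not part of the original script): pull the video ID
// out of a watch URL without assuming "v" is the only query parameter.
function extractYoutubeId(string $url): ?string
{
    $query = parse_url($url, PHP_URL_QUERY);
    if (!is_string($query)) {
        return null; // no query string at all
    }
    parse_str($query, $params);
    $id = $params['v'] ?? null;
    // Assumption: video IDs are 11 characters from [A-Za-z0-9_-].
    if (!is_string($id) || !preg_match('/^[A-Za-z0-9_-]{11}$/', $id)) {
        return null;
    }
    return $id;
}

var_dump(extractYoutubeId('http://www.youtube.com/watch?v=dQw4w9WgXcQ&feature=related'));
```

In the loop this would replace $str[1], with a null result treated as "could not parse" rather than "video removed".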

Related

How to add an error handling to read an XML file in php?

I am developing a PHP script that allows me to modify tags in an XML file and move them once done.
My script works correctly, but I would like to add error handling: if my SQL query returns nothing, the script should display an error message (or, better, send a mail), leave the file with the error where it is, and move on to the next file.
I did some tests but the code never displays the error and it moves the file anyway.
Can someone help me to understand why? Thanks
<?php
}
}
$xml->formatOutput = true;
$xml->save($source_file);
rename($source_file,$destination_file);
}
}
closedir($dir);
?>
Give this one a try
$result = odbc_fetch_array($exec);
if ($result === false || $result['GEAN'] === null) {
echo "GEAN not found for $SKU_CODE";
// continue;
}
$barcode = (string) $result['GEAN'];
echo $barcode; echo "<br>"; //9353970875729
$node->getElementsByTagName("SKU")->item(0)->nodeValue = "";
$node->getElementsByTagName("SKU")->item(0)->appendChild($xml->createTextNode($result['GEAN']));
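To actually skip the file instead of moving it, the check has to end the current iteration before the save and rename steps run. Here is a minimal sketch of that control flow; barcodeFromRow and the $rows stand-in data are hypothetical, and the real script would get each row from odbc_fetch_array() inside its own file loop.

```php
<?php
// Hypothetical helper: returns the barcode as a string, or null when the
// row is missing (odbc_fetch_array() returns false) or GEAN is empty.
function barcodeFromRow($row): ?string
{
    if (!is_array($row) || !isset($row['GEAN']) || $row['GEAN'] === '') {
        return null;
    }
    return (string) $row['GEAN'];
}

// Stand-in data: file name => row as the query might return it.
$rows = [
    'a.xml' => ['GEAN' => '9353970875729'],
    'b.xml' => false,
    'c.xml' => ['GEAN' => null],
];

$moved = [];
$skipped = [];
foreach ($rows as $file => $row) {
    $barcode = barcodeFromRow($row);
    if ($barcode === null) {
        $skipped[] = $file; // report (or mail) the error here
        continue;           // do not save or rename this file
    }
    // ... update the SKU node, save the document, then rename() the file ...
    $moved[] = $file;
}

print_r(['moved' => $moved, 'skipped' => $skipped]);
```

The important part is that continue sits before any save or rename call, so a failed lookup can never reach them.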

php file_get_contents() returning false with a valid url

I'm currently working on a geocoding PHP function using the Google Maps API. Strangely, file_get_contents() returns bool(false) even though the URL I use is, I think, properly encoded.
In my browser, when I test the code, the page takes a very long time to load, and the geocoding doesn't work (of course, given that the API doesn't give me what I want).
Also I tried to use curl, no success so far.
If anyone could help me, that'd be great !
Thanks a lot.
The code :
function test_geocoding2(){
$addr = "14 Boulevard Vauban, 26000 Valence";
if(!gc_geocode($addr)){
echo "false <br/>";
}
}
function gc_geocode($address){
$address = urlencode($address);
$url = "http://maps.google.com/maps/api/geocode/json?address={$address}";
$resp_json = file_get_contents($url);
$resp = json_decode($resp_json, true);
if($resp['status']=='OK'){
$lati = $resp['results'][0]['geometry']['location']['lat'];
$longi = $resp['results'][0]['geometry']['location']['lng'];
if($lati && $longi){
echo "(" . $lati . ", " . $longi . ")";
}else{
echo "data not complete <br/>";
return false;
}
}else{
echo "status not ok <br/>";
return false;
}
}
UPDATE : The problem was indeed the fact that I was behind a proxy. I tested with another network, and it works properly.
However, your answers about what I return and how I test the success are very nice as well, and will help me to improve the code.
Thanks a lot !
The problem was the fact that I was using a proxy. The code is correct.
To check if there is a proxy between you and the Internet, you must know the infrastructure of your network. If you work from a school or a company network, it is very likely that a proxy is used in order to protect the local network.
If you do not know the answer, ask your network administrator.
If there is no declared proxy in your network, a transparent proxy may still be in place. However, as the accepted answer to this question states: https://superuser.com/questions/505772/how-can-i-find-out-if-there-is-a-proxy-between-myself-and-the-internet-if-there
If it's a transparent proxy, you won't be able to detect it on the client PC.
Some websites also provide proxy detectors, though I have no idea how reliable the information they give is. Here are two examples:
http://amibehindaproxy.com/
http://www.proxyserverprivacy.com/free-proxy-detector.shtml
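If the proxy is a declared one, file_get_contents() can be pointed at it through a stream context. This is only a sketch: proxyContextOptions is a hypothetical helper, and tcp://proxy.example.com:8080 is a placeholder address you would replace with whatever your network administrator gives you.

```php
<?php
// Hypothetical helper: build HTTP stream context options for a proxy.
// The proxy address passed in below is a placeholder, not a real server.
function proxyContextOptions(string $proxy, int $timeoutSeconds = 10): array
{
    return [
        'http' => [
            'proxy'           => $proxy,
            'request_fulluri' => true, // many proxies expect the full URI in the request line
            'timeout'         => $timeoutSeconds, // fail after this many seconds instead of hanging
        ],
    ];
}

$context = stream_context_create(proxyContextOptions('tcp://proxy.example.com:8080'));

// Then pass the context as the third argument:
// $resp_json = file_get_contents($url, false, $context);
```

Setting a timeout also avoids the very long page load described in the question when the request cannot get through.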
When a function does not return anything, it returns null.
Just use this:
if(!is_null(gc_geocode($addr))) {
echo "false <br/>";
}
Or:
if(gc_geocode($addr) === false) {
echo "false <br/>";
}
Take a look at the if statement:
if(!gc_geocode($addr)){
echo "false <br/>";
}
This means that if gc_geocode($addr) returns either false or null, this statement will echo "false".
However, you never actually return anything from the function, so on success, it's returning null:
$address = urlencode($address);
$url = "http://maps.google.com/maps/api/geocode/json?address={$address}";
$resp_json = file_get_contents($url);
$resp = json_decode($resp_json, true);
if($resp['status']=='OK'){
$lati = $resp['results'][0]['geometry']['location']['lat'];
$longi = $resp['results'][0]['geometry']['location']['lng'];
if($lati && $longi){
echo "(" . $lati . ", " . $longi . ")"; //ECHO isn't RETURN
/* You should return something here, e.g. return true */
} else {
echo "data not complete <br/>";
return false;
}
} else {
echo "status not ok <br/>";
return false;
}
Alternatively, you can just change the if statement to only fire when the function returns false:
if(gc_geocode($addr)===false){
//...
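One way to make the return value unambiguous is to have the function return the coordinates on success and false on every failure path, and leave the echoing to the caller. The sketch below works on the JSON string only, so it can be tried without a network call; parseGeocodeResponse is a hypothetical name and the sample JSON is made up to match the fields the question reads.

```php
<?php
// Hypothetical helper: returns ['lat' => float, 'lng' => float] on success,
// false on any failure (fetch failed, bad JSON, status not OK, missing data).
function parseGeocodeResponse($respJson)
{
    if (!is_string($respJson)) {
        return false; // file_get_contents() returned false
    }
    $resp = json_decode($respJson, true);
    if (!is_array($resp) || ($resp['status'] ?? null) !== 'OK') {
        return false;
    }
    $location = $resp['results'][0]['geometry']['location'] ?? null;
    if (!isset($location['lat'], $location['lng'])) {
        return false;
    }
    return ['lat' => (float) $location['lat'], 'lng' => (float) $location['lng']];
}

// Made-up sample response containing only the fields used above.
$sample = '{"status":"OK","results":[{"geometry":{"location":{"lat":44.93,"lng":4.89}}}]}';
$coords = parseGeocodeResponse($sample);
if ($coords === false) {
    echo "false <br/>";
} else {
    echo "(" . $coords['lat'] . ", " . $coords['lng'] . ")";
}
```

Using isset() rather than a truthiness test also keeps a latitude or longitude of exactly 0 from being reported as incomplete data.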
The gc_geocode() function above works properly on my system, without any extra load. When you call gc_geocode(), it prints the correct latitude and longitude. The problem is how you check the result. Instead of
if(!gc_geocode($addr)){
echo "false <br/>";
}
Use
if($responce=gc_geocode($addr)){
echo $responce;
}
else{
echo "false <br/>";
}

JSON PHP parsing search API

Alright, I'm using the Blekko search API:
http://blekko.com/ws/?q=hello+%2Fjson
how would I go about parsing it ?
I have no experience of parsing JSON from PHP, so I'd appreciate a little help, and the json_decode() docs failed to explain everything for me, particularly getting the data inside RESULT. :) You know, [ and ].
Could you help pointing me in the right direction ? :)
Thank you, you're all so helpful! :)
Here's the code to access the API.
You should add your own handling for errors and unexpected results where I've left the comments.
$data = file_get_contents('http://blekko.com/ws/?q=hello+%2Fjson');
if(!empty($data)){
$data = json_decode($data);
if(!empty($data->ERROR)){
// Error with API response.
} else {
$data = $data->RESULT;
if(empty($data)){
// No results.
} else {
// Uncomment the line below to see your data
// echo '<pre>' . print_r($data, true) . '</pre>';
foreach($data AS $key => $val){
echo $val->short_host . '<br />';
}
}
}
} else {
// Failed to retrieve data.
}
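About the [ and ] in the question: in JSON they mark an array, which json_decode() turns into a PHP array you can loop over with foreach. Below is a small sketch using associative arrays (second argument true) instead of objects. The sample JSON is made up; only the RESULT and short_host names come from the code above, and the real response will contain more fields.

```php
<?php
// Made-up sample in the assumed shape of the response: RESULT holds a
// JSON array ([ ... ]) of objects ({ ... }).
$json = '{"RESULT":[{"short_host":"example.com"},{"short_host":"example.org"}]}';

// With true as the second argument, JSON objects become associative
// arrays and JSON arrays become numerically indexed PHP arrays.
$data = json_decode($json, true);

$hosts = [];
if (is_array($data) && isset($data['RESULT']) && is_array($data['RESULT'])) {
    foreach ($data['RESULT'] as $item) {
        if (isset($item['short_host'])) {
            $hosts[] = $item['short_host'];
        }
    }
}

echo implode("\n", $hosts), "\n";
```

A single entry can also be read directly by index, for example $data['RESULT'][0]['short_host'].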

php: inconsistent behaviour with HTTP headers

I am testing lots of links on the same domain to see whether they exist or not. I am using the following code:
function get_http_response_code($url)
{
$headers = get_headers($url);
return substr($headers[0], 9, 3);
}
function getURLs()
{
foreach($allResults as $result)
{
$tempURL = 'http://www.doma.in/foo/'.$result.'/bar';
if(get_http_response_code($tempURL) != "404" && get_http_response_code($tempURL) != "500")
{
$URLs[] = $tempURL;
}
else
{
echo $tempURL.' could not be reached<br />';
}
}
return $URLs;
}
$URLs = getURLs();
The problem is, among the hundreds that do exist, the $URLs array contains URLs that do not exist (404); sometimes two, sometimes four, but every time it produces an HTTP/1.0 404 Not Found error. Why such variance? Is there a timeout I should be setting? Any help will be appreciated.
As I understand your code, the problem is a mistake in the variable $url.
Try this.
...
foreach($allResults as $result)
{
$tempURL = 'http://www.doma.in/foo/'.$result['url'].'/bar';
...
Here $url has been changed to $result.
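One more thing that may contribute to the variance: the condition in the question calls get_http_response_code() twice, so every URL is requested twice and the two responses are not guaranteed to match. A sketch of parsing the status line once and deciding from that single value is below; parseStatusCode and urlLooksReachable are hypothetical names, not functions from the question.

```php
<?php
// Hypothetical helper: extract the numeric status from a status line such
// as "HTTP/1.0 404 Not Found". Returns null if the line is not recognised.
function parseStatusCode(string $statusLine): ?int
{
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $statusLine, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Hypothetical helper: same rule as the question (not 404, not 500), but
// an unknown status (null) also counts as unreachable.
function urlLooksReachable(?int $code): bool
{
    return $code !== null && $code !== 404 && $code !== 500;
}

// Usage inside the loop, with one request per URL:
// $headers = @get_headers($tempURL);           // false if the request failed
// $code = $headers ? parseStatusCode($headers[0]) : null;
// if (urlLooksReachable($code)) { $URLs[] = $tempURL; }
```

Treating a failed request as unreachable matters because get_headers() returns false on failure, and the substr() approach then yields a value that is neither "404" nor "500".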

PHP script to show google ranking results

Does anyone know if it is possible to display the Google PageRank of a particular website using a PHP script?
If it is possible, how do I do it?
Okay, I rewrote my answer and extracted only the relevant part of my SEO helper (my previous version had other stuff in it, like Alexa Rank, Google Index, Yahoo Links etc. If you are looking for that, just check an older revision of this answer!)
Please be aware that there are pages that have NO PAGERANK, and by no I DON'T MEAN ZERO. There is just none. This may be because the page is so very unimportant (even less important than PR 0), or because it is just so new, in which case it might very well be important.
This is considered the same as PR 0 in my class!
This has some pros and some cons. If possible you should handle it separately in your logic, but this is not always possible, so 0 is the next best approach.
Furthermore:
This code is reverse engineered and does not use any sort of API that comes with an SLA or anything like it.
So it might stop working ANY TIME!
And PLEASE DON'T FLOOD GOOGLE!
I made the test. If you sleep only for a very short period, Google blocks you after 1000 requests (for quite some time!). With a random sleep between 1.5 and 2 secs it looks fine.
I once crawled the PageRank for 70k pages. Only once, because I just needed it. I did only 5k a day from several IPs, and now I have the data and it doesn't get outdated, because the pages have been there for decades.
IMO it's totally OK to check a PageRank once in a while, or even a few at once, but don't misuse this code or Google may lock us out altogether!
<?php
/*
* @author Joe Hopfgartner <joe@2x.to>
*/
class Helper_Seo
{
protected static function _pageRankStrToNum($Str,$Check,$Magic) {
$Int32Unit=4294967296;
// 2^32
$length=strlen($Str);
for($i=0;$i<$length;$i++) {
$Check*=$Magic;
//If the float is beyond the boundaries of integer (usually +/- 2.15e+9 = 2^31),
// the result of converting to integer is undefined
if($Check>=$Int32Unit) {
$Check=($Check-$Int32Unit*(int)($Check/$Int32Unit));
//if the check less than -2^31
$Check=($Check<-2147483648)?($Check+$Int32Unit):$Check;
}
$Check+=ord($Str[$i]);
}
return $Check;
}
/*
* Generate a hash for a url
*/
protected static function _pageRankHashURL($String) {
$Check1=self::_pageRankStrToNum($String,0x1505,0x21);
$Check2=self::_pageRankStrToNum($String,0,0x1003F);
$Check1>>=2;
$Check1=(($Check1>>4)&0x3FFFFC0)|($Check1&0x3F);
$Check1=(($Check1>>4)&0x3FFC00)|($Check1&0x3FF);
$Check1=(($Check1>>4)&0x3C000)|($Check1&0x3FFF);
$T1=(((($Check1&0x3C0)<<4)|($Check1&0x3C))<<2)|($Check2&0xF0F);
$T2=(((($Check1&0xFFFFC000)<<4)|($Check1&0x3C00))<<0xA)|($Check2&0xF0F0000);
return($T1|$T2);
}
/*
* generate a checksum for the hash string
*/
protected static function CheckHash($Hashnum) {
$CheckByte=0;
$Flag=0;
$HashStr=sprintf('%u',$Hashnum);
$length=strlen($HashStr);
for($i=$length-1;$i>=0;$i--) {
$Re=$HashStr[$i];
if(1===($Flag%2)) {
$Re+=$Re;
$Re=(int)($Re/10)+($Re%10);
}
$CheckByte+=$Re;
$Flag++;
}
$CheckByte%=10;
if(0!==$CheckByte) {
$CheckByte=10-$CheckByte;
if(1===($Flag%2)) {
if(1===($CheckByte%2)) {
$CheckByte+=9;
}
$CheckByte>>=1;
}
}
return '7'.$CheckByte.$HashStr;
}
public static function getPageRank($url) {
$fp=fsockopen("toolbarqueries.google.com",80,$errno,$errstr,30);
if(!$fp) {
trigger_error("$errstr ($errno)<br />\n");
return false;
}
else {
$out="GET /search?client=navclient-auto&ch=".self::CheckHash(self::_pageRankHashURL($url))."&features=Rank&q=info:".$url."&num=100&filter=0 HTTP/1.1\r\n";
$out.="Host: toolbarqueries.google.com\r\n";
$out.="User-Agent: Mozilla/4.0 (compatible; GoogleToolbar 2.0.114-big; Windows XP 5.1)\r\n";
$out.="Connection: Close\r\n\r\n";
fwrite($fp,$out);
#echo " U: http://toolbarqueries.google.com/search?client=navclient-auto&ch=".$this->CheckHash($this->_pageRankHashURL($url))."&features=Rank&q=info:".$url."&num=100&filter=0";
#echo "\n";
//$pagerank = substr(fgets($fp, 128), 4);
//echo $pagerank;
#echo "DATA:\n\n";
$responseOK = false;
$response = "";
$inhead = true;
$body = "";
while(!feof($fp)) {
$data=fgets($fp,128);
if($data == "\r\n" && $inhead) {
$inhead = false;
} else {
if(!$inhead) {
$body.= $data;
}
}
//if($data == '\r\n\r\n')
$response .= $data;
if(trim($data) == 'HTTP/1.1 200 OK') {
$responseOK = true;
}
#echo "D ".$data;
$pos=strpos($data,"Rank_");
if($pos===false) {
}
else {
$pagerank=trim(substr($data,$pos+9));
if($pagerank === '0') {
fclose($fp);
return 0;
} else if(intval($pagerank) === 0) {
throw new Exception('couldnt get pagerank from string: '.$pagerank);
//trigger_error('couldnt get pagerank from string: '.$pagerank);
fclose($fp);
return false;
} else {
fclose($fp);
return intval( $pagerank );
}
}
}
fclose($fp);
//var_dump($body);
if($responseOK && $body=='') {
return 0;
}
//return 0;
throw new Exception('couldnt get pagerank, unknown error. probably google flood block. my tests showed that 1req/sec is okay! i recommend a random sleep between 1.5 and 2 secs. no sleep breaks at ~1000 reqs.');
//trigger_error('couldnt get pagerank, unknown error. probably google flood block.');
return false;
}
}
}
$url = "http://www.2xfun.de/";
$pagerank = Helper_Seo::getPagerank($url);
var_dump($pagerank);
?>
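The random sleep between 1.5 and 2 seconds recommended above could look roughly like this. It is just a sketch: randomDelayMicroseconds is a hypothetical helper that is not part of the class, and the $urls list in the usage comment is assumed.

```php
<?php
// Hypothetical helper: pick a random pause between $minMs and $maxMs
// milliseconds (inclusive) and return it in microseconds for usleep().
function randomDelayMicroseconds(int $minMs = 1500, int $maxMs = 2000): int
{
    return random_int($minMs, $maxMs) * 1000;
}

// Usage with the class above, pausing after every lookup:
// foreach ($urls as $url) {
//     $rank = Helper_Seo::getPageRank($url);
//     usleep(randomDelayMicroseconds());
// }
```

Keeping the delay in its own function makes it easy to raise the bounds if requests start getting blocked.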
