I want to fetch Google Images results for any query. I have gone through the Google Image Search API but was unable to understand it. I have also seen some methods that fetch images, but only from the first page. I have used the following method.
function getGoogleImg($k)
{
    $url = "http://images.google.it/images?as_q=##query##&hl=it&imgtbs=z&btnG=Cerca+con+Google&as_epq=&as_oq=&as_eq=&imgtype=&imgsz=m&imgw=&imgh=&imgar=&as_filetype=&imgc=&as_sitesearch=&as_rights=&safe=images&as_st=y";
    $web_page = file_get_contents(str_replace("##query##", urlencode($k), $url));
    $tieni = stristr($web_page, "dyn.setResults(");
    $tieni = str_replace("dyn.setResults(", "", str_replace(stristr($tieni, ");"), "", $tieni));
    $tieni = str_replace("[]", "", $tieni);
    $m = preg_split("/[\[\]]/", $tieni);
    $x = array();
    for ($i = 0; $i < count($m); $i++)
    {
        $m[$i] = str_replace("/imgres?imgurl\\x3d", "", $m[$i]);
        $m[$i] = str_replace(stristr($m[$i], "\\x26imgrefurl"), "", $m[$i]);
        $m[$i] = preg_replace("/^\"/i", "", $m[$i]);
        $m[$i] = preg_replace("/^,/i", "", $m[$i]);
        if ($m[$i] != "")
            array_push($x, $m[$i]);
    }
    return $x;
}
This function returns only 21 images; I want all images for the query. I am doing this in PHP.
Sadly the Image API is being closed down, so I won't suggest moving to that, but it would have been a nicer solution, I think.
My best guess is that image 22 and onwards is loaded with some AJAX/JavaScript (if you search for, say, "logo" and scroll down, you will see placeholders that get loaded as you move down), so you would need to run the page through a JavaScript engine, and I can't find anyone who has done that with PHP (yet).
Have you checked whether $web_page contains more than 21 images? (When I experiment with Google Image Search, it uses JavaScript to load some of the images.)
When you access the link from a normal browser, what happens? And what happens if you turn off JavaScript?
Is there perhaps a link to the next page in the result you get?
In the now-deprecated Image API there were ways to limit the number of results per page and ways to step to the next page: https://developers.google.com/image-search/v1/jsondevguide#json_snippets_php
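For the record, that deprecated API stepped through result pages with a `start` offset and an `rsz` page size (capped at 8). A rough Python sketch of how those paged request URLs were built follows; the endpoint itself is shut down, so this only constructs the URLs and does not fetch anything:

```python
from urllib.parse import urlencode

# Base URL of the old (now shut down) Google Image Search AJAX API
BASE = 'https://ajax.googleapis.com/ajax/services/search/images'

def paged_urls(query, pages=3, per_page=8):
    """Build the paged request URLs the old API expected ('rsz' was capped at 8)."""
    urls = []
    for page in range(pages):
        params = {'v': '1.0', 'q': query, 'rsz': per_page, 'start': page * per_page}
        urls.append(BASE + '?' + urlencode(params))
    return urls

for u in paged_urls('terminator 3'):
    print(u)
```

The same idea (a per-page size plus a start offset) is what the paginated HTML search results use as well, which is why hunting for a next-page link in the scraped page can work.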
If you wish to keep doing searches and fetching images from the search result, then http://simplehtmldom.sourceforge.net/ might be a nice alternative to look at.
It fetches an HTML DOM and lets you easily find nodes and work with them. But it still uses file_get_contents or the curl library to fetch the data, so it might need some fiddling to get JavaScript-loaded content.
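If you end up doing the "find the next-page link" hunt in Python instead, the idea can be sketched with the standard library's html.parser; the class name and sample markup below are made up for illustration:

```python
from html.parser import HTMLParser

class NextLinkFinder(HTMLParser):
    """Record the href of the first anchor whose text is 'Next' (illustrative)."""
    def __init__(self):
        super().__init__()
        self._in_anchor = False
        self._href = None
        self.next_href = None

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            self._in_anchor = True
            self._href = dict(attrs).get('href')

    def handle_data(self, data):
        if self._in_anchor and data.strip().lower() == 'next':
            self.next_href = self._href

    def handle_endtag(self, tag):
        if tag == 'a':
            self._in_anchor = False

# Made-up markup standing in for a search-result page
finder = NextLinkFinder()
finder.feed('<div><a href="page2.html">Next</a></div>')
print(finder.next_href)  # page2.html
```

A real result page will of course use different markup, so inspect the page first and adjust the tag/text test accordingly.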
I wrote a script to download images from Google Image Search, with which I currently download 100 original images.
The original script is in my Stack Overflow answer:
Python - Download Images from google Image search?
Below I explain in detail how I scrape the URLs of the original images from Google Image Search using urllib2 and BeautifulSoup.
For example, if you want to scrape images of the movie Terminator 3 from Google Image Search:
import urllib2
from bs4 import BeautifulSoup

query = "Terminator 3"
query = '+'.join(query.split())  # this will make the query terminator+3
url = "https://www.google.co.in/search?q=" + query + "&source=lnms&tbm=isch"
header = {'User-Agent': "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/43.0.2357.134 Safari/537.36"}
req = urllib2.Request(url, headers=header)
response = urllib2.urlopen(req)
soup = BeautifulSoup(response)
The variable soup above contains the HTML code of the requested page. Now we need to extract the images. To do that, open the web page in your browser and inspect-element on an image;
there you will find the tags containing the URL of the image.
For example, for Google Images I found "div", {"class": "rg_meta"} containing the link to the image.
You can look this up in the BeautifulSoup documentation:
print soup.find_all("div",{"class":"rg_meta"})
You will get a list of results like this:
<div class="rg_meta">{"cl":3,"cr":3,"ct":12,"id":"C0s-rtOZqcJOvM:","isu":"emuparadise.me","itg":false,"ity":"jpg","oh":540,"ou":"http://199.101.98.242/media/images/66433-Terminator_3_The_Redemption-1.jpg","ow":960,"pt":"Terminator 3 The Redemption ISO \\u0026lt; GCN ISOs | Emuparadise","rid":"VJSwsesuO1s1UM","ru":"http://www.emuparadise.me/Nintendo_Gamecube_ISOs/Terminator_3_The_Redemption/66433","s":"Screenshot Thumbnail / Media File 1 for Terminator 3 The Redemption","th":168,"tu":"https://encrypted-tbn2.gstatic.com/images?q\\u003dtbn:ANd9GcRs8dp-ojc4BmP1PONsXlvscfIl58k9hpu6aWlGV_WwJ33A26jaIw","tw":300}</div>
The result above contains the link to our image URL:
http://199.101.98.242/media/images/66433-Terminator_3_The_Redemption-1.jpg
You can extract these links and download the images as follows:
import json
import os

ActualImages = []  # will hold (link, type) pairs for the large original images
for a in soup.find_all("div", {"class": "rg_meta"}):
    meta = json.loads(a.text)
    link, Type = meta["ou"], meta["ity"]
    ActualImages.append((link, Type))

DIR = "images"             # output directory (choose your own)
image_type = "terminator"  # filename prefix (choose your own)

for i, (img, Type) in enumerate(ActualImages):
    try:
        req = urllib2.Request(img, headers=header)  # reuse the header dict from above
        raw_img = urllib2.urlopen(req).read()
        if not os.path.exists(DIR):
            os.mkdir(DIR)
        cntr = len([fname for fname in os.listdir(DIR) if image_type in fname]) + 1
        print cntr
        if len(Type) == 0:
            f = open(os.path.join(DIR, image_type + "_" + str(cntr) + ".jpg"), 'wb')
        else:
            f = open(os.path.join(DIR, image_type + "_" + str(cntr) + "." + Type), 'wb')
        f.write(raw_img)
        f.close()
    except Exception as e:
        print "could not load : " + img
        print e
Voilà, now you can use this script to download images from Google search, or for collecting training images.
The fully working script is available here:
https://gist.github.com/rishabhsixfeet/8ff479de9d19549d5c2d8bfc14af9b88
$page = isset($input['page']) ? $input['page'] : 0;
$perPageRecord = 10;
$calls = $this->twilio->calls->page(["to" => "+919876543210"], $perPageRecord, '', $page);
$data = [];
echo $calls->getNextPageUrl;
exit;
I am using the above code to get the next page URL, and it prints successfully. But I want to print the last page URL with the Twilio PHP library.
Can anyone tell me how I can get the last page URL using Twilio PHP?
Thanks
It looks like you will need to programmatically extract a returned range and manipulate the resulting data to get the X most recent results (last page).
Replacing Absolute Paging and Related Properties
Usage and Migration Guide for Twilio's PHP Helper Library 5.x
I am trying to understand how this web site works. There is an input form where you can provide a URL; the form returns information retrieved from another site (YouTube). So:
My first and most interesting question is whether anybody has any idea how this site retrieves the entire corpus of comments?
Alternatively, for now I am using the following code:
from BeautifulSoup import BeautifulSoup
import json
import urllib2

urlstr = 'http://www.sandracires.com/en/client/youtube/comments.php?v=' + videoId + '&page=' + str(npage)
url = urllib2.urlopen(urlstr)
content = url.read()
soup = BeautifulSoup(content)
#parse json
newDictionary=json.loads(str(soup))
#print example
print newDictionary['list'][1]['username']
However, I cannot iterate over all pages (which does work when I do it manually). I have placed time.sleep(30) below the json call, but without success. Why is that happening?
Thanks!
Python 2.7.8
Probably it is using the Google YouTube Data API. Note that (presently) comments can only be retrieved using version 2 of the API, which has been deprecated; apparently there is no support yet in v3. Python client libraries are available, see https://developers.google.com/youtube/code#Python.
The response is already JSON, so there is no need for BeautifulSoup. The web server seems to require cookies, so I recommend the requests module, in particular its session management:
import requests

videoId = 'ZSzeFFsKEt4'
results = []
npage = 1
session = requests.session()
while True:
    urlstr = 'http://www.sandracires.com/en/client/youtube/comments.php'
    print "Getting page ", npage
    response = session.get(urlstr, params={'v': videoId, 'page': npage})
    content = response.json()
    if len(content['list']) > 1:
        results.append(content)
    else:
        break
    npage += 1
print results
Recently YouTube changed the way direct video download links work (the ones found in url_encoded_fmt_stream_map): there is a signature now, and the links don't work unless the right signature is presented.
The signature is there as a 'sig' argument, so you can easily take it and construct the link, and it will work. However, ever since this signature appeared, the link has also been locked to the user's browser somehow,
meaning that if I probe "http://youtube.com/get_video_info" on the server side, construct the links with the signature, and then print one as a link, a black page opens when the user clicks it. However, if I try to download the video on the server side, it works.
This means that the link is somehow locked to the client that opened "http://youtube.com/get_video_info".
The problem with this situation is that in order to stream the videos you first have to download them to your server.
Does anyone know how the links are locked to a specific user, and is there a way around it?
The idea is, for example, to get the link on the server side and then feed it to some Flash player, instead of using the chromeless player.
Here is a code example in PHP:
<?php
$video_id = $_GET['id']; // YouTube video id

// getting the video info
$content = file_get_contents("http://youtube.com/get_video_info?video_id=".$video_id);
parse_str($content, $ytarr);

// getting the links
$links = explode(',', $ytarr['url_encoded_fmt_stream_map']);

// formats you would like to use
$formats = array(35, 34, 6, 5);

// loop through the links to find the one you need
foreach ($links as $link) {
    parse_str($link, $args);
    if (in_array($args['itag'], $formats)) {
        // right link found; since the links are in hi-to-low quality order,
        // the match will be the one with the highest quality
        $video_url = $args['url'];

        // add signature to the link
        if ($args['sig']) {
            $video_url .= '&signature='.$args['sig'];
        }

        /*
         * What follows is three ways of proceeding with the link;
         * note they are not supposed to all work together, but one at a time
         */

        // download the video and output to browser
        #readfile($video_url); // this works fine
        exit;

        // show video as link
        echo '<a href="'.$video_url.'">link for '.$args['itag'].'</a>'; // this won't work
        exit;

        // redirect to video
        header("Location: $video_url"); // this won't work
        exit;
    }
}
?>
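For comparison, the same parse-and-filter idea can be sketched in Python with urllib.parse. The stream_map string below is a made-up sample in the shape of url_encoded_fmt_stream_map (comma-separated entries, each entry a query string of itag/url/sig fields), not real YouTube data:

```python
from urllib.parse import parse_qs

def pick_stream(stream_map, preferred):
    """Return the URL (with signature appended) of the first stream whose itag is preferred."""
    for entry in stream_map.split(','):
        # parse_qs maps each key to a list; take the first value
        args = {k: v[0] for k, v in parse_qs(entry).items()}
        if int(args['itag']) in preferred:
            url = args['url']
            if 'sig' in args:
                url += '&signature=' + args['sig']
            return url
    return None

# Made-up sample shaped like url_encoded_fmt_stream_map (not real YouTube data)
stream_map = ('itag=35&url=http%3A%2F%2Fexample.com%2Fv35&sig=AAA,'
              'itag=5&url=http%3A%2F%2Fexample.com%2Fv5&sig=BBB')

print(pick_stream(stream_map, [35, 34, 6, 5]))  # http://example.com/v35&signature=AAA
```

As in the PHP version, the entries arrive in high-to-low quality order, so the first itag match is the best available format.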
I am making Flash banners for multiple clients, and one major request is to have some sort of counter so they know how many times the banner has been clicked.
I know how to do it in ActionScript 3.0: I make a simple var:int and increase it by 1 when a click is made on the banner. But what do I do with the value of this var (say it's 121)? Where do I store it online so it's safe and can be changed by multiple Flash banners (AS3)?
And how do I save this information so that next time the banner is loaded (on different web pages) the number of clicks is whatever it was last time?
Should I look into PHP for that? I have no clue how to do this... some examples, tutorials, whatever works, would be much appreciated. (I am a designer, not a programmer... please don't speak PHP-ish. :D)
I've googled a bit and found some help, but I am still confused, and much of it is not AS3. I'm thinking maybe things have evolved a bit since the material I found (from 2008)...
Thank you very much.
You'd have to store (and fetch) the value somewhere: either in a DB, in a text file, ...
I'd search for a tutorial on PHP+MySQL. If you don't like PHP-ish, you're probably better off finding another solution, though. :p
Example tutorial: http://www.freewebmasterhelp.com/tutorials/phpmysql
The data you want to be retrievable and update-able from multiple clients needs to be stored on a server.
You can use any server-side language with a database.
Server languages: PHP, ASP.NET, JSP, ColdFusion
Databases: MySQL, MSSQL, PostgreSQL, Oracle, DB2, etc.
Use whatever combination you are comfortable with.
In general:
you have a web page that increments the counter in the database;
you call that page using URLLoader from your AS3 banner.
Database
counter_table
-------------
counter INT
PHP File
$db = mysql_connect('localhost', 'mysql_user', 'mysql_password');
mysql_select_db('database_name');
mysql_query('UPDATE counter_table SET counter = counter + 1');
AS3 Banner
// url request with your php page address
var scriptRequest:URLRequest = new URLRequest("http://www.example.com/script.php");
// loader
var scriptLoader:URLLoader = new URLLoader();
// load page to trigger database update
scriptLoader.load(scriptRequest);
Do you also want to retrieve the value of the number of clicks in Banner ?
An easy solution (really not the best :) so you should use one of the other answers. Anyway, make a PHP file that reads a text file containing the count of visits, and in your Flash banner just call that PHP file. It will add one hit per call.
PHP:
<?php
/**
* Create an empty text file called counterlog.txt and
* upload to the same directory as the page you want to
* count hits for.
*
*
* @Flavius Frantz: you don't need this part (it is from the original tutorial):
* Add this line of code on your page:
* <?php include "text_file_hit_counter.php"; ?>
*/
// Open the file for reading
$fp = fopen("counterlog.txt", "r");
// Get the existing count
$count = fread($fp, 1024);
// Close the file
fclose($fp);
// Add 1 to the existing count
$count = $count + 1;
// Display the number of hits
// If you don't want to display it, comment out this line
//echo "<p>Page views:" . $count . "</p>";
// Reopen the file and erase the contents
$fp = fopen("counterlog.txt", "w");
// Write the new count to the file
fwrite($fp, $count);
// Close the file
fclose($fp);
?>
Example code from: (google: php counter file) http://www.totallyphp.co.uk/text-file-hit-counter
The code is not tested, but it looks OK. I added only a few comments.
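The read-increment-write pattern of the snippet above can be sketched in Python as well; the filename is illustrative, and, like the PHP version, this is not safe under concurrent hits, so a database increment is still the better choice:

```python
import os

def bump_counter(path):
    """Read the current count from a text file, add 1, write it back, return it."""
    count = 0
    if os.path.exists(path):
        with open(path) as f:
            text = f.read().strip()
            count = int(text) if text else 0
    count += 1
    with open(path, 'w') as f:
        f.write(str(count))
    return count

print(bump_counter('counterlog.txt'))  # 1 on the first call, then 2, 3, ...
```

If two banners hit this at the same moment, both can read the same old value and one click gets lost; that race condition is why the MySQL `counter = counter + 1` approach in the other answer is more robust.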
This has been bugging me for quite a few hours. I've searched a lot and found a lot of information. The problem is, I'm not that good; I'm an absolute beginner. I'd like to achieve this with Python (if possible!), or maybe with JavaScript or PHP. Let me explain.
I just found this website http://listeningroom.net and it's great. You can create/join rooms and upload tracks and listen to them together with friends.
I'd like to extract/scrape/get some specific data from a .json file.
This file contains artist, album title, track title and more. I'd like to extract just the artist, album and track title.
The .json file at http://listeningroom.net/room/chillasfuck/spins.json contains the tracks played in the past 24 hours.
After looking around, I managed to print a whole local copy of the .json file with Python, using the following (probably not so valid) code:
import json
from pprint import pprint

json_data = open('...\spins.json')
data = json.load(json_data)
pprint(data)
json_data.close()
This prints out the following:
[{u'endTime': u'1317752614105',
u'id': u'cf37894e8eaf886a0d000000',
u'length': 492330,
u'metadata': {u'album': u'Mezzanine',
u'artist': u'Massive Attack',
u'bitrate': 128000,
u'label': u'Virgin',
u'length': 17494.479054779807,
u'title': u'Group Four'},
(Just a part of the output.)
1. I'd like to fetch it from a URL (the one provided at the top)
2. Just get 'album', 'artist' and 'title'
3. Make sure it prints as simply as possible, like this:
Artist
Track title
Album
Artist
Track title
Album
4. If it's not too much, save it to a .txt file
I hope I can get some help; I really want to build this for myself so I can check out more music!
Marvin
Python (after you loaded the json)
for elem in data:
    print('{artist}\n{title}\n{album}\n'.format(**elem['metadata']))
To save in a file:
with open('the_file_name.txt', 'w') as f:
    for elem in data:
        f.write('{artist}\n{title}\n{album}\n\n'.format(**elem['metadata']))
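The answers here assume the JSON is already loaded; to cover point 1 (fetching from the URL), here is a Python 3 sketch. The sample string below is synthetic, shaped like one spins.json entry, and the live URL may no longer serve data:

```python
import json
from urllib.request import urlopen

def fetch_spins(url):
    # The response is plain JSON, so no HTML parsing (BeautifulSoup) is needed
    with urlopen(url) as resp:
        return json.loads(resp.read().decode('utf-8'))

# Tiny synthetic sample in the same shape as spins.json, for illustration:
sample = ('[{"metadata": {"artist": "Massive Attack", '
          '"album": "Mezzanine", "title": "Group Four"}}]')
data = json.loads(sample)
for elem in data:
    print('{artist}\n{title}\n{album}\n'.format(**elem['metadata']))

# Against the live feed (if the site still serves it):
# data = fetch_spins('http://listeningroom.net/room/chillasfuck/spins.json')
```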
You're already really close.
data = json.load(json_data)
is taking the JSON string and converting it to a Python object - in this case, a list of dictionaries (plus 'metadata', which is a dictionary of dictionaries).
To get this into the format that you want, you just need to loop through the items.
for song in data:
    artist = song['metadata']['artist']  # look in the dictionary item called 'metadata', then 'artist' inside it
    album = song['metadata']['album']
    songTitle = song['metadata']['title']
    print '%s\n%s\n%s\n' % (artist, album, songTitle)
Or, to write it to a file:
with open('the_file_name.txt', 'w') as f:
    for song in data:
        artist = song['metadata']['artist']
        album = song['metadata']['album']
        songTitle = song['metadata']['title']
        f.write('%s\n%s\n%s\n' % (artist, album, songTitle))
OK, this is a bit short, but the thing about JSON is that it translates an array into a string.
E.g.
array['first'] = 'hello';
array['second'] = 'there';
will become
{"first": "hello", "second": "there"}
after a JSON encode.
Run that string through a JSON decode and you get your array back.
So simply run your JSON file through a decoder, and then you should be able to reach your data through:
array['metadata'].album
array['metadata'].artist
...
I have never used Python, but it should be the same idea.
Have a look at http://www.php.net/manual/en/function.json-decode.php; it might clear up a thing or two.
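In Python the round trip the answer describes looks like this, using the standard json module:

```python
import json

# A dict with the same fields as the example above
record = {'first': 'hello', 'second': 'there'}

encoded = json.dumps(record)    # dict -> JSON string
decoded = json.loads(encoded)   # JSON string -> dict again

print(encoded)           # {"first": "hello", "second": "there"}
print(decoded['first'])  # hello
```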
For PHP you need json_decode:
<?php
$json = file_get_contents($url);
$val = json_decode($json);
$room = $val[0]->metadata;
echo "Album : ".$room->album."\n";
echo "Artist : ".$room->artist."\n";
echo "Title : ".$room->title."\n";
?>
Outputs
Album : Future Sandwich
Artist : Them, Roaringtwenties
Title : Fast Acting Nite-Nite Spray With Realistic Uncle Beard
Note there is a truckload of JSON data there, so you'll have to iterate appropriately.