I have a Google Custom Search embedded in an HTML page, like this:
http://www.******.com/search.htm?cx=partner-pub--00000000000-c77&cof=FORID%3A10&ie=ISO-8ds3-1&q=software&sa=Search&siteurl=www.******.com%2#1342
When I open the same URL in a browser I get results, but when I request it with Simple HTML DOM Parser it returns a blank page.
How can I fetch Google Custom Search results, with my Google partner ID, via Simple HTML DOM Parser so that I can get analytics for the searches performed?
You can't. They have safeguards against that, and it is against their terms of use.
Excerpt from the Web Search API Terms of Service:
[...]By way of example, and not as a limitation, You agree that when using the Service, You will not, and will not permit users or other third parties to:
[...] use any robot, spider, site search/retrieval application, or other device to retrieve or index any portion of Google Search Results or to collect information about users for any unauthorized purpose;
I do not know about Google Custom Search, but with the normal search I got all the results by simply applying the regex
url[?]q=([^&]+)&
to all hrefs.
Edit: take the match in the parentheses to get the URL, of course.
(I did not notice that this was an old question that had been edited (for what?), but perhaps it is still useful for someone.)
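As a rough illustration of the regex approach just described, here is a minimal Python sketch. The sample hrefs are made up for the example, and decoding the captured value with unquote is an assumption (the answer only mentions taking the parenthesised match).

```python
import re
from urllib.parse import unquote

# The pattern quoted in the answer: captures the q= value of a "url?q=...&" link.
REDIRECT_RE = re.compile(r"url[?]q=([^&]+)&")


def extract_targets(hrefs):
    """Return the target URL captured from every href that matches the pattern."""
    targets = []
    for href in hrefs:
        match = REDIRECT_RE.search(href)
        if match:
            # Assumption: the captured value may be percent-encoded, so decode it.
            targets.append(unquote(match.group(1)))
    return targets


# Hypothetical hrefs, shaped like the links the answer describes.
sample_hrefs = [
    "/url?q=http://example.com/page&sa=U&ved=abc",
    "/search?q=software&start=10",
    "/url?q=https://example.org/a%3Fb%3D1&sa=U",
]
print(extract_targets(sample_hrefs))
```

Hrefs that do not contain the url?q= part (such as the second sample) are simply skipped.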
I am trying to retrieve a list of news links from a keyword search on a news website using Python. For Google search, I know some people use requests, but while the Google search page has its own URL for each query (i.e. https://www.google.dz/search?q=keyword), some websites do not pass the keyword through the URL.
First, for example, at http://english.hani.co.kr/ , users are led to the search result page http://search.hani.co.kr/Search with a list of links regardless of which keyword they type (Korea Times is another example). In that case, is it still possible to use a Python library to extract those links?
Second, in the two cases above and many others (like this), the search results are spread over as many as hundreds of pages. What tools and techniques should I turn to in order to produce a comprehensive list of news links?
There are two basic tasks involved in scraping web sites:
Load a web page into a string.
Parse the HTML of the web page to locate the interesting bits.
You can see more details on how to do this here.
Some search engines use GET for a search and others use POST. For those that use POST, the only way is to submit the search yourself (the keyword is not in the URL) and then fetch the resulting HTML for analysis.
Either way (GET or POST), you can use BeautifulSoup to parse the result.
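The two tasks and the GET/POST difference can be sketched as follows. This is only a sketch under stated assumptions: the form field name "keyword" is hypothetical (the real name has to be read from the site's search form), the request is built but never sent, and the standard library's html.parser stands in for BeautifulSoup so the example has no third-party dependencies.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode
from urllib.request import Request


def build_search_request(url, fields, method="GET"):
    """Return an unsent Request that carries the form fields via GET or POST."""
    encoded = urlencode(fields)
    if method.upper() == "POST":
        # POST: the fields travel in the request body, not in the URL.
        return Request(url, data=encoded.encode("utf-8"), method="POST")
    # GET: the fields are appended to the URL as a query string.
    return Request(f"{url}?{encoded}", method="GET")


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html):
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


# "keyword" is a hypothetical field name, used here only for illustration.
request = build_search_request(
    "http://search.hani.co.kr/Search", {"keyword": "economy"}, method="POST"
)

# Hand-written stand-in for a downloaded result page.
sample_html = (
    '<ul><li><a href="/news/1.html">One</a></li>'
    '<li><a href="/news/2.html">Two</a></li></ul>'
)
print(extract_links(sample_html))
```

For results spread over many pages, the same request would be repeated with whatever page or offset field the site's form uses, collecting the links from each response until a page returns none.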
I just started using the eBay API, but couldn't find a way to do what I need, even though it should be really simple.
I want to know if it is possible to pass any eBay URL to the API (PHP) and get back a new link with my affiliate parameters added.
It should work the same way as the Link Generator in the Partner Network section on eBay.
For example:
www.ebay.de/some-ebay-url/
Should be turned into something like:
http://rover.ebay.com/rover/x/xxx-xxxxx-xxxxx-x/x?xxx=4&pub=[my_publisher_id]&toolid=10001&campid=[my_campaign_id]&customid=&mpre=http%3A%2F%2Fwww.ebay.de%2Fsome-ebay-url
Or is there an easy way to just add my affiliate ID to an ordinary link?
Thanks
There isn't an API call at eBay to do what you want, in part because you can do it yourself pretty easily. You kinda already did it yourself with your example - yes, it's that easy.
You basically use the Link Generator just once (or a few times, to get acquainted with things) to obtain the template of a rover link and select the Custom URL option, then you use that format with any eBay URL you like, making any required tweaks. Take the eBay URL you want, URL encode it, then insert it as the &mpre= parameter at the end of the affiliate/rover link. That's the gist of it.
Know that &pub= is optional and doesn't affect the tracking of your link. &campid= is the necessary tracking parameter.
The long numerals in the front of the rover link (Rotation/Placement ID) define what ePN program to attribute tracking to. You might need to parse your input URL to determine the eBay domain (eBay.com? eBay.de?) to decide which Rotation/Placement ID to use, although they may/should work for cross-site tracking if you desire.
I've documented all the Rotation IDs here on my site. Click the page title up top for the main site with more eBay technical info I've discovered over the years.
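The recipe described above (take the rover template, URL-encode the eBay URL, append it as &mpre=) could look roughly like this in Python. The template string reuses the placeholder from the question, with its elided Rotation/Placement ID left as-is. The function names, and the choice to prepend http:// when the input has no scheme, are assumptions made for the sketch.

```python
from urllib.parse import quote, urlsplit


def _with_scheme(target_url):
    # Assumption: inputs like "www.ebay.de/..." are meant as http:// URLs.
    return target_url if "://" in target_url else "http://" + target_url


def ebay_domain(target_url):
    """Return the host of the eBay URL, e.g. to pick a Rotation/Placement ID."""
    return urlsplit(_with_scheme(target_url)).hostname


def build_affiliate_link(rover_template, campaign_id, target_url, custom_id=""):
    """Append campid, customid and the URL-encoded target (mpre) to the template."""
    encoded_target = quote(_with_scheme(target_url), safe="")
    return (
        f"{rover_template}&campid={quote(str(campaign_id), safe='')}"
        f"&customid={quote(custom_id, safe='')}&mpre={encoded_target}"
    )


# Placeholder template as in the question; copy the real one from the Link Generator.
ROVER_TEMPLATE = "http://rover.ebay.com/rover/x/xxx-xxxxx-xxxxx-x/x?xxx=4&toolid=10001"

print(ebay_domain("www.ebay.de/some-ebay-url/"))
print(build_affiliate_link(ROVER_TEMPLATE, "my_campaign_id", "www.ebay.de/some-ebay-url"))
```

The optional &pub= parameter is left out here, since the answer notes it does not affect tracking.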
I need to figure out how to (if it is possible) populate an HTML/PHP page with the following information:
I have the URL of a page and a set of keywords. Every week, I would like to check what position that URL holds in Google's search results when a search is performed for the set of keywords associated with it.
Say it is on the second page of Google; it would then have a position of 18, etc. (counting from the first result on the first page).
I then have an HTML/PHP page with a table that has one column with URLs and another column with the keywords associated with those URLs. Then there should be two more columns containing the position in Google's search results and the date when that position was checked (so these two columns should be populated by the script that checks the position).
I'm going to be honest: I have no idea how to achieve this, nor do I know whether it is possible. Please suggest ideas, code snippets, or maybe services that do this kind of thing.
To scrape Google's result pages, have a look here.
But note that Google's former SOAP API no longer exists. So I wonder whether it is legal to scrape Google's pages. See this Google blog page and Google's Terms of Use.
Google writes this:
Automated searching is strictly prohibited, as is permanently storing any search results. Please refer to the Terms of Use for more detail.
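Leaving aside where the result list comes from (it would have to be a source whose terms permit this), the position bookkeeping from the question is simple arithmetic. The sketch below assumes 10 results per page, which matches the question's example of position 18 on the second page. The function names and the row layout are hypothetical.

```python
from datetime import date


def result_position(page_number, index_on_page, results_per_page=10):
    """Overall 1-based position of a result, counted from the first result on page 1."""
    return (page_number - 1) * results_per_page + index_on_page


def find_position(result_urls, target_url):
    """1-based position of target_url in an ordered list of result URLs, or None."""
    for position, url in enumerate(result_urls, start=1):
        if url == target_url:
            return position
    return None


def check_row(url, keywords, result_urls, checked_on=None):
    """Build one table row: URL, keywords, position, and the date of the check."""
    checked_on = checked_on or date.today()
    return {
        "url": url,
        "keywords": keywords,
        "position": find_position(result_urls, url),
        "checked": checked_on.isoformat(),
    }


print(result_position(2, 8))  # eighth result on the second page
```

A weekly job would call check_row once per table row and store the returned position and date in the last two columns.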
I am building a restaurant review site using PHP. I want to know how to show reviews from other review sites. For example, check this link to see how Google picks up reviews from other sites. When you click on one, it takes you to the originating review site.
Any help will be appreciated. Thanks
First, make a list of the sites that you want to pull reviews from. Second, read through those sites and see if they have a developer section and if they expose a public API. If they do, look around to see if they have any client libraries for php which you can use to access their API from your php site. If they do have an API but there are no client libraries available, contribute to the community by creating a client library and sharing it as open source. :)
Also, they may have an RSS feed of their reviews that you can easily consume on your site, so check that out too.
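If a site does offer an RSS feed of its reviews, consuming it can be as small as the Python sketch below. It assumes a plain RSS 2.0 layout (item elements with title, link and description children). The sample feed is hand-written for the example, and a real feed would first have to be downloaded.

```python
import xml.etree.ElementTree as ET


def parse_rss_items(rss_text):
    """Return title, link and description for every <item> in an RSS 2.0 feed."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append(
            {
                "title": item.findtext("title", default=""),
                "link": item.findtext("link", default=""),
                "description": item.findtext("description", default=""),
            }
        )
    return items


# Hand-written sample feed; the site and reviews are made up.
sample_feed = """<rss version="2.0">
  <channel>
    <title>Example reviews</title>
    <item>
      <title>Great pasta</title>
      <link>http://reviews.example.com/1</link>
      <description>Would eat here again.</description>
    </item>
    <item>
      <title>Slow service</title>
      <link>http://reviews.example.com/2</link>
    </item>
  </channel>
</rss>"""

for review in parse_rss_items(sample_feed):
    print(review["title"], review["link"])
```

Keeping the link from each item lets the page send visitors to the originating review site, as in the Google example from the question.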
You will probably find that the other sites have partnered with Google for this; however, I would use cURL to get the information you want.
I would suggest you start here with cURL, then have a look here to extract the portion you're looking for.
Edit: the relevant section of the second link is
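// $page holds the HTML fetched with cURL. Non-greedy match with the s flag,
// so each <div>...</div> block is captured separately, even across line breaks.
preg_match_all("/<div>.+?<\/div>/s", $page, $matches);
print_r($matches);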
This extracts the content you're looking for and then displays it. You will probably need to identify the unique elements that wrap the content you want, however, which could mean a separate rule for each website.
I hope this helps.
I think Google finds news, blogs, reviews and the like through the sitemap (sitemap.xml) that web administrators submit to Google.
To do this yourself, you must fetch the content of the page whose reviews you want (with cURL or something similar for retrieving remote files) and extract the reviews from the HTML with regular expressions.
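The idea of a separate extraction rule per website can be sketched like this in Python. The host names, class names and patterns are all hypothetical. Real rules have to be written after inspecting each site's markup, and only for sites whose terms allow this kind of access.

```python
import re

# Hypothetical per-site rules: one pattern per website, each capturing a review body.
SITE_RULES = {
    "reviews.example.com": re.compile(r'<div class="review">(.+?)</div>', re.DOTALL),
    "dining.example.org": re.compile(r'<p class="comment-body">(.+?)</p>', re.DOTALL),
}


def extract_reviews(site, page):
    """Apply the rule registered for the site to the already-fetched page HTML."""
    pattern = SITE_RULES[site]
    return [text.strip() for text in pattern.findall(page)]


# Hand-written stand-in for a fetched page.
sample_page = """
<div class="review">
  Lovely terrace, friendly staff.
</div>
<div class="ad">Buy now</div>
<div class="review">Overpriced.</div>
"""
print(extract_reviews("reviews.example.com", sample_page))
```

Regular expressions are brittle against markup changes, so each rule will need maintenance whenever a site changes its layout.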
I am developing a search solution for my photo community web application. I am making use of Google Site Search. There are various ways to make use of it, but because I want a seamless fit with custom search result rendering, I went for the XML option.
It works really simply. I have a custom-styled search box on the site which posts to my back-end, a CodeIgniter PHP controller. The controller then does a GET to the Google Site Search XML service, which returns the search results as XML.
It works brilliantly and gives me full control over output rendering. There is just one little thing missing. If I search for a misspelled word, let's say "crocodilw" (should be "crocodile"), I would like to get the "did you mean 'crocodile'?" functionality that is so common in Google.
This feature does work when you use the front-end integration method of Google Site Search. I kind of expected the correct search suggestion to be part of the return XML as well, but I can't seem to find it.
Any clues on how/if this is possible using the XML method?
The suggestion is returned in the Spelling tag of the result XML, but only on the first page (start=0) of the results. On the second page, it's gone.
http://www.google.com/cse/docs/resultsxml.html#results_xml_tag_Spelling
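A minimal Python sketch of reading the suggestion from the returned XML follows. It assumes the response contains a Spelling element with a Suggestion child whose q attribute holds the corrected query, which is how I understand the linked reference. The sample document is hand-written, not a real response.

```python
import xml.etree.ElementTree as ET


def spelling_suggestion(results_xml):
    """Return the suggested query from the Spelling tag, or None if it is absent."""
    root = ET.fromstring(results_xml)
    # Assumed structure: <Spelling><Suggestion q="...">...</Suggestion></Spelling>
    suggestion = root.find(".//Spelling/Suggestion")
    if suggestion is None:
        return None
    return suggestion.get("q")


# Hand-written sample shaped like a first-page (start=0) response.
sample_results = """<GSP VER="3.2">
  <Q>crocodilw</Q>
  <Spelling>
    <Suggestion q="crocodile">&lt;b&gt;&lt;i&gt;crocodile&lt;/i&gt;&lt;/b&gt;</Suggestion>
  </Spelling>
</GSP>"""

print(spelling_suggestion(sample_results))
```

Since the tag only appears on the first page, the controller would need to carry the suggestion forward itself if it should also be shown on later result pages.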