Box.com API 2 - search from website & present results from folder - php

Quick question... (for a PHP intranet I am creating).
Is it possible to search documents in a Box.com account using their v2 API?
Ideally I'd like to:
Search Box.com from within the intranet (a port is open to the outside world).
Present the results from a Box.com folder, as document titles that are links to download the document.
Any hints, or even just a link or two that confirms this is possible, would be really appreciated, as I have been hearing conflicting answers about whether this is or isn't possible.
Thanks ;-)

Yes, it is possible to search a Box.com account through the API. See the documentation at http://developers.box.com/docs/#search
However, I don't think you can restrict the search to a single folder at present: you have to be logged in to Box to do the search, and the only documented options are the search term and some control over how many results are returned. The documentation does say filters will be added, but if you can filter results right now, there's nothing in the documentation about it.
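To give a feel for it, here is a minimal PHP/cURL sketch against the v2 search endpoint, assuming you already hold a valid OAuth 2.0 access token (YOUR_ACCESS_TOKEN is a placeholder):

<?php
// Minimal sketch: search a Box account via the v2 API and print each
// file result as a download link. Assumes a valid OAuth 2.0 access
// token; YOUR_ACCESS_TOKEN is a placeholder.
$token = 'YOUR_ACCESS_TOKEN';
$query = urlencode('quarterly report');

$ch = curl_init("https://api.box.com/2.0/search?query={$query}&limit=20");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, array("Authorization: Bearer {$token}"));
$response = curl_exec($ch);
curl_close($ch);

$results = json_decode($response, true);
foreach ($results['entries'] as $entry) {
    if ($entry['type'] === 'file') {
        // v2 exposes file content at /files/{id}/content
        $download = "https://api.box.com/2.0/files/{$entry['id']}/content";
        echo '<a href="' . htmlspecialchars($download) . '">'
           . htmlspecialchars($entry['name']) . "</a><br>\n";
    }
}

Note that the download endpoint also expects the Authorization header, so for the intranet use case you would normally route the download through a small PHP proxy script rather than linking to api.box.com directly.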

Related

How to get a JSON response from a Google Image Search?

All I want to do is a simple Google Images search. We were using the old, incredibly simple approach via the now completely deprecated JSON Image Search API.
That page now says it is included in Google Custom Search. The problem is that I don't want to search my own website; I want to search the web for images. I cannot, for the life of me, find what the new way to do this is.
I have tried the Google API for PHP, but it also works only with Custom Search, which is only for searches on my website and not the entire web.
I have tried it using https://www.googleapis.com/customsearch/v1?key=MYKEY&cx=MYCX&q=candy and it's still searching only our website, not the web in general.
Can someone point me to a documentation page that describes how to use an API to do an image search across the web in general and get the results in JSON?
You can indeed use the Google API for PHP. You just need to configure your custom search engine to search the entire web. To do this, log in to your CSE settings at https://cse.google.com/cse/ and remove all sites.
Then, change 'Sites to search' to 'Search the entire web but emphasize included sites'.
Since you don't have any sites specified, it will just search the whole web.
You can test using the box on the right.
As for documentation, the CSE API is documented here https://developers.google.com/custom-search/json-api/v1/reference/cse/list#parameters
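A minimal PHP sketch of such a query (MYKEY and MYCX stand in for your own API key and search engine ID, as in the question; searchType=image restricts the results to images):

<?php
// Sketch: query the Custom Search JSON API for images across the whole
// web. Assumes the CSE has been set to search the entire web as
// described above; MYKEY and MYCX are placeholders.
$key = 'MYKEY';
$cx  = 'MYCX';
$q   = urlencode('candy');

$url = "https://www.googleapis.com/customsearch/v1"
     . "?key={$key}&cx={$cx}&q={$q}&searchType=image";

$data = json_decode(file_get_contents($url), true);
foreach ($data['items'] as $item) {
    echo $item['link'] . "\n"; // direct URL of each image result
}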

Python Scraping links from search results

I am trying to generate/retrieve a list of news links from a keyword search on a news website using Python. For Google search, I know some use requests, but while the Google search page has its own link address (i.e. https://www.google.dz/search?q=keyword), some websites do not pass the keyword through the web address.
First - for example, on http://english.hani.co.kr/, users are led to a search results page, http://search.hani.co.kr/Search, with a list of links regardless of which keyword they type (Korea Times is another example). In this case, is it still possible to use a Python library to extract those links?
Second - in the earlier two and many other cases (like this), the search results are spread across as many as hundreds of pages. What tools and techniques should I turn to in order to produce a comprehensive list of news links?
There are two basic tasks that are used to scrape web sites:
Load a web page to a string.
Parse HTML from a web page to locate the interesting bits.
You can see more details on how to do this here.
Some search engines use GET to perform a search, while others use POST. For those that use POST, the only way is to perform the search itself (not via the URL) and then retrieve the HTML results for analysis.
Either way (GET or POST), you can use BeautifulSoup, as sketched below.
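To make the POST case concrete, here is a rough PHP/cURL sketch; the endpoint and the 'searchword' field name are assumptions, so inspect the site's actual form markup for the real values. The same flow maps directly onto Python's requests and BeautifulSoup.

<?php
// Sketch of scraping a POST-based search: submit the form data the way
// the browser would, then parse result links out of the returned HTML.
// The URL and 'searchword' field name are assumptions.
$ch = curl_init('http://search.hani.co.kr/Search');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query(array('searchword' => 'Korea')));
$html = curl_exec($ch);
curl_close($ch);

$dom = new DOMDocument();
@$dom->loadHTML($html); // suppress warnings from real-world messy HTML
foreach ($dom->getElementsByTagName('a') as $a) {
    echo $a->getAttribute('href') . "\n"; // filter down to article URLs as needed
}

For result sets that run to hundreds of pages, repeat the request while incrementing whatever page parameter the site uses and stop once a page yields no new links.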

How to index custom url based on terms searched on Google

Sometimes I see links on Google that include my search terms as a parameter. For example, if I search "StrangeWord", I can see in the results:
example.com/p=StrangeWord
I'm pretty sure it is generated automatically. How is it done? I'm using PHP with Nginx.
It isn't generated automatically. If that page is in the index, it's because there was a crawlable link to that page - whether intentionally done by the webmaster or not - and Google happened to crawl that link.
Links can get generated by users sharing such a page, bookmarking it, or even linking to it from their own sites or social profiles.

Get restaurant reviews from other review sites

I am building a restaurant review site using PHP. I wanted to know how to show reviews from other review sites. For example, check this link to see how Google picks up reviews from other sites; when clicked, a result takes you to the review site it came from.
Any help will be appreciated. Thanks
First, make a list of the sites that you want to pull reviews from. Second, read through those sites and see if they have a developer section and if they expose a public API. If they do, look around to see if they have any client libraries for php which you can use to access their API from your php site. If they do have an API but there are no client libraries available, contribute to the community by creating a client library and sharing it as open source. :)
Also, they may have an RSS feed of their reviews that you can consume easily on your site, so check that out too.
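For instance, if one of the sites exposed a feed, consuming it could be as simple as this sketch (the feed URL is hypothetical):

<?php
// Sketch: read a review site's RSS feed with SimpleXML.
// The feed URL below is hypothetical; substitute the real one.
$feed = simplexml_load_file('http://reviews.example.com/feed/rss.xml');
foreach ($feed->channel->item as $item) {
    echo '<h3>' . htmlspecialchars((string) $item->title) . '</h3>';
    echo '<p>' . htmlspecialchars((string) $item->description) . '</p>';
    echo '<a href="' . htmlspecialchars((string) $item->link) . '">Full review</a>';
}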
You will probably find the other sites have partnered with Google for this; however, I would use cURL to get the information you want.
I would suggest you start here with cURL, then have a look here to extract the portion you're looking for.
Edit: the section of the second link that is relevant is
preg_match_all("/<div>.+?<\/div>/s", $page, $matches); // non-greedy match; /s lets the dot cross newlines
print_r($matches);
What this does is grab the content you're looking for and then display it. You will probably need to target the unique elements that wrap the content you want, however, and that could mean a separate rule for each website.
I hope this helps for you.
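Putting the two steps together, a rough sketch looks like this; the target URL and the class="review" pattern are placeholders you would tailor to each site's markup:

<?php
// Sketch: fetch a page with cURL, then pull out the review blocks.
// The URL and the 'review' class name are placeholders.
$ch = curl_init('http://www.example.com/restaurant/1234');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$page = curl_exec($ch);
curl_close($ch);

// Non-greedy match so each review block is captured separately.
preg_match_all('/<div class="review">(.+?)<\/div>/s', $page, $matches);
print_r($matches[1]);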
I think Google finds news, blogs, reviews, and the like through the sitemap (sitemap.xml) that web administrators submit to Google.
To do this, you must fetch the content of the page whose reviews you want (with cURL or something similar to retrieve the remote file) and extract the reviews with regular expressions over the HTML.

how can google find me if I am inside a mysql table?

I am creating a classifieds website.
I'm storing all ads in a MySQL database, in different tables.
Is it possible for these ads to be found somehow through Google's search engine?
Is it possible to create meta information about each ad so that Google finds them?
How do major companies do this?
I have thought about auto-generating an HTML page for each ad inserted, but 500,000 auto-generated HTML pages doesn't really sound like a good solution!
Any thoughts and ideas?
UPDATE:
Here is my basic website so far:
(ALL PHP BASED)
I have a search engine which searches the database for records.
After finding and displaying the search results, you can click on a result (an 'ad') and PHP fetches its info from the database and displays it. Simple!
In the 'put ad' section of my site, you can insert your own ad into the MySQL database.
I need to know how to make Google find the ads on my website as well, as I don't think Google's crawler can search my database just because users can.
Please explain your answers more thoroughly so that I understand fully how this works!
Thank you
Google doesn't find database records. Google finds web pages. If you want your classifieds to be found then they'll need to be on a Web page of some kind. You can help this process by giving Google a site map/index of all your classifieds.
I suggest you take a look at Google Basics and Creating and submitting Sitemaps.
Basically the idea is to spoon-feed Google every URL you want Google to find. So if you reference your classifieds this way:
http://www.mysite.com/classified?id=1234
then you create a list of every URL required to reach every classified, and yes, this might be hundreds of thousands or even millions.
The above assumes a single classified per page. You can of course put 5, 10, 50 or 100 on a single page and then create a smaller set of URLs for Google to crawl.
Whatever you do however remember this: your sitemap should reflect how your site is used. Every URL Google finds (or you give it) will appear in the index. So don't give Google a URL that a user couldn't reach by using the site normally or that you don't want a user to use.
So while 50 classifieds per page might mean fewer requests from Google, if that's not how you want users to use your site (or a view you want to provide), then you'll have to do it some other way.
Just remember: Google indexes Web pages not data.
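A rough sketch of generating such a sitemap straight from the database (the 'classifieds' table and 'id' column are assumptions):

<?php
// Sketch: emit a sitemap of every classified ad from MySQL.
// Table and column names are assumptions.
header('Content-Type: application/xml; charset=utf-8');
$db = new PDO('mysql:host=localhost;dbname=mysite', 'user', 'pass');

echo '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
echo '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
foreach ($db->query('SELECT id FROM classifieds') as $row) {
    echo "  <url><loc>http://www.mysite.com/classified?id={$row['id']}</loc></url>\n";
}
echo '</urlset>';

Keep in mind a single sitemap file is capped at 50,000 URLs, so with hundreds of thousands of ads you would split it into several files tied together by a sitemap index.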
How would you normally access these classifieds? You're not just keeping them locked up in the database, are you?
Google sees your website like any other visitor would see your website. If you have a normal database-driven site, there's some unique URL for each classified where it is displayed. If there's a link to it somewhere, Google will find it.
If you want Google to index your site, you need to put all your pages on the web and link between them.
You do not have to auto-generate a static HTML page for everything; all pages can be created dynamically (JSP, ASP, PHP, what have you), but they need to be accessible to a web crawler.
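A minimal dynamic page along those lines might look like this in PHP (table and column names are assumptions):

<?php
// Sketch: one crawlable URL per ad, e.g. /classified.php?id=1234.
// Table and column names are assumptions.
$db = new PDO('mysql:host=localhost;dbname=mysite', 'user', 'pass');
$stmt = $db->prepare('SELECT title, body FROM classifieds WHERE id = ?');
$stmt->execute(array((int) $_GET['id']));
$ad = $stmt->fetch(PDO::FETCH_ASSOC);

if (!$ad) {
    header('HTTP/1.1 404 Not Found');
    exit('Ad not found');
}
echo '<title>' . htmlspecialchars($ad['title']) . '</title>';
echo '<h1>' . htmlspecialchars($ad['title']) . '</h1>';
echo '<p>' . htmlspecialchars($ad['body']) . '</p>';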
Google can find you no matter where you try to hide. Even if you can somehow fit yourself into a mysql table. Because they're Google. :-D
Seriously, though, they use a bot to periodically spider your site, so you mostly just need to make the data in your database available as web pages on your site and make your site bot-friendly (use an appropriate robots.txt file, provide a search-engine-friendly sitemap, etc.). You also need to make sure they can find your site, so make sure it's linked to by other sites - preferably sites with lots of traffic.
If your site only displays specific results in response to search terms, you'll have a harder time. You may want to make full lists of the records available to people without search terms (paged appropriately if you have lots of data).
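For example, a crawlable paged listing of every record could be sketched like this (table and column names are assumptions):

<?php
// Sketch: list all ads 50 per page so a bot can reach every record
// without entering a search term. Table/column names are assumptions.
$perPage = 50;
$page    = isset($_GET['page']) ? max(1, (int) $_GET['page']) : 1;

$db = new PDO('mysql:host=localhost;dbname=mysite', 'user', 'pass');
$stmt = $db->prepare('SELECT id, title FROM classifieds ORDER BY id LIMIT ? OFFSET ?');
$stmt->bindValue(1, $perPage, PDO::PARAM_INT);
$stmt->bindValue(2, ($page - 1) * $perPage, PDO::PARAM_INT);
$stmt->execute();

foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $ad) {
    echo '<a href="/classified?id=' . (int) $ad['id'] . '">'
       . htmlspecialchars($ad['title']) . '</a><br>';
}
echo '<a href="?page=' . ($page + 1) . '">Next page</a>';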
First, create a PHP file that pulls the index plus a human-readable reference for all records.
That is your main page, broken out into categories (as in the case of Craigslist.com - by country and state).
Then each category link feeds the selected value back to the PHP script, regardless of the level, until you finally reach the ad itself.
So, if a category is selected that contains more categories (as states contain cities), display the next list of categories. Otherwise, display the list of ads for that city.
This will give Google a way to index a site (i.e. a MySQL database) dynamically without creating static content for the millions (billions or trillions) of records involved.
This is just an idea of how to get Google to index a database.
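A bare-bones sketch of that drill-down, assuming a self-referencing 'categories' table with a parent_id column (all names here are assumptions):

<?php
// Sketch: walk down the category tree; at a leaf, list that
// category's ads. Assumes a self-referencing 'categories' table.
$db  = new PDO('mysql:host=localhost;dbname=mysite', 'user', 'pass');
$cat = isset($_GET['cat']) ? (int) $_GET['cat'] : 0;

$stmt = $db->prepare('SELECT id, name FROM categories WHERE parent_id = ?');
$stmt->execute(array($cat));
$children = $stmt->fetchAll(PDO::FETCH_ASSOC);

if ($children) {
    // More levels below: show the sub-categories as crawlable links.
    foreach ($children as $c) {
        echo '<a href="?cat=' . $c['id'] . '">' . htmlspecialchars($c['name']) . '</a><br>';
    }
} else {
    // Leaf category: show its ads.
    $stmt = $db->prepare('SELECT id, title FROM classifieds WHERE category_id = ?');
    $stmt->execute(array($cat));
    foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $ad) {
        echo '<a href="/classified?id=' . $ad['id'] . '">' . htmlspecialchars($ad['title']) . '</a><br>';
    }
}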
