PHP: How to interpret a Google URL

I would like a tip on how to make this script interpret the Google URL as if I had done the search on Google myself.
<?php
$ch = curl_init();
$timeout = 5;
curl_setopt ($ch, CURLOPT_URL, 'http://www.google.com/?q=cr#hl=fr&q=help+me+please&psj=1&oq=variable+get+google+recherche&fp=1/');
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$file_contents = curl_exec($ch);
curl_close($ch);
$lines = array();
$lines = explode("\n", $file_contents);
foreach($lines as $line_num => $line) {
echo "Line # {$line_num} : ".htmlspecialchars($line)."<br />\n";
}
?>
This is what I've come up with, but when I try it on my server I only get the google.com source code, not the source of the results page shown after the search.
Can anyone help me? Thanks :D

This isn't the best way you could be doing this.
The JSON/Atom custom search API will do what you want. http://code.google.com/apis/customsearch/v1/overview.html
For Yahoo, the BOSS API: http://developer.yahoo.com/search/boss/
And for Bing: http://www.bing.com/toolbox/bingdeveloper/
Additionally, the reason your cURL request isn't giving you the results you need is that the search query sits behind a hash (#) in the URL. Everything after the hash is a fragment, which is never sent to the server, so cURL only requests the Google home page. In the browser, Google reads the fragment with JavaScript and pulls the results in via Ajax. You will have to pass the query string directly to the Google results page instead.
You can find the right URL by turning JavaScript off in your browser, performing a search, and copying the resulting URL.
For the lazy, this is: http://www.google.com/search?hl=en&q=test+search
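As a rough sketch of the point above, the snippet below pulls the query out of the fragment of the original URL and rebuilds it as a plain query string on the /search page. The helper name build_search_url() is made up for illustration, and the shortened URL is only a sample; the resulting string is what would then be passed to CURLOPT_URL.

```php
<?php
// Hypothetical helper: builds a classic results-page URL from a search phrase.
function build_search_url(string $query, string $lang = 'en'): string
{
    // http_build_query() URL-encodes the values; spaces become '+'.
    $params = http_build_query(['hl' => $lang, 'q' => $query], '', '&');
    return 'http://www.google.com/search?' . $params;
}

// The part after '#' is the fragment; an HTTP client never sends it.
$original = 'http://www.google.com/?q=cr#hl=fr&q=help+me+please';
$fragment = parse_url($original, PHP_URL_FRAGMENT);

// Decode the fragment as if it were a query string.
parse_str($fragment, $fragmentParams);

$url = build_search_url($fragmentParams['q'], $fragmentParams['hl']);
echo $url, "\n"; // http://www.google.com/search?hl=fr&q=help+me+please
```

With the default language, build_search_url('test search') gives the same URL as the one quoted above.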

You can use the Google Mobile view:
http://www.google.com/gwt/x?u=http%3A%2F%2Fwww.google.com%2Fsearch%3Fq%3Dkeyword&btnGo=Go&source=wax&ie=UTF-8&oe=UTF-8
Or you can use the Google API to get Google search results in JSON format.
For web search
http://ajax.googleapis.com/ajax/services/search/web?q=keyword&v=1.0&start=8&rsz=8
For image search
http://ajax.googleapis.com/ajax/services/search/images?q=keyword&v=1.0&start=8&rsz=8
For video search
http://ajax.googleapis.com/ajax/services/search/video?q=keyword&v=1.0&start=8&rsz=8

Related

How to get results from other website

I need to fetch data from this web page: Sender Score.
I tried to use cURL, but it renders a blank page.
Here is my code:
$ch = curl_init();
$keyword = "an-example.com";
curl_setopt($ch, CURLOPT_URL, 'https://www.senderscore.org/lookup.php?lookup='.$keyword.'&validLookup=true');
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$data = curl_exec($ch);
print_r($data);
curl_close($ch);
Any idea?
Regards.
You're getting a blank page because of the captcha that has to be filled in. Perhaps Sender Score has an API you can use? Or maybe another website offers the same thing. I believe this is about scoring email reputation, right? Then maybe this site will help you out: http://www.reputationauthority.org/domain_lookup.php?ip=somedomain.nl&Submit.x=0&Submit.y=0&Submit=Search
I can use this site without a captcha or any other bot protection getting in the way.

Google feed api not returning results with php in wordpress

I am trying to use the Google Feed API on my WordPress site. I have enabled PHP with a plugin, which allows me to put PHP code in my pages. My hosting provider also confirmed that they have cURL enabled.
This is the code I am trying to run, which I got from the Google developer site
(https://developers.google.com/feed/v1/jsondevguide#basic_query)
<?php
$url = "https://ajax.googleapis.com/ajax/services/feed/find?v=1.0&q=iphone5& userip=2.96.214.41";
// sendRequest
// note how referer is set manually
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_REFERER, http://www.iweb21.com);
$body = curl_exec($ch);
curl_close($ch);
// now, process the JSON string
$json = json_decode($body);
// now have some fun with the results...
?>
I don't get any results, just a blank page.
I am not a PHP programmer, just a novice WordPress user. I have been searching for a plugin that uses the Google Feed API but got nowhere, so I decided to try the code provided by Google.
I would very much appreciate any advice. Thanks.
A blank page usually means that there is a fatal or parse error and that error display is disabled in the PHP settings. See this: How to get useful error messages in PHP?
In your particular case, the referer string is not enclosed in quotes, which causes a parse error. Replace it with:
curl_setopt($ch, CURLOPT_REFERER, 'http://www.iweb21.com');
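For the runtime side of this, a minimal sketch along the following lines may help while debugging. It assumes a development setup; the helper decode_json_body() and the sample JSON body are invented for illustration. Note that a parse error in the same file still shows a blank page unless display_errors is enabled in php.ini, because the file never gets to run its own ini_set() call.

```php
<?php
// Show all errors while debugging (assumption: not a production server).
error_reporting(E_ALL);
ini_set('display_errors', '1');

// Hypothetical helper: decode a JSON body and fail loudly instead of silently.
function decode_json_body($body): array
{
    // curl_exec() returns false on failure, so check before decoding.
    if ($body === false || $body === '') {
        throw new RuntimeException('Empty response; check curl_error() on the handle.');
    }
    $data = json_decode($body, true);
    if (!is_array($data)) {
        throw new RuntimeException('Invalid JSON: ' . json_last_error_msg());
    }
    return $data;
}

// Made-up sample body, only to exercise the helper.
$decoded = decode_json_body('{"responseStatus": 200}');
echo $decoded['responseStatus'], "\n"; // 200
```

In the script from the question, the value returned by curl_exec($ch) would be passed to the helper in place of the sample string.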

How to copy content from a dynamic page using PHP?

Is it possible to get the information displayed on the page linked below using PHP? I want all the text content displayed on the page to be copied to a variable or to a file.
http://www.ncbi.nlm.nih.gov/nuccore/24655740?report=fasta&format=text
I have tried cURL too, but it didn't work, whereas cURL worked with a few other sites I know. If there are solutions using cURL, please post them anyway; I may not have tried every way cURL can be used.
Use cURL to get the page content and then parse it - extract the <pre> section.
$ch = curl_init();
// Request the viewer URL directly; the query data goes after the full scheme, host and path
curl_setopt($ch, CURLOPT_URL, 'http://www.ncbi.nlm.nih.gov/sviewer/viewer.fcgi?val=24655740&db=nuccore&dopt=fasta&extrafeat=0&fmt_mask=0&maxplex=1&sendto=t&withmarkup=on&log$=seqview&maxdownloadsize=1000000');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
// Timeout in seconds, as an integer
curl_setopt($ch, CURLOPT_TIMEOUT, 3);
$content = trim(curl_exec($ch));
curl_close($ch);
// show ALL the content
print $content;
// If the response wraps the text in <pre> tags, keep only what is between them
$start_index = strpos($content, '<pre>')+5;
$end_index = strpos($content, '</pre>');
$your_text = substr($content, $start_index, $end_index-$start_index);
UPDATE
Using the link from @ovitinho's answer - it now works :)
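The strpos()/substr() step in this answer assumes both tags are present. strpos() returns false when a tag is missing, and the arithmetic then silently produces the wrong slice. A slightly more defensive sketch, with a made-up helper name extract_pre() and made-up sample input rather than real server output, could look like this:

```php
<?php
// Hypothetical helper: returns the text inside the first <pre>...</pre>,
// or null when the page has no such block.
function extract_pre(string $html): ?string
{
    $open = strpos($html, '<pre>');
    if ($open === false) {
        return null;
    }
    $start = $open + strlen('<pre>');
    // Search for the closing tag only after the opening one.
    $end = strpos($html, '</pre>', $start);
    if ($end === false) {
        return null;
    }
    return trim(substr($html, $start, $end - $start));
}

// Made-up sample input, not actual output from the NCBI server.
$sample = "<html><pre>\n>example sequence\nACGT\n</pre></html>";
var_dump(extract_pre($sample));       // the two lines between the tags
var_dump(extract_pre('plain text')); // NULL
```

When the helper returns null, the response contains no <pre> block and the whole body can be used as it is.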
You need to request the URL that the form uses to load this result via JavaScript.
I found this final URL:
http://www.ncbi.nlm.nih.gov/sviewer/viewer.fcgi?val=24655740&db=nuccore&dopt=fasta&extrafeat=0&fmt_mask=0&maxplex=1&sendto=t&withmarkup=on&log$=seqview&maxdownloadsize=1000000
Make sure to use the ID 24655740 from your first link in this request.
You can then fetch it with cURL.

Grabbing revenue from website with PHP cURL

I am attempting to grab revenue from a website past a login page through cURL. I know this is a sloppy way but I have no choice.
<?php
$username = "example";
$password = "example";
$postfields = "email=$username&password=$password";
// Use Curl to return the raw source of a webpage to a variable called
$ch = curl_init();
//curl_setopt($ch, CURLOPT_HEADER, 1); // Get the header
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // Allow redirection
curl_setopt($ch, CURLOPT_COOKIEJAR, "/tmp/cookie");
curl_setopt($ch, CURLOPT_URL, "https://www.domain.com/login");
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, "$postfields");
$page = curl_exec($ch);
curl_close($ch); // Closing
if (preg_match("/<th>(.*)<\/th/s", $page, $matches)) {
echo $matches[1];
}
?>
Essentially I am able to get past login fine and it redirects me to the dashboard of the specific website I am trying to grab revenue from, however when trying to use preg_match it doesn't grab anything, it just prints all HTML for the dashboard.
I am trying to only get "$99.99" within the
<th>$99.99</th>
Help greatly appreciated.
Add to your code this line:
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
Otherwise cURL prints the response directly to the output instead of returning it, so $page never contains the HTML. I'd suggest making your regex non-greedy as well: there is only one <th> now, but this code might be copied (as a perfectly working snippet) into some other program and cause trouble there.
Your regex is greedy and will likely grab the contents of several <th> elements if there is more than one. It is usually not a good idea to parse HTML or XML with regular expressions; an HTML parser will accomplish this task more reliably. I am partial to DOMDocument.
To solve the problem at hand, though, [^<]+ will gather all characters up to but not including the next <.
if (preg_match("/<th>([^<]+)<\/th/s", $page, $matches)) {
echo $matches[1];
}
Use preg_match_all() if you have multiple <th> to retrieve, as the above will get only the first one.
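For the DOMDocument route mentioned above, a minimal sketch might look like the following. The helper name extract_th_cells() and the sample markup are assumptions made for illustration; the real dashboard HTML would come from curl_exec() with CURLOPT_RETURNTRANSFER set.

```php
<?php
// Hypothetical helper: collects the text of every <th> cell in an HTML string.
function extract_th_cells(string $html): array
{
    $doc = new DOMDocument();
    // Real-world markup is rarely valid; collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $cells = [];
    foreach ($doc->getElementsByTagName('th') as $th) {
        $cells[] = trim($th->textContent);
    }
    return $cells;
}

// Made-up sample markup standing in for the dashboard page.
$page = '<table><tr><th>$99.99</th><th>$12.50</th></tr></table>';
print_r(extract_th_cells($page));
```

Because the helper returns every <th> in document order, the first element plays the role of $matches[1] from the regex version.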

Php curl incorrect download

I'm attempting to use YouTube's API to pull a list of videos and display them. To do this, I need to request their API with cURL and get the XML file it returns, which I will then parse.
When I run the following curl function
function get_url_contents($url){
$crl = curl_init();
$timeout = 5;
curl_setopt ($crl, CURLOPT_URL,$url);
curl_setopt ($crl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($crl, CURLOPT_CONNECTTIMEOUT, $timeout);
$ret = curl_exec($crl);
curl_close($crl);
return $ret;
}
against the url
http://gdata.youtube.com/feeds/api/videos?q=Apple&orderby=relevance
The string that is saved is horribly garbled. There are no < > tags, and most of it is missing half of its characters. It looks completely different from what I see when I view the URL in a browser.
I tried print, echo, and var_dump, and they all show it as completely different, which makes parsing it impossible.
How do I get the file properly from the server?
It's working for me. I'm pretty sure the file is returned without errors, but when you print it, the <> tags aren't shown because the browser interprets them as markup. If you look at the page source, you can see them.
Try this, you can see it work:
$content = get_url_contents('http://gdata.youtube.com/feeds/api/videos?q=Apple&orderby=relevance');
$xml = simplexml_load_string($content);
print_r($xml);
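To make the difference visible, here is a small sketch that escapes the markup before printing it and parses the raw string separately. The $content value below is a made-up stand-in for what get_url_contents() returns, not real feed data.

```php
<?php
// Made-up sample standing in for the feed returned by get_url_contents().
$content = '<feed><entry><title>Apple</title></entry></feed>';

// Escape the markup so the browser shows the tags instead of interpreting them.
$visible = htmlspecialchars($content);
echo '<pre>', $visible, '</pre>', "\n";

// Parsing works on the raw string; no escaping is needed here.
$xml = simplexml_load_string($content);
echo (string) $xml->entry->title, "\n"; // Apple
```

The escaped copy is only for display; the unescaped string is the one to hand to the parser.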
Make use of the client library that Google provides, it'll make your life easier.
http://code.google.com/apis/youtube/2.0/developers_guide_php.html
