I'm using the Bing Search API 2.0 (XML) & PHP to retrieve results.
But when running some queries, the API doesn't return the (same) results Bing.com would.
When I send this request: (This is using the API)
http://api.search.live.net/xml.aspx?Appid=__________&query=3+ts+site%3Amycharity.ie/charity&sources=web&web.count=10&web.offset=0
I get 0 results.
But if I go to Bing.com and search for bacon the URL would be:
http://www.bing.com/search?q=bacon&go=&form=QBRE&filt=all&qs=n&sk=&sc=8-5
So if I substitute my API query into this URL, like so:
http://www.bing.com/search?q=3+ts+site%3Amycharity.ie/charity&go=&form=QBRE&filt=all&qs=n&sk=&sc=8-5
I should get 0 results again, right?
No, I get the 1 result. (The result I was looking for with the API).
Why is this? Is there any way around this?
Yes the Bing API is totally brain dead and utterly useless because of this fact.
But, luckily, screen scraping is trivial:
<?php
// $start is the offset passed to Bing's "first" parameter (first result to return)
function searchBing($search_term, $start = 1)
{
    $html = file_get_contents("http://www.bing.com/search?q=".urlencode($search_term)."&go=&qs=n&sk=&sc=8-20&first=$start&FORM=QBLH");
    $doc = new DOMDocument();
    // @ suppresses the warnings that malformed real-world HTML triggers
    @$doc->loadHtml($html);
    $x = new DOMXpath($doc);
    $output = array();
    // just grab the urls for now
    foreach ($x->query("//div[@class='sb_tlst']//a") as $node)
    {
        $output[] = $node->getAttribute("href");
    }
    return $output;
}
print_r(searchBing("bacon"));
It doesn't look like the API request is actually requesting the information. Well, it is, but not quite. For example:
from the Bing search: "search?q=bacon&go=&form". Note the word bacon in it.
This doesn't appear to be parsed in any way in the API request, not even as a hex value. I believe that herein lies the problem.
Perhaps there was an issue, which is now fixed...
Currently, when I try the following queries, built according to the Bing API 2.0 documentation on MSDN, they all return the same single result:
http://www.bing.com/search?q=3+ts+site%3Amycharity.ie/charity&go=&form=QBRE&filt=all&qs=n&sk=&sc=8-5
http://api.bing.net/xml.aspx?Appid=______7&query=3+ts+site%3Amycharity.ie/charity&sources=web
http://api.bing.net/json.aspx?Appid=_______&query=3+ts+site%3Amycharity.ie/charity&sources=web
As I am new to the PHP world, I need some help consuming REST requests with PHP. So far I have managed to submit JSON through jQuery ($("#form").serialize) and read it in PHP, returning a JSON value in the server's response.
But I am a bit stuck on reading a REST URL parameter with PHP. For example, for a select by ID my client calls me with the URL below, and I would like to retrieve 1 and fetch from it. With Spring's PathVariable this is easy, but how can I parse this URL in PHP?
/dummy/customer/1/fetch
Note: as of now I am not using any framework, only raw PHP.
Any help would be appreciated. Thanks in advance.
There are a lot of different ways to do this. I'll show two.
Basic string functions
We can simply split the string on / and grab the 4th part. (The path starts with a slash, so the first part is an empty string and the id sits at index 3.)
$output = explode('/', $input)[3];
Regular expressions
This example of a regular expression doesn't just grab the 1, it also ensures that all the other parts in the url are what you expect. Regular expressions are harder but more versatile.
$matches = null;
$success = preg_match('#^/dummy/customer/([0-9]+)/fetch$#', $input, $matches);
if (!$success) {
throw new Exception('Unexpected format');
}
$output = $matches[1];
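To tie this back to a real request, here is a minimal sketch that assumes the path arrives in $_SERVER['REQUEST_URI'] and may carry a query string. The function name parseCustomerRoute and the extra capture for the trailing action are illustrative assumptions, not part of the answer above.

```php
<?php
// Hypothetical helper: extracts the id and action from a path such as
// /dummy/customer/1/fetch. The route layout is assumed from the question.
function parseCustomerRoute(string $requestUri): ?array
{
    // Drop any query string (?foo=bar) so only the path is matched.
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path)) {
        return null;
    }
    // Same idea as the regular expression above, with the action captured too.
    if (preg_match('#^/dummy/customer/([0-9]+)/([a-z]+)$#', $path, $m) !== 1) {
        return null;
    }
    return ['id' => (int) $m[1], 'action' => $m[2]];
}

// In a real script the input would be $_SERVER['REQUEST_URI'].
print_r(parseCustomerRoute('/dummy/customer/1/fetch?verbose=1'));
```

Returning null instead of throwing lets the caller decide how to respond to an unknown route, for example with a 404.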
I have the following PHP. Basically, I'm getting similar data from multiple pages of a website (the current number of homeruns from a website that has a bunch of baseball player profiles). The JSON that I'm bringing in has all of the URLs to all of the different profiles that I'm looking to grab from, and so I need PHP to run through the URLs and grab the data. However, the following PHP only gets the info from the very first URL. I'm probably making a stupid mistake. Can anyone see why it's not going through all the URLs?
include('simple_html_dom.php');
$json = file_get_contents("http://example.com/homeruns.json");
$elements = json_decode($json);
foreach ($elements as $element){
$html = new simple_html_dom();
$html->load_file($element->profileurl);
$currenthomeruns = $html->find('.homeruns .current',0);
echo $element->name, " currently has the following number of homeruns: ", strip_tags($currenthomeruns);
return $html;
}
Wait... You are using return $html. Why? Return is going to break out of your function, thus stopping your foreach.
If you are indeed trying to get the $html out of your function for ALL of the elements, you should push each $html into an array and then return that array after the loop.
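A minimal sketch of that idea follows. The function collectHomeruns and the fake fetcher are hypothetical stand-ins so the example runs without simple_html_dom or network access; in the real script the callback would load the profile URL and read '.homeruns .current'.

```php
<?php
// Hypothetical helper: visits every player and gathers one value per player.
function collectHomeruns(array $players, callable $fetchHomeruns): array
{
    $results = [];
    foreach ($players as $player) {
        // Store the value instead of returning, so the loop keeps going.
        $results[$player['name']] = $fetchHomeruns($player['profileurl']);
    }
    // Return once, after every player has been visited.
    return $results;
}

$players = [
    ['name' => 'Player A', 'profileurl' => 'http://example.com/a'],
    ['name' => 'Player B', 'profileurl' => 'http://example.com/b'],
];

// Fake fetcher (an assumption for illustration): returns a dummy number.
$fake = function (string $url): int {
    return strlen($url);
};

print_r(collectHomeruns($players, $fake));
```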
Because you return. return leaves the current method, function, or script, which includes every loop. With PHP 5.5 you can use yield to make the function behave like a generator, but this is definitely out of scope for now.
Unless your braces are off, you return at the very end of the loop body, so the loop will never get past its first iteration.
I have some JSON-like data that I got by crawling a URL:
[[["oppl.lr",[,,,,,,,,,,,[[[[[,"A Google User"]
,,,,1]
,3000,,,"Double tree was ok, it wasnt super fancy or anything. Its good for families and just relaxing by the pool. Service was good, and rooms were kept neat.","a year ago",["http://www.ma..",,1],,"","",""]
]
,["Rooms","Service","Location","Value"]
,[]
Which is impossible to parse using PHP's json_decode() function. Is there any library or something that will let me convert this to regular JSON so that my task will be easier? Otherwise I know I will have to write a regular expression.
Thanks in advance.
Based on your comment, in case your data is
[["oppl.lr",[,,,,,,,,,,,[[,"A Google User"],"",""],""]]]
If you can somehow send the data to the client side, then it is a valid JavaScript array. You can either process the data on the client side or use
JSON.stringify([["oppl.lr",[,,,,,,,,,,,[[,"A Google User"],"",""],""]]]);
and send the data back to server as
"[["oppl.lr",[null,null,null,null,null,null,null,null,null,null,null,[[null,"A Google User"],"",""],""]]]"
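Otherwise, in PHP you can use this function:
function getValidArray($input) {
    // Run twice: a single pass misses overlapping matches in runs of commas
    $input = str_replace(",,", ',"",', $input);
    $input = str_replace(",,", ',"",', $input);
    $input = str_replace("[,", '["",', $input);
    // eval() executes arbitrary PHP code, so only use this on trusted input
    return eval("return $input;");
}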
You can optimize the above function as per the need.
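One possible variation, sketched here under assumptions, avoids eval() by filling the empty slots with null and handing the result to json_decode(). The name sparseArrayToPhp is hypothetical. Like the function above, it assumes the string values themselves never contain a comma or opening bracket directly followed by a comma, which would be rewritten too.

```php
<?php
// Hypothetical alternative to eval(): fill the holes with null, then decode.
function sparseArrayToPhp(string $input)
{
    // An opening bracket or comma directly followed by another comma marks
    // an empty slot; insert null there. The lookahead leaves the second
    // comma in place, so runs like ",,,," are handled in a single pass.
    $json = preg_replace('/([\[,])\s*(?=,)/', '${1}null', $input);

    // json_decode() returns null if the result is still not valid JSON.
    return json_decode($json, true);
}

$data = sparseArrayToPhp('[["oppl.lr",[,,,,,,,,,,,[[,"A Google User"],"",""],""]]]');
echo $data[0][1][11][0][1], "\n";
```

Empty slots come back as null rather than "", which matches what JSON.stringify produces for the same array.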
Just wondering if someone has sample code to return the top 25 links of reddit (using PHP), in JSON or XML. I can't wrap my head around the API... and I rarely use Python.
$array = (array) json_decode(file_get_contents("http://www.reddit.com/.json"));
That will return a PHP array with all the data from the front page. You can limit how many you display using a simple while() loop.
This is very simple now that you have the array from above, because all you have to do is step through it.
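As a rough sketch of that stepping-through, assuming Reddit's listing layout (data, then children, then each child's data with title and url fields): the function name topLinks and the inline sample are made up for illustration, and in practice the JSON string would come from the file_get_contents() call above.

```php
<?php
// Hypothetical helper: pulls title/url pairs out of a Reddit listing.
// Assumes the layout data -> children[] -> data -> title/url.
function topLinks(string $json, int $limit = 25): array
{
    $listing = json_decode($json, true);
    // Fall back to an empty list if the JSON is invalid or shaped differently.
    $children = $listing['data']['children'] ?? [];

    $links = [];
    foreach (array_slice($children, 0, $limit) as $child) {
        $links[] = [
            'title' => $child['data']['title'] ?? '',
            'url'   => $child['data']['url'] ?? '',
        ];
    }
    return $links;
}

// Tiny invented sample so the sketch runs without network access.
$sample = '{"data":{"children":['
        . '{"data":{"title":"First","url":"http://example.com/1"}},'
        . '{"data":{"title":"Second","url":"http://example.com/2"}}]}}';

print_r(topLinks($sample, 1));
```

array_slice() does the limiting here, so no separate counter or while() loop is needed.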
I am trying to create a simple alert app for some friends.
Basically I want to be able to extract the "price" and "stock availability" data from a webpage like the following two:
http://www.sparkfun.com/commerce/product_info.php?products_id=5
http://www.sparkfun.com/commerce/product_info.php?products_id=9279
I have built the e-mail and SMS alert part, but now I want to get the quantity and price out of the webpages (those two or any others) so that I can compare the price and available quantity and alert us to place an order when a product falls between some thresholds.
I have tried some regexes (found in tutorials, but I am way too n00b for this) and haven't managed to get them working. Any good tips or examples?
$content = file_get_contents('http://www.sparkfun.com/commerce/product_info.php?products_id=9279');
preg_match('#<tr><th>(.*)</th> <td><b>price</b></td></tr>#', $content, $match);
$price = $match[1];
preg_match('#<input type="hidden" name="quantity_on_hand" value="(.*?)">#', $content, $match);
$in_stock = $match[1];
echo "Price: $price - Availability: $in_stock\n";
It's called screen scraping, in case you need to google for it.
I would suggest that you use a dom parser and xpath expressions instead. Feed the HTML through HtmlTidy first, to ensure that it's valid markup.
For example:
$html = file_get_contents("http://www.example.com");
$html = tidy_repair_string($html);
$doc = new DomDocument();
$doc->loadHtml($html);
$xpath = new DomXPath($doc);
// Now query the document:
foreach ($xpath->query('//table[@class="pricing"]//th') as $node) {
    // A DOMNode cannot be echoed directly; print its text content
    echo $node->nodeValue, "\n";
}
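To make the XPath step concrete without hitting a live site, here is a small sketch that runs on an inline HTML string. The markup and the function name extractPricing are invented for illustration; a real product page will need its own expressions.

```php
<?php
// The markup below is invented for illustration; real pages will differ.
$html = '<table class="pricing"><tr><th>Price</th><td>$9.95</td></tr>'
      . '<tr><th>In stock</th><td>42</td></tr></table>';

// Hypothetical helper: maps each header cell to the data cell in its row.
function extractPricing(string $html): array
{
    $doc = new DOMDocument();
    // @ suppresses the warnings that imperfect real-world HTML triggers.
    @$doc->loadHTML($html);
    $xpath = new DOMXPath($doc);

    $rows = [];
    foreach ($xpath->query('//table[@class="pricing"]//tr') as $tr) {
        // Relative queries: look for th/td children of the current row.
        $th = $xpath->query('th', $tr)->item(0);
        $td = $xpath->query('td', $tr)->item(0);
        if ($th !== null && $td !== null) {
            $rows[trim($th->textContent)] = trim($td->textContent);
        }
    }
    return $rows;
}

print_r(extractPricing($html));
```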
Whatever you do: don't use regular expressions to parse HTML, or bad things will happen. Use a parser instead.
First, this question asks for too much detail. Second, extracting data from a website might not be legitimate. However, I have some hints:
Use Firebug or the Chrome/Safari Inspector to explore the HTML content and find the pattern around the information you are interested in.
Test your regex to see if it matches. You may need to do this many times (multi-pass parsing/extraction).
Write a client with cURL or, even simpler, use file_get_contents (note that some hosts disable loading URLs with file_get_contents).
Personally, I would rather use Tidy to convert the page to valid XHTML and then use XPath to extract the data, instead of regex. Why? Because XHTML is not a regular language, and XPath is very flexible. You can also learn XSLT to transform the result.
Good luck!
You are probably best off loading the HTML code into a DOM parser like this one and searching for the "pricing" table. However, any kind of scraping you do can break whenever they change their page layout, and is probably illegal without their consent.
The best way, though, would be to talk to the people who run the site, and see whether they have alternative, more reliable forms of data delivery (Web services, RSS, or database exports come to mind).
This is the simplest method to extract data from a website. I found that all of my data sits inside <h3> tags, so I prepared this:
<?php
include('simple_html_dom.php');
// Create DOM from URL, paste your destined web url in $page
$page = 'http://facebook4free.com/category/facebookstatus/amazing-facebook-status/';
$html = new simple_html_dom();
//Within $html your webpage will be loaded for further operation
$html->load_file($page);
// Find all links
$links = array();
//Within find() function, I have written h3 so it will simply fetch the content from <h3> tag only. Change as per your requirement.
foreach($html->find('h3') as $element)
{
$links[] = $element;
}
reset($links);
//$out will be having each of HTML element content you searching for, within that web page
foreach ($links as $out)
{
echo $out;
}
?>