Is there any way I can access the thumbnail picture of any wikipedia page by using an API? I mean the image on the top right side in box. Is there any APIs for that?
You can get the thumbnail of any wikipedia page using prop=pageimages. For example:
http://en.wikipedia.org/w/api.php?action=query&titles=Al-Farabi&prop=pageimages&format=json&pithumbsize=100
And you will get the thumbnail full URL.
http://en.wikipedia.org/w/api.php
Look at prop=images.
It returns an array of image filenames that are used in the parsed page. You then have the option of making another API call to find out the full image URL, e.g.:
action=query&titles=Image:INSERT_EXAMPLE_FILE_NAME_HERE.jpg&prop=imageinfo&iiprop=url
or to calculate the URL via the filename's hash.
Unfortunately, while the array of images returned by prop=images is in the order they are found on the page, the first can not be guaranteed to be the image in the info box because sometimes a page will include an image before the infobox (most of the time icons for metadata about the page: e.g. "this article is locked").
Searching the array of images for the first image that includes the page title is probably the best guess for the infobox image.
This is good way to get the Main Image of a page in wikipedia
http://en.wikipedia.org/w/api.php?action=query&prop=pageimages&format=json&piprop=original&titles=India
Check out the MediaWiki API example for getting the main picture of a wikipedia page: https://www.mediawiki.org/wiki/API:Page_info_in_search_results.
As other's have mentioned, you would use prop=pageimages in your API query.
If you also want the image description, you would use prop=pageimages|pageterms instead in your API query.
You can get the original image using piprop=original. Or you can get a thumbnail image with a specified width/height. For a thumbnail with width/height=600, piprop=thumbnail&pithumbsize=600. If you omit either, the image returned in the API callback will default to a thumbnail with width/height of 50px.
If you are requesting results in JSON format, you should always use formatversion=2 in your API query (i.e., format=json&formatversion=2) because it makes retrieving the image from the query easier.
Original Size Image:
https://en.wikipedia.org/w/api.php?action=query&format=json&formatversion=2&prop=pageimages|pageterms&piprop=original&titles=Albert Einstein
Thumbnail Size (600px width/height) Image:
https://en.wikipedia.org/w/api.php?action=query&format=json&formatversion=2&prop=pageimages|pageterms&piprop=thumbnail&pithumbsize=600&titles=Albert Einstein
Way 1: You can try some query like this:
http://en.wikipedia.org/w/api.php?action=opensearch&limit=5&format=xml&search=italy&namespace=0
in the response, you can see the Image tag.
<Item>
<Text xml:space="preserve">Italy national rugby union team</Text>
<Description xml:space="preserve">
The Italy national rugby union team represent the nation of Italy in the sport of rugby union.
</Description>
<Url xml:space="preserve">
http://en.wikipedia.org/wiki/Italy_national_rugby_union_team
</Url>
<Image source="http://upload.wikimedia.org/wikipedia/en/thumb/4/46/Italy_rugby.png/43px-Italy_rugby.png" width="43" height="50"/>
</Item>
Way 2: use query http://en.wikipedia.org/w/index.php?action=render&title=italy
then you can get a raw html code, you can get the image use something like PHP Simple HTML DOM Parser
http://simplehtmldom.sourceforge.net
I have no time write it to you. just give you some advice, thanks.
I'm sorry for not answering specifically your question about the main image. But here's some code to get a list of all images:
function makeCall($url) {
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
return curl_exec($curl);
}
function wikipediaImageUrls($url) {
$imageUrls = array();
$pathComponents = explode('/', parse_url($url, PHP_URL_PATH));
$pageTitle = array_pop($pathComponents);
$imagesQuery = "http://en.wikipedia.org/w/api.php?action=query&titles={$pageTitle}&prop=images&format=json";
$jsonResponse = makeCall($imagesQuery);
$response = json_decode($jsonResponse, true);
$imagesKey = key($response['query']['pages']);
foreach($response['query']['pages'][$imagesKey]['images'] as $imageArray) {
if($imageArray['title'] != 'File:Commons-logo.svg' && $imageArray['title'] != 'File:P vip.svg') {
$title = str_replace('File:', '', $imageArray['title']);
$title = str_replace(' ', '_', $title);
$imageUrlQuery = "http://en.wikipedia.org/w/api.php?action=query&titles=Image:{$title}&prop=imageinfo&iiprop=url&format=json";
$jsonUrlQuery = makeCall($imageUrlQuery);
$urlResponse = json_decode($jsonUrlQuery, true);
$imageKey = key($urlResponse['query']['pages']);
$imageUrls[] = $urlResponse['query']['pages'][$imageKey]['imageinfo'][0]['url'];
}
}
return $imageUrls;
}
print_r(wikipediaImageUrls('http://en.wikipedia.org/wiki/Saturn_%28mythology%29'));
print_r(wikipediaImageUrls('http://en.wikipedia.org/wiki/Hans-Ulrich_Rudel'));
I got this for http://en.wikipedia.org/wiki/Saturn_%28mythology%29:
Array
(
[0] => http://upload.wikimedia.org/wikipedia/commons/1/10/Arch_of_SeptimiusSeverus.jpg
[1] => http://upload.wikimedia.org/wikipedia/commons/8/81/Ivan_Akimov_Saturn_.jpg
[2] => http://upload.wikimedia.org/wikipedia/commons/d/d7/Lucius_Appuleius_Saturninus.jpg
[3] => http://upload.wikimedia.org/wikipedia/commons/2/2c/Polidoro_da_Caravaggio_-_Saturnus-thumb.jpg
[4] => http://upload.wikimedia.org/wikipedia/commons/b/bd/Porta_Maggiore_Alatri.jpg
[5] => http://upload.wikimedia.org/wikipedia/commons/6/6a/She-wolf_suckles_Romulus_and_Remus.jpg
[6] => http://upload.wikimedia.org/wikipedia/commons/4/45/Throne_of_Saturn_Louvre_Ma1662.jpg
)
And for the second URL (http://en.wikipedia.org/wiki/Hans-Ulrich_Rudel):
Array
(
[0] => http://upload.wikimedia.org/wikipedia/commons/e/e9/BmRKEL.jpg
[1] => http://upload.wikimedia.org/wikipedia/commons/3/3f/BmRKELS.jpg
[2] => http://upload.wikimedia.org/wikipedia/commons/2/2c/Bundesarchiv_Bild_101I-655-5976-04%2C_Russland%2C_Sturzkampfbomber_Junkers_Ju_87_G.jpg
[3] => http://upload.wikimedia.org/wikipedia/commons/6/62/Bundeswehr_Kreuz_Black.svg
[4] => http://upload.wikimedia.org/wikipedia/commons/9/99/Flag_of_German_Reich_%281935%E2%80%931945%29.svg
[5] => http://upload.wikimedia.org/wikipedia/en/6/64/HansUlrichRudel.jpeg
[6] => http://upload.wikimedia.org/wikipedia/commons/8/82/Heinkel_He_111_during_the_Battle_of_Britain.jpg
[7] => http://upload.wikimedia.org/wikipedia/commons/6/66/Regulation_WW_II_Underwing_Balkenkreuz.png
)
Note that the URL changed a bit on the 6th element of the second array. It's what #JosephJaber was warning about in his comment above.
Hope this helps someone.
I have written some code that gets main image (full URL) by Wikipedia article title. It's not perfect, but overall I'm very pleased with the results.
The challenge was that when queried for a specific title, Wikipedia returns multiple image filenames (without path). Furthermore, the secondary search (I used the code varatis posted in this thread - thanks!) returns URLs of all images found based on the image filename that was searched, regardless of the original article title. After all this, we may end up with a generic image irrelevant to the search, so we filter those out. The code iterates over filenames and URLs until it finds (hopefully the best) match... a bit complicated, but it works :)
Note on the generic filter: I've been compiling a list of generic image strings for the isGeneric() function, but the list just keeps growing. I am considering maintaining it as a public list - if there is any interest let me know.
Pre:
protected static $baseurl = "http://en.wikipedia.org/w/api.php";
Main function - get image URL from title:
public static function getImageURL($title)
{
$images = self::getImageFilenameObj($title); // returns JSON object
if (!$images) return '';
foreach ($images as $image)
{
// get object of image URL for given filename
$imgjson = self::getFileURLObj($image->title);
// return first image match
foreach ($imgjson as $img)
{
// get URL for image
$url = $img->imageinfo[0]->url;
// no image found
if (!$url) continue;
// filter generic images
if (self::isGeneric($url)) continue;
// match found
return $url;
}
}
// match not found
return '';
}
== The following functions are called by the main function above ==
Get JSON object (filenames) by title:
public static function getImageFilenameObj($title)
{
try // see if page has images
{
// get image file name
$json = json_decode(
self::retrieveInfo(
self::$baseurl . '?action=query&titles=' .
urlencode($title) . '&prop=images&format=json'
))->query->pages;
/** The foreach is only to get around
* the fact that we don't have the id.
*/
foreach ($json as $id) { return $id->images; }
}
catch(exception $e) // no images
{
return NULL;
}
}
Get JSON object (URLs) by filename:
public static function getFileURLObj($filename)
{
try // resolve URL from filename
{
return json_decode(
self::retrieveInfo(
self::$baseurl . '?action=query&titles=' .
urlencode($filename) . '&prop=imageinfo&iiprop=url&format=json'
))->query->pages;
}
catch(exception $e) // no URLs
{
return NULL;
}
}
Filter out generic images:
public static function isGeneric($url)
{
$generic_strings = array(
'_gray.svg',
'icon',
'Commons-logo.svg',
'Ambox',
'Text_document_with_red_question_mark.svg',
'Question_book-new.svg',
'Canadese_kano',
'Wiki_letter_',
'Edit-clear.svg',
'WPanthroponymy',
'Compass_rose_pale',
'Us-actor.svg',
'voting_box',
'Crystal_',
'transportation_inv',
'arrow.svg',
'Quill_and_ink-US.svg',
'Decrease2.svg',
'Rating-',
'template',
'Nuvola_apps_',
'Mergefrom.svg',
'Portal-',
'Translation_to_',
'/School.svg',
'arrow',
'Symbol_',
'stub',
'Unbalanced_scales.svg',
'-logo.',
'P_vip.svg',
'Books-aj.svg_aj_ashton_01.svg',
'Film',
'/Gnome-',
'cap.svg',
'Missing',
'silhouette',
'Star_empty.svg',
'Music_film_clapperboard.svg',
'IPA_Unicode',
'symbol',
'_highlighting_',
'pictogram',
'Red_pog.svg',
'_medal_with_cup',
'_balloon',
'Feature',
'Aiga_'
);
foreach ($generic_strings as $str)
{
if (stripos($url, $str) !== false) return true;
}
return false;
}
Comments welcome.
Lets take Example of Page http://en.wikipedia.org/wiki/index.html?curid=57570
to get Main Pic
Check out
prop=pageprops
action=query&pageids=57570&prop=pageprops&format=json
Results Page Data Eg.
{ "pages" : { "57570":{
"pageid":57570,
"ns":0,
"title":"Sachin Tendulkar",
"pageprops" : {
"defaultsort":"Tendulkar,Sachin",
"page_image":"Sachin_at_Castrol_Golden_Spanner_Awards_(crop).jpg",
"wikibase_item":"Q9488"
}
}
}
}}
We get main Pic file name this result as
** (wikiId).pageprops.page_image = Sachin_at_Castrol_Golden_Spanner_Awards_(crop).jpg**
Now as we have Image file name we will have to make another Api Call to get full image path from file name as follows
action=query&titles=Image:INSERT_EXAMPLE_FILE_NAME_HERE.jpg&prop=imageinfo&iiprop=url
Eg.
action=query&titles=Image:Sachin_at_Castrol_Golden_Spanner_Awards_(crop).jpg&prop=imageinfo&iiprop=url
Returns Array of Image Data having url in it as
http://upload.wikimedia.org/wikipedia/commons/3/35/Sachin_at_Castrol_Golden_Spanner_Awards_%28crop%29.jpg
I there is a way to reliably get a main image for a wikipedia page - the Extension called PageImages
The PageImages extension collects information about images used on a page.
Its aim is to return the single most appropriate thumbnail associated
with an article, attempting to return only meaningful images, e.g. not
those from maintenance templates, stubs or flag icons. Currently it
uses the first non-meaningless image used in the page.
https://www.mediawiki.org/wiki/Extension:PageImages
Just add the prop pageimages to your API Query:
/w/api.php?action=query&prop=pageimages&titles=Somepage&format=xml
This reliably filters out annoying default images and prevents you from having to filter them yourself! The extension is installed on all the main wikipedia pages...
Like Anuraj mentioned, the pageimages parameter is it. Look at the following url that'll bring about some nifty stuff:
https://en.wikipedia.org/w/api.php?action=query&prop=info|extracts|pageimages|images&inprop=url&exsentences=1&titles=india
Her are some interesting parameters:
The two parameters extracts and exsentences gives you a short
description you can use. (exsentences is the number of sentences you want to include in the excerpt)
The info and the inprop=url parameters gives you the url of the page
The prop property has multiple parameters separated by a bar symbol
And if you insert the format=json in there, it is even better
See this related question on an API for Wikipedia. However, I would not know if it is possible to retrieve the thumbnail picture through an API.
You can also consider just parsing the web page to find the image URL, and retrieve the image that way.
Here is my list of XPaths I have found work for 95 percent of articles. the main ones are 1, 2 3 and 4. A lot of articles are not formatted correctly and these would be edge cases:
You can use a DOM parsing lib to fetch image using the XPath.
static NSString *kWikipediaImageXPath2 = #"//*[#id=\"mw-content-text\"]/div[1]/div/table/tr[2]/td/a/img";
static NSString *kWikipediaImageXPath3 = #"//*[#id=\"mw-content-text\"]/div[1]/table/tr[1]/td/a/img";
static NSString *kWikipediaImageXPath1 = #"//*[#id=\"mw-content-text\"]/div[1]/table/tr[2]/td/a/img";
static NSString *kWikipediaImageXPath4 = #"//*[#id=\"mw-content-text\"]/div[2]/table/tr[2]/td/a/img";
static NSString *kWikipediaImageXPath5 = #"//*[#id=\"mw-content-text\"]/div[1]/table/tr[2]/td/p/a/img";
static NSString *kWikipediaImageXPath6 = #"//*[#id=\"mw-content-text\"]/div[1]/table/tr[2]/td/div/div/a/img";
static NSString *kWikipediaImageXPath7 = #"//*[#id=\"mw-content-text\"]/div[1]/table/tr[1]/td/div/div/a/img";
I used a ObjC wrapper called Hpple around libxml2.2 to pull out the image url. Hope this helps
You can also use cocoa Pod called SDWebImage
Code sample (remember to also add import SDWebImage):
func requestInfo(flowerName: String) {
let parameters : [String:String] = [
"format" : "json",
"action" : "query",
"prop" : "extracts|pageimages",//pageimages allows fetch imagePath
"exintro" : "",
"explaintext" : "",
"titles" : flowerName,
"indexpageids" : "",
"redirects" : "1",
"pithumbsize" : "500"//specify image size in px
]
AF.request(wikipediaURL, method: .get, parameters: parameters).responseJSON { (response) in
switch response.result {
case .success(let value):
print("Got the wikipedia info.")
print(response)
let flowerJSON : JSON = JSON(response.value!)
let pageid = flowerJSON["query"]["pageids"][0].stringValue
let flowerDescription = flowerJSON["query"]["pages"][pageid]["extract"].stringValue
let flowerImageURL = flowerJSON["query"]["pages"][pageid]["thumbnail"]["source"].stringValue //fetching Image URL
self.wikiInfoLabel.text = flowerDescription
self.imageView.sd_setImage(with: URL(string : flowerImageURL))//imageView updated with Wiki Image
case .failure(let error):
print(error)
}
}
}
I think not, but you can capture the image using a link parser HTML documents
Related
I have a PHP backend and I use it on my localhost so everything okay but I have a problem that the image URL I get from API is a wrong path and I can't change it from server-side so I decide to fix it on my client-side
I can display an image on my emulator with this path :
http://10.0.2.2:8000/storage/app/public/171/conversions/api-icon.jpg
and the API gives me this path
http://192.168.1.114/multi-restaurants/public/storage/app/public/171/conversions/api-icon.jpg
I fix it by making a function to change the path but it takes a lot of work like I should put this function in every place I wanna display an image!!
I'm sure there is a way to change the path directly from the model when I receive the api
here is my model
class Media {
String id;
String name;
String url;
String thumb;
String icon;
String size;
Media();
Media.fromJSON(Map<String, dynamic> jsonMap)
: id = jsonMap['id'].toString(),
name = jsonMap['name'],
url = jsonMap["url"] ,
thumb = jsonMap['thumb'],
icon = jsonMap['icon'],
size = jsonMap['formated_size'];
This function that I'm using in every class to change path Url
String changepath(String uuu) {
final uri = Uri.parse(uuu);
print("This is $uri");
if (uri.path.contains("multi-restaurants")) {
print("http://10.0.2.2:8000/${uri.pathSegments[2]}/${uri.pathSegments[3]}/${uri.pathSegments[4]}/${uri.pathSegments[5]}/${uri.pathSegments[6]}");
return"http://10.0.2.2:8000/${uri.pathSegments[2]}/${uri.pathSegments[3]}/${uri.pathSegments[4]}/${uri.pathSegments[5]}/${uri.pathSegments[6]}";
}
}
}
If the API constantly sending the URL in "http://192.168.1.114/multi-r ... " in this format, then there is a solution for it, don't know is this a smart move or anything you looking for.Just create String type Function,pass the url from json url = changer(jsonMap["url"]); and use it to convert the URL to desired URL.
String changer(String _string1) {
String _string2 = _string1.replaceAll("http://192.168.1.114/multi-restaurants/public", "http://10.0.2.2:8000");
return _string2;
}
I Solve the Problem with Create a Helper Class and put inside it a static function and call it in every widget I wanna display (Url)
static String changer(String _string1) {
String _string2 = _string1.replaceAll(
"http://192.168.1.114/multi-restaurants/public",
"http://10.0.2.2:8000");
print(_string2);
return _string2;
}
I want to save xml document in mysql database as below:
$_docId= $this->getRequest()->getParam('docId', 0);
$_slmData = '<SLM_form_v2 id="slmform"><group_1_1><section1_1/><name1_1>sdfds</name1_1>
enter code here<localname1_2/><selectcountry_1_3>Algeria</selectcountry_1_3></group_1_1>
enter code here<group_1_2><main_doc/><name_doc1>gdgf gfh f</name_doc1><sex_doc1>Male
</sex_doc1><name_institution_1>fsdgdfg</name_institution_1><address_institution_1>gdgfdgfd
</address_institution_1>....';
$_docMapper = new Model_Mapper_XMLDoc();
$_docModel = new Model_XMLDoc();
$_docModel ->doc_data = Zend_Json::encode($_docData);
if ($_docId != 0) {
$_docModel->id = $_docId;
$_docMapper->update($_docModel->toArray(), 'id = ' . $_docId);
$_action = 'update';
} else {
$_docMapper->insert($_docModel->toArray());
$_lastId = $_docMapper->getAdapter()->lastInsertId(); //gets the id of the last inserted record
$_action = 'add';
}
$this->_helper->json->sendJson(array(
'message' => 'success',
'id' => $_lastId,
'action' => $_action
));
It is stored in the db:
INSERT INTO crpcoreix.tbl_xml_doc (slm_data, web_gis_fk) VALUES ('"<SLM_form_v2 id=\\"slmform\\"><group_1_1><section1_1\\/><name1_1>sdfds<\\/name1_1><localname1_2\\/><selectcountry_1_3>Algeria<\\/selectcountry_1_3><\\/group_1_1><group_1_2><main_doc\\/><name_doc1>gdgf gfh f<\\/name_doc1><sex_doc1>Male<\\/sex_doc1><name_institution_1>fsdgdfg<\\/name_institution_1><address_institution_1>gdgfdgfd gdgf<\\/address_institution_1>...', null);
When I'm trying to read the data from the database and display the encoded xml tags the output miss the first part ("<SLM_form_v2 id=\\"slmform\\"><group_1_1><section1_1\\/><name1_1>...) the first part
Array
(
[id] => 1
[xml_data] => "sdfds<\/name1_1>Algeria<\/selectcountry_1_3>...'
[web_gis_fk] =>
)
Please advice how to fix and if there is a better way to store xml documents in database.
Thanx,
first is first you need to double check your given xml syntax, whether its correct or not.
Zend_Json includes a static function called Zend_Json::fromXml(). so
you would better manage to use it instead of Zend_Json::encode when
you encode XML
This function will generate JSON from a given XML input.
This function takes any arbitrary XML string as an input parameter. It also takes an optional Boolean input parameter to instruct the conversion logic to ignore or not ignore the XML attributes during the conversion process.
If this optional input parameter is not given, then the default behavior is to ignore the XML attributes. This function call is made as shown below:
$jsonContents = Zend_Json::fromXml($xmlStringContents, true);
Give it a try but again check your XML syntax
another (probably) easy question for you.
As I wrote many times, I'm not a programmer but thanks to you I was able to build an interesting site for my uses.
So again thx.
This is my new problem:
I have a site that recover json data from a file and store it locally once a day (this is the code - I post it so it can be useful to anyone):
// START
findList();
function findList() {
$serverName = strtolower($_POST["Server"]); // GET SERVER FROM FORM POST
$localCoutryList = ("json/$serverName/countries.json"); // LOCAL LOCATIONS OF FILE
$urlCountryList = strtolower("url.from.where.I.get.Json.file.$serverName/countries.json"); // ONLINE LOCATION OF FILE
if (file_exists($localCoutryList)) { // FILE EXIST
$fileLastMod = date("Y/m/d",filemtime($localCoutryList)); // IF FILE LAST MOD = FILE DATE TIME
$now = date('Y/m/d', time()); // NOW
if ($now != $fileLastMod) { // IF NOW != FILE LAST MOD (date)
createList($serverName,$localCoutryList,$urlCountryList); // GO AND CREATE FILE
} else { // IF NOW = FILE LAST MOD (date)
jsonList($serverName,$localCoutryList,$urlCountryList); // CALL JSON DATA FROM FILE
}} else { // FILE DOESN'T EXIST
createList($serverName,$localCoutryList,$urlCountryList); // CALL CREATE FILE
}};
function createList($serverName,$localCoutryList,$urlCountryList) {
file_put_contents($localCountryList, file_get_contents($urlCountryList)); // CREATE FILE
jsonList($serverName); // CALL JSON DATA FROM FILE
};
function jsonList($serverName,$localCoutryList,$urlCountryList) { // JSON DATA FROM FILE
$jsonLoopCountry = file_get_contents($localCoutryList); // GET CONTENT OF THE LOCAL FILE
$outLoopCountry = json_decode($jsonLoopCountry, true); // DECODE JSON FILE
$countryList = $outLoopCountry['country']; // ACCESS TO JSON OBJECT
foreach ($countryList as $dataCountry) { // FOR EACH "COUNTRY" IN JSON DECODED OBJECT
// SET VARS FOR ALL THE COUNTRIES
$iscountryID = ($dataCountry['id']); // TEST
$countryCurrency = ($dataCountry['curr']); // TEST
$countryName = ($dataCountry['name']);} // TEST
echo "<br>total country list: ".count($countryList); // THIS RETURN TOTAL ELEMENTS IN THE ARRAY
[...]
The kind of Json data I'm working with is structured as it follows:
{"country":[{"id":130,"curr":"AFN","name":"Afghanistan"},{"id":55,"curr":"ALL","name":"Albania"},{"id":64,"curr":"DZD","name":"Algeria"},{"id":65,"curr":"AOA","name":"Angola"},{"id":24,"curr":"ARS","name":"Argentina"}...
So I can say that
$outLoopCountry = json_decode($jsonLoopCountry, true); // DECODE JSON FILE
creates the JSON ARRAY right? (Because JSON array starts with {"something":[{"..."...)
if it was [{"a":"answer","b":"answer",...} ... it would have a been a JSON Object right?
So to access to the array I uses
$countryList = $outLoopCountry['country']; // ACCESS TO JSON OBJECT
right?
So I understood that arrays are a fast useful ways to store relational data and access to it right?
So the question is how to make a precise search inside the array, so to make it works like a Query MySQLi, like "SEARCH INSIDE ARRAY WHERE id == 7" and have as a result "== ITALY" (ex.).
The way exist for sure, but with whole exmples I found on the net, I was able to fix one to make it works.
Thx again.
Alberto
I am writing an app in which i am allowing user to enter their product details into a framework [i have made on PHP] and then by using that framework i am encoding data into json and that encoded data i am retrieving in my Android App using web services.
But now issue is i am not able to encode web image url therefore i am not getting correct image path in json:-
Like:
I have given this link to encode into JSON:
"imageurl" => "http://www.libpng.org/pub/png/img_png/pnglogo-blk-sml1.png"
but whenever i see encoded web image url in JSON getting something like this:
"imagepath":"http:\/\/www.libpng.org\/pub\/png\/img_png\/pnglogo-blk-sml1.png"
why i am getting wrong image url....
I don't know how can i resolve this issue..please someone help me....
albums.php:
<?php
/*
*Simple JSON generation from static array
*Author: JSR
*/
include_once './data.php';
$albums = array();
// looping through each album
foreach ($album_tracks as $album) {
$tmp = array();
$tmp["id"] = $album["id"];
$tmp["name"] = $album["album"];
$tmp["description"] = $album["detail"];
$tmp["imagepath"] = $album["imageurl"];
$tmp["songs_count"] = count($album["songs"]);
// push album
array_push($albums, $tmp);
}
//printing json
echo json_encode($albums);
?>
data.php:
<?php
/*
*Simple JSON generation from static array
*Author: JSR
*/
$album_tracks = array(
1 => array(
"id" => 1,
"album" => "Appetizers",
"detail" => "All appetizers served with mint and tamarind chutneys.",
"imageurl" => "http://www.libpng.org/pub/png/img_png/pnglogo-blk-sml1.png"
)
));
?>
For me it works
<script>
var str="http:\/\/www.libpng.org\/pub\/png\/img_png\/pnglogo-blk-sml1.png";
document.write("<img src='" + str + "'/>");
</script>
Whenever you send image path or url in json, it escapes all the slashes, quotes, etc. So you can use some regex like s/\////gis to get rid of these escapes characters and get the original url or location.
Hi, is there a way to download the BibTeX entry for something from Google Scholar using PHP without having to download the BibTeX manually one by one? For example, setting a search value like "research" and then downloading the related BibTeX from the links automatically through code.
Any help would be appreciated. I tried to get the HTML page, but as I try to get the page contents the "Import to BibTeX" link disappears on the retrieved page contents.
My code:
<?php
$url = 'http://scholar.google.com/scholar?q=honors+college&hl=en&btnG=Search& amp;as_sdt=1%2C4&as_sdtp=on';
$needle = 'Import into bibtex';
$contents = file_get_contents($url);
echo $contents;
if(strpos($contents, $needle)!== false) {
echo 'found';
} else {
echo 'not found';
}
?>
The short answer is No you cannot do this
Google does not provide API's for search / scholar and uses firm rate-limitation. The problem is that for each BibTex entry you need 2 additional requests (1 for the query, 1 for the 'import link' and a final one to get the actual BibTex entry content)
I wrote a script that scrapes google scholar results and finds the BibTex links and saves the results. However, due to the rate limit is not viable and will get blocked almost instantly.
Code can be viewed here: https://gist.github.com/Tessmore/11099509 and is free of use, but at your own risk.
As Tessmore said - you can't. But you can make it work by using Google Scholar Organic Results API from SerpApi that bypasses quota limits and blocks from search engines so you don't have to think about how to reduce the chance of being blocked.
Example:
Install google-search-results-php package first via composer:
$ composer require serpapi/google-search-results-php:2.0
Code to integrate and full example in the online IDE:
<?php
ini_set("display_errors", 1);
ini_set("display_startup_errors", 1);
error_reporting(E_ALL);
require __DIR__ . "/vendor/autoload.php";
function getResultIds () {
$result_ids = array();
$params = [
"engine" => "google_scholar", // parsing engine
"q" => "biology" // search query
];
$search = new GoogleSearch(getenv("API_KEY"));
$response = $search->get_json($params);
foreach ($response->organic_results as $result) {
// print_r($result->result_id);
array_push($result_ids, $result->result_id);
}
return $result_ids;
}
function getBibtexData () {
$bibtex_data = array();
foreach (getResultIds() as $result_id) {
$params = [
"engine" => "google_scholar_cite", // parsing engine
"q" => $result_id
];
$search = new GoogleSearch(getenv("API_KEY"));
$response = $search->get_json($params);
foreach ($response->links as $result) {
if ($result->name === "BibTeX") {
array_push($bibtex_data, $result->link);
}
}
}
return $bibtex_data;
}
print_r(json_encode(getBibtexData(), JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES));
?>
Output:
[
"https://scholar.googleusercontent.com/scholar.bib?q=info:KNJ0p4CbwgoJ:scholar.google.com/&output=citation&scisdr=CgXjqB_WGAA:AAGBfm0AAAAAYkm8amenawYn_EBidiCQT5QBh0L1KJEX&scisig=AAGBfm0AAAAAYkm8at9X4P3eIWKUCOc6UriCEDKVsQE0&scisf=4&ct=citation&cd=-1&hl=en",
"https://scholar.googleusercontent.com/scholar.bib?q=info:6zRLFbcxtREJ:scholar.google.com/&output=citation&scisdr=CgWhqfi6GAA:AAGBfm0AAAAAYkm8bDoIhTlfTkQFCOzYGax54Bst576o&scisig=AAGBfm0AAAAAYkm8bMe_7Nq4e4pB5lg_eR9jmeGrO8ek&scisf=4&ct=citation&cd=-1&hl=en",
"https://scholar.googleusercontent.com/scholar.bib?q=info:6Yb0qOX88FMJ:scholar.google.com/&output=citation&scisdr=CgXn_4MdGAA:AAGBfm0AAAAAYkm8bi8ypCZcFDNEQZYZeoSlvx-U1OSk&scisig=AAGBfm0AAAAAYkm8bnFMnwTWGfkfJDCNEx0C4n-aQwql&scisf=4&ct=citation&cd=-1&hl=en",
"https://scholar.googleusercontent.com/scholar.bib?q=info:HFdEElNr3IgJ:scholar.google.com/&output=citation&scisdr=CgXKCFpQGAA:AAGBfm0AAAAAYkm8byukcQCl4WHQx-nSNp2pC1gUFSKG&scisig=AAGBfm0AAAAAYkm8b8EReTVkLwtxfth_pjwMyyY3dqts&scisf=4&ct=citation&cd=-1&hl=en",
"https://scholar.googleusercontent.com/scholar.bib?q=info:bs-D_MeC14YJ:scholar.google.com/&output=citation&scisdr=CgXEUXwWGAA:AAGBfm0AAAAAYkm8bwwfMNJrffe16EaGypsem9JlmGTi&scisig=AAGBfm0AAAAAYkm8b6nWlPOQL63fXg6dV2U-JQbpyQyS&scisf=4&ct=citation&cd=-1&hl=en",
"https://scholar.googleusercontent.com/scholar.bib?q=info:Rn1qFVLRfKwJ:scholar.google.com/&output=citation&scisdr=CgU-HswkGAA:AAGBfm0AAAAAYkm8cHE1YRK23eHV8nzF89Eem-Bsuz72&scisig=AAGBfm0AAAAAYkm8cDEj8ZrzZjAo2bNX-tjYYYJYQZay&scisf=4&ct=citation&cd=-1&hl=en",
"https://scholar.googleusercontent.com/scholar.bib?q=info:d8thHtTwq6YJ:scholar.google.com/&output=citation&scisdr=CgXj7oe9GAA:AAGBfm0AAAAAYkm8cTYamCKGKImjdg5MQdgbxUIIHAEY&scisig=AAGBfm0AAAAAYkm8cTcop1ceKzKYvKAKtvlSQ1EdEtSN&scisf=4&ct=citation&cd=-1&hl=en",
"https://scholar.googleusercontent.com/scholar.bib?q=info:IUmhOhGaDaEJ:scholar.google.com/&output=citation&scisdr=CgU0qZ2_GAA:AAGBfm0AAAAAYkm8ctCPwoihZkjbNcdEqSnwa0J3jwDy&scisig=AAGBfm0AAAAAYkm8cingBcYnEp8YRqFDFdN-FAEBgDT7&scisf=4&ct=citation&cd=-1&hl=en",
"https://scholar.googleusercontent.com/scholar.bib?q=info:PWsf8O5OMQEJ:scholar.google.com/&output=citation&scisdr=CgVBAJxXGAA:AAGBfm0AAAAAYkm8c3CDKQG0Wh_lWsXU_DZxEJkwZz5y&scisig=AAGBfm0AAAAAYkm8c6I-HjAxD1Gy6FLFDRdxH_qU4OBr&scisf=4&ct=citation&cd=-1&hl=en",
"https://scholar.googleusercontent.com/scholar.bib?q=info:yGvgHH8ROuIJ:scholar.google.com/&output=citation&scisdr=CgXFuhOkGAA:AAGBfm0AAAAAYkm8dD0rcSR4LQF8GgTxx865BADtXNDN&scisig=AAGBfm0AAAAAYkm8dIQhodz3rHF9IUdaCSRlhdudACNQ&scisf=4&ct=citation&cd=-1&hl=en"
]
Bibtex data from the first URL:
#article{woese2004new,
title={A new biology for a new century},
author={Woese, Carl R},
journal={Microbiology and molecular biology reviews},
volume={68},
number={2},
pages={173--186},
year={2004},
publisher={Am Soc Microbiol}
}
Disclaimer, I work for SerpApi.