I am planning on creating a small website for my personal book collection. To automate the process a little bit, I would like to create the following functionality:
The website will ask me for the ISBN number of the book and will then automatically fetch the title and add it to my database.
Although I am mainly interested in doing this in PHP, I also have some Java implementation ideas. I believe it would also help if the answer were as language-agnostic as possible.
This is the LibraryThing founder. We have nothing to offer here, so I hope my comments will not seem self-serving.
First, the comment about Amazon, ASINs and ISBN numbers is wrong in a number of ways. In almost every circumstance where a book has an ISBN, the ASIN and the ISBN are the same. ISBNs are not now all 13 digits; they can be either 10 or 13 digits long. Ten-digit ISBNs can be expressed as 13-digit ones starting with 978, which means every ISBN currently in existence has both a 10- and a 13-digit form. There are all sorts of libraries available for converting between ISBN10 and ISBN13. Basically, you add 978 to the front and recalculate the check digit at the end.
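As an illustration, a minimal PHP sketch of that conversion (the helper name is made up, and it does no validation of the input check digit):

<?php
// Convert an ISBN-10 to its ISBN-13 form: prefix "978", drop the old
// check digit, then recompute the EAN-13 check digit.
function isbn10_to_isbn13($isbn10) {
    $core = '978' . substr(preg_replace('/[^0-9Xx]/', '', $isbn10), 0, 9);
    $sum = 0;
    for ($i = 0; $i < 12; $i++) {
        $sum += ((int) $core[$i]) * ($i % 2 === 0 ? 1 : 3);
    }
    return $core . ((10 - ($sum % 10)) % 10);
}

echo isbn10_to_isbn13('0306406152'); // prints 9780306406157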
ISBN13 was invented because publishers were running out of ISBNs. In the near future, when 979-based ISBN13s start being used, they will not have an ISBN10 equivalent. To my knowledge, there are no published books with 979-based ISBNs yet, but they are coming soon. Anyway, the long and short of it is that Amazon uses the ISBN10 form for all 978-based ISBN13s. In any case, whether Amazon uses ten- or thirteen-digit ASINs, you can search Amazon by either just fine.
Personally, I wouldn't put ISBN DB at the top of your list. ISBN DB mines from a number of sources, but it's not as comprehensive as Amazon or Google. Rather, I'd look into Amazon—including the various international Amazons—and then the new Google Book Data API and, after that, the OpenLibrary API. For non-English books, there are other options, like Ozone for Russian books.
If you care about the highest-quality data, or if you have any books published before about 1970, you will want to look into data from libraries, available by Z39.50 protocol and usually in MARC format, or, with a few libraries in Dublin Core, using the SRU/SRW protocol. MARC format is, to a modern programmer, pretty strange stuff. But, once you get it, it's also better data and includes useful fields like the LCCN, DDC, LCC, and LCSH.
LibraryThing runs off a homemade Python library that queries some 680 libraries and converts the many flavors of MARC into Amazon-compatible XML, with extras. We are currently reluctant to release the code, but may release it as a service soon.
Google has its own API for Google Books that lets you query the Google Books database easily. The protocol is JSON-based and you can view the technical information about it here.
You essentially just have to request the following URL:
https://www.googleapis.com/books/v1/volumes?q=isbn:YOUR_ISBN_HERE
This will return the information about the book in JSON format.
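For example, a minimal PHP sketch of that lookup (no error handling, and the ISBN is just an example):

<?php
// Look up a title on the Google Books API by ISBN.
$isbn = '9780306406157'; // example ISBN
$url  = 'https://www.googleapis.com/books/v1/volumes?q=isbn:' . urlencode($isbn);

$json = file_get_contents($url);   // requires allow_url_fopen
$data = json_decode($json, true);

if (!empty($data['items'])) {
    echo $data['items'][0]['volumeInfo']['title'];
} else {
    echo 'No match for this ISBN';
}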
Check out the ISBN DB API. It's a simple REST-based web service. I haven't tried it myself, but a friend has had successful experiences with it.
It'll give you the book title, author information and, depending on the book, a number of other details you can use.
Try https://gumroad.com/l/RKxO
I purchased this database about 3 weeks ago for a book citation app I'm making. I haven't had any quality problems, and virtually every book I scanned was found. The only problem is that they provide the file as CSV, and converting the 20 million lines took me almost an hour! Also, the monthly updates are not deltas; the entire database is sent each time, which works for me but might be some work for others.
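If anyone else has to do the same CSV-to-MySQL conversion, MySQL's LOAD DATA INFILE is far faster than row-by-row inserts. A rough sketch, assuming a books table and column names that will almost certainly differ from the real file:

<?php
// Bulk-load a large CSV into MySQL; much faster than individual INSERTs.
$pdo = new PDO('mysql:host=localhost;dbname=books', 'user', 'pass', [
    PDO::MYSQL_ATTR_LOCAL_INFILE => true,
]);

$sql = <<<'SQL'
LOAD DATA LOCAL INFILE '/path/to/books.csv'
INTO TABLE books
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(isbn, title, author, publisher)
SQL;

$pdo->exec($sql);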
I haven't tried it, but take a look at isbndb
API Description: Introduction
ISBNdb.com's remote access application programming interface (API) is designed to allow other websites and standalone applications to use the vast collection of data collected by ISBNdb.com since 2003. As of this writing, in July 2005, the data includes nearly 1,800,000 books; almost 3,000,000 library records; close to a million subjects; hundreds of thousands of author and publisher records parsed out of library data; and more than 10,000,000 records of actual and historic prices.
Some ideas of how the API can be used include:
- Cataloguing home book collections
- Building and verifying bookstores' inventories
- Empowering forums and online communities with more useful book references
- Automated cross-merchant price lookups over messaging devices or phones
Using the API you can look up information by keywords, by ISBN, by authors or publishers, etc. In most situations the API is fast enough to be used in interactive applications.
The data is heavily cross-linked -- starting at a book you can retrieve information about its authors, then other books of these authors, then their publishers, etc.
The API is primarily intended for use by programmers. The interface strives to be platform and programming language independent by employing open standard protocols and message formats.
Although the other answers are correct, this one explains the process in a little more detail. It uses the Google Books API.
https://giribhatnagar.wordpress.com/2015/07/12/search-for-books-by-their-isbn/
All you need to do is
1. Create an appropriate HTTP request
2. Send it and receive the JSON object containing details about the book
3. Extract the title from the received information
The response you get is in JSON. The code given on the above site is for Node.js, but I'm sure it won't be difficult to reproduce in PHP (or any other language for that matter); a rough PHP equivalent is sketched below.
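A minimal PHP/cURL sketch of those three steps (no error handling; PHP 7+ for the ?? operator):

<?php
// 1. Build the request
$isbn = '9780306406157';
$ch = curl_init('https://www.googleapis.com/books/v1/volumes?q=isbn:' . urlencode($isbn));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

// 2. Send it and receive the JSON object
$response = curl_exec($ch);
curl_close($ch);

// 3. Extract the title
$data  = json_decode($response, true);
$title = $data['items'][0]['volumeInfo']['title'] ?? 'not found';
echo $title;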
To obtain data for a given ISBN you need to interact with an online service like isbndb.
One of the best sources of bibliographic information is the Amazon web service. It provides you with all the bibliographic info plus the book cover.
You might want to look into LibraryThing, it has an API that would do what you want and they handle things like mapping multiple ISBNs for different editions of a single "work".
As an alternative to isbndb (which seems like the perfect answer) I had the impression that you could pass an ISBN into an Amazon product URL to go straight to the Amazon page for the book. While this doesn't programmatically return the book title, it might have been a useful extra feature in case you wanted to link to Amazon user reviews from your database.
However, this link appears to show that I was wrong. What Amazon actually uses is the ASIN, and while this used to be the same as the 10-digit ISBN, those are no longer the only kind: ISBNs now also come in 13 digits (though there is a straight conversion from the old 10-digit type).
But more usefully, the same link does talk about the Amazon API which can convert from ISBN to ASIN and is likely to also let you look up titles and other information. It is primarily aimed at Amazon affiliates, but no doubt it could do the job if for some reason isbndb does not.
Edit: Tim Spalding above points out a few practical facts about ISBNs - I was slightly too pessimistic in assuming that ASINs would not correspond any more.
You may also try this database: http://www.usabledatabases.com/database/books-isbn-covers/
It's got more books/ISBNs than most web services you can currently find on the web. But it's probably overkill for your small site.
I have a dilemma that I need to figure out.
So I am building a website where people can go to watch a competitive game (such as Counter-Strike: Global Offensive), perhaps using either a Twitch TV stream or the matchmaking streaming service that the game may offer (in the case of this example, CS:GO TV). While playing, members can place "bets" on which team will win, using some form of credits with no real value. Of course, the issue here is that the site will need to be able to pull the score from the game and update it in real time. So, sticking with the example of CS:GO, is there a portion of the Steamworks API that would allow real-time pulling of a game's score through some kind of PHP or JavaScript method?
I'm sorry to tell you that you can't, for now.
The API description of CS:GO Competitive Match Information says:
It would be interesting to be able to find out competitive match information -- exactly like what DOTA 2 has. It could contain all the players in the map, with their steamids and competitive ranks, the score at half time/full time. There are probably a few more bits of info that could also be included. Pigophone2 16:54, 14 September 2013 (PDT)
To answer your question, there is no Steam developed API that does this.
However many websites still do exactly what you are looking for.
My guess is that they use a regularly updated script which parses websites like ESEA and ESL and pulls data about those matches. After all, they are the ones who host almost all the big games that people care about.
You'll need to keep up to date with private leagues though, as they don't typically publish live stats in an easily parseable format. GOSU Gamers can help you track any new players that come to the big-league table.
I recently got an internship involving writing an SQL database and don't really know where to start.
The database will hold information on a ton of electrical products, including but not limited to watt usage, brand, year, color, size, and country of origin. The database's main task is to use some formulas on the above information to output several energy-related things. I will also be building a simple GUI for it. The database will be accessed solely on Windows computers by no more than 100 people, probably closer to 20.
The organization does not normally give internships, and especially not to programmers (they're all basically electrical engineers); I got it by really insisting to a relative who works there and who was looking into how to organize some of the products they oversee. In other words, I can't really ask THEM for guidance on the matter since they're not programmers, which is why I headed here to get a feel for where I'm starting.
I digress -- my main concerns are:
What is some recommended reading or viewing for this? Tips, tricks?
How do I set up a server for this? What hardware/bandwidth/etc. do I require?
How do I set up the Database?
For the GUI client, I decided to take a look at having it be a window showing a webpage built with the SQL embedded into PHP. Is this a good idea? Recommended reading for doing that? What are the alternatives?
What security measures would you recommend? Recommended reading?
I have: several versions of Microsoft's SQL Server, classroom experience with MySQL and PHP, several versions of Visual Studio, access to old PCs for testing (up to and including switching operating systems, hardware, etc.), access to a fairly powerful PC (non-modifiable), and unlimited bandwidth.
Any help would be appreciated, thanks guys!
What is some recommended reading or viewing for this? Tips, tricks?
I'd recommend spending quite a bit of time in the design stage, before you even touch a computer. Just grab some scrap paper and a pencil and start sketching out various "screens" that your UI might expose at various stages (from menus to inputs and outputs); show them to your target users and see if your understanding of the application fits with the functionality they expect/require; consider when, where, how and why they will access and use the application; refine your design.
You will then have a (vague) functional specification, from which you will be able to answer some of the further questions posed below so that you can start researching and identifying the technical specification: at this stage you may settle upon a particular architecture (web-based?), or certain tools and technologies (PHP and MySQL?). You can then identify further resources (tutorials?) to help progress toward implementation.
How do I set up a server for this? What hardware/bandwidth/etc. do I require?
Other than the number of users, your post gives very little indication of likely server load from which this question can be answered.
How much data will the database store ("a ton of electrical products" is pretty vague)? What operations will need to be performed ("use some formulas ... to output several energy-related things" is pretty vague)? What different classes of user will there be, and what activities will they perform? How often will those activities write data to and read data from the database (e.g. write 10KiB once a month, or read 10GiB thousands of times per second)? Whilst you anticipate 20 users, will they all be active simultaneously, or will there typically only be one or two at any given time? How critical is the application (in terms of the reliability/performance required)?
Perhaps, for now, just install MySQL and see how you fare?
How do I set up the Database?
As in, how should you design the schema? This will depend upon the operations that you intend to perform. However, a good starting point might be a table of products:
CREATE TABLE products (
product_id SERIAL,
power INT UNSIGNED COMMENT 'watt usage',
brand VARCHAR(255),
year INT UNSIGNED,
color VARCHAR(15),
size INT UNSIGNED,
origin CHAR(2) COMMENT 'ISO 3166-1 country code'
);
Depending upon your requirements, you may then wish to create further tables and establish relationships between them.
For the GUI client, I decided to take a look at having it be a window showing a webpage built with the SQL embedded into PHP. Is this a good idea? Recommended reading for doing that? What are the alternatives?
A web-based PHP application is certainly one option, for which you will find a ton of helpful free resources (tutorials, tools, libraries, frameworks, etc.) online. It also is highly portable (as virtually every device has a browser which will be able to interact with your application, albeit that ensuring smooth cross-browser compatibility and good cross-device user experience can be a bit painful).
There are countless alternatives, using virtually any/every combination of languages, runtime environments and architectures that you could care to mention: from Java servlets to native Windows applications, from iOS apps to everything in between, the choice is limitless. However, the best advice is probably to stick to that with which you are already most comfortable/familiar (provided that it can meet the functional requirements of the application).
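For illustration, a bare-bones PHP page that lists the products table from the schema above via PDO; the database name and credentials are placeholders:

<?php
// Minimal read-only listing page for the products table (credentials are placeholders).
$pdo = new PDO('mysql:host=localhost;dbname=energy;charset=utf8mb4', 'user', 'pass');
$pdo->setAttribute(PDO::ATTR_DEFAULT_FETCH_MODE, PDO::FETCH_ASSOC);

echo '<table><tr><th>Brand</th><th>Year</th><th>Watts</th><th>Origin</th></tr>';
foreach ($pdo->query('SELECT brand, year, power, origin FROM products ORDER BY brand') as $row) {
    echo '<tr><td>' . htmlspecialchars((string) $row['brand'])
       . '</td><td>' . (int) $row['year']
       . '</td><td>' . (int) $row['power']
       . '</td><td>' . htmlspecialchars((string) $row['origin']) . '</td></tr>';
}
echo '</table>';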
What security measures would you recommend? Recommended reading?
This is another pretty open-ended question. If you are developing a web app, I'd at the very least make yourself aware of (and learn how to defend against) SQL injection, XSS attacks, and techniques for managing user-account security. Each of these areas alone is quite a large topic, and that's before one even begins to consider the security of the hosting platform or physical environment.
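As a small taste of the first two points, a sketch of a parameterized query plus output escaping, using the example products table from above (connection details are again placeholders):

<?php
// Placeholder connection; reuse whatever PDO handle the application already has.
$pdo = new PDO('mysql:host=localhost;dbname=energy;charset=utf8mb4', 'user', 'pass');

// SQL injection: never concatenate user input into SQL; use a prepared statement.
$brand = $_GET['brand'] ?? '';
$stmt  = $pdo->prepare('SELECT brand, year, power FROM products WHERE brand = ?');
$stmt->execute([$brand]);

// XSS: never echo untrusted data into HTML without escaping it.
foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
    echo htmlspecialchars($row['brand'], ENT_QUOTES, 'UTF-8') . ' (' . (int) $row['year'] . ")<br>\n";
}

// User accounts: store only password_hash() output and check it with password_verify().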
I was thinking about using either the Google or Yahoo API to calculate the distance from one zip code to another, and to get the city for that zip code. However, the API calls are limited, and the website I am working on will query the API multiple times across multiple pages.
I was wondering where can I go for either a database with zip codes and cities, or a zip code database to query lat / long for distance.
I did some googling, and most of the free ones I downloaded were either not accurate or too large to fit in the database.
Thanks
I will be using PHP
In the past I have used this PHP class. While I haven't used it very extensively, it did what I needed it to do in terms of Zip Code lookup and distance.
Commercial zip code databases with lat/long are available. They are not expensive and are not large (well, if you restrict to the USA, 40K small records or so). I have had good luck with zip-finder.com in the past, but one important caveat: once you begin maintaining your own zip code table(s), you will need to keep them in sync with whatever the USPS does with zip codes over time. One really irritating thing they do is remove zip codes.
That said, calculating distance is pretty trivial, but you only get one lat/long point per zip code (more or less the centroid of the area). For a large zip code, your distance accuracy can have a mile or more of slop in it, so be aware of that.
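For reference, the usual great-circle (haversine) calculation between two centroids looks roughly like this in PHP:

<?php
// Great-circle (haversine) distance between two lat/long points, in miles.
function haversine_miles($lat1, $lon1, $lat2, $lon2) {
    $r = 3958.8; // mean Earth radius in miles
    $dLat = deg2rad($lat2 - $lat1);
    $dLon = deg2rad($lon2 - $lon1);
    $a = sin($dLat / 2) ** 2
       + cos(deg2rad($lat1)) * cos(deg2rad($lat2)) * sin($dLon / 2) ** 2;
    return $r * 2 * asin(sqrt($a));
}

// Example: New York City to Los Angeles, roughly 2,450 miles.
echo haversine_miles(40.7128, -74.0060, 34.0522, -118.2437);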
This is the best free zip code database you will find: http://federalgovernmentzipcodes.us/
It works fine if you don't need super-accurate lat/lng. Unfortunately the lat/lng values are not that accurate; not because of the two decimal places, but rather because they are simply a bit off.
I reworked this database to make the lat/lng more accurate by hitting the Google Maps geocoding API:
maps.googleapis.com/maps/api/geocode/json?address="zip code city state";
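Roughly how that lookup can be scripted; note that Google's Geocoding API now requires an API key and has usage limits, and the key below is a placeholder:

<?php
// Geocode "zip city state" via the Google Maps Geocoding API (key is a placeholder).
$address = urlencode('10033 New York NY');
$url = "https://maps.googleapis.com/maps/api/geocode/json?address=$address&key=YOUR_API_KEY";

$data = json_decode(file_get_contents($url), true);
if ($data['status'] === 'OK') {
    $loc = $data['results'][0]['geometry']['location'];
    printf("lat: %f, lng: %f\n", $loc['lat'], $loc['lng']);
}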
You can try this API
http://ws.geonames.org/postalCodeSearch?postalcode=10033
Gives out results with lat and long too
Check out PHP-ZipCode-Class. You can adapt it to any number of zip code databases. Personally, I would go with a commercial database, as the free ones can easily become outdated. Trust me on that: I tried to maintain a free database on a high-traffic e-commerce site for years. It would have been a LOT CHEAPER to just buy a commercial database. If you really insist on a free database, here are a couple that I know of (but have not tried).
http://zipcode.jassing.com/
http://federalgovernmentzipcodes.us/
You're correct about the API limitations. It is true of Bing as well (although their API is pretty good).
If you go with a database...
Although it takes a little work, you can get it free from the US Census TIGER data, which is what most low-end third-party ZIP Code databases are based on. Just know that in 2000 they replaced the ZIP Code with the ZCTA (which is ZIP Code-like, but not exact). I've included the link below, which has an explanation of ZCTA from the census site: http://www.census.gov/geo/ZCTA/zcta.html
Other things to consider: most latitude and longitude centroids are based on geometric calculations, meaning they could fall in the middle of forestry land, large lakes, or parks (e.g. Central Park) where no people live. I'm not sure what your needs are, but that may be fine for you. If you need population-based centers, you will probably need commercial data (see http://greatdata.com/zip-codes-lat-long and click the 'more details' tab at the top for an explanation of this topic).
Also, determine whether you only need the major city for each ZIP Code (a one-to-one relationship, normally 40,000+ records) or, where a ZIP Code boundary covers more than one city, whether you need each city listed as a separate record (~57,000 records). Most locators and address-validation utilities need the latter.
I've been using the zip-code database from http://zip-info.com/ for many years. It's updated every quarter (very important) and is very accurate. The database is about $50 and I purchase an update twice a year.
There are something like 54,000 five-digit zip codes in the US, so any good database is going to be large; just strip out the data fields you don't need (limit it to zip/lat/lon) if you want to reduce data storage (though it's minimal savings). As Rob said, distance calcs are easy to do: just look for a script that does great-circle calculations as a starting point.
I want to write a little script with which I can Google my keywords daily.
What is the best approach for this?
If I use the API (although I don't think there is one for this task), is there a limit?
I want to check for the first 100-200 results.
1. Do your search manually once and copy the resulting URL that points to the results page.
2. Write a PHP script that:
   - fetches the content from that URL using file_get_contents()
   - parses the full HTML result back into a PHP array containing only the search result data that is relevant to you (a rough sketch follows below)
   - writes the array to a database or the file system
3. Run the PHP script as a cron job on your server (hourly, daily, whatever you prefer).
4. Be prepared to update your script whenever Google changes the format of its results page.
5. Get yourself a lawyer.

Better yet, get yourself a commercial license as indicated by mario. That way you can skip all the steps above (especially 4 and 5).
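If you do go the scrape-it-yourself route anyway, the skeleton looks something like this; the search URL parameters and the XPath selector are guesses and will break whenever Google changes its markup, which is exactly the maintenance burden of step 4:

<?php
// Fetch a results page and pull out result links. The XPath below is a guess
// at the current markup and WILL need updating when Google changes its HTML.
$url = 'https://www.google.com/search?q=' . urlencode('my keyword') . '&num=100';
$ctx = stream_context_create(['http' => ['header' => "User-Agent: Mozilla/5.0\r\n"]]);
$html = file_get_contents($url, false, $ctx);

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // real-world HTML is rarely well-formed
$doc->loadHTML($html);

$xpath = new DOMXPath($doc);
$results = [];
foreach ($xpath->query('//h3/a') as $link) {   // hypothetical selector
    $results[] = [
        'title' => trim($link->textContent),
        'url'   => $link->getAttribute('href'),
    ];
}

file_put_contents('results-' . date('Y-m-d') . '.json', json_encode($results));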
Your big problem is that Google results are very customised now - depending on what you are searching for, results can be customised based on your exact location (not just country), time of day, search history, etc.
Hence, your results probably won't be completely constant and certainly won't be the same as for somebody a few miles away with a different browser history, even if they search for exactly the same thing.
There are various SEO companies offering tools to make the results more standardised, and these tools won't break the Google Terms of Service.
Try: http://www.seomoz.org/tools and http://tools.seobook.com/firefox/rank-checker/
I wrote a PHP script which parses/scrapes the top 1000 results gracefully, without any personalization effects from Google, along with a better version called true Google Search API (which generalizes the task, returning an array of nicely formatted results).
Both of these scripts work server-side and parse the results directly from the results page using cURL and regex.
A few months ago, I worked with the GooHackle guys, and they have a web application that does exactly what you're looking for; plus, the cost is not high: they have plans under $30/month.
Like Blowski already said, nowadays Google results are very customized, but if you always search using the same country and query parameters, you can get a pretty accurate view of your rankings for several keywords and domains.
If you want to develop the app yourself, it's not going to be too difficult either; you can use PHP or any other language to run the queries periodically and save the results in a DB. There are basically only two points to resolve: doing the HTTP queries (easily done with cURL) and parsing the results (you can use regex or the DOM structure). Then, if you want to monitor thousands of keywords and domains, things turn a little more difficult, because Google starts to ban your IP addresses.
I think that the apps like this from the "big guys" have hundreds or thousands of different IP addresses, from different countries. That allows them to collect the Google results for a huge number of keywords.
Regarding the online tool that I initially mentioned, they also have an online Google scraper that anybody can use and shows how this works, just query and parse.
SEOPanel works around a similar issue: you could download the open-source code and extract a simple keyword parser for your search results. The "trick" it uses is slowing down the query rate; the project is itself hosted on Google Code.
Need some ideas/help on the best way to approach a new data system design. Basically, the way this will work is that there will be a bunch of different databases/tables that will need to be updated on a regular (daily/weekly/monthly) basis with new records.
The people that will be imputing the data will be proficient in Excel. The input process will be done via a simple upload form. Then the system needs to add what was imported to the existing data in the databases. There also needs to be a "rollback" process that will reset the database to any day within the last week.
There will be approximately 30 to 50 different data sources. The main interface will be an online search area, so all of the records need to be indexed/searchable.
Ideas/thoughts on how best to approach this? It needs to be built mostly out of PHP/MySQL.
imputing the data
Typo?
What you are asking takes people with several years of formal training to do. Conventionally, the approach would be to draw up a set of requirements, then a set of formal specifications; then the architecture of the system would be designed, then the data design, then the code implementation. There are other approaches which tend to shortcut this. However, even in the case of a single table (although it does not necessarily follow that one "simple upload form" corresponds to one table), with a single developer there's a couple of days' work before any part of the design could be finalised, the majority of which is finding out what the system is supposed to do. But you've given no indication of the usage or data complexity of the system.
Also what do you mean by upload? That implies they'll be manipulating the data elsewhere and uploading files rather than inputting values directly.
You can't adequately describe the functionality of a complete system in a 9 line SO post.
You're unlikely to find people here to do your work for free.
You're not going to get the information you're asking for in a S.O. answer.
You seem to be struggling to use the right language to describe the facts you know.
Your question is very vague.