I maintain a hobby website that, among other things, chronicles whether certain items are in print or out of print at a particular web store.
The store's management removes products when they are out of stock, and re-adds the pages when they're back in stock.
Scraping the category page's item list for item titles is easy enough, but I'm not sure what to do about pages with more results than are shown.
The pages default to 10 items, and clicking Next loads up the next 10 via AJAX.
Is there a standard way of handling and scraping such setups?
If you use the developer tools of your web browser (Firebug, Web Inspector, Developer Tools, ...) you should be able to see the connections being made to retrieve the data through Ajax, along with the request and response headers being sent and received.
The request headers will contain the data being sent as well as the URL that's been requested. The query string of the URL or the POST data will most likely contain a "start", "next", or similar parameter that identifies the starting position and the number of results to return.
You can then use PHP and cURL to automate the rest of the process.
Here's a screenshot of what the "Web Inspector" looks like in Safari 5.1 on OS X (Chrome looks identical):
What's relevant to you here is the Request URL, Request Method and what's under Form Data. The text on the left (in light grey) is the parameter and the text on the right is the value.
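As a rough sketch of automating this, the loop below keeps requesting pages until one comes back short. It is written in JavaScript rather than PHP and cURL to keep it brief. The endpoint URL, the `start` and `count` parameter names, and the JSON response shape (an `items` array whose entries have a `title`) are all assumptions; substitute whatever the inspector shows for the real store.

```javascript
// Hypothetical endpoint and parameter names: replace them with the
// Request URL and Form Data values shown in the browser's inspector.
const ENDPOINT = "https://store.example.com/category/items";
const PAGE_SIZE = 10; // the category page shows 10 items at a time

// Build the URL for one page of results.
function pageUrl(start) {
  const url = new URL(ENDPOINT);
  url.searchParams.set("start", String(start));
  url.searchParams.set("count", String(PAGE_SIZE));
  return url.toString();
}

// Fetch every page in turn and collect the item titles.
// `getJson` is injected so the loop can be tried without a network;
// it defaults to a helper built on the global fetch().
async function scrapeAllTitles(getJson = defaultGetJson) {
  const titles = [];
  for (let start = 0; ; start += PAGE_SIZE) {
    const page = await getJson(pageUrl(start));
    const items = page.items || [];
    for (const item of items) titles.push(item.title);
    // A short (or empty) page means there is nothing after it.
    if (items.length < PAGE_SIZE) break;
  }
  return titles;
}

async function defaultGetJson(url) {
  const response = await fetch(url);
  if (!response.ok) throw new Error("HTTP " + response.status);
  return response.json();
}
```

If the site answers with an HTML fragment instead of JSON, the same loop applies; only the parsing step inside it changes.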
I want to load a page from different sources, the way online flight ticket reservation sites work. That is, the data needs to be pulled from different systems and shown on a single page.
I can do this by creating one file that collects the data from the different systems, merges it in the required sort order, and shows it on the page. But if any one of the source systems is slow, the entire page has to wait until all the results have been received from the various sources.
The question is:
Is it possible to show the content retrieved from the various sources without a middle layer that manipulates the data before display? That is, the page would show content as soon as it arrives from whichever site responds first, and reorder itself as content arrives from the other sites.
Thanks in advance for your help.
I solved this issue as follows.
Created an aggregate layer, which makes asynchronous requests (curl) to the different systems.
Upon receiving a response from one system (whichever comes first), store it in a cache (memcache) and display the result on the page.
Then, when the response comes from the other system, merge it with the previous results stored in the cache and refresh the page with the aggregated data.
I know this is not a good solution, but I don't have a better option, so this is how I am handling it for now.
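The same idea can also be sketched on the client side: start all requests at once and re-render the merged, sorted list each time one of them completes. This is only an illustration, not the cache-based setup described above; the source functions, the comparison function, and the `render` callback are hypothetical placeholders.

```javascript
// Merge results from several sources as each one arrives.
// `sources` is an array of functions returning promises of result arrays;
// `render` is called with the merged, sorted list after every arrival.
// Both are placeholders for the real back-end calls and page update.
function aggregateProgressively(sources, render, compare) {
  let merged = [];
  const pending = sources.map((load) =>
    load()
      .then((results) => {
        merged = merged.concat(results).sort(compare);
        render(merged.slice()); // pass a copy so callers cannot mutate state
      })
      .catch((error) => {
        // A failing source should not block the others.
        console.error("source failed:", error.message);
      })
  );
  // Resolves with the final list once every source has answered or failed.
  return Promise.all(pending).then(() => merged);
}
```

With this shape a slow source only delays its own results; the page shows whatever has arrived so far and reorders when more comes in.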
I'm trying to make a custom feed of posts with images and titles, and currently I'm doing it for mobile. I'm using PHP for the web service.
When using pagination, how do you download posts from the web? Do you send a page parameter to the web service, or is there some other way?
So something like this:
http://www.mywebpage.com/?command=stream&page=0
and then just increment the page, which is a private variable in the app, automatically before every new request? Or is this done in a different way?
Thanks.
Yes, you are on the right track. The usual practice is to send a currentOffset, or a page number parameter (if the number of entries per page is constant).
I am assuming you want to display the feed in a UITableView, and lazy-load more entries (the next page of entries) when the user scrolls to the bottom of the list. You can implement the pagination logic yourself by implementing scrollViewDidScroll: to detect that you have hit the bottom of the list, but I find NMPaginator particularly helpful for this purpose.
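The bookkeeping itself is small. Here is a minimal sketch of it, shown in JavaScript for brevity rather than Objective-C. The `command=stream&page=N` query comes from the question; the `fetchPage` function and the rule that an empty page means the end of the feed are assumptions.

```javascript
// Keeps track of the next page to request.
class FeedPaginator {
  // `fetchPage(url)` must return a promise of an array of posts;
  // it stands in for the app's real HTTP call.
  constructor(fetchPage, baseUrl = "http://www.mywebpage.com/") {
    this.fetchPage = fetchPage;
    this.baseUrl = baseUrl;
    this.page = 0;         // next page to request
    this.finished = false; // set once the server returns an empty page
    this.loading = false;  // guards against overlapping requests
    this.posts = [];
  }

  urlForPage(page) {
    return `${this.baseUrl}?command=stream&page=${page}`;
  }

  // Call this when the user scrolls to the bottom of the list.
  async loadNext() {
    if (this.loading || this.finished) return [];
    this.loading = true;
    try {
      const batch = await this.fetchPage(this.urlForPage(this.page));
      if (batch.length === 0) {
        this.finished = true;
      } else {
        this.posts.push(...batch);
        this.page += 1; // advance only after a successful request
      }
      return batch;
    } finally {
      this.loading = false;
    }
  }
}
```

Incrementing the page after a successful response, rather than before the request, means a failed request can simply be retried with the same page number.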
I'm trying to add this functionality to my PHP cart
Let's suppose we have two pages: catalog.php and cart.php
What I would like to do is:
User clicks on "Add to cart" button on catalog.php
and triggers an ajax request via jQuery to cart.php sending it info about which product was added (this all works as expected) and makes the cart.php page update itself by including the just added product without a page refresh (this is the part I can't get to work).
If I put the two pages side by side and click "Add to cart" nothing happens, only on page refresh (cart.php) I see that the new product was added.
Is there a way to achieve this?
EDIT: I wasn't clear enough, sorry, my bad.
The pages are presented in a standard way, no frames, no popups.
The "app" works as expected and this is unlikely to be an issue for users.
The "side-by-side" thing was just because I would like to know a way to obtain this functionality, since I can see myself using it in the future for pretty much anything (DOM manipulation of pageB from pageA, CSS, etc.)
You can simulate this by having an AJAX call on Cart.php which checks the (session/db) to see if something new has been added.
I'd suggest something like...
Cart.php makes an AJAX call every 10-60 seconds and asks for a complete list of products in the cart.
Any new items are added to the table as appropriate (either by checking product Ids or OrderItem Ids). You should also update quantities/etc.
This way, no matter what mechanism is used to add items, the cart will see them before too long.
Obviously, the more often you poll, the quicker the cart will update but the more load you'll put on the system.
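The polling approach might look roughly like this. The `id` and `quantity` fields, the `loadCart` function, and the two row-updating callbacks are hypothetical; removed items are not handled in this sketch.

```javascript
// Compare the cart as currently displayed with the list the server returned.
// Items are matched by `id`; returns what to add and what to update.
function diffCart(displayed, latest) {
  const known = new Map(displayed.map((item) => [item.id, item]));
  const added = [];
  const changed = [];
  for (const item of latest) {
    const old = known.get(item.id);
    if (!old) added.push(item);
    else if (old.quantity !== item.quantity) changed.push(item);
  }
  return { added, changed };
}

// Poll the server at a fixed interval and apply the differences.
// `loadCart` returns a promise of the full cart; `onAdded` and `onChanged`
// update the page. Returns a function that stops the polling.
function startCartPolling(loadCart, onAdded, onChanged, intervalMs = 30000) {
  let displayed = [];
  const timer = setInterval(async () => {
    try {
      const latest = await loadCart();
      const { added, changed } = diffCart(displayed, latest);
      added.forEach((item) => onAdded(item));
      changed.forEach((item) => onChanged(item));
      displayed = latest;
    } catch (error) {
      console.error("cart poll failed:", error.message);
    }
  }, intervalMs);
  return () => clearInterval(timer);
}
```

On cart.php, `loadCart` could be something like `() => fetch("/cart-items.php").then((r) => r.json())`, where the endpoint name is made up for this example.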
Now the reason it's not directly possible is that AJAX requests are always initiated by the client, not the server.
There is one other potential workaround which omits frequent polling and gives near-instant updates but it's a little more complex to implement.
It's known as long polling (see this answer) and effectively what happens is this...
Client sends AJAX ("A") request to the server for cart information.
The server accepts the request but does nothing (as if a long-running script were processing)
When a new shopping cart item is received via AJAX ("B"), the server responds to Request A with details and the cart page updates the table as appropriate.
If no cart activity is detected within a reasonable timeout (30-120s), the server responds with a "No Operation" response and closes the connection.
Whichever response the client receives, it immediately opens a new AJAX request and starts all over again.
Effectively, your PHP script then deals with checking the database/session/etc. for updates and the client is only waiting on a "slow" server. This is how Twitter implements its various feeds via APIs: each new tweet is returned on a new line as it is generated.
Note that this can be a pain to implement as you're just shifting the polling from the client to the PHP but it does make it more elegant from a JavaScript point of view, removes the delay and reduces wasted network overhead.
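The client half of the steps above can be sketched as a simple loop. The response shape (`type` of `"update"` or `"noop"`, plus `items`) and the `waitForCart` function are assumptions made for this example; the server side, which holds the request open, is not shown.

```javascript
// Client side of long polling. `waitForCart` performs request "A": it
// returns a promise that the server resolves either with
// { type: "update", items: [...] } or, after its timeout, { type: "noop" }.
// That response shape is an assumption made for this sketch.
async function longPoll(waitForCart, onUpdate, shouldContinue = () => true) {
  while (shouldContinue()) {
    try {
      const response = await waitForCart();
      if (response.type === "update") onUpdate(response.items);
      // On "noop" there is nothing to do; just ask again.
    } catch (error) {
      // Back off briefly after a network error to avoid a tight retry loop.
      await new Promise((resolve) => setTimeout(resolve, 1000));
    }
  }
}
```

The `shouldContinue` hook is there so the loop can be stopped, for example when the user leaves the cart page.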
Last I checked, only pages opened with window.open (or opened with an <a target="_blank"> link) can talk to each other.
So in catalog.php call cart = window.open('cart.php'). The cart variable then holds the window object of the newly opened page, so you could do:
cart.document.getElementById('whatever').innerHTML = '<p> new content</p>';
To get windows to talk to each other side-by-side w/o one opening the other you can use HTML5 local storage, but that takes some more advanced wizardry.
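The local storage approach relies on the `storage` event, which the browser fires in the other same-origin windows (not in the one that made the change) when a key is modified. A minimal sketch follows; the `cart-update` key and the message shape are assumptions.

```javascript
const CART_KEY = "cart-update"; // hypothetical key name

// Called on catalog.php after an item has been added to the cart.
// `storage` is normally window.localStorage.
function publishCartUpdate(storage, product) {
  // The timestamp means that adding the same product again normally still
  // changes the stored value, which is what triggers the storage event.
  storage.setItem(CART_KEY, JSON.stringify({ product, at: Date.now() }));
}

// Called on cart.php for every `storage` event; ignores unrelated keys
// and removals (where newValue is null).
function handleStorageEvent(event, onProduct) {
  if (event.key !== CART_KEY || event.newValue === null) return;
  onProduct(JSON.parse(event.newValue).product);
}

// Browser wiring for cart.php (skipped when run outside a browser).
if (typeof window !== "undefined") {
  window.addEventListener("storage", (event) =>
    handleStorageEvent(event, (product) => {
      console.log("added in another window:", product);
    })
  );
}
```

On catalog.php the call would be `publishCartUpdate(window.localStorage, product)` once the AJAX request to cart.php has succeeded.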
Hello guys, I have a newbie question :) - I am currently using PHP/Zend and I need to display a form and other content in one of my pages. I do not want the page to reload and I can't use a pop-up window, so the best option is to dynamically display a "square" in the middle of the current page, with the form loaded on the fly. This way I could have my pages (forms, text, whatever) pulled into this square.
In order to stay compatible with older, newer, and different browsers, what would be the best choice: Dojo (which is already in Zend), jQuery, or just HTML5/CSS3? Also, if anyone could point me to some references where I can find this information, that would be great!
AJAX (Asynchronous JavaScript And XML) is the most common means of doing this. It uses JavaScript to request other scripts (which can be .php pages), which then return output based on the request. This output can be content to inject into a page, or data which your page can then interpret for another action.
In this instance, your .php page could include JavaScript in the head, whether linked or inline, which would contain the details for launching an AJAX request: how often or on what trigger (button press etc.), by what means (POST or GET), what is sent (any other variables you wish), what the target script is (the script which will handle the request and output your required content/data), and what to do when the response is received (i.e. which element on the page should be updated with the response).
A little about AJAX:
http://webdesign.about.com/od/ajax/a/aa101705.htm
http://webtrends.about.com/od/web20/a/what-is-ajax.htm
Likely the simplest way to begin is to use a pre-existing JavaScript library like the ubiquitous jQuery (jquery.com). There are thousands of tutorials out there for it, and although you will need to do some JavaScript programming, the library lets you rely on fairly simple syntax (as simple as $('#myelement').load('mypage.php')):
http://net.tutsplus.com/tutorials/javascript-ajax/5-ways-to-make-ajax-calls-with-jquery/
http://www.devirtuoso.com/2009/07/beginners-guide-to-using-ajax-with-jquery/
http://www.sitepoint.com/ajax-jquery/
http://yensdesign.com/2008/12/how-to-load-content-via-ajax-in-jquery/
In simple terms:
You have your php page with the element (area) that needs updating (page A)
Build another php script which outputs the content you want 'refreshing', e.g. the latest news stories, each time it is run (page B)
Link to the jQuery library in your header section (page A)
Write a simple jquery function in the header section of page A, which says every X seconds/minutes (or on demand), run an AJAX request to fetch the content of page B and insert into an element (DIV) within page A
---updated---
If you wish to use DOJO as opposed to jQuery, there is also a wealth of resources available:
http://dojotoolkit.org/documentation/tutorials/1.6/ajax/
http://www.infernodevelopment.com/dojo-ajax-tutorial
http://startdojo.com/2010/01/02/simple-ajax-form-tutorial/
http://today.java.net/pub/a/today/2006/04/27/building-ajax-with-dojo-and-json.html
http://www.ibm.com/developerworks/web/tutorials/wa-dojotoolkit/index.html
http://www.roseindia.net/dojo/
I'm a C# developer for Windows and I don't know much about web programming. I have developed a special search engine in Java and I want to create a PHP interface for it. So far, I have managed to connect PHP and Java via a web service. I watched some tutorials on creating a search engine and I have a rough idea of what to do, but I don't know exactly how to handle some problems. Here's the scenario I want to implement:
An index page with a search box: the user types the search query in that page and some results show; if the user scrolls down, more results show (like Facebook). When the user clicks on a result item's link, the browser opens another page that shows the result (also in my app).
What I know is that the index page should be an HTML file with a form that uses the GET method to submit to a PHP file.
What I don't know is how to enable "more" results. For this, my PHP should send an array containing the URLs of the previous results to my Java service, get the new results, add them to the array, and wait. The next time, it should use this array.
Please let me know what code structure I should use for my app.
Thanks in advance.
Edit:
Requested code samples in java server:
public String processQuery(String query, List<String> previousURLs);
this will be called for the first time like this:
processQuery("test", null);
suppose it has returned 2 results with urls:
http://www.bing.com
http://stackoverflow.com
these will be stored in an array and the second time:
processQuery("test", previous);
this will return new results which will be added at the end of the page.
You need to use AJAX (Asynchronous JavaScript and XML) requests. Essentially, as the user scrolls down the page, this triggers a request to get more results. You'd probably do something like cache the last result id to know where to get the next batch of results from. You'll need to brush up on your JavaScript and possibly jQuery in order to figure out how to implement all this, i.e. trigger the request, handle the response, and append new elements to the DOM.
An example website that does this is Duck Duck Go. Their search results page dynamically appends new results as you scroll. Make sure you have Firefox + Firebug to inspect the page and the network requests that get made, and to step through (debug) the running JavaScript.
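A rough sketch of the client side, following the processQuery(query, previousURLs) shape from the question: the browser remembers which URLs it has already shown and sends them along with each request for more. The `search` function (standing in for the AJAX call to the PHP page), the `{ url, title }` result shape, and the 200-pixel threshold are assumptions.

```javascript
// True when the viewport is within `threshold` pixels of the page bottom.
function isNearBottom(scrollTop, viewportHeight, contentHeight, threshold = 200) {
  return scrollTop + viewportHeight >= contentHeight - threshold;
}

// Remembers which URLs have been shown so the next request can exclude them,
// mirroring processQuery(query, previousURLs) from the question.
// `search(query, previousUrls)` is a stand-in for the AJAX call to the PHP
// page; it must return a promise of an array of { url, title } objects.
function createResultLoader(query, search, appendResult) {
  const previousUrls = [];
  let busy = false;
  return async function loadMore() {
    if (busy) return 0; // a request is already in flight
    busy = true;
    try {
      const results = await search(query, previousUrls.slice());
      for (const result of results) {
        previousUrls.push(result.url);
        appendResult(result);
      }
      return results.length;
    } finally {
      busy = false;
    }
  };
}

// Browser wiring (not run here):
// window.addEventListener("scroll", () => {
//   const doc = document.documentElement;
//   if (isNearBottom(window.scrollY, window.innerHeight, doc.scrollHeight)) {
//     loadMore();
//   }
// });
```

Keeping the list of previous URLs in the browser means the PHP layer does not have to hold state between requests; it just forwards the query and the list to the Java service.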
I did it with the help of this tutorial:
http://www.9lessons.info/2009/07/load-data-while-scroll-with-jquery-php.html