I have some code that uses SimpleXML to retrieve some data.
I run about 4 functions that each use SimpleXML, per entry in a database. So if I have 4 entries in the database, I'm running SimpleXML 16 times to load that content.
The problem is that each item takes about a quarter of a second to load, so as the page loads, the content trickles in and it takes a second or two to render the entire page.
Is there any way to easily speed this up, or cache it, or otherwise do better than watching my page fill in with content every time it loads?
Well, you only need to parse the XML once, and pass the parsed object to each of your functions.
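For example, something like this minimal sketch (data.xml and processEntry() are placeholders; processEntry() stands in for each of your four functions):

<?php
// Parse the XML a single time, up front...
$xml = simplexml_load_file('data.xml');

// ...then pass the parsed object around instead of re-parsing it in each function.
foreach ($entries as $entry) {          // $entries: your database rows
    processEntry($xml, $entry);
}

function processEntry(SimpleXMLElement $xml, $entry) {
    // Read whatever you need from $xml here; no second simplexml_load_file() call.
    echo $xml->channel->title;          // example path only; adjust to your document
}

With 4 entries and 4 functions that takes you from 16 parses down to 1 (or down to 4 if each entry has its own XML file).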
I'm currently building a web-based system that will hold millions of records after a few years (roughly 1 million records in 3 years, just guessing).
Right now I have a webpage that displays all records dynamically in an HTML table.
When that time comes, will it still be able to display that amount of data?
What are the things I need to consider?
What about hardware requirements (for the server, probably)?
The setup would be a LAN used by 7 users simultaneously.
Any help would be appreciated.
Here is my PHP code (code omitted), and this is the result (screenshot omitted).
I'd guess your browser will crash if you display all of that inside a table or list.
The only way I see is to lazy-load and keep the DOM as small as possible while scrolling through.
Why do you want to display one million records?
It's possible, but the browser may well crash.
The best approach is pagination that displays something like 100 records per page, plus a search function.
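A minimal sketch of that kind of server-side pagination in PHP/MySQL (assuming a PDO connection in $pdo, a table called records and a name column; all of those are placeholders for your actual schema):

<?php
$perPage = 100;
$page    = max(1, (int)($_GET['page'] ?? 1));
$offset  = ($page - 1) * $perPage;
$search  = $_GET['q'] ?? '';

// Only ever fetch the current page, optionally filtered by the search term.
$stmt = $pdo->prepare(
    'SELECT * FROM records
      WHERE name LIKE :q
      ORDER BY id
      LIMIT :limit OFFSET :offset'
);
$stmt->bindValue(':q', '%' . $search . '%');
$stmt->bindValue(':limit',  $perPage, PDO::PARAM_INT);
$stmt->bindValue(':offset', $offset,  PDO::PARAM_INT);
$stmt->execute();
$rows = $stmt->fetchAll(PDO::FETCH_ASSOC);

The million rows stay in MySQL; the browser only ever receives 100 of them at a time.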
One million rows isn't human-readable anyway.
If it's feasible, you can fetch all the data from the database once and save it in an array,
then search that array to get the results you want.
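If you do cache everything in a PHP array, the search itself can be as simple as this sketch ($allRows, the name key and $term are placeholders for your cached rows, column and search string):

<?php
$matches = array_filter($allRows, function ($row) use ($term) {
    // Keep rows whose name contains the search term, case-insensitively.
    return stripos($row['name'], $term) !== false;
});

Bear in mind that holding a million rows in memory per request is expensive, so this only really pays off if the array is built once and reused, e.g. via a cache.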
I am using 1 million cells (not rows) in a plain HTML table.
It takes some time to download the data and some more to render it, but if you really need to, you can display it. I don't see a production use case; I only display it to spot inconsistent data, so it's not going to production.
There are multiple component libraries to handle continuous scroll.
I have a table that reloads dynamically every 60 seconds to keep the data up to date.
I am using the script from here - http://www.michaelfretz.com/2010/04/21/using-ajax-to-load-data-from-php-into-your-website/
On the pages that have about 15 records, the Twitter Bootstrap tooltips work fine; they are speedy and look great.
On another page, however, I have over 400 records. Each record has a hover tooltip that shows the information from the database about that record. The information has already been output to the title tag, but when hovering it takes more than a second before the tooltip appears, which makes the whole page feel sluggish.
I'm thinking the reason is the 'rel' attribute and the Twitter JavaScript being bound with live() (continually updating), which slows it down. But I'm not sure.
Is there any way to fix this, or am I better off making a paginated table that loads the next page each time I click Next?
400 records is a lot to expect someone to traverse within a 60-second period. Without seeing the actual HTML it's a bit hard to make suggestions, but here are three:
1) Use the title attribute instead (see the documentation on the title attribute). That way you rely on the browser's built-in tooltips rather than Bootstrap rendering them.
2) Show a subject plus a content snippet for each row rather than just the subject, i.e. place the initial part of the content in the space available after the subject. Most people have large monitors these days, and with a responsive design you can show a lot of content after the subject.
3) As you say, use pagination. Bootstrap provides a pagination component, but it requires you to do the wiring.
400 records! Too much!
This is seriously about the browser's and the system's performance. Displaying 400 records with live() tooltips is kind of crazy; the browser will almost certainly struggle. Instead, use pagination and display only a small subset. Without it, users will also find it difficult to navigate and search.
Another way is to use DataTables. Load the full content into the table and don't worry about anything else; DataTables will take care of the rest, and pagination and search are good built-in features.
(Screenshot of DataTables omitted; source: webresourcesdepot.com)
As the screenshot suggests, searching, sorting and tooltips are all handled on the client side with only a minimal set of data rendered at a time, so the load on the browser is lower and users only see the part they actually need.
Just a quick question. If I use pagination for my website and am expecting a lot of results, is it better to use jQuery or just a basic PHP/MySQL approach that loads a new page each time?
If I use jQuery and I have over 300 results from the database, will it have to load them all at once on the initial page load? That might take a long time. Or can I make it load only the first 10, and then load the next 10 when I go to page 2?
Just wondering if you have any suggestions for my situation, and whether you can recommend any good scripts I can use for it.
Thanks!
IMO, start with "basic" PHP/MySQL pagination (load a new page each time the user changes pages).
Once you've got that working, if you want it, then add in jQuery pagination on top. All the jQuery pagination would do is load the new page of results via AJAX, rather than loading an entire new page.
So, the key here is how you handle paginating results via JavaScript (jQuery). If you render all 300 results on the page and simply hide everything past the first page (revealing the rest via JavaScript), your page will still be really slow to render initially, and you'll be taxing the database with a query that could be optimized via a limit (pagination).
On the other hand, if you asynchronously query for more results via say, an asynchronous GET request to a web-service that spits the data out via JSON, you can both have a responsive page and avoid a taxing, limitless query.
Using PHP / MySQL and postbacks to handle it also avoids the long initial page load and the taxing query.
So, in summary, I'd absolutely paginate your results. I would also suggest you do the following:
1) First architect things using purely PHP / MySQL. So, for instance:
/results/?start=0&limit=20 (Show results 0-19)
/results/?start=20&limit=20 (Show results 20-39)
2) Then if you want to provide a responsive, javascript mechanism for loading more, extend your page so that it can spit out JSON with a format parameter:
/results/?start=0&limit=20&format=JSON
So if format=JSON is passed, instead of rendering the HTML it'll just spit out the JSON data, paginated.
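A rough sketch of what that endpoint could look like in PHP (assuming a PDO connection in $pdo and a results table with id and title columns; those names are placeholders):

<?php
$start  = max(0, (int)($_GET['start'] ?? 0));
$limit  = min(100, max(1, (int)($_GET['limit'] ?? 20)));
$format = $_GET['format'] ?? 'html';

// One query either way: only pull the requested slice from the database.
$stmt = $pdo->prepare('SELECT id, title FROM results ORDER BY id LIMIT :limit OFFSET :start');
$stmt->bindValue(':limit', $limit, PDO::PARAM_INT);
$stmt->bindValue(':start', $start, PDO::PARAM_INT);
$stmt->execute();
$rows = $stmt->fetchAll(PDO::FETCH_ASSOC);

if ($format === 'JSON') {
    header('Content-Type: application/json');
    echo json_encode($rows);            // same data, just serialized instead of rendered
    exit;
}

// Otherwise fall through and render the normal HTML page from $rows.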
3) Then wire up the javascript to use the JSON data to dynamically load in more content:
$.get('/results/?start=' + start + '&limit=20&format=JSON', function(data) {
    // Process and display the returned page of results
});
Hopefully that makes sense!
You've tagged your question with ajax, and that's the answer: use a combination of PHP/MySQL + Ajax to make things faster and smoother.
Here is a very popular plugin that implements the client interface: JQGrid
I'm using PHP to take XML files and convert them into single-line, tab-delimited plain text with set columns (i.e. it ignores certain tags the database doesn't need, and certain tags will be empty). The problem I ran into is that it took 13 minutes to go through 56k (+ change) files, which I think is ridiculously slow. (An average folder has upwards of a million XML files.) I'll probably cron-job it overnight anyway, but it is completely untestable at a reasonable pace while I'm at work for things like missing files, corrupt files and such.
Here's hoping someone can help me make the thing faster. The XML files themselves are not too big (<1k lines) and I don't need every single data tag, just some. Here's my data node method:
// $entries is a DOMNodeList of the nodes selected via XPath.
function dataNode($entries) {
    $out = "";
    foreach ($entries as $e) {
        // Append the node's text content, then a marker before its attributes.
        $out .= $e->nodeValue . "[ATTRIBS]";
        // Append each attribute as name=value (note: no separator between attributes).
        foreach ($e->attributes as $name => $node) {
            $out .= $name . "=" . $node->nodeValue;
        }
    }
    return $out;
}
where $entries is a DOMNodeList generated from XPath queries for the nodes I need. So the question is: what is the fastest way to get to a target data node (or nodes; if I have 10 keyword nodes from my XPath query then I need all of them printed from that function) and output the node value and all its attributes?
I read here that iterating through a DOMNodeList isn't constant time, but I can't really use the solution given, because a sibling of the node I want might be one I don't need, or might need a different format function called on it before I write it to the file, and I really don't want to run every node through a gigantic switch statement on each iteration just to format the data.
Edit: I'm an idiot. I had my write function inside my processing loop, so on every iteration it had to reopen the file I was writing to. Thanks for both of your help; I'm trying to learn XSLT right now, as it seems very useful.
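For anyone who hits the same thing: the fix boils down to opening the output file once, outside the loop. A minimal sketch (output.tsv, $xmlFiles and the //keyword XPath are placeholders):

<?php
// Open the output file a single time, before processing starts.
$fh = fopen('output.tsv', 'w');

foreach ($xmlFiles as $file) {               // $xmlFiles: your list of XML file paths
    $doc = new DOMDocument();
    if (!@$doc->load($file)) {
        continue;                            // log and skip missing or corrupt files
    }
    $xpath   = new DOMXPath($doc);
    $entries = $xpath->query('//keyword');   // placeholder XPath expression
    fwrite($fh, dataNode($entries) . "\n");  // one write per file, reusing the same handle
}

fclose($fh);                                 // close once, after the loop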
A comment would be a little short, so I'll write it as an answer:
It's hard to say where exactly your setup can benefit from optimizing. Perhaps it's possible to join many of your XML files together before loading them.
From the information in your question I would assume that it's the disk operations that are taking the time rather than the XML parsing. I've found DOMDocument and XPath quite fast even on large files: an XML file of up to 60 MB takes about 4-6 seconds to load, a 2 MB file only a fraction of that.
Having many small files (<1k lines) means a lot of work on the disk, opening and closing files. Additionally, I have no idea how you iterate over the directories/files; sometimes that can be sped up dramatically as well, especially as you say you have millions of files.
So perhaps concatenating/merging files is an option for you; it can be done quite safely and would reduce the time it takes to test your converter.
If you encounter missing or corrupt files, you should create a log and catch these errors, so you can let the job run through and check for errors later.
Additionally, if possible, try to make your workflow resumable. E.g. if an error occurs, the current state is saved, and next time you can continue from that state.
The suggestion in a comment above to run an XSLT on the files first is a good idea as well. Adding a layer in the middle to transform the data can reduce the overall problem dramatically, because it reduces complexity.
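If you go the XSLT route, PHP's XSLTProcessor (from the php-xsl extension) can apply a stylesheet to each file. A minimal sketch, where transform.xsl and input.xml are placeholders for your own stylesheet and data:

<?php
// Load the stylesheet once and reuse the processor for every input file.
$xslDoc = new DOMDocument();
$xslDoc->load('transform.xsl');

$proc = new XSLTProcessor();
$proc->importStylesheet($xslDoc);

$xmlDoc = new DOMDocument();
$xmlDoc->load('input.xml');

// transformToXml() returns the transformed output as a string,
// e.g. a tab-delimited line if the stylesheet uses <xsl:output method="text"/>.
echo $proc->transformToXml($xmlDoc);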
This workflow for processing XML files has helped me so far:
Pre-process the file (plain-text filters, optional).
Parse the XML: load it into DOMDocument, iterate with XPath, etc.
My parser sends out events with the parsed data when it finds something.
The parser throws a specific exception if it encounters data that is not in the expected format. That makes it possible to spot errors in your own parser.
All other errors are converted to exceptions as well.
Exceptions can be caught and the operation finished cleanly, e.g. move on to the next file.
A logger, resumer and exporter (file export) can hook onto the events, sort of a visitor pattern.
I've built such a system to process larger XML files whose formats change. It's flexible enough to deal with changes (e.g. replace the parser with a new version while keeping logging and exporting). The event system really made it work for me.
Instead of a gigantic switch statement, I normally use a $state variable for the parser's state while iterating over a DOMNodeList. $state can be handy for resuming operations later: restore the state, go to the last known position, then continue.
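A rough sketch of that $state idea while walking a DOMNodeList (the state names and node names here are invented for illustration; the dispatch is on the current state rather than on every possible node type, so it stays small):

<?php
$state = 'expect-keyword';                   // where we are in the expected sequence

foreach ($entries as $node) {                // $entries: a DOMNodeList from an XPath query
    switch ($state) {
        case 'expect-keyword':
            if ($node->nodeName === 'keyword') {
                // ...handle the keyword node...
                $state = 'expect-value';
            }
            break;

        case 'expect-value':
            if ($node->nodeName !== 'value') {
                // Unexpected input: this is the "specific exception" mentioned above.
                throw new UnexpectedValueException('value node expected, got ' . $node->nodeName);
            }
            // ...handle the value node...
            $state = 'expect-keyword';
            break;
    }
}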
I've got a PHP script that continually grows an array as its results are updated. It executes for a very long time on purpose, as it needs to filter a few million strings.
As it loops through results it prints out strings and fills up the page until the scroll bar is super tiny. Instead of printing the strings, I want to just show the number of successful results dynamically as the PHP script continues. I did echo(count($array)); and found the number at 1,232,907... 1,233,192... 1,234,874 and so forth, printed out on many lines.
So how do I display this increasing PHP variable as a single growing number on my webpage with JavaScript?
Have your PHP script store that number somewhere, then use AJAX to retrieve it every so often.
You need to find a way to interface with the process, to get the current state out of it. Your script needs to export the status periodically, e.g. by writing it to a database.
The easiest way is to write the status to a text file every so often and poll this text file periodically using AJAX.
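On the PHP side that could look roughly like this (status.txt, $strings, matchesFilter() and the write interval are all placeholders for your own script):

<?php
// Inside the long-running script: persist the current count every 1000 matches.
$count   = 0;
$results = array();

foreach ($strings as $s) {                   // $strings: the few million strings being filtered
    if (matchesFilter($s)) {                 // matchesFilter(): placeholder for your filter logic
        $results[] = $s;
        $count++;
        if ($count % 1000 === 0) {
            file_put_contents('status.txt', (string)$count);   // overwrite with the latest count
        }
    }
}

A separate status.php then only needs to echo the contents of status.txt, and the page can poll it on a JavaScript timer and drop the number into a placeholder element.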
You can use the Forever Frame technique. Basically, you have a main page containing an iframe. The iframe loads gradually, intermittently adding an additional script tag. Each script tag modifies the content of the parent page.
There is a complete guide available.
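In PHP the iframe's endpoint might look roughly like this sketch (it assumes the parent page defines a JavaScript function called updateCount(), and doSomeWork() stands in for a chunk of your real filtering loop):

<?php
// forever-frame.php: loaded inside a hidden iframe on the main page.
header('Content-Type: text/html');

while ($stillWorking) {                      // $stillWorking: whatever keeps your job running
    $count = doSomeWork();                   // placeholder: do a chunk of work, return the count
    // Each chunk is a <script> tag that calls a function defined on the parent page.
    echo '<script>parent.updateCount(' . (int)$count . ');</script>';
    @ob_flush();
    flush();                                 // push the chunk to the browser immediately
    sleep(1);
}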
That said, there are many good reasons to consider doing more pre-computation (e.g. in a cron job) to avoid doing the actual work during the request.
This isn't exactly what you're looking for (I'm as interested in an answer to this as you are), but a solution I've found that works is to keep track of the count server-side and only print it every 1000/5000/whatever number works best, rather than one by one.
I'd suggest having a PHP script that returns the value in JSON format. Then, from another PHP page, you can make an AJAX call to it, fetch the JSON value and display it. The AJAX call can be set to run every 5 seconds or so, depending on how quickly your numbers change. The iframe approach, though easier, is a bit outdated.