Should I Cache XML feed from localhost API (performance) - php

As you can read in the title, I have a question about PHP and caching an XML feed obtained from a localhost API.
The XML feed is already cached inside the API (a .NET service), so the data doesn't need to be read from the database on every request. There are multiple feeds running on the server.
The XML feed contains job advertisements (metadata, keywords and also the HTML); the feeds range in size from 100 KB to 5 MB. How often new data is added also varies quite a bit (often daily, sometimes at shorter intervals). As mentioned above, this is already handled in the .NET service.
Now my actual question is:
Is it useful to cache the data again for a microsite that uses the XML feed as a data source for rendering a job overview?
If I didn't cache anything, the data would be read from the API, filtered and rendered into HTML on every request.
Now I'm not sure whether an additional cache would improve performance, and where I should add one (after the XML is returned from the API, or after the data is deserialized).
My actual question for the above scenario:
Is it useful to add an additional cache (for performance reasons)?
If yes, is there a simple PHP solution or library to use?
Thanks,
Hans Rudolf
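For illustration, here is a minimal sketch of the kind of cache being asked about: a file-based cache that stores the rendered HTML so the API is only hit when the cached copy is stale. The feed URL, element names and TTL are assumptions, not part of the original question.

<?php
// Minimal file-based cache sketch (hypothetical URL, element names and TTL).
// Serves cached HTML when it is fresher than $ttl seconds; otherwise the
// feed is fetched from the local API, rendered and the cache refreshed.
$feedUrl   = 'http://localhost/api/jobs.xml';          // assumption: local feed endpoint
$cacheFile = __DIR__ . '/cache/job-overview.html';
$ttl       = 600;                                      // 10 minutes

if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $ttl) {
    echo file_get_contents($cacheFile);                // serve the cached HTML
    exit;
}

$xml  = simplexml_load_file($feedUrl);                 // fetch and deserialize the feed
$html = '<ul>';
foreach ($xml->job as $job) {                          // assumption: <job> elements with <title>
    $html .= '<li>' . htmlspecialchars((string) $job->title) . '</li>';
}
$html .= '</ul>';

file_put_contents($cacheFile, $html);                  // refresh the cache
echo $html;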

Related

Is it bad to use JSON files instead of real databases?

I have no access to create/edit any databases but due to the massive amount of content I need some sort of management system which is why I created my own. Here's how it works:
Each blog post has got its own .php file which loads static parts of the website like the header or the menu bar. But there are many category sites that display previews of the respective posts. It would be sooo annoying having to edit the same preview on 10 sites due to a misspelled word or so. That's why I store those previews (not the full content since there's no need to) in a JSON file.
Is that bad practice? Could this lead to long loading times if the number of previews rose? And could I prevent that by creating multiple JSON files?
Thanks for your advice!
If you are working on a small scale, using JSON files is fine; however, it would definitely be beneficial to switch to a database management system for storage whenever possible, in PHP or the majority of other languages for that matter.
It can be considered bad practice if JSON is used to store large amounts of data, or if a lot of data is stored in the same file. In that case, yes, using multiple JSON files rather than one large one is indeed more viable, since the input stream does not have to read over as much data.
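To illustrate the multiple-files approach, here is a minimal sketch that keeps one JSON file of previews per category, so a category page only reads its own file; the directory layout and field names are made up for the example.

<?php
// Hypothetical layout: previews/<category>.json, each file holding an array
// of preview objects like {"title": "...", "excerpt": "...", "url": "..."}.
function loadPreviews(string $category): array
{
    $path = __DIR__ . '/previews/' . basename($category) . '.json';
    if (!is_file($path)) {
        return [];
    }
    $previews = json_decode(file_get_contents($path), true);
    return is_array($previews) ? $previews : [];
}

foreach (loadPreviews('travel') as $preview) {
    echo '<article><h2>' . htmlspecialchars($preview['title']) . '</h2>'
       . '<p>' . htmlspecialchars($preview['excerpt']) . '</p></article>';
}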

How to use a database with a CDN to improve performance

I have a question similar to this one. I am using JW Player to play my videos, which are stored in a CDN. Due to some requirements, I have to save my subtitles in the CDN first and then save both the video file URL and the associated subtitle URLs [English, Chinese, Japanese, etc.] in the DB.
When I make an Ajax call from my JS file to the PHP file to retrieve the data, it takes a long time and causes a performance issue.
I was wondering if there is any DB option in the CDN, so that instead of saving those details in my DB I can save this info (the associated subtitles of one video file) directly in the CDN. Since retrieving from the CDN is much faster, it would surely improve performance.
CDNs just bring static information closer to the users, caching that information in points-of-presence (PoPs) around the globe. This is mostly done by web servers sitting within those PoPs, so whatever you can't retrieve via HTTP GET will likely be a problem. For example, the legacy protocol RTMP (also video) is supported by legacy CDNs (Level3/Akamai/EdgeCast), but not by the newer Cloudflare/CloudFront and so on, because it requires add-ons to the web server and clutters workflows.
Technically, any static database can be stored in a file, and the file can be cached by a CDN. But then, again, it would be your code that takes care of the db->file->db metamorphosis. Therefore, if something is static, you don't really want to use a database for it (to be future/CDN-proof). Subtitles are just text files, so let them be files in asset folders. I appreciate that the high-level architecture might be beyond your control here (due to a specific ingesting system, for instance), but then the answer is that you won't be able to do what you are trying to do, and the resulting performance will suffer.
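One way to apply that advice, sketched below with made-up CDN paths and IDs: keep the subtitles as static .vtt files on the CDN and derive their URLs from a naming convention, so the database only has to store the video record.

<?php
// Hypothetical convention: subtitles live on the CDN as
// https://cdn.example.com/subs/<videoId>.<lang>.vtt, so no extra DB rows are needed.
function subtitleUrls(string $videoId, array $languages): array
{
    $urls = [];
    foreach ($languages as $lang) {
        $urls[$lang] = "https://cdn.example.com/subs/{$videoId}.{$lang}.vtt";
    }
    return $urls;
}

header('Content-Type: application/json');
echo json_encode([
    'video'     => 'https://cdn.example.com/videos/abc123.mp4',
    'subtitles' => subtitleUrls('abc123', ['en', 'zh', 'ja']),
]);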
If you have the bucks you can look at Continuent.
http://www.continuent.com/solutions/pricing-and-services

When loading multiple images from network requests, should I return the image data or a link to the image?

I have an iOS app that lists local places in a table view. Each cell has a picture, text, and subtext.
Each cell's detail view also has multiple pictures of the relevant location, as well as a decent amount of text. JSON is the interchange format.
Currently I am sending binary blobs and constructing them into a JPEG once loaded on the device, but I am worried this is intensive on both the device and the server. So I was considering sending a link to the picture and asynchronously downloading each picture, but I am unaware of what repercussions this would have, especially considering that I am currently using a cheap PHP/MySQL shared hosting plan for the backend.
I am looking for a list of pros and cons for sending the raw image data through JSON vs a link to that image. Any other options for quickly and efficiently populating a view with multiple network images are welcome.
I think the differences are as follows:
1. The user will download more data (link + image > image alone).
2. If the image is on another server, that server might be slower or faster than yours, which affects the image loading speed for the user while reducing the amount of data transmitted between your server and the client.
3. If the image is on another server, can you guarantee it will be available whenever your website is up?
4. Loading data via Ajax is already asynchronous, so you don't have to worry about another server if you use it. Unless your server is very slow, in which case you should consider another server for the big images; synchronization won't be your major concern so much as the load you are putting on your own server.
If other points come to mind, I'll post them here.
I've done a little bit of research into this and asked a few colleagues, so I'll share what tidbits I've come up with.
At some point, the raw image data is going to have to be sent; that is unavoidable.
But I can benefit from lazily loading the image data, so that if my user only looks through 14 table view cells, I only spend time loading 14 images instead of however many total results are returned from the server (and even fewer if I implement proper caching).
My solution so far is to return 30 (the number of tableview cells I load at one time) JSON objects, each having an "Image_URL":"..." field and putting those into a dictionary. Then, in cellForRowAtIndexPath:, I check to see if the image for that cell is already cached and if not, I make a request for that picture and update the cell.image in the network callback.
This is pretty simple to do on your own, but SDWebImage seems like a pretty good library for handling corner cases, caching, and other things that aren't covered in a basic implementation. I should note that AFNetworking also includes functionality for asynchronous image downloading.
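On the backend side, a paged response like the one described above might look something like the following sketch; the database schema, credentials and field names are assumptions made for the example.

<?php
// Hypothetical endpoint returning 30 places per page, each with an Image_URL
// instead of raw image bytes, so the client can download pictures lazily.
$pdo    = new PDO('mysql:host=localhost;dbname=places', 'user', 'secret');
$page   = max(0, (int) ($_GET['page'] ?? 0));
$offset = $page * 30;                               // safe to interpolate: guaranteed integer

$rows   = [];
$result = $pdo->query("SELECT name, subtext, image_url FROM places ORDER BY id LIMIT 30 OFFSET $offset");
foreach ($result->fetchAll(PDO::FETCH_ASSOC) as $row) {
    $rows[] = [
        'Name'      => $row['name'],
        'Subtext'   => $row['subtext'],
        'Image_URL' => $row['image_url'],           // link only; the image itself is fetched lazily
    ];
}

header('Content-Type: application/json');
echo json_encode($rows);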

RSS Feeds - add server-side (PHP) or client-side (JavaScript/jQuery)?

A quick Google search of "jquery rss parser" returns many cool plugins.
That being said, what are the pros and cons of adding rss content to my website using server-side versus client-side technology?
It could be cool to implement it client-side, as you could then build a really cool user interface where new RSS items are fetched every few seconds or minutes (consider the Masonry plugin). Client-side would also allow you to display each RSS feed as soon as its content has loaded, rather than waiting until all feeds are loaded.
Server-side, you would have to consider some sort of caching, as fetching and parsing feeds may be very time-consuming and no one likes waiting...
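As an illustration of that caching idea, here is a rough sketch of a cached server-side fetch; the feed URL and cache lifetime are placeholders.

<?php
// Fetch an RSS feed server-side, caching the raw XML on disk so the remote
// feed is only requested once per $ttl seconds. The URL is a placeholder.
function fetchFeed(string $url, int $ttl = 300)
{
    $cacheFile = sys_get_temp_dir() . '/rss_' . md5($url) . '.xml';
    if (!is_file($cacheFile) || (time() - filemtime($cacheFile)) >= $ttl) {
        file_put_contents($cacheFile, file_get_contents($url));
    }
    return simplexml_load_file($cacheFile);
}

$feed = fetchFeed('https://example.com/feed.xml');
foreach ($feed->channel->item as $item) {
    echo '<a href="' . htmlspecialchars((string) $item->link) . '">'
       . htmlspecialchars((string) $item->title) . '</a><br>';
}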
Here are a few pros and cons:
PROs
Content Dynamically Updated
Can be flexible to drive the most relevant content to the visitor
Allows you to style and manipulate the data to keep the site looking consistent
CONs
Client-side requires JavaScript to be enabled in the browser/device
Server-side, while more versatile, may require installing additional modules on the server to work properly
Server-side will not auto-update without a page reload
RSS Feed structure could change requiring a rewrite of your code.
RSS Feed not outputting properly causing your site not to show the right information (improper tag use or unescaped characters)
May not be updated right away; I know a few feeds that update on a weekly basis
If you are looking for a way to add content to your site, look into creating or using a content management system. There are plugins available that do the same thing and will allow you some control over how the data is displayed or interpreted, not to mention that they can also cache an older feed. A list of the most common content management systems can be found here: http://en.wikipedia.org/wiki/List_of_content_management_systems
I was looking for a list like the following:
Benefits of using client-side technology for RSS feed:
Content can be updated without page reloads
Content can be sorted/manipulated without page reloads
Benefits of server-side technology for RSS feed:
Improves SEO
Doesn't require the client to have JavaScript enabled.

Can you get a specific xml value without loading the full file?

I recently wrote a PHP plugin to interface with my phpBB installation which takes my users' Steam IDs, converts them into the community IDs that Steam uses on its website, grabs the XML file for that community ID, gets the value of avatarFull (which contains the link to the full avatar), downloads it via cURL, resizes it, and sets it as the user's new avatar.
In effect it is syncing my forum's avatars with Steam's avatars (Steam is a gaming community/platform and I run a gaming clan). My issue is that whenever I read the value from the XML file it takes around a second for each user, as it loads the entire XML file before searching for the variable, and this causes the entire script to take a very long time to complete.
Ideally I want to have my script run several times a day to check each avatarFull value from Steam and check to see if it has changed (and download the file if it has), but it currently takes just too long for me to tie up everything to wait on it.
Is there any way to have the server serve up just the xml value that I am looking for without loading the entire thing?
Here is how I am calling the value currently:
$xml = @simplexml_load_file("http://steamcommunity.com/profiles/".$steamid."?xml=1");
$avatarlink = $xml->avatarFull;
And here is an example xml file: XML file
The file isn't big. Parsing it doesn't take much time. Your second is mostly spent on network communication.
Since there is no way around this, you must implement a cache. Schedule a script that will run on your server every hour or so, looking for changes. This script will take a lot of time - at least a second for every user; several seconds if the picture has to be downloaded.
When it has the latest picture, it will store it in some predefined location on your server. The scripts that serve your webpage will use this location instead of communicating with Steam. That way they will work instantly, and the pictures will be at most 1 hour out-of-date.
Added: Here's an idea to complement this: have your visitors perform AJAX requests to Steam and check via JavaScript whether the picture has changed. Do this only for pictures they are actually viewing. If it has changed, you can immediately replace the outdated picture in their browser, and you can also notify your server, which can then download the updated picture immediately. Perhaps you won't even need to schedule anything yourself.
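A sketch of the kind of scheduled script described above; the Steam ID list and the cache directory are placeholders, and in a real setup the IDs would come from the forum database.

<?php
// Run from cron every hour or so: refresh each user's cached avatar so the
// pages serving the forum never have to talk to Steam directly.
$steamIds = ['76561197960435530'];                 // placeholder ID
$cacheDir = __DIR__ . '/avatars';

foreach ($steamIds as $steamid) {
    $xml = @simplexml_load_file("http://steamcommunity.com/profiles/{$steamid}?xml=1");
    if ($xml === false || empty($xml->avatarFull)) {
        continue;                                  // skip profiles that could not be read
    }
    $avatar = file_get_contents((string) $xml->avatarFull);
    if ($avatar !== false) {
        file_put_contents("{$cacheDir}/{$steamid}.jpg", $avatar);
    }
}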
You have to read the whole stream to get to the data you need, but it doesn't have to be kept in memory.
If I were doing this with Java, I'd use a SAX parser instead of a DOM parser. I could handle the few values I was interested in and not keep a large DOM in memory. See if there's something equivalent for you with PHP.
SimpleXml is a DOM parser. It will load and parse the entire document into memory before you can work with it. If you do not want that, use XMLReader which will allow you to process the XML while you are reading it from a stream, e.g. you could exit processing once the avatar was fetched.
But as other people have already pointed out elsewhere on this page, with a file as small as the one shown, this is more likely a network latency issue than an XML issue.
Also see Best XML Parser for PHP
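For illustration, here is a minimal XMLReader sketch that stops as soon as avatarFull has been read, reusing the $steamid variable from the question.

<?php
// Stream the profile XML with XMLReader and stop once <avatarFull> is read,
// instead of building a full DOM of the document.
$reader = new XMLReader();
$reader->open("http://steamcommunity.com/profiles/{$steamid}?xml=1");

$avatarlink = null;
while ($reader->read()) {
    if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'avatarFull') {
        $avatarlink = trim($reader->readString());  // text/CDATA content of the element
        break;                                      // no need to parse the rest
    }
}
$reader->close();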
That file looks small enough; it shouldn't take that long to parse. It probably takes that long because of some sort of network issue rather than the parsing itself.
If the network is your issue then no amount of trickery will help you :(.
If it isn't the network then you could try a regex match on the input. That will probably be marginally faster.
Try this expression:
/<avatarFull><!\[CDATA\[(.*?)\]\]><\/avatarFull>/
and read the link from the first group match.
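For completeness, a quick sketch of using that expression with preg_match; $steamid is assumed to hold the user's ID as in the question.

<?php
// Regex shortcut: grab the raw profile XML and pull out the avatarFull URL
// without parsing the document at all.
$raw = file_get_contents("http://steamcommunity.com/profiles/{$steamid}?xml=1");

if (preg_match('/<avatarFull><!\[CDATA\[(.*?)\]\]><\/avatarFull>/', $raw, $matches)) {
    $avatarlink = $matches[1];   // the first capture group holds the link
}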
You could try the SAX way of parsing (http://php.net/manual/en/book.xml.php), but as I said, since the file is small I doubt it will really make a difference.
You can take advantage of caching the results of simplexml_load_file() somewhere like Memcached or the filesystem. Here is a typical workflow (sketched in code after the list):
Check whether the XML file was processed during the last N seconds
If it was, return the cached processing results
If not, get the results via SimpleXML
Process them
Resize the images
Store the results in the cache
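Here is a rough sketch of that workflow using a filesystem cache; N, the URL and the file paths are placeholders, and Memcached would work the same way with get/set calls instead of file reads and writes. Image resizing is only marked as a comment since it depends on the forum setup.

<?php
// Cache the processed avatar link per user for $n seconds on the filesystem.
function avatarFor(string $steamid, int $n = 3600): ?string
{
    $cacheFile = sys_get_temp_dir() . "/avatar_{$steamid}.txt";

    // 1. Check whether the XML was processed during the last N seconds.
    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $n) {
        return file_get_contents($cacheFile);       // 2. Return the cached result.
    }

    // 3. Otherwise get the results via SimpleXML and process them.
    $xml = @simplexml_load_file("http://steamcommunity.com/profiles/{$steamid}?xml=1");
    if ($xml === false) {
        return null;
    }
    $avatarlink = (string) $xml->avatarFull;
    // 4. Image resizing (e.g. with GD) would happen here before caching.

    // 5. Store the result in the cache.
    file_put_contents($cacheFile, $avatarlink);
    return $avatarlink;
}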
