lastBuildDate in dynamically generated RSS - php

My RSS feed is generated on demand.
As far as I can see, I have two options for lastBuildDate: the current time or pubDate.
Which one would you choose, and why?

According to the RSS 2.0 spec, lastBuildDate is the last time the content of the channel changed. (I'm not entirely satisfied with this definition: what if the feed's metadata changes? I think the common convention is to update lastBuildDate in that case, too.)
The channel-wide pubDate is supposed to be the original publication date of the content in the feed. It is never a good value to use for lastBuildDate, because pubDate is meant to stay unchanged even if an item gets updated.
Using the current time is the easy way out, but it's not perfect, because clients may start unnecessary operations whenever they see a changed lastBuildDate.
The best way would be to actually find out when the feed's content last changed, and output that.
Related question
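A minimal sketch of the "output when the content last changed" idea, written in Python for brevity (the same logic ports directly to PHP). It assumes each item carries a timezone-aware last-modified timestamp; the function name `last_build_date` and the optional `channel_meta_changed` argument are hypothetical, not something from the question.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def last_build_date(item_change_times, channel_meta_changed=None):
    """Return the newest change time as an RFC 822 date string, or None.

    item_change_times: iterable of timezone-aware datetimes, one per item
    (assumed to be the time the item was last created or edited).
    channel_meta_changed: optional aware datetime for the last change to
    the channel's own metadata (title, description, ...).
    """
    times = list(item_change_times)
    if channel_meta_changed is not None:
        times.append(channel_meta_changed)
    if not times:
        return None  # nothing known: omit the optional element
    newest = max(times).astimezone(timezone.utc)
    return format_datetime(newest, usegmt=True)


changes = [
    datetime(2020, 8, 14, 9, 30, tzinfo=timezone.utc),
    datetime(2020, 8, 16, 12, 0, tzinfo=timezone.utc),
]
print(last_build_date(changes))  # Sun, 16 Aug 2020 12:00:00 GMT
```

Because the value only moves when something actually changes, clients that compare lastBuildDate between polls are not triggered on every request.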

The pubDate of the newest item should become the lastBuildDate.
[EDIT]: If you are also using a separate pubDate for the whole feed, then lastBuildDate should be the current time, because you are building the feed on demand at the current time :).
[EDIT 2]: As lastBuildDate is optional and you're including a pubDate for the whole feed anyway, why not remove it from your feed output?

Related

Is there a way to store multiple records rather than using multiple rows in MySQL?

I would like to make full use of MySQL for a web application I have developed for a chiropractor.
So far I have been storing a single row for every year for what are called progress notes. The table structure looks something like this: (progress_note_id, patient_id, date (Y-0-0), progress_note). When the client wishes to append to the current year's progress notes, he simply clicks at the top of an HTML textarea (which uses the TinyMCE JavaScript library) to make a new entry date, along with the shorthand notes, at the beginning of the progress_note column. So far it's been working OK; with an estimated 900+ clients there could potentially be 1300+ progress notes for each year since the beginning of the application (2018).
Now the client wishes to be able to see previous progress notes (history) without being able to modify them, while still being able to write new ones. The solution I have come up with is to use XML inside the textarea, and use PHP to tell the new notes apart from the old ones.
My problem, however, is that converting my entire table from yearly to daily rows could take a lot of time and energy: splitting each yearly note into single rows (an estimated 10x) could end up producing 13,000+ rows. I realize that whatever method I choose is going to be a lot of work. Another way around this that I found is to use an XML column type in MySQL to store multiple records; to append, all I would need is PHP to parse the entire XML and add a new child node at the beginning. Each progress note is 255 - 500 characters. In the worst case, if a patient were to visit 52 times a year (once every week), there shouldn't be much overhead.
Is this the correct way to solve this problem? I do wish to stay with MySQL, and I realize that MySQL is not intended for XML. For clarification, what I hope to accomplish is the same thing I did with the current progress notes, but with XML, in descending order (newest -> oldest).
<xml_result>
<progress_note>
<date>2020-08-16</date>
<content></content>
</progress_note>
</xml_result>
Thank you for your time and for any suggestions.
First, 13,000+ rows is not a problem for MySQL. For most web applications, a single MySQL instance can handle 10 million+ records with good performance.
Second, you can store either XML or JSON in a text field and handle the decoding in your application.
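The second suggestion could look roughly like this, sketched in Python for brevity (PHP's json_decode and json_encode play the same role). The function name `prepend_note` is hypothetical, and the `date` and `content` keys simply mirror the XML outline in the question. Existing entries are never rewritten, only a new one is added at the front, which keeps the newest-first order.

```python
import json


def prepend_note(stored_text, date, content):
    """Decode the text column, add a new note at the front, re-encode it.

    stored_text: current value of the text column (a JSON array), or an
    empty string / None if the patient has no notes yet.
    """
    notes = json.loads(stored_text) if stored_text else []
    notes.insert(0, {"date": date, "content": content})
    return json.dumps(notes)


column = prepend_note("", "2020-08-09", "first visit")
column = prepend_note(column, "2020-08-16", "follow-up")
print(json.loads(column)[0]["date"])  # 2020-08-16
```

That said, one row per note remains the simpler option for queries by date, and, as noted above, the row count is not a concern.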

Change RSS feed, but only new items

I'm fairly new to PHP, and I'm trying to write a script that solves the following problem.
I have an RSS feed that gets saved to my server every 10 minutes (copied from elsewhere).
There is a problem with the timestamps (pubDate tag) on the RSS feed: they always have the correct date but 00:00:00 GMT as the time (I have no control over this).
As a result, when I use an auto-tweeting program to tweet updates from the feed (it checks it every hour or so), it only tweets the first update of each day.
So what I'm trying to do, to fix it to some degree, is to check whether the feed has changed and, if it has, change the saved pubDate to the current server time on only the new items.
I'm also kind of confused as to how I can have it check for changes. If I have a corrected version (with fairly accurate timestamps) saved to my server, it will ALWAYS think there are changes, because the timestamps in the source feed will always be 00:00:00. I'm thinking of checking both feeds for items containing the full strings such as <guid isPermaLink="true">http://services.runescape.com/m=adventurers-log/a=161/display_player_profile.ws?searchName=A13d&id=-463827091</guid>. Since the id= at the end stays constant, the script would only change the <pubDate> of items found to be new.
http://services.runescape.com/m=adventurers-log/a=161/rssfeed?searchName=A13d Here is a feed as an example. If anyone could get me started or point me to some kind of tutorial that might help, I'd really appreciate it. This is over my head, but something I need to learn how to do.
Maybe there is something wrong with the code that parses the timestamp, the date format perhaps?
I believe doing full string comparisons (<title> and <description>) between items with the same <guid> is your best bet. Here is some reading about RSS duplicate detection if you are interested.
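A rough sketch of the guid-matching approach from the question, in Python for brevity (PHP's SimpleXML or DOMDocument would follow the same steps). It assumes both files are plain RSS 2.0 without namespaces on `item`, `guid`, and `pubDate`; the function name `stamp_new_items` is hypothetical. Items whose guid already exists in the saved copy keep the timestamp assigned earlier; unseen items get the current time.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def stamp_new_items(fresh_xml, saved_xml, now=None):
    """Return the fresh feed as a string with corrected pubDate values."""
    now = now or datetime.now(timezone.utc)
    stamp = format_datetime(now.astimezone(timezone.utc), usegmt=True)

    # Map guid -> pubDate from the previously corrected copy, if any.
    saved_dates = {}
    if saved_xml:
        for item in ET.fromstring(saved_xml).iter("item"):
            guid = item.findtext("guid")
            if guid:
                saved_dates[guid] = item.findtext("pubDate")

    root = ET.fromstring(fresh_xml)
    for item in root.iter("item"):
        pub = item.find("pubDate")
        if pub is None:
            pub = ET.SubElement(item, "pubDate")
        # Known guid: reuse the earlier timestamp. New guid: use "now".
        pub.text = saved_dates.get(item.findtext("guid")) or stamp
    return ET.tostring(root, encoding="unicode")
```

Run this each time the feed is copied, passing the last corrected file as `saved_xml`, and save the result as the new corrected file. Comparing <title> and <description> as well, as suggested above, would additionally catch items that were edited under the same guid.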

XML maximum limit?

I'm using Expression Engine to generate a very large XML template. The generated XML is probably in the neighborhood of 1800 - 2000 lines. I've started to see some odd behavior: when I add a new project, my oldest project no longer shows up in the XML. It is almost as if there is some kind of limit being reached, and adding anything beyond that limit pushes the oldest item out. There are no errors on the page and the XML closes properly. Has anyone ever come across something like this?
I believe the channel:entries tag (or weblog:entries on EE1) has a default limit of 1000 entries unless specified otherwise. Try adding limit="5000" to your entries tag.

A Publisher with RSS as Datasource

I am writing some code to fetch news from an RSS feed and publish n items at a time, every m hours, to another site.
I compare the updated XML file with the previous one saved on the server using PHP.
I load the two XML files into PHP arrays, and the latest posts are filtered out using array_diff_assoc().
If the number of latest posts is greater than n, the older ones are published first and the rest are published next time. Therefore I need some way to record which items have been published.
What is the simplest way to do so? I don't want to use MySQL for such a simple task.
Can't you just store the ones that have not been published? Then each time, pull up the old, stored ones and append the new ones identified by array_diff_assoc() to the list. Publish n, and if there are more than n, store the remainder as the new list of unpublished items.
As to how to store them, I'm not a PHP programmer, but what about using PHP's serialize and unserialize functions? In Python, I'd use the pickle module if I had to store data objects of some type, and I understand these functions are the PHP equivalent.
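The pending-queue idea can be sketched as follows, in Python (the answer's own point of reference); in PHP, serialize/unserialize or json_encode/json_decode with file_put_contents would fill the same role. The function names and the use of a JSON file are assumptions for illustration. Items are assumed to be JSON-serializable values listed oldest first.

```python
import json
import os


def load_pending(path):
    """Read the list of not-yet-published items, or [] on the first run."""
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def take_batch(pending, new_items, n):
    """Queue new items behind the pending ones and split off the next n.

    Returns (items to publish now, items to keep for next time).
    """
    queue = pending + [item for item in new_items if item not in pending]
    return queue[:n], queue[n:]


def save_pending(path, pending):
    """Overwrite the stored queue with what is still unpublished."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(pending, fh)
```

Each run would call `load_pending`, pass the result plus the newly detected items to `take_batch`, publish the first list, and hand the second list to `save_pending`.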

Best way to find updates in xml feed

I have an XML feed that I have to check periodically for updates. The XML consists of many elements, and I'm trying to figure out the best (and probably fastest) way to find out which elements have been updated since the last time I checked.
My idea is to first check lastBuildDate for modifications and, if it differs from the previous one, parse the XML again. This would involve keeping each element with all of its attributes in my database. But each element can have a different number of attributes as well as other nested elements. So if I were to store each element in my database, what would be the best way to keep them?
That's why I'm asking for your help :) Thank you.
Most modern databases will store your XML as a blob if you like. (You tagged PHP... MySQL? If so, use MEDIUMTEXT.) Store your XML and generate a diff when you get a new one. If you don't have an XML diff tool, canonicalize both XML documents and then run a text diff.
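The "canonicalize, then text diff" step might look like this in Python (3.9 or later, for `ET.indent`); PHP has a comparable DOMNode::C14N() method. The helper names are hypothetical. Canonicalization removes differences in attribute order and insignificant whitespace; re-indenting afterwards puts one element per line so that a line-based diff is readable.

```python
import difflib
import xml.etree.ElementTree as ET


def normalized_lines(xml_text):
    """Canonicalize the XML, then pretty-print it one element per line."""
    canonical = ET.canonicalize(xml_text, strip_text=True)
    root = ET.fromstring(canonical)
    ET.indent(root)
    return ET.tostring(root, encoding="unicode").splitlines()


def xml_diff(old_xml, new_xml):
    """Return unified-diff lines; an empty list means no real change."""
    return list(difflib.unified_diff(
        normalized_lines(old_xml),
        normalized_lines(new_xml),
        fromfile="old", tofile="new", lineterm="",
    ))
```

An empty result means the two documents differ only in formatting, so the stored copy can be left alone; otherwise the `+` and `-` lines point at the elements that changed.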
