Ok guys, so I'm about 50% of the way through creating a "content manager" plugin for WordPress (mainly for the internal benefit of the company I work for) that can create custom post types, taxonomies and meta boxes with a pretty interface.
At the moment I'm using XML files created through PHP to parse and hold the data relating to post types, taxonomies and meta boxes. The main reason I went down the XML road was so I could allow users to export to an XML file and import it on another WordPress install. Simple.
Although now I'm not sure. Is it too server-heavy to have the plugin recursing through directories on every request to init the post types, taxonomies and meta boxes? Would I be better served to create 3 DB tables and, when I need to import or export, simply generate the XML from there?
Would love to hear your opinions!
I would go with the database solution. When the XML file grows in size, the parsing will take more and more time, as the whole file is read every time.
In a database, you can select only the values you need and don't have to parse the whole document every time.
Also, implementing an XML import/export from the values stored in the database shouldn't be much of a problem.
But if you have very tiny XML files (less than 100 chars, say) and they don't grow much, you'll have to decide whether it's worth the time to switch to a database.
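For what it's worth, a minimal sketch of the database approach (the table name, its columns and the JSON-encoded args are all assumptions, not anything from your plugin) could look like this:

<?php
// Hypothetical sketch: read post type definitions from a custom table
// instead of recursing through XML files on every request.
add_action( 'init', function () {
    global $wpdb;
    $table = $wpdb->prefix . 'cm_post_types'; // assumed table name

    // One small query per request; rows could be cached in a transient too.
    $rows = $wpdb->get_results( "SELECT slug, args FROM {$table}" );

    foreach ( $rows as $row ) {
        // args stored as JSON (an assumption); the XML export/import
        // can still be generated from these same rows on demand.
        register_post_type( $row->slug, json_decode( $row->args, true ) );
    }
} );

The same pattern would apply to the taxonomy and meta box tables.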
Problem:
I need to build a web-based tool in PHP to help users access information scattered across a collection of XML files, whose data I plan to store in MySQL tables. A lot of examples I see online seem to be based on importing data from a bunch of XML files that all use the same formatting. I do not have that luxury.
How would I be able to parse through XML files that have the following factors?
There are multiple categories of XML files in this collection, each having separate formats and types of information that differentiate each category. Ideally, I would create a separate table for each category. However...
Additional new categories of XML files could be added to this collection without my knowledge ahead of time.
Any existing category could have its format restructured and/or the types of information within could be increased or decreased, also without my knowledge ahead of time.
Even among the same category of XML files, there can be older files that have an outdated formatting version.
Expected Results:
Using an example where the collection of XML is about a group of people, if you search for "brown eyes", you get search results listing pages for everyone listed with brown eyes. One of the pages is "Robert". If you click this result, you'll go to a page where all the information from Robert's XML file is displayed (readable formatting to be handled later).
You could only create a self-learning parser, which adds new columns to the table whenever it finds new properties in the XML. Basically there are two options available: either build up a data model which at some point in time matches all the records, or stuff the mess into a NoSQL database, which does not necessarily make the mess any better. "One size fits all" (stuffing unstructured data into a structured database) is not an option.
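Only as a rough sketch of that first option (the connection details, table name and TEXT column type are all placeholders), the column-adding idea could look something like this:

<?php
// Rough sketch of a "self-learning" import: add a column whenever a new
// property shows up in an XML file, then insert the record.
$pdo   = new PDO('mysql:host=localhost;dbname=example', 'user', 'pass');
$table = 'people'; // placeholder table for one XML category

$xml = simplexml_load_file('robert.xml');
$row = [];
foreach ($xml->children() as $child) {
    // Sanitise the element name so it is safe to use as a column name.
    $column = preg_replace('/[^a-z0-9_]/i', '_', $child->getName());
    $row[$column] = (string) $child;
}

// Add any columns this file introduces that the table does not have yet.
$existing = $pdo->query("SHOW COLUMNS FROM `$table`")->fetchAll(PDO::FETCH_COLUMN);
foreach (array_keys($row) as $column) {
    if (!in_array($column, $existing, true)) {
        $pdo->exec("ALTER TABLE `$table` ADD COLUMN `$column` TEXT NULL");
    }
}

// Insert the record with whatever columns this particular file provided.
$cols  = '`' . implode('`, `', array_keys($row)) . '`';
$marks = implode(', ', array_fill(0, count($row), '?'));
$pdo->prepare("INSERT INTO `$table` ($cols) VALUES ($marks)")
    ->execute(array_values($row));

This only handles flat XML; nested structures would still need flattening or separate tables.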
I have a PHP script which builds a sitemap (an XML file according to the standard sitemap structure).
My question is about improving it. As you know, a website gets new posts daily. Also, a post may be edited several times per hour/day/month or whenever. I have two strategies to handle that:
Making a new PHP script which parses that XML file, finds the relevant node and modifies it when a post is edited, and adds a new node when a new post is added (it needs to count the existing nodes before inserting a new one, since a sitemap file can hold 50,000 URLs at most).
Executing my current PHP script on a fixed schedule (e.g. every night at midnight) using a cron job. This means rebuilding it from scratch every time (effectively building a new sitemap every night).
OK, which strategy is more efficient and worthwhile? Which one is the standard approach?
Modifying an XML file has its dangers. One reason is that you need to compare and compile actions (replace, insert, delete). This is complex and the possibility of errors is high. Another problem is that sitemaps can be large; loading them into memory for modifications might not be possible.
I suggest you generate the XML sitemap in a cron job. Do not overwrite the current sitemap directly, but copy/link it into place after it is completed. This avoids having no sitemap at all if there is an error.
If you would like to manage the URLs incrementally, do so in an SQL table and treat the XML sitemap as an export of that table.
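A rough sketch of that cron job (the sitemap_urls table, the file paths and the connection details are assumptions):

<?php
// Cron sketch: rebuild the sitemap from an SQL table of URLs, write it to a
// temporary file, and only swap it in once the new file is complete.
$pdo  = new PDO('mysql:host=localhost;dbname=example', 'user', 'pass');
$tmp  = '/var/www/example/sitemap.xml.tmp';
$live = '/var/www/example/sitemap.xml';

$writer = new XMLWriter();
$writer->openUri($tmp);
$writer->startDocument('1.0', 'UTF-8');
$writer->startElement('urlset');
$writer->writeAttribute('xmlns', 'http://www.sitemaps.org/schemas/sitemap/0.9');

// A single sitemap file may hold at most 50,000 URLs, so cap the query.
$stmt = $pdo->query('SELECT loc, lastmod FROM sitemap_urls ORDER BY lastmod DESC LIMIT 50000');
foreach ($stmt as $row) {
    $writer->startElement('url');
    $writer->writeElement('loc', $row['loc']);
    $writer->writeElement('lastmod', date('c', strtotime($row['lastmod'])));
    $writer->endElement();
}

$writer->endElement(); // urlset
$writer->endDocument();
$writer->flush();

// Atomic swap: visitors see either the old sitemap or the finished new one.
rename($tmp, $live);

The rename() at the end is the important part; a half-written sitemap is never exposed.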
This depends on how busy your website is.
If you have a small website where content changes happen on a weekly or monthly basis, you can simply create an XML and HTML sitemap by script any time new content is available and upload it to your webspace.
If you have a website with many pages and an almost daily update frequency, such as a blog, it is quite handy if you can automatically generate a new sitemap anytime new content is ready.
If you are using a CMS then you have a wide range of plugins that could update it incrementally. Or you could just make your script do it.
I want to use Drupal for building a Genealogy application. The difficulty, I see, is in allowing users to upload a gedcom file and for it to be parsed and then from that data, various Drupal nodes would be created. Nodes in Drupal are content items. So, I'd have individuals and families as content types and each would have various events as fields that would come from the gedcom file.
The gedcom file is a generic format for genealogy information. Different desktop applications take the data and convert it to a proprietary format. What I am not sure how to accomplish is giving the end user a form to upload their gedcom file and then having it create nodes, aka content items, in Drupal. I can find open source code to parse a gedcom file using Python, Perl, or other languages. So I could use one of these libraries to produce JSON, XML or CSV output from the gedcom file.
Drupal is written in PHP, and another detail of the challenge is that I don't want to ask the end user to find the file created in step one (where step one is parsing and converting the gedcom file) and upload it into Drupal. I'd like to somehow make this happen in one step as far as the end user sees. Somehow I would need a way to trigger Drupal to import the data after it is converted into JSON, XML or CSV.
I am not personally familiar with Drupal. But I do know there already is a fairly advanced program written for Drupal called Family Tree that is a project at Drupal.org.
You might want to check that project out, or even better, contribute to it and improve it.
Disclaimer: I have no connection at all to the project, and I've never even tried the module. I just happen to be aware of it.
You'll need to create a specific module (or search for an existing one) that takes the gedcom file and generates the nodes. It would be easy for the user (just one file upload).
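Untested, but a minimal Drupal 7-style sketch of such a module might look roughly like this (the module name, the 'individual' content type and my_parse_gedcom() are made-up placeholders for whatever GEDCOM parser you end up wrapping):

<?php
// gedcom_import.module (hypothetical): one upload form, one node per person.
function gedcom_import_menu() {
  $items['admin/gedcom-import'] = array(
    'title' => 'Import GEDCOM',
    'page callback' => 'drupal_get_form',
    'page arguments' => array('gedcom_import_form'),
    'access arguments' => array('administer nodes'),
  );
  return $items;
}

function gedcom_import_form($form, &$form_state) {
  $form['gedcom'] = array('#type' => 'file', '#title' => t('GEDCOM file'));
  $form['submit'] = array('#type' => 'submit', '#value' => t('Import'));
  return $form;
}

function gedcom_import_form_submit($form, &$form_state) {
  $file = file_save_upload('gedcom', array('file_validate_extensions' => array('ged')));
  if (!$file) {
    return;
  }
  // my_parse_gedcom() stands in for the parsing library of your choice.
  foreach (my_parse_gedcom($file->uri) as $person) {
    $node = new stdClass();
    $node->type = 'individual';   // assumed content type
    $node->title = $person['name'];
    $node->language = LANGUAGE_NONE;
    node_object_prepare($node);
    node_save($node);
  }
}

That keeps the whole import behind a single upload for the end user; the parsing and the node creation happen in the same submit handler.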
There have been some genealogy websites built on Drupal. The best one I've seen is https://ancestry.sandes.uk/, which was announced at Functional Drupal 7 Family History Website in the Genealogy Discussion Group. I've launched a project to build a multi-user, multi-family-tree website on Drupal at www.greatalbum.net, based on Drupal7+OG. I discuss it here and here.
To be able to upload a GEDCOM file, you will need a GEDCOM module to do the work. I found this GEDCOM Sandbox module based on D8, but it's still at a very early stage and the code is currently missing. I've contacted the author, skybow, and he's going to restore the code. I will probably hire someone to develop the module for the GreatAlbum site, but it will be on D7 for now.
It's essentially for an "is our service available in your zip code?" yes/no answer, so the lookup is trivial.
But the client wants to be able to add and remove zip codes themselves through WP. So, several thousand very short strings need to be stored. How would you set this up? A special content type? A CSV file that would get overwritten every time?
Personally I would rather handle this outside WordPress entirely, but if forced to keep it inside the WP system, what are the best ideas?
Best practice in my opinion would be a WordPress plugin with a migration for a new table that holds your data.
In your backend you would provide a way to edit the zip codes (and/or the ability to simply import a CSV file to overwrite them all, if that makes it easier for your client).
In your frontend you can just query the strings as you wish.
A custom post type is too much overhead and an abuse of the format.
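As a rough sketch only (the table name, schema and function name are assumptions), the plugin could boil down to something like:

<?php
/*
Plugin Name: Zip Code Lookup (sketch)
*/

// Create a dedicated table on activation.
register_activation_hook( __FILE__, function () {
    global $wpdb;
    require_once ABSPATH . 'wp-admin/includes/upgrade.php';
    $table   = $wpdb->prefix . 'service_zips';
    $charset = $wpdb->get_charset_collate();
    dbDelta( "CREATE TABLE {$table} (
        zip varchar(10) NOT NULL,
        PRIMARY KEY  (zip)
    ) {$charset};" );
} );

// Frontend lookup: "is our service available in your zip code?"
function example_zip_is_served( $zip ) {
    global $wpdb;
    $table = $wpdb->prefix . 'service_zips';
    return (bool) $wpdb->get_var(
        $wpdb->prepare( "SELECT COUNT(*) FROM {$table} WHERE zip = %s", $zip )
    );
}

The admin screen for editing or bulk-importing the codes is just INSERT/DELETE statements against the same table.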
Just today I've started using Drupal for a site I'm designing/developing. For my own site http://jwm-art.net I wrote a user-unfriendly CMS in PHP. My brief experience with Drupal is making me want to convert from the CMS I wrote, a CMS whose sole method (other than comments) of automatically publishing content is logging in via SSH and using nano to create a plain text file in a format like so*:
head<<END_HEAD
title = Audio
keywords= open,source,audio,sequencing,sampling,synthesis
descr = Music, noise, and audio, created by James W. Morris.
parent = home
END_HEAD
main<<END_MAIN
text<<END_TEXT
Digital music, noise, and audio made exclusively with
#=xlink=http://www.linux-sound.org#:Linux Audio Software#_=#.
END_TEXT
image=gfb#--#;Accompanying image for penonpaper-c#right
ilink=audio_2008
br=
ilink=audio_2007
br=
ilink=audio_2006
END_MAIN
info=text<<END_TEXT
I've been making PC based music since the early nineties -
fortunately most of it only exists as tape recordings.
END_TEXT
( http://jwm-art.net/dark.php?p=audio - there are just over 400 pages on there. )
*The journal-entry form, which takes some of the work out of it, has mysteriously broken. And it still required SSH access to copy the file to the main dat dir and to check that I had actually remembered the format correctly and that the code hadn't mis-formatted anything (which it always does).
I don't want to drop all the old content (just some of it), but how much work would be involved in converting it, factoring in that I've been using Drupal for a day, haven't written any PHP for a couple of years, and have zero knowledge of SQL?
How would I map the abstraction in the text file above so that a user can select these elements in the page-publishing mechanism to create a page?
How might a team of developers tackle this? How do-able is it for one guy in his spare time?
You would parse the text with PHP and use the Drupal API to save it as a node object.
http://api.drupal.org/api/function/node_save
See this similar question about programmatically creating Drupal nodes:
recipe for adding Drupal node records
Drupal 5: CCK fields in custom content type
Essentially, you create the $node object and assign values. node_save($node) will do the rest of the work for you; it's a Drupal function that creates the content record and lets other modules add data if need be.
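A rough sketch of what that could look like for the head/main format above (the 'page' content type, uid and the D6-style body property are assumptions; in D7 the body is a field array instead):

<?php
// Hypothetical import of one old page file into a node via node_save().
function example_import_page($path) {
  $text = file_get_contents($path);

  // Pull the "key = value" pairs out of the head<<END_HEAD ... END_HEAD block.
  preg_match('/head<<END_HEAD(.*?)END_HEAD/s', $text, $m);
  $head = array();
  foreach (preg_split('/\r?\n/', trim($m[1])) as $line) {
    list($key, $value) = array_map('trim', explode('=', $line, 2));
    $head[$key] = $value;
  }

  // Pull the body out of the main<<END_MAIN ... END_MAIN block.
  preg_match('/main<<END_MAIN(.*?)END_MAIN/s', $text, $m);

  $node = new stdClass();
  $node->type  = 'page';        // assumed content type
  $node->title = $head['title'];
  $node->body  = trim($m[1]);   // still in your markup; convert links/images separately
  $node->uid   = 1;
  node_save($node);
}

Loop that over your ~400 files and most of the remaining work is translating your xlink/ilink markup into HTML or node references.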
You could also employ XML RPC services, if that's possible on your setup.
Since you have not written any PHP for a long time, and you are probably in a hurry, I suggest this approach:
Download and install this Drupal module: http://drupal.org/project/node_import
This module imports data (nodes, users, taxonomy entries, etc.) into Drupal from CSV files.
Read its documentation and spend some time learning how to use it.
Convert your blog into CSV files. Unfortunately, I cannot help you much with this, because your entries have a complex structure. Writing code that converts them into CSV files may take about as much time as creating the CSV files manually (a rough sketch of such a conversion follows below).
Use the Node import module to import the data into your new website.
Of course some issues will remain that you will have to handle manually, like creating menus etc.
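For the CSV conversion step, a rough sketch (the file locations and column names are assumptions, and Node import would still need to be told how to map the columns):

<?php
// Hypothetical converter: one CSV row per old page file.
$out = fopen('nodes.csv', 'w');
fputcsv($out, array('title', 'keywords', 'descr', 'parent', 'body'));

foreach (glob('/path/to/old-cms/dat/*') as $path) {
    $text = file_get_contents($path);

    preg_match('/head<<END_HEAD(.*?)END_HEAD/s', $text, $head);
    preg_match('/main<<END_MAIN(.*?)END_MAIN/s', $text, $main);

    $fields = array('title' => '', 'keywords' => '', 'descr' => '', 'parent' => '');
    foreach (preg_split('/\r?\n/', trim($head[1])) as $line) {
        list($key, $value) = array_map('trim', explode('=', $line, 2));
        if (array_key_exists($key, $fields)) {
            $fields[$key] = $value;
        }
    }
    $fields['body'] = trim($main[1]);

    fputcsv($out, $fields);
}
fclose($out);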