I heard XML can be used as a database. Can anybody give me a simple tip, or a link to a tutorial, on how to store some information this way? What is the best use of XML in PHP for data-related things?
I'm gonna throw my hat in, simply because I am working on a personal project that does in fact use XML as its storage mechanism. Notice, I didn't call it a database. It's not a database, at least not in the way most would define it. As expressed in an article I read recently, XML is not data oriented, it's document oriented.
In my case, I'm building a simple OO PHP/XML resume site for my girlfriend. I am using an XML file to store the content. I chose this mainly because it's small, lightweight, interchangeable, and easy to read. Initially, I thought I could just provide the XML to her, and she could fill in the blanks. XML is straightforward enough to allow a layman to do that.
As I continued, I realized that it wasn't very difficult to throw in an admin type interface where she could simply enter values in a form and update the resume that way.
Since the site is not really a web site, but a web document, XML works well here and nicely separates content.
Of course, I could have used JSON as well, and I may in fact adapt things to handle either JSON or XML, but I decided to use XML initially simply out of familiarity, and (this is arguable) because I assumed it would be easier for a layman to parse when entering content.
XML is not supposed to be used as a database but as a way to transport data in an application agnostic way. For example, say you have many RSS feeds in Google Reader and you want to add them into Thunderbird. You will export them from Google Reader in the XML format, and then import that XML file into Thunderbird. Both applications will know how to read and write from the XML and how to use the information (the RSS feeds) in it.
If you want to store information in a useful way that, for example, lets you organize and search through it, you will need a full-fledged database. Some good ones are MySQL and PostgreSQL. Both of those work well with PHP and have extensive beginner tutorials, all easily accessible via any search engine.
You can answer this question yourself after reading this very entertaining article by one of Stack Overflow's founders:
Back to Basics by Joel Spolsky
Check out some of the responses I got to my question "Is there a simple, flat, XML-based query-able data storage solution?" on the Programmers.StackExchange Site.
It's a mixed bag. SimpleXML is great with PHP, but there is a lot of FUD when it comes to XML query languages and implementations.
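As a rough illustration of SimpleXML used as a small query-able store, here is a minimal sketch. The resume document and its element names (`job`, `title`, `employer`, the `year` attribute) are made up for the example; only `simplexml_load_string()` and `SimpleXMLElement::xpath()` are real PHP API.

```php
<?php
// Hypothetical resume document; the element names are assumptions for illustration.
$xml = <<<XML
<resume>
  <job year="2009"><title>Developer</title><employer>Acme</employer></job>
  <job year="2011"><title>Lead Developer</title><employer>Initech</employer></job>
</resume>
XML;

$resume = simplexml_load_string($xml);

// Child elements can be walked like object properties.
$titles = [];
foreach ($resume->job as $job) {
    $titles[] = (string) $job->title;
}

// XPath works as a small query language: jobs from 2010 onwards.
$recent = $resume->xpath('//job[@year >= 2010]');
$recentEmployer = (string) $recent[0]->employer;

echo implode(', ', $titles), "\n";
echo $recentEmployer, "\n";
```

This is fine for a document of a few kilobytes. Every query re-reads the whole tree, which is the scaling problem the other answers warn about.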
To add to what Fanis said, if you want something lightweight then I strongly recommend MongoDB or SQLite.
I've finished my uni work, with a little help from you fantastic programmers out there and a few all-nighters, going as far as I could with manipulating XML data through JavaScript. Now I'm done for the summer, and my dear old mother has asked me to create a basic site for her maths tutoring service, with info, prices, etc. As she doesn't need much, I was thinking of using XML again, but this time I'm no longer restricted from using PHP to create new elements/nodes, update them, or delete them.
I was going to create a basic booking system for her, with a little admin panel for editing entries, etc. As the information doesn't really need to be too secure, XML seems to be alright for the purpose.
My question is: does anyone know of any clean, basic functions that can be used to this end with XML using PHP? By functions I mean things like Create/Insert, Edit/Update, Delete, etc.
Any help, or even a site that has a decent tutorial on it, would be great, as I've gone through YouTube and there isn't anything decent or clean and simple.
Thanks in advance!
Well, you can use the DOM methods to create/edit/delete nodes in an XML document.
http://us2.php.net/manual/en/book.dom.php
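A minimal sketch of create, update, and delete with the DOM extension follows. The `bookings` document, its element names, and the `id` attribute are assumptions for illustration; the `DOMDocument` and `DOMXPath` calls are standard.

```php
<?php
// Hypothetical bookings document; element and attribute names are assumptions.
$doc = new DOMDocument('1.0', 'UTF-8');
$doc->formatOutput = true;
$doc->loadXML('<bookings><booking id="1"><name>Alice</name></booking></bookings>');
$xpath = new DOMXPath($doc);

// Create/insert: build a new <booking> and append it to the root element.
$booking = $doc->createElement('booking');
$booking->setAttribute('id', '2');
$name = $doc->createElement('name');
$name->appendChild($doc->createTextNode('Bob'));
$booking->appendChild($name);
$doc->documentElement->appendChild($booking);

// Edit/update: locate a node with XPath and change its text.
$node = $xpath->query('//booking[@id="2"]/name')->item(0);
$node->nodeValue = 'Robert';

// Delete: a node is removed through its parent.
$old = $xpath->query('//booking[@id="1"]')->item(0);
$old->parentNode->removeChild($old);

$result = $doc->saveXML();
echo $result;
```

To persist the changes you would call `$doc->save()` with a file path instead of `saveXML()`. If two admins can edit at once, some form of file locking is needed, which is one reason a database is often suggested instead.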
Just curious, why do you want to play with XML? Maybe an easier choice would be a database as simple as SQLite: http://php.net/sqlite.
There are many websites and blogs which provide RSS feeds, but there are also many which do not. I want to turn that second type of web page into RSS feeds.
I found some solutions through Google, like Feed43, Page2rss, Dapper, etc., but I want an open source project which can perform this task, or any tutorial explaining how to do it.
Please give me suggestions and if you can explain, you are most welcome.
My preferable language is PHP.
There's nothing magic about RSS. I suggest you read this tutorial to understand how to build an RSS feed from scratch:
http://www.xul.fr/en-xml-rss.html
Then use your PHP skills to build one from your content. A generic HTML-to-RSS scraper can be found online by searching for "html to rss converter" or whatever, but most of these will be hosted solutions and the RSS feeds they produce aren't that great. A good RSS feed requires understanding the content that you're syndicating, not just the raw HTML. IMHO.
In general there is not going to be any "one size fits all" solution to something like this. You'll have to examine the HTML structure of the blog you want to build an RSS feed from, then parse out the content you are interested in, and stick it into an RSS feed.
Here are some PHP tools to help get you started:
Parsing HTML:
DOMDocument (swiss-army-knife of HTML/XML parsing)
SimpleXML (easy to use, but requires valid XML)
Tidy (can be used to clean up bad HTML)
Understanding RSS Feeds:
http://en.wikipedia.org/wiki/RSS
To construct them with PHP, you can once again use DOMDocument or SimpleXML. Another option is, depending on the format of the HTML you want to convert into RSS, you may be able to create an XSLT stylesheet to transform it.
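For the construction side, a minimal sketch with DOMDocument might look like this. The function name `buildRssFeed()`, the layout of the `$items` array, and the example.com URLs are all hypothetical; the element names follow RSS 2.0.

```php
<?php
// Sketch only: buildRssFeed() and the $items layout are hypothetical.
function buildRssFeed(string $title, string $link, string $description, array $items): string
{
    $doc = new DOMDocument('1.0', 'UTF-8');
    $doc->formatOutput = true;

    $rss = $doc->createElement('rss');
    $rss->setAttribute('version', '2.0');
    $doc->appendChild($rss);

    $channel = $doc->createElement('channel');
    $rss->appendChild($channel);

    // Helper: append <name>value</name>; createTextNode() escapes & and <.
    $addText = function (DOMElement $parent, string $name, string $value) use ($doc): void {
        $element = $doc->createElement($name);
        $element->appendChild($doc->createTextNode($value));
        $parent->appendChild($element);
    };

    $addText($channel, 'title', $title);
    $addText($channel, 'link', $link);
    $addText($channel, 'description', $description);

    foreach ($items as $item) {
        $node = $doc->createElement('item');
        $channel->appendChild($node);
        $addText($node, 'title', $item['title']);
        $addText($node, 'link', $item['link']);
        $addText($node, 'description', $item['description']);
        $addText($node, 'pubDate', date(DATE_RSS, $item['timestamp']));
    }

    return $doc->saveXML();
}

// Made-up items, as they might come out of a scraper.
$items = [
    ['title' => 'First post', 'link' => 'http://example.com/1',
     'description' => 'Hello & welcome', 'timestamp' => 0],
    ['title' => 'Second post', 'link' => 'http://example.com/2',
     'description' => 'More news', 'timestamp' => 86400],
];

$rssXml = buildRssFeed('Example blog', 'http://example.com/', 'Scraped feed', $items);
echo $rssXml;
```

Going through `createTextNode()` rather than string concatenation means scraped text containing `&` or `<` cannot break the feed.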
There is no simple or concrete answer to this question, but I will get you started.
First, you need to build a crawler of sorts. Typically, you are going to want this to be multi-threaded and run in the background on your server. This might be as simple as forking PHP processes on the server, but you might find a more efficient way, depending on how much traffic you expect.
Now probably the best way to start would be to read the DOM. See http://php.net/manual/en/class.domdocument.php. Look for headings and try to associate them with the paragraphs below them. Beware, though, that probably fewer than half the sites out there (and likely far fewer among the ones that don't already have a feed) structure their pages in an organized way. But it is a place to start.
There are plenty of element attributes too you can use, such as alt text. Also, in time you may find a lot of sites using a particular template that you can write code to handle directly.
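The heading-plus-paragraph idea can be sketched as below. The sample HTML is invented, and pairing each heading with its first following sibling paragraph is only one possible heuristic; real pages will need per-site tweaks.

```php
<?php
// Invented sample page; a real crawler would fetch this over HTTP.
$html = '<html><body>'
      . '<h2>News one</h2><p>First body.</p>'
      . '<h2>News two</h2><p>Second body.</p>'
      . '</body></html>';

// Real-world HTML is rarely valid, so collect parser warnings instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$entries = [];
foreach ($xpath->query('//h1 | //h2 | //h3') as $heading) {
    // Heuristic: the first <p> sibling after the heading is treated as its summary.
    $para = $xpath->query('following-sibling::p[1]', $heading)->item(0);
    $entries[] = [
        'title'   => trim($heading->textContent),
        'summary' => $para ? trim($para->textContent) : '',
    ];
}

print_r($entries);
```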
You should also have something to read existing feeds. If a site has a feed, there's no sense in generating one for it, right? Use SimplePie to get started, but there are alternatives if you don't like it. http://simplepie.org/
Once you have parsed the page, you'll want a database backend to track it and changes and what not.
From there, you need something to generate the feed. There are plenty of OOP classes for doing this. Often times, I just write my own, but that is up to you.
If you build sites with the simple Symphony CMS then yes, it's very easy. See this snippet of a tutorial. Learn here
Just a quick question: I know how I would build a CMS using a database, but why would you want to create a CMS with XML?
What are the pros and cons of using XML? Also, if I were to build a CMS with XML, would I still need a database, or does XML remove the need for one?
I haven't seen a CMS without a database in a while.
I think most of those were developed because "a long time ago" you didn't always get access to a database when purchasing/renting webspace.
You might be interested in storing your data in a changing format. XML definitely allows that - being able to define your own tags at will is somewhat akin to being able to add and remove columns without migrating data.
XML can remove the usage of a database - but as the size of the XML file grows, lookup and search become ever more costly. For a personal content management system - especially one where you are looking at the beginning of a file in your most common use case - it could be an acceptable solution.
Making a CMS like this would be something like using TiddlyWiki, which is a single html file that hosts an entire wiki.
For even slightly larger scale CMS, I would immediately opt for a database - probably SQLite for smaller scale, because it's the thing to do nowadays.
I'm working on a site which requires a very simple CMS - basically there's a block of text on the homepage that needs to be editable by the client. Their current hosting plan doesn't allow for a database, and including one will cost an extra $X a month which I think is unnecessary for such a basic system.
The site is currently built using CodeIgniter. I'm planning to write the CMS part of it using either flat PHP or TXT files. Are there alternative methods worth considering, and what are the pros/cons?
Okay, so further to this, I've opted for a custom flat-file system. I looked at a few of the recommended non-DB CMS systems and they seem quite good, particularly this one, which I found later: http://get-simple.info/
The reason for building my own is mainly that the site is already on the CodeIgniter framework, and I don't want to rebuild it using a different one.
So my question now is - if my system is storing data in two txt files: one for userdata and one for site content, are there massive security issues if I set the sitecontent file permissions to RW? The site is quite small and I can't imagine anyone would want to hack it, but I'd still like to know if there are any major security implications.
Try http://www.opensourcecms.com/
Examples of some that might interest you:
PivotX
pluck
razorCMS
cushyCMS
CushyCMS FTPs into your hosting account, reads your HTML, looks for tags that have class="cushy", and makes those content fields editable. It's good for what you're wanting.
I once solved this problem by simply putting markers in an HTML file, as HTML comments, and then had my PHP script parse the file and insert the desired text in between the markers. Done this way, you need no other files other than the PHP that handles the form submission from the CMS and a static HTML page.
In other words: read the file into a string, explode() the string using the marker as the delimiter, modify the second array element (if you have a single editable section enclosed by the markers) to contain the new text submitted by the user, implode() the array back into a string, then write the string back as the complete file.
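A minimal sketch of that approach follows. The marker text `<!--EDIT-->` and the function name `replaceBetweenMarkers()` are made up for illustration, and the page is an inline string here rather than a file.

```php
<?php
// Hypothetical helper: swaps the text between two identical markers.
function replaceBetweenMarkers(string $page, string $marker, string $newText): string
{
    // Two markers split the page into exactly three parts: before, editable, after.
    $parts = explode($marker, $page);
    if (count($parts) !== 3) {
        throw new RuntimeException('Expected exactly two markers in the page');
    }
    $parts[1] = $newText;

    return implode($marker, $parts);
}

$page = "<h1>Home</h1>\n<!--EDIT-->Old text<!--EDIT--><p>Footer</p>";

// Escape the submitted text so the client cannot inject markup (or a stray marker).
$updated = replaceBetweenMarkers($page, '<!--EDIT-->', htmlspecialchars('New & improved'));

echo $updated, "\n";
```

With a real file you would wrap this in `file_get_contents()` and `file_put_contents()`, and passing `LOCK_EX` to the latter is a cheap guard against two saves overlapping.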
What about SQLite? It is just a file, so there is no need to install anything. But if that is also not welcome, you could just keep the contents in TXT files and have a PHP script read them and put them into your template.
Hi I know about several PDF Generators for php (fpdf, dompdf, etc.)
What I want to know is about a parser.
For reasons beyond my control, certain information I need is only in a table inside a PDF, and I need to extract that table and convert it to an array.
Any suggestions?
I've written one before (for similar needs), and I can say this: Have fun. It's quite a complex task. The PDF specification is large and unwieldy. There are several methods of storing text inside of it. And the kicker is that each PDF generator is different in how it works. So while something like TFPDF or DOMPDF creates REALLY easy to read PDFs (from a machine standpoint), Acrobat makes some really hellish documents.
The reason is how it writes the text. Most DOM based renderers --that I've used-- write the entire line as one string, and position it once (which is really easy to read). Acrobat tries to be more efficient (and it is) by writing only one or maybe a few characters at a time, and positioning them independently. While this REALLY simplifies rendering, it makes reading MUCH more difficult.
The up side here, is that the PDF format in itself is really simple. You have "objects" that follow a regular syntax. Then you can link them together to generate the content. The specification does a good job at describing the file format. But real world reading is going to take a bit of brain power...
Some helpful pieces of advice that I had to learn the hard way if you're going to write it yourself:
Adobe likes to re-map fonts. So character 65 will likely not be A... You need to find a map object and deduce what it's doing based upon what characters are in there. And it is efficient since if a character doesn't appear in the document for that font, it doesn't include it (which makes life difficult if you try to programmatically edit a PDF)...
Write it as abstractly as possible. Write classes for each object type, and each native type (strings, numbers, etc). Let those classes parse for you. There will be a fair bit of repetition in there, but you'll save yourself in the end when you realize that you need to tweak something for only one specific type...
Write for a specific version or two of the PDF spec, and enforce it. Check the version number, and if it's higher than you expect, bail... And don't try to "make it work". If you want to support newer versions, break out the specification and upgrade the parser from there. Don't try to trial and error your way up (it's not fun)...
Good luck with compressed streams. I've found that typically you can't trust the length arguments to verify what you are uncompressing. Sometimes (for some generators) it works well... Others it's off by one or more bytes. I just attempt to deflate it if the filter matches, and then force the length...
When testing lengths, don't use strlen. Use mb_strlen($string, '8bit') since it will compensate for different character sets (and allow potentially invalid characters in other charsets).
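To make the last two points concrete, here is a small sketch. It assumes the stream uses the FlateDecode filter, whose data is in zlib format, so `gzcompress()` is used as a stand-in for a stream body taken from a real file. The sample content is invented, and treating it as UTF-8 is purely for illustrating byte counts versus character counts; text inside real PDFs usually goes through font encodings instead.

```php
<?php
// Stand-in for a FlateDecode stream body; "\xC3\xA9" is a two-byte UTF-8 "e with acute".
$original = "BT /F1 12 Tf (H\xC3\xA9llo) Tj ET";
$streamBody = gzcompress($original);

// Attempt to inflate without trusting the declared /Length value.
$inflated = @gzuncompress($streamBody);
if ($inflated === false) {
    throw new RuntimeException('Stream could not be inflated');
}

// '8bit' always counts bytes, whatever the string contains.
$bytes = mb_strlen($inflated, '8bit');
// Counting as UTF-8 gives characters, which is NOT what PDF offsets and lengths mean.
$chars = mb_strlen($inflated, 'UTF-8');

echo "bytes: $bytes, chars: $chars\n";
```

On current PHP versions plain `strlen()` also counts bytes. The `mb_strlen(..., '8bit')` form matters on older installations where mbstring function overloading could silently replace `strlen()`.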
Otherwise, best of luck...
I use PDFBox for that (http://pdfbox.apache.org/). This software is Java-based and platform-independent. It works quickly and reliably. You can use it via exec or shell execute, or via a PHP/Java bridge (http://php-java-bridge.sourceforge.net/).
Have you already looked at Xpdf? There is a program in there called pdftotext that will do the conversion. You can call it from PHP and then read in the text version of the PDF. You will need to be able to run exec() or system() from PHP, so this may not work on all hosted solutions.
Also, there are some examples on the PHP site that will convert PDF to text, although it's pretty rough. You may want to try some of those examples as well. On that PHP page, search for luc at phpt dot org.
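A rough sketch of calling pdftotext from PHP and turning the output into arrays is below. The function names are hypothetical, and splitting columns on runs of two or more spaces is only a heuristic that assumes the `-layout` option kept the table columns visually aligned.

```php
<?php
// Hypothetical helper: builds the shell command. '-layout' preserves the physical
// layout and a trailing '-' tells pdftotext to write the text to stdout.
function buildPdftotextCommand(string $pdfPath): string
{
    return 'pdftotext -layout ' . escapeshellarg($pdfPath) . ' -';
}

// Hypothetical helper: runs pdftotext and returns the output lines.
function pdfToLines(string $pdfPath): array
{
    exec(buildPdftotextCommand($pdfPath), $lines, $status);
    if ($status !== 0) {
        throw new RuntimeException("pdftotext failed with exit status $status");
    }
    return $lines;
}

// Heuristic: treat runs of two or more whitespace characters as column separators.
function splitColumns(string $line): array
{
    return preg_split('/\s{2,}/', trim($line));
}

// Example on a line shaped like pdftotext -layout output (made-up data).
$row = splitColumns('Widget      12     3.50');
print_r($row);

// With a real file: $table = array_map('splitColumns', pdfToLines('report.pdf'));
```

Cells that themselves contain double spaces, or columns that are only one space apart, will break this heuristic, so expect to adjust it per document.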
Zend_Pdf is part of the Zend Framework. Their manual states:
"The Zend_Pdf component is a PDF (Portable Document Format) manipulation engine. It can load, create, modify and save documents. Thus it can help any PHP application dynamically create PDF documents by modifying existing documents or generating new ones from scratch."
Have a look at GhostScript or ITextSharp; there are various cross-platform versions of both.
It may not actually be a table inside the PDF as the PDF loses that sort of information...
This is a PHP PDF parser, which exists in two flavours:
Free version can parse PDFs up to format PDF 1.5
Commercial add-on can parse any PDF format (up to current 1.9)