Should a database be used for storing rich text? - php

I'm working on a web app for a private team that should let them post text documents, as in a text editor, via PHP. The documents would then be displayed on the site so other users can open and view them in the browser. Basically, a post system.
I figured the text should be stored in a database like MySQL/MariaDB, but as I'm still new to databases I'm not sure whether a database can store that much text and keep it formatted.
So how should I go about storing this kind of text so it can then be fetched and posted?

As user3783243 said, the TEXT data type can work pretty well, as it can hold up to 65,535 bytes (somewhat fewer characters if the text contains multi-byte characters, as UTF-8 text often does). If you need something larger than that, MEDIUMTEXT holds up to 16,777,215 bytes and LONGTEXT up to 4,294,967,295 bytes, or about 4 GB. This means users could write very long documents and the column would still be able to hold the entire text.
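MySQL counts these limits in bytes, so whether a document fits depends on its encoded size rather than its character count. The following is a minimal sketch in Python (chosen only for illustration; the thread itself is about PHP) that picks the smallest TEXT-family type for a given string. The function name `smallest_text_type` is hypothetical, and UTF-8 is assumed as the column encoding.

```python
# Maximum payload sizes in bytes for MySQL's TEXT family.
TEXT_LIMITS = [
    ("TINYTEXT", 255),
    ("TEXT", 65_535),
    ("MEDIUMTEXT", 16_777_215),
    ("LONGTEXT", 4_294_967_295),
]


def smallest_text_type(document: str, encoding: str = "utf-8") -> str:
    """Return the smallest TEXT type whose byte limit fits the encoded document."""
    size = len(document.encode(encoding))  # bytes, not characters
    for name, limit in TEXT_LIMITS:
        if size <= limit:
            return name
    raise ValueError("document exceeds the LONGTEXT limit")
```

For example, 40,000 copies of "é" are only 40,000 characters but 80,000 bytes in UTF-8, so they no longer fit in TEXT and need MEDIUMTEXT.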

Related

Which data type in MySQL should I choose to save JavaScript code using PHP?

I'm making a website similar to jsfiddle, where users can save their JavaScript code and retrieve it with its indentation intact. I don't know which data type I should use to save the code, or whether I should save it in text files instead. Also, when the data is printed using PHP, how do I indent it?
You could store the data in a VARCHAR, but you should probably pursue alternative storage possibilities, such as storing the code in individual *.js files. As for indentation, all characters, including indentation characters, will be saved with the JS.
I would use BLOB; it's quicker to retrieve than grabbing a filename out of the DB and then reading the file. Also, though the chances are slim if your site is set up right, JavaScript files stored on the server could be executed there if you're running Node.js or something similar, which opens you up to vulnerabilities.
You should use LONGTEXT type. If you can guarantee that your data will be less than about 8KB, you can use VARCHAR or TEXT types (Relevant MySQL documentation).
If there is a possibility that the text may contain binary data, you may have to resort to the BLOB or LONGBLOB types.
Regarding indentation: you can store tabs or spaces as well as newlines in your field and basically treat them as normal text files.
You'd better save it in text files with a randomly chosen name and save the name into the database, or use the saved ID as name. Of course if you delete a row in the database you'll have to delete the file too.
If you just want to save them in the database then choose TEXT or BLOB. VARCHARs are bad for large texts and for cases where you don't know the length.
If you display or use content uploaded by users, you have to guard carefully against people who try to exploit it.
About indentation, if you display them in a textarea you shouldn't have any problems.
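The points about indentation and untrusted content can be sketched together. This is an illustrative example only, under these assumptions: it is written in Python rather than PHP, an in-memory SQLite database stands in for MySQL, and the table and column names are hypothetical. The same pattern applies in PHP with a prepared statement and `htmlspecialchars()`.

```python
import html
import sqlite3

# In-memory SQLite stands in for MySQL/MariaDB; "snippets" and "code" are made-up names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE snippets (id INTEGER PRIMARY KEY, code TEXT NOT NULL)")

source = "function greet(name) {\n\tif (name) {\n\t\treturn '<b>' + name + '</b>';\n\t}\n}"

# A parameterized query stores the text as-is, tabs and newlines included.
cur = conn.execute("INSERT INTO snippets (code) VALUES (?)", (source,))
stored = conn.execute(
    "SELECT code FROM snippets WHERE id = ?", (cur.lastrowid,)
).fetchone()[0]

# Escape before embedding in HTML so user-supplied markup is displayed, not interpreted.
page_fragment = "<pre>" + html.escape(stored) + "</pre>"
```

The text comes back exactly as it went in, so no extra work is needed to "re-indent" it; wrapping it in a `<pre>` element or a textarea preserves the whitespace on screen.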

Scalable way to store files on server (PHP)?

I'm creating my first web application - a really simplistic online text editor.
What I need to do is find the best way to store text based files - a lot of them.
These text files can be more than 10,000 words long (text words, not computer words). In essence, I want the text documents to be limitless in size.
I was thinking about storing the text files in my MySQL database - but thought there was a better way.
Instead, I'm planning on storing the text files in an XML-based format in a directory on my server.
The rows in the database define the name of the xml based text file and the user who created the text along with basic metadata.
An ID is generated using a V4 GUID generator, which gives the text an ID, and the text is stored under that ID in the "/store" directory on my server. The text definitions on my server contain this ID, and the Android app I'm developing gets the contents of the text file by retrieving the text definition and then downloading the text to the local device using the GUID in the text definition.
I just think this is a botch job. How can I improve this system?
There have been cases of GUIDs colliding.
I don't want this to happen. A "slim" possibility isn't good enough - I need to make sure there is absolutely no chance of a GUID collision.
I was planning on checking the database for texts that have the same ID before storing a text with a particular ID. However, I believe that with over 20,000 pieces of text in my database this would take a long time and put unneeded stress on the server.
How can I make GUID safe?
What happens when a GUID collides?
The server backend is going to be written in PHP.
You've got several questions here, so I'll try to answer them all.
Is XML with GUID the best way to do this?
"Best" is usually subjective. This is certainly one way to do it, but you're probably adding unneeded overhead. If it's just text you're storing, why not put it in the SQL database with varchar(MAX) (that is SQL Server syntax; in MySQL the closest equivalents are TEXT or LONGTEXT)?
Are GUID collisions possible?
Yes, but the chance of that happening is small. Ridiculously small. There are much bigger things to worry about.
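To put a rough number on "ridiculously small", here is a back-of-the-envelope sketch in Python using the standard birthday approximation. It assumes the GUIDs are version-4 (random) and come from a properly random generator; the function name is hypothetical.

```python
def uuid4_collision_probability(count: int) -> float:
    """Approximate chance that any two of `count` random version-4 UUIDs match."""
    random_bits = 122  # a version-4 UUID has 128 bits, 6 of which are fixed
    # Birthday approximation: p ~ n(n-1) / (2 * 2^bits), valid while p is tiny.
    return count * (count - 1) / (2 * 2 ** random_bits)


probability = uuid4_collision_probability(20_000)
```

For the 20,000 texts mentioned in the question, this comes out on the order of 10^-29.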
How can I make GUIDs safe?
Stop worrying about them.
What happens when a GUID collides?
This depends on how you're using them. In this case, the old data stored in the location indicated by the GUID would probably be overwritten by the new data.
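If overwriting is the concern, one option is to make a collision harmless rather than impossible: create the file in exclusive mode and retry with a fresh ID if the name is already taken. The sketch below is in Python for illustration; the `store_text` helper, the `.txt` suffix, and the temporary directory (standing in for "/store") are all assumptions, not part of the original design.

```python
import os
import tempfile
import uuid


def store_text(directory: str, text: str) -> str:
    """Write text to a new file named by a random UUID and return that UUID."""
    while True:
        doc_id = str(uuid.uuid4())
        path = os.path.join(directory, doc_id + ".txt")
        try:
            # Mode "x" raises FileExistsError instead of overwriting an existing file.
            with open(path, "x", encoding="utf-8") as handle:
                handle.write(text)
            return doc_id
        except FileExistsError:
            continue  # vanishingly unlikely: pick another ID and try again


store_dir = tempfile.mkdtemp()  # stand-in for the "/store" directory
first_id = store_text(store_dir, "first document")
second_id = store_text(store_dir, "second document")
```

A primary key or unique index on the ID column gives the same guarantee on the database side, and it does so with an index lookup rather than a scan of all 20,000 rows.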
Well, I don't know if I'd use a GUID. I would probably just use the auto_increment key on the DB table and name the files after it, because unless you have deleted records from the DB without cleaning up the filesystem, the names will always be unique. I don't know whether the GUID is a requirement on the Android side, though.
There's nothing wrong with using MySQL to store the documents!
What is storing them in XML going to provide you with? Adding an additional format layer will only increase the processing time when they are to be read and formatted.
Placing them as files on disk would be no different than storing them in an RDBMS and in the longer-term probably cause you further issues down the line. (File access, disk-seek, locking, race conditions come to mind).

What datatype is best for storing articles in SQL database?

I'm creating a custom CMS for my site, and I'm going to store articles (text and images per article) in a MySQL database. What data type is best suited for that task?
You shouldn't try to store text and images in a single field, and in this circumstance you probably shouldn't store images in your database at all.
The best solution for this would be to use some kind of markup system in your articles - at its simplest this could be a filtered subset of HTML - that is stored as plain text in your database and then parsed in the browser in some way. Obviously, if you use filtered HTML you would not need to write any special code to parse it, but taking this approach does raise possible security issues.
Other options to investigate include Markdown (the system used by this site) as well as BBCode (mainly used by online forums), as well as many others.
To summarise - don't store images and text in one field. Store text, and interpret that text to load images and other media in your articles as appropriate.
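As a rough illustration of "store text, and interpret that text", the sketch below escapes all stored text and then translates a tiny, made-up BBCode-like subset into HTML. It is written in Python for illustration, the tag set and the `/images/` folder are assumptions, and a real site would more likely rely on an established Markdown or BBCode library than on hand-written patterns.

```python
import html
import re


def render_article(stored_text: str) -> str:
    """Escape everything, then translate a small whitelist of tags to HTML."""
    safe = html.escape(stored_text)  # neutralize any raw HTML first
    safe = re.sub(r"\[b\](.*?)\[/b\]", r"<strong>\1</strong>", safe, flags=re.S)
    safe = re.sub(r"\[i\](.*?)\[/i\]", r"<em>\1</em>", safe, flags=re.S)
    # Image references are limited to simple file names under one fixed folder.
    safe = re.sub(
        r"\[img\]([\w-]+\.\w+)\[/img\]", r'<img src="/images/\1" alt="">', safe
    )
    return safe


rendered = render_article("[b]Hello[/b] <script>alert(1)</script> [img]cat.png[/img]")
```

The database column holds only the plain text with its tags; the image itself lives on disk and is pulled in when the text is rendered.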
Save only the image name (and/or location, depending on how big your CMS is) as VARCHAR, and save the text as TEXT.
In my opinion, use TEXT for articles containing HTML characters; that's what I use. But there are many other things to consider, and it depends on the content of your articles. How you handle images is up to you: you would just store the path of the picture, unless you plan to store the picture itself.
Please see this article in which Microsoft insists the text datatype is going to disappear and therefore should not be used in new development. The alternatives are varchar and nvarchar for text and varbinary for images. (Note that this deprecation concerns SQL Server's text, ntext and image types, not MySQL's TEXT.)
Store images and text separately. Assuming you are trying to store the binary images in the database, I recommend storing images as a BLOB datatype and TEXT for the text.
Don't use VARCHAR for article text because it does not dynamically expand past the size you make it.
VARCHAR would be the best choice for storing article text, as the text and image data types are going to disappear in the future.
For images you can store them in a folder similar to what you use for css/js and store its src value in the database. This would be efficient in terms of both speed and storage.

BLOB over varchar?

I am developing a blog where my client wants to use a lot of images (for articles, titles, advertisements, etc.). He wants hardly any text there, because the blog is to be in Arabic and he is not happy with any of the fonts supported by web browsers, nor does he want to adopt EOT. He will be updating the blog daily (just by uploading pictures).
What data type do you think I should be using for it? BLOB or VARCHAR?
PS: I am using MySQL.
Check out the following site, which uses the same concept as my client's. Although they are rivals, they have used images for links, news or advertisements and still the site is not that heavy.
http://www.sahilonline.net/
Update: I misunderstood; I now see that you want to store the text as images. The recommendations below are for storing native text, not image data. I have to agree with @Col that this is a very bad idea - performance-wise, and with regard to search engine visibility (no indexing will take place), accessibility for people with visual or other impairments, different screen resolutions, mobile devices... Although I can understand that the selection of browser-available fonts does Arabic characters even less justice than it does ours, I would try to get the client away from this idea.
For new projects, definitely VARCHAR or TEXT / LONGTEXT with a UTF-8 character set.
The main reason is that only (VAR)CHAR and TEXT columns can do fulltext search.
How you store your data in the database has nothing to do with Arabic font support or encoding issues. A UTF-8 table can store Arabic text without problems.
For some very thorough basic reading on encoding issues, there's Joel Spolsky's famous Unicode article.
MySQL 5 string type overview
BLOB stands for "binary large object" while VARCHAR stands for "variable-length character string", so the answer is obvious, in my opinion.
Though I cannot keep myself from commenting: keeping a blog post title as an image in the database is the most ridiculous solution I have ever seen in my life.
You can save your image or other BLOB object in a directory hierarchy and just save the file path in the database; then you can use VARCHAR or TEXT for your field.
I suggest using this method rather than saving the BLOB object itself.
It would be better to use BLOB, as it stores the data in the format you inserted it in and returns it as it is. Also, VARCHAR has a length limit.
Please do not go for TEXT, since you say the client "hardly wants any text there". It would be better to store images and any Unicode-formatted text in the BLOB data type.
Just take care that the editor you give the client for inputting the data supports Arabic. Plug-ins for that are available.

Store text in BLOB?

I'm making a little forum for my clan's website. I'm wondering if I should store the thread text in TEXT or BLOB. What's the difference? I've seen that phpBB uses BLOB.
What is BLOB anyway? I can't find much about it on Google.
A blob is just a bunch of bytes. An arbitrary number of bytes, nothing more.
If you were to store text as a blob, you'd have to worry about encoding (the process of translating text to bytes). But if you store things as text, whatever database transport you're using will make sure that the text stored in the database is properly encoded and decoded for both efficient storage and easy use.
If you're planning to store text, you should store text.
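The encoding concern can be made concrete with a short sketch (Python, for illustration only; the sample string is arbitrary). With a blob, the application chooses the encoding when writing and has to remember the same choice when reading:

```python
text = "café résumé"

# Storing in a BLOB means the application picks the encoding itself.
blob_bytes = text.encode("utf-8")

# Reading back with the same encoding restores the text exactly.
round_trip = blob_bytes.decode("utf-8")

# Reading back with a different encoding raises no error here; it just yields garbage.
garbled = blob_bytes.decode("latin-1")
```

With a text column, the database knows the character set of the column and the connection, so this bookkeeping is handled for you.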
phpBB could be implementing text encoding and decoding themselves, and that could be one reason to use blob instead of text. It's unlikely, but text data types sometimes have a maximum length, so the blob might be a workaround for phpBB in this particular instance.
Re the "what" - BLOB is Binary Large OBject; compare to CLOB: Character Large OBject. Different databases call them different things, though - for example, on SQL Server you have image/varbinary(max) for BLOB, and text/varchar(max) for CLOB.
If a system only supports a BLOB, then one option is to encode strings - for example using UTF8. This might be what is happening.
BLOB is for binary data. I don't know the reason why phpBB 3 stores everything in binary, but I have noticed it myself. My guess is that they are compressing/encoding whatever they put into the database. You could try looking through the phpBB source code to see if there are any comments explaining it.
