I'm making a little forum for my clan's website. I'm wondering whether I should store the thread text as TEXT or BLOB. What's the difference? I've seen that phpBB uses BLOB.
What is a BLOB anyway? I can't find much about it on Google.
A blob is just a bunch of bytes. An arbitrary number of bytes, nothing more.
If you were to store text as a blob, you'd have to worry about encoding (the process of translating text to bytes). But if you store things as text, whatever database transport you're using will make sure that the text stored in the database is properly encoded and decoded for both efficient storage and easy use.
If you're planning to store text, you should store text.
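To make the encoding point concrete, here is a minimal sketch in Python (the sample string and variable names are made up for illustration). It shows that the same text becomes different bytes under different encodings, which is exactly the bookkeeping you take on if you put text in a BLOB yourself.

```python
# Text is a sequence of characters; a BLOB column only ever sees bytes.
text = "Héllo, wörld"

# Encoding turns characters into bytes (what a BLOB column would hold).
as_utf8 = text.encode("utf-8")
as_latin1 = text.encode("latin-1")

# The same text produces different byte sequences under different encodings.
same_bytes = as_utf8 == as_latin1

# Decoding with the encoding that was used to write recovers the text.
round_trip = as_utf8.decode("utf-8")

# Decoding with a different encoding does not fail here, it just
# silently yields garbled characters.
garbled = as_utf8.decode("latin-1")
```

With a text column the database and driver keep track of the character set for you; with a BLOB you have to remember which encoding you wrote and use the same one when reading.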
phpBB could implement text encoding and decoding themselves, and that could be one reason to use blob instead of text. It's unlikely, but text data types sometimes have a maximum length, so the blob might be a workaround for phpBB in this particular instance.
Re the "what" - BLOB is Binary Large OBject; compare to CLOB: Character Large OBject. Different databases call them different things, though - for example, on SQL Server you have image/varbinary(max) for BLOB, and text/varchar(max) for CLOB.
If a system only supports a BLOB, then one option is to encode strings - for example using UTF-8. This might be what is happening.
BLOB is for binary data. I don't know the reason why phpBB 3 stores everything in binary, but I have noticed it myself. My guess is that they are compressing/encoding whatever they put into the database. You could try looking through the phpBB source code to see if there are any comments explaining it.
Related
I'm working on a web app for a private team that should let them post text documents, like a text editor, via PHP and then hypothetically it will be displayed on the site so it can be opened and viewed in the browser by other users. Basically like a post system.
I figured it should be stored using a database like MySQL/MariaDB, but as I'm still new to databases, I'm not sure whether a database can store that much text and keep it formatted.
So how should I go about storing this kind of text so it can then be fetched and posted?
Like user3783243 said, the TEXT data type can work pretty well, as it can hold up to 65,535 bytes (which is fewer than 65,535 characters if the text uses a multi-byte character set such as UTF-8). However, if you need something larger than that, you can use LONGTEXT, which can hold up to 4,294,967,295 bytes (4 GB). This means users could write very long documents and the column should still be able to hold the entire text.
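As a rough illustration of those limits, the sketch below picks the smallest member of MySQL's TEXT family that would fit a document. MySQL counts the limits in encoded bytes, so the text is encoded first. The helper name smallest_text_type and the choice of UTF-8 are assumptions for the example, not anything from the thread.

```python
# Maximum sizes, in bytes, of MySQL's TEXT family.
TEXT_LIMITS = {
    "TINYTEXT": 2**8 - 1,     # 255
    "TEXT": 2**16 - 1,        # 65,535
    "MEDIUMTEXT": 2**24 - 1,  # 16,777,215
    "LONGTEXT": 2**32 - 1,    # 4,294,967,295
}

def smallest_text_type(document: str, encoding: str = "utf-8") -> str:
    """Return the smallest TEXT type whose byte limit fits the encoded document."""
    size_in_bytes = len(document.encode(encoding))
    for name, limit in TEXT_LIMITS.items():
        if size_in_bytes <= limit:
            return name
    raise ValueError("document is larger than LONGTEXT allows")

# 40,000 two-byte characters need 80,000 bytes, which no longer fits in TEXT.
accented = "é" * 40_000
needed = smallest_text_type(accented)
```

In practice you would pick one column type up front; the point is only that the number of characters that fit depends on the character set.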
I'm making a website similar to jsfiddle, where users can save their JavaScript code and retrieve it later with its indentation intact. I don't know which data type I should use to save the code, or whether I should save it in text files instead. Also, when the data is printed using PHP, how do I keep it indented?
You could store the data in a varchar, but you should probably pursue alternate storage possibilities such as storing them in individual *.js files. As for indentation, all characters including indentation characters will be saved with the js.
I would use BLOB; it's quicker to retrieve than grabbing a filename out of the db and then getting the file. Also, though the chances are slim if you have your site set up right, it's possible for the javascript files to be executed on the server, opening it up to vulnerabilities, if you're running node.js or something.
You should use LONGTEXT type. If you can guarantee that your data will be less than about 8KB, you can use VARCHAR or TEXT types (Relevant MySQL documentation).
If there is possibility that text may contain some binary data, you may have to resort to BLOB or LONGBLOB types.
Regarding indentation: you can store tabs or spaces as well as newlines in your field and basically treat them as normal text files.
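A quick way to convince yourself that whitespace survives is a round trip through a text column. The sketch below uses Python's built-in sqlite3 module with an in-memory database purely as a stand-in for MySQL; the table and column names are invented for the example.

```python
import sqlite3

# A snippet with tabs and newlines, as a user might submit it.
snippet = "function greet(name) {\n\tif (name) {\n\t\treturn 'hi ' + name;\n\t}\n}\n"

# In-memory SQLite database used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE snippets (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")

# A parameterized insert stores the text exactly as given.
cur = conn.execute("INSERT INTO snippets (body) VALUES (?)", (snippet,))
conn.commit()

# Read it back by the id that was just generated.
stored = conn.execute(
    "SELECT body FROM snippets WHERE id = ?", (cur.lastrowid,)
).fetchone()[0]
conn.close()
```

The value read back is character-for-character the value written, tabs and newlines included, so any loss of indentation would come from how the text is displayed, not from how it is stored.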
You'd better save it in text files with a randomly chosen name and save the name into the database, or use the saved ID as name. Of course if you delete a row in the database you'll have to delete the file too.
If you just want to save them in the database then choose TEXT or BLOB. VARCHARs are bad for large texts and for cases where you don't know the length in advance.
If you display or use content uploaded by users, you have to guard carefully against people who try to exploit it.
About indentation, if you display them in a textarea you shouldn't have any problems.
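Both points (exploits and indentation) meet at the moment the stored code is written back into a page. The sketch below, with a hypothetical helper named render_textarea, escapes the user's code before placing it inside a textarea, so markup in the submission cannot break out of the element, while whitespace is passed through unchanged.

```python
import html

def render_textarea(user_code: str) -> str:
    """Wrap user-submitted code in a textarea, escaping HTML special characters."""
    # html.escape replaces &, <, > and quotes; spaces, tabs and newlines are untouched.
    return '<textarea rows="20" cols="80">' + html.escape(user_code) + "</textarea>"

# A hostile submission that tries to close the textarea and inject a script.
hostile = "</textarea><script>alert(1)</script>"
page = render_textarea(hostile)

# An ordinary indented snippet keeps its layout.
indented = render_textarea("if (a) {\n    run();\n}")
```

In PHP the corresponding function is htmlspecialchars(); the idea is the same.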
I'm creating my first web application - a really simplistic online text editor.
What I need to do is find the best way to store text based files - a lot of them.
These text files can be past 10,000 words in size (text words, not computer words). In essence, I want the text documents to be limitless in size.
I was thinking about storing the text files in my MySQL database - but thought there was a better way.
Instead I'm planning on storing the text files in an XML-based format in a directory on my server.
The rows in the database define the name of the xml based text file and the user who created the text along with basic metadata.
An ID is generated using a V4 GUID generator, which gives the text an id and stores the text in the "/store" directory on my server. The text definitions in my server contain this id, and the Android app I'm developing gets the contents of the text file by retrieving the text definition and then downloading the text to the local device using the GUID in the text definition.
I just think this is a botch job. How can I improve this system?
There have been cases of GUIDs colliding.
I don't want this to happen. A "slim" possibility isn't good enough - I need to make sure there is absolutely no chance in a GUID collision.
I was planning on checking the database for texts that have the same id before storing the text with a particular id. However, I believe that with over 20,000 pieces of text in my database this would take a long time and put unneeded stress on the server.
How can I make GUID safe?
What happens when a GUID collides?
The server backend is going to be written in PHP.
You've got several questions here, so I'll try to answer them all.
Is XML with GUID the best way to do this?
"Best" is usually subjective. This is certainly one way to do it, but you're probably adding unneeded overhead. If it's just text you're storing, why not put it in the SQL database with varchar(MAX) (or, since you're on MySQL, a TEXT or LONGTEXT column)?
Are GUID collisions possible?
Yes, but the chance of that happening is small. Ridiculously small. There are much bigger things to worry about.
How can I make GUIDs safe?
Stop worrying about them.
What happens when a GUID collides?
This depends on how you're using them. In this case, the old data stored in the location indicated by the GUID would probably be overwritten by the new data.
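To put a number on "ridiculously small", here is a back-of-the-envelope sketch using the standard birthday approximation. It assumes the IDs are version-4 UUIDs from a good random source, which leaves 122 random bits; the function name is made up for the example.

```python
import math
import uuid

# A version-4 UUID has 128 bits, 6 of which are fixed (version and variant).
RANDOM_BITS = 122

def collision_probability(n: int, bits: int = RANDOM_BITS) -> float:
    """Birthday approximation: p ~= 1 - exp(-n(n-1) / (2 * 2**bits))."""
    exponent = n * (n - 1) / (2 * 2**bits)
    # -expm1(-x) computes 1 - exp(-x) without losing precision for tiny x.
    return -math.expm1(-exponent)

# The question mentions roughly 20,000 stored texts.
p = collision_probability(20_000)

# Generating an ID of this kind with the standard library.
new_id = str(uuid.uuid4())
```

For 20,000 documents this works out to roughly 4e-29. If you still want a hard guarantee, declaring the id column as a primary key or unique index makes the database reject a duplicate on insert, and that lookup goes through the index rather than a scan of all rows.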
Well, I don't know if I'd use a GUID. I would probably just use the auto_increment key on the db table and name the files after it, because unless you have deleted records from the db without cleaning up the filesystem, the names will always be unique. I don't know if the GUID is a requirement on the Android side though.
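A minimal sketch of that idea follows, using Python's sqlite3 module as a stand-in for MySQL (SQLite's AUTOINCREMENT plays the role of auto_increment here). The table name, the save_document helper and the temporary directory are all assumptions made for the example.

```python
import sqlite3
import tempfile
from pathlib import Path

def save_document(conn: sqlite3.Connection, store_dir: Path, title: str, body: str) -> Path:
    """Insert the metadata row first, then name the file after the generated id."""
    cur = conn.execute("INSERT INTO documents (title) VALUES (?)", (title,))
    path = store_dir / f"{cur.lastrowid}.txt"
    path.write_text(body, encoding="utf-8")
    conn.commit()
    return path

# In-memory database and a temporary directory, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents (id INTEGER PRIMARY KEY AUTOINCREMENT, title TEXT NOT NULL)"
)
store_dir = Path(tempfile.mkdtemp())

first = save_document(conn, store_dir, "Notes", "first document")
second = save_document(conn, store_dir, "Draft", "second document")
```

Because the database hands out each id once, two documents can never be given the same file name, and no collision check is needed.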
There's nothing wrong with using MySQL to store the documents!
What is storing them in XML going to provide you with? Adding an additional format layer will only increase the processing time when they are to be read and formatted.
Placing them as files on disk would be no different than storing them in an RDBMS and in the longer-term probably cause you further issues down the line. (File access, disk-seek, locking, race conditions come to mind).
I'm making an Android application which takes a photo and pushes the image (as a base64 encoded string) to a PHP script; from there I'll be storing data about the image inside a MySQL database.
Would it be wise to store the image inside the database (since it's passed as a base64 string), or would it be better to convert it back to an image and store it on the filesystem?
A base64 encoded image takes up too much space (about 33% more than the binary equivalent).
MySQL offers binary types (BLOB, MEDIUMBLOB); use them.
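The 33% figure follows from base64 turning every 3 bytes of input into 4 output characters. A small sketch, using random bytes as a stand-in for an image file:

```python
import base64
import os

# 300,000 random bytes stand in for the binary contents of a photo.
raw = os.urandom(300_000)

# Base64 maps every 3 input bytes to 4 output characters.
encoded = base64.b64encode(raw)

# Relative size increase of the encoded form over the raw bytes.
overhead = len(encoded) / len(raw) - 1

# Decoding before storage recovers the original bytes exactly.
decoded = base64.b64decode(encoded)
```

So decoding the upload once on the server and storing the raw bytes (in a binary column or a file) saves about a third of the space compared with keeping the base64 string.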
Alternatively, most people prefer to store in the DB only a key to a file, which the filesystem will store more efficiently, especially if it's a big image. That's the solution I prefer for the long term. I usually use a SHA1 hash of the file content to form the path to the file, so that nothing is stored twice and it's easy to retrieve the record from the file if I want to. (I use a three-level file tree, with the first two levels made from the first two characters and characters 3 and 4 of the hash respectively, so that no directory has too many direct children.) Note that this is, for example, the logic of git's storage.
The advantage of storing them in the DB is that backups are easier to manage, especially as long as your project is small. The database will offer you a cache, but so will your server and the client; it's hard to decide a priori which will be fastest, and the difference won't be big (I suppose you don't have too many concurrent writes).
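The hash-based layout described above could be sketched as follows. This is one reading of it, not the answerer's actual code: the function names are invented, and using the full hash as the file name at the third level is an assumption.

```python
import hashlib
import tempfile
from pathlib import Path

def content_path(root: Path, data: bytes) -> Path:
    """Build root/ab/cd/<full sha1 hex> from the SHA1 of the file content."""
    digest = hashlib.sha1(data).hexdigest()
    # Level 1: characters 1-2, level 2: characters 3-4, level 3: the file itself.
    return root / digest[:2] / digest[2:4] / digest

def store(root: Path, data: bytes) -> Path:
    """Write the data once; identical content always maps to the same path."""
    path = content_path(root, data)
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)
    return path

# A temporary directory stands in for the real image store.
root = Path(tempfile.mkdtemp())
first = store(root, b"fake image bytes")
again = store(root, b"fake image bytes")  # same content, same file, no second copy
```

The database row then only needs to hold the hash (40 hex characters), from which the path can always be rebuilt.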
I've done it both ways, and every time I come back to code where I stored binary data in a MySQL table I always switch it to filesystem with a pointer in the MySQL table.
When it comes to performance, you're going to be much better off going to the FS as pulling multiple large BLOBs from a MySQL server will tend to saturate its pipe quickly. Usually it's a pipe you don't want clogged.
You could always save the base64_encode($image) in a file and only store the file path in the database, then use fopen() to get the encoded image.
My apologies if I didn't understand the question correctly.
"wise" is pretty subjective, I think. I think it would be wise from a "keep people from directly linking to my images" perspective. Also, it may be helpful as far as if you decide you need to change up dir structures etc.. it might make it easier on you (but this really depends on how you wrote your scripts to begin with..) but other than that... offhand I can't really think of any benefits to doing this.
I am developing a blog where my client wants to use a lot of images (for articles, titles, advertisements, etc.). He hardly wants any text there: the blog is to be in Arabic, and he is not happy with any of the fonts supported by web browsers, nor does he want to adopt EOT. He will be updating the blog daily (just uploading the pictures).
What data type do you think I should be using for it? BLOB or VARCHAR?
PS: I am using MySQL.
Check out the following site, which uses the same concept as my client's. Although they are rivals, they have used images for links, news or advertisements and still the site is not that heavy: http://www.sahilonline.net/
Update: I misunderstood; I had not realized that you want to store the text as images. The recommendations below are for storing native text, not image data. I have to agree with #Col that this is a very bad idea - performance-wise, in regards to search engine visibility (no indexing will take place), accessibility for people with visual or other impairments, different screen resolutions, mobile devices... Although I can understand that the selection of browser-available fonts does Arabic characters even less justice than it does ours, I would try to get the client away from this idea.
For new projects, definitely VARCHAR or TEXT / LONGTEXT with a UTF-8 character set.
The main reason is that only (VAR)CHAR and TEXT columns can do fulltext search.
How you store your data in the database has nothing to do with Arabic font support or encoding issues. A UTF-8 table can store Arabic text without problems.
For some very thorough basic reading on encoding issues, there's Joel Spolsky's famous Unicode article.
mySQL 5 String type overview
Blob stands for "Binary Large Object" while varchar stands for "variable number of characters", so the answer is obvious, in my opinion.
Though I cannot keep myself from commenting: keeping a blog post title as an image in the database is the most ridiculous solution I have ever seen in my life.
You can save your image or other blob object in a directory hierarchy and just save the file path in the database; then you can use VARCHAR or TEXT for your field!
I suggest using this method rather than saving the blob object itself.
It will be better if you use blob, as it stores the data in exactly the format you inserted and returns it as it is. Also, VARCHAR has a length limit.
Please do not go for text, as you say he "hardly wants any text there". It will be better to store images and any Unicode-formatted text in the blob data type.
Just take care of editor that you will be giving to the client for inputting the data, which should support Arabic. Plug-ins for those are available.