I am attempting to upload a .pdf file into a mysql database using php.
Everything works except for the contents of the file. No matter how I try to escape special characters, the query always fails, mostly with "Unknown Command \n".
I have used addslashes, mysql_real_escape_string, removeslashes etc.
Does anyone have any ideas on how to escape file contents?
Many Thanks,
I don't see why you would want to store a file in a database, but I suggest you take a look at prepared statements.
I've used the following sequence before, which seems to work nicely, and will store any data into the db, including images, pdfs, arrays of data, etc... :)
Storing the data (can be a string, array, object, etc.);
First, turn the data into a base64 encoded string
$strData = strtr(
    base64_encode(
        addslashes(
            gzcompress( serialize($dataToStore) , 9)
        )
    ) , '+/=', '-_,');
Then store that string data in the db...
Retrieving the data;
Extract the string data from the db
decode the data back to what you want (you may need to perform an extra step after this depending on the input data, array, image, etc.)
$returnData = unserialize(
    gzuncompress(
        stripslashes(
            base64_decode(
                strtr($strDataFromDb, '-_,', '+/=')
            )
        )
    )
);
This certainly helped me to store what I needed to store in a mySQL db!
Guess: You may be encountering errors due to an incompatibility between character sets. A PDF is a binary file, so you need to make sure that the db column is set up to handle binary data (a BLOB type, for example).
Beside the escaping problem you might run into "packet too large" errors if the (MySQL) system variable max_allowed_packet is set to a "small" value.
Using the mysqli extension, prepared statements and mysqli_stmt::send_long_data you can avoid both problems.
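To make that suggestion concrete, here is a minimal sketch of how it could look. The table `files` with columns `name` and `content` and the function name `store_file_as_blob` are made up for illustration; only `prepare`, `bind_param`, `send_long_data` and `execute` come from mysqli itself.

```php
<?php
// Hypothetical schema: files(name VARCHAR(255), content LONGBLOB).
// Streams a file into a BLOB column in chunks, so the contents are never
// escaped by hand and no single packet has to hold the whole file.
function store_file_as_blob(mysqli $db, string $path): bool
{
    $stmt = $db->prepare('INSERT INTO files (name, content) VALUES (?, ?)');
    if ($stmt === false) {
        return false;
    }

    $name = basename($path);
    $content = null; // placeholder only; the real bytes are sent below
    $stmt->bind_param('sb', $name, $content);

    $handle = fopen($path, 'rb');
    if ($handle === false) {
        $stmt->close();
        return false;
    }

    while (!feof($handle)) {
        $chunk = fread($handle, 8192);
        if ($chunk === false) {
            break;
        }
        if ($chunk !== '') {
            // Parameter numbers are zero-based: 1 is the second placeholder.
            $stmt->send_long_data(1, $chunk);
        }
    }
    fclose($handle);

    $ok = $stmt->execute();
    $stmt->close();
    return $ok;
}
```

Because the data travels separately from the SQL text, characters like \n or quotes in the PDF never reach the SQL parser.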
I am trying to generate a JSON string using php from an array, and I would like to store that string in a csv file, also with PHP. I want to do this, because I am working with a quite large amount of data, and I would like to use MySQL's LOAD DATA LOCAL INFILE to populate and update my database table.
This is the code I have:
$tmpFileProducts = 'path/to/file';
$tmpFileProductsHandler = fopen($tmpFileProducts, 'w');
foreach ($attributeBatch as $productId => $batch) {
    fputcsv($tmpFileProductsHandler, array($productId, $batch['title'], $batch['parsed'], json_encode($batch['attributes'])), "|", "\"");
}
My problem is that when I create the CSV file, the JSON double quotes are not escaped, so I end up with lines like this in my CSV file:
43541|"telefon mobil 3l 2020 4g "|"2020-12-05 17:38:19"|"{""color"":""dark chrome"",""memory_value"":4294967296,""storage_value"":68719476736,""sim_slot"":""dual""}"
My first possible solution would be to change the string enclosure of my CSV file, but what enclosure should I use to ensure no conflicts can arise with the inner JSON column? I am in complete control of the array that is stringified; it will only contain ASCII characters.
Would there be a way, to keep the current string enclosure, and instead escape the JSON string somehow? Later on, I will need to fetch the data that is included to the database, and convert it to an array again.
DISCLAIMER: I am well aware that instead of storing the data as a JSON string, I could store it in a specific relational table (which I am also doing), but I would need quick access to this data, for a background script that is running, and I would like to save on the time of the queries, to the relational table, as when the background script will use this data, it doesn't need to search in it.
Follow-up question: as I am explicitly telling the fputcsv function what to use as the string enclosure, shouldn't it automatically escape all the similar inner strings?
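For what it's worth, the doubled quotes in the sample line are the usual CSV way of escaping the enclosure character, so fputcsv may already be doing what the follow-up question asks for. A small round-trip sketch, using an in-memory stream and sample values invented for illustration, suggests the JSON survives intact. Passing an empty string as the escape character (PHP 7.4 and later) is an assumption on my part: it keeps backslashes inside the JSON from interfering with the quote doubling.

```php
<?php
// Sample data invented for this sketch.
$attributes = ['color' => 'dark chrome', 'sim_slot' => 'dual'];
$row = [43541, 'telefon mobil', '2020-12-05 17:38:19', json_encode($attributes)];

// Write one CSV line to memory: '|' separator, '"' enclosure, no escape char.
$handle = fopen('php://memory', 'w+');
fputcsv($handle, $row, '|', '"', '');

// Read the raw line back to see the doubled quotes.
rewind($handle);
$line = stream_get_contents($handle);

// Parse the same line again with matching settings.
rewind($handle);
$parsed = fgetcsv($handle, 0, '|', '"', '');
fclose($handle);

// The JSON column decodes back to the original array.
$decoded = json_decode($parsed[3], true);
```

If the same holds on the MySQL side, LOAD DATA would need matching FIELDS TERMINATED BY and ENCLOSED BY clauses; I have not verified that part here.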
Is it possible to escape a serialized string using PDO before it is inserted in the database?
I've built something where content from a WYSIWYG editor is serialized. If someone pastes text from Word into the editor and saves, I get the following error because multiple style tags were added:
unserialize(): Error at offset 105 of 1020
I've tried saying don't paste from Word haha, however I would like to build it so that it is possible even it's not the best way to do it.
I found the PDO function quote, but I'm not sure if that is what I'm looking for.
Besides that function, I couldn't find any other solutions. I'm already using PDO prepared statements.
I would like to know if it is possible. Thanks for the effort.
I believe it is related to encoding.
You should base64_encode before saving and base64_decode after loading, as written here:
$toDatabase = base64_encode(serialize($data)); // value to save to the database
$fromDatabase = unserialize(base64_decode($toDatabase)); // restore from the stored value
Also, to avoid problems with encoding when you connect to database execute this SQL request:
"SET NAMES 'utf8'"
Context
I store images in BLOB columns in a MySQL DB with PDO (yeah, this is needed).
I upload a base64-encoded .png from the client's browser to a .php web service through AJAX, and store it in my database using base64_decode().
Later, I get it back on the client's browser. And upload it again, and so on until space-time continuum breaks.
Retrieving a valid BLOB (imported directly on phpMyAdmin, so 100% sure) from the database is fine, I can print it well on browser.
But storing it on MySql...
Issue
The setup looks like this:
$dbh = new PDO('mysql:host=localhost;dbname=Me', 'My', 'Myself');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->exec("SET CHARACTER SET utf8"); //I tried playing with charset too
$query = $dbh->prepare("UPDATE i_like_underscores SET `my_blob`=:my_blob WHERE `it`=`:belongs");
$my_blob = base64_decode($_POST['my_blob']);
$query->bindParam(':my_blob', $my_blob, PDO::PARAM_LOB);
//The WHERE clause has really no importance here, so I don't even bind it
$query->execute();
It seems that PDO systematically removes some special characters from my blob during this process (though I can't diagnose exactly when), because when I later get my picture back (and base64-encode it), all the + and = are gone from my base64 string (while / remains), so the image is corrupted.
I guess it automatically escapes when I bind it, but I can't tell, as base64-decoded .png data is unreadable binary.
I spent many hours on it, and tried :
Changing the encoding
PDO::quote() after base64_decode()
Putting various quotes on my SQL query
Prepared statements and direct PDO::query
Surfing the web for docs on all the PDO function I used
Finding similar cases, on StackOverflow too, no luck
To see how it's done in phpMyAdmin
Setting type PDO::PARAM_STR
Not thinking about quitting PDO just for that special case
Writing all my code backwards
And black magic
Without luck... Could someone give me a clue?
Thanks to davidstrachan, I've solved the removal of = using :
$dbh->setAttribute( PDO::ATTR_EMULATE_PREPARES, false );
Then, I came back to that problem later, just to see that the serialization of my form in jQuery was done wrong.
It escaped all the + as spaces (or %20) instead of %2B.
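The effect is easy to reproduce on the PHP side alone. The sketch below is only an illustration: the three input bytes are chosen because they happen to encode to nothing but + characters, and urldecode() stands in for what PHP does to form-encoded POST values.

```php
<?php
// These three bytes base64-encode to four '+' characters.
$base64 = base64_encode("\xfb\xef\xbe");

// Sent without percent-encoding, every '+' is decoded as a space.
$asSentRaw = urldecode($base64);

// Sent percent-encoded (%2B), the '+' characters survive.
$asSentEncoded = urldecode(rawurlencode($base64));

// Server-side repair, safe only because base64 never contains spaces.
$repaired = str_replace(' ', '+', $asSentRaw);
```

Encoding the value properly on the client is the cleaner fix; the str_replace repair is a fallback for when the client code cannot be changed.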
I have tinyMCE editor which is passing data to php processing file.
If I use $variable=$_POST['tinyMCE_textarea']; everything is ok.
But I want to secure it so nothing bad will come from user who entered some data into textarea.
And when I use $variable=mysql_real_escape_string($_POST['tinyMCE_textarea']);
the result is damaged, with \" sequences added. So how can I add maximum security without changing the variable?
TinyMCE is able to clean up data, however it is critical that you don't rely on client-side stuff.
To secure data for database, you use mysql_real_escape_string(). The result is intended for use with mysql and not for display.
To secure data for display, you use the htmlspecialchars() function. htmlentities() also works but would convert all applicable entities, so for security you only need htmlspecialchars().
So the simplified picture is
// Insert to database
mysql_query("INSERT INTO data (content) VALUES ('" . mysql_real_escape_string( $_POST['tinyMCE_textarea'] ) . "')");
// Display to user - doesn't matter whether the data is from post or database
echo htmlspecialchars ( $_POST['tinyMCE_textarea'] );
Use prepared statements (mysqli or PDO).
Use htmlentities(), or convert at least '<', '>' and '"' to "&lt;", "&gt;", "&quot;" and so on.
Just remember to escape the user input before outputting (using for example htmlentities()) and escape the string before storing it in your database.
Use SQL parameter binding, and you are safe from any injection.
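As a rough sketch of what parameter binding could look like for the TinyMCE case: this assumes an already connected PDO instance and reuses the `data` table and `content` column from the earlier answer. The function name `save_content` is made up.

```php
<?php
// Assumes a connected PDO instance and a table data(content TEXT).
// The value is sent as a bound parameter, so it is stored unchanged:
// no backslashes are added and no manual escaping is needed.
function save_content(PDO $pdo, string $content): bool
{
    $stmt = $pdo->prepare('INSERT INTO data (content) VALUES (:content)');
    return $stmt->execute([':content' => $content]);
}
```

It would be called with the raw POST value, e.g. save_content($pdo, $_POST['tinyMCE_textarea']), with HTML escaping left to the point where the value is displayed.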
I have a form that, among other things, accepts an image for upload and sticks it in the database. Previously I had a function filtering the POSTed data that was basically:
function processInput($stuff) {
    $formdata = $stuff;
    $formdata = htmlentities($formdata, ENT_QUOTES);
    return "'" . mysql_real_escape_string(stripslashes($formdata)) . "'";
}
When, in an effort to fix some weird entities that weren't getting converted properly I changed the function to (all that has changed is I added that 'UTF-8' bit in htmlentities):
function processInput($stuff) {
    $formdata = $stuff;
    $formdata = htmlentities($formdata, ENT_QUOTES, 'UTF-8'); //added UTF-8
    return "'" . mysql_real_escape_string(stripslashes($formdata)) . "'";
}
And now images will not upload.
What would be causing this? Simply removing the 'UTF-8' bit allows images to upload properly but then some of the MS Word entities that users put into the system show up as gibberish. What is going on?
EDIT: Since I cannot do much to change the code on this beast I was able to slap a bandaid on by using htmlspecialchars() rather than htmlentities() and that seems to at least leave the image data untouched while converting things like quotes, angle brackets, etc.
bobince's advice is excellent but in this case I cannot now spend the time needed to fix the messy legacy code in this project. Most stuff I deal with is object oriented and framework based but now I see first hand what people mean when they talk about "spaghetti code" in PHP.
function processInput($stuff) {
    $formdata = $stuff;
    $formdata = htmlentities($formdata, ENT_QUOTES);
    return "'" . mysql_real_escape_string(stripslashes($formdata)) . "'";
}
This function represents a basic misunderstanding of string processing, one common to PHP programmers.
SQL-escaping, HTML-escaping and input validation are three separate functions, to be used at different stages of your script. It makes no sense to try to do them all in one go; it will only result in characters that are ‘special’ to any one of the processes getting mangled when used in the other parts of the script. You can try to tinker with this function to try to fix mangling in one part of the app, but you'll break something else.
Why are images being mangled? Well, it's not immediately clear via what path image data is going from a $_FILES temporary upload file to the database. If this function is involved at any point though, it's going to completely ruin the binary content of an image file. Backslashes removed and HTML-escaped... no image could survive that.
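One plausible explanation for why adding 'UTF-8' made things worse, offered as a guess rather than something the question confirms: when htmlentities() is given an explicit charset and the input is not valid in that charset, it returns an empty string unless ENT_SUBSTITUTE or ENT_IGNORE is set. Binary image data is almost never valid UTF-8. The sketch below uses the first bytes of a PNG file plus two extra characters as stand-in data.

```php
<?php
// PNG signature bytes, followed by '<' and a backslash for illustration.
$bytes = "\x89PNG\r\n\x1a\n<\\";

// Explicit UTF-8 and no ENT_SUBSTITUTE/ENT_IGNORE: the invalid byte 0x89
// makes htmlentities() return an empty string, so nothing is left to store.
$withUtf8 = htmlentities($bytes, ENT_QUOTES, 'UTF-8');

// With a single-byte charset every byte is accepted, but '<' is rewritten
// to an entity and stripslashes() removes the backslash, so the result
// still differs from the original bytes.
$withLatin1 = stripslashes(htmlentities($bytes, ENT_QUOTES, 'ISO-8859-1'));
```

Either way the image does not come out as it went in, which is the point made above: binary data should never pass through this function.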
mysql_real_escape_string is for escaping some text for inclusion in a MySQL string literal. It should be used always-and-only when making an SQL string literal with inserted text, and not globally applied to input. Because some things that come in in the input aren't going immediately or solely to the database. For example, if you echo one of the input values to the HTML page, you'll find you get a bunch of unwanted backslashes in it when it contains characters like '. This is how you end up with pages full of runaway backslashes.
(Even then, parameterised queries are generally preferable to manual string hacking and mysql_real_escape_string. They hide the details of string escaping from you so you don't get confused by them.)
htmlentities is for escaping text for inclusion in an HTML page. It should be used always-and-only in the output templating bit of your PHP. It is inappropriate to run it globally over all your input because not everything is going to end up in an HTML page or solely in an HTML page, and most probably it's going to go to the database first where you absolutely don't want a load of &lt; and &amp; rubbish making your text fail to search or substring reliably.
(Even then, htmlspecialchars is generally preferable to htmlentities as it only encodes the characters that really need it. htmlentities will add needless escaping, and unless you tell it the right encoding it'll also totally mess up all your non-ASCII characters. htmlentities should almost never be used.)
As for stripslashes... well, you sometimes need to apply that to input, but only when the idiotic magic_quotes_gpc option is turned on. You certainly shouldn't apply it all the time, only when you detect magic_quotes_gpc is on. It is long deprecated and thankfully dying out, so it's probably just as good to bomb out with an error message if you detect it being turned on. Then you could chuck the whole processInput thing away.
To summarise:
At start time, do no global input processing. You can do application-specific validation here if you want, like checking a phone number is just numbers, or removing control characters from text or something, but there should be no escaping happening here.
When making an SQL query with a string literal in it, use SQL-escaping on the value as it goes into the string: $query= "SELECT * FROM t WHERE name='".mysql_real_escape_string($name)."'";. You can define a function with a shorter name to do the escaping to save some typing. Or, more readably, parameterisation.
When making HTML output with strings from the input or the database or elsewhere, use HTML-escaping, eg.: <p>Hello, <?php echo htmlspecialchars($name); ?>!</p>. Again, you can define a function with a short name to do echo htmlspecialchars to save on typing.
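The short-named helper mentioned in the last point might look something like this. The name `h` and the sample value are my own choices, and this version returns the escaped string rather than echoing it, which is a small departure from the description above.

```php
<?php
// Short alias for HTML-escaping at output time. ENT_QUOTES also escapes
// single quotes, which matters inside single-quoted attribute values.
function h(string $text): string
{
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// Raw value, exactly as it would be stored in the database.
$name = "O'Reilly <b>";

// Escaped only at the moment it is placed into HTML.
$html = '<p>Hello, ' . h($name) . '!</p>';
```

The stored value stays untouched, so it can still be searched, compared and reused in non-HTML contexts.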