The title sounds confusing, but I don't know how to describe it better.
I've been given the task to do a database migration from program A to program B.
Program A uses a MSSQL database and stores all its files in the database.
Program B stores files the "normal" way, meaning in the file system.
Now I have to extract the database-stored files and write them out as file system files with PHP, but my attempts to convert them have been unsuccessful.
For testing purposes, I created a simple .txt file with the content Test document for migration and program A stores it like this in the database:
0x5465737420646F63756D656E7420666F72206D6967726174696F6E
What format is that, and how do I convert it to a normal document.txt file?
Edit
Many thanks to @PanagiotisKanavos. This now works with a stream:
$query = "select top(1) DESCRIPTION, FILETYPE, DOCUMENT from dbo.Documents;";
$stmt = sqlsrv_query($this->sqlsrv_conn, $query);
if (sqlsrv_fetch($stmt)) {
    $document = sqlsrv_get_field($stmt, 2, SQLSRV_PHPTYPE_STREAM(SQLSRV_ENC_BINARY));
    $fileName = sqlsrv_get_field($stmt, 0, SQLSRV_PHPTYPE_STRING(SQLSRV_ENC_CHAR));
    $ext = sqlsrv_get_field($stmt, 1, SQLSRV_PHPTYPE_STRING(SQLSRV_ENC_CHAR));
    file_put_contents(
        $fileName . '.' . $ext,
        stream_get_contents($document)
    );
}
Now what is the most efficient way to do this with ALL the files? Do I have to execute a query for each and every row?
With PDO I could use $stmt->fetchAll(PDO::FETCH_ASSOC), which gave me a nice array of associative arrays with the data inside.
sqlsrv has a similar function, sqlsrv_fetch_array, which is explained on php.net and docs.microsoft.com with the following example:
while( $row = sqlsrv_fetch_array( $stmt, SQLSRV_FETCH_ASSOC) ) {
    echo $row['LastName'].", ".$row['FirstName']."<br />";
}
But search as I might, I couldn't find a way to loop over the result set while fetching each row's fields individually with stream and string types mixed. sqlsrv_fetch_array accepts only SQLSRV_FETCH_NUMERIC, SQLSRV_FETCH_ASSOC and SQLSRV_FETCH_BOTH, and by the time it returns, the row is already fetched, so I can't use sqlsrv_get_field to set the type of each field.
I can't be the first person ever to need something like this, but I'm unable to find anything about it. Probably I'm searching for the wrong terms, or I've misunderstood a concept.
I found the solution!
I tested multiple online tools to decode the string and find out what the encoding was.
I even tried hex2bin(), but all the tools told me that it wasn't a valid hex string.
Until I stumbled upon this godsend of a tool, which translated the invalid hex to a ? but translated the rest of it, resulting in:
?Test document for migration
From then on it was easy to deduce that the 0x was the troublemaker. After removing it, the conversion works like a charm and I could "convert" even more complex files like .doc. Here is the code:
file_put_contents(
    // 'DESCRIPTION' is the file name
    'files/' . $dbDocument['DESCRIPTION'] .
    // 'FILETYPE' is the extension
    '.' . mb_strtolower($dbDocument['FILETYPE']),
    // 'DOCUMENT' is the document content, hex-encoded with a leading '0x'
    hex2bin(str_replace('0x', '', $dbDocument['DOCUMENT']))
);
This isn't a format; it's how some database administration tools, e.g. SSMS, display binary data. The data is already binary and doesn't need to be converted.
Reading Large Objects (LOBs) as if they were numbers or strings is really slow, though: the entire document gets cached in both server and client memory, even though it won't be reused and doesn't need to be held in memory at all. After all, a BLOB in SQL Server can be 2 GB or more. That's why almost all databases and data access libraries allow handling LOBs as file streams.
Microsoft's PHP doc examples show how to read LOBs as file streams both for PDO and SQLSRV.
Copying from the example, this parameterized query will search for a user's picture:
/* Get the product picture for a given product ID. */
$tsql = "SELECT LargePhoto
FROM Production.ProductPhoto AS p
JOIN Production.ProductProductPhoto AS q
ON p.ProductPhotoID = q.ProductPhotoID
WHERE ProductID = ?";
$params = array(&$_REQUEST['productId']);
/* Execute the query. */
$stmt = sqlsrv_query($conn, $tsql, $params);
Instead of reading the entire picture as a single value, though, it's loaded as a file stream:
$getAsType = SQLSRV_PHPTYPE_STREAM(SQLSRV_ENC_BINARY);
if ( sqlsrv_fetch( $stmt ) )
{
    $image = sqlsrv_get_field( $stmt, 0, $getAsType);
    fpassthru($image);
}
else
{
    echo "Error in retrieving data.<br />";
    die(print_r( sqlsrv_errors(), true));
}
$getAsType = SQLSRV_PHPTYPE_STREAM(SQLSRV_ENC_BINARY); specifies that the data will be retrieved as a stream.
$image = sqlsrv_get_field( $stmt, 0, $getAsType); retrieves the first field using the specified type, in this case a stream. This doesn't load the actual contents.
fpassthru copies the stream contents directly to the output. The picture may be 2GB but it will never be held in the web server's memory.
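To answer the follow-up question directly: no, one query per row isn't needed. A single result set can be walked with sqlsrv_fetch(), retrieving each row's fields one by one with sqlsrv_get_field() and mixing string and stream types. One caveat from the SQLSRV documentation: fields have to be read in ascending index order, so put the stream column last in the SELECT. A sketch, assuming the dbo.Documents schema from the question and an open sqlsrv connection $conn:

```php
<?php
// Export every document in one query, streaming each BLOB straight to disk.
// Assumes the dbo.Documents schema from the question; $conn is an open
// sqlsrv connection.
$stmt = sqlsrv_query($conn, "SELECT DESCRIPTION, FILETYPE, DOCUMENT FROM dbo.Documents;");

while (sqlsrv_fetch($stmt)) {
    // Fields must be read in ascending column order, which is why the
    // stream column is last in the SELECT and retrieved last here.
    $fileName = sqlsrv_get_field($stmt, 0, SQLSRV_PHPTYPE_STRING(SQLSRV_ENC_CHAR));
    $ext      = sqlsrv_get_field($stmt, 1, SQLSRV_PHPTYPE_STRING(SQLSRV_ENC_CHAR));
    $document = sqlsrv_get_field($stmt, 2, SQLSRV_PHPTYPE_STREAM(SQLSRV_ENC_BINARY));

    // stream_copy_to_stream copies chunk by chunk, so the whole BLOB is
    // never held in PHP memory as one big string.
    $out = fopen('files/' . $fileName . '.' . mb_strtolower($ext), 'wb');
    stream_copy_to_stream($document, $out);
    fclose($out);
}
sqlsrv_free_stmt($stmt);
```

This keeps the memory profile flat no matter how large the individual documents are.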
Related
So I have a LOT (millions) of records which I am trying to process. I've tried MongoDB and Neo4j and both simply grind my dual core ubuntu box to a halt.
I am wondering (though I suspect there isn't) whether there is any way to store PHP arrays in a file but load only one array into memory. For example:
<?php
$loaded = array('hello','world');
$ignore_me = array('please','ignore');
$ignore_me2 = array('please','ignore','again');
?>
So effectively I could use the $loaded array, but the others wouldn't be loaded into memory (even though they're in the same file)? I know about fopen/fread, but those treat the file as one general block of text.
If (as I suspect) the answer is no: how does something like a NoSQL database avoid a) creating a file per record and b) loading everything into memory? I know Neo4j uses Java, but PHP should be able to match that!
Did you consider relational databases such as MySQL, PostgreSQL, or MS SQL Server?
I see that you tried MongoDB, a document-oriented database, and Neo4j, a graph database.
I know that NoSQL is a great trend, but I tried NoSQL with my collections of millions of records and it performed so badly that I switched back to relational SQL.
If you still insist on going with NoSQL, try Redis and Memcached; they are in-memory databases.
You could use PHP Streams to read/write to a file.
Read file and convert to array
$content = file_get_contents('/tmp/file.csv'); // file.csv contains a,b,c
$csv = explode(',', $content); // $csv is now array('a', 'b', 'c')
Write to file
$line = array("a", "b", "c");
// create stream
$file = fopen("/tmp/file.csv","w");
// add line as csv
fputcsv($file, $line);
// close stream
fclose($file);
You can also loop to append lines to a CSV, and loop to retrieve them.
https://secure.php.net/manual/en/function.fputcsv.php
You can retrieve lines one at a time with fgetcsv, which advances a file pointer to the next line on each call
https://secure.php.net/manual/en/function.fgetcsv.php
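Combining the two, here is a minimal round trip that never holds more than one record in memory at a time (the file path is illustrative):

```php
<?php
// Round trip: write a few records with fputcsv, then read them back one
// line at a time with fgetcsv -- only the current line is held in memory.
$path = sys_get_temp_dir() . '/records.csv';

$file = fopen($path, 'w');
fputcsv($file, array('a', 'b', 'c'));
fputcsv($file, array('d', 'e', 'f'));
fclose($file);

$rows = array();
$file = fopen($path, 'r');
while (($line = fgetcsv($file)) !== false) {
    $rows[] = $line; // $line is a plain array, e.g. array('a', 'b', 'c')
}
fclose($file);
unlink($path);
```

In a real million-record scenario you would process each $line inside the loop instead of collecting everything into $rows, which is exactly what keeps the memory footprint constant.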
When I read the EXIF data from a raw file with exif_read_data() a lot of the data gets corrupted. Or so I think.
The file I'm trying to read is a DNG Raw file from a Pentax K-x camera.
Here is a demo: http://server.patrikelfstrom.se/exif/?file=_IGP6211.DNG
(I've added a standard JPEG from a Canon EOS 1000D as comparison)
I get no errors on this site and it seems to include data that exif_read_data() doesn't return.
http://regex.info/exif.cgi
And the corrupt data I'm talking about is: ...”¯/ѳf/ÇZ/íÔ.ƒ.9:./<ñ.TÛ¨.zâh!o†!™˜...
And: UndefinedTag:0xC65A
The server is running PHP version 5.5.3
Just because the data isn't human readable doesn't mean it's garbage.
Those values that you're seeing are private EXIF fields which are left up to the implementer to determine. They could be binary data, they could be text, they could be anything. This listing can help you determine what some of those values are.
For example, tag 0xC634 is DNGPrivateData which is data specifically for programs that deal with DNG files.
You can map the undefined tags to what they most likely are using this file:
https://github.com/peterhudec/image-metadata-cruncher/blob/master/includes/exif-mapping.php
It looks like your script is dying on 0xc634 => 'SR2Private'
Looking here http://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/Pentax.html, it looks like it is used to store information about the flash on the camera. I don't know for sure, but it is probably not important information, and probably not meant to be viewed as text.
I would probably just make a list of the keys it seems to die on, loop through the EXIF data, check whether a key starts with UndefinedTag:, and either rename the key to the mapped one or unset those items:
$bad_keys = array('0xc634', ..., '0xc723');
foreach ( $exif as $key => $value ) {
    if ( strtolower( substr( $key, 0, 13 ) ) == 'undefinedtag:' ) {
        // use the file with the map of undefined tags:
        // either change the key, or unset it if it's one
        // that seems to be corrupt
    }
}
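A filled-in sketch of that loop, assuming a $map array in the shape of the linked exif-mapping.php file (the map entries and sample EXIF values here are illustrative):

```php
<?php
// Hypothetical excerpt of a tag map in the shape of exif-mapping.php
// (lowercased tag => readable name) -- substitute the real map.
$map = array(
    '0xc634' => 'DNGPrivateData',
    '0xc65a' => 'CalibrationIlluminant1',
);

// Sample result as exif_read_data() might return it (illustrative values).
$exif = array(
    'Model'               => 'PENTAX K-x',
    'UndefinedTag:0xC634' => "\x01\x02 binary junk",
    'UndefinedTag:0xC65A' => 17,
);

foreach ($exif as $key => $value) {
    if (strtolower(substr($key, 0, 13)) == 'undefinedtag:') {
        $tag = strtolower(substr($key, 13)); // e.g. '0xc634'
        unset($exif[$key]);
        if (isset($map[$tag])) {
            $exif[$map[$tag]] = $value; // known tag: keep it under its readable name
        }
        // unmapped tags stay removed
    }
}
```

Whether you keep the remapped values or drop them entirely depends on whether your application actually needs the private maker-note data.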
I'm creating a C# to PHP Data Connector to allow for a standardized connection to a web server to host data from a database to a C# WinForm application. Everything is working with this one small exception.
The basic flow is this.
C# sends an AES encrypted command to the server. The server parses the command and performs the SQL query and returns an AES encrypted string. This string is then converted to a DataTable in C#.
When the SQL contains a column that is a BLOB I'm only getting back a small part of the full data. It seems that the field is being limited to only the first 2792 bytes.
Is there a setting that is preventing the full contents of the BLOB to be returned?
I'm not sure if it will be helpful, but here is the code that does the work.
$DataConnection = new PDO('mysql:host=10.10.100.102;dbname=jmadata', "root", "nbtis01");
$DataConnection->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

if (isset($Parameters['SQLQuery'])) { // Default List
    $SQLQuery = $Parameters['SQLQuery'];
    unset($Parameters['SQLQuery']);
}

if (isset($Parameters['LimitOverride'])) {
    if (strpos(strtoupper($SQLQuery), "LIMIT") === false)
        $SQLQuery = rtrim($SQLQuery, ';') . " LIMIT " . $Parameters['LimitOverride'];
    unset($Parameters['LimitOverride']);
}

$QueryParams = array();
foreach ($Parameters as $key => $value)
    if ($key !== '')
        $QueryParams[$key] = $value;

$Query = $DataConnection->prepare($SQLQuery);
$Query->execute($QueryParams);
$ReturnArray = $Query->fetchAll(PDO::FETCH_ASSOC);

if (!$ReturnArray)
    $ReturnArray[0] = array("NoResults" => "");
EDIT -- ANSWER
I found my issue. The problem had nothing to do with PDO, PHP or MySQL. I was Base64-encoding the BLOB data before putting it in the array, because the split characters I used to build the result string (which gets converted to a DataTable in C#) were non-printable characters, and the binary data, treated as a string, might have contained those characters. The issue was in the C# conversion back to the original string before converting it to a byte array: I was using System.Text.Encoding.ASCII.GetString to convert the Base64 byte array to the original string. This worked on everything except the binary data from the BLOB fields.
The suggestion that it might be a terminating character is what made me find it. When the Base64 was converted to a string using ASCII, something turned into a terminator and stopped the conversion at that point. Once I found this, I changed to System.Text.Encoding.Default.GetString and now it works perfectly.
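For what it's worth, the PHP/Base64 half of such a pipeline is binary-safe on its own; a quick sanity check (in PHP, since that is the half shown here) that arbitrary bytes survive the round trip:

```php
<?php
// Base64 survives arbitrary bytes -- NULs and control characters included --
// so the transport encoding was never the problem; only the decoding step was.
$blob = "\x00\x01\x02BLOB data\xFF\xFE";
$encoded = base64_encode($blob);   // printable, transport-safe string
$decoded = base64_decode($encoded);
var_dump($decoded === $blob);      // bool(true): the round trip is lossless
```

Any corruption therefore has to happen after decoding, which is exactly where the ASCII conversion bug sat.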
Posted the answer in case anyone else might be trying to do this and having this same issue.
More details in the Edit of the question.
Changed from System.Text.Encoding.ASCII.GetString to System.Text.Encoding.Default.GetString and the issue was resolved.
Thank you crush for pointing me in the right direction to find the solution.
I was reading a few tutorials on how to upload my images into the DB as binary, as opposed to putting them on the server itself. Well, I got it to work like this:
PHP:
$image = chunk_split(base64_encode(file_get_contents($tmpfile)));
mysql_query("INSERT INTO images (`img_location`, `caption`, `user`, `genre`, `when`) VALUES ('$image', '$caption', '$id', '$genre', '$when')");
My issue is how to now pull it from the database. I've read several ways of doing it and tried them all, but I can't figure it out. I'm not getting a MySQL error. Here's how I'm trying it:
$get_pics = mysql_query("SELECT * FROM images WHERE user='$id' ");
while ($get_pics2 = mysql_fetch_array($get_pics))
{
    $sixfour_enc = base64_decode($get_pics2['img_location']);
    $new .= "<img src=\"" . $sixfour_enc . "\" >";
}
This works... kind of. What's happening is that it's printing out raw binary in the img tag.
How do I get this to render as a readable image again? Also, is storing the images in the database stupid? Should I just do what I usually do and store them on the server?
Thank you
-mike
You can store images in your database if you want to (though there's nothing wrong with just storing them as files either; choose whatever is appropriate in your situation), but store the raw binary data in a BLOB, i.e. don't encode it with Base64. You can embed the binary data you get from file_get_contents in your query directly, provided you run it through the proper escape function (mysql_real_escape_string in your case) first.
As for the outputting of the image, you can do it the way you're doing it right now, but you'll have to output it base64-encoded and with a data URI scheme like this:
echo '<img alt="embedded image" src="data:image/png;base64,' . chunk_split(base64_encode($get_pics2['img_location'])) . '">';
Note that there are some advantages and disadvantages of embedded image data. Some important disadvantages to be aware of are the severe overhead of base64 encoding (around 33% larger than original) and potential caching problems.
If you don't have any particular reason to store image data in your database then I strongly advise that you just do it the normal way.
This is because your database engine will take a significant performance hit: you will be reading and writing more data than necessary.
Furthermore, MySQL tables with BLOB columns are significantly larger and perform more slowly in most scenarios.
Just imagining the overhead on your server once you implement this gives me the chills. ;)
I guess it all boils down to scalability. Once your database fills up, your server becomes less responsive and will eventually become a real hog. Your next options would be to increase server RAM or revise your code to accommodate the bigger load.
Just my 2 cents, hope this helps!
Good luck!
Actually, the best approach is to have a PHP script output the binary of the image with proper headers (say, getimage.php); the script used to generate the HTML will then have code like this:
$get_pics = mysql_query("SELECT id, caption FROM images WHERE user='$userid' ");
while ($imglist = mysql_fetch_array($get_pics))
{
    $new .= '<img src="getimage.php?id='.$imglist['id'].'" title="'.$imglist['caption'].'" />';
}
So the HTML will look like:
<img src="getimage.php?id=12345" title="the caption of" />
You must have a primary key on the images table (why not already?), let's say id.
The getimage.php script shall extract the binary and output the content (assuming that the field img_location contains the actual content of the image):
<?php
// this will output the RAW content of an image with proper headers
$id = (int)$_GET['id'];
$get_img = @mysql_fetch_assoc(@mysql_query("SELECT img_location FROM images WHERE id='$id' "));
$image = imagecreatefromstring(base64_decode($get_img['img_location']));
if ($image) {
    // you may apply here any transformations on the $image resource
    header("Content-Type: image/jpeg"); // OR gif, png, whatever
    imagejpeg($image, null, 75);        // OR imagegif, imagepng, whatever, see PHP doc.
    imagedestroy($image);               // dump the resource from memory
}
@mysql_close();
exit();
?>
This approach will give you flexibility.
However, you may need to check and sanitize the variables, and choose another method of connecting to MySQL, such as MySQLi or PDO.
Another good idea is the use of prepared statements to prevent SQL injection.
I was trying to output UTF-8 text read from SQLite database to a text file using fwrite function, but with no luck at all.
When I echo the content to the browser, I can read it with no problem. As a last resort, I created the same tables in a MySQL database, and surprisingly it worked!
What could be the cause, how can I debug this so that I can use SQLite DB?
I am using PDO.
Below is the code I am using to read from DB and write to file:
$arFile = realpath(APP_PATH.'output/Arabic.txt');
$arfh = fopen($arFile, 'w');
$arTxt = '';
$key = 'somekey';
$sql = 'SELECT ot.langv AS orgv, et.langv AS engv, at.langv AS arbv FROM original ot LEFT JOIN en_vals et ON ot.langk=et.langk
LEFT JOIN ar_vals at ON ot.langk=at.langk
WHERE ot.langk=:key';
$stm = $dbh->prepare($sql);
$stm->execute(array(':key'=>$key));
if( $row = $stm->fetch(PDO::FETCH_ASSOC) ){
    $arTxt .= '$_LANG["'.$key.'"] = "'.special_escape($row['arbv']).'";'."\n";
}
fwrite( $arfh, $arTxt);
fclose($arfh);
What could be the cause, how can I debug this so that I can use SQLite DB?
SQLite stores text into the database as it receives it. So if you store UTF-8 encoded text into the SQLite database, you can read UTF-8 text from it.
If you store, let's say, LATIN-1 text into the database, then you can read LATIN-1 text from it.
SQLite itself does not care. So you get out what you put in.
As you write in your question that the display in the browser looks good, I would assume that at least some validly encoded values have been stored in the database. When you view that data in the browser, check which encoding the browser reports for it.
If it says UTF-8, then fine. You might just be viewing the text file with an editor that does not support UTF-8. fwrite doesn't care about the encoding either; it just writes the string's bytes to the file, and that's it.
So as long as you don't provide additional information with your question, it's hard to say anything more specific.
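One way to rule out SQLite and fwrite is a self-contained round trip (in-memory database, assuming the pdo_sqlite extension; the table name and sample text are illustrative):

```php
<?php
// UTF-8 round trip: text into SQLite, back out, then into a file via fwrite.
// If the resulting file looks wrong in an editor, suspect the editor's encoding.
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->exec('CREATE TABLE ar_vals (langk TEXT, langv TEXT)');

$ins = $dbh->prepare('INSERT INTO ar_vals VALUES (:k, :v)');
$ins->execute(array(':k' => 'somekey', ':v' => 'نص عربي')); // Arabic sample text

$stm = $dbh->prepare('SELECT langv FROM ar_vals WHERE langk = :k');
$stm->execute(array(':k' => 'somekey'));
$row = $stm->fetch(PDO::FETCH_ASSOC);

$path = sys_get_temp_dir() . '/Arabic.txt';
$fh = fopen($path, 'w');
fwrite($fh, $row['langv']); // fwrite copies the bytes verbatim, no re-encoding
fclose($fh);
```

If the file produced here opens correctly in a UTF-8-aware editor, then SQLite, PDO and fwrite are all behaving, and the problem lies in how the original data was inserted or how the output file is being viewed.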
See as well: How to change character encoding of a PDO/SQLite connection in PHP?