How do I get the complete string of a BLOB using PDO?

I'm creating a C# to PHP Data Connector to allow for a standardized connection to a web server to host data from a database to a C# WinForm application. Everything is working with this one small exception.
The basic usage is this:
C# sends an AES encrypted command to the server. The server parses the command and performs the SQL query and returns an AES encrypted string. This string is then converted to a DataTable in C#.
When the SQL contains a column that is a BLOB I'm only getting back a small part of the full data. It seems that the field is being limited to only the first 2792 bytes.
Is there a setting that is preventing the full contents of the BLOB from being returned?
I'm not sure if it will be helpful, but here is the code that does the work.
$DataConnection = new PDO('mysql:host=10.10.100.102;dbname=jmadata', "root", "nbtis01");
$DataConnection->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
if (isset($Parameters['SQLQuery'])) { // Default List
$SQLQuery = $Parameters['SQLQuery'];
unset($Parameters['SQLQuery']);
}
if (isset($Parameters['LimitOverride'])) {
if (strpos(strtoupper($SQLQuery), "LIMIT") === false) // strict check: strpos() returns 0, not false, for a match at offset 0
$SQLQuery = rtrim($SQLQuery, ';') . " LIMIT " . $Parameters['LimitOverride'];
unset($Parameters['LimitOverride']);
}
$QueryParams = array();
foreach ($Parameters as $key => $value)
if ($key !== '')
$QueryParams[$key] = $value;
$Query = $DataConnection->prepare($SQLQuery);
$Query->execute($QueryParams);
$ReturnArray = $Query->fetchAll(PDO::FETCH_ASSOC);
if (!$ReturnArray)
$ReturnArray[0] = array("NoResults" => "");
EDIT -- ANSWER
I found my issue. The problem had nothing to do with PDO, PHP, or MySQL. I was Base64-encoding the BLOB data before putting it in the array, because the split characters I use to build the result string (which is converted to a DataTable in C#) are non-printable characters, and the raw binary data might have contained them. The issue was in the C# conversion back to the original string, which I then convert to a byte array. I was using System.Text.Encoding.ASCII.GetString to convert the decoded Base64 byte array to the original string. This worked for everything except the binary data from the BLOB fields.
The suggestion that it might be a terminating character is what led me to it. When the decoded Base64 bytes were converted to a string using ASCII, something was turning into a terminator and the conversion stopped at that point. Once I found this, I changed to System.Text.Encoding.Default.GetString, and now it works perfectly.
Posted the answer in case anyone else might be trying to do this and having this same issue.
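For the server side of the approach described in the edit above, here is a minimal PHP sketch of framing rows with non-printable separators while keeping binary columns safe. The separator bytes and the function names encodeRows/decodeRows are assumptions for illustration; the question does not say which characters or helpers the real connector uses.

```php
<?php
// Hypothetical separators: the question only says they are non-printable.
const FIELD_SEP = "\x1F";
const ROW_SEP = "\x1E";

// Base64 output uses only A-Z, a-z, 0-9, "+", "/" and "=",
// so an encoded field can never contain a separator byte.
function encodeRows(array $rows): string {
    $lines = [];
    foreach ($rows as $row) {
        $lines[] = implode(FIELD_SEP, array_map('base64_encode', $row));
    }
    return implode(ROW_SEP, $lines);
}

// Reverse of encodeRows(); column names are not carried in this sketch,
// so each decoded row is a plain indexed array.
function decodeRows(string $payload): array {
    $rows = [];
    foreach (explode(ROW_SEP, $payload) as $line) {
        $rows[] = array_map(
            function ($field) { return base64_decode($field, true); },
            explode(FIELD_SEP, $line)
        );
    }
    return $rows;
}

// A "BLOB" that deliberately contains both separators and a NUL byte.
$blob = random_bytes(4096) . FIELD_SEP . ROW_SEP . "\x00";
$decoded = decodeRows(encodeRows([['id' => '1', 'data' => $blob]]));
var_dump($decoded[0][1] === $blob); // bool(true)
```

The receiving side then has to treat the decoded field as bytes rather than text, which is where the ASCII conversion in the C# code went wrong.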

More details in the Edit of the question.
Changed from System.Text.Encoding.ASCII.GetString to System.Text.Encoding.Default.GetString and the issue was resolved.
Thank you crush for pointing me in the right direction to find the solution.

Related

UTF-8, binary data and special characters issue while reading CSV file in laravel

I am using the League/CSV package in Laravel to read and manipulate a CSV file and save its data into a database, but I am facing issues with some rows that contain special characters such as "45.6 ºF".
I have searched a lot about this problem and found that we should use "UTF-8" or "utf8mb4" as the database collation and also save the CSV as "utf8", but that only works for special characters that are on the keyboard.
I want to support all types of special characters, including ones like "45.6 ºF" that are not on the keyboard.
Currently, my code reads the CSV column data and, only for strings that contain special characters, turns the value into binary data: b"column value" (a "b" is added in front of the string).
I have spent a lot of time on this but could not find a good solution. Please help; I would be very thankful.
$reader = Reader::createFromPath(public_path().'/question.csv', 'r');
$reader->setHeaderOffset(0);
$records = $reader->getRecords();
foreach ($records as $offset => $record) {
$qs = Question::first();
$qs->question = $record['Question'];
$qs->save();
}
It is giving me this result after reading from CSV with "b".
array:2 [▼
"ID" => "1"
"Question" => b"Fahrenheit to Celsius (ºF to ºC) conversion calculator for temperature conversions with additional tables and formulas"
]
but it should be in the string format without "b" binary.
If I copy that string with special characters and assign it to the static variable, then it works fine and saves data into a database like this
$a="Fahrenheit to Celsius (ºF to ºC) conversion calculator for temperature conversions with additional tables and formulas";
$qs = Question::first();
$qs->question = $a;
$qs->save();
After a lot of struggle, I have found the solution to this problem.
I just added this line to run each value through utf8_encode before saving it in the database.
$r = array_map("utf8_encode", $record);
Don't just copy and paste text from Google into the database, because pasted text and special characters don't work most of the time.
Thanks.

I have found a solution to this problem. The line of code below fixed my issue: $r = array_map("utf8_encode", $record); We just need to run the values through utf8_encode before saving them into the database.

Do not use any conversion routines; it only leads to "two wrongs accidentally making a right".
With the existence of MySQL's LOAD DATA INFILE, do you even need fgetcsv? Simply execute the LOAD SQL command with the suitable character set specified in the command. The value for that should match the encoding of the csv file. If in doubt, try to get the hex of º from the file:
hex BA --> character set latin1
hex C2BA --> character set utf8 (or utf8mb4)
The column in the database table can be latin1 or utf8 or utf8mb4. The conversion, if needed, will happen during the LOAD.
The º character (strictly the masculine ordinal indicator; the degree sign ° is hex B0 in latin1 and C2B0 in utf8) is one of the few special characters that exist in both charsets, so if you have others, latin1 may not be a viable option. (utf8/utf8mb4 has lots more special characters.)
The current use of b"..." may be making things worse by shoehorning C2BA into a latin1 column, leading to Mojibake: Âº instead of º.
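To "get the hex of º from the file" as suggested above, something like the following PHP sketch could be used. The helper name guessCharset is hypothetical, and the check is only a heuristic: every byte string is valid latin1, so "latin1" here really means "not valid UTF-8", and pure ASCII input is reported as utf8.

```php
<?php
// Guess whether $bytes are UTF-8 or a single-byte charset such as latin1.
// preg_match() with the /u modifier returns false when the subject is not valid UTF-8.
function guessCharset(string $bytes): string {
    return preg_match('//u', $bytes) === 1 ? 'utf8' : 'latin1';
}

// The same text as it would appear in a latin1 file and in a UTF-8 file.
$latin1 = "45.6 \xBAF";
$utf8   = "45.6 \xC2\xBAF";

// Dump the raw bytes next to the guess; look for "ba" versus "c2ba".
echo bin2hex($latin1), ' => ', guessCharset($latin1), PHP_EOL;
echo bin2hex($utf8), ' => ', guessCharset($utf8), PHP_EOL;
```

The result tells you which character set to name in the LOAD DATA statement; it does not convert anything itself.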

How to store and retrieve extended ASCII characters in MSSQL

I was surprised that I was unable to find a straightforward answer to this question by searching.
I have a web application in PHP that takes user input. Due to the nature of the application, users may often use extended ASCII characters (a.k.a. "ALT codes").
My specific issue at the moment is with ALT code 26, which is a right arrow (→). This will be accompanied with other text to be stored in the same field (for example, 'this→that').
My column type is NVARCHAR.
Here's what I've tried:
I've tried doing no conversions and just inserting the value as normal, but the value gets stored as thisâ??that.
I've tried converting the value to UCS-2 in PHP using iconv('UTF-8', 'UCS-2', $value), but I get an error saying Unclosed quotation mark after the character string 't'.. The query ends up looking like this: UPDATE myTable SET myColumn = 'this�!that'.
I've tried doing the above conversion and then adding an N before the quoted value, but I get the same error message. The query looks like this: UPDATE myTable SET myColumn = N'this�!that'.
I've tried removing the UCS-2 conversion and just adding the N before the quoted value, and the query works again, but the value is stored as thisâ that.
I've tried using utf8_decode($value) in PHP, but then the arrow is just replaced with a question mark.
So can anyone answer the (seemingly simple) question of, how can I store this value in my database and then retrieve it as it was originally typed?
I'm using PHP 5.5 and MSSQL 2012. If any question of driver/OS version comes into play, it's a Linux server connecting via FreeTDS. There is no possibility of changing this.

You might try base64 encoding the input; this is fairly trivial to handle with PHP's base64_encode() and base64_decode(), and it should handle whatever your users throw at it.
(edit: You can apparently also do the base64 encoding on the SQL Server side. This doesn't seem like something it should be responsible for imho, but it's an option.)

It seems like your freetds.conf is wrong. You need a TDS protocol version >= 7.0 to support unicode. See this for more details.
Edit your freetds.conf:
[global]
# TDS protocol version
tds version = 7.4
client charset = UTF-8
Also make sure to configure PHP correctly:
ini_set('mssql.charset', 'UTF-8');

The accepted answer seems to do the job: yes, you can encode the text to base64 and decode it back again, but then every application that uses that remote database has to change to support base64-encoded fields. My thought is that a remote MS SQL Server database may well be used by another application (or applications), which would then also have to be changed, and you would have to handle both plain text and base64-converted text.
I searched a little and found how to send UNICODE text to MS SQL Server using MS SQL commands, with PHP converting the UNICODE bytes to HEX numbers.
If you look at the PHP documentation for mssql_fetch_array (http://php.net/manual/ru/function.mssql-fetch-array.php#80076), you'll see in the comments a pretty good solution that converts the text to UNICODE HEX values and then sends that HEX data directly to MS SQL Server like this:
Convert Unicode Text to HEX Data
// sending data to database
$utf8 = 'Δοκιμή με unicode → Test with Unicode'; // some Greek text for example
$ucs2 = iconv('UTF-8', 'UCS-2LE', $utf8);
// converting UCS-2 string into "binary" hexadecimal form
$arr = unpack('H*hex', $ucs2);
$hex = "0x{$arr['hex']}";
// IMPORTANT!
// please note that value must be passed without apostrophes
// it should be "... values(0x0123456789ABCEF) ...", not "... values('0x0123456789ABCEF') ..."
mssql_query("INSERT INTO mytable (myfield) VALUES ({$hex})", $link);
Now all the text actually is stored to the NVARCHAR database field correctly as UNICODE, and that's all you have to do in order to send and store it as plain text and not encoded.
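The conversion step can be checked without a database. The sketch below wraps it in two helpers whose names (utf8ToNvarcharLiteral, nvarcharBytesToUtf8) are made up for this example; it uses only iconv(), bin2hex() and hex2bin(), and assumes the iconv build supports UCS-2LE as the code above does.

```php
<?php
// Build the unquoted 0x... literal for an NVARCHAR value from UTF-8 text.
function utf8ToNvarcharLiteral(string $utf8): string {
    $ucs2 = iconv('UTF-8', 'UCS-2LE', $utf8);
    if ($ucs2 === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8 or cannot be represented in UCS-2');
    }
    return '0x' . bin2hex($ucs2);
}

// Convert UCS-2LE bytes (as returned by CONVERT(VARBINARY(n), column)) back to UTF-8.
function nvarcharBytesToUtf8(string $ucs2): string {
    return iconv('UCS-2LE', 'UTF-8', $ucs2);
}

// "this→that" from the question; \u{2192} is the right arrow.
$literal = utf8ToNvarcharLiteral("this\u{2192}that");
echo $literal, PHP_EOL; // 0x740068006900730092217400680061007400

// Round trip: strip the "0x" prefix, turn the hex back into bytes, decode.
$bytes = hex2bin(substr($literal, 2));
var_dump(nvarcharBytesToUtf8($bytes) === "this\u{2192}that"); // bool(true)
```

The arrow becomes the two bytes 0x92 0x21, and 0x21 is "!", which is presumably why the raw UCS-2 string showed up as 'this�!that' in the question and broke the quoting.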
To retrieve that text, you need to ask MS SQL Server to send back UNICODE encoded text like this:
Retrieving Unicode Text from MS SQL Server
// retrieving data from database
// IMPORTANT!
// please note that "varbinary" expects number of bytes
// in this example it must be 200 (bytes), while size of field is 100 (UCS-2 chars)
// myfield is of 50 length, so I set VARBINARY to 100
$result = mssql_query("SELECT CONVERT(VARBINARY(100), myfield) AS myfield FROM mytable", $link);
while (($row = mssql_fetch_array($result, MSSQL_BOTH)))
{
// we get data in UCS-2
// I use UTF-8 in my project, so I encode it back
echo '1. '.iconv('UCS-2LE', 'UTF-8', $row['myfield']).PHP_EOL;
// or you can even use mb_convert_encoding to convert from UCS-2LE to UTF-8
echo '2. '.mb_convert_encoding($row['myfield'], 'UTF-8', 'UCS-2LE').PHP_EOL;
}
The MS SQL Table with the UNICODE Data after the INSERT
The output result using a PHP page to display the values
I'm not sure if you can reach my test page here, but you can try to see the live results:
http://dbg.deve.wiznet.gr/php56/mssql/test1.php

Warning: gzdecode(): data error in php

I know this question has already been asked, but I can't solve my problem, so I have explained it here. Kindly help me solve it.
I am getting data from these example URLs by using file_get_contents()
$URL1 = 'abcd.com/xxx';
$URL2 = 'abcd.com/yyy';
$URL3 = 'abcd.com/zzz';
$response1 = file_get_contents($URL1);
$response2 = file_get_contents($URL2);
$response3 = file_get_contents($URL3);
And I compressed the response data using gzencode, because the data is too long, and added a prefix for my reference.
Then I save the compressed data to the DB.
$arrayResponse['URL1'] = '_|_coMpResSed_|_' . gzencode($response1);
$arrayResponse['URL2'] = '_|_coMpResSed_|_' . gzencode($response2);
$arrayResponse['URL3'] = '_|_coMpResSed_|_' . gzencode($response3);
DB details
Storage Engine : InnoDB
Collation : utf8mb4_unicode_ci
Type : longblob or longtext (both i tried)
And I decompress the data by using gzdecode
$temp1 = explode('_|_coMpResSed_|_', $arrayResponse['URL1']);
$temp2 = explode('_|_coMpResSed_|_', $arrayResponse['URL2']);
$temp3 = explode('_|_coMpResSed_|_', $arrayResponse['URL3']);
if (!empty($temp1[1]) && !empty($temp2[1]) && !empty($temp3[1])) {
$arrayResponse['URL1'] = gzdecode($temp1[1]);//working fine
$arrayResponse['URL2'] = gzdecode($temp2[1]);// getting warning
$arrayResponse['URL3'] = gzdecode($temp3[1]);//working fine
}
And I am getting `Warning: gzdecode(): data error` on the line `$arrayResponse['URL2'] = gzdecode($temp2[1]);`
The other lines are working fine. I don't know where I am making a mistake. Can anyone help me with this?

Having the same problem, I had a look at the MySQL docs:
https://dev.mysql.com/doc/refman/5.5/en/encryption-functions.html
Many encryption and compression functions return strings for which the result might contain arbitrary byte values. If you want to store these results, use a column with a VARBINARY or BLOB binary string data type. This will avoid potential problems with trailing space removal or character set conversion that would change data values, such as may occur if you use a nonbinary string data type (CHAR, VARCHAR, TEXT).
I just changed the column data type to VARBINARY and everything's OK.
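A small PHP sketch of why the column type matters: gzip output is arbitrary binary and is not valid UTF-8, so a utf8mb4 text column cannot be relied on to return it unchanged. The helper names wrapCompressed/unwrapCompressed and the constant name are invented for this example; only the prefix string comes from the question.

```php
<?php
const COMPRESSED_PREFIX = '_|_coMpResSed_|_';

// Compress and tag a value before storing it in a binary column.
function wrapCompressed(string $raw): string {
    return COMPRESSED_PREFIX . gzencode($raw);
}

// Undo wrapCompressed(); values without the prefix are returned as-is.
function unwrapCompressed(string $stored): string {
    if (strncmp($stored, COMPRESSED_PREFIX, strlen(COMPRESSED_PREFIX)) !== 0) {
        return $stored;
    }
    // substr() instead of explode(): the compressed bytes are never split.
    $raw = gzdecode(substr($stored, strlen(COMPRESSED_PREFIX)));
    if ($raw === false) {
        throw new RuntimeException('Stored value is not valid gzip data');
    }
    return $raw;
}

$payload = str_repeat('response body ', 1000);
$stored  = wrapCompressed($payload);

var_dump(unwrapCompressed($stored) === $payload); // bool(true)
// The gzip header starts with the bytes 1f 8b, and 8b alone is invalid UTF-8.
var_dump(preg_match('//u', $stored) === 1);       // bool(false)
```

If the column has to stay a text type, running the compressed bytes through base64_encode() before storing them is one way to keep them intact, at the cost of extra size.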

PostgreSQL Base64 Image decode issue

I am having an issue converting an image stored as base64 in a PostgreSQL database into an image to display on a website. The data type is bytea and I need to get the data via cURL.
I am working with an API to connect to a client's stock system which returns XML data.
I know storing images this way in a DB is not a great idea but that's how the client's system works and it can't be changed as it is a part of an enterprise solution provided by a 3rd Party.
I'm using the following to query the DB for the PICTURE field from the PICTURE table where the PART = 01000015
$ch = curl_init();
$server = 'xxxxxx';
$select = 'PICTURE';
$from = 'picture';
$where = 'part';
$answer = '01000015';
$myquery = "SELECT+".$select."+FROM+".$from.'+WHERE+'.$where."+=+'".$answer."'";
//Define curl options in an array
$options = array(CURLOPT_URL => "http://xx.xxx.xx.xx/GetSql?datasource=$server&query=$myquery+limit+1",
CURLOPT_PORT => "82",
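CURLOPT_HTTPHEADER => array("Content-Type: application/xml"), // CURLOPT_HEADER is a boolean that toggles header output; request headers go in CURLOPT_HTTPHEADER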
CURLOPT_RETURNTRANSFER => TRUE
);
//Set options against curl object
curl_setopt_array($ch, $options);
//Assign execution of curl object to a variable
$data = curl_exec($ch);
//Close curl object
curl_close($ch);
//Pass results to the SimpleXMLElement function
$xml = new SimpleXMLElement($data);
//Return String
echo $xml->row->picture;
The response I get from this is: System.Byte[]
Thus if I use base64_decode() in PHP I am obviously just decoding the string "System.Byte[]".
I am guessing that I need to use the DECODE() function in PostgreSQL to convert the data in the query? However, I've tried loads of combinations but I'm stuck. I've had a few downvotes for questions and I'm not too sure why so if this is a bad question I'm sorry, I just really need some help with this one.
(nb:I've replaced the IP and $server with xxxxx for security)
To explain further:
The client has a POS system which is based on ASP.NET and saves the data as XML files on the remote server. I have access to this data via an API which includes a SQL query function using HTTP/cURL defined as follows:
http://remoteserver:82/pos.asmx.GetSql?datasource=DATASOURCE&query=MYQUERY
So to get the field that contains the picture data I am currently using the above code.
The query is in the cURL URL, i.e. http://remoteserver:82/pos.asmx.GetSql?datasource=12345&query=SELECT+*+FROM+picture+WHERE+part+=+'01000015'
However, this returns System.Byte[] instead of encoded data which I can then decode in PHP.
Additional info:
PostgreSQL version: PostgreSQL 9.1.3 on i686-pc-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-51), 32-bit
Table Schema:
Available here: http://i.stack.imgur.com/sc8Gw.png

You should preferably have the server storing the data in PostgreSQL as a bytea field, then encoding to base64 to send to the client, but it sounds like you don't control the server.
The string System.Byte[] suggests it's an app using .NET, like ASP.NET or similar, and it's not correctly handling a bytea array. Instead of formatting it as base64 for output it's embedding the type name in the output.
You can't fix that on the client side, because the server is sending the wrong data.
You'll need to show the server-side tables and queries.
Update after query amended:
You're storing a bytea and returning it directly. The client doesn't seem to understand byte arrays and tries to output it naïvely, probably something like casting it to a string. Since the documentation says it expects "base64" you should probably provide that, instead of a byte array.
PostgreSQL has a handy function to base64-encode bytea data: encode.
Try:
SELECT
account, company, date_amended,
depot, keyfield, part,
encode(picture, 'base64') AS picture,
picture_size, source
FROM picture
WHERE part = '01000015'
The formatting isn't significant; it just makes it easier to read here.
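Once the query returns base64 text, the PHP side could decode it roughly as follows. This is only a sketch: the XML shape (a row element containing a picture element) is assumed from the $xml->row->picture access in the question, and pictureFromXml is a hypothetical helper name.

```php
<?php
// Extract the picture bytes from the XML returned by the GetSql call.
function pictureFromXml(string $xmlText): string {
    $xml = new SimpleXMLElement($xmlText);
    // PostgreSQL's encode(..., 'base64') wraps its output in lines, so strip whitespace first.
    $base64 = preg_replace('/\s+/', '', (string) $xml->row->picture);
    $bytes = base64_decode($base64, true);
    if ($bytes === false) {
        throw new UnexpectedValueException('picture is not valid base64');
    }
    return $bytes;
}

// Stand-in for the API response: a fake image, base64-encoded with line breaks.
$fakeImage = "\x89PNG\r\n\x1a\n" . random_bytes(32);
$sample = '<result><row><picture>'
    . chunk_split(base64_encode($fakeImage), 76, "\n")
    . '</picture></row></result>';

var_dump(pictureFromXml($sample) === $fakeImage); // bool(true)
```

The decoded bytes can then be written to a file or sent to the browser with a suitable image Content-Type header.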

Dealing with eacute and other special characters using Oracle, PHP and Oci8

Hi I am trying to store names into an Oracle database and fetch them back using PHP and oci8.
However, if I insert the é directly into the Oracle database and use oci8 to fetch it back I just receive an e
Do I have to encode all special characters (including é) into HTML entities (i.e. &eacute;) before inserting them into the database, or am I missing something?
Thx
UPDATE: Mar 1 at 18:40
found this function:
http://www.php.net/manual/en/function.utf8-decode.php#85034
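function charset_decode_utf_8($string) {
// ereg() and the /e modifier were removed in PHP 7; preg_match() and
// preg_replace_callback() are the equivalents used here.
// Only do the slow conversion if there are 8-bit characters
if (!preg_match('/[\x80-\x9F]/', $string) && !preg_match('/[\xA1-\xFF]/', $string)) {
return $string;
}
// three-byte UTF-8 sequences become numeric HTML entities
$string = preg_replace_callback('/([\xE0-\xEF])([\x80-\xBF])([\x80-\xBF])/', function ($m) {
return '&#'.((ord($m[1])-224)*4096 + (ord($m[2])-128)*64 + (ord($m[3])-128)).';';
}, $string);
// two-byte UTF-8 sequences become numeric HTML entities
$string = preg_replace_callback('/([\xC0-\xDF])([\x80-\xBF])/', function ($m) {
return '&#'.((ord($m[1])-192)*64 + (ord($m[2])-128)).';';
}, $string);
return $string;
}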
It seems to work, although I'm not sure if it's the optimal solution.
UPDATE: Mar 8 at 15:45
Oracle's character set is ISO-8859-1.
in PHP I added:
putenv("NLS_LANG=AMERICAN_AMERICA.WE8ISO8859P1");
to force the oci8 connection to use that character set.
Retrieving the é using oci8 from PHP now worked! (For VARCHARs, but not CLOBs; for those I had to use utf8_encode to extract it.)
So then I tried saving the data from PHP to Oracle, and it doesn't work: somewhere along the way from PHP to Oracle the é becomes a ?
UPDATE: Mar 9 at 14:47
So getting closer.
After adding the NLS_LANG variable, doing direct oci8 inserts with é works.
The problem is actually on the PHP side.
By using ExtJs framework, when submitting a form it encodes it using encodeURIComponent.
So é is sent as %C3%A9 and then re-encoded into é.
However, its length is now 2 (strlen($my_sent_value) = 2) and not 1.
And if in PHP I try: $my_sent_value == é = FALSE
I think if I am able to re-encode all these characters in PHP back into lengths of byte size 1 and then inserting them into Oracle, it should work.
Still no luck though
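The length of 2 is what UTF-8 gives for é, and converting to ISO-8859-1 brings it back to one byte. A minimal sketch using only rawurldecode(), strlen(), iconv() and bin2hex(), with $sent standing in for $my_sent_value:

```php
<?php
// What PHP receives after the form value %C3%A9 has been URL-decoded: é in UTF-8.
$sent = rawurldecode('%C3%A9');
echo strlen($sent), ' ', bin2hex($sent), PHP_EOL; // 2 c3a9

// Converting to ISO-8859-1 yields the single byte E9.
$latin1 = iconv('UTF-8', 'ISO-8859-1', $sent);
echo strlen($latin1), ' ', bin2hex($latin1), PHP_EOL; // 1 e9
```

Whether that converted value is then stored correctly still depends on the NLS_LANG setting the oci8 connection actually sees.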
UPDATE: Mar 10 at 11:05
I keep thinking I am so close (yet so far away).
putenv("NLS_LANG=AMERICAN_AMERICA.WE8ISO8859P9"); works very sporadically.
I created a small php script to test:
header('Content-Type: text/plain; charset=ISO-8859-1');
putenv("NLS_LANG=AMERICAN_AMERICA.WE8ISO8859P9");
$conn= oci_connect("user", "pass", "DB");
$stmt = oci_parse($conn, "UPDATE temp_tb SET string_field = '|é|'");
oci_execute($stmt, OCI_COMMIT_ON_SUCCESS);
After running this once and logging into the Oracle database directly, I see that STRING_FIELD is set to |¿|. Obviously not what I had come to expect from my previous experience.
However, if I refresh that PHP page twice quickly.... it worked !!!
In Oracle I correctly saw |é|.
It seems like maybe the environment variable is not being correctly set or sent in time for the first execution of the script, but is available for the second execution.
My next experiment is to export the variable into PHP's environment, however, I need to reset Apache for that...so we'll see what happens, hopefully it works.

I presume you are aware of these facts:
There are many different character sets: you have to pick one and, of course, know which one you are using.
Oracle is perfectly capable of storing text without HTML entities (&eacute;). HTML entities are used in, well, HTML. Oracle is not a web browser ;-)
You must also know that HTML entities are not bound to a specific charset; on the contrary, they're used to represent characters in a charset-independent context.
You talk about ISO-8859-1 and UTF-8 interchangeably. What charset do you want to use? ISO-8859-1 is easy to use, but it can only store text in some Latin-script languages (such as Spanish) and it lacks some common chars like the € symbol. UTF-8 is trickier to use, but it can store all characters defined by the Unicode Consortium (which includes everything you'll ever need).
Once you've taken the decision, you must configure Oracle to hold data in such charset and choose an appropriate column type. E.g., VARCHAR2 is fine for plain ASCII, NVARCHAR2 is good for UTF-8.

This is what I finally ended up doing to solve this problem:
Modified the profile of the daemon running PHP to have:
NLS_LANG=AMERICAN_AMERICA.WE8ISO8859P1
So that the oci8 connection uses ISO-8859-1.
Then in my PHP configuration set the default content-type to ISO-8859-1:
default_charset = "iso-8859-1"
When I am inserting into an Oracle Table via oci8 from PHP, I do:
utf8_decode($my_sent_value)
And when receiving data from Oracle, printing the variable should just work as so:
echo $my_received_value
However when sending that data over ajax I have had to use:
utf8_encode($my_received_value)

If you really cannot change the character set that Oracle uses, then how about Base64-encoding your data before storing it in the database? That way, you can accept characters from any character set and store them as ISO-8859-1 (because Base64 outputs a subset of the ASCII character set, which maps exactly to ISO-8859-1). Base64 encoding will increase the length of the string by about a third (roughly 37% once MIME-style line breaks are included).
If your data is only ever going to be displayed as HTML, then you might as well store HTML entities as you suggested, but be aware that a single entity can be up to 10 characters per unencoded character, e.g. &thetasym; is ϑ

I had to face this problem : the LatinAmerican special characters are stored as "?" or "¿" in my Oracle database ... I can't change the NLS_CHARACTER_SET because we're not the database owners.
So, I found a workaround :
1) ASP.NET code
Create a function that converts string to hexadecimal characters:
public string ConvertirStringAHex(String input)
{
Encoding encoding = System.Text.Encoding.GetEncoding("ISO-8859-1");
Byte[] stringBytes = encoding.GetBytes(input);
StringBuilder sbBytes = new StringBuilder(stringBytes.Length);
foreach (byte b in stringBytes)
{
sbBytes.AppendFormat("{0:X2}", b);
}
return sbBytes.ToString();
}
2) Apply the function above to the variable you want to encode, like this
myVariableHex = ConvertirStringAHex( myVariable );
In ORACLE, use the following:
PROCEDURE STORE_IN_TABLE( iTEXTO IN VARCHAR2 )
IS
BEGIN
INSERT INTO myTable( SPECIAL_TEXT )
VALUES ( UTL_RAW.CAST_TO_VARCHAR2(HEXTORAW( iTEXTO )) );
COMMIT;
END;
Of course, iTEXTO is the Oracle parameter which receives the value of "myVariableHex" from ASP.NET code.
Hope it helps ... if there's something to improve pls don't hesitate to post your comments.
Sources:
http://www.nullskull.com/faq/834/convert-string-to-hex-and-hex-to-string-in-net.aspx
https://forums.oracle.com/thread/44799

If the server-side code (PHP in this case) and the Oracle database use different charsets, you should set the server-side charset in the Oracle connection; Oracle then does the conversion.
Example: Let's assume:
php charset utf-8 (default).
Oracle charset AMERICAN_AMERICA.WE8ISO8859P1
In the connection to Oracle made by PHP you should set UTF8 (the fourth parameter).
oci_pconnect("USER", "PASS", "URL", "UTF8");
Doing this, you write code in utf-8 (not doing any conversion at all) and get utf-8 from the database through this connection.
So you could write something like SELECT * FROM SOME_TABLE WHERE TEXT = 'SOME TEXT LIKE áéíóú Ñ' and also get utf-8 text as a result.
According to the PHP documentation, by default the Oracle client (oci_pconnect) takes the NLS_LANG environment variable from the operating system. Some Debian-based systems have no NLS_LANG environment variable, so I think the Oracle client uses its own charset (AMERICAN_AMERICA.WE8ISO8859P1) if we don't specify the fourth parameter.
