PHP MySQL BLOB export to CSV format - php

I am trying to create a script that can export data from a MySQL database containing files (mainly PDFs). This data is then imported into the local version of the system.
However, I have a problem with the BLOB field. When I export and import using phpMyAdmin it works fine, but when using my script the BLOB field has additional code added to the top, almost as if it is code instructing programs how to deal with it. When I try to import this version into phpMyAdmin, it doesn't work.
Is this a formatting error? Do I need to convert it to a different type? At present the value is simply pulled from the database as $row['content'] and then output to the CSV.
Any help would be much appreciated
Regards
///////////////// source code of my script
$file = 'supportingFiles'; // CSV name.
$csv_output = ""; // was undeclared before first use

// count the columns in the files table
$result = mysql_query("SHOW COLUMNS FROM files");
$a = mysql_num_rows($result);

$values = mysql_query("SELECT * FROM files"); // select file details where required
while ($rows = mysql_fetch_array($values)) { // print file information
    for ($k = 0; $k < $a; $k++) {
        $csv_output .= $rows[$k] . ",";
    }
    $csv_output .= "\n"; // end of line
}

$filename = $file . "_" . date("d-m-Y_H-i", time());
header("Content-Type: text/csv");
header("Content-Disposition: attachment; filename=" . $filename . ".csv");
echo $csv_output; // output data file
When an export is run using this script, the following precedes the actual content that a phpMyAdmin CSV export of the table produces...
%âãÏÓ
%%ISIS AfpToPdf-V.6.2/h3 '2008-05-19 (build:6.20.0.08205)'
4 0 obj
[
/DeviceRGB
]
endobj
5 0 obj
[/Pattern 4 0 R]
endobj
6 0 obj
[
/DeviceCMYK
]
endobj
7 0 obj
[/Pattern 6 0 R]
endobj
14 0 obj
<</Length 1221/Filter/FlateDecode>>
stream
[... binary FlateDecode stream data omitted ...]

That is because CSV is not well standardized as a format (not even the "comma" in "comma-separated values"), and because phpMyAdmin does some encoding/decoding on the values when exporting/importing.
You need this encoding/decoding part because your binary BLOB (as you said, mostly PDF) can easily contain commas, line breaks, quotes, and all kinds of other things that break a CSV parser.
To import your files using phpMyAdmin, you would have to replicate the encoding mechanism used there - lucky you that it's open source, so you can have a look at the code.
Alternatively, if you want your own export/import mechanism (let's say: you write your own importer that matches your exporter), then you could make good use of base64 encoding here to ensure your CSV (which is intended as a plain-text format, by the way) stores binary data correctly.
exporter:
// convert binary blob to text format
$plaintextdata_for_csv = base64_encode($binarydata_from_blob);
importer:
// decode text format to binary blob
$binarydata_for_blob = base64_decode($plaintextdata_from_csv);
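
Put together, a minimal sketch of a matching exporter/importer pair (assuming a files table with an id column and a content BLOB column, and an existing PDO connection $pdo - the names here are illustrative, not from the question):

// exporter: one base64-encoded BLOB per CSV row
// (assumes PDO connection $pdo; table/column names are illustrative)
$out = fopen('php://output', 'w');
header('Content-Type: text/csv');
header('Content-Disposition: attachment; filename=supportingFiles.csv');
foreach ($pdo->query('SELECT id, content FROM files') as $row) {
    // base64 turns arbitrary binary into plain text that is safe inside a CSV field
    fputcsv($out, array($row['id'], base64_encode($row['content'])));
}
fclose($out);

// importer: read the same CSV back and decode each BLOB
$in = fopen('supportingFiles.csv', 'r');
$stmt = $pdo->prepare('INSERT INTO files (id, content) VALUES (?, ?)');
while (($fields = fgetcsv($in)) !== false) {
    $stmt->execute(array($fields[0], base64_decode($fields[1])));
}
fclose($in);

Note that fputcsv()/fgetcsv() also take care of quoting, so commas and quotes in other columns won't break the format either.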

Related

CSV import showing weird characters - PHP CSV League

I am trying to import a CSV. Here is my CSV file for the test:
https://drive.google.com/file/d/1LU4SHG_EbSj9OTRXRakNHgllZD45iaS3/view?usp=sharing
I am providing the exact file so that the file structure/encoding isn't changed by pasting the CSV text here.
I am using CSV League (https://csv.thephpleague.com/) and I have the following code for the import:
$header_row = 0;
$filename = 'test_extra.csv';
$csv_file_path = storage_path('app/files/') . $filename;
if (!ini_get("auto_detect_line_endings")) {
    ini_set("auto_detect_line_endings", TRUE);
}
$csv = Reader::createFromPath($csv_file_path, 'r');
// trying to convert the encoding to UTF-8
$csv->addStreamFilter('convert.iconv.ISO-8859-15/UTF-8');
$csv->setHeaderOffset($header_row);
$sample_data = $csv->fetchOne($header_row);
print_r($sample_data);
But it is printing the data as:
This is totally weird, because the first row contains data something like:
‌0858,‌0858-A1,‌One Bedroom All.a1,1,1,702,2,‌1X1,0
Is there something I might be missing?
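
For what it's worth, the invisible characters in front of those values look like zero-width non-joiners (U+200C). A hedged sketch of stripping them after reading (assuming the file is in fact already UTF-8, in which case the ISO-8859-15 stream filter above would itself corrupt the data and should be removed):

// strip zero-width characters (U+200B-U+200D, U+FEFF) from every field
// (assumes the fields are valid UTF-8)
$clean = array_map(function ($field) {
    return preg_replace('/[\x{200B}-\x{200D}\x{FEFF}]/u', '', $field);
}, $sample_data);
print_r($clean);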

Why is file_get_contents() not reading certain data from a CSV file

I have a PHP v5.6 script that reads an uploaded CSV file containing a few rows and columns of plaintext, and one of the cells contains JavaScript code. Here is the output from my Ubuntu 18.04 bash shell when I do cat test.csv...
First Name,Last Name
Jane,Doe
John,<script>alert(‘test’);</script>
In my PHP script, the uploaded file is read directly from the superglobal array: $csv = $_FILES["Uploaded_File"]["tmp_name"].
If I do $contents = strip_tags(file_get_contents($csv));, then the contents of the JS text cell are read, along with all the other plaintext cells: var_dump($contents) displays string(55) "First Name,Last Name Jane,Doe John,alert(‘test’); "
But if I do $contents2 = file_get_contents($csv);, then all the data in the CSV is read, EXCEPT for the JS text cell: var_dump($contents2) displays string(72) "First Name,Last Name Jane,Doe John, "
Why is $contents2 = file_get_contents($csv); not showing the JS text cell?
Why is var_dump($contents2) showing string(72) when the actual string has more bytes than what is being displayed?
Here is a snippet of the script:...
$csv = $_FILES["Uploaded_File"]["tmp_name"];
$contents = strip_tags(file_get_contents($_FILES["Uploaded_File"]["tmp_name"]));
echo __FILE__." = ".__LINE__." = ";var_dump($contents);echo "<hr />";
$contents2 = file_get_contents($_FILES["Uploaded_File"]["tmp_name"]);
echo __FILE__." = ".__LINE__." = ";var_dump($contents2);echo "<hr />";
I read the PHP manual for file_get_contents(), but there is no mention of tags not being allowed in the file being read (https://www.php.net/manual/en/function.file-get-contents.php). There is an option to use a context, but I do not see any option there that would solve this.
Maybe it is a php.ini setting...? :/
The data is all there - that is why var_dump() reports string(72) - but the browser parses the <script> element as HTML instead of displaying it. If you want to view the result in a browser, you should output the content as plain text, and with an appropriate charset, because the quote marks around test (or Yikes) are not ASCII characters.
header('Content-Type: text/plain; charset=gb18030');
var_dump($contents2);
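
Alternatively - a sketch, assuming the quotes really are GB18030-encoded as the charset above suggests - you could convert the contents to UTF-8 before dumping them:

// convert from the file's (assumed) encoding to UTF-8 for display
$utf8 = mb_convert_encoding($contents2, 'UTF-8', 'GB18030');
header('Content-Type: text/plain; charset=utf-8');
var_dump($utf8);

mbstring lists GB18030 among its supported encodings, but treat the source encoding here as an assumption to verify against the actual file.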

PHP - HTML table convert to CSV only works for ~ 700 rows

I'm trying to convert HTML tables to CSV with my PHP script.
I have a lot of HTML files, and each of them contains just one table with a very simple structure:
<table><th></th><tr><td></td></tr></table>
The HTML tables in the different files have between 300 and 2000 rows. My PHP script converts all of them with up to about 800 rows to CSV in under a second, and everything works. But the others (with 900 rows or more) don't work: I always get an empty CSV file with just "" in it (opening it in Excel).
I run the script locally on MAMP, and for the non-working files the PHP error log says:
PHP Fatal error: Uncaught Error: Call to a member function find() on boolean in /Applications/MAMP/htdocs/convertcsv.php:29
That's the single function in my convertcsv.php script which does the whole conversion:
// requires the PHP Simple HTML DOM Parser (simple_html_dom.php) for str_get_html()
function convertToCSV($input, $output) {
    $newFileContent = "";
    file_put_contents($output, $newFileContent);
    echo "File created (" . $output . ")";
    $table = file_get_contents($input);
    $html = str_get_html($table); // returns false when the input exceeds the library's MAX_FILE_SIZE
    // generate the CSV file header
    header("Content-Encoding: UTF-8");
    header("Content-type: text/csv; charset=UTF-8");
    $fp = fopen($output, "w");
    fwrite($fp, "\xEF\xBB\xBF"); // UTF-8 BOM so Excel detects the encoding
    foreach ($html->find('tr') as $element) { // line 29 from the error log
        $td = [];
        $kinder = $element->children();
        foreach ($kinder as $kind) {
            $td[] = $kind->plaintext;
        }
        fputcsv($fp, $td, ';');
    }
    fclose($fp);
}
Line 29 is the foreach loop.
Do you maybe know why the script works perfectly with tables of up to ~800 rows, but not with bigger ones?
Thanks a lot, guys
I found the problem. I just had to increase MAX_FILE_SIZE in the included simplehtmldom.php. Greetings
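
For reference, a hedged sketch of that fix (recent releases of the Simple HTML DOM library only define the constant when it is not already set, so it can be overridden before the include; in older releases you may have to edit the define() inside the library file itself, as the answer above did):

// raise the parser's input-size limit before loading the library
define('MAX_FILE_SIZE', 6000000); // illustrative value; the library default is around 600 KB
require_once 'simple_html_dom.php';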

Cut string and gunzip using PHP

I have a router config export file, which contains a 20-byte header followed by zlib-compressed data. Once uncompressed, it should contain plain XML content.
My code strips the first 20 bytes and decompresses the file, but the output is still binary. I used file_get_contents() and file_put_contents() first, but assumed (wrongly) that they weren't binary-safe. I've tried changing the 20 to anything from 1 to 1000, to no avail.
<?php
$fp_orig = fopen('config.cfg', 'rb');
$data_orig = fread($fp_orig, filesize('config.cfg'));
fclose($fp_orig);

$bytes = 20; // tried 1-1000
$data_gz = substr($data_orig, $bytes);

$fp_gz = fopen('config.cfg.gz', 'w');
fwrite($fp_gz, $data_gz);
fclose($fp_gz);

$fp_gz = gzopen('config.cfg.gz', 'rb');
$fp_xml = fopen('config.cfg.xml', 'wb');
while (!gzeof($fp_gz)) {
    fwrite($fp_xml, gzread($fp_gz, 4096));
}
fclose($fp_xml);
gzclose($fp_gz);

echo file_get_contents('config.cfg.xml'); // gives binary data
?>
I'm not particularly looking for turnkey working code, but rather a push into the right direction.
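
One likely push in the right direction: data that begins with a zlib header (RFC 1950, typically the bytes 0x78 0x9C) is not in gzip file format (RFC 1952), so gzopen()/gzread() will not decompress it; gzuncompress() handles zlib streams directly. A sketch under that assumption:

<?php
// file_get_contents() is binary-safe, so no fopen/fread dance is needed
$data = file_get_contents('config.cfg');
$zlib = substr($data, 20); // assumes the 20-byte header length from the question

// gzuncompress() expects a zlib stream (RFC 1950),
// unlike gzopen(), which expects a gzip file (RFC 1952)
$xml = gzuncompress($zlib);
if ($xml === false) {
    die('No zlib stream at this offset - check the header length.');
}
file_put_contents('config.cfg.xml', $xml);
echo $xml;

If gzuncompress() still fails, the payload may be a raw DEFLATE stream without the zlib wrapper, in which case gzinflate() is the one to try.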

Python: how to write data into a single file across different calls

I have a Python script and a PHP page on my website. I have to create a CSV file and write the data into it in the following manner:
PHP processes the GET request and calls the Python script via shell_exec(), passing the GET variables to it in JSON format. That script looks like this:
...
$var_a = $_GET['a'];
$var_b = $_GET['b'];
$var_c = $_GET['c'];
$post_data = array(
    'item'  => 'get-list',
    'var a' => "$var_a",
    'var b' => "$var_b",
    'var c' => "$var_c"
);
$json_data = json_encode($post_data);
// escapeshellarg() keeps the quotes in the JSON from breaking the shell command
$temp = shell_exec('python process_data.py ' . escapeshellarg($json_data));
echo $temp;
...
This script is quite straightforward: it simply collects the GET variables and forms a JSON object from them. This data is then passed to process_data.py, which processes it and saves it into a CSV file. Its code is:
import sys, json, time, csv

json_data = json.dumps(sys.argv[1])

def scr(json_data):
    json_data = json_data
    count = 0
    json_list = json_data.split(',')
    for i in json_list:
        count = count + 1
    if count == 5:
        lati = json_list[0].strip('{')
        longi = json_list[1]
        datestmp = json_list[2]
        timestmp = json_list[3]
        out = open('file.csv', 'w')
        out.write('%s;' % lati.split(':')[1])
        out.write('\n')
        out.write('%s;' % longi.split(':')[1])
        out.write('\n')
        out.write('%s;' % datestmp.split(':')[1])
        out.write('\n')
        out.write('%s;' % timestmp.split(':', 1)[1])
        out.write('\n')
        out.close()
        print 'Latitude: ', lati.split(':')[1], ', Longitude: ', longi.split(':')[1], ', Datestamp: ', datestmp.split(':')[1], ', Time: ', timestmp.split(':', 1)[1]
    else:
        print 'Wrong data received'

scr(json_data)
Everything is working well: the data is processed and saved, and the PHP script also saves the current data into the database. So PHP does two operations:
1. Saves the data into the database.
2. Passes the data to the Python script, which saves it into the CSV file.
However, the GET data arrives at a regular interval (say every 1 or 2 seconds), so the Python script is called just as often and has to open the CSV file every 1-2 seconds to write the data. This will slow down the processing.
I want the CSV file to remain open while data keeps coming in, and to be closed only when no GET request has been made for a long time. Is there any way to do this?
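
A process started fresh by shell_exec() cannot keep a file handle open between calls, so literally keeping the CSV open would require a long-running process. Before going there, a hedged sketch of the cheap alternative (function and path names here are illustrative): append one row per invocation instead of rewriting the file - at one request every 1-2 seconds the open/close overhead is negligible:

import csv

# a sketch: append one row per call instead of truncating the file
# (function name and path are illustrative, not from the question)
def append_row(lati, longi, datestmp, timestmp, path='file.csv'):
    with open(path, 'a') as out:  # 'a' keeps the existing rows
        writer = csv.writer(out, delimiter=';')
        writer.writerow([lati, longi, datestmp, timestmp])

If profiling ever shows the open/close really matters, the usual pattern is a single long-running daemon (reading from a named pipe or a small socket) that keeps the file open, with PHP sending it the data instead of spawning a new interpreter per request.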
