Parse large CSV file in a short time in PHP

I am looking for a way to find a value in one column of a CSV file and return the value of another column from the same row.
This is my function. It works fine, but only on small files:
function find_user($filename, $id) {
$f = fopen($filename, "r");
$result = false;
while ($row = fgetcsv($f, 0, ";")) {
if ($row[6] == $id) {
$result = $row[5];
break;
}
}
fclose($f);
return $result;
}
The problem is that the actual file I have to work with is 4GB, and the search takes far too long.
Searching Stack Overflow, I found the following post:
file_get_contents => PHP Fatal error: Allowed memory exhausted
It offers the following function, which (as far as I understand) makes it easier to work through huge CSV files in chunks:
function file_get_contents_chunked($file,$chunk_size,$callback)
{
try
{
$handle = fopen($file, "r");
$i = 0;
while (!feof($handle))
{
call_user_func_array($callback,array(fread($handle,$chunk_size),&$handle,$i));
$i++;
}
fclose($handle);
}
catch(Exception $e)
{
trigger_error("file_get_contents_chunked::" . $e->getMessage(),E_USER_NOTICE);
return false;
}
return true;
}
And the way of using it seems to be the following:
$success = file_get_contents_chunked("my/large/file",4096,function($chunk,&$handle,$iteration){
/*
* Do what you will with the {$chunk} here
* {$handle} is passed in case you want to seek
** to different parts of the file
* {$iteration} is the section of the file that has been read so
* ($i * 4096) is your current offset within the file.
*/
});
if(!$success)
{
//It Failed
}
The problem is that I do not know how to adapt my initial code to work with this function in order to speed up the search in large CSV files. My PHP knowledge is not very advanced.

No matter how you read the file, there's no way to make the search faster, since you always have to scan every character while looking for the correct row and column. The worst case is when the row you're looking for is the last one in the file.
You should import your CSV into a proper indexed database and modify your application to save new records to that database instead of a CSV file.
Here's a rudimentary example using SQLite. I created a CSV file with 100 million records (~5GB) and tested with it.
Create a SQLite database and import your CSV file into it:
$f = fopen('db.csv', 'r');
$db = new SQLite3('data.db');
$db->exec('CREATE TABLE "user" ("id" INT PRIMARY KEY, "name" TEXT,
"c1" TEXT, "c2" TEXT, "c3" TEXT, "c4" TEXT, "c5" TEXT)');
$stmt = $db->prepare('INSERT INTO "user"
("id", "name", "c1", "c2", "c3", "c4", "c5") VALUES (?, ?, ?, ?, ?, ?, ?)');
$stmt->bindParam(1, $id, SQLITE3_INTEGER);
$stmt->bindParam(2, $name, SQLITE3_TEXT);
$stmt->bindParam(3, $c1, SQLITE3_TEXT);
$stmt->bindParam(4, $c2, SQLITE3_TEXT);
$stmt->bindParam(5, $c3, SQLITE3_TEXT);
$stmt->bindParam(6, $c4, SQLITE3_TEXT);
$stmt->bindParam(7, $c5, SQLITE3_TEXT);
$db->exec('BEGIN TRANSACTION');
while ($row = fgetcsv($f, 0, ';')) {
list($c1, $c2, $c3, $c4, $c5, $name, $id) = $row;
$stmt->execute();
}
$db->exec('COMMIT');
This takes a long time, over 15 minutes on my computer, and results in a 6.5GB file.
Search the database:
$id = 99999999;
$db = new SQLite3('data.db');
$stmt = $db->prepare('SELECT "name" FROM "user" WHERE "id" = ?');
$stmt->bindValue(1, $id, SQLITE3_INTEGER);
$result = $stmt->execute();
print_r($result->fetchArray());
This executes virtually instantaneously.
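To connect this back to the question, the lookup can be wrapped in a function with the same return convention as the original find_user() (the name on success, false when the id is absent). This is only a sketch: the function name find_user_indexed is hypothetical, and a tiny in-memory database with a reduced table stands in for the real data.db so the example runs on its own.

```php
<?php
// Hypothetical helper: same contract as find_user(), but backed by the indexed table.
function find_user_indexed(SQLite3 $db, int $id)
{
    $stmt = $db->prepare('SELECT "name" FROM "user" WHERE "id" = ?');
    $stmt->bindValue(1, $id, SQLITE3_INTEGER);
    // fetchArray() returns false when no row matches.
    $row = $stmt->execute()->fetchArray(SQLITE3_NUM);
    return $row === false ? false : $row[0];
}

// Assumption for the demo: an in-memory database instead of the imported data.db.
$db = new SQLite3(':memory:');
$db->exec('CREATE TABLE "user" ("id" INT PRIMARY KEY, "name" TEXT)');
$db->exec('INSERT INTO "user" ("id", "name") VALUES (42, \'alice\')');

$found   = find_user_indexed($db, 42); // existing id
$missing = find_user_indexed($db, 7);  // id that was never inserted

var_dump($found, $missing);
```

With the real database you would open it once with new SQLite3('data.db') and reuse the handle for every lookup.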

Related

Inserting data from mysql from CSV using PHP PDO

I have a list of data in a CSV file and need to insert it into a MySQL database. The data should be inserted safely, i.e. sanitized, so I have used PDO to guard against SQL injection. However, it fails to read the data from the CSV file and inserts null values.
Here is the example:
<?php
$servername = "localhost";
$username = "root";
$password = "";
try {
$conn = new PDO("mysql:host=$servername;dbname=contact_list",$username,$password);
$conn->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
echo "connection successfully";
}
catch(PDOException $e)
{
echo "connection Failed:" . $e -> getMessage();
}
// Create CSV to Array function
function csvToArray($filename = '', $delimiter = ',')
{
if (!file_exists($filename) || !is_readable($filename)) {
return false;
}
$header = NULL;
$result = array();
if (($handle = fopen($filename, 'r')) !== FALSE) {
while (($row = fgetcsv($handle, 1000, $delimiter)) !== FALSE) {
if (!$header)
$header = $row;
else
$result[] = array_combine($header, $row);
}
fclose($handle);
}
return $result;
}
// Insert data into database
$all_data = csvToArray('contact.csv');
foreach ($all_data as $data) {
$data = array_map(function($row){
return filter_var($row, FILTER_SANITIZE_STRING, FILTER_SANITIZE_FULL_SPECIAL_CHARS);
}, $data);
$sql = $conn->prepare("INSERT INTO contact
(title, first_name,last_name,company_name,date_of_birth,notes)
VALUES (:t, :fname, :lname,:cname,:dob,:note)");
$sql->bindParam(':t', $data[1], PDO::PARAM_STR);
$sql->bindParam(':fname', $data[2], PDO::PARAM_STR);
$sql->bindParam(':lname', $data[3], PDO::PARAM_STR);
$sql->bindParam(':cname', $data[0], PDO::PARAM_STR);
$sql->bindParam(':dob', $data[4], PDO::PARAM_STR);
$sql->bindParam(':note', $data[15], PDO::PARAM_STR);
print_r($data);
$sql->execute();
}
?>
Can anyone help me to solve this?
If you take a look at the documentation for array_combine() you'll see that its purpose is to build an associative array. You use this function in csvToArray() but later in your code you are trying to get data using numeric keys. I wouldn't expect you'd ever have anything inserted.
On a side note, you are completely defeating the purpose of prepared statements by preparing the same statement over and over again. Prepare once and execute many times. Binding parameters individually is rarely needed; in almost all cases you can provide the data to PDOStatement::execute() as an array. It's also bad form to store HTML entities in a database; if you need to output to HTML, perform the escaping at that point.
Something like this should work (adjust array key names as necessary.)
$all_data = csvToArray('contact.csv');
$sql = $conn->prepare("INSERT INTO contact
(title, first_name, last_name, company_name, date_of_birth, notes)
VALUES (:t, :fname, :lname,:cname,:dob,:note)");
foreach ($all_data as $data) {
$params = [
":t" => $data["t"],
":fname" => $data["fname"],
":lname" => $data["lname"],
":cname" => $data["cname"],
":dob" => $data["dob"],
":note" => $data["note"],
];
$sql->execute($params);
}
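The remark about escaping at output time rather than before storage can be illustrated with a minimal sketch. The row contents here are invented for the example; the only real API used is htmlspecialchars().

```php
<?php
// Hypothetical row as it might be read back from the contact table:
// the text is stored raw, without HTML entities.
$row = ['first_name' => 'Ann', 'notes' => 'Likes <b>bold</b> & "quotes"'];

// Escape only at the moment the value is written into an HTML page.
$safeNotes = htmlspecialchars($row['notes'], ENT_QUOTES, 'UTF-8');

echo '<td>' . $safeNotes . '</td>', "\n";
```

The prepared statement already protects the query itself, so the stored value can stay exactly as it appeared in the CSV.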

Do While loop works well with echo but loops only once with a function inside loop

This code is supposed to insert 100 rows into the DB.
Yet when I run it, it loops only once, inserts one row and stops.
When I replace the function call with:
echo $keywords[4].'<br>';
Then it works perfectly, with no problem.
What is missing? What should I change or add so that the code inserts all rows from the file into the DB?
Here is the loop code:
do{
//Insert row content into array.
$keywords = preg_split("#\<(.*?)\>#", $row);
//Insert relevant data into DB
add_data($keywords);
}
else{
// If row is irrelevant - continue to next row
continue;
}
}while (strpos($row, 'Closed P/L') != true);
Here is the function
function add_data($keywords)
{
global $db;
$ticket =$keywords[2];
$o_time = $keywords[4];
$type = $keywords[6];
$size = $keywords[8];
$item = substr($keywords[10], 0, -1);
$o_price = $keywords[12];
$s_l = $keywords[14];
$t_p = $keywords[16];
$c_time = $keywords[18];
$c_price = $keywords[20];
$profit = $keywords[28];
try
{
$sql = "
INSERT INTO `data`
(ticket, o_time, type, size, item, o_price, s_l, t_p, c_time, c_price, profit)
VALUES
(:ticket, :o_time, :type, :size, :item, :o_price, :s_l, :t_p, :c_time, :c_price, :profit)";
$stmt = $db->prepare($sql);
$stmt->bindParam('ticket', $ticket, PDO::PARAM_STR);
$stmt->bindParam('o_time', $o_time, PDO::PARAM_STR);
$stmt->bindParam('type', $type, PDO::PARAM_STR);
$stmt->bindParam('size', $size, PDO::PARAM_STR);
$stmt->bindParam('item', $item, PDO::PARAM_STR);
$stmt->bindParam('o_price', $o_price, PDO::PARAM_STR);
$stmt->bindParam('s_l', $s_l, PDO::PARAM_STR);
$stmt->bindParam('t_p', $t_p, PDO::PARAM_STR);
$stmt->bindParam('c_time', $c_time, PDO::PARAM_STR);
$stmt->bindParam('c_price', $c_price, PDO::PARAM_STR);
$stmt->bindParam('profit', $profit, PDO::PARAM_STR);
$stmt->execute();
//return true;
}
catch(Exception $e)
{
return false;
echo 'something is wrong. Here is the system\'s message:<br>'.$e;
}
}

Xml to Mysql Fatal error

I'm trying to get this working, but I'm doing something wrong. I have been trying for two days with no luck. Maybe someone can help me with the code.
I have an XML file on the internet (see the example) and I want to load it into my MySQL database with PHP.
This is an example of my XML file:
<AssignmentItems xmlns="http://server.my.net" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:schemaLocation="http://server.my.net/static/xsd/datafeed_assignments.xsd" total="29">
<Assignment>
<Id>253049101</Id>
<Status>Enroute</Status>
<Location>EBCI</Location>
<From>EBCI</From>
<Destination>EHGG</Destination>
<Assignment>1 PixCha Passer</Assignment>
<Amount>1</Amount>
<Units>passengers</Units>
<Pay>911.00</Pay>
<PilotFee>0.00</PilotFee>
<Expires>3 days</Expires>
<ExpireDateTime>2017-12-01 06:01:56</ExpireDateTime>
<Type>Trip-Only</Type>
<Express>False</Express>
<Locked>sharee</Locked>
<Comment/>
</Assignment>
</AssignmentItems>
I use this PHP code to send it to MySQL:
<?php
$db = new PDO('mysql:host=localhost;dbname=test','root','');
$xmldoc = new DOMDocument();
$xmldoc->load('**XML URL**');
$xmldata = $xmldoc->getElementsByTagName('Assignment');
$xmlcount = $xmldata->length;
for ($i=0; $i < $xmlcount; $i++) {
$Id = $xmldata->item($i)->getElementsByTagName('Id')->item(0)->childNodes->item(0)->nodeValue;
$Status = $xmldata->item($i)->getElementsByTagName('Status')->item(0)->childNodes->item(0)->nodeValue;
$Location = $xmldata->item($i)->getElementsByTagName('Location')->item(0)->childNodes->item(0)->nodeValue;
$Fram = $xmldata->item($i)->getElementsByTagName('From')->item(0)->childNodes->item(0)->nodeValue;
$Destination = $xmldata->item($i)->getElementsByTagName('Destination')->item(0)->childNodes->item(0)->nodeValue;
$Assignment = $xmldata->item($i)->getElementsByTagName('Assignment')->item(0)->childNodes->item(0)->nodeValue;
$Amount = $xmldata->item($i)->getElementsByTagName('Amount')->item(0)->childNodes->item(0)->nodeValue;
$Units = $xmldata->item($i)->getElementsByTagName('Units')->item(0)->childNodes->item(0)->nodeValue;
$Pay = $xmldata->item($i)->getElementsByTagName('Pay')->item(0)->childNodes->item(0)->nodeValue;
$PilotFee = $xmldata->item($i)->getElementsByTagName('PilotFee')->item(0)->childNodes->item(0)->nodeValue;
$Expires = $xmldata->item($i)->getElementsByTagName('Expires')->item(0)->childNodes->item(0)->nodeValue;
$ExpireDateTime = $xmldata->item($i)->getElementsByTagName('ExpireDateTime')->item(0)->childNodes->item(0)->nodeValue;
$Type = $xmldata->item($i)->getElementsByTagName('Type')->item(0)->childNodes->item(0)->nodeValue;
$Express = $xmldata->item($i)->getElementsByTagName('Express')->item(0)->childNodes->item(0)->nodeValue;
$Locked = $xmldata->item($i)->getElementsByTagName('Locked')->item(0)->childNodes->item(0)->nodeValue;
$stmt = $db->prepare("insert into jobs values(?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)");
$stmt->bindParam(1, $Id);
$stmt->bindParam(2, $Status);
$stmt->bindParam(3, $Location);
$stmt->bindParam(4, $Fram);
$stmt->bindParam(5, $Destination);
$stmt->bindParam(6, $Assignment);
$stmt->bindParam(7, $Amount);
$stmt->bindParam(8, $Units);
$stmt->bindParam(9, $Pay);
$stmt->bindParam(10, $PilotFee);
$stmt->bindParam(11, $Expires);
$stmt->bindParam(12, $ExpireDateTime);
$stmt->bindParam(13, $Type);
$stmt->bindParam(14, $Express);
$stmt->bindParam(15, $Locked);
$stmt->execute();
printf($Id.'<br/>');
printf($Status.'<br/>');
printf($Location.'<br/>');
printf($Fram.'<br/>');
printf($Destination.'<br/>');
printf($Assignment.'<br/>');
printf($Amount.'<br/>');
printf($Units.'<br/>');
printf($Pay.'<br/>');
printf($PilotFee.'<br/>');
printf($Expires.'<br/>');
printf($ExpireDateTime.'<br/>');
printf($Type.'<br/>');
printf($Express.'<br/>');
printf($Locked.'<br/>');
}
?>
Then I get this error on my PHP page:
Notice: Trying to get property of non-object in
C:\xampp\htdocs\xml\xml.php on line 11
Fatal error: Uncaught Error: Call to a member function item() on null
in C:\xampp\htdocs\xml\xml.php:11 Stack trace: #0 {main} thrown in
C:\xampp\htdocs\xml\xml.php on line 11
I hope someone can help with this error.
Michael
You can load the whole XML document in PHP with the simplexml_load_file function and then insert the data into the database in a for loop with prepared statements, like below:
$xml = simplexml_load_file("http://www.example.com/sample.xml") or die("Error: Cannot create object");
$count = count($xml->Assignment);
for($i=0;$i<$count;$i++)
{
$Id = $xml->Assignment[$i]->Id;
$Status = $xml->Assignment[$i]->Status;
$Location = $xml->Assignment[$i]->Location;
$Fram = $xml->Assignment[$i]->From;
$Destination = $xml->Assignment[$i]->Destination;
$Assignment = $xml->Assignment[$i]->Assignment;
$Amount = $xml->Assignment[$i]->Amount;
$Units = $xml->Assignment[$i]->Units;
$Pay = $xml->Assignment[$i]->Pay;
$PilotFee = $xml->Assignment[$i]->PilotFee;
$Expires = $xml->Assignment[$i]->Expires;
$ExpireDateTime = $xml->Assignment[$i]->ExpireDateTime;
$Type = $xml->Assignment[$i]->Type;
$Express = $xml->Assignment[$i]->Express;
$Locked = $xml->Assignment[$i]->Locked;
$stmt = $db->prepare("insert into jobs values(?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)");
$stmt->bindParam(1, $Id);
$stmt->bindParam(2, $Status);
$stmt->bindParam(3, $Location);
$stmt->bindParam(4, $Fram);
$stmt->bindParam(5, $Destination);
$stmt->bindParam(6, $Assignment);
$stmt->bindParam(7, $Amount);
$stmt->bindParam(8, $Units);
$stmt->bindParam(9, $Pay);
$stmt->bindParam(10, $PilotFee);
$stmt->bindParam(11, $Expires);
$stmt->bindParam(12, $ExpireDateTime);
$stmt->bindParam(13, $Type);
$stmt->bindParam(14, $Express);
$stmt->bindParam(15, $Locked);
$stmt->execute();
printf($Id.'<br/>');
printf($Status.'<br/>');
printf($Location.'<br/>');
printf($Fram.'<br/>');
printf($Destination.'<br/>');
printf($Assignment.'<br/>');
printf($Amount.'<br/>');
printf($Units.'<br/>');
printf($Pay.'<br/>');
printf($PilotFee.'<br/>');
printf($Expires.'<br/>');
printf($ExpireDateTime.'<br/>');
printf($Type.'<br/>');
printf($Express.'<br/>');
printf($Locked.'<br/>');
// Get all data in Variables and Execute MYSQLi Prepared Statements
// ---- //
}
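I found your error. It is raised when you access the item at position 1, which does not contain the child elements you expect:
When you call $xmlcount = $xmldata->length; $xmlcount has a value of 2, because getElementsByTagName('Assignment') also matches the nested <Assignment> element inside the record. Your for loop therefore runs one extra cycle on a node that has no <Id> child, so item(0) returns null. A quick way to avoid this for the sample file, without much extra checking, is: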
$xmlcount = $xmldata->length - 1;
[Edit]
Your XML is not formatted correctly. You missed the closing tag of <AssignmentItems>
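The count of 2 mentioned above can be reproduced with a small sketch, together with one possible way to loop over only the top-level records when a file holds more than one. The inline XML is a shortened, hypothetical copy of the sample from the question; filtering the root's direct children is an alternative to subtracting 1, not something the original answer proposes.

```php
<?php
// Shortened stand-in for the feed in the question (hypothetical data).
$xml = <<<XML
<AssignmentItems total="1">
  <Assignment>
    <Id>253049101</Id>
    <Status>Enroute</Status>
    <Assignment>1 PixCha Passer</Assignment>
  </Assignment>
</AssignmentItems>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);

// getElementsByTagName() matches descendants at any depth,
// so the nested <Assignment> text element is counted as well.
$all = $doc->getElementsByTagName('Assignment');

// Keep only the <Assignment> elements that are direct children of the root.
$records = [];
foreach ($doc->documentElement->childNodes as $node) {
    if ($node instanceof DOMElement && $node->tagName === 'Assignment') {
        $records[] = $node;
    }
}

$ids = [];
foreach ($records as $record) {
    // Safe here: every top-level record in the sample has an <Id> child.
    $ids[] = $record->getElementsByTagName('Id')->item(0)->textContent;
}

echo 'matched: ', $all->length, ', records: ', count($records), "\n";
```

Because the records and the nested elements alternate in document order, this filter keeps working when the feed contains many records, which a fixed offset on the length would not.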

PHP imported data integrity

I will try to explain the situation as well as possible:
I have a script that imports CSV file data into an MS Access database.
I have 2 Access tables:
A) Users and their information (ID, name, last name, etc.)
B) A table which contains the data from the CSV file
The problem is that the data imported from the file (2nd table) contains the user's first and last name. I would like an idea of how, while reading the CSV file line by line, to check which name the line contains and store the userID from table 1 instead of the first and last name in table 2. It should be done during the import, because each import brings in roughly 3k lines. Any ideas appreciated. Images given below.
Import script:
<?php
function qualityfunction() {
error_reporting(0);
require_once '/Classes/PHPExcel.php'; // (this should include the autoloader)
require_once '/CLasses/PHPExcel/IOFactory.php';
$excel_readers = array(
'Excel5' ,
'Excel2003XML' ,
'Excel2007'
);
$files = glob('data files/quality/QA*.xls');
$sheetname= 'AvgScoreAgentComments';
if (count($files) >0 ) {
foreach($files as $flnam) {
$reader = PHPExcel_IOFactory::createReader('Excel5');
$reader->setReadDataOnly(true);
$reader->setLoadSheetsOnly($sheetname);
$path = $flnam;
$excel = $reader->load($path);
$writer = PHPExcel_IOFactory::createWriter($excel, 'CSV');
$writer->save('data files/quality/temp.csv');
/*
$filename = basename($path);
if (strpos($filename,'tes') !== false) {
echo 'true';
}*/
require "connection.php";
$handle = fopen("data files/quality/temp.csv", "r");
try {
$import= $db->prepare("INSERT INTO quality(
qayear,
qamonth,
lastname,
firstname,
score) VALUES(
?,?,?,?,?)");
$i = 0;
while (($data = fgetcsv($handle, 1000, ",", "'")) !== FALSE) {
if($i > 3) {
$data = str_replace('",', '', $data);
$data = str_replace('"', '', $data);
$import->bindParam(1, $data[1], PDO::PARAM_STR);
$import->bindParam(2, $data[2], PDO::PARAM_STR);
$import->bindParam(3, $data[3], PDO::PARAM_STR);
$import->bindParam(4, $data[4], PDO::PARAM_STR);
$import->bindParam(5, $data[7], PDO::PARAM_STR);
$import->execute();
}
$i++;
}
fclose($handle);
$removal=$db->prepare("DELETE FROM quality WHERE score IS NULL;");
$removal->execute();
}
catch(PDOException $e) {
echo $e->getMessage()."\n";
}};
Data table 1 (Users info):
Data table 2 (In which data from CSV file is imported)
Found a solution. Thanks for help.
$lastname = "lastname";
$firstname = "firstname";
$showdata = $db->prepare("SELECT userID FROM users WHERE lastname= :lastname AND firstname= :firstname");
$showdata->bindParam(':lastname', $lastname);
$showdata->bindParam(':firstname', $firstname);
$showdata->execute();
$rowas= $showdata->fetch(PDO::FETCH_ASSOC);
echo $rowas['userID'];
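For readers wondering how this lookup fits into the import loop, here is a rough sketch. Several things are assumptions made only so the example is runnable: an in-memory SQLite database stands in for the Access connection, the quality table is given a userID column in place of the name columns, and the sample rows are invented.

```php
<?php
// Assumption: SQLite in memory replaces the MS Access connection from connection.php.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE users (userID INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT)');
$db->exec('CREATE TABLE quality (qayear TEXT, qamonth TEXT, userID INTEGER, score TEXT)');
$db->exec("INSERT INTO users (userID, firstname, lastname) VALUES (1, 'Ann', 'Lee')");
$db->exec("INSERT INTO users (userID, firstname, lastname) VALUES (2, 'Bob', 'Ray')");

// Hypothetical CSV rows in the order: year, month, lastname, firstname, score.
$rows = [
    ['2015', '01', 'Lee', 'Ann', '91'],
    ['2015', '01', 'Ray', 'Bob', '78'],
    ['2015', '01', 'Doe', 'Jon', '55'], // no matching user
];

// Prepare both statements once, then reuse them for every row.
$lookup = $db->prepare('SELECT userID FROM users WHERE lastname = :lastname AND firstname = :firstname');
$import = $db->prepare('INSERT INTO quality (qayear, qamonth, userID, score) VALUES (?, ?, ?, ?)');

$skipped = 0;
foreach ($rows as $row) {
    $lookup->execute([':lastname' => $row[2], ':firstname' => $row[3]]);
    $userId = $lookup->fetchColumn(); // false when the name is unknown
    $lookup->closeCursor();
    if ($userId === false) {
        $skipped++;
        continue;
    }
    $import->execute([$row[0], $row[1], $userId, $row[4]]);
}

$imported = (int) $db->query('SELECT COUNT(*) FROM quality')->fetchColumn();
echo "imported: $imported, skipped: $skipped\n";
```

What to do with unmatched names (skip, log, or insert with a null userID) is a design decision the original post does not cover.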

PDO Stored Procedure return value

I'm working with a SQL Server stored procedure that returns error codes; here is a very simple snippet of the SP.
DECLARE @ret int
BEGIN
SET @ret = 1
RETURN @ret
END
I can get the return value with the mssql extension using:
mssql_bind($proc, "RETVAL", &$return, SQLINT2);
However, I can't figure out how to access the return value in PDO; I'd prefer not to use an OUT parameter, as a lot of these stored procedures have already been written. Here is an example of how I am currently calling the procedure in PHP.
$stmt = $this->db->prepare("EXECUTE usp_myproc ?, ?");
$stmt->bindParam(1, 'mystr', PDO::PARAM_STR);
$stmt->bindParam(2, 'mystr2', PDO::PARAM_STR);
$rs = $stmt->execute();
$result = $stmt->fetchAll(PDO::FETCH_ASSOC);
Check out MSDN for info on how to correctly bind to this type of call
Your PHP code should probably be tweaked to look more like this. This may only work if you're calling through ODBC, which is honestly the strongly preferred way to do anything with SQL Server; use the SQL Native Client on Windows systems, and use the FreeTDS ODBC driver on *nix systems:
<?php
$stmt = $this->db->prepare("{?= CALL usp_myproc}");
$stmt->bindParam(1, $retval, PDO::PARAM_STR, 32);
$rs = $stmt->execute();
$rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
echo "The return value is $retval\n";
?>
The key thing here is that the return value can be bound as an OUT parameter, without having to restructure the stored procedures.
Just had this same problem:
<?php
function exec_sproc($sproc, $in_params)
{
global $database;
$stmnt = $database->prepare("EXEC " . $sproc);
if($stmnt->execute($in_params))
{
if($row = $stmnt->fetch())
{
return $row[0];
}
}
return -1;
}
?>
Can't you use SELECT to return the results?
Then you can use a dataset (result set in PHP?) to pick it up.
I don't know PHP, but in C# it's quite simple: use a dataset.
Pretty sure PDO::exec only returns the number of rows; this would be $rs in your example.
If I understand your question properly you shouldn't have to call fetchAll()...
$stmt = $this->db->prepare("EXECUTE usp_myproc ?, ?");
$stmt->bindParam(1, $mystr, PDO::PARAM_STR);
$stmt->bindParam(2, $mystr2, PDO::PARAM_STR);
$rs = $stmt->execute();
echo "The return values are: $mystr , and: $mystr2";
PDOStatement::bindParam
public function callProcedure($sp_name = null, $sp_args = []) {
try {
for($i = 0; $i < count($sp_args); $i++) {
$o[] = '?';
}
$args = implode(',', $o);
$sth = $connection->prepare("CALL $sp_name($args)");
for($i = 0, $z =1; $i < count($sp_args); $i++, $z++) {
$sth->bindParam($z, $sp_args[$i], \PDO::PARAM_STR|\PDO::PARAM_INPUT_OUTPUT, 2000);
}
if($sth->execute()) {
return $sp_args;
}
} catch (PDOException $e) {
$this->error[] = $e->getMessage();
}
}
I had a similar problem and was able to solve it by returning the execute like so...
function my_function(){
$stmt = $this->db->prepare("EXECUTE usp_myproc ?, ?");
$stmt->bindValue(1, 'mystr', PDO::PARAM_STR);
$stmt->bindValue(2, 'mystr2', PDO::PARAM_STR);
return $stmt->execute();
}
All that is left is to call the function using a variable and then analyse said variable.
$result = my_function();
You can now analyse the contents of $result to find the information you're looking for. Please let me know if this helps!
Try $return_value
