I have a PHP script that steps through a folder of tab-delimited files, parsing them line by line and inserting the data into a MySQL database. I cannot use LOAD DATA INFILE because of security restrictions on my server, and I do not have access to the configuration files. The script works fine when parsing one or two smaller files, but when working with several large files I get a 500 error. There do not appear to be any error logs containing messages pertaining to the error, at least none that my hosting provider gives me access to. Below is the code; I am also open to suggestions for alternate ways of doing what I need to do. Ultimately I want this script to fire off every 30 minutes or so, inserting new data and deleting the files when finished.
EDIT: After making the changes Phil suggested, the script still fails, but I now have the following message in my error log: "mod_fcgid: read data timeout in 120 seconds". It looks like the script is timing out. Any idea where I can change that timeout setting?
$folder = opendir($dir);
while (($file = readdir($folder)) !== false) {
    $filepath = $dir . "/" . $file;
    // If it is a file and ends in txt, parse it and insert the records into the db
    if (is_file($filepath) && substr($filepath, strlen($filepath) - 3) == "txt") {
        uploadDataToDB($filepath, $connection);
    }
}
function uploadDataToDB($filepath, $connection) {
    ini_set('display_errors', 'On');
    error_reporting(E_ALL);
    ini_set('max_execution_time', 300);

    $insertString = "INSERT INTO dirty_products values(";
    $count = 1;

    $file = @fopen($filepath, "r");
    while (($line = fgets($file)) !== false) {
        $values = "";
        $valueArray = explode("\t", $line);
        foreach ($valueArray as $value) {
            // Escape single quotes
            $value = str_replace("'", "\'", $value);
            if ($values != "")
                $values = $values . ",'" . $value . "'";
            else
                $values = "'" . $value . "'";
        }
        mysql_query($insertString . $values . ")", $connection);
        $count++;
    }
    fclose($file);
    echo "Count: " . $count . "</p>";
}
The first thing I'd do is use prepared statements (via PDO).
With mysql_query(), you're creating and parsing a brand-new statement for every single insert, and you may be exceeding a limit allowed by your host.
With a prepared statement, only one statement is created and compiled on the database server; it is then simply re-executed with new values for each row.
Example
function uploadDataToDB($filepath, $connection) {
    ini_set('display_errors', 'On');
    error_reporting(E_ALL);
    ini_set('max_execution_time', 300);

    $db = new PDO(/* DB connection parameters */);
    $stmt = $db->prepare('INSERT INTO dirty_products VALUES (?, ?, ?, ?, ?, ?)');
    // match the number of placeholders to the number of TSV fields

    $count = 1;
    $file = @fopen($filepath, "r");
    while (($line = fgets($file)) !== false) {
        $valueArray = explode("\t", $line);
        $stmt->execute($valueArray);
        $count++;
    }
    fclose($file);
    $db = null;
    echo "Count: " . $count . "</p>";
}
Considering you want to run this script on a schedule, I'd avoid the web server entirely and run the script via the CLI using cron or whatever scheduling service your host provides. This will help you avoid any timeout configured in the web server.
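For example, a crontab entry along these lines would run the import every 30 minutes; the PHP binary location and script path here are placeholders you'd adjust for your host:
*/30 * * * * /usr/bin/php /path/to/import_script.php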
Related
I have a set of PHP scripts that load files into a database for an auto-updater program to use later. The program works fine until a file exceeds the 10MB range. The rough idea of the script is that it pulls files from disk in a specific location, and loads them into the database. This allows us to store in source control, and update in sets as needed.
Initially, I thought that I was hitting a limit on the database SQL based on my initial searches. However, after further testing, it seems to be something PHP specific. I checked the Apache error log, but I did not see any errors for this script or the includes. Once the PHP script reaches the addslashes function, the script seems to stop executing. (I added echo statements between each script statement.)
I'm hoping that it is something simple that I am missing, but I couldn't find anything related to addslashes failing after several hours of searching online.
Any ideas?
Thanks in advance.
mysql_connect('localhost', '****', '****') or die('Could not connect to the database');
mysql_select_db('****') or die('Could not select database');

function get_filelist($path)
{
    return get_filelist_recursive("/build/" . $path);
}

function get_filelist_recursive($path)
{
    $i = 0;
    $list = array();

    if (!is_dir($path))
        return get_filedetails($path);

    if ($handle = opendir($path))
    {
        while (false !== ($file = readdir($handle)))
        {
            if ($file != '.' && $file != '..' && $file[0] != '.')
            {
                if (is_dir($path . '/' . $file))
                {
                    $list = $list + get_filelist_recursive($path . '/' . $file);
                }
                else
                {
                    $list = $list + get_filedetails($path . '/' . $file);
                }
            }
        }
        closedir($handle);
        return $list;
    }
}

function get_filedetails($path)
{
    $item = array();
    $details = array();
    $details[0] = filesize($path);
    $details[1] = sha1_file($path);
    $item[$path] = $details;
    return $item;
}

$productset = mysql_query("select * from product where status is null and id=" . $_REQUEST['pid']);
$prow = mysql_fetch_assoc($productset);
$folder = "product/" . $prow['name'];
$fileset = get_filelist($folder);

while (list($key, $val) = each($fileset))
{
    $fh = fopen($key, 'rb') or die("Cannot open file");
    $data = fread($fh, $val[0]);
    $data = addslashes($data);
    fclose($fh);

    $filename = substr($key, strlen($folder) + 1);

    $query = "insert into file(name,size,hash,data,manifest_id) values('" . $filename . "','" . $val[0] . "','" . $val[1] . "','" . $data . "','" . $prow['manifest_id'] . "')";
    $retins = mysql_query($query);

    if ($retins == false)
        echo "BUILD FAILED: $key, $val[0] $val[1].<br>\n";
}

header("Location: /patch/index.php?pid=" . $_REQUEST['pid']);
Don't use addslashes, use mysql_real_escape_string in this case. Also, you could likely be hitting a max_allowed_packet limit by trying to insert such large files. The default value is 1MB.
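Swapping in the escaping function is a one-line change in the loop above, assuming the same mysql connection is already open:
$data = mysql_real_escape_string($data);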
If you use mysqli (which is recommended), you can bind the column as a blob and send its data to the server in chunks rather than as one huge query string.
Also make sure you aren't hitting any PHP memory limits or the maximum execution time.
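A rough sketch of that mysqli approach, assuming a mysqli connection in $mysqli and the file table from the question; the 'b' type marks the data column as a blob, and send_long_data() streams it to the server in pieces so a single giant packet is never built:
$stmt = $mysqli->prepare(
    "INSERT INTO file (name, size, hash, data, manifest_id) VALUES (?, ?, ?, ?, ?)"
);
$null = null; // placeholder for the blob column, which is sent separately below
$stmt->bind_param('sisbs', $filename, $val[0], $val[1], $null, $prow['manifest_id']);

// Stream the file contents in 1 MB chunks instead of one huge string
$fh = fopen($key, 'rb');
while (!feof($fh)) {
    $stmt->send_long_data(3, fread($fh, 1048576)); // parameter index 3 = data column
}
fclose($fh);

$stmt->execute();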
I'm trying to import data from my students.csv file into MySQL using PHP. Each row of the CSV file contains the columns (student_number, fname, lname, level) that should be inserted into the biodata table.
I'm also uploading the students.csv file from my computer.
When I run the page I don't get anything out on the screen.
session_start();
require('includes/dbconnect.php');
require 'includes/header.inc.php';

//check for file upload
if (isset($_FILES['csv_file']) && is_uploaded_file($_FILES['csv_file']['tmp_name'])) {
    //upload directory
    $upload_dir = "C:\Users\DOTMAN\Documents\students.csv";
    //create file name
    $file_path = $upload_dir . $_FILES['csv_file']['name'];
    //move uploaded file to upload dir
    if (!move_uploaded_file($_FILES['csv_file']['tmp_name'], $file_path)) {
        //error moving upload file
        echo "Error moving file upload";
    }

    //open the csv file for reading
    $handle = fopen($file_path, 'r');

    //turn off autocommit and delete the biodata
    mysql_query("SET AUTOCOMMIT=0");
    mysql_query("BEGIN");
    mysql_query("TRUNCATE TABLE biodata") or die(mysql_error());

    while (($data = fgetcsv($handle, 1000, ',')) !== FALSE) {
        //Access field data in $data array ex.
        $student_number = $data[0];
        $fname = $data[1];
        $lname = $data[2];
        $level = $data[3];

        //Use data to insert into db
        $query = "INSERT INTO biodata (student_number, fname, lname, level)
                  VALUES ('$student_number', '$fname', '$lname', '$level')";
        mysql_query($query) or die(mysql_error());
    }
}
I'd suggest loading the CSV file with the LOAD DATA INFILE command; it is much faster than inserting the rows one by one.
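A minimal sketch, assuming the uploaded file path and the biodata column layout from the question, and that LOCAL INFILE is enabled on both the client and the server:
$sql = "LOAD DATA LOCAL INFILE '" . mysql_real_escape_string($file_path) . "'
        INTO TABLE biodata
        FIELDS TERMINATED BY ','
        LINES TERMINATED BY '\\n'
        (student_number, fname, lname, level)";
mysql_query($sql) or die(mysql_error());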
If you only need to do this once, I would consider using something like http://csv2sql.com/
One immediate issue I can see is here:
$upload_dir = "C:\Users\DOTMAN\Documents\students.csv";
//create file name
$file_path = $upload_dir . $_FILES['csv_file']['name'];
You are already assigning the entire path, including the file name, to the $upload_dir variable - and then you're appending the uploaded file name again.
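A corrected sketch of those two lines, keeping the question's directory but dropping the file name from it (the directory path itself is just the example from the question):
$upload_dir = 'C:\Users\DOTMAN\Documents\\';   // directory only, no file name
$file_path  = $upload_dir . $_FILES['csv_file']['name'];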
If you think there are errors in your code, start by adding
ini_set('display_errors', 1);
error_reporting(E_ALL);
to the beginning of your PHP code and fix any warnings/errors displayed. You can then turn off printing error messages by changing the second parameter to 0 in the first call.
Have you debugged the $_FILES array? Try
print_r($_FILES);
before doing anything else.
Solution using PHP
$file = 'path/to.csv';
$lines = file($file);
$firstLine = $lines[0];
foreach ($lines as $line_num => $line) {
    if ($line_num == 0) { continue; } // skip the header row
    $arr = explode(",", $line);
    $column1 = $arr[0];
    $column2 = $arr[1];
    echo $column1 . $column2 . "<br />";
    // put the mysql insert statement here
}
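For the insert placeholder, a minimal sketch against the biodata table from the question, assuming an open mysql connection and that every line has the four expected columns:
$query = sprintf(
    "INSERT INTO biodata (student_number, fname, lname, level) VALUES ('%s', '%s', '%s', '%s')",
    mysql_real_escape_string(trim($arr[0])),
    mysql_real_escape_string(trim($arr[1])),
    mysql_real_escape_string(trim($arr[2])),
    mysql_real_escape_string(trim($arr[3]))
);
mysql_query($query) or die(mysql_error());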
I made a script a while ago that wrote to a file, and I did the same thing here, only adding a part that reads the file before writing it again. What I am trying to achieve is quite simple, but the problem is eluding me: I am trying to make my script write to a file basically holding the following information
views:{viewcount}
date-last-visited:{MM/DD/YYYY}
last-ip:{IP-Adress}
Now I have done a bit of research and tried several methods of reading the data, but none have returned anything. My current code is as follows.
<?php
$filemade = 0;

if (!file_exists("stats")) {
    if (!mkdir("stats")) {
        exit();
    }
    $filemade = 1;
}

echo $filemade;

$hwrite = fopen("stats/statistics.txt", 'w');
$icount = 0;

if (filemade == 0) {
    $data0 = file_get_contents("stats/statistics.txt");
    $data2 = explode("\n", $data0);
    $data1 = $data_1[0];
    $ccount = explode(":", data1);
    $icount = $ccount[1] + 1;
    echo "<br>icount:" . $icount . "<br>";
    echo "data1:" . $data1 . "<br>";
    echo "ccount:" . $ccount . "<br>";
    echo "ccount[0]:" . $ccount1[0] . "<br>";
    echo "ccount[1]:" . $ccount1[1] . "<br>";
}

$date = getdate();
$ip = #$REMOTE_ADDR;
fwrite($hwrite, "views:" . $icount . "\nlast-viewed:" . $date[5] . "/" . $date[3] . $date[2] . "/" . $date[6] . "\nlast-ip:" . $ip);
fclose($hwrite);
?>
the result is always:
views:1
last-viewed://
last-ip:
The view count never goes up, the date never works, and the IP address never shows.
I looked at many sources before finally deciding to ask; I figured I'd get more relevant information this way.
Looking forward to some replies. PHP is my newest language, so I don't know much yet.
What I have tried:
$handle_read = fopen("stats/statistics.txt", "r");//make a new file handle in read mode
$data = fgets($handle_read);//get first line
$data_array = explode(":", $data);//split first line by ":"
$current_count = $data_array[1];//get second item, the value
and
$handle_read = fopen("stats/statistics.txt", "r");//make a new file handle in read mode
$pre_data = fread($handle_read, filesize($handle_read));//read all the file data
$pre_data_array = explode("\n", $pre_data);//split the file by lines
$data = pre_data_array[0];//get first line
$data_array = explode(":", $data);//split first line by ":"
$current_count = $data_array[1];//get second item, the value
I have also tried split instead of explode, but I was told split is deprecated and explode is up-to-date.
Any help would be great, thank you for your time.
Try the following:
<?php
if (!file_exists("stats")) {
    if (!mkdir("stats")) die("Could not create folder");
}

// file() returns an array of file contents or false
$data = file("stats/statistics.txt", FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

if (!$data) {
    if (!touch("stats/statistics.txt")) die("Could not create file");
    // Default values
    $data = array("views:0", "date-last-visited:01/01/2000", "last-ip:0.0.0.0");
}

// Update the data
foreach ($data as $key => $val) {
    // Limit explode to 2 chunks because we could have
    // IPv6 addresses (e.g. ::1)
    $line = explode(':', $val, 2);
    switch ($key) {
        case 0:
            $line[1]++;
            break;
        case 1:
            $line[1] = date('m/d/Y');
            break;
        case 2:
            $line[1] = $_SERVER['REMOTE_ADDR'];
            break;
    }
    $data[$key] = implode(':', $line);
    echo $data[$key] . "<br />";
}

// Write the data back into the file
if (!file_put_contents("stats/statistics.txt", implode(PHP_EOL, $data))) die("Could not write file");
?>
Strange issue I'm having: when I perform a file check with file_exists() or is_file(), it only checks half the files. My script processes a CSV file and inserts the data into the table only if the corresponding file exists on the server. If I remove the file check, everything processes fine. I've double-checked that all the files exist on the server; it just stops halfway through for some reason.
$column_headers = array();
$row_count = 0;

if (mysql_result(
        mysql_query("SELECT count(*) FROM load_test WHERE batch_id='" . $batchid . "'"), 0
    ) > 0) {
    die("Error batch already present");
}

while (($data = fgetcsv($handle, 0, ",")) !== FALSE) {
    if ($row_count == 0) {
        $column_headers = $data;
    } else {
        $dirchk1 = "/temp/files/" . $batchid . "/" . $data[0] . ".wav";
        $dirchk2 = "/files/" . $batchid . "/" . $data[1] . ".wav";

        if (file_exists($dirchk1)) {
            $importword = "INSERT into load_test SET
                word = '" . $data[2] . "',
                batch_id = UCASE('" . $batchid . "'),
                accent = '" . $data[15] . "'
            ";
            mysql_query($importword);
            $word_id = mysql_insert_id();
            echo $word_id . "\n";
        }
    }
    ++$row_count;
}
Try it using the "-e" test condition.
For example:
if (-e $dirchk1) {
    print "File exists\n";
}
Also make sure that variables like $dirchk1 are being populated correctly.
Please check whether it works.
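In PHP, one quick way to verify both the path and the check itself is to log each candidate path together with the result of file_exists() as the loop runs, for example:
$dirchk1 = "/temp/files/" . $batchid . "/" . $data[0] . ".wav";
echo $dirchk1 . " => " . (file_exists($dirchk1) ? "found" : "missing") . "\n";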
The script processed correctly; it was human error on my part when verifying.
I am trying to read 738,627 records from a flat file into MySQL. The script appears to run fine, but it is giving me the memory errors mentioned above.
A sample of the file is:
#export_dategenre_idapplication_idis_primary
#primaryKey:genre_idapplication_id
#dbTypes:BIGINTINTEGERINTEGERBOOLEAN
#exportMode:FULL
127667880285760002817317350
127667880285760002818261461
127667880285760002825372301
127667880285760002827785570
127667880285760002827930241
127667880285760002827987861
127667880285760002828089791
127667880285760002828168361
127667880285760002828192041
127667880285760002829144541
127667880285760002829351511
I have tried increasing the allowed memory using
ini_set("memory_limit","80M");
and it still fails. Do I keep upping this until it runs?
The code in full is
<?php
ini_set("memory_limit", "80M");

$db = mysql_connect("localhost", "uname", "pword");

// test connection
if (!$db) {
    echo "Couldn't make a connection!";
    exit;
}

// select database
if (!mysql_select_db("dbname", $db)) {
    echo "Couldn't select database!";
    exit;
}

mysql_set_charset('utf8', $db);

$delimiter = chr(1);
$eoldelimiter = chr(2) . "\n";

$fp = fopen('genre_application', 'r');
if (!$fp) {echo 'ERROR: Unable to open file.</table></body></html>'; exit;}

$loop = 0;

while (!feof($fp)) {
    $loop++;
    $line = stream_get_line($fp, 128, $eoldelimiter); //use 2048 if very long lines
    if ($line[0] === '#') continue; //Skip lines that start with #
    $field[$loop] = explode($delimiter, $line);
    $fp++;
    $export_date = $field[$loop][0];
    $genre_id = $field[$loop][1];
    $application_id = $field[$loop][2];

    $query = "REPLACE into genre_apps
              (export_date, genre_id, application_id)
              VALUES ('$export_date','$genre_id','$application_id')";
    print "SQL-Query: " . $query . "<br>";

    if (mysql_query($query, $db)) {
        echo " OK !\n";
    } else {
        echo "Error<br><br>";
        echo mysql_errno() . ":" . mysql_error() . "</font></center><br>\n";
    }
}

fclose($fp);
?>
Your loop fills the variable $field for no reason (it writes to a different cell on every loop iteration), thereby using up more memory with every line.
You can replace:
$field[$loop] = explode ($delimiter, $line);
$export_date = $field[$loop][0];
$genre_id = $field[$loop][1];
$application_id = $field[$loop][2];
With:
list($export_date, $genre_id, $application_id) = explode($delimiter, $line);
For improved performance, you could take advantage of REPLACE INTO's ability to insert several rows at once by grouping N rows into a single query.
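A rough sketch of that batching, reusing the connection and delimiters from the question; the batch size of 500 rows is an arbitrary choice:
$batch = array();
$insertHead = "REPLACE INTO genre_apps (export_date, genre_id, application_id) VALUES ";

while (!feof($fp)) {
    $line = stream_get_line($fp, 128, $eoldelimiter);
    if ($line === false || $line[0] === '#') continue; // skip failed reads and # comment lines
    list($export_date, $genre_id, $application_id) = explode($delimiter, $line);
    $batch[] = "('$export_date','$genre_id','$application_id')";

    if (count($batch) >= 500) {
        mysql_query($insertHead . implode(',', $batch), $db) or print(mysql_error());
        $batch = array();
    }
}

// flush whatever is left over
if (count($batch) > 0) {
    mysql_query($insertHead . implode(',', $batch), $db) or print(mysql_error());
}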