I am using PHP to expose vehicle GPS data from a CSV file. This data is captured at least every 30 seconds for over 70 vehicles and includes 19 columns of data. This produces several thousand rows of data and file sizes around 614 KB. New data is appended to the end of the file. I need to pull out the last row of data for each vehicle, which should represent its most recent status. I am able to pull out one row for each unit; however, since the CSV file is in chronological order, I am pulling out the oldest data in the file instead of the newest. Is it possible to read the CSV from the end to the beginning? The solutions I have seen typically involve loading the entire file into memory and then reversing it, which sounds very inefficient. Do I have any other options? Thank you for any advice you can offer.
EDIT: I am using this data to map real-time locations on-the-fly. The data is only provided to me in CSV format, so I think importing into a DB is out of the question.
With fseek() you can set the pointer to the end of the file and move it backwards with a negative offset to read the file in reverse.
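For example, a minimal sketch of that idea (assuming PHP 5.5+ for generators, newline-terminated rows, and that the vehicle ID sits in the first column; adjust as needed): read the file backwards in chunks, emit complete lines in reverse order, and stop once every vehicle has been seen.
// Sketch: read a file backwards in chunks and yield complete lines in reverse order.
function readLinesReverse($path, $chunkSize = 4096) {
    $fh = fopen($path, 'r');
    fseek($fh, 0, SEEK_END);
    $pos = ftell($fh);
    $buffer = '';
    while ($pos > 0) {
        $read = min($chunkSize, $pos);
        $pos -= $read;
        fseek($fh, $pos, SEEK_SET);
        $buffer = fread($fh, $read) . $buffer;
        // Emit every complete line currently at the end of the buffer
        while (($nl = strrpos($buffer, "\n")) !== false) {
            $line = rtrim(substr($buffer, $nl + 1), "\r");
            $buffer = substr($buffer, 0, $nl);
            if ($line !== '') {
                yield $line;
            }
        }
    }
    fclose($fh);
    if ($buffer !== '') {
        yield rtrim($buffer, "\r"); // first line of the file
    }
}

// Usage sketch: keep the newest row per vehicle and stop early once all are seen.
$latest = array();
foreach (readLinesReverse('/path/to/yourfile.csv') as $line) {
    $row = str_getcsv($line);
    $vehicleId = $row[0];            // assumption: vehicle ID is the first column
    if (!isset($latest[$vehicleId])) {
        $latest[$vehicleId] = $row;  // first time seen while reading backwards = newest
    }
    if (count($latest) >= 70) {      // optional early exit once all vehicles are seen
        break;
    }
}
// Note: the header row (if any) is yielded last; drop it from $latest if needed.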
If you must use CSV files instead of a database, then perhaps you could read the file line by line. This keeps no more than the last line in memory (thanks to the garbage collector).
$handle = @fopen("/path/to/yourfile.csv", "r");
if ($handle) {
    while (($line = fgets($handle)) !== false) {
        // old values of $last are garbage collected after re-assignment
        $last = $line;
        // you can perform optional computations on past data here if desired
    }
    if (!feof($handle)) {
        echo "Error: unexpected fgets() fail\n";
    }
    fclose($handle);
    // $last will now contain the last line of the file.
    // You may now do whatever you like with it.
}
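For the per-vehicle requirement in the question, a small variation of that loop (a sketch; which column holds the vehicle ID is an assumption) keeps only the newest row seen so far for each vehicle, since later rows overwrite earlier ones:
// Sketch: column 0 as the vehicle ID is an assumption; adjust to the real layout.
$latest = array();
$handle = @fopen("/path/to/yourfile.csv", "r");
if ($handle) {
    while (($row = fgetcsv($handle)) !== false) {
        // later rows for the same vehicle overwrite earlier ones
        $latest[$row[0]] = $row;
    }
    fclose($handle);
}
// $latest now holds the chronologically last row for each of the ~70 vehicles.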
edit: I did not see the fseek() post. If all you need is the last line, then that is the way to go.
I'm parsing a 1,000,000-line CSV file in PHP to recover this data: IP address, DNS, and the cipher suites used.
In order to know whether some DNS names (having several mail servers) have different cipher suites in use on their servers, I have to store in an array an object containing the DNS name, a list of the IP addresses of its servers, and a list of the cipher suites it uses. At the end I have an array of 1,000,000 elements. To count the number of DNS names having different cipher suite configs on their servers I do:
$res = 0;
foreach ($this->allDNS as $dnsObject) {
    if (count($dnsObject->getCiphers()) > 1) { // if it has several different configs
        $res++;
    }
}
return $res;
Problem: this consumes too much memory, and I can't run my code on the 1,000,000-line CSV (if I don't store this data in an array, I parse the CSV file in 20 seconds). Is there a way to get around this problem?
NB: I already put
ini_set('memory_limit', '-1');
but that line just bypasses the memory error.
Saving all of that CSV data will definitely take its toll on memory.
One logical solution to your problem is to have a database store all of that data.
You may refer to this link for a tutorial on parsing your CSV file and storing it in a database.
Write the processed data (for each line separately) into one file (or database):
file_put_contents('data.txt', $parsingresult, FILE_APPEND);
FILE_APPEND will append $parsingresult to the end of the file's content.
Then you can access the processed data with file_get_contents() or file().
Anyway, I think using a database and some pre-processing would be the best solution if this is needed more often.
You can use fgetcsv() to read and parse the CSV file one line at a time. Keep the data you need and discard the line:
// Store the useful data here
$data = array();
// Open the CSV file
$fh = fopen('data.csv', 'r');
// The first line probably contains the column names
$header = fgetcsv($fh);
// Read and parse one data line at a time
while ($row = fgetcsv($fh)) {
    // Get the desired columns from $row
    // Use $header if the order or number of columns is not known in advance
    // Store the gathered info into $data
}
// Close the CSV file
fclose($fh);
This way it uses the minimum amount of memory needed to parse the CSV file.
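Applied to the DNS/cipher question above, a sketch along those lines (the column positions are assumptions) keeps only a small set of distinct cipher suites per DNS name instead of one object per CSV line:
// Sketch: columns 0 = IP address, 1 = DNS name, 2 = cipher suite (assumed order).
$ciphersByDns = array();
$fh = fopen('data.csv', 'r');
while ($row = fgetcsv($fh)) {
    // use the cipher suite as an array key so duplicates collapse (set semantics)
    $ciphersByDns[$row[1]][$row[2]] = true;
}
fclose($fh);

// Count DNS names whose servers use more than one cipher suite config
$res = 0;
foreach ($ciphersByDns as $ciphers) {
    if (count($ciphers) > 1) {
        $res++;
    }
}
echo $res;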
Using PHP, is it possible to load just a single record / row from a CSV file?
In other words, I would like to treat the file as an array, but don't want to load the entire file into memory.
I know this is really what a database is for, but I am just looking for a down and dirty solution to use during development.
Edit: To clarify, I know exactly which row contains the info I am looking for.
I would just like to know if there is a way to get it without having to read the entire file into memory.
As I understand it, you are looking for a row containing certain data. Therefore you could probably implement the following logic:
(1) scan file for the given data (ex. value which is in the row that you are trying to find),
(2) load only this line of file,
(3) perform your operations on that line.
fgetcsv() operates on a file resource handle, so if you know the position of the line you want, you can fseek() the resource to that position and use fgetcsv() normally.
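For example (a sketch, assuming the byte offset of the wanted line was recorded earlier with ftell()):
// Sketch: jump straight to a previously recorded byte offset and parse one line.
$fp = fopen('data.csv', 'r');
fseek($fp, $savedOffset);  // $savedOffset was captured earlier via ftell()
$row = fgetcsv($fp);       // parses the single line starting at that offset
fclose($fp);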
If you don't know which line you are looking for until after you have read it, your best bet is reading records until you find the right one by testing the array that is returned:
$fp = fopen('data.csv', 'r');
while (false !== ($data = fgetcsv($fp, 0, ','))) {
    // fgetcsv() returns a numerically indexed array; test the relevant column
    if ($data[0] === 'somevalue') {
        echo 'Hurray';
        break;
    }
}
fclose($fp);
If you are looking to read a specific line, use an SplFileObject and seek() to the record number. current() will then return a string that you must convert to an array:
$file = new SplFileObject('data.csv');
$file->seek(2);
$record = $file->current();
$data = explode(",", $record);
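Alternatively, SplFileObject can do the CSV parsing itself via the READ_CSV flag, so current() returns an array directly (a variant sketch, assuming the default comma delimiter):
$file = new SplFileObject('data.csv');
$file->setFlags(SplFileObject::READ_CSV);
$file->seek(2);            // zero-based index of the wanted record
$data = $file->current();  // already an array of fields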
I have a CSV file with 104 fields, but I need only 4 of them to use in a MySQL database. Each file has about a million rows.
Could somebody tell me an efficient way to do this? Reading each line into an array takes a long time.
Thanks
You have to read every line in its entirety by definition. This is necessary to find the delimiter for the next record (i.e. the newline character). You only need to discard the data you have read that you don't need. E.g.:
$data = array();
$fh = fopen('data.csv', 'r');
$headers = fgetcsv($fh);
while ($row = fgetcsv($fh)) {
    $row = array_combine($headers, $row);
    $data[] = array_intersect_key($row, array_flip(array('foo', 'bar', 'baz')));
    // alternatively, if you know the column index, something like:
    // $data[] = array($row[1], $row[45], $row[60]);
}
fclose($fh);
This only retains the columns foo, bar and baz and discards the rest. Reading from the file (fgetcsv) is about as fast as it gets. If you need it any faster, you'll have to implement your own CSV tokenizer and parser which skips over the columns you don't need without even temporarily storing them in memory; whether the performance boost is worth the development time needed to implement that bug-free is very debatable.
A simple Excel macro can drop all unnecessary columns (100 out of 104) within a second. I am looking for a similar solution.
That is because Excel, once a file is opened, has all data in memory and can act on it very quickly. For an accurate comparison you need to compare the time it takes to open the file in Excel + dropping of the columns, not just dropping the columns.
I have records in a .CSV file and I want to import them into a MySQL database.
Whenever I import the .CSV I get the message "Import has been successfully finished...", but only 79 out of 114 records are inserted into the database.
When I try to import the .CSV file with 411 records, just 282 are inserted. The CSV file with 411 records includes two categories of records, Active and Sold, of which 114 records are Active.
Has anyone run into this type of problem? If so, what should be done?
I wrote my own CSV importer in PHP. I use the PHP function fgetcsv() to read the CSV file and then I use a MySQL INSERT command in a loop.
$handle = fopen($this->file, "r");
$i = 0;
$delimiter = ($this->fieldDelimiter == 'TAB') ? chr(9) : $this->fieldDelimiter;
while (($data = fgetcsv($handle, 10000, $delimiter)) !== FALSE)
{
    $mydata[] = $data;
}
fclose($handle);
reset($mydata);
if ($this->CSVhasTitle)
{
    $mydata = array_slice($mydata, 1); // delete first row
}
Then I loop through my array and use a MySQL INSERT:
foreach ($mydata as $value)
{
    // pseudocode: build and execute an INSERT statement for each row
    // INSERT INTO $table (...) VALUES (....)
}
But I add the exact column names into the array before the loop; I have an array of all column names.
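For reference, a sketch of that loop using a prepared statement (the PDO connection details and the table/column names below are placeholders, not the original schema); this also keeps quotes and commas in the values from breaking the query:
// Placeholder connection and schema; adjust to the real database and columns.
$pdo = new PDO('mysql:host=localhost;dbname=test;charset=utf8', 'user', 'pass');
$stmt = $pdo->prepare('INSERT INTO my_table (col_a, col_b, col_c) VALUES (?, ?, ?)');

foreach ($mydata as $value) {
    // $value is one parsed CSV row; bind only the columns that are needed
    $stmt->execute(array($value[0], $value[1], $value[2]));
}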
I had this problem too. Even though this question is a bit old and the other recommendations go in a different direction, here is the solution I found. I was building a large database import and had the same thing happen. After some trial and error I realized that a key had somehow been assigned to the table that I had not created myself (I didn't notice it because I was using the new phpMyAdmin skin and normally use the old one). Once I removed that stray key and left only the primary key, the upload absorbed all the data.
I also ran into view and request-timeout issues related to your question: I pushed the row display limit up into the thousands and got stuck, unable to reach the config file from my control panel, and now the table hangs before I can even run SQL to drop it. So, food for thought: keep your table view limit low. That is inconvenient in my case, because I need to look over 17,000 records visually to make sure my .csv was correct, but it avoids the hangs.
Take a look at your CSV file. It very likely contains something like
1,2,"some data",1
2,5,"data,with,comma",2
If you don't specify COLUMNS OPTIONALLY ENCLOSED BY '"' (single quote, double quote, single quote), then the commas embedded in the string data in the second row, third column will not be imported properly.
Check the CSV to see what enclosure character is being used and specify that in the phpmyadmin interface.
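If the import is done from code instead of the phpMyAdmin form, the same enclosure rule can be expressed in a LOAD DATA statement (a sketch; the file path and table name are placeholders, and local_infile must be enabled on both client and server):
// Sketch with placeholder path/table; requires local_infile to be enabled.
$pdo = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass', array(
    PDO::MYSQL_ATTR_LOCAL_INFILE => true,
));
$sql = <<<'SQL'
LOAD DATA LOCAL INFILE '/path/to/data.csv'
INTO TABLE my_table
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
SQL;
$pdo->exec($sql);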
I have text files containing structured data (it is a proprietary format and not something simple or common like CSV). I'm trying to put this data into a database. The text files are as large as 50GB so it's impossible for me to read the entire file into memory, extract it into an array, then process it into the database.
The text files are structured in such a way that data on a particular "item" (a specific id in the database) can have multiple lines (new lines) of information in the text file. Items in the text file always start with a line that begins with '01' and can have an infinite number of additional lines (all one after the other), that will all start with 02 or 03 ... up to 08. A new item begins when a new line starts with 01.
01some_data_about_the_first_item
02some_more_data_about_the_first_item
05more_data_about_the_first_item
01the_first_line_of_the_second_item
I'd like to use PHP to process this data.
How can I load a piece of this text file into memory where I can analyze it, get all the lines for an item, and then process it? Is there a way to load all lines up to the next line that starts with 01, process that data, then begin the next scan of the text file at the end of the last scan?
Processing the data once I've loaded pieces of it into memory is not the problem.
Sure. Since you tagged the question with csv, I'll assume you have a CSV file. In that case, fgetcsv is a good function to use, which gets one line from the file at a time. Using that you can read as many lines as you need for one record, process it, then continue with the next one. Rough mockup:
$fh = fopen('file.csv', 'r');
$record = array();
do {
    $line = fgetcsv($fh);
    if ($line && substr($line[0], 0, 2) != '01') {
        // any line that does not start with 01 is part of the current record,
        // adjust condition as necessary
        $record[] = $line;
    } else {
        if ($record) {
            /* put current $record into database */
        }
        // start the next record with its leading 01 line (if any)
        $record = $line ? array($line) : array();
    }
} while ($line);
fclose($fh);
Here is a start:
<?php
$fp = fopen('big.txt', 'r');
while ($line = fgets($fp)) {
    $number = substr($line, 0, 2);
    $data = substr($line, 2);
    // process each line
    echo $number . ' - ' . $data;
}
fclose($fp);
?>
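Extending that start into grouping complete items (a sketch; processItem() is a placeholder for whatever database insert is used):
<?php
// Sketch: collect the lines of one item, flush it when the next '01' line starts.
function processItem(array $item) {
    // placeholder: insert the item's lines into the database here
}

$fp = fopen('big.txt', 'r');
$item = array();
while (($line = fgets($fp)) !== false) {
    $line = rtrim($line, "\r\n");
    if (substr($line, 0, 2) === '01' && $item) {
        processItem($item); // the previous item is complete
        $item = array();
    }
    $item[] = $line;
}
if ($item) {
    processItem($item); // the last item in the file
}
fclose($fp);
?>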