Handle CSV import with large array through a three-step process - PHP

I need some help with a project of mine. It is about a DVD database. At the moment I am planning to implement a CSV import function to bring in DVDs with all their information from a file.
I will do this in three steps.
Step 1
- show the data I want to import, building an array
- import the data, building session arrays
Step 2
- edit the information
Step 3
- show the result before updating
- update the data
So far it works, but I have a problem with large files. The CSV data has 20 columns (title, genre, plot, etc.), and for each line in the CSV I create several arrays to use in the next steps.
When the file has more than about 500 lines, the browser often hangs during the import and I get no response.
Now I am trying to do this as an AJAX call process. The advantage is that I can define how many rows the system handles per call, and the user can see that the system is still working, like a status bar when downloading/uploading a file.
At the moment I am trying to find a useful example illustrating how I can do this, but I have not found one so far.
Maybe you have some tips or an example of how this could work, say processing 20 lines per call while building the array.
Afterwards I would like to use the same function to build the session arrays used in the next step, and so on.
Some information:
I use fgetcsv() to read the rows from the file. I go through the rows, and for each column I have different checks, such as: is the item id unique, does the title exist, does the description exist, etc.
So if one of these values is missing, I get an error telling me in which row and column the error occurred.
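To illustrate, the validation loop looks roughly like this (a simplified sketch; the file name, delimiter, and column positions are just examples):

$errors = [];
$rows   = [];
if (($handle = fopen('dvds.csv', 'r')) !== false) {
    $lineNo = 0;
    while (($cols = fgetcsv($handle, 0, ';')) !== false) {
        $lineNo++;
        if ($cols[0] === '') {
            $errors[] = "Row $lineNo, column 1: item id missing";
        }
        if ($cols[1] === '') {
            $errors[] = "Row $lineNo, column 2: title missing";
        }
        // ... further checks: unique id, description present, etc. ...
        $rows[] = $cols; // collected for the next step
    }
    fclose($handle);
}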
I'd appreciate any help.

Use the 'LOAD DATA INFILE' syntax. I've used it on files upwards of 500 MB with 3 million rows, and it takes seconds, not minutes.
http://dev.mysql.com/doc/refman/5.0/en/load-data.html
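From PHP it can be run through PDO, for example (a sketch; the file path, table, and column names are placeholders, and LOCAL INFILE must be enabled on both the client and the server):

$pdo = new PDO(
    'mysql:host=localhost;dbname=dvddb;charset=utf8mb4',
    'user',
    'pass',
    [PDO::MYSQL_ATTR_LOCAL_INFILE => true] // required for LOCAL INFILE
);

// Bulk-load the CSV straight into the table, skipping the header row.
$pdo->exec("
    LOAD DATA LOCAL INFILE '/tmp/dvds.csv'
    INTO TABLE dvds
    FIELDS TERMINATED BY ',' ENCLOSED BY '\"'
    LINES TERMINATED BY '\\n'
    IGNORE 1 LINES
    (item_id, title, genre, plot)
");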

While this is not the direct answer you were looking for:
500 lines shouldn't take too long to process, so here's another thought for you.
Create a temporary table with the right field structure.
You can then extract the various unique entries for the plot, genre, etc. from it using SELECT statements, rather than building a bunch of arrays along the way.
A MySQL import of your data would be very fast.
You can then edit it as required, and finally insert the data from your temporary, now validated, table into your final table.
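For illustration, that route might look like this (a sketch; the table and column names are assumptions):

// Staging table matching the CSV structure (remaining columns omitted).
$pdo->exec("
    CREATE TEMPORARY TABLE dvd_import (
        item_id INT,
        title   VARCHAR(255),
        genre   VARCHAR(100),
        plot    TEXT
    )
");
// Bulk-load the CSV into it (e.g. with LOAD DATA INFILE as above), then
// pull the distinct values out instead of building arrays in PHP:
$genres = $pdo->query('SELECT DISTINCT genre FROM dvd_import')
              ->fetchAll(PDO::FETCH_COLUMN);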
In terms of doing it with AJAX, you would have to use a repeating timed event to refresh the status. The catch is that, rather than every 20 lines, it would need to be a specific time period, as your browser has no way of knowing the progress, assuming the CSV is uploaded and you process it in 20-line chunks.
If you pasted the CSV into a big textbox, you could work through it by taking the first 20 lines and passing the remainder on to the next request, etc., but that strikes me as a potential mess.
So, while I know I've not answered your question directly, I hope I've given you food for thought on alternative and possibly more practical approaches.
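That said, the chunked-AJAX endpoint itself could be sketched like this (the session keys, file handling, and 20-line chunk size are assumptions; the client calls it repeatedly and updates a progress bar from the JSON response):

// import_chunk.php - processes 20 CSV lines per AJAX call.
session_start();

$chunkSize = 20;
$offset    = isset($_SESSION['import_offset']) ? $_SESSION['import_offset'] : 0;
$handle    = fopen($_SESSION['import_file'], 'r'); // path stored at upload time

// Skip the lines already handled in earlier calls.
for ($i = 0; $i < $offset && fgetcsv($handle) !== false; $i++);

$processed = 0;
while ($processed < $chunkSize && ($cols = fgetcsv($handle)) !== false) {
    // ... validate the columns, build the session arrays for step 2 ...
    $_SESSION['import_rows'][] = $cols;
    $processed++;
}
$done = feof($handle);
fclose($handle);

$_SESSION['import_offset'] = $offset + $processed;
header('Content-Type: application/json');
echo json_encode(['processed' => $_SESSION['import_offset'], 'done' => $done]);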

Related

Is there a way to store multiple records rather than using multiple rows in MySQL?

I would like to make full use of MySQL for the purposes of a (web) application I have developed for a chiropractor.
So far I have been storing one row per year for what are called progress notes. The table structure looks something like this: (progress_note_id, patient_id, date (Y-0-0), progress_note). When the client wishes to append to the current year's progress notes, he simply clicks at the top of a textarea (HTML), for which I use the TinyMCE JavaScript library, to add a new entry date along with the shorthand notes at the beginning of the column (progress_note). So far it has been working OK, but with 900+ clients (est.) there could potentially be 1300+ progress notes for each year since the beginning of the application (2018).
Now the client wishes to be able to see previous progress notes (history) but be unable to modify them, while still being able to write new ones. The solution I have come up with is to use XML inside the textarea and use PHP to separate the new notes from the old ones.
My problem, however, is that if I have to convert my entire table from yearly to daily rows, it could take a lot of time and energy to split the combined notes into single rows (est. 10x), which could end up being 13,000+ rows. I realize that whatever method I choose will be a lot of work. Another way around this, I found, might be to use an XML column type in MySQL to store multiple records; if I wish to append, all I need is PHP to parse the entire XML and add a new child node at the beginning. Each progress note is 255-500 chars, and in the worst-case scenario, if a patient were to visit 52 times a year (once every week), there shouldn't be too large an overhead.
Is this the correct way to solve this problem? I do wish to stay with a MySQL DB, and I realize that MySQL is not intended for XML. For some clarification, what I hope to accomplish is the same thing I currently do with progress notes, but with XML, kept in order from newest to oldest:
<xml_result>
<progress_note>
<date>2020-08-16</date>
<content></content>
</progress_note>
</xml_result>
Thank you for your time and any suggestions.
Firstly, 13,000+ rows is not a problem for MySQL. For most web applications, a single MySQL instance can handle 10M+ records with good performance.
Secondly, you can use either XML or JSON format in a text field and handle the decoding in your application.
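For the JSON variant, appending a note could be sketched like this (the table and column names are assumptions; the existing notes are decoded, the new entry is prepended so the newest comes first, and the result is re-encoded):

// Fetch the current notes for the patient (an empty array if none yet).
$stmt = $pdo->prepare('SELECT progress_note FROM progress_notes WHERE patient_id = ?');
$stmt->execute([$patientId]);
$notes = json_decode($stmt->fetchColumn() ?: '[]', true);

// Prepend the new note, matching the newest -> oldest ordering described above.
array_unshift($notes, [
    'date'    => date('Y-m-d'),
    'content' => $newNoteText,
]);

$stmt = $pdo->prepare('UPDATE progress_notes SET progress_note = ? WHERE patient_id = ?');
$stmt->execute([json_encode($notes), $patientId]);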

Can I display a million records in an HTML table in one go?

I'm currently creating a web-based system that could accumulate millions of records over the years (3 years = 1 million records, just guessing).
Right now I have a webpage where I display all records in an HTML table dynamically.
When the time comes, can it display that amount of data?
What are the things I need to consider?
What about hardware requirements (for the server, presumably)?
The setup would be a LAN used by 7 users simultaneously.
Any help would be appreciated.
[My PHP code and the resulting output were attached as images.]
I guess your browser will crash if you display this inside a table or list.
The only way I see is to lazy-load and keep the DOM as small as possible while scrolling through.
Why do you want to display one million records?
It's possible, but the browser will probably crash.
The best approach is pagination that displays, say, 100 records per page, together with a search function.
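A minimal sketch of that (the table and column names are assumptions):

// Server-side pagination: 100 records per page.
$perPage = 100;
$page    = max(1, (int)($_GET['page'] ?? 1));
$offset  = ($page - 1) * $perPage;

$stmt = $pdo->prepare('SELECT id, name FROM records ORDER BY id LIMIT :lim OFFSET :off');
$stmt->bindValue(':lim', $perPage, PDO::PARAM_INT);
$stmt->bindValue(':off', $offset, PDO::PARAM_INT);
$stmt->execute();

echo '<table>';
foreach ($stmt as $row) {
    echo '<tr><td>', htmlspecialchars($row['name']), '</td></tr>';
}
echo '</table>';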
It's not human readable.
You can fetch all the data from the database and save it in an array,
then search the array and get the result you want.
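A minimal sketch of that idea (table and column names are assumptions; it only makes sense while the dataset still fits comfortably in memory):

// Load everything once, then filter in PHP.
$rows = $pdo->query('SELECT id, name FROM records')->fetchAll(PDO::FETCH_ASSOC);

$needle  = $_GET['q'] ?? '';
$matches = array_filter($rows, function ($row) use ($needle) {
    return stripos($row['name'], $needle) !== false;
});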
I am using 1 million cells (not rows) in a plain HTML table.
It takes some time to download the data and some more to render it. But if you really need it, you can display it. I don't see a production use case; I only display it to spot inconsistent data. So, it is not going to prod.
There are multiple component libraries to handle continuous scroll.

Elegant PHP parsing solution for large pipe delimited text source files

I'm currently attempting to come up with a solution to the following problem:
I have been tasked with parsing large (±3,500 lines, 300 KB) pipe-delimited text files and comparing them line by line to corresponding codes within our database. An example of a file would be:
File name: 015_A.txt
File content (example shows only 4 lines):
015|6999|Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old.|1|1|0|0|2016/01/01
015|3715|It has roots in a piece of classical Latin literature from 45 BC|1|1|213.5|213.5|2016/01/01
015|3724|Making it over 2000 years old.|1|1|617.4|617.4|2016/01/01
015|4028|Words will go here.|1|1|74|74|2016/01/01
I will be providing a web interface which I have already built to allow a file to be selected from the browser and then uploaded to the server.
From the above example pipe file I will only be using these fields:
Code (using above line 1 as an example: 6999)
Price (using above line 1 as an example: 0)
I would then (to my mind; I am not sure if this is the best method) need to run a query (our DB is MSSQL) for each line, for example:
SELECT t.Price
FROM table t
WHERE t.code = '6999'
If t.Price === 0, then line 1 has passed, as it matches the source file.
This is where I believe I just need some advice, as I am sure there are many ways to tackle this problem; I would like, if possible, to be pointed toward an efficient approach. (What is the best method of parsing the file? Do I run a query per code, or rather one SQL statement using an IN clause and then compare every code and price? Should I scrap this idea and use some pure SQL tool, bearing in mind I have a pipe-delimited file to import?)
Any advice would be greatly appreciated.
Your story appears to end somewhat prematurely. Is checking that the values in the database match those in the file the only thing this script should do? If so, it would be simpler just to extract the data from the database and overwrite the file. If not, then this implies you need to retain some record of the variations.
This has some bearing on the approach taken to the reconciliation; running 3,500 queries against the database is going to take some time - mostly spent on the network and in query parsing (i.e. wasted). OTOH, comparing 3,500 records in a single SELECT to find mismatches will take no time at all.
The problem is that your data is out at the client, and uploading via a browser only gets it halfway to the database. If you create another table in the database (not a temporary table - add a column to identify the file), it is possible to INSERT multiple rows in a single DML statement. Batch them up in lots of 100 or so records, meaning you only need to execute about 36 queries to complete the operation - and you then have a record of the data in the database, which simplifies how you report the mismatches.
You probably should not use the DBMS-supplied utilities for direct import unless you ABSOLUTELY trust the source data.
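To make that concrete, here is a sketch of the batched load and the single reconciliation query (the table and column names are placeholders, and the price is assumed to be the sixth pipe-delimited field):

// Read the uploaded pipe file and load it into a staging table, 100 rows per INSERT.
$lines = file('015_A.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

foreach (array_chunk($lines, 100) as $batch) {
    $placeholders = implode(',', array_fill(0, count($batch), '(?, ?, ?)'));
    $stmt = $pdo->prepare("INSERT INTO staging_prices (file_name, code, price) VALUES $placeholders");
    $params = [];
    foreach ($batch as $line) {
        $cols = explode('|', $line);
        $params[] = '015_A.txt';
        $params[] = $cols[1]; // code, e.g. 6999
        $params[] = $cols[5]; // price
    }
    $stmt->execute($params);
}

// One SELECT reports every code whose price disagrees with the database.
$mismatches = $pdo->query(
    'SELECT s.code, s.price AS file_price, t.Price AS db_price
     FROM staging_prices s
     JOIN prices t ON t.code = s.code
     WHERE t.Price <> s.price'
)->fetchAll();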

How to parse CSV file in PHP and store fields in a database?

I need help parsing the following CSV file in PHP so I can insert the contents into a database.
I know I can use file_get_contents(), but after that I feel a bit lost.
What I'd like to store:
Collection1 - events.text & date
Collection2 - position & name.text & total
I'm not sure how best to structure the data for insertion into a database table.
"**collection1**"
"events.href","**events.text**","**date**","index","url"
"tur.com/events/classic.html","John Deere Classic","Thursday Jul 9
- Sunday Jul 12, 2015","1","tur.com/r.html"
"collection2"
"**position**","name.href","**name.text**","**total**","index","url"
"--","javascript:void(0);","Scott","--","2","tur.com/r.html"
"--","javascript:void(0);","Billy","--","3","tur.com/r.html"
"--","javascript:void(0);","Jon","--","4","tur.com/r.html"
"--","javascript:void(0);","Bill","--","5","tur.com/r.html"
"--","javascript:void(0);","Tim","--","6","tur.com/r.html"
"--","javascript:void(0);","Carlos","--","7","tur.com/r.html"
"--","javascript:void(0);","Robert","--","8","tur.com/r.html"
"--","javascript:void(0);","Rod","--","9","tur.com/r.html"
As per your previous question, I think this needs to be broken down into sections. As it stands it is rather too broad to answer.
Read the information using file_get_contents(). Make sure this works first, by echoing it to the console. (It sounded from your other question that you felt this would not work if the URL does not have a .csv suffix. It should work regardless of the file extension - try it. If it fails, it may depend on cookies, JavaScript, or some other problem.)
Design and create your table structure in MySQL. It seems like you have two tables. They should both have a primary key. Are they related in some fashion? If so, perhaps one has a foreign key to the other one?
Explode your text file on the new line character and loop across the resulting array of lines.
If your CSV data has a title row in the first row position, delete that from your array.
For each line, read the elements of interest using PHP's built-in CSV parsing functions, and store them in variables.
Pass these variables to a custom function that saves the data.
For each save, you'll need to do an INSERT. I recommend using PDO here. Make sure you bind your parameters.
Where you get stuck on a specific problem, you can ask a new and focused question. At present, the task is to break things down into discrete and researchable pieces.
One trick worth remembering is this shortcut to the PHP manual. If you do not know how fgetcsv works, for example, type php.net/fgetcsv into your browser address bar, and the PHP site will find the function for you. The documentation is excellent.
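Putting the first few steps together, a minimal sketch for the events collection (the file, table, and column names are assumptions):

// Parse collection1 with fgetcsv() and insert the fields of interest via PDO.
$pdo = new PDO('mysql:host=localhost;dbname=golf;charset=utf8mb4', 'user', 'pass');
$stmt = $pdo->prepare('INSERT INTO events (event_text, event_date) VALUES (:text, :date)');

$handle = fopen('collection1.csv', 'r');
fgetcsv($handle); // skip the title row

while (($cols = fgetcsv($handle)) !== false) {
    // Columns per the sample: events.href, events.text, date, index, url
    $stmt->execute([
        ':text' => $cols[1], // events.text
        ':date' => $cols[2], // date
    ]);
}
fclose($handle);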

PHP Search through CSV File

I have a CSV file which is more than 5 MB. As an example, it looks like this:
id,song,album,track
200,"Best of 10","Best of 10","No where"
230,"Best of 10","Best of 10","Love"
4340,"Best of 10","Best of 10","Al Road"
I have plenty of records. I need to get one record at a time. Assume I need to get the details of id 8999. My main question is: is there any method to get an exact record from the CSV, just like we query a MySQL table?
Other than that, I have the following solutions in mind to perform the task. Which would be the best way?
Read the CSV and load it into an array, then search through it and get the data. (I have to get all the records; the order is different, which is why I face this issue. Otherwise I could go record by record.)
Export the CSV to a MySQL database and then query that table.
Read the CSV each time without loading it into an array.
If I can search the CSV file quickly, that would be great. I appreciate your suggestions.
Thank you.
I'm not aware of any libraries that allow you to directly query a CSV file. If there are, that would be your best bet. If not, here's another solution.
If you want to get details of id 8999, it would be extremely memory inefficient to load in the entire file.
You could do something like this:
Read in a line using fgets(), explode on the comma, and check the 0th element. If it is not the ID you want, discard the line and repeat.
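A sketch of that idea, using fgetcsv() instead of explode() so the quoted fields parse correctly (the file name and column layout follow the sample above):

// Stream the CSV and stop at the first row whose id matches.
function findRecordById($file, $id)
{
    $handle = fopen($file, 'r');
    fgetcsv($handle); // skip the header: id,song,album,track
    while (($cols = fgetcsv($handle)) !== false) {
        if ($cols[0] === $id) {
            fclose($handle);
            return $cols; // e.g. ['8999', 'Best of 10', ...]
        }
    }
    fclose($handle);
    return null; // not found
}

$record = findRecordById('songs.csv', '8999');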
I'm not familiar with PHP, but if this query is frequent, consider keeping the data in memory and indexing it by id using a mapping data structure (in PHP this would be an associative array).
