Read a CSV file or make a MySQL query - php

I'm building a CMS where, once a user searches for something, a cache file (CSV) is generated from a MySQL query, and that same CSV is then included and served by PHP for repeats of the same search.
Now I want to allow users to filter data from that same cache/static file using jQuery.
I have two options:
1. Make a DB query to generate the result based on the user's filter parameters.
2. Read that cache/static file (which is in CSV format) and generate the result based on the user's parameters using PHP only (see the sketch below).
Both my database and the CSV files are small: about 2,000 rows in the MySQL database and at most 500 lines in a CSV file; the average CSV file would be around 50 lines. There will be several (say about 100) CSV files for different searches.
Which technique will be faster and more efficient? I'm on a shared host.
Search results are like product information on eCommerce websites.
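For reference, option 2 boils down to something like this; the file name, column layout and "max_price" parameter are just placeholders for illustration:

    // Rough sketch of option 2: filter the cached CSV with PHP only.
    $maxPrice = (float) $_GET['max_price'];   // hypothetical filter parameter sent by jQuery
    $results  = array();

    if (($fh = fopen('cache/search_123.csv', 'r')) !== false) {
        while (($row = fgetcsv($fh)) !== false) {
            if ((float) $row[2] <= $maxPrice) {   // column 2 assumed to hold the price
                $results[] = $row;
            }
        }
        fclose($fh);
    }

    header('Content-Type: application/json');
    echo json_encode($results);               // handed back to the jQuery filter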

MySQL servers in a shared-hosting context are often ridiculously overloaded and can be very slow or unresponsive at times.
If you want a workaround, you could have your PHP script create a CSV file from the data table for the first user of the day, then read that CSV file for the rest of the day.
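A minimal sketch of that once-a-day cache, assuming PDO and made-up table/column names:

    // Sketch: rebuild the CSV cache only for the first request of the day.
    $cacheFile = 'cache/products.csv';        // made-up file name

    if (!is_file($cacheFile) || date('Y-m-d', filemtime($cacheFile)) !== date('Y-m-d')) {
        $pdo  = new PDO('mysql:host=localhost;dbname=shop', 'user', 'pass');
        $stmt = $pdo->query('SELECT id, name, price FROM products');   // placeholder query

        $fh = fopen($cacheFile, 'w');
        while ($row = $stmt->fetch(PDO::FETCH_NUM)) {
            fputcsv($fh, $row);
        }
        fclose($fh);
    }

    // Every later request that day just reads the flat file.
    $rows = array_map('str_getcsv', file($cacheFile, FILE_IGNORE_NEW_LINES));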

Because you're on a shared host, a total of 2,000 rows is not a problem, but the hard-disk I/O is.
Keep the database search results in memory, for example in a MySQL MEMORY engine table,
or better yet, let Redis manage the cache with a TTL.
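A rough sketch of the Redis idea, assuming the phpredis extension is available; the key name, table and query are my own placeholders:

    // Sketch: cache the MySQL result in Redis with a one-hour TTL.
    $searchTerm = $_GET['q'];                 // hypothetical search parameter
    $key        = 'search:' . md5($searchTerm);

    $redis = new Redis();
    $redis->connect('127.0.0.1', 6379);

    if (($json = $redis->get($key)) === false) {
        $pdo  = new PDO('mysql:host=localhost;dbname=shop', 'user', 'pass');
        $stmt = $pdo->prepare('SELECT id, name, price FROM products WHERE name LIKE ?');
        $stmt->execute(array('%' . $searchTerm . '%'));
        $json = json_encode($stmt->fetchAll(PDO::FETCH_ASSOC));

        $redis->setex($key, 3600, $json);     // expire after an hour
    }

    $results = json_decode($json, true);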

Related

CSV with 56.6mb of data, store as CSV file or in database?

I am creating a website where I want people to submit location addresses. To avoid spelling mistakes I would like users to select from a list (town name, county).
I came across the following site http://www.doogal.co.uk/london_postcodes.php which lets me download a 56 MB CSV file containing the location data I need.
However, I have never used a CSV file larger than 1 MB, or with more than 30,000 rows, on my websites before. Usually I would just import it through phpMyAdmin.
Which is better: importing the CSV into the database through phpMyAdmin, or accessing the .csv file directly with PHP?
Regards
It depends on what you want to achieve. I am not so sure that using a CSV is beneficial, since a database will allow you to:
Cache data
Create Indexes for fast searching
Create complex queries
Do data manipulation, etc
The only case where I would consider a CSV better is if you always use all of the data. Otherwise, I would go for a database. The end result will be much more organized and much faster, and you can build on top of it.
Hope this helps.
If you're going to do lookups, I'd recommend you put it into a database table and add indexes on the fields that you will be searching on (https://dev.mysql.com/doc/refman/5.0/en/mysql-indexes.html). A flat file is not a good way to store data that you have to access or filter quickly.
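For example (table and column names are assumptions), after importing the CSV you could index the columns you filter on and query them from PHP:

    // One-time: add indexes on the columns used for lookups.
    $pdo = new PDO('mysql:host=localhost;dbname=locations', 'user', 'pass');
    $pdo->exec('ALTER TABLE postcodes ADD INDEX idx_town (town), ADD INDEX idx_county (county)');

    // Typical lookup feeding the town/county select list.
    $stmt = $pdo->prepare('SELECT DISTINCT town, county FROM postcodes WHERE town LIKE ? ORDER BY town LIMIT 20');
    $stmt->execute(array($_GET['term'] . '%'));   // prefix search can still use the index
    $suggestions = $stmt->fetchAll(PDO::FETCH_ASSOC);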

Which is faster? 1 Million lines array or database?

In a PHP script I have a set of params (zip codes / addresses) that will not change frequently, so I'm looking to move this particular DB table into a config file. Would it be faster to read a file containing an array with 1 million lines, with zip codes as keys, or to query a DB table to get the remaining parts of the address (street, city, state)?
Thanks,
Try to store the data in a database rather than a file; for a million lines I would guess the database is faster than the file.
If you want better performance you can use a cache such as APC, or add an index on the zip field in the database.
Sphinx is an open-source indexer that gives faster performance for text search.
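A hedged sketch of the cache-plus-index idea, using the APCu functions (the successor to APC) and made-up table/column names:

    // Sketch: check the APCu cache first, fall back to an indexed DB lookup.
    $zip = $_GET['zip'];                      // hypothetical request parameter
    $key = 'zip:' . $zip;

    $address = apcu_fetch($key, $found);
    if (!$found) {
        $pdo  = new PDO('mysql:host=localhost;dbname=geo', 'user', 'pass');
        $stmt = $pdo->prepare('SELECT street, city, state FROM addresses WHERE zip = ?');
        $stmt->execute(array($zip));
        $address = $stmt->fetch(PDO::FETCH_ASSOC);

        apcu_store($key, $address, 86400);    // keep the lookup cached for a day
    }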
Based on the number of zip codes, I'd say go with the DB instead of the associative array. You'll also be able to search by address, zip code, or even ID.

Need help getting data from 3rd party program into web server database on a regular basis

I am trying to build a data management system online using mysql/php/javascript/etc.
I have used Access for this in the past and it works great for the part I am struggling with. Data comes from an HPLC (a lab instrument), and the software that determines the results auto-exports this as an Excel file, but it can also produce CSV. After the export, the HPLC software runs a command file that opens Access and runs a form which imports the file and places the data in the correct fields with the proper identifiers.
I now want to have all the data in a web-based DB. This will allow better access off-site and an easier-to-maintain system. My problem is, I am not sure how to get the data from the HPLC into the database.
I think it may be possible to use MySQL commands to upload the .csv file, but when it comes to formatting the data and using proper table relations, I am stuck! How can I upload the data AND run a program to normalize it?
Export to .csv.
Write a MySQL program which:
creates a temporary table for your .csv input;
uploads the data from the .csv file with the LOAD DATA statement;
normalizes the data into your structured database tables by selecting from the staging table that holds the CSV data and inserting into your various tables.
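A sketch of those three steps driven from PHP/PDO; the table and column names are made up, and LOAD DATA LOCAL INFILE must be enabled on both the server and the client:

    $pdo = new PDO('mysql:host=localhost;dbname=lab', 'user', 'pass',
                   array(PDO::MYSQL_ATTR_LOCAL_INFILE => true));

    // 1. temporary staging table mirroring the CSV columns
    $pdo->exec('CREATE TEMPORARY TABLE hplc_import
                (sample_id VARCHAR(32), analyte VARCHAR(64), result DECIMAL(10,4), run_date DATE)');

    // 2. bulk-load the exported file
    $pdo->exec("LOAD DATA LOCAL INFILE '/path/to/export.csv'
                INTO TABLE hplc_import
                FIELDS TERMINATED BY ','
                IGNORE 1 LINES");

    // 3. normalize into the real tables
    $pdo->exec('INSERT IGNORE INTO samples (sample_id)
                SELECT DISTINCT sample_id FROM hplc_import');
    $pdo->exec('INSERT INTO results (sample_id, analyte, result, run_date)
                SELECT sample_id, analyte, result, run_date FROM hplc_import');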

Best practice for storing and searching applicant Résumé or CV file

I am starting a recruitment consultancy and sooner or later we will be dealing with many applicant résumés or CVs (curricula vitae). I am building a simple application with PHP and MySQL (the target server will be Windows) to let applicants upload their CVs on our website. Initially I will restrict uploads to MS Word documents with a maximum size of 500 KB.
Now my question is around two operations which would be performed on these files.
Search the content of these files for specific keywords to find résumés that match the relevant skills.
Then serve these files to our employers, either through a download link or by emailing the résumés to them.
Coming straight to the questions:
Do I store the actual files on the file system and run a Windows search over them?
Or do I insert only the content into a MySQL BLOB/CLOB, perform the search on the table, and then serve the content from the table itself to the employer?
Or do I store the file on the file system and also insert the content into a MySQL BLOB, search the content in MySQL, and serve the file from the file system?
I am of the opinion that once the number of résumés reaches the thousands, Windows search will be extremely slow, but then I searched the internet and found that it is not advisable to store huge amounts of file content in a database.
So I just need your suggestion on which approach to adopt, on the assumption that at some point we will be storing and retrieving thousands of résumés.
Thanks in advance for your help.
One option, a hybrid: Index the resumes into a db, but store a filesystem path as the location. When you get a hit in the db and want to retrieve the resume, get it off the file system via the path indicated in the db.
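A rough sketch of that hybrid, assuming a resumes table with a file_path column and a FULLTEXT index on an extracted_text column (FULLTEXT needs MyISAM, or InnoDB on MySQL 5.6+):

    $pdo = new PDO('mysql:host=localhost;dbname=recruit', 'user', 'pass');

    // Find matching résumés via the indexed text copy kept in the database.
    $stmt = $pdo->prepare('SELECT file_path FROM resumes
                           WHERE MATCH(extracted_text) AGAINST (? IN NATURAL LANGUAGE MODE)');
    $stmt->execute(array($_GET['skills']));

    // Serve the actual .doc file straight from the file system.
    if ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
        header('Content-Type: application/msword');
        header('Content-Disposition: attachment; filename="' . basename($row['file_path']) . '"');
        readfile($row['file_path']);
    }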
What you want is a full-text index of the documents. This tends to be a job for something like Solr (see this cross-reference on Stack Overflow: How do I index documents in Solr). The database would then only keep a reference to the file on disk. You should not try to save BLOB data to an InnoDB table unless it runs on the Barracuda file format with ROW_FORMAT=DYNAMIC; see the MySQL Performance Blog for further details on BLOB storage in InnoDB.
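For illustration only, querying a Solr core from PHP can be as small as one HTTP call; the host, core name and field names below are assumptions:

    // Ask Solr for résumés matching a phrase and read back the stored file paths.
    $query = urlencode('content:"project management"');
    $url   = 'http://localhost:8983/solr/resumes/select?q=' . $query . '&fl=id,file_path&wt=json';

    $response = json_decode(file_get_contents($url), true);

    foreach ($response['response']['docs'] as $doc) {
        echo $doc['file_path'], PHP_EOL;      // path stored alongside the indexed text
    }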

PHPExcel large data sets with multiple tabs - memory exhausted

Using PHPExcel I can run each tab separately and get the results I want, but if I add them all into one Excel file it just stops, with no error or anything.
Each tab consists of about 60 to 80 thousand records, and I have about 15 to 20 tabs, so roughly 1,600,000 records split across multiple tabs (and this number will probably grow).
Also, I have tested going past the 65,000-row limitation of .xls by using the .xlsx extension, with no problems, as long as I run each tab in its own Excel file.
Pseudo code:
read data from db
start the PHPExcel process
parse out data for each page (some styling/formatting but not much)
(each numeric field value does get summed up in a totals column at the bottom of the excel using the formula SUM)
save excel (xlsx format)
I have 3GB of RAM so this is not an issue and the script is set to execute with no timeout.
I have used PHPExcel in a number of projects and have had great results but having such a large data set seems to be an issue.
Has anyone ever had this problem? Any workarounds? Tips? etc...
UPDATE:
on error log --- memory exhausted
Besides adding more RAM to the box, are there any other tips?
Has anyone ever saved the current state and then edited the Excel file with new data?
I had the exact same problem, and googling around did not turn up a workable solution.
As PHPExcel generates objects and stores all data in memory before finally generating the document file, which is itself also held in memory, raising PHP's memory limit will never entirely solve this problem; that approach does not scale well.
To really solve the problem, you need to generate the XLS file "on the fly". That's what I did, and now I can be sure that the "download SQL result set as XLS" feature works no matter how many (millions of) rows the database returns.
The pity is, I could not find any library that supports this kind of on-the-fly XLS(X) generation.
I found this article on IBM developerWorks which gives an example of how to generate the XLS XML "on the fly":
http://www.ibm.com/developerworks/opensource/library/os-phpexcel/#N101FC
It works pretty well for me: I have multiple sheets with LOTS of data and did not even get near the PHP memory limit. It scales very well.
Note that this example uses the Excel plain-XML format (file extension "xml"), so you can send your uncompressed data directly to the browser.
http://en.wikipedia.org/wiki/Microsoft_Office_XML_formats#Excel_XML_Spreadsheet_example
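The core of the "on the fly" approach is just echoing SpreadsheetML while looping over an unbuffered result set, roughly like this (connection details and query are placeholders):

    // Stream an Excel 2003 XML spreadsheet row by row instead of building it in memory.
    header('Content-Type: application/vnd.ms-excel');
    header('Content-Disposition: attachment; filename="report.xml"');

    $pdo = new PDO('mysql:host=localhost;dbname=reports', 'user', 'pass', array(
        PDO::MYSQL_ATTR_USE_BUFFERED_QUERY => false,   // don't pull the whole result set into PHP
    ));
    $stmt = $pdo->query('SELECT id, name, amount FROM big_table');   // placeholder query

    echo '<?xml version="1.0"?>' . "\n";
    echo '<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"' . "\n";
    echo '          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">' . "\n";
    echo '<Worksheet ss:Name="Sheet1"><Table>' . "\n";

    while ($row = $stmt->fetch(PDO::FETCH_NUM)) {
        echo '<Row>';
        foreach ($row as $value) {
            $type = is_numeric($value) ? 'Number' : 'String';
            echo '<Cell><Data ss:Type="' . $type . '">' . htmlspecialchars($value) . '</Data></Cell>';
        }
        echo "</Row>\n";
    }

    echo '</Table></Worksheet></Workbook>';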
If you really need to generate an XLSX, things get even more complicated. XLSX is a compressed archive containing multiple XML files, so you would have to write all your data to disk (or keep it in memory, which brings back the same problem as PHPExcel) and then build the archive from that data.
http://en.wikipedia.org/wiki/Office_Open_XML
It may also be possible to generate the compressed archive "on the fly", but that approach seems really complicated.
