How to find the different lines between 2 files in PHP

I'm working on a small PHP application that regularly updates product stock. I download the updated file from the server and keep the old one in my directory. What is the best way to get only the updated products (lines) between these two files? For information, both files contain around 70,000 product lines.
I thought of storing the data of each file in an array and then using "array_diff" to compare them. It would work in theory, but is it a good idea with 70,000 entries in each array?
Thanks in advance.

I'd use the diff command.
For reference:
https://www.geeksforgeeks.org/diff-command-linux-examples/
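If shelling out to diff is not an option, the comparison can also be done in PHP itself. The following is only a minimal sketch, assuming PHP 7.4 or later, that each product occupies exactly one line, and that "updated" means "a line in the new file that does not appear verbatim in the old file". The function names readLines and changedLines are made up for this example. It turns the old file into a set of array keys so that each lookup is a hash check rather than a scan:

```php
<?php
// Read a file into an array of lines without line endings.
// Hypothetical helper for this sketch.
function readLines(string $path): array
{
    $lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($lines === false) {
        throw new RuntimeException("Cannot read $path");
    }
    // Drop a trailing carriage return in case the file uses Windows line endings.
    return array_map(fn($line) => rtrim($line, "\r"), $lines);
}

// Return the lines of $newFile that do not appear anywhere in $oldFile.
function changedLines(string $oldFile, string $newFile): array
{
    // Flip the old lines into keys so membership tests use isset().
    $seen = array_flip(readLines($oldFile));

    $changed = array_filter(
        readLines($newFile),
        fn($line) => !isset($seen[$line])
    );

    // Reindex so the result is a plain list.
    return array_values($changed);
}
```

Calling changedLines('old.csv', 'new.csv') (placeholder file names) returns new and modified lines; products that were removed from the new file can be found by swapping the two arguments. With about 70,000 short lines per file, both arrays should fit comfortably in memory on a typical setup, but that is an assumption worth checking against your memory_limit.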

Related

Which is faster? 1 Million lines array or database?

In a PHP script, I have a set of params (zip codes / addresses) that will not change frequently, so I'm looking to move this particular DB table to a config file. Which would be faster: reading a file containing an array of 1 million lines with zip codes as keys, or scanning a DB table to get the remaining items of the address (street, city, state)?
Thanks,
Try to store the data in a database rather than a file. For a million lines, I'd guess a database is faster than a file.
If you want better performance, you can use a cache such as APC, or add an index on the zip field in the database.
Sphinx is an open-source search engine that gives faster full-text search.
Based on the number of zip codes I'd say go with the DB instead of the associative array. And you'll be able to search either addresses or zip codes or even ids.

Best way to loop through thousands of files in one directory

I have up to 40 000 images stored in one directory (ID photos). I need to periodically synchronize them with a database. I'm currently using a PHP script to loop through the directory and add new files and remove missing files, through ODBC. Obviously, this is not working so well.
Is there a robust way to do it in PHP? Or, what is the best alternative?
Thanks

Retrieving the proper file from a directory with ~280,000 files

I have ~280,000 files that will need to be searched through, and the proper file returned and opened. The file names are exact matches of the expected search terms.
The search terms will be taken by an input box using PHP. What is the best way to accomplish this so that searches do not take a large amount of time?
Thanks!
I suspect the file system itself will struggle with 280,000 files in one directory.
An approach I've taken in the past is to put those files in subdirectories based upon the initial letters of the filename e.g.
1/100000.txt
1/100001.txt
...
9/900000.txt
etc. You can subdivide further using the second letter etc.
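A rough sketch of that layout in PHP, assuming the search term is the exact file name and that the files have already been moved into subdirectories named after their leading characters. The function name shardedPath and the base directory are invented for this example:

```php
<?php
// Build the path of a file stored under subdirectories named after
// the first $levels characters of its name, e.g. /data/1/100000.txt.
// Hypothetical helper for this sketch.
function shardedPath(string $baseDir, string $fileName, int $levels = 1): string
{
    // basename() strips any directory part, so user input such as
    // "../etc/passwd" cannot escape the base directory.
    $fileName = basename($fileName);

    $parts = [];
    for ($i = 0; $i < $levels && $i < strlen($fileName); $i++) {
        $parts[] = $fileName[$i];
    }
    $parts[] = $fileName;

    return rtrim($baseDir, '/') . '/' . implode('/', $parts);
}
```

A lookup then becomes a single is_file(shardedPath('/data', $term)) check instead of a scan of one huge directory, and passing 2 as the third argument adds a second directory level based on the second character.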
It's good you added mysql to your tags. Ideally I would have a cron task that indexes the directories into a MySQL table, and use that table to do the actual search. An indexed database query is faster than iterating over the file system. You could run the task daily or hourly depending on how often your files change. Or use something like Guard to monitor the file system for changes and make the appropriate updates.
See: https://github.com/guard/guard

Simplest way to total up columns?

I have a PHP script that imports up to 10 or so different CSV files. Each row of each file contains bank account info, including balance. After I've imported all the CSV data into my database, I'd like to make sure the data got in there correctly by comparing the total account balance in the database to the total account balance of the CSV files.
I see at least a few options:
Manually total up all the account balances in Excel - yuck.
Write a PHP script to read each CSV file and total up the account balances - also yuck.
Some third option that I hope exists. It would be amazing if I could do something like:
excel --file="cd.csv" | sum --column="E"
That's obviously not a real thing but hopefully you get the idea. Using some combination of PHP, MySQL, Linux commands, Excel and/or any other tools, is there a simple way to do this?
I don't have a complete answer for you, but AWK should be able to solve your problem. Have a look at these 2 posts:
https://superuser.com/questions/53652/transforming-csv-file-using-sed
Shell command to sum integers, one per line?
I'm not enough of an AWK expert to give the solution, but perhaps someone else can help us here.
Another option (which you may also consider yuck) is to use a library like PHPExcel
You can iterate over the CSV file using fgetcsv(), which converts each line to an array of values. Accumulate the value of the array element containing the balance on each iteration until you have the sum total. Use glob() to get the list of CSV files in a folder.
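The fgetcsv() and glob() approach could look roughly like this. It is a sketch under a few assumptions that are not in the question: the files are comma-separated, each has a single header row, and the balance sits in a known zero-based column (column E would be index 4). The function name sumCsvColumn is made up:

```php
<?php
// Sum one column across every CSV file matching a glob pattern.
// Hypothetical helper for this sketch.
function sumCsvColumn(string $pattern, int $column, bool $hasHeader = true): float
{
    $total = 0.0;

    // glob() returns false on error, so fall back to an empty list.
    foreach (glob($pattern) ?: [] as $path) {
        $handle = fopen($path, 'r');
        if ($handle === false) {
            throw new RuntimeException("Cannot open $path");
        }

        if ($hasHeader) {
            fgetcsv($handle, 0, ',', '"', '\\'); // discard the header row
        }

        while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
            // Skip blank lines and non-numeric cells instead of counting them as 0.
            if (isset($row[$column]) && is_numeric($row[$column])) {
                $total += (float) $row[$column];
            }
        }

        fclose($handle);
    }

    return $total;
}
```

The result of something like sumCsvColumn('/path/to/import/*.csv', 4) can then be compared with a SELECT SUM(...) on the imported table. Floats can pick up small rounding errors on long lists of currency values, so compare within a tolerance, or switch to integer cents or bcadd() if exact totals matter.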
You may not have to "manually" total up the account balances, if you can use Excel functions from your application, the Excel formula in VBA would be:
Application.Sum(Range("A:A"))
where A:A is for column A.
Try using CSVFix with its summary option. It will give you more data than you need, but should be easy to use.
Otherwise, this sounds like a good use for Awk.
Why can't you just automagically total up the account balances in excel with a formula and export them with the rest of the data?
A bit of a different angle: make use of the MySQL CSV engine to expose your CSV files to MySQL and then do a normal SQL SUM.
See also: The CSV Storage Engine

PHP programming logic inquiry

With the help of this community, I just recently solved some issues with PHPExcel reading multiple files through a foreach(). Now it seems I've got logic issues.
I'm reading some Excel files and extracting some data from them. The data are student grades. You can check a lightly modified version of the code (I didn't paste it here because of quoting issues). $filelist is an array with the filenames available in the $folder I chose.
Now, while the first and fourth result show me the values of their respective files, all the other results do not (I've tried with 24 files so far). They show me the results from the first file.
What do you think might be happening here?
I guess it's a problem with the way I'm doing it, but I can't figure out the cause.
Thank you very much in advance.
PS: If you need any further data, please let me know. Thanks.
It must have been something within the files. I had a hard time redoing all the files into a single file with multiple sheets, and the program now works as intended.
