I am trying to write to a txt file. My data are comma-separated numbers that get appended over time. I have two types of data that I want to compare, so if I use a single txt file, each type has to be written on its own line.
1,5,10,15,22,32
The other line is:
1,5,7,12,8,99
So it's simple data, but it grows as new values are appended, and in the worst case it may get big: maybe 10-15,000 numbers in each line.
I'm also worried that if these txt files are opened and closed quickly enough, the data may get corrupted.
So I would like to write these two lines into two different txt files instead of one. Which approach gives better integrity and performance?
Should I use only one file, or are two files preferred?
I don't want to use a database, because a database always puts some load on the server, which can be avoided with a txt file.
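For reference, the kind of append I have in mind would look something like this; the file names and the small helper are just placeholders:

<?php
// Minimal sketch, assuming one file per data series (names are made up).
// Appending to the end of a file is cheap, and LOCK_EX keeps two fast,
// overlapping writes from interleaving and corrupting the data.
function append_value(string $file, int $value): void
{
    // add a comma only if the file already has data in it
    $prefix = (file_exists($file) && filesize($file) > 0) ? ',' : '';
    file_put_contents($file, $prefix . $value, FILE_APPEND | LOCK_EX);
}

append_value('series_a.txt', 32);
append_value('series_b.txt', 99);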
I'm going to be using a JSON file to contain a list of links to a few posts, and it can be updated at any time. However, I'm stumped about which mode to use with PHP's fopen() function. This will be a flat-file database, primarily so I can learn to work with files, PHP, and JSON before moving on to a proper relational database (that, and it's not a huge collection of pages, so I'm not worried about needing SQL or anything like that yet).
The process I'm using is that once a blog post is typed up, it will create a directory, save a new index.php file to it with all of the stuff that lets me view the page, and then, where I'm currently stuck, update a JSON file with the Title, Author, Date, and link to the newly created page.
Based on the PHP manual, there are three modes I might want to use: r+, w+, or a+.
The process I am looking to use is to take the JSON file and place the data into an array. Update the array, then save it back to the file.
a+ places the pointer at the end of the file and writes are always appended, so I'm assuming this is the worst choice for this situation since I wouldn't add a new JSON entry at the end of the file (I'm tempted to actually insert any new data at the beginning of the JSON object instead of at the end).
w+ mentions read and write, but also truncating the file - does this happen upon saving data to the file, or does this happen the moment the file is opened? If I used this mode on an existing JSON file, would I then be reading a blank file before I can even modify the array and re-save it to the object?
r+ mentions placing the pointer at the beginning of the file - does saving data overwrite what's there or will it insert the data BEFORE what's existing there? If it inserts, how would I manually clear the file and then save the newly-modified array to the JSON object?
Which of those modes is best suited for what I'm looking to do? And is there a better way of doing this anyway?
If you're always reading or writing an entire file, you don't have to work with file handles at all - PHP provides a pair of functions file_get_contents($file_name) and file_put_contents($file_name, $content) which are much simpler to work with.
File handles with their various modes are most useful when you're working with parts of files. For instance, if you are using CSV, you can read or write one line at a time, without having the full set of data in memory at once. Or, with binary file formats, you might know the location in the file you want to read from, and can "seek" the file handle to that location.
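As a small illustration of the line-at-a-time CSV case (the file name is a placeholder):

<?php
// Sketch: read a CSV file one line at a time through a file handle,
// so the whole file is never held in memory at once.
$handle = fopen('data.csv', 'r');
while (($row = fgetcsv($handle)) !== false) {
    // $row is one CSV line as an array of fields
    // process($row);
}
fclose($handle);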
You should probably read the entire file first (e.g. with file_get_contents()), and then open it with w+ to write the new data; note that w+ truncates the file the moment it is opened, so do the read before opening it in that mode. (Edit: or rather, as the other answer points out, use file_put_contents(), which is always simpler when you are only making one write operation.)
r+ will overwrite as much of the file as you are writing, but won't erase beyond that. If your data always increases in size, this should be the same as overwriting the file entirely, but even if it's true now, that's an assumption that will likely mess up your data in the future.
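As a concrete illustration of the read-everything / write-everything approach, a minimal sketch (the posts.json name and the entry fields are assumptions based on the question):

<?php
$file = 'posts.json';

// Load the current list, or start fresh if the file doesn't exist yet.
$posts = file_exists($file) ? json_decode(file_get_contents($file), true) : [];
if (!is_array($posts)) {
    $posts = [];
}

// Prepend the newest entry, as the question suggests.
array_unshift($posts, [
    'title'  => 'My newest post',
    'author' => 'Me',
    'date'   => date('Y-m-d'),
    'link'   => '/posts/my-newest-post/',
]);

// file_put_contents() truncates and rewrites the file in one call, which
// avoids juggling r+/w+/a+ by hand; LOCK_EX guards against a concurrent
// request writing at the same moment.
file_put_contents($file, json_encode($posts, JSON_PRETTY_PRINT), LOCK_EX);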
I have 1000+ txt files whose file names are usernames. I'm reading them in a loop; here is my code:
for ($i = 0; $i < 1240; $i++) {
    $node = $users_array[$i];
    $read_file = "Uploads/" . $node . "/" . $node . ".txt";
    if (file_exists($read_file)) {
        if (filesize($read_file) > 0) {
            $myfile = fopen($read_file, "r");
            $file_str = fread($myfile, filesize($read_file));
            fclose($myfile);
        }
    }
}
When the loop runs, it takes too much time and the server times out.
I don't know why it takes that long, because the files don't contain much data. Reading all the text from a txt file should be fast, am I right?
Well, you are doing read operations on an HDD/SSD, which is not as fast as memory, so you should expect a longer running time depending on how big the text files are. You can try the following:
- if you are running the script from a browser, I recommend running it from the command line instead; that way you won't hit a web server timeout, and the script will be able to finish as long as PHP has no execution time limit set (and if it does, you may need to increase it)
- in your script above, store the result of filesize($read_file) in a variable so that you do not call it twice; it might improve the script's running time
- if you still can't finish the job, consider running it in batches of 100 or 500 files
- keep an eye on memory usage; maybe that is why the script dies
- if you need the content of the file as a string, you can try file_get_contents() and maybe skip the filesize() check altogether
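Putting those suggestions together, a minimal sketch (the processing step is left as a placeholder, and $users_array comes from the original code):

<?php
// Lift PHP's execution time limit; the web server may still impose its own.
set_time_limit(0);

foreach ($users_array as $node) {
    $read_file = "Uploads/" . $node . "/" . $node . ".txt";

    if (!is_file($read_file)) {
        continue;
    }

    // One call instead of fopen/filesize/fread/fclose per file.
    $file_str = file_get_contents($read_file);
    if ($file_str === false || $file_str === '') {
        continue; // unreadable or empty file
    }

    // ... process $file_str ...
}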
It sounds like your problem is having 1000+ files in a single directory. On a traditional Unix file system, finding a single file by name requires scanning through the directory entries one by one. If you have a list of files and try to read all of them, it'll require traversing about 500000 directory entries, and it will be slow. It's an O(n^2) algorithm and it'll only get worse as you add files.
Newer file systems have options to enable more efficient directory access (for example https://ext4.wiki.kernel.org/index.php/Ext4_Disk_Layout#Hash_Tree_Directories) but if you can't/don't want to change file system options you'll have to split your files into directories.
For example, you could take the first two letters of the user name and use that as the directory name. That's not great because you'll get an uneven distribution; it would be better to use a hash, but then it becomes difficult to find entries by hand.
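A hedged sketch of that layout (the bucket scheme and path structure here are just an example, not the original one):

<?php
// Bucket user files into subdirectories. The first two characters of an
// md5 hash give ~256 buckets with a reasonably even spread.
function user_file(string $node): string
{
    $bucket = substr(md5($node), 0, 2);
    return "Uploads/" . $bucket . "/" . $node . "/" . $node . ".txt";
}

// e.g. user_file('alice') => "Uploads/<2-char bucket>/alice/alice.txt"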
Alternatively, you could iterate over the directory entries (with opendir and readdir), check whether the file names match your users, and leave dealing with the problems the huge directory creates for later.
Alternatively, look into using a database for your storage layer.
There are a lot of similar scenarios discussed elsewhere (replace text in a file, read specific lines, etc.), but I have not found a good solution for what I want to do:
Messages (strings) are normally sent to a queue. If the server that handles the queue is down, the messages are saved to a file, one message per line.
When the server is up again, I want to start sending the messages to it. The file of messages could be "big", so I do not want to read the entire file into memory. I also only want to send each message once, so the file needs to reflect whether a message has been sent (in other words: I don't want to read 100 lines, have PHP time out after 95, and then have the same thing happen all over again next time).
What I basically need is to read one line from a big text file and then delete that line when it has been processed by my script, without constantly reading/writing the whole file.
I have seen different solutions (fread, SplFileObject etc) that can read a line from a file without reading the entire file (into memory) but I have not seen a good way to delete the line that was just read without going through the entire file and saving it again.
I'm guessing that it can be done, since all that needs to happen is to remove x bytes from the beginning or the end of the file, depending on where you read the lines from.
To be clear: I do not think it's a good solution to read the first line from the file, use it, and then read all the other lines just to write them to a tmp file and from there back to the original file. That would mean reading/writing 100,000 lines just to get one line.
The problem can be solved in other ways, like creating a number of smaller files so they can be read/written without too many performance problems, but I would like to know if anyone has a solution to this exact problem.
Update:
Since it can't be done, I ended up using SQLite.
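For reference, a minimal sketch of what that SQLite fallback could look like (table, column, and function names are assumptions):

<?php
// Open (or create) the fallback store next to the script.
$db = new PDO('sqlite:' . __DIR__ . '/queue.sqlite');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE IF NOT EXISTS messages (
    id   INTEGER PRIMARY KEY AUTOINCREMENT,
    body TEXT NOT NULL
)');

// When the queue server is down: store the message instead of sending it.
$insert = $db->prepare('INSERT INTO messages (body) VALUES (?)');
$insert->execute(['hello, queue']);

// When it is back up: pull one message at a time and delete its row once
// it has been handled, so already-processed lines are never re-read.
$row = $db->query('SELECT id, body FROM messages ORDER BY id LIMIT 1')
          ->fetch(PDO::FETCH_ASSOC);
if ($row !== false) {
    // sendToQueue($row['body']);  // hypothetical send step
    $db->prepare('DELETE FROM messages WHERE id = ?')->execute([$row['id']]);
}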
There is an array of numbers, divided into partitions containing the same number of elements (the output of array_chunk()). They are written into separate files: file 1.txt contains the first chunk, 2.txt the second, and so on. Now I want these files to contain a different number of elements of the initial array. Of course, I could read them all into one array and split it again, but that requires quite a large amount of memory. Could you please help me with a more efficient solution? (The number of files and the size of the last one are stored separately.) I have no other ideas...
Do you know what the new chunk size is? If you do, then you can read the data in and write a new file out whenever you fill a chunk. In pseudo-code:
for each original file:
    for each record:
        add record to buffer
        if buffer is desired size:
            write new file
            clear buffer
write new file
Obviously you'll need to keep the new files separate from the old ones, and then, once you've rewritten the data, you can swap them out somehow. (I would personally suggest having two directories and renaming them once you're done.)
If you don't know what the size of your chunks should be (for instance, because you want a specific number of files), then first do whatever work is needed to figure that out, and then proceed with the solution above.
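A sketch of that pseudo-code in PHP; the file layout, chunk size, and comma-separated format are assumptions based on the question:

<?php
// Re-chunk 1.txt, 2.txt, ... from $oldDir into files of $newSize elements
// in $newDir, reading only one original chunk into memory at a time.
$oldDir   = 'chunks';
$newDir   = 'chunks_new';
$newSize  = 500;   // desired elements per new file (assumption)
$oldCount = 10;    // number of original files (stored separately)

if (!is_dir($newDir)) {
    mkdir($newDir);
}

$buffer   = [];
$newIndex = 1;

for ($i = 1; $i <= $oldCount; $i++) {
    $values = explode(',', trim(file_get_contents("$oldDir/$i.txt")));
    foreach ($values as $value) {
        $buffer[] = $value;
        if (count($buffer) === $newSize) {
            file_put_contents("$newDir/$newIndex.txt", implode(',', $buffer));
            $buffer = [];
            $newIndex++;
        }
    }
}

if ($buffer !== []) {
    // flush whatever is left into a final, smaller file
    file_put_contents("$newDir/$newIndex.txt", implode(',', $buffer));
}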
I'm trying to write a PHP script which will handle sorting of a CSV file by one or multiple columns and outputting the result to another file.
Is there a way to sort the CSV file without loading it entirely into memory?
No, there is no reasonable way. You need the data in memory to compare rows and write them out to a file.
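For the in-memory route, a minimal sketch (file names and sort columns are placeholders):

<?php
// Read every row, sort by one or more columns, write the result out.
$rows = [];
$in = fopen('input.csv', 'r');
while (($row = fgetcsv($in)) !== false) {
    $rows[] = $row;
}
fclose($in);

// Sort by column 2, then column 0, as an example of a multi-column sort.
usort($rows, function (array $a, array $b) {
    return [$a[2], $a[0]] <=> [$b[2], $b[0]];
});

$out = fopen('output.csv', 'w');
foreach ($rows as $row) {
    fputcsv($out, $row);
}
fclose($out);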
You could try a bubble-sort-style approach if you know the length of each line: read one line of the original file and the last line of a new "ordered" file, compare them, and append or prepend to the new file. After this pass, repeat with the new file as the origin until it is sorted.
You should use a database like MySQL.