PHP: symlink a huge list of files

I have a huge list of files (more than 48k files, with paths) and I want to create a symlink for each of them.
Here is my PHP code:
$files = explode("\n", file_get_contents("files.txt"));
foreach ($files as $file) {
    $file = trim($file);
    // create /home/<path>#<random>.txt pointing at the original file
    symlink($file, "/home/" . $file . "#" . rand(1, 80000) . ".txt");
}
The problem is that the process takes more than 1 hour.
I thought about checking whether the file exists first and then creating the symlink, so I did some research on php.net. There are functions like is_link() and readlink() for what I wanted in the first place, but one comment caught my attention:
It is necessary to note that readlink() only works on a real link file; if the link file points to a directory, you cannot detect it or read its contents: the is_link() function will always return false and readlink() will display a warning if you try.
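To illustrate the behavior described in that comment, a quick check on a regular file versus a real symlink (the paths here are just examples):
// is_link()/readlink() only work on actual symlinks; on a regular
// file is_link() returns false and readlink() warns and returns false.
var_dump(is_link('/tmp/regular.txt'));    // bool(false) for a normal file
var_dump(@readlink('/tmp/regular.txt'));  // bool(false), warning suppressed
symlink('/tmp/regular.txt', '/tmp/link.txt');
var_dump(is_link('/tmp/link.txt'));       // bool(true)
var_dump(readlink('/tmp/link.txt'));      // string(16) "/tmp/regular.txt"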
So I made this new code:
$files = explode("\n", file_get_contents("files.txt"));
foreach ($files as $file) {
    $file = trim($file);
    if (!empty(readlink($file))) {
        symlink($file, "/home/" . $file . "#" . rand(1, 80000) . ".txt");
    }
}
The problem now: no symlinks are created at all!
How can I prevent these problems? Should I use multithreading, or is there another option?

Obviously you are running a Linux-based operating system and your question relates to the file system.
In this case I would recommend creating a bash script to read files.txt and create the symlinks for all of them.
A good start for this is:
How to symlink a file in Linux?
Linux/UNIX: Bash Read a File Line By Line
Random number from a range in a Bash Script
So you may try something like this:
#!/bin/bash
while read name
do
    # Do what you want to $name
    ln -s "/full/path/to/the/file/$name" "/path/to/symlink/$(shuf -i 1-80000 -n 1)$name.txt"
done < files.txt
EDIT:
One-line solution:
while read name; do ln -s "/full/path/to/the/file/$name" "/path/to/symlink/$(shuf -i 1-80000 -n 1)$name.txt"; done < files.txt
Note: Replace "files.txt" with the full path to the file, and test on a small number of files first in case anything goes wrong.

Related

PHP: updating a file and reading it concurrently

I want to regularly update (rewrite, not append) a txt file from PHP using file_put_contents. Another PHP API reads this file and prints the content for the user.
Is it possible that when the user reads the file via the PHP API, it comes back empty? Because when the first PHP script updates the file, it erases the data and then writes the new content. If this is possible, how can I avoid it?
To prevent this and make sure the source file is never seen empty, try the following solution:
Keep the text file you are writing in a tmp folder, e.g. tmp_txt, created alongside the location of your current text file, so your text file goes into this tmp folder first.
Create a shell script and keep it under the tmp folder (or any other folder).
In the shell script, watch the file size and move the file once it is complete; put the script into the cron job scheduler:
find "/your project root path/tmp_txt/" -type f -size +1k -name "mytext.txt" -exec mv {} "/your project root path/folder where you want it/" \;
find is the command that searches for files; next comes your tmp folder path.
-type f considers regular files only.
-size +1k means larger than 1 KB; change this as per your needs.
-name "mytext.txt" defines your file name; for dynamic names use -name "*.txt".
-exec mv {} ... \; moves each matching file to the path that follows, if it matches the size condition above.
e.g. a cronjob entry that runs the script every minute:
* * * * * bash "/your project root path/tmp_txt/shellscriptfilename" >> /dev/null 2>&1
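For reference, the writer side of this scheme in PHP might look like the sketch below. Note that a plain rename() to the final location gives the same "never seen empty" guarantee without cron, since rename() within one filesystem is atomic on Linux (all paths here are hypothetical):
// Write the full content into the tmp_txt staging folder first, then
// move it into place; readers see either the old file or the complete
// new one, never a half-written or empty file.
$staging = '/your/project/root/tmp_txt/mytext.txt';  // hypothetical path
$live    = '/your/project/root/data/mytext.txt';     // hypothetical path
file_put_contents($staging, $newContent, LOCK_EX);   // finish writing here
rename($staging, $live);                             // atomic move into place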

PHP script that checks SHA1 or MD5 hashes for all files in directory against checksums scraped from XML file; recursive, loop

I've done a bulk download from archive.org using wget which was set to spit out a list of all files per IDENTIFIER into their respective folders.
wget -r -H -nc -np -nH -e robots=off -l1 -i ./itemlist.txt -B 'http://archive.org/download/'
Which results in folders organised thus from a root, for example:
./IDENTIFIER1/file.blah
./IDENTIFIER1/something.la
./IDENTIFIER1/thumbnails/IDENTIFIER_thumb001.gif
./IDENTIFIER1/thumbnails/IDENTIFIER_thumb002.gif
./IDENTIFIER1/IDENTIFIER_files.xml
./IDENTIFIER2/etc.etc
./IDENTIFIER2/blah.blah
./IDENTIFIER2/thumbnails/IDENTIFIER_thumb001.gif
etc
The IDENTIFIER is the name of a collection of files that comes from archive.org, hence, in each folder, there is also the file called IDENTIFIER_files.xml which contains checksums for all the files in that folder, wrapped in the various xml tags.
Since this is a bulk download and there are hundreds of files, the idea is to write some sort of script (preferably bash? Edit: maybe PHP?) that can select each .xml file and scrape it for the hashes, then test them against the files to reveal any corrupted, failed, or modified downloads.
For example:
From archive.org/details/NuclearExplosion, XML is:
https://archive.org/download/NuclearExplosion/NuclearExplosion_files.xml
If you check that link you can see there's both the option for MD5 or SHA1 hashes in the XML, as well as the relative file paths in the file tag (which will be the same as locally).
So. How do we:
For each folder of IDENTIFIER, select and scrape the XML for each filename and the checksum of choice;
Actually test the checksum for each file;
Log the failed checksums to a file that lists only the failed IDENTIFIERs (say a file called ./RetryIDs.txt, for example), so a download reattempt can be tried using that list:
wget -r -H -nc -np -nH -e robots=off -l1 -i ./RetryIDs.txt -B 'http://archive.org/download/'
Any leads on how to piece this together would be extremely helpful.
And an added incentive: if there is a solution, it's probably a good idea to let archive.org know so they can put it on their blog. I'm sure I'm not the only one who will find this very useful!
Thanks all in advance.
Edit: Okay, so a bash script looks tricky. Could it be done with PHP?
If you really want to go the bash route, here's something to get you started. You can use the xml2 suite of tools to convert the XML into something more amenable to traditional shell scripting, and then do something like this:
#!/bin/sh
xml2 < "$1" | awk -F= '
    $1 == "/files/file/@name" {name=$2}
    $1 == "/files/file/sha1" {
        sha1=$2
        print name, sha1
    }
'
This will produce on standard output a list of filenames and their corresponding SHA1 checksum. That should get you substantially closer to a solution.
Actually using that output to validate the files is left as an exercise to the reader.
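Since the question's edit asks about PHP: a minimal sketch using SimpleXML and sha1_file(), assuming the folder layout shown above (the RetryIDs.txt name comes from the question; everything else is illustrative):
// Walk each IDENTIFIER folder, read IDENTIFIER_files.xml, and compare
// the recorded sha1 against the file on disk; collect failed IDENTIFIERs.
$failed = [];
foreach (glob('./*', GLOB_ONLYDIR) as $dir) {
    $id = basename($dir);
    $xmlPath = "$dir/{$id}_files.xml";
    if (!is_file($xmlPath)) {
        continue;                                  // no manifest in this folder
    }
    $xml = simplexml_load_file($xmlPath);
    foreach ($xml->file as $file) {
        $name = (string) $file['name'];            // relative path from the XML
        $sha1 = (string) $file->sha1;              // recorded checksum
        $path = "$dir/$name";
        if ($sha1 === '') {
            continue;                              // no checksum recorded
        }
        if (!is_file($path) || sha1_file($path) !== $sha1) {
            $failed[$id] = true;                   // missing or corrupted file
        }
    }
}
// One failed IDENTIFIER per line, ready for wget -i ./RetryIDs.txt
file_put_contents('./RetryIDs.txt', implode("\n", array_keys($failed)));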

tar gz file extract exclude folder "data"

I have a little problem: I have a large 41 GB file on a server and I need to extract it.
How would I go about it? The file is in tar.gz format, extraction takes over 24 hours on a GoDaddy server, and then it stops for some reason.
I need to exclude a folder named data; this contains the bulk of the data (40.9 GB), the rest is just PHP.
home/xxx/public_html/xxx.com.au/data << this is the folder I don't need
I have been searching Google and other sites for days, but nothing works.
shell_exec('tar xvf xxx_backup_20140921.tar.gz'); is the command I use. I have even used the 'k' flag to skip existing files, and it doesn't work.
I have used the --exclude option, but nothing.
Try this:
shell_exec("tar xzvf xxx_backup_20140921.tar.gz --exclude='home/xxx/public_html/xxx.com.au/data'");
This should prevent the path listed (relative to the root of the archive) from being extracted.
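If the extraction dies because the PHP request itself times out (a plausible explanation for it stopping after so long, though not confirmed), you can detach the process so it keeps running after the request ends:
// Run tar in the background, detached from the PHP request, and log
// its output; a request timeout then no longer kills the extraction.
shell_exec("nohup tar xzf xxx_backup_20140921.tar.gz "
         . "--exclude='home/xxx/public_html/xxx.com.au/data' "
         . "> extract.log 2>&1 &");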

File not created (Linux-PHP-C++)

I am trying to save a file to the current directory in Linux and then display it on a webpage. Currently I run a C++ executable from a PHP script with the following code:
exec("/var/www/html/radsim/plotFluence $rMin $rMax $zMin $zMax $lum $graphStyle $basepath[$path]", $return);
When I run the executable from the console in Linux the file is created fine; the problem arises when I try from within PHP: the file is simply not in the directory. The user inputs values and the executable runs, but no file is made. The C++ looks like this:
canvas->Print(("/var/www/html/radsim/"+histoName+_FileFormat).c_str());
The permissions are set to 777. In addition, in another PHP script, I use fopen("data.txt", 'w') or die() to create a text file, but it always dies.
Seems like a sandbox. There must be a PHP config option; it's best to start here: http://php.net/manual/en/configuration.changes.php
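Before hunting through the config, it may help to capture the actual error instead of a bare die(); a small diagnostic sketch:
// Report why fopen() failed (permissions, open_basedir, wrong cwd, ...)
// and where, and as whom, the script is actually running.
$fh = @fopen('data.txt', 'w');
if ($fh === false) {
    $err = error_get_last();
    die('fopen failed: ' . ($err ? $err['message'] : 'unknown error')
        . ' (cwd: ' . getcwd()
        . ', user: ' . trim(shell_exec('whoami')) . ')');
}
fwrite($fh, "test\n");
fclose($fh);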

Creating a zipped folder at the click of a button

I have a folder with files, and at the click of a button from an application, I would like to create a zipped version of the folder. I understand that it's possible to create a tar.gz version on a UNIX system with a command such as exec("tar -cvf destination_filename.tar foldername").
Question: Is it possible to create a zipped file on a UNIX system? If it is possible, what's the logic/command behind it?
Most *nix systems will support the following command:
zip -r destination_filename.zip foldername
Use the PHP zip extension? http://www.php.net/manual/en/class.ziparchive.php
And I guess you'll find useful classes on phpclasses too: http://www.phpclasses.org/search.html?words=zip&x=0&y=0&go_search=1
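A minimal sketch with the ZipArchive class (the folder and archive names here are just examples):
// Recursively add every file in a folder to a zip archive.
$zip = new ZipArchive();
if ($zip->open('/tmp/archive.zip', ZipArchive::CREATE | ZipArchive::OVERWRITE) !== true) {
    die('Cannot create zip file');
}
$root = realpath('foldername');
$files = new RecursiveIteratorIterator(
    new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
);
foreach ($files as $file) {
    if ($file->isFile()) {
        // Store entries relative to the folder being zipped.
        $zip->addFile($file->getPathname(),
                      substr($file->getPathname(), strlen($root) + 1));
    }
}
$zip->close();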
You need to pass the -z option to tar to have it also gzip the file:
tar -czf foo.tar.gz foo/
Make sure you are in a place where the webserver has write privileges. For example, you may need to chdir into the parent folder of the folder you wish to zip, then generate a temp file name in /tmp, and build the command to zip to that temp file name.
It sounds like you have the rest figured out!
http://www.gnu.org/software/tar/manual/tar.html
