Ok, here's the scenario. I need to generate about 200 MS Word documents based on data collected and stored in my database. Generating the Word docs with or without photos is a user option. After the Word doc generation I then want to create a zip file of all the files generated. The zip part is done, and the Word file generation is for the most part done.
When the user chooses to generate the reports without the photos, the site queries the database and returns about 200 records for the report. Then, with a foreach loop, I run the PHPWord code to generate and write the files to a temp folder, and after the foreach loop I run code to zip them all up and delete the temp files. Works great. BUT, when the option to generate the reports WITH photos is selected, it starts generating the Word docs, but the file sizes increase every time a file is created. The first file is 70k, the second file is 140k, the third is 210k and so on, where each file should only be about 70k. The only difference between the two operations is the inclusion of the addImage calls within the table cells, like so:
$table->addCell()->addImage('photos/thumb_image.jpg', $imageStyle);
Help please!
Use Media::resetElements() between iterations of the loop; it has been available since PHPWord 0.10.0.
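A minimal sketch of how that might fit into the generation loop. Everything here ($records, $imageStyle, the temp paths) is a placeholder based on the question, not tested code:

use PhpOffice\PhpWord\PhpWord;
use PhpOffice\PhpWord\IOFactory;
use PhpOffice\PhpWord\Media;

foreach ($records as $i => $record) {
    $phpWord = new PhpWord();
    $section = $phpWord->addSection();
    $table   = $section->addTable();

    $table->addRow();
    $table->addCell()->addText($record['title']);
    $table->addCell()->addImage('photos/thumb_image.jpg', $imageStyle);

    IOFactory::createWriter($phpWord, 'Word2007')->save("temp/report_{$i}.docx");

    // Clear PHPWord's static media registry so images from earlier
    // documents are not embedded again in every later one.
    Media::resetElements();
}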
So I'm writing a script to delete images from my server. Basically I have a table in my Database which contains a list of buildings and each building has multiple images associated with an id. I'm saving my images on the server in a single folder and each image has the following naming format:
buildingID_imagename.jpg. For example, if I have a building with id=23, my images in my folder will appear as 23_imagename1.jpg, 23_imagename2.jpg, etc.
Now, I know how to delete an image in PHP using the unlink function. However, to delete all the images, I need to check each file name one by one, do a split-string manipulation, check for the id, and then delete. The issue arises when I have around 10,000 images in that folder. Although it will work, it becomes an expensive task.
My question, is there a simple way to check the image name and delete it from the folder?
Thanks
EDIT
After typing this, I just thought of one possible way: getting all the image links from my database table into an array, looping through it, and deleting just those. Would that be a good method? Of course, after I get the images into an array, I'm also deleting them from the table.
Iterate over the dataset, check if the file exists, and remove the file.
Maybe execute it as a cron job in case you think there could be thousands of files in this operation.
if (file_exists($fileName)) {
    unlink($fileName);
}
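A fuller sketch of that idea, assuming the image paths live in a building_images table; the table, column, and folder names here are placeholders, not from the question:

// Assumed schema: building_images(building_id, filename)
$stmt = $pdo->prepare('SELECT filename FROM building_images WHERE building_id = ?');
$stmt->execute([$buildingId]);

foreach ($stmt->fetchAll(PDO::FETCH_COLUMN) as $fileName) {
    $path = 'images/' . $fileName;
    if (file_exists($path)) {
        unlink($path);
    }
}

// Remove the rows from the table as well, as the question suggests.
$pdo->prepare('DELETE FROM building_images WHERE building_id = ?')->execute([$buildingId]);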
You can also use array_map() with glob():
<?php array_map('unlink', glob('23_*.jpg')); ?> // deletes every .jpg file whose name starts with 23_
With the glob function you can use shell-style wildcard patterns to efficiently find the files you want to delete.
http://php.net/manual/en/function.glob.php
I have somewhere in the region of 60,000 URLs that I want to submit to Google. Given the restriction of 10,000 URLs per file, I'm going to need to make a sitemap index and link to at least 6 sitemap files from that index.
I don't know what the most efficient way of doing this is. My idea was to go to my DB, take the TOP 10,000 rows, run my foreach on the data, and generate my links. My first idea was to create placeholder sitemap files (e.g. sm1.xml, sm2.xml, etc.) and after each 10,000 rows increment the file index and insert the next 10,000 into the next file. The problem is that the data in the DB is always being added to, so next month I could have 70,000 URLs, meaning I'd have to create another placeholder file.
So with this in mind, I'd like to create the individual sitemap files dynamically but I don't know how.
Some ideas that might help you on your way to building a sitemap generator for your project:
get the urls from your route.php file
get the classes/methods using the Reflection classes
get the data from the database or text file
Loop through each data set as you stated above and create indexed files for them (a rough sketch follows after the ping URLs below).
use a CRON job to index your files via ping.
Use the ping service provided by these search engines.
You should maybe only ping the services at the end of each day or every second day;
don't ping them every time a new row is created!
Google Ping
http://www.google.com/webmasters/sitemaps/ping?sitemap=http://www.yourdomain.com/sitemap.xml
MSN
http://www.bing.com/webmaster/ping.aspx?siteMap=http://www.yourdomain.com/sitemap.xml
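A rough sketch of the chunking step, writing one sitemap file per 10,000 URLs plus an index that links to them. The table name, column, output paths, and the $pdo connection are assumptions:

$urls   = $pdo->query('SELECT url FROM pages ORDER BY id')->fetchAll(PDO::FETCH_COLUMN);
$chunks = array_chunk($urls, 10000);

$index = '<?xml version="1.0" encoding="UTF-8"?>' . "\n"
       . '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";

foreach ($chunks as $i => $chunk) {
    $file = sprintf('sm%d.xml', $i + 1);                 // sm1.xml, sm2.xml, ...
    $xml  = '<?xml version="1.0" encoding="UTF-8"?>' . "\n"
          . '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
    foreach ($chunk as $url) {
        $xml .= '  <url><loc>' . htmlspecialchars($url) . '</loc></url>' . "\n";
    }
    $xml .= '</urlset>';
    file_put_contents($file, $xml);

    $index .= '  <sitemap><loc>http://www.yourdomain.com/' . $file . '</loc></sitemap>' . "\n";
}

$index .= '</sitemapindex>';
file_put_contents('sitemap.xml', $index);

Because the number of files comes from array_chunk(), next month's 70,000 URLs simply produce a seventh file without any placeholder files being created up front.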
I'm making a CMS where, once a user searches for something, a cache file (CSV) is generated using MySQL, and after that the same CSV is included and served by PHP for the same search.
Now I want to allow users to filter data from that same cache/static file using jQuery.
I have two options:
Make a DB query to generate the result based on the user's filter parameters.
Read that cache/static file (which is in CSV format) and generate the result based on the user's parameters using PHP only (a rough sketch of this follows below).
Both my database and CSV files are small: about 2,000 rows in the MySQL database and at most 500 lines in a CSV file. The average length of a CSV file would be around 50 lines. There will be several (say about 100) CSV files for different searches.
Which technique will be faster and more efficient? I'm on a shared host.
Search results are like product information on eCommerce websites.
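For reference, option 2 could look roughly like this; the cache file name, the column names, and $filters are placeholders, not part of the question:

$matches = [];
if (($fh = fopen('cache/search_123.csv', 'r')) !== false) {
    $header = fgetcsv($fh);                       // first line holds column names
    while (($row = fgetcsv($fh)) !== false) {
        $product = array_combine($header, $row);
        if ((float) $product['price'] <= $filters['max_price']) {
            $matches[] = $product;
        }
    }
    fclose($fh);
}
echo json_encode($matches);                       // returned to the jQuery front end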
MySQL servers in a shared-host context are most of the time ridiculously overloaded and may be very slow or unresponsive at times.
If you want a workaround, you could make your PHP script create a CSV file from the data table for the first user of the day, then read the CSV file for the rest of the day.
Because you're on a shared host, a total of 2K rows is not a problem, but the hard-disk I/O is. Put the database search results in memory, such as in a MySQL MEMORY engine table, or better yet, let Redis manage the cache with a TTL.
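A minimal sketch of the Redis-with-TTL idea using the phpredis extension; the key naming, the one-hour TTL, and fetchResultsFromMysql() are assumptions:

$redis = new Redis();
$redis->connect('127.0.0.1', 6379);

$cacheKey = 'search:' . md5($searchTerm);
$json     = $redis->get($cacheKey);

if ($json === false) {
    $rows = fetchResultsFromMysql($searchTerm);   // your existing DB query
    $json = json_encode($rows);
    $redis->setex($cacheKey, 3600, $json);        // keep it for one hour (TTL)
}

$results = json_decode($json, true);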
I have ~280,000 files that will need to be searched through, and the proper file returned and opened. The file names are exact matches of the expected search terms.
The search terms will be taken by an input box using PHP. What is the best way to accomplish this so that searches do not take a large amount of time?
Thanks!
I suspect the file system itself will struggle with 280,000 files in one directory.
An approach I've taken in the past is to put those files in subdirectories based upon the initial letters of the filename e.g.
1/100000.txt
1/100001.txt
...
9/900000.txt
etc. You can subdivide further using the second letter etc.
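Since the file names are exact matches of the search terms, the lookup side could be as simple as this sketch; the .txt extension and the two-level split are assumptions:

// Map a search term straight to its sharded path, no directory scan needed.
function shardedPath(string $term, string $baseDir = 'files'): string
{
    $first  = substr($term, 0, 1);
    $second = substr($term, 1, 1) ?: '_';         // fall back for 1-character names
    return "$baseDir/$first/$second/$term.txt";
}

$path = shardedPath($searchTerm);
if (is_file($path)) {
    readfile($path);                              // return the matching file
}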
It's good you added mysql to your tags. Ideally I would have a CRON task that would index the directories into a MySQL table and use that to do the actual search. A database lookup (relational algebra) is faster than file-system iteration. You could run the task daily or hourly depending on how often your files change. Or use something like Guard to monitor the file system for changes and make the appropriate updates.
See: https://github.com/guard/guard
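A sketch of that cron-indexing idea; the file_index table, its columns, and the directory layout are assumptions:

$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');

// Rebuild the index (run this from cron, e.g. hourly or daily).
$pdo->exec('TRUNCATE TABLE file_index');
$insert = $pdo->prepare('INSERT INTO file_index (name, path) VALUES (?, ?)');
$files  = new RecursiveIteratorIterator(new RecursiveDirectoryIterator('files'));
foreach ($files as $file) {
    if ($file->isFile()) {
        $insert->execute([$file->getBasename('.txt'), $file->getPathname()]);
    }
}

// At search time: one indexed lookup instead of scanning 280,000 files.
$find = $pdo->prepare('SELECT path FROM file_index WHERE name = ?');
$find->execute([$searchTerm]);
$path = $find->fetchColumn();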
Basically, I have a simple form which the user uses for uploading files. Files should be stored under the /files/ directory with some subdirectories for splitting the files almost equally, e.g. /files/sub1/sub2/file1.txt.
Also, I need to not store equal files (by filename).
I have my own solution: calculate sha1 from the filename, take the first 5 symbols (abcde, for example) and put the file in /files/a/b/c/d/e/. This works well, but it leads to situations where one folder contains 4k files and another 6k. Is there any way to make the file counts closer to each other? The max file count can be 10k or 10kk.
Thanks for any help.
P.S. Maybe I explained something wrong, so once again :) The task is simple: you have only HTML and PHP (without any DB) and a files directory where you should store only the uploaded files, without any data of your own. You should develop a script that can handle storing uploads in the files directory without storing duplicates (by filename) and that splits the uploaded files into subdirectories by file count (optimally, the file counts in the directories should be close to each other).
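For reference, the sha1-prefix scheme described above looks roughly like this; the function name and the upload handling are placeholders:

// Build /files/a/b/c/d/e/filename from the first 5 hex chars of sha1(filename).
function sha1Path(string $fileName, string $baseDir = 'files'): string
{
    $prefix = substr(sha1($fileName), 0, 5);
    return $baseDir . '/' . implode('/', str_split($prefix)) . '/' . $fileName;
}

$target = sha1Path($_FILES['upload']['name']);
if (!is_dir(dirname($target))) {
    mkdir(dirname($target), 0755, true);
}
if (!file_exists($target)) {                      // skip duplicates by filename
    move_uploaded_file($_FILES['upload']['tmp_name'], $target);
}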
I have no idea why you want it that way. But if you REALLY have to do it this way, I would suggest you set a limit on how many bytes are stored in each folder. Every time you have to save the data, you open a log with:
the current sub
the total number of bytes written to that directory
If necessary, you create a new subdirectory (you could use the current timestamp because it won't repeat) and reset the byte count.
Then you save the file and increment the byte count by the number of bytes written.
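A minimal sketch of that byte-budget idea; the state file, the 50 MB limit, and the timestamp-named subdirectories are assumptions, not from the answer:

$limit = 50 * 1024 * 1024;                        // assumed max bytes per subdirectory
$state = is_file('files/state.json')
    ? json_decode(file_get_contents('files/state.json'), true)
    : ['sub' => (string) time(), 'bytes' => 0];

$size = filesize($_FILES['upload']['tmp_name']);
if ($state['bytes'] + $size > $limit) {
    $state = ['sub' => (string) time(), 'bytes' => 0];   // start a fresh directory
}

$dir = 'files/' . $state['sub'];
if (!is_dir($dir)) {
    mkdir($dir, 0755, true);
}

$target = $dir . '/' . basename($_FILES['upload']['name']);
if (!file_exists($target)) {                      // do not store duplicates by filename
    move_uploaded_file($_FILES['upload']['tmp_name'], $target);
    $state['bytes'] += $size;
}
file_put_contents('files/state.json', json_encode($state));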
I highly doubt it is worth the work, but I do not really know why you want to distribute the files that way.