PHPExcel: finding page breaks when creating a worksheet on the fly - php

For development, I use PHPExcel (https://github.com/PHPOffice/PHPExcel) to create an Excel (xlsx) document filled with data from a MySQL database. So far this works well.
I also want to print this Excel document as a hard copy to hand out to people who will read it on paper. Printing itself won't be a problem.
The problem is that it's a long document, with far more rows than will fit on one or two pages. The worksheet consists of blocks that I want to keep together, so that each block is printed on a single page.
I can use a method to set a break on a row, and I will be able to get that to work. But after I finish putting the data on the sheet, I don't know where the automatically placed page breaks are. There is a function (getBreaks()) on the worksheet which should provide an array with the breaks, but the returned array remains empty, so I can't find out which blocks span a break.
Can someone help me?
Should I first save the Excel sheet, then open it again and do the job then?
The Excel document is created on a shared hosting web server. There are no printing options there, I guess, at least none that I can manage.
Another solution is to put a macro in the Excel document and let the macro run (to put the breaks in the right places) and print the sheet. (Not my favorite solution, to be honest.)
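One possible way around not knowing the automatic breaks is to avoid relying on them: estimate how many rows fit on a page and place a manual break before any block that would not fit. The sketch below is only an illustration under assumptions that are not in the question: all rows have the same height, and the rows-per-page value (40 here) has been measured once by printing a test sheet. The function name planRowBreaks is made up for this example. The returned row numbers could then be passed to the worksheet's setBreak() method.

```php
<?php
/**
 * Plan manual row breaks so that no block is split across two pages.
 *
 * $blockSizes  number of rows in each block, in sheet order
 * $rowsPerPage ASSUMED number of rows that fit on one printed page
 *              (measure it once for your row height, paper and margins)
 * $firstRow    sheet row where the first block starts
 *
 * Returns the row numbers AFTER which a page break should be inserted.
 */
function planRowBreaks(array $blockSizes, int $rowsPerPage, int $firstRow = 1): array
{
    $breaks = [];
    $row = $firstRow;   // first row of the block being placed
    $usedOnPage = 0;    // rows already placed on the current page

    foreach ($blockSizes as $size) {
        // If the block does not fit in what is left of the page, start a new page.
        if ($usedOnPage > 0 && $usedOnPage + $size > $rowsPerPage) {
            $breaks[] = $row - 1;   // break after the last row of the previous block
            $usedOnPage = 0;
        }
        // Note: a block larger than a whole page will still be split by Excel.
        $usedOnPage += $size;
        $row += $size;
    }

    return $breaks;
}

// Blocks of 10, 25, 30 and 5 rows with 40 rows per page: the third block
// would cross the page boundary, so a break is planned after row 35.
$breaks = planRowBreaks([10, 25, 30, 5], 40);

// Applying the plan with PHPExcel (needs the library, so shown as a comment):
// foreach ($breaks as $r) {
//     $sheet->setBreak('A' . $r, PHPExcel_Worksheet::BREAK_ROW);
// }
```

If row heights vary (wrapped text, for example), the same idea would have to sum estimated heights instead of counting rows.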

Related

Issue converting to .pdf a merged .docx file that opens fine in Word

So, I have the following scenario.
I am working on a system for academic papers. I have several inputs for things like author name, coauthors, title, type of paper, introduction, objectives and so on. I store all that information in a database. The user has a Preview button which, when clicked, generates a Word file asynchronously and sends the file location back to the user; that file is then shown to the user in an iframe using Google Doc Viewer.
There's a specific use case where the user/author of the paper can attach a .docx file with a table, or a .jpeg file for a figure. That table/figure has to be included inside the final .docx file.
For the .docx generation process I am using PHPWord.
So up until this point everything works fine, but my issues start when I try to mix everything and put together the .docx file.
Approach Number One
My first approach on doing this was to do everything with PHPWord. I create the file, add the texts where required and in the case of the image just insert the image and after that the figure caption below the image.
Things get tricky though, when I try doing the same thing with the .docx table file. My only option was to get the table XML using this. It did the trick, but the problem I ran into was that when I opened the resulting Word file, the table was there, but had lost all of its styling and had transparent borders. Because of those transparent borders, afterwards when converting it to PDF the borders were ignored and the table info is just scrambled text.
Approach Number Two (current one)
After fighting with Approach Number One and just complicating stuff more, I decided to do something different. Since I already generated one docx file with the main paper information and I needed to add another docx file, I decided to use the DocX Merge Library.
So, what I basically did was generate three Word files: one for the main paper information, one for the table and one for the table caption (that last one is mainly to not overcomplicate the order of information). Also, that data is not in the table .docx file.
Then I run this:
$dm->merge( [
'paper-info.docx',
'attached-table.docx',
'attached-table-caption.docx'
], 'complete-file.docx');
So, afterwards, I check and the Word file is generated just as I need it with the table maintaining its original styles and dimensions.
If I open it in LibreOffice though, I get this error message:
Then if I continue and open the file, the file opens correctly with all the data with the only exception that it no longer respects the fonts of the file as they appear in Word.
So, the problem comes in the next step. Since I need to present a preview of the file using Google Doc Viewer using this syntax:
<iframe src="https://docs.google.com/gview?embedded=true&hl=es_LA&url=https://usersite.net/complete-file.docx?pid=explorer&efh=false&a=v&chrome=false&embedded=true" width="100%" height="600" style="border: none;"></iframe>
The document gets loaded fine, but when I review it what I see is that it only shows the content of the first paper-info.docx file and ends right where the table and table caption should appear. I open the exact same file in Word and it shows the table and caption.
The other issue is when I try to convert the file to PDF.
If I use PHPWord's method of conversion in combination with DomPDF I get the exact same issue as with the Google Docs Viewer, I just have the content of the first file, using this code:
// Load the merged document and write it out through PHPWord's PDF writer
$phpWordPDF = \PhpOffice\PhpWord\IOFactory::load('complete-file.docx');
$xmlWriterPDF = \PhpOffice\PhpWord\IOFactory::createWriter($phpWordPDF, 'PDF');
$xmlWriterPDF->save('complete-file.pdf');
So my only other viable route was to use LibreOffice's command line using this command:
soffice --headless --convert-to pdf complete-file.docx
This converts the file correctly, but it has the issue mentioned above when opening the .docx file in LibreOffice: the fonts are not preserved.
The weird part is that if I try to run this in my PHP script:
shell_exec('soffice --headless --convert-to pdf complete-file.docx');
Nothing happens.
I am running Apache 2.4.25, PHP 7.4.11 on Windows 10 x64.
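When shell_exec() appears to do nothing, it often helps to capture the error stream and the exit code, because shell_exec() only returns standard output. The sketch below is a diagnostic aid, not a confirmed fix for this case. The helper name runCommand and the soffice.exe path are assumptions made for illustration; under Apache the PATH and the working directory can differ from an interactive console, which is why absolute paths are used.

```php
<?php
/**
 * Run a shell command and return its exit code together with everything it
 * printed, including stderr (redirected with 2>&1).
 */
function runCommand(string $command): array
{
    $output = [];
    $exitCode = 0;
    exec($command . ' 2>&1', $output, $exitCode);

    return ['exitCode' => $exitCode, 'output' => implode("\n", $output)];
}

/**
 * Build the LibreOffice conversion command with absolute, quoted paths.
 * $soffice is the full path to the soffice executable (an ASSUMPTION here).
 */
function buildConvertCommand(string $soffice, string $docx, string $outDir): string
{
    return escapeshellarg($soffice)
        . ' --headless --convert-to pdf'
        . ' --outdir ' . escapeshellarg($outDir)
        . ' ' . escapeshellarg($docx);
}

// Usage (commented out so the sketch has no side effects):
// $cmd = buildConvertCommand(
//     'C:\\Program Files\\LibreOffice\\program\\soffice.exe', // hypothetical install path
//     __DIR__ . '\\complete-file.docx',
//     __DIR__
// );
// $result = runCommand($cmd);
// error_log($cmd . ' => exit ' . $result['exitCode'] . ': ' . $result['output']);
```

A non-zero exit code or an error message in the output usually points to a path or permission problem for the account the web server runs as.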
Conclusion
Until now my best result was by merging the files, but it also caused this issue. So maybe the issue is coming from the merging process I am using. What would be ideal is to be able to just insert the table with styles and everything using PHPWord, but I haven't been able to and haven't found any examples on how to do that.
Another option that I've seen is this library, but the merge feature is only in the license that costs $599 USD, and since I am pretty close to solving this, I am not sure if it would solve my issue. If it does, I'd invest in it since I need to get this done ASAP, but I wanted to check with you guys what your recommendations would be for this case. Maybe another merging library or doing everything via PHPWord.
Help is appreciated!
After a lot of attempts to fix it, I wasn't able to achieve what I wanted with PHPWord and the merging library I mentioned.
Since I needed to fix this I decided to invest in the paid library I mentioned in my question. It was an expensive purchase, but for those who are interested, it does exactly what was required and it does it perfectly.
The two main functions I required were document merging and importing of content to a .docx file.
So I had to purchase the Premium package. Once there, the library literally does everything for you.
Example code for merging docx files:
require_once 'classes/MultiMerge.php';
$merge = new MultiMerge();
$merge->mergeDocx('document.docx', array('second.docx', 'other.docx'), 'output.docx', array());
Example of how to import a table from another docx file:
require_once 'classes/CreateDocx.php';
$docx = new CreateDocxFromTemplate('document.docx');
// import tables
$referenceNode = array(
'type' => 'table',
);
$docx->importContents('document_1.docx', $referenceNode);
$docx->createDocx('output');
As you can see it is pretty easy. This answer is by no means an ad for this library, but for those that have the same problem as me, this is a life saver.

Setting Active Cell for excel generated by PHPSpreadsheet

I am using PHPSpreadsheet
I am using freezePane('A9') in every excel generated by our web application.
At the end, I am also adding setSelectedCell('A9');.
But when I open the Excel file, it opens with "A10" as the active cell, and row 9 is scrolled out of view, so one has to scroll up to see row "9".
Check here an image of the issue.
This sounds a little like your issue, https://github.com/PHPOffice/PhpSpreadsheet/issues/389
Their suggested fix is this:
can be corrected by explicitly providing 'topLeftCell' argument:
$spreadsheet->getActiveSheet()->freezePane('A2','A2');

Extracting dynamically changing data in excel via php

I have an open excel sheet that's constantly being updated by another program via DDE. I wish to have a php script that accesses some of the data in this excel sheet. I have tried using PHPExcel and it seems that I cannot have the changes I make (e.g. via setCellValue) being immediately reflected in the open excel sheet. Similarly, if I change value of a cell (without saving sheet to the file system) the new value of the cell is not available via getValue().
Is this functionality supported by phpExcel? If so, could someone please point me to documentation that shows how this can be done? Alternatively, is there another way (not using phpExcel, for example) to do this?
Thanks.
I was able to do this using the method shown at the rarified blog webpage
This worked for me, both for "pushing" cell values from php to excel, as well as getting modified values (without saving the file) from excel to perl. This site also has a nifty ajax-based function that keeps auto-refreshing my webpage with the latest values in excel.
Many thanks to the author of the blog.
It's not supported by PHPExcel. PHPExcel loads the workbook into memory at the point in time when you issue the load() call, and from then on it can't "autorefresh" whenever the workbook is changed by your DDE, because the DDE update is to the workbook on disk, not the PHPExcel copy that's in PHP memory.
You'd need to be constantly loading and reloading to pick up changes to the underlying file.
Likewise, if you change the workbook in PHPExcel, it doesn't write that change back to the file on disk unless you explicitly save(), so the change will not be visible to your DDE program.
I'm not aware whether you can even do this with MS Excel itself... if you load a workbook using MS Excel itself, you're loading from disk into memory, and if anything else is accessing that workbook at the time, you find that you've loaded it in read-only mode, and (as far as I'm aware) it won't automatically refresh whenever the DDE program updates the original version. If anything can work with this the way you need, it's likely to be COM, but I wouldn't build up your hopes too much.

PHPExcel large data sets with multiple tabs - memory exhausted

Using PHPExcel I can run each tab separately and get the results I want, but if I add them all into one Excel file it just stops, with no error or anything.
Each tab consists of about 60 to 80 thousand records and I have about 15 to 20 tabs, so up to about 1,600,000 records split across multiple tabs (this number will probably grow as well).
I have also tested the 65,000-row limitation of .xls by using the .xlsx extension, with no problems if I run each tab in its own Excel file.
Pseudo code:
read data from db
start the PHPExcel process
parse out data for each page (some styling/formatting but not much)
(each numeric field value does get summed up in a totals column at the bottom of the excel using the formula SUM)
save excel (xlsx format)
I have 3GB of RAM so this is not an issue and the script is set to execute with no timeout.
I have used PHPExcel in a number of projects and have had great results but having such a large data set seems to be an issue.
Has anyone ever had this problem? Any workarounds, tips, etc.?
UPDATE:
The error log says: memory exhausted
Besides adding more RAM to the box, are there any other things I could try?
Has anyone ever saved the current state and then edited the Excel file with new data?
I had the exact same problem, and googling around did not turn up a usable solution.
PHPExcel generates objects and stores all data in memory before finally generating the document file, which itself is also held in memory, so setting higher memory limits in PHP will never entirely solve this problem; that solution does not scale very well.
To really solve the problem, you need to generate the XLS file "on the fly". That's what I did, and now I can be sure that "download SQL result set as XLS" works no matter how many (million) rows are returned by the database.
The pity is, I could not find any library that features "drive-by" XLS(X) generation.
I found this article on IBM developerWorks, which gives an example of how to generate the XLS XML "on the fly":
http://www.ibm.com/developerworks/opensource/library/os-phpexcel/#N101FC
Works pretty well for me - i have multiple sheets with LOTS of data and did not even touch the PHP memory limit. Scales very well.
Note that this example uses the Excel plain XML format (file extension "xml") so you can send your uncompressed data directly to the browser.
http://en.wikipedia.org/wiki/Microsoft_Office_XML_formats#Excel_XML_Spreadsheet_example
If you really need to generate an XLSX, things get even more complicated. XLSX is a compressed archive containing multiple XML files. For that, you must write all your data to disk (or memory, which is the same problem as with PHPExcel) and then create the archive from that data.
http://en.wikipedia.org/wiki/Office_Open_XML
It may also be possible to generate compressed archives "on the fly", but this approach seems really complicated.
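To make the "on the fly" idea concrete, here is a minimal sketch of streaming the plain Excel XML format row by row, so that only one row is held in memory at a time. It is not the code from the IBM article. The function names streamSpreadsheetXml and demoRows are made up for this example, and the generator stands in for a database cursor. Only the bare elements of the format (Workbook, Worksheet, Table, Row, Cell, Data) are written, with no styling or formulas.

```php
<?php
/**
 * Stream a workbook in the Excel 2003 XML format to an open stream.
 *
 * $out    writable stream resource, e.g. fopen('php://output', 'w')
 * $sheets iterable of sheet name => iterable of rows (each row an array of scalars)
 */
function streamSpreadsheetXml($out, iterable $sheets): void
{
    fwrite($out, '<?xml version="1.0" encoding="UTF-8"?>' . "\n");
    fwrite($out, '<?mso-application progid="Excel.Sheet"?>' . "\n");
    fwrite($out, '<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"'
        . ' xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">' . "\n");

    foreach ($sheets as $name => $rows) {
        $safeName = htmlspecialchars((string) $name, ENT_QUOTES | ENT_XML1);
        fwrite($out, '<Worksheet ss:Name="' . $safeName . '"><Table>' . "\n");

        foreach ($rows as $row) {
            $cells = '';
            foreach ($row as $value) {
                // Integers and floats become numeric cells, everything else text.
                $type = (is_int($value) || is_float($value)) ? 'Number' : 'String';
                $cells .= '<Cell><Data ss:Type="' . $type . '">'
                    . htmlspecialchars((string) $value, ENT_QUOTES | ENT_XML1)
                    . '</Data></Cell>';
            }
            // Each row is written immediately and then forgotten.
            fwrite($out, '<Row>' . $cells . "</Row>\n");
        }

        fwrite($out, "</Table></Worksheet>\n");
    }

    fwrite($out, "</Workbook>\n");
}

/** Hypothetical row source; in practice this would fetch rows from the database. */
function demoRows(int $count): Generator
{
    for ($i = 1; $i <= $count; $i++) {
        yield [$i, 'row ' . $i];
    }
}

// Usage (commented out so the sketch produces no output by itself):
// header('Content-Type: application/xml');
// header('Content-Disposition: attachment; filename="export.xml"');
// streamSpreadsheetXml(fopen('php://output', 'w'), ['Tab 1' => demoRows(80000)]);
```

Because rows come from a generator, memory use stays roughly constant regardless of the number of rows, at the cost of a larger, uncompressed file.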

CSV files and multi line text cells

I am generating a simple csv file using php. The file contains some user's personal data.
When I open the generated file in MS Office, the addresses are not displayed in full height. I have to double-click on the cell for the address to be shown fully (in full width and height); otherwise I can only see the first word/number of the address.
Also, dates of birth are displayed as ######, and I have to widen the whole column to see them in full.
This doesn't happen in OpenOffice.
Is there any way to force MS Office to show all fields in full? Otherwise it'll be too confusing for the people who will use it ("Hey, where are all the details?" :)
Thanks :)
I don't think you can "format" your sheets with CSV. You will have to produce some other file format that Excel understands. I would suggest XML which is really easy to generate.
Just make a sample sheet with the data you want, save it as XML and you'll see how your file should be generated.
Or you could use some ready-made PHP solution for writing excel files if you can't be bothered with analysing the XML file.
You could try the auto-size columns feature.
This is a UI issue with how Excel works; you can't force Excel or any other program to display it differently.
The quickest workaround is perhaps to create an XLS file that runs a macro to retrieve the CSV file and format the cells as needed, but there's nothing you can do inside the CSV to affect what Excel displays.
