Problem editing word file in PHP - php

So I need to edit some text in a Word document. I created a Word document and saved it as XML. It is saved correctly (I can open the XML file in MS Word and it looks exactly like the docx original).
So then I use PHP DOM to edit some text in the file (just two lines) (EDIT: below is the already fixed, working version):
<?php
$firstName = 'Richard';
$lastName = 'Knop';
$xml = file_get_contents('template.xml');
$doc = new DOMDocument();
$doc->loadXML($xml);
$doc->preserveWhiteSpace = false;
$wts = $doc->getElementsByTagNameNS('http://schemas.openxmlformats.org/wordprocessingml/2006/main', 't');
$c1 = 0; $c2 = 0;
foreach ($wts as $wt) {
if (1 === $c1) {
$wt->nodeValue .= ' ' . $firstName;
$c1++;
}
if (1 === $c2) {
$wt->nodeValue .= ' ' . $lastName;
$c2++;
}
if ('First Name' === substr($wt->nodeValue, 0, 10)) {
$c1++;
}
if ('Last Name' === substr($wt->nodeValue, 0, 9)) {
$c2++;
}
}
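// Serialize the modified DOM; otherwise the unmodified input would be written
$xml = $doc->saveXML();
$xml = str_replace("\n", "\r\n", $xml);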
$fp = fopen('final-xml.xml', 'w');
fwrite($fp, $xml);
fclose($fp);
This gets executed properly (no errors). These two lines:
<w:t>First Name:</w:t>
<w:t>Last Name:</w:t>
Get replaced with these:
<w:t>First Name: Richard</w:t>
<w:t>Last Name: Knop</w:t>
However, when I try to open the final-xml.xml file in MS Word, it doesn't open (Word freezes). Any suggestions?
EDIT:
I tried using levenshtein():
$xml = file_get_contents('template.xml');
$xml2 = file_get_contents('final-xml.xml');
$str = str_split($xml, 255);
$str2 = str_split($xml2, 255);
$i = 0;
foreach ($str as $s) {
$dist = levenshtein($s, $str2[$i]);
if (0 <> $dist) {
echo $dist, '<br />';
}
$i++;
}
This printed nothing, which is weird: when I open the final-xml.xml file in Notepad, I can clearly see that those two lines have changed.
EDIT2:
Here is the template.xml file: http://uploading.com/files/61b2922b/template.xml/

This is a problem related to DOS vs. UNIX line endings. Word 2007 does not tolerate a \n line ending; it requires \r\n. Word 2010 is more tolerant and accepts both versions.
To fix the problem make sure that you replace all UNIX line breaks with DOS ones before saving the output file:
$xml = str_replace("\n", "\r\n", $xml);
Full sample:
<?php
$firstName = 'Richard';
$lastName = 'Knop';
$xml = file_get_contents('template.xml');
$doc = new DOMDocument();
$doc->loadXML($xml);
$doc->preserveWhiteSpace = false;
$wts = $doc->getElementsByTagNameNS('http://schemas.openxmlformats.org/wordprocessingml/2006/main', 't');
foreach ($wts as $wt) {
echo $wt->nodeValue;
if ('First Name:' === $wt->nodeValue) {
$wt->nodeValue = 'First Name: ' . $firstName;
}
if ('Last Name:' === substr($wt->nodeValue, 0, 10)) {
$wt->nodeValue = 'Last Name: ' . $lastName;
}
}
$xml = $doc->saveXML();
// Replace UNIX with DOS line endings
$xml = str_replace("\n", "\r\n", $xml);
$fp = fopen('final-xml.xml', 'w');
fwrite($fp, $xml);
fclose($fp);
?>
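A small variation on the replacement step, in case the input may already contain some \r\n sequences: a plain str_replace("\n", "\r\n", ...) would turn those into \r\r\n. The sketch below is only an illustration; toCrlf() is a hypothetical helper name, not something from the answer above.

```php
<?php
// Hypothetical helper (not part of the answer above): convert every
// line-ending style to CRLF without doubling existing "\r\n" pairs.
function toCrlf(string $text): string
{
    // Match CRLF first so it is replaced as one unit, then lone CR or LF
    return preg_replace('/\r\n|\r|\n/', "\r\n", $text);
}

// Made-up sample with mixed line endings
$sample = "<w:t>First Name: Richard</w:t>\n<w:t>Last Name: Knop</w:t>\r\n";
$normalized = toCrlf($sample);

// Every line now ends in exactly one "\r\n"
var_dump($normalized === "<w:t>First Name: Richard</w:t>\r\n<w:t>Last Name: Knop</w:t>\r\n");
```

In the full sample this would be applied to the string returned by $doc->saveXML() right before writing the file.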

XML Word files have certain checksums stored near the top of the DOM (to my recollection). You may have to change these, such as the size or the general checksum itself.
I know this was my problem when I was (dumb) enough to make an HTML file in Word and save it; it had thousands of useless things in it that only served to make editing worse.

Related

Modifying text and grabbing the Newest dates

I'm working with someone else's code.
Let's say I have something like this in a CSV text file (no header; this is just an example, and the file is imported into a database):
|id|partNumber|vendorNumber|dateModified|
123|18-302|fe32l|8/27/2020
123|18-302|fe32l|8/22/2020
123|18-302|fe32l|8/27/2021
321|18-3032|fe32l|8/27/2020
321|18-3032|fe32l|5/24/2022
My conclusion is that because it grabs everything rather than only the newest rows, it doesn't import the newest information.
Question
How do I modify the text file it creates so that it only writes the newest date for each id (the id and the date are the first and last fields in the text file)? It would also be fine to delete the older entries after the file is written.
I think the file is created within lines 253-297
protected function writeImportFile($tmpFile)
{
$data = file_get_contents($tmpFile);
$data = str_replace('\\', '', $data);
$data = str_replace("\t", '', $data);
file_put_contents($tmpFile, $data);
unset($data);
$stopwatch = new Stopwatch();
$stopwatch->start('write');
$tmpDir = $this->tmpDir;
$fileImport = $tmpDir . $this->tableName . '.txt';
$fileRead = new \SplFileObject($tmpFile);
$reader = new CsvReader($fileRead, '|');
$reader->setHeaderRowNumber(0);
$writer = new CsvWriter('|');
$writer->setStream(fopen($fileImport, 'w'));
$numLines = $reader->count();
$i=0;
$this->modColumns();
foreach ($reader as $row) {
$newRowArray = $this->modRow($row);
if (!empty($newRowArray)) {
$writer->writeItem($newRowArray);
}
$i++;
if($i%1000 == 0){
$this->writeLog(' import file has wrote '.$i.' of '.$numLines);
}
}
unlink($tmpFile);
$exTime = $stopwatch->stop('write');
$timeMil = $exTime->getDuration();
$this->writeLog(' Time to write import file for ' . $this->tableNameSw . ' ' . gmdate("H:i:s", $timeMil / 1000));
return $fileImport;
}
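One possible direction, sketched under assumptions: collect the rows first, keep only the newest one per id, and write those. newestRowsById() is a hypothetical helper, the rows are assumed to be plain arrays in the order id, partNumber, vendorNumber, dateModified, and the dates are assumed to use the month/day/year form shown in the example. How this would plug into modRow() and the CsvWriter is not shown in the question, so that part is left out.

```php
<?php
// Hypothetical helper: keep only the newest row for each id.
// Assumes $row[0] is the id and $row[3] is a date like 8/27/2020 (n/j/Y).
function newestRowsById(array $rows): array
{
    $newest = [];
    foreach ($rows as $row) {
        $id = $row[0];
        // "!" resets the time part so only the calendar date is compared
        $date = DateTime::createFromFormat('!n/j/Y', $row[3]);
        if ($date === false) {
            continue; // skip rows whose date cannot be parsed
        }
        if (!isset($newest[$id]) || $date > $newest[$id]['date']) {
            $newest[$id] = ['date' => $date, 'row' => $row];
        }
    }
    // Drop the helper dates and return just the rows
    return array_column($newest, 'row');
}

// Sample rows taken from the example data above
$rows = [
    ['123', '18-302', 'fe32l', '8/27/2020'],
    ['123', '18-302', 'fe32l', '8/22/2020'],
    ['123', '18-302', 'fe32l', '8/27/2021'],
    ['321', '18-3032', 'fe32l', '8/27/2020'],
    ['321', '18-3032', 'fe32l', '5/24/2022'],
];
$result = newestRowsById($rows);

foreach ($result as $row) {
    echo implode('|', $row), PHP_EOL;
}
```

Because all rows for an id must be seen before the newest one is known, the filtering has to happen before (or instead of) writing each row inside the loop.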

Can't get whitespace added to string in PHP

I'm trying to make a web form that writes its input to a flat text file, line by line. Several of the fields are not required, but the output file must contain blank spaces in place of whatever is not filled out. Here is what I'm trying:
$output = $_SESSION["emp_id"];
if(!empty($_POST['trans_date'])) {
$output .= $_POST["trans_date"];
}else{
$output = str_pad($output, 6);
}
if(!empty($_POST['chart'])) {
$output .= $_POST["chart"];
}else{
$output = str_pad($output, 6);
}
write_line($output);
function write_line($line){
$file = 'coh.txt';
// Open the file to get existing content
$current = file_get_contents($file);
// Append a new line to the file
$current .= $line . PHP_EOL;
// Write the contents back to the file
file_put_contents($file, $current);
}
However, when I check my output the spaces don't show up. Any ideas on what's going on with this? Thanks in advance!
str_pad is padding with spaces, not adding spaces. You're padding an existing value with spaces so that it is 6 characters long, not adding 6 whitespaces to the value. So if $_SESSION["emp_id"] is 6 characters long or more, nothing will be added.
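To make the difference concrete, here is a small sketch. The id value is made up, and fixedWidth() is a hypothetical helper that pads each field on its own instead of padding the accumulated string; cutting over-long values to the field width is an extra assumption, not something the question asks for.

```php
<?php
// Hypothetical helper: one fixed-width field, blank-filled when empty.
function fixedWidth(string $value, int $width): string
{
    // Pad short values with trailing spaces and cut long ones to the width
    return str_pad(substr($value, 0, $width), $width);
}

$empId = 'E12345';                        // made-up 6-character id

$padded   = str_pad($empId, 6);           // already 6 long, so nothing is added
$repeated = $empId . str_repeat(' ', 6);  // always appends 6 spaces
$line     = $empId . fixedWidth('', 6) . fixedWidth('AB', 6);

var_dump(strlen($padded), strlen($repeated), strlen($line));
```

Padding per field keeps every column at the same offset regardless of what was filled in before it.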
str_pad() won't add that number of spaces, but rather makes the string that length by adding the appropriate number of spaces. Try str_repeat():
$output = $_SESSION["emp_id"];
if(!empty($_POST['trans_date'])) {
$output .= $_POST["trans_date"];
}else{
$output = $output . str_repeat(' ', 6);
}
if(!empty($_POST['chart'])) {
$output .= $_POST["chart"];
}else{
$output = $output . str_repeat(' ', 6);
}
write_line($output);
function write_line($line) {
$file = 'coh.txt';
// Open the file to get existing content
$current = file_get_contents($file);
// Append a new line to the file
$current .= $line . PHP_EOL;
// Write the contents back to the file
file_put_contents($file, $current);
}
Cheers!

changing new line to <br> in text area

My problem is pretty simple. I want to change new lines in a textarea to <br> tags, BUT I need the final string to be one-line text. I tried using the nl2br function, but as a result I get a string with both <br> tags and new lines. I also tried to simply replace the &#013 or &#010 symbols with <br> using str_replace, but it doesn't work.
Here is sample of my latest code:
Godziny otwarcia: <textarea name="open" rows="3" cols="20">'."$openh".'</textarea>
<input type="submit" name="openb" value="Zmień"/><br>
if($_POST['openb']) {
$open = $_POST['open'];
str_replace('&#010', '<br>', $open);
change_data(21, $open);
}
The $openh is result of this:
$tab = explode('<br>', $openh);
$openh = null;
for($i=0;$i<count($tab);$i++)
$openh = $openh . $tab[$i] . '&#013';
(Yes, I know I could use str_replace; don't ask why I did it this way.)
The original $openh is $openh = 'Pon-pt 9:00-17:00<br>Środa 12:00-17:00'
You may also want to see my change_data function, as it is the reason why I need the string to be on one line, so here it is:
function change_data($des_line, $data) {
$file = 'config.php';
$lines = file($file);
$i=1;
foreach($lines as $line_num => $line) {
$wiersz[$i] = $line;
$i++;
}
$change = explode("'", $wiersz[$des_line]);
$wiersz[$des_line] = $change[0] . "'" . $data . "'" . $change[2];
$i = 1;
$f = fopen($file, 'w');
while($i <= count($wiersz)) {
fwrite($f, $wiersz[$i]);
$i++;
}
fclose($f);
header('location: index.php?p=admin');
}
I'm not a PHP specialist, so sometimes I do things the "hard" way. I had huge problems with reading the file config.php line by line, and these are the results of a few hours of effort :(
Have you tried the PHP constant PHP_EOL in your str_replace code?
$open=str_replace(PHP_EOL,"<br>",$_POST["open"]);
There is a ready-made PHP function for that named nl2br.
Source: https://stackoverflow.com/a/16376133/469161
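A sketch that covers the one-line requirement: nl2br() inserts <br> before each newline but leaves the newline in place, and PHP_EOL is the line ending of the server's platform, which need not match what the browser submits (textarea line breaks normally arrive as \r\n). Replacing all three newline variants explicitly avoids both issues. newlinesToBr() is a hypothetical helper name.

```php
<?php
// Hypothetical helper: replace every kind of line break with <br>
// so the result is a single line of text.
function newlinesToBr(string $text): string
{
    // "\r\n" comes first so a Windows line break yields one <br>, not two
    return str_replace(["\r\n", "\r", "\n"], '<br>', $text);
}

// Stand-in for $_POST['open'] as a browser would submit it
$posted = "Pon-pt 9:00-17:00\r\nŚroda 12:00-17:00";
$open = newlinesToBr($posted);

echo $open; // Pon-pt 9:00-17:00<br>Środa 12:00-17:00
```

Note that in the question's handler the return value of str_replace has to be assigned back, since str_replace does not modify its argument.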

how to find line number for DOM elements in php?

I want to check whether an <img> tag has alt="" text or not, and I also need to find the line number in the DOM where that img tag is. At the moment I have the following code written, but I am stuck on finding the line number.
For example:
$doc = new DOMDocument();
$doc->loadHTMLFile('http://www.google.com');
$htmlElement = $doc->getElementsByTagName('html');
$tags = $doc->getElementsByTagName('img');
echo $tags->item(0)->getLineNo();
foreach ($tags as $image) {
// Get sizes of elements via width and height attributes
$alt = $image->getAttribute('alt');
if($alt == ""){
$src = $image->getAttribute('src');
echo "No alt text ";
echo '<img src="http://google.com/'.$src.'" alt=""/>'. '<br>';
}
else{
$src = $image->getAttribute('src');
echo '<img src="http://google.com/'.$src.'" alt=""/>'. '<br>';
}
}
With the above code I currently get the images, with the text "No alt text" beside each image that lacks it, but I want to get the line number where that img tag appears.
For example, here the line number is 57:
56. <div class="work_item">
57. <p class="pich"><img src="images/works/1.jpg" alt=""></p>
58. </div>
Use DOMNode::getLineNo(), e.g. $line = $image->getLineNo().
HTML has no real concept of line numbers, since line breaks are just whitespace.
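A minimal sketch of that suggestion applied to the alt check. The markup is a made-up sample loaded from a string rather than a URL, and the line numbers are whatever the underlying parser reports, so they may not match the original source exactly for malformed HTML.

```php
<?php
// Made-up sample markup; in the question this would come from loadHTMLFile()
$html = "<html><body>\n"
      . "<div class=\"work_item\">\n"
      . "<p class=\"pich\"><img src=\"images/works/1.jpg\" alt=\"\"></p>\n"
      . "<p><img src=\"images/works/2.jpg\" alt=\"Second work\"></p>\n"
      . "</div>\n"
      . "</body></html>";

libxml_use_internal_errors(true);   // keep parser warnings out of the output
$doc = new DOMDocument();
$doc->loadHTML($html);

// Collect every image whose alt attribute is missing or empty
$missingAlt = [];
foreach ($doc->getElementsByTagName('img') as $image) {
    if (trim($image->getAttribute('alt')) === '') {
        $missingAlt[] = [
            'src'  => $image->getAttribute('src'),
            'line' => $image->getLineNo(),
        ];
    }
}

foreach ($missingAlt as $hit) {
    echo "No alt text: {$hit['src']} (line {$hit['line']})", PHP_EOL;
}
```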
With that in mind, you might be able to count how many newlines there are in all the text nodes preceding the target node. You might be able to do this with DOMXPath:
$xpath = new DOMXPath($doc);
$node = /* your target node */;
// Select the preceding text nodes themselves, so nested text is not counted twice
$textnodes = $xpath->query('./preceding::text()', $node);
$line = 1;
foreach ($textnodes as $textnode) $line += substr_count($textnode->nodeValue, "\n");
// $line is now the line number of the node.
Please note that I have not tested this, nor have I ever used axes in xpath.
I think I have figured out what I was trying to achieve, but I am not sure it is the right way. It does the job. Please leave comments or any other ideas on how I can improve it.
If you go to the following site and type in any URL, it will produce a report of the accessibility issues in that web page. It is an accessibility checker tool.
http://valet.webthing.com/page/
All I am trying to do is achieve that kind of layout. The code below dumps the DOM of the supplied URL and finds any image tag that does not have alternative text.
<html>
<body>
<?php
$dom = new domDocument;
// load the html into the object
$dom->loadHTMLFile('$yourURLAddress');
// keep white space
$dom->preserveWhiteSpace = true;
// nicely format output
$dom->formatOutput = true;
$new = htmlspecialchars($dom->saveHTML(), ENT_QUOTES);
$lines = preg_split('/\r\n|\r|\n/', $new); //split the string on new lines
echo "<pre>";
//find 'alt=""' and print the line number and html tag
foreach ($lines as $lineNumber => $line) {
if (strpos($line, htmlspecialchars('alt=""')) !== false) {
echo "\r\n" . $lineNumber . ". " . $line;
}
}
echo "\n\n\nBelow is the whole DOM\n\n\n";
//print out the whole DOM including line numbers
foreach ($lines as $lineNumber => $line) {
echo "\r\n" . $lineNumber . ". " . $line;
}
echo "</pre>";
?>
</body>
</html>
I would like to thank everyone who helped, especially "chwagssd" and Mike Johnson.

File manipulation search and replace csv php

I need a script that finds and then replaces a certain line in a CSV-like file.
The file looks like this:
18:110327,98414,127500,114185,121701,89379,89385,89382,92223,89388,89366,89362,89372,89369
21:82297,79292,89359,89382,83486,99100
98:110327,98414,127500,114185,121701
24:82297,79292,89359,89382,83486,99100
Now I need to change the line for category 21 (the one starting with 21:).
This is what I have so far.
The first 2 to 4 digits followed by : are a category number. Every number after that (followed by a ,) is the id of a page.
I get the ids I want (i.e. 82297 and so on) from the database.
//test 2
$sQry = "SELECT * FROM artikelen WHERE adviesprijs <>''";
$rQuery = mysql_query ($sQry);
if ( $rQuery === false )
{
echo mysql_error ();
exit ;
}
$aResult = array ();
while ( $r = mysql_fetch_assoc ($rQuery) )
{
$aResult[] = $r['artikelid'];
}
$replace_val_dirty = join(",",$aResult);
$replace_val= "21:".$replace_val_dirty;
// file location
$file='../../data/articles/index.lst';
// read the file index.lst
$file1 = file_get_contents($file);
// strip the earlier article ids from index.lst
$file3='../../data/articles/index_grp21.lst';
$file3_contents = file_get_contents($file3);
$file2 = str_replace($file3_contents, $replace_val, $file1);
if (file_exists($file)) {
echo "The file $file exists";
} else {
echo "The file $file does not exist";
}
if (file_exists($file3)) {
echo "The file $file3 exists";
} else {
echo "The file $file3 does not exist";
}
// replace the data
$file_val = $file2;
// write the file
file_put_contents($file, $file_val);
//write index_grp21.lst
file_put_contents($file3, $replace_val);
mail('info#', 'Aanbieding catergorie geupdate', 'Aanbieding catergorie geupdate');
Can anyone point me in the right direction to do this?
Any help would be appreciated.
You need to open the original file and go through each line. When you find the line to be changed, change that line.
As you can not edit the file while you do that, you write a temporary file while doing this, so you copy over line-by-line and in case the line needs a change, you change that line.
When you're done with the whole file, you copy over the temporary file to the original file.
Example Code:
$path = 'file';
$category = 21;
$articles = [111182297, 79292, 89359, 89382, 83486, 99100];
$prefix = $category . ':';
$prefixLen = strlen($prefix);
$newLine = $prefix . implode(',', $articles);
This part is just setting up the basics: The category, the IDs of the articles and then building the related strings.
Now opening the file to change the line in:
$file = new SplFileObject($path, 'r+');
$file->setFlags(SplFileObject::DROP_NEW_LINE | SplFileObject::SKIP_EMPTY);
$file->flock(LOCK_EX);
The file is locked so that no other process can edit the file while it gets changed. Next to that file, the temporary file is needed, too:
$temp = new SplTempFileObject(4096);
After setting up the two files, let's go over each line in $file and compare if it needs to be replaced:
foreach ($file as $line) {
$isCategoryLine = substr($line, 0, $prefixLen) === $prefix;
if ($isCategoryLine) {
$line = $newLine;
}
$temp->fwrite($line."\n");
}
Now the temporary file ($temp) already contains the changed line. Note that I used the UNIX end-of-line (EOL) character (\n); depending on your concrete file type this may vary.
So now, the temporary file needs to be copied over to the original file. Let's rewind the file, truncate it and then write all lines again:
$file->seek(0);
$file->ftruncate(0);
foreach ($temp as $line) {
$file->fwrite($line);
}
And finally you need to lift the lock:
$file->flock(LOCK_UN);
And that's it, in $file, the line has been replaced.
Example at once:
$path = 'file';
$category = 21;
$articles = [111182297, 79292, 89359, 89382, 83486, 99100];
$prefix = $category . ':';
$prefixLen = strlen($prefix);
$newLine = $prefix . implode(',', $articles);
$file = new SplFileObject($path, 'r+');
$file->setFlags(SplFileObject::DROP_NEW_LINE | SplFileObject::SKIP_EMPTY);
$file->flock(LOCK_EX);
$temp = new SplTempFileObject(4096);
foreach ($file as $line) {
$isCategoryLine = substr($line, 0, $prefixLen) === $prefix;
if ($isCategoryLine) {
$line = $newLine;
}
$temp->fwrite($line."\n");
}
$file->seek(0);
$file->ftruncate(0);
foreach ($temp as $line) {
$file->fwrite($line);
}
$file->flock(LOCK_UN);
This should work with PHP 5.2 and above. I use the PHP 5.4 array syntax; you can replace [111182297, ...] with array(111182297, ...) in case you're using PHP 5.2 / 5.3.
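For a small index file, the same line-by-line idea can also be sketched in memory, which makes the replacement logic easy to test on its own. This is only an illustration: replaceCategoryLine() is a hypothetical helper, it uses scalar type declarations (PHP 7 or later), and it does no locking, so the file-based version above remains the safer choice when several processes may write the file.

```php
<?php
// Hypothetical helper: return $contents with the line for $category replaced.
function replaceCategoryLine(string $contents, int $category, array $articleIds): string
{
    $prefix = $category . ':';
    // Split on any line-ending style; a trailing newline yields a final empty entry
    $lines = preg_split('/\r\n|\r|\n/', $contents);
    foreach ($lines as $i => $line) {
        // Only lines that start with "<category>:" are rewritten
        if (strncmp($line, $prefix, strlen($prefix)) === 0) {
            $lines[$i] = $prefix . implode(',', $articleIds);
        }
    }
    return implode("\n", $lines);
}

// Shortened sample in the format from the question
$index = "18:110327,98414\n21:82297,79292\n98:110327,98414\n";
$updated = replaceCategoryLine($index, 21, [89359, 89382]);

echo $updated;
```

With file_get_contents() and file_put_contents() around it, this covers the read and write steps for files that comfortably fit in memory.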
