How to download attachment on email using php - php

I have tried to save attachment but i couldn't download attachment with this code .it create a folder for downloading file. any one? How to save a attachment on email in php?
thanks in advance
$struct = imap_fetchstructure($mbox,$msgno);
$contentParts = count($struct->parts);
if ($contentParts >= 2)
{
for ($i=2;$i<=$contentParts;$i++)
{
$att[$i-2] = imap_bodystruct($mbox,$msgno,$i);
}
for ($k=0;$k<count($att);$k++)
{
if ($att[$k]->parameters[0]->value == "us-ascii" || $att[$k]->parameters[0]->value == "US-ASCII")
{
if ($att[$k]->parameters[1]->value != "")
{
$selectBoxDisplay[$k] = $att[$k]->parameters[1]->value;
}
}
elseif ($att[$k]->parameters[0]->value != "iso-8859-1" && $att[$k]->parameters[0]->value != "ISO-8859-1")
{
$selectBoxDisplay[$k] = $att[$k]->parameters[0]->value;
}
}
}
//print_r($att);
if (count($selectBoxDisplay) > 0)
{
for ($j=0;$j<sizeof($selectBoxDisplay);$j++)
{
$strFileName = $att[$j]->parameters[0]->value;
$strFileType = strrev(substr(strrev($strFileName),0,4));
$fileContent = imap_fetchbody($mbox,$msgno,$j+2);
$ret = downloadFile($strFileType,$strFileName,$fileContent);
$selectBoxDisplay[$j] = str_replace($search,'',$selectBoxDisplay[$j]);
$filenametmp = $selectBoxDisplay[$j];
$handle = fopen($filenametmp, "w+");
fwrite($handle, $ret);
fclose($handle);
//chmod($filenametmp, 777);
}
}

Related

Why base64_encode() return null

I have a 22M docx file and want to encode it using base64_encode() function in php. But It always returns NULL value after running this function. Is there any limit file size or condition for this function. My code:
$handle = fopen($fullpathfile, "rb");
$imagestr = base64_encode(fread($handle, filesize($fullpathfile)));
fclose($handle);
Try this code
$fh = fopen($fullpathfile, 'rb');
$cache = '';
$eof = false;
while (1) {
if (!$eof) {
if (!feof($fh)) {
$row = fgets($fh, 4096);
} else {
$row = '';
$eof = true;
}
}
if ($cache !== '')
$row = $cache.$row;
elseif ($eof)
break;
$b64 = base64_encode($row);
$put = '';
if (strlen($b64) < 76) {
if ($eof) {
$put = $b64."\n";
$cache = '';
} else {
$cache = $row;
}
} elseif (strlen($b64) > 76) {
do {
$put .= substr($b64, 0, 76)."\n";
$b64 = substr($b64, 76);
} while (strlen($b64) > 76);
$cache = base64_decode($b64);
} else {
if (!$eof && $b64{75} == '=') {
$cache = $row;
} else {
$put = $b64."\n";
$cache = '';
}
}
if ($put !== '') {
echo $put;
}
}
fclose($fh);

jQuery/PHP File Upload Times Out Before Loading Next Page

So I am developing the following image upload script, based off an existing open-source script. It's currently viewable live here: http://images.oneightynyc.com/
Now if you take any series of regular sized images (under 5mb) and proceed to upload them, the upload process goes just fine. Uploads the files, and brings you to a page that displays the link codes to those files. However let's say you upload a few large images, like the following:
http://imaging.nikon.com/lineup/dslr/d90/img/sample/pic_005b.jpg
http://imaging.nikon.com/lineup/dslr/d90/img/sample/pic_003b.jpg
The uploads happen in the process, however the script never brings you to the uploaded page. The only way I am aware that the upload has actually taken place is if I browse to the Gallery page and see that the files are listed there.
Here is the uploader.php file which handles the upload:
<?
//ob_start();
session_start();
$auth_id=$_SESSION['userid'];
if (!$auth_id || empty($auth_id) || $auth_id==""){
$auth_id = 0;
}
require_once("config.php");
require_once("limits.php");
require_once("ftp.class.php");
require_once("func.php");
$link = mysql_connect($db_server, $db_user, $db_password) or die("Could not connect to the database.");
mysql_select_db($db_name) or die("Could not select the database.");
if ($config[Uploads] == 0) {
$msg= "<center><b><br><br><br>Uploads are temporarily disabled by the site admin</center></b>";
}
else if ($config[Uploads] == 1 && !$auth_id) {
$msg= "<center><b><br><br><br>You have to Register before you will be able to upload photos.</center></b>";
}
$query = "select count(*) as total from ftp where status=1";
$result = mysql_query($query) or die("Query failed.");
while ($row = mysql_fetch_array($result))
{
$total=$row[total];
}
if($total<=0)
{
$no_server="1";
$ftpid=0;
$url=$server_url."/images/";
}
else
{
$query = "select * from ftp where status=1 ORDER BY RAND() limit 1";
$result = mysql_query($query) or die("Query failed.");
while ($row = mysql_fetch_array($result))
{
$no_server="0";
$ftpid=$row['ftpid'];
$path=$row['name'];
$url=$row['dir'];
$host=$row['host'];
$user=$row['user'];
$pass=$row['ftppass'];
}
}
// get variables for fields on upload screen
$tos = $_POST['tos'];
$prv = $_POST['prv'];
if($prv!="1")
$prv=0;
$uploaderip = $_SERVER['REMOTE_ADDR'];
$messages="";
$msg="";
$newID="";
$FileName="";
$FileFile="";
$FileUrl="";
$FileUrlLink="";
$FiletnUrl="";
// check for blocked ip address
if ($uploaderip != "") {
$query = "select ip from blocked where ip = '$uploaderip'";
$result = mysql_query($query) or die("Query failed.");
$num_rows = mysql_num_rows($result);
if ($num_rows > 0) {
$msg= "Your IP address (".$uploaderip.") has been blocked from using this service.";
}
}
if ($config[AcceptTerms]=="1"){
if ($tos=="")
{
$msg= "You must check the box stating you agree to our terms.";
echo "<script language='javascript'>parent.upload('".$msg."','".$newID."','".$messages."','".$FileName."','".$FileFile."','".$FileUrl."','".$FileUrlLink."','".$FiletnUrl."','".$page_url."','".$server_url."','".$site_name."','".$HotLink."');</script>";
}
}
if($msg=="")
{
// check for a file
for($i=0;$i<=14;$i++)
{
$err="0";
$thefile = $_FILES['thefile'.$i];
if ($thefile['name']!="")
{
// check for valid file extension
$path_parts = pathinfo($thefile['name']);
$file_ext = strtolower($path_parts['extension']);
if ($err == "0")
{
// check for valid file type
if (!in_array_nocase($file_ext, $valid_file_ext))
{
$messages.= "|<em>".$thefile['name']."</em> is not in a valid format (".$valid_mime_types_display.")";
$err="1";
}
}
if ($err == "0") {
// check for valid image file
$imageinfo = getimagesize($_FILES['thefile0']['tmp_name']);
if(!eregi('image',$imageinfo['mime'])) {
$messages.="|". "Sorry, This is not a valid image file!";
$err="1"; } }
if ($err == "0")
{
// check for valid file size
if ($thefile['size'] > ($max_file_size_b))
{
$filesizemb =($thefile['size']/1048576);
$filesizemb = number_format($filesizemb, 3);
$messages.="Sorry but this image size is ".$filesizemb." MB which is bigger than the max allowed file size of ".$max_file_size_mb." MB.";
$err="1";
}
}
// save the file, if no error messages
if ($err == "0")
{
// replace special chars with spaces
$thefile['name'] = eregi_replace("[^a-z0-9.]", " ", $thefile['name']);
// Replace multiple spaces with one space
$thefile['name'] = ereg_replace(' +', ' ', $thefile['name']);
// Replace spaces with underscore
$thefile['name'] = str_replace(' ', '_', $thefile['name']);
// Replace hyphens with underscore
$thefile['name'] = str_replace('-', '_', $thefile['name']);
// Replace multiple underscores with one underscore
$thefile['name'] = ereg_replace('_+', '_', $thefile['name']);
$path_parts = pathinfo($thefile['name']);
// if php < 5.2
if(!isset($path_parts['filename'])){
$path_parts['filename'] = substr($path_parts['basename'], 0,strpos($path_parts['basename'],'.'));
}
$thefile['name'] = strpos($path_parts['filename'], '.');
$thefile['name'] = substr($path_parts['filename'], 0, 22); // limit file name length to 22 chars from the beginning
$thefile['name'] = $thefile['name'] . "." . strtolower($path_parts['extension']);
// Generate prefix to add to file name
$prefix = rand(99,999);
// Add prefix to file name
$newFileName = $prefix . $thefile['name'];
// SAVE THE PICTURE
$FileName.="|". newImageName($thefile['name']);
$FileFile.="|". $server_dir . $newFileName;
$newFile = $server_dir . $newFileName;
$newFileUrl = $url . $newFileName;
$FileUrl.="|". $url . $newFileName;
$newFileUrlLink = $server_save_directory . $newFileName;
$FileUrlLink.="|". $newFileName;
if (in_array_nocase($file_ext, $valid_file_ext))
{
$lx = 3;
if ($file_ext == "jpeg") {
$lx = 4; }
$tnFileName = substr($newFileName, 0, strlen($newFileName) - $lx) . "jpg";
$tnFileName = str_replace('.', '_tn.', $tnFileName);
$tnFile = $server_dir . $tnFileName;
$FiletnUrl.="|". $url . $tnFileName;
$tnFileUrl = $url . $tnFileName;
}
else
{
$tnFileName = "";
$tnFile = "";
$tnFileUrl = "";
}
$filesize = $thefile['size'];
$newID = "";
if (!#copy($thefile['tmp_name'], $newFile))
{
$messages.="|". "Please check site settings in admin panel and set proper value for server local path.<br><br>Also please make sure the images folder is chmodded to 0777";
}
else
{
// add to database
if($auth_id)
$uid=$auth_id;
else $uid=0;
//ftpupload($host,$user,$pass,$path."/".$dir."/".$newFileName,$newFileUrl);
//ftpupload
if($no_server=="0")
{
$ftp =& new FTP();
if ($ftp->connect($host)) {
if ($ftp->login($user,$pass)) {
$ftp->chdir($path);
$ftp->put($newFileName,$newFile);
}
}
// unlink($newFile);
}
//ftpupload
$date_add=time();
$query = "INSERT INTO images (prv,ftpid,userid,filename, tn_filename, filepath, ip, filesize,added) VALUES ($prv,$ftpid,$uid,'$newFileName', '$tnFileName', '$url', '$uploaderip', $filesize,$date_add)";
mysql_query($query) or die("Database entry failed.");
$newID.="|". mysql_insert_id();
}
if ($file_ext == "jpeg" ||$file_ext == "jpg" || $file_ext == "png" || $file_ext == "gif" || $file_ext == "bmp")
{
if ($file_ext == "jpg")
{
$source_id = imagecreatefromjpeg($newFile);
}
if ($file_ext == "jpeg")
{
$source_id = imagecreatefromjpeg($newFile);
}
elseif ($file_ext == "png")
{
$source_id = imagecreatefrompng($newFile);
}
elseif ($file_ext == "gif")
{
$source_id = imagecreatefromgif($newFile);
}
elseif ($file_ext == "bmp")
{
$source_id = ImageCreateFromBMP($newFile);
}
$true_width = imagesx($source_id);
$true_height = imagesy($source_id);
}
}
}
}
mysql_close($link);
// create URL links to display to user
$showURL1 = false; // image on hosted page - image only
$showURL2 = false; // direct link to file - all
$showURL3 = false; // HTML for img - image only
$showURL4 = false; // [img][/img] tags - image only
$showURL5 = false; // thumbnail pic - image only
// determine flags
$showURL2 = true;
if ($file_ext == "jpg" || $file_ext == "jpeg"|| $file_ext == "gif" || $file_ext == "png" || $file_ext == "bmp") {
$showURL1 = true;
$showURL3 = true;
$showURL4 = true;
}
if ($file_ext == "jpg" || $file_ext == "gif" || $file_ext == "png"|| $file_ext == "jpeg" || $file_ext == "bmp") {
$showURL5 = true;
}
echo "<script language='javascript'>parent.upload('".$msg."','".$newID."','".$messages."','".$FileName."','".$FileFile."','".$FileUrl."','".$FileUrlLink."','".$FiletnUrl."','".$page_url."','".$server_url."','".$site_name."','".$HotLink."');</script>";
}
else
{
echo "<script language='javascript'>parent.uploaderror('".$msg."');</script>";
exit;
}
function newImageName($fname) {
$timestamp = time();
$new_image_file_ext = substr($fname, strlen($fname) - 3, strlen($fname));
if ($new_image_file_ext == "peg") {
$ext = ".jpg";
} else {
$ext = "." . $new_image_file_ext;
}
$newfilename = randString() . substr($timestamp, strlen(timestamp) - 4, strlen(timestamp)) . $ext;
return $newfilename;
}
function randString() {
$newstring="";
while(strlen($newstring) < 3) {
$randnum = mt_rand(0,61);
if ($randnum < 10) {
$newstring .= chr($randnum + 48);
} elseif ($randnum < 36) {
$newstring .= chr($randnum + 55);
} else {
$newstring .= chr($randnum + 61);
}
}
return $newstring;
}
function in_array_nocase($item, $array) {
$item = &strtoupper($item);
foreach($array as $element) {
if ($item == strtoupper($element)) {
return true;
}
}
return false;
}
?>
And the upload.js script which takes care of producing the uploaded page:
var cp = new cpaint();
cp.set_transfer_mode('get');
cp.set_response_type('xml');
cp.set_debug(1);
function uploaderror(msg)
{
alert(msg);
}
function showfile()
{
var countfld=1;
countfld=document.getElementById("countfld").value+countfld;
fld=countfld.length;
if(fld>14)
{
alert("Sorry, i can upload max 15 files at once.");
return false;
}
else
{
document.getElementById("f"+fld).style.display="block";
document.getElementById("countfld").value=countfld;
}
var file=document.getElementById("f"+fld).value;
if(file=="")
{
msg="Please fill this field.";
alert(msg);
document.getElementById("f"+fld).focus();
return false;
}
}
function showfileux()
{
var countfld=1;
countfld=document.getElementById("countfldu").value+countfld;
fld=countfld.length;
if(fld>14)
{
alert("Sorry, i can upload max 15 files at once.");
return false;
}
else
{
document.getElementById("u"+fld).style.display="block";
document.getElementById("countfldu").value=countfld;
}
}
function showfileu()
{
var countfld=1;
countfld=document.getElementById("countfldu").value+countfld;
fld=countfld.length;
fldx=fld-1;
fldxx=fld.value;
if(fldxx=="")
{
msg="Email Address cannot be left empty.";
alert(msg);
document.getElementById("u"+fldxx).select();
document.getElementById("u"+fldxx).focus();
return false;
}
if(fld>14)
{
alert("Sorry, i can upload max 15 files at once.");
return false;
}
else
{
document.getElementById("u"+fld).style.display="block";
document.getElementById("countfldu").value=countfld;
}
}
function uploadfile(id)
{
if(document.getElementById(id).value==1)
{
document.getElementById("showurl").style.display="none";
document.getElementById("showfl").style.display="block";
return true;
}
if(document.getElementById(id).value==2)
{
document.getElementById("showfl").style.display="none";
document.getElementById("showurl").style.display="block";
return true;
}
document.getElementById("countfldu").value="0";
document.getElementById("countfld").value="0";
}
function show_loading()
{
document.getElementById('loading').style.display = "block";
document.getElementById('newupload').submit;
document.getElementById('submit').disabled = true;
// return true;
}
function show_loading1()
{
document.getElementById('loading1').style.display = "block";
document.getElementById('newupload1').submit;
document.getElementById('submit').disabled = true;
}
function upload(msg,newID,messages,FileName,FileFile,FileUrl,FileUrlLink,FiletnUrl,page_url,server_url,site_name,HotLink)
{
var html='<div id="wrapper"><div style="width:760px;"><center><FONT SIZE="4" COLOR="#00A4B7">Photo Links</FONT></h4><br></center><span class="body"><form name="uploadresults" action="uploademail.php" method="post">';
if(newID)
{
html=html+'<input type="hidden" name="idx[]" value="'+newID+'">';
}
if(msg)
{
var getmsg = msg.split("|");
for(i=0;i<getmsg.length;i++)
{
if(getmsg[i] && getmsg[i]!="on")
html=html+'<span style="font-weight: bold; color: red;">'+getmsg[i]+'</span><br>';
}
}
html=html+'<br><center>';
if(messages)
{
var getmessages = messages.split("|");
for(i=0;i<getmessages.length;i++)
{
if(getmessages[i] && getmessages[i]!="on")
html=html+'<span style="font-weight: bold; color: red;">'+getmessages[i]+'</span>';
}
html=html+'</center>';
}
if(FileName)
{
var getFileName = FileName.split("|");
var getFileFile = FileFile.split("|");
var getFileUrl = FileUrl.split("|");
var getFileUrlLink = FileUrlLink.split("|");
var getFiletnUrl = FiletnUrl.split("|");
var getHotLink = HotLink.split("|");
for(i=0;i<getFileName.length;i++)
{
if(getFileName[i] && getFileName[i]!="on") {
html=html+'<center><br><img src="'+getFileUrl[i]+'" style="max-width: 550px;"" /><br><br>';
html=html+'<strong>Link to add tags and delete the photo <br><div align="center"><textarea name="url1[]" cols="80" rows="1" READONLY onfocus="javascript: this.select()">'+server_url+'/view2.php?filename='+getFileUrlLink[i]+'
Let me know what you think is causing this error, as this is the final step I need to fix.
I've had similar issue with creating excel files from large data bases. What it boils down to is that the PHP script exceeds the servers set time limit. There are multiple ways to delay/extend this from built in PHP functions, some or all may be used. I personally had use a combination of the ability with AJAX to allow it run in the backgroun and then redirect that page.
Here is the documentation on how to delay/extend it:
http://php.net/manual/en/function.set-time-limit.php
Here is the documentation on how check for a time out as well:
http://php.net/manual/en/function.connection-timeout.php
If you end up going the AJAX route as I did, I highly recommend going the jQuery route instead of vanilla JS.

Email fetching of listed users

I am fetching email of my INBOX at gmail. I want to fetch emails of those IDs only which I will give. I don't understand how to implement this.
When I tried doing it in the imap_search I didn't get any success, it was like -
$emails = imap_search($inbox,'rajeev#gmail.com');
but it gave me all the emails of the inbox.
Any suggestion to achieve I want?
My code is -
function connect_mail(){
$hostname = '{imap.gmail.com:993/imap/ssl}INBOX';
$username = '***#gmail.com';
$password = 'password';
$inbox = imap_open($hostname,$username,$password) or die(t('Cannot connect to Gmail: ' . imap_last_error()));
$emails = imap_search($inbox,'UNREAD');
$Msgcount = count($emails);
for ($x = 1; $x <= $Msgcount; $x++)
{
$overview = imap_fetch_overview($inbox, $x);
$title = $overview[0]->subject;
$struct = imap_fetchstructure($inbox,$x);
$contentParts = count($struct->parts);
if ($contentParts >= 2) {
for ($i=2;$i<=$contentParts;$i++) {
$att[$i-2] = imap_bodystruct($inbox,$x,$i);
}
for ($k=0;$k<sizeof($att);$k++) {
if ($att[$k]->parameters[0]->value == "us-ascii" || $att[$k]->parameters[0]->value == "US-ASCII") {
if ($att[$k]->parameters[1]->value != "") {
$selectBoxDisplay[$k] = $att[$k]->parameters[1]->value;
}
}
elseif ($att[$k]->parameters[0]->value != "iso-8859-1" && $att[$k]->parameters[0]->value != "ISO-8859-1") {
$selectBoxDisplay[$k] = $att[$k]->parameters[0]->value;
}
}
}
if (sizeof($selectBoxDisplay) > 0) {
echo "<select name=\"attachments\" size=\"3\" class=\"tblContent\" onChange=\"handleFile(this.value)\" style=\"width:170;\">";
for ($j=0;$j<sizeof($selectBoxDisplay);$j++) {
$filename = $selectBoxDisplay[$j];
echo "\n<option value=\"$j\">". $selectBoxDisplay[$j] ."</option>";
}
echo "</select>";
}
$strFileName = $att[$file]->parameters[0]->value;
$strFileType = strrev(substr(strrev($filename),0,4));
$fileContent = imap_fetchbody($inbox, $x, $file+2);
$ContentType = "application/octet-stream";
list($file_first_name, $extension) = explode('. ', $filename);
if ($extension == ".jpg" || $extension == "jpeg" || $extension == ".JPG")
$ContentType = "image/jpeg";
if ($extension == ".doc")
$ContentType = "application/msword";
header ("Content-Type: $ContentType");
header ("Content-Disposition: attachment; filename=$strFileName; size=$fileSize;");
if (substr($ContentType,0,4) == "text") {
echo imap_qprint($fileContent);
}
else {
echo base64_decode($fileContent);
}
}
}
I see that you want to search emails from certain user that is still unread. Try this in your imap_search():
$emails = imap_search ( $mailbox, 'FROM "user#email.com" UNSEEN' );
The code above will loop for emails from user user#email.com that has UNSEEN flag.
UPDATE, BASED ON COMMENT CHAT
just before you do:
for ($x = 1; $x <= $Msgcount; $x++)
{
//long process here
}
you wrap it with one more checking:
$emails = imap_search ( $mailbox, 'FROM "user#email.com" UNSEEN' );
if(false !== $emails){
}
the end result would be something like this:
$emails = imap_search ( $mailbox, 'FROM "user#email.com" UNSEEN' );
if(false !== $emails){
for ($x = 1; $x <= $Msgcount; $x++)
{
//long process here
}
}

FTP_DELETE not working?

Hey guys I have my script here that is supposed to do some stuff then delete a file, unfortunetly my files never unlink. I"m wondering what the reason for this might be? Permissions was the only thing I could think of, or maybe the output buffer is messing up? I really don't know, but would appreciate some advice on how to handle it. Issue in question is that last IF() block.
public function remoteFtp() {
$enabled = Mage::getStoreConfig('cataloginventory/settings/use_ftp');
$remove = Mage::getStoreConfig('cataloginventory/settings/ftp_remove_file');
if ($enabled == 0) {
return true;
}
$base_path = Mage::getBaseDir('base');
$ftp_url = Mage::getStoreConfig('cataloginventory/settings/ftp_url');
$ftp_user = Mage::getStoreConfig('cataloginventory/settings/ftp_user');
$ftp_pass = Mage::getStoreConfig('cataloginventory/settings/ftp_password');
$ftp_remote_dir = Mage::getStoreConfig('cataloginventory/settings/ftp_remote_dir');
$ftp_filename_filter = Mage::getStoreConfig('cataloginventory/settings/ftp_remote_filename');
$ftp_file = $base_path . '/edi/working/working.edi';
$handle = fopen($ftp_file, 'w');
$conn_id = ftp_connect($ftp_url);
ftp_login($conn_id, $ftp_user, $ftp_pass) or die("unable to login");
if ($ftp_remote_dir) {
ftp_chdir($conn_id, $ftp_remote_dir);
}
//is there a file
$remote_list = ftp_nlist($conn_id, ".");
$exists = count($remote_list);
if ($exists > 0) {
$len = strlen($ftp_filename_filter) - 1;
foreach ($remote_list as $name) {
if (substr($ftp_filename_filter, 0, 1) == "*") {
if (substr($name, '-' . $len) == substr($ftp_filename_filter, '-' . $len)) {
$ftp_remote_name = $name;
}
}
if (substr($ftp_filename_filter, strlen($name) - 1) == "*") {
if (substr($ftp_filename_filter, 0, $len) == substr($name, 0, $len)) {
$ftp_remote_name = $name;
}
}
if ($ftp_filename_filter == $name) {
$ftp_remote_name = $name;
}
}
}
if (ftp_fget($conn_id, $handle, $ftp_remote_name, FTP_ASCII, 0)) {
echo "successfully written to $ftp_file <br />";
if ($remove == 1) {
ftp_delete($conn_id, $ftp_remote_name);
}
} else {
echo "There was a problem while downloading $ftp_remote_name to $ftp_file <br />";
}
ftp_close($conn_id);
}
The answer was that the system variable $remove = Mage::getStoreConfig('cataloginventory/settings/ftp_remove_file'); was set to BOOL(false)

PHP DOCX convert to HTML [duplicate]

I want to be able to upload an MS word document and export it a page in my site.
Is there any way to accomplish this?
//FUNCTION :: read a docx file and return the string
function readDocx($filePath) {
// Create new ZIP archive
$zip = new ZipArchive;
$dataFile = 'word/document.xml';
// Open received archive file
if (true === $zip->open($filePath)) {
// If done, search for the data file in the archive
if (($index = $zip->locateName($dataFile)) !== false) {
// If found, read it to the string
$data = $zip->getFromIndex($index);
// Close archive file
$zip->close();
// Load XML from a string
// Skip errors and warnings
$xml = DOMDocument::loadXML($data, LIBXML_NOENT | LIBXML_XINCLUDE | LIBXML_NOERROR | LIBXML_NOWARNING);
// Return data without XML formatting tags
$contents = explode('\n',strip_tags($xml->saveXML()));
$text = '';
foreach($contents as $i=>$content) {
$text .= $contents[$i];
}
return $text;
}
$zip->close();
}
// In case of failure return empty string
return "";
}
ZipArchive and DOMDocument are both inside PHP so you don't need to install/include/require additional libraries.
One may use PHPDocX.
It has support for practically all HTML CSS styles. Moreover you may use templates to add extra formatting to your HTML via the replaceTemplateVariableByHTML.
The HTML methods of PHPDocX also allow for the direct use of Word styles. You may use something like this:
$docx->embedHTML($myHTML, array('tableStyle' => 'MediumGrid3-accent5PHPDOCX'));
If you want that all your tables use the MediumGrid3-accent5 Word style. The embedHTML method as well as its version for templates (replaceTemplateVariableByHTML) preserve inheritance, meaning by that that you may use a predefined Word style and override with CSS any of its properties.
You may also extract selected parts of your HTML using 'JQuery type' selectors.
You can convert Word docx documents to html using Print2flash library. Here is an PHP excerpt from my client's site which converts a document to html:
include("const.php");
$p2fServ = new COM("Print2Flash4.Server2");
$p2fServ->DefaultProfile->DocumentType=HTML5;
$p2fServ->ConvertFile($wordfile,$htmlFile);
It converts a document which path is specified in $wordfile variable to a html page file specified by $htmlFile variable. All formatting, hyperlinks and charts are retained. You can get the required const.php file altogether with a fuller sample from Print2flash SDK.
this is a workaround based on David Lin's answer above
removing "w:" in a docx's xml tags leave behing Html like tags
function readDocx($filePath) {
// Create new ZIP archive
$zip = new ZipArchive;
$dataFile = 'word/document.xml';
// Open received archive file
if (true === $zip->open($filePath)) {
// If done, search for the data file in the archive
if (($index = $zip->locateName($dataFile)) !== false) {
// If found, read it to the string
$data = $zip->getFromIndex($index);
// Close archive file
$zip->close();
// Load XML from a string
// Skip errors and warnings
$xml = new DOMDocument("1.0", "utf-8");
$xml->loadXML($data, LIBXML_NOENT | LIBXML_XINCLUDE | LIBXML_NOERROR | LIBXML_NOWARNING|LIBXML_PARSEHUGE);
$xml->encoding = "utf-8";
// Return data without XML formatting tags
$output = $xml->saveXML();
$output = str_replace("w:","",$output);
return $output;
}
$zip->close();
}
// In case of failure return empty string
return "";
}
Ok Im in very late, but thought I'd post this to save you all some time.
This is some php code I have put together not just to read the text from docx but the images too, currently it does not support floating images / text, but what I have done so far is a massive move forwards to whats already been posted on here - note you need to update https://example.co.uk to YOUR domain name.
<?php
class Docx_ws_imglnk {
public $originalpath = '';
public $extractedpath = '';
}
class Docx_ws_rel {
public $Id = '';
public $Target = '';
}
class Docx_ws_def {
public $styleId = '';
public $type = '';
public $color = '000000';
}
class Docx_p_def {
public $data = array();
public $text = "";
}
class Docx_p_item {
public $name = "";
public $value = "";
public $innerstyle = "";
public $type = "text";
}
class Docx_reader {
private $fileData = false;
private $errors = array();
public $rels = array();
public $imglnks = array();
public $styles = array();
public $document = null;
public $paragraphs = array();
public $path = '';
private $saveimgpath = 'docimages';
public function __construct() {
}
private function load($file) {
if (file_exists($file)) {
$zip = new ZipArchive();
$openedZip = $zip->open($file);
if ($openedZip === true) {
$this->path = $file;
//read and save images
for ( $i = 0; $i < $zip->numFiles; $i ++ ) {
$zip_element = $zip->statIndex( $i );
if ( preg_match( "([^\s]+(\.(?i)(jpg|jpeg|png|gif|bmp))$)", $zip_element['name'] ) ) {
$imglnk = new Docx_ws_imglnk;
$imglnk->originalpath = $zip_element['name'];
$imagename = explode( '/', $zip_element['name'] );
$imagename = end( $imagename );
$imglnk->extractedpath = dirname( __FILE__ ) . '/' . $this->savepath . $imagename;
$putres = file_put_contents( $imglnk->extractedpath, $zip->getFromIndex( $i ));
$imglnk->extractedpath = str_replace('var/www/', 'https://example.co.uk/', $imglnk->extractedpath);
$imglnk->extractedpath = substr($imglnk->extractedpath, 1);
array_push($this->imglnks, $imglnk);
}
}
//read relationships
if (($styleIndex = $zip->locateName('word/_rels/document.xml.rels')) !== false) {
$stylesRels = $zip->getFromIndex($styleIndex);
$xml = simplexml_load_string($stylesRels);
$XMLTEXT = $xml->saveXML();
$doc = new DOMDocument();
$doc->loadXML($XMLTEXT);
foreach($doc->documentElement->childNodes as $childnode)
{
$nodename = $childnode->nodeName;
if($childnode->hasAttributes())
{
$rel = new Docx_ws_rel;
for ($a = 0; $a < $childnode->attributes->count(); $a++)
{
$attrNode = $childnode->attributes->item($a);
if (strcmp( $attrNode->nodeName, 'Id') == 0)
{
$rel->Id = $attrNode->nodeValue;
}
if (strcmp( $attrNode->nodeName, 'Target') == 0)
{
$rel->Target = $attrNode->nodeValue;
}
}
array_push($this->rels, $rel);
}
}
}
//attempt to load styles:
if (($styleIndex = $zip->locateName('word/styles.xml')) !== false) {
$stylesXml = $zip->getFromIndex($styleIndex);
$xml = simplexml_load_string($stylesXml);
$XMLTEXT = $xml->saveXML();
$doc = new DOMDocument();
$doc->loadXML($XMLTEXT);
foreach($doc->documentElement->childNodes as $childnode)
{
$nodename = $childnode->nodeName;
//get style
if (strcmp($nodename, "w:style") == 0)
{
$ws_def = new Docx_ws_def;
for ($a=0; $a < $childnode->attributes->count(); $a++ )
{
$item = $childnode->attributes->item($a);
//style id
if (strcmp($item->nodeName, "w:styleId") == 0)
{
$ws_def->styleId = $item->nodeValue;
}
//style type
if (strcmp($item->nodeName, "w:type") == 0)
{
$ws_def->type = $item->nodeValue;
}
}
}
//push style to the array of styles
if (strcmp($ws_def->styleId, "") != 0 && strcmp($ws_def->type, "") != 0)
{
array_push($this->styles, $ws_def);
}
}
}
if (($index = $zip->locateName('word/document.xml')) !== false) {
$stylesDoc = $zip->getFromIndex($index);
$xml = simplexml_load_string($stylesDoc);
$XMLTEXT = $xml->saveXML();
$this->document = new DOMDocument();
$this->document->loadXML($XMLTEXT);
}
$zip->close();
} else {
switch($openedZip) {
case ZipArchive::ER_EXISTS:
$this->errors[] = 'File exists.';
break;
case ZipArchive::ER_INCONS:
$this->errors[] = 'Inconsistent zip file.';
break;
case ZipArchive::ER_MEMORY:
$this->errors[] = 'Malloc failure.';
break;
case ZipArchive::ER_NOENT:
$this->errors[] = 'No such file.';
break;
case ZipArchive::ER_NOZIP:
$this->errors[] = 'File is not a zip archive.';
break;
case ZipArchive::ER_OPEN:
$this->errors[] = 'Could not open file.';
break;
case ZipArchive::ER_READ:
$this->errors[] = 'Read error.';
break;
case ZipArchive::ER_SEEK:
$this->errors[] = 'Seek error.';
break;
}
}
} else {
$this->errors[] = 'File does not exist.';
}
}
public function setFile($path) {
$this->fileData = $this->load($path);
}
public function to_plain_text() {
if ($this->fileData) {
return strip_tags($this->fileData);
} else {
return false;
}
}
public function processDocument() {
$html = '';
foreach($this->document->documentElement->childNodes as $childnode)
{
$nodename = $childnode->nodeName;
//get the body of the document
if (strcmp($nodename, "w:body") == 0)
{
foreach($childnode->childNodes as $subchildnode)
{
$pnodename = $subchildnode->nodeName;
//process every paragraph
if (strcmp($pnodename, "w:p") == 0)
{
$pdef = new Docx_p_def;
foreach($subchildnode->childNodes as $pchildnode)
{
//process any inner children
if (strcmp($pchildnode, "w:pPr") == 0)
{
foreach($pchildnode->childNodes as $prchildnode)
{
//process text alignment
if (strcmp($prchildnode->nodeName, "w:pStyle") == 0)
{
$pitem = new Docx_p_item;
$pitem->name = 'styleId';
$pitem->value = $prchildnode->attributes->getNamedItem('val')->nodeValue;
array_push($pdef->data, $pitem);
}
//process text alignment
if (strcmp($prchildnode->nodeName, "w:jc") == 0)
{
$pitem = new Docx_p_item;
$pitem->name = 'align';
$pitem->value = $prchildnode->attributes->getNamedItem('val')->nodeValue;
if (strcmp($pitem->value, "left") == 0)
{
$pitem->innerstyle .= "text-align:" . $pitem->value . ";";
}
if (strcmp($pitem->value, "center") == 0)
{
$pitem->innerstyle .= "text-align:" . $pitem->value . ";";
}
if (strcmp($pitem->value, "right") == 0)
{
$pitem->innerstyle .= "text-align:" . $pitem->value . ";";
}
if (strcmp($pitem->value, "both") == 0)
{
$pitem->innerstyle .= "word-spacing:" . 10 . "px;";
}
array_push($pdef->data, $pitem);
}
//process drawing
if (strcmp($prchildnode->nodeName, "w:drawing") == 0)
{
$pitem = new Docx_p_item;
$pitem->name = 'drawing';
$pitem->value = '';
$pitem->type = 'graphic';
$extents = $prchildnode->getElementsByTagName('extent')[0];
$cx = $extents->attributes->getNamedItem('cx')->nodeValue;
$cy = $extents->attributes->getNamedItem('cy')->nodeValue;
$pcx = (int)$cx / 9525;
$pcy = (int)$cy / 9525;
$pitem->innerstyle .= "width:" . $pcx . "px;";
$pitem->innerstyle .= "height:" . $pcy . "px;";
$blip = $prchildnode->getElementsByTagName('blip')[0];
$pitem->value = $blip->attributes->getNamedItem('embed')->nodeValue;
array_push($pdef->data, $pitem);
}
//process spacing
if (strcmp($prchildnode->nodeName, "w:spacing") == 0)
{
$pitem = new Docx_p_item;
$pitem->name = 'paragraphSpacing';
$bval = $prchildnode->attributes->getNamedItem('before')->nodeValue;
if (strcmp($bval, '') == 0)
$bval = 0;
$pitem->innerstyle .= "padding-top:" . $bval . "px;";
$aval = $prchildnode->attributes->getNamedItem('after')->nodeValue;
if (strcmp($aval, '') == 0)
$aval = 0;
$pitem->innerstyle .= "padding-bottom:" . $aval . "px;";
array_push($pdef->data, $pitem);
}
}
}
if (strcmp($pchildnode, "w:r") == 0)
{
foreach($pchildnode->childNodes as $rchildnode)
{
//process text
if (strcmp($rchildnode->nodeName, "w:t") == 0)
{
$pdef->text .= $rchildnode->nodeValue;
if (count($pdef->data) == 0)
{
$pitem = new Docx_p_item;
$pitem->name = 'styleId';
$pitem->value = '';
array_push($pdef->data, $pitem);
}
}
if (strcmp($rchildnode->nodeName, "w:rPr") == 0)
{
foreach($rchildnode->childNodes as $rPrchildnode)
{
if (strcmp($rPrchildnode->nodeName, "w:b") == 0 )
{
$pitem = new Docx_p_item;
$pitem->name = 'textBold';
$pitem->value = '';
$pitem->innerstyle .= "text-weight: 500;";
array_push($pdef->data, $pitem);
}
if (strcmp($rPrchildnode->nodeName, "w:i") == 0 )
{
$pitem = new Docx_p_item;
$pitem->name = 'textItalic';
$pitem->value = '';
$pitem->innerstyle .= "text-style: italic;";
array_push($pdef->data, $pitem);
}
if (strcmp($rPrchildnode->nodeName, "w:u") == 0 )
{
$pitem = new Docx_p_item;
$pitem->name = 'textUnderline';
$pitem->value = '';
$pitem->innerstyle .= "text-decoration: underline;";
array_push($pdef->data, $pitem);
}
if (strcmp($rPrchildnode->nodeName, "w:sz") == 0 )
{
$pitem = new Docx_p_item;
$pitem->name = 'textSize';
$sz = $rPrchildnode->attributes->getNamedItem('val')->nodeValue;
if ($sz == '')
{
$sz=0;
}
$pitem->value = $sz;
array_push($pdef->data, $pitem);
}
}
}
}
}
}
array_push($this->paragraphs, $pdef);
}
}
}
}
}
public function to_html()
{
$html = '';
foreach($this->paragraphs as $para)
{
$styleselect = null;
$type = 'text';
$content = $para->text;
$sz = 0;
$extent = '';
$embedid = '';
$pinnerstylesid = '';
$pinnerstylesunderline = '';
$pinnerstylessz = '';
if (count($para->data) > 0)
{
foreach($para->data as $node)
{
if (strcmp($node->name, "styleId") == 0)
{
$type = $node->type;
$pinnerstylesid = $node->innerstyle;
foreach($this->styles as $style)
{
if (strcmp ($node->value, $style->styleId) == 0)
{
$styleselect = $style;
}
}
}
if (strcmp($node->name, "align") == 0)
{
$pinnerstylesid .= $node->innerstyle. ";";
}
if (strcmp($node->name, "drawing") == 0)
{
$type = $node->type;
$extent = $node->innerstyle;
$embedid = $node->value;
}
if (strcmp($node->name, "textSize") == 0)
{
$sz = $node->value;
}
if (strcmp($node->name, "textUnderline") == 0)
{
$pinnerstylesunderline = $node->innerstyle;
}
}
}
if (strcmp($type, 'text') == 0)
{
//echo "has valid para";
//echo "<br>";
if ($styleselect != null)
{
//echo "has valid style";
//echo "<br>";
if (strcmp($styleselect->color, '') != 0)
{
$pinnerstylesid .= "color:#" . $styleselect->color. ";";
}
}
if ($sz != 0)
{
$pinnerstylesid .= 'font-size:' . $sz . 'px;';
//echo "sz<br>";
}
$span = "<p style='". $pinnerstylesid . $pinnerstylesunderline ."'>";
$span .= $content;
$span .= "</p>";
//echo $span;
$html .= $span;
}
if (strcmp($type, 'graphic') == 0)
{
$imglnk = '';
foreach($this->rels as $rel)
{
if(strcmp($embedid, '') != 0 && strcmp($rel->Id, $embedid) == 0)
{
foreach($this->imglnks as $imgpathdef)
{
if (strpos($imgpathdef->extractedpath, $rel->Target) >= 0)
{
$imglnk = $imgpathdef->extractedpath;
//echo "has img link<br>";
//echo $imglnk . "<br>";
}
}
}
}
if ($styleselect != null)
{
//echo "has valid style";
//echo "<br>";
if (strcmp($styleselect->color, '') != 0)
{
$pinnerstylesid .= "color:#" . $styleselect->color. ";";
}
}
if ($sz != 0)
{
$pinnerstylesid .= 'font-size:' . $sz . 'px;';
//echo "sz<br>";
}
$span = "<p style='". $pinnerstylesid . $pinnerstylesunderline ."'>";
$span .= "<img style='". $extent ."' alt='image coming soon' src ='". $imglnk ."'/>";
$span .= "</p>";
//echo $span;
$html .= $span;
}
}
return $html;
}
public function get_errors() {
return $this->errors;
}
private function getStyles() {
}
}
function getDocX($path)
{
//echo $path;
$doc = new Docx_reader();
$doc->setFile($path);
if(!$doc->get_errors()) {
$doc->processDocument();
$html = $doc->to_html();
echo $html;
}
return "";
}
?>

Categories