UTF8 Variable from FLASH Post to filename - php

I am using a script to send a "$filename" variable from Flash to PHP in order to create an XML file. The problem is that when I type Greek characters as the filename, the filename on the server comes out with values such as this, for example: (δσωδσαωςεωςεβ.qxml)
I do not have any problem a) when writing English characters, or b) when writing Greek character data inside the XML file.
I am using the file_put_contents function.
If instead of getting the Post variable as filename, I set my own filename such as "Ελληνικά.qxml" it works without a problem.
Thanks a lot in advance.
$string = $_POST['xmldata'];
$filename = $_POST['filename'];
$path = "test/";
//$dir_handle = @opendir($path) or mkdir("{$path}", 0777, true);
file_put_contents($path."/".$filename."", $string);
This problem was solved, but another arose. When I try to open the file from flash it does not recognise it now because it is in Greek.

The problem is that Flash is sending the data in a different encoding. From the comments in the PHP manual for mb_convert_encoding, I can see that you should use the following to get it to work (tested on Danish characters and not Greek):
<?php
$string = isset($_POST['xmldata'])?$_POST['xmldata']:"";
$filename = isset($_POST['filename'])?$_POST['filename']:"";
//tested on danish chars
/*
$string = mb_convert_encoding($string, "ISO-8859-1", "UTF-8");
$filename = mb_convert_encoding($filename, "ISO-8859-1", "UTF-8");
*/
//tested on greek chars
$string = mb_convert_encoding($string, "ISO-8859-7", "UTF-8");
$filename = mb_convert_encoding($filename, "ISO-8859-7", "UTF-8");
$path = "test/";
//$dir_handle = @opendir($path) or mkdir("{$path}", 0777, true);
file_put_contents($path."/".$filename."", $string);
?>
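To see what the conversion above actually changes, here is a minimal sketch that compares the bytes of a Greek file name before and after converting it from UTF-8 to ISO-8859-7. It uses iconv() as a stand-in for mb_convert_encoding() (an assumption, made only so the snippet runs without the mbstring extension). The helper name greek_filename_bytes() is hypothetical, and the sample name is the one from the question.

```php
<?php
// Hypothetical helper: convert a UTF-8 file name to ISO-8859-7 and
// return both byte sequences as hex strings for comparison.
function greek_filename_bytes(string $utf8Name): array
{
    // iconv() returns false if the name cannot be converted.
    $converted = iconv('UTF-8', 'ISO-8859-7', $utf8Name);

    return [
        'utf8' => bin2hex($utf8Name),
        'iso'  => $converted === false ? '' : bin2hex($converted),
    ];
}

// Assumes this script file itself is saved as UTF-8.
$info = greek_filename_bytes('Ελληνικά');
echo strlen($info['utf8']) / 2, " bytes as UTF-8\n";
echo strlen($info['iso']) / 2, " bytes as ISO-8859-7\n";
```

Each Greek letter takes two bytes in UTF-8 and one byte in ISO-8859-7, so the same name reaches the filesystem as a different byte sequence. Which of the two the server displays correctly depends on the encoding the filesystem expects, so the target encoding here should be treated as something to verify on your own server.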

Related

Unable to create directory in Windows using PHP and UTF-8

I am trying to create some directories with Unicode names in Windows. The names display correctly in the browser, but when the directory is created the name is converted into garbage text.
I have tried encoding conversions and removing special characters.
$myfile = fopen("unicode.csv", "r") or die("Unable to open file!");
$lines = file("unicode.csv", FILE_IGNORE_NEW_LINES);
echo '<table border="1">';
foreach($lines as $k=>$v){
$parts = preg_split('/[\t]/', $v);
echo '<tr>';
foreach($parts as $key=>$val){
if($key==0){
$dir = str_replace("/", "", $val);
$dir = str_replace("\\", "", $dir);
$encode = mb_detect_encoding($dir, mb_detect_order(), false);
$dir = mb_convert_encoding($dir , 'UTF-8' , 'UTF-8');
echo '<td>'.$dir.'</td><td>'.$encode.'</td>';
$result = mkdir($dir, 0777); // mode must be an octal integer; the string "0777" would be read as decimal 777
}
echo '<td>'.$val.'</td>';
}
echo '</tr>';
}
The expected result is a directory name that is readable in UTF-8.
Instead, it comes out as garbage text.
Thanks to @eryksun:
Based on your results, it looks like PHP mkdir does not transcode from UTF-8 to native Windows UTF-16LE in order to call [W]ide-character CreateDirectoryW. It probably just calls C mkdir. This naively passes bytes to CreateDirectoryA, which decodes the UTF-8 name using the system [A]NSI encoding (e.g. codepage 1252). Starting with Windows 10, we can set [A]NSI to UTF-8 in the system locale configuration. This change requires a reboot.
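If changing the system locale is not an option, one possible workaround that follows from the explanation above is to convert the UTF-8 name to the system ANSI code page yourself before calling mkdir. The following is only a sketch under assumptions: the helper name to_ansi_name() is hypothetical, and the default code page CP1252 is taken from the example in the quote, not detected from the system.

```php
<?php
// Hypothetical helper: convert a UTF-8 name to an ANSI code page.
// The default 'CP1252' is an assumption; use the code page your
// Windows installation is actually configured with.
function to_ansi_name(string $utf8Name, string $codePage = 'CP1252'): ?string
{
    // With glibc iconv, characters that do not exist in the target
    // code page make the call fail; other iconv implementations may
    // substitute a placeholder character instead.
    $converted = @iconv('UTF-8', $codePage, $utf8Name);

    return $converted === false ? null : $converted;
}

$name = to_ansi_name('café');
echo $name === null ? "not representable\n" : bin2hex($name) . "\n";
```

The result would then be passed to mkdir in place of the UTF-8 string. Names containing characters outside the chosen code page cannot be created this way, which is the limitation the UTF-8 system locale setting removes.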

PHP string invisible character after file_get_contents()

I have a .txt file generated from the management shell of an exchange server.
I am using PHP and want to use file_get_contents() to get the content of the file and then run preg_match() with "DisplayName". The same problem applies to the whole file.
$file = file_get_contents("file.txt");
//Does not work
preg_match("/DisplayName/", $file, $matches);
//Does work
preg_match("/D.i.s.p.l.a.y.N.a.m.e/", $file, $matches);
//Returns 1
preg_match("/D(.)i/", $file, $matches);
echo strlen($matches[1][0]);
How do I remove these invisible characters, and what could they be?
Is there a function in PHP to find out what this character might be?
https://www.soscisurvey.de/tools/view-chars.php says there are no hidden characters.
Example:
DisplayName : Name
ServerName : Server
PrimarySmtpAddress : Email
EmailAddresses : {Email list}
I hope you guys are able to help me.
It looks like the file is encoded as Unicode (most likely UTF-16, where every ASCII character is paired with a NUL byte) while you expect it to be plain ASCII.
Try this:
$file = file_get_contents("file.txt", FILE_TEXT);
or a custom function:
function file_get_contents_utf8($fn) {
$content = file_get_contents($fn);
return mb_convert_encoding($content, 'UTF-8',
mb_detect_encoding($content, 'UTF-8, ISO-8859-1', true));
}
$file = file_get_contents_utf8("file.txt");
Thanks for your help, I was able to fix it like this:
$file = file_get_contents("file.txt");
$file = str_replace(chr(0), "", $file);
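Stripping the NUL bytes works as long as the file contains only ASCII characters. As an alternative, here is a minimal sketch that decodes the content instead, assuming the file really is UTF-16 (which the single extra byte between letters suggests). The helper name utf16_to_utf8() is hypothetical, and falling back to little-endian when no byte order mark is present is an assumption.

```php
<?php
// Hypothetical helper: decode UTF-16 text to UTF-8, using the byte
// order mark (BOM) to pick the byte order when one is present.
function utf16_to_utf8(string $bytes): string
{
    if (strncmp($bytes, "\xFF\xFE", 2) === 0) {
        $from  = 'UTF-16LE';
        $bytes = substr($bytes, 2); // drop the BOM
    } elseif (strncmp($bytes, "\xFE\xFF", 2) === 0) {
        $from  = 'UTF-16BE';
        $bytes = substr($bytes, 2); // drop the BOM
    } else {
        $from = 'UTF-16LE'; // assumption: little-endian without a BOM
    }

    $converted = iconv($from, 'UTF-8', (string) $bytes);

    return $converted === false ? '' : $converted;
}

// Build a small UTF-16LE sample with a BOM instead of reading file.txt.
$sample = "\xFF\xFE" . iconv('UTF-8', 'UTF-16LE', "DisplayName : Name");
var_dump(preg_match('/DisplayName/', utf16_to_utf8($sample)));
```

With real data you would pass the result of file_get_contents("file.txt") to the helper. Unlike removing chr(0), this keeps non-ASCII characters such as umlauts in names intact.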

PHP Cant read file with special chars

I'm loading a file by this script:
$hFile = fopen($sFile, "r");
$sContent = "";
while(!feof($hFile)) {
$sContent .= fread($hFile, 4096);
}
fclose($hFile);
It works as it should, but then I tried to load a file called test.txt
which contains the following string: <>863b?)(/&(§&/))!)!=WLKM! K!*ÜQWW!W3³³w2_:LPE
The variable $sContent now doesn't contain anything.
$html = mb_convert_encoding($html, "UTF-8");
Do this before writing to the file in the first place.
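If the file already exists and cannot be rewritten, the same idea can be applied after reading it. The sketch below is an assumption-heavy illustration: it guesses that the file is in a single-byte encoding (ISO-8859-1 by default) whenever the bytes are not valid UTF-8, and the helper name to_utf8() is hypothetical. It uses iconv() in place of mb_convert_encoding() so it runs without the mbstring extension.

```php
<?php
// Hypothetical helper: return the input unchanged if it is already
// valid UTF-8, otherwise convert it from an assumed legacy encoding.
function to_utf8(string $bytes, string $assumedEncoding = 'ISO-8859-1'): string
{
    // preg_match() with the /u modifier returns false on invalid UTF-8.
    if (@preg_match('//u', $bytes) === 1) {
        return $bytes;
    }

    $converted = iconv($assumedEncoding, 'UTF-8', $bytes);

    return $converted === false ? $bytes : $converted;
}

// "\xDC" is the letter Ü in ISO-8859-1; on its own it is not valid UTF-8.
echo bin2hex(to_utf8("\xDCQWW")), "\n";
```

In the script from the question this would be applied to $sContent after the read loop. The assumed encoding is a guess and needs to match whatever editor or program wrote test.txt.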

php - file_get_contents - Downloading files with spaces in the filename not working

I am trying to download files using file_get_contents() function.
However if the location of the file is http://www.example.com/some name.jpg, the function fails to download this.
But if the URL is given as http://www.example.com/some%20name.jpg, the same gets downloaded.
I tried rawurlencode(), but this converts all the characters in the URL and the download fails again.
Can someone please suggest a solution for this?
I think this will work for you:
function file_url($url){
$parts = parse_url($url);
$path_parts = array_map('rawurldecode', explode('/', $parts['path']));
return
$parts['scheme'] . '://' .
$parts['host'] .
implode('/', array_map('rawurlencode', $path_parts))
;
}
echo file_url("http://example.com/foo/bar bof/some file.jpg") . "\n";
echo file_url("http://example.com/foo/bar+bof/some+file.jpg") . "\n";
echo file_url("http://example.com/foo/bar%20bof/some%20file.jpg") . "\n";
Output
http://example.com/foo/bar%20bof/some%20file.jpg
http://example.com/foo/bar%2Bbof/some%2Bfile.jpg
http://example.com/foo/bar%20bof/some%20file.jpg
Note:
I'd probably use urldecode and urlencode for this as the output would be identical for each url. rawurlencode will preserve the + even when %20 is probably suitable for whatever url you're using.
As you have probably already figured out urlencode() should only be used on each portion of a URL that requires escaping.
From the docs for urlencode(): just apply it to the image file name giving you the problem and leave the rest of the URL alone. From your example, you can safely encode everything following the last "/" character.
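The advice to encode only what follows the last "/" can be sketched as below. The helper name encode_file_name() is hypothetical. The sketch uses rawurlencode() rather than urlencode(), because urlencode() turns a space into "+" while the working URL in the question uses %20, and it decodes first so an already encoded name is not encoded twice.

```php
<?php
// Hypothetical helper: percent-encode only the part of the URL that
// follows the last "/", leaving scheme, host and directories alone.
function encode_file_name(string $url): string
{
    $pos = strrpos($url, '/');
    if ($pos === false) {
        // No slash at all: treat the whole string as a file name.
        return rawurlencode(rawurldecode($url));
    }

    $prefix = substr($url, 0, $pos + 1);
    $file   = (string) substr($url, $pos + 1);

    return $prefix . rawurlencode(rawurldecode($file));
}

echo encode_file_name('http://www.example.com/some name.jpg'), "\n";
echo encode_file_name('http://www.example.com/some%20name.jpg'), "\n";
```

Both calls print http://www.example.com/some%20name.jpg. This only helps when the directories in the URL need no escaping; a query string after the file name would be encoded as well, so it is not a general-purpose URL encoder.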
Here is maybe a better solution. If for any reason you are using a relative url like:
//www.example.com/path
Prior to PHP 5.4.7 this would not create the [scheme] array element, which would throw off maček's function. This method may be faster as well.
$url = '//www.example.com/path';
preg_match('/(https?:\/\/|\/\/)([^\/]+)(.*)/ism', $url, $result);
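// Encode each path segment separately so the "/" separators are not turned into %2F
$url = $result[1].$result[2].implode('/', array_map('rawurlencode', array_map('rawurldecode', explode('/', $result[3]))));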
Assuming only the file name has the problem, a better approach is to encode only the last section, i.e. the file name.
private function update_url($url)
{
$parts = explode('/', $url);
$new_file = rawurlencode(end($parts)); // rawurlencode() encodes spaces as %20; urlencode() would produce "+"
$parts[key($parts)] = $new_file;
return implode("/", $parts);
}
This should work:
$file = 'some file name';
$file = rawurlencode($file); // assign the result; rawurlencode() encodes spaces as %20
file_get_contents($file);

PHP editing Microsoft Word document str_replace and preg_replace don't work

Assume I have an MS Word file source.doc with the following content: "Content of Microsoft Word file".
For example, I'd like to open it via PHP, replace the word "Microsoft" with "Openoffice", and save the result into target.doc.
Here is the code using preg_replace:
$content = file_get_contents( SOMEPATH . '/source.doc' );
$new_content = preg_replace( '/Microsoft/i', 'Openoffice', $content );
file_put_contents( SOMEPATH . '/target.doc', $new_content );
Or using str_replace:
$content = file_get_contents( SOMEPATH . '/source.doc' );
$new_content = str_replace( 'Microsoft', 'Openoffice', $content );
file_put_contents( SOMEPATH . '/target.doc', $new_content );
Neither of them works. The code runs without any exceptions, but target.doc is the same as source.doc. The replacement is not performed.
I've tried a lot of different recipes, such as regular expression modifiers, iconv and so on, but nothing helps.
var_dump of $content shows the raw structure of source.doc, which is full of unusual characters, and I suppose some of them stop str_replace or preg_replace from scanning. I can't figure out which character it is or what I should do if I find it.
var_dump of $new_content is identical to $content.
Thanks in advance for any help!
If you have a DOCX file you need to replace something in, it's basically a zipped-up XML archive.
Here's an example of how to replace the word "Microsoft" with "Openoffice" in a DOCX file.
$zip = new ZipArchive;
//This is the main document in a .docx file.
$fileToModify = 'word/document.xml';
$wordDoc = "Document.docx";
if ($zip->open($wordDoc) === TRUE) {
//Read contents into memory
$oldContents = $zip->getFromName($fileToModify);
//Modify contents:
$newContents = str_replace('Microsoft', 'Openoffice', $oldContents);
//Delete the old...
$zip->deleteName($fileToModify);
//Write the new...
$zip->addFromString($fileToModify, $newContents);
//And write back to the filesystem.
$return = $zip->close();
if ($return === true) {
echo "Success!";
}
} else {
echo 'failed';
}
Hope this helps!
I think this is what you are looking for :) http://phpword.codeplex.com/ since .doc files are not ordinary text files (try opening one with Notepad and you'll get my point).
