php string invisible character after file_get_contents() - php

I have a .txt file generated from the management shell of an exchange server.
I am using PHP and want to use file_get_contents() to read the file and then run preg_match() for "DisplayName". The problem applies to the whole file, not just this string.
$file = file_get_contents("file.txt");
//Does not work
preg_match("/DisplayName/", $file, $matches);
//Does work
preg_match("/D.i.s.p.l.a.y.N.a.m.e/", $file, $matches);
//Prints 1, so exactly one byte sits between "D" and "i"
preg_match("/D(.)i/", $file, $matches);
echo strlen($matches[1]);
How do I remove these invisible characters or what could it be?
Is there a function in php to find out what this character might be?
https://www.soscisurvey.de/tools/view-chars.php says there are no hidden characters.
Example:
DisplayName : Name
ServerName : Server
PrimarySmtpAddress : Email
EmailAddresses : {Email list}
I hope you guys are able to help me.

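It looks like the file is encoded as Unicode (most likely UTF-16, which stores a zero byte next to every ASCII character) where you expect plain ASCII.
Try this (the FILE_TEXT flag was added for forward compatibility with PHP 6 and may have no effect on your version):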
$file = file_get_contents("file.txt", FILE_TEXT);
or a custom function:
function file_get_contents_utf8($fn) {
$content = file_get_contents($fn);
return mb_convert_encoding($content, 'UTF-8',
mb_detect_encoding($content, 'UTF-8, ISO-8859-1', true));
}
$file = file_get_contents_utf8("file.txt");

Thanks for your help, I was able to fix it like this:
$file = file_get_contents("file.txt");
$file = str_replace(chr(0), "", $file);
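To see what the invisible bytes actually are, and to convert the file instead of stripping bytes, something like the following sketch may help. It assumes the file is UTF-16 with a byte order mark and that the mbstring extension is loaded; `utf16_to_utf8` is a hypothetical helper name, and the sample string stands in for the real file contents.

```php
<?php
// Hypothetical helper: convert UTF-16 text (identified by its BOM) to UTF-8.
// Assumes the mbstring extension is available.
function utf16_to_utf8(string $raw): string
{
    $bom = substr($raw, 0, 2);
    if ($bom === "\xFF\xFE") {
        return mb_convert_encoding(substr($raw, 2), 'UTF-8', 'UTF-16LE');
    }
    if ($bom === "\xFE\xFF") {
        return mb_convert_encoding(substr($raw, 2), 'UTF-8', 'UTF-16BE');
    }
    return $raw; // no UTF-16 BOM found: leave the bytes untouched
}

// Simulated file contents; in practice this would come from file_get_contents().
$raw = "\xFF\xFE" . mb_convert_encoding("DisplayName : Name", 'UTF-16LE', 'UTF-8');

// Hex dump of the first bytes: the 00 bytes are the "invisible characters".
echo bin2hex(substr($raw, 0, 8)), PHP_EOL; // fffe440069007300

$text = utf16_to_utf8($raw);
var_dump(preg_match('/DisplayName/', $text)); // int(1)
```

Unlike removing chr(0), a real conversion also keeps non-ASCII characters (for example accented names) intact.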

Related

How do I read a file from a destination to variable?

I am trying to read a file's content, while doing something like this
$con="HDdeltin";
$fp = fopen($_SERVER['DOCUMENT_ROOT']. "/HDdeltin/Users/$con", "r");
echo $fp;
It doesn't return anything. What am I doing wrong?
Quick solution:
file_get_contents($_SERVER['DOCUMENT_ROOT'] . "/HDdeltin/Users/HDdeltin");
Use file_get_contents(); you can guard against a bad path with realpath().
$con = "HDdeltin";
$path = realpath($_SERVER['DOCUMENT_ROOT'] . "/HDdeltin/Users/$con");
echo file_get_contents($path);
To get the current directory, you can also use:
getcwd()
dirname(__FILE__)
See also:
faster fopen or file_get_contents?
You can read it with this function:
echo file_get_contents($_SERVER['DOCUMENT_ROOT']. "/HDdeltin/Users/$con");
You will likely need to use file_get_contents. http://php.net/manual/en/function.file-get-contents.php
Here is the example they provide:
// Read 14 characters starting from the 21st character
$section = file_get_contents('./people.txt', false, NULL, 20, 14);
var_dump($section);
So you will likely need to use
$fp = file_get_contents($_SERVER['DOCUMENT_ROOT']. "/HDdeltin/Users/$con", true);
echo $fp;
There seem to be some differences between PHP versions:
<?php
// <= PHP 5
$file = file_get_contents('./people.txt', true);
// > PHP 5
$file = file_get_contents('./people.txt', FILE_USE_INCLUDE_PATH);
?>
So be sure to check out the first link provided.
fopen() returns a resource, not the text.
Use file_get_contents($_SERVER['DOCUMENT_ROOT']. "/HDdeltin/Users/$con") instead.
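The difference between the handle returned by fopen() and the file's text can be sketched as follows. The temporary file is only there to make the example self-contained; the path in the question would take its place.

```php
<?php
// Create a throwaway file so the example does not depend on the question's path.
$path = tempnam(sys_get_temp_dir(), 'demo');
file_put_contents($path, "hello");

// fopen() returns a stream handle, not the file's text.
$fp = fopen($path, 'r');
var_dump(is_resource($fp));

// The handle has to be read explicitly...
$viaHandle = stream_get_contents($fp);
fclose($fp);

// ...whereas file_get_contents() opens, reads and closes in one call.
$direct = file_get_contents($path);

unlink($path);
echo $viaHandle, ' ', $direct, PHP_EOL;
```

If the file may be missing, both fopen() and file_get_contents() return false, so checking the result before echoing it is worthwhile.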

convert ending line of a file with a php script

I'd like to know if it's possible to convert Mac line endings (CR: \r) to Windows line endings (CRLF: \r\n) with a PHP script.
I have a PHP script that runs periodically on my computer to upload some files to an FTP server, and the line endings need to be changed before the upload. It's easy to do manually, but I would like to do it automatically.
Can you just use a simple regular expression like the following?
function normalize_line_endings($string) {
return preg_replace("/(?<=[^\r]|^)\n/", "\r\n", $string);
}
It's probably not the most elegant or fastest solution, but it should work pretty well (i.e. it won't mess up existing Windows (CRLF) line endings in a string).
Explanation
(?<= - Start of a lookbehind
[^\r] - Match any character that is not a Carriage Return (\r)
| - OR
^ - Match the beginning of the string (in order to capture newlines at the start of a string)
) - End of the lookbehind
\n - Match a literal Line Feed (\n) character
Basically, load the file into a string and call something like:
function normalize($s) {
    // Normalize line endings: first reduce CRLF and lone CR to LF...
    $s = str_replace(array("\r\n", "\r"), "\n", $s);
    // ...then convert every LF to CRLF (Windows format)
    $s = str_replace("\n", "\r\n", $s);
    // Don't allow out-of-control blank lines
    $s = preg_replace("/(?:\r\n){3,}/", "\r\n\r\n", $s);
    return $s;
}
This is a snippet from elsewhere; the last regex might need some further tinkering.
Edit: Fixed logic to remove duplicate replacements.
In the end, the safer way is to first protect what you don't want replaced. Here is my function:
/**
 * Convert CR and LF line endings to CRLF.
 *
 * @param string $filename Name of the file
 * @return boolean true if the conversion succeeded, false otherwise.
 */
function normalize($filename) {
    echo "Convert the line endings of $filename into CRLF line endings...";
    // Load the content of the file into a string
    $file_contents = @file_get_contents($filename);
    if ($file_contents === false) {
        echo "Could not convert the line endings: unable to load the file." . PHP_EOL;
        return false;
    }
    // Replace all existing CRLF line endings with something uncommon
    $DontReplaceThisString = "\r\n";
    $specialString = "!£#!Dont_wanna_replace_that!#£!";
    $file_contents = str_replace($DontReplaceThisString, $specialString, $file_contents);
    // Convert the CR line endings into CRLF ones
    $file_contents = str_replace("\r", "\r\n", $file_contents);
    // Protect the CRLF line endings just created
    $file_contents = str_replace($DontReplaceThisString, $specialString, $file_contents);
    // Convert the LF line endings into CRLF ones
    $file_contents = str_replace("\n", "\r\n", $file_contents);
    // Restore the CRLF line endings
    $file_contents = str_replace($specialString, $DontReplaceThisString, $file_contents);
    // Update the file contents
    file_put_contents($filename, $file_contents);
    echo "Line endings of the file converted." . PHP_EOL;
    return true;
}
I tested it, but there is an error: instead of replacing the CR line ending, it seems to add a CRLF line ending. Here is the function; I modified it slightly to avoid opening the file outside of it:
// Function converting CR line endings to CRLF
function normalize ($filename) {
echo "Convert the ending-lines of $filename... ";
//Load the file into a string
$string = @file_get_contents($filename);
if ($string === false) {
echo "Could not convert the ending-lines : impossible to load the file.\n";
return false;
}
//Convert all line-endings
$string = str_replace(array("\r", "\n"), "\r\n", $string);
// Don't allow out-of-control blank lines
$string = preg_replace("/\r\n{2,}/", "\r\n", $string);
file_put_contents($filename, $string);
echo "Ending-lines converted.\n";
return true;
}
It might be easier to remove all \r characters and then replace \n with \r\n.
This takes care of LF and CRLF input; note that a CR-only (classic Mac) file has no \n at all, so it would lose every line break:
$output = str_replace("\n", "\r\n", str_replace("\r", '', $input));
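Stripping \r first would drop the line breaks of CR-only input, so for the Mac case in the question a single pass that treats CRLF, lone CR and lone LF alike may be safer. A minimal sketch, where `to_crlf` is a hypothetical helper name:

```php
<?php
// Hypothetical helper: convert CRLF, lone CR and lone LF to CRLF in one pass.
function to_crlf(string $text): string
{
    // The alternation tries "\r\n" first, so an existing CRLF is matched as a
    // single unit and is not doubled.
    return preg_replace('/\r\n|\r|\n/', "\r\n", $text);
}

$mixed = "mac\runix\nwindows\r\nend";
// json_encode() is used only to make the control characters visible.
echo json_encode(to_crlf($mixed)), PHP_EOL; // "mac\r\nunix\r\nwindows\r\nend"
```

To apply it to a file, the result of file_get_contents() can be passed through the helper and written back with file_put_contents().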

UTF8 Variable from FLASH Post to filename

I am using a script to send a "$filename" variable from Flash to PHP in order to create an XML file. The problem is that when I type Greek characters as the filename, the filename on the server gets values such as these, for example: (δσωδσαωςεωςεβ.qxml)
I do not have any problem a) when writing English characters, or b) when writing Greek characters as data in the XML file.
I am using file_put_contents function.
If instead of getting the Post variable as filename, I set my own filename such as "Ελληνικά.qxml" it works without a problem.
Thanks a lot in advance.
$string = $_POST['xmldata'];
$filename = $_POST['filename'];
$path = "test/";
//$dir_handle = @opendir($path) or mkdir("{$path}", 0777, true);
file_put_contents($path."/".$filename."", $string);
This problem was solved, but another arose. When I try to open the file from flash it does not recognise it now because it is in Greek.
The problem is that Flash is sending the data in a different encoding. From the comments in the PHP manual for mb_convert_encoding I can see that you should use the following to get it to work (tested on Danish characters, not Greek):
<?php
$string = isset($_POST['xmldata'])?$_POST['xmldata']:"";
$filename = isset($_POST['filename'])?$_POST['filename']:"";
//tested on danish chars
/*
$string = mb_convert_encoding($string, "ISO-8859-1", "UTF-8");
$filename = mb_convert_encoding($filename, "ISO-8859-1", "UTF-8");
*/
//tested on greek chars
$string = mb_convert_encoding($string, "ISO-8859-7", "UTF-8");
$filename = mb_convert_encoding($filename, "ISO-8859-7", "UTF-8");
$path = "test/";
//$dir_handle = @opendir($path) or mkdir("{$path}", 0777, true);
file_put_contents($path."/".$filename."", $string);
?>
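The conversion step can be checked in isolation before involving Flash or the filesystem. This sketch assumes the mbstring extension is loaded and that the script itself is saved as UTF-8; the sample name is the one from the question.

```php
<?php
// Sample filename from the question, as UTF-8 bytes (the script must be saved as UTF-8).
$utf8Name = "Ελληνικά.qxml";

// UTF-8 needs two bytes per Greek letter, ISO-8859-7 needs one.
$greekName = mb_convert_encoding($utf8Name, 'ISO-8859-7', 'UTF-8');
echo strlen($utf8Name), ' -> ', strlen($greekName), PHP_EOL;

// Converting back should restore the original bytes exactly.
$roundTrip = mb_convert_encoding($greekName, 'UTF-8', 'ISO-8859-7');
var_dump($roundTrip === $utf8Name);
```

Which target encoding the server's filesystem expects is an assumption here; if the round trip succeeds but the stored filename still looks wrong, the filesystem probably expects a different encoding than ISO-8859-7.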

PHP editing Microsoft Word document str_replace and preg_replace don't work

Assume I have an MS Word file, source.doc, with the following content: "Content of Microsoft Word file".
For example, I'd like to open it via PHP, replace the word "Microsoft" with "Openoffice", and save the result as target.doc.
Here is the code using preg_replace:
$content = file_get_contents( SOMEPATH . '/source.doc' );
$new_content = preg_replace( '/Microsoft/i', 'Openoffice', $content );
file_put_contents( SOMEPATH . '/target.doc', $new_content );
Or using str_replace:
$content = file_get_contents( SOMEPATH . '/source.doc' );
$new_content = str_replace( 'Microsoft', 'Openoffice', $content );
file_put_contents( SOMEPATH . '/target.doc', $new_content );
Neither of them works. The code runs without any exceptions, but target.doc is the same as source.doc. The replacement is not performed.
I've tried a lot of different approaches, such as regular expression modifiers, iconv and so on, but nothing helps.
var_dump of $content shows the raw structure of source.doc, which is full of unusual characters, and I suppose some of them stop str_replace or preg_replace from scanning. I can't figure out which character it is or what I should do if I find it.
var_dump of $new_content is identical to $content.
Thanks in advance for any help!
If you have a DOCX file you need to replace something in, it's basically a zipped-up XML archive.
Here's an example of how to replace the word "Microsoft" with "Openoffice" in a DOCX file.
$zip = new ZipArchive;
//This is the main document in a .docx file.
$fileToModify = 'word/document.xml';
$wordDoc = "Document.docx";
if ($zip->open($wordDoc) === TRUE) {
//Read contents into memory
$oldContents = $zip->getFromName($fileToModify);
//Modify contents:
$newContents = str_replace('Microsoft', 'Openoffice', $oldContents);
//Delete the old...
$zip->deleteName($fileToModify);
//Write the new...
$zip->addFromString($fileToModify, $newContents);
//And write back to the filesystem.
$return = $zip->close();
if ($return === true) {
echo "Success!";
}
} else {
echo 'failed';
}
Hope this helps!
I think this is what you are looking for: http://phpword.codeplex.com/ since .doc files are not ordinary text files (try opening one with Notepad and you'll get my point).

Response any web page in the internet from a PHP file

How can I create a simple PHP file that retrieves the HTML and the headers of any web page on the internet, changes image/resource URLs to their full URLs (for example: image.gif to http://www.google.com/image.gif), and then outputs the result?
Okay first of all to get the headers use the PHP get_headers function.
<?php
$url = "http://www.example.com/";
$headers = get_headers($url, true);
?>
Then read the content of the page into a variable.
<?php
$handle = fopen($url, 'r');
$text = '';
while(! feof($handle)) {
$text .= fread($handle, 8192);
}
fclose($handle);
?>
You then need to run through the content looking for resources and prepend the URL to get the absolute path to each resource that is not already absolute. The following regex example works on src attributes (e.g. images and JavaScript) and should give you a starting point for other resources, such as CSS, which uses href="". This regex won't match if the source contains a ":", which is a good indicator that it contains http:// and is therefore already an absolute path. PLEASE NOTE this is by no means perfect and won't account for all sorts of weird and wonderful resource locations, but it's a good start.
<?php
$pattern = '#src="([0-9A-Za-z_/.-]+)"#';
preg_match_all($pattern, $text, $matches);
foreach($matches[0] as $match) {
$src = str_replace('src="', '', $match);
$text = str_replace($match, 'src="' . $url . $src, $text);
}
print($text);
?>
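The same rewriting step can also be sketched with preg_replace_callback(), which avoids the second pass with str_replace(). `absolutize_src` is a hypothetical helper name, and the sketch assumes the base URL is the site root and that any src value containing a ":" is already absolute; protocol-relative URLs (starting with //) are not handled.

```php
<?php
// Hypothetical helper: prefix relative src="..." values with $baseUrl.
function absolutize_src(string $html, string $baseUrl): string
{
    // Normalise the base so that exactly one slash joins base and path.
    $base = rtrim($baseUrl, '/') . '/';

    // [^":]+ refuses values containing ":", i.e. ones that already carry a scheme.
    return preg_replace_callback(
        '#src="([^":]+)"#',
        function (array $m) use ($base): string {
            return 'src="' . $base . ltrim($m[1], '/') . '"';
        },
        $html
    );
}

$html = '<img src="image.gif"><img src="http://cdn.example.com/a.png">';
echo absolutize_src($html, 'http://www.example.com/'), PHP_EOL;
// <img src="http://www.example.com/image.gif"><img src="http://cdn.example.com/a.png">
```

For anything beyond a quick script, parsing the page with DOMDocument is likely to be more robust than a regex.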
<?php
$file = "http://www.somesite/somepage";
$handle = fopen($file, "rb");
$text = '';
while (!feof($handle)) {
$text .= fread($handle, 8192);
}
fclose($handle);
print($text);
?>
I think what you're looking for is a PHP proxy script. There are several on the internet; this is one I created (although I don't have time to fix bugs at the moment).
I would recommend using an existing one over writing your own, as it's not a trivial thing to do (there are better scripts than mine available as well).
