Files with same content differ in sizes

Here is the protocol:
1) I generate a text file online with PHP containing alphanumeric characters. Then I download it and note its size (from the Properties menu).
2) I open the text file with Notepad++, cut all the content and paste it into a new text file, then I save the new file (with the same name).
3) To my astonishment, even though both files have the exact same text content, their sizes aren't the same!
--TEST 1--
Downloaded file: 1529 KB
New copy file: 1594 KB
--TEST 2--
Downloaded file: 52 KB
New copy file: 54 KB
So what? Why am I posting this here? Because the file in question is available to my users for download on my website, and they can use it to replace a file in a game's save. However, the game reacts to the new file by rejecting it, whilst the copied one (with the above protocol) works fine.
The only difference I see between the two files is their size (a slight difference, as shown above); the content and the name are the same. Any idea why there is a size difference?

This will most likely be newlines that are converted between unix (1 byte) and windows (2 bytes).
As mentioned in the comments, it could also be encoding, but Notepad++ is pretty good at handling encodings, and encoding is unlikely to account for a difference of this size.
You need to convert the "\r\n" to "\n" to get the smaller filesize. Here's a page I just found with a few options: http://darklaunch.com/2009/05/06/php-normalize-newlines-line-endings-crlf-cr-lf-unix-windows-mac
Another thing to watch for is a trailing newline, which is not very obvious. Again, strip it out before doing your comparison.
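To see how line endings alone change the byte count, here is a minimal PHP sketch. The sample text is made up for illustration; in practice the string would come from the downloaded file, for example via file_get_contents().

```php
<?php
// Hypothetical sample text with Unix (LF) line endings.
$text = "line one\nline two\nline three\n";

// Normalize every newline style (CRLF, lone CR) to LF first.
$unix = str_replace(["\r\n", "\r"], "\n", $text);

// Convert LF to CRLF, as a Windows editor typically does when saving.
$windows = str_replace("\n", "\r\n", $unix);

echo strlen($unix), " bytes with LF\n";
echo strlen($windows), " bytes with CRLF\n";
```

The CRLF version is larger by exactly one byte per line. Since the Notepad++ copy in the question is the larger file and is the one the game accepts, it may be that the game expects CRLF; if so, the generator would need to emit "\r\n" rather than strip it.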

Are your client and server on different platforms, say Linux and Windows? In that case there is a difference in the way newline characters are stored, which can cause a size difference.
Another reason could be the character encoding used, but that is a little less likely.

Related

PHP - file_put_contents file manipulation

I'm trying to write a PHP file on a server and to bypass the extension in the end.
This is the PHP file - 1.php:
<?php
file_put_contents("folder\\".$GET['file'].".PNG",$_GET['content']);
?>
I'm trying to bypass the PNG extension and to write a PHP file.
like this:
1.php?file=attack.php%00&content=blabla
but it's not working
I tried:
Null char (%00,%u0000)
Long filename
CRLF chars
space char
?,&,|,>,<,(,),{,},[,],\,!,~,:,; chars
backspace char
../
php protocol
php://filter/write=convert.base64-decode/resource=1.php
(will not work because of the folder at the beginning)
Anyone have any idea?
Thanks!

There are several fundamental problems here:
This code is very unsafe. I could set file to ../../1.php and overwrite this file to do whatever I want. It appears that you're doing some security testing, however, so I guess that may be the point.
php:// is a PHP stream wrapper, not a network protocol, and it only takes effect when the path starts with it. Because folder\\ is prepended, your php://filter string is treated as part of an ordinary file name.
folder\\ doesn't make sense; what is this supposed to be or do?
That said, for educational purposes, prepending ../../ should allow you to escape out of the folder/ directory.
For example, if this is in /home/Zak/mytest/ with the expectation of a directory within that called folder designated to store these PNG files, then a file of ../../zak_homedir should put a file at /home/Zak/zak_homedir.PNG due to relative path resolution.
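On the defensive side, the traversal described above can be closed by reducing the user input to a bare file name before it is used. The following is only a sketch: the function name safe_png_name() and the allowed character set are assumptions for illustration, not part of the original script.

```php
<?php
// Hypothetical helper: turn user input into a safe bare file name, or null.
function safe_png_name(string $input): ?string
{
    // Treat backslashes as separators too, then drop any directory part.
    $name = basename(str_replace('\\', '/', $input));

    // Allow only a conservative character set (an assumption; adjust as needed).
    // \A and \z anchor the whole string, so a trailing newline is rejected too.
    if (!preg_match('/\A[A-Za-z0-9_-]{1,64}\z/', $name)) {
        return null;
    }
    return $name . '.PNG';
}

var_dump(safe_png_name('avatar01'));
var_dump(safe_png_name('../../1.php'));
var_dump(safe_png_name('../../zak_homedir'));
```

With this check, ../../zak_homedir is reduced to zak_homedir.PNG inside the intended folder, and anything containing a dot or a null byte is rejected outright.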

File reading from PHP using python script

Okay, this is driving me crazy. I have a small file. Here is the dropbox link https://www.dropbox.com/s/74nde57f07jj0zj/transcript.txt?dl=0.
If I try to read the content of the file using python f.read(), I can easily read it. But, if I try to run the same python program using php shell_exec(), the file read fails. This is the error I get.
Traceback (most recent call last):
  File "/var/www/python_code.py", line 2, in <module>
    transcript = f.read()
  File "/opt/anaconda/lib/python3.4/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 107: ordinal not in range(128)
I have checked all the permission issues and there is no problem with that.
Can anyone kindly shed some light?
Here is my python code.
f = open('./transcript/transcript.txt', 'r')
transcript = f.read()
print(transcript)
Here is my PHP code.
$output = shell_exec("/opt/anaconda/bin/python /var/www/python_code.py");
Thank you!
EDIT: I think the problem is in the file content. If I replace the content with simple 'I eat rice', then I can read the content from php. But the current content cannot be read. Still don't know why.

The problem appears to be that your file contains non-ASCII characters, but you're trying to read it as ASCII text.
Either it is text, but is in some encoding or other that you haven't told us (probably UTF-8, Latin-1, or cp1252, but there are countless other possibilities), or it's not text at all, but rather arbitrary binary data.
When you open a text file without specifying an encoding, Python has to guess. When you're running from inside the terminal or whatever IDE you use, presumably, it's guessing the same encoding that you used in creating the file, and you're getting lucky. But when you're running from PHP, Python doesn't have as much information, so it's just guessing ASCII, which means it fails to read the file because the file has bytes that aren't valid as ASCII.
If you want to understand how Python guesses, see the docs for open, but briefly: it calls locale.getpreferredencoding(), which, at least on non-Windows platforms, reads it from the locale settings in the environment. On a typical linux system that's not new enough to be based on systemd but not too old, the user's shell will be set up for a UTF-8 locale, but services will be set up for C locale. If all of that makes sense to you, you may see a way to work around your problem. If it all sounds like gobbledegook, just ignore it.
If the file is meant to be text, then the right solution is to just pass the encoding to the open call. For example, if the file is UTF-8, do this:
f = open('./transcript/transcript.txt', 'r', encoding='utf-8')
Then Python doesn't have to guess.
If, on the other hand, the file is arbitrary binary data, then don't open it in text mode:
f = open('./transcript/transcript.txt', 'rb')
In this case, of course, you'll get bytes instead of str every time you read from it, and print is just going to print something ugly like b'aq\x9bz' that makes no sense; you'll have to figure out what you actually want to do with the bytes instead of printing them as a bytes.
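For the locale-based workaround hinted at above, one option on the PHP side is to set a UTF-8 locale in the environment of the command that shell_exec() runs. This is a sketch under assumptions: the helper name with_utf8_locale() is made up, and it only helps if the en_US.UTF-8 locale (or whichever one you pass) is actually installed on the server. Passing encoding='utf-8' to open() remains the more robust fix.

```php
<?php
// Hypothetical helper: prefix a shell command with a UTF-8 locale setting.
// Assumes a POSIX shell and that the given locale exists on the server.
function with_utf8_locale(string $command, string $locale = 'en_US.UTF-8'): string
{
    return 'LC_ALL=' . escapeshellarg($locale) . ' ' . $command;
}

$cmd = with_utf8_locale('/opt/anaconda/bin/python /var/www/python_code.py');
echo $cmd, "\n";

// Run it the same way as in the question:
// $output = shell_exec($cmd);
```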

UTF-8, PHP, Win7 - Is there a solution now to save UTF-8-filenames on Win 7 using php?

Update: To save you reading through everything: PHP starting with 7.1.0alpha2 supports UTF-8 filenames on Windows. (Thanks to Anatol-Belski!)
Following some link chains on stackoverflow I found part of the answer:
https://stackoverflow.com/a/10138133/3716796 by Umberto Salsi
(and on the same question: https://stackoverflow.com/a/2950046/3716796 by Artefacto)
In short: 'PHP communicate[s] with the underlying file system as a "non-Unicode aware program"', and because of that all filenames given to PHP by Windows and vice versa are automatically translated/reencoded by Windows. This causes the errors. And you seemingly can't stop the automatic reencoding.
(And https://stackoverflow.com/a/2888039/3716796 by Artefacto: "PHP does not use the wide WIN32 API calls, so you're limited by the codepage.")
And at https://bugs.php.net/bug.php?id=47096 there is the bug report for PHP.
However, on that page nicolas suggests that a COM object might work: $fs = new COM('Scripting.FileSystemObject', null, CP_UTF8);
Maybe I will try that sometime.
So part of my question remains: is PHP 6 out, was it withdrawn, or is there anything new in PHP on this topic?
// full Question
Most questions about this topic are 1 to 5 years old.
Could php now save a file using
file_put_contents($dir . '/' . $_POST['fileName'], $_POST['content']);
when the $_POST['fileName'] is UTF-8 encoded, for example "Крым.xml" ?
Currently it is saved as
ÐšÑ€Ñ‹Ð¼.xml
I checked the fileName variable, so I can be sure it's UTF-8:
echo mb_detect_encoding($_POST['fileName']);
Is there now anything new in PHP that could accomplish it?
In some places I read that PHP 6 would be able to do it, but PHP 6, if I remember right, has been withdrawn?
In Windows Explorer I can change the name of a file to "Крым.xml". As far as I have understood the old questions&answers, it should be possible to use file_put_contents if the fileName-var is simply encoded to the encoding used by windows 7 and it's NTFS disc.
There are even 3 old questions with answers that claim to have succeeded:
PHP File Handling with UTF-8 Special Characters
Convert UTF-16LE to UTF-8 in php
PHP: How to create unicode filenames
Overall and most approved answers say it is not possible.
I checked all suggested answers already myself, and none works.
How can I find out definitely, and with absolute accuracy, which encoding my Win 7 and Explorer use to save the filename on my NTFS disc with a German language setting?
As said: I can create a file "Крым.xml" in the Explorer.
My conclusion:
1. Either file_put_contents doesn't work correctly when handing over the fileName (which I tried with conversions to UTF-16, UTF-16LE, ISO-8859-1 and Windows-1252) to Windows,
2. or file_put_contents just doesn't implement a way to call Windows' own file function in the appropriate way (so this second possibility would mean it's not a bug but just not implemented.) (For example notepad++ has no problems creating, writing and renaming a file called Крым.xml.)
Just one example of the error messages I got, in this case when I used
mb_convert_encoding($theFilename , 'Windows-1252' , 'UTF-8')
"Warning: file_put_contents(dirToSaveIn/????.xml): failed to open stream: No error in C:\aa xampp\htdocs\myinterface.lo\myinterface\phpWriteLocalSearchResponseXML.php on line 26 "
With other conversion I got other error messages, ranging from 'invalid characters' to no string recognized at all.
Greetings
John

PHP starting with 7.1.0alpha2 supports UTF-8 filenames on Windows.
Thanks.
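A quick way to check whether a given installation handles this is to write a file with a UTF-8 name and look for that exact name in the directory listing. This is only a sketch: it assumes the script file itself is saved as UTF-8 and uses the system temp directory as a scratch location.

```php
<?php
// Assumes this source file is saved as UTF-8.
$fileName = "Крым.xml";
$path = sys_get_temp_dir() . DIRECTORY_SEPARATOR . $fileName;

$found = false;
$onWindows = DIRECTORY_SEPARATOR === '\\';

if ($onWindows && PHP_VERSION_ID < 70100) {
    echo "PHP older than 7.1 on Windows: the name would be re-encoded\n";
} else {
    file_put_contents($path, '<root/>');
    // Look for the exact UTF-8 byte sequence among the directory entries.
    $found = in_array($fileName, scandir(dirname($path)), true);
    echo $found ? "stored under its UTF-8 name\n" : "name was altered\n";
    unlink($path);
}
```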

completely deleting a file from server

I want to delete a file by using PHP. I have used the unlink() function, but I was wondering about the security of unlink. Is the file completely deleted from the server? I want to make sure that there is no way to get the file back and the file is completely removed from the server.

Open the file in binary mode for writing, write 1's over the entire file, close the file, and then unlink it. This overwrites any data within the file so it cannot be recovered.
Personally I would use 1's instead of 0's, as 1's are actual data and will always be written, whereas 0's may not be written, depending on several factors.
Edit: After some thought and reading the comments, I would go with a hybrid approach, depending on "how deleted" you want the file to be. If you simply want the data to be unrecoverable, overwrite the entire file's length with 1's; this is fast and destroys the data. The problem is that it leaves a fixed-length run of uniform data on the disk, which implies a file used to be there and gives away its length, both vital pieces of forensic information. Simply writing random data does not avoid this either: if all the drive sectors around the file are untouched, that also leaves a forensic trace.
The best solution, factoring in forensic deletion, obfuscation and plausible deniability: overwrite the entire length of the file with 1's and then, for half the length of the file in bytes, write output from mt_rand in chunks of random length at random starting points. This leaves the impression that many files of varying lengths used to be in this area, creating a false trail. (Again, this is complete overkill and is generally only needed by serial killers and the CIA, but I'm adding it for the sake of completeness.)
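The basic overwrite-then-unlink step can be sketched as follows. The function name overwrite_and_unlink() is made up, and "1's" is interpreted here as bytes with all bits set (0xFF), which is an assumption. The file is opened in r+b mode so it is not truncated first. Whether the bytes land on the same physical blocks still depends on the filesystem and storage device, so treat this as an illustration rather than a guarantee.

```php
<?php
// Hypothetical helper: overwrite a file in place with 0xFF bytes, then unlink it.
function overwrite_and_unlink(string $path, int $chunkSize = 8192): bool
{
    clearstatcache(true, $path);
    $size = filesize($path);
    if ($size === false) {
        return false;
    }

    // 'r+b' opens for reading and writing without truncating the file.
    $handle = fopen($path, 'r+b');
    if ($handle === false) {
        return false;
    }

    $block = str_repeat("\xFF", $chunkSize);
    $remaining = $size;
    while ($remaining > 0) {
        // Write at most one chunk, never more than the original length.
        $written = fwrite($handle, substr($block, 0, min($chunkSize, $remaining)));
        if ($written === false || $written === 0) {
            fclose($handle);
            return false;
        }
        $remaining -= $written;
    }

    fflush($handle); // hand the data to the OS; it may still sit in a cache
    fclose($handle);
    return unlink($path);
}
```

A call such as overwrite_and_unlink('/path/to/file') returns true only if every byte was written and the name was removed.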

The US government used to recommend a seven-step wipe for disks:
1) all '1's
2) all '0's
3) the pattern '01'
4) the pattern '10'
5) a random pattern
6) all '1's
7) a random pattern
Re the code sample: using a language like PHP is wrong for this type of wipe, as you're relying on the OS really wiping the file and not doing something clever like only performing the last write or just unlinking it. However...
(untested)
$filename = "/usr/local/something.txt";
$size = filesize($filename);
// Overwrite patterns, one pass each: all 0 bits, all 1 bits, 10101010, 01010101.
$patterns = array(chr(0), chr(255), chr(170), chr(85));
foreach ($patterns as $pattern) {
    // Fill the whole file with the current pattern byte.
    file_put_contents($filename, str_repeat($pattern, $size));
}

This might not answer HOW to perfectly delete a file "with PHP", but it answers your question: "Is the file completely deleted from the server ?"
In some cases, No! (on UNIX/POSIX OS).
According to the highest voted comment on the official PHP unlink() manual page, the unlink function does not really delete the file; it deletes the system link to the file's content! As files can have several file names (hard links, not symlinks), the file will only be deleted when ALL file names are unlinked. So, if your file has 2 names, then unlink() will not really delete the file unless you unlink() both file names. Dear Linux guys, please correct me here if necessary.
This might be why the function is called unLINK() and not delete() !!!
Here a full quote of the excellent comment:
Deleted a large file but seeing no increase in free space or decrease of disk usage? Using UNIX or other POSIX OS? The unlink() is not about removing file, it's about removing a file name. The manpage says: "unlink - delete a name and possibly the file it refers to". Most of the time a file has just one name -- removing it will also remove (free, deallocate) the 'body' of file (with one caveat, see below). That's the simple, usual case.
However, it's perfectly fine for a file to have several names (see the link() function), in the same or different directories. All the names will refer to the file body and 'keep it alive', so to say. Only when all the names are removed, the body of file actually is freed. The caveat: A file's body may *also* be 'kept alive' (still using diskspace) by a process holding the file open. The body will not be deallocated (will not free disk space) as long as the process holds it open. In fact, there's a fancy way of resurrecting a file removed by a mistake but still held open by a process...
Have a look on unlink()'s sister function link() here.
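The behaviour described in the quoted comment can be observed directly with link() and unlink(). This is a small sketch using throwaway files in the system temp directory; it assumes a POSIX-style filesystem where hard links are supported.

```php
<?php
// Create a file and give it a second name (a hard link).
$first = tempnam(sys_get_temp_dir(), 'lnk');
$second = $first . '.second';
file_put_contents($first, 'still here');
link($first, $second);

clearstatcache();
$linkCount = stat($second)['nlink']; // number of names pointing at the data

// Remove the first name only.
unlink($first);
clearstatcache();
$firstExists = file_exists($first);

// The data is still reachable through the second name.
$survivor = file_get_contents($second);

echo "names before unlink: $linkCount\n";
echo "first name exists after unlink: ", $firstExists ? 'yes' : 'no', "\n";
echo "contents via second name: $survivor\n";

unlink($second); // removing the last name frees the data
```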
The (imo) best way to delete a file via PHP:
The way to go to really delete a file with PHP (in linux) is to use the exec() function, which executes real bash commands (doing things with linux bash feel correct btw). In this case, the file test.jpg would be deleted by doing:
exec("rm test.jpg");
More info on how to use rm (remove) correctly can be found for example here. Please note: PHP needs the right to delete the file!
UPDATE: Unfortunately, the Linux rm command ALSO does not really delete the file if it has two names/links. Look here for more info.
I'll have a deeper research on that and give feedback...

It is possible that because of some fragmentation on the disk some parts of file will stay, even if the file is totally overwritten.
The other way is to run (by shell_exec()) external program, that is system specific. Here is an example (for Windows), however I have not tested it.

You should do multiple passes of overwriting to diminish traces, for instance using US DoD 5220.22-M: "Overwrite all addressable locations with a character, its complement, then a random character and verify" (from the killdisk site).

Here's what the EFF recommends to permanently remove a file http://ssd.eff.org/tech/deletion.

In my embedded Ubuntu device, I use: echo exec('rm /usr/share/subdirectory/subdirectory/filename'); This works for me.
If you use rm -f (--force), then Linux will ignore nonexistent files and arguments and never prompt.
rm -d will remove empty directories.
If you enter rm --help at the prompt you get the help screen. The last lines read:
Note that if you use rm to remove a file, it might be possible to recover some of its contents, given sufficient expertise and/or time. For greater assurance that the contents are truly unrecoverable, consider using shred.
Since my system is a "closed" system then I'm not concerned about violating security issues. My logic being that one must have the system password to SSH into the OS and the only user interface is via web pages.
@Sliq's comments are still true to date. You need to decide for your case.

How to run or load .po/.mo files for localization in php

I have installed poedit, and when I run my file through it, it creates .po and .mo files. But I have a problem loading and using these files to translate my text. I don't know how to load or open the translated files and show the translated content.
Can anyone help me with this? I have tried every source I could find without success.

First of all you need to inform PHP which locale and domain you are using.
putenv("LANG=da_DK");
setlocale(LC_ALL, "da_DK");
bindtextdomain("mycatalog", "./locale/");
textdomain("mycatalog");
In this case I have a Danish translation and a file called mycatalog.mo (and mycatalog.po). These files are placed (from your root) here: locale/da_DK/LC_MESSAGES/
In order to show your translation, you will do this:
echo _("Hello world!"); // Which would become "Hej verden!"
_(); is an alias of gettext(); The smart thing about gettexts is that if there's no translation you will not have an ugly language code like "MSG_HELLO_WORLD" in your UI, but instead a better alternative: Simply the plain English text.
In the .po file (mycatalog.po in this example) you must have all the messages (case-sensitive, and exact down to commas, dots, colons, etc.) in this form:
msgid "Hello world!"
msgstr "Hej verden!"
When you have added this to your .po file, you open this file in poedit, hit "Save" and it will generate a .mo file. This file is uploaded to the same directory as the .po file (typically something like \locale\da_DK\LC_MESSAGES\ from the script root)
To translate dynamic/variable content you can use - among other things - sprintf, in this manner:
echo sprintf(_("My name is %s"), $name);
In this case the %s will occur in the .po file. When you have the translated string (which contains the %s), sprintf will make sure to replace the %s with the variable content. If the variable must be translated too, you can do this:
echo sprintf(_("The color of my house is %s"), _($color));
Then you don't need a full sentence for every color, but you still get the colors translated.
It is important to note that the first time a .mo is run on the server it is cached - and there is no way of removing this file from the cache without restarting (Apache or the like itself should be enough). This means that any changes you make to the .mo after the first time it is used, will not be effective. There are a number of hacks to work around this, but honestly, they are mostly not very pretty (they include copying the .mo, add the time() behind it and then import and cache it again). This last paragraph is only of importance if you aren't going to translate the whole thing at once, but in chunks.
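The copy-and-rename workaround mentioned above might look roughly like this. It is a sketch only: the function name fresh_domain() is made up, and it swaps in the file's modification time as the suffix instead of time(), so that a new copy is created only when the catalog has actually changed. The gettext calls are skipped if the extension is not loaded.

```php
<?php
// Hypothetical helper: expose the current .mo file under a new domain name
// so a long-running server process does not reuse its cached copy.
function fresh_domain(string $domain, string $localeDir, string $locale): string
{
    $dir = $localeDir . '/' . $locale . '/LC_MESSAGES';
    $source = $dir . '/' . $domain . '.mo';

    // Suffix with the modification time (an assumption; the text suggests time()).
    clearstatcache(true, $source);
    $fresh = $domain . '_' . filemtime($source);
    $target = $dir . '/' . $fresh . '.mo';

    if (!file_exists($target)) {
        copy($source, $target);
    }

    // Only bind when the gettext extension is available.
    if (function_exists('bindtextdomain')) {
        bindtextdomain($fresh, $localeDir);
        textdomain($fresh);
    }
    return $fresh;
}

// Usage with the layout from this answer:
// $domain = fresh_domain('mycatalog', './locale', 'da_DK');
```

Old copies accumulate in the LC_MESSAGES directory with this approach, so some cleanup of stale files would be needed in practice.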
If you want to create your own translation tool at some point, this tool helps you convert .po to .mo using PHP:
http://www.josscrowcroft.com/2011/code/php-mo-convert-gettext-po-file-to-binary-mo-file-php/

See (and explore) http://php.net/manual/en/book.gettext.php. There are user comments on that page that should give you an idea of how to proceed.
Also, your question is a duplicate of Get translations from .po or .mo file
