read file in php until a specified string is found - php

I have a file that I wish to read in PHP and split into smaller files.
The file is base64-encoded, but each section is delimited by an unencoded tilde, followed by the original filename of the base64-encoded data, followed by another tilde.
As a silly example, the file could look like:
NbAYnnBBA~file1.txt~NbAYnnBBANbAYnnBBANbAYnnBBANbAYnnBBA~file2.txt~
I don't want to use file_get_contents because the files could be huge and I don't want to hit the memory limit.
Can anyone think of a way of doing it without having to use fgetc to read it one character at a time?
There are no line breaks in the file, by the way; it is one continuous block.
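One possible approach, sketched under assumptions rather than offered as a definitive answer: read the file in fixed-size chunks with fread() and scan each chunk for tildes with strpos(), so memory use stays bounded by the chunk size. The sketch assumes the layout shown in the example (data first, then ~filename~), that tildes never occur inside the data, and that the pieces should be written out still base64-encoded. The function name splitDelimitedFile and its parameters are made up for illustration.

```php
<?php
// Hypothetical helper: splits "<data>~<name>~<data>~<name>~..." into one file per name.
// The data is copied as-is (still base64-encoded); only a chunk is held in memory.
function splitDelimitedFile(string $inputPath, string $outputDir, int $chunkSize = 8192): array
{
    $in = fopen($inputPath, 'rb');
    if ($in === false) {
        throw new RuntimeException("Cannot open $inputPath");
    }

    $written = [];
    $inName  = false; // false: currently reading data, true: reading a filename
    $name    = '';
    // The filename is only known after its data, so data goes to a temp file first.
    $tmpPath = tempnam($outputDir, 'part');
    $tmp     = fopen($tmpPath, 'wb');

    while (!feof($in)) {
        $chunk = fread($in, $chunkSize);
        if ($chunk === false || $chunk === '') {
            break;
        }
        $offset = 0;
        $len    = strlen($chunk);
        while ($offset < $len) {
            $pos   = strpos($chunk, '~', $offset);
            $piece = $pos === false
                ? substr($chunk, $offset)
                : substr($chunk, $offset, $pos - $offset);
            if ($inName) {
                $name .= $piece;
            } else {
                fwrite($tmp, $piece);
            }
            if ($pos === false) {
                break; // no more tildes in this chunk
            }
            if ($inName) {
                // Closing tilde: the filename is complete, so finalize the piece.
                fclose($tmp);
                $target = $outputDir . '/' . basename($name);
                rename($tmpPath, $target);
                $written[] = $target;
                $name    = '';
                $tmpPath = tempnam($outputDir, 'part');
                $tmp     = fopen($tmpPath, 'wb');
            }
            $inName = !$inName;
            $offset = $pos + 1;
        }
    }

    fclose($in);
    fclose($tmp);
    unlink($tmpPath); // trailing data without a filename is discarded
    return $written;
}
```

Because the state (data or filename) is carried across chunks, a tilde or a filename that straddles a chunk boundary is still handled. basename() is applied so a filename from the input cannot point outside the output directory.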

Related

writing to a file with php is changing the original EOL type

I am using file_get_contents() to retrieve an SVG file from an external URL. Once I have it, I store it in a variable and then serve the file using header('Content-type: image/svg+xml'); the file saved that way retains the original EOL type, in this case LF.
However, if I save the file with file_put_contents(), the EOL type in the saved file is changed to CRLF. I also tested with fwrite, with the same result: the saved file is changed to CRLF.
Is there a way to use file_put_contents or fwrite and retain the EOL type, or any other method that may better serve this purpose? I have checked the PHP docs on these functions but could not find a parameter that would help. I also searched this forum and Google for a solution and could not find anything. I have spent some time on this and feel it's time to turn it over to others who may have better insight.
The files I am downloading may contain either LF or CRLF, so I am looking for a way to save the files and keep the original EOL type.
Here is a link to the original svg file: https://svgur.com/s/_ey
You can confirm the EOL type by opening the file with Notepad++ and going to View -> Show Symbols -> Show End Of Line
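As a way to check the bytes from PHP itself rather than in an editor, here is a minimal sketch that counts the line endings in a string. The helper name detectEol is hypothetical, and $url and $path in the usage comments are placeholders. Running it on the downloaded string and again on the saved file would show whether the write step or something later (an editor, a transfer tool) changes the endings.

```php
<?php
// Hypothetical helper: reports which line-ending convention a string uses.
function detectEol(string $data): string
{
    $crlf = substr_count($data, "\r\n");
    $lf   = substr_count($data, "\n") - $crlf; // bare LF only
    $cr   = substr_count($data, "\r") - $crlf; // bare CR only

    // Keep only the conventions that actually occur.
    $kinds = array_keys(array_filter(['CRLF' => $crlf, 'LF' => $lf, 'CR' => $cr]));
    if (count($kinds) === 0) {
        return 'none';
    }
    return count($kinds) === 1 ? $kinds[0] : 'mixed';
}

// Usage (placeholders):
// $svg = file_get_contents($url);
// echo detectEol($svg), "\n";
// file_put_contents($path, $svg);
// echo detectEol(file_get_contents($path)), "\n";
```

If fopen() is used for the write, opening the file with the 'wb' mode avoids the Windows text-mode translation that the 't' flag enables.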

PHP Encoded File From Database

I would like to know whether any of you have had a similar problem and, if so, how you fixed it.
The problem is about an encoded file in the database. I have to retrieve the file and save it as a regular file in system storage.
I don't really know why, but I can't decode it. It looks like standard base64, but it seems it actually isn't.
Here is part of the file (maybe someone can work out which algorithm is used here):
0M8R4KGxGuFcMFwwXDBcMFwwXDBcMFwwXDBcMFwwXDBcMFwwXDBcMD5cMANcMP7/CVwwBlwwXDBcMFwwXDBcMFwwXDBcMFwwXDADXDBcMFwwPAFcMFwwXDBcMFwwXDBcMBBcMFwwPgFcMFwwAVwwXDBcMP7///9cMFwwXDBcMDkBXDBcMDoBXDBcMDsBXDBcMP/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////spcFcMH9gFQRcMFww8BK/XDBcMFwwXDBcMFwwEFwwXDBcMFwwXDAGXDBcMGrnAVwwDlwwYmpiauaH5odcMFwwXDBcMFwwXDBcMFwwXDBcMFwwXDBcMFwwXDBcMFwwXDAVBBZcMC7+AVwwhO1cMFwwhO1cMFwwEd9cMFwwXDBcMFwwXDAmXDBcMFwwXDBcMFwwXDBcMFwwXDBcMFwwXDBcMFwwXDBcMFwwXDBcMFwwXDBcMP//D1wwXDBcMFwwXDBcMFwwXDBcMP//D1wwXDBcMFwwXDBcMFwwXDBcMP//D1wwXDBcMFwwXDBcMFwwXDBcMFwwXDBcMFwwXDBcMFwwXDCkXDBcMFwwXDBcMGAEXDBcMFwwXDBcMFwwYARcMFwwYARcMFwwXDBcMFwwXDBgBFwwXDBcMFwwXDBcMGAEXDBcMFwwXDBcMFwwYARcMFwwXDBcMFwwXDBgBFwwXDAUXDBcMFwwXDBcMFwwXDBcMFwwXDBcMHQEXDBcMFwwXDBcMFwwPEBcMFwwXDBcMFwwXDA8QFwwXDBcMFwwXDBcMDxAXDBcMDhcMFwwXDB0QFwwXDBEXDBcMFwwuEBcMFwwlFwwXDBcMHQEXDBcMFwwXDBcMFwwX01cMFww7FwwXDBcMFhBXDBcMFwwXDBcMFwwWEFcMFwwXCJcMFwwXDB6QVwwXDBcMFwwXDBcMHpBXDBcMFwwXDBcMFwwekFcMFwwXDBcMFwwXDB6QVwwXDBcMFwwXDBcMHpBXDBcMFwwXDBcMFwwekFcMFwwXDBcMFwwXDBCTFwwXDACXDBcMFwwRExcMFwwXDBcMFwwXDBETFwwXDBcMFwwXDBcMERMXDBcMFwwXDBcMFwwRExcMFwwXDBcMFwwXDBETFwwXDBcMFwwXDBcMERMXDBcMCRcMFwwXDBLTlwwXDBoAlwwXDCzUFwwXDBuXDBcMFwwaExcMFwwsVwwXDBcMFwwXDBcMFwwXDBcMFwwXDBcMFwwXDBcMFwwXDBcMFwwYARcMFwwXDBcMFwwXDBWQlwwXDBcMFwwXDBcMFwwXDBcMFwwXDBcMFwwXDBcMFwwXDBcMFwwXDBcMFwwekFcMFwwXDBcMFwwXDB6QVwwXDBcMFwwXDBcMFZCXDBcMFwwXDBcMFwwVkJcMFwwXDBcMFwwXDBoTFwwXDBcMFwwXDBcMFwwXDBcMFwwXDBcMFwwXDBgBFwwXDBcMFwwXDBcMGAEXDBcMFwwXDBc
MFwwekFc
And here is part of the file after base64 decoding:
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0<\0\0\0\0\0\0\0\0\0>\0\0\0\0\0ţ˙˙˙\0\0\0\09\0\0:\0\0;\0\0˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙˙ěĽÁ\0`\0\0đż\0\0\0\0\0\0\0\0\0\0\0\0\0jç\0\0bjbjćć\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0.ţ\0í\0\0í\0\0ß\0\0\0\0\0\0&\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0˙˙\0\0\0\0\0\0\0\0\0˙˙\0\0\0\0\0\0\0\0\0˙˙\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0¤\0\0\0\0\0`\0\0\0\0\0\0`\0\0`\0\0\0\0\0\0`\0\0\0\0\0\0`\0\0\0\0\0\0`\0\0\0\0\0\0`\0\0\0\0\0\0\0\0\0\0\0\0\0t\0\0\0\0\0\0<#\0\0\0\0\0\0<#\0\0\0\0\0\0<#\0\08\0\0\0t#\0\0D\0\0\0¸#\0\0\0\0\0t\0\0\0\0\0\0_M\0\0ě\0\0\0XA\0\0\0\0\0\0XA\0\0\"\0\0\0zA\0\0\0\0\0\0zA\0\0\0\0\0\0zA\0\0\0\0\0\0zA\0\0\0\0\0\0zA\0\0\0\0\0\0zA\0\0\0\0\0\0BL\0\0\0\0\0DL\0\0\0\0\0\0DL\0\0\0\0\0\0DL\0\0\0\0\0\0DL\0\0\0\0\0\0DL\0\0\0\0\0\0DL\0\0$\0\0\0KN\0\0h\0\0łP\0\0n\0\0\0hL\0\0ą\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0`\0\0\0\0\0\0VB\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0zA\0\0\0\0\0\0zA\0\0\0\0\0\0VB\0\0\0\0\0\0VB\0\0\0\0\0\0hL\0\0\0\0\0\0\0\0\0\0\0\0\0\0`\0\0\0\0\0\0`\0\0\0\0\0\0zA\0\0\0\0\0\0\0\0\0\0\0\0\0\0zA\0\0\0\0\0\0M\0\0\0\0\0K\0\0\0\0\0\0K\0\0\0\0\0\0K\0\0\0\0\0\0VB\0\0#\0\0`\0\0\0\0\0\0zA\0\0\0\0\0\0`\0\0\0\0\0\0zA\0\0\0\0\0\0BL\0\0\0\0\0\0\0\0\0\0\0\0\0\0K\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0VB\0\0\0\0\0\0BL\0\0\0\0\0\0\0\0\0\0\0\0\0\0K\0\0\0\0\0\0\0\0\0\0\0\0\0\0K\0\0\0\0\0\0`\0\0\0\0\0\0`\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0K\0\0\0\0\0\0zA\0\0\0\0\0\0LA\0\0\0\0\0 
¨ŕIżČ\0\0\0\0\0\0\0\0<#\0\0\0\0\0\0H\0\0`\0\0K\0\0\0\0\0\0\0\0\0\0\0\0\0\0BL\0\0\0\0\0\0/M\0\00\0\0\0_M\0\0\0\0\0\0K\0\0\0\0\0\0!Q\0\0\0\0\0\0öI\0\0Đ\0\0\0!Q\0\0\0\0\0\0K\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0!Q\0\0\0\0\0\0\0\0\0\0\0\0\0\0`\0\0\0\0\0\0K\0\0$\0\0zA\0\0\"\0\0\0A\0\0\0\0\0K\0\0\0\0\0\0´A\0\0\0\0\0ČA\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0zA\0\0\0\0\0\0zA\0\0\0\0\0\0zA\0\0\0\0\0\0hL\0\0\0\0\0\0hL\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0ĆJ\0\0X\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0zA\0\0\0\0\0\0zA\0\0\0\0\0\0zA\0\0\0\0\0\0_M\0\0\0\0\0\0VB\0\0\0\0\0\0VB\0\0\0\0\0\0VB\0\0\0\0\0\0VB\0\0\0\0\0\0\0\0\0\0\0\0\0\0t\0\0\0\0\0\0t\0\0\0\0\0\0t\0\0D:\0\0¸>\0\0\0\0t\0\0\0\0\0\0t\0\0\0\0\0\0t\0\0\0\0\0\0¸>\0\0\0\0\0\0t\0\0\0\0\0\0t\0\0\0\0\0\0t\0\0\0\0\0\0`\0\0\0\0\0\0`\0\0\0\0\0\0`\0\0\0\0\0\0`\0\0\0\0\0\0`\0\0\0\0\0\0`\0\0\0\0\0\0˙˙˙˙\0\0\0\0\0
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0Z\0a\0Bc\0z\0n\0i\0k\0 \0n\0r\0 \01\0 \0D\0o\0 \0Z\0a\0r\0z\0d\0z\0e\0n\0i\0a\0 \0n\0r\0 \07\07\0/\02\00\00\07\0 \0B\0u\0r\0m\0i\0s\0t\0r\0z\0a\0 \0G\0m\0i\0n\0y\0 \0B\0r\0w\0i\0n\0ó\0w\0 \0 \0z\0 \0d\0n\0i\0a\0 \01\00\0 \0g\0r\0u\0d\0n\0i\0a\0 \02\00\00\07\0r\0.\0 \0 \0 \0 \0W\0y\0k\0a\0z\0 \0k\0o\0n\0t\0 \0d\0l\0a\0 \0b\0u\0d\0|e\0t\0u\0 \0U\0r\0z\0d\0u\0 \0G\0m\0i\0n\0y\0 \0B\0r\0w\0i\0n\0ó\0w\0 \0 \0 \0A\0.\0 \0W\0y\0k\0a\0z\0 \0k\0o\0n\0t\0\0\01\0.\0 \0 \0K\0o\0n\0t\0a\0 \0b\0i\0l\0a\0n\0s\0o\0w\0e\0 \0\01\03\03\0 \0-\0 \0R\0a\0c\0h\0u\0n\0e\0k\0 \0b\0u\0d\0|e\0t\0u\0\01\03\04\0 \0-\0 \0K\0r\0e\0d\0y\0t\0y\0 \0b\0a\0n\0k\0o\0w\0e\0\01\03\07\0 \0-\0 \0R\0a\0c\0h\0u\0n\0k\0i\0 \0[r\0o\0d\0k\0ó\0w\0 \0f\0u\0n\0d\0u\0s\0z\0y\0 \0p\0o\0m\0o\0c\0o\0w\0y\0c\0h\0\01\04\00\0 \0-\0
The hell is that?
I tried to save the encoded data as a regular file on my file system.
file_put_contents($path, $data);
The variable names say it all.
$path is just the path where the file will be saved, and $data is a variable holding the encoded data. (Of course I have the MIME type of the file, so if it is, let's say, .doc, the path will be "something/something/upload/file.doc".)
But what I get is just a file containing the base64-encoded string.
So, PHP community, I need your help!
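A hedged observation and sketch: the sample begins with 0M8R4KGxGuE, which is how the OLE2 compound-file signature (used by legacy .doc files) looks in base64, and the decoded text shows literal \0 and \" sequences. That pattern is consistent with the binary having been passed through something like addslashes() before it was encoded, but that is a guess from the sample, not something the question confirms. Under that assumption, decoding and then un-escaping might recover the original bytes. The helper name decodeStoredFile is hypothetical.

```php
<?php
// Hypothetical helper. ASSUMPTION: the data was escaped with addslashes()
// (NUL stored as backslash-zero) and then base64-encoded. Pass false as the
// second argument to skip the un-escaping step if that guess is wrong.
function decodeStoredFile(string $data, bool $unescape = true): string
{
    // Drop line breaks or spaces that may have been added when the value was stored.
    $clean  = preg_replace('/\s+/', '', $data);
    $binary = base64_decode($clean, true);
    if ($binary === false) {
        throw new InvalidArgumentException('Input is not valid base64');
    }
    // stripslashes() reverses addslashes(), including backslash-zero back to a NUL byte.
    return $unescape ? stripslashes($binary) : $binary;
}

// Usage ($path and $data as in the question):
// file_put_contents($path, decodeStoredFile($data));
```

Writing $data directly, as in the question, stores the base64 text unchanged, which matches the result described.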

special characters in url filename cause problems

I have the following URL: www.mywebsite.com/upload/server/php/files/foto/test/Aston_Martin_DBS_V12_coupé_(rear)_b-w.jpg
This file is uploaded through a script. The file exists on the server.
However, because of the special character in the URL (é), I am experiencing some problems.
The filename on the server is Aston_Martin_DBS_V12_coup%C3%A9_(rear)_b-w.jpg, which is correct. However, my browser (Chrome) somehow requests this page as ISO-8859-1 instead of UTF-8.
Therefore, I get a 404.
I am using the jQuery File Upload plugin.
I deleted my previous answer and wrote a new one:
Websites usually do not contain files with non-standard characters in their names. Such characters are usually removed, or sometimes replaced by similar standard characters (Polish ą to a, ś to s). For example, I rename files manually, or when I have a lot of files, I use a bash or PHP script that removes or replaces those characters in the filenames on the server.
Anyway, if you HAVE TO use the original filenames, you have to decode them from ISO-8859-1 and encode them to UTF-8.
Take a look at the PHP code fragment here:
how to serve HTTP files with special characters
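The linked fragment is not reproduced here. As a minimal sketch of the re-encoding step only, assuming the requested name really does arrive as ISO-8859-1 bytes and the iconv extension is available, the conversion could look like this (the helper name latin1ToUtf8 is made up for illustration):

```php
<?php
// Hypothetical helper: re-encodes a filename from ISO-8859-1 to UTF-8.
// ASSUMPTION: the input really is ISO-8859-1; UTF-8 input would be double-encoded.
function latin1ToUtf8(string $name): string
{
    $converted = iconv('ISO-8859-1', 'UTF-8', $name);
    if ($converted === false) {
        throw new RuntimeException('Conversion failed');
    }
    return $converted;
}

// "\xE9" is é in ISO-8859-1; the UTF-8 form is the two bytes "\xC3\xA9".
// $utf8Name = latin1ToUtf8("Aston_Martin_DBS_V12_coup\xE9_(rear)_b-w.jpg");
```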
Some special characters cause problems in a URL filename, for example:
+, #, %, &
For files that are accessed through a URL, use filenames that do not contain the characters above.
For example:
str_replace(array(" ","&","'","+","#","%"), "-", $filename)
This will work fine.
If the filename on disk literally contains the % character codes, you will need to encode those % signs in your URL. Try accessing Aston_Martin_DBS_V12_coup%25C3%25A9_(rear)_b-w.jpg
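To illustrate the point about encoding the percent signs, here is a small sketch using rawurlencode() on the on-disk name from the question. The helper name urlForStoredFile and the base URL parameter are assumptions for the example. Note that rawurlencode() also encodes the parentheses, which is harmless.

```php
<?php
// Hypothetical helper: builds a URL for a file whose on-disk name may
// itself contain characters such as % or #.
function urlForStoredFile(string $baseUrl, string $fileName): string
{
    // rawurlencode() leaves only letters, digits and -_.~ untouched.
    return rtrim($baseUrl, '/') . '/' . rawurlencode($fileName);
}

// The literal "%C3%A9" in the stored name becomes "%25C3%25A9" in the URL.
// echo urlForStoredFile('/upload/server/php/files/foto/test',
//     'Aston_Martin_DBS_V12_coup%C3%A9_(rear)_b-w.jpg');
```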

Image src with special characters

I have a problem with my images not being displayed when their names contain a # or % symbol.
I am using PHP to read a directory and display all images, but any with those symbols just have broken links. The images are uploaded to the server fine but won't display.
I think you'll need to write a function which replaces the % and # characters with their corresponding URL-encoded forms. You can find a reference here:
http://www.w3schools.com/tags/ref_urlencode.asp
You should compare the output of your script to the directory listing you get for that directory in your browser; then you will see the correct mapping for your special characters.
Are you uploading the images via PHP? Then you could map special characters to spaces or dashes. I don't think having special characters in file names is a good idea.
The best option is to strip out all those characters before uploading the images. (If you are using FTP, rename your files. If you are using PHP to upload, write a function which does it.)
Otherwise you will need to escape the symbols: take a look at urlencode.
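A minimal sketch of the escaping route, assuming the images sit in one directory that is served under a known base URL. The function name imageTags, the extension list and both parameters are illustrative assumptions, not taken from the question.

```php
<?php
// Hypothetical helper: lists image files in $dir and returns <img> tags
// whose src has each filename percent-encoded.
function imageTags(string $dir, string $baseUrl): array
{
    $tags = [];
    foreach (scandir($dir) as $file) {
        // Only consider common image extensions (an assumption for this sketch).
        if (!preg_match('/\.(jpe?g|png|gif)$/i', $file)) {
            continue;
        }
        // Encode the filename as a single path segment: % becomes %25, # becomes %23.
        $src = rtrim($baseUrl, '/') . '/' . rawurlencode($file);
        // Escape for the HTML attribute as a separate step.
        $tags[] = '<img src="' . htmlspecialchars($src, ENT_QUOTES) . '" alt="">';
    }
    return $tags;
}
```

Without the encoding, a browser treats everything after # as a fragment and tries to interpret % as the start of an escape sequence, which would explain the broken links.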

how can I use a different line termination for reads in php?

I'm trying to read a CSV file generated by MS Excel on Linux.
The file has quoted multi-line (0x0A-separated) columns and a 0x0D0A line termination.
PHP on Linux uses 0x0A as the line terminator, so all the line-based tools (file, fgets, fgetcsv) think there are record breaks in the middle of the data cells.
Short of processing the file byte by byte, can I temporarily change PHP's end-of-line character (the PHP_EOL constant) so I can easily parse the file?
I think it can be done in Perl with "$/". Is there something similar in PHP?
I realize I can parse byte by byte, but I'm looking for a cleaner approach.
If conceptDawg's suggestion of auto_detect_line_endings doesn't work, I would recommend reading in the entire file via file_get_contents() and then calling explode() to break up the file into multiple lines. You can pass whatever separator string you want to explode().
You might try using the 'auto_detect_line_endings' run-time configuration option. It says that using this will automatically figure out the correct line endings. From the docs:
When turned on, PHP will examine the data read by fgets() and file() to see if it is using Unix, MS-Dos or Macintosh line-ending conventions.
This enables PHP to interoperate with Macintosh systems, but defaults to Off, as there is a very small performance penalty when detecting the EOL conventions for the first line, and also because people using carriage-returns as item separators under Unix systems would experience non-backwards-compatible behaviour.
If that doesn't work, then you could always read the entire file into memory (depending on the file size this might not be feasible) and do a preg_replace on the characters in question, replacing them with the "correct" characters.
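A minimal sketch of the explode() idea, assuming the layout described in the question: records end with CRLF, line breaks inside quoted cells are bare LF, and the file is small enough to hold in memory. The helper name parseCrlfCsv is hypothetical.

```php
<?php
// Hypothetical helper: splits CSV text into records on CRLF only, then parses
// each record, so bare LF characters inside quoted cells are preserved.
// ASSUMPTION: CRLF never occurs inside a cell.
function parseCrlfCsv(string $contents): array
{
    $rows = [];
    foreach (explode("\r\n", $contents) as $record) {
        if ($record === '') {
            continue; // skip the empty piece after the final CRLF
        }
        // Delimiter, enclosure and escape are passed explicitly.
        $rows[] = str_getcsv($record, ',', '"', '\\');
    }
    return $rows;
}

// Usage ($file is a placeholder path):
// $rows = parseCrlfCsv(file_get_contents($file));
```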
