I'm sorry if the question is ambiguous, I'll try to explain.
I'm working on an existing PHP download script for videos and some parts of it are broken. There's code in there that's supposed to place a specific member code inside the video file before download, but it doesn't work. Here's the code:
//embed user's code in video file
$fpTarget = fopen($filename, "a");
fwrite($fpTarget, $member_code);
fclose($fpTarget);
$member_code is a random 6-character code.
Now, this would make sense to me if it were a text file, but since it's a video file, how could this possibly work and what is it supposed to do? If the member code is somehow added to the video, how can I see it after downloading it? I have no experience with video files, so any help is appreciated (a modification of the available code or new code would be equally welcome).
I'm sorry I can't give a more precise description of what the code is supposed to do, I'm trying to figure that out myself.
It may work, depending on the format/type of the video. MPG files are fairly tolerant of "noise" in a file and players would skip over your code because it doesn't look like valid video frame data.
Other formats/players may puke, because the format requires certain data be at specific offsets relative to the end of the file, which you've now shifted by 6 characters.
Your best bet is to see if whatever format you're serving up has provisions for metadata in its specifications, e.g. there might be support for a comment field somewhere that you can simply slap the code into.
However, if you're doing all this for 'security' or tracking unauthorized sharing of the video, then simply writing the number into a header is fairly easy to bypass. A better bet would be to watermark the video somehow so that the code is embedded in the actual video data, so that "This video belongs to member XYZ only" is displayed while playing.
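As for the "how can I see it" part: because the script opens the file in append mode, the code ends up as the last bytes of the file, so it can be read back from there. Here is a minimal sketch, assuming the 6-character length from the question. The function names are made up for illustration, and it stamps a per-download copy rather than the master file so that codes don't pile up across downloads. Whether a player still accepts the stamped file depends on the format, as described above.

```php
<?php
// Hypothetical helpers: stamp a per-download copy, then read the stamp back.

const MEMBER_CODE_LENGTH = 6; // from the question: a random 6-character code

function stampCopy(string $source, string $target, string $memberCode): void
{
    if (strlen($memberCode) !== MEMBER_CODE_LENGTH) {
        throw new InvalidArgumentException('Unexpected member code length');
    }
    if (!copy($source, $target)) {
        throw new RuntimeException("Cannot copy $source to $target");
    }
    // "ab" = append in binary mode; the code becomes the last bytes of the copy.
    $fp = fopen($target, 'ab');
    if ($fp === false) {
        throw new RuntimeException("Cannot open $target for appending");
    }
    fwrite($fp, $memberCode);
    fclose($fp);
}

function readStamp(string $file): string
{
    $fp = fopen($file, 'rb');
    if ($fp === false) {
        throw new RuntimeException("Cannot open $file for reading");
    }
    // Seek to MEMBER_CODE_LENGTH bytes before the end and read the tail.
    fseek($fp, -MEMBER_CODE_LENGTH, SEEK_END);
    $code = fread($fp, MEMBER_CODE_LENGTH);
    fclose($fp);
    return $code;
}
```

The download script would call stampCopy() and serve the copy; readStamp() can later be run on a file that turns up somewhere to see whose code it carries.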
You don't write to the content of the file directly, not like you would with a text file. As you've noticed, this effectively corrupts the video and you have no way of reasonably reading the information.
For audio/video files, you write to meta-data that's packaged with the file. How this is packaged and what you can do with it generally depends heavily on the container format used for the file. (Remember that container and codec are two different things. The codec is the format used to encode the audio/video, the container is the file format in which that data stream is stored.)
A library like getID3 might be a good place to start. I've never used it, but it seems to be what you're looking for. What you would essentially do is write a value to the meta-data in the container (either a pre-defined value for that container or maybe a custom key/value pair, etc.) which would be part of the file. Then, when reading the file, you can get that data. (Now, that last part depends heavily on what's reading the file. The data is there, but not every player cares about it. You'll want to match up what you're writing to with what you usually see/read from the file's internal meta-data.)
I am currently developing a web solution in PHP 8.0 using Symfony 5.3.7 where I need to allow users to download a file with custom metadata.
For example, I have a file a.jpg on the server and I created a metadata entry "Resume: John is looking to Marie", which is stored in the database and linked to the file.
If a user clicks a button to download the file, I need to write the metadata stored in the database into the file before the download, so that if he then carries a.jpg around on a USB key or whatever, the metadata is still in it.
Does anyone know how to do this with Symfony or even native PHP?
I am thinking of creating a download function to do this, but I can't find out how to write the metadata into the file.
The only thing I could find is this https://www.php.net/manual/fr/pharfileinfo.setmetadata.php, but I don't even understand how it works.
I need this for multiple file types: images, videos, audio files and PDFs.
There is no universal way to add arbitrary metadata like this to a file, regardless of whether you're using PHP or anything else.
At its most basic, a file is just a series of bits, and the way we interpret those bits is what we call a "file format". Some file formats are very simple - the whole file is interpreted as a series of letters, or a series of coloured pixels; others are much more complex, with different sections interpreted as different types of information.
A JPEG image file, for instance, can have sections of data referred to as "EXIF information", which can store details about the image, including arbitrary text. Similarly, MP3 files can have sections called "ID3", which are used to store things like track and artist names to be displayed by media players. You can probably find tools or libraries for editing both of those formats, but you won't be able to use an EXIF editor on an MP3 file or an ID3 editor on a JPEG.
The function you found was for managing the metadata on PHAR files, which are PHP code archives. It's not going to be useful for editing images, PDFs, or anything else.
So, you need to identify the different types of file you want to edit, and then find out two things: firstly, does the file format have anywhere to put the metadata at all; and secondly, what tools can edit the metadata in that particular file type. You might find some libraries or tools which can manage metadata in multiple audio formats, or multiple image formats, but chances are you're going to need to integrate more than one to get the breadth of support you seem to want.
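To make the "format-specific" point concrete, here is a minimal sketch for one format only, JPEG. It does not write EXIF. It uses the simpler JPEG comment (COM) segment instead, which can also hold arbitrary text and needs no library. The function names are made up for illustration, and the code assumes a well-formed JPEG. Videos, audio files and PDFs would each need their own, different handling.

```php
<?php
// Sketch: store free text in a JPEG comment (COM) segment and read it back.

function jpegAddComment(string $jpeg, string $comment): string
{
    if (strncmp($jpeg, "\xFF\xD8", 2) !== 0) {
        throw new InvalidArgumentException('Not a JPEG: SOI marker missing');
    }
    $length = strlen($comment) + 2; // the length field counts its own 2 bytes
    if ($length > 0xFFFF) {
        throw new InvalidArgumentException('Comment too long for one segment');
    }

    // Skip any leading APPn segments (JFIF/EXIF headers) so they stay first.
    $pos = 2;
    $size = strlen($jpeg);
    while ($pos + 4 <= $size && $jpeg[$pos] === "\xFF"
        && (ord($jpeg[$pos + 1]) & 0xF0) === 0xE0) {
        $pos += 2 + unpack('n', substr($jpeg, $pos + 2, 2))[1];
    }

    // COM segment: marker 0xFFFE, big-endian length, then the text itself.
    $segment = "\xFF\xFE" . pack('n', $length) . $comment;
    return substr($jpeg, 0, $pos) . $segment . substr($jpeg, $pos);
}

function jpegReadComments(string $jpeg): array
{
    $comments = [];
    $pos = 2;
    $size = strlen($jpeg);
    // Walk the header segments until the image data (SOS) or the end (EOI).
    while ($pos + 4 <= $size && $jpeg[$pos] === "\xFF") {
        $marker = ord($jpeg[$pos + 1]);
        if ($marker === 0xDA || $marker === 0xD9) {
            break;
        }
        $length = unpack('n', substr($jpeg, $pos + 2, 2))[1];
        if ($marker === 0xFE) {
            $comments[] = substr($jpeg, $pos + 4, $length - 2);
        }
        $pos += 2 + $length;
    }
    return $comments;
}
```

In a download action you would read a.jpg into a string, call jpegAddComment() with the text from the database, and send the result as the response body. Not every viewer displays COM segments, so check with the tools your users actually have.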
Here is the scenario:
I have a variable in php that has raw contents of an excel file and I want to parse the contents of that variable (which is in an excel format, or can also be in a pdf format) for a certain value. I am looking for a keyword near the end of the contents of the file and will need to extract some of the contents near the desired value inside the contents of the file so I can get it into a variable in php and output to my webpage. From what I know the file is in binary, or hex representation but the ascii conversion is represented as readable text with diamond characters (with a question mark) and rectangles with a border and other extraneous characters including readable text content.
Here are the requirements:
I don't want to parse the contents of the file by first saving it to disk. I want to parse the contents of the retrieved file directly while it is in a PHP variable.
Here is my question:
How do I go about this? Should I rely upon PHPExcel to read this content if possible? If not, what php libraries can accomplish this task?
Should I rely upon PHPExcel to read this content if possible?
It is not possible (see below).
If not, what PHP libraries can accomplish this task?
None that I know of.
How do I go about this?
An Excel file (rather, an Excel 2007+ XLSX file - Excel97 XLS files are a wholly different can of worms) is a ZIP archive containing XML and other files in a tree structure. So your first stage is to decompress a ZIP file in a string; PHPExcel relies on the ZipArchive class, and this, in turn, does not support string reading and also bypasses most stream hacks. A similar problem - actually exactly the same problem - is described in this question.
You could think of using stream wrapping to decode the file from a string, and the first part - the reading - would work. The writing of the files would not. And you cannot modify the ZipArchive class so that it writes to a memory object, because it is a native class.
So you can employ a slight variation, from one of the answers above (the one by toster-cx). You need to decode the ZIP structure yourself, and thus get the offset in the ZIP file where the file you need begins. This will either be /xl/worksheets/sheet1.xml or /xl/sharedStrings.xml, depending on whether the string has been inlined by Excel, or not. This also assumes that the format is the newer XLSX. Once you have that, you can extract the data from the string and decompress it, then search it for the token.
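A rough sketch of that "decode the ZIP structure yourself" step, working purely on a string. This is not the code from the linked answer, just my own illustration: the function name is hypothetical, it only handles stored and deflated entries, and it ignores Zip64, encryption and multi-disk archives. It needs the zlib extension for gzinflate(). Note that entry names inside the archive have no leading slash, e.g. xl/sharedStrings.xml.

```php
<?php
// Sketch: pull one entry out of a ZIP archive held in a PHP string.

function zipExtractFromString(string $zip, string $wanted): ?string
{
    // The end-of-central-directory record sits at the very end of the archive.
    $eocd = strrpos($zip, "PK\x05\x06");
    if ($eocd === false) {
        throw new RuntimeException('Not a ZIP archive: no EOCD record');
    }
    $end = unpack('vcount/VcdSize/VcdOffset', substr($zip, $eocd + 10, 10));

    $pos = $end['cdOffset'];
    for ($i = 0; $i < $end['count']; $i++) {
        if (substr($zip, $pos, 4) !== "PK\x01\x02") {
            throw new RuntimeException('Corrupt central directory');
        }
        $h = unpack(
            'vmethod/vtime/vdate/Vcrc/Vcsize/Vusize/vnameLen/vextraLen/vcommentLen',
            substr($zip, $pos + 10, 24)
        );
        $name = substr($zip, $pos + 46, $h['nameLen']);

        if ($name === $wanted) {
            // The local header repeats name/extra lengths; the data follows it.
            $local = unpack('V', substr($zip, $pos + 42, 4))[1];
            $l = unpack('vnameLen/vextraLen', substr($zip, $local + 26, 4));
            $start = $local + 30 + $l['nameLen'] + $l['extraLen'];
            $data = substr($zip, $start, $h['csize']);

            if ($h['method'] === 0) {
                return $data;               // stored, no compression
            }
            if ($h['method'] === 8) {
                $plain = gzinflate($data);  // raw deflate stream
                if ($plain === false) {
                    throw new RuntimeException("Cannot inflate $wanted");
                }
                return $plain;
            }
            throw new RuntimeException("Unsupported method {$h['method']}");
        }
        $pos += 46 + $h['nameLen'] + $h['extraLen'] + $h['commentLen'];
    }
    return null; // no entry with that name
}
```

With the XML in hand, an ordinary strpos() or preg_match() on the returned string finds the token, and nothing ever touches the disk.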
Of course, a more efficient use of the time would be to determine exactly why you don't want to use temporary files. Maybe that problem can be solved another way.
Speed problem
Actually, reading and writing a whole Excel file need not be so terrible, because in this case you don't have to do that. You can almost certainly treat it as a Zip file, and open it using ZipArchive and getStream() to directly access the internal sub-file you're interested in. This operation will be quite fast, also because you can run the search inside the getStream() read loop. You do need to write the file to disk once, but nothing more.
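Something along these lines is what I mean by searching inside the read loop. It is only a sketch: the function names are hypothetical, and the search helper works on any readable stream, keeping a small overlap between chunks so a token split across two reads is still found. The second function needs the zip extension.

```php
<?php
// Sketch: search a stream chunk by chunk without loading it all in memory.

function streamContains($stream, string $token, int $chunkSize = 8192): bool
{
    if ($token === '') {
        throw new InvalidArgumentException('Token must not be empty');
    }
    $overlap = strlen($token) - 1; // bytes carried over between chunks
    $tail = '';
    while (!feof($stream)) {
        $chunk = fread($stream, $chunkSize);
        if ($chunk === false || $chunk === '') {
            break;
        }
        $window = $tail . $chunk;
        if (strpos($window, $token) !== false) {
            return true;
        }
        // Keep the end of the window in case the token straddles two reads.
        $tail = $overlap > 0 ? substr($window, -$overlap) : '';
    }
    return false;
}

// Usage with an XLSX on disk; $entry is e.g. 'xl/sharedStrings.xml'.
function xlsxEntryContains(string $path, string $entry, string $token): bool
{
    $zip = new ZipArchive();
    if ($zip->open($path) !== true) {
        throw new RuntimeException("Cannot open $path as a ZIP archive");
    }
    $stream = $zip->getStream($entry);
    if ($stream === false) {
        $zip->close();
        throw new RuntimeException("No entry $entry in $path");
    }
    try {
        return streamContains($stream, $token);
    } finally {
        fclose($stream);
        $zip->close();
    }
}
```

To extract the text near the token rather than a yes/no answer, return a slice of the window around the match instead of true.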
In fact, chances are that you can write the file while it is being uploaded (what do you use for Web upload? The plupload JS library has a very nice hook to capture very large files one chunk at a time). You still need a temporary area on the disk where to store the data, but in this case the time expenditure will be exclusively dedicated to the decompression and reading of the XML sub-file - the same thing you'd have needed to do with a string object.
It is also (perhaps, depending on several factors, mainly the platform and operating system) possible to offload this part of the work to a secondary process running in the background, so that the user sees the page reload immediately, while the information appears after a while. This part, however, is pretty tricky and can rapidly turn into a maintenance nightmare (yeah, I do have first-hand experience on this. In my case it was tiled image conversion).
Cheating
OK, fact is I love cheating; it's so efficient. You say that you control the XLSX and PDF being created? Well! It turns out that in both cases, you can add hidden metadata to the file. And those metadata are much more easily read than you might think.
For example, you can add zip archive comments to an XLSX file, since it is a Zip file. Actually you could add a fake file with zero length to the archive, call it INVOICE_TOTAL_12345.xml, and that would mean that the invoice total is 12345. The advantage is that the file names are stored in the clear inside the XLSX file, so you can just use preg_match and look for INVOICE_TOTAL_([0-9]+)\.xml and retrieve your total.
Same goes for PDF. You can store keywords in a PDF. Just add a keyword attribute named "InvoiceTotal" (check the PDF to see how that turns out). But there is also a PDF ID inside the PDF, and that ID will be at the very end of the PDF. It will be something like /ID [<ec144ea3ecbb9ab8c22b413fec06fe29><ec144ea3ecbb9ab8c22b413fec06fe29>], but just use a known sequence such as deadbeef and ec144ea3ecbb9ab8c22deadbeef12345 will, again, mean the total is 12345. The ID before the known sequence will be random, so the overall ID will still be random and valid.
In both cases you could now just look for a known token in the string, exactly as requested.
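Both lookups boil down to one regular expression each, run on the raw file contents in a variable. A minimal sketch, assuming the INVOICE_TOTAL_ file name and the deadbeef marker from the examples above; the function names are made up for illustration:

```php
<?php
// Sketch: recover the hidden total straight from the raw bytes in a variable.

function findTotalInXlsx(string $rawXlsx): ?int
{
    // ZIP entry names are stored uncompressed, so a plain regex can see them.
    if (preg_match('/INVOICE_TOTAL_([0-9]+)\.xml/', $rawXlsx, $m) === 1) {
        return (int) $m[1];
    }
    return null;
}

function findTotalInPdf(string $rawPdf): ?int
{
    // Look inside the first /ID string: random hex, the marker, then the total.
    if (preg_match('~/ID\s*\[\s*<[0-9a-f]*deadbeef([0-9]+)>~i', $rawPdf, $m) === 1) {
        return (int) $m[1];
    }
    return null;
}
```

This only works if you generate the files yourself and plant the token at creation time; a file from anywhere else simply returns null.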
I need to generate a PDF file in PDF/X-1a:2001 format using Photoshop or InDesign, then write over it using PHP (or another language) with a specific font (embedded in the PDF file), and export the result as PDF/X-1a:2001 as well.
Is it possible? I googled but found nothing about it.
Has anyone already done something like that?
Thanks.
I tried opening the PDF/X-1a:2001 file in FPDF as the source file, but when I exported it, it lost the PDF/X-1a:2001 format.
To answer your question as literally as possible: yes, it's possible.
PDF/X-1a is not magic, it's just a very well defined subset of PDF. So, as long as the objects you add to the PDF/X-1a file are compliant with the specification (which, for example, says that all objects must be in a few well-defined color spaces such as CMYK, gray or spot color), you won't break compliance.
Of course the second requirement is that your PDF engine (the library you end up using) does the right thing as well. It shouldn't throw away the PDF/X-1a identification in the file and it shouldn't add content that makes the file non-compliant.
By the way, don't rely on simply looking at the file's metadata to determine whether it is PDF/X-1a compliant. That metadata only says the file claims to be compliant; which has nothing to do with the file actually being compliant.
This is going to sound like an odd request.
I have a PHP script pulling an MP3 stream from SoundCloud and repeating the stream with the correct headers to allow Winamp to play the file. But Winamp only shows the local URL I have the script running from. Before anyone asks, I am injecting ID3v1 into the file before echoing it.
Is there any way to provide Winamp with the metadata from PHP?
Just to clarify, you are effectively proxying an MP3 file from SoundCloud, and you want to embed metadata into it?
Winamp will pick up ID3 tags in an HTTP-served MP3 file. However, if you are using ID3v1, those tags don't exist until the very end of the file. If you want the file to be identified without having to download the whole file, you must use ID3v2 tags, which are typically located at the beginning of the file. (I actually recommend using both ID3v1 and ID3v2 for broader player compatibility, but almost everything supports ID3v2, so it is your choice.)
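If you don't want to pull in a tagging library, a bare-bones ID3v2.3 tag is easy to build by hand. The following is a minimal sketch, not a full implementation: the function names are made up, it only writes text frames such as TIT2 (title) and TPE1 (artist), and it assumes the text is already ISO-8859-1.

```php
<?php
// Sketch: build a minimal ID3v2.3 tag to echo before the proxied MP3 bytes.

// Tag sizes are "syncsafe": four bytes carrying 7 bits each.
function syncsafe(int $n): string
{
    return chr(($n >> 21) & 0x7F) . chr(($n >> 14) & 0x7F)
         . chr(($n >> 7) & 0x7F) . chr($n & 0x7F);
}

function id3v2TextFrame(string $id, string $text): string
{
    $payload = "\x00" . $text; // leading 0x00 = ISO-8859-1 text encoding
    // Frame header: 4-char ID, 32-bit big-endian size, 2 flag bytes.
    return $id . pack('N', strlen($payload)) . "\x00\x00" . $payload;
}

function id3v2Tag(array $frames): string
{
    $body = '';
    foreach ($frames as $id => $text) {
        $body .= id3v2TextFrame($id, $text);
    }
    // Tag header: "ID3", version 2.3.0, no flags, size of everything after it.
    return "ID3\x03\x00\x00" . syncsafe(strlen($body)) . $body;
}
```

In the proxy script you would echo id3v2Tag(['TIT2' => $title, 'TPE1' => $artist]) before the first byte of audio. If you send a Content-Length header, remember to add the tag's length to it.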
Now, there is another method but if you use this method the metadata won't be saved in the file when downloaded. You can use SHOUTcast-style metadata. Basically, Winamp and other clients (like VLC) send a request header, Icy-MetaData: 1. This tells the server that it supports SHOUTcast-style metadata. In your server response, you would insert metadata every 8KB or so. Basically, you want the reverse of what I have detailed here: https://stackoverflow.com/a/4914538/362536
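For completeness, the server side of SHOUTcast-style metadata looks roughly like this. After every N bytes of audio (N is announced in an icy-metaint response header) the server inserts one length byte, counted in 16-byte units, followed by that many NUL-padded bytes of text such as StreamTitle='...';. A sketch with hypothetical function names, which sends the title once and empty metadata blocks afterwards:

```php
<?php
// Sketch: interleave SHOUTcast-style (ICY) metadata into an audio byte string.

function icyMetaBlock(string $title): string
{
    if ($title === '') {
        return "\x00"; // length byte 0 = no metadata change
    }
    $meta = "StreamTitle='" . $title . "';";
    $blocks = (int) ceil(strlen($meta) / 16);
    if ($blocks > 255) {
        throw new InvalidArgumentException('Title too long for one block');
    }
    // One length byte (in 16-byte units), then the text padded with NULs.
    return chr($blocks) . str_pad($meta, $blocks * 16, "\x00");
}

function icyInterleave(string $audio, int $metaint, string $title): string
{
    if ($metaint < 1) {
        throw new InvalidArgumentException('metaint must be positive');
    }
    $out = '';
    $first = true;
    foreach (str_split($audio, $metaint) as $chunk) {
        $out .= $chunk;
        // A metadata block follows every complete run of $metaint audio bytes.
        if (strlen($chunk) === $metaint) {
            $out .= $first ? icyMetaBlock($title) : "\x00";
            $first = false;
        }
    }
    return $out;
}
```

Only do this when the client asked for it, i.e. when $_SERVER['HTTP_ICY_METADATA'] is '1', and send header('icy-metaint: 8192') (or whatever interval you use) with the response. A real proxy would apply the same logic chunk by chunk while streaming instead of buffering the whole file in a string.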
In the end, simply adding ID3v2 tags will solve your problem in the best way, but I wanted to mention the alternative option in case you needed it for something else.
I want to give users a preview of certain files on my site and will be using the Scribd API. Does anyone know how I can take the full file on my server and save part of it under a different name, which I will then show to users? I can't think of a way to do this with PHP for .docx and image files. Help is much appreciated.
For "splitting" images, use an image processing library like gd to crop the image (lots of examples to be found on how to do that all over the place). For Word documents, use a library like PHPWord (or one of the other myriad such libraries) to open the document, remove/extract as much text as you need, then save that into a new Word file.
For other file types, find the appropriate method that allows you to manipulate that format, then do whatever you need to do with it.