I have a PHP script which uses the Flickr API to download my images from Flickr, parse the associated text and metadata, and save versions on my server with the metadata embedded in the image files. I work with historic images and want to display them in date order on my smartphone (I'm trying out the F-Stop app on Android).
I've got the metadata update working using the PHP JPEG Metadata Toolkit - http://www.ozhiker.com/electronics/pjmt/ - by writing XMP data to the files. But for the life of me I can't seem to get the 'date taken' working!
Here are some sample images:
This is the original file from Flickr, with the date set as the date I created the file http://metapicz.com/#landing?imgsrc=http%3A%2F%2Fwww.whatsthatpicture.com%2Ftools%2FPHP_JPEG_Metadata_Toolkit%2Fflickr.jpg (right-click on the image and save it if you want to inspect it locally)
Here's my first attempt with the toolkit. It has updated the XMP 'DateCreated' but not the EXIF CreateDate or XMP CreateDate http://metapicz.com/#landing?imgsrc=http%3A%2F%2Fwww.whatsthatpicture.com%2Ftools%2FPHP_JPEG_Metadata_Toolkit%2Fprocessed_orig.jpg
So I then forced it to change the XMP CreateDate http://metapicz.com/#landing?imgsrc=http%3A%2F%2Fwww.whatsthatpicture.com%2Ftools%2FPHP_JPEG_Metadata_Toolkit%2Fprocessed_new.jpg. This then showed up in Windows Explorer as the date created, but not in the F-Stop app.
So I was wondering if the EXIF CreateDate, which is still at the value from Flickr, was taking precedence so I stripped that metadata out (the toolset doesn't allow you to modify EXIF, as far as I can see) http://metapicz.com/#landing?imgsrc=http%3A%2F%2Fwww.whatsthatpicture.com%2Ftools%2FPHP_JPEG_Metadata_Toolkit%2Fprocessed_new2.jpg
In none of these cases does F-Stop interpret the date correctly. I have contacted the devs but I don't actually think it's the app at fault, I think it's the metadata format in the files. That's because when I displayed that original file in Windows Explorer and changed the Date Taken there, this file works perfectly http://metapicz.com/#landing?imgsrc=http%3A%2F%2Fwww.whatsthatpicture.com%2Ftools%2FPHP_JPEG_Metadata_Toolkit%2Fflickr_win.jpg
Can anyone tell what is going on, or suggest another way I might go about this?
OK, I've now solved this.
It seems that the F-Stop app can't read dates from the XMP data, so it was either reading them from the JFIF/App12/"Ducky" segment at the beginning of the file or, if that didn't exist, reading the file timestamp. Of course, that meant my plan to create a single XMP profile with all my metadata wasn't going to work.
I switched to ImageMagick but that faced the same problem: I could strip profiles and load/change an XMP profile, but couldn't immediately see a way to get it to create/update the date values in the JFIF segment.
So in the end I resorted to calling exiftool via an exec command:
exec("exiftool -AllDates='1863-07-23 12:00:00' -overwrite_original testfile.jpg");
(I'll change it away from AllDates and just set CreateDate, but I need to test that)
A bit clunky, but it works! I'm using Imagick anyway for modifying the actual images, so if anyone does know a way I might modify those headers there then I'd be delighted to hear it.
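For anyone going the same route, here is a slightly more defensive sketch of that exec call. It assumes exiftool is installed and on the PATH; the helper name buildExiftoolDateCommand is made up for illustration, and which tag F-Stop actually reads is still something to test. The point is only that escapeshellarg() keeps spaces or quotes in file names and dates from breaking the shell command.

```php
<?php
// Hypothetical helper (not part of exiftool or PHP): builds the command
// string for setting a date tag with exiftool.
function buildExiftoolDateCommand(string $file, string $dateTime, string $tag = 'AllDates'): string
{
    // Only accept plain tag names such as AllDates or EXIF:DateTimeOriginal.
    if (!preg_match('/^[A-Za-z][A-Za-z0-9:\-]*$/', $tag)) {
        throw new InvalidArgumentException("Unexpected tag name: {$tag}");
    }

    // Quote the whole "-Tag=value" argument and the file name for the shell.
    return 'exiftool '
        . escapeshellarg("-{$tag}={$dateTime}")
        . ' -overwrite_original '
        . escapeshellarg($file);
}

// Usage (commented out so the sketch runs without exiftool installed):
// exec(buildExiftoolDateCommand('testfile.jpg', '1863-07-23 12:00:00'), $output, $exitCode);
// if ($exitCode !== 0) { /* handle the failure, $output holds exiftool's messages */ }
```

Checking the exit code is worth doing, since exec() on its own will not tell you that exiftool refused the file.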
I am currently developing a web solution in PHP 8.0 using Symfony 5.3.7 where I need to allow users to download a file with custom metadata.
For example, I have a file a.jpg on the server, and I created a metadata entry "Resume: John is looking to Marie", which is stored in the database and linked to the file.
If a user clicks a button to download the file, I need to write the metadata stored in the database into the file before the download, so that if they take a.jpg away on a USB key or whatever, the metadata is in it.
Does anyone know how to do this with Symfony or even native PHP?
I am thinking of creating a download function to do this, but I can't find out how to write the metadata into the file.
The only thing I could find is this https://www.php.net/manual/fr/pharfileinfo.setmetadata.php, but I don't even understand how it works.
I need this for multiple file types: images, videos, audio files and PDFs.
There is no universal way to add arbitrary metadata like this to a file, regardless of whether you're using PHP or anything else.
At its most basic, a file is just a series of bits, and the way we interpret those bits is what we call a "file format". Some file formats are very simple - the whole file is interpreted as a series of letters, or a series of coloured pixels; others are much more complex, with different sections interpreted as different types of information.
A JPEG image file, for instance, can have sections of data referred to as "EXIF information", which can store details about the image, including arbitrary text. Similarly, MP3 files can have sections called "ID3", which are used to store things like track and artist names to be displayed by media players. You can probably find tools or libraries for editing both of those formats, but you won't be able to use an EXIF editor on an MP3 file or an ID3 editor on a JPEG.
The function you found was for managing the metadata on PHAR files, which are PHP code archives. It's not going to be useful for editing images, PDFs, or anything else.
So, you need to identify the different types of file you want to edit, and then find out two things: firstly, does the file format have anywhere to put the metadata at all; and secondly, what tools can edit the metadata in that particular file type. You might find some libraries or tools which can manage metadata in multiple audio formats, or multiple image formats, but chances are you're going to need to integrate more than one to get the breadth of support you seem to want.
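As a rough sketch of that "identify the type first" step, something like the lookup below could sit in front of whatever tools you end up integrating. The function name and the table are illustrative assumptions, not an existing API; the JPEG and MP3 entries come from the examples above, the PDF entry is my own addition, and anything not listed simply means "go and research this format".

```php
<?php
// Hypothetical lookup: which metadata container a file type offers.
// The table is deliberately incomplete; extend it per format you support.
function metadataContainerFor(string $mimeType): ?string
{
    $containers = [
        'image/jpeg'      => 'EXIF',
        'audio/mpeg'      => 'ID3',
        'application/pdf' => 'PDF document information dictionary',
    ];

    // null means: no known place for metadata yet, so don't try to write any.
    return $containers[strtolower($mimeType)] ?? null;
}

// Usage: pick a format-specific tool based on the answer.
$container = metadataContainerFor('audio/mpeg');
echo $container === null
    ? "No known metadata container\n"
    : "Write the metadata as {$container}\n";
```

Each non-null answer still needs its own library or command-line tool behind it; the lookup only keeps you from pointing an EXIF editor at an MP3.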
So I am developing a new course format in which a picture is associated with each activity in a course and presented visually. I created the course format, overrode the renderer, etc. That all worked fine. However, the images are supposed to be custom generated, and since it has to work for all existing and future activities, I put some additional code into the general course module form, enabling an image upload.
After admittedly some struggle on my part to get the File API working, it now all works fine. Only in my course format, there is an additional heading under which you can upload a single image. This gets saved to the database fine, it is not in draft, and it is perfectly viewable in my dataroot's filedir if I follow the contenthash in the database. It even gets loaded into the form as a default. However, if I try to work with the image, all tests run fine (.is_valid_img(), etc.) and I even get offered a file to download. When I download it, though, it is corrupted and my file viewer says: "Critical Error: Not a png file". Needless to say, it is not displayed on my actual course site.
When I look at the file in filedir, it very clearly is a PNG. Please, I would be thankful for any help, since I have tried a lot and am at my wits' end.
It sounds to me like you are getting some sort of output on the page before the PNG file is sent - that would be added to the start of the file and cause it not to work as a PNG file.
I would suggest you open the file in a hex editor and check the start of the file - it should look like https://en.wikipedia.org/wiki/Portable_Network_Graphics#File_header, so look for extra characters before that.
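If you don't have a hex editor to hand, the same check can be sketched in a few lines of PHP. A PNG file starts with the eight bytes 89 50 4E 47 0D 0A 1A 0A; the function name describePngStart and the file name in the usage comment are just placeholders I made up.

```php
<?php
// The fixed 8-byte signature every PNG file starts with.
const PNG_SIGNATURE = "\x89PNG\r\n\x1a\n";

// Hypothetical helper: reports whether the data starts with the PNG
// signature, and if not, what sits in front of it.
function describePngStart(string $bytes): string
{
    if (strncmp($bytes, PNG_SIGNATURE, 8) === 0) {
        return 'ok';
    }

    $pos = strpos($bytes, PNG_SIGNATURE);
    if ($pos === false) {
        return 'no PNG signature found';
    }

    // Show the stray bytes as hex so whitespace and BOMs are visible.
    return sprintf(
        '%d stray byte(s) before the PNG signature: %s',
        $pos,
        bin2hex(substr($bytes, 0, $pos))
    );
}

// Usage (placeholder file name):
// echo describePngStart(file_get_contents('downloaded.png')), "\n";
```

A result such as "1 stray byte(s) ... 0a" would point at a stray newline being output before the image data.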
As for where the extra characters come from - they may be an obvious warning / error message (which should be easy to track down and fix). Alternatively, you may have some stray 'echo' statements (again, fairly easy to track down). The worst problems to find are extra characters before the opening 'php' tags of a file somewhere in your install or after the closing tag at the end of a file (which is why you should never use closing PHP tags). Finding these will come down to searching through all your customised code files to locate them.
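To speed up that search, one option is to let PHP's tokenizer point out anything in a file that sits outside the PHP tags, since that is exactly what gets sent to the browser. This is only a sketch: findInlineOutput is a name I made up, it assumes the tokenizer extension is enabled (it normally is), and template files that legitimately mix HTML and PHP will of course be flagged too.

```php
<?php
// Hypothetical helper: lists every chunk of a PHP source string that lies
// outside <?php ... ?> tags and would therefore be output as-is.
function findInlineOutput(string $source): array
{
    $found = [];
    foreach (token_get_all($source) as $token) {
        // T_INLINE_HTML covers a BOM or spaces before the opening tag as
        // well as anything after a closing tag.
        if (is_array($token) && $token[0] === T_INLINE_HTML) {
            $found[] = [
                'line'  => $token[2],
                'bytes' => bin2hex($token[1]),
            ];
        }
    }
    return $found;
}

// Usage: run it over your customised files, e.g.
// foreach (glob(__DIR__ . '/*.php') as $path) {
//     foreach (findInlineOutput(file_get_contents($path)) as $hit) {
//         echo "{$path}:{$hit['line']} outputs bytes {$hit['bytes']}\n";
//     }
// }
```

A single newline directly after a closing tag is swallowed by PHP and is not reported; anything beyond that is.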
I am currently trying to get the download URL from a response sent via my server, but whenever $file->getdownloadUrl() is used it returns an empty value: ['downloadURL'] =>
My question is, is it possible to download Google Documents in the application/vnd.google-apps.document MIME Type?
My assumption is that these would contain a link to the online version of the document, but it would be good to be able to edit the document in the correct format so that any formatting done would be retained when re-uploaded to Drive.
Nope, you cannot download Google Documents in the application/vnd.google-apps.document MIME type. You can only export them to other formats.
Some workarounds:
Apps Script Document Services provide slightly better control over the document, but you won't be able to get full control over all formatting for now.
Export the file in a known format such as Microsoft Word and edit it. When you upload it back to Drive, you can request that it be converted back to Google Docs format, although you might lose or corrupt some formatting.
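For the export route, a minimal sketch of building the request URL is below. It assumes you can call the Drive API v3 files.export endpoint directly (the question appears to use an older client, so treat this as an assumption) and that you already have an OAuth access token to send in the Authorization header. The helper name buildDriveExportUrl is made up for illustration.

```php
<?php
// Hypothetical helper: builds the Drive API v3 export URL for a Google Doc.
// The request itself still needs an "Authorization: Bearer <token>" header.
function buildDriveExportUrl(string $fileId, string $exportMimeType): string
{
    return 'https://www.googleapis.com/drive/v3/files/'
        . rawurlencode($fileId)
        . '/export?'
        . http_build_query(['mimeType' => $exportMimeType]);
}

// MIME type of a Word .docx file, one of the formats a Google Doc can be
// exported to.
const DOCX_MIME = 'application/vnd.openxmlformats-officedocument.wordprocessingml.document';

// Usage ("abc123" is a placeholder file ID):
echo buildDriveExportUrl('abc123', DOCX_MIME), "\n";
```

The exported .docx can then be edited and uploaded again with conversion requested, subject to the formatting caveats above.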
I want to make a web app that can get the values from a commonly used file type (such as xls or ppt) to allow me to convert it into a custom format (like Google Drive). With an xls (Excel document) file, for example, I want to be able to get the value for each cell. I would be fine getting HTML for a file (like getting the HTML code that would display a Word document) because values can be extracted out of that. I would like to be able to do it on the client side, but I am okay with doing it on the server side with PHP.
Another approach would be to import the file as XML. PHP has great support for XML and could make short work of this. If you can get the files uploaded in OpenDocument Format, you can parse the equivalent of just about any of the types you listed (XLS, PPT, DOC, etc.).
A pretty easy way to get data out of an excel sheet online is to use a Google Apps Script. The process would be a lot to explain here, but with a bit of google searching, you can find all your answers.
As for a PPT, I can't think of an easy way.
As for documents (i.e. pdf, doc, docx), you can use Google Apps Script as well.
Although, if you're making your own tool for this, you may want to just research how the data is stored in the file and work from there.
So I have files....
.doc
.docx
.xls
.xlsx
and .pdf
that are on my server.
Is it possible (and if it is, how) to extract the meta data from those files using PHP?
I'm looking for things like Author, keywords, title, etc...
In office documents it's the information stored along with the document properties (File...Properties...Summary for 2003, Prepare...Properties for 2007).
In PDFs it's information found in Document Properties.
This is not on a Windows server.
I managed to extract a lot of meta information using XPDF on a Linux system a few years back. Nowadays, though, I would say Zend_PDF is your best bet. I haven't used it myself, but it looks good and promises everything you need. It seems to have no library dependencies, either.
For Word .DOCs, if you don't find a better way, plug into an OpenOffice server instance / command line and convert the files to ODT, which is XML and parseable. That is only needed if you can't extract the metadata with a macro; that should be possible, but I don't know how much work it is. This OpenOffice Forum entry gives a ton of starting points for automated conversion.
The ...X formats (.docx, .xlsx) are ZIP archives containing XML, so it should be easy to fetch the metadata from them. Alternatively, you should be able to use OpenOffice's conversion filters here as well, if they carry the metadata across.
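To make the last point a bit more concrete: in .docx and .xlsx files the title, author and keywords normally live in a part called docProps/core.xml inside the ZIP archive. Below is a minimal sketch that parses that XML with SimpleXML. The function name parseCoreProperties is my own, and the sample XML is a hand-written stand-in rather than output from a real document.

```php
<?php
// Hypothetical helper: pulls title, author and keywords out of the
// contents of docProps/core.xml from a .docx or .xlsx file.
function parseCoreProperties(string $coreXml): array
{
    $xml = simplexml_load_string($coreXml);
    if ($xml === false) {
        return [];
    }

    // Dublin Core elements (title, creator) and core-properties elements
    // (keywords) live in two different XML namespaces.
    $dc = $xml->children('http://purl.org/dc/elements/1.1/');
    $cp = $xml->children('http://schemas.openxmlformats.org/package/2006/metadata/core-properties');

    return [
        'title'    => (string) $dc->title,
        'author'   => (string) $dc->creator,
        'keywords' => (string) $cp->keywords,
    ];
}

// Hand-written sample standing in for a real core.xml.
$sample = '<?xml version="1.0" encoding="UTF-8"?>'
    . '<cp:coreProperties'
    . ' xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"'
    . ' xmlns:dc="http://purl.org/dc/elements/1.1/">'
    . '<dc:title>Quarterly report</dc:title>'
    . '<dc:creator>Jane Doe</dc:creator>'
    . '<cp:keywords>sales, 2010</cp:keywords>'
    . '</cp:coreProperties>';

print_r(parseCoreProperties($sample));
```

With the zip extension installed, the XML can be read straight out of the document with something like file_get_contents('zip://' . $path . '#docProps/core.xml'), where $path is your own file path. Properties that are missing from a document simply come back as empty strings.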