I'm working on a project where I need to post-process a bunch of audio files in various formats.
Firstly, the files need to be converted to .WAV format.
Secondly, depending on their length, I need to insert a short audible watermark at certain intervals in each of the new .WAV files.
The first part is easy, using the LAME encoder cli.
The second part is where it gets difficult - I've tried a few methods with both LAME and FFmpeg, but can't seem to get it working.
The script is running as a cron job in the background, so full cli access is available.
If possible, it would be great if someone could point me to an example script/gem or class that does this in some related way.
This gets complicated. You need to actually mix the audio, which, to my knowledge, isn't possible with FFMPEG. The other problem you're going to have is loss of quality if you take an MP3, convert it to WAV so you can work with it, and then re-encode it back to MP3.
I think you can use Sox for this: http://sox.sourceforge.net/
Use FFMPEG first to decode the audio to WAV, adjusting sample rate and bit depth as necessary.
Then, call out to soxmix: http://linux.die.net/man/1/soxmix
If you're ready to take the Python route, I would suggest SciPy, which can read WAV files into NumPy arrays:
from scipy.io import wavfile
fs, data = wavfile.read(filename)
(the official documentation contains details).
Sounds can be conveniently manipulated through the array manipulation routines of NumPy.
scipy.io.wavfile can then write the file back to the WAV format.
SciPy and NumPy are general scientific data tools. More music-centric Python modules can be found on the official web site.
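As a rough illustration of the array approach, here is a minimal sketch of mixing a short watermark into a longer signal at fixed intervals. The function name `add_watermark`, its parameters, and the `gain` default are made up for this example, and it assumes both signals are mono 16-bit arrays at the same sample rate (stereo files come back from SciPy as 2-D arrays and would need extra handling).

```python
import numpy as np

def add_watermark(data, mark, fs, interval_s, gain=0.5):
    """Mix `mark` into `data` every `interval_s` seconds (mono int16 arrays assumed)."""
    step = int(interval_s * fs)            # spacing between watermarks, in samples
    if step <= 0:
        raise ValueError("interval_s * fs must be at least one sample")
    out = data.astype(np.int32)            # widen so the sum cannot overflow int16
    scaled = (mark.astype(np.float64) * gain).astype(np.int32)
    # The first watermark starts one interval into the file, not at sample 0.
    for start in range(step, len(out), step):
        n = min(len(scaled), len(out) - start)   # truncate at the end of the file
        out[start:start + n] += scaled[:n]
    # Clip back into the 16-bit range instead of letting values wrap around.
    return np.clip(out, -32768, 32767).astype(np.int16)
```

With SciPy, `fs, data = wavfile.read(...)` would supply the inputs and `wavfile.write(path, fs, result)` would store the result. Adding the samples (rather than overwriting them) is what makes this a mix, so the original audio stays audible under the watermark.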
The thing is that the client wants to upload a pdf with images as a way of batch processing multiple images at once.
I already looked around, and out of the box PHP can't read PDFs.
What are my alternatives?
I already know the host has not installed ImageMagick or any PDF library, and the exec function is disabled. That's basically leaving me with nothing to work with, I guess?
Does anyone know if there is an online service that can do this, with an api of sorts?
Thanks in advance.
AFAIK, there is no PHP module to do it. There is a command line tool, pdfimages (part of xpdf). For reference, here's how that works:
pdfimages -j source.pdf image
This extracts all images from source.pdf as image-000.jpg, image-001.jpg, etc. Note that -j only produces JPEG files for images that are stored in the PDF as JPEG; other images are written as PPM or PBM files.
Possible Options
Since it is a command line tool, you need exec (or system, passthru, or any of the other command-executing functions built into PHP). As your environment doesn't have that, I see four options:
1. Beg that exec be turned on for you (your hosting provider can limit what you can exec to a single command)
2. Change the design -- how about a ZIP upload?
3. Roll your own, using the source code of pdfimages as a model
4. Let pdfimages do the heavy lifting, by running it on a remote host you do control
Regarding #3, I don't think rolling your own would be too difficult for such a narrow set of requirements. I seem to recall that the image boundaries in a PDF are well defined: read the file up to the start of an image stream, cut at the end of that stream, decode the data according to the stream's filter (JPEG data stored under the DCTDecode filter needs no further decoding), write it to a file, and repeat. However, that may be too much...
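To give a feel for how small a narrow "roll your own" can be, here is a deliberately naive sketch (in Python rather than PHP, to match the other examples on this page). It does not parse the PDF structure at all: it assumes the images are embedded as unmodified JPEG data and simply scans for the JPEG start and end markers. The function name `extract_jpegs` is made up for this example.

```python
def extract_jpegs(pdf_bytes):
    """Return byte strings that look like embedded JPEG files (naive marker scan)."""
    images = []
    pos = 0
    while True:
        start = pdf_bytes.find(b"\xff\xd8\xff", pos)   # JPEG start-of-image marker
        if start == -1:
            break
        end = pdf_bytes.find(b"\xff\xd9", start + 3)   # JPEG end-of-image marker
        if end == -1:
            break
        images.append(pdf_bytes[start:end + 2])        # keep the end marker itself
        pos = end + 2
    return images
```

This will miss images stored with other filters, can cut a JPEG short if it carries an embedded thumbnail, and may pick up false positives inside other compressed streams, so treat it as a starting point rather than a replacement for pdfimages.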
If rolling your own is too complicated, then option #4 is kind of like what Joel Spolsky describes for working with complicated Excel objects (see the numbered list under the bold heading "Let Office do the heavy work for you").
1. Find a cheap hosting environment (e.g. Amazon EC2) that lets you exec and curl
2. Install pdfimages
3. Write a PHP script that takes a URL to a PDF, opens that PDF with curl, writes it to disk, passes it to pdfimages, then returns the URLs of the resulting images.
An example exchange could look like this:
GET http://www.cheaphost.com/pdfimages.php?extract=http://www.limitedhost.com/path/to/uploaded.pdf
Content-type: text/html
<html>
<body>
<ul>
<li>http://www.cheaphost.com/pdfimages.php?retrieve=ab9895v/image-000.jpg</li>
<li>http://www.cheaphost.com/pdfimages.php?retrieve=ab9895v/image-001.jpg</li>
</ul>
</body>
</html>
So your single pdfimages.php script (running on the host with the exec functionality) can both extract images and give you access to the extracted images. When extracting, it reads the PDF you point it at, runs pdfimages on it, and gives you back a list of URLs to call to retrieve the extracted images. When retrieving, it just gives you back a straight image.
You would need to deal with cleanup; perhaps the thing to do would be to delete each image after retrieval. You would also need to handle security: I don't know what's in these images, but the content might need to be wrapped in SSL and other precautions taken.
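The extract half of that remote script could look roughly like the sketch below. It is written in Python instead of PHP purely for illustration, the function names and the `uploaded.pdf` file name are invented, and it assumes pdfimages is installed on the remote host. Cleanup and validation of the incoming URL are left out and would be needed in practice.

```python
import subprocess
import tempfile
import urllib.request
from pathlib import Path

def pdfimages_command(pdf_path, prefix):
    """Build the call described above: pdfimages -j source.pdf image."""
    return ["pdfimages", "-j", str(pdf_path), str(prefix)]

def extract_images(pdf_url, work_dir=None):
    """Download a PDF, run pdfimages on it, and return the extracted file paths."""
    work_dir = Path(work_dir or tempfile.mkdtemp(prefix="pdfimages-"))
    pdf_path = work_dir / "uploaded.pdf"
    urllib.request.urlretrieve(pdf_url, pdf_path)       # fetch the PDF to disk
    # Passing the command as a list avoids shell interpretation of the path.
    subprocess.run(pdfimages_command(pdf_path, work_dir / "image"), check=True)
    return sorted(p for p in work_dir.glob("image-*") if p.is_file())
```

The returned paths are what the script would turn into the list of retrieve URLs shown in the example exchange.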
You can use pdfimages and install it this way:
apt install poppler-utils
Then use it this way to get all the images as PNG files:
pdfimages -png mypdf.pdf image
Images will be placed in the current folder as image-000.png, image-001.png, etc.
There are many options available, including some to change the output format; see the pdfimages man page for more information.
I hope this helps!
I want to compress one small file (or block of data), and only the resulting file size matters.
There is no need to store file information such as the filename, size, or date.
If I use rar/7zip/zip from the CLI, file information is added to the archive, which is no good for me.
I am looking for the best compression solution in terms of file size.
In PHP I can use gzdeflate() or bzcompress() to compress a string and then save the result to a file. I am looking for an equivalent, or a CLI solution.
Environment: Linux, 32/64 bit.
I want to use 7zip/7za in the same way, for string/stream compression.
If I use a binary version of 7z, for example: 7za a -mx9 output.7z input.dat
then the .7z contains the file name, date, and size information, and the file is bigger.
How can I use 7zip, or another better compressor, in the same way as bzcompress or gzdeflate, to compress only the data stream, without file information?
Maybe I cannot use 7zip directly in PHP because it is not supported yet.
Can someone recommend or create a small C/C++ CLI application (or source code, or something in another language) that can be used from the Linux CLI to compress one file and write the output to one file?
For example I want shell exec:
7zcpp input.dat output.7z
or
7zcpp -mx9 input.dat output.7z
Summary: Compression speed is not important, only a smaller file size. I want to compress only one file (string/stream); every byte counts, and there is no need for filename/date information inside the archive. I can use any compressor that is better than 7zip, but I think it is currently one of the best. Any ideas or recommendations?
Thanks
7Zip memory compression functions: you could check out their SDK and look through the functions. It is available in C, C++, Java, and C# (too bad, no PHP, unless you create your own PHP extension to bring 7z functionality into your app). In the LZMA SDK, go to the path C\Util\Lzma and find LzmaUtil.c; it's a perfect example to help you. I've personally used zlib, though: lzma compresses to 8% of the data set size, compared to zlib's 12%, even though zlib is much faster. But since you don't care about speed, lzma is best for you.
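If a scripting language is acceptable instead of a C/C++ tool, the comparison can be tried quickly with Python's standard library. The sketch below compresses the same bytes as raw DEFLATE (the format PHP's gzdeflate() produces), bzip2, and raw LZMA2 without any container, and picks the smallest. The function names are made up, and the preset choice is only an assumption about what "best" means here.

```python
import bz2
import lzma
import zlib

# Raw LZMA2 stream at the highest preset; the same filter list is needed to decompress.
LZMA_FILTERS = [{"id": lzma.FILTER_LZMA2, "preset": 9 | lzma.PRESET_EXTREME}]

def compress_candidates(data):
    """Compress `data` with three stream compressors, none storing file metadata."""
    deflater = zlib.compressobj(9, zlib.DEFLATED, -15)  # negative wbits: raw deflate, no header
    return {
        "deflate": deflater.compress(data) + deflater.flush(),
        "bzip2": bz2.compress(data, 9),
        "lzma": lzma.compress(data, format=lzma.FORMAT_RAW, filters=LZMA_FILTERS),
    }

def smallest(data):
    """Return (name, compressed_bytes) for whichever candidate is smallest."""
    results = compress_candidates(data)
    name = min(results, key=lambda key: len(results[key]))
    return name, results[name]
```

Which compressor wins depends on the data, and for very small inputs the fixed per-stream overhead can matter more than the algorithm, so it is worth measuring on real samples. Note that a raw LZMA stream carries no header, so the decompressor has to be given the same filter settings.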
What is, according to you, the best way to convert uploaded files of any kind (.doc, .docx,...) into a pdf-file using nothing but php. Is it even possible to do so?
I looked at FPDF, but this creates the pdf files from text.
Another solution previously given was to use the PDFlib library on your server, but unfortunately, my server doesn't support this library...
What is the best way to convert the files my users upload on my site to pdf files?
A simpler approach would be to restrict uploads to the PDF format programmatically and require your users to upload only .pdf files. Provide a link on the upload page to a free PDF printer (e.g. CutePDF Writer) that the user can install to create .pdf documents from any file that can be printed.
Trying to do it through PHP will be problematic because the uploads could be generated from many different programs that would be impossible to cater for in their entirety. e.g. How would it handle Scribus or ABC Flowcharter or any other 'non-standard' application someone used to create a document?
Much better to filter the upload upfront.
The best server-side PDF generator from those I tried was, so far, wkhtmltopdf, a WebKit-based, self-contained invisible browser that can render any HTML+CSS and generate a PDF from it. Reasonably fast and fairly reliable, has some useful PDF options, such as page size, orientation, etc.
The second part of the job in your case is to convert documents to HTML prior to feeding them to wkhtmltopdf. If possible, have your users upload the docs in HTML (Word and Co. can export (crappy) HTML). If this is not an option, you will have to find a tool just for that, which, in my opinion, is much easier than finding a tool that converts Word docs directly into PDF.
Good thing about wkhtmltopdf is also that you can feed the output of your PHP script to it using the ob_xxx() functions.
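The same idea of feeding generated HTML straight into wkhtmltopdf can be sketched without temporary files, since the tool accepts "-" for standard input and standard output. The example is in Python rather than PHP, the function names are hypothetical, and it assumes wkhtmltopdf is installed and on the PATH.

```python
import subprocess

def wkhtmltopdf_command(page_size="A4", orientation="Portrait"):
    """Build a wkhtmltopdf call that reads HTML from stdin and writes the PDF to stdout."""
    return ["wkhtmltopdf", "--page-size", page_size,
            "--orientation", orientation, "-", "-"]

def html_to_pdf(html, **options):
    """Render an HTML string to PDF bytes by piping it through wkhtmltopdf."""
    result = subprocess.run(
        wkhtmltopdf_command(**options),
        input=html.encode("utf-8"),      # HTML goes in on stdin
        stdout=subprocess.PIPE,          # PDF bytes come back on stdout
        check=True,
    )
    return result.stdout
```

A call such as `html_to_pdf("<h1>Hello</h1>", orientation="Landscape")` would return the PDF as bytes, ready to be written to disk or sent to the browser.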
PHPExcel is the best simple way to create doc, docx, xls, xlsx, and pdf files with PHP. It's a lot easier thanks to its clear documentation.
Use Microsoft Office to render Microsoft Office documents, if you care about accuracy at all. This is easily done by invoking Office over COM.
Get access to your server, and install what you need. Doing so would be far easier than monkeying around with sub-par solutions.
Well... I can think of one way of doing it quite easily, but it doesn't involve using PHP.
Upload your documents to a folder on your server that is browsable by your users.
EG: http://mysite.com/docs/
Then get your users to install a virtual printer driver such as Primo PDF
http://www.primopdf.com/index.aspx
then they can load the document into their browser, and print to PDF for offline browsing.
If this is not an option, and you're dealing with Office documents that conform to the Open XML standard, you could attempt to parse the XML doc into a PHP page for display in the browser, then use JavaScript to trigger a print.
Unfortunately, it does still depend on your user having a PDF printer installed.
Alternatively, you could just load the docs natively, print to your own PDF printer, then upload the PDFs to the web server for download.
I can't think of any easy way of doing this otherwise, without installing all sorts of different document parser tool-kits and doing a huge amount of behind the scenes work.
Does anybody know a ready-made, reliable way to tell the dimensions (width x height) of a MP4 encoded using the H.264 codec without ffmpeg or similar extensions, in pure PHP?
Thanks for all the answers folks. The bounty is running out and I will not have time to check the offered solutions before it does. I will accept the solution that I feel has the greatest likelihood to work.
getID3 is pure php and extracts an amazing amount of information from media files of all sorts. It will depend on what encoded your file in the first place as to what metadata is available and how reliable it is. getID3 has a nice demo page with lots of different file types. I tried to post more links but as a newbie I only get one.
It sounds like http://code.google.com/p/php-mp4info/ might be your answer. It reads MP4s, but it doesn't mention anything about H.264.
also, what OS are you using?
What comes to mind:
mediainfo: a huge project with a GUI, but it also has a CLI
mp4info (part of the seemingly defunct MPEG4IP project): almost perfect for this
ffmpeg: overkill for this task, although you may very well need it for other tasks
ffmpeg and php: http://www.lampdeveloper.co.uk/linux/detecting-a-videos-dimensions-using-php-and-ffmpeg.html
php-reader is a full implementation of the ISO 14496 done in pure PHP. You can use this library to read all of the boxes which the mp4 consist of, like the moov atom containing metadata about the file.
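To show what reading those boxes involves, here is a minimal sketch that walks moov, then trak, then tkhd and reads the width and height stored at the end of the track header. It is in Python rather than PHP, the function names are invented, and it makes simplifying assumptions: the whole file is held in memory, and the track header's presentation size is taken as the video size (it usually matches the encoded H.264 frame size, but that is not guaranteed).

```python
import struct

def iter_boxes(buf, start=0, end=None):
    """Yield (type, payload_start, payload_end) for each box in buf[start:end]."""
    end = len(buf) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", buf, pos)
        header = 8
        if size == 1:                      # 64-bit size follows the type field
            if pos + 16 > end:
                break
            size = struct.unpack_from(">Q", buf, pos + 8)[0]
            header = 16
        elif size == 0:                    # box runs to the end of its container
            size = end - pos
        if size < header or pos + size > end:
            break                          # malformed or truncated box
        yield box_type, pos + header, pos + size
        pos += size

def video_dimensions(buf):
    """Return (width, height) from the first track header with a non-zero size, else None."""
    for kind, s, e in iter_boxes(buf):
        if kind != b"moov":
            continue
        for kind2, s2, e2 in iter_boxes(buf, s, e):
            if kind2 != b"trak":
                continue
            for kind3, s3, e3 in iter_boxes(buf, s2, e2):
                if kind3 == b"tkhd" and e3 - s3 >= 84:
                    # Width and height are the last two fields, as 16.16 fixed point.
                    width, height = struct.unpack_from(">II", buf, e3 - 8)
                    if width and height:   # audio tracks report 0 x 0
                        return width >> 16, height >> 16
    return None
```

Usage would be along the lines of `video_dimensions(open("movie.mp4", "rb").read())`, where the file name is only a placeholder. The same box walk ports directly to PHP with unpack().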
Native PHP does not support anything like this; ffmpeg is the only library that comes to mind.
I need a way to extract the audio from some video (in PHP). I have the video streaming in from YouTube, so I would really like it to be done on the fly while streaming, rather than having to save it to a temp directory and process it there (though that is acceptable). Thanks, Isaac Waller
Edit: to be more specific, I have a MP4 and I want it to be a MP3.
You're going to want to use something like ffmpeg and call it using php's exec command. If you look around in the docs, I'm sure you can figure out what flag to use to only get the audio.
I've used this app before on a project for live transcoding of video, works like a charm. Just make sure your server has it correctly installed.
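For reference, the ffmpeg option that drops the video stream is -vn. A minimal sketch of the call is below, in Python rather than PHP's exec; the function names and the default quality value are assumptions for this example, and it presumes an ffmpeg build that includes the libmp3lame encoder.

```python
import subprocess

def ffmpeg_audio_command(src, dst, quality=2):
    """Build an ffmpeg call that drops the video (-vn) and encodes the audio as MP3."""
    return ["ffmpeg", "-i", src, "-vn",
            "-codec:a", "libmp3lame", "-q:a", str(quality), dst]

def extract_audio(src, dst, quality=2):
    """Run ffmpeg; raises CalledProcessError if the conversion fails."""
    subprocess.run(ffmpeg_audio_command(src, dst, quality), check=True)
```

For example, `extract_audio("input.mp4", "output.mp3")` would write the MP3 next to the source file. Lower `-q:a` values mean higher quality and larger files.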
Mplayer should do this for you, and there are libraries and codecs that you can call (PHP supports C libraries) which will strip the video from the AV stream on the fly.
Given that you're targeting YouTube, your job is a bit easier because they use a very small subset of file encodings.
If you take the time to learn the format, you can very easily remove the video stream on the fly and return only the audio stream.
If you give a little more information, such as what you're encoding it to, or where it's going to end up we may be able to help more specifically.
-Adam