Programmatically determine video file format? - php

Ok, I get the basics of video format - there are some container formats and then you have core video/audio formats. I would like to write a web based application that determines what video/audio codec a file is using.
How best can I programmatically determine a video codec? Would it be best to use a standard library via system calls and parse its output? (eg ffmpeg, transcode, etc?)

mplayer -identify will do the trick. Just calling ffmpeg on a file will also work--it will automatically print a set of info at the start about the input file regardless of what you're telling ffmpeg to actually do.
Of course, if you want to do it from your program without an exec call to an external program, you can just include the avcodec libraries and run its own identify routine directly.
While you could implement your own detection, it will surely be inferior to existing routines given the absolutely enormous number of formats that libav* supports. And it would be a rather silly case of reinventing the wheel.
Linux's "file" command can also do the trick, but the amount of data it prints out depends on the video format. For example, on AVI it gives all sorts of data about resolution, FOURCC, fps, etc, while for an MKV file it just says "Matroska data," telling you nothing about the internals, or even the video and audio formats used.

I have used FFMPEG in a perl script to achieve this.
$info = `ffmpeg -i "$path$file" 2>&1`; # ffmpeg prints the file info on stderr
@fields = split(/\n/, $info);
Then just pick out whichever items in @fields you need to extract.

You need to start further down the line. You need to know the container format and how it specifies the codec.
So I'd start with a program that identifies the container format (not just from the extension, go into the header and determine the real container).
Then figure out which containers your program will support, and put in the functions required to parse the meta data stored in the container, which will include the codecs.
-Adam
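As a rough illustration of the "look at the header, not the extension" step, here is a minimal sketch that guesses the container from the first few bytes of a file. The function names sniffContainer and sniffContainerFromFile are made up for this example, and the list of signatures is deliberately tiny; a real tool such as file(1) or libavformat knows far more formats.

```php
<?php
// Hypothetical helper: guess the container format from the leading bytes.
function sniffContainer(string $head): string
{
    // Matroska/WebM files start with the EBML magic number.
    if (strncmp($head, "\x1A\x45\xDF\xA3", 4) === 0) {
        return 'matroska/webm';
    }
    // AVI is a RIFF file whose form type (bytes 8-11) is "AVI ".
    if (substr($head, 0, 4) === 'RIFF' && substr($head, 8, 4) === 'AVI ') {
        return 'avi';
    }
    // MP4/MOV files normally carry an "ftyp" box right after a 4-byte size.
    if (substr($head, 4, 4) === 'ftyp') {
        return 'mp4/mov';
    }
    if (strncmp($head, 'OggS', 4) === 0) {
        return 'ogg';
    }
    if (strncmp($head, 'FLV', 3) === 0) {
        return 'flv';
    }
    return 'unknown';
}

// Hypothetical wrapper: read only the first 12 bytes of the file.
function sniffContainerFromFile(string $path): string
{
    $head = (string) file_get_contents($path, false, null, 0, 12);
    return sniffContainer($head);
}
```

This only identifies the container. Finding the codecs inside still means parsing that container's metadata, which is where an existing library saves a lot of work.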

You really want a big database of binary identifying markers to look for near the start of the file. Luckily, your question is tagged "Linux", and such a database already exists there; file(1) will do the job for you.

I would recommend using ffprobe and force output format to json. It would be so much easier to parse. Simplest example:
$meta = json_decode(shell_exec('ffprobe -v quiet -print_format json -show_format -show_streams /path/to/file 2>&1'));
Be warned that for a corrupted file you will get null as the result, plus a warning depending on your error reporting settings. Complete example with proper error handling:
$file = '/path/to/file';
$cmd = 'ffprobe -v quiet -print_format json -show_format -show_streams ' . escapeshellarg($file) . ' 2>&1';
exec($cmd, $output, $code);
if ($code != 0) {
    // $output holds the lines ffprobe printed, so include them in the message
    throw new RuntimeException("ffprobe returned non-zero code $code: " . join("\n", $output), $code);
}
$joinedOutput = join(' ', $output);
$parsedOutput = json_decode($joinedOutput);
if (null === $parsedOutput) {
    throw new RuntimeException("Unable to parse ffprobe output: " . json_last_error_msg());
}
// here we can use $parsedOutput as a simple stdClass
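Once the JSON is decoded, the codec names sit in the streams array under codec_type and codec_name. A minimal sketch of pulling them out follows; the helper name codecsByType and the sample JSON are invented for illustration, and the sample is trimmed to the fields used here.

```php
<?php
// Hypothetical helper: group codec names by stream type from ffprobe's JSON.
function codecsByType(string $ffprobeJson): array
{
    $data = json_decode($ffprobeJson, true);
    if (!is_array($data) || !isset($data['streams']) || !is_array($data['streams'])) {
        return []; // not JSON, or no streams section
    }
    $codecs = [];
    foreach ($data['streams'] as $stream) {
        if (isset($stream['codec_type'], $stream['codec_name'])) {
            $codecs[$stream['codec_type']][] = $stream['codec_name'];
        }
    }
    return $codecs;
}

// Made-up sample in the shape ffprobe prints with -show_streams.
$sample = '{"streams":['
    . '{"index":0,"codec_type":"video","codec_name":"h264"},'
    . '{"index":1,"codec_type":"audio","codec_name":"aac"}]}';

$codecs = codecsByType($sample);
```

A file can hold several streams of the same type, which is why each type maps to a list rather than a single name.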

You can use mediainfo:
sudo apt-get install mediainfo
If you just want to get video/audio codec, you can do the following:
$videoCodec = trim((string) shell_exec('mediainfo --Inform="Video;%Format%" ' . escapeshellarg($filename)));
$audioCodec = trim((string) shell_exec('mediainfo --Inform="Audio;%Format%" ' . escapeshellarg($filename)));
In case you want to capture more info, you can parse XML output returned by mediainfo. Here is sample function:
function getCodecInfo($inputFile)
{
$cmdLine = 'mediainfo --Output=XML ' . escapeshellarg($inputFile);
exec($cmdLine, $output, $retcode);
if($retcode != 0)
return null;
try
{
$xml = new SimpleXMLElement(join("\n",$output));
$videoCodec = $xml->xpath('//track[@type="Video"]/Format');
$audioCodec = $xml->xpath('//track[@type="Audio"]/Format');
}
catch(Exception $e)
{
return null;
}
if(empty($videoCodec[0]) || empty($audioCodec[0]))
return null;
return array(
'videoCodec' => (string)$videoCodec[0],
'audioCodec' => (string)$audioCodec[0],
);
}

Related

Finding correct path of PHP binary - exec()

I'm trying to execute a separate PHP script from within a PHP page. After some research, I found that it is possible using the exec() function.
I also referenced this SO solution to find the path of the php binary. So my full command looks like this:
$file_path = '192.168.1.13:8080/doSomething.php';
$cmd = PHP_BINDIR.'/php '.$file_path; // PHP_BINDIR prints /usr/local/bin
exec($cmd, $op, $er);
echo $er; // prints 127 which turns out to be invalid path/typo
doSomething.php
echo "Hi there!";
I know $file_path is a correct path because if I open its value; i.e. 192.168.1.13:8080/doSomething.php, I do get "Hi there!" printed out. This makes me assume that PHP_BINDIR.'/php' is wrong.
Should I be trying to get the path of the php binary in some other way?
The file you are requesting is accessible via a web server, not as a local PHP script. Thus you can get the result of the script simply by
$output = file_get_contents($file_path);
If you however for some reason really have to exec the file, then you must provide a full path to that file in your server directory structure instead of server URL:
$file_path = '/full/path/to/doSomething.php';
$cmd = PHP_BINDIR.'/php '.$file_path;
exec($cmd, $op, $er);
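As a hedged alternative for locating the interpreter, PHP also exposes the PHP_BINARY constant, which holds the path of the binary running the current script. The sketch below builds the command with it and quotes both parts; buildPhpCliCommand is a made-up name, and note that under a web SAPI such as php-fpm PHP_BINARY may point at that SAPI's binary rather than the CLI one, so treat this as a starting point only.

```php
<?php
// Hypothetical helper: build a shell command that runs a local script with PHP.
function buildPhpCliCommand(string $scriptPath): string
{
    // Fall back to PHP_BINDIR/php when PHP_BINARY is empty.
    $binary = PHP_BINARY !== '' ? PHP_BINARY : PHP_BINDIR . '/php';
    // Quote both parts so spaces or shell metacharacters cannot break the command.
    return escapeshellarg($binary) . ' ' . escapeshellarg($scriptPath);
}

$cmd = buildPhpCliCommand('/full/path/to/doSomething.php');
// exec($cmd, $op, $er); // $er would receive the exit code, 0 on success
```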

mysqldump common install locations for mac/linux

I am trying to know all the common locations for mysqldump. The list I have come up with is as follows:
'/usr/bin/mysqldump', //Linux
'/usr/local/mysql/bin/mysqldump', //Mac OS X
'/usr/local/bin/mysqldump', //Linux
'/usr/mysql/bin/mysqldump'; //Linux
Often mysqldump isn't in the path, so I am trying to have all the locations to look in. (I am running this from a php script)
Are there any that I am missing?
I was unable to find any other paths apart from the ones in your question. One thing that does come to mind, though, is that mysqldump should in most cases be in the same directory as the mysql binary, and the mysql command will usually be in the path.
You can therefore combine the two approaches to locate the mysqldump binary in most cases, like this:
function detect_mysqldump_location() {
// 1st: use mysqldump location from `which` command.
$mysqldump = trim((string) `which mysqldump`); // strip the trailing newline that `which` prints
if (is_executable($mysqldump)) return $mysqldump;
// 2nd: try to detect the path using `which` for `mysql` command.
$mysqldump = dirname(trim((string) `which mysql`)) . "/mysqldump";
if (is_executable($mysqldump)) return $mysqldump;
// 3rd: detect the path from the available paths.
// you can add additional paths you come across, in future, here.
$available = array(
'/usr/bin/mysqldump', // Linux
'/usr/local/mysql/bin/mysqldump', //Mac OS X
'/usr/local/bin/mysqldump', //Linux
'/usr/mysql/bin/mysqldump' //Linux
);
foreach($available as $apath) {
if (is_executable($apath)) return $apath;
}
// 4th: auto detection has failed!
// lets, throw an exception, and ask the user to provide the path instead, manually.
$message = "Path to \"mysqldump\" binary could not be detected!\n";
$message .= "Please, specify it inside the configuration file provided!";
throw new RuntimeException($message);
}
Now, you can use the above function for your purposes. And, provide a way for the user to provide the explicit path to mysqldump binary manually, if the above function throws an error. Should work for your use cases :)

Equivalent of /dev/null for writing garbage test data?

I need to perform a series of tests to pick the fastest branch of code for a set of functions I designed. As these functions output some text/HTML content, I would like to measure the speed without filling the browser with garbage data.
Is there an equivalent to /dev/null in PHP? The closest equivalent to write temporary data I've found are php://temp and php://memory but those two I/O streams store the garbage data and I want for every piece of data to be written in a 'fake' fashion.
I could always write all garbage data in a variable ala $tmp .= <function return value goes here> but I'm sure there must be a more elegant or a better way to accomplish this WITHOUT resorting to functions like shell_exec(), exec(), proc_open() and similar approaches (the production server I'm going to test the final code won't have any of those commands).
Is there an equivalent?
// For what its worth, this works on CentOS 6.5 php 5.3.3.
$fname = "/dev/null";
if(file_exists($fname)) print "*** /dev/null exists ***\n";
if (is_readable($fname)) print "*** /dev/null readable ***\n";
if (is_writable($fname)) print "*** /dev/null writable ***\n";
if (($fileDesc = fopen($fname, "r"))==TRUE){
print "*** I opened /dev/null for reading ***\n";
$x = fgetc($fileDesc);
fclose($fileDesc);
}
if (($fileDesc = fopen($fname, "w"))==TRUE)
{
print "*** I opened /dev/null for writing ***\n";
$x = fwrite($fileDesc,'X');
fclose($fileDesc);
}
if (($fileDesc = fopen($fname, "w+"))==TRUE) {
print "*** I opened /dev/null for read/write ***\n";
$x = fwrite($fileDesc,'X');
fclose($fileDesc);
}
I think your best bet would be a stream wrapper class that profiles your output on write with microtime() and that you register with stream_wrapper_register(). The example in the manual is pretty good.
If your code is not that complicated or you feel this would be overkill, you can just use the ob_start() callback handler.
Hope this helps.
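A rough sketch of the ob_start() route: the callback returns an empty string, so whatever the function under test prints is thrown away instead of being stored or sent to the browser. timeSilently is a made-up helper name, and the 4096-byte chunk size is an arbitrary choice.

```php
<?php
// Hypothetical helper: run $fn, discard everything it prints, return seconds elapsed.
function timeSilently(callable $fn): float
{
    // Returning '' drops each flushed chunk; the chunk size keeps the
    // buffer from growing with the amount of output.
    ob_start(function ($buffer) {
        return '';
    }, 4096);
    $start = microtime(true);
    try {
        $fn();
    } finally {
        $elapsed = microtime(true) - $start;
        ob_end_clean(); // discard whatever is still buffered
    }
    return $elapsed;
}

// Example: time a function that echoes a lot of markup.
$seconds = timeSilently(function () {
    for ($i = 0; $i < 1000; $i++) {
        echo "<p>row $i</p>";
    }
});
```

No shell functions are involved, so this should also work where exec() and friends are disabled.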

How to deal with application/octet-stream (uncompressed gzipped file) in PHP?

I've to parse a lot (10000+) of remote gzipped files. Each zipped file should contain a CSV inside it (maybe in a folder). Right now I'm able to get the body, check for content type and uncompress it, obtaining application/octet-stream.
Question is: what's the octet-stream and how can I check for files or folders inside it?
/** @var $guzzle \Guzzle\Http\Client */
$guzzle = $this->getContainer()->get('guzzle');
$request = $guzzle->get($url);
try {
$body = $request->send()->getBody();
// Check for body content-type
if('application/z-gzip' === $body->getContentType()) {
$body->uncompress();
$body->getContentType(); // application/octet-stream
}
else {
// Log and skip current remote file
}
}
catch(\Exception $e) {
$output->writeln("Failed: {$guzzle->getBaseUrl()}");
throw $e;
}
The EntityBody object that stores the body can only guess the content type of local files. Use the Content-Type header of the response to get a more accurate value.
Something like this:
$response = $request->send();
$type = $response->getContentType();
A shell command along these lines will work for you:
shell_exec('gzip -d your_file.gz');
You can first unzip all your files into a particular directory and then read each file or perform whatever computation you need.
As a side note:
Take care where the command is run from (or use a switch to tell it "decompress to that directory").
You might want to take a look at escapeshellarg too ;-)
You should be able to use the built in gzuncompress function.
See http://php.net/manual/en/function.gzuncompress.php
Edit: Or other zlib functions depending on what data you are working with. http://php.net/manual/en/ref.zlib.php
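For data in the gzip file format specifically, gzdecode() is the zlib function that understands the gzip header, while gzuncompress() expects zlib-wrapped data. Here is a minimal in-memory sketch, assuming the zlib extension is loaded; the CSV content is made up and gzencode() stands in for the downloaded body. A plain .gz holds a single compressed stream, so there are no folders inside it unless that stream is itself an archive such as tar.

```php
<?php
// Stand-in for a downloaded body: gzip-compress a small CSV in memory.
$csv = "id,name\n1,alpha\n2,beta\n";
$gzBody = gzencode($csv);

// gzip data starts with the magic bytes 0x1f 0x8b.
$looksGzipped = strncmp($gzBody, "\x1f\x8b", 2) === 0;

// Decode only when the magic bytes match; otherwise use the body as-is.
$plain = $looksGzipped ? gzdecode($gzBody) : $gzBody;

// Split into lines and parse each one as a CSV row.
$rows = array_map(function ($line) {
    return str_getcsv($line, ',', '"', '\\');
}, explode("\n", trim($plain)));
```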

Can't execute external process with PHP

I have the following code
function generate_pdf() {
$fdf_data_strings = $this->get_hash_for_pdf();
#$fdf_data_names = array('49a' => "yes");
$fdf_data_names = array();
$fields_hidden = array();
$fields_readonly = array();
$hud_pdf = ABSPATH.'../pdf/HUD3.pdf';
$fdf= forge_fdf( '',
$fdf_data_strings,
$fdf_data_names,
$fields_hidden,
$fields_readonly );
/* echo "<pre>";
print_r($fdf);
echo "</pre>";
die('');
*/
$fdf_fn= tempnam( '.', 'fdf' );
$fp= fopen( $fdf_fn, 'w' );
if( $fp ) {
fwrite( $fp, $fdf );
//$data=fread( $fp, $fdf );
// echo $data;
fclose( $fp );
header( 'Content-type: application/pdf' );
header( 'Content-disposition: attachment; filename=settlement.pdf' ); // prompt to save to disk
passthru( 'pdftk HUD3.pdf fill_form '. $fdf_fn.' output - flatten');
unlink( $fdf_fn ); // delete temp file
}
else { // error
echo 'Error: unable to open temp file for writing fdf data: '. $fdf_fn;
}
}
}
Is there anything wrong with it?
The problem is this: I have installed pdftk.
Running whereis pdftk gives me '/usr/local/bin/pdftk'.
I physically checked the location, and pdftk is there.
From the terminal, pdftk --version or any other command runs fine.
If I use PHP, like passthru('/usr/local/bin/pdftk --version'), nothing is displayed.
If I use PHP like system("PATH=/usr/local/bin && pdftk --version"); it says '/usr/local/bin /pdftk :there is no directory of file '.
When I run this function, the file download prompt pops up, but when I save it, nothing is saved.
I have checked the permissions for this folder and tried 0755, 0766, 0777 and 0666; nothing works.
For 3 days I have been struggling with this, and I have asked a question about it before, but I can't figure out what the hell is going on.
Can somebody help me before I bang my head against the wall?
The passthru function runs the command in the web server's environment, whose PATH may not include /usr/local/bin.
Pass the exact path into the passthru command.
E.g.
passthru( '/usr/local/bin/pdftk HUD3.pdf fill_form '. $fdf_fn.' output - flatten');
or
passthru( '/usr/local/bin/pdftk ' . $hud_pdf . ' fill_form '. $fdf_fn.' output - flatten');
If this still doesn't work test using
<?php passthru("/path/to/pdftk --help"); ?> where /path/to/pdftk is the path returned by which or whereis, to make sure the path is correct.
If the path is correct, then the issue may be related to permissions, either on the temporary directory you tell pdftk to use or on the pdftk binary with regard to the apache user.
If these permissions are fine and you can verify that pdftk starts up from PHP but hangs when running your command, you might try the workaround listed here.
Further documentation on passthru is available in the PHP manual.
As a side note, the putenv php function is used to set environment variables.
E.g. putenv('PATH='.getenv('PATH').':.');
The PHP functions exec(), shell_exec(), system() and passthru() all execute an external command, but the differences are:
exec(): returns the last line of output from the command and flushes nothing.
shell_exec(): returns the entire output from the command and flushes nothing.
system(): returns the last line of output from the command and tries to flush the output buffer after each line of the output as it goes.
passthru(): returns nothing and passes the resulting output without interference to the browser, especially useful when the output is in binary format.
Also see PHP exec vs-system vs passthru SO Question.
The implementation of these functions is located at exec.c and uses popen.
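The differences listed above can be seen with a short experiment. This is only a sketch and assumes a POSIX shell in which `echo a; echo b` prints two lines; passthru() is wrapped in an output buffer here purely so its output can be inspected.

```php
<?php
// Assumes a POSIX shell; the command prints "a" and "b" on separate lines.
$cmd = 'echo a; echo b';

// exec(): returns the last line, fills $lines with every line, sets the exit code.
$last = exec($cmd, $lines, $exitCode);

// shell_exec(): returns the entire output as one string.
$all = shell_exec($cmd);

// passthru(): writes the raw output straight to PHP's output, captured here.
ob_start();
passthru($cmd, $passthruCode);
$raw = ob_get_clean();
```

Checking $exitCode (or $passthruCode) is a cheap way to tell "command not found" apart from "command ran but printed nothing".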
I had the same issue and this is working after lots of experiments :
function InvokePDFtk($pdf_tpl, $xfdf, $output){
    $params = " $pdf_tpl fill_form $xfdf output $output flatten 2>&1";
    // use the absolute path when pdftk is installed there, else rely on PATH lookup
    $pdftk_path = exec('/usr/bin/which /usr/local/bin/pdftk');
    $have_pdftk = $pdftk_path == '/usr/local/bin/pdftk';
    $pdftk_path = $have_pdftk ? $pdftk_path : 'pdftk';
    // the second argument of exec() receives the output lines, not the exit code
    exec($pdftk_path . $params, $output_lines);
    return array('status' => $have_pdftk,
        'command' => $pdftk_path . $params, 'output' => $output_lines);
}
hope this might give you some insight . (change according to your needs)
To add to Appleman's answer: these functions can be considered dangerous, because they let PHP execute programs, so an attacker who exploits one of your scripts can run commands on the server if you are not careful enough. For that reason they are disabled in many security-conscious PHP configurations.
So you should check the disable_functions directive in your php.ini (and any other PHP configuration file) and see whether the function you use is disabled.
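The same check can be done from within PHP. The sketch below reads the directive with ini_get(); the helper names disabledFunctions and isCallAllowed are invented for this example, and the optional $iniValue parameter exists only so the parsing can be tried without touching php.ini.

```php
<?php
// Hypothetical helper: list the function names in disable_functions.
function disabledFunctions(?string $iniValue = null): array
{
    $raw = $iniValue ?? (string) ini_get('disable_functions');
    $names = array_map('trim', explode(',', $raw));
    // Drop empty entries left by an empty directive or stray commas.
    return array_values(array_filter($names, function ($name) {
        return $name !== '';
    }));
}

// Hypothetical helper: true when $name exists and is not disabled.
function isCallAllowed(string $name, ?string $iniValue = null): bool
{
    return function_exists($name)
        && !in_array($name, disabledFunctions($iniValue), true);
}

// Report which of the process functions this installation allows.
$status = [];
foreach (['exec', 'shell_exec', 'system', 'passthru'] as $fn) {
    $status[$fn] = isCallAllowed($fn);
}
```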
Perhaps you should keep fclose out of the if statement, make sure you have it directed to the right file! :)
Is your web server chrooted? Try putting the executable into a directory that is viewable by the server.
Play around with safe mode, and definitely check your web server log file, normally in:
/var/log/apache2/error.log
