using ffmpeg/ffprobe to create a waveform json using php - php

I have many ogg & opus files on my server and need to generate json-waveform numeric arrays on an as-needed basis (example below).
Recently I discovered the Node-based waveform-util, which uses ffmpeg/ffprobe to render a JSON waveform, and it works perfectly. I am undecided whether having a Node process constantly running is the best solution to my issue.
Since ffmpeg seems able to handle anything I can throw at it, I wish to stick with an ffmpeg solution.
I have three questions:
1) Is there a PHP equivalent? I have found a couple that generate PNG images, but none that generates JSON-waveform numeric arrays.
2) Are there any significant advantages to going with the Node-based solution rather than a PHP-based solution (assuming there is a PHP-based solution)?
3) Is there a way to generate a JSON waveform using CLI ffmpeg/ffprobe? I saw all the -show_ options (-show_data, -show_streams, -show_frames), but nothing looked like it produced what I am looking for.
the json-waveform needs to be in this format:
[ 0.0002, 0.001, 0.15, 0.14, 0.356 .... ]
thank you all.

It sounds as if there is a conflict with the way my server handles CGI. I am using Virtualmin with the following setting:
PHP script execution mode: CGI wrapper (run as virtual server owner)
After much research, it appears that using pure Node.js is more lightweight than using a shell executable. I had some success merely by adding a shebang line that calls node, but having a Node.js script always resident in memory is probably the way to go.

For anyone in the future looking to do this with RN:
// convert the file to pcm
await RNFFmpeg.execute(`-y -i ${filepath} -acodec pcm_s16le -f s16le -ac 1 -ar 1000 ${pcmPath}`)
// you're reading that right, we're reading the file using base64 only to decode the base64, because RN doesnt let us read raw data
const pcmFile = Buffer.from(await RNFS.readFile(pcmPath, 'base64'), 'base64')
let pcmData = []
// byte conversion pulled off stack overflow
for (var i = 0; i < pcmFile.length; i = i + 2) {
  var byteA = pcmFile[i];
  var byteB = pcmFile[i + 1];
  var sign = byteB & (1 << 7);
  var val = (((byteA & 0xFF) | (byteB & 0xFF) << 8)); // convert to 16 bit signed int
  if (sign) { // if negative
    val = 0xFFFF0000 | val; // fill in most significant bits with 1's
  }
  pcmData.push(val)
}
// pcmData is the resulting waveform array
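The same idea can be sketched outside React Native as a rough answer to question 3: let ffmpeg decode the file to raw PCM and reduce the samples to the requested array. This is only a sketch under assumptions. The ffmpeg options are the ones from the snippet above, with the output sent to stdout via pipe:1. The function names, the default of 100 points, and the choice of peak amplitude scaled to 0..1 are illustrative and not from the thread.

```python
import json
import struct
import subprocess


def decode_pcm_with_ffmpeg(path, sample_rate=1000):
    """Run ffmpeg and return raw signed 16-bit little-endian mono PCM bytes."""
    cmd = ["ffmpeg", "-v", "error", "-i", path,
           "-f", "s16le", "-acodec", "pcm_s16le",
           "-ac", "1", "-ar", str(sample_rate), "pipe:1"]
    return subprocess.run(cmd, check=True, stdout=subprocess.PIPE).stdout


def pcm_to_waveform(pcm_bytes, points=100):
    """Reduce s16le PCM to at most `points` peak amplitudes scaled to 0.0..1.0."""
    count = len(pcm_bytes) // 2  # two bytes per sample; a trailing odd byte is ignored
    samples = struct.unpack("<%dh" % count, pcm_bytes[:count * 2])
    if not samples:
        return []
    points = min(points, count)
    peaks = []
    for i in range(points):
        # split the samples into `points` contiguous, non-empty chunks
        start = i * count // points
        end = (i + 1) * count // points
        peak = max(abs(s) for s in samples[start:end])
        peaks.append(round(peak / 32768.0, 4))
    return peaks


def waveform_json(path, points=100):
    """Return the waveform of an audio file as a JSON array string."""
    return json.dumps(pcm_to_waveform(decode_pcm_with_ffmpeg(path), points))
```

From PHP, a script like this (or the equivalent unpack() loop written in PHP) could be run on demand, so no process has to stay resident.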

Related

How to call a php file with arguments from VBA for Mac? Equivalent of VBA.createObject("wscript.shell") on Mac?

Still in need of help :)
I'm trying to adapt the following chunk of code to VBA for Mac (as the final implementation has to be on Mac):
Dim ws as Object
Dim result as String
Set ws = VBA.CreateObject("wscript.shell")
cd = "php " & dirPHP & "\verificar.php " & FileName
result = ws.Run(cd)
Set ws = Nothing
It runs perfectly on Windows, but when trying to adapt it for Mac I'm encountering many problems.
What I am basically doing on the previous chunk is calling a PHP file that takes the first argument (in this case, FileName) and calls the verify function, returning some values.
The thing is that there are some posts explaining how to do this, but I have seen no examples on how to do it for PHP, and especially for PHP passing an input argument.
This is what I've tried so far:
result = AppleScriptTask("PHPCommand.applescript", "PHPCommandHandler", FileName)
e = "php " & dirPHP & "/verificar.php " & FileName
cd = "do shell script """ & e & """"
result = MacScript(cd)
(On the Mac Terminal I am able to run the PHP file fine, with the resulting "e" string).
I also tried some other fruitless things, like the shell() function and some user-defined functions (I saw someone define a "system()" function). I also tried many ways of placing the double and single quotes, and simplified the path to the PHP file (dirPHP) and the path + filename of the argument (FileName) by removing all blank spaces, and thus the need for additional quotes.
Please help me! I'd be really grateful, as yesterday I spent the whole day on this and I can't keep wasting time on something that is so simple on Windows... But I have to do it on Mac.
Thanks so much!!!
Use the VBA Shell function.
Shell
Runs an executable program and returns a Variant (Double) representing
the program's task ID if successful; otherwise, it returns zero.
Syntax
Shell(pathname, [ windowstyle ])

Binary Data from PHP to Angular 6 via http

The goal is to make an (empty) HTTP request from Angular 7 to PHP and receive binary data in Angular for use with protobuf3.
More specifically, the binary data (encoded as described here: https://developers.google.com/protocol-buffers/docs/encoding) in PHP (the source) is encapsulated in a string, while the goal in Angular is a Uint8Array.
Therefore, I currently have the following working code:
PHP Code (a simple ProcessWire root template):
header('Content-Type: application/b64-protobuf');
…
echo base64_encode($response->serializeToString());
Angular:
let res = this.httpClient.get(`${this.API_URL}`, { responseType: 'text' });
res.subscribe((data) => {
  let binary_string = atob(data);
  let len = binary_string.length;
  let bytes = new Uint8Array(len);
  for (let i = 0; i < len; i++) {
    bytes[i] = binary_string.charCodeAt(i);
  }
  let parsedResponse = pb.Response.deserializeBinary(bytes)
})
As you can see I encode the data as base64 before sending it. So, it is not as efficient as it could be, because base64 reduces the amount of information per character. I tried already quite a lot to get binary transmission working, but in the end the data always gets corrupted, i.e. the variable bytes is not identical to the argument of base64_encode.
But still, according to some sources (e.g. PHP write binary response, Binary data corrupted from php to AS3 via http (nobody says it would not be possible)) it should be possible.
So my question is: What must change to directly transfer binary data? Is it even possible?
What have I tried?
Using different headers, such as header('Content-Type:binary/octet-stream;'), or using Blob in Angular.
Removing base64_encode from the PHP code and atob from the Angular code. The result: the content of the data is modified between serializeToString and deserializeBinary(bytes), which is not desired.
Checking for possible characters before <?php
Specifications:
PHP 7.2.11
Apache 2.4.35
Angular 7.0.2
If further information is needed, just let me know in the comments. I am eager to provide it. Thanks.
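The trade-off described in the question can be illustrated with a small, language-neutral sketch (written in Python here; the payload is made-up demo data, not a real protobuf message). It shows that base64 round-trips bytes exactly at a cost of roughly one third more data, and that bytes decoded as UTF-8 text at any point cannot in general be recovered. Whether such a text-decoding step is what corrupts the data in the setup above is an assumption and is not confirmed by the thread.

```python
import base64

# 768 arbitrary bytes standing in for a serialized protobuf message (made up for the demo)
payload = bytes(range(256)) * 3

# Base64 is lossless but emits 4 ASCII characters for every 3 input bytes
encoded = base64.b64encode(payload)
base64_ok = base64.b64decode(encoded) == payload
overhead = len(encoded) / len(payload)

# Decoding raw bytes as UTF-8 text replaces invalid sequences, so the
# original bytes cannot be recovered afterwards
as_text = payload.decode("utf-8", errors="replace")
text_ok = as_text.encode("utf-8") == payload

print(base64_ok, round(overhead, 3), text_ok)
```

If the response is requested as text on the client, a similar lossy decoding step may be involved, so a binary response type would be the first thing to check.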

Parsing large XML data

I am trying to parse xml files to store data into database. I have written a code with PHP (as below) and I could successfully run the code.
But the problem is, it requires around 8 mins to read a complete file (which is around 30 MB), and I have to parse around 100 files in each hour.
So, obviously my current code is of no use to me. Can anybody advise for a better solution? Or should I switch to other coding language?
What I get from net is, I can do it with Perl/Python or something called XSLT (which I am not so sure about, frankly).
$xml = new XMLReader();
$xml->open($file);
while ($xml->name === 'node1'){
    $node = new SimpleXMLElement($xml->readOuterXML());
    foreach($node->node2 as $node2){
        //READ
    }
    $xml->next('node1');
}
$xml->close();
Here's an example of my script I used to parse the WURFL XML database found here.
I used the ElementTree module for Python and wrote out a JavaScript Array - although you can easily modify my script to write a CSV of the same (Just change the final 3 lines).
import xml.etree.ElementTree as ET
tree = ET.parse('C:/Users/Me/Documents/wurfl.xml')
root = tree.getroot()
dicto = {} #to store the data
for device in root.iter("device"): #parse out the device objects
    dicto[device.get("id")] = [0, 0, 0, 0] #set up a list to store the needed variables
    for child in device: #iterate through each device
        if child.get("id") == "product_info": #find the product_info id
            for grand in child:
                if grand.get("name") == "model_name": #and the model_name id
                    dicto[device.get("id")][0] = grand.get("value")
                    dicto[device.get("id")][3] +=1
        elif child.get("id") == "display": #and the display id
            for grand in child:
                if grand.get("name") == "physical_screen_height":
                    dicto[device.get("id")][1] = grand.get("value")
                    dicto[device.get("id")][3] +=1
                elif grand.get("name") == "physical_screen_width":
                    dicto[device.get("id")][2] = grand.get("value")
                    dicto[device.get("id")][3] +=1
    if not dicto[device.get("id")][3] == 3: #make sure I had enough
        #otherwise it's an incomplete dataset
        del dicto[device.get("id")]
arrays = []
for key in dicto.keys(): #sort this all into another list
    arrays.append(key)
arrays.sort() #and sort it alphabetically
with open('C:/Users/Me/Documents/wurfl1.js', 'w') as new: #now to write it out
    for item in arrays:
        new.write('{\n id:"'+item+'",\n Product_Info:"'+dicto[item][0]+'",\n Height:"'+dicto[item][1]+'",\n Width:"'+dicto[item][2]+'"\n},\n')
Just counted this as I ran it again - took about 3 seconds.
In Perl you could use XML::Twig, which is designed to process huge XML files (bigger than can fit in memory)
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;
my $file= shift @ARGV;
XML::Twig->new( twig_handlers => { 'node1/node2' => \&read_node })
         ->parsefile( $file);
sub read_node
  { my( $twig, $node2)= @_;
    # your code, the whole node2 string is $node2->sprint
    $twig->purge; # if you want to reduce memory footprint
  }
You can find more info about XML::Twig at xmltwig.org
For Python, I would recommend using lxml.
Since you are having performance problems, I would recommend iterating through your XML and processing it piece by piece; this saves a lot of memory and is likely to be much faster.
On an old server I read a 10 MB XML file within 3 seconds; your situation might be different.
About iterating with lxml: http://lxml.de/tutorial.html#tree-iteration
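A minimal sketch of that piece-by-piece approach, using the standard library's xml.etree.ElementTree.iterparse rather than lxml (lxml offers a similar iterparse). The element names node1 and node2 come from the question. The function name and the idea of yielding each node2's text are illustrative assumptions.

```python
import io
import xml.etree.ElementTree as ET


def iter_node2_texts(source):
    """Yield the text of every node2 inside each node1 while streaming the input."""
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "node1":
            for node2 in elem.iter("node2"):
                yield node2.text
            # discard the children of the node1 just processed to limit memory use
            elem.clear()


sample = io.BytesIO(
    b"<root><node1><node2>a</node2><node2>b</node2></node1>"
    b"<node1><node2>c</node2></node1></root>"
)
print(list(iter_node2_texts(sample)))
```

Passing a file path instead of the BytesIO object works the same way; the database insert would go where the value is yielded.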
Review this line of code:
$node = new SimpleXMLElement($xml->readOuterXML());
The documentation for readOuterXML has a comment saying that it sometimes attempts to reach out for namespaces and the like. In any case, I would suspect a big performance problem here.
Consider using readInnerXML() if you can.

Returning things to PHP from a Word Macro

The objective is to get an accurate word count for a Microsoft Word file. We have a Windows server that runs Apache and PHP. There is a web service running on that machine that basically gets all the content of the document and runs the content through preg_match_all("/\S+/", $string, $matches); return count($matches[0]);. Works pretty well but it's not at all accurate. So we wrote the following macro:
Sub GetWordCountBreakdown()
    Dim x As Integer
    Dim TotalWords As Long
    Dim FieldWords As Long
    TotalWords = ActiveDocument.ComputeStatistics(wdStatisticWords)
    For x = 1 To ActiveDocument.Fields.Count
        If ActiveDocument.Fields.Item(x).Result.ComputeStatistics(wdStatisticWords) > 25 Then
            FieldWords = FieldWords + ActiveDocument.Fields.Item(x).Result.ComputeStatistics(wdStatisticWords)
        End If
    Next x
    MsgBox (TotalWords & " - " & FieldWords & " = " & TotalWords - FieldWords)
End Sub
When I run this macro in Word, it gives me a neat little alert box counting up all the words and references in the document. I'm not sure how to return those values to PHP so my webservice can convey them back to me.
Update: I was able to just rewrite this macro in PHP and get the correct wordcount. Basically:
$word = new COM("Word.Application");
$word->Documents->Open($file);
$wdStatisticWords = 0;
$wordcount = $word->ActiveDocument->ComputeStatistics($wdStatisticWords);
etc.
If you can read the OLE streams for the doc file, an accurate wordcount for the document should be stored in either the SummaryInformation or the DocumentSummaryInformation stream. I don't have a script that reads the properties from .doc files, but I do have code for reading the metaproperties of Excel xls files that could be adapted fairly easily.
EDIT
I've just checked, and it's property id 0x0F in the SummaryInformation stream.
Why not simply count the number of spaces in the doc string? Or am I missing something?

Python's cPickle deserialization from PHP?

I have to deserialize a dictionary in PHP that was serialized using cPickle in Python.
In this specific case I probably could just regexp the wanted information, but is there a better way? Any extensions for PHP that would allow me to deserialize more natively the whole dictionary?
Apparently it is serialized in Python like this:
import cPickle as pickle
data = { 'user_id' : 5 }
pickled = pickle.dumps(data)
print pickled
Contents of such serialization cannot be pasted easily to here, because it contains binary data.
If you want to share data objects between programs written in different languages, it might be easier to serialize/deserialize using something like JSON instead. Most major programming languages have a JSON library.
Can you do a system call? You could use a python script like this to convert the pickle data into json:
# pickle2json.py
import sys, optparse, cPickle, os
try:
    import json
except ImportError:
    import simplejson as json
# Setup the arguments this script can accept from the command line
parser = optparse.OptionParser()
parser.add_option('-p','--pickled_data_path',dest="pickled_data_path",type="string",help="Path to the file containing pickled data.")
parser.add_option('-j','--json_data_path',dest="json_data_path",type="string",help="Path to where the json data should be saved.")
opts,args=parser.parse_args()
# Load in the pickled data from either a file or the standard input stream
if opts.pickled_data_path:
    unpickled_data = cPickle.loads(open(opts.pickled_data_path).read())
else:
    unpickled_data = cPickle.loads(sys.stdin.read())
# Output the json version of the data either to another file or to the standard output
if opts.json_data_path:
    open(opts.json_data_path, 'w').write(json.dumps(unpickled_data))
else:
    print json.dumps(unpickled_data)
This way, if you're getting the data from a file you could do something like this:
<?php
// exec() fills its second argument by reference, so it must be a variable
$json_data = array();
exec("python pickle2json.py -p pickled_data.txt", $json_data);
?>
or if you want to save it out to a file this:
<?php
system("python pickle2json.py -p pickled_data.txt -j p_to_j.json");
?>
All the code above probably isn't perfect (I'm not a PHP developer), but would something like this work for you?
I know this is ancient, but I've just needed to do this for a Django 1.3 app (circa 2012) and found this:
https://github.com/terryf/Phpickle
So just in case, one day, someone else needs the same solution.
If the pickle is being created by the code that you showed, then it won't contain binary data -- unless you are calling newlines "binary data". See the Python docs. The following code was run with Python 2.6.
>>> import cPickle
>>> data = {'user_id': 5}
>>> for protocol in (0, 1, 2): # protocol 0 is the default
...     print protocol, repr(cPickle.dumps(data, protocol))
...
0 "(dp1\nS'user_id'\np2\nI5\ns."
1 '}q\x01U\x07user_idq\x02K\x05s.'
2 '\x80\x02}q\x01U\x07user_idq\x02K\x05s.'
>>>
Which of the above looks most like what you are seeing? Can you post the pickled file contents as displayed by a hex editor/dumper or whatever is the PHP equivalent of Python's repr()? How many items in a typical dictionary? What data types other than "integer" and "string of 8-bit bytes" (what encoding?)?
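For readers on Python 3, where cPickle was merged into the pickle module, the protocol check and the JSON hand-off suggested earlier in this thread can be sketched as follows. The helper name pickle_to_json is an illustrative assumption, and the sketch only covers data that JSON can represent (dicts, lists, strings, numbers).

```python
import json
import pickle


def pickle_to_json(pickled_bytes):
    """Unpickle trusted data and re-serialize it as JSON text."""
    # pickle.loads can execute arbitrary code, so only feed it trusted input
    return json.dumps(pickle.loads(pickled_bytes), sort_keys=True)


data = {"user_id": 5}

# Protocol 0 is the text-based protocol; for this dict every byte is plain ASCII
protocol0 = pickle.dumps(data, protocol=0)
is_ascii = all(byte < 128 for byte in protocol0)

as_json = pickle_to_json(protocol0)
print(is_ascii, as_json)
```

PHP can then read the result with json_decode instead of parsing the pickle format itself.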
