Inline <video> large file with buffer


I'm trying to build a site using HTML5's video tag so that I can share some movies I have made. Their sizes are pretty big (>500 MB), and when I watch them from outside my network, it seems like it's trying to download the whole thing before showing it. I'm wondering how I can make it so that they can be downloaded and watched at the same time.
I'm using PHP and JavaScript to build the site, although if there are libraries or techniques available in other languages, I'm more than happy to hear about them.

Video files on the web sometimes need to be encoded in a particular way so that they can be played while they are still downloading. For MP4 files, whether played through Flash or the HTML5 video tag, a block of data called the "moov" atom must be moved from the end of the file to the start. A program called MP4 FastStart can do this for you.
Programs like HandBrake have a "web optimized" option that does the same thing when encoding. The moov atom basically contains the length of the video and similar index information. Encoders typically write it at the end of the file, which on the web means downloading the entire file before playback can begin.
Can you tell us what format the video is?
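To check whether a given MP4 already has its "moov" data at the front, the top-level boxes of the file can be listed in order. The following is a minimal sketch, assuming a standard MP4/QuickTime box layout (4-byte big-endian size followed by a 4-byte type); the function names topLevelAtoms and isFastStart are hypothetical helpers, not part of any library.

```php
<?php
// Return the types of the top-level MP4 boxes ("atoms") in file order,
// e.g. ['ftyp', 'moov', 'mdat'].
function topLevelAtoms(string $path): array
{
    $atoms = [];
    $fh = fopen($path, 'rb');
    if ($fh === false) {
        return $atoms;
    }
    while (true) {
        $header = fread($fh, 8);
        if ($header === false || strlen($header) < 8) {
            break; // end of file or truncated header
        }
        $box = unpack('Nsize/a4type', $header);
        $size = $box['size'];
        $headerLen = 8;
        if ($size === 1) {
            // A size of 1 means a 64-bit size follows the type field.
            $ext = fread($fh, 8);
            if ($ext === false || strlen($ext) < 8) {
                break;
            }
            $size = unpack('J', $ext)[1];
            $headerLen = 16;
        }
        $atoms[] = $box['type'];
        if ($size === 0 || $size < $headerLen) {
            break; // size 0: box runs to end of file; smaller than header: malformed
        }
        fseek($fh, $size - $headerLen, SEEK_CUR); // skip the box payload
    }
    fclose($fh);
    return $atoms;
}

// True when "moov" exists and is not preceded by "mdat".
function isFastStart(string $path): bool
{
    $atoms = topLevelAtoms($path);
    $moov = array_search('moov', $atoms, true);
    $mdat = array_search('mdat', $atoms, true);
    return $moov !== false && ($mdat === false || $moov < $mdat);
}
```

If isFastStart() returns false for a file, running it through a faststart tool or re-encoding with the web option described above should move the "moov" data to the front.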

Related

ID3 Tag while merging MP3s in real time, using PHP and HTTP Partial Content

What we want to do is add a kind of MP3 preroll to another MP3 file in real time. That means we have two physical MP3 files on the server which are not yet merged into one, because ffmpeg & Co. take too much time. It has to happen in real time so that no time is lost when someone starts the (web) player. The practical case is to add prerolls to podcast files. What we already did (described below) works, except for displaying the correct file duration in audio players.
One of my co-workers did this, so I will try to describe it as well as I can.
What my coworker already did is telling the header that two files are coming in a row by reading both files and echoing them via PHP. HTTP/1.1 206 Partial Content is used for delivering the "merged" content.
The problem is that there are still two ID3 tags, one from each file, and most audio players only read the first one, which causes wrong duration displays. The only case where it works 100% is in VLC after downloading the whole thing. No web player, iTunes, etc. can handle the duration of the "merged" file.
Any idea how to create a "virtual ID3 Tag" in real time and how to remove the existing ones without touching the original files?

There are a lot of inaccurate conclusions you've come to, so let me start by correcting those, which may help you solve the problem.
> because ffmpeg & Co. take too much time
FFmpeg can merge these audio streams faster than you can stream to clients for sure. If you're using -codec copy (which you should be in this case), it will handle all the demuxing/muxing for you. And, keep in mind that you can stream directly out of FFmpeg. No need for an intermediary file.
> The practical case is to add prerolls to podcast files.
The FFmpeg route is what you want.
> What my coworker already did is telling the header that two files are coming in a row by reading both files and echoing them via PHP. HTTP/1.1 206 Partial Content is used for delivering the "merged" content.
That's a bit of a wonky way to do this. You could instead just merge the data and send it directly in a single response.
> The problem is that there are still two ID3 tags, one from each file, and most audio players only read the first one, which causes wrong duration displays.
No, the usual ID3 tags don't indicate duration. (There is an extension which does, but this is rarely used.) There is nothing in the bare MP3 stream that indicates duration either. Clients estimate this based on file size and bitrate. The bitrate can change mid-stream, so they usually estimate based on the bitrate of the first couple frames.
Undoubtedly, the problem in your case is incorrect length headers due to the way you're handling this merging, and/or a mismatch of bitrate which causes the length estimate from the player to be wrong.
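The size-and-bitrate estimate described above can be worked through with a little arithmetic. This is a sketch only: the function name and the file sizes and bitrates in the usage note are made-up illustration values, not measurements from the question.

```php
<?php
// Estimate playback duration the way a simple player might:
// total bits divided by the bitrate taken from the first frames.
function estimateDurationSeconds(int $fileSizeBytes, int $bitrateKbps): float
{
    if ($bitrateKbps <= 0) {
        throw new InvalidArgumentException('Bitrate must be positive.');
    }
    return ($fileSizeBytes * 8) / ($bitrateKbps * 1000);
}
```

For example, a 240,000-byte preroll at 64 kbps lasts 30 seconds and a 28,800,000-byte episode at 128 kbps lasts 1800 seconds. Sent back to back, the response is 29,040,000 bytes. A player that only looks at the preroll's 64 kbps frames would compute estimateDurationSeconds(29040000, 64), which is 3630 seconds, almost double the real 1830 seconds.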
> Any idea how to create a "virtual ID3 Tag" in real time and how to remove the existing ones without touching the original files?
I would absolutely use FFmpeg for this work, if for no other reason than that not all podcasts use MP3. There are plenty of AAC in MP4 podcasts, and a handful of Opus in WebM as well.
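One possible way to stream straight out of FFmpeg from PHP is sketched below, assuming the ffmpeg binary is on the server's PATH and both inputs are MP3 files. It uses FFmpeg's concat protocol together with -codec copy; buildConcatCommand and the file names are hypothetical, and file names containing a "|" character would need different handling.

```php
<?php
// Build an ffmpeg command that joins the given MP3 files without
// re-encoding and writes the result to standard output.
function buildConcatCommand(array $files): string
{
    // The concat protocol takes a "|"-separated list of inputs.
    $input = 'concat:' . implode('|', $files);
    return 'ffmpeg -v error -i ' . escapeshellarg($input)
        . ' -codec copy -f mp3 pipe:1';
}

// Usage sketch (in the script that serves the podcast):
//   header('Content-Type: audio/mpeg');
//   passthru(buildConcatCommand(['preroll.mp3', 'episode.mp3']));
```

Because the output is produced on the fly, its total size is not known up front, so this sketch sends no Content-Length header.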

How do sites like Bing Search, Imgur, and Reddit generate a thumbnail of the website from a URL?

In Imgur, you can input an image URL and a few seconds later, there's a thumbnail of the image. Or in Bing Search, you can (or used to be able to) view a thumbnail of the website in the search results before visiting it.
I would love to implement something similar for my website, but I can't wrap my head around how it is done. Moreover, are there not security concerns? I'd imagine the servers have to at least download the website, render it and take a screenshot. What if it's a malicious website, and you download something malicious onto your server?

A headless web browser engine like PhantomJS can be used for this. See the example on their wiki. Yes, it would be prudent to run this in some sort of sandbox, feeding a queue of URLs into it and then collecting the generated thumbnails from the file system.

While I don't know the internal workings of any of the aforementioned services, I'd guess that they download/create a local copy of the images and generate a thumbnail from that.
Imgur, as an image hosting service, definitely needs a copy of the image prior to being able to generate thumbnails or anything else from it. The image may be stored locally or just in memory, but either way, it must be downloaded.
The search engines displaying screenshots of the sites likely have services that periodically take a screenshot of the viewable area when the content is getting indexed, and then serve those screenshots (or derivatives) along with the search results. Taking a screenshot really isn't dangerous, so there's nothing to worry about there, and whatever tools are used to load/parse/index the websites will obviously be written with security considerations in mind.
Of course, there are security concerns about the data you're downloading, too; the images can easily contain executable code (such as PHP) in their EXIF data, so you need to be careful about what you do with the images and how.
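For the image case, a thumbnail can be produced from the downloaded bytes without ever writing the original file to disk. The sketch below assumes PHP's GD extension is installed; makeThumbnail is a hypothetical helper name. Because the thumbnail is re-encoded from decoded pixels, metadata such as EXIF from the original file is not copied into it.

```php
<?php
// Decode downloaded image bytes, scale them down, and return PNG bytes.
// Returns null when the data is not an image GD can read.
function makeThumbnail(string $imageBytes, int $maxWidth = 160): ?string
{
    $src = @imagecreatefromstring($imageBytes);
    if ($src === false) {
        return null;
    }
    // imagescale() keeps the aspect ratio when only a width is given.
    $thumb = imagesx($src) > $maxWidth ? imagescale($src, $maxWidth) : $src;
    if ($thumb === false) {
        return null;
    }
    ob_start();
    imagepng($thumb); // writes the PNG into the output buffer
    $png = ob_get_clean();
    return $png === false ? null : $png;
}
```

The download itself should still be limited in size and content type before the bytes are handed to a function like this.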

Embedding a PDF into a website without a SRC attribute

Currently working on an offshoot of the idea more adequately addressed here.
Creating a Secure File Hosting Server for PDFs
I'm developing a secure PDF hosting website where certain users can download certain PDFs that I have stored outside of the webroot to prevent people from accessing documents they shouldn't access.
I've got the download working using the first solution, but I want to implement a 'view/preview' feature too. I still don't get content headers as well as I should but I believe what is causing the bulk of my issues is I can't put a 'src' attribute on the embed/object/iframe/whatever. And that's kind of the point of the system.
My question is, is there any way to feed a file (as opposed to a url) to an embed/object? I would like to keep my current system and I'm going for simplicity at the moment so the easier the better.
I saw Recommended way to embed PDF in HTML? and will probably check out pdf.js if I'm trying something that isn't doable.

I have not yet had the chance to play with pdf.js, but it's either that or a Flash player of some sort.
Or you rely on the browser to display it as a webpage and you can iframe it, but that's so lame... it would work only for a fraction of your users.

PDF2SWF - convert the PDF to SWF (1 page = 1 SWF).
Use another SWF (a reader) to load the SWF pages via XML or something else.
Use $_SESSION to store the ID of the PDF document, which should be served through e.g. /preview (the same link for previewing all documents).
Don't serve the original PDF; add a watermark or make the pages low-res.
Otherwise, your PDF will never be "secure".
http://www.swftools.org/
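The $_SESSION idea above also works without SWF if the browser can display PDFs itself. The following is a minimal sketch, assuming a hypothetical /preview script, a hypothetical session key 'preview_id', and a hypothetical $documents map from IDs to files stored outside the webroot. The embed or iframe then always points at the same /preview URL, and the session decides which file is sent.

```php
<?php
// Look up a document ID in a whitelist and return its path,
// or null if the ID is unknown or the file is missing.
function resolvePdfPath(array $documents, $id): ?string
{
    if (!is_string($id) && !is_int($id)) {
        return null;
    }
    if (!array_key_exists($id, $documents)) {
        return null;
    }
    $path = $documents[$id];
    return is_file($path) ? $path : null;
}

// Send the PDF so that the browser displays it instead of downloading it.
function sendPdfInline(string $path): void
{
    header('Content-Type: application/pdf');
    header('Content-Disposition: inline; filename="' . basename($path) . '"');
    header('Content-Length: ' . filesize($path));
    readfile($path);
}

// Usage sketch for /preview:
//   session_start();
//   $path = resolvePdfPath($documents, $_SESSION['preview_id'] ?? null);
//   if ($path === null) { http_response_code(404); exit; }
//   sendPdfInline($path);
```

The page that shows the preview would set $_SESSION['preview_id'] after checking the user's permissions and then embed something like <iframe src="/preview">. As noted above, whatever the browser can display, the user can also save.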

Embed code in video file

I'm sorry if the question is ambiguous, I'll try to explain.
I'm working on an existing PHP download script for videos and some parts of it are broken. There's code in there that's supposed to place a specific member code inside the video file before download, but it doesn't work. Here's the code:
//embed user's code in video file
$fpTarget = fopen($filename, "a");
fwrite($fpTarget, $member_code);
fclose($fpTarget);
$member_code is a random 6-character code.
Now, this would make sense to me if it were a text file, but since it's a video file, how could this possibly work and what is it supposed to do? If the member code is somehow added to the video, how can I see it after downloading it? I have no experience with video files, so any help is appreciated (a modification of the available code or new code would be equally welcome).
I'm sorry I can't give a more precise description of what the code is supposed to do, I'm trying to figure that out myself.

It may work, depending on the format/type of the video. MPG files are fairly tolerant of "noise" in a file and players would skip over your code because it doesn't look like valid video frame data.
Other formats/players may puke, because the format requires certain data be at specific offsets relative to the end of the file, which you've now shifted by 6 characters.
Your best bet is to see whether the format you're serving has provisions for metadata in its specification, e.g. there might be support for a comment field somewhere that you can simply slap the code into.
However, if you're doing all this for 'security' or tracking unauthorized sharing of the video, then simply writing the number into a header is fairly easy to bypass. A better bet would be to watermark the video somehow so that the code is embedded in the actual video data, so that "This video belongs to member XYZ only" is displayed while playing.
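To see what the append in the question actually does, and how the code could be read back after download, here is a minimal sketch. appendMemberCode mirrors the fopen/fwrite approach from the question; readTrailingCode is a hypothetical helper that simply reads the last bytes of the file. It assumes nothing else was appended afterwards, and whether the video still plays depends on the format, as described above.

```php
<?php
// Append the member code to the end of the file (binary-safe).
function appendMemberCode(string $filename, string $memberCode): bool
{
    $fp = fopen($filename, 'ab');
    if ($fp === false) {
        return false;
    }
    $ok = fwrite($fp, $memberCode) === strlen($memberCode);
    fclose($fp);
    return $ok;
}

// Read the last $length bytes of the file, where the code was appended.
// Returns null if the file cannot be opened or is too short.
function readTrailingCode(string $filename, int $length = 6): ?string
{
    $fp = fopen($filename, 'rb');
    if ($fp === false) {
        return null;
    }
    $code = null;
    if (fseek($fp, -$length, SEEK_END) === 0) {
        $bytes = fread($fp, $length);
        if ($bytes !== false && strlen($bytes) === $length) {
            $code = $bytes;
        }
    }
    fclose($fp);
    return $code;
}
```

The same bytes are visible at the very end of the file in any hex viewer, which is also why they are easy for anyone to strip off again.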

You don't write to the content of the file directly, not like you would with a text file. As you've noticed, this effectively corrupts the video and you have no way of reasonably reading the information.
For audio/video files, you write to meta-data that's packaged with the file. How this is packaged and what you can do with it generally depends heavily on the container format used for the file. (Remember that container and codec are two different things. The codec is the format used to encode the audio/video, the container is the file format in which that data stream is stored.)
A library like getID3 might be a good place to start. I've never used it, but it seems to be what you're looking for. What you would essentially do is write a value to the meta-data in the container (either a pre-defined value for that container or maybe a custom key/value pair, etc.) which would be part of the file. Then, when reading the file, you can get that data. (Now, that last part depends heavily on what's reading the file. The data is there, but not every player cares about it. You'll want to match up what you're writing to with what you usually see/read from the file's internal meta-data.)

Help creating a script which can play two audio files consecutively no matter the length

I am working on a call board for the hospital I work for. The call board will be used for announcing CODES, FIRE, and a few other types of calls. My problem is that I am trying to play two WAV files consecutively: the first file is the type of call and the second file is the location. We have over 700 possible locations and I do not want to have thousands of pre-made recordings. Please help. I have also thought about using a speech synthesizer like Microsoft Anna.

Save the audio clips as MP3s. Concatenating them is simple: just output one after the other. MP3s have no internal structure beyond a "frame". An MP3 file is simply a series of these frames, which can be mixed in with other data (e.g. MPEG video).
<?php
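// Tell the client that the response is a single MP3 stream.
header('Content-Type: audio/mpeg');
// Output the clips back to back; the player receives them as one file.
readfile('good.mp3');
readfile('morning.mp3');
readfile('zombie attack.mp3');
readfile('imminent.mp3');
readfile('arm yourself.mp3');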
Of course, doing this sort of thing via a web interface seems silly. How would you tell the PA system to poke at the server to play something?
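For the call board case, the same back-to-back output can be wrapped in a small function that takes one clip for the call type and one for the location, so 700 locations need 700 clips rather than one recording per combination. This is a sketch under the answer's assumption that the clips are MP3s; streamClips and the file names in the usage note are hypothetical.

```php
<?php
// Copy each readable clip, in order, to the given output stream.
// Returns the total number of bytes written.
function streamClips(array $paths, $out): int
{
    $total = 0;
    foreach ($paths as $path) {
        if (!is_readable($path)) {
            continue; // skip missing clips instead of aborting the announcement
        }
        $in = fopen($path, 'rb');
        if ($in === false) {
            continue;
        }
        $copied = stream_copy_to_stream($in, $out);
        if ($copied !== false) {
            $total += $copied;
        }
        fclose($in);
    }
    return $total;
}

// Usage sketch:
//   header('Content-Type: audio/mpeg');
//   streamClips(['calls/fire.mp3', 'locations/ward-12.mp3'], fopen('php://output', 'wb'));
```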
