OCR reading phone numbers with Tesseract - php

I am trying to complete a project that has to include some OCR. For the job I picked Tesseract OCR but the results are not optimal. I have tried to limit the character set to 1234567890- but the results are not good. Is there an optimal image size I can use or some way to train Tesseract to recognise this kind of string better?
The image is this:
The result Tesseract returns is 05175150152, which is not right. I expected better, since the image has not been modified in any way. I call Tesseract from PHP with exec, using the following command:
"C:\Program Files\Tesseract-OCR\tesseract.exe" C:\wamp\www\adwords\phones\center_ctl09_ctl04.png sssd -l eng -psm 7 nobatch letters
Any ideas on what I am doing wrong?

The image resolution of 96 DPI is tough for any OCR engine. Try to rescale it to 300 DPI and you will have better results.
Additionally, JPEG is a lossy image format. Use a different one, like TIFF or PNG, if possible.
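The rescaling advice comes down to a fixed scale factor. Here is a minimal Python sketch of the arithmetic, assuming the source really is 96 DPI and 300 DPI is the goal. The function name `upscaled_size` and the 200x40 example size are made up for illustration; the actual resampling would still be done by whatever image tool you prefer.

```python
def upscaled_size(width, height, source_dpi=96, target_dpi=300):
    """Return the pixel dimensions needed to go from source_dpi to target_dpi."""
    scale = target_dpi / source_dpi  # 300 / 96 = 3.125
    return round(width * scale), round(height * scale)


# Hypothetical 200x40 pixel crop of a phone number.
print(upscaled_size(200, 40))
```

The returned size can then be passed to the resize step that runs before Tesseract.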

Related

PHP Convert SVG To Jpg Missing Elements

I am using ImageMagick from PHP to convert SVG to JPG, with the command below.
convert -density 250 source.svg target.jpg
The conversion succeeds, but some elements of the SVG are missing in the output. Please check the example below.
Input SVG
Output JPG
Here you can clearly see that the light grey shades on the sides are replaced with white. Can someone let me know how to fix this?
FYI here is the link to download the actual SVG
https://drive.google.com/file/d/1vC5yaXds7ogcsTWjaDzkXZZFyKzLCsgf/view?usp=sharing
Which SVG renderer are you using? It could be the internal Imagemagick MSVG/XML, the RSVG delegate or Inkscape (in order of increasing functionality). You can find out by adding -verbose to your command line.
convert -verbose -density 250 source.svg target.jpg
I used Imagemagick 6.9.10.11 Q16 Mac OSX with its internal MSVG/XML renderer, RSVG 2.42.2_2 and Inkscape 0.92.3_4. All three produced different results. The RSVG was the worst. The Inkscape was the best. Here are my results using your command.
convert -density 250 MSVG:source.svg target_msvg.jpg
convert -density 250 RSVG:source.svg target_rsvg.jpg
convert -density 250 source.svg target_inkscape.jpg
Usually RSVG does better than Imagemagick's MSVG. But here it seems to do worse. It could be due to the way I modified my delegates.xml file in order to be able to run RSVG while Inkscape was installed. Also the MSVG renderer has been improved over the last few releases. So an older version may not produce as good of a result.
Finally I found a workaround for this. I converted svg to canvas image (base64 format) via javascript and through php i have converted the base64 data to jpg image.

Convert video from FFMPEG than video is rotated [duplicate]

When I try to upload videos captured from my iPhone in my app, the server performs a conversion from .mov to .mp4 so that it can be played in other platforms. However the problem is that when I shoot the video (in portrait orientation) and it is converted (using ffmpeg) and then played back from the server, it appears to be rotated. Any idea?
FFMPEG changed the default behavior to auto rotate video sources with rotation metadata in 2015. This was released as v2.7.
If your ffmpeg version is v2.7 or newer, but your rotation metadata isn't respected, the problem is likely that you are using custom rotation based on metadata. This will cause the same logic to be applied twice, changing or cancelling out the rotation.
In addition to removing your custom rotation (recommended), there's an option to turn off auto rotation with -noautorotate.
ffmpeg -noautorotate -i input.mp4...
This will also work in some older releases.
For sake of completeness, the reason this is happening is that iPhones only actually capture video in one fixed orientation. The measured orientation is then recorded in Apple-specific metadata.
The effect is that Quicktime Player reads the metadata and rotates the video to the correct orientation during playback, but other software (e.g., VLC) does not and shows it as oriented in the actual codec data.
This is why rotate=90 (or vflip, or transpose, or etc.) will work for some people, but not others. Depending on how the camera is held during recording, the rotation necessary could be 90, 180, or even 270 degrees. Without reading the metadata, you're just guessing at how much rotation is necessary and the change that fixes one video will fail for another.
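To make the "no single value works for everyone" point concrete, here is a rough Python sketch that picks a filter chain once the required clockwise rotation is known. It assumes the angle has already been read from the file's metadata by some other means, and it does not settle whether a given tag should be applied as-is or inverted. `rotation_filter` is a hypothetical helper, not part of ffmpeg.

```python
def rotation_filter(clockwise_degrees):
    """Build an ffmpeg -vf filter chain for a multiple of 90 degrees."""
    chains = {
        0: "",                           # nothing to do
        90: "transpose=1",               # 90 degrees clockwise
        180: "transpose=1,transpose=1",  # two quarter turns
        270: "transpose=2",              # 90 degrees counter-clockwise
    }
    angle = clockwise_degrees % 360
    if angle not in chains:
        raise ValueError("only multiples of 90 degrees are handled")
    return chains[angle]


for degrees in (90, 180, 270):
    print(degrees, rotation_filter(degrees))
```

The returned string is what would go after -vf on the ffmpeg command line.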
What you can also do is remove the QuickTime-specific metadata when you rotate the .mov.
This makes sure that the video is rotated the same way in VLC and QuickTime:
ffmpeg -i in.mov -vf "transpose=1" -metadata:s:v:0 rotate=0 out.mov
Here's the documentation on the -metadata option (from http://ffmpeg.org/ffmpeg.html):
-metadata[:metadata_specifier] key=value (output,per-metadata)
Set a metadata key/value pair.
An optional metadata_specifier may be given to set metadata on streams or chapters. See -map_metadata documentation for details.
This option overrides metadata set with -map_metadata. It is also possible to delete metadata by using an empty value.
For example, for setting the title in the output file:
ffmpeg -i in.avi -metadata title="my title" out.flv
To set the language of the first audio stream:
ffmpeg -i INPUT -metadata:s:a:0 language=eng OUTPUT
Depending on which version of ffmpeg you have and how it's compiled, one of the following should work...
ffmpeg -vf "transpose=1" -i input.mov output.mp4
...or...
ffmpeg -vfilters "rotate=90" -i input.mov output.mp4
Use the vflip filter:
ffmpeg -i input.mov -vf "vflip" output.mp4
Rotate did not work for me and transpose=1 was rotating 90 degrees
So - I too ran into this issue, and here is my $0.02 on it:
1.) some videos DO have Orientation/Rotation metadata, some don't:
MTS (Sony AVCHD) or the AVIs I have - DO NOT have an orientation tag.
MOVs and MP4s (iPad/iPhone or Samsung Galaxy Note2) DO HAVE it.
you can check the setting via 'exiftool -Rotation file'.
My videos often have 90 or 180 as the rotation.
2.) ffmpeg - regardless of the man-page with the metadata-tag, just doesn't EVER seem to set it in the output file. - the rotation-tag is ALWAYS '0'.
it correctly reports it in the output - but it's never set right to be reported by exiftool. - But hey - at least it's there and always 0.
3.) rotation angles:
if you want rotate +/- 90: transpose=1 for clockwise 90, 2 ccw
now if you need 180 degree - just add this filter TWICE.
remember - it's a filter-chain you specify. :-) - see further down.
4.) rotate then scale:
this is tricky - because you quickly get into MP4 output format violations.
Let's say you have a 1920x1080 MOV.
rotate by 90 gives 1080x1920
then we rescale to -1:720 -> 1080*(720/1920) = 405 horiz
And 405 horizontal is NOT divisible by 2 - ERROR. fix this manually.
FIXING THIS automatically - requires a bit of shell-script work.
5.) scale then rotate:
you could do it this way - but then you end up with 720x1280. yuck.
But the filter-example here would be:
"-vf yadif=1,scale=-1:720,transpose=1"
It's just not what I want - but could work quite OK.
Putting it all together: - NOTE - 'intentionally WRONG Rotation-tag', just to demonstrate - it won't show up AT ALL in the output !
This will take the input - and rotate it by 180 degree, THEN RESCALE IT - resetting the rotation-tag. - typically iphone/ipad2 can create 180deg rotated material.
You can just leave '-metadata Rotation=x' out of the line...
/usr/bin/ffmpeg -i input-movie.mov -timestamp "2012-06-23 08:58:10" -map_metadata 0:0 -metadata Rotation=270 -sws_flags lanczos -vcodec libx264 -x264opts me=umh -b 2600k -vf yadif=1,transpose=1,transpose=1,scale=1280:720 -f mp4 -y output-movie.MP4
I have multiple devices - like a settop box, ipad2, note2, and I convert ALL my input-material (regardless whether it's mp4,mov,MTS,AVI) to 720p mp4, and till now ALL the resulting videos play correct (orientation,sound) on every dev.
Hope it helps.
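The odd-width problem from point 4 can be handled with a little arithmetic instead of manual fixing. A small Python sketch, assuming the encoder only needs even dimensions; `even_scaled_width` is a made-up helper name and rounding down is just one possible choice.

```python
def even_scaled_width(src_width, src_height, target_height):
    """Scale the width to match target_height, rounded down to an even number."""
    width = round(src_width * target_height / src_height)
    return width - (width % 2)


# The rotated 1080x1920 example from above, rescaled to a height of 720.
print(even_scaled_width(1080, 1920, 720))
```

The result can be used in an explicit scale=WIDTH:720 filter. Newer ffmpeg builds also accept scale=-2:720, which asks the scale filter to pick a width divisible by 2 itself.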
For including into web pages my portrait-format videos from iPhone, I just discovered the following recipe for getting .mp4 files in portrait display.
Step 1: In QuickTime Player, Export your file to 480p (I assume that 720p or 1080p would work as well). You get a .mov file again.
Step 2: Take the new file in QT Player, and export to “iPad, iPhone…”. You get a .m4v file.
Step 3: I’m using Miro Video Converter, but probably any readily-available converter at all will work, to get your .mp4 file.
Works like a (long-winded) charm.
I filmed the video with an iPad 3 and it came out upside down, which I suppose is common for Apple devices on some OS versions. Besides that, the 3-minute MOV file (1920x1090) was about 500 MB, which made it hard to share. I had to convert it to MP4, and after going through all the threads I found on Stack Overflow, here is the final ffmpeg command I used (ffmpeg ver. 2.8.4):
ffmpeg -i IN.MOV -s 960x540 -metadata:s:v rotate="0" -acodec libmp3lame OUT.mp4
I suppose you can keep just '-metadata:s:v rotate="0"' if you don't need the resize and the audio codec change. Note that if you resize the video, the width and height should both be divisible by 4.
Although the topic is old, I hope this will help someone:
Get ffmpeg latest version : https://www.ffmpeg.org/download.html
The command that worked for me (to flip 180 degrees):
ffmpeg -noautorotate -i input.mp4 -filter:v "rotate=PI" output.mp4
The rotate filter takes its angle in radians, so degrees are converted with -filter:v "rotate=degrees*PI/180"
for example
-filter:v "rotate=45*PI/180" for 45 degrees
A nice explanation is here
https://superuser.com/questions/578321/how-to-rotate-a-video-180-with-ffmpeg
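The degrees-to-radians conversion used by the rotate filter can be sketched in Python as follows. `rotate_expression` is a hypothetical helper that only builds the filter string; the math.radians call is there to show that 180 degrees is PI radians, as used in the command above.

```python
import math


def rotate_expression(degrees):
    """Build a rotate filter argument that converts degrees to radians."""
    return f"rotate={degrees}*PI/180"


print(rotate_expression(45))
print(math.isclose(math.radians(180), math.pi))
```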
Or... to simply change the tag in an existing file:
Read the current rotation
exiftool -Rotation <file>
then, for example:
exiftool -Rotation=180 <file>
to set it to 180

PHP Imagemagick PDF to GIF

I have the following code which is supposed to convert the first page of PDF to thumbnail:
<?php
$strPDF = "http://www.domain.com/b.pdf";
exec("/usr/bin/convert \"{$strPDF}[0]\" -colorspace RGB -geometry 200 \"output.gif\"");
?>
My host server is Siteground, and apparently /usr/bin/convert is where ImageMagick convert function is. This is my first time using ImageMagick and I'm not sure if it is doing anything. Is my code correct? And if it is, I can't seem to find output.gif.
Give the output file a full path. Example : /home/marlboro/www/output.gif
Also, first download the PDF to your local machine (use curl/wget).
Before using exec from PHP you should be 100% sure the method works when called manually from shell!
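As a rough illustration of the advice about a local input file and a full output path, here is a Python sketch that builds the same convert invocation with absolute paths. The helper name `thumbnail_command` and the file names are assumptions, and the download step (curl/wget) is not shown.

```python
import os


def thumbnail_command(local_pdf, output_gif, convert_bin="/usr/bin/convert"):
    """Return the argument list for converting page 1 of a local PDF to a GIF."""
    return [
        convert_bin,
        os.path.abspath(local_pdf) + "[0]",  # [0] selects the first page
        "-colorspace", "RGB",
        "-geometry", "200",
        os.path.abspath(output_gif),         # full path, so the result is easy to find
    ]


print(thumbnail_command("b.pdf", "output.gif"))
```

Printing the list first makes it easy to paste the command into a shell and confirm it works there before wiring it into the web application.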

Resizing image with exec and convert in PHP

I'm trying to convert a png image in PHP the following way:
exec($cmd, $output, $return_code);
Where $cmd contains the following line of code:
/usr/bin/convert 'images/original/Id1741.png' -thumbnail x200 -quality '90' './cache/a3b84c5931d9619d12a9e244a310cb17_h200.png'
Calling this code on the command line works perfectly fine, but executing it on the webserver gives me the following error message:
Tried to execute : convert 'images/original/Id1741.png' -thumbnail x200 -quality '90' './cache/a3b84c5931d9619d12a9e244a310cb17_h200.png', return code: 1, output: Array()
If I remove the thumbnail option the command executes just fine on the webserver, but obviously it does not resize anything. So I guess it's not a problem with permissions or the setup.
PHP Version is 5.2.17.
ImageMagick Version is: 6.6.0-4 2012-04-26
Anyone ever had a similar issue and can help me with this?
Ok, I finally got it fixed. After redirecting stderr to a file I found the following error:
libgomp: Thread creation failed: Resource temporarily unavailable
Seems that my hoster 1&1 recently upgraded the ImageMagick version which apparently uses more memory than the old one (at least that's what the hoster says).
They recommend limiting the number of Threads created by ImageMagick:
putenv('MAGICK_THREAD_LIMIT=1');
I put this code into my init-script and now it works just fine!
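The two steps that solved this (capturing stderr and limiting the thread count through the environment) can be sketched in Python like this. It is only an illustration under assumptions: `run_with_thread_limit` is a made-up helper, and a small Python child process stands in for convert so the example runs without ImageMagick installed.

```python
import os
import subprocess
import sys


def run_with_thread_limit(argv, limit=1):
    """Run argv with MAGICK_THREAD_LIMIT set; return (exit code, stderr text)."""
    env = dict(os.environ, MAGICK_THREAD_LIMIT=str(limit))
    result = subprocess.run(argv, env=env, capture_output=True, text=True)
    return result.returncode, result.stderr


# Stand-in child process that echoes the variable on stderr.
code, err = run_with_thread_limit(
    [sys.executable, "-c",
     "import os, sys; sys.stderr.write(os.environ['MAGICK_THREAD_LIMIT'])"]
)
print(code, err)
```

With a real convert command in argv, the returned stderr text is where a message such as the libgomp error would show up.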
You are converting to PNG, but you are setting -quality 90 (presumably by analogy with the JPEG quality setting).
However, for PNG output, the -quality setting works very differently from JPEG's quality setting (which is simply an integer from 0 to 100).
For PNG it is composed of two single digits:
The first digit (tens) is (largely) the zlib compression level, and it may go from 0 to 9.
(However the setting of 0 has a special meaning: when you use it you'll get Huffman compression, not zlib compression level 0. This is often better... Weird but true.)
The second digit is the PNG data encoding filter type (before it is compressed):
0 is none,
1 is "sub",
2 is "up",
3 is "average",
4 is "Paeth", and
5 is "adaptive".
In practical terms that means:
For illustrations with solid sequences of color a "none" filter (-quality 00) is typically the most appropriate.
For photos of natural landscapes an "adaptive" filtering (-quality 05) is generally the best.
Maybe you want to revisit your -quality 90 setting in the light of this info.
Maybe you were aware of it already. In this case: my apologies for 'preaching to the choir'. :-)
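The two-digit scheme described above can be decoded with a few lines of Python. This is a sketch of the description in this answer only, not of ImageMagick's internals; `describe_png_quality` is a hypothetical name, and filter digits above 5 are simply reported as not covered.

```python
PNG_FILTERS = {0: "none", 1: "sub", 2: "up", 3: "average", 4: "Paeth", 5: "adaptive"}


def describe_png_quality(quality):
    """Split a PNG -quality value into (compression, filter) descriptions."""
    if not 0 <= quality <= 99:
        raise ValueError("expected a value from 0 to 99")
    level, filter_digit = divmod(quality, 10)  # tens digit, ones digit
    compression = "Huffman" if level == 0 else f"zlib level {level}"
    return compression, PNG_FILTERS.get(filter_digit, "not described above")


print(describe_png_quality(90))
print(describe_png_quality(5))
```

Read this way, the -quality 90 from the question means zlib level 9 with no filtering.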

PNG optimisation tools

A while back I used a PNG optimisation service called (I think) "smush it". You fed it a weblink and it returned a zip of all the PNG images with their filesizes nicely, well, smushed...
I want to implement a similar optimisation feature as part of my website's image upload process; does anyone know of a pre-existing library (PHP or Python preferably) that I can tap into for this? A brief Google has pointed me towards several command line style tools, but I'd rather not go down that route if possible.
Execute these command-line tools from PHP:
pngcrush -rem gAMA -rem cHRM -rem iCCP -rem sRGB -brute -l 9 -max -reduce -m 0 -q IMAGE
optipng -o7 -q pngout.png
pngout pngout.png -q -y -k0 -s0
advpng -z -4 pngout.png > /dev/null
pngcrush
OptiPNG
pngout
advpng
As long as your PHP is compiled with GD2 support (quite common nowadays):
<?php
$image = imagecreatefromstring(file_get_contents('/path/to/image.original.png'));
imagepng($image, '/path/to/image.smushed.png', 9);
This will read in any image format GD2 understands (not just PNG) and output a PNG compressed at the maximum zlib compression level without sacrificing quality.
It might be of less use today than years ago though; most image editors already do this, since gzipping doesn't cost as much CPU-wise as it used to.
Have you heard of PNGCrush? You could check out the source, part of PNG and MNG Tools at SourceForge, and transcribe or wrap it in Python.
I would question the wisdom of throwing away other chunks (like gAMA and iCCP), but if that's what you want to do it's fairly easy to use PyPNG to remove chunks:
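#!/usr/bin/env python
import sys

import png  # PyPNG

# PNG data is binary: on Python 3 use the byte streams behind stdin/stdout.
infile = getattr(sys.stdin, "buffer", sys.stdin)
outfile = getattr(sys.stdout, "buffer", sys.stdout)


def critical_chunks(chunks):
    """Yield only the critical chunks (type starts with an uppercase letter)."""
    for chunk_type, data in chunks:
        # Slicing works whether the chunk type is bytes or str.
        if chunk_type[:1].isupper():
            yield chunk_type, data


chunks = png.Reader(file=infile).chunks()
png.write_chunks(outfile, critical_chunks(chunks))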
The critical_chunks function essentially filters out all but the critical PNG chunks (the 4-letter type of a critical chunk starts with an uppercase letter).
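For readers without PyPNG, the same critical/ancillary test can be sketched with the standard library alone. This is an illustrative reimplementation under assumptions, not PyPNG's code: the helper names are made up, CRCs are recomputed rather than verified, and the sample data is a hand-built chunk sequence rather than a real image.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type, data):
    """Serialise one chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def iter_chunks(png_bytes):
    """Yield (type, data) pairs from an in-memory PNG file."""
    if png_bytes[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos < len(png_bytes):
        (length,) = struct.unpack(">I", png_bytes[pos:pos + 4])
        chunk_type = png_bytes[pos + 4:pos + 8]
        yield chunk_type, png_bytes[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC


def is_critical(chunk_type):
    """Critical chunks have bit 5 of the first byte clear (an uppercase letter)."""
    return not (chunk_type[0] & 0x20)


def strip_ancillary(png_bytes):
    """Return the PNG with every ancillary chunk removed."""
    kept = [make_chunk(t, d) for t, d in iter_chunks(png_bytes) if is_critical(t)]
    return PNG_SIGNATURE + b"".join(kept)


# Hand-built sample: header, a gAMA chunk, and the end marker.
ihdr = make_chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0))
gama = make_chunk(b"gAMA", struct.pack(">I", 45455))
iend = make_chunk(b"IEND", b"")
stripped = strip_ancillary(PNG_SIGNATURE + ihdr + gama + iend)
print([t for t, _ in iter_chunks(stripped)])
```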
