PHP Tesseract, text extraction from an image

I have a requirement to extract the text content from images. So far I have been able to extract the entire content of an image using the commands below.
shell_exec("/usr/bin/tesseract inv.jpg invoice -l eng");
$data = file_get_contents('invoice.txt');
echo '<pre>';
print_r($data);
This gives me the entire content. Now I need to know how to extract the data from only a specific portion of the image, identified by its coordinates.
Any advice would be helpful.
Thanks.
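One possible approach, sketched here as an assumption rather than a confirmed answer from the thread, is to crop the region of interest with PHP's GD extension and run Tesseract on the cropped file only. The helper names cropRegion() and ocrRegion() are hypothetical; the Tesseract path and the `-l eng` option are taken from the command in the question, and the coordinates are whatever rectangle you want to read.

```php
<?php
// Sketch only: crop a rectangle out of a JPEG with GD, then OCR just the crop.
// cropRegion() and ocrRegion() are hypothetical helper names.

function cropRegion(string $source, string $target, int $x, int $y, int $width, int $height): bool
{
    $image = imagecreatefromjpeg($source);
    if ($image === false) {
        return false;
    }
    $region = imagecrop($image, ['x' => $x, 'y' => $y, 'width' => $width, 'height' => $height]);
    if ($region === false) {
        return false;
    }
    // PNG is lossless, so the crop is not recompressed before OCR.
    return imagepng($region, $target);
}

function ocrRegion(string $source, int $x, int $y, int $width, int $height): ?string
{
    $base = tempnam(sys_get_temp_dir(), 'ocr');
    if ($base === false) {
        return null;
    }
    $cropped = $base . '.png';
    $textFile = $base . '.txt'; // tesseract appends ".txt" to the output base name

    $text = null;
    if (cropRegion($source, $cropped, $x, $y, $width, $height)) {
        // Same command shape as in the question, applied to the cropped file.
        shell_exec('/usr/bin/tesseract ' . escapeshellarg($cropped) . ' ' . escapeshellarg($base) . ' -l eng');
        if (is_file($textFile)) {
            $text = file_get_contents($textFile);
        }
    }

    // Remove the temporary files again.
    foreach ([$base, $cropped, $textFile] as $file) {
        if (is_file($file)) {
            unlink($file);
        }
    }
    return $text === false ? null : $text;
}
```

A call such as ocrRegion('inv.jpg', 50, 120, 400, 80) would then return only the text found inside that rectangle, or null if cropping or OCR failed.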

Related

php-QRcode generation failed

I'm using PHP QR Code library to generate QR codes.
I included the library and, after fetching user information from the database, I am trying to create a QR code. I then return the path of the generated QR code to the front end so that I can pass it to an image tag and show it to users.
I am fetching the name, id, email and user image path from the database. I want to include the user image in the QR code, so I read its contents and encode them as a base64 string.
I'm not getting any errors, but when I check the folder, the QR code is not being saved.
require_once 'externalLibraries/qrcode/qrlib.php';
// how to build raw content - QRCode with Business Card (VCard) + photo
$tempDir = QRCODE_PATH; //saves temporary directory path
// we building raw data
$codeContents = 'BEGIN:VCARD'."\n";
$codeContents .= 'FN:'.$name."\n";
$codeContents .= 'ID:'.$id."\n";
$codeContents .= 'EMAIL:'.$email."\n";
$codeContents .= 'PHOTO;JPEG;ENCODING=BASE64:'.base64_encode(file_get_contents('../'.$userAvatar))."\n";
$codeContents .= 'END:VCARD';
// generating
QRcode::png($codeContents, $tempDir.$clientid.'.png', 4, 3);
// displaying
return QRCODE_PATH.$clientid.'.png';
Is this the way to generate qrcodes?

Your code is working for me. The image is saved at the specified location. I used placeholders for your variables, though. To display the image you can use:
$imgpath = QRCODE_PATH.$clientid.'.png';
$src = 'data:'.mime_content_type($imgpath).';base64,'.base64_encode(file_get_contents($imgpath));
echo '<img src="'.$src.'">';
Update:
As mentioned by RST in the comments and stated in this answer, a QR code can only hold a limited amount of data. The image you are using might simply be too large. Try generating the code without the image and see if it works. To answer your question in the comment: you can resize the image to make it smaller, but no other method will help, because the QR code size is limited. You could instead put a link to the image into the QR code.
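As a rough illustration of that advice, the sketch below builds the vCard with a link to the photo instead of the base64-encoded image and checks the payload length before generating anything. The 2953-byte figure is the byte-mode capacity of the largest QR code (version 40, error correction level L); higher error correction levels hold less. The function names, the VERSION line and the PHOTO;VALUE=uri form are assumptions added for this example, not part of the original code.

```php
<?php
// Largest byte-mode payload of a QR code: version 40, error correction level L.
const QR_MAX_BYTES = 2953;

// Hypothetical helper: vCard that links to the photo instead of embedding it.
function buildVCard(string $name, string $email, string $photoUrl): string
{
    return "BEGIN:VCARD\n"
        . "VERSION:3.0\n"
        . 'FN:' . $name . "\n"
        . 'EMAIL:' . $email . "\n"
        . 'PHOTO;VALUE=uri:' . $photoUrl . "\n"
        . 'END:VCARD';
}

// Hypothetical helper: true if the payload can fit into any QR code at all.
function fitsInQrCode(string $payload): bool
{
    return strlen($payload) <= QR_MAX_BYTES;
}
```

If fitsInQrCode($codeContents) returns false, no QR library can encode the data, which would explain why no file is written when a whole JPEG is embedded.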

Getting images from Directory API using PHP

I'm trying to use the Directory API to pull user information, which I'm able to do, but I can't seem to find any good examples on how to access user images.
I tried the following, which gets me a photo resource:
$photo = $service->users_photos->get($uid);
After that I found the utility method for web-safe base64 decoding and tried to access the data like so:
$data = Google_Utils::urlSafeB64Decode($data);
$data = base64_encode($data);
return "data:" . $mime . ";base64," . $data;
However, this produces a pixelated image. So I'm either missing a step or something is wrong with the data I'm getting. I can't seem to find any documentation on how to get this image using PHP, is there an official method for decoding the bytes received in photoData and converting them to an image?
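For reference, web-safe base64 differs from standard base64 only in using '-' and '_' in place of '+' and '/' and in usually omitting the '=' padding. The sketch below shows that conversion in plain PHP so the decoding step can be checked independently of the client library. The function names are hypothetical, and how the encoded string and MIME type are read from the photo resource is left out because the post does not show it.

```php
<?php
// Hypothetical helper: decode web-safe (URL-safe) base64 into raw bytes.
function webSafeBase64Decode(string $data): string
{
    // Map the URL-safe alphabet back to the standard one.
    $standard = strtr($data, '-_', '+/');
    // Restore the padding that web-safe encoders usually strip.
    $padding = (4 - strlen($standard) % 4) % 4;
    $decoded = base64_decode($standard . str_repeat('=', $padding), true);
    if ($decoded === false) {
        throw new InvalidArgumentException('Input is not valid web-safe base64.');
    }
    return $decoded;
}

// Hypothetical helper: build a data URI from web-safe base64 image data.
function photoDataToDataUri(string $webSafeData, string $mime): string
{
    return 'data:' . $mime . ';base64,' . base64_encode(webSafeBase64Decode($webSafeData));
}
```

If the decoded bytes open correctly when written to a file, the decoding is fine and the image quality is determined by the data the API returned.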

PDF form population with FPDM

What I'm trying to accomplish is populating a PDF form with PHP.
I tried many approaches and found that FPDM (FPDF) works well when I create a new form or use the form from the source file they provide.
My problem arises when I use an existing PDF form that has restrictions such as an owner password, and the document is signed and certified. I used an app to remove those restrictions, but some of them remain. The picture below shows what my current PDF looks like.
That PDF was also compressed, and because FPDM threw an error that 'Object Stream' is not supported, I decompressed it with PDFTK, so the file went from 1.48 MB to 6.78 MB.
I also used PDFTK to get all the form field names, so I have them in a txt file.
According to the FPDM instructions, there are two ways to do this:
The first way is to pass an array of field_name => value along with the PDF I want to change, and that's it. When I use the PDF described above, I get this error:
'FPDF-Merge Error: field form1[0].#subform[0].Line1_GivenName[0] not found'
Note that I have all the field names, and this name exists.
<?php
require('fpdm.php');
$fields = array(
'form1[0].#subform[0].Line1_GivenName[0]' => 'my name'
);
$pdf = new FPDM('test.pdf');
$pdf->Load($fields, false); // second parameter: false if field values are in ISO-8859-1, true if UTF-8
$pdf->Merge();
$pdf->Output('new_pdf.pdf', 'F');
?>
The other way is to create an FDF file with the createXFDF function and then use FPDM to merge the FDF into the PDF. This creates 'new_file.pdf' as I want, but the form is empty :)
function createXFDF($file, $info, $enc = 'UTF-8') {
    $data = '<?xml version="1.0" encoding="'.$enc.'"?>' . "\n" .
        '<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">' . "\n" .
        '<fields>' . "\n";
    foreach($info as $field => $val) {
        $data .= '<field name="' . $field . '">' . "\n";
        if(is_array($val)) {
            foreach( $val as $opt )
                $data .= '<value>' .
                    htmlentities( $opt, ENT_COMPAT, $enc ) .
                    '</value>' . "\n";
        } else {
            $data .= '<value>' .
                htmlentities( $val, ENT_COMPAT, $enc ) .
                '</value>' . "\n";
        }
        $data .= '</field>' . "\n";
    }
    $data .= '</fields>' . "\n" .
        '<ids original="' . md5( $file ) . '" modified="' .
        time() . '" />' . "\n" .
        '<f href="' . $file . '" />' . "\n" .
        '</xfdf>' . "\n";
    return $data;
}
require('fpdm.php');
$pdf = new FPDM('test.pdf', 'posted-0.fdf');
$pdf->Merge();
$pdf->Output('new_file.pdf', 'F');
One more thing, if I try to open FDF file in Acrobat I get a message
'The file you are attempting to open contains comments or form data that are supposed to be placed on test.pdf. This document cannot be found. It may have been moved, or deleted. Would you like to browse to attempt to locate this document?'
but the file is there, not moved or deleted. When I find it manually the form populates.
If anyone has experience with this, any help or advice would help a lot.
Thank you in advance, Vukasin
EDIT:
More info about the PDF file

I have spent more than a complete day working through issues with FPDM, and was hard pressed to find someone who had similar issues.
The following format worked for me: PDF 1.4 (Acrobat 5). I had to
actually go to Save As -> choose Adobe PDF Optimized, then click the
Settings button. From there I had to choose the version from the
drop-down/fly-out menu.
I received the error: 'not compatible with fast web view' or similar. If in the PDF Optimized settings option you click 'clean up' on the left side you can untoggle fast web view.
Now I am receiving the error, PDF-Merge Error: field 'fieldname' not found. When I run it through pdftk hoping to resolve this, I receive the error: FPDF-Merge Error: Number of objects (35) differs with number of xrefs (36), something , pdf xref table is corrupted :(
To fix this issue, I had to download and install the pdftk server utility on Ubuntu:
sudo apt-get install pdftk
After install, I ran this command to repair a PDF’s corrupted XREF table and stream lengths, if possible:
pdftk broken.pdf output fixed.pdf
When I open fixed.pdf it has no issues whatsoever and populates the fields correctly. Hallelujah this was the most annoying issue in the world. To summarize, I had to take the pdf and put it through the following steps:
Edit the PDF to preference
Save As > Adobe PDF Optimized > Settings > Acrobat 5 > uncheck Fast Web View under Clean Up
Open the file in pdftk and resave it
Install command-line pdftk on Ubuntu
Run the command: pdftk broken.pdf output fixed.pdf
Done.
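If the repair has to run on the server rather than by hand, the same pdftk command could be called from PHP. This is a minimal sketch that assumes pdftk is installed and on the PATH; buildPdftkRepairCommand() and repairPdf() are hypothetical helper names, and the only pdftk syntax used is the `input output result` form shown above.

```php
<?php
// Hypothetical helper: build the repair command shown above with safely quoted paths.
function buildPdftkRepairCommand(string $input, string $output): string
{
    return 'pdftk ' . escapeshellarg($input) . ' output ' . escapeshellarg($output);
}

// Hypothetical helper: run pdftk and report whether a repaired file was written.
function repairPdf(string $input, string $output): bool
{
    exec(buildPdftkRepairCommand($input, $output) . ' 2>&1', $lines, $exitCode);
    return $exitCode === 0 && is_file($output);
}
```

After repairPdf('broken.pdf', 'fixed.pdf') returns true, fixed.pdf can be passed to the FPDM constructor in place of the original file.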

After revealing the creator/producer info the problem is clear.
You do not have a real PDF form; you have an XFA form (created by LiveCycle Designer), wrapped in a PDF wrapper so that Adobe Reader can display it.
XFA forms do not support (X)FDF. You have to import data using XML. You can try to export the data from a filled version, and then use this as a sample for creating the import XML.
Note that the XML export/import format XFA forms use is not the same as XFDF (which is simply an XML representation of FDF, the PDF-native forms data format).

Thanks to Kyle's answer, I resolved a similar issue.
I have a pdf created in Adobe Acrobat Pro DC and need to populate its form fields from web input.
On my development machine, I created an fdf file from my web form data, and merged it into the pdf using pdftk (called from php with exec() ).
But I couldn't put that on the cloud linux webserver where my site is hosted because it requires deprecated libraries.
So I switched to FPDM and had the following errors:
'Fast Web View mode is not supported'. I fixed that by setting
preferences in 'save as' in Adobe Acrobat Pro (save as -> pdf
optimized -> settings -> clean up -> uncheck Fast Web View).
'Object streams are not supported' - again, fixed in the 'save as'
preferences (clean up -> object compression options -> remove
compression).
'Incremental updates are not supported' - again, fixed using 'save
as' in Acrobat.
Then FPDM ran, but couldn't read any field names.
The One-Step Solution:
Take the original PDF file, with incremental updates, object compression and fast web view, and pass it through pdftk on Windows, exactly as Kyle describes.
> pdftk broken.pdf output fixed.pdf
Now FPDM populates the fields correctly from the fdf file.

I found a tool called Scribus after examining the template used in the fpdf example. You can use it to create pdf templates and the format created plays nice with fpdm. It isn't a complicated program and allows you to create form fields with permissions/parameters around them (like making a form field read-only after you populate data in it from an online form). For my application, I needed to have some fields pre-populated from values in a database that were non-editable, have other fields that were pre-populated, but still editable and some fields that were empty and required completion (force required). It was all possible using the template that Scribus has generated.

MAGNIFICENT work!
I created a PDF with Acrobat, then "fixed" it with pdftk, and the FPDM class merged the data perfectly...

How to treat a PHP image generation script as an image

This is an odd question but I'm stuck on how I would achieve this and I am unable to find any methods of doing so.
I have a simple PHP script that takes variables (containing file names) from the URL, cleans them and then uses them to generate a single image from the supplied values. This works fine and outputs a new PNG to the webpage using:
imagepng($img);
I also have a Facebook sharing script in PHP that takes a file path as input and then shares the image on the user's feed. This statement defines the image variable:
$photo = './mypic.png'; // Path to the photo on the local filesystem
I don't know how I can link these two together though. I would like to use my generation script as the image to share.
Can anyone point me in the right direction of how to do this? I am not the master of PHP so go easy please.
-Tim
UPDATE
If it helps, here are the links to the two pages on my website containing the outputs. They are very rough, mind you:
The php script generating the image:
http://the8bitman.herobo.com/Download/download.php?face=a.png&color=b.png&hat=c.png
The html page with the img tag:
http://the8bitman.herobo.com/Share.html

Treat it as a simple image:
<img src="http://yourserver/yourscript.php?onlyImage=1" />
yourscript.php
if (!empty($_GET['onlyImage'])) {
    header('Content-Type: image/png'); // or your image content type
    // print only the image
} else {
    // print the image and text too
}
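Because the sharing script in the question expects a path on the local filesystem rather than a URL, another option is to write the generated image to a file and pass that path along. This is a sketch of an alternative, not part of the answer above; saveImageToFile() is a hypothetical helper, and it relies only on the fact that imagepng() writes to a file when given a second argument.

```php
<?php
// Hypothetical helper: write a GD image to a uniquely named PNG file and return its path.
function saveImageToFile($img, string $directory): string
{
    $path = rtrim($directory, '/') . '/' . uniqid('share_', true) . '.png';
    // With a second argument, imagepng() writes to the file instead of the output.
    if (!imagepng($img, $path)) {
        throw new RuntimeException('Could not write image to ' . $path);
    }
    return $path;
}
```

In the generation script this would replace the bare imagepng($img) call, for example $photo = saveImageToFile($img, sys_get_temp_dir()); the resulting $photo can then be handed to the sharing code.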

Store image in variable, echo later

Let's say I have a user enter the URL of an image.
After URL validation, etc. I want to get the image and store it in a PHP variable. Not the image path, but the actual image itself.
I am adding this image-holding variable in between other strings, so I cannot use header() before echoing the image. Is it possible to use <img> HTML tags? I really don't want to copy the images to my own server...
How do I:
Store the image in a variable such that,
Echo the image from the variable without changing headers.
Edit:
I said above that I am putting this image inside another variable, e.g.:
$str = "blah blah" . $var_holding_img . "more text";
Is it possible to insert something in the string above that will be replaced with the image? I can parse the variable $str later to replace some random text like "abg30j-as" with the image...

I found an answer to my own question:
First, I created another PHP file, called img.php:
<?php
$url = $_GET['imgurl'];
/*
Conduct image verification, etc.
*/
$img_ext = get_ext($url); //Create function "get_ext()" that gets file extension
header('Content-type: image/' . $img_ext);
echo file_get_contents($url);
?>
Then, in the original PHP file, I used this PHP code:
<?php
$var_holding_img = '<img src="img.php?imgurl=http://example.com/image.png"/>';
$string = "This is an image:<br />" . $var_holding_img . "<br />displayed dynamically with PHP.";
echo $string;
?>
This way, the PHP file "img.php" can use the proper headers and the image can be inserted as HTML into any other PHP variable.
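The get_ext() function is left for the reader to create. One possible implementation, offered as an assumption rather than the author's code, reads the extension from the path part of the URL and maps it to an image subtype, so that "jpg" becomes the valid "image/jpeg" and anything unexpected is rejected. The list of accepted extensions is illustrative.

```php
<?php
// One possible get_ext(): returns the image subtype for the Content-Type header,
// or null if the URL does not point at a recognised image extension.
function get_ext(string $url): ?string
{
    // Only look at the path, so query strings like "?size=2" are ignored.
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return null;
    }
    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    $subtypes = [
        'png'  => 'png',
        'gif'  => 'gif',
        'jpg'  => 'jpeg',
        'jpeg' => 'jpeg',
        'webp' => 'webp',
    ];
    return $subtypes[$ext] ?? null;
}
```

In img.php, a null result would be a reason to stop before sending any header or fetching the URL.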

How do I:
Store the image in a variable such that,
Echo the image from the variable without changing headers.
You can do this in two ways.
In way one, you serialize the image in a string, and save the string in the session. Which is exactly the same as saving it server side, except that now the session GC should take care of clearing it for you. Then the IMG SRC you use will redirect to a script that takes the image and outputs it as image with proper MIME type.
In way two, if the image is small enough, you can encode it as BASE64 and output it into a specially crafted IMG tag:
http://www.sweeting.org/mark/blog/2005/07/12/base64-encoded-images-embedded-in-html
This saves you some time in connection, also. Of course the image must be reasonably small.
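The second way can be sketched as follows. The image bytes are held in an ordinary PHP string and turned into an <img> tag with a data URI, so no header() call is needed. imageToImgTag() is a hypothetical helper, and the MIME type is assumed to be known by the caller.

```php
<?php
// Hypothetical helper: embed raw image bytes in an <img> tag using a base64 data URI.
function imageToImgTag(string $bytes, string $mime, string $alt = ''): string
{
    $src = 'data:' . $mime . ';base64,' . base64_encode($bytes);
    return '<img src="' . $src . '" alt="' . htmlspecialchars($alt, ENT_QUOTES) . '">';
}

// Usage, with the variable names from the question:
// $var_holding_img = imageToImgTag(file_get_contents($url), 'image/png');
// $str = "blah blah" . $var_holding_img . "more text";
```

Base64 makes the data about a third larger than the original file, which is why this only suits reasonably small images.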

You can't save the actual image in a variable. Either you save the URL, or you copy the image to your server (which you obviously don't want) and save the path to the image.
See answer 1: you can't echo the image itself, only link to it.
Edit: Okay, obviously you can save images directly to a variable, but I don't recommend doing this.

No, that isn't possible. If you want to serve something, it has to exist on the server.
