PHP: extract text from a Word file page by page

I want to extract the text from an MS Word file (e.g. .docx) so that I can use its content in my PHP script.
Is it possible to get all pages as a PHP array? For example:
$array = [
    0 => 'content here......',
    1 => 'content here......',
    2 => 'content here......',
];
where index 0 is page one, index 1 is page two, and so on.

I solved it myself using this class:
https://github.com/abhihub/EDUSITE/blob/master/DocxConversion.php
$docObj = new DocxConversion("example.docx");
$docText = $docObj->convertToText(1);
and I got the result I wanted.
I am posting this answer in case someone needs it in the future.
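For readers who cannot use that class, here is a minimal sketch of one way to get per-page text. It is not the linked class's code. It assumes the .docx stores page boundaries as explicit markers (<w:br w:type="page"/> or <w:lastRenderedPageBreak/>) in word/document.xml; a .docx has no fixed pages otherwise, so a document without these markers comes back as a single page, and a break recorded with both markers can produce an empty entry. The function names are made up for illustration, and docxToPages() needs PHP's zip extension.

```php
<?php
// Hypothetical helper: splits the raw XML of word/document.xml into page texts.
function splitDocumentXmlIntoPages(string $xml): array
{
    // Turn page-break markers into a form feed so they survive strip_tags().
    $xml = preg_replace(
        '~<w:br\b[^>]*w:type="page"[^>]*/>|<w:lastRenderedPageBreak\s*/>~',
        "\f",
        $xml
    );
    // End each paragraph with a newline so paragraphs do not run together.
    $xml = str_replace('</w:p>', "</w:p>\n", $xml);
    $text = html_entity_decode(strip_tags($xml), ENT_QUOTES | ENT_XML1, 'UTF-8');

    return array_map('trim', explode("\f", $text));
}

// Hypothetical wrapper: reads word/document.xml out of the .docx (a zip archive).
function docxToPages(string $path): array
{
    $zip = new ZipArchive();
    if ($zip->open($path) !== true) {
        throw new RuntimeException("Cannot open $path");
    }
    $xml = $zip->getFromName('word/document.xml');
    $zip->close();
    if ($xml === false) {
        throw new RuntimeException('word/document.xml not found');
    }

    return splitDocumentXmlIntoPages($xml);
}
```

Calling docxToPages('example.docx') would then return an array in which index 0 is the first page, as asked in the question.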

Related

How to extract the name from text in PHP

I want to extract names from a paragraph of text using PHP. I tried the following library:
https://packagist.org/packages/php-text-analysis/php-text-analysis
$text = "my name is maneesh, and my friend name is Paritosh";
$freqDist = freq_dist(tokenize($text));
print_r($freqDist); die;
My expected output is: maneesh, Paritosh
The actual result is only the frequency of each word:
(
[my] => 2
[name] => 2
[is] => 2
[maneesh] => 1
[and] => 1
[friend] => 1
[Paritosh] => 1
)

If you are going to use the library you mentioned, you have to train a model first. That means feeding it many examples of the ways people state their name. Even then, it won't be perfect; the accuracy depends on how well you trained the model.
Moreover, you are getting only word frequencies because that is the analysis you requested by calling freq_dist. I think you need corpus analysis for what you want.
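If the input always follows a fixed phrase such as "name is ...", a plain regular expression may be enough. The sketch below is a naive heuristic in plain PHP, not part of the library, and the function name is made up for illustration. It will miss names introduced in any other way.

```php
<?php
// Hypothetical helper: returns every word that directly follows "name is".
function extractNamesAfterNameIs(string $text): array
{
    preg_match_all('/\bname\s+is\s+(\p{L}+)/iu', $text, $matches);
    return $matches[1];
}

$text = "my name is maneesh, and my friend name is Paritosh";
echo implode(', ', extractNamesAfterNameIs($text)), PHP_EOL; // maneesh, Paritosh
```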

How to tidy up custom fields inputs in Wordpress

Hi all, I'm looking for a little help.
I've created a site that shows the recent form of soccer teams.
This works fine, and each letter is output by PHP as an image.
The problem is that the only way I could work out how to do it was to create a custom field of checkboxes in WordPress.
What would probably work better is a single text box on the back end where I could type in the form, like "WLDWW", and have the front end display it as necessary.
The problem is that I'm not sure where to start in PHP to read each individual letter I put into the text box and translate it into the image needed on the front end.
Any ideas?
Thanks in advance.

Take a look at str_split().
$string = "WLWLLW";
$result = str_split($string);
print_r($result);
This will output:
Array
(
[0] => W
[1] => L
[2] => W
[3] => L
[4] => L
[5] => W
)
Then you can iterate through the array and display as needed, if you want to use PHP. Of course, I don't know how you've implemented it exactly or how it uses Wordpress, so you may have to make some adjustments as needed.
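As a rough sketch of that iteration, the function below turns a form string into a run of image tags. The image file names, the /images/ path, and the function name are assumptions for illustration; how the string is read from the WordPress custom field is left out.

```php
<?php
// Hypothetical helper: maps each result letter to an <img> tag.
function renderForm(string $form): string
{
    // Assumed file names; replace with the real image paths.
    $images = ['W' => 'win.png', 'L' => 'loss.png', 'D' => 'draw.png'];
    $html = '';
    foreach (str_split(strtoupper(trim($form))) as $letter) {
        if (!isset($images[$letter])) {
            continue; // Skip anything that is not W, L or D.
        }
        $html .= '<img src="/images/' . $images[$letter] . '" alt="' . $letter . '">';
    }
    return $html;
}
```

Unknown letters are skipped silently here; depending on the site it may be better to report them.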

Removing portion from scraped array

I am currently scraping a website and trying to remove a portion of the content that I don't want included in the array.
This is the code I have so far:
$content['article'] = $html2->find('.hentry-content',0);
$content['article'] = $content['article']->plaintext;
This returns everything within the .hentry-content class on the website I am gathering content from.
The content that gets returned looks like this:
array (
[article] => This is some example filler content please no actual meaning behind random bridge for bridge random you dog tomorrow http://example.com/our-random-mp3.com
)
The output usually ends with a link to a random MP3. Is there any way to pull just the content portion of the array without the MP3 link?

If the link is inside an <a> tag, this should work. Run it on the element returned by find(), before it is replaced with its plaintext string:
foreach($content['article']->find('a') as $item) {
$item->outertext = '';
}
echo $content['article']->plaintext;

If the returned text contains only one link to the random MP3 file, you could try to filter it out with:
$url_pattern = '/^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$/';
$content['article'] = preg_replace($url_pattern, '', $content['article']->plaintext);
Note that this pattern is anchored with ^ and $, so it only matches when the whole string is a URL; to strip a URL that sits at the end of longer text, the anchors have to be dropped. I took the URL pattern from http://code.tutsplus.com/tutorials/8-regular-expressions-you-should-know--net-6149.
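A simpler, unanchored pattern can strip a URL that is embedded in longer text. This is a sketch that swaps in a different pattern from the one quoted above: it assumes every unwanted link starts with http:// or https:// and contains no spaces. The function name is made up for illustration.

```php
<?php
// Hypothetical helper: removes every http(s) URL and the whitespace before it.
function stripUrls(string $text): string
{
    return trim(preg_replace('~\s*https?://\S+~i', '', $text));
}

$article = 'random you dog tomorrow http://example.com/our-random-mp3.com';
echo stripUrls($article), PHP_EOL; // random you dog tomorrow
```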

create PHP array using fopen with append [duplicate]

I am having trouble figuring out how to write to a file using fopen's "a" (append) mode.
The file itself is a simple PHP array:
$array = array(
"entry1" => "blah blah",
"entry2" => "forbarbaz",
);
Simple enough. Using fopen with the second argument set to "a" should let me append to the file using fputs. The problem is the opening and closing lines, i.e. $array = array( and );
Without them, the file would look like this:
"entry1" => "blah blah",
"entry2" => "forbarbaz",
How would I rebuild this data into a working PHP array, assuming it is just a text file with a list of entries and no opening and closing lines? Sorry if this is not clear; it's a little complicated. No, I am not going to store these values in a DB; I need the speed advantage of holding these particular values in a file.
So the question really is: how do I pull in a text file with lines like this:
"entry1" => "blah blah",
"entry2" => "forbarbaz",
and end up with a usable PHP array?

Try this.
Start with a file that contains only the following:
<?php
$array = array();
This is a valid PHP file.
Then add new entries by appending lines like this:
$f = fopen('myarray.php', 'a');
fputs($f, PHP_EOL.'$array["entry1"] = "value1";');
fclose($f);
To use it, simply include('myarray.php'); afterwards, $array holds every appended entry.

Why not json_encode() the array when you store it in the file and then json_decode() the JSON back into an array when you read it from the file?
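A minimal sketch of the JSON approach follows. Because a JSON document cannot be extended by appending text in "a" mode, the sketch reads the file, adds the entry, and writes the whole array back. The helper names are made up for illustration.

```php
<?php
// Hypothetical helper: returns the stored array, or an empty one if nothing is stored yet.
function loadEntries(string $file): array
{
    if (!is_file($file)) {
        return [];
    }
    $data = json_decode(file_get_contents($file), true);
    return is_array($data) ? $data : [];
}

// Hypothetical helper: adds one entry and rewrites the file.
function addEntry(string $file, string $key, string $value): void
{
    $entries = loadEntries($file);
    $entries[$key] = $value;
    file_put_contents($file, json_encode($entries, JSON_PRETTY_PRINT), LOCK_EX);
}
```

For example, addEntry('data.json', 'entry1', 'blah blah') followed by loadEntries('data.json') gives back an array containing that entry.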

Maybe you're looking for the fseek function: http://se.php.net/fseek
When opening a file in r+ mode, the file pointer is at the beginning of the file. Sounds like you want to place it at the end of the file minus a few bytes, then write your new data.
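The fseek idea could look roughly like the sketch below. It assumes the file ends with exactly ");" followed by a single newline, and that the last existing entry ends with a comma; both are assumptions about the file layout. The function name is made up for illustration.

```php
<?php
// Hypothetical helper: overwrites the closing ");\n" with a new entry plus a fresh closing line.
function appendArrayEntry(string $file, string $key, string $value): void
{
    $closing = ");\n";
    $fh = fopen($file, 'r+');
    if ($fh === false) {
        throw new RuntimeException("Cannot open $file");
    }
    // Move to just before the closing line, counted from the end of the file.
    fseek($fh, -strlen($closing), SEEK_END);
    // var_export() quotes and escapes the strings as valid PHP literals.
    $entry = var_export($key, true) . ' => ' . var_export($value, true) . ",\n";
    fwrite($fh, $entry . $closing);
    fclose($fh);
}
```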

Untested. Try this (only on files you fully control, since eval executes whatever the file contains):
$data = file_get_contents('./data.txt', true);
$array = eval("return array(" . $data . ");");
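The same file can also be read without eval by matching each line with a regular expression. This sketch assumes one "key" => "value", pair per line and no escaped double quotes inside keys or values; the function name is made up for illustration.

```php
<?php
// Hypothetical helper: builds an array from lines of the form "key" => "value",
function parseEntries(string $text): array
{
    $array = [];
    preg_match_all(
        '/^\s*"([^"]*)"\s*=>\s*"([^"]*)"\s*,?\s*$/m',
        $text,
        $matches,
        PREG_SET_ORDER
    );
    foreach ($matches as $m) {
        $array[$m[1]] = $m[2]; // $m[1] is the key, $m[2] the value.
    }
    return $array;
}
```

It would be called as parseEntries(file_get_contents('./data.txt')). Lines that do not fit the pattern are ignored.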

Parsing CSV file into array with comments at top

Here is a template of a possible text file I might need to import into my database:
#NAME:"Test"
#REV:"rev1"
#PRODUCT:"product1","description1","option1"
#PRODUCT:"product2","description2","option1","option2"
"A1","key1","DALI"
"B1","key2",""
"B2","key3","option2"
"C1","key4",""
The first four lines are a new addition to the format of these files. Before the comment lines were added at the top, I was importing the comma-separated data successfully.
I was wondering if someone could show me the most efficient way to put all the values from the comment lines into PHP variables.
I always have a little trouble when it comes to regex, and I'm not sure how best to grab the lines starting with a #.
Essentially, I would like to have the following data available to me:
$csv['name']: "Test";
$csv['rev']: "rev1";
$csv['products']: array(
0 => array('name' => "product1", 'desc' => "description1", 'options' => "option1"),
1 => array('name' => "product2", 'desc' => "description2", 'options' => "option1,option2"),
);
$csv['data']: The rest of the data in text file
There could be multiple #PRODUCT lines defined, which is why it would be nice to have an array built from them.
Thanks for your help.

Are you using PHP 5.3 or later? If so, you can simply read the file line by line with fgets() and detect comment lines with substr($line, 0, 1). If the first character is not a #, it is a data line, so pass it on to str_getcsv().
Cheers
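Put together, that approach might look like the sketch below. It assumes each comment line has the form #KEY:values, that only NAME, REV and PRODUCT keys need handling, and that a product's options are everything after its second field. The function name is made up for illustration, and the ?? operator requires PHP 7 or later.

```php
<?php
// Hypothetical helper: splits "#KEY:..." header lines from the CSV data lines.
function parseCommentedCsv(string $path): array
{
    $csv = ['name' => null, 'rev' => null, 'products' => [], 'data' => []];
    $fh = fopen($path, 'r');
    if ($fh === false) {
        throw new RuntimeException("Cannot open $path");
    }
    while (($line = fgets($fh)) !== false) {
        $line = trim($line);
        if ($line === '') {
            continue;
        }
        if (substr($line, 0, 1) !== '#') {
            // Ordinary data line.
            $csv['data'][] = str_getcsv($line, ',', '"', '\\');
            continue;
        }
        // Comment line: text before the first colon is the key, the rest is CSV.
        list($key, $rest) = array_pad(explode(':', substr($line, 1), 2), 2, '');
        $fields = str_getcsv($rest, ',', '"', '\\');
        switch (strtoupper($key)) {
            case 'NAME':
                $csv['name'] = $fields[0];
                break;
            case 'REV':
                $csv['rev'] = $fields[0];
                break;
            case 'PRODUCT':
                $csv['products'][] = [
                    'name' => $fields[0],
                    'desc' => $fields[1] ?? '',
                    'options' => implode(',', array_slice($fields, 2)),
                ];
                break;
        }
    }
    fclose($fh);
    return $csv;
}
```

Here $csv['data'] holds the parsed rows rather than the raw text, which is usually what an import step needs next.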

To match a line that starts with #, put ^ at the beginning of the regular expression (outside any group), for example /^#/.
