Search all php files for a string in between a class - php

I am using a simple PHP translation class in more than 2,000 PHP files. New strings keep being added, so I need an up-to-date text file containing all the translation strings.
I need to collect every translated value from each PHP file and save them to a text file without duplicates.
Translation class
<?php $translate->__('Calendar'); ?>
So I need to get Calendar saved into a txt file and this should be done for all the files in all folders.
Everything in between $translate->__(' and ') should be saved.
The code below is not working for some reason.
$fn = $_SERVER['DOCUMENT_ROOT']."/apps/test/test2/calendar.php";
$handle = fopen($fn, 'r');
$valid = false;
$search = "\/\\$translate\\-\\>__\\(\\'(.*?)'\\)\/g";
while (($buffer = fgets($handle)) !== false) {
if(preg_match_all($search, $buffer, $m)) {
print $m[1];
} else {
}
}
fclose($handle);

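You can extract the strings with this pattern (the trailing g is the regex101 "global" flag; PHP does not accept it, because preg_match_all() already returns every match):
/\$translate\-\>__\(\'(.*?)'\)/g
Extract all of the matched items and save them anywhere you like.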
Demo and Details : https://regex101.com/r/LzMyJY/1
$fn = $_SERVER['DOCUMENT_ROOT']."/apps/test/test2/calendar.php";
$handle = fopen($fn, 'r');
// No "g" modifier here: PHP rejects it, and preg_match_all() is already global
$search = "/\\".'$'."translate\\-\\>__\\(\\'(.*?)'\\)/";
while (($buffer = fgets($handle)) !== false) {
if(preg_match_all($search, $buffer, $m)) {
// $m[1] is an array holding the captured strings
print implode("\n", $m[1]) . "\n";
}
}
fclose($handle);
Note:
When a regex pattern is written inside double quotes ("..."), every backslash must be escaped (change each \ to \\).
Inside single quotes ('...'), leave the backslashes as they are (only \' and \\ are special there).
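The question asks for every file in every folder and for a list without duplicates, which the single-file loop above does not cover. Here is a minimal sketch of one way to do that, assuming the files all end in .php and the strings are single-quoted. The helper name collectTranslationStrings() is hypothetical and not part of the original answer.

```php
<?php
// Hypothetical helper (an illustration, not the original answer's code):
// walks $rootDir recursively and returns each distinct string passed to
// $translate->__('...') in any .php file.
function collectTranslationStrings(string $rootDir): array
{
    // Single-quoted PHP string, so the backslashes reach the regex engine as written.
    $pattern = '/\$translate->__\(\'(.*?)\'\)/';
    $found = [];

    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($rootDir, FilesystemIterator::SKIP_DOTS)
    );

    foreach ($files as $file) {
        if (!$file->isFile() || strtolower($file->getExtension()) !== 'php') {
            continue; // skip directories and non-PHP files
        }
        $source = file_get_contents($file->getPathname());
        if ($source !== false && preg_match_all($pattern, $source, $m)) {
            // $m[1] holds the captured strings for this file
            $found = array_merge($found, $m[1]);
        }
    }

    // Drop duplicates and reindex
    return array_values(array_unique($found));
}
```

The result could then be written out with something like file_put_contents('translations.txt', implode(PHP_EOL, collectTranslationStrings($dir)) . PHP_EOL), where translations.txt and $dir are placeholder names. Strings containing an escaped quote (\') would be cut short by this pattern.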

Related

Regex PHP - Get specific text including new lines and multiple spaces

I am trying to extract the text from css:{ (some text...) } up to the closing bracket only, excluding the text below it, from another text file using PHP.
test.sample
just a sample text
css:{
"css/test.css",
"css/test2.css"
}
sample:text{
}
I used the VS Code/Sublime search-and-replace tool to test my regex, and it works there: I get the text I want, including all the newlines and spaces inside. But when I apply it in PHP, the regex doesn't work; it cannot find the text I'm looking for.
Here is my code:
myphp.php
$file = file_get_contents("src/page/test.sample");
echo $file . "<br>";
if (preg_match_all("/(css\s*\n*:\s*\n*\{\s*\n*)+((.|\n\S)*|(.|\n\s)*)(\n*)(\}\W)$/", $file)) {
echo "Success";
} else {
echo "Failed!";
}
This is the regex I created:
(css\s*\n*:\s*\n*{\s*\n*)+((.|\n\S)|(.|\n\s))(\n*)(}\W)$
Please help me. I'm open to any suggestion; I'm a newbie with regular expressions and I'm lacking knowledge of the logic behind them.
Thanks.
Try this, my friend:
<?php
$file = "testfile.php"; // path of the file to read
$f = fopen($file, 'rb'); // open the file
$found = false;
while (($line = fgets($f, 1000)) !== false) { // read every line of the file
if ($found) {
echo $line;
continue;
}
if (strpos($line, "css:") !== FALSE) { // once 'css:' is found, print everything after it
$found = true;
}
}
fclose($f); // close the file
Hey guys, I found the solution, based on the answer by @alex!
I don't really know if I am implementing this right.
Here is my code:
$src = "src/page/darwin.al"; // the source file
$file = fopen($src,"rb"); // 'rb' opens the file for reading in binary mode (no newline translation)
$found = false;
$css = false;
while(($line = fgets($file)) !== false){ // read every line of text into $line
if(strpos(preg_replace("/\s+/", "", $line), "css:{") === 0){ // the current line starts with 'css:{', the block I'm looking for
$found = true;
$css = true;
}elseif($css && strpos(preg_replace("/\s+/", "", $line),"}") === 0){ // still inside the css block and found the closing '}'
echo $line;
break;
}
if ($found) {
echo preg_replace("/\s+/", "", $line); // remove every whitespace character
}
}
fclose($file); // close the file
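For comparison, the same block can probably be pulled out with a single regular expression once the whole file is in a string. This is a sketch under the assumption that the css block contains no nested braces. extractCssBlock() is a hypothetical name, not something from the answers above.

```php
<?php
// Hypothetical helper: returns the body of the first css:{ ... } block,
// or null when no such block exists. Assumes no nested braces inside the block.
function extractCssBlock(string $text): ?string
{
    // "s" modifier: "." also matches newlines.
    // ".*?" is lazy, so the match stops at the first closing brace.
    if (preg_match('/css\s*:\s*\{(.*?)\}/s', $text, $m)) {
        return trim($m[1]);
    }
    return null;
}
```

With the test.sample content from the question, extractCssBlock(file_get_contents("src/page/test.sample")) should return the two quoted paths without the surrounding css:{ and }.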

Search for text in a 5GB+ file then get whole line

I want to search for the text Hello (example) in a TXT file whose size is 5GB+ then return the whole line.
I've tried using SplFileObject but what I know is that the line number is required to use SplFileObject, like that:
$linenumber = 2094;
$file = new SplFileObject('myfile.txt');
$file->seek($linenumber-1);
echo $file->current();
But as previously mentioned, I want to search for a string then get the whole line, I don't know the line number.
Any help would be appreciated.
This should work:
<?php
$needle = 'Hello'; // strpos() is case-sensitive; use stripos() to ignore case
$count = 1;
$handle = fopen("inputfile.txt", "r");
if ($handle) {
while (($line = fgets($handle)) !== false) {
// process the line read.
$pos = strpos($line, $needle);
if ($pos !== false) {
echo $line . PHP_EOL;
echo "in line: ".$count . PHP_EOL;
break;
}
$count++;
}
fclose($handle);
} else {
// error opening the file.
}
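Since the question mentions SplFileObject, here is a sketch showing that it can also be iterated line by line without knowing the line number in advance, so only one line is held in memory at a time. findLines() is a hypothetical helper name.

```php
<?php
// Hypothetical helper: lazily yields lineNumber => line for every line
// of $path that contains $needle (case-sensitive).
function findLines(string $path, string $needle): Generator
{
    $file = new SplFileObject($path, 'r');
    // Iterating an SplFileObject reads one line at a time; keys are 0-based.
    foreach ($file as $index => $line) {
        if (is_string($line) && strpos($line, $needle) !== false) {
            yield $index + 1 => rtrim($line, "\r\n");
        }
    }
}
```

A caller could loop with foreach (findLines('myfile.txt', 'Hello') as $no => $line) and break after the first hit if only one line is needed.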
This is the answer that I can use. Thanks a lot to @user3783243.
For Linux:
exec('grep "Hello" myfile.txt', $return);
For Windows:
exec('findstr "Hello" "myfile.txt"', $return);
Now $return should contain the whole line.
Unfortunately, this doesn't work if the exec() and system() functions are disabled by your server administrator in php.ini. But for me it works fine.
If someone has a better solution, I'd be glad to know it :)
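If the search text or file name ever comes from user input, the grep call is safer with escaped arguments. This is a minimal sketch for the Linux case only, assuming grep is on the PATH and exec() is enabled. grepLines() is a hypothetical wrapper name.

```php
<?php
// Hypothetical wrapper around grep: returns every line of $path containing $needle.
// Assumes a Linux host with grep available and exec() enabled.
function grepLines(string $needle, string $path): array
{
    // -F treats the needle as a fixed string rather than a regex;
    // "--" ends option parsing so a needle starting with "-" is not read as a flag.
    $cmd = 'grep -F -- ' . escapeshellarg($needle) . ' ' . escapeshellarg($path);
    exec($cmd, $lines, $status);
    // grep exits with 0 when at least one line matched
    return $status === 0 ? $lines : [];
}
```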

Replace a particular line in a text file using php?

I have a text file that stores last name, first name, address, state, etc. as a string with a | delimiter, with each record on a separate line.
Storing each record on a new line works fine; however, now I need to go back and update the name or address on a particular line, and I can't get it to work.
The question "How to replace a particular line in a text file using PHP?" helped me, but I am not quite there yet: my version overwrites the whole file and I lose the records. Any help is appreciated!
After some edits it seems to be working now. I am debugging to check for errors.
$string= implode('|',$contact);
$reading = fopen('contacts.txt', 'r');
$writing = fopen('contacts.tmp', 'w');
$replaced = false;
while (($line = fgets($reading)) !== false) {
if (stripos($line, $lname) !== false && stripos($line, $fname) !== false) {
// fgets() keeps the trailing newline, so the replacement needs one too
$line = $string . PHP_EOL;
$replaced = true;
}
fwrite($writing, $line);
}
fclose($reading); fclose($writing);
// might as well not overwrite the file if we didn't replace anything
if ($replaced)
{
rename('contacts.tmp', 'contacts.txt');
} else {
unlink('contacts.tmp');
}
It seems that your file is in a CSV-like format. PHP can handle this with fgetcsv() http://php.net/manual/de/function.fgetcsv.php
$data = array();
if (($handle = fopen("contacts.txt", "r")) !== FALSE) {
// fgetcsv() returns one record per call, so loop to read them all
while (($row = fgetcsv($handle, 1000, '|')) !== FALSE) {
$data[] = $row;
}
fclose($handle);
/* manipulate $data array here */
}
This gives you an array that you can manipulate. Afterwards you can save the file with fputcsv http://www.php.net/manual/de/function.fputcsv.php
$fp = fopen('contacts.tmp', 'w');
foreach ($data as $fields) {
fputcsv($fp, $fields, '|'); // keep the | delimiter
}
fclose($fp);
Well, after the comment by Asad, there is another simple answer. Just open the file in Append-mode http://de3.php.net/manual/en/function.fopen.php :
$writing = fopen('contacts.tmp', 'a');
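For a contacts file that fits in memory, the replace can also be sketched with file() and file_put_contents(). This assumes the field order from the question (last name first, first name second) and matches those two fields exactly rather than searching the whole line. replaceContact() is a hypothetical helper name.

```php
<?php
// Hypothetical helper: replaces the first record whose last name and first name
// match (case-insensitively). Assumes "lastname|firstname|..." on each line.
function replaceContact(string $path, string $lname, string $fname, array $contact): bool
{
    $lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($lines === false) {
        return false;
    }

    $replaced = false;
    foreach ($lines as $i => $line) {
        $fields = explode('|', $line);
        if (count($fields) >= 2
            && strcasecmp($fields[0], $lname) === 0
            && strcasecmp($fields[1], $fname) === 0) {
            $lines[$i] = implode('|', $contact);
            $replaced = true;
            break; // only the first matching record is replaced
        }
    }

    // Rewrite the file only when something actually changed
    if ($replaced) {
        file_put_contents($path, implode(PHP_EOL, $lines) . PHP_EOL, LOCK_EX);
    }
    return $replaced;
}
```

Comparing whole fields avoids the false positives that stripos() on the full line can produce, for example a last name that also appears inside an address.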

PHP equivalent of Perl's TIE

Is there an equivalent to Perl's tie in PHP? I'd like to check whether a string (a single word) is in a file where each line is a single word/string. If it is, I'd like to remove the entry. This would be very easy to do in Perl, but I'm unsure how to do it in PHP.
How you remove the entry depends on how you read the file.
Using fgets:
$filtered = "";
$handle = fopen("/file.txt", "r");
if ($handle) {
// Read file line-by-line
while (($buffer = fgets($handle)) !== false) {
if (strpos($buffer, "replaceMe") === false)
$filtered .= $buffer;
}
fclose($handle);
}
Using file_get_contents:
function filterArray($value){
// keep only the lines that do not contain the word
return (strpos($value, "replaceMe") === false);
}
// Read file into a string
$string = file_get_contents('input.txt');
$array = explode("\n", $string);
$filtered = array_filter($array, "filterArray");
Using file:
function filterArray($value){
// keep only the lines that do not contain the word
return (strpos($value, "replaceMe") === false);
}
// Read file into array (each line as an element)
$array = file('input.txt', FILE_IGNORE_NEW_LINES);
$filtered = array_filter($array, "filterArray");
Note: Each method assumes that you want to remove the entire entry if it contains the word.
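None of the variants above writes the result back to the file, and all of them match substrings. Here is a sketch of the full round trip with exact-line matching, assuming one word per line as in the question. removeWord() is a hypothetical helper name.

```php
<?php
// Hypothetical helper: removes every line that equals $word exactly and
// rewrites the file. Returns true when at least one line was removed.
function removeWord(string $path, string $word): bool
{
    $lines = file($path, FILE_IGNORE_NEW_LINES);
    if ($lines === false) {
        return false;
    }

    // Keep every line that is not exactly the word
    $kept = array_values(array_filter($lines, function ($line) use ($word) {
        return $line !== $word;
    }));

    if (count($kept) === count($lines)) {
        return false; // the word was not in the file, nothing to rewrite
    }

    $contents = $kept ? implode(PHP_EOL, $kept) . PHP_EOL : '';
    file_put_contents($path, $contents, LOCK_EX);
    return true;
}
```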

Extract text from doc and docx

I would like to know how I can read the contents of a .doc or .docx file. I'm using a Linux VPS and PHP, but if there is a simpler solution in another language, please let me know, as long as it works on a Linux web server.
Here is a solution for getting the text from .doc and .docx Word files:
How to extract text from word file .doc,docx php
For .doc
private function read_doc() {
$fileHandle = fopen($this->filename, "r");
$line = @fread($fileHandle, filesize($this->filename));
$lines = explode(chr(0x0D),$line);
$outtext = "";
foreach($lines as $thisline)
{
$pos = strpos($thisline, chr(0x00));
if (($pos !== FALSE)||(strlen($thisline)==0))
{
} else {
$outtext .= $thisline." ";
}
}
$outtext = preg_replace("/[^a-zA-Z0-9\s\,\.\-\n\r\t@\/\_\(\)]/","",$outtext);
return $outtext;
}
For .docx
private function read_docx(){
$striped_content = '';
$content = '';
$zip = zip_open($this->filename);
if (!$zip || is_numeric($zip)) return false;
while ($zip_entry = zip_read($zip)) {
if (zip_entry_open($zip, $zip_entry) == FALSE) continue;
if (zip_entry_name($zip_entry) != "word/document.xml") continue;
$content .= zip_entry_read($zip_entry, zip_entry_filesize($zip_entry));
zip_entry_close($zip_entry);
}// end while
zip_close($zip);
$content = str_replace('</w:r></w:p></w:tc><w:tc>', " ", $content);
$content = str_replace('</w:r></w:p>', "\r\n", $content);
$striped_content = strip_tags($content);
return $striped_content;
}
This is a .DOCX solution only. For .DOC or .PDF you'll need to use something else like pdf2text.php for PDF
function docx2text($filename) {
return readZippedXML($filename, "word/document.xml");
}
function readZippedXML($archiveFile, $dataFile) {
// Create new ZIP archive
$zip = new ZipArchive;
// Open received archive file
if (true === $zip->open($archiveFile)) {
// If done, search for the data file in the archive
if (($index = $zip->locateName($dataFile)) !== false) {
// If found, read it to the string
$data = $zip->getFromIndex($index);
// Close archive file
$zip->close();
// Load XML from a string
// Skip errors and warnings
$xml = new DOMDocument();
$xml->loadXML($data, LIBXML_NOENT | LIBXML_XINCLUDE | LIBXML_NOERROR | LIBXML_NOWARNING);
// Return data without XML formatting tags
return strip_tags($xml->saveXML());
}
$zip->close();
}
// In case of failure return empty string
return "";
}
echo docx2text("test.docx"); // Save this contents to file
Parse .docx, .odt, .doc and .rtf documents
I wrote a library that parses the docx, odt and rtf documents based on answers here and elsewhere.
The major improvement I have made to the .docx and .odt parsing is that the library processes the XML that describes the document and attempts to conform it to HTML tags, i.e. em and strong tags. This means that if you're using the library for a CMS, text formatting is not lost.
You can get it here
My solution is Antiword for .doc and docx2txt for .docx
Assuming a linux server that you control, download each one, extract then install. I installed each one system wide:
Antiword: make global_install
docx2txt: make install
Then to use these tools to extract the text into a string in php:
//for .doc
$text = shell_exec('/usr/local/bin/antiword -w 0 ' .
escapeshellarg($docFilePath));
//for .docx
$text = shell_exec('/usr/local/bin/docx2txt.pl ' .
escapeshellarg($docxFilePath) . ' -');
docx2txt requires perl
no_freedom's solution does extract text from docx files, but it can butcher whitespace. Most files I tested had instances where words that should be separated had no space between them. Not good when you want to full text search the documents you're processing.
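One likely cause of the missing spaces is that strip_tags() joins the text of adjacent paragraphs with nothing in between. Here is a sketch that walks the paragraph elements instead and puts a newline between them, assuming the zip and dom extensions are available. docxParagraphs() is a hypothetical name, and tabs and manual line breaks inside a paragraph are ignored here.

```php
<?php
// Hypothetical helper: returns the text of a .docx file with one line per paragraph.
// Returns an empty string when the archive or its document.xml cannot be read.
function docxParagraphs(string $path): string
{
    $zip = new ZipArchive();
    if ($zip->open($path) !== true) {
        return '';
    }
    $xml = $zip->getFromName('word/document.xml');
    $zip->close();
    if ($xml === false) {
        return '';
    }

    $dom = new DOMDocument();
    if (!@$dom->loadXML($xml, LIBXML_NONET)) {
        return '';
    }

    // WordprocessingML namespace used by <w:p> (paragraph) and <w:t> (text run)
    $ns = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main';
    $paragraphs = [];
    foreach ($dom->getElementsByTagNameNS($ns, 'p') as $p) {
        $text = '';
        foreach ($p->getElementsByTagNameNS($ns, 't') as $t) {
            $text .= $t->textContent;
        }
        $paragraphs[] = $text;
    }
    return implode("\n", $paragraphs);
}
```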
Try ApachePOI. It works well for Java. I suppose you won't have any difficulties installing Java on Linux.
I would suggest extracting the text with Apache Tika; it can extract content from many file types, such as .doc/.docx, PDF and many others.
I used docx2txt to extract the content of a .docx file. My code is as follows:
if($extention == "docx")
{
$docxFilePath = "/var/www/vhosts/abc.com/httpdocs/writers/filename.docx";
$content = shell_exec('/var/www/vhosts/abc.com/httpdocs/docx2txt/docx2txt.pl ' .
escapeshellarg($docxFilePath) . ' -');
}
I made a few small improvements to the .doc-to-text converter function:
private function read_doc() {
$line_array = array();
$fileHandle = fopen( $this->filename, "r" );
$line = @fread( $fileHandle, filesize( $this->filename ) );
$lines = explode( chr( 0x0D ), $line );
$outtext = "";
foreach ( $lines as $thisline ) {
$pos = strpos( $thisline, chr( 0x00 ) );
if ( $pos !== false ) {
} else {
$line_array[] = preg_replace( "/[^a-zA-Z0-9\s\,\.\-\n\r\t@\/\_\(\)]/", "", $thisline );
}
}
return implode("\n",$line_array);
}
Now it preserves empty rows, so the text file keeps the document's row-by-row layout.
You can use Apache Tika as a complete solution; it provides a REST API.
Another good library is RawText, which can run OCR over images and extract text from any document. It's non-free, and it works over a REST API.
Sample code for extracting your file with RawText:
$result = $rawText->extract($your_file);
