How can I remove certain lines in a multiline string? - php

My code is receiving a string which I have no control of, which I'll call $my_string. The string is the contents of a transcript. If I echo the string, like so:
echo $my_string;
I can see the contents, which look something like this.
1
00:00:00.000 --> 00:00:04.980
[MUSIC]
2
00:00:04.980 --> 00:00:08.120
Hi, my name is holl and I am here
to write some PHP.
3
00:00:08.120 --> 00:00:10.277
You can see my screen, here.
What I'd like to do is run this through a function so it's just the actual words spoken, removing all the lines that signify time, or the order.
[MUSIC]
Hi, my name is holl and I am here
to write some php.
You can see my screen, here.
My idea is to explode the whole string by the line break and try to detect which lines are either empty or start with a number, like so...
$lines = explode("\n", $my_string);
foreach ($lines as $line) {
if (is_numeric(line[0]) || empty(line[0]) ) {
continue;
}
$exclude[] = $line;
}
$transcript = implode("\n", $exclude);
But the result of this action is exactly the same: the output still has numbers and blank lines. I clearly misunderstand something, but what is it? And is there a better way of trying to achieve my goal?
Thanks!
EDIT: Removed an echo where I wasn't actually using one in my code.

The problem is the test on line[0]: the $ is missing, and indexing only looks at the first character instead of the whole line:
$lines = explode("\n", $my_string);
foreach ($lines as $line) {
if (is_numeric(line[0]) || empty(line[0]) ) {//index usage?
continue;
}
$exclude[] = $line;
}
$transcript = echo implode("\n", $exclude); //remove echo
replace by:
$lines = explode("\n", $my_string);
foreach ($lines as $line) {
if (is_numeric($line) || empty($line) ) {//here
continue;
}
$exclude[] = $line;
}
$transcript = implode("\n", $exclude);
You also need regex matching to remove the 00:00:00.000 --> 00:00:04.980 fragments.
You can combine all three checks into a single pattern:
if(preg_match('/^(|\d+|\d+:\d+:\d+\.\d+\s+-->\s+\d+:\d+:\d+\.\d+)$/',$line)) { //regex
The complete loop, which takes all possibilities into account:
$lines = explode("\n", $my_string);
foreach ($lines as $line) {
if(preg_match('/^(|\d+|\d+:\d+:\d+\.\d+\s+-->\s+\d+:\d+:\d+\.\d+)$/',$line)) {
continue;
}
$exclude[] = $line;
}
$transcript = implode("\n", $exclude);
echo $transcript;
Example (with php -a):
$ php -a
php > $my_string='1
php ' 00:00:00.000 --> 00:00:04.980
php ' [MUSIC]
php '
php ' 2
php ' 00:00:04.980 --> 00:00:08.120
php ' Hi, my name is holl and I am here
php ' to write some PHP.
php '
php ' 3
php ' 00:00:08.120 --> 00:00:10.277
php ' You can see my screen, here.';
php > $lines = explode("\n", $my_string);
php > foreach ($lines as $line) {
php { if(preg_match('/^(|\d+|\d+:\d+:\d+\.\d+\s+-->\s+\d+:\d+:\d+\.\d+)$/',$line)) {
php { continue;
php { }
php { $exclude[] = $line;
php { }
php > $transcript = implode("\n", $exclude);
php > echo $transcript;
[MUSIC]
Hi, my name is holl and I am here
to write some PHP.
You can see my screen, here.

Your code almost works. You just forgot the $ in line[0], and " " is not empty().
$my_string = <<< EOF
1
00:00:00.000 --> 00:00:04.980
[MUSIC]
2
00:00:04.980 --> 00:00:08.120
Hi, my name is holl and I am here
to write some PHP.
3
00:00:08.120 --> 00:00:10.277
You can see my screen, here.
EOF;
$lines = explode("\n", $my_string);
foreach ($lines as $line) {
$temp = substr(trim($line), 0, 1); // first visible character; safe on empty lines
if (is_numeric($temp) || empty($temp) ) {
continue;
}
$exclude[] = $line;
}
$transcript = implode("\n", $exclude);
echo $transcript;
Result:
[MUSIC]
Hi, my name is holl and I am here
to write some PHP.
You can see my screen, here.

It looks like there is a pattern: in every group of four lines, the first and second contain metadata, the third is text, and the fourth is empty. If that is indeed the case, it should be trivial. You don't have to check the content at all; just grab the third line of every quartet:
$lines = explode("\n", $my_string);
$texts = array();
for ($i = 0; $i < count($lines); $i++) {
if ($i % 4 == 2) { // Index of third line is 2, of course.
$texts[] = $lines[$i]; // $i, not the bare constant i
}
}
$transcript = implode("\n", $texts); // separator first; the reversed order was removed in PHP 8.0
Alternatively, because (as you rightly mentioned) there can be more than one line of text, you could say that the blocks, or entries, whatever you call them, are separated by an empty line. Each block starts with two lines of metadata, followed by zero or more lines of text. With that logic you could parse it like this:
$lines = explode("\n", $my_string);
$texts = array();
$linenr = 0;
foreach ($lines as $line) {
// Keep track of this line's position within its block (an empty line resets it).
if ($line === '')
$linenr = 0;
else
$linenr++;
// Skip the first two lines of a block.
if ($linenr > 2)
$texts[] = $line;
}
$transcript = implode("\n", $texts); // separator first; the reversed order was removed in PHP 8.0
I don't know this particular format, but if I wanted to do this, I would look for a structural pattern like this rather than inspect the content of the lines themselves. It looks like a script or subtitle file, and if you want to turn it into a transcript, it would be a shame if somebody shouted '300' and it was not transcribed.

To remove these lines, try preg_replace with a regular expression.
PHP manual: http://php.net/manual/en/function.preg-replace.php
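A minimal sketch of that suggestion, assuming the transcript uses the cue-number and timestamp layout shown in the question. The helper name stripCueLines is made up for illustration; it is not a PHP built-in.

```php
<?php
// Hypothetical helper: removes cue numbers, timestamp lines and blank lines
// from a subtitle-style transcript with a single preg_replace() call.
function stripCueLines(string $text): string
{
    // m modifier: ^ matches at the start of every line.
    // The optional group matches a bare number or a
    // "hh:mm:ss.mmm --> hh:mm:ss.mmm" line; leaving it empty matches a blank line.
    // \R consumes the line break (\n, \r\n or \r); \z covers the end of the string.
    $pattern = '/^(?:\d+|\d+:\d+:\d+\.\d+\s+-->\s+\d+:\d+:\d+\.\d+)?[ \t]*(?:\R|\z)/m';
    // preg_replace() returns null on error; fall back to the input in that case.
    return rtrim(preg_replace($pattern, '', $text) ?? $text);
}

$my_string = "1\n00:00:00.000 --> 00:00:04.980\n[MUSIC]\n\n2\n"
    . "00:00:04.980 --> 00:00:08.120\nHi, my name is holl and I am here\n"
    . "to write some PHP.\n";
echo stripCueLines($my_string), "\n";
```

Like the loop-based answers, this still drops a spoken line that consists only of digits.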

Related

Get value from file - php

Let's say I have this in my text file:
Author:MJMZ
Author URL:http://abc.co
Version: 1.0
How can I get the string "MJMZ" if I look for the string "Author"?
I already tried the solution from another question (Php get value from text file) but with no success.
The problem may be the strpos function. In my case, the word "Author" appears twice (in "Author" and in "Author URL"), so strpos alone can't solve my problem.
Split each line at the : using explode, then check if the prefix matches what you're searching for:
$lines = file($filename, FILE_IGNORE_NEW_LINES);
foreach($lines as $line) {
list($prefix, $data) = explode(':', $line, 2); // limit 2: later ":" characters stay inside $data
if (trim($prefix) == "Author") {
echo $data;
break;
}
}
Try the following:
$file_contents = file_get_contents('myfilename.ext');
preg_match('/^Author\s*\:\s*([^\r\n]+)/m', $file_contents, $matches); // m: ^ matches at the start of any line
$code = isset($matches[1]) && !empty($matches[1]) ? $matches[1] : 'no-code-found';
echo $code;
Now $matches[1] should contain MJMZ.
The above searches for the first instance of Author:CODE_HERE in your file and places CODE_HERE in $matches[1].
More specifically, the regex searches for a string that starts with the word Author, followed by optional whitespace \s*, followed by a colon \:, followed by optional whitespace \s*, followed by one or more characters that are not line breaks [^\r\n]+.
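To see such a pattern at work on the sample contents, here is a small sketch. The function name findHeaderValue is hypothetical, and two things differ from the pattern explained above: the m modifier is used so that ^ matches at the start of every line, and [ \t]* replaces \s* so that the match cannot run past a line break.

```php
<?php
// Hypothetical helper: returns the value that follows "$key:" at the start
// of a line, or null when no such line exists.
function findHeaderValue(string $contents, string $key): ?string
{
    // preg_quote() escapes any regex metacharacters in the key.
    $pattern = '/^' . preg_quote($key, '/') . '[ \t]*:[ \t]*([^\r\n]+)/m';
    if (preg_match($pattern, $contents, $matches) === 1) {
        return trim($matches[1]);
    }
    return null;
}

// "Author URL" comes first on purpose: it must not be mistaken for "Author".
$sample = "Author URL:http://abc.co\nAuthor:MJMZ\nVersion: 1.0\n";
echo findHeaderValue($sample, 'Author'), PHP_EOL;     // MJMZ
echo findHeaderValue($sample, 'Author URL'), PHP_EOL; // http://abc.co
```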
If your file will have dynamically added items, you can read it into an array.
$content = file_get_contents("myfile.txt");
$line = explode("\n", $content);
$item = array(); // "new Array()" is not valid PHP
foreach($line as $l){
    // Split at the first ":" only, so values such as URLs keep their own colons.
    $var = explode(":", $l, 2);
    if (count($var) < 2) {
        continue; // skip lines without a ":"
    }
    $item[trim($var[0])] = trim($var[1]);
}
// Now you can access every single item by its name:
print $item["Author"];
The limit of 2 passed to explode is needed so you can have multiple ":" in a line. The program separates the name from the value at the first ":".
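First read the lines from the file into an array, then look the values up by their keys.
$array = array();
$handle = fopen("file.txt", "r");
if ($handle) {
    while (($line = fgets($handle)) !== false) {
        // Split at the first ":" only and strip the trailing line break.
        $pieces = explode(":", rtrim($line, "\r\n"), 2);
        if (count($pieces) == 2) {
            $array[$pieces[0]] = $pieces[1];
        }
    }
    fclose($handle); // close only a handle that was actually opened
} else {
    // error opening the file.
}
echo $array['Author'];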

PHP append variable to the END of a specific line in file

So I got the following function:
function searchWord($fileName, $str) {
$addthis = "string";
$lines = file($fileName);
foreach ($lines as $lineNumber => $line) {
if (strpos($line, $str) !== false) {
$lines[$lineNumber] = $lines[$lineNumber].$addthis;
file_put_contents($fileName, implode($lines) . PHP_EOL);
break;
} else {
// do nothing
}
}
}
It searches for a specific string $str in a file $fileName. It then should add $addthis to the END (!!) of the line $lineNumber the $str was found on.
What the code does now is adding $addthis to the start of the line. I have searched the web but couldn't come up with a satisfying solution to my problem.
You need to remove the new line character from the end of your line before adding to it, then add a new line character back onto the end of it before writing back to the file. The updated line below should help.
$lines[$lineNumber] = rtrim($lines[$lineNumber], "\r\n").$addthis.PHP_EOL; // rtrim keeps leading whitespace, unlike trim
Also, the write should happen after the loop.
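Putting both points together, a sketch of how the whole function could look. The name appendToMatchingLine and the boolean return value are my own choices for illustration, not part of the question's code.

```php
<?php
// Hypothetical rewrite of searchWord(): appends $addThis to the end of the
// first line containing $str and reports whether a line was changed.
function appendToMatchingLine(string $fileName, string $str, string $addThis): bool
{
    // FILE_IGNORE_NEW_LINES strips the line break from every element.
    $lines = file($fileName, FILE_IGNORE_NEW_LINES);
    if ($lines === false) {
        return false; // the file could not be read
    }
    $changed = false;
    foreach ($lines as $lineNumber => $line) {
        if (strpos($line, $str) !== false) {
            $lines[$lineNumber] = $line . $addThis;
            $changed = true;
            break; // only the first match is modified
        }
    }
    if ($changed) {
        // Write once, after the loop, putting one line break after each line.
        file_put_contents($fileName, implode(PHP_EOL, $lines) . PHP_EOL);
    }
    return $changed;
}
```

Because the line breaks are removed while reading, the appended text cannot end up at the start of the next line.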
Try this.
In my example, I'm using a text file called test.txt with these 3 lines in it:
apple
banana
cherry
Here's the PHP code.
<?php
// Search Function
function searchWord($fileName, $str, $addThis = 'string') {
if (file_exists($fileName)) {
$lines = file($fileName);
array_walk_recursive($lines, function(&$line) {
$line = trim($line);
});
foreach ($lines as $lineNo => $lineStr) {
if (false !== strpos($lineStr, $str)) {
$lines[$lineNo] = $lineStr . $addThis;
break;
}
}
file_put_contents($fileName, implode(PHP_EOL, $lines));
}
}
// Usage
searchWord('test.txt', 'banana', 'string');
?>
After the script has run, the contents of test.txt are:
apple
bananastring
cherry
Is this what you are after?

PHP search text file line by line for two strings then output line

I am trying to search a text file for two values on a line. If both values are present I need to output the entire line. The values I am searching for may not be next to each other which is where I am getting stuck. I have the following code which works well but only for one search value:
<?php
$search = $_REQUEST["search"];
// Read from file
$lines = file('archive.txt');
foreach($lines as $line)
{
// Check if the line contains the string we're looking for, and print if it does
if(strpos($line, $search) !== false)
echo"<html><title>SEARCH RESULTS FOR: $search</title><font face='Arial'> $line <hr>";
}
?>
Any assistance much appreciated. Many thanks in advance.
Assuming the values you're searching for are separated by a space, and they will both always be present, explode should do the trick:
$search = explode(' ', $_REQUEST["search"]); // change ' ' to ',' if you separate the search terms with a comma, etc.
// Read from file
$lines = file('archive.txt');
foreach($lines as $line)
{
// Check if the line contains the string we're looking for, and print if it does
if(strpos($line, $search[0]) !== false && strpos($line, $search[1]) !== false) {
echo"<html><title>SEARCH RESULTS FOR: " . implode(' ', $search) . "</title><font face='Arial'> $line <hr>";
}
}
I'll leave it up to you to add some validation to make sure there are always two elements in the $search array, etc.
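One way that validation could look, as a sketch: split the query on whitespace, drop empty pieces, and require every term to be present. Both function names are hypothetical. Note that lineContainsAll returns true for an empty term list, so the caller should reject an empty query first.

```php
<?php
// Hypothetical helper: true when $line contains every term in $terms.
function lineContainsAll(string $line, array $terms): bool
{
    foreach ($terms as $term) {
        if (strpos($line, $term) === false) {
            return false; // one missing term is enough to reject the line
        }
    }
    return true;
}

// Hypothetical helper: splits the raw query on whitespace, dropping empty pieces.
function parseSearchTerms(string $query): array
{
    return preg_split('/\s+/', trim($query), -1, PREG_SPLIT_NO_EMPTY);
}

$terms = parseSearchTerms('  red   apple ');
var_dump(lineContainsAll("a red and shiny apple\n", $terms)); // bool(true)
var_dump(lineContainsAll("a green apple\n", $terms));         // bool(false)
```

This also works for more than two terms, and the order of the terms on the line does not matter.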
I also corrected the HTML code. The script looks for two values, $search and $search2. It is using stristr(). For the case-sensitive version of stristr, refer to strstr(). The script will return all lines containing both $search and $search2.
<?php
$search = $_REQUEST["search"];
$search2 = $_REQUEST['search2'];
// Read from file
$lines = file('archive.txt');
echo"<html><head><title>SEARCH RESULTS FOR: $search</title></head><body>";
foreach($lines as $line)
{
// Check if the line contains the string we're looking for, and print if it does
if(stristr($line,$search) && stristr($line,$search2)) // case insensitive
echo "<font face='Arial'> $line </font><hr>";
}
?>
</body></html>
Just search for your other value also and use && to check for both.
<?php
$search1 = $_REQUEST["search1"];
$search2 = $_REQUEST["search2"];
// Read from file
$lines = file('archive.txt');
foreach($lines as $line)
{
// Check if the line contains the string we're looking for, and print if it does
if(strpos($line, $search1) !== false && strpos($line, $search2) !== false)
echo"<html><title>SEARCH RESULTS FOR: $search1 and $search2</title><font face='Arial'> $line <hr>";
}
?>
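This worked for me. You can put whatever terms you like in the $searchthis array, and every line containing one of them is displayed in full. Note that this matches lines containing any of the terms (a line is collected once for each term it contains), not only lines containing all of them.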
<?php
$searchthis = array('1','2','3');
$matches = array();
$handle = fopen("file_path", "r");
if ($handle)
{
while (($buffer = fgets($handle)) !== false) // stops cleanly at end of file
{
foreach ($searchthis as $param) {
if(strpos($buffer, $param) !== FALSE)
$matches[] = $buffer;
}
}
fclose($handle);
}
foreach ($matches as $parts) {
echo $parts;
}
?>

Reformatting The Content Of A File

I have a file called "data.txt". In this file, there are 30 lines of strings that all have the same length. It looks like this:
2QA4ZRDUT
IDVLTLZSC
4GYC3HCMV
1W6409JD5
70P7U66TE
... and so on.
What I want to do now is reformat these lines. I want 5 strings on one line, separated with a ";". After that, I want a new line and the next 5. In the end, it should look like this:
2QA4ZRDUT;IDVLTLZSC;4GYC3HCMV;1W6409JD5;70P7U66TE;
NGN1TGF6G;JWVI7LSIZ;U99TMVXXK;KLBDMRPQV;MEFKLUO3L;
... and so on, until the whole content of the file is reformatted like this.
For the last few hours I've been trying to accomplish this using for, foreach and while loops, but I only manage to get one line. I hope somebody can help me, since I'm not that experienced with PHP.
This is the content of "data.txt":
2QA4ZRDUT
IDVLTLZSC
4GYC3HCMV
1W6409JD5
70P7U66TE
OG2JBBZF6
5391PHOVW
ZAJ3OZ4H2
GMOB9E9X7
Q8U4C8ZK1
0WDZLRWWJ
N487W3S24
PKXQFFEK3
NSMKC29IB
HOLI1T2ZB
DVPIVLLLS
FH7RSZWTM
9VSUWPZEX
NM6ZWV19I
NGN1TGF6G
JWVI7LSIZ
U99TMVXXK
KLBDMRPQV
MEFKLUO3L
LICFIK24W
ELGPLCK51
QQS4SOJV1
KJ2UVTU1B
FLQ6T7LG6
QJZLAPYN1
Something like this:
<?php
$oldf=fopen('data.txt','r');
$newf=fopen('data_new.txt','w');
$i=0;
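while(($line=fgets($oldf))!==false)
{
$line=rtrim($line,"\r\n"); // fgets() keeps the line break; drop it before appending ';'
if($line==='')
continue; // skip blank lines
$i++;
fwrite($newf,$line.';');
if($i%5==0)
fwrite($newf,"\n");
}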
fclose($oldf);
fclose($newf);
Try this:
<?php
$file_name = "test.txt"; // Change file name as needed
$file_data = file_get_contents($file_name);
$file_data_array = explode("\n", $file_data); // double quotes: '\n' in single quotes is a literal backslash and n
$i = 0;
$new_file_data = "";
foreach ($file_data_array as $piece)
{
$new_file_data .= $piece . ';';
$i++;
if ($i % 5 == 0) // Change number of pieces you want on a line as needed
{
$new_file_data .= "\n"; // double quotes so this is a real line break
}
}
file_put_contents($file_name, $new_file_data);
?>
This should do it:
$str = '2QA4ZRDUT
IDVLTLZSC
4GYC3HCMV
1W6409JD5
70P7U66TE
OG2JBBZF6
5391PHOVW
ZAJ3OZ4H2
GMOB9E9X7
Q8U4C8ZK1
0WDZLRWWJ
N487W3S24
PKXQFFEK3
NSMKC29IB
HOLI1T2ZB
DVPIVLLLS
FH7RSZWTM
9VSUWPZEX
NM6ZWV19I
NGN1TGF6G
JWVI7LSIZ
U99TMVXXK
KLBDMRPQV
MEFKLUO3L
LICFIK24W
ELGPLCK51
QQS4SOJV1
KJ2UVTU1B
FLQ6T7LG6
QJZLAPYN1';
$arr = explode(PHP_EOL, $str);
$c = 1;
$str3 = '';
foreach ($arr as $item) {
$str3.= $item.';';
if ($c % 5 == 0) {
$str3.= PHP_EOL;
}
++$c;
}
echo "<p>$str3</p>";
Be aware that you won't see the line breaks in the browser, but they will be apparent when you 'view source'.
Edit: I'm using PHP_EOL here instead of "\n" as the other answers have suggested. PHP_EOL is the line ending of the platform the script runs on, so this works when the text uses that platform's native line endings. For text that may come from another platform, splitting on any line break with preg_split('/\R/', $str) is more robust.
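As an alternative to counting inside the loop, the grouping can also be sketched with array_chunk. The function name groupLines and its $perLine parameter are hypothetical; the trailing ";" on each line follows the output format requested in the question.

```php
<?php
// Hypothetical helper: joins every $perLine entries with ";" (including a
// trailing ";"), producing one group per output line.
function groupLines(string $text, int $perLine = 5): string
{
    // \R matches \n, \r\n and \r, so the origin of the text does not matter.
    $items = preg_split('/\R/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    $rows = array();
    foreach (array_chunk($items, $perLine) as $chunk) {
        $rows[] = implode(';', $chunk) . ';';
    }
    return implode("\n", $rows);
}

echo groupLines("A\nB\nC\nD\nE\nF\nG", 3), "\n";
// A;B;C;
// D;E;F;
// G;
```

For the file in the question, something like file_put_contents('data_new.txt', groupLines(file_get_contents('data.txt'))) would write the result; the output file name is only an example.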

Help with string parsing

I have a huge library file containing words and their synonyms. Here are some words and their synonyms in the format of my library:
aantarrão|1
igrejeiro|igrejeiro|aantarrão|beato
aãsolar|1
desolar|desolar|aãsolar|afligir|arrasar|arruinar|consternar|despovoar|devastar|magoar
aba|11
amparo|amparo|aba|abrigo|achego|acostamento|adminículo|agasalho|ajuda|anteparo|apadrinhamento|apoio|arrimo|asilo|assistência|auxíjlio|auxílio|baluarte|bordão|broquel|coluna|conchego|defesa|égide|encosto|escora|esteio|favor|fulcro|muro|patrocínio|proteção|proteçâo|resguardo|socorro|sustentáculo|tutela|tutoria
apoio|apoio|aba|adesão|adminículo|amparo|aprovação|arrimo|assentimento|base|bordão|coluna|conchego|descanso|eixo|encosto|escora|espeque|fé|fulcro|proteçâo|proteção|refúgio|socorro|sustentáculo
beira|beira|aba|beirada|borda|bordo|cairel|encosta|extremidade|falda|iminência|margem|orla|ourela|proximidade|rai|riba|sopé|vertente
beirada|beirada|aba|beira|encosta|falda|margem|sopé|vertente
encosta|encosta|aba|beira|beirada|clivo|falda|lomba|sopé|subida|vertente
falda|falda|aba|beira|beirada|encosta|fralda|sopé|vertente
fralda|fralda|aba|falda|raiss|raiz|sopé
prestígio|prestígio|aba|auréola|autoridade|domínio|força|halo|importância|influência|preponderância|valia|valimento|valor
proteção|proteção|aba|abrigo|agasalho|ajuda|amparo|apoio|arrimo|asilo|auspiciar|auxílio|bafejo|capa|custódia|defesa|égide|escora|fautoria|favor|fomento|garantia|paládio|patrocínio|pistolão|quartel|refúgio|socorro|tutela|tutoria
sopé|sopé|aba|base|beira|beirada|encosta|falda|fralda|raiz|vertente
vertente|vertente|aba|beira|beirada|declive|encosta|falda|sopé
As you can see, aantarrão is a word and the lines below it hold its synonyms. I can't think of a way to get the word and the synonyms into an associative array. This is what I'm trying to do:
<?
$file = file('library.txt');
$array_sinonimos = array();
foreach($file as $k)
{
$explode = explode($k, "|");
if(is_int($explode[1]))
{
$word = $explode[0];
}
}
?>
Nothing, lol. What can I do here? Loop over the lines until I find an empty line, then try to get a new word with the explode? Help!
Here's some code I cooked up that seems to work.
See the code in action here: http://codepad.org/TVpYgW91
UPDATED to read line by line
<?php
$filepointer = fopen("library.txt", "rb");
$words = array();
while(!feof($filepointer)) {
$line = trim(fgets($filepointer));
$content = explode("|", $line);
if ($line === '') // explode() always returns at least one element, so test the line itself
continue;
if (is_numeric(end($content))) {
$word = reset($content);
continue;
}
if (isset($words[$word]))
$words[$word] = array_merge($words[$word], $content);
else
$words[$word] = $content;
}
fclose($filepointer);
print_r($words);
So what's the strategy?
Run through the file line by line, trimming the line endings.
Ignore empty lines.
Split the line on the pipes; if the last value is numeric, the first value becomes our current word.
We only reach the last step if neither of the earlier checks triggered a continue, so at that point the line is a synonym line: its pipe-separated values are merged into the array element for the current word, or create that element.
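Try this. array_merge() does not accept null, so the entry is initialised to an empty array when the word is first seen. The basic idea is that $word is the key of the associative array.
<?php
$file = file('library.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
$array_sinonimos = array();
$word = null;
foreach($file as $k)
{
    $explode = explode("|", $k); // delimiter first, then the string
    // Values from explode() are strings, so is_int() would never be true here.
    if(count($explode) == 2 && is_numeric($explode[1]))
    {
        $word = $explode[0]; // header line: word|count
        $array_sinonimos[$word] = array();
    }
    else if($word !== null)
    {
        $array_sinonimos[$word] = array_merge($array_sinonimos[$word], $explode);
    }
}
?>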
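Another angle, as a sketch only: in the sample, the number after each word matches the number of synonym lines that follow it (1, 1 and 11). Assuming that holds for the whole library, which the question does not confirm, the count itself can drive the parser. The function name parseThesaurus is hypothetical.

```php
<?php
// Hypothetical parser. Assumes each "word|N" header is followed by exactly N
// synonym lines; this matches the sample but is not confirmed for the full file.
function parseThesaurus(array $lines): array
{
    $result = array();
    $word = null;
    $remaining = 0; // synonym lines still expected for the current word
    foreach ($lines as $line) {
        $line = trim($line);
        if ($line === '') {
            continue;
        }
        $parts = explode('|', $line);
        if ($remaining === 0) {
            // Header line: the word, then the number of synonym lines to read.
            $word = $parts[0];
            $remaining = isset($parts[1]) ? (int) $parts[1] : 0;
            $result[$word] = array();
            continue;
        }
        // Synonym line: merge it in, dropping duplicates and the word itself.
        $merged = array_merge($result[$word], $parts);
        $result[$word] = array_values(array_diff(array_unique($merged), array($word)));
        $remaining--;
    }
    return $result;
}

$sample = array(
    "aantarrão|1",
    "igrejeiro|igrejeiro|aantarrão|beato",
);
print_r(parseThesaurus($sample));
```

With the real file, parseThesaurus(file('library.txt')) would be the call. Unlike the is_numeric check, this does not misread a synonym line whose last entry happens to be a number, but it breaks if a count in the file is wrong.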
