I process text files, for example by keeping only the lines that begin with a number.
$file = $_FILES['file']['tmp_name'];
$lines = file($file);
foreach ($lines as $line_num => $line) {
$checkFirstChar = isNumber($line);
if ($checkFirstChar !== false){
$line_parts = explode(' ', $line);
$line_number = array_shift($line_parts);
$string1 = mysql_real_escape_string(implode(' ', $line_parts));
$string2 = implode(' ', $line_parts);
// insert the sentence $string1 into the database, one sentence per row
}
}
After that I want to get the filtered text back as one string for another processing step.
I use $string2, but it is still a collection of strings; every sentence is its own string:
string(165) "sentence1 . " string(273) "sentence2 . " etc.
All I need is for all the sentences to become one string again. What can I do? Thanks.
Example input text file:
=========================
file : jssksksks
=========================
1. blablabla.
2. bliblibli .
3. balbalba
=========================
file : jkklkok
=========================
1.blulbulbu.
2.bleblelbl
$string2 is inside the loop. You can collect all the lines that match the criteria and only implode them after the loop:
$file = $_FILES['file']['tmp_name'];
$lines = file($file);
$newlines = array();
foreach ($lines as $line_num => $line) {
$checkFirstChar = isNumber($line);
if ($checkFirstChar !== false){
$line_parts = explode(' ', $line);
$line_number = array_shift($line_parts);
// Concat this line together and add it to another array.
$newlines[] = implode(' ', $line_parts);
}
}
// Concat all the lines together.
$newfile = implode("\n", $newlines);
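As a self-contained illustration of the same idea, the sketch below runs the collect-then-implode pattern on an in-memory array instead of an uploaded file. The sample lines are made up, and isNumber() is not shown in the question, so the version here is only an assumption about what it does.

```php
<?php
// Assumed helper: true when the line starts with a digit (optionally after whitespace).
function isNumber($string) {
    return preg_match('/^\s*[0-9]/', $string) === 1;
}

// Hypothetical sample input standing in for file($file).
$lines = array(
    "=========================\n",
    "file : jssksksks\n",
    "1. blablabla.\n",
    "2. bliblibli .\n",
    "3. balbalba\n",
);

$newlines = array();
foreach ($lines as $line) {
    if (isNumber($line)) {
        $line_parts = explode(' ', trim($line));
        array_shift($line_parts);               // drop the leading "1." token
        $newlines[] = implode(' ', $line_parts);
    }
}

// One string containing every kept sentence.
$newfile = implode("\n", $newlines);
echo $newfile, "\n";
```

Because the implode() over $newlines happens once, after the loop, $newfile is a single string rather than one string per sentence.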
I have the following text file :
====================================================================================
INDEXNUMARTICLE: '1997'
FILE: '###\www.kkk.com\kompas-pront\0004\25\economic\index.htm' NUMSENT: '22' DOMAIN: 'economic'
====================================================================================
2. Social change is a general term which refers to:
4. change in social structure: the nature, the social institutions.
6. When behaviour pattern changes in large numbers, and is visible and sustained, it results in a social change.
I want to get only the sentences, without the numbering, and save them in a database:
=========================================================================
= id = topic    = content                                               =
=========================================================================
= 1  = economic = Social change is a general term which refers to:      =
=    =          = change in social structure: the nature,               =
=    =          = the social institutions. When behaviour pattern       =
=    =          = changes in large numbers, and is visible and          =
=    =          = sustained, it results in a social change.             =
=========================================================================
CODE
function isNumber($string) {
return preg_match('/^\\s*[0-9]/', $string) > 0;
}
$txt = "C:/Users/User/Downloads/economic.txt";
$lines = file($txt);
foreach($lines as $line_num => $line) {
$checkFirstChar = isNumber($line);
if ($checkFirstChar !== false) {
$line_parts = explode(' ', $line);
$line_number = array_shift($line_parts);
foreach ($line_parts as $part) {
if (empty($part)) continue;
$parts = array();
$string = implode(' ', $parts);
$query = mysql_query("INSERT INTO tb_file VALUES ('','economic','$string')");
}
}
}
I have a problem with the array: the data inserted into the content column ends up word by word in different rows. Please help me. Thank you :)
I think your approach is too complicated. Try this shorter one:
$txt = "C:/Users/User/Downloads/economic.txt";
$lines = file($txt);
foreach($lines as $line_num => $line) {
$checkFirstChar = isNumber($line);
if ($checkFirstChar !== false) {
// entire text line without the number: everything after the first space
$string = substr($line, strpos($line, " ") + 1);
$query = mysql_query("INSERT INTO tb_file VALUES ('','economic','$string')");
}
}
Try this one, with regex.
$regex = "/[0-9]+\. /"; // "+" so that multi-digit numbers such as "12. " are matched whole
$txt = "C:/Users/User/Downloads/economic.txt";
$str = file_get_contents($txt);
$index = 0; // fall back to the start of the text when no numbered sentence is found
//Find the first occurrence of a number followed by '.' and a whitespace
if(preg_match($regex, $str, $matches, PREG_OFFSET_CAPTURE)) {
$index = $matches[0][1];
}
//Remove all the text before that first occurrence
$str = substr($str, $index);
//Replace all the occurrences of number followed by '. ' with ' '
$text = preg_replace($regex, " ", $str);
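A variation on the regex approach, as a rough sketch: anchoring the pattern to the start of each line with the m modifier avoids touching digits inside a sentence and skips the header lines in one pass. The sample text below is hypothetical and much shorter than the real file.

```php
<?php
// Hypothetical sample standing in for file_get_contents($txt).
$str = "=====\nFILE: 'x' NUMSENT: '22' DOMAIN: 'economic'\n=====\n"
     . "2. Social change is a general term.\n"
     . "4. It results in a social change.\n";

// Capture the text after "<number>." on every line that starts with a number.
// The "m" modifier makes ^ and $ match at line boundaries.
preg_match_all('/^[ \t]*[0-9]+\.[ \t]*(.+)$/m', $str, $matches);

// Join the captured sentences into one content string.
$text = implode(' ', array_map('trim', $matches[1]));
echo $text, "\n";
```

The resulting $text can then be stored as a single content value, one row per file instead of one row per word.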
Let's say I have this in my text file:
Author:MJMZ
Author URL:http://abc.co
Version: 1.0
How can I get the string "MJMZ" if I look for the string "Author"?
I already tried the solution from another question (Php get value from text file) but with no success.
The problem may be the strpos function: in my case the word "Author" appears twice in the file, so strpos alone can't solve my problem.
Split each line at the : using explode, then check if the prefix matches what you're searching for:
$lines = file($filename, FILE_IGNORE_NEW_LINES);
foreach($lines as $line) {
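// split at the first ":" only; pad so that lines without ":" do not raise a notice
list($prefix, $data) = array_pad(explode(':', $line, 2), 2, '');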
if (trim($prefix) == "Author") {
echo $data;
break;
}
}
Try the following:
$file_contents = file_get_contents('myfilename.ext');
preg_match('/^Author\s*\:\s*([^\r\n]+)/m', $file_contents, $matches); // "m" lets ^ match at the start of any line, not only the start of the file
$code = isset($matches[1]) && !empty($matches[1]) ? $matches[1] : 'no-code-found';
echo $code;
Now the $code variable should contain MJMZ.
The above searches for the first instance of Author:CODE_HERE in your file and places CODE_HERE in $matches[1].
More specifically, the regex searches for a string that starts with the word Author, followed by optional whitespace \s*, followed by a colon \:, followed by optional whitespace \s*, followed by one or more characters that are not a line break [^\r\n]+.
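To make the pattern concrete, here is a small sketch that runs it against the sample file contents held in a string. The function name findHeaderValue() is hypothetical, and the m modifier is an addition so that ^ also matches at the start of later lines.

```php
<?php
// Hypothetical helper: return the value after "<key>:" on the first line
// that starts with <key>, or null when the key is not present.
function findHeaderValue($contents, $key) {
    $pattern = '/^' . preg_quote($key, '/') . '\s*:\s*([^\r\n]+)/m';
    if (preg_match($pattern, $contents, $matches)) {
        return trim($matches[1]);
    }
    return null;
}

$contents = "Author:MJMZ\nAuthor URL:http://abc.co\nVersion: 1.0\n";
echo findHeaderValue($contents, 'Author'), "\n";      // value of the "Author" line
echo findHeaderValue($contents, 'Author URL'), "\n";  // value of the "Author URL" line
```

Because the pattern requires a colon right after the key (apart from whitespace), searching for Author does not accidentally pick up the Author URL line.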
If items are added to your file dynamically, you can read all of them into an associative array.
$content = file_get_contents("myfile.txt");
$line = explode("\n", $content);
$item = array();
foreach($line as $l){
// Split at the first ":" only, so a value such as a URL keeps its own colons.
$var = explode(":", $l, 2);
if (count($var) < 2) continue; // skip lines without a ":"
$item[trim($var[0])] = trim($var[1]);
}
// Now you can access every single item by its name:
print $item["Author"];
The limit argument of explode() is needed so you can have multiple ":" in a line. The program separates the name from the value at the first ":".
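For experimenting with this parsing step without a file on disk, a minimal sketch follows. The function name parseKeyValueLines() is hypothetical, and it assumes every useful line has the form name:value.

```php
<?php
// Hypothetical helper: turn "name:value" lines into an associative array.
function parseKeyValueLines($content) {
    $items = array();
    foreach (preg_split('/\r\n|\r|\n/', $content) as $line) {
        // The limit of 2 splits at the first ":" only, so URLs keep their colons.
        $parts = explode(':', $line, 2);
        if (count($parts) === 2) {
            $items[trim($parts[0])] = trim($parts[1]);
        }
    }
    return $items;
}

$items = parseKeyValueLines("Author:MJMZ\nAuthor URL:http://abc.co\nVersion: 1.0");
print $items['Author URL'] . "\n";
```

Each name becomes its own key, so "Author" and "Author URL" no longer collide the way they do with a plain strpos() search.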
First read the lines from the file, convert them to an array, then look them up by their keys.
$array = array();
$handle = fopen("file.txt", "r");
if ($handle) {
while (($line = fgets($handle)) !== false) {
// Split at the first ":" only and strip the trailing newline.
$pieces = explode(":", trim($line), 2);
if (count($pieces) === 2) {
$array[$pieces[0]] = $pieces[1];
}
}
fclose($handle);
} else {
// error opening the file.
}
echo $array['Author'];
$file = file_get_contents("http://www.bigsite.com");
How can I remove all lines that contain the word "hello" from the string $file?
$file = file_get_contents("http://www.bigsite.com");
$lines = explode("\n", $file);
$exclude = array();
foreach ($lines as $line) {
if (strpos($line, 'hello') !== FALSE) {
continue;
}
$exclude[] = $line;
}
echo implode("\n", $exclude);
$file = file_get_contents("http://www.example.com");
// remove single word hello
echo preg_replace('/(hello)/im', '', $file);
// remove multiple words hello, foo, bar, foobar
echo preg_replace('/(hello|foo|bar|foobar)/im', '', $file);
EDIT: Removing the lines
// read the file's lines into an array, without their trailing newlines
$lines = file('http://example.com/', FILE_IGNORE_NEW_LINES);
// match single word hello
$pattern = '/(hello)/im';
// match multiple words hello, foo, bar, foobar
$pattern = '/(hello|foo|bar|foobar)/im';
$rows = array();
foreach ($lines as $key => $value) {
if (!preg_match($pattern, $value)) {
// lines not containing hello
$rows[] = $value;
}
}
// now create the paragraph again
echo implode("\n", $rows);
Here you go:
$file = file('http://www.bigsite.com', FILE_IGNORE_NEW_LINES); // drop the newlines, they are added back by implode() below
foreach( $file as $key=>$line ) {
if( false !== strpos($line, 'hello') ) {
unset($file[$key]);
}
}
$file = implode("\n", $file);
$file = file_get_contents("http://www.bigsite.com");
echo trim(preg_replace('/((^|\n).*hello.*(\n|$))/', "\n", $file));
The pattern is meant to cover four cases:
the first line has hello
a middle line has hello
the last line has hello
the only line has hello
If the files use \r\n (carriage return plus newline, as on Windows), you need to modify the pattern accordingly. trim() removes the leading and/or trailing newlines that are left over.
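As one possible way to cover the \r\n case without hand-tuning the pattern, the sketch below splits on any line break and filters the lines instead. The function name removeLinesContaining() and the sample text are hypothetical.

```php
<?php
// Hypothetical helper: drop every line that contains $word (case-insensitive).
function removeLinesContaining($text, $word) {
    // \R matches \n, \r\n and \r, so Windows line endings are handled too.
    $lines = preg_split('/\R/', $text);
    $pattern = '/' . preg_quote($word, '/') . '/i';
    // PREG_GREP_INVERT keeps the lines that do NOT match the pattern.
    $kept = preg_grep($pattern, $lines, PREG_GREP_INVERT);
    return implode("\n", $kept);
}

$sample = "hello world\r\nkeep me\r\nsay Hello\r\nhello again\r\nlast line";
echo removeLinesContaining($sample, 'hello'), "\n";
```

Filtering an array of lines treats the first, middle, last and only line the same way, so the four cases need no special handling.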
I have a csv file with this:
software
hardware
educational
games
languages
.
.
.
I need a new csv file with:
software;hardware;educational;games;languages;....
How can I do that?
I'm doing:
<?php
$one = file_get_contents('one.csv');
$patterns =" /\\n/";
$replacements = ";";
$newone = preg_replace($patterns, $replacements, $one);
echo $newone;
file_put_contents('newone.csv', $newone );
?>
This adds the semicolon at the end of the line but the line break is still there
Surprisingly none of you mentioned file() that returns what he needs:
$cont = file('somefile.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
file_put_contents('somefile.csv',implode(';',$cont));
That is two lines of code, with no regex needed.
Or,
if you prefer less code, here it is as a single statement, the way I like it:
file_put_contents(
'somefile.csv',
implode(
';',
file('somefile.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES)
)
);
Here is how you can do this.
Edit: tested this, it works correctly.
<?php
$row = 1;
$readHandle = fopen("in.csv", "r"); // open the csv file
$writeHandle = fopen("out.csv","w");
$subArr = array();
while (($data = fgetcsv($readHandle, 1000, "\n")) !== FALSE) {
$myStr = $data[0]; // this stores the zeroth column of each CSV row
$subArr[] = $myStr; // subArr contains all your words
}
fputcsv($writeHandle,$subArr,";"); // creates a CSV with a single line separated by ;
fclose($readHandle);
fclose($writeHandle);
?>
I guess you could use preg_match_all() to get every alphanumeric word surrounded by quotes into an array.
Then you just loop over that array and output the words, adding a semicolon after each one.
As a one-off, I would run home to mama...
perl -p -i -e 's|(.*)\n|$1;|m' one.csv
Your file may have carriage returns. Try this:
$newone = str_replace("\r\n", ';', $one);
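If it is not known which line endings the file uses, a sketch along these lines handles \r\n, \r and \n alike. The function name joinLines() is hypothetical.

```php
<?php
// Hypothetical helper: join the non-empty lines of $text with $separator.
function joinLines($text, $separator = ';') {
    // Split on \r\n, \r or \n so stray carriage returns do not survive.
    $lines = preg_split('/\r\n|\r|\n/', $text);
    // Trim each line and drop empty entries such as the one after a trailing newline.
    $lines = array_filter(array_map('trim', $lines), 'strlen');
    return implode($separator, $lines);
}

echo joinLines("software\r\nhardware\r\neducational\r\ngames\r\nlanguages\r\n"), "\n";
```

The result could be written out with file_put_contents() exactly as in the question.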
To cover all possibilities:
<?php
$file = 'data.csv';
file_put_contents($file, '"software"
"hardware"
"educational"
"games"
"languages"
');
$input_lines = file($file);
$output_columns = array();
foreach($input_lines as $line){
$line = trim($line); // Remove trailing new line
$line = substr($line, 1); // Remove leading quote
$line = substr($line, 0, -1); // Remove trailing quote
$output_columns[] = $line;
}
echo implode(';', $output_columns);
Beware: this code assumes no errors in input file. Always add some validation.
I suggest doing it like this:
<?php
$one = file_get_contents('one.csv');
$patterns ="/\\r?\\n/";
$replacements = ";";
$newone = preg_replace($patterns, $replacements, $one);
echo $newone;
file_put_contents('newone.csv', $newone );
?>
I'm trying to remove some excessive indentation from a string (in this case SQL) so it can be put into a log file. I need to find the smallest amount of indentation (i.e. tabs) and remove it from the front of each line, but the following code ends up printing exactly the same text. Any ideas?
In other words, I want to take the following (NOTE: the StackOverflow editor converted my tabs to spaces; each indent level is shown as 4 spaces, but in the code it really is a \t character)
            SELECT
                blah
            FROM
                table
            WHERE
                id=1
and convert it to
SELECT
    blah
FROM
    table
WHERE
    id=1
Here's the code I tried, which fails:
$sql = '
SELECT
blah
FROM
table
WHERE
id=1
';
// it's most likely indented SQL, remove any indentation
$lines = explode("\n", $sql);
$space_count = array();
foreach ( $lines as $line )
{
preg_match('/^(\t+)/', $line, $matches);
$space_count[] = strlen($matches[0]);
}
$min_tab_count = min($space_count);
$place = 0;
foreach ( $lines as $line )
{
$lines[$place] = preg_replace('/^\t{'. $min_tab_count .'}/', '', $line);
$place++;
}
$sql = implode("\n", $lines);
print '<pre>'. $sql .'</pre>';
It seems the problem was that
strlen($matches[0])
returns 0 and 1 for the first and last lines, which isn't the 3 I actually wanted as the minimum. A quick hack was to:
trim the SQL
skip counting the length if it's less than 2
Not the most elegant solution, but it works here because the indentation in this code is usually 4 or more tabs. Here's the fixed code:
$sql = '
SELECT
blah
FROM
table
WHERE
id=1
';
// it's most likely indented SQL, remove any indentation
$lines = explode("\n", $sql);
$space_count = array();
foreach ( $lines as $line )
{
// only read $matches when the line really starts with tabs
if ( preg_match('/^(\t+)/', $line, $matches) && strlen($matches[0]) > 1 )
{
$space_count[] = strlen($matches[0]);
}
}
$min_tab_count = min($space_count);
$place = 0;
foreach ( $lines as $line )
{
$lines[$place] = preg_replace('/^\t{'. $min_tab_count .'}/', '', $line);
$place++;
}
$sql = implode("\n", $lines);
print $sql;
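An alternative that does not rely on the "less than 2" threshold is to ignore blank lines when looking for the minimum. The following is only a sketch under that assumption. The function name removeCommonTabs() is hypothetical, and the sample string uses "\t" escapes in place of real tab characters.

```php
<?php
// Hypothetical helper: remove the smallest common run of leading tabs from every line.
function removeCommonTabs($text) {
    $lines = explode("\n", $text);
    $counts = array();
    foreach ($lines as $line) {
        if (trim($line) === '') {
            continue; // blank lines say nothing about the indentation level
        }
        // strspn() counts how many leading characters are tabs.
        $counts[] = strspn($line, "\t");
    }
    $min = $counts ? min($counts) : 0;
    if ($min === 0) {
        return $text; // nothing in common to remove
    }
    foreach ($lines as $i => $line) {
        // {1,$min} also cleans whitespace-only lines that have fewer tabs.
        $lines[$i] = preg_replace('/^\t{1,' . $min . '}/', '', $line);
    }
    return implode("\n", $lines);
}

$sql = "\n\t\t\tSELECT\n\t\t\t\tblah\n\t\t\tFROM\n\t\t\t\ttable\n\t\t";
echo removeCommonTabs($sql);
```

Skipping blank lines means the first and last lines of the quoted SQL no longer drag the minimum down to 0 or 1.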
private function cleanIndentation($str) {
$content = '';
foreach(preg_split("/((\r?\n)|(\r\n?))/", trim($str)) as $line) {
$content .= " " . trim($line) . PHP_EOL;
}
return $content;
}