How can I search a .tsv file for multiple matches to a string and export them to a database?
I'm trying to search a large file called mdata.tsv (1.5 million lines) for each string in an array, and then output data from the columns of the matching rows.
This is the code I'm stuck on:
<?php
$file = fopen("mdata.tsv","r"); //open file
$movies = glob('./uploads/Videos/*/*/*/*.mp4', GLOB_BRACE); //Find all the movies
$movID = array(); //Array for movies IDs
//Get XML and add the IDs to $movID()
foreach ($movies as $movie){
$pos = strrpos($movie, '/');
$xml = simplexml_load_file((substr($movie, 0, $pos + 1) .'movie.xml'));
array_push($movID, $xml->id);
}
//Loop through the TSV rows and search for the $tmdbID then print out the movies category.
foreach ($movID as $tmdbID) {
while(($row = fgetcsv($file, 0, "\t")) !== FALSE) {
fseek($file,0);
$myString = $row[0];
$b = strstr( $myString, $tmdbID );
//Dump out the row for the sake of clarity.
//var_dump($row);
$myString = $row[0];
if ($b == $tmdbID){
echo 'Match ' . $row[0] .' '. $row[8];
} // Displays movie ID and category
}
}
fclose($file);
?>
Example of tsv file:
tt0043936 movie The Lawton Story The Lawton Story 0 1949 \N \N Drama,Family
tt0043937 short The Prize Pest The Prize Pest 0 1951 \N 7 Animation,Comedy,Family
tt0043938 movie The Prowler The Prowler 0 1951 \N 92 Drama,Film-Noir,Thriller
tt0043939 movie Przhevalsky Przhevalsky 0 1952 \N \N Biography,Drama
It looks as though you can simplify this code by using in_array() instead of the nested loops to check whether the ID on the current line is in the list of required IDs. The one change needed to make this work is to store strings in the $movID array: $xml->id is a SimpleXMLElement object, so cast it with (string) before storing it.
$file = fopen("mdata.tsv","r"); //open file
$movies = glob('./uploads/Videos/*/*/*/*.mp4', GLOB_BRACE); //Find all the movies
$movID = array(); //Array for movies IDs
//Get XML and add the IDs to $movID()
foreach ($movies as $movie){
$pos = strrpos($movie, '/');
$xml = simplexml_load_file((substr($movie, 0, $pos + 1) .'movie.xml'));
// Store ID as string
$movID[] = (string) $xml->id;
}
while(($row = fgetcsv($file, 0, "\t")) !== FALSE) {
if ( in_array($row[0], $movID) ){
echo 'Match ' . $row[0] .' '. $row[8];
} // Displays movie ID and category
}
fclose($file); // release the file handle once the scan is done
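With 1.5 million rows and a long ID list, in_array() rescans the whole list for every row. A minimal sketch of one alternative, assuming the IDs are unique strings: flip the list into array keys and test with isset(). The sample rows and the temporary file below are stand-ins for mdata.tsv, and the sketch reads lines with fgets() plus explode() rather than fgetcsv(), which is my substitution and assumes the file has no quoted fields.

```php
<?php
// Hypothetical sample data standing in for mdata.tsv (tab-separated).
$tsvPath = tempnam(sys_get_temp_dir(), 'tsv');
file_put_contents($tsvPath, implode("\n", [
    "tt0043936\tmovie\tThe Lawton Story\tThe Lawton Story\t0\t1949\t\\N\t\\N\tDrama,Family",
    "tt0043937\tshort\tThe Prize Pest\tThe Prize Pest\t0\t1951\t\\N\t7\tAnimation,Comedy,Family",
    "tt0043938\tmovie\tThe Prowler\tThe Prowler\t0\t1951\t\\N\t92\tDrama,Film-Noir,Thriller",
]) . "\n");

// IDs that would normally be collected from the movie.xml files.
$movID = ['tt0043936', 'tt0043938'];

// Flip the list so each ID becomes an array key; isset() on a key is a
// hash lookup and does not scan the whole list the way in_array() does.
$wanted = array_flip($movID);

$matches = [];
$file = fopen($tsvPath, 'r');
while (($line = fgets($file)) !== false) {
    // Assumption: plain tab-separated fields without quoting.
    $row = explode("\t", rtrim($line, "\r\n"));
    if (isset($wanted[$row[0]])) {
        $matches[$row[0]] = $row[8]; // column 8 holds the categories
    }
}
fclose($file);
unlink($tsvPath);

print_r($matches);
```

The $matches array can then be written to a database in one go instead of echoing each row as it is found.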
Related
I have just learnt some basic HTML and PHP and I hope someone can help me.
I created an HTML file (a.html) with a form that lets students enter their name, student ID, class, and class number.
Then I created a PHP file (a.php) to save the information from a.html into the info.txt file in the following format:
name1,id1,classA,1
name2,id2,classB,24
name3,id3,classA,15
and so on. (The part above works without any problems.)
After that I created another HTML file (b.html), which requires the user to enter their name and ID in a form.
For example, if the user enters name2 and id2 in the form, the PHP file (b.php) should print this result:
Class: classB
Class Number: 24
I have no idea how to match both the name and the ID at the same time in the text file and return the result in b.php.
example data:
name1,id1,classA,1
name2,id2,classB,24
name3,id3,classA,15
<?php
$name2 = $_POST['name2'];
$id2 = $_POST['id2'];
$data = file_get_contents('info.txt');
if($name2!='')
$konum = strpos($data, $name2);
elseif($id2!='')
$konum = strpos($data, $id2);
if($konum!==false){
$end = strpos($data, "\n", $konum);
$start = strrpos($data, "\n", (0-$end));
$row_string = substr($data, $start, ($end - $start));
$row = explode(",",$row_string);
echo 'Class : '.$row[2].'<br />';
echo 'Number : '.$row[3].'<br />';
}
?>
Iterate through lines until you find your match. Example:
<?php
$csv=<<<CSV
John,1,A
Jane,2,B
Joe,3,C
CSV;
$data = array_map('str_getcsv', explode("\n", $csv));
$get_name = function($number, $letter) use ($data) {
foreach($data as $row)
if($row[1] == $number && $row[2] == $letter)
return $row[0];
};
echo $get_name('3', 'C');
Output:
Joe
You could use some simple regex. For example:
<?php
$search_name = (isset($_POST['name'])) ? $_POST['name'] : exit('Name input required.');
$search_id = (isset($_POST['id'])) ? $_POST['id'] : exit('ID input required.');
// First we load the data of info.txt
$data = file_get_contents('info.txt');
// Then we create an array of lines
$lines = preg_split('#\\n#', $data);
// Now we can loop the lines
foreach($lines as $line){
// Now we split the line into parts using the , separator
$line_parts = preg_split('#\,#', $line);
// $line_parts[0] contains the name, $line_parts[1] contains the id
if($line_parts[0] == $search_name && $line_parts[1] == $search_id){
echo 'Class: '.$line_parts[2].'<br>';
echo 'Class Number: '.$line_parts[3];
// No need to execute the script any further.
break;
}
}
You can run this; I think it does what you need. If your form uses POST, change $_GET to $_POST.
<?php
$name = $_GET['name'];
$id = $_GET['id'];
$students = fopen('info.txt', 'r');
echo "<pre>";
// read each line of the file one by one
while( $student = fgets($students) ) {
// split the file and create an array using the ',' delimiter
$student_attrs = explode(',',$student);
// first element of the array is the user name and second the id
if($student_attrs[0]==$name && $student_attrs[1]==$id){
$result = $student_attrs;
// stop the loop when it is found
break;
}
}
fclose($students);
if (isset($result)) {
echo "Class: ".$result[2]."\n";
echo "Class Number: ".$result[3]."\n";
} else {
// $result is only set when a line matched
echo "Not found\n";
}
echo "</pre>";
strpos can help you find a match in your file. This script assumes you used line feed characters to separate the lines in your text file, and that each name/id pairing is unique in the file.
if ($_POST) {
$str = $_POST["name"] . "," . $_POST["id"] . ","; // trailing comma so that "id2" does not also match "id22"
$file = file_get_contents("info.txt");
$data = explode("\n", $file);
$result = array();
$length = count($data);
$i = 0;
do {
$match = strpos($data[$i], $str, 0);
if ($match === 0) {
$result = explode(",", $data[$i]);
}
} while (!$result && (++$i < $length));
if ($result) {
print "Class: " . $result[2] . "<br />" . "Class Number: " . $result[3];
} else {
print "Not found";
}
}
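The answers above all come down to the same idea: split each line on commas and compare the first two fields. A minimal self-contained sketch of that idea follows. The function name findStudent() and the temporary file are my own stand-ins, not part of the question; it compares with === so that both fields have to match exactly.

```php
<?php
// Hypothetical helper: returns [class, classNumber] for the first line whose
// name AND id both match exactly, or null when no line matches.
function findStudent(string $path, string $name, string $id): ?array
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        return null;
    }
    $found = null;
    while (($line = fgets($handle)) !== false) {
        // trim() drops the trailing newline so the last field compares cleanly
        $fields = explode(',', trim($line));
        if (count($fields) >= 4 && $fields[0] === $name && $fields[1] === $id) {
            $found = [$fields[2], $fields[3]];
            break; // stop at the first match
        }
    }
    fclose($handle);
    return $found;
}

// Sample file in the format from the question (stand-in for info.txt).
$path = tempnam(sys_get_temp_dir(), 'info');
file_put_contents($path, "name1,id1,classA,1\nname2,id2,classB,24\nname3,id3,classA,15\n");

$match = findStudent($path, 'name2', 'id2'); // name and id belong together
$miss  = findStudent($path, 'name2', 'id3'); // name and id from different lines
unlink($path);

echo $match ? "Class: {$match[0]}\nClass Number: {$match[1]}\n" : "Not found\n";
```

In b.php the two arguments would come from $_POST, after checking that both keys are set.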
I have a CSV file and I want to check whether a row contains a special title. Only rows with a special title should be converted to XML, have other data added, and so on.
My question is: how can I iterate through the whole CSV file and read the value of the title field for every row?
If the value matches one of my special titles, I want to convert only that row. Any ideas on how to do that?
Sample: CSV File
I need to add that feature to my current function, which converts the whole CSV to XML. I only want to convert the specified rows.
My current function:
function csvToXML($inputFilename, $outputFilename, $delimiter = ',')
{
// Open csv to read
$inputFile = fopen($inputFilename, 'rt');
// Get the headers of the file
$headers = fgetcsv($inputFile, 0, $delimiter);
// Create a new dom document with pretty formatting
$doc = new DOMDocument('1.0', 'utf-8');
$doc->preserveWhiteSpace = false;
$doc->formatOutput = true;
// Add a root node to the document
$root = $doc->createElement('products');
$root = $doc->appendChild($root);
// Loop through each row creating a <row> node with the correct data
while (($row = fgetcsv($inputFile, 0, $delimiter)) !== false) {
$container = $doc->createElement('product');
foreach ($headers as $i => $header) {
$child = $doc->createElement($header);
$child = $container->appendChild($child);
$value = $doc->createTextNode($row[$i]);
$value = $child->appendChild($value);
}
$root->appendChild($container);
}
$strxml = $doc->saveXML();
$handle = fopen($outputFilename, 'w');
fwrite($handle, $strxml);
fclose($handle);
}
Just check the title before adding the rows to XML. You could do it by adding the following lines:
$specialTitles = array('Title 1', 'Title 2', 'Title 3'); // titles you want to keep (defined once, before the loop)
while (($row = fgetcsv($inputFile, 0, $delimiter)) !== false) {
if(in_array($row[1], $specialTitles)){ // assumes the title is in the second column
$container = $doc->createElement('product');
foreach ($headers as $i => $header) {
$child = $doc->createElement($header);
$child = $container->appendChild($child);
$value = $doc->createTextNode($row[$i]);
$value = $child->appendChild($value);
}
$root->appendChild($container);
}
}
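The snippet above hard-codes the title as column 1. A hedged sketch of a variant that looks the column up from the header row instead, assuming the header is literally named "title" (the sample CSV and that header name are made up for illustration, and the CSV is read from an in-memory stream rather than a file):

```php
<?php
// Hypothetical CSV; the "title" header name is an assumption.
$csv = "id,title,price\n1,Title 1,9.99\n2,Other,4.50\n3,Title 3,12.00\n";
$specialTitles = ['Title 1', 'Title 3']; // titles to keep

// Use an in-memory stream so the sketch runs without touching the disk.
$stream = fopen('php://memory', 'w+');
fwrite($stream, $csv);
rewind($stream);

$headers = fgetcsv($stream, 0, ',', '"', '\\');
$titleIndex = array_search('title', $headers, true);
if ($titleIndex === false) {
    exit("No title column found\n");
}

$doc = new DOMDocument('1.0', 'utf-8');
$doc->formatOutput = true;
$root = $doc->appendChild($doc->createElement('products'));

while (($row = fgetcsv($stream, 0, ',', '"', '\\')) !== false) {
    // Skip every row whose title is not in the list.
    if (!in_array($row[$titleIndex], $specialTitles, true)) {
        continue;
    }
    $product = $root->appendChild($doc->createElement('product'));
    foreach ($headers as $i => $header) {
        $child = $product->appendChild($doc->createElement($header));
        $child->appendChild($doc->createTextNode($row[$i] ?? ''));
    }
}
fclose($stream);

$xml = $doc->saveXML();
echo $xml;
```

Header names are used as element names here, as in the original function, so they have to be valid XML names.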
This is what I'm trying to do:
I have a text file called members.log.
I want to add up the total amount of outstanding payments from each member (210.00, etc.).
E.g.: inactive : [2007-04-01 08:42:21] "home/club/member" 210.00 "r-200"
It seems to me that I need to separate the different parts of each record so that I can target the [key] that corresponds to the amount (210.00, etc.).
I thought to do this with explode() but as I'm not passing a string to explode() I am getting an error: Warning: explode() expects parameter 2 to be string, array given in /home/mauri210/public_html/lfctribe.com/index.php on line 25
How can I solve this so that I can add up the total for each line?
Here is my php:
<?php
//Open the dir
$dirhandle = opendir('/home/mauri210/public_html/lfctribe.com/data');
//Open file
$file = fopen('/home/mauri210/public_html/lfctribe.com/data/members.log', 'r');
//Declare array
$arrFile = array();
//Add each line of file in to array while not EOF
while (!feof($file)) {
$arrFile[] = fgets($file);
//explode
$exarrFile = explode(' ', $arrFile);
}
var_dump($exarrFile);
?>
Here are the contents of members.log:
inactive : [2007-04-01 08:42:21] "home/club/member" 210.00 "r-200"
inactive : [2008-08-01 05:02:20] "home/club/staff" 25.00 "r-200"
active : [2010-08-11 10:12:20] "home/club/member" 210.00 "r-500"
inactive : [2010-01-02 11:12:33] "home/premier/member" 250.00 "r-200"
active : [2013-03-04 10:02:30] "home/premier/member" 250.00 "r-800"
active : [2011-09-14 15:02:55] "home/premier/member" 250.00 "r-100"
while (!feof($file)) {
$arr_file = fgets($file);
$arrFile[] = $arr_file; // reuse the line already read; a second fgets() would skip every other line
//explode
$exarrFile = explode(' ', $arr_file);
}
var_dump($exarrFile);
Try something like this
$sum=0;
foreach(file("path/to/file") as $line )
{
$fields=explode (" ", trim($line));
$sum += (float) $fields[count($fields)-2]; // the amount is the second-to-last field; the last one is "r-200"
}
echo $sum;
You'll be needing this I guess
$items= preg_split('/[,\s]+/', $yourline);
I think I have solved this problem. I've tested it with the small amount of sample data and it seems to work. Here is my updated code:
<?php
//Open the dir
$dirhandle = opendir('/home/mauri210/public_html/lfctribe.com/data');
//Open file
$file = fopen('/home/mauri210/public_html/lfctribe.com/data/members.log', 'r');
//Start the running total at zero
$total = 0;
//Read the file line by line
while (($contents = fgets($file)) !== false) {
$parts = explode(" ", $contents);
$total = $total + $parts[5];
}
var_dump($parts);
echo "this is key 5 $total";
?>
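Splitting on single spaces relies on the amount always being field 5. A minimal sketch of an alternative, using a regular expression to pick out the amount that sits between the two quoted fields and also totalling per status. The pattern is my own and only assumes the line layout shown in the sample log; the log is inlined here instead of being read from members.log.

```php
<?php
// The sample lines from the question, inlined for the sketch.
$log = <<<'LOG'
inactive : [2007-04-01 08:42:21] "home/club/member" 210.00 "r-200"
inactive : [2008-08-01 05:02:20] "home/club/staff" 25.00 "r-200"
active : [2010-08-11 10:12:20] "home/club/member" 210.00 "r-500"
inactive : [2010-01-02 11:12:33] "home/premier/member" 250.00 "r-200"
active : [2013-03-04 10:02:30] "home/premier/member" 250.00 "r-800"
active : [2011-09-14 15:02:55] "home/premier/member" 250.00 "r-100"
LOG;

$total = 0.0;
$byStatus = [];
foreach (explode("\n", $log) as $line) {
    // Group 1: the status word. Group 2: the amount between the closing
    // quote of the path and the opening quote of the last field.
    if (preg_match('/^(\w+)\s*:.*"\s+(\d+\.\d{2})\s+"/', $line, $m)) {
        $amount = (float) $m[2];
        $total += $amount;
        $byStatus[$m[1]] = ($byStatus[$m[1]] ?? 0.0) + $amount;
    }
}

echo "Total: $total\n";
print_r($byStatus);
```

Lines that do not fit the pattern (blank lines, for instance) are simply skipped rather than raising a notice.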
I would like to scan a large piece of text using PHP and find all matches for a pattern, but then also 2 lines above the match and 2 lines below.
My text looks like this, but with some extra unnecessary text above and below this sample:
1
Description text
123.456.12
10.00
10.00
3
Different Description text
234.567.89
10.00
30.00
#Some footer text that is not needed and will change for each text file#
15
More description text
564.238.02
4.00
60.00
15
More description text
564.238.02
4.00
60.00
#Some footer text that is not needed and will change for each text file#
15
More description text
564.238.02
4.00
60.00
15
More description text
564.238.02
4.00
60.00
Using PHP, I am looking to match each number like 123.456.12 (always the same format: 3 digits, dot, 3 digits, dot, 2 digits), and then also return the 2 lines before and the 2 lines after it, ideally as an array so that I can use:
$contents[$i]["qty"] = "1";
$contents[$i]["description"] = "Description text";
$contents[$i]["price"] = "10.00";
$contents[$i]["total"] = "10.00";
etc...
Is this possible and would I use regex? Any help or advice would be greatly appreciated!
Thanks
ANSWERED BY vzwick
This is my final code that I used:
$items_array = array();
$counter = 0;
if (preg_match_all('/(\d+)\n\n(\w.*)\n\n(\d{3}\.\d{3}\.\d{2})\n\n(\d.*)\n\n(\d.*)/', $text_file, $matches)) {
$items_string = $matches[0];
foreach ($items_string as $value){
$item = explode("\n\n", $value);
$items_array[$counter]["qty"] = $item[0];
$items_array[$counter]["description"] = $item[1];
$items_array[$counter]["number"] = $item[2];
$items_array[$counter]["price"] = $item[3];
$items_array[$counter]["total"] = $item[4];
$counter++;
}
}
else
{
die("No matching patterns found");
}
print_r($items_array);
$filename = "yourfile.txt";
$fp = @fopen($filename, "r");
if (!$fp) die('Could not open file ' . $filename);
$i = 0; // element counter
$n = 0; // inner element counter
$field_names = array('qty', 'description', 'some_number', 'price', 'total');
$result_arr = array();
while (($line = fgets($fp)) !== false) {
$result_arr[$i][$field_names[$n]] = trim($line);
$n++;
if ($n % count($field_names) == 0) {
$i++;
$n = 0;
}
}
fclose($fp);
print_r($result_arr);
Edit: Well, regex then.
$filename = "yourfile.txt";
$file_contents = @file_get_contents($filename);
if (!$file_contents) die("Could not open file " . $filename . " or empty file");
if (preg_match_all('/(\d+)\n\n(\w.*)\n\n(\d{3}\.\d{3}\.\d{2})\n\n(\d.*)\n\n(\d.*)/', $file_contents, $matches)) {
print_r($matches[0]);
// do your matching to field names from here ..
}
else
{
die("No matching patterns found");
}
(.+)\n+(.+)\n+(\d{3}\.\d{3}\.\d{2})\n+(.+)\n+(.+)
It might be necessary to replace \n with \r\n. Make sure the regex runs in a mode where "." does not match the newline character (in PHP, that means not using the s modifier).
To reference groups by name, use named capturing groups:
(?P<name>regex)
example of named capturing groups.
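As a rough sketch of how named groups could be combined with preg_match_all() to build the array from the question: the group names below are my own choice, the sample text is a shortened stand-in for the real file, and the ^ anchor with the m modifier is an addition so that the quantity has to start a line.

```php
<?php
// Shortened stand-in for the real text, with blank lines between fields.
$text = "header junk\n\n1\n\nDescription text\n\n123.456.12\n\n10.00\n\n10.00\n\n"
      . "3\n\nDifferent Description text\n\n234.567.89\n\n10.00\n\n30.00\n\n#footer#\n";

// One named group per field; \n+ accepts single or blank-line separators.
$pattern = '/^(?P<qty>\d+)\n+(?P<description>.+)\n+'
         . '(?P<number>\d{3}\.\d{3}\.\d{2})\n+'
         . '(?P<price>[\d.]+)\n+(?P<total>[\d.]+)/m';

$contents = [];
// PREG_SET_ORDER yields one array per matched item instead of one per group.
if (preg_match_all($pattern, $text, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $m) {
        $contents[] = [
            'qty'         => $m['qty'],
            'description' => $m['description'],
            'number'      => $m['number'],
            'price'       => $m['price'],
            'total'       => $m['total'],
        ];
    }
}

print_r($contents);
```

Header and footer text is ignored because it never lines up with the five-field pattern.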
You could load the file into an array, and then use array_slice to cut it into blocks of 5 lines each.
<?php
$file = file("myfile");
$finalArray = array();
for($i = 0; $i < sizeof($file); $i = $i+5)
{
$finalArray[] = array_slice($file, $i, 5);
}
print_r($finalArray);
?>
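The same grouping can probably be written with array_chunk(), and array_combine() can attach the field names at the same time. This is only a sketch under the assumption that the header, footer, and blank lines have already been removed, so that every item is exactly five consecutive lines; the field names are taken from the question, with 'number' added by me for the middle value.

```php
<?php
// Hypothetical input: already-cleaned lines, five per item.
$lines = [
    "1", "Description text", "123.456.12", "10.00", "10.00",
    "3", "Different Description text", "234.567.89", "10.00", "30.00",
];
$fieldNames = ['qty', 'description', 'number', 'price', 'total'];

$items = [];
foreach (array_chunk($lines, count($fieldNames)) as $chunk) {
    // Skip a trailing incomplete block; array_combine() needs equal lengths.
    if (count($chunk) === count($fieldNames)) {
        $items[] = array_combine($fieldNames, $chunk);
    }
}

print_r($items);
```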
Hi everyone, I'm having trouble properly nesting while loops to read from 2 arrays.
I have 2 files from which I read the content:
file1: item_list.txt
string1 \n
string2 \n
string3 \n
...
file2: item+info.txt
string3 \t info1 \t info2 \t info3
string1 \t info7 \t info1 \t info4
string5 \t info2 \t info3
string2 \t info2 \t info4 \t info1
(values are separated by new lines and tabs only, I added one space between characters here just to increase readability).
I read from files using fgetcsv() function, and each row from file is stored as an array into a variable $data. I created a while loop with condition (!feof($fp)) to read through the file until the last row. But I can't quite properly nest the second loop.
What I want to do with this:
Read the first string found in file1, go to file2 and try to find that string. If there's a match, get the info data for that string (all of the data, or just one value, it doesn't matter). If there's no match, return the message "no match". In either case, once the second loop has done its thing, I need to read the second string in file1 and do the search in file2 again. Repeat this as long as there is something to read from file1.
Here are two versions of my code; neither works, and I can't figure out why.
//opening the files
$fp = fopen("$DOCUMENT_ROOT/test/item_list.txt", "r"); #raw item list
$pf = fopen("$DOCUMENT_ROOT/test/item+info.txt", "r"); #item+info list
//read from first file
$line=0;
while (!feof($fp)){
$line++;
$data1 = fgetcsv($fp, "1000", "\n");
$item1= $data1[0];
echo "line: $line. item_list: ".$item1; //just to see on screen what's happening
print_r($data1); //same here, just to see what's going on
echo"<br />";
//searching for string in file2
$row=0;
while (!feof($pf)){
$row++;
$data2 = fgetcsv($pf, "1000", "\t");
$item2= $data2[0];
echo "line: $row. item+info: ".$item2; //just checking things on screen
print_r($data2); //here too
echo "<br />";
//conditioning
//equal strings
if ($string1== $string2)
echo $data2[1]."<br />";
break;
}
}
fclose($fp);
fclose($pf);
This used to work as long as the items in item_list.txt and item+info.txt were ordered exactly the same (string1\nstring2\nstring3 -> string1\tinfo1\nstring2\tinfo2\nstring3\tinfo3), but that's never going to happen in my case; it's impossible to order the items like that.
I tried to do it with a foreach() statement to iterate through the arrays, but the result is something that I can't make any sense of.
while (!feof($fp)){
$data1 = fgetcsv($fp);
foreach ($data1 as $token1) {
while (!feof($pf)) {
$data2 = fgetcsv($pf);
foreach ($data2 as $value) {
explode ("\t", $value);
if ($token1 == $value[0])
echo $value[1];
}
break;
}
}
}
This should do it:
$file1 = file($DOCUMENT_ROOT . '/test/item_list.txt');
$file2 = file($DOCUMENT_ROOT . '/test/item+info.txt');
foreach ($file1 as $line)
{
$line = rtrim($line); // just in case ...
if ($line === '') continue;
foreach($file2 as $infoline)
{
$infoline = explode("\t", rtrim($infoline));
if ($line === $infoline[0])
{
array_shift($infoline);
echo $line . '<br /><br />' . implode('<br />', $infoline);
// $results[$line] = $infoline; // uncomment this if you need the search results stored for later use
break;
}
}
}
Here's a rough shot at it:
$filename1 = 'item_list.txt';
$filename2 = 'item+info.txt';
# Init the Raw Array
$content2raw = array();
# Get the File Contents for File 2
$file2 = file_get_contents( $filename2 );
# Split it into an Array by Line Breaks
$content2raw = preg_split( "/\n/" , $file2 , -1 , PREG_SPLIT_NO_EMPTY );
# Unset the variable holding the file contents
unset( $file2 );
# Init the Fixed Array
$content2 = array();
# Loop through the Raw Array
foreach( $content2raw as $l ){
// Each Line of Filename2
# Split the Line on Tabs
$t = preg_split( "/\s*\t\s*/" , $l , -1 );
# Set the Fixed Array, using the first element from the line as the key
$content2[ $t[0] ] = $t;
}
# Unset the Raw Array
unset( $content2raw );
# Get the File Contents from File 1
$file1 = file_get_contents( $filename1 );
# Split it into an Array by Line Breaks
$contents1 = preg_split( "/\n/" , $file1 , -1 , PREG_SPLIT_NO_EMPTY );
# Unset the variable holding the file contents
unset( $file1 );
# Loop through the Lines, using each line as the Key to look for
foreach( $contents1 as $v ){
# Check whether a matching element exists in the array from File 2
if( !array_key_exists( $v , $content2 ) ){
// No Match Found
echo 'No Match';
}else{
// Match Found
echo 'Match Found';
var_dump( $content2[$v] );
}
}
Amendment, as per comment/feedback from @Bluewind
$filename1 = 'item_list.txt';
$filename2 = 'item+info.txt';
# Open and Split the file into an Array by Line
$content2raw = file( $filename2 , FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES ); // strip newlines so the last field stays clean
# Init the Fixed Array
$content2 = array();
# Loop through the Raw Array
foreach( $content2raw as $l ){
// Each Line of Filename2
# Split the Line on Tabs
$t = preg_split( "/\s*\t\s*/" , $l , -1 );
# Set the Fixed Array, using the first element from the line as the key
$content2[ $t[0] ] = $t;
}
# Unset the Raw Array
unset( $content2raw );
# Open and Split the file into an Array by Line
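$contents1 = file( $filename1 , FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES ); // without these flags every key would end in "\n" and never match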
# Loop through the Lines, using each line as the Key to look for
foreach( $contents1 as $v ){
# Check whether a matching element exists in the array from File 2
if( !array_key_exists( $v , $content2 ) ){
// No Match Found
echo 'No Match';
}else{
// Match Found
echo 'Match Found';
var_dump( $content2[$v] );
}
}
This is actually much less code than you seem to think. First, you read an info file and build a hash table out of it:
foreach(file("info_list") as $line) {
$line = explode("\t", trim($line));
$info[$line[0]] = $line;
}
then you iterate through the items file and look if there are matching entries in the hash:
foreach(file("item_list") as $line) {
$item = trim($line);
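if(isset($info[$item])) {
// we have some info for this item
echo $item . ': ' . implode(', ', array_slice($info[$item], 1)) . "\n";
} else {
// we have no info for this item
echo $item . ": no match\n";
}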
}
That's basically all there is to it.
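Put together, the two passes could look something like the sketch below. The arrays $itemLines and $infoLines are made-up stand-ins for what file() would return for item_list.txt and item+info.txt, and "string4" is added only to show the no-match case.

```php
<?php
// Hypothetical sample data standing in for the two files.
$itemLines = ["string1", "string2", "string3", "string4"];
$infoLines = [
    "string3\tinfo1\tinfo2\tinfo3",
    "string1\tinfo7\tinfo1\tinfo4",
    "string5\tinfo2\tinfo3",
    "string2\tinfo2\tinfo4\tinfo1",
];

// Pass 1: index the info rows by their first column.
$info = [];
foreach ($infoLines as $line) {
    $fields = explode("\t", trim($line));
    $info[$fields[0]] = array_slice($fields, 1); // keep only the info values
}

// Pass 2: one hash lookup per item, regardless of the order in either file.
$results = [];
foreach ($itemLines as $line) {
    $item = trim($line);
    if ($item === '') {
        continue; // ignore blank lines
    }
    $results[$item] = $info[$item] ?? null; // null marks "no match"
}

foreach ($results as $item => $fields) {
    echo $item . ': ' . ($fields === null ? 'no match' : implode(', ', $fields)) . "\n";
}
```

Each file is read once, so the cost grows with the sum of the two file sizes rather than their product, which is what the nested loops cost.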