Collecting Stackoverflow Q and A's in a text file - php

I think my method is lame, but I cannot think of a better way to do this.
I use the UltraEdit text editor to hold everything I cull from Stack Overflow for PHP and MySQL in a single text file. This is my strict format for each new entry:
#################################################
TITLE: THIS IS MY TITLE (ALL IN CAPS, FOLLOWED BY A DOTTED LINE)
-------------------------------------------------
...probably a question first (if necessary), then another shorter dotted line
-------------------
...answer(s)...
#################################################
So, here is an actual entry:
#################################################
TITLE: READING FIRST 5 FIELDS OF CSV FILE INTO PHP
-------------------------------------------------
(...with fgetcsv...)
$row = 1;
if (($handle = fopen("test.csv", "r")) !== FALSE) {
while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
$num = count($data);
echo "<p> $num fields in line $row: <br /></p>\n";
// iterate over each column here
for ($c=0; $c < $num; $c++) {
// handle column data here
echo $data[$c] . "<br />\n";
// exit the loop after 3rd column parsed
if ($c == 2) break;
}
++$row;
}
fclose($handle);
}
-----------------
(...without fgetcsv...)
$lines = file('data.csv');
$linecount = count($lines);
for ($i = 1; $i < $linecount; $i++){
$fields = explode(',', $lines[$i]);
$sno = $fields[0];
$name = $fields[1];
$ph = $fields[2];
$add = $fields[3];
}
#################################################
I can get a list of titles by searching for "TITLE: *", etc. My text file now contains about 15,000 lines. Is there a better way to do this? I have asked StackOverflow before about snippet software, but after a thorough search, there is really nothing out there that fits my needs.
In a way, I'm surprised that there is not a PHP/MySQL application for doing this (collecting snippets). I can't do it because I don't have the knowledge or talent. The snippet collector in my IDE will not suffice.
Thanks!

Why not build yourself a little application with a small SQL backend (say SQL CE or SQLite)?
You could build it around a table with the following columns:
Title
Code Snippet
Original Question URL
You can then relate the tags of the question via another table to allow better searching and cross-referencing.
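A minimal sketch of that layout, assuming SQLite accessed through PDO (the pdo_sqlite extension). The table names, column names, and helper functions are hypothetical and only illustrate one way to relate snippets to tags:

```php
<?php
// Hypothetical schema: one table for snippets, one for tags,
// and a link table so a snippet can carry several tags.
function create_snippet_schema(PDO $db): void
{
    $db->exec('CREATE TABLE IF NOT EXISTS snippets (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        code TEXT NOT NULL,
        question_url TEXT
    )');
    $db->exec('CREATE TABLE IF NOT EXISTS tags (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    )');
    $db->exec('CREATE TABLE IF NOT EXISTS snippet_tags (
        snippet_id INTEGER NOT NULL REFERENCES snippets(id),
        tag_id INTEGER NOT NULL REFERENCES tags(id),
        PRIMARY KEY (snippet_id, tag_id)
    )');
}

// Stores one snippet and links it to each tag, creating tags as needed.
function add_snippet(PDO $db, string $title, string $code, string $url, array $tags): int
{
    $stmt = $db->prepare('INSERT INTO snippets (title, code, question_url) VALUES (?, ?, ?)');
    $stmt->execute([$title, $code, $url]);
    $snippetId = (int) $db->lastInsertId();

    foreach ($tags as $tag) {
        $db->prepare('INSERT OR IGNORE INTO tags (name) VALUES (?)')->execute([$tag]);
        $find = $db->prepare('SELECT id FROM tags WHERE name = ?');
        $find->execute([$tag]);
        $tagId = (int) $find->fetchColumn();
        $db->prepare('INSERT INTO snippet_tags (snippet_id, tag_id) VALUES (?, ?)')
           ->execute([$snippetId, $tagId]);
    }
    return $snippetId;
}

// Returns the titles of all snippets carrying the given tag, sorted by title.
function find_titles_by_tag(PDO $db, string $tag): array
{
    $stmt = $db->prepare('SELECT s.title FROM snippets s
        JOIN snippet_tags st ON st.snippet_id = s.id
        JOIN tags t ON t.id = st.tag_id
        WHERE t.name = ? ORDER BY s.title');
    $stmt->execute([$tag]);
    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}
```

With a connection such as new PDO('sqlite:snippets.db') (the file name is an assumption), calling create_snippet_schema() once and then add_snippet() per entry would replace the "TITLE: *" text search with find_titles_by_tag().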

I have used InfoSelect software for years for this purpose, and have many megabytes of searchable notes and code snippets. It isn't exclusively for code snippets. It's the software equivalent of keeping notes on index cards and being able to arrange them in hierarchies, or do a search on them.
Another similar tool is OneNote which is part of Microsoft Office.
If you remove the condition that your tool should be specifically for tracking code snippets, you may be able to broaden your choices.

Related

CSV parsing conditions in PHP

I created a CSV parser that works fine for some CSV files I've found online, but one that I converted from XLS to CSV via Microsoft Excel 2011 does not work.
The ones that work are formatted as such:
"Sort Order","Common Name","Formal Name","Type","Sub Type","Sovereignty","Capital","ISO 4217 Currency Code","ISO 4217 Currency Name","ITU-T Telephone Code","ISO 3166-1 2 Letter Code","ISO 3166-1 3 Letter Code","ISO 3166-1 Number","IANA Country Code TLD"
"1","Afghanistan","Islamic State of Afghanistan","Independent State",,,"Kabul","AFN","Afghani","+93","AF","AFG","004",".af".........................etc...
The one that doesn't work is formatted like this:
Order Id,Date Ordered,Date Returned,Product Id,Description,Order Reason Code,Return Qty,Order Return Comment,Ship To Name,Ship To Address1,Ship To Address2,Ship To Address3,Ship To City,Ship To State,Ship To Zipcode,Ship To Country,Disposition,Ship To Email,ShipVia
5555555,2013-07-05 13:58:36.000,2013-08-16 00:00:00.000,5555-55,0555 - Some Test Thing,Refund,2,,jeric beatty,123 fake st,,,burke,NJ,55055,US,Discard,test#test.com,Super Fast Shipping
Is there any way to get Excel to export in the same format as the first one? I would like to avoid doing this manually, as the file is huge and I would have to edit many parts of it by hand where a "replace all" wouldn't work. Another issue could be that there are double and sometimes triple commas in some places, though this appears in both files.
Here is the parser:
function ingest_csv() {
$file_url = 'http://www.path.to/csv/file.csv';
$record_num = 0;
$records = array();
$header = array();
if (($handle = fopen($file_url, "r")) !== FALSE) {
$records['id'] = array();
while (($data = fgetcsv($handle)) !== FALSE) {
$records['id'][$record_num] = array();
$cell_num = 0;
foreach ($data as $cell) {
if($record_num == 0) {
$header = $data;
} else {
$current_key = $header[$cell_num];
$records['id'][$record_num][$current_key] = $cell;
}
$cell_num++;
}
$record_num++;
}
fclose($handle);
}
else {
echo 'could not open file.';
}
return array($record_num, $records);
}
function batch_csv() {
list($num_rows, $rows) = ingest_csv();
print_r($num_rows);
print_r($rows);
}
As mentioned in the comments, you may be trying to reinvent the wheel here. That said, I've asked questions myself where I didn't want to give a long, rambling explanation of why I was forced to use an unconventional approach, so in case this is one of those situations, here's an answer.
In OpenOffice Calc (for example), when you save as CSV you get a number of further options, including whether to double-quote all fields.
Unfortunately Excel doesn't give you that choice, but Microsoft does offer a workaround using a macro: http://support.microsoft.com/kb/291296/en-us
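If re-exporting is not an option, the same "quote every field" result could also be produced in PHP after the fact. This is only a sketch under stated assumptions: quote_all_fields() and requote_csv() are hypothetical helper names, and the delimiter is assumed to be a comma. PHP's own fputcsv() only quotes fields that need it, so the quoting is done by hand here:

```php
<?php
// Wraps every field in double quotes, doubling any embedded quotes.
function quote_all_fields(array $fields): string
{
    $quoted = array_map(function ($field) {
        return '"' . str_replace('"', '""', (string) $field) . '"';
    }, $fields);
    return implode(',', $quoted);
}

// Reads $inputPath row by row and writes a fully quoted copy to $outputPath.
// Returns the number of rows written.
function requote_csv(string $inputPath, string $outputPath): int
{
    $in = fopen($inputPath, 'r');
    $out = fopen($outputPath, 'w');
    if ($in === false || $out === false) {
        throw new RuntimeException('Could not open input or output file');
    }
    $rows = 0;
    while (($data = fgetcsv($in, 0, ',', '"', '\\')) !== false) {
        if ($data === [null]) {
            continue; // blank line in the input
        }
        fwrite($out, quote_all_fields($data) . "\n");
        $rows++;
    }
    fclose($in);
    fclose($out);
    return $rows;
}
```

Note that this quotes empty fields as "" as well, whereas the working sample file above leaves them bare.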

Split large Excel/Csv file to multiple files on PHP or Javascript

I have an Excel (file.xls) or CSV (file.csv) file that contains, or will contain, hundreds of thousands of entries, possibly millions. Is it possible to split it into multiple files, such as file.xls into file1.xls, file2.xls, file3.xls and so on?
Are there any libraries to use? Is this possible in PHP? Or how about JavaScript?
Ideally I could specify how many rows to include in each file.
Thanks
Quick and dirty way of splitting a CSV file into several CSV files:
$inputFile = 'input.csv';
$outputFile = 'output';
$splitSize = 10000;
$in = fopen($inputFile, 'r');
$out = null;
$rowCount = 0;
$fileCount = 1;
// loop on fgetcsv() itself so no empty file is created at end of input
while (($data = fgetcsv($in)) !== false) {
    if (($rowCount % $splitSize) == 0) {
        // start a new output file every $splitSize rows
        if ($out) {
            fclose($out);
        }
        $out = fopen($outputFile . $fileCount++ . '.csv', 'w');
    }
    fputcsv($out, $data);
    $rowCount++;
}
if ($out) {
    fclose($out);
}
fclose($in);
Yes, it is possible to do that in PHP with CSV files. You basically iterate over the large file and, every X rows, forward that chunk of rows to another file.
You can find out how to open the large CSV file as an iterator in this answer:
Answer to "how to extract data from csv file in php"
Then you need to chunk the iterator into parts of X rows each. That can be done as outlined here:
Answer to "Need some advice with PHP loop"
Instead of outputting multiple <ul>...</ul> HTML lists, you copy the rows into new files. That basically works as outlined in:
Answer to "How can I split a CSV file in PHP?"
However, this time you want to use the SplFileObject::fputcsv method. Make sure you use a recent stable PHP for this (the method was added in PHP 5.4); otherwise you need a different approach, see fputcsv().
If the first line of the original file contains column headers, you might also be interested in the following:
Answer to "Process CSV Into Array With Column Headings For Key"
It just shows some ways to extend or process the incoming file. You might not need the full abstraction done there; just keeping the first line around might already be enough.
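Since the linked answers are not reproduced here, the following is only a rough sketch of how those pieces might fit together, not the code from those answers. It assumes a comma-delimited file whose first line is a header; the function name split_csv_with_header() and the output naming scheme are hypothetical:

```php
<?php
// Splits $inputPath into files of at most $rowsPerFile data rows each,
// repeating the header row at the top of every part.
// Returns the list of part files written.
function split_csv_with_header(string $inputPath, string $outputPrefix, int $rowsPerFile): array
{
    $in = new SplFileObject($inputPath, 'r');
    // READ_CSV makes iteration yield one array per row; the other flags skip empty lines.
    $in->setFlags(
        SplFileObject::READ_CSV | SplFileObject::READ_AHEAD
        | SplFileObject::SKIP_EMPTY | SplFileObject::DROP_NEW_LINE
    );
    $in->setCsvControl(',', '"', '\\');

    $header = null;
    $out = null;
    $written = 0;
    $parts = [];

    foreach ($in as $row) {
        if (!is_array($row) || $row === [null]) {
            continue; // defensive: ignore anything that is not a real row
        }
        if ($header === null) {
            $header = $row; // keep the first line around for every part
            continue;
        }
        if ($written % $rowsPerFile === 0) {
            // Reassigning $out releases (and closes) the previous part.
            $path = $outputPrefix . (count($parts) + 1) . '.csv';
            $out = new SplFileObject($path, 'w');
            $out->fputcsv($header, ',', '"', '\\');
            $parts[] = $path;
        }
        $out->fputcsv($row, ',', '"', '\\');
        $written++;
    }
    $out = null; // release the last part so it is closed before returning
    return $parts;
}
```

A call such as split_csv_with_header('file.csv', 'file', 10000) would then produce file1.csv, file2.csv and so on, each starting with the original header line.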
I think you can also split by file size:
$part = 1;
$maxSize = 50; // 50 MB
$fopen = fopen('filename.csv', 'r') or die('ERROR');
while (($line = fgetcsv($fopen, 10000, ";")) !== FALSE) {
    // append the row to the current part, then close so the size on disk is current
    $ftowrite = fopen("Part_$part.csv", 'a');
    fputcsv($ftowrite, $line);
    fclose($ftowrite);
    clearstatcache();
    // measure the part that was just written to
    $size = filesize("Part_$part.csv") / 1000000;
    if ($size > $maxSize) {
        $part++;
    }
}
fclose($fopen);

Using fseek to start reading a CSV after a certain number of lines

I am using the following code to read a CSV file and add it to an array:
echo "starting CSV import<br>";
$current_row = 1;
$handle = fopen($csv, "r");
while ( ($data = fgetcsv($handle, 10000, ",") ) !== FALSE )
{
$number_of_fields = count($data);
if ($current_row == 1) {
//Header line
for ($c=0; $c < $number_of_fields; $c++)
{
$header_array[$c] = $data[$c];
}
} else {
//Data line
for ($c=0; $c < $number_of_fields; $c++)
{
$data_array[$header_array[$c]] = $data[$c];
}
array_push($products, $data_array);
}
$current_row++;
}
fclose($handle);
echo "finished CSV import <br>";
However when using a very large CSV this times out on the server, or has a memory limit error.
I'd like a way to do it in stages, so after the first say 100 lines it will refresh the page, starting at line 101.
I will probably be doing this with a meta refresh and a URL parameter.
I just need to know how to adapt that code above to start at the line I tell it to.
I have looked into fseek() but I'm not sure how to implement this here.
Can you please help?
The timeout can be circumvented using
ignore_user_abort(true);
set_time_limit(0);
When you run into the memory limit, it may be wise to take a step back and look at what you're actually doing with the data you're processing. Are you pushing the data into a database? Calculating something from the data without needing to store the data itself? …
Do you really need to push the rows into an array (array_push($products, $data_array);) for later processing? Can you instead write to the database directly? Or calculate directly? Or build an HTML <table> directly? Or do whatever the hell you're doing right then and there, within the while() loop, without pushing everything into an array first?
If you're able to chunk the processing, I guess you don't need that array at all. Otherwise you'd have to restore the array for every chunk - not solving the memory issue one bit.
If you can manage to change your processing algorithm to waste less memory / time, you should seriously consider that over any chunked processing requiring a round-trip to the browser (for so many performance and security reasons…).
Anyways, you can, at any time, identify the current stream offset with ftell() and re-set to that position using fseek(). You'd only need to pass that integer to your next iteration.
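As a rough sketch of that ftell()/fseek() idea, assuming a comma-delimited file with a header line: the function name read_csv_chunk() and the keys of the returned array are hypothetical, and array_combine() expects every row to have as many fields as the header.

```php
<?php
// Reads up to $limit data rows starting at byte $offset (0 = right after the header).
// Returns the rows, the byte offset to resume from, and whether the file is exhausted.
function read_csv_chunk(string $path, int $offset, int $limit): array
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Could not open $path");
    }

    // The header is always read from the first line so every chunk has its keys.
    $header = fgetcsv($handle, 0, ',', '"', '\\');

    if ($offset > 0) {
        fseek($handle, $offset); // jump to where the previous chunk stopped
    }

    $rows = [];
    // Check the row count first so no extra row is consumed once the limit is reached.
    while (count($rows) < $limit
        && ($data = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        if ($data === [null]) {
            continue; // blank line
        }
        $rows[] = array_combine($header, $data);
    }

    $next = ftell($handle); // the integer to hand to the next request
    fclose($handle);

    return [
        'rows' => $rows,
        'next_offset' => $next,
        'done' => $next >= filesize($path),
    ];
}
```

The next_offset value is the integer that would travel in the URL parameter between page refreshes; once done is true there is nothing left to read.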
Also there is no need for your inner for() loops. This should produce the same results:
<?php
$products = array();
$cols = null;
$first = true;
$handle = fopen($csv, "r");
while (($data = fgetcsv($handle, 10000, ",")) !== false) {
if ($first) {
$cols = $data;
$first = false;
} else {
$products[] = array_combine($cols, $data);
}
}
fclose($handle);
echo "finished CSV import <br>";

Output contacts from csv file with PHP

I need to be able to output contacts via a loop on a page from a CSV file downloaded from Outlook.
If the user has the file on their local machine, I suppose I need some sort of upload mechanism, then let my script read uploaded file and then run the results via some loop and output one contact per line.
Each line will have a checkbox next to a contact and if checked, the form will post results and they will be written into db.
Normal format of Outlook .CSV example file here
I only need name and email. First and last name can be merged into a single name. I suppose I need to run some sort of email validation to reject malformed entries...
Just trying to understand what needs to be done.
You should look into fgetcsv, which can read your CSV file and return an array to you. This is really easy to work with.
$row = 1;
if (($handle = fopen("test.csv", "r")) !== FALSE) {
while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
$num = count($data);
$row++;
for ($c=0; $c < $num; $c++) {
echo $data[$c] . "<br />";
}
}
fclose($handle);
}
For more information about reading CSV files, see http://php.net/manual/en/function.fgetcsv.php
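To narrow this down to name and email with basic validation, something along these lines might work. It is a sketch only: the column names "First Name", "Last Name" and "E-mail Address" are an assumption about the Outlook export and should be checked against the actual file, and extract_contacts() is a hypothetical helper.

```php
<?php
// Reads contacts from an already opened CSV handle.
// Returns a list of ['name' => ..., 'email' => ...], skipping malformed emails.
function extract_contacts($handle): array
{
    $header = fgetcsv($handle, 0, ',', '"', '\\');
    if ($header === false) {
        return [];
    }

    // Locate the columns by header name (assumed names, see above).
    $first = array_search('First Name', $header, true);
    $last = array_search('Last Name', $header, true);
    $email = array_search('E-mail Address', $header, true);
    if ($first === false || $last === false || $email === false) {
        throw new RuntimeException('Expected columns not found in header');
    }

    $contacts = [];
    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        $address = trim($row[$email] ?? '');
        if (filter_var($address, FILTER_VALIDATE_EMAIL) === false) {
            continue; // reject malformed or missing addresses
        }
        $name = trim(($row[$first] ?? '') . ' ' . ($row[$last] ?? ''));
        $contacts[] = ['name' => $name, 'email' => $address];
    }
    return $contacts;
}
```

For an uploaded file, the handle could come from fopen($_FILES['contacts']['tmp_name'], 'r'), where 'contacts' is whatever name the upload field has. Each returned entry can then be printed next to a checkbox, passing the values through htmlspecialchars() first.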

Display specific fields from a CSV file

I'm very new to PHP. I have looked around at other questions but none of them seem to provide a solution, so hopefully someone can help!
I have a csv file, but wish to pick out individual fields instead of displaying a whole column.
Is this possible with php?
My code so far (below) picks out specific columns which is not quite what I want to do. If it could pick out specific rows, that would be better than what it's currently showing, but ideally I'd be able to pick specific fields out.
<table>
<?php
$handle = fopen("test.csv", "r");
while (!feof($handle) ) {
$line_of_text = fgetcsv($handle, 1024, ",");
print "<tr><td>" . $line_of_text[0] . "</td><td>" . $line_of_text[5] . "</td></tr>";
}
fclose($handle);
?>
</table>
Hopefully that makes sense!
fgetcsv() only reads the file line by line, so if you want to skip to a particular line, you'd have to put that in yourself:
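$desired_line = 17;
$current_line = 0;
$handle = fopen("test.csv", "r");
while (($line = fgetcsv($handle)) !== false) {
    $current_line++;
    if ($current_line < $desired_line) {
        continue; // keep reading more lines until we reach 17.
    }
    print $line[5]; // e.g. the sixth field of line 17
    break; // remove this to keep processing the lines after 17
}
fclose($handle);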
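The same idea can be wrapped in a small helper that returns a single field by line and column. This is a minimal sketch; get_csv_field() is a hypothetical name, lines are counted from 1 and columns from 0:

```php
<?php
// Returns the field at the given line (1-based) and column (0-based),
// or null if the file, line, or column does not exist.
function get_csv_field(string $path, int $lineNumber, int $columnIndex): ?string
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        return null;
    }

    $current = 0;
    $value = null;
    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        $current++;
        if ($current === $lineNumber) {
            $value = $row[$columnIndex] ?? null;
            break; // no need to read the rest of the file
        }
    }
    fclose($handle);
    return $value;
}
```

For example, get_csv_field('test.csv', 17, 5) would fetch the sixth field of line 17.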
