For a project, I need to get some word definitions into a database. All the definitions can be found in multiple DB files, but the DB files I got are for a C language program and are in ASCII format (I believe). I need to somehow parse through the files line by line and add the data into a MySQL database.
I would prefer using PHP and/or MySQL.
I tried writing a PHP script to go through and do it, but it timed out, is intensive on my system, and in most cases doesn't complete.
I heard about LOAD DATA INFILE from MySQL but have no clue how to use it with this.
The file names change for each file and do not have a specific extension; however, all of them can be read as text, and I am sure they are all the same in terms of content.
I uploaded the contents of one file here.
You can see that some lines are useless, but the lines starting with { are good. The pattern is essentially: the first word is the dictionary term, the content within () is the definition, and the parts within "" are sample sentences.
All I need to extract are the terms, definitions and sentences.
The definitions are provided by Princeton University and the license is open source (and I will be crediting them).
Unless you want to reinvent the wheel, I would go with something like wordnet2sql. It will output an SQL script that you can use to create your MySQL tables.
You can find the database specifications on Princeton's website.
LOAD DATA is useful for CSV files, but not so much for special database formats.
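If you do want to parse the raw files yourself rather than use wordnet2sql, something along these lines could work - a rough sketch only, assuming the layout the question describes (term first on each { line, definition in parentheses, sample sentence in double quotes); the table and file names are placeholders:

$pdo = new PDO('mysql:host=localhost;dbname=dict;charset=utf8mb4', 'user', 'pass');
$stmt = $pdo->prepare('INSERT INTO definitions (term, definition, sentence) VALUES (?, ?, ?)');

$fh = fopen('data.noun', 'r'); // file name is a placeholder
while (($line = fgets($fh)) !== false) {
    if ($line[0] !== '{') {
        continue; // skip the "useless" lines
    }
    preg_match('/^\{\s*([\w.-]+)/', $line, $term);   // first word = dictionary term
    preg_match('/\(([^)]*)\)/', $line, $def);        // text within () = definition
    preg_match('/"([^"]*)"/', $line, $sent);         // text within "" = sample sentence
    $stmt->execute([$term[1] ?? '', $def[1] ?? '', $sent[1] ?? '']);
}
fclose($fh);

Doing the INSERTs in batches, or wrapping the loop in a transaction, should also avoid the timeouts mentioned in the question.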
I'm currently attempting to come up with a solution to the following problem:
I have been tasked with parsing large (roughly 3,500 lines, 300 KB) pipe-delimited text files and comparing them line by line to corresponding codes within our database. An example of a file would be:
File name: 015_A.txt
File content (example shows only 4 lines):
015|6999|Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old.|1|1|0|0|2016/01/01
015|3715|It has roots in a piece of classical Latin literature from 45 BC|1|1|213.5|213.5|2016/01/01
015|3724|Making it over 2000 years old.|1|1|617.4|617.4|2016/01/01
015|4028|Words will go here.|1|1|74|74|2016/01/01
I will be providing a web interface which I have already built to allow a file to be selected from the browser and then uploaded to the server.
From the above example pipe file, I will only be using these:
Code (using above line 1 as an example: 6999)
Price (using above line 1 as an example: 0)
I would then (to my mind; I'm not sure if this is the best method) need to run a query (our DB is MSSQL) for each line, for example:
SELECT t.Price
FROM table t
WHERE t.code = '6999'
If t.Price === 0, then line 1 has passed, as it is equal to the source file.
This is where I believe I just need to ask some advice, as I am sure there are many ways to tackle this problem; I would just like to be pointed in the direction of doing this in an efficient manner. (What is the best method of parsing the file? Do I run a query per code, or rather do a SQL statement using an IN clause and then compare every code and price? Should I scrap this idea and use some form of pure SQL tool, bearing in mind I have a pipe file to deal with / import?)
Any advice would be greatly appreciated.
Your story appears to end somewhat prematurely. Is checking that the values in the database match the values in the file the only thing this script should do? If so, it would be simpler just to extract the data from the database and overwrite the file. If not, then this implies you need to retain some record of the variations.
This has some bearing on the approach taken to the reconciliation; running 3500 queries against the database is going to take some time - mostly spent on the network and in query parsing (i.e. wasted). OTOH comparing 3500 records in a single SELECT to find mismatches will take no time at all.
The problem is that your data is out at the client and uploading via a browser only gets it halfway to the database. If you create another table on the database (not a temporary table - add a column to represent the file) it is possible to INSERT multiple rows in a single DML statement, but really you should batch them up in lots of 100 or so records, meaning you only need to execute 36 queries to complete the operation - and you've got a record of the data in the database which simplifies how you report the mismatches.
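To make the batching concrete, here is a minimal PHP/PDO sketch of that approach. The staging table import_rows, the column positions, and the final join against your table are assumptions based on the sample file; adjust them to your schema.

// Sketch: load the pipe file into a staging table in batches of 100 rows,
// then find every mismatch with a single SELECT. Assumes a PDO connection
// in $pdo and a permanent table created once, e.g.:
//   CREATE TABLE import_rows (file_name varchar(50), code varchar(10), price decimal(10,2));

$lines = file('015_A.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

foreach (array_chunk($lines, 100) as $chunk) {
    $params = [];
    foreach ($chunk as $line) {
        $fields = explode('|', $line);
        // Positions guessed from the sample: code is field 2; price assumed
        // to be field 6 (fields 6 and 7 are identical in the sample).
        array_push($params, '015_A.txt', $fields[1], $fields[5]);
    }
    // One multi-row INSERT per batch: INSERT ... VALUES (?,?,?),(?,?,?),...
    $placeholders = implode(',', array_fill(0, count($chunk), '(?,?,?)'));
    $pdo->prepare("INSERT INTO import_rows (file_name, code, price) VALUES $placeholders")
        ->execute($params);
}

// A single SELECT now reports all mismatches at once.
$mismatches = $pdo->query(
    "SELECT i.code, i.price AS file_price, t.Price AS db_price
       FROM import_rows i
       JOIN [table] t ON t.code = i.code
      WHERE t.Price <> i.price"
)->fetchAll();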
You probably should not use the DBMS supplied utilities for direct import unless you ABSOLUTELY trust the source data.
I need help parsing the following CSV file in PHP, so I can insert the contents into a database.
I know I should use file_get_contents(), but after that I feel a bit lost.
What I'd like to store:
Collection1 - events.text & date
Collection2 - position & name.text & total
I'm not sure how best to structure the data to insert into a database table.
"**collection1**"
"events.href","**events.text**","**date**","index","url"
"tur.com/events/classic.html","John Deere Classic","Thursday Jul 9
- Sunday Jul 12, 2015","1","tur.com/r.html"
"collection2"
"**position**","name.href","**name.text**","**total**","index","url"
"--","javascript:void(0);","Scott","--","2","tur.com/r.html"
"--","javascript:void(0);","Billy","--","3","tur.com/r.html"
"--","javascript:void(0);","Jon","--","4","tur.com/r.html"
"--","javascript:void(0);","Bill","--","5","tur.com/r.html"
"--","javascript:void(0);","Tim","--","6","tur.com/r.html"
"--","javascript:void(0);","Carlos","--","7","tur.com/r.html"
"--","javascript:void(0);","Robert","--","8","tur.com/r.html"
"--","javascript:void(0);","Rod","--","9","tur.com/r.html"
As per your previous question, I think this needs to be broken down into sections. As it stands it is rather too broad to answer.
Read the information using file_get_contents(). Make sure this works first, by echoing it to the console. (It sounded from your other question that you felt this would not work if the URL does not have a .csv suffix. It should work regardless of the file extension - try it. If it fails it may be dependent on cookies or JavaScript or some other problem).
Design and create your table structure in MySQL. It seems like you have two tables. They should both have a primary key. Are they related in some fashion? If so, perhaps one has a foreign key to the other one?
Explode your text file on the new line character and loop across the resulting array of lines.
If your CSV data has a title row in the first row position, delete that from your array.
For each line, read the elements of interest using PHP's built-in CSV parsing functions, and store them in variables.
Pass these variables to a custom function that saves the data.
For each save, you'll need to do an INSERT. I recommend using PDO here. Make sure you bind your parameters, as shown in the sketch below.
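Roughly, the steps above translate to something like this (a sketch only; the collection1 table, its columns, and the CSV column positions are assumptions based on the sample):

$pdo = new PDO('mysql:host=localhost;dbname=mydb;charset=utf8mb4', 'user', 'pass');
$stmt = $pdo->prepare('INSERT INTO collection1 (event_text, event_date) VALUES (:text, :date)');

$lines = explode("\n", trim($csv)); // $csv from file_get_contents()
array_shift($lines);                // drop the title row

foreach ($lines as $line) {
    $fields = str_getcsv($line);    // PHP's built-in CSV parser
    $stmt->execute([':text' => $fields[1], ':date' => $fields[2]]);
}

Note that a quoted field containing a newline (like the date in your sample) will break the explode-on-newline approach; fgetcsv() on a file handle copes with embedded newlines, so prefer it if your data has them.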
Where you get stuck on a specific problem, you can ask a new and focussed question. At present, the task is to break things down into discrete and researchable pieces.
One trick worth remembering is this shortcut to the PHP manual. If you do not know how fgetcsv works, for example, type php.net/fgetcsv into your browser address bar, and the PHP site will find the function for you. The documentation is excellent.
I have been trying for a while now to import this kind of file into phpMyAdmin, and with an external PHP LOAD DATA INFILE script, but I can't seem to get the result I would like. I'm not too sure whether I am putting the correct data within the format-specific options.
Columns separated with: (space character)
Columns enclosed with: -
Columns escaped with: -
Lines terminated with: auto (\n)
Could someone suggest what I should do? The snippet of text below is what the text file looks like (without the bullet points).
Perhaps phpMyAdmin is not the way to go?
I would show you guys pictures but I don't have the reputation...yet.
If this helps, I have a link from where I got the dataset:
- https://snap.stanford.edu/data/web-Amazon.html
The layout is shown in the link. I would use Python if need be, though I don't have experience with that language; I'm willing to use the parser which is provided in the link (though I wouldn't know where to start).
-product/productId- B000068VBQ
-product/title- Fisher-Price Rescue Heroes: Lava Landslide
-product/price- 8.88
-review/userId- unknown
-review/profileName- unknown
-review/helpfulness- 11/11
-review/score- 2.0
-review/time- 1042070400
-review/summary- Requires too much coordination
-review/text- I bought this software for my 5 year old. He has a couple of the other RH software games..
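If phpMyAdmin's format options don't cooperate, a small PHP parser may be simpler. A rough sketch, assuming the key: value line layout the linked dataset uses, with blank lines between reviews (the reviews table and its columns are placeholders):

$pdo = new PDO('mysql:host=localhost;dbname=mydb;charset=utf8mb4', 'user', 'pass');
$stmt = $pdo->prepare('INSERT INTO reviews (product_id, title, price, score, summary) VALUES (?, ?, ?, ?, ?)');

$record = [];
$fh = fopen('reviews.txt', 'r');
while (($line = fgets($fh)) !== false) {
    if (preg_match('~^([\w/]+):\s*(.*)$~', trim($line), $m)) {
        $record[$m[1]] = $m[2];      // collect key/value pairs for one review
    } elseif ($record) {
        $stmt->execute([             // a blank line ends a record
            $record['product/productId'] ?? null,
            $record['product/title'] ?? null,
            $record['product/price'] ?? null,
            $record['review/score'] ?? null,
            $record['review/summary'] ?? null,
        ]);
        $record = [];
    }
    // (flush the final record the same way after the loop if the file
    // does not end with a blank line)
}
fclose($fh);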
I'm trying to develop a PHP script that lets users upload shapefiles to import to a postGIS database.
First of all, for the conversion part, AFAIK we can use shp2pgsql to convert the shapefile to a PostgreSQL table; I was wondering if there is another way of doing the conversion, as I would prefer not to use the exec() command.
I would also appreciate any ideas on storing the data in a way that does not require dozens of uniquely named tables.
There seems to be no other way than using PostgreSQL's binary to convert the shapefile. Although it is not really a bad choice, I would rather not use exec() if there were a PHP-native function or an Apache module to do it!
However, it sounds like exec is the only sane option available. So I'm going to use it.
No hard feelings! :)
About the last part: it's a different question and should be asked separately, although I'm afraid there is no other way of doing it.
UPDATE: example added
$queries = shell_exec("shp2pgsql -s ".SRID." -c $shpfilpath $tblname")
or respond(false, "Error parsing the shapefile.");
pg_query($queries) or respond(false, "Query failed!");
SRID is a constant containing the "SRID"!
$shpfilpath is a path to the desired ShapeFile
$tblname is the desired name for the table
See this blog post about loading shapefiles using the PHP shapefile reader plugin from http://www.phpclasses.org/package/1741-PHP-Read-vectorial-data-from-geographic-shape-files.html. The blog post focuses on using PHP on the backend to load data for a Flash app, but you should be able to ignore the Flash part and use the PHP portion for your needs.
Once you have the data loaded from the shapefile, you could convert the geometry to a WKT string and use ST_GeomFromText or other PostGIS functions to store in the database.
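For instance, a sketch of that insert (the table, columns, and SRID are placeholders; $wkt holds the WKT string built from the shapefile record):

$conn = pg_connect('host=localhost dbname=gis user=me');
pg_query_params(
    $conn,
    'INSERT INTO features (name, geom) VALUES ($1, ST_GeomFromText($2, 4326))',
    ['my feature', $wkt]
);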
Regarding the unique columns for a shapefile, I've found that to be the most straightforward way to store ad-hoc shapefile attributes and then retrieve that data. However, you could use a "tuple" system, and convert the attributes to strings, then store them in arbitrarily named columns (col1, col2, col3, etc.) if you don't care about attribute names or types.
If you cared about names and types, you could go one step further and store them as a shapefile "schema" in another table.
1. Write your shp2pgsql command and define its parameters using a text editor, i.e. Sublime, Notepad, etc.
2. Copy, paste, and change the shapefile name for each layer.
3. Save as a batch file (.bat) - a sample follows below.
4. Pull up a command window.
5. Navigate to the directory where your .bat file is saved.
6. Hit enter, and it'll run the code for all your shapefiles; they will be uploaded to the database you defined when writing your code.
7. Use QGIS, go to the PostGIS window, and hit connect.
You are good to go; your shapefiles are now ready and can be added as layers to your map. Make sure the spatial reference matches what it was prior to running it. Does that make sense? I hope that helped; it's the quickest way.
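A hypothetical .bat file of that kind might look like this (paths, SRID, table names, and connection details are all placeholders):

REM one shp2pgsql line per layer, piped straight into psql
shp2pgsql -s 4326 -c roads.shp public.roads | psql -d gis -U postgres
shp2pgsql -s 4326 -c rivers.shp public.rivers | psql -d gis -U postgres
shp2pgsql -s 4326 -c parcels.shp public.parcels | psql -d gis -U postgres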
Adding this answer just for the benefit of anyone who is looking for the same as the OP and does not want to rely on exec() nor external tools.
As of August 2019, you could use PHP Shapefile, a free and open source PHP library I have been developing and maintaining for a few years that can read and write any ESRI Shapefile and convert it natively from/to WKT and GeoJSON, without any third party dependency.
Using my library, which provides WKT to use with PostGIS's ST_GeomFromText() function and an array containing all the data to perform a simple INSERT, makes this task trivial, fast, and secure, without the need for the evil exec().
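A rough sketch of what that looks like (method names as I recall them from the library's v3 API - check its documentation; the table name and SRID are placeholders):

require_once 'vendor/autoload.php';

use Shapefile\ShapefileException;
use Shapefile\ShapefileReader;

$pdo = new PDO('pgsql:host=localhost;dbname=gis', 'user', 'pass');
$stmt = $pdo->prepare('INSERT INTO features (geom) VALUES (ST_GeomFromText(?, 4326))');

try {
    $shapefile = new ShapefileReader('myfile.shp');
    while ($geometry = $shapefile->fetchRecord()) {
        if ($geometry->isDeleted()) {
            continue; // skip records flagged as deleted in the DBF
        }
        $stmt->execute([$geometry->getWKT()]);
    }
} catch (ShapefileException $e) {
    echo 'Error: ' . $e->getMessage();
}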
Alright, so I've just (almost) finished my first little php script. Basically all it does is handling forms. It writes whatever the user put in to the fields of the form to a text file, then I include that text file in the index of the little page I have set up.
So, currently it writes to the beginning of the text file (but doesn't overwrite previous data). But several users want this list to be alphabetically sorted instead. So what I want to do is make whatever they submit fall into the list in alphabetical order.
The thing here is also that all I use in the text file are divs. The list is basically 'divided' into 3 parts. 'Title', 'Link', and 'Poster'. All I have done is positioned them with css.
So, can I sort this list (the titles, in this case) alphabetically and still have the 'link' and 'poster' information assigned the way they already are, but just with the titles sorted?
I don't use databases at all on my site, so there is no database handling at all used in this script (mainly because I'm not experienced at all with them).
I would also suggest storing the data in the file as either XML or JSON. PHP can sort the records easily and the sorting will be preserved in the file when the data is read back in.
For example
file_put_contents("/tmp/datafile",json_encode($recordset));
and when reading the file back in
$recordset = json_decode(file_get_contents("/tmp/datafile"));
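For example, to sort the records by title before writing them back (assuming each record has a title property; json_decode() returns stdClass objects by default):

usort($recordset, function ($a, $b) {
    return strcasecmp($a->title, $b->title);
});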
edit
But seriously, if you have customers and are charging them for your experience and time, use a database; there are many out there (a few listed below):
MySQL
sqlite
PostgreSQL
Oracle
MSSQL
If you really don't want to use a database, even a one-file database (such as sqlite), you can group the three fields using a separator such as _ and sort this single-field list
If this is a small project for some friends or something and the file shouldn't ever have more than maybe a few hundred lines, the functions you're looking for are "file" and "usort".
If you make sure each row is on its own line, you can load the lines into an array with "file". You can sort the array using a function to compare items with the "usort" function.
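A sketch of that approach, assuming each line holds one record as "Title|Link|Poster" (the field layout is illustrative):

$lines = file('list.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

// compare records by their first field (the title), case-insensitively
usort($lines, function ($a, $b) {
    return strcasecmp(strtok($a, '|'), strtok($b, '|'));
});

file_put_contents('list.txt', implode("\n", $lines) . "\n");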