MySQL - Load GTFS Data - PHP

I was wondering if anyone has had any success loading GTFS data into a MySQL database. I've looked all over the place for a good tutorial but can't find anything that has been helpful.

I succeeded in importing GTFS files into MySQL.
Step 1: Create a database
CREATE DATABASE gtfs
DEFAULT CHARACTER SET utf8
DEFAULT COLLATE utf8_general_ci;
Step 2: Create tables
For instance, create the table stops for stops.txt,
-- stop_id,stop_code,stop_name,stop_lat,stop_lon,location_type,parent_station,wheelchair_boarding
CREATE TABLE `stops` (
stop_id VARCHAR(255) NOT NULL PRIMARY KEY,
stop_code VARCHAR(255),
stop_name VARCHAR(255),
stop_lat DECIMAL(8,6),
stop_lon DECIMAL(9,6), -- longitudes need three integer digits (-180 to 180)
location_type INT(2),
parent_station VARCHAR(255),
wheelchair_boarding INT(2),
stop_desc VARCHAR(255),
zone_id VARCHAR(255)
);
Step 3: Load local data
For instance, load the local file stops.txt into the table stops:
LOAD DATA LOCAL INFILE 'stops.txt' INTO TABLE stops FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' IGNORE 1 LINES;
The complete source code with an example is on GitHub (here); make some slight changes for your own purposes.
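Since the question mentions PHP, here is a minimal sketch of how the same three steps could be driven from a PHP script with PDO, assuming the tables from Step 2 already exist, the file names match the table names, and local_infile is enabled on the server; the DSN, credentials and path are placeholders.

<?php
// Sketch: load each GTFS .txt file into a table of the same name via PDO.
$pdo = new PDO(
    'mysql:host=localhost;dbname=gtfs;charset=utf8',
    'user',                                  // placeholder credentials
    'password',
    [PDO::MYSQL_ATTR_LOCAL_INFILE => true]   // allow LOAD DATA LOCAL
);
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$dir = '/path/to/gtfs';                      // directory with the extracted feed
foreach (['agency', 'stops', 'routes', 'trips', 'stop_times', 'calendar'] as $table) {
    $file = "$dir/$table.txt";
    if (!is_file($file)) {
        continue;                            // optional files may be missing
    }
    $pdo->exec(sprintf(
        "LOAD DATA LOCAL INFILE %s INTO TABLE `%s`
         FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
         LINES TERMINATED BY '\\n'
         IGNORE 1 LINES",
        $pdo->quote($file),
        $table
    ));
}

Note that this assumes each file's columns are in the same order as the table definition; a more careful version would read the header line and pass an explicit column list to LOAD DATA.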

Have you tried this one:
https://code.google.com/p/gtfsdb/
The website says:
GTFS (General Transit Feed Specification) Database
Python code that will load GTFS data into a relational database, and Sql/Geo-Alchemy ORM bindings to the GTFS tables in the gtfsdb.
The gtfsdb project's focus is on making GTFS data available in a programmatic context for software developers. The need for the gtfsdb project comes from the fact that a lot of developers start out a GTFS-related effort by first building some amount of code to read GTFS data (whether that's an in-memory loader, a database loader, etc...); gtfsdb can hopefully reduce the need for such drudgery, and give developers a starting point beyond the first step of dealing with GTFS in .csv file format.

I actually used the following link as a base and converted it to a script; it worked like a charm:
http://steffen.im/php-gtfs-to-mysql/
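For reference, the general shape of such a script is just fgetcsv() plus a prepared INSERT. The sketch below is not the linked script, only an illustration of the idea against the stops table from the answer above; connection details are placeholders.

<?php
// Sketch of the "parse in PHP, insert row by row" approach.
$pdo = new PDO('mysql:host=localhost;dbname=gtfs;charset=utf8', 'user', 'password');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$fh = fopen('stops.txt', 'r');
$header = fgetcsv($fh);                      // first line holds the column names
// The header comes from a trusted GTFS file; don't build SQL like this
// from untrusted input.
$stmt = $pdo->prepare(
    'INSERT INTO stops (' . implode(',', $header) . ')
     VALUES (' . rtrim(str_repeat('?,', count($header)), ',') . ')'
);

$pdo->beginTransaction();                    // one transaction keeps this fast
while (($row = fgetcsv($fh)) !== false) {
    $stmt->execute($row);                    // assumes each row matches the header
}
$pdo->commit();
fclose($fh);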

In my case, I created the table structures first.
Then I loaded the data with the following command in the MySQL command-line console:
load data local infile '/media/sf_Downloads/google_transit/stops.txt' into table stop fields terminated by ',' enclosed by '"' lines terminated by '\n' ignore 1 rows;
This loads stops.txt into the stop table; ignore 1 rows skips the first row (the heading) in the stops.txt file.
'/media/...' is just the path to the directory where I extracted the GTFS data.

I loaded the GTFS data into a SQLite database through the following commands:
Create a new SQLite Database named test.sqlite3
sqlite3 test.sqlite3
Set the mode to csv
sqlite> .mode csv
Import each file into a corresponding table with the import command .import FILE TABLE
sqlite> .import agency.txt agency
sqlite> .import calendar.txt calendar
...
sqlite> .import stops.txt stops
Now your database should be loaded.

I'm looking for something similar and still haven't found an easy way to do it. I am going to give loading GTFS into MySQL with Python a try.

According to Google Developers, a GTFS feed is basically a zip file containing text files. You should be able to store it in your MySQL database using a BLOB field, but this isn't recommended. You should store it as a file on your server's disk and then store the file's name/path in a regular TEXT/VARCHAR column in your database.
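If you really only need to keep the feed archive itself (rather than query its contents), a hedged sketch of the "file on disk, path in the database" approach could look like the following; the feeds table and its columns are made up for illustration.

<?php
// Sketch: keep the GTFS zip on disk and only record its path in MySQL.
// The feeds table and its columns (name, path, imported_at) are hypothetical.
$pdo = new PDO('mysql:host=localhost;dbname=gtfs;charset=utf8', 'user', 'password');

$source = '/tmp/google_transit.zip';                 // uploaded/downloaded feed
$targetDir = __DIR__ . '/feeds';
if (!is_dir($targetDir)) {
    mkdir($targetDir, 0775, true);
}
$target = $targetDir . '/' . basename($source);
copy($source, $target);

$stmt = $pdo->prepare('INSERT INTO feeds (name, path, imported_at) VALUES (?, ?, NOW())');
$stmt->execute([basename($target), $target]);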

Related

PHP times out when importing data into a MySQL InnoDB database

I am working on a PHP/MySQL program that needs its own MySQL table, and I want to include a sample database for testing/learning purposes.
I want to use a PHP installation script to automate the creation of the MySQL table and the insertion of the sample database.
The latest versions of MySQL now set the default engine type to InnoDB, and I can successfully create the MySQL table using PHP - which defaults to the InnoDB type.
The problem comes when I try to import the sample database (from a CSV file) - out of 1800 records, only 500 are imported before PHP times out.
I have come up with a possible solution.
Create a MySQL table with the MyISAM engine - using CREATE TABLE $table_name ...... ENGINE=MyISAM
Import the records from the CSV file into the MyISAM table - using INSERT INTO $table_name .......
Finally change the table's engine from MyISAM to InnoDB - using ALTER TABLE $table_name ENGINE = InnoDB
This three step process works MUCH faster and completes well before the PHP script times out.
I have checked the InnoDB table and data using phpmyadmin and all appears to be OK.
Can anyone find fault with this method, and if so, can you offer an easy solution?
The processing would be even faster if you did not do so much work.
LOAD DATA INFILE ...
will load the entire CSV file in one step, without your having to open + read + parse + INSERT each row one by one.
If you need to manipulate any of the columns, then these steps are more general, yet still much faster than either of your methods:
CREATE TABLE tmp ... ENGINE=CSV; -- and point to your file
INSERT INTO real_table
SELECT ... the columns, suitably manipulated in SQL
FROM tmp;
No loop, no OPEN, no parsing.
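To make that concrete in the context of a PHP install script, here is a hedged sketch of the single-statement approach; the table, file and column names are placeholders, and the SET clause only illustrates that LOAD DATA itself can do light per-column manipulation (the CSV-engine staging table above is the more general route).

<?php
// Sketch: replace the 1800-row INSERT loop with one LOAD DATA statement.
$pdo = new PDO(
    'mysql:host=localhost;dbname=myapp;charset=utf8',
    'user',
    'password',
    [PDO::MYSQL_ATTR_LOCAL_INFILE => true]
);

$pdo->exec(
    "LOAD DATA LOCAL INFILE '/path/to/sample_data.csv'
     INTO TABLE sample_table
     FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
     LINES TERMINATED BY '\\n'
     IGNORE 1 LINES
     (name, email, @raw_date)
     SET joined_on = STR_TO_DATE(@raw_date, '%d/%m/%Y')"
);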
This happens with any Apache/PHP/MySQL installation. You need to raise PHP's max execution time so that PHP can finish loading large files into MySQL.
I would recommend you carefully study the php.ini file and understand how these limits are controlled on the back end.
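For reference, here is a small sketch of the limits usually involved; the values are only examples, and which ones matter depends on how the CSV reaches the server.

<?php
// Sketch: relax the limits that typically kill a long import.
ini_set('max_execution_time', '300');   // or set_time_limit(300);
ini_set('memory_limit', '256M');

// If the CSV arrives via an upload form, these matter too, but they can
// only be changed in php.ini / .htaccess, not with ini_set():
//   upload_max_filesize = 64M
//   post_max_size       = 64M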

Load CSV into MySQL, using a new table

I am using phpMyAdmin to load a CSV file into a MySQL table but, does the table have to be created before the load?
I have read that if MySQL Workbench is used (Import a CSV file into MySQL Workbench into a new table dynamically), the latest version creates the table on the fly, but is this possible with phpMyAdmin?
Here is the code that doesn't work:
load data local infile '/Users/...'
into table skin_cutaneous_melanoma
fields terminated by ','
lines terminated by '\n'
The error is:
#1146 - Table 'hospital.skin_cutaneous_melanoma' doesn't exist
Sure, phpMyAdmin can do this.
From the main page, look for the Import tab. This is, of course, how you import any file when the database and table already exist, but if they don't, phpMyAdmin creates a temporary database and table for you using generic names and best guesses at the data types for your columns. Note that this is probably not the ideal structure, since it is only a guess based on the data in your file. For best results, put the desired column names at the top of your CSV file and select the corresponding option during import, or rename the columns after import (likewise with the database and table names) -- but it is absolutely possible to import into a new table.

MySQL table from phpMyAdmin's tracking reports

Using phpMyAdmin I can track a certain table's transactions (new inserts, deletes, etc.), but is it possible to change or export that tracking data to a SQL table that can be imported into my site using PHP?
While it is not exactly what I was looking for, I found a way to do it for the time being. One of the tables in phpMyAdmin's own database is called pma__tracking, and it contains a record of all tables being tracked. One of its columns is the data_sql longtext column, which writes each report (in ascending order, which is a bit annoying) in the following format:
# log date username
data definition statement
Just adding this for future reference.
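For the record, those reports can be pulled out with an ordinary query. Below is a hedged sketch, assuming the phpMyAdmin configuration storage database is named phpmyadmin and reversing the order so the newest tracking version comes first.

<?php
// Sketch: read phpMyAdmin's tracking reports straight out of pma__tracking.
// The storage database name ('phpmyadmin') and credentials are assumptions.
$pdo = new PDO('mysql:host=localhost;dbname=phpmyadmin;charset=utf8', 'user', 'password');

// ORDER BY version DESC returns the newest tracking version first.
$stmt = $pdo->prepare(
    'SELECT version, data_sql
       FROM pma__tracking
      WHERE db_name = ? AND table_name = ?
      ORDER BY version DESC'
);
$stmt->execute(['mydb', 'mytable']);     // placeholders for your db/table

foreach ($stmt as $row) {
    echo $row['data_sql'], "\n";
}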

Update MySQL Table using CSV file

My current MySQL table employee_data has 13k rows with 17 columns. The data in the table came from a CSV file, Employees.csv. After importing the CSV data I added a new column, 'password' (so it's not in the CSV file); passwords are edited and accessed via a web portal. I now have an updated CSV file and I want to update my main table with that data, but I don't want to lose my password info.
Should I import my new CSV file into a temp table in my database and somehow compare them? I am not sure where to start and I am open to recommendations.
I am now realizing I should have kept my password info in a separate table. Doh!
I guess I could create a PHP file that compares each row based on the employee_id field, but with 13k rows I am afraid it would possibly time out.
I would do it like this:
Create a temp table using the CREATE TABLE new_tbl LIKE orig_tbl; syntax
Use LOAD DATA INFILE to import the data from the CSV into the table
Use UPDATE to update the primary table using a primary key / unique column (perhaps employee_id)
I have worked with tables containing 120 million rows and imported CSV files containing 30 million rows into them - this is the method I use all of the time - much more efficient than anything in PHP (and that's my server-side language of choice)
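Here is a hedged sketch of those three steps against the employee_data table from the question; the database name and the non-key columns (first_name, last_name, email) are invented for illustration, and employee_id is assumed to be unique.

<?php
// Sketch: staging table, bulk load, then UPDATE join that leaves the
// password column untouched. Non-key columns are invented placeholders.
$pdo = new PDO(
    'mysql:host=localhost;dbname=hr;charset=utf8',
    'user',
    'password',
    [PDO::MYSQL_ATTR_LOCAL_INFILE => true]
);

// 1. Staging table with the same structure as the live table.
$pdo->exec('CREATE TABLE employee_import LIKE employee_data');

// 2. Bulk-load the new CSV into the staging table.
$pdo->exec(
    "LOAD DATA LOCAL INFILE '/path/to/Employees.csv'
     INTO TABLE employee_import
     FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
     LINES TERMINATED BY '\\n'
     IGNORE 1 LINES
     (employee_id, first_name, last_name, email)"
);

// 3. Update the live table by key; the password column is never mentioned,
//    so it keeps its current values.
$pdo->exec(
    'UPDATE employee_data e
       JOIN employee_import i ON i.employee_id = e.employee_id
        SET e.first_name = i.first_name,
            e.last_name  = i.last_name,
            e.email      = i.email'
);

$pdo->exec('DROP TABLE employee_import');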
Try tools other than PHP-based ones like phpMyAdmin. MySQL Workbench is a great tool; depending on your connection, it will take a while to load the database with your data. There are also workarounds for the PHP timeout limit, e.g.
set_time_limit();

Prepare and import data into an existing database

I maintain a PHP application with SQL Server backend. The DB structure is roughly this:
lot
===
lot_id (pk, identity)
lot_code
building
========
building_id (pk, identity)
lot_id (fk)
inspection
==========
inspection_id (pk, identity)
building_id (fk)
date
inspector
result
The database already has lots and buildings and I need to import some inspections. Key points are:
It's a one-time initial load.
Data comes in an Excel file.
The Excel data is unaware of DB autogenerated IDs: inspections must be linked to buildings through their lot_code
What are my options to do such data load?
date inspector result lot_code
========== =========== ======== ========
31/12/2009 John Smith Pass 987654X
28/02/2010 Bill Jones Fail 123456B
1) Get the Excel file into a CSV.
2) Import the CSV file into a holding table: SQL SERVER – Import CSV File Into SQL Server Using Bulk Insert – Load Comma Delimited File Into SQL Server
3) Write a stored procedure/script where you declare local variables and loop through each row in the holding table, building out the proper rows in the actual tables. Since this is a one-time load, there is no shame in looping, and you'll have complete control over all the logic.
Your data would have to have natural primary keys in the data file. It looks like lot_code may be one, but I don't see one for the building table.
Also, you say that inspections are related to buildings through lot code, yet the relationship in the table is between building and inspection.
If the data is modeled correctly, you can import to temp tables and then insert/update the target tables using the natural keys.
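As an illustration of that natural-key approach, the final insert could resolve building_id by joining through lot_code. This is only a sketch: it assumes one building per lot, a holding table named inspection_holding with properly typed columns, and the pdo_sqlsrv driver, since the application is PHP.

<?php
// Sketch: insert inspections from a holding table, resolving building_id
// through lot_code. Table name and connection details are assumptions.
$pdo = new PDO('sqlsrv:Server=localhost;Database=facilities', 'user', 'password');

$pdo->exec(
    'INSERT INTO inspection (building_id, [date], inspector, result)
     SELECT b.building_id, h.[date], h.inspector, h.result
       FROM inspection_holding h
       JOIN lot l      ON l.lot_code = h.lot_code
       JOIN building b ON b.lot_id   = l.lot_id'
);

If a lot can contain more than one building, this join will multiply rows, which is exactly the modelling question raised above.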
I originally added this answer to the question itself. I'm moving it to a proper answer because that's the right place. Please note anyway that the whole business is from 2010, so the information may no longer be relevant.
In case someone else has to do a similar task, these are the steps the data load finally required:
Prepare the Excel file: remove unwanted columns, give proper names to sheets and column headers, etc.
With the SQL Server Import / Export Wizard (32-bit version; the 64-bit version lacks this feature), load each sheet into a (new) database table. The wizard takes care of (most of) the dirty details, including creating the appropriate DB structure.
Log into the database with your favourite client. To make SQL coding easier, I created some extra fields in the new tables.
Start a transaction.
BEGIN TRANSACTION;
Update the auxiliary columns in the newly created tables:
UPDATE excel_inspection$
SET building_id = bu.building_id
FROM building bu
INNER JOIN ....
Insert data in the destination tables:
INSERT INTO inspection (...)
SELECT ...
FROM excel_inspection$
WHERE ....
Review the results and commit the transaction if everything's fine:
COMMIT;
In my case, SQL Server complained about collation conflicts when joining the new tables with the existing ones. It was fixed by setting an appropriate collation in the new tables but the method differs: in SQL Server 2005 I could simply change collation from the SQL Server Manager (Click, Click, Save and Done) but in SQL Server 2008 I had to set collation manually in the import Wizard ("Edit SQL" button).
