After years of false starts, I'm finally diving headfirst into learning to code PHP. After about 10 failed previous attempts to learn, it's getting exciting and finally going fairly well.
The project I'm using to learn with is for work: I'm trying to import 100+ fixed-width text files into a MySQL database.
So far so good
I'm getting comfortable with SQL, and I'm learning some PHP tricks, but I'm not sure how to tie all the pieces together. The basic structure for what I want to do goes something like the following:
Name the text file I want to import
Do a LOAD DATA INFILE to import the data into a single field in a temporary table
Use substring() to separate the fixed width file into real columns
Remove lines I don't want (file identifiers, subtotals, etc....)
Add the data from the temp table to the main db
Drop the temp db and start again
As you can see in the attached code, things are working fine. It gets the new file, imports it to the temp table, removes unwanted lines and then moves the content to the final main database. Perfect.
Questions three
My three questions are:
Am I doing this 'properly'? When I want to run a pile of queries one after another, do I keep assigning mysql_query to random variables?
How would I go about automating the script to loop through every file there and import them, rather than having to change the file name and run the script every time?
And, last, what PHP function would I use to 'select' the file(s) I want to import? You know, like attaching a file to an email -> Browse for file, upload it, and then run the script on it?
Sorry for this being an ultra-beginner question, but I'm having trouble seeing how all the pieces fit together. Specifically, I'm wondering how multiple SQL queries get strung together to form a script. The way I've done it below? Some other way?
Thanks x 100 for any insights!
Terry
<?php
// 1. Create db connection
$connection = mysql_connect("localhost","root","root") or die("DB connection failed:" . mysql_error());
// 2. Select the database
$db_select = mysql_select_db("pd",$connection) or die("Couldn't select the database:" . mysql_error());
?>
<?php
// 3. Perform db query
// Drop table import if it already exists
$q="DROP table IF EXISTS import";
//4. Make new import table with just one field
if ($newtable = mysql_query("CREATE TABLE import (main VARCHAR(700));", $connection)) {
echo "Table import made successfully" . "<br>";
} else{
echo "Table import was not made" . "<br>";
}
//5. LOAD DATA INFILE
$load_data = mysql_query("LOAD DATA INFILE '/users/terrysutton/Desktop/importmeMay2010.txt' INTO table import;", $connection) or die("Load data failed" . mysql_error());
//6. Cleanup unwanted lines
if ($cleanup = mysql_query("DELETE FROM import WHERE main LIKE '%GRAND%' OR main LIKE '%Subt%' OR main LIKE '%USER%' OR main LIKE '%DATE%' OR main LIKE '%FOR:%' OR main LIKE '%LOCATION%' OR main LIKE '%---%' OR `main` = '';")){
echo "Table import successfully cleaned up";
} else{
echo "Table import was not successfully cleaned up" . "<br>";
}
// 7. Next, make a table called "temp" to store the data before it gets imported to denominators
$temptable = mysql_query("CREATE TABLE temp
SELECT
SUBSTR(main,1,10) AS 'Unit',
SUBSTR(main,12,18) AS 'Description',
SUBSTR(main,31,5) AS 'BD Days',
SUBSTR(main,39,4) AS 'ADM',
SUBSTR(main,45,4) AS 'DIS',
SUBSTR(main,51,4) AS 'EXP',
SUBSTR(main,56,5) AS 'PD',
SUBSTR(main,100,5) AS 'YTDADM',
SUBSTR(main,106,5) AS 'YTDDIS',
SUBSTR(main,113,4) AS 'YTDEXP',
SUBSTR(main,118,5) AS 'YTDPD'
FROM import;");
// 8. Add a column for the date
$datecolumn = mysql_query("ALTER TABLE temp ADD Date VARCHAR(20) AFTER Unit;");
$date = mysql_query("UPDATE temp SET Date='APR 2010';");
// 9. Move data from the temp table to its final home in the main database
// Append data in temp table to denominator table
$append = mysql_query("INSERT INTO denominators SELECT * FROM temp;");
// 10. Drop import and temp tables to start from scratch.
$droptables = mysql_query("DROP TABLE import, temp;");
// 11. Next, rename the text file to be imported and do the whole thing over again.
?>
<?php
// 12. Close connection
mysql_close($connection);
?>
If you have access to the command line, you can do all your data loading right from the mysql command line. Further, you can automate the process by writing a shell script. Just because you can do something in PHP doesn't mean you should.
For instance, you can just install PHPMyAdmin, create your tables on the fly, then use mysqldump to dump your database definitions to a file, like so:
mysqldump -u myusername -pmypassword mydatabase > mydatabase.backup.sql
Later, you can then just reload the whole database:
mysql -u myusername -pmypassword < mydatabase.backup.sql
It's cool that you are learning to do things in PHP, but focus on the stuff you will actually do in PHP regularly, rather than doing RDBMS work in PHP, which is usually not the right place for it anyway. Build forms, and process the data. Learn how to build objects, and why you might want to do that. Head over and check out Symfony and Doctrine. Learn about the Front Controller pattern.
Also, look into PDO. Using the old mysql_query() functions directly is considered very bad form these days.
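For instance, here's a minimal PDO sketch of the same connect-and-query steps from your script (the DSN details just reuse the credentials you already have; take it as an illustration, not a drop-in replacement):
$pdo = new PDO('mysql:host=localhost;dbname=pd;charset=utf8', 'root', 'root');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION); // throw on SQL errors instead of or-die checks

// Statements with no variable input can run directly...
$pdo->exec("DROP TABLE IF EXISTS import");
$pdo->exec("CREATE TABLE import (main VARCHAR(700))");

// ...while prepared statements keep values out of the SQL string.
$stmt = $pdo->prepare("DELETE FROM import WHERE main LIKE ?");
$stmt->execute(array('%GRAND%'));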
Finally, PHP is great for templating and including disparate parts to form a cohesive whole. Practice making a left and top navigation html file. Figure out how you can include that one file on all your pages so that your same navigation shows up everywhere.
Then figure out how to look at variables like the page name and highlight the navigation tab you are on. Those are the things PHP is well suited for.
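As a rough illustration of that include-and-highlight idea (the file names and markup here are made up):
// nav.php -- shared navigation, included on every page
$current = basename($_SERVER['PHP_SELF']); // e.g. "about.php"
$pages = array('index.php' => 'Home', 'about.php' => 'About', 'contact.php' => 'Contact');
echo '<ul id="nav">';
foreach ($pages as $file => $label) {
    $class = ($file === $current) ? ' class="active"' : '';
    echo '<li' . $class . '><a href="' . $file . '">' . $label . '</a></li>';
}
echo '</ul>';

// any page, e.g. about.php
include 'nav.php'; // the "active" class lands on the About tab automatically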
Why don't you load the files and process them in PHP, and use that to insert values into the actual table?
Ie:
$data = file_get_contents('somefile');
// process data here, say you dump it into a 2d array like
// $insert[$rows][$cols]
// then you can insert these into the db, ie:
foreach ($insert as $row) {
    // mysql_query() only runs one statement per call,
    // so issue one INSERT per row and escape the values
    $query = "INSERT INTO `table` VALUES ("
           . "'" . mysql_real_escape_string($row[0]) . "', "
           . "'" . mysql_real_escape_string($row[1]) . "', "
           . "'" . mysql_real_escape_string($row[2]) . "')";
    mysql_query($query);
}
The purpose of assigning mysql_query to a variable is so that you can get at the data you were querying for. For any query other than a SELECT, it only returns true or false.
So in the case where you are using if ($var = mysql...) you do not need the variable assignment at all, since the function already returns true or false.
Also, I feel like all your substring and data-file processing would be much better done in PHP. You can look into the fopen function and the related functions on the left side of that page.
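For instance, a rough sketch of reading those fixed-width files in PHP rather than in SQL might look like this (the offsets mirror the first few SUBSTR() calls above; the folder path, skip-list and column handling are assumptions):
// Loop over every .txt file in the folder instead of renaming one file by hand
foreach (glob('/users/terrysutton/Desktop/*.txt') as $filename) {
    $handle = fopen($filename, 'r');
    while (($line = fgets($handle)) !== false) {
        // skip the same junk lines the DELETE query removes
        if (trim($line) === '' || strpos($line, 'GRAND') !== false || strpos($line, 'Subt') !== false) {
            continue;
        }
        // slice the fixed-width line into fields (PHP's substr is 0-based,
        // unlike SQL's 1-based SUBSTR)
        $unit        = trim(substr($line, 0, 10));
        $description = trim(substr($line, 11, 18));
        $bdDays      = trim(substr($line, 30, 5));
        // ... remaining columns follow the same pattern ...
        // INSERT the row into the final table here
    }
    fclose($handle);
}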
Related
Because a provider I use has quite unreliable MySQL servers, which are down at least once per week :-/ and that impacts one of the sites I made, I want to work around the outages in the following way:
dump the MySQL table to a file; in case the connection with the SQL server fails,
read the file instead of the server, till the server is back.
This will hide the outages from the user-experience point of view.
In fact things are not as easy as they seem, and I'm asking for your help please.
What I did is save the data in JSON format.
But this ran into issues, because a lot of the data in the DB is stored "in the clear", including escaped, complex URLs with long argument strings, which cause problems during the decode from JSON.
CSV and TSV don't work correctly either.
CSV is delimited by commas or semicolons, and those characters are present in the original content taken from the DB.
TSV leaves double quotes that can't be removed without going through and stripping them out of the record fields.
Then I tried to serialize each record read from the DB, store it, and retrieve it by unserializing it.
But the result is a bit catastrophic: all the records are stored in the file,
yet when I retrieve them only one is returned, and then something blocks the rest of the program (code below).
require_once('variables.php');
require_once("database.php");
$file = "database.dmp";
$myfile = fopen($file, "w") or die("Unable to open file!");
$sql = mysql_query("SELECT * FROM song ORDER BY ID ASC");
// output data of each row
while ($row = mysql_fetch_assoc($sql)) {
// store the record into the file
fwrite($myfile, serialize($row));
}
fclose($myfile);
mysql_close();
// Retrieving section
$myfile = fopen($file, "r") or die("Unable to open file!");
// Till the file is not ended, continue to check it
while ( !feof($myfile) ) {
$record = fgets($myfile); // get the record
$row = unserialize($record); // unserialize it
print_r($row); // show if the variable has something on it
}
fclose($myfile);
I also tried uuencode and base64_encode, but they were worse choices.
Is there any way to achieve my goal?
Thank you very much in advance for your help
If your data layer is well decoupled, you can consider using SQLite as a fallback storage.
It's just a matter of adding one more abstraction, with the same code accessing the storage and switching the storage target when the primary one is unavailable.
-----EDIT-----
You could also think about a caching strategy (a JSON/HTML file?!) that returns stale data in case of a MySQL outage.
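A very rough sketch of that stale-cache idea, sticking with the mysql_* calls from the question (the cache file name and connection details are placeholders):
// Try MySQL first; on failure, fall back to the last good copy on disk
$cacheFile = 'songs.cache.json';
$link = @mysql_connect('localhost', 'user', 'pass');

if ($link && mysql_select_db('mydb', $link)) {
    $result = mysql_query("SELECT * FROM song ORDER BY ID ASC", $link);
    $rows = array();
    while ($row = mysql_fetch_assoc($result)) {
        $rows[] = $row;
    }
    // refresh the cache while the server is up
    file_put_contents($cacheFile, json_encode($rows));
} else {
    // server is down: serve the (possibly stale) cached copy instead
    $rows = json_decode(file_get_contents($cacheFile), true);
}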
-----EDIT 2-----
If it's not too much effort, please consider playing with PDO; I'm quite sure you'll never look back, and believe me, it will help you structure your DB calls so that switching between storages is relatively painless.
Please take the following only as an example; there are much better ways to design this part of the architecture.
Just some small, basic code to demonstrate what I mean:
class StoragePersister
{
    private $driver = 'mysql';

    public function setDriver($driver)
    {
        $this->driver = $driver;
    }

    public function persist($data)
    {
        switch ($this->driver)
        {
            case 'mysql':
                $this->persistToMysql($data);
                break; // without the break, execution falls through to the next case
            case 'sqlite':
                $this->persistToSqlite($data);
                break;
        }
    }

    public function persistToMysql($data)
    {
        //query to mysql
    }

    public function persistToSqlite($data)
    {
        //query to Sqlite
    }
}
$storage = new StoragePersister;
$storage->setDriver('sqlite'); // switch to sqlite when needed
$storage->persist($somedata); // this will use the strategy to call the function based on the storage driver you've selected.
-----EDIT 3-----
Please have a look at the "strategy" design pattern; I think it will help you better understand what I mean.
After the SELECT... you need to build a correct structure for re-inserting the data; then you can serialize it or whatever you want.
For example:
For each row you have, you could do this: $sqls[] = "INSERT INTO `song` (field1, field2, ... fieldN) VALUES (field1_value, field2_value, ... fieldN_value);";
Then you could serialize this $sqls array, write it to a file, and when you need it, read it back, unserialize it, and run the queries.
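A small sketch of that idea (the column names here are made up, since the real song columns aren't shown):
// Build one INSERT statement per row and collect them in an array
$sqls = array();
while ($row = mysql_fetch_assoc($result)) {
    $sqls[] = "INSERT INTO `song` (ID, title, url) VALUES ('"
            . mysql_real_escape_string($row['ID'])    . "', '"
            . mysql_real_escape_string($row['title']) . "', '"
            . mysql_real_escape_string($row['url'])   . "');";
}

// Serialize the whole array in one go, so a single unserialize() gets it all back
file_put_contents('database.dmp', serialize($sqls));

// Later, when you need it:
$sqls = unserialize(file_get_contents('database.dmp'));
foreach ($sqls as $sql) {
    mysql_query($sql);
}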
Have you thought about caching your query results in a cache like APC? Also, you may want to use mysqli or PDO instead of mysql (the mysql extension is deprecated in the latest versions of PHP).
To answer your question, this is one way of doing it.
var_export will export the variable as valid PHP code
require will put the content of the array into the $rows variable (because of the return statement)
Here is the code:
$file = "database.dmp"; // same dump file as in your script
$sql = mysql_query("SELECT * FROM song ORDER BY ID ASC");
$content = array();
// output data of each row
while ($row = mysql_fetch_assoc($sql)) {
// store the record into the file
$content[$row['ID']] = $row;
}
mysql_close();
$data = '<?php return ' . var_export($content, true) . ';';
file_put_contents($file, $data);
// Retrieving section
$rows = require $file;
On my ZF project, I'm importing data from a CSV file and, after some treatment, I insert the data into my MySQL database with a Zend_Db_Table. Here's what the code looks like:
private function addPerson($data)
{
$personDao = new Person();
$personRow = $personDao ->createRow();
if($newperson == -1)
{
//already in DB
}
else
{
$personRow->setName($data['name']);
...
$personRow->save();
}
}
It's working just fine. My only concern is the time it'll take to insert thousands of rows this way.
So my question is: is there any way I can improve my code for large files?
Can I still use the save() function for a lot of rows (>6000) ?
Any suggestion will be welcome.
I was wondering if there's a Zend function that can buffer, say, 500 rows and insert them in one shot instead of calling save() on each row. I'm already at 1 min for 6000 rows...
I think that to optimize the import of the CSV file, you should hand the work over to MySQL, either via a stored procedure or through a PHP command-line script.
It will load the file into your tables for you.
You will find ideas in Import CSV to mysql table.
I haven't done it myself, but I think it is quite feasible.
I hope it will help you :)
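As a rough sketch of handing the file to MySQL from a ZF script (the file path, table name and column list are assumptions, not your real schema):
// Let MySQL parse and load the CSV in one statement instead of
// issuing thousands of individual save() calls through Zend_Db_Table
$db = Zend_Db_Table::getDefaultAdapter();
$sql = "LOAD DATA LOCAL INFILE '/tmp/people.csv'
        INTO TABLE person
        FIELDS TERMINATED BY ',' ENCLOSED BY '\"'
        LINES TERMINATED BY '\n'
        IGNORE 1 LINES
        (name, email, phone)";
$db->query($sql);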
I am trying to source the structure and data from a Sage Line 50 database, but am having trouble with my update/create script.
Basically, I am writing a tool so that the local intranet site can display data sourced from Sage to normal employees without using up a Sage login (orders due in, stock levels, etc.). I am having trouble with this because it seems that the Sage 50 database was developed by total morons. There are no unique keys in this database, or, more accurately, very very few. The structure is really old school; you can find it on pastebin HERE (a bit too big for here). You'll notice that there are some tables with 300+ columns, which I think is a bit stupid. However, I have to work with this, and so I need a solution.
There are a few problems with syncing that I have encountered. Primarily, ODBC can't limit statements to one row so that I can check the data types, and secondly, with there being no IDs, I can't check for duplicates when doing the insert. At the moment, this is what I have:
$rConn = odbc_connect("SageLine50", "user", "password");
if ($rConn == 0) {
die('Unable to connect to the Sage Line 50 V12 ODBC datasource.');
}
$result = odbc_tables($rConn);
$tables = array();
while (odbc_fetch_row($result)){
if(odbc_result($result,"TABLE_TYPE")=="TABLE") {
$tables[odbc_result($result,"TABLE_NAME")] = array();
}
}
This produces the first level of the list you see on pastebin.
A foreach statement is then run to produce the next level, with the columns within each table:
foreach($tables as $k=> $v) {
$query = "SELECT * FROM ".$k;
$rRes = odbc_exec($rConn,$query);
$rFields = odbc_num_fields ($rRes);
$i = 1;
while($i <= $rFields) {
$tables[$k][] = odbc_field_name($rRes, $i);
$i++;
}
CreateTableandRows($k,$tables[$k]);
}
At the moment, I then have a bodged-together function to create each table (not that I like the way it does it).
Because I can't automatically grab back one row (or a few rows) to check the type of data with gettype() and set the column types automatically, the only way I can figure out to do this is to create every column as TEXT and then change the types retrospectively based on a MySQL query.
Here is the function that's called for the table creation after the foreach above.
function CreateTableandRows($table,$rows) {
$db = array(
"host" => '10.0.0.200',
"user" => 'user',
"pass" => 'password',
"table" => 'ccl_sagedata'
);
$DB = new Database($db);
$LocSQL = "CREATE TABLE IF NOT EXISTS `".$table."` (
`id` int(11) unsigned NOT NULL auto_increment,
PRIMARY KEY (`id`),";
foreach($rows as $k=>$v) {
$LocSQL .= "
".$v." TEXT NOT NULL default '',";
}
$LocSQL = rtrim($LocSQL, ',');
$LocSQL .= "
) ENGINE=MyISAM DEFAULT CHARSET=utf8";
echo '<pre>'.$LocSQL.'</pre>';
$DB->query($LocSQL);
}
I then need/want a function that takes each table in turn and synchronizes the data to the ccl_sagedata database. However, it needs to make sure it's not inserting duplicates; this script will be run to sync the Sage database at the start or end of each day, and without ID numbers, REPLACE-style inserts won't work. I am obviously implementing auto-increment primary IDs for each new table in the ccl_sagedata db, but I need to be able to reference something static in each table that I can identify through ODBC (I hope that makes sense). In my current function, it has to query the MySQL database for each row of the Sage database and see if there is a matching row.
function InsertDataFromSage($ODBCTable) {
$rConn = odbc_connect("SageLine50", "user", "pass");
$query = "SELECT * FROM ".$ODBCTable;
$rRes = odbc_exec($rConn,$query);
$rFields = odbc_num_fields ($rRes);
while( $row = odbc_fetch_array($rRes) ) {
$result[] = $row;
}
$DB = new Database($db);
foreach($result as $k => $v) {
$CHECKEXISTS = "SELECT * FROM ".$ODBCTable." WHERE";
$DB->query($CHECKEXISTS);
// HERE WOULD BE A PART THAT PUTS DATA INTO THE DATABASE IF IT DOESN'T ALREADY EXIST
}
}
The only other thing I can think to note is that the 'new Database' class is simply a standard mysqli database class wrapped in functions. It's not something I'm having problems with.
So to recap:
I am trying to create a synchronization script that creates (if not exists) tables within a MySQL database and then imports/syncs the data.
ODBC can't limit the output, so I can't figure out the data types in the columns automatically (and I can't do it manually because it's a massive DB with 80+ tables).
I can't figure out how to stop the script creating duplicates, because there are no IDs in the Sage source database.
For those of you not in the UK, Sage is a useless accounting package that runs on water and coal.
The Sage database only provides data; it doesn't let you input data outside of CSV files in the actual program.
I know this is a bit late, but I'm already doing the same thing, just with MS SQL.
I've used a DTS package that truncates known copies of the tables (i.e. AUDIT_JOURNAL) and then copies everything in daily.
I also hit a bit of a wall trying to handle updates of these tables, hence the truncate and re-create. Sync time is seconds, so it's not a bad option. It may be a bit of a ball ache, but I say design your sync tables manually.
As you rightly point out, Sage is not very friendly to being poked, so I'd say don't try to sync it all either.
Presumably you'll need reports to present to users, but you don't need that much to do this. I sync COMPANY, AUDIT_JOURNAL, AUDIT_USAGE, CAT_TITLE, CAT_TITLE_CUS, CHART_LIST, CHART_LIST_CUS, BANK, CATEGORY, CATEGORY_CUS, DEPARTMENT, NOMINAL_LEDGER, PURCHASE_LEDGER and SALES_LEDGER.
This allows recreation of all the main reports (balance sheet, trial balance, supplier balances, etc., all with drill-down). If you need more help this late on, let me know. I have a web app called MIS that you could install locally, but the sync is a combo of ODBC and the DTS.
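If you end up doing the sync from PHP instead of DTS, the same truncate-and-reload idea could look roughly like this (the function name, connection details and lack of type handling are all assumptions, not working code for your setup):
// Pull a whole Sage table over ODBC and rebuild the MySQL copy from scratch,
// which sidesteps the "no primary key to match on" problem entirely
function ReloadTable($table) {
    $odbc  = odbc_connect("SageLine50", "user", "password");
    $mysql = mysqli_connect('10.0.0.200', 'user', 'password', 'ccl_sagedata');

    // throw away yesterday's copy...
    mysqli_query($mysql, "TRUNCATE TABLE `" . $table . "`");

    // ...and re-insert everything from the ODBC source
    $res = odbc_exec($odbc, "SELECT * FROM " . $table);
    while ($row = odbc_fetch_array($res)) {
        $cols = array();
        $vals = array();
        foreach ($row as $col => $val) {
            $cols[] = "`" . $col . "`";
            $vals[] = "'" . mysqli_real_escape_string($mysql, $val) . "'";
        }
        mysqli_query($mysql, "INSERT INTO `" . $table . "` (" . implode(',', $cols) . ")
                              VALUES (" . implode(',', $vals) . ")");
    }
}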
OK, you do not need to create a synchronisation script. You can query ODBC in real time, and you can even do joins like you would in SQL to retrieve data from multiple tables. The only thing you cannot do is write data back to Sage.
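For example, something along these lines (the table and field names are only illustrative, not guaranteed Sage names):
// Query Sage over ODBC on demand, joining tables just like in SQL
$rConn = odbc_connect("SageLine50", "user", "password");
$sql = "SELECT ST.STOCK_CODE, ST.DESCRIPTION, PO.ORDER_NUMBER, PO.DELIVERY_DATE
        FROM STOCK AS ST
        JOIN POP_ITEM AS PO ON PO.STOCK_CODE = ST.STOCK_CODE
        WHERE PO.DELIVERY_DATE >= '2011-01-01'";
$rRes = odbc_exec($rConn, $sql);
while ($row = odbc_fetch_array($rRes)) {
    print_r($row); // render the row on the intranet page instead
}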
I don't know how to title this... but anyway:
In my script, if data doesn't exist, I insert a new row into the database and then check again for that row. For example:
1 search the db
2 if nothing
include(create.php) -> create entry
3 search the db for that row
Am I going to have to put a usleep(1000000); between the include and the next search on the db? Or is there something I am missing?
Thanks!
Include the file create.php at the top of your PHP script and call the functions you require inside this block:
if ($number_of_rows < 1) { //call the functions from create.php you need here}
Seriously, why even have a create.php used in that manner anyway? Include create.php at the top of the page, put all the insert syntax into a function, and call it later on in your main script. That would work.
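Something along these lines, for example (the file, function and table names are arbitrary, and an open mysql connection is assumed):
// create.php -- only defines a function; nothing runs when it is included
function create_entry($name) {
    $name = mysql_real_escape_string($name);
    mysql_query("INSERT INTO entries (name) VALUES ('" . $name . "')");
}

// main script
include 'create.php';

$result = mysql_query("SELECT * FROM entries WHERE name = 'foo'");
if (mysql_num_rows($result) < 1) {
    create_entry('foo'); // the row exists as soon as this returns, no sleep needed
    $result = mysql_query("SELECT * FROM entries WHERE name = 'foo'");
}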
Or even better, don't even bother including it. Just run the queries straight in your main page. That way if you need to change something, you won't have to affect other pages.
You can use something like this:
$query="Select * from table where id='23'";
$result=mysql_query($query);
if(mysql_num_rows($result)>0){
//result found in sql table
}else{
$query1="INSERT INTO table (schema) values(values)";
$result=mysql_query($query1);
}
sleep(1);
$query="select *...."
You can also use a MySQL INTERVAL query. It will automatically make a query after a particular interval and search for the data.
I'm writing a small application that will write the contents of a CSV file to a database in CodeIgniter, but I'm having difficulty checking for dupes before writing. Since this is a fairly common task, I was wondering how some of the professionals (you) do it. For context and comparison, here is my (currently non-working) code:
$sql = "SELECT * FROM salesperson WHERE salesperson_name='" . $csvarray['salesperson_name'][$i] . "'";
$already_exists = $this->db->query($sql);
if (sizeof($already_exists) > 0) {
//add to database
}
Any input at all short of "You suck!" (I already hear that from my own brain twice an hour at this point) will be greatly appreciated. Thanks.
What you're doing will work, but as noted above, the comparison should be "== 0", not "> 0".
You can also use CodeIgniter's built-in num_rows() method instead of sizeof(), like so:
$sql = "your query";
$query = $this->db->query($sql);
if ($query->num_rows() == 0) {
// no duplicates found, add new record
}
As a side note, if you're using CodeIgniter then it's worthwhile to use Active Record to build your queries instead of writing SQL. You can learn more about it here: CodeIgniter - Active Record
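For the duplicate check above, the Active Record version might look something like this (reusing the table and column names from your query; treat it as a sketch):
// count existing rows with this name using Active Record instead of raw SQL
$name = $csvarray['salesperson_name'][$i];
$query = $this->db->get_where('salesperson', array('salesperson_name' => $name));

if ($query->num_rows() == 0) {
    // no duplicate found, insert the new record
    $this->db->insert('salesperson', array('salesperson_name' => $name));
}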
Firstly, you probably want this (rather than > 0):
if (sizeof($already_exists) == 0) {
//add to database
}
Also, you could just push all your data into a temporary staging table, which you could then de-duplicate with some spiffy SQL and push back into your main table.
Taking a punt on you using MySQL (either that or Postgres, if you are PHPing), you could try this (taken from http://www.cyberciti.biz/faq/howto-removing-eliminating-duplicates-from-a-mysql-table/).
You will have to change .ID to your primary key:
delete from salesperson_temptable
USING salesperson_temptable, salesperson_temptable as vtable
WHERE (NOT salesperson_temptable.ID=vtable.ID)
AND (salesperson_temptable.salesperson_name=vtable.salesperson_name)
Then, once you've reviewed the contents of salesperson_temptable, you can insert it back into the salesperson table.