I have 70,000 rows in a MySQL table. I am trying to upload them in Firebase via import JSON. I have exported the table into JSON format. The output was an array. e.g., JSON is as follows -
[
{"question_id":"99","question":"What is your name?"},
{"question_id":"200","question":"What do you do?"}
]
For the correct use of Firebase in mobile apps, I need to import this JSON data as an object, along with GUID like following -
{
"-Lf64AvZinbjvEQLMzGc" : {
"question_id" : 99,
"question" : "What is your name?"
},
"-Lf64AvZinbjvEQLMzGd" : {
"question_id" : 200,
"question" : "What do you do?"
}
}
Since the number of rows are 70,000 (JSON file of this data is 110 MB); inserting individually into Firebase is not possible. So I was trying for a way to generate 70,000 GUID and editing the JSON file to add each one before every object. But, I am stuck at both places. I am using the following class (PushId.php) for generating GUID -
<?php
/**
* Fancy ID generator that creates 20-character string identifiers with the following properties:
*
* 1. They're based on timestamp so that they sort *after* any existing ids.
* 2. They contain 72-bits of random data after the timestamp so that IDs won't collide with other clients' IDs.
* 3. They sort *lexicographically* (so the timestamp is converted to characters that will sort properly).
* 4. They're monotonically increasing. Even if you generate more than one in the same timestamp, the
* latter ones will sort after the former ones. We do this by using the previous random bits
* but "incrementing" them by 1 (only in the case of a timestamp collision).
*/
class PushId
{
/**
* Modeled after base64 web-safe chars, but ordered by ASCII.
*
* #var string
*/
const PUSH_CHARS = '-0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz';
/**
* Timestamp of last push, used to prevent local collisions if you push twice in one ms.
*
* #var int
*/
private static $lastPushTime = 0;
/**
* We generate 72-bits of randomness which get turned into 12 characters and appended to the
* timestamp to prevent collisions with other clients. We store the last characters we
* generated because in the event of a collision, we'll use those same characters except
* "incremented" by one.
*
* #var array
*/
private static $lastRandChars = [];
/**
* #return string
*/
public static function generate()
{
$now = (int) microtime(true) * 1000;
$isDuplicateTime = ($now === static::$lastPushTime);
static::$lastPushTime = $now;
$timeStampChars = new SplFixedArray(8);
for ($i = 7; $i >= 0; $i--) {
$timeStampChars[$i] = substr(self::PUSH_CHARS, $now % 64, 1);
// NOTE: Can't use << here because javascript will convert to int and lose the upper bits.
$now = (int) floor($now / 64);
}
static::assert($now === 0, 'We should have converted the entire timestamp.');
$id = implode('', $timeStampChars->toArray());
if (!$isDuplicateTime) {
for ($i = 0; $i < 12; $i++) {
$lastRandChars[$i] = floor(rand(0, 64));
}
} else {
// If the timestamp hasn't changed since last push, use the same random number, except incremented by 1.
for ($i = 11; $i >= 0 && static::$lastRandChars[$i] === 63; $i--) {
static::$lastRandChars[$i] = 0;
}
static::$lastRandChars[$i]++;
}
for ($i = 0; $i < 12; $i++) {
$id .= substr(self::PUSH_CHARS, $lastRandChars[$i], 1);
}
static::assert(strlen($id) === 20, 'Length should be 20.');
return $id;
}
/**
* #param bool $condition
* #param string $message
*/
private static function assert($condition, $message = '')
{
if ($condition !== true) {
throw new RuntimeException($message);
}
}
}
Following is the PHP code that I wrote to generate 70,000 GUID but it showed error (while the same code is working if I use it to generate 1 GUID only)
require_once('PushId.php');
$vars = new PushId();
$my_file = 'TheGUIDs.txt';
$handle = fopen($my_file, 'w') or die('Cannot open file: '.$my_file);
$i=1;
while($i <= 70000){
$data = $vars->generate();
fwrite($handle, $data);
echo $data;
$i++;
}
fclose($handle);
Edit -1
The error I am getting in generating GUID is -
Notice: Undefined offset: 11 in C:\wamp64\www\firebase-json\PushId.php on line 65
Notice: Undefined variable: lastRandChars in C:\wamp64\www\firebase-json\PushId.php on line 71
The first GUID is complete, e.g. in last run I got -Lf8WPlkkNTUjYkIP4WT, but then I got incomplete GUID -Lf8WPlk------------
First, you forgot to prepend static:: to $lastRandChars in some places:
for ($i = 0; $i < 12; $i++) {
$lastRandChars[$i] = floor(rand(0, 64)); // << here
}
for ($i = 0; $i < 12; $i++) {
$id .= substr(self::PUSH_CHARS, $lastRandChars[$i], 1); // << and here
}
Next:
When incrementing previous ID, you missed that you should increment not only last char, but all previous chars as well if they are more than 63.
Sample IDs sequence:
...
-Lf64AvZinbjvEQLMzGw
-Lf64AvZinbjvEQLMzGx
-Lf64AvZinbjvEQLMzGy
-Lf64AvZinbjvEQLMzGz
-Lf64AvZinbjvEQLMzH- <<< on this step last char is reset, and previous one is incremented by one
-Lf64AvZinbjvEQLMzH0
-Lf64AvZinbjvEQLMzH1
...
-Lf64AvZinbjvEQLMzzw
-Lf64AvZinbjvEQLMzzx
-Lf64AvZinbjvEQLMzzw
-Lf64AvZinbjvEQLMzzz
-Lf64AvZinbjvEQLN--- <<< on this step last three chars are reset, and previous one is incremented by one
-Lf64AvZinbjvEQLN--0
-Lf64AvZinbjvEQLN--1
...
Modified increment logic:
// If the timestamp hasn't changed since last push, use the same random number, except incremented by 1.
for ($i = 11; $i >= 0; $i--) {
$previousIncremented = false;
for ($j = $i; $j > 0; $j--) {
if (!$previousIncremented) {
static::$lastRandChars[$j]++;
}
if (static::$lastRandChars[$j] == 64) {
static::$lastRandChars[$j] = 0;
static::$lastRandChars[$j - 1]++;
$previousIncremented = true;
} else {
break 2;
}
}
}
I use yii2. And I need to find IP which is not used (is not in database) by method getFreeIPAddress. I have class like this:
class Radreply extends ActiveRecord {
const ATTRIBUTE_DEFAULT_IP_ADDRESS = 'Framed-IP-Address';
const IP_ADDRESS_MAX = '10.255.255.255'; // max value for IP
const IP_ADDRESS_MIN = '10.0.0.11'; // min value for IP
public function getIntegerIP(){ // converts IP from string to integer format
return ip2long($this->value);
}
public static function getFreeIPAddress(){
$records = self::findAll(['attribute'=>self::ATTRIBUTE_DEFAULT_IP_ADDRESS]); // get all record which contain IP address
$existIPs = ArrayHelper::getColumn($records,'integerIP'); // get array of IP which is converted to integer by method getIntegerIP
for ($integerIP = ip2long(self::IP_ADDRESS_MIN); $integerIP<=ip2long(self::IP_ADDRESS_MAX); $integerIP++){
// increasing one by one IP address in integer format from value IP_ADDRESS_MIN to value IP_ADDRESS_MAX
if (!in_array($integerIP, $existIPs)){
$stringIP = long2ip($integerIP);
$arrayDigits = explode('.', $stringIP);
$lastDigit = array_pop($arrayDigits);
if ($lastDigit!='0'){ // check if last digit of IP is not 0
return $stringIP;
}
}
}
return '';
}
}
Method getFreeIPAddress works find, but in db there are a lot of records with IP and increasing one by one IP and checking if this IP exist in db is very long way. How I can optimize this algorithm? Is there faster way to get unused IP?
I think, I've found better solution without extra table in database
class Radreply extends ActiveRecord {
const ATTRIBUTE_DEFAULT_IP_ADDRESS = 'Framed-IP-Address';
const IP_ADDRESS_MAX = '10.255.255.255'; // max value for IP
const IP_ADDRESS_MIN = '10.0.0.11'; // min value for IP
public function getIntegerIP(){ // converts IP from string to integer format
return ip2long($this->value);
}
public static function getFreeIPAddress(){
$records = self::findAll(['attribute'=>self::ATTRIBUTE_DEFAULT_IP_ADDRESS]); // gets all record which contain IP address
$existIPs = ArrayHelper::getColumn($records,'integerIP'); // gets array of IP which is converted to integer by method getIntegerIP
$intIpAddressMin = ip2long(self::IP_ADDRESS_MIN); // gets min IP in integer format
$endRange = empty($existIPs) ? $intIpAddressMin : max($existIPs); // checks if at least one IP is used
$availableIPs = range( $intIpAddressMin, $endRange + 2); // generates array with available IP addresses (+2 because next address can be with last digit 0)
$missingIPs = array_diff($availableIPs,$existIPs); // removes all used IP
foreach ($missingIPs as $value){
$lastDigit = $value % 256;
if ($lastDigit != 0){
return long2ip($value);
}
}
return '';
}
}
bool in_array ( mixed $needle , array $haystack [, bool $strict = FALSE ] )
In my opinion,you can set strict true .
my php code with strict = false
<?php
$y="1800";
$x = array();
for($j=0;$j<50000;$j++){
$x[]= "{$j}";
}
for($i=0;$i<30000;$i++){
if(in_array($y,$x)){
continue;
}
}
time php test.php
real 0m4.418s
user 0m4.404s
sys 0m0.012s
when strict is true
for($i=0;$i<30000;$i++){
if(in_array($y,$x ,true)){
continue;
}
}
time php test.php
real 0m1.548s
user 0m1.540s
sys 0m0.004s
what‘s more ,if you can get the used ip with ascending order . you can get the o(m+n) time complexity,which m is the length of all ip you should try , n is the length of all ip in db with merge algorithm .
if you can get the used ip with ascending order
in Pseudocode .
tmpIp = minIp;
while(temIp <= maxIp){
if( dbIsEmpty){
break;
}
dbIp =getNextFromDb();
while(temIp < dbIp){
printf temIp ;
temIp ++;
}
temIp ++;
}
while(temIp <= maxIp){
printf temIp ;
temIp++;
}
here is my php code, where i repalce echo ip by $count++;
In this demo there is about 80000 ip with type of long
<?php
function mergeSort( $result){
$minIp = ip2long('10.0.0.11') ;
$maxIp = ip2long('10.255.255.255');
$count =0;
$tmpIp = $minIp;
while($temIp <= $maxIp){
if( empty($result)){
break ;
}
$tmp = array_pop($result);
$dbIp =$tmp['ip'];
while($temIp < $dbIp){
// echo temIp ;
// i repalce it by count ++ , i don't want it
//full my teminal .
$count ++;
$temIp ++;
}
$temIp ++;
}
while($temIp <= $maxIp){
//echo $temIp ; replace by $count++
$count ++;
$temIp++;
}
return $count -1;
}
$servername = "localhost";
$username = "root";
$password = "aaaaa";
$dbname = "IP";
$conn = new PDO('mysql:host=' . $servername . ';dbname=' . $dbname , $username, $password);
$conn->setAttribute(PDO::ATTR_AUTOCOMMIT , true);
$stmt = $conn->prepare("select * from ipTable order by ip desc");
$stmt->execute();
$result = $stmt->fetchAll();
$count = mergeSort($result);
echo $count ;
?>
it take about 10s ;
time php test.php
184460881
real 0m10.626s
user 0m10.416s
sys 0m0.168s
I have a PHP function I got from the web that uses bcmath functions:
function SteamID64to32($steamId64) {
$iServer = "1";
if(bcmod($steamId64, "2") == "0") {
$iServer = "0";
}
$steamId64 = bcsub($steamId64,$iServer);
if(bccomp("76561197960265728",$steamId64) == -1) {
$steamId64 = bcsub($steamId64,"76561197960265728");
}
$steamId64 = bcdiv($steamId64, "2");
return ("STEAM_0:" . $iServer . ":" . $steamId64);
}
For any valid input the function will at random times add ".0000000000" to the output.
Ex:
$input = 76561198014791430;
SteamID64to32($input);
//result for each run
STEAM_0:0:27262851
STEAM_0:0:27262851.0000000000
STEAM_0:0:27262851
STEAM_0:0:27262851
STEAM_0:0:27262851.0000000000
STEAM_0:0:27262851
STEAM_0:0:27262851.0000000000
STEAM_0:0:27262851.0000000000
STEAM_0:0:27262851.0000000000
This was tested on 2 different nginx servers running php-fpm.
Please help me understand what is wrong here. Thanks
The answer provided by JD doesn't account for all possible STEAM_ ID types, namely, anything that is in the STEAM_0:1 range. This block of code will:
<?php
function Steam64ToSteam32($Steam64ID)
{
$offset = $Steam64ID - 76561197960265728;
$id = ($offset / 2);
if($offset % 2 != 0)
{
$Steam32ID = 'STEAM_0:1:' . bcsub($id, '0.5');
}
else
{
$Steam32ID = "STEAM_0:0:" . $id;
}
return $Steam32ID;
}
echo Steam64ToSteam32(76561197960435530);
echo "<br/>";
echo Steam64ToSteam32(76561197990323733);
echo "<br/>";
echo Steam64ToSteam32(76561198014791430);
?>
This outputs
STEAM_0:0:84901
STEAM_0:1:15029002
STEAM_0:0:27262851
The first one is for a Valve employee and someone who's had a Steam account since 2003 (hence the low ID), the second is a random profile I found on VACBanned which had an ID in the STEAM_0:1 range. The third is the ID you provided in your sample code.
I found this: SteamProfile
/**
* This file is part of SteamProfile.
*
* Written by Nico Bergemann <barracuda415#yahoo.de>
* Copyright 2009 Nico Bergemann
*
* SteamProfile is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* SteamProfile is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with SteamProfile. If not, see <http://www.gnu.org/licenses/>.
*/
// thanks to voogru for the id transformation algorithm (http://forums.alliedmods.net/showthread.php?t=60899)
class SteamID {
private $sSteamID = '';
private $sSteamComID = '';
const STEAMID64_BASE = '76561197960265728';
public function __construct($sID) {
// make sure the bcmath extension is loaded
if(!extension_loaded('bcmath')) {
throw new RuntimeException("BCMath extension required");
}
if($this->isValidSteamID($sID)) {
$this->sSteamID = $sID;
$this->sSteamComID = $this->convertToSteamComID($sID);
} elseif($this->isValidComID($sID)) {
$this->sSteamID = $this->convertToSteamID($sID);
$this->sSteamComID = $sID;
} else {
$this->sSteamID = '';
$this->sSteamComID = '';
}
}
public function getSteamID() {
return $this->sSteamID;
}
public function getSteamComID() {
return $this->sSteamComID;
}
public function isValid() {
return $this->sSteamID != '';
}
private function isValidSteamID($sSteamID) {
return preg_match('/^(STEAM_)?[0-5]:[0-9]:\d+$/i', $sSteamID);
}
private function isValidComID($sSteamComID) {
// anything else than a number is invalid
// (is_numeric() doesn't work for 64 bit integers)
if(!preg_match('/^\d+$/i', $sSteamComID)) {
return false;
}
// the community id must be bigger than STEAMID64_BASE
if(bccomp(self::STEAMID64_BASE, $sSteamComID) == 1) {
return false;
}
// TODO: Upper limit?
return true;
}
private function convertToSteamComID($sSteamID) {
$aTMP = explode(':', $sSteamID);
$sServer = $aTMP[1];
$sAuth = $aTMP[2];
if((count($aTMP) == 3) && $sAuth != '0' && is_numeric($sServer) && is_numeric($sAuth)) {
$sComID = bcmul($sAuth, "2"); // multipy Auth-ID with 2
$sComID = bcadd($sComID, $sServer); // add Server-ID
$sComID = bcadd($sComID, self::STEAMID64_BASE); // add this odd long number
// It seems that PHP appends ".0000000000" at the end sometimes.
// I can't find a reason for this, so I'll take the dirty way...
$sComID = str_replace('.0000000000', '', $sComID);
return $sComID;
} else {
throw new RuntimeException("Unable to convert Steam-ID");
}
}
private function convertToSteamID($sSteamComID) {
$sServer = bcmod($sSteamComID, '2') == '0' ? '0' : '1';
$sCommID = bcsub($sSteamComID, $sServer);
$sCommID = bcsub($sCommID, self::STEAMID64_BASE);
$sAuth = bcdiv($sCommID, '2');
return "STEAM_0:$sServer:$sAuth";
}
}
I have to convert .DBF and .FPT files from Visual FoxPro to MySQL. Right now my script works for .DBF files, it opens and reads them with dbase_open() and dbase_get_record_with_names() and then executes the MySQL INSERT commands.
However, some fields of these .DBF files are of type MEMO and therefore stored in a separate files ending in .FPT. How do I read this file?
I have found the specifications of this filetype in MSDN, but I don't know how I can read this file byte-wise with PHP (also, I would really prefer a simplier solution).
Any ideas?
Alright, I have carefully studied the MSDN specifications of DBF and FPT file structures and the outcome is a beautiful PHP class which can open a DBF and (optional) an FPT memo file at the same time. This class will give you record after record and thereby fetch any memos from the memo file - if opened.
class Prodigy_DBF {
private $Filename, $DB_Type, $DB_Update, $DB_Records, $DB_FirstData, $DB_RecordLength, $DB_Flags, $DB_CodePageMark, $DB_Fields, $FileHandle, $FileOpened;
private $Memo_Handle, $Memo_Opened, $Memo_BlockSize;
private function Initialize() {
if($this->FileOpened) {
fclose($this->FileHandle);
}
if($this->Memo_Opened) {
fclose($this->Memo_Handle);
}
$this->FileOpened = false;
$this->FileHandle = NULL;
$this->Filename = NULL;
$this->DB_Type = NULL;
$this->DB_Update = NULL;
$this->DB_Records = NULL;
$this->DB_FirstData = NULL;
$this->DB_RecordLength = NULL;
$this->DB_CodePageMark = NULL;
$this->DB_Flags = NULL;
$this->DB_Fields = array();
$this->Memo_Handle = NULL;
$this->Memo_Opened = false;
$this->Memo_BlockSize = NULL;
}
public function __construct($Filename, $MemoFilename = NULL) {
$this->Prodigy_DBF($Filename, $MemoFilename);
}
public function Prodigy_DBF($Filename, $MemoFilename = NULL) {
$this->Initialize();
$this->OpenDatabase($Filename, $MemoFilename);
}
public function OpenDatabase($Filename, $MemoFilename = NULL) {
$Return = false;
$this->Initialize();
$this->FileHandle = fopen($Filename, "r");
if($this->FileHandle) {
// DB Open, reading headers
$this->DB_Type = dechex(ord(fread($this->FileHandle, 1)));
$LUPD = fread($this->FileHandle, 3);
$this->DB_Update = ord($LUPD[0])."/".ord($LUPD[1])."/".ord($LUPD[2]);
$Rec = unpack("V", fread($this->FileHandle, 4));
$this->DB_Records = $Rec[1];
$Pos = fread($this->FileHandle, 2);
$this->DB_FirstData = (ord($Pos[0]) + ord($Pos[1]) * 256);
$Len = fread($this->FileHandle, 2);
$this->DB_RecordLength = (ord($Len[0]) + ord($Len[1]) * 256);
fseek($this->FileHandle, 28); // Ignoring "reserved" bytes, jumping to table flags
$this->DB_Flags = dechex(ord(fread($this->FileHandle, 1)));
$this->DB_CodePageMark = ord(fread($this->FileHandle, 1));
fseek($this->FileHandle, 2, SEEK_CUR); // Ignoring next 2 "reserved" bytes
// Now reading field captions and attributes
while(!feof($this->FileHandle)) {
// Checking for end of header
if(ord(fread($this->FileHandle, 1)) == 13) {
break; // End of header!
} else {
// Go back
fseek($this->FileHandle, -1, SEEK_CUR);
}
$Field["Name"] = trim(fread($this->FileHandle, 11));
$Field["Type"] = fread($this->FileHandle, 1);
fseek($this->FileHandle, 4, SEEK_CUR); // Skipping attribute "displacement"
$Field["Size"] = ord(fread($this->FileHandle, 1));
fseek($this->FileHandle, 15, SEEK_CUR); // Skipping any remaining attributes
$this->DB_Fields[] = $Field;
}
// Setting file pointer to the first record
fseek($this->FileHandle, $this->DB_FirstData);
$this->FileOpened = true;
// Open memo file, if exists
if(!empty($MemoFilename) and file_exists($MemoFilename) and preg_match("%^(.+).fpt$%i", $MemoFilename)) {
$this->Memo_Handle = fopen($MemoFilename, "r");
if($this->Memo_Handle) {
$this->Memo_Opened = true;
// Getting block size
fseek($this->Memo_Handle, 6);
$Data = unpack("n", fread($this->Memo_Handle, 2));
$this->Memo_BlockSize = $Data[1];
}
}
}
return $Return;
}
public function GetNextRecord($FieldCaptions = false) {
$Return = NULL;
$Record = array();
if(!$this->FileOpened) {
$Return = false;
} elseif(feof($this->FileHandle)) {
$Return = NULL;
} else {
// File open and not EOF
fseek($this->FileHandle, 1, SEEK_CUR); // Ignoring DELETE flag
foreach($this->DB_Fields as $Field) {
$RawData = fread($this->FileHandle, $Field["Size"]);
// Checking for memo reference
if($Field["Type"] == "M" and $Field["Size"] == 4 and !empty($RawData)) {
// Binary Memo reference
$Memo_BO = unpack("V", $RawData);
if($this->Memo_Opened and $Memo_BO != 0) {
fseek($this->Memo_Handle, $Memo_BO[1] * $this->Memo_BlockSize);
$Type = unpack("N", fread($this->Memo_Handle, 4));
if($Type[1] == "1") {
$Len = unpack("N", fread($this->Memo_Handle, 4));
$Value = trim(fread($this->Memo_Handle, $Len[1]));
} else {
// Pictures will not be shown
$Value = "{BINARY_PICTURE}";
}
} else {
$Value = "{NO_MEMO_FILE_OPEN}";
}
} else {
$Value = trim($RawData);
}
if($FieldCaptions) {
$Record[$Field["Name"]] = $Value;
} else {
$Record[] = $Value;
}
}
$Return = $Record;
}
return $Return;
}
function __destruct() {
// Cleanly close any open files before destruction
$this->Initialize();
}
}
The class can be used like this:
$Test = new Prodigy_DBF("customer.DBF", "customer.FPT");
while(($Record = $Test->GetNextRecord(true)) and !empty($Record)) {
print_r($Record);
}
It might not be an almighty perfect class, but it works for me. Feel free to use this code, but note that the class is VERY tolerant - it doesn't care if fread() and fseek() return true or anything else - so you might want to improve it a bit before using.
Also note that there are many private variables like number of records, recordsize etc. which are not used at the moment.
I replaced this code:
fseek($this->Memo_Handle, (trim($RawData) * $this->Memo_BlockSize)+8);
$Value = trim(fread($this->Memo_Handle, $this->Memo_BlockSize));
with this code:
fseek($this->Memo_Handle, (trim($RawData) * $this->Memo_BlockSize)+4);
$Len = unpack("N", fread($this->Memo_Handle, 4));
$Value = trim(fread($this->Memo_Handle, $Len[1] ));
this helped for me
Although not PHP, VFP is 1-based references and I think PHP is zero-based references so you'll have to decypher and adjust accordingly, but this works and hopefully you'll be able
to post your version of this portion when finished.
FILETOSTR() in VFP will open a file, and read the entire content into
a single memory variable as a character string -- all escape keys, high byte characters, etc, intact. You'll probably need to rely on an FOPEN(), FSEEK(), FCLOSE(), etc.
MemoTest.FPT was my sample memo table/file
fpt1 = FILETOSTR( "MEMOTEST.FPT" )
First, you'll have to detect the MEMO BLOCK SIZE used when the file was created. Typically this would be 64 BYTES, but per the link you had in your post.
The header positions 6-7 identify the size (VFP, positions 7 and 8). The first byte is the high-order
nBlockSize = ASC( SUBSTR( fpt1, 7, 1 )) * 256 + ASC( SUBSTR( fpt1, 8, 1 ))
Now, at your individual records. Wherever in your DBF structure has the memo FIELD (and you can have many per single record structure) there will be 4 bytes. In THE RECORD field, it identifies the "block" in the memo file where the content is stored.
MemoBytes = 4 bytes at your identified field location. These will be stored as ASCII from 0-255. This field is stored with the FIRST byte as low-order and the 4th byte as 256^3 = 16777216. The first "Block" ever used will be starting in position offset of 512 per the memo .fpt file spec that the header takes up positions 0-511.
So, if your first memo field has a content of "8000" where the 8 is the actual 0x08, not number "8" which is 0x38, and the zeros are 0x00.
YourMemoField = "8000" (actually use ascii, but for readability showing hex expected value)
First Byte is ASCII value * 1 ( 256 ^ 0 )
Second Byte is ASCII value * 256 (256 ^ 1)
Third Byte is ASCII value * 65536 (256 ^ 2)
Fourth Byte is ASCII value * 16777216 (256 ^ 3)
nMemoBlock = byte1 + ( byte2 * 256 ) + ( byte3 * 65536 ) + ( byte4 * 16777216 )
Now, you'll need to FSEEK() to the
FSEEK( handle, nMemoBlock * nBlockSize +1 )
for the first byte of the block you are looking for. This will point to the BLOCK header. In this case, per the spec, the first 4 bytes identify the Block SIGNATURE, the second 4 bytes is the length of the content. For these two, the bytes are stored with HIGH-BYTE first.
From your FSEEK(), its REVERSE of the nMemoBlock above with the high-byte. The "Byte1-4" here are from your FSEEK() position
nSignature = ( byte1 * 16777216 ) + ( byte2 * 65536 ) + ( byte3 * 256 ) + byte4
nMemoLength = ( byte5 * 16777216 ) + ( byte6 * 65536 ) + ( byte7 * 256 ) + byte8
Now, FSEEK() to the 9th byte (1st actual character of the data AFTER the 8 bytes of the header you just read for signature and memo length). This is the beginning of your data.
Now, read the rest of the content...
FSEEK() +9 characters to new position
cFinalMemoData = FREAD( handle, nMemoLength )
I know this isn't perfect, nor PHP script, but its enough of pseudo-code on hOW things are stored and hopefully gets you WELL on your way.
Again, PLEASE take into consideration as you are stepping through your debug process to ensure 0 or 1 offset basis. To help simplify and test this, I created a simple .DBF with 2 fields... a character field and a memo field, added a few records and some basic content to confirm all content, positions, etc.
I think it's unlikely there are FoxPro libraries in PHP.
You may have to code it from scratch. For byte-wise reading, meet fopen() fread() and colleagues.
Edit: There seems to be a Visual FoxPro ODBC driver. You may be able to connect to a FoxPro database through PDO and that connector. How the chances of success are, and how much work it would be, I don't know.
The FPT file contains memo data. In the DBF you have columns of type memo, and the information in this column is a pointer to the entry in the FPT file.
If you are querying the data from the table you only have to reference the memo column to get the data. You do not need to parse the data out of the FPT file separately. The OLE DB driver (or the ODBC driver if your files are VFP 6 or earlier) should just give you this information.
There is a tool that will automatically migrate your Visual FoxPro data to MySQL. You might want to check it out to see if you can save some time:
Go to http://leafe.com/dls/vfp
and search for "Stru2MySQL_2" for the tool to migrate the structures of the data and "VFP2MySQL Data Upload program" for tools to help with the migration.
Rick Schummer VFP MVP
You also might want to check the PHP dbase libraries.They work quite well with DBF files.
<?
class Prodigy_DBF {
private $Filename, $DB_Type, $DB_Update, $DB_Records, $DB_FirstData, $DB_RecordLength, $DB_Flags, $DB_CodePageMark, $DB_Fields, $FileHandle, $FileOpened;
private $Memo_Handle, $Memo_Opened, $Memo_BlockSize;
private function Initialize() {
if($this->FileOpened) {
fclose($this->FileHandle);
}
if($this->Memo_Opened) {
fclose($this->Memo_Handle);
}
$this->FileOpened = false;
$this->FileHandle = NULL;
$this->Filename = NULL;
$this->DB_Type = NULL;
$this->DB_Update = NULL;
$this->DB_Records = NULL;
$this->DB_FirstData = NULL;
$this->DB_RecordLength = NULL;
$this->DB_CodePageMark = NULL;
$this->DB_Flags = NULL;
$this->DB_Fields = array();
$this->Memo_Handle = NULL;
$this->Memo_Opened = false;
$this->Memo_BlockSize = NULL;
}
public function __construct($Filename, $MemoFilename = NULL) {
$this->Prodigy_DBF($Filename, $MemoFilename);
}
public function Prodigy_DBF($Filename, $MemoFilename = NULL) {
$this->Initialize();
$this->OpenDatabase($Filename, $MemoFilename);
}
public function OpenDatabase($Filename, $MemoFilename = NULL) {
$Return = false;
$this->Initialize();
$this->FileHandle = fopen($Filename, "r");
if($this->FileHandle) {
// DB Open, reading headers
$this->DB_Type = dechex(ord(fread($this->FileHandle, 1)));
$LUPD = fread($this->FileHandle, 3);
$this->DB_Update = ord($LUPD[0])."/".ord($LUPD[1])."/".ord($LUPD[2]);
$Rec = unpack("V", fread($this->FileHandle, 4));
$this->DB_Records = $Rec[1];
$Pos = fread($this->FileHandle, 2);
$this->DB_FirstData = (ord($Pos[0]) + ord($Pos[1]) * 256);
$Len = fread($this->FileHandle, 2);
$this->DB_RecordLength = (ord($Len[0]) + ord($Len[1]) * 256);
fseek($this->FileHandle, 28); // Ignoring "reserved" bytes, jumping to table flags
$this->DB_Flags = dechex(ord(fread($this->FileHandle, 1)));
$this->DB_CodePageMark = ord(fread($this->FileHandle, 1));
fseek($this->FileHandle, 2, SEEK_CUR); // Ignoring next 2 "reserved" bytes
// Now reading field captions and attributes
while(!feof($this->FileHandle)) {
// Checking for end of header
if(ord(fread($this->FileHandle, 1)) == 13) {
break; // End of header!
} else {
// Go back
fseek($this->FileHandle, -1, SEEK_CUR);
}
$Field["Name"] = trim(fread($this->FileHandle, 11));
$Field["Type"] = fread($this->FileHandle, 1);
fseek($this->FileHandle, 4, SEEK_CUR); // Skipping attribute "displacement"
$Field["Size"] = ord(fread($this->FileHandle, 1));
fseek($this->FileHandle, 15, SEEK_CUR); // Skipping any remaining attributes
$this->DB_Fields[] = $Field;
}
// Setting file pointer to the first record
fseek($this->FileHandle, $this->DB_FirstData);
$this->FileOpened = true;
// Open memo file, if exists
if(!empty($MemoFilename) and file_exists($MemoFilename) and preg_match("%^(.+).fpt$%i", $MemoFilename)) {
$this->Memo_Handle = fopen($MemoFilename, "r");
if($this->Memo_Handle) {
$this->Memo_Opened = true;
// Getting block size
fseek($this->Memo_Handle, 6);
$Data = unpack("n", fread($this->Memo_Handle, 2));
$this->Memo_BlockSize = $Data[1];
}
}
}
return $Return;
}
public function GetNextRecord($FieldCaptions = false) {
$Return = NULL;
$Record = array();
if(!$this->FileOpened) {
$Return = false;
} elseif(feof($this->FileHandle)) {
$Return = NULL;
} else {
// File open and not EOF
fseek($this->FileHandle, 1, SEEK_CUR); // Ignoring DELETE flag
foreach($this->DB_Fields as $Field) {
$RawData = fread($this->FileHandle, $Field["Size"]);
// Checking for memo reference
if($Field["Type"] == "M" and $Field["Size"] == 4 and !empty($RawData)) {
// Binary Memo reference
$Memo_BO = unpack("V", $RawData);
if($this->Memo_Opened and $Memo_BO != 0) {
fseek($this->Memo_Handle, $Memo_BO[1] * $this->Memo_BlockSize);
$Type = unpack("N", fread($this->Memo_Handle, 4));
if($Type[1] == "1") {
$Len = unpack("N", fread($this->Memo_Handle, 4));
$Value = trim(fread($this->Memo_Handle, $Len[1]));
} else {
// Pictures will not be shown
$Value = "{BINARY_PICTURE}";
}
} else {
$Value = "{NO_MEMO_FILE_OPEN}";
}
} else {
if($Field["Type"] == "M"){
if(trim($RawData) > 0) {
fseek($this->Memo_Handle, (trim($RawData) * $this->Memo_BlockSize)+8);
$Value = trim(fread($this->Memo_Handle, $this->Memo_BlockSize));
}
}else{
$Value = trim($RawData);
}
}
if($FieldCaptions) {
$Record[$Field["Name"]] = $Value;
} else {
$Record[] = $Value;
}
}
$Return = $Record;
}
return $Return;
}
function __destruct() {
// Cleanly close any open files before destruction
$this->Initialize();
}
}
?>
I made a couple related threads but this is the one direct question that I'm seeking the answer for. My framework will use Zend_Translate if the php version is 5, otherwise I have to mimic the functionality for 4.
It seems that pretty much every implementation of gettext relies on setlocale or locales, I know there's a LOT of inconsistency across systems which is why I don't want to rely upon it.
I've tried a couple times to get the textdomain, bindtextdomain and gettext functions to work but I've always needed to invoke setlocale.
By the way, all the .mo files will be UTF-8.
Here's some reusable code to parse MO files in PHP, based on Zend_Translate_Adapter_Gettext:
<?php
class MoParser {
private $_bigEndian = false;
private $_file = false;
private $_data = array();
private function _readMOData($bytes)
{
if ($this->_bigEndian === false) {
return unpack('V' . $bytes, fread($this->_file, 4 * $bytes));
} else {
return unpack('N' . $bytes, fread($this->_file, 4 * $bytes));
}
}
public function loadTranslationData($filename, $locale)
{
$this->_data = array();
$this->_bigEndian = false;
$this->_file = #fopen($filename, 'rb');
if (!$this->_file) throw new Exception('Error opening translation file \'' . $filename . '\'.');
if (#filesize($filename) < 10) throw new Exception('\'' . $filename . '\' is not a gettext file');
// get Endian
$input = $this->_readMOData(1);
if (strtolower(substr(dechex($input[1]), -8)) == "950412de") {
$this->_bigEndian = false;
} else if (strtolower(substr(dechex($input[1]), -8)) == "de120495") {
$this->_bigEndian = true;
} else {
throw new Exception('\'' . $filename . '\' is not a gettext file');
}
// read revision - not supported for now
$input = $this->_readMOData(1);
// number of bytes
$input = $this->_readMOData(1);
$total = $input[1];
// number of original strings
$input = $this->_readMOData(1);
$OOffset = $input[1];
// number of translation strings
$input = $this->_readMOData(1);
$TOffset = $input[1];
// fill the original table
fseek($this->_file, $OOffset);
$origtemp = $this->_readMOData(2 * $total);
fseek($this->_file, $TOffset);
$transtemp = $this->_readMOData(2 * $total);
for($count = 0; $count < $total; ++$count) {
if ($origtemp[$count * 2 + 1] != 0) {
fseek($this->_file, $origtemp[$count * 2 + 2]);
$original = #fread($this->_file, $origtemp[$count * 2 + 1]);
$original = explode("\0", $original);
} else {
$original[0] = '';
}
if ($transtemp[$count * 2 + 1] != 0) {
fseek($this->_file, $transtemp[$count * 2 + 2]);
$translate = fread($this->_file, $transtemp[$count * 2 + 1]);
$translate = explode("\0", $translate);
if ((count($original) > 1) && (count($translate) > 1)) {
$this->_data[$locale][$original[0]] = $translate;
array_shift($original);
foreach ($original as $orig) {
$this->_data[$locale][$orig] = '';
}
} else {
$this->_data[$locale][$original[0]] = $translate[0];
}
}
}
$this->_data[$locale][''] = trim($this->_data[$locale]['']);
unset($this->_data[$locale]['']);
return $this->_data;
}
}
Ok, I basically ended up writing a mo file parser based on Zend's Gettext Adapter, as far as I know gettext is pretty much reliant upon the locale, so manually parsing the .mo file would save the hassle of running into odd circumstances with locale issues with setlocale. I also plan on parsing the Zend Locale data provided in the form of xml files.