I want to get a table from the database and then split the output data into files (50 entries in each file): list01.txt, list02.txt... But somehow I got stuck on the question of how to split the data more effectively.
if ( $result = $mysqli->query($query) ) {
    icount = 0;
    while ( $row = mysqli_fetch_array($result) ) {
        if ( icount % 50 == 0 ) {
            $snum = int( icount / 50 );
            $filename = 'scripts/spisok'.$snum.'.txt';
            $handle = fopen( $filename, 'w' );
        }
        fwrite( $filename, $row['uname'].';'.$row['email'].'<br />' );
        icount++;
    }
    echo 'ok';
    $result->free();
}
Can I just break $result into 50-entry arrays first and then write them all? Sorry, I'm a novice at PHP.
You can also use array_chunk
<?php
if ( $result = $mysqli->query( $query ) ) {
    $data = array();
    while( $row = mysqli_fetch_array( $result ) ) {
        $data[] = $row['uname'].';'.$row['email'];
    }
    $result->free();
    // split the data into chunks of 50 entries each
    $chunks = array_chunk( $data, 50 );
    // loop through chunks
    foreach( $chunks as $index => $chunk ) {
        $file = 'scripts/spisok'.( $index + 1 ).'.txt';
        $chunk = implode( "<br />", $chunk );
        file_put_contents( $file, $chunk );
        // or
        /*
        $handle = fopen( $file, 'w' );
        fwrite( $handle, $chunk );
        fclose( $handle );
        */
    }
    unset( $data, $chunks );
}
?>
There is nothing particularly wrong with what you are doing, apart from a few syntax errors here and there.
Try this:
if ( $result = $mysqli->query($query) ) {
    $icount = 0;
    $handle = NULL;
    while ( $row = mysqli_fetch_array($result) ) {
        if ( $icount % 50 == 0 ) {
            if ( $handle !== NULL ) {
                fclose($handle);
            }
            $snum = intval( $icount / 50 );
            $filename = 'scripts/spisok'.$snum.'.txt';
            $handle = fopen( $filename, 'w' );
        }
        fwrite( $handle, $row['uname'].';'.$row['email'].'<br />' );
        $icount++;
    }
    if ( $handle !== NULL ) {
        fclose($handle);
    }
    echo 'ok';
    $result->free();
}
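If you want the generated files to match the list01.txt, list02.txt naming from the question, sprintf() can zero-pad the number. A minimal sketch, assuming the same $icount counter as above; inside the if ( $icount % 50 == 0 ) block you could instead write:

$snum = intval( $icount / 50 ) + 1;                    // start numbering at 1
$filename = sprintf( 'scripts/list%02d.txt', $snum );  // scripts/list01.txt, scripts/list02.txt, ...
$handle = fopen( $filename, 'w' );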
I create a secure websocket server with stream_socket_server.
I'd like to have an accept function that sequentially gets the connections and performs the handshake as a blocking call. With Chrome it connects and then disconnects, and with Firefox it reports an error with fread.
The snippet is:
$write = null;
$except = null;
$sockets = $this->socket;
@stream_select( $sockets, $write, $except, 0 );
foreach( $sockets as $socket ) {
    $resource = @stream_socket_accept( $socket );
    if (!$resource) {
        return false;
    }
    $accepts = array( $resource );
    @stream_select( $accepts, $write, $except, 0, 1000 );
    foreach( $accepts as $accepted ) {
        $buffer = '';
        $bytes_to_read = 8192;
        while ( $chunk = fread( $accepted, $bytes_to_read ) ) {
            $buffer .= $chunk;
            $status = stream_get_meta_data( $accepted );
            $bytes_to_read = $status[ "unread_bytes" ];
            if ( strlen( $buffer ) === 1 )
                $bytes_to_read = 8192;
        }
        $response = HandShake::perform( $buffer );
        $responseLength = strlen( $response );
        for( $written = 0; $written < $responseLength; $written += $fwrite ) {
            $fwrite = fwrite( $accepted, substr( $response, $written ) );
            if ( ( $fwrite === false ) || ( $fwrite === 0 ) ) {
                stream_socket_shutdown( $accepted, STREAM_SHUT_RDWR );
                return false;
            }
        }
    }
}
Can you tell me why it doesn't work and how to fix my snippet?
I have a number of DBF database files that I would like to convert to CSVs. Is there a way to do this in Linux, or in PHP?
I've found a few methods to convert DBFs, but they are very slow.
Try soffice (LibreOffice):
$ soffice --headless --convert-to csv FILETOCONVERT.DBF
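If there are many files, the same command can be scripted from PHP. A minimal sketch, assuming soffice is on the PATH and the DBF files live in /path/to (both of these are assumptions, adjust for your setup):

// loop over every DBF file and let LibreOffice convert it in headless mode
foreach ( glob( '/path/to/*.DBF' ) as $file ) {
    $cmd = 'soffice --headless --convert-to csv --outdir '
         . escapeshellarg( dirname( $file ) ) . ' '
         . escapeshellarg( $file );
    echo shell_exec( $cmd );
}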
Change the $files variable to point at your DBF files. Make sure the file extension in the glob pattern matches the case of your files.
set_time_limit( 24192000 );
ini_set( 'memory_limit', '-1' );
$files = glob( '/path/to/*.DBF' );
foreach( $files as $file )
{
    echo "Processing: $file\n";
    $fileParts = explode( '/', $file );
    $endPart = $fileParts[key( array_slice( $fileParts, -1, 1, true ) )];
    $csvFile = preg_replace( '~\.[a-z]+$~i', '.csv', $endPart );
    if( !$dbf = dbase_open( $file, 0 ) ) die( "Could not connect to: $file" );
    $num_rec = dbase_numrecords( $dbf );
    $num_fields = dbase_numfields( $dbf );
    $fields = array();
    $out = '';
    for( $i = 1; $i <= $num_rec; $i++ )
    {
        $row = @dbase_get_record_with_names( $dbf, $i );
        $firstKey = key( array_slice( $row, 0, 1, true ) );
        foreach( $row as $key => $val )
        {
            if( $key == 'deleted' ) continue;
            if( $firstKey != $key ) $out .= ';';
            $out .= trim( $val );
        }
        $out .= "\n";
    }
    file_put_contents( $csvFile, $out );
}
Using @Kohjah's code, here is an update that uses a (IMHO) better fputcsv approach:
// needs dbase php extension (http://php.net/manual/en/book.dbase.php)
function dbfToCsv($file)
{
    $output_path = 'output' . DIRECTORY_SEPARATOR . 'path';
    $path_parts = pathinfo($file);
    $csvFile = $path_parts['filename'] . '.csv';
    $output_path_file = $output_path . DIRECTORY_SEPARATOR . $csvFile;
    if (!$dbf = dbase_open( $file, 0 )) {
        return false;
    }
    $num_rec = dbase_numrecords( $dbf );
    $fp = fopen($output_path_file, 'w');
    for( $i = 1; $i <= $num_rec; $i++ ) {
        $row = dbase_get_record_with_names( $dbf, $i );
        if ($i == 1) {
            // print header
            fputcsv($fp, array_keys($row));
        }
        fputcsv($fp, $row);
    }
    fclose($fp);
    dbase_close($dbf); // release the DBF handle
}
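A possible usage sketch for the function above, assuming the dbase extension is loaded and the output/path directory already exists (the glob pattern is just an example):

// convert every DBF file in a directory
foreach ( glob( '/path/to/*.DBF' ) as $dbfFile ) {
    if ( dbfToCsv( $dbfFile ) === false ) {
        echo "Could not open: $dbfFile\n";
    }
}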
I have a website I am trying to maintain for a project:
http://uomtwittersearch-jbon0041.rhcloud.com/
The user connects to Twitter through the application and authenticates using the twitteroauth library (by abraham). The process works fine up until it lands on index.php (which includes index.inc as the respective HTML page), where it gives me a blank page. On localhost it works perfectly fine, so I am not sure what could be causing this. Other pages such as connect.php initialize as required.
Visiting the website as it is will give an error, and I am assuming that is because it cannot find index.php directly, since it lies in the folder twitteroauth-master. I will fix this once I manage to at least make the contents of index.php appear, but for now I am visiting:
http://uomtwittersearch-jbon0041.rhcloud.com/twitteroauth-master/connect.php
first, and this also goes for anyone who would like to visit it. If you have Twitter, log on with your details; this will move you to index.php, which will be blank. Other than that, one can simply replace 'connect' with 'index' in the URL.
What could be causing the blank page for index.php?
This is only my first ever web development project, so I am not sure if this is something obvious. Moreover, I am using OpenShift for hosting.
EDIT --------------------
This is my index.php script. Again, the script works without any problems on localhost.
<?php
//session_save_path(home/users/web/b2940/ipg.uomtwittersearchnet/cgi-bin/tmp);
ini_set('display_errors',1);
error_reporting(E_ALL);
session_start ();
require_once ('twitteroauth/twitteroauth.php');
require_once ('config.php');
include ('nlp/stop_words.php');
include ('nlp/acronyms.php');
set_time_limit ( 300 );
//////////////////////// TWITTEROAUTH /////////////////////////////////////
/* If access tokens are not available redirect to connect page. */
if (empty ( $_SESSION ['access_token'] ) || empty ( $_SESSION ['access_token'] ['oauth_token'] ) || empty ( $_SESSION ['access_token'] ['oauth_token_secret'] )) {
    header ( 'Location: ./clearsessions.php' );
}
/* Get user access tokens out of the session. */
$access_token = $_SESSION ['access_token'];
/* Create a TwitterOauth object with consumer/user tokens. */
$connection = new TwitterOAuth ( CONSUMER_KEY, CONSUMER_SECRET, $access_token ['oauth_token'], $access_token ['oauth_token_secret'] );
///////////////////////////////////////////////////////////////////////////
///// UNCOMMENT BELOW TO AUTOMATICALLY SPECIFY CURRENTLY LOGGED IN USER
//$user = $connection->get('account/verify_credentials');
//$user_handle = $user->screen_name;
$user_handle = 'AngeloDalli';
$timeline = getContent ( $connection, $user_handle, 1 );
$latest_id = $timeline [0]->id_str;
$most_recent = getMostRecentTweet ();
if ($latest_id > $most_recent) {
    $t_start = microtime(true); // start indexing
    $timeline = getContent ( $connection, $user_handle, 200 );
    $json_index = decodeIndex ();
    $json_index = updateIndex ( $timeline, $connection, $user_handle, $json_index, $most_recent );
    $json_index = sortIndex ( $json_index );
    $json = encodeIndex ( $json_index );
    updateMostRecentTweet ( $latest_id );
    $_SESSION ['index_size'] = countIndex ( $json_index );
    $t_end = microtime(true); // finish indexing
    $content = 'New tweets indexed! Number of tweets in index: ' . $_SESSION ['index_size'];
    // total indexing time
    $time = 'Total time of indexing: ' . ($t_end - $t_start)/60 . ' seconds';
} else {
    $content = 'No new tweets indexed!';
    $time = '';
}
/////////////////////// FUNCTIONS //////////////////////////////////////////////
function getContent($connection, $user_handle, $n) {
    $content = $connection->get ( 'statuses/user_timeline', array (
        'screen_name' => $user_handle,
        'count' => $n
    ) );
    return $content;
}
function decodeIndex() {
    $string = file_get_contents ( INDEX_PATH );
    if ($string) {
        $json_index = json_decode ( $string, true );
    } else {
        $json_index = [ ];
    }
    return $json_index;
}
function updateIndex($timeline, $connection, $user_handle, $json_index, $most_recent) {
    // URL arrays for uClassify API calls
    $urls = [ ];
    $urls_id = [ ];
    // halt if no more new tweets are found
    $halt = false;
    // set to 1 to skip first tweet after 1st batch
    $j = 0;
    // count number of new tweets indexed
    $count = 0;
    while ( (count ( $timeline ) != 1 || $j == 0) && $halt == false ) {
        $no_of_tweets_in_batch = 0;
        $n = $j;
        while ( ($n < count ( $timeline )) && $halt == false ) {
            $tweet_id = $timeline [$n]->id_str;
            if ($tweet_id > $most_recent) {
                $text = $timeline [$n]->text;
                $tokens = parseTweet ( $text );
                $coord = extractLocation ( $timeline, $n );
                addSentimentURL ( $text, $tweet_id, $urls, $urls_id );
                $keywords = makeEntry ( $tokens, $tweet_id, $coord, $text );
                foreach ( $keywords as $type ) {
                    $json_index [] = $type;
                }
                $n ++;
                $no_of_tweets_in_batch ++;
            } else {
                $halt = true;
            }
        }
        if ($halt == false) {
            $tweet_id = $timeline [$n - 1]->id_str;
            $timeline = $connection->get ( 'statuses/user_timeline', array (
                'screen_name' => $user_handle,
                'count' => 200,
                'max_id' => $tweet_id
            ) );
            // skip 1st tweet after 1st batch
            $j = 1;
        }
        $count += $no_of_tweets_in_batch;
    }
    $json_index = extractSentiments ( $urls, $urls_id, $json_index );
    echo 'Number of tweets indexed: ' . ($count);
    return $json_index;
}
function parseTweet($tweet) {
    // find urls in tweet and remove (HTTP ONLY CURRENTLY)
    $tweet = preg_replace ( '/(http:\/\/[^\s]+)/', "", $tweet );
    // split tweet into tokens and clean
    $words = preg_split ( "/[^A-Za-z0-9]+/", $tweet );
    // /[\s,:.##?!()-$%&^*;+=]+/ ------ Alternative regex
    $expansion = expandAcronyms ( $words );
    $tokens = removeStopWords ( $expansion );
    // convert to type-frequency array
    $tokens = array_filter ( $tokens );
    $tokens = array_count_values ( $tokens );
    return $tokens;
}
function expandAcronyms($terms) {
    $words = [ ];
    $acrok = array_keys ( $GLOBALS ['acronyms'] );
    $acrov = array_values ( $GLOBALS ['acronyms'] );
    for($i = 0; $i < count ( $terms ); $i ++) {
        $j = 0;
        $is_acronym = false;
        while ( $is_acronym == false && $j != count ( $acrok ) ) {
            if (strcasecmp ( $terms [$i], $acrok [$j] ) == 0) {
                $is_acronym = true;
                $expansion = $acrov [$j];
            }
            $j ++;
        }
        if ($is_acronym) {
            $expansion = preg_split ( "/[^A-Za-z0-9]+/", $expansion );
            foreach ( $expansion as $term ) {
                $words [] = $term;
            }
        } else {
            $words [] = $terms [$i];
        }
    }
    return $words;
}
function removeStopWords($words) {
    $tokens = [ ];
    for($i = 0; $i < count ( $words ); $i ++) {
        $is_stopword = false;
        $j = 0;
        while ( $is_stopword == false && $j != count ( $GLOBALS ['stop_words'] ) ) {
            if (strcasecmp ( $words [$i], $GLOBALS ['stop_words'] [$j] ) == 0) {
                $is_stopword = true;
            } else
                $j ++;
        }
        if (! $is_stopword) {
            $tokens [] = $words [$i];
        }
    }
    return $tokens;
}
function extractLocation($timeline, $n) {
    $geo = $timeline [$n]->place;
    if (! empty ( $geo )) {
        $place = $geo->full_name;
        $long = $geo->bounding_box->coordinates [0] [1] [0];
        $lat = $geo->bounding_box->coordinates [0] [1] [1];
        $coord = array (
            'place' => $place,
            'latitude' => $lat,
            'longitude' => $long
        );
    } else {
        $coord = [ ];
    }
    return $coord;
}
function addSentimentURL($text, $tweet_id, &$urls, &$urls_id) {
    $urls_id [] = $tweet_id;
    $url = makeURLForAPICall ( $text );
    $urls [] = $url;
}
function makeURLForAPICall($tweet) {
    $tweet = str_replace ( ' ', '+', $tweet );
    $prefix = 'http://uclassify.com/browse/uClassify/Sentiment/ClassifyText?';
    $key = 'readkey=' . CLASSIFY_KEY . '&';
    $text = 'text=' . $tweet . '&';
    $version = 'version=1.01';
    $url = $prefix . $key . $text . $version;
    return $url;
}
function makeEntry($tokens, $tweet_id, $coord, $text) {
    $types = array ();
    while ( current ( $tokens ) ) {
        $key = key ( $tokens );
        array_push ( $types, array (
            'type' => $key,
            'frequency' => $tokens [$key],
            'tweet_id' => $tweet_id,
            'location' => $coord,
            'text' => $text
        ) );
        next ( $tokens );
    }
    return $types;
}
function extractSentiments($urls, $urls_id, &$json_index) {
    $responses = multiHandle ( $urls );
    // add sentiments to all index entries
    foreach ( $json_index as $i => $term ) {
        $tweet_id = $term ['tweet_id'];
        foreach ( $urls_id as $j => $id ) {
            if ($tweet_id == $id) {
                $sentiment = parseSentiment ( $responses [$j] );
                $json_index [$i] ['sentiment'] = $sentiment;
            }
        }
    }
    return $json_index;
}
// - Without sentiment, indexing is performed at reasonable speed
// - With sentiment, very frequent API calls greatly reduce indexing speed
// - file_get_contents() for Sentiment API calls is too slow, therefore considered cURL
// - cURL is still too slow and indexing performance is still not good enough
// - therefore considered using multi cURL which is much faster than by just using cURL
//   on its own and significantly improved sentiment extraction which in turn greatly
//   improved indexing with sentiment
function multiHandle($urls) {
    // curl handles
    $curls = array ();
    // results returned in xml
    $xml = array ();
    // init multi handle
    $mh = curl_multi_init ();
    foreach ( $urls as $i => $d ) {
        // init curl handle
        $curls [$i] = curl_init ();
        $url = (is_array ( $d ) && ! empty ( $d ['url'] )) ? $d ['url'] : $d;
        // set url to curl handle
        curl_setopt ( $curls [$i], CURLOPT_URL, $url );
        // on success, return actual result rather than true
        curl_setopt ( $curls [$i], CURLOPT_RETURNTRANSFER, 1 );
        // add curl handle to multi handle
        curl_multi_add_handle ( $mh, $curls [$i] );
    }
    // execute the handles
    $active = null;
    do {
        curl_multi_exec ( $mh, $active );
    } while ( $active > 0 );
    // get xml and flush handles
    foreach ( $curls as $i => $ch ) {
        $xml [$i] = curl_multi_getcontent ( $ch );
        curl_multi_remove_handle ( $mh, $ch );
    }
    // close multi handle
    curl_multi_close ( $mh );
    return $xml;
}
// SENTIMENT VALUES ON INDEX.JSON FOR THIS ASSIGNMENT ARE NOT CORRECT SINCE THE
// NUMBER OF API CALLS EXCEEDED 5000 ON THE DAY OF HANDING IN. ONCE THE API CALLS
// ARE ALLOWED AGAIN IT CLASSIFIES AS REQUIRED
function parseSentiment($xml) {
    $p = xml_parser_create ();
    xml_parse_into_struct ( $p, $xml, $vals, $index );
    xml_parser_free ( $p );
    $positivity = $vals [8] ['attributes'] ['P'];
    $negativity = 1 - $positivity;
    $sentiment = array (
        'pos' => $positivity,
        'neg' => $negativity
    );
    return $sentiment;
}
function sortIndex($json_index) {
    $type = array ();
    $freq = array ();
    $id = array ();
    foreach ( $json_index as $key => $row ) {
        $type [$key] = $row ['type'];
        $freq [$key] = $row ['frequency'];
        $id [$key] = $row ['tweet_id'];
    }
    array_multisort ( $type, SORT_ASC | SORT_NATURAL | SORT_FLAG_CASE,
        $freq, SORT_DESC,
        $id, SORT_ASC,
        $json_index );
    return $json_index;
}
function encodeIndex($json_index) {
    $json = json_encode ( $json_index, JSON_FORCE_OBJECT | JSON_PRETTY_PRINT );
    $index = fopen ( INDEX_PATH, 'w' );
    fwrite ( $index, $json );
    fclose ( $index );
    return $json;
}
function countIndex($json_index) {
    $tweets = [ ];
    $count = 0;
    for($i = 0; $i < count ( $json_index ); $i ++) {
        $id = $json_index [$i] ['tweet_id'];
        if (in_array ( $id, $tweets )) {
        } else {
            $tweets [] = $id;
            $count ++;
        }
    }
    return $count;
}
function lookup($array, $key, $val) {
    foreach ( $array as $item ) {
        if (isset ( $item [$key] ) && $item [$key] == $val) {
            return true;
        } else {
            return false;
        }
    }
}
function getMostRecentTweet() {
    $file = fopen ( 'latest.txt', 'r' );
    $most_recent = fgets ( $file );
    if (! $most_recent) {
        $most_recent = 0;
    }
    fclose ( $file );
    return $most_recent;
}
function updateMostRecentTweet($latest_id) {
    $file = fopen ( 'latest.txt', 'w' );
    fwrite ( $file, $latest_id . PHP_EOL );
    fclose ( $file );
}
include ('index.inc');
?>
I have fixed the problem. When creating my application on OpenShift using the application wizard, I was specifying PHP 5.3 as the cartridge and not PHP 5.4 (note the way I'm specifying certain empty arrays: the short [] syntax requires PHP 5.4).
The true lesson to take from this is: always be sure about the version of the language you're developing with.
Thank you for any help given, and I hope this may come of use to someone else in the future!
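To make the version issue concrete (a minimal illustration, not part of the original script): the short array syntax used throughout index.php only parses on PHP 5.4+, so on a PHP 5.3 cartridge the whole file fails with a parse error. And since a parse error happens before any line runs, the ini_set('display_errors', 1) call never takes effect, the server default applies, and the result is exactly a blank page.

// PHP 5.4+ only; on PHP 5.3 this line is a parse error, so nothing in the file runs
$json_index = [ ];

// PHP 5.3-compatible equivalent
$json_index = array();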
I have to read config files in PHP which contain entries with semicolons, e.g.
[section]
key=value;othervalue
I noticed that parse_ini_file() removes all semicolons and everything that follows them, even when set to INI_SCANNER_RAW.
The INI files come from legacy systems and I can't change the format. I only have to read them.
What's the best tool to use when I have to preserve the entries with semicolons?
I would recommend reading the file into a string first, converting the semicolons to pipes |, then writing that out to a temporary file and using parse_ini_file() on that new temporary file.
Like so...
$string = file_get_contents('your_file');
$newstring = str_replace(";","|",$string);
$tempfile = 'your_temp_filename';
file_put_contents($tempfile, $newstring);
$arrIni = parse_ini_file($tempfile);
Then, after that, you could always replace the pipes with semicolons as you enumerate your new INI-based array.
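For instance, a minimal sketch of that last step, reusing the $arrIni array from the snippet above:

// turn the placeholder pipes back into semicolons in every value
array_walk_recursive( $arrIni, function ( &$value ) {
    $value = str_replace( '|', ';', $value );
} );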
For ini files, the ; is the comment symbol.
So it's actually a good idea to not use it for something else.
However, you can use this slightly modified function from the solution found here:
<?php
//Credits to goulven.ch AT gmail DOT com
function parse_ini ( $filepath )
{
    $ini = file( $filepath );
    if ( count( $ini ) == 0 ) { return array(); }
    $sections = array();
    $values = array();
    $globals = array();
    $i = 0;
    foreach( $ini as $line ){
        $line = trim( $line );
        // Comments
        if ( $line == '' || $line{0} == ';' ) { continue; }
        // Sections
        if ( $line{0} == '[' )
        {
            $sections[] = substr( $line, 1, -1 );
            $i++;
            continue;
        }
        // Key-value pair
        list( $key, $value ) = explode( '=', $line, 2 );
        $key = trim( $key );
        $value = trim( $value );
        if (strpos($value, ";") !== false)
            $value = explode(";", $value);
        if ( $i == 0 ) {
            // Array values
            if ( substr( $line, -1, 2 ) == '[]' ) {
                $globals[ $key ][] = $value;
            } else {
                $globals[ $key ] = $value;
            }
        } else {
            // Array values
            if ( substr( $line, -1, 2 ) == '[]' ) {
                $values[ $i - 1 ][ $key ][] = $value;
            } else {
                $values[ $i - 1 ][ $key ] = $value;
            }
        }
    }
    for( $j=0; $j<$i; $j++ ) {
        $result[ $sections[ $j ] ] = $values[ $j ];
    }
    return $result + $globals;
}
You can see examples of usage following the link.
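As a quick usage sketch against the sample file from the question (the file name legacy.ini is just an example):

$config = parse_ini( 'legacy.ini' );
// values that contained semicolons come back as arrays:
// $config['section']['key'] === array( 'value', 'othervalue' )
print_r( $config );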
Please, can someone help me with this code? When I export, it exports the file but starts with row 2; the first row is excluded. It reads like: header, then rows 2, 3, 4 until the end of the table.
<?php
require_once("/includes/session.php");
require_once("/includes/db_connection.php");
require_once("/includes/functions.php");
// Table Name that you want
// to export in csv
$ShowTable = "staff_tab";
$today=date("dmY");
$FileName = "StaffRecord".$today.". csv";
$file = fopen($FileName,"w");
$sql = mysqli_query($connection,("SELECT * FROM $ShowTable LIMIT 500"));
$row = mysqli_fetch_assoc($sql);
// Save headings alon
$HeadingsArray=array();
foreach($row as $name => $value){
$HeadingsArray[]=$name;
}
fputcsv($file,$HeadingsArray);
// Save all records without headings
while($row = mysqli_fetch_assoc($sql)){
$valuesArray=array();
foreach($row as $name => $value){
$valuesArray[]=$value;
}
fputcsv($file,$valuesArray);
}
fclose($file);
header("Location: $FileName");
echo "Complete Record saves as CSV in file: <b style=\"color:red;\">$FileName</b>";
?>
Your call to $row = mysqli_fetch_assoc($sql); is pushing the internal pointer forward to row 2. Use mysqli_data_seek($sql, 0); to push it back to the start. http://php.net/manual/en/function.mysql-data-seek.php
...
$sql = mysqli_query($connection,("SELECT * FROM $ShowTable LIMIT 500"));
$row = mysqli_fetch_assoc($sql);
mysqli_data_seek($sql, 0); // return pointer to first row
// Save headings alone
$HeadingsArray=array();
...
Maybe this could help, as it includes all rows and sets the headers from the DB column names. It also forces the file to download without leaving the page.
$export = mysqli_query( $connection, $query ) or die( "Sql error : " . mysqli_error( $connection ) );
// extract the field names for the header
$fields = mysqli_num_fields( $export );
$header = '';
for ( $i = 0; $i < $fields; $i++ )
{
    $headers = mysqli_fetch_field( $export );
    $header .= $headers->name . "\t";
}
// export data
$data = '';
while ( $row = mysqli_fetch_row( $export ) )
{
    $line = '';
    foreach ( $row as $value )
    {
        if ( ( !isset( $value ) ) || ( $value == "" ) )
        {
            $value = "\t";
        }
        else
        {
            $value = str_replace( '"', '""', $value );
            $value = '"' . $value . '"' . "\t";
        }
        $line .= $value;
    }
    $data .= trim( $line ) . "\n";
}
$data = str_replace( "\r", "", $data );
if ( $data == "" )
{
    $data = "\nNo Record(s) Found!\n";
}
// allow exported file to download forcefully
header("Content-type: application/octet-stream");
header("Content-Disposition: attachment; filename=Something.xls");
header("Pragma: no-cache");
header("Expires: 0");
print "$header\n$data";