Permanently write variables to a php file with php - php

I need to be able to permanently change variables in a php file using php.
I am creating a multilanguage site using codeigniter and using the language helper which stores the text in php files in variables in this format:
$lang['title'] = "Stuff";
I've been able to access the plain text of the files using fopen() etc and I it seems that I could probably locate the areas I want to edit with with regular expressions and rewrite the file once I've made the changes but it seems a bit hacky.
Is there any easy way to edit these variables permanently using php?
Cheers

If it's just an array you're dealing with, you may want to consider var_export. It will print out or return the expression in a format that's valid PHP code.
So if you had language_foo.php which contained a bunch of $lang['title'] = "Stuff"; lines, you could do something along the lines of:
include('language_foo.php');
$lang['title2'] = 'stuff2';
$data = '$lang = ' . var_export($lang, true) . ';';
file_put_contents('language_foo.php', '<?PHP ' . $data . ' ?>');
Alternatively, if you won't want to hand-edit them in the future, you should consider storing the data in a different way (such as in a database, or serialize()'d, etc etc).

It looks way easier to store data somewhere else (for instance, a database) and write a simple script to generate the *.php files, with this comment on top:
#
# THIS FILE IS AUTOGENERATED - DO NOT EDIT
#

I once faced a similar issue. I fixed it by simply adding a smarty template. The way I did it was as follows:
Read the array from the file
Add to the array
Pass the array to smarty
Loop over the array in smarty and generate the file using a template (this way you have total control, which might be missing in reg-ex)
Replace the file
Let me know if this helps.

Assuming that
You need the dictionary file in a human-readable and human-editable form (no serializing etc.)
The Dictionary array is an one-dimensional, associative array:
I would
Include() the dictionary file inside a function
Do all necessary operations on the $lang array (add words, remove words, change words)
Write the $lang array back into the file using a simple loop:
foreach ($lang as $key => $value)
fwrite ($file, "\$lang['$key'] = '$value';\n";
this is an extremely limited approach, of course. I would be really interested to see whether there is a genuine "PHP source code parser, changer and writer" around. This should be possible to do using the tokenizer functions.

If it also is about a truly multilingual site, you might enjoy looking into the gettext extension of PHP. It falls back to a library that has been in use for localizing stuff for many years, and where tools to keep up with the translation files have been around for almost quite as long. This makes supporting all the languages in later revisions of the product more fun, too.
In other news, I would not use an array but rather appropriate definitions, so that you have a file
switch ($lang) {
case 'de':
define('HELLO','Hallo.');
define('BYE','Auf wiedersehen.');
break;
case 'fr':
define('HELLO','Bonjour');
define('BYE','Au revoir.');
break;
case 'en':
default:
define ('HELLO','Hello.');
define ('BYE','Bye.');
}
And I'd also auto-generate that from a database, if maintenance becomes a hassle.

Pear Config will let you read and write PHP files containing settings using its 'PHPArray' container. I have found that the generated PHP is more readable than that from var_export()

Related

Upper or Lower Case Issue using get_file_contents php

I am using a PHP GET method to grab a file name that then is placed in a get_file_contents command. If it is possible, I would like to ignore letter case so that my URL's are cleaner.
For instance, example.com/file.php?n=File-Name will work but example.com/file.php?n=file-name will not work using the code below. I feel like this should be easy but I'm coming up dry. Any thoughts?
$file = $_GET['n'];
$file_content = file_get_contents($file);
Lowercase all your filenames and use:
file_get_contents(strtolower($file));
(I hope you're aware of some of the risks involved in using this.)
The Linux filesystem is case sensitive. If you want to do case insensitive matching against files that already exist on the user's machine, your only option is to obtain a directory listing and do case-insensitive comparison.
But you don't explain where the download URLs come from. If you already know the correct filenames and you want to generate prettier URLs, you can keep a list of the true pathnames and look them up when you receive a case-normalized one in a URL (you could even rename them completely, obfuscate, etc.)

Perf. issue / Too much calls to string manipulation functions

This question is about optimizing a part of a program that I use to add in many projects as a common tool.
This 'templates parser' is designed to use a kind of text pattern containing html code or anything else with several specific tags, and to replace these by developer given values when rendered.
The few classes involved do a great job and work as expected, it allows when needed to isolate design elements and easily adapt / replace design blocks.
The patterns I use look like this (nothing exceptional I admit) :
<table class="{class}" id="{id}">
<block_row>
<tr>
<block_cell>
<td>{content}</td>
</block_cell>
</tr>
</block_row>
</table>
(Example code below are adapted extracts)
The parsing does things like that :
// Variables are sorted by position in pattern string
// Position is read once and stored in cache to avoid
// multiple calls to str_pos or str_replace
foreach ($this->aVars as $oVar) {
$sString = substr($sString, 0, $oVar->start) .
$oVar->value .
substr($sString, $oVar->end);
}
// Once pattern loaded, blocks look like --¤(<block_name>)¤--
foreach ($this->aBlocks as $sName=>$oBlock) {
$sBlockData = $oBlock->parse();
$sString = str_replace('--¤(' . $sName . ')¤--', $sBlockData, $sString);
}
By using the class instance I use methods like 'addBlock' or 'setVar' to fill my pattern with data.
This system has several disadvantages, among them the multiple objects in memory (one for each instance of block) and the fact that there are many calls to string manipulation functions during the parsing process (preg_replace in the past, now just a bunch of substr and pals).
The program on which I'm working is making a large use of these templates and they are just about to show their limits.
My question is the following (No need for code, just ideas or a lead to follow) :
Should I consider I've abused of this and should try to manage so that I don't need to make so many calls to these templates (for instance improving cache, using only simple view scripts...)
Do you know a technical solution to feed a structure with data that would not be that mad resource consumer I wrote ? While I'm writing I'm thinking about XSLT, would it be suitable, if yes could it improve performances ?
Thanks in advance for your advices
Use the XDebug extension to profile your code and find out exactly which parts of the code are taking the most time.

Searching for a link in a website and displaying it PHP

hello im a newbie in php i am trying make a search function using php but only inside the website without any database
basically if i want to search a string namely "Health" it would display the lines
The Joys of Health
Healthy Diets
This snippet is the only thing i could find if properly coded would output the "lines" i want
$myPage = array("directory.php","pages.php");
$lines = file($myPage[n]);
echo $lines[n];
i havent tried it yet if it would work but before i do i want to ask if there is any better way to do this?
if my files have too many lines wont it stress out the server?
The file() function will return an array. You should use file_get_contents() instead, as it returns a string.
Then, use regular expressions to find specific text within a link.
Your goal is fine but the method you're thinking about is not. the file() function read a file, line by line, and inserts it into an array. This assumes the HTML is well-structured in a human-readable fashion, which is not always the case. However, if you're the one providing the HTML and you make sure the structure is perfectly defined, ok... here you have the example you provided us with but complete (take into account it's the 'wrong' way of solving your problem, but if you want to follow that pattern, it's ok):
function pagesearch($pages, $string) {
if (!empty($pages) && !empty($string)) {
$tags = [];
foreach ($pages as $page) {
if ($lines = file($page)) {
foreach ($lines as $line) {
if (!empty($line)) {
if (mb_strpos($line, $string)) {
$tags[$page][] = $line;
}
}
}
}
}
return $tags;
}
}
This will return you an array with all the pages you referenced with all occurrences of the word you look for, separated by page. As I said, it's not the way you want to solve this, but it's a way.
Hope that helps
Because you do not want to use any database and because the term database is very broad and includes the file-system you want to do a search in some database without having a database.
That makes no sense. In your case one database at least is the file-system. If you can accept the fact that you want to search a database (here your html files) but you do not want to use a database to store anything related to the search (e.g. some index or cached results), then what you suggest is basically how it is working: A real-time, text-based, line-by-line file-search.
Sure it is very rudimentary but as your constraint is "no database", you have already found the only possible way. And yes it will stress your server when used because real-time search is expensive.
Otherwise normally Lucene/Solr is used for the job but that is a database and a server even.

Alternative to php preg_match to pull data from an external website?

I want to extrat the content of a specific div in an external webpage, the div looks like this:
<dt>Win rate</dt><dd><div>50%</div></dd>
My target is the "50%". I'm actually using this php code to extract the content:
function getvalue($parameter,$content){
preg_match($parameter, $content, $match);
return $match[1];
};
$parameter = '#<dt>Score</dt><dd><div>(.*)</div></dd>#';
$content = file_get_contents('https://somewebpage.com');
Everything works fine, the problem is that this method is taking too much time, especially if I've to use it several times with diferents $content.
I would like to know if there's a better (faster, simplier, etc.) way to acomplish the same function? Thx!
You may use DOMDocument::loadHTML and navigate your way to the given node.
$content = file_get_contents('https://somewebpage.com');
$doc = new DOMDocument();
$doc->loadHTML($content);
Now to get to the desired node, you may use method DOMDocument::getElementsByTagName, e.g.
$dds = $doc->getElementsByTagName('dd');
foreach($dds as $dd) {
// process each <dd> element here, extract inner div and its inner html...
}
Edit: I see a point #pebbl has made about DomDocument being slower. Indeed it is, however, parsing HTML with preg_match is a call for trouble; In that case, I'd also recommend looking at event-driven SAX XML parser. It is much more lightweight, faster and less memory intensive as it does not build a tree. You may take a look at XML_HTMLSax for such a parser.
There are basically three main things you can do to improve the speed of your code:
Off load the external page load to another time (i.e. use cron)
On a linux based server I would know what to suggest but seeing as you use Windows I'm not sure what the equivalent would be, but Cron for linux allows you to fire off scripts at certain schedule time offsets - in the background - so not using a browser. Basically I would recommend that you create a script who's sole purpose is to go and fetch the website pages at a particular time offset (depending on how frequently you need to update your data) and then write those webpages to files on your local system.
$listOfSites = array(
'http://www.something.com/page.htm',
'http://www.something-else.co.uk/index.php',
);
$dirToContainSites = getcwd() . '/sites';
foreach ( $listOfSites as $site ) {
$content = file_get_contents( $site );
/// i've just simply converted the URL into a filename here, there are
/// better ways of handling this, but this at least keeps things simple.
/// the following just converts any non letter or non number into an
/// underscore... so, http___www_something_com_page_htm
$file_name = preg_replace('/[^a-z0-9]/i','_', $site);
file_put_contents( $dirToContainSites . '/' . $file_name, $content );
}
Once you've created this script, you then need to set the server up to execute it as regularly as you need. Then you can modify your front-end script that displays the stats to read from local files, this would give a significant speed increase.
You can find out how to read files from a directory here:
http://uk.php.net/manual/en/function.dir.php
Or the simpler method (but prone to possible problems) is just to re-step your array of sites, convert the URLs to file names using the preg_replace above, and then check for the file's existence in the folder.
Cache the result of calculating your statistics
It's quite likely this being a stats page that you'll want to visit it quite frequently (not as frequent as a public page, but still). If the same page is visited more often than the cron-based script is executed then there is no reason to do all the calculation again. So basically all you have to do to cache your output is do something similar to the following:
$cachedVersion = getcwd() . '/cached/stats.html';
/// check to see if there is a cached version of this page
if ( file_exists($cachedVersion) ) {
/// if so, load it and echo it to the browser
echo file_get_contents($cachedVersion);
}
else {
/// start output buffering so we can catch what we send to the browser
ob_start();
/// DO YOUR STATS CALCULATION HERE AND ECHO IT TO THE BROWSER LIKE NORMAL
/// end output buffering and grab the contents so we now have a string
/// of the page we've just generated
$content = ob_get_contents(); ob_end_clean();
/// write the content to the cached file for next time
file_put_contents($cachedVersion, $content);
echo $content;
}
Once you start caching things you need to be aware of when you should delete or clear your cache - otherwise if you don't your stats output will never change. With regards to this situation, the best time to clear your cache is at the point you go and fetch the external web pages again. So you should add this line to the bottom of your "cron" script.
$cachedVersion = getcwd() . '/cached/stats.html';
unlink( $cachedVersion ); /// will delete the file
There are other speed improvements you could make to the caching system (you could even record the modified times of the external webpages and load only when they have been updated) but I've tried to keep things easy to explain.
Don't use a HTML Parser for this situation
Scanning a HTML file for one particular unique value does not require the use of a fully-blown or even lightweight HTML Parser. Using RegExp incorrectly seems to be one of those things that lots of start-up programmers fall into, and is a question that is always asked. This has led to lots of automatic knee-jerk reactions from more experience coders to automatically adhere to the following logic:
if ( $askedAboutUsingRegExpForHTML ) {
$automatically->orderTheSillyPersonToUse( $HTMLParser );
} else {
$soundAdvice = $think->about( $theSituation );
print $soundAdvice;
}
HTMLParsers should be used when the target within the markup is not so unique, or your pattern to match relies on such flimsy rules that it'll break the second an extra tag or character occurs. They should be used to make your code more reliable, not if you want to speed things up. Even parsers that do not build a tree of all the elements will still be using some form of string searching or regular expression notation, so unless the library-code you are using has been compiled in an extremely optimised manner, it will not beat well coded strpos/preg_match logic.
Considering I have not seen the HTML you are hoping to parse, I could be way off, but from what I've seen of your snippet it should be quite easy to find the value using a combination of strpos and preg_match. Obviously if your HTML is more complex and might have random multiple occurances of <dt>Win rate</dt><dd><div>50%</div></dd> it will cause problems - but even so - a HTMLParser would still have the same problem.
$offset = 0;
/// loop through the occurances of 'Win rate'
while ( ($p = stripos ($html, 'win rate', $offset)) !== FALSE ) {
/// grab out a snippet of the surrounding HTML to speed up the RegExp
$snippet = substr($html, $p, $p + 50 );
/// I've extended your RegExp to try and account for 'white space' that could
/// occur around the elements. The following wont take in to account any random
/// attributes that may appear, so if you find some pages aren't working - echo
/// out the $snippet var using something like "echo '<xmp>'.$snippet.'</xmp>';"
/// and that should show you what is appearing that is breaking the RegExp.
if ( preg_match('#^win\s+rate\s*</dt>\s*<dd>\s*<div>\s*([0-9]+%)\s*<#i', $snippet, $regs) ) {
/// once you are here your % value will be in $regs[1];
break; /// exit the while loop as we have found our 'Win rate'
}
/// reset our offset for the next loop
$offset = $p;
}
Gotchas to be aware of
If you are new to PHP, as you state in a comment above, then the above may seem rather complicated - which it is. What you are trying to do is quite complex, especially if you want to do it optimally and fast. However, if you follow throught the code I've given and research any bits that you aren't sure of / haven't heard of (php.net is your friend), it should give you a better understanding of a good way to achieve what you are doing.
Guessing ahead however, here are some of the problems you might face with the above:
File Permission errors - in order to be able to read and write files to and from the local operating system you will need to have the correct permissions to do so. If you find you can not write files to a particular directory it might be that the host you are using wont allow you to do so. If this is the case you can either contact them to ask about how to get write permission to a folder, or if that isn't possible you can easily change the code above to use a database instead.
I can't see my content - when using output buffering all the echo and print commands do not get sent to the browser, they instead get saved up in memory. PHP should automatically output all the stored content when the script exits, but if you use a command like ob_end_clean() this actually wipes the 'buffer' so all the content is erased. This can lead to confusing situations when you know you are echoing something.. but it just isn't appearing.
(Mini Disclaimer :) I've typed all the above manually so you may find there are PHP errors, if so, and they are baffling, just write them back here and StackOverflow can help you out)
Instead of trying to not use preg_match why not just trim your document contents down in size? for example, you could dump everything before <body and everything after </body>. then preg_match will be searching less content already.
Also, you could try to do each one of these processes as a pseudo separate thread, so that way they aren't happening one at a time.

Am I breaking any "php good practice" in the following php array which deals with 3 (human) languages?

This is the most optimal way of dealing with a multilingual website I can think of, right now (not sure) which doesn't involve gettext, zend_translate or any php plugin or framework.
I think its pretty straight forward: I have 3 languages and I write their "content" in different files (in form of arrays), and later, I call that content to my index.php like you can appreciate in the following picture:
alt text http://img31.imageshack.us/img31/1471/codew.png
I just started with php and I would like to know if I'm breaking php good practices, if the code is vulnerable to XSS attack or if I'm writing more code than necessary.
EDIT: I posted a picture so that you can see the files tree (I'm not being lazy)
EDIT2: I'm using Vim with the theme ir_black and NERDTree.
Looks all right to me, although I personally prefer creating and using a dictionary helper function:
<?php echo dictionary("showcase_li2"); ?>
that would enable you to easily switch methods later, and gives you generally more control over your dictionary. Also with an array, you will have the problem of scope - you will have to import it into every function using global $language; very annoying.
You will probably also reach the point when you have to insert values into an internationalized string:
You have %1 votes left in the next %2 hours.
Sie haben %1 stimmen übrig für die nächsten %2 stunden.
Sinulla on %1 ääntä jäljellä seuraavan %2 tunnin ajassa.
that is something a helper function can be very useful for:
<?php echo dictionary("xyz", $value1, $value2 ); ?>
$value1 and $value2 would be inserted into %1 and %2 in the dictionary string.
Such a helper function can easily be built with an unlimited number of parameters using func_get_args().
It's OK generally. For instance, punBB's localization works this way. It is very fast. Faster than calling a function or an object's method or property. But I see a problem with this approach, since it doesn't support language fallbacks easily. I mean, if you don't have a string for Chinese, let it be displayed in English.
This problem is topical when you upgrade your system and you don't have time to translate everything in every language.
I'd better use something like
lang.en.php
$langs['en'] = array(
...
);
lang.cn.php
$langs['cn'] = array(
...
);
[prepend].php (some common lib)
define('DEFAULT_LANG', 'en');
include_once('lang.' . DEFAULT_LANG '.php');
include_once('lang.' . $user->lang . '.php');
$lang = array_merge($langs[DEFAULT_LANG], $langs[$user->lang]);
Looks all right to me also, but:
Seems that you have localization for multiple modules/sites, so why not break it down to multidimensional array?
$localization = array(
'module' => (object)array(
'heading' => 'oh, no!',
'perex' => 'oh, yes!'
)
);
I personally like to creat stdClass out of arrays with
$localization = (object)$localization;
so you can use
$localization->module->heading;
:) my 2 cents
The only way that this could be xss is if you have register_globals=On and you don't set $lang['showcase_lil'] or other $lang's. But I don't think you have to worry about this. So I think your in the clear.
as an xss test:
http://127.0.0.1/whatever.php?lang[showcase_lil]=alert(/xss/)
Wouldn't it have been better to post code and briefly explain this issue to us?
Anyway, putting each language in its own file and loading it through some sort of language component seems okay. I'd prefer using some sort of gettext, but this is okay too, I guess.
You should make a function for calling the language keys rather than relying on an array, something like
<?php echo lang('yourKey'); ?>
One thing to watch for is interpolation; that's really the only place XSS could sneak in if your server settings are sensible. If you at any point need to do something along the lines of translating "$project->name has $project->member_count members", you'll have to make sure you escape all HTML that goes in there.
But other than that, you should be fine.

Categories