Best way to replace strings - php

I have an array where I'm storing the bad and good string pairs.
Ex.:
array(
"Man. United"=>"Manchester United",
"Bay. Munchen"=>"Bayern Munchen",
"Bay. Munich"=>"Bayern Munchen",
...
)
so in this case I'm using strtr to replace the given string, but in this case I always have to add or remove data's from the array. Is there any way to store just the good names in one array and replace which is very similar? For me is much easier to build up the array with the good names.

You could use similar_text or one of the other functions mentioned in the see also section to try and correct them automatically, but won't be as accurate as if you list the spelling mistakes yourself.
*edit: levenshtein may also be a good one to try...
The Levenshtein distance is defined as the minimal number of
characters you have to replace, insert or delete to transform str1
into str2.

Related

alphabetically sort array based on first few strings

I woud like to sort an array which typically includes names and email addresses. The problem is that the email addresses appear last even though they may start with 'a'
e.g.
$myarray = ("Alex Mayfeild", "David Beckham", "Oliver Twist", "ant.stev#wherever.com", "peter.pan#neverland.com", ........) //and so on
Upon sorting the array using php's sort function "ant.stev#wherever.com" will appear close to the end even though the functionality I would like to achieve is for him to appear after Alex.
natcasesort and natsource functions based on natural ordering seem to fail. Correction: natcasesource works it returns true when working as stated in docs. Thanks #meagar
Is there anyway to achieve the requested functionality. Thanks for any help guys. It is very much appreciated.
sort() is case sensitive, as it sorts based on the letters ASCII value.
Try natcasesort(), if you want too "sort an array using a case insensitive 'natural order' algorithm".
Seems to me that sort($myarray, SORT_STRING|SORT_FLAG_CASE); should sort the array the way you want.

scandir - sort numeric filenames

Done some searching, but can't seem to find the exact answer I'm looking for.
I'd like to pull in files with numbered filenames using 'scandir($dir)', but have them sort properly. For example, file names are:
1-something.ext
2-something-else.ext
3-a-third-name.ext
.
.
.
10-another-thing.ext
11-more-names.ext
The problem I'm having is that 10-a-fourth-thing.ext will show before 2-something-else.ext. I'd like to find a better way of solving this issue than introducing leading '0' in front of all file names.
Any thoughts? Thanks.
natsort does exactly what you need.
sort with SORT_NUMERIC will also work for filenames that start with numbers, but it will break if there are also names that have no numbers in front (all non-number-prefixed names will be sorted before number-prefixed names, and their order relative to one another will be random instead of alphabetic).
You can use sort like this:
sort($arr, SORT_NUMERIC); // asuming $arr is your array
If you want to reassign keys (which natsort does not do), use usort() combined with strnatcmp() or strnatcasecmp():
usort($arr, 'strnatcmp'); // Or 'strnatcasecmp' for case insensitive

PHP - Exploding on character(s) that can NEVER be user-defined... How?

Ok, am trying to find a character or group of characters, or something that can be used that I can explode from, since the text is user-defined, I need to be able to explode from a value that I have that can never be within the text.
How can I do this?
An example of what I'm trying to do...
$value = 'text|0||#fd9||right';
Ok,
text is something that should never change in here.
0, again not changeable
#fd9 is a user-defined string that can be anything that the user inputs...
and right sets the orientation (either left or right).
So, the problem I'm facing is this: How to explode("||", $value) so that if there is a || within the user-defined part... Example:
$value = 'text|0||Just some || text in here||right';
So, if the user places the || in the user-defined part of the string, than this messes this up. How to do this no matter what the user inputs into the string? So that it should return the following array:
array('text|0', 'Just some || text in here', 'right');
Should I be using different character(s) to explode from? If so, what can I use that the user will not be able to input into the string, or how can I check for this, and fix it? I probably shouldn't be using || in this case, but what can I use to fix this?
Also, the value will be coming from a string at first, and than from the database afterwards (once saved).
Any Ideas?
The problem of how to represent arbitrary data types as strings always runs up against exactly the problem you're describing and it has been solved in many ways already. This process is called serialization and there are many serialization formats, anything from PHP's native serialize to JSON to XML. All these formats specify how to present complex data structures as strings, including escaping rules for how to use characters that have a special meaning in the serialization format in the serialized values themselves.
From the comments:
Ok, well, basically, it's straight forward. I already outlined 13 of the other parameters and how they work in Dream Portal located here: http://dream-portal.net/topic_122.0.html so, you can see how they fit in. I'm working on a fieldset parameter that basically uses all of these parameters and than some to include multiple parameters into 1. Anyways, hope that link helps you, for an idea of what an XML file looks like for a module: http://dream-portal.net/topic_98.0.html look at the info.xml section, pay attention to the <param> tag in there, at the bottom, 2 of them.
It seems to me that a more sensible use of XML would make this a lot easier. I haven't read the whole thing in detail, but an XML element like
<param name="test_param" type="select">0:opt1;opt2;opt3</param>
would make much more sense written as
<select name="test_param">
<option default>opt1</option>
<option>opt2</option>
<option>opt3</option>
</select>
Each unique configuration option can have its own unique element namespace with custom sub-elements depending on the type of parameter you need to represent. Then there's no need to invent a custom mini-format for each possible parameter. It also allows you to create a formal XML schema (whether this will do you any good or not is a different topic, but at least you're using XML as it was meant to be used).
You can encode any user input to base64 and then use it with explode or however you wish.
print base64_encode("abcdefghijklmnopqrstuvwxyz1234567890`~!##$%^&*()_+-=[];,./?>:}{<");
serialized arrays are also not a bad idea at all. it's probably better than using a comma separated string and explode. Drupal makes good use of serialized arrays.
take a look at the PHP manual on how to use it:
serialize()
unserialize()
EDIT: New Solution
Is it a guarantee that text doesn't contain || itself?
If it doesn't, you can use substr() in combination with strpos() and strrpos() instead of explode
Here's what I usually do to get around this problem.
1) capture user's text and save it in a var $user_text;
2) run an str_replace() on $user_text to replace the characters you want to split by:
//replace with some random string the user would hopefully never enter
$modified = str_replace('||','{%^#',$user_text);
3) now you can safely explode your text using ||
4) now run an str_replace on each part of the explode, to set it back to the original user entered text
foreach($parts as &$part) {
$part = str_replace('{%^#','||',$part);
}

Getting one value out of a serialized array in PHP

What would you say is the most efficient way to get a single value out of an Array. I know what it is, I know where it is. Currently I'm doing it with:
$array = unserialize($storedArray);
$var = $array['keyOne'];
Wondering if there is a better way.
You are doing it fine, I can't think of a better way than what you are doing.
You unserialize
You get an array
You get value by specifying index
That's the way it can be done.
Wondering if there is a better way.
For the example you give with the array, I think you're fine.
If the serialized string contains data and objects you don't want to unserialize (e.g. creating objects you really don't want to have), you can use the Serialized PHP library which is a complete parser for serialized data.
It offers low-level access to serialized data statically, so you can only extract a subset of data and/or manipulate the serialized data w/o unserializing it. However that looks too much for your example as you only have an array and you don't need to filter/differ too much I guess.
Its most efficient way you can do, unserialize and get data, if you need optimize dont store all variables serialized.
Also there is always way to parse it with regexp :)
If you dont want to unseralize the whole thing (which can be costly, especially for more complex objects), you can just do a strpos and look for the features you want and extract them
Sure.
If you need a better way - DO NOT USE serialized arrays.
Serialization is just a transport format, of VERY limited use.
If you need some optimized variant - there are hundreds of them.
For example, you can pass some single scalar variable instead of whole array. And access it immediately
I, too, think the right way is to un-serialize.
But another way could be to use string operations, when you know what you want from the array:
$storedArray = 'a:2:{s:4:"test";s:2:"ja";s:6:"keyOne";i:5;}';
# another: a:2:{s:4:"test";s:2:"ja";s:6:"keyOne";s:3:"sdf";}
$split = explode('keyOne', $storedArray, 2);
# $split[1] contains the value and junk before and after the value
$splitagain = explode(';', $split[1], 3);
# $splitagain[1] should be the value with type information
$value = array_pop(explode(':', $splitagain[1], 3));
# $value contains the value
Now, someone up for a benchmark? ;)
Another way might be RegEx ?

What to use as array_import/var_import for sort-of exported array?

I have a string. It's a user submitted string. (And you should never ever trust user submitted anything.)
If certain (not unsafe) characters exist in the string, it's supposed to become a multi dimensional array/tree. First I tried splits, regex and loops. Too difficult. I've found a very easy solution with a few simple str_replace's and the result is a string that looks like an array definition. Eg:
array('body', array('div', array('x'), array(), array('')), array(array('oele')))
It's a silly array, but it's very easily created. Now that string has to become that array. I'm using eval() for that and I don't like it. Since it's user submitted (and must be able to contain just about anything), there could be any sort of function calls in that string.
So the million dollar question: is there some kind of var_import, or array_import that creates an array from a string and does nothing else (like mysterious, dangerous calls to exec etc)?
Yes, I have tried php.net and neither of the above _import functions exist.
What I'm looking for is the exact opposite of var_import, becasuse the string I have as input, looks exactly like the string var_export would output.
Any other suggestions to make it safer then eval are also welcome! But I'm not abandoning the current method (it's just too simple).
Using
array('body', array('div', array('x'), array(), array('')), array(array('oele')))
as input, I replaced some chars to make it a valid JSON string and imported that via json_decode.
Works perfectly. If some illegal chars are present, json_decode will trip over them (and not execute any dangerous code).

Categories