PHP - Break up word into array - php

I've done plenty of googling and whatnot and can't find quite what I'm looking for...
I am working on tightening up the authentication for my website. I decided to take the user's credentials, and hash/salt the heck out of them. Then store these values in the DB and in user cookies. I modified a script I found on the PHP website and it's working great so far. I noticed however when using array_rand, that it would select the chars from the predefined string, sequentially. I didn't like that, so I decided to use a shuffle on the array_rand'd array. Worked great.
Next! I thought it would be clever to turn my user-inputted password into an array, then merge that with my salted array! Well, I am having trouble turning my user's password into an array. I want each character in their password to be an array entry. I.e., if your password was "cool", the array would be Array 0 => c 1 => o 2 => o 3 => l, etc. I have tried wordwrap to split up the string and then explode on the specified break character, but that didn't work. I figure I could do something with a for loop, strlen and whatnot, but there HAS to be a more elegant way.
Any suggestions? I'm kind of stumped :( Here is what I have so far. I'm not done with this, as I haven't progressed further than the explodey part.
$strings = wordwrap($string, 1, "|");
echo $strings . "<br />";
$stringe = explode("|", $strings, 1);
print_r($stringe);
echo "<br />";
echo "This is the exploded password string, for mixing with salt.<hr />";
Thank you so much :)

The PHP function you want is str_split:
str_split('cool', 1);
Used as above, it would return:
[0] => c
[1] => o
[2] => o
[3] => l
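A minimal sketch of the call in context, assuming the password is held in a variable named $password (a name chosen here for illustration). Note that str_split works on bytes, so for UTF-8 input with multibyte characters a split with preg_split and the u modifier is one alternative:

```php
<?php
// Hypothetical input; in the question this would be the user's password.
$password = 'cool';

// str_split with the default chunk length of 1 gives one byte per element.
$chars = str_split($password);
print_r($chars);

// For UTF-8 text, splitting on the empty pattern with the "u" modifier
// yields one element per character rather than per byte.
$utf8Chars = preg_split('//u', 'héllo', -1, PREG_SPLIT_NO_EMPTY);
print_r($utf8Chars);
```

The for loop with strlen mentioned in the question would also work, but str_split does the same job in one call.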

Thanks to PHP's loose typing, if you treat the string as an array, PHP will happily do what you would expect. For example:
$string = 'cool';
echo $string[1]; // outputs 'o'

Never, EVER implement (or in this case, design too!) cryptographic algorithms unless you really know what you are doing. If you decide to go ahead and do it anyway, you're putting your website at risk.
There's no reason you should have to do this: there are most certainly libraries and/or functions that do all of this sort of thing already.
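As one example of such a built-in, PHP (5.5 and later) ships password_hash() and password_verify(), which generate the salt and embed it in the hash for you. This is a sketch of that swapped-in approach, not the asker's scheme; the variable names are illustrative:

```php
<?php
// Hypothetical plain-text password as received from a login form.
$password = 'cool';

// password_hash() picks a salt itself and stores it inside the returned string.
$hash = password_hash($password, PASSWORD_DEFAULT);

// Store $hash in the database; later, check a login attempt against it.
if (password_verify($password, $hash)) {
    echo "Password is valid\n";
}
```

With this there is no need to shuffle or merge salt arrays by hand.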

Related

How to split a search url into an associative array

So I would like to take a string like this,
q=Sugar Beet&qf=vegetables&range=time:[34-40]
and break it up into separate pieces that can be put into an associative array and sent to a Solr Server.
I want it to look like this
['q'] => ['Sugar Beet'],
['qf'] => ['vegetables']
After using urlencode I get
q%3DSugar+Beet%26qf%3Dvegetables%26range%3Dtime%3A%5B34-40%5D
Now I was thinking I would make two separate arrays that would use preg_split() and take the information between the & and the = sign or the = and the & sign, but this leaves the problem of the final and first because they do not start with an & or end in an &.
After this, the plan was to take the two arrays and combine them with array_combine().
So, how could I do a preg_split that addresses the problem of the first and final entry of the string? Is this way of doing it going to be too demanding on the server? Thank you for any help.
PS: I am using Drupal ApacheSolr to do this, which is why I need to split these up. They need to be sent to an object that is going to build q and qf differently for instance.
You don't need a regular expression to parse query strings. PHP already has a built-in function that does exactly this. Use parse_str():
$str = 'q=Sugar Beet&qf=vegetables&range=time:[34-40]';
parse_str($str, $params);
print_r($params);
Produces the output:
Array
(
[q] => Sugar Beet
[qf] => vegetables
[range] => time:[34-40]
)
You could use the parse_url() function to pull the query string out of a full URL.
Also:
parse_str($_SERVER['QUERY_STRING'], $params);
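The two suggestions can be combined. The sketch below assumes the query arrives as part of a full URL (the example.com address is made up for illustration): parse_url() extracts the query component and parse_str() turns it into an associative array, decoding it along the way.

```php
<?php
// Hypothetical URL; only the part after "?" matters here.
$url = 'http://example.com/search?q=Sugar+Beet&qf=vegetables';

// Extract just the query string, e.g. "q=Sugar+Beet&qf=vegetables".
$query = parse_url($url, PHP_URL_QUERY);

// Parse it into key => value pairs; "+" is decoded to a space.
parse_str($query, $params);
print_r($params);
```

Because parse_str() decodes its input, there is no need to urlencode the string before splitting it.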

How a hash or mapping works in PHP

In the language of Perl, I define a hash as a mapping between one thing and another or an essential list of elements. As stated in the documentation..
A hash is a basic data type. It uses keys to access its contents.
So basically a hash is close to an array. Their initializations even look very similar.
If I were to create a mapping in Perl, I could do something like below for comparing.
my %map = (
    A => [qw(a b c d)],
    B => [qw(c d f a)],
    C => [qw(b d a e)],
);
my @keys = keys %map;
my %matches;
for my $k ( 1 .. @keys ) {
    $matches{$_} |= 2**$k for @{$map{ $keys[$k-1] }};
}
for ( sort keys %matches ) {
    my @found;
    for my $k ( 1 .. @keys ) {
        push @found, $keys[$k-1] if $matches{$_} & 2**$k;
    }
    print "$_ found in ", (@found ? join(',', @found) : 0 ), "\n";
}
Output:
a found in A,C,B
b found in A,C
c found in A,B
d found in A,C,B
e found in C
f found in B
I would like to find out the best method of doing this, in terms of performance and efficiency, in PHP.
If I understand correctly, you are looking to apply your knowledge of Perl hashes to PHP. If I'm correct, then...
In PHP a "Perl hash" is generally called an "associative array", and PHP implements this as an array that happens to have keys as indexes and its values are just like a regular array. Check out the PHP Array docs for lots of examples about how PHP lets you work with arrays of this (and other) types.
The nice thing about PHP is it is very flexible as to how you can deal with arrays. You can define an array as having key-value pairs then treat it like a regular array and ignore the keys, and that works just fine. You can mix and match...it doesn't complain much.
Philosophically, a hash or map is just a way to keep discrete pieces of related information together. That's all most non-primitive data structures are, and PHP is not very opinionated about how you go about things; it has lots of built-in optimizations, and does a pretty solid job of doing these types of things efficiently.
To answer your questions related to your example:
1) As for simplicity (I think you mean) and maintainability, I don't think there's anything wrong with your use of an associative array. If a data set is in pairs, then key-value pairs is a natural way to express this type of data.
2) As for most efficient, as far as lines of code and script execution overhead goes...well, the use of such a mapping is a vanishingly small task for PHP. I don't think any other way of handling it would matter much, PHP can handle it by the thousands without complaint. Now if you could avoid the use of a regular expression, on the other hand...
3) You're using it, really. Don't over think it - in PHP this is just an "array", and that's it. It's a variable that holds an arbitrary amount of elements, and PHP handles multiple-dimensions or associativity pretty darn well. Well enough that it's almost never going to be the cause of any problem you have.
PHP will handle things like hashes/maps behind the scenes very logically and efficiently, to the point that part of the whole point of the language is for you not to bother thinking about such things. If you have related pieces of data in chunks, use an array; if the pieces of data come in pairs, use key-value pairs; if they come by the dozen, use an "array of arrays" (a multidimensional array where some, or all, of its elements are arrays).
PHP doesn't do anything stupid like create a massive overhead just because you wanted to use key-value pairs, and it has lots of built-in features like foreach ($yourArray as $key => $value) and the functions you used like array_keys() and array_values(). Feel free to use them - as core features they are generally pretty darn well optimized!
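To make that concrete, here is a rough PHP counterpart to the Perl mapping from the question. It is a sketch, not a line-by-line port: instead of the bitmask trick it builds an inverted associative array directly, and the name $foundIn is chosen here for illustration.

```php
<?php
// Same data as the Perl %map: each key maps to a list of elements.
$map = [
    'A' => ['a', 'b', 'c', 'd'],
    'B' => ['c', 'd', 'f', 'a'],
    'C' => ['b', 'd', 'a', 'e'],
];

// Invert the mapping: for each element, collect the keys it appears under.
$foundIn = [];
foreach ($map as $key => $values) {
    foreach ($values as $value) {
        $foundIn[$value][] = $key; // nested arrays are created on demand
    }
}

// Sort by element name, like "sort keys %matches" in the Perl version.
ksort($foundIn);

foreach ($foundIn as $value => $keys) {
    echo $value, ' found in ', implode(',', $keys), "\n";
}
```

Unlike Perl hashes, PHP arrays keep insertion order, so the keys for each element come out in the order A, B, C.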
For what you are doing I would rather use sprintf:
$format = 'Hello %s how are you. Hey %s, hi %s!';
printf($format, 'foo', 'bar', 'baz');

Php regexp get the strings from array print_r like string

I'm trying to work out how to match strings that look like print_r output for an array.
variable_data[0][var_name]
From the example above I would like to get 3 strings: variable_data, 0 and var_name.
The example above is saved in the DB so that the same array structure can be recreated, but I'm stuck. An if check should also test whether the string has that structure (as above); otherwise no preg_match is needed.
Note: I don't want to serialize the array, since it 'may' contain some characters that might break it when unserializing, and I also need the value in the array to be fully visible.
Does anyone with regexp skills know the approach?
Solution:
(\b([\w]*[\w]).\b([\w]*[\w]).+(\b[\w]*[\w]))
Those first 2 indexes should be skipped... but I still get what I want :)
Not for nothing but couldn't you just do..
$result = explode('[', $someString);
foreach ($result as $i => $v) {
    // Strip the closing bracket left over from each piece
    $temp = str_replace(']', '', $v);
    // Do something with $temp
}
Obviously you need to edit the above a little bit depending on what you are doing but it is very simple and even gives you the same flexibility and you don't need to invoke the matching engine...
I don't think we build regexes here for people... instead please see http://regexpal.com/ for a regex tester / builder with visual aid.
Furthermore, people usually don't know how to use them properly, which is then fostered by others creating the expressions for them.
Please remember that complex expressions can have terrible performance overheads, although there is nothing seemingly complex about your request...
Then, after it is complete, post your completed regex and answer your own question for maximum 1337ne$$ :)
But since I am nice here is your reward:
\[.+\]\[\d+\]
or
[a-z]+_[a-z]+\[.+\]\[\d+\]
Which one to use depends on what you want to match out of the string (which you didn't specify), so I assumed all of it.
Both perform as follows:
arr_var[name][0]; //Matched
arr_var[name]; //Not matched
arr_var[name][0][1];//Matched
arr_var[name][2220][11];//Matched
Again, test them and understand with visual aid at the above link.
Solution:
(\b([\w]*[\w]).\b([\w]*[\w]).+(\b[\w]*[\w]))
Those first 2 indexes should be skipped... but I still get what I want :)
Edit
Here is an improved one:
$str = "variable[group1][parent][child][grandchild]";
preg_match_all('/(\b([\w]*[\w]))/', $str, $matches);
echo '<pre>';
print_r($matches[0]); // $matches[0] holds the full matches; the other entries repeat them per capture group
echo '</pre>';
// Output
Array
(
[0] => variable
[1] => group1
[2] => parent
[3] => child
[4] => grandchild
)
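The question also asked for a check that the string has the bracketed structure before it is split. One way to combine both steps is sketched below; the function name splitBracketPath is made up for this example, and the pattern assumes names and keys consist only of word characters.

```php
<?php
// Hypothetical helper: returns the parts of "name[key1][key2]...",
// or null when the string does not have that shape.
function splitBracketPath(string $path): ?array
{
    // Require a word followed by one or more [word] groups, and nothing else.
    if (!preg_match('/^\w+(\[\w+\])+$/', $path)) {
        return null;
    }

    // Every run of word characters is one part of the path.
    preg_match_all('/\w+/', $path, $matches);
    return $matches[0];
}

print_r(splitBracketPath('variable_data[0][var_name]'));
var_dump(splitBracketPath('plain_string'));
```

The first call yields the three parts variable_data, 0 and var_name; the second returns null because there are no brackets.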

Eliminating commonly used words from a string in PHP, MySQL

I have code that takes a massive string from a SQL database, parses it into individual words and puts them into an array to be counted, with the goal of making a graph of the most used words, but I need to find a means of removing commonly used words. I made a very basic array of words to compare against but it's not very effective. Is there some kind of dictionary file I can compare it to? Any ideas would be fantastic.
I am currently editing an existing "Data representation algorithm" at an internship and I really don't know where to start. It has been suggested I use a dictionary file, but not only do I not have one, I wouldn't know how to compare against it.
You can do this using the in_array function:
<?php
$whitelist = array('a', 'the');
function whitelisted($var)
{
global $whitelist;
return (!in_array($var, $whitelist));
}
$str = "a lazy fox jumped over the lazy farmer";
print_r(array_count_values(array_filter(explode(" ", $str), "whitelisted")));
?>
//produces:
Array
(
[lazy] => 2
[fox] => 1
[jumped] => 1
[over] => 1
[farmer] => 1
)
Of course, you could and should re-arrange this to work with your own scope (global is probably not ideal), but it should get you started on pruning out common words you don't care to count.
http://ideone.com/kfNzM
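A variation without the global, sketched under the assumption that the common words are kept in a plain array (here called $stopWords, a name chosen for this example). It passes the list as a parameter and uses array_diff to drop the unwanted words:

```php
<?php
// Hypothetical stop-word list. With a dictionary file (one word per line,
// file name assumed) it could be loaded with:
// $stopWords = file('stopwords.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
$stopWords = ['a', 'the', 'over'];

function countWords(string $text, array $stopWords): array
{
    // Lower-case first so "The" and "the" are treated as the same word.
    $words = str_word_count(strtolower($text), 1);

    // Keep only the words that are not in the stop-word list.
    $kept = array_diff($words, $stopWords);

    // Count occurrences and put the most frequent words first.
    $counts = array_count_values($kept);
    arsort($counts);
    return $counts;
}

$counts = countWords('A lazy fox jumped over the lazy farmer', $stopWords);
print_r($counts);
```

Comparing against a dictionary file then only changes how $stopWords is filled; the counting stays the same.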

Find 3-8 word common phrases in body of text using PHP

I'm looking for a way to find common phrases within a body of text using PHP. If it's not possible in php, I'd be interested in other web languages that would help me complete this.
Memory and speed are not issues.
Right now, I'm able to easily find keywords, but don't know how to go about searching phrases.
I've written a PHP script that does just that, right here. It first splits the source text into an array of words and their occurrence count. Then it counts common sequences of those words with the specified parameters. It's old code and not commented, but maybe you'll find it useful.
Using just PHP? The most straightforward I can come up with is:
Add each phrase to an array
Get the first phrase from the array and remove it
Find the number of phrases that match it and remove those, keeping a count of matches
Push the phrase and the number of matches to a new array
Repeat until initial array is empty
I'm trash for formal CS, but I believe this is of n^2 complexity, specifically involving n(n-1)/2 comparisons in the worst case. I have no doubt there is some better way to do this, but you mentioned that efficiency is a non-issue, so this'll do.
Code follows (I used a function that was new to me: array_keys, which accepts a search parameter):
// assign the source text to $text
$text = file_get_contents('mytext.txt');
// there are other ways to do this, like preg_match_all,
// but this is computationally the simplest
$phrases = explode('.', $text);
// filter the phrases
// if you're in PHP5, you can use a foreach loop here
$num_phrases = count($phrases);
for($i = 0; $i < $num_phrases; $i++) {
$phrases[$i] = trim($phrases[$i]);
}
$counts = array();
while(count($phrases) > 0) {
$p = array_shift($phrases);
$keys = array_keys($phrases, $p);
$c = count($keys);
$counts[$p] = $c + 1;
if($c > 0) {
foreach($keys as $key) {
unset($phrases[$key]);
}
}
}
print_r($counts);
View it in action: http://ideone.com/htDSC
I think you should go for
str_word_count
$str = "Hello friend, you're
looking good today!";
print_r(str_word_count($str, 1));
will give
Array
(
[0] => Hello
[1] => friend
[2] => you're
[3] => looking
[4] => good
[5] => today
)
Then you can use array_count_values()
$array = array(1, "hello", 1, "world", "hello");
print_r(array_count_values($array));
which will give you
Array
(
[1] => 2
[hello] => 2
[world] => 1
)
An ugly solution, since you said ugly is ok, would be to search for the first word for any of your phrases. Then, once that word is found, check if the next word past it matches the next expected word in the phrase. This would be a loop that would keep going so long as the hits are positive until either a word is not present or the phrase is completed.
Simple, but exceedingly ugly and probably very, very slow.
Coming in late here, but since I stumbled upon this while looking to do a similar thing, I thought I'd share where I landed in 2019:
https://packagist.org/packages/yooper/php-text-analysis
This library made my task downright trivial. In my case, I had an array of search phrases that I wound up breaking up into single terms, normalizing, then creating two and three-word ngrams. Looping through the resulting ngrams, I was able to easily summarize the frequency of specific phrases.
$words = tokenize($searchPhraseText);
$words = normalize_tokens($words);
$ngram2 = array_unique(ngrams($words, 2));
$ngram3 = array_unique(ngrams($words, 3));
Really cool library with a lot to offer.
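For comparison, the same idea can be sketched in plain PHP without the library. The function name countNgrams and the sample text are made up for this example, and the tokenizing here is just str_word_count on lower-cased text, which is much cruder than what the library does:

```php
<?php
// Hypothetical helper: counts every run of $n consecutive words in $text.
function countNgrams(string $text, int $n): array
{
    $words = str_word_count(strtolower($text), 1);
    $total = count($words);
    $counts = [];

    // Slide a window of $n words across the text.
    for ($i = 0; $i + $n <= $total; $i++) {
        $phrase = implode(' ', array_slice($words, $i, $n));
        $counts[$phrase] = ($counts[$phrase] ?? 0) + 1;
    }

    // Most frequent phrases first.
    arsort($counts);
    return $counts;
}

$text = 'the quick brown fox saw the quick brown dog';
$counts = countNgrams($text, 3);
print_r($counts);
```

To cover the 3-8 word phrases asked about in the question, call the function in a loop with $n running from 3 to 8 and keep the phrases whose count is above 1.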
If you want full-text search in HTML files, use Sphinx, a powerful search server.
Documentation is here
