On the PHP manual page for levenshtein(), I am using Example #1 with the following variables:
// input misspelled word
$input = 'htc corporation';
// array of words to check against
$words = array('htc', 'Sprint Nextel', 'Sprint', 'banana', 'orange',
'radish', 'carrot', 'pea', 'bean');
Could someone please tell me why the expected result is carrot rather than htc? Thanks
Because the Levenshtein distance from htc corporation to htc is 12, whereas the distance to carrot is only 11.
The levenshtein function calculates how many characters it has to add, remove, or replace to get to a certain word. Because htc corporation has 12 more characters than htc, it has to remove 12 to get to just htc. Getting to the word carrot from htc corporation takes only 11 changes.
"htc corporation" to "htc" has a distance of 12 (remove " corporation" = 12 characters). "htc corporation" to "carrot" has a distance of no more than 11.
"htc corporation" => "corporation": 4
"corporation" => "corporat": 3
"corporat" => "corrat": 2
"corrat" => "carrat": 1
"carrat" => "carrot": 1
4 + 3 + 2 + 1 + 1 = 11
It looks like what you might be looking for isn't straight-up levenshtein distance, but a "closest substring" match. There's an example implementation of such a thing using a modified Levenshtein algorithm here. Using this algorithm gives scores of:
htc: 0
Sprint Nextel: 11
Sprint: 4
banana: 5
orange: 3
radish: 3
carrot: 3
pea: 2
bean: 3
which recognizes "htc" as an exact substring match and gives it a score of zero. The runner-up, "pea", has a score of two, because you could align it with the "p", the "e", or the "a" in corporation, and then replace the other two characters, etc. When working with this algorithm you should be aware that the score will never be higher than the length of the "needle" string, so shorter strings will generally get lower scores (they're "easier to match").
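For illustration, here is a minimal sketch of one way such a "closest substring" score could be computed. It is not the linked implementation, and substringDistance is a hypothetical name. The only changes from plain Levenshtein are that the first row of the table is all zeros (a match may start anywhere in the haystack) and the result is the minimum of the last row (the match may end anywhere). It compares bytes case-sensitively, so scores for capitalized words may differ depending on how the linked version treats case.

```php
<?php
// Hypothetical helper: smallest edit distance between $needle and
// any substring of $haystack (case-sensitive, byte-based).
function substringDistance(string $needle, string $haystack): int
{
    $n = strlen($needle);
    $m = strlen($haystack);

    // Row 0 is all zeros: the match may begin at any haystack position for free.
    $prev = array_fill(0, $m + 1, 0);

    for ($i = 1; $i <= $n; $i++) {
        // Matching $i needle characters against an empty substring costs $i.
        $curr = [$i];
        for ($j = 1; $j <= $m; $j++) {
            $cost = ($needle[$i - 1] === $haystack[$j - 1]) ? 0 : 1;
            $curr[$j] = min(
                $prev[$j] + 1,        // needle character left unmatched
                $curr[$j - 1] + 1,    // extra haystack character inside the match
                $prev[$j - 1] + $cost // match or substitution
            );
        }
        $prev = $curr;
    }

    // The match may end at any haystack position, so take the row minimum.
    return min($prev);
}

echo substringDistance('htc', 'htc corporation'), PHP_EOL; // 0
echo substringDistance('pea', 'htc corporation'), PHP_EOL; // 2
```

Because the result is the minimum of the last row, and that row starts at the needle length, the score can never exceed the length of the needle, which matches the caveat above.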
Levenshtein distance is a string metric for measuring the difference between two sequences. Informally, the Levenshtein distance between two words is the minimum number of single-character edits (insertion, deletion, substitution) required to change one word into the other.
Here is a simple analysis:
$input = 'htc corporation';
// array of words to check against
$words = array(
'htc',
'Sprint Nextel',
'Sprint',
'banana',
'orange',
'radish',
'carrot',
'pea',
'bean'
);
foreach ( $words as $word ) {
// Count the characters of $input that also occur in $word
$ic = array_intersect(str_split($input), str_split($word));
printf("%s \t l= %s , s = %s , c = %d \n",$word ,
levenshtein($input, $word),
similar_text($input, $word),
count($ic));
}
Output
htc l= 12 , s = 3 , c = 5
Sprint Nextel l= 14 , s = 3 , c = 8
Sprint l= 12 , s = 1 , c = 7
banana l= 14 , s = 2 , c = 2
orange l= 12 , s = 4 , c = 7
radish l= 12 , s = 3 , c = 5
carrot l= 11 , s = 1 , c = 10
pea l= 13 , s = 2 , c = 2
bean l= 13 , s = 2 , c = 2
It is clear that htc has a distance of 12 while carrot has 11. If you want htc, then Levenshtein alone is not enough: you need to check for an exact word match first and then set priorities.
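A minimal sketch of that idea, under the assumption that "exact word" means the candidate occurs somewhere inside the input (case-insensitively). The $rank closure and the two-level ordering are illustrative choices, not part of the manual example.

```php
<?php
$input = 'htc corporation';
$words = ['htc', 'Sprint Nextel', 'Sprint', 'banana', 'orange',
          'radish', 'carrot', 'pea', 'bean'];

// Hypothetical ranking: words contained in the input get priority 0,
// all others get priority 1; ties are broken by plain Levenshtein distance.
$rank = function (string $word) use ($input): array {
    $contained = stripos($input, $word) !== false;
    return [$contained ? 0 : 1, levenshtein($input, $word)];
};

// Arrays of equal length are compared element by element by <=>.
usort($words, fn($a, $b) => $rank($a) <=> $rank($b));

$best = $words[0];
echo $best, PHP_EOL; // htc
```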
I am working on a step sequencer program for drum sounds. It takes a 16-bit binary pattern, for example '1010010100101001', and breaks it into chunks like so: 10, 100, 10, 100, 10, 100, 1. It then assigns each chunk a time value based on how many digits it has. The reason is that some drum samples ring out longer than the length of 1 beat, so the chunking solves this part. For example, if the tempo were 60 bpm, 1 digit = 1 second, so '10' = 2 seconds, '100' = 3 seconds, and '1' = 1 second. This lets me trim the sounds to the proper length in the pattern and concatenate them into a final wav using ffmpeg. Also, 1 = drum hit and 0 = silent hit. This method works great for my needs.
Now I can make perfect beat loops, and I want to add a velocity pattern layer on top of this to allow ghost notes and add human feel and dynamics to my drum patterns. I have decided to use a 0,1,2,3,4 value system for the velocity patterns: '0' = 0% volume, '1' = 25% volume, '2' = 50% volume, '3' = 75% volume, and '4' = 100% volume. (0% volume is there so I can add open hi-hat / cymbal crash hard stops that a 0 in the binary pattern wouldn't do.) So along with the '1111111111111111' pattern you would see a velocity pattern layer, say '4242424242424242'. (That velocity pattern alternates a 100% hit and a 50% hit and sounds good with hi-hats, like a real drummer.)
Using PHP, I am breaking 16-bit binary patterns into an array of chunks. '1001110011110010' would be
['100','1','1','100','1','1','1','100','10']
Now, via a loop, I need to map another 16-digit layer pattern of 0,1,2,3,4 digits to the first digit of each chunk.
Example 1:
Velocity Pattern: '4242424242424242'
Binary Pattern: '1001110011110010'
Array = ['100','1','1','100','1','1','1','100','10']
'100' = 4 (1st digit in 4242424242424242 pattern)
'1' = 2 (4th digit in 4242424242424242 pattern)
'1' = 4 (5th digit in 4242424242424242 pattern)
'100' = 2 (6th digit in the 4242424242424242 pattern)
'1' = 4 (9th digit in the 4242424242424242 pattern)
'1' = 2 (10th digit in the 4242424242424242 pattern)
'1' = 4 (11th digit in the 4242424242424242 pattern)
'100' = 2 (12th digit in the 4242424242424242 pattern)
'10' = 4 (15th digit in the 4242424242424242 pattern)
Example 2:
Velocity Pattern: '4242424242424242'
Binary Pattern: '1111111111111111'
Array = ['1','1','1','1','1','1','1','1','1','1','1','1','1','1','1','1']
'1' = 4 (1st digit in 4242424242424242 pattern)
'1' = 2 (2nd digit in 4242424242424242 pattern)
'1' = 4 (3rd digit in 4242424242424242 pattern)
'1' = 2 (4th digit in 4242424242424242 pattern)
'1' = 4 (5th digit in 4242424242424242 pattern)
'1' = 2 (6th digit in 4242424242424242 pattern)
'1' = 4 (7th digit in 4242424242424242 pattern)
'1' = 2 (8th digit in 4242424242424242 pattern)
'1' = 4 (9th digit in 4242424242424242 pattern)
'1' = 2 (10th digit in 4242424242424242 pattern)
'1' = 4 (11th digit in 4242424242424242 pattern)
'1' = 2 (12th digit in 4242424242424242 pattern)
'1' = 4 (13th digit in 4242424242424242 pattern)
'1' = 2 (14th digit in 4242424242424242 pattern)
'1' = 4 (15th digit in 4242424242424242 pattern)
'1' = 2 (16th digit in 4242424242424242 pattern)
Example 3:
Velocity Pattern: '4231423142314231'
Binary Pattern: '0001000100010001'
Array = ['0','0','0','1000','1000','1000','1']
'0' = 4 (1st digit in 4231423142314231 pattern)
'0' = 2 (2nd digit in 4231423142314231 pattern)
'0' = 3 (3rd digit in 4231423142314231 pattern)
'1000' = 1 (4th digit in 4231423142314231 pattern)
'1000' = 1 (8th digit in 4231423142314231 pattern)
'1000' = 1 (12th digit in 4231423142314231 pattern)
'1' = 1 (16th digit in 4231423142314231 pattern)
The patterns will vary, so I need a method that works even if the pattern starts with 0, etc.
A pattern of '1111111111111111' would be easy, since each 1 is already split into a group by itself.
I tried using a counter called "$v_count" to find the position in the pattern, but it's not working as expected.
$v_count = 0;
$beat_pattern = '1001110011110010';
$velocity_pattern = '4242424242424242';
preg_match_all('/10*|0/', $beat_pattern, $m);
$c_count = count($m, COUNT_RECURSIVE) - 1;
for ($z = 0; $z < $c_count; $z++) {
$z2 = $z;
${"c" . $z} = $m[0][$z];
${"cl" . $z} = strlen($m[0][$z]);
if (${"cl" . $z} == 1 & $m[0][$z] == "0") {
$v_count = $v_count + 1;
echo 'the position of this chunk is: '.$v_count.' in the velocity_pattern<br>';
};
if (${"cl" . $z} == 1 & $m[0][$z] == "1") {
$v_count = $v_count + 1;
echo 'the position of this chunk is: '.$v_count.' in the velocity_pattern<br>';
};
if (${"cl" . $z} > 1) {
if ($z == 1)
{
$v_count = 1;
}
if ($z > 1)
{
$v_count = $v_count + 1;
}
echo ' - the velocity position of this chunk is: '.$v_count.' in the pattern<br>';
$v_count = $v_count + ${"cl" . $z} + 1;
};
}
From the example you've given, it seems that you need the corresponding value from the velocity array and the duration between the 1's in the beat array.
This code first extracts the 1's by splitting it into an array and then filtering out the 0's. So
$beat_pattern = '1001110011110010';
$velocity_pattern = '4242424242424242';
$beat = array_filter(str_split($beat_pattern));
would give in $beat...
Array
(
[0] => 1
[3] => 1
[4] => 1
[5] => 1
[8] => 1
[9] => 1
[10] => 1
[11] => 1
[14] => 1
)
It then takes each entry in turn and works out the length by looking at the next key and subtracting the two, also using the key as the index to get the corresponding velocity.
To account for patterns starting with 0, you can loop up to the first instance of 1 and output the velocity pattern for the same element...
$beat_pattern = '1001110011110010';
$velocity_pattern = '4242424242424242';
$beat = array_filter(str_split($beat_pattern));
$beatKeys = array_keys($beat);
// For the leading 0's
for( $i = 0; $i < $beatKeys[0]; $i++ ) {
echo "1-". $velocity_pattern[$i] . PHP_EOL;
}
for ( $i = 0; $i < count($beatKeys); $i++ ) {
echo ($beatKeys[$i+1] ?? strlen($beat_pattern)) - $beatKeys[$i] . "-".
$velocity_pattern[$beatKeys[$i]] . PHP_EOL;
}
gives (length-velocity)...
3-4
1-2
1-4
3-2
1-4
1-2
1-4
3-2
2-4
Assuming your two input strings:
$binary = '0001000110101001';
$velocity = '4231423142314231';
If you analyse the pattern with a regex, you can obtain all the component parts in one operation, including pauses at the start of the pattern (which are essentially 0% volume beats).
$index = 0;
preg_match_all('/^0+|10*/', $binary, $parts);
foreach ($parts[0] as $part) {
$duration = strlen($part); // How many beats
$volume = $part[0] ? $velocity[$index] : 0; // The corresponding volume number
$index += $duration;
}
To develop this further, it seems to me that it would be practical to produce a proper array of data for the pattern, and you could package up this functionality if you so wanted:
function drumPattern($binary, $velocity) {
$output = [];
$index = 0;
preg_match_all('/^0+|10*/', $binary, $parts);
foreach ($parts[0] as $part) {
$duration = strlen($part);
$output[] = [
'duration' => $duration,
'volume' => $part[0] ? $velocity[$index] : 0
];
$index += $duration;
}
return $output;
}
Example
drumPattern($binary, $velocity);
Produces the following output
Array
(
[0] => Array
(
[duration] => 3
[volume] => 0
)
[1] => Array
(
[duration] => 4
[volume] => 1
)
[2] => Array
(
[duration] => 1
[volume] => 1
)
[3] => Array
(
[duration] => 2
[volume] => 4
)
[4] => Array
(
[duration] => 2
[volume] => 3
)
[5] => Array
(
[duration] => 3
[volume] => 4
)
[6] => Array
(
[duration] => 1
[volume] => 1
)
)
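If each leading 0 should remain its own chunk with its own velocity digit, as in Example 3 of the question, one option is to keep the question's own regex and let preg_match_all report where every chunk starts. This is only a sketch: chunksWithVelocity and its array keys are hypothetical names, and it assumes both patterns have the same length.

```php
<?php
// Hypothetical helper: pair every chunk with the velocity digit found
// at the position where that chunk starts in the beat pattern.
function chunksWithVelocity(string $beat, string $velocity): array
{
    // PREG_OFFSET_CAPTURE turns each match into a [text, offset] pair.
    preg_match_all('/10*|0/', $beat, $m, PREG_OFFSET_CAPTURE);

    $out = [];
    foreach ($m[0] as [$chunk, $offset]) {
        $out[] = [
            'chunk'    => $chunk,
            'duration' => strlen($chunk),
            'velocity' => (int) $velocity[$offset],
        ];
    }
    return $out;
}

$rows = chunksWithVelocity('0001000100010001', '4231423142314231');
echo implode(',', array_column($rows, 'velocity')), PHP_EOL; // 4,2,3,1,1,1,1
```

Using the match offset avoids keeping a separate running counter, which is where the original attempt went wrong.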
I have a text and I want to convert it to an array using explode, but I can't get the correct array.
# SRC-ADDRESS DST-ADDRESS PACKETS BYTES SRC-USER DST-USER 0 10.40.47.48 216.58.205.211 12 822 2 1 10.40.47.48 102.132.97.21 66 9390 2 2 184.106.10.77 10.40.47.252 10 1819 1 3 10.40.47.252 104.27.155.225 1 41 1 4 10.40.47.252 144.76.103.6 5 878 1 5 102.132.97.35 10.40.47.252 11 1159 1 6 10.40.47.252 52.169.53.217 1 397 1 7 104.27.155.225 10.40.47.252 1 52 1
and I want a result like this:
Array
(
[0] => Array
(
[.id] => *0
[src-address] => 10.40.47.50
[dst-address] => 185.144.157.141
[packets] => 6
[bytes] => 1349
)
[1] => Array
(
[.id] => *1
[src-address] => 195.122.177.151
[dst-address] => 10.40.47.48
[packets] => 4
[bytes] => 174
[dst-user] => 2
)
....
I tried this, but it is wrong:
$arr = explode(" ",$text);
Edit:
I can also get the text in another format:
0 src-address=108.177.15.188 dst-address=10.40.47.252 packets=1 bytes=52 dst-user="1" 1 src-address=10.40.47.48 dst-address=172.217.19.150 packets=11 bytes=789 src-user="2" 2 src-address=184.106.10.77 dst-address=10.40.47.252 packets=26 bytes=5450 dst-user="1"
As I mentioned in the comments, one way would be to first explode your input by " " (space). You loop through each element/row of the resulting array. Then you explode each of those by = (equals sign). If the result of that explode is a single-element array, you know you should start a new row and create a key-value pair using your special .id key. If the count of the result is two, take the first part and make it the key of a new key-value pair in the current row, and take the second part and make it the value of that key-value pair.
There's a bit of a wrinkle in the fact that some of your source values are quoted, but you seem to want them not quoted in the result. To handle that, we do a loose equality check on the value to see whether it is the same when converted to an integer. If it is, we convert it to remove the quotes.
$inputText = '0 src-address=108.177.15.188 dst-address=10.40.47.252 packets=1 bytes=52 dst-user="1" 1 src-address=10.40.47.48 dst-address=172.217.19.150 packets=11 bytes=789 src-user="2" 2 src-address=184.106.10.77 dst-address=10.40.47.252 packets=26 bytes=5450 dst-user="1"';
$result = array();
$spaceParts = explode(" ", $inputText);
foreach($spaceParts as $part)
{
$subParts = explode("=", $part);
if(count($subParts) == 1)
{
$resultIndex = (isset($resultIndex) ? $resultIndex+1 : 0);
$result[$resultIndex] = array(".id" => "*".$part);
}
else if(count($subParts) == 2)
{
$result[$resultIndex][$subParts[0]] = ((int)$subParts[1] == $subParts[1] ? (int)$subParts[1] : $subParts[1]);
}
else
{
// unexpected, handle however you want
}
}
print_r($result);
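One caveat worth checking on your PHP version: (int)'"1"' evaluates to 0, because the string begins with a quote character, so the loose comparison above may not strip the quotes as intended. A minimal alternative sketch trims the quotes explicitly and only casts values made up entirely of digits. The variable names are illustrative, and it assumes the first token of the input is always a row number.

```php
<?php
$inputText = '0 src-address=108.177.15.188 dst-address=10.40.47.252 packets=1 bytes=52 dst-user="1" 1 src-address=10.40.47.48 dst-address=172.217.19.150 packets=11 bytes=789 src-user="2" 2 src-address=184.106.10.77 dst-address=10.40.47.252 packets=26 bytes=5450 dst-user="1"';

$result = [];
$row = -1;
foreach (explode(' ', $inputText) as $part) {
    if ($part === '') {
        continue; // ignore doubled spaces
    }
    if (strpos($part, '=') === false) {
        // A bare token starts a new row and becomes its .id
        $result[++$row] = ['.id' => '*' . $part];
        continue;
    }
    [$key, $value] = explode('=', $part, 2);
    $value = trim($value, '"'); // drop surrounding quotes, if any
    // Cast only pure digit strings; IP addresses stay strings.
    $result[$row][$key] = ctype_digit($value) ? (int) $value : $value;
}
print_r($result);
```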
Try reading the string in using str_getcsv replacing the delimiter with whatever the string is delimited by.
var_dump(str_getcsv($input, ","));
Note that the manual states the delimiter must be one character long. If you want a tab or multiple spaces, you will need to look into this answer:
str_getcsv on a tab-separated file
str-getcsv php manual
Here is something that could work, but I would recommend using the CSV methods to read the data in instead. It is also unclear how your data should actually be mapped to the header.
$header = "# SRC-ADDRESS DST-ADDRESS PACKETS BYTES SRC-USER DST-USER ";
$input = "# SRC-ADDRESS DST-ADDRESS PACKETS BYTES SRC-USER DST-USER 0 10.40.47.48 216.58.205.211 12 822 2 1 10.40.47.48 102.132.97.21 66 9390 2 2 184.106.10.77 10.40.47.252 10 1819 1 3 10.40.47.252 104.27.155.225 1 41 1 4 10.40.47.252 144.76.103.6 5 878 1 5 102.132.97.35 10.40.47.252 11 1159 1 6 10.40.47.252 52.169.53.217 1 397 1 7 104.27.155.225 10.40.47.252 1 52 1 ";
$string = str_replace($header, "", $input );
$delimiter = " ";
$columns = 6;
$splitData = explode($delimiter, $string);
$result = [];
$i= 0;
foreach ($splitData as $key => $value) {
$result[$i][] = $value;
if (($key+1) % $columns == 0 ){
$i++;
}
}
var_dump($result);
Using the second example, with the 0 src-address=108.177.15.188 dst-address=10.40.47.252 packets=1 bytes=52 dst-user="1" format, there are 6 entries per row:
$result = array_map(function($v) {
parse_str("id=".implode("&", $v), $output);
return $output;
}, array_chunk(explode(' ', $text), 6));
Explode the string on spaces
Chunk the array into 6 entries per element
Map to a function that implodes each array on & and parse it as a query string
If I have a string as : 10 20 3 4 15 6
How can I convert it to individual numbers and store it in a array?
PHP is very clever when dealing with types of variables. You don't need it to be an integer, it can be a string of numbers, and PHP would still treat it as integers when performing operations on them.
If you want to have each element be the numbers separated by spaces, you simply do
$array = explode(" ", "10 20 3 4 15 6");
The output of $array would then be
Array (
[0] => 10
[1] => 20
[2] => 3
[3] => 4
[4] => 15
[5] => 6
)
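If real integers are needed rather than numeric strings, a small sketch along the same lines is to split on whitespace and cast each piece. The use of preg_split here is an assumption that the input might contain doubled spaces; with single spaces, explode works just as well.

```php
<?php
$str = "10 20 3 4 15 6";

// Split on runs of whitespace, drop empty pieces, then cast each piece to int.
$numbers = array_map('intval', preg_split('/\s+/', $str, -1, PREG_SPLIT_NO_EMPTY));

var_dump($numbers === [10, 20, 3, 4, 15, 6]); // bool(true)

// Numeric strings also take part in arithmetic directly:
var_dump("10" + "20"); // int(30)
```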
$str = "10 20 3 4 15 6";
// explode() splits on the spaces; str_split() would split into single characters instead
$arr = explode(" ", $str);
$intArr = array_map('intval', $arr);
There might be a better way of doing it, but the above should do the job.
I'm working on a project that needs everything shown with barcodes, so I've generated 7 digits for the EAN8 algorithm and now have to get these 7 digits separately. Right now I'm using this for the generation:
$codeint = mt_rand(1000000, 9999999);
and I need to get each of these 7 digits separately so I can calculate the checksum for EAN8. How can I split this integer into 7 parts, for example
1234567 to
arr[0]=1
arr[1]=2
arr[2]=3
arr[3]=4
arr[4]=5
arr[5]=6
arr[6]=7
any help would be appreciated..
also I think that I'm becoming crazy :D because I already tried most of the solutions you gave me here before and something is not working like it should work, for example:
$codeint = mt_rand(1000000, 9999999);
echo $codeint."c</br>";
echo $codeint[1];
echo $codeint[2];
echo $codeint[3];
gives me :
9082573c
empty row
empty row
empty row
solved! $codeint = (string)(mt_rand(1000000, 9999999));
Try to use str_split() function:
$var = 1234567;
print_r(str_split($var));
Result:
Array
(
[0] => 1
[1] => 2
[2] => 3
[3] => 4
[4] => 5
[5] => 6
[6] => 7
)
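Once the digits are available as a string, the check digit itself is a short loop. The sketch below assumes the standard GS1 weighting for EAN-8 (weights 3,1,3,1,3,1,3 from the left over the 7 data digits); ean8CheckDigit is a hypothetical helper name, so verify the result against a known barcode before relying on it.

```php
<?php
// Hypothetical helper: check digit for a 7-digit EAN-8 payload.
// Assumption: GS1 weights 3,1,3,1,3,1,3 applied from the leftmost digit.
function ean8CheckDigit(string $sevenDigits): int
{
    $sum = 0;
    foreach (str_split($sevenDigits) as $i => $digit) {
        $sum += (int) $digit * ($i % 2 === 0 ? 3 : 1);
    }
    // Round the weighted sum up to the next multiple of 10.
    return (10 - $sum % 10) % 10;
}

$code = (string) mt_rand(1000000, 9999999);
echo $code . ean8CheckDigit($code), PHP_EOL; // 8-digit EAN-8 number
```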
There are two ways to do this, one of which is reasonably unique to PHP:
1) In PHP, you can treat an integer value as a string and then index into the individual digits:
$digits = "$codeint";
// access a digit using intval($digits[3])
2) However, the much more elegant way is to use actual integer division and a little knowledge about mathematical identities of digits, namely in a number 123, each place value is composed of ascending powers of 10, i.e.: 1 * 10^2 + 2 * 10^1 + 3 * 10^0.
Consequently, dividing by powers of 10 will permit you to access each digit in turn.
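A minimal sketch of the division approach, assuming a non-negative integer; digitsOf is a hypothetical helper name. Each pass takes the last digit with % 10 and then drops it with intdiv.

```php
<?php
// Hypothetical helper: split a non-negative integer into its decimal digits.
function digitsOf(int $n): array
{
    $digits = [];
    do {
        array_unshift($digits, $n % 10); // current last digit goes to the front
        $n = intdiv($n, 10);             // drop that digit
    } while ($n > 0);
    return $digits;
}

print_r(digitsOf(1234567)); // digits 1 through 7, most significant first
```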
It's basic math: you can divide by 10 in a loop.
12345678 is 8*10^0 + 7*10^1 + 6*10^2...
The other option is to cast it to a char array and then just read each char.
Edit
After @HamZa DzCyberDeV's suggestion:
$string = '12345678';
echo "<pre>"; print_r (str_split($string));
What first came to my mind is shown below, but your suggestion is the better one.
If you're getting a string from your function, then you can use the one below:
$string = '12345678';
$arr = explode(",", chunk_split($string, 1, ','));
$len = count($arr);
unset($arr[$len-1]);
echo "<pre>";
print_r($arr);
and output is
Array
(
[0] => 1
[1] => 2
[2] => 3
[3] => 4
[4] => 5
[5] => 6
[6] => 7
[7] => 8
)
Okay, what you can do is this:
Type cast to string (prefilling with 0 if needed).
This is how it works:
$sinteger = (string)$integer;
$arr = array();
$arrsize = 0;
// walk the string from the last digit down to the first
for ($i = strlen($sinteger) - 1; $i >= 0; $i--)
{
$arr[$arrsize] = $sinteger[$i];
$arrsize++;
}
Whatever positions are left you can then prefill with zeros.
I am sure you can manage the order, reversed or not, but this is a simple approach.
I have read the PHP manual about array_filter:
<?php
function odd($var)
{
// returns whether the input integer is odd
return($var & 1);
}
function even($var)
{
// returns whether the input integer is even
return(!($var & 1));
}
$array1 = array("a"=>1, "b"=>2, "c"=>3, "d"=>4, "e"=>5);
$array2 = array(6, 7, 8, 9, 10, 11, 12);
echo "Odd :\n";
print_r(array_filter($array1, "odd"));
echo "Even:\n";
print_r(array_filter($array2, "even"));
?>
I also see the result here:
Odd :
Array
(
[a] => 1
[c] => 3
[e] => 5
)
Even:
Array
(
[0] => 6
[2] => 8
[4] => 10
[6] => 12
)
But I did not understand this line: return($var & 1); Could anyone explain it to me?
You know && is AND, but what you probably don't know is & is a bit-wise AND.
The & operator works at a bit level, it is bit-wise. You need to think in terms of the binary representations of the operands.
e.g.
7₁₀ & 2₁₀ = 111₂ & 010₂ = 010₂ = 2₁₀
For instance, the expression $var & 1 is used to test if the least significant bit is 1 or 0, odd or even respectively.
$var & 1
0₁₀ & 1₁₀ = 000₂ & 001₂ = 000₂ = 0₁₀ = false (even)
1₁₀ & 1₁₀ = 001₂ & 001₂ = 001₂ = 1₁₀ = true (odd)
2₁₀ & 1₁₀ = 010₂ & 001₂ = 000₂ = 0₁₀ = false (even)
3₁₀ & 1₁₀ = 011₂ & 001₂ = 001₂ = 1₁₀ = true (odd)
4₁₀ & 1₁₀ = 100₂ & 001₂ = 000₂ = 0₁₀ = false (even)
and so on...
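The same table can be printed from PHP as a quick check. This is only an illustrative sketch; isOdd is a hypothetical helper that mirrors odd() from the manual example.

```php
<?php
// Hypothetical helper mirroring odd() from the manual example.
function isOdd(int $n): bool
{
    // Only the least significant bit survives the AND with 1.
    return ($n & 1) === 1;
}

foreach ([0, 1, 2, 3, 4] as $n) {
    // %03b prints the number in binary, zero-padded to three digits.
    printf("%d & 1 = %03b & 001 = %d (%s)\n", $n, $n, $n & 1, isOdd($n) ? 'odd' : 'even');
}
```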
&
It's the bitwise operator. It ANDs each bit of $var with the corresponding bit of 1.
Basically it tests the last bit of $var to see if the number is even or odd.
Example with $var being 000110 in binary, ANDed with 1:
000110 &
000001
------
000000
0 (false) is returned in this case, so the number is even, and your function returns false accordingly.
$var & 1 - is bitwise AND
it checks if $var is ODD value
0 & 0 = 0,
0 & 1 = 0,
1 & 0 = 0,
1 & 1 = 1
So the first callback function returns TRUE only if $var is odd, and the second does the opposite (! is logical NOT).
It is performing a bitwise AND with $var and 1. Since 1 only has the last bit set, $var & 1 will only be true if the last bit is set in $var. And since even numbers never have the last bit set, if the AND is true the number must be odd.
& is bitwise "and" operator. With 1, 3, 5 (and other odd numbers) $var & 1 will result in "1", with 0, 2, 4 (and other even numbers) - in "0".
An odd number has its zeroth (least significant) bit set to 1:
           v
0 = 00000000b
1 = 00000001b
2 = 00000010b
3 = 00000011b
           ^
The expression $var & 1 performs a bitwise AND operation between $var and 1 (1 = 00000001b). So
the expression will return:
1 when $var has its zeroth bit set to 1 (odd number)
0 when $var has its zeroth bit set to 0 (even number)
& is a bitwise AND on $var.
If $var is a decimal 4, it's a binary 100. 100 & 1 is 000, because the rightmost digit of $var is a 0 and 0 & 1 is 0; thus, 4 is even.
It returns 0 or 1, depending on your $var.
If $var is an odd number (e.g. 1, 3, 5 ...), $var & 1 returns 1; otherwise (2, 4, 6 ...) $var & 1 returns 0.