How to generate this kind of random curves? - php

Is it possible to generate this kind of random curve?
I've tried Imagick Bézier curves (see http://www.php.net/manual/en/function.imagickdraw-bezier.php), but even with 20-30 points they do not look like this. Here is my sample: http://mechanicalzilla.com/sandbox/imagick/curve.php
Thank you.

I bet you could write an algorithm that takes some number of random twists before heading straight for the exit coordinates. This assumes the algorithm is smart enough to check the angle of each turn (assuming you don't want to end up with a knotted web).
However, unless this is your graduation project or you are paid by the hour to work on it, this would be a waste of time, and success is highly doubtful.
Even if you managed to write an algorithm that generates a single line, keeping the lines from coming too close to each other is close to impossible. You will end up with something like this:

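Looks like:
x = 0; y = 0; angle = 0;
while (true) {
    // turn by a small random amount; random(1) is assumed to return a value in [0, 1)
    angle = angle + 0.5 - random(1);
    // advance a short step in the current direction
    x1 = x + 0.1 * cos(angle);
    y1 = y + 0.1 * sin(angle);
    if (abs(x1 - x) + abs(y1 - y) < 10)
        drawline(x, y, x1, y1);
    x = x1; y = y1;
    // wrap around at the canvas edges
    if (x < 0) x = width;
    if (y < 0) y = height;
    if (x > width) x = 0;
    if (y > height) y = 0;
}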
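A runnable variant of the same idea might look like the following Python sketch. It only collects the line segments instead of drawing them; the function name `wandering_path` and its parameters are made up for illustration, and the drawing call is left to whatever graphics library is in use.

```python
import math
import random

def wandering_path(width, height, steps, step_len=0.1, max_turn=0.5, rng=random):
    """Return a list of ((x, y), (x1, y1)) segments of a randomly turning walk."""
    x, y, angle = 0.0, 0.0, 0.0
    segments = []
    for _ in range(steps):
        # Turn by a random amount in [-max_turn, max_turn] radians.
        angle += rng.uniform(-max_turn, max_turn)
        x1 = x + step_len * math.cos(angle)
        y1 = y + step_len * math.sin(angle)
        segments.append(((x, y), (x1, y1)))
        # Wrap around at the canvas edges so the walk stays on the canvas.
        x, y = x1 % width, y1 % height
    return segments
```

Each segment can then be handed to a line-drawing routine; a smaller `max_turn` gives smoother, straighter curves.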

This is far from a complete answer, but in my mind's eye it seems like it could help you:
Instead of drawing curves from the start to the end point of the entire line, consider subdividing your board into an evenly spaced grid. Each square in a column of the grid may hold one point of one curve, and you'd steadily advance from left to right (at first, for simplicity's sake).
The randomness comes from picking a square for each curve. To keep it from getting too chaotic, you could bound this randomness, say: "if the distance from square to square is 1, you may only pick a square that satisfies abs(current vertical position - new vertical position) <= 5, unless no such square is free anymore at this point", or some other arbitrary constraint. (The "unless no such square is free anymore" part is important; otherwise it's possible to lock yourself into an unsolvable state.)
(Sorry, drawing curves with my mouse -> worst/no interpolation ever. Catmull-Rom interpolation will probably be your friend here, though, I imagine.)
The result should look loose enough, since the grid keeps your curve points from bunching together arbitrarily, but it's probably very difficult to get a curve to connect to the end point 'fluidly'. It might be a good solution if you don't mind arbitrary end points, that is, if the algorithm can decide for itself where each line ends.
Think this idea might help you with your curves?
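The grid idea can be sketched roughly as follows in Python. This is only an illustration under assumptions: the names `grid_curve_rows` and `catmull_rom` are invented here, the grid stores row indices only, and the uniform Catmull-Rom formula is used to interpolate one coordinate between control points.

```python
import random

def grid_curve_rows(columns, rows, curves, max_step=5, rng=random):
    """For each curve, pick one grid row per column, left to right.

    No two curves share a square in the same column, and a curve moves at
    most max_step rows per column unless no such square is free.
    """
    if curves > rows:
        raise ValueError("need at least one row per curve")
    current = rng.sample(range(rows), curves)  # distinct starting rows
    paths = [[r] for r in current]
    for _ in range(1, columns):
        taken = set()
        for i in range(curves):
            free = [r for r in range(rows) if r not in taken]
            near = [r for r in free if abs(r - current[i]) <= max_step]
            # Fall back to any free square to avoid an unsolvable state.
            choice = rng.choice(near or free)
            taken.add(choice)
            current[i] = choice
            paths[i].append(choice)
    return paths

def catmull_rom(p0, p1, p2, p3, t):
    """Uniform Catmull-Rom interpolation between p1 and p2 for t in [0, 1]."""
    return 0.5 * (2 * p1 + (p2 - p0) * t
                  + (2 * p0 - 5 * p1 + 4 * p2 - p3) * t * t
                  + (3 * p1 - p0 - 3 * p2 + p3) * t * t * t)
```

Each path is a list of row indices, one per column; multiplying by the cell size gives control points, and `catmull_rom` can be applied to the y values of four consecutive points to smooth the curve between the middle two.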

One way to approach this would be to first generate a set of random curves and then use a physics solver to apply repulsion forces between them to avoid clumping.
Here's a quick proof of concept:
I created this using a very niche tool (for anyone interested: the Kangaroo Physics solver, a plugin for Grasshopper, the visual scripting language for Rhinoceros 3D), but you can probably recreate the same concept in any mainstream programming language, e.g. Python.

Related

Examples of log() algorithm using arbitrary precision maths

I'm looking to find an algorithm that I can implement in PHP to get the natural log() of an integer number using arbitrary precision maths. I'm limited by the PHP overlay library of the GMP library (see http://php.net/manual/en/ref.gmp.php for available GMP functions in PHP.)
If you know of a generic algorithm that can be translated into PHP, that would also be a useful starting point.
PHP supports a native log() function, I know, but I want to be able to work this out using arbitrary precision.
Closely related is getting an exp() function. If my schoolboy Maths serves me right, getting one can lead to the other.
Well, you have the Taylor series, which can be rewritten for better convergence:
ln(y) = 2 * [ x + x^3/3 + x^5/5 + ... ], where x = (y-1)/(y+1)
To transform this equality into an algorithm, you have to understand how a converging series works: each term is smaller than the one before it. The terms decrease fast enough that the total sum is a finite value: ln(y).
Because of nice properties of the real numbers, you may consider the following sequence, which converges to ln(y):
L(1) = 2/1 * (y-1)/(y+1)
L(2) = 2/1 * (y-1)/(y+1) + 2/3 * ( (y-1)/(y+1) )^3
L(3) = 2/1 * (y-1)/(y+1) + 2/3 * ( (y-1)/(y+1) )^3 + 2/5 * ( (y-1)/(y+1) )^5
.. and so on.
Obviously, the algorithm to compute this sequence is easy:
x = (y-1)/(y+1);
z = x * x;
L = 0;
// abs() keeps the loop running when y < 1, where x is negative
for (k = 1; abs(x) > epsilon; k += 2)
{
    L += 2 * x / k;
    x *= z;
}
At some point, your x will become so small that it no longer contributes to the interesting digits of L and only modifies the much smaller digits. When these modifications become too insignificant for your purposes, you may stop.
Thus, if you want to achieve a precision of 1e-20, set epsilon to be reasonably smaller than that, and you're good to go.
Don't forget to factor the argument of the log if you can. If it's a perfect square, for example, ln(a²) = 2 ln(a).
Indeed, the series converges faster when (y-1)/(y+1) is smaller, thus when y is smaller (or rather, closer to 1, but that should be equivalent if you're planning on using integers).
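As a rough illustration of how this series could be evaluated with integer-only arithmetic (the kind of operations an arbitrary-precision integer library provides), here is a Python sketch using scaled integers. The name `ln_fixed`, the choice of 10 guard digits, and the stopping rule are assumptions made for this example, not part of the answer above.

```python
def ln_fixed(y, digits, guard=10):
    """Approximate ln(y) * 10**digits as an integer, for an integer y >= 1.

    All values are integers scaled by 10**(digits + guard); the guard
    digits absorb the truncation error of the integer divisions.
    """
    scale = 10 ** (digits + guard)
    x = (y - 1) * scale // (y + 1)   # x = (y-1)/(y+1), scaled
    z = x * x // scale               # z = x^2, scaled
    total = 0
    k = 1
    while x != 0:                    # stop once the term underflows to zero
        total += 2 * x // k          # add 2 * x^k / k
        x = x * z // scale           # next odd power of the original x
        k += 2
    return total // 10 ** guard      # drop the guard digits
```

For example, `ln_fixed(2, 20)` yields the first 20 decimals of ln(2) as an integer. The same loop should port to any big-integer API that offers multiplication and integer division.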

How to get pixels along a circumference

I want to get the values of pixels in a circumference in an image, and plot them.
I know the circumference is C=2*pi*radius, but I am uncertain about how to iterate through all the points on a circle to get the pixel data.
To get a single pixel, this would work. But I need to get pixel values along a circle's circumference. How should I iterate to get that data?
$pixel=getPixel($image, $x, $y);
Look at the answers in "3D sphere boundary".
It is the 3D equivalent of your problem, so it may help, but if you also need the pixel order to be right then it is most likely not for you.
If you want speed, use Bresenham,
but for newbies it can be difficult to implement and even more difficult to understand.
If you want simplicity instead (or to start with), then:
use the parametric circle equation
x=x0+r*cos(t)
y=y0+r*sin(t)
which gives you the pixel positions on the circle boundary for t in <0,2*pi) [rad]; use degrees or radians according to your sin,cos functions
pixels only
the circle circumference is 2*pi*r [pixels], so the step for the parameter t should be small enough to reach every one of those points
dt <= 2*pi/(2*pi*r) // whole circle / number of pixels
dt <= 1/r // let us use half of that for safety
so for extracting points use this C++ code:
#include <cmath> // cos, sin, M_PI
int x0=...,y0=...,r=...; // input values
int xx=x0+r+r,yy=y0,x,y; // xx,yy: last processed pixel, initialized to a point outside the circle
double t,dt=0.5/r;
for (t=0.0;t<2.0*M_PI;t+=dt)
    {
    x=x0+int(double(r)*cos(t));
    y=y0+int(double(r)*sin(t));
    if ((xx!=x)||(yy!=y)) // the position moved to a different pixel (either coordinate changed)
        {
        xx=x; yy=y;
        // here do what you need to do with pixel x,y
        }
    }
If there are holes in your perimeter, then lower dt further. The smaller it is, the finer the step, but it also slows the whole thing down. You can declare r as double, or keep a double copy of it, to avoid two int/double conversions per iteration. xx,yy hold the last used pixel coordinates, to avoid processing a single pixel multiple times.
At the start they are set to a point that is not on the circle, for safety. If r==0, then you should set dt to some safe value like dt=M_PI;
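For comparison, the parametric approach might be sketched in Python like this. The function name `circle_pixels` is made up for the example, and it rounds to the nearest pixel rather than truncating, which is a small deviation from the C++ code above.

```python
import math

def circle_pixels(x0, y0, r):
    """Return the pixel coordinates along a circle, in angular order."""
    if r <= 0:
        return [(x0, y0)]
    dt = 0.5 / r                      # half of the 1/r upper bound, for safety
    steps = math.ceil(2 * math.pi / dt)
    pixels = []
    for i in range(steps):
        t = i * dt
        p = (x0 + round(r * math.cos(t)), y0 + round(r * math.sin(t)))
        # Skip the pixel if it is the same as the one just emitted.
        if not pixels or p != pixels[-1]:
            pixels.append(p)
    # The walk ends where it started; drop the repeated first pixel.
    if len(pixels) > 1 and pixels[-1] == pixels[0]:
        pixels.pop()
    return pixels
```

Each returned pair can then be passed to the pixel lookup, for example the `getPixel($image, $x, $y)` call from the question.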
One way to do this would be to copy the low level code used to plot the pixels when creating the image of a circle on the screen. This works by incrementing (or decrementing) one of the co-ordinates and then adjusting the other one so as to keep the same distance from the centre of the circle. To ensure it is symmetrical, you make sure that each octant of the circle is plotted in exactly the same way. Details at http://www.asksatyam.com/2011/01/bresenhams-circle-algorithm_22.html (Or http://en.wikipedia.org/wiki/Midpoint_circle_algorithm of course).

Generate N random coordinates in a point delimited form using PHP

I believe I need a solution using PHP for the following problem. Let's start by saying we have a map whose width is 100000 and whose height is 100000.
I have a region on that map, defined by many X / Y / Z coordinates, something like:
{{-56000;190073;-4509};{-54955;190073;-4509};{-54954;190638;-4509};{-56000;190638;-4509}}
That's 4 points forming a square on our map. But the zones can be defined by 10+ points, so they are nothing like squares.
Now I need a way to generate N different random coordinates that are INSIDE that region.
I don't know where and how to start with this problem, but I know how to use PHP. I'm just lacking the theory part. What algorithm could I use?
Use the rand function to generate x & y coordinates in the range specified by your bounds:
$x = rand($min_x, $max_x);
$y = rand($min_y, $max_y);
I'm not sure what range you want to use for your z coordinate.
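The bounding-box draw above also returns points outside a non-rectangular region. One possible extension, which is not part of the answer above, is rejection sampling: draw from the bounding box and keep only points that pass a point-in-polygon test (ray casting here). The Python sketch below illustrates it; both function names are invented for the example, and the z coordinate is ignored.

```python
import random

def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through y?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def random_points_in_polygon(polygon, n, rng=random):
    """Return n distinct integer points inside the polygon.

    Assumes the polygon contains at least n integer points;
    otherwise this loop would never finish.
    """
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    points = set()
    while len(points) < n:
        x = rng.randint(min(xs), max(xs))
        y = rng.randint(min(ys), max(ys))
        if point_in_polygon(x, y, polygon):
            points.add((x, y))
    return list(points)
```

The thinner the region is relative to its bounding box, the more draws get rejected, so this works best for reasonably compact zones.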

Pi help with Php (mass looping)

My primary question is:
Is this a lot of loops?
while ($decimals < 50000 and $remainder != "0") {
    $number = floor($remainder/$currentdivider); // Always round down! 10/3 = 3, 10/7 = 1
    $remainder = $remainder%$currentdivider; // 10%3 = 1, 10%7 = 3
    $thisnumber = $thisnumber . $number;
    $remainder = $remainder . 0; // append a zero, i.e. multiply by 10: 1 becomes 10
    $decimals += 1;
}
Or could I fit more into it without the server crashing or lagging?
I'm just wondering.
Also, is there a more efficient way of doing the above (e.g. finding 1/3 = 0.3... to 50,000 decimals)?
Finally:
I'm doing this for a pi formula, the (1 - 1/3 + 1/5 - 1/7 etc.) one,
and I'm wondering if there is a better one (in PHP).
I have found one that finds pi to 2000 in 4 seconds,
but that's not what I want. I want an infinite series that converges closer to pi,
so that on every refresh users can watch it getting closer...
But obviously, converging with the above formula takes a LONG time.
Are there any other 'loop'-like pi formulas (workable in PHP) that converge faster?
Thanks a lot...
Here you have several formulas for calculating Pi:
http://mathworld.wolfram.com/PiFormulas.html
All of them are "workable" in PHP, as in any other programming language. A different question is how fast they are or how difficult they are to implement.
Whether the formulas converge faster or slower is a math question, not a programming one, so I can't help you there. I can tell you that, as a rule of thumb, the fewer nested loops you use, the faster your algorithm will be (this is a general rule, don't take it as the absolute truth!)
Anyway, since the digits of pi are already known up to a certain digit, why don't you copy them into a file and then just index it? That will be extremely fast :)
You can check previous answers to similar questions:
How can pi be calculated to a set number of digits in PHP?
https://stackoverflow.com/questions/3045020/which-is-the-best-formulae-to-find-pi
Check http://mathworld.wolfram.com/PiIterations.html (taken from the last answer). Those formulas use iterations and can therefore be implemented using a loop.
You should use Google and search for "php implementation xxxxxxx" (where xxxxxxx stands for the name of the algorithm you want to search for).
EDIT: Here is an implementation of Viète's formula using a while loop in PHP.
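As a rough stand-in for illustration, Viète's formula (2/pi = sqrt(2)/2 * sqrt(2+sqrt(2))/2 * ...) can be iterated as in the Python sketch below. It uses ordinary floating point, so it tops out at about 15 digits; the function name `viete_pi` is made up for this example.

```python
import math

def viete_pi(iterations):
    """Approximate pi with the first `iterations` factors of Viete's product."""
    term = 0.0
    product = 1.0
    for _ in range(iterations):
        term = math.sqrt(2.0 + term)   # sqrt(2), sqrt(2 + sqrt(2)), ...
        product *= term / 2.0          # running product equals 2/pi in the limit
    return 2.0 / product
```

Each extra factor cuts the error to roughly a quarter, so the approximation visibly improves with every iteration, which suits the "getting closer on every refresh" idea far better than the 1 - 1/3 + 1/5 - ... series.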

LSA - Latent Semantic Analysis - How to code it in PHP?

I would like to implement Latent Semantic Analysis (LSA) in PHP in order to find out topics/tags for texts.
Here is what I think I have to do. Is this correct? How can I code it in PHP? How do I determine which words to choose?
I don't want to use any external libraries. I already have an implementation of the Singular Value Decomposition (SVD).
1. Extract all words from the given text.
2. Weight the words/phrases, e.g. with tf–idf. If weighting is too complex, just take the number of occurrences.
3. Build up a matrix: the columns are some documents from the database (the more the better?), the rows are all unique words, the values are the numbers of occurrences or the weights.
4. Do the Singular Value Decomposition (SVD).
5. Use the values in the matrix S (SVD) to do the dimension reduction (how?).
I hope you can help me. Thank you very much in advance!
LSA links:
Landauer (co-creator) article on LSA
the R-project lsa user guide
Here is the complete algorithm. If you have SVD, you are most of the way there. The papers above explain it better than I do.
Assumptions:
your SVD function will give the singular values and singular vectors in descending order. If not, you have to do more acrobatics.
M: corpus matrix, w (words) by d (documents) (w rows, d columns). These can be raw counts, or tfidf or whatever. Stopwords may or may not be eliminated, and stemming may happen (Landauer says keep stopwords and don't stem, but yes to tfidf).
U,Sigma,V = singular_value_decomposition(M)
U: w x w
Sigma: min(w,d) length vector, or w * d matrix with diagonal filled in the first min(w,d) spots with the singular values
V: d x d matrix
Thus U * Sigma * V = M
# you might have to do some transposes depending on how your SVD code
# returns U and V. verify this so that you don't go crazy :)
Then the dimensionality reduction: the actual LSA paper suggests that a good approximation for the basis is to keep enough vectors that their singular values add up to more than 50% of the total of the singular values.
More succinctly (pseudocode):
Let s1 = sum(Sigma).
total = 0
for ii in range(len(Sigma)):
    val = Sigma[ii]
    total += val
    if total > .5 * s1:
        return ii + 1  # ii is a 0-based index, so ii + 1 singular values are kept
This will return the rank of the new basis, which was min(d,w) before; call it k.
(here, ' -> prime, not transpose)
We create new matrices: U', Sigma', V', with sizes w x k, k x k, and k x d.
That's the essence of the LSA algorithm.
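The rank selection and truncation could be sketched in plain Python as follows. This is a minimal illustration under assumptions: the matrices are lists of rows, `sigma` is the list of singular values in descending order, `v` is already laid out with one singular vector per row (d columns), and the function names are invented for the example. The rank is returned as a count rather than a 0-based index.

```python
def choose_rank(sigma, fraction=0.5):
    """Smallest number of leading singular values whose sum exceeds
    `fraction` of the total."""
    total = sum(sigma)
    running = 0.0
    for index, value in enumerate(sigma):
        running += value
        if running > fraction * total:
            return index + 1
    return len(sigma)

def truncate(u, sigma, v, k):
    """Return U' (w x k), Sigma' (k x k) and V' (k x d)."""
    u_k = [row[:k] for row in u]                      # first k columns of U
    sigma_k = [[sigma[i] if i == j else 0.0 for j in range(k)]
               for i in range(k)]                     # k x k diagonal matrix
    v_k = [row[:] for row in v[:k]]                   # first k rows of V
    return u_k, sigma_k, v_k
```

For singular values `[5, 3, 1, 1]` the cumulative sum first exceeds half of the total (10) after the second value, so two dimensions are kept.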
This resultant matrix U' * Sigma' * V' can be used for 'improved' cosine similarity searching, or you can pick the top 3 words for each document in it, for example. Whether this yields more than a simple tf-idf is a matter of some debate.
To me, LSA performs poorly on real-world data sets because of polysemy, and on data sets with too many topics. Its mathematical / probabilistic basis is unsound (it assumes normal-ish (Gaussian) distributions, which don't make sense for word counts).
Your mileage will definitely vary.
Tagging using LSA (one method!)
1. Construct the U' Sigma' V' dimensionally reduced matrices using SVD and a reduction heuristic.
2. By hand, look over the U' matrix, and come up with terms that describe each "topic". For example, if the biggest parts of that vector were "Bronx, Yankees, Manhattan," then "New York City" might be a good term for it. Keep these in an associative array, or list. This step should be reasonable since the number of vectors will be finite.
3. Assuming you have a vector (v1) of words for a document, then t(U') * v1 (or v1 * U' if v1 is a row vector) will give the strongest 'topics' for that document. Select the 3 highest, then give their "topics" as computed in the previous step.
This answer doesn't directly address the poster's question, but rather the meta question of how to autotag news items. The OP mentions Named Entity Recognition, but I believe they mean something more along the lines of autotagging. If they really mean NER, then this response is hogwash :)
Given these constraints (600 items / day, 100-200 characters / item) with divergent sources, here are some tagging options:
1. By hand. An analyst could easily do 600 of these per day, probably in a couple of hours. Something like Amazon's Mechanical Turk, or making users do it, might also be feasible. Having some number of "hand-tagged" items, even if it's only 50 or 100, will be a good basis for comparing whatever the autogenerated methods below get you.
2. Dimensionality reductions, using LSA, topic models (Latent Dirichlet Allocation), and the like. I've had really poor luck with LSA on real-world data sets and I'm unsatisfied with its statistical basis. LDA I find much better, and it has an incredible mailing list with the best thinking on how to assign topics to texts.
3. Simple heuristics. If you have actual news items, then exploit the structure of the news item. Focus on the first sentence, toss out all the common words (stop words) and select the best 3 nouns from the first two sentences. Or heck, take all the nouns in the first sentence, and see where that gets you. If the texts are all in English, then do part-of-speech analysis on the whole shebang, and see what that gets you. With structured items, like news reports, LSA and other order-independent methods (tf-idf) throw out a lot of information.
Good luck!
(if you like this answer, maybe retag the question to fit it)
That all looks right, up to the last step. The usual notation for SVD is that it returns three matrices A = USV*. S is a diagonal matrix (meaning all zero off the diagonal) that, in this case, basically gives a measure of how much each dimension captures of the original data. The numbers ("singular values") will go down, and you can look for a drop-off for how many dimensions are useful. Otherwise, you'll want to just choose an arbitrary number N for how many dimensions to take.
Here I get a little fuzzy. The coordinates of the terms (words) in the reduced-dimension space is either in U or V, I think depending on whether they are in the rows or columns of the input matrix. Off hand, I think the coordinates for the words will be the rows of U. i.e. the first row of U corresponds to the first row of the input matrix, i.e. the first word. Then you just take the first N columns of that row as the word's coordinate in the reduced space.
HTH
Update:
This process so far doesn't tell you exactly how to pick out tags. I've never heard of anyone using LSI to choose tags (a machine learning algorithm might be more suited to the task, like, say, decision trees). LSI tells you whether two words are similar. That's a long way from assigning tags.
There are two tasks: a) what is the set of tags to use? b) how do you choose the best three tags? I don't have much of a sense of how LSI is going to help you answer (a). You can choose the set of tags by hand. But if you're using LSI, the tags should probably be words that occur in the documents. Then for (b), you want to pick out the tags that are closest to words found in the document. You could experiment with a few ways of implementing that. For instance, choose the three tags that are closest to any word in the document, where closeness is measured by the cosine similarity (see Wikipedia) between the tag's coordinate (its row in U) and the word's coordinate (its row in U).
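Step (b) might be sketched like this in Python. It is only an illustration: the function names are invented, the coordinate vectors are assumed to be the (truncated) rows of U for each tag and each word of the document, and the document is assumed to contain at least one word.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def closest_tags(doc_word_vectors, tag_vectors, count=3):
    """Rank tags by their best cosine similarity to any word in the document."""
    scores = {
        tag: max(cosine_similarity(vec, word) for word in doc_word_vectors)
        for tag, vec in tag_vectors.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:count]
```

Taking the maximum over the document's words mirrors "closest to any word in the document"; averaging over the words instead would be one of the other variants worth experimenting with.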
There is an additional SO thread on the perils of doing this all in PHP.
Specifically, there is a link there to this paper on Latent Semantic Mapping, which describes how to get the resultant "topics" for a text.
