How to create URLs without identifiers - php

These days I am seeing more and more URLs like this: example.com/this_is_content_title
I am not sure how this is done, or even whether it is possible. I know most sites use something like example.com/108/some_text_here, with the 108 part being the identifier/id of the news item, but how is it possible to append the title right after the domain name, and how is the content fetched from the database?
My guess is that they are rewriting it with Apache, but why can't we see the identifier, and how do they search for/find the content based on the text alone?
maybe:
$title = str_replace('_', ' ', trim($_SERVER['REQUEST_URI'], '/'));
$query = "SELECT * FROM article WHERE title = '$title'";
I am assuming that is how they retrieve the content from the database, even though it seems an inefficient way to do it, because it would be much faster to look an article up by its id than by a whole sentence.
Anyway, if that is how they show the articles, how does one append just the title to the URI?
Don't tell me it is something like this:
$row['title'] = 'I am title for some article';
echo "<a href='http://example.com/".str_replace(' ', '_', $row['title'])."'";

This is called an SEO (clean) URL.
You can see some samples here: http://davidwalsh.name/htaccess-url

You are right, that is how it is done, and the string is the identifier here. A workaround here would be an ENUM of all identifiers, but let's not go too deep into that.
It seems an inefficient way to you, and I see your point, but this action is only ~0.01s slower and gives you a big advantage on SEO because you have a clean URL, so nothing too bad there...
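For illustration, a minimal sketch of how such a lookup could work; the slug column and the PDO usage are my assumptions, not something from the original post:
// Assumed: the request path is e.g. /this_is_content_title
$slug = trim(parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH), '/');
// Assumed: the article table has an indexed "slug" column holding the URL string
$pdo = new PDO('mysql:host=localhost;dbname=site', 'user', 'pass');
$stmt = $pdo->prepare('SELECT * FROM article WHERE slug = ? LIMIT 1');
$stmt->execute(array($slug));
$article = $stmt->fetch(PDO::FETCH_ASSOC);
With an index on the slug column, the string lookup costs about the same as an id lookup, which is why the difference mentioned above stays in the hundredths-of-a-second range.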

I can't explain it very well, but I was reading the Apache docs on URL rewriting, and it is quite easy to do what I think they call a 'silent' redirect, so that without changing the URL client-side the server rewrites all requests for example.com/this_is_content_title to example.com/108/some_text_here or, for that matter, example.com/?id=108.
Aside from server configuration, I think WordPress does this with PHP when you enable pretty permalinks, so the WordPress repos and Codex might be a good place to look for more details.
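For example, a typical .htaccess sketch (my own illustration, not taken from either answer) that silently sends every pretty URL to a front controller:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?path=$1 [QSA,L]
index.php can then resolve $_GET['path'] (or $_SERVER['REQUEST_URI']) against the database, much like the slug lookup sketched in the first answer; because this is an internal rewrite, the browser never sees the index.php URL.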

Running preg_replace on html code taking too long

At the risk of getting redirected to this answer (yes, I read it and spent the last 5 minutes laughing out loud at it), allow me to explain this issue, which is just one in a list of many.
My employer asked me to review a site written in PHP, using Smarty for templates and MySQL as the DBMS. It currently runs very slowly, taking up to 2 minutes (with an entirely white screen the whole time, no less) to load completely.
Profiling the code with Xdebug, I found a single preg_replace call that takes around 30 seconds to complete; it goes through all the HTML code and replaces every URL it finds with its SEO-friendly version. The moment it completes, it outputs all of the code to the browser. (As I said before, that's not the only issue; the code is rather old, and it shows, but I'll focus on this one for this question.)
Digging further into the code, I found that it currently runs through 1702 patterns, each with its corresponding replacement (matches and replacements in equally-sized arrays), which would certainly account for the time it takes.
Code goes like this:
// This is just a call to a MySQL query which gets the relevant SEO-friendly URLs:
$seourls_data = $oSeoShared->getSeourls();
$url_masks = array();
$seourls = array();
foreach ($seourls_data as $seourl_data)
{
    if ($seourl_data["url"])
    {
        $url_masks[] = "/([\"'\>\s]{1})".$site.str_replace("/", "\/", $seourl_data["url"])."([\#|\"'\s]{1})/";
        $seourls[] = "$1".MAINSITE_URL.$seourl_data["seourl"]."$2";
    }
}
// After filling both the $url_masks and $seourls arrays, the HTML is parsed:
$html_seo = preg_replace($url_masks, $seourls, $html);
// After it completes, $html_seo is simply echo'ed to the browser.
Now, I know the obvious answer to the problem is: don't parse HTML with a regexp. But then, how do I solve this particular issue? My first attempt would probably be:
Load the (hopefully well-formed) HTML into a DOMDocument, then get each href attribute of each a tag (a rough sketch follows).
Go through each node, replacing the URL found with its appropriate match (which would probably mean using the previous regexps anyway, but on much smaller strings).
???
Profit?
but I think that's most likely not the right way to solve the issue.
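For what it's worth, a rough sketch of that DOMDocument idea, assuming the SEO URLs can be loaded into a plain old-URL => new-URL array (that array is my assumption; the existing code builds regex arrays instead):
// Assumed: $map = array('http://site/old-url' => 'http://site/seo-url', ...)
$dom = new DOMDocument();
libxml_use_internal_errors(true); // tolerate not-quite-well-formed HTML
$dom->loadHTML($html);
libxml_clear_errors();
foreach ($dom->getElementsByTagName('a') as $a) {
    $href = $a->getAttribute('href');
    if (isset($map[$href])) {
        $a->setAttribute('href', $map[$href]); // constant-time lookup instead of 1702 regexes
    }
}
$html_seo = $dom->saveHTML();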
Any ideas or suggestions?
Thanks.
As your goal is to be SEO-friendly, using a canonical tag (<link rel="canonical" href="...">) in the target pages would tell the search engines to use your SEO-friendly URLs, so you don't need to replace them in your code at all...
Oops, that's really tough; bad strategy from the beginning, though anyway that's not your fault.
I have two suggestions:
1 - Create a caching layer with Smarty, so the first HTML render still takes ~2 minutes, but
every following request is served from a static resource.
2 - Don't put off until later what should have been done earlier: fix the system. Create a database migration that stores the SEO URL in a good format, or generate it from the titles or whatever. On my system I generate SEO links in this format (a small parsing sketch follows this list):
www.whatever.com/jobs/722/drupal-php-developer
where I use 722 as the id, parsing the URL to get the right page content, and (drupal-php-developer) is the title of the post or whatever.
3 - (Which is not a suggestion.) Tell your client the project is not well engineered (if you truly believe so) and needs restructuring to boost performance.
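A minimal sketch of the id parsing described in suggestion 2; the URL format comes from the answer, the variable names are mine:
// e.g. $_SERVER['REQUEST_URI'] = '/jobs/722/drupal-php-developer'
$segments = explode('/', trim($_SERVER['REQUEST_URI'], '/'));
$id = isset($segments[1]) ? (int) $segments[1] : 0; // 722
// the trailing 'drupal-php-developer' part exists only for SEO and can be ignored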

Dynamic hierarchical pages and SEO

Cheers everyone!
Please bear with me, I really did do some research on this, but I couldn't come to a final solution, hence I'm here to hear your opinions.
What I want to build is a small i18n-CMS with dynamic hierarchical pages such as:
domain.tld/en/I/am/a/path
I want to find the least performance intense way that allows me to have beautiful, SEO and human-friendly URLs.
I use a closure table, so two tables in the database, one for the page nodes and one for the path tree, plus another table for the localised pages, each referencing a certain page node (three tables in total).
My different solutions so far:
Sure, I could write an algorithm that goes through all the request segments and checks whether there is an English "path" under an "a" under an "am" under an "I", but this seems very unwise considering the number of page hits.
Or is it?
Positive: I wouldn't need to save the path anywhere, because it would be calculated. So moving pages around wouldn't need to recalculate the path and save it again.
I could simply save the whole path to the database, as VARCHAR(2000) or something and then just check if there is a page with path "I/am/a/path" in English language and get that one.
This seems to be rather messy.
As I do it now: currently I add an id at the end of my path, such as:
domain.tld/en/I/am/a/path.1
So if you enter "domain.tld/en.1" you get forwarded to the one with the right slug. But here again I need to save the slug to the database, for each single page.
Also, I would love to get rid of the id (could I do this with mod_rewrite and .htaccess?).
Any more insights on this one? I'm not a web developer, so I'm not really sure about the performance side of it.
Kindest regards,
Meren
It seems to me that a page request will happen a million times more often than an editor changing a page address, so I would definitely go with the save-to-db option. What you can do is create an extra field in which you save the 'slug' for that page; in combination with .htaccess you can rewrite requests for the 'slug' addresses. For example, in http://www.fuuu.com/futest-fu , 'futest-fu' is a slug which could be rewritten to an ID number (or anything you would want it to be). Amongst others, WordPress works this way. Check out this discussion for some insights: http://wordpress.org/support/topic/where-are-the-permalinks-slug-stored-in-the-database
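To make the save-to-db option concrete, a small sketch; the pages table with lang and path columns is my assumption, as is the PDO handle:
// Assumed request: domain.tld/en/I/am/a/path
$parts = explode('/', trim(parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH), '/'), 2);
$lang = $parts[0]; // 'en'
$path = isset($parts[1]) ? $parts[1] : ''; // 'I/am/a/path'
$stmt = $pdo->prepare('SELECT * FROM pages WHERE lang = ? AND path = ? LIMIT 1');
$stmt->execute(array($lang, $path));
$page = $stmt->fetch(PDO::FETCH_ASSOC);
With a unique index on (lang, path) this is a single indexed lookup per request; recalculating and re-saving stored paths only costs something on the rare occasion an editor moves a page.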

Best method for URI Routing this in CodeIgniter?

So, here's an example on Forrst, a CodeIgniter website:
http://forrst.com/posts/PHP_Nano_Framework_WIP_Just_throwing_some_ideas_o-mU8
Look at that nice URL. You've got the root site, then posts, then the post title and a short extract. This is pretty cool for user experience.
However, my CodeIgniter site's URLs just plain suck. E.G.
http://mysite.com/code/view/120
So it accesses the controller code, then the function view, and then the 120 on the end is the post ID (and it does the database queries based on that).
I realised I could do some routing. So in my routes.php file, I put the following in:
$route['posts/(:num)'] = "code/view/$1"; - so this will make http://mysite.com/posts/120 be the same as http://mysite.com/code/view/120. A bit nicer, I think you'll agree.
My question is - how can I use a similar technique to Forrst, whereby an extract of the post is actually appended to the URL? I can't really see how this would be possible. How can the PHP script determine what it should look up in the database, especially if there are several things with the same title?
Thanks!
Jack
To get a URL like in your example you need to add a routing rule, like the one you've already got: $route['posts/(:num)'] = "code/view/$1";. Forrst's URL seems to be "mapped" (or something like that); I think the last part of the URI is the identifier (o-mU8 looks like a hash, but I prefer an int id), which is stored in the db. So when querying, you split the URI on the underscores (_) and take the last part of it, like this within your controller action:
$elements = explode('_', $this->uri->segment(2));
$identifier = $elements[count($elements)-1];
$results = $this->myModel->myQuery($identifier);
Basically the string between the controller segment and the identifier is never used by the code; it is only there for better SEO.
I hope this helps
See the official discussions. The term that is often used for this is "slug". I haven't tested the approach from the CI forums myself, but the suggestions and examples look pretty good.
The URL helper in CodeIgniter has a function called url_title(). I haven't used it myself, but I think it's what you are looking for.
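Putting the two answers together, a sketch of how the link could be built and routed; url_title() and anchor() are from CodeIgniter's URL helper, the route and controller names follow the question, and the exact url_title() defaults vary between CI versions:
// routes.php -- the trailing slug is optional and ignored by the controller
$route['posts/(:num)'] = "code/view/$1";
$route['posts/(:num)/(:any)'] = "code/view/$1";
// when building the link, e.g. in a controller or view
$this->load->helper('url');
$slug = url_title($row['title']); // "PHP Nano Framework WIP" -> "PHP-Nano-Framework-WIP"
echo anchor('posts/'.$row['id'].'/'.$slug, $row['title']);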

How can I have Code Igniter URI segments for multiple variables?

I'm writing an app that allows you to filter database results based on Location and Category.
If someone was to search for Liverpool under the Golf category the URI would be /index.php/search/Liverpool/Golf.
Should someone want to search by Location but not category, they would be sent to /index.php/search/Liverpool
However, should someone want to filter only by category they would be unable to use /index.php/search/Golf because that would be caught by the location search.
Is there a best practice way to have /index.php/search/Golf be recognised? Some best practice as to what else to add to the URI to make these two queries distinct? /index.php/search/category/Golf perhaps?
Though that is beginning to show characteristics of /index.php?search&category=Golf which is exactly what I'm trying to avoid.
Try using $this->uri->uri_to_assoc(n),
described here: http://codeigniter.com/user_guide/libraries/uri.html (about halfway down the page).
Basically you will structure your URL like this:
mysite.com/index.php/search/location/liverpool/category/golf
NOTE: the parameters are optional, so you don't have to have both in there all the time. You can just as well do
mysite.com/index.php/search/location/liverpool/
and
mysite.com/index.php/search/category/golf
This way it will return FALSE if the element you are looking for does not exist.
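In the controller, that could look something like this; a sketch based on the user guide, starting the key/value pairing at segment 2:
// URL: /index.php/search/location/liverpool/category/golf
$params = $this->uri->uri_to_assoc(2);
$location = isset($params['location']) ? $params['location'] : FALSE;
$category = isset($params['category']) ? $params['category'] : FALSE;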
It would probably be best to keep your URI segments relevant no matter what they are searching for.
index.php/LOCATION/CATEGORY
If they are not interested in a location then pass a filler to the system:
index.php/anywhere/golf
Then in your code you just check for that specific ANYWHERE string to determine whether they only want to see the activity. I assume that you are going to be redirecting them with either links or forms (and that they aren't typing the URI string themselves), so you should be safe just passing information that you expect and testing against that.
I use the format suggested by Tom above and then do something along the lines of the code below to determine the value of the parameters.
$segment_array = $this->uri->segment_array();
$is_location_searched = array_search('location', $segment_array);
if ($is_location_searched && $this->uri->segment($is_location_searched + 1))
{
    $location = $this->uri->segment($is_location_searched + 1);
}
Have a look at http://lucenebook.com/#/p:solr/s:wiki and click around a bit on the left-hand navigation. Pay close attention to what happens in the url when you do. I really like this scheme for many reasons.
It's SEO-friendly.
"Curious" people can mix/match the urls and it still resolves to a proper search.
It just looks good!
Of course, the trick is really in the code, in how you build the thing. It took me a few weeks to sort it out, but I finally have my own version of that site. Just not Ajax-based, because I like search engines better than Ajax. Ajax doesn't pay the bills.

Using SEO-friendly links

I'm developing a PHP website, and currently my links are in a facebook-ish style, like so
me.com/profile.php?id=123
I'm thinking of moving to something more friendly to crawling search engines
(like here at stackoverflow), something like:
me.com/john-adams
But how can I differentiate between two users with the same name? Or, more correctly, how does Stack Overflow tell the difference between two questions with the same title?
I was thinking of doing something like
me.com/john-adams-123
and parsing the url.
Any other recommendations?
Stackoverflow does something similar to your me.com/john-adams-123 option, except more like me.com/123/john-adams where the john-adams part actually has no programmatic meaning. The way you're proposing is slightly better because the semantic-content-free numeric ID is farther to the right in the URL.
What I would do is store a unique slug (these SEO-friendly URL components are generally called slugs) in the user table and do the number append thing when necessary to get a unique one.
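A sketch of that slug-plus-number idea; the users table, its unique slug column, and the function name are all my assumptions:
function make_unique_slug(PDO $pdo, $name)
{
    // "John Adams" -> "john-adams"
    $base = strtolower(trim(preg_replace('/[^a-z0-9]+/i', '-', $name), '-'));
    $slug = $base;
    $i = 2;
    $stmt = $pdo->prepare('SELECT COUNT(*) FROM users WHERE slug = ?');
    // append -2, -3, ... until the slug is free
    while (true) {
        $stmt->execute(array($slug));
        if ($stmt->fetchColumn() == 0) {
            return $slug;
        }
        $slug = $base.'-'.$i++;
    }
}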
In stack overflow's case, it's
http://stackoverflow.com/questions/975240/using-seo-friendly-links
http://stackoverflow.com/questions <- Constant prefix
/975240 <- Unique question id
using-seo-friendly-links <- Any text at all, defaults to title of question.
Facebook, on the other hand, has decided to just make everyone pick a unique ID. Then they are going to use that as a profile page. Something like http://facebook.com/p/username/. They are solving the problem of uniqueness between users, by just requiring it to be some string that the user picks that is unique among all existing users.
SO 'cheats' :-).
The link for your question is stackoverflow.com/questions/975240/using-seo-friendly-links, but stackoverflow.com/questions/975240 with any other text after the number works as well.
The part after the number is the SEO-friendly bit, but SO doesn't really care what's there. I think it defaults to the question title.
So in your case you could construct a link like:
me.com/123/john-adams
A second John Adams would have a different id and a unique URL like:
me.com/111/john-adams
I would say that your proposed solution is better than Stack Overflow's, as it preserves content hierarchy:
me.com/john-adams-123
Usage of the unique ID before the username is simply nonsensical.
I would, however, recommend enforcement of content type:
me.com/john-adams-123.html
This will allow for consistent urls while serving a variety of content types.
Additionally, you could use base 36 (sexatrigesimal) for the unique id to further reduce the amount of unnecessary cruft in your URL, especially for high-end numbers, but this is often overkill :D
me.com/john-adams-123.html -> me.com/john-adams-3F.html
me.com/john-adams-1234567890.html -> me.com/john-adams-KF12OI.html
Finally, be sure to use 301 redirects on non-conforming accessible URIs, pointing them to the "correct" SEO-friendly scheme, to prevent duplicate content penalties.
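For the 301 part, a small sketch, assuming the canonical slug for the requested id has already been fetched (the stored slug column and URL layout are assumptions):
// e.g. requested /john-adam-123.html while the stored slug is john-adams-123
$canonical = '/'.$user['slug'].'.html';
if (parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH) !== $canonical) {
    header('Location: http://me.com'.$canonical, true, 301);
    exit;
}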
I'd go with your style of me.com/john-adams-123, because I think the leftmost part of the URI has more importance in SEO ranking.
Actually, if you are willing to use this on several controllers (not just the user profile), you may want to do it more like me.com/john-adams-profile-123, with a rewriting rule redirecting /.+-profile-(\d+) to profile.php?uid=$1, and still be able to use, say, me.com/john-adams-articles-123 for this user's articles...
To avoid dealing with links that contain special characters, you can use this plugin for Zend Framework:
https://github.com/btlagutoli/CharConvert
$filter2 = new Zag_Filter_CharConvert(array(
    'onlyAlnum'         => true,
    'replaceWhiteSpace' => '-'
));
echo $filter2->filter('éééé ááááá ? 90 :'); // eeee-aaaaa-90
This can help you deal with strings in other languages.
