user-friendly URLs reliable with the database - php

I have a database containing a table named songs with a field title.
Now, if my URL is http://www.foo.com/songs/xxx (xxx = the title of the song),
Apache silently redirects to a page that looks like /song.php?title=xxx.
To make the URLs prettier I convert spaces into underscores (because I know some browsers display %20 instead of a space: not%20really%20user%20friendly%20ya%20know%20what%20i%20mean).
There's a snag: if the title contains both spaces and underscores (e.g. DJ_underscore fx) and the script converts it into DJ_underscore_fx, the SQL:
select * from songs where songs.title=xxx
can't find it.
Here's a sketch to be more specific:
1. A script fetches the different titles from the database.
2. It converts every space into an underscore (e.g. name_of the song -> name_of_the_song).
3. It echoes them as links (e.g. name_of_the_song).
4. The user clicks on the link and requests the document.
5. Apache silently redirects (e.g. /songs/name_of_the_song -> /song.php?title=name_of_the_song).
6. song.php fetches the specific data (e.g. select * from songs where songs.title=name_of_the_song).
As you can see, there's no entry in the database that looks like name_of_the_song, only name_of the song.
How can I handle this so that my URLs stay clean and the title field is not restricted to a certain set of characters (it can contain spaces, underscores, dashes, well, anything)?

Use something like /1234/name-of-page/ where 1234 is the primary key ID of the row and name-of-page is ignored by your script.
This gives a link directly to the primary key of the entry in the table, which will give you several benefits:
No need to have duplicate ID fields.
Fast indexing on SELECT queries.
You still get the readability and SEO benefits of a "pretty" URL.
You might notice that StackOverflow itself does exactly this:
/questions/8211267/user-friendly-urls-reliable-with-the-database/
Which probably gets re-written to something like:
question.php?id=8211267
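A minimal sketch of how the script behind such a URL might pull out the ID, assuming the path has the form /1234/name-of-page/. The function name extractIdFromPath is made up for illustration; the slug segment is matched but never used for the lookup.

```php
<?php
/**
 * Pull the numeric primary key out of a path such as /1234/name-of-page/.
 * The slug segment after the ID is matched but deliberately ignored.
 * Returns null when the path does not start with a numeric segment.
 */
function extractIdFromPath(string $path): ?int
{
    if (preg_match('#^/(\d+)(?:/[^/]*)?/?$#', $path, $m) !== 1) {
        return null;
    }
    return (int) $m[1];
}
```

The returned integer can then go straight into a query on the primary key, so a renamed or mistyped slug still resolves to the same row.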

Just add another field that keeps the exact name used in the URL. When you get "duplicates", append _2, _3, etc. to them, or give the user a way to edit the name and choose another one manually.
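The _2, _3 suffixing could look roughly like this. It is only a sketch: uniqueUrlName is a hypothetical helper, and $taken stands in for the URL names already stored in the table.

```php
<?php
/**
 * Return $base if it is still free, otherwise $base_2, $base_3, ...
 * $taken is the list of URL names already in use (hypothetical input).
 */
function uniqueUrlName(string $base, array $taken): string
{
    $candidate = $base;
    $n = 2;
    while (in_array($candidate, $taken, true)) {
        $candidate = $base . '_' . $n;
        $n++;
    }
    return $candidate;
}
```

In a real application a UNIQUE index on the URL column would still be worth having, since two inserts can race between the check and the write.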

What you're trying to achieve is definitely the wrong way: you could have hundreds of variations to look up in your database, and it is also bad for SEO.
Start by setting a rule that all URLs use _ in place of spaces; that's how most site URLs are done (digg.com being an example).
Then create a separate field that stores the URL, e.g.
title | url
song name | song_name
Then do your lookup based on the URL field.
For legacy reasons you could also replace any spaces with _ in your lookup script when you receive the title from the GET request, before doing the database query.

Well, if you want spaces in the URL, they will be URI-encoded in transit. Rather than replacing every _ with a space, just decode the value on the server (in PHP, urldecode() or rawurldecode()); that still allows spaces to be typed. For the text shown in the link, can't you use str_replace to convert %20 into spaces?
Either that, or have a computer-friendly version of the title (one that uses underscores instead of spaces) and a user-friendly column that does have the spaces.

Related

Dynamic URL rewriting using codeigniter/php/htaccess when change in url

I have a URL like base_url()/controller/rowtitle-rowid.
This URL is built by a function from the corresponding row in the database.
When I change the rowid in the URL, I need the whole URL to change, that is, it should be rewritten to match the row with that ID in the database.
For example:
Consider the following URLs:
1. https://www.marketsandmarkets.com/Market-Reports/medical-device-testing-market-254474064.html
2. https://www.marketsandmarkets.com/Market-Reports/timing-relay-market-241993160.html
If I change the trailing number in URL 2 (241993160) to the number in URL 1 (254474064), it should rewrite the URL to URL 1 and redirect there.
Is this possible?
Thanks in advance.
You say you are modifying the rowid, which I assume is the number you bolded in URLs 1 and 2, and at the same time you want to redirect to a different slug and rowid; the slug is the dashed text before the rowid at the end of the URL, e.g. some-page-name.
I'm not sure what the rowid has to do with anything; perhaps it's just a random int to make sure you don't have duplicate slugs, but it isn't strictly required as long as you increment your titles when you have duplicates, like some-page-name and some-page-name-1.
In PHP and CodeIgniter what you are trying to do is indeed possible as long as you keep a database table with all the iterations of the slug. Then in your Market-Reports controller you just look for the most recent iteration of the slug and serve that as a link to the user. Or, if the user is coming from a search engine and lands on, say, the Market-Reports single-page controller at timing-relay-market-241993160, you query the database where rowid = 'someid', order by created or something similar, and then redirect the user to that slug, which would be medical-device-testing-market-254474064.
Further, if your database table for keeping track of slugs has the columns rowid, rowtitle, fullslug, you could extract the rowid from the end of the fullslug (intval() only works on a string that starts with digits, so apply it to the part after the last dash) and do the same type of thing as I just outlined in the previous paragraph.
NOTE: You cannot do this with static HTML pages. I am assuming that, since you tagged PHP and CodeIgniter, you are using PHP and controllers.
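As a rough illustration of the lookup described above, here is a sketch in plain PHP. The function names are invented for this example, and $slugsById is a stand-in for the slug table (rowid => current fullslug); in a CodeIgniter controller the returned slug would typically be passed to the redirect() helper.

```php
<?php
/**
 * Extract the trailing numeric rowid from a slug such as
 * "timing-relay-market-241993160" (an optional ".html" suffix is allowed).
 */
function rowIdFromSlug(string $slug): ?int
{
    if (preg_match('/-(\d+)(?:\.html)?$/', $slug, $m) !== 1) {
        return null;
    }
    return (int) $m[1];
}

/**
 * Decide where to send the visitor. Returns the slug to redirect to, or
 * null when the requested slug is already canonical or the rowid is unknown.
 */
function redirectTarget(string $requested, array $slugsById): ?string
{
    $requested = preg_replace('/\.html$/', '', $requested);
    $id = rowIdFromSlug($requested);
    if ($id === null || !isset($slugsById[$id])) {
        return null;
    }
    return $slugsById[$id] === $requested ? null : $slugsById[$id];
}
```

With a table that maps 254474064 to medical-device-testing-market-254474064, a request for timing-relay-market-254474064.html would resolve to the medical-device slug, which matches the behaviour asked for in the question.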

get row in mysql using data from friendly url and using regular expression

I need to write code that gets data from a database by a friendly URL.
I have a table company with the field title for storing some info about a company. I want to get data by title using a friendly URL, e.g. example.com/company/aurum-1. First I tried to change some defined symbols to %:
function seoUrl2CompanyName($string) {
    $string = preg_replace("/[-ecsui]/", "%", $string);
    return $string."%";
}
[-ecsui] is used because my native language has non-standard symbols like šįėęų, which I can't use in my friendly URL, so I tried to change them to % and use the following MySQL query to find the company by title:
$SQL = "SELECT * FROM company
        WHERE title LIKE '".seoUrl2CompanyName($_GET['company'])."'";
But with this logic I run into difficulties when the select returns more than one row. E.g.
example.com/company/aurum -> seoUrl2CompanyName('aurum') -> a%r%m% ->
LIKE 'a%r%m%' -> 24 rows in my table match this pattern
My goal is to create the fastest way to find the company from company table by name using data from url.
I would take the suggestion from @AgeDeO but expand your SQL like this, so that it matches both the company name and the ID you get from your URL:
$SQL = "SELECT *
        FROM company
        WHERE title LIKE '".seoUrl2CompanyName($_GET['company'])."'
        AND ID = ".(int) $myId;
With these two factors you should only get one row, and you can be sure that no one just replaces the 1 with a 2 and gets another company's data.
NEVER EVER DO THIS:
I want get data by title
What are you going to do when the company name changes? What if two companies have the same name? What happens with spaces, special characters, etc.?
I understand that you want friendly URLs, and that is possible: just add the company name as dummy data in the URL. Show the company name but do not use it.
Use example.com/1/company/aurum-1 instead, where the 1 is the actual company ID.
Beware that this makes it fairly easy to guess other companies: if I change the 1 to a 2, I could get access to another company. If you do not want this, make sure you check permissions on page load.
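A sketch of what the ID-based lookup might look like, assuming a path of the form /1/company/aurum-1 and an already configured PDO connection. Both function names are hypothetical; the table and column names (company, ID) are taken from the question.

```php
<?php
/**
 * Read the company ID from a path like /1/company/aurum-1.
 * The trailing name segment is cosmetic and is not used for the lookup.
 */
function companyIdFromPath(string $path): ?int
{
    if (preg_match('#^/(\d+)/company(?:/[^/]*)?/?$#', $path, $m) !== 1) {
        return null;
    }
    return (int) $m[1];
}

/**
 * Fetch one company row by primary key using a prepared statement,
 * so the value from the URL never ends up inside the SQL string.
 */
function findCompanyById(PDO $pdo, int $id): ?array
{
    $stmt = $pdo->prepare('SELECT * FROM company WHERE ID = :id');
    $stmt->execute([':id' => $id]);
    $row = $stmt->fetch(PDO::FETCH_ASSOC);
    return $row === false ? null : $row;
}
```

The permission check mentioned above would sit between the two calls: parse the ID, confirm the current user may see that company, then fetch the row.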

MySQL exact URL search

So I'm trying to merge two databases of company information (Table A and Table B from here on out) where the most common (and reliable) single reference point is the website URL. Table A is up-to-date, and Table B is to be updated.
I've extracted the URLs from Table A and cleaned them up using PHP (about 6000 URLs) and the plan is to find and update some information in Table B based on the URLs found (but not the URL itself).
In Table A the URLs are all either domain.com or www.domain.com or www.subdomain.domain.com without http:// or any trailing /'s or other URL data. In Table B they are raw URLs which might contain any extra information with them such as http:// etc.
Now I've tried searching for the company by the URL in Table B like so:
SELECT * FROM companies WHERE website LIKE '%$url1%' OR website LIKE '%$url2%'...
While this works, it is also pulling out information that isn't correct. For example, I don't have bt.com (or any variation of) in the list from Table A, yet it is matching on it in Table B (there is a www.corporate.bt.com in Table A which I think it is matching on).
So, how can I stop this from happening? It's clearly finding something LIKE it in the URL list, but I only want to match on the exact string. So in the example above, if I'm searching for www.corporate.bt.com it should only return that if it finds it within a string (http://www.corporate.bt.com/ is fine, http://bt.com/ is not)
Also, what would be the best possible way of performing this action with a dataset this large? Table A has around 6,000 URLs, Table B has 14,000 (not all of Table A will be in Table B).
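LIKE won't give you an exact match, but you can use MySQL's REGEXP with word-boundary markers, which matches the URL only as a whole token inside the searched field. (MySQL 8.0.4 and later no longer accept [[:<:]] and [[:>:]]; use \\b there instead.)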
SELECT * FROM companies WHERE website REGEXP '[[:<:]]$url1[[:>:]]' OR
website REGEXP '[[:<:]]$url2[[:>:]]'
Or, if the field holds only a single URL, you can use the = operator:
SELECT * FROM companies WHERE website = '$url1' OR website = '$url2'
UPDATE
You can extend the REGEXP search so that you pass in only the server name, e.g. domain.com, domain1.com, abc.domain.com; see the query below:
$url = "domain.com";
$url1 = "domain1.com";
SELECT * FROM companies WHERE
website REGEXP '^(htt(p|ps):\/\/|htt(p|ps):\/\/www\.)($url)$' OR
website REGEXP '^(htt(p|ps):\/\/|htt(p|ps):\/\/www\.)($url1)$'
So it turns out that I hadn't filtered the list of addresses in Table A well enough, and a URL of 'http' had slipped through, which meant that every URL containing 'http' was being found...
So I added another filter that checks for the presence of a . in the URL, which ensures that it is at least something.something:
if (strpos($domain, ".") !== false) {
    // It has a .
}
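One way to avoid pattern matching altogether is to reduce both tables' values to a bare host name in PHP and compare with plain equality. The sketch below is an assumption about how that could be done, not the method used in the thread; normalizeHost is a made-up name, and it folds in the "must contain a dot" filter from above.

```php
<?php
/**
 * Reduce a raw website value to its lowercase host name, or return null
 * when no plausible host (something.something) can be found.
 */
function normalizeHost(string $raw): ?string
{
    $raw = trim($raw);
    // parse_url() only reports a host reliably when a scheme is present.
    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $raw) !== 1) {
        $raw = 'http://' . $raw;
    }
    $host = parse_url($raw, PHP_URL_HOST);
    if (!is_string($host) || strpos($host, '.') === false) {
        return null;
    }
    return strtolower($host);
}
```

With hosts normalized on both sides, www.corporate.bt.com and bt.com are simply different strings, so the false match from the question cannot occur. Storing the normalized host in its own indexed column would probably also help with a few thousand rows on each side.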

MySQL Query to Replace http:// and www in website field

I have a column in my database for website URL and there are many different types of results. Some with www, some with http:// and some without.
I really need to clean up the field, so is there a query I can run to:
1. Reduce all domains to plain domain.com format, i.e. remove any www or http://.
2. Empty any field with an invalid format such as "N/A", i.e. anything without a ".".
And then of course I will update my PHP code to automatically strip it from now on. But for the current entries I need to clean those up.
You can use the REPLACE function to achieve your first point; see the other answers for this. However, I would seriously consider leaving www in the entries as is because, as the first comment points out, there are actual differences. You might also miss URLs like www2.domain.com, for example. If you want to display them in your app, you can simply remove the prefix in the text presentation (by taking the substring after the first '.', for example) but leave the href consistent (if displayed as links).
Your second point can be achieved using the INSTR or LOCATE functions.
Simply:
UPDATE table SET url = 'N/A' WHERE LOCATE('.', url) = 0
Read more about both functions here
UPDATE table SET column = REPLACE(column, 'http://', '');
UPDATE table SET column = REPLACE(column, 'www.', '');
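For the PHP side ("strip it from now on"), something along these lines might do. It is a sketch under the assumption that only a leading http://, https:// or www. should go; cleanWebsite is a hypothetical name, and stripping www. is optional given the caveat raised above.

```php
<?php
/**
 * Reduce a website value to bare "domain.com" form.
 * Values without a dot (e.g. "N/A") become an empty string.
 */
function cleanWebsite(string $value): string
{
    $value = strtolower(trim($value));
    // Strip the scheme and one leading "www." only at the start of the value.
    $value = preg_replace('~^https?://~', '', $value);
    $value = preg_replace('~^www\.~', '', $value);
    // Drop any path, query string, fragment or trailing slash.
    $value = preg_replace('~[/?#].*$~', '', $value);
    return strpos($value, '.') === false ? '' : $value;
}
```

Unlike a blanket REPLACE(column, 'www.', ''), anchoring the pattern at the start leaves hosts such as www2.domain.com untouched.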

Constructing URL without item ID and getting right item

I have had this problem for a while.
Let's say we have a movies website,
and we have a movie named Test-movies123! in the database.
What I would do now is make a URL watch/test-movies123-{$id}/ and then query the DB with the ID.
The issue with this is that the ID shouldn't be there; how can I get around this?
If I take test-movies123 from the URL and search for it, I won't find it because it has no !, unless I use LIKE, but that's not very reliable...
Could anyone suggest anything? It would be much appreciated.
Well, you could create a rule for taking the movie title and turning it into a slug. So, you'd know that you always lowercased the title, removed anything other than letters, numbers and dashes, and converted whitespace into a single dash.
Then store that in another column in your database, and be sure you are forcing uniqueness. Take the URL and search that column from that.
From that point you just have to deal with what happens if you have a second video uploaded that produces the exact same slug. There are a number of options for this ... append a random number slug, increment a number and append it, etc.
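The slug rule described above could be sketched like this. slugify is an illustrative name, and the function is ASCII-only: accented or other non-ASCII letters are simply dropped, which may or may not be what you want.

```php
<?php
/**
 * Turn a title into a slug: lowercase, whitespace runs become a single
 * dash, and anything other than a-z, 0-9 and dashes is removed.
 */
function slugify(string $title): string
{
    $slug = strtolower(trim($title));
    // Collapse each run of whitespace into a single dash.
    $slug = preg_replace('/\s+/', '-', $slug);
    // Drop everything that is not a lowercase letter, digit or dash.
    $slug = preg_replace('/[^a-z0-9-]/', '', $slug);
    // Collapse repeated dashes and trim them from the ends.
    $slug = preg_replace('/-+/', '-', $slug);
    return trim($slug, '-');
}
```

The result would be stored in its own column with a unique index, and the URL lookup then becomes an exact match on that column rather than a LIKE on the title.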
To do that, you could store something like "test-movies123" in your database as the primary key.
Imagine you have a control panel where you insert movies through a form.
Then use the title Test Movies123! to save it in the database like this example:
id: AUTO_INCREMENT NUMBER
keyname: sanityTitle("Test Movies123!") <-- this should save "test-movies123"
title: "Test Movies123!"
stuff: "blablabla"
Note that sanityTitle() is your own function that prepares friendly URLs from titles.
Then your URL will look like
watch/test-movies123/ using regex rewrite rules for the URLs
or
watch/?id=test-movies123 raw
You then search on the INDEXED or PRIMARY key "keyname" in the table, and it will return 1 row with all your stuff.
