htaccess files

Discussion in 'General Webmaster Helpdesk' started by melkior_inactive, Apr 9, 2007.

  1. You can't live with them, you can't live without them.
    Whether you want it or not, creating a good web based application these days involves editing and creating these little buggers.
    And they are ugly little brutes. If you want to really offend someone tell him: "You look like a .htaccess file!" That should do the trick.

    But all in all, you have to tame them to get results, and this is where this thread steps in.
    I'll do my best to explain, in a friendly way, what these files are and what you can do with them.

    Now, I've seen hundreds of .htaccess tutorials on the net. Some are OK, and some look like they were written for people with 3 brains and 7 microprocessors in their heads. But most of them fail to describe what a .htaccess file actually is. Sometimes even I'm not sure what they are, but I'll try to explain.

    First of all, they only work with the Apache web server! So it doesn't matter whether your web server runs Linux, Mac OS or Windows; it needs to be running Apache.
    Now that we have that cleared up, what do they do?
    Imagine the Apache server as the USA (United States of America in case you think it's an IT acronym). And like the USA, Apache has its rules. The USA has laws, the constitution and whatnot. Apache has a configuration file (usually httpd.conf or apache2.conf, but it can be named differently on your server). The USA has amendments, and Apache has modules. These are the extensions and changes to the original set of rules.
    Finally, each state in the USA has its own state laws. Each folder has its own "laws" and they can be found in the .htaccess file.
    Laws are complicated to read, and so are htaccess files. But if lawyers can read laws, then webmasters should be able to read htaccess files.

    OK, enough with the metaphor. A few more important things are:
    If Apache can't find a .htaccess file in a folder, it uses the global rules set in the main configuration (which you usually can't edit unless it's your own server). And if you put a .htaccess file in the folder /yourserver/www/script/, it applies to all the subfolders of script/ as well.
    But sometimes you want some subfolders to have a different set of rules. That's fairly easy. Just create a new .htaccess file in the subfolder of your choice and its settings will override the matching ones from the .htaccess file in the parent folder.
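    Here's a rough sketch of how that layering plays out. The admin/ subfolder and the file names index.html and start.html are made up for illustration, and it assumes your host allows these directives in .htaccess files:

    ```apache
    # File: /yourserver/www/script/.htaccess
    # Applies to script/ and every folder below it.
    Options -Indexes
    DirectoryIndex index.html

    # File: /yourserver/www/script/admin/.htaccess  (hypothetical subfolder)
    # Only overrides what it mentions: admin/ gets a different default page,
    # while "Options -Indexes" is still inherited from the parent folder.
    DirectoryIndex start.html
    ```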
    In the next post I'll give you an overview of things that can be done with one of these files, and then we'll go on to writing our own rules.
  2. One more thing I forgot to add: you should set the permissions on your .htaccess files (chmod) to 644 so the server can read them but nobody except you can change them.
    Exposing your .htaccess file poses a serious security risk, so be warned!
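    Many Apache installs already carry a rule like the following in the main configuration, but if you're unsure, a minimal sketch such as this one keeps visitors from fetching any file whose name starts with .ht (so both .htaccess and .htpasswd). It assumes Apache 2.4 syntax:

    ```apache
    # Refuse web requests for .htaccess, .htpasswd and friends.
    # Apache 2.4 syntax; on 2.2 the equivalent would be
    # "Order allow,deny" followed by "Deny from all".
    <FilesMatch "^\.ht">
        Require all denied
    </FilesMatch>
    ```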

    OK, here's a list of things you can do with them:
    1. Password protect your folders
    2. Rewrite your URLs (the infamous mod_rewrite) -- I'm not going to write about this since I have already written a separate post about it.
    3. Change the default error documents
    4. Block and allow users by IP addresses
    5. Block referrers
    6. Block bots you don't like and/or offline browsers, site downloaders etc...
    7. Prevent listing of directories
    8. Add MIME types
    9. Redirect pages
    10. Stop people hotlinking your files
    11. Enable SSI
    12. Change your default pages
  3. Password protecting your web folders

    1. Password protecting your web folders

    So, you've finally decided to write a love song for your girlfriend who works as an IT technician, and you thought putting it on your website might be romantic. But you don't want your friends to see how you make a complete fool out of yourself.
    Well, I have some good news and some bad news for you. The good news is, you can password protect your love song and send the access data to your girlfriend. The bad news is that you're still making a fool out of yourself since it's not romantic at all.

    You still want to do it? Wow, you're persistent. Well here's how to do it:
    You have a site: www.yoursite.com
    and you've decided to put the love song in a subfolder called lovesong/.
    And you want it protected.
    Create the subfolder lovesong in your public_html folder on the web server.
    Create a .htaccess file with this content in the lovesong folder:
    AuthUserFile /home/myaccount/safedirectory/.htpasswd
    AuthGroupFile /dev/null
    AuthName EnterPassword
    AuthType Basic
    require user mydarling
    The above code will produce a password protected directory which only your darling could access.
    The first line specifies the filesystem path (not the URL) to the .htpasswd file which contains the username and the hashed password. If possible, put this file in a folder which can't be accessed over the Web (usually a folder called private/, or any folder which is not contained in the public_html/ or www/ folder).
    Now all you need to do is create the content for this file.
    It should contain one line per user, in the form username:hashedpassword.
    To hash the password you can use the htpasswd utility that ships with Apache, or one of the many online generators you'll find on the net.

    The require user line specifies that only the user mydarling can access the content of the folder lovesong/.
    If you have more than one girlfriend (you dirty dog! :D), add their user data to the .htpasswd file and in your .htaccess file change the line:
    require user mydarling
    to:
    require valid-user
    This allows access to all authenticated users from the .htpasswd file.
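    If you'd rather let in a couple of named users instead of everyone in the file, here's a small sketch. The realm text "Love song" and the second user name bestfriend are made up for illustration:

    ```apache
    AuthType Basic
    AuthName "Love song"
    AuthUserFile /home/myaccount/safedirectory/.htpasswd
    # Only these two accounts get in, even if .htpasswd lists more users.
    # "mydarling" comes from the example above; "bestfriend" is a made-up name.
    Require user mydarling bestfriend
    ```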

    Now upload your love song (or whatever you're trying to hide) to the lovesong/ sub folder and you're good to go (get the boot from the girlfriend).
  4. temi

    temi Facilitator Webmaster

    Excellent tutorials, keep going :)
  5. Changing error documents

    3. Changing error documents

    So what's this? Well you must have seen HTTP errors from time to time. You know: "Error 404 - Not found" and stuff like that.
    Instead of the default white pages with black text you can have flashy pages that go well with your design.
    There's no real reason for doing this except the fact that it makes your website look more professional.
    The usual errors you'll be creating new pages for are:
    400 - Bad Request
    401 - Authorization Required
    403 - Forbidden
    404 - Not Found
    500 - Internal Server Error

    There are more, but this isn't the place or time to list them all, and creating error pages for some isn't recommended (200, for example, would create an infinite loop since it's a success code).

    So first create your own custom error pages and give them names.
    Put them all in one folder on your server, I'll use error/ in this example.
    Add this to your .htaccess file (or create a new one if you don't have one already):
    ErrorDocument 400 /error/400.html
    ErrorDocument 401 /error/401.html
    ErrorDocument 403 /error/403.html
    ErrorDocument 404 /error/404.html
    ErrorDocument 500 /error/500.html
    You get the idea. Just be careful to get the names of your files right and you've done everything.

    You can even specify HTML code in the htaccess file instead of linking to a file:
    ErrorDocument 404 "<body bgcolor='#FF0000'><b>Not found!</b> But if you wait long enough someone might start looking for it. <img src='/smiley.gif' /></body>"
    OK, you've now got custom error documents! You're way cool now! :D
  6. Blocking and allowing IPs

    4. Blocking and/or allowing IPs

    Remember that (ex-)girlfriend of yours you wrote that love song for? Well, she ditched your ass but that's not all. Now she started spamming your site's forum, guestbook, blog. She's all over the place and you just don't have the time to delete her comments.
    But you know she has a static IP. You're in luck!
    Add these lines to your .htaccess file:
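    deny from x.x.x.x
    Replace x.x.x.x with her IP address and she won't be able to access your site.
    You can also deny all users (even you); scripts running on the server will still be able to read the files in the folder:
    deny from all
    You can allow only certain users (put order deny,allow and deny from all in front of it so everyone else stays out):
    allow from x.x.x.x
    That's not enough?
    Well, you can ban or allow IP ranges. Let's say you want to ban all the users from 123.123.123.0 to 123.123.123.255.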
    Do this:
    deny from 123.123.123.
    And you're set.

    You can even allow or ban certain domains.
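    allow from www.ukwebmasterworld.com
    This would allow access to visitors whose own hostname is www.ukwebmasterworld.com. Note that it matches the visitor's host (looked up from the IP address), not the site whose link they clicked.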
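    On Apache 2.4 the allow/deny lines are kept only for backwards compatibility, and the same job is usually done with Require. A minimal sketch, assuming Apache 2.4 and using documentation-only example addresses (swap in the real ones):

    ```apache
    # Apache 2.4 style (mod_authz_core / mod_authz_host).
    # 203.0.113.7 and 192.0.2 are documentation-only example addresses.
    <RequireAll>
        # Let everybody in...
        Require all granted
        # ...except one address and one whole range
        # (a partial IP like 192.0.2 matches every address starting with it).
        Require not ip 203.0.113.7
        Require not ip 192.0.2
    </RequireAll>
    ```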

    Your site is now safe from the old hag! :D
  7. Blocking referrers

    5. Blocking referrers

    Blocking links to your site which come from one domain has numerous reasons and I'm not going to list them here. You have your own reasons and I respect them.

    This trick relies on mod_rewrite, so you can only do it if the mod_rewrite module is enabled on your server.

    This is what you write in your .htaccess:
    RewriteEngine on
    Options +FollowSymlinks
    RewriteCond %{HTTP_REFERER} siteyouareblocking\.com [NC]
    RewriteRule .* - [F]
    That will block visitors arriving from siteyouareblocking.com.
    To block multiple sites do this:
    RewriteEngine on
    Options +FollowSymlinks
    RewriteCond %{HTTP_REFERER} siteyouareblocking\.com [NC,OR]
    RewriteCond %{HTTP_REFERER} anothersite\.com [NC]
    RewriteRule .* - [F]
    The [NC] makes the domain match case-insensitive, and the [OR] means that matching either condition is enough.
    The [F] in the RewriteRule shows the 403 Forbidden error to those who come to your site via the blocked site.
  8. Alam

    Alam New Member Webmaster

  9. Alam

    Alam New Member Webmaster

    Thanks for your reply. I wanted to let you know that I have visited the thread you indicated and am happy with the result :)
    You have done a good job for us :D
  10. Blocking bots you don't like

    6. Blocking bots you don't like

    What is HTTP_USER_AGENT?
    I could go on the whole day about it but the main idea is:
    it's the identifier of the app/user/bot/service accessing your site.
    And your website can recognise it. OK, that's the first part.

    Now, we all know that there are some bots that do you more harm than good. Site rippers too; they eat your bandwidth. And you want to block them.
    The usual way to block bots is the robots.txt file. But some bots ignore it.
    Well, let's just say we don't like those bots.

    So, what do you do?
    .htaccess file is your friend (again)!

    Add this to your .htaccess file:
    RewriteEngine On 
    RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:craftbot@yahoo.com [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Custo [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^DISCo [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^GetRight [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^GetWeb! [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^HMView [OR] 
    RewriteCond %{HTTP_USER_AGENT} HTTrack [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [OR] 
    RewriteCond %{HTTP_USER_AGENT} Indy\ Library [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^JetCar [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^larbin [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^NetAnts [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^NetZIP [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^pavuk [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^RealDownload [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^WebGo\ IS [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^WebLeacher [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Website\ Quester [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^WebZIP [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Wget [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Widow [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR] 
    RewriteCond %{HTTP_USER_AGENT} ^Zeus 
    RewriteRule ^.* - [F,L]
    That's a list of some of the unwanted apps. You can expand it or arrange it the way you want.
    If the user agent starts with the text after the ^ (or, for the lines without a ^, contains the text anywhere), the visitor gets a 403 Forbidden error.
    Neat, isn't it?
    This can save you a lot of trouble!
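    If the long list gets hard to maintain, one option is to fold several names into a single condition with a regex alternation. A sketch with just four of the names from the list above:

    ```apache
    RewriteEngine On
    # One condition with a regex alternation instead of one line per bot.
    # [NC] ignores case; ^ anchors the match to the start of the user agent.
    RewriteCond %{HTTP_USER_AGENT} ^(BlackWidow|ChinaClaw|WebZIP|Wget) [NC]
    # Answer every matching request with 403 Forbidden.
    RewriteRule ^ - [F]
    ```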
  11. Folder listing customization

    7. Folder listing customization

    Every now and then you end up with a folder on your site that doesn't have an index page. So if someone navigates to it they get the folder listing. All files included.
    Most web servers prevent this from happening but some don't.
    Here's what you do:
    Add this line to your .htaccess file:
    IndexIgnore *
    The IndexIgnore setting defines which files will not be listed. The * is a wildcard, so no files are listed (the listing page itself still shows up, it's just empty).
    But sometimes you want to hide only zip and rar files.
    In that case you add this line to your .htaccess instead:
    IndexIgnore *.zip *.rar
    Now all the other files will be listed but zip and rar won't.
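    If you don't want any listing page at all, not even an empty one, another option is to switch the listing feature off. A minimal sketch, assuming your host lets you change Options in .htaccess:

    ```apache
    # Turn directory listings off for this folder and everything below it.
    # Visitors who request a folder without an index page get 403 Forbidden.
    Options -Indexes
    ```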

    But what if you really want a folder to be listed and your server settings don't allow it?
    Well, you can always add this to your .htaccess:
    Options +Indexes
    This will allow the folder to be listed, but watch out that you don't expose some private data this way.
    If you've set up your file with this last option you can customize the folder listing even more.
    Add two files in the listed folder. One is called README and the other one HEADER.
    Now the content of HEADER will be displayed before the folder listing, and README after the folder listing.
    So if you're giving download links for software in this manner (just an example) you can display additional information and tips to your users with these two files.
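    Which file names Apache looks for depends on the server's configuration, so if your HEADER and README files are being ignored you can try naming them explicitly. A sketch, where HEADER.html and README.html are assumed file names:

    ```apache
    Options +Indexes
    # Shown above the file list. HEADER.html is an assumed file name.
    HeaderName HEADER.html
    # Shown below the file list. README.html is an assumed file name.
    ReadmeName README.html
    ```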
  12. Add MIME types

    8. Add MIME types

    So you don't like clowns but you love mimes? Here's a tutorial for you!
    Just kidding. Wrong mimes.

    Anyway MIME is short for *drumroll* Multipurpose Internet Mail Extensions.
    So you're asking yourself what does mail have to do with htaccess?
    The answer is nothing. Well not in this tutorial anyway.
    But MIME isn't used for mail only. It plays an important role in HTTP. A MIME type is a standard label that tells the browser what kind of content a file holds, so it can decide whether to display it, hand it to a plugin or application, or offer it for download.

    Some servers don't have the MIME settings configured for some file types. We'll pretend that our fictional server doesn't work well with Flash.
    So instead of opening the swf file in your browser it offers you to download it.
    What you need to do is add this to your .htaccess file:
    AddType application/x-shockwave-flash swf
    A list of MIME types can be found at IANA (MIME Media Types).

    But there's another common problem. Sometimes you really want a file to be downloaded instead of being opened in the browser.
    How do you do that? Easy! Replace the code above with:
    AddType application/octet-stream swf
    So when someone clicks on the link to the swf file, he'll be offered to download it instead of the file playing in the browser.
    Also, use these settings with care. You don't want to make your users open JPEGs with their Calculator, OK?
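    Another way to force a download, without lying about the file type, is to send a Content-Disposition header. This is only a sketch and it assumes the mod_headers module is available on your server:

    ```apache
    <IfModule mod_headers.c>
        <FilesMatch "\.swf$">
            # Ask the browser to save the file rather than display it.
            Header set Content-Disposition "attachment"
        </FilesMatch>
    </IfModule>
    ```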
  13. Redirection

    9. Redirection

    Okay. We've come to one of the most common uses of .htaccess files and also one of the simplest.

    No matter what the reason for your redirect is, you always do it the same way. It could be that you moved some portions of your site to a different (sub)domain or you changed your directory structure. Whatever the reason is, instead of painfully changing dozens of links on your pages you can simply add the old addresses to your .htaccess file and redirect them.

    This is the code for redirection:
    Redirect /folder/file.html http://www.site.com/newfolder/newfile.html
    Use the path as it appears in the URL (starting with a slash) for the old file and the full URL for the new one and you're good to go.
    You can also redirect whole folders in the same way:
    Redirect /myoldfolder http://www.site.com/mynewfolder/
    Every request for the old location gets sent to the new one, so it's the fastest way to keep old links working if you changed your folder structure.
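    By default Redirect sends a temporary (302) redirect. If the move is for good, you can say so with a status code, which also tells search engines to update their links. A sketch reusing the example addresses from above:

    ```apache
    # 301 = moved permanently; without a status, Redirect sends a temporary 302.
    Redirect 301 /folder/file.html http://www.site.com/newfolder/newfile.html

    # Same idea for a whole folder: /myoldfolder/a.html ends up at /mynewfolder/a.html
    Redirect 301 /myoldfolder http://www.site.com/mynewfolder
    ```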
    Ready, steady, redirect!
  14. Hotlinking prevention

    10. Hotlinking prevention

    When someone is trying to copy the design of your site, you don't need to go haywire. It just means that your site is good and he likes the design.
    But when someone starts using your images, CSS files and even JavaScript, then it's time to say: "Bye bye!"
    News sites have similar problems. Users on forums and blogs tend to use the pictures from news sites when posting news on their sites. That's OK if they download them and reupload them to their own server or a free image host. But if they just use the path to your site they are actually stealing your bandwidth.

    You can stop this easily with the help of .htaccess files.
    Add this to your file:
    RewriteEngine on
    RewriteCond %{HTTP_REFERER} !^$
    RewriteCond %{HTTP_REFERER} !^https?://(www\.)?yoursite\.com/.*$ [NC]
    RewriteRule \.(gif|jpg|js|css)$ - [F]
    This setting will prevent hotlinking of gif, jpg, js and css files. You can expand the list in the last line of the code as necessary.
    Just don't forget to replace yoursite.com with your domain. :)

    But there's a nice bonus to this too. You can make the hotlinkers look like idiots.
    Use this in your .htaccess file:
    RewriteEngine on
    RewriteCond %{HTTP_REFERER} !^$
    RewriteCond %{HTTP_REFERER} !^https?://(www\.)?yoursite\.com/.*$ [NC]
    RewriteCond %{REQUEST_URI} !/stopstealing\.gif$
    RewriteRule \.(gif|jpg)$ http://www.yoursite.com/stopstealing.gif [R,L]
    Also create a default picture (in this example it's called stopstealing.gif). You can put anything you want in it, typically an angry message to the hotlinker. Upload the file to your server.
    Now when someone links to your gif and jpg files, stopstealing.gif will be displayed instead.
    Note that the path to your replacement picture is in the last line of the code so you can customize it any way you like. The REQUEST_URI line leaves the replacement picture itself alone; without it the redirect would loop forever, since stopstealing.gif is a gif too.
    And as usual you can expand the number of filetypes you want to block.
  15. temi

    temi Facilitator Webmaster

    I never knew you could do so much with .htaccess files, well done Melky, loooad of rep coming your way :)
  16. Enable SSI

    11. Enable SSI

    SSI are Server Side Includes. Sounds complicated? Well it is.
    And it's kinda dangerous. I'm not going to write much about SSI here. If you need it, you know what it is and how to use it.
    If you've never heard of them you can skip this tutorial.

    SSI is a way of serving dynamic content in HTML files without using PHP or CGI. It's good for adding small pieces of information to your HTML. But if you want to generate the whole page dynamically I suggest that you turn to PHP, Perl or whatever your preference is.
    Most servers have SSI disabled and there's a good reason for it. It's server intensive if not used properly.
    Before enabling this, contact your web host, because there's a good chance you're not allowed to.

    OK here's the code to enable SSI:
    AddType text/html .shtml
    AddHandler server-parsed .shtml
    Options +Includes
    This will get the server to parse all .shtml files for SSI. (The + adds Includes to whatever options are already set; leaving it out would replace them all.)
    You can replace .shtml with anything you like, but don't use .html because it will get the server to parse all your HTML files even if they don't have SSI. You might overload the server that way.
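    On Apache 2.x the server-parsed handler is kept mostly for backwards compatibility, and the output-filter form below is the one you'll usually see. A sketch, assuming Apache 2.x with mod_include loaded:

    ```apache
    # Allow SSI in this folder without touching the other options.
    Options +Includes
    # Serve .shtml files as HTML...
    AddType text/html .shtml
    # ...and run them through the SSI parser on the way out.
    AddOutputFilter INCLUDES .shtml
    ```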

    In case you still want to use .html files with SSI there's another way.
    Chmod (change the permissions of) all your html files with SSI (only those with SSI) so they have the +x flag (which marks them as executable).
    Now add this to your .htaccess:
    XBitHack on
    Now Apache will search for SSI only in html pages that have the +X flag.

    And if you all behave well maybe I'll write you a tutorial on SSI someday. :D
  17. Changing default directory files

    12. Changing default directory files

    You've probably seen that I used the DirectoryIndex directive in this tutorial.
    Well let me explain it in this final part of the htaccess tutorial.

    This directive is rather simple and sometimes very useful. Let's say that you are tired of using index.htm (or .php or .cgi or .whatever) as the default page in every folder on your site.

    Let's say you want to use coolmonkey.htm instead.

    Just use the aforementioned directive to change the default directory index page. Add this code to your .htaccess file:
    DirectoryIndex coolmonkey.htm
    Now when someone navigates to www.yoursite.com he'll be served coolmonkey.htm instead of index.htm, without the address in his browser changing.
    So you can make really creative sites now.
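    You can also list more than one file name, which is handy when not every folder has a coolmonkey.htm. A small sketch, where index.php and index.html are assumed fallback names:

    ```apache
    # Apache tries the names from left to right and serves the first one it finds.
    # index.php and index.html are assumed fallbacks for folders without coolmonkey.htm.
    DirectoryIndex coolmonkey.htm index.php index.html
    ```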
  18. Ladies and gentlemen,
    this concludes my little htaccess tutorial.
    If you've found it useful add some reputation and I'll be most obliged.

    Special note: During the making of this tutorial no .htaccess files were harmed.
  19. Alam

    Alam New Member Webmaster

    Really a gr8 job melkior
    repu added :)
