disallow indexes,
set custom error pages,
force non-www, https,
strip trailing slash,
redirect index.html or .php to root,
pretty URLs (hides file extensions & queries),
404 any URL with Additional Path Info,
strip anything after ".php/" (currently disabled)
I was having trouble with nonsense URLs (URLs with Additional Path Info) invoking broken pages & 500 errors that were being indexed as duplicates by search engines.
Sample: example.com/index.php/somefolder/another/file/query...
I set up rules to redirect such URLs to a default page rather than rendering broken pages or 500 errors. I then modified that same block to send the nonsense URLs to 404 instead, so they would not be indexed.
MrWhite has educated me on the proper use of AcceptPathInfo Off, and I've been reading up on the Apache AcceptPathInfo directive, but unfortunately the directive had no effect.
I added AcceptPathInfo Off near the top of the array and commented out the "strip anything after .php/" block, which had been sending URLs with Additional Path Info to a working default page. With only the directive in place, those URLs did not go 404. They rendered broken pages, just as they had before I added that block. The pages break because they try to load CSS from locations that begin with the full, bad URL, including all of the Additional Path Info, such as example.com/index.php/somedirectory/anotherdirectory/more/css/reset.css, when the CSS files are actually located at example.com/css/reset.css.
I tried disabling some of the other rules, with no luck, and submitted a support request with the host to see if the feature is disabled at the httpd.conf level.
The final solution required an additional .htaccess file inside the directory containing my custom error pages, but it worked like a charm.
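As a rough sketch of what an extra .htaccess in the error directory could look like (this is an assumption, not necessarily the exact file used here): when Apache serves the custom 404 page internally, %{THE_REQUEST} still holds the original bad request line, so a rule in the root file that tests THE_REQUEST can match a second time and turn the error page itself into a 404. Switching mod_rewrite off for that one directory avoids this. The /error/ path is taken from the ErrorDocument lines in the rules; everything else is illustrative.

```apache
# /error/.htaccess  (hypothetical contents; assumes the error pages live in /error/)
# Turn mod_rewrite off for this directory so the rewrite rules from the
# root .htaccess are not applied to the custom error pages.
RewriteEngine Off
```

With rewriting off in that directory, the pretty-URL rules do not apply there either, so the error pages have to be referenced with their full .php file names, as the ErrorDocument lines already do.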
This is the full array of rules currently in place. Everything is now working perfectly.
AcceptPathInfo Off
Options -Indexes
Options +FollowSymLinks
RewriteEngine On
RewriteBase /
## 404 ANY URL WITH ADDITIONAL PATH INFO ##
RewriteCond %{THE_REQUEST} /([^.]+)\.php/? [NC]
RewriteRule ^ /%1 [NC,R=404,L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^/]+)/?$ /$1.php [L,NC]
## SET CUSTOM ERROR PAGES ##
ErrorDocument 400 /error/error_400.php
ErrorDocument 401 /error/error_401.php
ErrorDocument 403 /error/error_403.php
ErrorDocument 404 /error/error_404.php
ErrorDocument 500 /error/error_500.php
## FORCE HTTPS & NON-WWW ##
## RewriteEngine On
RewriteCond %{SERVER_PORT} 80
RewriteRule ^(.*)$ https://dev.example.com/$1 [R=301,L,NE]
## STRIP TRAILING SLASH ##
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)/$ /$1 [R=301,L]
## REDIRECT INDEX TO ROOT ##
RewriteRule ^index\.php$ / [R=301,L]
RewriteRule ^index\.htm$ / [R=301,L]
## PRETTY URLS FOR SPECIFIC, DYNAMIC FILES ##
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^items/([a-zA-Z0-9_-]+)$ item.php?item=$1 [L]
RewriteRule ^items/([a-zA-Z0-9_-]+)/$ item.php?item=$1 [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^free_items/([a-zA-Z0-9_-]+)$ free_item.php?item=$1 [L]
RewriteRule ^free_items/([a-zA-Z0-9_-]+)/$ free_item.php?item=$1 [L]
## PRETTY URL FOR ANY STATIC FILE ##
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([a-zA-Z0-9_-]+)$ $1.php [L,QSA]
## STRIP ANYTHING AFTER .php/ ##
## CREATES CHAIN OF 3 REDIRECTS 302-301-301 NOT GREAT ##
## RewriteCond %{THE_REQUEST} /([^.]+)\.php/? [NC]
## RewriteRule ^ /%1/ [NC,R,L]
## RewriteCond %{REQUEST_FILENAME} !-f
## RewriteCond %{REQUEST_FILENAME} !-d
## RewriteRule ^([^/]+)/?$ /$1.php [L,NC]
With the older STRIP ANYTHING AFTER .php/ block (now disabled), bad URLs just redirected to default pages. The custom error pages were accessible if I hit their URLs directly, but it was virtually impossible to invoke a 404 no matter what nonsense URL was entered.
The last active block (PRETTY URL FOR ANY STATIC FILE) is also added to a secondary .htaccess, which is placed in any subdirectories.
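A minimal sketch of such a secondary .htaccess, assuming a subdirectory named /somedirectory/ (the name is hypothetical) and that the block is copied unchanged from the root file. The RewriteBase line is an addition for illustration, so the relative substitution resolves inside the subdirectory:

```apache
# /somedirectory/.htaccess  (hypothetical subdirectory name)
RewriteEngine On
# Assumption: set the base so "$1.php" resolves to /somedirectory/$1.php
RewriteBase /somedirectory/

## PRETTY URL FOR ANY STATIC FILE ##
# If the request is neither an existing directory nor an existing file,
# internally map /somedirectory/name to /somedirectory/name.php.
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([a-zA-Z0-9_-]+)$ $1.php [L,QSA]
```

Keep in mind that, by default, a subdirectory .htaccess containing its own mod_rewrite directives replaces the rewrite rules of the parent directory rather than adding to them, so any root rule that should also apply in the subdirectory has to be repeated there (or inherited with RewriteOptions Inherit).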
The block labeled "404 ANY URL WITH ADDITIONAL PATH INFO" would be unnecessary if the "AcceptPathInfo Off" directive worked. It is apparently possible to enable or disable that feature in the httpd.conf file on the server. Editing httpd.conf requires command-line (SSH) access and can be dangerous. Consult your hosting company or server admin if that sounds scary.
Earlier, I had tried modifying the strip block slightly to send such nonsense URLs to 404. It worked, but due to some interaction with other rules the custom 404 page itself came up 404. When I hard-coded the full URL of the error page (https://example.com/error/error_404.php), I got an endless 302 loop. That earlier attempt:
## STRIP ANYTHING AFTER .php/ SEND TO 404 ##
## SENDS BAD URLS TO 404 BUT CUSTOM 404 PAGE IS ALSO 404 ##
## RewriteCond %{THE_REQUEST} /([^.]+)\.php/? [NC]
## RewriteRule ^ /%1/ [NC,R=404,L]
## RewriteCond %{REQUEST_FILENAME} !-f
## RewriteCond %{REQUEST_FILENAME} !-d
## RewriteRule ^([^/]+)/?$ /$1.php [L,NC]
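For the server-level route to AcceptPathInfo mentioned above, a hedged sketch of what the relevant httpd.conf (or virtual host) section might look like. The directory path is hypothetical, and whether the directive is actually honored may depend on how PHP is hooked into Apache on a given host, so treat this as a starting point to discuss with the host rather than a guaranteed fix:

```apache
# httpd.conf or a vhost file  (hypothetical document root path)
<Directory "/var/www/example.com/public_html">
    # Allow .htaccess files to use Options, ErrorDocument,
    # mod_rewrite directives and AcceptPathInfo
    AllowOverride FileInfo Options
    # Reject requests that carry trailing path info after an existing file
    AcceptPathInfo Off
</Directory>
```

Apache has to be reloaded after a change like this, which is another reason to leave it to the host or server admin on shared hosting.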
Still open to any feedback on cleaning this up.