I'm looking at a Linux host where I can run multiple domains from a single account. Apparently, it is set up like this:

www.site1.com is the main site
www.site2.com is another site.
www.site1.com/site2.com is the physical location of site2.com

My concern is mostly about Google and other search engine bots. I do not want my site to be flagged as duplicate content, which would hurt my rankings.

The DNS on the host will automatically route people to site2.com if they type in site2.com, but if somebody (or a bot) finds the "true" physical address under site1.com, the same content can be viewed there too.

Is there any way to use .htaccess to prevent anybody from accessing the site1.com/site2.com folder directly, and instead redirect them to the site2.com domain, so that the folder appears not to exist, at least to bots and browsers?
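
To make the question concrete, here is a rough, untested sketch of the kind of rule I have in mind. It assumes the host runs Apache with mod_rewrite enabled, that this .htaccess sits in the document root of site1.com, and that the add-on folder is literally named site2.com; none of that is confirmed for my host.

```apache
# Assumed location: the .htaccess in the document root of site1.com
RewriteEngine On

# Only act on requests that arrived via the main domain, so that
# normal visits to site2.com are not redirected in a loop
RewriteCond %{HTTP_HOST} ^(www\.)?site1\.com$ [NC]

# Permanently redirect /site2.com and anything under it to the
# same path on the site2.com domain
RewriteRule ^site2\.com(?:/(.*))?$ http://www.site2.com/$1 [R=301,L]
```

My understanding is that a 301 (permanent) redirect tells search engines that the site2.com URL is the real one, but I am not sure whether an .htaccess inside the site2.com folder with its own rewrite rules would stop this parent rule from running.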