VideoHelp Forum




  1. Video Restorer lordsmurf's Avatar
    Join Date
    Jun 2003
    Location
    dFAQ.us/lordsmurf
    I'm trying to figure out how domains are read by search engines.

    For example,
    www.ABC.com is my site
    www.XYZ.com is my other site

    But I only have one server.

    www.ABC.com/XYZ is the actual location of XYZ.com
    I used URL masking so site visitors don't know any better.
    When they type in XYZ.com, all they see is XYZ.com in the address bar

    So... will Google index XYZ.com by crawling XYZ.com, or by using the true raw www.ABC.com/XYZ folder? Or both?

    And if so, should there be multiple robots.txt files? Will that work?
    Does the Googlebot treat the forward as its own domain, and not as a subfolder (which is its true nature online)?


    Help.
    Want my help? Ask here! (not via PM!)
    FAQs: Best Blank Discs, Best TBCs, Best VCRs for capture, Restore VHS
  2. It'll probably crawl the site both ways.

    As for your question about treating it as a raw domain you'd probably have to ask Google that.
  3. Member
    Join Date
    Sep 2002
    Location
    Australia
    The first thing you must ensure is that your web server doesn't allow directory indexes. This hides the contents of your site directories by default; no one should know your physical server directory structure.

    Now the only way crawlers can work out your "site" structure is to trace the links on your site, starting from the site-root path.
    The crawlers can only start from WWW.ABC.COM and WWW.XYZ.COM. They have no way of knowing that the root of WWW.XYZ.COM is "WWW.ABC.COM/XYZ" unless you have a link (or reference to it) somewhere on your site that reveals this.
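
    To make the link-tracing idea concrete, here is a minimal sketch in Python. It is not how any real search engine is implemented; the PAGES dictionary is a made-up, in-memory stand-in for the site, and the lowercase host name and page names are assumptions for illustration only. It shows that a crawl starting at the ABC root never reaches /XYZ/ when no page links to it.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical site contents (URL -> HTML). Nothing links to /XYZ/.
PAGES = {
    "http://www.abc.com/": '<a href="/about.htm">About</a>',
    "http://www.abc.com/about.htm": '<a href="/">Home</a>',
    "http://www.abc.com/XYZ/": '<a href="index.htm">XYZ home</a>',
    "http://www.abc.com/XYZ/index.htm": "",
}


def discover(start_url, pages):
    """Breadth-first walk over links, returning every URL reached."""
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        parser = LinkCollector()
        parser.feed(pages.get(url, ""))
        for href in parser.links:
            target = urljoin(url, href)  # resolve relative links
            if target in pages and target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


found = discover("http://www.abc.com/", PAGES)
```

    In this sketch, found contains only the root page and about.htm. Links from other sites are outside what it models, so it says nothing about pages a crawler learns of by some other route.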

    By the way, what's the default site attached to your IP address? What will the crawlers see when they come in via IP?

    Be careful using "robots.txt": not all crawlers honour the file, and by listing directories in it you actually reveal the existence of the directories that you want to keep secret. You only need to list those directories that can be reached through your public links. For example, you would have many links (and references) to the directory "/images/", so if you don't want your images indexed, put it in the robots.txt. But if there is no reference anywhere to the directory "/XYZ/", don't expose it by putting it in the file.
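
    As a rough illustration of the "/images/" case, the sketch below uses Python's standard urllib.robotparser to check what a rule-following crawler would be allowed to fetch. The robots.txt text, the lowercase host name and the file names are assumptions made up for the example. A crawler that ignores robots.txt is not affected by any of this.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for the ABC site: only /images/ is excluded,
# and /XYZ/ is deliberately NOT mentioned so the file does not reveal it.
ROBOTS_TXT = """\
User-agent: *
Disallow: /images/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(url, agent="*"):
    """Return True if a well-behaved crawler may fetch the URL."""
    return parser.can_fetch(agent, url)


images_ok = allowed("http://www.abc.com/images/logo.gif")
index_ok = allowed("http://www.abc.com/index.htm")
```

    Here images_ok is False and index_ok is True. Anyone can read the robots.txt itself, which is why listing a directory there also advertises it.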
  4. Video Restorer lordsmurf's Avatar
    Join Date
    Jun 2003
    Location
    dFAQ.us/lordsmurf
    It's paid hosting. I don't control the server.

    I've never referenced or linked to ABC.COM/XYZ, but Google has indexed it that way many times. I always link it as XYZ.com (if I link at all!)
  5. Member
    Join Date
    Sep 2002
    Location
    Australia
    OK, hosted. All is not lost yet.

    How is it showing in Google?
    I'm assuming that if, for example, you have "index.htm" in your XYZ.COM site, then Google is showing it as "http://ABC.COM/XYZ/index.htm" instead of "http://XYZ.COM/index.htm". Or does it show both?

    If it is only the first option, then your host's domain setup must expose the path that way rather than having a virtual site pointing directly at the XYZ directory. If it's both, then index listings must be allowed. You may be able to control this if you can set directory permissions. For example, if it is Apache, you may be able to create a ".htaccess" file in your directories to disallow indexes.

    Also, if it's both, then your original idea of placing a robots.txt file in the ABC.COM root directory to exclude the "XYZ" directory may do the trick.
  6. Video Restorer lordsmurf's Avatar
    Join Date
    Jun 2003
    Location
    dFAQ.us/lordsmurf
    .htaccess is not allowed; these are Windows servers, which don't use .htaccess.

    If I block www.ABC.com/XYZ, would Google still see XYZ.com? That's my question. It would take at least 6 months to test on my own, since they give 90 days as the refresh time.

    Or would placing a robots.txt inside the XYZ folder be seen as the robots.txt for XYZ.com?

    Google sees ABC.com, XYZ.com and ABC.com/XYZ
    All 3 are found.

    I also know not everybody honors robots.txt as they should, but all the major ones do: Google, MSN, AltaVista, etc. Those are the ones I care about most.
  7. Member
    Join Date
    Sep 2002
    Location
    Australia
    Windows IIS requires a different config, so we won't go there.
    The way 'robots.txt' is supposed to work is that you put it in the root of your site ABC.COM and it controls everything within that site, so you can exclude the directory XYZ with it.
    Site XYZ.COM should be seen as a different site, and if Google comes in via that address then the 'robots.txt' in ABC.COM should have no effect. (It belongs to a different site.)

    You may also find this of use -> http://member.melbpc.org.au/~tgosbell/articles/google-exclusion/
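
    The "different site" point can be sketched in Python. A rule-following crawler asks for /robots.txt at the root of whatever host it is visiting, so each domain gets its own rules. The ROBOTS_BY_HOST contents, the lowercase host names and the function names below are hypothetical. Whether a request for XYZ.COM/robots.txt is actually answered from the XYZ folder depends on how the host implements the masking, which this sketch does not model.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser


def robots_url(page_url):
    """Return the robots.txt URL a crawler would consult for page_url."""
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical robots.txt contents, keyed by the URL they are served from.
ROBOTS_BY_HOST = {
    "http://www.abc.com/robots.txt": ["User-agent: *", "Disallow: /XYZ/"],
    "http://www.xyz.com/robots.txt": ["User-agent: *", "Disallow:"],
}


def may_crawl(page_url, agent="*"):
    """Apply only the rules belonging to the host named in page_url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_BY_HOST.get(robots_url(page_url), []))
    return parser.can_fetch(agent, page_url)


via_abc = may_crawl("http://www.abc.com/XYZ/index.htm")
via_xyz = may_crawl("http://www.xyz.com/index.htm")
```

    With these made-up rules, via_abc is False and via_xyz is True: blocking the /XYZ/ folder on ABC.COM does not, by itself, block the same pages when they are requested as XYZ.COM.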
  8. Member thecoalman's Avatar
    Join Date
    Feb 2004
    Location
    Pennsylvania
    Be careful, LS, you could put the kibosh on your main site as far as Google is concerned. I'm not sure of the specifics, but linking sites on the same server is bad news..... http://www.google.com/search?sourceid=navclient&ie=UTF-8&oe=UTF-8&q=google+%22same+server%22+penalty

    Edit: Not good...... If I remember correctly, your site had a PR5, which is good; you now have a 2, which is not good. Most main pages are assigned a 3 right from the start. www.webmasterworld.com is a good place to find out info on PR.....



    Edit 2: Your other link shows a PR4.... you're putting the kibosh on your main page, as I stated above.

  9. Video Restorer lordsmurf's Avatar
    Join Date
    Jun 2003
    Location
    dFAQ.us/lordsmurf
    Interesting. There is some space donation I do off the server, and it needs to stay hidden. I also route several domain names to the various donated folders. My site has several domains hitting folders too. Most people know www.nomorecoasters.com, which is just a forward to one of the most popular pages on the site.

    I've pretty much come to the conclusion that google sucks ass as much as ebay does. They make up rules as they go along, and could care less who they screw in the process.

    While my page rank would be nice, I'm more worried about the secondary domains that need to stay unlinked and unindexed/uncrawled. Archive.org is the biggest pain in the ass, requires robots.txt to block it.

    I've got to work on lots of these things this month.
  10. Member thecoalman's Avatar
    Join Date
    Feb 2004
    Location
    Pennsylvania
    Originally Posted by lordsmurf

    I've pretty much come to the conclusion that google sucks ass as much as ebay does. They make up rules as they go along, and could care less who they screw in the process.
    I think PR is meaning less and less, but it's good to have it. Take a look around www.webmasterworld.com. There's tons of information there, and Google has its own forum. There's even a Google tech that stops in once in a while. The reason those forwards can affect your page is because of the anti-spam thing.... If done right it should be fine. It does suck that you have to follow their rules, but I guess there's not much you can do about it since Google is the head cheese. Even having a bunch of domains on the same server pointing to one site can cause you trouble, whether it's legit or not...

    I'm even afraid to touch my coal site cause it's sitting right where I want it.
  11. Video Restorer lordsmurf's Avatar
    Join Date
    Jun 2003
    Location
    dFAQ.us/lordsmurf
    Yeah, like I said, my google rank is small fries compared to everything else.


