This is a website (below) which still has the old GameShark/CodeBreaker codes created by the Game Shark Code Creators Club many years back for the PlayStation system.
I was using some website downloaders to mirror these codes for offline use (and for preservation, should the current site go offline like each previous incarnation has).
The main part is here:
http://cheats.codetwink.com/psx
After that it has the alphabet lineup so you can find whatever game under its letter section.
This is where my question comes in. The mirroring goes fine, BUT there is a problem with any letter that has more than one page. Let's use "C" for example; that page link is as follows:
http://cheats.codetwink.com/psx/letter/C
When you scroll to the bottom of the page there is a button called "next" that brings you to the second page of games starting with C. However, I cannot get any sort of URL for that second page; the address still shows as http://cheats.codetwink.com/psx/letter/C
So what happens is that the website mirror programs (I tried a few) do not download these second, third, etc. pages; they only get the first page of a given letter. Even archive.org has it mirrored this way, with only the first page of each letter listing.
Even more curious: if you put the main cheat page link into something older like WebZIP or Website Downloader for Windows, it redirects you to the site forum and won't even let you view the cheat pages.
So this has me very curious as to what sort of setup the site is using to do things like this.
-
One way to achieve this is to use PHP to store a variable, pageno.
On the first page, pageno = 0. Clicking next increments the count and sends a new pageno = 1 (you can see this by looking at the page source), which gets read when the page loads.
You cannot see the underlying PHP, which has been converted to plain HTML by the time the page reaches you.
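A bare-bones sketch of that idea (the field, variable and path names are made up; this is not the site's actual code) would be something like:

<?php
// The page number arrives as a hidden POST field, so the URL never changes.
$pageno  = isset($_POST['pageno']) ? (int) $_POST['pageno'] : 0;
$perPage = 50;                // games shown per page
$games   = range(1, 180);     // stand-in for the real game list
$slice   = array_slice($games, $pageno * $perPage, $perPage);
foreach ($slice as $game) {
    echo "Game entry $game<br>\n";
}
?>
<form method="post" action="/psx/letter/C">
    <!-- the hidden field carries the page number; no "page 2" URL ever exists -->
    <input type="hidden" name="pageno" value="<?php echo $pageno + 1; ?>">
    <input type="submit" value="next">
</form>

Because the button POSTs back to the same address, every page of C is served under http://cheats.codetwink.com/psx/letter/C, which is why the rippers never see a second URL to follow.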
Another method would be to store the number of pages per letter in a database. Again, PHP with a database back end (possibly MySQL) reads the data as the page loads.
And since the pages are dynamic - one page only, not multiple pages - a downloader cannot see the other page(s). -
Clever and interesting. I have never come across such a thing on a website before, so it had me very curious. On my end I can do a workaround. For example, I look at which HTML pages the ripper does get for a given letter and the order they go in; when I see a large break in that order (say it jumps from 35.html straight to 62.html), that usually tells me the gap would be page 2 or 3 of that letter. So I just add those extra URLs (36 to 61) and I can still at least get them that way. And for those "menu" pages, if I go to each one and do a "save as", it will snag the letter menu pages that way. In the end it will be a long job relinking these extras, but it can be done.
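Spelled out as a rough sketch (the path prefix is only a guess based on the 35.html / 62.html example; adjust it to match the real pages), that gap-filling step boils down to printing the missing URLs for the ripper's list:

<?php
// Illustrative only: list the URLs in the numbering gap so they can be
// pasted into the ripper's URL list. The prefix below is an assumption.
$base = 'http://cheats.codetwink.com/psx/';
for ($i = 36; $i <= 61; $i++) {
    echo $base . $i . ".html\n";
}
?>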
Thanks to you I now know two reasons why it trips up the webrippers. Thanks again -
Nothing very clever.
I am no web programmer, but I run a small website and incorporate something like this so I only have to encode one page, whereas the main content runs to over 180 pages.
That being said, the site has historical interest and has been successfully archived by The National Library of Wales, and their bot does successfully expand it into 180+ pages. I suspect your web ripper could also handle it (not that you would want to).
Since the content on that site is inside a form, I believe it also depends on how the programmer sets up the 'Next' button. In your case I noticed that 'hidden' attributes are used, so the pageno is sent with the form but, again, is invisible to the end user. -
Well, it was still "clever" enough to trip up every webripper I tried on the site I was after. I know very basic stuff with HTML (simple build-a-page or link-this-with-that) but that is about it. I also made it a bit simpler on my end: I forgot I had the "Link Gopher" plugin for Firefox, so I just went to the few letters that had more than one page, Link Gopher grabbed the needed HTML links, and I threw those into WinHTTrack. The "save as" did work fine for the extra menu pages, and the fun began last night linking them up to have a functioning offline version.
Hmm, would you have any idea why the site would redirect to the forum section in an older browser (some of the webrippers, which have not been updated since '08, have built-in browsers)? Would it be a sort of command in place along the lines of "if the browser does not have these specs, redirect here"?
Either way, thanks again for the help and info.
-
Not unusual for webpages to test for the browser version being used. Older versions of IE especially have all sorts of compatibility issues.
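As a purely hypothetical sketch (nothing here confirms the site actually does this), a server-side check along these lines could bounce an old browser somewhere else before the cheat pages are ever served:

<?php
// Hypothetical example only: redirect anything that looks like an old or
// unknown browser. The redirect target is a placeholder, not the real URL.
$agent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
if ($agent === '' || preg_match('/MSIE [1-6]\./', $agent)) {
    header('Location: /forum/');
    exit;
}
?>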
Alternatively, you can simply add some code in the HTML header, thus:
<!--[if lte IE 7]>
(code goes here)
<![endif]--> -
And that would then explain why I could not even use WebZIP or Website Downloader for Windows with this site.
-
I don't think so, since I do not see such code on these pages. That was just a simple example, one I do use for compatibility purposes on my own menu system.
It may be more subtle than that. This is only a thought, since I do not incorporate ads etc. on my site, but this site, like many, has redirects to Google to pull up ads. So it could well be Google providing that redirect, if their code is not compatible with the old browsers found in webrippers.
Shame that no one with more web programming know-how has joined in. -
Bit of a follow-up: though the website I mentioned does preserve the info I was after to a degree, the original site had many user notes about side effects and other things some of these codes did, and none of the newer site incarnations have that info anymore.
Something obvious I should have tried first was to take the original site URL and plug it into the Wayback Machine over at archive.org. The issue with this is that, because of the way the machine is laid out, it is rather well known that it too trips up website downloaders (which is the reason I did not try that option to begin with). I noted that the last time the site appeared the way I wanted was back in 2002. The web download actually worked using the older WebZIP! Not only did it get the info I wanted, it also now has all of the original user notes to go with it. Why it worked and did not get tripped up I am no longer questioning; it worked and that is all that matters.