UserJS.org

What type of page is this?

Written on 2005-10-31 12:12 by tarquin. Last modified 2005-11-07 20:00

User JavaScripts run on a variety of different pages. Some are real pages, and some are generated by Opera. For example, if you display an image on its own, Opera generates a minimal page to hold it. Your script may not be designed to work with these pages, so in some cases you will need to know what sort of page it is currently running on.

Note: with most scripts, there is no need to do this, so please do not include this in your script unless you have a real need to do so.

There is no way for a script to access the content type information, and content can be delivered without any file extension, so there is no shortcut for working out what type of document you are looking at. The only way is to look through the DOM of the document and check whether it matches the format that Opera uses for that specific file type. These formats may change in subsequent versions of Opera, so if you use this technique, you will need to update your script when they change.

These are the formats used in Opera 8 and 9, when viewing the following types of files:

Images
<HTML><STYLE></STYLE><BODY><TABLE><TBODY><TR><TD><IMG src="http://www.absolute/path"/></TD></TR></TBODY></TABLE></BODY></HTML>
Files displayed using external plugins (such as flash)
<HTML><BODY><EMBED src="http://www.absolute/path"></BODY></HTML>
Text based files, including JavaScript and CSS files
<HTML><BODY><PRE>text</PRE></BODY></HTML>
JavaScript strings (the page generated if you use "javascript:'foo';" as a location), and HTML pages containing only text (no HTML tags)
<HTML><BODY>text</BODY></HTML>
Opera error pages
A complete HTML page, with the body ID set to 'opera-error'

So to detect these file types, it is a simple matter of checking whether the expected number of elements is present, and whether the right elements are where you expect them to be. A real page could use the same structure, but in that case you probably want to treat it the same way as Opera's generated pages anyway.

Note that this can only be checked after the page has completed loading, so add a load event listener, and run your code in that.

document.addEventListener( 'load', function () {
  var fileType;
  if( !document.documentElement ) {
    fileType = 'empty'; //empty document
  } else if( !document.body ) {
    if( document.documentElement.tagName == 'wml' ) {
      fileType = 'wml'; //WML document
    } else if( document.documentElement.tagName == 'svg' ) {
      fileType = 'svg'; //SVG image
    } else {
      fileType = 'xml'; //generic XML document
    }
  } else if( !document.getElementsByTagName('head')[0] ) {
    var allBodyParts = document.body.getElementsByTagName('*');
    var firstBodyChild = document.body.firstChild;
    if( !allBodyParts.length ) {
      fileType = 'just-text-in-html'; //just text in HTML document
    } else if( allBodyParts.length == 1 && firstBodyChild.tagName == 'PRE' ) {
      fileType = 'text-based'; //text document
    } else if( allBodyParts.length == 1 && firstBodyChild.tagName == 'EMBED' ) {
      fileType = 'plugin'; //plugin
    } else if( allBodyParts.length == 5 && firstBodyChild.tagName == 'TABLE' && allBodyParts[4].tagName == 'IMG' ) {
      fileType = 'image'; //image
    } else if( document.body.id == 'opera-error' ) {
      fileType = 'opera-error'; //generated error page
    } else {
      fileType = 'html'; //HTML document
    }
  } else {
    fileType = 'html'; //HTML document
  }
  if( fileType == 'html' ) {
    if( document.body.tagName.toLowerCase() == 'frameset' ) {
      fileType = 'html-frameset'; //HTML frameset document
    }
    if( document.body.tagName == 'body' || document.body.tagName == 'frameset' ) {
      fileType = 'x'+fileType; //XHTML document
    }
  }
},false);

That is obviously too much code for every script to have to run, since it detects all the normal types of page that your script might encounter. If your script needs to determine if it should run or not, then you should include only the parts that you need.

For example, if you need to make sure your script only runs on frameset pages, you could use:

if( document.body && document.body.tagName.toLowerCase() == 'frameset' )

Or to make it run only on generic XML pages (not WML or SVG), you could use:

if( !document.body && document.documentElement && document.documentElement.tagName != 'wml' && document.documentElement.tagName != 'svg' )

And if you need to run only on the auto-generated image pages:

if( document.body && !document.getElementsByTagName('head')[0] ) {
  var allBodyParts = document.body.getElementsByTagName('*');
  var firstBodyChild = document.body.firstChild;
  if( allBodyParts.length == 5 && firstBodyChild.tagName == 'TABLE' && allBodyParts[4].tagName == 'IMG' ) {
    //this is an image - run your code here
  }
}
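
The same approach can be trimmed down for the other generated pages. As a rough sketch, the two functions below check for the text-based format and for Opera's error pages. The function names and the doc parameter are made up for this example and are not part of Opera's User JavaScript support. The error page check assumes that the body ID alone is enough to identify the page.

```javascript
// Sketch only: function names and the doc parameter are illustrative.

// True if doc looks like the page Opera generates for text-based files:
// <HTML><BODY><PRE>text</PRE></BODY></HTML>
function isGeneratedTextPage(doc) {
  if (!doc.body || doc.getElementsByTagName('head')[0]) {
    return false; // no body at all, or a normal page with a head
  }
  var allBodyParts = doc.body.getElementsByTagName('*');
  var firstBodyChild = doc.body.firstChild;
  // exactly one element in the body, and it is the PRE holding the text
  return allBodyParts.length == 1 && !!firstBodyChild && firstBodyChild.tagName == 'PRE';
}

// True if doc looks like one of Opera's generated error pages
// (assumption: the body ID 'opera-error' is sufficient on its own)
function isOperaErrorPage(doc) {
  return !!doc.body && doc.body.id == 'opera-error';
}
```

In a script you would call these with document from inside the load event listener, for example if( isGeneratedTextPage(document) ) { ... }. Passing the document in as a parameter just makes the checks easier to try out on their own.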