At present, Web browsers are the dominant technology used to satisfy the information gathering and visualization needs of life scientists. In their current form, browsers provide users with the ability to retrieve information from widely distributed sources, but essentially no means to integrate information from multiple sources and only a very constrained set of operations for manipulating the display of that information. Given the distributed nature of information on the Web and the diversity of user requirements in interacting with that information, this situation is unsatisfactory.
In most current implementations, Web browsers facilitate information transfer between only two parties – the resource provider, who determines all information presented, all links to external resources, and nearly all manner of visualizing that information; and the consumer, who essentially can only control which page they choose to view next. The typical Web browsing experience can thus be characterized as resource-centric because everything that the user sees on a Web page is governed entirely by the resource provider.
By introducing an additional layer of processing that occurs only at the discretion of the user (by choosing whether or not to install a given script), user-scripts offer a way to effect a transition towards a user-centric browsing experience. Though it has always been possible for the technically skilled to engineer their own software for processing Web content (e.g. the notorious 'screen-scraping' characteristic of early bioinformatics [18]), the arrival of popular browser extensions such as GreaseMonkey marks the beginning of a fundamental change in the way end-users can interact with the Web. Empowered with the ability to easily embed scripts directly into their browser and to find such scripts in public repositories, Web users can now more actively make decisions about what Web content they see and how that content is presented.
Despite its intriguing, paradigm-shifting nature, the user-script concept is not without its problems. Because Web content is still primarily provided as HTML, user-scripts must process HTML in order to function. This is problematic for two reasons: 1) HTML is not designed for knowledge or data representation and hence is difficult to parse consistently, and 2) HTML representations may change frequently even when the underlying data does not, because the structure of a page is routinely altered to keep its browsable interface up to date. The former makes it challenging to write effective user-scripts, particularly scripts that are intended to operate over multiple Web pages. The latter makes these scripts brittle in the face of superficial changes to their inputs and thus potentially unreliable [18]. To alleviate these problems, it would clearly be beneficial if the underlying data could be exposed in a manner that was independent of its HTML representation.
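The brittleness described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two HTML fragments, the JSON record and the function names are hypothetical and are not taken from any real resource or from the iHOPerator itself. The sketch is written in Python rather than as a user-script simply to keep it self-contained. It shows an extraction rule tied to one page layout failing after a purely cosmetic redesign, while a reader of a presentation-independent representation is unaffected.

```python
import json
import re

# Two hypothetical renderings of the same record; only the markup differs.
PAGE_V1 = ('<table><tr><td class="gene">TP53</td>'
           '<td class="organism">Homo sapiens</td></tr></table>')
PAGE_V2 = ('<div class="record"><span class="symbol">TP53</span> '
           '<em>Homo sapiens</em></div>')

# A hypothetical machine-readable representation of the same record,
# assumed to be served alongside either version of the page.
DATA = '{"gene": "TP53", "organism": "Homo sapiens"}'


def scrape_gene(html):
    """Extract the gene symbol using a pattern tied to the PAGE_V1 layout."""
    match = re.search(r'<td class="gene">([^<]+)</td>', html)
    return match.group(1) if match else None


def read_gene(data):
    """Read the gene symbol from the presentation-independent record."""
    return json.loads(data)["gene"]


if __name__ == "__main__":
    print(scrape_gene(PAGE_V1))  # layout the pattern was written for
    print(scrape_gene(PAGE_V2))  # same data, redesigned page
    print(read_gene(DATA))       # unaffected by either layout
```

In this sketch the underlying data never changes, yet the scraper returns nothing for the second layout; only the structured record gives a stable result.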
The potential value of separating content from presentation provides motivation for the Semantic Web [19] initiative and the standards for the annotation of Web resources, such as the Resource Description Framework (RDF) [20] and the Web Ontology Language (OWL) [21], that have recently emerged from it. With these standards in place, content providers are encouraged to provide a representation of their data for visualization (HTML) in parallel with an additional representation of their data for machine-interpretation (RDF/OWL). This would enable those who wish to utilize the content in novel ways to process the more stable, machine-readable representations while remaining unaffected by visual modifications to the associated websites. Widespread adoption of Semantic Web standards by the community may, in principle, enable the creation of powerful, user-centred applications that go beyond the capabilities of user-script enabled browsers [22]. However, this process is occurring very slowly [23], and the problems faced by life scientists in gathering, integrating and interpreting information on the Web are pressing. In their current form, user-scripts, such as the iHOPerator, provide an immediate means to address these needs and thus should be more widely exploited to this end.