Visiting the Mundaneum

05 Wednesday Aug 2015

Posted by Dominic in Information History

Tags

Analogue Internet, data visualisation, Henri La Fontaine, information history, information organisation, information overload, Luciano Floridi, Mondothèque, Mundaneum, museums, Paul Otlet, Röyksopp, travel

Last weekend, I afforded myself some relief from the arduous processes of dissertation-writing and job-hunting with a short trip into deepest Wallonia. The pleasant Belgian town of Mons, a European Capital of Culture this year, is host to a number of museums, galleries and historic buildings, including the Mundaneum—the remains of Paul Otlet’s utopian project to create a world city underpinned by free and direct access to, and dissemination of, information, now presented in a museum that showcases his life’s work. Although this visit was not strictly necessary to support my dissertation, as the resources that I require are all available either online or through the British Library, it was nevertheless fascinating to see original copies of many of Otlet’s explanatory posters and graphics, and of course numerous sections of the Répertoire bibliographique universel—Otlet’s enormous card-catalogue index of bibliographic references. I also had some productive discussions on a number of subjects with three like-minded individuals who were also visiting the museum.

The Mundaneum’s central atrium; the universality of the project is indicated by the prominent globe.

The Mundaneum’s exhibits are spread across three floors, illuminated just to the level of hushed reverence. The displays consist of sections of the catalogue and selections of Otlet’s drawings on various subjects (which are also available online through the Google Cultural Institute; Google operates one of its data centres nearby), not just limited to library and information science, but also including works on network theory, the nature of international associations, pacifism, and utopian visions of his never-realised World City.

A large section of the Répertoire bibliographique universel.

Perhaps surprisingly, each individual drawer can be opened to reveal its original contents.

Sections of the RBU are juxtaposed with examples of Otlet’s graphical output.

Of particular interest to my research into predictions of future information technology was a full-scale realisation of Otlet’s Mondothèque, an analogue anticipation of the digital desktop computer.

Otlet’s original drawing is framed above the modern construction.

Otlet’s work is also placed in its historical context by the display of previous attempts to organise human knowledge—some of which I referred to in this previous blog post on the history of encyclopaedias—in addition to a timeline that charts advances in statistics and abstract forms of data visualisation, as people have sought to record knowledge in ever-more accurate, intuitive ways.

The museum’s exhibits, furthermore, extend beyond Otlet’s lifetime, and include a plethora of examples of subsequent data visualisation enabled by the development of computer network technology in the form of the Internet—arguably the modern realisation of Otlet’s dreams. These later displays include visualisations on serious subjects such as global inequality and climate change, but also art inspired by data visualisation. I was particularly taken by this French-produced music video for the song “Remind Me” (2002) by the Norwegian group Röyksopp, which breaks down the events of a single person’s normal working day into its constituent quantitative data through an unrelenting procession of infographics:

Another information-inspired artwork demonstrates the exponential growth in digital information that has been caused by the development and increasing global ubiquity of the Internet. The (practically invisible) black grain in the centre of the white square of this exhibit represents the total amount of digital information produced by humanity from its beginnings to the year 2003. The white square itself extends that time period to 2014, and the larger black square is a prediction of the continued rapid expansion of digital information production that will continue up to 2020:

Can you see the central black grain?

This exhibit reminded me greatly of scale models of the Solar System, and in my experience the vast emptiness of space is as difficult to comprehend as the current information explosion, which philosopher of information Luciano Floridi has characterised as “The Fourth Revolution”, comparable to the advances in human understanding achieved by Copernicus, Darwin and Freud.

All in all, the Mundaneum was an extremely interesting place to visit, and I would thoroughly recommend a trip to Mons to anyone interested in the subject.

Paul Otlet lives on, almost literally, through his work and writings.

…as does his close friend and long-term collaborator, Henri La Fontaine.

Reductio ad Wikipedia?

28 Saturday Mar 2015

Posted by Dominic in Information Society

Tags

Alex Wright, Denis Diderot, discovery tools, Encyclopaedia Britannica, encyclopaedias, Encyclopédie, Francis Bacon, Google, Henri La Fontaine, information behaviour, information organisation, information overload, information retrieval, information society, J.C.R. Licklider, Jean-Baptiste le Rond d'Alembert, John Harris, Katharine Schopflin, LAPIS, Lexicon Technicum, library OPACs, Paul Otlet, reference librarianship, Summa Theologica, Thomas Aquinas, Wikipedia

Our most recent LAPIS session yesterday featured a guest lecture by Katharine Schopflin, a corporate information professional who has also conducted research into the roles of the encyclopaedia and reference librarianship. An encyclopaedia—whose principal defining characteristics were identified in her research as accuracy, a lack of bias, being up-to-date, authoritativeness, providing subject coverage in sufficient and appropriate depth, and being written succinctly—is essentially derived from the same principles of information organisation—such as controlled vocabularies and classification schemes—that we have encountered in other modules, and is historically rooted within the positivist school of philosophy.

The modern concept of the general reference encyclopaedia, with comprehensive A-to-Z subject coverage written by experts, with cross-references, indexes and other information-seeking tools, developed over hundreds of years, through “proto-encyclopaedias” such as Thomas Aquinas’s Summa Theologica and the later, alphabetical, Lexicon Technicum by John Harris. Many of these early encyclopaedias were compiled in the rationalist belief, typical of the Age of Enlightenment, that they represented an order of things (and therefore of knowledge about them) inherent in the universe; none more so than the famous French Encyclopédie, edited by Denis Diderot and Jean-Baptiste le Rond d’Alembert and first published in 1751. Not only did this establish the standard practice of including material contributed by a group of named experts in particular fields, but it was also explicitly based upon a “Figurative System of Human Knowledge” that was published in its first volume.

The tripartite classification of knowledge that underpinned the Encyclopédie. Like many other information organisation tools, it is based upon the earlier work of the polymath Francis Bacon.

The desire to create a universal, systematic body of knowledge in this way (which was also driven by the exponential increase of information being produced by human activity, and growing awareness of the problem of “information overload” that this could cause) reached its apogee in the early twentieth century through the work of the Belgian bibliographer and proto-information scientist, Paul Otlet, and his collaborator, Henri La Fontaine. Amongst their multitude of projects for international co-operation and standardisation, particularly in the bibliographic and information fields, was the Universal Bibliographic Repertory: an enormous collection of catalogue cards intended to function, amongst other things, in much the same way as a colossal encyclopaedia. This form of information organisation had by now superseded the individual book, or volumes of a printed encyclopaedia, as Otlet recognised the need for the information contained within a book to be separated from the physical form of the book itself, as demonstrated in his own conception of human knowledge:

Paul Otlet’s conceptual model of how human knowledge is recorded. The universal catalogue transcends the limitations of individual books and other physical “carriers” of information.

This need for separation was later repeated by J.C.R. Licklider, who, when authoring a report in the mid-1960s into the feasibility of a networked knowledge environment (i.e. the technology that developed into the current Internet), encountered the same problems:

[Licklider] wrote about ‘the difficulty of separating the information in books from the pages’—a problem that, he argued, constituted one of ‘the most serious shortcomings of our present system for interacting with the body of recorded knowledge’. What he would do was create a system of cataloguing information from a wide range of sources, extracting and indexing that information—and distributing it over a network.

(Quoted in Alex Wright’s book, Cataloging the World: Paul Otlet and the birth of the Information Age, p.250.)

Fifty or so years later, the technology to create a networked, global encyclopaedia that can transcend the limits of its printed cousins has already been in existence for over two decades (although the fully-indexed Semantic Web remains a distant dream for now), and the transition of the traditional general reference encyclopaedias into digital forms—first on CD-ROM and now on the Internet with subscription access—has been the key development in the genre’s publishing industry. However, this is not the most significant change in the encyclopaedia world, as the rise of Wikipedia—an online encyclopaedia that can be accessed and edited freely by anyone—has demonstrated since its foundation in 2001. At the time of writing, it comprises 4,752,420 articles, and has been studied in some detail, with its organisation, social environment and potential as a model for innovation and collaborative work all being investigated, in addition to some famous studies into its reliability versus the established subscription-based online encyclopaedias such as Britannica.

The advantages of Wikipedia—its freeness, inclusiveness, collaborative nature and ability to cover a wide range of topics that would not feature in a “conventional” encyclopaedia to a high standard—have been readily apparent, and tend to outweigh its flaws, such as an unevenness of coverage in certain disciplines, and well-established practices exist to minimise the risks of malicious vandalism associated with a freely-editable encyclopaedia. Its success can be seen through the wide adoption of the wiki (the term derives from a Hawaiian word for “quick”) web application across the Internet, from relatively well-known websites such as WikiHow, to the many hundreds, if not thousands, of wikis on obscure and niche topics (such as the first PlayStation game I owned back in 1997!) hosted by Wikia. There is even a WikiIndex which acts as a directory for all other wikis! In addition, I imagine that many of the readers of this blog have also referred to, or even edited, wikis owned by their employers and used as part of their knowledge management programmes.

The rise of Wikipedia, in conjunction with Google as the exemplar of search engines and the spread of information technology in general, has also revolutionised librarianship in practice and caused much debate in the wider LIS sector. As users’ information behaviour increasingly becomes a case of “Google plus Wikipedia”, and library catalogues move towards the model of search engine-inspired discovery tools that include resources available digitally and outside the physical space of the building (perhaps becoming a form of encyclopaedia in their own right in the process), is there any need for traditional information retrieval and reference librarian skills within the profession? I would answer with a resounding “yes!”: the very fact that a plethora of information is now immediately available in a variety of formats, produced by a variety of sources who may differ in terms of trustworthiness, reliability and so forth, with differences in usage and permission rights, et cetera, means that the role of the librarian or information professional as a mediator between the user and the information that they seek is still of vital importance if society is not to succumb to the strain of information overload. Thus librarians still maintain their traditional role as guardians of knowledge—possibly to the extent of being encyclopaedias in themselves—with the new skills and tools demanded by the digital age.

An introduction to classification

08 Sunday Mar 2015

Posted by Dominic in Information Organisation

Tags

analytico-synthetic classification, cataloguing, Charles Ammi Cutter, classification, Colon Classification, Deborah Lee, Dewey Decimal Classification, faceted classification, Henri La Fontaine, Herbert Putnam, information organisation, Library of Congress Classification, Library of Congress Subject Headings, MARC, Melvil Dewey, metadata, Online Computer Library Center, Paul Otlet, S. R. Ranganathan, Universal Decimal Classification, vocabulary control

[Before starting the MSc in Library Science at City University London, one of my jobs in the LIS sector was being a Trainee Cataloguer for a library supply company. This blog post is a modified version of the research notes I made whilst preparing for a presentation on classification that I gave during that time.]

What is classification?

Classification is the practice of giving a book, or other library item, an identifying number (or series of numbers and letters, depending on the classification scheme used) which is determined by its subject matter. The classification is used to create a call number which is attached to the item record in the library’s catalogue, and a shelfmark, which is displayed on the book itself. The classification also forms part of the item’s bibliographic record, and is present in one or more MARC fields.

Simplified MARC records including both Library of Congress Classification (050 field) and Dewey Decimal Classification (082) numbers (click to enlarge).
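As a rough illustration of where these numbers sit in a record, the sketch below models a heavily simplified MARC record as a Python dictionary and pulls the classifications out of the 050 and 082 fields. The dictionary layout, the helper name `classification_numbers` and the sample values are assumptions made for this example; they are not the output of any real MARC library, nor the record of a particular book.

```python
# A heavily simplified stand-in for a MARC record: each tag maps to a
# dictionary of subfield codes. Real records are richer (indicators,
# repeatable fields); the values here are illustrative placeholders.
record = {
    "245": {"a": "An example title"},
    "050": {"a": "QH541.5.F6", "b": "E93 2010"},  # LCC: class number + item number
    "082": {"a": "577.3", "2": "23"},             # DDC: class number + edition used
}


def classification_numbers(rec):
    """Return the LCC and DDC numbers held in fields 050 and 082, if present."""
    found = {}
    lcc = rec.get("050")
    if lcc:
        # Subfield a holds the class, subfield b the item (Cutter and date) part.
        parts = [part for part in (lcc.get("a"), lcc.get("b")) if part]
        if parts:
            found["LCC"] = " ".join(parts)
    ddc = rec.get("082")
    if ddc and ddc.get("a"):
        found["DDC"] = ddc["a"]
    return found


print(classification_numbers(record))
```

A record lacking either field simply produces a smaller dictionary, which mirrors the situation described later in this post where an item has a classification in only one scheme.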

Why classify?

Library management: classification by subject matter allows a library to manage its collection effectively, by keeping material on a certain subject together in one place. Before the invention of modern classification schemes in the mid-to-late nineteenth century, books were classified according to their location within the library, which led to a lack of consistency between libraries and the inconvenience of reclassifying material whenever sections of the collection were moved around.
User access: the greater convenience of using such a classification system also extends into the library’s electronic catalogue: for instance, a user can quickly browse items covering a particular topic by searching for its classification number. As the classification is an abbreviated expression of the content of the book (and sometimes other bibliographic information such as the author or title), the use of such a scheme in conjunction with spine labels and other signage also allows for quicker and more effective browsing of the library shelves.
Efficiency: as with cataloguing standards in general, the widespread use of international classification schemes such as Dewey Decimal Classification (DDC) and Library of Congress Classification (LCC) allows for greater efficiency, as libraries can share the workload of classifying the thousands upon thousands of new books that are published each year, and also updating those that are antiquated.

Dewey Decimal Classification

Dewey Decimal Classification (DDC) is hierarchical — consisting of the division of subjects from the most general to the most specific, expressed in a form of controlled vocabulary — and uses pure notation (only numbers). The numbers for more general subjects can be found in the schedules (the printed or online list of classifications and rules), and classifications for those that are more complex or detailed can be constructed using number building, using standard divisions enumerated in additional tables. To facilitate searching, there is also a relative index of terms used in the schedule, so called because it shows the relation of the term to all of the disciplines in which it is found. These features and principles are demonstrated in this example, featuring a real book!

(“Forest Phoenix: How a great forest recovers after wildfire / David Lindenmayer, David Blair, Lachlan McBurney and Sam Banks.” ISBN: 9780643100343)
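The hierarchy is visible in the notation itself: each extra digit narrows the subject, so trimming digits from the right yields progressively broader classes. The following is a minimal sketch of that idea; the function name `broader_classes` and the sample number are chosen for illustration and should not be read as the classification of the book above.

```python
def broader_classes(ddc_number):
    """List the chain of classes from the most general down to the given number.

    DDC numbers have at least three digits before the decimal point, so the
    two broadest levels are padded with zeros (5 -> 500, 57 -> 570).
    """
    digits = ddc_number.replace(".", "")
    chain = []
    for length in range(1, len(digits) + 1):
        prefix = digits[:length]
        if length <= 3:
            candidate = prefix.ljust(3, "0")
        else:
            candidate = prefix[:3] + "." + prefix[3:]
        # Skip repeats, which occur for numbers such as 500 or 570.
        if not chain or chain[-1] != candidate:
            chain.append(candidate)
    return chain


print(broader_classes("577.34"))
```

Reading the result from left to right walks from the main class, through the division and section, to the full number.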

When books classified according to the DDC are shelved, they look something like this:
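The shelf order follows from reading each number as a decimal, so that 577.27 files before 577.3. A small sketch of that ordering is given below, assuming shelfmarks made of a DDC number followed by a short author suffix (a common local convention, but an assumption here, as are the sample shelfmarks and the name `ddc_sort_key`).

```python
from decimal import Decimal


def ddc_sort_key(shelfmark):
    """Split 'number SUFFIX' and compare the number as a decimal value."""
    number, _, suffix = shelfmark.partition(" ")
    return (Decimal(number), suffix)


shelfmarks = ["577.3 LIN", "577.27 ABC", "580 XYZ", "577.3 BRO", "577 MNO"]
print(sorted(shelfmarks, key=ddc_sort_key))
```

Items sharing the same number fall back on the suffix, which keeps works on one subject together and in author order.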

The DDC was invented by Melvil Dewey, who published the first version of the scheme in 1876. An LIS pioneer, Dewey founded the world’s first library school (attached to Columbia College) and co-founded the American Library Association, and was an advocate for causes such as the metric system and English spelling reform (his given name was originally “Melville”, for instance). He later described how he came up with the idea of using numbers to represent concepts, a simple yet revolutionary concept:

For months I dreamd night and day that there must be somewhere a satisfactory solution. After months of study, one Sunday during a long sermon by Pres. Stearns, while I lookt stedfastly at him without hearing a word, my mind absorbed in the vital problem, the solution flashed over me so that I jumpt in my seat and came very near shouting ‘Eureka’! It was to get absolute simplicity by using the simplest known symbols, the Arabic numerals as decimals, with the ordinary significance of nought, to number a classification of all human knowledge in print.

[Dewey, M. (1920) Decimal classification beginning, Library Journal, 45, no. 15, February 1920, 151-154.]

The core principles of Dewey have remained the same, with most of the updates reflecting detail changes in the assignment of numbers to new or growing fields of study, although the use of standard codes listed in supplementary tables to enable number building was an important development in response to the faceted (or analytico-synthetic) Colon Classification devised several decades later by S. R. Ranganathan. The DDC also forms the basis of the Universal Decimal Classification developed by Paul Otlet and Henri La Fontaine, which also includes greater scope for faceted classification and is suitable for extremely large collections.

Now in its 23rd edition, the DDC is currently owned and maintained by the Online Computer Library Center (OCLC). Although the full details of the classification scheme must be purchased — either in three hefty print volumes or as a subscription to the online version, WebDewey — a simplified version is available in linked data format. Although it has faced criticism for a number of reasons, including its Anglo-American world-view bias, it is still the most popular classification scheme in the world: by the OCLC’s own estimates, it is used by over 200,000 libraries in over 135 countries, including over sixty national bibliographies.

Library of Congress Classification

Library of Congress Classification (LCC) is far more enumerative, in that it lists as many subjects in the schedules as possible (and thus is significantly more detailed than DDC) and uses alphanumeric notation (including Cutter Numbers, which follow a code to arrange text in alphabetic order using the fewest possible characters). There is some scope for number building, but less so than in DDC. Due to the greater size of the schedules, there is no overall index, but each general section has its own. These are not relative and therefore are more complicated to search. In general, the LCC system is less intuitive and more prescriptive than DDC due to its enumerative nature. Unlike DDC, the intention of LCC is that each item in a library (excluding duplicates) should have a unique shelfmark. Using the same book as before as an example, LCC produces a very different shelfmark to DDC:

Unlike with DDC, the author’s name/title is an intrinsic part of the classification itself.

Geographic Cutter numbers are derived from an international standard. Author/title Cutter numbers are constructed by the cataloguer using standard alphanumeric tables (or created using an online tool), but may need to be adjusted based on the library’s existing collection to ensure that no two different items share the same shelfmark (for example, the title Cutter could be changed to F648 if F65 was already taken by a book on the same subject with a similar title, or by one whose single or primary author is named, for example, Forrester).
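The adjustment from F65 to F648 works because the digits of a Cutter number are filed as a decimal rather than as a whole number, so F648 sits just before F65 on the shelf. The sketch below shows only that comparison; the helper name `cutter_key` is hypothetical, and the code does not attempt to construct Cutter numbers from the official tables.

```python
from decimal import Decimal


def cutter_key(cutter):
    """Order a Cutter such as 'F65' by its letter, then its digits read as a decimal."""
    letter, digits = cutter[0].upper(), cutter[1:]
    return (letter, Decimal("0." + digits) if digits else Decimal(0))


print(sorted(["F7", "F65", "F648", "E93"], key=cutter_key))
```

Because any number of further digits can be appended, there is always room to slot a new item between two existing shelfmarks.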

When books classified according to the LCC are shelved, they look something like this:

As the name suggests, the LCC was devised by Herbert Putnam (with assistance from Charles Ammi Cutter and his Expansive Classification scheme) to organise the books and other materials in the United States legislature’s collection, and can thus also be criticised for much the same Anglo-American bias that exists in the DDC, in addition to the fact that, as a practical scheme initially intended for the demands of a particular library, it lacks a sound epistemological basis. Unlike the DDC, it has been updated irregularly by subject area, and its main schedules are also available online, as downloadable Word documents or as linked data. It is also supported as an information organisation tool by the Library of Congress Subject Headings.

A note on individual judgement

Most books are classified as they are published (known as Cataloguing-In-Publication) by recognised cataloguing authorities (e.g. Library of Congress, British Library) and their classifications can simply be reused. However, the classifier may be called upon to use his or her own judgement (memorably described by the leader of our recent cataloguing workshops, Deborah Lee, as the “cherry on top” of the cake of guidance notes, policies and institutional preferences that can result in a deviation from “normal” procedure), in the case of:

the existing record containing an error, typographical or otherwise;
the existing record being outdated and using a classification that has since been relocated or discontinued;
the newly-published item having yet to be catalogued or classified by an authority;
the item only having a classification in one particular scheme whereas the library requires another;
the item not having a record for miscellaneous reasons (extreme age, limited publication, published abroad, etc.).

In addition, although the schemes are extremely detailed and cover the entire breadth of human knowledge, classification can still be open to interpretation. For example, a work covering the women’s suffrage movement in Britain, but also women’s wider social role and experience at the time, could conceivably be classified under several different Dewey numbers, a selection of which follows:

The colour-coding demonstrates that the same subject matter can be incorporated in different orders to create various classifications. This feature of classification schemes is known as the Acknowledgment of Duplication.

Moreover, local library preferences may require amendments to the “pure” classifications, such as the truncation of long classifications, the avoidance of certain subdivisions, the relocation of classifications for certain topics, or even a mix of two different classification systems.
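Truncation is the simplest of these local amendments to picture. The sketch below assumes a hypothetical house rule that caps the number of digits kept after the decimal point; the function name `truncate_ddc`, the cap and the sample numbers are all illustrative, not a policy taken from any particular library.

```python
def truncate_ddc(number, max_decimals):
    """Shorten a DDC number to at most `max_decimals` digits after the point."""
    whole, _, fraction = number.partition(".")
    # Cut the fraction, then drop any trailing zeros left by the cut,
    # since they add nothing to a decimal number.
    fraction = fraction[:max_decimals].rstrip("0")
    return whole + "." + fraction if fraction else whole


print(truncate_ddc("324.6230941", 3))
print(truncate_ddc("305.4209", 2))
```

Since the notation is hierarchical, a truncated number still places the item in the correct broader class; only the finer detail is lost.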

What is the future of classification?

The astute reader will doubtless have noted that much of what I have written need only be applied to library resources that exist in the form of physical entities. There is no intrinsic need to classify electronic or Internet-based resources using the existing methods, as the information contained within them is stored, and can be retrieved, in ways that transcend the traditional trappings of shelf-space and the physical library environment—although that hasn’t stopped some Internet classification projects from being carried out!

Nevertheless, it is fair to say that, so long as information exists, humanity will always need to organise it, and will continually develop tools to do so to best effect. It is equally fair to say that, although library collections are increasingly moving towards the digital, it is hard to envision the traditional formats of books, journals and other print media, and their corresponding rows of shelving, being totally superseded any time soon.

[N.B. The featured image from this post is a public-domain photograph, taken by Carol M. Highsmith, of the main Reading Room at the Library of Congress. Source: Wikimedia Commons.]

The Library of Tomorrow

~ thoughts and reflections on the world of Library and Information Science

Tag Archives: Henri La Fontaine
