1.1.18. Testing

The programs analyzed here were certainly tested, but what is presented is a description rather than a technical test. I try to detect and evaluate the functions and procedures that are available, and not, to the same extent, how those same functions and procedures behave with different and large amounts of data, with specific citation styles, and with specific data sources.

While, for example, a test of the input, sorting, and searching capabilities of a given package can be fast and thorough, I dare say that any test of two crucial issues like output styles and import filters can only be partial and even ephemeral.

This statement is grounded in four issues: quantity, variety, evolution, and lack of standardization.

Citation styles: a single package easily offers 100 or 200 styles (EndNote more than 2,300).
Each of them covers several document types, often 20-30: journal article, book chapter, book (each in long and short form), conference proceedings, thesis, technical report, audiovisual material, computer software, artwork, Internet resource, legal material ...
Furthermore, scientific journals, publishers, scholarly institutions, and content providers change their output citation styles without warning the BMS producers, for example because they want to handle a "new" kind of document type (blog, electronic thesis, RSS feed) or an already existing one (ancient manuscript, letter, will) that they did not use to consider before.
Thus, the producer of a BMS package that includes just 150 styles, each covering, let's say, only 7 types of documents, has to write specifications for some 1,050 arrangements and monitor them over the years.
And what about the figures for packages that offer thousands of citation styles, potentially across 20 or 30 different reference types? Can you imagine how many full-time people would have to be devoted to such a task to ensure accurate monitoring? A back-of-the-envelope sketch follows.
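
To make the order of magnitude concrete, here is a minimal Python sketch. Only the 150 × 7 = 1,050 figure and EndNote's style count come from the text above; the other numbers (document types for a large package, arrangements a reviewer can verify per day) are pure assumptions for illustration.

    # Only the 150-style / 7-type package (1,050 arrangements) and the
    # EndNote style count come from the text; everything else is a guess.
    packages = {
        "modest package": {"styles": 150, "doc_types": 7},
        "large package (EndNote-sized)": {"styles": 2300, "doc_types": 25},
    }

    for name, p in packages.items():
        arrangements = p["styles"] * p["doc_types"]
        # Assume one re-check per arrangement per year and a reviewer who
        # verifies 20 arrangements per working day (pure assumption).
        person_days = arrangements / 20
        print(f"{name}: {arrangements:,} arrangements, "
              f"~{person_days:,.0f} person-days per year")

Even under these generous assumptions, the large package alone would need thousands of person-days per year just to keep its shipped styles verified.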

The same happens, and even worse, with import filters.
Content providers tend to merge databases and to incorporate different existing sources, sometimes changing or creating data formats on their own, without rules, without standards, without documentation: again, BMS producers have to discover them, interpret them, and monitor them endlessly. Consider the dimensions involved (a sketch of the resulting test space follows the list):

  • How many import filters in each BMS package?
  • How many data sources?
  • How many databases offered by each single content provider?
  • How many reference/document types included in each single database? (not easy to know: many sources do not provide a handy list)
  • How many fields in each Reference Type? (not easy to know: many sources do not provide a handy list)
  • How many idiosyncratic situations within the same field? (parsing, multi-value fields, strings to ignore ...)
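
Even with deliberately conservative, entirely hypothetical answers to the questions above (the discussion gives no figures for import filters), the resulting test space is daunting. A minimal sketch:

    # Entirely hypothetical figures; the point is how the dimensions multiply.
    filters_per_package = 400   # import filters shipped with one package
    doc_types_per_source = 15   # reference/document types per database
    fields_per_doc_type = 20    # fields per reference type
    quirks_per_field = 2        # parsing, multi-value, strings to ignore ...

    checks = (filters_per_package * doc_types_per_source
              * fields_per_doc_type * quirks_per_field)
    print(f"checks to verify ONE package's filters once: {checks:,}")
    # 400 * 15 * 20 * 2 = 240,000 -- and since formats change without
    # notice, the whole exercise would have to be repeated continually.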

Testing: Despite that, I usually do a few of these tests for each package, and I can assure you that the outcomes are fairly disappointing, even where 'famous' packages, 'famous' styles, and 'famous' data sources are concerned: errors, imprecision, and slips abound. I never trust shipped citation styles and import filters "as is" whenever I need to rely seriously on the data. I always double-check them using printouts, a pencil, and a very few selected bibliographic records.
But any statement about whether a given package works or not in this respect would be extremely partial.

If we wanted to test carefully just one style, a famous one like Chicago A-B, we should:
- take one BMS package
- open the latest edition of the Chicago Manual of Style and keep it open ...
- use some 10-12 different types of documents (from book to journal article, from chapter to thesis, from manuscript to review, from letter to conference paper, from multivolume printed work to audiovisual, etc.)
- format those 10-12 document types in three citation situations: in-text, footnote (first and subsequent), and reference list.
Once done with one software package, we would be expected to repeat the same test across all the other packages considered here. At the end we would have tested one (1) output style out of the hundreds taken into account by these software packages; the sketch below enumerates the matrix involved.
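
Here is a minimal sketch of that matrix for one style in one package, using the document types and citation situations listed above; the scaling figures at the end are assumptions, not measurements.

    from itertools import product

    # Document types and citation situations as listed above; the text
    # counts footnote first/subsequent as one of its three situations,
    # but each still has to be verified separately.
    doc_types = ["book", "journal article", "chapter", "thesis",
                 "manuscript", "review", "letter", "conference paper",
                 "multivolume printed work", "audiovisual"]
    situations = ["in-text", "footnote (first)", "footnote (subsequent)",
                  "reference list"]

    checks_per_style = len(list(product(doc_types, situations)))
    print(f"{checks_per_style} checks for one style in one package")  # 40

    # Scaling up with illustrative figures: a handful of packages, each
    # shipping hundreds of styles (both numbers are assumptions).
    n_packages, n_styles = 8, 300
    print(f"full coverage: ~{checks_per_style * n_packages * n_styles:,} checks")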
A very similar test should be done with import filters, also considering that there we would have no equivalent of the Chicago Manual of Style as a standard: record formats are proprietary by definition.
Thorough testing? Sorry, I simply consider it not worth the effort, and still insufficient, to give a grounded evaluation of the output and import capabilities of any package.

Lack of standardization: The bibliographic world is much less standardized than the library world: there are no agreed standards for input, data coding, output, or exchange formats. Citation is ruled by dozens, even hundreds, of citing styles. Import depends on data formats, and here there are simply no rules at all: any content provider can create or modify its own format, and often includes a large variety of them within its huge databank.
Here there are no common, sometimes worldwide, standards like AACR2, ISBD, MARC, and ISO 2709.
But I do not need to be reminded that the library world, notwithstanding all its long-established standards, has its own communication, exchange, and retrospective-conversion problems as well.

