ESB Forum
ISSN: 2283-303X

Semantic web: no to the web of data, yes to the web of documents, metadata and human beings

by Riccardo Ridi (online since 14 January 2021).

Contribution, submitted on 6 May 2018, to the AIB's Semantic web manifesto, published on 12 October 2020.

[1] If "semantic web" means increasing the quantity, quality, coherence, univocity, standardization, currency and interoperability of the metadata present on the WWW, and attaching them to primary documents to facilitate their retrieval, organization, understanding, selection, evaluation and use, then it is a sensible, realistic and useful project, to which librarians, archivists and scholars of library, archival and documentary sciences can make an important contribution thanks to their skills and values.

[2] If, on the other hand, "semantic web" means replacing primary documents with granular data that can be recombined with each other as the occasion requires, then it is a senseless and unrealistic project. As librarians, archivists and scholars of library, archival and documentary sciences know well, it is impossible even to imagine an overall system for the production, storage, communication, acquisition and use of knowledge that entirely disregards the document, a fundamental element of the organization and management of information. Only documentary structures of sufficient size, architecture, verifiability and persistence can actually meet the systemic, organic, historical, philological, legal and authorial requirements that cannot be satisfied by information atoms that are too small, unstructured and volatile.

[3] Finally, if "semantic web" means delegating entirely (or, in any case, to a large extent) to algorithms and automatic mechanisms both the aggregation of data into more extensive information structures and the interpretation of data to derive evaluations and decisions from them, then it is a project that is not only impossible but also dangerous. As epistemologists have well known, at least from Hume and Kant onwards, not only documents but even data are never objective and neutral, and their selection, organization, contextualization, interpretation and evaluation are activities in which machines can help but must never replace human beings.

[4] However, a greater granularization of digital documents (which are often still as rigid and monolithic as traditional ones, owing to cultural inertia and an excessive protection of intellectual property) could greatly increase their effectiveness and their usability by both software applications and human beings. But this does not necessarily imply a forced "liquefaction" of every kind of information into an indistinct dust of data. For each coherent set of information, the ideal level of granularity is the one that maximizes the possibilities for different reaggregations, for reusing content in different contexts and for exploring the information nodes along diversified paths, without compromising their readability as unitary documents, distinct from the others. Finding such a balance is by no means easy, and it is a decision that cannot be delegated to a machine or to an abstract rule.