According to Chu (Information Representation and Retrieval in the Digital Age), “aboutness” denotes a keyword or subject that represents an object or document. Searching by the “aboutness” of a document in conjunction with a field attribute such as author, publication year, or document type is a common way to refine information retrieval results.
“Relevance” is a bit trickier to define. I found it interesting that Baeza-Yates & Ribeiro-Neto (Modern Information Retrieval) describe “relevance” as being “in the eye of the beholder”, and thus a very subjective term influenced by many factors. Those factors include time, location, and even the device being used to access the information. Although the definition of relevance may vary, it is a property that reflects the relationship between a user’s query and a document.
Because relevance is subjective, it is difficult to determine the total number of relevant documents in a system, and that in turn complicates the computation of average recall. Recall is defined as the ratio between the number of relevant documents retrieved and the total number of relevant documents in a system. Looking at some of the algebraic equations in our texts for the various methods of calculating precision and recall, the primary metrics used to evaluate the quality of ranking, made my head spin!
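To untangle the definitions for myself, here is a minimal Python sketch of the two basic set-based measures. It uses the definition of recall given above, plus the standard definition of precision (relevant documents retrieved divided by all documents retrieved). The function name `precision_recall` and the document IDs are made up for illustration, and the sketch assumes the full set of relevant documents is already known, which is exactly the part that is hard to establish in a real system.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: IDs of the documents the system returned (hypothetical data)
    relevant:  IDs of all documents judged relevant (assumed to be known)
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # relevant documents that were retrieved

    # Guard against division by zero when either set is empty.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: 4 documents retrieved, 5 relevant overall, 2 in common.
retrieved_docs = ["d1", "d2", "d3", "d4"]
relevant_docs = ["d2", "d4", "d5", "d6", "d7"]

precision, recall = precision_recall(retrieved_docs, relevant_docs)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
# prints: precision = 0.50, recall = 0.40
```

Half of what was retrieved is relevant (precision), but only two of the five relevant documents were found (recall). The ranking-based measures in our texts build on these two ratios rather than replacing them.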
Various studies have identified anywhere from 38 to 80 factors that affect relevance. One study (Xu & Chen, 2006) indicated that topicality and novelty are two significant underlying dimensions of relevance. Topicality is defined as “the extent to which a retrieved document is perceived by the user to be related to her current topic of interest.” Novelty is defined as “the extent to which the content of a retrieved document is new to the user or different from what the user has known before.”
I understand that the preferred methods for determining recall, precision, and relevance are automatic in nature; that is, they can be calculated without human interaction or intervention. More human involvement adds time, cost, and logistical difficulty to any experiment. However, can the results be truly relevant (there’s that word again!) without that interaction?
Xu, Y., & Chen, Z. (2006). Relevance judgment: What do information users consider beyond topicality? Journal of the American Society for Information Science and Technology, 57(7), 961-973.