## 28 August 2006

### Role & Filler-Similarity for Description Logics

At least in my opinion, there are two ways to handle similarity between role-filler pairs. The first (and maybe the most straightforward one) is to define similarity as the product of the similarities derived by comparing roles and fillers (see equation 1). The second approach is a weighted sum of role and filler similarities (see equation 2).
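The two equations can be written roughly as follows. This is a reconstruction from the prose description, not a verbatim copy of the original formulas: the notation ∃R.C and ∃S.D for the compared existential quantifications is the usual description logic notation and is assumed here, as is the convention that the weights ω_r and ω_c are non-negative and sum to 1.

```latex
\begin{align}
\mathrm{sim}_e(\exists R.C,\ \exists S.D) &= \mathrm{sim}_r(R, S) \cdot \mathrm{sim}_c(C, D) \tag{1} \\
\mathrm{sim}_e(\exists R.C,\ \exists S.D) &= \omega_r \cdot \mathrm{sim}_r(R, S) + \omega_c \cdot \mathrm{sim}_c(C, D) \tag{2}
\end{align}
```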

As an example, both equations measure the overlap between existential quantifications (sim_{e}), where sim_{r} is the inter-role and sim_{c} the inter-filler (range-concept) similarity. Equation 1 returns 0 if the compared roles or fillers are dissimilar (sim = 0), which is an advantage in terms of computation time and, more importantly, avoids misleading results, as discussed below. Nevertheless, treating role and filler similarity as equally important seems oversimplified. Moreover, except for the values 0 and 1, the resulting similarity sim_{e} is by definition (of multiplication) smaller than both sim_{r} and sim_{c}, which probably contradicts the way humans perceive similarity. The second approach, however, raises the question of how to semi-automatically derive the weightings that determine the relative importance of inter-role and inter-filler similarity for sim_{e}. In addition, high similarity ratings (for sim_{e}) already occur if one of the measured similarities is high, even when the other is 0.

Imagine a transportation device ontology, where R specifies an *inside* and S a *disjoint* relation. If both fillers C and D stand for *waterways*, equation 1 yields 0, while equation 2 results in ω_{c} · 1. Now one may argue that the weighting for inter-role similarity should be higher, but then you just need to switch the example (by defining dissimilar fillers) to run into the same difficulty again. To overcome this shortcoming, I have recently added the notion of thresholds from neural networks to the additive similarity approach: the threshold defines a minimum similarity value that sim_{r} and sim_{c} need to exceed, else sim_{e} is 0. The question of how to derive the weightings and the threshold is still open, but maybe it is possible to integrate the notion of commonality and variability used in MDSM **[37]** for this purpose. Although the theory presented in **[73]** has so far used the product similarity approach, its idea of context-awareness is comparable to MDSM, and therefore a combination seems promising. As a start, I have used a threshold t = 0.3, ω_{r} = 0.6 and ω_{c} = 0.4 for some first experiments within a simplified accommodation ontology.

**[37]** Rodríguez, A. M. and M. J. Egenhofer, Comparing Geospatial Entity Classes: An Asymmetric and Context-Dependent Similarity Measure. International Journal of Geographical Information Science, 2004. 18(3): p. 229-256.

**[73]** Janowicz, K., Sim-DL: Towards a Semantic Similarity Measurement Theory for the Description Logic ALCNR in Geographic Information Retrieval. In: R. Meersman, Z. Tari, P. Herrero et al. (Eds.): SeBGIS 2006, OTM Workshops 2006, LNCS 4278, 2006. p. 1681-1692.
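As a rough illustration of the three combination strategies discussed in this post (product, weighted sum, and weighted sum with a threshold), a minimal Python sketch might look like the one below. The function names are hypothetical, and the strict "greater than" comparison against the threshold is an assumption, since the post only says that sim_{r} and sim_{c} must exceed a minimum value. The default values t = 0.3, ω_{r} = 0.6 and ω_{c} = 0.4 are the ones from the first experiments mentioned above.

```python
def sim_product(sim_r: float, sim_c: float) -> float:
    """Equation 1: product of inter-role and inter-filler similarity."""
    return sim_r * sim_c


def sim_weighted(sim_r: float, sim_c: float,
                 w_r: float = 0.6, w_c: float = 0.4) -> float:
    """Equation 2: weighted sum of inter-role and inter-filler similarity."""
    return w_r * sim_r + w_c * sim_c


def sim_thresholded(sim_r: float, sim_c: float,
                    w_r: float = 0.6, w_c: float = 0.4,
                    t: float = 0.3) -> float:
    """Weighted sum with a threshold.

    Assumption: both similarities must be strictly greater than t,
    otherwise the overall similarity is 0.
    """
    if sim_r <= t or sim_c <= t:
        return 0.0
    return sim_weighted(sim_r, sim_c, w_r, w_c)


# Transportation example from the post: the roles "inside" and "disjoint"
# are dissimilar (sim_r = 0), while both fillers are "waterways" (sim_c = 1).
sim_r, sim_c = 0.0, 1.0
print(sim_product(sim_r, sim_c))      # 0.0
print(sim_weighted(sim_r, sim_c))     # 0.4, i.e. w_c * 1
print(sim_thresholded(sim_r, sim_c))  # 0.0, because sim_r does not exceed t
```

With the threshold in place, the additive measure behaves like the product for clearly dissimilar roles or fillers, while pairs where both similarities exceed t are still combined by the weighted sum.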