ABSTRACT
Ontologies play a prominent role on the Semantic Web. They make possible the widespread publication of machine-understandable data, opening myriad opportunities for automated information processing. However, because of the Semantic Web's distributed nature, data on it will inevitably come from many different ontologies. Information processing across ontologies is not possible without knowing the semantic mappings between their elements. Manually finding such mappings is tedious, error-prone, and clearly not possible at Web scale. Hence, the development of tools to assist in the ontology mapping process is crucial to the success of the Semantic Web.
We describe GLUE, a system that employs machine learning techniques to find such mappings. Given two ontologies, for each concept in one ontology GLUE finds the most similar concept in the other ontology. We give well-founded probabilistic definitions to several practical similarity measures, and show that GLUE can work with all of them. This is in contrast to most existing approaches, which deal with a single similarity measure. Another key feature of GLUE is that it uses multiple learning strategies, each of which exploits a different type of information either in the data instances or in the taxonomic structure of the ontologies. To further improve matching accuracy, we extend GLUE to incorporate commonsense knowledge and domain constraints into the matching process. For this purpose, we show that relaxation labeling, a well-known constraint optimization technique used in computer vision and other fields, can be adapted to work efficiently in our context. Our approach is thus distinguished in that it works with a variety of well-defined similarity notions and that it efficiently incorporates multiple types of knowledge. We describe a set of experiments on several real-world domains, and show that GLUE proposes highly accurate semantic mappings.
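The abstract does not spell out the probabilistic definitions it mentions. As a hedged illustration only, the sketch below assumes the Jaccard coefficient as one similarity measure that can be expressed through the joint distribution of two concepts A and B, namely P(A,B) / (P(A,B) + P(A,not B) + P(not A,B)). The function names and the input format (one pair of membership flags per instance) are hypothetical, and the joint distribution is estimated here by simple counting rather than by the learned classifiers the abstract refers to.

```python
from collections import Counter

# All names below are illustrative assumptions, not part of the described system.
CELLS = [(True, True), (True, False), (False, True), (False, False)]


def joint_distribution(instances):
    """Estimate the joint distribution of two concepts A and B by counting.

    `instances` is a list of (in_a, in_b) boolean pairs, one per data instance.
    Returns a dict mapping each of the four cells to its relative frequency.
    """
    total = len(instances)
    if total == 0:
        raise ValueError("need at least one instance")
    counts = Counter(instances)
    return {cell: counts.get(cell, 0) / total for cell in CELLS}


def jaccard_similarity(dist):
    """Jaccard coefficient P(A and B) / P(A or B) from a joint distribution."""
    union = dist[(True, True)] + dist[(True, False)] + dist[(False, True)]
    return dist[(True, True)] / union if union > 0 else 0.0


def most_similar(candidates):
    """Pick the candidate concept with the highest Jaccard similarity.

    `candidates` maps a candidate concept name to the (in_a, in_b) pairs
    observed for the fixed concept A and that candidate B.
    """
    scores = {
        name: jaccard_similarity(joint_distribution(pairs))
        for name, pairs in candidates.items()
    }
    best = max(scores, key=scores.get)
    return best, scores
```

Because the measure depends only on the four joint probabilities, a different measure could be substituted by replacing `jaccard_similarity` with another function of the same distribution.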