Header

UZH-Logo

Maintenance Infos

The address connector: noninvasive synchronization of hierarchical data sources


Augsten, Nikolaus; Böhlen, Michael Hanspeter; Gamper, Johann (2013). The address connector: noninvasive synchronization of hierarchical data sources. Knowledge and Information Systems (KAIS), 37(3):639-663.

Abstract

Different databases often store information about the same or related objects in the real world. To enable collaboration between these databases, data items that refer to the same object must be identified. Residential addresses are data of particular interest as they often provide the only link between related pieces of information in different databases. Unfortunately, residential addresses that describe the same location might vary considerably and hence need to be synchronized. Non-matching street names and addresses stored at different levels of granularity make address synchronization a challenging task. Common approaches assume an authoritative reference set and correct residential addresses according to the reference set. Often, however, no reference set is available, and correcting addresses with different granularity is not possible. We present the address connector, which links residential addresses that refer to the same location. Instead of correcting addresses according to an authoritative reference set, the connector defines a lookup function for residential addresses. Given a query address and a target database, the lookup returns all residential addresses in the target database that refer to the same location. The lookup supports addresses that are stored with different granularity. To align the addresses of two matching streets, we use a global greedy address-matching algorithm that guarantees a stable matching. We define the concept of address containment that allows us to correctly link addresses with different granularity. The evaluation of our solution on real-world data from a municipality shows that our solution is both effective and efficient.

Abstract

Different databases often store information about the same or related objects in the real world. To enable collaboration between these databases, data items that refer to the same object must be identified. Residential addresses are data of particular interest as they often provide the only link between related pieces of information in different databases. Unfortunately, residential addresses that describe the same location might vary considerably and hence need to be synchronized. Non-matching street names and addresses stored at different levels of granularity make address synchronization a challenging task. Common approaches assume an authoritative reference set and correct residential addresses according to the reference set. Often, however, no reference set is available, and correcting addresses with different granularity is not possible. We present the address connector, which links residential addresses that refer to the same location. Instead of correcting addresses according to an authoritative reference set, the connector defines a lookup function for residential addresses. Given a query address and a target database, the lookup returns all residential addresses in the target database that refer to the same location. The lookup supports addresses that are stored with different granularity. To align the addresses of two matching streets, we use a global greedy address-matching algorithm that guarantees a stable matching. We define the concept of address containment that allows us to correctly link addresses with different granularity. The evaluation of our solution on real-world data from a municipality shows that our solution is both effective and efficient.

Statistics

Citations

1 citation in Web of Science®
1 citation in Scopus®
Google Scholar™

Altmetrics

Additional indexing

Item Type:Journal Article, refereed, original work
Communities & Collections:03 Faculty of Economics > Department of Informatics
Dewey Decimal Classification:000 Computer science, knowledge & systems
Language:English
Date:2013
Deposited On:23 Dec 2013 14:37
Last Modified:08 Dec 2017 01:19
Publisher:Springer
ISSN:0219-3116
Publisher DOI:https://doi.org/10.1007/s10115-012-0582-x
Other Identification Number:merlin-id:8686

Download

Full text not available from this repository.
View at publisher