基于规则的中文地址要素解析方法
Rule-based Approach to Semantic Resolution of Chinese Addresses

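Abstract: In everyday production and life, addresses are among the most common reference systems for describing geographic locations in natural language. Address geocoding is regarded as the bridge that lets a GIS visually locate and spatially analyze large volumes of business data, and it has important application prospects in real estate management, land management, urban planning, public security, postal services, taxation, telecommunications, and public health. Address element resolution is one of the core techniques of Chinese address geocoding. It splits an address described in natural language into address elements, each of which designates a geographic extent within some bounded region. In practice, this process can be viewed as a specialized Chinese word segmentation task. For linguistic and cultural reasons, Chinese addresses are written as continuous character strings, and non-standard forms are widespread. Current address resolution methods are limited to a large extent by the cost of updating and maintaining dictionaries and by incomplete rule sets. Using a large-scale gazetteer and an address database as data sources, this paper systematically analyzes the word-formation features and syntactic patterns of address elements, builds feature character libraries for each type of address element, proposes a numerical representation of Chinese addresses, designs the RBAI algorithm for Chinese address element resolution, and develops a corresponding prototype system. In experiments, accuracy exceeded 92% and throughput reached 2,800 records per second. This indicates that the method meets the needs of large-scale data processing and has considerable value for wider application.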
English abstract: A geographic information system (GIS) integrates hardware, software, and data for capturing, managing, analyzing, and displaying all forms of geographically referenced information. Addresses are one of the most popular geographical reference systems in natural languages. Address geocoding is considered the most effective approach to bridging the gap between business data in management information systems (MIS) and GIS, which supports geospatial information visualization and spatial analysis. Chinese address geocoding faces three significant problems, namely address models, address resolution, and address matching, because Chinese place names are not standardized and national address databases are lacking. Address resolution aims to automatically split address strings in natural language into address units without losing semantic completeness. It plays a fundamental role in address models and address matching. Previous research focuses on rule-based or gazetteer-based approaches, which are easy to implement but have poor coverage and performance. In theory, Chinese address resolution is similar to word segmentation in Chinese natural language processing. Based on an investigation of large-scale Chinese place names and address syntactic patterns, this paper identifies primary and secondary general characters that represent a variety of address units. An address numerical representation method is then presented to induce the syntactic rules of Chinese addresses. Finally, we develop an RBAI algorithm to implement Chinese address resolution and illustrate it with an example. The experimental results indicate that the proposed approach achieves satisfactory efficiency and effectiveness for large-scale data processing, with accuracy over 92% and a processing rate over 2,800 items per second. The proposed approach and system can be extended to fields such as land management, asset management, city planning, public security, postal systems, taxation, public health management, and other location-based services.
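The abstract describes splitting an address string using characters that signal an address unit type, together with a numerical representation of the address. The following is a minimal sketch of that general idea only, not the paper's RBAI algorithm: the feature characters, their numeric levels, and the function names `encode` and `split_address` are all assumptions made for illustration, since this record does not give the paper's character library or rules.

```python
from typing import Dict, List

# Hypothetical feature characters and levels (assumed for this sketch;
# the paper's actual feature character library is not reproduced here).
FEATURE_CHARS: Dict[str, int] = {
    "省": 1,  # province
    "市": 2,  # city
    "区": 3,  # district
    "县": 3,  # county
    "路": 4,  # road
    "街": 4,  # street
    "号": 5,  # house number
}


def encode(address: str) -> List[int]:
    """Map each character to its feature level, or 0 if it is not a feature character."""
    return [FEATURE_CHARS.get(ch, 0) for ch in address]


def split_address(address: str) -> List[str]:
    """Cut after a feature character when its level exceeds the last accepted level.

    A feature character at the very start of a pending unit is not cut on,
    so a name such as "市府路" is not broken after its first character.
    """
    units: List[str] = []
    start = 0        # index where the pending unit begins
    last_level = 0   # level of the most recently accepted feature character
    for i, level in enumerate(encode(address)):
        if level > last_level and i > start:
            units.append(address[start:i + 1])
            start = i + 1
            last_level = level
    if start < len(address):
        units.append(address[start:])  # trailing text with no feature character
    return units


if __name__ == "__main__":
    print(split_address("江苏省南京市栖霞区文苑路1号"))
    print(split_address("南京市市府路"))
```

The increasing-level rule is a deliberate simplification: it assumes units appear from coarse to fine, which real addresses often violate, and that is one reason a complete rule set and feature library are needed in practice.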
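Source: 地球信息科学学报 (Journal of Geo-information Science), 2010, 12(1): 9-16 [Extended database]
Keywords: Chinese address; semantic resolution; address geocoding; address representation
Author address: Nanjing Normal University, Key Laboratory of Virtual Geographic Environment, Ministry of Education, Nanjing, Jiangsu 210046

Language: Chinese
Document type: Research article
ISSN: 1560-8999
Subject: Geophysics
Funding: National 863 Program; National Natural Science Foundation of China; Key Project of the Nanjing Normal University Research Fund
Accession number: CSCD:3841731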

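Citing documents: 38

1. 陆娟. Research on associated collection methods for geographic information in public security operations. 地球信息科学学报, 2010, 12(5): 713-717. Cited 0 times.

2. 邵妍. Automatic classification method for express delivery addresses based on a probabilistic statistical model. 计算机工程, 2012, 38(23): 277-280, 283. Cited 1 time.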
