
May 21, 2011

Duke is a fast and flexible deduplication (or entity resolution, or record linkage) engine written in Java on top of Lucene. It can be used to find duplicate records inside a single table or data source, or to find records in different tables or sources that most likely represent the same real-world entity. At the moment (2011-04-07) it can process 1,000,000 records in 11 minutes on a standard laptop.

deduplication lucene

Jan 20, 2010

The popular open source search library Apache Lucene and the Lucene-powered search server Apache Solr have recently added spatial capabilities. Lucene and Solr committer Grant Ingersoll walks you through the basics of spatial search.

search semantic solr apache lucene geolocation delicious

in list: Semantic Web
