Duplicate Detection: Resources
- an open-source Java-based package of approximate string-matching techniques.
- another open-source Java-based library of similarity metric techniques.
- MARLIN - a Weka-based Java package for
duplicate detection. It is currently being cleaned up;
please contact firstname.lastname@example.org if you
would like to receive the current version.
- Febrl (Freely Extensible Biomedical Record Linkage) - does data standardisation
(segmentation and cleaning) and probabilistic record linkage ("fuzzy"
matching) of one or more files or data sources that do not share a unique
record key or identifier.
- The Link King - a
freely downloadable record linkage application for SAS.
- Projects, seminars, courses
Last modified: August 25, 2003