dc.contributor.author | Spiliopoulos, K | en |
dc.contributor.author | Sofianopoulou, S | en |
dc.date.accessioned | 2014-03-01T01:55:51Z | |
dc.date.available | 2014-03-01T01:55:51Z | |
dc.date.issued | 2007 | en |
dc.identifier.uri | https://dspace.lib.ntua.gr/xmlui/handle/123456789/27858 | |
dc.subject | Data Cleansing | en |
dc.subject | Data Mining | en |
dc.subject | Large Scale | en |
dc.subject | Levenshtein Distance | en |
dc.subject | String Matching | en |
dc.subject | Upper Bound | en |
dc.subject | Shortest Path | en |
dc.subject | Shortest Path Problem | en |
dc.title | Calculating distances for dissimilar strings: The shortest path formulation revisited | en |
heal.type | journalArticle | en |
heal.identifier.primary | 10.1016/j.ejor.2005.09.005 | en |
heal.identifier.secondary | http://dx.doi.org/10.1016/j.ejor.2005.09.005 | en |
heal.publicationDate | 2007 | en |
heal.abstract | Fast detection of string differences is a prerequisite for string clustering problems. An example of such a problem is the identification of duplicate information in the data cleansing stage of the data mining process. The relevant algorithms allow the application of large-scale clustering techniques in order to create clusters of similar strings. The vast majority of comparisons, in such cases, | en |
heal.journalName | European Journal of Operational Research | en |
dc.identifier.doi | 10.1016/j.ejor.2005.09.005 | en |