dedupe

GitHub - idealo/imagededup: 😎 Finding duplicate images made easy!
😎 Finding duplicate images made easy! Contribute to idealo/imagededup development by creating an account on GitHub.
image  deeplearning  python  dedupe 
yesterday by sanderant
idealo/imagededup: 😎 Finding duplicate images made easy!
😎 Finding duplicate images made easy! Contribute to idealo/imagededup development by creating an account on GitHub.
image  tensorflow  python  deeplearning  dedupe  imageprocessing  images  dedup 
3 days ago by e2b
idealo/imagededup: 😎 Finding duplicate images made easy!
😎 Finding duplicate images made easy! Contribute to idealo/imagededup development by creating an account on GitHub.
deeplearning  github  image  python  dedupe  imageprocessing 
6 days ago by hay
GitHub - idealo/imagededup: 😎 Finding duplicate images made easy!
😎 Finding duplicate images made easy! Contribute to idealo/imagededup development by creating an account on GitHub.
python  image  dedupe  imageprocessing  deeplearning 
8 days ago by synergyfactor
0x90d/videoduplicatefinder: Video Duplicate Finder - Crossplatform
Video Duplicate Finder - Crossplatform. Contribute to 0x90d/videoduplicatefinder development by creating an account on GitHub.
dedupe  video  ffmpeg 
10 days ago by Peter_Antigen
idealo/imagededup: 😎 Finding duplicate images made easy!
😎 Finding duplicate images made easy! Contribute to idealo/imagededup development by creating an account on GitHub.
dedupe  tensorflow  python 
10 days ago by Peter_Antigen
Minnesota police officers convicted of serious crimes still on the job - StarTribune.com
Behind the scenes, from Nick Diakopoulos in CJR. // Unsupervised approaches to grouping or clustering can sometimes be made more efficient by providing targeted feedback to the machine-learning system. For instance, Dedupe, a tool for grouping and linking noisy records, has been used by investigative journalists at the Minneapolis StarTribune for its “Shielded by the Badge” series. Dedupe uses an approach called active learning. As the system tries to cluster items together, it asks for feedback from a human trainer on the items it’s least confident about. This maximizes the value of human feedback for improving the results over time.
unsupervisedmachinelearning  unsupervised  journalism  police  activelearning  clustering  feedback  startribune  dedupe  machinelearning  artificialIntelligence  maryjowebster 
may 2019 by fcoel
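A rough illustration of the active-learning idea quoted above (ask the human about the items the system is least confident about). This is only a sketch of generic uncertainty sampling, not the Dedupe library's API: `most_uncertain`, `toy_score`, and the sample records are hypothetical names and data made up for the example, and the word-overlap score stands in for whatever model a real tool would train.

```python
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]


def most_uncertain(pairs: Sequence[Pair],
                   score: Callable[[Pair], float],
                   k: int = 1) -> List[Pair]:
    """Return the k candidate pairs whose match score is closest to 0.5,
    i.e. the ones the scorer is least sure about."""
    return sorted(pairs, key=lambda p: abs(score(p) - 0.5))[:k]


def toy_score(pair: Pair) -> float:
    """Hypothetical stand-in for a trained model: Jaccard overlap of
    lowercased words, read as a rough 'probability of being a duplicate'."""
    a, b = (set(s.lower().split()) for s in pair)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Made-up candidate record pairs.
candidates = [
    ("John Smith", "John Smith"),   # identical: confident match
    ("John Smith", "Jon Smith"),    # partial overlap: uncertain
    ("John Smith", "Mary Jones"),   # no overlap: confident non-match
]

# The pair a human trainer would be asked to label next.
to_label = most_uncertain(candidates, toy_score, k=1)
print(to_label)
```

In a real loop the human's answer would be fed back to retrain the scorer before the next query; that step is omitted here.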
Deduplicating files in Public Git Archive · source{d} blog
This summer, we announced the release of Public Git Archive, a dataset with 3TB of Git data from the most starred repositories on GitHub. Now it’s time to tell how we tried to deduplicate files in the latest revision of the repositories in PGA using our research project for code deduplication, src-d/apollo. Before diving deep, let’s quickly see why we created it. To the best of our knowledge, the only efforts to detect code clones at massive scale have been made by Lopes et al., who leveraged a huge corpus of over 428 million files in 4 languages to map code clones on GitHub (DéjàVu project). They relied on syntactic features, i.e. identifiers (my_list, your_list, …) and literals (if, for, …), to compute the similarity between a pair of files. PGA has fewer files in the latest (HEAD) revision - 54 million, and we did not want to give our readers a DéjàVu by repeating the same analysis. So we aimed at something different: not only copy-paste between files, but also involuntary rewrites of the same abstractions. Thus we extracted and used semantic features from Universal Abstract Syntax Trees.
cs  git  github  source  dedupe 
october 2018 by euler
restic: Fast, secure, efficient backup program
restic 0.9.2 has just been released! Included are many fixes and support for application keys:
github  pinboard-fixup-github-titles  Go  restic  backup  deduplication  dedupe  secure-by-default  from twitter_favs
august 2018 by suhlig
