source link: https://www.codesd.com/item/i-need-a-tool-to-find-duplicates-or-similar-text-blocks-in-a-text-file-or-a-set-of-singular-text-files.html
I need a tool to find duplicates or similar text blocks in a text file or a set of singular text files
I want to automate moving duplicate or similar C code into functions.
This must work under Linux.
For one part of your problem, detecting the duplicate code, try PMD. Its Copy/Paste Detector is described as follows:
Duplicate code can be hard to find, especially in a large project. But PMD's Copy/Paste Detector (CPD) can find it for you! CPD has been through three major incarnations:
- First we wrote it using a variant of Michael Wise's Greedy String Tiling algorithm (our variant is described here).
- Then it was completely rewritten by Brian Ewins using the Burrows-Wheeler transform.
- Finally, it was rewritten by Steve Hawkins to use the Karp-Rabin string matching algorithm.
Note that CPD works with Java, JSP, C, C++, Fortran and PHP code.
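To give a feel for the Karp-Rabin idea mentioned above, here is a minimal sketch in Python. It is not CPD's implementation: CPD compares language tokens, whereas this simplification compares whole lines after collapsing whitespace. The names find_duplicate_blocks, normalize, BASE and MOD are made up for this illustration. A rolling hash is computed over every window of consecutive lines, windows with equal hashes are grouped, and each group is re-checked against the actual text to rule out hash collisions.

```python
from collections import defaultdict

# Hypothetical constants for the rolling hash (not taken from CPD).
BASE = 257
MOD = (1 << 61) - 1


def normalize(line):
    """Collapse runs of whitespace so that re-indented copies still match."""
    return " ".join(line.split())


def find_duplicate_blocks(lines, window=3):
    """Return groups of 0-based start indices whose `window` lines are identical."""
    norm = [normalize(line) for line in lines]
    n = len(norm)
    if window <= 0 or n < window:
        return []

    # Map each distinct line to a small integer so the hash works on numbers.
    ids = {}
    seq = [ids.setdefault(line, len(ids) + 1) for line in norm]

    # Hash of the first window.
    high = pow(BASE, window - 1, MOD)
    h = 0
    for x in seq[:window]:
        h = (h * BASE + x) % MOD

    buckets = defaultdict(list)
    buckets[h].append(0)

    # Slide the window: drop the oldest line, append the newest one.
    for i in range(1, n - window + 1):
        h = (h - seq[i - 1] * high) % MOD
        h = (h * BASE + seq[i + window - 1]) % MOD
        buckets[h].append(i)

    groups = []
    for starts in buckets.values():
        if len(starts) < 2:
            continue
        # Equal hashes are only candidates; compare the real text to confirm.
        by_content = defaultdict(list)
        for s in starts:
            by_content[tuple(norm[s:s + window])].append(s)
        groups.extend(g for g in by_content.values() if len(g) > 1)
    return sorted(groups)


if __name__ == "__main__":
    sample = [
        "int a = 1;",
        "a += 2;",
        "print(a);",
        "x();",
        "int a = 1;",
        "a  +=  2;",
        "print(a);",
    ]
    print(find_duplicate_blocks(sample, window=3))  # prints [[0, 4]]
```

Because the comparison is line-based, windows made only of blank lines or lone braces will also be reported, and copies that differ in variable names will be missed. A token-based tool such as CPD handles real C code much better; the sketch only shows why a rolling hash makes the search cheap.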