Delete Duplicate Lines

The Delete Duplicate Lines item is available via the Reformat Text button on the collection toolbar, except when the Recycle Bin tab is active. Use it to delete duplicate lines from the active clip. You can make a number of choices as to what AceText will consider a duplicate line.

Scope

If you've selected part of the file before using the Delete Duplicate Lines command, you can limit the command to delete only lines that are selected. If the first and/or last line in the selection is only partially selected, the selection will be expanded to include them entirely. If the selection is rectangular, duplicate lines covered by the selection will be deleted entirely, not just their selected columns.

Proximity of Duplicate Lines

Select "anywhere in the scope" to delete all lines that are duplicated anywhere in the scope. The first copy of each line will remain, while all the others will be deleted. If you've set the scope to "selected lines", the lines must be duplicated inside the selection. Lines that are not duplicated inside the selection will not be deleted, even if they have duplicates outside the selection.
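As a rough illustration of the "anywhere in the scope" behavior, the following Python sketch keeps the first copy of each line inside a scope and leaves lines outside the scope untouched. The function name and the start/end parameters are hypothetical, and the set-based lookup is only a convenient way to model the result, not a description of how AceText works internally.

```python
def delete_duplicates_in_scope(lines, start=0, end=None):
    """Keep the first copy of each line found in lines[start:end].

    Lines outside the scope are never deleted and never count as
    duplicates of lines inside the scope (assumed model of the
    "selected lines" scope).
    """
    end = len(lines) if end is None else end
    seen = set()
    kept = []
    for line in lines[start:end]:
        if line not in seen:
            seen.add(line)
            kept.append(line)
    return lines[:start] + kept + lines[end:]


# Whole file as the scope: only the first "b" and the first "a" survive.
print(delete_duplicates_in_scope(["b", "a", "b", "c", "a"]))

# Scope limited to the middle three lines: the outer "a" lines stay.
print(delete_duplicates_in_scope(["a", "b", "a", "b", "a"], 1, 4))
```

In the second call the selection contains "b", "a", "b", so only the second "b" is removed, even though "a" also appears outside the selection.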

Select "adjacent lines only" if you only want to delete a line's duplicates when they're immediately below the line they duplicate, without any other lines between them. If the file's lines are sorted alphabetically, the end result of "anywhere in the scope" and "adjacent lines only" will be the same, because in a sorted file all duplicates are sorted together. However, selecting "adjacent lines only" will delete the duplicate lines significantly faster, especially when the number of lines in the file is large. If you select "anywhere in the scope", AceText has to compare each line with every other line in the file.
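The difference between the two proximity settings, and why they agree on sorted input, can be sketched in Python as follows. Both function names are hypothetical, and the code only models the outcome described above.

```python
def delete_adjacent_duplicates(lines):
    """Drop a line only when it equals the line directly above it."""
    kept = []
    for line in lines:
        if not kept or line != kept[-1]:
            kept.append(line)
    return kept


def delete_duplicates_anywhere(lines):
    """Keep the first copy of each line, wherever later copies appear."""
    return list(dict.fromkeys(lines))


unsorted_lines = ["pear", "apple", "pear", "pear", "apple"]
sorted_lines = sorted(unsorted_lines)

# On unsorted input the two settings give different results.
print(delete_adjacent_duplicates(unsorted_lines))
print(delete_duplicates_anywhere(unsorted_lines))

# On sorted input all duplicates are neighbors, so the results match.
print(delete_adjacent_duplicates(sorted_lines) == delete_duplicates_anywhere(sorted_lines))
```

The adjacent variant only ever looks at the previous line, which is why it needs a single pass over the text.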

Comparison Options

By turning on one or more comparison options, you can tell AceText to consider lines as duplicates even when they aren't identical.

The "compare selected columns only" option is only available when you've made a selection that does not span more than one line, or when you've made a rectangular selection. With this option, AceText will only compare the selected columns. E.g. if the selection spans from column 10 to column 20, AceText will compare columns 10 through 20 of each line. If a line has fewer than 10 characters, it will be considered blank. This has important consequences (see the Blank Lines section).
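To make the column comparison concrete, here is a small Python sketch that extracts the part of a line that would be compared. The name column_key is hypothetical, and the sketch assumes that columns are numbered from 1 and that every character, including a tab, occupies exactly one column.

```python
def column_key(line, first_col, last_col):
    """Return the text in columns first_col..last_col (1-based, inclusive).

    A line shorter than first_col yields an empty key, which models the
    line being treated as blank.
    """
    return line[first_col - 1:last_col]


a = "AAAAAAAAA-same-text--111"
b = "BBBBBBBBB-same-text--222"

# The two lines differ outside columns 10 through 20 only,
# so they compare as duplicates.
print(column_key(a, 10, 20) == column_key(b, 10, 20))

# A line with fewer than 10 characters has nothing to compare.
print(repr(column_key("short", 10, 20)))
```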

"Ignore differences in leading spaces and tabs" will treat lines that only differ in the number of spaces and tabs at the start of the line as duplicates. Similarly, "ignore trailing spaces and tabs" ignores differences in spaces and tabs at the end of each line. "Ignore all differences in spaces and tab" is more than a combination of the two previous options. AceText will then completely ignore all spaces and tabs, including spaces and tabs in the middle of lines.

"Ignore difference in case" compares lines without regard to the difference between upper case and lower case letters.

Lines to Delete

You must select one or two choices in the "lines to delete" section. Every line in the file belongs to one of three categories: "1st occurrence of duplicate lines", "2nd and following occurrences of duplicate lines", or "non-duplicated lines". Selecting none of the options would have no effect, and selecting all of them would delete all the lines in the file.

Turn on "2nd and following occurrences of duplicate lines" and turn off the other two options to delete all duplicate lines, leaving only unique lines in the file, regardless of whether they were previously unique. Use this to delete unnecessary duplicates from a file.

Turn on both "2nd..." and "1st occurrence of duplicate lines" to delete all duplicate lines, leaving only lines that were previously unique.

Turn on both "2nd..." and "non-duplicated lines" to leave only one copy of all lines that had duplicates. If you paste the contents of two lists that consist of unique lines (when viewed separately) into a file in AceText, then you can use this combination to get the lines that occurred in both files, but not the lines that occurred in only one of the files.
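The three "lines to delete" options can be modeled by placing every line in one of three categories and deleting the selected categories. The Python sketch below does this for the "anywhere in the scope" proximity setting. The function delete_lines and its parameters first, later and unique are hypothetical names for the "1st occurrence...", "2nd and following occurrences..." and "non-duplicated lines" options.

```python
from collections import Counter


def delete_lines(lines, first=False, later=False, unique=False):
    """Delete the selected categories of lines and return the rest."""
    counts = Counter(lines)
    seen = set()
    kept = []
    for line in lines:
        if counts[line] == 1:
            delete = unique   # non-duplicated line
        elif line in seen:
            delete = later    # 2nd or following occurrence
        else:
            delete = first    # 1st occurrence of a duplicated line
        seen.add(line)
        if not delete:
            kept.append(line)
    return kept


lines = ["a", "b", "a", "c", "b", "a"]
print(delete_lines(lines, later=True))               # one copy of every line
print(delete_lines(lines, later=True, first=True))   # previously unique lines
print(delete_lines(lines, later=True, unique=True))  # one copy of duplicated lines

# Two lists that are unique on their own, pasted one after the other:
list_one = ["x", "y", "z"]
list_two = ["y", "z", "w"]
print(delete_lines(list_one + list_two, later=True, unique=True))
```

The last call returns the lines that occur in both lists, which matches the list comparison use described above.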

If you want to keep only lines that occur a certain number of times, use the Delete Duplicate Lines command several times. E.g. if you only want lines that occur 3 times or more, use it twice with the "1st occurrence..." and "non-duplicated..." options turned on. Then use it a third time with only the "2nd occurrence..." option turned on. The first time you delete the lines that occur only once, the second time you delete the lines that occur only twice, and the third time you delete the remaining duplicates of lines that occur four times or more. Lines that occur exactly three times are down to a single copy after the second pass, so the "non-duplicated..." option must stay off for the third pass, or those lines would be deleted too.
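A multi-pass recipe like this is easy to check by simulating it. The Python sketch below models each pass with a hypothetical delete_lines function (assuming the "anywhere in the scope" setting) and keeps one copy of every line that occurs at least min_count times. Its final pass deletes only the 2nd and following occurrences, so that lines already reduced to a single copy survive.

```python
from collections import Counter


def delete_lines(lines, first=False, later=False, unique=False):
    """Delete the selected categories of lines and return the rest."""
    counts = Counter(lines)
    seen = set()
    kept = []
    for line in lines:
        if counts[line] == 1:
            delete = unique   # non-duplicated line
        elif line in seen:
            delete = later    # 2nd or following occurrence
        else:
            delete = first    # 1st occurrence of a duplicated line
        seen.add(line)
        if not delete:
            kept.append(line)
    return kept


def keep_lines_occurring_at_least(lines, min_count):
    """Keep one copy of each line that occurs min_count times or more."""
    for _ in range(min_count - 1):
        # Each pass removes one copy of every duplicated line and
        # every line that is down to a single copy.
        lines = delete_lines(lines, first=True, unique=True)
    # Final pass: collapse whatever duplicates are left.
    return delete_lines(lines, later=True)


text = ["once"] + ["twice"] * 2 + ["thrice"] * 3 + ["four"] * 4
print(keep_lines_occurring_at_least(text, 3))
```

With min_count set to 3, the lines "thrice" and "four" remain, one copy each.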

Blank Lines

Since blank lines are technically all duplicates of each other, AceText offers you an extra choice for blank lines. You can choose to delete all blank lines, to not delete any blank lines, or to delete only duplicate blank lines. The "duplicate blank lines" option takes into account the "proximity" setting: it either deletes all blank lines except the first one ("anywhere in the scope"), or replaces each run of consecutive blank lines with a single blank line ("adjacent lines only").
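The three blank line choices and their interaction with the proximity setting can be sketched in Python as follows. The function name and the mode strings "all", "none" and "duplicates" are hypothetical labels for the choices described above, and only truly empty lines are treated as blank here.

```python
def handle_blank_lines(lines, mode, adjacent_only=False):
    """Apply the blank line choice; non-blank lines are always kept.

    mode is "all" (delete every blank line), "none" (keep them all),
    or "duplicates" (delete duplicate blank lines only).
    """
    if mode not in ("all", "none", "duplicates"):
        raise ValueError("unknown mode: " + mode)
    kept = []
    seen_blank = False
    for line in lines:
        if line != "":
            kept.append(line)
            continue
        if mode == "none":
            kept.append(line)
        elif mode == "duplicates":
            if adjacent_only:
                # Duplicate only if the line directly above is blank.
                is_duplicate = bool(kept) and kept[-1] == ""
            else:
                # Duplicate if any earlier line is blank.
                is_duplicate = seen_blank
            if not is_duplicate:
                kept.append(line)
        seen_blank = True
    return kept


text = ["a", "", "", "b", "", "c"]
print(handle_blank_lines(text, "duplicates"))
print(handle_blank_lines(text, "duplicates", adjacent_only=True))
```

With "anywhere in the scope" only the first blank line in the text survives, while "adjacent lines only" keeps one blank line per run of blank lines.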

If you've turned on the "compare selected columns only" option, a line may be considered blank even when it isn't. If a line is shorter than the leftmost column in the selection, it is considered to be blank, even if it does have text on it.

Lines with only spaces and tabs on them are only considered to be blank if you've turned on one of the options to ignore differences in spaces and tabs. On a line with only spaces and tabs, all spaces and tabs are considered to be both leading and trailing at the same time.
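The rules for when a line counts as blank can be summarized in a short Python sketch. The function is_blank and its parameters are hypothetical, and applying the column restriction before the whitespace options is an assumption made for illustration.

```python
def is_blank(line, first_col=None, ignore_leading=False,
             ignore_trailing=False, ignore_all_whitespace=False):
    """Decide whether a line is treated as blank under the given options.

    first_col is the leftmost selected column (1-based) when
    "compare selected columns only" is on, otherwise None.
    """
    if first_col is not None:
        # A line shorter than the leftmost column has nothing to compare.
        line = line[first_col - 1:]
    if ignore_leading or ignore_trailing or ignore_all_whitespace:
        # On a whitespace-only line every space and tab is both leading
        # and trailing, so any one of the three options empties the line.
        line = line.strip(" \t")
    return line == ""


print(is_blank(" \t "))                        # spaces and tabs, no options
print(is_blank(" \t ", ignore_trailing=True))  # same line, one option on
print(is_blank("short", first_col=10))         # text, but left of column 10
```

The first call reports the line as not blank, and the other two report a blank line.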