Delete Duplicate Lines

The Delete Duplicate Lines item is available via the Reformat Text button on the collection toolbar and via the Format menu, except when the Recycle Bin tab is active. Use it to delete duplicate lines from the active clip. You can make a number of choices as to what AceText considers to be a duplicate line.

Scope

If you’ve selected part of the clip before using the Delete Duplicate Lines command, you can limit the command to delete only lines that are selected. If the first or last line in the selection is only partially selected, the selection is expanded to include that line entirely. If the selection is rectangular, each duplicate line covered by the selection is deleted as a whole line, not just its selected columns.

Proximity of Duplicate Lines

Select “anywhere in the scope” to delete all lines that are duplicated anywhere. The first copy of the line remains, while all the others are deleted. If you’ve set the scope to “selected lines”, the lines must be duplicated inside the selection. Lines that are not duplicated inside the selection are not deleted, even if they have duplicates outside the selection.

Select “adjacent lines only” if you only want to delete a line’s duplicates when they’re immediately below the line they duplicate, without any other lines between them. If the clip’s lines are sorted alphabetically, the end result of “anywhere in the scope” and “adjacent lines only” is the same, because in a sorted clip all duplicates are sorted together. However, selecting “adjacent lines only” deletes the duplicate lines significantly faster, particularly when the number of lines in the clip is large. If you select “anywhere in the scope”, AceText has to compare each line with every other line in the clip.
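The difference between the two proximity settings can be illustrated with a short Python sketch. This is not AceText’s implementation; the function names are made up for this example, the lines are compared exactly, and a set is used for brevity rather than comparing every pair of lines.

```python
def dedupe_anywhere(lines):
    """Keep the first copy of each line; drop later copies wherever they occur."""
    seen = set()
    kept = []
    for line in lines:
        if line not in seen:
            seen.add(line)
            kept.append(line)
    return kept


def dedupe_adjacent(lines):
    """Drop a line only when it equals the line immediately above it."""
    kept = []
    for line in lines:
        if not kept or line != kept[-1]:
            kept.append(line)
    return kept


lines = ["pear", "apple", "apple", "pear", "fig"]
print(dedupe_anywhere(lines))  # ['pear', 'apple', 'fig']
print(dedupe_adjacent(lines))  # ['pear', 'apple', 'pear', 'fig']

# On sorted input the two settings give the same result.
ordered = sorted(lines)
print(dedupe_anywhere(ordered) == dedupe_adjacent(ordered))  # True
```

In the unsorted list, the second “pear” survives the adjacent-only pass because another line sits between it and the first “pear”.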

Comparison Options

By turning on one or more comparison options, you can tell AceText to consider lines as duplicates even when they aren’t identical.

The “compare selected columns only” option is only available when you’ve made a selection that does not span more than one line, or when you’ve made a rectangular selection. With this option, AceText only compares the selected columns. If the selection spans from column 10 to column 20, for example, then AceText compares columns 10 through 20 of each line. If a line has fewer than 10 characters, it is considered to be blank. This has important consequences (see the Blank Lines section below).

“Ignore differences in leading spaces and tabs” treats lines that only differ in the number of spaces and tabs at the start of the line as duplicates. Similarly, “ignore trailing spaces and tabs” ignores differences in spaces and tabs at the end of each line. “Ignore all differences in spaces and tabs” is more than a combination of the two previous options. With it, AceText completely ignores all spaces and tabs, including spaces and tabs in the middle of lines.

“Ignore difference in case” compares lines without regard to the difference between uppercase and lowercase letters.
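One way to picture the comparison options is as a function that reduces each line to the text that is actually compared. Two lines then count as duplicates when their reduced texts are equal. The Python sketch below is only an illustration: the function name, the parameter names and the order in which the options are applied are assumptions, not a description of how AceText works internally.

```python
def comparison_key(line, ignore_leading=False, ignore_trailing=False,
                   ignore_all_whitespace=False, ignore_case=False,
                   columns=None):
    """Return the text that stands in for `line` when testing for duplicates.

    `columns` is a hypothetical (first, last) pair of 1-based, inclusive
    column numbers, e.g. (10, 20) for columns 10 through 20.
    """
    if columns is not None:
        first, last = columns
        # A line shorter than the first column yields an empty string,
        # so it is treated like a blank line.
        line = line[first - 1:last]
    if ignore_all_whitespace:
        # Drops spaces and tabs everywhere, including mid-line.
        line = line.replace(" ", "").replace("\t", "")
    else:
        if ignore_leading:
            line = line.lstrip(" \t")
        if ignore_trailing:
            line = line.rstrip(" \t")
    if ignore_case:
        line = line.casefold()
    return line


print(comparison_key("  Hello\t", ignore_leading=True, ignore_trailing=True))  # Hello
print(comparison_key("a b\tc", ignore_all_whitespace=True))                    # abc
print(comparison_key("Hello", ignore_case=True)
      == comparison_key("HELLO", ignore_case=True))                            # True
print(repr(comparison_key("short", columns=(10, 20))))                         # ''
```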

Lines to Delete

You must select one or two choices in the “lines to delete” section. Every line in the clip belongs to exactly one of the three categories. Selecting none of the options would have no effect, and selecting all of them would delete all the lines in the clip.

Turn on “2nd and following occurrences of duplicate lines” and turn off the other two options to delete all duplicate lines, leaving only unique lines in the clip, regardless of whether they were previously unique. Use this to delete unnecessary duplicates from a clip.

Turn on both “2nd...” and “1st occurrence of duplicate lines” to delete all duplicate lines, leaving only lines that were previously unique.

Turn on both “2nd...” and “non-duplicated lines” to leave only one copy of all lines that had duplicates. If you paste the contents of two lists that consist of unique lines (when viewed separately) into a clip in AceText, then you can use this combination to get the lines that occurred in both lists, but not the lines that occurred in only one of the lists.

If you want to keep only lines that occur a certain number of times, use the Delete Duplicate Lines command several times. E.g. if you only want lines that occur four times or more, use it twice with the “1st occurrence...” and “non-duplicated...” options turned on. Then use it again with the “2nd occurrence...” and “non-duplicated...” options. The first time you delete the lines that occur only once, the second time you delete the lines that occur only twice, and the third time you delete the lines that occur only three times along with the duplicates of lines that occur four times or more.
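The three categories and the combinations described in this section can be modeled in a few lines of Python. This is a simplified sketch, not AceText’s code: the function and parameter names are invented for the example, lines are compared exactly, and duplicates are counted anywhere in the list.

```python
from collections import Counter


def delete_lines(lines, first_occurrence=False, later_occurrences=False,
                 non_duplicated=False):
    """Return the lines that remain after deleting the chosen categories."""
    counts = Counter(lines)
    seen = set()
    kept = []
    for line in lines:
        if counts[line] == 1:
            delete = non_duplicated       # line has no duplicates
        elif line in seen:
            delete = later_occurrences    # 2nd and following occurrences
        else:
            delete = first_occurrence     # 1st occurrence of a duplicated line
        seen.add(line)
        if not delete:
            kept.append(line)
    return kept


lines = ["a", "b", "a", "c", "b", "a"]
print(delete_lines(lines, later_occurrences=True))
# ['a', 'b', 'c']
print(delete_lines(lines, first_occurrence=True, later_occurrences=True))
# ['c']
print(delete_lines(lines, later_occurrences=True, non_duplicated=True))
# ['a', 'b']

# Three passes in a row: "w" occurs once, "x" twice, "y" three times
# and "z" four times.
text = ["w"] + ["x"] * 2 + ["y"] * 3 + ["z"] * 4
text = delete_lines(text, first_occurrence=True, non_duplicated=True)
text = delete_lines(text, first_occurrence=True, non_duplicated=True)
text = delete_lines(text, later_occurrences=True, non_duplicated=True)
print(text)  # ['z']
```

Under this model, the three-pass sequence leaves a single copy of each line that occurred at least four times. Each pass with “1st occurrence...” and “non-duplicated...” turned on lowers every remaining line’s count by one.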

Blank Lines

Since blank lines are technically all duplicates of each other, AceText offers you an extra choice for blank lines. You can choose to delete all blank lines, to delete no blank lines, or to delete only duplicate blank lines. The “duplicate blank lines” option takes into account the “proximity” setting. It either deletes all blank lines but the first (“anywhere in the scope”), or replaces each run of consecutive blank lines with a single blank line (“adjacent lines only”).

If you’ve turned on the “compare selected columns only” option, a line may be considered blank even when it isn’t. If a line is shorter than the leftmost column in the selection, it is considered to be blank, even if it does have text on it.

Lines with only spaces and tabs on them are only considered to be blank if you’ve turned on one of the options to ignore differences in spaces and tabs. On a line with only spaces and tabs, all spaces and tabs are considered to be both leading and trailing at the same time.
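The blank line choices can be sketched in the same way. The Python example below handles blank lines only and leaves all other lines alone. The function name, the mode names “all”, “none” and “duplicates”, and the two flags are assumptions made for illustration.

```python
def filter_blank_lines(lines, mode, adjacent_only=False,
                       ignore_whitespace=False):
    """Apply one of the three blank-line choices.

    mode: "all" deletes every blank line, "none" keeps them all, and
    "duplicates" keeps only the first blank line (anywhere in the scope)
    or the first of each consecutive run (adjacent_only=True).
    """
    kept = []
    blank_seen = False  # a kept blank line that later blanks duplicate
    for line in lines:
        # Lines with only spaces and tabs count as blank only when
        # differences in spaces and tabs are being ignored.
        text = line.strip(" \t") if ignore_whitespace else line
        if text != "":
            kept.append(line)
            if adjacent_only:
                blank_seen = False  # a non-blank line ends the run
            continue
        if mode == "all" or (mode == "duplicates" and blank_seen):
            continue
        kept.append(line)
        blank_seen = True
    return kept


lines = ["a", "", "", "b", "", "c"]
print(filter_blank_lines(lines, "all"))
# ['a', 'b', 'c']
print(filter_blank_lines(lines, "duplicates"))
# ['a', '', 'b', 'c']
print(filter_blank_lines(lines, "duplicates", adjacent_only=True))
# ['a', '', 'b', '', 'c']
```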