
How to remove unique strings from a text file? Sorry guys, I had to edit my example because I didn't express my query properly. Let's say I have the following .txt file:

    Happy sad
    Happy sad
    Happy sad
    Sad happy
    Happy sad
    Happy sad
    Mad sad
    Mad happy
    Mad happy

and I want to delete any string that is unique, leaving the file with:

    Happy sad
    Happy sad
    Happy sad
    Happy sad
    Happy sad
    Mad happy
    Mad happy

I understand that `sort` is able to get rid of duplicates (`sort file.txt | uniq`), so is there any way we can do the opposite in bash using a command? Or would I just need to figure out a while loop for it? BTW, `uniq -D file.txt > output.txt` doesn't work.

Using `awk`:


    $ awk 'seen[$0]++; seen[$0] == 2' file
    Happy sad
    Happy sad
    Happy sad
    Happy sad
    Happy sad
    Mad happy
    Mad happy


This uses the text of each line as the key into the associative array `seen`. The first pattern, `seen[$0]++`, evaluates to the value *before* incrementing, so it is zero (false) the first time a line is seen and non-zero (true, triggering the default print action) on every subsequent occurrence. The second pattern, `seen[$0] == 2`, prints the line one extra time on exactly its second occurrence; without it, you would miss one occurrence of each duplicated line.
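
For readability, here is the same one-liner expanded into a commented script (equivalent behavior):

    awk '
        # Pattern 1: the pre-increment value of seen[$0] is zero on the
        # first occurrence (no print) and non-zero afterwards, so the
        # default print action fires from the second occurrence onwards.
        seen[$0]++
        # Pattern 2: on exactly the second occurrence, print once more to
        # compensate for the first occurrence that was skipped.
        seen[$0] == 2
    ' file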

This is related to `awk '!seen[$0]++'`, which is sometimes used to _remove_ duplicates without sorting (see e.g. _How does awk '!a[$0]++' work?_).
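
For contrast, running that deduplicating variant on the same example file keeps only the first occurrence of each line:

    $ awk '!seen[$0]++' file
    Happy sad
    Sad happy
    Mad sad
    Mad happy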

* * *

To only get one copy of the duplicated lines:


    awk 'seen[$0]++ == 1' file


or,


    sort file | uniq -d
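
As for why `uniq -D file.txt` didn't work: `uniq` only compares adjacent lines, so duplicates must be grouped together (e.g. by `sort`) before it can detect them. With implementations that support `-D` (such as GNU `uniq`), sorting first and then using `-D` prints every copy of each duplicated line, which matches the desired output, though in sorted order:

    $ sort file | uniq -D
    Happy sad
    Happy sad
    Happy sad
    Happy sad
    Happy sad
    Mad happy
    Mad happy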
