
Extract only rows with duplicated strings in tab-delimited table

I have a long list of data with 10 tab-delimited columns. The first two columns are IDs. I would like to retrieve the rows for selected IDs. I started by renaming the selected IDs so that each of them is prefixed with `comp-`. Then I tried to extract the rows in which a selected ID is present in both column 1 and column 2.

file:

comp-AA11232.1	GR55896.1
AB55887.1	comp-FR87559.1
comp-AC11232.1	comp-AE55888.1
comp-AC66742.1	comp-AD87559.1

Desired output:

comp-AC11232.1	comp-AE55888.1
comp-AC66742.1	comp-AD87559.1

I was using `sed -n '/comp\-.*\tcomp\-.*/p' file`. Every row in the output met the criteria, but some rows that also meet the criteria were missing from the output. I am not sure what is happening here. Any ideas? Or is there a better approach with grep/awk/sed in this case?

awk -F'\t' '$1 ~/^comp-/ && $2 ~/^comp-/' infile
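To check this command against the sample rows from the question, here is a minimal sketch. The file name `sample.tsv` is an assumption made for illustration, and `printf` is used so that the separators are real tab characters.

```shell
# Recreate the four sample rows with real tabs between the two fields.
# sample.tsv is a hypothetical file name used only for this demo.
printf '%s\t%s\n' \
  comp-AA11232.1 GR55896.1 \
  AB55887.1 comp-FR87559.1 \
  comp-AC11232.1 comp-AE55888.1 \
  comp-AC66742.1 comp-AD87559.1 > sample.tsv

# Print only the rows where both field 1 and field 2 start with "comp-".
awk -F'\t' '$1 ~ /^comp-/ && $2 ~ /^comp-/' sample.tsv
```

Only the last two rows should be printed, since they are the only ones where both IDs carry the prefix.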


The same, but with the pattern passed in as a variable:


awk -F'\t' -v pat='comp-' '$1 ~"^" pat && $2 ~"^" pat' infile


Or compare with a plain string match instead of a regex, still passing the prefix in as a variable:


awk -F'\t' -v str='comp-' 'index($1, str)==1 && index($2, str)==1' infile
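The difference between the regex form and the `index()` form only shows up when the prefix contains regex metacharacters. The sketch below uses a hypothetical prefix `comp.` (not from the question) to illustrate: as a regex the dot matches any character, while `index()` looks for the literal string.

```shell
# One tab-separated row whose fields start with "comp-", not "comp."
row=$(printf 'comp-AC11232.1\tcomp-AE55888.1')

# Regex match: "." matches any character, so the hypothetical
# prefix "comp." also matches "comp-" and the row is printed.
printf '%s\n' "$row" |
  awk -F'\t' -v pat='comp.' '$1 ~ "^" pat && $2 ~ "^" pat'

# Literal match: index() searches for the exact string "comp.",
# which does not occur in either field, so nothing is printed.
printf '%s\n' "$row" |
  awk -F'\t' -v str='comp.' 'index($1, str)==1 && index($2, str)==1'
```

If the prefix might ever contain characters such as `.`, `*` or `[`, the `index()` form is probably the safer choice.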


See also "How do I find the text that matches a pattern?" for other matching options.
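Since the question also asks about grep, one possible equivalent is sketched here. It assumes the two IDs are the first two fields of each line; the file name `sample2.tsv` and the variable `tab` are made up for this example. A literal tab is placed in a shell variable so the pattern does not depend on grep understanding `\t`.

```shell
# Hypothetical two-row sample file with real tab separators.
printf '%s\t%s\n' \
  comp-AA11232.1 GR55896.1 \
  comp-AC11232.1 comp-AE55888.1 > sample2.tsv

# Hold a literal tab character in a variable.
tab=$(printf '\t')

# Line must start with "comp-", run up to the first tab without
# crossing it, and have "comp-" immediately after that tab.
grep -E "^comp-[^${tab}]*${tab}comp-" sample2.tsv
```

Anchoring with `^` and excluding tabs with `[^${tab}]*` keeps the match tied to columns 1 and 2, so a `comp-` further along the line cannot satisfy it.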
