Join (large) files on alphanumeric pattern

I have:

File 1 like:

    string_of_printable_characters*string_of_printable_characters*string_of_printable_characters*ALPHANUMERIC_PATTERN

File 2 like:

    string_of_printable_characters*ALPHANUMERIC_PATTERN

where `*` is a field separator and the alphanumeric pattern is always the last field in the line.

I am completely stumped on how to achieve the following and would appreciate some assistance. I need to essentially "join" these two files on ALPHANUMERIC_PATTERN (I've tried the `join` command and it doesn't seem to work with alphanumeric keys) and print only the lines where both files contain the same ALPHANUMERIC_PATTERN. I would prefer to use `awk` for its processing efficiency, but anything would be very helpful. (These files are large.)

The catch is that I need the output to look like this:

    ALPHANUMERIC_PATTERN*stuff_from_file_1*stuff_from_file_2

With `join`, you could try something like this (it relies on process substitution, so it needs a shell such as bash, ksh or zsh):

    join -t\* \
    <(sed 's/\(.*\)\(\*\)\(.*\)/\3\2\1/' file1 | sort -t\* -k1,1) \
    <(sed 's/\(.*\)\(\*\)\(.*\)/\3\2\1/' file2 | sort -t\* -k1,1)

The two `sed` commands move the last field to the beginning of the line, e.g.

    field1*field2*...*field(N-1)*field(N)

becomes

    field(N)*field1*field2*...*field(N-1)
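To see the reordering in isolation, here is a minimal sketch that runs the same `sed` expression on a single made-up line. The sample fields (`foo`, `bar`, `baz`, `KEY1`) and the variable names are assumptions for illustration only.

```shell
# Hypothetical sample line: three arbitrary fields followed by the key.
line='foo*bar*baz*KEY1'

# The first \(.*\) is greedy, so it swallows everything up to the LAST '*';
# whatever follows that '*' (the key) lands in \3 and is printed first.
moved=$(printf '%s\n' "$line" | sed 's/\(.*\)\(\*\)\(.*\)/\3\2\1/')

printf '%s\n' "$moved"    # KEY1*foo*bar*baz
```

Because the match is anchored on the last `*`, it does not matter how many fields precede the key, which is why the same expression works for both files.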

The results are then `sort`ed on the 1st field and `join`ed (again on the 1st field). This prints lines like:

    field(N)*fields(1)to(N-1)*from*file1*fields(1)to(N-1)*from*file2
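To make the output format concrete, here is a small end-to-end run on invented data. The file contents, the scratch directory and the `.sorted` file names are assumptions for illustration; the sketch writes the sorted data to files so that it also runs in shells without process substitution.

```shell
# Work in a scratch directory; all file names and contents are made up.
tmp=$(mktemp -d)
printf '%s\n' 'a1*b1*c1*K2' 'a2*b2*c2*K1' 'a3*b3*c3*K9' > "$tmp/file1"
printf '%s\n' 'x1*K1' 'x2*K2' 'x3*K7' > "$tmp/file2"

# Move the key to the front and sort on it, once per file.
for f in file1 file2; do
    sed 's/\(.*\)\(\*\)\(.*\)/\3\2\1/' "$tmp/$f" |
        sort -t '*' -k1,1 > "$tmp/$f.sorted"
done

# Only keys present in both files (K1 and K2) survive the join;
# K9 (file1 only) and K7 (file2 only) are dropped.
joined=$(join -t '*' "$tmp/file1.sorted" "$tmp/file2.sorted")
printf '%s\n' "$joined"

rm -rf "$tmp"
```

With this sample data the run prints `K1*a2*b2*c2*x1` and `K2*a1*b1*c1*x2`, i.e. the key, then the remaining fields from file1, then the remaining fields from file2.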

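If you prefer to work with temporary files and save the `join` result to, e.g., `outfile`:

    # Move the key to the front of each line, then sort on it
    sed 's/\(.*\)\(\*\)\(.*\)/\3\2\1/' file1 | sort -t\* -k1,1 > sorted_1
    sed 's/\(.*\)\(\*\)\(.*\)/\3\2\1/' file2 | sort -t\* -k1,1 > sorted_2
    # Join on the first field (the key) and save the result
    join -t\* sorted_{1,2} > outfile
    # Clean up the temporary files
    rm -f sorted_{1,2}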
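Since the question mentions a preference for `awk`, here is a rough sketch of a hash-based variant that skips sorting. It is not the method above but a different technique, and it rests on assumptions: file2 (the one read first) must fit in memory, and each key appears at most once in file2 (a repeated key keeps only its last line, whereas `join` would print every pairing). The sample data and variable names are made up.

```shell
# Scratch data, same shape as in the question; contents are made up.
tmp=$(mktemp -d)
printf '%s\n' 'a1*b1*c1*K2' 'a2*b2*c2*K1' 'a3*b3*c3*K9' > "$tmp/file1"
printf '%s\n' 'x1*K1' 'x2*K2' 'x3*K7' > "$tmp/file2"

# Rule 1 (every line): split off the last field as the key.
# Rule 2 (first file only, NR == FNR): remember file2's other fields by key.
# Rule 3 (second file): if the key was seen, print key*file1_rest*file2_rest.
awk_out=$(awk -F '[*]' '
    { key = $NF; rest = substr($0, 1, length($0) - length(key) - 1) }
    NR == FNR { seen[key] = rest; next }
    key in seen { print key "*" rest "*" seen[key] }
' "$tmp/file2" "$tmp/file1")
printf '%s\n' "$awk_out"

rm -rf "$tmp"
```

Unlike the `join` pipeline, the output follows the line order of file1 rather than key order, so for this sample the `K2` line is printed before the `K1` line. Pipe the result through `sort -t '*' -k1,1` if sorted output matters.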