Remove duplicates based on a condition with awk/bash
I need to remove duplicates from a dataset that has 3 columns:
a 0 3238
b 0 3367
c 0 3130
d 1 3130
I need to remove lines that contain duplicate values in the third column, preferentially keeping the line with the value '1' in the second column. I know how to remove duplicates using awk, but I can't work out how to add the conditional statement.
Thanks.
Give this line a try:
awk '{if($3 in a)a[$3]=$2==1?$0:a[$3];else a[$3]=$0}END{for(i in a)print a[i]}' file
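One caveat with the line above: awk does not guarantee any particular order for `for(i in a)`, so the output lines may come out shuffled. If the original order matters, here is a rough sketch of a variant that remembers the order in which each third-column value was first seen. The function name `dedupe_prefer_one` and the array names `best` and `order` are made up for illustration, and it assumes whitespace-separated columns like in the sample data.

```shell
# Hypothetical helper: keep one line per distinct value of column 3,
# preferring a line whose column 2 equals 1, in first-seen order.
dedupe_prefer_one() {
    awk '
        # First time this column-3 value appears: record its position and keep the line.
        !($3 in best) { order[++n] = $3; best[$3] = $0; next }
        # Value seen before: replace the kept line only if this one has 1 in column 2.
        $2 == 1       { best[$3] = $0 }
        # Print the kept lines in the order their column-3 values first appeared.
        END           { for (i = 1; i <= n; i++) print best[order[i]] }
    ' "$@"
}
```

Called as `dedupe_prefer_one file` on the sample data, this should print `a 0 3238`, `b 0 3367` and `d 1 3130`, each on its own line. With no argument it reads from standard input.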