Python: merge lists of tuples based on their values


I'm trying to figure out a method to merge two lists in Python in order to accomplish this:

list_a = [(item_1, attribute_x), (item_2, attribute_y), (item_3, attribute_z)]
list_b = [(item_1, attribute_n), (item_3, attribute_p)]

With this as the result:

list_result = [(item_1, attribute_x, attribute_n), (item_2, attribute_y, False), (item_3, attribute_z, attribute_p)]

Any ideas?

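Here is an interesting way to solve the problem, with a robust function that returns a generator:

def combine_item_pairs(l1, l2):
    # Start with every key from l1; False marks "no value from l2 yet".
    d = {k: [v, False] for k, v in l1}
    for key, value in l2:
        if key in d:
            d[key][1] = value
        else:
            # Key appears only in l2, so the l1 slot is False.
            d[key] = [False, value]
    # Python 2: iteritems(). On Python 3 use d.items() instead.
    return (tuple([key] + value) for key, value in d.iteritems())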

Using it:

>>> list(combine_item_pairs(list_a, list_b))
[('item_2', 'attribute_y', False), ('item_3', 'attribute_z', 'attribute_p'), ('item_1', 'attribute_x', 'attribute_n')]
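The function above relies on `iteritems()`, which only exists on Python 2. As a minimal sketch, assuming Python 3.7 or later (where dicts keep insertion order), the same idea could look like this. The name `combine_item_pairs_py3` is made up here for illustration and is not part of the original answer:

```python
def combine_item_pairs_py3(l1, l2):
    """Merge two lists of (key, value) pairs into (key, v1, v2) tuples."""
    # Every key from l1 starts with False in the slot reserved for l2.
    merged = {key: [value, False] for key, value in l1}
    for key, value in l2:
        if key in merged:
            merged[key][1] = value
        else:
            # Key only exists in l2, so the l1 slot is False.
            merged[key] = [False, value]
    # items() replaces the Python 2 iteritems() call.
    return (tuple([key] + values) for key, values in merged.items())


list_a = [('item_1', 'attribute_x'), ('item_2', 'attribute_y'), ('item_3', 'attribute_z')]
list_b = [('item_1', 'attribute_n'), ('item_3', 'attribute_p')]

result = list(combine_item_pairs_py3(list_a, list_b))
print(result)
# [('item_1', 'attribute_x', 'attribute_n'), ('item_2', 'attribute_y', False), ('item_3', 'attribute_z', 'attribute_p')]
```

With insertion-ordered dicts the result follows the order of `list_a`, with keys found only in `list_b` appended at the end, so the shuffled order seen in the Python 2 output does not occur.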

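Here is a bonus solution (same interface, more efficient):

from itertools import groupby
from operator import itemgetter

def combine_item_pairs(l1, l2):
    # Sort the combined pairs so groupby sees equal keys next to each other,
    # then pad each group with False and trim to (key, value, value).
    return (tuple(list([k] + [itemgetter(1)(i) for i in g] + [False])[:3])
            for k, g in groupby(sorted(l1 + l2), key=itemgetter(0)))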

Results:

>>> list(combine_item_pairs(list_a, list_b))
[('item_1', 'attribute_n', 'attribute_x'), ('item_2', 'attribute_y', False), ('item_3', 'attribute_p', 'attribute_z')]

Note: the efficiency of this solution is diminished if the lists require sorting, or if a lot of values are absent. Also, absences are reflected as a False value in the last item of the tuple, with no way of knowing which list was missing the item (that's the price of efficiency). It should be used for large data, when it is less important to know which list is missing an item.
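To make that caveat concrete, here is a small sketch (written for Python 3, with the hypothetical name `combine_sorted`) of the same groupby idea. It shows that a key present in only one list produces the same tuple no matter which list it came from:

```python
from itertools import groupby
from operator import itemgetter


def combine_sorted(l1, l2):
    """groupby-based merge; pads with False and trims to 3 items."""
    merged = sorted(l1 + l2)  # groupby needs equal keys to be adjacent
    return [
        tuple(([key] + [value for _, value in group] + [False])[:3])
        for key, group in groupby(merged, key=itemgetter(0))
    ]


only_in_first = combine_sorted([('item_2', 'attribute_y')], [])
only_in_second = combine_sorted([], [('item_2', 'attribute_y')])

print(only_in_first)   # [('item_2', 'attribute_y', False)]
print(only_in_second)  # [('item_2', 'attribute_y', False)]
```

Both calls give the same tuple, so the source list of `attribute_y` cannot be recovered. For keys present in both lists, the attributes also come out in sorted order rather than list order, which is why `attribute_n` precedes `attribute_x` in the results above.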


Edit: timers:

a = [('item_1', 'attribute_x'), ('item_2', 'attribute_y'), ('item_3', 'attribute_z')]
b = [('item_1', 'attribute_n'), ('item_3', 'attribute_p')]

def inbar(l1, l2):
    d = {k: [v, False] for k, v in l1}
    for key, value in l2:
        if key in d:
            d[key][1] = value
        else:
            d[key] = [False, value]
    return (tuple([key] + value) for key, value in d.iteritems())

def solus(l1, l2):
    dict_a, dict_b = dict(l1), dict(l2)
    items = sorted({i for i, _ in l1 + l2})
    return [(i, dict_a.get(i, False), dict_b.get(i, False)) for i in items]

import timeit
# Running each timer 3 times to be sure.
print timeit.Timer('inbar(a, b)', 'from __main__ import a, b, inbar').repeat()
# [2.2363221572247483, 2.1427426716407836, 2.1545361420851963]
# [2.2058199808040575, 2.137495707329387, 2.178640404817184]
# [2.4588094406466743, 2.4221991975274215, 2.3586636366037856]
print timeit.Timer('solus(a, b)', 'from __main__ import a, b, solus').repeat()
# [5.841498824468664, 5.951693880486182, 5.866254325691159]
# [5.843569212526087, 5.919173415087307, 6.027018876010061]
# [6.41402184345621, 6.229860036924308, 6.562849100520403]
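One thing to keep in mind when reading those numbers: `inbar` returns a generator, so the timer measures building the dict but not building the tuples, while `solus` returns a fully built list. A rough sketch of a like-for-like comparison on Python 3 might look like the following. The `NUMBER` value is an arbitrary choice made here to keep the run short (the timers above used the `timeit` default), and no timings are claimed:

```python
import timeit

a = [('item_1', 'attribute_x'), ('item_2', 'attribute_y'), ('item_3', 'attribute_z')]
b = [('item_1', 'attribute_n'), ('item_3', 'attribute_p')]


def inbar(l1, l2):
    d = {k: [v, False] for k, v in l1}
    for key, value in l2:
        if key in d:
            d[key][1] = value
        else:
            d[key] = [False, value]
    return (tuple([key] + value) for key, value in d.items())


def solus(l1, l2):
    dict_a, dict_b = dict(l1), dict(l2)
    items = sorted({i for i, _ in l1 + l2})
    return [(i, dict_a.get(i, False), dict_b.get(i, False)) for i in items]


NUMBER = 10000  # assumed loop count, chosen only to keep this quick

# list(...) forces the generator so both functions do the full work.
inbar_times = timeit.repeat(lambda: list(inbar(a, b)), number=NUMBER, repeat=3)
solus_times = timeit.repeat(lambda: solus(a, b), number=NUMBER, repeat=3)

print('inbar:', min(inbar_times))
print('solus:', min(solus_times))
```

Taking the minimum of the repeats is the usual way to read `timeit` results, since the higher values mostly reflect other activity on the machine.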
