I have a list of lists with strings.

words = [['gamma_ray_bursts','merger','death','throes','magnetic_flares','neutrino_antineutrino','objections','bursts','double_neutron_star','parker_instability','positrons'],
 ['dot','gravitational_lensing','splittings','limits','amplifications','time_delays','extracting_information','fix','distant_quasars'],
 ['recoil','gamma_ray_bursts','neutron_stars','jennings','possible_origins','birthplaces','disjoint','arrival_directions'],
 ['sn_sn','type_ii_supernovae','distances','dilution','extinction','extragalactic_distance_scale','expanding_photosphere','distance','photospheres','supernovae_sn','span_wide_range'],
 ['photon_pair','high_energy','gamma_ray_burst','optical_depth','absorbing_medium','implications','problem','annihilation_radiation','emergent_spectrum','limit','radiation_transfer','collimation','regions']]
  1. I would like to remove any element of a list that is a substring of another element in the same list.
  2. I would like the order to be preserved.

I have tried this loop:

for string_list in words:
    for item in string_list: 
        for item1 in string_list:
            if item in item1 and item!= item1:
                string_list.remove(item)

It seems to work with a smaller list of lists, but it raises an error when I increase the length of the list.

ValueError                                Traceback (most recent call last)
<ipython-input-91-7546f608171f> in <module>
      4         for item1 in string_list:
      5             if item in item1 and item!= item1:
----> 6                 string_list.remove(item)

ValueError: list.remove(x): x not in list

expected output:

words = [['gamma_ray_bursts','merger','death','throes','magnetic_flares','neutrino_antineutrino','objections','double_neutron_star','parker_instability','positrons'], ['dot','gravitational_lensing','splittings','limits','amplifications','time_delays','extracting_information','fix','distant_quasars'],['recoil','gamma_ray_bursts','neutron_stars','jennings','possible_origins','birthplaces','disjoint','arrival_directions'], ['sn_sn','type_ii_supernovae','distances','dilution','extinction','extragalactic_distance_scale','expanding_photosphere','photospheres','supernovae_sn','span_wide_range'],['photon_pair','high_energy','gamma_ray_burst','optical_depth','absorbing_medium','implications','problem','annihilation_radiation','emergent_spectrum','limit','radiation_transfer','collimation','regions']]

I've searched the forums and found a very similar question: "Python - Remove any element from a list of strings that is a substring of another element". Its solution works sometimes, but other times it raises an error, and where the error occurs is not consistent. The length of the list is variable.

  • Please add the desired output. Your question can be interpreted in multiple ways. What do you mean by "substring of another element"? You have a list of lists of strings. Are you working with substrings of the strings? I am confused. Also, what is an element? Are you referring to the other lists? Commented Dec 11, 2019 at 13:45
  • It's never good practice to change the contents of a list that you are iterating over. Commented Dec 11, 2019 at 13:53
  • Sorry, I added the expected output. Within each list, I would like to remove any element/string that is a substring of another element/string. Ex. list_1 = ['gamma_ray_bursts' ,... 'bursts'] remove 'bursts' output = ['gamma_ray_bursts',...] Each inner list should be checked for substrings independently. No, I'm referring to elements within each list, not to the other lists. Commented Dec 11, 2019 at 13:56
  • @ChrisDoyle I was wondering that. If that's the case, is creating a new list without the substrings better/acceptable practice? Commented Dec 11, 2019 at 13:58
  • So, just to clarify your requirement: you want to remove any element of a list if it's a substring of another item in the same list? Commented Dec 11, 2019 at 14:03
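
To illustrate the point raised in the comments, here is a minimal reproduction of the failure, using a shortened version of the fourth list from the question (the shortened data and the `error` variable are only for illustration). As far as I can tell, the error appears whenever an item is a substring of two other elements: 'distance' is found in 'distances' and removed, the inner loop keeps running, finds 'distance' again in 'extragalactic_distance_scale', and calls remove() a second time on an item that is already gone.

```python
# Shortened sample: 'distance' is a substring of two other elements.
string_list = ['distances', 'extragalactic_distance_scale', 'distance', 'photospheres']

error = None
try:
    # Same loop as in the question: it mutates the list it is iterating over.
    for item in string_list:
        for item1 in string_list:
            if item in item1 and item != item1:
                # The first match removes the item; a second match for the
                # same item calls remove() again and raises ValueError.
                string_list.remove(item)
except ValueError as exc:
    error = str(exc)

print(error)        # the ValueError message from the second remove() call
print(string_list)  # the list as it was left when the error occurred
```

This would also explain why shorter lists seemed to work: a list such as the first one, where 'bursts' matches only one other element, never reaches a second remove() call.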

1 Answer

Instead of removing elements from a list while iterating over it, why not build a new list that matches your requirements? That is safer.

# Return True if elem is a proper substring of another string in lst
def substr_in_list(elem, lst):
  for s in lst:
    if elem != s and elem in s:
      return True
  return False

words = [[j for j in i if not substr_in_list(j, i)] for i in words]

Output:

[['gamma_ray_bursts', 'merger', 'death', 'throes', 'magnetic_flares', 'neutrino_antineutrino', 'objections', 'double_neutron_star', 'parker_instability', 'positrons'], ['dot', 'gravitational_lensing', 'splittings', 'limits', 'amplifications', 'time_delays', 'extracting_information', 'fix', 'distant_quasars'], ['recoil', 'gamma_ray_bursts', 'neutron_stars', 'jennings', 'possible_origins', 'birthplaces', 'disjoint', 'arrival_directions'], ['sn_sn', 'type_ii_supernovae', 'distances', 'dilution', 'extinction', 'extragalactic_distance_scale', 'expanding_photosphere', 'photospheres', 'supernovae_sn', 'span_wide_range'], ['photon_pair', 'high_energy', 'gamma_ray_burst', 'optical_depth', 'absorbing_medium', 'implications', 'problem', 'annihilation_radiation', 'emergent_spectrum', 'limit', 'radiation_transfer', 'collimation', 'regions']]
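
If the inner lists have to be updated in place (for example because other variables refer to the same list objects), one possible variation is to build the filtered list first and then write it back with slice assignment. This is only a sketch under that assumption: `drop_substrings` is a hypothetical helper name, and the sample data is a shortened version of the lists in the question.

```python
def drop_substrings(strings):
    """Return a new list without items that are a proper substring of another item."""
    return [
        s for s in strings
        if not any(s != other and s in other for other in strings)
    ]


# Shortened sample data based on the question.
words = [
    ['gamma_ray_bursts', 'merger', 'bursts'],
    ['distances', 'extragalactic_distance_scale', 'distance', 'photospheres'],
]

# The filtered list is fully built before the slice assignment runs,
# so nothing is mutated while it is being iterated over.
for string_list in words:
    string_list[:] = drop_substrings(string_list)

print(words)
# [['gamma_ray_bursts', 'merger'], ['distances', 'extragalactic_distance_scale', 'photospheres']]
```

Order is preserved because the comprehension walks each list from left to right. Exact duplicates are kept, since an item is only dropped when it differs from the string that contains it. Each list is compared against itself pairwise, so the cost grows quadratically with the length of an inner list.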