## Without itertools
Since we now have 3 or 4 variants with itertools, here is my version without itertools, almost as naive as the direct `for` loops I proposed in the comments.
```python
def rec(d, rp=2):
    if rp == 0: return d
    return rec({k1 + (k2,): d[k1] * dl[k2] for k1 in d for k2 in dl}, rp - 1)

def wrec(rp=2):
    return rec({(): 1}, rp)
```
So it is just the same as the direct `{(k1, k2): dl[k1]*dl[k2] for k1 in dl for k2 in dl}`, but building the cross product one factor at a time (it is a cross product of a cross product of ... if `rp` is big).
I use a "pure" version here, starting with {():1}, which is the neutral element of that cross-product. But of course, if we know there is no way the function would be called with rp=0 (after all, no other version works with rp=0), and even rp=1, we could save one recursion with
```python
def wrec(rp=2):
    return rec({(k,): dl[k] for k in dl}, rp - 1)
```
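To make the recursion concrete, here is a minimal self-contained run on a hypothetical two-entry `dl` (keys and values made up for illustration):

```python
from fractions import Fraction as fr

dl = {'A': fr(1, 3), 'B': fr(2, 3)}  # hypothetical two-entry example

def rec(d, rp=2):
    if rp == 0: return d
    # extend every existing key tuple by one more key, multiplying the values
    return rec({k1 + (k2,): d[k1] * dl[k2] for k1 in d for k2 in dl}, rp - 1)

print(rec({(): 1}, 2))
# {('A', 'A'): Fraction(1, 9), ('A', 'B'): Fraction(2, 9),
#  ('B', 'A'): Fraction(2, 9), ('B', 'B'): Fraction(4, 9)}
```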
## With itertools
Just to have my own itertools version, here it is (assuming `import itertools as it` and `import math as mt`, as in the timing code below):
```python
{kk: mt.prod(dl[k] for k in kk) for kk in it.product(dl, repeat=2)}
```
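A runnable sketch of that one-liner, again on a hypothetical two-entry `dl`:

```python
import itertools as it
import math as mt
from fractions import Fraction as fr

dl = {'A': fr(1, 3), 'B': fr(2, 3)}  # hypothetical two-entry example

# it.product(dl, repeat=2) yields every pair of keys; mt.prod multiplies the values
result = {kk: mt.prod(dl[k] for k in kk) for kk in it.product(dl, repeat=2)}
print(result[('A', 'B')])  # 2/9
```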
## Timings
(On my machine, with the OP's example. Depending on the intended usage, it might be worth studying what happens when the size of `dl` increases, or when `rp` increases.)
| Method | Timing |
| - | - |
| OP | 346 µs |
| J. Earls | 389 µs |
| J. Earls' simplified version | 426 µs |
| Stef | 468 µs |
| My itertools | 398 µs |
| Robert's | 363 µs |
| My no-itertools recursive | 167 µs |
| My 2nd no-itertools, rp>0 | 136 µs |
| (for reference, direct iteration with forced rp=2) | 121 µs |
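For reference, the "direct iteration" baseline in the last row is just the fixed-`rp=2` comprehension mentioned earlier; a sketch of what that timing measures:

```python
# only works when rp is exactly 2, hence "forced rp=2"
def direct2():
    return {(k1, k2): dl[k1] * dl[k2] for k1 in dl for k2 in dl}
```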
So the recursive version may be disappointing. It's always cooler to pull off some impressive itertools yoga, when my recursive function is just what you learn at school. But it is faster, and it depends on no library at all. And even if recursion is often a bad idea in Python (there is no tail-call optimization, so even the simplest recursion can quite quickly hit the eponymous "stack overflow" error), here I doubt you plan to use rp=20000 or the like (if you do, you'll have other problems before the stack overflows). So, as disappointing as it is, I feel it is the best.
And if that no-itertools solution isn't acceptable, then, so far, the best solution is yours :D (If the recursion itself is what bothers you, see the loop-based sketch below.)
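If the recursion depth ever became a concern, the same one-factor-at-a-time construction can be written as a plain loop. This is not what I timed, just a hypothetical iterative equivalent of `rec`/`wrec`:

```python
def wrec_loop(rp=2):
    # iterative equivalent of rec/wrec: extend the key tuples one factor at a time
    d = {(): 1}
    for _ in range(rp):
        d = {k1 + (k2,): v1 * dl[k2] for k1, v1 in d.items() for k2 in dl}
    return d
```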
## Edit: impact of rp
Just to be sure, I wanted to see how `rp` impacts the timing. And since I did it, I post it here, even though it is quite boring: roughly, `rp` has no impact, other than an almost constant multiplicative factor.
The following shows the same data three times. The first figure plots the log of the time (since it is clearly exponential in `rp`): it shows that the timing ratios stay basically constant (a constant gap on the log scale). The second figure is the same on a linear scale (just to better visualize how it is all basically equivalent, except for the solution without itertools). And the third shows the ratio between each solution and the OP's, for each `rp`, which shows better than the first two how almost constant that ratio is (except for Stef's and my recursive version, whose ratios seem to improve a bit as `rp` rises, but with an obvious asymptotic constant ratio reached quite soon). But the bottom line is: considering other `rp` values doesn't change the previous conclusion: "the fastest way is the boring recursive one, and if that is unacceptable, then it is yours".

## vs `dl` size
Likewise, increasing the number of entries in `dl` has no impact on that conclusion either. (Again, this was expected: for all methods we expect the timing to be exponential in `rp`, more precisely O(n^rp), with n the number of entries in `dl`. So, with rp=2, quadratic in the size of `dl`; with rp=3, cubic; etc.)
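A quick sanity check of that O(n^rp) growth (assuming the `rec`-based `wrec` from above): the output dict itself has exactly n^rp entries, so no method can do asymptotically better.

```python
n = len(dl)
for rp in (2, 3, 4):
    # every rp-tuple of keys appears exactly once, hence n**rp entries
    assert len(wrec(rp)) == n ** rp
```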

## My code for timings
(The working part of this code is just a copy of the contributors' code from this page, including my own solution. So, no answer to the question in this section; just the timing/plotting code that the OP requested in the comments.)
```python
import itertools as it
import math as mt
from fractions import Fraction as fr
import timeit

import numpy as np
import matplotlib.pyplot as plt
from tqdm import tqdm

# Testing data
dl = {
    'P1': fr(7, 120), 'P2': fr(20, 120), 'P3': fr(6, 120),
    'P4': fr(10, 120), 'P5': fr(7, 120), 'P6': fr(18, 120),
    'P7': fr(16, 120), 'P8': fr(19, 120), 'P9': fr(17, 120)
}
rp = 2

######## Everybody's code ######
# Follow the same pattern for everybody: a function whose optional argument is rp.
# Use global variable dl as data.

# OP
def org(rp=2):
    iter_args = list(it.product(dl, repeat=rp))
    iter_vals = list(map(mt.prod, it.product(dl.values(), repeat=rp)))
    # Desired output
    return {iter_args[i]: iter_vals[i] for i in range(len(iter_vals))}

# J. Earls' first method
def jearls(rp=2):
    products_entries = [
        tuple(zip(*product))
        for product in it.product(dl.items(), repeat=rp)
    ]
    products = {
        product[0]: mt.prod(product[1])
        for product in products_entries
    }
    return products

# J. Earls' simplified version. I skip it in the plots, since it is not faster
# (and they didn't claim it was faster, just simpler and lighter on memory;
# but I don't test memory)
def jearlsSimplified(rp=2):
    return {
        product[0]: mt.prod(product[1])
        for product in (
            tuple(zip(*product))
            for product in it.product(dl.items(), repeat=rp)
        )
    }

# My itertools-based version
def myit(rp=2):
    return {kk: mt.prod(dl[k] for k in kk) for kk in it.product(dl, repeat=rp)}

# My recursive version
# Recursive function
def rec(d, rp=2):
    if rp == 0: return d
    return rec({k1 + (k2,): d[k1] * dl[k2] for k1 in d for k2 in dl}, rp - 1)

# Wrapper (all the other functions just have this rp argument), if we assume rp could be 0
def wrecPure(rp=2):
    return rec({(): 1}, rp)

# Wrapper if rp is at least 1
def wrecRpSup0(rp=2):
    return rec({(k,): dl[k] for k in dl}, rp - 1)

# Stef's version
from itertools import product  # Stef has their own imports. They're right about that, I think
import math
def stef(rp=2):
    return {tuple(k for k, _ in p): mt.prod(v for _, v in p) for p in product(dl.items(), repeat=rp)}

# Robert's version
def robert(rp=2):
    return dict(zip(
        it.product(dl, repeat=rp),
        map(mt.prod, it.product(dl.values(), repeat=rp))
    ))

# List of methods to test, with a human-readable name
methods = [(org, "OP"), (jearls, "J. Earls'"), (stef, "Stef's"), (robert, "Robert's"),
           (myit, "My itertools-based"), (wrecRpSup0, "recursive")]

def showTimingTable():
    print("| Method | Timing |")
    print("| - | - |")
    for f, name in methods:
        print(f"| {name} | {timeit.timeit(f, number=1000)/1000*1e6:.0f} µs |")

# Started from hard-coded functions, hence the names op/robert, but it can compare any pair
def robVsOpMatch(op=org, robert=robert, nameOp="OP", nameRobert="Robert's"):
    orgsTime = np.array([timeit.timeit(op, number=1) for n in range(20000)]) * 1e6
    robsTime = np.array([timeit.timeit(robert, number=1) for n in range(20000)]) * 1e6
    fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True)
    rng = (min(orgsTime.min(), robsTime.min()),
           max(np.quantile(orgsTime, 0.99), np.quantile(robsTime, 0.99)))
    ax1.hist(orgsTime, bins=200, range=rng)
    ax2.hist(robsTime, bins=200, range=rng)
    ax1.set_title(nameOp)
    ax2.set_title(nameRobert)
    print(f"{nameRobert} wins {(robsTime < orgsTime).sum()/200} % of the times")
    plt.show()

def showTimingsVsRp():
    rps = list(range(2, 8))  # 8 or more is too slow on my machine
    nums = [1000, 100, 10, 1, 1, 1]  # Number of experiments to run for each timing.
    # For low rp, we can run a lot of them to reduce randomness. For higher rp, 1 experiment
    # is long enough, and randomness is reduced naturally by the length of the timing.
    times = [[] for _ in methods]
    with tqdm(total=2**(max(rps)-1)*len(methods)) as pbar:
        for rp, n in zip(rps, nums):
            for (f, name), tms in zip(methods, times):
                tms.append(timeit.timeit(lambda: f(rp), number=n)/n*1e6)
                pbar.update(2**(rp-2))  # I assume the cost is exponential starting from rp=2,
                # which it is not exactly, especially with the `nums` thing. But well, it is
                # just for the progress bar; as long as it moves from time to time to tell us
                # that the computer is still alive...
    times = np.array(times)  # I need them in np format to be able to divide them
    # 3 plots
    fig, (ax1, ax2, ax3) = plt.subplots(1, 3)
    for (f, name), tms in zip(methods, times):
        # y log scale
        ax1.plot(rps, tms, label=name)
        ax1.set_yscale('log')
        ax1.legend()
        # exact same, but linear scale
        ax2.plot(rps, tms, label=name)
        ax2.legend()
        # ratio with OP
        ax3.plot(rps, tms/times[0], label=name)
        ax3.legend()
    plt.show()

# Just a copy of the previous, but it's dl that changes
def showTimingsVsDlSize():
    global dl  # Because I need to alter dl (since the functions don't have a dl parameter)
    rp = 2
    lendl = list(range(9, 27, 3))
    # No need for the nums thing this time, it is just quadratic (with rp=2)
    times = [[] for _ in methods]
    with tqdm(total=len(methods)*len(lendl)) as pbar:
        for N in lendl:
            # A mock dict of N entries (same for everyone)
            dl = {f'P{i}': fr(np.random.randint(1, 119), 120) for i in range(1, N+1)}
            for (f, name), tms in zip(methods, times):
                tms.append(timeit.timeit(f, number=100)/100*1e6)
                pbar.update(1)
    times = np.array(times)  # I need them in np format to be able to divide them
    # 3 plots
    fig, (ax1, ax2, ax3) = plt.subplots(1, 3)
    for (f, name), tms in zip(methods, times):
        # y log scale
        ax1.plot(lendl, tms, label=name)
        ax1.set_yscale('log')
        ax1.legend()
        # exact same, but linear scale
        ax2.plot(lendl, tms, label=name)
        ax2.legend()
        # ratio with OP
        ax3.plot(lendl, tms/times[0], label=name)
        ax3.legend()
    plt.show()

#showTimingTable()
#robVsOpMatch()
#showTimingsVsRp()
showTimingsVsDlSize()
```