Data transformation involving dictionaries and lists

My initial problem is to transform a List of Mappings whose values are Lists of values. The transformed list must contain the Cartesian product of all lists (those that are values in the dictionaries) that belong to separate dictionaries (in other words, the values of the lists present in the same directories are "coupled").

Basically, if you disregard the keys of the dictionaries, this would be solved simply with itertools.product.

Input:

[

   {

      ('B3G', 'B1'): [1.0, 2.0], 

      ('B1G', 'B1'): [11.0, 12.0]

   }, 

   {

      ('B2G', 'B1'): [1.5, 2.5, 3.5]

   }

]

Output:

[

  {('B3G', 'B1'): 1.0, ('B1G', 'B1'): 11.0, ('B2G', 'B1'): 1.5},

  {('B3G', 'B1'): 1.0, ('B1G', 'B1'): 11.0, ('B2G', 'B1'): 2.5},

  {('B3G', 'B1'): 1.0, ('B1G', 'B1'): 11.0, ('B2G', 'B1'): 3.5},

  {('B3G', 'B1'): 2.0, ('B1G', 'B1'): 12.0, ('B2G', 'B1'): 1.5},

  {('B3G', 'B1'): 2.0, ('B1G', 'B1'): 12.0, ('B2G', 'B1'): 2.5},

  {('B3G', 'B1'): 2.0, ('B1G', 'B1'): 12.0, ('B2G', 'B1'): 3.5}

]

To make the matter more confusing, the keys of each dictionaries are Tuples of strings.

Here is a possible implementation, using a class to isolate the whole mess.

@dataclass

class ParametricMapping:

"""Abstraction for multi-dimensional parametric mappings."""



mappings: List[Mapping[Tuple[str], Sequence[float]]] = field(default_factory=lambda: [{}])



@property

def combinations(self) -> List[Mapping[Tuple[str], float]]:

    """Cartesian product adapted to work with dictionaries, roughly similar to `itertools.product`."""



    labels = [label for arg in self.mappings for label in tuple(arg.keys())]

    pools = [list(map(tuple, zip(*arg.values()))) for arg in self.mappings]



    def cartesian_product(*args):

        """Cartesian product similar to `itertools.product`"""

        result = []

        for pool in args:

            result = [x + [y] for x in result for y in pool]

        return result



    results = 

    for term in cartesian_product(*pools):

        results.append([pp for p in term for pp in p])



    tmp = 

    for r in results:

        tmp.append({k: v for k, v in zip(labels, r)})



    if len(tmp) == 0:

        return [{}]

    else:

        return tmp

Question: How can I improve this to make it cleaner (priority #1) and faster (#2)?

edited yesterday

Mathias Ettinger

23.9k33182

asked 2 days ago

Cedric H.

1515

Please do not update the code in your question to incorporate feedback from answers, doing so goes against the Question + Answer style of Code Review. This is not a forum where you should keep the most updated version in your question. Please see what you may and may not do after receiving answers.
– Mathias Ettinger
yesterday

add a comment |

Basically, if you disregard the keys of the dictionaries, this would be solved simply with itertools.product.

Input:

[

   {

      ('B3G', 'B1'): [1.0, 2.0], 

      ('B1G', 'B1'): [11.0, 12.0]

   }, 

   {

      ('B2G', 'B1'): [1.5, 2.5, 3.5]

   }

]

Output:

[

  {('B3G', 'B1'): 1.0, ('B1G', 'B1'): 11.0, ('B2G', 'B1'): 1.5},

  {('B3G', 'B1'): 1.0, ('B1G', 'B1'): 11.0, ('B2G', 'B1'): 2.5},

  {('B3G', 'B1'): 1.0, ('B1G', 'B1'): 11.0, ('B2G', 'B1'): 3.5},

  {('B3G', 'B1'): 2.0, ('B1G', 'B1'): 12.0, ('B2G', 'B1'): 1.5},

  {('B3G', 'B1'): 2.0, ('B1G', 'B1'): 12.0, ('B2G', 'B1'): 2.5},

  {('B3G', 'B1'): 2.0, ('B1G', 'B1'): 12.0, ('B2G', 'B1'): 3.5}

]

To make the matter more confusing, the keys of each dictionaries are Tuples of strings.

Here is a possible implementation, using a class to isolate the whole mess.

@dataclass

class ParametricMapping:

"""Abstraction for multi-dimensional parametric mappings."""



mappings: List[Mapping[Tuple[str], Sequence[float]]] = field(default_factory=lambda: [{}])



@property

def combinations(self) -> List[Mapping[Tuple[str], float]]:

    """Cartesian product adapted to work with dictionaries, roughly similar to `itertools.product`."""



    labels = [label for arg in self.mappings for label in tuple(arg.keys())]

    pools = [list(map(tuple, zip(*arg.values()))) for arg in self.mappings]



    def cartesian_product(*args):

        """Cartesian product similar to `itertools.product`"""

        result = []

        for pool in args:

            result = [x + [y] for x in result for y in pool]

        return result



    results = 

    for term in cartesian_product(*pools):

        results.append([pp for p in term for pp in p])



    tmp = 

    for r in results:

        tmp.append({k: v for k, v in zip(labels, r)})



    if len(tmp) == 0:

        return [{}]

    else:

        return tmp

Question: How can I improve this to make it cleaner (priority #1) and faster (#2)?

edited yesterday

Mathias Ettinger

23.9k33182

asked 2 days ago

Cedric H.

1515

Please do not update the code in your question to incorporate feedback from answers, doing so goes against the Question + Answer style of Code Review. This is not a forum where you should keep the most updated version in your question. Please see what you may and may not do after receiving answers.
– Mathias Ettinger
yesterday

add a comment |

Basically, if you disregard the keys of the dictionaries, this would be solved simply with itertools.product.

Input:

[

   {

      ('B3G', 'B1'): [1.0, 2.0], 

      ('B1G', 'B1'): [11.0, 12.0]

   }, 

   {

      ('B2G', 'B1'): [1.5, 2.5, 3.5]

   }

]

Output:

[

  {('B3G', 'B1'): 1.0, ('B1G', 'B1'): 11.0, ('B2G', 'B1'): 1.5},

  {('B3G', 'B1'): 1.0, ('B1G', 'B1'): 11.0, ('B2G', 'B1'): 2.5},

  {('B3G', 'B1'): 1.0, ('B1G', 'B1'): 11.0, ('B2G', 'B1'): 3.5},

  {('B3G', 'B1'): 2.0, ('B1G', 'B1'): 12.0, ('B2G', 'B1'): 1.5},

  {('B3G', 'B1'): 2.0, ('B1G', 'B1'): 12.0, ('B2G', 'B1'): 2.5},

  {('B3G', 'B1'): 2.0, ('B1G', 'B1'): 12.0, ('B2G', 'B1'): 3.5}

]

To make the matter more confusing, the keys of each dictionaries are Tuples of strings.

Here is a possible implementation, using a class to isolate the whole mess.

@dataclass

class ParametricMapping:

"""Abstraction for multi-dimensional parametric mappings."""



mappings: List[Mapping[Tuple[str], Sequence[float]]] = field(default_factory=lambda: [{}])



@property

def combinations(self) -> List[Mapping[Tuple[str], float]]:

    """Cartesian product adapted to work with dictionaries, roughly similar to `itertools.product`."""



    labels = [label for arg in self.mappings for label in tuple(arg.keys())]

    pools = [list(map(tuple, zip(*arg.values()))) for arg in self.mappings]



    def cartesian_product(*args):

        """Cartesian product similar to `itertools.product`"""

        result = []

        for pool in args:

            result = [x + [y] for x in result for y in pool]

        return result



    results = 

    for term in cartesian_product(*pools):

        results.append([pp for p in term for pp in p])



    tmp = 

    for r in results:

        tmp.append({k: v for k, v in zip(labels, r)})



    if len(tmp) == 0:

        return [{}]

    else:

        return tmp

Question: How can I improve this to make it cleaner (priority #1) and faster (#2)?

edited yesterday

Mathias Ettinger

23.9k33182

asked 2 days ago

Cedric H.

1515

Basically, if you disregard the keys of the dictionaries, this would be solved simply with itertools.product.

Input:

[

   {

      ('B3G', 'B1'): [1.0, 2.0], 

      ('B1G', 'B1'): [11.0, 12.0]

   }, 

   {

      ('B2G', 'B1'): [1.5, 2.5, 3.5]

   }

]

Output:

[

  {('B3G', 'B1'): 1.0, ('B1G', 'B1'): 11.0, ('B2G', 'B1'): 1.5},

  {('B3G', 'B1'): 1.0, ('B1G', 'B1'): 11.0, ('B2G', 'B1'): 2.5},

  {('B3G', 'B1'): 1.0, ('B1G', 'B1'): 11.0, ('B2G', 'B1'): 3.5},

  {('B3G', 'B1'): 2.0, ('B1G', 'B1'): 12.0, ('B2G', 'B1'): 1.5},

  {('B3G', 'B1'): 2.0, ('B1G', 'B1'): 12.0, ('B2G', 'B1'): 2.5},

  {('B3G', 'B1'): 2.0, ('B1G', 'B1'): 12.0, ('B2G', 'B1'): 3.5}

]

To make the matter more confusing, the keys of each dictionaries are Tuples of strings.

Here is a possible implementation, using a class to isolate the whole mess.

@dataclass

class ParametricMapping:

"""Abstraction for multi-dimensional parametric mappings."""



mappings: List[Mapping[Tuple[str], Sequence[float]]] = field(default_factory=lambda: [{}])



@property

def combinations(self) -> List[Mapping[Tuple[str], float]]:

    """Cartesian product adapted to work with dictionaries, roughly similar to `itertools.product`."""



    labels = [label for arg in self.mappings for label in tuple(arg.keys())]

    pools = [list(map(tuple, zip(*arg.values()))) for arg in self.mappings]



    def cartesian_product(*args):

        """Cartesian product similar to `itertools.product`"""

        result = []

        for pool in args:

            result = [x + [y] for x in result for y in pool]

        return result



    results = 

    for term in cartesian_product(*pools):

        results.append([pp for p in term for pp in p])



    tmp = 

    for r in results:

        tmp.append({k: v for k, v in zip(labels, r)})



    if len(tmp) == 0:

        return [{}]

    else:

        return tmp

Question: How can I improve this to make it cleaner (priority #1) and faster (#2)?

python python-3.x

edited yesterday

Mathias Ettinger

23.9k33182

asked 2 days ago

Cedric H.

1515

edited yesterday

Mathias Ettinger

23.9k33182

asked 2 days ago

Cedric H.

1515

edited yesterday

Mathias Ettinger

23.9k33182

edited yesterday

Mathias Ettinger

23.9k33182

edited yesterday

Mathias Ettinger

23.9k33182

asked 2 days ago

Cedric H.

1515

asked 2 days ago

Cedric H.

1515

asked 2 days ago

Cedric H.

1515

Please do not update the code in your question to incorporate feedback from answers, doing so goes against the Question + Answer style of Code Review. This is not a forum where you should keep the most updated version in your question. Please see what you may and may not do after receiving answers.
– Mathias Ettinger
yesterday

add a comment |

Please do not update the code in your question to incorporate feedback from answers, doing so goes against the Question + Answer style of Code Review. This is not a forum where you should keep the most updated version in your question. Please see what you may and may not do after receiving answers.
– Mathias Ettinger
yesterday

Please do not update the code in your question to incorporate feedback from answers, doing so goes against the Question + Answer style of Code Review. This is not a forum where you should keep the most updated version in your question. Please see what you may and may not do after receiving answers.
– Mathias Ettinger
yesterday

add a comment |

1 Answer
1

active

oldest

votes

TL;DR: Scroll to the end for the provided code with suggested improvements

Suggestions

1. Standalone `cartesian_product` helper function outside scope of `combinations`

A preferable solution would be to simply use itertools.product as most Python-savvy readers will be familiar with it - and it is well-documented for those that aren’t. However, since you explicitly mention it...

If you still don't want to use itertools.product:

While it does improve clarity to construct a hierarchy of scopes that expose things only when necessary, in this case, I believe that making cartesian_product a nested function inside combinations serves only to confuse the purpose of combinations. It is better to define cartesian_product above, making the code below cleaner and easier to understand. Now someone reading your code will first see cartesian_product’s definition and understand its relatively simple purpose. Then, inside ParametricMapping.combinations, readers are already familiar with cartesian_product, and their train of thought isn’t derailed by trying to understand a nested function.

2. `flatten` helper function

This helper function should be used thusly:

for term in cartesian_product(*pools):

    results.append(flatten(term))

This may seem silly as flattening is a simple operation, but in this example there are a few fairly tricky list/dict comprehensions nearby. Therefore, it may help to replace that piece with a simple flatten call to clean up some confusion and to emphasize that this is a straight-forward operation - A fact that could be lost on some readers of the code in its current state. My point here is that stacking lots of loops and comprehensions on top of each other (especially without documentation/comments) can quickly get messy and confusing.

3. Consolidate some code

If you took both of the above suggestions, a nice opportunity to consolidate some code has presented itself. After the above changes, the section that was originally

results = 

for term in cartesian_product(*pools):

    results.append([pp for p in term for pp in p])

should now be

results = 

for term in itertools.product(*pools):

    results.append(flatten(term))

This block can be expressed clearly and concisely with the following list comprehension:

results = [flatten(term) for term in itertools.product(*pools)]

Note that because of the separation of functionality into helper functions, the purpose of this list comprehension is abundantly clear. It creates results, which is a list containing the flattened outputs of product.

4. Check for empty `mappings` at top of `combinations`

Instead of the if/else at the end of combinations that checks for len(tmp) == 0, make the first two lines of combinations the following:

if self.mappings == [{}]:

    return [{}]

The else clause at the bottom of combinations is no longer necessary, and you can simply return tmp. This is cleaner because it handles the case where mappings is empty immediately, which means those cases bypass all the code that was being executed before when len(tmp) was evaluated at the end of combinations. This also allows readers to make the assumption that mappings is non-empty when they get into the actual work being done by combinations, which is just one less thing to worry about. Alternatively, a more concise option would be to replace

if len(tmp) == 0:

    return [{}]

else:

    return tmp

with

return tmp or [{}]

This works because if mappings is empty, the value of tmp will be an empty list, which will evaluate to False, returning the value following the or. This more concise version may come at the cost of decreased readability.

5. Define `labels` and `pools` using helper methods or non-init fields

Remove the first two code lines from combinations, and move them to either helper methods, or additional fields that are processed post-initialization. I recommend doing this to further clarify the work actually being done by combinations and to clean up the code overall. These both have the added benefit of exposing labels and pools as additional attributes of the dataclass that can be accessed outside of combinations.

See the ParametricMapping1 class defined in the section below for an implementation using helper methods, and see the alternative ParametricMapping2 class defined below it for an implementation using non-init fields.

TL;DR

In the end, if all suggestions are followed, the code should include the below declaration of flatten, along with one of the two following blocks (ParametricMapping1, or ParametricMapping2):

def flatten(l):

    return [item for sublist in l for item in sublist]

With either ...

@dataclass

class ParametricMapping1:

    mappings: List[Mapping[Tuple[str], Sequence[float]]] = field(default_factory=lambda: [{}])



    def _labels(self) -> List[Tuple[str]]:

        return flatten(self.mappings)



    def _pools(self) -> List[List[Sequence[float]]]:

        return [list(map(tuple, zip(*arg.values()))) for arg in self.mappings]



    @property

    def combinations(self) -> List[Mapping[Tuple[str], float]]:

        if self.mappings == [{}]:

            return [{}]



        pool_values = [flatten(term) for term in itertools.product(*self._pools())]

        return [dict(zip(self._labels(), v)) for v in pool_values]

Or ...

@dataclass

class ParametricMapping2:

    mappings: List[Mapping[Tuple[str], Sequence[float]]] = field(default_factory=lambda: [{}])

    labels: List[Tuple[str]] = field(init=False, repr=False)

    pools: List[List[Sequence[float]]] = field(init=False, repr=False)



    def __post_init__(self):

        self.labels = flatten(self.mappings)

        self.pools = [list(map(tuple, zip(*arg.values()))) for arg in self.mappings]



    @property

    def combinations(self) -> List[Mapping[Tuple[str], float]]:

        pool_values = [flatten(term) for term in itertools.product(*self.pools)]

        return [dict(zip(self.labels, v)) for v in pool_values] or [{}]

Edit (2019-01-09 - 1530):

The definitions of _labels and self.labels in the above two code blocks, respectively, have been simplified per @MathiasEttinger's excellent suggestion. See revision history for their original definitions.

edited yesterday

answered 2 days ago

Hunter McGushion

1065

New contributor

1

Nice answer. Two thoughts though: isn't labels simply flatten(mappings)? And isn't zip already returning tuples? Why keep the map call?
– Mathias Ettinger
yesterday

Also I like to define flatten in terms of itertools.chain.from_iterable. But it's mostly a matter of personal taste.
– Mathias Ettinger
yesterday

Thanks a lot for the answer. I will investigate in details. @MathiasEttinger I didn't get your second point.
– Cedric H.
yesterday

2

@CedricH. I usually define flatten = itertools.chain.from_iterable as I prefer to work with iterables, but as an alternative to the flatten proposed by this answer, you could have def flatten(iterable): return list(itertools.chain.from_iterable(iterable)) instead. As said, there might not be much of a difference appart for style.
– Mathias Ettinger
yesterday

@MathiasEttinger Got it. I also changed the definition of labels based on your first comment.
– Cedric H.
yesterday

|
show 3 more comments

Your Answer

StackExchange.ifUsing("editor", function () {
return StackExchange.using("mathjaxEditing", function () {
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
});
});
}, "mathjax-editing");

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "196"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f211121%2fdata-transformation-involving-dictionaries-and-lists%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

TL;DR: Scroll to the end for the provided code with suggested improvements

Suggestions

1. Standalone `cartesian_product` helper function outside scope of `combinations`

If you still don't want to use itertools.product:

2. `flatten` helper function

This helper function should be used thusly:

for term in cartesian_product(*pools):

    results.append(flatten(term))

3. Consolidate some code

If you took both of the above suggestions, a nice opportunity to consolidate some code has presented itself. After the above changes, the section that was originally

results = 

for term in cartesian_product(*pools):

    results.append([pp for p in term for pp in p])

should now be

results = 

for term in itertools.product(*pools):

    results.append(flatten(term))

This block can be expressed clearly and concisely with the following list comprehension:

results = [flatten(term) for term in itertools.product(*pools)]

4. Check for empty `mappings` at top of `combinations`

Instead of the if/else at the end of combinations that checks for len(tmp) == 0, make the first two lines of combinations the following:

if self.mappings == [{}]:

    return [{}]

if len(tmp) == 0:

    return [{}]

else:

    return tmp

with

return tmp or [{}]

5. Define `labels` and `pools` using helper methods or non-init fields

TL;DR

In the end, if all suggestions are followed, the code should include the below declaration of flatten, along with one of the two following blocks (ParametricMapping1, or ParametricMapping2):

def flatten(l):

    return [item for sublist in l for item in sublist]

With either ...

@dataclass

class ParametricMapping1:

    mappings: List[Mapping[Tuple[str], Sequence[float]]] = field(default_factory=lambda: [{}])



    def _labels(self) -> List[Tuple[str]]:

        return flatten(self.mappings)



    def _pools(self) -> List[List[Sequence[float]]]:

        return [list(map(tuple, zip(*arg.values()))) for arg in self.mappings]



    @property

    def combinations(self) -> List[Mapping[Tuple[str], float]]:

        if self.mappings == [{}]:

            return [{}]



        pool_values = [flatten(term) for term in itertools.product(*self._pools())]

        return [dict(zip(self._labels(), v)) for v in pool_values]

Or ...

@dataclass

class ParametricMapping2:

    mappings: List[Mapping[Tuple[str], Sequence[float]]] = field(default_factory=lambda: [{}])

    labels: List[Tuple[str]] = field(init=False, repr=False)

    pools: List[List[Sequence[float]]] = field(init=False, repr=False)



    def __post_init__(self):

        self.labels = flatten(self.mappings)

        self.pools = [list(map(tuple, zip(*arg.values()))) for arg in self.mappings]



    @property

    def combinations(self) -> List[Mapping[Tuple[str], float]]:

        pool_values = [flatten(term) for term in itertools.product(*self.pools)]

        return [dict(zip(self.labels, v)) for v in pool_values] or [{}]

Edit (2019-01-09 - 1530):

The definitions of _labels and self.labels in the above two code blocks, respectively, have been simplified per @MathiasEttinger's excellent suggestion. See revision history for their original definitions.

edited yesterday

answered 2 days ago

Hunter McGushion

1065

New contributor

1

Nice answer. Two thoughts though: isn't labels simply flatten(mappings)? And isn't zip already returning tuples? Why keep the map call?
– Mathias Ettinger
yesterday

Also I like to define flatten in terms of itertools.chain.from_iterable. But it's mostly a matter of personal taste.
– Mathias Ettinger
yesterday

Thanks a lot for the answer. I will investigate in details. @MathiasEttinger I didn't get your second point.
– Cedric H.
yesterday

2

@CedricH. I usually define flatten = itertools.chain.from_iterable as I prefer to work with iterables, but as an alternative to the flatten proposed by this answer, you could have def flatten(iterable): return list(itertools.chain.from_iterable(iterable)) instead. As said, there might not be much of a difference appart for style.
– Mathias Ettinger
yesterday

@MathiasEttinger Got it. I also changed the definition of labels based on your first comment.
– Cedric H.
yesterday

|
show 3 more comments

TL;DR: Scroll to the end for the provided code with suggested improvements

Suggestions

1. Standalone `cartesian_product` helper function outside scope of `combinations`

If you still don't want to use itertools.product:

2. `flatten` helper function

This helper function should be used thusly:

for term in cartesian_product(*pools):

    results.append(flatten(term))

3. Consolidate some code

If you took both of the above suggestions, a nice opportunity to consolidate some code has presented itself. After the above changes, the section that was originally

results = 

for term in cartesian_product(*pools):

    results.append([pp for p in term for pp in p])

should now be

results = 

for term in itertools.product(*pools):

    results.append(flatten(term))

This block can be expressed clearly and concisely with the following list comprehension:

results = [flatten(term) for term in itertools.product(*pools)]

4. Check for empty `mappings` at top of `combinations`

Instead of the if/else at the end of combinations that checks for len(tmp) == 0, make the first two lines of combinations the following:

if self.mappings == [{}]:

    return [{}]

if len(tmp) == 0:

    return [{}]

else:

    return tmp

with

return tmp or [{}]

5. Define `labels` and `pools` using helper methods or non-init fields

TL;DR

In the end, if all suggestions are followed, the code should include the below declaration of flatten, along with one of the two following blocks (ParametricMapping1, or ParametricMapping2):

def flatten(l):

    return [item for sublist in l for item in sublist]

With either ...

@dataclass

class ParametricMapping1:

    mappings: List[Mapping[Tuple[str], Sequence[float]]] = field(default_factory=lambda: [{}])



    def _labels(self) -> List[Tuple[str]]:

        return flatten(self.mappings)



    def _pools(self) -> List[List[Sequence[float]]]:

        return [list(map(tuple, zip(*arg.values()))) for arg in self.mappings]



    @property

    def combinations(self) -> List[Mapping[Tuple[str], float]]:

        if self.mappings == [{}]:

            return [{}]



        pool_values = [flatten(term) for term in itertools.product(*self._pools())]

        return [dict(zip(self._labels(), v)) for v in pool_values]

Or ...

@dataclass

class ParametricMapping2:

    mappings: List[Mapping[Tuple[str], Sequence[float]]] = field(default_factory=lambda: [{}])

    labels: List[Tuple[str]] = field(init=False, repr=False)

    pools: List[List[Sequence[float]]] = field(init=False, repr=False)



    def __post_init__(self):

        self.labels = flatten(self.mappings)

        self.pools = [list(map(tuple, zip(*arg.values()))) for arg in self.mappings]



    @property

    def combinations(self) -> List[Mapping[Tuple[str], float]]:

        pool_values = [flatten(term) for term in itertools.product(*self.pools)]

        return [dict(zip(self.labels, v)) for v in pool_values] or [{}]

Edit (2019-01-09 - 1530):

The definitions of _labels and self.labels in the above two code blocks, respectively, have been simplified per @MathiasEttinger's excellent suggestion. See revision history for their original definitions.

edited yesterday

answered 2 days ago

Hunter McGushion

1065

New contributor

1

Nice answer. Two thoughts though: isn't labels simply flatten(mappings)? And isn't zip already returning tuples? Why keep the map call?
– Mathias Ettinger
yesterday

Also I like to define flatten in terms of itertools.chain.from_iterable. But it's mostly a matter of personal taste.
– Mathias Ettinger
yesterday

Thanks a lot for the answer. I will investigate in details. @MathiasEttinger I didn't get your second point.
– Cedric H.
yesterday

2

@CedricH. I usually define flatten = itertools.chain.from_iterable as I prefer to work with iterables, but as an alternative to the flatten proposed by this answer, you could have def flatten(iterable): return list(itertools.chain.from_iterable(iterable)) instead. As said, there might not be much of a difference appart for style.
– Mathias Ettinger
yesterday

@MathiasEttinger Got it. I also changed the definition of labels based on your first comment.
– Cedric H.
yesterday

|
show 3 more comments

TL;DR: Scroll to the end for the provided code with suggested improvements

Suggestions

1. Standalone `cartesian_product` helper function outside scope of `combinations`

If you still don't want to use itertools.product:

2. `flatten` helper function

This helper function should be used thusly:

for term in cartesian_product(*pools):

    results.append(flatten(term))

3. Consolidate some code

If you took both of the above suggestions, a nice opportunity to consolidate some code has presented itself. After the above changes, the section that was originally

results = 

for term in cartesian_product(*pools):

    results.append([pp for p in term for pp in p])

should now be

results = 

for term in itertools.product(*pools):

    results.append(flatten(term))

This block can be expressed clearly and concisely with the following list comprehension:

results = [flatten(term) for term in itertools.product(*pools)]

4. Check for empty `mappings` at top of `combinations`

Instead of the if/else at the end of combinations that checks for len(tmp) == 0, make the first two lines of combinations the following:

if self.mappings == [{}]:

    return [{}]

if len(tmp) == 0:

    return [{}]

else:

    return tmp

with

return tmp or [{}]

5. Define `labels` and `pools` using helper methods or non-init fields

TL;DR

In the end, if all suggestions are followed, the code should include the below declaration of flatten, along with one of the two following blocks (ParametricMapping1, or ParametricMapping2):

def flatten(l):

    return [item for sublist in l for item in sublist]

With either ...

@dataclass

class ParametricMapping1:

    mappings: List[Mapping[Tuple[str], Sequence[float]]] = field(default_factory=lambda: [{}])



    def _labels(self) -> List[Tuple[str]]:

        return flatten(self.mappings)



    def _pools(self) -> List[List[Sequence[float]]]:

        return [list(map(tuple, zip(*arg.values()))) for arg in self.mappings]



    @property

    def combinations(self) -> List[Mapping[Tuple[str], float]]:

        if self.mappings == [{}]:

            return [{}]



        pool_values = [flatten(term) for term in itertools.product(*self._pools())]

        return [dict(zip(self._labels(), v)) for v in pool_values]

Or ...

@dataclass

class ParametricMapping2:

    mappings: List[Mapping[Tuple[str], Sequence[float]]] = field(default_factory=lambda: [{}])

    labels: List[Tuple[str]] = field(init=False, repr=False)

    pools: List[List[Sequence[float]]] = field(init=False, repr=False)



    def __post_init__(self):

        self.labels = flatten(self.mappings)

        self.pools = [list(map(tuple, zip(*arg.values()))) for arg in self.mappings]



    @property

    def combinations(self) -> List[Mapping[Tuple[str], float]]:

        pool_values = [flatten(term) for term in itertools.product(*self.pools)]

        return [dict(zip(self.labels, v)) for v in pool_values] or [{}]

Edit (2019-01-09 - 1530):

The definitions of _labels and self.labels in the above two code blocks, respectively, have been simplified per @MathiasEttinger's excellent suggestion. See revision history for their original definitions.

edited yesterday

answered 2 days ago

Hunter McGushion

1065

New contributor

TL;DR: Scroll to the end for the provided code with suggested improvements

Suggestions

1. Standalone `cartesian_product` helper function outside scope of `combinations`

If you still don't want to use itertools.product:

2. `flatten` helper function

This helper function should be used thusly:

for term in cartesian_product(*pools):

    results.append(flatten(term))

3. Consolidate some code

If you took both of the above suggestions, a nice opportunity to consolidate some code has presented itself. After the above changes, the section that was originally

results = 

for term in cartesian_product(*pools):

    results.append([pp for p in term for pp in p])

should now be

results = 

for term in itertools.product(*pools):

    results.append(flatten(term))

This block can be expressed clearly and concisely with the following list comprehension:

results = [flatten(term) for term in itertools.product(*pools)]

4. Check for empty `mappings` at top of `combinations`

Instead of the if/else at the end of combinations that checks for len(tmp) == 0, make the first two lines of combinations the following:

if self.mappings == [{}]:

    return [{}]

if len(tmp) == 0:

    return [{}]

else:

    return tmp

with

return tmp or [{}]

5. Define `labels` and `pools` using helper methods or non-init fields

TL;DR

In the end, if all suggestions are followed, the code should include the below declaration of flatten, along with one of the two following blocks (ParametricMapping1, or ParametricMapping2):

def flatten(l):

    return [item for sublist in l for item in sublist]

With either ...

@dataclass

class ParametricMapping1:

    mappings: List[Mapping[Tuple[str], Sequence[float]]] = field(default_factory=lambda: [{}])



    def _labels(self) -> List[Tuple[str]]:

        return flatten(self.mappings)



    def _pools(self) -> List[List[Sequence[float]]]:

        return [list(map(tuple, zip(*arg.values()))) for arg in self.mappings]



    @property

    def combinations(self) -> List[Mapping[Tuple[str], float]]:

        if self.mappings == [{}]:

            return [{}]



        pool_values = [flatten(term) for term in itertools.product(*self._pools())]

        return [dict(zip(self._labels(), v)) for v in pool_values]

Or ...

@dataclass

class ParametricMapping2:

    mappings: List[Mapping[Tuple[str], Sequence[float]]] = field(default_factory=lambda: [{}])

    labels: List[Tuple[str]] = field(init=False, repr=False)

    pools: List[List[Sequence[float]]] = field(init=False, repr=False)



    def __post_init__(self):

        self.labels = flatten(self.mappings)

        self.pools = [list(map(tuple, zip(*arg.values()))) for arg in self.mappings]



    @property

    def combinations(self) -> List[Mapping[Tuple[str], float]]:

        pool_values = [flatten(term) for term in itertools.product(*self.pools)]

        return [dict(zip(self.labels, v)) for v in pool_values] or [{}]

Edit (2019-01-09 - 1530):

The definitions of _labels and self.labels in the above two code blocks, respectively, have been simplified per @MathiasEttinger's excellent suggestion. See revision history for their original definitions.

edited yesterday

answered 2 days ago

Hunter McGushion

1065

New contributor

edited yesterday

answered 2 days ago

Hunter McGushion

1065

New contributor

answered 2 days ago

Hunter McGushion

1065

answered 2 days ago

Hunter McGushion

1065

New contributor

Hunter McGushion is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.

1

Nice answer. Two thoughts though: isn't labels simply flatten(mappings)? And isn't zip already returning tuples? Why keep the map call?
– Mathias Ettinger
yesterday

Also I like to define flatten in terms of itertools.chain.from_iterable. But it's mostly a matter of personal taste.
– Mathias Ettinger
yesterday

Thanks a lot for the answer. I will investigate in details. @MathiasEttinger I didn't get your second point.
– Cedric H.
yesterday

2

@CedricH. I usually define flatten = itertools.chain.from_iterable as I prefer to work with iterables, but as an alternative to the flatten proposed by this answer, you could have def flatten(iterable): return list(itertools.chain.from_iterable(iterable)) instead. As said, there might not be much of a difference appart for style.
– Mathias Ettinger
yesterday

@MathiasEttinger Got it. I also changed the definition of labels based on your first comment.
– Cedric H.
yesterday

|
show 3 more comments

1

Nice answer. Two thoughts though: isn't labels simply flatten(mappings)? And isn't zip already returning tuples? Why keep the map call?
– Mathias Ettinger
yesterday

Also I like to define flatten in terms of itertools.chain.from_iterable. But it's mostly a matter of personal taste.
– Mathias Ettinger
yesterday

Thanks a lot for the answer. I will investigate in details. @MathiasEttinger I didn't get your second point.
– Cedric H.
yesterday

2

@CedricH. I usually define flatten = itertools.chain.from_iterable as I prefer to work with iterables, but as an alternative to the flatten proposed by this answer, you could have def flatten(iterable): return list(itertools.chain.from_iterable(iterable)) instead. As said, there might not be much of a difference appart for style.
– Mathias Ettinger
yesterday

@MathiasEttinger Got it. I also changed the definition of labels based on your first comment.
– Cedric H.
yesterday

Nice answer. Two thoughts though: isn't labels simply flatten(mappings)? And isn't zip already returning tuples? Why keep the map call?
– Mathias Ettinger
yesterday

Also I like to define flatten in terms of itertools.chain.from_iterable. But it's mostly a matter of personal taste.
– Mathias Ettinger
yesterday

Thanks a lot for the answer. I will investigate in details. @MathiasEttinger I didn't get your second point.
– Cedric H.
yesterday

@CedricH. I usually define flatten = itertools.chain.from_iterable as I prefer to work with iterables, but as an alternative to the flatten proposed by this answer, you could have def flatten(iterable): return list(itertools.chain.from_iterable(iterable)) instead. As said, there might not be much of a difference appart for style.
– Mathias Ettinger
yesterday

@MathiasEttinger Got it. I also changed the definition of labels based on your first comment.
– Cedric H.
yesterday

|
show 3 more comments

draft saved

draft discarded

Thanks for contributing an answer to Code Review Stack Exchange!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

Use MathJax to format equations. MathJax reference.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

Data transformation involving dictionaries and lists

1 Answer 1

Suggestions

1. Standalone cartesian_product helper function outside scope of combinations

2. flatten helper function

3. Consolidate some code

4. Check for empty mappings at top of combinations

5. Define labels and pools using helper methods or non-init fields

TL;DR

Your Answer

Sign up or log in

Post as a guest

Post as a guest

1 Answer 1

1 Answer 1

Suggestions

1. Standalone cartesian_product helper function outside scope of combinations

2. flatten helper function

3. Consolidate some code

4. Check for empty mappings at top of combinations

5. Define labels and pools using helper methods or non-init fields

TL;DR

Suggestions

1. Standalone cartesian_product helper function outside scope of combinations

2. flatten helper function

3. Consolidate some code

4. Check for empty mappings at top of combinations

5. Define labels and pools using helper methods or non-init fields

TL;DR

Suggestions

1. Standalone cartesian_product helper function outside scope of combinations

2. flatten helper function

3. Consolidate some code

4. Check for empty mappings at top of combinations

5. Define labels and pools using helper methods or non-init fields

TL;DR

Suggestions

1. Standalone cartesian_product helper function outside scope of combinations

2. flatten helper function

3. Consolidate some code

4. Check for empty mappings at top of combinations

5. Define labels and pools using helper methods or non-init fields

TL;DR

Sign up or log in

Post as a guest

Post as a guest

Sign up or log in

Post as a guest

Sign up or log in

Post as a guest

Sign up or log in

Post as a guest

Popular posts from this blog

Список кардиналов, возведённых папой римским Каликстом III

Deduzione

Mysql.sock missing - “Can't connect to local MySQL server through socket”

1 Answer
1

1. Standalone `cartesian_product` helper function outside scope of `combinations`

2. `flatten` helper function

4. Check for empty `mappings` at top of `combinations`

5. Define `labels` and `pools` using helper methods or non-init fields

1 Answer
1

1 Answer
1

1. Standalone `cartesian_product` helper function outside scope of `combinations`

2. `flatten` helper function

4. Check for empty `mappings` at top of `combinations`

5. Define `labels` and `pools` using helper methods or non-init fields

1. Standalone `cartesian_product` helper function outside scope of `combinations`

2. `flatten` helper function

4. Check for empty `mappings` at top of `combinations`

5. Define `labels` and `pools` using helper methods or non-init fields

1. Standalone `cartesian_product` helper function outside scope of `combinations`

2. `flatten` helper function

4. Check for empty `mappings` at top of `combinations`

5. Define `labels` and `pools` using helper methods or non-init fields

1. Standalone `cartesian_product` helper function outside scope of `combinations`

2. `flatten` helper function

4. Check for empty `mappings` at top of `combinations`

5. Define `labels` and `pools` using helper methods or non-init fields