Merging (not removing) duplicated elements in a vector












1












$begingroup$


There's a struct named Player containing name and stat:



struct Player {
name: String,
stat: Stat,
}

struct Stat {
points: u32,
fouls: u32,
}


And I have a Vec<Player>. What is interesting is that there can be multiple players with same name and different stats like the following:



let players = vec![
Player {
name: String::from("player1"),
stat: Stat {
points: 10,
fouls: 1,
},
},
Player {
name: String::from("player2"),
stat: Stat {
points: 30,
fouls: 3,
},
},
Player {
name: String::from("player1"),
stat: Stat {
points: 5,
fouls: 1,
},
},
];


The first and the third elements have same name, but with different stat values. What I want to do is merging those elements into one by adding all stat values. After the merging process, the output vector will have 2 elements like the following:



[
Player {
name: String::from("player2"),
stat: Stat {
points: 30,
fouls: 3,
},
},
Player {
name: String::from("player1"),
stat: Stat {
points: 15,
fouls: 2,
},
},
]


(The order of output vector is not important in this case.)



And the code below is my solution:



let dup_merged: Vec<_> = players
.into_iter()
.fold(
HashMap::new(),
|mut acc: HashMap<String, Player>, curr: Player| {
match acc.get(&curr.name) {
Some(prev) => {
// Is it better to modify prev.stat
// instead of inserting new Player instance?
// HashMap<String, &Player> required to modification?
acc.insert(
curr.name.clone(),
Player {
name: curr.name,
stat: Stat {
points: prev.stat.points + curr.stat.points,
fouls: prev.stat.fouls + curr.stat.fouls,
},
},
);
}
None => {
acc.insert(curr.name.clone(), curr);
}
}
acc
},
)
.into_iter() // Hope there's better way to get a vector of a hash map
.map(|(_, v)| v)
.collect();


Playground link



I used HashMap (key: player.name: String, value: player: Player) to accumulate stat values for players with same name, then converted the HashMap to Vec<Player> again. I think my solution can be simpler, more idiomatic and refined better.










share|improve this question











$endgroup$

















    1












    $begingroup$


    There's a struct named Player containing name and stat:



    struct Player {
    name: String,
    stat: Stat,
    }

    struct Stat {
    points: u32,
    fouls: u32,
    }


    And I have a Vec<Player>. What is interesting is that there can be multiple players with same name and different stats like the following:



    let players = vec![
    Player {
    name: String::from("player1"),
    stat: Stat {
    points: 10,
    fouls: 1,
    },
    },
    Player {
    name: String::from("player2"),
    stat: Stat {
    points: 30,
    fouls: 3,
    },
    },
    Player {
    name: String::from("player1"),
    stat: Stat {
    points: 5,
    fouls: 1,
    },
    },
    ];


    The first and the third elements have same name, but with different stat values. What I want to do is merging those elements into one by adding all stat values. After the merging process, the output vector will have 2 elements like the following:



    [
    Player {
    name: String::from("player2"),
    stat: Stat {
    points: 30,
    fouls: 3,
    },
    },
    Player {
    name: String::from("player1"),
    stat: Stat {
    points: 15,
    fouls: 2,
    },
    },
    ]


    (The order of output vector is not important in this case.)



    And the code below is my solution:



    let dup_merged: Vec<_> = players
    .into_iter()
    .fold(
    HashMap::new(),
    |mut acc: HashMap<String, Player>, curr: Player| {
    match acc.get(&curr.name) {
    Some(prev) => {
    // Is it better to modify prev.stat
    // instead of inserting new Player instance?
    // HashMap<String, &Player> required to modification?
    acc.insert(
    curr.name.clone(),
    Player {
    name: curr.name,
    stat: Stat {
    points: prev.stat.points + curr.stat.points,
    fouls: prev.stat.fouls + curr.stat.fouls,
    },
    },
    );
    }
    None => {
    acc.insert(curr.name.clone(), curr);
    }
    }
    acc
    },
    )
    .into_iter() // Hope there's better way to get a vector of a hash map
    .map(|(_, v)| v)
    .collect();


    Playground link



    I used HashMap (key: player.name: String, value: player: Player) to accumulate stat values for players with same name, then converted the HashMap to Vec<Player> again. I think my solution can be simpler, more idiomatic and refined better.










    share|improve this question











    $endgroup$















      1












      1








      1





      $begingroup$


      There's a struct named Player containing name and stat:



      struct Player {
      name: String,
      stat: Stat,
      }

      struct Stat {
      points: u32,
      fouls: u32,
      }


      And I have a Vec<Player>. What is interesting is that there can be multiple players with same name and different stats like the following:



      let players = vec![
      Player {
      name: String::from("player1"),
      stat: Stat {
      points: 10,
      fouls: 1,
      },
      },
      Player {
      name: String::from("player2"),
      stat: Stat {
      points: 30,
      fouls: 3,
      },
      },
      Player {
      name: String::from("player1"),
      stat: Stat {
      points: 5,
      fouls: 1,
      },
      },
      ];


      The first and the third elements have same name, but with different stat values. What I want to do is merging those elements into one by adding all stat values. After the merging process, the output vector will have 2 elements like the following:



      [
      Player {
      name: String::from("player2"),
      stat: Stat {
      points: 30,
      fouls: 3,
      },
      },
      Player {
      name: String::from("player1"),
      stat: Stat {
      points: 15,
      fouls: 2,
      },
      },
      ]


      (The order of output vector is not important in this case.)



      And the code below is my solution:



      let dup_merged: Vec<_> = players
      .into_iter()
      .fold(
      HashMap::new(),
      |mut acc: HashMap<String, Player>, curr: Player| {
      match acc.get(&curr.name) {
      Some(prev) => {
      // Is it better to modify prev.stat
      // instead of inserting new Player instance?
      // HashMap<String, &Player> required to modification?
      acc.insert(
      curr.name.clone(),
      Player {
      name: curr.name,
      stat: Stat {
      points: prev.stat.points + curr.stat.points,
      fouls: prev.stat.fouls + curr.stat.fouls,
      },
      },
      );
      }
      None => {
      acc.insert(curr.name.clone(), curr);
      }
      }
      acc
      },
      )
      .into_iter() // Hope there's better way to get a vector of a hash map
      .map(|(_, v)| v)
      .collect();


      Playground link



      I used HashMap (key: player.name: String, value: player: Player) to accumulate stat values for players with same name, then converted the HashMap to Vec<Player> again. I think my solution can be simpler, more idiomatic and refined better.










      share|improve this question











      $endgroup$




      There's a struct named Player containing name and stat:



      struct Player {
      name: String,
      stat: Stat,
      }

      struct Stat {
      points: u32,
      fouls: u32,
      }


      And I have a Vec<Player>. What is interesting is that there can be multiple players with same name and different stats like the following:



      let players = vec![
      Player {
      name: String::from("player1"),
      stat: Stat {
      points: 10,
      fouls: 1,
      },
      },
      Player {
      name: String::from("player2"),
      stat: Stat {
      points: 30,
      fouls: 3,
      },
      },
      Player {
      name: String::from("player1"),
      stat: Stat {
      points: 5,
      fouls: 1,
      },
      },
      ];


      The first and the third elements have same name, but with different stat values. What I want to do is merging those elements into one by adding all stat values. After the merging process, the output vector will have 2 elements like the following:



      [
      Player {
      name: String::from("player2"),
      stat: Stat {
      points: 30,
      fouls: 3,
      },
      },
      Player {
      name: String::from("player1"),
      stat: Stat {
      points: 15,
      fouls: 2,
      },
      },
      ]


      (The order of output vector is not important in this case.)



      And the code below is my solution:



      let dup_merged: Vec<_> = players
      .into_iter()
      .fold(
      HashMap::new(),
      |mut acc: HashMap<String, Player>, curr: Player| {
      match acc.get(&curr.name) {
      Some(prev) => {
      // Is it better to modify prev.stat
      // instead of inserting new Player instance?
      // HashMap<String, &Player> required to modification?
      acc.insert(
      curr.name.clone(),
      Player {
      name: curr.name,
      stat: Stat {
      points: prev.stat.points + curr.stat.points,
      fouls: prev.stat.fouls + curr.stat.fouls,
      },
      },
      );
      }
      None => {
      acc.insert(curr.name.clone(), curr);
      }
      }
      acc
      },
      )
      .into_iter() // Hope there's better way to get a vector of a hash map
      .map(|(_, v)| v)
      .collect();


      Playground link



      I used HashMap (key: player.name: String, value: player: Player) to accumulate stat values for players with same name, then converted the HashMap to Vec<Player> again. I think my solution can be simpler, more idiomatic and refined better.







      functional-programming rust iterator






      share|improve this question















      share|improve this question













      share|improve this question




      share|improve this question








      edited 30 mins ago









      Jamal

      30.4k11121227




      30.4k11121227










      asked 13 hours ago









      philipjkimphilipjkim

      24719




      24719






















          1 Answer
          1






          active

          oldest

          votes


















          2












          $begingroup$

          Looks pretty good already! Here's some ideas for improvement.



          You can make implement Add and AddAssign for Stat so that you can use + and += on those. Here's an AddAssign implementation:



          impl std::ops::AddAssign for Stat {
          fn add_assign(&mut self, other: Self) {
          self.points += other.points;
          self.fouls += other.fouls;
          }
          }


          I like that you reached for a HashMap and fold, those are definitely how I'd do this. However, you only need to store the name and stat (HashMap<String, Stat>), which will make it so you don't have to clone the name each time and also reduce size. You can simply reconstruct the players at the end, before the collection.



          Another thing is that you currently perform two lookups in the hashmap for every insertion. The Entry API is a great tool to learn for interacting with maps. It lets you interact with a possibly missing element of the map and only perform one lookup.



          Here's a implementation with what I've mentioned so far.



          use std::collections::hash_map::Entry;

          let dup_merged: Vec<_> = players
          .into_iter()
          .fold(
          HashMap::new(),
          |mut acc: HashMap<String, Stat>, curr: Player| {
          match acc.entry(curr.name) {
          Entry::Occupied(mut occ) => {
          // This player already exists, increase its stats
          *occ.get_mut() += curr.stat;
          }
          Entry::Vacant(vac) => {
          // No such player exists, insert these stats
          vac.insert(curr.stat);
          }
          }
          acc
          },
          )
          .into_iter()
          .map(|(k, v)| Player { name: k, stat: v })
          .collect();


          There's another simplification that can be made, due to Stat being a small, cheap struct with sensible default values: You can have Stat implement Default, and then instead of inserting an entry if its missing, you can just have it default to a zero'd Stat and always add:



          #[derive(Default, Debug)]
          struct Stat {
          points: u32,
          fouls: u32,
          }
          // ...
          let dup_merged: Vec<_> = players
          .into_iter()
          .fold(
          HashMap::new(),
          |mut acc: HashMap<String, Stat>, curr: Player| {
          *acc.entry(curr.name).or_default() += curr.stat;
          acc
          },
          )
          .into_iter()
          .map(|(k, v)| Player { name: k, stat: v })
          .collect();





          share|improve this answer









          $endgroup$













            Your Answer





            StackExchange.ifUsing("editor", function () {
            return StackExchange.using("mathjaxEditing", function () {
            StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
            StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
            });
            });
            }, "mathjax-editing");

            StackExchange.ifUsing("editor", function () {
            StackExchange.using("externalEditor", function () {
            StackExchange.using("snippets", function () {
            StackExchange.snippets.init();
            });
            });
            }, "code-snippets");

            StackExchange.ready(function() {
            var channelOptions = {
            tags: "".split(" "),
            id: "196"
            };
            initTagRenderer("".split(" "), "".split(" "), channelOptions);

            StackExchange.using("externalEditor", function() {
            // Have to fire editor after snippets, if snippets enabled
            if (StackExchange.settings.snippets.snippetsEnabled) {
            StackExchange.using("snippets", function() {
            createEditor();
            });
            }
            else {
            createEditor();
            }
            });

            function createEditor() {
            StackExchange.prepareEditor({
            heartbeatType: 'answer',
            autoActivateHeartbeat: false,
            convertImagesToLinks: false,
            noModals: true,
            showLowRepImageUploadWarning: true,
            reputationToPostImages: null,
            bindNavPrevention: true,
            postfix: "",
            imageUploader: {
            brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
            contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
            allowUrls: true
            },
            onDemand: true,
            discardSelector: ".discard-answer"
            ,immediatelyShowMarkdownHelp:true
            });


            }
            });














            draft saved

            draft discarded


















            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f215259%2fmerging-not-removing-duplicated-elements-in-a-vector%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown

























            1 Answer
            1






            active

            oldest

            votes








            1 Answer
            1






            active

            oldest

            votes









            active

            oldest

            votes






            active

            oldest

            votes









            2












            $begingroup$

            Looks pretty good already! Here's some ideas for improvement.



            You can make implement Add and AddAssign for Stat so that you can use + and += on those. Here's an AddAssign implementation:



            impl std::ops::AddAssign for Stat {
            fn add_assign(&mut self, other: Self) {
            self.points += other.points;
            self.fouls += other.fouls;
            }
            }


            I like that you reached for a HashMap and fold, those are definitely how I'd do this. However, you only need to store the name and stat (HashMap<String, Stat>), which will make it so you don't have to clone the name each time and also reduce size. You can simply reconstruct the players at the end, before the collection.



            Another thing is that you currently perform two lookups in the hashmap for every insertion. The Entry API is a great tool to learn for interacting with maps. It lets you interact with a possibly missing element of the map and only perform one lookup.



            Here's a implementation with what I've mentioned so far.



            use std::collections::hash_map::Entry;

            let dup_merged: Vec<_> = players
            .into_iter()
            .fold(
            HashMap::new(),
            |mut acc: HashMap<String, Stat>, curr: Player| {
            match acc.entry(curr.name) {
            Entry::Occupied(mut occ) => {
            // This player already exists, increase its stats
            *occ.get_mut() += curr.stat;
            }
            Entry::Vacant(vac) => {
            // No such player exists, insert these stats
            vac.insert(curr.stat);
            }
            }
            acc
            },
            )
            .into_iter()
            .map(|(k, v)| Player { name: k, stat: v })
            .collect();


            There's another simplification that can be made, due to Stat being a small, cheap struct with sensible default values: You can have Stat implement Default, and then instead of inserting an entry if its missing, you can just have it default to a zero'd Stat and always add:



            #[derive(Default, Debug)]
            struct Stat {
            points: u32,
            fouls: u32,
            }
            // ...
            let dup_merged: Vec<_> = players
            .into_iter()
            .fold(
            HashMap::new(),
            |mut acc: HashMap<String, Stat>, curr: Player| {
            *acc.entry(curr.name).or_default() += curr.stat;
            acc
            },
            )
            .into_iter()
            .map(|(k, v)| Player { name: k, stat: v })
            .collect();





            share|improve this answer









            $endgroup$


















              2












              $begingroup$

              Looks pretty good already! Here's some ideas for improvement.



              You can make implement Add and AddAssign for Stat so that you can use + and += on those. Here's an AddAssign implementation:



              impl std::ops::AddAssign for Stat {
              fn add_assign(&mut self, other: Self) {
              self.points += other.points;
              self.fouls += other.fouls;
              }
              }


              I like that you reached for a HashMap and fold, those are definitely how I'd do this. However, you only need to store the name and stat (HashMap<String, Stat>), which will make it so you don't have to clone the name each time and also reduce size. You can simply reconstruct the players at the end, before the collection.



              Another thing is that you currently perform two lookups in the hashmap for every insertion. The Entry API is a great tool to learn for interacting with maps. It lets you interact with a possibly missing element of the map and only perform one lookup.



              Here's a implementation with what I've mentioned so far.



              use std::collections::hash_map::Entry;

              let dup_merged: Vec<_> = players
              .into_iter()
              .fold(
              HashMap::new(),
              |mut acc: HashMap<String, Stat>, curr: Player| {
              match acc.entry(curr.name) {
              Entry::Occupied(mut occ) => {
              // This player already exists, increase its stats
              *occ.get_mut() += curr.stat;
              }
              Entry::Vacant(vac) => {
              // No such player exists, insert these stats
              vac.insert(curr.stat);
              }
              }
              acc
              },
              )
              .into_iter()
              .map(|(k, v)| Player { name: k, stat: v })
              .collect();


              There's another simplification that can be made, due to Stat being a small, cheap struct with sensible default values: You can have Stat implement Default, and then instead of inserting an entry if its missing, you can just have it default to a zero'd Stat and always add:



              #[derive(Default, Debug)]
              struct Stat {
              points: u32,
              fouls: u32,
              }
              // ...
              let dup_merged: Vec<_> = players
              .into_iter()
              .fold(
              HashMap::new(),
              |mut acc: HashMap<String, Stat>, curr: Player| {
              *acc.entry(curr.name).or_default() += curr.stat;
              acc
              },
              )
              .into_iter()
              .map(|(k, v)| Player { name: k, stat: v })
              .collect();





              share|improve this answer









              $endgroup$
















                2












                2








                2





                $begingroup$

                Looks pretty good already! Here's some ideas for improvement.



                You can make implement Add and AddAssign for Stat so that you can use + and += on those. Here's an AddAssign implementation:



                impl std::ops::AddAssign for Stat {
                fn add_assign(&mut self, other: Self) {
                self.points += other.points;
                self.fouls += other.fouls;
                }
                }


                I like that you reached for a HashMap and fold, those are definitely how I'd do this. However, you only need to store the name and stat (HashMap<String, Stat>), which will make it so you don't have to clone the name each time and also reduce size. You can simply reconstruct the players at the end, before the collection.



                Another thing is that you currently perform two lookups in the hashmap for every insertion. The Entry API is a great tool to learn for interacting with maps. It lets you interact with a possibly missing element of the map and only perform one lookup.



                Here's a implementation with what I've mentioned so far.



                use std::collections::hash_map::Entry;

                let dup_merged: Vec<_> = players
                .into_iter()
                .fold(
                HashMap::new(),
                |mut acc: HashMap<String, Stat>, curr: Player| {
                match acc.entry(curr.name) {
                Entry::Occupied(mut occ) => {
                // This player already exists, increase its stats
                *occ.get_mut() += curr.stat;
                }
                Entry::Vacant(vac) => {
                // No such player exists, insert these stats
                vac.insert(curr.stat);
                }
                }
                acc
                },
                )
                .into_iter()
                .map(|(k, v)| Player { name: k, stat: v })
                .collect();


                There's another simplification that can be made, due to Stat being a small, cheap struct with sensible default values: You can have Stat implement Default, and then instead of inserting an entry if its missing, you can just have it default to a zero'd Stat and always add:



                #[derive(Default, Debug)]
                struct Stat {
                points: u32,
                fouls: u32,
                }
                // ...
                let dup_merged: Vec<_> = players
                .into_iter()
                .fold(
                HashMap::new(),
                |mut acc: HashMap<String, Stat>, curr: Player| {
                *acc.entry(curr.name).or_default() += curr.stat;
                acc
                },
                )
                .into_iter()
                .map(|(k, v)| Player { name: k, stat: v })
                .collect();





                share|improve this answer









                $endgroup$



                Looks pretty good already! Here's some ideas for improvement.



                You can make implement Add and AddAssign for Stat so that you can use + and += on those. Here's an AddAssign implementation:



                impl std::ops::AddAssign for Stat {
                fn add_assign(&mut self, other: Self) {
                self.points += other.points;
                self.fouls += other.fouls;
                }
                }


                I like that you reached for a HashMap and fold, those are definitely how I'd do this. However, you only need to store the name and stat (HashMap<String, Stat>), which will make it so you don't have to clone the name each time and also reduce size. You can simply reconstruct the players at the end, before the collection.



                Another thing is that you currently perform two lookups in the hashmap for every insertion. The Entry API is a great tool to learn for interacting with maps. It lets you interact with a possibly missing element of the map and only perform one lookup.



                Here's a implementation with what I've mentioned so far.



                use std::collections::hash_map::Entry;

                let dup_merged: Vec<_> = players
                .into_iter()
                .fold(
                HashMap::new(),
                |mut acc: HashMap<String, Stat>, curr: Player| {
                match acc.entry(curr.name) {
                Entry::Occupied(mut occ) => {
                // This player already exists, increase its stats
                *occ.get_mut() += curr.stat;
                }
                Entry::Vacant(vac) => {
                // No such player exists, insert these stats
                vac.insert(curr.stat);
                }
                }
                acc
                },
                )
                .into_iter()
                .map(|(k, v)| Player { name: k, stat: v })
                .collect();


                There's another simplification that can be made, due to Stat being a small, cheap struct with sensible default values: You can have Stat implement Default, and then instead of inserting an entry if its missing, you can just have it default to a zero'd Stat and always add:



                #[derive(Default, Debug)]
                struct Stat {
                points: u32,
                fouls: u32,
                }
                // ...
                let dup_merged: Vec<_> = players
                .into_iter()
                .fold(
                HashMap::new(),
                |mut acc: HashMap<String, Stat>, curr: Player| {
                *acc.entry(curr.name).or_default() += curr.stat;
                acc
                },
                )
                .into_iter()
                .map(|(k, v)| Player { name: k, stat: v })
                .collect();






                share|improve this answer












                share|improve this answer



                share|improve this answer










                answered 6 hours ago









                JayDeppJayDepp

                3212




                3212






























                    draft saved

                    draft discarded




















































                    Thanks for contributing an answer to Code Review Stack Exchange!


                    • Please be sure to answer the question. Provide details and share your research!

                    But avoid



                    • Asking for help, clarification, or responding to other answers.

                    • Making statements based on opinion; back them up with references or personal experience.


                    Use MathJax to format equations. MathJax reference.


                    To learn more, see our tips on writing great answers.




                    draft saved


                    draft discarded














                    StackExchange.ready(
                    function () {
                    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f215259%2fmerging-not-removing-duplicated-elements-in-a-vector%23new-answer', 'question_page');
                    }
                    );

                    Post as a guest















                    Required, but never shown





















































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown

































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown







                    Popular posts from this blog

                    Сан-Квентин

                    8-я гвардейская общевойсковая армия

                    Алькесар