How to construct a list of lengths efficiently











up vote
13
down vote

favorite
6












Say I have a sorted list of integers



RandomInteger[{1, 100000}, 10000] // Sort // Short


I want to construct another list whose $m$-th element is the number of elements in the original list that are less than or equal to $m$:



Table[Length@Select[%, LessEqualThan[m]], {m, 10000}]


This is terribly inefficient, but for some reason I cannot come up with a better a approach. What's a better way to accomplish this? This seems to be a fairly standard exercise, so there should be plenty of duplicates, but I can find none.
I am probably missing a key word...










share|improve this question
























  • What do you want to do with the table? What's your original problem? It sounds like you want to build some kind of CDF table manually. Maybe an EmpiricalDistribution or BinCounts already can accomplish what you want?
    – Thies Heidecke
    4 hours ago















up vote
13
down vote

favorite
6












Say I have a sorted list of integers



RandomInteger[{1, 100000}, 10000] // Sort // Short


I want to construct another list whose $m$-th element is the number of elements in the original list that are less than or equal to $m$:



Table[Length@Select[%, LessEqualThan[m]], {m, 10000}]


This is terribly inefficient, but for some reason I cannot come up with a better a approach. What's a better way to accomplish this? This seems to be a fairly standard exercise, so there should be plenty of duplicates, but I can find none.
I am probably missing a key word...










share|improve this question
























  • What do you want to do with the table? What's your original problem? It sounds like you want to build some kind of CDF table manually. Maybe an EmpiricalDistribution or BinCounts already can accomplish what you want?
    – Thies Heidecke
    4 hours ago













up vote
13
down vote

favorite
6









up vote
13
down vote

favorite
6






6





Say I have a sorted list of integers



RandomInteger[{1, 100000}, 10000] // Sort // Short


I want to construct another list whose $m$-th element is the number of elements in the original list that are less than or equal to $m$:



Table[Length@Select[%, LessEqualThan[m]], {m, 10000}]


This is terribly inefficient, but for some reason I cannot come up with a better a approach. What's a better way to accomplish this? This seems to be a fairly standard exercise, so there should be plenty of duplicates, but I can find none.
I am probably missing a key word...










share|improve this question















Say I have a sorted list of integers



RandomInteger[{1, 100000}, 10000] // Sort // Short


I want to construct another list whose $m$-th element is the number of elements in the original list that are less than or equal to $m$:



Table[Length@Select[%, LessEqualThan[m]], {m, 10000}]


This is terribly inefficient, but for some reason I cannot come up with a better a approach. What's a better way to accomplish this? This seems to be a fairly standard exercise, so there should be plenty of duplicates, but I can find none.
I am probably missing a key word...







list-manipulation performance-tuning filtering






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited 5 hours ago









Alexey Popkov

38k4104260




38k4104260










asked 16 hours ago









AccidentalFourierTransform

4,9261941




4,9261941












  • What do you want to do with the table? What's your original problem? It sounds like you want to build some kind of CDF table manually. Maybe an EmpiricalDistribution or BinCounts already can accomplish what you want?
    – Thies Heidecke
    4 hours ago


















  • What do you want to do with the table? What's your original problem? It sounds like you want to build some kind of CDF table manually. Maybe an EmpiricalDistribution or BinCounts already can accomplish what you want?
    – Thies Heidecke
    4 hours ago
















What do you want to do with the table? What's your original problem? It sounds like you want to build some kind of CDF table manually. Maybe an EmpiricalDistribution or BinCounts already can accomplish what you want?
– Thies Heidecke
4 hours ago




What do you want to do with the table? What's your original problem? It sounds like you want to build some kind of CDF table manually. Maybe an EmpiricalDistribution or BinCounts already can accomplish what you want?
– Thies Heidecke
4 hours ago










4 Answers
4






active

oldest

votes

















up vote
14
down vote



accepted










You can use the usual UnitStep + Total tricks:



r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming

r2 = Table[Length@Select[s,LessEqualThan[m]],{m,10000}];//AbsoluteTiming

r1 === r2



{0.435358, Null}



{41.4357, Null}



True




Update



As @J42161217 points out, you can take advantage of the fact that the data is sorted to speed things up. He used Differences. Here is a version that uses Nearest instead:



mincounts[s_] := With[
{
unique = DeleteDuplicates@Nearest[s->"Element",s][[All,-1]],
counts = Prepend[0] @ DeleteDuplicates@Nearest[s->"Index",s][[All,-1]]
},
With[{near = Nearest[unique->"Index", Range @ Length @ s][[All,1]]},
counts[[1+near-UnitStep[unique[[near]]-Range@Length@s-1]]]
]
]


Comparison:



SeedRandom[1];
s=RandomInteger[{1,100000},10000]//Sort;

(* my first answer *)
r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming
(* J42161217's answer *)
r2 = Flatten[
Join[
{Table[0, s[[1]] - 1]},
Table[Table[i, Differences[s][[i]]], {i, Length[Select[s, # <= 10000 &]]}]
]
][[;;10000]]; // AbsoluteTiming
(* using Nearest *)
r3 = mincounts[s]; //AbsoluteTiming

r1 === r2 === r3



{0.432897, Null}



{0.122198, Null}



{0.025923, Null}



True







share|improve this answer























  • Ah, great answer, as usual.
    – AccidentalFourierTransform
    16 hours ago










  • Can you please check my answer because my laptop is very slow
    – J42161217
    14 hours ago


















up vote
9
down vote













I think this is at least x3 faster than Mr. Carl Woll's answer

Can anybody compare my timing?



r3 = Flatten[Join[{Table[0, s[[1]] - 1]}, 
Table[Table[i, Differences[s][[i]]], {i,
Length[Select[s, # <= 10000 &]]}]]][[;;10000]]; // AbsoluteTiming



{0.157123, Null}




Using MapThread the same code is way faster



r6 = Flatten[
Join[{Table[0, s[[1]] - 1]},
MapThread[
Table, {Range[t = Length[Select[s, # <= 10000 &]]],
Differences[s][[1 ;; t]]}]]][[;; 10000]]; // AbsoluteTiming

r6===r3



{0.008387, Null}



True







share|improve this answer



















  • 1




    These are the timing on my laptop r1={0.444354, Null},r2={39.456, Null},r3={0.157123, Null} True
    – Okkes Dulgerci
    13 hours ago












  • Hey, thanks for checking. I will use your timing. Thanks for the confirmation
    – J42161217
    13 hours ago


















up vote
8
down vote













BinCounts and Accumulate combination is faster than all the methods posted so far:



r4 = Accumulate @ BinCounts[s, {1, 1 + 10000, 1}]; // RepeatedTiming // First 



0.00069




versus Henrik Schumacher's mySparseArray, Carl Woll's mincounts and J42161217's Differences-based method:



r5 = Accumulate[mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[
1 ;; Length[s]]]
]; // RepeatedTiming // First



 0.00081




r3 = mincounts[s]; // RepeatedTiming // First



0.016




r2 = Flatten[Join[{Table[0, s[[1]] - 1]}, 
Table[Table[i, Differences[s][[i]]], {i,
    Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; //
RepeatedTiming // First



0.149




r2 == r3 == r4 == r5



True







share|improve this answer























  • Beat me to it - BinCounts is the way... +1
    – ciao
    10 hours ago










  • Hey @ciao, you are back?!!
    – kglr
    10 hours ago










  • Sorry @Henrik; thanks for the edit.
    – kglr
    7 hours ago










  • Short and fast. No need sorting.. Excellent!!... +1
    – Okkes Dulgerci
    2 hours ago


















up vote
3
down vote













s = Sort[RandomInteger[{1, 100000}, 10000]];


Let us just imagine for the moment that the target list r is supposed to have length 100000 (we can truncate it afterwards). Then for each entry i in the list s, the list r needs to have a jump of height 1 at position i. The jumps are the "derivative" of r (in a discrete sense) and the antiderivative can be obtained with Accumulate. In order to get the list of jumps, we need additive matrix assembly.



This can be done with this function:



mySparseArray[rules_, dims_, f_: Total, background_: 0.] := 
If[(Head[rules] === Rule) && (rules[[1]] === {}),
rules[[2]],
With[{spopt = SystemOptions["SparseArrayOptions"]},
Internal`WithLocalSettings[
SetSystemOptions[
"SparseArrayOptions" -> {"TreatRepeatedEntries" -> f}],
SparseArray[rules, dims, background],
SetSystemOptions[spopt]]
]
]


So, in a total, r can be obtained as follows:



r4 = Accumulate[
mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[1 ;; Length[s]]]
]; // RepeatedTiming // First



0.00055




For comparison:



r3 = Flatten[
Join[{Table[0, s[[1]] - 1]},
Table[Table[i, Differences[s][[i]]], {i,
Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; //
RepeatedTiming // First
r3 == r4



0.116



True







share|improve this answer





















    Your Answer





    StackExchange.ifUsing("editor", function () {
    return StackExchange.using("mathjaxEditing", function () {
    StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
    StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
    });
    });
    }, "mathjax-editing");

    StackExchange.ready(function() {
    var channelOptions = {
    tags: "".split(" "),
    id: "387"
    };
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function() {
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled) {
    StackExchange.using("snippets", function() {
    createEditor();
    });
    }
    else {
    createEditor();
    }
    });

    function createEditor() {
    StackExchange.prepareEditor({
    heartbeatType: 'answer',
    convertImagesToLinks: false,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: null,
    bindNavPrevention: true,
    postfix: "",
    imageUploader: {
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    },
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    });


    }
    });














     

    draft saved


    draft discarded


















    StackExchange.ready(
    function () {
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmathematica.stackexchange.com%2fquestions%2f185874%2fhow-to-construct-a-list-of-lengths-efficiently%23new-answer', 'question_page');
    }
    );

    Post as a guest
































    4 Answers
    4






    active

    oldest

    votes








    4 Answers
    4






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes








    up vote
    14
    down vote



    accepted










    You can use the usual UnitStep + Total tricks:



    r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming

    r2 = Table[Length@Select[s,LessEqualThan[m]],{m,10000}];//AbsoluteTiming

    r1 === r2



    {0.435358, Null}



    {41.4357, Null}



    True




    Update



    As @J42161217 points out, you can take advantage of the fact that the data is sorted to speed things up. He used Differences. Here is a version that uses Nearest instead:



    mincounts[s_] := With[
    {
    unique = DeleteDuplicates@Nearest[s->"Element",s][[All,-1]],
    counts = Prepend[0] @ DeleteDuplicates@Nearest[s->"Index",s][[All,-1]]
    },
    With[{near = Nearest[unique->"Index", Range @ Length @ s][[All,1]]},
    counts[[1+near-UnitStep[unique[[near]]-Range@Length@s-1]]]
    ]
    ]


    Comparison:



    SeedRandom[1];
    s=RandomInteger[{1,100000},10000]//Sort;

    (* my first answer *)
    r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming
    (* J42161217's answer *)
    r2 = Flatten[
    Join[
    {Table[0, s[[1]] - 1]},
    Table[Table[i, Differences[s][[i]]], {i, Length[Select[s, # <= 10000 &]]}]
    ]
    ][[;;10000]]; // AbsoluteTiming
    (* using Nearest *)
    r3 = mincounts[s]; //AbsoluteTiming

    r1 === r2 === r3



    {0.432897, Null}



    {0.122198, Null}



    {0.025923, Null}



    True







    share|improve this answer























    • Ah, great answer, as usual.
      – AccidentalFourierTransform
      16 hours ago










    • Can you please check my answer because my laptop is very slow
      – J42161217
      14 hours ago















    up vote
    14
    down vote



    accepted










    You can use the usual UnitStep + Total tricks:



    r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming

    r2 = Table[Length@Select[s,LessEqualThan[m]],{m,10000}];//AbsoluteTiming

    r1 === r2



    {0.435358, Null}



    {41.4357, Null}



    True




    Update



    As @J42161217 points out, you can take advantage of the fact that the data is sorted to speed things up. He used Differences. Here is a version that uses Nearest instead:



    mincounts[s_] := With[
    {
    unique = DeleteDuplicates@Nearest[s->"Element",s][[All,-1]],
    counts = Prepend[0] @ DeleteDuplicates@Nearest[s->"Index",s][[All,-1]]
    },
    With[{near = Nearest[unique->"Index", Range @ Length @ s][[All,1]]},
    counts[[1+near-UnitStep[unique[[near]]-Range@Length@s-1]]]
    ]
    ]


    Comparison:



    SeedRandom[1];
    s=RandomInteger[{1,100000},10000]//Sort;

    (* my first answer *)
    r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming
    (* J42161217's answer *)
    r2 = Flatten[
    Join[
    {Table[0, s[[1]] - 1]},
    Table[Table[i, Differences[s][[i]]], {i, Length[Select[s, # <= 10000 &]]}]
    ]
    ][[;;10000]]; // AbsoluteTiming
    (* using Nearest *)
    r3 = mincounts[s]; //AbsoluteTiming

    r1 === r2 === r3



    {0.432897, Null}



    {0.122198, Null}



    {0.025923, Null}



    True







    share|improve this answer























    • Ah, great answer, as usual.
      – AccidentalFourierTransform
      16 hours ago










    • Can you please check my answer because my laptop is very slow
      – J42161217
      14 hours ago













    up vote
    14
    down vote



    accepted







    up vote
    14
    down vote



    accepted






    You can use the usual UnitStep + Total tricks:



    r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming

    r2 = Table[Length@Select[s,LessEqualThan[m]],{m,10000}];//AbsoluteTiming

    r1 === r2



    {0.435358, Null}



    {41.4357, Null}



    True




    Update



    As @J42161217 points out, you can take advantage of the fact that the data is sorted to speed things up. He used Differences. Here is a version that uses Nearest instead:



    mincounts[s_] := With[
    {
    unique = DeleteDuplicates@Nearest[s->"Element",s][[All,-1]],
    counts = Prepend[0] @ DeleteDuplicates@Nearest[s->"Index",s][[All,-1]]
    },
    With[{near = Nearest[unique->"Index", Range @ Length @ s][[All,1]]},
    counts[[1+near-UnitStep[unique[[near]]-Range@Length@s-1]]]
    ]
    ]


    Comparison:



    SeedRandom[1];
    s=RandomInteger[{1,100000},10000]//Sort;

    (* my first answer *)
    r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming
    (* J42161217's answer *)
    r2 = Flatten[
    Join[
    {Table[0, s[[1]] - 1]},
    Table[Table[i, Differences[s][[i]]], {i, Length[Select[s, # <= 10000 &]]}]
    ]
    ][[;;10000]]; // AbsoluteTiming
    (* using Nearest *)
    r3 = mincounts[s]; //AbsoluteTiming

    r1 === r2 === r3



    {0.432897, Null}



    {0.122198, Null}



    {0.025923, Null}



    True







    share|improve this answer














    You can use the usual UnitStep + Total tricks:



    r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming

    r2 = Table[Length@Select[s,LessEqualThan[m]],{m,10000}];//AbsoluteTiming

    r1 === r2



    {0.435358, Null}



    {41.4357, Null}



    True




    Update



    As @J42161217 points out, you can take advantage of the fact that the data is sorted to speed things up. He used Differences. Here is a version that uses Nearest instead:



    mincounts[s_] := With[
    {
    unique = DeleteDuplicates@Nearest[s->"Element",s][[All,-1]],
    counts = Prepend[0] @ DeleteDuplicates@Nearest[s->"Index",s][[All,-1]]
    },
    With[{near = Nearest[unique->"Index", Range @ Length @ s][[All,1]]},
    counts[[1+near-UnitStep[unique[[near]]-Range@Length@s-1]]]
    ]
    ]


    Comparison:



    SeedRandom[1];
    s=RandomInteger[{1,100000},10000]//Sort;

    (* my first answer *)
    r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming
    (* J42161217's answer *)
    r2 = Flatten[
    Join[
    {Table[0, s[[1]] - 1]},
    Table[Table[i, Differences[s][[i]]], {i, Length[Select[s, # <= 10000 &]]}]
    ]
    ][[;;10000]]; // AbsoluteTiming
    (* using Nearest *)
    r3 = mincounts[s]; //AbsoluteTiming

    r1 === r2 === r3



    {0.432897, Null}



    {0.122198, Null}



    {0.025923, Null}



    True








    share|improve this answer














    share|improve this answer



    share|improve this answer








    edited 13 hours ago

























    answered 16 hours ago









    Carl Woll

    64.8k285169




    64.8k285169












    • Ah, great answer, as usual.
      – AccidentalFourierTransform
      16 hours ago










    • Can you please check my answer because my laptop is very slow
      – J42161217
      14 hours ago


















    • Ah, great answer, as usual.
      – AccidentalFourierTransform
      16 hours ago










    • Can you please check my answer because my laptop is very slow
      – J42161217
      14 hours ago
















    Ah, great answer, as usual.
    – AccidentalFourierTransform
    16 hours ago




    Ah, great answer, as usual.
    – AccidentalFourierTransform
    16 hours ago












    Can you please check my answer because my laptop is very slow
    – J42161217
    14 hours ago




    Can you please check my answer because my laptop is very slow
    – J42161217
    14 hours ago










    up vote
    9
    down vote













    I think this is at least x3 faster than Mr. Carl Woll's answer

    Can anybody compare my timing?



    r3 = Flatten[Join[{Table[0, s[[1]] - 1]}, 
    Table[Table[i, Differences[s][[i]]], {i,
    Length[Select[s, # <= 10000 &]]}]]][[;;10000]]; // AbsoluteTiming



    {0.157123, Null}




    Using MapThread the same code is way faster



    r6 = Flatten[
    Join[{Table[0, s[[1]] - 1]},
    MapThread[
    Table, {Range[t = Length[Select[s, # <= 10000 &]]],
    Differences[s][[1 ;; t]]}]]][[;; 10000]]; // AbsoluteTiming

    r6===r3



    {0.008387, Null}



    True







    share|improve this answer



















    • 1




      These are the timing on my laptop r1={0.444354, Null},r2={39.456, Null},r3={0.157123, Null} True
      – Okkes Dulgerci
      13 hours ago












    • Hey, thanks for checking. I will use your timing. Thanks for the confirmation
      – J42161217
      13 hours ago















    up vote
    9
    down vote













    I think this is at least x3 faster than Mr. Carl Woll's answer

    Can anybody compare my timing?



    r3 = Flatten[Join[{Table[0, s[[1]] - 1]}, 
    Table[Table[i, Differences[s][[i]]], {i,
    Length[Select[s, # <= 10000 &]]}]]][[;;10000]]; // AbsoluteTiming



    {0.157123, Null}




    Using MapThread the same code is way faster



    r6 = Flatten[
    Join[{Table[0, s[[1]] - 1]},
    MapThread[
    Table, {Range[t = Length[Select[s, # <= 10000 &]]],
    Differences[s][[1 ;; t]]}]]][[;; 10000]]; // AbsoluteTiming

    r6===r3



    {0.008387, Null}



    True







    share|improve this answer



















    • 1




      These are the timing on my laptop r1={0.444354, Null},r2={39.456, Null},r3={0.157123, Null} True
      – Okkes Dulgerci
      13 hours ago












    • Hey, thanks for checking. I will use your timing. Thanks for the confirmation
      – J42161217
      13 hours ago













    up vote
    9
    down vote










    up vote
    9
    down vote









    I think this is at least x3 faster than Mr. Carl Woll's answer

    Can anybody compare my timing?



    r3 = Flatten[Join[{Table[0, s[[1]] - 1]}, 
    Table[Table[i, Differences[s][[i]]], {i,
    Length[Select[s, # <= 10000 &]]}]]][[;;10000]]; // AbsoluteTiming



    {0.157123, Null}




    Using MapThread the same code is way faster



    r6 = Flatten[
    Join[{Table[0, s[[1]] - 1]},
    MapThread[
    Table, {Range[t = Length[Select[s, # <= 10000 &]]],
    Differences[s][[1 ;; t]]}]]][[;; 10000]]; // AbsoluteTiming

    r6===r3



    {0.008387, Null}



    True







    share|improve this answer














    I think this is at least x3 faster than Mr. Carl Woll's answer

    Can anybody compare my timing?



    r3 = Flatten[Join[{Table[0, s[[1]] - 1]}, 
    Table[Table[i, Differences[s][[i]]], {i,
    Length[Select[s, # <= 10000 &]]}]]][[;;10000]]; // AbsoluteTiming



    {0.157123, Null}




    Using MapThread the same code is way faster



    r6 = Flatten[
    Join[{Table[0, s[[1]] - 1]},
    MapThread[
    Table, {Range[t = Length[Select[s, # <= 10000 &]]],
    Differences[s][[1 ;; t]]}]]][[;; 10000]]; // AbsoluteTiming

    r6===r3



    {0.008387, Null}



    True








    share|improve this answer














    share|improve this answer



    share|improve this answer








    edited 4 hours ago

























    answered 14 hours ago









    J42161217

    2,853218




    2,853218








    • 1




      These are the timing on my laptop r1={0.444354, Null},r2={39.456, Null},r3={0.157123, Null} True
      – Okkes Dulgerci
      13 hours ago












    • Hey, thanks for checking. I will use your timing. Thanks for the confirmation
      – J42161217
      13 hours ago














    • 1




      These are the timing on my laptop r1={0.444354, Null},r2={39.456, Null},r3={0.157123, Null} True
      – Okkes Dulgerci
      13 hours ago












    • Hey, thanks for checking. I will use your timing. Thanks for the confirmation
      – J42161217
      13 hours ago








    1




    1




    These are the timing on my laptop r1={0.444354, Null},r2={39.456, Null},r3={0.157123, Null} True
    – Okkes Dulgerci
    13 hours ago






    These are the timing on my laptop r1={0.444354, Null},r2={39.456, Null},r3={0.157123, Null} True
    – Okkes Dulgerci
    13 hours ago














    Hey, thanks for checking. I will use your timing. Thanks for the confirmation
    – J42161217
    13 hours ago




    Hey, thanks for checking. I will use your timing. Thanks for the confirmation
    – J42161217
    13 hours ago










    up vote
    8
    down vote













    BinCounts and Accumulate combination is faster than all the methods posted so far:



    r4 = Accumulate @ BinCounts[s, {1, 1 + 10000, 1}]; // RepeatedTiming // First 



    0.00069




    versus Henrik Schumacher's mySparseArray, Carl Woll's mincounts and J42161217's Differences-based method:



    r5 = Accumulate[mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[
    1 ;; Length[s]]]
    ]; // RepeatedTiming // First



     0.00081




    r3 = mincounts[s]; // RepeatedTiming // First



    0.016




    r2 = Flatten[Join[{Table[0, s[[1]] - 1]}, 
    Table[Table[i, Differences[s][[i]]], {i,
        Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; //
    RepeatedTiming // First



    0.149




    r2 == r3 == r4 == r5



    True







    share|improve this answer























    • Beat me to it - BinCounts is the way... +1
      – ciao
      10 hours ago










    • Hey @ciao, you are back?!!
      – kglr
      10 hours ago










    • Sorry @Henrik; thanks for the edit.
      – kglr
      7 hours ago










    • Short and fast. No need sorting.. Excellent!!... +1
      – Okkes Dulgerci
      2 hours ago















    up vote
    8
    down vote













    BinCounts and Accumulate combination is faster than all the methods posted so far:



    r4 = Accumulate @ BinCounts[s, {1, 1 + 10000, 1}]; // RepeatedTiming // First 



    0.00069




    versus Henrik Schumacher's mySparseArray, Carl Woll's mincounts and J42161217's Differences-based method:



    r5 = Accumulate[mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[
    1 ;; Length[s]]]
    ]; // RepeatedTiming // First



     0.00081




    r3 = mincounts[s]; // RepeatedTiming // First



    0.016




    r2 = Flatten[Join[{Table[0, s[[1]] - 1]}, 
    Table[Table[i, Differences[s][[i]]], {i,
        Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; //
    RepeatedTiming // First



    0.149




    r2 == r3 == r4 == r5



    True







    share|improve this answer























    • Beat me to it - BinCounts is the way... +1
      – ciao
      10 hours ago










    • Hey @ciao, you are back?!!
      – kglr
      10 hours ago










    • Sorry @Henrik; thanks for the edit.
      – kglr
      7 hours ago










    • Short and fast. No need sorting.. Excellent!!... +1
      – Okkes Dulgerci
      2 hours ago













    up vote
    8
    down vote










    up vote
    8
    down vote









    BinCounts and Accumulate combination is faster than all the methods posted so far:



    r4 = Accumulate @ BinCounts[s, {1, 1 + 10000, 1}]; // RepeatedTiming // First 



    0.00069




    versus Henrik Schumacher's mySparseArray, Carl Woll's mincounts and J42161217's Differences-based method:



    r5 = Accumulate[mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[
    1 ;; Length[s]]]
    ]; // RepeatedTiming // First



     0.00081




    r3 = mincounts[s]; // RepeatedTiming // First



    0.016




    r2 = Flatten[Join[{Table[0, s[[1]] - 1]}, 
    Table[Table[i, Differences[s][[i]]], {i,
        Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; //
    RepeatedTiming // First



    0.149




    r2 == r3 == r4 == r5



    True







    share|improve this answer














    BinCounts and Accumulate combination is faster than all the methods posted so far:



    r4 = Accumulate @ BinCounts[s, {1, 1 + 10000, 1}]; // RepeatedTiming // First 



    0.00069




    versus Henrik Schumacher's mySparseArray, Carl Woll's mincounts and J42161217's Differences-based method:



    r5 = Accumulate[mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[
    1 ;; Length[s]]]
    ]; // RepeatedTiming // First



     0.00081




    r3 = mincounts[s]; // RepeatedTiming // First



    0.016




    r2 = Flatten[Join[{Table[0, s[[1]] - 1]}, 
    Table[Table[i, Differences[s][[i]]], {i,
        Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; //
    RepeatedTiming // First



    0.149




    r2 == r3 == r4 == r5



    True








    share|improve this answer














    share|improve this answer



    share|improve this answer








    edited 7 hours ago









    Henrik Schumacher

    44.8k265130




    44.8k265130










    answered 11 hours ago









    kglr

    170k8193397




    170k8193397












    • Beat me to it - BinCounts is the way... +1
      – ciao
      10 hours ago










    • Hey @ciao, you are back?!!
      – kglr
      10 hours ago










    • Sorry @Henrik; thanks for the edit.
      – kglr
      7 hours ago










    • Short and fast. No need sorting.. Excellent!!... +1
      – Okkes Dulgerci
      2 hours ago


















    • Beat me to it - BinCounts is the way... +1
      – ciao
      10 hours ago










    • Hey @ciao, you are back?!!
      – kglr
      10 hours ago










    • Sorry @Henrik; thanks for the edit.
      – kglr
      7 hours ago










    • Short and fast. No need sorting.. Excellent!!... +1
      – Okkes Dulgerci
      2 hours ago
















    Beat me to it - BinCounts is the way... +1
    – ciao
    10 hours ago




    Beat me to it - BinCounts is the way... +1
    – ciao
    10 hours ago












    Hey @ciao, you are back?!!
    – kglr
    10 hours ago




    Hey @ciao, you are back?!!
    – kglr
    10 hours ago












    Sorry @Henrik; thanks for the edit.
    – kglr
    7 hours ago




    Sorry @Henrik; thanks for the edit.
    – kglr
    7 hours ago












    Short and fast. No need sorting.. Excellent!!... +1
    – Okkes Dulgerci
    2 hours ago




    Short and fast. No need sorting.. Excellent!!... +1
    – Okkes Dulgerci
    2 hours ago










    up vote
    3
    down vote













    s = Sort[RandomInteger[{1, 100000}, 10000]];


    Let us just imagine for the moment that the target list r is supposed to have length 100000 (we can truncate it afterwards). Then for each entry i in the list s, the list r needs to have a jump of height 1 at position i. The jumps are the "derivative" of r (in a discrete sense) and the antiderivative can be obtained with Accumulate. In order to get the list of jumps, we need additive matrix assembly.



    This can be done with this function:



    mySparseArray[rules_, dims_, f_: Total, background_: 0.] := 
    If[(Head[rules] === Rule) && (rules[[1]] === {}),
    rules[[2]],
    With[{spopt = SystemOptions["SparseArrayOptions"]},
    Internal`WithLocalSettings[
    SetSystemOptions[
    "SparseArrayOptions" -> {"TreatRepeatedEntries" -> f}],
    SparseArray[rules, dims, background],
    SetSystemOptions[spopt]]
    ]
    ]


    So, in a total, r can be obtained as follows:



    r4 = Accumulate[
    mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[1 ;; Length[s]]]
    ]; // RepeatedTiming // First



    0.00055




    For comparison:



    r3 = Flatten[
    Join[{Table[0, s[[1]] - 1]},
    Table[Table[i, Differences[s][[i]]], {i,
    Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; //
    RepeatedTiming // First
    r3 == r4



    0.116



    True







    share|improve this answer

























      up vote
      3
      down vote













      s = Sort[RandomInteger[{1, 100000}, 10000]];


      Let us just imagine for the moment that the target list r is supposed to have length 100000 (we can truncate it afterwards). Then for each entry i in the list s, the list r needs to have a jump of height 1 at position i. The jumps are the "derivative" of r (in a discrete sense) and the antiderivative can be obtained with Accumulate. In order to get the list of jumps, we need additive matrix assembly.



      This can be done with this function:



      mySparseArray[rules_, dims_, f_: Total, background_: 0.] := 
      If[(Head[rules] === Rule) && (rules[[1]] === {}),
      rules[[2]],
      With[{spopt = SystemOptions["SparseArrayOptions"]},
      Internal`WithLocalSettings[
      SetSystemOptions[
      "SparseArrayOptions" -> {"TreatRepeatedEntries" -> f}],
      SparseArray[rules, dims, background],
      SetSystemOptions[spopt]]
      ]
      ]


      So, in a total, r can be obtained as follows:



      r4 = Accumulate[
      mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[1 ;; Length[s]]]
      ]; // RepeatedTiming // First



      0.00055




      For comparison:



      r3 = Flatten[
      Join[{Table[0, s[[1]] - 1]},
      Table[Table[i, Differences[s][[i]]], {i,
      Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; //
      RepeatedTiming // First
      r3 == r4



      0.116



      True







      share|improve this answer























        up vote
        3
        down vote










        up vote
        3
        down vote









        s = Sort[RandomInteger[{1, 100000}, 10000]];


        Let us just imagine for the moment that the target list r is supposed to have length 100000 (we can truncate it afterwards). Then for each entry i in the list s, the list r needs to have a jump of height 1 at position i. The jumps are the "derivative" of r (in a discrete sense) and the antiderivative can be obtained with Accumulate. In order to get the list of jumps, we need additive matrix assembly.



        This can be done with this function:



        mySparseArray[rules_, dims_, f_: Total, background_: 0.] := 
        If[(Head[rules] === Rule) && (rules[[1]] === {}),
        rules[[2]],
        With[{spopt = SystemOptions["SparseArrayOptions"]},
        Internal`WithLocalSettings[
        SetSystemOptions[
        "SparseArrayOptions" -> {"TreatRepeatedEntries" -> f}],
        SparseArray[rules, dims, background],
        SetSystemOptions[spopt]]
        ]
        ]


        So, in a total, r can be obtained as follows:



        r4 = Accumulate[
        mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[1 ;; Length[s]]]
        ]; // RepeatedTiming // First



        0.00055




        For comparison:



        r3 = Flatten[
        Join[{Table[0, s[[1]] - 1]},
        Table[Table[i, Differences[s][[i]]], {i,
        Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; //
        RepeatedTiming // First
        r3 == r4



        0.116



        True







        share|improve this answer












        s = Sort[RandomInteger[{1, 100000}, 10000]];


        Let us just imagine for the moment that the target list r is supposed to have length 100000 (we can truncate it afterwards). Then for each entry i in the list s, the list r needs to have a jump of height 1 at position i. The jumps are the "derivative" of r (in a discrete sense) and the antiderivative can be obtained with Accumulate. In order to get the list of jumps, we need additive matrix assembly.



        This can be done with this function:



        mySparseArray[rules_, dims_, f_: Total, background_: 0.] := 
        If[(Head[rules] === Rule) && (rules[[1]] === {}),
        rules[[2]],
        With[{spopt = SystemOptions["SparseArrayOptions"]},
        Internal`WithLocalSettings[
        SetSystemOptions[
        "SparseArrayOptions" -> {"TreatRepeatedEntries" -> f}],
        SparseArray[rules, dims, background],
        SetSystemOptions[spopt]]
        ]
        ]


        So, in a total, r can be obtained as follows:



        r4 = Accumulate[
        mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[1 ;; Length[s]]]
        ]; // RepeatedTiming // First



        0.00055




        For comparison:



        r3 = Flatten[
        Join[{Table[0, s[[1]] - 1]},
        Table[Table[i, Differences[s][[i]]], {i,
        Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; //
        RepeatedTiming // First
        r3 == r4



        0.116



        True








        share|improve this answer












        share|improve this answer



        share|improve this answer










        answered 11 hours ago









        Henrik Schumacher

        44.8k265130




        44.8k265130






























             

            draft saved


            draft discarded



















































             


            draft saved


            draft discarded














            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmathematica.stackexchange.com%2fquestions%2f185874%2fhow-to-construct-a-list-of-lengths-efficiently%23new-answer', 'question_page');
            }
            );

            Post as a guest




















































































            Popular posts from this blog

            Сан-Квентин

            8-я гвардейская общевойсковая армия

            Алькесар