How to construct a list of lengths efficiently











up vote
13
down vote

favorite
6












Say I have a sorted list of integers



RandomInteger[{1, 100000}, 10000] // Sort // Short


I want to construct another list whose $m$-th element is the number of elements in the original list that are less than or equal to $m$:



Table[Length@Select[%, LessEqualThan[m]], {m, 10000}]


This is terribly inefficient, but for some reason I cannot come up with a better approach. What's a better way to accomplish this? This seems to be a fairly standard exercise, so there should be plenty of duplicates, but I can find none.
I am probably missing a keyword...


































  • What do you want to do with the table? What's your original problem? It sounds like you want to build some kind of CDF table manually. Maybe an EmpiricalDistribution or BinCounts already can accomplish what you want?
    – Thies Heidecke
    4 hours ago















list-manipulation performance-tuning filtering






edited 5 hours ago









Alexey Popkov

asked 16 hours ago









AccidentalFourierTransform

4 Answers























up vote
14
down vote



accepted










You can use the usual UnitStep + Total tricks:



r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming

r2 = Table[Length@Select[s,LessEqualThan[m]],{m,10000}];//AbsoluteTiming

r1 === r2



{0.435358, Null}



{41.4357, Null}



True
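A quick way to see why the UnitStep trick counts correctly: UnitStep[m - s] is 1 exactly at the positions of s whose entries are at most m (note UnitStep[0] is 1, so ties are included), and Total adds the marks. A toy illustration (the 3-element s here is made up, not the OP's data):

(* toy list: UnitStep[m - s] marks the entries of s that are <= m,
   and Total counts the marks; UnitStep[0] == 1 includes ties *)
s = {2, 2, 5};
Table[Total[UnitStep[m - s]], {m, 6}]

{0, 2, 2, 2, 3, 3}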




Update



As @J42161217 points out, you can take advantage of the fact that the data is sorted to speed things up. He used Differences. Here is a version that uses Nearest instead:



mincounts[s_] := With[
{
unique = DeleteDuplicates@Nearest[s->"Element",s][[All,-1]],
counts = Prepend[0] @ DeleteDuplicates@Nearest[s->"Index",s][[All,-1]]
},
With[{near = Nearest[unique->"Index", Range @ Length @ s][[All,1]]},
counts[[1+near-UnitStep[unique[[near]]-Range@Length@s-1]]]
]
]


Comparison:



SeedRandom[1];
s=RandomInteger[{1,100000},10000]//Sort;

(* my first answer *)
r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming
(* J42161217's answer *)
r2 = Flatten[
Join[
{Table[0, s[[1]] - 1]},
Table[Table[i, Differences[s][[i]]], {i, Length[Select[s, # <= 10000 &]]}]
]
][[;;10000]]; // AbsoluteTiming
(* using Nearest *)
r3 = mincounts[s]; //AbsoluteTiming

r1 === r2 === r3



{0.432897, Null}



{0.122198, Null}



{0.025923, Null}



True







edited 13 hours ago

answered 16 hours ago

Carl Woll























  • Ah, great answer, as usual.
    – AccidentalFourierTransform
    16 hours ago










  • Can you please check my answer because my laptop is very slow
    – J42161217
    14 hours ago


















up vote
9
down vote













I think this is at least 3x faster than Mr. Carl Woll's answer.

Can anybody compare my timings?



r3 = Flatten[Join[{Table[0, s[[1]] - 1]}, 
Table[Table[i, Differences[s][[i]]], {i,
Length[Select[s, # <= 10000 &]]}]]][[;;10000]]; // AbsoluteTiming



{0.157123, Null}




Using MapThread, the same code is way faster:



r6 = Flatten[
Join[{Table[0, s[[1]] - 1]},
MapThread[
Table, {Range[t = Length[Select[s, # <= 10000 &]]],
Differences[s][[1 ;; t]]}]]][[;; 10000]]; // AbsoluteTiming

r6===r3



{0.008387, Null}



True
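The construction is easier to read on a toy list: after s[[1]] - 1 leading zeros, the count value i is repeated Differences[s][[i]] times, which covers m up to s[[-1]] - 1 (the real code above then truncates to 10000). A sketch with a made-up 3-element s and a simplified iterator bound (Length[s] - 1, since the toy values stay below any truncation limit):

(* toy sorted list; value i is repeated Differences[s][[i]] times *)
s = {2, 4, 5};
Flatten[Join[{Table[0, s[[1]] - 1]},
   Table[Table[i, Differences[s][[i]]], {i, Length[s] - 1}]]]

{0, 1, 1, 2}

i.e. the counts for m = 1, ..., 4.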







edited 4 hours ago

answered 14 hours ago

J42161217



















  • 1




    These are the timings on my laptop: r1={0.444354, Null}, r2={39.456, Null}, r3={0.157123, Null}, True
    – Okkes Dulgerci
    13 hours ago












  • Hey, thanks for checking. I will use your timing. Thanks for the confirmation
    – J42161217
    13 hours ago


















up vote
8
down vote













The combination of BinCounts and Accumulate is faster than all the methods posted so far:



r4 = Accumulate @ BinCounts[s, {1, 1 + 10000, 1}]; // RepeatedTiming // First 



0.00069
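The idea: BinCounts[s, {1, 1 + 10000, 1}] tallies how many elements of s fall on each integer 1, ..., 10000, and Accumulate turns the tallies into running totals, i.e. the number of elements <= m. On a toy list (my own example):

(* per-integer tallies, then a running total *)
s = {2, 2, 5};
BinCounts[s, {1, 7, 1}]

{0, 2, 0, 0, 1, 0}

Accumulate @ BinCounts[s, {1, 7, 1}]

{0, 2, 2, 2, 3, 3}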




versus Henrik Schumacher's mySparseArray, Carl Woll's mincounts and J42161217's Differences-based method:



r5 = Accumulate[mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[
1 ;; Length[s]]]
]; // RepeatedTiming // First



 0.00081




r3 = mincounts[s]; // RepeatedTiming // First



0.016




r2 = Flatten[Join[{Table[0, s[[1]] - 1]}, 
Table[Table[i, Differences[s][[i]]], {i,
    Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; //
RepeatedTiming // First



0.149




r2 == r3 == r4 == r5



True







edited 7 hours ago
Henrik Schumacher

answered 11 hours ago
kglr























  • Beat me to it - BinCounts is the way... +1
    – ciao
    10 hours ago










  • Hey @ciao, you are back?!!
    – kglr
    10 hours ago










  • Sorry @Henrik; thanks for the edit.
    – kglr
    7 hours ago










  • Short and fast. No need for sorting. Excellent! +1
    – Okkes Dulgerci
    2 hours ago


















up vote
3
down vote













s = Sort[RandomInteger[{1, 100000}, 10000]];


Let us just imagine for the moment that the target list r is supposed to have length 100000 (we can truncate it afterwards). Then for each entry i in the list s, the list r needs to have a jump of height 1 at position i. The jumps are the "derivative" of r (in a discrete sense) and the antiderivative can be obtained with Accumulate. In order to get the list of jumps, we need additive matrix assembly.
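To see why additive assembly is needed at all: with the default options, repeated rules for the same position in SparseArray are not summed, so a duplicate entry in s would produce a jump of height 1 where height 2 is required. A toy list with a repeated entry (my own example; the undercount shown assumes the default, non-summing rule handling):

(* toy list with a duplicate: only a single 1 survives at position 2,
   so the running total undercounts from m = 2 on *)
s = {2, 2, 5};
Accumulate @ Normal @ SparseArray[Partition[s, 1] -> 1, {5}]

{0, 1, 1, 1, 2}

whereas the count at m = 2 should be 2.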



This can be done with this function:



mySparseArray[rules_, dims_, f_: Total, background_: 0.] := 
If[(Head[rules] === Rule) && (rules[[1]] === {}),
rules[[2]],
With[{spopt = SystemOptions["SparseArrayOptions"]},
Internal`WithLocalSettings[
SetSystemOptions[
"SparseArrayOptions" -> {"TreatRepeatedEntries" -> f}],
SparseArray[rules, dims, background],
SetSystemOptions[spopt]]
]
]


So, in total, r can be obtained as follows:



r4 = Accumulate[
mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[1 ;; Length[s]]]
]; // RepeatedTiming // First



0.00055




For comparison:



r3 = Flatten[
Join[{Table[0, s[[1]] - 1]},
Table[Table[i, Differences[s][[i]]], {i,
Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; //
RepeatedTiming // First
r3 == r4



0.116



True




























    Your Answer





    StackExchange.ifUsing("editor", function () {
    return StackExchange.using("mathjaxEditing", function () {
    StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
    StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
    });
    });
    }, "mathjax-editing");

    StackExchange.ready(function() {
    var channelOptions = {
    tags: "".split(" "),
    id: "387"
    };
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function() {
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled) {
    StackExchange.using("snippets", function() {
    createEditor();
    });
    }
    else {
    createEditor();
    }
    });

    function createEditor() {
    StackExchange.prepareEditor({
    heartbeatType: 'answer',
    convertImagesToLinks: false,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: null,
    bindNavPrevention: true,
    postfix: "",
    imageUploader: {
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    },
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    });


    }
    });














     

    draft saved


    draft discarded


















    StackExchange.ready(
    function () {
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmathematica.stackexchange.com%2fquestions%2f185874%2fhow-to-construct-a-list-of-lengths-efficiently%23new-answer', 'question_page');
    }
    );

    Post as a guest
































    4 Answers
    4






    active

    oldest

    votes








    4 Answers
    4






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes








    up vote
    14
    down vote



    accepted










    You can use the usual UnitStep + Total tricks:



    r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming

    r2 = Table[Length@Select[s,LessEqualThan[m]],{m,10000}];//AbsoluteTiming

    r1 === r2



    {0.435358, Null}



    {41.4357, Null}



    True




    Update



    As @J42161217 points out, you can take advantage of the fact that the data is sorted to speed things up. He used Differences. Here is a version that uses Nearest instead:



    mincounts[s_] := With[
    {
    unique = DeleteDuplicates@Nearest[s->"Element",s][[All,-1]],
    counts = Prepend[0] @ DeleteDuplicates@Nearest[s->"Index",s][[All,-1]]
    },
    With[{near = Nearest[unique->"Index", Range @ Length @ s][[All,1]]},
    counts[[1+near-UnitStep[unique[[near]]-Range@Length@s-1]]]
    ]
    ]


    Comparison:



    SeedRandom[1];
    s=RandomInteger[{1,100000},10000]//Sort;

    (* my first answer *)
    r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming
    (* J42161217's answer *)
    r2 = Flatten[
    Join[
    {Table[0, s[[1]] - 1]},
    Table[Table[i, Differences[s][[i]]], {i, Length[Select[s, # <= 10000 &]]}]
    ]
    ][[;;10000]]; // AbsoluteTiming
    (* using Nearest *)
    r3 = mincounts[s]; //AbsoluteTiming

    r1 === r2 === r3



    {0.432897, Null}



    {0.122198, Null}



    {0.025923, Null}



    True







    share|improve this answer























    • Ah, great answer, as usual.
      – AccidentalFourierTransform
      16 hours ago










    • Can you please check my answer because my laptop is very slow
      – J42161217
      14 hours ago















    up vote
    14
    down vote



    accepted










    You can use the usual UnitStep + Total tricks:



    r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming

    r2 = Table[Length@Select[s,LessEqualThan[m]],{m,10000}];//AbsoluteTiming

    r1 === r2



    {0.435358, Null}



    {41.4357, Null}



    True




    Update



    As @J42161217 points out, you can take advantage of the fact that the data is sorted to speed things up. He used Differences. Here is a version that uses Nearest instead:



    mincounts[s_] := With[
    {
    unique = DeleteDuplicates@Nearest[s->"Element",s][[All,-1]],
    counts = Prepend[0] @ DeleteDuplicates@Nearest[s->"Index",s][[All,-1]]
    },
    With[{near = Nearest[unique->"Index", Range @ Length @ s][[All,1]]},
    counts[[1+near-UnitStep[unique[[near]]-Range@Length@s-1]]]
    ]
    ]


    Comparison:



    SeedRandom[1];
    s=RandomInteger[{1,100000},10000]//Sort;

    (* my first answer *)
    r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming
    (* J42161217's answer *)
    r2 = Flatten[
    Join[
    {Table[0, s[[1]] - 1]},
    Table[Table[i, Differences[s][[i]]], {i, Length[Select[s, # <= 10000 &]]}]
    ]
    ][[;;10000]]; // AbsoluteTiming
    (* using Nearest *)
    r3 = mincounts[s]; //AbsoluteTiming

    r1 === r2 === r3



    {0.432897, Null}



    {0.122198, Null}



    {0.025923, Null}



    True







    share|improve this answer























    • Ah, great answer, as usual.
      – AccidentalFourierTransform
      16 hours ago










    • Can you please check my answer because my laptop is very slow
      – J42161217
      14 hours ago













    up vote
    14
    down vote



    accepted







    up vote
    14
    down vote



    accepted






    You can use the usual UnitStep + Total tricks:



    r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming

    r2 = Table[Length@Select[s,LessEqualThan[m]],{m,10000}];//AbsoluteTiming

    r1 === r2



    {0.435358, Null}



    {41.4357, Null}



    True




    Update



    As @J42161217 points out, you can take advantage of the fact that the data is sorted to speed things up. He used Differences. Here is a version that uses Nearest instead:



    mincounts[s_] := With[
    {
    unique = DeleteDuplicates@Nearest[s->"Element",s][[All,-1]],
    counts = Prepend[0] @ DeleteDuplicates@Nearest[s->"Index",s][[All,-1]]
    },
    With[{near = Nearest[unique->"Index", Range @ Length @ s][[All,1]]},
    counts[[1+near-UnitStep[unique[[near]]-Range@Length@s-1]]]
    ]
    ]


    Comparison:



    SeedRandom[1];
    s=RandomInteger[{1,100000},10000]//Sort;

    (* my first answer *)
    r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming
    (* J42161217's answer *)
    r2 = Flatten[
    Join[
    {Table[0, s[[1]] - 1]},
    Table[Table[i, Differences[s][[i]]], {i, Length[Select[s, # <= 10000 &]]}]
    ]
    ][[;;10000]]; // AbsoluteTiming
    (* using Nearest *)
    r3 = mincounts[s]; //AbsoluteTiming

    r1 === r2 === r3



    {0.432897, Null}



    {0.122198, Null}



    {0.025923, Null}



    True







    share|improve this answer














    You can use the usual UnitStep + Total tricks:



    r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming

    r2 = Table[Length@Select[s,LessEqualThan[m]],{m,10000}];//AbsoluteTiming

    r1 === r2



    {0.435358, Null}



    {41.4357, Null}



    True




    Update



    As @J42161217 points out, you can take advantage of the fact that the data is sorted to speed things up. He used Differences. Here is a version that uses Nearest instead:



    mincounts[s_] := With[
    {
    unique = DeleteDuplicates@Nearest[s->"Element",s][[All,-1]],
    counts = Prepend[0] @ DeleteDuplicates@Nearest[s->"Index",s][[All,-1]]
    },
    With[{near = Nearest[unique->"Index", Range @ Length @ s][[All,1]]},
    counts[[1+near-UnitStep[unique[[near]]-Range@Length@s-1]]]
    ]
    ]


    Comparison:



    SeedRandom[1];
    s=RandomInteger[{1,100000},10000]//Sort;

    (* my first answer *)
    r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming
    (* J42161217's answer *)
    r2 = Flatten[
    Join[
    {Table[0, s[[1]] - 1]},
    Table[Table[i, Differences[s][[i]]], {i, Length[Select[s, # <= 10000 &]]}]
    ]
    ][[;;10000]]; // AbsoluteTiming
    (* using Nearest *)
    r3 = mincounts[s]; //AbsoluteTiming

    r1 === r2 === r3



    {0.432897, Null}



    {0.122198, Null}



    {0.025923, Null}



    True








    share|improve this answer














    share|improve this answer



    share|improve this answer








    edited 13 hours ago

























    answered 16 hours ago









    Carl Woll

    64.8k285169




    64.8k285169












    • Ah, great answer, as usual.
      – AccidentalFourierTransform
      16 hours ago










    • Can you please check my answer because my laptop is very slow
      – J42161217
      14 hours ago


















    • Ah, great answer, as usual.
      – AccidentalFourierTransform
      16 hours ago










    • Can you please check my answer because my laptop is very slow
      – J42161217
      14 hours ago
















    Ah, great answer, as usual.
    – AccidentalFourierTransform
    16 hours ago




    Ah, great answer, as usual.
    – AccidentalFourierTransform
    16 hours ago












    Can you please check my answer because my laptop is very slow
    – J42161217
    14 hours ago




    Can you please check my answer because my laptop is very slow
    – J42161217
    14 hours ago










    up vote
    9
    down vote













    I think this is at least x3 faster than Mr. Carl Woll's answer

    Can anybody compare my timing?



    r3 = Flatten[Join[{Table[0, s[[1]] - 1]}, 
    Table[Table[i, Differences[s][[i]]], {i,
    Length[Select[s, # <= 10000 &]]}]]][[;;10000]]; // AbsoluteTiming



    {0.157123, Null}




    Using MapThread the same code is way faster



    r6 = Flatten[
    Join[{Table[0, s[[1]] - 1]},
    MapThread[
    Table, {Range[t = Length[Select[s, # <= 10000 &]]],
    Differences[s][[1 ;; t]]}]]][[;; 10000]]; // AbsoluteTiming

    r6===r3



    {0.008387, Null}



    True







    share|improve this answer



















    • 1




      These are the timing on my laptop r1={0.444354, Null},r2={39.456, Null},r3={0.157123, Null} True
      – Okkes Dulgerci
      13 hours ago












    • Hey, thanks for checking. I will use your timing. Thanks for the confirmation
      – J42161217
      13 hours ago















    up vote
    9
    down vote













    I think this is at least x3 faster than Mr. Carl Woll's answer

    Can anybody compare my timing?



    r3 = Flatten[Join[{Table[0, s[[1]] - 1]}, 
    Table[Table[i, Differences[s][[i]]], {i,
    Length[Select[s, # <= 10000 &]]}]]][[;;10000]]; // AbsoluteTiming



    {0.157123, Null}




    Using MapThread the same code is way faster



    r6 = Flatten[
    Join[{Table[0, s[[1]] - 1]},
    MapThread[
    Table, {Range[t = Length[Select[s, # <= 10000 &]]],
    Differences[s][[1 ;; t]]}]]][[;; 10000]]; // AbsoluteTiming

    r6===r3



    {0.008387, Null}



    True







    share|improve this answer



















    • 1




      These are the timing on my laptop r1={0.444354, Null},r2={39.456, Null},r3={0.157123, Null} True
      – Okkes Dulgerci
      13 hours ago












    • Hey, thanks for checking. I will use your timing. Thanks for the confirmation
      – J42161217
      13 hours ago













    up vote
    9
    down vote










    up vote
    9
    down vote









    I think this is at least x3 faster than Mr. Carl Woll's answer

    Can anybody compare my timing?



    r3 = Flatten[Join[{Table[0, s[[1]] - 1]}, 
    Table[Table[i, Differences[s][[i]]], {i,
    Length[Select[s, # <= 10000 &]]}]]][[;;10000]]; // AbsoluteTiming



    {0.157123, Null}




    Using MapThread the same code is way faster



    r6 = Flatten[
    Join[{Table[0, s[[1]] - 1]},
    MapThread[
    Table, {Range[t = Length[Select[s, # <= 10000 &]]],
    Differences[s][[1 ;; t]]}]]][[;; 10000]]; // AbsoluteTiming

    r6===r3



    {0.008387, Null}



    True







    share|improve this answer














    I think this is at least x3 faster than Mr. Carl Woll's answer

    Can anybody compare my timing?



    r3 = Flatten[Join[{Table[0, s[[1]] - 1]}, 
    Table[Table[i, Differences[s][[i]]], {i,
    Length[Select[s, # <= 10000 &]]}]]][[;;10000]]; // AbsoluteTiming



    {0.157123, Null}




    Using MapThread the same code is way faster



    r6 = Flatten[
    Join[{Table[0, s[[1]] - 1]},
    MapThread[
    Table, {Range[t = Length[Select[s, # <= 10000 &]]],
    Differences[s][[1 ;; t]]}]]][[;; 10000]]; // AbsoluteTiming

    r6===r3



    {0.008387, Null}



    True








    share|improve this answer














    share|improve this answer



    share|improve this answer








    edited 4 hours ago

























    answered 14 hours ago









    J42161217

    2,853218




    2,853218








    • 1




      These are the timing on my laptop r1={0.444354, Null},r2={39.456, Null},r3={0.157123, Null} True
      – Okkes Dulgerci
      13 hours ago












    • Hey, thanks for checking. I will use your timing. Thanks for the confirmation
      – J42161217
      13 hours ago














    • 1




      These are the timing on my laptop r1={0.444354, Null},r2={39.456, Null},r3={0.157123, Null} True
      – Okkes Dulgerci
      13 hours ago












    • Hey, thanks for checking. I will use your timing. Thanks for the confirmation
      – J42161217
      13 hours ago








    1




    1




    These are the timing on my laptop r1={0.444354, Null},r2={39.456, Null},r3={0.157123, Null} True
    – Okkes Dulgerci
    13 hours ago






    These are the timing on my laptop r1={0.444354, Null},r2={39.456, Null},r3={0.157123, Null} True
    – Okkes Dulgerci
    13 hours ago














    Hey, thanks for checking. I will use your timing. Thanks for the confirmation
    – J42161217
    13 hours ago




    Hey, thanks for checking. I will use your timing. Thanks for the confirmation
    – J42161217
    13 hours ago










    up vote
    8
    down vote













    BinCounts and Accumulate combination is faster than all the methods posted so far:



    r4 = Accumulate @ BinCounts[s, {1, 1 + 10000, 1}]; // RepeatedTiming // First 



    0.00069




    versus Henrik Schumacher's mySparseArray, Carl Woll's mincounts and J42161217's Differences-based method:



    r5 = Accumulate[mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[
    1 ;; Length[s]]]
    ]; // RepeatedTiming // First



     0.00081




    r3 = mincounts[s]; // RepeatedTiming // First



    0.016




    r2 = Flatten[Join[{Table[0, s[[1]] - 1]}, 
    Table[Table[i, Differences[s][[i]]], {i,
        Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; //
    RepeatedTiming // First



    0.149




    r2 == r3 == r4 == r5



    True







    share|improve this answer























    • Beat me to it - BinCounts is the way... +1
      – ciao
      10 hours ago










    • Hey @ciao, you are back?!!
      – kglr
      10 hours ago










    • Sorry @Henrik; thanks for the edit.
      – kglr
      7 hours ago










    • Short and fast. No need sorting.. Excellent!!... +1
      – Okkes Dulgerci
      2 hours ago















    up vote
    8
    down vote













    BinCounts and Accumulate combination is faster than all the methods posted so far:



    r4 = Accumulate @ BinCounts[s, {1, 1 + 10000, 1}]; // RepeatedTiming // First 



    0.00069




    versus Henrik Schumacher's mySparseArray, Carl Woll's mincounts and J42161217's Differences-based method:



    r5 = Accumulate[mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[
    1 ;; Length[s]]]
    ]; // RepeatedTiming // First



     0.00081




    r3 = mincounts[s]; // RepeatedTiming // First



    0.016




    r2 = Flatten[Join[{Table[0, s[[1]] - 1]}, 
    Table[Table[i, Differences[s][[i]]], {i,
        Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; //
    RepeatedTiming // First



    0.149




    r2 == r3 == r4 == r5



    True







    share|improve this answer























    • Beat me to it - BinCounts is the way... +1
      – ciao
      10 hours ago










    • Hey @ciao, you are back?!!
      – kglr
      10 hours ago










    • Sorry @Henrik; thanks for the edit.
      – kglr
      7 hours ago










    • Short and fast. No need sorting.. Excellent!!... +1
      – Okkes Dulgerci
      2 hours ago













    up vote
    8
    down vote










    up vote
    8
    down vote









    BinCounts and Accumulate combination is faster than all the methods posted so far:



    r4 = Accumulate @ BinCounts[s, {1, 1 + 10000, 1}]; // RepeatedTiming // First 



    0.00069




    versus Henrik Schumacher's mySparseArray, Carl Woll's mincounts and J42161217's Differences-based method:



    r5 = Accumulate[mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[
    1 ;; Length[s]]]
    ]; // RepeatedTiming // First



     0.00081




    r3 = mincounts[s]; // RepeatedTiming // First



    0.016




    r2 = Flatten[Join[{Table[0, s[[1]] - 1]}, 
    Table[Table[i, Differences[s][[i]]], {i,
        Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; //
    RepeatedTiming // First



    0.149




    r2 == r3 == r4 == r5



    True







    share|improve this answer














    BinCounts and Accumulate combination is faster than all the methods posted so far:



    r4 = Accumulate @ BinCounts[s, {1, 1 + 10000, 1}]; // RepeatedTiming // First 



    0.00069




    versus Henrik Schumacher's mySparseArray, Carl Woll's mincounts and J42161217's Differences-based method:



    r5 = Accumulate[mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[
    1 ;; Length[s]]]
    ]; // RepeatedTiming // First



     0.00081




    r3 = mincounts[s]; // RepeatedTiming // First



    0.016




    r2 = Flatten[Join[{Table[0, s[[1]] - 1]}, 
    Table[Table[i, Differences[s][[i]]], {i,
        Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; //
    RepeatedTiming // First



    0.149




    r2 == r3 == r4 == r5



    True








    share|improve this answer














    share|improve this answer



    share|improve this answer








    edited 7 hours ago









    Henrik Schumacher

    44.8k265130




    44.8k265130










    answered 11 hours ago









    kglr

    170k8193397




    170k8193397












    • Beat me to it - BinCounts is the way... +1
      – ciao
      10 hours ago










    • Hey @ciao, you are back?!!
      – kglr
      10 hours ago










    • Sorry @Henrik; thanks for the edit.
      – kglr
      7 hours ago










    • Short and fast. No need sorting.. Excellent!!... +1
      – Okkes Dulgerci
      2 hours ago


















    • Beat me to it - BinCounts is the way... +1
      – ciao
      10 hours ago










    • Hey @ciao, you are back?!!
      – kglr
      10 hours ago










    • Sorry @Henrik; thanks for the edit.
      – kglr
      7 hours ago










    • Short and fast. No need sorting.. Excellent!!... +1
      – Okkes Dulgerci
      2 hours ago
















    Beat me to it - BinCounts is the way... +1
    – ciao
    10 hours ago




    Beat me to it - BinCounts is the way... +1
    – ciao
    10 hours ago












    Hey @ciao, you are back?!!
    – kglr
    10 hours ago




    Hey @ciao, you are back?!!
    – kglr
    10 hours ago












    Sorry @Henrik; thanks for the edit.
    – kglr
    7 hours ago




    Sorry @Henrik; thanks for the edit.
    – kglr
    7 hours ago












    Short and fast. No need sorting.. Excellent!!... +1
    – Okkes Dulgerci
    2 hours ago




    Short and fast. No need sorting.. Excellent!!... +1
    – Okkes Dulgerci
    2 hours ago










    up vote
    3
    down vote













    s = Sort[RandomInteger[{1, 100000}, 10000]];


    Let us just imagine for the moment that the target list r is supposed to have length 100000 (we can truncate it afterwards). Then for each entry i in the list s, the list r needs to have a jump of height 1 at position i. The jumps are the "derivative" of r (in a discrete sense) and the antiderivative can be obtained with Accumulate. In order to get the list of jumps, we need additive matrix assembly.



    This can be done with this function:



    mySparseArray[rules_, dims_, f_: Total, background_: 0.] := 
    If[(Head[rules] === Rule) && (rules[[1]] === {}),
    rules[[2]],
    With[{spopt = SystemOptions["SparseArrayOptions"]},
    Internal`WithLocalSettings[
    SetSystemOptions[
    "SparseArrayOptions" -> {"TreatRepeatedEntries" -> f}],
    SparseArray[rules, dims, background],
    SetSystemOptions[spopt]]
    ]
    ]


    So, in a total, r can be obtained as follows:



    r4 = Accumulate[
    mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[1 ;; Length[s]]]
    ]; // RepeatedTiming // First



    0.00055




    For comparison:



    r3 = Flatten[
    Join[{Table[0, s[[1]] - 1]},
    Table[Table[i, Differences[s][[i]]], {i,
    Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; //
    RepeatedTiming // First
    r3 == r4



    0.116



    True







    answered 11 hours ago by Henrik Schumacher