Prime factorization in Python











up vote
3
down vote

favorite












Beginner programmer here. What can I improve in my code? Are there things I'm doing wrong or are there easier ways of doing what I did?



def reduce(n):
n = abs(n)
ctr = 0
factors_list =
for i in range(2,n+1):
if ctr >= 1:
break
if n % i == 0:
factors_list.append(i)
factors_list.append(int(n/i))
ctr += 1
return factors_list


def isPrime(n):
return 1 in reduce(n)


def primeFactorization(n):
if isPrime(n):
return reduce(n)
factors = reduce(n)
primeFactors =
while True:
for e in factors:
if isPrime(e):
primeFactors.append(e)
factors.remove(e)
else:
factors.extend(reduce(e))
factors.remove(e)
if len(factors) == 0:
break
return sorted(primeFactors)









share|improve this question




























    up vote
    3
    down vote

    favorite












    Beginner programmer here. What can I improve in my code? Are there things I'm doing wrong or are there easier ways of doing what I did?



    def reduce(n):
    n = abs(n)
    ctr = 0
    factors_list =
    for i in range(2,n+1):
    if ctr >= 1:
    break
    if n % i == 0:
    factors_list.append(i)
    factors_list.append(int(n/i))
    ctr += 1
    return factors_list


    def isPrime(n):
    return 1 in reduce(n)


    def primeFactorization(n):
    if isPrime(n):
    return reduce(n)
    factors = reduce(n)
    primeFactors =
    while True:
    for e in factors:
    if isPrime(e):
    primeFactors.append(e)
    factors.remove(e)
    else:
    factors.extend(reduce(e))
    factors.remove(e)
    if len(factors) == 0:
    break
    return sorted(primeFactors)









    share|improve this question


























      up vote
      3
      down vote

      favorite









      up vote
      3
      down vote

      favorite











      Beginner programmer here. What can I improve in my code? Are there things I'm doing wrong or are there easier ways of doing what I did?



      def reduce(n):
      n = abs(n)
      ctr = 0
      factors_list =
      for i in range(2,n+1):
      if ctr >= 1:
      break
      if n % i == 0:
      factors_list.append(i)
      factors_list.append(int(n/i))
      ctr += 1
      return factors_list


      def isPrime(n):
      return 1 in reduce(n)


      def primeFactorization(n):
      if isPrime(n):
      return reduce(n)
      factors = reduce(n)
      primeFactors =
      while True:
      for e in factors:
      if isPrime(e):
      primeFactors.append(e)
      factors.remove(e)
      else:
      factors.extend(reduce(e))
      factors.remove(e)
      if len(factors) == 0:
      break
      return sorted(primeFactors)









      share|improve this question















      Beginner programmer here. What can I improve in my code? Are there things I'm doing wrong or are there easier ways of doing what I did?



      def reduce(n):
      n = abs(n)
      ctr = 0
      factors_list =
      for i in range(2,n+1):
      if ctr >= 1:
      break
      if n % i == 0:
      factors_list.append(i)
      factors_list.append(int(n/i))
      ctr += 1
      return factors_list


      def isPrime(n):
      return 1 in reduce(n)


      def primeFactorization(n):
      if isPrime(n):
      return reduce(n)
      factors = reduce(n)
      primeFactors =
      while True:
      for e in factors:
      if isPrime(e):
      primeFactors.append(e)
      factors.remove(e)
      else:
      factors.extend(reduce(e))
      factors.remove(e)
      if len(factors) == 0:
      break
      return sorted(primeFactors)






      python beginner python-3.x primes






      share|improve this question















      share|improve this question













      share|improve this question




      share|improve this question








      edited Nov 26 at 5:40









      200_success

      127k15148412




      127k15148412










      asked Nov 26 at 0:23









      BrowserM

      162




      162






















          1 Answer
          1






          active

          oldest

          votes

















          up vote
          4
          down vote













          Documentation and tests



          Before improving your code, is it important to write tests for it to make you you don't break anything.



          As you do it, you might want that you need to be a bit clearer about the behavior of your function. Let's see what your functions return with a simple piece of code:



          def print_results():
          print("i, primeFactorization(i), reduce(i), isPrime(i)")
          for i in range(15):
          print(i, primeFactorization(i), reduce(i), isPrime(i))


          Most results seem ok but why do we sometimes have "1" in the return value for primeFactorisation. In theory, we should have only prime numbers in it.



          We could:




          • write doc to specify what the functions does (return prime factorisation)


          • write tests for the actual expected behavior


          • fix the code



          Here are various snippets I've written to test the code. In a more serious project, you could use a testing framework.



          def test_is_prime():
          primes = [2, 3, 5, 7, 11, 13]
          not_primes = [0, 1, 4, 6, 8, 9, 10, 12, 14, 15, 100, 100000]
          for p in primes:
          assert isPrime(p), p
          for np in not_primes:
          assert not isPrime(np), np

          def test_prime_factorization():
          prime_factorisations = {
          2: [2],
          3: [3],
          4: [2, 2],
          5: [5],
          6: [2, 3],
          7: [7],
          8: [2, 2, 2],
          9: [3, 3],
          10: [2, 5],
          11: [11],
          12: [2, 2, 3],
          13: [13],
          14: [2, 7],
          }
          for n, factors in prime_factorisations.items():
          ret = primeFactorization(n)
          assert ret == factors, str(n) + ": " + str(ret) + "!=" + str(factors)

          def test_prime_factorization_randomised():
          import random
          n = random.randint(2, 10000)
          ret = primeFactorization(n)
          m = 1
          assert sorted(ret) == ret, "return is not sorted for n:" + str(n)
          for p in ret:
          assert isPrime(p), "factor " + str(p) + " is not prime for n:" + str(n)
          m *= p
          assert m == n, "product of factors does not lead to original value:" + str(n) + ", " + str(m)

          if __name__ == '__main__':
          print_results()
          test_is_prime()
          test_prime_factorization()
          for i in range(300):
          test_prime_factorization_randomised()


          Now, it feels fuch safer to improve the code.



          Also, you could use this to perform benchmarks: compute the operations many times and/or on huge numbers to measure the time it takes.



          Style



          Python has a style guide called PEP 8. I highly recommend reading it (many times) and trying to apply it as much as possible.



          Among other things, the functions names should be in snake_case.



          Algorithm



          Splitting a problem into smaller problems and writing functions for these is usually a good idea.



          Unfortunately, I am not fully convinced that the reduce functions really helps you here.



          Let's see how things can be improved anyway.



          Small simplifications/optimisations



          We could use the divmod builtin to get both the quotient and the remainder of the division.



          The ctr variable seems useless. It is used to break after we've incremented it. We could just break directly.



          At this stage, we have:



          def reduce(n):
          n = abs(n)
          factors_list =
          for i in range(2,n+1):
          p, q = divmod(n, i)
          if q == 0:
          factors_list.append(i)
          factors_list.append(p)
          break
          return factors_list


          Now it is clear that we either add i and p to the list only once or we add nothing as all. We could make this clearer:



          def reduce(n):
          n = abs(n)
          for i in range(2,n+1):
          p, q = divmod(n, i)
          if q == 0:
          return [i, p]
          return


          Now, it is clear that the function:




          • returns in the cases 0 and 1

          • return [n, 1] when n is prime

          • return [d, n/d] where d is the smallest (prime) divisisors otherwise.


          Also, reduce is called more than needed: everytime we do



          if isPrime(n):
          ret = reduce(n)


          on a prime number, we actually perform reduce twice which is very expensice, in particular for primes as we iterate up to n.



          Thus, we could get an optimisation boost by writing:



          def primeFactorization(n):
          """Return the prime factorisation of n in sorted order."""
          factors = reduce(n)
          if 1 in factors:
          return [n]
          primeFactors =
          while True:
          for e in factors:
          new_factors = reduce(e)
          if 1 in new_factors:
          primeFactors.append(e)
          factors.remove(e)
          else:
          factors.extend(new_factors)
          factors.remove(e)
          if len(factors) == 0:
          break
          ret = sorted(primeFactors)
          return ret


          Another key hindsight is prime factorisation is that the smallest divisors of n is at most sqrt(n) which limits the range you have to look in.



          In your case, we could use this in reduce, change slightly how reduce behaves and write:



          def reduce(n):
          """Return [a, b] where a is the smallest divisor of n and n = a * b."""
          n = abs(n)
          for i in range(2, int(math.sqrt(n)) + 1):
          p, q = divmod(n, i)
          if q == 0:
          return [i, p]
          return [n, 1]


          def isPrime(n):
          """Return True if n is a prime number, False otherwise."""
          return n > 1 and reduce(n) == [n, 1]


          def primeFactorization(n):
          """Return the prime factorisation of n in sorted order."""
          factors = reduce(n)
          if factors == [n, 1]: # prime
          return [n]
          primeFactors =
          while True:
          for e in factors:
          new_factors = reduce(e)
          if new_factors == [e, 1]: # prime
          primeFactors.append(e)
          else:
          factors.extend(new_factors)
          factors.remove(e)
          if len(factors) == 0:
          break
          ret = sorted(primeFactors)
          return ret


          which is much faster.



          Then, you could get rid of:



              if len(factors) == 0:
          break


          by looping with while factors.



          Then using, list.pop(), you could get rid of remove (which takes a linear time):



          def primeFactorization(n):
          """Return the prime factorisation of n in sorted order."""
          factors = reduce(n)
          if factors == [n, 1]: # prime
          return [n]
          primeFactors =
          while factors:
          e = factors.pop()
          new_factors = reduce(e)
          if new_factors == [e, 1]: # prime
          primeFactors.append(e)
          else:
          factors.extend(new_factors)
          ret = sorted(primeFactors)
          return ret


          Then it appears that the initial check is not really required as the logic is already performed inside the loop:



          def primeFactorization(n):
          """Return the prime factorisation of n in sorted order."""
          primeFactors =
          factors = [n]
          while factors:
          e = factors.pop()
          new_factors = reduce(e)
          if new_factors == [e, 1]: # prime
          primeFactors.append(e)
          else:
          factors.extend(new_factors)
          ret = sorted(primeFactors)
          return ret


          We can actually get rid of the sorting by popping the first element of the list, thus generating the factors in order:



          def primeFactorization(n):
          """Return the prime factorisation of n in sorted order."""
          primeFactors =
          factors = [n]
          while factors:
          e = factors.pop(0)
          new_factors = reduce(e)
          if new_factors == [e, 1]: # prime
          primeFactors.append(e)
          else:
          factors.extend(new_factors)
          return primeFactors


          Now, instead of a reduce function, we could write a somehow equivalent but easier to use get_smallest_div function. Taking this chance to rename all functions, the whole code becomes:



          import math

          def get_smallest_div(n):
          """Return the smallest divisor of n."""
          n = abs(n)
          for i in range(2, int(math.sqrt(n)) + 1):
          p, q = divmod(n, i)
          if q == 0:
          return i
          return n


          def is_prime(n):
          """Return True if n is a prime number, False otherwise."""
          return n > 1 and get_smallest_div(n) == n

          def get_prime_factors(n):
          """Return the prime factorisation of n in sorted order."""
          prime_factors =
          factors = [n]
          while factors:
          n = factors.pop(0)
          div = get_smallest_div(n)
          if div == n: # prime
          prime_factors.append(n)
          else:
          factors.extend([div, n//div])
          return prime_factors

          def print_results():
          print("i, get_prime_factors(i), get_smallest_div(i), is_prime(i)")
          for i in range(15):
          print(i, get_prime_factors(i), get_smallest_div(i), is_prime(i))

          def test_is_prime():
          primes = [2, 3, 5, 7, 11, 13]
          not_primes = [0, 1, 4, 6, 8, 9, 10, 12, 14, 15, 100, 100000]
          for p in primes:
          assert is_prime(p), p
          for np in not_primes:
          assert not is_prime(np), np

          def test_prime_factorization():
          prime_factorisations = {
          2: [2],
          3: [3],
          4: [2, 2],
          5: [5],
          6: [2, 3],
          7: [7],
          8: [2, 2, 2],
          9: [3, 3],
          10: [2, 5],
          11: [11],
          12: [2, 2, 3],
          13: [13],
          14: [2, 7],
          }
          for n, factors in prime_factorisations.items():
          ret = get_prime_factors(n)
          assert ret == factors, str(n) + ": " + str(ret) + "!=" + str(factors)

          def test_prime_factorization_randomised():
          import random
          n = random.randint(2, 10000)
          ret = get_prime_factors(n)
          m = 1
          assert sorted(ret) == ret, "return is not sorted for n:" + str(n)
          for p in ret:
          assert is_prime(p), "factor " + str(p) + " is not prime for n:" + str(n)
          m *= p
          assert m == n, "product of factors does not lead to original value:" + str(n) + ", " + str(m)

          if __name__ == '__main__':
          start = time.perf_counter()
          import time
          print_results()
          test_is_prime()
          test_prime_factorization()
          for i in range(300):
          test_prime_factorization_randomised()
          print(get_prime_factors(9000000))
          print(get_prime_factors(9000000 * 701 * 701))
          print(time.perf_counter() - start)


          My solution for this



          If I was to write this from scratch, here is how I'd do it:



          def is_prime(n):
          """Return True if n is a prime number, False otherwise."""
          if n < 2:
          return False
          return all(n % i for i in range(2, int(math.sqrt(n)) + 1))

          def get_prime_factors(n):
          """Return the prime factorisation of n in sorted order."""
          prime_factors =
          d = 2
          while d * d <= n:
          while n % d == 0:
          n //= d
          prime_factors.append(d)
          d += 1
          if n > 1: # to avoid 1 as a factor
          assert d <= n
          prime_factors.append(n)
          return prime_factors





          share|improve this answer



















          • 2




            In one of the intermediate steps you do for e in factors and then later do factors.remove(e). Removing elements from a list while iterating over it produces interesting results, you should probably iterate over a copy of the list.
            – Graipher
            Nov 26 at 15:47






          • 1




            @Graipher this is part of the original code but this is definitly worth mentionning. It is probably worth an answer on its own.
            – Josay
            Nov 26 at 16:10











          Your Answer





          StackExchange.ifUsing("editor", function () {
          return StackExchange.using("mathjaxEditing", function () {
          StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
          StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
          });
          });
          }, "mathjax-editing");

          StackExchange.ifUsing("editor", function () {
          StackExchange.using("externalEditor", function () {
          StackExchange.using("snippets", function () {
          StackExchange.snippets.init();
          });
          });
          }, "code-snippets");

          StackExchange.ready(function() {
          var channelOptions = {
          tags: "".split(" "),
          id: "196"
          };
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function() {
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled) {
          StackExchange.using("snippets", function() {
          createEditor();
          });
          }
          else {
          createEditor();
          }
          });

          function createEditor() {
          StackExchange.prepareEditor({
          heartbeatType: 'answer',
          convertImagesToLinks: false,
          noModals: true,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: null,
          bindNavPrevention: true,
          postfix: "",
          imageUploader: {
          brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
          contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
          allowUrls: true
          },
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          });


          }
          });














          draft saved

          draft discarded


















          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f208410%2fprime-factorization-in-python%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown

























          1 Answer
          1






          active

          oldest

          votes








          1 Answer
          1






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes








          up vote
          4
          down vote













          Documentation and tests



          Before improving your code, is it important to write tests for it to make you you don't break anything.



          As you do it, you might want that you need to be a bit clearer about the behavior of your function. Let's see what your functions return with a simple piece of code:



          def print_results():
          print("i, primeFactorization(i), reduce(i), isPrime(i)")
          for i in range(15):
          print(i, primeFactorization(i), reduce(i), isPrime(i))


          Most results seem ok but why do we sometimes have "1" in the return value for primeFactorisation. In theory, we should have only prime numbers in it.



          We could:




          • write doc to specify what the functions does (return prime factorisation)


          • write tests for the actual expected behavior


          • fix the code



          Here are various snippets I've written to test the code. In a more serious project, you could use a testing framework.



          def test_is_prime():
          primes = [2, 3, 5, 7, 11, 13]
          not_primes = [0, 1, 4, 6, 8, 9, 10, 12, 14, 15, 100, 100000]
          for p in primes:
          assert isPrime(p), p
          for np in not_primes:
          assert not isPrime(np), np

          def test_prime_factorization():
          prime_factorisations = {
          2: [2],
          3: [3],
          4: [2, 2],
          5: [5],
          6: [2, 3],
          7: [7],
          8: [2, 2, 2],
          9: [3, 3],
          10: [2, 5],
          11: [11],
          12: [2, 2, 3],
          13: [13],
          14: [2, 7],
          }
          for n, factors in prime_factorisations.items():
          ret = primeFactorization(n)
          assert ret == factors, str(n) + ": " + str(ret) + "!=" + str(factors)

          def test_prime_factorization_randomised():
          import random
          n = random.randint(2, 10000)
          ret = primeFactorization(n)
          m = 1
          assert sorted(ret) == ret, "return is not sorted for n:" + str(n)
          for p in ret:
          assert isPrime(p), "factor " + str(p) + " is not prime for n:" + str(n)
          m *= p
          assert m == n, "product of factors does not lead to original value:" + str(n) + ", " + str(m)

          if __name__ == '__main__':
          print_results()
          test_is_prime()
          test_prime_factorization()
          for i in range(300):
          test_prime_factorization_randomised()


          Now, it feels fuch safer to improve the code.



          Also, you could use this to perform benchmarks: compute the operations many times and/or on huge numbers to measure the time it takes.



          Style



          Python has a style guide called PEP 8. I highly recommend reading it (many times) and trying to apply it as much as possible.



          Among other things, the functions names should be in snake_case.



          Algorithm



          Splitting a problem into smaller problems and writing functions for these is usually a good idea.



          Unfortunately, I am not fully convinced that the reduce functions really helps you here.



          Let's see how things can be improved anyway.



          Small simplifications/optimisations



          We could use the divmod builtin to get both the quotient and the remainder of the division.



          The ctr variable seems useless. It is used to break after we've incremented it. We could just break directly.



          At this stage, we have:



          def reduce(n):
          n = abs(n)
          factors_list =
          for i in range(2,n+1):
          p, q = divmod(n, i)
          if q == 0:
          factors_list.append(i)
          factors_list.append(p)
          break
          return factors_list


          Now it is clear that we either add i and p to the list only once or we add nothing as all. We could make this clearer:



          def reduce(n):
          n = abs(n)
          for i in range(2,n+1):
          p, q = divmod(n, i)
          if q == 0:
          return [i, p]
          return


          Now, it is clear that the function:




          • returns in the cases 0 and 1

          • return [n, 1] when n is prime

          • return [d, n/d] where d is the smallest (prime) divisisors otherwise.


          Also, reduce is called more than needed: everytime we do



          if isPrime(n):
          ret = reduce(n)


          on a prime number, we actually perform reduce twice which is very expensice, in particular for primes as we iterate up to n.



          Thus, we could get an optimisation boost by writing:



          def primeFactorization(n):
          """Return the prime factorisation of n in sorted order."""
          factors = reduce(n)
          if 1 in factors:
          return [n]
          primeFactors =
          while True:
          for e in factors:
          new_factors = reduce(e)
          if 1 in new_factors:
          primeFactors.append(e)
          factors.remove(e)
          else:
          factors.extend(new_factors)
          factors.remove(e)
          if len(factors) == 0:
          break
          ret = sorted(primeFactors)
          return ret


          Another key hindsight is prime factorisation is that the smallest divisors of n is at most sqrt(n) which limits the range you have to look in.



          In your case, we could use this in reduce, change slightly how reduce behaves and write:



          def reduce(n):
          """Return [a, b] where a is the smallest divisor of n and n = a * b."""
          n = abs(n)
          for i in range(2, int(math.sqrt(n)) + 1):
          p, q = divmod(n, i)
          if q == 0:
          return [i, p]
          return [n, 1]


          def isPrime(n):
          """Return True if n is a prime number, False otherwise."""
          return n > 1 and reduce(n) == [n, 1]


          def primeFactorization(n):
          """Return the prime factorisation of n in sorted order."""
          factors = reduce(n)
          if factors == [n, 1]: # prime
          return [n]
          primeFactors =
          while True:
          for e in factors:
          new_factors = reduce(e)
          if new_factors == [e, 1]: # prime
          primeFactors.append(e)
          else:
          factors.extend(new_factors)
          factors.remove(e)
          if len(factors) == 0:
          break
          ret = sorted(primeFactors)
          return ret


          which is much faster.



          Then, you could get rid of:



              if len(factors) == 0:
          break


          by looping with while factors.



          Then using, list.pop(), you could get rid of remove (which takes a linear time):



          def primeFactorization(n):
          """Return the prime factorisation of n in sorted order."""
          factors = reduce(n)
          if factors == [n, 1]: # prime
          return [n]
          primeFactors =
          while factors:
          e = factors.pop()
          new_factors = reduce(e)
          if new_factors == [e, 1]: # prime
          primeFactors.append(e)
          else:
          factors.extend(new_factors)
          ret = sorted(primeFactors)
          return ret


          Then it appears that the initial check is not really required as the logic is already performed inside the loop:



          def primeFactorization(n):
          """Return the prime factorisation of n in sorted order."""
          primeFactors =
          factors = [n]
          while factors:
          e = factors.pop()
          new_factors = reduce(e)
          if new_factors == [e, 1]: # prime
          primeFactors.append(e)
          else:
          factors.extend(new_factors)
          ret = sorted(primeFactors)
          return ret


          We can actually get rid of the sorting by popping the first element of the list, thus generating the factors in order:



          def primeFactorization(n):
          """Return the prime factorisation of n in sorted order."""
          primeFactors =
          factors = [n]
          while factors:
          e = factors.pop(0)
          new_factors = reduce(e)
          if new_factors == [e, 1]: # prime
          primeFactors.append(e)
          else:
          factors.extend(new_factors)
          return primeFactors


          Now, instead of a reduce function, we could write a somehow equivalent but easier to use get_smallest_div function. Taking this chance to rename all functions, the whole code becomes:



          import math

          def get_smallest_div(n):
          """Return the smallest divisor of n."""
          n = abs(n)
          for i in range(2, int(math.sqrt(n)) + 1):
          p, q = divmod(n, i)
          if q == 0:
          return i
          return n


          def is_prime(n):
          """Return True if n is a prime number, False otherwise."""
          return n > 1 and get_smallest_div(n) == n

          def get_prime_factors(n):
          """Return the prime factorisation of n in sorted order."""
          prime_factors =
          factors = [n]
          while factors:
          n = factors.pop(0)
          div = get_smallest_div(n)
          if div == n: # prime
          prime_factors.append(n)
          else:
          factors.extend([div, n//div])
          return prime_factors

          def print_results():
          print("i, get_prime_factors(i), get_smallest_div(i), is_prime(i)")
          for i in range(15):
          print(i, get_prime_factors(i), get_smallest_div(i), is_prime(i))

          def test_is_prime():
          primes = [2, 3, 5, 7, 11, 13]
          not_primes = [0, 1, 4, 6, 8, 9, 10, 12, 14, 15, 100, 100000]
          for p in primes:
          assert is_prime(p), p
          for np in not_primes:
          assert not is_prime(np), np

          def test_prime_factorization():
          prime_factorisations = {
          2: [2],
          3: [3],
          4: [2, 2],
          5: [5],
          6: [2, 3],
          7: [7],
          8: [2, 2, 2],
          9: [3, 3],
          10: [2, 5],
          11: [11],
          12: [2, 2, 3],
          13: [13],
          14: [2, 7],
          }
          for n, factors in prime_factorisations.items():
          ret = get_prime_factors(n)
          assert ret == factors, str(n) + ": " + str(ret) + "!=" + str(factors)

          def test_prime_factorization_randomised():
          import random
          n = random.randint(2, 10000)
          ret = get_prime_factors(n)
          m = 1
          assert sorted(ret) == ret, "return is not sorted for n:" + str(n)
          for p in ret:
          assert is_prime(p), "factor " + str(p) + " is not prime for n:" + str(n)
          m *= p
          assert m == n, "product of factors does not lead to original value:" + str(n) + ", " + str(m)

          if __name__ == '__main__':
          start = time.perf_counter()
          import time
          print_results()
          test_is_prime()
          test_prime_factorization()
          for i in range(300):
          test_prime_factorization_randomised()
          print(get_prime_factors(9000000))
          print(get_prime_factors(9000000 * 701 * 701))
          print(time.perf_counter() - start)


          My solution for this



          If I was to write this from scratch, here is how I'd do it:



          def is_prime(n):
          """Return True if n is a prime number, False otherwise."""
          if n < 2:
          return False
          return all(n % i for i in range(2, int(math.sqrt(n)) + 1))

          def get_prime_factors(n):
          """Return the prime factorisation of n in sorted order."""
          prime_factors =
          d = 2
          while d * d <= n:
          while n % d == 0:
          n //= d
          prime_factors.append(d)
          d += 1
          if n > 1: # to avoid 1 as a factor
          assert d <= n
          prime_factors.append(n)
          return prime_factors





          share|improve this answer



















          • 2




            In one of the intermediate steps you do for e in factors and then later do factors.remove(e). Removing elements from a list while iterating over it produces interesting results, you should probably iterate over a copy of the list.
            – Graipher
            Nov 26 at 15:47






          • 1




            @Graipher this is part of the original code but this is definitly worth mentionning. It is probably worth an answer on its own.
            – Josay
            Nov 26 at 16:10















          up vote
          4
          down vote













          Documentation and tests



          Before improving your code, is it important to write tests for it to make you you don't break anything.



          As you do it, you might want that you need to be a bit clearer about the behavior of your function. Let's see what your functions return with a simple piece of code:



          def print_results():
          print("i, primeFactorization(i), reduce(i), isPrime(i)")
          for i in range(15):
          print(i, primeFactorization(i), reduce(i), isPrime(i))


          Most results seem ok but why do we sometimes have "1" in the return value for primeFactorisation. In theory, we should have only prime numbers in it.



          We could:




          • write doc to specify what the functions does (return prime factorisation)


          • write tests for the actual expected behavior


          • fix the code



          Here are various snippets I've written to test the code. In a more serious project, you could use a testing framework.



          def test_is_prime():
          primes = [2, 3, 5, 7, 11, 13]
          not_primes = [0, 1, 4, 6, 8, 9, 10, 12, 14, 15, 100, 100000]
          for p in primes:
          assert isPrime(p), p
          for np in not_primes:
          assert not isPrime(np), np

          def test_prime_factorization():
          prime_factorisations = {
          2: [2],
          3: [3],
          4: [2, 2],
          5: [5],
          6: [2, 3],
          7: [7],
          8: [2, 2, 2],
          9: [3, 3],
          10: [2, 5],
          11: [11],
          12: [2, 2, 3],
          13: [13],
          14: [2, 7],
          }
          for n, factors in prime_factorisations.items():
          ret = primeFactorization(n)
          assert ret == factors, str(n) + ": " + str(ret) + "!=" + str(factors)

          def test_prime_factorization_randomised():
          import random
          n = random.randint(2, 10000)
          ret = primeFactorization(n)
          m = 1
          assert sorted(ret) == ret, "return is not sorted for n:" + str(n)
          for p in ret:
          assert isPrime(p), "factor " + str(p) + " is not prime for n:" + str(n)
          m *= p
          assert m == n, "product of factors does not lead to original value:" + str(n) + ", " + str(m)

          if __name__ == '__main__':
          print_results()
          test_is_prime()
          test_prime_factorization()
          for i in range(300):
          test_prime_factorization_randomised()


          Now, it feels fuch safer to improve the code.



          Also, you could use this to perform benchmarks: compute the operations many times and/or on huge numbers to measure the time it takes.



          Style



          Python has a style guide called PEP 8. I highly recommend reading it (many times) and trying to apply it as much as possible.



          Among other things, the functions names should be in snake_case.



          Algorithm



          Splitting a problem into smaller problems and writing functions for these is usually a good idea.



          Unfortunately, I am not fully convinced that the reduce functions really helps you here.



          Let's see how things can be improved anyway.



          Small simplifications/optimisations



          We could use the divmod builtin to get both the quotient and the remainder of the division.



          The ctr variable seems useless. It is used to break after we've incremented it. We could just break directly.



          At this stage, we have:



          def reduce(n):
          n = abs(n)
          factors_list =
          for i in range(2,n+1):
          p, q = divmod(n, i)
          if q == 0:
          factors_list.append(i)
          factors_list.append(p)
          break
          return factors_list


          Now it is clear that we either add i and p to the list only once or we add nothing as all. We could make this clearer:



          def reduce(n):
          n = abs(n)
          for i in range(2,n+1):
          p, q = divmod(n, i)
          if q == 0:
          return [i, p]
          return


          Now, it is clear that the function:




          • returns in the cases 0 and 1

          • return [n, 1] when n is prime

          • return [d, n/d] where d is the smallest (prime) divisisors otherwise.


          Also, reduce is called more than needed: everytime we do



          if isPrime(n):
          ret = reduce(n)


          on a prime number, we actually perform reduce twice which is very expensice, in particular for primes as we iterate up to n.



          Thus, we could get an optimisation boost by writing:



          def primeFactorization(n):
          """Return the prime factorisation of n in sorted order."""
          factors = reduce(n)
          if 1 in factors:
          return [n]
          primeFactors =
          while True:
          for e in factors:
          new_factors = reduce(e)
          if 1 in new_factors:
          primeFactors.append(e)
          factors.remove(e)
          else:
          factors.extend(new_factors)
          factors.remove(e)
          if len(factors) == 0:
          break
          ret = sorted(primeFactors)
          return ret


          Another key hindsight is prime factorisation is that the smallest divisors of n is at most sqrt(n) which limits the range you have to look in.



          In your case, we could use this in reduce, change slightly how reduce behaves and write:



          def reduce(n):
          """Return [a, b] where a is the smallest divisor of n and n = a * b."""
          n = abs(n)
          for i in range(2, int(math.sqrt(n)) + 1):
          p, q = divmod(n, i)
          if q == 0:
          return [i, p]
          return [n, 1]


          def isPrime(n):
          """Return True if n is a prime number, False otherwise."""
          return n > 1 and reduce(n) == [n, 1]


          def primeFactorization(n):
          """Return the prime factorisation of n in sorted order."""
          factors = reduce(n)
          if factors == [n, 1]: # prime
          return [n]
          primeFactors =
          while True:
          for e in factors:
          new_factors = reduce(e)
          if new_factors == [e, 1]: # prime
          primeFactors.append(e)
          else:
          factors.extend(new_factors)
          factors.remove(e)
          if len(factors) == 0:
          break
          ret = sorted(primeFactors)
          return ret


          which is much faster.



          Then, you could get rid of:



              if len(factors) == 0:
          break


          by looping with while factors.



          Then using, list.pop(), you could get rid of remove (which takes a linear time):



          def primeFactorization(n):
          """Return the prime factorisation of n in sorted order."""
          factors = reduce(n)
          if factors == [n, 1]: # prime
          return [n]
          primeFactors =
          while factors:
          e = factors.pop()
          new_factors = reduce(e)
          if new_factors == [e, 1]: # prime
          primeFactors.append(e)
          else:
          factors.extend(new_factors)
          ret = sorted(primeFactors)
          return ret


          Then it appears that the initial check is not really required as the logic is already performed inside the loop:



          def primeFactorization(n):
          """Return the prime factorisation of n in sorted order."""
          primeFactors =
          factors = [n]
          while factors:
          e = factors.pop()
          new_factors = reduce(e)
          if new_factors == [e, 1]: # prime
          primeFactors.append(e)
          else:
          factors.extend(new_factors)
          ret = sorted(primeFactors)
          return ret


          We can actually get rid of the sorting by popping the first element of the list, thus generating the factors in order:



          def primeFactorization(n):
          """Return the prime factorisation of n in sorted order."""
          primeFactors =
          factors = [n]
          while factors:
          e = factors.pop(0)
          new_factors = reduce(e)
          if new_factors == [e, 1]: # prime
          primeFactors.append(e)
          else:
          factors.extend(new_factors)
          return primeFactors


          Now, instead of a reduce function, we could write a somehow equivalent but easier to use get_smallest_div function. Taking this chance to rename all functions, the whole code becomes:



          import math

          def get_smallest_div(n):
          """Return the smallest divisor of n."""
          n = abs(n)
          for i in range(2, int(math.sqrt(n)) + 1):
          p, q = divmod(n, i)
          if q == 0:
          return i
          return n


          def is_prime(n):
          """Return True if n is a prime number, False otherwise."""
          return n > 1 and get_smallest_div(n) == n

          def get_prime_factors(n):
          """Return the prime factorisation of n in sorted order."""
          prime_factors =
          factors = [n]
          while factors:
          n = factors.pop(0)
          div = get_smallest_div(n)
          if div == n: # prime
          prime_factors.append(n)
          else:
          factors.extend([div, n//div])
          return prime_factors

          def print_results():
          print("i, get_prime_factors(i), get_smallest_div(i), is_prime(i)")
          for i in range(15):
          print(i, get_prime_factors(i), get_smallest_div(i), is_prime(i))

          def test_is_prime():
          primes = [2, 3, 5, 7, 11, 13]
          not_primes = [0, 1, 4, 6, 8, 9, 10, 12, 14, 15, 100, 100000]
          for p in primes:
          assert is_prime(p), p
          for np in not_primes:
          assert not is_prime(np), np

          def test_prime_factorization():
          prime_factorisations = {
          2: [2],
          3: [3],
          4: [2, 2],
          5: [5],
          6: [2, 3],
          7: [7],
          8: [2, 2, 2],
          9: [3, 3],
          10: [2, 5],
          11: [11],
          12: [2, 2, 3],
          13: [13],
          14: [2, 7],
          }
          for n, factors in prime_factorisations.items():
          ret = get_prime_factors(n)
          assert ret == factors, str(n) + ": " + str(ret) + "!=" + str(factors)

          def test_prime_factorization_randomised():
          import random
          n = random.randint(2, 10000)
          ret = get_prime_factors(n)
          m = 1
          assert sorted(ret) == ret, "return is not sorted for n:" + str(n)
          for p in ret:
          assert is_prime(p), "factor " + str(p) + " is not prime for n:" + str(n)
          m *= p
          assert m == n, "product of factors does not lead to original value:" + str(n) + ", " + str(m)

          if __name__ == '__main__':
          start = time.perf_counter()
          import time
          print_results()
          test_is_prime()
          test_prime_factorization()
          for i in range(300):
          test_prime_factorization_randomised()
          print(get_prime_factors(9000000))
          print(get_prime_factors(9000000 * 701 * 701))
          print(time.perf_counter() - start)


          My solution for this



          If I was to write this from scratch, here is how I'd do it:



          def is_prime(n):
          """Return True if n is a prime number, False otherwise."""
          if n < 2:
          return False
          return all(n % i for i in range(2, int(math.sqrt(n)) + 1))

          def get_prime_factors(n):
          """Return the prime factorisation of n in sorted order."""
          prime_factors =
          d = 2
          while d * d <= n:
          while n % d == 0:
          n //= d
          prime_factors.append(d)
          d += 1
          if n > 1: # to avoid 1 as a factor
          assert d <= n
          prime_factors.append(n)
          return prime_factors





          share|improve this answer



















          • 2




            In one of the intermediate steps you do for e in factors and then later do factors.remove(e). Removing elements from a list while iterating over it produces interesting results, you should probably iterate over a copy of the list.
            – Graipher
            Nov 26 at 15:47






          • 1




            @Graipher this is part of the original code but this is definitly worth mentionning. It is probably worth an answer on its own.
            – Josay
            Nov 26 at 16:10













          up vote
          4
          down vote










          up vote
          4
          down vote









          Documentation and tests



          Before improving your code, is it important to write tests for it to make you you don't break anything.



          As you do it, you might want that you need to be a bit clearer about the behavior of your function. Let's see what your functions return with a simple piece of code:



          def print_results():
          print("i, primeFactorization(i), reduce(i), isPrime(i)")
          for i in range(15):
          print(i, primeFactorization(i), reduce(i), isPrime(i))


          Most results seem ok but why do we sometimes have "1" in the return value for primeFactorisation. In theory, we should have only prime numbers in it.



          We could:




          • write doc to specify what the functions does (return prime factorisation)


          • write tests for the actual expected behavior


          • fix the code



          Here are various snippets I've written to test the code. In a more serious project, you could use a testing framework.



          def test_is_prime():
          primes = [2, 3, 5, 7, 11, 13]
          not_primes = [0, 1, 4, 6, 8, 9, 10, 12, 14, 15, 100, 100000]
          for p in primes:
          assert isPrime(p), p
          for np in not_primes:
          assert not isPrime(np), np

          def test_prime_factorization():
          prime_factorisations = {
          2: [2],
          3: [3],
          4: [2, 2],
          5: [5],
          6: [2, 3],
          7: [7],
          8: [2, 2, 2],
          9: [3, 3],
          10: [2, 5],
          11: [11],
          12: [2, 2, 3],
          13: [13],
          14: [2, 7],
          }
          for n, factors in prime_factorisations.items():
          ret = primeFactorization(n)
          assert ret == factors, str(n) + ": " + str(ret) + "!=" + str(factors)

          def test_prime_factorization_randomised():
          import random
          n = random.randint(2, 10000)
          ret = primeFactorization(n)
          m = 1
          assert sorted(ret) == ret, "return is not sorted for n:" + str(n)
          for p in ret:
          assert isPrime(p), "factor " + str(p) + " is not prime for n:" + str(n)
          m *= p
          assert m == n, "product of factors does not lead to original value:" + str(n) + ", " + str(m)

          if __name__ == '__main__':
          print_results()
          test_is_prime()
          test_prime_factorization()
          for i in range(300):
          test_prime_factorization_randomised()


          Now, it feels fuch safer to improve the code.



          Also, you could use this to perform benchmarks: compute the operations many times and/or on huge numbers to measure the time it takes.



          Style



          Python has a style guide called PEP 8. I highly recommend reading it (many times) and trying to apply it as much as possible.



          Among other things, the functions names should be in snake_case.



          Algorithm



          Splitting a problem into smaller problems and writing functions for these is usually a good idea.



          Unfortunately, I am not fully convinced that the reduce functions really helps you here.



          Let's see how things can be improved anyway.



          Small simplifications/optimisations



          We could use the divmod builtin to get both the quotient and the remainder of the division.



          The ctr variable seems useless. It is used to break after we've incremented it. We could just break directly.



          At this stage, we have:



          def reduce(n):
          n = abs(n)
          factors_list =
          for i in range(2,n+1):
          p, q = divmod(n, i)
          if q == 0:
          factors_list.append(i)
          factors_list.append(p)
          break
          return factors_list


          Now it is clear that we either add i and p to the list only once or we add nothing as all. We could make this clearer:



          def reduce(n):
          n = abs(n)
          for i in range(2,n+1):
          p, q = divmod(n, i)
          if q == 0:
          return [i, p]
          return


          Now, it is clear that the function:




          • returns in the cases 0 and 1

          • return [n, 1] when n is prime

          • return [d, n/d] where d is the smallest (prime) divisisors otherwise.


          Also, reduce is called more than needed: everytime we do



          if isPrime(n):
          ret = reduce(n)


          on a prime number, we actually perform reduce twice which is very expensice, in particular for primes as we iterate up to n.



          Thus, we could get an optimisation boost by writing:



          def primeFactorization(n):
          """Return the prime factorisation of n in sorted order."""
          factors = reduce(n)
          if 1 in factors:
          return [n]
          primeFactors =
          while True:
          for e in factors:
          new_factors = reduce(e)
          if 1 in new_factors:
          primeFactors.append(e)
          factors.remove(e)
          else:
          factors.extend(new_factors)
          factors.remove(e)
          if len(factors) == 0:
          break
          ret = sorted(primeFactors)
          return ret


          Another key hindsight is prime factorisation is that the smallest divisors of n is at most sqrt(n) which limits the range you have to look in.



          In your case, we could use this in reduce, change slightly how reduce behaves and write:



          def reduce(n):
          """Return [a, b] where a is the smallest divisor of n and n = a * b."""
          n = abs(n)
          for i in range(2, int(math.sqrt(n)) + 1):
          p, q = divmod(n, i)
          if q == 0:
          return [i, p]
          return [n, 1]


          def isPrime(n):
          """Return True if n is a prime number, False otherwise."""
          return n > 1 and reduce(n) == [n, 1]


          def primeFactorization(n):
          """Return the prime factorisation of n in sorted order."""
          factors = reduce(n)
          if factors == [n, 1]: # prime
          return [n]
          primeFactors =
          while True:
          for e in factors:
          new_factors = reduce(e)
          if new_factors == [e, 1]: # prime
          primeFactors.append(e)
          else:
          factors.extend(new_factors)
          factors.remove(e)
          if len(factors) == 0:
          break
          ret = sorted(primeFactors)
          return ret


          which is much faster.



          Then, you could get rid of:



              if len(factors) == 0:
          break


          by looping with while factors.



          Then using, list.pop(), you could get rid of remove (which takes a linear time):



          def primeFactorization(n):
          """Return the prime factorisation of n in sorted order."""
          factors = reduce(n)
          if factors == [n, 1]: # prime
          return [n]
          primeFactors =
          while factors:
          e = factors.pop()
          new_factors = reduce(e)
          if new_factors == [e, 1]: # prime
          primeFactors.append(e)
          else:
          factors.extend(new_factors)
          ret = sorted(primeFactors)
          return ret


          Then it appears that the initial check is not really required as the logic is already performed inside the loop:



          def primeFactorization(n):
          """Return the prime factorisation of n in sorted order."""
          primeFactors =
          factors = [n]
          while factors:
          e = factors.pop()
          new_factors = reduce(e)
          if new_factors == [e, 1]: # prime
          primeFactors.append(e)
          else:
          factors.extend(new_factors)
          ret = sorted(primeFactors)
          return ret


          We can actually get rid of the sorting by popping the first element of the list, thus generating the factors in order:



          def primeFactorization(n):
          """Return the prime factorisation of n in sorted order."""
          primeFactors =
          factors = [n]
          while factors:
          e = factors.pop(0)
          new_factors = reduce(e)
          if new_factors == [e, 1]: # prime
          primeFactors.append(e)
          else:
          factors.extend(new_factors)
          return primeFactors


          Now, instead of a reduce function, we could write a somehow equivalent but easier to use get_smallest_div function. Taking this chance to rename all functions, the whole code becomes:



          import math

          def get_smallest_div(n):
          """Return the smallest divisor of n."""
          n = abs(n)
          for i in range(2, int(math.sqrt(n)) + 1):
          p, q = divmod(n, i)
          if q == 0:
          return i
          return n


          def is_prime(n):
          """Return True if n is a prime number, False otherwise."""
          return n > 1 and get_smallest_div(n) == n

          def get_prime_factors(n):
          """Return the prime factorisation of n in sorted order."""
          prime_factors =
          factors = [n]
          while factors:
          n = factors.pop(0)
          div = get_smallest_div(n)
          if div == n: # prime
          prime_factors.append(n)
          else:
          factors.extend([div, n//div])
          return prime_factors

          def print_results():
          print("i, get_prime_factors(i), get_smallest_div(i), is_prime(i)")
          for i in range(15):
          print(i, get_prime_factors(i), get_smallest_div(i), is_prime(i))

          def test_is_prime():
          primes = [2, 3, 5, 7, 11, 13]
          not_primes = [0, 1, 4, 6, 8, 9, 10, 12, 14, 15, 100, 100000]
          for p in primes:
          assert is_prime(p), p
          for np in not_primes:
          assert not is_prime(np), np

          def test_prime_factorization():
          prime_factorisations = {
          2: [2],
          3: [3],
          4: [2, 2],
          5: [5],
          6: [2, 3],
          7: [7],
          8: [2, 2, 2],
          9: [3, 3],
          10: [2, 5],
          11: [11],
          12: [2, 2, 3],
          13: [13],
          14: [2, 7],
          }
          for n, factors in prime_factorisations.items():
          ret = get_prime_factors(n)
          assert ret == factors, str(n) + ": " + str(ret) + "!=" + str(factors)

          def test_prime_factorization_randomised():
          import random
          n = random.randint(2, 10000)
          ret = get_prime_factors(n)
          m = 1
          assert sorted(ret) == ret, "return is not sorted for n:" + str(n)
          for p in ret:
          assert is_prime(p), "factor " + str(p) + " is not prime for n:" + str(n)
          m *= p
          assert m == n, "product of factors does not lead to original value:" + str(n) + ", " + str(m)

          if __name__ == '__main__':
          start = time.perf_counter()
          import time
          print_results()
          test_is_prime()
          test_prime_factorization()
          for i in range(300):
          test_prime_factorization_randomised()
          print(get_prime_factors(9000000))
          print(get_prime_factors(9000000 * 701 * 701))
          print(time.perf_counter() - start)


          My solution for this



          If I was to write this from scratch, here is how I'd do it:



          def is_prime(n):
          """Return True if n is a prime number, False otherwise."""
          if n < 2:
          return False
          return all(n % i for i in range(2, int(math.sqrt(n)) + 1))

          def get_prime_factors(n):
          """Return the prime factorisation of n in sorted order."""
          prime_factors =
          d = 2
          while d * d <= n:
          while n % d == 0:
          n //= d
          prime_factors.append(d)
          d += 1
          if n > 1: # to avoid 1 as a factor
          assert d <= n
          prime_factors.append(n)
          return prime_factors





          share|improve this answer














          Documentation and tests



          Before improving your code, is it important to write tests for it to make you you don't break anything.



          As you do it, you might want that you need to be a bit clearer about the behavior of your function. Let's see what your functions return with a simple piece of code:



          def print_results():
          print("i, primeFactorization(i), reduce(i), isPrime(i)")
          for i in range(15):
          print(i, primeFactorization(i), reduce(i), isPrime(i))


          Most results seem ok but why do we sometimes have "1" in the return value for primeFactorisation. In theory, we should have only prime numbers in it.



          We could:




          • write doc to specify what the functions does (return prime factorisation)


          • write tests for the actual expected behavior


          • fix the code



          Here are various snippets I've written to test the code. In a more serious project, you could use a testing framework.



          def test_is_prime():
          primes = [2, 3, 5, 7, 11, 13]
          not_primes = [0, 1, 4, 6, 8, 9, 10, 12, 14, 15, 100, 100000]
          for p in primes:
          assert isPrime(p), p
          for np in not_primes:
          assert not isPrime(np), np

          def test_prime_factorization():
          prime_factorisations = {
          2: [2],
          3: [3],
          4: [2, 2],
          5: [5],
          6: [2, 3],
          7: [7],
          8: [2, 2, 2],
          9: [3, 3],
          10: [2, 5],
          11: [11],
          12: [2, 2, 3],
          13: [13],
          14: [2, 7],
          }
          for n, factors in prime_factorisations.items():
          ret = primeFactorization(n)
          assert ret == factors, str(n) + ": " + str(ret) + "!=" + str(factors)

          def test_prime_factorization_randomised():
          import random
          n = random.randint(2, 10000)
          ret = primeFactorization(n)
          m = 1
          assert sorted(ret) == ret, "return is not sorted for n:" + str(n)
          for p in ret:
          assert isPrime(p), "factor " + str(p) + " is not prime for n:" + str(n)
          m *= p
          assert m == n, "product of factors does not lead to original value:" + str(n) + ", " + str(m)

          if __name__ == '__main__':
          print_results()
          test_is_prime()
          test_prime_factorization()
          for i in range(300):
          test_prime_factorization_randomised()


          Now, it feels fuch safer to improve the code.



          Also, you could use this to perform benchmarks: compute the operations many times and/or on huge numbers to measure the time it takes.



          Style



          Python has a style guide called PEP 8. I highly recommend reading it (many times) and trying to apply it as much as possible.



          Among other things, the functions names should be in snake_case.



          Algorithm



          Splitting a problem into smaller problems and writing functions for these is usually a good idea.



          Unfortunately, I am not fully convinced that the reduce functions really helps you here.



          Let's see how things can be improved anyway.



          Small simplifications/optimisations



          We could use the divmod builtin to get both the quotient and the remainder of the division.



          The ctr variable seems useless. It is used to break after we've incremented it. We could just break directly.



          At this stage, we have:



          def reduce(n):
          n = abs(n)
          factors_list =
          for i in range(2,n+1):
          p, q = divmod(n, i)
          if q == 0:
          factors_list.append(i)
          factors_list.append(p)
          break
          return factors_list


          Now it is clear that we either add i and p to the list only once or we add nothing as all. We could make this clearer:



          def reduce(n):
          n = abs(n)
          for i in range(2,n+1):
          p, q = divmod(n, i)
          if q == 0:
          return [i, p]
          return


          Now, it is clear that the function:




          • returns in the cases 0 and 1

          • return [n, 1] when n is prime

          • return [d, n/d] where d is the smallest (prime) divisisors otherwise.


          Also, reduce is called more than needed: everytime we do



          if isPrime(n):
          ret = reduce(n)


          on a prime number, we actually perform reduce twice which is very expensice, in particular for primes as we iterate up to n.



          Thus, we could get an optimisation boost by writing:



          def primeFactorization(n):
          """Return the prime factorisation of n in sorted order."""
          factors = reduce(n)
          if 1 in factors:
          return [n]
          primeFactors =
          while True:
          for e in factors:
          new_factors = reduce(e)
          if 1 in new_factors:
          primeFactors.append(e)
          factors.remove(e)
          else:
          factors.extend(new_factors)
          factors.remove(e)
          if len(factors) == 0:
          break
          ret = sorted(primeFactors)
          return ret


          Another key hindsight is prime factorisation is that the smallest divisors of n is at most sqrt(n) which limits the range you have to look in.



          In your case, we could use this in reduce, change slightly how reduce behaves and write:



          def reduce(n):
          """Return [a, b] where a is the smallest divisor of n and n = a * b."""
          n = abs(n)
          for i in range(2, int(math.sqrt(n)) + 1):
          p, q = divmod(n, i)
          if q == 0:
          return [i, p]
          return [n, 1]


          def isPrime(n):
          """Return True if n is a prime number, False otherwise."""
          return n > 1 and reduce(n) == [n, 1]


          def primeFactorization(n):
          """Return the prime factorisation of n in sorted order."""
          factors = reduce(n)
          if factors == [n, 1]: # prime
          return [n]
          primeFactors =
          while True:
          for e in factors:
          new_factors = reduce(e)
          if new_factors == [e, 1]: # prime
          primeFactors.append(e)
          else:
          factors.extend(new_factors)
          factors.remove(e)
          if len(factors) == 0:
          break
          ret = sorted(primeFactors)
          return ret


          which is much faster.



          Then, you could get rid of:



              if len(factors) == 0:
          break


          by looping with while factors.



          Then using, list.pop(), you could get rid of remove (which takes a linear time):



          def primeFactorization(n):
          """Return the prime factorisation of n in sorted order."""
          factors = reduce(n)
          if factors == [n, 1]: # prime
          return [n]
          primeFactors =
          while factors:
          e = factors.pop()
          new_factors = reduce(e)
          if new_factors == [e, 1]: # prime
          primeFactors.append(e)
          else:
          factors.extend(new_factors)
          ret = sorted(primeFactors)
          return ret


          Then it appears that the initial check is not really required as the logic is already performed inside the loop:



          def primeFactorization(n):
          """Return the prime factorisation of n in sorted order."""
          primeFactors =
          factors = [n]
          while factors:
          e = factors.pop()
          new_factors = reduce(e)
          if new_factors == [e, 1]: # prime
          primeFactors.append(e)
          else:
          factors.extend(new_factors)
          ret = sorted(primeFactors)
          return ret


          We can actually get rid of the sorting by popping the first element of the list, thus generating the factors in order:



          def primeFactorization(n):
          """Return the prime factorisation of n in sorted order."""
          primeFactors =
          factors = [n]
          while factors:
          e = factors.pop(0)
          new_factors = reduce(e)
          if new_factors == [e, 1]: # prime
          primeFactors.append(e)
          else:
          factors.extend(new_factors)
          return primeFactors


          Now, instead of a reduce function, we could write a somehow equivalent but easier to use get_smallest_div function. Taking this chance to rename all functions, the whole code becomes:



          import math

          def get_smallest_div(n):
          """Return the smallest divisor of n."""
          n = abs(n)
          for i in range(2, int(math.sqrt(n)) + 1):
          p, q = divmod(n, i)
          if q == 0:
          return i
          return n


          def is_prime(n):
          """Return True if n is a prime number, False otherwise."""
          return n > 1 and get_smallest_div(n) == n

          def get_prime_factors(n):
          """Return the prime factorisation of n in sorted order."""
          prime_factors =
          factors = [n]
          while factors:
          n = factors.pop(0)
          div = get_smallest_div(n)
          if div == n: # prime
          prime_factors.append(n)
          else:
          factors.extend([div, n//div])
          return prime_factors

          def print_results():
          print("i, get_prime_factors(i), get_smallest_div(i), is_prime(i)")
          for i in range(15):
          print(i, get_prime_factors(i), get_smallest_div(i), is_prime(i))

          def test_is_prime():
          primes = [2, 3, 5, 7, 11, 13]
          not_primes = [0, 1, 4, 6, 8, 9, 10, 12, 14, 15, 100, 100000]
          for p in primes:
          assert is_prime(p), p
          for np in not_primes:
          assert not is_prime(np), np

          def test_prime_factorization():
          prime_factorisations = {
          2: [2],
          3: [3],
          4: [2, 2],
          5: [5],
          6: [2, 3],
          7: [7],
          8: [2, 2, 2],
          9: [3, 3],
          10: [2, 5],
          11: [11],
          12: [2, 2, 3],
          13: [13],
          14: [2, 7],
          }
          for n, factors in prime_factorisations.items():
          ret = get_prime_factors(n)
          assert ret == factors, str(n) + ": " + str(ret) + "!=" + str(factors)

          def test_prime_factorization_randomised():
          import random
          n = random.randint(2, 10000)
          ret = get_prime_factors(n)
          m = 1
          assert sorted(ret) == ret, "return is not sorted for n:" + str(n)
          for p in ret:
          assert is_prime(p), "factor " + str(p) + " is not prime for n:" + str(n)
          m *= p
          assert m == n, "product of factors does not lead to original value:" + str(n) + ", " + str(m)

          if __name__ == '__main__':
          start = time.perf_counter()
          import time
          print_results()
          test_is_prime()
          test_prime_factorization()
          for i in range(300):
          test_prime_factorization_randomised()
          print(get_prime_factors(9000000))
          print(get_prime_factors(9000000 * 701 * 701))
          print(time.perf_counter() - start)


          My solution for this



          If I was to write this from scratch, here is how I'd do it:



          def is_prime(n):
          """Return True if n is a prime number, False otherwise."""
          if n < 2:
          return False
          return all(n % i for i in range(2, int(math.sqrt(n)) + 1))

          def get_prime_factors(n):
          """Return the prime factorisation of n in sorted order."""
          prime_factors =
          d = 2
          while d * d <= n:
          while n % d == 0:
          n //= d
          prime_factors.append(d)
          d += 1
          if n > 1: # to avoid 1 as a factor
          assert d <= n
          prime_factors.append(n)
          return prime_factors






          share|improve this answer














          share|improve this answer



          share|improve this answer








          edited Nov 27 at 9:51

























          answered Nov 26 at 10:13









          Josay

          24.6k13783




          24.6k13783








          • 2




            In one of the intermediate steps you do for e in factors and then later do factors.remove(e). Removing elements from a list while iterating over it produces interesting results, you should probably iterate over a copy of the list.
            – Graipher
            Nov 26 at 15:47






          • 1




            @Graipher this is part of the original code but this is definitly worth mentionning. It is probably worth an answer on its own.
            – Josay
            Nov 26 at 16:10














          • 2




            In one of the intermediate steps you do for e in factors and then later do factors.remove(e). Removing elements from a list while iterating over it produces interesting results, you should probably iterate over a copy of the list.
            – Graipher
            Nov 26 at 15:47






          • 1




            @Graipher this is part of the original code but this is definitly worth mentionning. It is probably worth an answer on its own.
            – Josay
            Nov 26 at 16:10








          2




          2




          In one of the intermediate steps you do for e in factors and then later do factors.remove(e). Removing elements from a list while iterating over it produces interesting results, you should probably iterate over a copy of the list.
          – Graipher
          Nov 26 at 15:47




          In one of the intermediate steps you do for e in factors and then later do factors.remove(e). Removing elements from a list while iterating over it produces interesting results, you should probably iterate over a copy of the list.
          – Graipher
          Nov 26 at 15:47




          1




          1




          @Graipher this is part of the original code but this is definitly worth mentionning. It is probably worth an answer on its own.
          – Josay
          Nov 26 at 16:10




          @Graipher this is part of the original code but this is definitly worth mentionning. It is probably worth an answer on its own.
          – Josay
          Nov 26 at 16:10


















          draft saved

          draft discarded




















































          Thanks for contributing an answer to Code Review Stack Exchange!


          • Please be sure to answer the question. Provide details and share your research!

          But avoid



          • Asking for help, clarification, or responding to other answers.

          • Making statements based on opinion; back them up with references or personal experience.


          Use MathJax to format equations. MathJax reference.


          To learn more, see our tips on writing great answers.





          Some of your past answers have not been well-received, and you're in danger of being blocked from answering.


          Please pay close attention to the following guidance:


          • Please be sure to answer the question. Provide details and share your research!

          But avoid



          • Asking for help, clarification, or responding to other answers.

          • Making statements based on opinion; back them up with references or personal experience.


          To learn more, see our tips on writing great answers.




          draft saved


          draft discarded














          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f208410%2fprime-factorization-in-python%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown





















































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown

































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown







          Popular posts from this blog

          Сан-Квентин

          8-я гвардейская общевойсковая армия

          Алькесар