Gfrktyl

Question

Beginner programmer here. What can I improve in my code? Are there things I'm doing wrong or are there easier ways of doing what I did?

def reduce(n):

    n = abs(n)

    ctr = 0

    factors_list = 

    for i in range(2,n+1):

        if ctr >= 1:

            break

        if n % i == 0:

            factors_list.append(i)

            factors_list.append(int(n/i))

            ctr += 1

    return factors_list





def isPrime(n):

    return 1 in reduce(n)





def primeFactorization(n):

    if isPrime(n):

        return reduce(n)

    factors = reduce(n)

    primeFactors = 

    while True:

        for e in factors:

            if isPrime(e):

                primeFactors.append(e)

                factors.remove(e)

            else:

                factors.extend(reduce(e))

                factors.remove(e)

        if len(factors) == 0:

            break

    return sorted(primeFactors)

score 4 · Answer 1 · 2018-11-27 09:51:16Z

Documentation and tests

Before improving your code, is it important to write tests for it to make you you don't break anything.

As you do it, you might want that you need to be a bit clearer about the behavior of your function. Let's see what your functions return with a simple piece of code:

def print_results():

    print("i, primeFactorization(i), reduce(i), isPrime(i)")

    for i in range(15):

        print(i, primeFactorization(i), reduce(i), isPrime(i))

Most results seem ok but why do we sometimes have "1" in the return value for primeFactorisation. In theory, we should have only prime numbers in it.

We could:

write doc to specify what the functions does (return prime factorisation)

write tests for the actual expected behavior

fix the code

Here are various snippets I've written to test the code. In a more serious project, you could use a testing framework.

def test_is_prime():

    primes = [2, 3, 5, 7, 11, 13]

    not_primes = [0, 1, 4, 6, 8, 9, 10, 12, 14, 15, 100, 100000]

    for p in primes:

        assert isPrime(p), p

    for np in not_primes:

        assert not isPrime(np), np



def test_prime_factorization():

    prime_factorisations = {

        2: [2],

        3: [3],

        4: [2, 2],

        5: [5],

        6: [2, 3],

        7: [7],

        8: [2, 2, 2],

        9: [3, 3],

        10: [2, 5],

        11: [11],

        12: [2, 2, 3],

        13: [13],

        14: [2, 7],

    }

    for n, factors in prime_factorisations.items():

        ret = primeFactorization(n)

        assert ret == factors, str(n) + ": " + str(ret) + "!=" + str(factors)



def test_prime_factorization_randomised():

    import random

    n = random.randint(2, 10000)

    ret = primeFactorization(n)

    m = 1

    assert sorted(ret) == ret, "return is not sorted for n:" + str(n)

    for p in ret:

        assert isPrime(p), "factor " + str(p) + " is not prime for n:" + str(n)

        m *= p

    assert m == n, "product of factors does not lead to original value:" + str(n) + ", " + str(m)



if __name__ == '__main__':

    print_results()

    test_is_prime()

    test_prime_factorization()

    for i in range(300):

        test_prime_factorization_randomised()

Now, it feels fuch safer to improve the code.

Also, you could use this to perform benchmarks: compute the operations many times and/or on huge numbers to measure the time it takes.

Style

Python has a style guide called PEP 8. I highly recommend reading it (many times) and trying to apply it as much as possible.

Among other things, the functions names should be in snake_case.

Algorithm

Splitting a problem into smaller problems and writing functions for these is usually a good idea.

Unfortunately, I am not fully convinced that the reduce functions really helps you here.

Let's see how things can be improved anyway.

Small simplifications/optimisations

We could use the divmod builtin to get both the quotient and the remainder of the division.

The ctr variable seems useless. It is used to break after we've incremented it. We could just break directly.

At this stage, we have:

def reduce(n):

    n = abs(n)

    factors_list = 

    for i in range(2,n+1):

        p, q = divmod(n, i)

        if q == 0:

            factors_list.append(i)

            factors_list.append(p)

            break

    return factors_list

Now it is clear that we either add i and p to the list only once or we add nothing as all. We could make this clearer:

def reduce(n):

    n = abs(n)

    for i in range(2,n+1):

        p, q = divmod(n, i)

        if q == 0:

            return [i, p]

    return

Now, it is clear that the function:

returns in the cases 0 and 1

return [n, 1] when n is prime

return [d, n/d] where d is the smallest (prime) divisisors otherwise.

Also, reduce is called more than needed: everytime we do

if isPrime(n):

    ret = reduce(n)

on a prime number, we actually perform reduce twice which is very expensice, in particular for primes as we iterate up to n.

Thus, we could get an optimisation boost by writing:

def primeFactorization(n):

    """Return the prime factorisation of n in sorted order."""

    factors = reduce(n)

    if 1 in factors:

        return [n]

    primeFactors = 

    while True:

        for e in factors:

            new_factors = reduce(e)

            if 1 in new_factors:

                primeFactors.append(e)

                factors.remove(e)

            else:

                factors.extend(new_factors)

                factors.remove(e)

        if len(factors) == 0:

            break

    ret = sorted(primeFactors)

    return ret

Another key hindsight is prime factorisation is that the smallest divisors of n is at most sqrt(n) which limits the range you have to look in.

In your case, we could use this in reduce, change slightly how reduce behaves and write:

def reduce(n):

    """Return [a, b] where a is the smallest divisor of n and n = a * b."""

    n = abs(n)

    for i in range(2, int(math.sqrt(n)) + 1):

        p, q = divmod(n, i)

        if q == 0:

            return [i, p]

    return [n, 1]





def isPrime(n):

    """Return True if n is a prime number, False otherwise."""

    return n > 1 and reduce(n) == [n, 1]





def primeFactorization(n):

    """Return the prime factorisation of n in sorted order."""

    factors = reduce(n)

    if factors == [n, 1]:  # prime

        return [n] 

    primeFactors = 

    while True:

        for e in factors:

            new_factors = reduce(e)

            if new_factors == [e, 1]:  # prime

                primeFactors.append(e)

            else:

                factors.extend(new_factors)

            factors.remove(e)

        if len(factors) == 0:

            break

    ret = sorted(primeFactors)

    return ret

which is much faster.

Then, you could get rid of:

    if len(factors) == 0:

        break

by looping with while factors.

Then using, list.pop(), you could get rid of remove (which takes a linear time):

def primeFactorization(n):

    """Return the prime factorisation of n in sorted order."""

    factors = reduce(n)

    if factors == [n, 1]:  # prime

        return [n]

    primeFactors = 

    while factors:

        e = factors.pop()

        new_factors = reduce(e)

        if new_factors == [e, 1]:  # prime

            primeFactors.append(e)

        else:

            factors.extend(new_factors)

    ret = sorted(primeFactors)

    return ret

Then it appears that the initial check is not really required as the logic is already performed inside the loop:

def primeFactorization(n):

    """Return the prime factorisation of n in sorted order."""

    primeFactors = 

    factors = [n]

    while factors:

        e = factors.pop()

        new_factors = reduce(e)

        if new_factors == [e, 1]:  # prime

            primeFactors.append(e)

        else:

            factors.extend(new_factors)

    ret = sorted(primeFactors)

    return ret

We can actually get rid of the sorting by popping the first element of the list, thus generating the factors in order:

def primeFactorization(n):

    """Return the prime factorisation of n in sorted order."""

    primeFactors = 

    factors = [n]

    while factors:

        e = factors.pop(0)

        new_factors = reduce(e)

        if new_factors == [e, 1]:  # prime

            primeFactors.append(e)

        else:

            factors.extend(new_factors)

    return primeFactors

Now, instead of a reduce function, we could write a somehow equivalent but easier to use get_smallest_div function. Taking this chance to rename all functions, the whole code becomes:

import math



def get_smallest_div(n):

    """Return the smallest divisor of n."""

    n = abs(n)

    for i in range(2, int(math.sqrt(n)) + 1):

        p, q = divmod(n, i)

        if q == 0:

            return i

    return n





def is_prime(n):

    """Return True if n is a prime number, False otherwise."""

    return n > 1 and get_smallest_div(n) == n



def get_prime_factors(n):

    """Return the prime factorisation of n in sorted order."""

    prime_factors = 

    factors = [n]

    while factors:

        n = factors.pop(0)

        div = get_smallest_div(n)

        if div == n:  # prime

            prime_factors.append(n)

        else:

            factors.extend([div, n//div])

    return prime_factors



def print_results():

    print("i, get_prime_factors(i), get_smallest_div(i), is_prime(i)")

    for i in range(15):

        print(i, get_prime_factors(i), get_smallest_div(i), is_prime(i))



def test_is_prime():

    primes = [2, 3, 5, 7, 11, 13]

    not_primes = [0, 1, 4, 6, 8, 9, 10, 12, 14, 15, 100, 100000]

    for p in primes:

        assert is_prime(p), p

    for np in not_primes:

        assert not is_prime(np), np



def test_prime_factorization():

    prime_factorisations = {

        2: [2],

        3: [3],

        4: [2, 2],

        5: [5],

        6: [2, 3],

        7: [7],

        8: [2, 2, 2],

        9: [3, 3],

        10: [2, 5],

        11: [11],

        12: [2, 2, 3],

        13: [13],

        14: [2, 7],

    }

    for n, factors in prime_factorisations.items():

        ret = get_prime_factors(n)

        assert ret == factors, str(n) + ": " + str(ret) + "!=" + str(factors)



def test_prime_factorization_randomised():

    import random

    n = random.randint(2, 10000)

    ret = get_prime_factors(n)

    m = 1

    assert sorted(ret) == ret, "return is not sorted for n:" + str(n)

    for p in ret:

        assert is_prime(p), "factor " + str(p) + " is not prime for n:" + str(n)

        m *= p

    assert m == n, "product of factors does not lead to original value:" + str(n) + ", " + str(m)



if __name__ == '__main__':

    start = time.perf_counter()

    import time

    print_results()

    test_is_prime()

    test_prime_factorization()

    for i in range(300):

        test_prime_factorization_randomised()

    print(get_prime_factors(9000000))

    print(get_prime_factors(9000000 * 701 * 701))

    print(time.perf_counter() - start)

My solution for this

If I was to write this from scratch, here is how I'd do it:

def is_prime(n):

    """Return True if n is a prime number, False otherwise."""

    if n < 2:

        return False

    return all(n % i for i in range(2, int(math.sqrt(n)) + 1))



def get_prime_factors(n):

    """Return the prime factorisation of n in sorted order."""

    prime_factors = 

    d = 2

    while d * d <= n:

        while n % d == 0:

            n //= d

            prime_factors.append(d)

        d += 1

    if n > 1:  # to avoid 1 as a factor

        assert d <= n

        prime_factors.append(n)

    return prime_factors

In one of the intermediate steps you do for e in factors and then later do factors.remove(e). Removing elements from a list while iterating over it produces interesting results, you should probably iterate over a copy of the list. — Nov 26 at 15:47
@Graipher this is part of the original code but this is definitly worth mentionning. It is probably worth an answer on its own. — Nov 26 at 16:10

score 4 · Answer 2 · 2018-11-27 09:51:16Z

Documentation and tests

Before improving your code, is it important to write tests for it to make you you don't break anything.

As you do it, you might want that you need to be a bit clearer about the behavior of your function. Let's see what your functions return with a simple piece of code:

def print_results():

    print("i, primeFactorization(i), reduce(i), isPrime(i)")

    for i in range(15):

        print(i, primeFactorization(i), reduce(i), isPrime(i))

Most results seem ok but why do we sometimes have "1" in the return value for primeFactorisation. In theory, we should have only prime numbers in it.

We could:

write doc to specify what the functions does (return prime factorisation)

write tests for the actual expected behavior

fix the code

Here are various snippets I've written to test the code. In a more serious project, you could use a testing framework.

def test_is_prime():

    primes = [2, 3, 5, 7, 11, 13]

    not_primes = [0, 1, 4, 6, 8, 9, 10, 12, 14, 15, 100, 100000]

    for p in primes:

        assert isPrime(p), p

    for np in not_primes:

        assert not isPrime(np), np



def test_prime_factorization():

    prime_factorisations = {

        2: [2],

        3: [3],

        4: [2, 2],

        5: [5],

        6: [2, 3],

        7: [7],

        8: [2, 2, 2],

        9: [3, 3],

        10: [2, 5],

        11: [11],

        12: [2, 2, 3],

        13: [13],

        14: [2, 7],

    }

    for n, factors in prime_factorisations.items():

        ret = primeFactorization(n)

        assert ret == factors, str(n) + ": " + str(ret) + "!=" + str(factors)



def test_prime_factorization_randomised():

    import random

    n = random.randint(2, 10000)

    ret = primeFactorization(n)

    m = 1

    assert sorted(ret) == ret, "return is not sorted for n:" + str(n)

    for p in ret:

        assert isPrime(p), "factor " + str(p) + " is not prime for n:" + str(n)

        m *= p

    assert m == n, "product of factors does not lead to original value:" + str(n) + ", " + str(m)



if __name__ == '__main__':

    print_results()

    test_is_prime()

    test_prime_factorization()

    for i in range(300):

        test_prime_factorization_randomised()

Now, it feels fuch safer to improve the code.

Also, you could use this to perform benchmarks: compute the operations many times and/or on huge numbers to measure the time it takes.

Style

Python has a style guide called PEP 8. I highly recommend reading it (many times) and trying to apply it as much as possible.

Among other things, the functions names should be in snake_case.

Algorithm

Splitting a problem into smaller problems and writing functions for these is usually a good idea.

Unfortunately, I am not fully convinced that the reduce functions really helps you here.

Let's see how things can be improved anyway.

Small simplifications/optimisations

We could use the divmod builtin to get both the quotient and the remainder of the division.

The ctr variable seems useless. It is used to break after we've incremented it. We could just break directly.

At this stage, we have:

def reduce(n):

    n = abs(n)

    factors_list = 

    for i in range(2,n+1):

        p, q = divmod(n, i)

        if q == 0:

            factors_list.append(i)

            factors_list.append(p)

            break

    return factors_list

Now it is clear that we either add i and p to the list only once or we add nothing as all. We could make this clearer:

def reduce(n):

    n = abs(n)

    for i in range(2,n+1):

        p, q = divmod(n, i)

        if q == 0:

            return [i, p]

    return

Now, it is clear that the function:

returns in the cases 0 and 1

return [n, 1] when n is prime

return [d, n/d] where d is the smallest (prime) divisisors otherwise.

Also, reduce is called more than needed: everytime we do

if isPrime(n):

    ret = reduce(n)

on a prime number, we actually perform reduce twice which is very expensice, in particular for primes as we iterate up to n.

Thus, we could get an optimisation boost by writing:

def primeFactorization(n):

    """Return the prime factorisation of n in sorted order."""

    factors = reduce(n)

    if 1 in factors:

        return [n]

    primeFactors = 

    while True:

        for e in factors:

            new_factors = reduce(e)

            if 1 in new_factors:

                primeFactors.append(e)

                factors.remove(e)

            else:

                factors.extend(new_factors)

                factors.remove(e)

        if len(factors) == 0:

            break

    ret = sorted(primeFactors)

    return ret

Another key hindsight is prime factorisation is that the smallest divisors of n is at most sqrt(n) which limits the range you have to look in.

In your case, we could use this in reduce, change slightly how reduce behaves and write:

def reduce(n):

    """Return [a, b] where a is the smallest divisor of n and n = a * b."""

    n = abs(n)

    for i in range(2, int(math.sqrt(n)) + 1):

        p, q = divmod(n, i)

        if q == 0:

            return [i, p]

    return [n, 1]





def isPrime(n):

    """Return True if n is a prime number, False otherwise."""

    return n > 1 and reduce(n) == [n, 1]





def primeFactorization(n):

    """Return the prime factorisation of n in sorted order."""

    factors = reduce(n)

    if factors == [n, 1]:  # prime

        return [n] 

    primeFactors = 

    while True:

        for e in factors:

            new_factors = reduce(e)

            if new_factors == [e, 1]:  # prime

                primeFactors.append(e)

            else:

                factors.extend(new_factors)

            factors.remove(e)

        if len(factors) == 0:

            break

    ret = sorted(primeFactors)

    return ret

which is much faster.

Then, you could get rid of:

    if len(factors) == 0:

        break

by looping with while factors.

Then using, list.pop(), you could get rid of remove (which takes a linear time):

def primeFactorization(n):

    """Return the prime factorisation of n in sorted order."""

    factors = reduce(n)

    if factors == [n, 1]:  # prime

        return [n]

    primeFactors = 

    while factors:

        e = factors.pop()

        new_factors = reduce(e)

        if new_factors == [e, 1]:  # prime

            primeFactors.append(e)

        else:

            factors.extend(new_factors)

    ret = sorted(primeFactors)

    return ret

Then it appears that the initial check is not really required as the logic is already performed inside the loop:

def primeFactorization(n):

    """Return the prime factorisation of n in sorted order."""

    primeFactors = 

    factors = [n]

    while factors:

        e = factors.pop()

        new_factors = reduce(e)

        if new_factors == [e, 1]:  # prime

            primeFactors.append(e)

        else:

            factors.extend(new_factors)

    ret = sorted(primeFactors)

    return ret

We can actually get rid of the sorting by popping the first element of the list, thus generating the factors in order:

def primeFactorization(n):

    """Return the prime factorisation of n in sorted order."""

    primeFactors = 

    factors = [n]

    while factors:

        e = factors.pop(0)

        new_factors = reduce(e)

        if new_factors == [e, 1]:  # prime

            primeFactors.append(e)

        else:

            factors.extend(new_factors)

    return primeFactors

Now, instead of a reduce function, we could write a somehow equivalent but easier to use get_smallest_div function. Taking this chance to rename all functions, the whole code becomes:

import math



def get_smallest_div(n):

    """Return the smallest divisor of n."""

    n = abs(n)

    for i in range(2, int(math.sqrt(n)) + 1):

        p, q = divmod(n, i)

        if q == 0:

            return i

    return n





def is_prime(n):

    """Return True if n is a prime number, False otherwise."""

    return n > 1 and get_smallest_div(n) == n



def get_prime_factors(n):

    """Return the prime factorisation of n in sorted order."""

    prime_factors = 

    factors = [n]

    while factors:

        n = factors.pop(0)

        div = get_smallest_div(n)

        if div == n:  # prime

            prime_factors.append(n)

        else:

            factors.extend([div, n//div])

    return prime_factors



def print_results():

    print("i, get_prime_factors(i), get_smallest_div(i), is_prime(i)")

    for i in range(15):

        print(i, get_prime_factors(i), get_smallest_div(i), is_prime(i))



def test_is_prime():

    primes = [2, 3, 5, 7, 11, 13]

    not_primes = [0, 1, 4, 6, 8, 9, 10, 12, 14, 15, 100, 100000]

    for p in primes:

        assert is_prime(p), p

    for np in not_primes:

        assert not is_prime(np), np



def test_prime_factorization():

    prime_factorisations = {

        2: [2],

        3: [3],

        4: [2, 2],

        5: [5],

        6: [2, 3],

        7: [7],

        8: [2, 2, 2],

        9: [3, 3],

        10: [2, 5],

        11: [11],

        12: [2, 2, 3],

        13: [13],

        14: [2, 7],

    }

    for n, factors in prime_factorisations.items():

        ret = get_prime_factors(n)

        assert ret == factors, str(n) + ": " + str(ret) + "!=" + str(factors)



def test_prime_factorization_randomised():

    import random

    n = random.randint(2, 10000)

    ret = get_prime_factors(n)

    m = 1

    assert sorted(ret) == ret, "return is not sorted for n:" + str(n)

    for p in ret:

        assert is_prime(p), "factor " + str(p) + " is not prime for n:" + str(n)

        m *= p

    assert m == n, "product of factors does not lead to original value:" + str(n) + ", " + str(m)



if __name__ == '__main__':

    start = time.perf_counter()

    import time

    print_results()

    test_is_prime()

    test_prime_factorization()

    for i in range(300):

        test_prime_factorization_randomised()

    print(get_prime_factors(9000000))

    print(get_prime_factors(9000000 * 701 * 701))

    print(time.perf_counter() - start)

My solution for this

If I was to write this from scratch, here is how I'd do it:

def is_prime(n):

    """Return True if n is a prime number, False otherwise."""

    if n < 2:

        return False

    return all(n % i for i in range(2, int(math.sqrt(n)) + 1))



def get_prime_factors(n):

    """Return the prime factorisation of n in sorted order."""

    prime_factors = 

    d = 2

    while d * d <= n:

        while n % d == 0:

            n //= d

            prime_factors.append(d)

        d += 1

    if n > 1:  # to avoid 1 as a factor

        assert d <= n

        prime_factors.append(n)

    return prime_factors

In one of the intermediate steps you do for e in factors and then later do factors.remove(e). Removing elements from a list while iterating over it produces interesting results, you should probably iterate over a copy of the list. — Nov 26 at 15:47
@Graipher this is part of the original code but this is definitly worth mentionning. It is probably worth an answer on its own. — Nov 26 at 16:10

搜尋此網誌

Gfrktyl

Prime factorization in Python

1 Answer
1

Your Answer

Post as a guest

1 Answer
1

1 Answer
1

Post as a guest

Popular posts from this blog

Список кардиналов, возведённых папой римским Каликстом III

Deduzione

Mysql.sock missing - “Can't connect to local MySQL server through socket”

Prime factorization in Python

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

1 Answer 1

1 Answer 1

Sign up or log in

Post as a guest

Post as a guest

Sign up or log in

Post as a guest

Sign up or log in

Post as a guest

Sign up or log in

Post as a guest

Popular posts from this blog

Список кардиналов, возведённых папой римским Каликстом III

Deduzione

Mysql.sock missing - “Can't connect to local MySQL server through socket”

1 Answer
1

1 Answer
1

1 Answer
1