Program to calculate Hash and Size of a torrent

I wrote this small program to open a .torrent file, retrieve its info-hash and size. I'm still a beginner at Python so the main focus of this program was to try and utilize classes and objects instead of having just a bunch of functions. It works but I wanted to know if this is a good design. Also, I feel some of the code I wrote is redundant especially the number of times self is used.

import hashlib, bencode



class Torrent(object):



    def __init__(self, torrentfile):

        self.metainfo = bencode.bdecode(torrentfile.read())

        self.info = self.metainfo['info']

        self.files = self.metainfo['info']['files']

        self.md5hash = self.md5hash(self.info)

        self.size = self.size(self.files)



    def md5hash(self, info):

        return hashlib.sha1(bencode.bencode(info)).hexdigest()



    def size(self, files):

        filesize = 0

        for file in files:

            filesize += file['length']          

        return filesize





torrentfile = Torrent(open("test.torrent", "rb"))

print(torrentfile.md5hash)

print(torrentfile.size)

edited Jan 1 at 18:19

asked Jan 1 at 18:11

Labrinth

235

New contributor

2

Welcome to Code Review! Commendable presentation of purpose and concerns!
– greybeard
Jan 1 at 18:45

add a comment |

import hashlib, bencode



class Torrent(object):



    def __init__(self, torrentfile):

        self.metainfo = bencode.bdecode(torrentfile.read())

        self.info = self.metainfo['info']

        self.files = self.metainfo['info']['files']

        self.md5hash = self.md5hash(self.info)

        self.size = self.size(self.files)



    def md5hash(self, info):

        return hashlib.sha1(bencode.bencode(info)).hexdigest()



    def size(self, files):

        filesize = 0

        for file in files:

            filesize += file['length']          

        return filesize





torrentfile = Torrent(open("test.torrent", "rb"))

print(torrentfile.md5hash)

print(torrentfile.size)

edited Jan 1 at 18:19

asked Jan 1 at 18:11

Labrinth

235

New contributor

2

Welcome to Code Review! Commendable presentation of purpose and concerns!
– greybeard
Jan 1 at 18:45

add a comment |

import hashlib, bencode



class Torrent(object):



    def __init__(self, torrentfile):

        self.metainfo = bencode.bdecode(torrentfile.read())

        self.info = self.metainfo['info']

        self.files = self.metainfo['info']['files']

        self.md5hash = self.md5hash(self.info)

        self.size = self.size(self.files)



    def md5hash(self, info):

        return hashlib.sha1(bencode.bencode(info)).hexdigest()



    def size(self, files):

        filesize = 0

        for file in files:

            filesize += file['length']          

        return filesize





torrentfile = Torrent(open("test.torrent", "rb"))

print(torrentfile.md5hash)

print(torrentfile.size)

edited Jan 1 at 18:19

asked Jan 1 at 18:11

Labrinth

235

New contributor

import hashlib, bencode



class Torrent(object):



    def __init__(self, torrentfile):

        self.metainfo = bencode.bdecode(torrentfile.read())

        self.info = self.metainfo['info']

        self.files = self.metainfo['info']['files']

        self.md5hash = self.md5hash(self.info)

        self.size = self.size(self.files)



    def md5hash(self, info):

        return hashlib.sha1(bencode.bencode(info)).hexdigest()



    def size(self, files):

        filesize = 0

        for file in files:

            filesize += file['length']          

        return filesize





torrentfile = Torrent(open("test.torrent", "rb"))

print(torrentfile.md5hash)

print(torrentfile.size)

python object-oriented

edited Jan 1 at 18:19

asked Jan 1 at 18:11

Labrinth

235

New contributor

edited Jan 1 at 18:19

asked Jan 1 at 18:11

Labrinth

235

New contributor

edited Jan 1 at 18:19

asked Jan 1 at 18:11

Labrinth

235

New contributor

asked Jan 1 at 18:11

Labrinth

235

asked Jan 1 at 18:11

Labrinth

235

New contributor

Labrinth is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.

2

Welcome to Code Review! Commendable presentation of purpose and concerns!
– greybeard
Jan 1 at 18:45

add a comment |

2

Welcome to Code Review! Commendable presentation of purpose and concerns!
– greybeard
Jan 1 at 18:45

Welcome to Code Review! Commendable presentation of purpose and concerns!
– greybeard
Jan 1 at 18:45

add a comment |

2 Answers
2

active

oldest

votes

Using classes for this seems like a good idea, since you end up with a number of class instances, where each one represents a specific torrent.

In terms of the specific code, you're doing things slightly wrong in 2 ways.

Firstly, you don't need to pass instance parameters into the methods of a class. So you can access info and file as self.info and self.file, so your methods only need the self argument.

Secondly, I can see that you're doing this to try to cache the results of the method calls by overriding the methods in __init__, and while caching is good, this is a bad way of trying to achieve it.

There are 2 alternatives that spring to mind, depending on what you want to do:

If you always want the size and hash calculated when the class is instantiated, then do something similar to what you're doing now, but use different names for the data variables and the methods:

def __init__(self, torrentfile):

    self.metainfo = bencode.bdecode(torrentfile.read())

    self.info = self.metainfo['info']

    self.files = self.metainfo['info']['files']

    self.md5hash = self.calculate_md5hash()

    self.size = self.calculate_size()



def calculate_md5hash(self):

    return hashlib.sha1(bencode.bencode(self.info)).hexdigest()



def calculate_size(self):

    filesize = 0

    for file in self.files:

        filesize += file['length']          

    return filesize

Alternatively, if you only want the hash and size calculated when the methods are specifically called, but you also want caching, use lru_cache

lru_cache will cache the result of a function the first time it is run, and then simply return the result for future calls, providing the arguments to the function remain the same.

from functools import lru_cache



class Torrent(object):



    def __init__(self, torrentfile):

        self.metainfo = bencode.bdecode(torrentfile.read())

        self.info = self.metainfo['info']

        self.files = self.metainfo['info']['files']



    @lru_cache()

    def md5hash(self):

        return hashlib.sha1(bencode.bencode(self.info)).hexdigest()



    @lru_cache()

    def size(self):

        filesize = 0

        for file in self.files:

            filesize += file['length']          

        return filesize

Then call the methods explicitly:

print(torrentfile.md5hash())

print(torrentfile.size())

edited Jan 1 at 20:22

Mathias Ettinger

23.6k33182

answered Jan 1 at 18:29

match

5316

add a comment |

@MathiasEttinger picked up most of the issues. Here's a random assortment of others:

Import what you need

If you do:

from hashlib import sha1

from bencode import bencode, bdecode

Then your usage can be shortened to:

self.metainfo = bdecode(torrentfile.read())

# ...

return sha1(bencode(info)).hexdigest()

Use list comprehensions

This:

    filesize = 0

    for file in files:

        filesize += file['length']          

    return filesize

can be

return sum(f['length'] for f in files)

Use context management

You don't close your file, which is an issue; but you don't need to do it explicitly - do it implicitly:

with open("test.torrent", "rb") as torrentfile:

    torrent = Torrent(torrentfile)

print(torrent.md5hash)

print(torrent.size)

Note that this assumes Torrent is done with the file at the end of the constructor.

Use a `main` function

Put your global code into a main function to clean up the global namespace and allow others to use your code as a library rather than a command.

answered Jan 2 at 3:12

Reinderien

3,743821

Just to add - while it isn't an issue in a small project like this, always be aware that importing using 'from' can potentially pollute your namespace, and sometimes having a single 'top-level' module name with child methods is a safer route. (Agree with everything else though!)
– match
16 hours ago

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
return StackExchange.using("mathjaxEditing", function () {
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
});
});
}, "mathjax-editing");

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "196"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

Labrinth is a new contributor. Be nice, and check out our Code of Conduct.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f210700%2fprogram-to-calculate-hash-and-size-of-a-torrent%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

2 Answers
2

active

oldest

votes

2 Answers
2

active

oldest

votes

Using classes for this seems like a good idea, since you end up with a number of class instances, where each one represents a specific torrent.

In terms of the specific code, you're doing things slightly wrong in 2 ways.

Firstly, you don't need to pass instance parameters into the methods of a class. So you can access info and file as self.info and self.file, so your methods only need the self argument.

There are 2 alternatives that spring to mind, depending on what you want to do:

If you always want the size and hash calculated when the class is instantiated, then do something similar to what you're doing now, but use different names for the data variables and the methods:

def __init__(self, torrentfile):

    self.metainfo = bencode.bdecode(torrentfile.read())

    self.info = self.metainfo['info']

    self.files = self.metainfo['info']['files']

    self.md5hash = self.calculate_md5hash()

    self.size = self.calculate_size()



def calculate_md5hash(self):

    return hashlib.sha1(bencode.bencode(self.info)).hexdigest()



def calculate_size(self):

    filesize = 0

    for file in self.files:

        filesize += file['length']          

    return filesize

Alternatively, if you only want the hash and size calculated when the methods are specifically called, but you also want caching, use lru_cache

lru_cache will cache the result of a function the first time it is run, and then simply return the result for future calls, providing the arguments to the function remain the same.

from functools import lru_cache



class Torrent(object):



    def __init__(self, torrentfile):

        self.metainfo = bencode.bdecode(torrentfile.read())

        self.info = self.metainfo['info']

        self.files = self.metainfo['info']['files']



    @lru_cache()

    def md5hash(self):

        return hashlib.sha1(bencode.bencode(self.info)).hexdigest()



    @lru_cache()

    def size(self):

        filesize = 0

        for file in self.files:

            filesize += file['length']          

        return filesize

Then call the methods explicitly:

print(torrentfile.md5hash())

print(torrentfile.size())

edited Jan 1 at 20:22

Mathias Ettinger

23.6k33182

answered Jan 1 at 18:29

match

5316

add a comment |

Using classes for this seems like a good idea, since you end up with a number of class instances, where each one represents a specific torrent.

In terms of the specific code, you're doing things slightly wrong in 2 ways.

Firstly, you don't need to pass instance parameters into the methods of a class. So you can access info and file as self.info and self.file, so your methods only need the self argument.

There are 2 alternatives that spring to mind, depending on what you want to do:

If you always want the size and hash calculated when the class is instantiated, then do something similar to what you're doing now, but use different names for the data variables and the methods:

def __init__(self, torrentfile):

    self.metainfo = bencode.bdecode(torrentfile.read())

    self.info = self.metainfo['info']

    self.files = self.metainfo['info']['files']

    self.md5hash = self.calculate_md5hash()

    self.size = self.calculate_size()



def calculate_md5hash(self):

    return hashlib.sha1(bencode.bencode(self.info)).hexdigest()



def calculate_size(self):

    filesize = 0

    for file in self.files:

        filesize += file['length']          

    return filesize

Alternatively, if you only want the hash and size calculated when the methods are specifically called, but you also want caching, use lru_cache

lru_cache will cache the result of a function the first time it is run, and then simply return the result for future calls, providing the arguments to the function remain the same.

from functools import lru_cache



class Torrent(object):



    def __init__(self, torrentfile):

        self.metainfo = bencode.bdecode(torrentfile.read())

        self.info = self.metainfo['info']

        self.files = self.metainfo['info']['files']



    @lru_cache()

    def md5hash(self):

        return hashlib.sha1(bencode.bencode(self.info)).hexdigest()



    @lru_cache()

    def size(self):

        filesize = 0

        for file in self.files:

            filesize += file['length']          

        return filesize

Then call the methods explicitly:

print(torrentfile.md5hash())

print(torrentfile.size())

edited Jan 1 at 20:22

Mathias Ettinger

23.6k33182

answered Jan 1 at 18:29

match

5316

add a comment |

Using classes for this seems like a good idea, since you end up with a number of class instances, where each one represents a specific torrent.

In terms of the specific code, you're doing things slightly wrong in 2 ways.

Firstly, you don't need to pass instance parameters into the methods of a class. So you can access info and file as self.info and self.file, so your methods only need the self argument.

There are 2 alternatives that spring to mind, depending on what you want to do:

If you always want the size and hash calculated when the class is instantiated, then do something similar to what you're doing now, but use different names for the data variables and the methods:

def __init__(self, torrentfile):

    self.metainfo = bencode.bdecode(torrentfile.read())

    self.info = self.metainfo['info']

    self.files = self.metainfo['info']['files']

    self.md5hash = self.calculate_md5hash()

    self.size = self.calculate_size()



def calculate_md5hash(self):

    return hashlib.sha1(bencode.bencode(self.info)).hexdigest()



def calculate_size(self):

    filesize = 0

    for file in self.files:

        filesize += file['length']          

    return filesize

Alternatively, if you only want the hash and size calculated when the methods are specifically called, but you also want caching, use lru_cache

lru_cache will cache the result of a function the first time it is run, and then simply return the result for future calls, providing the arguments to the function remain the same.

from functools import lru_cache



class Torrent(object):



    def __init__(self, torrentfile):

        self.metainfo = bencode.bdecode(torrentfile.read())

        self.info = self.metainfo['info']

        self.files = self.metainfo['info']['files']



    @lru_cache()

    def md5hash(self):

        return hashlib.sha1(bencode.bencode(self.info)).hexdigest()



    @lru_cache()

    def size(self):

        filesize = 0

        for file in self.files:

            filesize += file['length']          

        return filesize

Then call the methods explicitly:

print(torrentfile.md5hash())

print(torrentfile.size())

edited Jan 1 at 20:22

Mathias Ettinger

23.6k33182

answered Jan 1 at 18:29

match

5316

Using classes for this seems like a good idea, since you end up with a number of class instances, where each one represents a specific torrent.

In terms of the specific code, you're doing things slightly wrong in 2 ways.

Firstly, you don't need to pass instance parameters into the methods of a class. So you can access info and file as self.info and self.file, so your methods only need the self argument.

There are 2 alternatives that spring to mind, depending on what you want to do:

If you always want the size and hash calculated when the class is instantiated, then do something similar to what you're doing now, but use different names for the data variables and the methods:

def __init__(self, torrentfile):

    self.metainfo = bencode.bdecode(torrentfile.read())

    self.info = self.metainfo['info']

    self.files = self.metainfo['info']['files']

    self.md5hash = self.calculate_md5hash()

    self.size = self.calculate_size()



def calculate_md5hash(self):

    return hashlib.sha1(bencode.bencode(self.info)).hexdigest()



def calculate_size(self):

    filesize = 0

    for file in self.files:

        filesize += file['length']          

    return filesize

Alternatively, if you only want the hash and size calculated when the methods are specifically called, but you also want caching, use lru_cache

lru_cache will cache the result of a function the first time it is run, and then simply return the result for future calls, providing the arguments to the function remain the same.

from functools import lru_cache



class Torrent(object):



    def __init__(self, torrentfile):

        self.metainfo = bencode.bdecode(torrentfile.read())

        self.info = self.metainfo['info']

        self.files = self.metainfo['info']['files']



    @lru_cache()

    def md5hash(self):

        return hashlib.sha1(bencode.bencode(self.info)).hexdigest()



    @lru_cache()

    def size(self):

        filesize = 0

        for file in self.files:

            filesize += file['length']          

        return filesize

Then call the methods explicitly:

print(torrentfile.md5hash())

print(torrentfile.size())

edited Jan 1 at 20:22

Mathias Ettinger

23.6k33182

answered Jan 1 at 18:29

match

5316

edited Jan 1 at 20:22

Mathias Ettinger

23.6k33182

edited Jan 1 at 20:22

Mathias Ettinger

23.6k33182

edited Jan 1 at 20:22

Mathias Ettinger

23.6k33182

answered Jan 1 at 18:29

match

5316

answered Jan 1 at 18:29

match

5316

answered Jan 1 at 18:29

match

5316

add a comment |

@MathiasEttinger picked up most of the issues. Here's a random assortment of others:

Import what you need

If you do:

from hashlib import sha1

from bencode import bencode, bdecode

Then your usage can be shortened to:

self.metainfo = bdecode(torrentfile.read())

# ...

return sha1(bencode(info)).hexdigest()

Use list comprehensions

This:

    filesize = 0

    for file in files:

        filesize += file['length']          

    return filesize

can be

return sum(f['length'] for f in files)

Use context management

You don't close your file, which is an issue; but you don't need to do it explicitly - do it implicitly:

with open("test.torrent", "rb") as torrentfile:

    torrent = Torrent(torrentfile)

print(torrent.md5hash)

print(torrent.size)

Note that this assumes Torrent is done with the file at the end of the constructor.

Use a `main` function

Put your global code into a main function to clean up the global namespace and allow others to use your code as a library rather than a command.

answered Jan 2 at 3:12

Reinderien

3,743821

Just to add - while it isn't an issue in a small project like this, always be aware that importing using 'from' can potentially pollute your namespace, and sometimes having a single 'top-level' module name with child methods is a safer route. (Agree with everything else though!)
– match
16 hours ago

add a comment |

@MathiasEttinger picked up most of the issues. Here's a random assortment of others:

Import what you need

If you do:

from hashlib import sha1

from bencode import bencode, bdecode

Then your usage can be shortened to:

self.metainfo = bdecode(torrentfile.read())

# ...

return sha1(bencode(info)).hexdigest()

Use list comprehensions

This:

    filesize = 0

    for file in files:

        filesize += file['length']          

    return filesize

can be

return sum(f['length'] for f in files)

Use context management

You don't close your file, which is an issue; but you don't need to do it explicitly - do it implicitly:

with open("test.torrent", "rb") as torrentfile:

    torrent = Torrent(torrentfile)

print(torrent.md5hash)

print(torrent.size)

Note that this assumes Torrent is done with the file at the end of the constructor.

Use a `main` function

Put your global code into a main function to clean up the global namespace and allow others to use your code as a library rather than a command.

answered Jan 2 at 3:12

Reinderien

3,743821

Just to add - while it isn't an issue in a small project like this, always be aware that importing using 'from' can potentially pollute your namespace, and sometimes having a single 'top-level' module name with child methods is a safer route. (Agree with everything else though!)
– match
16 hours ago

add a comment |

@MathiasEttinger picked up most of the issues. Here's a random assortment of others:

Import what you need

If you do:

from hashlib import sha1

from bencode import bencode, bdecode

Then your usage can be shortened to:

self.metainfo = bdecode(torrentfile.read())

# ...

return sha1(bencode(info)).hexdigest()

Use list comprehensions

This:

    filesize = 0

    for file in files:

        filesize += file['length']          

    return filesize

can be

return sum(f['length'] for f in files)

Use context management

You don't close your file, which is an issue; but you don't need to do it explicitly - do it implicitly:

with open("test.torrent", "rb") as torrentfile:

    torrent = Torrent(torrentfile)

print(torrent.md5hash)

print(torrent.size)

Note that this assumes Torrent is done with the file at the end of the constructor.

Use a `main` function

Put your global code into a main function to clean up the global namespace and allow others to use your code as a library rather than a command.

answered Jan 2 at 3:12

Reinderien

3,743821

@MathiasEttinger picked up most of the issues. Here's a random assortment of others:

Import what you need

If you do:

from hashlib import sha1

from bencode import bencode, bdecode

Then your usage can be shortened to:

self.metainfo = bdecode(torrentfile.read())

# ...

return sha1(bencode(info)).hexdigest()

Use list comprehensions

This:

    filesize = 0

    for file in files:

        filesize += file['length']          

    return filesize

can be

return sum(f['length'] for f in files)

Use context management

You don't close your file, which is an issue; but you don't need to do it explicitly - do it implicitly:

with open("test.torrent", "rb") as torrentfile:

    torrent = Torrent(torrentfile)

print(torrent.md5hash)

print(torrent.size)

Note that this assumes Torrent is done with the file at the end of the constructor.

Use a `main` function

Put your global code into a main function to clean up the global namespace and allow others to use your code as a library rather than a command.

answered Jan 2 at 3:12

Reinderien

3,743821

answered Jan 2 at 3:12

Reinderien

3,743821

answered Jan 2 at 3:12

Reinderien

3,743821

answered Jan 2 at 3:12

Reinderien

3,743821

Just to add - while it isn't an issue in a small project like this, always be aware that importing using 'from' can potentially pollute your namespace, and sometimes having a single 'top-level' module name with child methods is a safer route. (Agree with everything else though!)
– match
16 hours ago

add a comment |

Just to add - while it isn't an issue in a small project like this, always be aware that importing using 'from' can potentially pollute your namespace, and sometimes having a single 'top-level' module name with child methods is a safer route. (Agree with everything else though!)
– match
16 hours ago

Just to add - while it isn't an issue in a small project like this, always be aware that importing using 'from' can potentially pollute your namespace, and sometimes having a single 'top-level' module name with child methods is a safer route. (Agree with everything else though!)
– match
16 hours ago

add a comment |

Labrinth is a new contributor. Be nice, and check out our Code of Conduct.

draft saved

draft discarded

Labrinth is a new contributor. Be nice, and check out our Code of Conduct.

Thanks for contributing an answer to Code Review Stack Exchange!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

Use MathJax to format equations. MathJax reference.

To learn more, see our tips on writing great answers.

Some of your past answers have not been well-received, and you're in danger of being blocked from answering.

Please pay close attention to the following guidance:

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

EKPj1i7MSF3kkdSTlochg Wo,Sy94LeUE22UYIlsYbc Xx3,W,XUF,Qp6EsueOlmne3,9J1UwsK9YG35qRMx,5xr BqH

搜尋此網誌

Gfrktyl