Program to calculate Hash and Size of a torrent
I wrote this small program to open a .torrent file and retrieve its info-hash and size. I'm still a beginner at Python, so the main focus of this program was to try to use classes and objects instead of just a bunch of functions. It works, but I wanted to know whether this is a good design. Also, I feel some of the code I wrote is redundant, especially the number of times `self` is used.

    import hashlib, bencode

    class Torrent(object):

        def __init__(self, torrentfile):
            self.metainfo = bencode.bdecode(torrentfile.read())
            self.info = self.metainfo['info']
            self.files = self.metainfo['info']['files']
            self.md5hash = self.md5hash(self.info)
            self.size = self.size(self.files)

        def md5hash(self, info):
            return hashlib.sha1(bencode.bencode(info)).hexdigest()

        def size(self, files):
            filesize = 0
            for file in files:
                filesize += file['length']
            return filesize

    torrentfile = Torrent(open("test.torrent", "rb"))
    print(torrentfile.md5hash)
    print(torrentfile.size)

python object-oriented
asked Jan 1 at 18:11 by Labrinth, edited Jan 1 at 18:19
Welcome to Code Review! Commendable presentation of purpose and concerns! – greybeard, Jan 1 at 18:45
2 Answers
Using classes for this seems like a good idea, since you end up with a number of class instances, where each one represents a specific torrent.
In terms of the specific code, you're doing things slightly wrong in 2 ways.
Firstly, you don't need to pass instance attributes into the methods of a class. You can access `info` and `files` as `self.info` and `self.files`, so your methods only need the `self` argument.
Secondly, I can see that you're doing this to try to cache the results of the method calls by overriding the methods in `__init__`, and while caching is good, this is a bad way of trying to achieve it: once `self.size = self.size(self.files)` has run, the name `size` on that instance refers to the stored number, so the method can no longer be called through the instance.
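To see why reusing the name is a problem, here is a minimal sketch. The `Demo` class and its sample data are made up for illustration and are not part of the original program. Assigning to `self.size` stores the result on the instance, and that attribute then takes precedence over the method of the same name:

```python
class Demo:
    def __init__(self, files):
        # After this assignment the instance attribute 'size' hides the method.
        self.size = self.size(files)

    def size(self, files):
        return sum(f["length"] for f in files)


demo = Demo([{"length": 10}, {"length": 20}])
print(demo.size)  # the stored integer, 30

try:
    # Looking up 'size' on the instance now finds the integer, not the method.
    demo.size([{"length": 1}])
except TypeError as exc:
    print("method is no longer reachable through the instance:", exc)
```

The method still exists on the class (`Demo.size`), but anyone reading the code has to work out which of the two `size` names is meant at each point.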
There are 2 alternatives that spring to mind, depending on what you want to do:
If you always want the size and hash calculated when the class is instantiated, then do something similar to what you're doing now, but use different names for the data variables and the methods:

    import hashlib
    import bencode

    class Torrent(object):

        def __init__(self, torrentfile):
            self.metainfo = bencode.bdecode(torrentfile.read())
            self.info = self.metainfo['info']
            self.files = self.metainfo['info']['files']
            self.md5hash = self.calculate_md5hash()
            self.size = self.calculate_size()

        def calculate_md5hash(self):
            return hashlib.sha1(bencode.bencode(self.info)).hexdigest()

        def calculate_size(self):
            filesize = 0
            for file in self.files:
                filesize += file['length']
            return filesize

Alternatively, if you only want the hash and size calculated when the methods are specifically called, but you also want caching, use `lru_cache`. `lru_cache` will cache the result of a function the first time it is run, and then simply return the cached result for future calls, provided the arguments to the function remain the same. Here the only argument is `self`, so each `Torrent` instance gets its own cached entry.

    import hashlib
    import bencode
    from functools import lru_cache

    class Torrent(object):

        def __init__(self, torrentfile):
            self.metainfo = bencode.bdecode(torrentfile.read())
            self.info = self.metainfo['info']
            self.files = self.metainfo['info']['files']

        @lru_cache()
        def md5hash(self):
            return hashlib.sha1(bencode.bencode(self.info)).hexdigest()

        @lru_cache()
        def size(self):
            filesize = 0
            for file in self.files:
                filesize += file['length']
            return filesize

Then call the methods explicitly:

    print(torrentfile.md5hash())
    print(torrentfile.size())

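If you are on Python 3.8 or later, `functools.cached_property` is another option for the compute-on-first-access case. It stores the result on the instance, so the values are read as plain attributes without parentheses. The following is only a sketch under some assumptions: the constructor takes already-decoded metainfo rather than a file object, `info_hash` is a hypothetical name (the digest is SHA-1, not MD5), and the small `bencode` function is a stand-in so the example runs without the third-party package. It is not that library's implementation.

```python
import hashlib
from functools import cached_property  # available since Python 3.8


def bencode(value):
    """Stand-in encoder (assumption): just enough bencoding for this demo."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Bencoded dictionaries list their keys in sorted order.
        body = b"".join(bencode(key) + bencode(value[key]) for key in sorted(value))
        return b"d" + body + b"e"
    raise TypeError("cannot bencode %r" % type(value))


class Torrent:
    def __init__(self, metainfo):
        self.info = metainfo["info"]
        self.files = self.info["files"]

    @cached_property
    def info_hash(self):
        # Computed on first access, then stored on the instance.
        return hashlib.sha1(bencode(self.info)).hexdigest()

    @cached_property
    def size(self):
        return sum(f["length"] for f in self.files)


# Made-up sample data standing in for a decoded .torrent file.
metainfo = {"info": {"name": "demo", "files": [
    {"length": 10, "path": ["a.txt"]},
    {"length": 20, "path": ["b.txt"]},
]}}
torrent = Torrent(metainfo)
print(torrent.size)       # 30
print(torrent.info_hash)  # 40-character hexadecimal SHA-1 digest
```

Compared with `lru_cache` on a method, the cached value here lives and dies with the instance, whereas the `lru_cache` cache is attached to the function and keeps a reference to each `self` it has seen until the entry is evicted.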
answered Jan 1 at 18:29 by match, edited Jan 1 at 20:22 by Mathias Ettinger
@MathiasEttinger picked up most of the issues. Here's a random assortment of others:
Import what you need
If you do:

    from hashlib import sha1
    from bencode import bencode, bdecode

Then your usage can be shortened to:

    self.metainfo = bdecode(torrentfile.read())
    # ...
    return sha1(bencode(info)).hexdigest()

Use a generator expression
This:

    filesize = 0
    for file in files:
        filesize += file['length']
    return filesize

can be

    return sum(f['length'] for f in files)

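One caveat with summing over `files`: as far as I know, the BitTorrent metainfo format only has a `files` list for multi-file torrents, while single-file torrents put `length` directly in the `info` dictionary, so `info['files']` would raise a `KeyError` for those. A hedged sketch that handles both layouts (the function name `total_size` and the sample dictionaries are made up for illustration):

```python
def total_size(info):
    """Return the total payload size in bytes for a decoded 'info' dictionary."""
    if 'files' in info:
        # Multi-file layout: one entry per file, each with its own length.
        return sum(f['length'] for f in info['files'])
    # Single-file layout: the length sits directly in the info dictionary.
    return info['length']


# Made-up sample data for the two layouts.
multi = {'name': 'album', 'files': [{'length': 10, 'path': ['a']},
                                    {'length': 20, 'path': ['b']}]}
single = {'name': 'image.iso', 'length': 1234}

print(total_size(multi))   # 30
print(total_size(single))  # 1234
```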
Use context management
You don't `close` your file, which is an issue; but you don't need to do it explicitly - do it implicitly:

    with open("test.torrent", "rb") as torrentfile:
        torrent = Torrent(torrentfile)
    print(torrent.md5hash)
    print(torrent.size)

Note that this assumes `Torrent` is done with the file at the end of the constructor.
Use a `main` function
Put your global code into a main function to clean up the global namespace and allow others to use your code as a library rather than a command.
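A minimal sketch of that layout, with the caveat that `describe` is a hypothetical stand-in for building a `Torrent` and printing its details. Here it only reports how many bytes were read, so the example runs without the `bencode` package:

```python
import sys


def describe(path):
    """Hypothetical stand-in for the real work: read the file and summarise it."""
    with open(path, "rb") as torrentfile:
        data = torrentfile.read()
    return "%s: %d bytes" % (path, len(data))


def main(argv=None):
    """Describe each path given on the command line (or in argv, if passed)."""
    paths = sys.argv[1:] if argv is None else argv
    for path in paths:
        try:
            print(describe(path))
        except OSError as exc:
            # Report unreadable files on stderr and carry on with the rest.
            print("cannot read %s: %s" % (path, exc), file=sys.stderr)


if __name__ == "__main__":
    # Runs only when the file is executed as a script, not when it is imported.
    main()
```

With this guard in place, another module can `import` the file and call `describe` or `main([...])` itself without anything being executed at import time.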
answered Jan 2 at 3:12 by Reinderien
Just to add - while it isn't an issue in a small project like this, always be aware that importing using 'from' can potentially pollute your namespace, and sometimes having a single 'top-level' module name with child methods is a safer route. (Agree with everything else though!) – match, 16 hours ago