Apache Nutch hangs indefinitely on a URL during the fetch cycle [closed]
I am running Nutch 1.15 in distributed Hadoop mode. When it tries to fetch one particular file (185 MB), the fetch hangs and the fetcher threads are aborted. In each subsequent fetch cycle, Nutch tries to fetch the same file again, and the threads hang and are aborted again. Nutch never skips this URL; it keeps retrying the same file in every fetch cycle, indefinitely. Is there a way to make Nutch skip the URL? Thanks!
apache-http-server threads fetch
closed as too broad by DavidPostill♦ Feb 3 at 14:24
Please edit the question to limit it to a specific problem with enough detail to identify an adequate answer. Avoid asking multiple distinct questions at once. See the How to Ask page for help clarifying this question. If this question can be reworded to fit the rules in the help center, please edit the question.
asked Feb 3 at 4:40
user9142638