PUBG API Client request: Transforming and storing raw json on batches












$begingroup$


Lately I've been trying to improve my coding, in particular to stop writing everything in a single file, but I have a lot of questions about how to organize the code and whether what I'm doing makes sense.



I wanted to organize the project in the following fashion, keeping in mind I'd like to "scale up" (reuse) most of the structure for other solutions:



.                                
├── config.py (global configuration)
├── core
│   ├── client.py (generic requests class)
│   └── persist.py (mongodb layer class)
├── pubg
│   ├── controller.py (handler for pubg)
│   └── models.py (models for pubg)
└── main.py


The file controller.py holds the "core" of the script: the requests to the official API (Pubg class) and the data transformation performed at request time (PubgManager class).



The pseudo-code looks like this:



class Client:
    ...
    def request(self, endpoint, params=None):
        response = self.session.get(endpoint, timeout=TIMEOUT, params=params)
        return response.json()

class Pubg(Client):
    # Makes the requests to the official API endpoints {player, match, telemetry}
    ...
    def matches(self, params, shard):
        res = []
        url = f'{SHARD_URL}/{shard}/matches'
        response = self.request(url, params)
        # `item` is not defined in this pseudo-code; presumably each element of the response
        res.append(Match(item))
        return res

class PubgManager(Pubg):
    # Transforms the data and stores it (either in mongo or on disk)
    # The list of matches to process (matches_to_process) and the shard are kept in the class
    def __init__(self, shard, players, mongo):
        super().__init__(api_key)
        # Keep mongo instance in order to persist data later
        self.mongo = mongo
        self.mongo_pubg = mongo.get_client().pubg
        ...

    def process_matches(self, list_matches):
        for m_id in list_matches:
            res = self.matches(m_id)
            data = transform_request(res)
            store_in_mongo(data)
            store_in_disk(data)
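The request, transform and store steps of this pseudo-code could also be wired together from plain callables instead of living in one subclass. The following is only a minimal sketch under assumptions: `run_pipeline`, `fetch`, `transform` and `sinks` are hypothetical names that do not exist in the code above, and the generator form is just one option for handling one match at a time rather than keeping every response in memory.

```python
def run_pipeline(match_ids, fetch, transform, sinks):
    """Fetch, transform and store one match at a time.

    fetch, transform and sinks are hypothetical callables supplied by the
    caller; nothing here is tied to the PUBG API or to MongoDB.
    """
    for match_id in match_ids:
        raw = fetch(match_id)        # e.g. a wrapper around the API request
        data = transform(raw)        # pure function: no I/O inside
        for sink in sinks:           # e.g. one sink for mongo, one for disk
            sink(match_id, data)
        # Yield instead of accumulating, so only one match is held at a time
        yield match_id, data


if __name__ == "__main__":
    stored = []
    results = run_pipeline(
        ["m1", "m2"],
        fetch=lambda match_id: {"id": match_id},
        transform=lambda raw: {**raw, "processed": True},
        sinks=[lambda match_id, data: stored.append(match_id)],
    )
    for match_id, data in results:
        print(match_id, data)
```

With this shape, choosing whether to store in mongo, on disk, or not at all becomes a matter of which sinks are passed in, rather than one boolean flag per destination.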



The actual code of process_matches, which keeps growing, is the following:



def process_matches(self, list_matches, telemetry=True, store_mongo=True,
                    store_disk=False, return_data=False):
    """
    Given a list of matches, request the information through the parent class,
    extract certain data from the response JSON, build some data structures
    and store them in mongo and on disk as JSON.

    This is only done for the players listed in self.players, so each element
    is checked to see whether it concerns one of those players.

    The PUBG API returns the following info:
    - Match: Description and results of the match
    - Participant: Results of the match by player
    - Telemetry: Events that happened during the game

    Telemetry is a huge list of dictionaries (up to 300k items) with ALL the events.
    We only want a few players (self.players) and a few events (EVENTS_USER)
    from a certain point onwards (isGame >= 0.1).
    So we have to check ALL the dictionaries within the telemetry list to
    discard uninteresting information.

    Each dictionary contains the following info:
    https://documentation.pubg.com/en/telemetry-objects.html

    - _T: Event type, whether it relates to a player or to the match
    - _D: Timestamp
    - common: state of the game (isGame)
    - character: what a player does
    - gamestate: how the game is going
    - item: how somebody interacts with an item
    - stats: info about a player at a point in time (kills, distance walked, ...)


    We do not always want to store on disk, store in mongo or return the
    dictionary, so there is a flag for each one.
    We do not always want to process telemetry since it's pretty big, so there
    is a flag to skip it.


    Parameters
    ----------
    list_matches
    telemetry
    store_mongo
    store_disk
    return_data

    Returns
    -------
    A dictionary with the match data, participants data and telemetry events for a set
    of players
    """

    # Create the data structure to return
    res = dict({
        'match_data': list(),
        'participant_data': list(),
        'telemetry_users': list(),
        'telemetry_match': list(),
    })
    # Iterate over the match ids
    for m_id in list_matches:

        # Get info from matches
        m_info = self.match(m_id, self.shard)
        m_data = m_info.get_data()
        participants = {
            m_id: m_data.get('participants')
        }
        # Store info
        if store_mongo:
            self.mongo.insert_item('matches', m_data)
            self.mongo.insert_item('participants', participants)
        # Update response
        if return_data:
            res.get('match_data').append(m_data)
            res.get('participant_data').append(participants)
        if telemetry:
            t_url = m_data.get('telemetry')
            t_data = self.telemetry(t_url, m_id).get_data().get('telemetry')

            user_activity = list()
            match_activity = list()
            other_activity = list()

            for action in t_data:
                # We are only interested in the events AFTER lift off.
                # The only event without `common` is matchStart, which has
                # just been popped from the list
                is_game = action.get('common').get('isGame') >= 0.1
                # if is_game:
                #     tele_ingame_data.append(action)

                action_type = action.get('_T')
                pl = self.players
                if is_game and action_type in EVENTS_USER:
                    # Is the player attacking somebody?
                    try:
                        attacks = action.get('attacker').get('name') in pl
                        if attacks:
                            user_activity.append(action)
                    except AttributeError:
                        pass
                    # Is the player being attacked?
                    try:
                        is_victim = action.get('victim').get('name') in pl
                        if is_victim:
                            user_activity.append(action)
                    except AttributeError:
                        pass
                    # Is the player interacting with something?
                    try:
                        user = action.get('character').get('name') in pl
                        if user:
                            user_activity.append(action)
                    except AttributeError:
                        pass
                elif is_game and action_type in EVENTS_GAME:
                    match_activity.append(action)

                elif is_game:
                    other_activity.append(action)
                else:
                    pass
            user_info = {
                '_id': m_id,
                'id': m_id,
                'data': user_activity
            }
            match_info = {
                '_id': m_id,
                'id': m_id,
                # NOTE: as posted, this refers to match_info itself, which is not
                # yet defined on the first iteration; m_info or m_data may have been intended
                'match_details': match_info,
                'data': match_activity
            }
            # Update values
            if return_data:
                res.get('telemetry_users').append(user_info)
                res.get('telemetry_match').append(match_info)
            # Store info
            if store_mongo:
                self.mongo.insert_item('telemetry_users', user_info)
                self.mongo.insert_item('telemetry_match', match_info)
            if store_disk:
                path = 'output/'
                filename = f'{m_id}.json'
                data = {
                    'match_info': match_info,
                    'telemetry': t_data,
                }
                store_json(path, filename, data)
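The telemetry filtering in the method above does not depend on mongo, the disk or the HTTP session, so it could be pulled out into a standalone function. The sketch below is one possible shape, not the code as posted: `player_names` and `split_telemetry` are hypothetical names, the event-type sets in the usage example are placeholders for `EVENTS_USER` and `EVENTS_GAME`, and it differs from the original in one respect that should be checked against the intended behaviour: an event is appended to the user list at most once, even when a tracked player appears as both attacker and victim.

```python
def player_names(event):
    """Return the set of names found under attacker, victim and character."""
    names = set()
    for key in ("attacker", "victim", "character"):
        actor = event.get(key)
        # Some events lack these keys or carry None, hence the type check
        if isinstance(actor, dict) and actor.get("name"):
            names.add(actor["name"])
    return names


def split_telemetry(events, players, user_events, game_events, threshold=0.1):
    """Split in-game telemetry into (user, match, other) event lists.

    user_events and game_events play the role of EVENTS_USER and EVENTS_GAME;
    threshold mirrors the isGame >= 0.1 check.
    """
    players = set(players)
    user_activity, match_activity, other_activity = [], [], []
    for event in events:
        common = event.get("common") or {}
        if common.get("isGame", 0) < threshold:
            continue  # pre-game event, or an event without `common`
        event_type = event.get("_T")
        if event_type in user_events:
            if player_names(event) & players:
                user_activity.append(event)
        elif event_type in game_events:
            match_activity.append(event)
        else:
            other_activity.append(event)
    return user_activity, match_activity, other_activity


if __name__ == "__main__":
    sample = [
        {"_T": "LogPlayerAttack", "common": {"isGame": 1.0},
         "attacker": {"name": "alice"}},
        {"_T": "LogGameStatePeriodic", "common": {"isGame": 2.0}},
        {"_T": "LogMatchStart"},
    ]
    user, match, other = split_telemetry(
        sample, ["alice"], {"LogPlayerAttack"}, {"LogGameStatePeriodic"})
    print(len(user), len(match), len(other))
```

Because the function takes plain lists and sets and returns plain lists, it can be tested without any API key or database, and the set lookup on player names replaces the three try/except blocks.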



Given that the code works, my questions are:




  • Is this structure OK? Is it "pythonic"?


  • process_matches() is very big. Should I keep both the data transformation and the storage in the same method?


  • process_matches() is getting slow (3 seconds for each request). As I'm proceeding sequentially, it takes forever to process 500-1000 matches (500-1000 seconds). Should I split it into different methods, so that I have one for requesting, one for storing on disk, one for storing in mongo and one for returning the data?

  • If the answer to the previous question is yes, should I keep the API response in self? It can be quite big, sometimes around 100MB.
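On the speed question, if most of the time per match is spent waiting on the network (an assumption, since no profiling is shown here), the requests could be issued concurrently with the standard library's thread pool. This is a minimal sketch: `fetch_all` and `fetch_one` are hypothetical names, `fetch_one` stands for whatever performs one match request, and any rate limit imposed by the API would still apply. Whether a single shared HTTP session can safely be used from several threads is also worth checking before adopting this.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(match_ids, fetch_one, max_workers=8):
    """Run fetch_one(match_id) for every id in a thread pool.

    Returns a dict mapping each match id to whatever fetch_one returned.
    An exception raised by fetch_one is re-raised here when the results
    are collected.
    """
    match_ids = list(match_ids)  # iterated twice below, so materialise it
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as match_ids
        results = pool.map(fetch_one, match_ids)
        return dict(zip(match_ids, results))
```

A call might look like `fetch_all(list_matches, lambda m_id: self.match(m_id, self.shard))`, with the transformation and storage steps run afterwards on each result. Collecting every response in one dict brings back the memory concern from the last question, so for large telemetry payloads it may be preferable to process and discard each result as it arrives.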

















$endgroup$












  • $begingroup$
    I'm not sure why this is getting downvotes. It doesn't appear to be off-topic, and if it is, no one has explained why it's off-topic. The OP does introduce a structure at the beginning that may be outside of the scope of the site (I consider it context), but they do have non-structural code at the end. There may be a benefit in moving process_matches to the top of the question and shortening (if not removing) the structural component.
    $endgroup$
    – Graham
    yesterday










  • $begingroup$
    The current question title, which states your concerns about the code, is too general to be useful here. The site standard is for the title to simply state the task accomplished by the code. Please see How to Ask for examples, and revise the title accordingly.
    $endgroup$
    – Jamal
    58 mins ago










  • $begingroup$
    @Jamal I've just changed the title accordingly. Thank you.
    $endgroup$
    – Koehler
    11 mins ago
















python performance object-oriented api web-services






edited 18 mins ago







Koehler






















asked 2 days ago









Koehler




New contributor




Koehler is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.





New contributor





Koehler is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.






Koehler is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.












  • $begingroup$
    I'm not sure why this is getting downvotes. It doesn't appear to be off-topic, and if it is, no one has explained why it's off-topic. The OP does introduce a structure at the beginning that may be outside of the scope of the site (I consider it context), but they do have non-structural code at the end. There may be a benefit in moving process_matches to the top of the question and shortening (if not removing) the structural component.
    $endgroup$
    – Graham
    yesterday










  • $begingroup$
    The current question title, which states your concerns about the code, is too general to be useful here. The site standard is for the title to simply state the task accomplished by the code. Please see How to Ask for examples, and revise the title accordingly.
    $endgroup$
    – Jamal
    58 mins ago










  • $begingroup$
    @Jamal I've just changed the title accordingly. Thank you.
    $endgroup$
    – Koehler
    11 mins ago








































