PUBG API Client request: Transforming and storing raw JSON in batches

Lately I've been trying to improve my coding, in particular to stop writing everything in a single file, but I have a lot of questions about how to organize the code and whether what I'm doing makes sense.

I wanted to organize the project in the following fashion, keeping in mind I'd like to "scale up" (reuse) most of the structure for other solutions:

.
├── config.py           (global configuration)
├── core
│   ├── client.py       (generic requests class)
│   └── persist.py      (mongodb layer class)
├── pubg
│   ├── controller.py   (handler for pubg)
│   └── models.py       (models for pubg)
└── main.py

The file controller.py holds the "core" of the script: the requests to the official API (Pubg class) and the data transformation (PubgManager class), which happens while the requests are being made.

The pseudo-code looks like this:

class Client:
    ...
    def request(self, endpoint, params=None):
        response = self.session.get(endpoint, timeout=TIMEOUT, params=params)
        return response.json()


class Pubg(Client):
    # Makes the requests to the official API endpoints {player, match, telemetry}
    ...
    def matches(self, params, shard):
        res = []
        url = f'{SHARD_URL}/{shard}/matches'
        response = self.request(url, params)
        # item: one element of the response (the loop is elided in this pseudo-code)
        res.append(Match(item))
        return res


class PubgManager(Pubg):
    # Transforms the data and stores it (either in mongo or on disk)
    # I kept in the class the list of matches to process (matches_to_process) and the shard
    def __init__(self, shard, players, mongo):
        super().__init__(api_key)
        # Keep the mongo instance in order to persist data later
        self.mongo = mongo
        self.mongo_pubg = mongo.get_client().pubg
        ...

    def process_matches(self, list_matches):
        for m_id in list_matches:
            res = self.matches(m_id)
            data = transform_request(res)
            store_in_mongo(data)
            store_in_disk(data)
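
For reference, the request, transform and store flow outlined in the pseudo-code can be written so that it runs on its own. This is only a minimal sketch under stated assumptions: `fetch`, `transform` and the sink callables are hypothetical stand-ins that do not exist in the project, and the fake data below is made up purely so the example executes.

```python
from typing import Any, Callable, Dict, Iterable, Iterator

Sink = Callable[[Dict[str, Any]], None]


def process_matches(
    match_ids: Iterable[str],
    fetch: Callable[[str], Dict[str, Any]],
    transform: Callable[[Dict[str, Any]], Dict[str, Any]],
    sinks: Iterable[Sink] = (),
) -> Iterator[Dict[str, Any]]:
    """Fetch, transform and hand each match to every sink, one match at a time."""
    sinks = list(sinks)
    for match_id in match_ids:
        data = transform(fetch(match_id))
        for sink in sinks:
            sink(data)
        # Yielding instead of accumulating keeps only one match in memory here.
        yield data


# Hypothetical stand-ins for the real API call and transformation.
def fake_fetch(match_id):
    return {"id": match_id, "participants": ["a", "b"]}


def fake_transform(raw):
    return {"_id": raw["id"], "n_participants": len(raw["participants"])}


stored = []  # plays the role of a storage layer (mongo, disk, ...)
results = list(
    process_matches(["m1", "m2"], fake_fetch, fake_transform, [stored.append])
)
```

With this shape, storing in mongo or on disk becomes a matter of which sinks are passed in, rather than a boolean flag per destination.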

The actual code of process_matches, which keeps growing, is the following:

def process_matches(self, list_matches, telemetry=True, store_mongo=True,
                    store_disk=False, return_data=False):
    """
    Given a list of match ids, request the match information through the
    parent class, extract certain data from the response json, build some
    data structures and store them in mongo and on disk as json.

    Only the players in self.players are considered, so each event is
    checked to see whether it concerns one of them.

    The PUBG API returns the following info:
        - Match: Description and results of the match
        - Participant: Results of the match by player
        - Telemetry: Events that happened during the game

    Telemetry is a huge list of dictionaries (up to 300k items) with ALL the
    events. We only want a few players (self.players) and a few events
    (EVENTS_USER) from a certain point onwards (isGame >= 0.1), so we have
    to check ALL the dictionaries in the telemetry list to discard the
    uninteresting ones.

    Each dictionary contains the following info:
    https://documentation.pubg.com/en/telemetry-objects.html

        - _T: Event type, whether it relates to a player or to the match
        - _D: Timestamp
        - common: state of the game (isGame)
        - character: what a player does
        - gamestate: how the game is going
        - item: how somebody interacts with an item
        - stats: info about a player at a point in time (kills, distance walked, ...)

    We don't always want to store on disk, store in mongo or return the
    dictionary, so there is a flag for each one.
    We don't always want to process telemetry either, since it's pretty big,
    so there is a flag to skip it.

    Parameters
    ----------
    list_matches : list of match ids to process
    telemetry : whether to request and process the telemetry
    store_mongo : whether to store the results in mongo
    store_disk : whether to store the results on disk as json
    return_data : whether to collect the results in the returned dictionary

    Returns
    -------
    A dictionary with the match data, participants data and telemetry events
    for a set of players
    """

    # Create the data structure to return
    res = {
        'match_data': [],
        'participant_data': [],
        'telemetry_users': [],
        'telemetry_match': [],
    }
    # Iterate over the match ids
    for m_id in list_matches:

        # Get info from the match
        m_info = self.match(m_id, self.shard)
        m_data = m_info.get_data()
        participants = {
            m_id: m_data.get('participants')
        }
        # Store info
        if store_mongo:
            self.mongo.insert_item('matches', m_data)
            self.mongo.insert_item('participants', participants)
        # Update response
        if return_data:
            res['match_data'].append(m_data)
            res['participant_data'].append(participants)
        if telemetry:
            t_url = m_data.get('telemetry')
            t_data = self.telemetry(t_url, m_id).get_data().get('telemetry')

            user_activity = []
            match_activity = []
            other_activity = []

            for action in t_data:
                # We are only interested in the events AFTER lift off.
                # The only event without `common` is matchStart, which has
                # already been popped from the list
                is_game = action.get('common').get('isGame') >= 0.1
                # if is_game:
                #     tele_ingame_data.append(action)

                action_type = action.get('_T')
                pl = self.players
                if is_game and action_type in EVENTS_USER:
                    # Is the player attacking somebody?
                    try:
                        attacks = action.get('attacker').get('name') in pl
                        if attacks:
                            user_activity.append(action)
                    except AttributeError:
                        pass
                    # Is the player being attacked?
                    try:
                        is_victim = action.get('victim').get('name') in pl
                        if is_victim:
                            user_activity.append(action)
                    except AttributeError:
                        pass
                    # Is the player interacting with something?
                    try:
                        user = action.get('character').get('name') in pl
                        if user:
                            user_activity.append(action)
                    except AttributeError:
                        pass
                elif is_game and action_type in EVENTS_GAME:
                    match_activity.append(action)

                elif is_game:
                    other_activity.append(action)
                else:
                    pass
            user_info = {
                '_id': m_id,
                'id': m_id,
                'data': user_activity
            }
            match_info = {
                '_id': m_id,
                'id': m_id,
                'match_details': m_data,  # assumption: the match dict fetched above
                'data': match_activity
            }
            # Update values
            if return_data:
                res['telemetry_users'].append(user_info)
                res['telemetry_match'].append(match_info)
            # Store info
            if store_mongo:
                self.mongo.insert_item('telemetry_users', user_info)
                self.mongo.insert_item('telemetry_match', match_info)
            if store_disk:
                path = 'output/'
                filename = f'{m_id}.json'
                data = {
                    'match_info': match_info,
                    'telemetry': t_data,
                }
                store_json(path, filename, data)

    return res
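
To make the filtering step easier to discuss, here is a hedged sketch of the telemetry loop isolated as a pure function that needs neither the API nor mongo. The event names in `EVENTS_USER` and `EVENTS_GAME` below are placeholders, not the real constants, and `split_telemetry` and `involves_player` are hypothetical helper names. Two behaviours differ from the loop above and are assumptions of the sketch: an event is appended at most once even if a tracked player is both attacker and victim, and events without `common` are skipped instead of raising.

```python
# Placeholder event sets; the real EVENTS_USER / EVENTS_GAME may differ.
EVENTS_USER = {"LogPlayerAttack", "LogPlayerKill", "LogItemPickup"}
EVENTS_GAME = {"LogGameStatePeriodic"}

# Keys of an event that may hold a player dictionary with a 'name'.
ROLES = ("attacker", "victim", "character")


def involves_player(event, players):
    """Return True if any role of the event belongs to a tracked player."""
    for role in ROLES:
        actor = event.get(role) or {}
        if actor.get("name") in players:
            return True
    return False


def split_telemetry(events, players, min_is_game=0.1):
    """Split in-game events into (user_activity, match_activity)."""
    user_activity, match_activity = [], []
    for event in events:
        common = event.get("common") or {}
        # Skip everything before lift off (and events lacking `common`).
        if (common.get("isGame") or 0) < min_is_game:
            continue
        event_type = event.get("_T")
        if event_type in EVENTS_USER:
            if involves_player(event, players):
                user_activity.append(event)
        elif event_type in EVENTS_GAME:
            match_activity.append(event)
    return user_activity, match_activity
```

Because the function only takes a list of dictionaries and a set of names, it can be exercised with a handful of hand-written events.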

Bearing in mind that the code works, my questions are:

Is this structure ok? Is this "pythonic"?

process_matches() is very big. Should I keep both the data transformation and the storage in the same method?

process_matches() is getting slow (about 3 seconds per request). Because I process the matches sequentially, it takes forever to get through 500-1000 matches (at least 1,500-3,000 seconds at that rate). Should I split it into different methods, so that I have one for requesting, one for storing on disk, one for storing in mongo and one for returning the data?

If the answer to the previous question is yes, should I keep the API response in self? It can be quite big, around 100 MB at times.
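
To illustrate the sequential bottleneck mentioned in the questions above, this is a rough sketch of overlapping the requests with a thread pool from the standard library. `fetch_match` is a hypothetical stand-in that sleeps instead of calling the API, and `fetch_all` and `max_workers=8` are arbitrary choices for the example; any rate limit the real API imposes would still apply.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_match(match_id):
    """Stand-in for the real API call; sleeps to simulate network latency."""
    time.sleep(0.05)
    return {"id": match_id}


def fetch_all(match_ids, fetch=fetch_match, max_workers=8):
    """Yield one result per match id, in input order, using worker threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map submits every call up front and yields the results in
        # the order of the inputs, so finished responses wait in memory
        # until the caller consumes them.
        yield from pool.map(fetch, match_ids)


ids = [f"m{i}" for i in range(16)]
responses = list(fetch_all(ids))
```

Threads suit this case because the time is spent waiting on the network rather than on computation. Since large responses stay in memory until they are consumed, a smaller `max_workers` or processing in batches may be preferable when each response is around 100 MB.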

asked 2 days ago by Koehler (edited 18 mins ago)

I'm not sure why this is getting downvotes. It doesn't appear to be off-topic, and if it is, no one has explained why it's off-topic. The OP does introduce a structure at the beginning that may be outside of the scope of the site (I consider it context), but they do have non-structural code at the end. There may be a benefit in moving process_matches to the top of the question and shortening (if not removing) the structural component.
– Graham, yesterday

The current question title, which states your concerns about the code, is too general to be useful here. The site standard is for the title to simply state the task accomplished by the code. Please see How to Ask for examples, and revise the title accordingly.
– Jamal♦, 58 mins ago

@Jamal I've just changed the title accordingly. Thank you.
– Koehler, 11 mins ago

I've been lately trying to improve my coding, in particular stop creating single files, but I have a lot of questions about how to organize the code and if what I'm doing makes sense.

I wanted to organize the project in the following fashion, keeping in mind I'd like to "scale up" (reuse) most of the structure for other solutions:

.                                

├── config.py           (global configuration)

├── core                  

│   ├── client.py       (generic requests class)  

│   └── persist.py      (mongodb layer class)

├── pubg                 

│   ├── controller.py   (handler for pubg)

│   └── models.py       (models  for pubg)

└── main.py

The file controller.py has the "core" of the script: data request for the Official API (Pubg class) and data transformation (PubgManager class) during the requests time.

The psdseudo-code looks like this:

class Client:

    ...

    def request(self, endpoint, params=None:

        response = self.session.get(endpoint, timeout=TIMEOUT, params=params)

        return response.json()



class Pubg(Client):

    # Does the request to the official API endpoints {palyer, match, telemtry}

    ...

    def matches(self, params, shard):

        res = 

        url = f'{SHARD_URL}/{shard}/matches'

        response = self.request(url, params)

        res.append(Match(item))

        return res



class PubgManager(Pubg):

    # Transform the data and stores it (either in mongo or disk)

    # I kept in the class the list of matches to process (matches_to_process) and shard 

    def __init__(self, shard, players, mongo):

        super().__init__(api_key)

        # Keep mongo instance in order to persist data later

        self.mongo = mongo

        self.mongo_pubg = mongo.get_client().pubg

        ...



    def process_matches(list_matches):

        for m_id in list_matches:

            res = self.matches(m_id)

            data = transform_request(res)

            store_in_mongo(data)

            store_in_disk(data)

The particular code of process_matches -which is getting bigger- is the following:

def process_matches(self, list_matches, telemetry=True, store_mongo=True,

                    store_disk=False, return_data=False):

    """

    Having a list of matches, request the information in the upper class,

    extracts certain data out of the response json, create some data structures

    and stores it in mongo and disk as json.



    This is only for a list of players which is in self.players, 

    so each element will be checked whether it concers to the particular player 



    The Pubg API returns you the following info

        - Match: Description and results about the match

        - Participant: Results of the match by player

        - Telemetry: Events happened  during the game



    Telemetry is a huge list of dictionaries (up to 300k items) with ALL the events. 

    We only want few players (self.list_players) and few events (EVENTS_USER) as of certain point (isGame>0.1).

    So we have to check ALL the dictionares within telemetry list to avoid 

    non interesting information.



    Each dictionary contains the following info:

    https://documentation.pubg.com/en/telemetry-objects.html



        - _T: Event type, whether is relative to player/match

        - _D: Timestamp

        - common: state of the game (isGame)

        - character: what a player does

        - gamestate: how is the game going

        - item: how somebody interacts with an item

        - stats: info about player in a point of time (kills, distance walked, ... )





    Not always we want to store in disk, in mongo or return the dictionary so

    we have a flag for each one.

    Not always we want to process telemetry since it's pretty big, so we have

    a flag to avoid it





    Parameters

    ----------

    list_matches

    telemetry

    store_mongo

    store_disk

    return_data



    Returns

    -------

    A dictionary with the match data, participants data and telemetry events for a set

    of players

    """



    # Create the Data Structure to return

    res = dict({

        'match_data': list(),

        'participant_data': list(),

        'telemetry_users': list(),

        'telemetry_match': list(),

    })

    # Iterate over the matches id

    for m_id in list_matches:



        # get info from matches

        m_info = self.match(m_id, self.shard)

        m_data = m_info.get_data()

        participants = {

            m_id: m_data.get('participants')

        }

        # Store info

        if store_mongo:

            self.mongo.insert_item('matches', m_data)

            self.mongo.insert_item('participants', participants)

        # Update response

        if return_data:

            res.get('match_data').append(m_data)

            res.get('participant_data').append(participants)

        if telemetry:

            t_url = m_data.get('telemetry')

            t_data = self.telemetry(t_url, m_id).get_data().get('telemetry')



            user_activity = list()

            match_activity = list()

            other_activity = list()



            for action in t_data:

                # We are only interested in the events AFTER lift-off.

                # The only event without `common` is matchStart, which has

                # already been popped off the list.

                is_game = action.get('common').get('isGame') >= 0.1

                # if is_game:

                #     tele_ingame_data.append(action)



                action_type = action.get('_T')

                pl = self.players

                if is_game and action_type in EVENTS_USER:

                    # Is the player attacking somebody?

                    try:

                        attacks = action.get('attacker').get('name') in pl

                        if attacks:

                            user_activity.append(action)

                    except AttributeError:

                        pass

                    # Is the player being attacked?

                    try:

                        is_victim = action.get('victim').get('name') in pl

                        if is_victim:

                            user_activity.append(action)

                    except AttributeError:

                        pass

                    # Is the player interacting with something ?

                    try:

                        user = action.get('character').get('name') in pl

                        if user:

                            user_activity.append(action)

                    except AttributeError:

                        pass

                elif is_game and action_type in EVENTS_GAME:

                    match_activity.append(action)



                elif is_game:

                    other_activity.append(action)

                else:

                    pass

            user_info = {

                '_id': m_id,

                'id': m_id,

                'data': user_activity

            }

            match_info = {

                '_id': m_id,

                'id': m_id,

                'match_details': m_data,

                'data': match_activity

            }

            # Update values

            if return_data:

                res.get('telemetry_users').append(user_info)

                res.get('telemetry_match').append(match_info)

            # Store info

            if store_mongo:

                self.mongo.insert_item('telemetry_users', user_info)

                self.mongo.insert_item('telemetry_match', match_info)

            if store_disk:

                path = 'output/'

                filename = f'{m_id}.json'

                data = {

                    'match_info': match_info,

                    'telemetry': t_data,

                }

                store_json(path, filename, data)

    return res
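
The telemetry filter inside the loop above can also be read as a small pure function, which makes it testable without touching the API, mongo or the disk. What follows is only a minimal sketch under assumptions: `involves_player` and `classify_event` are hypothetical names, and the two event sets are placeholders for the real `EVENTS_USER` and `EVENTS_GAME`. Unlike the loop above, it records an event once even when a tracked player is both attacker and victim.

```python
# Placeholder event sets (assumption): the real EVENTS_USER / EVENTS_GAME
# are defined elsewhere in the project.
EVENTS_USER = {"LogPlayerAttack", "LogPlayerTakeDamage", "LogItemPickup"}
EVENTS_GAME = {"LogGameStatePeriodic"}


def involves_player(action, players):
    """Return True if attacker, victim or character is one of `players`."""
    for role in ("attacker", "victim", "character"):
        actor = action.get(role) or {}  # the key may be missing or None
        if actor.get("name") in players:
            return True
    return False


def classify_event(action, players):
    """Return 'user', 'match', 'other', or None when the event is dropped."""
    common = action.get("common") or {}
    if common.get("isGame", 0) < 0.1:
        return None  # before lift-off: not interesting
    action_type = action.get("_T")
    if action_type in EVENTS_USER:
        # User events are kept only when a tracked player takes part.
        return "user" if involves_player(action, players) else None
    if action_type in EVENTS_GAME:
        return "match"
    return "other"
```

With something like this, the body of the `for action in t_data` loop reduces to one call plus an append to the list matching the returned label.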

Bearing in mind that the code works, my questions are:

Is this structure OK? Is it "pythonic"?

process_matches() is very big. Should I keep both the data transformation and the storage in the same method?

process_matches() is getting slow (3 seconds for each request). As I'm proceeding sequentially, it takes forever to process 500-1000 matches (500-1000 seconds). Should I split it into different methods, so that I have one for requesting, one for storing on disk, one for storing in mongo and one for returning the data?

If the answer to the previous question is yes, should I keep the API response in self, given that it can sometimes be quite big (around 100 MB)?
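
For context on the sequential bottleneck: the time is mostly spent waiting on HTTP responses, so one common option is to run the per-match work on a thread pool. The sketch below is an assumption about how that could look, not a tested part of this project. `process_concurrently` and its `process_one` argument are hypothetical names, where `process_one` stands for a function that fetches, transforms and stores a single match.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def process_concurrently(match_ids, process_one, max_workers=8):
    """Run process_one(match_id) for every id on a thread pool.

    Returns (results, errors), two dicts keyed by match id.
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to the match id it was submitted for.
        futures = {pool.submit(process_one, m_id): m_id for m_id in match_ids}
        for future in as_completed(futures):
            m_id = futures[future]
            try:
                results[m_id] = future.result()
            except Exception as exc:  # one failing match should not stop the batch
                errors[m_id] = exc
    return results, errors
```

If `process_one` returns only a small summary and writes the large telemetry payload straight to mongo or disk, nothing of that size has to stay in `self` between matches. The API's rate limits should be checked before raising `max_workers`.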

python performance object-oriented api web-services

edited 18 mins ago

asked 2 days ago

Koehler

121

New contributor

I'm not sure why this is getting downvotes. It doesn't appear to be off-topic, and if it is, no one has explained why it's off-topic. The OP does introduce a structure at the beginning that may be outside of the scope of the site (I consider it context), but they do have non-structural code at the end. There may be a benefit in moving process_matches to the top of the question and shortening (if not removing) the structural component.

– Graham
yesterday

The current question title, which states your concerns about the code, is too general to be useful here. The site standard is for the title to simply state the task accomplished by the code. Please see How to Ask for examples, and revise the title accordingly.

– Jamal♦
58 mins ago

@Jamal I've just changed the title accordingly. Thank you.

– Koehler
11 mins ago
