Polymorphic Data Socket Factory with Query Contingency Protocols in Python
I work on a small data team, and we developed this tool after we began experiencing several job failures on our legacy big data systems. We have several data lakes, each with its own API, and until now had no contingencies for query failures. Furthermore, our database sockets had no central repository for accessing our data. This tool brings all query platforms together in a (relatively) seamless experience by building a socket library, and it offers redundancy protocols when queries fail. It is compatible with any SQL-like interface that has an established Python API, such as Impala, Hive, AWS, and terminal/command-line returns.
The usage is very straightforward:
# Upon instantiation:
#   > mquery() imports all sockets and aliases via the factory
#   > Default attempts on any socket is 8
#   > Default socket is the first socket in the library
#   > Default precedence is all sockets of the same type as the default socket
#   > Default for fail_on_empty and fail_on_zero are both False

# Functional
a = mquery_lite(r"SELECT 'test';") # Returns the result set only; object instance is destructed on next mquery_lite() call
print(str(a))

# Functional with overrides
a = mquery_lite(r"SELECT 'test';", query_attempts=2, fail_on_zero=True) # Returns the result set only, overriding the default query_attempts and fail_on_zero values
print(str(a))

# Object Oriented with Overrides
a = mquery_lite(socket_precedence=["pd","data"]) # Precedence is set to both the pandas and all data sockets
a.query_attempts = 4
a.query(r"SELECT 'test';", query_attempts=2, socket_precedence=["pd"]) # Overrides on query submission
# query_attempts returns to 4 and socket_precedence returns to ["pd","data"]
print(str(a.get_results()))

# Object Oriented with Overrides and Handshake
a = mquery_lite(socket_precedence=["all","data"], query_attempts=4) # Precedence list will be the list of all sockets plus the additional list of all data sockets; there will be duplicates in the precedence list (not problematic)
a.handshake() # Test each socket in the current precedence list; retain only those that pass. Duplicates will persist.
a.query(r"SELECT 'test';", query_attempts=2)
# query_attempts returns to 4
print(str(a.get_results()))
Naturally, all proprietary information has been redacted from this post. I've also stripped most of the bells and whistles (e.g. typecasting property decorators) to demonstrate basic functionality and proof of concept in the code below; all pieces of this code can readily be found from open sources on the internet, but I haven't seen them tied together in this manner.
The mquery_lite() object is the primary process of this code and can either be used functionally or instanced as an object. When called, mquery_lite() determines if a query was provided - if so, it will instance itself, perform the query, then return the result pointer from the successful socket. If a query is not passed, mquery_lite() remains instanced and user-modified settings are retained.
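To illustrate the dual behaviour in isolation, here is a minimal sketch of the `__new__` dispatch. The class name `Runner`, its `attempts` parameter, and the echoing `query()` body are hypothetical stand-ins, not part of the real tool; the point is only that returning something other than an instance from `__new__` stops Python from calling `__init__` a second time.

```python
class Runner:
    """Hypothetical stand-in for mquery_lite(); names are illustrative only."""

    def __new__(cls, query=None, attempts=8):
        instance = super().__new__(cls)
        if query is None:
            # No query: hand back the instance; Python then calls __init__ on it.
            return instance
        # Query given: configure by hand, run it, and return the result instead.
        # The returned list is not a Runner, so __init__ is not called again.
        instance.__init__(attempts=attempts)
        instance.query(query)
        return instance.get_results()

    def __init__(self, query=None, attempts=8):
        self.attempts = attempts
        self.result = None

    def query(self, query):
        # Placeholder for real socket execution: echo the query back as one row.
        self.result = [(query,)]

    def get_results(self):
        return self.result


rows = Runner("SELECT 'test';")   # functional use: only the result set comes back
runner = Runner(attempts=4)       # object use: a configurable instance comes back
runner.query("SELECT 'test';")
```

In this sketch `__new__` and `__init__` share the same parameter order, which keeps positional arguments from being mapped to the wrong setting when no query is passed.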
Sockets are imported by way of a generator-encapsulated factory. Their aliases are mapped in a separate library for ease of use when calling sockets. Sockets are separated by type, defined in the socket itself (we prefer to group by the expected output of the socket as this ensures consistent output on query failure; e.g. data frame, list of lists, generator, etc.). Sockets retain the query results until a new query is submitted.
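As a rough sketch of how the socket and alias libraries can be derived from the subclasses of a common base, assuming simplified stand-in classes (`Socket`, `PandasSocket`, `OdbcSocket`, and `build_libraries` are hypothetical names, and the shared "pd" alias is contrived to show a collision):

```python
class Socket:
    """Hypothetical base class; every subclass becomes a library entry."""
    socket_aliases = []
    socket_type = None


class PandasSocket(Socket):
    socket_aliases = ["pandas", "pd"]
    socket_type = "data"


class OdbcSocket(Socket):
    socket_aliases = ["odbc", "pd"]   # "pd" collides with PandasSocket
    socket_type = "standard"


def build_libraries(base):
    # One instance per subclass, in the order the classes were defined.
    library = {cls.__name__: cls() for cls in base.__subclasses__()}
    # Every socket is addressable by its own class name.
    aliases = {name: [name] for name in library}
    # Aliases are claimed by the first socket that declares them.
    for name, sock in library.items():
        for alias in sock.socket_aliases:
            aliases.setdefault(alias, [name])
    # A type name expands to every socket of that type.
    types = {}
    for name, sock in library.items():
        types.setdefault(sock.socket_type, []).append(name)
    aliases.update(types)
    aliases["all"] = list(library)
    return library, aliases


library, aliases = build_libraries(Socket)
```

Here `aliases["pd"]` resolves to the pandas stand-in only, because it was defined first, while `aliases["all"]` lists both sockets in definition order.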
The socket and alias libraries are automatically built on instantiation, based on the order in which they are present in the script. Collisions are rectified on a first-come-first-serve basis. The following object variables are created on instantiation:
query_attempts (default 8) is the number of attempts mquery_lite() will make on a socket before moving to the next socket. An exponential timer (2^n) sets the pause between repeat queries on a socket.
socket_default (default None) is the socket that will be substituted in the precedence list when an unknown alias is provided. Will default to the first socket in the library if None is detected.
socket_precedence (default []) is the order in which sockets will be attempted. Will default to all sockets of the same type as the default socket in the library if no precedence is provided.
fail_on_empty (default False) indicates if a query should raise an exception if it comes back empty (useful for command queries).
fail_on_zero (default False) indicates if a query should raise an exception if it comes back zero (useful for counts).
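The two failure flags can be pictured with a small standalone check over a list-of-rows result. This is a sketch under assumptions: `check_rows` is a hypothetical helper, and "zero" is taken to mean a single row holding a single value equal to 0 or "0", which is how the pyodbc socket in this post treats it.

```python
class PermittedSocketError(Exception):
    """Retryable failure, mirroring the exception named in the post."""


def check_rows(rows, fail_on_empty=False, fail_on_zero=False):
    # Look only at the first row; None means there were no rows at all.
    first = rows[0] if rows else None
    if fail_on_empty and not first:
        raise PermittedSocketError("Empty return detected.")
    # Guard against a missing row before inspecting its single value.
    if fail_on_zero and first is not None and len(first) == 1 and first[0] in (0, "0"):
        raise PermittedSocketError("Zero return detected.")
    return rows
```

With both flags off the rows pass through untouched, so a count of zero or an empty command return is only treated as a failure when the caller asks for it.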
Results remain and failures occur at the socket level. Handling of permitted errors (raised from sockets) occurs in the .query() method.
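Stripped of the socket classes, the retry-then-fall-back behaviour can be sketched as below. The function `run_with_fallback`, the `(name, callable)` pairs, and the injectable `sleep` argument are assumptions made for illustration; the real method works on the socket library instead.

```python
import time


class PermittedSocketError(Exception):
    """Retryable failure, mirroring the exception named in the post."""


def run_with_fallback(sockets, query, attempts=8, sleep=time.sleep):
    """Try each (name, callable) pair in order, retrying each up to `attempts` times."""
    last_error = None
    for name, execute in sockets:
        for attempt in range(attempts):
            if attempt > 0:
                sleep(2 ** attempt)   # 2, 4, 8, ... seconds between retries
            try:
                return name, execute(query)
            except PermittedSocketError as error:
                last_error = error    # permitted: retry, then move to the next socket
    raise PermittedSocketError("All sockets failed.") from last_error


def flaky(query):
    raise PermittedSocketError("simulated outage")


def healthy(query):
    return [("test",)]


waits = []
winner, rows = run_with_fallback(
    [("flaky", flaky), ("healthy", healthy)],
    "SELECT 'test';",
    attempts=3,
    sleep=waits.append,   # record the pauses instead of actually sleeping
)
```

Any exception other than PermittedSocketError propagates immediately, which matches the idea that only permitted errors should trigger a retry.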
import pandas
import pyodbc
import time

def peek(x):
    try:
        return next(x)
    except StopIteration:
        return None
###############################################################################
### Dynamic Polymorphic Socket Factory ########################################
###############################################################################
class PermittedSocketError(Exception):
    """
    A socket error that should trigger a retry, but not a program termination.
    """
    pass

class socket:
    DSN = "DSN=Your.DSN.Info.Here;" # Used in pyodbc and pandas sockets

    def handshake(self, query="SELECT 'test';"):
        self.execute(query, fail_on_empty=False, fail_on_zero=False)

# Dynamic socket factory
# https://python-3-patterns-idioms-test.readthedocs.io/en/latest/Factory.html
class factory:
    objects = {}

    @staticmethod
    def add_factory(id, socket_factory):
        # Register a factory object under the given socket class name
        factory.objects[id] = socket_factory

    @staticmethod
    def create_object(id):
        if id not in factory.objects: # updated for Python 3
            # eval() resolves the nested factory class from the socket's class name
            factory.objects[id] = eval(id + ".factory()")
        return factory.objects[id].create()
###############################################################################
### Socket Library ############################################################
###############################################################################
class generic_socket(socket):
    # Template only: replace each /* ... */ placeholder with a real condition before use
    socket_aliases = ["alias_1", "alias_2"]
    socket_type = "type"

    @property
    def result(self):
        # Any type of return handling can go here (such as a generator to improve post-parsing)
        return self.__data_block

    def execute(self, query, fail_on_empty, fail_on_zero):
        # Set up for query
        self.__data_block = None
        try:
            # Execute query
            # Internal post query handling of error codes should raise exceptions here - useful for non-Pythonic (e.g. command-line) returns
            # Likely not needed if using processes with full Pythonic exception handling
            if fail_on_empty and /*Check if Empty*/:
                raise PermittedSocketError("Empty return detected.")
            if fail_on_zero and /*Check if Zero*/:
                raise PermittedSocketError("Zero return detected.")
            if /*Permitted Error Behavior*/:
                raise PermittedSocketError("[msg] ")
            if /*Non-Permitted Error Behavior*/:
                raise Exception
        # Exterior post query handling of permitted socket errors - Pythonic exceptions should be caught here
        except PermittedSocketError:
            # Permitted error post-process, such as reinitializing security protocols or invalidating metadata
            # Permitted errors are re-raised and handled within mquery_lite()
            raise

    class factory:
        def create(self):
            return generic_socket()
class pandas_socket(socket):
    socket_aliases = ["pandas","pd"]
    socket_type = "data"

    @property
    def result(self):
        return self.data_block

    def execute(self, query, fail_on_empty, fail_on_zero):
        self.data_block = None
        try:
            connection = pyodbc.connect(self.DSN, autocommit=True)
            self.data_block = pandas.read_sql(query, connection)
            connection.close()
            if fail_on_empty and self.data_block.dropna().empty:
                raise PermittedSocketError("Empty return detected.")
            if fail_on_zero and self.data_block.shape == (1,1) and int(float(self.data_block.iloc[0,0])) == 0:
                raise PermittedSocketError("Zero return detected.")
        except PermittedSocketError:
            raise

    class factory:
        def create(self):
            return pandas_socket()
class pyodbc_socket(socket):
    socket_aliases = ["pyodbc"]
    socket_type = "standard"

    @property
    def result(self):
        return self.data_block

    def execute(self, query, fail_on_empty, fail_on_zero):
        self.data_block = None
        try:
            connection = pyodbc.connect(self.DSN, autocommit=True)
            cursor = connection.cursor()
            cursor.execute(query)
            self.data_block = cursor.fetchall()
            cursor.close()
            connection.close()
            row = peek(iter(self.data_block))
            if fail_on_empty and not row:
                raise PermittedSocketError("Empty return detected.")
            # row is None when no rows came back, so guard before calling len()
            if fail_on_zero and row is not None and len(row) == 1 and peek(iter(row)) in (0, "0"):
                raise PermittedSocketError("Zero return detected.")
        except pyodbc.ProgrammingError:
            # Thrown when .fetchall() returns nothing
            self.data_block = [()]
            raise PermittedSocketError("Empty return detected.")
        except PermittedSocketError:
            raise

    class factory:
        def create(self):
            return pyodbc_socket()
###############################################################################
### mquery_lite() #############################################################
###############################################################################
class mquery_lite(object):
    def __new__(cls, query=None, query_attempts=8, socket_default=None, socket_precedence=[], fail_on_empty=False, fail_on_zero=False):
        # https://howto.lintel.in/python-__new__-magic-method-explained/
        if query is not None:
            # Functional use: build an instance, run the query, and return only the result set
            mquery_instance = super(mquery_lite, cls).__new__(cls)
            mquery_instance.__init__(query_attempts, socket_default, socket_precedence, fail_on_empty, fail_on_zero)
            mquery_instance.query(query)
            return mquery_instance.get_results()
        else:
            return super(mquery_lite, cls).__new__(cls)

    ### CTOR
    def __init__(self, query_attempts=8, socket_default=None, socket_precedence=[], fail_on_empty=False, fail_on_zero=False):
        ### Socket Library
        self.socket_library = {subclass.__name__: factory.create_object(subclass.__name__) for subclass in socket.__subclasses__()}
        # Every socket is addressable by its own class name
        self.socket_aliases = {name: [name] for name in self.socket_library}
        # Aliases are claimed on a first-come-first-serve basis
        for name in self.socket_library:
            for alias in self.socket_library[name].socket_aliases:
                self.socket_aliases.setdefault(alias, [name])
        # Each socket type expands to every socket of that type
        socket_types = {}
        for name in self.socket_library:
            socket_types.setdefault(self.socket_library[name].socket_type, []).append(name)
        self.socket_aliases.update(socket_types)
        self.socket_aliases.update({"all": list(self.socket_library)})
        self.query_attempts:int = query_attempts
        self.socket_default:str = socket_default
        if socket_default is None:
            self.socket_default = next(iter(self.socket_library))
        self.socket_precedence:list = socket_precedence
        if not socket_precedence:
            self.socket_precedence = list(self.socket_aliases[self.socket_library[self.socket_default].socket_type])
        self.fail_on_empty:bool = fail_on_empty
        self.fail_on_zero:bool = fail_on_zero

    def handshake(self):
        precedence_candidates = []
        for alias in self.socket_precedence:
            for socket in self.socket_aliases[alias]:
                try:
                    self.socket_library[socket].handshake()
                    precedence_candidates.append(socket)
                except PermittedSocketError:
                    continue
        if len(precedence_candidates) != 0:
            self.socket_precedence = precedence_candidates

    def get_results(self):
        return self.result_socket.result

    ### Query Execution
    def query(self, query, query_attempts=None, socket_precedence=[], fail_on_empty=None, fail_on_zero=None):
        # Overrides
        if query_attempts is None:
            query_attempts = self.query_attempts
        # Fall back to the instance precedence when no override is given
        candidate_precedence = socket_precedence if socket_precedence else self.socket_precedence
        # Expand aliases into socket names; unknown aliases resolve to the default socket
        socket_precedence = []
        for alias in candidate_precedence:
            socket_precedence.extend(self.socket_aliases.get(alias, [self.socket_default]))
        if fail_on_empty is None: fail_on_empty = self.fail_on_empty
        if fail_on_zero is None: fail_on_zero = self.fail_on_zero
        # Loop through socket precedence list
        for position, socket in enumerate(socket_precedence):
            try:
                # Loop through socket attempts on current socket
                for attempt_n in range(query_attempts):
                    try:
                        # Exponential timer; pauses 2^n seconds on the current socket
                        if attempt_n > 0:
                            print("Waiting " + str(2**attempt_n) + " seconds before reattempting...")
                            for k in range(2**attempt_n): time.sleep(1)
                        # Query attempt
                        self.socket_library[socket].execute(query, fail_on_empty, fail_on_zero)
                        self.result_socket = self.socket_library[socket]
                        return
                    except PermittedSocketError:
                        print('mquery() failed on socket "' + str(socket) + '".')
                        if attempt_n+1 == query_attempts:
                            raise
            except PermittedSocketError:
                # Compare by position so duplicate sockets in the list do not end the loop early
                if position == len(socket_precedence) - 1:
                    print("mquery() failed after trying all attempts on all sockets.")
                    raise
                print('mquery() failed after all attempts on socket "' + str(socket) + '"; moving to next socket.')
                continue
Questions are mostly along the lines of: We've tried to make this as "Pythonic" as possible - have we missed anything? Are there libraries that already perform this in a more efficient manner?
python object-oriented python-3.x sql database
add a comment |
I work on a small data team where we developed this tool when we began experiencing several job failures on our legacy big data systems. We have several data lakes, each with their own API, and until now had no contingencies for issues centering around query failures. Furthermore, our database sockets had no central repository for accessing our data. This tool brings all query platforms together in a (relatively) seamless experience by way of formulating a socket library and offers redundancy protocols when queries fail. This software is compatible with any SQL-similar interface that has an established Pythonic API, such as Impala, Hive, AWS, and terminal/command-line returns.
The usage is very straightforward:
# Upon instantiation:
# > mquery() imports all sockets and aliases via the factory
# > Default attempts on any socket is 8
# > Default socket is the first socket in the library
# > Default precedence is all sockets of the same type as the default socket
# > Default for fail_on_empty and fail_on_zero are both False
# Functional
a = mquery_lite().query(r"SELECT 'test;") # Returns the result set only; object instance is destructed on next mquery_lite() call
print(str(a.get_result()))
# Functional with overrides
a = mquery_lite().query(r"SELECT 'test;", query_attempts=2, fail_on_zero=True) # Returns the result set only, overriding the default query_attempts and fail_on_zero values
print(str(a.get_result()))
# Object Oriented with Overrides
a = mquery_lite(socket_precedence=["pd","data"]) # Precedence is set to both the pandas and all data sockets
a.query_attempts = 4
a.query(r"SELECT 'test';", query_attempts=2, socket_precedence=["pd"]) # Overrides on query submission
# query_attempts returns to 4 and socket_precedence returns to ["pd","data"]
print(str(a.get_result()))
# Object Oriented with Overrides and Handshake
a = mquery_lite(socket_precedence=["all","data"], query_attempts=4) # Precedence list will be the list of all sockets plus the additional list of all data sockets; there will be duplicates in the precedence list (not problematic)
a.handshake() # Test each socket in the current precedence list; retain only those that pass. Duplicates will persist.
a.query(r"SELECT 'test';", query_attempts=2)
# query_attempts returns to 4
print(str(a.get_result()))
Naturally, all proprietary information has been redacted from this post. I've also stripped most of the bells and whistles (e.g. typecasting property decorators) to demonstrate basic functionality and proof of concept in the code below; all pieces of this code can readily be found from open sources on the internet, but I haven't seen them tied together in this manner.
The mquery_lite()
object is the primary process of this code and can either be used functionally or instanced as an object. When called, mquery_lite()
determines if a query was provided - if so, it will instance itself, perform the query, then return the result pointer from the successful socket. If a query is not passed, mquery_lite()
remains instanced and user-modified settings are retained.
Sockets are imported by way of a generator-encapsulated factory. Their aliases are mapped in a separate library for ease of use when calling sockets. Sockets are separated by type, defined in the socket itself (we prefer to group by the expected output of the socket as this ensures consistent output on query failure; e.g. data frame, list of lists, generator, etc.). Sockets retain the query results until a new query is submitted.
The socket and alias libraries are automatically built on instantiation, based on the order in which they are present in the script. Collisions are rectified on a first-come-first-serve basis. The following object variables are created on instantiation:
query_attempts
(default 8) is the number of attemptsmquery_lite()
will make on a socket before moving to the next socket. An exponential timer (2^n) sets the pause between repeat queries on a socket.
socket_default
(defaultNone
) is the socket that will be substituted in the precedence list when an unknown alias is provided. Will default to the first socket in the library ifNone
is detected.
socket_precedence
(default) is the order in which sockets will be attempted. Will default to all sockets of the same type as the default socket in the library if
None
is detected.
fail_on_empty
(defaultFalse
) indicates if a query should raise an exception if it comes back empty (useful for command queries).
fail_on_zero
(defaultFalse
) indicates if a query should raise an exception if it comes back zero (useful for counts).
Results remain and failures occur at the socket level. Handling of permitted errors (raised from sockets) occurs in the .query()
method.
import pandas
import pyodbc
import time
def peek(x):
try:
return next(x)
except StopIteration:
return None
###############################################################################
### Dynamic Polymorphic Socket Factory ########################################
###############################################################################
class PermittedSocketError(Exception):
"""
A socket error that should trigger a retry, but not a program termination.
"""
pass
class socket:
DSN = "DSN=Your.DSN.Info.Here;" # Used in pyodbc and pandas sockets
def handshake(self, query="SELECT 'test';"):
self.execute(query, fail_on_empty=False, fail_on_zero=False)
# Dynamic socket factory
# https://python-3-patterns-idioms-test.readthedocs.io/en/latest/Factory.html
class factory:
objects = {}
def add_factory(id, factory):
factory.factories.put[id] = factory
add_factory = staticmethod(add_factory)
def create_object(id):
if id not in factory.objects: # updated for Python 3
factory.objects[id] = eval(id + ".factory()")
return factory.objects[id].create()
create_object = staticmethod(create_object)
###############################################################################
### Socket Library ############################################################
###############################################################################
class generic_socket(socket):
socket_aliases = ["alias_1", "alias_2"]
socket_type = "type"
@property
def result(self):
# Any type of return handling can go here (such as a generator to improve post-parsing)
return self.__data_block
def execute(self, query, fail_on_empty, fail_on_zero):
# Set up for query
self.__data_block = None
try:
# Execute query
# Internal post query handling of error codes should raise exceptions here - useful for non-Pythonic (e.g. command-line) returns
# Likely not needed if using processes with full Pythonic exception handling
if /*Permitted Error Behavior*/:
raise PermittedSocketError("[msg] ")
else:
raise
if fail_on_empty and /*Check if Empty*/:
raise PermittedSocketError("Empty return detected.")
if fail_on_zero and /*Check if Zero*/:
raise PermittedSocketError("Zero return detected.")
# Edit: The if-else statements above were note syntactically valid and should be changed to:
# if fail_on_empty and /*Check if Empty*/:
# raise PermittedSocketError("Empty return detected.")
# if fail_on_zero and /*Check if Zero*/:
# raise PermittedSocketError("Zero return detected.")
# if /*Permitted Error Behavior*/:
# raise PermittedSocketError("[msg] ")
# if /*Non-Permitted Error Behavior*/:
# raise Exception
# Exterior post query handling of permitted socket errors - Pythonic exceptions should be caught here
except PermittedSocketError:
# Permitted error post-process, such as reinitializing security protocols or invalidating metadata
# Permitted errors are re-raised and handled within mquery_lite()
raise
class factory:
def create(self):
return generic_socket()
class pandas_socket(socket):
socket_aliases = ["pandas","pd"]
socket_type = "data"
@property
def result(self):
return self.data_block
def execute(self, query, fail_on_empty, fail_on_zero):
self.data_block = None
try:
connection = pyodbc.connect(self.DSN, autocommit=True)
self.data_block = pandas.read_sql(query, connection)
connection.close()
if fail_on_empty and self.data_block.dropna().empty:
raise PermittedSocketError("Empty return detected.")
if fail_on_zero and self.data_block.shape == (1,1) and int(float(self.data_block.iloc[0,0])) == 0:
raise PermittedSocketError("Zero return detected.")
except PermittedSocketError:
raise
class factory:
def create(self):
return pandas_socket()
class pyodbc_socket(socket):
socket_aliases = ["pyodbc"]
socket_type = "standard"
@property
def result(self):
return self.data_block
def execute(self, query, fail_on_empty, fail_on_zero):
self.data_block = None
try:
connection = pyodbc.connect(self.DSN, autocommit=True)
cursor = connection.cursor()
cursor.execute(query)
self.data_block = cursor.fetchall()
cursor.close()
connection.close()
row = peek(iter(self.data_block))
if fail_on_empty and not row:
raise PermittedSocketError("Empty return detected.")
if fail_on_zero and len(row) == 1 and peek(iter(row)) in (0, "0"):
raise PermittedSocketError("Zero return detected.")
except pyodbc.ProgrammingError:
# Thrown when .fetchall() returns nothing
self.__data_block = [()]
raise PermittedSocketError("Empty return detected.")
except PermittedSocketError:
raise
class factory:
def create(self):
return pyodbc_socket()
###############################################################################
### mquery_lite() #############################################################
###############################################################################
class mquery_lite(object):
def __new__(cls, query=None, query_attempts=8, socket_default=None, socket_precedence=, fail_on_empty=False, fail_on_zero=False):
# https://howto.lintel.in/python-__new__-magic-method-explained/
if query is not None:
mquery_instance = super(mquery_lite, cls).__new__(cls)
mquery_instance.__init__(query_attempts, socket_default, socket_precedence, fail_on_empty, fail_on_zero)
mquery_instance.query(query)
return mquery_instance.get_results()
else:
return super(mquery_lite, cls).__new__(cls)
### CTOR
def __init__(self, query_attempts=8, socket_default=None, socket_precedence=, fail_on_empty=False, fail_on_zero=False):
### Socket Library
self.socket_library = {socket.__name__:factory.create_object(socket.__name__) for socket in socket.__subclasses__()}
self.socket_aliases = ({socket:[socket] for socket in self.socket_library})
self.socket_aliases.update({alias:[socket] for socket in self.socket_library for alias in self.socket_library[socket].socket_aliases if alias not in self.socket_aliases})
self.socket_aliases.update({socket_type:[socket for socket in self.socket_library if self.socket_library[socket].socket_type == socket_type] for socket_type in {self.socket_library[unique_socket_type].socket_type for unique_socket_type in self.socket_library}})
self.socket_aliases.update({"all":[socket for socket in self.socket_library]})
self.query_attempts:int = query_attempts
self.socket_default:str = socket_default
if socket_default is None:
self.socket_default = next(iter(self.socket_library))
self.socket_precedence = socket_precedence
if socket_precedence == :
self.socket_precedence:list = self.socket_aliases[self.socket_library[self.socket_default].socket_type]
self.fail_on_empty:bool = fail_on_empty
self.fail_on_zero:bool = fail_on_empty
def handshake(self):
precedence_candidates =
for alias in self.socket_precedence:
for socket in self.socket_aliases[alias]:
try:
self.socket_library[socket].handshake()
precedence_candidates.append(socket)
except PermittedSocketError:
continue
if len(precedence_candidates) != 0:
self.socket_precedence = precedence_candidates
def get_results(self):
return self.result_socket.result
### Query Execution
def query(self, query, query_attempts=None, socket_precedence=, fail_on_empty=None, fail_on_zero=None):
# Overrides
if query_attempts is None:
query_attempts = self.query_attempts
if socket_precedence==:
for i in self.socket_precedence:
for j in self.socket_aliases[i]:
if j in self.socket_library:
socket_precedence.append(j)
else:
socket_precedence.append(self.default_socket)
else:
candidate_precedence = socket_precedence[:]
socket_precedence =
for i in candidate_precedence:
for j in self.socket_aliases[i]:
if j in self.socket_library:
socket_precedence.append(j)
else:
socket_precedence.append(self.default_socket)
if fail_on_empty is None: fail_on_empty = self.fail_on_empty
if fail_on_zero is None: fail_on_empty = self.fail_on_zero
# Loop through socket precedence list
for socket in socket_precedence:
try:
# Loop through socket attempts on current socket
for attempt_n in range(query_attempts):
try:
# Exponential timer; pauses 2^n seconds on the current socket
if attempt_n > 0:
print("Waiting " + str(2**attempt_n) + " seconds before reattempting...")
for k in range(2**attempt_n): time.sleep(1)
# Query attempt
self.socket_library[socket].execute(query, fail_on_empty, fail_on_zero)
self.result_socket = self.socket_library[socket]
return
except PermittedSocketError:
print("mquery() failed on socket "" + str(socket) + "".")
if attempt_n+1 == query_attempts:
raise
pass
except PermittedSocketError:
if socket == socket_precedence[-1]:
print("mquery() failed after trying all attempts on all sockets.")
raise
print("mquery() failed after all attempts on socket "" + str(socket) + ""; moving to next socket.")
continue
Questions are mostly along the lines of: We've tried to make this as "Pythonic" as possible - have we missed anything? Are there libraries that already perform this in a more efficient manner?
python object-oriented python-3.x sql database
add a comment |
I work on a small data team where we developed this tool when we began experiencing several job failures on our legacy big data systems. We have several data lakes, each with their own API, and until now had no contingencies for issues centering around query failures. Furthermore, our database sockets had no central repository for accessing our data. This tool brings all query platforms together in a (relatively) seamless experience by way of formulating a socket library and offers redundancy protocols when queries fail. This software is compatible with any SQL-similar interface that has an established Pythonic API, such as Impala, Hive, AWS, and terminal/command-line returns.
The usage is very straightforward:
# Upon instantiation:
# > mquery() imports all sockets and aliases via the factory
# > Default attempts on any socket is 8
# > Default socket is the first socket in the library
# > Default precedence is all sockets of the same type as the default socket
# > Default for fail_on_empty and fail_on_zero are both False
# Functional
a = mquery_lite().query(r"SELECT 'test;") # Returns the result set only; object instance is destructed on next mquery_lite() call
print(str(a.get_result()))
# Functional with overrides
a = mquery_lite().query(r"SELECT 'test;", query_attempts=2, fail_on_zero=True) # Returns the result set only, overriding the default query_attempts and fail_on_zero values
print(str(a.get_result()))
# Object Oriented with Overrides
a = mquery_lite(socket_precedence=["pd","data"]) # Precedence is set to both the pandas and all data sockets
a.query_attempts = 4
a.query(r"SELECT 'test';", query_attempts=2, socket_precedence=["pd"]) # Overrides on query submission
# query_attempts returns to 4 and socket_precedence returns to ["pd","data"]
print(str(a.get_result()))
# Object Oriented with Overrides and Handshake
a = mquery_lite(socket_precedence=["all","data"], query_attempts=4) # Precedence list will be the list of all sockets plus the additional list of all data sockets; there will be duplicates in the precedence list (not problematic)
a.handshake() # Test each socket in the current precedence list; retain only those that pass. Duplicates will persist.
a.query(r"SELECT 'test';", query_attempts=2)
# query_attempts returns to 4
print(str(a.get_result()))
Naturally, all proprietary information has been redacted from this post. I've also stripped most of the bells and whistles (e.g. typecasting property decorators) to demonstrate basic functionality and proof of concept in the code below; all pieces of this code can readily be found from open sources on the internet, but I haven't seen them tied together in this manner.
The mquery_lite()
object is the primary process of this code and can either be used functionally or instanced as an object. When called, mquery_lite()
determines if a query was provided - if so, it will instance itself, perform the query, then return the result pointer from the successful socket. If a query is not passed, mquery_lite()
remains instanced and user-modified settings are retained.
Sockets are imported by way of a generator-encapsulated factory. Their aliases are mapped in a separate library for ease of use when calling sockets. Sockets are separated by type, defined in the socket itself (we prefer to group by the expected output of the socket as this ensures consistent output on query failure; e.g. data frame, list of lists, generator, etc.). Sockets retain the query results until a new query is submitted.
The socket and alias libraries are automatically built on instantiation, based on the order in which they are present in the script. Collisions are rectified on a first-come-first-serve basis. The following object variables are created on instantiation:
query_attempts
(default 8) is the number of attemptsmquery_lite()
will make on a socket before moving to the next socket. An exponential timer (2^n) sets the pause between repeat queries on a socket.
socket_default
(defaultNone
) is the socket that will be substituted in the precedence list when an unknown alias is provided. Will default to the first socket in the library ifNone
is detected.
socket_precedence
(default) is the order in which sockets will be attempted. Will default to all sockets of the same type as the default socket in the library if
None
is detected.
fail_on_empty
(defaultFalse
) indicates if a query should raise an exception if it comes back empty (useful for command queries).
fail_on_zero
(defaultFalse
) indicates if a query should raise an exception if it comes back zero (useful for counts).
Results remain and failures occur at the socket level. Handling of permitted errors (raised from sockets) occurs in the .query()
method.
import pandas
import pyodbc
import time
def peek(x):
try:
return next(x)
except StopIteration:
return None
###############################################################################
### Dynamic Polymorphic Socket Factory ########################################
###############################################################################
class PermittedSocketError(Exception):
    """
    A socket error that should trigger a retry, but not a program termination.
    """
    pass
class socket:
    DSN = "DSN=Your.DSN.Info.Here;" # Used in pyodbc and pandas sockets
    def handshake(self, query="SELECT 'test';"):
        self.execute(query, fail_on_empty=False, fail_on_zero=False)
# Dynamic socket factory
# https://python-3-patterns-idioms-test.readthedocs.io/en/latest/Factory.html
class factory:
    objects = {}
    def add_factory(id, factory):
        factory.factories.put[id] = factory
    add_factory = staticmethod(add_factory)
    def create_object(id):
        if id not in factory.objects: # updated for Python 3
            factory.objects[id] = eval(id + ".factory()")
        return factory.objects[id].create()
    create_object = staticmethod(create_object)
###############################################################################
### Socket Library ############################################################
###############################################################################
class generic_socket(socket):
    socket_aliases = ["alias_1", "alias_2"]
    socket_type = "type"
    @property
    def result(self):
        # Any type of return handling can go here (such as a generator to improve post-parsing)
        return self.__data_block
    def execute(self, query, fail_on_empty, fail_on_zero):
        # Set up for query
        self.__data_block = None
        try:
            # Execute query
            # Internal post query handling of error codes should raise exceptions here - useful for non-Pythonic (e.g. command-line) returns
            # Likely not needed if using processes with full Pythonic exception handling
            if /*Permitted Error Behavior*/:
                raise PermittedSocketError("[msg] ")
            else:
                raise
            if fail_on_empty and /*Check if Empty*/:
                raise PermittedSocketError("Empty return detected.")
            if fail_on_zero and /*Check if Zero*/:
                raise PermittedSocketError("Zero return detected.")
            # Edit: The if-else statements above were not syntactically valid and should be changed to:
            # if fail_on_empty and /*Check if Empty*/:
            #     raise PermittedSocketError("Empty return detected.")
            # if fail_on_zero and /*Check if Zero*/:
            #     raise PermittedSocketError("Zero return detected.")
            # if /*Permitted Error Behavior*/:
            #     raise PermittedSocketError("[msg] ")
            # if /*Non-Permitted Error Behavior*/:
            #     raise Exception
        # Exterior post query handling of permitted socket errors - Pythonic exceptions should be caught here
        except PermittedSocketError:
            # Permitted error post-process, such as reinitializing security protocols or invalidating metadata
            # Permitted errors are re-raised and handled within mquery_lite()
            raise
    class factory:
        def create(self):
            return generic_socket()
class pandas_socket(socket):
    socket_aliases = ["pandas","pd"]
    socket_type = "data"
    @property
    def result(self):
        return self.data_block
    def execute(self, query, fail_on_empty, fail_on_zero):
        self.data_block = None
        try:
            connection = pyodbc.connect(self.DSN, autocommit=True)
            self.data_block = pandas.read_sql(query, connection)
            connection.close()
            if fail_on_empty and self.data_block.dropna().empty:
                raise PermittedSocketError("Empty return detected.")
            if fail_on_zero and self.data_block.shape == (1,1) and int(float(self.data_block.iloc[0,0])) == 0:
                raise PermittedSocketError("Zero return detected.")
        except PermittedSocketError:
            raise
    class factory:
        def create(self):
            return pandas_socket()
class pyodbc_socket(socket):
    socket_aliases = ["pyodbc"]
    socket_type = "standard"
    @property
    def result(self):
        return self.data_block
    def execute(self, query, fail_on_empty, fail_on_zero):
        self.data_block = None
        try:
            connection = pyodbc.connect(self.DSN, autocommit=True)
            cursor = connection.cursor()
            cursor.execute(query)
            self.data_block = cursor.fetchall()
            cursor.close()
            connection.close()
            row = peek(iter(self.data_block))
            if fail_on_empty and not row:
                raise PermittedSocketError("Empty return detected.")
            if fail_on_zero and len(row) == 1 and peek(iter(row)) in (0, "0"):
                raise PermittedSocketError("Zero return detected.")
        except pyodbc.ProgrammingError:
            # Thrown when .fetchall() returns nothing
            self.__data_block = [()]
            raise PermittedSocketError("Empty return detected.")
        except PermittedSocketError:
            raise
    class factory:
        def create(self):
            return pyodbc_socket()
###############################################################################
### mquery_lite() #############################################################
###############################################################################
class mquery_lite(object):
    def __new__(cls, query=None, query_attempts=8, socket_default=None, socket_precedence=[], fail_on_empty=False, fail_on_zero=False):
        # https://howto.lintel.in/python-__new__-magic-method-explained/
        if query is not None:
            mquery_instance = super(mquery_lite, cls).__new__(cls)
            mquery_instance.__init__(query_attempts, socket_default, socket_precedence, fail_on_empty, fail_on_zero)
            mquery_instance.query(query)
            return mquery_instance.get_results()
        else:
            return super(mquery_lite, cls).__new__(cls)
    ### CTOR
    def __init__(self, query_attempts=8, socket_default=None, socket_precedence=[], fail_on_empty=False, fail_on_zero=False):
        ### Socket Library
        self.socket_library = {socket.__name__:factory.create_object(socket.__name__) for socket in socket.__subclasses__()}
        self.socket_aliases = ({socket:[socket] for socket in self.socket_library})
        self.socket_aliases.update({alias:[socket] for socket in self.socket_library for alias in self.socket_library[socket].socket_aliases if alias not in self.socket_aliases})
        self.socket_aliases.update({socket_type:[socket for socket in self.socket_library if self.socket_library[socket].socket_type == socket_type] for socket_type in {self.socket_library[unique_socket_type].socket_type for unique_socket_type in self.socket_library}})
        self.socket_aliases.update({"all":[socket for socket in self.socket_library]})
        self.query_attempts:int = query_attempts
        self.socket_default:str = socket_default
        if socket_default is None:
            self.socket_default = next(iter(self.socket_library))
        self.socket_precedence = socket_precedence
        if socket_precedence == []:
            self.socket_precedence:list = self.socket_aliases[self.socket_library[self.socket_default].socket_type]
        self.fail_on_empty:bool = fail_on_empty
        self.fail_on_zero:bool = fail_on_empty
    def handshake(self):
        precedence_candidates = []
        for alias in self.socket_precedence:
            for socket in self.socket_aliases[alias]:
                try:
                    self.socket_library[socket].handshake()
                    precedence_candidates.append(socket)
                except PermittedSocketError:
                    continue
        if len(precedence_candidates) != 0:
            self.socket_precedence = precedence_candidates
    def get_results(self):
        return self.result_socket.result
    ### Query Execution
    def query(self, query, query_attempts=None, socket_precedence=[], fail_on_empty=None, fail_on_zero=None):
        # Overrides
        if query_attempts is None:
            query_attempts = self.query_attempts
        if socket_precedence==[]:
            for i in self.socket_precedence:
                for j in self.socket_aliases[i]:
                    if j in self.socket_library:
                        socket_precedence.append(j)
                    else:
                        socket_precedence.append(self.default_socket)
        else:
            candidate_precedence = socket_precedence[:]
            socket_precedence = []
            for i in candidate_precedence:
                for j in self.socket_aliases[i]:
                    if j in self.socket_library:
                        socket_precedence.append(j)
                    else:
                        socket_precedence.append(self.default_socket)
        if fail_on_empty is None: fail_on_empty = self.fail_on_empty
        if fail_on_zero is None: fail_on_empty = self.fail_on_zero
        # Loop through socket precedence list
        for socket in socket_precedence:
            try:
                # Loop through socket attempts on current socket
                for attempt_n in range(query_attempts):
                    try:
                        # Exponential timer; pauses 2^n seconds on the current socket
                        if attempt_n > 0:
                            print("Waiting " + str(2**attempt_n) + " seconds before reattempting...")
                            for k in range(2**attempt_n): time.sleep(1)
                        # Query attempt
                        self.socket_library[socket].execute(query, fail_on_empty, fail_on_zero)
                        self.result_socket = self.socket_library[socket]
                        return
                    except PermittedSocketError:
                        print("mquery() failed on socket \"" + str(socket) + "\".")
                        if attempt_n+1 == query_attempts:
                            raise
                        pass
            except PermittedSocketError:
                if socket == socket_precedence[-1]:
                    print("mquery() failed after trying all attempts on all sockets.")
                    raise
                print("mquery() failed after all attempts on socket \"" + str(socket) + "\"; moving to next socket.")
                continue
Questions are mostly along the lines of: We've tried to make this as "Pythonic" as possible - have we missed anything? Are there libraries that already perform this in a more efficient manner?
python object-oriented python-3.x sql database
edited yesterday
asked Dec 26 at 17:51
Miller
313212
1 Answer
By PEP8, your classes sockets, factory, etc. should be capitalized. You also need newlines between your class methods. These can all be fixed fairly easily with the use of a stand-alone or IDE-builtin linter.
This comment:
# Dynamic socket factory
# https://python-3-patterns-idioms-test.readthedocs.io/en/latest/Factory.html
should be moved to a docstring on the inside of the class:
class Factory:
    """Dynamic socket factory
    https://python-3-patterns-idioms-test.readthedocs.io/en/latest/Factory.html
    """
So far as I can tell, socket is an abstract base class. You can pull in abc if you want to, but at the very least, you should explicitly show execute as being a "pure virtual" (in C++ parlance) method:
def execute(self, query, fail_on_empty, fail_on_zero):
    raise NotImplementedError()
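For the abc route mentioned above, a minimal sketch might look like the following. The class names `Socket` and `EchoSocket` are hypothetical (`EchoSocket` exists only to show a child that implements the method); the `handshake` body mirrors the one in the question.

```python
from abc import ABC, abstractmethod


class Socket(ABC):
    """Abstract base: children must implement execute()."""

    @abstractmethod
    def execute(self, query, fail_on_empty, fail_on_zero):
        """Run the query and store its result on the socket."""

    def handshake(self, query="SELECT 'test';"):
        self.execute(query, fail_on_empty=False, fail_on_zero=False)


class EchoSocket(Socket):
    """Hypothetical child used only for illustration."""

    def execute(self, query, fail_on_empty, fail_on_zero):
        self.result = query
```

The practical difference from raising NotImplementedError is timing: with abc, instantiating a class that has not overridden execute fails with a TypeError at construction, rather than later when the method is first called.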
This code:
if /*Permitted Error Behavior*/:
    raise PermittedSocketError("[msg] ")
else:
    raise
can lose the else, because the previous block has already raised.
I'm not sure what's happened here - whether it's a redaction, or what - but it doesn't look syntactically valid:
if fail_on_empty and /*Check if Empty*/:
    raise PermittedSocketError("Empty return detected.")
if fail_on_zero and /*Check if Zero*/:
    raise PermittedSocketError("Zero return detected.")
Your except PermittedSocketError: and its accompanying comment Permitted errors are re-raised are a little odd. If the error is permitted, and you're doing nothing but re-raising, why have the try block in the first place?
This:
else:
    return super(mquery_lite, cls).__new__(cls)
doesn't need an else, for the same reason as that raise I've described above.
The series of list comprehensions seen after ### Socket Library really needs to have most or all of those broken up onto multiple lines, especially when you have multiple for.
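As one possible way of breaking those up, the alias mapping could be built with explicit loops. This is only a sketch under assumptions: `build_aliases` is a hypothetical helper, and SimpleNamespace objects stand in for real socket instances. It also uses setdefault so that the first socket to claim an alias keeps it, which matches the first-come behaviour the question describes; a single dict comprehension would instead let a later socket overwrite an alias shared with an earlier one.

```python
from types import SimpleNamespace


def build_aliases(socket_library):
    """Map socket names, aliases, type names and "all" to lists of socket names."""
    # Every socket is reachable under its own name.
    aliases = {name: [name] for name in socket_library}

    # Declared aliases: the first socket to claim an alias keeps it.
    for name, sock in socket_library.items():
        for alias in sock.socket_aliases:
            aliases.setdefault(alias, [name])

    # Type names map to every socket of that type, in library order.
    by_type = {}
    for name, sock in socket_library.items():
        by_type.setdefault(sock.socket_type, []).append(name)
    aliases.update(by_type)

    # "all" maps to every socket in the library.
    aliases["all"] = list(socket_library)
    return aliases


# Stand-in library using the alias and type values from the question.
library = {
    "pandas_socket": SimpleNamespace(socket_aliases=["pandas", "pd"], socket_type="data"),
    "pyodbc_socket": SimpleNamespace(socket_aliases=["pyodbc"], socket_type="standard"),
}
aliases = build_aliases(library)
```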
self.socket_library is a dictionary, but I'm unclear on your usage. You have this:
if j in self.socket_library:
    socket_precedence.append(j)
else:
    socket_precedence.append(self.default_socket)
Is your intention to look through the keys of socket_library, ignore the values, and add present keys to socket_precedence? If you want to use its values, this needs to change.
Thanks for the response @Reinderien. I'm unsure about post etiquette here, but this is likely going to be a couple of comments. Regarding PEP 8: done - implementing in the next version. Docstrings: the original code is docstring'd to the nines; it defo got a little messy when we truncated the code. Both if-else blocks you mentioned were indeed victims of redaction; I've added a comment block in the code above to show the correct syntax as I didn't want to change the original post. Awesome info regarding the __new__() function as well - I didn't know it could just be left alone.
– Miller
yesterday
To answer your question regarding PermittedSocketError and re-raising: There are some cases when certain code is useful immediately after a permitted query failure (e.g. REFRESH statement in Impala or Kerberos re-authentication). Otherwise, yes, the try block within the sockets serves little purpose and the error should just be raised back into mquery_lite() for handling. Regarding socket_precedence: Indeed, the intention was to capture the keys of the sockets only so that the precedence list can be examined later if needed; otherwise you just get a list of pointers to the sockets.
– Miller
yesterday
The one element of your response I'm struggling on is the "pure virtual" reference. Does this get defined outside the socket classes just to raise NotImplementedError when someone tries to execute without a socket reference? Thanks again for the thorough review.
– Miller
yesterday
You define it in the parent, raising NotImplementedError, and in children, where it does not raise. This does a few things - makes it obvious that it's expected to be implemented in children; and acts as a more explicit failure if you accidentally call it without having defined it in a child.
– Reinderien
yesterday
By PEP8, your classes sockets
, factory
, etc. should be capitalized. You also need newlines between your class methods. These can all be fixed fairly easily with the use of a stand-alone or IDE-builtin linter.
This comment:
# Dynamic socket factory
# https://python-3-patterns-idioms-test.readthedocs.io/en/latest/Factory.html
should be moved to a docstring on the inside of the class:
class Factory:
"""Dynamic socket factory
https://python-3-patterns-idioms-test.readthedocs.io/en/latest/Factory.html
"""
So far as I can tell, socket
is an abstract base class. You can pull in abc
if you want to, but at the very least, you should explicitly show execute
as being a "pure virtual" (in C++ parlance) method:
def execute(self, query, fail_on_empty, fail_on_zero):
raise NotImplementedError()
This code:
if /*Permitted Error Behavior*/:
raise PermittedSocketError("[msg] ")
else:
raise
can lose the else
, because the previous block has already raise
d.
I'm not sure what's happened here - whether it's a redaction, or what - but it doesn't look syntactically valid:
if fail_on_empty and /*Check if Empty*/:
raise PermittedSocketError("Empty return detected.")
if fail_on_zero and /*Check if Zero*/:
raise PermittedSocketError("Zero return detected.")
Your except PermittedSocketError:
and its accompanying comment Permitted errors are re-raised
are a little odd. If the error is permitted, and you're doing nothing but re-raising, why have the try
block in the first place?
This:
else:
return super(mquery_lite, cls).__new__(cls)
doesn't need an else
, for the same reason as that raise
I've described above.
The series of list comprehensions seen after ### Socket Library
really needs to have most or all of those broken up onto multiple lines, especially when you have multiple for
.
self.socket_library
is a dictionary, but I'm unclear on your usage. You have this:
if j in self.socket_library:
socket_precedence.append(j)
else:
socket_precedence.append(self.default_socket)
Is your intention to look through the keys of socket_library
, ignore the values, and add present keys to socket_precedence
? If you want to use its values, this needs to change.
Thanks for the response @Reinderien. I'm unsure about post etiquette here, but this is likely going to be a couple of comments. Regarding PEP 8: done - implementing in the next version. Docstrings: the original code is docstring'd to the nines; it defo got a little messy when we truncated the code. Both if-else blocks you mentioned were indeed victims of redaction; I've added a comment block in the code above to show the correct syntax as I didn't want to change the original post. Awesome info regarding the __new__() function as well - I didn't know it could just be left alone.
– Miller
yesterday
To answer your question regardingPermittedSocketError
and re-raising: There are some cases when certain code is useful immediately after a permitted query failure (e.g. REFRESH statement in Impala or Kerberos re-authentication). Otherwise, yes, the try block within the sockets serve little purpose and should just be raised back into mquery_lite() for handling. Regardingsocket_precedence
: Indeed, the intention was to capture the keys of the sockets only so that the precedence list can be examined later if needed; otherwise you just get a list of pointers to the sockets.
– Miller
yesterday
The one element of your response I'm struggling with is the "pure virtual" reference. Does this get defined outside the socket classes just to raise NotImplementedError when someone tries to execute without a socket reference? Thanks again for the thorough review.
– Miller
yesterday
You define it in the parent, raising NotImplementedError, and in children, where it does not raise. This does a few things - makes it obvious that it's expected to be implemented in children; and acts as a more explicit failure if you accidentally call it without having defined it in a child.
– Reinderien
yesterday
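The parent/child arrangement described above might look like the sketch below. The class names Socket, EchoSocket, and AbstractSocket are assumptions for illustration; the execute signature is the one quoted in the answer, and the abc variant is the optional alternative the answer mentions.

```python
from abc import ABC, abstractmethod


class Socket:
    """Parent: execute is declared but deliberately left unimplemented."""

    def execute(self, query, fail_on_empty, fail_on_zero):
        raise NotImplementedError()


class EchoSocket(Socket):
    """Hypothetical child: overrides execute, so it does not raise."""

    def execute(self, query, fail_on_empty, fail_on_zero):
        return query


class AbstractSocket(ABC):
    """Alternative using abc: instantiating the parent itself is refused."""

    @abstractmethod
    def execute(self, query, fail_on_empty, fail_on_zero):
        ...
```

With the plain version the failure shows up when execute is called on a class that never defined it; with the abc version it shows up earlier, when an incomplete class is instantiated.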
answered 2 days ago
Reinderien