Compare two large data sources with Python 3 [on hold]

I have 11 large text files with raw data on an FTP server. The lines look like this:

2016-04-10 00:00:00| 000102840111|4987043|4845485

2018-04-10 00:00:00| 000102840687|4987043|4845485

2018-04-10 00:00:00| 000102840687|4987043|4845485

2018-04-10 00:00:00| 000102840687|4987043|4845485
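A minimal sketch of how one such line could be parsed in Python 3, assuming the first column is always a timestamp in the format shown and the remaining columns are opaque strings. The names `Record` and `parse_line` are hypothetical, and the meaning of the trailing columns is not stated in the question, so they are kept as plain strings (which also preserves leading zeros).

```python
from datetime import datetime
from typing import NamedTuple, Tuple


class Record(NamedTuple):
    timestamp: datetime
    fields: Tuple[str, ...]  # remaining columns, whitespace-stripped, meaning unknown


def parse_line(line: str) -> Record:
    """Split one pipe-delimited line into a timestamp and the other columns."""
    parts = [part.strip() for part in line.rstrip("\n").split("|")]
    timestamp = datetime.strptime(parts[0], "%Y-%m-%d %H:%M:%S")
    return Record(timestamp, tuple(parts[1:]))


sample = "2016-04-10 00:00:00| 000102840111|4987043|4845485"
record = parse_line(sample)
print(record)
```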

I also have a second source in Google BigQuery (GBQ), made up of 11 tables.

I need to build a process that compares the two sources (the data on the FTP server and the data in GBQ) to find new records. The process needs to run daily, because the files on the FTP server are updated every morning. This is the approach I have used so far:

I export all the data (11 tables) from GBQ, download it to my local machine, and load it into MySQL as historical data.

After that, I read the data from the FTP server and compare it with the historical data to find the new records (using a simple query).

I send the new records to GBQ and also insert them into the historical data.
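The comparison step above can be sketched in plain Python as a set-membership check on hashed rows. This is only an illustration under stated assumptions, not the MySQL query actually used: it assumes a record counts as "new" when the whole normalized line has not been seen before, that repeated lines within one file count as a single record, and that the hashed keys of the historical data fit in memory. The names `row_key` and `find_new_records` are hypothetical.

```python
import hashlib
from typing import Iterable, Iterator, Set


def row_key(line: str) -> bytes:
    """Hash a line after stripping whitespace around each column."""
    normalized = "|".join(part.strip() for part in line.strip().split("|"))
    return hashlib.sha1(normalized.encode("utf-8")).digest()


def find_new_records(incoming: Iterable[str], seen: Set[bytes]) -> Iterator[str]:
    """Yield each incoming line whose key is not yet in `seen`.

    `seen` is updated in place, so it can afterwards serve as the new
    historical key set.
    """
    for line in incoming:
        if not line.strip():
            continue  # skip blank lines
        key = row_key(line)
        if key not in seen:
            seen.add(key)
            yield line.strip()


# Assumption: in practice these lines would be streamed from the historical
# store and from the downloaded FTP files rather than hard-coded.
historical = ["2016-04-10 00:00:00| 000102840111|4987043|4845485"]
todays_file = [
    "2016-04-10 00:00:00| 000102840111|4987043|4845485",
    "2018-04-10 00:00:00| 000102840687|4987043|4845485",
    "2018-04-10 00:00:00| 000102840687|4987043|4845485",
]

seen_keys = {row_key(line) for line in historical}
new_rows = list(find_new_records(todays_file, seen_keys))
print(new_rows)
```

Because `find_new_records` is a generator, the incoming file can be read line by line without loading it fully. Only the 20-byte digests are held in memory.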

Is this the best solution? Could anyone recommend another approach?

asked Dec 26 at 8:01

Han Van Pham

254

put on hold as off-topic by Jamal♦ Dec 26 at 8:08

This question appears to be off-topic. The users who voted to close gave this specific reason:

"Code not implemented or not working as intended: Code Review is a community where programmers peer-review your working code to address issues such as security, maintainability, performance, and scalability. We require that the code be working correctly, to the best of the author's knowledge, before proceeding with a review." – Jamal

If this question can be reworded to fit the rules in the help center, please edit the question.

python-3.x
