Data Compare Tool using Pandas, Flask with MongoDB and Docker + AWS ECR — Part 3

Subham Kumar Sahoo
5 min read · Jan 30, 2023

Data comparison between MySQL tables using Python Pandas and Flask. MongoDB for logging the statistics and CI/CD using Docker with AWS ECR.

Architecture Diagram

Previous part : https://medium.com/@subham-sahoo/data-compare-tool-using-pandas-flask-with-mongodb-and-docker-aws-ecr-part-2-a673daf80cd0

Let’s run the application!!

Disclaimer: If you get stuck, do not get demotivated and leave things halfway. Believe me, the answers to your problems are usually just one internet search away (maybe more than one 😉).

GitHub repository : https://github.com/sksgit7/Data-compare-docker

Step 5 : Build images and run containers

Once all the code is in place, we will use the docker-compose command to build the images and run them as containers.

Open the Docker Desktop application first (it might take a few minutes to start). Then open a terminal, go to the db-compare directory that contains the compose.yaml file, and run:

docker-compose -f compose.yaml up
  • Creates docker network “db-compare_default”.
  • Builds db-compare image for Flask app.
  • Pulls the mongo image from DockerHub if it is not already present locally.
  • Builds mongo-express-db-compare image for mongo-express.
  • Starts containers for db-compare-app and mongodb.
  • Waits for the mongodb container to be up and running, then starts the mongo-express container.
  • Also attaches the named volume (mongo-data) and host volume (tgt-docker).

We can verify the same in the Docker Desktop application.

Then open “http://localhost:3000” in any browser to load the Flask app, and fill in the DB connection and SQL query details.
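Besides Docker Desktop, a quick script can confirm that the published ports are accepting connections. This is only a minimal sketch: the helper name is_port_open is made up for illustration, and the ports 3000 (Flask app) and 8080 (mongo-express) are taken from the URLs used in this article.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Ports assumed from the URLs opened in the browser in this article.
    for name, port in {"Flask app": 3000, "mongo-express": 8080}.items():
        state = "reachable" if is_port_open("localhost", port) else "not reachable"
        print(f"{name} on port {port}: {state}")
```

An open port only tells us that something is listening, not that the app is healthy, so the container logs remain the better place to look if a page does not load.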

SRC_EMP table:

TGT_EMP table:

Let’s say there is a process that adds a 10% bonus to the salary from the SRC_EMP table and loads the result into the TGT_EMP table. Accordingly, we have entered the SQL queries in the web form.

Note: The column names fetched by the source and target queries should be the same. Use aliases if required.
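To see why the alias matters, here is a small sketch of a column check that could be run on the two query results before comparing them. The queries, the column names id and salary, and the helper column_mismatch are assumptions made for illustration; they are not taken from the repository.

```python
import pandas as pd

# Hypothetical queries: the alias keeps the computed column named "salary".
SRC_QUERY = "SELECT id, salary * 1.1 AS salary FROM SRC_EMP"
TGT_QUERY = "SELECT id, salary FROM TGT_EMP"


def column_mismatch(src_df, tgt_df):
    """Return (only_in_source, only_in_target) column names, each sorted."""
    src_cols, tgt_cols = set(src_df.columns), set(tgt_df.columns)
    return sorted(src_cols - tgt_cols), sorted(tgt_cols - src_cols)


# Empty frames standing in for query results; only the column names matter here.
# Without an alias, the computed column is typically named after the expression.
src_no_alias = pd.DataFrame(columns=["id", "salary * 1.1"])
src_aliased = pd.DataFrame(columns=["id", "salary"])
tgt = pd.DataFrame(columns=["id", "salary"])

print(column_mismatch(src_no_alias, tgt))  # (['salary * 1.1'], ['salary'])
print(column_mismatch(src_aliased, tgt))   # ([], [])
```

If either list is non-empty, the rows cannot be compared column by column, which is why the alias is needed in the source query.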

Here only the Id 1 record matches, i.e. 2000 in the source becomes 2200 with the 10% bonus, which equals the value in the target. The Id 2 record does not have the correct value in the target, Id 3 is present only in the source, and Id 4 only in the target. So, it is expected that:

  • src_diff_tgt.csv will contain 2 records: Id 2 and Id 3.
  • tgt_diff_src.csv will contain 2 records: Id 2 and Id 4.
  • Only 1 record (Id 1) will match.
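The expected counts above can be reproduced with a few lines of pandas. This is a sketch of one possible way to do the comparison (an outer merge with an indicator column), not necessarily how the app in the repository does it. Only the Id 1 values come from the example; the salaries for Id 2, 3 and 4 are made up, and the source frame already includes the 10% bonus, as the source query would return it.

```python
import pandas as pd

# Result of the source query (salary already includes the 10% bonus).
src = pd.DataFrame({"id": [1, 2, 3], "salary": [2200, 3300, 4400]})
# Result of the target query. Id 2 has a wrong value, Id 4 exists only here.
tgt = pd.DataFrame({"id": [1, 2, 4], "salary": [2200, 3000, 5500]})

# Merge on all columns: a row counts as a match only if every value is equal.
merged = src.merge(tgt, how="outer", on=["id", "salary"], indicator=True)

src_diff_tgt = merged[merged["_merge"] == "left_only"].drop(columns="_merge")
tgt_diff_src = merged[merged["_merge"] == "right_only"].drop(columns="_merge")
matches = merged[merged["_merge"] == "both"].drop(columns="_merge")

print(sorted(src_diff_tgt["id"].tolist()))  # [2, 3]
print(sorted(tgt_diff_src["id"].tolist()))  # [2, 4]
print(len(matches))                         # 1
```

Note that Id 2 appears in both difference sets because the row exists on both sides but with different salaries.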

Let’s click on Submit.

Here we can see that the compare statistics are as expected. We can also see the result files in the tgt-docker folder, inside a subfolder named after the time of submission.

Now let’s open “http://localhost:8080” in the browser to see the Mongo-Express dashboard. Click on the db-compare database and then on the result collection. There we can see a record (document) with the statistics and info.
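To check the output folder without opening a file explorer, a short helper can list every run folder and the CSV files inside it. This is a sketch under the assumption that each submission creates one subfolder containing CSV files; the function name list_result_files is made up for illustration, and it does not rely on any particular timestamp format.

```python
from pathlib import Path


def list_result_files(base_dir):
    """Map each run folder under base_dir to the sorted CSV file names inside it."""
    base = Path(base_dir)
    return {
        run.name: sorted(p.name for p in run.glob("*.csv"))
        for run in sorted(base.iterdir())
        if run.is_dir()
    }
```

Calling list_result_files("tgt-docker") from the directory that holds the host volume should then return one entry per submitted comparison.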

We can also click on individual records (documents) to see a detailed view.

We can also see the result files being created inside the Flask app container. Run the command below in a terminal to open a shell in the container:

docker exec -it db-compare_db-compare-app_1 /bin/sh

Here, after -it, we need to give the container name. We can get it from Docker Desktop or from the terminal (when we run docker-compose with compose.yaml, it logs the container names at the start).
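The container name used above follows the pattern that the docker-compose (v1) command uses: project name, service name and an index, joined by underscores. The project name defaults to the directory name (db-compare here). The newer "docker compose" (v2) plugin joins the parts with hyphens by default, so the name can differ on your system. A tiny sketch of the pattern, where the helper name is made up and the service name db-compare-app is inferred from the container name above:

```python
def compose_container_name(project, service, index=1, separator="_"):
    """Build the default container name: <project><sep><service><sep><index>."""
    return separator.join([project, service, str(index)])


# Compose v1 style, as used in this article.
print(compose_container_name("db-compare", "db-compare-app"))
# Compose v2 style, if your installation uses hyphens.
print(compose_container_name("db-compare", "db-compare-app", separator="-"))
```

When in doubt, the names shown by Docker Desktop or in the startup logs are the ones to use.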

Then we can use the ls command to see the folders generated inside the tgt folder and the CSV files inside them.

Use “exit” to leave the container shell.

Note: We can see the logs of any container using the command below (or in Docker Desktop):

docker logs <container id/name>

For the Flask app container:

Similarly we can submit other requests for comparison.

Step 6 : Stopping and removing the containers

We can stop, pause and remove containers from the terminal as well as from the Docker Desktop application. The terminal where we ran the “docker-compose” command to build and start the containers keeps printing their logs.

There we can simply press “Ctrl+C” to stop the containers and then use the command below to remove them:

docker-compose -f compose.yaml down

Now we can see that the containers are no longer listed in Docker Desktop, but the images are still there. To re-create and start the containers, or to restart stopped containers, we can use the previous “docker-compose” command with “up” at the end.

Now we have developed and tested the application on our system. But how do we share it with other users so that they can use it on their systems?

In the next part we will create private Docker repositories on AWS ECR (Elastic Container Registry) and push our Docker images to them. From there, anyone with access to those private repositories can pull the images and start the application using docker-compose.

Next part : https://medium.com/@subham-sahoo/data-compare-tool-using-pandas-flask-with-mongodb-and-docker-aws-ecr-part-4-c5af84f04f68

If you liked this project, kindly give it a clap. Feel free to share it with others.

👉 Follow me on Medium for more such interesting content and projects.

Check out my other projects here : https://medium.com/@subham-sahoo/

Connect with me on LinkedIn. ✨

Thank you!!

References :

All references can be found here : https://github.com/sksgit7/Data-compare-docker/blob/main/references.txt
