Do you have any difficulty analysing CSV, JSON, Avro, Parquet or XML files in your company or research project? In this blog, I am going to explain how to analyse these types of files using standard SQL. Another point to note is that there is no size limit! Sounds good?

The following image shows a simple example of how you query a CSV file using standard SQL. Let’s dive deep into the architecture and the steps that need to be done.
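To make the idea of querying a CSV file with SQL concrete, here is a minimal sketch. It is only a stand-in: it uses Python's built-in `sqlite3` and `csv` modules, not the architecture this blog describes, and an in-memory database like this one is limited by available memory. The sample data, the `sales` table name and the `load_csv` helper are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for a CSV file on disk.
CSV_TEXT = """customer,country,amount
alice,NL,120.50
bob,DE,80.00
carol,NL,42.25
"""


def load_csv(conn, table, text):
    """Create `table` from CSV text (first row is the header) and insert every row."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', data)


conn = sqlite3.connect(":memory:")
load_csv(conn, "sales", CSV_TEXT)

# Values are loaded as text, so the amount is cast before summing.
totals = conn.execute(
    "SELECT country, SUM(CAST(amount AS REAL)) AS total "
    "FROM sales GROUP BY country ORDER BY country"
).fetchall()
print(totals)
```

The query itself is plain SQL; only the loading step is specific to this stand-in.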

Kafka is one of the popular open-source data streaming tools and is used for multiple use cases. The main idea is to implement an event-driven architecture that processes and distributes data in a scalable, real-time way. Since you are working with data, it is important to implement consistent data integration. From this point of view, I will cover how to tackle unexpected situations in streaming projects, such as invalid data and exceptions.

In this section, I am going to explain one of the most popular architectural transformation approaches: moving from a monolith to an event-driven microservices architecture. Lots of companies are making this transformation, and I would like to elaborate on why the IT world is doing so and examine the pros and cons of the different approaches.

What is a monolith application?

In general, a monolithic application consists of a single database with multiple application layers on top of it. These layers can be differentiated based on the architectural design, such as the user interface, business layer and data layer.
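The layering described above can be sketched roughly as follows. This is a minimal illustration, not a reference design: the class names, the `orders` table and the validation rule are all hypothetical, and an in-memory SQLite database stands in for the single shared database.

```python
import sqlite3


class DataLayer:
    """Data layer: the only code that talks to the single shared database."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT, qty INTEGER)"
        )

    def insert_order(self, item, qty):
        cur = self.conn.execute(
            "INSERT INTO orders (item, qty) VALUES (?, ?)", (item, qty)
        )
        return cur.lastrowid

    def count_orders(self):
        return self.conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]


class BusinessLayer:
    """Business layer: applies rules, then delegates storage to the data layer."""

    def __init__(self, data):
        self.data = data

    def place_order(self, item, qty):
        if qty <= 0:
            raise ValueError("quantity must be positive")
        return self.data.insert_order(item, qty)


class UserInterface:
    """User interface layer: turns user input into business-layer calls."""

    def __init__(self, business):
        self.business = business

    def submit(self, item, qty):
        try:
            order_id = self.business.place_order(item, qty)
        except ValueError as exc:
            return f"error: {exc}"
        return f"order {order_id} accepted"


# All three layers live in one process and share one database.
data = DataLayer()
ui = UserInterface(BusinessLayer(data))
print(ui.submit("book", 2))
print(ui.submit("book", 0))
```

Every layer here is deployed and scaled together, which is the property that distinguishes the monolith from a microservices setup.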

The AWS Certified Solutions Architect Professional exam is one of the most popular and challenging of the AWS certification exams. In this section, I have listed some example questions and their answers.

Question 1

An accounting firm has an on-premises Oracle database that stores customer information and accounting movements. Due to audit rules, user information needs to be stored for 5 years. When the audit authority comes in 5 years, they want to run SQL queries on randomly selected customers to see whether customers are being tracked. Due to some financial issues…

Thymeleaf is a simple library that lets you render HTML templates in Java applications. In this blog, I am going to create a simple web form within a Java project in order to get some inputs via the web page. This blog is kept brief to show how simple the approach is.

Steps that need to be done

Step 1-Create a simple Java project via Maven

Step 2-Add the following libraries to the Maven project; we are going to use Spring Boot Starter. The main libraries are spring-boot-starter-web and thymeleaf

Step 3-Create an Application class with the annotation @SpringBootApplication; this will be the starting point…

Near real-time processing is one of the popular approaches, and it requires processing data within minutes. Once the data arrives at the storage layer, you need to process it within a couple of minutes.

There is no exact specification that defines near real-time in terms of the processing time period. I used the following graph from O'Reilly as a reference; it gives an idea of the different processing types. For near real-time, you can see the processing time is roughly between 5 min and 60 min.

For me, the most important thing is to focus on how much value you create…

The Kubernetes Application Developer exam is one of the more challenging assessments. It is different from other exams since you need to implement real configurations instead of selecting A, B, C or D.

In this section, I am going to give you some exam preparation questions and answers for CKAD (Certified Kubernetes Application Developer). In order to run them, you can use minikube on your local computer.

Question 1

In the default namespace, which of the following commands helps you identify the top memory-consuming pod?

  • kubectl top pod
  • kubectl exec pod
  • kubectl logs pod
  • kubectl get pods -o wide

Answer 1

Once you run the command below…

CCDAK Confluent Certified Developer for Apache Kafka

CCDAK is one of the most popular exams for Apache Kafka. In this section, I have listed some example questions.

Question 1

Kafka Connect can be run in which of these modes? (Select two options)

  • Distributed Mode
  • Vertical mode
  • Batch mode
  • Standalone mode

Answer 1

Kafka Connect can be run in standalone mode and distributed mode. Standalone mode is useful for development and testing Kafka Connect on a local machine.

Distributed mode runs Connect workers on multiple machines (nodes).

Question 2

Adding a field without a default value is a ….. compatible change

  • Backward
  • Forward
  • Full
  • None

Answer 2

Adding a field without a default value is a forward-compatible change (or delete…
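To see why adding a field without a default is forward compatible but not backward compatible, here is a small sketch. It is a deliberately simplified model, not the Schema Registry's actual checker: schemas are plain dictionaries, the `can_read` helper is hypothetical, and the only rule applied is that every field the reader expects must either be present in the written data or have a default.

```python
def can_read(reader_schema, writer_schema):
    """Return True if a reader using reader_schema can decode data
    written with writer_schema (simplified rule: each reader field must
    exist in the writer schema or carry a default)."""
    for name, spec in reader_schema.items():
        if name not in writer_schema and "default" not in spec:
            return False
    return True


# Hypothetical schemas: the new version adds "email" without a default.
old_schema = {"id": {"type": "int"}}
new_schema = {"id": {"type": "int"}, "email": {"type": "string"}}

# Forward: consumers still on the old schema read data produced with the new one.
forward = can_read(reader_schema=old_schema, writer_schema=new_schema)

# Backward: consumers on the new schema read data produced with the old one.
backward = can_read(reader_schema=new_schema, writer_schema=old_schema)

print("forward compatible:", forward)
print("backward compatible:", backward)
```

The old reader simply ignores the extra field, while the new reader has nothing to fill `email` with when it meets old data.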

When you need to process any amount of data, there are different types of data processing approaches, such as batch, stream processing and micro-batch. Depending on your use case, you can use these processing methods with the help of libraries such as Spark, Hadoop, etc.

Before explaining the 3 different processing methods, I would like to give some hints about the value of data processing. When you see the following diagram, please pay attention to the interesting term: the diminishing value of data …

In this section, I am going to quickly explain RTO and RPO, which are mostly used in your disaster recovery strategy. Before digging into this topic, I would recommend reading this blog

Disaster Recovery Strategy in AWS

RTO (Recovery Time Objective) is the time difference between the disaster starting time and the system restoring time. Let’s suppose a disruption happens in your infrastructure. Depending on your disaster recovery strategy, your infrastructure will be restored sometime later, and the business process will continue accordingly. This time interval is called the Recovery Time Objective (RTO).
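As a quick worked example of that time difference, the sketch below measures the interval between a disaster and the restore, then compares it with a target. The timestamps, the 4-hour target and the `recovery_time` helper are all hypothetical values chosen for illustration.

```python
from datetime import datetime, timedelta


def recovery_time(disaster_start, system_restored):
    """Return the interval between the disaster start and the restore."""
    if system_restored < disaster_start:
        raise ValueError("restore time cannot be before the disaster")
    return system_restored - disaster_start


# Hypothetical incident: disruption at 09:00, system restored at 11:30.
disaster_start = datetime(2024, 1, 10, 9, 0)
system_restored = datetime(2024, 1, 10, 11, 30)

# Hypothetical target agreed in the disaster recovery strategy.
rto_target = timedelta(hours=4)

actual = recovery_time(disaster_start, system_restored)
met = actual <= rto_target

print("time to recover:", actual)
print("within RTO target:", met)
```

In this made-up incident the recovery took 2 hours 30 minutes, which is inside the 4-hour target.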

RPO (Recovery Point Objective) is the recovery…


Data & Cloud Architect and Trainer.
