Data Engineering Books. Readers Digest to Learn Data… | by 💡Mike Shakhomirov | Nov, 2023


Readers Digest to Learn Data Engineering Gradually

Photo by Tamas Pap on Unsplash

In this story, I would like to talk about data engineering books and resources that might be of interest to those learning data engineering (DE). I realised that there aren’t many on the market that explain data engineering holistically, as a concept. Some are great at explaining how to use particular tools and data platform architectures, and some are my favourite bedtime reads: gloriously boring and astonishingly easy to fall asleep to. Some are great for strategic decision-making, and some might seem a bit outdated but are still useful. I hope you’ll find it interesting.

Disclosure: This post may contain affiliate links, meaning I get a commission if you decide to make a purchase through my links, at no cost to you.

Work with Massive Datasets to Design Data Models and Automate Data Pipelines Using Python
Paul Crickard, 2020

This is a great book for those who would like to learn open-source Apache tools for data engineering. It covers essential data engineering topics such as data modelling and offers an abundance of examples of the most common data transformations. As the book description says, it is about Python and data modelling, so readers will focus on ETL techniques to extract, cleanse and enrich datasets using Python tools. It explains Apache Kafka and Apache Spark in detail, and also covers the essentials of working with file formats, data transformation and cleansing. The book also offers some really good views on data pipeline deployments and on working with data environments.
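
To give a rough feel for the extract, cleanse and enrich steps mentioned above, here is a minimal sketch using only the Python standard library. It is not taken from the book: the sample data, the city-to-country lookup and all function names are made up for illustration, and a real pipeline would read from files or a database rather than an inline string.

```python
import csv
import io

# Hypothetical raw input: messy whitespace, inconsistent casing,
# a missing age and a duplicate record.
RAW_CSV = """name,city,age
 Alice ,london,34
Bob,Paris,
alice,London,34
Carol,berlin,29
"""

# Hypothetical reference data used for the enrichment step.
COUNTRY_BY_CITY = {"London": "UK", "Paris": "France", "Berlin": "Germany"}


def extract(text):
    """Extract: parse CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def cleanse(rows):
    """Cleanse: normalise text, drop rows without a valid age, de-duplicate."""
    seen = set()
    cleaned = []
    for row in rows:
        name = row["name"].strip().title()
        city = row["city"].strip().title()
        age = row["age"].strip()
        if not age.isdigit():
            continue  # skip rows with a missing or non-numeric age
        key = (name, city)
        if key in seen:
            continue  # skip duplicates of an already accepted record
        seen.add(key)
        cleaned.append({"name": name, "city": city, "age": int(age)})
    return cleaned


def enrich(rows):
    """Enrich: add a country column from the lookup table."""
    return [
        {**row, "country": COUNTRY_BY_CITY.get(row["city"], "Unknown")}
        for row in rows
    ]


def run_pipeline(text):
    """Chain the three steps into one small pipeline."""
    return enrich(cleanse(extract(text)))


if __name__ == "__main__":
    for record in run_pipeline(RAW_CSV):
        print(record)
```

Keeping each step as a separate function makes it easy to test the steps one at a time and to swap the inline string for a real source later.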

One of my stories with advanced ETL techniques to complement this book:

by Joe Reis, Matt Housley
Released June 2022
Publisher: O’Reilly Media, Inc.


