The Data Warehouse ETL Toolkit covers the entire process of building and managing a data warehouse. The book includes pragmatic solutions for the most time- and labor-intensive stage of data warehousing, Extract-Transform-Load (ETL).
Summary Of The Book
The Data Warehouse ETL Toolkit is a guide for novice as well as advanced data warehouse developers and managers. The book elaborates on the foundation of the data warehouse, known as the Extract, Transform, and Load (ETL) system. The ETL system is the backbone of any data warehouse and often consumes 70 percent of the resources required for the implementation and maintenance of a typical data warehouse. The book is organized around the four steps involved in the ETL system, viz. extracting data from the source systems, enforcing data quality and consistency standards, conforming data so that separate sources can be used together, and finally delivering data in a presentation-ready format that developers can use to build applications.
The Data Warehouse ETL Toolkit describes the most efficient practices for extracting data from scattered sources throughout the enterprise, removing redundancy and inaccuracy, transforming the resulting data into correctly formatted data structures, and finally physically loading the end product into the data warehouse. The book also includes useful time-saving ETL techniques, a comprehensive guide to building dimensional structures, and practical advice on maintaining data quality.
About The Authors
Ralph Kimball, Ph.D., is the founder of the Kimball Group. He is a leading expert in the data warehouse industry and is also a well-known speaker, consultant, teacher, and writer. His other books include The Data Warehouse Lifecycle Toolkit and The Data Webhouse Toolkit. Kimball completed his doctorate in electrical engineering at Stanford University, specialising in Man-Machine Systems Design.
Kimball was given the Alexander C. Williams award by the IEEE Human Factors Society for his work on the Xerox Star Workstation. While working as the Vice President of Applications at Metaphor Computer Systems, Kimball invented the capsule facility, which was the first commercial implementation of the graphical dataflow interface, now in widespread use in all ETL tools. Kimball has been writing for Intelligent Enterprise magazine since 1995, and is the recipient of the Readers’ Choice Award.
Joe Caserta is the founder and Principal of Caserta Concepts, LLC. He is an expert on data warehousing, with years of industry experience and practical application of major data warehousing tools and databases. He has authored other books such as ETL Techniques for Data Warehousing: The Complete Guide to Extract, Transform, and Load Techniques and The Relatives. Caserta studied Database Application Development and Design at Columbia University, New York. He is an active contributor to print and online magazines, and is a member-contributor at DWList, a major online community for data warehousing professionals.
The Data Warehouse ETL Toolkit shows data warehouse developers how to effectively manage the ETL (Extract, Transform, and Load) phase of the data warehouse development lifecycle. The authors show developers the best methods for extracting data from scattered sources throughout the enterprise, removing obsolete, redundant, and inaccurate data, transforming the remaining data into correctly formatted data structures, and then physically loading them into the data warehouse.
Number Of Pages: 510