The volume of information has grown drastically in recent years, which can be a challenge or, seen another way, an opportunity for big companies and high-turnover corporations. Although various types of databases and data-analysis frameworks already exist, this amount of data can pose an intractable problem for those traditional systems. This book shows a new way of analyzing data at this scale.
Previews available in: English
Subjects
Big data, Web usage mining
Book Details
Table of Contents
Introduction: What is Big Data ; Defining structured data ; Defining unstructured data ; Rethinking data management ; Big data capabilities ; Is new technology needed? ; Big data or business intelligence ; The value of big data.
Chapter 1. Hadoop: What is Hadoop? ; Hadoop vs. RDBMS ; Hadoop installation and running ; Cluster Mode installing and running ; Hadoop startup ; Hadoop shutdown.
Chapter 2. MapReduce: Apache Hadoop core components ; Hadoop distributed file system (HDFS) ; MapReduce ; Underneath of MapReduce process ; Hadoop data flow instructor ; Fault tolerance ; Speculative execution ; Key-value pair databases in a Big Data environment ; MapReduce algorithms ; General reducer-side join ; Optimized reducer-side join ; Map-side partition join ; Map-side partition merge join ; TF-IDF and MapReduce ; Implementation in Apache Pig ; Different MapReduce languages ; YARN.
Chapter 3. HDFS: A brief history ; Overview of HDFS ; Reasons for downtime ; Use cases ; HDFS formation process ; HDFS architecture ; Configuring HDFS ; Interacting with HDFS.
Chapter 4. HBase: What is HBase? ; Columnar databases ; Bloom filter ; Why and when HBase? ; HBase architecture ; Tables, rows, columns, and cells in HBase ; Master and region server ; Data model operations ; Better performance.
Appendix & references: Sqoop ; Prerequisites ; Usage of Sqoop ; Installing and running Sqoop ; Controlling the Hadoop installation ; Generic and specific arguments ; Connecting to a database server ; Controlling parallelism ; Best practices for selecting Apache Hadoop hardware.
Edition Notes
Book and table of contents contain spelling and grammatical errors.
March 17, 2022 | Created by ImportBot | import new book |