Apache Flume is a distributed, reliable, and available service used to efficiently collect, aggregate, and move large amounts of log data. It is used to stream logs from application servers to HDFS for ad hoc analysis. This book starts with an archi
How can you get your data from frontend servers to Hadoop in near real time? With this complete reference guide, you’ll learn Flume’s rich set of features for collecting, aggregating, and writing large amounts of streaming data to the Hadoop Distrib