Bairavcloud

Technology	Features	Use Cases
Apache Flink	Streaming Dataflow Engine; Event Time Semantics; Exactly-Once Semantics; Backpressure Control; APIs for Streaming and Batch Applications; Connectors for Third-Party Data Sources	Real-Time Stream Processing on High-Throughput Data Sources; Writing Both Streaming and Batch Applications
Ganglia	Scalable, Distributed System for Monitoring Clusters and Grids; Generates Reports on and Displays the Performance of the Cluster and Individual Node Instances; Ingests and Visualizes Hadoop and Spark Metrics	Monitoring Cluster Performance; Inspecting Performance of Individual Node Instances
Apache Hadoop	Supports Massive Data Processing across a Cluster of Instances; Processing Models such as MapReduce and Tez; Distributed File System Called HDFS	Increased Processing and Storage Capacity; High Availability
HBase	Open-Source, Non-Relational Distributed Database for the Hadoop Ecosystem; Runs on Top of HDFS; Integrates with Apache Hive; Backup and Restore from Amazon S3	Providing Non-Relational Database Capabilities; Direct Input and Output to the MapReduce Framework; SQL-Like Queries over HBase Tables; Data Persistence and Disaster Recovery
HCatalog	Allows Access to Hive Metastore Tables within Pig, Spark SQL, and Custom MapReduce Applications; Provides a REST Interface and Command Line Client; Supports AWS Glue Data Catalog as Metastore for Hive	Accessing Hive Metastore Tables within Various Applications; Using AWS Glue Data Catalog as Metastore for Hive