Pepperdata Project Hosts HDFS on Kubernetes
Pepperdata has launched an open-source project aimed at making it easier to deploy the Apache Spark in-memory computing framework for big data analytics applications. Pepperdata CTO Sean Suchter says the Hadoop Distributed File System (HDFS) on Kubernetes project hosted on GitHub seeks to take advantage of a unique opportunity to unify the underlying infrastructure employed to support both big data and traditional applications. One of the primary issues IT organizations regularly encounter when deploying big data applications is that they require dedicated infrastructure. The HDFS on Kubernetes project eliminates that requirement by making it possible to deploy Apache Spark on top of HDFS running on any Kubernetes cluster, providing a common layer of abstraction across multiple types and classes of IT infrastructure, he explains. Suchter says this is significant because organizations increasingly need to be able to apply advanced analytics and machine learning algorithms across all the data they possess. Deploying HDFS on Kubernetes makes it possible to consistently manage all the silos where that data resides, regardless of whether they are deployed on-premises or in a cloud that supports, for example, the S3 interface defined by Amazon Web Services (AWS).
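To illustrate the abstraction the article describes: once HDFS runs inside Kubernetes, a Spark job can reach the NameNode through the cluster's internal DNS rather than a fixed host. The sketch below shows how such an address might be constructed; the service name, namespace, and path are illustrative assumptions (not from the article), though 8020 is the conventional HDFS NameNode RPC port and the `<service>.<namespace>.svc.cluster.local` pattern is standard Kubernetes DNS.

```python
# Sketch only: builds the kind of HDFS URL a Spark job running on Kubernetes
# might use. Service name, namespace, and path are hypothetical.
def hdfs_url(service="hdfs-namenode", namespace="default",
             port=8020, path="/data/events"):
    # Kubernetes exposes services at <service>.<namespace>.svc.cluster.local,
    # so the same URL works regardless of which nodes host the NameNode pod.
    host = f"{service}.{namespace}.svc.cluster.local"
    return f"hdfs://{host}:{port}{path}"

print(hdfs_url())
# hdfs://hdfs-namenode.default.svc.cluster.local:8020/data/events
```

Because the URL is resolved by cluster DNS, the same Spark code can run unchanged whether the Kubernetes cluster lives on-premises or in a public cloud, which is the portability argument Suchter makes.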
Read Full Article at https://containerjournal.com/2017/11/14/pepperdata-project-hosts-hdfs-kubernetes/
Tags: abstraction, advantage, apache, computing, deploy, example, framework, hadoop, infrastructure, interface, machine, opportunity, organization, pepperdata, project, suchter, support, system