Pepperdata enables companies to better manage Hadoop deployments

A Q&A with Pepperdata co-founder and VP of product Chad Carson. The Sunnyvale-based startup, which enables companies to have greater control when working with Hadoop, announced the completion of a $5 million Series A funding round at the end of April. The funding was led by Signia Venture Partners and Webb Investment Network. Investors in the company also include Yahoo co-founder Jerry Yang of AME Cloud Ventures; Ed Zander, former CEO and chairman of Motorola; and Andy Ludwick, founding CEO of Bay Networks. Pepperdata was founded in 2012 by Carson and Sean Suchter.

SUB: Please describe Pepperdata and your primary innovation.

Carson: Pepperdata users benefit from increased predictability, capacity and visibility when working with Hadoop. We strive to offer our customers a greater level of control for Hadoop, ensuring that critical production jobs are completed on time while making extra capacity available to users for other jobs. Users can get up and running with Pepperdata in less than an hour on any size cluster with any standard Hadoop distribution.

It’s easiest to describe this process with an analogy. Hadoop can be compared to a freeway with metering lights: new cars (tasks) are allowed onto the road, but there is no awareness or control of them once they’re on it. Pepperdata constantly monitors and controls every car, creating ‘fast lanes’ for higher-priority tasks when, and only when, the lanes are needed.

SUB: Who are your target markets and users?

Carson: We focus on enterprises that want to rely on Hadoop. Line-of-business owners and operators in these environments need to be able to count on Hadoop to achieve desired results on time, every time, without having to worry about being slowed down by other users. To get the most from their investment, operators need to understand what’s going on within their cluster and be able to maximize throughput. Pepperdata gives organizations the freedom to use Hadoop in any way they choose and to realize the full potential it can deliver.

SUB: Who do you consider to be your competition, and what differentiates Pepperdata from the competition?

Carson: While the Hadoop marketplace has become rather crowded and includes management tools from Hadoop vendors along with YARN and Mesos, Pepperdata doesn’t compete directly with anyone at this time. We recognized that there are already a number of quality tools in existence for cluster management; rather than trying to replace them, we complement them.

Ambari and Cloudera Manager are focused on getting clusters up and running, configuring them, identifying problematic nodes and conducting workflow management. They have node-level metrics about hardware usage, but not finer-grained information. In contrast, Pepperdata provides detailed visibility into hardware resource usage (per user, per job, even per task) every few seconds. That lets operators understand exactly what’s happening on the cluster and which users and groups are truly using the limited resources. While standard Hadoop and Mesos let users run various jobs and applications on clusters, they don’t control what happens once those applications are running. Pepperdata’s software continuously monitors and controls every task as it runs, which lets us ensure predictability for the most important jobs.
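To make the idea of per-user, per-job and per-task visibility concrete, here is a minimal Python sketch of how fine-grained usage samples could be rolled up at each of those levels. It is only an illustration and not Pepperdata's implementation or API: the `Sample` record, its fields, the `aggregate` function and the sample values are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One hypothetical usage sample reported for a running task."""
    user: str
    job: str
    task: str
    cpu_seconds: float
    disk_bytes: int


# Which fields identify a group at each (assumed) level of detail.
KEY_FIELDS = {
    "user": ("user",),
    "job": ("user", "job"),
    "task": ("user", "job", "task"),
}


def aggregate(samples, level):
    """Sum CPU and disk usage per user, per job, or per task."""
    fields = KEY_FIELDS[level]
    totals = defaultdict(lambda: {"cpu_seconds": 0.0, "disk_bytes": 0})
    for s in samples:
        key = tuple(getattr(s, f) for f in fields)
        totals[key]["cpu_seconds"] += s.cpu_seconds
        totals[key]["disk_bytes"] += s.disk_bytes
    return dict(totals)


# Made-up samples, standing in for one collection interval.
SAMPLES = [
    Sample("alice", "etl", "map_0", 2.0, 100),
    Sample("alice", "etl", "map_1", 1.5, 50),
    Sample("alice", "report", "reduce_0", 0.5, 10),
    Sample("bob", "adhoc", "map_0", 3.0, 400),
]

if __name__ == "__main__":
    for level in ("user", "job", "task"):
        print(level, aggregate(SAMPLES, level))
```

Node-level metrics would show only the combined load on a machine; grouping the same samples by user or job is what reveals who is actually consuming the limited resources.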

SUB: You recently announced that you’ve raised $5 million in Series A funding. Why was this a particularly good time to raise more outside funding?

Carson: The excitement and momentum continue to build for Hadoop, and given its promise and delivery of a scalable approach to data processing, we aren’t at all surprised. We boast an impressive list of investors who believe in our commitment to helping companies realize Hadoop’s full potential.

SUB: How do you plan to use the funds, and do you have plans to seek additional funding in the near future?

Carson: We already have a number of customers using our software to increase visibility, capacity and predictability for their large-scale Hadoop clusters—some running well over a thousand nodes—and we will be using this new funding to accelerate investments in our product development and further build out our team.

SUB: What was the inspiration behind the idea for Pepperdata? Was there an ‘aha’ moment, or was the idea more gradual in developing?

Carson: Sean Suchter and I founded Pepperdata to help every company benefit from the same kinds of massive data systems we’d built and used at places like Inktomi, Yahoo and Bing. Sean managed Yahoo’s web search engineering team, the first commercial user of Hadoop (before it was even called Hadoop), and my sponsored search optimization and ranking team was the first group in the world to increase revenue using Hadoop.

As early Hadoop adopters, we saw great promise in its scalable approach to data processing. For the first time, our teams were able to use huge datasets to directly solve business problems without worrying about the plumbing needed to move and process all of the data. But at the same time, we struggled with Hadoop’s key limitation: the lack of a distributed cluster supervisor to control hardware use in real time. We wanted to make sure we were solving a real data problem for real business users.

SUB: What were the first steps you took in establishing the company?

Carson: When we started Pepperdata in 2012, we began with a full-fledged fact-finding mission to understand how both technical and non-technical business users use data across a range of industries. After talking to hundreds of users, we saw that as companies start to rely on Hadoop, their needs follow patterns we had seen previously in our careers at both Yahoo and Microsoft. Ultimately, the problem is that when these clusters are so critical to the business, they have to deliver on time, every time.

SUB: How did you come up with the name? What is the story or meaning behind it?

Carson: When we came up with the name ‘Pepperdata,’ we were playing with a bunch of active words like ‘vibrant,’ ‘vivid,’ and ‘live’ that convey quick motion and lots of energy. Hadoop is a massive and powerful—but lumbering—system that was designed to run big pipeline jobs without SLAs. Rather than companies being held back by the system, we wanted to make their data a little more active, a little hotter, a little spicier. Enter Pepperdata.

SUB: What have the most significant challenges been so far to building the company?

Carson: We’re solving really, really complicated technical problems, and our product runs on huge clusters with tight performance requirements—that means we need to have world-class distributed system engineers. We have quite a few already, but we’re always looking for more. Attracting and retaining top engineering talent is often the difference between success and failure.

SUB: How do you generate revenue or plan to generate revenue?

Carson: We will continue to generate revenue by selling our market-leading software to enterprise customers from a wide range of industries.

SUB: What are your goals for Pepperdata over the next year or so?

Carson: During the next year we will continue to work with enterprise organizations that understand that Hadoop is not hype, but a real business tool that companies are using to derive value. Hadoop is already a critical piece of infrastructure for many enterprises, and we’ve seen countless examples of success. Our goal is to ensure that Pepperdata is positioned to help enterprises as they continue to incorporate Hadoop infrastructure into their core business. Pepperdata lets companies rely on Hadoop to give them results on time, every time.