Spark: Big Data Cluster Computing in Production

Description

TIPS, TRICKS, AND SOLUTIONS FOR USING SPARK IN PRODUCTION

Spark's popularity means the field is expanding in terms of both use and capability. Faster than Hadoop and MapReduce, but compatible with Java®, Scala, Python®, and R, this open source clustering framework is becoming a must-have skill. Spark: Big Data Cluster Computing in Production goes beyond the basics to show you how to bring Spark to real-world production environments. With expert instruction, real-life use cases, and frank discussion, this guide helps you move past the challenges and bring proof-of-concept Spark applications live.

Fine-tune your Spark app to run on production data
Manage resources, organize storage, and master monitoring
Learn about potential problems from real-world use cases, and see where Spark fits best
Estimate cluster size and nail down hardware requirements
Tune up performance with memory management, partitioning, shuffling, and more
Ensure data security with Kerberos
Head off Spark streaming problems in production
Integrate Spark with YARN, Mesos, Tachyon, and more

Additional information

Author	Ilya Ganelin, Ema Orhian, Kai Sasaki, Brennon York
Publisher	Wiley
Publication year	2016
Cover type	Paperback
EAN	9781119254010
