
Ebook Info
- Published: 2018
- Number of pages: 951
- Format: PDF
- File Size: 7.88 MB
- Authors: Bill Chambers, Matei Zaharia
Description
Learn how to use, deploy, and maintain Apache Spark with this comprehensive guide, written by the creators of the open-source cluster-computing framework. With an emphasis on improvements and new features in Spark 2.0, authors Bill Chambers and Matei Zaharia break down Spark topics into distinct sections, each with unique goals. You’ll explore the basic operations and common functions of Spark’s structured APIs, as well as Structured Streaming, a new high-level API for building end-to-end streaming applications. Developers and system administrators will learn the fundamentals of monitoring, tuning, and debugging Spark, and explore machine learning techniques and scenarios for employing MLlib, Spark’s scalable machine-learning library.
- Get a gentle overview of big data and Spark
- Learn about DataFrames, SQL, and Datasets (Spark’s core APIs) through worked examples
- Dive into Spark’s low-level APIs, RDDs, and execution of SQL and DataFrames
- Understand how Spark runs on a cluster
- Debug, monitor, and tune Spark clusters and applications
- Learn the power of Structured Streaming, Spark’s stream-processing engine
- Learn how you can apply MLlib to a variety of problems, including classification or recommendation
User’s Reviews
Reviews from Amazon users, collected at the time this book was published on the website:
⭐This is a good book to understand the context and drive behind the development of Spark, by its developers. It helps us understand the approach, the larger context, and the general idea of Spark, but is very definitely not a book that provides immediate and actionable knowledge about details of Spark.
⭐I wasn’t sure about this book initially, but as I started to use Spark and read the book in parallel, I discovered it explained very well the behind-the-scenes details that I needed to understand. I would recommend this to people who already program in other languages such as Python and want to start using PySpark.
⭐Love the book. It gets hands-on right away and gives you both Scala and Python versions of the code. I used the Databricks community version of Spark. Some code is wrong. Python is sometimes, but rarely, missing. Highly recommend this to anyone who is looking to gain knowledge in Spark.
⭐+s:
+ Great intro text.
+ Very detailed with lots of code samples.
+ ML section is thorough (if limited in depth)
+ all code is on GitHub :)
+ conceptual
+ tuning and optimizations sections
-s:
- Organization is a little choppy – understanding Structured Streaming aggregations requires jumping back and forth to the aggregations section (for example)
- Copy-pasting code samples is annoying.
- Kindle for Mac is sucky: resizing windows and adjusting text size breaks the flow, sometimes requiring a restart. Indexing is weird and it “depaginates”
- Could use a few sections in wide vs narrow…
⭐Like most people, I bought this book to reference at work. But the Kindle app does not work behind a firewall, so I can’t read this book at work, where I need it. I contacted O’Reilly customer service and they sent me a web link to the book. So I’m happy now.
⭐In my field, just knowing the technology is not good enough; I had to be really good at it. This book covers Spark fundamentals and advanced topics in great detail, with a lot of good examples. Someone with a SQL background and a little bit of programming experience can very easily follow all the examples and implement them in real-world projects.
⭐This is a great beginner to intermediate book on Spark. The authors did an excellent job explaining concepts and gave a lot of examples (in Scala and Python). My only complaint is that you can’t use Kindle Cloud Reader. For a normal book it might not be an issue, but for a programming book, you’d probably want to read it on your computer so you can take notes, type in examples, and search. I’ve bought other O’Reilly books and haven’t had this issue in the past (this book seems to be the exception). Right now you’re limited to Kindle apps, so a table might look like this on your phone or tablet:
+------------+---------------+
| some_field | another_field
+------------+---------------+
| a | b
The more I reference this book, the more I think it’s a big disadvantage.
⭐This book presents the main Spark concepts, particularly the v2.x Structured API, in tutorial fashion using Scala and Python. Much of this information is available piecemeal online, but I found it valuable to have it ordered and explained thoroughly rather than digging through Stack Overflow or trying to make sense of the docs. After presenting how Spark works and the Structured and low-level RDD APIs, the book helps you deploy, monitor, and tune your application to run on a cluster. There is a detailed section on Structured Streaming explaining windowing and event-time processing, plus a section on advanced machine learning analytics.
⭐Fantastic book – a must for Spark enthusiasts. Book layout and code snippets all work well and show each use case and purpose clearly, which wasn’t always the case with other books/videos I have explored. I’m bookmarking virtually every 3rd page because there are such good examples. Some spelling errors here and there, but well worth the money.
⭐Good read but very expensive for what it is.
⭐I read this book as preparation for the Databricks certification, and it helped me a lot to understand best practices and core concepts of Spark 2.x.
⭐Really good in-depth guide to Spark. An extremely helpful reference point when one wants to optimise their Spark jobs.
⭐One of the best books I have read: very clear, and it empowers you to use Spark. Highly recommended to pros and beginners alike.
Keywords
Free Download Spark: The Definitive Guide: Big Data Processing Made Simple 1st Edition in PDF format