Get Apache Spark 2.x Cookbook PDF

By Rishi Yadav

Key Features

  • This book contains recipes on how to use Apache Spark as a unified compute engine
  • Covers how to connect various source systems to Apache Spark
  • Covers various parts of machine learning, including supervised/unsupervised learning & recommendation engines

Book Description

While Apache Spark 1.x gained a lot of traction and adoption in the early years, Spark 2.x delivers notable improvements in the areas of API, schema awareness, performance, Structured Streaming, and simplifying building blocks to build better, faster, smarter, and more accessible big data applications. This book uncovers all these features in the form of structured recipes to analyze and mature large and complex sets of data.

Starting with installing and configuring Apache Spark with various cluster managers, you will learn to set up development environments. Further on, you will be introduced to working with RDDs, DataFrames and Datasets to operate on schema-aware data, and real-time streaming with various sources such as Twitter Stream and Apache Kafka. You will also work through recipes on machine learning, including supervised learning, unsupervised learning & recommendation engines in Spark.

Last but not least, the final few chapters delve deeper into the concepts of graph processing using GraphX, securing your implementations, cluster optimization, and troubleshooting.

What you will learn

  • Install and configure Apache Spark with various cluster managers & on AWS
  • Set up a development environment for Apache Spark, including Databricks Cloud notebook
  • Find out how to operate on data in Spark with schemas
  • Get to grips with real-time streaming analytics using Spark Streaming & Structured Streaming
  • Master supervised learning and unsupervised learning using MLlib
  • Build a recommendation engine using MLlib
  • Graph processing using GraphX and GraphFrames libraries
  • Develop a set of common applications or project types, and solutions that solve complex big data problems

About the Author

Rishi Yadav has 19 years of experience in designing and developing enterprise applications. He is an open source software expert and advises American companies on big data and public cloud trends. Rishi was honored as one of Silicon Valley's 40 under 40 in 2014. He earned his bachelor's degree from the prestigious Indian Institute of Technology, Delhi, in 1998.

About 12 years ago, Rishi started InfoObjects, a company that helps data-driven businesses gain new insights into data. InfoObjects combines the power of open source and big data to solve business challenges for its clients and has a special focus on Apache Spark. The company has been on the Inc. 5000 list of the fastest growing companies for 6 years in a row. InfoObjects has also been named the best place to work in the Bay Area in 2014 and 2015.

Rishi is an open source contributor and active blogger.

Table of Contents

  1. Getting Started with Apache Spark
  2. Developing Applications with Spark
  3. Spark SQL
  4. Working with External Data Sources
  5. Spark Streaming
  6. Getting Started with Machine Learning
  7. Supervised Learning with MLlib – Regression
  8. Supervised Learning with MLlib – Classification
  9. Unsupervised Learning
  10. Recommendations Using Collaborative Filtering
  11. Graph Processing Using GraphX and GraphFrames
  12. Optimizations and Performance Tuning

Best data modeling & design books

Download e-book for kindle: Agent Intelligence Through Data Mining: 14 (Multiagent by Andreas L. Symeonidis, Pericles A. Mitkas

Knowledge, hidden in voluminous data repositories routinely created and maintained by today's applications, can be extracted by data mining. The next step is to transform this discovered knowledge into the inference mechanisms or simply the behavior of agents and multi-agent systems. Agent Intelligence Through Data Mining addresses this issue, as well as the arguable challenge of generating intelligence from data while transferring it to a separate, possibly autonomous, software entity.

New PDF release: SQL Server 2014 Design & Programming

SQL Server 2014 programming builds on the achievements of decades in advanced relational database technology. Among the new SQL Server 2014 features, one is prominent: in-memory OLTP tables. Disk was always the slowest part of the computer system. Since memory is abundant, it is logical to place tables into memory to gain performance.

New PDF release: SQL Server 2016 Reporting Services Cookbook

Create interactive cross-platform reports and dashboards using SQL Server 2016 Reporting Services. About This Book: Get up to speed with the newly introduced enhancements and the more advanced query and reporting features. Easily access your important data by creating visually appealing dashboards in the Power BI practical recipe. Create cross-browser and cross-platform reports using SQL Server 2016 Reporting Services. Who This Book Is For: This book is for software professionals who develop and implement reporting solutions using Microsoft SQL Server.

NoSQL and SQL Data Modeling: Bringing Together Data, - download pdf or read online

How do we design for data when traditional design techniques cannot extend to new database technologies? In this era of big data and the Internet of Things, it is essential that we have the tools we need to understand the data coming to us faster than ever before, and to design databases and data processing systems that can adapt easily to ever-changing data schemas and ever-changing business requirements.
