Big Data Beyond the Hype: A Guide to Conversations for Today’s Data Center

Author: Paul Zikopoulos

Specifications

Paperback | English

McGraw-Hill Education | 2014

ISBN13: 9780071844659

€ 26,06

Delivery time: approximately 10 working days

Summary

Gain insight into how to govern and consume IBM's unique in-motion and at-rest Big Data analytic capabilities

A. R. Ammons once said, "A word too much repeated falls out of being", and although the term Big Data sometimes seems to be "too much repeated", it's not about to fall "out of being". That said, it is subject to a lot of hype. The term Big Data is a bit of a misnomer. Truth be told, we're not even big fans of the term--despite the fact that it is so prominently displayed on the cover of this book--because it implies that other data is somehow small (it might be) or that this particular type of data is large in size (it can be, but doesn’t have to be).

This is Big Data in a nutshell: It is the ability to retain, process, and understand data like never before. It can mean more data than what you are using today; but it can also mean different kinds of data, a venture into the unstructured world where most of today's data resides. The Big Data opportunity is a shift, rift, lift, or cliff for your business: this book is going to help you experience the shift and lift, while those who don't work to get beyond the hype end up in a rift or cliff.

In this book you will learn how cognitive computing systems, like IBM Watson, fit into the Big Data world. You'll learn how Big Data needs a "ground-to-cloud" architecture, what a Data Refinery looks like, and the importance of a next-generation data platform. Gain an understanding of the concepts of data-in-motion, data-at-rest (technologies like Hadoop play here, as well as others), the role that NoSQL and polyglot play in a leading-edge analytics architecture, and more. Get details about the Big Data platform manifesto and why it is a must for any Big Data project. Capturing, storing, refining, transforming, governing, securing, and analyzing data, traditionally or as a service, are important topics also covered in this book.

Specifications

ISBN13: 9780071844659

Language: English

Binding: paperback

Publisher: McGraw-Hill Education

Main category: Databases, Computers and Informatics

Table of Contents

Introduction

Part I Opening Conversations About Big Data

1 Getting Hype out of the Way: Big Data and Beyond
  There’s Gold in “Them There” Hills!
  Why Is Big Data Important?
  Brought to You by the Letter V: How We Define Big Data
  Cognitive Computing
  Why Does the Big Data World Need Cognitive Computing?
  A Big Data and Analytics Platform Manifesto
  1. Discover, Explore, and Navigate Big Data Sources
  2. Land, Manage, and Store Huge Volumes of Any Data
  3. Structured and Controlled Data
  4. Manage and Analyze Unstructured Data
  5. Analyze Data in Real Time
  6. A Rich Library of Analytical Functions and Tools
  7. Integrate and Govern All Data Sources
  Cognitive Computing Systems
  Of Cloud and Manifestos…
  Wrapping It Up

2 To SQL or Not to SQL: That’s Not the Question, It’s the Era of Polyglot Persistence
  Core Value Systems: What Makes a NoSQL Practitioner Tick
  What Is NoSQL?
  Is Hadoop a NoSQL Database?
  Different Strokes for Different Folks: The NoSQL Classification System
  Give Me a Key, I’ll Give You a Value: The Key/Value Store
  The Grand-Daddy of Them All: The Document Store
  Column Family, Columnar Store, or BigTable Derivatives: What Do We Call You?
  Don’t Underestimate the Underdog: The Graph Store
  From ACID to CAP
  CAP Theorem and a Meatloaf Song: “Two Out of Three Ain’t Bad”
  Let Me Get This Straight: There Is SQL, NoSQL, and Now NewSQL?
  Wrapping It Up

3 Composing Cloud Applications: Why We Love the Bluemix and the IBM Cloud
  At Your Service: Explaining Cloud Provisioning Models
  Setting a Foundation for the Cloud: Infrastructure as a Service
  IaaS for Tomorrow…Available Today: IBM SoftLayer Powers the IBM Cloud
  Noisy Neighbors Can Be Bad Neighbors: The Multitenant Cloud
  Building the Developer’s Sandbox with Platform as a Service
  If You Have Only a Couple of Minutes: PaaS and IBM Bluemix in a Nutshell
  Digging Deeper into PaaS
  Being Social on the Cloud: How Bluemix Integrates Platforms and Architectures
  Understanding the Hybrid Cloud: Playing Frankenstein Without the Horror
  Tried and Tested: How Deployable Patterns Simplify PaaS
  Composing the Fabric of Cloud Services: IBM Bluemix
  Parting Words on Platform as a Service
  Consuming Functionality Without the Stress: Software as a Service
  The Cloud Bazaar: SaaS and the API Economy
  Demolishing the Barrier to Entry for Cloud-Ready Analytics: IBM’s dashDB
  Build More, Grow More, Know More: dashDB’s Cloud SaaS
  Refinery as a Service
  Wrapping It Up

4 The Data Zones Model: A New Approach to Managing Data
  Challenges with the Traditional Approach
  Agility
  Cost
  Depth of Insight
  Next-Generation Information Management Architectures
  Prepare for Touchdown: The Landing Zone
  Into the Unknown: The Exploration Zone
  Into the Deep: The Deep Analytic Zone
  Curtain Call: The New Staging Zone
  You Have Questions? We Have Answers! The Queryable Archive Zone
  In Big Data We Trust: The Trusted Data Zone
  A Zone for Business Reporting
  From Forecast to Nowcast: The Real-Time Processing and Analytics Zone
  Ladies and Gentlemen, Presenting… “The Data Zones Model”

Part II Watson Foundations

5 Starting Out with a Solid Base: A Tour of Watson Foundations
  Overview of Watson Foundations
  A Continuum of Analytics Capabilities: Foundations for Watson

6 Landing Your Data in Style with Blue Suit Hadoop: InfoSphere BigInsights
  Where Do Elephants Come From: What Is Hadoop?
  A Brief History of Hadoop
  Components of Hadoop and Related Projects
  Open Source…and Proud of It
  Making Analytics on Hadoop Easy
  The Real Deal for SQL on Hadoop: Big SQL
  Machine Learning for the Masses: Big R and SystemML
  The Advanced Text Analytics Toolkit
  Data Discovery and Visualization: BigSheets
  Spatiotemporal Analytics
  Finding Needles in Haystacks of Needles: Indexing and Search in BigInsights
  Cradle-to-Grave Application Development Support
  The BigInsights Integrated Development Environment
  The BigInsights Application Lifecycle
  An App Store for Hadoop: Easy Deployment and Execution of Custom Applications
  Keeping the Sandbox Tidy: Sharing and Managing Hadoop
  The BigInsights Web Console
  Monitoring the Aspects of Your Cluster
  Securing the BigInsights for Hadoop Cluster
  Adaptive MapReduce
  A Flexible File System for Hadoop: GPFS-FPO
  Playing Nice: Integration with Other Data Center Systems
  IBM InfoSphere System z Connector for Hadoop
  IBM PureData System for Analytics
  InfoSphere Streams for Data in Motion
  InfoSphere Information Server for Data Integration
  Matching at Scale with Big Match
  Securing Hadoop with Guardium and Optim
  Broad Integration Support
  Deployment Flexibility
  BigInsights Editions: Free, Low-Cost, and Premium Offerings
  A Low-Cost Way to Get Started: Running BigInsights on the Cloud
  Higher-Class Hardware: Power and System z Support
  Get Started Quickly!
  Wrapping It Up

7 “In the Moment” Analytics: InfoSphere Streams
  Introducing Streaming Data Analysis
  How InfoSphere Streams Works
  A Simple Streams Application
  Recommended Uses for Streams
  How Is Streams Different from CEP Systems?
  Stream Processing Modes: Preserve Currency or Preserve Each Record
  High Availability
  Dynamically Distributed Processing
  InfoSphere Streams Platform Components
  The Streams Console
  An Integrated Development Environment for Streams: Streams Studio
  The Streams Processing Language
  Source and Sink Adapters
  Analytical Operators
  Streams Toolkits
  Solution Accelerators
  Use Cases
  Get Started Quickly!
  Wrapping It Up

8 700 Million Times Faster Than the Blink of an Eye: BLU Acceleration
  What Is BLU Acceleration?
  What Does a Next Generation Database Service for Analytics Look Like?
  Seamlessly Integrated
  Hardware Optimized
  Convince Me to Take BLU Acceleration for a Test Drive
  Pedal to the Floor: How Fast Is BLU Acceleration?
  From Minimized to Minuscule: BLU Acceleration Compression Ratios
  Where Will I Use BLU Acceleration?
  How BLU Acceleration Came to Be: Seven Big Ideas
  Big Idea #1: KISS It!
  Big Idea #2: Actionable Compression and Computer-Friendly Encoding
  Big Idea #3: Multiplying the Power of the CPU
  Big Idea #4: Parallel Vector Processing
  Big Idea #5: Get Organized…by Column
  Big Idea #6: Dynamic In-Memory Processing
  Big Idea #7: Data Skipping
  How Seven Big Ideas Optimize the Hardware Stack
  The Sum of All Big Ideas: BLU Acceleration in Action
  DB2 with BLU Acceleration Shadow Tables: When OLTP + OLAP = 1 DB
  What Lurks in These Shadows Isn’t Anything to Be Scared of: Operational Reporting
  Wrapping It Up

9 An Expert Integrated System for Deep Analytics
  Before We Begin: Bursting into the Cloud
  Starting on the Whiteboard: Netezza’s Design Principles
  Appliance Simplicity: Minimize the Human Effort
  Process Analytics Closer to the Data Store
  Balanced + MPP = Linear Scalability
  Modular Design: Support Flexible Configurations and Extreme Scalability
  What’s in the Box? The Netezza Appliance Architecture Overview
  A Look Inside a Netezza Box
  How a Query Runs in Netezza
  How Netezza Is a Platform for Analytics
  Wrapping It Up

10 Build More, Grow More, Sleep More: IBM Cloudant
  Cloudant: “White Glove” Database as a Service
  Where Did Cloudant Roll in From?
  Cloudant or Hadoop?
  Being Flexible: Schemas with JSON
  Cloudant Clustering: Scaling for the Cloud
  Avoiding Mongo-Size Outages: Sleep Soundly with Cloudant Replication
  Cloudant Sync Brings Data to a Mobile World
  Make Data, Not War: Cloudant Versioning and Conflict Resolution
  Unlocking GIS Data with Cloudant Geospatial
  Cloudant Local
  Here on In: For Techies…
  For Techies: Leveraging the Cloudant Primary Index
  Exploring Data with Cloudant’s Secondary Index “Views”
  Performing Ad Hoc Queries with the Cloudant Search Index
  Parameters That Govern a Logical Cloudant Database
  Remember! Cloudant Is DBaaS
  Wrapping It Up

Part III Calming the Waters: Big Data Governance

11 Guiding Principles for Data Governance
  The IBM Data Governance Council Maturity Model
  Wrapping It Up

12 Security Is NOT an Afterthought
  Security Big Data: How It’s Different
  Securing Big Data in Hadoop
  Culture, Definition, Charter, Foundation, and Data Governance
  What Is Sensitive Data?
  The Masquerade Gala: Masking Sensitive Data
  Don’t Break the DAM: Monitoring and Controlling Access to Data
  Protecting Data at Rest
  Wrapping It Up

13 Big Data Lifecycle Management
  A Foundation for Data Governance: The Information Governance Catalog
  Data on Demand: Data Click
  Data Integration
  Data Quality
  Veracity as a Service: IBM DataWorks
  Managing Your Test Data: Optim Test Data Management
  A Retirement Home for Your Data: Optim Data Archive
  Wrapping It Up

14 Matching at Scale: Big Match
  What Is Matching Anyway?
  A Teaser: Where Are You Going to Use Big Match?
  Matching on Hadoop
  Matching Approaches
  Big Match Architecture
  Big Match Algorithm Configuration Files
  Big Match Applications
  HBase Tables
  Probabilistic Matching Engine
  How It Works
  Extract
  Search
  Applications for Big Match
  Enabling the Landing Zone
  Enhanced 360-Degree View of Your Customers
  More Reliable Data Exploration
  Large-Scale Searches for Matching Records
  Wrapping It Up
