Introduction<br/>Part I Opening Conversations About Big Data<br/>1 Getting Hype out of the Way: Big Data and Beyond<br/>There’s Gold in “Them There” Hills!<br/>Why Is Big Data Important?<br/>Brought to You by the Letter V: How We Define Big Data<br/>Cognitive Computing<br/>Why Does the Big Data World Need Cognitive Computing?<br/>A Big Data and Analytics Platform Manifesto<br/>1. Discover, Explore, and Navigate Big Data Sources<br/>2. Land, Manage, and Store Huge Volumes of Any Data<br/>3. Structured and Controlled Data<br/>4. Manage and Analyze Unstructured Data<br/>5. Analyze Data in Real Time<br/>6. A Rich Library of Analytical Functions and Tools<br/>7. Integrate and Govern All Data Sources<br/>Cognitive Computing Systems<br/>Of Cloud and Manifestos…<br/>Wrapping It Up<br/>2 To SQL or Not to SQL: That’s Not the Question, It’s the Era of Polyglot Persistence<br/>Core Value Systems: What Makes a NoSQL Practitioner Tick<br/>What Is NoSQL?<br/>Is Hadoop a NoSQL Database?<br/>Different Strokes for Different Folks: The NoSQL Classification System<br/>Give Me a Key, I’ll Give You a Value: The Key/Value Store<br/>The Grand-Daddy of Them All: The Document Store<br/>Column Family, Columnar Store, or BigTable Derivatives: What Do We Call You?<br/>Don’t Underestimate the Underdog: The Graph Store<br/>From ACID to CAP<br/>CAP Theorem and a Meatloaf Song: “Two Out of Three Ain’t Bad”<br/>Let Me Get This Straight: There Is SQL, NoSQL, and Now NewSQL?<br/>Wrapping It Up<br/>3 Composing Cloud Applications: Why We Love the Bluemix and the IBM Cloud<br/>At Your Service: Explaining Cloud Provisioning Models<br/>Setting a Foundation for the Cloud: Infrastructure as a Service<br/>IaaS for Tomorrow…Available Today: IBM SoftLayer Powers the IBM Cloud<br/>Noisy Neighbors Can Be Bad Neighbors: The Multitenant Cloud<br/>Building the Developer’s Sandbox with Platform as a Service<br/>If You Have Only a Couple of Minutes: PaaS and IBM Bluemix in a Nutshell<br/>Digging Deeper into PaaS<br/>Being Social on the Cloud: How Bluemix Integrates Platforms and Architectures<br/>Understanding the Hybrid Cloud: Playing Frankenstein Without the Horror<br/>Tried and Tested: How Deployable Patterns Simplify PaaS<br/>Composing the Fabric of Cloud Services: IBM Bluemix<br/>Parting Words on Platform as a Service<br/>Consuming Functionality Without the Stress: Software as a Service<br/>The Cloud Bazaar: SaaS and the API Economy<br/>Demolishing the Barrier to Entry for Cloud-Ready Analytics: IBM’s dashDB<br/>Build More, Grow More, Know More: dashDB’s Cloud SaaS<br/>Refinery as a Service<br/>Wrapping It Up<br/>4 The Data Zones Model: A New Approach to Managing Data<br/>Challenges with the Traditional Approach<br/>Agility<br/>Cost<br/>Depth of Insight<br/>Next-Generation Information Management Architectures<br/>Prepare for Touchdown: The Landing Zone<br/>Into the Unknown: The Exploration Zone<br/>Into the Deep: The Deep Analytic Zone<br/>Curtain Call: The New Staging Zone<br/>You Have Questions? We Have Answers! The Queryable Archive Zone<br/>In Big Data We Trust: The Trusted Data Zone<br/>A Zone for Business Reporting<br/>From Forecast to Nowcast: The Real-Time Processing and Analytics Zone<br/>Ladies and Gentlemen, Presenting… “The Data Zones Model”<br/>Part II Watson Foundations<br/>5 Starting Out with a Solid Base: A Tour of Watson Foundations<br/>Overview of Watson Foundations<br/>A Continuum of Analytics Capabilities: Foundations for Watson<br/>6 Landing Your Data in Style with Blue Suit Hadoop: InfoSphere BigInsights<br/>Where Do Elephants Come From: What Is Hadoop?<br/>A Brief History of Hadoop<br/>Components of Hadoop and Related Projects<br/>Open Source…and Proud of It<br/>Making Analytics on Hadoop Easy<br/>The Real Deal for SQL on Hadoop: Big SQL<br/>Machine Learning for the Masses: Big R and SystemML<br/>The Advanced Text Analytics Toolkit<br/>Data Discovery and Visualization: BigSheets<br/>Spatiotemporal Analytics<br/>Finding Needles in Haystacks of Needles: Indexing and Search in BigInsights<br/>Cradle-to-Grave Application Development Support<br/>The BigInsights Integrated Development Environment<br/>The BigInsights Application Lifecycle<br/>An App Store for Hadoop: Easy Deployment and Execution of Custom Applications<br/>Keeping the Sandbox Tidy: Sharing and Managing Hadoop<br/>The BigInsights Web Console<br/>Monitoring the Aspects of Your Cluster<br/>Securing the BigInsights for Hadoop Cluster<br/>Adaptive MapReduce<br/>A Flexible File System for Hadoop: GPFS-FPO<br/>Playing Nice: Integration with Other Data Center Systems<br/>IBM InfoSphere System z Connector for Hadoop<br/>IBM PureData System for Analytics<br/>InfoSphere Streams for Data in Motion<br/>InfoSphere Information Server for Data Integration<br/>Matching at Scale with Big Match<br/>Securing Hadoop with Guardium and Optim<br/>Broad Integration Support<br/>Deployment Flexibility<br/>BigInsights Editions: Free, Low-Cost, and Premium Offerings<br/>A Low-Cost Way to Get Started: Running BigInsights on the Cloud<br/>Higher-Class Hardware: Power and System z Support<br/>Get Started Quickly!<br/>Wrapping It Up<br/>7 “In the Moment” Analytics: InfoSphere Streams<br/>Introducing Streaming Data Analysis<br/>How InfoSphere Streams Works<br/>A Simple Streams Application<br/>Recommended Uses for Streams<br/>How Is Streams Different from CEP Systems?<br/>Stream Processing Modes: Preserve Currency or Preserve Each Record<br/>High Availability<br/>Dynamically Distributed Processing<br/>InfoSphere Streams Platform Components<br/>The Streams Console<br/>An Integrated Development Environment for Streams: Streams Studio<br/>The Streams Processing Language<br/>Source and Sink Adapters<br/>Analytical Operators<br/>Streams Toolkits<br/>Solution Accelerators<br/>Use Cases<br/>Get Started Quickly!<br/>Wrapping It Up<br/>8 700 Million Times Faster Than the Blink of an Eye: BLU Acceleration<br/>What Is BLU Acceleration?<br/>What Does a Next Generation Database Service for Analytics Look Like?<br/>Seamlessly Integrated<br/>Hardware Optimized<br/>Convince Me to Take BLU Acceleration for a Test Drive<br/>Pedal to the Floor: How Fast Is BLU Acceleration?<br/>From Minimized to Minuscule: BLU Acceleration Compression Ratios<br/>Where Will I Use BLU Acceleration?<br/>How BLU Acceleration Came to Be: Seven Big Ideas<br/>Big Idea #1: KISS It!<br/>Big Idea #2: Actionable Compression and Computer-Friendly Encoding<br/>Big Idea #3: Multiplying the Power of the CPU<br/>Big Idea #4: Parallel Vector Processing<br/>Big Idea #5: Get Organized…by Column<br/>Big Idea #6: Dynamic In-Memory Processing<br/>Big Idea #7: Data Skipping<br/>How Seven Big Ideas Optimize the Hardware Stack<br/>The Sum of All Big Ideas: BLU Acceleration in Action<br/>DB2 with BLU Acceleration Shadow Tables: When OLTP + OLAP = 1 DB<br/>What Lurks in These Shadows Isn’t Anything to Be Scared of: Operational Reporting<br/>Wrapping It Up<br/>9 An Expert Integrated System for Deep Analytics<br/>Before We Begin: Bursting into the Cloud<br/>Starting on the Whiteboard: Netezza’s Design Principles<br/>Appliance Simplicity: Minimize the Human Effort<br/>Process Analytics Closer to the Data Store<br/>Balanced + MPP = Linear Scalability<br/>Modular Design: Support Flexible Configurations and Extreme Scalability<br/>What’s in the Box? The Netezza Appliance Architecture Overview<br/>A Look Inside a Netezza Box<br/>How a Query Runs in Netezza<br/>How Netezza Is a Platform for Analytics<br/>Wrapping It Up<br/>10 Build More, Grow More, Sleep More: IBM Cloudant<br/>Cloudant: “White Glove” Database as a Service<br/>Where Did Cloudant Roll in From?<br/>Cloudant or Hadoop?<br/>Being Flexible: Schemas with JSON<br/>Cloudant Clustering: Scaling for the Cloud<br/>Avoiding Mongo-Size Outages: Sleep Soundly with Cloudant Replication<br/>Cloudant Sync Brings Data to a Mobile World<br/>Make Data, Not War: Cloudant Versioning and Conflict Resolution<br/>Unlocking GIS Data with Cloudant Geospatial<br/>Cloudant Local<br/>Here on In: For Techies…<br/>For Techies: Leveraging the Cloudant Primary Index<br/>Exploring Data with Cloudant’s Secondary Index “Views”<br/>Performing Ad Hoc Queries with the Cloudant Search Index<br/>Parameters That Govern a Logical Cloudant Database<br/>Remember! Cloudant Is DBaaS<br/>Wrapping It Up<br/>Part III Calming the Waters: Big Data Governance<br/>11 Guiding Principles for Data Governance<br/>The IBM Data Governance Council Maturity Model<br/>Wrapping It Up<br/>12 Security Is NOT an Afterthought<br/>Security Big Data: How It’s Different<br/>Securing Big Data in Hadoop<br/>Culture, Definition, Charter, Foundation, and Data Governance<br/>What Is Sensitive Data?<br/>The Masquerade Gala: Masking Sensitive Data<br/>Don’t Break the DAM: Monitoring and Controlling Access to Data<br/>Protecting Data at Rest<br/>Wrapping It Up<br/>13 Big Data Lifecycle Management<br/>A Foundation for Data Governance: The Information Governance Catalog<br/>Data on Demand: Data Click<br/>Data Integration<br/>Data Quality<br/>Veracity as a Service: IBM DataWorks<br/>Managing Your Test Data: Optim Test Data Management<br/>A Retirement Home for Your Data: Optim Data Archive<br/>Wrapping It Up<br/>14 Matching at Scale: Big Match<br/>What Is Matching Anyway?<br/>A Teaser: Where Are You Going to Use Big Match?<br/>Matching on Hadoop<br/>Matching Approaches<br/>Big Match Architecture<br/>Big Match Algorithm Configuration Files<br/>Big Match Applications<br/>HBase Tables<br/>Probabilistic Matching Engine<br/>How It Works<br/>Extract<br/>Search<br/>Applications for Big Match<br/>Enabling the Landing Zone<br/>Enhanced 360-Degree View of Your Customers<br/>More Reliable Data Exploration<br/>Large-Scale Searches for Matching Records<br/>Wrapping It Up