
Data Wrangling with Python

Tips and Tools to Make Your Life Easier

Specifications
Paperback, 488 pages | English
O'Reilly | 1st edition, 2016
ISBN-13: 9781491948811
Classification
Main category: Computers and computer science
€ 55,74
Delivery time: approximately 15 working days

Summary

How do you take your data analysis skills beyond Excel to the next level? By learning just enough Python to get stuff done. This hands-on guide shows non-programmers like you how to process information that’s initially too messy or difficult to access. You don't need to know a thing about the Python programming language to get started.

Through various step-by-step exercises, you’ll learn how to acquire, clean, analyze, and present data efficiently. You’ll also discover how to automate your data process, schedule file-editing and clean-up tasks, process larger datasets, and create compelling stories with data you obtain.

- Quickly learn basic Python syntax, data types, and language concepts
- Work with both machine-readable and human-consumable data
- Scrape websites and APIs to find a bounty of useful information
- Clean and format data to eliminate duplicates and errors in your datasets
- Learn when to standardize data and when to test and script data cleanup
- Explore and analyze your datasets with new Python libraries and techniques
- Use Python solutions to automate your entire data-wrangling process

Specifications

ISBN-13: 9781491948811
Language: English
Binding: Paperback
Number of pages: 488
Publisher: O'Reilly
Edition: 1
Publication date: 28 February 2016

Table of Contents

Preface

1. Introduction to Python
-Why Python
-Getting Started with Python
-Summary

2. Python Basics
-Basic Data Types
-Data Containers
-What Can the Various Data Types Do?
-Helpful Tools: type, dir, and help
-Putting It All Together
-What Does It All Mean?
-Summary

3. Data Meant to Be Read by Machines
-CSV Data
-JSON Data
-XML Data
-Summary

4. Working with Excel Files
-Installing Python Packages
-Parsing Excel Files
-Getting Started with Parsing
-Summary

5. PDFs and Problem Solving in Python
-Avoid Using PDFs!
-Programmatic Approaches to PDF Parsing
-Parsing PDFs Using pdfminer
-Learning How to Solve Problems
-Uncommon File Types
-Summary

6. Acquiring and Storing Data
-Not All Data Is Created Equal
-Fact Checking
-Readability, Cleanliness, and Longevity
-Where to Find Data
-Case Studies: Example Data Investigation
-Storing Your Data: When, Why, and How?
-Databases: A Brief Introduction
-When to Use a Simple File
-Alternative Data Storage
-Summary

7. Data Cleanup: Investigation, Matching, and Formatting
-Why Clean Data?
-Data Cleanup Basics
-Summary

8. Data Cleanup: Standardizing and Scripting
-Normalizing and Standardizing Your Data
-Saving Your Data
-Determining What Data Cleanup Is Right for Your Project
-Scripting Your Cleanup
-Testing with New Data
-Summary

9. Data Exploration and Analysis
-Exploring Your Data
-Analyzing Your Data
-Summary

10. Presenting Your Data
-Avoiding Storytelling Pitfalls
-Visualizing Your Data
-Presentation Tools
-Publishing Your Data
-Summary

11. Web Scraping: Acquiring and Storing Data from the Web
-What to Scrape and How
-Analyzing a Web Page
-Getting Pages: How to Request on the Internet
-Reading a Web Page with Beautiful Soup
-Reading a Web Page with LXML
-Summary

12. Advanced Web Scraping: Screen Scrapers and Spiders
-Browser-Based Parsing
-Spidering the Web
-Networks: How the Internet Works and Why It’s Breaking Your Script
-The Changing Web (or Why Your Script Broke)
-A (Few) Word(s) of Caution
-Summary

13. APIs
-API Features
-A Simple Data Pull from Twitter’s REST API
-Advanced Data Collection from Twitter’s REST API
-Advanced Data Collection from Twitter’s Streaming API
-Summary

14. Automation and Scaling
-Why Automate?
-Steps to Automate
-What Could Go Wrong?
-Where to Automate
-Special Tools for Automation
-Simple Automation
-Large-Scale Automation
-Monitoring Your Automation
-No System Is Foolproof
-Summary

15. Conclusion
-Duties of a Data Wrangler
-Beyond Data Wrangling
-Where Do You Go from Here?

Appendix A: Comparison of Languages Mentioned
Appendix B: Learning the Command Line
Appendix C: Advanced Python Setup
Appendix D: Python Gotchas
Appendix E: IPython Hints
Appendix F: Using Amazon Web Services

Index
