Time series data is of growing importance, especially with the rapid expansion of the Internet of Things. This concise guide shows you effective ways to collect, persist, and access large-scale time series data for analysis. You’ll explore the theory behind time series databases and learn practical methods for implementing them. Authors Ted Dunning and Ellen Friedman provide a detailed examination of open source tools such as OpenTSDB and new modifications that greatly speed up data ingestion.
You’ll learn:
- A variety of time series use cases
- The advantages of NoSQL databases for large-scale time series data
- NoSQL table design for high-performance time series databases
- The benefits and limitations of OpenTSDB
- How to access data in OpenTSDB using R, Go, and Ruby
- How time series databases contribute to practical machine learning projects
- How to handle the added complexity of geo-temporal data
For advice on analyzing time series data, check out Practical Machine Learning: A New Look at Anomaly Detection, also from Ted Dunning and Ellen Friedman.
Ted Dunning is Chief Applications Architect at MapR Technologies and is active in the open source community. He is a committer and PMC member of the Apache Mahout, Apache ZooKeeper, and Apache Drill projects and serves as a mentor for these Apache projects: Storm, Flink, Optiq, Datafu, and Drill. He has contributed to Mahout's clustering, classification, and matrix decomposition algorithms and to the new Mahout Math library, and he recently designed the t-digest algorithm used in several open source projects. He also architected the modifications for OpenTSDB described in this book.
Ted was the chief architect behind the MusicMatch (now Yahoo Music) and Veoh recommendation systems, built fraud-detection systems for ID Analytics (LifeLock), and has been issued 24 patents to date. Ted has a PhD in computing science from University olsh