Hardcover The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition Book

ISBN: 0387848576

ISBN13: 9780387848570

The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition

Format: Hardcover

Condition: Very Good

$54.89
Save $35.10!
List Price $89.99
Almost Gone, Only 4 Left!

Book Overview

This book describes the important ideas in a variety of fields such as medicine, biology, finance, and marketing in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of colour graphics. It is a valuable resource for statisticians and anyone interested in data mining in science or industry. The book's coverage is broad, from supervised...

Customer Reviews

5 ratings

my big brown book of statistic learning tools

This is a quite interesting, and extremely useful book, but it is wearing to read in large chunks. The problem, if you want to call it that, is that it is essentially a 700 page catalogue of clever hacks in statistical learning. From a technical point of view it is well enough structured, but there is not the slightest trace of an overarching philosophy. And if you don't actually have a philosophical perspective in place before you start, the read you face might well be an even harder grind. Be warned. Some of the reviews here complain that there is too much math. I don't think that is an issue. If you have decent intuitions in geometry, linear algebra, probability and information theory, then you should be able to cruise through and/or browse in a fairly relaxed way. If you don't have those intuitions, then you are attempting to read the wrong book. There were a couple of things that I expected (things I happen to know a bit about), but that were missing. On the unsupervised learning side, the discussion of Gaussian mixture clustering was, I thought, a bit short and superficial, and did not bring out the combination of theoretical and practical power that the method offers. On the supervised learning side, I was surprised that a book that dedicates so much time to linear regression finds no room for a discussion of Gaussian process regression as far as I could see (the nearest point of approach is the use of Gaussian radial basis functions [oops: having written that, I immediately came across a brief discussion (S5.8.1) of, essentially, GP regression - though with no reference to standard literature]).

data mining from the viewpoint of statisticians

Data mining is a field developed by computer scientists, but many of its crucial elements are embedded in important and subtle statistical concepts. Statisticians can play an important role in the development of this field but, as was the case with artificial intelligence, expert systems and neural networks, the statistical research community has been slow to respond. Hastie, Tibshirani and Friedman are changing this. Friedman has been a major player in pattern recognition of high dimensional data, in tree classification, regularized discriminant analysis and multivariate adaptive regression splines. He has also done some exciting new research on boosting methods. Hastie and Tibshirani invented additive models, which are very general types of regression models. Tibshirani invented the lasso method and is a leader among the researchers on the bootstrap. Hastie invented principal curves and surfaces. These tools and the expertise of these authors make them naturals to contribute to advances in data mining. They come with great expertise and see data mining from the statistical perspective. They see it as part of a more general process of statistical learning from data. The book is well written and illustrated with many pretty color graphs and figures. Color adds a dimension in pattern recognition and the authors exploit it in this book. It is really the first of its kind that treats data mining from a statistical perspective and is so comprehensive and up-to-date. The important statistical tools that are covered in this book include, under the category of supervised learning: regression, discriminant analysis, kernel methods, model assessment and selection, bootstrapping, maximum likelihood and Bayesian inference, additive models, classification and regression trees, multivariate adaptive regression splines, boosting, regularization methods, nearest neighbor classification, k-means clustering algorithms and neural networks. These methods are illustrated using real problems.
Similarly, under the category of unsupervised learning, clustering and association are covered. They cover the latest developments in principal components and principal curves, multidimensional scaling, factor analysis and projection pursuit. This book is innovative and fresh. It is an important contribution that will become a classic. The level is between intermediate and advanced. Good for an advanced special topics course for graduate students in statistics. A comparable text is the text by Mannila, Hand and Smyth. This book made effective use of color and maintained a competitive price. This had a major impact on publishers like Wiley that could not sell a book of this size at this initial price. Wiley is still looking for a book comparable to this one that they can use to compete with Springer-Verlag. I know this information because I heard from the Wiley acquisitions editor that I worked with on my two books.

Most Useful Machine Learning Book

This book describes most of the important topics in machine learning. Most machine learning books just present a criterion and an optimization algorithm. For instance, LDA is often presented as: here is the Fisher criterion, it seems like a good thing to maximize. "The Elements of Statistical Learning" also presents that this is the right criterion if the distributions of the data for each class are Gaussian with the same covariance. This book puts all the algorithms in the same statistical language, which makes them easy to compare and choose between. I also appreciate the emphasis this book puts on algorithms that are more recently popular/effective. I very much appreciate the discussions of logistic regression vs. LDA, ridge and lasso regression, boosting/additive logistic regression and additive trees, decision and regression trees, ... The only qualm I have with this book is that it is rather biased toward the authors' own research. It is difficult from reading this book alone to differentiate between classical techniques and the authors' recent proposed algorithms.
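
The point about LDA in the review above can be illustrated with a small sketch. It is not taken from the book or from the review. It assumes two classes with Gaussian densities that share one covariance matrix, and all names and numbers (`COV_INV`, `MEANS`, `PRIORS`, `lda_score`) are hypothetical. Under that assumption the term that is quadratic in x is the same for every class, so comparing Gaussian log-posteriors reduces to comparing linear scores.

```python
import math


def mat_vec(m, v):
    """Multiply a square matrix (given as a list of rows) by a vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def lda_score(x, mean, cov_inv, prior):
    """Linear score: x' S^-1 mu - 0.5 * mu' S^-1 mu + log(prior)."""
    s_mu = mat_vec(cov_inv, mean)
    return dot(x, s_mu) - 0.5 * dot(mean, s_mu) + math.log(prior)


def gaussian_log_posterior(x, mean, cov_inv, prior):
    """Log prior plus Gaussian log density, dropping the additive
    terms that are identical for every class (shared covariance)."""
    diff = [xi - mi for xi, mi in zip(x, mean)]
    return math.log(prior) - 0.5 * dot(diff, mat_vec(cov_inv, diff))


# Hypothetical parameters, chosen only for illustration.
COV_INV = [[2.0, 0.0], [0.0, 0.5]]   # inverse of the shared covariance
MEANS = [[0.0, 0.0], [2.0, 1.0]]     # one mean vector per class
PRIORS = [0.5, 0.5]                  # class prior probabilities


def classify(x):
    """Return the index of the class with the largest linear score."""
    scores = [lda_score(x, m, COV_INV, p) for m, p in zip(MEANS, PRIORS)]
    return scores.index(max(scores))
```

For any point x, the difference between the two linear scores equals the difference between the two Gaussian log-posteriors, so both rules pick the same class whenever the shared-covariance assumption holds.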

Counter to review from Sep 8

The review from September 8 expresses an opinion which is the exact opposite of mine, and is worded so strongly that I have to object. I gave a course using the book to bioinformaticians, most of them with a computer science background, and found the book exceptionally well prepared and suitable for a graduate course. The book serves the dual purpose of an introduction and a reference. An especially nice feature is how the authors explain the relationships and differences between different methods. By doing so, they provide context which I have not seen in any other book on this subject. The book is a very nice combination of basic theory and performance evaluation on data from a wide variety of domains and it is quite up-to-date. It has a well developed website going with it and the graphical material can be obtained electronically from the publisher. The book is an outstanding contribution to the field.

Useful book on data mining

I use data mining tools in my financial engineering and financial modeling work and I have found this book to be very useful. This book provides two crucial types of information. First, it provides enough theory to allow a potential user to understand the essential insights that motivate specific techniques and to evaluate the situations in which those techniques are appropriate. Second, the book gives the exact algorithms to implement the various techniques. While no book I have seen covers every data mining methodology available, this one has the strongest coverage I have seen in additive models, non-linear regression, and CART/MART (regression/classification trees). It also has very strong coverage in many other areas. I highly recommend it.