SQL Server Data Mining—How We Will Be Using Data Mining

Microsoft .NET Framework, ASP.NET, Visual C# (CSharp, C Sharp, C-Sharp) Developer Training, Visual Studio


Jump to: navigation, search
CSharp-Online.NET:Articles
Database Articles

SQL Server Data Mining

© 2007 Pearson Education, Inc.

How We Will Be Using Data Mining

As discussed in the section "High-Level Architecture," we chose the Microsoft Clustering and Microsoft Association algorithms for our solution. Knowing which algorithm is appropriate for your business problem will take some experimentation and research in the documentation. In fact, in a lot of cases, there is no obvious candidate at the outset, and you will need to try different algorithms against the same underlying data to see which is most appropriate. The data mining designer also includes a Mining Accuracy Chart tab that you can use to compare algorithms.

The first decision we need to make is where the data will come from. Analysis Services can use either the relational tables in your data source view or the actual cube itself as the source of data for the models. Because data mining is even more sensitive to flawed data than most applications, it is important to ensure that you perform as much data cleansing as possible against the source data prior to processing your models; so, at the very least, you should probably be using the tables in your data warehouse rather than directly using source systems.

However, using the cube as the source for data mining has a number of benefits, so we will be using that approach for this solution. The cube data has already been supplemented with additional attributes and calculated measures that the data mining algorithms can take advantage of. Also, the load process for data mining models can take some time, so using the cube as the source means that the aggregates will be used if applicable, potentially speeding up the processing time.


Previous_Page_.gif Next_Page_.gif


Personal tools