SQL Server Data Mining—High-Level Architecture
© 2007 Pearson Education, Inc.
High-Level Architecture
We will add the Internet visit information to the existing data warehouse and Analysis Services cubes. Because the e-commerce application already extracts data from the Web logs and inserts it into a relational database, we will use this as the source for the ETL process. The data in this source database already has discrete user sessions identified.
Many e-commerce applications (including those based on the Microsoft Commerce Server platform) provide this kind of extraction and log-processing functionality, but for custom Web sites, the only available tracking information may be the raw Internet Information Services (IIS) logs. A full treatment of the steps to extract this kind of information from Web log files is beyond the scope of this chapter; see the sidebar "Extracting Information from IIS Logs" for a high-level explanation.
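As a rough illustration of what such log processing involves, the following Python sketch reads log lines in the W3C extended format that IIS can write and groups requests into sessions. It is not the procedure from the sidebar. The sample log, the 30-minute inactivity cutoff, and the use of the client IP address as the visitor key are all assumptions made for the example; a production process would more likely rely on a cookie or session identifier.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumption: a visit ends after 30 minutes without a request.
SESSION_TIMEOUT = timedelta(minutes=30)

def parse_w3c_log(lines):
    """Yield one dict per request, named by the log's #Fields directive."""
    fields = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            if line.startswith("#Fields:"):
                fields = line[len("#Fields:"):].split()
            continue  # other directives (#Software, #Date, ...) are ignored
        values = line.split()
        if len(values) != len(fields):
            continue  # skip rows that do not match the declared columns
        yield dict(zip(fields, values))

def sessionize(records):
    """Group requests per client, splitting whenever the gap is too long."""
    by_client = defaultdict(list)
    for rec in records:
        ts = datetime.strptime(rec["date"] + " " + rec["time"],
                               "%Y-%m-%d %H:%M:%S")
        # Assumption: the client IP stands in for a visitor identifier.
        by_client[rec["c-ip"]].append((ts, rec["cs-uri-stem"]))

    sessions = []
    for client, hits in by_client.items():
        hits.sort()
        current = []
        for ts, uri in hits:
            if current and ts - current[-1][0] > SESSION_TIMEOUT:
                sessions.append(_make_session(client, current))
                current = []
            current.append((ts, uri))
        if current:
            sessions.append(_make_session(client, current))
    return sessions

def _make_session(client, hits):
    return {"client": client,
            "start": hits[0][0],
            "pages": [uri for _, uri in hits]}

# Made-up sample data for the example.
SAMPLE_LOG = """#Software: Microsoft Internet Information Services 6.0
#Fields: date time c-ip cs-method cs-uri-stem sc-status
2007-03-01 10:00:00 10.0.0.1 GET /default.aspx 200
2007-03-01 10:05:00 10.0.0.1 GET /dvd/123.aspx 200
2007-03-01 11:30:00 10.0.0.1 GET /default.aspx 200
2007-03-01 10:01:00 10.0.0.2 GET /cart.aspx 200
""".splitlines()

sessions = sessionize(parse_w3c_log(SAMPLE_LOG))
for s in sessions:
    print(s["client"], s["start"], s["pages"])
```

Each resulting session record corresponds to one discrete user visit, which is the grain at which a source database such as the one described above would hold the data.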
After this information is in the data warehouse, we will use the data mining features of Analysis Services to achieve the business goals for segmentation and recommendations, as shown in Figure 10-1. For each area, we will create a data mining structure that describes the underlying business problem and then run the appropriate data mining algorithm against the data to build a mathematical model. This model can then be used both for predictions, such as recommending a list of products, and for grouping information in cubes in new ways to enable more complex analyses.

Figure 10-1 High-level architecture
Data mining in Analysis Services provides several types of algorithms to perform tasks such as classification, regression, and segmentation. We will use the Microsoft Clustering algorithm to create a customer segmentation mining model; the model will then supply the resulting categories of customers' online behavior as a new dimension for cube analysis. We will use the Microsoft Association algorithm to build a data mining model that can be used to make product recommendations, and then add code to the Web site to query this model to suggest appropriate DVDs for online shoppers.
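To make the idea behind association-based recommendations concrete, here is a small Python sketch that derives pairwise rules from past orders and uses them to suggest titles for a shopping cart. It is a simplified stand-in and not the Microsoft Association algorithm or its query interface. The function names, the thresholds, and the sample orders are hypothetical and exist only for this example.

```python
from collections import Counter
from itertools import permutations

def build_rules(baskets, min_support=2, min_confidence=0.5):
    """Return {(a, b): confidence} for rules 'bought a, so suggest b'."""
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = set(basket)  # ignore duplicate titles within one order
        item_counts.update(items)
        pair_counts.update(permutations(sorted(items), 2))

    rules = {}
    for (a, b), together in pair_counts.items():
        # Confidence: share of orders containing a that also contain b.
        confidence = together / item_counts[a]
        if together >= min_support and confidence >= min_confidence:
            rules[(a, b)] = confidence
    return rules

def recommend(rules, cart, top_n=3):
    """Rank titles not yet in the cart by their best matching rule."""
    scores = {}
    for (a, b), confidence in rules.items():
        if a in cart and b not in cart:
            scores[b] = max(scores.get(b, 0.0), confidence)
    # Highest confidence first; ties are broken alphabetically.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]

# Made-up order history for the example.
orders = [
    ["Alien", "Aliens", "Blade Runner"],
    ["Alien", "Aliens"],
    ["Alien", "Blade Runner"],
    ["Casablanca", "Vertigo"],
    ["Casablanca", "Vertigo", "Alien"],
]

rules = build_rules(orders)
print(recommend(rules, {"Aliens"}))
print(recommend(rules, {"Casablanca"}))
```

The support threshold filters out pairs seen too rarely to trust, and the confidence threshold keeps only rules that hold for a reasonable share of the orders containing the first title. In the architecture described here, the Web site would obtain comparable suggestions by querying the mining model rather than by computing rules itself.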

