CS2032 – Data Mining and Data Warehousing
PART-A (2 marks)
1. Define data warehouse.
2. What are operational databases?
5. How is a database design represented in OLTP systems?
6. How is a database design represented in OLAP systems?
7. Write short notes on the multidimensional data model.
8. Define data cube.
9. What are facts?
10. What are dimensions?
11. Define dimension table.
12. Define fact table.
13. What is a lattice of cuboids?
14. What is an apex cuboid?
15. List the components of a star schema.
16. What is a snowflake schema?
17. List the components of a fact constellation schema.
18. Point out the major difference between the star schema and the snowflake schema.
19. Which is more popular in data warehouse design, the star schema model or the snowflake schema model?
20. Define concept hierarchy.
21. Define total order.
22. Define partial order.
23. Define schema hierarchy.
24. List the OLAP operations in the multidimensional data model.
25. What is the roll-up operation?
26. What is the drill-down operation?
27. What is the slice operation?
28. What is the dice operation?
29. What is the pivot operation?
30. List the views in the design of a data warehouse.
31. What are the methods for developing large software systems?
32. How is the operation performed in the waterfall method?
33. How is the operation performed in the spiral method?
34. List the steps of the data warehouse design process.
38. What is an enterprise warehouse?
39. What is a data mart?
40. What are dependent and independent data marts?
41. What is a virtual warehouse?
43. What are the types of indexing?
PART-B (16 marks)
1. Discuss the components of a data warehouse.
2. List the differences between OLTP and OLAP.
3. Discuss the various schematic representations in the multidimensional model.
4. Explain the OLAP operations in the multidimensional model.
5. Explain the design and construction of a data warehouse.
6. Explain the three-tier data warehouse architecture.
7. Explain indexing.
8. Write notes on the metadata repository.
9. Write short notes on VLDB.
UNIT – II
PART-A (2 marks)
1. What are the classifications of tools for data mining?
2. What are commercial tools?
3. What are public domain tools?
4. What are research prototypes?
5. What is the difference between generic single-task tools and generic multi-task tools?
6. What are the areas in which data warehouses are used at present and will be used in the future?
7. What are the other areas for data warehousing and data mining?
8. Specify some of the sectors in which data warehousing and data mining are used.
9. Describe the use of DBMiner.
10. List the applications of DBMiner.
11. Give some examples of data mining tools.
12. Mention some of the application areas of data mining.
13. Differentiate between a data query and a knowledge query.
14. Differentiate between direct query answering and intelligent query answering.
15. Define visual data mining.
16. What does audio data mining mean?
17. What are the factors involved in choosing a data mining system?
18. Define DMQL.
19. Define text mining.
20. What does web mining mean?
21. Define spatial data mining.
22. Explain multimedia data mining.
PART-B (16 marks)
31. Explain data mining applications for biomedical and DNA data analysis.
32. Explain data mining applications for financial data analysis.
33. Explain data mining applications for the retail industry.
34. Explain data mining applications for the telecommunication industry.
35. Explain the DBMiner tool in data mining.
36. Explain how data mining is used in health care analysis.
37. Explain how data mining is used in the banking industry.
38. Explain the types of data mining.
PART-A (2 marks)
1. Define data mining.
2. Give some alternative terms for data mining.
3. What is KDD?
4. What are the steps involved in the KDD process?
5. What is the use of the knowledge base?
7. Mention some of the data mining techniques.
8. Give a few statistical techniques.
9. What is meta-learning?
10. Define genetic algorithm.
11. What is the purpose of a data mining technique?
12. Define predictive model.
13. List the data mining tasks that belong to the predictive model.
14. Define descriptive model.
15. Define the term summarization.
16. List the advanced database systems.
17. Define cluster analysis.
18. Describe challenges to data mining regarding data mining methodology and user interaction issues.
19. Describe challenges to data mining regarding performance issues.
20. Describe issues relating to the diversity of database types.
21. What is meant by a pattern?
22. How is a data warehouse different from a database?
PART-B (16 marks)
1. Explain the evolution of database technology.
2. Explain the steps of knowledge discovery in databases.
3. Explain the architecture of a data mining system.
4. Explain the various tasks in data mining.
Explain the taxonomy of data mining tasks.
5. Explain the various techniques in data mining.
1. Define association rule mining.
2. When can we say that association rules are interesting?
3. Explain association rules in mathematical notation.
4. Define support and confidence in association rule mining.
5. How are association rules mined from large databases?
6. Describe the different classifications of association rule mining.
7. What is the purpose of the Apriori algorithm?
8. Define the anti-monotone property.
9. How are association rules generated from frequent item sets?
10. Give a few techniques to improve the efficiency of the Apriori algorithm.
11. What factors degrade the performance of Apriori candidate generation?
12. Describe the method of generating frequent item sets without candidate generation.
13. Define iceberg query.
14. Mention a few approaches to mining multilevel association rules.
15. What are multidimensional association rules?
16. Define constraint-based association mining.
17. Define the concept of classification.
18. What is a decision tree?
19. What is an attribute selection measure?
20. Describe tree pruning methods.
21. Define prepruning.
22. Define postpruning.
23. What is meant by a pattern?
24. Define the concept of prediction.
PART-B (16 marks)
1. Explain the issues regarding classification and prediction.
2. Explain classification by decision tree induction.
3. Write short notes on patterns.
4. Explain mining single-dimensional Boolean association rules from transactional databases.
5. Explain the Apriori algorithm.
6. Explain how the efficiency of Apriori is improved.
7. Explain mining frequent item sets without candidate generation.
8. Explain mining multidimensional Boolean association rules from transaction databases.
9. Explain constraint-based association mining.
2. What do you mean by cluster analysis?
3. What are the fields in which clustering techniques are used?
4. What are the requirements of cluster analysis?
5. What are the different types of data used for cluster analysis?
6. What are interval-scaled variables?
7. Define binary variables. What are the two types of binary variables?
8. Define nominal, ordinal and ratio-scaled variables.
9. What do you mean by the partitioning method?
10. Define CLARA and CLARANS.
11. What is the hierarchical method?
12. Differentiate between agglomerative and divisive hierarchical clustering.
13. What is CURE?
14. Define the Chameleon method.
15. Define the density-based method.
16. What is DBSCAN?
17. What do you mean by the grid-based method?
18. What is STING?
19. Define WaveCluster.
20. What is the model-based method?
21. What is the use of regression?
22. What are the reasons for not using the linear regression model to estimate the output data?
23. What are the two approaches used by regression to perform classification?
24. What do you mean by logistic regression?
25. What is time series analysis?
26. What are the various detected patterns?
27. What is smoothing?
28. Give the formula for Pearson’s r.
29. What is autoregression?
PART-B (16 marks)
1. Explain regression in predictive modeling.
2. Explain the statistical perspective in data mining.
3. Explain Bayesian classification.
4. Discuss the requirements of clustering in data mining.
5. Explain the partitioning method of clustering.
6. Explain visualization in data mining.