Problem 1: The Charles Book Club Case
Read the case and answer all the questions at the end of the case.
Readings:
Bhandari, Vinni, and Dr. Nitin Patel. “The Charles Book Club Case.”
Levin, Nissan, and Jacob Zahav. “A Case Study in Database Marketing.” Tel Aviv University. Direct Marketing Educational Foundation, Inc., March 1995.
Association of American Publishers. Industry Statistics, 2002.
A new title, "The Art History of Florence", is ready for release. CBC has sent a test mailing to a random sample of 4,000 customers from its customer base. The customer responses have been collated with past purchase data. The data have been randomly partitioned into three parts:
Training Data (1,800 customers): initial data to be used to fit response models,
Validation Data (1,400 customers): hold-out data used to compare the performance of different response models,
Test Data (800 customers): data to be used only after a final model has been selected, to estimate the likely accuracy of the model when it is deployed.
The sample data are in a separate spreadsheet, CBC_4000.xls (XLS). Each row (or case) in the spreadsheet (other than the header) corresponds to one market test customer. Each column is a variable, with the header row giving the name of the variable. The variable names and descriptions are given in Table 1.
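The random three-way partition described for the CBC sample (1,800 training, 1,400 validation, 800 test customers) can be sketched as follows. This is only an illustration, not the procedure used to build CBC_4000.xls: the function name `partition_customers`, the use of row indices in place of real customer records, and the seed are all assumptions.

```python
import random


def partition_customers(n_customers=4000, sizes=(1800, 1400, 800), seed=0):
    """Randomly split row indices into training, validation, and test sets.

    Hypothetical helper: row indices stand in for the customer rows of the
    spreadsheet, and the seed is arbitrary.
    """
    if sum(sizes) != n_customers:
        raise ValueError("partition sizes must add up to the number of customers")

    rng = random.Random(seed)
    indices = list(range(n_customers))
    rng.shuffle(indices)  # random order, so each slice below is a random sample

    train_end = sizes[0]
    valid_end = sizes[0] + sizes[1]
    training = indices[:train_end]
    validation = indices[train_end:valid_end]
    test = indices[valid_end:]
    return training, validation, test


training, validation, test = partition_customers()
print(len(training), len(validation), len(test))
```

Because the three sets are slices of one shuffled list, no customer appears in more than one set, which is what allows the test data to stay untouched until a final model has been chosen.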
Problem 2: The German Credit Case (PDF)
Read the case and answer all the questions at the end of the case.
German Credit Case Data (XLS)
Problem 1:
A common application of Discriminant Analysis is the classification of bonds into various bond rating classes. These ratings are intended to reflect the risk of the bond and influence the cost of borrowing for companies that issue bonds. Various financial ratios culled from annual reports are often used to help determine a company’s bond rating.
The Excel spreadsheet BondRatingProb1.xls (XLS) contains two sheets named Training data and Validation data. These are data from a sample of 95 companies selected from COMPUSTAT financial data tapes. The company bonds have been classified by Moody’s Bond Ratings (1980) into seven classes of risk ranging from AAA, the safest, to C, the most risky. The data include ten financial variables for each company. These are:
LOPMAR: Logarithm of the operating margin,
LFIXMAR: Logarithm of the pretax fixed charge coverage,
LTDCAP: Long-term debt to capitalization,
LGERRAT: Logarithm of total long-term debt to total equity,
LLEVER: Logarithm of the leverage,
LCASHLTD: Logarithm of the cash flow to long-term debt,
LACIDRAT: Logarithm of the acid test ratio,
LCURRAT: Logarithm of the current assets to current liabilities,
LRECTURN: Logarithm of the receivable turnover,
LASSLTD: Logarithm of the net tangible assets to long-term debt.
The data are divided into 81 observations in the Training data sheet and 14 observations in the Validation data sheet. The bond ratings have been coded into numbers in the column titled CODERTG, with AAA coded as 1, AA as 2, etc. Use XLMiner to develop Discriminant Analysis and Neural Networks models to classify the bonds in the Validation data sheet. You will need to use the score new data option. What is the performance of the best classifier you have been able to find? Notice that there is an order in the class variable (i.e., AAA is better than AA, which is better than A, …). Would certain misclassification errors be worse than others? If so, how would you suggest measuring this?
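To make the question about ordered classes concrete, the sketch below contrasts the plain error rate with one possible order-aware measure: the mean absolute distance between the actual and predicted CODERTG codes. This is only one candidate measure, not a prescribed answer. The function names and the two sets of predictions are invented for illustration and do not come from BondRatingProb1.xls.

```python
def error_rate(actual, predicted):
    """Fraction of bonds assigned to the wrong rating class."""
    wrong = sum(1 for a, p in zip(actual, predicted) if a != p)
    return wrong / len(actual)


def mean_rating_distance(actual, predicted):
    """Average number of rating steps between actual and predicted codes.

    Assumes ratings are coded 1 (AAA) through 7 (C), as in CODERTG, and that
    each step between adjacent ratings counts equally (an assumption).
    """
    total = sum(abs(a - p) for a, p in zip(actual, predicted))
    return total / len(actual)


# Made-up ratings for seven bonds, one in each class.
actual = [1, 2, 3, 4, 5, 6, 7]
near_misses = [2, 2, 3, 4, 5, 6, 6]  # two errors, each off by one class
far_misses = [7, 2, 3, 4, 5, 6, 1]   # two errors, each off by six classes

for name, predicted in [("near misses", near_misses), ("far misses", far_misses)]:
    print(name, error_rate(actual, predicted), mean_rating_distance(actual, predicted))
```

Both hypothetical classifiers make two errors out of seven, so their error rates are identical, but the distance measure penalizes the classifier that confuses AAA with C far more heavily.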
Problem 2:
Give true/false answers to the following questions, with one sentence to justify each answer.
Problem 3:
The Excel spreadsheet RegressionProb3.xls (XLS) contains two sheets named Training Data and Validation Data. We will use XLMiner to build two models with the training data and then use the validation data to compare their performance as prediction models.
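Comparing two prediction models on validation data usually comes down to computing the same error measure for each model on the held-out rows. The sketch below uses root mean squared error (RMSE) as that measure, which is an assumption: the problem statement does not say which metric to use. The numbers are invented and are not taken from RegressionProb3.xls.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between actual and predicted values."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


# Made-up validation outcomes and the predictions of two hypothetical models.
validation_actual = [3.0, 5.0, 7.0]
model_a_predictions = [2.0, 5.0, 8.0]
model_b_predictions = [3.0, 5.0, 10.0]

rmse_a = rmse(validation_actual, model_a_predictions)
rmse_b = rmse(validation_actual, model_b_predictions)

# The model with the smaller validation error is preferred under this metric.
better = "model A" if rmse_a < rmse_b else "model B"
print(rmse_a, rmse_b, better)
```

Both models must be scored on the validation rows only; error measured on the training rows tends to favor the more flexible model regardless of how well it predicts new data.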
Problem 4:
The Excel spreadsheet NormalsProb4.xls (XLS) contains 1000 observations with two groups (Group 0 and Group 1) and two variables (x and y).