Machine Learning Interview Questions and Answers
1. Which of the following statements about the components_ attribute of sklearn.decomposition.PCA are true?
1) It gives the principal axes in feature space
2) It represents the directions of maximum variance in the data
3) It is the set of all eigenvectors for the projection space

A) 1 and 2

B) 2 and 3

C) 1 and 3

D) 1, 2 and 3
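
One way to examine the statements in question 1 is to fit PCA on a small dataset and inspect components_ directly. The sketch below is illustrative only: the synthetic data, the random seed, and the choice of n_components=2 are assumptions, not part of the question.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Assumed toy data: 200 samples, 3 features, with most variance along feature 0
X = rng.normal(size=(200, 3)) * np.array([5.0, 1.0, 0.2])

pca = PCA(n_components=2).fit(X)

# components_ has shape (n_components, n_features); each row is one principal axis
components = pca.components_
print(components.shape)

# The rows are unit-length and mutually orthogonal, so this is close to the identity
gram = components @ components.T
print(np.round(gram, 6))

# Variance explained by each axis, sorted from largest to smallest
explained = pca.explained_variance_
print(explained)
```

Here the first row of components_ points almost entirely along feature 0, the direction in which the toy data was stretched the most.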

2. Which of the following is/are true?
1) A covariance of zero indicates that two variables are very strongly related or identical.
2) Covariance and correlation are exactly the same if the features are normalized to unit variance.
3) Correlation is the standardized form of covariance.

A) 1 and 2

B) 2 and 3

C) 1 and 3

D) 1, 2 and 3
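
The relationship between covariance and correlation in question 2 can be checked numerically. The following is a minimal sketch with made-up data (the arrays and seed are assumptions chosen for illustration).

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = 2.0 * x + rng.normal(size=500)

# Standardize both features to zero mean and unit sample variance
xs = (x - x.mean()) / x.std(ddof=1)
ys = (y - y.mean()) / y.std(ddof=1)

cov_standardized = np.cov(xs, ys)[0, 1]  # covariance after standardizing
corr_raw = np.corrcoef(x, y)[0, 1]       # Pearson correlation of the raw data
print(cov_standardized, corr_raw)

# Zero covariance does not mean "strongly related": here v depends entirely
# on u, yet the covariance is zero because the dependence is not linear.
u = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
v = u ** 2
cov_uv = np.cov(u, v)[0, 1]
print(cov_uv)
```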

3. Which of the following is an sklearn module that can be used to combine preprocessing steps and an estimator into a single object?

1) Decision Tree

2) Bagging

3) Random Forest

A) 1 and 2

B) 2 and 3

C) 3 and 4

D) 1 and 4
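
For reference, chaining preprocessing steps with an estimator, as described in the question text, is what sklearn.pipeline.Pipeline does. The sketch below is only an illustration: the iris dataset, the step names "scale" and "clf", and the choice of StandardScaler with LogisticRegression are assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Each step is a (name, transformer_or_estimator) pair; the last step is the estimator
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# fit() runs the scaler's fit_transform, then fits the classifier on the scaled data
pipe.fit(X, y)
train_accuracy = pipe.score(X, y)
step_names = list(pipe.named_steps)
print(step_names, train_accuracy)
```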

4. Which of the following cannot be used to evaluate a regression task?

A) ROC curve

B) Mean Absolute Error

C) Coefficient of determination

D) Scatterplot between predicted and actual values
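
Two of the metrics named in question 4 can be computed on continuous targets with sklearn.metrics. This is a small sketch with invented numbers (the y_true and y_pred values are assumptions for illustration).

```python
from sklearn.metrics import mean_absolute_error, r2_score

# Hypothetical continuous targets and predictions
y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 3.0, 8.0]

# Mean Absolute Error: average of |actual - predicted|
mae = mean_absolute_error(y_true, y_pred)

# Coefficient of determination: 1 - SS_res / SS_tot
r2 = r2_score(y_true, y_pred)

print(mae, r2)
```

An ROC curve, by contrast, is built from true-positive and false-positive rates, which are defined for class labels rather than for continuous targets.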

5. In the context of K-means clustering, when we plot an elbow plot, the horizontal axis tells us the number of clusters. What does the vertical axis tell us?

A) Variance in the target variable

B) Inter-Cluster-Sum-of-Squared-Distances

C) Intra-Cluster-Sum-of-Squared-Distances

D) Sum of squared distances/degrees of freedom
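
The quantity usually plotted on the vertical axis of an elbow plot is available in scikit-learn as KMeans.inertia_, the sum of squared distances of samples to their closest cluster center. The sketch below computes it for several values of k; the three synthetic blobs, the seed, and the range of k are assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Assumed toy data: three well-separated 2-D blobs of 50 points each
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
X = np.vstack([c + rng.normal(size=(50, 2)) for c in centers])

ks = list(range(1, 7))
inertias = []
for k in ks:
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias.append(km.inertia_)  # value for the vertical axis at this k

# Recompute the k=3 value by hand: squared distance of each point to its own center
km3 = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
manual = ((X - km3.cluster_centers_[km3.labels_]) ** 2).sum()

for k, value in zip(ks, inertias):
    print(k, round(value, 1))
```

Plotting inertias against ks gives the elbow plot; with this toy data the curve drops sharply up to k=3 and flattens afterwards.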

6. Which of the following statements should be true in the context of training and testing an ML model?

1) The test set should be representative of the population

2) The training set should be representative of the population

A) 1 only

B) 2 only

C) Neither 1 nor 2

D) Both 1 and 2

7. The final decision boundary of a decision tree is always ____?

A) Linear

B) Curvilinear

C) Non-linear

D) None of the above

Answer: D

8. A decision tree that is allowed to grow to its maximum size is prone to which of the following?

A) High bias error

B) High recall on test set

C) High precision on test set

D) High variance error

9. How many iterations of training and testing happen if we perform Leave One Out Cross Validation on a dataset of 10,000 records?

A) 1

B) 100

C) 9999

D) 10000
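
The number of iterations in question 9 can be counted with sklearn.model_selection.LeaveOneOut. In this sketch the feature array is a placeholder of zeros (an assumption), since only the number of rows matters for counting splits.

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut

# Placeholder features: 10,000 records, one dummy column
X = np.zeros((10_000, 1))

loo = LeaveOneOut()
n_iterations = loo.get_n_splits(X)  # one train/test round per record
print(n_iterations)

# Each round holds out exactly one record and trains on all the others
train_idx, test_idx = next(loo.split(X))
print(len(train_idx), len(test_idx))
```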

10. If one were to let a decision tree grow to the fullest extent on a practical dataset, what would the impurity of individual leaf nodes likely be?

A) Close to 0

B) Close to 0.5

C) Close to 1

D) Maximum
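
The leaf impurity of a fully grown tree can be inspected through the fitted tree_ attribute of DecisionTreeClassifier. The sketch below is one possible check, assuming the bundled breast cancer dataset and a default train/test split as stand-ins for a "practical dataset".

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# No max_depth or min_samples_leaf limit, so the tree grows until it cannot split further
tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Leaves are the nodes without children (children_left == -1)
is_leaf = tree.tree_.children_left == -1
leaf_impurity = tree.tree_.impurity[is_leaf]
max_leaf_impurity = leaf_impurity.max()

train_acc = tree.score(X_train, y_train)
test_acc = tree.score(X_test, y_test)
print(max_leaf_impurity, train_acc, test_acc)
```

Comparing train_acc with test_acc also shows the gap between training and test performance that an unrestricted tree tends to produce.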

11. The following table gives the predicted ratings (by our model) and actual ratings of some products. Calculate the RMSE score for these predictions.

Product Name    Actual Rating    Predicted Rating
XDR             5                2
XLP             3                4
XTZ             4                5

A) 1.61

B) 1.91

C) 2.61

D) 2.91
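
The RMSE for the table in question 11 can be worked out step by step. The sketch below uses only the ratings given in the table; the variable names are illustrative.

```python
import math

actual = [5, 3, 4]       # XDR, XLP, XTZ
predicted = [2, 4, 5]

# Squared error per product: (5-2)^2 = 9, (3-4)^2 = 1, (4-5)^2 = 1
squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]

# Mean squared error: (9 + 1 + 1) / 3 = 11/3
mse = sum(squared_errors) / len(squared_errors)

# Root mean squared error: sqrt(11/3)
rmse = math.sqrt(mse)
print(round(rmse, 2))  # 1.91
```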