AI Fundamentals Comprehensive Quiz & Projects
30 questions covering the AI Fundamentals tutorial.
Question 1: What is the difference between Supervised and Unsupervised Learning?
- A. Supervised learning requires human coders to write logic, while Unsupervised learning writes its own code.
- B. Supervised learning uses labeled training data containing correct answers, while Unsupervised learning finds patterns in unlabeled data. (correct answer)
- C. Supervised learning is used for robotics, while Unsupervised learning is used for databases.
- D. Supervised learning does not use algorithms.
Explanation: Supervised learning models map inputs to target labels. Unsupervised models explore patterns and structures without labels.
Question 2: If a machine learning model achieves very high accuracy on training data but poor accuracy on validation/test data, what is the problem?
- A. Underfitting
- B. Overfitting (correct answer)
- C. High Bias
- D. Normalization Failure
Explanation: Overfitting occurs when the model memorizes noise and quirks of the training data instead of learning patterns that generalize to new, unseen data.
Question 3: Why are non-linear activation functions (like ReLU or Sigmoid) crucial in Artificial Neural Networks?
- A. They speed up float computations on the GPU.
- B. They prevent the weights from becoming negative.
- C. They allow the network to learn complex, non-linear relationships in data rather than acting as a simple linear model. (correct answer)
- D. They automatically clean outlier rows in training datasets.
Explanation: Without non-linear activation functions, stacked layers collapse mathematically into a single linear transformation, so extra depth adds no expressive power.
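The collapse can be checked numerically. The sketch below is only an illustration: the weight matrices W1 and W2 and the input x are hypothetical values, not taken from the tutorial. It compares two stacked linear layers with the single merged matrix W2·W1, then shows that inserting ReLU breaks the equivalence.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def matmul(left, right):
    """Multiply two matrices given as lists of rows."""
    columns = list(zip(*right))
    return [[sum(a * b for a, b in zip(row, col)) for col in columns] for row in left]

def relu(vector):
    return [max(0.0, v) for v in vector]

# Hypothetical weights and input, chosen only for illustration.
W1 = [[1.0, 2.0], [0.0, -1.0]]
W2 = [[3.0, 1.0]]
x = [2.0, 5.0]

stacked = matvec(W2, matvec(W1, x))           # two linear layers, no activation
collapsed = matvec(matmul(W2, W1), x)         # one layer with merged weights
with_relu = matvec(W2, relu(matvec(W1, x)))   # non-linearity between the layers

print(stacked, collapsed, with_relu)
```

The first two results agree for every input, while the ReLU version differs whenever a hidden value is negative.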
Question 4: What is the primary role of a Loss Function in training an AI model?
- A. Saving the model state to disk to prevent data loss.
- B. Measuring the difference between the model's predictions and the actual ground-truth values to guide optimization. (correct answer)
- C. Removing inactive weights during training.
- D. Splitting the dataset into train and test groups.
Explanation: The loss function outputs a cost score representing prediction error, which optimizer algorithms use to adjust weights.
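As a concrete case, mean squared error is one common loss function. The sketch below uses made-up targets and predictions (assumptions for illustration only) to show that closer predictions give a lower score.

```python
def mean_squared_error(predictions, targets):
    """Average of squared differences between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

# Hypothetical ground-truth values and two sets of predictions.
targets = [3.0, 5.0, 7.0]
close_predictions = [3.0, 5.0, 8.0]
poor_predictions = [0.0, 0.0, 0.0]

close_loss = mean_squared_error(close_predictions, targets)
poor_loss = mean_squared_error(poor_predictions, targets)

print(close_loss, poor_loss)
```

An optimizer would prefer the weights that produced close_predictions because their loss is smaller.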
Question 5: How does Backpropagation contribute to the training of neural networks?
- A. It feeds data forward through layers to compute outputs.
- B. It computes gradients of the loss function using the chain rule, moving backwards through layers to update model weights. (correct answer)
- C. It backs up model parameters to cloud storage.
- D. It resets weights to zero if accuracy falls below 50%.
Explanation: Backpropagation calculates how much each weight contributed to the prediction error; an optimizer such as gradient descent then uses those gradients to adjust the weights.
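The chain rule can be traced by hand on a very small network. The sketch below assumes a hypothetical model with one tanh hidden unit and a squared-error loss (neither is specified in the tutorial), and checks the hand-derived gradients against finite differences.

```python
import math

def forward(w1, w2, x):
    """Hypothetical tiny network: y = w2 * tanh(w1 * x)."""
    h = math.tanh(w1 * x)
    return h, w2 * h

def loss(w1, w2, x, target):
    _, y = forward(w1, w2, x)
    return (y - target) ** 2

def backward(w1, w2, x, target):
    """Apply the chain rule from the loss back to each weight."""
    h, y = forward(w1, w2, x)
    dloss_dy = 2.0 * (y - target)             # derivative of the squared error
    grad_w2 = dloss_dy * h                    # y = w2 * h
    dloss_dh = dloss_dy * w2                  # pass the error back to the hidden unit
    grad_w1 = dloss_dh * (1.0 - h ** 2) * x   # tanh'(z) = 1 - tanh(z)^2
    return grad_w1, grad_w2

w1, w2, x, target = 0.5, -1.5, 2.0, 1.0   # illustrative values only
grad_w1, grad_w2 = backward(w1, w2, x, target)

# Central finite differences as an independent check.
eps = 1e-6
numeric_w1 = (loss(w1 + eps, w2, x, target) - loss(w1 - eps, w2, x, target)) / (2 * eps)
numeric_w2 = (loss(w1, w2 + eps, x, target) - loss(w1, w2 - eps, x, target)) / (2 * eps)

print(grad_w1, numeric_w1)
print(grad_w2, numeric_w2)
```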
Question 6: What is Machine Learning?
- A. The physical assembly of computer processors.
- B. A subset of Artificial Intelligence that enables systems to learn from data and improve from experience without being explicitly programmed. (correct answer)
- C. Writing simple database lookup tables.
- D. Running mathematical scripts manually on calculators.
Explanation: Machine Learning uses statistical algorithms to discover trends in training data.
Question 7: What is the difference between Classification and Regression in machine learning?
- A. Classification predicts continuous numerical values, while Regression predicts discrete class labels.
- B. Classification predicts discrete class labels (e.g. Spam/Not Spam), while Regression predicts continuous numerical values (e.g. House Price). (correct answer)
- C. Regression is unsupervised, while Classification is supervised.
- D. There is no difference; they are aliases.
Explanation: Classification assigns inputs to discrete categories; Regression estimates continuous numeric values.
Question 8: What does the 'Bias-Variance Tradeoff' represent?
- A. The cost comparison between CPU and GPU hardware.
- B. The balance between error from simple assumptions (Bias) and error from sensitivity to training noise (Variance), aiming to minimize total error. (correct answer)
- C. The speed trade-off of training runs.
- D. The ratio of models to databases.
Explanation: High bias causes underfitting; high variance causes overfitting. Optimal models balance both.
Question 9: In AI, what does 'Deep Learning' refer to?
- A. Studying algorithms in-depth.
- B. A subset of Machine Learning based on Artificial Neural Networks with many layers (deep architectures). (correct answer)
- C. Storing data in deep cloud buckets.
- D. Running calculations on mainframe systems.
Explanation: Deep learning uses multi-layer neural structures to extract abstract features from inputs.
Question 10: What is the purpose of the learning rate parameter in Gradient Descent?
- A. It determines the number of layers in the network.
- B. It controls the step size taken towards the minimum of the loss function during weight updates. (correct answer)
- C. It scales the size of training datasets.
- D. It tracks the speed of GPU processing threads.
Explanation: A learning rate that is too high can overshoot the minimum or diverge; one that is too low makes convergence slow.
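The effect of the step size shows up even on a toy problem. The sketch below assumes the loss f(w) = w^2 (gradient 2w) and three illustrative learning rates; these choices are not from the tutorial.

```python
def gradient_descent(learning_rate, start=10.0, steps=20):
    """Minimize f(w) = w**2, whose gradient is 2 * w."""
    w = start
    for _ in range(steps):
        gradient = 2.0 * w
        w -= learning_rate * gradient
    return w

too_low = gradient_descent(0.001)    # barely moves away from the start
reasonable = gradient_descent(0.1)   # approaches the minimum at w = 0
too_high = gradient_descent(1.1)     # each step overshoots and the error grows

print(too_low, reasonable, too_high)
```

With the highest rate, every update jumps past the minimum to a point farther away than before, so the iterates diverge.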
Question 11: What is the difference between L1 (Lasso) and L2 (Ridge) regularization?
- A. L1 regularization adds absolute weight values to the loss (promoting sparsity), while L2 adds squared weight values (shrinking weights smoothly). (correct answer)
- B. L2 regularization is slower and deprecated.
- C. L1 is unsupervised, while L2 is supervised.
- D. L1 is for text; L2 is for images.
Explanation: L1 can drive small weights exactly to zero, which acts as feature selection; L2 shrinks all weights toward zero without usually eliminating any.
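The sparsity difference is visible in a one-weight simplification. The sketch below assumes each weight is fitted independently by minimizing 0.5*(w - a)^2 plus a penalty, where a is the unregularized value; this setup and the numbers are illustrative assumptions. Under that assumption the L1 minimizer is the soft-threshold of a, and the L2 minimizer is a / (1 + lam).

```python
def l1_solution(a, lam):
    """Minimizer of 0.5*(w - a)**2 + lam*abs(w): soft-thresholding."""
    if a > lam:
        return a - lam
    if a < -lam:
        return a + lam
    return 0.0

def l2_solution(a, lam):
    """Minimizer of 0.5*(w - a)**2 + 0.5*lam*w**2: uniform shrinkage."""
    return a / (1.0 + lam)

unregularized = [3.0, 0.4, -0.2, -2.0]   # hypothetical weights
lam = 0.5                                # hypothetical penalty strength

l1_weights = [l1_solution(a, lam) for a in unregularized]
l2_weights = [l2_solution(a, lam) for a in unregularized]

print(l1_weights)
print(l2_weights)
```

The two small weights become exactly zero under L1, while L2 only scales every weight down.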
Question 12: Which of the following is a classic example of Reinforcement Learning?
- A. Categorizing emails into spam.
- B. An AI agent learning to navigate a maze or play chess by receiving rewards for good actions and penalties for bad ones. (correct answer)
- C. Predicting future stock prices based on past trends.
- D. Formatting text templates automatically.
Explanation: Reinforcement learning uses trial and error, guided by reward signals, to learn a policy that maps states to actions.
Question 13: What is a 'Validation Set' used for during model training?
- A. Final evaluation of the model's accuracy.
- B. Tuning model hyperparameters and selecting the best model configuration, while keeping the test set untouched for the final evaluation. (correct answer)
- C. Initial loading of training rows.
- D. Encrypting parameters.
Explanation: Training sets train weights; validation sets tune configs; test sets evaluate final accuracy.
Question 14: What is the vanishing gradient problem in deep neural networks?
- A. Gradients becoming too large, causing weights to oscillate.
- B. Gradients shrinking exponentially as they propagate back through deep layers, causing early layers to train very slowly or stop completely. (correct answer)
- C. The loss function outputs returning null values.
- D. The compiler deleting weight matrices automatically.
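Explanation: The Sigmoid derivative never exceeds 0.25, and both Sigmoid and Tanh derivatives approach zero when a unit saturates, so multiplying many of them during backpropagation shrinks the gradient in deep models.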
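A rough calculation shows how fast the shrinkage compounds. The sketch below makes a simplifying assumption that every layer contributes only its sigmoid derivative (weights are treated as 1, which real networks do not guarantee), evaluated at the most favorable point z = 0.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_derivative(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def gradient_scale(depth, z=0.0):
    """Product of sigmoid derivatives across `depth` layers (weights assumed to be 1)."""
    return sigmoid_derivative(z) ** depth

best_case_derivative = sigmoid_derivative(0.0)   # the maximum of the sigmoid derivative
scale_after_10_layers = gradient_scale(10)

print(best_case_derivative, scale_after_10_layers)
```

Even in this best case, the gradient reaching the first of ten layers is scaled by less than one millionth.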
Question 15: What metric measures the proportion of actual positive cases that were correctly identified?
- A. Precision
- B. Recall (Sensitivity) (correct answer)
- C. Accuracy
- D. F1-Score
Explanation: Recall calculates True Positives divided by (True Positives + False Negatives).
Question 16: What is an 'Artificial Neural Network' (ANN)?
- A. A biological brain mapping research group.
- B. A computational model inspired by the structure and function of biological brains, consisting of interconnected node layers. (correct answer)
- C. A network connection between cloud servers.
- D. A database schema layout.
Explanation: ANNs route inputs through layers of nodes, applying weights and biases to predict outputs.
Question 17: How does Stochastic Gradient Descent (SGD) differ from Batch Gradient Descent?
- A. SGD computes gradients on single random samples (or mini-batches) rather than evaluating the entire dataset, making updates faster. (correct answer)
- B. Batch Gradient Descent runs only in memory.
- C. SGD requires special GPU hardware setups.
- D. There is no difference; they are aliases.
Explanation: SGD introduces noise into each update, but every step is much cheaper to compute, so it scales to very large or streaming datasets.
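A single-sample update loop can be sketched in a few lines. The example below assumes a hypothetical one-parameter model y = w * x, a squared-error loss, and noise-free data with true slope 2; the function name and the settings are illustrative only.

```python
import random

def sgd_fit_slope(xs, ys, learning_rate=0.1, steps=500, seed=0):
    """Fit y = w * x by updating on one randomly chosen sample per step."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        i = rng.randrange(len(xs))                 # pick a single random sample
        error = w * xs[i] - ys[i]
        w -= learning_rate * 2.0 * error * xs[i]   # gradient of (w*x - y)**2
    return w

xs = [i / 10 for i in range(1, 11)]   # 0.1, 0.2, ..., 1.0
ys = [2.0 * x for x in xs]            # noise-free targets with slope 2

w = sgd_fit_slope(xs, ys)
print(w)
```

Batch gradient descent would instead sum the gradient over all ten samples before each update; here each step touches only one sample.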
Question 18: What does the F1-Score calculate?
- A. The accuracy of classification runs.
- B. The harmonic mean of Precision and Recall, providing a balanced metric for imbalanced datasets. (correct answer)
- C. The speed of the training runs.
- D. The number of layers in the model.
Explanation: F1-score balances precision and recall, which makes it useful when class distributions are skewed.
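The gap between accuracy and F1 on skewed data can be shown with made-up counts. The sketch below assumes a hypothetical test set of 1,000 samples with only 10 positives, where the model finds 2 of them and raises 1 false alarm.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and their harmonic mean from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical counts for an imbalanced dataset.
tp, fp, fn, tn = 2, 1, 8, 989

accuracy = (tp + tn) / (tp + fp + fn + tn)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)

print(accuracy, precision, recall, f1)
```

Accuracy looks excellent because negatives dominate, while the F1-score exposes that most positives were missed.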
Question 19: What is the purpose of Batch Normalization in deep training?
- A. Normalizing files on the server hard drive.
- B. Normalizing layer inputs within each mini-batch, stabilizing and accelerating the training of deep networks. (correct answer)
- C. Formatting data rows in SQL databases.
- D. Splitting the dataset into folds.
Explanation: Batch Norm was introduced to reduce internal covariate shift; in practice it often permits higher learning rates.
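The normalization step itself is short. The sketch below handles one feature across a mini-batch in training mode only; the scale gamma, shift beta, and eps defaults are conventional choices assumed here, and the running statistics used at inference time are left out.

```python
import math

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature over a mini-batch, then scale and shift."""
    mean = sum(batch) / len(batch)
    variance = sum((v - mean) ** 2 for v in batch) / len(batch)
    return [gamma * (v - mean) / math.sqrt(variance + eps) + beta for v in batch]

activations = [10.0, 20.0, 30.0, 40.0]   # hypothetical layer inputs
normalized = batch_norm(activations)

norm_mean = sum(normalized) / len(normalized)
norm_variance = sum((v - norm_mean) ** 2 for v in normalized) / len(normalized)

print(normalized)
```

After normalization the feature has a mean near 0 and a variance near 1, whatever scale the raw activations had.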
Question 20: In machine learning, what are 'Features'?
- A. The physical components of server frames.
- B. The individual, measurable variables or attributes used as input data for model predictions. (correct answer)
- C. The design elements of the website UI.
- D. The performance metrics of GPUs.
Explanation: Features (denoted as X) are the input variables the algorithm learns from.
Question 21: What does underfitting indicate?
- A. The model is too complex and memorized noise.
- B. The model is too simple to capture the underlying structure of the data (high bias). (correct answer)
- C. The dataset has too many columns.
- D. The learning rate is set to zero.
Explanation: Underfitting yields poor accuracy on both training and validation/test datasets.
Question 22: What is a Confusion Matrix used for?
- A. Documenting code compilation failures.
- B. A table of actual versus predicted classes, used to evaluate classification model performance. (correct answer)
- C. Formatting database index columns.
- D. Measuring server network latency.
Explanation: The matrix visualizes True/False Positives and Negatives, showing where errors occur.
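For a binary problem the four cells can be counted directly. The sketch below uses hypothetical label lists and a helper name (confusion_counts) invented for this example.

```python
from collections import Counter

def confusion_counts(actual, predicted, positive=1):
    """Count TP, FP, FN, and TN for a binary classification task."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["TP" if a == positive else "FP"] += 1
        else:
            counts["FN" if a == positive else "TN"] += 1
    return counts

# Hypothetical ground truth and model output.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]

counts = confusion_counts(actual, predicted)

print("            predicted 1  predicted 0")
print(f"actual 1    {counts['TP']:^11}  {counts['FN']:^11}")
print(f"actual 0    {counts['FP']:^11}  {counts['TN']:^11}")
```

The off-diagonal cells (FN and FP) show which kind of mistake the model makes.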
Question 23: Which algorithm is a classic example of a supervised classifier?
- A. K-Means Clustering
- B. Decision Trees (correct answer)
- C. Principal Component Analysis (PCA)
- D. Apriori Association
Explanation: Decision Trees split data recursively on features to categorize targets.
Question 24: What is the purpose of K-Fold Cross-Validation?
- A. Splitting data into K separate databases for backups.
- B. Dividing the dataset into K folds, training K times on K-1 folds, and validating on the remaining fold to ensure robust evaluation. (correct answer)
- C. Compressing model file sizes.
- D. Accelerating CPU speed.
Explanation: Cross-validation reduces the dependence on a single train-test split, yielding a more stable performance estimate averaged over the folds.
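The fold bookkeeping can be sketched with plain index lists. The helper below (k_fold_indices, a name made up for this example) does not shuffle, which real pipelines usually do first; model training and scoring are left out.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size

folds = list(k_fold_indices(10, 3))
for train, validation in folds:
    print("train:", train, "validate:", validation)
```

Every sample lands in exactly one validation fold, so each one is used for validation once and for training K-1 times.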
Question 25: What does Principal Component Analysis (PCA) do?
- A. It trains multiple models in parallel.
- B. It is a dimensionality reduction technique that projects high-dimensional data onto orthogonal axes of maximum variance. (correct answer)
- C. It secures model parameters from leakage.
- D. It checks database integrity.
Explanation: PCA reduces the number of features while preserving as much of the data's variance as possible.
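In two dimensions the first principal axis has a closed form, which keeps a sketch dependency-free. The example below assumes four hypothetical points lying near the line y = x and reduces them from two features to one; general PCA would use an eigendecomposition or SVD instead.

```python
import math

def pca_2d(points):
    """Project 2-D points onto their first principal axis."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    # Sample covariance matrix [[a, b], [b, c]].
    a = sum(x * x for x, _ in centered) / (n - 1)
    b = sum(x * y for x, y in centered) / (n - 1)
    c = sum(y * y for _, y in centered) / (n - 1)
    # Angle of the direction that maximizes the projected variance.
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    direction = (math.cos(theta), math.sin(theta))
    projected = [x * direction[0] + y * direction[1] for x, y in centered]
    top_variance = sum(t * t for t in projected) / (n - 1)
    explained_ratio = top_variance / (a + c)
    return direction, projected, explained_ratio

points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8)]   # hypothetical data
direction, projected, explained_ratio = pca_2d(points)

print(direction, explained_ratio)
```

One coordinate per point is kept instead of two, and almost all of the variance survives because the data were nearly one-dimensional to begin with.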
Question 26: What is a 'Target' in supervised learning?
- A. The training accuracy threshold.
- B. The output variable or label the model is being trained to predict. (correct answer)
- C. The server directory path.
- D. The hardware processing target.
Explanation: The target (denoted as y) is the answer label the model maps inputs to.
Question 27: What is K-Means Clustering?
- A. A supervised classification algorithm.
- B. An unsupervised algorithm that groups unlabeled data points into K clusters based on distance similarities. (correct answer)
- C. A method to clean data outliers.
- D. A database replication technique.
Explanation: K-Means optimizes centroid positions, grouping items into clusters based on distance.
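The assign-then-update loop can be sketched on one-dimensional data. The example below assumes hypothetical values and fixed starting centroids; practical implementations choose starting points more carefully and stop once assignments no longer change.

```python
def k_means_1d(values, initial_centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and recomputing centroids."""
    centroids = list(initial_centroids)
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            sum(members) / len(members) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids, clusters

values = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]   # hypothetical unlabeled data
centroids, clusters = k_means_1d(values, initial_centroids=[0.0, 6.0])

print(centroids)
print(clusters)
```

No labels are supplied; the two groups emerge only from the distances between points.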
Question 28: What is the difference between a parameter and a hyperparameter?
- A. Parameters are learned by the model during training (weights, biases), while hyperparameters are set by the developer beforehand (learning rate, batch size). (correct answer)
- B. Hyperparameters are only used in deep models.
- C. Parameters are stored in external settings files.
- D. There is no difference.
Explanation: Model training adjusts parameters. Developers tune hyperparameters to steer training.
Question 29: Which metric measures the proportion of predicted positive cases that were actually correct?
- A. Recall
- B. Precision (correct answer)
- C. Accuracy
- D. MSE
Explanation: Precision calculates True Positives divided by (True Positives + False Positives).
Question 30: How do Ensemble Methods (like Random Forests) improve predictions?
- A. By combining predictions from multiple individual models (e.g. decision trees) to reduce variance and improve generalization. (correct answer)
- B. By migrating data models to cloud databases.
- C. By increasing the memory of the host processor.
- D. By running training loops faster.
Explanation: Ensemble models average out single model errors, boosting accuracy and robustness.
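The variance reduction can be simulated. The sketch below assumes each hypothetical model predicts the true value plus independent Gaussian noise; trees in a real Random Forest are correlated, so the reduction there is smaller than in this idealized setup.

```python
import random
import statistics

rng = random.Random(42)
TRUE_VALUE = 10.0   # hypothetical quantity every model tries to predict

def noisy_model():
    """One model's prediction: the truth plus independent unit-variance noise."""
    return TRUE_VALUE + rng.gauss(0.0, 1.0)

def ensemble_prediction(n_models=25):
    """Average the predictions of several independent noisy models."""
    predictions = [noisy_model() for _ in range(n_models)]
    return sum(predictions) / len(predictions)

single_predictions = [noisy_model() for _ in range(2000)]
ensemble_predictions = [ensemble_prediction() for _ in range(2000)]

single_variance = statistics.pvariance(single_predictions)
ensemble_variance = statistics.pvariance(ensemble_predictions)

print(single_variance, ensemble_variance)
```

Averaging 25 independent models cuts the variance by roughly a factor of 25, while the average prediction stays centered on the true value.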