Project and report for the final examination of the course "Machine Learning and Pattern Recognition", held at Politecnico di Torino (a.y. 2022/2023).
The task is to determine whether a word is spoken in a target language. The dataset is synthetic and was generated for academic purposes. Several basic machine learning models were applied to the task and compared using a well-defined metric, computed as the average over two application points.
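
The description above does not name the metric, so the following is only a minimal sketch of what "averaging over two application points" could look like. It assumes the metric is a normalized detection cost computed from log-likelihood-ratio scores. The function names (`normalized_dcf`, `average_cost`) and the two priors (`0.5` and `0.1`) are hypothetical and are not taken from the project.

```python
import numpy as np

def normalized_dcf(llr, labels, prior, cfn=1.0, cfp=1.0):
    """Normalized detection cost at the Bayes threshold of one application point.

    Assumes both classes (0 and 1) are present in `labels`.
    """
    llr = np.asarray(llr, dtype=float)
    labels = np.asarray(labels, dtype=int)
    # Theoretical Bayes threshold for the application point (prior, cfn, cfp).
    threshold = -np.log((prior * cfn) / ((1.0 - prior) * cfp))
    predictions = (llr > threshold).astype(int)
    fnr = np.mean(predictions[labels == 1] == 0)  # false negative rate
    fpr = np.mean(predictions[labels == 0] == 1)  # false positive rate
    dcf = prior * cfn * fnr + (1.0 - prior) * cfp * fpr
    # Normalize by the cost of the best trivial (constant) decision.
    return dcf / min(prior * cfn, (1.0 - prior) * cfp)

def average_cost(llr, labels, priors=(0.5, 0.1)):
    """Average the normalized cost over two (hypothetical) application points."""
    return float(np.mean([normalized_dcf(llr, labels, p) for p in priors]))
```

Averaging over two operating points rewards models whose scores behave well at more than one prior, instead of models tuned to a single threshold.
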
The models explored are:
- Multivariate Gaussian
- Logistic Regression
- Support Vector Machine
- Gaussian Mixture Models
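
To illustrate the first model in the list, here is a minimal NumPy sketch of a binary multivariate Gaussian classifier that outputs log-likelihood ratios. It is not the repository's implementation: the function names and the convention that label `1` marks the target language are assumptions made for this example.

```python
import numpy as np

def fit_gaussian(X):
    """Maximum-likelihood mean and covariance; X has shape (n_samples, n_features)."""
    mu = X.mean(axis=0)
    centered = X - mu
    sigma = centered.T @ centered / X.shape[0]
    return mu, sigma

def log_density(X, mu, sigma):
    """Log-density of each row of X under N(mu, sigma)."""
    d = X.shape[1]
    centered = X - mu
    _, logdet = np.linalg.slogdet(sigma)
    # Solve sigma @ z = centered^T instead of inverting sigma explicitly.
    solved = np.linalg.solve(sigma, centered.T).T
    maha = np.sum(centered * solved, axis=1)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)

def mvg_llr(X_train, y_train, X_test):
    """Log-likelihood ratio: target class (1) versus non-target class (0)."""
    mu1, sigma1 = fit_gaussian(X_train[y_train == 1])
    mu0, sigma0 = fit_gaussian(X_train[y_train == 0])
    return log_density(X_test, mu1, sigma1) - log_density(X_test, mu0, sigma0)
```

A positive ratio favors the target class. Comparing the ratio with a threshold derived from the application prior gives the final decision.
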
Some preprocessing techniques were also explored (dimensionality reduction through Principal Component Analysis and Linear Discriminant Analysis), along with methods to improve overall performance (score calibration and model fusion). The final report is available in the repository.
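
As a small illustration of the dimensionality reduction step, the sketch below implements Principal Component Analysis through an eigendecomposition of the covariance matrix. It is a generic example under stated assumptions, not the project's code: `pca_fit` and `pca_transform` are hypothetical names, and samples are assumed to be stored as rows.

```python
import numpy as np

def pca_fit(X, m):
    """Return the data mean and the top-m principal directions (as columns)."""
    mu = X.mean(axis=0)
    centered = X - mu
    cov = centered.T @ centered / X.shape[0]
    # eigh returns eigenvalues in ascending order, so reverse the columns.
    _, eigvecs = np.linalg.eigh(cov)
    P = eigvecs[:, ::-1][:, :m]
    return mu, P

def pca_transform(X, mu, P):
    """Project samples onto the principal directions found by pca_fit."""
    return (X - mu) @ P
```

The mean and projection matrix should be estimated on the training data only and then reused unchanged on validation and test data.
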