Skip to main content

Machine Learning for Predictions

the prediction problem, the machine learning methods, and a tutorial for execution

Published onNov 21, 2022
Machine Learning for Predictions

Acknowledgments: the student contributors Zesen Zhuang and Xinyu Tian were supported as Teaching Assistants by the Social Science Divisional Chair’s Discretionary Fund to encourage faculty engagement in undergraduate research and enhance student-faculty scholarly interactions outside of the classroom. The division chair is Prof. Keping Wu, Associate Professor of Anthropology at Duke Kunshan University.

Part I The Prediction Problem

1.1. The Philosophy of Predictivism

In early philosophical literature, a ‘prediction’ was considered to be an empirical consequence of a theory that had not yet been verified at the time the theory was constructed—an ‘accommodation’ was one that had. I know the view that predictions are superior to accommodations in the assessment of scientific theories as ‘predictivism’.

quoted from “Prediction versus Accommodation,” Stanford Encyclopedia of Philosophy

2.2. The Time Series Data.

Be careful with the validity of online news:

  • What are the data sources? Are the data likely to be trustworthy or not?

  • What are the algorithms that underpin the results?

  • What are the assumptions for the algorithms to work?

  • Can you find better data sources for scientific predictions?

  • Can you find another algorithm that can better answer the research questions?

Top 20 Country GDP (PPP) History & Projection (1800-2040)

2.3 The Properties of Time Series Data

Reference Python Package:

Time Series Analysis in Python | Time Series Forecasting | Data Science with Python | Edureka

Part II The Prediction Algorithms

  • The general prediction model:

    If Yt=f(Xt,β)+ϵY_{t}=f(X_{t},\beta) +\epsilon, then:

    E(Yt)=E(f(Xt),β)E(Y_{t})=E(f(X_{t}),\beta) if assuming E(ϵ)=0E(\epsilon)=0.

  • The sciki-learn python packages:

2.1. The Classifiers.

A multilayer perceptron (MLP) is a fully connected class of feedforward artificial neural networks (ANN). An MLP consists of at least three layers of nodes with nonlinear activation functions: an input layer, a hidden layer, and an output layer. It utilizes supervised learning for training. The Classifier version utilize supervised learning of classifiers. [source: Wikipedia, sklearn]

The AutoML method utilizes a well-built machine learning algorithm portfolio for users to easily train and achieve high predictive performances. The AutoML tool would automatically train the data with different trials in various machine learning models and select the best-performing one as the output. In our classification task, we use the AutoGluon library with Python programming. [souce: Wikipedia, AutoGluon]

2.2. The Regression

Part III The Workflow for Prediction

3.1. Model

3.2. Result

3.3. Evaluations

Part VI A Tutorial


Part VII References

No comments here
Why not start the discussion?