MLB Player Digital Engagement Forecasting

Published in Kaggle, 2021

Competition description, as per the organisers:

‘In this competition, you’ll predict how fans engage with MLB players’ digital content on a daily basis for a future date range. You’ll have access to player performance data, social media data, and team factors like market size. Successful models will provide new insights into what signals most strongly correlate with and influence engagement.’

Competition webpage

Evaluation metric - Mean column-wise mean absolute error (MCMAE). A mean absolute error is calculated for each of the four target variables and the score is the average of those four MAE values.
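
As a rough illustration of the metric described above, here is a minimal sketch in plain Python. The function names (`mean_absolute_error`, `mcmae`), the target column names `target1` to `target4`, and the sample numbers are assumptions made for this example; this is not the organisers' scoring code.

```python
from typing import Dict, Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """MAE for a single target column."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute MAE on empty input")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mcmae(
    actual: Dict[str, Sequence[float]],
    predicted: Dict[str, Sequence[float]],
) -> float:
    """Mean column-wise MAE: one MAE per target column, then their plain average."""
    per_column = [
        mean_absolute_error(actual[name], predicted[name]) for name in actual
    ]
    return sum(per_column) / len(per_column)


# Made-up values for two player-days; column names are assumed for illustration.
actual = {
    "target1": [1.0, 2.0],
    "target2": [0.0, 4.0],
    "target3": [10.0, 10.0],
    "target4": [5.0, 7.0],
}
predicted = {
    "target1": [2.0, 2.0],
    "target2": [0.0, 2.0],
    "target3": [10.0, 10.0],
    "target4": [4.0, 9.0],
}

score = mcmae(actual, predicted)
print(score)  # 0.75 (per-column MAEs are 0.5, 1.0, 0.0 and 1.5)
```

Because every column contributes equally to the average, a large error on one target is not diluted by the number of rows in the others.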

Data provided -

The following categories of data were provided, each with several fields of its own.

  • nextDayPlayerEngagement
  • games
  • playerBoxScores
  • teamBoxScores
  • transactions
  • standings
  • awards
  • events
  • player_twitter_followers
  • team_twitter_followers

Approach used -

LightGBM, XGBoost

Data visualisation

Feature engineering and selection