Mining of heterogeneous time series information for predicting chlorophyll accumulation in oceans

Atharva Ramgirkar, Vadiraj Rao, Janhavi Talhar, Tusar Kanti Mishra, Swathi Jamjala Narayanan, Shashank Mouli Satapathy, Boominathan Perumal

Research output: Contribution to journal, Article, peer-review


Harmful algal blooms cause environmental harm, financial losses, and disease epidemics. Because algal blooms cannot be eradicated, the best option is to foresee their growth and regulate it. Machine learning algorithms can be used to forecast their presence and to classify the threat that each concentration level presents. In this work, a dataset collected from the Santa Monica region, US, is analyzed and processed to predict algae concentration. Multiple linear regression, Regression Gradient Boosting Decision Tree (RGBDT), and Hidden Markov Model (HMM) are applied to predict the chlorophyll (Chl-a) content, which serves as a proxy for the presence of algae in the water. The results show that, for prediction, the multiple linear regression model outperforms the RGBDT algorithm. For modeling chlorophyll using the HMM, the parameter bbp555.00_sd is the best among aot443.00_sd, kd490.00_sd, poc_sd, and pic_sd. The multiple linear regression model gave an adjusted R-squared of 0.94, with the parameter pic_sd having the lowest variance inflation factor (VIF) of 1.78, followed by aot and bbp, which have VIF < 5 (2.28 and 4.95 respectively). The output of the HMM-based model represents the probability of the presence of chlorophyll given the presence of each variable individually. Among the variables, bbp has the highest probability, 0.405, implying that there is a roughly 40% chance of chlorophyll in the presence of bbp.
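The two regression diagnostics quoted in the abstract, adjusted R-squared and VIF, can be illustrated with a minimal sketch. This is not the authors' code: the function names (`solve`, `r_squared`, `adjusted_r_squared`, `vif`) and the toy data are assumptions made for illustration, and only the standard textbook definitions are used.

```python
# Illustrative sketch only; helper names and data are hypothetical.

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on the columns of X plus an intercept."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    pred = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 for n observations and p predictors (intercept excluded)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def vif(X, j):
    """VIF of predictor j: 1 / (1 - R_j^2), where R_j^2 comes from
    regressing column j on all remaining columns."""
    target = [row[j] for row in X]
    others = [[v for i, v in enumerate(row) if i != j] for row in X]
    return 1.0 / (1.0 - r_squared(others, target))


# Toy data with two uncorrelated predictors (not from the paper's dataset).
X_toy = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y_toy = [2.0 * a + 3.0 * b + 1.0 for a, b in X_toy]
print(vif(X_toy, 0), r_squared(X_toy, y_toy))
```

Under this definition, a VIF of 1.78 means that about 44% of that predictor's variance is explained by the other predictors (1 - 1/1.78 ≈ 0.44), while a VIF near 5 corresponds to about 80%.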

Original language: English
Article number: 100980
Journal: Sustainable Computing: Informatics and Systems
Publication status: Published - 04-2024

All Science Journal Classification (ASJC) codes

  • General Computer Science
  • Electrical and Electronic Engineering

