Statistical and Machine Learning Methods for Crop Yield Prediction in the Context of Precision Agriculture
Understanding the relationships between crop yield, soil properties, and topographic characteristics is critical for agricultural management. The objective of this study was to compare techniques for predicting crop yield from soil and topographic characteristics using high-resolution data and novel analytical techniques. The study was carried out across seventeen fields managed by a single cash cropping operation in Southwestern Ontario. Multiple linear regression, artificial neural networks, decision trees, and random forests were investigated to identify methods able to relate soil properties and crop yields on a point-by-point basis. Random forests were the most successful at predicting yield, with an R-squared value of 0.93. Multiple linear regression was the least successful, with an R-squared value of 0.46. Machine learning techniques are often limited in their ability to extract meaningful relationships between variables. Thus, cross-validation techniques were applied to test the models and to identify the soil and topographic attributes that were significant when predicting yield.
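
As an illustration of the evaluation idea described above, the following is a minimal sketch of a k-fold cross-validated R-squared, assuming a single predictor and an ordinary least-squares fit. It is not the study's implementation: the function names (`r_squared`, `fit_simple_ols`, `k_fold_r_squared`), the choice of five folds, and the synthetic `soil_attr` and `crop_yield` values are all hypothetical and exist only to show how a point-by-point prediction can be scored on data the model has not seen.

```python
import random
from statistics import mean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - y_bar) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot


def fit_simple_ols(x, y):
    """Least-squares slope and intercept for a single predictor."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def k_fold_r_squared(x, y, k=5, seed=0):
    """Out-of-fold R^2: each point is predicted by a model fit without it."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint index groups
    y_pred = [0.0] * len(x)
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        slope, intercept = fit_simple_ols(
            [x[i] for i in train], [y[i] for i in train]
        )
        for i in fold:
            y_pred[i] = intercept + slope * x[i]
    return r_squared(y, y_pred)


# Synthetic stand-in data (assumption: NOT values from the study).
rng = random.Random(42)
soil_attr = [rng.uniform(0, 10) for _ in range(200)]
crop_yield = [2.0 + 0.8 * s + rng.gauss(0, 1.0) for s in soil_attr]

cv_r2 = k_fold_r_squared(soil_attr, crop_yield, k=5)
print(round(cv_r2, 3))
```

Scoring only out-of-fold predictions is what separates a cross-validated R-squared from one computed on the training data; the same loop could in principle wrap any of the four model types compared in the study in place of the simple regression used here.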