A natural language processing system that evaluates company earnings calls and estimates subsequent short-term stock performance.
This project explores the relationship between the language used in earnings call transcripts and subsequent stock price movement. Using NLP-driven sentiment extraction and market data, it attempts to forecast 5-day forward returns after a company’s quarterly earnings release.
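As an illustration of the prediction target, the sketch below computes a 5-day forward return from a list of daily closing prices. The function name `forward_return`, the use of simple (not log) returns, and the sample prices are assumptions made for this example; the project's own target definition may differ, for instance in how after-hours calls are aligned to trading days.

```python
from typing import Optional, Sequence


def forward_return(
    closes: Sequence[float], event_idx: int, horizon: int = 5
) -> Optional[float]:
    """Simple return from the close at event_idx to the close
    `horizon` trading days later.

    Returns None when the price series does not extend far enough.
    """
    end_idx = event_idx + horizon
    if event_idx < 0 or end_idx >= len(closes):
        return None
    return closes[end_idx] / closes[event_idx] - 1.0


# Hypothetical daily closes; index 2 is taken as the earnings-call date.
closes = [100.0, 101.0, 102.0, 103.0, 101.5, 104.0, 105.0, 107.1, 106.0]
r = forward_return(closes, event_idx=2)  # 107.1 / 102.0 - 1
```

The sign of this value gives the direction label and its size gives the magnitude target.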
The modeling pipeline combines multiple engineered features to forecast the direction and magnitude of short-term returns. Historical returns and textual features are used together to reduce variance and account for prior momentum. The model is trained with cross-validation and evaluated on a held-out test set.
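The evaluation scheme described above can be sketched in a few lines. The source does not state which model or fold scheme is used, so this example assumes a single-feature least-squares fit, three contiguous folds, and a held-out tail of the data. The helper names and the sentiment and return values are hypothetical and exist only to make the sketch runnable.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares with one feature: returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(y_true, y_pred):
    """Coefficient of determination; it can be negative out of sample."""
    my = mean(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - my) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot


def k_fold_indices(n, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds."""
    fold = n // k
    for i in range(k):
        start = i * fold
        stop = n if i == k - 1 else start + fold
        val_idx = list(range(start, stop))
        train_idx = [j for j in range(n) if j < start or j >= stop]
        yield train_idx, val_idx


# Hypothetical data: one sentiment score per call and its 5-day forward return.
sentiment = [-0.8, -0.5, -0.3, -0.1, 0.0, 0.2, 0.3, 0.5, 0.6, 0.9, -0.4, 0.7]
fwd_ret = [-0.030, -0.012, -0.015, 0.004, -0.002, 0.010,
           0.006, 0.021, 0.015, 0.034, -0.010, 0.025]

# Hold out the last three observations as the test set.
n_test = 3
x_train, y_train = sentiment[:-n_test], fwd_ret[:-n_test]
x_test, y_test = sentiment[-n_test:], fwd_ret[-n_test:]

# Cross-validate on the training portion only.
cv_scores = []
for train_idx, val_idx in k_fold_indices(len(x_train), k=3):
    slope, intercept = fit_line(
        [x_train[i] for i in train_idx], [y_train[i] for i in train_idx]
    )
    preds = [slope * x_train[i] + intercept for i in val_idx]
    cv_scores.append(r_squared([y_train[i] for i in val_idx], preds))

# Refit on all training data, then score once on the held-out set.
slope, intercept = fit_line(x_train, y_train)
test_r2 = r_squared(y_test, [slope * x + intercept for x in x_test])
```

Keeping the test set out of the cross-validation loop means the final R² is measured on data the model never saw during fitting or tuning.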
Even with a modest R², the model's structured approach to combining language sentiment and market behavior demonstrates practical potential. The system converts unstructured financial speech into measurable signals and shows promise for downstream applications such as risk monitoring, volatility forecasting, or thematic investing. Future improvements may include fine-tuned large language models, event-time alignment, and integration of option flow or macroeconomic context.
If you want access to the codebase, reach out to schilamkur@gmail.com.