Company relationship modeling and graph neural networks for financial market forecasting

Luo, Chang

Files

LuoC_2025.pdf (3.15 MB)

Date

2025-03-26

Authors

Luo, Chang

Abstract

This thesis presents a series of studies on financial market forecasting that apply graph neural networks to a novel company relationship modeling scheme. Departing from the conventional view of companies as standalone entities, it models them as interconnected nodes within a Semantic Company Relationship Graph (SCRG). To build the graph, co-occurrence statistics for company names are compiled from a comprehensive financial news corpus; collectively, these statistics reflect patterns of frequent business interaction. The statistics are then used to create a vector embedding for each company, positioning all companies within the same semantic relationship space. The cosine similarities between these vectors define the numerical interrelationships among companies and thereby form the SCRG. This relationship modeling scheme is grounded in the principles of statistical semantics and the distributional hypothesis in linguistics, which posit that patterns of word co-occurrence in a large corpus can effectively delineate semantic interconnections.

Building on the SCRG, the thesis first explores the adaptation of spatial-temporal graph neural networks for predicting stock movements. A key innovation is the Non-Independent and Identically Distributed Spatial-Temporal Graph Neural Network (NIST-GNN), a model designed to integrate features from neighboring companies with each company’s own historical time-series data. It addresses the temporal non-IID characteristics of stock data, enabling a more nuanced analysis of each stock’s temporal dynamics. Empirical results demonstrate that this methodology significantly outperforms existing benchmarks in profitability while managing risk better. The experimental findings also reveal how information spreads within the US market, uncovering a typical one-day delay in the diffusion of public information among interrelated companies and thus challenging traditional views on market efficiency.

Second, the thesis extends the use of the SCRG to infer absent news sentiment during periods with no media coverage. News sentiment is a crucial proxy for investor sentiment and is widely used in asset pricing. However, consistent media coverage is not guaranteed for all companies, many of which experience “media silent” periods, especially as media attention shifts towards more sensational business news. An analysis of 14 years of news data reveals that even well-known companies lack news coverage on almost half of all trading days. Traditional missing value imputation (MVI) methods are abundant but generally insufficient in the finance context, which is characterized by complex spatial and temporal interconnections among companies. To address these challenges, the thesis proposes a Non-IID Spatial-Temporal Chebyshev Network (NIST-Cheb) that leverages these relationships to infer absent news sentiment. A masked semi-supervised training approach is introduced to enhance the utility of the available sentiment data. The efficacy of the method is systematically validated through error-based metrics and empirical trading results. Experimental findings indicate that asset pricing models incorporating NIST-Cheb’s estimated sentiment scores significantly outperform traditional baselines. The thesis also discusses the spillover effects of news sentiment from a theoretical perspective, emphasizing the importance and feasibility of using spatial and temporal sentiment information to infer absent news sentiment.

The concluding part of the thesis focuses on predicting intraday market index movements, using the SCRG as a foundational relationship prior for hierarchical industry analysis. Previous studies on market index prediction have leaned heavily on machine learning strategies that predominantly target the temporal dynamics of market indices, often overlooking the valuable insights available from the market microstructures of the underlying industrial clusters. The emergence of hierarchical graph pooling techniques marks a new direction in this field. The thesis pioneers the use of these techniques by framing market index prediction as a graph classification task and introduces FinPool, a graph pooling operator designed for the hierarchical feature representation of industrial clusters in the financial market. To apply FinPool operators effectively to index prediction, two prediction frameworks, Stacked FinPool and Multi-tier Attention FinPool, are proposed, informed by the Global Industry Classification Standard (GICS). Empirical trading evaluations indicate a notable improvement in profits and risk-adjusted returns, significantly outperforming conventional benchmarks. These findings not only challenge the Efficient Market Hypothesis but also demonstrate the untapped predictive power inherent in the microstructural details of market constituents.
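
The SCRG construction outlined in the abstract (co-occurrence statistics, then company vectors, then cosine similarities) can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not say how the embeddings are derived from the statistics, so the sketch assumes the simplest option, using each company's raw co-occurrence row as its vector. The function names, the toy articles, and the 0.5 edge threshold are all hypothetical.

```python
from itertools import combinations

import numpy as np


def cooccurrence_matrix(articles, companies):
    """Count how often each pair of companies is mentioned in the same article."""
    index = {name: i for i, name in enumerate(companies)}
    counts = np.zeros((len(companies), len(companies)))
    for mentioned in articles:
        # Ignore names outside the company list; count each pair once per article.
        present = sorted({index[name] for name in mentioned if name in index})
        for i, j in combinations(present, 2):
            counts[i, j] += 1
            counts[j, i] += 1
    return counts


def cosine_similarity_matrix(vectors):
    """Pairwise cosine similarity between the rows of `vectors`."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    # Companies with no co-occurrences keep an all-zero vector (similarity 0).
    unit = vectors / np.where(norms == 0, 1.0, norms)
    return unit @ unit.T


# Toy corpus (hypothetical): each article is the set of companies it mentions.
companies = ["A", "B", "C", "D"]
articles = [{"A", "B"}, {"A", "B", "C"}, {"B", "C"}, {"D"}]

counts = cooccurrence_matrix(articles, companies)

# Assumption: raw co-occurrence rows stand in for the company embeddings.
similarity = cosine_similarity_matrix(counts)

# Assumption: keep only edges at or above an arbitrary threshold, no self-loops.
adjacency = np.where(similarity >= 0.5, similarity, 0.0)
np.fill_diagonal(adjacency, 0.0)
```

In this toy corpus, A and C end up strongly connected (cosine similarity 0.8) because both co-occur mainly with B, even though they share only one article. That is the distributional idea the abstract appeals to: companies are related when they appear in similar contexts, not only when they are mentioned together.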

URI

https://hdl.handle.net/1842/43263
http://dx.doi.org/10.7488/era/5804

This item appears in the following Collection(s)

Informatics thesis and dissertation collection