Financial markets are inherently dynamic, often exhibiting abrupt shifts, or “discontinuities,” where prices experience rapid upward or downward movements. These sudden jumps pose significant challenges for conventional mathematical modeling, which frequently relies on continuous functions to describe market behavior. For instance, while many price fluctuations might appear cyclical, the presence of a discontinuity can severely complicate harmonic analysis, the technique of decomposing a function into a series of periodic components: a sum of smooth sinusoids converges poorly to a step, producing spurious oscillations around the break.
While trends can also interfere with fitting periodic functions to financial data, they can often be addressed by fitting a low-degree polynomial to the data. The residuals – the difference between the original data and the fitted polynomial – can then be analyzed using periodic function series. However, discontinuities require a different approach.
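As a quick illustration of the detrending step, here is a minimal sketch using NumPy on a synthetic series (a linear trend plus a sinusoid – the data and degree are assumptions for the demo, not the cryptocurrency data of the later articles):

```python
import numpy as np

# Synthetic series: a linear trend plus a sinusoidal component (assumed data).
t = np.arange(100, dtype=float)
y = 0.5 * t + 10.0 * np.sin(2 * np.pi * t / 25)

# Fit a low-degree polynomial (degree 1 here) by least squares.
coeffs = np.polyfit(t, y, deg=1)
trend = np.polyval(coeffs, t)

# The residuals are what remains for periodic analysis.
residuals = y - trend
```

Because the fit is least-squares, the residuals oscillate around zero with the trend removed, which is exactly the form a periodic-series fit expects.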
This series of articles aims to introduce a straightforward method for effectively removing these disruptive jumps from observed data. Naturally, during the reconstruction of the fitted data, these discontinuities will be reintegrated.
Our Journey Ahead: A Four-Part Series
This introductory article, the first in a series of four, will lay the groundwork by presenting the fundamental concepts behind our proposed solution. The second installment will delve into the practical implementation of this solution using Python. Following that, the third article will explore the application of sinusoidal decomposition to the filtered data. Finally, the fourth and concluding article will synthesize all these elements to tackle a real-world problem involving cryptocurrency data.
Unpacking Data Similarity: The Power of Cluster Analysis
Cluster analysis is a potent technique for grouping similar data elements together. In the context of a metric space, similarity directly translates to closeness. Numerous methods exist for forming these data groups, or “clusters,” with one of the simplest and most widely used being k-means clustering. In essence, k-means works by assigning each point to the nearest of k central points, known as centroids, then updating each centroid to the mean of the coordinates of the points assigned to it, and repeating until the assignments stabilize.
Imagine data points scattered across a two-dimensional plane. K-means would identify distinct groups, with each group centered around its respective centroid, effectively illustrating how similar points naturally coalesce.
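The two-dimensional picture above can be sketched in a few lines of NumPy. This is a bare-bones k-means for illustration only (the synthetic clouds, the even-spread initialization, and the fixed iteration cap are all assumptions of the demo; a production implementation would use a library routine such as scikit-learn's KMeans):

```python
import numpy as np

rng = np.random.default_rng(0)
# Two synthetic clouds of 2-D points (assumed data), centred at (0, 0) and (5, 5).
points = np.vstack([
    rng.normal(loc=(0.0, 0.0), scale=0.5, size=(50, 2)),
    rng.normal(loc=(5.0, 5.0), scale=0.5, size=(50, 2)),
])

def kmeans(pts, k, iters=100):
    """Plain k-means: alternate nearest-centroid assignment and mean update."""
    # Naive init: k points spread evenly through the array.
    centroids = pts[np.linspace(0, len(pts) - 1, k).astype(int)].copy()
    labels = np.zeros(len(pts), dtype=int)
    for _ in range(iters):
        # Distance of every point to every centroid, then nearest assignment.
        dists = np.linalg.norm(pts[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its assigned points.
        new = np.array([pts[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids, labels

centroids, labels = kmeans(points, k=2)
```

On well-separated clouds like these, the two centroids settle near the true cloud centers, and each label identifies which cloud a point belongs to.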
Visualizing Discontinuities with Linear Cluster Analysis
A particularly interesting application of k-means clustering arises when data points are arranged along a line, making their proximity even more pronounced than in a dispersed cloud. Consider a time series exhibiting a clear discontinuity – a sharp break in its otherwise continuous progression.
When cluster analysis is applied to such a linear dataset, the algorithm naturally identifies distinct groups on either side of the jump. The centroids of these clusters will reveal that the groups reside at different levels. This clear separation of levels is key to our approach.
To eliminate the discontinuity, the solution is elegantly simple: adjust the level of one group to align with the other. This is achieved by subtracting the difference between the two group levels from the y-coordinates of the points in the higher group. The result is a unified dataset where the abrupt jump has been smoothed out, and the two previously disparate groups are now indistinguishable in terms of their overall level. This foundational understanding sets the stage for our subsequent articles, where we will dive into the practical implementation and further applications of this powerful technique.
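The whole idea fits in a short NumPy sketch: run one-dimensional k-means (k = 2) on the y-values of a series with a single jump, then subtract the difference between the two group levels from the higher group. The synthetic sine-plus-jump series and the min/max centroid initialization are assumptions of this demo; the Python implementation proper is the subject of the next article.

```python
import numpy as np

# Synthetic time series with a level jump at t = 50 (assumed data).
t = np.arange(100, dtype=float)
y = np.sin(2 * np.pi * t / 40)
y[50:] += 4.0  # the discontinuity

# 1-D k-means with k = 2 on the y-values alone, centroids seeded at the extremes.
c = np.array([y.min(), y.max()])
labels = np.zeros(len(y), dtype=int)
for _ in range(100):
    labels = np.abs(y[:, None] - c[None, :]).argmin(axis=1)
    new_c = np.array([y[labels == j].mean() for j in range(2)])
    if np.allclose(new_c, c):
        break
    c = new_c

# Shift the higher group down by the difference between the two group levels.
offset = c.max() - c.min()
y_flat = y.copy()
y_flat[labels == c.argmax()] -= offset
```

After the shift, the largest step between consecutive points shrinks from roughly the jump size to the ordinary sample-to-sample variation of the sinusoid, so a periodic-series fit can proceed; keeping `offset` and `labels` lets the jump be reintegrated during reconstruction, as described above.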