Use of Machine Learning techniques in outlier identification in market data time series
Risk models are statistical models that rely heavily on historical market data. A bank’s VaR model typically employs 120 to 450k time series with 10 years of history. These models are highly sensitive to data quality because the risk measure captures the tail of the distribution. Furthermore, data quality issues can distort results by reinforcing or dampening the impact of a scenario, even when the errors are limited to a few data points in a handful of time series. In practice, however, data quality assurance is limited to prioritising remediation and validation effort towards the time series linked to the more material risks. It is not unusual to investigate only 15-25k data points daily and to find fewer than 10-15 of them erroneous. We explore machine learning techniques that can reduce this volume of false positives.
Harsh started his career as a programmer working on search and pattern recognition algorithms, including AI techniques, across radio astrophysics, bioinformatics and speech recognition. He then transitioned to the financial risk domain and, for the last decade, has worked in various jurisdictions with banks and finance companies including GE Capital and Nomura. He currently works as a Senior Manager in Quantitative Advisory Services of EY’s Financial Services Risk Management practice. In recent years he has worked on ML techniques for behavioural modelling, mortgage risk modelling and time series outlier detection. He is invited as guest faculty and as a speaker at business schools and other events.